ESSAYS ON NARRATIVE ECONOMICS, CLIMATE MACROFINANCE AND
MIGRATION
by
Thomas Ash
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ECONOMICS)
May 2023
Copyright 2023 Thomas Ash
Epigraph
Concerning Part 1:
“Simply by making noises with our mouths, we can reliably cause precise new combinations of ideas to arise in each other's minds.”
– Steven Pinker, The Language Instinct (1994), p. 1; and
– Joseph Farrell and Matthew Rabin, Cheap Talk (1996), American Economic Review.
Concerning Part 2:
“Climate change is the defining crisis of our time and displacement is one of its most devastating consequences.”
– United Nations, High Commissioner for Refugees.
Acknowledgments
They say it takes a village to raise a child; after five years of graduate study I am convinced that the same
can be said of completing a PhD thesis. I am deeply indebted to my PhD committee for many fruitful
and interesting discussions, words of encouragement and occasional well-warranted and well-meaning
reprimands. I am grateful to my primary advisor, Pablo Kurlat, who taught me that few words can mean
much more than many, and whose own research both gives me a headache and inspires me to create work
of similar mathematical beauty as well as rigour. I am grateful to my advisor Gerard Hoberg, whose words
of encouragement have always been a strong source of comfort as I attempted to untangle the complex
mysteries of modern computational linguistics, and whose papers are among the best I’ve seen in the field
of text-analysis finance and much wider still. I am grateful to my third senior advisor Matthew Kahn for his
boundless energy, astounding creativity and passion; sitting in a room with Matt easily created a hundred
possible interesting papers. His work documenting climatic impacts on the economy is a literature I hope
to stand on in the coming years.
My junior advisers are both scholars that I highly respect and whose work I expect to flourish. David
Zeke has impressed me with his knowledge and acuity, whilst wading through the muddy waters of my
research, easily retrieving its real findings. My advisor, friend and coauthor, Emily Nix, has spent countless
hours teaching me the fundamentals of research as well as giving me the opportunity to work on a project
with her. I am deeply grateful to have known both.
Many students, professors and seminar participants at USC have been a positive source of inspiration
and I sadly only have space to mention but a few. Standout teachers have been Michael Magill and Joel
David. A standout Professor has been Antonio Bento, whose belief in the value of PhD students was
humbling and unwavering. Stand out co-PhDs have been Karim Fajury, Sam Boysel, Constantin Charles,
Qitong Wang and Shaoshuang Yang, who took extra unsolicited time to engage and help develop my work;
our interactions have meant the world to me. External to USC, the work and words of Kristopher Nimark,
Stephen Hansen, Matthew Gentzkow, Lint Barrage among others, even if less memorable to them, will
be long remembered. I look forward to the future, and the interesting conversations that hopefully are
contained within it, thanks to the positive experiences I have had with all these scholars and giants.
A final thanks has to go to the fields of Economics, Mathematics and Linguistics themselves, as well
as all those who contribute to them. This research and these ideas have and will continue to inspire and
humble me. As with many things, Keynes puts it best: “Practical men who believe themselves to be quite exempt from any intellectual influence, are usually the slaves of some defunct economist.”
And, at last, final remarks must be left for my friends and family. My father, whose discipline will
never leave me; my mother whose support has always been unconditional; my brother who will always
understand best who and what I am; my aunt and Grandfather who I have always emulated. Many friends
have been there for me over the years but in particular William Lee and Knight Sukthaworn have been role
models. Special thanks also to Mehregan Ameri and Olivia Brown. Lena McCauley helped shape much
of my life thinking and goals. Former bosses, William Baker of NERA and Jeremy Leonard of Oxford
Economics, taught me everything I know about writing, productivity and getting things done well. Many
others should fill these pages; a prior draft of these acknowledgements contains them all and is available
from the author.
As is true for many, my Grandmother has been a unique source of knowledge during my life. Had
she grown up in a world more encouraging of female scholarship, she likely would have beaten me
as the first to obtain a full-time degree, not to mention a doctoral degree, in my family. Her shrewd
observation, insistence on always having something interesting to talk about, and quiet knowing of things
laid the foundation for the scholar I have, and will continue to, try and become. Her emphasis on happily
“agreeing to disagree” with others, would in fact turn prescient in discussions we had twenty years prior to
it becoming an established concept in the field of financial speculation and heterogeneous belief bubbles (see
Simsek (190)); this too became central to my job market paper as can be seen in chapter 3. She sadly passed
away during my graduate studies, will never be forgotten, and this thesis could only ever be dedicated to
her.
Table of Contents
Epigraph........................................................................................................ ii
Acknowledgments............................................................................................ iii
List of Tables................................................................................................... ix
List of Figures.................................................................................................. xi
Abstract......................................................................................................... xiv
I Essays on Narrative Economics ............................................................... xvii
1 Chapter 1 A tale of two Bitcoin Bubbles: explosive behavior in textual risk factors ........ 1
1 Introduction................................................................................................. 1
2 Data .......................................................................................................... 5
2.1 Twitter Data - Bitcoin ........................................................................... 5
2.2 Price Data - Bitcoin .............................................................................. 8
3 Methodology................................................................................................ 8
3.1 Latent Dirichlet Allocation (LDA) ............................................................. 9
3.2 Word2Vec .......................................................................................... 10
3.3 Combining LDA with Word2Vec to create text scores ..................................... 12
3.4 Explosive unit root test.......................................................................... 13
4 Text implementation ...................................................................................... 14
4.1 Identify risk factors during first Bitcoin Bubble............................................. 14
4.2 Identify risk factors during second Bitcoin Bubble ......................................... 16
5 Testing for explosive behavior ........................................................................... 17
5.1 Explosive behavior in Bitcoin prices .......................................................... 17
5.2 Explosive behavior in Risk Factors: bubble episode 1 ...................................... 18
5.3 Explosive behavior in Risk Factors: bubble episode 2 ...................................... 21
6 Conclusion .................................................................................................. 24
Appendix ........................................................................................................... 26
2 Chapter 2 Disease-economy trade-offs under alternative epidemic control strategies ...... 29
1 Introduction................................................................................................. 29
2 Results ....................................................................................................... 34
2.1 Methodological contributions .................................................................. 34
2.2 Robustness to assumptions on coupling parameters and functions...................... 38
3 Discussion................................................................................................... 39
4 Methods ..................................................................................................... 43
4.1 Contact function.................................................................................. 44
4.2 Calibrating contacts.............................................................................. 45
4.3 SIRD epidemiological model.................................................................... 46
4.4 Choices............................................................................................. 47
4.5 Utility calibration ................................................................................ 49
4.6 Model applications ............................................................................... 49
Appendix ........................................................................................................... 58
3 Chapter 3 Talking and bubbles: does social media affect the speed and magnitude of
bubble formation? ........................................................................................ 59
1 Introduction................................................................................................. 59
2 A model of talking and listening during bubble events.............................................. 64
3 Model results ............................................................................................... 68
3.1 Trade-offs between idea substitutability and disagreement ............................... 68
3.2 Listening choices affect asset prices........................................................... 71
3.3 Model can explain trade volume and “talk volume” ........................................ 74
3.4 Informational cascades can be recovered from the model ................................. 77
4 Data .......................................................................................................... 78
5 Empirical methodology ................................................................................... 80
5.1 Computational linguistics methodologies.................................................... 81
5.2 Recovering narratives from twitter data...................................................... 84
5.3 Measuring talking, listening and those infected by ideas.................................. 87
6 Calibration .................................................................................................. 89
6.1 Calibration approach: social media or news parameters .................................. 90
6.2 Calibration fit to social media: targeted and untargeted moments ...................... 91
7 Counterfactual results: removing social media ....................................................... 94
7.1 Counterfactual removing social media ....................................................... 94
7.2 Key mechanisms.................................................................................. 95
7.3 Main results on size and speed of bubbles due to social media ........................... 97
8 Conclusion .................................................................................................. 98
Appendix ........................................................................................................... 101
II Essays on Climate Change and Migration...............................................114
4 Chapter 4 Are macro-climate models missing financial frictions: Empirical evidence and
a structural model ........................................................................................ 115
1 Introduction................................................................................................. 115
2 Data and descriptive statistics ........................................................................... 119
2.1 Natural disaster data ............................................................................. 119
2.2 Identification and randomness of natural disasters .......................................... 120
2.3 Dealscan data ..................................................................................... 121
2.4 State and year variation in credit spreads .................................................... 124
3 Financial frictions during climate-related natural disaster shocks................................. 126
3.1 Empirical framework ............................................................................ 126
3.2 Panel results on interest rates/corporate credit spreads ................................... 126
3.3 Robustness ........................................................................................ 128
4 A financial macroeconomy with climate............................................................... 131
4.1 Model Environment.............................................................................. 133
4.2 Climate module ................................................................................... 133
4.3 Consumers ........................................................................................ 135
4.4 Final output firm ................................................................................. 135
4.5 Energy firms ...................................................................................... 140
4.6 Equilibrium........................................................................................ 141
4.7 Key analytical results ............................................................................ 142
5 Conclusion .................................................................................................. 144
Appendix ........................................................................................................... 147
5 Chapter 5 How Asylum Seekers in the United States Respond to Their Judges: Evidence
and Implications .......................................................................................... 149
1 Introduction................................................................................................. 149
2 Institutional Context ...................................................................................... 155
2.1 Who Is An Asylum Seeker? .................................................................... 155
2.2 Immigration Judges .............................................................................. 156
2.3 Anecdotal Accounts of Asylum Seeker Absentia ........................................... 160
3 Data and Descriptive Results............................................................................. 163
3.1 Data................................................................................................. 163
3.2 Descriptive Results ............................................................................... 164
4 Empirical Specification.................................................................................... 168
4.1 Research Design .................................................................................. 168
4.2 Validity of Random Judge Assignment ....................................................... 171
5 Results ....................................................................................................... 177
5.1 Additional Robustness........................................................................... 179
6 Implications for Research Designs Using Randomly Assigned Judges ............................ 181
7 Conclusion .................................................................................................. 184
Appendix ........................................................................................................... 187
References...................................................................................................... 202
List of Tables
1.1 Biggest users are both well known users and prominent individual users ....................... 7
1.2 Post filtering LDA factors, first bubble episode ....................................................... 15
1.3 Examples of 41 final textual risk factors: 1st bubble, most explosive.............................. 16
1.4 Examples of 68 final textual risk factors: 2nd bubble, most explosive............................. 17
3.1 Examples of text data for Bitcoin and Gamestop ..................................................... 80
3.2 Document and user characteristics ..................................................................... 82
3.3 Idea vocabularies: user-approach ....................................................................... 86
3.4 Counterfactuals to demonstrate impact of social media............................................. 90
3.5 Calibration is close to targeted and un-targeted series .............................................. 92
3.6 Range of estimates of SM effect on size, speed and crash of Bitcoin bubble 2017/18 ........... 98
3.7 Recovered narratives: LDA Approaches ............................................................... 111
3.8 Other model parameters .................................................................................. 113
4.1 Regression results: credit spreads on normalized disasters and control variables. .............. 128
4.2 Sensitivity using dummy variables instead of using number of disaster declarations. ......... 132
5.1 Judge Characteristics, 2009-2015 ....................................................................... 158
5.2 Test of Random Assignment to Judges ................................................................. 174
5.3 Asylum Seeker Responses to Judge Leniency, 2009-2015............................................ 179
5.4 Simulation of Bias with Endogenous Response to Randomly Assigned Judge Leniency....... 186
5.5 Test of Random Assignment to Judges ................................................................. 192
5.6 Main Data Sets ............................................................................................. 195
5.7 Correlation in Judge Leniency Across Data Sets...................................................... 196
5.8 Test of Random Assignment to Judges - Estimation Sample ....................................... 197
5.9 Asylum Seeker Responses to Judge Leniency Robustness to Month by Court Fixed Effects,
2009-2015 .................................................................................................... 198
5.10 Parameters Used in the Monte Carlo Simulation ..................................................... 200
List of Figures
1.1 Tweet quantity increase as Bitcoin price increases................................................... 6
1.2 Most frequent tweeters quantity is less correlated with bitcoin price changes.................. 8
1.3 Explosive behavior in Bitcoin prices: 2017-2020 ...................................................... 18
1.4 Explosive behavior in textual risk factors: 1st Bitcoin bubble, 2017-2018 ........................ 20
1.5 “North Korea” explosive textual risk factor............................................................ 21
1.6 Explosive behavior in textual risk factors: 2nd Bitcoin bubble, 2019 .............................. 22
1.7 “Interest Rate” explosive textual risk factor ........................................................... 23
1.8 “Full Node” explosive textual risk factor ............................................................... 24
1.9 Post filtering LDA factors ................................................................................ 27
1.10 41 textual risk factors during Bitcoin Bubble period (note axes differ) ........................... 28
2.1 Coupled system schematic................................................................................ 52
2.2 Disease dynamics and economic outcomes under voluntary isolation, blanket lockdown,
and targeted isolation. .................................................................................... 53
2.3 Key model mechanisms. .................................................................................. 54
2.4 Model outcomes with different information frictions. .............................................. 55
2.5 Model outcomes with different compliance rates. ................................................... 56
2.6 Result sensitivity to key model parameters. .......................................................... 57
3.1 There is a trade-off between substitutability and disagreement ................................... 70
3.2 Comparative statics: more novelty and disagreement affect bubbles and crashes ............. 72
3.3 Listening choices affect price path cumulatively .................................................... 74
3.4 Talk volume empirically is elevated during bubble episodes ...................................... 76
3.5 Text observations over time ............................................................................. 81
3.6 Number of talkers throughout Bitcoin bubble episode (User approach) .......................... 89
3.7 Model price versus data in User and LDA approaches............................................... 93
3.8 Increase in avg listening per user causes rise, then fall in proportion of listeners causes fall. 93
3.9 Bubble grows slower, with lower magnitude and no crash ......................................... 95
3.10 Relative speed of infections in social media and news models ..................................... 95
3.11 Listening differs across social media and news models.............................................. 97
3.12 Listening tends to precede the bubble peak ........................................................... 101
3.13 Talking tends to move in line with the bubble ....................................................... 102
3.14 Average tweets per user relatively constant over bubble period (Bitcoin shown) .............. 109
3.15 Comparison of listening measures, retweets versus favorites (Bitcoin shown, smoothed) .... 110
3.16 LDA approach: increase in avg listening per user causes rise, then fall in proportion of
listeners causes fall ........................................................................................ 110
3.17 LDA approach: price in data, SM model and news model........................................... 112
3.18 LDA approach: speed mechanism in SM model and news model.................................. 112
3.19 Data for user and LDA approaches: infections, news vs social media ............................ 112
4.1 Distribution of state natural disasters over time. ..................................................... 120
4.2 Number of counties declaring natural disasters by state and year................................. 122
4.3 Coefficients of variation by state (1980-2016). ........................................................ 123
4.4 Persistence of natural disasters by state (1980-2016). ................................................ 124
4.5 Distribution of state credit spread measure over time. .............................................. 125
4.6 Sensitivity of main results to alternative data approaches and analytical choices. ............. 146
5.1 The Asylum Process ....................................................................................... 156
5.2 Raw Variation in Judge Leniency Across Courts, 2009-2015........................................ 161
5.3 Variation in Average Grant Rates By Court, 2009-2015.............................................. 162
5.4 Number of Asylum Cases, Grants, and Absentia 2009-2019 ........................................ 166
5.5 Share of Major Nationalities, 2009-2019 ............................................................... 167
5.6 Correlation between Case Characteristics and Absentia, 2009-2015 .............................. 168
5.7 Variation in Judge Leniency with Court by Month Fixed Effects Removed ...................... 176
5.8 Robustness of Judge Leniency on Absentia Results .................................................. 181
5.9 Publicly Available Judge Data: Example for Judge Neumeister .................................... 190
5.10 Variation in Judge Leniency Averages By Court, 2009-2015 ........................................ 199
Abstract
This thesis entitled “Essays on Narrative Economics, Climate Macrofinance and Migration” contains three
essays (Part 1) developing an approach to studying narratives; as well as two essays (Part 2) on Climate
Change Macrofinance and Migration.
The first chapter demonstrates how verbal content can be used to catalog and identify financial asset
bubbles. I recover time series of “textual risk factors”, economic factors that users regularly discuss, for two
Bitcoin bubble episodes using social media data. I apply time series tests for “explosivity”, typically used
to test for bubbles and/or “irrational exuberance” in price data, to these text series. This exercise generates
three findings about the behavior of verbal content during bubble events: (1) during bubbles, explosive
prices do co-occur with explosivity in some, though not all, of these textual risk factors; (2) explosivity is
more present in textual risk factors measured using all tweets rather than just those with non-zero retweets,
suggesting a key role for tweets typically considered unpopular; and (3) explosive textual risk factors
differ across episodes, suggesting each bubble is associated with its own set of unique verbal content.
These findings suggest that a procedure of checking textual risk factors for explosivity may provide a
useful additional test of bubble formation. I further provide evidence on important events/verbal content
associated with each Bitcoin bubble episode.
In the second chapter, with coauthors, I use a contagion model (a modelling strategy I employ in
the following chapter for narratives) to study pandemics. Public policy and academic debates regarding
pandemic control strategies note disease-economy trade-offs, often prioritizing one outcome over the
other. Using a calibrated, coupled epi-economic model of individual behavior embedded within the broader
economy during a novel epidemic, we show that targeted isolation strategies can avert up to 91% of
economic losses relative to voluntary isolation strategies. Unlike widely-used blanket lockdowns, economic
savings of targeted isolation do not impose additional disease burdens, avoiding disease-economy trade-
offs. Targeted isolation achieves this by addressing the fundamental coordination failure between infectious
and susceptible individuals that drives the recession. Importantly, we show testing and compliance frictions
can erode some of the gains from targeted isolation, but improving test quality unlocks the majority of the
benefits of targeted isolation.
In the third chapter I bring together methods developed in the prior two to build an approach to
modelling narratives. I develop a theory of talking and listening during bubble events. Agents decide how
much to “listen” (e.g. read tweets about Bitcoin), which may cause them to adopt an idea (e.g. “blockchain”).
This changes their beliefs, affecting their investment decisions. The model provides a role for language in
bubbles – language determines an idea/narrative’s optimism/pessimism, but also its novelty. These two
roles for language interact, driving bubble and crash phases. Using Twitter data, I calibrate the model
using modern computational linguistics techniques for several bubbles. The calibrated model can explain
aggregate talking and listening on Twitter, as well as the bubble price. I use the model to show that with
social media bubbles form faster and have larger magnitude. The framework may be useful for modelling
other economic aggregates impacted by social interactions e.g. the emergence of political, social and
environmental innovations/investments.
In the fourth chapter, and the first of Part 2, I examine the local financial effects of natural disasters
to study climate change. Substantial debate exists on how financial markets will react to climate change.
Given the importance of these markets in the economy, and how the economy responds to shocks, understanding
this relationship is important. I contribute to this debate by providing empirical evidence that establishes
the financial consequences of climate shocks and building a theoretical model to parameterize the role of
financial markets in the climate problem. My empirical results show that routine climate shocks drag on
firms’ ability to raise financing via higher credit spreads; while larger disasters cause much larger spikes
in credit spreads of 60-100 basis points for an average firm. I use this evidence to motivate a financial
macroeconomy with climate model (a DSGE model with a climate externality and collateral constraint) and
state important results and intuition. The marginal externality damage equation from the macroclimate
literature still holds but reductions in economic output are amplified.
Finally, an additional critical impact of climate change is the displacement it may create; in my final
chapter I study a specific form of migration to the United States. Every year many thousands of migrants
seek asylum in the United States. Upon entry, they encounter U.S. immigration judges who exhibit large
variability in their decisions. We document on average a 20 percentage point within-court gap in grant
rates between the least versus most lenient judges. We find that asylum seekers respond to these large
discrepancies across judges. Focusing on the years 2009-2015, preceding and during a major increase
in asylum applicants, we estimate that asylum seekers who are quasi-randomly assigned to less lenient
immigration judges are more likely to be absent for their immigration hearings. We show that this type
of endogenous response to decision-maker leniency leads to bias in second-stage estimates when using
randomly assigned judges and variation in judge leniency as an instrument. We conclude that the extreme
variability in judicial decisions in United States immigration courts causes important distortions in the
behavior of those subject to such caprice.
Part I
Essays on Narrative Economics
Chapter 1
A tale of two Bitcoin Bubbles: explosive behavior in textual risk factors
1 Introduction
Many questions still remain about financial bubbles (see Brunnermeier and Oehmke (41)). Among these is understanding how bubbles start in the first place. Most of the theoretical literature assumes they begin with exogenous shocks or other features outside of these models. The literature has also noted that bubbles are not simply wild rises and falls in prices, but a more general phenomenon: “there is much more to a bubble than a mere security price increase: innovation, displacement of existing firms, creation of new ones, and more generally a paradigm shift as entrepreneurs and investors rush towards a new El Dorado” (Greenwood et al.
(108)). In this paper, I examine these outstanding questions using new approaches for constructing textual
risk factors (Hanley and Hoberg (112)), which give a richer picture of bubble price episodes based on social
media text data. This richer information allows for a deeper analysis of the environment in which bubbles
begin, whilst also evaluating bubbles beyond just price and financial data. I construct this data and apply
existing time series approaches for analyzing bubbles to these new data series.
I first construct textual risk factors on two social media era financial bubbles for the same asset:
the separate Bitcoin bubbles of (1) 2017/18 and (2) 2019.[1]
The measurement of textual risk factors was
developed by Hanley and Hoberg (112) to dynamically identify emerging risks to financial stability. Risk
factors are dynamic in that they differ depending on the episode of financial instability. Bubbles are
typically considered episodes of financial instability making them a good candidate for this text methodology.
I identify a total of 41 textual risk factors for the first Bitcoin bubble in 2017/18. I identify a further 68 risk
factors for the second Bitcoin bubble in 2019.
[1] Both satisfy common definitions of financial bubbles from the literature. E.g. Greenwood et al. (108) identify bubbles as occurring when the price of a financial asset rises by at least 100% over a two-year period, followed by a fall of at least 50% over the following one-year period.
I then apply explosive unit root tests to this new time series data. One-sided, “explosive”, unit root
tests have been developed for some time to identify bubbles in asset price data (Diba and Grossman
(72), Hall et al. (111), Phillips et al. (172), Cheung et al. (60)). These approaches are applied recursively
allowing them to date-stamp when periods of explosivity begin and end. Phillips et al. (172) develop
the approach and show how their time series test of “explosivity” relates to prices deviating from the
present-discounted value of future dividend flows and therefore represents an irrational deviation from
long-term fundamentals. In this sense they emphasize their value as measures of “irrational exuberance”
in price data, noting “We first define financial exuberance in the time series context in terms of explosive
autoregressive behavior… In this context, the approach is compatible with several different explanations of this
period of market activity, including the rational bubble literature, herd behavior, and exuberant and rational
responses to economic fundamentals. All these propagating mechanisms can lead to explosive characteristics
in the data.” I take these time series tests and apply them to my textual risk factors to measure irrational
exuberance propagating in verbal content.
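To make the mechanics concrete, the sketch below shows a minimal forward-recursive, right-tailed ADF procedure in the spirit of Phillips et al. (172); it is an illustration only, not the exact implementation used in this chapter. The use of statsmodels, the window length and the lag choice are my own assumptions, and the right-tail critical values must in practice be obtained by simulation as in the original papers.

```python
# Minimal sketch of a forward-recursive, right-tailed ("explosive") ADF test.
# Assumptions: statsmodels/pandas available; `series` is a daily price or textual
# risk factor series; critical values come from simulation (Phillips et al.),
# not from the left-tailed tables that adfuller reports.
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def recursive_adf_stats(series: pd.Series, min_window: int = 40, lags: int = 1) -> pd.Series:
    """For each end date, run an ADF regression on the expanding window starting at
    the first observation; explosive behaviour shows up as a large *positive* statistic."""
    values = series.dropna()
    stats = {}
    for end in range(min_window, len(values) + 1):
        window = values.iloc[:end]
        stats[values.index[end - 1]] = adfuller(window, maxlag=lags,
                                                regression="c", autolag=None)[0]
    return pd.Series(stats)

# Usage (hypothetical file/column names):
# price = pd.read_csv("btc_daily.csv", index_col="date", parse_dates=True)["price"]
# adf_path = recursive_adf_stats(price)
# explosive_days = adf_path[adf_path > simulated_right_tail_critical_value]
```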
I first demonstrate that Bitcoin price data tests positive for explosive unit roots (the null of a normal
unit root process is rejected in favor of the alternative hypothesis that the data follows an explosive unit
root) during both Bitcoin bubble episodes of interest. This validates that the test correctly identifies two
episodes in the Bitcoin price data widely viewed to constitute bubble events. I show that during the first
Bitcoin bubble explosive behavior begins towards the end of November 2017 as the Bitcoin price begins to
escalate over $8k and remains until the end of December 2017 as the Bitcoin price began to crash, falling
below $13k from its peak at around $20k. Similarly for the second Bitcoin bubble I show that explosive
behavior begins around May 2019 as the price begins to escalate above $6k, before subsiding in June 2019 just as the Bitcoin price crashed to below $10k following its peak at around $13k. Interestingly, after this second episode the price continues to gradually fall or readjust to $7k over a 6-month period which is not
identified as explosive, and instead seems consistent with a gradual market readjustment.
Second, I apply explosive unit root tests to textual risk factors generated for the two bubble episodes.
I find that during both Bitcoin bubbles there are textual risk factors also testing positive for explosivity.
The first bubble episode features 12 explosive textual risk factors while the second features just 3. The
first Bitcoin bubble had explosive verbal content related to a risk factor labelled “north korea” but which
referred to hacking and cybertheft more generally, which coincided with the crash. The second bubble appeared to start during an explosive textual risk factor about “full nodes”, coinciding with an announcement
of the release of a new smartphone that could operate Bitcoin full nodes. During the crash phase of the
second Bitcoin bubble an explosive spike in an “interest rate” textual risk factor occurred as the Federal
Reserve announced interest rate falls which co-occurred with a rally in the rapidly falling Bitcoin price,
perhaps slowing the downward price adjustment Bitcoin was experiencing. I also note that both bubbles
feature a range of textual risk factors that do not feature explosivity. This methodology then is a useful
screening device for verbal content featuring explosive patterns that co-occur with explosive patterns in
the price data during bubble events (i.e. verbal factors where the associated number of conversations are
increasing explosively as the price also increases explosively). This evidence suggests that bubble episodes
tend to feature explosive increases in conversation threads on social media. Further study of textual risk
factors for a wider range of bubbles would be useful to understand whether this is a general feature of all
bubbles.
I also examine how these explosive patterns differ across different types of tweets. Tweets in my dataset
differ across a range of attributes such as the type of user posting them (large users, users representing
large companies, users with large or small number of followers), their text size in terms of number of words,
or their popularity as measured by number of retweets or favorites. Popularity of tweets is an important
indicator, as it shows whether a given tweet ultimately reaches a large audience. I examine the set of
“popular tweets” (retweets greater than zero) as well as all tweets (popular and unpopular). Interestingly
I find that when unpopular tweets are excluded fewer textual risk factors are explosive. This suggests that
measuring explosive verbal content requires an assessment of all verbal content, popular and otherwise.
Similarly, bubble episodes do not necessarily require an explosive increase in popular tweets as may be
hypothesized – i.e. a reasonable hypothesis may have been that bubble formation requires an explosive
increase in popular tweets about specific topics, without necessarily including an explosive increase in the
overall number of tweets about that topic.
This paper contributes to a literature using and developing text analysis methodologies to answer
research questions in Economics and Finance. A summary can be found in Gentzkow et al. (101). Many
papers in this literature use word count approaches to measure important economic concepts by the
frequency they are discussed, see for example Hassan et al. (116). A prominent method that I will utilize
in this paper is the LDA approach: for example Hansen et al. (114) use LDA to identify key textual themes
being discussed in FOMC meetings and associate them with the macroeconomy. A range of papers in
the finance literature use various text based methodologies among other approaches to measure investor
sentiment; this literature is summarized by Zhou (207). A key paper that I follow is Hanley and Hoberg
(112) who combine LDA with neural network approaches – approaches which develop a vector space of
words and use this vector space to measure linguistic distances between concepts. I follow this latter paper
in combining these two approaches to find textual risk factors.
The theoretical literature on financial asset bubbles is large. Papers include models of rational bubbles
(e.g. Tirole (201)), bubbles under asymmetric information (Brunnermeier (39)), herding and informational
cascades ((29), Banerjee (14)) and heterogeneous belief bubbles (Simsek (190)). The empirical literature
is smaller and more recent, examples include Greenwood et al. (108) who examine all bubbles from the
last 100 years and summarize their associated financial characteristics, as well as a number of studies
that look specifically at individual bubble events, such as Pastor and Veronesi (166), who examine the
dot-com bubble. Some work has been conducted in the laboratory: Celen and Kariv (54) show that in an
experimental setting herding and informational cascades occur often, behavior that may underpin financial
bubbles, and theorize that this behavior may occur due to deviations from Bayesian rationality in the form
of errors made by participants.
Finally, I contribute to a literature on using time series tests to try and identify bubble behavior. A
range of papers in Econometrics have debated the possibility that bubbles may follow a unit-root-type
process. In particular, Diba and Grossman (72) examine whether a finding of a unit root in the difference
of a financial asset price time series may be indicative of bubble behavior. This led to critique from e.g.
Evans (86). More recent approaches have emphasized the use of rolling “explosive”, one-sided unit root
tests on financial asset price time series data to test for bubbles as well as their start and end (Phillips et al.
(172, 170, 171)). These approaches have been developed in a series of papers changing the way in which
the rolling windows are calculated. I contribute to this literature by applying these tests to verbal content
data during bubbles to test for bubble behavior.
This paper proceeds as follows. In section 2 I describe the two datasets used in this paper, their
collection and their descriptive statistics. In section 3, I describe the text analysis methodologies used in
this paper, along with references showing that these are now established, peer-reviewed methodologies
available to economic research. I also describe the explosive unit root tests used in this paper. In section 4,
I describe the results of using the text methodologies on the Bitcoin bubble data as well as the textual risk
factors that will be studied later in the paper. In section 5, I show results of using the explosive unit root
tests on both price and textual risk factor data. Finally, section 6 concludes.
2 Data
Social media data collection poses computational and big data challenges. Twitter datasets include millions
of tweets and can cover a wide range of topics. The more tweets that are collected, the more representative the data is of the true universe of conversations occurring. However, the more tweets collected, the larger the data processing and computation burden. As a result, it is important to carefully select the dataset, trading off the informational benefit against the computational costs. In this section, I describe the data collection methodology and the relevant choices made to limit the dataset to a manageable size.
2.1 Twitter Data - Bitcoin
I collect/webscrape every tweet that included the word “Bitcoin” over the periods of two Bitcoin bubbles.[2]
The first Bitcoin bubble (hereafter the “first bubble episode”) occurred between January 2017 and April 2018, peaking at around $20k per Bitcoin in December 2017. The second Bitcoin bubble (hereafter the “second bubble episode”) occurred between December 2018 and December 2019, peaking at around $13k per Bitcoin
in June 2019. The full price progression over the whole period is shown in figure 1.3 (grey dashed line, right
axis).
The overall dataset includes over 20 million tweets from a wide range of different users, each of whom
has different quantities of followers and levels of popularity (as measured by average retweets for their
tweets). Restricting the dataset to the set of tweets including the word “Bitcoin” ensures the tweets
focus on the research question (i.e. the evolution of verbal content during these Bitcoin Bubbles). This
keyword restriction may lose some relevant tweets that influenced the price path of Bitcoin (for example if
conversations on unrelated topics led the Bitcoin conversation to move in a new direction). Nevertheless,
the data restrictions likely do capture the main conversations taking place about Bitcoin over the period
whilst remaining of a tractable size.[3]

[2] Both of these bubbles meet empirical definitions of an asset price bubble from the literature. See Greenwood et al. (108).
[3] Other research suggests that even if this dataset misses the exact origination of some discussion of Bitcoin verbal content, most of these discussions are likely to quickly show up in my dataset. In particular, Cagé et al. (48) show that information spreads across the internet very quickly, much of which is direct copies, with slight alteration, of original material. Chawla et al. (56) similarly show that information diffuses very rapidly on Twitter.

As expected, the number of Tweets increases as the price and therefore attention on Bitcoin increases. This is shown for the first Bitcoin bubble in figure 1.1, where the number of Tweets tracks the Bitcoin price movements closely. However, it is worth noting that the tweet count actually appears to increase and peak
slightly before the price does, suggesting verbal content may slightly precede price movements. More
surprisingly, key users (i.e. those that post the most Tweets including the word Bitcoin over the whole
period) do not increase their number of tweets over the period (figure 1.2 shows this, again for the first
bubble episode). These users also obtain a similar, unchanging quantity of retweets (on average over their
daily tweets) over the period. This suggests that, during a bubble episode, the Bitcoin-specific network of
users on Twitter does not necessarily gain new key users, but instead just acquires more small users.
Figure 1.1: Tweet quantity increase as Bitcoin price increases
Examining key users in more detail, I observe that this group comprises familiar news outlets such
as Business Insider and CNBC, as well as large circulation cryptocurrency specific news websites such as
CoinTelegraph, Coindesk and CCNMarkets. In addition, this group includes a small number of individual
users, each of whom have a very high quantity of followers (see Table 1.1 which shows key users, again
for the first bubble episode).
Standard text processing is applied to all the tweets, including removing stopwords and tokenizing/lemmatizing (reducing each word to its associated root token to avoid counting verbs, nouns or adjectives with the same meaning as separate concepts). Regular words that are unlikely to be informative are also removed; for example, the word “Bitcoin” by construction of the dataset appears in every tweet and so is removed.
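As an illustration of this preprocessing step, a minimal sketch along the following lines could be used; the library choices (gensim and NLTK) and the exact filter list are my own assumptions rather than the precise pipeline used here.

```python
# Illustrative tweet cleaning: lowercase/tokenize, then drop stopwords and
# uninformative dataset-wide words such as "bitcoin" (present in every tweet by
# construction). Library choices are assumptions, not the thesis's exact pipeline.
from gensim.utils import simple_preprocess
from nltk.corpus import stopwords  # requires a one-off nltk.download("stopwords")

FILTER_WORDS = set(stopwords.words("english")) | {"bitcoin", "btc", "https", "rt"}

def clean_tweet(text: str) -> list:
    """Tokenize a raw tweet and remove stopwords and filler words."""
    return [tok for tok in simple_preprocess(text, deacc=True) if tok not in FILTER_WORDS]

# Example:
# clean_tweet("Bitcoin price is exploding today")  ->  ["price", "exploding", "today"]
```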
Table 1.1: Biggest users are both well known users and prominent individual users
Figure 1.2: Most frequent tweeters quantity is less correlated with bitcoin price changes
2.2 Price Data - Bitcoin
The Bitcoin price data I use for this analysis is obtained from the Federal Reserve Economic Data (FRED)
service. This is a standard source for economic and financial data and is no different from other easily
attainable data on the Bitcoin price.
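As an illustration, a daily Bitcoin price series can be pulled from FRED along the following lines; the pandas_datareader client and the series identifier CBBTCUSD (Coinbase BTC/USD) are my assumptions about the relevant series, not a statement of the exact series used here.

```python
# Illustrative download of a daily Bitcoin price series from FRED.
# "CBBTCUSD" (Coinbase BTC/USD) is an assumed series code; the sample covers
# both bubble episodes studied in this chapter.
import numpy as np
import pandas_datareader.data as web

btc = web.DataReader("CBBTCUSD", "fred", start="2017-01-01", end="2019-12-31").dropna()
log_price = np.log(btc["CBBTCUSD"])  # log prices are convenient inputs for the explosivity tests
```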
3 Methodology
My text analysis methodology is simple and follows approaches used in peer-reviewed research. A key
issue with text data is how to turn it into numerical information that can be processed and analysed like
regular data, but that still retains the information of the original text. Hanley and Hoberg (112) do this by
combining two methodologies from computational linguistics: Latent Dirichlet Allocation (LDA; a topic
modelling approach) and Word2Vec (a neural network approach). This approach delivers two advantages.
First, it delivers an interpretable set of “textual risk factors” from the data (rather than pure LDA approaches
which struggle with interpretability). Second, the approach can use different levels of subjectivity, with
the most objective version requiring no language judgements by the researcher (beneficial to preserve
researcher independence from the results of the analysis). I utilise their approach, adapted for social media data, in this paper; the details are provided below. Further, I combine this text methodology with time series tests for explosive unit roots for bubble detection, which I describe in the final subsection.
3.1 Latent Dirichlet Allocation (LDA)
This approach uses a probabilistic assessment of word frequency across documents to derive a set of
common “topics”. It is akin to factor analysis in time series – it removes noise and focuses on continually
present factors (or “topics”) in the text data. In so doing, it provides a linguistic summary of the themes
present in the text corpus. I present a concise description of the methodology here, and provide references
to demonstrate that this methodology is now a common and accepted text analysis methodology; more
information can be found in the original Journal of Machine Learning Research paper, Blei et al. (30).
The LDA model works over a document corpus of $M$ documents (in my case Tweets) $D = \{d_1, d_2, \ldots, d_M\}$. Each document comprises a sequence of $N_d$ words $w_j$, denoted by $d_i = (w_1, w_2, \ldots, w_{N_d})$, where each $w_j$ is an $N \times 1$ vector with a 1 at the position corresponding to that word and zeros otherwise.

LDA envisages that all the words in each document are drawn from a distribution of topics which exists in general, as well as from a per-document distribution of topics. Over all the words $v_n \in \{v_1, \ldots, v_V\}$ in the text corpus, with a vocabulary of size $V$, a word in topic $k \in \{1, \ldots, K\}$ has probability $\beta_{k,n} = p(v_n = 1 \mid z_k = 1)$ of being drawn. $\beta_k$ then is a vector with entries for each word and is itself drawn from a Dirichlet distribution with a single scalar parameter, $\beta_k \sim \mathrm{Dir}(\eta)$, where $\eta$ is a scalar.[4]

In addition to the corpus-level distribution of words in the corpus vocabulary, each document also has a distribution of topics from which words in a document are drawn. Here, each document has a topic distribution parameter $\theta_d$, which is a $K \times 1$ vector showing the distribution of topics in each document. Then for each word in a document, a topic is drawn as $z_k \sim \mathrm{Multinomial}(\theta_d)$. This topic then determines which $\beta_k$ is used, which determines the probability of drawing a given word, i.e. the probability of selecting word $w_j$ in document $d_i$ is $p(w_j \mid z_k, \beta)$. Each of the $\theta_d$ values are themselves drawn from a Dirichlet distribution with hyperparameter $\alpha$, which is a $K$-vector that is estimated from the data.

Given this, it can be shown that the probability of the corpus is as shown below ($\alpha$ and $\beta$ are corpus-level parameters, the $\theta_d$ are document-level parameters, and the $z_{dn}$, $w_{dn}$ are word-level parameters drawn for each word). Using this probability, the parameters $\alpha$ and $\beta$ are estimated by maximum likelihood from the data.[5]

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \qquad (1.1)$$

[4] A Dirichlet distribution takes a vector of parameters. Here an “exchangeable” Dirichlet distribution is used where all parameters are a single scalar value $\eta$.
[5] In order to do this, various techniques are needed to circumvent the non-computability of document probabilities.
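As a concrete illustration of estimating such a model on the cleaned tweets, a minimal sketch might look as follows; gensim is assumed as the implementation, and the topic count, filtering thresholds and number of passes are placeholder choices rather than the calibrated settings of this chapter.

```python
# Illustrative LDA fit on tokenized tweets (gensim assumed; settings are placeholders).
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def fit_lda(tokenized_tweets, num_topics=10, passes=5):
    """tokenized_tweets: list of token lists, one per tweet/document."""
    dictionary = Dictionary(tokenized_tweets)
    dictionary.filter_extremes(no_below=20, no_above=0.5)  # drop very rare/common terms
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_tweets]
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics,
                   passes=passes, alpha="auto", eta="auto", random_state=0)
    return lda, dictionary

# Example: inspect the top terms of each estimated topic.
# lda, dictionary = fit_lda(cleaned_docs)
# for k in range(lda.num_topics):
#     print(k, lda.show_topic(k, topn=10))
```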
LDA has already been used widely in text analysis applications and is a peer-reviewed and accepted
method of economic analysis in academic research. Gentzkow et al. (101) in their review of the current use
of text analysis describe the prominence of LDA as a method in empirical research. Hansen et al. (114) use LDA to assess a change in the text content of FOMC minutes following a change to disclosure requirements. Finally, this method, in combination with Word2Vec described below, has also been used in the assessment of economy-wide financial risk by Hanley and Hoberg (112). It is clear then that LDA is now a valid tool of
economic analysis.
3.2 Word2Vec
Word2Vec is a neural network approach that creates a dictionary vector space in which words with a similar context, as observed within the text corpus, are located close together in the space. As a result, it is one of the few natural language processing methods that captures the linguistic importance of context. Thus, for example, with a Word2Vec model fitted on a text corpus of bigrams, it is possible to determine the 100 bigrams that are closest to the bigram “technology-bubble”. I can therefore take all bigrams from the LDA topics and for each find the 100 bigrams closest to them – each bigram's “semantic vocabulary” in the terminology of Hanley and Hoberg (112). This therefore gives a more
readily interpretable set of risk factors from the text data. As with LDA, I first provide a brief description
of the method and then provide references to show it is a peer-reviewed and accepted methodology within
economic analysis. Again further information can be found within the original paper posted on ArXiv by
Mikolov et al. (147).
The goal of Word2Vec is not to actually output the neural network it creates, but instead to output the “word embeddings”, or vectors, that are constructed as the neural network is trained. The objective of the
neural network’s training is to predict a target word from a context (i.e. the x words that precede that
word in any given sentence). The word vectors that are created to achieve this objective are what we use
to represent the words in the dictionary. The first step is to “one hot encode” all words by giving them a
vector representation which has entries one for that word and zero for all others in the vocabulary. These
vectors have a very large dimensionality since they need one dimension per word. These original vectors
then are transformed through a series of “layers” by weighting matrices and functions, where the weights
are trained through repeated iterative adjustment to minimize a loss function, which attempts to predict
the target word from its context.[6]

[6] This is the Continuous Bag of Words (CBOW) approach, compared to the alternate Skip-Grams approach, which predicts the context from a target word. In this paper, I use the CBOW approach since it is faster to train and better at defining common words. The Skip-Grams approach is better for tasks that require more precision in understanding uncommon words.
More specifically, the neural network has three layers: (1) the input layer which consists simply of
the one-hot encoded word vectors; (2) the hidden layer which consists of reduced dimensionality vectors
for each word; and (3) an output layer which contains the predicted target word. By using context words
within the data, passing them through the neural network and comparing the neural network’s output to
the target word in the data, the training process iteratively estimates the weighting matrices.
The hidden layer is obtained from the input layer by a simple matrix calculation $h = W^{T} x$, where $h$ is the hidden layer vector, $W$ is the weighting matrix and $x$ is the original one-hot encoded vector. Although this $h$ is not the objective of the neural network's training, it is the objective of the Word2Vec process, which seeks to represent each word in the vocabulary as a vector in the context-specific vector space.

The output layer is obtained from the hidden layer by the combination of a matrix operation and a function. Here the weight matrix is $W'$ and the function creating the output layer vector is as below. In this equation, $u_j$ is the vector resulting from matrix multiplication of the hidden layer vector by the weight matrix: $u_j = v'^{T}_{w_j} h$.

$$p(w_j \mid w_I) = y_j = \frac{\exp(u_j)}{\sum_{j'=1}^{V} \exp(u_{j'})} \qquad (1.2)$$
The loss function is obtained by maximizing the probability of the target word given the context words. Using stochastic gradient descent, a transition equation is obtained from this loss function for the weights to iteratively transition to best meet the objective of the neural network. Using the data from the text corpus to estimate the weights, this optimization will eventually converge, with the resulting hidden layer used to find the optimal word vectors for every word in the vocabulary, completing the Word2Vec. Finally, by looking at the cosine similarities between words, we obtain a context-specific comparison of words to all other words in the vocabulary. Usually a number of algorithmic approaches (hierarchical softmax, or negative sampling) are combined with this basic architecture to improve the training speed.
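A minimal sketch of training such bigram embeddings and querying nearest neighbours is given below; gensim's CBOW Word2Vec is assumed, and the bigram construction, vector size and window are illustrative choices rather than the calibrated settings.

```python
# Illustrative CBOW Word2Vec fit on bigram-tokenized tweets (gensim assumed).
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

def train_bigram_word2vec(tokenized_tweets, vector_size=100, window=5):
    """tokenized_tweets: list of token lists; returns the bigram transformer and model."""
    bigram = Phraser(Phrases(tokenized_tweets, min_count=20, threshold=10.0))
    bigram_docs = [bigram[doc] for doc in tokenized_tweets]
    model = Word2Vec(sentences=bigram_docs, vector_size=vector_size, window=window,
                     min_count=20, sg=0, workers=4, seed=0)  # sg=0 selects CBOW
    return bigram, model

# Example: the closest bigrams to a seed bigram form a candidate semantic vocabulary.
# bigram, w2v = train_bigram_word2vec(cleaned_docs)
# w2v.wv.most_similar("technology_bubble", topn=100)
```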
Word2Vec is newer and less widely used in economics than LDA, but it has been used in peer-reviewed articles. As with LDA, (101) describe Word2Vec and its uses in the economic literature so far in their review of the current use of text analysis. The method, in combination with LDA, has also been used in the assessment of economy-wide financial risk by Hanley and Hoberg (112). As with LDA, Word2Vec is now a valid tool of economic analysis.
3.3 Combining LDA with Word2Vec to create text scores
Finally, I follow Hanley and Hoberg (112) by turning word themes into daily prevalence scores. This allows me to quantitatively measure periods when certain themes are in the ascendancy versus when they are waning, and therefore creates a novel dataset of how an asset price bubble evolves through its lifecycle. As noted by Greenwood et al. (108), "there is much more to a bubble than a mere security price increase: innovation, displacement of existing firms, creation of new ones, and more generally a paradigm shift as entrepreneurs and investors rush towards a new El Dorado." This methodology allows us to measure and quantify the inner workings of these complex economic and social phenomena in more detail than has previously been possible.
Firstly, LDA and Word2Vec are combined. I construct an LDA model with ten topics for each of four periods in the life-cycle of each asset price bubble separately (i.e. forty fitted topics in total). For each of these forty topics, I take the top ten representative bigrams (i.e. 400 bigrams in total) as bigrams that describe prominent thematic concepts within the overall text corpus. These 400 bigrams are then screened for suitability – bigrams that are ambiguous, have no clear economic meaning, contain names or are very semantically similar to other bigrams in the set are removed.[7] For each of the remaining bigrams, I find its semantic vocabulary by taking the 100 closest bigrams in the Word2Vec-constructed vector space (i.e. the bigrams judged to have the closest semantic context to the bigram according to the neural network constructed by Word2Vec). These dictionaries form the set of "textual risk factors".
[7] To remove semantically similar bigrams, I remove bigrams that have semantic vocabularies that are very highly correlated. The correlation of two semantic vocabularies is calculated by the Word Mover's Distance (WMD) method, widely accepted to be the best method of measuring the closeness of two text corpora (see Kusner et al. (135)).
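A minimal sketch of this pipeline – extracting the top bigrams per LDA topic, building a 100-bigram semantic vocabulary with Word2Vec, and screening near-duplicate risk factors with WMD – is given below. It assumes hypothetical pre-fitted gensim models (`lda`, `w2v`) over the bigram-tokenized tweet corpus; the function names and the WMD threshold are placeholders.

```python
from gensim.models import LdaModel, Word2Vec  # `lda` and `w2v` below are assumed pre-fitted

def top_bigrams(lda: LdaModel, n_terms: int = 10) -> dict:
    """Top `n_terms` bigrams and their topic weights for each fitted LDA topic."""
    return {topic_id: lda.show_topic(topic_id, topn=n_terms)
            for topic_id in range(lda.num_topics)}

def semantic_vocabulary(w2v: Word2Vec, bigram: str, size: int = 100) -> list:
    """The `size` closest bigrams in the Word2Vec vector space
    (returns (bigram, cosine similarity) pairs)."""
    return w2v.wv.most_similar(bigram, topn=size)

def too_similar(w2v: Word2Vec, vocab_a: list, vocab_b: list, threshold: float = 0.5) -> bool:
    """Screen near-duplicate factors: Word Mover's Distance between the two
    semantic vocabularies (smaller = more similar); the threshold is illustrative.
    Note: wmdistance may require an optimal-transport backend (e.g. POT),
    depending on the gensim version."""
    words_a = [w for w, _ in vocab_a]
    words_b = [w for w, _ in vocab_b]
    return w2v.wv.wmdistance(words_a, words_b) < threshold
```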
These bigrams are then shaped into matrices measuring their prevalence in the text corpus. First, a vector $W_{tk}$ is constructed for each day $t$ and each risk factor $k$, whose length is the intersection between all bigrams contained in day $t$'s text corpus and all bigrams in the semantic vocabulary for theme $k$. For each bigram, this vector records how frequently it appears in day $t$'s text corpus (for those bigrams in theme $k$ that do not appear in the corpus for day $t$, the vector entries are set to 0).

Second, another vector $T_{tk}$ is formed for each day $t$ and each theme $k$, which is the same length as $W_{tk}$. However, for each bigram this vector records the weighting of the bigram in its LDA topic. For example, if the first three bigrams in the first LDA topic are "bubble burst", "price high" and "transaction cost" with probabilities of 0.30, 0.18 and 0.15 respectively, then these probabilities populate the entries of $T_{tk}$ corresponding to these bigrams. All bigrams not appearing in risk $k$'s vocabulary have a zero at their entry. This vector therefore captures the importance of the bigram in the text corpus as measured by the LDA model.
The two matrices are then combined to create a daily, per-risk-factor metric $S_{tk}$, as below. This captures the prevalence of that particular theme on any given day based on the text corpus, and can as a result be tracked and monitored throughout each bubble period.

$$ S_{tk} = \frac{W_{tk}}{\lVert W_{tk}\rVert} \cdot \frac{T_{tk}}{\lVert T_{tk}\rVert} \qquad (1.3) $$
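A minimal sketch of how equation (1.3) could be computed for one day and one risk factor is given below; the inputs (a day's bigram counts and the LDA topic weights for the factor) are hypothetical.

```python
import numpy as np

def daily_score(day_counts: dict, topic_weights: dict) -> float:
    """Equation (1.3): score between the day's bigram frequencies (W_tk) and the
    LDA topic weights (T_tk) over the risk factor's semantic vocabulary.
    `day_counts` and `topic_weights` are placeholder dicts mapping bigram -> value."""
    vocab = sorted(topic_weights)                            # bigrams in risk factor k's vocabulary
    W = np.array([day_counts.get(b, 0.0) for b in vocab])    # frequency in day t's corpus (0 if absent)
    T = np.array([topic_weights[b] for b in vocab])          # LDA weight of each bigram
    if not W.any():
        return 0.0                                           # theme entirely absent on day t
    return float(np.dot(W / np.linalg.norm(W), T / np.linalg.norm(T)))

# Toy usage with illustrative numbers (not taken from the paper):
counts = {"bubble_burst": 12, "price_high": 7}
weights = {"bubble_burst": 0.30, "price_high": 0.18, "transaction_cost": 0.15}
print(daily_score(counts, weights))
```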
3.4 Explosive unit root test
The explosive unit root test was developed in Phillips et al. (172) and Phillips et al. (170). The test uses the following equation:

$$ p_t = \mu_x + \delta p_{t-1} + \sum_{j=1}^{J}\phi_j\,\Delta p_{t-j} + \epsilon_{p,t}, \qquad \epsilon_{p,t}\in NID(0,\sigma^2) \qquad (1.4) $$

Here $\mu_x$ is a constant term, $p_t$ is the asset price with $\Delta$ representing first differences, $\delta$ is the coefficient of interest, the $\phi_j$ are a set of additional parameters and $\epsilon_{p,t}$ is the error term. The null hypothesis is that the series follows a standard unit root ($\delta = 1$; equivalent to the standard Augmented Dickey-Fuller test). The alternative hypothesis is that the series contains a right-tailed explosive unit root ($\delta > 1$). If the null hypothesis is rejected then the series shows evidence of an explosive unit root, which is defined in Phillips et al. (172) to be equivalent to bubble behavior, "irrational exuberance" or prices exceeding their fundamental value.
The methodology follows Phillips et al. (170) in applying the test over a moving window to identify
when the period of explosivity begins and ends, as well as whether the overall period in question contains
evidence of explosive behavior. Window and other parameter selection is conducted as in the above study.
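As an illustration of the moving-window procedure, the sketch below computes a rolling ADF-type t-statistic for $\delta$ using statsmodels. It is a simplified stand-in for the full right-tailed test of Phillips et al. (170), and the window length, lag order and (simulated) right-tail critical values are assumptions rather than the paper's settings.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def rolling_explosive_stat(prices: pd.Series, window: int = 90, lags: int = 1) -> pd.Series:
    """Rolling ADF t-statistic on delta (regression with a constant). Under the
    right-tailed alternative delta > 1, large POSITIVE values indicate explosive
    behaviour; they must be compared with simulated right-tail critical values,
    not the usual left-tail ADF tables."""
    stats = {}
    for end in range(window, len(prices) + 1):
        segment = prices.iloc[end - window:end]
        # adfuller returns (t-stat, p-value, usedlag, nobs, critical values, ...)
        stats[prices.index[end - 1]] = adfuller(segment, maxlag=lags,
                                                regression="c", autolag=None)[0]
    return pd.Series(stats)

# Hypothetical usage with a placeholder price series `btc`:
# btc = pd.read_csv("bitcoin_prices.csv", index_col=0, parse_dates=True)["close"]
# stat = rolling_explosive_stat(np.log(btc), window=90)
```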
4 Text implementation
The text analysis approach described in sections 3.1–3.3 is applied to each bubble episode. In this section, I describe the output of this approach for each episode.
4.1 Identifying risk factors during the first Bitcoin bubble
The initial LDA model is run on all four bubble periods. This produces a total of 40 LDA topics (10 per period), with 10 bigrams and their associated probabilities of occurring collected for each (400 bigrams in total). These 400 bigrams are then filtered as described in section 3.3. After filtering, 41 bigrams remain, which then form the headers for each textual risk factor, implying 41 textual risk factors for the first bubble episode. For illustration, a snippet of the LDA themes at this stage for the first bubble episode is shown in table 1.2, with the complete table provided in the appendix.
I present an example subset of the 41 textual risk factors (selected as those that later test positive for explosivity), along with the first ten entries of their associated semantic vocabularies, in table 1.3 for illustration (the full table for all factors is shown in the appendix). "Dist" refers to the cosine distance of the bigram in question to the main verbal content factor, where 1 would be an identical bigram. Although the additional bigrams listed form the risk factor's "semantic vocabulary", these bigrams are actually just those with a "close" cosine distance and therefore have an independent meaning of their own. The aim of the measure described in Section 3.3 is simply to quantify the extent to which all meaning related to the textual risk factor appears on a given day. The use of a close bigram in a tweet constitutes an appearance of that textual risk factor's meaning.
Table 1.2: Post filtering LDA factors, first bubble episode

The textual risk factors range from those discussing quite specific textual meanings such as "technical analysis" or "credit card", to those describing much broader or more abstract concepts such as "bubble burst" or "new year". Some of the factors identified are more economic, such as "return investment", while others are more geopolitical, such as "north korea" or "chinese exchange". I will describe later how each of these can be used to identify different important events during the bubble episode, such as the "north korea" risk factor, which does not just relate to verbal content about North Korea but features many tweets discussing North Korean hacking and theft of Bitcoin; these were important features affecting the Bitcoin price, as well as investors' perceptions of the safety of Bitcoin, during the bubble episode.
Finally, as described in section 3.3, I generate a time series for each of the 41 textual risk factors.
These time series measure occurrence of verbal content associated with each risk factor over time during
the bubble episode. This time series data then gives us a previously unseen view into the verbal content
circulating during the Bitcoin bubble. Examples of these time series are shown in the appendix.
Table 1.3: Examples of 41 final textual risk factors: 1st bubble, most explosive
4.2 Identifying risk factors during the second Bitcoin bubble
Applying the same process as in the previous section for the second bubble episode, I obtain 68 textual risk
factors after the filtering stage. A selection of these, again those that later turn out to be most explosive,
are shown in table 1.4. Some of the textual risk factors for the second bubble episode were also present
during the first: “time high”, “technical analysis” and “hard fork” all appeared in table 1.3 for the first
bubble episode and are also textual risk factors for the second bubble episode.
However, many new risk factors are also identified. This illustrates one of the benefits of this methodology, which is that it dynamically updates the risk factors for individual episodes of financial instability – in this case, bubble episodes. In the same way, previously important risk factors such as "North Korea" are no longer important in the second episode, reflecting that North Korean hacking activity had receded as an important source of verbal content prior to the second episode.
Semantic vocabularies do differ across episodes. For example, even though "time high" appears as a risk factor for both episodes, it may have a slightly different semantic vocabulary in the first compared to the second. This occurs because semantic vocabularies are defined on episode-specific corpora, reflecting that the usage of these terms changes over time and from episode to episode. However, definitions, particularly of more stable bigrams, should not change much over a 2–3 year period. Indeed, of the 10 top bigrams of "time high"'s 100-word semantic vocabulary shown in table 1.3, 3 of these terms remain high up in its vocabulary during the second episode.
Table 1.4: Examples of 68 final textual risk factors: 2nd bubble, most explosive
5 Testing for explosive behavior
5.1 Explosive behavior in Bitcoin prices
I first run the time series tests for explosive behavior on the price series. If the explosive unit root test indeed captures "irrational exuberance" or financial asset bubbles, it should test positive during the two large upswings, as well as the two crashes, that occur in the Bitcoin price for the two episodes identified and examined in this study. I apply the test as described in equation 1.4.
Figure 1.3 shows the price data (grey dashed line) as well as the moving test statistic (black line) and
associated test critical thresholds (blue dashed line). The figure shows that the test statistic does indeed
exceed the critical values for the period of high acceleration and deceleration of the Bitcoin price during
both bubble episodes. This indicates that for these periods the null of a standard unit root is rejected in
favor of an explosive unit root. I note that the test statistic is elevated for more of the crash period of the
first bubble episode than the second. This appears to be because some of the “crash” in the second episode
is more gradual, perhaps reflecting a more gradual readjustment of the Bitcoin price to its new, lower level.
In addition to the main bubble periods, the test also identifies explosivity twice in the period before the first bubble episode (once around May/June 2017 and once for a shorter time around August 2017). Similarly, the test statistic is elevated in the period before the second bubble episode in November 2018 (during a sharp drop in the Bitcoin price from around $6,250 to $3,750). It therefore seems common for short bouts of explosive activity to occur in the asset price in the period prior to the main bubble episode. Price explosivity may therefore be more of a necessary though not sufficient condition for a bubble episode to occur.

Figure 1.3: Explosive behavior in Bitcoin prices: 2017-2020
Overall, the evidence in this section validates the use of this statistical test to identify bubbles in the Bitcoin price data. Both of the episodes identified in this study are generally viewed to have been bubble instances. Further, in both episodes the test statistic rises above the critical value in advance of the bubble peak, suggesting it provides a forward-looking test of whether a bubble might be about to occur.
5.2 Explosive behavior in Risk Factors: bubble episode 1
In this section I examine whether, in addition to price data indicating explosivity during the first Bitcoin bubble episode, textual risk factors also show evidence of explosivity. Since new textual risk factors are identified for each individual bubble episode, the time series data indicates the unique set of textual mechanisms that are at play during these bubbles. Further, testing which of these mechanisms become explosive can indicate which textual features are most prominent and most associated with explosive rises in these bubble episodes.
Figure 1.4 shows the results of applying the explosive unit root test, described by equation 1.4, to the
whole first bubble period January 2017 to April 2018. This is applied to the dataset including all tweets
– not just those with non-zero retweets. The chart therefore tests for evidence of explosivity at any time
during the bubble period. Each green circle represents one of the textual risk factors described in section
4.1. The black line (solid and dashed) represents the critical value at different significance levels. The figure
shows that 12 of 30 textual risk factors exhibit explosive behavior at some point during the first bubble
episode. Textual risk factors “north korea”, “bubble burst” and “trade trade” have the largest test statistics.
A range of risk factors are indicated as being explosive during this bubble episode. The "north korea" factor, which may also be thought of as a "hacking" risk factor, becomes explosively elevated in December 2017 as a story emerges that North Korea was engaged in cyber-stealing of Bitcoin as well as cyber attacks on Bitcoin exchanges. The stories spread rapidly on social media, as well as other more traditional news media, and are likely associated with price falls as investors become worried about hacking risks associated with Bitcoin. For example, a CNN article from the time states both: "North Korean hackers targeted four different exchanges that trade bitcoin and other digital currency", whilst also saying: "It [Bitcoin] has also proved popular in the past with criminals because of the amount of anonymity".[8] A further interesting textual risk factor is the "ounce gold" factor, which became explosive around October 2017 and co-occurred with early rises in the Bitcoin price. This factor described the Bitcoin price reaching the milestone of becoming more valuable than an ounce of gold which, from tweet evidence, appeared to be associated with investors viewing Bitcoin as having become a "valuable" asset. This new perception of Bitcoin was likely associated with optimism and trust by investors. Such an analysis of individual textual risk factors demonstrates that this approach can be used to dig into the important, individual elements of verbal content occurring during any bubble episode.
Some, though not all, of these episodes of explosivity co-occur with the period of high price appreciation/depreciation during the bubble episode. For example, as shown in figure 1.5, the "north korea" textual risk factor has a test statistic above the critical threshold at the beginning of, and during, the crash phase of the first bubble episode. In fact, the test statistic begins to spike sharply over a week before the crash phase begins, suggesting the statistic itself leads the relevant phase. Future work may wish to categorise textual risk factors by whether their explosivity occurs before, during or after the period of high price acceleration.

[8] See: "North Korea may be making a fortune from bitcoin mania", CNN, 13th December 2017.

Figure 1.4: Explosive behavior in textual risk factors: 1st Bitcoin bubble, 2017-2018

Figure 1.5: "North Korea" explosive textual risk factor
5.3 Explosive behavior in Risk Factors: bubble episode 2
In this section, I follow a similar analysis of explosive textual risk factors during the second bubble episode, as well as contrasting the textual risk factors across episodes. Figure 1.6 shows the same chart as in the previous section, but tests for explosivity over the whole second Bitcoin bubble episode in 2019. However, a key difference from Figure 1.4 is that these textual risk factor data series are recovered from a dataset including only tweets with non-zero retweets.
The tests on risk factors recovered only from tweets with non-zero retweets are less explosive in general. In the tests for the second Bitcoin bubble, only three textual risk factors reject the null of a unit root when testing over the whole period. This is in contrast to the tests for the first Bitcoin bubble. Therefore the data including all tweets, including those with zero retweets, appears to be an important part of identifying explosive behavior. Note, however, that this conclusion is suggestive: a full analysis of this question would require a comparison between the textual risk factors recovered with and without zero-retweet tweets for the first Bitcoin bubble, and/or for the second Bitcoin bubble.
Of the textual risk factors that do show explosivity, "interest rate", "current price" and "hard fork" have the highest test statistics. "Interest rate" becomes explosive briefly after a drop from the peak of the Bitcoin price in June 2019 and is associated with a short rally in the price during August 2019 (see figure 1.7). This is associated with verbal content discussing an interest rate reduction announcement by the Federal Reserve on 31st July 2019 – interest rate falls are likely associated with Bitcoin price rises, as investors can borrow cheaply to invest in or speculate on Bitcoin. The other two textual risk factors occur at the very end of this bubble episode.

Figure 1.6: Explosive behavior in textual risk factors: 2nd Bitcoin bubble, 2019

Figure 1.7: "Interest Rate" explosive textual risk factor
As with the first bubble episode, some textual risk factors have explosivity which co-occurs with the period of price explosivity. Even though explosivity tests over the whole period of the second bubble episode do not indicate much explosivity, looking at the rolling test statistic reveals that some textual risk factors are still explosive during the second bubble period. In particular, the "full node" factor becomes explosive in May 2019 as the Bitcoin price begins to explode (see figure 1.8). A full node is a Bitcoin-specific term referring to a program run by individuals that validates sets or blocks of transactions for the Bitcoin blockchain. This explosivity coincided with an announcement by smartphone company HTC that they were releasing a new technology: the first smartphone capable of running a full Bitcoin node. The textual risk factor suggests this was the explosive announcement that allowed the Bitcoin price to start exploding during this Bitcoin episode.

In general, this finding suggests a general, objective approach to uncovering what starts explosive increases in asset prices at the beginning of bubbles: first, recovering textual risk factors; then, finding textual risk factors that enter periods of explosivity around the time the asset's price enters a period of explosivity.

Figure 1.8: "Full Node" explosive textual risk factor
6 Conclusion
I use computational linguistics methods to analyse verbal content on social media during two bubble episodes. From this data, I identify a range of "textual risk factors" which measure separate sets of verbal content occurring during these bubble episodes. I then apply explosive unit root tests, developed to identify bubbles in price data, to identify explosive verbal content. I find the following:
1. Both Bitcoin bubble episodes feature explosive price behavior with distinct start and end dates.
2. During both Bitcoin bubbles there are textual risk factors also testing positive for explosivity. This
evidence suggests that bubble episodes tend to feature explosive increases in conversation threads
on social media.
3. Not all textual risk factors feature explosivity.
4. When unpopular tweets (measured as those that receive no retweets) are excluded, fewer textual risk factors are explosive. This suggests bubble episodes do not necessarily require an explosive increase in popular tweets.
5. Regarding Bitcoin specifically, I find that the second bubble episode may have begun from explosive price rises co-occurring with an explosive increase in verbal content related to the release of smartphones that can hold full nodes – i.e. smartphones that can be used to decentrally update and process the Bitcoin blockchain. No such verbal content is detected at the beginning of the first episode.
6. Regarding Bitcoin crashes, the first episode's crash featured explosive reductions in the price that co-occurred with explosive verbal content about "north korea" and its hacking and theft activities, potentially alerting users to the security concerns associated with Bitcoin. By contrast, the second bubble episode featured a much more gradual reduction, which was interrupted by a rebound co-occurring with an explosive increase in the "interest rate" risk factor as the Federal Reserve announced a further reduction in interest rates in July 2019.
This methodology is therefore a useful screening device for verbal content featuring explosive patterns that co-occur with explosive patterns in the price data during bubble events (i.e. verbal factors where the associated number of conversations is increasing explosively as the price also increases explosively). Further study of textual risk factors for a wider range of bubbles would be useful to understand whether this is a general feature of all bubbles.
Appendix

Figure 1.9: Post filtering LDA factors

Figure 1.10: 41 textual risk factors during Bitcoin Bubble period (note axes differ)
Chapter 2
Disease-economy trade-offs under alternative epidemic control strategies
Co-authored with Antonio M. Bento (University of Southern California), Daniel Kaffine (University of
Colorado Boulder), Akhil Rao (Middlebury College) and Ana I. Bento (Indiana University)
1 Introduction
To date, over 448 million individuals have been infected with SARS-CoV-2 and more than 6 million have died worldwide, with around fifteen percent of these deaths occurring in the United States; only around 50% of the world's population has received at least one vaccination (77). The pandemic also
triggered the sharpest economic recession in modern American history. According to the US Department
of Commerce, during the second quarter of 2020 US Gross Domestic Product shrank at an annual rate
of 32.9 percent (44). The COVID-19 pandemic’s global repercussions exposed a need for coupled-systems
frameworks that link epidemiological and economic models and assess potential disease-economy trade-
offs. Such frameworks allow individuals’ adaptive responses to infection risks to be captured and can
reveal important features of control strategies, such as the role of targeted isolation strategies that can
overcome the fundamental coordination failure between infectious and susceptible individuals that drives
the economic recession.
Broadly, four areas of study have informed the assessment of control strategies. Epidemiological studies
evaluate disease dynamics and consider the heterogeneity of impacts resulting from control strategies
(129; 97; 52; 175; 176; 204; 6; 5; 142; 140). Epi-economics studies consider the micro-foundations of human
behavior as drivers of the disease, as well as the costs and benefits of alternative control strategies (102;
134; 92; 90; 16; 169; 19; 107; 199; 200; 139). An emerging literature on the macroeconomic consequences of
pandemics considers the impacts of COVID-19 and various control strategies, either by embedding these
behaviors in a broader economy with disease dynamics (83; 2) or by conducting detailed macroeconomic
projections in the absence of disease dynamics (88; 69). In addition, numerous statistical analyses have
examined the relationship between disease-related behaviors and economic activity (59; 110; 157). Several
knowledge gaps remain. For example, structurally mapping economic activities to contacts in a tractable
fashion that retains the underlying heterogeneity of the population presents various challenges. One major
challenge is how to calibrate this mapping using epidemiological social contact surveys, which contain data
on potentially disease-transmitting contacts between individuals. Further, detailed individual economic
behavior and epidemiological transmission mechanisms have typically not been embedded into models
that consider the broader economy. Finally, the set of control strategies considered in coupled-systems
models remains limited and overly-simplified. To date, these models have not included individual-focused
targeted isolation strategies, and the conditions under which these may overcome disease-economy trade-
offs are unknown.
To address these gaps, we develop a tractable coupled epi-economic model of individual microeconomic
behavior embedded within the broader economy. Fig.2.1 presents a schematic representation of our model.
Dynamic, forward-looking consumption and labor-leisure choices that account for the risk of infection are
made by either (a) decentralized individuals or (b) coordinated policy interventions, in order to maximize
perceived well-being (utility). These choices generate contacts which evolve endogenously in the model—
i.e., contact rates affect and are affected by the disease dynamics. Depending on the activity, contacts
can be avoidable or unavoidable. For example, in the U.S. economy, the average individual has around 7.5
contacts at their place of work during an 8-hour workday. These are avoidable contacts if the individual can
alter their labor supply. In contrast, contacts such as those that occur at home are unavoidable and carry
a risk of infection (109). For analytical tractability, the framework initially assumes individuals have full
information about their health status, and the presence of pre-symptomatic and asymptomatic individuals
is reflected in the calibration of productivity losses as only individuals with no or mild symptoms will be
able to work.
We model choices as reflecting individual preferences over time spent working/not working, how much
to consume, and how to balance infection risk against the need to work for money and consume for well-
being. The model solves this dynamic optimization problem of balancing risk and activity at the individual
level and aggregates the solution across the population in order to determine economic recession and
disease outcomes. This allows for direct calculation of disease-economy trade-offs that are grounded in
individual behavior. Solutions to these types of dynamic optimization problems describe forward-looking
individual behavior (193), which is critical to modeling expectations during a pandemic. This does impose
steep computational costs as the dimension of the state space increases (22), necessitating simplifying
assumptions for elements not central to our analysis.
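To illustrate the nature of the coupling just described – behavior responding to infection risk, and contacts (and hence transmission) responding to behavior – the following is a deliberately stripped-down sketch. It replaces the dynamic optimization with an ad hoc behavioral rule and uses assumed parameter values, so it illustrates the feedback loop only and is not the model developed in this chapter.

```python
import numpy as np

def simulate(days=300, beta_c=0.3, beta_l=0.2, beta_u=0.1, gamma=1/7, risk_aversion=20.0):
    """Toy coupled SIR-activity loop with assumed parameters (illustrative only)."""
    S, I, R = 0.999, 0.001, 0.0
    activity_path = []
    for _ in range(days):
        # Ad hoc behavioral rule: consumption/labor activity falls with prevalence.
        activity = 1.0 / (1.0 + risk_aversion * I)
        # Consumption and labor contacts are avoidable (scale with activity);
        # "unavoidable other" contacts are not.
        force_of_infection = (beta_c * activity + beta_l * activity + beta_u) * I
        new_infections = force_of_infection * S
        S, I, R = S - new_infections, I + new_infections - gamma * I, R + gamma * I
        activity_path.append(activity)   # crude proxy for daily economic activity
    return np.array(activity_path)

gdp_path = simulate()
print("trough of activity (share of normal):", gdp_path.min().round(3))
```

In the actual model the activity response is derived from forward-looking utility maximization rather than imposed as a rule of thumb.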
In a decentralized choice setting, infectious individuals put susceptible individuals at risk and bear no direct
consequences for this imposition (in economics terminology, an infectious individual creates a negative
externality on a susceptible individual). Thus, a key challenge for infection control and avoiding economic
losses is the inability of susceptible individuals to coordinate with infectious individuals and encourage
them to reduce their activities and contacts (208). Absent such coordination, susceptible individuals bear
the full burden of adjusting their consumption and labor choices to minimize personal risk—we refer to
this decentralized control strategy as “voluntary isolation” (a behavior documented in prior epidemics such
as H1N1 (93), (20)). While not a policy intervention, this is still a “control strategy” because susceptible
individuals manage their own infection risk. In contrast, a true “no control” strategy would (unrealistically)
entail individuals making no adjustments to their behavior in response to infection dynamics (see SI 2.5).
Recognizing this coordination failure, we consider a policy intervention whereby a governing body (a
social planner in economics) optimally coordinates labor and consumption choices in order to maximize
aggregate well-being (utility), while still accounting for individual preferences. The “social planner” is a
commonly-used methodological construct in economics to identify optimal strategies and inform policy
design. This optimal coordination of labor and consumption generates a control strategy that targets
infectious individuals—“targeted isolation”. In a world where the coordination failure is resolved, e.g.,
by paying infectious individuals to isolate, susceptible individuals can still consume, work, and engage in
contacts, minimizing individual economic losses and the resulting recession. Importantly, to illustrate the
general benefits of such targeting strategies, we abstract from many aspects of individual heterogeneity,
often captured in epidemiological studies (e.g., (155; 24; 6; 46)). This is reasonable, since the fundamental
coordination failure is itself independent of heterogeneity (see SI 2.6).
In real-world terms, a targeted isolation policy encompasses interventions aimed at encouraging infectious
individuals to isolate themselves, with susceptible individuals isolating only if necessary to suppress infection
growth. Such a policy could contain features like incentive payments to encourage individuals to obtain
tests following possible exposure or symptoms, incentive payments for individuals to isolate following
positive tests (e.g., compensation for lost wages), randomized compliance checks and penalties for individuals
caught breaking isolation, etc. Implementation of such targeted isolation policies may be imperfect; we
examine compliance scenarios to assess the degree to which such issues may limit the performance of
targeted isolation policies.
We calibrate our model to pre-pandemic economic and social mixing data, using 2017 contact survey
data from (175) and next-generation matrix methods (73) to generate a contact function linking different
economic activities to contacts and to calibrate the transmission rate (see SI 2.3). Alternative parameter
choices consider the role of online shopping and work that may have altered the underlying relationships
between activities and contacts (see SI 5.4). The results described below hold across a wide range of
plausible economic and epidemiological parameters.
We study the disease-economy trade-offs that result from three alternative control strategies: voluntary
isolation, targeted isolation, and a blanket lockdown (see Materials and Methods). Under voluntary isolation,
decentralized individuals continue to optimize their personal behavior based on preferences and health
status. Some may isolate, others will not. By contrast, a policy of targeted isolation of infectious individuals
is able to address the coordination failure, effectively separating susceptible individuals from infectious
ones. Finally, to contrast these results against a commonly imposed control strategy, we also consider
a blanket lockdown, whereby all individuals are forced to isolate, independent of their disease status.
In the U.S., for example, by April 15 2020, more than 95% of the population was under a stay-at-home
order (110; 62); such social distancing policies have been noted for their large economic costs and social
disruption. Our focus is on the behavioral channels each control strategy utilizes to deliver disease control
and economic benefits. Our coupled model makes it possible to identify these optimal strategies, and
our calibration of key economic and epidemiological variables makes it possible to examine and quantify
differences between these control strategies.
Our first contribution is the development of the tractable coupled epi-economic model described above
that highlights the mechanisms and benefits that targeted isolation strategies have the potential to deliver.
As noted, this model must prioritize the key mechanisms (e.g., the link between contacts and economy; the
coordination problem) and impose valid assumptions to remain tractable in the face of steep computational
costs. Of course in practice, implementing such targeted isolation strategies comes with challenges, particularly
as our key assumption of full information regarding disease status may not hold in the early stages of
an emerging disease epidemic. Thus, our second contribution is to apply the above model to consider a
number of important frictions in a tractable way—for example, we allow for initial tests to be slow and
of low specificity and sensitivity, which then improve over the course of the epidemic. This application
demonstrates how our modeling approach can be applied to address some of the challenges associated
with integrated disease models identified in (126).
We show that the widely-used control strategies of voluntary isolation or blanket lockdowns suppress the
epidemic nearly as effectively as targeted isolation, but are economically costly and impose a much deeper
recession. Targeted isolation strategies avoid these sharp disease-economy trade-offs by incentivizing
infectious individuals to isolate. This allows susceptible individuals to continue to consume and work,
carrying the economy through the epidemic with a milder recession. Using targeted isolation strategies
instead of voluntary isolation strategies can avert substantial costs—up to the order of $3.5 trillion in
averted recessionary losses. Importantly, we show that relevant frictions (testing information, compliance)
can erode some of the gains from targeted isolation, but availability of high-quality tests unlocks the
majority of the benefits of targeted isolation. An implication of these findings is that the relative merits
of targeted isolation versus blanket lockdowns at any given point in time depend on the test environment
and other features of the emerging disease system.
In the following sections, we begin by abstracting from information- and compliance-related frictions in
order to illustrate the key model mechanisms and our methodological contributions. That is, we assume
all agents perfectly know their health status and the current distribution of health statuses across the
population, and fully comply with all policy mandates. Having established the underlying mechanisms,
we then analyze model applications that introduce lags in test reporting, uncertainty due to limited test
quality, and partial compliance with lockdown strategies and targeted isolation.
2 Results
2.1 Methodological contributions
Our first contribution is methodological: we construct a data-driven, theoretically-consistent coupled epi-
economic model which can be used to study important properties of novel pathogens in economies. We
emphasize two core model features. First, regardless of control strategy, in our model the SARS-CoV-2
epidemic spreads rapidly in the population, with peak daily incidence early in the epidemic (Fig.2.2A &
C), and final proportions of the population exposed (Fig.2.2C) are largely unaltered. A plausible blanket
lockdown designed to minimize total cases (see SI 4.1) can indeed reduce cases relative to targeted or
voluntary isolation strategies, however it leads to a rebound (Fig.2.2A) when the lockdown is relaxed.
This rebound is observed in all blanket lockdown scenarios considered, including when the lockdown is
combined with additional non-pharmaceutical interventions such as shifting to more online activity—SI
Fig.S7, Fig.S11 & Fig.S13. In the discussion we describe how blanket lockdowns may still have useful
complementarities with targeted isolation despite the potential for rebounds. All “control” strategies
nevertheless significantly outperform a “no control” strategy where neither individuals nor a social planner
optimize behavior (shown in SI Fig.S3). Second, under a targeted isolation strategy, disease control does
not come at as large an economic cost as under voluntary isolation or a case-minimizing blanket lockdown
(Fig.2.2B & C). In aggregate terms, targeted isolation converts an historically-severe recession (66% peak-
to-trough contraction under voluntary isolation, 84% under the blanket lockdown) to a mild and not-
atypical one (3% peak-to-trough contraction). By coordinating individuals’ behavior over the course of the
epidemic, targeted isolation can minimize the disease-economy trade-off imposed by voluntary isolation
and blanket lockdown strategies.
The large economic savings (91% of individual economic losses averted) and the marked difference in
the probability of contact (Fig.2.3B) from the targeted isolation strategy arise primarily from shifting
the burden of isolating from susceptible to infectious individuals (Fig.2.3A). Under a voluntary isolation
strategy, some infectious individuals continue to work and consume despite the risk they impose on others
(87; 99; 71). This is the key coordination failure that increases the probability of infection (Fig.2.3B) and
forces susceptible individuals to work and consume less to avoid infection (Fig.2.3A). Since susceptible
individuals are the majority of the population in a novel epidemic, this approach to disease control comes
at a large economic cost. By contrast, targeting isolation at infectious individuals dramatically changes
the composition of the pool of people working and consuming (Fig.2.3A & B). Voluntary isolation at the
epidemic peak leads to about 3 fewer hours spent at consumption activities and 6 fewer hours spent at
labor activities per day by susceptible individuals, while targeted isolation reduces infectious individuals’
activities by similar amounts (Table S2). This does not cause changes in mean daily contacts between
strategies (Fig.2.3C & D), nor prevalence by activity type (Fig.2.3E), even though many more susceptible
individuals are able to work and consume. As a consequence, targeting delivers small improvements in
infection outcomes but massive economic savings.
Further infection reductions are possible under these control strategies; however, reducing cases even by a
small amount quickly increases economic costs (SI Fig.S5; to achieve the minimum level of cases, economic
losses are multiplied nearly tenfold). This result, that further disease reductions are only possible with
extreme economic losses, is an intuitive consequence of an optimized solution.
Model applications
Our second contribution is to apply the model to COVID-19 in the US and study how key frictions with
plausible magnitudes may affect the model mechanisms and resulting policy conclusions. We focus on
two types of frictions which are particularly relevant to novel epidemics: limited or delayed health status
information, and individual non-compliance with policy directives. These frictions are modeled as particular
scenarios (see Materials and Methods). Importantly, we deliver a tractable and plausible analysis of these
frictions in a coupled epi-economic model, though we acknowledge that there is much research to be done
on the microeconomic foundations of individual behavior in the face of a novel pathogen.
Our first scenarios vary test quality and delays in detecting infectiousness (Fig.2.4). The above results in
Fig.2.2 assume full knowledge of infection status (i.e., regular and accurate testing) forboth voluntary and
targeted isolation. Here, we consider scenarios where (a) individuals take a test which correctly reveals
their infection status with X% probability (X determined by the scenario), and (b) the test result is received
only Y days (Y determined by the scenario) after the individual actually becomes infectious. The latter
could be either because the test is taken some time after infection, or there is a lag between taking the test
and receiving the results. These two dimensions cover a broad range of population-level testing strategies,
though a detailed modeling of all possible testing strategies is beyond the scope of this paper (see SI Fig.S7
for cases where individuals are uncertain over their own infection status and respond to each other's
uncertainty).
In the “limited and delayed testing” scenario (10% test quality, 8 day test lag), targeted isolation can
only recover around 13% of the economic benefits from targeted isolation in the baseline scenario, and
only around 30% of the infection control benefits from targeting. This scenario demonstrates that the
benefits of targeted isolation are conditional on the quality of the testing regime. In the “improving test
quality” scenario (95% test quality after 75 days, 8 day test lag), targeted isolation can recover nearly 92%
of the economic benefits from targeting in the baseline scenario, and nearly 94% of the infection control
benefits; note this scenario is intended to be consistent with the observed evolution of testing capability
during the COVID-19 pandemic. These results are largely unchanged in the “improving test quality and
delays” scenario (95% test quality after 75 days, 5 day test lag after 60 days). The improvement in test
timeliness has very little effect over and above the effect of improved test quality, and the changes in
transient dynamics can even reduce some of the economic and infection control benefits. In all cases,
the voluntary or targeted isolation policies deliver substantial economic benefits over blanket lockdowns,
while blanket lockdowns deliver greater infection control benefits. We explore the robustness of these
results to alternative assumptions on equilibrium behavior under low-quality information, finding it does
not change the ranking of policies in terms of economic benefits but may alter the ranking over infection
control benefits (see SI 3.3.4).
These results highlight two important channels through which targeted isolation delivers improvements
over voluntary isolation. First, as expected, better information tends to enable better implementation of
targeted isolation. Second, however, better information can also worsen the recession under voluntary
isolation. Intuitively, poor information mitigates the coordination failure by leading individuals uncertain
about their health type to act as though they are a different type. Many infectious individuals who would
otherwise impose externalities on others end up acting as though they are susceptible and reducing their
labor supply and consumption. Similarly, many susceptible individuals act as though they are asymptomatic
or recovered and continue to supply labor and consume, mitigating the recession severity.
Our next scenarios vary the fraction of individuals who comply with policy mandates such as blanket
lockdowns or targeted isolation (Fig.2.5). In the “low compliance and perfect information” scenario (0%
compliance rate, no information frictions), targeted isolation is ineffective. In the “partial compliance and
perfect information” scenario (75% compliance, no information frictions), targeted isolation recovers just
over 76% of the economic benefits from targeting in the baseline scenario, and nearly all of the infection
control benefits. These results are intuitive given the properties of targeted and voluntary isolation in the
baseline model: since the non-compliant share of the population behaves as they would under voluntary
isolation, the benefits realized are a convex combination of those from voluntary and targeted isolation.
This result also helps to clarify the role of altruism: for example, (32; 87) show that altruistic motives can induce some infectious individuals to isolate without control strategies. However, if there is still a large enough portion not acting altruistically (as documented by (32)), sizeable targeted isolation benefits remain.
Finally, in the "partial compliance and improving information" scenario (75% compliance, 95% test
quality after 75 days, 5 day test lag after 60 days), targeted isolation recovers roughly 95% of the economic
benefits from targeting in the baseline scenario, and roughly 95% of the infection control benefits as well.
Although compliance is unchanged, the percentage of benefits obtained improve for the same reason
as in the purely information scenarios—worse information at certain levels can improve outcomes as
infectious individuals act as susceptibles, solving the coordination failure. Taken together, our results
provide qualitative insights into the importance of information quality, information timeliness, and compliance
on the benefits of targeted isolation policies.
2.2 Robustness to assumptions on coupling parameters and functions
To assess the robustness of our conclusions to these modeling choices, we conduct sensitivity analysis
over several relevant model parameters. The main functional form for the contact function is assumed
to be linear, such that additional labor and consumption activities increase (infection-risking) contacts
proportionally. However, the types of social networks that individuals belong to and the nature of their
interactions affects the mapping between activities and contacts (191), and therefore we test other functional
forms that aggregate how different network structures could affect this mapping. Next, we test the sensitivity
of the calibration of the contact function, which is based on pre-pandemic contact data. This calibration
aggregates detailed data on individuals’ social network structures up to model features like overall contacts
at different activities. Finally, because we incorporate the impact of asymptomatic individuals through
productivity losses (see Methods), we test the sensitivity of our model to the share of asymptomatic
individuals by varying the productivity losses from infection. In the following sensitivity analyses, we
focus on the core model, devoid of information- or compliance-related frictions, to identify how these
modeling choices affect the maximum possible gains from targeted isolation.
First, we show the mapping between prevalence of asymptomatic individuals and productivity losses
from infection in Fig.2.6A. Our productivity parameter is a weighted average of those experiencing no
symptoms when infected (asymptomatic; able to work unimpeded without a loss to productivity), and
those experiencing symptoms (less able to work during infection and thus incurring a productivity loss).
A lower productivity loss in the figure implies more asymptomatic individuals. Fig.2.6B-E demonstrates
the robustness of our conclusion that targeted isolation reduces economic losses without changing disease
outcomes—varying asymptomatic infections through productivity (i.e., moving horizontally in the charts)
does not significantly change the shading.
Second, because structural changes in the economy during the pandemic may have reduced the number of
contacts per unit of activity (e.g., increased prevalence of contactless goods delivery, increased mask use
or other non-pharmaceutical interventions), we examine our findings’ robustness by altering the ratios of
contacts at different activities. Fig.2.6B-E show how our conclusions about targeted isolation relative to
voluntary isolation change as we vary the contact structure of the economy and the proportion of non-
severely-diseased individuals. From the white dots (main model calibration), moving to the left in Fig.2.6B-
E shows that lower prevalence of asymptomatic individuals will increase the economic effectiveness of
targeted isolation without affecting the relative number of cases averted. Moving up the vertical axis in
Figs.2.6B-C shows that increasing the share of contacts which occur at consumption rather than labor
activities (e.g., if remote work becomes more common while bars and restaurants remain open) would
again increase the economic effectiveness of targeted isolation without affecting the relative number of
cases averted. Moving up the vertical axis of Fig.2.6D-E shows that increasing the share of contacts at
unavoidable activities (e.g., if consumption and labor become increasingly contactless, or if more contacts
occur during unavoidable activities such as religious or family gatherings) will reduce the economic effectiveness
of targeted isolation without affecting the relative number of cases averted.
Lastly, the functional form of the contact function allows us to examine heterogeneity in contact rates—
Fig.2.6F-G. The mapping between activities and contacts will be an aggregation of individuals’ social
network structures when consuming and working. Different functional forms approximate different mappings.
Convex contact functions emerge when high-contact activities (individuals) are reduced (isolated) first and
concave functions emerge when high-contact activities (individuals) are reduced (isolated) last. Targeted
isolation accounting for these choices is likely to produce convex contact functions if high-contact activities
(individuals) are reduced (isolated) first. We find such variations have a modest impact on the economic
effectiveness of targeted isolation, but do not affect its disease control properties. We discuss these forms
further in Materials and Methods and the SI and show the sensitivity of our findings to plausible variations
in other structural parameters in SI Fig.S9.
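For concreteness, the linear, convex and concave mappings referred to above can be represented by simple power functions of activity. The sketch below is illustrative only: the exponents and scale are assumptions, with the 7.5-contact scale simply echoing the workplace figure quoted earlier rather than the paper's calibration.

```python
import numpy as np

def contacts(activity, kappa=7.5, gamma=1.0):
    """Contacts generated by a level of (consumption or labor) activity.
    gamma = 1 gives the linear baseline; gamma > 1 is convex (high-contact
    activities curtailed first); gamma < 1 is concave (curtailed last)."""
    return kappa * np.power(activity, gamma)

activity = np.linspace(0, 1, 5)
for gamma, label in [(1.0, "linear"), (2.0, "convex"), (0.5, "concave")]:
    print(label, contacts(activity, gamma=gamma).round(2))
```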
3 Discussion
Close to two years into the SARS-CoV-2 pandemic, it is increasingly clear economic concerns cannot be
neglected (Office). We show that even in scenarios with imperfect testing and compliance, a targeted
isolation approach emerges from our model as an optimal strategy which balances disease spread and
economic activities. Our predicted infection rates and economic responses are broadly consistent with
observed patterns (see SI 2.7), and thus our results likely capture the correct order of magnitude and the key qualitative features of the epidemic and recession.
Recent studies suggest the COVID-19 recession was driven by voluntary reductions in consumption in
response to increasing infection risk (59; 38; 105). We show this drop in consumption is driven by a
coordination failure: infectious individuals do not face the full social costs of their activities, leading
susceptible individuals to withdraw from economic activity. This coordination failure resembles the classical
problem of the tragedy of the commons in natural resources and the environment (106; 185; 115), underscoring
the lack of property rights in the market for infection-free common spaces. It also shares similarities with
coordination issues that emerge in climate change, fisheries, orbit use and other settings (160; 68; 70; 181).
Correcting this coordination failure via a targeted isolation strategy that internalizes the costs infectious
individuals impose on susceptible individuals delivers substantial economic savings (Fig.2.2 & Fig.2.3).
Our conclusions arise from a data-driven method to calibrate the mapping between disease-transmitting
contacts and economic activities. Compartmental models of infectious diseases typically segment activities
based on population characteristics like age and student status (e.g., (155; 24; 176; 37; 152)) rather than
economic choices like consumption and labor. We build on prior work in this area (e.g. (92; 90; 169)) to
address two long-standing challenges: appropriately converting units of disease-transmitting contacts into
units of economic activities (contacts into dollars and hours), and calibrating the resulting contact function
to produce the desired $R_0$. We address these challenges in three steps (see Materials and Methods and SI).
First, we use contact matrices from (175) to construct age-structured contact matrices at consumption,
labor, and unavoidable other activities (Fig. S1). We then use next-generation matrix methods to calculate
the mean number of contacts, adjusted for how individuals of different ages mix with each other, at each
activity. Finally, we use these values with pre-epidemic consumption and labor supply levels to map
contacts to dollars and hours in the contact function, and then calibrate the $R_0$. This approach provides
a behaviorally-grounded perspective on why contacts occur. Understanding the structure and benefits
of targeted isolation requires this mapping between economic activities and contacts. We calibrate the
model to pre-pandemic economic behavior. We validate the model’s performance along aggregate disease-
economy dimensions in SI 2.8 and Figure S5.
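As an illustration of the next-generation-matrix step described above, the sketch below computes $R_0$ as the dominant eigenvalue of a next-generation matrix assembled from activity-specific, age-structured contact matrices. All numerical values (the matrices, transmission probability and infectious period) are placeholders rather than the calibrated quantities from (175) or the SI.

```python
import numpy as np

n_ages = 4
rng = np.random.default_rng(1)
C_consumption = rng.uniform(0, 3, (n_ages, n_ages))  # placeholder age-structured contact matrices
C_labor = rng.uniform(0, 3, (n_ages, n_ages))
C_other = rng.uniform(0, 1, (n_ages, n_ages))
C_total = C_consumption + C_labor + C_other          # total daily contacts by age group

q = 0.05                    # per-contact transmission probability (assumed)
infectious_days = 7         # mean infectious period in days (assumed)

# Next-generation matrix for a fully susceptible population and its spectral radius.
ngm = q * infectious_days * C_total
R0 = max(abs(np.linalg.eigvals(ngm)))
print(round(float(R0), 2))

# In the calibration direction one would instead hold the contact matrices fixed
# and solve for q such that the dominant eigenvalue equals the target R0.
```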
Our results also serve to highlight an important benefit of making high-quality testing for novel pathogens
widely available early on (192; 196; 137; 151)—it facilitates targeted isolation approaches which reduce
economic losses. Targeted isolation provides the greatest benefits when information quality and compliance
are high, with testing lags playing a relatively minor role. These results suggest that by enabling targeted
isolation policies, early provision of high-quality testing combined with incentives to comply with policy
directives can unlock some infection control benefits and substantial economic benefits. When testing is
low-quality throughout the epidemic, the targeted isolation solution resembles a blanket lockdown (see
Fig.2.4).
While we have shown that information frictions and non-compliance can be important factors limiting
policy effectiveness, we stress the fundamental problem is an inability to coordinate among susceptible
individuals. In an ideal world, a market would exist allowing susceptible individuals, who do not want to
be exposed to infection, to club together to pay infectious individuals to stay out of common spaces (e.g.,
gyms, restaurants, supermarkets). Even in the face of information and compliance issues, this would solve
the fundamental problem and allow susceptible individuals to continue working and consuming, removing
the disease-economy trade-off inherent in lockdown approaches. In reality this is not possible, since
susceptible individuals cannot coordinate (from economic theory, no “property rights” exist determining
specifically whether susceptible or infectious individuals have a right to enter these spaces and should
therefore be paid to access them, unlike most markets) (63). Since this ideal or first best solution is not
possible, targeted isolation is a policy solution to solve this coordination problem.
To implement targeted isolation, governments can provide incentives and encouragement for infectious
individuals to remove themselves from these public spaces. In our model, paying individuals to stay home
while infectious would require spending on the order of $428 billion (two weeks' pay times the total number of infected) to purchase the gain of an avoided recession on the order of $4 trillion—total savings of up to $3.5 trillion relative to voluntary isolation (see Fig.2.2), not including additional averted costs from long-
term negative public health outcomes (69). These findings are also net of the costs of implementing testing,
since both targeted isolation and voluntary isolation control strategies require some level of testing and
knowledge of infection status. Our focus here is on the benefits of targeted isolation strategies rather
than the details of how to implement them, or on cross-regional comparisons of implemented strategies
(see SI 5.5.1 for more discussion); designing such incentive mechanisms presents its own challenges, e.g.,
(136; 134; 5), and is an important area for future research. However, through our compliance scenarios we
also demonstrate the effect of improperly implemented targeted isolation.
Given the appealing features of targeted isolation strategies, is there still a role for blanket lockdowns?
Particularly when new research indicates their effects may partially be driven by voluntary isolation (105)?
On the one hand, blanket lockdown strategies can reduce burdens on hospital systems, particularly in the
initial phase (37; 95), while on the other hand, the rebound effects may still induce substantial strains
on hospital systems later on (SI Fig.S10). The excessive costs and rebound effects are robust features
of blanket lockdowns, both in our model (Fig.S7 & Fig. S11) and confirmed in previous studies, e.g.,
(94; 182; 142). The rebound size in our model is also large—nearly 100% of cases averted during the blanket
lockdown reoccur later on. While “targeted lockdowns” that lock down areas or businesses burdened with
higher transmission rates (197; 203) avoid some of the excess costs of blanket lockdowns, they are still
blunt instruments compared to targeted isolation. Nonetheless, blanket lockdowns and targeted isolation
strategies may be complementary—blanket lockdowns reduce hospitalization burdens in the early days
when test quality is low, and targeted isolation manages rebound effects by correcting the coordination
problem once test quality has increased. Our analysis suggests the optimal time to switch from blanket
lockdowns to targeted isolation will depend critically on test quality. Finally, while ensuring compliance
with targeted isolation may be more costly than ensuring compliance with blanket lockdowns, we show
that for many plausible lockdown designs the targeted isolation compliance costs would have to be very
large to overturn the cost savings from targeted isolation (see Fig.2.5F and Fig.2.2B—between $8,000-
$20,000 per person).
As vaccines are being deployed, new SARS-CoV-2 variants are now circulating in many countries (205), and
breakthrough infections have been observed. Thus it continues to be critical to avoid premature relaxation
of disease-economy management measures (157). Our results carry insights for vaccine deployment, to
the extent vaccination limits infectiousness. Since our model shows that one infectious individual failing
to isolate will induce many susceptible individuals to withdraw, our model insights are consistent with
prioritizing vaccines to individuals who, when infectious, are least likely or able to isolate (and therefore
most likely to contribute to spread). Using targeted isolation throughout vaccine delivery can further
reduce economic costs and disease burden.
There remain many opportunities and open challenges in coupled-systems modeling of disease control and
economy management. There is important heterogeneity in transmission, infectiousness, and exposure
(e.g., superspreading events and crowding (143; 179)), though explicitly incorporating such heterogeneity
into the coupled systems is non-trivial. Our model applications have demonstrated one way to tractably
introduce such features into a rational epidemic setting. As greater amounts of high-fidelity mobility
data become available, it is important to build data-driven mappings between mobility, contacts, and
economic activities within transmission models—(91; 133; 5; 42) offer promising steps in this direction.
However, connecting mobility to contact rates and infection probabilities (given a contact) will require
further consideration. Such extensions to the calibration methodology are essential to study disease-
economy impacts of heterogeneity in individual behaviors, abilities to isolate and work from home across
economic sectors, and regional policies. Finally, it is critically important to consider how to design incentives
and measure the costs of implementing targeted isolation programs which can sustain participation and
compliance.
As an endgame strategy, targeted isolation could avert trillions in recessionary losses while effectively
controlling the epidemic. Put differently, disease-economy trade-offs are inevitable when the coordination
failure cannot be resolved. The coordination failure can be resolved through positive incentives (e.g.
payments to individuals to learn their disease status and isolate) or negative incentives (e.g. penalties
for individuals who do not learn their disease status and isolate). Amidst the ongoing public policy
debate about economic relief, lockdown fatigue, and epidemic control (88), allocating funds to solving
the coordination problem likely passes the cost-benefit test.
4 Methods
Here we provide an overview of the key elements of our framework including describing the contact
function that links economic activities to contacts, the SIRD (Susceptible-Infectious-Recovered-Dead) model,
the dynamic economic model governing choices, and calibration. The core of our approach is a dynamic
optimization model of individual behavior coupled with an SIRD model of infectious disease spread. Additional
details are found in the SI.
4.1 Contact function
We model daily contacts as a function of economic activities (labor supply, measured in hours, and consumption
demand, measured in dollars) creating a detailed mapping between contacts and economic activities. For
example, all else equal, if a susceptible individual reduces their labor supply from 8 hours to 4 hours, they
reduce their daily contacts at work from 7.5 to 3.75. Epidemiological data is central to calibrating this
mapping between epidemiology and economic behaviour. Intuitively, the calibration involves calculating
the mean number of disease-transmitting contacts occurring at the start of the epidemic and linking it to
the number of dollars spent on consumption and hours of labor supplied before the recession begins.
We use a SIRD transmission framework to simulate SARS-CoV-2 transmission for a population of 331
million interacting agents. This is supported by several studies (e.g., (123; 118)) that identify infectiousness
prior to symptom onset. We consider three health types m ∈ {S, I, R} for individuals, corresponding to epidemiological compartments of susceptible (S), infectious (I), and recovered (R). Individuals of health type m engage in various economic activities A^m_i, with i denoting the activities modelled. One of the A^m_i is assumed to represent unavoidable other non-economic activities, such as sleeping and commuting, which occur during the hours of the day not used for economic activities (see SI 2.3.1). Disease dynamics are driven by contacts between susceptible and infectious types, where the number of susceptible-infectious contacts per person is given by the following linear equation:

\[
C^{SI}(A) = \sum_i \rho_i\, A^S_i A^I_i \tag{2.1}
\]
While similar in several respects to prior epi-econ models (92; 90; 91), a methodological contribution is that ρ_i converts hours worked and dollars spent into contacts. For example, ρ_c has units of contacts per squared dollar spent at consumption activities, while ρ_l has units of contacts per squared hour worked. We also consider robustness to different functional forms in Figure 2.6F & G as a reduced-form way to consider multiple consumption and labor activities with heterogeneous contact rates. Formally:

\[
C^{SI}(A) = \sum_i \rho_i \left( A^S_i A^I_i \right)^{\alpha}, \tag{2.2}
\]

where α > 1 (convex) corresponds to a contact function where higher-contact activities are easiest to reduce or individuals with more contacts are easier to isolate, and α < 1 (concave) corresponds to a contact function where higher-contact activities are hardest to reduce or individuals with fewer contacts are easier to isolate. The baseline case (α = 1) implies all consumption or labor activities and individuals have identical contact rates (see SI 2.3.2 for further discussion and intuition).
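To fix ideas, the following minimal sketch (in Python, with made-up coefficient values; the dissertation's actual simulations were run in R, see the Appendix) evaluates the contact functions in equations 2.1-2.2 for given activity levels. The ρ values and the 100-dollar/8-hour activity levels are illustrative assumptions, not the calibrated ones.

```python
import numpy as np

def contacts_SI(A_S, A_I, rho, alpha=1.0):
    """Susceptible-infectious contacts per person, eqs. (2.1)/(2.2).

    A_S, A_I : activity levels of susceptible and infectious types,
               ordered as [consumption ($), labor (hours), unavoidable other].
    rho      : coefficients converting activity products into contacts
               (e.g., contacts per squared dollar or per squared hour).
    alpha    : curvature; alpha = 1 recovers the linear baseline (2.1).
    """
    A_S, A_I, rho = map(np.asarray, (A_S, A_I, rho))
    return float(np.sum(rho * (A_S * A_I) ** alpha))

# Illustrative (made-up) coefficients and activity levels
rho = [4.0 / 100**2, 7.5 / 8**2, 3.0]
full = contacts_SI([100.0, 8.0, 1.0], [100.0, 8.0, 1.0], rho)   # 4 + 7.5 + 3 contacts
half = contacts_SI([100.0, 4.0, 1.0], [100.0, 8.0, 1.0], rho)   # labor contacts fall from 7.5 to 3.75
print(full, half)
```

With these illustrative numbers, halving own labor hours halves labor contacts, mirroring the 7.5-to-3.75 example in the text.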
4.2 Calibrating contacts
To calibrate the contact function, we use U.S.-specific age and location contact matrices generated in (175), which provide projected age-specific contact rates at different locations in 2017 (shown in SI section 2.3.1). We group these location-specific contact matrices into matrices for contacts during consumption, labor, and unavoidable other activities. The transmission rate was calibrated to give a value of R_0 = 2.6, reflective of estimates (183). For this, we use the next-generation matrix (73). The next-generation matrix describes the "next generation" of infections caused by a single infected individual; R_0 is the dominant eigenvalue of the next-generation matrix (see SI 2.5.3). This calculation is done at the disease-free steady state of the epidemiological dynamical system, when all the population is susceptible. Specifically, we calculate the benchmark number of contacts from each activity in the pre-epidemic equilibrium (e.g., ρ_c c^S c^I for consumption from equation 2.1), under pre-epidemic consumption and labor supply levels. We then calculate the coefficients ρ_c, ρ_l, ρ_o (for consumption, labor, unavoidable other) using equation 2.1 such that pre-epidemic consumption and labor supply levels equal the benchmark number of contacts. To account for contacts that are not related to economic activities, the "unavoidable other" contact category is normalized to 1, so that the coefficient ρ_o is simply the number of contacts associated with unavoidable other activities. While pre-pandemic contact structures are necessary to calibrate R_0, our model allows contacts to evolve over time as a function of individual choices, which respond to disease dynamics.
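A stylized sketch of this calibration step is shown below (Python, illustrative numbers only). The benchmark consumption and "other" contact counts, the pre-epidemic activity levels, and the recovery and death rates are assumptions for illustration; only the 7.5 labor contacts at 8 hours and the R_0 = 2.6 target are taken from the text. In the homogeneous model the next-generation matrix is a single number, so its dominant eigenvalue is computed trivially.

```python
import numpy as np

# Benchmark pre-epidemic contacts by activity. The 7.5 labor contacts at 8 hours
# come from the text; the consumption and "other" counts and the activity levels
# below are illustrative assumptions (the real values come from the matrices in (175)).
C_bench = {"consumption": 4.0, "labor": 7.5, "other": 3.0}
c0, l0 = 100.0, 8.0                      # assumed pre-epidemic consumption ($) and labor (hours)

# Back out rho so that pre-epidemic activity reproduces the benchmark contacts (eq. 2.1)
rho_c = C_bench["consumption"] / (c0 * c0)
rho_l = C_bench["labor"] / (l0 * l0)
rho_o = C_bench["other"]                 # "other" activity normalized to 1, so rho_o = its contacts

C_total = rho_c * c0 * c0 + rho_l * l0 * l0 + rho_o * 1.0 * 1.0   # = 14.5 here

# Choose tau so the dominant eigenvalue of the next-generation matrix equals R0 = 2.6.
# With one homogeneous infectious compartment the matrix is 1x1: tau * C_total / (P_R + P_D).
P_R, P_D = 1 / 14, 0.001                 # assumed daily recovery and death rates
tau = 2.6 * (P_R + P_D) / C_total

NGM = np.array([[tau * C_total / (P_R + P_D)]])
print(max(abs(np.linalg.eigvals(NGM))))  # 2.6 by construction
```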
The contact matrices in (175) measure only contacts between individuals in different age groups by activity,
without noting which individuals are consuming and which are working. Given the lack of precise data on
contacts between individuals engaging in different activities, we simplify by assuming individuals who are
consuming only contact others who are consuming, and individuals who are working only contact others
who are working. However, in reality individuals who are consuming also interact with individuals who
are working (e.g., a bar or restaurant). Future work could collect more detailed contact data describing
contacts between individuals engaging in different activities.
4.3 SIRD epidemiological model
The SIRD model is given by:

\[
\begin{aligned}
S_{t+1} &= S_t - \tau\, C^{SI}(A)\, S_t I_t,\\
I_{t+1} &= I_t + \tau\, C^{SI}(A)\, S_t I_t - (P^R + P^D)\, I_t,\\
R_{t+1} &= R_t + P^R I_t,\\
D_{t+1} &= D_t + P^D I_t,
\end{aligned} \tag{2.3}
\]

where S, I, R, D represent the fractions of the population in those compartments. Because the contact function C^{SI}(A) returns the number of contacts per person as a function of activities A, τ is a property of the pathogen that determines the infections per contact. This decomposes the classic "β" in epidemiological modeling into a biological component that is a function of the pathogen (τ) and a behavioral component linked to economic activity (C(A)), such that β = C(A)τ (e.g., (90)).
A key input into individual decision making is the probability of infection for a susceptible individual, which per the SIRD model above depends on the properties of the pathogen, contacts generated through economic activities, and the share of infectious individuals in the population:

\[
P^I_t = \tau\, C^{SI}(A)\, I_t. \tag{2.4}
\]

If a susceptible individual reduces their activities (and thus contacts) today, they reduce the probability they will get infected, which in turn reduces the growth of the infection. However, if they keep their economic behavior the same, they enjoy those benefits today, but take the risk of becoming infected in the future. Finally, P^R is the rate at which infectious individuals recover, and P^D is the rate at which they die. Both are assumed to be constant over time and independent of economic activities and contacts.

Our framework can be generalized to other structured compartmental models beyond mean-field (homogeneous) SIRD models. The key feature to translate is the contact function. For example, in an age-structured model the contact function would need to reflect age-specific consumption and labor supply patterns.
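The sketch below (Python, illustrative parameters) simulates the discrete-time SIRD system in equation 2.3 with contacts generated by exogenous, constant activity levels; in the full model the activity path instead comes from the solved choice rules. All numerical values (recovery and death rates, activity levels, ρ coefficients) are assumptions consistent with the earlier sketches, not the paper's calibration.

```python
import numpy as np

def contacts_SI(A_S, A_I, rho):
    return float(np.sum(np.asarray(rho) * np.asarray(A_S) * np.asarray(A_I)))

def simulate_sird(S0, I0, T, tau, P_R, P_D, activities, rho):
    """Discrete-time SIRD dynamics of eq. (2.3) with behavior-dependent contacts.

    `activities(t, S, I)` returns (A_S, A_I) for day t; in the full model these
    come from the solved choice rules, here they are held at constant levels.
    """
    S, I, R, D = S0, I0, 0.0, 0.0
    path = []
    for t in range(T):
        A_S, A_I = activities(t, S, I)
        new_inf = tau * contacts_SI(A_S, A_I, rho) * S * I
        S, I, R, D = (S - new_inf,
                      I + new_inf - (P_R + P_D) * I,
                      R + P_R * I,
                      D + P_D * I)
        path.append((S, I, R, D))
    return np.array(path)

# Illustrative run with no behavioral response (constant pre-epidemic activity)
rho = [4.0 / 100**2, 7.5 / 8**2, 3.0]
act = lambda t, S, I: ([100.0, 8.0, 1.0], [100.0, 8.0, 1.0])
P_R, P_D = 1 / 14, 0.001
tau = 2.6 * (P_R + P_D) / 14.5           # consistent with the calibration sketch above
out = simulate_sird(S0=1 - 1e-4, I0=1e-4, T=365, tau=tau,
                    P_R=P_R, P_D=P_D, activities=act, rho=rho)
print(out[:, 1].max(), out[-1, 3])       # peak prevalence and cumulative deaths
```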
4.4 Choices
In order to analyze the three control strategies (voluntary isolation, blanket lockdown, targeted isolation),
we solve two types of constrained optimization problems: a decentralized problem and a social planner
problem. The decentralized problem reflects atomistic behavior by individuals—they aim to maximize
their personal utility and make choices regarding economic activity. The decentralized problem is used
to analyze the voluntary isolation and blanket lockdown strategies. Conversely, in the social planner
problem, a social planner considers the utility of the population as a whole and coordinates economic
activity to jointly maximize the utilities of all individuals in the population. Importantly, the social planner
internalizes the full economic costs to the population associated with disease transmission. The social
planner problem is used to analyze the targeted isolation strategy.
In the decentralized problem, individuals observe the disease dynamics, know their own health state, and
make consumption and labor choices in each period accounting for the risks incurred by contacts with
potentially infectious individuals. Individuals’ knowledge of their own daily health state is consistent
with a testing system where individuals use a daily test which reveals their health state. Let A = {c^m_t, l^m_t} represent the economic activities of consumption and labor chosen in period t by individuals of health type m. Individuals maximize their lifetime utility by choosing their economic activities, c^m_t and l^m_t, accounting for the effects of infection and recovery on their own welfare:

\[
U^S_t = \max_{c^S_t,\, l^S_t} \left\{ u(c^S_t, l^S_t) + \delta \left( (1 - P^I_t)\, U^S_{t+1} + P^I_t\, U^I_{t+1} \right) \right\}, \tag{2.5}
\]
\[
U^I_t = \max_{c^I_t,\, l^I_t} \left\{ u(c^I_t, l^I_t) + \delta \left( (1 - P^R - P^D)\, U^I_{t+1} + P^R U^R_{t+1} + P^D U^D_{t+1} \right) \right\}, \tag{2.6}
\]
\[
U^R_t = \max_{c^R_t,\, l^R_t} \left\{ u(c^R_t, l^R_t) + \delta U^R_{t+1} \right\}, \tag{2.7}
\]
\[
U^D_t = \Omega \quad \forall t. \tag{2.8}
\]
Per-period utility u(c^m_t, l^m_t) captures the contemporaneous net benefits from consumption and labor choices. In particular, susceptible individuals in period t recognize that their personal risk of infection P^I_t is related to their choices regarding economic activity c^S_t, l^S_t, and if they do become infected in period t+1, they have some risk of death in period t+2. Death imposes a constant utility of Ω, calibrated to reflect the value of a statistical life (see SI 2.1.3). The daily discount factor δ reflects individuals' willingness to trade consumption today for consumption tomorrow.

Finally, individuals exchange labor (which they dislike) for consumption (which they do like) such that their budget balances in each period:

\[
p\, c^m_t = w_t\, \phi^m l^m_t. \tag{2.9}
\]

The wage rate w_t is paid to all individuals per effective unit of labor ϕ^m l^m_t and is calculated from per-capita GDP. We represent the degree to which individuals are able to be productive at work by ϕ^m (labor productivity). We assume that symptomatic individuals are less productive, such that ϕ^S = ϕ^R = 1 and ϕ^I < 1, reflecting the average decrease in productivity of infectious individuals (accounting for the share of asymptomatic and pre-symptomatic individuals, similar to (83); see SI 2.1.1). Following standard practice, the price of consumption p is normalized to 1. Finally, market equations that state how individuals are embedded in a broader economy are described in the SI.
The social planner problem coordinates the economic activities of the individuals described above. Instead of economic activities being individually chosen to maximize personal utility, the social planner coordinates the consumption and labor choices of each type (c_t = {c^S_t, c^I_t, c^R_t}, l_t = {l^S_t, l^I_t, l^R_t}) to maximize the utility of the population over the planning horizon, subject to the disease dynamics 2.3 and budget constraints 2.9:

\[
\max_{l_t,\, c_t} \sum_{t=0}^{\infty} \delta^t \left( S_t\, u(c^S_t, l^S_t) + I_t\, u(c^I_t, l^I_t) + R_t\, u(c^R_t, l^R_t) + D_t\, \Omega \right). \tag{2.10}
\]

Additional structure (e.g., age compartments, job types, geography) can be incorporated here either by creating additional utility functions or by introducing type-specific constraints. For example, with age compartments, each age type would have a set of utility functions like equations 2.5-2.8. These would then be calibrated to reflect age-specific economic activity levels, structural parameters, and observed risk-averting behaviors.

Both the decentralized problem and the social planner problem are solved for optimal daily consumption and labor supply choices in response to daily state variable updates, and we normalize the total initial population size to 1 for computational convenience. The assumption that individuals use a daily test which reveals their health state is maintained across both the decentralized and the social planner problems. We abstract from the cost of the testing system. Since the cost is common to both problems, it does not affect the relative comparison between the two.
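As a concrete illustration of how a single decentralized Bellman step in equation 2.5 can be evaluated, the sketch below grid-searches a susceptible agent's labor choice given guessed continuation values, with consumption pinned down by the budget constraint (2.9) and the infection probability given by (2.4). The utility function, wage, discount factor and all numerical inputs are stand-ins; the calibrated functional forms and the full value function iteration are described in the SI.

```python
import numpy as np

def u(c, l):
    """Illustrative per-period utility: log consumption minus convex labor disutility
    (a stand-in for the calibrated functional form described in the SI)."""
    return np.log(max(c, 1e-9)) - 0.5 * l ** 1.5

def susceptible_choice(U_S_next, U_I_next, I_t, A_I, rho, tau,
                       w=25.0, delta=0.96 ** (1 / 365)):
    """One decentralized Bellman step for a susceptible agent, eq. (2.5).

    Grid-search over labor hours; consumption follows from the budget constraint
    c = w * phi_S * l with phi_S = 1 and p = 1 (eq. 2.9). The infection probability
    depends on the agent's own activities through eqs. (2.1) and (2.4).
    """
    best_value, best_choice = -np.inf, None
    for l in np.linspace(0.0, 12.0, 121):
        c = w * l
        A_S = np.array([c, l, 1.0])
        P_inf = tau * float(np.sum(np.asarray(rho) * A_S * np.asarray(A_I))) * I_t
        value = u(c, l) + delta * ((1 - P_inf) * U_S_next + P_inf * U_I_next)
        if value > best_value:
            best_value, best_choice = value, (c, l, P_inf)
    return best_value, best_choice

# Illustrative call with guessed continuation values and infectious-type activity levels
value, (c_star, l_star, P_inf) = susceptible_choice(
    U_S_next=100.0, U_I_next=80.0, I_t=0.01,
    A_I=[100.0, 8.0, 1.0], rho=[4.0 / 100**2, 7.5 / 8**2, 3.0], tau=0.013)
print(round(c_star, 1), round(l_star, 2), round(P_inf, 5))
```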
4.5 Utility calibration
Details of the utility function calibration and data sources are found in the SI. Briefly, economic activity
levels and structural economic parameters are calibrated to match observed pre-epidemic variables for the
US economy. We calibrate risk aversion and the utility cost of death to match the value of a statistical life.
This approach ensures both the levels of economic choice variables and their responses to changes in the
probability of infection are consistent with observed behaviors in other settings.
4.6 Model applications
We add information frictions and individual non-compliance to our baseline model to study how plausible magnitudes of such distortions may affect our policy conclusions. These are modelled by altering the inputs into agents' optimal choice rules (known as "policy functions" in dynamic optimization problems, not to be confused with pandemic control policies) that specify their (c*, l*) choice given the (S, I, R) information they have. The choice rules take the form shown in the equations below, where the only addition to the usual sub/superscripts is [P], denoting the policy type {V, T, L} for voluntary isolation, targeted isolation and blanket lockdown policies respectively:

\[
c^{*S}_{[P],t} = c^S_{[P],t}(S_t, I_t, R_t) \tag{2.11}
\]
\[
l^{*S}_{[P],t} = l^S_{[P],t}(S_t, I_t, R_t) \tag{2.12}
\]
All three types of agent choose consumption and labor consistent with these choice rules, depending on what they know of the state of the world (i.e., (S_t, I_t, R_t)). These choice rules are the main output of the value function iteration process described in SI 3.1. By feeding different information into the choice rules or taking weighted averages under different policies, we can model the frictions described below as different scenarios.
Test reporting lags: Test reporting lags force agents to react to population-level infection information from x days ago. This is modelled as feeding (S_{t-x}, I_{t-x}, R_{t-x}) into the choice rules above when finding (c*_t, l*_t). We select x to be roughly consistent with observed lags during the COVID-19 pandemic: initially 8 days at the outset of the pandemic, before falling to 5 days on day 60. The choice rules become:

\[
c^{*S}_{[P],t} = c^S_{[P],t}(S_{t-x}, I_{t-x}, R_{t-x}) \tag{2.13}
\]
\[
l^{*S}_{[P],t} = l^S_{[P],t}(S_{t-x}, I_{t-x}, R_{t-x}) \tag{2.14}
\]
Test quality: Tests for individual health status differ in quality throughout the course of a novel pandemic, starting from very low quality before becoming progressively more accurate. We assume that, due to test quality q (for the specific foundation of this single-metric quality notion related to specificity and sensitivity, see SI 3.3.2), individuals take a weighted average of the choice-rule-prescribed action for their true health type and a "no information" action which is averaged uniformly across the actions for each type. This is equivalent to either of the following behavioral microfoundations:

• individuals realize they do not know their type with certainty, so can do no better than using q to mix between the choice-rule-prescribed action for their test-reported type and an average across actions for each of the three types; or

• a fraction q of agents of a given type trust their test result and follow the associated choice-rule-prescribed action, while the remaining 1-q fraction either do not get tested or do not trust their test and uniformly mix across actions for all health types.

We consider two types of test quality scenarios: first, a "limited testing" scenario where test quality is low throughout the whole pandemic, and second, a more realistic "improving test quality" scenario where test quality linearly improves over the course of the pandemic, becoming perfect at day 75. The choice rules become:

\[
c^{*S}_{[P],t} = q\, c^{*S}_{[P],t} + (1-q)\left( \tfrac{1}{3}\, c^{*S}_{[P],t} + \tfrac{1}{3}\, c^{*I}_{[P],t} + \tfrac{1}{3}\, c^{*R}_{[P],t} \right) \tag{2.15}
\]
\[
l^{*S}_{[P],t} = q\, l^{*S}_{[P],t} + (1-q)\left( \tfrac{1}{3}\, l^{*S}_{[P],t} + \tfrac{1}{3}\, l^{*I}_{[P],t} + \tfrac{1}{3}\, l^{*R}_{[P],t} \right) \tag{2.16}
\]
We examine the robustness of our conclusions to an equilibrium model of behavior under low-quality
information or limited cognitive capacity in SI 3.3.4, finding that the qualitative results regarding policy
effectiveness are unchanged.
Compliance: Some individuals may not comply with policy mandates. We model this as a share c̄ of agents of any type that choose the decentralized (i.e., voluntary isolation) action rather than complying with the targeted isolation or blanket lockdown mandates. We consider two types of scenarios, "low compliance" with 10% compliance and "partial compliance" with 75% compliance. The choice rules become:

\[
c^{*S}_{[P],t} = \bar{c}\; c^{*S}_{[P],t} + (1-\bar{c})\; c^{*S}_{[D],t} \tag{2.17}
\]
\[
l^{*S}_{[P],t} = \bar{c}\; l^{*S}_{[P],t} + (1-\bar{c})\; l^{*S}_{[D],t} \tag{2.18}
\]
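The three frictions can be thought of as simple transformations of the baseline choice rules. The sketch below (Python, with constant stand-in rules in place of the solved policy functions) applies the reporting lag of equations 2.13-2.14, the quality mixing of 2.15-2.16, and the compliance mixing of 2.17-2.18 in one function; the specific rule values are illustrative assumptions.

```python
import numpy as np

def frictional_choice(policy_rules, dec_rules, history, t, health="S",
                      lag=0, quality=1.0, compliance=1.0):
    """Apply the three frictions to a choice rule, eqs. (2.13)-(2.18).

    policy_rules / dec_rules : dicts mapping health type -> function of the
        aggregate state (S, I, R) returning (c, l); `policy_rules` stands in for
        the targeted-isolation or lockdown rule, `dec_rules` for the voluntary one.
    history : list of past (S, I, R) states, so history[t - lag] is what a lagged
        test-reporting system lets agents react to (eqs. 2.13-2.14).
    quality : weight q on the own-type action vs. a uniform mix over the three
        types' actions (eqs. 2.15-2.16).
    compliance : share following the policy rule rather than the decentralized
        voluntary-isolation rule (eqs. 2.17-2.18).
    """
    state = history[max(t - lag, 0)]
    own = np.array(policy_rules[health](state))
    mix = np.mean([policy_rules[h](state) for h in ("S", "I", "R")], axis=0)
    policy_action = quality * own + (1 - quality) * mix
    dec_action = np.array(dec_rules[health](state))
    return compliance * policy_action + (1 - compliance) * dec_action

# Illustrative constant rules (the real ones come from value function iteration, SI 3.1)
policy = {"S": lambda s: (90.0, 7.0), "I": lambda s: (20.0, 0.0), "R": lambda s: (100.0, 8.0)}
dec    = {"S": lambda s: (60.0, 4.0), "I": lambda s: (80.0, 6.0), "R": lambda s: (100.0, 8.0)}
hist = [(0.99, 0.01, 0.0)] * 100
print(frictional_choice(policy, dec, hist, t=50, health="I",
                        lag=8, quality=0.5, compliance=0.75))
```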
Figures
[Figure 2.1 schematic: disease dynamics (S, I, R, D); consumption and labor (C&L) choices under risk of infection; types of contact; decentralized decisions and coordinated interventions; population-level outcomes (infected population, recession depth).]
Figure 2.1: Coupled system schematic.
Note: Individuals make consumption (C) and labor-leisure (L) choices, considering the risk of infection through contacts with
others. Individual choices and resulting contacts affect and are affected by the disease dynamics. Individual economic choices
drive population-level outcomes such as disease prevalence and economic recessions. Under decentralized approaches,
individuals optimize their behaviors based on their own preferences and health status. Under coordinated approaches,
individuals’ behaviors are optimized based on how they affect population-level outcomes.
[Figure 2.2 panels: A) proportion of cases over time under each strategy; B) average individual losses ($/person) under blanket lockdown, voluntary (self) isolation, and targeted isolation; C) table of key outcomes (time to peak in days, total cases per 100,000 of roughly 61,500-61,800, recession trough in % contraction, time to economic recovery in days, economy-wide loss in trillion $) for targeted isolation, voluntary isolation, and blanket lockdown.]
Figure 2.2: Disease dynamics and economic outcomes under voluntary isolation, blanket lockdown, and
targeted isolation.
Note: A: Proportion of population infected over time under each strategy. Voluntary isolation and targeted isolation curves are
almost-entirely overlapping, indicating nearly-identical disease dynamics. B: Individual losses incurred under each strategy
(targeted isolation averts 95% of voluntary isolation individual economic losses). C: Key aggregate disease and economy
outcomes under each strategy. See SI 2.5 for comparison with a “no control” approach.
[Figure 2.3 panels: normalized person-hours of labor by health type (S, I, R); probability a susceptible or recovered individual contacts an infectious individual; mean daily S-I contacts; average contacts by activity type (consumption, labor, other); and prevalence by activity type, each compared across voluntary (self) isolation and targeted isolation.]
Figure 2.3: Key model mechanisms.
Note: A: in voluntary isolation, susceptible individuals withdraw from economic activity due to the presence of infectious
individuals (green dashed), while under targeted isolation susceptible agents engage in much more economic activity (blue
dashed). B: more infectious agents at activity sites under voluntary isolation leads to higher probability of infection throughout
epidemic. C,D,E: overall contacts, contacts by activity and prevalence (% infectious) do not change meaningfully across
voluntary and targeted isolation, as the same infection outcomes are achieved despite enabling far more activity by susceptible
individuals with targeted isolation.
[Figure 2.4 panels: for each information-friction scenario (testing lag + poor quality test; testing lag + improving test quality; decreasing testing lag + improving test quality; frictionless baseline), prevalence per 100,000, average loss per person ($), % current infections over time, and recession depth (% deviation) under blanket lockdown, voluntary isolation, and targeted isolation, plus summary panels of the share of maximum economic and disease-control gains achieved.]
Figure 2.4: Model outcomes with different information frictions.
Note: Panels A-D show key model outcomes under 10% test quality and an 8-day lag between testing and reporting. Panels E-H show these outcomes when test quality linearly improves from 10% to 95% quality by day 75 under a constant 8-day test reporting lag. Panels I-L show these outcomes when test quality linearly improves as before, and the test reporting lag reduces from 8 days to 5 days at day 60 then from 5 days to 3 days at day 75. Panels M-N summarize how disease-economy outcomes vary across these scenarios. Panel O summarizes the disease-economy outcomes under these scenarios relative to the baseline in Fig. 2.2.
[Figure 2.5 panels: for each compliance scenario (zero compliance + perfect information; partial compliance + perfect information; partial compliance + improving information; frictionless baseline), prevalence per 100,000, average loss per person ($), % current infections over time, and recession depth (% deviation) under blanket lockdown, voluntary isolation, and targeted isolation, plus summary panels of the share of maximum economic and disease-control gains achieved.]
Figure 2.5: Model outcomes with different compliance rates.
Note: Panels A-D show key model outcomes under 0% compliance and no information frictions. Panels E-H show these outcomes under 75% compliance and no information frictions. Panels I-L show these outcomes when test quality linearly improves from 10% to 95% by day 75 and the test reporting lag reduces from 8 days to 5 days at day 60 then from 5 days to 3 days at day 75. Panels M-N summarize how disease-economy outcomes vary across these scenarios. Panel O summarizes the disease-economy outcomes under these scenarios relative to the baseline in Fig. 2.2.
[Figure 2.6 panels: A) productivity loss versus the proportion of pre-symptomatic, asymptomatic or mild infections; B-E) ratios of individual losses averted and cases per 100,000 averted under targeted versus voluntary isolation as the consumption-to-labor contact ratio and the unavoidable-to-avoidable contact ratio vary with the asymptomatic share; F-G) the same ratios as the contact function shape varies (convex, linear, concave).]
Figure 2.6: Result sensitivity to key model parameters.
Note: We plot ratios of outcomes under targeted vs. voluntary isolation to highlight the relative variation in outcomes under
each strategy. The white dots in panels A and C show the baseline parameterization. A: Mapping between productivity losses
and implied share of the population which is pre-symptomatic, asymptomatic, or has mild symptoms (i.e., infectious individuals
able to work). A productivity loss of 0.85 implies approximately 80% of the population are pre-symptomatic, asymptomatic, or
have mild symptoms. B,C: Ratio of individual losses averted (B) and ratio of cases per 100k averted (C) under targeted isolation
vs. voluntary isolation as proportion of contacts at consumption relative to labor activities increases (a value of 1 means equal
number of contacts at consumption and labor) and as the asymptomatic share increases. D,E: Ratio of individual losses averted
(D) and ratio of cases per 100k averted (E) under targeted isolation vs. voluntary isolation as proportion of unavoidable contacts
(e.g., home) relative to avoidable contacts (consumption & labor) increases (a value of 1 means an equal number of contacts at
home as at consumption & labor) and as the asymptomatic share increases. F,G: Ratio of individual losses averted (F) and ratio
of cases per 100k averted (G) under targeted isolation vs. voluntary isolation as contact functional form varies. Convex contact
functions imply high-contact activities are easiest to avoid, while concave contact functions imply low-contact activities are
easiest to avoid (see Materials and Methods).
Appendix
All data for replicating the results in this paper can be found at https://github.com/epi-econ/COVID19ControlStrategies (Ash et al.).

All code for replicating the results in this paper can be found at https://github.com/epi-econ/COVID19ControlStrategies (Ash et al.). All simulations were run in R.

The supplementary information for this paper can be found at https://github.com/epi-econ/COVID19ControlStrategies (Ash et al.).
Chapter 3

Talking and bubbles: does social media affect the speed and magnitude of bubble formation?
1 Introduction
How do conversations people have about Bitcoin or Gamestop turn into a bubble, and how do a small
number of enthusiastic “fanboys/girls” contribute? Can bubbles emerge simply because a very contagious
idea asserts itself in public conversations? Why do people talk about bubbles at all, and why do people
listen, read tweets or read newspaper articles about them? I propose a possible answer to some of these
questions with “narrative bubbles” – financial bubbles that take off because themes, ideas or “narratives”
become widely discussed and have certain price beliefs attached to them. These narratives have their
own aggregate dynamics based on contagion and are influenced by whether people want to hear about
something everyone else is discussing, or instead something novel that no-one is discussing.
To analyze the impact of these narrative bubbles, I build a model of talkers and listeners. Talkers enjoy
talking about narratives they have been persuaded by and choose how much to do so as well as how much
to invest in a related risky asset. Listeners choose how much to listen to talkers and how much to invest
based on their private information. In choosing to listen, they admit a chance of becoming convinced of a
narrative’s merit, an event which, should it occur, leads them to become talkers and update their beliefs.
Talkers may also forget or become bored of the narrative, depending on how forgettable that narrative is,
returning them to the pool of listeners. Bubbles and crashes emerge in equilibrium through individuals
optimizing based on their changing beliefs as narratives compete for listeners’ attention. The narratives
themselves, along with their corresponding quantities of talking and listening, can be recovered from text
data. I demonstrate this for two bubbles using social and traditional media data, finding evidence that
social media expedites and amplifies bubble events.
At the heart of the model is an underlying process of people changing their minds. Evidence suggests
changing one’s mind is costly. Individuals only do it if it is worth their while (see e.g. (84), (174)).
Such evidence predicts that some internal choice is made about whether to seek out the information that
ultimately leads someone to change their mind. In this model, that choice is a rational choice of whether
to listen to conversations about narratives related to a bubble (e.g. agents wake up each day and decide
how much time to allocate to reading tweets about Bitcoin’s use as dark currency). If agents expect that
becoming infected by a given narrative will be beneficial, for example it might yield profits in a state
where an asset’s price takes off, then they have more reason to listen. Similarly, if there are more people
talking about a narrative other agents may wish to listen more. Individuals changing their mind in such
economically and socially motivated ways have been well documented in the behavioral literature (see e.g.
(131),(138),(149)).
The model’s first result is that bubbles and crashes do form and the content of language matters in two
ways. On the one hand, two narratives’ language can differ in terms of optimism/pessimism (the degree of
disagreement). Large disagreements lead to large price volatility as agents differ strongly in their beliefs
about the asset's value and the narrative infections spread at different rates. On the other, each narrative's language implies a level of novelty relative to existing narratives, and innovative narratives are catchy. This novelty induces a form of competition among narratives, leading ideas to crowd each other out at different times and producing large price swings. This contrasts with two non-novel narratives, which do not compete and so induce low volatility (e.g., if everyone is already familiar with, or tired of, the narratives, they do little to induce spread or alter price movements). I show that the model generates the key essential
feature of a bubble – a large persistent overvaluation of the risky asset relative to fundamentals – and
the size and speed of this overvaluation depend on these language attributes. Two ideas that are polar
opposites in terms of optimism (one optimistic, one pessimistic) or are very different/novel can lead to
large bubbles and large crashes. These effects tend to interact such that the biggest bubbles occur with
large disagreements and large novelty.
A second result is that bubbles can form faster and with greater amplitude when social media is present. If social media leads to more interactions between agents, narratives become more contagious, which can hasten a bubble. Alternatively, if social media changes the content of language, for example by providing more novel narratives or greater polarisation/disagreement between narratives, then, as in the first result, bigger and faster bubbles are possible and crashes become larger and more likely. I demonstrate this model prediction by applying my calibration approach to actual social media language data during the Bitcoin (2017/18) and Gamestop (2020/21) bubbles. I then recalibrate to recover the narratives' contagiousness and forgettability using traditional news data only. Across my range of results, I show that without social media the 2017/18 Bitcoin bubble would have peaked between 37 and 52 days later, would have peaked on average 23% lower (though this estimate varies and is only 2% in some results), and would not have crashed.
The model merges several strands of literature. (187) has suggested using infection/contagion dynamics
to model narratives. Models of bubbles based on agents with heterogeneous beliefs have become a large
subfield of bubble models (see e.g. (190),(184),(100),(47)). These models feature an “agreement to disagree”
among agents which fuels bubbles. My model merges these two areas, as well as extending the concept
of “agreeing to disagree” to cover “agreeing to disagree” with future versions of oneself. For example,
imagine a sober agent today considering whether to become a drunk. Today’s sober agent knows she will
have very different beliefs when a drunk. Nevertheless she may still choose to drink today, carrying with
it a chance of becoming a drunk, if others are drinking or she thinks she may profit from being a drunk
in a future world where only drunks can find jobs. In aggregate, the proportion of drunks will be driven
by these individual choices taking today’s labor market for drunks as given. This agreement to disagree
with future versions of oneself is a further example of the process of changing one’s mind at the model’s
core. Such behavioral explanations exist in the literature, for example cited as causes of the (188) excess
volatility puzzle ((189)).
Talking is an economic decision that can drive a range of outcomes. I show how talking and listening decisions can be built into an economic framework that can be examined empirically. I use modern computational linguistics techniques in three main steps. First, I recover the narratives themselves through two alternative methods: one more researcher-guided, the other more automated (these are modified versions of an approach used in (113)). In the first, I order common words into intuitive economic groupings and fill out the vocabularies of these groupings using similar words from Word2Vec, a neural network approach which trains a vector space with a geometric position for each word (see (101)). The second approach is similar but uses Latent Dirichlet Allocation (LDA) to determine the initial groupings. The two key language attributes of these groupings, optimism and novelty, are also recovered from the trained Word2Vec model. Narratives are defined by the word groupings and their language attributes. Second, the numbers of talkers and the quantities of talking and listening over time are recovered by counting how many tweets or news articles contain the narratives' vocabularies. Finally, I recover the contagiousness and forgettability of the narratives by calibrating the model to the time series of talker numbers, generated in the previous step, for each narrative using a standard Generalized Method of Moments (GMM) approach.
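A minimal sketch of the second step is given below (Python; the two vocabularies and the two-word matching threshold are hypothetical stand-ins for the Word2Vec/LDA-derived groupings). It counts, day by day, how many tweets use a narrative's vocabulary, producing the talker time series that the GMM step is then fit to.

```python
from collections import Counter

# Hypothetical narrative vocabularies (the dissertation's are built from Word2Vec/LDA)
narratives = {
    "financial_market": {"stock", "market", "valuation", "long", "buy"},
    "technology":       {"blockchain", "big", "data", "science", "crypto"},
}

def daily_talker_counts(tweets_by_day, narratives, min_hits=2):
    """Count, for each day, how many tweets contain at least `min_hits` words
    from a narrative's vocabulary; the series proxies the number of talkers."""
    series = {name: [] for name in narratives}
    for day in tweets_by_day:
        hits = Counter()
        for tweet in day:
            tokens = set(tweet.lower().split())
            for name, vocab in narratives.items():
                if len(tokens & vocab) >= min_hits:
                    hits[name] += 1
        for name in narratives:
            series[name].append(hits[name])
    return series

tweets = [["Blockchain and big data will change everything",
           "Long buy the stock because the market valuation is wrong"],
          ["data science meets crypto", "just memes today"]]
print(daily_talker_counts(tweets, narratives))
```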
Through this empirical examination I find interesting new explanations for the bubbles studied using
my narrative approach. In both Bitcoin and Gamestop bubbles a “financial market” narrative is present
as users describe and explain to each other terms associated with understanding a financial bubble. Users
explain the “stock market”, “asset valuation”, “long buys” and other financial market terminology. For
both bubbles these are optimistic narratives that push prices higher. For Gamestop, a second optimistic
narrative emerged of small retail investors combating large hedge funds, further fuelling the bubble. In
Bitcoin, a more neutral narrative emerged explaining technology and cryptocurrency terms such as "blockchain", "big data" and "data science". In my model, this more realistic narrative explaining Bitcoin disagreed with the euphoric market narrative and was novel, leading to the price volatility that formed the eventual crash. Similarly, for Gamestop a more negative narrative emerged about whether, and eventually how, Robinhood had halted trading in Gamestop on its platform, with users employing terminology like "protect billionaires" and "block trading". Future research utilising my framework but allowing for more narratives may add
extra insight.
The model sheds light on two key aspects of modern bubbles. Firstly, recent bubbles appear to occur
at a rapid pace (potentially over just several days) and with extreme magnitude (Bitcoin doubling in value
over the space of just 18 days in 2017 before halving in the following 45 days, or Gamestop shares soaring
to over 8 times their original value in the space of two weeks). Many attribute a role to social media, the
analysis of which requires a model of talking and listening. In particular, such a model can capture the
changing nature of conversations, which used to happen more frequently as person-to-person interactions
or between the authors of traditional print media and its readers, versus the increased prevalence today of
interactions in online chatrooms or through Twitter posts and retweets. Second, the content of language
itself would be expected to impact how likely agents are to change their mind and may be different in
modern social media fuelled bubbles. Examining this requires a model with a role for language attributes
such as optimism and novelty of the language.
To address these possibilities, my model departs from earlier models of heterogeneous-belief bubbles ((190), etc.) as well as other bubble models ((15)) by including decisions to talk and listen and an associated model of contagion (similar to that in (10)). Talking and listening allow the study of interactions, creating a channel for social media to impact the speed and magnitude of bubbles forming. They also allow a role for language in the formation of bubbles. Additionally, my empirical analysis highlights a difference in the content of language on social media versus in traditional news. Consistent with other literature, I find that traditional news features fewer narratives or themes during widely-covered events ((158)). In contrast to this, my analysis adds that social media appears to cover substantively more narratives, creating a key difference in the language content circulating on social media versus in traditional news.
An additional departure from previous models is endogenous trading volume. A key empirical fact
about bubbles is contemporaneous high trading volume. (15) suggest exogenous random “wavering”
between a growth or value mindset to generate trading volume. My model provides an endogenous reason
for this wavering – in day to day activities, agents are either becoming reinfected by an idea, or forgetting
the narrative (think initially being obsessed with “Blockchain” for a week, but then time is scarce and you
instead become obsessed with some other idea). This wavering leads to associated changes in portfolio
allocations each day, creating trading volume. A related concept that my model has the power to examine is "talk volume": instead of how frequently agents adjust their investment portfolios each day, how frequently they switch from talking about one idea to another. I examine talk volume in the
Twitter data and find it to also be elevated during bubble episodes as in my model.
A third departure from existing models is adding new data to the analysis of bubbles. The literature
examines prices and fundamentals ((166)), investor buying and selling ((161)), firm characteristics ((108))
and survey data on beliefs ((23)). To this my model adds text and interaction data. The structural approach
can be applied more broadly to any field where talk or narratives affect economic variables e.g. the
emergence of climate change as a narrative driving investment in mitigation and adaptation, a recession
as a narrative suppressing consumer spending, inflation targeting as an idea leading to better monetary
policy decisions and therefore fewer stagflation episodes. The model scales: policy-makers could use it as
a tool to assess the impact of dozens of narratives on the likelihood of bubbles forming. The model also
suggests several new, language-based policy tools for addressing bubbles or “leaning against the wind” e.g.
for two ideas that disagree strongly, increasing correlation in their language, i.e. less novelty, tames large
price swings (similar to providing accurate information in similar language to optimistic information).
Several other aspects of the literature are worth mentioning and suggest extensions for future research.
My model is related to herding models suggested by (14) and (29). Such models suggest an externality
in herds: agents that follow the herd deprive the market of information by not acting on their own
information. My model introduces structure that enables the measurement of the information externality
present in herding and informational cascades during real bubbles, similar to that only previously attainable
in experimental settings such as (53). (23) use survey evidence to test a structural model of cryptocurrency
bubbles with heterogeneous beliefs. Their underlying heterogeneous belief model is similar to mine, but
has no role for language or social interactions. Linkages between their survey data approach and the
news/language approach used here might generate interesting findings. Finally, my model relies on the
displacement observation made by (Kindleberger) that most bubbles start with general good news about
fundamentals. I do not provide a theory of how this news (what I call a new narrative) emerges or how the
emergence of one narrative may spur the introduction of others. Of course, the attention generated by a
bubble may drive the emergence of opposing negative ideas (in fact,(1) show that such negative ideas can
be used as a coordination point for sophisticated investors to leave the market, which can be an important
crash mechanism). In earlier work ((8)) I examine the importance of such correlations among “risk factors”
during bubbles; further research and modelling of such correlations may provide useful conclusions.
In Section 2 I present the model. Section 3 describes a series of model results, e.g., showing that the model does generate bubbles and crashes, as well as results on trade volume and informational cascades. Section 4 presents the main data used to answer the research question in this paper. Section 5 presents the empirical methodology and, in particular, how the text data is summarized into quantitative data that can be used in the model. Section 6 displays the calibration, showing that the model fits talking, listening and price data well. Section 7 shows the calibration results and analyzes the role of social media in bubble events. Finally, Section 8 concludes.
2 A model of talking and listening during bubble events
General setup: I consider a risky asset with unknown fundamental value τ and fixed supply ω. In addition there are M narratives, "ideas" or "sub-ideas" about the asset, indexed by i, each with its own beliefs about the asset's value (θ^P_i), its own spreadability (β_i) and its own forgettability (P^F_i). There are two types of agent in this market, listeners and talkers. At time t, the proportion of listeners is L, and the proportion of talkers for each narrative "infection" is I_i, such that the total population is normalized to 1.

Listeners "change their mind" through listening. Listeners choose how much to invest in the asset (x^S_t) and how much to listen to each narrative (l_{i,t}, ∀i ∈ M). Listeners' information amounts to a private signal drawn from a distribution around the asset's fundamental value (θ ∼ F(τ)). This information is the probability of a high price realization for the asset price next period (p_{t+1}), such that:

\[
E^L_t\, p_{t+1} = \theta\, p^H + (1 - \theta)\, p^L \tag{3.1}
\]
At each time t, listeners maximize a constant absolute risk aversion (CARA) utility function over next period's wealth. Listeners also plan forward to wealth in future periods based on their listen decision. Listening may "change their mind" or their beliefs and convert them to a talker for narrative i with probability P^I_i, itself a function of l_{i,t}. This changes their beliefs and therefore their payoffs. Finally, listeners face a time constraint for each listen decision based on a fixed availability of time (l_{i,t} ≤ \hat{L}). The value function for this dynamic choice is shown in appendix 8.
Talkers talk and may forget the narrative. Each period, cognitive resources are scarce, so talkers are infected by one narrative i only; they cannot be infected by two narratives simultaneously. They choose how much to invest in the asset (x^i_t) and how much to talk (t^i_t). Talkers of type i Bayesian-update their beliefs based on the public signal conveyed by narrative i (see Appendix 8 for details). Their perceived probability of the high price realization becomes \hat{θ}_i = θ θ^P_i, where θ^P_i is a feature of narrative i. Finally, talkers may forget narrative i with a fixed probability (P^F_i), at which point the narrative's public signal is forgotten and their beliefs return to their private signal; they become listeners again.

As for listeners, talkers maximize CARA utility over next period's wealth, but with an additive and time-separable boost to utility from talking: talkers enjoy talking. Talkers also face a time constraint limiting how much they are individually able to talk. Talkers of type i thus face the following static problem (where W^i_t is just wealth accumulated up to point t):

\[
\max_{\{x^i_t,\, t^i_t\}} \; E_t\!\left[ U\!\left( W^i_t + x^i_t (p_{t+1} - p_t) \right) \right] + H(t^i_t), \qquad \forall t \in \{1, \ldots, T\} \tag{3.2}
\]
In this model, then, agents exert daily rational choices over whether to change their minds through listening. In this way agents optimally select the beliefs they will hold in future periods, while acknowledging that they may "agree to disagree" with a future version of themselves in doing so. A range of evidence has shown that agents reason in such a motivated way, choosing reasons so as to accord with existing beliefs rather than choosing beliefs based on reason (e.g. (84), (174)). This implies that changing one's mind is cognitively costly in some way, since retaining one's existing beliefs is the default ((131), (149)). Given such behavior, as in this model, agents require social or economic incentives to make a choice to change their mind, as well as the random chance for these incentives to materialize ((173), (76)).
What kind of infection can you catch from listening? Narratives have a language structure which will be used to tie them to data and recover their attributes. A narrative features a language $L_i = (w_{i,1}, w_{i,2}, \dots, w_{i,N_i})$, where each $w_{i,n} \in W$ is a $w$-dimensional vector representing a single word (e.g. "blockchain") in the language vector space $W$. $L_i$ is therefore the set of $N_i$ words associated with and used to discuss narrative $i$, where $N_i$ can differ by narrative.
Narratives form a "narrative space" ($S$), the collection of $M$ narratives such that $S = (1,\dots,M)$. The linguistic distance between two narratives is measurable as the Word Mover's (WM) distance, i.e. the distance required to overlay the vocabulary of narrative $i$ ($L_i$) onto that of narrative $j$ ($L_j$; see Section 5 for more details). The distances between narratives can therefore be represented by the Idea Substitution Matrix (see footnote 1):

Definition 2.1 (Idea Substitution Matrix).
$$A = \begin{pmatrix} 1 & a_{1,2} & \cdots & a_{1,M} \\ a_{2,1} & 1 & \cdots & a_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ a_{M,1} & a_{M,2} & \cdots & 1 \end{pmatrix}$$
Each $a_{i,j}$ is the WM distance between the vocabulary of $i$, $L_i$, and the vocabulary of $j$, $L_j$, such that $a_{i,j} = 1$ implies $i$ and $j$ are perfect language complements (i.e. $j$ is very similar to $i$) and $a_{i,j} = 0$ implies they are perfect language substitutes ($j$ is perfectly novel compared to $i$; see Section 6 for more details). "Similar" narratives will boost each other's spread, while narratives that are "novel" relative to each other will compete for spread. This occurs through the Adjusted Infection Vector.
Definition 2.2 (Adjusted Infection Vector).
$$A\vec{I} = A\begin{pmatrix} I_1 \\ I_2 \\ \vdots \\ I_M \end{pmatrix}$$
Narrative attributes and choices affect the probability that a listener is infected by narrative $i$. This probability for each $i$ is a function of narratives' spreadability ($\vec{\beta}$), interactions among talkers and listeners (the vector $\vec{T}\vec{L}'$) and the adjusted narrative infection vector ($A\vec{I}$).
Footnote 1: Matrix $A$ will be a symmetric, square matrix, since there is no order dependence in measuring narrative distances. Diagonal elements will be 1, denoting that a narrative is located on top of itself. If $A = I$, the identity matrix, then all ideas or narratives are perfect substitutes, i.e. narratives crowd each other out one-for-one.
$$\vec{P}^I(\vec{T}\vec{L}', A\vec{I}) = \vec{\beta}'\,\vec{T}\vec{L}'\,A\vec{I} \qquad (3.3)$$
Optimal decisions and equilibrium. All decisions are static and have simple solutions except the listen
decision. Listening is determined through value function iteration. Investment decisions have simple
interior solutions governed by risk aversion. Talkers talk up to their time constraint. This yields the
following simple optimal decision equations:
$$x^{S*}_t = \frac{E^S_t(p_{t+1} - p_t)}{\gamma\sigma^2}, \qquad x^{i*}_t = \frac{E^i_t(p_{t+1} - p_t)}{\gamma\sigma^2}, \qquad \forall i \in \{1,\dots,M\} \qquad (3.4)$$

$$t^{i*}_t = \hat{T}, \qquad \forall i \in \{1,\dots,M\} \qquad (3.5)$$
In equilibrium, the asset market clears and individual listening is set equal to aggregate. The resulting
equilibrium price equation is a weighted average of the price expectations or beliefs that exist in the market.
$$L_t = l(\vec{I}_t, L_t) \qquad (3.6)$$

$$p_t = \Big[\sum_{i=1}^{M} E^i_t(p_{t+1})\, I_{i,t} + E^S_t(p_{t+1})\Big(1 - \sum_{i=1}^{M} I_{i,t}\Big)\Big] - \omega\gamma\sigma^2 \qquad (3.7)$$
Each narrative updates each period based on agents becoming infected by it and forgetting it, according to the state update equations below. Individuals interact through talking and listening, contributing to the contagion. A proportion $P^I_i$ of listeners is infected by narrative $i$ (note this is now the aggregate probability, rather than the individual probability that was a function of $l_{i,t}$). A proportion $P^F_i$ of talkers of type $i$ forgets the narrative each period. Error terms are added for the later calibration exercise. Since agents can only be infected by one narrative at a time, these equations cause competition among narratives in equilibrium. Competition is fiercer the more novel narratives are relative to each other.
$$L_{t+1} = L_t - L_t \ast \vec{P}^I(\vec{T}_t\vec{L}'_t, A\vec{I}_t) \ast \mathbf{1}_M + \vec{P}^{F\prime} \ast \vec{I}_t + \epsilon^S_t \qquad (3.8)$$

$$\vec{I}_{t+1} = \vec{I}_t + \vec{P}^I(\vec{T}_t\vec{L}'_t, A\vec{I}_t) \ast S_t - P^F_i \ast \vec{I}'_t + \vec{\epsilon}^I_t \qquad (3.9)$$
This leads to the following definition of a “talking” and “listening” equilibrium.
Definition 2.3 (Talking and listening equilibrium). A combination of (i) optimal investment choices $(x^S_t, \{x^i_t\}_{i=1}^M)$ from equations 3.4; (ii) optimal talking and listening policy functions $(\{t^i_t, l^i_t(\vec{I}_t, \vec{L}_t)\}_{i=1}^M)$; and (iii) a price path and market-clearing conditions satisfying equations 3.6 and 3.7.
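To make these mechanics concrete, the following is a minimal simulation sketch of the contagion and pricing equations (3.3) and (3.7)-(3.9) in Python. All parameter values are illustrative assumptions, not the paper's calibration, and the talker belief mapping is a simplified stand-in for the Bayesian updating detailed in Appendix 8.

```python
# Minimal sketch: narrative contagion and pricing under assumed parameters.
import numpy as np

T_sim, M = 150, 2                       # days and number of narratives (illustrative)
beta = np.array([0.9, 0.7])             # spreadability of each narrative
P_F = np.array([0.05, 0.05])            # forgetting probabilities
theta_P = np.array([0.3, -0.3])         # narrative "optimism" tilts (assumed mapping)
A = np.eye(2)                           # idea substitution matrix: a perfectly novel pair
theta, pH, pL = 0.5, 2.0, 1.0           # private signal and high/low price realizations
gamma, sigma2, omega = 2.0, 0.05, 1.0   # risk aversion, variance, asset supply
talk, listen = 0.8, 0.6                 # constant talking/listening intensities (assumed)

I = np.array([0.01, 0.0])               # initial infections; narrative 2 seeded later
prices = []
for t in range(T_sim):
    if t == 75:
        I[1] = 0.01                     # pessimistic narrative arrives mid-bubble
    L = 1.0 - I.sum()                   # listeners are the remaining population (eq. 3.8 implied)
    P_I = beta * (talk * listen) * (A @ I)                  # infection probabilities, eq. (3.3)
    theta_i = np.clip(theta * (1.0 + theta_P), 0.0, 1.0)    # talker beliefs (simplified stand-in)
    E_L = theta * pH + (1.0 - theta) * pL                   # listener expectation, eq. (3.1)
    E_I = theta_i * pH + (1.0 - theta_i) * pL               # talker expectations
    p = E_I @ I + E_L * L - omega * gamma * sigma2          # price, eq. (3.7)
    prices.append(p)
    I = I + L * P_I - P_F * I                               # infection update, eq. (3.9), no noise
print(f"peak price {max(prices):.3f}, final price {prices[-1]:.3f}")
```

With the optimistic narrative alone the simulated price rises persistently above the listener valuation; seeding the pessimistic narrative mid-path pulls it back down, previewing the bubble-and-crash dynamics discussed in the next section.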
3 Model results
The model generates the essential features of a bubble event: a persistent overvaluation of the risky asset relative to fundamentals, as well as a large devaluation given the right conditions. This section demonstrates how the model does this by first explaining the key language attributes between narratives that trade off in the model: novelty and disagreement. I then show that the model of changing one's mind through talking and listening does generate bubble events and crashes, whose size and timing depend on the language attributes at the heart of the model. I then demonstrate computationally the important role listening plays in this. Finally, I describe (a) how the model generates the other key empirical feature of bubbles, high trade volume, as well as defining and numerically describing "talk volume"; and (b) what the structure of the model allows us to deduce about informational cascades.
3.1 Trade-offs between idea substitutability and disagreement
The language of sub-ideas affects agents' choices and, through them, the bubble's size and speed. As an example, in the case of the Dutch Tulip Mania, consider three narratives characterized by the following vocabularies: 1 = [fashionable, artistic, luxury]; 2 = [contract void, possible]; 3 = [contract void, impossible]. The first narrative, which may have driven the demand surges for Tulips in 17th-century Holland, is both "optimistic" (positive and high $\theta^P_1$) and semantically very novel relative to the other two ($a_{12} = a_{13} = 0$). Agents infected by this narrative will talk about Tulips and spread the idea that they are valuable, generating a bubble if the narrative is sufficiently persuasive or spreadable. Listeners that expect the idea, and hence the price of Tulips, to take off will be motivated to listen, since the implied long position could be profitable for them.
The second two narratives represent the debate, existing at the time, over whether futures contracts for Tulips might be allowed to be voided by the Dutch parliament, favoring buyers (see footnote 2). These two narratives disagree: believing this would happen (i.e. narrative 2) would make Tulips even more valuable (it is optimistic, with high $\theta^P_2$); believing it would not (i.e. narrative 3) would make Tulips crash (it is pessimistic, with negative $\theta^P_3$). These narratives are also very novel relative to the first narrative ($a_{12} = a_{13} = 0$), implying they compete with it and reduce the chance of being infected by it, while fomenting the spread of each other ($a_{23} \gg 0$). These two factors, a large "disagreement" and large "novelty" (between the initial idea and the other two), create price volatility and are required to establish a crash.

These two roles for language are summarized in Figure 3.1. We can project the language vector space, $W$, into the 2-dimensional space representing disagreement and novelty shown in the left panel of Figure 3.1. The vertical axis shows agreement: proximity of a narrative to the concept of optimism (positive quadrants) versus pessimism (negative quadrants) (see footnote 3). The horizontal axis measures relative novelty: it measures semantic synonymy (narratives on the vertical axis are perfectly synonymous with some base word represented by 0), where two narratives far from each other on the horizontal axis are very novel relative to each other, while two that are close are said to be very "similar".
In the model, small bubbles form fairly easily, but specific conditions are required for very large bubbles. As long as there exists a sufficiently infectious, optimistic narrative, a bubble can form. However, as shown in the left panel of Figure 3.1, larger bubbles occur when there are multiple agreeing, optimistic narratives, like 1 and 2 as depicted, that are similar to each other. More similarity means the narratives contribute to each other's spread, causing each to spread faster and more widely than it would alone.

Figure 3.2 shows that the model generates bubbles (i.e. a persistent overvaluation of the risky asset) and that large bubbles form with several similar and optimistically agreeing narratives. Imagine an initial narrative infection represented by 1 in Figure 3.1. Figure 3.2 then represents the position in this space
Footnote 2: As Tulip prices rose rapidly, buyers that had committed to buy Tulips at a high price in 6 months faced the risk of losing money if the price fell. The debate therefore suggested that buyers should not face a legal obligation to purchase, and could instead pay a smaller compensation payment to the seller. The option to void would reduce forward-contract uncertainty for buyers, making Tulips (and the forward contracts through which they were obtained) even more valuable.
Footnote 3: In the model I measure perfect optimism as being represented by a vocabulary of the 100 words most associated with the word "buy", since I wish to represent purchase optimism. I measure perfect pessimism as being represented by a vocabulary of the 100 words most associated with "sell". I conduct sensitivities on the choice of optimistic/pessimistic word.
Figure 3.1: There is a trade-off between substitutability and disagreement
Notes: Attributes of the language space affect the likelihood of a bubble forming or a crash occurring. As in the two panels, the language space (where each point represents a word) can be summarized along the two dimensions of semantic novelty/similarity (measured on the x-axis relative to some base word, i.e. two words close on the horizontal axis will have a semantically similar meaning) and agreement on optimism (where two narratives close vertically are said to "agree", whereas those far apart vertically are said to "disagree"). Panel A shows a narrative combination expected to lead to the formation of a bubble, since narratives 1 and 2 have agreeing optimistic expectations of the asset price, and narrative 2 is semantically similar to narrative 1, causing the narratives to fuel each other. Panel B shows a narrative combination expected to lead to a crash, since narrative 3 now features a large disagreement with narrative 1, is very novel relative to narrative 1, and can thus out-compete/crowd out narrative 1.
of adding a narrative 2, and the resulting bubble outcomes from running these comparative statics in the model. Bubble size is denoted by the size of the circle at each point in this parameter space. On the horizontal axis, 0 represents a narrative 2 perfectly novel to narrative 1, with novelty falling as we move along the horizontal axis. On the vertical axis, 0 represents a neutral idea 2 (smallest disagreement with idea 1); -0.5 represents a very pessimistic idea 2 (large disagreement with idea 1). Generally, starting from most points in the chart, moving up (a more optimistic narrative 2) and to the right (less novel) leads to a larger circle/bubble. This relation does not always hold, depending on the position in the space, but more disagreement and novelty always lead to more volatility in outcomes (i.e. whether large bubbles and crashes occur or not).
Crashes form less frequently than bubbles. Firstly, for a crash to occur, usually a bubble must already have occurred (in this paper I do not consider cases where a stable and highly priced asset crashes in value, though the framework could be used to analyse such cases). Second, narrative 2 must be capable of out-competing or displacing the optimistic narrative(s) that generated the original bubble. As shown in Figure 3.1, this displacement depends on several factors: (i) the narrative must be sufficiently pessimistic (more negative on the vertical axis) – this means there are bigger potential profits from shorting the asset, which will convince more people to listen to the narrative; (ii) the narrative must be sufficiently novel so as not to be simply subsumed by the existing optimistic narrative (i.e. if the languages of the two narratives are too similar, then the language will just further promote the first narrative, and the new pessimistic narrative will not take off).

Figure 3.2 demonstrates first that the model does generate crashes. The figure also shows computationally that larger crashes occur with larger disagreements and more novelty. In the figure, the color of the circle represents crash size (as measured by the price level at day 100 divided by the price at day 150 – higher implies the price has crashed by more, i.e. is much lower at day 150 than at day 100). Red implies a larger crash, blue implies no crash at all. The largest crash occurs at the bottom-left position in the chart, where the disagreement between narratives 1 and 2 is largest and narrative 2 is most novel.
Interestingly, the relationship between crash size and disagreement, when narrative 2 is very novel, is not monotonic. Moving directly north from the lower-left point does not lead to a monotonic reduction in crash size – at a degree of optimism of 0 there is a larger crash than at −0.25. This occurs because agents will only take large risks if there is a large profit. At (0,−0.5) (lower left), the large disagreement means that if a crash begins, a short position generates large profits, justifying the risk over whether a crash will occur. As a result, agents listen to narrative 2 and so it takes off, generating a crash. At (0,−0.25), the disagreement is smaller, so the profit if there is a crash falls; it is no longer worth the risk, and listeners no longer listen. Narrative 2 never takes off and there is no crash. Finally, at (0,0), there is not much disagreement, so the risk is now small (there are fewer large-crash or large-bubble states). Therefore listeners find it optimal to listen again, narrative 2 takes off and a small crash does occur.
In summary, Figure 3.2 shows that a model of changing one's mind through talking and listening can generate bubbles and crashes. The sizes and speeds of these bubble and crash phases depend on the two key attributes of language: disagreement and novelty.
3.2 Listening choices affect asset prices
In the model choices to listen affect aggregate bubble speed, size and shape. Narratives only take-off if
listeners are willing to read tweets, newspaper articles or listen to other agents talking about the narrative.
Intuitively agents would be expected to listen to narratives more if they expect to learn profitable information
from them, or they know others that are listening to the narrative and so do not want to miss out. For
example (18) show naive investors listen to more savvy investors as they anticipate this information will
be profitable. (11) show having, and therefore presumably listening to, friends purchasing homes increases
71
Figure 3.2: Comparative statics: more novelty and disagreement affect bubbles and crashes
Notes: Parameter space of a narrative infection 2, relative to a first narrative. Axes are the same as in Figure 3.1. Horizontal axis: 0 represents a totally novel narrative 2 compared to narrative 1, with semantic similarity increasing along the axis. Vertical axis: 0 indicates a neutral position (i.e. narrative 1 is optimistic and narrative 2 is neutral), with disagreement becoming larger moving down the axis. Circle size: indicates the size of the bubble at that point of the parameter space (measured as price growth between day 100 and day 50). Circle color: indicates crash size (measured as price at day 100 / price at day 150), blue being no crash, red indicating a very large crash.
home purchases. I demonstrate that this intuition is present in the model.
The listen choice in the model can be analyzed through the following expression for the value function derivative, formed by rearranging the first-order conditions for the listening choice.

$$\frac{\partial V^L}{\partial l_{i,t}} = \frac{P^I_{i[l_{i,t}]}}{1 - \sum_{j=1}^{M} P^I_j}\, E^L_t\big[V^I_i - V^L\big] \qquad (3.10)$$
The listening decision is affected by the relative value of narrative infection and its impact on the probability function. Inspecting the equation: (1) the derivative will only be negative if expectations of $V^L$ are greater than the expected value of infection by narrative $i$; in this case agents prefer not to listen to idea $i$. (2) If listening has a larger impact on the probability of being infected with $i$, then agents want to listen more (if the narrative has positive value, i.e. $V^I_i > V^S$) or listen less (if the idea has negative value). (3) The impact of other ideas on the listen decision for idea $i$ depends on their similarity, term by term (a numerical sketch of the sign logic follows this list):
1. $P^I_{i[l_{i,t}]}$: the spread of perfectly novel ideas does not affect this term. However, more spread of similar ideas increases it, as there is a greater chance the agent is infected by this idea through the complementary idea.
2. $1 - \sum_{j=1}^{M} P^I_j$: listening makes less difference if the agent will likely remain a listener (i.e. the probability of remaining a listener is high).
3. $E^L_t[V^I_i - V^L]$: for an optimistic narrative, more spread of other optimistic narratives improves its relative value as the price improves, and vice versa for a pessimistic narrative. This term therefore depends on the agreement between the narratives.
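As a purely numerical illustration of this sign logic, the toy function below evaluates the right-hand side of equation (3.10) for assumed values of the probability terms and the value gap; none of the numbers come from the paper.

```python
def marginal_value_of_listening(dP_dl, P_sum, value_gap):
    """Right-hand side of equation (3.10): marginal value of listening to narrative i."""
    return dP_dl / (1.0 - P_sum) * value_gap

# Assumed values: a narrative with positive expected value versus one with negative value
print(marginal_value_of_listening(dP_dl=0.2, P_sum=0.3, value_gap=+1.5))   # positive -> listen more
print(marginal_value_of_listening(dP_dl=0.2, P_sum=0.3, value_gap=-1.5))   # negative -> avoid listening
```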
Higher listening in this model leads to bigger, faster bubbles, as well as bigger, faster crashes. In Figure 3.3, I create price paths using the model equations 3.3, 3.7 and 3.8 and simple static listening choices for two narrative infections. The two narratives are perfectly novel relative to each other to make the comparison clear. The first narrative is optimistic and starts on day 0. The solid lines show the price paths for the asset associated with this one narrative under listening levels {0.5, 0.7, 0.9}. Higher levels of listening lead to a faster and larger price path.

Higher listening to pessimistic narratives also affects the bubble's crash. Line types indicate listening to a second, pessimistic narrative that is introduced on day 50. The dashed lines show a high level of listening to this narrative, which leads to a large, fast crash. The size of the crash is lower for bubbles that had higher listening to the first narrative in the first 50 days, since it takes a more contagious second narrative to displace the original idea. The dotted lines show the case where there is lower listening to the second narrative. For the bottom line (the least-listened-to first narrative) there is still a crash, though it is much smaller. For the two top lines, with more-listened-to first narratives, the dotted lines show that if the second narrative is less listened to then there is barely a crash at all.
The model approximates what is likely a more prolonged process of changing one’s mind. In real
conversations, persuasion likely occurs in several stages. In a first stage, agents become aware of a
narrative’s existence. In a second stage, agents actively “listen” or read information about the narrative
and decide whether they come to agree with it. The model ignores the first as it is probably not associated
with action. The model focuses on the second stage, assuming persuasion has not occurred until agents
are finally motivated into action by it.
Figure 3.3: Listening choices affect price path cumulatively
Notes: All lines show data simulated using the model equations and a simple parameterization. Solid lines represent the price path for a single narrative infection where listening is held constant at the level indicated, i.e. low, medium and high listening. Dashed lines indicate the size of the associated bubbles' crashes when listening is either low or high.
3.3 Model can explain trade volume and "talk volume"
A well-documented empirical fact about bubble events is that they feature high trade volume ((184), (40)).
During a bubble event there is significantly elevated buying and selling of the asset. Previous bubble models
have struggled to generate this empirical fact. Recently, (15) show that an exogenous form of wavering
included in their model can generate trade volume. In this section I describe how my model endogenizes
wavering with a behavioral foundation grounded in the model, and can therefore generate trade volume.
I additionally create the related concept of “talk volume” (how much agents change what they are talking
about) and document how this changes during a bubble episode.
Trade volume in my model happens as agents are continuously forgetting and relearning about narratives,
and therefore adjusting their asset portfolios accordingly. Intuitively, imagine that today I become infected
by an optimistic narrative and so buy more of an asset. Tomorrow I remain infected by the narrative and
thus maintain my portfolio. However, the following day a new narrative about a different asset infects
me and I thus shift my portfolio towards this other asset, away from the existing one. If a large portion of investors is engaging in this forgetting and reinfection during a fast-spreading narrative infection, this can generate substantial trade volume. Similarly, agents switch what they talk about a lot, since narrative infection implies a desire to talk about that narrative. This therefore also generates high talk volume.
Trade and talk volume have intuitive, corresponding definitions in the data and model. Within the model we can define trade volume as in the definition below. The left term in the equation represents changes over time to asset portfolios within groups, as well as changes over time in the size of these groups themselves. The right term represents those that change group between periods and the implied change in asset position that comes with such a change in beliefs.
Definition 3.1 (Trade volume).
$$vol^{trade}_t = \underbrace{(\%S_{t-1}\,\text{to}\,S_t)\,|x^S_t - x^S_{t-1}| + \sum_{i=1}^{M}(\%I_{i,t-1}\,\text{to}\,I_{i,t})\,|x^i_t - x^i_{t-1}|}_{\text{changes to portfolios within groups}} + \underbrace{\sum_{i=1}^{M}\Big[(\%I_{i,t-1}\,\text{to}\,S_t)\,|x^S_t - x^i_{t-1}| + (\%S_{t-1}\,\text{to}\,I_{i,t})\,|x^i_{t-1} - x^S_t|\Big]}_{\text{changes to portfolios of those switching from one group to the other}} \qquad (3.11)$$
The use of social media and news data allows the measurement of a related, but different, concept. "Talk volume" is defined as the total change in which narratives agents are talking about during a bubble. High talk volume implies agents are switching what they talk about a lot (i.e. changing their minds a lot), whereas low talk volume implies little change to the content of what agents are discussing. The following model-based definition of talk volume can be used, similar to that for trade volume.
Definition 3.2 (Talk volume).
$$vol^{talk}_t = \underbrace{\sum_{i=1}^{M}(\%I_{i,t-1}\,\text{to}\,I_{i,t})\,|T^i_t - T^i_{t-1}|}_{\text{changes to talking within groups}} + \underbrace{\sum_{i=1}^{M}\Big[(\%I_{i,t-1}\,\text{to}\,S_t)\,|0 - T^i_{t-1}| + (\%S_{t-1}\,\text{to}\,I_{i,t})\,|T^i_{t-1} - 0|\Big]}_{\text{changes to talking of those switching from one group to the other}} \qquad (3.12)$$
Talk volume in the model and data is elevated during bubble episodes, similar to trade volume. In Appendix 8, I show that a definition similar to that above can be used to recover talk volume empirically. As with trade volume, I show in Figure 3.4, Panel A, that during bubble episodes (in this case the 2017 Bitcoin bubble) talk volume is elevated. Panel B of the figure also demonstrates that my model is able to replicate this empirical fact.
Figure 3.4: Talk volume empirically is elevated during bubble episodes
Panel A: Talk volume empirically is elevated during bubble episodes
Panel B: Model talk and trade volume are elevated during bubble episodes
Notes: Panel A shows data from the 2017 Bitcoin bubble, demonstrating that in the data "talk volume" is elevated. Panel B shows an example simulated bubble in the model. In both panels price is in black and talk volume in green; trade volume is also shown in Panel B in grey. In both the data and the model, talk volume is elevated during the period of rapid price appreciation. Trade volume is known to be elevated during bubble episodes in the data; Panel B shows that this empirical fact also occurs in the model.
3.4 Informational cascades can be recovered from the model
Informational cascades occur when individuals cease to follow their own private information. Instead
individuals choose to follow a ‘herd’, assuming that the herd must know something. Bubble events feature
individuals making decisions based on seeing others participating in a bubbly market. Therefore an important question is to what extent individuals follow herds versus use their own information. My model provides structure that allows this to be measured.
In the model, narrative infection drives individuals to follow the narrative's public signal rather than their own private signal. This follows from the listener's signal being the agent's private information, as in equation 3.1. Being a talker for a narrative represents a deviation from this private signal, where individuals are persuaded by others to utilize the group-specific public signal. In this way, the state summarized by the size of each narrative infection, as well as the overall price level, becomes an incorrect signal of the true fundamental value of the asset, which is represented by the distribution (with parameter $\tau$) from which the listener's private signal is drawn. An informational cascade of measurable size can therefore develop, leading the aggregate information (the price and narrative infection states) to inefficiently perturb agents' private information (equivalent to not accurately representing agents' true private information) (see footnote 4).
Note that the informational cascade may be welfare beneficial. If individuals’ private signal draw is
much higher than the true fundamental value, an informational cascade based on a pessimistic narrative
can drag the price level down closer to the fundamental value. Similarly it can be welfare negative if
individuals’ private information was in line with fundamentals and the cascade causes the equilibrium
price to deviate from that value.
This discussion leads to the standard definition below ((53), (29), (14)).

Definition 3.3 (Informational cascade (speculation)). $\nexists\,\theta$ s.t. the agent does not "speculate" (there exists no private-information realization that dissuades the agent from the "buy more" or "speculate" action).

Theorem 3.1. An informational cascade will occur if $\theta^P_i \ge \frac{\xi + p_t - p^L}{p^H - p^L}$, where $\xi$ is a value or function that defines "speculation".
Footnote 4: Note that "herding" is said to occur whenever agents choose to use information other than their private information, a weaker condition than that used for informational cascades, and one that is present in this model whenever any individuals follow a narrative.
Proof.
• First, define "speculate". Speculation occurs when agents infected with narrative $i$ decide to purchase the asset in a given period $t$ in large quantities: $x^i_t \ge \xi$, for some $\xi \gg 0$. Here I remain neutral on the exact measure/size of speculation ($\xi$).
• Rearrange to find an expression for speculation. From the optimal investment equations 3.4 this becomes $E^i_t\, p_{t+1} - p_t \ge \hat{\xi}$, where $\hat{\xi} = \xi\sigma\gamma$ is just a renormalization. Further rearranging and substituting for expectations gives speculation if $\hat{\theta}_i \ge \frac{\hat{\xi} + p_t - p^L}{p^H - p^L}$. Therefore speculation occurs for narrative $i$ if this condition is met.
• Rearrange further to find a condition on $\theta$ (the agent's private signal). An informational cascade will occur if no private signal $\theta \in [0,1]$ exists that will stop the agent from speculating (the agent chooses to follow the crowd regardless of their own private information). Substituting out for $\hat{\theta}$ gives: $\theta \ge \frac{\hat{\xi} + p_t - p^L}{p^H - p^L} - \theta^P_i$.
• Finally, link to informational cascades. Since informational cascades only occur where $\nexists\,\theta$ s.t. speculation does not occur, this condition must hold for the whole range of $\theta \Leftrightarrow \theta \ge 0 \ge \frac{\hat{\xi} + p_t - p^L}{p^H - p^L} - \theta^P_i \Leftrightarrow \theta^P_i \ge \frac{\hat{\xi} + p_t - p^L}{p^H - p^L}$.
The theorem yields a time-varying condition that can be used to compute whether an informational
cascade has formally started. This condition is a useful tool to demonstrate that informational cascades can indeed occur in reality (about which there is debate in the literature, (53)) and to indicate whether or not they are occurring within a bubble episode. Usefully, this theorem is general to the precise definition of "speculation", as suits the objective of the analyst.

The condition is intuitive. Informational cascades are less likely when (1) there is a higher threshold for behavior to be considered speculation; (2) there is a smaller gap between the maximum and minimum price realizations (since this creates a smaller window within which individuals can potentially herd); (3) the public signal is smaller; and (4) the price is higher at a given time $t$ (since then the asset is more expensive, making it more costly to herd).
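Because the condition is time varying, it can be monitored along a simulated or empirical price path. The check below is a direct transcription of Theorem 3.1's inequality; the speculation threshold `xi_hat` is an assumed value, since the theorem is deliberately agnostic about it.

```python
def cascade_started(theta_P_i, p_t, p_H, p_L, xi_hat=0.1):
    """True if no private signal in [0,1] would stop an agent infected by i from speculating."""
    return theta_P_i >= (xi_hat + p_t - p_L) / (p_H - p_L)

# As the price p_t rises, the same narrative eventually fails the condition (illustrative numbers)
for p_t in (1.1, 1.4, 1.8):
    print(p_t, cascade_started(theta_P_i=0.4, p_t=p_t, p_H=2.0, p_L=1.0))
```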
4 Data
I use two main types of text data in this analysis, as well as general financial data.
Social media data (Twitter). I web-scrape Tweets from the Twitter website for periods associated
with four bubble events: Bitcoin (Jan, 2017 to Apr, 2018), Gamestop (Aug, 2020 to Apr, 2021), Coal (Aug,
2007 to July, 2009) and Silver (Jan, 2009 to May, 2013). In each case tweets including the bubble name only
are scraped (e.g. for Bitcoin the search word “bitcoin” is used, not case-sensitive). Additional exercises
testing the use of other associated search terms do not lead to sizeable increases in data size. This process
provides the text information of the actual Tweet posted, along with a range of metadata for each Tweet, including user, retweets, favorites, and date/time, among other variables. I conduct a range of standard cleaning adjustments to the text data, as described in papers such as (114).
Traditional news data (range of media outlets). I obtain traditional news data for 2016-2020 (covering the Bitcoin bubble) from "Components", an online machine-learning data provider. I queried the data producer, as well as internally validating the data with comparisons to news websites online, to ensure data validity. The data comes with the article text, headline, author name (available in most cases), publication, and date, among other variables. The Gamestop bubble for the period 2020-2021 is not covered; therefore my counterfactual results are limited to the Bitcoin bubble. I restrict my dataset to newspaper articles that include the search terms, as with the Twitter data.
I show examples of the text data collected in Table 3.1. The table shows that the Twitter data contains a range of Tweets, some informative and well-written, some decidedly uninformative and random. In any text dataset, many word choices and terms appear following random usage patterns. The algorithms I use to produce the datasets for analysis remove this random word variation and identify only data variations that are continuous and meaningful.

These approaches work because the sheer quantity of Twitter data allows the removal of such random variation. Table 3.2 reports summary statistics for the data, showing that it includes over 1.7
million (Bitcoin) and 150k (Gamestop) tweets for this analysis. Data points per day are shown in Figure 3.5.
The size of the news dataset is smaller in number of articles. However each article includes substantially
more text than the average tweet (tweets have an average word count of 4 compared to 234 for news
articles, as shown in the table).
General data. I also use a range of price data for Bitcoin and Gamestop collected from Yahoo Finance and other generally accepted financial data websites.
Table 3.1: Examples of text data for Bitcoin and Gamestop

Social media (Twitter), pre-bubble:
- Bitcoin 2017/18: "@NicolasMeyler Will quantum computers destroy Blockchain and Bitcoin." (Cryptolinguist1); "Have fun: $1000 BitCoin Jungle Dance -) #Bitcoin #Blockchain" (DisruptBanksy)
- Gamestop 2020/21: "THIS is why I prefer buying used games from @Gamefly over any other 'Gamestore.' The prices are just better and you would NEVER get all this from GameStop!" (Marlin004); "@geronimo 73 Exactly, I never ever will bootleg anything like that mainly fear of the laws but I am patient if I can't afford a game and mainly buy games from psn, eShop, Microsoft store or GameStop or Walmart." (Titancrazy1992)

Social media (Twitter), during-bubble:
- Bitcoin 2017/18: "Blockchain is questionable regarding scalability and velocity. Question is if we only stick to Blockchain because #Bitcoin is made out of it and if there are already better technologies like #Hashgraph?" (andydotlutz); "Free future money, have a read and if you like then use my invite and get 20 coins." (designersjam)
- Gamestop 2020/21: "#GameStop's stock price has skyrocketed by somewhere around 8,000% over 6 months. It's stock has become the central gamepiece of a financial power struggle…" (QuoteDigging); "With gaming biz in decline but with large cash on its books, the GameStop stock has been rising on the back of coordinated buying…" (livemint)

News (headlines), pre-bubble (Bitcoin 2017/18):
- "Northern Trust uses blockchain for private equity record-keeping." (Anna Irrera, Reuters)
- "Business giants to announce creation of a computer system based on Ethereum." (Nathanial Popper, New York Times)

News (headlines), during-bubble (Bitcoin 2017/18):
- "The Bitcoin Bubble – Greater fool theory." (Unknown, Economist)
- "As Bitcoin soars and ICOs spread, advisors urge caution." (Unknown, CNBC)
5 Empirical methodology
I use two computational linguistics methodologies to construct the following key components of the model: (1) identifying the ideas and their vocabularies (see footnote 5); (2) measuring the agreement between ideas and the semantic distances between them; (3) recovering listening and talking for each of these ideas from the data; and (4) recovering the size of idea infections over time. Though in the model described above I
Footnote 5: As noted, I use the terminology "narrative" and "idea" interchangeably.
Figure 3.5: Text observations over time
Notes: For Bitcoin, day 1 is 1 January 2017. Bitcoin news counts are MA-smoothed since there are many days with 0 such articles; the news series tends to be more volatile than tweets. For Gamestop, day 1 is 1 August 2020. The Gamestop episode happens much more quickly, hence the smaller number of days.
allow for $M$ ideas, from this point forward in the paper I restrict attention to 3 ideas for tractability. The framework described is scalable and could admit more idea dimensionality if desirable to the researcher. My conclusions here, however, are restricted to the dynamics involving at most three main ideas, under the justification that further ideas are likely nested within the represented ideas and are therefore unlikely to change the qualitative conclusions.
5.1 Computational linguistics methodologies
In this paper, I use two well-cited computational linguistics methodologies. I provide a brief description here; interested readers are directed to the references given. (101) provide a particularly thorough overview of these two techniques and their associated advantages and disadvantages.

Latent Dirichlet Allocation (LDA). This approach uses a probabilistic assessment of word frequency
Table 3.2: Document and user characteristics

Columns: (1) Bitcoin 2017/18 (Social Media); (2) Bitcoin 2017/18 (Trad. News); (3) Gamestop 2020/21 (Social Media)

Panel A: Document characteristics
Words per doc (avg)*: 4.32 | 234.08 | 5.39
Word length (avg)*: 9.10 | 8.64 | 7.55
Total documents: 1,781,245 | 11,333 | 158,780
Docs per day (avg): 3,673 | 23 | 582
Docs per day (max): 16,535 | 88 | 30,264
Retweets per tweet (avg): 8.17 | N/A | 18.35

Panel B: User characteristics
Users per day (avg): 2,145.05 | 11.92 | 465.66
Users (>1,000 followers) per day (avg): Unavail. | N/A | 295.64
Daily docs per user (avg): 1.81 | 1.94 | 1.26
Total docs per user (avg): 6.64 | 3.47 | 1.83
Total docs per user, >10 docs (avg): 68.08 | 34.78 | 31.24

Notes: table reports summary statistics by dataset. *Stopwords are not included when calculating these values.
across documents to derive a set of common "topics". It is akin to factor analysis in time series – it removes noise and focuses on continually present factors (or "topics") in the text data. In so doing, it provides a linguistic summary of the themes present in the text corpus. I present a concise description of the methodology here, and provide references to demonstrate that this methodology is now a common and accepted text-analysis methodology; more information can be found in the original Journal of Machine Learning Research paper, (31).

The LDA model specifies a probability distribution for the likelihood of a dataset $D$ of documents ($d \in D$, of which there are $M$) appearing. This distribution contains a range of parameters based on a reasonable set of dependencies among topics in documents ($\theta_d$, $\alpha$), among words in topics ($z_{dn}$ for $n = 1,\dots,N_d$, the words in document $d$) and among the appearance of a given word in general ($w_{dn}$, $\beta$). These parameters are then selected to maximize the resulting likelihood function below.
$$p(D\,|\,\alpha,\beta) = \prod_{d=1}^{M}\int p(\theta_d\,|\,\alpha)\left(\prod_{n=1}^{N_d}\sum_{z_{dn}} p(z_{dn}\,|\,\theta_d)\,p(w_{dn}\,|\,z_{dn},\beta)\right) d\theta_d \qquad (3.13)$$
LDA has already been used widely in text-analysis applications and is a peer-reviewed and accepted method of economic analysis in academic research. (101), in their review of the current state of text analysis, describe the prominence of LDA as a method in empirical research. (114) use LDA to assess a change in the text content of FOMC minutes following a change to disclosure requirements. Finally, this method, in combination with semantic vector analysis (now known as word2vec; described below), has also been used in the assessment of economy-wide financial risk by (113).
Word2Vec (neural network/word embedding). Word2vec is a neural network approach that creates a vector space for words, from which semantic distances between words can be measured. Words with a similar context, as observed within the text corpus, are located close together in the space. As a result, it is one of the few natural language processing methods that captures the linguistic importance of context. For example, with a word2vec model fitted on a text corpus, it is possible to determine the 100 closest words to the word "technology" or the bigram "technology-bubble". I will use this to (1) build out vocabularies by adding the closest 100 words to a lead word of interest (building a "semantic vocabulary" as in (113)), (2) measure the distances between different ideas, and (3) measure the distance of ideas from optimism/pessimism to measure their agreement. Again, here I provide a concise description of the method and then provide references demonstrating this to be a peer-reviewed, accepted tool in economic analysis. Further information can be found in the original paper, (148).
The goal of word2vec is not to output the neural network it creates, but instead to output the "word embeddings", or vectors, that are constructed as the neural network is trained. The objective of the neural network's training is to predict a target word from a context (i.e. the x words that precede that word in any given sentence). The word vectors created to achieve this objective are what we use to represent the words in the dictionary. The first step is to "one-hot encode" all words by giving each a vector representation with an entry of one for that word and zero for all others in the vocabulary. These vectors have a very large dimensionality, since they need one dimension per word. These original vectors are then transformed through a series of "layers" by weighting matrices and functions, where the weights are trained through repeated iterative adjustment to minimize a loss function that attempts to predict the target word from its context (see footnote 6).

Neural-network approaches are newer and less used in the economic literature than LDA. However, they have still been used in peer-reviewed articles. As with LDA, (101) describe word2vec and its uses in the economic literature to date. This method, in combination with LDA, has also been used in the assessment of economy-wide financial risk by (113).
Footnote 6: This is the Continuous Bag of Words (CBOW) approach, as opposed to the alternative Skip-Gram approach, which predicts the context from a target word. In this paper, I use the CBOW approach since it is faster to train and better at handling common words. The Skip-Gram approach is better for tasks that require more precision in understanding uncommon words.
5.2 Recovering narratives from Twitter data
I use two approaches to recover narratives/ideas and their vocabularies ($Vocab_i$) from the data. Similar to (113), I use a more subjective approach (the "user-defined" or "user" approach), involving the researcher specifically choosing the key ideas and generating vocabularies for these ideas. I also use a more automated approach (the "LDA approach"), using LDA (described in the previous section) to recover the key themes from the data. A drawback of the LDA approach is that its results are less interpretable, although they are more representative of the key patterns in the text data. The user approach creates groups with a clearer interpretive definition; however, it is more subjective and likely less representative of the true factor structure of the data. I use both methods and compare results to reinforce the paper's main conclusions. However, I will often cite the user approach as the "main" results, confining the LDA approach to appendices (unless otherwise stated, conclusions hold for both approaches).
User-approach: In this first approach, I use the following algorithm to determine ideas (as noted at
the start of the section, my analysis is restricted to three ideas):
1. First, examine the 30 most used words in the text dataset. Separate these into three or more groupings
with clear economic interpretations.
2. Supplement the existing ideas with additional words from the list of 30-100 most used words in the
dataset as useful. Each idea at this point will likely consist of 10 or fewer lead words.
3. If more than three ideas remain, reduce these to three by looking at those three with the highest
total word frequency in the data.
4. Supplement the idea’s lead words with the 100 closest synonyms from the word2vec word embedding,
called “supplemental words”. This builds out the semantic vocabulary of each idea to deal with the
sparsity of words in the text data.
LDA-approach: In the second approach, I use the following, more automated algorithm:
1. Run LDA with a topic number of 3 on the text corpus. Do standard hyperparameter testing to determine the appropriate values. Take the top ten words for each topic.
2. Find a lead word for the topic by measuring the word-mover distance (a standard measure of distance in word embeddings) between each word and the rest of the ten words. The lead word is the word with the smallest distance to the other words (i.e. the word that semantically most represents the topic).
3. For each topic, remove words that are semantically "far" (beyond a pre-designated threshold) from the lead word. This step makes ideas more semantically cohesive, and is central to moving from "topics" as defined by LDA to ideas as defined in section 8. The remaining 10 or fewer words are defined as the "lead words" of this idea.
4. Finally, supplement the lead words with supplemental words using the word embedding, as in the user-approach (a sketch of steps 2-4 follows this list).
The above approaches differ from (113) who recover “risk factors”, defined to be semantically separate. I
define “narratives” or “ideas” which have semantic differences taken as given from the data. These semantic
differences are quantifiable from the data and are used in the calibrated model as parameters. In Table 3.3
below I show the lead words of each narrative for the user-approach for Bitcoin and Gamestop (Appendix 8
shows the same table for narratives found from the LDA approach). I also show a selection of supplemental
words. These narratives are found from social media data only – the social media model is the base model, so my exercise is to examine how social media narratives are covered in social media and news data.
For both bubble instances there is a market/finance narrative or idea. From the tweets it appears that
during these bubble events there is often a need to explain and reiterate various market terms. Some of
these tweets involve explaining market dynamics, such as how share purchases function or reiterating
market rules of thumb (e.g. “overpriced doesn’t mean it will plunge in price tomorrow”, @mensah oteh,
12-Dec). Some of these tweets are more simply just descriptions that a bubble is or is not occurring
and how long this may persist (e.g. “one of the tricks of the stockmarket is staying ahead of the curve”,
@derekegonzales, 19-Dec). For Bitcoin, given its complexity and relative novelty at the time, there is a
narrative and many tweets that explain and discuss terms such as “blockchain” and “cryptosecurity” (e.g.
“the blockchain is to Bitcoin what gold is to jewelry”, @suvendusgiri, 30 Apr) as well as what they imply
about Bitcoin as an innovation. The Gamestop bubble has more straight-forward narratives such as a tale
of David versus Goliath (how the story of retail traders combating hedge funds was portrayed at the time),
as well as the emergence of news that the trading platform RobinHood would restrict trading of Gamestop
shares temporarily on its platform.
Table 3.3: Idea vocabularies: user-approach

Bitcoin 2017/18
- Idea 1: Crypto tech ($\theta^P_1 = -0.01$, $a_{12} = 0$, $a_{13} = 0.04$). Lead words (example supplemental words): Cryptocurrency (fintech), Ethereum (technology), Blockchain (ripple), Altcoin (mining), Use (big data), Cybersecurity (datascience).
- Idea 2: Economics of crypto ($\theta^P_2 = 0.27$, $a_{21} = 0$, $a_{23} = 0.22$). Lead words: Market (market news), Exchange (stock market), Currency (index), Price (asset valuation), Buy (finance), Get (currency return), Trade (daytrade trading).
- Idea 3: Time and urgency ($\theta^P_3 = 0.15$, $a_{31} = 0.04$, $a_{32} = 0.22$). Lead words: Start (news future), Today (wait), Late (tomorrow), Future (follow), News (news fintech).

Gamestop 2020/21
- Idea 1: Stock market ($\theta^P_1 = 0.59$, $a_{12} = 0.09$, $a_{13} = 0.2$). Lead words (example supplemental words): Market (strategist), Share (understand market), Buy (value company), Invest (efficient), Trade (long buy), Stock market (big finance).
- Idea 2: Big investor versus little ($\theta^P_2 = 0.47$, $a_{21} = 0.09$, $a_{23} = 0$). Lead words: Hedge fund (maneuver), Investor (try bankrupt), Fund (greedy hedge), Trader (retail investor), Redditor (squeeze hedge), Wallstreetbet (wealthy people), Short (force price).
- Idea 3: Robinhood halting trades ($\theta^P_3 = 0.04$, $a_{31} = 0.2$, $a_{32} = 0$). Lead words: Lose (trading restriction), Sell (robinhood app), Robinhood (block trading), Today (put place), Happen (protect billionaire).
I use word2vec to measure narratives' agreement and novelty. I need to recover the model components $\theta^P_i(Vocab_i)$ and $L_i(Vocab_i)$ from the data, which represent each narrative's position in the space in Figure 3.1. The word embeddings from word2vec provide a natural method to retrieve these parameters from the text data. Word embeddings (a constructed vector space with a geometric position for all words) deliver distances between words or collections of words, and so provide a measure of the semantic novelty or difference between the narratives. Similarly, narratives' agreement can be measured from these semantic differences based on their distances from linguistic concepts like optimism and pessimism.
Firstly, for narratives' relative novelty, I measure the Word Mover's (WM; a standard measure of distance in word embeddings) distance between all three narratives' vocabularies in the word-embedding vector space. Semantically close narratives have a WM distance of 0, and the more positive the WM distance, the more semantically different or "novel" the narratives. To convert these scores to model units, I then assume the two ideas furthest from each other are perfectly novel, i.e. $a_{ij} = 0$ for the narratives $i,j$ that are furthest from each other. I then scale all other narrative distances to this point to create a measure from 0 to 1, with 0 indicating two narratives completely novel relative to each other and 1 indicating two semantically identical narratives. As an example, for narratives 1 and 2 this is represented as $a_{12} = 1 - d_{12} / \max_{i,j \in \{1,2,3\},\, i \ne j}(d_{ij})$, where $d_{ij}$ is the WM distance between ideas $i$ and $j$. These novelty scores are shown in Table 3.3 under each narrative's heading.
Agreement or optimism scores are measured using the WM distance between each narrative's vocabulary and the vocabulary of the 100 words most closely associated with "buy" (fully optimistic) and the 100 words most closely associated with "sell" (fully pessimistic). Buy and sell are used for optimism and pessimism as they are most closely associated with the optimistic ("buy") or pessimistic ("sell") action. Once these two distances are found, I find the final score by projecting the distances onto the distance between buy and sell in the vector space. These agreement scores ($\theta^P_i$) are shown in Table 3.3 under each narrative's heading.
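One way to turn these measurements into model parameters is sketched below: pairwise WM distances scaled so that the furthest pair of narratives gets a novelty score of 0, and an optimism score from the relative distances to the "buy" and "sell" vocabularies. The projection used for the optimism score is one simple implementation and may differ in detail from the paper's construction; `model` and the vocabulary lists are assumed inputs (e.g. from the word2vec sketches above).

```python
import itertools

def novelty_scores(model, vocabs):
    """a_ij = 1 - d_ij / max(d): the furthest pair of narratives is treated as perfectly novel."""
    dists = {(i, j): model.wv.wmdistance(vocabs[i], vocabs[j])
             for i, j in itertools.combinations(vocabs, 2)}
    d_max = max(dists.values())
    return {pair: 1.0 - d / d_max for pair, d in dists.items()}

def optimism_score(model, vocab, buy_vocab, sell_vocab):
    """theta^P-style score: position of a vocabulary on the buy(+)/sell(-) axis."""
    d_buy = model.wv.wmdistance(vocab, buy_vocab)
    d_sell = model.wv.wmdistance(vocab, sell_vocab)
    d_axis = model.wv.wmdistance(buy_vocab, sell_vocab)
    return (d_sell - d_buy) / d_axis

# Usage (assumed inputs): vocabs = {1: [...], 2: [...], 3: [...]} narrative word lists,
# buy_vocab/sell_vocab = the 100 words closest to "buy"/"sell" in the embedding.
# print(novelty_scores(model, vocabs)); print(optimism_score(model, vocabs[1], buy_vocab, sell_vocab))
```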
5.3 Measuring talking, listening and those infected by ideas
Talking and listening are recovered from tweets and retweets, respectively. Talking and listening are measured for each narrative $i$. Users are identified as "talking" about a narrative if they use the narrative's language in a given tweet. Therefore the quantity of "talking about narrative $i$" in the data is measured as the number of tweets/news articles that include at least $z$ words from idea $i$'s language $L_i$. In the model, users are assumed to be talkers for at most one narrative at each time $t$. Therefore, equivalently in the data, if a tweet/news article includes words from two narratives' languages then the user is assumed not to be talking about either narrative. For a given day, total talking is all tweets identified in this way for each narrative.
For comparison, talking in the model needs to be scaled to units comparable with the data. This conversion, from model units into numbers of tweets/articles comparable with the data, is described in equation 3.14 below (from here I use "tweets" to refer interchangeably to tweets or newspaper articles; I also use "day" for time, though this could represent hours or otherwise). The equation seeks "talking" in the data, measured as the number of tweets for narrative $i$ on day $t$ ($Tweets_{i,t}$). For each day $t$ and narrative $i$, this equals average tweets per talker multiplied by the total quantity of talkers for that narrative.

Model components are mapped to the data equation. In the model, $T_{i,t}$ represents the intensity of talking in $[0,1]$. Therefore the model's maximum quantity of talking, $T_{i,t} = 1$, has to map to the maximum number of tweets for a user. This is done by multiplying $T_{i,t}$ by the 99th-percentile value of tweets by user by day (see footnote 7). Similarly, in the model $I_{i,t}$ is a number in $[0,1]$, where 1 represents the whole population being infected by narrative $i$, and thus must be multiplied by the total population ($pop$) (see footnote 8).

$$Tweets_{i,t} = \big(T_{i,t} \cdot \max(\text{daily tweets per user})\big)\cdot\big(I_{i,t} \cdot pop\big) \qquad (3.14)$$
For listening to narrative $i$, I measure listen intensity as total retweets associated with $i$ divided by tweets for $i$. Users retweeting a tweet that talks about narrative $i$ (identified as described above) are said to be "listening" to narrative $i$. In Appendix 8 I examine other measures of listening, such as favorites, showing them to be similar to retweets. Normalization by the number of tweets removes the mechanical increase in total listening from there being more talkers.

Again, model listening must be converted to units comparable with the data. This conversion is described in equation 3.15 below. The denominator is as in equation 3.14. Retweets are simply the average user's listening multiplied by the total number of listeners. In the model, $L_{i,t}$ is in $(0,1)$, so multiplying it by the maximum retweets per user gives the average user's listening. As with $I_{i,t}$, $S_t$ is multiplied by the population. The equation approximately reduces to a model-based share multiplied by the aggregate maximum of daily retweets per tweet.
$$\frac{Retweets_{i,t}}{Tweets_{i,t}} = \frac{\big(L_{i,t}\cdot \max(\text{daily retweets per user})\big)\cdot\big(S_{i,t}\cdot pop\big)}{\big(T_{i,t}\cdot \max(\text{daily tweets per user})\big)\cdot\big(I_{i,t}\cdot pop\big)} \approx \frac{L_{i,t} S_{i,t}}{T_{i,t} I_{i,t}}\cdot \max\!\left(\frac{Retweets_t}{Tweets_t}\right) \qquad (3.15)$$
Narrative infection numbers are derived from talking. In the model, those infected by a narrative are those talking about that narrative. Therefore I measure the number infected by a narrative from the number of users talking about that narrative on a given day. Where a user talks about multiple narratives in the same day (i.e. has some tweets about one narrative and some about another), the user is assumed to
Footnote 7: The 99th percentile is used to avoid results being driven by outliers.
Footnote 8: In the social media data, the true population of listeners is difficult to measure. The total number of users in the dataset, particularly for Bitcoin, is very large, since there are many tweets and the time period is long. However, the data does not report the usernames of those that retweet, the measure of listening in the data. There are also many users who appear once and never again and so are unlikely to be consistently listening. In the model, too large a population estimate swamps narratives, leaving the model nothing to explain, when in reality the overall number of tweets makes it clear that many individuals are participating in the model's interactions. To tackle these challenges, I estimate the total population with the simplifying assumption that the largest narrative infection infects at least 50% of the population at its peak. The model's success in explaining untargeted moments is taken as validation of this assumption.
Figure 3.6: Number of talkers throughout Bitcoin bubble episode (User approach)
be infected by neither. To match the data with the model, $I_{i,t}$ is just multiplied by the population number.
6 Calibration
This paper seeks to understand the role that social media plays in the development of bubbles. To conduct this investigation I produce and then compare one "social media model" scenario and one "news model" counterfactual scenario. First, for the social media model, I calibrate the model purely to social media data on the number of talkers for each narrative. The model's success in replicating the targeted moments (talkers per narrative), but especially the untargeted moments (asset price, aggregate talking and listening volumes), demonstrates that it explains key features of the relationship between media and market. Second, I produce a "news model" scenario, recalibrating the key spread parameters from the number of talkers in the traditional news data.

I interpret the "news model" scenario as investors only having access to news media. Listeners are now investors reading traditional news media only; social media no longer exists (see footnote 9). Talkers are now investors that have become persuaded by the narrative, and therefore report the narrative to journalists, who reproduce it in print and online media. This increased print media on the narrative may convince future investors listening to the narrative. This interpretation is consistent with research on how investors
Footnote 9: Note that in reality, social media may also affect news media. The effect estimated here will therefore be a lower bound, since the news infections are likely happening faster in the data due to social media. If social media truly did not exist, news journalists would take more time to uncover and become infected by narratives.
Table 3.4: Counterfactuals to demonstrate impact of social media

Columns: Parameter | Description | Calibration method | Data used (SM Model; News Model)

1. Narrative spread parameters
$\beta^I$ | Spread rate | Model GMM | SM; News
$P^F$ | Forget rate | Model GMM | SM; News
$\epsilon_0$ | Initial shocks | Model GMM | SM; News

2. Language parameters
$\theta^P$ | Optimism scores | WM distance with "buy" and "sell", plus orthogonal projection | SM; SM
$L$ | Narrative novelty | WM distances between ideas | SM; SM
Table 3.4 summarizes the source of narrative parameters for the scenarios.
All other model parameters are held the same across scenarios, are mostly in line with the literature, and
I describe these in Appendix 8.
6.1 Calibration approach: social media or news parameters
Narrative spread parameters: spread parameters govern the shape of narrative infections over time. I
structurally calibrate these using Generalized Method of Moments (GMM). I use two types of moment: (1)
level moments to capture the size of the narrative infections; (2) growth moments to capture the shape of
narrative infections over time. For level moments I just use the level of the infection at each day during
the time span studied. For growth moments I use the first difference of infection at each day. For three
infections this therefore consists of 2∗ 3∗ (total days) moment conditions. The likelihood function is
therefore:
$$L = a_1 \sum_{t=0}^{T} \sum_{i=1}^{3} \left(I^i_{t,mod} - I^i_{t,data}\right)^2 + a_2 \sum_{t=0}^{T} \sum_{i=1}^{3} \left(\left(I^i_{t,mod} - I^i_{t-1,mod}\right) - \left(I^i_{t,data} - I^i_{t-1,data}\right)\right)^2 \qquad (3.16)$$
a_1 and a_2 represent adjustable weights for each type of moment condition. Often, structural estimation
exercises divide by the data moment to create a relative condition. I do not use a normalization here since infection values are typically close to 0, so normalizing would distort the weighting on moment conditions.
[10] For example, (120) show investors miss news when there is a lot of “extraneous” news; in this model this would be when narratives have low rates of talkers and so are missed (low probability of infecting others). Similarly, (Fedyk) finds that front-page news (in my model, high numbers of talkers may imply the news makes it to the front page) is incorporated by the market much more quickly than non-front-page news.
For each scenario, I find the parameters that minimize this structural likelihood function.
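Purely as an illustration of how the objective in equation 3.16 could be minimized, a minimal sketch follows; `I_data` and `simulate_infections` are hypothetical placeholders (the model solution itself is not reproduced here), and the optimizer choice is illustrative rather than the one used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_objective(params, I_data, simulate_infections, a1=1.0, a2=1.0):
    """Level + growth moment objective as in equation (3.16).

    I_data: array of shape (T+1, 3) with the observed infection shares.
    simulate_infections(params): hypothetical function returning a model array of the same shape.
    """
    I_mod = simulate_infections(params)
    level = np.sum((I_mod - I_data) ** 2)                                      # level moments
    growth = np.sum((np.diff(I_mod, axis=0) - np.diff(I_data, axis=0)) ** 2)   # growth (first-difference) moments
    return a1 * level + a2 * growth

# Hypothetical usage: calibrate spread, forget and shock parameters
# result = minimize(gmm_objective, x0=np.array([0.3, 0.1, 0.01]),
#                   args=(I_data, simulate_infections), method="Nelder-Mead")
```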
Language parameters: the calibration of these text parameters was described in section 5.2.
6.2 Calibration fit to social media: targeted and untargeted moments
In this section, I first examine the performance of the calibration against key moments. The baseline model, calibrated using only social media data, fits the three narrative infections in the data well.
The model also provides a reasonable approximation of untargeted moments: the asset price, aggregate
talking and listening. Second I describe what the model reveals about the progression of aggregate listening
during the bubble.
User-defined and LDA approach calibrations match the data. Table 3.5 summarizes the performance
of the model against key level moments (note growth moments were also targeted in the calibration
equation 3.16 but the table just compares levels). For the percentage of talkers for each narrative in the
data, the table shows the average percentage point deviation for three 100-day time periods (Bitcoin) or
three 2-day time periods (Gamestop). In most cases the model is within 10pp of the data. The model’s
approximation is typically worse towards the end of the data period. As discussed in section 3.1, the model
finds crashes harder to generate with the available model parameters. This remains true in the calibrated
results as the model does not fully replicate the magnitude of the Bitcoin crash.
For untargeted moments such as the price, Table 3.5 shows that, on average over the time periods
indicated, deviations between the model and data are small to modest. Figure 3.7 shows the time series
data versus modelled for the two approaches. Both correctly reflect the speed of bubble expansion and
crash. However, for both bubbles the user-defined method better explains the overall level of the bubble
price (though note the user-defined method for Bitcoin struggles to capture the post-crash level in the final
100 days). Aggregate talking (talking across all three narrative infections) follows the data but is between 0 and 30% lower on average in the user approach (with a slightly higher gap in the LDA approach).
Modelled aggregate listening during the bubble first rises from each individual choosing to
listen more, whereas later in the bubble it falls as there are fewer listeners. Table 3.5 shows that
average percentage deviations for listening between the model and data are mostly below or around 10%.
This implies that the model follows the rise and crash of the listen data shown in Figure 3.12. Figure 3.8
shows how this works mechanically in the model. First, each listener individually listens more as they
Table 3.5: Calibration is close to targeted and un-targeted series

Panel A: Bitcoin 2017/18
 | User-def: avg(data-mod) | LDA: avg(data-mod)
Day: | 1-100 | 101-200 | 201-300 | 1-100 | 101-200 | 201-300
1. Targeted moments
Narrative 1 talkers | 0.00 | 0.02 | -0.07 | -0.10 | -0.08 | -0.09
Narrative 2 talkers | 0.08 | 0.15 | 0.04 | 0.00 | -0.02 | -0.07
Narrative 3 talkers | 0.00 | -0.11 | -0.36 | 0.00 | 0.09 | -0.10
2. Untargeted moments
Price/Data | 22% | 6% | -60% | -11% | 31% | 6%
Talk/Data | 6% | 31% | 25% | 12% | 39% | 37%
Listen/Data | -23% | -14% | -10% | -9% | 2% | -8%

Panel B: Gamestop 2021/22
Day: | 1-2 | 3-4 | 4-5 | 1-2 | 3-4 | 4-5
1. Targeted moments
Narrative 1 talkers | -0.02 | 0.03 | -0.12 | 0.01 | 0.00 | -0.07
Narrative 2 talkers | -0.04 | -0.01 | -0.08 | 0.05 | 0.09 | -0.09
Narrative 3 talkers | -0.02 | -0.04 | -0.18 | 0.04 | 0.04 | -0.11
2. Untargeted moments
Price/Data | 33% | 1% | -3% | 70% | 43% | 32%
Talk/Data | 29% | 43% | 22% | 83% | 22% | -47%
Listen/Data | 29% | 35% | -20% | -14% | 22% | -29%

Notes: the table shows average daily differences between level values of the data and the model. Averages are taken over three time periods for Bitcoin (days 1-100, days 101-200 and days 201-300, where day 300 is the final day recorded for the bubble) and three time periods for Gamestop (modelled in hours): days 1-2, days 2-3 and days 4-5. For targeted moments, narrative talker numbers are between 0 and 1, so the table reports percentage point differences in the proportion infected with each narrative. Untargeted moments are further normalized by the data value each time period, thus reporting model percentage deviations from the data.
begin to expect investing to be profitable. Average listening per user rises (with the discrete jumps shown
in the figure due to each of the three narratives emerging at different times) and there are many available
listeners, causing overall listening to climb in the early phase of the bubble. Second, occurring later in the
bubble, although average listening per user stays high, only dipping slightly, the overall number of listeners starts to dwindle, causing overall listening to fall. More listeners become infected by the narratives, so the share of listeners falls dramatically as more agents become talkers. Overall listening then falls even though the remaining listeners continue to listen the same amount: the decline is driven by the number of listeners, not by how much each agent listens. I show the equivalent chart for
the LDA approach in appendix 8, which shows a similar process.
Figure 3.7: Model price versus data in User and LDA approaches
(a) Bitcoin 2017/18 (b) Gamestop 2020/21
Notes: left chart shows Bitcoin price in the data versus two modelled approaches with daily data. Right chart shows Gamestop
price in data versus two modelled approaches with hourly data (Gamestop bubble occurred very rapidly so is modelled hourly).
For Gamestop, price data is a combination of opening and close prices in data (9am open, 6pm close assumed for comparison
between price data and model). The data stop at hour 220 as the weekend begins. However, I add a linear progression to the price, opening on Monday 1st February at $316 and closing at $225, which appears in line with the modelled series shown.
Figure 3.8: Increase in avg listening per user causes rise, then fall in proportion of listeners causes fall
7 Counterfactual results: removing social media
7.1 Counterfactual removing social media
How would the price of Bitcoin have developed in the absence of social media? I provide evidence to
answer this question by calibrating the model to news data rather than the Twitter data used in section
6.2.[11]
In the news data, investors only have access to news media. Talkers are now investors, persuaded
by the narrative, who therefore talk about the narrative to journalists who report it in news media (see
section 5 for more description). These news media narrative infections, measured as described in section
5.3, are used to calibrate the model’s infections. All other parameters of the model are kept the same. This
generates a new simulated price path based on the news data as shown in Figure 3.9 (results for the LDA
approach are shown in Appendix 8).
The chart shows that without social media the Bitcoin price grows more slowly, has a slightly lower peak,
and does not feature a crash. The Bitcoin price peaks around 50 days later in a Bitcoin market without social
media. This supports the view that social media accelerates the pace of bubble formation. The Bitcoin price
peaks around 2% lower when social media is removed (although there is variability in this result: the LDA approach in the appendix and other robustness checks feature larger reductions, with the bubble approximately 50% smaller without social media). Finally, no crash is observed in the Bitcoin market without social
media. This reflects the conclusion in section 3.1 that price crash episodes are harder to generate than price
appreciations.
The estimates shown are conservative. In reality, there is likely a feedback loop between language
information on Twitter and social media with information in news articles. Some news articles may obtain
information from social media and vice versa (see (48) for more discussion of information propagation in
news and social media). If information reproduction does exist it is likely that social media information is
contained within the news data, meaning that true news data without any impact from social media may
show an even slower pace of bubble formation.
[11] I refer to this calibrated model as the “news model”, compared to the “social media model” described in section 5.
Figure 3.9: Bubble grows slower, with lower magnitude and no crash
Figure 3.10: Relative speed of infections in social media and news models
7.2 Key mechanisms
The result shown in the previous section is driven by three mechanisms. First, the news data has less substantive coverage of each narrative than the social media data. Second, the rates of spread of narrative infections are slower in the news model. Finally, as a result of the first two, incentives to listen in the news model differ from those in the social media model.
News data has low coverage of some narratives. Under the user approach, the news data only
has one narrative that receives substantial coverage (the “Economics of Crypto” narrative from Table 3.3).
Figure 3.10 shows the modelled, social media narrative infections for each of the three narratives as solid
lines, news infections are shown as dashed lines. While the narrative infections on social media all affect
substantial portions of the population at some stage, only one narrative infection takes off in the news data
(green, dashed line). This holds in the data as well as in the model, and is also true in the LDA approach
(see Appendix 8).
One possible cause of this is that news coverage has lower text “dimensionality” compared to social
media. Social media may feature a broader range of distinct narratives discussed widely, compared to
the news data. This conclusion is consistent with the literature. (158) show that during widely covered
news events news coverage tends to homogenize around one theme. Since bubble events are often widely
covered in the news my finding that only one narrative is substantively discussed in the news media is in
line with this literature. I further find that the opposite occurs in social media. The social media data has
substantive coverage of a range of narratives.
Fewer widely discussed narratives mean less price volatility. In the model, similar and agreeing narratives cause fast, large bubbles to form. Therefore the prevalence of few narratives leads to slower, smaller bubbles forming in the news model compared to the social media model. Also in the model, very novel and disagreeing narratives cause price volatility, with big rises and then large crashes, as these disagreeing narratives compete. Again, fewer prevalent narratives in the news model mean there is less scope for such competition and so less volatility, in the form of no crashes. Overall, then, having few widely discussed narratives in the news data is less conducive to large, fast rises and large crashes.
The news model has narratives spreading at a slower rate. For the one narrative that does have
substantive coverage in the news data (“Economics of Crypto”), its rate of spread is slower. In Figure
3.10, the dashed green line representing the one, widely-covered news narrative begins early and increases
gradually, when compared to either the corresponding social media narrative (solid green line) or the other
two narratives (black and grey solid lines). Inspecting the calibrated parameters further demonstrates
this point. The Economics of Crypto narrative in the news model spreads fairly slowly with coefficient
β_{2,news} = 0.16, compared to the much more contagious social media narrative β_{2,SM} = 0.46. The social media narrative is also more forgettable, as agents tire of it faster, with P^F_{2,SM} = 0.18 compared to P^F_{2,news} = 0.08 in the news model, though not by margins as large as the spread difference.
Incentives to listen increase more slowly in the news model. The model components affecting the
listen decision for each narrative infection are derived in section 3.2. Figure 3.11 shows listening levels in
the social media and news models. For the news model, since there is only one narrative infection listening
changes gradually. Listening increases from a starting level of 0.5 (indifferent between listening and not
Figure 3.11: Listening differs across social media and news models
listening) up to 1 as listening to the narrative about the bubble becomes more profitable in expectation
(dashed green line in the figure).
In the social media model there is more volatility in the listening decisions. Incentives to listen are
greater when other agreeing narratives are spreading (see section 3.2). Thus the spread of optimistic
narratives 2 (“Economics of Crypto”) and 3 (“Time and Urgency”) provide incentives to listen to each other,
and undermine incentives to listen to the disagreeing pessimistic narrative 1 (the more realistic narrative
describing what Bitcoin actually is, “Crypto Tech”). This reinforcement between narratives 2 and 3 leads
listening to grow much faster in the social media model.
The crash in the social media model is driven by a highly contagious narrative, rather than high
incentive to listen. As shown in the figure, listening to narrative 1 is not high. However its contagiousness
is sufficiently high (in the calibrated model few people forget this narrative; P^F_{1,SM} is close to 0) that the
narrative continues to spread rapidly anyway. In contrast, this narrative proves not contagious at all in
the news model, hence the lack of a crash.
7.3 Main results on size and speed of bubbles due to social media
Table 3.6 shows the reduction in peak size, additional days to peak and crash sizes in the social media model
versus the news model. Results from the User and LDA approaches, the main approaches developed in the
paper, are shown as well as a range of alternative parameterizations. The table shows that, relative to the social media model, the time to the bubble peak in the news model is always later, by 37-52 days in the main two cases (generally larger in the alternative parameterizations, as these cause the bubbles to form much faster in the SM model).
Amplification of the bubble from the social media to news model varies more across results but generally
suggests social media provides some amplification. The peak reduces in size slightly in the user approach
while leading to an almost 50% lower peak using the LDA approach. Finally, the social media model
reliably features a crash, while the news model never does. The final column is an extreme scenario where
novelty among narratives is increased, which leads to the unusual result that the peak in the news model
is actually slightly larger. This occurs because in the social media model the neutral narrative (“Crypto
Tech”) competes more against the other two optimistic narratives, causing a smaller peak in the social
media model as well as the larger eventual crash of 28% (the largest crash in the table).
Table 3.6: Range of estimates of SM effect on size, speed and crash of Bitcoin bubble 2017/18

 | (1) User approach | (2) LDA approach | (3) Lower risk aversion | (4) Lower high price (p_H) | (5) Less optimism | (6) More novelty
% peak size reduction (SM/News-1) | -2% | -47% | -15% | -1% | -50% | 6%
Days more to peak (News-SM) | 52 | 37 | 189 | 171 | 192 | 157
Crash (SM; % change, 100 days post-peak) | -13% | -12% | -17% | -18% | -5% | -28%
Crash (News; % change, 100 days post-peak) | 1% | 1% | 1% | 1% | 1% | 1%

Notes: Table shows estimates of (1) the reduction in the size of the peak from the SM model to the News model; (2) the additional number of days until the peak moving from the SM model to the News model; (3) the crash size (the % fall from peak to 100 days post-peak) in the SM model and (4) in the News model. User and LDA approaches as described in the text. Later columns examine lowering risk aversion, increasing the high price realization, reducing optimism and making all narratives completely novel.
8 Conclusion
In this paper I have developed a model of talking and listening during bubble events. Agents listen to
“change their minds” if they expect it may be advantageous to do so. Thus agents play an active role
in changing their minds, depending on economic incentives they expect exist. The model is capable of
generating bubble events, including endogenous trading volume and “talk volume”, a new bubble concept
that fits well with this examination of bubble events through language data and social media specifically.
The model also shows that crashes are harder to generate than price run-ups, since the conditions required for their existence are more stringent (however, whether they are unlikely depends on the stance one takes on the distribution of emerging ideas).
I then demonstrate how this model can be combined with empirical language data to answer economic
questions. My methodology involves using a range of computational linguistics methodologies to recover
key parameters of the economic model. In particular I summarize the text data into “ideas” or “narratives”
that reflect what the data shows individuals to be talking and listening about. I then quantify the progression
of these ideas, in terms of those infected by the ideas as well as the amount of talking and listening about
each idea. I then use these measurements to calibrate and verify the performance of the economic model
creating a “social media model” as well as a counterfactual “news model” fit to these different text data
sources.
Finally, I use this framework to answer a specific economic question: does social media affect the speed and magnitude of bubbles? I find that when the impacts of social media are removed from the model, the Bitcoin bubble of 2017/18 forms with a peak between 37 and 52 days later, with a more variable estimate of the impact on bubble size (an average change of -23% in the bubble peak across the cases I examine), and without evidence of a crash.
This paper suggests several avenues for future research. Firstly, my model introduces structure that
enables the measurement of the information externality present in herding and informational cascades
during real bubbles. Further pursuing quantifications of the welfare impacts of such events could be of
value. Second, recent research has used survey evidence to test a structural model of cryptocurrency
bubbles with heterogeneous beliefs ((23)). Linkages between their survey data approach and the news/language
approach used here might generate interesting findings. Thirdly, my model does not address the question
of how ideas or narratives emerge or how the emergence of one idea may spur the introduction of others.
Understanding these processes would be an interesting focus of future research.
Finally, the model suggests a role for several new, language-based policy tools for addressing bubbles
or “leaning against the wind”. Investigating the viability and impact of these tools could deliver valuable
insights into counteracting (or just simply stabilizing) bubbles, if such a policy goal is desirable (note that
some bubbles can provide useful reallocations of economic activity towards valuable new technologies).
Possible new policy tools include: (1) for two ideas that disagree strongly, increasing correlation in their
language (i.e. complementarity) tames large price swings (i.e. providing accurate information in similar
language to optimistic information); (2) introducing pessimistic or neutral ideas earlier to mute optimism-
fueled bubble growth (i.e. providing accurate information earlier); or (3) as a last resort, increasing the
pessimism of ideas that emerge following very optimistic ideas. I leave the examination of these tools to
future research.
Appendix
Stylized facts
I briefly provide some motivating facts. I show first that listening tends to precede the bubble, suggesting
a preemptive, dynamic choice to listen. Second, in contrast I show that talking moves broadly in line with
the bubble.
Listen intensity has two main peaks during the bubble cycle. Listen intensity is defined based on the
model as described in section 6. For each narrative there is a peak immediately before the bubble begins to
develop in earnest (between day 180 and 250 in the figure). There are then second peaks in listen intensity around or just before the bubble's price peak. The peaks occurring prior to the bubble price peaking
suggest that listening is a forward-looking decision – I choose to listen more when I begin to expect that
something may occur with the asset price in the future. This empirical observation will justify listening
being a dynamic forward-looking choice in the theoretical model in section 2.
Figure 3.12: Listening tends to precede the bubble peak
Notes: listening is measured for three sub-ideas recovered from social media (Twitter) text data. Listen intensity by day is defined as Retweets_{i,t} / Tweets_{i,t} for time t and idea i. Price is in USD 000s for the 2017 Bitcoin bubble.
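As an illustration of the listen-intensity measure defined in the notes above, a minimal sketch follows; the tweet-level DataFrame and its column names (`date`, `narrative`, `retweets`) are hypothetical placeholders.

```python
import pandas as pd

def listen_intensity(tweets: pd.DataFrame) -> pd.DataFrame:
    """Listen intensity per idea-day: total retweets divided by total tweets.

    Assumes one row per tweet with columns ['date', 'narrative', 'retweets'].
    """
    grouped = tweets.groupby(["date", "narrative"]).agg(
        tweets=("retweets", "size"),    # number of tweets that day for the idea
        retweets=("retweets", "sum"),   # retweets received: the listening measure
    )
    grouped["listen_intensity"] = grouped["retweets"] / grouped["tweets"]
    return grouped.reset_index()
```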
Talking moves in line with the bubble price. Figure 3.13 shows “talking” as measured by the number of
tweets per day for each of the three recovered sub-ideas or narratives. Talking is highly correlated with the
price for each of the three ideas (correlation coefficients between the price and talking in each of
the three sub-ideas are high: 0.86, 0.94 and 0.96). In appendix 8, I also plot average tweets per user showing
that this stays fairly constant throughout the bubble. This evidence suggests (a) that talk is a fairly static
decision, kept fairly constant over time by users, and (b) that talk is primarily about the number of users
talking, rather than users talking more or less during a bubble. I use this to justify a fairly simple talk
decision for agents in the model in section 2.
Figure 3.13: Talking tends to move in line with the bubble
Notes: talking is measured for three ideas recovered from social media (Twitter) data. Talking is measured by the number of
Tweets on a given day for each idea. Price is in USD000s for the 2017 Bitcoin bubble.
Full agent problems
Talkers Bayesian update their beliefs.
p_{t+1} = p_H is a probabilistic event, with p_{t+1} a random variable and probability P(p_{t+1} = p_H). Assume that event A is the listener receiving their private signal drawn from distribution F(τ) as in the main text. Therefore the conditional probability of p_H given the listener's private signal is P(p_{t+1} = p_H | A) = θ.
Assume there are a series of M public signals associated with each of the M narratives. Individuals are persuaded of the veracity of these signals based on the probabilities described in the main text. Such persuasion events are labelled B_i ∀ i ∈ M. Therefore talkers of type i have beliefs based on Bayes' rule:
$$P(p_H \mid A \cap B_M) = \frac{P(p_H \cap A \cap B_M)}{P(A \cap B_M)} = \frac{\theta\, P(B_M \mid p_H \cap A)\, P(A)}{P(A \cap B_M)} = \theta\, \theta^P_i \qquad (3.17)$$
Where the final term follows from:
$$P(p_H \cap A \cap B_M) = P(p_H \mid A)\, P(B_M \mid p_H \cap A)\, P(A) = \theta\, P(B_M \mid p_H \cap A)\, P(A) \qquad (3.18)$$
As talkers forget and become listeners again, they forget the additional signal event B_M and therefore return to beliefs P(p_H | A), rather than receiving an additional signal that leads them back to their private signal. This “wavering” between being persuaded of a narrative and not is of the form suggested and used by (15).
Listener problem: extra details.
Changing one’s mind is not straight-forward for people. A range of evidence has shown that agents reason
in a motivated way, choosing reason so as to accord with existing beliefs, rather than choosing beliefs
based on reason (e.g. (84), (174)). This implies that changing one’s mind is cognitively costly in some
way, since retaining one’s existing beliefs is preferred ((131), (149)). Given such behavior, agents require
social or economic incentives to change their mind, as well as the random chance for these incentives to
materialize ((173),(76)).
Listeners choose how much to “listen” to a given narrative i.e. to allocate a portion of their day
to consuming media, say news or social media, to learn more about that idea. This choice affects their
probability of catching the narrative infection, and the individual makes a choice for each narrative i. The
second choice is how much to invest in the asset. Investment choices are simple static choices. Following
(15), the investment problem is kept simple in this way to focus on the impact of talking and listening
whilst still capturing asset market outcomes. More sophisticated investment choices could yield interesting
additional effects. First order conditions for investment are standard, based on assuming future prices are
distributed normally and solving the expectation of the exponent, as below (where σ represents the standard deviation of future prices, which is assumed to be fixed).
$$V^L(\vec{I}_t, \vec{L}_t) = \max_{\{x^S_t,\, \{l_{i,t}\}_{i=0}^{M}\}} \underbrace{E_t\left[-e^{-\gamma\left(W^L_t + x^L_t (P_{t+1} - P_t)\right)}\right]}_{\substack{\text{Today's payoff:} \\ \text{maximize wealth and expected future returns}}} + \delta E_t\Bigg[\underbrace{\sum_{i=0}^{M} P^I_i(l_{i,t})\, V^I_i(\vec{I}_{t+1}, \vec{L}_{t+1})}_{\text{Payoff if infected}} + \underbrace{\Big(1 - \sum_{i=0}^{M} P^I_i(l_{i,t})\Big) V^L(\vec{I}_{t+1}, \vec{L}_{t+1})}_{\text{Payoff if stay susceptible}}\Bigg] \qquad (3.19)$$
W^L_t denotes wealth as of time t; γ, δ represent standard risk aversion and time discount parameters. V^L represents the susceptible agent's value function and V^I_i represents the agent's value function when infected by narrative i. State variables for the problem are the narrative infection rates contained in vector I_t. Note that P^I_i represents the agent's individual probability of being infected based on their own behavior and choices. This differs from the aggregate probability of infection, ensuring that agents do not internalize aggregate behavior (i.e. agents take aggregate listening as given), but these will be reconciled in equilibrium by setting individual listening equal to the aggregate.
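For reference, under these assumptions (exponential utility and normally distributed future prices with fixed standard deviation σ), the standard CARA-normal algebra gives an investment rule of the familiar mean-variance form; this is a textbook sketch consistent with the stated setup, not a derivation reproduced from the paper:

$$x^L_t = \frac{E_t[P_{t+1}] - P_t}{\gamma\, \sigma^2}.$$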
Talker problem: extra details
There are two possible interpretations of talkers. One is that these are agents no longer acting based on
their own information. Instead they are “fanboys/girls”, overcome by a passionate interest in some aspect
of the bubble asset, e.g. Blockchain technology in the case of Bitcoin. In this sense these individuals may
represent agents “talking very loudly on things about which they know very little” (the existence of such
individuals has been documented in e.g. (18)). A second interpretation is that these are individuals who
have uncovered some real information, not widely known, that changes their understanding of the market
relative to others, inducing them to invest differently (such investors appear for example in (1)). I remain
agnostic as to which interpretation is correct – it could be that one narrative is predominantly the first
group, whereas another narrative is predominantly the second.
In either interpretation, I capture the high trade volume by assuming agents forget ideas and return to
the pool of listeners to be potentially infected by the idea again. Agents’ passions/uncovered information
are unlikely to last forever as competing information/interests vie for agents’ time. Ideas are more likely
to be forgotten the longer the agent remains infected, where this model assumes that each idea has a fixed
probability of being infected which is exogenous and based on how forgettable that idea intrinsically is.
Upon forgetting the idea, agents return to the pool of susceptible agents, creating a type of wavering as
in (15) which generates trade volume. This wavering can be interpreted as agents first becoming infected
by an idea e.g. “Blockchain”, but then as days wear on agents face other ideas about things unrelated to
the asset, inducing them to “forget” the prior idea. Then in the future, further exposure to agents talking
about that idea may remind/reinfect them with that same idea. In the rest of this subsection I sketch out
this agent’s problem.
Talkers of type i only talk about the idea with which they are infected, hence t^i_t is the sole individual talk decision for infection i at time t. This allows us to summarize the individual talk decisions for all ideas with the following vector:
$$\vec{t}_t = \begin{pmatrix} t^1_t \\ t^2_t \\ \vdots \\ t^M_t \end{pmatrix} \qquad (3.20)$$
Expectations are formed in a similar way to listeners, except that a narrative-specific signal is attached to the listener signal. This signal will be recovered from the text data and represents the optimism/pessimism of the idea.
$$E^i_t P_{t+1} = \hat{\theta}\, p_H + (1 - \hat{\theta})\, p_L \qquad (3.21)$$
Narrative infections: extra details
The narrative space described by S will be associated with an asset. For example, ideas “blockchain”, “dark
money” and “new economy” are associated with the Bitcoin asset. The attributes of an asset collectively
determine the likelihood of a bubble forming – for example, assets with low demand fundamentals, high
supply or a “muted”, slow-spreading idea space are unlikely to form bubbles.
Definition 8.1 (Asset). An asset is a set of parameters, a price path and an idea space: (τ, ω, P_L, P_H, {P_t}_{t=0}^{T}, S).
Where:
1. τ: demand fundamentals. Denoting the likelihood of individuals receiving high draws of the asset's value.
2. ω: supply fundamentals. Describing the available supply/endowment of an asset. This can be endogenized to vary with the price.
3. P_L, P_H: lowest and highest possible price realizations.
In this section I introduce the concept of an “idea infection”. During a bubble, a range of related sub-ideas[12] spread amongst people through talking and listening and are associated with their own beliefs about
the future price of the bubble. The structure of these sub-ideas in an “idea space” is fairly general, allowing
the model to capture a wide range of information structures with different numbers and types of infections
operating at different times. Later, I focus on a specific smaller number of infections to demonstrate how
this framework can capture empirical features of asset price bubbles.
Ideas come equipped with their own “languages”. For example, if “Blockchain” were a sub-idea it would
have an associated language or vocabulary that is frequently used when discussing this idea. The word
for this idea, along with its language or dictionary, determine how fast the idea spreads, as well as how
fast it is forgotten. Similarly, the future price expectations of a sub-idea represent demand fundamentals
(more intrinsically valuable assets will have more sub-ideas with high positive price expectations). For
example, having learnt about Blockchain, an individual will develop a price expectation based on a belief
of Bitcoin’s value as a Blockchain technology.
To capture this more formally we have the following definition:
Definition 8.2 (Narrative infection). An idea infection, i, is a group of infection parameters, infection paths and a language, represented by: (β_i, P^F_i, θ^P_i, ϵ_{i,0}, {I_{i,t}}_{t=0}^{T}, L_i). Where:
1. β_i: a spread parameter. Summarizes how likely a contact between an individual susceptible to an idea and an individual infected with the idea will lead to another infection.
2. P^F_i: a forget rate. Determines the likelihood of “forgetting” the idea, i.e. returning to the susceptible state.
3. θ^P_i: an expectation parameter. Determines how much higher the price expectation of an agent infected by the idea is, relative to that of a susceptible agent.
4. ϵ_{0,i}: a set of shocks. Determines when and by how much the idea infection begins/changes.
5. L_i: language characteristics. A position in the vector space of language, based on Vocab_i.
[12] I will use the terminology “idea” and “sub-idea” interchangeably. However, typically the model will examine more than one idea associated with a bubble asset, e.g. “blockchain” and “dark money”, rather than just the single idea of the bubbly asset itself, e.g. “bitcoin”.
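Purely as an illustration, the objects in Definitions 8.1 and 8.2 could be collected in simple data structures along the following lines; the class and field names are hypothetical and only mirror the notation above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NarrativeInfection:
    """Mirrors Definition 8.2: (beta_i, P^F_i, theta^P_i, eps_{i,0}, {I_{i,t}}, L_i)."""
    beta: float                 # spread rate beta_i
    forget_rate: float          # forget rate P^F_i
    expectation: float          # theta^P_i, uplift to price expectations
    shocks: List[float]         # eps_{i,0}, timing/size of emergence shocks
    infection_path: List[float] = field(default_factory=list)  # {I_{i,t}}
    language: List[float] = field(default_factory=list)        # L_i, position in the word-vector space

@dataclass
class Asset:
    """Mirrors Definition 8.1: (tau, omega, P_L, P_H, {P_t}, S)."""
    tau: float                  # demand fundamentals
    omega: float                # supply/endowment
    p_low: float
    p_high: float
    price_path: List[float] = field(default_factory=list)
    idea_space: List[NarrativeInfection] = field(default_factory=list)  # the idea space S
```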
Market clearing and fundamentals determine aggregate bubble dynamics
Market clearing determines the price spike and crash. Scarcity of the asset, perhaps driven by wild
demand, will drive the price spike. Abundance of the asset, perhaps driven by a high endogenous supply response, will drive a price crash (e.g. Bitcoin miners may work overtime and produce more coin if the
price is high; overproduction could ultimately fuel a crash).
Market clearing equations, including talk space clearing equations, are then as below. The first equation
just states that investments in the asset have to equal the available supply of the asset (ω). A more
sophisticated modelling of the supply-side could be conducted (e.g. I explore a supply increase with price in
one of the model extensions), but for the base model I keep this simple to focus on the demand features. The
second equation just sets the policy function l equal to aggregate listening i.e. the individuals’ listening
choice adds up to the aggregate in equilibrium. The price equation in the main text follows from these
equations and optimal investment equations.
$$x^S_t + \sum_{i=0}^{M} x^i_t = \omega_t = \omega \qquad (3.22)$$
$$l(\vec{I}_t, L_t) = L_t \qquad (3.23)$$
Talk volume
In the data, I have U Twitter users or Talkers, each labelled u, for each of t ∈ {1, .., T}. We can then denote whether or not individual u is infected by narrative i on day t as u_{i,t} = 1, and u_{i,t} = 0 otherwise. This yields the following formula to determine talk volume from the data, where the modulus term ensures that a 1 is recorded if a narrative infection occurs, or if a narrative is forgotten, in a given period.
Definition 8.3 (Talk volume (data)).
$$vol^{talk,data}_t = \sum_{i=1}^{M} \sum_{\forall u \in \{1,..,U\}} \left|u_{i,t} - u_{i,t-1}\right| \qquad (3.24)$$
Note this assumes that talk volume is measured based on daily infections of narrative infection. Of
course t could also measure hours or even minutes for higher frequency analyses. This definition also
assumes average talking is constant across time and so can be normalized, consistent with the model
assumptions. Finally, this definition embeds the assumption that it is only possible to be infected by
one narrative on a given day.
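A minimal sketch of equation 3.24, assuming a hypothetical 0/1 array of user-by-narrative infection indicators for each day, is:

```python
import numpy as np

def talk_volume(u: np.ndarray) -> np.ndarray:
    """Talk volume per period as in equation (3.24).

    u has shape (T, U, M): u[t, j, i] = 1 if user j is infected by
    narrative i on day t, and 0 otherwise. The absolute first difference
    records a 1 whenever a user newly catches or forgets a narrative.
    """
    changes = np.abs(np.diff(u.astype(int), axis=0))  # shape (T-1, U, M)
    return changes.sum(axis=(1, 2))                   # one volume number per period t = 1, .., T-1
```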
Narrative, talking and listening construction: empirical methodology
Table 3.7 shows narratives recovered from the Twitter data using the LDA approach. Note that this
approach produces factors that are much harder to interpret, with the advantage that they more accurately
represent the text patterns in the data, as well as the data’s true underlying factor structure. The data
exercise with LDA narratives therefore represents narratives that are purely selected by the data with
minimal researcher intervention.
Average tweets per user is fairly constant in the data over the whole bubble period, shown in Figure
3.14. That suggests that talking increases occur because of an increase in the number of users tweeting,
rather than their average tweeting. This justifies the assumption that talkers derive utility from talking
and therefore tweet up to their time constraint – most users' tweets per day appear to be around 2, as in the figure.
Figure 3.15 compares the main listening measure, retweets, with an alternative measure, favorites. The figure
shows that these measures are fairly correlated.
Figure 3.16 shows aggregate listening data versus listening model mechanism for LDA (user approach
equivalent shown in Section 6.2).
Calibration approach – other parameters
In Table 3.8 I list the other model parameters used in the calibration and their source. These are kept the
same over scenarios.
Figure 3.14: Average tweets per user relatively constant over bubble period (Bitcoin shown)
Figure 3.15: Comparison of listening measures, retweets versus favorites (Bitcoin shown, smoothed)
Figure 3.16: LDA approach: increase in avg listening per user causes rise, then fall in proportion of listeners
causes fall
Table 3.7: Recovered narratives: LDA Approaches
Bitcoin 2017/18 | Gamestop 2020/21
Narrative 1 | Narrative 1
Lead words / Supplemental words (e.g.) | Lead words / Supplemental words (e.g.)
Cash blockchain Come order online
Invest cryptocurrency Pre-order next day
Currency credit card Still shipment
Bank altcoin Get market place
Good money Good electronics video
Look government Game people play
Make exchange Narrative2
Go central bank Tell cheap price
Narrative2 Get buy bundle
Say bubble Work reduce risk
Buy buy crypto Talk current event
Big possible buy Be breaking
Thank take loan Make discuss
purchase Say listen
Sell sell fast
Narrative3 Narrative3
Trade future Short squeeze happen
Day year look Hedge fund investor
Year happen Fund inflate price
Base network Trader big loss
Come decentralize Take stock skyrocket
Be finance Investor online community
Use investor trade Buy investor reddit
Extra calibration result figures
Figure 3.17: LDA approach: price in data, SM model and news model
Figure 3.18: LDA approach: speed mechanism in SM model and news model
Figure 3.19: Data for user and LDA approaches: infections, news vs social media
Table 3.8: Other model parameters

Param | Description | Calibration method | Bitcoin | Gamestop
1. Narrative parameters
(Described in main paper.)
2. Bubble parameters
τ | Demand fundamentals | Summarized by choice of theta-S. | - | -
ω | Supply fundamentals | Bitcoin: average quantity of BN available in 2017/18. Gamestop: number of shares available in 2020/21. | 17m | 305m
p_H, p_L | Highest and lowest price | Solved from the date 0 price, using the price equation to match that. Use p_L = 0 and solve for p_H. | 0, 100 | 0, 828
θ | Listener signal | Recovered from pre-bubble measure of optimism/pessimism, i.e. optimism of the pre-bubble corpus. | 0.3 | 0.54
3. Utility parameters
γ | Risk aversion | Standard literature. | 0.2 | 0.2
σ | Price volatility | Standard literature, set higher for Bitcoin as it is viewed as a very volatile asset. | 8 | 4
δ | Time discount | Standard literature, set slightly higher for GS as it is hourly. | 0.9 | 0.99
4. Time parameters
l̂ min/max | Range of possibilities for talking | Assumed % of time available. | 0,1 | 0,1
Part II
Essays on Climate Change and Migration
Chapter 4
Are macro-climate models missing financial frictions: Empirical evidence and a structural model
1 Introduction
Climate change is expected to increase the frequency of climate shocks experienced by the macroeconomy.
Discovering and modeling the impact of these shocks on the macroeconomy is therefore essential to
understanding how a market economy will react to a changing climate. Financial friction effects have
been documented since the global financial crisis (for example, see Mian and Sufi (2009))(146), where firms and households, unable to raise financing due to the impact of shocks on their balance sheets, cannot carry out their expenditure plans. This slows economic activity, which further reduces the value of
economic assets, leading to yet more tightening in a vicious circle. In such circles, it has been shown that
negative shocks are both amplified and made more persistent than predicted in standard business cycle
models or as approximated by growth models. Despite the prominence of these effects in newer models of
financial economies, little has been said to date on how climate change will interact with these important
macroeconomic mechanisms for processing shocks. This paper addresses this gap.
As little has been said to date on the general financial consequences of climate shocks,[1] I begin with an empirical examination of how climate shocks[2] impact financial conditions in US states. I find that
climate shocks do indeed cause significant financial frictions (increased corporate credit spreads) which
can be expected to become more pronounced if climate shocks become more frequent. I then construct a
DSGE model with a climate change externality and collateral constraint (a “financial macroeconomy with
climate”) capable of capturing the impact of a changing distribution of climate shocks as emissions rise in
[1] Some work has been conducted, however, on disaster insurance, for example Gibson and Mullins (2020).(103)
[2] I use the term climate shock interchangeably with natural disasters, which simplifies some aspects of the climate shock distribution but is sufficient for the discussion here.
the future. This allows for parameterization of climate shocks and their impact on financial economies that
are not available for exploration and quantification in standard macro-climate models. Models that miss
these key economic channels may substantially underrepresent the economic damages associated with
climate change.
The academic literature on climate shocks has focussed on their impact on real economic activity.
Boustan et al. (2020) describe the county level effects of natural disasters showing that they increase out-
migration from counties hit by them.(35) Caliendo et al. (2018) also examine regional impacts in their case
study of Hurricane Katrina. They show that lost output in states is partially made up by increased activity
in adjacent states following a disaster.(49) Bloom et al. (2013) document the impact of disaster uncertainty
on economic growth, finding that this uncertainty causes substantial reductions.(12) Hsieh et al. (2010)
show that production losses for cyclones are large, but also are highly correlated with the effect of labor
productivity of higher temperatures which leads to an underestimation of the economic costs of climate
change.(121) Finally, Kahn (2005) tackles the death rate from natural disasters, and what cofactors are
associated with higher death tolls. None of these studies tackle financial effects. Kahn and Ouazad (2020)
conduct a micro study of mortgage lending during flood shocks, showing that mortgage lenders have an
incentive to securitize flood risk.(128) However the authors do not tackle the economy-wide impacts.
The theoretical macro climate literature has mostly focussed on growth models. For examples see
discussions in Nordhaus on integrated assessment modeling and an overview of the canonical climate
macro model in Hassler and Krusell (2018).(117) Growth models are a natural starting point since the time
horizons involved are long and the time periods used can be 10 years or more. However, such models do
not include shocks in the Real Business Cycle sense, instead modeling climate damages as fixed scaling
factors on the production function. Growth models have been shown to approximate economic data well
over the past century. However, with climate change leading to a substantially different shock distribution
in the future, this approximation may well cease to hold. New models are needed to quantify
and characterise these potential differences from past modeling. I provide such a model in this paper.
In my empirical examination, I seek to document the existence of financial effects from climate shocks
to provide justification for my later modeling exercise. Looking at the aggregate economy level for the US
only provides a small number of observations, not sufficient to achieve the desirable statistical precision.
Therefore, to obtain more data points, I instead focus on US states over the period 1980-2016.[3] I combine
natural disaster data with US corporate loan market data from the Dealscan database to produce a dataset of
disasters and financial conditions (specifically credit spreads) at the state-year level. I find that less severe
disaster-years are associated with modest increases in credit spreads (on the order of 10-20bps), causing a gradual drag on the ease of obtaining financing, while severe disasters are associated with significant increases in credit spreads (on the order of 60-100bps, and more for larger disasters). My estimates are
causal since natural disasters at the state level occur at random, and I provide robustness tests to confirm
this. I present a range of additional robustness checks to verify the main results. Although the focus of
this study is corporate lending, important financial effects are likely to also exist at the household level in
response to natural disasters, which are not covered here.
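The regression specification itself is laid out in section 3; as a rough, hypothetical sketch of the kind of state-year panel estimation this implies (the DataFrame, its column names and the clustering choice are illustrative assumptions, not the paper's exact specification), one could write:

```python
import statsmodels.formula.api as smf

def disaster_spread_regression(panel):
    """Two-way fixed effects regression of state-year credit spreads on disaster counts.

    panel: hypothetical DataFrame with columns
    ['state', 'year', 'spread_bps', 'disasters'].
    """
    model = smf.ols("spread_bps ~ disasters + C(state) + C(year)", data=panel)
    # Cluster standard errors by state to allow for serial correlation within states
    return model.fit(cov_type="cluster", cov_kwds={"groups": panel["state"]})
```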
Empirical examinations of climate change are by definition difficult since they seek to understand
relationships that will exist in the future, but are possibly different from those in existence today. Much
of the data available from the past occurs concurrently with a relatively benign climate compared to
expectations of the climate in the future. As a result, my estimates represent a lower bound on what could
be expected in the future should the climate evolve as anticipated by climate science.(122) As an illustration,
my credit spread data indicates that during the financial crisis from 2008 to 2009 state credit spreads (after
removing idiosyncratic effects on specific companies and specific loan types) increased by around 95bps for
the average firm in the average state (this is larger in some states and for firms with already tight financial
constraints). According to my empirical examination, states in years with disasters between the 97.5 and
99.9 percentiles have a roughly similar basis point increase in their credit spreads (92.5bps). In the data
there are 32 state-year instances with a disaster of this magnitude where the states in question experience
an increase in credit spreads comparable with that experienced by the average state during the financial
crisis. If climate change doubles the incidence of such disasters (not out of the bounds of possibility under
IPCC predictions[4]) then there would be 64 state-year instances of states experiencing credit prices as high
as during the average state during the financial crisis. This only captures the extreme end of the disaster
distribution, not even accounting for the drag created on firm financing by the continuous incidence of
smaller climate shocks.
[3] I focus on a state comparison, rather than a cross-country analysis, to minimize the risk of unobservable omitted variables influencing the analysis. US states are more comparable units of observation.
[4] E.g. the IPCC AR5 report notes that the incidence of category 4-5 tropical cyclones in the North Atlantic could increase by up to 200%, with an expected 50% increase (IPCC AR5, Fig 14.17). Similarly, the projected number of extreme warm, extreme cold and very wet days is expected to more than double (IPCC AR5, Fig 11.17), etc.(163)
Further interesting discussions on insurance and adaptation are relevant here, but are beyond the scope of this paper. If firms were taking out sufficient insurance or insurance markets were functioning
perfectly, then I would not find any financial effects of climate change, since firms would perfectly insure
against these risks and not find themselves needing extra financing during climate shocks. The fact that
spreads increase during climate shocks shows that there is an additional need for financing during these
times as well as a reduced ability to obtain this financing. Similarly, the fact that I find effects suggests that
adaptation to these climate events is far from perfect. Nevertheless adaptation will have an important role
in how economies experience climate change and the role of adaptation in financial markets should be a
subject of future research.
The empirical examination creates a rationale for theoretical modeling of macroeconomies with climate
systems that also account for the financial effects of climate shocks. I take a fairly general model of a
financial economy, similar to that in Jermann and Quadrini (2012), and add a climate module, similar to
Golosov et al. (2014), to represent the general class of financial economies with climate.(127) (104) The
model features an occasionally binding collateral constraint, such that in the absence of shocks firms can borrow unconstrained, whereas in the presence of shocks firms become constrained by the value of their collateral. This mechanism, combined with convex adjustment costs of borrowing, creates amplification
effects in the model due to financial frictions.
The climate system in this model interacts with the real economy through a standard damage function
as carbon emissions rise. In addition however, financial shocks are also linked to the climate such that they
become more prevalent over time (higher variance) as carbon emissions rise. As time evolves in the model
more instances of financial tightening occur, driving up the economic costs of climate change relative to the
standard macroeconomic model with climate. Varying this parameterization allows different magnitudes
of this climate change channel to be explored.
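To fix ideas, a stylized sketch of these two climate channels is given below; the functional forms and symbols (damage function D, carbon stock S_t, variance schedule σ²(S_t), collateral parameters θ and q_t) are illustrative placeholders rather than the exact specification developed in section 4:

$$Y_t = \big(1 - D(S_t)\big)\, z_t F(K_t, N_t), \qquad \varepsilon_t \sim N\!\big(0,\ \sigma^2(S_t)\big), \quad \frac{\partial \sigma^2(S_t)}{\partial S_t} > 0,$$
$$B_{t+1} \leq \theta\, q_t K_{t+1} \quad \text{(occasionally binding collateral constraint)},$$

where ε_t is the financial (or productivity) shock whose variance rises with the carbon stock S_t, so that episodes in which the constraint binds become more frequent as emissions accumulate.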
I derive several key analytical results for this general class of financial economies with climate. For
example, the formula derived for the marginal externality damage of climate change in Golosov et al. (2014)
for the general class of macroeconomies with climate continues to hold for financial macroeconomies with
climate under certain conditions. I also verify that the model replicates the increase in credit spreads when climate shocks occur, as documented in the empirical analysis. Finally, I leave the numerical solution of this
class of model for future research, though the simple form of the climate module and financial frictions
make the framework sufficiently tractable.
This paper therefore contributes to the existing literature in documenting the existence of financial
frictions as they relate to climate change and climate shocks. I also build a framework under which
these effects can be analysed and understood, which is not currently possible in the models used in the
macroclimate literature. Finally, this paper provides a general approach to how macroeconomic models
with separate environmental or other modules can be extended to include the importance of financial
effects. Future research should attempt to further catalog instances of financial frictions in relation to
climate shocks, and particularly their relationship with adaptation, as well as developing more tools to
explore how these effects may interact with climate in the future.
Section 2 of this paper describes the datasets used and my approach to measuring corporate credit
spreads at the state level. Section 3 describes my empirical methodology, results and a battery of robustness
checks I conduct to validate my results. Section 4 describes the theoretical model and proves key analytical
results. Section 5 concludes.
2 Data and descriptive statistics
In this section I describe the data I use for the empirical analysis. State data is used rather than national
to increase data points and therefore precision of the estimates. Relying on national data alone does not
provide much spatial variation upon which to test the hypothesis that disasters cause tightening financial
conditions.
After describing the natural disaster data, I then describe the variability in the data at the state level,
demonstrating that it can be considered random. This justifies my identification strategy in section 3 and
is in line with the typical identification strategy in the literature (Boustan et al. (2020); Bloom et al. (2013)
etc.).(35),(12) Finally, I describe the Dealscan data that provides the credit spreads used to measure credit
conditions.
2.1 Natural disaster data
I collect natural disaster data from the Federal Emergency Management Agency (FEMA). FEMA maintains a
dataset of natural disasters declared at the county level in the United States called the OpenFEMA disaster
declarations summary dataset. FEMA also keeps data on non-natural disasters such as terrorist acts and
dams breaking; in this paper I examine only natural disasters as these are of most direct relevance to
climate change.
Figure 4.1: Distribution of state natural disasters over time.
The data records disaster declarations in US counties from 1953. A single disaster hitting multiple
counties would register as several disasters. Thus using count data combines both quantity of disasters
with the geographic severity of the disaster. Combining the count data across state and year therefore
gives a measure of the quantity and severity of disasters in a given year in a given state.
Figure 4.1 shows the distribution of state observations from year to year. The figure shows how the
distribution of state natural disaster observations evolves over time. The light grey area shows how the
shape of the 10-90th percentiles change over time; dark grey shows how the shape of the 25th-75th
percentiles change over time; the blue dashed and black lines show how the average and median move
over time. The data shows a steady increase in the number as well as the range of natural disasters over
the course of the dataset. In section 3.3 I test using only later data to check for the possibility that some
aspect of declaring disasters changed over time.
2.2 Identification and randomness of natural disasters
Natural disasters are random events and the literature treats them as natural experiments to establish
causality (e.g. Boustan et al. (2012), Boustan et al. (2020), Barrot and Sauvagnat (2016)).(35),(34),(17) For
my state level regressions, natural disasters also need to be random at the state level to establish a causal
link between natural disaster shocks and financing. Some states will have more natural disasters than
others leading to entrenched differences in credit markets. These differences in credit conditions between
states are controlled for using fixed effects as explained in section 3.
At the state level, the data appears random with sharp deviations from year to year. Figure 4.2 shows
the natural disaster count by year for the five states with the largest number of disasters. As the chart
shows, increases and decreases in the measure by over 100 are common, and larger rises and falls are also
often seen.
Further, comparing Coefficients of Variation for each state shows that each state's distribution of disasters has a wide variance. The Coefficient of Variation (CoV) is the standard deviation of a sample standardized by its mean. Figure 4.3 shows that all but one state have a CoV of over 1; a CoV above 1 is generally considered to indicate a high-variance sample, since it implies the standard deviation exceeds the mean (so observations more than one standard deviation above the mean lie beyond twice the mean).
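As a small illustration of this diagnostic, assuming a hypothetical DataFrame of state-year disaster counts with columns `state`, `year` and `count`:

```python
import pandas as pd

def coefficient_of_variation(disasters: pd.DataFrame) -> pd.Series:
    """Coefficient of Variation (std/mean) of annual disaster counts by state.

    disasters: hypothetical DataFrame with columns ['state', 'year', 'count'].
    """
    grouped = disasters.groupby("state")["count"]
    return grouped.std() / grouped.mean()
```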
These figures provide evidence that these disasters are random events and therefore any significant
covariation in financial conditions and natural disasters must be causal. Nevertheless there is a further
question of anticipation – some of the natural disasters may be anticipated and adapted to given weather
forecasting and state populations' experience in previous disasters. Though this is not necessary to establish causality (I do not distinguish between anticipatory and unexpected effects, as all are relevant to this study), I do provide some sensitivity testing using lagged values of natural disasters in section 3.3.3.
2.3 Dealscan data
The Dealscan database records information on commercial loans and their rate spreads from a range of
companies that are required to record deals by the Securities and Exchange Commission (SEC). This is
combined with data from many deals recorded based on reporting, LPC contacts within the credit industry
as well as from the borrowers and lenders themselves. The information has been collected over many years
(beginning roughly around 1980) by the Loan Pricing Corporation. Coverage of the data versus the whole
commercial loan market in the US has been shown to be high, but not complete (Carey and Hrycray (1999)
find the database contains between 50-75% of the value of all commercial loans in the US during the 1990s,
and this is known to have increased over time).(51),(55)
The data have good coverage of most state and year combinations. Since the objects of my analysis are state-year observations, it is important that I have good coverage of loans issued to firms in all state and year combinations. A few states have lower numbers of observations for some years and therefore may not have a well-defined credit spread. I test the impact of removing these states from the dataset in section 3.3.3.
I address concerns about Dealscan being skewed towards large firms. In particular, since many, though far from all, deals recorded in the Dealscan database are filed with the SEC, there is a known bias in the data towards larger firms. Similarly, deals conducted by medium-sized firms may receive less attention and thus be harder to collect. Larger firms are also less likely to be geographically constrained and to be assessed for loans based on local factors, and are therefore less relevant for this analysis. I therefore undertake various steps, described below, to focus the data on smaller and medium-sized firms. In section 3.3.3 I also provide various robustness checks around these steps to ensure they do not unduly affect results.
Figure 4.2: Number of counties declaring natural disasters by state and year
Note: For the 5 states with the largest average natural disaster count. Shows sharp deviations in quantity and severity of natural disasters from year to year.
Term loans are less frequently used by large companies and therefore I focus on these loans in the main
analysis. The majority of deals in the database are either term loans or credit facilities. Credit facilities are
typically larger, less likely to be secured and are typically used for the short-term financing needs of larger
companies.(194) For this reason, I omit them from the main analysis, including them in a sensitivity test
in section 3.3.3.
Figure 4.3: Coefficients of variation by state (1980-2016).
Note: Most states have a CoV of more than 1, implying that the standard deviation exceeds the mean, i.e. wide variance.
I focus on loans taken out for reasons other than mergers. The Dealscan data has a field describing the
“primary purpose” of the loan. Since the focus of this study is financial constrainedness during or after
natural disasters, loan financing raised for mergers is less relevant to this study. I therefore omit term loans
raised for the primary purpose of conducting a merger. Again, I test sensitivity to this choice in section
3.3.3.
Lastly, since I am only interested in US domestic states, I omit the large amount of Dealscan data on loans made to foreign corporations. These borrowers are most often large companies and therefore do not meet the criterion that the firms in my sample be geographically constrained to the state in which their headquarters are located.
Figure 4.4: Persistence of natural disasters by state (1980-2016).
Note: P-value of an ARIMA regression with one lag. For almost all states the regression is insignificant, showing that the state series exhibit no persistence from year to year.
2.4 State and year variation in credit spreads
I use the Dealscan data to measure the state and year variation of credit spreads. To do this I must first
remove company and loan specific information from the data to be left with a dataset of credit spreads by
state and by year. I extract this measure from the data using the following panel regression:
r_{i,s,t} = \boldsymbol{\theta}'(Y_s \ast Z_t) + \boldsymbol{\beta}' X_i + \epsilon_{i,s,t}    (4.1)
Here r_{i,s,t} denotes credit spreads as measured in the data; Y_s and Z_t denote sets of state and year fixed effects respectively and are used to extract the state and year variation from the credit spread data; X_i denotes a set of firm and loan specific controls such as firm size, industry, loan size, fees, etc. X_i can be omitted or not, since this and other unmeasurable firm and loan variation will be contained in the error term. Since these should be uncorrelated with the state and year fixed effects, as within each state-year pair there will be large variation in the types of firm and loan, it is not necessary to include them to get an unbiased, efficient estimate of \boldsymbol{\theta}.
Figure 4.5: Distribution of state credit spread measure over time.
There are a range of alternative specifications and approaches that could be used to extract state and year variation. The advantage of this approach is that it takes the most conservative possible stance on how much of the variation in the spread data is attributed to state-year combinations, since the remaining error term is omitted from the measure entirely. The alternative approach of simply regressing the raw Dealscan dataset of firms on the state-year natural disaster data could include some firm-specific reactions to natural disasters in its estimate – for example, if one state has some firms that are uniformly vulnerable to flooding due to past underinvestment. This type of variation, which is not specifically a change in financing conditions due to the natural disaster but instead a reaction to firm-specific conditions, is conservatively omitted in this analysis. It is worth noting that such firm-specific variation could also be an interesting part of the economic reaction to natural disasters, both now and in the future.
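As a concrete illustration of the extraction in equation (4.1), the following is a minimal sketch using loan-level data; the DataFrame and column names (spread, state, year, loan_size, fees) are assumptions for illustration rather than the exact Dealscan field names.

    import pandas as pd
    import statsmodels.formula.api as smf

    deals = pd.read_csv("dealscan_term_loans.csv")  # hypothetical loan-level extract

    # State-by-year fixed effects plus (optional) firm and loan controls, as in (4.1).
    model = smf.ols("spread ~ C(state):C(year) + loan_size + fees", data=deals).fit()

    # The fitted state-year interaction coefficients serve as the state-year
    # credit spread measure used in the rest of the analysis.
    state_year_spread = model.params.filter(like="C(state)")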
Finally, as an additional test of validity, I compare my credit spread measure with available interest
rate data by state from the Federal Deposit Insurance Corporation (FDIC) state profiles. This comparison
shows a correlation between my measure and data from this externally constructed source.
3 Financial frictions during climate-related natural disaster shocks
3.1 Empirical framework
I follow the literature in regressing the log of credit spreads on the unit measure of natural disasters. The log specification approximates percentage changes and guards against the possibility that the spreads, which are prices, follow some form of integrated process over time. As the number of natural disasters is a real (count) variable, this is less likely an issue and it enters without adjustment. My specification is as follows:
\ln(r_{s,t}) = \delta \, \mathrm{NaturalDisasters}_{s,t} + \beta X_{s,t} + \gamma_T Y_t + \gamma_S Z_s + \epsilon_{s,t}    (4.2)
Here s \in S indexes US states; t is the year; X_{s,t} is a set of optional controls; Y_t and Z_s are vectors of time (whole-US) and state fixed effects; and \epsilon_{s,t} is the error term. Given the identification discussion in section 2.2, \delta can be interpreted as the causal impact of natural disasters on the year-on-year growth in interest rates. I also perform a range of sensitivity testing on this specification in section 3.3.
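A minimal sketch of this specification, assuming a state-year DataFrame containing the spread measure from equation (4.1), a disaster count and optional controls (the file and column names are illustrative assumptions):

    import numpy as np
    import pandas as pd
    from linearmodels.panel import PanelOLS

    panel = pd.read_csv("state_year_panel.csv").set_index(["state", "year"])
    panel["log_spread"] = np.log(panel["spread"])
    panel["disasters_norm"] = panel["disasters"] / 27.0   # normalized to the sample mean

    # State and year fixed effects with standard errors clustered by state.
    res = PanelOLS.from_formula(
        "log_spread ~ 1 + disasters_norm + unemployment + population"
        " + EntityEffects + TimeEffects",
        data=panel,
    ).fit(cov_type="clustered", cluster_entity=True)
    print(res.params["disasters_norm"])   # delta in equation (4.2)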
3.2 Panel results on interest rates / corporate credit spreads
The purpose of the analysis in this section is to document the effect of natural disasters on financial
conditions, specifically credit spreads. If natural disasters do cause financial friction effects, we would expect them to be associated with higher credit spreads, indicating either increased demand for credit (more firms need more financing) or reduced supply of credit (lenders are less willing to lend to affected companies) during these random events.
My analysis shows that natural disasters do have a statistically significant impact on credit spreads,
with the effect increasing as the severity of the disaster increases. Table 4.1 shows the results of regression
exercises using the credit spread data. Total disasters is normalized to the mean number of natural disaster
declarations in the sample (27). Across a range of specifications including state fixed effects, clustered
standard errors and control variables the effect remains significant.
The bottom section of the table shows the coefficient interpretation. For differing levels of natural disasters within the distribution of state natural disaster shocks I show the associated predicted credit spread. The table shows that a modest natural disaster produces only a modest increase in the credit spread, e.g. a mean natural disaster yields an increase in spreads of between 3.8 and 7 bps. However, for more extreme natural disasters, e.g. those at the 99th percentile, the spread can increase by a much larger 34-65 bps. Further, the distribution of natural disasters has a long tail, as noted in section 2.2, and therefore events at the 99.5th or even 99.9th percentile are seen in the data. These events have extrapolated increases in credit spreads of up to 73 bps and even 303 bps respectively.
Furthermore, due to the log specification on the credit spreads data, these basis point increases depend on the starting value used, so that firms that are relatively financially secure (already have low credit spreads) would experience a more modest increase in their difficulty of financing due to a natural disaster. Firms that already face financial difficulties, however, and thus start with higher typical credit spreads of, say, 400 or 500 bps, will face even larger increases in credit spreads as a result of natural disasters. This point relates to the existing literature on the importance of heterogeneity in firm financial conditions, i.e. precisely those firms that are the most productive have larger risk profiles and so are also those that may face more financial frictions to begin with. These firms, however, are also precisely those in the greatest need of financing.
As controls I include: (a) state populations according to decennial censuses, with a linear interpolation for years between censuses; (b) state unemployment rates from the Bureau of Labor Statistics as proxies for other economic activity, as in Boustan et al. (2020), though as they note these economic effects are endogenous and are partly what my analysis wishes to capture; and (c) an overall time trend term. The time trend is challenging because the natural disaster data appear to show a slight upward trend over the period of analysis, so a time trend may capture some variation in spreads that is in fact due to disasters becoming slightly more common over time. Nevertheless I include it here and the variable of interest remains statistically significant, albeit smaller.
Lastly, these results suggest that insurance is not capable of fully mitigating the impacts of natural disasters on firms. It could be argued that the financial effects of natural disasters can be mitigated by insurance since they are idiosyncratic events which can be hedged. However, the results here show that this is not the case – if insurance met all the financing needs of companies during natural disasters, additional credit would not be required and credit markets would not be affected by these events. My results demonstrate that, at the aggregate level, typical imperfections in insurance markets still leave firms requiring additional financing.
Table 4.1: Regression results: credit spreads on normalized disasters and control variables.
3.3 Robustness
In this section, I demonstrate that the results in section 3 remain robust to a range of alternative data assumptions. The results in this section also show some interesting directional changes in the main coefficient estimate when different assumptions or sets of data are used. In particular, in sensitivities aimed at targeting even smaller firms than the main dataset (by looking at firms with small loans or small sales), the effect of disasters on credit spreads becomes larger. Similarly, looking at lagged values of disasters reveals that the impact of a disaster continues to be felt on credit spreads for several years, although statistical significance drops off from year 2 onwards. This suggests the persistence typical of shocks that have a financial friction component.
Figure 4.6 shows 8 sensitivity tests, each based around a different methodological choice that could have been made, as described in section 2. The chart shows the coefficient estimates (diamond markers) under each sensitivity relative to the main estimate (blue dashed line). The chart also shows statistical significance in the sensitivities when clustering standard errors (yellow circles) and without clustering standard errors (red circles). In all but the sensitivities of including credit facilities and using a 2-year lag for the natural disasters explanatory variable, the estimate remains similar to the main estimates from table 4.1 and remains statistically significant.
Several of these sensitivities address the potential inclusion of large firms in the dataset. This is a problem because, for the estimates to be accurate, the data must accurately record the location of each firm. Since the data only record headquarters locations, the true location of larger firms is less likely to be represented, and the inclusion of these firms could bias down the resulting estimate. Hence I test robustness to assumptions that may capture larger, geographically dispersed, firms. The sensitivities are as follows:
1. Including loans used for mergers: in the main analysis I omitted loans used for the primary
purpose of mergers, as these seemed to be less related to loans for the purpose of investing that
are the more typical focus of financial friction effects. However it could be the case that natural
disasters have important effects on merger loans as well, since firms may attempt to consolidate
through mergers after experiencing a shock. The chart shows that the main coefficient estimate
increases slightly when mergers are included, and the estimate remains significant.
2. Including credit facilities: the main analysis is most interested in medium sized firms in the
dataset, as these are more likely to be geographically concentrated in one state and to be unable
to escape the financing effects of a natural disaster. For larger firms with operations in many
states, their lending conditions are less likely to be determined by a disaster hitting a particular
state. Since credit facilities are more often taken out by large firms for short-term financing needs I
omit them from the main analysis. The sensitivity indeed shows that when these loans are included the estimated effect falls dramatically as expected, and becomes less significant. The larger firms included in the credit facility dataset are less likely to experience financial frictions unless the natural disaster is substantial.
3. Removing larger loans: again, the subject of my analysis is smaller, geographically concentrated
firms. It may be that my main analysis, which imposes no limit on the size of the loan in the data,
captures some larger firms that are not subject to local financing constraints and so are not truly
treated with the natural disaster. As a result here I remove loans that are above the mean loan amount
in the data to test this. The main estimate remains similar but does increase slightly, suggesting that
the main estimate is a lower bound on how much some firms' financing is affected by natural disasters.
4. Removing companies with larger revenues: another way to test if the results are biased down
by firms that are not geographically located in one state is to remove those with high revenue. These
companies are more likely to be large and thus not geographically concentrated. To test robustness
I remove firms with more than the mean level of sales in the overall dataset. Similar to the previous
robustness check, the coefficient increases slightly and remains statistically significant.
5. Lagged values of disasters: as discussed in section 2.2 natural disasters are random events and
therefore the estimated impact on credit spreads should be causal. However by using same year for
spreads and natural disaster there is some possible time dependency e.g. if a natural disaster happens
at the beginning of the year but the credit spread increase is mostly attributed to the end of the year.
Therefore to test robustness to this I use lagged values of natural disasters. The figure shows that
at one year lag the coefficient remains similar and is still statistically significant. Interestingly, with
a lag of two years the effect becomes smaller but is still significant with no clustering of standard
errors (although slightly fails the significance test under clustered standard errors). This suggests
that there are also medium term financing effects of natural disasters e.g. firms in states with a
natural disaster two years ago are still facing tighter financial conditions.
6. Excluding earlier years of data: as noted in section 2.3, in earlier years there was less coverage of
the overall loan market in the Dealscan dataset than in later years. Therefore here I test robustness
to omitting earlier years. This comes at the cost of losing data points. The figure shows that the
main estimate is still significant and positive. However, the estimate is slightly reduced, closer to the levels seen in the main regression with controls. Nevertheless it is still within the bounds of estimates described in section 3.
7. Excluding states with fewer Dealscan observations: as mentioned in section 2.3, most states
have good coverage for most state-year combinations. However, some do have much less coverage
than others, and some have only a small number of observations for some years. Therefore here I
test robustness of the results to omitting these less precisely estimated states. The figure shows that
again the estimate remains similar to the main results and significant.
As a final robustness check, instead of including the number of disaster declarations I use dummy
variables indicating the presence of a disaster-year between different percentiles. This is a crude measure
of the severity of disasters in a given state and year. Table 4.2 shows the regression results from this dummy
regression including fixed effects and clustered standard errors (here “dum 80 pc” indicates a year within a
state where disaster declarations were between the 70th and 80th percentiles). Similar to the main results,
the table shows coefficients significant for disasters above the 70th percentile, and gradually increasing in
terms of their impact on credit spreads. This is consistent with the conclusion that smaller disasters evoke a
small drag effect on credit spreads (e.g. a modest increase of 15bps for disaster-years between the 60th and
70th percentiles – albeit just insignificant); whereas large disasters can add substantial amounts to credit
spreads (e.g. 60bps for disaster-years between the 95th and 97.5th percentiles – strongly significant).
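A minimal sketch of how such percentile-band dummies could be constructed, reusing the hypothetical state-year panel from the earlier sketch; the band edges follow the percentiles mentioned in the text, but the construction is illustrative rather than the exact one used.

    import numpy as np
    import pandas as pd

    # Percentile band edges for the disaster count (cf. Table 4.2).
    edges = np.percentile(panel["disasters"], [0, 60, 70, 80, 90, 95, 97.5, 100])
    panel["band"] = pd.cut(panel["disasters"], bins=np.unique(edges),
                           labels=False, include_lowest=True)

    # One dummy per band; these replace disasters_norm in the regression above.
    band_dummies = pd.get_dummies(panel["band"], prefix="dum_pc")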
4 A financial macroeconomy with climate
Previous sections have shown that climate shocks are indeed associated with tightened financial conditions
in the empirical data. Similarly, more severe shocks are associated with even tighter financial conditions.
This empirical exercise motivates the construction of a theoretical model that can represent these interactions
between climate and the financial sector as the economy processes these shocks.
Due to data constraints, the empirical exercise was done at the state level rather than aggregate US
level. Aggregating across states is likely to lead to a similar conclusion at the aggregate US level – most
disasters create an intermittent drag on firms’ abilities to raise financing, while extreme disasters have more
substantial financial effects. Therefore this model can be taken to approximate the aggregate US economy,
however more empirical work is needed to complement the model and fully uncover the mechanisms and detail behind the interactions of climate shocks and financial frictions at the aggregate level. Given the empirical evidence, however, this model can certainly be used to approximate state level economies, some of which represent larger economies than most countries (e.g. California and Texas), and their experiences of climate shocks.
Table 4.2: Sensitivity using dummy variables instead of using number of disaster declarations.
The model here broadly integrates a financial model of the sort used by Jermann and Quadrini (2012)
with a climate DSGE model such as that constructed by Golosov et al. (2014) who follow Nordhaus
and Boyer (2000) in their modeling of the climate system.(127) (104) (159) These climate models do not
use models with shocks, as is common in macroeconomics, favoring growth models due to the time horizons involved. I argue, however, that including shocks in a real business cycle framework is needed to adequately represent financial effects in the model. This paper therefore deviates from the typical macroclimate modeling literature by building climate effects into the model that are not channelled solely through the damage function that scales down production.
I first discuss the model environment, the climate module and the consumer problem. I then present the Final Output firm problem, at which point I present much of the key intuition that underpins the key results in the paper. I then discuss the energy firm problem and the equilibrium definition, before concluding with a series of key analytical results from the model.
4.1 Model Environment
Agents in the model are consumers who provide labor and purchase consumption, a final output firm
which raises financing both intra and intertemporally and invests in and maintains the capital stock, and
energy firms which produce energy inputs through the use of fossil fuels. The energy firms rent capital
from the final output firm rather than owning and investing in their own. This assumption is required to
have marginal externality damages that are no different from those in a regular economy, because energy
firms are not themselves exposed to financial frictions. This assumption is justified on the grounds that financial frictions for energy firms are likely dwarfed by those in the broader economy, and that energy firms are less likely to be companies with tight financial constraints (due to their significant assets and image as safe investments).[5]
Contrary to other climate models, which typically assume periods of around 10 years or more, here
I assume periods are closer to 2-3 years in duration, to adequately capture the important non-linearities
created by financial frictions. Critical here is the view that although in the past growth models with long
time periods have approximated historic data well, it is unclear whether this will hold in the future given
that historic data from the last 100 years features a relatively benign climate relative to that envisaged by
climate researchers.(122) (163)
4.2 Climate module
Before describing the model’s agents, I first describe the consolidated climate module used in the model.
For this, I broadly follow Golosov et al. (2014) and Nordhaus and Boyer (2000), adding in an additional
climate shock component.
Energy output in the model is measured in carbon emission units. Therefore the key climate variable, the atmospheric stock of carbon (S_t), is a function of aggregate energy output E^f_t = \sum_{i=1}^{I} E_{i,t} according to the equation below.

S_t - \bar{S} = \sum_{s=0}^{t+T} (1 - d_s) E^f_{t-s}    (4.3)

Here \bar{S} is the pre-industrial atmospheric CO_2 concentration, T is the number of periods that carbon remains in the atmosphere, and d_s represents the carbon depreciation schedule, which is approximated by the parameterized equation below. A detailed discussion of this approximate formulation used in economic models, as well as potential alternatives, can be found in Golosov et al. (2014).

1 - d_s = \varphi_L + (1 - \varphi_L)\varphi_0 (1 - \varphi)^s    (4.4)

[5] However, note that there are additional costs expected if oil firms, for example, become less profitable relative to renewable energy. Similarly, renewable energy firms may be more likely to be innovative firms with high rates of financing. Nevertheless, given increasing returns to scale in energy industries more generally, these special cases are unlikely to invalidate this assumption for most economies. A micro-level analysis of the financial constrainedness of different energy firms may be an interesting topic of future research, but is beyond the scope of this paper.
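The following is a minimal numerical sketch of equations (4.3)-(4.4); the parameter values and the flat emissions path are placeholders for illustration only, not the calibration used in the model.

    import numpy as np

    phi_L, phi_0, phi = 0.2, 0.4, 0.02      # illustrative depreciation parameters
    S_bar = 581.0                            # pre-industrial carbon stock, illustrative units
    emissions = np.full(100, 10.0)           # E^f_t: flat emissions path, illustrative

    def carbon_stock(E, phi_L, phi_0, phi, S_bar):
        """Atmospheric carbon stock S_t implied by (4.3) with schedule (4.4)."""
        T = len(E)
        keep = phi_L + (1 - phi_L) * phi_0 * (1 - phi) ** np.arange(T)   # 1 - d_s
        S = np.empty(T)
        for t in range(T):
            # carbon remaining at t from emissions in periods 0..t
            S[t] = S_bar + np.sum(keep[: t + 1][::-1] * E[: t + 1])
        return S

    S_path = carbon_stock(emissions, phi_L, phi_0, phi, S_bar)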
Finally, the climate system feeds into the economic model in the following two ways:
1. A standard damage function on economic production, such that output depends on carbon emissions. In the general case production is just specified to be a function of the atmospheric carbon contribution. In the fully specified version of the model this is represented as a scaled damage as follows, where D_t typically takes exponential form, e.g. 1 - D_t(S_t) = \exp(-\gamma_t (S_t - \bar{S})):

F_o(K_t, N_{o,t}, E_{o,t}, S_t) = (1 - D(S_t)) \tilde{F}_o(K_t, N_{o,t}, E_{o,t})    (4.5)
2. Through increasing the frequency of climate shocks: either a deterministic impact on the presence of financial shocks \xi_t, such that \xi_t = \tilde{\xi}_t(S_t) if these are modelled as deterministic; or, in the stochastic version of the model, an impact on the variance of \xi_t, i.e. \xi_t \sim G(\sigma(S_t)), where \sigma is the variance of the distribution G and is increasing in S_t.
Finally, I use the following precise definition of the term “climate change” in the rest of this paper.
“Climate change” in this model is the increased accumulation of carbon in the atmosphere. This feeds
into the financial economy through the damage function on the production function, and through changing
the variance of financial shocks due to climate shocks.
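As one illustration of this second channel, the sketch below draws the financial shock \xi_t from a Beta distribution on [0,1] whose variance rises with the carbon stock; the Beta choice and all parameter values are my own placeholders rather than a specification taken from the model's calibration.

    import numpy as np

    rng = np.random.default_rng(0)

    def draw_xi(S, S_bar=581.0, mean=0.6, kappa0=50.0, slope=0.05):
        """Draw xi in [0,1]; a larger carbon stock S lowers the Beta concentration,
        raising the variance of the shock (low realizations become more frequent)."""
        concentration = kappa0 / (1.0 + slope * max(S - S_bar, 0.0))
        a = mean * concentration
        b = (1.0 - mean) * concentration
        return rng.beta(a, b)

    xi_low_carbon = [draw_xi(S=600.0) for _ in range(5)]
    xi_high_carbon = [draw_xi(S=900.0) for _ in range(5)]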
4.3 Consumers
There is a representative agent (possibly representing a continuum of households). Deviating from Golosov
et al. (2014), consumers do not create capital. However they do provide intertemporal borrowing to firms
(buying corporate bonds).
The consumer's problem is the following:

\max_{C_t, N_t, B_t} \; E_0 \sum_{t=1}^{T} \beta^t U(C_t, N_t)    (4.6)

Subject to:

\frac{B_{t+1}}{1+r} + C_t = W_t N_t + B_t + \Pi_t + T_t \quad \forall t \in T    (4.7)
where C_t, W_t, N_t, \beta are standard, and the remaining variables are:
• B_{t+1} = face value of corporate bonds purchased this period (at discounted price B_{t+1}/(1+r)), repaid next period;
• B_t = repayment of corporate bonds purchased last period;
• \Pi_t = profits from the energy firms and the final output firm (i.e. \Pi_t = \Pi_{o,t} + \sum_{i=1}^{I} \Pi_{i,t});
• T_t = government transfer as a lump-sum subsidy.
The consumer’s problem involves only choices of variables today and has first order conditions included
in the appendix.
4.4 Final output firm
The Final Output (FO) firm produces all other goods in the economy, produces capital, uses loan financing and is subject to financial frictions. Crucially, within a period, the firm must first purchase inputs before it is able to access revenues. Therefore it requires intraperiod financing to purchase inputs for production.
Following Jermann and Quadrini (2012) in their formulation of a firm subject to financial constraints, the FO firm has two financing options: (1) at the beginning of each period the firm takes out an intraperiod loan (represented by l_t) to pay for inputs before output is realised; and (2) the firm also takes out an interperiod loan (represented by B_{t+1}) to redistribute over time periods. The detailed mechanics of the contract and associated game between the FO firm and its lenders are described in detail in Jermann and Quadrini (2012) and I omit them here. Note that since the FO firm uses l_t to purchase inputs, it will be set each period to be equal to output.
The FO firm takes prices as given when making choices, where r_t is the net interest rate on the interperiod loan, R_t is the rental rate the FO firm offers on capital (to energy firms), and W_t is the usual wage rate it pays for labor. Final output goods are the numeraire and so have a price of 1. The amount the firm is able to borrow is subject to convex adjustment costs \varphi(B_{t+1}), representing the fact that it is more straightforward to renew existing borrowing than to renegotiate borrowing or issue new debt. This is a typical financial friction effect generating persistence in this class of models.
The FO firm chooses how much capital to produce subject to a standard capital accumulation equation by choosing I_t (effectively choosing K_{t+1}). It then chooses the share \rho_K of available capital today to use for its own production, which earns a return equal to the marginal product of capital. The remaining share (1 - \rho_K) is then rented out on the rental market at price R_t (this capital is leased to the energy firms). The trade-off to the firm between using or renting out capital is pinned down by the market-clearing rental rate.
Finally, the firm is subject to a collateral or enforcement constraint which determines how much the firm can borrow. After realising revenue but before repaying the intraperiod loan, the firm faces a choice to default on its liabilities at that time: l_t + B_{t+1}/(1+r_t). Revenues realized are assumed not to be recoverable by the lender, but the capital stock K_{t+1} is. The capital stock is assumed to have an uncertain recovery value at this point of \xi_t K_{t+1} + (1-\xi_t) \cdot 0 = \xi_t K_{t+1}. Given this contract, the lender will only be willing to lend up to the recovery value, implying the enforcement constraint specified below in equation 4.11.
The financial shock term \xi_t \in [0,1] is a stochastic financial shock which, when high, does not affect the FO firm's investment plans, but when low constrains the firm's investment plans, forcing it to reduce either l_t or B_{t+1}. The constraint also provides additional motivation to invest in capital, as it provides collateral.[6] Crucially, this financial shock will also be tied to the climate such that it becomes more binding over time: \xi(S_t). One particular formulation of this shock is that \xi_t \sim F, where F has variance that increases with the climate S_t, such that as emissions increase the financial shock has a low realization more frequently. Alternatively, the path of \xi can be set deterministically as a function of S_t to produce different calibrated scenarios of climate shock realizations based on the changing profile of emissions.
[6] A more complete description of this renegotiation process can be found in Jermann and Quadrini (2012).
Based on this the FO firm’s problem can be stated as follows:
\max_{K_{o,t}, N_{o,t}, E_{o,t}} \; E_0 \sum_{t=1}^{T} \Pi_{o,t}    (4.8)

Subject to:

• Budget constraint:

\sum_{i=1}^{I} p_{i,t} E_{o,i,t} + B_t + W_t N_{o,t} + \varphi(B_{t+1}) + I_t + \Pi_{o,t} \le F_{o,t}(\rho_{K,t} K_t, N_{o,t}, E_{o,t}, S_t) + \frac{B_{t+1}}{1+r_t} + R_t \sum_{i=1}^{I} K_{i,t}    (4.9)

• Capital accumulation:

K_{t+1} = (1-\delta) K_t + I_t    (4.10)

• Enforcement constraint:

\xi_t \left( K_{t+1} - \frac{B_{t+1}}{1+r_t} \right) \ge F_{o,t}(\rho_{K,t} K_t, N_{o,t}, E_{o,t}, S_t)    (4.11)

• Potentially binding non-negativity constraints:

0 \le \rho_{K,t} \le 1    (4.12)

• Adjustment cost:

\varphi(B_{t+1}) = \kappa (B_{t+1} - \hat{B})^2    (4.13)

where N_{o,t} and E_{o,i,t} \; \forall i \in I are labor inputs and energy inputs of type i respectively, and E_{o,t} is the vector of energy inputs.
Substituting out the first two of these constraints, this recursive problem can be restated as the following value function, where S represents the aggregate states, K and B represent the other states, and the t, t+1 notation is temporarily replaced with X, X' notation. Note that m can either be the stochastic discount factor, to represent changing valuation of uncertainty in the future, or can simply be set to the deterministic discount factor \beta.

V(S; K_o, B) = \max_{K'_o, B', N_o, E_o} \; \Pi_o + E\, m' V(S'; K', B')    (4.14)

Subject to equations [4.9-4.13].

Full first order conditions for the FO firm are included in the appendix, where the Lagrange multipliers on the enforcement constraint and the two non-negativity constraints are \eta_t, \epsilon_{1,t}, \epsilon_{2,t} respectively (the latter two will be zero if R_t = MPK_t in equilibrium).

F_o^{[N_o]} = W \left( \frac{1}{1-\eta} \right), \qquad F_o^{[E_i]} = p_i \left( \frac{1}{1-\eta} \right) \quad \forall i \in I    (4.15)
The first order conditions for the basic inputs are all of a similar form, shown in equation 4.15 above. If the EC is binding (i.e. \eta_t > 0), then the optimal input is shaded down by a fixed proportion from the usual condition that the input price equals the marginal product of that input. This is because the inability to take out adequate financing, given the lack of collateral, forces the FO firm to shade down its production plans depending on the tightness of the EC. Therefore, if climate change entails frequent years in which the EC binds due to climate shocks, then firms will frequently be forced to shade down their investment plans, amplifying the typical output losses from a climate shock. The \rho_K first order condition produces a similar expression, just including an additional wedge adjusting for whether it becomes constrained at 0 or 1.
E\, m' \Big[ \underbrace{(1-\eta') F_o^{[K']} \rho'_K}_{\text{contribution to production next period}} + (1-\delta) + \underbrace{R'(1-\rho'_K)}_{\text{extra can be sold on rental market next period}} \Big] = \underbrace{1 - \eta\xi}_{\text{provides collateral that loosens EC this period}}    (4.16)
Finally, the dynamic K' equation trades off the typical productivity of capital next period with its role as collateral in the EC. Increasing K' increases production and sellable capital next period (as well as contributing to the renewal of depreciation), thus increasing the LHS. The ability to increase capital on the LHS of 4.16, however, becomes limited by the RHS. This in turn depends on the benefit from the EC – increasing K' loosens the collateral constraint, reducing \eta, so the \eta\xi term increases the optimal K'. However, this benefit is moderated by the value/recoverability of the capital stock to lenders – e.g. if capital has no value, \xi = 0 and there is no collateral benefit from increasing K'. This leads to the following climate predictions.
Proposition 4.1. In this financial economy with climate, climate change leads to more frequent binding of the EC from climate shocks. Firms will then have an incentive to accumulate more capital to act as collateral compared to a model without financial frictions (\eta = 0). By increasing the supply of capital, this would reduce the rental rate of capital.
The binding EC could also reduce demand for borrowing intertemporally. This is shown in the dynamic B' equation below. If the EC binds (\eta > 0), then the LHS of equation 4.17 falls, implying that the RHS has to fall. The RHS is increasing in B', so this leads optimal B' to fall. However, this depends on \xi – for large financial shocks, the lender cannot collect collateral and so having larger B' is less important; optimal B' is higher. Therefore there are offsetting forces on the demand for borrowing. This leads to the model's climate prediction in Proposition 4.2.
\frac{1}{1+r} \Big( \underbrace{1}_{\text{benefit of loan}} - \underbrace{\eta\xi}_{\text{tightens EC}} \Big) = \underbrace{E\, m' + \varphi'(B')}_{\text{costs of repaying loan tomorrow and adjustment}}    (4.17)
Proposition 4.2. In this financial economy with climate, climate change leads to more frequent binding of the EC from climate shocks. In the model this leads to a reduced demand for intertemporal borrowing from the FO firm. This is tempered, however, by the size of the financial shock: a large enough financial shock increases the amount of borrowing conditional on the EC binding.
4.5 Energy firms
There is a set of i \in I energy firms, each producing an energy resource E_i supplied to the FO firm as well as to other energy firms. As in Golosov et al. (2014), a baseline model may include a non-renewable, essentially inexhaustible, energy source like coal (E_1), a non-renewable but exhaustible energy source like oil (E_2) and a renewable energy source like wind (E_3).
The energy firms also face a tax \tau_{i,t} on production, which is set equal to 0 to simulate the no-carbon-tax equilibrium, or can be set to the marginal damage cost to emulate the planner's solution. Resources enter the production function to capture the fact that energy firms could face increasing costs of production as an energy resource runs out (oil only in the baseline). Exhaustible energy sources face the decumulation equation shown below, where R_{i,0} represents the initial (t=0) supply of the energy source. The energy firms' problem is then as below, and first order conditions are included in the model equation appendix.
\Pi_i \equiv \max_{\{K_{i,t}, N_{i,t}, E_{i,t}, E_{i,j,t}, R_{i,t+1}\}} \; E_0 \sum_{t=0}^{\infty} q_t \Big[ (p_{i,t} - \tau_{i,t}) E_{i,t} - r_t K_{i,t} - w_t N_{i,t} - \sum_{j=1}^{I} p_{j,t} E_{i,j,t} \Big]    (4.18)
Subject to:

• Production constraint:

E_{i,t} = F_{i,t}(K_{i,t}, N_{i,t}, E_{i,t}, R_{i,t})    (4.19)

• Decumulation equation:

R_{i,t+1} = R_{i,t} - E_{i,t} \ge 0    (4.20)
4.6 Equilibrium
Finally, we have the competitive equilibrium definition for this general class of financial macroeconomies with climate.

A Competitive Equilibrium for a financial macroeconomy with parameters (\delta, \xi_t, \kappa) is a set of allocations: (A) \{C_t, N_t\}_{t=0}^{T} for consumers; (B) \{K_{i,t}, N_{i,t}, E_{i,t}, R_{i,t}\}_{t=0}^{T} for all energy firms i \in I; (C) policy functions for the final output firm \{K_{o,t}(S;K,B), N_{o,t}(S;K,B), E_{o,t}(S;K,B), \rho_{K,t}(S;K,B), \eta_t(S;K,B)\}_{t=0}^{T}; and lastly (D) a set of prices \{W_t, r_t, R_t, p_t\}_{t=0}^{T}, such that:
1. \{C_t, N_t\}_{t=0}^{T} solves the consumer problem described in equations [4.6-4.7], given prices.

2. \{K_{o,t}, N_{o,t}, E_{o,t}\}_{t=0}^{T} solves the Final Output firm problem in equations [4.8-4.13], given prices.

3. \{K_{i,t}, N_{i,t}, E_{i,t}, R_{i,t}\}_{t=0}^{T} solve each energy firm's problem in equations [4.18-4.20], given prices.

4. Markets clear to solve for prices such that \forall t \in \{0, \ldots, T\}:

N_t = N_{o,t} + \sum_{i=1}^{I} N_{i,t}    (4.21)

K_t = K_{o,t} + \sum_{i=1}^{I} K_{i,t}    (4.22)

E_{i,t} = E_{o,i,t} + \sum_{j=1}^{I} E_{j,i,t}    (4.23)

B^S_{t+1} = B^D_{t+1}    (4.24)

5. The government budget balances: T_t = \sum_{i=1}^{I} \tau_{i,t} E_{i,t}.
Note p_t = (p_{1,t}, \ldots, p_{I,t}). Note also that in the stochastic model, \xi_t is drawn from a distribution with mean and variance parameters included in the list of parameters. The above represents the deterministic case, where \xi_t is just a deterministic function of the carbon stock S_t, i.e. agents may mitigate in equilibrium to avoid \xi becoming too affected by the climate.
4.7 Key analytical results
First, I establish a proposition linking the model more explicitly to the empirical results described in section 3. I then describe a few other key analytical results that emerge from the model and that hold for the general class of financial economies with climate.
In my empirical analysis, I found that climate shocks lead to increased credit spreads. The model emulates this response to climate shocks, and provides insight into how this happens in a general equilibrium framework.
Proposition 4.3. Climate shocks cause \xi_t to fall, increasing the chance that the EC will bind. This reduces the supply of lending, causing corporate interest rates (i.e. credit spreads) to rise in periods with climate shocks. If climate change increases climate shocks, spreads will rise more frequently.

Proof. As shown in equation 4.15, optimal input choices are scaled down when the EC binds, with larger climate shocks forcing the FO firm to scale its production plans down further. This in turn reduces demand for labor, meaning that in equilibrium the consumer receives less income. As a result, from the consumer's budget constraint (equation 4.7), the consumer supplies less credit through corporate bonds (B_t) to the FO firm. This reduction in the supply of borrowing causes the equilibrium interest rate for corporate borrowing, i.e. credit spreads, to rise.
Golosov et al. (2014) find a key equation for the marginal externality damage of climate change that
crucially under certain assumptions can be entirely calculated based on variables knowable today. Here I
show that the general version of this important climate equation continues to hold. However, as savings
now fluctuate through the business cycle this equation can no longer be computed by assuming saving
rates of today, and instead a stance needs to be taken on a forecast of future savings rates. In particular, if
firms save more in anticipation of future climate shocks, then the marginal externality damages could be
higher than under Golosov et al. (2014)’s assumptions.
Proposition 4.4. In a Financial Macroeconomy, where Energy Firms are not subject to financial frictions, the marginal externality damage equation remains as in Golosov et al. (2014):

\Lambda^s_t = E_t \sum_{j=0}^{\infty} \frac{U'(C_{t+j})}{U'(C_t)} \frac{\partial F_{o,t+j}}{\partial S_{t+j}} \frac{\partial S_{t+j}}{\partial E_{i,t}}    (4.25)

Proof. The proof is a straightforward corollary of Golosov et al. (2014)'s Proposition 2. The first order condition of the energy firms' problem with respect to E_{i,t} is as follows:

\hat{\lambda}_{i,t} + \hat{\mu}_{i,t} + \hat{\nu}_{i,t} = p_{i,t} - \tau_{i,t}    (4.26)

where \hat{\lambda}_{i,t}, \hat{\mu}_{i,t} and \hat{\nu}_{i,t} are the Lagrange multipliers on the energy production function, the resource constraint (equation 4.20) and the non-negativity constraint respectively.

Notice that this is equivalent to the decentralized equilibrium in Golosov et al. (2014), as the Energy Firms' problem here is identical given that Energy Firms are not exposed to the financial frictions. Thus it continues to be true that \tau_{i,t} lines up with the marginal externality damage equation produced in Golosov et al. (2014).
I now describe how the addition of the EC and its linkage to climate shocks impacts economic output, relative to the class of economic models with climate but without a financial sector.

Proposition 4.5. Through leading to more frequent binding of the EC, which forces the FO firm to scale down its production plans, climate change will lead to larger reductions in economic output in a financial macroeconomy with climate compared to a standard economy with climate.
Proof. For a given S_t, by the FO firm input equations 4.15, N_{o,t}, E_{o,t}, \rho_{K,t}, K_t will be lower relative to their equivalent values in a standard economy, N^{Standard}_{o,t}, E^{Standard}_{o,t}, \rho^{Standard}_{K,t}, K^{Standard}_t. Therefore we have that:

F_{o,t}(\rho_{K,t} K_t, N_{o,t}, E_{o,t}, S_t) \le F_{o,t}(\rho^{Standard}_{K,t} K^{Standard}_t, N^{Standard}_{o,t}, E^{Standard}_{o,t}, S_t)    (4.27)
Finally, I provide a key proposition demonstrating the importance of adjustment costs in borrowing for the financial effects in the model.

Proposition 4.6. If \kappa = 0, then financial frictions do not bind and changes in \xi have no effect on the real economy.

Proof. If \kappa = 0, the FO firm can costlessly adjust borrowing. In this case \varphi'(B') = 0. From the FO firm FOC in equation 4.17 it then follows that 1 - \eta\xi = (1+r) E\, m'. Since E\, m' = \frac{1}{1+r} from the consumer's FOC, \eta is always zero. Profits and borrowing will always cancel out in the consumer's budget constraint to ensure prices do not change with \xi. This happens optimally because adjusting borrowing is costless, so the FO firm will always just use borrowing to overcome financial constraints.
5 Conclusion
In this paper, I have provided the first empirical examination of economy-wide financial market effects of
climate shocks. Using state panel regressions I demonstrate that natural disasters cause higher corporate
credit spreads. The increase in credit spreads is larger when the number of county disaster declarations in that year is higher, indicating that larger disaster-years cause larger credit spread increases. These results show that the financial market consequences of climate shocks are significant. If the incidence of these shocks increases as climate science predicts, these financial market consequences will become more important.
Further, I subject my results to a range of alternative specifications and robustness tests. Deliberately focusing on smaller firms, which are more likely to be identified with a single state rather than spanning several, does indeed strengthen the disaster-credit spread relationship, whereas not filtering the loan data to remove larger, not geographically localized, firms reduces the size of the effect. Finally, the effect is robust to using prior-year natural disaster declarations in the regressions, suggesting some persistence in the financial effects.
The empirical exercise provided important justification for constructing a theoretical model. The
climate change literature has yet to substantively address how financial markets react to climate shocks.
Therefore constructing a model taking a stance on their reaction required the effect to first be documented.
In this paper I presented a DSGE model with climate externality and collateral constraint to approximate
a financial macroeconomy with climate. The model allowed the derivation of key insights. First, I established
that the credit spread increases documented in the empirical work are indeed emulated in the model. I then
showed that economic output is lower in the financial macroeconomy than in standard models without
collateral constraints. Finally, I characterized the importance of adjustment costs to borrowing in the
framework and showed that the marginal externality damage equation of Golosov et al. (2014) continues
to hold for a financial economy. However, its prior advantage of being totally calculable based on current
information under certain assumptions is unlikely to hold. A forecast of how savings will evolve over time
with carbon emissions is required to fully calculate the marginal externality damage.
I also build important intuition from the model. One important finding is the role of collateral as the
climate changes. If climate shocks become more frequent as predicted by climate science, the collateral
constraint binds more frequently providing an additional incentive to accumulate capital relative to the
model without financial frictions. The heterogeneous impacts of this are an interesting subject of future
research, since these additional collateral requirements may damage precisely the most innovative firms
who lack such collateral. This could damage the innovation potential of the economy.
Finally, the model allows different types of deterministic or stochastic parameterization of the impact
of the atmospheric carbon stock on the distribution of climate shocks. Given the uncertainty about the
impact of emissions on this distribution (see IPCC (2015)) this flexibility is a prerequisite of any useful
climate model.
Figure 4.6: Sensitivity of main results to alternative data approaches and analytical choices.
Note: Diamond markers represent coefficient estimates in each sensitivity using the left axis; relative to blue dashed line
showing main estimate of 0.023 from the main results table. Circle markers represent p-values in each sensitivity using the right
axis; relative to red dashed line showing whether regression remains significant at the 5% level.
Appendix
Main theoretical model equations
In this appendix section I present all model equations based on solving the optimization problems described
in the main report.
FO firm problem

F_o^{[N_o]} = W \left( \frac{1}{1-\eta} \right)

F_o^{[E_i]} = p_i \left( \frac{1}{1-\eta} \right) \quad \forall i \in I

F_o^{[K]} = R \left( \frac{1}{1-\eta} \right)

E\, m' \left[ (1-\eta') F_o^{[K']} \rho'_K + (1-\delta) + R'(1-\rho'_K) \right] = 1 - \eta\xi

\frac{1}{1+r} (1 - \eta\xi) = E\, m' + \varphi'(B')

\xi \left( K' - \frac{B'}{1+r} \right) = F_o(\rho_K K, N_o, E_o, S)
Consumer problem

U^{[C]}(C,N)\, W + U^{[N]}(C,N) = 0

U^{[C]}(C,N) \left( \frac{1}{1+r} \right) = E\, \beta\, U^{[C']}(C', N')

\frac{B_{t+1}}{1+r} + C_t = W_t N_t + B_t + \Pi_t + T_t
Energy firm problem

(p_{i,t} - \tau_{i,t} - \hat{\mu}_{i,t}) F_{i,t}^{[K]} = r_t

(p_{i,t} - \tau_{i,t} - \hat{\mu}_{i,t}) F_{i,t}^{[N]} = W_t

(p_{i,t} - \tau_{i,t} - \hat{\mu}_{i,t}) F_{i,t}^{[E_j]} = p_{j,t} \quad \forall j \in I
Chapter 5
How Asylum Seekers in the United States Respond to Their Judges: Evidence and Implications
Co-authored with Emily Nix (Marshall School of Business, University of Southern California)
“People talk among themselves. ‘I got the meanest judge; I shouldn’t go to court.’”
- Evelyn Smallwood, a lawyer at Hatch Rockers Immigration in an interview with WNYC.

“The unfairness of the immigration courts weighs heavily, especially in my district. Many of my constituents in Houston…we see immigration judges denying approximately 98% of petitions to asylum, as opposed to our friends in districts like New York with a denial rate as low as 5%. Somebody, please just tell me how that’s fair?”
- Rep. Sylvia Garcia in May 2022 on the range of denial rates amongst immigration judges after the House of Representatives Judiciary Committee voted on the Real Courts, Rule of Law Act.
1 Introduction
Every year, many thousands of migrants flee their home countries and arrive in the United States to legally
petition for asylum. Once these people arrive they encounter a high-stakes judicial process that determines
whether they have a credible fear of persecution and can legally remain in the country or not. One aspect
of the asylum process that is the subject of heated political debate and substantial media attention is how
often and why asylum seekers fail to show up for their court hearings. Alongside similar comments made
by members of the United States House and Senate, are those made by former Immigration and Customs
Enforcement (ICE) Director, Mark Morgan, in reference to asylum applicants: “Majority of them don’t even show up, they received ordered removal in absentia.” (NPR interview, 21st June 2019).
While a more careful examination of the data shows that the majority of asylum seekers to the United States do show up to court, a nontrivial 15% are absent during the years we study, 2009-2015.[1] Consistent with prior legal studies, we record an asylum seeker as absent based on court proceeding documents, i.e. an asylum seeker is defined as absent if the court proceeding documents that the individual was “in absentia”.[2] This is a growing problem: on average 16% of cases were decided in absentia in 2015, and this figure increased to 24% by 2018. This fact, combined with the growth in the past decade of asylum seekers entering the United States, underscores the importance of understanding why these absences occur.
There are many reasons why people might be absent from their court hearing. Critics argue that people
use asylum as a loophole to legally enter the country and then illegally disappear. Asylum advocates argue
that these absences have nothing to do with ”strategic” asylum seeker behavior and are instead explained
by things like language barriers and missed court summons. In this paper, we propose and test one possible
driver of absentia: asylum seekers may rationally choose to be absent in response to being assigned a
stricter judge. For example, immigration lawyer Evelyn Smallwood stated in a news interview ”People
talk among themselves. ‘I got the meanest judge; I shouldn’t go to court.’” (178).
For just one example that highlights the sort of systematic differences asylum seekers face and may respond to, in Los Angeles from 2012-2017 Judge Munoz denied 97.5% of cases while Judge Neumeister, who was also based in Los Angeles, denied only 29.4% of cases.[3] This information is currently publicly available: asylum seekers can look up how strict their judge is relative to others in their court and in the United States online, and there is substantial anecdotal evidence that asylum seekers are routinely given this information by immigration advocates and lawyers.[4] A rational individual might make choices like absentia if assigned to Judge Munoz’s docket, where a grant of asylum seems unlikely.
Ignoring fairness implications, if some portion of the 15% of asylum seeker absences we document
are due to this large variation in judge behavior, this creates a strong policy justification for reducing this
variation. The rational choice asylum seekers face is: should I show up to court if I know I am likely to
lose and potentially face immediate deportation? By not showing up the asylum seeker incurs a large
cost since absentia most often results in the judge denying the asylum seeker’s claim in their absence.
[1] This is for our estimation sample including cases with asylum applications or missing application data. See Section 3 for details on data construction. If we only include non-missing application data then the absentia rate for 2009-2015 is much lower, at only 6%.
[2] The vast majority (roughly 80%) of cases only have a single proceeding in which to be absent. However, when there are several proceedings associated with a case, based on consultations with immigration law experts, we use whether the first was recorded as “in absentia” to define if the asylum seeker was absent or not.
[3] See TRAC data and summary here: https://trac.syr.edu/immigration/reports/490/.
[4] This information can be found at: https://trac.syr.edu/immigration/reports/judgereports/. Before this information was online migrants likely shared information amongst each other or gleaned information on judges through other formal and informal networks. Based on our discussions with immigration lawyers, and media accounts documented later in the paper, people are well aware of which judges are stricter versus more lenient.
However, absentia may make it easier for a person to disappear and remain in the United States illegally (or an individual may perceive this to be the case). Especially if an asylum seeker is certain that they may be killed after deportation, but views their case for asylum as unlikely to be granted, then absentia and illegally remaining in the country may be the best of bad options. This difficult choice of whether to show up for court, and the large stakes involved, has been covered extensively in the media. For example, when discussing asylum seekers and their court cases, a news article stated that “some just stayed away, fearing they could be deported directly from courthouses and choosing instead to take their chances in the immigration underground” (177). If assigned a relatively lenient judge, this decision skews in favor of showing up and hopefully obtaining the benefit of being granted permanent asylum. In this paper, we find that asylum seekers do respond in this way, which provides important empirical justification for policies aimed at reducing the large variability in judicial behavior in the U.S. immigration system.
Beyond the clear policy implications of this analysis, a key economic contribution of this paper is
documenting whether these types of endogenous responses to decision-maker inconsistency occur. The
answer to this question could have broad external validity, given that decision-maker inconsistency is
present in many other institutional settings that require people to make highly subjective choices with
unclear guidelines. For a few examples, consider other legal settings with unstable legal precedents,
markets with volatile governmental regulations, and education systems for subjects with wide variation in
test standards. A second key economic contribution of this paper is that we show this type of endogenous
response leads to bias in second-stage estimates in the popular decision-maker research design.
Turning to a more detailed description of our methodology and results, in the first part of the paper
we document extreme variation in judicial leniency in immigration courts. This empirical result, and the
analysis that follows, uses data from Freedom of Information Act (hereafter FOIA) requests to the United
States Department of Justice Executive Office for Immigration Review (hereafter EOIR) largely made by
the Transactional Records Access Clearinghouse (TRAC) at Syracuse University. Given the complexity
of the data, we supplemented the TRAC data with our own extensive FOIAs consulting EOIR, as well as
additional data we collected via FOIA requests to the Department of Homeland Security (DHS).
For each judge, we calculate a judge leniency score equal to the share of applications granted out of
all cases the judge sees. The average percentage point gap between the strictest and most lenient judges
across all courts with more than 1 judge is 20 percentage points. At the most extreme, in the New York - Federal Plaza court (32 judges in our sample) we find a gap of 76 percentage points, i.e. the strictest judge is 76 percentage points less likely to grant asylum than the most lenient judge on average for the period 2009-2015. These numbers are consistent with the notion that in U.S. immigration courts, “your judge is your destiny” (198).[5]
Next, we examine whether asylum seekers rationally respond to this substantial variability in how
lenient different judges are. Asylum seekers assigned a harsher judge, regardless of how strong their case
is, might optimally choose not to show up in court and disappear illegally rather than risk deportation.
While anecdotal accounts suggest at least some asylum seekers engage in this type of cost-benefit analysis
calculation, this paper examines whether this practice is more systematic.
To understand if asylum seekers are systematically absent in response to how lenient their assigned
judge is, we cannot simply regress judge leniency on asylum seeker absentia. If judges in Houston see
very different cases compared with judges in New York, then other characteristics of the asylum seekers
could be correlated with both absentia and the leniency of judges across courts, leading to a spurious
conclusion that asylum seekers respond to judge leniency. To address this challenge, we leverage the fact
that within court and time, cases are quasi-randomly assigned to judges to identify the causal impact of a
more lenient judge on absentia. We provide both empirical and institutional evidence consistent with this
randomization of judges to cases, which was also used in Chen et al. (57). We find that even after removing
court by time fixed effects, the percentage point gap between the most and least lenient judge is still on
average 12 percentage points across all courts, and there is a 55 percentage point gap between the strictest
and most lenient judge in our data.
Using this identification strategy, we find asylum seekers quasi-randomly assigned to an immigration
judge who is ten percentage points more lenient are 0.74 percentage points more likely to show up for their
court hearings, a 4.9% increase relative to the dependent variable mean which is significant at the 99.9%
level. Given that the unconditional average variation from the strictest to the most lenient judge within a
court is 20 percentage points, this scales to on average a potential 1.5 percentage point decrease in absentia
moving from the strictest to the most lenient judge in the court, which is a 10% decrease in absentia relative
to the mean. Thus, the vast variability in judicial leniency within United States immigration courts causes
modest but important distortions in the behavior of asylum seekers.
Our interpretation of this result is that while some individuals are rationally responding to judge leniency, this response cannot explain the majority of absences. However, we caution that this estimate is
a lower bound of the overall amount of endogenous absentia in response to the vast differences in judicial
behavior in the United States immigration system. For example, there is likely also endogenous absentia in
response to average judicial leniency across courts, which our research design cannot identify since courts
are not randomly assigned. As an example, our main result implies that just removing all within-court
variation could reduce the number of yearly absentia rulings by 6%. However, if we extend this back-
of-the-envelope calculation and assume that our main result also applies between courts, then removing
these distortions reduces absentia findings by 20% per year.
Footnote 6: These numbers use estimates based on judge variation conditional on court and year fixed effects. If asylum seekers respond to actual judge grant rates, which is more likely the information they have, these estimates are larger.
In the last part of the paper, we discuss the implications of this type of endogenous response to judge
assignment for the popular decision-maker research design. This research design is often used to identify
causal impacts in settings where identification can be otherwise challenging. For example, it is unlikely that
imprisonment will ever be randomly assigned, but randomly assigned judges who vary in their leniency
have been used to circumvent this fact and identify causal impacts of prison (132; 27; 156; 125). This
approach has also been used to understand the impacts of a wide range of policies, such as bankruptcy
(26; 25), debt (75; 58), disability (98), and foster care (78; 13). Using a simulation exercise we show that the
endogenous behavior we document in this setting introduces bias in second-stage estimates when using
randomized judge assignment to identify causal treatment effects in a two-staged least squares (2SLS)
framework.
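To illustrate the mechanism, the following minimal Python sketch (with entirely hypothetical parameter values, not the calibration used in our simulation exercise) generates randomly assigned judges who differ in leniency, lets weak cases assigned to strict judges selectively leave the sample, and compares 2SLS estimates with and without that endogenous attrition.

```python
import numpy as np

rng = np.random.default_rng(0)
n_judges, cases_per_judge, true_effect = 50, 400, 2.0

# Judges differ widely in leniency (probability of a favorable decision).
leniency = rng.uniform(0.2, 0.8, n_judges)
judge = np.repeat(np.arange(n_judges), cases_per_judge)
z = leniency[judge]                                    # instrument: assigned judge's leniency

u = rng.normal(size=z.size)                            # unobserved case quality (confounder)
d = (rng.uniform(size=z.size) < z + 0.1 * u).astype(float)   # decision depends on judge and case
y = true_effect * d + u + rng.normal(size=z.size)      # outcome of interest

# Endogenous response: weak cases assigned to strict judges leave the sample,
# so selection depends on both the instrument and the unobservable.
stay = ~((z < 0.4) & (u < -0.5))

def tsls(y, d, z):
    """Manual two-stage least squares with a constant."""
    Z = np.column_stack([np.ones_like(z), z])
    d_hat = Z @ np.linalg.lstsq(Z, d, rcond=None)[0]   # first-stage fitted values
    X = np.column_stack([np.ones_like(d_hat), d_hat])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]     # second-stage slope

print("2SLS, full sample:         ", round(tsls(y, d, z), 2))                     # near 2.0
print("2SLS, endogenous attrition:", round(tsls(y[stay], d[stay], z[stay]), 2))   # biased away from 2.0
```

The sketch is only meant to convey why selective absentia threatens second-stage estimates; the simulation reported in the paper follows the institutional setting more closely.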
Our results suggest some caution when using the popular randomly assigned decision-maker identification
strategy. We recommend that researchers check for endogenous responses to decision-maker assignments
before proceeding. This is particularly true in settings like the asylum system where the following four
conditions are met: 1) people are subject to arbitrary decision-makers, 2) the variation in decision-maker
behavior is large, 3) information on this variation is publicly available, and 4) there are actions that can
be taken in response. This result contributes to the growing literature scrutinizing the popular decision-
maker research design. For example, Frandsen et al. (96) provides a stricter monotonicity test and shows
that many applications fail this test, with important implications for the interpretation of estimates.
This paper provides evidence of how people navigate the asylum system in the United States. Prior to
this paper, there has not been research in economics on how asylum seekers in the United States interact
with the United States immigration courts, despite an active law literature on this topic (180; 81). Instead,
we contribute to a larger literature in economics on asylum seekers and refugees more generally.
Asylum is intended to save those at risk of death in their home countries, although there is substantial
disagreement on who should be admitted and whether immigrants in general benefit the receiving country
(186; 168; 4; 167).
Focusing on asylum seekers and refugees, Chin and Cortes (61) provide a good overview.
Cortes (64) shows that while initially refugees to the United States earn and work less than other types
of immigrants, they eventually surpass other immigrants in terms of wages, hours worked, and language
assimilation. Brell et al. (36) show a similar pattern of catch-up among refugees more recently.
This is
consistent with the possibility that asylum seekers and refugees have recently escaped trauma, putting
them at an economic disadvantage initially.
Relative to the large literature in economics focused on the impacts of migrants on receiving countries
and migrant labor market outcomes, our paper contributes by providing evidence on how this heavily
publicized group of people responds to the complex and fractured United States asylum system. While
this topic has received limited attention in economics, the evidence from this paper is particularly relevant
given the ongoing debate across all branches of the U.S. government on absentia and whether asylum
seekers are manipulating the system.
Our results provide critical direction on how the United States
asylum system should be reformed. In particular, our results indicate that the large variability in judicial
decision-making is problematic not only because it is potentially unfair, as Representative Sylvia R. Garcia
claims in the quote at the start of this paper, but also because it leads to important distortions.
Footnote 7: Much of the existing research studies the impact of immigrants on receiving countries. Immigrants have been shown to impact labor supply (67; 82), wages (7; 164; 79; 33), prices (66), innovation (124; 154; 153; 206), growth (43) and politics (145; 45?; 195; 80). Interestingly, with the exception of domestic violence (85), there are not significant impacts of immigration on crime (119; 28). However, providing legal status to migrants does appear to reduce their criminal activity (144).
Footnote 8: See also Beaman (21) who shows refugee social networks can facilitate this catch-up in wages.
Footnote 9: The wild variability in judge assignment was part of the discussion around the Real Courts, Rule of Law Act, passed out of the judiciary committee in the House of Representatives in May 2022. On November 19, 2019, in a House of Representatives Homeland Security Subcommittee hearing, Representative Mike Rogers stated "The overwhelming majority of people do not show up for these hearings once they get into this country. That's not the kind of situation that we can continue to allow". On June 11, 2019 DHS Secretary McAleenan was questioned about absentia in a Senate Judiciary Committee hearing. These represent just a few such examples.
2 Institutional Context
2.1 Who Is An Asylum Seeker?
In the United States, immigration can happen legally through a large number of different classifications.
In this paper we focus on asylum seekers. The United States Government defines an asylee as a person on
United States soil who is found to have a ”credible fear of persecution” in their own country. Credible fear
is in turn defined as an applicant having a “significant possibility” of establishing in a hearing before an
Immigration Judge that he or she has been persecuted or has a well-founded fear of persecution on account
of his/her race, religion, nationality, membership in a particular social group, or political opinion (we refer
to these as ”persecution categories”) if returned to his/her country.
The burden of proof that such a well-founded fear exists falls squarely on the asylee. As we will show later in this paper, this definition is interpreted very differently across judges.
Footnote 10: For additional information, see https://www.uscis.gov/humanitarian/refugees-asylum/asylum/questions-answers-credible-fear-screening
Figure 5.1 shows the overall asylum process. Further technical points about the system and this figure
are described in Appendix 7. We focus on the judicial stage conducted by Executive Office for Immigration
Review (hereafter EOIR) within the Department of Justice (DOJ). In practice, whether an applicant has a
legitimate asylum case is a legal question, with a vast array of precedents and legal decisions determining
whether a particular individual case constitutes a valid asylum case.
In this paper, we take advantage of both the random assignment of cases and the fact that different
judges vary enormously in terms of their leniency to identify one interesting way in which asylum seekers
might strategically interact with the judicial system: absentia.
Absentia has been of particular policy interest, especially recently. Some of the questions around
absentia were summarized by Theresa Cardinal Brown, director of immigration and cross-border policy
at the Bipartisan Policy Center, in a May 25, 2022 Los Angeles Times interview: ”They need to figure out
why people are not showing up…ICE wants to believe that everyone is absconding, and the advocates
want to believe that everyone is legitimate and has an asylum claim. The reality is that it is somewhere in
the middle. And how far in the middle to which side? We don’t know” (50). This paper investigates one
type of strategic absentia that could be occurring, endogenous absentia in response to variation in judge
leniency. Our results also speak to the larger issue of how those subject to capricious decision-makers
might respond to such a system.

Figure 5.1: The Asylum Process
Notes: Figure summarizes the asylum process for applicants from start to finish. Edited version of Figure 1.2 in Miller et al. (150).
2.2 Immigration Judges
Immigration judges and their behavior form the core of the analysis in this paper. Immigration judges
are appointed by the United States Attorney General, in accordance with the Immigration and Nationality
Act. In Table 5.1 Panel A we report summary statistics for all immigration judges for the years 2009-
2015, including both asylum and non-asylum cases. For completeness, Panel B reports summary statistics
restricting to just the asylum cases we focus on in our main analysis.
Focusing on Panel A, judges see 649 cases on average per year. This number might seem very high, but
as discussed in multiple media articles immigration judges see an extraordinarily large number of cases per
year. This problem has only gotten worse over time. ”The number of people claiming asylum has increased
by nearly 2,000% between 2008 and 2018. EOIR has not hired enough new immigration judges to keep pace
with this increasing rate of new cases. In 2019, the median caseload for judges was 3,000 annually” (3).
Footnote 11: See also discussion here: https://trac.syr.edu/immigration/reports/675/#::text=In%20May%202019%20this%20TRAC,four%20thousand%20or%20more%20cases.
On average judges have served 8 years, possibly indicative of immigration judges seeing large numbers of
emotionally taxing cases leading to high rates of burnout (141).
The fact that judges see so many cases likely precludes them from carefully weighing the merits and evidence when deciding each individual case. Thus they could be more likely to rely on "rules of thumb" or their own inclinations when deciding cases. Such a setting makes judge assignment even more pivotal, with how strict or lenient a judge is playing an outsized role in the outcome of a case.
Table 5.1: Judge Characteristics, 2009-2015

Judge Attribute                            Mean    25th    50th    75th percentile
Panel A: All Immigration Cases
Avg cases per year                          649      29     204     485
Avg female cases per year                   48%     41%     50%     60%
Avg Chinese cases per year                  21%      0%     12%     38%
Avg Northern Triangle cases per year        49%     30%     50%     67%
Avg Mexican cases per year                  46%     21%     47%     69%
Total years served in                         8       4       8      13
Total courts served in                        4       2       3       6
Panel B: Estimation Sample of Asylum Cases with Restrictions
Avg cases per year                           95      18      79     139
Avg female cases per year                   40%     17%     38%     54%
Avg Chinese cases per year                   8%      0%      1%      7%
Avg Northern Triangle cases per year        31%     14%     24%     43%
Avg Mexican cases per year                  27%      6%     24%     42%
Total years served in                         6       3       5      10
Total courts served in                        2       1       1       2

Notes: Table reports descriptive statistics for judges using all immigration cases in Panel A, which is the level at which we calculate our judge leniency measure, although results are similar if we use the sample in Panel B to calculate judge leniency (see Appendix Table 5.7). Panel B reports results for just asylum cases and where we additionally impose the restrictions that define our estimation sample as described in Section 4. All 'per year' categories are measured for each court a judge serves in during a given year, and then averaged over all years.
Institutional Support for Random Assignment of Cases to Judges. A key feature of our empirical
strategy is the fact that cases are randomly assigned to judges. ”The most important moment in an asylum
case is the instant in which a clerk randomly assigns an application to a particular asylum officer or
immigration judge.” (180). Documentation of the random assignment of cases can be found in various
EOIR documents and other summaries. For example, a TRAC report states that ”the Executive Office for
Immigration Review assigns incoming asylum applications to judges on a random basis, using automated
procedures within each court. When individual judges decide a significant number of asylum requests,
this random assignment results in all judges dealing with the same broad mix of asylum seekers as their
colleagues on that court.”
Similarly, a more recent report states ”These random assignment procedures
parallel what happens in a scientific experiment where individuals are assigned randomly to different
treatments, as in a drug trial. Here, however, instead of being assigned to different drug treatments,
asylum seekers are assigned to different judges. When individual judges handle a sufficient number of
asylum requests, random case assignment will result in each judge being assigned an equivalent mix of
asylum seekers”.
A 2016 Government Accountability Office (GAO-17-72) stated that ”substantively, random effects accurately
represented the random assignment of judges to cases, which EOIR generally uses as an administrative
policy”.
The fact that cases are randomly assigned has also reached popular consciousness, as evidenced
by the following statement in a 2007 NYT article ”It is very disturbing that these decisions can mean life
or death, and they seem to a large extent to be the result of a clerk’s random assignment of a case to a
particular judge”.
Footnote 12: See https://trac.syr.edu/immigration/reports/209/ for the full report.
Footnote 13: See https://trac.syr.edu/immigration/reports/447/.
Footnote 14: Original source found here: https://www.gao.gov/assets/690/680976.pdf and here: https://www.gao.gov/assets/gao-17-72.pdf
Footnote 15: See https://www.nytimes.com/2007/05/31/washington/31asylum.html for full article.
Variation Across Judges. Another key to our identification strategy is the fact that judges vary in how
likely they are to grant asylum. Among United States immigration court judges this variation is extreme,
as depicted in Figure 5.2. The figure depicts the leniency rate of each judge in each court from 2009-2015,
the years we use in our main analysis. Each dot represents a judge, and the line extends to cover the most
and least lenient judges within each court.
The figure demonstrates that in most courts the variability in asylum grant rates across judges is
exceptionally large. For example, in the New York - Fed Plaza Immigration Court the 32 judges we observe
serving from 2009-2015 range from a 7% grant rate for one judge to an 83% grant rate for another, which
corresponds to a 76 percentage point difference in grant rates across judges. For all courts that have more
than 1 judge, the average gap in judge leniency from the least to most lenient judge serving in 2009-2015
is 20 percentage points. Note that the six largest courts (New York - Fed Plaza, San Francisco, Chicago, Los
Angeles - Olive, Dallas, and Miami) see 33% of all cases.
The figure also suggests differences across courts. We capture this more succinctly in Figure 5.3 Panel
(a) which displays the average grant rate across all cases seen within each court over our time period.
Variation in judicial leniency across courts could be partly attributable to the different types of cases seen.
For example, courts in Texas may see more cases from Central and South America than courts in the
Northeast. As a result, the same judge might display very different grant rates in different courts. Similarly,
there could be differences in the types of cases over time. When we remove court by time fixed effects in
our main analysis below, we remove such variation across courts.
However, if we were primarily interested in whether court grant rates differ because judges are more
or less merciful across courts or because case characteristics are different, the court fixed effects ”over
control”. Court fixed effects will not only remove differences in case characteristics across courts but also
average differences in judge leniency across courts. For example, suppose judges in New York City are
more lenient but also see more compelling asylum cases. Then the court fixed effects will remove not only
differences in case characteristics but also the greater leniency on average amongst New York City judges.
While these fixed effects are necessary for identification (see Section 4 for more details), it is interesting to
consider how much of the differences in grant rates across courts are explained just by differences in the
types of cases seen in different courts.
Figure 5.3 Panel (b) partly addresses this. It calculates average grant rates by court conditional on
nationality and weekly fixed effects. This approach imperfectly controls for differences in the composition
of cases across different courts.
This reduces the range in average grant rates across courts but does not eliminate it, consistent with differences in average judge behavior and not just case characteristics across courts explaining differences in grant rates across courts.
Footnote 16: Imperfectly because lower grant rates of Mexican cases, for example, could be due to Mexican cases generally being seen by harsher judges concentrated in certain courts.
Figure 5.2: Raw Variation in Judge Leniency Across Courts, 2009-2015
Notes: Figure shows the average grant rate for all judges across all courts from 2009-2015. Judges with very few (<120 per year) or very large (>4,000 per year) numbers of cases are omitted. Each blue dot represents an individual judge's grant rate, averaged over the whole data period. In other words, each judge serving in a single court is depicted only once from 2009-2015, where we calculate the average leniency across all months and years for each court the judge has served in (a minority serve in more than one court from 2009-2015). Lines indicate the least to most lenient judge by court. The six largest courts in terms of cases seen are New York City-Fed Plaza, San Francisco, Dallas, Chicago, Los Angeles-Olive and Miami.

Figure 5.3: Variation in Average Grant Rates By Court, 2009-2015. Panel (a): Unconditional Average Grant Rates; Panel (b): Conditional Average Grant Rates.
Notes: Figure shows average grant rates across all cases within each court from 2009-2015. Subfigure (a) represents raw court averages, averaged over all judges serving in a court during the data period. Subfigure (b) is similar, but shows only the conditional grant rates, where we regress grant of absentia on the major nationalities (Mexico, China, and Northern Triangle) along with weekly fixed effects and calculate the average grant rate over the residual. Thus, subfigure (b) partially accounts for how the differences in composition of cases across courts might drive the differences in grant rates across courts depicted in subfigure (a). The six largest courts in terms of cases seen are New York City, San Francisco, Dallas, Chicago, Los Angeles and Miami.

2.3 Anecdotal Accounts of Asylum Seeker Absentia
Why might an asylum seeker not show up to their immigration court hearing? Asylum seekers maximize their chance of receiving asylum by showing up in court, yet many are absent. Media accounts covering the prevalence of absentia amongst asylum seekers have offered a number of possible explanations for this
phenomenon. For example, missed deadlines for paperwork, a poor understanding of the judicial process,
and language barriers are frequently cited (50).
These reasons are independent of judge characteristics.
In our paper, we focus on another explanation, also covered by the media and demonstrated in anecdotal
accounts: asylum seekers may be avoiding strict judges. This avoidance could be driven by the fact that
asylum seekers fear losing their case and facing immediate deportation (the consequences of which can
be large if they have fled persecution in their home country). Media reports discussing an especially strict
judge, Stuart Couch of Charlotte, North Carolina, report: ”Many immigrants did not understand what they
were supposed to do to pursue their claims and could not connect with lawyers to guide them. Some just
stayed away, fearing they could be deported directly from courthouses and choosing instead to take their
chances in the immigration underground” (177). Law firm Geygan and Geygan, on their website, note:
”Many immigrants are frightened by the thought of being removed from the United States, and do not
appear at their immigration hearings.”
When assigned strict judges, immigrants report feeling a sense of
hopelessness that may lead to absentia. Local immigration attorney Evelyn Smallwood is cited as saying:
”Negativity permeates the community. People talk among themselves. ‘I got the meanest judge; I shouldn’t
go to court.’ They just feel the outcome is the same if they do go to court and if they don’t” (178).
Footnote 17: The LA Times quotes one story: "Ultimately, an immigration judge ordered William and his 6-year-old to be deported in 'absentia' when they didn't show up for their court hearing at U.S. Immigration Court in downtown Los Angeles. In fact, at the time the judge gave the order, William was in the building, but was three floors below the courtroom in a waiting area at the direction of an Immigration and Customs Enforcement official."
Footnote 18: Geygan and Geygan Ltd. are an immigration firm in Cincinnati. See website information page. Title: "Immigration court and the immigration judge's wide discretion."
Footnote 19: Evelyn Smallwood practices law in Durham, North Carolina as a partner at Hatch Rockers Immigration.
In order to respond to judge assignments, asylum seekers must have some idea of the leniency of their
assigned judge. Chance interactions with altruistic immigration advocates, support groups, or lawyers
(who may or may not go on to eventually represent them), as well as informal information networks of
family form one source of information on judge leniency. For example, in reporting by Reuters, Theodore
Murphy, an immigration attorney, says: ”I tell them to move” when he has a client in a jurisdiction with
a high deportation rate.
Viridiana Martinez of Alerta Migratoria, an advocacy group in Durham, North
Carolina, notes: ”We should set up billboards on the highway for people coming from the border. Keep
going, don’t stop in Charlotte!” (178). These accounts demonstrate the informational role played by a range
of indirect agents involved in the asylum process.
Another key source of information is the TRAC website, where asylum seekers can select their judge
and recover the stringency of their judge overall and relative to others. Consider Judge David Neumeister
who was appointed as an Immigration Judge in 2010. The public page on this judge for 2015-2020 states
that ”compared to Judge Neumeister’s denial rate of 42.9 percent, nationally during this same period,
immigration court judges denied 66.7 percent of asylum claims. In the Los Angeles immigration court
where Judge Neumeister was based, judges there denied asylum 74.9 percent of the time” (202).
More
informed asylum applicants may access this directly, while others are made aware of it through the information
networks described above. Many immigration law firms link to this information on their websites which
can then be found by asylum seekers seeking legal representation.
We conclude that judge leniency is readily attainable information for most asylum seekers. Moreover, asylum seekers have strong incentives to find this information given the high stakes nature of their cases.
Footnote 20: "They fled danger at home to make a high-stakes bet on U.S. immigration courts." Reuters. 17th October 2017.
Footnote 21: See Appendix Figure 5.9 for the full informational page provided on Judge Neumeister online. The same is provided for other immigration judges and for other courts. Before this information was online, informal information sharing likely played an important role.
Footnote 22: For example, the website of immigration lawyer Carl Shusterman includes a link to the TRAC information on his page "Asylum guide: helping you win your case".
3 Data and Descriptive Results
3.1 Data
We use relatively new data from the United States Department of Justice Executive Office for Immigration
Review (EOIR). This data was obtained via a series of contentious Freedom of Information Act (FOIA)
requests by the Transactional Records Access Clearinghouse (TRAC) at Syracuse University. Since it was
obtained through FOIA requests, much of the data is unlabelled, uses specialized terminology, and required
consultation with legal experts and the producers of the data themselves (the DOJ) to use it properly. Thus,
a part of this paper’s contribution is developing this new, sparsely-used, raw data into a format that can be
used for economics research. We submitted our own FOIA requests to USCIS to get further data on asylum
cases and internally validate. We submitted further FOIA requests to EOIR to obtain many additional
details needed to clarify the correct usage of this data. We also consulted with legal academic experts and
specialized immigration lawyers to further understand and process the data.
We note the following additional points about the data that are relevant and discuss these in more
detail in appendix 7. First, given our interest in asylum seekers, we remove all non-asylum cases in our
main estimates.
However, a large number of cases have no application information, and when we only
use asylum cases with non-missing application information that designates the case as asylum, our case
numbers appear low based on internal validation with the data we obtained through FOIA requests to the
DHS. Thus, for our preferred specification, we also include all cases where the application data is missing.
Note that we will additionally report results using only the non-missing asylum applications for robustness
and all results remain the same.
Footnote 23: This is consistent with the definition of asylum used by TRAC. For example, our case numbers align with the overall asylum case numbers found here: https://trac.syr.edu/phptools/immigration/asylum/
Second, we restrict to the years 2009-2015. We do this because (a) some immigration cases can take
years to resolve, so we want to make sure we only see final decisions, and (b) the Trump presidency
(January 2017 on) brought in a period of substantial upheaval in the asylum system, and so we choose to
avoid those years.
Last, based on consultations with immigration law experts, when there are several proceedings associated
with a case we use whether the first was recorded as in absentia to define if the asylum seeker was absent
or not. The vast majority, roughly 80%, of cases only have a single proceeding.
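As a rough sketch of this rule (using illustrative column names in place of the raw EOIR field codes), the case-level absentia indicator can be defined from the earliest proceeding along these lines:

```python
import pandas as pd

# One row per proceeding; several proceedings can share a case identifier.
# Column names (case_id, hearing_date, in_absentia) are hypothetical placeholders.
proceedings = pd.DataFrame({
    "case_id":      [1, 1, 2, 3, 3, 3],
    "hearing_date": pd.to_datetime(["2010-01-05", "2010-06-01", "2011-03-02",
                                    "2012-02-01", "2012-07-01", "2013-01-15"]),
    "in_absentia":  [1, 0, 0, 0, 1, 0],
})

# Keep only the first proceeding per case and take its absentia flag,
# mirroring the rule described in the text.
first_proceeding = (proceedings
                    .sort_values("hearing_date")
                    .drop_duplicates("case_id", keep="first"))
cases = first_proceeding[["case_id", "in_absentia"]].rename(columns={"in_absentia": "absentia"})
print(cases)
```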
3.2 Descriptive Results
Figure 5.4 shows that while overall legal immigration to the United States remained relatively flat from
2009-2018 as depicted in the light blue bars, the number of asylum case applications has increased significantly
in the past 10 years as seen in the black dashed line.
Despite the increase in the number of asylum
applications, particularly since 2014, the number of asylum cases granted has remained relatively flat,
shown by the black portions of the bars. This has been achieved through lower acceptance rates over time.
In particular, 88% (65%) of cases are denied from 2009-2015 amongst the asylum cases including those with
missing applications (excluding missing applications). These numbers reflect the long odds asylum seekers
face in convincing a judge that their case is valid and legally entering the country via asylum.
Panel (b) of Figure 5.4 depicts the weekly share of cases decided in absentia from 2009-2019. The
figure shows that there has been a large increase in the share of all cases decided in absentia. In 2009
approximately 10% of all cases were decided in absentia. By 2015 this rate had almost doubled to approximately
20%. The share of all cases decided in absentia has since continued to grow, and the final week of 2019
recorded 40% (note however that this was the maximum observed through all weeks in 2019).
Thus, absentia is an important factor in the asylum process. If those who are absent are indeed
using the asylum system as a loophole to remain in the United States illegally, then this graph could
be indicative of substantial manipulation of the system. However, prior to this paper there has been little
rigorous empirical evidence on the ways in which asylum seekers might ”strategically” be absent for their
immigration hearings.
Footnote 24: Note that for this subsection, we use all immigration cases and do not restrict to asylum cases nor do we restrict to the estimation sample described in Section 4 (unless otherwise specified) in order to provide a broader view of immigration to the United States.
Figure 5.4: Number of Asylum Cases, Grants, and Absentia 2009-2019
(a) Total Immigration Cases, Asylum Cases, and Grants of Asylum
(b) Share of Cases Decided in Absentia
Notes: Subfigure (a) data on total legal immigration and asylum grants obtained from the DHS immigration statistical yearbook,
Table 6 – persons obtaining lawful permanent resident status by type and major class of admission. Total asylum applications
are the summation of defensive asylum applications (EOIR adjudication statistics) and affirmative asylum applications (USCIS
monthly affirmative asylum reports). Note that there is some discrepancy between EOIR’s report of affirmative applications and
those of USCIS. For this figure we use the USCIS numbers, as this is the relevant authority that processes affirmative asylum
cases. Subfigure (b) reports the share of all cases decided in absentia, using the absentia variable as defined in the text and
derived from the EOIR data.
Figure 5.5 shows the proportion of migrants from the three largest regional groups who petition
for asylum in the United States: China, the Northern Triangle countries (Guatemala, Honduras, and El
Salvador), and Mexico. Together, these countries make up over 60% of all asylum seekers during the
period we analyze. The share of cases from the Northern Triangle has sharply increased from just under
20% in 2008 to just over 60% of cases in 2019, with noticeable surges in certain years. The share of asylum
seekers from China has decreased steadily over the past decade, and the share of Mexican migrants has
also decreased in importance, from making up just over 40% of all asylum seekers in 2009 to just under
20% in 2019. The graph also shows a large variation in the share of asylum seekers coming from different
countries even within the same year.
Figure 5.5: Share of Major Nationalities, 2009-2019
Notes: Figure depicts the weekly share of total asylum seekers from each nationality. For more information on how we construct
the data of asylum seekers see Section 3.1.
In Figure 5.6 we present descriptive evidence on the correlation between characteristics of either the
case or defendant and absentia. The regression includes all variables in the figure in a single equation with
absentia as the dependent variable. As we might expect, detention is strongly correlated with the asylum
seeker showing up for their court case, demonstrated by the large negative correlation between absentia
and detention. Note, we are not able to see if an individual is currently detained. Thus, some defendants
who were detained at some point in the past have since been released, and thus can be in absentia, which
is why detention does not automatically lead to zero absentia.
In addition, certain nationalities are more likely to be absent for their immigration hearings. In particular,
Northern Triangle asylum seekers are much more likely to be absent for their court hearings. We return to
this fact later in the analysis. In contrast, while Mexican asylum seekers are also statistically significantly
more likely to be absent, conditional on other variables like age and year this coefficient is extremely small.
We also see virtually no relationship between Chinese asylum seekers and absentia conditional on other
observables. All other correlations are intuitive.
Figure 5.6: Correlation between Case Characteristics and Absentia, 2009-2015
Notes: Figure shows correlation between absentia and other variables i.e. coefficients from a regression of absentia on all shown
variables simultaneously (with a constant included but not shown). Data constructed as described in Section 3 and Appendix
Section 7. Estimates are denoted by dots, with 99.99% confidence intervals reported in whiskers around each estimate.
4 Empirical Specification
4.1 Research Design
We estimate the impact of quasi-randomly assigned judge leniency on the probability asylum seekers are
absent for their court hearings. This implies estimating the following regression
A_{ic,jCt} = β Ẑ_{c,jCt} + W_{ct} + ϵ_{ic,jCt}    (5.1)

where Ẑ_{c,jCt} is the leniency measure for judge j in court C and month t whom individual i is assigned for case c. This judge leniency measure is calculated as follows:

Z*_{c,jCt} = Z_{c,jCt} − κ X_{Ct}    (5.2)

Ẑ_{c,jCt} = [1 / (n_{jCt} − n_{ic,jCt})] · ( Σ_{Λ_{jCt}} Z*_{c,jCt} − Σ_{Λ_{ic,jCt}} Z*_{ic,jCt} )    (5.3)
In words, we calculate the leave-out mean of acceptance decisions made by the judge. The first equation residualizes out the court by time fixed effects (in our setting court by month), X_{Ct}, as is standard in this literature (Bhuller et al. (27); Dobbie et al. (74)). This is equivalent to assuming that randomization occurs within court and time. The necessity for court fixed effects in X_{Ct} is obvious: cases are not randomized across different courts. To understand the necessity of the interaction with time fixed effects, consider the example of a judge serving in NYC in January of 2014 but who leaves the court or retires later that year. This judge will not overlap with a different judge who begins serving in NYC in December 2015. In this example, randomization of cases will not occur across these two judges. Put simply, these fixed effects address the fact that randomization only occurs amongst judges serving together in the same court in the same time period. The second equation then calculates the judge's average leniency across all cases he or she is assigned, excluding the asylum seeker's own case.
Our judge decision indicator variable, Z_{c,jCt}, is equal to one if the judge j in court C and month t issues a grant decision in case c and 0 otherwise. n_{jCt} is the number of cases seen by judge j in court C in month t; n_{ic,jCt} is the number of cases associated with individual i (the individual applicant associated with case c) and judge j in court C in month t. Then Λ_{jCt} represents the set of all cases presided over by judge j in court C in month t and Λ_{ic,jCt} represents the set of all cases that have the same immigration applicant i as in case c and seen by that judge j in court C in month t. This ensures that the second bracketed term in equation 5.3 is a sum of all judge j's decisions in court C, but removing those associated with the asylum applicant. Asylum seekers are rarely ever allowed to apply for asylum twice. Thus, in practice, we almost never have to remove additional cases for a given asylum seeker. Finally, W_{ct} represent additional controls for case c and month t.
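A minimal pandas sketch of this construction is given below; it assumes a case-level data frame with illustrative column names (grant, judge, court, month, applicant) standing in for the actual EOIR fields, residualizes the grant decision on court-by-month means (equation 5.2), and then forms each judge's leave-out mean within court and month (equation 5.3).

```python
import pandas as pd

# Hypothetical case-level file; one row per case with the columns described above.
df = pd.read_csv("eoir_cases.csv")

# Equation (5.2): residualize the grant decision on court-by-month fixed effects,
# which with a saturated set of dummies amounts to subtracting court-month means.
df["z_star"] = df["grant"] - df.groupby(["court", "month"])["grant"].transform("mean")

# Equation (5.3): each judge's mean residualized decision within court and month,
# leaving out all cases belonging to the same applicant.
by_judge = df.groupby(["judge", "court", "month"])["z_star"]
judge_sum, judge_n = by_judge.transform("sum"), by_judge.transform("count")

by_applicant = df.groupby(["judge", "court", "month", "applicant"])["z_star"]
own_sum, own_n = by_applicant.transform("sum"), by_applicant.transform("count")

# Leave-out mean; undefined only when a judge-court-month cell contains just the
# applicant's own cases, which the sample restrictions described below essentially rule out.
df["z_hat"] = (judge_sum - own_sum) / (judge_n - own_n)
```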
We leverage variability in judge leniency across judges within the same court and the same period
of time to understand how those subject to this variability across judges might respond as they move
through the court system, specifically whether they appear before their judge or are absent. Note that
in a few cases, judge assignment changes over time. We always use the first judge we observe assigned
to the case. Additionally, we use all immigration cases when calculating judge leniency. We show in
Appendix Table 5.7 that calculating judge leniency using just asylum cases or other subsets of the data is
highly correlated, and our main results hold with alternative measures. However, using all cases is our
preferred specification because most publicly available information on judge leniency is at the all-case
level. Therefore, using all cases when calculating judge leniency in our main approach better reflects the
information on judge leniency that asylum seekers have access to when making their decisions.
Footnote 25: Moreover, based on the institutional details, we expect randomization to occur across all cases, not just asylum cases, although this would also imply randomization across just asylum cases. Indeed, we find the strongest correlation between our main judge leniency measure, as opposed to other possible measures, and the raw grant rate.
If judges were not randomly assigned, then β in equation 5.1 would be biased. Random assignment
is necessary because without it we cannot be sure if stricter judges simply tend to see individuals who
have other characteristics that are correlated with absentia, as opposed to the case that individuals don’t
show up because the judge is less likely to grant the individual asylum. As Table 5.1 shows, there are large
differences between characteristics of asylum seekers across judges that could drive differences in absentia
across courts. However, with random assignment within the court, differences in asylum characteristics
cannot explain differences in absentia.
Consider the following example illustrating the pros and cons of our empirical strategy. On average,
as Figure 5.2 shows, Houston has less lenient judges than San Francisco. When we look at the data, we
also find that Houston sees more asylum seekers from Northern Triangle countries compared to a number
of other courts (80% in Houston, versus 40% in LA, 50% in Dallas, 30% in San Francisco). Figure 5.6 showed
that Northern triangle cases are more likely to be absent in general.
If we compared the impact of the judge leniency of a judge in Houston on absentia to the impact of
judge leniency of a judge in San Francisco on absentia, we could be capturing both the fact that judges in
Houston are less lenient keeping the characteristics of the cases constant, as well as the fact that Northern
Triangle asylum seekers are more likely to be absent, and a greater share of cases in Houston are from the
Northern Triangle. By comparing within courts and within time periods we isolate the impact of judge
leniency on asylum seeker behavior. However, this also comes at a cost. By removing Houston court fixed
effects, we also remove some of the true information about the underlying leniency of judges in Houston.
Thus, if we find an effect, our research design will identify the lower bound of the extent to which
asylum seekers endogenously respond to judge leniency. To understand this fact, consider the following
example. Suppose that judges in San Francisco are generally more lenient on average than judges in
Houston, as Figure 5.2 suggests. In this case, higher absentia rates in Houston could also reflect knowing
that Houston judges are less lenient on average and responding to this fact. Additionally, more knowledgeable
asylum seekers might selectively choose not to locate in Houston at all in order to avoid these harsh
judges entirely. This choice would reflect further endogeneity in response to judge leniency, but will not
be captured in our estimates given that we only identify responses to differences in judge leniency within
courts.
As with other papers using the random assignment to decision-maker research design, we include
court-by-time fixed effects (i.e. see Bhuller et al. (27); Dobbie et al. (74)). Including these fixed effects
compares asylum-seeker behavior for individuals assigned to the same set of judges. Also similar to
preceding papers, we use information about the institutional setting to impose a few additional restrictions
on the estimation sample in order to ensure random assignment. First, we drop weekend cases since some
judges may be more flexible seeing weekend cases than others, violating random assignment, similar to
the concerns in Dobbie et al. (74). Second, we drop juvenile cases as they may be assigned with their family
members or treated differently in ways that could violate random assignment. Third, we require that judges
see at least 10 cases per month to ensure sufficient observations to estimate the judge’s leniency accurately.
Fourth, we additionally require that judges see no more than 4,000 cases per year. This restriction is based
on our conversations with law experts and reading of review materials suggesting that within the asylum
system, some judges serve a ”rubber stamping” role approving cases at a higher level. Such judges would
see many cases, but cases are unlikely to be randomly assigned to these judges. Last, as with all judge fixed
effects papers, we require that all courts have at least two judges to randomize across. In Appendix Table
5.6 we show how each of these restrictions reduces the number of observations in the data in turn.
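Schematically, and with illustrative column names (case_id, hearing_date, juvenile, judge, court, month, year) in place of the raw EOIR fields, these restrictions amount to a sequence of filters like the following (a sketch, not our actual cleaning code):

```python
import pandas as pd

df = pd.read_csv("eoir_cases.csv", parse_dates=["hearing_date"])   # hypothetical input file

df = df[df["hearing_date"].dt.dayofweek < 5]       # (1) drop weekend cases
df = df[df["juvenile"] == 0]                       # (2) drop juvenile cases

per_month = df.groupby(["judge", "court", "month"])["case_id"].transform("count")
df = df[per_month >= 10]                           # (3) judges with at least 10 cases per month

per_year = df.groupby(["judge", "year"])["case_id"].transform("count")
df = df[per_year <= 4000]                          # (4) drop likely "rubber stamping" judges

judges_in_court = df.groupby("court")["judge"].transform("nunique")
df = df[judges_in_court >= 2]                      # (5) keep courts with at least two judges
```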
4.2 Validity of Random Judge Assignment
In order for our empirical strategy to identify the impact of judge leniency on absentia, three main conditions
must hold. First, judges must be randomly assigned to cases, as described above. Second, there must exist
sufficient variation amongst judges, conditional on random assignment, to be able to identify an effect.
Third, the exclusion restriction must be satisfied.
For random assignment, there is strong contextual evidence that cases are allocated randomly to judges
within a given court and time period, which we discussed in detail in Section 2. In addition to contextual
evidence, we can also test that assignment is not correlated with observable characteristics of the asylum
seekers by estimating the following regression, as in Mueller-Smith (156):
x_{i,c,t} = α + d_{c,t} + β·(Judge × Court × Month) + ϵ_{i,c,t}    (5.4)

where x_{i,c,t} represents characteristics of defendant i in court c at time t. We have limited characteristics in the asylum data, but will estimate separate regressions with the following outcomes: a dummy variable indicating if the asylum seeker is from China, a dummy variable for Northern Triangle, and a dummy variable for Mexico. d_{c,t} represent fully interacted court by week dummies, which control for variation in
types of cases courts see across courts and by week. These fixed effects limit comparison to cases at the level
at which randomization occurs. The necessity for court-fixed effects in this estimating equation is obvious,
given that cases are not randomized across cases in different courts so to test balance we must compare
cases assigned to different judges within the same court. Time-fixed effects are similarly necessary given
that judges who do not overlap in the same court during the same time period may not see the same cases
(i.e. a judge serving in the later period is more likely to see Northern Triangle migrants in a number of
courts, see Figure 5.5). Judge × Court × Month is a fully interacted set of judge by court by month dummies. The interaction with court and month allows judges to change their behavior over time or when they switch courts. We test for the joint significance of the βs for each of these characteristics. We
impose all other estimation sample restrictions described above for this exercise. Note that this test for
random assignment is equivalent to regressing judge leniency on case characteristics in a context where
there is no endogenous response by asylum seekers. However, endogenous responses by asylum seekers
could introduce correlation between judge leniency and case characteristics even in the case of random
assignment, which is why we use this particular test for random assignment which avoids that possible
complication.
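For one of the nationality indicators, the joint test can be sketched as a comparison between a restricted model with only the court-by-week dummies and an unrestricted model that adds the fully interacted judge-by-court-by-month dummies. The snippet below uses statsmodels with illustrative column names and is a schematic version rather than our estimation code, which splits courts into groups for computational reasons.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical input with columns is_china, court, week, judge, month (one row per case).
df = pd.read_csv("estimation_sample.csv")

# Restricted model: court-by-week dummies only (the level at which randomization occurs).
restricted = smf.ols("is_china ~ C(court):C(week)", data=df).fit()

# Unrestricted model: add fully interacted judge-by-court-by-month dummies.
unrestricted = smf.ols("is_china ~ C(court):C(week) + C(judge):C(court):C(month)", data=df).fit()

# Joint F-test that all judge-by-court-by-month coefficients are zero; a small statistic
# is consistent with quasi-random assignment of cases to judges, as in Panel A of Table 5.2.
f_stat, p_value, df_diff = unrestricted.compare_f_test(restricted)
print(f_stat, p_value, df_diff)
```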
We report results of this exercise in Table 5.2. We estimate joint significance separately for large courts
(New York Federal Plaza, San Francisco, Dallas, Chicago, Los Angeles and Miami) and subsets of smaller
courts. We break the estimation into these groups because including all the fixed effects in one joint
regression was not computationally feasible. We find that in all cases the joint F-statistic is less than 4
indicating that these characteristics do not strongly predict judge assignment. These estimates suggest
very minor variations in case loads across judges within the same court and week, consistent with random
assignment to judges.
The fact that the joint F-statistics suggest random assignment is especially clear when comparing the
estimates in Panel A to the estimates in Panel B which report estimates from the following equation:
x_{i,t} = α + d_t + β·Court + ϵ_{i,t}    (5.5)

where x_{i,t} represents characteristics of defendant i at time t, d_t represent week dummies, and Court is a set of court dummies.
We test for the joint significance of the βs for each of these characteristics in Panel B of Table 5.2 and
find joint F-statistics of higher than 10 in all cases. In two-thirds of cases the F-statistic is higher than
500, more than 100 times larger than the joint-F-statistics testing for random assignment to judges. These
results indicate that while judge assignment does not strongly predict nationality of cases within courts
(see Panel A), nationality of asylum seekers is strongly predictive of which courts asylum seekers choose
(see Panel B). Thus, random assignment appears to hold within court and time, but not across courts within
week.
Note that the randomization of cases to judges also holds for our estimation sample. We re-estimate
Table 5.2 using only our estimating sample of asylum applicants and those with missing application information
and report results in Appendix Table 5.8. We find that all estimates are below 3. Last, given the main joint
F-statistics suggest potentially very minor deviations from random assignment, in Appendix Table 5.5 we
go a step further and include a specification where we remove judges who show any correlation with case
characteristics.
We discuss this exercise in more detail in the Appendix, and we will also show that our main results are robust to excluding any such judges.
Footnote 26: There are institutional reasons why we might expect a small number of judges to fail random assignment, even though institutional details also indicate that random assignment of cases to judges within courts is otherwise the norm in this setting (see Section 2.2). For example, based on discussions with immigration experts, a very small number of judges may serve a rubber stamping role for only a subset of cases, which might vary relative to the case loads of normal judges. After follow-up FOIA requests on such information with EOIR, it became clear that detailed information on this type of role is unavailable. Thus, we attempt to identify such judges in the data and remove their cases.
Table 5.2: Test of Random Assignment to Judges

                           Large Courts   Courts A   Courts B   Courts C   Courts D
Panel A: Judge Dummies Joint F-Tests
China                              2.93       1.42       1.72       1.73       1.91
Mexico                             3.25       2.7        2.8        2.27       3.92
Northern Triangle                  3.55       2.36       2.89       2.01       3.97
Panel B: Court Dummies Joint F-Tests
China                          12109.77      10.87      87.13      38.32      18.73
Mexico                         17677.17    2941.28    1438.79    1983.29    2433.21
Northern Triangle                999.73     411.38     520.76     609.03    1462.71

Notes: Panel A reports estimates from equation (5.4), testing for random assignment to judges by estimating the joint significance of the judge dummies in a regression of case characteristics (whether the case is from China, from Mexico, or from the Northern Triangle) on those dummies. Panel B reports estimates of equation (5.5) and only includes week fixed effects. Panel A reports estimates using all immigration cases seen by the judge (or court for Panel B).

The second condition for our identification strategy is that conditional on removing court by time fixed effects, there is still sufficient variation in judge leniency in order to identify an effect on absentia.
Figure 5.7 demonstrates that this is true. The figure shows the distribution of judge leniency of each judge
within each court after removing court by time fixed effects. Each dot is one of the judges within the court.
Compared to Figure 5.2, there is less variation across courts. This is to be expected, since removing court
and time fixed effects precisely removes variation that is shared across judges within a given court.
However, this figure still shows a great deal of variation across judges within courts, which will help us
identify effects of being randomly assigned a more lenient judge within a court on the choice of absentia by
the asylum seeker. Across all judges, the judge leniency measure ranges from -0.24 (Judge Alan Vomacka,
of New York, Varick) to 0.31 (Judge Print Maggard, of San Francisco). This suggests that moving from
the least to the most lenient judge in our data set increases the probability of being granted asylum by 55
percentage points. While this variation is smaller than what we documented without court by time fixed
effects removed (see Section 2.2), it still indicates substantial variability in one’s likelihood of receiving
a grant of asylum, depending on the assigned judge. On average, there is still a 12 percentage point gap
between the least and most lenient judge within courts (compared to a 20 percentage point average gap
from moving to least to most lenient judge across all courts in the unconditional calculation).
Removing court by time fixed effects removes a lot of the true variation in the data, and could cause our
measure of judge leniency to no longer line up with the information that is publicly available to asylum
seekers.
Since our hypothesis is that asylum seekers make decisions based on available data, our new
measure of judge leniency would ideally line up with the information asylum seekers would most likely
have access to, namely the raw acceptance and/or denial rates. In Appendix Table 5.7 we report correlations
across different measures of judge leniency, from raw grant rates to our measure and to other ways of
calculating judge leniency. We find that they are very similar. Thus, the judge variability from our research
design that allows us to identify the causal impact of judge leniency on absentia is also largely consistent
with the measure available to asylum seekers.
Third, judge leniency must satisfy the exclusion restriction. In our context, the exclusion restriction
implies that the impact of judge leniency on absentia is only due to the judge’s relative leniency when
deciding cases. For example, if more lenient judges are also more encouraging to asylum applicants, or are
more likely to send out reminders to asylum seekers to show up for their court dates, then any results we
find could be due to these other actions which would violate the exclusion restriction. This is an untestable
assumption, as we do not observe such behavior. However, procedural details provide some evidence that
such violations are unlikely to explain our main results, with judges in this context limited to making
judgments. For example, regarding the above example of a possible violation of the exclusion restriction,
reminders or notices are not handled by the Department of Justice in which immigration judges reside.
Notices To Appear (NTAs) are issued by the Department of Homeland Security and specify the date and
time at which an applicant must appear before an immigration judge.
Note that if we were to use judge leniency to estimate impacts in an instrumental variable framework,
the instrument must also satisfy monotonicity. Here we are only interested in the first-stage estimates
which do not require monotonicity so we omit this discussion from the paper. For a good overview of
the role of monotonicity in the decision-maker fixed effects designs and potential violations, see Frandsen
et al. (96).
Footnote 27: Asylum seekers are able to look up their judges. The website providing this service calculates the simplest possible version of the denial rate, which does not exclude court and time fixed effects, although it provides comparisons across judges. Presumably word of mouth descriptions regarding which judges are stricter versus more lenient would be similarly lacking in granularity.
Footnote 28: An example of a NTA can be found on the Immigration and Customs Enforcement website and is clearly labeled as a DHS document: https://www.ice.gov/doclib/detention/checkin/NTA I 862.pdf.
Figure 5.7: Variation in Judge Leniency with Court by Month Fixed Effects Removed
Notes: Figure shows the average judge IV leniency for all judges across all courts from 2009-2015. Judges with very few (<120 per year) or very large (>4,000 per year) numbers of cases are omitted. Each blue dot represents an individual judge's grant rate,
averaged over the whole data period, after removing court by month fixed effects. In other words, each judge serving in a single
court is depicted only once from 2009-2015, where we calculate the average leniency across all months and years for each court
the judge has served in (a minority serve in more than one court from 2009-2015), conditional on court by month fixed effects.
Lines indicate the least to most lenient judge by court. The six largest courts in terms of cases are New York Federal Plaza, San
Francisco, Dallas, Chicago, Los Angeles and Miami.
5 Results
In Table 5.3 we report the main results. Appendix Table 5.9 reports robustness of these estimates to
including court-by-month fixed effects as opposed to court-by-week fixed effects. In column (1) we report
the impact of judge leniency on absentia. In columns (3) and (4) we drop the judges on the margin of failing
the balance test as a robustness check. In all cases, we find that asylum seekers quasi-randomly assigned
to more lenient judges are less likely to be in absentia, and thus more likely to show up for their court
hearings. All estimates are highly significant (p < .001) and are similar to each other.
In column (2) we repeat these results but add in controls including dummies for nationality and a
dummy for gender. Adding controls should not yield statistically significantly different estimates. We
find that the estimates in columns (1) and (2) are statistically indistinguishable. This is consistent with the
random assignment of cases to judges. Note that the large drop in the number of observations in column
(2) is due to the fact that gender is often not recorded, and may also account for the fact that this estimate is
only significant at the 95% level - it is estimated only on the subsample of data where gender was recorded
by the courts.
Last, in column (5) we restrict to only cases where the application information is not missing and the
case is designated as an asylum case. As described in Section 3, we expect that the true number of asylum
cases is somewhere between the number of observations in columns (1) and (5), but we are unable to clearly
restrict to asylum cases alone in the EOIR data. Thus, we report results using both samples. It is reassuring
that both results are consistent in sign, similar in magnitude, and remain highly significant.
Overall, these estimates indicate that asylum seekers do respond to the judge they are assigned within
this institutional setting in important ways. For absentia, our preferred estimate suggests that being
assigned a judge who is 10 percentage points more lenient corresponds to a 0.74 percentage point increased
likelihood that the asylum seeker shows up for their asylum hearing. Relative to the dependent variable
mean of 15% in absentia, this corresponds to a 4.9% reduction in absentia from a 10 percentage point
increase in judge leniency. Paired with the large variation in the leniency of judges depicted in Figure 5.2
where on average there is a 20 percentage point gap between the most and least lenient judge within the
same court, these estimates scale to asylum seekers on average being 1.5 percentage points more likely to
be present for their court hearing when moving from the least to the most lenient judge on average, which
corresponds to a 10% reduction relative to the mean absentia rate.
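Written out, this scaling uses only the column (1) coefficient and the sample means reported in the text:

```latex
% 10 pp more lenient judge:
\Delta\text{absentia} = -0.074 \times 10\,\text{pp} \approx -0.74\,\text{pp}, \qquad
\tfrac{0.74}{15} \approx 4.9\% \text{ of the mean absentia rate};
% 20 pp within-court gap (least to most lenient judge):
\Delta\text{absentia} = -0.074 \times 20\,\text{pp} \approx -1.5\,\text{pp}, \qquad
\tfrac{1.5}{15} \approx 10\% \text{ of the mean absentia rate}.
```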
Thus, these estimates suggest that there is an effect of judge arbitrariness in terms of decision leniency
on asylum seeker behavior, although it is unlikely that this endogenous response explains the majority of
absences in this context. Nonetheless, recall that this is only one of many possible ways asylum seekers
might respond to the arbitrary variability in judge behavior. Asylum seekers might also choose courts that
on average feature more lenient judges, behavior we cannot capture with our research design. Thus, our
estimates represent a lower bound on the impact of judge variation in decision-making on asylum seeker
absentia.
To further put these results into context, we calculate the implied system-wide changes in absentia from
removing the wide variation in judge leniency. We start with the total average absences in a hypothetical
year, 15,320 (footnote 29).
First, we calculate the fall in absentia cases if all discrepancies in judge leniency within
courts are corrected, i.e. in each court the decisions judges make are corrected to be in line with the
decisions of the most lenient judge within that court (note we could also change this to the median judge’s
decisions and the results would not change since here we are assuming that it is relative leniency that
matters and the change in absentia applies linearly). In this case, conservatively using the randomized
variation only (i.e. removing court and time fixed effects from the variation, rather than using the pure
grant rates, which are what asylum seekers likely actually respond to) we find that the number of absentia
cases falls by 938 per year out of a total of 15,320 yearly absentia cases, a 6% reduction. If we instead use
grant rates without removing court by time fixed effects, the number of absentia cases falls by 1,450, a
9.5% reduction. This is likely a more accurate extrapolation, given that asylum seekers likely observe and respond,
at least approximately, to actual grant rate differences across judges within courts.
However, this estimate may still be too conservative. If we assume that the size of the effect we estimate
also applies across courts, as well as within courts, we can estimate the reduction in absences that would
occur if courts were similar in terms of grant rates. Specifically, we can extend our back-of-the-envelope
calculations to adjust absences accounting for the fact that asylum seekers may be more likely to be absent
in courts where judges are stricter across the board. This is equivalent to correcting discrepancies in
judge leniency so all judges within the entire system decide consistently. This exercise requires two main
assumptions: first, that the general strictness of judges in a court drives absentia behavior across courts (in
addition to what we have documented, namely that variation in judge stringency drives absentia within
Footnote 29: To calculate this absentia number, we calculate the average yearly absentia cases for each judge in our sample and add them together.
courts), and second, that our within-court absentia response also applies across courts. If these assumptions
hold, then conservatively using the randomized variation only (i.e. removing court-by-time fixed
effects from the variation, which in this case controls for differences in case characteristics across courts),
we find that the number of absentia cases falls by 3,120 per year out of a total of 15,320 yearly absentia cases, a roughly 20% reduction.
Again, if we instead use grant rates, the reduction in the number of absentia cases expands dramatically to
5,759 per year of the 15,320 yearly total, a 37.5% reduction, which is likely an upper bound of the possible
impact of removing the vast variability in judicial decision-making.
As such, these results suggest that the extreme variation in judge behavior may significantly distort
the asylum system in the United States. We conclude that removing judicial inconsistency could reduce
absences by asylum seekers in United States immigration courts by anywhere from 6.1% to 37.5%.
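The back-of-the-envelope counterfactual amounts to closing each case's leniency gap to the most lenient judge in its court and applying the estimated response. The sketch below illustrates the logic on a hypothetical case-level data frame with columns (court, leniency, absent); it is not our exact calculation, which also distinguishes the residualized and raw leniency measures discussed above.

```python
# Sketch of the within-court counterfactual: raise every judge's leniency to
# that of the most lenient judge in the same court and apply the estimated
# response of absentia to leniency (-0.074). Column names are illustrative.
import pandas as pd

BETA = -0.074  # estimated effect of judge leniency on absentia (Table 5.3)

df = pd.read_csv("asylum_cases.csv")  # hypothetical case-level file

# Gap between each case's judge and the most lenient judge in its court.
gap = df.groupby("court")["leniency"].transform("max") - df["leniency"]

# Predicted change in each case's absentia probability if the gap were closed.
delta_absentia = BETA * gap

baseline = df["absent"].sum()
reduction = -delta_absentia.sum()
print(f"Implied yearly reduction: {reduction:.0f} ({reduction / baseline:.1%})")
```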
Table 5.3: Asylum Seeker Responses to Judge Leniency, 2009-2015
Dependent Variable: Absentia
(1) (2) (3) (4) (5)
Judge Leniency -0.074*** -0.048* -0.083*** -0.082*** -0.073***
(0.01) (0.02) (0.01) (0.01) (0.01)
Court × Week Fixed Effects ✓ ✓ ✓ ✓ ✓
Additional Controls ✓
Robustness to Marginal Judges ✓
Stricter Robustness to Marginal Judges ✓
Excluding if Application Missing ✓
Dep. Variable Mean 0.15 0.17 0.15 0.15 0.06
Observations 321,231 77,812 312,919 303,579 139,012
Notes: Estimates of equation 5.1 in column (1). Column (2) adds controls for China, Mexico, Northern
Triangle, and gender dummies. Gender is often missing in the data, hence the much lower number of
observations. Note that when we re-estimate column (1) restricting to the same sample in column (2)
but not including controls we get a significant estimate of -0.050. Columns (3)-(4) provide robustness
to excluding cases assigned to judges on the margin of being significant in the balance test, as
described in the main text. Column (5) only includes cases where the asylum application is non-
missing and indicates it is an asylum case. See Section 3 for more details. *p < 0.05, **p < 0.01,
***p<0.001.
5.1 Additional Robustness
To support our main results, we run a number of robustness checks in addition to those already reported.
We report the results in Figure 5.8, which displays point estimates for every possible alternative specification.
Each dot on the figure corresponds to an alternative regression where we re-estimate our main estimating
equation 5.1, but with the different restrictions and/or data sets applied as indicated below the point
estimates in the x-axis labels. Our preferred specification reported in Table 5.3 is shown first, by the
far-left estimate.
We find that the main results hold with each of the possible combinations of alternative data restrictions.
In almost all cases the alternative specifications are statistically indistinguishable from our main result.
These alternative approaches include: changing the number of cases we require judges to see per month,
changing the maximum number of cases judges see in a year, including weekend cases (which are excluded
in our main analysis as these may be seen by non-randomly assigned judges), requiring that gender not be
missing (many observations in the main data set are missing gender), and requiring that nationality or entry
date be non-missing.
We also present results from heterogeneity exercises in the figure. We find that including non-asylum
cases (i.e. including all immigration cases) also results in an endogenous response to judge assignment. If
we only include the six largest courts, indicated by “main courts” (New York City, San Francisco, Dallas,
Chicago, Los Angeles and Miami), we get almost identical estimates to when we include all courts. When
dividing the data set into the earlier years (2009-2011) versus later years (2012-2015), point estimates
indicate stronger responses in the later years. While the two estimates are not statistically significantly
different from each other, this is interesting given the growing absentia rates over time, and the likelihood
that information has become even more available in more recent years. These results provide suggestive
evidence that the endogenous response to judge assignment we document in this paper
could be a growing issue.
Figure 5.8: Robustness of Judge Leniency on Absentia Results
Notes: sensitivity tests of the main result from Table 5.3, column (1), to changes in estimation assumptions. “No missing” sensitivity
tests indicate that cases without information entered for these categories are not included (possibly capturing data entry issues by
court clerks). The coefficient estimate is quite stable across sensitivities. The largest deviation occurs when cases that have missing
gender or entry date information are excluded.
6 Implications for Research Designs Using Randomly Assigned Judges
In this section we show that endogenous absentia in response to variation in leniency of randomly assigned
decision-makers (like judges) introduces bias when using decision-maker leniency as an instrument to
identify a second-stage effect. To show this, we simulate data that includes versus does not include
endogenous absentia responses to judge behavior, building on the results shown in this paper. Our simulation
mimics applying such an instrumental-variables strategy to a policy-relevant question in this setting: Does
granting legal asylum increase wages? A prior literature suggests numerous benefits of being granted legal
immigration status (65; 165).
We assume the following data generating processes, where allowing for legal employment by granting
an individual asylum increases earnings relative to a counterfactual of not granting asylum and the asylum
seeker either remaining in the country and working illegally or returning to their home country and
working legally from there. Equation 5.6 represents the impact of being granted asylum on labor market
earnings.
Y_{ict} = \beta_0 + \beta_1 Asylum_{ict} + \varepsilon_{ict}.    (5.6)
where Y_{ict} represents earnings for asylum seeker i who had a court case c in year t. Asylum_{ict} is a dummy
variable equal to 1 if asylum is granted for individual i for their court case c in year t (and 0 otherwise).
\varepsilon_{ict} is the error term. The coefficient of interest is \beta_1.
Of course, whether an asylum seeker is granted asylum might depend on unobservable variables that
are correlated with both their future labor market earnings and being granted asylum. This fact presents
the key endogeneity problem, and will cause estimates of \beta_1 to be biased in a simple OLS framework. To
simulate this bias, we generate data where being granted asylum is correlated with some unobservable
characteristic P which is also correlated with earnings.
To solve this endogeneity challenge, one might use random judge assignment as an instrumental
variable. As we have shown, there is random assignment in this context. Thus, we can use equation
5.7 below as a first stage, where we use the randomly assigned judge leniency to instrument the grant
of asylum, and then estimate the second-stage depicted in equation 5.6. With no endogenous response
to judge assignment, random assignment of judges, monotonicity, and no violations of the exclusion
restriction, this approach should recover the local average treatment effect (LATE).
Asylum_{ict} = \alpha_0 + \alpha_1 Z_{cjt}    (5.7)
We generate data consistent with these assumptions, and describe this process in more detail in Appendix
7. We report results from 1,000 Monte Carlo simulations of a data generating process simulating the above
formulation of the problem, and then later add the possibility of endogenous absentia as described below.
We produce data with 5,000 observations for each simulation.
Table 5.4 reports second-stage estimates from this simulation with and without endogenous individual
responses to judge leniency. Column (1) reports the true parameters used to produce the data that we
are trying to recover. Appendix Table 5.10 reports the remainder of the parameter values from equations
5.6 and 5.7. Column (2) reports estimates when there is no endogenous response to the judge, and shows
that the IV approximately recovers the LATE. When we estimate the parameter from column (1) in a 2SLS
set-up we get precisely the parameter of interest when averaged over the 1,000 Monte Carlo Simulations,
as shown in column (2).
After producing the data and estimating the problem without endogenous absentia and showing that
we correctly recover the parameter of interest, we now introduce endogenous absentia. We have documented
that such endogenous absentia occurs in this setting in Section 5 and is correlated with both judge leniency
and observed asylum seeker characteristics. We showed that the former occurs in Table 5.3, and the latter
in Figure 5.6. The data generating process that produces the endogenous response is captured in our main
estimating equation 5.1. We assume that this endogenous absentia is also correlated with some P, i.e. the
unobserved characteristic that drives the endogeneity concerns in the first place. For example, individuals
might choose to be in absentia for court hearings selectively based on their case being stronger or weaker,
and the strength of the case may be correlated with future earnings after being granted (or denied) asylum.
In addition, we impose that thisP is also correlated with earnings. In practice, we assume that only those
who will eventually make above median earnings after being granted asylum are endogenously absent.
Formally, we re-simulate the data including this endogenous absentia, varying the amount of endogenous
absentia that occurs. Specifically, we assume in column (3) that 5% of asylum-seekers will not show up
if their judge’s grant rate (i.e. the judge’s leniency) is below 50% and their expected earnings are above
median. In column (4) we assume that 15% won’t show up if the judge’s grant rate is below 50% and their
expected earnings are above median. In column (5) we assume that 25% won’t show up if the judge’s grant
rate is below 50% and their expected earnings are above median. In column (6) we assume that 35% won’t
show up if the judge’s grant rate is below 50% and their expected earnings are above median.
We then re-estimate the impact of asylum on earnings, again in a 2SLS framework using judge assignment
as the instrument. Results are reported in columns (3)-(6), where with each additional column we increase
the number of observations who are at risk of endogenous response to judge assignment as described
above. We see immediately that there is bias in the estimate of the true parameter from the data generating
process indicated in column (1) versus the estimates obtained using 2SLS with endogenous absentia reported
in column (3). This bias grows in columns (4)-(6) as we increase the amount of endogenous absentia. By
columns (5) and (6) the 95% confidence intervals no longer even contain the parameter of interest.
Thus, we see that the LATE we estimate is biased when there is endogenous absentia, and the bias
grows larger as the amount of endogenous absentia increases. Note that this bias occurs even though
the percent of observations that are endogenously absent is quite small, from only 1.1% of observations
endogenously absent in column (3) up to 7.9% of observations missing in column (6). This is well below
the 15% absentia rate we observe in our actual data. This reflects both that we only allow between 5%-35%
to potentially even take such a risk in the simulation, and, moreover, they will only be absent if both their
judge leniency is below 50% and their earnings are above median.
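The endogenous-absentia columns amount to deleting a selected subset of observations before estimating the same 2SLS. A self-contained sketch of that selection rule, for a single draw and a single share s, is below; as above, the exact data-generating details are assumptions carried over from the previous sketch.

```python
# One draw of the Section 6 simulation with endogenous absentia: a share s of
# asylum seekers skip their hearing when their judge's leniency is below 0.5
# and their potential earnings are above the median, and those observations
# are dropped before 2SLS. Details beyond the text are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, s = 5_000, 0.35  # s corresponds to columns (3)-(6): 0.05, 0.15, 0.25, 0.35

P, index = rng.multivariate_normal([15, 20], [[1.0, 0.5], [0.5, 1.0]], size=n).T
Z = rng.uniform(0, 1, n)
asylum = (index + 2 * Z > 20).astype(float)
Y = 10 + 10 * asylum + P + rng.normal(0, 0.5, n)

# Endogenous absentia: among a random share s, drop cases facing a strict judge
# (leniency below 0.5) whose potential earnings are above the median.
at_risk = rng.uniform(0, 1, n) < s
absent = at_risk & (Z < 0.5) & (Y > np.median(Y))
keep = ~absent

def two_sls(y, d, z):
    X1 = np.column_stack([np.ones_like(z), z])
    d_hat = X1 @ np.linalg.lstsq(X1, d, rcond=None)[0]
    X2 = np.column_stack([np.ones_like(d_hat), d_hat])
    return np.linalg.lstsq(X2, y, rcond=None)[0][1]

print("share endogenously absent:", absent.mean())
print("2SLS estimate with endogenous absentia:",
      two_sls(Y[keep], asylum[keep], Z[keep]))
```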
Thus, if we were to use random assignment of judges to try to estimate the effect of asylum on
earnings in this context, we would fail to obtain unbiased estimates of the coefficient of interest due to
the endogenous attrition caused by asylum seekers choosing to be absent for less lenient judges, and this
choice also being correlated with future earnings.
We close with advice on the type of settings where researchers should be particularly cautious in
implementing random assignment of judges as an instrument to identify outcomes. Specifically, researchers
should check for endogenous responses to judge assignment in settings like the asylum system where the
following four conditions are met: 1) people are subject to arbitrary decision-makers, 2) the variation in
decision-maker behavior is large, 3) information on this variation is publicly available, and 4) there are
actions that can be taken in response. This result contributes to the growing literature scrutinizing the
popular judge fixed effects instrument. In some cases, researchers can directly test for responses to judge
assignment, as we do in our main results. Unfortunately in other contexts the endogenous response may
not be observable and this becomes an untestable assumption for the decision-maker research design.
7 Conclusion
This paper shows that there is large variation in grant rates across judges in immigration courts in the
United States and asylum seekers respond to this variation. We find that an asylum seeker assigned a
judge who is 10 percentage points more lenient is 0.74 percentage points more likely to show up for their
court hearing, a 4.9% increase relative to the dependent variable mean. Given an average gap of 20 percentage points from the
least to the most lenient judge within courts, this scales to a 1.5 percentage point
decrease in absentia when moving from the least to most lenient judge, i.e. a 10% decrease in absentia
relative to the mean. Thus, our results indicate a modest but important endogenous response by asylum
seekers to the leniency of the judge they are assigned.
Our results have broader implications for judicial settings where judges exhibit differences in their
behavior or, indeed, any setting with wide variation in decision-maker behavior where those subject
to these decision-makers might endogenously respond in some capacity. Our results indicate that such
systems may not work as intended and can be distorted by the reactions of those subject to such caprice.
We also show that if this type of endogenous response occurs, random assignment of decision-makers
cannot be used as an instrument to identify effects of policies. This suggests some caution when using this
popular approach in economics to identify effects in otherwise challenging settings.
This paper also has important implications for the United States asylum system. One takeaway is
that allowing such wide variability in judge behavior, such that life or death determinations of asylum
hinge on a chance assignment, is less than ideal. Our paper suggests that a system with more uniform
decision-maker behavior would be less prone to distortions. Such a system would surely also be more just.
Table 5.4: Simulation of Bias with Endogenous Response to Randomly Assigned Judge Leniency
Truth Estimated with Different Shares of Observations Subject to Endogenous Absentia
0% 5% 15% 25% 35%
(1) (2) (3) (4) (5) (6)
Second-Stage Estimates Using 2SLS
Impact of Asylum Granted on Earnings 10 10.000 10.057 10.169 10.280 10.390
95% Confidence Interval [9.765, 10.223] [9.827,10.276] [9.952,10.372] [10.077, 10.481] [10.188,10.583]
Percent Bias 0% 0.5% 1.7% 2.8% 3.9%
Share Endogenously Absent 0% 1.1% 3.4% 5.6% 7.9%
Earnings Mean 33 33 33 33 33
Observations 5000 5000 5000 5000 5000
Notes: Second-stage estimates from a simulation estimating the impact of asylum granted on earnings. The dependent variable is earnings. The true
coefficient from the underlying data generating process is reported in column (1). For all other values generating the simulated data, see Appendix
Table 5.10. Estimates of equation 5.6 are reported in columns (2)-(6). In column (2) the estimate is for the case with no endogenous absentia, and we then
add a percent of observations who respond to their judge assignment. Specifically, 5% in column (3), 15% in column (4), 25% in column (5), or 35% in column (6) of
the observations do not appear, rendering the asylum grant variable and earnings missing, if their assigned judge’s grant rate is less than 50% and their expected earnings are above median.
Appendix
Additional Institutional Details
Asylum is distinguished from refugee status, under which individuals apply from within their home countries to enter the United
States to flee persecution. A further technical distinction exists within the asylum category: that of
affirmative and defensive asylum cases. Affirmative asylum applicants apply legally from within the United
States. A defensive asylum applicant applies for asylum as part of deportation proceedings after being
caught illegally present within the United States. In the analysis in this paper we pool all asylum cases
together.
In Figure 5.1, we outline the process for asylum starting with the USCIS decision. In this paper we focus
on the judicial stage of the asylum process, which involves data from the Department of Justice Executive
Office for Immigration Review (hereafter EOIR). This agency is contained within the Department of Justice
(DOJ) and provides immigration judges that oversee both affirmative asylum applicants that appeal their
case and all defensive applicants. Note that the EOIR will not interact with applicants that are either
granted asylum by USCIS or who do not appeal a USCIS denial. In practice, whether an applicant has a
legitimate asylum case thus becomes a legal question, with a vast array of precedents and legal decisions
determining whether a particular individual case constitutes a valid asylum case (footnote 30).
The following asylum applicant categories describe different routes to applying for asylum within the
US:
1. Enter on a valid visa (tourist, student, business etc) and then apply for asylum on or before the time
when the visa ends.
2. Enter on a visa, overstay the visa, and at some later point either apply for asylum or get caught by
either a U.S. border patrol agent or any law enforcement officer and ask for asylum.
3. Approach a Border Patrol Agent at a port of entry and pass a credible fear test.
Footnote 30: Examples of how nuanced and legalistic the true definition of asylum becomes include Barajas-Romero v Lynch (2017), which
established that persecution by police was the same as political persecution, and that an applicant does not have to prove he/she
could relocate within his or her country to avoid persecution; and Luz Marina Cantillano Cruz v Sessions (2017), which established
that belonging to a family was a social group that could face persecution and therefore qualify under the “Convention Ground” of
social group, even though the applicant was fleeing gang violence, although this was later challenged by the Trump administration.
See the International Journal of Refugee Law for many other useful case summaries.
4. Enter illegally into the United States with no visa and when subsequently caught by either a US
border patrol agent or any law enforcement officer claim asylum.
Potential asylees in asylum applicant category 1 will be classed as affirmative, as will those in category 2
that apply of their own volition. Similarly, those in category 3 will also be classified as affirmative since
they apply legally; such applicants would typically be allowed to enter the United States pending progress
in their case, although some are detained, especially more recently. Asylees in category 2 that
are caught rather than applying of their own volition are defensive applicants, as are those in category 4.
There are five main governmental institutions that process asylum applicants, each treating different
asylum applicant categories. First, there is Customs and Border Protection (CBP), which is nested within
the Department of Homeland Security (DHS) and is the first agency that comes into contact with anyone
entering the US. Therefore all asylum applicants in categories 1-3 will pass through this agency. Category
4 applicants never interact with the CBP since CBP only has officers at ports of entry; however, migrants
who unsuccessfully attempt to enter under category 4 may be caught by CBP at a port of entry. Second, Immigration
and Customs Enforcement (ICE) is nested within the DHS and has the most convoluted set of responsibilities.
ICE is responsible for the removal of unsuccessful asylum applicants; therefore, applicants in category
1 and category 2 who apply of their own volition and are successful may never interact with ICE, whereas
those who are unsuccessful may only interact with ICE at the end of the process. However, ICE is also
responsible for the detention of applicants, which creates an exception for these categories: they may
interact with ICE earlier in the process if it is judged that they must be detained. Third is United States
Citizenship and Immigration Services (USCIS), which is responsible for an initial screening of affirmative
asylum applicants. Fourth and fifth are the Executive Office for Immigration Review (EOIR) and the
Department of Justice (DOJ), which oversee the judicial proceedings of the asylum case. In this paper we
focus on these last two entities.
Information on Judges
In order for asylum seekers to respond to their judge assignment, it must be possible for them to a) know
who their judge is and b) have some information on how lenient their judge is. The first condition is met.
In fact, one immigration lawyer in Texas whom we interviewed in preparation for this paper stated that
when a potential client approached her who had been assigned a judge all but certain to deny their case,
she would tell the individual that they would be wasting their money on representation and encourage
them not to pay for counsel as it would do them no good given the judge assignment. Again, this type
of statement is consistent with the notion that when it comes to immigration courts, “your judge is your
destiny” (198).
Regarding the second condition, it is clearly true currently that information on the leniency of one’s
judge is easily accessible. Below, we provide a screenshot from the TRAC website demonstrating how
such information might be accessed by asylum seekers. The figure is obtained by going to the website and
entering in the selected judge’s name.
We can see from the screenshot that based on entering the judge’s name, the website provides information
on the denial rate of the judge himself, as well as how he compares to other judges in the same court, and
all judges in the United States. In addition to this publicly available information, asylum seekers may also
learn about how strict or lenient judges are from other asylum seekers, their lawyers (if they have one)
or through other informal information sharing channels. Based on our discussions with those involved in
this system, it is often easy to find out and is well known which judges are stricter and which are more
lenient.
Figure 5.9: Publicly Available Judge Data: Example for Judge Neumeister
Note: Figure shows the information that asylum seekers can access online that is currently publicly available. For the original,
see: https://trac.syr.edu/immigration/reports/judgereports/00364LOS/index.html. Last accessed March 26, 2021.
Balance Additional Checks
Given that the main joint F-statistics in Table 5.2, Panel A, suggest potentially very minor deviations from
random assignment in some cases, in Appendix Table 5.5 we go a step further and include a specification
where we remove judges who show any correlation with case characteristics (footnote 31).
Panel A replicates the results from the main text for convenience. Under our strictest cutoff we exclude
judges for whom the coefficient on their dummy is significant at the 5% level. Results are reported in Panel
C of Table 5.5. We find that after imposing this restriction all of our joint F-statistics are even smaller,
under 3, and more than half are under 2. This suggests even more minor deviations from randomization.
We will report estimates using this alternative set of judges in all of our main results in addition to the
full sample estimates to be as certain as possible that we are identifying the effects of a randomly assigned
judge on asylum seekers’ absentia outcomes.
We additionally impose a more lenient cutoff where we only exclude judges for whom the coefficient
on their dummy is significant at the 1% level (this is more lenient as it removes fewer judges). These results
are reported in Panel B. In the main results that follow, we will always report robustness to this alternative
group of judges as well.
Footnote 31: There are institutional reasons why we might expect a small number of judges to fail random assignment, even though
institutional details indicate that random assignment of cases to judges within courts is otherwise the norm in this setting (see
Section 2.2). For example, based on discussions with immigration experts, some judges may serve a rubber-stamping role for only
a subset of cases, which might vary relative to the case loads of normal judges. After follow-up FOIA requests on such information
with EOIR, it became clear that detailed information on this type of role is unavailable. Thus, we attempt to identify such judges
in the data and remove their cases.
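As a concrete illustration of this exclusion rule, the sketch below regresses one case characteristic on judge dummies with court-by-week fixed effects, tests the judge dummies jointly, and flags judges whose individual dummy is significant at the chosen threshold. The column names (china, judge, court, week), the input file, and the use of statsmodels are assumptions for illustration only.

```python
# Sketch of the balance/exclusion procedure: regress a case characteristic
# (e.g., a China-origin dummy) on judge dummies plus court-by-week fixed
# effects, test the judge dummies jointly, and flag individually significant
# judges for exclusion. Column names are illustrative.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("asylum_cases.csv")  # hypothetical analysis file
df["court_week"] = df["court"].astype(str) + "_" + df["week"].astype(str)

full = smf.ols("china ~ C(judge) + C(court_week)", data=df).fit()
restricted = smf.ols("china ~ C(court_week)", data=df).fit()

# Joint F-test of the judge dummies (as in Panel A).
print(anova_lm(restricted, full))

# Judges whose own dummy is significant at the 1% or 5% threshold are dropped
# in the robustness samples (Panels B and C, respectively).
judge_pvals = full.pvalues[full.pvalues.index.str.startswith("C(judge)")]
print("judges flagged at 5%:", (judge_pvals < 0.05).sum())
```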
Table 5.5: Test of Random Assignment to Judges
Large Courts Courts A Courts B Courts C Courts D
Panel A: Judge Dummies Joint F-Tests
China 2.93 1.42 1.72 1.73 1.91
Mexico 3.25 2.7 2.8 2.27 3.92
Northern Triangle 3.55 2.36 2.89 2.01 3.97
Panel B: Judge Dummies Joint F-Tests Excluding at 1% Threshold
China 2.47 1.32 1.56 1.77 1.90
Mexico 3.03 2.52 2.85 2.02 2.31
Northern Triangle 3.12 2.28 2.92 1.62 2.47
Percent of Observations Removed
China 12% 0% 0% 1% 0%
Mexico 4% 6% 0% 10% 23%
Northern Triangle 1% 10% 0% 5% 9%
Panel C: Judge Dummies Joint F-Tests Excluding at 5% Threshold
China 2.21 1.32 1.57 1.78 1.91
Mexico 2.48 2.22 2.83 1.87 2.09
Northern Triangle 2.63 2.05 2.87 1.47 2.14
Percent of Observations Removed
China 27% 1% 0% 1% 0%
Mexico 14% 16% 1% 22% 34%
Northern Triangle 2% 34% 3% 11% 15%
Panel D: Court Dummies Joint F-Tests
China 12109.77 10.87 87.13 38.32 18.73
Mexico 17677.17 2941.28 1438.79 1983.29 2433.21
Northern Triangle 999.73 411.38 520.76 609.03 1462.71
Notes: Panel A reports estimates from equation (5.4) testing for random assignment
to judges by estimating the joint significance of a regression of judge dummies on
the characteristics of cases assigned: whether the case is from China, from Mexico,
or from the Northern Triangle. Panel B reports estimates where we remove judges as
described in text. Panel C does the same as Panel B, but removes judges at a stricter
threshold, i.e. when the judge dummy is significant at the 5% threshold. Panels A-C
include court by week fixed effects. Panel D reports estimates of equation 5.5 and
only includes week fixed effects. All panels report estimates using all immigration
cases seen by the judge for Panels A-C (or by the court for Panel D).
Data Construction
Data Sources and Construction
Our data comes from the Executive Office for Immigration Review (EOIR). This data was made publicly
available in the last several years as a result of Freedom of Information Act (FOIA) requests and litigation
undertaken by Transactional Records Access Clearinghouse (TRAC) at Syracuse University. The new
availability of this data allows researchers to answer a range of relevant questions on asylum seekers
and immigration in the United States. Nevertheless this data is difficult to sort through, and requires
substantial expertise, prior research, and legal interpretation to use appropriately. We are grateful to legal
scholars Emily Ryo, Ingrid Eagly, Simon Shafer, Banks Miller and Jaya Ramji-Nogales for assistance with
using and interpreting the data. Similarly we are indebted to TRAC for their work in making this data
available for scholarly use. Finally, we also received substantial assistance from the EOIR FOIA legal team
themselves who, through a series of FOIA requests we made, gave us information on interpreting the legal
data.
In this section, we describe in detail the construction of the main data sets we use in the analysis
in this paper and the relevant assumptions made. The EOIR data comes in the form of a series of large
spreadsheets, many of which are unclear, contain missing cells and require substantial reshaping and
interpretation. In the list below, we document challenges associated with the data that we address as
part of our dataset construction:
1. Data includes all cases: The dataset contains many immigration cases that take place for many
different legal reasons. In this paper we are only examining asylum cases. However, these observations
are not straightforward to identify. We use applications data to rule out and exclude those cases that
do not have an asylum application, with the exception of calculating the judge leniency measures
as described in more detail in the text, although using alternative cuts of the data to calculate judge
leniency yields similar results. See Table 5.7.
2. Case with no decision/absentia: some cases contain no entry in these fields. Such cases are omitted
from the analysis.
3. Data set years: the EOIR data set covers cases with most recent hearing as far back as the 1980s. This
data set includes cases of immigrants that may have even entered the US in the 1950s. However, this
very early data has the issue that some of the cases will not be fully covered. We choose a year
range of 2009-2019 for the latest hearing date as a reasonable ten-year period to present descriptive
statistics, and focus on the years 2009-2015 for the main analysis for the reasons described in the
text.
4. Proceeding numbers: most cases have a single “proceeding” associated with them. However, a small
fraction of cases (roughly 20%) have more than one proceeding. An even smaller fraction, roughly 5%, have more
than two. We only keep the latest six proceedings associated with a case.
5. Decision variable: for cases with multiple proceedings, we take the decision from the last proceeding
that has a decision; that is, some cases will have a latest proceeding entry which has no decision, in which
case the penultimate proceeding entry’s decision will be used for that case. For approximately 80%
of cases this is not relevant, since these cases only have one proceeding.
6. Absentia variable: we take absentia from the first proceeding for cases where there are multiple
proceedings. A minimal code sketch of steps 4-6 follows this list.
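The following is a minimal pandas sketch of how steps 4-6 above can be operationalized, using hypothetical column names (case_id, proceeding_number, decision, absentia); the real EOIR extracts require considerably more reshaping and cleaning than is shown here.

```python
# Sketch of steps 4-6: absentia from the first proceeding of each case, at most
# the latest six proceedings kept, and the decision taken from the last
# proceeding with a non-missing decision. Column names are illustrative.
import pandas as pd

proc = pd.read_csv("eoir_proceedings.csv")  # hypothetical proceedings file
proc = proc.sort_values(["case_id", "proceeding_number"])

# Step 6: absentia from the first proceeding of each case.
absentia = proc.groupby("case_id")["absentia"].first()

# Step 4: keep only the latest six proceedings associated with each case.
proc = proc.groupby("case_id").tail(6)

# Step 5: decision from the last proceeding that has a non-missing decision.
decisions = proc.dropna(subset=["decision"]).groupby("case_id")["decision"].last()

cases = pd.concat([decisions, absentia], axis=1).reset_index()
```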
Different Data Subcuts
Based on our assumptions above, we constructed multiple different data sets from the raw data. In addition
to our main analysis, we estimate robustness checks on our main conclusions using alternative cuts of the
data. These different data sets are summarized in Table 5.6. Data set 1 includes all cases and therefore
represents the widest possible set of data. Data set 2 is the main data set we use in this analysis, where
application data is used to reduce the data set to just asylum cases. Data set 3 further restricts to only
those cases with an asylum application. The key difference between data sets 2 and 3 is that data set
2 also includes cases that have no application at all, since some of these may be asylum applications but
are missing this variable. We show our main results are robust to using either data set 2 or data set 3.
In the subcategories we show why in our main analysis the number of observations is actually smaller:
for a variety of reasons, many observations are missing information on the judge assigned, the decision,
absentia, and so on. Cases that are missing such key variables must be excluded from the analysis.
Table 5.6: Main Data Sets
Data Set Description Obs Cumul. Removed
1 All Data 2,759,847 - -
2 With asylum application or missing application 2,350,376 2,350,376 -
2.1 Judge leniency non-missing 676,294 676,294 1,674,082
2.2 Absentia non-missing 1,552,066 666,045 798,310
2.3 More than 1 judge 2,350,376 666,045 -
2.4 More than 10 monthly cases 2,280,485 645,162 69,891
2.5 Less than 4000 total cases 2,005,042 645,121 345,334
2.6 Pre-2016 1,199,390 399,568 1,150,986
2.7 Not juvenile 1,870,581 342,293 479,795
2.8 Not weekend 2,078,440 322,695 271,936
2.9 FE Monthly: enough obs w. restricts 2,350,300 322,619 76
2.10 FE Weekly: enough obs w. restricts 2,348,912 321,231 1,388
3 Only non-missing asylum applications 720,368 720,368 -
3.1 Judge leniency non-missing 269,931 269,931 450,437
3.2 Absentia non-missing 359,085 263,779 361,283
3.3 More than 1 judge 720,368 263,779 -
3.4 More than 10 monthly cases 691,371 254,283 28,997
3.5 Less than 4000 total cases 702,328 254,272 18,040
3.6 Pre-2016 310,401 165,142 409,967
3.7 Not juvenile 568,978 145,513 151,390
3.8 Not weekend 645,234 139,489 75,134
3.9 FE monthly: enough obs w. restricts 719,891 139,012 477
3.10 FE weekly: enough obs w. restricts 719,891 139,012 -
Notes: Table describes the different data sets available and used in the analysis, including the number
of observations left after each cut of the data to arrive at our estimation samples.
Additional Results
Table 5.7: Correlation in Judge Leniency Across Data Sets
Raw Leniency: Leniency: Leniency:
grant rate main asylum + missing asylum
(1) (2) (3) (4)
Raw grant rate 1.000
Leniency: main 0.367*** 1.000
Leniency: asylum + missing 0.364*** 0.968*** 1.000
Leniency: asylum 0.275*** 0.833*** 0.832*** 1.000
Notes: Table reports correlations across alternative approaches to calculating judge
leniency. “Leniency: main” refers to our main judge leniency measure used in this paper
which uses all immigration court cases the judge sees to calculate leniency (see Section
4 for more details). “Leniency: asylum + missing” refers to judge leniency constructed
only using asylum seeker applicants and observations with missing application data.
“Leniency: asylum” refers to judge leniency constructed only using asylum seeker
applicants.
Table 5.8: Test of Random Assignment to Judges - Estimation Sample
Large Courts Courts A Courts B Courts C Courts D
Panel A: Judge Dummies Joint F-Tests
China 2.20 2.17 2.17 2.56 2.54
Mexico 2.66 2.24 2.19 1.82 2.64
Northern Triangle 2.84 2.21 2.16 1.82 2.56
Panel B: Judge Dummies Joint F-Tests Excluding at 1% Threshold
China 2.03 1.36 1.84 1.92 2.07
Mexico 2.10 2.21 2.18 1.47 2.39
Northern Triangle 2.16 2.20 2.16 1.78 2.18
Percent of Observations Removed
China 3% 0% 0% 0% 0%
Mexico 12% 2% 0% 23% 6%
Northern Triangle 3% 1% 0% 1% 5%
Panel C: Judge Dummies Joint F-Tests Excluding at 5% Threshold
China 1.78 1.37 1.83 1.91 2.08
Mexico 1.83 1.91 2.02 1.23 2.04
Northern Triangle 1.74 2.06 2.13 1.55 1.84
Percent of Observations Removed
China 8% 0% 0% 1% 0%
Mexico 21% 16% 7% 44% 16%
Northern Triangle 7% 3% 0% 18% 12%
Notes: Panel A reports estimates from equation (5.4) testing for random assignment
to judges by estimating the joint significance of a regression of judge dummies on
the characteristics of cases assigned: whether the case is from China, from Mexico,
or from the Northern Triangle. Panel B reports estimates where we remove judges as
described in text. Panel C does the same as Panel B, but removes judges at a stricter
threshold, i.e. when the judge dummy is significant at the 5% threshold. All panels
include court-by-week fixed effects and report estimates using the main
estimation sample, i.e. those with non-missing asylum applications and those for
whom the application is missing.
Table 5.9: Asylum Seeker Responses to Judge Leniency Robustness to Month by Court
Fixed Effects, 2009-2015
Dependent Variable: Absentia
(1) (2) (3) (4) (5)
Judge Leniency -0.080*** -0.052* -0.089*** -0.088*** -0.073***
(0.01) (0.02) (0.01) (0.01) (0.01)
Court × Month Fixed Effects ✓ ✓ ✓ ✓ ✓
Additional Controls ✓
Excludes “Failing” Judges Lenient ✓
Excludes “Failing” Judges Strict ✓
Excluding Missing Application Cases ✓
Dep. Variable Mean 0.15 0.06 0.15 0.15 0.06
Observations 322,619 80,605 314,354 305,071 139,012
Notes: Estimates of equation 5.1 in column (1). Column (2) adds controls for China, Mexico,
Northern Triangle and gender dummies. Gender is often missing in the data, hence the much
lower number of observations. Columns (3)-(4) provide robustness to excluding cases assigned to
judges on the margin of failing the balance test, as described in the main text. Column (5) only
includes cases where the asylum application is non-missing and indicates it is an asylum case. See
Section 3 for more details. *p<0.05, **p<0.01, ***p<0.001.
Figure 5.10: Variation in Judge Leniency Averages By Court, 2009-2015
Notes: Figure shows average grant rates for each judge, averaged by court from 2009-2015. Judges with very few (<120 per year)
or very large (>4000 per year) numbers of cases are omitted. Blue dots represent raw court averages, averaged over all judges
serving on a court during the data period. Yellow dots are otherwise the same but show only the controlled variation – the grant
rates predicted by a regression on average court characteristics and week-year fixed effects. Therefore the difference between
the two represents unexplained variation. The six largest courts in terms of cases seen are New York City, San Francisco, Dallas,
Chicago, Los Angeles and Miami.
Simulation Additional Details
Table 5.10 below reports the parameter values we use for equations 5.6 and 5.7 when we simulate the
impact of endogenous absentia on estimating outcomes of interest using judge leniency as an instrument
in a 2SLS framework. In addition to these parameter values, we also generate data corresponding to the
variable P, which is endogenous to asylum. We do so by generating data where P and asylum are jointly
normal with means of 15 and 20, respectively, a covariance of 0.5, and standard deviations of 1 each. We
then write earnings as:
Y_{ict} = \beta_0 + \beta_1 Asylum_{ict} + P + \varepsilon_{ict}    (5.8)
We assume that we do not observe P, but it is related to both Asylum and earnings, where the relationship
to earnings is defined by the above equation, and it is jointly normally distributed with asylum.
Table 5.10: Parameters Used in the Monte Carlo Simulation
Parameter Value
(1) (2)
Second-Stage Parameters
\beta_0 10
\beta_1 10
Error Term Mean 0
Error Term Standard Deviation 0.5
First-Stage Parameters
\alpha_0 0
\alpha_1 2
Notes: Parameter values used for Monte Carlo simulations of the bias introduced by endogenous absentia when using judge leniency as an instrumental variable. See Section 6 for more details.
Note that the first stage equation, given by equation 5.7, is first simulated with asylum as a continuous
variable. Of course, in reality, asylum is discrete. Thus, we impose a cut-off rule, where we set asylum to 1
if it is above 20 and 0 otherwise. The cut-off imposed should depend on the data, and we choose a cut-off
of 20 to give us some cases where asylum is granted and others where asylum is not. The judge leniency
variable, given by Z_{cjt}, pushes overall asylum higher and thereby makes it more likely that the cut-off of
20 is reached and the asylum seeker is granted asylum.
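As a minimal illustration of this cut-off rule, the snippet below discretizes a continuous asylum index (mean 20, standard deviation 1, shifted up by \alpha_1 Z with \alpha_1 = 2) at the cut-off of 20; the distribution of Z is an assumption of the illustration.

```python
# Minimal illustration of the Appendix cut-off rule: the continuous asylum
# index is converted to a binary grant indicator at the cut-off of 20, with
# judge leniency Z shifting the index upward (alpha_0 = 0, alpha_1 = 2).
import numpy as np

rng = np.random.default_rng(2)
index = rng.normal(20, 1, 100_000)        # latent asylum index
Z = rng.uniform(0, 1, 100_000)            # judge leniency (assumed uniform)
grant = (index + 2 * Z > 20).astype(int)  # cut-off rule at 20

print("share granted asylum:", grant.mean())  # above 0.5 because Z >= 0
```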
References
[1] Abreu, D. and Brunnermeier, M. K. (2003). Bubbles and Crashes. Econometrica, 71(1):173–204. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/1468-0262.00393.
[2] Acemoglu, D., Chernozhukov, V., Werning, I., and Whinston, M. D. (2020). A multi-risk sir model with
optimally targeted lockdown. Working Paper 27102, National Bureau of Economic Research.
[3] Aguilera, J. (2022). A Record-Breaking 1.6 Million People Are Now Mired in U.S. Immigration Court Backlogs. Time Magazine.
[4] Alesina, A., Miano, A., and Stantcheva, S. (2022). Immigration and Redistribution. Review of Economic Studies.
[5] Aleta, A., Martín-Corral, D., Pastore y Piontti, A., and Moreno, Y. (2020). Modelling the impact of testing, contact tracing and household quarantine on second waves of covid-19. Nature Hum Behav, 4:964–971.
[6] Althouse, B. M., Wenger, E. A., Miller, J. C., Scarpino, S. V., Allard, A., Hébert-Dufresne, L., and Hu, H. (2020). Stochasticity and heterogeneity in the transmission dynamics of sars-cov-2. arXiv preprint arXiv:2005.13689.
[7] Altındağ, O., Bakış, O., and Rozo, S. (2020). Blessing or Burden? The Impact of Refugees on Businesses and the Informal Economy. Journal of Development Economics, 146:102490.
[8] Ash, T. (2019). A tale of two bitcoin bubbles: risk factor correlations during bubbles. Working Paper.
[9] Ash, T., Bento, A. M., Kaffine, D., Rao, A., and Bento, A. I. Disease-economy trade-offs under alternative epidemic control strategies. https://zenodo.org/record/6478460.
[10] Ash, T., Bento, A. M., Kaffine, D., Rao, A., and Bento, A. I. (2022). Disease-economy trade-offs under
alternative epidemic control strategies. Nature Communications, 13(1):3319. Number: 1 Publisher:
Nature Publishing Group.
[11] Bailey, M., Cao, R., Kuchler, T., and Stroebel, J. (2018). The Economic Effects of Social Networks:
Evidence from the Housing Market. Journal of Political Economy, 126(6):2224–2276. Publisher: The
University of Chicago Press.
[12] Baker, S. and Bloom, N. (2013). Does uncertainty reduce growth? using natural disasters as experiments. Working paper.
[13] Bald, A., Chyn, E., Hastings, J., and Machelett, M. (2022). The Causal Impact of Removing Children from Abusive and Neglectful Homes. Journal of Political Economy, 130(7):000–000.
[14] Banerjee, A. V. (1992). A Simple Model of Herd Behavior. The Quarterly Journal of Economics, 107(3):797–817.
[15] Barberis, N., Greenwood, R., Jin, L., and Shleifer, A. (2018). Extrapolation and bubbles. Journal of Finance, 129:173–204.
[16] Barrett, S. (2013). Economic considerations for the eradication endgame. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1623):20120149.
[17] Barrot, J.-N. and Sauvagnat, J. (2016). Input specificity and the propagation of idiosyncratic shocks in production networks. Quarterly Journal of Economics, 131:1543–1592.
[18] Bayer, P., Mangum, K., and Roberts, J. W. (2021). Speculative Fever: Investor Contagion in the Housing Bubble. American Economic Review, 111(2):609–651.
[19] Bayham, J. and Fenichel, E. P. (2020). Impact of school closures for covid-19 on the us health-care workforce and net mortality: a modelling study. The Lancet Public Health.
[20] Bayham, J., Kuminoff, N. V., Gunn, Q., and Fenichel, E. P. (2015). Measured voluntary avoidance
behaviour during the 2009 a/h1n1 epidemic. Proceedings of the Royal Society B: Biological Sciences,
282(1818):20150814.
[21] Beaman, L. A. (2012). Social Networks and the Dynamics of Labour Market Outcomes: Evidence from Refugees Resettled in the US. The Review of Economic Studies, 79(1):128–161.
[22] Bellman, R. (1966). Dynamic programming. Science, 153(3731):34–37.
[23] Benetton, M. and Compiani, G. (2022). Investors’ beliefs and cryptocurrency prices. Working Paper, Revision requested at Journal of Asset Pricing.
[24] Bento, A. and Rohani, P. (2016). Forecasting epidemiological consequences of maternal immunization. Clinical Infectious Diseases, 64:1298.
[25] Bernstein, S., Colonnelli, E., Giroud, X., and Iverson, B. (2019a). Bankruptcy Spillovers. Journal of Financial Economics, 133(3):608–633.
[26] Bernstein, S., Colonnelli, E., and Iverson, B. (2019b). Asset Allocation in Bankruptcy. The Journal of
Finance, 74(1):5–53.
[27] Bhuller, M., Dahl, G. B., Løken, K. V., and Mogstad, M. (2020). Incarceration, Recidivism, and Employment. Journal of Political Economy, 128(4):1269–1324.
[28] Bianchi, M., Buonanno, P., and Pinotti, P. (2012). Do Immigrants Cause Crime? Journal of the European Economic Association, 10(6):1318–1347.
[29] Bikhchandani, S., Hirshleifer, D., and Welch, I. (1992). A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy, pages 992–1026.
[30] Blei, D., Ng, A., and Jordan, M. (2003a). Latent dirichlet allocation. Journal of Machine Learning
Research, pages 993–1022.
[31] Blei, D. M., Ng, A. Y., and Jordan, M. (2003b). Latent Dirichlet Allocation. Journal of Machine Learning, 3:993–1022.
[32] Bodas, M. and Peleg, K. (2020). Self-isolation compliance in the covid-19 era influenced by compensation: Findings from a recent survey in israel: Public attitudes toward the covid-19 outbreak and self-isolation: a cross sectional study of the adult population of israel. Health Affairs, 39(6):936–941.
[33] Borjas, G. J. (2003). The Labor Demand Curve Is Downward Sloping: Reexamining the Impact of Immigration on the Labor Market. The Quarterly Journal of Economics, 118(4):1335–1374.
[34] Boustan, L., Kahn, M., and Rhode, P. (2012). Moving to higher ground: migration response to natural
disasters in the early twentieth century. American Economic Review: Papers and proceedings, 102:238–
244.
[35] Boustan, L., Kahn, M., Rhode, P., and Yanguas, M. L. (2020). The effect of natural disasters on economic activity in us counties: A century of data. Journal of Urban Economics, 118.
[36] Brell, C., Dustmann, C., and Preston, I. (2020). The Labor Market Integration of Refugee Migrants in High-Income Countries. Journal of Economic Perspectives, 34(1):94–121.
[37] Brett, T. S. and Rohani, P. (2020). Transmission dynamics reveal the impracticality of covid-19 herd immunity strategies. Proceedings of the National Academy of Sciences, 117(41):25897–25903.
[38] Brinca, P., Duarte, J. B., and Faria-e Castro, M. (2020). Measuring labor supply and demand shocks
during covid-19. Technical report, Technical report, Federal Reserve Bank of St. Louis, St. Louis, MO,
USA.
[39] Brunnermeier, M. K. (2001). Asset Pricing under Asymmetric Information: Bubbles, Crashes, Technical Analysis, and Herding. Oxford University Press, Oxford, UK; New York, 1st edition.
[40] Brunnermeier, M. K. (2017). Bubbles. In The New Palgrave Dictionary of Economics, pages 1–8. Palgrave Macmillan UK, London.
[41] Brunnermeier, M. K. and Oehmke, M. (2013). Chapter 18: Bubbles, Financial Crises, and Systemic Risk, volume 2. Elsevier.
[42] Buckee, C. O., Balsari, S., Chan, J., Crosas, M., Dominici, F., Gasser, U., Grad, Y. H., Grenfell, B.,
Halloran, M. E., Kraemer, M. U. G., Lipsitch, M., Metcalf, C. J. E., Meyers, L. A., Perkins, T. A., Santillana,
M., Scarpino, S. V., Viboud, C., Wesolowski, A., and Schroeder, A. (2020). Aggregated mobility data
could help fight covid-19. Science, 368(6487):145–146.
[43] Burchardi, K. B., Chaney, T., Hassan, T. A., Tarquinio, L., and Terry, S. J. (2020). Immigration,
Innovation, and Growth. Technical Report 27075, National Bureau of Economic Research.
[44] Bureau of Economic Analysis (2020). Gross domestic product, 2nd quarter 2020 (advance estimate)
and annual update. Quarterly report, Department of Commerce.
[45] Bursztyn, L., Chaney, T., Hassan, T. A., and Rao, A. (2021). The Immigrant Next Door: Exposure,
Prejudice, and Altruism. Technical Report 28448, National Bureau of Economic Research.
[46] Byambasuren, O., Cardona, M., Bell, K., Clark, J., McLaws, M.-L., and Glasziou, P. (2020). Estimating
the extent of asymptomatic covid-19 and its potential for community transmission: systematic review
and meta-analysis. Official Journal of the Association of Medical Microbiology and Infectious Disease
Canada, 5(4):223–234.
[47] Caballero, R. J. and Simsek, A. (2020). A Risk-Centric Model of Demand Recessions and Speculation. The Quarterly Journal of Economics, 135(3):1493–1566.
[48] Cagé, J., Hervé, N., and Viaud, M.-L. (2020). The Production of Information in an Online World. The Review of Economic Studies, 87(5):2126–2164.
[49] Caliendo, L., Parro, F., Rossi-Hansberg, E., and Sarte, P.-D. (2016). The impact of regional and sectoral productivity changes on the us economy. Review of Economic Studies, 131:2042–2096.
[50] Carcamo, C. (2017). 99% of l.a. asylum seekers - many kids - in biden program face deportation, report says. Los Angeles Times.
[51] Carey, M. and Hrycray, M. (1999). Credit flow, risk, and the role of private debt in capital structure. Unpublished working paper, Federal Reserve Board.
[52] Castillo-Chavez, C., Bichara, D., and Morin, B. R. (2016). Perspectives on the role of mobility, behavior, and time scales in the spread of diseases. Proceedings of the National Academy of Sciences, 113(51):14582–14588.
[53] Celen, B. and Kariv, S. (2004a). Distinguishing Informational Cascades from Herd Behavior in the
Laboratory. AmericanEconomicReview, 94(3):484–498.
[54] Celen, B. and Kariv, S. (2004b). Distinguishing informational cascades from herd behaviour in the laboratory. American Economic Review, pages 484–498.
[55] Chava, S. and Roberts, M. (2008). How does financing impact investment? the role of debt covenants. Journal of Finance, 63:2085–2121.
[56] Chawla, N., Da, Z., Xu, J., and Ye, M. (2017). Information diffusion on social media: does it affect trading, return and liquidity? Working paper, pages 1–54.
[57] Chen, D. L., Moskowitz, T. J., and Shue, K. (2016). Decision Making Under the Gambler’s Fallacy: Evidence from Asylum Judges, Loan Officers, and Baseball Umpires. The Quarterly Journal of Economics, 131(3):1181–1242.
[58] Cheng, I.-H., Severino, F., and Townsend, R. R. (2021). How Do Consumers Fare When Dealing with Debt Collectors? Evidence from Out-of-Court Settlements. The Review of Financial Studies, 34(4):1617–1660.
[59] Chetty, R., Friedman, J., Hendren, N., and Stepner, M. (2020). The economic consequences of r = 1:
Towards a workable behavioural epidemiological model of pandemics. Working Paper 27431, National
Bureau of Economic Research.
[60] Cheung, A. W.-K., Roca, E., and Su, J.-J. (2015). Crypto-currency bubbles: an application of the
Phillips–Shi–Yu (2013) methodology on Mt. Gox bitcoin prices. Applied Economics, 47(23):2348–2358.
Publisher: Routledge eprint: https://doi.org/10.1080/00036846.2015.1005827.
[61] Chin, A. and Cortes, K. E. (2015). The Refugee/Asylum Seeker. In Handbook of the Economics of International Migration, volume 1, pages 585–658. Elsevier.
[62] CNN (2020). These states have implemented stay-at-home orders. here’s what that means for you. https://www.cnn.com/2020/03/23/us/coronavirus-which-states-stay-at-home-order-trnd/index.html.
[63] Coase, R. H. (1960). The problem of social cost. In Classic papers in natural resource economics, pages 87–137. Springer.
[64] Cortes, K. E. (2004). Are Refugees Different from Economic Immigrants? Some Empirical Evidence on the Heterogeneity of Immigrant Groups in the US. Review of Economics and Statistics, 86(2):465–480.
[65] Cortes, K. E. (2013). Achieving the DREAM: The Effect of IRCA on Immigrant Youth Postsecondary Educational Access. American Economic Review, 103(3):428–32.
[66] Cortes, P. (2008). The Effect of Low-Skilled Immigration on US Prices: Evidence from CPI Data. Journal of Political Economy, 116(3):381–422.
[67] Cortes, P. and Tessada, J. (2011). Low-Skilled Immigration and the Labor Supply of Highly Skilled Women. American Economic Journal: Applied Economics, 3(3):88–123.
[68] Costello, C., Gaines, S. D., and Lynham, J. (2008). Can catch shares prevent fisheries collapse? Science,
321:1678–1681.
[69] Cutler, D. M. and Summers, L. H. (2020). The covid-19 pandemic and the $16 trillion virus. Jama,
324(15):1495–1496.
[70] Daniel, K. D., Litterman, R. B., and Wagner, G. (2019). Declining CO2 price paths. Proceedings of the National Academy of Sciences, 116(42):20886–20891.
[71] Deb, R., Pai, M., Vohra, A., and Vohra, R. (2020). Testing alone is insufficient. SSRN Working Paper
3593974.
[72] Diba, B. T. and Grossman, H. I. (1988). Explosive Rational Bubbles in Stock Prices? The American Economic Review, 78(3):520–530. Publisher: American Economic Association.
[73] Diekmann, O., Heesterbeek, J., and Roberts, M. (2010). The construction of next-generation matrices for compartmental epidemic models. Journal of the Royal Society, Interface, 7(47):873–885.
[74] Dobbie, W., Goldin, J., and Yang, C. S. (2018). The Effects of Pretrial Detention on Conviction, Future Crime, and Employment: Evidence from Randomly Assigned Judges. American Economic Review, 108(2):201–40.
[75] Dobbie, W. and Song, J. (2015). Debt Relief and Debtor Outcomes: Measuring the Effects of Consumer Bankruptcy Protection. American Economic Review, 105(3):1272–1311.
[76] Dolan, P., Hallsworth, M., Halpern, D., King, D., Metcalfe, R., and Vlaev, I. (2012). Influencing behaviour: The mindspace way. Journal of Economic Psychology, 33(1):264–277.
[77] Dong, E., Du, H., and Gardner, L. (2020). An interactive web-based dashboard to track covid-19 in real time. Lancet Inf Dis., 5:533–534.
[78] Doyle, J. (2008). Child Protection and Adult Crime: Using Investigator Assignment to Estimate Causal Effects of Foster Care. Journal of Political Economy, 116(4):746–770.
[79] Dustmann, C., Frattini, T., and Preston, I. P. (2013). The Effect of Immigration Along the Distribution of Wages. Review of Economic Studies, 80(1):145–173.
[80] Dustmann, C., Vasiljeva, K., and Piil Damm, A. (2019). Refugee Migration and Electoral Outcomes. The Review of Economic Studies, 86(5):2035–2091.
[81] Eagly, I. and Shafer, S. (2020). Measuring in absentia removal in immigration court. University of Pennsylvania Law Review, pages 817–876.
[82] East, C. N. and Vel´ asquez, A. (2022). Unintended Consequences of Immigration Enforcement:
Household Services and High-Educated Mothers’ Work. Journal of Human Resources, pages 0920–
11197R1.
[83] Eichenbaum, M. S., Rebelo, S., and Trabandt, M. (2020). The macroeconomics of epidemics. Working
Paper 26882, National Bureau of Economic Research.
[84] Epley, N. and Gilovich, T. (2016). The Mechanics of Motivated Reasoning. Journal of Economic
Perspectives, 30(3):133–140.
[85] Erten, B. and Keskin, P. (2021). Female Employment and Intimate Partner Violence: Evidence from
Syrian Refugee Inflows to Turkey. JournalofDevelopmentEconomics, 150:102607.
[86] Evans, G. W. (1991). Pitfalls in Testing for Explosive Bubbles in Asset Prices. TheAmericanEconomic
Review, 81(4):922–930. Publisher: American Economic Association.
[87] Farboodi, M., Jarosch, G., and Shimer, R. (2020). Internal and external effects of social distancing in a
pandemic. Working Paper 27059, National Bureau of Economic Research.
[88] Faria-e Castro, M. (2020). Fiscal policy during a pandemic. FRB St. Louis Working Paper 2020-006.
208
[Fedyk] Fedyk, A. Front Page News: The Effect of News Positioning on Financial Markets. Journal of
Finance.
[90] Fenichel, E. P. (2013). Economic considerations for social distancing and behavioral based policies
during an epidemic. Journalofhealtheconomics, 32(2):440–451.
[91] Fenichel, E. P., Berry, K., Bayham, J., and Gonsalves, G. (2020). A cell phone data driven time use
analysis of the covid-19 epidemic. medRxiv.
[92] Fenichel, E. P., Castillo-Chavez, C., Ceddia, M. G., Chowell, G., Parra, P. A. G., Hickling, G. J.,
Holloway, G., Horan, R., Morin, B., Perrings, C., et al. (2011). Adaptive human behavior in
epidemiological models. ProceedingsoftheNationalAcademyofSciences, 108(15):6306–6311.
[93] Fenichel, E. P., Kuminoff, N. V., and Chowell, G. (2013). Skip the trip: Air travelers’ behavioral
responses to pandemic influenza. PloSone, 8(3).
[94] Ferguson, N. M., Laydon, D., Nedjati-Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S.,
Boonyasiri, A., Cucunub´ a, Z., and Cuomo-Dannenburg, G. (2020). Impact of Non-Pharmaceutical
Interventions (NPIs) to Reduce COVID19 Mortality and Healthcare Demand. Imperial College London,
10:77482.
[95] Flaxman, S., Mishra, S., Gandy, A., and Samir, B. (2020). Estimating the effects of non-pharmaceutical
interventions on covid-19 in europe. Nature, 584:257–261.
[96] Frandsen, B. R., Lefgren, L. J., and Leslie, E. C. (2019). Judging Judge Fixed Effects. Technical Report
25528, National Bureau of Economic Research.
[97] Fraser, C., Riley, S., Anderson, R. M., and Ferguson, N. M. (2004). Factors that make an infectious
disease outbreak controllable. ProceedingsoftheNationalAcademyofSciences, 101(16):6146–6151.
[98] French, E. and Song, J. (2014). The Effect of Disability Insurance Receipt on Labor Supply. American
EconomicJournal: EconomicPolicy, 6(2):291–337.
[99] Gans, J. S. (2020). The economic consequences of r = 1: Towards a workable behavioural
epidemiological model of pandemics. Working Paper 27632, National Bureau of Economic Research.
[100] Geanakoplos, J. (2010). The Leverage Cycle. NBERMacroeconomicsAnnual, 24:1–66. Publisher: The
University of Chicago Press.
[101] Gentzkow, M., Kelly, B., and Taddy, M. (2019). Text as Data.JournalofEconomicLiterature, 57(3):535–
574.
209
[102] Geoffard, P.-Y. and Philipson, T. (1996). Rational epidemics and their public control. International
economicreview, pages 603–624.
[103] Gibson, M. and Mullins, J. (2020). Climate risk and beliefs in new york floodplains. Journal of the
associationofresourceeconomists.
[104] Golosov, M., Hassler, J., Krusell, P., and Tsyvinski, A. (2014). Optimal taxes on fossil fuel in general
equilibrium. Econometrica, 82:41–88.
[105] Goolsbee, A. and Syverson, C. (2021). Fear, lockdown, and diversion: Comparing drivers of pandemic
economic decline 2020. JournalofPublicEconomics, 193:104311.
[106] Gordon, H. S. (1954). The economic theory of a common-property resource: The fishery. Journal of
PoliticalEconomy, 62.
[107] Greenwood, J., Kircher, P., Santos, C., and Tertilt, M. (2019a). An equilibrium model of the african
hiv/aids epidemic. Econometrica, 87(4):1081–1113.
[108] Greenwood, R., Shleifer, A., and You, Y. (2019b). Bubbles for fama. Journal of Financial Economics,
pages 1–24.
[109] Grijalva, C., Rolfes, M., and Y, Z. (2020). Transmission of SARS-COV-2 Infections in Households —
Tennessee and Wisconsin, April–September 2020. MMWRMorbMortalWklyRep, 69(11):1631–1634.
[110] Gupta, S., Montenovo, L., Nguyen, T. D., Rojas, F. L., Schmutte, I. M., Simon, K. I., Weinberg, B. A.,
and Wing, C. (2020). Effects of social distancing policy on labor market outcomes. Working Paper
27280, National Bureau of Economic Research.
[111] Hall, S. G., Psaradakis, Z., and Sola, M. (1999). Detecting Periodically Collapsing Bubbles: A Markov-
Switching Unit Root Test. JournalofAppliedEconometrics, 14(2):143–154. Publisher: Wiley.
[112] Hanley, K. and Hoberg, G. (2019a). Dynamic interpretation of emerging risks in the financial sector.
ReviewofFinancialStudies, 32:4543–4603.
[113] Hanley, K. W. and Hoberg, G. (2019b). Dynamic Interpretation of Emerging Risks in the Financial
Sector. TheReviewofFinancialStudies , 32(12):4543–4603.
[114] Hansen, S., McMahon, M., and Prat, A. (2018). Transparency and deliberation within the fomc: a
computational linguistics approach. QuaterlyJournalofEconomics , pages 801–870.
[115] Hardin, G. (1968). The tragedy of the commons. Science, 162(3859):1243–1248.
210
[116] Hassan, T. A., Hollander, S., van Lent, L., and Tahoun, A. (2019). Firm-Level Political Risk:
Measurement and Effects*. TheQuarterlyJournalofEconomics , 134(4):2135–2202.
[117] Hassler, J. and Krusell, P. (2018). Environmental macroeconomics: the case of climate change.
HandbookofEnvironmentalEconomics, 4:1–63.
[118] He, X., Lau, E., Wu, P., Deng, J., Hao, X., Lau, Y., Wong, J., Guan, Y., Tan, X., Mo, X., Y, C., liao, B., W,
C., Hu, F., Zhang, Q., Zhing, M., Wu, Y., Zhao, L., Zhang, F., Cowling, B., and Leung, G. (2020). Temporal
dynamics in viral shedding and transmissibility of covid-19. NatureMedicine, 26:672–675.
[119] Hines, A. L. and Peri, G. (2019). Immigrants’ Deportations, Local Crime and Police Effectiveness.
[120] Hirshleifer, D., Lim, S. S., and Teoh, S. H. (2009). Driven to Distraction: Extraneous
Events and Underreaction to Earnings News. The Journal of Finance , 64(5):2289–2325. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1540-6261.2009.01501.x.
[121] Hsiang, S. (2010). Temperatures and cyclones strongly associated with economic production in the
carribbean and central america. ProceedingsoftheNationalAcademyofSciencesoftheUSA, 107:15367–
15372.
[122] Hsiang, S. and Kopp, R. (2018). An economist’s guide to climate change science. JournalofEconomic
Perspectives, 32(4):3–32.
[123] Hu, S., Wang, W., Wang, Y., Litvinova, M., Luo, K., Ren, L., Sun, Q., Chen, X., Zeng, G., Li, J., Liang,
L., Deng, Z., Zheng, W., Li, M., Yang, H., Guo, J., Wang, K., Chen, X., Liu, Z., Yan, H., Shi, H., Chen,
Z., Zhou, Y., Sun, K., Vespignani, A., Viboud, C., Gao, L., Ajelli, M., and Yu, H. (2020). Infectivity,
susceptibility, and risk factors associated with sars-cov-2 transmission under intensive contact tracing
in hunan, china. medRxiv.
[124] Hunt, J. and Gauthier-Loiselle, M. (2010). How Much Does Immigration Boost Innovation? American
EconomicJournal: Macroeconomics, 2(2):31–56.
[125] Huttunen, K., Kaila, M., Macdonald, D., and Nix, E. (2014). Financial Crime and Punishment.
UnpublishedWorkingPaper.
[126] Jamie Bedson, Laura A. Skrip, D. P. S. A. S. C. M. F. J. S. F. N. G. T. G.-V. G. C. J. R. d. A. R. E. S. V.
S. R. A. H. S. B. J. M. E. L. H.-D. . B. M. A. (2021). A review and agenda for integrated disease models
including social and behavioural factors. NatureHumanBehaviour.
[127] Jermann, U. and Quadrini, V. (2012). Macroeconomic effects of financial shocks. AmericanEconomic
Review, 102(1):238–271.
211
[128] Kahn, M. and Ouazad, A. (2020). Mortgage finance in the face of rising climate risk. NBERWorking
Paper.
[129] Kermack, W. O. and McKendrick, A. G. (1927). A contribution to the mathematical theory of
epidemics. Proceedingsoftheroyalsocietyoflondon, 115(772):700–721.
[Kindleberger] Kindleberger, C. Manias,PanicsandCrashes. New York: Basic Books.
[131] Kirkeboen, G., Vasaasen, E., and Halvor Teigen, K. (2013). Revisions and Regret: The
Cost of Changing your Mind. Journal of Behavioral Decision Making, 26(1):1–12. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1002/bdm.756.
[132] Kling, J. R. (2006). Incarceration Length, Employment, and Earnings. American Economic Review,
96(3):863–876.
[133] Kraemer, M. U. G., Yang, C.-H., Gutierrez, B., Wu, C.-H., Klein, B., Pigott, D. M., , du Plessis, L., Faria,
N. R., Li, R., Hanage, W. P., Brownstein, J. S., Layan, M., Vespignani, A., Tian, H., Dye, C., Pybus, O. G.,
and Scarpino, S. V. (2020). The effect of human mobility and control measures on the covid-19 epidemic
in china. Science, 368(6490):493–497.
[134] Kremer, M. (1996). Integrating behavioral choice into epidemiological models of aids. TheQuarterly
JournalofEconomics, 111(2):549–573.
[135] Kusner, M., Sun, Y., Kolkin, N., and Weinberger, K. (2015). From word embeddings to document
distances. InternationalConferenceonMachineLearning, pages 957–966.
[136] Laffont, J.-J. and Tirole, J. (1988). The dynamics of incentive contracts. Econometrica: Journal of the
EconometricSociety, pages 1153–1175.
[137] Larremore, D. B., Wilder, B., Lester, E., Shehata, S., Burke, J. M., Hay, J. A., Tambe, M., Mina, M. J., and
Parker, R. (2021). Test sensitivity is secondary to frequency and turnaround time for covid-19 screening.
ScienceAdvances, 7(1).
[138] Levy, R. (2021). Social Media, News Consumption, and Polarization: Evidence from a Field
Experiment. AmericanEconomicReview, 111(3):831–870.
[139] Lewnard, J. and Lo, N. (2020). Scientific and ethical basis for social-distancing interventions against
covid-19. TheLancetInfectiousDiseases , 20(64):631–633.
[140] Li, R., Pei, S., Chen, B., Song, Y., Zhang, T., Yang, W., and Shaman, J. (2020). Substantial
undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2). Science,
368(6490):489–493.
212
[141] Linos, E., Ruffini, K., and Wilcoxen, S. (2021). Reducing burnout and resignations among frontline
workers: A field experiment. AvailableatSSRN3846860.
[142] Liu, Q.-H., Bento, A. I., Yang, K., Zhang, H., Yang, X., Merler, S., Vespignani, A., Lv, J., Yu, H., Zhang,
W., Zhou, T., and Ajelli, M. (2020). The covid-19 outbreak in sichuan, china: Epidemiology and impact
of interventions. PLOSComputationalBiology, 16(12):1–14.
[143] Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., and Getz, W. M. (2005). Superspreading and the effect
of individual variation on disease emergence. Nature, 438(7066):355–359.
[144] Mastrobuoni, G. and Pinotti, P. (2015). Legal Status and the Criminal Activity of Immigrants.
AmericanEconomicJournal: AppliedEconomics, 7(2):175–206.
[145] Mayda, A. M., Peri, G., and Steingress, W. (2022). The Political Impact of Immigration: Evidence
from the United States. AmericanEconomicJournal: AppliedEconomics, 14(1):358–89.
[146] Mian, A. and Sufi, A. (2009). The consequences of mortgage credit expansion: evidence from the us
mortgage default crisis. QuaterlyJournalofEconomics , 124:1449–1496.
[147] Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013a). Efficient estimation of word representations
in vector space. ArXiv, pages 801–870.
[148] Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013b). Efficient Estimation of Word Representations
in Vector Space. arXiv:1301.3781 [cs].
[149] Milkman, K. L., Chugh, D., and Bazerman, M. H. (2009). How Can Decision Making Be Improved?
PerspectivesonPsychologicalScience, 4(4):379–383. Publisher: SAGE Publications Inc.
[150] Miller, B., Keith, L. C., and Holmes, J. S. (2014). Immigration Judges and U.S. Asylum Policy. In
ImmigrationJudgesandUSAsylumPolicy. University of Pennsylvania Press.
[151] Mina, M. J. and Andersen, K. G. (2021). Covid-19 testing: One size does not fit all. Science,
371(6525):126–127.
[152] Mistry, D., Litvinova, M., Pastore y Piontti, A., Chinazzi, M., Fumanelli, L., Gomes, M., Haque, S., Liu,
Q., Mu, K., Xiong, X., Halloran, M., Longini, I., Merler, S., Ajelli, M., and Vespignani, A. (2021). Inferring
high-resolution human mixing patterns for disease modeling. NatCommun, 323.
[153] Moser, P. and San, S. (2020). Immigration, Science, and Invention. Lessons from the Quota Acts.
WorkingPaper.
213
[154] Moser, P., Voena, A., and Waldinger, F. (2014). German Jewish ´ emigr´ es and US Invention. American
EconomicReview, 104(10):3222–55.
[155] Mossong, J., Hens, N., Jit, M., Beutels, P., Auranen, K., Mikolajczyk, R., Massari, M., Salmaso, S.,
Tomba, G. S., Wallinga, J., Heijne, J., Sadkowska-Todys, M., Rosinska, M., and Edmunds, W. J. (2008).
Social contacts and mixing patterns relevant to the spread of infectious diseases. PLOSMedicine, 5(3):1–
1.
[156] Mueller-Smith, M. (2014). The Criminal and Labor Market Impacts of Incarceration. Unpublished
WorkingPaper.
[157] Nguyen, T. D., Gupta, S., Andersen, M., Bento, A., Simon, K. I., and Wing, C. (2020). Impacts of state
reopening policy on human mobility. Working Paper 27235, National Bureau of Economic Research.
[158] Nimark, K. and Pitschner, S. (2019). News media and delegated information choice. Journal of
EconomicTheory , 181:160–196.
[159] Nordhaus, W. and Boyer, J. (2000). Warming the world: economic models of global warming. MIT
press.
[160] Nordhaus, W. D. (2007). To tax or not to tax: Alternative approaches to slowing global warming.
ReviewofEnvironmentalEconomicsandPolicy, 1:26–44.
[161] Ofek, E. and Richardson, M. (2003). DotCom Mania: The Rise and Fall of Internet Stock Prices.
The Journal of Finance , 58(3):1113–1137. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1111/1540-
6261.00560.
[Office] Office, C. B. Monthly budget review for november 2020. Monthly report.
[163] on Climate Change, I. P. (2013). Climate change 2013: the physical science basis.AR5FifthAssessment
Report.
[164] Ottaviano, G. and Peri, G. (2012). Rethinking the Effect of Immigration on Wages. Journal of the
Europeaneconomicassociation, 10(1):152–197.
[165] Pan, Y. (2012). The Impact of Legal Status on Immigrants’ Earnings and Human Capital: Evidence
from the IRCA 1986. JournalofLaborResearch, 33(2):119–142.
[166] Pastor, L. and Veronesi, P. (2006). Was there a nasdaq bubble in the late 1990s? JournalofFinancial
Economics, 81:61–100.
214
[167] Peri, G. (2012). The Effect of Immigration on Productivity: Evidence from US States. Review of
EconomicsandStatistics, 94(1):348–358.
[168] Peri, G. and Yasenov, V. (2019). The Labor Market Effects of a Refugee Wave Synthetic Control
Method Meets the Mariel Boatlift. JournalofHumanResources, 54(2):267–309.
[169] Perrings, C., Castillo-Chavez, C., Chowell, G., Daszak, P., Fenichel, E. P., Finnoff, D., Horan, R. D.,
Kilpatrick, A. M., Kinzig, A. P., Kuminoff, N. V., et al. (2014). Merging economics and epidemiology to
improve the prediction and management of infectious disease. EcoHealth, 11(4):464–475.
[170] Phillips, P. C. B., Shi, S., and Yu, J. (2015a). Testing for Multiple Bubbles: Historical Episodes of
Exuberance and Collapse in the S&p 500. International Economic Review, 56(4):1043–1078. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/iere.12132.
[171] Phillips, P. C. B., Shi, S., and Yu, J. (2015b). Testing for Multiple Bubbles: Limit
Theory of Real-Time Detectors. International Economic Review, 56(4):1079–1134. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/iere.12131.
[172] Phillips, P. C. B., Wu, Y., and Yu, J. (2011). Explosive behavior in the 1990s nasdaq: when
did exuberance escalate asset values? International Economic Review, 52(1):201–226. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1468-2354.2010.00625.x.
[173] Pons, V. (2018). Will a Five-Minute Discussion Change Your Mind? A Countrywide Experiment on
Voter Choice in France. AmericanEconomicReview, 108(6):1322–1363.
[174] Pouget, S., Sauvagnat, J., and Villeneuve, S. (2017). A Mind Is a Terrible Thing to Change:
Confirmatory Bias in Financial Markets. The Review of Financial Studies , 30(6):2066–2109. Publisher:
[Oxford University Press, The Society for Financial Studies].
[175] Prem, K., Cook, A., and Jit, M. (2017). Projecting social contact matrices in 152 countries using
contact surveys and demographic data. PLOSComputationalBiology, 13(9).
[176] Prem, K., Liu, Y., Russell, T. W., Kucharski, A. J., Eggo, R. M., Davies, N., Flasche, S., Clifford, S.,
Pearson, C. A., Munday, J. D., et al. (2020). The effect of control strategies to reduce social mixing on
outcomes of the covid-19 epidemic in wuhan, china: a modelling study. TheLancetPublicHealth .
[177] Preston, J. (2017a). Fearful of Court, Asylum Seekers are Banished in Absentia. TheMarshallProject .
[178] Preston, J. (2017b). Migrants in Surge Fare Worse in Immigration Court Than Other Groups. The
WashingtonPost.
215
[179] Rader, B., Scarpino, S. V., Nande, A., Hill, A. L., Adlam, B., Reiner, R. C., Pigott, D. M., Gutierrez,
B., Zarebski, A. E., Shrestha, M., Brownstein, J. S., Castro, M. C., Dye, C., Tian, H., Pybus, O. G., and
Kraemer, M. U. G. (2020). Crowding and the shape of covid-19 epidemics. NatureMedicine, 26(12):1829–
1834.
[180] Ramji-Nogales, J., Schoenholtz, A. I., and Schrag, P. G. (2011). RefugeeRoulette: DisparitiesinAsylum
AdjudicationandProposalsforReform. NYU Press.
[181] Rao, A., Burgess, M. G., and Kaffine, D. (2020). Orbital-use fees could more than quadruple the value
of the space industry. ProceedingsoftheNationalAcademyofSciences, 117(23):12756–12762.
[182] Roche, B., Garchitorena, A., and Roiz, D. (2020). The impact of lockdown strategies targeting age
groups on the burden of covid-19 in france. Epidemics, 33:100424.
[183] Sanche, S., Lin, Y., Xu, C., Romero-Severson, E., Hengartner, N., and Ke, R. (2020). High
contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. Emerg Infect
Dis., 7:1470–1477.
[184] Scheinkman, J. and Xiong, W. (2003). Overconfidence and speculative bubbles. Journal of Political
Economy, 111:1183–1219.
[185] Scott, A. (1955). The fishery: The objectives of sole ownership. JournalofPoliticalEconomy, 62.
[186] Sequeira, S., Nunn, N., and Qian, N. (2020). Immigrants and the making of america. The Review of
EconomicStudies, 87(1):382–419.
[187] Shiller, R. (2019). NarrativeEconomics. Princeton University Press.
[188] Shiller, R. J. (1981). Do Stock Prices Move Too Much to be Justified by Subsequent Changes in
Dividends? TheAmericanEconomicReview , 71(3):421–436. Publisher: American Economic Association.
[189] Shiller, R. J. (2003). From Efficient Markets Theory to Behavioral Finance. Journal of Economic
Perspectives, 17(1):83–104.
[190] Simsek, A. (2021). The macroeconomics of financial speculation. Annual Review of Economics,
13:13.1–13.35.
[191] Sneppen, K., Nielsen, B. F., Taylor, R. J., and Simonsen, L. (2021). Overdispersion in covid-19 increases
the effectiveness of limiting nonrepetitive contacts for transmission control. ProceedingsoftheNational
AcademyofSciences, 118(14).
216
[192] Sood, N., Simon, P., Ebner, P., Eichner, D., Reynolds, J., Bendavid, E., and Bhattacharya, J. (2020).
Seroprevalence of sars-cov-2–specific antibodies among adults in los angeles county, california, on april
10-11, 2020. Jama.
[193] Stokey, N., Lucas, R., and Prescott, E. (1989). Introduction. In Recursive methods in Economic
Dynamics, pages 3–7. Harvard University Press.
[194] Strahan, P. (2000). Borrower risk and nonprice terms of bank loans. NewYorkFederalReserveBoard
StaffReport .
[195] Tabellini, M. (2020). Gifts of the Immigrants, Woes of the Natives: Lessons from the Age of Mass
Migration. TheReviewofEconomicStudies , 87(1):454–486.
[196] Taipale, J., Romer, P., and Linnarsson, S. (2020). Population-scale testing can suppress the spread of
covid-19. medRxiv.
[197] The Associated Press (2021). At a glance: Europe’s coronavirus curfews
and lockdowns. https://abcnews.go.com/Health/wireStory/
glance-europes-coronavirus-curfews-lockdowns-75248293.
[198] Thompson, G. (2019). Your Judge Is Your Destiny. Topic.
[199] Thunstr ¨ om, L., Ashworth, M., Shogren, J. F., Newbold, S., and Finnoff, D. (2020a). Testing for covid-
19: Willful ignorance or selfless behavior? BehaviouralPublicPolicy, pages 1–26.
[200] Thunstr ¨ om, L., Newbold, S. C., Finnoff, D., Ashworth, M., and Shogren, J. F. (2020b). The benefits
and costs of using social distancing to flatten the curve for covid-19. Journal of Benefit-Cost Analysis ,
pages 1–27.
[201] Tirole, J. (1982). On the Possibility of Speculation under Rational Expectations. Econometrica,
50(5):1163–1181. Publisher: [Wiley, Econometric Society].
[202] TRAC (2020). Immigration judge reports: Judge david neumeister.
[203] USA Today (2021). Map of covid-19 case trends, restrictions and
mobility. https://abcnews.go.com/Health/wireStory/
glance-europes-coronavirus-curfews-lockdowns-75248293.
[204] Verity, R., Okell, L. C., Dorigatti, I., Winskill, P., Whittaker, C., Imai, N., Cuomo-Dannenburg, G.,
Thompson, H., Walker, P. G., Fu, H., et al. (2020). Estimates of the severity of coronavirus disease 2019:
a model-based analysis. TheLancetinfectiousdiseases .
217
[205] Weisblum, Y., Schmidt, F., Zhang, F., DaSilva, J., Poston, D., Lorenzi, J. C., Muecksch, F., Rutkowska,
M., Hoffmann, H.-H., Michailidis, E., Gaebler, C., Agudelo, M., Cho, A., Wang, Z., Gazumyan, A.,
Cipolla, M., Luchsinger, L., Hillyer, C. D., Caskey, M., Robbiani, D. F., Rice, C. M., Nussenzweig, M. C.,
Hatziioannou, T., and Bieniasz, P. D. (2020). Escape from neutralizing antibodies by sars-cov-2 spike
protein variants. eLife, 9:e61312.
[206] Winichakul, K. and Zhang, N. (2021). Enter stage left: Immigration and the creative arts in america.
WorkingPaper.
[207] Zhou, G. (2018). Measuring Investor Sentiment. Annual Review of Financial Economics, 10(1):239–
259. eprint: https://doi.org/10.1146/annurev-financial-110217-022725.
[208] Zivin, J. G. and Sanders, N. (2020). The spread of covid-19 shows the importance of policy
coordination. ProceedingsoftheNationalAcademyofSciences, 117(52):32842–32844.
218
Abstract (if available)
Abstract
This thesis, entitled “Essays on Narrative Economics, Climate Macrofinance and Migration”, contains three essays (Part 1) that develop an approach to studying narratives and two essays (Part 2) on climate change macrofinance and migration.
The first chapter demonstrates how verbal content can be used to catalog and identify financial asset bubbles. I recover time series of “textual risk factors”, economic factors that users regularly discuss, for two Bitcoin bubble episodes using social media data. I apply time series tests for “explosivity”, typically used to test for bubbles and/or “irrational exuberance” in price data, to these text series. This exercise generates three findings about the behavior of verbal content during bubble events: (1) during bubbles, explosive prices do co-occur with explosivity in some, though not all, of these textual risk factors; (2) explosivity is more present in textual risk factors measured using all tweets rather than just those with non-zero retweets, suggesting a key role for tweets typically considered unpopular; and (3) explosive textual risk factors differ across episodes, suggesting each bubble is associated with its own set of unique verbal content. These findings suggest that a procedure of checking textual risk factors for explosivity may provide a useful additional test of bubble formation. I further provide evidence on important events/verbal content associated with each Bitcoin bubble episode.
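As a purely illustrative sketch of the explosivity check described above, the snippet below computes a supremum ADF (SADF) statistic over forward-expanding windows, the standard right-tailed test for explosive behaviour, on a simulated series. This is not the dissertation's code; the function name, minimum window length and simulated data are my own assumptions.

import numpy as np
from statsmodels.tsa.stattools import adfuller

def sadf_statistic(series, min_window=30):
    """Supremum ADF statistic over forward-expanding windows (right-tailed test).
    Large positive values point to explosive behaviour; in practice critical
    values are obtained by Monte Carlo simulation under the unit-root null."""
    x = np.asarray(series, dtype=float)
    stats = []
    for end in range(min_window, len(x) + 1):
        adf_stat = adfuller(x[:end], regression="c", autolag="AIC")[0]
        stats.append(adf_stat)
    return max(stats)

# Toy example: a random-walk "textual risk factor" series with a mildly
# explosive segment grafted onto its final 50 observations.
rng = np.random.default_rng(0)
factor = np.cumsum(rng.normal(size=200))
factor[150:] += 0.05 * np.arange(50) ** 2
print(sadf_statistic(factor))

In the exercise described above, the same statistic would be computed on each textual risk factor series (and on prices) and compared against simulated critical values.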
In the second chapter, written with coauthors, I use a contagion model (a modelling strategy I employ for narratives in the following chapter) to study pandemics. Public policy and academic debates about pandemic control strategies note disease-economy trade-offs, often prioritizing one outcome over the other. Using a calibrated, coupled epi-economic model of individual behavior embedded within the broader economy during a novel epidemic, we show that targeted isolation strategies can avert up to 91% of economic losses relative to voluntary isolation strategies. Unlike widely used blanket lockdowns, the economic savings of targeted isolation do not impose additional disease burdens, avoiding disease-economy trade-offs. Targeted isolation achieves this by addressing the fundamental coordination failure between infectious and susceptible individuals that drives the recession. Importantly, we show that testing and compliance frictions can erode some of the gains from targeted isolation, but that improving test quality unlocks the majority of these benefits.
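To fix ideas, here is a minimal, self-contained sketch (my illustration, not the chapter's calibrated epi-economic model) of a discrete-time SIR system in which targeted isolation scales down only the contacts of infectious individuals while a blanket policy scales down everyone's contacts; all parameter values are hypothetical.

import numpy as np

def simulate(beta=0.3, gamma=0.1, scale_I=1.0, scale_all=1.0, days=300, i0=1e-4):
    """Discrete-time SIR shares under two stylised policies.
    scale_I   scales the contacts of infectious people only (targeted isolation);
    scale_all scales everyone's contacts (blanket lockdown).
    Returns the peak infectious share and a crude lost-activity proxy."""
    s, i, r = 1.0 - i0, i0, 0.0
    peak, lost_activity = i, 0.0
    for _ in range(days):
        new_inf = beta * scale_all * scale_I * s * i
        s, i, r = s - new_inf, i + new_inf - gamma * i, r + gamma * i
        peak = max(peak, i)
        # Blanket policy idles everyone every day; targeted policy idles only
        # the (small) infectious share, so it forgoes far less activity.
        lost_activity += (1.0 - scale_all) + (1.0 - scale_I) * i
    return peak, lost_activity

print("no policy :", simulate())
print("targeted  :", simulate(scale_I=0.4))    # same transmission cut...
print("blanket   :", simulate(scale_all=0.4))  # ...but far larger activity loss

Because both policies cut transmission by the same factor, the epidemic paths coincide, but the blanket policy idles the whole population while targeted isolation idles only the small infectious share; this is the coordination logic behind avoiding the disease-economy trade-off.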
In the third chapter I bring together methods developed in the prior two chapters to build an approach to modelling narratives. I develop a theory of talking and listening during bubble events. Agents decide how much to “listen” (e.g., read tweets about Bitcoin), which may lead them to adopt an idea (e.g., “blockchain”). Adoption changes their beliefs, which in turn affects their investment decisions. The model provides a role for language in bubbles: language determines an idea or narrative’s optimism or pessimism, but also its novelty. These two roles for language interact, driving the bubble and crash phases. Using Twitter data and modern computational linguistics techniques, I calibrate the model for several bubbles. The calibrated model can explain aggregate talking and listening on Twitter as well as the bubble price. I use the model to show that, with social media, bubbles form faster and reach larger magnitudes. The framework may be useful for modelling other economic aggregates affected by social interactions, e.g., the emergence of political, social and environmental innovations and investments.
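A stripped-down sketch of the adoption mechanism just described (illustrative dynamics of my own, not the chapter's model): agents' listening effort governs their exposure to current adopters, exposure converts non-adopters into adopters, and adoption decays slowly; all functional forms and parameters below are hypothetical.

import numpy as np

def narrative_path(listen=0.5, persuasiveness=0.8, forget=0.05, steps=200, a0=0.01):
    """Share of agents who have adopted the idea, period by period.
    The adoption hazard rises with listening effort and with how many
    current adopters are 'talking' about the idea; 'forget' lets some
    adopters drop the idea over time."""
    a, path = a0, []
    for _ in range(steps):
        exposure = listen * a                          # chance of hearing the idea
        new_adopters = persuasiveness * exposure * (1.0 - a)
        a = a + new_adopters - forget * a
        path.append(a)
    return np.array(path)

low_listening  = narrative_path(listen=0.3)  # slow spread, lower plateau
high_listening = narrative_path(listen=0.9)  # heavy listening: faster, larger take-off
print(low_listening[-1], high_listening[-1])

Raising the listening parameter, as a crude proxy for social media, produces both a faster take-off and a larger plateau, which is the comparative static highlighted above.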
In the fourth chapter, the first of Part 2, I examine the local financial effects of natural disasters to study climate change. Substantial debate exists on how financial markets will react to climate change. Given the importance of these markets to the economy, and to how the economy responds to shocks, understanding this relationship matters. I contribute to this debate by providing empirical evidence on the financial consequences of climate shocks and by building a theoretical model to parameterize the role of financial markets in the climate problem. My empirical results show that routine climate shocks drag on firms' ability to raise financing through higher credit spreads, while larger disasters cause much larger spikes in credit spreads of 60-100 basis points for the average firm. I use this evidence to motivate a model of a financial macroeconomy with climate (a DSGE model with a climate externality and a collateral constraint) and present its main results and intuition. The marginal externality damage equation from the macroclimate literature still holds, but reductions in economic output are amplified.
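For concreteness, a sketch of the generic marginal externality damage expression the paragraph refers to (my notation, not necessarily the chapter's): the damage of a marginal unit of emissions today is the expected discounted sum of the output losses it causes through the future carbon stock,

\[
\mathrm{MED}_t \;=\; \mathbb{E}_t \sum_{j=0}^{\infty} \beta^{\,j}\,
\frac{u'(C_{t+j})}{u'(C_t)}
\left(-\frac{\partial Y_{t+j}}{\partial S_{t+j}}\right)
\frac{\partial S_{t+j}}{\partial E_t},
\]

where \(E_t\) is current emissions, \(S_{t+j}\) the future atmospheric carbon stock, \(Y_{t+j}\) output, and \(\beta^{j} u'(C_{t+j})/u'(C_t)\) the stochastic discount factor. Read against the abstract, the claim is that this expression is unchanged in form, but the output losses \(-\partial Y_{t+j}/\partial S_{t+j}\) entering it are larger once the collateral constraint amplifies climate shocks.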
Finally, a further critical impact of climate change is the displacement it may create; in my final chapter I study a specific form of migration to the United States. Every year, many thousands of migrants seek asylum in the United States. Upon entry, they encounter U.S. immigration judges who exhibit large variability in their decisions. We document an average within-court gap of 20 percentage points in grant rates between the least and most lenient judges. We find that asylum seekers respond to these large discrepancies across judges. Focusing on the years 2009-2015, preceding and during a major increase in asylum applicants, we estimate that asylum seekers who are quasi-randomly assigned to less lenient immigration judges are more likely to be absent from their immigration hearings. We show that this type of endogenous response to decision-maker leniency biases second-stage estimates when randomly assigned judges and variation in judge leniency are used as an instrument. We conclude that the extreme variability in judicial decisions in United States immigration courts causes important distortions in the behavior of those subject to such caprice.
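To make the research design concrete, here is a minimal sketch (my illustration with made-up data and column names, not the chapter's code) of the standard leave-one-out judge-leniency instrument; the chapter's point is that if assignment to a strict judge also changes the probability of appearing at all, a second stage estimated on decided cases inherits a selection bias.

import pandas as pd

def leave_one_out_leniency(df, judge_col="judge_id", grant_col="granted"):
    """Leave-one-out judge leniency: for each case, the grant rate of the
    assigned judge computed over all of that judge's *other* cases."""
    grouped = df.groupby(judge_col)[grant_col]
    total, count = grouped.transform("sum"), grouped.transform("count")
    return (total - df[grant_col]) / (count - 1)

# Hypothetical case-level data: judge assignment, decision, absentia flag.
cases = pd.DataFrame({
    "judge_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "granted":  [1, 0, 1, 0, 0, 1, 1, 1, 0],
    "absent":   [0, 0, 0, 1, 0, 0, 0, 0, 1],
})
cases["z_leniency"] = leave_one_out_leniency(cases)

# If leniency predicts absence, the instrument shifts who shows up, not just
# the decision -- the endogenous response the chapter documents.
print(cases[["z_leniency", "absent"]].corr())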
Asset Metadata
Creator: Ash, Thomas (author)
Core Title: Essays on narrative economics, Climate macrofinance and migration
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Economics
Degree Conferral Date: 2023-05
Publication Date: 04/27/2023
Defense Date: 03/21/2023
Publisher: University of Southern California (original); University of Southern California. Libraries (digital)
Tags: climate change, computational linguistics, digital economics, economics of speculation, environmental economics, financial bubbles, macrofinance, migration, narrative economics, OAI-PMH Harvest, text analysis
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Kurlat, Pablo (committee chair); Hoberg, Gerard (committee member); Kahn, Matthew (committee member); Nix, Emily (committee member); Zeke, David (committee member)
Creator Email: asht@usc.edu, thomassash@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC113089386
Unique Identifier: UC113089386
Identifier: etd-AshThomas-11732.pdf (filename)
Legacy Identifier: etd-AshThomas-11732
Document Type: Dissertation
Rights: Ash, Thomas
Internet Media Type: application/pdf
Type: texts
Source: 20230501-usctheses-batch-1033 (batch); University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu