Bridging Possible Identities and Intervention Journeys: Two Process Oriented Accounts That
Advance Theory and Application of Psychological Science
By
S. Casey O’Donnell
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
PSYCHOLOGY
AUGUST 2021
Copyright 2021 S. Casey O’Donnell
Epigraphs
“There is nothing so practical as a good theory”
– Kurt Lewin in Field Theory in Social Science (1951)
“A fact, in science, is not a mere fact, but an instance.”
- Bertrand Russell in The Scientific Outlook (1931)
Dedications
To my late father, Sean Christopher O’Donnell, who always encouraged me to work hard, be
curious, be caring, and to do work that matters for the world.
To my mother, Cathryn W. O’Donnell, who continues to be an inspiring shining example of
competence, kindness, and creativity.
To my partner, Anna E. Blanken, who saw me through some of the toughest times in my life, is
an endless font of love and laughter, and whose compassionate and patient support bolstered me
when I might otherwise have buckled.
To my dog and writing partner, Dr. Denver “Bobos” O’Donnell, whose thoughtful edits included
deleting entire paragraphs that were probably no good anyway.
Acknowledgments
To my chairperson, Dr. Daphna Oyserman, for being an engaging sounding board and an immensely
supportive mentor, and to my dissertation committee members Drs. Leor Hackel, Mark Lai,
Norbert Schwarz, and Sarah Townsend who read and thoughtfully evaluated the work; my fellow
lab mates in the Mind and Society Center for their feedback and support; my partner and fellow
researcher Anna Blanken; and to my roommate at the time of writing, Matt Multach, who
endured my hyper-focused grumpiness while providing helpful guidance on developing
methodological ideas into actual R script.
TABLE OF CONTENTS

Epigraphs
Dedications
Acknowledgments
List of Tables
List of Figures
Abstract
Introduction
Chapter 1: It all comes back to school: Bridging possible identities and academic attainment
    Introduction
        Possible Self, Possible Identities
    The Current Studies
        Modeling Our Outcome Across Studies
        How We Created our Algorithms From Student Open-Ended Responses
        Analytic Strategy and Reporting Across Studies
    Study 1: Training the Outcome-Based Algorithm
        Participants
        Method
        Analysis Plan
        Results
        Discussion
    Study 2
        Participants
        Measures
        Analysis Plan
        Results
        Discussion
    Study 3
        Sample
        Method
        Measures
        Analysis Plan
        Results
    General Discussion
        Synthesis with prior research
        Limitations and future directions
    Conclusion
    Appendix A
    Appendix B
    Appendix C
    Supplemental Materials
Chapter 2: Thinking of interventions as journeys concretizes the invisible elements that make interventions work at scale
    Abstract
    Introduction
    Interventions as Journeys
    Making Progress Without an Apt Metaphor Has Proven Difficult
    Designing a Vehicle
    Designing a Path
    Making a Smart Map
    Intervention Journey: An Empirical Model
    Applying the Journey Framework to Two Identity-based Motivation Interventions
    Conclusion
References
List of Tables

Chapter 1
Table 1. Prior theories, associated predictions, and relevant features of possible identities assumed to motivate current action
Table 2. Characteristics of prior studies linking possible identities to academic outcomes
Table 3. Overview of study core research question, overview of sample, and prediction
Table 4. Training and test sample descriptions across studies
Table 5. Measures of valence, content, and structure
Table 6. Correlations of predictors with previous GPA and demographic variables
Table A1. Categories used in network analysis
Table S1. Manual corrections of commonly misspelled words
Table S2. Effects of Change in Outcome-based Algorithm Derived Possible Identity Scores on Current Year GPA
Table S3. Algorithmic Representation of Possible Identities is Specific to how Students write about their future
Table S4. Correlations among operationalizations of possible identities at the beginning of the 8th grade with 6th and 7th grade GPA
Table S5. Correlations among operationalizations of possible identities at the beginning of the 8th grade with student demographics
Table S6. Relation between number of expected possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy
Table S7. Relation between number of avoided possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy
Table S8. Relation between balanced school-focused possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy
Table S9. Relation between plausible school-focused possible identities and strategies and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy
Table S10. Relation between school betweenness and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy
List of Figures

Chapter 1
Figure 1. Network diagram with highly between node (red) and highly connected node (blue)
Figure 2. How we created our algorithms from student responses
Figure 3. Flow of the Possible Identities and Strategies Measure
Figure 4. Change in algorithm-based possible identity scores predict end of year GPA controlling for school, prior GPA, and demographics
Figure 5. Change in identity-based scores of student possible identities and strategies predicts end of year GPA and is not accounted for by other writing-based scores
Figure 6. Network representation of text
Figure 7. Correlations of predictors with previous GPA and demographic variables
Figure 8. School betweenness is the only feature of responses robustly related to possible identity scores
Figure A1. Visualization of how we measured possible identity and strategies
Figure S1. Sample distribution (y) compared to posterior distribution (yrep) with normal distribution priors and with change in possible identity score as the predictor
Figure S2. Sample distribution (y) compared to posterior distribution (yrep) with skew-normal distribution priors and with change in possible identity score as the predictor
Figure S3. How features of possible identities covary

Chapter 2
Figure 1. The Components of the Intervention Journey
Figure 2. The Five Components of Fidelity that Should be Captured by Intervention Smart Maps
Figure 3. Identity-based motivation theory translates to three active ingredients
Figure 4. What promotes persuasion
Figure 5. What promotes learning
Figure 6. What processes make a particular intervention journey useful for future journeys
Figure 7. A Process Model to Describe the Journey Empirically
Figure 8. A flowchart for interpreting intervention findings
Abstract
Researchers often provide patchwork empirical evidence based on siloed theories. As a
result, they give less than they expect to their fellow researchers and the community. This
dissertation explores how two areas of research, possible identities and intervention science, have
been undertheorized, in part, because research often focuses on constructs instead of process. To
make progress, I offer two process-oriented accounts. In Chapter 1, I discuss the extant literature
and offer a theoretical rationale for past findings pertaining to how content, valence, and structure
of possible identities matter for motivation. In two studies, I employ a bottom-up approach using
natural language processing and machine learning to identify whether change in any features of
possible identities matter for student grade-point average. I leverage this bottom-up approach in a
third study to focus just on the functionally relevant features of possible identities and strategies,
finding that a novel formulation of bridging possible identities explains part of the functional
relationship, while previous operationalizations of content, valence, and structure do not. In
Chapter 2, I turn to application of research in intervention, using a journey metaphor to
understand interventions, their effects, and ways to create positive feedback loops to improve
scalability. As journeys, interventions entail travelers (participants), travel guides
(implementers), scenery (intervention context), transportation modes (psychological theory
translated to a manualized intervention), paths (culturally-attuned psychology of learning,
memory, and persuasion operationalized to manualized train-the-trainer and intervention), and
smart-maps (quantified fidelity to stay on track and improve manualized training and
implementation). I articulate how social science theories matter for each of the six components
of journeys. I present an empirical model and interpretation guide to facilitate use of the
intervention-as-journey framework.
Introduction
Any review of social scientific research is likely to reveal two core problems. One,
researchers seem to confuse the concept of operationalization with the theory being
operationalized. They may overly rely on certain operationalizations and experimental paradigms
because these methods are convenient and seem to have worked well in the past. Two,
researchers’ theories are often descriptive, rather than predictive. A descriptive theory describes
the way things are, what they are like, and categorizes them. A predictive theory, on the other
hand, explains the process for how things got the way they are and what might be done to change
them. These tensions together have contributed to the current replication crisis in psychology in
often underappreciated ways.
I explore these core problems in two chapters, each a stand-alone paper linked together
by a focus on distinguishing constructs from process and advancing predictive over descriptive
theories. In Chapter 1, I explore how the literature on possible identities can describe the content
of possible identities (e.g., school, relationships) and how that content might be structured, but as
I describe, it is hard to nail down why exactly they matter for motivation based on a descriptive
process alone. I draw on the identity-based motivation theory prediction that possible identities,
like other identities, are stored in associative knowledge networks and draw on network science
to make predictions for how the structure of that associative knowledge network relates possible
identities to academic outcomes. In the first two studies, I employ a bottom-up approach using
natural language processing and machine learning to identify whether change in any features of
possible identities matter for student grade-point average. I leverage this bottom-up approach in a
third study to focus just on the functionally relevant features of possible identities and strategies,
finding that a novel formulation of bridging possible identities explains part of the functional
relationship, while previous operationalizations of content, valence, and structure do not.
In Chapter 2, I turn to application of research in intervention, offering a conceptual
framework for understanding interventions as journeys. I explore how this journey metaphor can
concretize the abstraction of intervention, thus disentangling fraught conceptual issues in
intervention science. As journeys, interventions entail three obvious components: travelers
(participants), guides (implementors), scenery (intervention context). They also entail two often
overlooked components: a mode of transportation (psychological theory translated to a
manualized intervention), a path leading to some destination (culturally-attuned psychology of
learning, memory, and persuasion operationalized to manualized train-the-trainer and
intervention). It helps to have an up-to-date smart map (quantified assessment of fidelity), though
researchers often do not include one or, if they include some navigational tool, it is typically a
static map. Carefully thinking of these six elements when designing, implementing, and
evaluating interventions helps answer six critical questions that are necessary for replicating an
intervention: Was the vehicle driven as expected and on the right path? Did any travelers make
it to their destination when offered the intervention journey? Did driving the vehicle as expected
and on the right path matter? Were travelers convinced, and did they learn the lessons they were
expected to learn? Does driving the vehicle as expected and on the expected path shape how much
travelers were changed or the way in which that change matters? Do features of the travelers,
guides, or scenery fundamentally change the journey or whether travelers reach their destination?
I provide an empirical model for answering these questions and a flow chart for interpreting intervention
results based on the answers to each. In doing so, I aim to provide a clearer conceptual
framework for how existing social scientific theories can be applied to explain and operationalize
the six components of the intervention journey. Together these chapters reflect my interest in
promoting a version of science that is process oriented and relevant to the real world.
Chapter 1: It all comes back to school: Bridging possible identities and academic
attainment
Abstract
Students might attain better grades if they start the year with more possible identities,
energizing expected identities, vigilance-promoting linked expected and feared identities, or
possible identities linked to a plausible behavioral roadmap. Studies provide evidence for each of
these possibilities, implying a broader theoretical synthesis of how possible identities affect
academic outcomes is necessary. To make progress, we start bottom-up in Study 1, using
machine learning to create an algorithm that scores student open-ended possible identity
responses based on their grade point averages (GPA; training sample N = 602, Mage = 14, 44%
free/reduced-price lunch). Our algorithm predicts which students in a different sample will have
a positive fall-to-spring change in their GPA (test sample N=247, Mage=13, 92% free/reduced-
price lunch). Our algorithm captures possible identities’ motivational force. An algorithm we
developed from different student writing (about an edu-game) does not (Study 2 training sample
N=540, Mage=14, 43% free/reduced-price lunch), implying that there is something unique about
possible identities. We use network analyses of possible identity responses to unpack what our
possible identity-based algorithm captures about the possible identities of students whose GPA
improves (Study 3). Students are more likely to have a positive year-to-year change in their GPA
when their various possible identities are bridged together by school. Bridging possible identities
makes aspects of the future self seem relevant no matter what else is on the mind; other features
(the count, valence, and basic structure of possible identities) are less stable predictors.
Keywords: possible identities, academic expectations, academic outcomes, machine learning,
natural language processing
Introduction
“No matter how dirty your past is, your future is still spotless.” – Drake
“Ghost of the Future, I fear you more than any spectre I have seen. But as I know your purpose is
to do me good, and as I hope to live to be another man from what I was, I am prepared to
bear you company, and do it with a thankful heart. Will you not speak to me?”
Scrooge in A Christmas Carol (p. 74, Dickens, 1843/1858)
“Education is the passport to the future, for tomorrow belongs to those who prepare for it today.”
Malcolm X, Speech at Founding Rally of the Organization of Afro-American Unity (June
28, 1964).
As our opening quotes suggest, how imagining the future matters is a bit up for grabs.
Just imagining a positive future self may be energizing, because as Drake implies, anything
might be possible; the future self is a “spotless” blank slate. Alternatively, it might provide some
standard for evaluation, motivating change in the present, as it does for Dickens’ Scrooge, who
vividly experienced what his future might be if he did not change course. Or, it might provide a
plan for what to do and, in doing so, increase awareness of the importance of focusing on
schooling and education as the bridge to the future, as Malcolm X suggests. As we detail next,
psychologists seem to agree with each of these ideas but not on the underlying process or
processes by which future selves matter. Our examination of the psychological literature on the
effect of possible identities on academic attainment and grades reveals studies that provide
supporting evidence that Drake, Scrooge, and Malcolm X might be right. Either number of
possible identities, their valence, or their link to strategies to get going matters. To disentangle
which features of possible identities matter, we start with a bottom-up approach (Studies 1, 2).
We examine the unique relationship between possible identities and an important behavioral
outcome, grades in school, using natural language processing and machine learning. Then we
leverage our machine coding representation to examine which features of possible identities
matter (Study 3). We use network analyses to show that bridging elements of possible identities
with school is the unique structural feature of possible identities that shape behavior over time.
Possible Self, Possible Identities
A potential self (James, 1890) or possible self (Markus & Nurius, 1986) is a self a person
imagines becoming in the future. Possible selves or possible identities are valenced and content-
rich mental representations that are available in memory (Markus & Nurius, 1986). Their
accessibility depends on contextual cues, and they influence action when they are on the mind
and experienced as apt, relevant to the task at hand (Oyserman et al., 2012). As mental
representations, possible identities are connected to other possible identities via an associative
structure, as is the case for other content stored in memory (e.g., Amodio, 2019; Bodenhausen et
al., 2003; Collins & Loftus, 1975). The nature of that associative structure matters, shaping the
likelihood that a particular possible identity will be accessible in the moment and the
downstream consequences of this accessibility for how people make sense of their experiences
and the actions they take. Following identity-based motivation theory, associative structure
interfaces with context. Which possible identities come to mind, and what they imply for current
behavior depends on the affordances and constraints of the situation (Oyserman et al., 2017).
However, the existing empirical literature has not focused on process, typically studying
possible identities as outcome variables rather than as predictors (for reviews, see Hoyle & Sherill,
2006; Oyserman & Fryberg, 2006; Oyserman & James, 2009, 2011). This focus on describing
content and antecedents of possible identities makes sense given the starting assumption that
possible identities shape how successful people are in important life outcomes like academics.
However, neglecting to test the process by which possible identities affect the likelihood of
successfully attaining these life outcomes means that an array of process predictions remain
equally plausible. As we outline next, predictions range across the existence, valence, number,
content, and various aspects of the structure of possible identities. To make progress, in the next
section we scaffold these predictions about how the number, valence, and link to strategies of
possible identities matter with related theories about motivation and integrate these process
predictions with the body of research relating possible identities to academic outcomes.
Number, Valence, and Link to Strategies and the Motivational Power of Possible Identities
In Table 1, we outline theories for why possible identities should matter, their associated
predictions and their relevant features given current operationalizations in the possible identities
literature. One possibility is that thinking about the future is itself motivating. Following
temporal construal theory (Trope & Liberman, 2003), the future self might feel more like the true
self than the current one (Wakslak et al., 2008). Thinking about the future focuses attention on
the big picture (the essence, the core aspects) while thinking about the present focuses attention
on the details (the specific elements); this is especially the case when the future feels far or
unlikely (Wakslak et al., 2006). For this reason, thinking about the future can be motivating
without necessarily triggering action. A second related possibility is that people take action when
they have expectations that doing so will lead to success. Following expectancy-value theories
(Eccles et al. 1983; Eccles & Wigfield, 2002, 2020; Vroom, 1964), possible identities may be
another way to describe expectation-linked content stored in memory. This content is motivating
because it links to a positive expectation, such as expecting to go to college (e.g., Beal &
Crockett, 2001). A third possibility is that people take action when they imagine a positive future
for themselves; this idea is congruent with conceptualizations of the motivating power of
positive self-esteem (Rosenberg, 1965) and self-efficacy (Bandura, 1977). Indeed people who
have more positive possible identities have higher self-esteem (Markus & Nurius, 1986).
Table 1
Prior theories, associated predictions, and relevant features of possible identities assumed to motivate current action

Relevant Theories | Predicts Motivation When... | Relevant Features
Construal Theory | Thinking about the future | Number of possible identities
Expectancy Value Theory | Current action feels like it will lead to success | Number of positive expected possible identities
Self-Efficacy; Self-Esteem | Having a positive image of the future leads to efficacy and positive self-image | Number of positive expected possible identities
Regulatory Focus Theory | Regulatory focus on attaining success or preventing failure matches positive or negative valence of accessible possible identities | Number of expected or feared possible identities
Rubicon-Action Phase Model | Action is already underway | Possible identities linked to current strategies
Implementation Strategies | Relevant strategies come to mind and are concrete | Possible identities linked to concrete strategies
Identity-based Motivation | Both positive expected and negative avoided identities are available in memory | Balance of expected and feared possible identities in the same domain
Identity-based Motivation | Relevant strategies come to mind, are concrete, and address plausible interpersonal barriers | Possible identities linked to concrete strategies and interpersonal barriers
A fourth possibility is that people take action when, no matter the situation, they have an
appropriately valenced possible identity to fit that situation. Following Higgins’ (2005)
regulatory focus theory, people prefer to act in ways that match situational regulatory focus cues.
Higgins assumes focus is a trait of people. However, following identity-based motivation theory,
the possible identities that come to mind and what they mean for current action depends on cues
in the immediate context (Oyserman et al., 2017). That is, people should prefer prevention-focused
strategies to avoid feared possible identities in failure-likely situations and promotion-focused
strategies to attain positive expected possible identities in success-likely situations (Oyserman et
al., 2015). Of course, students sometimes find themselves in failure-
likely and other times in success-likely situations. Hence, the structure may matter: students
whose possible identities include both positive (expected) and negative (feared) possible identity
content in the same domain may be more likely to take identity-based action. After all, no matter
the situation, some aspect of their possible identity is relevant (Oyserman et al., 2015).
A final possibility is that taking action may require both a possible identity and a
strategy: a roadmap for working toward positive and away from negative possible identities
(Oyserman et al., 2004). Outcomes cannot be attained without behavior, so imagining a future
that is about school without strategies is unlikely to matter (Oyserman et al., 2011). Following
the Rubicon action-phase model (Heckhausen & Gollwitzer, 1987), having strategies attached to
a possible identity may imply a person has already “crossed the Rubicon”--they are strongly
committed to acting to attain or avoid a possible identity and are no longer deliberating about
the odds or value of doing so. Moreover, the strategies that come to mind have to be concrete
enough to be usable now, linking possible identities to implementation strategies (Gollwitzer &
Sheeran, 2006; Oyserman et al., 2004). Following identity-based motivation theory (Oyserman et
al., 2017), in the context of school, for possible identities to impact academic outcomes, students
may need a plausible roadmap -- concrete strategies that are linked to possible identities and
address interpersonal barriers (Oyserman et al. 2011).
Hence, a possible identity approach can be fit with each of several theoretical approaches
to motivation. Because some of these approaches are more process-oriented than others, they
make a more explicit connection between features of the future self (the number, valence,
content, and link to strategies of the possible identities it contains) and the actions students are
likely to take at the moment. Thus, the idea that having a future self and the valence, number,
and content of its possible identities matters for action is congruent with other theories of
motivation. However, these features do not connect well with modern approaches to social
cognition, which heavily implicate the role of accessibility and link to action through
experienced aptness, that is, relevance to the judgment or task at hand (Smith & Semin, 2007). We
propose a conceptual model of bridging possible identities that articulates how some possible
identities are more likely to be accessible and feel apt and hence shape the course of action over
time to affect meaningful outcomes.
Spreading of Activation: An Analogy from Social Networks
To make progress, we step back and return to Markus and Nurius’s (1986) claim that
possible identities are mental representations stored in memory that only come to mind when
they are relevant. Following identity-based motivation, we assume that the mental
representations of possible identities, along with current and past identities, are stored in
associative knowledge networks that are activated in context. This allows us to make a novel
theoretical formulation of possible identities by connecting to the literature on associative links
and spreading activation coming from cognitive (e.g., in memory, Collins & Loftus, 1975) and
network science (e.g., Siew et al., 2019). We assume that how much possible identities affect
academic outcomes will depend on the likelihood that, over time, students repeatedly experience
the higher-level features of school (e.g., academic engagement) as the apt response when their
possible identities are activated. We propose that accessibility and aptness depend on how
students’ possible identity associative knowledge networks are structured; how activation
spreads depends on the structural links among possible identity content and strategies. We
predict that possible identities will matter for academic outcomes when the relevant message
efficiently spreads across the network. In these cases, no matter what else comes to mind, so
does taking action to do well in school. Network analytic approaches can address this question
by formally representing structural relationships in possible identities and highlighting how those
structures might matter.
We base our network analytic approach to representing cognitive structure on prior work
documenting its usefulness in understanding memory retrieval, semantic priming, and language
representation (for a review, see Siew et al., 2019) and on social network research (Burt, 1999;
Granovetter, 1973; Kitsak et al., 2010). Network researchers suggest that the best candidates for
spreading information are the people that bridge different social cliques or groups (Burt, 1999;
Granovetter, 1973; Kitsak et al., 2010). Like human social networks (Travers & Milgram, 1977),
associative knowledge follows a 5 or 6 degrees of separation small-world structure (De Deyne et
al., 2008; Steyvers & Tenenbaum, 2005). Consider what would happen if the red bridging node in
Figure 1 was activated. Activating aspects (i.e., specific content, valence) of possible identities
that form bridges will spread widely across students’ knowledge networks. Students are likely to
experience connections among disparate possible identity aspects through school if school-related
possible identity content lies between their different possible identities. If school is the bridge,
its higher-level feature--academic attainment--is more likely to be on the mind than its content-
idiosyncratic lower-level features. These features are relevant to some possible identity content
and not others and rarely entail academic attainment. For example, what school means to a
possible ‘popular me’ might entail getting noticed or hanging out with friends, while for a
possible ‘sports me’, it might entail not getting cut from the team. That means that if school-
related content was simply the highly connected center (blue node in Figure 1) rather than a
bridge (red node in Figure 1), students might often think about school. But when they do, they
may focus on school in ways that do not promote academic engagement (e.g., remaining eligible
to play sports rather than studying to learn skills for the future).
Figure 1
Network diagram with highly between node (red) and highly connected node (blue)
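To make the distinction in Figure 1 concrete, here is a toy sketch in R using the igraph package. It is our illustration, not part of the dissertation's analyses; the graph and node labels are hypothetical.

```r
# Toy illustration of Figure 1's distinction between a highly connected node
# and a bridging node; the graph below is hypothetical, not student data.
library(igraph)

g <- graph_from_literal(
  popular - friends, popular - parties, popular - school,
  athlete - team, athlete - practice, athlete - school,
  career - college, career - school
)

degree(g)       # "popular" and "athlete" are as connected as "school" (degree 3)
betweenness(g)  # but "school" lies on the shortest paths linking the three
                # clusters, so its betweenness is the highest in the graph
```

In this toy network, activating "school" would spread to every cluster, whereas activating "popular" would mostly reach its own cluster; this is the structural intuition behind our bridging prediction.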
What is the evidence that possible identities predict academic outcomes?
We examined the body of evidence for support or refutation of prior claims or our novel
bridging claim for how possible identities matter for academic outcomes. We found experiments
that carefully isolate some aspect of the relationship between possible identities and academic
performance (e.g., Destin & Oyserman, 2010; Landau et al., 2014; Nurra & Oyserman, 2018;
Ruvolo & Markus, 1992). In these studies, researchers manipulate a particular theoretically
relevant feature of the future self to document a causal effect on something that relates to
schoolwork, though not necessarily academic outcomes. These studies suggest some features of
the future self-structure as candidates (e.g., that the future and current self overlap, Nurra &
Oyserman, 2018) or that the future self provides a contrasting standard that people could feel
efficacious to attain (Oettingen et al., 2005). While useful as proof that some aspects of possible
identity structure are likely to matter, these studies do not provide a sense of the full content of
possible identities or their structure. Hence, we searched for studies showing the predictive
power of possible identities to affect grades at a later point in time. We started with Horowitz
and colleagues’ (2020) review, which reported only six papers that examined this link. We
supplemented this existing review by searching for papers published since Markus and Nurius’
seminal paper was published in 1986. We used PsycINFO to search using the following
keywords (academic outcome: academic outcomes, grades, and academic attainment AND
possible self: possible selves, future selves, possible identities, aspirations, and expectations).
Our search yielded a total of 176 articles, 69 of which were about possible identities.
Fifty-four articles treated possible identities as an outcome in their analyses, confirming that
observational research involving possible identities tends to be descriptive, not predictive (for
similar conclusions involving experimental studies, see Hoyle & Sherill, 2006). The remaining
fifteen articles examined predictive links between a facet of possible identities and academic
outcomes; one of these assessed change (Horowitz et al., 2020). Because we were interested in
how possible identities matter over time, we excluded one study that manipulated change
(Oyserman et al., 2006). We summarize these studies in Table 2, reporting studies based on
coding open-ended possible identity probes in the top panel and studies based on closed-ended
queries (e.g., how far do you expect to go in school) in the bottom panel. Our summary
highlights key features, including sample size and power to find effects of varying size, how
researchers assessed possible identities, the possible identities features (e.g., content, valence,
structure) they attended to, and the academic attainment dependent variable used.
Our review highlights three strengths in the literature on the predictive power of possible
identities on academic outcomes: First, this literature underscores that something about possible
identities matters for academic outcomes. All but one of the 15 studies (Feliciano, 2012) reports
a significant relationship between some feature of possible identities and academic outcomes.
Second, as reflected in our first three columns in Table 2, studies were adequately powered (the
exception being Oyserman et al., 1995). Third, effects appear across a wide variety in how
researchers assess possible identities, which possible identity feature they focus on, and how they
operationalize academic outcomes.
Table 2
Characteristics of prior studies linking possible identities to academic outcomes

Study | N | Powered at f² = .03 | Powered at f² = .07 | Predictor | Valence | Content | Structure | Outcome

Studies using open-ended possible identities measures
Oyserman et al., 1995 (Study 4) | 44 | No | No | Balance in school-focused possible identities | Yes | Yes | Yes | Grades
Destin and Oyserman, 2010 (Study 1) | 266 | Yes | Yes | Count of education-dependent adult identities | No | Yes | No | Grades
Bi & Oyserman, 2015 (Study 3) | 176 | No | Yes | Count of possible identities with strategies | No | Yes | Yes | National exam scores
Bi & Oyserman, 2015 (Study 4) | 145 | No | Yes | Count of possible identities with strategies | No | Yes | Yes | National exam scores
Horowitz et al., 2020 | 247 | No | Yes | Multiple open-ended scoring methods | Yes | Yes | Yes | Grades
Oyserman et al., 2004 | 160 | No | Yes | Plausibility of school-focused possible identities | No | Yes | Yes | Grades

Studies using closed-ended possible identities measures
Anderman et al., 1999 | 315 | Yes | Yes | Agreement with a list of academic descriptions | No | Yes | No | Grades
Beal & Crockett, 2010 | 317 | Yes | Yes | Single-item educational expectations | No | Yes | No | Standardized math scores
Feliciano, 2012 | 3,611 | Yes | Yes | Single-item educational expectations | No | Yes | No | Grades
Merolla, 2013 | 5,948 | Yes | Yes | Single-item educational expectations | No | Yes | No | Educational attainment (categorical)
Messersmith & Schulenberg, 2008 | 12,066 | Yes | Yes | Single-item educational expectations | No | Yes | No | Attaining a BA
Muller, 2001 | 3,422 | Yes | Yes | Single-item educational expectations | No | Yes | No | High school graduation/college enrollment
Ou & Reynolds, 2008 | 1,286 | Yes | Yes | Single-item educational expectations | No | Yes | No | High school graduation/total years of schooling
Schoon & Ng-Knight, 2017 | 5,770 | Yes | Yes | Single-item educational expectations | No | Yes | No | Standardized test scores/university enrollment
Marjoribanks, 2003 | 8,322 | Yes | Yes | Two-item educational expectations | No | Yes | No | Educational attainment (6-point scale)

Note. The "Powered at" columns indicate whether the study was sufficiently powered at the given effect size. The Valence, Content, and Structure columns indicate which aspects of possible identities the predictor operationalizes. Marjoribanks published several subsequent articles (2004; 2005), each examining some subsample of this larger sample.
We also found room for improvement. First, as reflected in the fourth through the last
columns of Table 2, measures and the features of the future self they focus on are idiosyncratic,
making it hard to know how best to summarize the accumulated evidence. Almost every study
used a different measure of possible identities, mirroring Corte and colleagues’ (2020) review
the relationship between possible identities and health. Regarding features, most of the studies
focused on school-relevant content, and few focused on valence and structure. Second, even
when researchers measured more than one aspect of possible identities, they did not provide a
separate report on the effects of each. Third, studies report their analyses including differing
controls, which means that effects are model-specific, limiting our ability to compare effect sizes
or estimate an overall effect.
Fourth, features of the sample and data source (open- vs. closed-ended data) covary, and
almost all studies obtained only a single measure of possible identities (the exception was
Horowitz et al., 2020). The third of studies using more detailed hand-coded operationalizations
also have samples that are both small and composed mostly of low-SES early adolescents from
historically stigmatized racial/ethnic groups (i.e., Latinx and African American). In contrast, the two-thirds
using larger samples also used a single-item measure (exceptions are Anderman et al., 1999, who
used a list of academic descriptors, and Marjoribanks, 2003, who used a 2-item measure). Using
a single-item, single-point-in-time, educational expectation question seems to be an overly
narrow and error-prone operationalization of possible identities. Given that possible identities are
assumed to be context-sensitive (Markus & Nurius, 1987; Oyserman et al., 2012), single point in
time estimates may reflect survey context, rather than underlying content and structure in
memory. For instance, possible identities tend to become less positive as the school year
progresses (e.g., Oyserman et al., 2006; Oyserman et al., under review), so measuring in the fall
or spring alone likely obscures the underlying memory structure. Only one study (Horowitz et
al., 2020) assessed possible identities at more than one time point, finding that students with less
fall to spring decline have better grade point averages, even controlling for their prior school
attainment.
In sum, the state of the literature is suggestive; something about possible identities is
motivating. But we cannot draw clear inferences regarding which feature(s) of possible identities
matter, how much they matter, and ways to measure them at scale. Moreover, though we report
effect sizes in our summary, as we noted above, sample size and features, measurement method,
and focus are confounded, making a quantitative synthesis of effect sizes impossible. In that
sense, our review reflects weaknesses noted in prior, more general, reviews of the literature
connecting possible identities to outcomes (e.g., Corte et al., 2020; Hoyle & Sherill, 2006;
Oyserman & James, 2011). Our review also highlights the bifurcation between richer
idiographic open-ended data with hand-coded measures and small samples on the one hand and
large sample sizes with close-ended, typically single-item, measurement of expectations
regarding academic attainment on the other hand. Single-item measures do not allow for a test of
the process(es) by which possible identities might matter and reduce the theory of possible selves
to expectations. The finding that a single-item measure of student expectations about going to
college is associated with later academic attainment strongly suggests that possible identities
matter but is mute on the process. The process might be the motivating effect of a positive
expectation, potentially any positive expectation would do. Or the process might be strategies
because students who expect to go to college also generally have some strategies to get there. Or
the process might be something else about the structure of possible identities that include
expecting to go to college. A single item cannot rule in or rule out any of these possibilities. We
found only one study that pitted different ways possible identities matter against one another
(Horowitz et al., 2020). These authors found that fall-to-spring change in possible identities
predicts change in grades no matter which possible identity feature they measured. At the same
time, they also found that they obtained more precise estimates when the possible identity
measure considered response content and structure (i.e., school-focused identities connected to
plausible roadmaps).
The Current Studies
As we outlined above, though valence-, content-, and structure-focused approaches to
how possible identities matter each has merit, the extent to which each contributes to the
effectiveness of possible identities in motivating academic outcomes is not yet known. Larger-
sample studies use single-item approaches, undermining face validity and losing the rich
idiographic underpinning of a possible identities approach. Since we did not have enough
evidence to start with a particular prediction regarding how possible identity valence, content, or
structure matters, or to specify the likely size of that effect, we started with a data-driven bottom-
up approach. In doing so, we took advantage of support vector machine learning and natural
language processing to obtain a more nuanced sense of how possible identities matter. We
combined this data-driven, natural language processing bottom-up approach (Studies 1, 2) with a
network analysis top-down theory-based approach (Study 3). We outline our approach in Table
3.
In Study 1, we ask two questions. First, we ask if we can train an algorithm to score
possible identities and strategies based on their functional relationship with student grade-point
average (GPA). Second, we ask if this algorithm can be used successfully in a different group of
students to predict student GPA. In Study 2, we rule out that our first algorithm may be capturing
something about general writing (e.g., differences in vocabulary) and not possible identities. We
do so by using another sample of student writing (feedback on an edu-game they played) as the
training input for identifying how general writing might matter for student grades. In our test
sample, we pit the predictive power of our possible identity-based algorithm against this other
writing-based algorithm. In Study 3, we switch to a top-down approach to unpack what our
possible identity-based algorithm has to say about how possible identities matter. We contrast the
predictive effects of various features of possible identities (valence, content, and simple
structure-based approaches) to our network-based, more nuanced, bridging feature. We also
contrast the effect of our network-based bridging structure prediction to the alternative of
network-based structure (centrality). We predict that our bridging approach explains unique
variance in student academic outcomes beyond that explained by general features of their
writing, their demographics, and their prior academic attainment.
Table 3
Overview of study core research question, overview of sample, and prediction

Study | Training Sample | Trained on... | Test Sample | Tested using...
1 | 620 7th- to 10th-graders in rural and semi-rural Colorado | Fall possible identity and strategy responses | 247 8th-graders in Chicago | Fall and spring possible identity and strategy responses
2 | 574 7th- to 10th-graders in rural and semi-rural Colorado | Open-ended feedback about edu-game experience | Same as Study 1 | Same as Study 1
3 | Same as Study 1 | Same as Study 1 | Same as Study 1 | Same as Study 1

Note. The test sample was n = 247 8th-graders from high-poverty schools in Chicago with possible identity responses and administrative data from which we could compute GPA. In Studies 1 and 2 we analyzed data at the student level. In Study 3, we analyzed all possible identity responses, even if GPA could not be calculated (n = 301), and used data at the observation level (fall n = 301, and spring n = 572).
Modeling Our Outcome Across Studies
Our key outcome variable is GPA. GPA is negatively skewed (more students get good
grades than bad ones, the modal score is in the A-range; Chadbury et al., 2018). That means that
we needed to consider how to represent GPA skew in Studies 1 and 2. Rather than adjust
(transform) the data, we used a Bayesian model that accurately represents the data using a
skew-normal distribution, a generalization of the normal distribution that accounts for skew (Azzalini,
1998). As a result, the skew in GPA carried forward to the distribution of algorithm-based
scores, so we modeled this skew in Study 3 because algorithm-based scores were our outcome
variable. In the Supplemental Materials we detail what the skew looks like and our skew-normal
analytic approach (Azzalini, 1998).
Our modeling approach better fits the observed data than a regression model that
incorrectly assumes GPA is normally distributed. One consequence of better fit to the observed
data is that our model does not explain as much variance in GPA. That is, our model using the
skew-normal GPA distribution explains about 3% of the variance in GPA while our model
assuming a normal distribution explains 7% of the variance in GPA. We suspect this is because
when a model incorrectly assumes GPA is normally distributed, it estimates model-implied
variance that is not present in the observed data. Given that prior studies assumed GPA was
distributed normally, the effect size estimates in previous studies and the ones we produce are
not comparable.
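For illustration, here is a minimal sketch of the two likelihoods in brms, the R package we use for all models below. This is not the dissertation's actual script; the data frame and variable names (students, gpa, pi_change, prior_gpa, school) are hypothetical placeholders.

```r
# Minimal sketch of fitting GPA with skew-normal vs. normal likelihoods in
# brms; object and variable names are hypothetical, not the study's code.
library(brms)

fit_skew <- brm(
  gpa ~ pi_change + prior_gpa + (1 | school),
  data   = students,
  family = skew_normal(),   # location, scale, and skew (alpha) parameters
  chains = 4, cores = 4
)

fit_normal <- brm(
  gpa ~ pi_change + prior_gpa + (1 | school),
  data   = students,
  family = gaussian()       # misspecified for negatively skewed GPA
)

# Bayesian R^2 under each likelihood; the normal model's larger value can
# reflect model-implied variance that is absent from the observed data
bayes_R2(fit_skew)
bayes_R2(fit_normal)

# Posterior predictive checks (cf. Figures S1-S2): draws from the
# skew-normal model should track the skewed sample distribution more closely
pp_check(fit_skew)
pp_check(fit_normal)
```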
How We Created our Algorithms From Student Open-Ended Responses
In Figure 2, we show a stylized model for how we created, trained, and applied our
possible identity scoring algorithms, starting with raw student responses (see Appendix A for
detail). We started with student responses as written (all of their responses to the possible
identity and strategies questions, or all of their responses to four questions about their experience
with an edu-game; Panel I). Then, we preprocessed and cleaned responses (Panel II). Next, we
applied a word-embedding model to quantify each student’s response as a set of 300 features
corresponding with how words are used in an extensive corpus of Google News articles
(Mikolov et al., 2013) (Panel III). We then used this quantified response in support vector regression
(SVR) to train an algorithm that relates these 300 features to student end-of-year GPA (Panel IV).
SVR is a supervised machine learning method that predicts an outcome variable based on some
dependent variable set (Vapnik, 2000). SVR is the right tool for creating an algorithm for scoring
possible identities and strategies that is likely to generalize to samples that differ from the
training sample because while conceptually similar to Ordinary Least Squares regression, SVR is
more replicable (i.e., less prone to overfitting), especially when models include many predictors
and those predictors are correlated.
Finally, we used our algorithms to score the possible identity and strategy responses of
students based on the word embedding numeric representation of test sample responses. In
Studies 1 and 2, we used these Fall and Spring scores to produce our key predictor variables:
change in possible identities and strategies based on the identity-based algorithm and the other
writing-based algorithm. To obtain a stable change score, we used a residual score, as detailed in
Supplemental Materials. In Study 3, we used the fall and spring scores as the key outcome being
predicted in multilevel models with observations nested within students.
Figure 2
How we created our algorithms from student responses
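As a concrete illustration of Panels III and IV, the sketch below (hypothetical object names; not the original analysis script) assumes each response has already been reduced to a 300-feature vector, for example by averaging its words' Google News word2vec vectors.

```r
# Illustrative sketch of Panels III-IV: train an SVR relating 300
# word-embedding features to end-of-year GPA, then score new responses.
# X_train, gpa_train, X_fall, X_spring are hypothetical placeholders
# (rows = students, columns = 300 embedding dimensions).
library(e1071)

train_df <- data.frame(gpa = gpa_train, X_train)

svr_fit <- svm(gpa ~ ., data = train_df,
               type   = "eps-regression",  # support vector regression
               kernel = "radial")

# Apply the trained algorithm to a different (test) sample
fall_scores   <- predict(svr_fit, newdata = data.frame(X_fall))
spring_scores <- predict(svr_fit, newdata = data.frame(X_spring))

# One common way to obtain a stable change score is to residualize spring
# scores on fall scores (the residual-score approach we use is detailed in
# the Supplemental Materials)
pi_change <- resid(lm(spring_scores ~ fall_scores))
```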
Analytic Strategy and Reporting Across Studies
Across studies, we use Bayesian regression models with weakly informative priors for all
estimated parameters (see Supplemental Materials for detail). We chose a Bayesian approach
principally to address the expected negative skew in GPA. Bayesian approaches facilitate
24
modeling in contexts that are computationally difficult to handle with frequentist methods. To
estimate our Bayesian regression models, we used the brms package in R (Bürkner, 2018), with
weakly informative priors for all parameters. Using weakly informative priors means that we are
ultimately allowing the data to determine the posterior distribution and hence the inferences we
make (Gelman et al., 2008). We make all data, code, and syntax that is shareable available
online.
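As a minimal sketch of this general setup, the snippet below fits a brms regression with a skew-normal likelihood and a weakly informative prior. The formula, data, and prior value are illustrative placeholders rather than any exact model reported here; the sampler settings mirror those documented in the Supplemental Materials.

# Illustrative brms setup; formula, data, and prior value are placeholders.
library(brms)

fit <- brm(
  gpa ~ pi_change,                             # placeholder formula
  data    = students,                          # placeholder data frame
  family  = skew_normal(),                     # estimates a skew parameter alpha
  prior   = prior(normal(0, 1), class = "b"),  # weakly informative slope prior
  chains  = 4, iter = 4000, warmup = 2000,     # settings from Supplemental Materials
  control = list(adapt_delta = 0.90)           # target acceptance rate of .90
)

# Equal-tailed 95% intervals; the highest-density intervals reported in
# the text can be computed from the posterior draws.
posterior_interval(fit, prob = 0.95)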
We used Markov chain Monte Carlo (MCMC) methods to obtain the posterior distribution and used expert recommendations for Bayesian workflow (Gelman et al., 2020) to evaluate model convergence (see Supplemental Materials for details). We report 95% highest probability density credible intervals for model parameters at each step of the process and use these intervals as our primary inferential tool. A 95% credible interval is the interval that has a 95% probability of containing the true value of the parameter given the data. In contrast, a confidence interval describes the long-run behavior of the interval-construction procedure under repeated sampling, not the probability that a given interval contains the parameter. Hence the interpretation commonly but mistakenly given to confidence intervals (e.g., Haller & Krauss, 2002) is actually what credible intervals provide, a conceptual advantage of Bayesian modeling. We infer the presence of an effect when the credible intervals for model coefficients do not include zero.
In the main text, we report regression coefficients for the key predictors in each study. In Supplemental Materials, we report full information on each model, including the relationship between control variables and the outcome, effect size (Bayesian R²; rule-of-thumb effect sizes: small = .02, medium = .13, large = .26; Cohen, 1988), and two preferred measures of model fit in Bayesian models (leave-one-out cross-validation and the Watanabe-Akaike information criterion, with smaller values indicating better fit; Vehtari et al., 2016).
Study 1: Training the Outcome-Based Algorithm
In Study 1, we created a machine learning algorithm that relates end-of-year GPA to open-ended possible identity and strategy responses in our training sample. We predicted that this algorithm would capture meaningful features of student possible identities and strategies in our test sample. We operationalize the accessibility of these features by calculating change scores because single time-point estimates are more error-prone and may carry context-specific information independent of the underlying possible identity representation. Our prediction would be supported if positive change in algorithm-based possible identity scores was associated with higher GPA.
Participants
We document demographics of the training and test sample used in Study 1 in Table 4.
Algorithm Training Sample
The data we used for our training sample were collected as the active control condition in
a randomized controlled trial in eleven schools as part of the development of an identity-based
motivation intervention (U.S. Department of Education Investing in Innovation Grant #
U411C150011). In the active control, students used an edu-game focused on science or language
arts during the time that other students used a novel identity-based game. To be included in the
training sample, students had to have parent consent, fall survey responses, and complete GPA
information. Our final training sample of 602 was that subset of the active control group (N=805)
who had parental consent (n=705), were present when surveys were administered (n=645), and
had complete GPA information (n=602). The schools were mid-low to mid-high poverty schools as defined by the Department of Education, which defines mid-low as 26% to 50% of students receiving free or reduced-price lunch and mid-high as 51% to 75% of students receiving this benefit. Schools at these poverty levels are common in rural areas: two-thirds of students in rural America attend schools at these poverty levels (U.S. Department of Education, 2018).
Table 4
Training and test sample descriptions across studies

Attribute                        Training sample      Training sample   Test sample
                                 (Studies 1 and 3)    (Study 2)         (Studies 1, 2, and 3)
Total N                          602                  540               247
% Female                         47.0                 47.1              55.0
% White                          60.1                 59.3              2.0
% LatinX                         28.0                 28.1              84.0
% Black/African American         8.8                  9.1               14
% Asian                          3.2                  3.5               <1
Median age                       14.5                 14.5              13
% 7th grade                      16.5                 18.6              0
% 8th grade                      16.0                 17.6              100
% 9th grade                      36.6                 37.9              0
% 10th grade                     30.8                 25.8              0
% Free or reduced-price lunch    43.6                 43.3              92.0
Text used                        Fall possible        Student responses Fall and spring
                                 identities and       to 4 questions    possible identities
                                 strategies           about their       and strategies
                                                      experiences in
                                                      an edu-game
Algorithm Test Sample
We evaluated the possible identities algorithm on a separate test sample of 8th-grade students attending seven Chicago public schools who had parental consent to participate, complete administrative demographic and GPA data, and were present when the survey containing the possible identity and strategies prompt was administered in their classroom. The final sample of 247 was the subset of the 502 students who had parental consent (n=302), were present when the fall and spring surveys were administered (n=285), and had complete administrative data, including two years of prior grades and complete demographics (n=247). The schools students attended were high poverty schools as defined by the Department of Education, which operationalizes high poverty as more than 75% of students receiving free or reduced-price lunch.
Method
Measures
Student Demographics and Grade Point Average in the Training Sample. School districts in Colorado provided students' current- and prior-year unweighted grade point averages (GPA; 0=F, 1=D, 2=C, 3=B, 4=A) as part of a data sharing agreement for Investing in Innovation Grant # U411C150011. Prior-year and current-year GPA were highly negatively skewed. The most common current-year GPA was an A (Mode = 4). Mean GPA was a B (M = 2.93, SD = 0.87).
Student Demographics and Grade Point Average in the Test Sample. Chicago Public Schools provided 6th-, 7th-, and 8th-grade course grades, student gender, free/reduced-price lunch status (poverty), and race/ethnicity as part of a data sharing agreement with the American Institutes for Research. We computed students' final 6th-, 7th-, and 8th-grade core grade point averages (GPA) by averaging final grades in Math, Science, English, History, and Social Studies (0=F, 1=D, 2=C, 3=B, 4=A). Each was highly negatively skewed. The most common GPA was an A for 7th- and 8th-grade GPA (Mode = 4.0) and the second most common GPA for 6th-grade GPA. Mean GPA was a B for 6th grade (M = 2.75, SD = 0.98), 7th grade (M = 2.74, SD = 0.99), and 8th grade (M = 3.00, SD = 0.82).
Next-year Possible Identities and Strategies (Training and Test Samples). We adapted Oyserman and colleagues' (2004) paper-and-pencil measure for Qualtrics. Students responded to prompts about their next-year possible identities and strategies in four steps (see Figure 3). At the first step, on a single screen, students typed in up to 4 expected possible identity responses and for each identity indicated with a check if they were doing something now to work on it. At the second step, on a single screen, they typed in up to 4 to-be-avoided possible identity responses and for each identity indicated with a check if they were doing something now to avoid it. At the third step, students saw the subset of their expected possible identities that they had said they were working on, one at a time, each on a single screen, and were asked to describe what they were doing to work on that possible identity. This process was repeated for to-be-avoided possible identities at the fourth step. We show the full prompt text and corresponding schematic in Appendix B.
Figure 3
Flow of the Possible Identities and Strategies Measure
Analysis Plan
We conducted a series of Bayesian regressions using the brms package in R (Bürkner, 2018), with weakly informative priors for all parameters. Our outcome variable was 8th-grade core GPA. Our predictor variables, entered in stepwise fashion, were our residualized possible identity change score, random school effects, prior GPA, and demographics.
Results
We used our algorithm to derive a possible identity-based prediction of GPA , which
could range from 0 and 4. In fall (M = 2.95, SD = 0.14) and spring (M = 2.93, SD = 0.15)
predicted scores were just under 3. We created a residualized change score to reflect change in
scores from Fall to Spring (M = -0.003, SD =.146, Min=-.56, Max= .26) termed this our possible
identity-based algorithm score and used in our analyses.
Prediction 1: Change in Possible Identities Matters
In Figure 4, we display effects of residualized change in the possible identity-based algorithm score when entered alone and when controlling for the effect of being in a particular school, 6th- and 7th-grade GPA, and demographics. Our results suggest that there is an effect of change in algorithm-based possible identity scores on students' 8th-grade GPA. Possible identity scores account for about 3 percent of the variance in GPA (R² = .03, 95% Credible Interval [0.005, 0.055]). This effect is robust to controlling for the effect of being in a particular school, 6th- and 7th-grade GPA, and demographic covariates; including prior GPA reduces the value of the point estimate, though the credible intervals with and without other covariates overlap.
Figure 4
Change in algorithm-based possible identity scores predicts end-of-year GPA controlling for school, prior GPA, and demographics
Note. Error bars are 95% credible intervals. Δ Possible Identity Score = possible identity-based
algorithm residualized change score
Discussion
In Study 1 we document that our possible identity-based algorithm, trained to predict GPA from student possible identity and strategy responses in one sample, predicts GPA in a different sample of students. Our results are important because we document that we can capture an active ingredient of how possible identities matter and that this ingredient is transferable to making predictions for students who differ from our training sample in race/ethnicity, socioeconomic status, geographic location, stage of adolescence, and year in school. Our results underscore that some of the 300 word-embedding-based features of possible identities matter, but they cannot identify which ones. We investigate this in Study 3, but before we do, in Study 2 we address two limitations of the machine-learning method.
limitations of the machine-learning method. It is possible that our possible identity-based
algorithm reflects prosaic differences in student writing style (e.g., vocabulary differences) or
other differences based on race-ethnicity, gender, or social class, not just possible identities (e.g.,
Rudin et al., 2018). For instance, girls tend to have better grades than boys and also may use
different words and phrases when writing. We want our possible identity based-algorithm score
to capture differences specific to the functional relationship between possible identities and GPA
rather than these other differences. We address these limitations in Study 2 by training an
algorithm to identify how the more general features of student writing matter for student grades.
Study 2
In Study 2 we trained another algorithm, this one based on other school-related writing
(students’ responses to four feedback questions about their edu-game playing experiences). We
pitted this other-writing-based algorithm against the algorithm we developed in Study 1 as ways
to score possible identities and predict GPA in our test sample.
Participants
Algorithm Training Sample
Students in the training sample (n=540) were from the same population of students as the
students in the Study 1 algorithm training sample. Sample size varied slightly because the
writing sample for this algorithm was collected on a different day, so a somewhat different set of
students was present. The final sample of 540 was the subset of students who had parental consent (n=574), were present when feedback surveys were administered, and had complete GPA information. We provide demographic information in Table 4.
Algorithm Test Sample
We applied our other-writing algorithm to score our test sample’s fall and spring
possible identities and strategies. We provide demographic information in Table 4.
Training the Algorithm
We compiled responses from the four open-ended questions about student edu-game
experience we list in Table 5. We trained our other writing based-algorithm following the same
steps we used for training the GPA-based possible identity and strategy algorithm.
Table 5
Open-ended questions used in training a general-writing scoring algorithm

1. When you missed a day, what did your teacher have you do?
2. What was the best part of the digital program you just completed?
3. What was the worst part of the digital program you just completed?
4. Please use the space below to share any other information about your experience participating in the digital program.
Measures
We used the same measures as in Study 1 with the exception that instead of applying the
possible identity-based algorithm, we applied the other writing-based algorithm to the possible
identity responses in the test sample to create a score that represents how the more prosaic
features of students’ possible identity responses might relate to GPA. This yielded a fall
(M=3.05, SD=0.04) and spring (M=3.05, SD=0.04) response prediction for GPA; note that
though the means are identical, these are different measures, correlating only at r=.22. Parallel to
the method we used in Study 1, we created a residualized change score to reflect other writing-
based changes in possible identity responses from Fall to Spring (M = 0.00, SD =.036, Min=-.18,
Max=.05). The change scores based on each algorithm were correlated, r(269)=.33, p <.001, but
not redundant, allowing us to include both in our models.
Analysis Plan
We pitted change in possible identities as scored by our possible identity-based algorithm
against change scored by our other writing-based algorithm as predictors of end-of-year GPA in
33
two Bayesian regressions. The first model includes only identity-based and other writing-based
change scores as predictors of GPA. The second model includes controls for demographics, 6
th
-
and 7
th
-grade GPA, and the effect of attending a particular school.
Results
In Figure 5, we show the regression coefficients with 95% credible intervals for our two algorithm-based measures of change in possible identity responses. We found that including the other writing-based algorithm scoring of possible identities did not substantively change the effect of possible identity-based algorithm scores. Thus, at Step 1 we found that the positive association between change in possible identity-based algorithm scores and current year GPA is robust to controlling for the other writing-based algorithm score. At Step 2 we found that this effect remains after including the effect of attending a specific school, 6th- and 7th-grade GPA, and demographic covariates.
Figure 5
Change in identity-based scores of student possible identities and strategies predicts end of year
GPA and is not accounted for by other writing-based scores
Note. Error bars are 95% credible intervals. Δ Other Writing Score = residualized change in
student possible identities coded using the other writing-based algorithm. Δ Possible Identity
Score = residualized change in possible identity-based algorithm scores.
Discussion
Study 2 shows that the effect of change in possible identities and strategies on GPA that we found when we scored possible identities using our possible identity-based algorithm remains when controlling for change in the more general features of how students described their possible identities, as operationalized in our other writing-based algorithm score. Our results suggest, first, that our identity-based algorithm captures the relationship between motivational features of how students write about their possible identities and their GPA and, second, that the features driving this relationship cannot be reduced to simple differences in how students write or who
they are. By using a machine algorithm, we represented responses in a rich 300-dimension word
embedding model, many more dimensions than any human coder could consider at once. By
identifying a functionally relevant score of possible identities in this rich dimensional space, our
bottom-up approach lays the groundwork for theory building. When used as an outcome in
analyses, our algorithm-based scores of possible identities operationalize the unique content and
structural features that make possible identities and strategies motivating. We build on this in
Study 3, using a top-down theory-driven approach to advance understanding of how possible
identities matter, which is the question with which we started.
Study 3
In Study 3, we compared the predictive power of theoretically relevant predictors
(valence, content, simple structure, spatial network structure) when controlling for features of
writing (word count in general and word count about school, as operationalized below).
Sample
We analyzed the responses of students in our test sample, including responses of students
present for the fall survey (responses n=301) and/or for the spring survey (responses n=271). Our
analytic strategy allows us to include students who were present for one but not both surveys.
Method
We used a novel approach combining dictionary-based representation of text with
network analyses to produce two original measures of identity and strategy structure. We first
explain how we created student response networks using network analysis. Then we describe our
specific measures.
Network Representation
We used network analysis to represent the connections (collocations in space) among the
words students generated in response to our possible identity and strategy prompts. Network
analysis is the right approach because it can formally represent and analyze the connections
among any kind of node (e.g., a person, a word; Wasserman & Faust, 1994). In Figure 6, we
display what networks looked like. As we show, we started with uncategorized preprocessed
student possible identity and strategy responses (Panel 1), developed and applied dictionaries to
sort words based on their meaning (Panel 2, full dictionary in Appendix C), and created network
representations for each student from which we derive our network-based measures of possible
identity structure (Panel 3). Our method reduces the influence of sentence structure and idiosyncratic word choice across students while retaining enough detail to analyze the content and structure of responses in a network. We explain each of these steps in more detail in Appendix C.
Figure 6
Network representation of text
Note. Colors represent superordinate categories. Purple=school-focused words. Yellow =
interpersonal-focused words. White=uncategorized words.
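To make Panel 3 concrete, below is a minimal sketch in R of building one student's network with igraph from dictionary-mapped word/concept pairs. The edge list is invented for illustration; the category labels (HW_ETC, GRADES, and so on) come from the Table A1 dictionary.

# Sketch: build one student's response network from dictionary-mapped
# word/concept pairs; the edge list is invented for illustration.
library(igraph)

edges <- data.frame(from = c("HW_ETC", "GRADES", "FRIEND"),
                    to   = c("GRADES", "PASS",   "SOC_ADJ_POS"))
g <- graph_from_data_frame(edges, directed = FALSE)
plot(g)   # visual check, akin to Figure 6, Panel 3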
Measures
We organize our measures into outcome, predictor, and control variables.
The Outcome: Algorithm-scored possible identities
Our principal outcome variable was the algorithm-based possible identity score we described in Studies 1 and 2.
The Predictors: response valence, content, and structure
We show our complete list of measures of response valence, content, and structure in Table 6. Following prior work, we included counts of expected and to-be-avoided identities and hand-coded measures of school-focused identity balance and plausibility. We used network analysis to derive two measures of the structural configuration of school content: connectedness and betweenness (described in detail in the section below). So that regression coefficients are comparable across our models, we standardized the predictors in our regression models. Hence, each coefficient should be interpreted as the change in identity-based algorithm scores associated with a 1 standard deviation increase in the predictor.
Table 6
Measures of valence, content, and structure

What is being operationalized; how it is measured; M (SD); range [Min, Max]:

1. Valence without consideration for specific content
   - Number of expected (positive) possible identities: 2.88 (1.40); [0, 4]
   - Number of feared possible identities: 2.77 (1.44); [0, 4]
2. Simple valenced structure of school-focused content
   - School-focused expected identities are balanced by similar school-focused feared identities (Balance): 1.18 (1.01); [0, 4]
3. Simple structure of school-focused content and strategies
   - School-focused possible identities are linked to plausible behavioral roadmaps (Plausibility): 3.34 (1.55); [0, 5]
4. Network-based structure of school-focused content
   - School-focused words are more central (Connectedness): 0.79 (0.49); [0, 4]
   - School-focused words bridge between other words and concepts (Betweenness): 1.87 (1.11); [0, 5.26]
5. Simple school-focused content
   - School-focused content is accessible (count of school words/concepts): 1.71 (2.06); [0, 64]
6. Student literacy and motivation to write more
   - Students who write more tend to have greater verbal ability (count of unique words/concepts): 23.10 (12.17); [0, 64]

Note. We used the balance and plausibility coding from Horowitz and colleagues (2020). They report coding balance to agreement and high inter-rater reliability for plausibility.
Network Measures of Response Content and Structure. Our syntax created a network for each student's possible identity and strategy response, allowing us to operationalize elements of response structure and content at the response level using network analysis.
School Connectedness. We counted the number of connections in each response network to words/concepts that were about school (within network analysis, this value is typically referred to as degree). To make connectedness scores comparable across response networks of varying sizes, we normalized scores by dividing the raw count of connections by the total number of unique words in the response network minus one (Freeman, 1979). We summed the normalized scores to yield a single school connectedness score. The mean connectedness of school, 0.79 (SD = 0.49), implies that in a typical student response, school words/concepts were connected to around 80% of the other words/concepts in the student's network. Values can be greater than one when many school words are highly connected, including to the same nodes. Say a student's response network had 10 word/concept nodes and two school-word nodes, HOMEWORK and POSITIVE SCHOOL TRAITS, were each connected to 6 of the other 9 nodes with some overlap: each would have a normalized score of .67, which when summed would yield 1.33. The score is scaled to the network, which makes interpretation in concrete terms difficult but allows for comparison across networks of different sizes.
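A sketch of this computation with igraph, continuing the illustrative network above; degree(..., normalized = TRUE) divides each node's count of connections by the number of nodes minus one, matching the normalization just described. The node labels are ours.

# Sketch: normalized degree of school nodes, summed into one
# school connectedness score per response network.
library(igraph)

school_nodes <- c("HW_ETC", "GRADES", "PASS")   # school-focused nodes in g
school_connectedness <- sum(degree(g, v = school_nodes, normalized = TRUE))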
School Betweenness. We operationalize betweenness of school-focused words/concepts from student-level network graphs. Betweenness can be thought of as a count of the number of shortest paths in a network that pass through a particular node (Freeman, 1979). Technically, it is the proportion of shortest paths between two words/concepts in the network that pass through a particular word/concept, summed over every pair of words/concepts in the network; this roughly approximates the number of shortest paths that pass through a particular word/concept but is a sum of proportions, not a count. The length of a path in a network refers to the number of nodes passed through on the way from one node to another. As with connectedness, we normalized scores so that student networks are comparable. To do so, we first summed the non-normalized betweenness values of words and concepts that were about school and then divided this sum by the largest possible betweenness value for a single word/concept in the network (M = 1.87, SD = 1.11). Had we normalized a single node this way, the score would be bounded between 0 and 1; because we normalize a sum over a group of nodes, scores can exceed one. This happened when students wrote about school in multiple ways and those nodes were at or near the theoretical maximum betweenness value in the network. The resulting score can be interpreted as the betweenness of school words relative to the network-level maximum betweenness score for a single node. Betweenness considers how school words and concepts connect other words/concepts in the network, whereas connectedness simply counts the number of words/concepts that are connected to school words/concepts.
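A sketch of this computation, continuing the toy network above; dividing by (n - 1)(n - 2) / 2, the maximum possible betweenness of a single node in an undirected network of n nodes, is our reading of the normalization described in the text.

# Sketch: betweenness of school nodes summed, then divided by the maximum
# possible betweenness for a single node in an undirected network.
n <- vcount(g)
raw_betweenness <- betweenness(g, v = school_nodes, directed = FALSE)
school_betweenness <- sum(raw_betweenness) / ((n - 1) * (n - 2) / 2)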
Control variables
We controlled for person-level demographics (gender, race/ethnicity, and poverty) and two response-level variables: the count of unique words/concepts in general (reflecting student literacy) and the count of words/concepts that were about school (reflecting a content-only representation of possible identities).
Analysis plan
To analyze which features of responses are represented in algorithm-based possible identity scores, we conducted a series of Bayesian multilevel regressions, with fall and spring observations nested within students and weakly informative priors for all parameters. Our dependent variable is the student algorithm-based possible identity score. For each of the six predictors listed in Table 6, we ran three models. Predictors were tested one at a time because the various ways of representing possible identities covary (see Supplemental Materials Figure S1). In the first model, the predictor was the only fixed effect included in the model. In the second model, we included dummy codes indicating whether a student was identified in administrative records as female, Latinx, and a recipient of free/reduced-price lunch. In the third model, we additionally controlled for two measures of elaboration and literacy; a sketch of one such model appears below.
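This sketch assumes long-format data with one row per student-by-wave observation; the data frame and variable names are ours.

# Sketch of one Study 3 model: algorithm-based scores nested within
# students, one standardized predictor plus demographic dummy codes,
# and a random intercept per student. Names are illustrative.
library(brms)

long_dat$betweenness_z <- as.numeric(scale(long_dat$school_betweenness))

fit_betweenness <- brm(
  pi_score ~ betweenness_z + female + latinx + frl + (1 | student_id),
  data   = long_dat,
  family = skew_normal(),   # algorithm-based scores inherit GPA's skew
  chains = 4, iter = 4000, warmup = 2000
)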
Results
Descriptive analysis
In Figure 7, we show correlations of fall-of-8th-grade measures of possible identity and strategy content, structure, and valence with prior GPA and demographic variables (we provide full reporting of correlations in Supplemental Table S6). In general, prior grades were more closely related to structural operationalizations of possible identities. Students who had better grades in 6th grade tended to write more in general, generated more to-be-avoided possible identities, had more balanced school-focused possible identities, connected more school-focused identities to plausible strategies, and were more likely to bridge responses with school-focused content. Students who had done well in 7th grade, more proximal to the time they listed their possible identities, tended to write more about school and in general. Only one demographic covariate was related to possible identities: female students tended to write more than male students.
Figure 7
Correlations of predictors with previous GPA and demographic variables
Note. Correlation values greyed out when p>.05. Colors correspond to sign (red is negative,
green is positive) and magnitude (deeper shades of colors are greater in magnitude)
Regression analysis
We present the results of our regression models in Figure 8, highlighting the standardized
regression coefficient for each operationalization of possible identities in a model with no
controls (light grey), demographic controls (darker grey), and controlling for demographics and
total and school-focused word/concept counts (black).
Model Controls. As detailed in Supplemental Materials, across models, female students tended to receive higher algorithm-based identity scores than male students (standardized coefficients ranged from .02 to .03), though scores did not differ on other student attributes (Latinx or not, recipient or not of a free or reduced-price lunch benefit). Regarding response attributes, responses tended to receive higher scores if they included more, compared to fewer, unique words/concepts in total (standardized coefficients ranged from .04 to .05), though scores did not differ for responses that included more compared to fewer school-related words/concepts.
Effects of Valence. As we show in the top two panels of Figure 8, student responses that included a greater number of expected identities and to-be-avoided identities tended to receive higher algorithm-based possible identity scores. This association was robust to including demographic covariates but not to including writing-focused covariates, implying that valence-based coding captures information about writing style rather than the active ingredient of possible identities that affects academic outcomes. That is, possible identity coding based on the count of expected and to-be-avoided identities is no longer a reliable predictor of algorithm-based scores once the total number of unique words/concepts, alone or together with the number of school-focused words/concepts, is included as a covariate.
Figure 8
School betweenness is the only feature of responses robustly related to possible identity scores
Note. Error bars are 95% credible intervals. Predictors listed on the y-axis were entered in separate models.
Effects of Balance and of Plausibility. As we show in the middle two panels of Figure 8, student responses that included a balance of school-focused expected and to-be-avoided identities or more school-focused possible identities with plausible roadmaps for action tended to receive higher algorithm-based possible identity scores. Each of these associations was robust to including demographic covariates but not to including writing-focused covariates, implying that simpler versions of structure-based coding attending to content (balance) or to links with strategies (plausibility) still capture information about writing style rather than only the active ingredient of possible identities that affects academic outcomes. That is, possible identity coding based on either balance or plausibility is no longer a reliable predictor of algorithm-based scores once the total number of unique words/concepts, alone or together with the number of school-focused words/concepts, is included as a covariate.
Effects of School Connectedness. As we show in the second-to-bottom panel of Figure 8, the degree to which school words/concepts are highly connected in a network is not reflected in algorithm-based possible identity scores; credible intervals with and without controls all include zero.
Effects of School Betweenness. As we show in the bottom panel of Figure 8, student
responses with higher school betweenness tended to have higher identity-based algorithm scores.
This effect is robust to controlling for student demographics and writing style (unique words in
total and related to school).
Discussion
Results of Study 3 reveal that our algorithm-based scores are congruent with previous methods of scoring possible identities based on valence, content, and structure, but also that these prior measures capture features of writing style. Indeed, when we control for these features, only our network-based bridging possible identities measure (the school betweenness score) remains reliable. Students who bridged their possible identities and strategies with school-focused words/concepts tended to receive higher algorithm-based scores of their possible identity responses. We conclude that bridging possible identities, that is, possible identities and strategies linked to one another via school-related content, is a key feature of what makes possible identities motivating.
General Discussion
In three studies, we showed that it is possible to isolate the underlying motivationally
relevant features of possible identities that make them matter for academic outcomes. Across
studies, our test sample included predominantly low-income students of color attending urban
high poverty schools, the typical school context of American students of color (U.S. Department
of Education, 2020). We found a small-sized effect of change in possible identities and strategies
on grades, comparable to the effects in prior studies, though we used a Bayesian approach to
appropriately model skew in grades, a necessary statistical technique that other studies have
neglected. Not adjusting for skew would have inflated our effect size two-fold. Though small,
our effect is comparable to the size of gain in standardized tests as students complete an
additional year of schooling (Hill et al., 2008) and of the magnitude found in successful
interventions to improve academic outcomes (Boulay et al., 2019).
Given that our synthesis of prior literature did not give us a strong basis to predict how
possible identities matter, we used a bottom-up approach in Studies 1 and 2. We employed word-
embeddings and machine learning to identify the motivationally relevant features of possible
identities in a large sample, using as our data source student responses to open-ended questions
about possible identities and strategies (Study 1). We contrasted the algorithm we built from
these responses to one we built from how students wrote about something else (an edu-game
they played, Study 2). Our identity-based algorithm scores predicted better grades in school in a
separate sample of students, even controlling for academic attainment in the prior years, student
demographics and the schools they attended (Study 1). We showed that the motivationally active
features of possible identities captured by our identity-based algorithm are distinct from the
structure of student writing in general (Study 2). Next, we used our algorithm-based functional
representation of possible identities and strategies to test predictive power of different theory-
based operationalizations of how possible selves matter against our theoretical concept of
bridging possible identities. Consistent with our novel theoretical frame, we find possible
identities are motivating when they are bridged together by school.
Synthesis with prior research
Like any mental construct, for possible identities to matter for meaning-making and
action, they have to be available in memory, accessible at the moment of judgment, and
experienced as apt, relevant to the task at hand (Oyserman et al., 2017). As we detailed in our
introduction, different theoretical approaches imply different ways in which this might occur.
Construal level theory (Trope & Liberman, 2003; Wakslak et al., 2008) and expectancy-value
theories (Eccles et al., 1983, Wigfield & Eccles, 2002; Vroom, 1963) imply that availability
itself is the key (simply having possible identities or expected possible identities should be
motivating). Our results do not support an availability-is-sufficient explanation. Self-regulatory
fit theory (Higgins, 2005) implies that possible identities are likely to be accessible and
experienced as apt when they are in equal measure about expectations and fears in the same
domain as the outcome (e.g., balanced, Oyserman & Markus, 1990). Our results do not support
this simple structure-based accessibility and aptness explanation. The Rubicon action-phase
model (Heckhausen & Gollwitzer, 1987) and theory of action goals and implementation
intentions (Gollwitzer & Sheeran, 2006) imply that for possible identities to matter, they have to be connected with concrete implementation plans, that is, strategies for action. Both the standard hand-coded measure of school-focused identity balance and the measure of strategy plausibility were imprecise and not robustly related to the functional features of possible identities when controlling for how much students wrote. Hence, our results do not provide strong support for this version of a structure-based accessibility and aptness explanation.
A number of more complex structure-based approaches have suggested particular features of structure that our machine-learning-based approach could not directly account for. These specific, not-machine-capturable features of the structure of the future self include experiencing the current and future self as part of the same self (e.g., Hershfield & Bartels, 2018; Hershfield et al., 2011; Lewis & Oyserman, 2015; Nurra & Oyserman, 2018) or as a contrasting standard against which the current self can be evaluated (often in a particular way; Oettingen et al., 2005). Another not-machine-capturable feature of the structure of the future self is the extent to which possible identities have concrete behavioral strategies that address interpersonal barriers (Oyserman et al., 2004; Oyserman & Lewis, 2017). Our results cannot directly support or refute these possibilities.
To make progress, we synthesized identity-based motivation (Oyserman et al., 2017) with cognitive (e.g., Collins & Loftus, 1975) and cognitive network science (e.g., Siew et al., 2019) approaches to how spreading activation works. Much like classic social network analyses (e.g., Granovetter, 1973), we argued that bridging is a particularly powerful feature of the structure of future-self associative knowledge networks. Our novel bridging possible identities prediction advances theory by postulating a new formulation of how possible identities come to be accessible and apt and hence affect important outcomes over time. Regarding accessibility, bridging means that a possible identity is more likely to come to mind even if another one comes
to mind first. Regarding aptness, bridging means that an accessible possible identity is more
likely to feel relevant to the task at hand and that the task will be understood in a particular way.
We focused on the features of possible identities relevant to school attainment. In our case,
bridging possible identities were school-related aspects of a possible identity or strategy. From
an ecological perspective, having school-focused bridges implies that no matter what other
aspect of the future self comes to mind, a school-focused aspect is also likely to come to mind.
For a particular possible identity, school might mostly not reflect academic attainment: for a possible 'popular me' it could be a context for making friends; for a possible 'made the team me' it could be a context for playing sports. However, across possible identities, these lower-level features of school could not form a bridge, as they are idiosyncratic to one possible identity. Hence, we suspect that bridges also highlight the higher-level, essential features of school that do focus on academic attainment. In this sense, bridging identities capture a construal-level theory idea (Trope & Liberman, 2003). Another possibility is that it is not having a possible self but how detailed the representation is that matters. Detailed elaboration can increase the likelihood of remembering and retrieving possible identities via depth of processing (e.g., Bartsch & Oberauer, 2021; Craik & Tulving, 1975).
We interpret our bridging findings as congruent with, but going beyond, the idea that elaboration matters. When school is a bridge between other aspects of a person's mental representation of their identities, school is essentially the connective tissue holding the rest of the network together. Removing the school link would silo different identities. Reinforcing the link would promote, as Malcolm X suggested, the idea that school is the "passport to the future". While we did not directly test the identity-based motivation theory prediction that future selves are motivating when they are experienced as part of the current self (Oyserman &
James, 2009), our bridging construct is clearly congruent with this idea. Bridging possible
identities offers a concretizing way of operationalizing the process by which many identity-based
interventions work. Indeed, many school-based interventions to improve school outcomes invoke
change in either possible identity content, structure, or both, as an active ingredient or a putative
mediator. Sometimes this is explicit as in interventions in middle schools (Elliot et al., 2011;
Oyserman et al., 2006; under review; Wooley et al., 2013) and high schools (Lee et al., 2015;
Rinaldi & Farr, 2018). Other times this change is an implicit part of the intervention process as in
interventions in middle schools (Ansong et al., 2018; Destin & Svoboda, 2017) and in colleges
(Stephens et al., 2014; Stephens, et al., 2015). These studies often do not measure possible
identities. By providing a machine-coding algorithm and our network-derived bridging measure,
we provide a path for testing processes in these interventions.
Limitations and future directions
Like any study, ours has limitations and leaves some questions unanswered. We focus on
culture-based generalizability and features of our outcome measure and of the data we used to
construct our algorithm. First, regarding culture-based generalizability, we trained and tested our machine-scoring algorithm in samples of students that varied in their race/ethnicity, SES, developmental phase, and geographic location, suggesting that our algorithm-based scoring approach is
likely to generalize to most American student contexts. At the same time, like most of the studies
we reviewed, our study focused on participants from the United States. Hence we cannot address
cultural generalizability or rule out that larger effects of school-focused possible identities stem
from cultural difference (e.g., Bi & Oyserman, 2018). For example, the likelihood that students
have bridging possible identities might depend on culture-based features of selves (independent
and interdependent self-construals, Hamedani & Markus, 2019; individualist-mindset and
collectivistic-mindset, Oyserman, 2017). For instance, the types of content that need to bridge possible identities may depend on whether a person has a situationally or chronically accessible focus-on-the-main-point, individualistic mindset or a focus-on-connection, collectivistic mindset (e.g., Oyserman et al., 2009). We could not address these questions in our sample, though future
research could elaborate on how cultural knowledge structures that organize self-relevant
knowledge interact with possible identities to produce motivational and behavioral
consequences.
Next, consider features of our outcome measure and of the data we used to develop our algorithm. We trained and tested our machine-learning algorithm using student cumulative grade
point average (GPA) as our outcome measure. We did so because GPA is both a real-world
outcome with real consequences for students’ lives and tends to be measured in similar ways
across American school contexts, facilitating our use of student samples from a total of 18
different schools to train and test the algorithm. At the same time, grade point averages are either
not obtained in the same way in other parts of the world or are not used in the same way in
determining student chances and movement through an educational system. Many countries use
high-stakes standardized tests, combined with other factors to track students as they leave
elementary schools into different kinds of school systems that train them for different future
careers and post-secondary educational trajectories. Future research in these contexts might consider using these kinds of tests as a more culturally appropriate criterion, which would require that researchers use the methods for building and testing that we describe rather than simply applying our algorithms.
We used student responses to open-ended prompts about their next-year expected and to-
be-avoided possible identities and any strategies they were currently using to work on them. Our
approach provided a rich idiographic basis from which to build an algorithm that can illuminate
the motivationally relevant aspects of possible identities and strategies. As we reviewed in our
introduction, two-thirds of studies testing the predictive effect of possible identities on academic
outcomes use a single item measure (e.g., how far do you expect to go in school or do you expect
to go to college) while the other third uses simple structure-based coding (e.g., how many
possible identities, what valence, are they balanced, are they plausibly linked to strategies). We
suspect this is in part because coding open-ended data is onerous and resource intensive. Our algorithm and network-based measures of structure allowed us to quantify and capture a richer, more nuanced feature of structure: bridging. What is more, each approach can be computed within a day, whereas hand-coding and attaining reliability may take weeks or months. At the same time, future research could increase the richness of the data by linking to other ecologically valid methods, including daily-diary and experience-sampling designs, to connect mental representations of possible identities to actual behaviors.
Conclusion
Returning to our opening quotes, Drake could be right about the future being "spotless", but our results suggest that simply thinking about the future is unlikely to change behavior in the present or shape downstream outcomes. Scrooge could be right that the future is like a stick that prods us toward our better selves, but our results suggest that a more effective stick would be one in which the central goal shows up across situations and bridges possible identities. Our results converge best with Malcolm X: the future all comes back to school. Bridging possible identities matters; students with possible identities bridged by school are more likely to achieve the futures they hope for and avoid those they fear.
Appendix A
Preprocessing text and applying Word2Vec word-embeddings to quantify features of
responses
To train the algorithm and derive algorithm-based scores, we first cleaned and pre-
processed student text (all of their responses to the possible identity and strategies questions, or
all of their responses to four questions about their experience with an edu-game, as detailed in
Supplemental Materials). Next, we used word-embeddings to quantify responses into a set of 300
features based on a general model for how words co-occur in Google’s corpus of news articles
written in English (Mikolov et al., 2013). Word2Vec represents words used in similar contexts as
closer together in lexical space based on 300 features. For example, “studying” is more often
used in context of other school words; “cooking” in context of words about home or work.
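The sketch below shows one common way to turn a cleaned response into 300 features, averaging the Word2Vec vectors of the words it contains. This is an illustration of the general idea only; the matrix emb is assumed to hold one 300-dimensional row per vocabulary word (e.g., loaded from the pretrained Google News model), and the exact aggregation we used is documented in our shared syntax.

# Sketch: represent a cleaned response as 300 features by averaging word
# vectors; emb is an assumed vocabulary-by-300 embedding matrix.
embed_response <- function(tokens, emb) {
  known <- tokens[tokens %in% rownames(emb)]   # drop out-of-vocabulary words
  if (length(known) == 0) return(rep(NA_real_, ncol(emb)))
  colMeans(emb[known, , drop = FALSE])         # mean vector = 300 features
}

features <- embed_response(c("studying", "college", "homework"), emb)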
Using Support Vector Regression to Create Our Scoring Algorithms Across Studies
At the training phase, we used the numeric inputs from our Word2Vec representation (student possible identities and strategies in Study 1; other student writing in Study 2). Specifically, we used the e1071 package in R (Dimitriadou et al., 2009) to train models with support vector regression (SVR), using the 300-dimensional representation of student responses to predict end-of-year GPA. SVR is a supervised machine learning method that predicts an outcome variable from a set of predictor variables (Vapnik, 2000). We selected a final SVR model for each scoring algorithm using the standard method: grid search and 10-fold cross-validation (Bergstra & Bengio, 2012; see Supplemental Materials for technical details). Ten-fold cross-validation and grid search identify the best-performing model in the training sample, which we then applied to the test sample.
Appendix B
Measuring Possible Identities Across Studies
Figure A.1
Visualization of how we measured possible identity and strategies
Appendix C
Combining Dictionary-based Methods and Network Representation to Operationalize
Content and Structure of Possible Identities in Study 3
Our goal was to capture the concepts and the various ways these concepts were structured
in students’ minds while not losing the rich ideographic representations that our possible identity
and strategy responses provided. We could not use raw text networks to attain our goal. No two
students used the exact same set of words so representing response structure with their actual raw
text would only reveal sentence syntax. Hence, we focused on the semantic content of words by
categorizing each word based on form of speech (whether it was a noun, verb, or adjective) and
what it was about (Table A1 is our full dictionary). Some words were very common and were clearly relevant when considered alone, so we did not collapse them into a single category. For example, "school" was the most common and "homework" the 6th most commonly used word. We classified each as being about school in our analyses but did not collapse them into a single broader category because they were so common and hence relevant to network structure.
As we detail in our supplemental materials, we used prior work (Oyserman & Markus, 1990;
Oyserman et al., 2004) and a two-step iterative snowball procedure to develop dictionaries with
good lower-level categorization coverage of the types of words students use. We represented
words as belonging to one of our dictionaries (41% of words) or to no dictionary at all. Words
that did not fit into dictionaries tended to be more ambiguous (e.g., “judgment” could refer to
concerns about being judged by others or about using good judgment, “people” could be good
people, bad people, shady people, or refer to helping other people) or idiosyncratic (e.g.,
“heaven”, “youtube”). By focusing on lower-order categorizations and retaining some unique
words we maintain the structure of student responses while reducing idiosyncrasies enough to
make comparisons feasible.
Creating Network Graphs
We used the igraph package (Csardi & Nepusz, 2006) to derive student-level fall and spring network graphs based on this dictionary-reduced set of responses. We understand this dictionary-reduced set (see Figure 6, Panel 3) as a representation of students' in-the-moment accessible possible identities and strategies. We used our network graphs to calculate measures of word/concept count, school word/concept count, school connectedness, and school bridge/betweenness centrality.
Table A1
Categories used in network analysis

Health
  SPORT_N (Sport Nouns): court, basketball, team, ball, baseball, volleyball, soccer, varsity, athlete, football, player, sport
  SPORT_V (Sport Verbs): train, play, dribble, layup
  HEALTH_N (Health Nouns): diet, athletic, healthy, unhealthy, milk, water, fat, food, fruit, strong
  HEALTH_V (Health Verbs): run, swim, eat, drink, exercise, sleep

Interpersonal Context
  SOC_NEU (Social Nouns, Neutral): people, kid, crowd, person, peer, boy, girl, classmate
  SOC_NEG (Social Nouns, Negative): bully, drama, gang, enemy, unfriendly
  FRIEND (Friends): friend, boyfriend, girlfriend
  FAMILY (Family): sister, brother, dad, mom, mommy, family, cousin, parent
  SOC_ADJ_POS (Positive Social Traits): funny, nice, outgoing, positive, friendly, faithful, patient, helpful, respectful
  SOC_ADJ_NEG (Negative Social Traits): annoy, shady, dumb, immature, stupid, fake, rude, angry

Valence
  POS (Positive Descriptors): amazing, exceptional, good, awesome, cool, fabulous
  NEG (Negative Descriptors): bad, wrong, poor, atrocious, negative

Off-track/non-normative
  SUBSTANCE (Substance Use): drugs, alcohol, drug, smoking, smoke, beer
  Uncategorized Off-track (frequent words that did not fit into the other categories): trouble

School focused
  GRADES (Grades): grades, GPA, a's, b's
  PASS (Passing or graduating): graduate, pass
  FAIL (Failing): fail, dropout
  ATTN (Attention Words): focus, attention, listen, concentrate
  MOTIV (Motivation Words): motivated, effort, hardworking, goal
  HS (High School): freshman, high school, 9th, ninth, sophomore
  MS (Middle School): 8th, eighth, middle school
  SCH_ADJ_POS (Positive School-Related Traits): responsible, organized, smart, mature, intelligent, hardworking, determined, scholar
  SCH_ADJ_NEG (Negative School-Related Traits): laziness, procrastination, distraction
  SCH_SUB (Specific Subjects): math, algebra, science, English, calculus
  BEH_SCH_NEG (Negative School Behaviors): skipping, detention, expulsion
  BEH_SCH_POS (Positive School Behaviors): reading, studying, writing, learning
  ADV_CLASS (Advanced Student Things): valedictorian, honor, AP, ap, 4.0, A, advance
  HW_ETC (Homework and Classwork): homework, assignment, worksheet, classwork
  TESTS_ETC (Tests and Exams): test, exam, quiz
  TEACH (Teacher, in multiple declinations): teacher, teach
  Uncategorized School Words (frequent words that did not fit into the other categories): college, school, learn, student

Note. Words appear in this table to facilitate interpretation. Syntax includes stemmed forms and multiple declinations of words.
Supplemental Materials
Assessing Model Convergence and Details on our MCMC approach.
We used Markov chain Monte Carlo (MCMC) methods to obtain the posterior distribution for each model. We used the recommended number of chains (4) and iterations (4000, of which 2000 were warmup), with the target acceptance rate set at .90. To assess convergence, we ensured that there were no divergent transitions post-warmup, that the Gelman-Rubin statistics were under 1.01 for all parameters, that the trace plots showed good mixing for the chains, that the posterior density plots showed no multimodality, and that random draws from the posterior predictive distribution approximated the sample distribution (Gelman & Rubin, 1992; Gelman & Shirley, 2011).
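For a model fitted with brms (here a placeholder object fit), these checks map onto standard calls:

# Sketch of the convergence checks for a fitted brms model `fit`.
library(brms)

summary(fit)    # reports Rhat (Gelman-Rubin statistics) for all parameters
plot(fit)       # trace plots (chain mixing) and posterior density plots
pp_check(fit)   # posterior predictive draws against the observed outcome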
Skewed Grades, Skew-Normal Distribution, and Use of Bayesian Approach
Grade-point average is skewed rather than normally distributed. Most grades are high.
Indeed, in some school districts, the modal grade is an ‘A’. This grade inflation is ubiquitous and
well-documented (Adelman, 2006; Attewell & Domina, 2008; Bishop, 2000; Breland et al.,
2002; Camara et al., 2003; Chowdhury, 2018; Geiser, 2009; Godfrey, 2011; Kirst & Venezia,
2001; Nikolakakos et al., 2012; Zirkel, 1999). Traditional approaches to modeling are likely to
misrepresent the expected negative skew in academic outcome data. Bayesian modeling has the advantage of tremendous flexibility with respect to defining expectations for how certain parameters are likely to be distributed. In our models, we take advantage of this flexibility by modeling outcomes (GPA in Studies 1 and 2; algorithm-based possible identity scores in Study 3) with what is called a skew-normal distribution: a generalization of the normal distribution that allows for skewness (Azzalini, 1985). Though frequentist methods may be able to accommodate adjustments to expected distributions, they are often difficult to implement in contrast to Bayesian methods. The advantage of using the skew-normal distribution is that, unlike other methods, it does not entail removing extreme cases or transforming data. Instead, the skew-normal distribution includes a skew parameter (α) in addition to the mean and standard deviation. This parameter indicates the extent of skew in the outcome and can be inferred from the data, making modeling of grades feasible without adding much complexity to analyses (Azzalini, 1985; Azzalini & Dalla Valle, 1996). In our data, we looked for and found skew in GPA and in our machine scores and used a skew-normal distribution in our Bayesian regression models. In general, though our models using the skew-normal distribution fit the data better, they tended to have lower R-square values, reflecting that methods not accounting for skew may overstate effect size estimates.
Preprocessing student text
We used the NLTK package to preprocess text in python and the tidytext and hunspell
packages to preprocess text in R before proceeding with any other text-based analyses. Two
features of text-based responses make it necessary to first preprocess responses before
conducting other types of analysis. First, words in open-ended text are often misspelled. We
used spell check functions to automatically correct for misspellings. In study 3, we went one step
further and manually corrected some common mistakes (e.g., “studying” was misspelled as
“studding”—see table S1). Second, the most common words in writing tend to be “stop words”
(e.g., “and”, “but”, “with”)—words that are functional in syntax but create noise in word
embedding models and network based representation of text. We used pre established lists of
stop words and removed them out of student responses. In our syntax, we are clear on how we
tokenized text in order to facilitate text cleaning.
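A minimal sketch of these steps with tidytext and hunspell; the object and column names, and the example sentence, are ours.

# Sketch: tokenize, remove stop words, and spell-correct with hunspell.
library(tidytext)
library(hunspell)
library(dplyr)

tokens <- tibble(id = 1, text = "I want to pas my clases") |>
  unnest_tokens(word, text) |>
  anti_join(stop_words, by = "word")      # remove stop words

misspelled <- !hunspell_check(tokens$word)             # flag misspellings
suggestions <- hunspell_suggest(tokens$word[misspelled])
tokens$word[misspelled] <- vapply(                     # take top suggestion
  suggestions,
  function(s) if (length(s) > 0) s[[1]] else NA_character_,
  character(1)
)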
Table S1
Manual corrections of commonly misspelled words

Misspelled      Corrected
broing          boring
flow            follow
relly           really
dont            don't
estudiante      student
examenes        exams
honers          honor
im              I'm
mejor           best
quater          quarter
sacar           take
softmore        sophomore
tarea           homework
dictrons        directions
Data Preprocessing for networks specifically. We preprocessed student raw text in 5
steps. First, we used the tidytext package (Silge & Robinson, 2016) to extract bigrams (pairs of
adjacent words) from student responses. Second, we spellchecked responses using the hunspell
package in R (Ooms, 2020), replacing any misspelled words with their correct spelling. Third,
we removed stop words (e.g., “and”, “but”, “or”) from possible identity responses. Fourth, we
reduced words to their stems using hunspell (e.g., "failing" was reduced to "fail"). This step was
necessary so that inflected forms like "fail" and "failing" were represented as a single node in the
network diagram. Fifth, we visually examined the output, correcting any remaining spelling
mistakes (e.g., a student who wrote "studding" instead of "studying", which was not caught by
spell check because "studding" is a correctly spelled word). These manual corrections are fully
documented in our syntax, which is shared online (github link). This preprocessing facilitated the
next step of preparing the data for network analysis: grouping words based on their more general
meaning and function in speech.
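The sketch below illustrates steps 1, 3, and 4 of this pipeline (bigram extraction, stop-word removal, and stemming); spell checking and the manual corrections in Table S1 proceed as described above. All object names are illustrative.

# Build a word-to-word edge list for the network graphs.
library(dplyr)
library(tidyr)
library(tidytext)
library(hunspell)

stem_word <- function(w) {
  s <- hunspell_stem(w)[[1]]              # candidate stems, e.g., "failing" -> "fail"
  if (length(s) > 0) s[length(s)] else w  # one simple choice among candidates
}

edges <- responses %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%  # adjacent pairs
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  filter(!is.na(word1), !is.na(word2),
         !word1 %in% stop_words$word,     # remove stop words
         !word2 %in% stop_words$word) %>%
  mutate(word1 = vapply(word1, stem_word, character(1)),
         word2 = vapply(word2, stem_word, character(1))) %>%
  count(word1, word2, sort = TRUE)        # edge weights for the network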
Training the classifier
We trained our GPA-based possible identity and strategy algorithm as follows. First, we
combined Fall possible identity and strategy text for a given student into a single block of text
(see Table 1). Second, we prepared the text for natural language processing by removing stop-
words (e.g., “and”, “but”, “or”, etc.) and spell-checking student responses. Third, we represented
student responses using Word2Vec, yielding a vector of 300 dimensions for each student
response. Fourth, using this numeric representation of student responses, we tuned a support-
vector regression to predict end-of-year grades on the basis of students' possible identity and
strategy responses. Support vector regression is a supervised machine learning method that
predicts an outcome variable on the basis of a set of predictor variables. Whereas ordinary least
squares regression fits a model that minimizes residual error, support vector regression fits a
model with a set of constraints on the allowable error in the model. Within these constraints, the
goal of support vector regression is to minimize the coefficients of the regression equation.
Minimizing the coefficients in this way circumvents overfitting to some extent.
Predicting the outcome variable (e.g., GPA) on the basis of the classifier entailed an
iterative modeling process examining algorithm performance (mean squared error; MSE) using
10-fold cross-validation while varying two parameters: epsilon and cost. Support vector
regression models with lower epsilon values use a greater number of support vectors in the
training process and, hence, tend to be more accurate. The cost parameter penalizes large
residuals, so a larger cost results in a more flexible model with fewer training misclassifications;
in effect, the cost parameter adjusts the bias/variance trade-off. The greater the cost parameter,
the more variance and the less bias in the model. The final model had an epsilon of .40 and a
cost of 4.
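A minimal sketch of this tuning step using the e1071 package in R is shown below; X stands for the 300-dimensional Word2Vec matrix and y for end-of-year GPA, and the grid values are illustrative rather than our exact search grid.

# Grid search over epsilon and cost with 10-fold cross-validation.
library(e1071)

tuned <- tune(
  svm, train.x = X, train.y = y,
  type        = "eps-regression",
  ranges      = list(epsilon = seq(0.1, 0.5, by = 0.05),
                     cost    = c(1, 2, 4, 8, 16)),
  tunecontrol = tune.control(cross = 10)  # 10-fold CV scored by MSE
)
tuned$best.parameters  # our final model used epsilon = .40 and cost = 4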
Detailed documentation of priors Study 1
We outline our priors for each step of the modeling process below. At step 1, we included
the outcome-based algorithm residualized change score. We modeled current year GPA using a
skew-normal distribution (Azzalini, 1985) with a mean of 3, a standard deviation of 1 and a skew
parameter α. We follow Gelman (2006), defining the prior for standard deviation in GPA as a
student t distribution. We estimate the skew parameter (α) rather than guessing at its value,
assuming it is distributed normally with a mean of 0 (representing no skew) and a standard
deviation of 4. This is the estimate suggested for a relatively weak default prior in the brms
package (Bürkner, 2018). It allows the α parameter to take values that represent either negative,
positive, or no skew, letting the data define the parameter. We assume the effect of Fall to Spring
change in outcome-based possible identity scores is normally distributed with a mean of 0 and
standard deviation of 1. This weakly informative prior provides some shape for the posterior
effect distribution and allows it to take values that are reasonable given our expectations (a beta
coefficient no greater than |1|). We evaluate our use of the skew-normal distribution by
comparing the fit of this model to a parallel model assuming GPA is normally distributed.
GPA ~ SkewNormal(3, 1, α)
σ ~ StudentT(4, 0, 1)
α ~ N(0, 4)
β(Δ Possible Identity Score) ~ N(0, 1)
At Step 2, we include school effects because we assume that some variance in GPA is
attributable to differences between the schools students attend. We model the random effects of
schools on GPA. Specifically, our priors for the effect of change in school-focused possible
identities on GPA, with between-school varying intercepts, and a prior for the amount of
between-school variance in GPA (τ) follow Gelman (2006). We define the prior for τ as a
gamma distribution with shape 2 and rate 1/0.82 (≈ 1.22). The rate is based on the maximum
plausible between-school difference (SD) in scores and is taken as 1/SD in GPA.
GPA_ij ~ N(μ_j, σ)
μ_j ~ N(γ, τ)
γ ~ SkewNormal(3, 1, α)
β(Δ Possible Identity Score) ~ N(0, 1)
σ ~ StudentT(4, 0, 1)
τ ~ Gamma(2, 1/0.82)
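For concreteness, the sketch below shows one way these Step 1 and Step 2 priors might be written in brms; the mapping of our notation onto brms's parameter classes is an assumption, and variable names are illustrative.

# Priors for the skew-normal multilevel model of GPA.
library(brms)

priors <- c(
  prior(normal(3, 1), class = "Intercept"),              # GPA centered near 3
  prior(normal(0, 1), class = "b"),                      # change-score effect
  prior(student_t(4, 0, 1), class = "sigma"),            # following Gelman (2006)
  prior(normal(0, 4), class = "alpha"),                  # weak default skew prior
  prior(gamma(2, 1.22), class = "sd", group = "school")  # tau, rate = 1/0.82
)

fit <- brm(gpa ~ delta_pi_score + (1 | school),
           data = student_data, family = skew_normal(),
           prior = priors, chains = 4, iter = 2000)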
At step 3, we added 6th and 7th grade GPA. Prior-year GPA likely has a bearing on current-year
GPA but, given that we had little information, our priors are weakly informative.
β(Δ School-Focused Possible Identities) ~ N(0, 1)
β(6th Grade GPA) ~ N(0, 1)
β(7th Grade GPA) ~ N(0, 1)
At step 4, we added demographic covariates. Specifically, we added whether
participants received free or reduced-price lunch, were Latinx, and were female. Mickelson,
Bottia, and Lambert (2013) document that demographic effects on GPA can vary by school
context. Therefore, our demographic priors are weakly informative.
β(Δ School-Focused Possible Identities) ~ N(0, 1)
β(Female) ~ N(0, 1)
β(Latinx) ~ N(0, 1)
β(FRPL) ~ N(0, 1)
Method For Dictionary Development
To develop the dictionary categorization shown in Appendix C of the main text, we
started with superordinate categorizations derived from previous work (Oyserman & Markus,
1990; Oyserman et al., 2004), categorizing words broadly into content focused on school,
interpersonal relationships, off-track concerns (e.g., staying away from drugs), or health and
wellness. Under each of these categorizations, we made subcategorizations of words based on
valence and whether the word was a noun, verb, or adjective. We identified two additional
categories for general valence: content-free positive descriptors ("good", "great", "awesome",
etc.) and negative descriptors ("bad", "terrible", "worst"). These categories served as our starting
point for using a snowball approach to identifying other words that belong to the same category.
Our snowball approach used two methods iteratively to expand the dictionaries to
increase coverage of student responses and to more effectively represent the content of student
responses. First, we used tidytext to tokenize responses and count the frequency of word pairings
(bigrams) in the entire set of student responses. This approach highlighted obvious cases of a
frequent word pairing that fit into our dictionary categorization scheme. We applied the modified
dictionary and proceeded to the next step in our iterative process. In the second step of our
iterative process, we derived Fall and Spring network graphs from the word-to-word connection
frequency counts calculated in the first step. We visually inspected the network graphs to
identify remaining words that fit into our dictionary categorization scheme. We repeated the first
and second steps iteratively, each time expanding the coverage of our dictionary-based
categorization of the words students used in their possible identity responses. We show our
finalized dictionaries in Appendix C. This approach is well suited to increasing dictionary
coverage for a particular set of text data, but it is not meant to provide exhaustive coverage of all
possible English words that might fit into those dictionary categories.
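A minimal sketch of the coverage check we ran between iterations is shown below, assuming the tokens object from the preprocessing sketches above and a lookup table dictionary with columns word and category (illustrative names).

# Check how much of the student text the current dictionaries cover.
library(dplyr)

coverage <- tokens %>%
  left_join(dictionary, by = "word") %>%  # tag each word with its category
  count(category, sort = TRUE)            # uncategorized words appear as NA
coverage                                  # NA rows are candidates for the
                                          # next snowball iteration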
Study 1 Supplemental Results
Using a Skew-Normal Distribution for GPA
We compared a model assuming GPA is normally distributed to one assuming a
skew-normal distribution. We found worse model fit when GPA was modeled using a normal
distribution (Figure S1; WAIC and LOOIC = 619.2; difference = 35.8, SE difference = 6.5) than
when it was modeled using a skew-normal distribution (Figure S2; WAIC = 547.6, LOOIC =
547.7). In Figure S1, the observed distribution of GPA is denoted in dark blue and the
distributions implied by sampling the posterior distribution are plotted in light blue. As can be
seen, assuming a normal distribution misrepresents the outcome, failing to capture the observed
negatively skewed distribution of grades. The posterior distribution in Figure S1 also includes
many values above the 4.0 maximum GPA. In contrast, in Figure S2, the outcome is assumed to
follow a skew-normal distribution. As can be seen, the posterior distribution represents the
sample distribution fairly well, capturing the expected skew in student GPA.
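A minimal sketch of this comparison is shown below, assuming fit_normal and fit_skew are brms fits that differ only in family (gaussian() versus skew_normal()).

# Compare fit via WAIC/LOO and inspect posterior predictive distributions.
library(brms)

fit_normal <- add_criterion(fit_normal, c("waic", "loo"))
fit_skew   <- add_criterion(fit_skew,   c("waic", "loo"))
loo_compare(fit_normal, fit_skew, criterion = "loo")  # difference and SE

pp_check(fit_skew)  # posterior predictive plot, as in Figures S1 and S2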
Figure S1
Sample distribution (y) compared to posterior distribution (yrep) with normal distribution priors
and with change in possible identity score as the predictor
Note. Black vertical line represents the sample mean GPA.
Figure S2
Sample distribution (y) compared to posterior distribution (yrep) with skew-normal distribution
priors and with change in possible identity score as the predictor
Note. Black vertical line represents the sample mean GPA.
Table S2
Effects of Change in Outcome-based Algorithm-Derived Possible Identity Scores on Current Year GPA

Parameter                    Step 1 b [95% CI]     Step 2 b [95% CI]     Step 3 b [95% CI]     Step 4 b [95% CI]
Δ Possible Identity Score    0.84 [0.453, 1.225]   1.17 [0.734, 1.573]   0.41 [0.155, 0.668]   0.35 [0.093, 0.611]
6th Grade GPA                -                     -                     0.44 [0.371, 0.515]   0.44 [0.362, 0.51]
7th Grade GPA                -                     -                     0.21 [0.133, 0.291]   0.20 [0.114, 0.283]
Female                       -                     -                     -                     0.09 [0.016, 0.169]
Free/Reduced Lunch           -                     -                     -                     -0.10 [-0.238, 0.028]
Latinx                       -                     -                     -                     -0.01 [-0.159, 0.136]
Intercept                    3.01 [2.933, 3.09]    3.07 [2.856, 3.277]   2.99 [2.83, 3.137]    3.04 [2.824, 3.257]
Random Effect of School      -                     0.28 [0.12, 0.567]    0.23 [0.111, 0.431]   0.24 [0.108, 0.449]
R²                           0.03 [0.005, 0.055]   0.13 [0.05, 0.217]    0.79 [0.752, 0.818]   0.79 [0.748, 0.818]
ICC                          -                     0.14 [0.019, 0.465]   0.28 [0.071, 0.672]   0.29 [0.065, 0.686]
loo IC                       516.95                506.65                199.09                202.79
WAIC                         516.93                506.45                198.63                201.44

Note. Δ Possible Identity Score = residualized change score.
Study 2 Supplemental Results
Table S3
Algorithmic Representation of Possible Identities Is Specific to How Students Write About Their Future

Parameter                      Step 1 b [95% CI]     Step 2 b [95% CI]
Δ in Possible Identity Score   0.86 [0.45, 1.28]     0.30 [0.03, 0.58]
Δ in Student Feedback Score    -0.08 [-1.18, 1.01]   0.77 [-0.14, 1.68]
6th Grade GPA                  -                     0.20 [0.12, 0.28]
7th Grade GPA                  -                     0.44 [0.37, 0.51]
Female                         -                     0.09 [0.01, 0.165]
Free/Reduced Lunch             -                     -0.10 [-0.24, 0.041]
Latinx                         -                     -0.01 [-0.148, 0.135]
Intercept                      3.01 [2.94, 3.08]     3.03 [2.81, 3.25]
R²                             0.03 [0.005, 0.059]   0.79 [0.753, 0.820]
Random Eff. of School (ICC)    -                     0.29 [0.071, 0.655]
loo IC                         200.75                518.25
WAIC                           199.83                518.17

Note. Δ in Possible Identity Score and Δ in Student Feedback Score = residualized change scores.
Detailed description of measures used in Study 3
Word/Concept Count
As a measure of general elaboration, we calculated word/concept counts using
response-level network graphs. Scores represent the number of unique words/concepts a student
generated in their possible identity response that were not stop words (M = 23.10
words/concepts, SD = 12.17).
School Word/Concept Count
We calculated school word/concept counts as the number of unique words/concepts a
student generated that fit into one of the school-focused dictionaries or into our more complete
list of school-focused words (M = 1.71, SD = 2.06).
Count of Expected and Feared Possible Identities
We counted the number of non-blank responses provided in each of the four response
boxes for expected (M = 2.88, SD = 1.40) and feared (M = 2.77, SD = 1.44) possible identities,
respectively.
Human-coded Measures of Response Content and Structure
For our human-coded measures of balance and plausibility, we used the coded data from
Horowitz and colleagues (2020).
Balance. Balance scores (Oyserman, Gant, & Ager, 1995) represent the count of pairs
of positive (expected, e.g., "getting good grades") and negative (feared, e.g., "not flunking
classes") possible identities that are school-focused (M = 1.18, SD = 1.01). Horowitz et al.
(2020) double-coded all responses and discussed inconsistent codes to agreement.
Plausibility. To assess whether student school-focused possible identities were
connected to strategies that could plausibly yield their desired result, two raters used a rubric
(see Oyserman et al., 2004) to score student responses on a scale from 0 (no school-focused
possible identities or one vague school-focused possible identity without strategies) to 5 (four or
more school-focused possible identities with four or more linked strategies and at least one
strategy that considers interpersonal aspects of the school context, e.g., "getting along with
teachers"); M = 3.34, SD = 1.55. Horowitz et al. (2020) reported substantial inter-rater
agreement (Cohen's Kappa = .96; Percentage Agreement = 88%).
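A minimal sketch of these agreement statistics in R is shown below, assuming ratings is a matrix or data frame with one column per rater (illustrative name).

# Inter-rater agreement for the human-coded measures.
library(irr)

kappa2(ratings)                     # Cohen's kappa for two raters
mean(ratings[, 1] == ratings[, 2])  # simple percentage agreement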
Study 3 Supplemental Results
Table S4
Correlations among operationalizations of possible identities at the beginning of the 8th grade with 6th and 7th grade GPA

                                                                     6th grade GPA     7th grade GPA
Theme                  Predictor                                     r       p         r       p
Elaboration            Word Count (all)                              0.30    <.001     0.19    0.002
                       Word Count (School)                           0.11    0.08      0.27    <.001
Valence                # of Expected Identities                      0.10    0.14      0.06    0.29
                       # of Avoided Identities                       0.13    0.05      0.06    0.36
Structure and Content  Balanced School-focused identities            0.16    0.01      0.10    0.08
                       School-focused identities with
                       plausible strategies                          0.18    0.01      0.11    0.05
Associative Structure  School Connectedness                          -0.04   0.53      -0.07   0.25
                       School betweenness                            0.16    0.01      0.09    0.15
Table S5
Correlations among operationalizations of possible identities at the beginning of the 8th grade with student demographics

                                                              Female           Latinx           Free/Reduced Lunch
Theme                  Predictor                              r      p         r      p         r      p
Elaboration            Word Count (all)                       0.15   0.01      -0.01  0.82      -0.07  0.24
                       Word Count (School)                    0.05   0.39      -0.01  0.92      0.06   0.27
Valence                # of Expected Identities               -0.04  0.55      0.02   0.74      -0.06  0.29
                       # of Avoided Identities                -0.01  0.88      0.05   0.37      -0.02  0.78
Structure and Content  Balanced School-focused identities     -0.02  0.80      0.04   0.52      0.06   0.34
                       School-focused identities with
                       plausible strategies                   0.00   0.97      0.06   0.27      -0.02  0.70
Associative Structure  School Connectedness                   -0.02  0.70      0.07   0.24      0.08   0.18
                       School betweenness                     0.02   0.71      0.02   0.75      0.04   0.47
Correlations Among Model Predictors
Figure S3 below shows the correlations between predictors entered in the models
described in Study 3. In general, the measures of possible identities all correlated at modest
levels. At the high end, students with more balanced school-focused possible identities also
tended to have more plausible school-focused identities (r = .69), and students who generated
more expected possible identities tended to generate more avoided possible identities (r = .69).
School connectedness, on the other hand, was the only measure to be negatively correlated with
other operationalizations of possible identities. Students with response networks in which school
was more highly connected tended to write less in general (increasing the chances that school
would be a connected node relative to other nodes) and similarly generated fewer expected and
avoided possible identities.
Figure S3
How features of possible identities covary
Table S6
Relation between number of expected possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]     Step 2 b [95% CI]      Step 3 b [95% CI]
Expected Count                       0.02 [0.01, 0.03]     0.02 [0.01, 0.03]      0.01 [-0.003, 0.02]
Female                               -                     0.03 [0.01, 0.04]      0.02 [0.01, 0.04]
Free/Reduced Lunch                   -                     -0.03 [-0.06, 0.003]   -0.03 [-0.06, 0.01]
Latinx                               -                     0.00 [-0.02, 0.02]     0.01 [-0.02, 0.03]
Total concept count (network size)   -                     -                      0.04 [0.03, 0.05]
School concept count                 -                     -                      0.00 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.93]     2.93 [2.90, 2.97]      2.93 [2.90, 2.97]
R²                                   0.07 [0.02, 0.12]     0.09 [0.04, 0.15]      0.24 [0.14, 0.34]
Between-person ICC                   0.06 [0.01, 0.12]     0.07 [0.02, 0.13]      0.13 [0.04, 0.23]
loo IC                               -827.9                -835.6                 -869.6
WAIC                                 -840.9                -849.9                 -890.89
Table S7
Relation between number of avoided possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]     Step 2 b [95% CI]      Step 3 b [95% CI]
Avoided Count                        0.01 [0.002, 0.02]    0.01 [0.003, 0.02]     -0.01 [-0.02, 0.01]
Female                               -                     0.02 [0.01, 0.04]      0.02 [0.003, 0.04]
Free/Reduced Lunch                   -                     -0.03 [-0.06, -0.00]   -0.02 [-0.06, 0.01]
Latinx                               -                     -0.01 [-0.03, 0.01]    0.004 [-0.02, 0.03]
Total concept count (network size)   -                     -                      0.05 [0.03, 0.06]
School concept count                 -                     -                      0.00 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.93]     2.94 [2.91, 2.98]      2.94 [2.90, 2.97]
R²                                   0.05 [0.01, 0.10]     0.07 [0.02, 0.12]      0.23 [0.13, 0.32]
Between-person ICC                   0.06 [0.01, 0.11]     0.06 [0.01, 0.12]      0.12 [0.03, 0.21]
loo IC                               -815.7                -820.6                 -866.65
WAIC                                 -824.6                -830.7                 -885.86
Table S8
Relation between balanced school-focused possible identities and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]     Step 2 b [95% CI]      Step 3 b [95% CI]
Balanced School                      0.01 [0.00, 0.02]     0.01 [0.003, 0.02]     0.01 [-0.002, 0.02]
Female                               -                     0.02 [0.01, 0.04]      0.02 [0.004, 0.04]
Free/Reduced Lunch                   -                     -0.03 [-0.06, 0.00]    -0.03 [-0.06, 0.01]
Latinx                               -                     -0.01 [-0.03, 0.01]    0.00 [-0.02, 0.03]
Total concept count (network size)   -                     -                      0.04 [0.03, 0.06]
School concept count                 -                     -                      0.00 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.93]     2.95 [2.91, 2.98]      2.94 [2.90, 2.98]
R²                                   0.05 [0.01, 0.1]      0.07 [0.03, 0.13]      0.24 [0.14, 0.34]
Person-level ICC                     0.06 [0.01, 0.12]     0.07 [0.01, 0.13]      0.13 [0.04, 0.22]
loo IC                               -829.08               -834.15                -867.21
WAIC                                 -843.75               -850.83                -886.51
Table S9
Relation between plausible school-focused possible identities and strategies and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]     Step 2 b [95% CI]      Step 3 b [95% CI]
Plausibility                         0.02 [0.01, 0.03]     0.02 [0.01, 0.03]      0.01 [-0.00, 0.02]
Female                               -                     0.02 [0.01, 0.04]      0.02 [0.00, 0.04]
Free/Reduced Lunch                   -                     -0.03 [-0.06, 0.002]   -0.03 [-0.06, 0.01]
Latinx                               -                     -0.01 [-0.03, 0.01]    0.004 [-0.02, 0.03]
Total concept count (network size)   -                     -                      0.04 [0.03, 0.05]
School concept count                 -                     -                      0.00 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.94]     2.94 [2.91, 2.98]      2.94 [2.90, 2.98]
R²                                   0.08 [0.03, 0.14]     0.10 [0.04, 0.17]      0.24 [0.14, 0.34]
Person-level ICC                     0.08 [0.02, 0.14]     0.08 [0.02, 0.15]      0.13 [0.03, 0.23]
loo IC                               -829.08               -834.15                -867.21
WAIC                                 -843.75               -850.83                -886.51
Table S10
Relation between school connectedness and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]      Step 2 b [95% CI]      Step 3 b [95% CI]
School Connectedness                 -0.01 [-0.02, 0.001]   -0.01 [-0.01, 0.004]   0.002 [-0.01, 0.01]
Female                               -                      0.02 [0.002, 0.04]     0.02 [0.004, 0.04]
Free/Reduced Lunch                   -                      -0.02 [-0.05, 0.01]    -0.02 [-0.06, 0.01]
Latinx                               -                      -0.01 [-0.03, 0.01]    0.004 [-0.02, 0.03]
Total concept count (network size)   -                      -                      0.04 [0.03, 0.06]
School concept count                 -                      -                      0.00 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.93]      2.94 [2.91, 2.98]      2.94 [2.90, 2.97]
Marginal R²                          0.04 [0.01, 0.07]      0.05 [0.01, 0.09]      0.23 [0.13, 0.33]
Person-level ICC                     0.04 [0.01, 0.09]      0.05 [0.01, 0.09]      0.12 [0.03, 0.21]
loo IC                               -811.73                -814.07                -866.03
WAIC                                 -818.08                -823.12                -883.45
Table S11
Relation between school betweenness and algorithm-based possible identity scores controlling for random effect of person, demographics, and measures of elaboration/literacy

Parameter                            Step 1 b [95% CI]     Step 2 b [95% CI]       Step 3 b [95% CI]
School Betweenness                   0.01 [0.01, 0.02]     0.02 [0.01, 0.02]       0.01 [0.003, 0.02]
Female                               -                     0.02 [0.01, 0.04]       0.02 [0.004, 0.04]
Free/Reduced Lunch                   -                     -0.03 [-0.06, -0.001]   -0.03 [-0.06, 0.01]
Latinx                               -                     -0.01 [-0.03, 0.01]     0.003 [-0.02, 0.03]
Total concept count (network size)   -                     -                       0.04 [0.03, 0.05]
School concept count                 -                     -                       -0.002 [-0.01, 0.01]
Intercept                            2.92 [2.91, 2.93]     2.95 [2.92, 2.98]       2.94 [2.90, 2.98]
R²                                   0.06 [0.02, 0.11]     0.07 [0.03, 0.12]       0.23 [0.14, 0.33]
Person-level ICC                     0.06 [0.01, 0.11]     0.06 [0.01, 0.12]       0.12 [0.03, 0.21]
loo IC                               -822.8                -829.6                  -870.27
WAIC                                 -832.5                -840                    -887.49
Chapter 2: Thinking of interventions as journeys concretizes the invisible elements that
make interventions work at scale
Abstract
Once an intervention has been tested, people assume it works. Researchers rarely replicate tests, and
when they do, they often fail to show the same pattern of effects as the original. Instead,
replications imply new moderators, needed controls, even reversals of effect signs. All of this
reduces clarity as to what works and the credible interval of effect sizes for things that may work.
To make progress, we use a journey metaphor to understand interventions, their effects, and
ways to create positive feedback loops to improve scalability. As journeys, interventions entail
travelers (participants), travel guides (implementers), scenery (intervention context),
transportation modes (psychological theory translated to a manualized intervention), paths
(culturally-attuned psychology of learning, memory, and persuasion operationalized to
manualized train-the-trainer and intervention), and smart-maps (quantified fidelity to stay on
track and improve manualized training and implementation). Social scientists focus most often
on how travelers and scenery affect the likelihood that an intervention will robustly replicate.
We articulate how social science theories matter for each of the six components of journeys.
Rather than ask if intervention effects are stable, our journey metaphor highlights critical missing
elements without which interventionists, policymakers, and interested parties cannot diagnose
what an intervention effect means. We lay out the six questions interventions should be able to
answer and use two interventions operationalizing identity-based motivation to concretize our
conceptual framework.
Keywords: Intervention evaluation, treatment effects, intervention replication, intervention
fidelity, evidence-based policy
Introduction
“It's a dangerous business, Frodo, going out of your door," he [Bilbo] used to say. "You step into
the Road, and if you don't keep your feet, there is no knowing where you might be swept off to.
Do you realize that this is the very path that goes through Mirkwood, and that if you let it, it
might take you to the Lonely Mountain or even further and to worse places?” - from The
Fellowship of the Ring, J.R.R. Tolkien
Educators, practitioners, and policymakers want interventions that reliably yield
promised results. To support finding interventions that work, federal agencies and non-profit
organizations developed clearinghouses of promising interventions. The U.S. Department of
Education established the What Works Clearinghouse (ies.ed.gov/ncee/wwc/) and other federal
agencies created repositories in their fields (e.g., for behavioral and mental health practitioners
focused on substance use treatments, www.samhsa.gov/resource-search/ebp; for medical
practitioners, general clinical trials at www.clinicaltrials.gov; for economic development
interventions, www.aidgrade.org). Unfortunately, even promising interventions commonly fail to
yield previously documented results on a second try or at larger scale (e.g., Cheung & Slavin,
2009), limiting the utility of these repositories, which imply that listed interventions are scalable
and sustainable.4 But, as we show, having a successful randomized trial is not enough. To make
progress, intervention scientists and program evaluators need a clearer concretization of the
abstract, multidimensional construct of intervention. In the current paper, we use a journey
metaphor and consider six key elements of journeys so that, unlike Frodo, intervention scientists,
4 We operationalize scaling as moving from a single intervener or site to delivering the intended
intervention in the intended way at multiple sites. We operationalize sustainability as having the
needed infrastructure to keep delivering the intended intervention with the intended effects
without further researcher involvement in training or support.
program evaluators, and policymakers will not head out without considering the features of the
journey they are embarking on. We start by explaining what metaphors can do and how we use
them to unpack what interventions are, map out the sequence of six questions, beyond
randomization, that our formulation supports addressing, and provide two examples of how
using a journey metaphor can help.
Interventions as Journeys
Abstractions are hard to understand. To make sense of them, people use concretizing
metaphors. Metaphors help people understand the abstraction as a whole as well as its particular
features (Lakoff & Johnson, 1980). People use what they know about the concrete to make sense
of the abstract. This allows people to draw inferences about the abstract concept that they would
not otherwise make. A good metaphor is not only applicable (possible to apply), but also apt
(Landau, 2016). That is, it fits the abstract idea better than alternative metaphors.
A journey is an apt metaphor for intervention for four reasons. First, it concretizes the
abstract multidimensional concept of intervention as six key elements. We represent these six
key elements in Figure 1. Journeys entail travelers (participants) and scenery (intervention
context). Researchers often consider these elements when they attempt to understand replication
failures. Our journey metaphor clarifies that intervention is more than travelers and scenery. To
get to the intended destination, it helps to have a guide (implementers). Guides facilitate by
making sure travelers have a mode of transportation (psychological theory translated to a
manualized intervention) and are on the right path (culturally-attuned psychology of learning,
memory, and persuasion operationalized to manualized train-the-trainer and intervention). They
use up-to-date smart-maps (quantified assessment of fidelity). Smart maps provide directions and
keep guides (and hence travelers) on track by synthesizing multiple data inputs. In the case of
intervention, these data sources can be video recordings, backend analytics, retrospective or
concurrent self-reports. Smart map directions take preferences and features of the immediate
context into account, suggesting alternate routes to stay on track given current conditions. In the
case of intervention, this entails relevant adaptations, input into ongoing implementer supports,
and feedback that improves the manualized intervention and training going forward.
Second, a journey metaphor clarifies that each of six elements of a journey is necessary
and needs to be considered when developing, implementing, evaluating, reporting on, or
attempting to scale, or sustain, an intervention. Each element seems obvious when considered
concretely as elements of a journey. But without the metaphor, most of these aspects are not
well articulated or considered, leading to unspecified or poorly documented elements and hence
a lack of clarity as to what actually happened in an intervention.
Third, a journey metaphor provides the wherewithal for a positive feedback loop. Having
systematized knowledge about past journeys increases the likelihood of reaching the planned
destination in the current journey. Systematically monitoring progress in the current journey
facilitates planning for new journeys. After each journey, vehicles are serviced and repaired,
paths are re-examined so that the best route to the destination is chosen, and smart maps are
refined to increase their efficacy for future guides and travelers.
Fourth, a journey metaphor concretizes the applicability of social science theories to each
of the six elements of journeys. That is, social science theory articulates what aspects of travelers
(e.g., their developmental phase, their social and cultural identities), scenery (e.g., context-
specific social norms, culture, and structural affordances and constraints), and guides (e.g., are
interveners role models, mentors, in-group or out-group members) are likely to matter. It also
articulates theories of culture, identity, motivation, memory, learning, and persuasion relevant for
creating vehicles (interventions) and paths (delivery systems). Without a concretizing journey
metaphor, interventionists, evaluators, and policymakers often silo their use of social science
theorizing to focus on only one or two of the six elements, without really noticing that the other
elements exist and are worthy of the same attention.
Figure 1
The Components of the Intervention Journey
Making Progress Without an Apt Metaphor Has Proven Difficult
Using a journey metaphor bridges the gap between intervention scientists and program
evaluators who emphasize specificity (Dane & Schneider, 1998; O’Donnell, 2008), and those
who emphasize flexibility (Blakely et al., 1987; Chambers et al., 2013; Sleeter, 2008). In our
metaphor language, the former entails the clarity of knowing exactly how the vehicle was driven
and the latter entails learning from past journeys, responding thoughtfully to unexpected
obstacles, and considering the culture of travelers and guides and the constraints of the scenery.
After all, with each journey, the travelers, scenery, and guides necessarily change. The
implication is that no intervention can be exactly replicated (Chambers et al., 2013). Without
theory going in, relevant data collection during, and focused analyses after, why replications
yield the same or different results remains opaque. Over time, culture, norms, policies, and
practices change. A journey metaphor concretizes the need to document the scenery and apply
theory-based interrogation of which changes in scenery may matter. With each journey, the
culture and norms of travelers and guides must be considered, both to predict the best ways to
fruitfully engage them in the manualized intervention and training, and to consider how they
may shape positive feedback for the smart map on how to operationalize what progress looks
like (e.g., Liu et al., 2021; O’Donnell et al., 2021).
Though intervention scientists and program evaluators do provide evidence that travelers
get to their destination (e.g., evidence that a particular intervention has the intended effect), they
typically cannot document that a particular vehicle was used or retrace the path traveled
(Premachandra & Lewis, in press). Evidence that randomization occurred, an intervention was
offered, and something happened is necessary but insufficient (e.g., Deaton & Cartwright,
2018). At present, lacking a concretizing metaphor to clarify that there are six elements to
address, interventionists and evaluators provide interested parties with too little information.
That means others can neither replicate nor evaluate the likelihood that an intervention would
replicate as participants, contexts, and implementers change (e.g., Chambers et al., 2013; Michie
& Abraham, 2004; Rozenweig & Udry, 2020; Vivalt, 2020). Vivalt (2020), for instance, reports
that if policymakers wanted to predict the effect of an intervention on a particular outcome, the
available information (mean treatment effects) would fail to even predict the correct sign of that
effect 40% of the time. This makes reviews of interventions sobering (Cheung & Slavin, 2009;
Perepletchikova & Kazdin, 2005; Sisk et al., 2019; Vivalt, 2020) and may explain why scale-ups
often yield non-significant (Al-Ubaydli et al., 2019; Vivalt, 2020) or smaller effects than the
original (Chambers et al., 2013; Cheung & Slavin, 2009). We suggest that this gap exists because
interventions are abstract and multidimensional. Without an applicable and apt metaphor,
intervention scientists and program evaluators fail to see, let alone communicate, key elements
(see Mohr et al., 2015 for a related delineation between principles and components). Not seeing
or communicating how each element of the journey occurred yields highly heterogeneous
effects across attempts to deliver ostensibly the same intervention without a clear reason for why
this might be (Vivalt, 2020).
We suspect that the problem lies in failure to apply a concretizing metaphor to the
abstract concept of intervention. Once intervention is considered a journey, otherwise missing
elements come to mind, as does the necessity to thoughtfully consider each element.
Interventions involve travelers (participants), guides (implementers), and scenery (context). But
they also involve a mode of transportation, a vehicle (the what), a path (the how), and a smart
map (to make sure that the vehicle is running as expected and staying on the path, and to allow
for adjustment and enhanced future performance). Until now, intervention scientists and
program evaluators have mostly focused on only three of the six features of journeys: travelers,
guides, and scenery. They largely omit consideration of the other three: mode of transportation,
path, and even a rudimentary smart map (O’Donnell, 2008; Perepletchikova & Kazdin, 2005).
Regarding travelers, intervention effects differ when they are delivered to different
participants (e.g., people who differ in age from the original intervention test; Dishion &
Patterson, 1992). Regarding scenery, intervention effects differ when they are delivered at
another site (Borman et al., 2018; Vivalt, 2020), or even at the same site at a different time
(Rosenzweig & Udry, 2020). Regarding guides, intervention effects differ when delivered by
different people (Crits-Christoph et al., 1991; Cheung & Slavin, 2009; Vivalt, 2020).
When the mode of transportation is discussed at all, it is typically discussed in terms of
two aspects of what intervention researchers call fidelity or program integrity: treatment
adherence and dosage (e.g., Dane & Schneider, 1998; Al-Ubaydli et al., 2019). Using our
journey metaphor, these aspects of fidelity are like knowing whether the engine of the vehicle
was turned on (adherence) and the planned number of miles was driven (dosage). While helpful,
knowing that the engine was turned on and how close the odometer reading was to expected
mileage does little to help current travelers reach the planned destination or to help planners of
future journeys. Smart maps can help. They both support current guides and travelers in reaching their
planned destination and store and share knowledge about the path for future journeys. Smart
maps also share potential alternative routes that are dynamically constructed to adapt to features
of the immediate context. Without separating out the effects of travelers, scenery, guides,
vehicles, paths, and smart maps, it is impossible to know how intervention effects were
produced.
What researchers typically provide is like a static map (see Premachandra & Lewis, in
press). It can provide a plan before the journey begins. Unlike a static map, a smart map clearly
lays out what the path should look like, provides feedback along the way, and updates on
deviations and what to do to get to the intended destination. In this way, smart maps are what a
complete fidelity assessment could be: they outline how an intervention should be administered
(dosage, adherence) and experienced (quality, responsiveness, fidelity of receipt) and provide
clear updates on deviations from both. This feedback allows guides to correct course, enables
future refinements to training and implementation manuals (e.g., Oyserman et al., revise and
resubmit), and supports more complete reporting of results (e.g., Carroll et al., 2007). Moreover, when
fidelity is low, it can undermine the likelihood of attaining the desired effect of intervention
(Durlak & Dupre, 2008, but see Collyer et al., 2020). In Figure 2, we operationalize the five
components of fidelity when fidelity is used as a smart map.
Figure 2
The Five Components of Fidelity that Should be Captured by Intervention Smart Maps
In the following sections, we use the journey metaphor to explain how social scientific
theories can inform vehicle, path, and smart map design. For each element, we provide examples
from our own work and describe how we have drawn on theories of identity, culture, persuasion,
and learning to develop our intervention vehicles, paths, and smart maps.
Designing a Vehicle
What Researchers Are Trying to Do
Intervention research often starts with a theory of etiology (how a problem came to be),
prevention, or change (how to prevent the problem from emerging, or what to do once it appears).
An intervention is not the same thing as a theory. It is a translation of the theory into a set of
critical components that are operationalized into a set of activities or ideas that can be used to
build a vehicle that helps travelers get to their destination. This is often a first stumbling block.
Knowing how a problem develops without intervention does not necessarily reveal what to do to
prevent or treat it. A second stumbling block is that theories are often presented as
representations of reality. But theories are typically rooted in a cultural context (e.g., Oyserman,
Coon, & Kemmelmeier, 2002). By not taking cultural context into account, an intervention may
fail, or fail to replicate, if the cultural context mismatches the vehicle. Returning to the journey
metaphor, when the vehicle is underspecified, researchers assume it will take all travelers, go
through all sceneries, and can be effectively steered by all guides without modifications.
An Example: Translating Identity-Based Motivation Theory into Intervention to Support
Academic Attainment
We translated identity-based motivation theory (IBM; Oyserman, 2007; Oyserman et al., 2017)
into brief interventions to support academic attainment (Horowitz et al., 2018; Oyserman, 2015;
Oyserman et al., 2021). IBM is a theory of motivation and goal pursuit that emphasizes the
motivational power of the self: people’s past, current, and future personal and social
identities. IBM theory posits that people’s identities matter for how they act and make sense of
their experiences, but that which aspect of identity comes to mind and what it implies for
meaning-making and action is dynamically constructed given affordances and constraints in
context. Though they value school success, students often fail to take action because, in context,
their school-focused identities do not come to mind or seem irrelevant to possibilities for action,
or because students misinterpret difficulties along the way as implying that school success may not really be
possible for them (Oyserman & Lewis, 2017). When a task or goal feels identity-congruent
(something ‘I’ might do), people are more likely to engage in task/goal relevant actions and
interpret their metacognitive experiences of ease or difficulty while doing so as implying that
taking current action is important—“no pain, no gain.” Alternately, when a task or goal feels
identity-incongruent (not for ‘me’), people are less likely to engage in task/goal relevant actions
and more likely to interpret their metacognitive experiences as implying that taking current
action is a waste of time, that success is impossible or irrelevant.
We used identity-based motivation theory because it captures the effects of context
(scenery) in shaping which identities come to mind and what these identities seem to imply for
meaning-making and action in a way that highlights the dynamic nature of this process. That
implies that skilled guides (implementers) matter in part by shaping which aspects of the scenery
travelers (students) notice. The Pathways-to-Success (Pathways) intervention translates the three
active ingredients of IBM to the context of school and academic attainment, as shown in Figure
3. First, school should feel like an identity-relevant pathway for
students across contexts. Second, students need to feel like now is the time to act and to use
strategies to work toward future identities. Third, students should be supported in interpreting
difficulty as implying the importance, not the impossibility, of school success. The prediction is
that students whose identity-based motivation is affected by the intervention will do better in
school. We have translated our Pathways intervention for use with outside trainers or college
students (Oyserman et al., 2006), and for teachers (Horowitz et al., 2018; Oyserman et al., 2021)
in the form of a manualized, whole classroom in-person set of activities (Oyserman, 2015). We
have also developed a digital translation of these activities (U.S. Dept. of Education Investing in
Innovation Grant #U411C150011).
Figure 3
Identity-based motivation theory translates to three active ingredients
Designing a Path
Our journey metaphor makes it clear that the vehicle is not enough and that not just any
path will do. An intervention may fail to produce effects or later replicate when intervention
messages feel off, fall on deaf ears, or are quickly forgotten. Interventions are effective when
they are delivered in ways that are convincing and persuasive, that promote deep learning and
knowledge transfer, and when they make sense given the travelers, guides, and scenery.
Researchers can design effective paths by drawing on culturally-attuned theories of persuasion,
learning and memory, and supporting both with a general human-centered approach (e.g.,
Rogers, 1943). Failing to attune vehicles and paths to the local travelers, guides, and scenery
means even journeys that promise results will likely not be taken (e.g., Bernal & Rodriguez,
2012). These attunements are sometimes called implementation strategies (for list in education,
see Cook et al., 2019; for list in health, see Powell et al., 2015). Implementation strategies can be
understood in reference to IBM theory as meaning intervention vehicles and paths must feel
relevant to the identities of guides and travelers given their unique scenery (culture, norms, and
contextual barriers). We next describe social scientific theories of persuasion and learning that
we have used in building effective intervention paths.
What Makes A Good Path: Culturally-Attuned Persuasion
Interventions are influence attempts. Travelers will not just jump on any vehicle, follow
any path, trust just any guide, or get to the desired destination without appropriate attention paid
to them, the scenery, the guide, vehicle and path. Our approach to designing the right path draws
on the social science of persuasion. We concretize our discussion as panels a and b in Figure 4.
Panel a depicts three ways to increase the persuasive power of messages.
First, people are more likely to incorporate and follow a persuasive argument that they
themselves generate and find identity-congruent. Evidence for this insight comes from multiple
perspectives, including person-centered intervention (Raskin & Rogers, 2005; Rogers, 1947), the
MODE model (Fazio, 1990), elaboration likelihood model (Petty & Cacioppo, 1987), reactance
theory (Brehm, 1966), and IBM theory (Oyserman et al., 2017). Hence, researchers should
design intervention paths that encourage travelers to elaborate on messages themselves and
provide them with multiple opportunities to do so.
Second, people are more likely to follow through on actions when they would experience
failing to do so as identity-incongruent. Evidence for this insight comes from multiple sources,
including cognitive consistency theories (Festinger, 1957), self-perception theory (Bem, 1967),
and culture-based accounts of face and honor (Cross et al., 2013). Hence, researchers should
design intervention paths that encourage travelers to publicly commit by sharing their thoughts
(“saying-is-believing”).
Third, people are more likely to believe things that others like them believe as well.
Evidence for this insight comes from social norms and social influence literature (Cialdini &
Goldstein, 2004; Miller & Prentice, 2006; Paluck, 2009) and theories of human culture (e.g.,
Oyserman, 2017) and social identities (Hogg, 2020; Oyserman, 2007). Making positive norms
explicit (e.g., Murrar et al., 2020) and realizing how negative norms might harm (e.g., Miller &
Prentice, 2006) is critical. When researchers fail to notice that travelers are sensitive to what
other travelers are doing (e.g., Dishion & Poulin, 1999) and to what guides attend to (e.g., Zhao
et al., 2019), intervention paths may lead travelers to exactly the opposite destination from the
one a researcher expects, because the wrong messages are being reinforced by social norms.
In Figure 4, panel b, we highlight three major threats to message persuasiveness guiding
our overall theory of delivery. First, travelers form quick impressions about the guide, the
vehicle, and the path. Positive impressions help travelers believe in the process; negative ones
can lead them to quit. We draw these conclusions from the literature on impression formation,
which shows the immense power of initial judgments (Ambady & Rosenthal, 1992), as well as
the literature on persuasion (Briñol & Petty, 2009), school climate (e.g., Cohen, 2006; O’Malley
et al., 2015), and therapeutic alliance (Ackerman & Hilsenroth, 2003; Bordin, 1979). Second,
people prefer to experience their actions as self-guided rather than coerced. Evidence for this
insight comes from reactance theory (Brehm, 1966; Wortman & Brehm, 1975) and self-
determination theory (Ryan & Deci, 2000) as well as conceptualizations of when change is
possible (Dai et al., 2019). The implication is that heavy-handed and inappropriately-timed
influence attempts are unlikely to work and may backfire (e.g., Oyserman, 2015; Steinberg &
Morris, 2001; Yeager & Walton, 2011).
Third, while people do not have to interpret what their metacognitive experiences of ease
and difficulty while thinking mean, they often do. Evidence for this insight comes from
information processing fluency (Schwarz, 2015; Schwarz et al., 2016) and identity-based
motivation (Oyserman, 2009; Oyserman et al., 2018) theories. Messages that feel difficult to
process are unlikely to ring true (Schwarz, 2015) and are more likely to be experienced as
identity-incongruent ways of thinking or acting (Oyserman et al., 2018). The implication is that
travelers are more likely to believe in the value, truth, and identity-congruence of easy-to-process
messages. This implies that travelers will be more likely to act on messages that feel easy-to-
process. Processing ease comes from various features, including, as in our case, using a journey
metaphor (also employed in the Pathways-to-Success intervention, Oyserman, 2015).
Figure 4
What Makes a Good Path: Making Messages Stick
The goal of interventions is to elicit long-term change, not merely to prime a one-off
positive or negative evaluation. To that end, we assume that interventions work well when
travelers deeply learn core concepts, internalize them, and generalize them to other parts of their
lives. We follow research on learning, drawing especially on work on so-called desirable
difficulties (Soderstrom & Bjork, 2016), but also the power of self-generated information, to
inform this aspect of our theory of delivery. We identify three core considerations, as we show in
Figure 5. To start, following research on the sequencing of to-be-learned content, we draw two
inferences. One, core concepts should be interleaved rather than blocked (Cepeda et al., 2006).
People perform better on learning tasks when different concepts are presented interleaved rather
than blocked based on shared features (Yan et al., 2020). This means that an intervention should
deliver its messages multiple times, interleaved rather than blocked. Two, core concepts should
be practiced with adequate spacing between practice sessions (Rawson & Dunlosky, 2011). In
general, people can learn by forgetting, as the effortful reconstruction involved in recall
promotes deeper elaboration of the to-be-learned information.
The third inference we draw to inform our theory of delivery follows the literature on the
recall and recognition advantage of self-generating concepts as part of testing (e.g., Roediger &
Karpicke, 2006). Self-testing is assumed to work through two processes. First, like spacing and
interleaving, recall is assumed to require deeper processing than recognition, so free generation
of to-be-learned concepts should promote deeper learning. Second, self-relevant and self-
generated concepts may be more deeply encoded because they connect to the already elaborated
associative knowledge system that defines self-concept and identities (e.g., Symons & Johnson,
1997).
Figure 5
What promotes learning
Making a Smart Map
What Researchers Are Trying to Do
The intervention smart map should gather data that reflects a researcher’s expectation for
how a vehicle should be driven and what path should be taken. As we described earlier, this
entails measuring adherence, dosage, quality, responsiveness, and receipt (e.g., Horowitz et al.,
2018; Oyserman et al., under review). These measures have typically been undertheorized
(Perepletchikova & Kazdin, 2005), but a journey metaphor clarifies that an effective smart map
operationalizes the core components of the vehicle (e.g., identity-based motivation) and the path
(e.g., culturally attuned persuasion and learning). Adherence then should not just measure
whether guides showed up but should characterize whether guides operated vehicles as expected.
Dosage then should be thought of in terms of spacing and interleaving: what pacing and spacing
of activities will promote long-term change in travelers. Quality then should operationalize how
travelers felt about their guides and other travelers and the extent to which their guides delivered
messages with skill and fluency. Responsiveness then should operationalize the extent to which
travelers elaborated on and publicly shared messages. Receipt then should operationalize the
extent to which core components of the journey were internalized by travelers and felt like the
norm. If the data used to create smart map fidelity scores are sufficiently rich (e.g., video
recording of intervention implementation, detailed backend data from digital intervention),
researchers may be able to conduct post-hoc analyses to create better vehicles, more effective
paths, and to update smart map fidelity in the future.
A Positive Reinforcement Loop: Past Journeys Inform New Journeys
Current evidence repositories, like the What Works Clearinghouse, describe past
journeys with static maps, when what researchers, practitioners, and policymakers need is a way
to plan for future journeys that will involve different travelers, guides, and scenery (Chambers et
al., 2013). Smart maps, in contrast, can be updated, can provide concurrent signals that other
routes should be taken, and can imply that vehicles need servicing. Updates to the smart map
should be made by considering the three aspects of adaptation and sustainability we show in Figure 6.
Figure 6
What processes make a particular intervention journey useful for future journeys.
To develop effective smart maps, researchers should begin with a user-centered
psychological approach (e.g., Rogers, 1942) that considers the identities of guides and travelers
as well as the culture, norms, and practical affordances and constraints in the local scenery.
Students are agentic and generally do want to do well in school and not do poorly (e.g.,
Oyserman & Lewis, 2017). Teachers want to get to know their students and give them tools to
succeed in life, but they are also working to provide economic security for themselves and their
families. And schools are trying to serve the needs of their teachers, students, and communities,
but they also exist within a particular economic, cultural, and political environment. Researchers may
come into the community without regard for these realities, disregarding or even disrespecting
guides, travelers, and scenery (e.g., Sleeter, 2008). A smart map should align with the goals of
the researcher and the guides. It should distill underlying data into a clear navigational scheme to
help guides learn how to use vehicles effectively and follow expected paths and to provide some
real-time feedback to correct course during the journey. Guides may expect this information to
look a certain way—hence soliciting feedback from guides is a critical component of user-
centered smart map design.
Second, a smart map should be designed with sustainability in mind. This means learning
from past journeys to provide up-to-date and dynamic smart map information. A smart map
delivers problem signals (e.g., traffic jams, unexpected obstacles, paths not taken) that
researchers can thoughtfully address in development cycles that collect additional information
from guides and travelers so that vehicles and paths can be adapted and refined. What is relevant
and what is likely to be sustained is likely to shift over time. Our third consideration makes this
clear. Interventions should leverage their smart map measures and other data-collection schemes
to create a cycle of testing theory, gathering input from guides and travelers about the experience
of implementation, documenting those changes, and then implementing them in future rounds of
implementation. The nature of the journey likely shapes what is possible in this respect;
however, cyclically collecting meaningful data can promote thoughtful adaptation and
increase flexibility while ensuring that core components of the vehicle and path are maintained
over time. In the next section, we present an empirical model of the intervention journey that
explains how researchers can use smart maps in their research practice to answer six critical
questions about a particular intervention journey.
Intervention Journey: An Empirical Model
In Figure 7, we concretize what our journey metaphor means for research practice with a
testable process model that can help answer several important research questions. This model can
be fit using a multigroup generalized linear model and the logic of structural invariance (Kline,
2016).5 In Figure 8, we provide a flow chart to aid in interpretation of model parameters. These
figures highlight six questions researchers need to answer in order to interpret the results of an
intervention and set the stage for refinement, revision, replication, scaling, and sustaining.
Q1: Was the Vehicle Driven as Expected and on the Right Path?
Before asking if travelers got to their destination, researchers need to know what the
journey was like. Otherwise, researchers cannot tell others whether their vehicle and path were
the reason travelers made it to their destination. To illustrate this point, we draw a black box
following randomization in Figure 8. Without a smart map, researchers cannot know how the
journey unfolded: was the vehicle operated as expected? Did it stay on the expected path?
5 Multigroup SEM is not the only analytic strategy available, but it is particularly useful for
careful a priori theory testing and thoughtful post-hoc analyses.
Figure 7
A Process Model to Describe the Journey Empirically
Note. We use the standard representation of moderation in multigroup SEM, drawing boxes
around the full model. In multigroup SEM, researchers specify which model parameters they
expect will differ based on features of travelers, guides, and scenery. a = intention-to-treat effect;
b = the effect of randomization depends on fidelity; c = the intervention changes putative
mediators; d = mediators relate to outcomes; e = the effect of the intervention on mediators
depends on fidelity; f = the effect of the mediator on outcomes depends on fidelity;
g = implementation strategies influence fidelity.
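As a concrete illustration, a minimal sketch of this process model as a multigroup SEM in lavaan is shown below; the data frame and variable names, including the fidelity grouping variable, are illustrative.

# Multigroup SEM: do the structural paths vary with smart-map fidelity?
library(lavaan)

model <- '
  mediator ~ treatment              # path c: intervention changes the mediator
  outcome  ~ mediator + treatment   # paths d and b
'

fit_free <- sem(model, data = trial_data, group = "fidelity_band")
fit_eq   <- sem(model, data = trial_data, group = "fidelity_band",
                group.equal = "regressions")  # structural invariance
anova(fit_free, fit_eq)  # paths differing across fidelity bands imply
                         # fidelity moderation (parameters b, e, f)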
We build on the conclusions of Durlak and Dupre (2008) to define plausible rule-of-thumb
markers for fidelity. Durlak and Dupre found that practitioners rarely implement interventions
with over 80% fidelity and that interventions are unlikely to yield planned effects when
implemented with under 60% fidelity. Moreover, when an intervention is delivered with less
than 60% fidelity, researchers cannot explain much about the journey because most guides did
not use the vehicle as expected or took travelers on a different path than the researcher expected.
Though it may be a gross marker, this 60-to-80 rule of thumb helps researchers make inferences
about what happened following randomization or an offer of service.
As we show in Figure 8, researchers can answer our first question based on descriptive
analysis of the distribution of smart-map fidelity scores. When most travelers were transported
by vehicles driven as expected and that followed the expected path, researchers can safely infer that
the planned journey occurred. The intervention can be a test of the operationalization of theory
into a vehicle and path. However, when most travelers were transported by vehicles driven in
unexpected ways and that veered off-course, a researcher should take a step back and ask if there
is variability in smart map fidelity scores. A researcher may proceed with further analyses if at
least some travelers experienced the journey researchers hoped they would. Doing so, however,
depends on how many travelers there are and how effective the journey is expected to be (i.e.,
statistical power). Alternatively, if smart map fidelity scores indicate that almost no travelers
experienced the planned journey, even if they did get to their destination, it was probably
through something other than the planned means.
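As a hedged illustration of this descriptive first step, the snippet below summarizes an assumed
0-100 smart-map fidelity composite against the 60-to-80 rule of thumb; the data frame and
column name are placeholders, not our actual measures.

# Q1 sketch: describe the distribution of smart-map fidelity scores.
summary(df$fidelity)
hist(df$fidelity, main = "Smart-map fidelity scores", xlab = "Fidelity (%)")

# Proportions of travelers below, within, and above the 60-to-80 band
mean(df$fidelity < 60)                       # journey likely off-plan
mean(df$fidelity >= 60 & df$fidelity <= 80)  # plausibly as planned
mean(df$fidelity > 80)                       # rarely attained in practice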
Q2: Did Any Travelers Make It to Their Destination When Offered the Intervention Journey?
Researchers often start by modeling intention-to-treat effects (ITT; parameter a in Figure
7). But, as our model shows and as others have suggested (e.g., Deaton & Cartwright, 2018),
testing the effect of randomization assumes travelers went on some journey; it cannot unpack
the black box to explain how travelers interfaced with guides or whether vehicles were driven as
expected and stayed on the planned path. Researchers often describe the ITT effect as the likely
effect of intervention in the real world; however, by itself it is not a sufficient data point to be
interpretable or to provide insight into how to replicate intervention effects.
Figure 8
A Flowchart for Interpreting Intervention Findings
Researchers can draw one of two conclusions, depending on the distribution of smart-map fidelity scores.
First, if the majority of smart map fidelity scores were well under the 60% threshold and
the ITT parameter is not significant, researchers may reasonably conclude that the vehicle and
path need to be better refined and adapted to the travelers, guides, and scenery. While testable in
large-scale intervention trials, as we imply in Figure 7 (parameter g), the typical intervention trial
will likely not have sufficient representativeness of travelers, guides, and scenery to draw
generalizable conclusions about what went wrong. However, by using structured interviews or
survey instruments, researchers can learn whether guides (e.g., teachers) struggled with certain
aspects of the intervention (e.g., Carroll et al., 2007; Chambers & Norton, 2019). Guides and
travelers may not be able to provide solutions in their feedback, but they can identify problems
and frustrations. Researchers can also look for help in the implementation strategy literature
(Cook et al., 2019; Powell et al., 2015). Though the findings may be hard to disentangle,
researchers can try to identify problematic aspects of fidelity. For example, if quality is high but
adherence is low, researchers can look for ambiguities in the manualized intervention and obtain
structured feedback from implementers regarding barriers. Figure 8 shows how this step could be
implemented routinely to systematize continuous refinement and adaptation even when results
are promising.
Second, if the majority of smart map fidelity scores were well under the 60% threshold
and the ITT parameter is significant, it suggests that some version of the journey happened and
was helpful for travelers. If the data used to create smart-map fidelity scores are sufficiently rich
(e.g., video recording of intervention implementation, detailed backend data from digital
intervention), researchers may be able to conduct post-hoc analyses to characterize the journey
that did happen and use those insights to create better vehicles or more effective paths. The
100
Oregon Model of preventing antisocial behavior in children and adolescents, for instance, used
observational data to refine their treatment model (e.g., Patterson, Reid, & Eddy, 2001). Miller
and Rose (2009) describe how video-recording therapists to document fidelity helped identify
“change talk” and therapist empathy as key active ingredients in motivational interviewing. Both
approaches started with puzzling non-replications or heterogeneity stemming from the guides,
travelers, and scenery. Each led to theoretical advance in the long run. Figure 8 shows how even
after these changes are codified, researchers should adapt and refine their model using input from
guides and travelers.
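For concreteness, a minimal ITT estimate (parameter a) might look like the sketch below; the
lm() call and variable names are illustrative assumptions, and a real trial would add design
features such as clustering, covariates, and randomization blocks.

# Q2 sketch: intention-to-treat effect of the offer of the journey.
itt <- lm(outcome ~ treat, data = df)  # treat = 0/1 random assignment
summary(itt)  # the treat coefficient estimates parameter a

# Interpret alongside the Q1 distribution: a significant ITT with mostly
# sub-60% fidelity suggests some version of the journey happened, just
# not the planned one.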
Q3: Did Driving the Vehicle as Expected and on the Right Path Matter?
The extent to which the journey gets travelers to their destination should depend on
whether the vehicle was driven as expected and stayed on the right path: that is what the ITT
parameter implicitly tests, though randomization is a black box without smart-map fidelity
scores. In Figure 7, parameter b represents a researcher's assumption that their vehicle was
designed well and the path made sense. If the smart map effectively operationalizes the right
way of driving the vehicle and the right path, then any effect of randomization or offer of service
should depend on fidelity.
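Outside the full SEM, one hedged way to examine this moderation (parameter b) is a simple
interaction model, sketched below with the same assumed variables.

# Q3 sketch: does the effect of randomization depend on fidelity?
df$fidelity_c <- df$fidelity - mean(df$fidelity, na.rm = TRUE)
mod <- lm(outcome ~ treat * fidelity_c, data = df)
summary(mod)  # the treat:fidelity_c term corresponds to parameter b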
We contrast this moderator approach with current methods using treatment-on-the-treated
(TOT) or complier average causal effect analysis (CACE; Angrist et al., 1996). These methods
treat the journey as a binary, asking if the vehicle was ever driven or what was the probability
that travelers and guides "complied" with the rules-of-the-road. While these approaches do help
explain something about the black box of randomization, treating compliance as a binary or
probability is not a sufficiently nuanced conceptual framework for understanding the various
ways interventions might fail to produce effects: travelers may not want to participate, guides
may find vehicles difficult to drive, vehicles might not be appropriate for the surrounding
scenery or run on the wrong fuel, and paths might be too steep or winding to be feasibly
traversed. That is why having a nuanced and up-to-date smart map is so important.[6]

[6] In our current work (osf.io/yebac), we are implementing an adaptation of the CACE model that,
in the first step, predicts a continuous measure of five-factor fidelity, rather than binary
compliance, and uses the prediction as an instrument for defining a continuous measure of
amount of treatment. This is an alternative formulation of our moderation hypothesis when the
goal is to summarize the effect of treatment stemming from what is provided by the smart map.
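Under the assumptions described in footnote 6, a continuous-treatment instrumental-variable
analysis could be sketched as follows; ivreg() from the AER package is our illustrative choice,
and the actual osf.io/yebac scripts may differ.

# Footnote 6 sketch: randomization instruments a continuous amount of
# treatment summarized by the five-factor fidelity composite.
library(AER)  # provides ivreg()

cace_cont <- ivreg(outcome ~ fidelity | treat, data = df)
summary(cace_cont)  # effect per unit of fidelity-weighted treatment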
Researchers can make three inferences depending on whether treatment is moderated by
fidelity. First, if the moderation path is significant, it implies that the vehicle and path were
appropriate operationalizations of theory, that smart maps compiled the right data for fidelity,
and that investing in improving fidelity is worthwhile. As with finding low fidelity, the next step
a researcher should take is to examine the factors that would increase fidelity (parameter g) using
structured feedback from travelers and guides. They can also dig into their fidelity scores to see
if certain aspects of the vehicle are too hard to operate or parts of the path need repair.
Second, if the moderation parameter is not significant, but the ITT effect is, interpretation
depends on how variable smart map fidelity is. If the journey went according to plan and there
was not much deviation from that plan, it strongly suggests that a researcher has put in place a
vehicle and path that make sense and that guides know how to drive vehicles and follow the
planned path. This may be an unlikely event when interventions are delivered at scale, as
introducing variability in travelers, guides, and scenery increases the chances that fidelity is
variable. When smart-map fidelity is variable and it does not moderate a significant ITT effect,
the implication is that the smart map is tracking the wrong information and that travelers got to
their destination through some other means. As shown in Figure 8, depending on the richness of
the underlying smart-map data, researchers may be able to identify core components that could
be translated into future vehicles and paths.
Third, if the moderation parameter is not significant and the ITT parameter is not
significant, it may suggest several related conclusions. For one, it suggests that the intervention
vehicle and path were simply not effective in getting travelers to their destination. The vehicle
and path may have been missing core components that were necessary for guides to transport
travelers through particular scenery. It may also suggest that the smart map is misaligned and fails to
capture which ways of transporting travelers mattered. As Figure 8 suggests, researchers can then
turn to their underlying smart map data to try to identify what those missing core components
were and how to track them in the future.
Q4: Were Travelers Convinced, and Did They Learn the Lessons They Were Expected To?
When an intervention’s vehicle and path successfully operationalize theories of process,
travelers should be changed by the journey in a particular way (i.e., through putative mediators;
Kazdin, 2007). Parameters c and d in Figure 7 represent these mediating processes. How
researchers interpret c depends on how the first three questions were answered. If an intervention
isn’t delivered with sufficient fidelity, yet randomization still affects the mediating process, then
it implies that whatever happened on the journey still managed to persuade and teach travelers
what the researcher expected. Hence, a researcher’s theory that guided the operationalization of
the vehicle might still have merit, but the vehicle may have followed a different path to the
destination. Alternatively, if the intervention is delivered with fidelity, but randomization does
not affect mediators, it suggests that the vehicle and path, as operationalized, were not effective in
persuading travelers or teaching them new concepts. Researchers can interpret parameter d as the
extent to which change in the expected core components helped travelers reach their destination.
When it is nonsignificant, it suggests that the researcher's theory may be misguided or that it is not
well operationalized using the measures included in the study.
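A hedged sketch of the c and d paths, using the same assumed variables, is below; the bootstrap
choice is illustrative.

# Q4 sketch: did the journey change travelers (c), and did that change
# carry them to their destination (d)?
library(lavaan)

med_model <- '
  mediator ~ c*treat                 # c: randomization changes mediator
  outcome  ~ d*mediator + ap*treat   # d: mediator relates to outcome
  indirect := c*d                    # journey -> change -> destination
'
med_fit <- sem(med_model, data = df, se = "bootstrap", bootstrap = 1000)
summary(med_fit)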
Q5: Does Driving the Vehicle as Expected and on the Expected Path Shape How Much Travelers
Were Changed or the Way in Which That Change Matters?
If an intervention’s vehicles and paths effectively operationalize theories of process, then
the extent to which they were driven as expected and followed the expected path should matter
for the extent to which travelers were changed by the journey (parameter e in Figure 7). This is
the more straightforward and obvious prediction with respect to moderated mediation. However,
it is also possible that learning and persuasion are not an issue of magnitude, but of changing
patterns of association. In this case, researchers may expect fidelity to moderate the extent to
which the mediators are related to outcomes (parameter f in Figure 7).
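In the lavaan sketch given earlier, these two possibilities correspond to parameters e and f,
which can be inspected directly; the column selection below assumes that earlier fit object.

# Q5 sketch: e tests magnitude of change in mediators; f tests whether
# fidelity changes how mediators relate to outcomes.
est <- parameterEstimates(fit)
est[est$label %in% c("e", "f"), c("label", "est", "se", "pvalue")]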
Q6: Do Features of the Travelers, Guides, or Scenery Fundamentally Change the Journey or
Whether Travelers Reach Their Destination?
A particular vehicle or path may not be appropriate for all types of travelers, guides, or
scenery. This is the question researchers typically ask. Researchers often test this by
including an interaction term between the effect of randomization and the features of travelers,
guides, or scenery they are interested in. As shown in Figure 7, our approach is more flexible by
depicting moderation in the standard format for multigroup structural equation modeling (Kline,
2016). This notation indicates the possibility of modeling group invariance: the extent to which
parameters in Figure 7 are invariant between different types of travelers, guides, and scenery. For
instance, if a researcher just wants to know whether features of travelers, guides, or scenery
moderate the direct effect of intervention, they should compare a model that freely estimates
parameters a and b to one that constrains them to be equal across groups. Based on model fit, a
researcher may determine whether the effect of intervention and hence the moderation by fidelity
differs based on features of travelers, guides, and scenery. There are other possibilities, however,
and given the flexibility provided by our empirical model, it would be difficult to walk through
every possibility here.
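One hedged illustration of the comparison described above, freely estimating versus constraining
the structural paths across an assumed grouping variable, is:

# Q6 sketch: structural invariance across groups of travelers, guides,
# or scenery (school_type is an assumed grouping variable); reuses the
# product terms defined in the earlier sketch.
model_mg <- '
  mediator ~ treat + treat_x_fid
  outcome  ~ treat + treat_x_fid + mediator + med_x_fid
'
fit_free <- sem(model_mg, data = df, group = "school_type")
fit_eq   <- sem(model_mg, data = df, group = "school_type",
                group.equal = "regressions")
anova(fit_free, fit_eq)  # a significant difference implies the journey
                         # differs across groups (a finer test would
                         # constrain only the a and b paths)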
Applying the Journey Framework to Two Identity-based Motivation Interventions
Following the work of Oyserman (2015), we have developed two classes of vehicle
(teacher-led; digital program) built to move travelers using three core components of identity-
based motivation theory (Figure 3) on paths based on principles of culturally attuned persuasion
and learning (Figures 4 and 5). Each was designed to be implemented at the beginning of the
year, when taking a journey to school success is relevant. Each involves 12 sessions, to be
delivered twice a week, for six weeks. Core components of identity-based motivation are
elaborated on by students themselves, not guides, though guides (teachers, the program) provide
scaffolding in earlier sessions so that travelers do not interpret difficulty as a sign that this
journey is not for them. Core components are presented in multiple ways across multiple
activities, sometimes together, sometimes separately, based on a model of interleaved practice
and self-testing. In the teacher-led model, teachers are trained to elicit responses from students
(rather than providing responses themselves), promoting public commitment and reinforcing
social norms. In the digital model, students are prompted to elaborate on core intervention
activities. Trailblazers--students who came before--provide example responses that reinforce
social norms and scaffold understanding. To prevent reactance, students are not required to share
responses, but they are asked if they’d like to share what they wrote now or after they have a
chance to edit it later.
Teacher-led, Teacher-Trained Pathways-to-Success Intervention
Our teacher-led, teacher-trained model, Pathways-to-Success (Pathways), started with a
researcher- and community-member-led afterschool program, School-to-Jobs (Oyserman et al.,
2006). To facilitate scalability and sustainability, our team has worked on developing,
implementing, and refining the vehicle and path for students in Chicago Public Schools (U.S.
Institute of Education Sciences #R305A14028; #R305A180308) and soon, state-wide in Nevada
(U.S. Education Innovation Research #S411B200027). We use a combination of video
recordings of teachers while they deliver and retrospective student self-report to assess five-
factor smart map fidelity (see Supplemental Materials for our measures). We answered Q1 by
testing whether our method of training teachers was sufficient to teach teachers how to use the
vehicle and follow the expected path (Horowitz et al., 2018). After documenting that teachers
could deliver and students experienced the journey as expected, we refined training using
feedback from teachers, creating a manual that better fit teachers’ expectations and needs
(brightly colored panels, main points highlighted in break-out boxes). We also developed a train-
the-trainer model, in which the teachers who attained the highest fidelity scores are trained to
train other teachers to do the same. In an observational development trial (i.e., no
randomization), we tested this train-the-trainer model, answering Q1 (teachers can deliver with
fidelity) and Q3[7] and Q4, finding that students who experienced the journey closer to what we
had hoped were more likely to get to their destination (better grades, lower likelihood of failing
courses) and that it was because they were changed by the journey in the expected ways (i.e.,
through school-focused identity-based motivation; Oyserman et al., in press). Across cycles
of implementation, we have continued to make changes to the vehicle, path, and smart map
based on smart map data and structured feedback from teachers and teacher trainers.
[7] We could not directly address Q2, given that there was not a control group.
The realities of a changing world have meant that two of our extended tests of our model
faced quite radical shifts in the scenery that were out of our control: teachers went on strike in
Chicago shortly after delivering the intervention in November 2019 and the COVID-19
pandemic shut schools down in March 2020. In one current test (osf.io/yebac), schools were
randomized to have their 8th-grade teachers trained to deliver the Pathways intervention, so we
will be able to further extend causal claims and continue to refine and adapt our models.
However, we will interpret them in light of the complex realities of a changing and dynamic
world. The COVID-19 pandemic has also meant that teachers in delayed-treatment schools
(schools randomized to control but that receive the intervention a year after randomization) are not able to attend
in-person training. This means our model of training had to adapt. We are collecting smart map
data from video recordings of these teachers to see if a web-based model is a feasible means to
train teachers, though we expect our in-person model, based on active learning and behavioral
modeling, to be more effective, given its firmer basis in psychological theory than more passive
web-based learning.
Fortunately, not all changes to the scenery are quite so unexpected and dramatic. In our
first year of implementation in Nevada, we are working with a small set of schools before
extending our training model to a larger swath of the state. Mostly rural Nevada and densely
urban Chicago are very different places and the travelers and guides have different cultural
makeup and values. Because we have a strong theory-based rationale for the structure and
content of activities, we have been able to identify non-negotiables and separate them from
necessary adaptations. For instance, one session of the intervention involves images of what
adulthood might be like. In Chicago, images are mostly of Latinx and African American adults.
In Nevada, we focused on more ethnically/racially ambiguous adults, but made sure to include
Native American adult examples, as we are working with schools serving the Western Shoshone
and Washoe tribes.
Digital Pathways
Though the train-the-trainer model has proven useful, we have explored using a digital-
translation model as the basis of designing a sustainable intervention journey. We worked with a
game developer to translate the activities of the in-person Pathways intervention to a digital
context for use in schools in rural and semi-rural Colorado (U411C150011). The central
challenge to designing the vehicle and creating a path was how to leverage classroom interaction
to engage identity-based processes when no real-time interaction could take place. Our solution
was to include responses from Trailblazers--students who came before and could help show new
students the way. The digital intervention was new, so there weren't other students who had
completed it before. But having watched videos of in-person Pathways sessions, we had actual student
examples that could be adapted for our digital vehicle. We ensured that Trailblazers matched the
demographics of the students (mostly white and Latinx) and had names that were common in
Colorado for the birth cohorts of students, but made sure to sample within race/ethnic groups so
names reflected the diversity in our sample.
Our central challenge in creating a smart map was to consider what fidelity looks like
when the content is held constant. We assumed that the quality of messages was high, as we
drew on social scientific theories of persuasion and learning to develop them. The biggest
unknown was what students were doing: what were they clicking, were they writing meaningful
responses when prompted, were they engaging with Trailblazers? We also did not have direct
video observation of what was happening in the classroom. To solve these problems, we insisted
on having detailed backend data that reflected the core components of our vehicle and path. We
extracted adherence from student login attempts; dosage from date stamps of logins for all students
in the class; responsiveness from the extent to which a student's responses were serious attempts to
answer the question (hand-coded by research assistants); and quality from the proportion of
responses students were willing to share and the proportion of Trailblazers clicked on relative to
the minimum requirement (typically one, but sometimes more; we sketch this extraction step at
the end of this section). When we examined the smart
map, however, it became abundantly clear that something had gone wrong. Few students
completed all 12 sessions and classrooms were not in sync: some students rushed ahead
completing all sessions in a few sittings while others fell far behind, completing only four or five
sessions. Our smart-map fidelity scores implied that learning was unlikely to occur given the rapid pace
at which some students completed the sessions: like cramming for a test, unspaced practice of to-be-
learned material may yield short-term performance benefits but is unlikely to yield the type of
deep learning we had hoped for (e.g., Soderstrom & Bjork, 2015). Our smart map, however, did
imply that students who were given the opportunity to do so did engage in the ways we expected:
the vast majority provided serious responses to prompts, they shared about a third of their
responses, and, on average, clicked on twice as many Trailblazers as we expected. They engaged,
publicly committed, and attended to social norms. Not surprisingly, randomization did not
influence outcomes for students assigned to the intervention journey. Because we had a smart map,
we know a little bit about why. We talked to schools and found that though we had assumed their
bandwidth could handle the digital intervention, the amount of data passing through the network
created a bottleneck that caused issues in saving student progress. Rather than accepting defeat,
we were encouraged that students engaged in the way we expected when they had the
opportunity. The problem was that we did not go far enough in training teachers in why this digital
intervention was helpful and how they could use it with their students. The local scenery
presented constraints that we could not address at the time. We have taken these lessons from a
failed journey and are currently working with a new game developer to home in on making the
digital intervention more usable and hence testable.
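To illustrate how backend data like these could be rolled up into per-student fidelity factors,
here is a hedged dplyr sketch; the logs table and every column name are assumptions for
exposition, not our actual schema.

# Sketch: deriving smart-map fidelity factors from assumed backend logs
# with columns student_id, session, login_time, serious_response (0/1,
# hand-coded), shared (0/1), and trailblazer_clicks (count per session).
library(dplyr)

fidelity_factors <- logs %>%
  group_by(student_id) %>%
  summarise(
    adherence      = n_distinct(session) / 12,         # of 12 sessions
    dosage         = n_distinct(as.Date(login_time)),  # distinct sittings
    responsiveness = mean(serious_response),           # serious answers
    quality        = (mean(shared) +
                      mean(pmin(trailblazer_clicks, 1))) / 2
  )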
Conclusion
Having a journey metaphor allows us to unpack the abstraction of intervention such that
researchers, practitioners, and policymakers can clearly articulate and understand the six
components of any intervention. Though researchers have tended to focus on travelers
(participants), guides (implementers), and scenery (intervention context), the journey metaphor
shines light on often overlooked vehicles (psychological theory translated to a manualized
intervention), paths (culturally-attuned psychology of learning, memory, and persuasion
operationalized in manualized train-the-trainer and intervention materials), and smart maps (quantified
fidelity to stay on track and improve manualized training and implementation). We showed how
each of these six elements matters and can be used to learn from past journeys and plan new ones,
which is only possible with an up-to-date smart map. Critically, social scientific theories can
explain why each element should matter. To facilitate application of our conceptual model, we
provide a general SEM framework that highlights how researchers can answer six critical questions
and specific examples for how answers to those questions have shaped our work intervening in
schools. Our synthesis integrates the issues raised by many researchers in the field frustrated by
the gap between evidence and practice and provides a way forward using a journey metaphor for
understanding intervention design, implementation, evaluation, and sustainability.
References
Ackerman, S. J., & Hilsenroth, M. J. (2003). A review of therapist characteristics and techniques
positively impacting the therapeutic alliance. Clinical Psychology Review, 23(1), 1-33.
Adelman, C. (2006). The toolbox revisited: Paths to degree completion from high school through
college. Washington, D.C.: U.S. Department of Education.
Al-Ubaydli, O., List, J. A., & Suskind, D. (2019). The science of using science: Towards an
understanding of the threats to scaling experiments (No. w25848). National Bureau of
Economic Research.
Amodio, D. M. (2019). Social Cognition 2.0: An interactive memory systems account. Trends in
Cognitive Sciences, 23(1), 21-33.
Anderman, E. M., Anderman, L. H., & Griesinger, T. (1999). The relation of present and
possible academic selves during early adolescence to grade point average and
achievement goals. The Elementary School Journal, 100(1), 3-17.
Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using
instrumental variables. Journal of the American Statistical Association, 91(434), 444-455.
Ansong, D., Chowa, G., Masa, R., Despard, M., Sherraden, M., Wu, S., & Osei-Akoto, I. (2018).
Effects of youth savings accounts on school attendance and academic performance:
evidence from a youth savings experiment. Journal of Family and Economic Issues, 1-13.
Aronson, J., Fried, C. B., & Good, C. (2002). Reducing the effects of stereotype threat on
African American college students by shaping theories of intelligence. Journal of
Experimental Social Psychology, 38(2), 113-125.
Attewell, P., & Domina, T. (2008). Raising the bar: Curricular intensity and academic
performance. Educational Evaluation and Policy Analysis, 30(1), 51-71.
Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian
Journal of Statistics, 12, 171-178.
Azzalini, A., & Valle, A. D. (1996). The multivariate skew-normal distribution. Biometrika,
83(4), 715-726.
Bandura, A. (1977). Self-efficacy: toward a unifying theory of behavioral change. Psychological
Review, 84(2), 191-215.
Bartsch, L. M., & Oberauer, K. (2021). The effects of elaboration on working memory and long-
term memory across age. Journal of Memory and Language, 118, 104215.
Beal, S. J., & Crockett, L. J. (2010). Adolescents’ occupational and educational aspirations and
expectations: Links to high school activities and adult educational attainment.
Developmental Psychology, 46(1), 258-265.
Beets, M. W., Flay, B. R., Vuchinich, S., Acock, A. C., Li, K. K., & Allred, C. (2008). School
climate and teachers’ beliefs and attitudes associated with implementation of the positive
action program: A diffusion of innovations model. Prevention Science, 9(4), 264-275.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of
Machine Learning Research, 13(2), 281-305.
Bernal, G. E., & Domenech Rodríguez, M. M. (2012). Cultural adaptations: Tools for evidence-
based practice with diverse populations (pp. xix-307). American Psychological
Association.
Bi, C., & Oyserman, D. (2015). Left behind or moving forward? Effects of possible selves and
strategies to attain them among rural Chinese children. Journal of Adolescence, 44, 245-
258.
Bishop, J. H. (2000). Nerd harassment and grade inflation: Are college admissions policies
partly responsible? (CHERI Working Paper #8). Retrieved from Cornell University, ILR
School site: http://digitalcommons.ilr.cornell.edu/cheri/8
Blakely, C. H., Mayer, J. P., Gottschalk, R. G., Schmitt, N., Davidson, W. S., Roitman, D. B., &
Emshoff, J. G.(1987). The fidelity-adaptation debate: Implications for the implementation
of public sector social programs. American Journal of Community Psychology, 15, 253–
268.
Bodenhausen, G. V., Macrae, C. N., & Hugenberg, K. (2003). Social cognition. Handbook of
psychology, 255-282.
Bond, G. R., Evans, L., Salyers, M. P., Williams, J., & Kim, H. W. (2000). Measurement of
fidelity in psychiatric rehabilitation. Mental Health Services Research, 2(2), 75–87.
Borman, G. D., Grigg, J., Rozek, C. S., Hanselman, P., & Dewey, N. A. (2018). Self-affirmation
effects are produced by school context, student engagement with the intervention, and
time: Lessons from a district-wide implementation. Psychological Science, 29(11), 1773-
1784.
Boulay, B., Goodson, B., Olsen, R., McCormick, R., Darrow, C., Frye, M., & Rimdzius, T.
(2018). The Investing in Innovation Fund: Summary of 67 evaluations. U.S. Department
of Education.
Brehm, J. W. (1966). A theory of psychological reactance. Academic Press.
Breland, H., Maxey, J., Gernand, R., Cumming, T., & Trapani, C. (2002). Trends in college
admission 2000: A report of a survey of undergraduate admissions policies, practices,
and procedures. Tallahassee, FL: Association for Institutional Research.
Bürkner, P. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of
Statistical Software, 80(1), 1–28.
Burt, R. S. (1999). The social capital of opinion leaders. The Annals of the American Academy of
Political and Social Science, 566(1), 37-54.
Camara, W. J., Kimmel, E. W., Scheuneman, J., & Sawtell, E. A. (2003). Whose grades are
inflated? (College Board Research Report No. 2003-4). The College Board.
Carroll, C., Patterson, M., Wood, S., Booth, A., Rick, J., & Balain, S. (2007). A conceptual
framework for implementation fidelity. Implementation Science, 2(1), 1-9.
Carroll, P. J., Shepperd, J. A., & Arkin, R. M. (2009). Downward self-revision: Erasing possible
selves. Social Cognition, 27(4), 550-578.
Castro-Schilo, L., & Grimm, K. J. (2018). Using residualized change versus difference scores for
longitudinal research. Journal of Social and Personal Relationships, 35(1), 32-58.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in
verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132(3),
354–380.
Chambers, D. A., Glasgow, R. E., & Stange, K. C. (2013). The dynamic sustainability
framework: addressing the paradox of sustainment amid ongoing change. Implementation
Science, 8(1), 117.
Cheung, A. C., & Slavin, R. E. (2016). How methodological features affect effect sizes in
education. Educational Researcher, 45(5), 283-292.
Chowdhury, F. (2018). Grade Inflation: Causes, Consequences and Cure. Journal of Education
and Learning, 7(6), 86-92.
Cohen, J. (2006). Social, emotional, ethical, and academic education: Creating a climate for
learning, participation in democracy, and well-being. Harvard Educational Review,
76(2), 201-237.
Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing.
Psychological Review, 82(6), 407–428.
Collyer, H., Eisler, I., & Woolgar, M. (2019). Systematic literature review and meta-analysis of
the relationship between adherence, competence and outcome in psychotherapy for
children and adolescents. European Child & Adolescent Psychiatry, 1-15.
Cook, C. R., Lyon, A. R., Locke, J., Waltz, T., & Powell, B. J. (2019). Adapting a compilation of
implementation strategies to advance school-based implementation research and practice.
Prevention Science, 20(6), 914-935.
Corte, C., Lee, C. K., Stein, K. F., & Raszewski, R. (2020). Possible selves and health behavior
in adolescents: A systematic review. Self and Identity, 1-27.
Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic
memory. Journal of Experimental Psychology: General, 104(3), 268–294.
Crits-Christoph, P., Baranackie, K., Kurcias, J., Beck, A., Carroll, K., Perry, K., ... & Gallagher,
D. (1991). Meta‐analysis of therapist effects in psychotherapy outcome studies.
Psychotherapy Research, 1(2), 81-91.
Cronbach, L. J. (1982). Designing evaluations of educational and social programs. Jossey-Bass.
Cronbach, L. J., & Furby, L. (1970). How should we measure "change"—or should we?
Psychological Bulletin, 74, 68–80.
Cross, S. E., Uskul, A. K., Gerçek-Swing, B., Sunbay, Z., Alözkan, C., Günsoy, C., ... &
Karakitapoğlu-Aygün, Z. (2014). Cultural prototypes and dimensions of honor.
Personality and Social Psychology Bulletin, 40(2), 232-249.
Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research.
InterJournal, complex systems, 1695(5), 1-9.
Dai, H., & Li, C. (2019). How experiencing and anticipating temporal landmarks influence
motivation. Current opinion in Psychology, 26, 44-48.
De Deyne, S., & Storms, G. (2008). Word associations: Network and semantic properties.
Behavior research methods, 40(1), 213-231.
Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding randomized
controlled trials. Social Science & Medicine, 210, 2–21.
Destin, M., & Oyserman, D. (2010). Incentivizing education: Seeing schoolwork as an
investment, not a chore. Journal of Experimental Social Psychology, 46(5), 846-849.
Destin, M., & Svoboda, R. C. (2017). A brief randomized controlled intervention targeting
parents improves grades during middle school. Journal of Adolescence, 56, 157-161.
Dickens, C. (1843/1858). A Christmas Carol. Bradbury & Evans.
Dimitriadou, E., Hornik, K., Leisch, F., Meyer, D., & Weingessel, A. (2008). Misc functions of
the Department of Statistics (e1071), TU Wien. R package, 1, 5-24.
Duckworth, A. L., Kirby, T. A., Gollwitzer, A., & Oettingen, G. (2013). From fantasy to action:
Mental contrasting with implementation intentions (MCII) improves academic
performance in children. Social Psychological and Personality Science, 4(6), 745-753.
Durlak, J. A., & DuPre, E. P. (2008). Implementation matters: a review of research on the
influence of implementation on program outcomes and the factors affecting
implementation. American Journal of Community Psychology, 41(3), 327-350.
Eccles, J. S., & Wigfield, A. (2002). Motivational beliefs, values, and goals. Annual Review of
Psychology, 53(1), 109-132.
Eccles, J. S., & Wigfield, A. (2020). From expectancy-value theory to situated expectancy-value
theory: A developmental, social cognitive, and sociocultural perspective on motivation.
Contemporary Educational Psychology, 61, 101859.
Eccles, J., Adler, T. F., Futterman, R., Goff, S. B., Kaczala, C. M., Meece, J., & Midgley, C.
(1983). Expectancies, values, and academic behaviors. In J. T. Spence (Ed.), Achievement
and achievement motives (pp. 75-146). W. H. Freeman.
Elliott, W., Choi, E. H., Destin, M., & Kim, K. H. (2011). The age-old question, which comes
first? A simultaneous test of children's savings and children's college-bound
identity. Children and Youth Services Review, 33(7), 1101-1111.
Fazio, R. H. (1990). Multiple processes by which attitudes guide behavior: The MODE model as
an integrative framework. In Advances in experimental social psychology (Vol. 23, pp.
75-109). Academic Press.
Feliciano, C. (2012). The female educational advantage among adolescent children of
immigrants. Youth & Society, 44(3), 431-449.
Festinger, L. (1957). A theory of cognitive dissonance (Vol. 2). Stanford university press.
Fixsen, D. L., Blase, K. A., Naoom, S. F., & Wallace, F. (2009). Core implementation
components. Research on Social Work Practice, 19(5), 531-540.
Freeman, L. C. (1979). Centrality in social networks I: Conceptual clarification. Social Networks,
1, 215-239.
Geiser, S. (2009). Back to the basics: In defense of achievement (and achievement tests) in
college admissions. Change, 41(1), 16-23.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment
on article by Browne and Draper). Bayesian Analysis, 1(3), 515-534.
Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple
sequences. Statistical Science, 7(4), 457-472.
Gelman, A., & Shirley, K. (2011). Inference from simulations and monitoring convergence.
Handbook of Markov Chain Monte Carlo, 6, 163-174.
Gelman, A., Goodrich, B., Gabry, J., & Vehtari, A. (2019). R-squared for Bayesian regression
models. The American Statistician, 73(3). 1–6.
Gelman, A., Jakulin, A., Pittau, M. G., & Su, Y. (2008). A default prior distribution for logistic
and other regression models. Annals of Applied Statistics, 2(4), 1360-1383.
Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., ... & Modrák,
M. (2020). Bayesian workflow. arXiv preprint arXiv:2011.01808.
Godfrey, K. E. (2011). Investigating grade inflation and non-equivalence (Research Report
2011-2). The College Board.
Gollwitzer, P. M., & Sheeran, P. (2006). Implementation intentions and goal achievement: A
meta‐analysis of effects and processes. Advances in Experimental Social Psychology, 38,
69-119.
Graham, A. K., Lattie, E. G., Powell, B. J., Lyon, A. R., Smith, J. D., Schueller, S. M., ... &
Mohr, D. C. (2020). Implementation strategies for digital mental health interventions in
health care settings. American Psychologist, 75(8), 1080.
Granovetter, M. S. (1973). The strength of weak ties. American journal of sociology, 78(6),
1360-1380.
Haller, H., & Kraus, S. (2002). Misinterpretations of significance: A problem students share with
their teachers? Methods of Psychological Research, 7(1), 1–20.
Hamedani, M. Y. G., & Markus, H. R. (2019). Understanding culture clashes and catalyzing
change: A culture cycle approach. Frontiers in Psychology, 10, 700.
Heckhausen, H., & Gollwitzer, P. M. (1987). Thought contents and cognitive functioning in
motivational versus volitional states of mind. Motivation and Emotion, 11(2), 101-120.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). Most people are not WEIRD. Nature,
466(7302), 29.
Hershfield, H. E. & Bartels, D. (2018), “The Future Self,” In G. Oettingen, A. Sevincer, & P. M.
Gollwitzer. (Eds). The Psychology of Thinking about the Future. Guilford Press, 89-109.
Hershfield, H. E., Goldstein, D. G., Sharpe, W. F., Fox, J., Yeykelis, L., Carstensen, L. L., &
Bailenson, J. N. (2011). Increasing saving behavior through age-progressed renderings of
the future self. Journal of Marketing Research, 48(SPL), S23-S37.
Higgins, E. T. (2005). Value from regulatory fit. Current Directions in Psychological Science,
14, 209–213.
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for
interpreting effect sizes in research. Child Development Perspectives, 2(3), 172-177.
Hogg, M. A. (2020). Social identity theory (pp. 112-138). Stanford University Press.
Horowitz, E., Oyserman, D., Deghani, M., & Sorensen, N. (2020). Do you need a roadmap or
can someone give you directions: When school-focused possible identities change so do
academic trajectories. Journal of Adolescence, 79, 26-38.
Horowitz, E., Sorensen, N., Yoder, N., & Oyserman, D. (2018). Teachers can do it: Scalable
identity-based motivation intervention in the classroom. Contemporary Educational
Psychology, 54, 12-28.
Hoyle, R. H., & Sherrill, M. R. (2006). Future orientation in the self‐system: Possible selves,
self‐regulation, and behavior. Journal of Personality, 74(6), 1673-1696.
James, W. (1892). Psychology: Briefer Course. Macmillan.
Johnson, D. W., & Johnson, R. T. (1989). Cooperation and competition: Theory and research.
Interaction Book Company.
Kazdin, A. E. (2007). Mediators and mechanisms of change in psychotherapy research. Annual
Review of Clinical Psychology, 3, 1-27.
Kirst, M. W., & Venezia, A. (2001). Bridging the great divide between secondary schools and
postsecondary education. The Phi Delta Kappan, 83(1), 92-97.
Kitsak, M., Gallos, L. K., Havlin, S., Liljeros, F., Muchnik, L., Stanley, H. E., & Makse, H. A.
(2010). Identification of influential spreaders in complex networks. Nature Physics,
6(11), 888-893.
Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.).
Guilford Press.
Lakoff, G., & Johnson, M. (1980). The metaphorical structure of the human conceptual system.
Cognitive Science, 4(2), 195-208.
Landau, M. J., Oyserman, D., Keefer, L. A., & Smith, G. C. (2014). The college journey and
academic engagement: How metaphor use enhances identity-based motivation. Journal of
Personality and Social Psychology, 106(5), 679–698.
Lee, J., Husman, J., Scott, K. A., & Eggum-Wilkens, N. D. (2015). Compugirls: Stepping stone
to future computer-based technology pathways. Journal of Educational Computing
Research, 52(2), 199-223.
Marini, J. P., Westrick, P. A., Young, L., Shmueli, D., Shaw, E. J., & Ng, H. (2019). Validity of
SAT® essay scores for predicting first-year grades. College Board.
Marjoribanks, K. (2003). Learning environments, family contexts, educational aspirations and
attainment: A moderation-mediation model extended. Learning Environments Research,
6(3), 247-265.
Markus, H., & Nurius, P. (1986). Possible selves. American Psychologist, 41(9), 954-969.
Martin, D. J., Garske, J. P., & Davis, M. K. (2000). Relation of the therapeutic alliance with
outcome and other variables: A meta-analytic review. Journal of Consulting and Clinical
Psychology, 68(3), 438–450.
Merolla, D. M. (2013). The net Black advantage in educational transitions: An education careers
approach. American Educational Research Journal, 50(5), 895-924.
Messersmith, E. E., & Schulenberg, J. E. (2008). When can we expect the unexpected?
Predicting educational attainment when it differs from previous expectations. Journal of
Social Issues, 64(1), 195-212.
Michie, S., & Abraham, C. (2004). Interventions to change health behaviours: evidence-based or
evidence-inspired?. Psychology & Health, 19(1), 29-49.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed
representations of words and phrases and their compositionality. In C. J. C. Burges, L.
Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in neural
information processing systems (pp. 3111–3119). Redhook, NY: Curran Associates Inc.
Miller, D. T., & Prentice, D. A. (2016). Changing norms to change behavior. Annual Review of
Psychology, 67, 339-361.
Miller, W. R., & Rose, G. S. (2009). Toward a theory of motivational interviewing. American
Psychologist, 64(6), 527-537.
Mohr, D. C., Schueller, S. M., Riley, W. T., Brown, C. H., Cuijpers, P., Duan, N., ... & Cheung,
K. (2015). Trials of intervention principles: evaluation methods for evolving behavioral
intervention technologies. Journal of Medical Internet research, 17(7), e166.
Muller, C. (2001). The role of caring in the teacher‐student relationship for at‐risk students.
Sociological Inquiry, 71(2), 241-255.
Murrar, S., Campbell, M. R., & Brauer, M. (2020). Exposure to peers’ pro-diversity attitudes
increases inclusion and reduces the achievement gap. Nature Human Behaviour, 4(9),
889-897.
Nikolakakos, E., Reeves, J. L., & Shuch, S. (2012). An examination of the causes of grade
inflation in a teacher education program and implications for practice. College and
University, 87(3), 2-13.
Nurra, C., & Oyserman, D. (2018). From future self to current action: An identity-based
motivation perspective. Self and Identity, 17(3), 343-364.
Ooms, J. (2020). hunspell: High-performance stemmer, tokenizer, and spell checker. R package
version 3.0.1. https://CRAN.R-project.org/package=hunspell
O'Malley, M., Voight, A., Renshaw, T. L., & Eklund, K. (2015). School climate, family
structure, and academic achievement: A study of moderation effects. School Psychology
Quarterly, 30(1), 142–157.
O’Donnell, C. L. (2008). Defining, conceptualizing, and measuring fidelity of implementation
and its relationship to outcomes in K–12 curriculum intervention research. Review of
educational research, 78(1), 33-84.
O'Donnell, S. C., Yan, V. X., Bi, C., & Oyserman, D. (under review). Is difficulty mostly about
impossibility? What difficulty implies may be culturally variant. Motivation Science.
Oettingen, G., Mayer, D., Thorpe, J. S., Janetzke, H., & Lorenz, S. (2005). Turning fantasies
about positive and negative futures into self-improvement goals. Motivation and
Emotion, 29(4), 236-266.
Ou, S.-R., & Reynolds, A. J. (2008). Predictors of educational attainment in the Chicago
Longitudinal Study. School Psychology Quarterly, 23(2), 199–229.
Oyserman, D., & James, L. (2011). Possible identities. In S. Schwartz, K. Luyckx, & V. Vignoles
(Eds.), Handbook of identity theory and research (pp. 117-145). Springer.
Oyserman, D. (2015). Pathways to success through identity-based motivation. Oxford University Press.
Oyserman, D. (2017). Culture three ways: Culture and subcultures within countries. Annual
Review of Psychology, 68, 435–463.
Oyserman, D., & Fryberg, S. (2006). The Possible Selves of Diverse Adolescents: Content and
Function Across Gender, Race and National Origin. In C. Dunkel & J. Kerpelman (Eds.),
Possible selves: Theory, research and applications (p. 17–39). Nova Science Publishers.
Oyserman, D., & Lewis, N. A., Jr. (2017). Seeing the destination AND the path: Using identity‐
based motivation to understand and reduce racial disparities in academic achievement.
Social Issues and Policy Review, 11(1), 159–194.
Oyserman, D., Bybee, D., & Terry, K. (2006). Possible selves and academic outcomes: How and
when possible selves impel action. Journal of Personality and Social Psychology, 91(1),
188-204.
Oyserman, D., Bybee, D., Terry, K., & Hart-Johnson, T. (2004). Possible selves as roadmaps.
Journal of Research in Personality, 38(2), 130-149.
Oyserman, D., Destin, M., & Novin, S. (2015). The context-sensitive future self: Possible selves
motivate in context, not otherwise. Self and Identity, 14(2), 173-188.
Oyserman, D., Elmore, K., Novin, S., Fisher, O., & Smith, G. C. (2018). Guiding people to
interpret experienced difficulty as importance highlights their academic possibilities and
improves their academic performance. Frontiers in Psychology, 9, 1-13.
Oyserman, D., Gant, L., & Ager, J. (1995). A socially contextualized model of African American
identity: Possible selves and school persistence. Journal of Personality and Social
Psychology, 69(6), 1216-1232.
Oyserman, D., Johnson, E., & James, L. (2011). Seeing the destination but not the path: Effects
of socioeconomic disadvantage on school-focused possible self content and linked
behavioral strategies. Self and Identity, 10(4), 474-492.
Oyserman, D., Lewis, N. A. Jr, Yan, V. X., Fisher, O., O'Donnell, S. C., & Horowitz, E. (2017).
An identity-based motivation framework for self-regulation. Psychological Inquiry, 28(2-
3), 139-147.
Oyserman, D., O'Donnell, S. C., Wingert, K. M., & Sorensen, N. (in press). The clearer the
signal, the stronger the effect: Teaching practices that enhance student identity-based
motivation reduce academic risk. Contemporary Educational Psychology.
Oyserman, D., Sorensen, N., Reber, R., & Chen, S. X. (2009). Connecting and separating mind-
sets: Culture as situated cognition. Journal of Personality and Social Psychology, 97(2),
217-235.
Perepletchikova, F., & Kazdin, A. E. (2005). Treatment integrity and therapeutic change: Issues
and research recommendations. Clinical psychology: Science and Practice, 12(4), 365-
383.
Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. In
Communication and Persuasion (pp. 1-24). Springer.
Pfeifer, J. H., & Berkman, E. T. (2018). The development of self and identity in adolescence:
Neural evidence and implications for a value‐based choice perspective on motivated
behavior. Child Development Perspectives, 12(3), 158-164.
Powell, B. J., Waltz, T. J., Chinman, M. J., Damschroder, L. J., Smith, J. L., Matthieu, M. M., ...
& Kirchner, J. E. (2015). A refined compilation of implementation strategies: results
from the Expert Recommendations for Implementing Change (ERIC) project.
Implementation Science, 10(1), 1-14.
Premachandra, B., & Lewis Jr, N. (in press). Do we report the information that is necessary to
give psychology away? A scoping review of the psychological intervention literature
2000-2018. Perspectives on Psychological Science.
Quin, D. (2017). Longitudinal and contextual associations between teacher–student relationships
and student engagement: A systematic review. Review of Educational Research, 87(2),
345-387.
Raskin, N. J., & Rogers, C. R. (2005). Person-centered therapy. In R. J. Corsini & D. Wedding
(Eds.), Current psychotherapies (p. 130–165). Thomson Brooks/Cole Publishing Co.
Riley, W. T., Serrano, K. J., Nilsen, W., & Atienza, A. A. (2015). Mobile and wireless
technologies in health behavior and the potential for intensively adaptive interventions.
Current Opinion in Psychology, 5, 67-71.
Rinaldi, R. L. & Farr, A. C. (2018). Promoting positive youth development: a psychosocial
intervention evaluation. Psychosocial Intervention, 27(1), 22-34.
Roberts, G., Scammacca, N., & Roberts, G. J. (2018). Causal mediation in educational
intervention studies. Behavioral Disorders, 43(4), 457-465.
Roediger III, H. L., & Karpicke, J. D. (2006). The power of testing memory: Basic research and
implications for educational practice. Perspectives on Psychological Science, 1(3), 181-
210.
Rogers, C. R. (1946). Significant aspects of client-centered therapy. American Psychologist,
1(10), 415-422.
Rogers, E. M. (1962/1995). Diffusions of Innovation. The Free Press.
Rogosa, D. W., & Willett, J. B. (1983). Demonstrating the reliability of the difference score in the
measurement of change. Journal of Education Measurement, 20, 335–343.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton University Press.
Rosenzweig, M. R., & Udry, C. (2020). External validity in a stochastic world: Evidence from
low-income countries. The Review of Economic Studies, 87(1), 343-381.
Rothman, A. J. (2004). “Is there nothing more practical than a good theory?”: Why innovations
and advances in health behavior change will arise if interventions are used to test and
refine theory. International Journal of Behavioral Nutrition and Physical Activity, 1(1),
11.
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions
and use interpretable models instead. Nature Machine Intelligence, 1(5), 206-215.
Rudin, C., Wang, C., & Coker, B. (2018). The age of secrecy and unfairness in recidivism
prediction. arXiv preprint arXiv:1811.00731.
Ruvolo, A. P., & Markus, H. R. (1992). Possible selves and performance: The power of self-
relevant imagery. Social Cognition, 10(1), 95-124.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic
motivation, social development, and well-being. American Psychologist, 55(1), 68–78.
Schlegel, A., & Barry, H. III. (1991). Adolescence: An anthropological inquiry. Free Press.
Schoon, I., & Ng-Knight, T. (2017). Co-development of educational expectations and effort:
Their antecedents and role as predictors of academic success. Research in Human
Development, 14(2), 161-176.
Schwarz, N. (2015). Metacognition. In M. Mikulincer, P. R. Shaver, E. Borgida, & J. A. Bargh
(Eds.), APA handbook of personality and social psychology: Attitudes and social
cognition (pp. 203-229). American Psychological Association.
Schwarz, N., Newman, E., & Leach, W. (2016). Making the truth stick & the myths fade:
Lessons from cognitive psychology. Behavioral Science & Policy, 2(1), 85-95.
Shrout, P. E., & Rodgers, J. L. (2018). Psychology, science, and knowledge construction:
Broadening perspectives from the replication crisis. Annual Review of Psychology, 69,
487-510.
Siew, C. S., Wulff, D. U., Beckage, N. M., & Kenett, Y. N. (2019). Cognitive network science:
A review of research on cognition through the lens of network representations, processes,
and dynamics. Complexity, 2019.
Silge, J., & Robinson, D. (2016). tidytext: Text mining and analysis using tidy data principles in
R. Journal of Open Source Software, 1(3), 37.
Sisk, V. F., Burgoyne, A. P., Sun, J., Butler, J. L., & Macnamara, B. N. (2018). To what extent
and under which circumstances are growth mind-sets important to academic
achievement? Two meta-analyses. Psychological Science, 29(4), 549-571.
Sleeter, C. (2008). Equity, democracy, and neoliberal assaults on teacher education. Teaching
and teacher education, 24(8), 1947-1957.
Smith, E. R., & Semin, G. R. (2007). Situated social cognition. Current Directions in
Psychological Science, 16(3), 132-135.
Soderstrom, N. C., & Bjork, R. A. (2015). Learning versus performance: An integrative review.
Perspectives on Psychological Science, 10(2), 176-199.
Stephens, N. M., Hamedani, M. G., & Destin, M. (2014). Closing the social class achievement
gap: A diversity education intervention improves first-generation students’ academic
performance and all students’ college transition. Psychological Science, 25, 943-953.
Stephens, N. M., Townsend, S. S., Hamedani, M. G., Destin, M., & Manzo, V. (2015). A
difference-education intervention equips first-generation college students to thrive in the
face of stressful college situations. Psychological Science, 26(10), 1556-1566.
Steyvers, M., & Tenenbaum, J. B. (2005). The large‐scale structure of semantic networks:
Statistical analyses and a model of semantic growth. Cognitive Science, 29(1), 41-78.
Symons, C. S., & Johnson, B. T. (1997). The self-reference effect in memory: A meta-analysis.
Psychological Bulletin, 121(3), 371–394.
Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological Review, 110(3), 403–421.
U.S. Department of Education, (2018). Table 216.60. Digest of Education Statistics.
Vapnik, V. (2000). The Nature of Statistical Learning Theory. Springer.
Vehtari, A., Gelman, A., & Gabry, J. (2016). Practical Bayesian model evaluation using leave-
one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413-1432.
Vivalt, E. (2020). How much can we generalize from impact evaluations?. Journal of the
European Economic Association, 18(6), 3045-3089.
Vroom, V. H. (1964). Work and Motivation. Wiley.
Wakslak, C. J., Trope, Y., Liberman, N., & Alony, R. (2006). Seeing the forest when entry is
unlikely: probability and the mental representation of events. Journal of Experimental
Psychology: General, 135(4), 641-653.
Wakslak, C. J., Nussbaum, S., Liberman, N., & Trope, Y. (2008). Representations of the self in
the near and more distant future. Journal of Personality and Social Psychology, 95(4),
757-773.
Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. Oxford
University Press.
Woolley, M. E., Rose, R. A., Orthner, D. K., Akos, P. T., & Jones-Sanpei, H. (2013). Advancing
academic achievement through career relevance in the middle grades: A longitudinal
evaluation of CareerStart. American Educational Research Journal, 50(6), 1309-1335.
Yan, V. X., Schuetze, B. A., & Eglington, L. G. (2020, December 10). A Review of the
Interleaving Effect: Theories and Lessons for Future Research.
https://doi.org/10.31234/osf.io/ur6g7
Yeager, D. S. & Walton, G. M. (2011). Social-psychological interventions in education: They're
not magic. Review of Educational Research, 81, 267-301.
Zhao, L., Chen, L., Sun, W., Compton, B., Lee, K., & Heyman, G. (2019). Young children are
more likely to cheat after overhearing that a classmate is smart. Developmental Science,
e12930.