DEVELOPING AND TESTING NOVEL STRATEGIES TO DETECT INATTENTIVE
RESPONDING IN ECOLOGICAL MOMENTARY ASSESSMENT STUDIES
By
Shirlene D Wang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(PREVENTIVE MEDICINE (HEALTH BEHAVIOR RESEARCH))
August 2023
Copyright 2023 Shirlene D Wang
ACKNOWLEDGEMENTS
This dissertation was only possible due to the attentiveness of many people!
First, I am grateful to my committee: Genevieve Dunton, Stephen Intille, Matthew
Kirkpatrick, Raina Pang, and Tyler Mason. Specifically, I must acknowledge my committee
chair, Genevieve, for her mentorship, advice, encouragement, and guidance throughout my
Ph.D. Similarly, I must acknowledge Stephen for the inspiration, guidance, and patience in
reminding me that I mean survey when I say prompt. I remember the first time I read about
the Mobile Teens study and dreamed about applying to graduate school and collaborating with
them, and it is one of the biggest honors of my life to be associated with them on the TIME
Study. My dissertation would not have been possible without Aditya Ponnada supporting my
interest in engagement and programming all the attention check questions. I also want to
thank Jixin Li, Micaela Hewus, Jon Kaslander, Katherine Nguyen, Daniel Chu, and the rest of
the TIME Study Team for all their hard work.
I am indebted to all the previous researchers who have been curious about the quality
and quantity of EMA data collected and who inspired this work.
I am grateful to Trevor Pickering for being my unofficial committee member who was
willing to listen and provide advice on everything from STATA error messages to my new
orders at Starbucks. Daniel Chu was always there to ask how writing was coming along and
then accompany me to a novel experience.
Bridgette Do and Christine Naya were the best peer mentors, collaborators, and friends
a graduate student could ask for. I am blessed to have had so much support from the REACH
Lab in general and my cohort in HBR (Cynthia Begay, Bridgette Do, Chris Rodgers, Ian Kim,
Jesse Goldshear, Sheila Pakdaman, and Sydney Miller).
It was a pleasure to work with and be supported by Naomi Rodgers, Julia Stal, Jose
Scott, and Samuel Garza through Graduate Student Government. Reviewing poorly completed
finance applications was the perfect supplement to working on my dissertation.
I am grateful to Kyle Nixon, Kelcie Kadowaki, and Nora Dixon for helping me work
through all the barriers that I had to persevere through during my doctoral degree.
Margaret DeZelar was there for me in 2017 when I was still contemplating if I would
like to live in Los Angeles, and her loyalty to my well-being over the past six years was never
time varying.
Moni was the best companion who was sometimes overly attentive to me while I was
writing this dissertation.
Finally, I would have never reached this point without the devotion of my family
(Hongxin Dong, Edward Wang, Becky Wang). My mom Hongxin always keeps me motivated
to strive for excellence.
This work was funded by NIH/NHLBI U01HL146327 and NSF DDRIG Award
#2150617.
Table of Contents

ACKNOWLEDGEMENTS  ii
LIST OF TABLES  vi
LIST OF FIGURES  vii
ABSTRACT  viii
INTRODUCTION  1
EMERGING ADULTHOOD IS A PERIOD OF RISK FOR UNHEALTHY BEHAVIORS.  2
ASSESSMENTS OF HEALTH BEHAVIORS AND THEIR DETERMINANTS RELY ON SELF-REPORTS.  3
ECOLOGICAL MOMENTARY ASSESSMENT  4
ADVANTAGES OF EMA  6
THE GROWING POPULARITY OF EMA  6
PARTICIPANT ENGAGEMENT IS A MAJOR CHALLENGE OF EMA  7
PSYCHOLOGICAL PROCESS UNDERLYING SURVEY RESPONSE  10
IDENTIFYING INATTENTIVE RESPONDING  11
EFFECTS OF INATTENTIVE RESPONDING ON DATA INFERENCES  12
INATTENTIVE RESPONDING AND EMA  13
AIMS OF THE PROPOSED STUDIES  15
Study 1  16
Study 2  16
Study 3  17
STUDY 1: IDENTIFYING AND APPLYING INATTENTIVE DETECTION INDICES TO ECOLOGICAL MOMENTARY ASSESSMENT DATA  18
ABSTRACT  18
INTRODUCTION  19
Potential Detection Indices  20
METHODS  25
Participants  25
Procedures  26
Measures  27
Data Analysis  32
RESULTS  33
Antonym pair selection  34
Descriptive Statistics for the IR Indices  36
The predictive power of the IR detection indices and revision of cut scores  38
Predictive model building  39
DISCUSSION  49
Application of Attention Check Questions  51
Implications and Future Directions  53
Conclusion  55
STUDY 2: MODELING PERSON-LEVEL AND CONTEXTUAL FACTORS ASSOCIATED WITH INATTENTIVE RESPONDING IN AN ECOLOGICAL MOMENTARY ASSESSMENT STUDY  56
ABSTRACT  56
INTRODUCTION  57
METHODS  60
Participants and procedures  60
Measures  61
Data Analysis  65
RESULTS  66
DISCUSSION  74
STUDY 3: BURDEN AND INATTENTIVE RESPONDING IN INTENSIVE LONGITUDINAL STUDIES: A QUALITATIVE ANALYSIS  78
ABSTRACT  78
INTRODUCTION  79
METHODS  81
Design  81
Interview procedure  81
Data Analysis  83
RESULTS  84
DISCUSSION  94
CONCLUSION AND IMPLICATIONS  99
Contributions to the field  100
Potential recommendations for the field  101
Final thoughts  103
REFERENCES  104
APPENDIX A: ATTENTION CHECK QUESTIONS  119
LIST OF TABLES

TABLE 1. ANALYZED EMA ITEMS FROM THE TIME STUDY  28
TABLE 2. BETWEEN SUBJECT CORRELATIONS BETWEEN POTENTIAL ANTONYM PAIRS  34
TABLE 3. DESCRIPTIVE STATISTICS RESULTS OF TESTS COMPARING IR DETECTION TECHNIQUES BY RESPONDING TO THE ACQ CORRECTLY VS INCORRECTLY  42
TABLE 4. BIVARIATE CORRELATIONS AMONG IR DETECTION INDICES  43
TABLE 5. AUCS OF THE INATTENTIVE RESPONDING DETECTION INDICES BASED ON EXISTING CUT SCORES GENERATED FROM ROCS  44
TABLE 6. DATA-DRIVEN CUT SCORES OF POTENTIAL IR DETECTION INDICES AND LEVELS OF SPECIFICITY AND SENSITIVITY AND AUC  45
TABLE 7. UNIVARIATE LOGISTIC REGRESSION MODELS OF POTENTIAL DETECTION INDICES PREDICTING CORRECT ATTENTION CHECK QUESTION RESPONSE (IN TRAINING DATASET)  46
TABLE 8. MULTIVARIATE LOGISTIC REGRESSION MODELS OF POTENTIAL DETECTION INDICES PREDICTING CORRECT ATTENTION CHECK QUESTION RESPONSE (IN TRAINING DATASET COMPARED TO THE FULL DATASET)  47
TABLE 9. CLASSIFICATION TABLE COMPARING OBSERVED VS CLASSIFIED CORRECT SURVEYS BASED ON MULTIVARIATE LOGISTIC REGRESSION  48
TABLE 10. TIME STUDY PARTICIPANT CHARACTERISTICS  70
TABLE 11. DESCRIPTIVE STATISTICS FOR ADDITIONAL PREDICTORS  71
TABLE 12. PERSON-LEVEL PREDICTORS OF INATTENTIVE RESPONDING BEHAVIOR FOR EMA TESTED THROUGH SEPARATE UNIVARIATE LINEAR REGRESSIONS (N=290)  72
TABLE 13. CONTEXTUAL PREDICTORS OF ATTENTIVE EMA SURVEY CATEGORIZATION TESTED THROUGH MULTILEVEL LOGISTIC REGRESSIONS (N=183,026 SURVEYS; N=290 PARTICIPANTS)  73
TABLE 14. INTERVIEW GUIDE  82
TABLE 15. DEMOGRAPHICS FOR QUALITATIVE INTERVIEW PARTICIPANTS  84
LIST OF FIGURES

FIGURE 1. THE PSYCHOLOGICAL PROCESS UNDERLYING SURVEY RESPONSE PROPOSED BY TOURANGEAU ET AL 2000  11
FIGURE 2. EXAMPLE OF THE PRESENTATION OF EMA ITEMS ON THE PHONE SCREEN  29
FIGURE 3. DISTRIBUTION OF ACQ CORRECTNESS ACROSS PARTICIPANTS (N=198)  34
FIGURE 4. DISTRIBUTION OF RESPONSES TO TENSE AND RELAXED EMA RESPONSES  35
FIGURE 5. DISTRIBUTIONS OF POTENTIAL INDICES OF INATTENTIVE RESPONDING  37
FIGURE 6. DISTRIBUTION OF MEASURES OF ATTENTIVENESS OF SURVEY SURVEYS BY PARTICIPANT  67
ABSTRACT
This dissertation was methodological in nature and consisted of three interrelated
studies that aimed to address the prevalence and predictors of inattentive responding (IR) in
an intensive ecological momentary assessment (EMA) study. The studies established models
of how, when, and why inattentive responding is likely to occur and provide insights into
preventive strategies to increase the validity of the self-report measure. The aims were to: 1a)
Apply various IR detection indices used with retrospective cross-sectional self-report methods
to EMA methods and compare accuracy across those indices. 1b) Develop a predictive model
for IR using the indices to estimate the prevalence of IR in EMA. 2) Investigate person-level
(e.g., demographics, personality, approach motivation, perceptions of burden) and survey-
level (e.g., study day, time of day, location, activity level) predictors of IR. 3) Collect
qualitative data from participants from the EMA study to understand the process of IR and
factors leading to IR to EMA surveys.
Study 1 found a low prevalence of IR in the EMA study data (3%). Inattentive
response detection indices from cross-sectional survey research did not translate well to EMA
data and revised cut-off scores were suggested. The best model to detect IR in EMA
combined between-subject response variability as well as between-subject and within-subject
total response times as predictors. Study 2 revealed that sex at birth and reward
responsiveness were the most important person-level factors related to attentiveness. Contrary
to existing literature, other demographic factors and personality showed no significant
relationships to participant attentiveness to EMA surveys. Of the contextual factors, surveys
delivered on the weekend, in earlier weeks of the study, and at home had the highest likelihood
of being answered attentively. The qualitative data in Study 3 provides preliminary evidence that social
context may be an under-considered factor underlying decreased data quality and the
acceptability of attention check questions. These findings also provide potential explanations
for observed decreases in response variability over time and could guide future research to
improve participant engagement.
As EMA study designs become more sophisticated, the measurement of factors that
contribute to declines in response accuracy will grow in importance. Overall, results from this
dissertation suggest that inattentive responding may be more related to study-level factors
than person-level factors. Results provide insights into factors that may improve the reliability
and validity of EMA data, as well as considerations for researchers when designing studies to
assess and maximize data quality. Given the low prevalence of IR found, this dissertation also
helps reduce concern about IR as a limitation of real-time longitudinal data capture methods
such as ecological momentary assessment for the assessment of health behavior.
INTRODUCTION
Chronic diseases such as heart disease, cancer, and diabetes have been identified as the
leading cause of death in the United States (US) (National Center for Chronic Disease Prevention
and Health Promotion, n.d.). The Centers for Disease Control and Prevention (CDC) reports that
60% of US adults have at least one of these chronic diseases, and 40% have two or more. While
there exist risk factors that are beyond one’s control, such as family history (Banerjee, 2012),
race/ethnicity (Kurian & Cardarelli, 2007), and age (Kannel & Vasan, 2009; Winkleby &
Cubbin, 2004), many of these chronic diseases can be attributed to health behaviors. Thus,
promoting healthy lifestyle change is a critical modifiable component of chronic disease
prevention (Halpin et al., 2010; H. Schmidt, 2016).
Examples of modifiable health behaviors underlying chronic diseases are lack of physical
activity, tobacco use, and poor diet (Börnhorst et al., 2020; Leventhal, 1973). Adults who are
physically inactive, defined as performing less than 150 minutes of moderate-to-vigorous
intensity physical activity per week, have a 20-30% increased risk of all-cause mortality
(Saint-Maurice et al., n.d.). Engaging in physical activity is beneficial for weight regulation, blood
pressure, glucose levels, inflammation, and the health of blood vessels, among other factors (Shiroma &
Lee, 2010). Independent of physical activity levels, sedentary behaviors such as screen time are
also associated with excess weight (Friedenreich et al., 2021). In the US, cigarette smoking is the
leading cause of preventable disease, disability, and death (Islami et al., 2015). In addition to
distal health outcomes, tobacco use can also affect metabolic risk factors such as body weight,
body fat distribution, and insulin resistance, which then increases the risk of cardiovascular
disease (Chiolero et al., 2008). Excess consumption of energy-dense foods, such as processed
foods high in sugar and fat, promotes obesity and reduces cardiovascular health (Jeon et al.,
2011). Making healthy choices can reduce the likelihood of developing chronic disease and
improve quality of life.
Emerging adulthood is a period of risk for unhealthy behaviors.
Many unhealthy habits such as physical inactivity that predispose individuals to chronic
disease manifest during emerging adulthood (18-29 years old) (Sofija et al., 2020). While 50% of
US children have ideal levels of 5 of the 7 key indicators of cardiovascular health, the proportion
falls to 15% of emerging adults (Gooding et al., 2016). This life period is marked by important
life transitions such as moving away from home and beginning full-time employment that result
in increased autonomy for health behavior decision-making (Arnett, 2000). Emerging adults may
be at greater risk for developing unhealthy behaviors due to less parental monitoring, growing
independence, and financial instability (Schwartz & Petrova, 2019). Many researchers target this
period as an opportunity to integrate health promotion and maintenance activities to prevent
long-term risk behaviors that could increase the risk of chronic diseases later in life (Helgeson et
al., 2014; Schwartz & Petrova, 2019). About 40% of emerging adults report meeting
recommended moderate-to-vigorous physical activity levels (Fuller et al., 2015). Approximately
99% of adult smokers begin using tobacco by age 26 (Wang et al., 2018) and those who continue
smoking in emerging adulthood are at the greatest risk of being regular smokers in later
adulthood (McCarron et al., 2000). Among all adult age groups, emerging adults are the highest
consumers of fast food and sugar-sweetened beverages and the lowest consumers of fruits and
vegetables (Allman-Farinelli et al., 2016). Thus, it is important to comprehensively assess the
adoption and maintenance of health behaviors in emerging adults.
Assessments of health behaviors and their determinants rely on self-reports.
Much of our knowledge about engagement in health behaviors is based on data collected
by self-report. While some health behaviors such as physical activity can be objectively assessed
with devices like accelerometers (Kohl et al., 2000), capturing behavioral determinants and
consequences that cannot be observed directly such as internal psychological phenomena
including emotional states, beliefs, opinions, and attitudes associated with health behaviors
requires self-report data collection methods such as surveys or questionnaires. Additionally,
many aspects of physical activity (i.e., type performed, purpose, and location) may not be
measured with accelerometers alone (Dunton et al., 2012). Self-report is one of the most widely
used methods for the collection of health behavior data due to its low cost and ease of
implementation. Surveillance and intervention studies use self-report to gain knowledge on the
prevalence of risk factors in a population (e.g., number of cigarettes smoked or minutes of
physical activity) or performance of health behaviors (e.g., COVID-19 vaccinations or wearing
sunscreen) to assess the efficacy of health promotion interventions (Newell et al., 1999; Stone et
al., 1999).
However, there have been methodological issues identified related to the accuracy and
utility of cross-sectional self-report methods such as errors and biases in recall and the use of
heuristics in response patterns (Smyth & Heron, 2012). For physical activity, in particular, it may
be difficult for participants to recall bouts of activity performed over an extended period of time
(i.e., over the last week) especially if the activity was not salient (e.g., walking for transportation)
(Smyth & Heron, 2012). Additionally, the benefits of physical activity performed depend on its
duration and intensity, which a participant may not remember (Sallis et al., 1993). To address
these challenges and advance the field, efforts have been made to develop methods of real-time
assessment that allow for more frequently assessed self-report in natural settings (Raphael, 1987;
Shiffman et al., 2008).
Ecological momentary assessment
Ecological momentary assessment (EMA) is a real-time, real-world sampling strategy
that researchers can use to collect repeated measures of self-report data such as current
experiences, behaviors, contexts, moods, and psychosocial factors relevant to the research
question. As EMA surveys participants frequently (up to multiple times a day) with short surveys
that do not require participants to recall extended periods of time, the method reduces recall bias
(Shiffman et al., 2008). The use of temporally dense assessment methods began in the 1980s
when Csikszentmihalyi and colleagues introduced experience sampling methods
(Csikszentmihalyi, 1992). Stone and Shiffman presented the technique to the field of behavioral
medicine in 1994 as ecological momentary assessment (Stone & Shiffman, 1994). While many
terms are used to describe experience sampling methods, this dissertation will use the term EMA
as it is the most commonly used term when the method is applied for assessing health behavior
related outcomes. The first EMA studies used pen and paper diaries in combination with beeper
devices that notified participants when it was time to take a survey (Yearick, 2017). However,
now due to advancements in mobile technology and the prevalence of ownership of smartphones,
EMA studies can be easily implemented on mobile devices, which improves researchers’ ability
to access individuals in their natural environment (Cartwright, 2016; Raento et al., 2009).
Moreover, data collected electronically no longer requires manual entry by researchers which
further reduces error. A unique advantage of modern EMA on mobile devices is the collection of
metadata such as timestamps indicating when the survey was delivered and when participants
began and completed the surveys. This aids in assessing compliance and avoids issues of
participants backfilling paper and pencil diaries. Contemporary EMA studies send participants
multiple surveys per day on a personal mobile device, delivered either at random times throughout the
day or on fixed schedules (e.g., every 2 hours). When these surveys are received, participants are
asked to report their current or recent thoughts, behaviors, and feelings (Nam et al., 2021). EMA
has even further advanced from signal-contingent (i.e. random) surveying, and applications for
EMA can now be linked or paired with sensors such as accelerometers and global positioning
systems (GPS) to trigger surveys based on factors other than time (described as context-sensitive
EMA) (Intille et al., 2007).
EMA now serves as a foundation for assessing time-varying or context-sensitive
variables (Shiffman et al., 2008). It is also well suited for investigating discrete health behaviors
that occur many times throughout the day such as smoking or sedentary behavior, which often
can be difficult to remember and report through cross-sectional self-report methods. A wide
variety of health behaviors have now been studied with EMA such as smoking (Shiffman &
Waters, 2004), pain (Stone et al., 2003), substance use (Jones et al., 2019), physical activity
(Dunton et al., 2012; Liao et al., 2015), and eating behaviors (McKee et al., 2014; Schembre et
al., 2018; Zink et al., 2018) across unique populations such as children and adolescents, young
adults, and older adults.
EMA is particularly promising for research in emerging adults. Researchers point to
higher comfort with technology and lower privacy concerns associated with this age range in
increasing the potential acceptability of frequent digital or mobile surveys (Pew Research Center,
2017). A paper assessing the feasibility of EMA in emerging adults found that almost all
participants (95.9%) reported that they would be willing to accept texts on their smartphones to
answer questions about their mood, surroundings, or feelings (Duncan et al., 2017). Thus,
emerging adults are a common population for EMA research.
Advantages of EMA
EMA overcomes some of the limitations of retrospective cross-sectional surveys to
improve the validity of the data. Beyond reducing recall bias, a major benefit of using EMA is
that through the frequent assessments of ongoing contexts (i.e., changes in the social context,
locations, or feeling states throughout the day), EMA permits a better understanding of the
antecedents of, concomitants to, and consequences of health behaviors (e.g., negative affect and
decreased physical activity or stress and relapse in smoking). Traditional paper and pencil or
online cross-sectional studies may only sample participants once. While some longitudinal
studies may sample participants repeatedly, it is often at infrequent intervals (e.g., three months
apart, to assess changes in constructs at the beginning and end of the study), resulting in missed
variations within individuals during the study. For researchers and healthcare providers to be
able to understand and change health behaviors, it is important that we understand processes
across groups of individuals (between-person), but also that interventions can be applicable to
momentary states (within-person) (Johnston & Johnston, 2013).
The growing popularity of EMA
The number of journal articles that use ecological momentary assessment in PubMed
increased from 186 published in 2010 to 2899 published in 2021, and trends suggest that the
application of the method is expected to further increase over time. Moreover, these results do
not account for alternate names for similar assessment methods such as experience sampling
method, daily diary, or ambulatory assessment. The use of ecological momentary assessment has
“opened the black-box of daily life” (Myin-Germeys et al., 2009) and uncovered previously
unstudied dynamic processes. EMA offers opportunities for the growing focus on precision
medicine as individualized approaches to the detection and treatment of disease require the
collection of a large amount of data per participant (J. Kim et al., 2019).
Commercial EMA software has been developed to resolve many of the previous technical
and logistic challenges hindering EMA studies. However, these commercial EMA platforms do
not deliver the methodological training or knowledge required by researchers to design and
conduct a productive EMA study. Experienced EMA researchers still struggle with how to select
EMA items, capture the daily experiences of participants without interfering with their lifestyles,
as well as how to sample moments across highly variable personal schedules (Pejovic et al.,
2015). Without guidelines, researchers who are more unfamiliar with EMA methods may fail to
make appropriate design considerations and push the limits in terms of the number of items
assessed, frequency of surveys, and length of monitoring periods leading to increased participant
burden, decreased data quality, and produce flawed findings (Doherty et al., 2020; Eisele, 2021).
These issues could be alleviated with further methodological research on the effects of EMA
study design with findings translated into clear recommendations.
Participant engagement is a major challenge of EMA
Despite its numerous benefits, EMA presents new challenges to data collection.
Sustaining participant engagement in studies that use repeated measures can be difficult for
researchers. Engagement is defined as the behavioral, cognitive, and affective manifestation of
energy and interest toward a focal task or stimulus (Maslach & Leiter, 2008). The intensive
nature of frequent data collection study protocols may decrease participant engagement due to
the physical and mental burden it places on respondents (Smyth et al., 2021). To alleviate this
challenge, solutions to support participants contributing meaningful and high-quality data have
been evaluated (Hufford & Shiffman, 2003; Scollon et al., 2003). Researchers commonly
evaluate engagement with the EMA protocol through retrospective assessments of participants’
experiences in the study. Behavioral and affective aspects of participant engagement in EMA
have been assessed through compliance (e.g., performing the task of answering surveys) and
acceptability (e.g., assessing if participants enjoy the tasks), but there is limited research on
cognitive engagement with EMA (e.g., do participants spend time thinking about the tasks; do
participants engage in cognitive processing of the questions) (Dao et al., 2021). Cognitive
measures of engagement could include self-reports of burden and fatigue, perception of
interruption, differential reactions to surveys or questions within a survey, and time spent on the
EMA surveys (Nahum-Shani et al., 2022). It has been proposed that participants may have less
cognitive engagement in observational EMA data collection compared to intervention-related
EMA data collection due to a lack of direct benefit from the data monitoring (McGonagle, 2015).
As a relatively novel approach, the cognitive challenges that are associated with EMA as a
methodological technique must be further addressed.
To estimate the impact of cognitive burden in their EMA studies, researchers have
previously evaluated the quantity of data collected by examining compliance (e.g., the proportion
of questionnaires completed within the allotted time) or participant retention (i.e., drop-out rates)
(Christensen et al., 2003; Hsieh et al., 2008; Larson & Csikszentmihalyi, 2014; Zhang et al.,
2016). Conducting compliance and missing data analyses is critical to understanding the
representativeness of the data collected and ensuring there is no sampling bias. Previous studies
have identified predictors of EMA response noncompliance at the momentary, person, and study
levels, suggesting that noncompliance in EMA studies is often systematic. For instance, male and
younger participants have repeatedly shown lower compliance rates (Ono et al., 2019; Rintala et
al., 2020). In a systematic review of EMA studies in adolescents, Wen et al. found that, when
evaluating the effects of study duration, EMA compliance was similar for studies longer than
three weeks and studies shorter than one week (Wen et al., 2017). As a result, many EMA
studies make design choices or oversample participants to try to mitigate non-compliance factors
and achieve sufficient power (van Berkel et al., 2018).
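To make these quantity-focused checks concrete, the following is a minimal sketch of how compliance and retention could be computed from a prompt-level log. The file name, the column names (participant_id, study_day, completed), and the pandas-based approach are illustrative assumptions, not the TIME Study's actual pipeline.

import pandas as pd

# Hypothetical prompt-level log: one row per delivered EMA survey prompt.
prompts = pd.read_csv("ema_prompts.csv")  # assumed columns: participant_id, study_day, completed

# Compliance: proportion of delivered prompts completed, per participant.
compliance = prompts.groupby("participant_id")["completed"].mean()

# Retention (one common operationalization): proportion of participants who
# still completed at least one prompt during the final week of the protocol.
last_week = prompts["study_day"] > prompts["study_day"].max() - 7
retained = prompts.loc[last_week].groupby("participant_id")["completed"].any()
retention_rate = retained.reindex(compliance.index, fill_value=False).mean()

print(compliance.describe())
print(f"Retention through the final week: {retention_rate:.1%}")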
Many strategies have been implemented by researchers to counteract the burden
associated with EMA protocols. Currently, incentives are the most commonly used practice
(Farzan et al., 2008; van Berkel et al., 2018). Most EMA studies offer monetary compensation
for completing all days of data collection or meeting a response rate threshold to reduce
attrition and increase compliance rates (Church, 1993; Musthag et al., 2011). While it is believed that
compensation is a key factor in compliance with EMA, there is yet to be a sequential randomized
controlled trial to assess the effect of compensation. Research has also suggested strategies to
make data collection more intrinsically rewarding for participants through the return of the
collected self-report data and building rapport with research staff (Hsieh et al., 2008; Land-
Zandstra et al., 2016; Larson & Csikszentmihalyi, 2014). In longer studies, as questions may
become repetitive, researchers could consider modifying the format, presentation, or order of
questions to increase participant engagement. Techniques from the field of user experience could
also be applied to the design of EMA applications to improve participants’ satisfaction through
utility, ease of use, and pleasure provided in the interactions with the application. Additional
research is needed to systematically test the effect of these non-monetary solutions.
While burden is frequently assessed in exit interviews, it has not been investigated as a
variable that changes throughout the course of the study. If data analysis indicates that
respondents show an increased tendency for response inconsistency temporally (end of day or
end of the study), this pattern suggests that data quality may be compromised depending on the
length of a measurement period. Without considering how burden may change over time or be
affected by participants’ context, it is unclear whether protocols are adversely impacting
participants’ well-being and response quality even without a reduction in compliance. The
common practice of using monetary incentives as bonuses for high compliance may have
negative effects on EMA data quality as participants are more motivated to find ways to hit the
thresholds for compensation. Possibly because of strategies implemented to increase engagement
such as micro-incentives or rapport with study staff, participants may choose to continue to
comply with completing surveys, despite finding them burdensome. However, to ease burden,
participants might engage in inattentive responding, a form of cognitive non-engagement in
which participants respond to items with “low or little motivation to comply with survey
instructions, correctly interpret item content, and provide accurate responses" (Huang et al.,
2012; Johnson, 2005).
Psychological process underlying survey response
The psychological process underlying survey response is theorized by Tourangeau et al. to
consist of four major steps (presented in Figure 1 below) (Tourangeau et al., 2000). After a
survey item is presented, a participant undergoes 1) Comprehension: the question and
instructions are processed through reading and interpretation, 2) Retrieval: relevant information
for a response to the question is recalled, 3) Judgment: retrieved information is processed and
integrated into the context of the question, and 4) Response: the resulting judgment is mapped
onto the response options.
Figure 1. The psychological process underlying survey response proposed by Tourangeau et al. (2000)
Using this framework, it can be hypothesized that when inattentive responding occurs,
the respondent has enough motivation to respond to the survey item presented but not enough
motivation to carefully indicate a response. To reduce effort, they may skip directly from seeing
the survey item presented to indicating a response without fully attending to the middle survey
response steps. For example, this may occur when a participant fails to comprehend the question
because they do not read or fail to retrieve the relevant information (Steps 1 and 2).
Alternatively, information may be retrieved that is inadequate to form the required judgment
(Step 2). Participants may also skip or only superficially integrate information to make a
judgment (Step 3). Or, inattentive participants may fail to select the appropriate response option,
due to not reading or superficially processing them (Step 4).
Identifying inattentive responding
The accuracy of self-report data has been a long-standing concern and identified as a
source of error (Pinneau & Milton, 1958; Walsh, 1967). While there are methods to ensure that
response inaccuracy is not due to misunderstanding (e.g., screening out participants who do not
have strong language comprehension or cognitive difficulties that would influence participation),
addressing inaccuracy due to cognitive non-engagement and low motivation has been more
challenging for researchers. This type of response pattern has been described with several terms
including careless responding (Ward & Meade, 2018), insufficient effort responding (Bowling et
al., 2016; McGonagle, 2015), or inattentive responding (Fleischer et al., 2015). In this
dissertation, the term inattentive responding (IR) will be used instead of careless responding or
other terms when describing responses made with low motivation or concentration to not imply
participants are irresponsible. There have been various factors hypothesized to contribute to IR
patterns such as negative motivations, frustration, fatigue, and a desire to finish the survey
quickly that could be associated with the survey (e.g., questionnaire length), the person (e.g.,
participant interest in the subject, social norms, personality), and contextual factors (i.e.,
distractions) (Curran, 2016; Huang et al., 2012). When a participant becomes disengaged from a
survey, there are a variety of ways they could take shortcuts to reduce effort. Studies using cross-
sectional self-report methods have applied classification methods and identified two types of IR
patterns (Maniaci & Rogge, 2014; Meade & Craig, 2012). The first pattern is participants who
provide the same response to a string of adjacent questions to reduce effort. The second pattern is
characterized by varied or random responses that show inconsistencies. Thus, various indices
have been designed to detect IR within cross-sectional self-report methods (Dunn et al., 2018).
Approaches to identifying IR can be classified as proactive or reactive (Meade & Craig, 2012).
Proactive approaches involve the inclusion of items or scales within a questionnaire to “catch”
individuals. Reactive approaches involve post hoc analyses of responses and metadata (Dunn et
al., 2018; Meade & Craig, 2012). These approaches will be elaborated upon in the introduction
section of Study 1.
Effects of inattentive responding on data inferences
Researchers are concerned about the prevalence of IR in EMA studies because it leads to
inaccurate data (Maniaci & Rogge, 2014). The presence of IR in data collection threatens the
reliability and validity of self-report data and can lead to measurement errors. Previous work
in emerging adult samples estimates that the median IR rate in cross-sectional self-report methods is
10-12%, with some estimates of 50-72% on single items (Maniaci & Rogge, 2014; Meade &
Craig, 2012; Oppenheimer et al., 2009). While statistical analysis methods such as multilevel
modeling may be robust to missing data, they are not robust to false or deceitful data. Maniaci
and Rogge found that online MTurk respondents that engaged in IR had data with different
psychometric and statistical characteristics (Maniaci & Rogge, 2014). In particular, Cronbach
alpha coefficients ranged from 0.80-0.90 for their variables of interest among attentive
respondents but 0.40-0.60 for respondents suspected of IR. Likewise, the variables of interest
better predicted an outcome when the dataset only included a subsample of attentive responders.
Simulation studies have suggested that a small proportion of inattentive responders (5-10%) is
enough to alter study results and lead to different conclusions regarding hypotheses (Credé,
2010). While random responding adds noise to the dataset, inattentive responses introduce
systematic bias (Schroeders et al., 2021). These error variances reduce reliability estimates and
measurement precision and attenuate or inflate correlations (Hong et al., 2020; Huang et al.,
2015; McGrath et al., 2010). Published studies based on analysis of these “uncleaned” data sets
may present misleading and unrepresentative conclusions.
Inattentive responding and EMA
EMA study designs may be prone to IR. Various causes of IR in cross-sectional surveys
include burden related to survey length or design, lack of research-participant social contact, and
environmental distractions (Meade & Craig, 2012), which are all relevant to EMA methods. The
ease of use and familiarity with smartphones in emerging adults may also result in additional
distractions competing for users’ attention because 28% of the population report using their
smartphone as their only tool for internet access (Andrew Perrin, 2021).
Moreover, there are additional opportunities in EMA research to build rapport during the
onboarding process and maintain a connection with participants through check-ins throughout
the protocol which increases motivation to complete surveys (Csikszentmihalyi, 2014). In
repeated measures designs such as EMA, it is likely that identifying subject-level poor response
style is not deterministic, and an item answered inaccurately at one survey or occasion may be
answered more carefully and accurately at another survey. Because EMA involves repeated
measures in free-living situations that introduce recurring contextual, within-person, and
temporal influences on data validity, research must be conducted to examine IR in EMA.
Because EMA data are often used to interpret within-person effects, inattentive responses
misattributed as individual differences could introduce error variances to results (Beal & Weiss,
2003).
Researchers can take steps to evaluate data quality in EMA by translating IR detection
techniques from cross-sectional self-report methods to the domain of EMA. Beyond detection,
there is a need for EMA researchers to examine IR in a person-by-situation context to examine if
certain people tend to respond inattentively in certain situations. While EMA studies have
assessed how factors such as time (e.g., evening vs. morning) affect the likelihood of completing
EMA surveys (Ponnada et al., 2022), less is known about how contextual or intrapersonal
variables affect response patterns across completed surveys. It is assumed that participants who
are motivated only by the compensation are likely to be more vulnerable to IR compared to
participants who have more intrinsic motivation for participating (e.g., benefit science, learn
more about themselves) (Litman et al., 2015). Unknown is whether the likelihood of IR to a
given EMA survey is distinguished by aspects of the EMA sampling process such as time of day
or day in study. For example, participants who respond to a survey after reminders may be more
likely to answer inattentively than those who answer immediately. Moreover, it is well known
from studies in cognitive psychology that cognitive skills such as attention and working memory
fluctuate over time and across contexts (Cowan, 2012; van Berkel et al., 2020). Attention is
lower during early morning and evening, and distractions and constrained mental states (e.g.,
stress) increase inattention (Beilock & DeCaro, 2007; Cowan, 2012; C. Schmidt et al., 2007).
However, EMA surveying often occurs on random schedules across the day, and common data-
cleaning practices consider all data to be accurate and equal without accounting for these
contextual or cognitive factors. Dzubur and colleagues found that EMA survey item response
variance and survey completion times decreased through a two-week measurement period and
were positively correlated, suggesting that participants may be more likely to select repetitive
responses over time (Dzubur, 2017). Thus, data may be most valid at the onset of the study when
participants are less fatigued and the degree to which data can be trusted to be representative of
an individual’s daily life may vary over time.
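As an illustration of how the pattern reported by Dzubur and colleagues could be checked in a new dataset, the sketch below tracks mean within-survey response variability and completion time across study days. The file and column names (study_day, irv, response_time_s) are assumptions for illustration.

import pandas as pd

# Hypothetical survey-level records: one row per completed EMA survey with the
# study day, within-survey response SD (irv), and completion time in seconds.
surveys = pd.read_csv("ema_survey_level.csv")

daily = surveys.groupby("study_day")[["irv", "response_time_s"]].mean()

# Downward trends in both columns across study days would mirror the
# increasingly repetitive and faster responding described by Dzubur (2017).
print(daily.head())
print(daily.corr())  # association between the two daily trajectories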
Aims of the proposed studies
As the use of EMA for data collection to understand health behaviors increases, so does
the importance of establishing definitions around the quality of EMA responses. To fill this gap
in knowledge, I worked with a team to develop and test novel and practical strategies to detect
IR in longitudinal data collection methods. I leveraged data drawn from an EMA study that used
real-time mobile technologies to collect self-report data in a diverse population of emerging
adults (N=332) across a 12-month period.
This work had four aims across three studies:
Study 1
AIM 1A: Apply various IR detection indices used with cross-sectional self-report methods to
EMA methods and compare accuracy across those indices.
H1: Short response times will be more accurate in detecting inattentive EMA surveys
compared to longer response times.
H2: Invariant response indices (e.g., longstring and response variability) will be more
accurate at detecting IR than variant response indices (e.g., Synonym and antonym
index).
These hypotheses are consistent with the findings from Jaso, Kraus, and Heller (2021).
AIM 1B: Develop a predictive model for IR using the indices to estimate the prevalence of IR in
EMA and whether overall IR prevalence is higher for EMA than for cross-sectional self-report
surveys for each participant.
H3: The prevalence of IR in EMA will be higher than the reported prevalence of IR in
cross-sectional self-report surveys (>10%).
Study 2
AIM 2: Investigate person-level (e.g., demographics, personality, approach motivation,
perceptions of burden) and survey-level (e.g., study day, time of day, location, activity level)
predictors of IR.
This aim is exploratory and does not test a hypothesis.
Study 3
AIM 3: To collect qualitative data from completers of the EMA study to understand the process
of IR and factors leading to IR.
This aim is exploratory. The guiding research questions are to understand:
1) Why did participants join the TIME study and what was their motivation to continue?
2) What factors affected participants’ response accuracy? What contextual or time-
varying factors were identified by participants as influential?
3) What were the participants' perceptions of the attention check questions?
Overall, I will explore IR as a unique potential limitation of real-time longitudinal data capture
methods like ecological momentary assessment as a tool for the assessment of health behaviors.
STUDY 1: IDENTIFYING AND APPLYING INATTENTIVE DETECTION INDICES
TO ECOLOGICAL MOMENTARY ASSESSMENT DATA
Abstract
Ecological Momentary Assessment (EMA) could present novel data collection challenges
due to the repeated assessment nature of the methodology. Participants are often required to
complete surveys multiple times a day across several days, which may lead to reduced
engagement and inattentive responding (IR) as participants attempt to reduce burden. We aim to explore the prevalence
of IR in EMA studies, compare the accuracy of established IR detection indices from cross-
sectional surveys, and develop a prediction model to estimate the prevalence of IR in EMA data.
Secondary data analyses were conducted using data from the Temporal Influences on Movement
and Exercise (TIME) Study. Attention check questions (ACQs) were included in 20% of burst
EMA surveys to assess data quality, and the correctness of ACQ responses served as an objective
measure of IR. Results from the analysis of 20,439 burst EMA surveys indicated an overall low
prevalence of incorrect ACQ responses (1.33%). T-tests revealed significant differences in mean
scores of IR detection indices between surveys with correct and incorrect ACQ responses.
Furthermore, applying established cutoffs from cross-sectional surveys for IR detection yielded
poor accuracy in detecting IR in EMA data and revised cutoffs were proposed. A prediction
model combining IR detection indices was developed that included between-subject response
variability as well as between-subject and within-subject total response times as predictors. The
model demonstrated good discrimination (87%) in predicting the probability of correct ACQ
responses. This study contributes to researchers’ understanding of IR and findings highlight the
importance of considering response variability and total response time as indicators of IR in
EMA data. The prediction model provides a useful tool for future researchers to screen and
identify surveys with IR, enhancing data quality in EMA studies.
Introduction
The prevalence of personal smartphone ownership has increased the popularity of using
Ecological Momentary Assessment (EMA) in health behavior research (Duncan et al., 2019; S.
S. Intille et al., 2007; Raento et al., 2009). Ecological momentary assessment is a real-time, real-
world sampling strategy that researchers can use to collect repeated measures of self-report data
such as current experiences, behaviors, contexts, moods, and psychosocial factors (Shiffman et
al., 2008). The methodology is advantageous as it overcomes some of the limitations of
retrospective cross-sectional surveys to improve the validity of the data. As EMA surveys
participants frequently (up to multiple times a day) with short surveys that do not require
participants to recall extended periods of time, the method reduces recall bias. Additionally,
through the frequent assessments of ongoing contexts (i.e., changes in social context, physical
locations, or feeling states throughout the day), EMA provides a greater understanding of the
antecedents of, co-occurrences to, and consequences of health behaviors (e.g., negative affect
and decreased physical activity or stress and relapse in smoking) (Nam et al., 2021).
Nevertheless, despite its numerous benefits, EMA presents new challenges to data
collection. Sustaining participant engagement in studies that use repeated measures can be
difficult for researchers as participants are typically asked to complete surveys administered
several times during the day across multiple days (Scollon et al., 2003). The intensive nature of
these frequent data collection study protocols may place additional physical and mental burdens
on participants and require making a long-term commitment to the inconvenience of responding
to surveys in multiple contexts (Smyth et al., 2021). It has been proposed that participants may
engage in subtle behaviors to reduce these burdens, such as responding without paying sufficient
attention to the questions (Eisele, 2021). Careless responding occurs when participants respond
to items with “low or little motivation to comply with survey instructions, correctly interpret
item content, and provide accurate responses.” This behavior adds noise to the data and may lead
to illegitimate correlations between variables (Huang et al., 2012). In this paper, the term
inattentive responding (IR) is used instead of careless responding when describing EMA
responses made with low motivation or concentration so as not to imply that participants are
irresponsible because there may be contextual or time-varying factors contributing to the low-
quality data. While IR is a well-established phenomenon in cross-sectional self-report methods,
this threat to data quality has rarely been studied in EMA studies, which have focused mainly on
noncompliance as the measure of data quality, conceptualized as compliance (proportion of
completed surveys) or attrition (rates of dropout from the study) (Christensen et al., 2003; Hsieh
et al., 2008; Larson & Csikszentmihalyi, 2014; Zhang et al., 2016).
Potential Detection Indices
Identifying IR is critical to the reliability and validity of study results. As a first step,
researchers can borrow detection methods that are traditionally used in cross-sectional self-report
surveys and apply them to the domain of EMA. One method uses a priori or proactive
approaches by including items that assess attention through an objective correct response
(attention check questions (ACQ)). Often this involves instructing participants to select a specific
response option (e.g., select "Strongly agree" on this item) or asking questions with a clear, correct
response (e.g., 2 + 2 = ?). Alternatively, self-reported inattentiveness is a straightforward method that
directly asks participants about the quality of the responses they provided at the end of the
survey. For example, participants could be asked to report how carefully they paid attention, how
much effort was exerted, or how engaged they were in completing the survey. They may also be
asked to judge whether their data should be used in the analysis based on the quality of the data
they provided. These a priori methods have typically been used in lengthy single-administration
cross-sectional surveys (Curran, 2016). However, there has been hesitation in using these
approaches in EMA given that inserting these items may take up limited space and time and
increase participant burden (Welling et al., 2021). Thus, reactive approaches to identify IR
through post hoc analyses of response patterns or metadata could be applied (Curran, 2016).
These measures to capture IR patterns include:
Longstring index. Inattentive participants may reduce burden by clicking the same response
option for consecutive questions (straight lining). The longstring index looks for these invariant
and overly consistent responses by counting the maximum number of times the participant
indicated the same response option in a row. Cut scores vary and may depend on the response
options, as some answer options may be more prevalent. Curran recommends a cutoff value
equal to half the number of items in a scale but notes that this depends on the probability of
careful responders selecting the same response repeatedly (Curran, 2016).
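As a minimal sketch of how a longstring index could be computed for a single completed EMA survey, assuming responses are stored as an ordered list of selected option codes; the function and cutoff below are illustrative rather than a standard implementation.

from itertools import groupby

def longstring(responses):
    # Length of the longest run of identical consecutive responses on one survey.
    if not responses:
        return 0
    return max(len(list(run)) for _, run in groupby(responses))

# Example: a 12-item survey in which the same option was selected 7 times in a row.
survey_responses = [4, 4, 4, 4, 4, 4, 4, 2, 3, 1, 4, 2]
cutoff = len(survey_responses) / 2            # e.g., half the number of items (Curran, 2016)
flagged = longstring(survey_responses) >= cutoff
print(longstring(survey_responses), flagged)  # -> 7 True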
Intra-individual response variability (IRV). This method, developed by Dunn et al., aims to
identify response patterns that are more consistent than expected by calculating the standard
deviation of responses (Dunn et al., 2018). The assumption is that a participant's responses will
vary somewhat across items. This measure identifies participants with low response variability
who did not respond with consecutive identical strings. The creators of this index did not propose a cutoff
score that would indicate IR, but instead recommend flagging the 10% of participants with the
greatest response consistency, as this is the expected rate of inattentive responders in a
sample based on previous research. Whereas Dunn et al. propose to mark persons with low IRV
scores as outliers, Marjanovic et al. also propose to mark persons with high IRV scores, as these
reflect highly random responses (Dunn et al., 2018; Marjanovic et al., 2015).
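For illustration only, the IRV of a survey is simply the standard deviation of its item responses; a minimal Python sketch (variable names are hypothetical, and whether a population or sample SD is used is an implementation choice) is shown below.

import numpy as np

def irv(responses):
    # Intra-individual response variability: SD of one survey's item responses
    return float(np.std(responses))

surveys = [[1, 1, 1, 2, 1, 1], [3, 1, 5, 2, 4, 1], [2, 2, 3, 2, 2, 3]]
scores = [irv(s) for s in surveys]
low_cutoff = np.percentile(scores, 10)        # bottom-10% heuristic (Dunn et al.)
flags = [score <= low_cutoff for score in scores]  # flag overly consistent surveys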
Synonym and antonym index. This method aims to identify participants with high response
variability. It assumes that participants respond similarly to similar items and respond
dissimilarly to dissimilar items, thus synonym and antonym pairs can be identified by either their
semantic or psychometric properties. For example, a positive correlation on an antonym index
could be a clear indicator of IR (Curran, 2016).
Mahalanobis distance. This method identifies if a participant’s responses are outliers in the data
by considering the improbability of the set of responses (Curran, 2016; Niessen et al., 2016).
Statistically, Mahalanobis distance is calculated by determining the distance between a person’s
vector of responses and the vector mean of the sample. Since the probability distribution for
Mahalanobis distance is known, a p-value can be calculated. A critical p-value can be set to
determine if a response is improbable enough to be flagged. While this method is promising, it
carries normality assumptions, and it is unclear how this method could be applied to the repeated
measures of EMA (Denison, 2022).
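To make the computation concrete, a minimal sketch is given below, assuming a rectangular surveys-by-items response matrix and multivariate normality; the function name, matrix layout, and alpha level are illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy.stats import chi2

def mahalanobis_flags(X, alpha=0.001):
    # X: n_surveys x n_items matrix of item responses
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)   # squared Mahalanobis distances
    p_values = chi2.sf(d2, df=X.shape[1])                # improbability under normality
    return d2, p_values, p_values < alpha                # flag improbable response vectors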
Response time. It is assumed that a threshold of time is needed to read the survey question,
evaluate the item, retrieve the information to determine a response, and indicate the most
appropriate response option (Tourangeau, 2018). Response time represents the total amount of
time a participant takes to complete the survey. Participants who respond quickly may not be
reading the question carefully, not answering thoughtfully, or incorrectly picking their response.
Response time is an intuitive and commonly used metric in online surveys, as the metadata
necessary is automatically recorded and easy to interpret. However, there are no standard cutoff
values for this indicator because response time may depend on the content of the survey items or
the population completing the survey. Huang et al. suggest a two-second-per-item benchmark to
serve as a minimum response time for data inclusion (Huang et al., 2012). But this cutoff is
based on an educated guess of the minimum amount of time required for an attentive response
and may leave inattentive responders in the sample. A time-based indicator does not depend on
specific response patterns and can provide a general measure of IR. A further advantage of
timing-based over response-pattern-based indicators is that IR can be identified at the item level,
supporting finer-grained identification and allowing for the possibility that respondents engaged
in IR on only some of the items.
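For example, a per-item timing flag based on the two-second benchmark could be sketched as follows; the threshold, function name, and timing values are placeholders rather than recommendations.

def flag_fast_items(item_seconds, threshold=2.0):
    # Flag individual items answered faster than the per-item benchmark
    return [t < threshold for t in item_seconds]

print(flag_fast_items([4.1, 2.8, 0.9, 3.2, 1.4]))  # [False, False, True, False, True]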
The strength of these indicators is that they can be easily used to identify IR, but each
indicator is specific to a pattern or type of IR behavior. A potential solution is to use a
simultaneous approach and combine multiple indicators, but clear guidelines for implementing
this method and identifying the optimal combination of indices are lacking (Jaso et al., 2021).
Overall, very little research has been conducted specifically seeking to identify IR in
EMA, and these post hoc indices of IR have had limited application to EMA thus far. Previous
attempts to refine detection indices for IR to be used with EMA items have been made (Jaso et
al., 2021). Jaso, Kraus, and Heller analyzed EMA data from 293 participants (18,093
assessments). The authors examined 1) time per item (time to complete each assessment divided
by the number of items), 2) standard deviation of responses per EMA survey set, and 3)
longstring (the mode of the item scores per EMA). Their analysis flagged a survey as inattentive
when time per item was < 1 s, the within-EMA SD was < 5, or the percentage of items at the mode
exceeded 60%. Most participants (59.39%) had at least one EMA response identified as inattentive by the above
criteria. A small subset of participants (5%) was identified as frequently inattentive responders
(>50% of their EMAs flagged as inattentive). Welling et al. also applied these post-hoc methods
but found much lower levels of IR in their data. The authors explain that this is likely due to their
small, highly invested study population of participants with hearing loss (N=20) (Welling et al., 2021). These
papers provide a unique and novel exploration of the prevalence and detection of IR but have
some limitations. As the affective states that participants reported could not be verified with
ground truth data, the studies rely on identifying outliers in the distribution of the data or
associations between variables to categorize responses as IR. However, these results can serve as
the basis for the hypotheses of the current study's analysis.
To replicate existing analyses and add to the literature by addressing previous gaps, we
estimated IR prevalence in an EMA study with a denser sampling frequency (once an hour) and
more study days (up to 365 days) than previous work in this area. This intensive study design
may be more conducive to response patterns that produce IR due to increased burden. We also
used ACQ as an objective measure of IR. Therefore, the first aim of the study was to describe
patterns of IR in our sample. We applied various IR detection indices and cutoffs used in cross-
sectional self-report surveys to EMA and compared accuracy across those indices. It was
hypothesized that 1) short response times would be more accurate in detecting IR in EMA surveys
compared to longer response times and 2) invariant response indices would be more accurate at
detecting IR than variant response indices. The second aim was to develop a prediction model
that optimizes the best combination of indices to detect IR in our EMA data and estimate
prevalence. It was hypothesized that overall IR prevalence would be higher for EMA than
typically reported in cross-sectional self-report surveys (>10%).
Methods
This study conducted secondary analyses of data from the Temporal Influences on
Movement and Exercise (TIME) Study. The overarching objective of the 12-month study was to
use real-time mobile technologies to collect intensive longitudinal data examining differences in
the micro-temporal processes underlying the adoption and maintenance of physical activity, low
sedentary time, and sufficient sleep duration in young adults (18-29 years old). The TIME Study
was approved by the Institutional Review Board at the University of Southern California: IRB#
HS-18-00605. Full details of the study design and participants have been previously reported
(Wang et al., 2022).
Participants
The sample consisted of emerging adults between the ages of 18-29 years old living in
the United States. To be eligible for the study, the participant had to use a compatible Android-
based smartphone as their only primary personal mobile device and plan to reside in a home
with Wi-Fi connectivity during the 12-month study period. Additionally, participants had to
intend to engage in recommended levels of moderate-to-vigorous physical activity (MVPA) (≥150 min/week moderate or ≥75 min/week
vigorous intensity) within the next 12 months and be able to speak and read English. Exclusion
criteria included physical or cognitive disabilities that prevented participation, diagnosed sleep
disorders, inability to wear a smartwatch or answer EMA surveys at home, work, school, or other
locations where the participant spent a substantial amount of time (e.g., more than 20% of the
time), an Android phone running version 6.0 or older, a data plan of less than 2 GB or a pay-as-you-go plan,
pregnancy, or current smartwatch wear. Overall, 332 participants consented to participate in the
study. Participants who successfully completed at least eight surveys per day on each of the four
days during the first EMA measurement burst period were considered fully enrolled in the study
(N=246) and loaned a smartwatch.
Throughout the study, staff performed real-time remote monitoring of participant
compliance. On a weekly basis, staff reviewed data uploaded to the study server and contacted
participants by text message if missing data was observed, to encourage compliance and address
technical issues. The smartphone app was aware of the status of the phone and watch (i.e., if the
watch was being worn and sending data, if survey responses were being received, if devices were
being properly charged daily), and the smartphone automatically surveyed participants via push
notifications to encourage proper watch use throughout the study. Study staff withdrew
participants from the study due to technical or participant issues that led to poor data integrity,
missing data, or ongoing low compliance. Participants were sent a birthday card and quarterly
newsletters to keep them engaged in the study and to promote compliance with study procedures.
Participants were given a phone number and instructed to text study staff with any questions,
concerns, or technical issues.
Procedures
Measurement burst EMA surveying. Every two weeks, the participants underwent a four-day
measurement burst period of EMA surveys (up to 26 bursts during the entire study period).
During waking hours, the smartphone app used push notifications to survey participants to
complete signal-contingent (i.e., randomly surveyed) EMA surveys approximately once every
hour. Within an hour, the surveying was restricted to between the 10th and 50th minute. For
instance, a survey scheduled between 10 A.M. and 11 A.M. occurred only between 10:10 A.M.
and 10:50 A.M. to ensure that two surveys from consecutive hours did not occur closer than 20
min from each other. Each EMA survey started with an audio chime, a vibration lasting 11
seconds, and a persistent notification. EMA surveys included up to 27 back-to-back multiple-
choice questions (requiring 1-2 minutes to complete). If the participant did not complete the
survey after the initial notification, the app re-prompted it after 5 min; if 10 minutes elapsed and the
survey was still not completed, it disappeared and became inaccessible to the participant.
Compensation. Participants received $10 for each EMA burst if they completed at least eight
surveys per day. In addition, if the participant answered more than 11 EMA burst surveys on a
given day, they received a $5 bonus for that day. The TIME app showed a persistent notification
on the smartphone that displayed the number of answered vs. surveyed EMA surveys on a given
burst day to help participants monitor their daily compliance.
Measures
EMA items. The analyses focused on the first fourteen EMA items that assessed affect, stress,
attention, self-control, productivity, demands, and habit. The items are presented in Table 1.
Affect and feeling states are frequently measured in other EMA studies, and thus these items
were selected so that findings from these analyses would be more relevant and translatable to
other researchers. Additionally, these 14 consecutive questions had five response options that
appeared in the same area of the phone screen (conducive to longstring response behavior).
Screenshots of the presentation of the first four EMA items on a Pixel 4 device are presented in
Figure 2. The EMA items were always presented in the same order in every burst survey.
Table 1. Analyzed EMA items from the TIME Study
Affective and Feeling States (response options: 1 = Not at all; 2 = A little; 3 = Moderately; 4 = Quite a bit; 5 = Extremely)
Right now, how SAD do you feel?
Right now, how HAPPY do you feel?
Right now, how FATIGUED do you feel?
Right now, how ENERGETIC do you feel?
Right now, how RELAXED do you feel?
Right now, how TENSE do you feel?
Feeling Stress (response options as above)
Right now, how STRESSED do you feel?
Right now, how FRUSTRATED do you feel?
Right now, how NERVOUS do you feel?
Attention (response options as above)
Right now, I feel FOCUSED.
Self-control (response options: 1 = Not at all; 2 = A little; 3 = Moderately; 4 = Quite a bit; 5 = Very much so)
Right now, I feel IN CONTROL.
Productivity (response options as above)
Right now, I am PROCRASTINATING.
Demands (response options: 1 = Not at all; 2 = A little; 3 = Moderately; 4 = Quite a bit; 5 = Extremely)
Right now, I feel like I can't get everything done.
Habit (response options as above)
Right now, I am following my usual routine.
Figure 2. Example of the presentation of EMA items on the phone screen
Attention check questions (ACQ). Aligned with previous studies, we included ACQs within the
burst EMA surveys to assess data quality. Twenty percent of the EMA burst surveys in the
TIME Study included an ACQ, randomly presented among the EMA affect questions, that asked
about obvious facts with clear, unambiguous answers. The questions included five answer
options to match the visual presentation of the EMA affect questions. Study team members
generated 80 different attention-check questions, which were added to the study protocol on
April 5, 2021. The generated ACQ are presented in the appendix. A binary variable was
generated (1=yes/0=no), indicating if the attention check question was answered correctly. This
binary variable served as an objective IR criterion for the EMA survey. It was assumed that
responses to all items in an EMA survey that included a failed attention check item were likely to
be answered inattentively.
Proposed IR Detection Indices. The following indices were calculated to detect potential IR.
Cut-off scores were established based on the existing research (Curran, 2016; Dunn et al., 2018;
Huang et al., 2012, Jaso et al., 2021; Marjanovic et al., 2015). All calculations excluded the
response to the attention check question.
• Longstring was calculated by taking a matrix of item responses for each survey and
beginning with the second column (second item), comparing each column with the
previous one to check for matching responses. For each survey, the length of the
maximum uninterrupted string of identical responses for the 14 items was counted.
• IRV was calculated as the standard deviation of the responses across a survey. Given that
some participants may have a natural response pattern of less variability in responses,
IRV was disaggregated as within-subjects and between-subjects, and scores were person-
centered for both. The between-subject (BS) level represents the deviation of a survey’s
IRV from the group mean IRV across all available surveys, while the within-subject
(WS) level represents a survey’s deviation from the participants’ mean IRV across all
available surveys.
• Flagged antonym pair violations were determined by calculating the correlations
between the items, identifying the pairs with the highest negative correlations, and
flagging improbable responses. This selection process is outlined in the results section of
the manuscript.
• Total response time was calculated using metadata collected at each EMA survey. The
difference in time between the response to the first EMA item (sad) and the fourteenth
EMA item (following usual routine) was calculated. EMA surveys that were not completed
in a plausible timeframe (>10 minutes) were removed; these were attributable to an app
glitch that did not close the survey. Additionally, time per item was
calculated as the difference between the end and the start time for each item, and the
median time per item for each survey was extracted. Given that some participants may
have a natural response pattern of responding faster compared to others, total response
time was also disaggregated as within-subjects and between-subjects, and scores were
person-centered for both. The between-subject (BS) level represents the deviation of a
survey’s total response time from the group’s mean total response time across all
available surveys, while the within-subject (WS) level represents a survey’s deviation
from the participant's mean total response time across all available surveys (a sketch of this person-centering appears below).
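A minimal sketch of this disaggregation is given below, assuming a long-format data frame with one row per survey and hypothetical column names; the same function applies to IRV and to total response time.

import pandas as pd

def add_bs_ws(df, col):
    # BS: deviation of each survey's value from the group mean across all surveys
    # WS: deviation of each survey's value from that participant's own mean
    grand_mean = df[col].mean()
    person_mean = df.groupby('participant_id')[col].transform('mean')
    df[col + '_bs'] = df[col] - grand_mean
    df[col + '_ws'] = df[col] - person_mean
    return df

df = pd.DataFrame({'participant_id': [1, 1, 2, 2],
                   'total_time': [20.0, 40.0, 10.0, 50.0]})
df = add_bs_ws(df, 'total_time')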
Data Analysis
Descriptive statistics were calculated, and response distributions were visualized for the
burst EMA questions that included an ACQ. T-tests were conducted to compare the surveys with
correct and incorrectly answered ACQ on mean scores of the indices and frequencies of flagged
antonym pairs violations. Correlations were run among the indices to examine the degree and
direction of association between the variables.
To address the study aims, first, we assessed the accuracy of the IR detection indices
using cutoff scores previously proposed in the literature for cross-sectional self-report surveys
(i.e., response time <1 sec per item, longstring of > 7 (half of the items), IRV > top 10%, IRV <
bottom 10%, flagged antonym pairs). Next, we compared how accurately the indices performed
on our EMA data in predicting the probability of correctness of the attention check question (i.e.,
objective IR) and how well they performed compared to each other. To do this, receiver
operating characteristic (ROC) curves were plotted for each index, and the corresponding area
under the curve (AUC) was analyzed. A ROC curve illustrates the change in sensitivity (true
positive rate) and 1-specificity (false positive rate) over the entire range of an index. The
corresponding AUC can be interpreted as the probability the indicator yields a higher score for
an attentive/accurate survey than for an inattentive survey. If an index is effective at detecting an
attentive survey, the curve will lie above the diagonal and the AUC will be significantly larger
than 0.5 (Fawcett, 2006). Then, we determined our own cutoffs for the indices based on the ROC
curves described above that maximized specificity, as misclassified IR surveys could result
in a loss of power.
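For illustration, the ROC/AUC computation described above could be sketched in Python as follows; the outcome and index values are simulated (not TIME data), and weighting false positives heavily is only one of several ways to select a cut point that favors specificity.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y = np.array([1, 1, 1, 0, 1, 0, 1, 1, 0, 1])      # 1 = correct ACQ (attentive survey)
index = np.array([1.4, 1.2, 0.9, 0.3, 1.1, 0.95, 1.3, 0.8, 0.4, 1.0])  # e.g., IRV per survey

auc = roc_auc_score(y, index)          # probability the index is higher for an attentive survey
fpr, tpr, thresholds = roc_curve(y, index)
best = np.argmax(tpr - 5 * fpr)        # penalize false positives heavily to favor specificity
print(auc, thresholds[best])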
Finally, we built a prediction model combining the indices that future researchers could
potentially use to screen for surveys with IR. All analyses
were conducted in Stata 17 and the statistical significance level was set at 0.05. We split the
entire dataset into a training and testing set to be able to assess generalizability and prevent
overfitting of the model. The training dataset contained 80% of the data and was used to train the
model. The remaining 20% formed the testing dataset, which was not used for model building and
allowed for the evaluation of the model's performance and generalizability to new data. The initial set
of predictor variables considered for the model included all the indices based on prior research.
The prediction model was constructed using a combination of forward and backward stepwise
methods of variable selection. The initial model was built by including predictor variables that
were significantly associated (p<.05) with the outcome (correct ACQ) in a univariate logistic
regression model. After the forward step, a backward stepwise elimination process was
performed that removed variables based on the highest p-value and assessed change in
discriminative ability. The performance of the final prediction model was assessed using
the Hosmer-Lemeshow goodness-of-fit test.
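The model itself was a random-intercept logistic regression fit with stepwise selection in Stata; as a simplified, hedged sketch of the overall workflow only (fixed effects, hypothetical column names, stepwise selection and the Hosmer-Lemeshow test omitted), the split and evaluation could look like the following.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def fit_ir_screen(df, predictors, outcome='correct_acq'):
    # 80/20 split; the held-out 20% is used only to assess generalizability
    X_train, X_test, y_train, y_test = train_test_split(
        df[predictors], df[outcome], test_size=0.20, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    auc_test = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    return model, auc_test

# predictors = ['irv_ws', 'total_time_bs', 'total_time_ws']

A grouped split by participant, rather than by survey, would further guard against within-person leakage between the training and testing sets.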
Results
A total of 24,561 burst EMA surveys contained an ACQ. Some surveys were excluded if
the ACQ was deemed to be of poor quality because the question contained an ambiguous
answer, was overly difficult (<90% of participants responded correctly), or tested knowledge
rather than attention, leaving a total of 20,439 burst EMA surveys with an ACQ for analysis. Of
these, incorrect responses to an ACQ were very rare: only 272 (1.33%) of ACQ
were answered incorrectly. The mean percent of ACQ completed correctly by participants
(N=198) was 98.51% (SD=6.10) and ranged from 25.18% to 100.00% (see Figure 3). The majority
of participants (N=147, 74.2%) answered all the ACQ they received correctly.
Figure 3. Distribution of ACQ correctness across participants (N=198)
Antonym pair selection
As none of our EMA items were true psychometric antonyms, we selected semantic
antonyms as a potential IR prediction index through a data-driven process by selecting items
with the strongest negative correlations. The six strongest pairs are presented below in Table 2.
Table 2. Between subject correlations between potential antonym pairs
Item 1 Item 2 Correlation
Tense Relaxed -.391
Stressed Relaxed -.391
Energetic Fatigued -.361
In control Stressed -.360
Happy Sad -.356
Frustrated Relaxed -.347
A positive correlation between the items could not be used alone as a sign of poor data quality as
it was common for participants to indicate the same response option to both items (e.g., feeling
“Not at all” on both items). A distribution of the responses to the items tense and relaxed is
presented below in Figure 4 as an example. The other pairs showed similar trends.
Figure 4. Distribution of responses to tense and relaxed EMA responses
Response options: 1 = Not at all; 2 = A little; 3 = Moderately; 4 = Quite a bit; 5= Extremely
While possible, it was determined that indicating high levels (“Quite a bit” or “Extremely”) on
both items was improbable and should be flagged. Three pairs of items were selected as flags for
analysis based on semantics and correlations. These were: Flag 1= Tense/Relaxed; Flag 2=
Energetic/Fatigued; Flag 3= Happy/Sad.
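A minimal sketch of this flagging rule is shown below; the function and variable names are illustrative, not taken from the study code.

def flag_antonym_pair(resp_a, resp_b, high=4):
    # Flag when both items of a semantic antonym pair (e.g., tense and relaxed)
    # are rated "Quite a bit" (4) or "Extremely" (5)
    return resp_a >= high and resp_b >= high

flag_antonym_pair(5, 4)  # True: extremely tense and quite a bit relaxed (improbable)
flag_antonym_pair(1, 1)  # False: low on both items is common and not flagged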
Descriptive Statistics for the IR Indices
Visualizations of the distributions for the potential IR indices are presented below in
Figure 5. The mean longstring value was 3.73 (SD=2.02), and values ranged from 1-14. The
skew and kurtosis of the longstring of each EMA were 2.49 and 11.46, respectively. The
distribution had a positive skew and values were concentrated toward the lower end of the scale.
The high kurtosis value suggests more outliers in the dataset compared to a normal distribution.
The mean IRV value was 1.11 (SD= .51), and values ranged from 0-2.34. The skew and kurtosis
of the distribution of the SD per EMA were .70 and 2.82, respectively. These values also suggest
a moderate positive skew and the presence of potential outliers. The range of response time was
8.03 sec–506.97 sec with a mean of 32.17 sec (SD=38.09). The skew and kurtosis of the
response time distribution were 6.67 and 62.81, respectively. The median time per item (TPI)
range was 0.63 s–20.41 s with a mean of 1.59 sec (SD=.93). The skew and kurtosis of the
median TPI distribution were 4.16 and 51.65, respectively. The skew and kurtosis for the time
values show significant deviation from a normal distribution. Skewness and kurtosis do not
directly impact the interpretation or analysis of ROC curves but can influence the performance of
classification models by biasing them toward the majority class, which could in turn affect the ROC curve
and AUC. However, given that this analysis aimed to identify these outliers, no additional data
transformation was conducted.
Figure 5. Distributions of potential indices of inattentive responding
(Histograms of longstring, IRV, total time per survey in seconds, and median time per item in seconds.)
Between April 7, 2021, and August 27, 2022, 199 participants received ACQs. Most
participants (N=147, 73.87%) answered all the ACQ they received correctly. Including only the
52 participants who answered at least one question incorrectly, the analysis comprised 4,686
burst EMA surveys, and the frequency of an incorrect response was 4.82%. T-tests were
conducted to examine significant differences in means between the groups of correctly and
incorrectly answered surveys. Results are presented in Table 3. Overall, the results suggest that
the surveys with correct ACQ had significantly lower longstring scores, but significantly more
intra-individual response variability, longer total response time, and longer median item response
times than the surveys with incorrect ACQ. Additionally, surveys with incorrect ACQs were
more likely to be flagged for responses on the tense/relaxed and happy/sad items.
Bivariate correlations among IR detection indices
Correlations among the indices are presented in Table 4. Results suggest that a higher
longstring is associated with less variability and less time spent on the survey. Conversely, more
variability in response was associated with longer time spent on the survey. The pairs of flagged
variables for improbable responses on the antonym pairs were intercorrelated such that being
flagged on one pair was associated with being flagged on the others as well. However, the
flagged improbable responses were not associated with IRV. Interestingly, the flags were all
correlated with median response time but not total response time, and only the tense/relaxed pair
was associated with longstring.
The predictive power of the IR detection indices and revision of cut scores
We applied cut scores from the cross-sectional survey literature and examined the
resulting AUCs with results presented in Table 5. Applying a cutoff of 1.95 (top 10%) for IRV
was the only promising predictor of ACQ correctness among the eight tested predictors,
performing slightly above chance (AUC = .52). The other AUCs showed that using cutoffs from
cross-sectional surveys was not effective in detecting IR in our EMA study, and they performed
only as well as chance (AUC ≤ .5); bottom 10% IRV, median response time, and total response
time classified more surveys incorrectly than correctly.
Then we attempted to define appropriate cut scores for the IR indices by identifying the
point on the ROC curve that maximized specificity. The revised cut score, specificity, sensitivity,
and AUC for each index are presented in Table 6 to assess the performance of the model at that
point. The optimal cut score for longstring was determined to be 4. However, the longstring did
not perform better than chance in detecting an incorrect ACQ (AUC=.49). The best cut scores for
IRV, median item response time, and total time were .915, 1.32 sec, and 19.57 sec respectively,
and all three were moderately accurate in identifying surveys with incorrect ACQ. For total
survey response time (BS), the revised cut score indicated that identifying surveys completed
19.68 seconds faster than the group mean is moderately accurate in identifying surveys with
incorrect ACQ. For total survey response time (WS), surveys completed 2.47 seconds faster than
the participant's own average were the best indicator of IR. However, this threshold had lower
accuracy in identifying incorrect ACQ compared to total time (BS) (AUC .72 vs. .55). For IRV
(BS), the revised cut score indicated that flagging surveys with IRV at least .388 units below the group mean
is moderately accurate (AUC=.69) in discriminating ACQ correctness. The
proposed cut score for IRV (WS) of -.029 had a lower accuracy (AUC=.57).
Predictive model building
A sample of 3,732 surveys (N=43 participants) (80% of the total sample) was used as a
training dataset to build a model combining the IR detection indices in a multivariate model to
predict the probability that a survey was answered correctly. Candidate predictor variables
included longstring, between-subject IRV, within-subject IRV, between-subject total response
time, within-subject total response time, and an improbable response on the tense/relaxed,
energetic/fatigued, or happy/sad items that were flagged. Univariate analysis (Table 7) revealed
that the improbable responses on antonym pairs that were flagged, longstring, and between-
subject IRV were candidate predictors that were not significantly associated (p<.05) with the
outcome (correct ACQ). Thus, the covariates included in the final model (Table 8) were
within-subject IRV, between-subject total response time, and within-subject total response
time. For every unit increase in IRV relative to a participant's mean IRV, the odds of the survey's
ACQ being answered correctly were 2.68 times higher, and the 95% CI indicates that this result is statistically significant.
In addition, for every second increase in total response time compared to the mean response time
across surveys (BS), the odds of getting the ACQ correct increased by a factor of 1.07, and this
relationship was statistically significant. Similarly, within-subject response time had a
statistically significant effect with every second increase in total response time compared to a
participant’s mean associated with a factor of 1.02 increase in odds of correct ACQ. The
Hosmer-Lemeshow goodness-of-fit test yielded a nonsignificant p-value (p=.99), which indicated
a good model fit; the observed and expected values did not differ significantly. The area under
the ROC curve generated by the estimation sample was 0.864 (SE=.015), which indicated good
discrimination when predicting the probability of an ACQ being answered correctly.
The subset of surveys (n=954) that were held out of the model building served as the
testing dataset to analyze this predictive model’s generalizability. The model produced a
comparable ROC area under the curve (AUC= .844, SE=.040). The model was rerun on all the
data, and the results are presented in Table 8. The performance of the model was further
evaluated using the following metrics: sensitivity, specificity, positive predictive value (PPV),
negative predictive value (NPV), and correctly classified rate using all the cases (n=4686). The
classification table is shown in Table 9. The sensitivity of the model was 88.2%, indicating that it
correctly identified 88% of the correct ACQ among all correct ACQ. On the other hand, the
specificity of the model was 65.0%, indicating that it correctly identified 65% of the incorrect
ACQ among all incorrect ACQ.
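These figures follow directly from the counts in Table 9; a small worked check in Python, included only so readers can verify the arithmetic, is shown below.

tp, fn = 3935, 525   # observed-correct surveys classified correct / incorrect
tn, fp = 147, 79     # observed-incorrect surveys classified incorrect / correct

sensitivity = tp / (tp + fn)                 # 3935 / 4460 = 0.882
specificity = tn / (tn + fp)                 # 147 / 226  = 0.650
ppv = tp / (tp + fp)                         # 3935 / 4014 = 0.980
npv = tn / (tn + fn)                         # 147 / 672  = 0.219
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 4082 / 4686 = 0.871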
Table 3. Descriptive statistics results of tests comparing IR detection techniques by responding
to the ACQ correctly vs incorrectly
Survey with
correct ACQ
N=4,460
Survey with
incorrect
ACQ N=226
Index M (SD) M (SD) 95% CI t df p-value
Longstring 3.72 (2.01) 4.01 (2.18) [.003, .586] 1.99 244.88 .04
Intra-individual response variability 1.13 (.51) .81 (.44) [-.372, -.252] -10.23 256.49 <.001
Total response time 32.78 (38.70) 20.19 (19.59) [-15.39, -9.78] -8.82 322.10 <.001
Median item response time 1.62 (.93) 1.17 (.90) [-.573, -.331] -7.36 250.00 <.001
N (%) N (%) χ2 p-value
Flagged tense/relaxed responses* 4.15 .04
Yes 63 (1.41) 7 (3.10)
No 4397 (98.59) 219 (96.90)
Flagged energetic/fatigued responses* .47 .49
Yes 24 (.54) 2 (.88)
No 4436 (99.46) 224 (99.12)
Flagged happy/sad responses* 42.61 <.001
Yes 12 (0.27) 7 (3.10)
No 4448 (99.73) 219 (96.90)
p-values from chi-square tests for categorical variables and t-tests for continuous variables.
M = mean; SD = standard deviation. 95% CI = the 95% confidence interval of the estimated difference between the means
(incorrect-correct). df = the degrees of freedom, corrected for unequal variances. ACQ=Attention check question
*Highly positive responses on antonym pairs (e.g., extremely tense/extremely relaxed) were flagged as improbable responses
which could indicate inattentive responding
Table 4. Bivariate Correlations among IR detection indices
1. 2. 3. 4. 5. 6.
1. Longstring
2. Intra-individual response variability -.27***
3. Total response time per survey -.03*** .13***
4. Median item response time -.07*** .27*** .53***
5. Flagged tense/relaxed responses^ -.08*** .03 -.05*** -.08***
6. Flagged energetic/fatigued responses^ -.03 .01 -.03 -.05** .09***
7. Flagged happy/sad responses^ -.02 -.02 -.03 -.04** .19*** .09***
Note. * p<.05; **p<.01; ***p<.001
^ Highly positive responses on antonym pairs (e.g., extremely tense/extremely relaxed) were
flagged as improbable responses which could indicate inattentive responding
Table 5. AUCs of the inattentive responding detection indices based on existing cut scores
generated from ROCs
Index Cut score AUC(95% CI)
1. Longstring 7 .50 (.48 - .51)
2. IRV (bottom 10%) 0.54 .41 (.38 - .44)
3. IRV (top 10%) 1.95 .52 (.51 - .54)
4. Median item response time 1 sec .33 (.31 - .37)
5. Average response time (total time/14) 1 sec .37 (.34 - .40)
6. Flagged tense/relaxed response^ 1 .49 (.48 - .50)
7. Flagged energetic/fatigued response^ 1 .50 (.49 - .50)
8.Flagged happy/sad response^ 1 .49 (.47 - .50)
Note: Cut scores based on guidelines provided in Curran et al 2016
ROC= receiver operating characteristic curve
AUC = area under the ROC
95% CI = 95% confidence interval of the AUC
IRV = intra-individual response variability
^ Highly positive responses on antonym pairs (e.g., extremely tense/extremely relaxed) were
flagged as improbable responses which could indicate inattentive responding
Table 6. Data-driven cut scores of potential IR detection indices and levels of specificity and
sensitivity and AUC
Index Data-driven cut score Specificity Sensitivity AUC
Longstring 4 .51 .48 .49
IRV .915 .61 .77 .69
Median item response time 1.324 sec .58 .81 .69
Total time 19.572 sec .65 .72 .69
Total time (BS) -19.681 .75 .69 .72
Total time (WS) -2.469 .46 .64 .55
IRV (BS) -.388 .79 .59 .69
IRV (WS) -.029 .52 .61 .57
IR= inattentive responding
IRV= intra-individual response variability
BS= between-subjects
WS=within-subjects
AUC=area under the curve
Table 7. Univariate logistic regression models of potential detection indices predicting correct
attention check question response
(In training dataset)
Outcome= Correct ACQ Coefficient (SE) P-value
Longstring -.09 (.05) .070
IRV (BS) .95 (.63) .130
IRV (WS) 1.08 (.35) .002
Total response time (BS) .06 (.02) .004
Total response time (WS) .02 (.01) .009
Flagged tense/relaxed response^ .62 (.69) .369
Flagged energetic/fatigued response^ .97 (1.18) .413
Flagged happy/sad response^ -.43 (.68) .619
Indices based on guidelines provided in Curran et al 2016.
ACQ= attention check question
IRV= intra-individual response variability
BS= between-subjects
WS=within-subjects
^ Highly positive responses on antonym pairs (e.g., extremely tense/extremely relaxed) were
flagged as improbable responses which could indicate inattentive responding
Table 8. Multivariate logistic regression models of potential detection indices predicting correct
attention check question response
(In training dataset compared to the full dataset)
Training dataset Full dataset
OR 95% CI OR 95% CI
IRV (WS) 2.68** [1.32, 5.44] 2.23* [1.18, 4.20]
Total response time (BS) 1.07** [1.02, 1.11] 1.06** [1.02, 1.10]
Total response time (WS) 1.02** [1.01, 1.04] 1.01* [1.00, 1.03]
Constant 89.34*** [44.71, 178.54] 78.57*** [42.33, 145.85]
Variance (Random effects) 1.51 [.81, 2.80] 1.26 [.71, 2.26]
N 3,732 4,686
Note. Outcome=correct ACQ;
* p<.05; **p<.01; ***p<.001
ACQ= attention check question
IRV= intra-individual response variability
BS= between-subjects
WS=within-subjects
Table 9. Classification table comparing observed vs classified correct surveys based on
multivariate logistic regression
Observed
Incorrect Correct Total
Classified
Incorrect 147 525 672
Correct 79 3935 4014
Total 226 4460 4686
Sensitivity: 88.2%
Specificity: 65.0%
Positive Predictive Value: 98.03%
Negative Predictive Value: 21.88%
Correctly Classified: 87.11%
*Using cut score of .9313366
Discussion
This study aimed to explore the patterns of IR in an EMA study with ACQs through three
aims. The first aim of the study was to describe patterns of IR in our sample. Contrary to our
hypothesis, IR prevalence, represented by incorrect ACQs, was lower in our EMA data than rates
in cross-sectional self-report surveys. Our data showed a low overall rate of IR (1.3%) which
was surprising. Estimates in other EMA studies have shown higher rates ranging from 3% up to
50% in a recent analysis by Denison (Denison, 2022). Potential causes of IR include long study
periods, environmental distractions, and lack of researcher-participant contact, which are inherent
to EMA and would support higher rates. However, alternatively, there are theoretical reasons
why there may be low rates of IR in EMA compared to cross-sectional studies (Welling et al.,
2021). For example, surveys are short, and participants can choose to skip surveys that are
delivered at inconvenient times; thus, selection bias and missing data may be greater threats.
Only a limited number of studies have investigated IR in EMA and differences may be
attributable to differences in study design and participant characteristics as well.
The second aim was to apply various IR detection indices and cut scores used in cross-
sectional self-report surveys to EMA and compare accuracy across those indices. As
hypothesized, short response times and less variability in responses compared to usual were
indicative of IR. The analyses also showed that cut scores derived from screening indices used in
cross-sectional surveys were ineffective and did not translate well to EMA, and revised cut
scores based on the TIME data were provided.
The third aim was to develop a prediction model that optimizes the best combination of
indices to detect IR in our EMA data. The final model included between-subject response
indices to detect IR in our EMA data. The final model included within-subject response
Overall, the test correctly classified 87.11% of the cases, indicating that the model was able to
classify most cases accurately. The high sensitivity suggests that the model was very good at
labeling correct surveys as correct. The lower specificity may be due to participants being able to
identify the ACQ and answer it correctly despite showing poor data quality on the other items in
the survey. The high PPV and low NPV and lower specificity suggest that the model may not be
as effective in identifying true negative cases and the model is better suited for identifying
positive cases (correct ACQ). Overall, as with all predictive models, the combination of selected
variables to best predict ACQ correctness offers valuable insight into which response patterns are
the most critical signs of data quality in EMA.
While there was good predictive value, there may be general limitations to applying the
existing indices to EMA given the differences in design between cross-sectional and EMA
studies. For example, if items are all worded in the same directions, consistency in selecting an
extreme response option (not at all) could indicate longstring or may be an attentive participant
experiencing low levels of affect. Research has demonstrated that participants tend to respond
with more extreme values during the first few EMAs, and this “elevation bias” attenuates over
time (Shrout et al., 2018). Similarly, quicker response times may indicate a participant rushing
through a survey inattentively or a participant who has become familiar with the survey and
knows their responses without much contemplation and requiring only a short amount of time to
attentively select their response (Kuncel & Fiske, 1974). Given the extreme length of our study,
participants may have become familiar with the items and become faster at responding to the
surveys. It is unclear if the model could identify careless responses in other datasets. This model
is only validated for the TIME dataset, and these potential indices should also be validated in
other EMA samples. While many EMA research studies use similar items to assess affect and
other similar constructs, considerations must be made for studies that assess more complex
concepts with varied numbers of items, different response formats, or different participant populations.
Application of Attention Check Questions
A major contribution of this study is the use of ACQ as an objective measure of IR in an
intensive longitudinal EMA study. Previously researchers have relied on post-hoc response
patterns or used self-report attention items, which were not effective and showed inconsistencies
as the people who failed an objective catch item did not report low attention (Geeraerts &
Kuppens, 2020).
The use of ACQ in EMA has been debated. In a recent paper that addresses IR in EMA
studies, Welling et al. discuss that while ACQs have been effective in detecting IR in cross-
sectional studies, participants in EMA studies answer repeated surveys and will begin to
recognize the items over time and “possibly be annoyed or insulted by the apparent distrust in
their responding behavior” (Welling et al., 2021). Additionally, some researchers fear that the
use of ACQ may also have undesirable effects like increasing socially desirable responses
(Clifford & Jerit, 2014) or cause participants to answer subsequent questions differently,
incorrectly, or inaccurately because they feel monitored or untrusted by the study investigators
(Hauser & Schwarz, 2015). Paradoxically, if the concern that ACQ threaten scale validity is real,
it does not influence the survey experience of inattentive respondents (i.e., those who overlook
the attention check); instead, it is those who are careful and whose data are likely retained in the
actual analysis who will be affected. But this spillover effect of ACQ may be an unintentional
intervention to increase data quality. Because survey respondents who read the attention check
can tell it is a "check," it signals that the researchers are screening for low-quality data and may deter
respondents susceptible to fatigue from engaging in IR (Hauser & Schwarz, 2015).
Another common rationale provided by EMA researchers for why they do not use ACQ
is that surveys must be kept short and “non-critical” items should not be included in
questionnaires to minimize the burden on subjects (Rintala et al., 2021; Welling et al., 2021).
Given the advantages, this logic is questionable, as ACQ need not be included in every
survey (in the TIME Study, an ACQ appeared in 20% of burst surveys), and the occasional addition
of one question is unlikely to add a substantial burden for participants. Moreover, EMA research
has shown that self-reported inattentive responding was not affected by the length of the
questionnaire, and compliance only decreases with a significant change in the length of the
survey (i.e., 30 items vs. 60 items) (Eisele, 2021).
Given the novelty of applying ACQ to EMA, the ACQs for this study were researcher-
generated, and the quality of the ACQ could be improved. When researchers create attention
checks based on personal intuition as we did, the checks they use often lack validity (Berinsky et
al., 2014). There are critical factors that make for a good attention check question, such as an
obvious answer and only measuring attention and no other constructs related to memory,
education, or culturally specific knowledge. Many papers caution against removing participants
who failed ACQ as it may lead to demographic biases by age, race, or education, which threatens
external validity and limits the generalizability of findings (Anduiza & Galais, 2017; Berinsky et
al., 2014; Chandler et al., 2014; Hauser & Schwarz, 2015). We combatted this by reassessing the
quality of the questions administered and excluding ones that showed higher rates of incorrect
responses across all participants. Future researchers could use ACQ that were applied in this
study and continue refining them. Alternative forms of ACQ used by other researchers instruct
survey participants to respond in a specific manner (e.g., "Please mark 'yes'"). In a 10-day EMA
study of substance use and social avoidance behaviors (N=195 participants), the frequency of
missed accuracy checks was .038 (SD=.137) and seven participants were excluded for a number
of failed ACQ more than 2 standard deviations above the mean (.31 or higher) (Gilman et al.,
2017). An ACQ to detect IR was also used in a 14-day EMA study conducted by Eisele et al.
that instructed participants to choose the first option of a 7-point Likert-type scale. There was also a
low prevalence of objectively determined IR (19 incorrect responses out of 528 surveys) (Eisele,
2021).
Implications and Future Directions
Cross-sectional studies frequently remove all of the data from any participants who
are suspected of IR (Kwapil et al., 2009). But recent research has shown that zero-tolerance
cutoffs (when respondents are eliminated if they failed even a single screening item) perform
poorly and likely screen out too many honest respondents (D. S. Kim et al., 2018). Instead, we
recommend that future EMA researchers weigh suspect data less in analyses or treat the data
as missing on the survey level. Alternatively, a sensitivity analysis could be conducted to
examine the effect of including the responses on results.
Given the wide range of populations and contexts that use self-report methods, it is
crucial that future researchers carefully consider the selection of IR detection techniques to be
applied. Some person-level factors have been found to be associated with unique response styles
in behavioral health studies that could be misinterpreted as IR. For example, there are recognized
cultural differences when responding to Likert scales. Several studies show that participants who
identify as East Asian tend to pick from the midpoint of the scale and avoid extreme responses,
which can be explained by an expectation socially to represent oneself modestly (Lai et al., 2013;
Lau et al., 2005; Lee et al., 2002). Researchers should thoughtfully consider their populations of
interest and the types of response patterns that may indicate IR. Studies with adolescents’ open-
ended responses may show quintessential indicators of IR, such as responses of “who[ever]
reads this is dumb” and “poop” (van Roekel et al., 2014).
Researchers have argued that the responding indices tested in this study are only indirect
measures that do not address the true nature of inattentive responding: the failure to read and
interpret item content (Brosnan et al., 2019). Our existing IR indices indirectly measure IR by
evaluating the response execution instead of the response process. Accordingly, a novel and
direct measure of IR would assess whether a respondent actually read the content of the item.
Eye-tracking would be particularly advantageous for EMA as it has the potential to examine IR
at the item level (e.g., on any given single item). In contrast, indirect IR indices can only assess
IR across the entirety of the survey. It can be expected that eye movements for IR in EMA would
show that the participant did not read the item stem and made fewer eye movements. Using eye
tracking would also advance our current IR detection methods. For example, if it is discovered
through eye-tracking that a participant reads survey items and seems to consider response
options yet still responds in a manner that is captured by indirect IR methods, then that would
call into question the validity of existing indirect methods of detecting IR and might change our
understanding of IR. Besides eye tracking, other physiological data points could be collected from
commonly used wearable devices, like heart rate variability or skin conductivity, which may
assess participant engagement and indicate response quality.
Moreover, future research should measure reactivity to the questions by looking at
response patterns before and after the item. Future analysis should also test the impact of the
position/order within the burst survey the ACQ appeared in. A potential limitation of our ACQ is
that they were too funny and engaging and participants could easily spot them within the burst
EMA surveys. It is possible that otherwise inattentive participants still answered the "fun question"
correctly because it stood out from the usual repetitive affect questions. Also, there may be reactivity such that the question
increased the positive affect of the participants.
Conclusion
In conclusion, this study investigated the patterns of inattentive responding in an
intensive EMA study. Possibly due to survey design differences between cross-sectional and EMA
studies, there was a lower prevalence of IR in our data, and existing detection
indices could not be directly applied to predict IR. The study demonstrated the usefulness of
ACQs as an objective measure of IR in EMA, overcoming the limitations of relying on post-hoc
response patterns or self-report attention items. Overall, this study contributes to the
understanding and detection of IR in EMA research. Through the careful consideration of IR
detection techniques, researchers can enhance data quality and improve the validity and
generalizability of findings of EMA studies.
STUDY 2: MODELING PERSON-LEVEL AND CONTEXTUAL FACTORS
ASSOCIATED WITH INATTENTIVE RESPONDING IN AN ECOLOGICAL
MOMENTARY ASSESSMENT STUDY
Abstract
Ecological momentary assessment (EMA) is a real-time data collection method that
collects self-report data in participants’ natural environments. While EMA offers advantages
such as increased ecological validity, the burden associated with repeated data collection can
affect participants' motivation and data quality. This study aimed to investigate person-level
factors (demographics, personality, approach/avoidance motivation, perceived burden) and
survey-level factors (weekend day, study week, part of the waking day, phone usage, location,
activity level) as predictors of inattentive responding (IR) in EMA. The data for this study were collected as part of the
Temporal Influences on Movement and Exercise (TIME) study, involving 332 young adults who
completed four-day bursts of hourly EMA surveys on their smartphones every two weeks for up
to one year. Linear regression analyses were conducted to examine the relationships between the
person-level predictors and attentiveness. Results indicated that sex at birth and reward
responsiveness were significantly related to attentiveness, with females and those lower in
reward responsiveness showing higher attentiveness. None of the other factors significantly
influenced attentiveness. Of the contextual factors, surveys delivered on weekend days, during earlier
weeks in the study, and at home had the highest likelihood of being answered attentively. Further analysis
and exploration of additional predictors are necessary to fully understand the factors contributing
to IR in EMA studies and ensure data quality in this methodology.
Introduction
Ecological momentary assessment (EMA) is a real-time data capture methodology used
by researchers to collect self-report data from participants in the real world, offering advantages
such as increased ecological validity and decreased recall bias (Shiffman et al., 2008). The
use of EMA has increased in recent years due to advancements in technology and
the popularity of personal smartphone ownership. Despite the benefits of EMA, the burden
associated with repeated data collection is one of the most significant issues of EMA, and the
amount of burden experienced by participants can influence their motivation to complete study
procedures (Courvoisier et al., 2012; S. Intille et al., 2016). Strategies that participants may
engage in to reduce burden include ignoring surveys (i.e., non-compliance), quickly responding
to the questions without reflection (i.e., careless responding), or dropping out of the study
altogether (i.e., attrition), which all negatively impact data quantity and quality. Therefore, it is
crucial to understand the factors that may contribute to burden in EMA studies and how the
factors affect response patterns.
Careless or inattentive responding (IR) is a phenomenon in which respondents “answer a
survey measure with low or little motivation to comply with survey instructions, correctly
interpret item content, and provide accurate responses” (Huang et al., 2012). Given research that
has shown potential causes of IR include factors such as longer investments of time toward the
study, environmental distractions, and lack of researcher-participant contact, there is concern
about IR in EMA due to these factors being inherent to the method. Additionally, while the
prevalence of smartphone ownership is often considered advantageous for EMA feasibility, this
increased familiarity and comfort with devices may also mean additional distractions competing
for users’ attention. As EMA data are often used to interpret within-person effects, these
inattentive responses may be mistakenly attributed to individual differences, which could
introduce error variance into results. As EMA's popularity increases, enhancing the accuracy of
EMA data is increasingly important, but research is lacking. While it has been established in other
intensive EMA studies (longer than 14 days) that the quantity of data collected (i.e., response rate
or compliance) can depend on temporal or contextual factors such as time of day, day in study,
location, and activity level, because participants deprioritize attending or responding to phone
surveys, these factors have not been systematically applied to evaluating the quality of the data.
Although there is a lack of EMA-based studies, research in retrospective cross-sectional
self-report methods suggests IR could be associated with various survey, person, and context-
level factors. Regarding survey factors, researchers have identified that there is more IR in longer
surveys (Baer et al., 1997). Cognitive resources are needed to complete questionnaires, and they
could be depleted in longer surveys, leading to fatigue and IR. Additionally, research has found
that a participant's interest in the survey's topic is negatively correlated with IR (Brower, 2018). If
the topic fails to engage the participant or is perceived as uninteresting, the participant may
become bored, fatigued, or cognitively overloaded more easily and reduce their effort in
providing accurate responses. Moreover, participants may be more likely to be attentive if the
survey is important and there is an outcome tied to survey performance (e.g., college entrance
exams, skills assessments during the hiring process, surveys with bonus compensation tied to
accuracy). However, if the outcome is not dependent on performance or accuracy, there could be
concern about more IR as the focus is on completion. These factors are likely to translate to an
EMA data collection context and should be investigated.
Research has also identified stable individual traits that are predictive of IR patterns. The
behavior has repeatedly been found to be associated with various person-level characteristics
such as gender, education level, or geographic location (Y. Kim et al., 2019). Research has
repeatedly shown that lower scores on extraversion, agreeableness, conscientiousness, and
emotional stability are correlated with greater IR (Bowling et al., 2016; Dunn et al., 2018a;
Maniaci & Rogge, 2014; Ward & Meade, 2018). These relationships may be due to the
motivation to respond carefully, as participants who are more agreeable and conscientious may
be driven by a desire to help provide accurate data. Participants with lower extraversion or
emotional stability may be less introspective and reflective in their survey response approach.
Additional personality traits have also been examined. Dunn et al. (2018) found that the tendency
to become bored was related to greater IR (Dunn et al., 2018). These results may support
findings that conscientiousness is an underlying mechanism of motivation through drive and self-
regulation (Judge & Ilies, 2002; Kanfer et al., 2017). It has also been suggested that participants’
perceptions about appropriate survey behavior will affect how obligated they feel to respond
carefully (Meade & Craig, 2012). Because the researcher is not present when a participant
completes an online questionnaire, this social distance may lead participants to feel less
accountability (Johnson, 2005; Meade & Craig, 2012). However, an intervention found that
including thank you messages from the researchers and explaining the importance of data quality
at the beginning of a remote study did not reduce IR (Ward & Meade, 2018).
The level of disruption experienced by participants is likely time-varying. When there is
IR, it suggests that participants are motivated enough to answer the survey but lack the cognitive
effort or drive to answer accurately. Increased distractions in a participant’s context have also
been shown to lead to higher levels of IR (Meade & Craig, 2012). Given EMA surveys are often
delivered randomly to increase the representativeness of data collection, participants sometimes
are completing EMA surveys at inopportune times and places. As a result, elements of one’s
context may interfere with the ability to respond thoughtfully, such as distractions from other
individuals or media. Moreover, the participant could be multitasking while completing the
questionnaire, reducing the amount of attention allotted to the survey and causing IR. Physical
settings can now be measured passively through global positioning system (GPS) data, and social
contexts can be assessed through Bluetooth radios, using various Application Programming
Interfaces (APIs) on the phone operating system or collected as survey metadata, making it
possible to explore this hypothesis (Lukowicz et al., 2012).
Researchers should strive to understand the factors contributing to burden and determine
how high-quality data can be collected without disturbing the participant. Thus, we assessed
whether person-level factors previously identified in studies using cross-sectional surveys also
affect intensive EMA studies and expanded upon existing research to investigate novel survey-level
contextual predictors. This exploratory study aimed to investigate person-level factors (e.g.,
demographics, personality, approach motivation, perceptions of burden) and contextual survey-
level factors (e.g., study day, time of day, location, activity level) as predictors of IR.
Methods
Participants and procedures
Data for these analyses were collected as part of the Temporal Influences on Movement
and Exercise (TIME) study. The TIME Study was approved by the Institutional Review Board at
the University of Southern California: IRB# HS-18-00605. Full details of the study design and
participants have been previously reported (Wang et al., 2022).
The purpose of the study was to examine the temporal factors underlying the adoption
and maintenance of physical activity, sedentary behavior, and sleep in young adults (18-29 years
old). A total of 332 young adults consented to the study and answered four-day burst periods of
hourly EMA surveys on their personal smartphones every two weeks for up to one year. During
waking hours, the smartphone app used push notifications to ask participants to complete
signal-contingent (i.e., randomly timed) EMA surveys approximately once every hour. The
hourly EMA surveys consisted of a set of multiple-choice questions that were available to
answer for 10 minutes.
Measures
Person-level Predictors
Demographics. Demographic variables (age, sex at birth, ethnicity, race, education level, and
employment status) were retrospectively assessed at baseline using online electronic
questionnaires completed remotely on a computer, tablet, or smartphone. Participants were
allowed to skip any of the questions.
Personality. Big Five traits (openness, extraversion, neuroticism/emotional stability,
conscientiousness, and agreeableness) were assessed using the Ten Item Personality Inventory
(Gosling et al., 2003) at the baseline survey. The TIPI was designed as a short instrument (10
items) that optimizes content and criterion validity, and the scale has high test-retest reliability
(r=.72) in adults (Ahmed & Jenkins, 2013; Gosling et al., 2003).
These traits have been previously analyzed in studies using cross-sectional surveys, and all traits
besides openness were found to be correlated with higher rates of IR (Dunn et al., 2018; Ward &
Meade, 2018).
Approach/Avoidance Motivation. Motivation was assessed through Behavioral
inhibition/behavioral activation scales (BIS/BAS) (Carver & White, 1994) at the baseline survey.
The subscales include one BIS-scale and three BAS-scales. BIS/BAS items were assessed on 4-
point Likert-type scales from 1 (very true for me) to 4 (very false for me). Previous research
indicates that individuals with higher BIS scores are more likely to engage in avoidance goals
(e.g., “I want to avoid failing”), and individuals with higher BAS scores are more likely to
engage in approach or achievement goals (e.g., “I want to do what I can to succeed”). The BAS
Scale can be subdivided into three subscales (i.e., drive, fun-seeking, reward responsiveness).
Previous research has demonstrated the validity of both the seven-item BIS measure used to
measure avoidance motivation (α=.65) and the 13-item BAS measure used to measure approach
motivation (α=.83) (Carver & White, 1994; Heimpel et al., 2006). Previous research has not
directly examined BIS/BAS in relation to IR or EMA compliance, so this was a novel
exploration of how the motivation systems drive participants in an EMA study given the
gain/loss of compensation associated with the completion of surveys.
User Burden Scale. To measure the perceived burden from EMA, the TIME Study adapted the
self-report User Burden Scale (Suh et al., 2016), which was included in the six-month online survey.
The original scale consists of six subscales: difficulty, physical, time/social, cognitive, privacy,
and financial burden. The financial burden questions were removed, as there was no cost
associated with participation in the TIME Study, and two questions on interruption burden were
added based on prior work, resulting in a final scale of 20 items. The subscale
reliabilities range from α=0.73-0.89. Good convergence between the User Burden Scale and
existing validated usability scales (NASA Task Load Index scale and the System Usability
Scale) has been shown (Suh et al., 2016). As perceived burden has been associated with EMA
data quantity (Ponnada et al., 2022), we were interested in whether similar relationships exist for
data quality.
Contextual predictors
Passive data on participant-smartphone interactions and other contextual variables were collected
from the smartphone. These metadata were generated with each survey delivered and represented
the context at survey delivery. The selection of these variables was informed by previous
literature or logical reasoning (Ponnada et al., 2022).
1) Week in the study (0-52)
Previous literature warns of decreases in data quality over time (Stone et al., 1991). Study
week was calculated as study day divided by seven.
2) Type of day (weekday vs weekend).
Participant routines may differ depending on the type of day, which could increase
inattention. We categorized each day as a weekday or weekend day (Saturday or Sunday).
This variable was dummy coded with a weekday as the reference (weekend=1,
weekday=0).
3) Part of the waking day (1-4). As EMA surveys were delivered based on the participants'
wake and sleep times to avoid disruption, day definitions for the study are based on the
perspective of the participant’s schedule. Thus, we divided each waking day into four
quartiles between the participant's wake and sleep time (see the illustrative sketch following
this list). The parts of the waking day were
dummy coded with the last quartile as the reference group.
4) Phone charging status (true=1, false=0). We determined at each survey if the phone was
being charged or not at survey delivery. Phone charging status was dummy coded with not
charging as the reference group.
5) Screen state upon survey delivery (Screen on=1, Screen off=0).
Previous literature shows inattention to tasks during smartphone usage (multitasking)
(Hyman et al., 2009). For each survey, we identified if the phone was in active use with the
screen on based on system broadcasts. Screen state was dummy coded with screen off as
the reference group.
6) Location upon survey delivery using GPS (1=home, 0=not home).
The smartphone captures the participant's longitude and latitude (GPS location) once a
minute during the study period. The raw location data were converted to location clusters,
which were labeled by participants through a self-reported context-sensitive EMA survey
(Ponnada et al., 2022). A person being away from their home environment may affect
their cognitive and affective state and increase the likelihood of IR. Given the pandemic,
participants spent much of their time at home. Location was dummy coded with not home
as the reference group.
7) Phone activity level upon survey delivery (in vehicle, on bicycle, on foot/walking/running,
tilting, unknown, other, still).
The movement state of the smartphone was collected using Android’s activity recognition
API once a minute during the study period. We hypothesized that being mobile may
increase inattention. The activity level was coded as a categorical factor using dummy
variables, with being still treated as the reference category.
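To make the derivation of these survey-level contextual variables concrete, the following is a minimal sketch in Python/pandas. It is not the TIME Study's actual processing pipeline, and all column names (delivered_at, study_start, wake_time, sleep_time, charging, screen_on, location_label, activity) are hypothetical assumptions for illustration only.

```python
import pandas as pd

def add_contextual_predictors(surveys: pd.DataFrame) -> pd.DataFrame:
    """Recode raw survey-delivery metadata into the contextual predictors described above.

    Assumed (hypothetical) columns, one row per delivered EMA survey:
    delivered_at, study_start, wake_time, sleep_time (datetimes),
    charging (bool), screen_on (bool), location_label (str), activity (str).
    """
    df = surveys.copy()

    # 1) Week in study: study day divided by seven
    study_day = (df["delivered_at"].dt.normalize() - df["study_start"].dt.normalize()).dt.days
    df["week_in_study"] = study_day // 7

    # 2) Type of day: weekend = 1 (Saturday/Sunday), weekday = 0 (reference)
    df["weekend"] = (df["delivered_at"].dt.dayofweek >= 5).astype(int)

    # 3) Part of the waking day: quartile 1-4 of the interval between wake and sleep time
    elapsed = (df["delivered_at"] - df["wake_time"]).dt.total_seconds()
    waking_length = (df["sleep_time"] - df["wake_time"]).dt.total_seconds()
    fraction = (elapsed / waking_length).clip(0, 0.999)
    df["waking_quartile"] = (fraction * 4).astype(int) + 1  # 1 = first quartile after waking

    # 4-5) Phone charging and screen state, dummy coded (False = reference group)
    df["charging"] = df["charging"].astype(int)
    df["screen_on"] = df["screen_on"].astype(int)

    # 6) Location at delivery: home = 1, not home = 0 (reference)
    df["home"] = (df["location_label"] == "home").astype(int)

    # 7) Phone-detected activity as a categorical factor ("still" used as the reference later)
    df["activity"] = pd.Categorical(df["activity"])
    return df
```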
Outcomes
Attentiveness. The outcome of survey-level attentiveness (or the absence of inattentive
responding) was measured in two different ways. These survey-level measures were aggregated
to create person-level scores for the person-level analyses.
EMA survey attentiveness probability (continuous variable). A previously developed equation
based on optimized cut-off scores for EMA response patterns (between- and within-subject
response time and within-subject response variability) was used to estimate the likelihood
(0-1) that each hourly EMA survey was completed attentively.
EMA survey attentiveness categorization (binary variable). Applying a cut score of .93 to
the EMA survey attentiveness probability, we categorized each EMA survey as 1=attentive or
0=not attentive.
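As a simple illustration of how the two outcomes relate, the snippet below applies the .93 cut score to per-survey probabilities and aggregates both outcomes to the person level. The data and column names are hypothetical, this is not the study's actual scoring code, and the direction of the cut (probabilities at or above .93 labeled attentive) is an assumption.

```python
import pandas as pd

# Hypothetical example: one row per delivered EMA survey, with a participant id
# and the model-derived probability (0-1) that the survey was answered attentively.
surveys = pd.DataFrame({
    "pid": [101, 101, 101, 102, 102],
    "attentive_prob": [0.99, 0.95, 0.60, 0.99, 0.97],
})

CUT_SCORE = 0.93

# Survey-level binary outcome (assumes probabilities >= cut score are labeled attentive)
surveys["attentive"] = (surveys["attentive_prob"] >= CUT_SCORE).astype(int)

# Person-level aggregates used in the person-level analyses
person_level = surveys.groupby("pid").agg(
    mean_prob=("attentive_prob", "mean"),   # mean probability of attentiveness
    pct_attentive=("attentive", "mean"),    # proportion of surveys categorized as attentive
    n_surveys=("attentive", "size"),
)
print(person_level)
```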
Data Analysis
STATA 17 was used for all analyses. Univariate linear regression models were conducted
to examine how the person-level factors of sex at birth, age, level of education, full-time
employment status, student status, personality subscales, reward sensitivity, and user burden
related to the likelihood that a person has a higher proportion of attentive surveys. Each predictor
was first tested in a separate regression model; significant factors were then to be entered into a
multivariate linear regression model.
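For illustration, the separate univariate regressions could be run in a loop. The sketch below uses Python/statsmodels rather than STATA 17 and assumes a hypothetical person-level table with one row per participant; the file name and column names are placeholders (assumed to be numerically coded), not the study's actual variable names.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical person-level file: aggregated attentiveness outcomes plus baseline predictors
persons = pd.read_csv("person_level.csv")

predictors = [
    "female", "age", "education", "fulltime_employed", "student",
    "bas_total", "bas_drive", "bas_fun", "bas_reward", "bis",
    "extraversion", "agreeableness", "conscientiousness",
    "emotional_stability", "openness", "user_burden_total",
]

# One unadjusted (univariate) linear regression per predictor, as in Table 12
rows = []
for p in predictors:
    fit = smf.ols(f"pct_attentive ~ {p}", data=persons).fit()
    rows.append({"predictor": p, "coef": fit.params[p], "se": fit.bse[p], "p": fit.pvalues[p]})

print(pd.DataFrame(rows).round(3))
```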
Contextual predictors of EMA attentiveness were tested in multilevel logistic regression
models. First, the factors were assessed using unadjusted univariate logistic regressions. Then,
significant factors were used to build a multivariate multilevel logistic regression. At most, the
model would include seven contextual predictors (Weekend day (DOW), Week in study (WIS),
Quartile of waking day (QWD), phone charging (PC), screen state (SS), location (LOC), activity
(ACT)).
$$Y_{ij} = \gamma_{00} + \gamma_{01}\,DOW_{ij} + \gamma_{02}\,WIS_{ij} + \gamma_{03}\,QWD_{ij} + \gamma_{04}\,PC_{ij} + \gamma_{05}\,SS_{ij} + \gamma_{06}\,LOC_{ij} + \gamma_{07}\,ACT_{ij} + u_{0j} + e_{ij}$$
where γ01 represents the change in the probability of the outcome (i.e., EMA survey was
completed attentively) on weekends as compared to weekdays, γ02 represents the change in the
outcome for each additional week in the study, γ03 represents the change in the outcome across
parts of the waking day (relative to the last quartile), γ04 represents the change in the outcome
based on whether the phone is charging, γ05 represents the change in the outcome when the
screen is on vs. off at survey delivery, γ06 represents the change in the outcome when the
participant is home vs. away, and γ07 represents the change in the outcome based on phone
activity type.
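The dissertation fit this random-intercept model with STATA 17. Purely as an illustrative stand-in, a population-averaged logistic model (a GEE with an exchangeable working correlation, which accounts for clustering of surveys within participants but is not identical to the multilevel model above) could be specified in Python/statsmodels as follows, reusing the hypothetical column names from the earlier sketches.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical survey-level table: binary attentiveness outcome plus contextual predictors
df = pd.read_csv("survey_level.csv")

formula = (
    "attentive ~ weekend + week_in_study "
    "+ C(waking_quartile, Treatment(reference=4)) "    # last quartile as reference
    "+ charging + screen_on + home "
    "+ C(activity, Treatment(reference='still'))"      # 'still' as reference
)

model = smf.gee(
    formula,
    groups="pid",                       # surveys clustered within participants
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
result = model.fit()

# Exponentiate coefficients and confidence limits to report odds ratios, as in Table 13
ci = pd.DataFrame(
    np.asarray(result.conf_int()),
    index=result.params.index,
    columns=["CI_low", "CI_high"],
)
odds_ratios = np.exp(pd.concat([result.params.rename("OR"), ci], axis=1))
print(odds_ratios.round(2))
```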
Results
Participant characteristics are presented below in Table 10 for participants who
completed the baseline questionnaire (N=290) and completed six months of data collection
(N=168). Table 11 presents further demographics and EMA descriptive statistics. On the survey
level, the mean probability of attentiveness of surveys was .988 (SD=.07) and 6812/183,026
(3.72%) of EMA surveys were categorized as inattentive. On the person level, the mean
probability of attentiveness of surveys was .99 (SD=.04) and ranged from .27 to 1.0. The
proportion of surveys categorized as attentive per participant ranged from 0 to 1, with a mean of
0.97 (SD=.15). Distributions are
presented below in Figure 6.
Figure 6. Distribution of measures of attentiveness of EMA surveys by participant
Table 12 shows the results of the univariate linear regressions examining the relationships
between person-level factors (e.g., demographics, approach motivation, personality, and user
burden) and the two measures of attentiveness: mean probability of attentiveness of EMA
surveys and percent of EMA surveys categorized as attentive. Regarding demographics, sex at
birth was significantly related to both measures of attention (p < .05), with those identifying as
female having higher attentiveness scores (B=1.31 (SE=.61), B=5.50 (SE=1.94)). Age,
education, full-time employment status, and being a student were not significantly related to any
of the outcome measures of attentiveness. Regarding approach motivation, lower scores on the
BAS reward responsiveness subscale were significantly associated with a higher mean probability
of attentiveness of surveys (B=-.40, SE=.14, p<0.01). None of the other reward sensitivity variables (BAS total
score, BAS drive, BAS fun, BIS) were significantly related to any of the measures of
attentiveness. Regarding personality, none of the variables (extraversion, agreeableness,
conscientiousness, emotional stability, openness) were significantly related to either measure of
attentiveness. Finally, in terms of user burden, neither the total score nor any of the subscales
(difficulty, physical, time/social, cognitive, interruption) was significantly related to either
measure of attentiveness. Given the small number of significant factors, we did not proceed to
build a multivariate linear regression model. One participant (wrigglecatalyststerility) was an
outlier with a very low number of attentive surveys compared to other participants. We reran the
analyses without this participant, and the trends were consistent, with smaller coefficients.
Table 13 presents results from the multilevel logistic regression analysis and displays the
odds ratios (OR) and 95% confidence intervals (CI) for an EMA survey categorized as attentive
in relation to contextual factors. There were significant univariate relationships between the
likelihood of EMA surveys classified as attentive and weekend day, study week, part of the
waking day, phone charging, phone screen on, home, and phone detected activity, which were
entered into the multivariate model. After controlling for all of the other variables in the model,
analysis found that surveys were more likely to be attentive on weekend days (OR = 1.15, 95%
CI [1.01, 1.32]) than on weekdays. In contrast, the likelihood of an attentive survey was
decreased in later study weeks (OR = 0.94, 95% CI [.93, .95]). The trend that surveys were more
likely to be attentive during the first part of the waking day compared to surveys during the
end/fourth part of the day was no longer significant in the multivariate model. Similarly, there
was no longer a significant effect of the phone usage patterns (phone charging and screen on) on
the likelihood of an attentive survey. Surveys answered at home (OR = 1.59, 95% CI [1.33,
1.91]) had about 1.6 times the odds of being attentive, controlling for the other contextual factors.
There was no effect of phone-detected activity on the likelihood of an attentive survey in the
combined model.
Table 10. TIME Study Participant Characteristics
Consented participants N=290; Participants with 6 months of data collection N=168
Demographics n (%) n (%)
Age in years (Mean ± SD) 23.7 ± 3.1 23.5 ± 3.5
Sex at birth
Female 163 (56.2) 94 (56.0)
Male 125 (43.1) 74 (44.0)
Missing 2 (0.7)
Ethnicity
Hispanic 80 (27.6) 50 (29.8)
Non-Hispanic 210 (72.4) 118 (70.2)
Race*
White 134 (48.2) 79 (49.7)
Black or African American 38 (13.7) 23 (14.5)
American Indian or Alaska Native 14 (5.0) 9 (5.7)
Asian Indian 29 (10.4) 15 (9.4)
Chinese 37 (13.3) 21 (13.2)
Filipino 18 (6.5) 7 (4.4)
Japanese 6 (2.2) 3 (1.9)
Korean 12 (4.3) 7 (4.4)
Vietnamese 11 (4.0) 7 (4.4)
Other Asian 14 (5.0) 8 (5.0)
Native Hawaiian 0 (0.0) 0 (0.0)
Guamanian or Chamorro 1 (0.4) 1 (0.6)
Other Pacific Islander 4 (1.4) 3 (1.9)
Missing 12 (4.1) 9 (5.4)
Education
Some high school 1 (0.3) 0 (0.0)
Grade 12 or GED (high school graduate) 34 (11.7) 16 (9.5)
Some college or technical school 109 (37.6) 71 (42.3)
College graduate 146 (50.3) 81 (48.2)
Work Status^
Employed for wages 161 (55.5) 88 (52.4)
Self-employed 16 (5.5) 8 (4.8)
Out of work for 1 year or more 9 (3.1) 4 (2.4)
Out of work for less than 1 year 33 (11.4) 19 (11.3)
Homemaker 6 (2.1) 2 (1.2)
Student 143 (49.3) 95 (56.5)
Unable to work 6 (2.1) 5 (3.80)
*While N=332 participants consented to the study, baseline data was only fully collected for
N=290 as some participants withdrew before completing the survey
^participants were able to select all that apply
Table 11. Descriptive Statistics for additional predictors
Mean SD
N= 286 participants
Approach Motivation
BAS drive 9.34 2.68
BAS fun 8.25 2.25
BAS reward 8.17 2.24
BIS 14.55 3.48
Personality
Extraversion 7.16 3.24
Agreeableness 10.06 2.36
Conscientiousness 10.23 2.59
Emotional stability 9.13 2.90
Openness 10.80 2.15
N=168 participants
User burden
Difficulty subscale 12.03 2.81
Physical subscale 3.91 1.82
Time social subscale 6.87 2.51
Cognitive subscale 5.01 2.10
Interruption subscale 9.71 3.31
Total Score 51.91 11.20
N %
N= 183,026 surveys (N=290)
Weekend day 81,989 44.8
Part of the waking day
Quartile 1 45,994 25.13
Quartile 2 55,566 30.36
Quartile 3 54,645 29.86
Quartile 4 26,821 14.65
Phone charging 31,873 17.50
Phone screen on 69,111 37.77
Home 91,690 69.06
Phone detected activity
Other 34,382 18.79
In vehicle 4,521 2.47
On bike 206 0.11
Walking 6,441 3.52
Unknown 22,224 12.14
Tilting 18,516 10.12
Still 96,736 52.85
Table 12. Person-level predictors of inattentive responding behavior for EMA tested through
separate univariate linear regressions (N=290)
BIS = Behavioral inhibition scale
BAS= Behavioral activation scale
^N=168 as this scale was only asked at the 6-month survey
*The probability (0-1) of attentiveness of each survey was calculated. This score takes the
average of the probabilities across all surveys;
**A cut score was applied to transform the continuous probabilities for each survey to a binary
variable of attentive (1=yes, 0=no). The number of attentive surveys was divided by the total
number of surveys received per participant.
Outcome 1: Mean probability of attentiveness of surveys* (Coefficient (SE), p); Outcome 2: % of surveys categorized as attentive** (Coefficient (SE), p)
Demographics
Sex at birth 1.31 (.61) 0.03 5.50 (1.94) <0.01
Age -.01 (.10) 0.92 .002 (.31) 0.99
Education .22 (.44) 0.62 1.40 (1.39) 0.31
Full time employment .65 (.56) 0.24 1.80 (1.78) 0.31
Student .25 (.56) 0.653 -.18 (1.79) 0.92
Approach Motivation
BAS (Total Score) -.09 (.06) 0.10 -.22 (.17) 0.20
BAS drive .04 (.12) 0.75 -.08 (.35) 0.82
BAS fun -.21 (.14) 0.14 -.13 (.44) 0.77
BAS reward -.40 (.14) <0.01 -.79 (.44) 0.08
BIS -.08 (.09) 0.39 -.42 (.29) 0.15
Personality
Extraversion -.01 (.09) 0.96 .30 (.30) 0.32
Agreeableness .08 (.13) 0.54 -.16 (.40) 0.69
Conscientiousness .09 (.12) 0.44 .17 (.38) 0.64
Emotional stability -.04 (.11) 0.74 -.44 (.34) 0.20
Openness .23 (.14) 0.11 .14 (.45) 0.76
User burden^
Difficulty subscale -.05 (.17) 0.77 -.34 (.47) 0.48
Physical subscale -.37 (.26) 0.17 -.44 (.73) 0.55
Time social subscale .004 (.19) 0.98 .13 (.53) 0.81
Cognitive subscale -.08 (.08) 0.29 .13 (.56) 0.82
Interruption subscale -.05 (.15) 0.71 -.03 (.39) 0.94
Total Score -.01 (.01) 0.32 -.003 (.10) 0.98
Table 13. Contextual predictors of attentive EMA survey categorization tested through multilevel
logistic regressions (n=183,026 surveys; N=290 participants)
Predictor: Univariate analysis OR [95% CI]; Multivariate analysis OR [95% CI]
Weekend day: 1.16 [1.04, 1.29]; 1.15 [1.01, 1.32]
Study week: 0.93 [.93, .94]; 0.94 [.93, .95]
Part of the waking day^1
Quartile 1: 1.36 [1.15, 1.61]; 1.21 [.99, 1.49]
Quartile 2: 1.12 [.95, 1.33]; 1.06 [.86, 1.31]
Quartile 3: 1.05 [.89, 1.24]; 0.98 [.80, 1.20]
Quartile 4: ref; ref
Phone charging: 1.16 [1.01, 1.33]; 0.88 [.72, 1.06]
Phone screen on: 1.37 [1.22, 1.53]; 0.97 [.84, 1.11]
Home: 1.96 [1.67, 2.30]; 1.59 [1.33, 1.91]
Phone detected activity
Other: 0.36 [.21, .61]; 0.74 [.40, 1.37]
In vehicle: 1.08 [.77, 1.51]; 1.11 [.46, 2.67]
On bike: 2.28 [.21, 24.6]; 1.77 [.002, 1702.72]
Walking: 0.67 [.47, .94]; 0.75 [.44, 1.27]
Unknown: 0.80 [.67, .97]; 0.80 [.61, 1.04]
Tilting: 1.07 [.89, 1.29]; 1.03 [.83, 1.28]
Still: ref; ref
Outcome= Survey categorized as attentive (0=no, 1=yes)
^1 Each day was divided into quartiles depending on the participant’s retrospective sleep and wake
time. Quartile 1 represents the first period after a participant wakes up.
Discussion
The purpose of this analysis was to examine person-level and contextual predictors of IR
in EMA data. Knowledge about differences in data quality between and within study days may
be relevant when designing an EMA protocol. These findings shed light on the factors related to
attentiveness to EMA surveys offering valuable insights into design factors that may improve the
reliability and validity of EMA data.
Overall, the results suggest that sex at birth and BAS reward may be the most critical
person-level factors to consider when assessing attentiveness to EMA surveys, while the other
demographic, personality, and user burden factors examined in this analysis were not
significantly related to attention. Our results support previous research that shows gender
differences in IR in cross-sectional surveys (Y. Kim et al., 2019). Female participants may show
different patterns of attentiveness due to socialization or differences in cognitive processes
(Kimura & Hampson, 1994). The reward responsiveness subscale of the Behavioral Activation
System is associated with seeking rewards and experiencing more positive emotions. While
BIS/BAS has not previously been studied in terms of IR, reward responsiveness has been shown
to be related to goal achievement and impulsivity (Taubitz et al., 2015). Our finding that greater
reward responsiveness is associated with lower data quality suggests that participants may be
using IR as a technique to reach compliance thresholds set by the study protocol. Validating
these findings and identification of additional person-level factors that result in more attention
when responding to EMA surveys should be further explored in future studies.
When controlling for all other contextual factors, surveys delivered on the weekend,
in earlier weeks of the study, and at home had the highest likelihood of being attentive. Factors that
did not significantly predict attention included phone usage patterns or activity levels. More
attention on weekends could be attributed to more free time and reduced work or school-related
stressors or distractions (Ragsdale et al., 2011; Zawadzki et al., 2019). Similarly, being at home
could provide an environment that allows more focus with fewer competing tasks for
participants’ attention (van Berkel et al., 2020). Previous research has analyzed a similar
question of conditions that participants experience as more burdensome or disruptive in EMA
studies. van Roekel et al. found that participants perceived surveys to be more inconvenient in
public places (compared to home), on weekends, and when participants were alone. When not
alone, surveys were worse when participants were in the presence of strangers or just
acquaintances. Similarly, EMA was more disruptive when positive affect was low, negative
affect was high, when stressed due to some activity, when outside the home, in the presence of
strangers, and when feeling tired or hungry (van Roekel et al., 2019). Regarding study week,
participants likely initially feel more motivated and engaged with the protocol earlier on in the
EMA study (Dzubur, 2017). Our discovery of a relationship between EMA response accuracy
and study week and time of day supports previous hypotheses or assumptions that participants
may become fatigued with the study over time or feel less motivated in certain weeks or
situations (Welling et al., 2021). This is an important consideration for the development of future
intensive longitudinal studies such as TIME (>1 month), and researchers should build
in strategies to keep participants engaged over time. But as our categorization of attentiveness
relies heavily on response time, there is a possibility that the trend toward less accurate data later
in the study may be due to a training effect rather than disengagement from the survey.
Relatedly, it is important to note that these findings are specific to our study design and
population, and future research is necessary to validate and test the generalizability of these
effects.
Contrary to expectations, personality dimensions were not significant predictors of
attentiveness in our analysis. This may be due to selection bias, whereby participants higher on
agreeableness or conscientiousness chose to join the study. It was also unexpected that phone usage
patterns and activity levels did not affect IR likelihood. It must be remembered that these
findings are an additional layer on top of compliance; the analyses only examine the quality of
the data provided when participants responded to the survey. Given that compliance was 65%,
participants may have been more likely to ignore EMA surveys in contexts where the survey
interrupted them (e.g., while performing another task on the phone or in transit) rather than
responding inattentively. This is supported by previous research: an analysis of contextual
biases in compliance was conducted with 131 participants from the TIME study who had
completed at least six months of data collection. Time of day, device charging status, phone
screen status, location, study day, physical activity, and last active phone use were all associated
with EMA non-response (Ponnada et al., 2022). There were also additional limitations of this
study. First, the analyses were based on a specific sample of young adults and a study design of
four days of hourly burst surveys across a year, limiting the generalizability of the findings to
other populations and less dense sampling. Moreover, the contextual factor models were not
adjusted for age, gender, or ethnicity. Additionally, the low prevalence of IR data may have
affected the fit of the models.
Our study contributes to the growing literature on EMA data quality by identifying
person-level and contextual predictors of attentiveness. The findings underscore the importance
of considering individual characteristics such as sex at birth and reward responsiveness when
evaluating data quality in EMA. Additionally, the contextual factors associated with
attentiveness highlight the need for tailored strategies to optimize data collection to combat
fatigue and increase engagement, or at minimum a recognition by researchers that data collected later in
the study may be less reliable and could be weighted less heavily in analyses to enhance the validity of
results. These findings provide a foundation for future investigations and the potential
development of interventions to optimize EMA data collection in various research settings.
STUDY 3: BURDEN AND INATTENTIVE RESPONDING IN INTENSIVE
LONGITUDINAL STUDIES: A QUALITATIVE ANALYSIS
Abstract
Engaging participants in ecological momentary assessment (EMA) studies and ensuring
high data quality is crucial but challenging due to the potential burden of repeated measurements.
Participants may engage in inattentive responding (IR) to combat burden, but the process
underlying this behavior is unclear. We explored the process of IR by conducting qualitative
interviews with 31 young adult participants who completed a one-year-long EMA study. The
interviews focused on participants' motivations, the impact of time-varying contexts, changes in
motivation and response patterns over time, and perceptions of attention check questions.
Thematic analysis revealed five overarching themes on factors that influence data quality and
participant engagement: 1) friends and family also had to tolerate the frequent surveys, 2)
participants tried to respond to surveys quickly, 3) the repetitive nature of surveys led to neutral
responses, 4) attention check questions helped to combat overly consistent response patterns, and
5) different motivations led to different levels of data quality. These findings provide insights
into the complex process of inattentive responding and participant engagement in EMA studies.
The study identified factors influencing data quality in EMA studies that could guide future
research to improve EMA survey design. The identified themes offer practical implications for
researchers and study designers, including the importance of considering social context, dynamic
motivation, attention check questions, and intrinsic motivators of participants. By incorporating
these insights, researchers might maximize the scientific value of their EMA studies through
better data collection protocols.
Introduction
Collecting repeated self-report data in real-time using methods such as ecological
momentary assessment (EMA) reduces recall biases associated with cross-sectional surveys and
has the benefit of examining time-varying (within-person or within-survey) factors (Shiffman et
al., 2008). However, despite these strengths, sustaining participant engagement, defined as
motivation to complete study procedures, can be challenging given the potential burden of
completing repeated surveys. Increased participant burden to complete study procedures may
occur due to EMA study design or time-varying factors influencing the participant such as time
of the day or physical and social contexts. Beyond compliance and non-response, lack of
engagement may also consist of careless or inattentive responding (IR) in which participants
respond to items with low motivation to comply with survey instructions, correctly interpret item
content, or provide accurate responses, resulting in lower-quality data (Huang et al., 2012). It
could be hypothesized that detrimental aspects of study participation such as cost (time or
cognitive burden) are weighed by the participant against benefits (financial or intrinsic
motivation) to determine willingness to complete an EMA survey and provide accurate responses
to EMA items.
As IR in EMA studies has rarely been studied, there is a lack of knowledge of the theory
underlying the behavior or the psychological constructs that comprise the process. Potentially,
aligned with Ajzen’s theory of planned behavior, the intention to engage in a behavior (e.g.,
respond to an EMA survey attentively) is a function of social norms (i.e., acceptability of
answering surveys in their physical or social context) and intrapersonal attitudes (i.e.,
participant’s desire or motivation to satisfy researchers or earn compensation) (Madden et al.,
1992). Researchers’ knowledge about IR and EMA could be broadened by directly asking
participants to describe their process of responding to EMA surveys using qualitative research
methods to understand the contexts of IR occurrence. Accordingly, this study was designed to
begin to uncover the processes of IR to aid with future hypothesis generation for further
quantitative investigation of this phenomenon.
Qualitative interviews have previously been conducted with emerging adult participants
in EMA studies to assess the acceptability of various EMA protocols to better understand the
experience behind completing surveys and barriers to data collection for a variety of health
behaviors (Cherenack et al., 2016; Dietrich et al., 2020; Mackesy-Amiti & Boodram, 2018;
Moore et al., 2013; Suffoletto et al., 2017; Turner et al., 2019). However, these studies have
mainly focused on barriers to compliance rather than barriers to data quality. There have been
two mixed methods studies assessing the accuracy of EMA data using brief qualitative
interviews with a subset of participants that highlighted study design factors that hindered
engagement. However, the EMA protocols in these studies were short (i.e., two weeks long), and
results may highlight changes in the data quality that appear relatively early and miss slower or
more long-term changes that could occur during longer EMA protocols (Eisele, 2021; van Berkel
et al., 2020).
To advance the field, we collected qualitative data from 31 young adult participants who
completed a one-year-long intensive longitudinal EMA study with daily diaries and bi-weekly
waves of hourly surveys to understand participants’ response patterns and potential factors
leading to IR through semi-structured interviews and thematic analysis. The goals were to
understand 1) why participants joined the study and their motivation to continue, 2) the effects of
time-varying contexts on IR, 3) changes to motivation or response patterns over time, and 4)
perceptions of the attention check questions. Through the interviews, rich descriptive data were
captured about the potential burden of an EMA study lasting across a year’s time, and through
analysis, we can begin to explain “why” participants may contribute low-quality EMA data.
Methods
Design
This study used a qualitative research design to explore participants’ lived experiences,
behaviors, feelings, perceptions, and interactions with the EMA study protocol. Semi-structured
interviews were conducted with participants to elicit data about their experiences using an
interview guide to lead the conversation. The semi-structured interview style balanced the need
for structure to elicit data for the study with a conversational approach to build rapport with the
participants and allow for flexibility in responses. The guiding research question was, “How can
we discover the process and sequence of decisions that individuals make that may contribute to
low-quality data?”
Interview procedure
This study was conducted using a subsample of participants enrolled in the larger TIME
Study (Wang et al., 2022). The intensive year-long study involved participants completing a
daily diary smartphone survey at the end of each day. Participants also completed hourly EMA
surveys across four-day measurement bursts every two weeks. Twenty percent of the hourly
EMA surveys included an attention check question. Individuals who successfully completed the
full year of data collection after April 22, 2022, were asked to participate in an end-of-study
session with study staff on Zoom that included a 30-minute semi-structured interview designed
to elicit feedback on their experiences in the study and perceptions about their EMA survey
responses. The data analyzed in this sub-study were collected between April 22-August 29, 2022.
There was no additional compensation provided for this interview session.
Participants were asked a series of predetermined questions developed by SDW
(Interview Guide presented in Table 14 below). Additional probing questions were asked
depending on their initial responses to gain additional clarity about participants’ decision-making
process. The interviews were guided by the primary research question but also allowed for new
ideas and themes to be discussed. All interviews were recorded through Zoom, which generated
an audio and video file. However, for the purposes of this study, only the audio file was kept and
stored on a secure shared drive for transcription and the video was deleted.
Table 14. Interview Guide
Motivation • How did you learn about this study?
(probe: What features of the study interested you to want to
participate?)
• Can you describe what motivated you to continue to answer surveys in
this study?
(probe: If not discussed…what was the motivating level of money;
motivation to help science; reflection; ease of answering; How
important was the compensation from the study to you?)
• Can you describe the process of answering phone surveys on a typical
burst day?
(probe: How many phone surveys do you think you answered on a
typical day? Did you have a goal of number of surveys you were trying
to reach? Did you track completion?)
• What would have made participation in the study more fun or
rewarding?
(probe: try to identify non-monetary factors)
Situations of
increased burden
• How did you handle distractions when taking the survey?
• Were there situations in which your responses to the surveys may have
been less accurate (that you answered without thinking through your
responses)?
(probe: How did your responses change if someone else was around?
Depending on your location? Different times of the day?)
Response
accuracy
• Were there situations in which your responses to the surveys may have
been less accurate (that you answered without thinking through your
responses)?
(probe: How did your responses change if someone else was around?
Depending on your location? Different times of the day?)
• How do you think your motivation or accuracy changed as you were in
the study longer?
(probe: What made the study easier or harder over time?)
Perceptions of
attention check
questions
• What did you think about the questions and messages that were not
related to measuring health behaviors, routines, and mood on the
phone?
(probe: Which ones were the most memorable? Any suggestions on
how we can make them better?)
Data Analysis
The collected qualitative data was systematically analyzed using an iterative process
following recommendations by Charmaz (Charmaz, 2006). The audio files were deidentified
using the participant’s study ID and transcribed verbatim by an external transcription service
provider (GoTranscript). Transcripts were manually revised to correct errors. First, SDW
independently reviewed the documents and summarized ideas through memos. An initial set of
gerund-based codes was created by extracting significant language and patterns; these focused
primarily on the research question and centered on the habit of responding to surveys and
reactions to the attention check questions. From these, codes began to form after looking at the broader
testimonies that participants provided. All codes were revised and condensed to create a total set
of 13 thematic codes. Definitions and an example from the transcripts for each thematic code
were provided in the codebook. Transcripts were then uploaded onto the qualitative data analysis
software program, ATLAS.ti. Using the program, the thematic codes were applied to the
transcripts to conduct a qualitative assessment of the factors that influenced the data quality of
participants. Each transcript was independently coded by three different team members (SDW,
LAH, JCM) to ensure consistent application of codes. The team discussed and resolved
disagreements using the ATLAS.ti software Intercoder Agreement Mode. After a high level of
agreement was reached between coders, five overarching themes emerged that aggregated the 13
codes. These themes highlight a range of concepts participants associated with their study
participation and data quality. The distribution of themes and codes was checked across
transcripts to ensure they adequately covered the overall discussions.
Results
Thirty-one participants were interviewed, 24 of whom were women. The largest portion
of participants self-identified as Asian, had graduated from college, and were employed.
Demographics for the participants collected at baseline and 12-month online surveys are
presented in Table 15. Our subsample differed demographically compared to the full sample of
participants who completed the year-long study (N=136) as our subsample was more female
(77.4% vs 57.4%), less Hispanic (22.6% vs 32.4%), less White (29.0% vs 48.1%), more Asian
(45.2% vs 37%), had a higher proportion of college graduates (71.0% vs 49.3%), and included a
higher proportion of participants who were employed or self-employed (64.5% vs 56.3%) rather
than students (38.7% vs 48.4%). These differences were all statistically significant in t-tests (p<.01).
Table 15. Demographics for qualitative interview participants
Demographics
(N = 31)
Age in years (Mean ± SD)
24.4 (3.1)
Sex (n (%))
Male 7 (22.6)
Female 24 (77.4)
Ethnicity (n (%))
Non-Hispanic 24 (77.4)
Hispanic 7 (22.6)
Race (n (%))*
White 9 (29.0)
Asian 14 (45.2)
Black 2 (6.5)
Multiple races 5 (16.1)
None indicated 1 (3.2)
Education (n (%))
High School 4 (12.9)
Some College 5 (16.1)
College Graduate 22 (71.0)
Work Status (n (%))*
Employed 20 (64.5)
Self employed 4 (12.9)
Out of Work 3 (9.7)
Student 12 (38.7)
Homemaker 2 (6.5)
Marital Status*
Never Married 20 (64.5)
Unmarried couple 7 (22.6)
Married 4 (12.9)
Financial Status
Live comfortably 10 (32.3)
Meet needs with a little left 15 (48.4)
Just meet basic expenses 6 (19.4)
*Assessed at end of study (12-month survey)
Interviews ranged from 12-37 minutes (Mean=21.30, SD=7.15) in length, and the number of
quotes coded per participant ranged from 6-63 (Mean=25.87, SD=12.98). The 13 thematic codes
that were constructed after analyzing all 31 transcripts are presented with the number of
participants who discussed each code. They are listed below by frequency.
1. Interrupting social situations: surveys interfering with social situations (N=29)
2. Earning compensation: earning payment for the study/surveys and needing to reach a
minimum threshold for payment (N=28)
3. Habit of responding to surveys: familiar routine around the surveys (N=24)
4. Identifying the purpose of attention check question: recognizing researchers are
testing their attention with the attention check questions (N=23)
5. Feeling internal motivation: internal factors driving survey completion rather than
money (N=22)
6. Contemplating response: participant describing thoughtfulness about responses
(opposite of IR) and discussing reactivity (N=22)
7. Response speed: responding to surveys quickly (N=19)
8. Question difficulty: cognitive difficulty of questions; not sure what question was asking
(N=19)
9. Reactivity to Attention Check Question: describing a change in response patterns due
to attention check questions (N=16)
10. Contributing to science: external motivation outside of compensation (e.g., want to help
researchers) (N=15)
11. Feeling judged by others: others’ negative perceptions of the participant answering
surveys (N=11)
12. Disrupting screen time: survey causing interference by displaying over other phone
tasks (N=8)
13. Receiving assistance answering surveys: participant had others help answer the survey
(e.g., if the participant was busy driving) (N=6)
The 13 codes above were distilled into five overarching themes. The five themes presented do
not encompass all participant experiences but aim to provide a general overview of them.
Theme 1: My friends and family also had to tolerate the frequent surveys.
All participants described how they fit the burst surveys into their routines and lifestyles,
which was necessary to tolerate the disruptions, and many described the process of responding
as a habit. Given the young adult population, for many, their usual routines involved spending
time with others such as friends and family who had to tolerate these surveys as well. They
commonly described the reaction of these others and how answering the burst surveys became
routine during their social interactions. Some participants’ social companions were able to
tolerate the surveys better than others. For example:
But also, it really did become part of my routine. I remember at the beginning, someone
saying it kind of becomes part of your routine and it really did. It was kind of just funny
like sometimes I’ve been playing games with friends and I’ll be like “Oh, we have to pause
it’s survey time” and they got it. They just understood and it became part of our thing.
-9252, 25F
I think that piece of habit definitely played a role. I didn’t necessarily feel as motivated
towards the end, but it was just a part of my daily routine. I guess for me it was just easy
because it was second nature, I guess, by that point. Maybe for friends and family, it was a
little bit more frustrating just because they’re like, “He’s going to go do a survey.
-9310, 25M
I think there are a couple or a few times when a buzz day would overlap with some social
thing and so I would tell whoever I was with, “Oh, I have this survey thing that comes
every hour or so. Every hour, I need to quickly answer some questions on my phone.” I did
have a friend who was like, “ Oh, you’re being rude,” and blah, blah, blah. That was not
pleasant, but I feel like that was more of my friend’s problem.
-9314, 27F
Because on occasion, it would be a little bit annoying because it interferes with social life
a little bit, and when you’re talking with somebody it can be rude to just look at your watch
or look at your phone or something. There’s a little bit of that, but that’s what you knew
going in, so I knew this was going to happen. It’s just every once in a while, you’re just
like, “This is annoying and inconvenient,” but other than that, it felt fine.
-9284, 28M
When burst surveys came in during social gatherings, the two main strategies described by
participants involved either physically stepping away and leaving conversations to complete the
survey with more focus or trying to multitask and take the survey in front of others. If
completing the survey in front of others, participants also had to decide whether to provide a
rationale for their phone use and how much of the study to explain.
[I]t did get complicated because having to stop, especially if you’re socializing with people
and having to stop then, okay, do I either explain to them what I’m doing and then try to
explain to them the whole study or just, “Give me a minute, let me do this real quick.”
-9248, 29F
It’s just some of those exposed scenarios when you’re out. For me personally, at least,
you’re out at a dinner or there’s people around you or something, you cannot go by
yourself or take your phone and focus on the survey. You’re just passively filling the survey
out while you’re doing other things.
-9282, 24M
I would just say like, oh, can you gimme like two minutes and usually I try to multitask by
explaining to them telling a story about it. Makes it more engaging with other people.
-9271, 23F
I think it was doing the actual survey on the phone just because in those scenarios where I
am with friends, but there would be times where I’d be like, okay, well, we’re all hanging
out, but it’s no big deal. Then every hour, I would just be bringing out my phone, like,
“Sorry guys. I just have some stuff to do real quick.” Again, it only took a minute so no big
deal, but I guess it’s disruptive.
-9276, 28M
Thus, for many participants, the friends and family they saw regularly understood the frequency
of the surveys. However, these interruptions were typically recounted in interviews humorously.
Other times I would—I think my family also got used to it after one point that they knew,
almost everyone knew that I’m doing this, so they would repeat it for me. [laughs]
-9287, 29F
Yes, especially the friends that I saw all the time, they all knew. They just like, “Okay, hold
on survey time… Sometimes they would hear the buzzing and be like, “Oh, it’s time for you
to do a survey” I’d be like, “Oh, thanks for telling me.”
-9296, 25F
The notification is disruptive, but it has to be in order to get my attention, I suppose. My
brother and I were joking that we have PTSD from [mimics notification sound] I would be
like, “Oh, gosh, it’s coming now.”
-9245, 28F
Yet for other participants, the surveys sometimes interrupted sensitive moments.
I think the other times it really caused me major issues was typical marital stuff. If my
husband and I are arguing or getting into it and my phone goes off, he got to know the
notifications, so then he’s going, “Just deal with,” and I’m like, “It can wait, we need to
deal with this. It’s more important.” That did happen a couple of times.
-9248, 29F
Theme 2: I am answering the surveys quickly.
Outside of ignoring the survey, participants described a fast response speed as the best way
to manage the disruption of the survey. This was especially beneficial when the survey came at
an inconvenient time; the consistency of the same questions and responses, and the resulting
ability to answer the questions quickly, minimized the interruption burden while potentially
still providing accurate responses.
I usually just try to do it as fast as possible, get it done with so it wouldn’t annoy me again.
-9270, 26F
Because the surveys would basically be the same questions, I can answer the surveys in 15,
20 seconds because it’s just the same answers and it’s the same questions. I’d be able to do
it fast.
-9294, 21M
With the phone surveys, with the questions being in the same order, that helped a lot
because at the start of it, I hear then get the notification go and I could run through my
head real quick how I was doing since the last survey, over the last hour, so I could
already have that in mind. It would allow me to, if needed, to split my attention, but still be
able to accurately answer the phone questions.
-9248, 28F
Participants did discuss how responding quickly may have been a sign that the responses were
less accurate or a sign that the participant was distracted.
I feel like during the finals week answering surveys it’s not fun at all. That’s the time when
I probably didn’t have as accurate answers because I was like, “I need to get through this
real quick and keep working.
-9302, 18F
I guess maybe sometimes. If I’m really busy, maybe I’ll just skim through it faster. Maybe
like in the middle of the day, I’m working and I want to do it fast, I may not take as long.
It’s more of a reflex of answering those questions.
-9318, 23F
But fast response times are not necessarily a sign of poor data quality. Many participants
described beginning to respond more quickly over time as an expected response pattern due to
becoming more familiar with the study procedures and the questions.
Usually, it took, I don’t know, maybe a minute to get through. After the first couple, you
know what the questions are going to be so it’s easier to go through faster and, yes, just
knock it out essentially and then wait for the next one… I feel like answering the questions
probably got a little bit easier because I knew what they were and who gauge better, like in
the beginning you think a little bit and you’re like, how am I feeling? Then it just becomes
like second nature almost to think about it and to go through and you’re like, oh yes, I’m a
little bit tired or, yes, I have been procrastinating today. Yes, since you’re thinking about it,
you know the levels a little bit better, so you’re able to—I was able to, I think, answer more
accurately the longer I was in it just because of that.
-9329, 23F
Theme 3: The repetition made me start to pick neutral responses.
Being in a study with repeated surveys had some benefits of reducing burden but also
introduced fatigue for the study participants. One participant emphasized the dual nature of this
study design.
The factors come from the study. I think I know that it’s repetitive. Sometimes it’s daunting
because it’s repetitive but sometimes it’s efficient because it’s repetitive.
-9320, 28F
Participants sometimes did not feel like they had enough time or cognitive effort to process the
questions, and later in the study, they may have stopped taking the time to fully process how they
were feeling once the novelty of the study wore off. This may have resulted in a preference for
selecting the middle or neutral/moderate response option.
Fatigue, energetic for me is not that hard, but some other questions can be more subtle,
like “Am I really feeling relaxed or not?” You know those kind of things and I would not
get enough time to really properly answer those because I’m not sure what the answer to
those questions are. “Am I tense? Am I not tense?” I don’t know at certain times,
especially when I have this deadline to respond right away.
-9310, 27M
I feel like when it would ask me, “Oh, are you feeling this, are you feeling that?”
Oftentimes, I’m not really actively thinking about how stressed I am, how tense I am, so
oftentimes, I would just click the middle option, because I’m not really strongly feeling
anything.
-9314, 27F
More moderate responses may have also been due to the length of the study. By the end, study
participants found themselves just feeling neutral as so many experiences of positive and
negative emotions had been balanced out, and being able to indicate the feeling of strong
emotion was refreshing.
“I think there’s a degree to which almost everything is in the middle towards the last fourth
maybe the last three months of just feeling almost everything I’m answering is within a
certain little range...it felt everything sort of balanced each other out to just be moderate.
I’m always feeling moderate or whatever. I don’t think it’s inaccurate though, but—There
was a day where it was actually really nice to be like, “Yes, I am feeling extremely sad
today”
-9331, 28F
Theme 4: I fell into a consistent rhythm, but the attention check questions helped.
Over time, due to the habit of responding to burst surveys, sometimes participants got into
a “flow” of responding (indicative of IR), which may have resulted in poorer data quality. However,
participants often spoke about being able to catch themselves in this pattern and trying to make
sure the data was accurate.
I remember, for some time, I did not see the exercise and physical activity box. I did not
click that for some time. Then at some point, I went through, and I was like, “Oh wow.”
-9236, 24F
You know, that has happened a few times when I would go back because I’m in the flow of
hitting what I’m used to. I would go back, it’s like, “Wait, hang on.” I would say yes when
I am distracted with work especially, sometimes it would call it there would be 70%
accurate instead of a hundred percent accurate.
-9287, 29F
I guess if I was really busy but I had to answer a survey, then I would spend less time on
each question. I don’t know how much that affected my responses because a lot of the times
my responses were in the middle. If I felt like I answered a question incorrectly, I did go
back and fix it. I did this every time. Even when I was doing it pretty fast, I think I would
still go back for those.
-9314, 27F
I think I just got more used to the questions so less second-guessing “what are they asking
for” … I know there were times when I caught myself like, “Wait, I didn’t read that, go
back,” but I don’t think there was any rhyme or reason to that.
-9265, 27F
One participant particularly emphasized that falling into an overly consistent rhythm that may
indicate IR was the opposite of their response pattern and described her process of thoughtfully
selecting response options.
I’d say mine were really accurate. I honestly probably put a little too much thought into it.
I tend to be thoughtful and that’s probably why I avoided it more because I’m like, “Okay
how am I really feeling?” I feel like a lot of my responses felt generally the same except
when something did happen that day that was really different. Even though some of them
felt contradictory, I’m like, “Oh, I’m a little focused, but I’m also a little fatigued.” I would
say that I definitely didn’t just pick whatever, just to get all the way through. I would say
they’re as accurate as I could possibly make them.
-9270, 26F
The attention check questions were mentioned as a good tool to break up this repetitiveness or
throw off the pace. Participants were able to identify the purpose of these questions as a tool to
verify whether participants were paying attention; they did not find them annoying, and some
even found them amusing.
I love them. Sometimes I had to go back and like, so again, because I was anticipating it’s
a different question, we can go next. I’m like, wait, no, that was not the question. I would
go back and then select the correct one. [laughs]
-9287, 29F
It seemed like maybe they were there to keep people focused and not just like tapping
through the questions, which was nice, but I did feel sometimes because I knew what the
questions were going to be and like, I was going through them, especially on the burst
periods, you get one like that and it breaks my routine going through it and I was like, oh,
okay then you got to reset.
-9329, 23F
I thought they were semi-amusing. Like, ‘Oh are they trying to make sure I’m not just
going through the motions?” They were mildly amusing I guess is how I would describe
them. Not a big burden to have one more, “Okay, something different.”
- 9270, 26F
Theme 5: Different motivations for answering surveys may result in different levels of data
quality.
Monetary compensation was the primary motivating factor for most participants, who were
particularly aware of the minimum response rate of eight surveys per day and put in effort to
make sure they reached this threshold.
Sometimes I knew that minimum I have to answer eight. I used to make sure that in the
morning when I’m at home, I’ll keep my phone with me just so that I can answer at least
eight. If I’m home the whole day, sometimes I used to answer even more. After the eight
were done, it at least mentally, I was like, “Okay, I’m done now”... I wouldn’t have to be
consciously looking at my phone all the time.
- 9287, 29F
If it was a day where there’s no way that I get anywhere close to the eight, I would just not
bother. On the flip side, if there was a day when it’s really close to having those eight
surveys, then I would try to make an effort to get from seven to eight.
- 9245, 28F
Oh, definitely while cooking. There were a couple of times when I knew that I still needed
to do a few more for the minimum eight surveys that I needed to answer. I would answer
with half-greasy hands.
-9314, 27F
[T]here was a little stress trying to hit the eight surveys a day so I always wanted to make
sure any opportunity I get I could try to answer them because I don’t know why I would
just miss one or two, but I’d always be like one or two shy and I’d be in despair.
- 9271, 23F
Beyond compensation, some participants felt a sense of responsibility and external accountability to the study once the protocol became a habit.
I felt a responsibility towards the study to answer questions. Obviously, the money
incentive is a thing, too. I think that’s in the back of my head, though. I didn’t say, “Oh,
I’m going to not make money if I don’t do this.” I think for the most part, in reality, it was
just a habit. Over time, it became a habit and a sense of responsibility to this study. I just
felt like I wanted to, so I did.
- 9307, 25M
Participants mentioned that the study became easier over time as the surveys became a habit, one that was a positive aspect of their lives and emotionally rewarding.
I actually think that as the study wore on, I was more able to answer the questions and
the survey properly because it was easier for me to gauge how I was feeling. I’m thinking
about certain things just because I was more practiced at it of like checking, doing the
whole mental check-ins every hour of, am I actually stressed? Am I frustrated? Or am I
nervous or am I tense? Distinguishing between those things was a nice skill, I guess.
[chuckles]
- 9296, 25F
It helped me to reflect on how I was feeling. I did actually like that aspect of it, making
me stop and think and be a bit more mindful throughout the day of how I was doing. Also,
it helped me to realize a lot more this entire study, how active I actually am. That
definitely has had a very positive impact.
- 9248, 29F
I used to take a moment and I think about what I am feeling in the moment to answer the
questions correctly, which has been pretty helpful in tethering myself, very emotionally
rewarding almost if it makes sense. Reminder to slow down.
- 9266, 21F
I looked forward to something to do and also it was very simple and also in a small way,
it was like having a buddy [chuckles] checking on you. How are you feeling? Are you
planning to exercise? Do you sit? Do you eat? It reminded me to not just stay stationary.
I think it just encouraged me to think about things I wouldn’t always think about.
-9271, 23F
Discussion
The design of future intensive EMA studies and the quality of the data captured will
depend on a better understanding of participants’ experiences. This will involve accepting that
research participants are not just passive subjects but active contributors and, in turn, highlighting
the complex nature of the relationships between participant and collaborator, methodology and
technology, use and design. This study contributes to the limited research examining participants'
perceptions of the quality of the data provided in an intensive longitudinal health behavior study.
As one of the first studies that collected EMA data intensively for one year, the TIME Study was
a novel avenue to examine the prevalence of IR, changes in data quality, and the potential
application of attention check questions both as a method to validate attention and as a means to intervene and improve data quality.
Thematic analysis revealed five overarching themes that shed light on the factors
influencing data quality and participant engagement in EMA studies. Findings suggest the importance of social context: the presence of understanding and accommodating individuals in participants' social circles appeared to alleviate the perceived burden of participating in the EMA
study. Quick response times can be attributed to time constraints, distraction, or a desire to
complete the surveys quickly to avoid disruption of daily activities. The findings suggest that
optimizing the survey length and minimizing disruptions could enhance data quality. The third
theme highlighted the dynamic nature of participants' response patterns over time. Participants
reported external factors such as mood, stress, and daily routines influencing their willingness to
provide accurate and thoughtful responses. This insight underscores the importance of
considering and potentially accommodating fluctuations in study design and analysis. The fourth
theme revealed that participants generally recognized the attention check questions as a technique used to test data integrity; thus, reactivity to these questions could potentially be a tool for researchers. The fifth theme shows that monetary incentives were not the only driving force behind participants' sustained engagement; participants also expressed a sense
of curiosity, personal interest, and a desire to contribute to scientific knowledge as important
motivations for their continued participation in the EMA study. Understanding these intrinsic
motivators can inform future study recruitment strategies and participant retention efforts.
The findings from this study align well with previous quantitative studies. Our results
indicate that the social context is an important factor in young adults' routines, and conceptualizations of the burden associated with an EMA study may need to expand to include the disruption of others around study
participants. van Roekel et al. found that participants perceived surveys to be more inconvenient
in public places (van Roekel, 2019). Participants' descriptions of changes in response patterns
toward quicker responses and decreased variability are consistent with previous quantitative
research, but this study sheds light on reasons these patterns begin and suggests that these commonly assumed indices of IR may reflect normal shifts over time (Arslan et al., 2021; Fuller-
Tyszkiewicz et al., 2017). Many participants reported becoming more habituated to the measures
as the study went on, which has been previously reported (Paterson et al., 2020).
The results were also anecdotally consistent with previous qualitative studies that examined accuracy in EMA studies. In a two-week EMA study, van Berkel et al. found that participants identified their current mental state as the biggest influence on the accuracy of answers, with tiredness, distraction, and low concentration as negative factors (van Berkel et al., 2020). Eisele et al. also had participants in a 14-day study describe a stabilization effect, with some participants reporting that the initial excitement wore off and the questions became boring, and others reporting becoming familiar with the routine of surveys. Participants in that study also reported changes in response patterns, with 56% reporting an increase in habitual survey responding over time, which included learning the order of questions and increased familiarity with the questions, leading to easier and faster responses. In the same study, participants also reported higher awareness of their emotions over time due to repeated assessments, which may suggest reactivity. The researchers took a further step in matching their interview responses with quantitative data that showed decreases in response variability during the study period. Participants in the Eisele et al. study did not report changing their behavior or routines to avoid missing assessments, but this may have been due to the shorter study duration (Eisele, 2021). Given the similar themes in the qualitative data collected, a future direction may be to further examine the response variability of the TIME participants we interviewed.
A major contribution of this study is the feedback participants provided on the attention check questions, which demonstrated their acceptability; none of the participants mentioned any major concerns about the items. Researchers have proposed that participants may become reactive to these questions and start to look out for the items (Rintala et al., 2020; Welling et al., 2021). However, the interviews suggest a positive form of reactivity: the attention checks made participants more aware that their accuracy was being monitored, which may have increased data quality. Given the lack of expressed concerns and the reported potential benefits, it is recommended that researchers include attention checks in future EMA studies even though this may increase survey length or response time.
This study has some limitations. The study was not free of bias, including sampling bias
and recall bias. Overall, there are likely personality and demographic differences between
participants who consent to participate in a year-long study and the general public. In the coding process, not every code was applicable to all participants, which may have skewed the number of codes applied per transcript. In general, participants reported having a positive experience with the study, but interviews were only conducted with participants who completed an extremely intensive year-long study. The participants who had more negative
experiences likely withdrew from the study. There may also have been reporting bias, as some
opinions or information may have been actively withheld or suppressed by the participant.
Despite much discussion on the repetitiveness of the surveys and lack of variability in responses,
surprisingly, none of the participants proposed in interviews that the study could be shorter or that the hourly surveys were too frequent. Moreover, given the length of the parent study, participants
may have experienced recall bias trying to remember feelings and emotions across the year and
trying to describe their own experiences. Anecdotally, some of the interviews were less
informative because it was difficult for the interviewer to get participants to open up about their
experiences, and some of the transcripts had brief responses. However, even with less
informative interviews, existing sample size guidelines suggest that a range between 20 and 30
interviews is adequate for qualitative interview protocols (Creswell & Poth, 2016). Moreover,
since our study population was well-defined, it can be assumed that the phenomenon of IR is likely homogeneous across participants, which resulted in reaching theoretical saturation (no new information emerging in each category) for our focused research aims even with fewer
participants.
Overall, the results from this qualitative study provide insights into the complex process
of IR and participant engagement in EMA studies. The identified themes offer practical
implications for researchers, including the importance of social context and support, the
consideration of dynamic motivations, the utility of attention check questions, and the further
exploration of potential intrinsic motivators for completing EMA. By incorporating these
insights, researchers can improve participant engagement, enhance data quality, and maximize
the scientific value of EMA studies.
CONCLUSION AND IMPLICATIONS
Low-quality EMA data alters research conclusions and has a profound effect on our
understanding of health behavior and the future development of policies and interventions. As
study designs become more sophisticated, the measurement of factors that contribute to declines
in response accuracy will grow in importance. Despite concerns about data quality with intensive
longitudinal data collection methods, little is currently known about the prevalence of IR and the
conditions under which it is more likely to occur. These dissertation studies are a first step in
better understanding and handling data quality issues in EMA. Overall, these studies contribute
methodological knowledge to the field and these analyses have implications for the design of
future EMA studies through recommendations about periods to avoid surveying (Study 2) and
contribute to the post-hoc detection of IR (Study 1). Based on the themes extracted from the interviews, social context at survey delivery may be an unrecognized element of the burden associated with EMA that affects response patterns, and there may be intrinsic motivators for
participants to continue completing EMA data collection (Study 3).
While these studies were intended to focus on poor data quality and why people "choose" to be inattentive, the low prevalence of IR in our data means that the results also increase the field's understanding of what makes for more attentive responses and good data quality. It has been proposed that the field may need a new type of context-sensitive EMA surveying that reduces interruption, in which the system only delivers surveys when a participant is available and responses would be accurate; the results of Study 2 and Study 3 may offer additional evidence for this approach by showing qualitatively and quantitatively that surveys delivered at home and alone have a higher likelihood of being attentive (van Berkel et al., 2020). Moreover, given the low prevalence of IR found in our one-year intensive EMA study, this dissertation could also reduce concern about IR as a unique potential limitation of real-time longitudinal data capture methods like ecological momentary assessment as a tool for the assessment of health behaviors.
Contributions to the field
These studies and the TIME Study also contribute to the field by demonstrating the
feasibility of conducting a one-year continuous EMA study in young adults and collecting high-
quality data. Unique aspects of the one-year EMA protocol in the current study may have reduced IR. First, participants were trained at orientation and there was a
run-in period for the study in which we eliminated participants with initially poor compliance.
We aimed to ensure that participants were committed to responding to as many surveys as
possible and that participants would aim to provide high-quality and accurate responses.
Moreover, through the qualitative interviews, it seems like the implementation of
attention check questions may have increased data quality. Much research has been conducted to
examine the utility of attention check items. Researchers have warned about the unintended
consequences or biases introduced by the adoption of attention check items to EMA (Welling et al., 2021), so it was unexpected to find a potential benefit instead. Attention check questions are a straightforward way for researchers to identify and subsequently exclude inattentive participants
from their analysis (Kittur et al., 2013). Many researchers may not be confident in their ability to
apply post-hoc methods that require conducting statistical analyses on response patterns. While
there are other easily collected indices to screen for IR, such as response time, a participant's
completion speed on a survey may be influenced by the device administering the survey or
affected by the strength and speed of the Internet connection (in the case of web-based surveys).
Moreover, some researchers have found that removing participants based only on the criteria of
responding too quickly did not alter the results of their cross-sectional models, meaning that
response speed alone may not indicate IR (Anduiza & Galais, 2017; Greszki et al., 2015;
Gummer et al., 2021). Additionally, some survey platforms do not have the ability to capture
time stamps. The alternative straightforward method for researchers to use is directly asking
participants within the survey or retrospectively if they engaged in IR and if the data should be
used in analysis (self-report IR). However, this method is flawed because the self-report IR item
itself may be subject to IR and unreliable. The measure relies on the unlikely assumption that an
otherwise inattentive respondent will identify and carefully read the self-report item, be aware that they responded inattentively, and respond to the item honestly.
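To make the contrast between these post-hoc approaches concrete, the sketch below illustrates one way simple speed-based screening and attention check scoring could be combined in analysis code. It is a minimal illustration only, assuming a hypothetical long-format dataset with participant_id, survey_id, response_seconds, and attention_check_correct columns; the column names and thresholds are assumptions for demonstration and do not reproduce the detection indices developed in Study 1.

```python
# Minimal sketch of post-hoc IR screening on EMA survey data.
# Assumes a hypothetical long-format table with one row per completed survey;
# column names and thresholds are illustrative, not the Study 1 indices.
import pandas as pd

def flag_inattentive_surveys(df: pd.DataFrame,
                             n_items: int = 10,
                             min_seconds_per_item: float = 1.0,
                             max_check_failure_rate: float = 0.2) -> pd.DataFrame:
    """Flag fast surveys and summarize attention check failures per participant."""
    df = df.copy()
    # Speed-based flag: total completion time below a per-item floor.
    df["too_fast"] = df["response_seconds"] < n_items * min_seconds_per_item

    # Per-participant summary combining both screening signals.
    summary = df.groupby("participant_id").agg(
        n_surveys=("survey_id", "count"),
        pct_too_fast=("too_fast", "mean"),
        check_failure_rate=("attention_check_correct", lambda s: 1 - s.mean()),
    )
    summary["review_for_IR"] = (
        (summary["check_failure_rate"] > max_check_failure_rate)
        | (summary["pct_too_fast"] > 0.5)
    )
    return summary

# Example usage (file name is hypothetical):
# surveys = pd.read_csv("ema_surveys.csv")
# print(flag_inattentive_surveys(surveys))
```

In practice, any speed-based flag would need calibration against the device and connectivity differences noted above, and attention check failures would remain the more direct signal.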
Potential recommendations for the field
When designing an EMA study, researchers must make many study design decisions.
While there are handbooks published with recommendations (most recently Myin-Germeys &
Kuppens, 2021), optimal EMA study design is highly dependent on the research question,
study population, and available research infrastructure. Given that EMA studies are burdensome
for researchers in terms of cost and time, enhancing data quality should be a priority for
researchers. Study 2 provides further support that future studies could use phone usage data and
contextual information to explore ways to infer participant ability to answer an EMA
questionnaire (Liono et al., 2019; Mehrotra et al., 2015). Given participants’ discussion of the
repetitive nature of the surveys, researchers should aim to diversify the types of answer options
commonly used in EMA or randomize the order of questions.
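As a small illustration of the last recommendation, the sketch below shows one way question order could be shuffled for each delivered survey; the question wording is a placeholder, not the TIME Study items.

```python
# Minimal sketch of per-survey question-order randomization, one simple way to
# break up the repetitive feel participants described. Question texts are
# illustrative placeholders, not the TIME Study item wording.
import random
from typing import List, Optional

QUESTIONS = [
    "How stressed do you feel right now?",
    "How fatigued do you feel right now?",
    "Are you currently sitting?",
    "Did you exercise in the past hour?",
]

def build_survey(questions: List[str], seed: Optional[int] = None) -> List[str]:
    """Return a shuffled copy of the question list for a single survey prompt."""
    rng = random.Random(seed)
    shuffled = list(questions)
    rng.shuffle(shuffled)
    return shuffled

# Each delivered survey gets its own ordering:
# print(build_survey(QUESTIONS))
```

Whether randomization is appropriate will depend on whether item-order effects matter for the constructs being measured, so this is a design option to weigh rather than a blanket recommendation.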
Additionally, resources to reduce researcher burden in detecting IR should be developed.
In an ideal study, researchers would monitor data integrity the same way many studies currently
check compliance rates. If programs that can automatically screen for IR can be developed,
dashboards could be created for investigators to use, and emails/text messages could be
automatically sent to participants about responding to their surveys more thoughtfully.
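A rough sketch of what such automated monitoring might look like is shown below. It builds on the hypothetical screening summary from the earlier sketch, and the flagging threshold, reminder wording, and send_reminder function are assumptions for illustration rather than existing TIME Study infrastructure.

```python
# Illustrative sketch of an automated IR-monitoring loop that could feed a
# dashboard or a reminder message to participants. flag_inattentive_surveys()
# from the earlier sketch, the reminder text, and send_reminder() are all
# hypothetical examples, not existing study tooling.
import pandas as pd

REMINDER_TEXT = (
    "We noticed some of your recent surveys may have been answered very quickly. "
    "Please take a moment to read each question carefully - thoughtful answers "
    "make the study data much more useful. Thank you!"
)

def send_reminder(participant_id: str, message: str) -> None:
    # Placeholder for an email/SMS integration; here it just prints.
    print(f"[to {participant_id}] {message}")

def monitor_data_integrity(summary: pd.DataFrame) -> None:
    """Send a reminder to every participant flagged for possible IR."""
    flagged = summary[summary["review_for_IR"]]
    for participant_id in flagged.index:
        send_reminder(participant_id, REMINDER_TEXT)

# Example usage, continuing from the earlier sketch:
# summary = flag_inattentive_surveys(surveys)
# monitor_data_integrity(summary)
```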
Limitations
It is important to recognize the limiting factors of these dissertation studies. The
participant population consists of young adults who were recruited online and were comfortable
using devices such as smartwatches. The analyses lacked variability in age and familiarity with
technology. The method for objectively determining response quality using attention check
questions will not apply to datasets already collected in most existing EMA studies because such
questions are rarely included. Our results likely do not generalize to other datasets due to the
longer length of the TIME Study compared to typical EMA studies (365 days vs 7-28 days) and
the density of surveying (every hour during waking hours during burst periods). Only affect data
was assessed in the EMA surveys categorized in Study 1. While changes in affect are commonly
assessed in EMA studies as affect is prone to reactive changes over time, the developed detection
indices will perform differently with variables that are more reflective or have multiple answer
options. A further limitation is that the TIME Study collected passive and contextual data that may not be collected in other studies, so the findings in Study 2 may not be
easily replicated. Moreover, although a wide range of temporal and contextual variables were
analyzed, there are additional unmeasured variables that may affect participant accuracy. One
example is participants’ negative feeling states, but the self-report measures assessing these
factors were within the same survey for which we assessed data validity. Because the negative
feeling state data itself may be invalid if the survey was completed inattentively, we cannot use
those data as a factor to predict IR.
Final thoughts
There is great potential for EMA to alleviate some of the difficulties involved in the
assessment of complex, periodic health behaviors, but it is important to identify optimal ways to
use EMA to collect self-report data. As EMA study designs become more sophisticated, the
measurement of factors that contribute to declines in response accuracy will grow in importance.
Results from this dissertation provide insights into the detection of IR, factors that may improve
the reliability and validity of EMA data, and considerations for researchers when designing their
studies to assess and maximize data quality.
REFERENCES
Ahmed, A. O., & Jenkins, B. (2013). Critical Synthesis Package: Ten-Item Personality Inventory
(TIPI)—A Quick Scan of Personality Structure. MedEdPORTAL, 9427.
https://doi.org/10.15766/mep_2374-8265.9427
Allman-Farinelli, M., Partridge, S. R., & Roy, R. (2016). Weight-Related Dietary Behaviors in
Young Adults. Current Obesity Reports, 5(1), 23–29. https://doi.org/10.1007/s13679-
016-0189-8
Perrin, A. (2021). Mobile Technology and Home Broadband 2021.
https://www.pewresearch.org/internet/2021/06/03/mobile-technology-and-home-
broadband-2021/
Anduiza, E., & Galais, C. (2017). Answering without reading: IMCs and strong satisficing in
online surveys. International Journal of Public Opinion Research, 29(3), 497–519.
Arnett, J. J. (2000). Emerging adulthood: A theory of development from the late teens through
the twenties. American Psychologist, 55(5), 469–480. https://doi.org/10.1037/0003-
066X.55.5.469
Arslan, R. C., Reitz, A. K., Driebe, J. C., Gerlach, T. M., & Penke, L. (2021). Routinely
randomize potential sources of measurement reactivity to estimate and adjust for biases in
subjective reports. Psychological Methods, 26(2), 175–185.
https://doi.org/10.1037/met0000294
Baer, R. A., Ballenger, J., Berry, D. T. R., & Wetter, M. W. (1997). Detection of Random
Responding on the MMPI--A. Journal of Personality Assessment, 68(1), 139–151.
https://doi.org/10.1207/s15327752jpa6801_11
Banerjee, A. (2012). A review of family history of cardiovascular disease: Risk factor and
research tool. International Journal of Clinical Practice, 66(6), 536–543.
Beal, D. J., & Weiss, H. M. (2003). Methods of Ecological Momentary Assessment in
Organizational Research. Organizational Research Methods, 6(4), Article 4.
https://doi.org/10.1177/1094428103257361
Beilock, S. L., & DeCaro, M. S. (2007). From poor performance to success under stress:
Working memory, strategy selection, and mathematical problem solving under pressure.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(6), 983–998.
https://doi.org/10.1037/0278-7393.33.6.983
Berinsky, A. J., Margolis, M. F., & Sances, M. W. (2014). Separating the Shirkers from the Workers? Making Sure Respondents Pay Attention on Self-Administered Surveys. American Journal of Political Science, 58(3), 739–753. https://doi.org/10.1111/ajps.12081
Börnhorst, C., Russo, P., Veidebaum, T., Tornaritis, M., Molnár, D., Lissner, L., Mårild, S., De
Henauw, S., Moreno, L. A., & Floegel, A. (2020). The role of lifestyle and non-
modifiable risk factors in the development of metabolic disturbances from childhood to
adolescence. International Journal of Obesity, 44(11), 2236–2245.
Bowling, N. A., Huang, J. L., Bragg, C. B., Khazon, S., Liu, M., & Blackmore, C. E. (2016).
Who cares and who is careless? Insufficient effort responding as a reflection of
respondent personality. Journal of Personality and Social Psychology, 111(2), 218–229.
https://doi.org/10.1037/pspp0000085
Brosnan, K., Babakhani, N., & Dolnicar, S. (2019). “I know what you’re going to ask me” Why
respondents don’t read survey questions. International Journal of Market Research,
61(4), 366–379. https://doi.org/10.1177/1470785318821025
Brower, C. K. (2018). Too long and too boring: The effects of survey length and interest on
careless responding.
Cartwright, J. (2016). Technology: Smartphone science. Nature, 531(7596), 669–671.
https://doi.org/10.1038/nj7596-669a
Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral activation, and affective
responses to impending reward and punishment: The BIS/BAS Scales. Journal of
Personality and Social Psychology, 67(2), Article 2. https://doi.org/10.1037/0022-
3514.67.2.319
Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon Mechanical Turk
workers: Consequences and solutions for behavioral researchers. Behavior Research
Methods, 46(1), 112–130. https://doi.org/10.3758/s13428-013-0365-7
Charmaz, K. (2006). Constructing grounded theory: A practical guide through qualitative
analysis. Sage.
Cherenack, E. M., Wilson, P. A., Kreuzman, A. M., Price, G. N., & the Adolescent Medicine Trials Network for HIV/AIDS Interventions. (2016).
The Feasibility and Acceptability of Using Technology-Based Daily Diaries with HIV-
Infected Young Men Who have Sex with Men: A Comparison of Internet and Voice
Modalities. AIDS AND BEHAVIOR, 20(8), Article 8. https://doi.org/10.1007/s10461-016-
1302-4
Chiolero, A., Faeh, D., Paccaud, F., & Cornuz, J. (2008). Consequences of smoking for body
weight, body fat distribution, and insulin resistance. The American Journal of Clinical
Nutrition, 87(4), 801–809. https://doi.org/10.1093/ajcn/87.4.801
Christensen, T. C., Barrett, L. F., Bliss-Moreau, E., Lebo, K., & Christensen, T. C. (2003a). A
Practical Guide to Experience-Sampling Procedures. Journal of Happiness Studies, 4(1),
53–78. https://doi.org/10.1023/A:1023609306024
Christensen, T. C., Barrett, L. F., Bliss-Moreau, E., Lebo, K., & Christensen, T. C. (2003b). A
Practical Guide to Experience-Sampling Procedures. Journal of Happiness Studies, 4(1),
53–78. https://doi.org/10.1023/A:1023609306024
Church, A. H. (1993). Estimating the Effect of Incentives on Mail Survey Response Rates: A
Meta-Analysis. Public Opinion Quarterly, 57(1), 62. https://doi.org/10.1086/269355
Clifford, S., & Jerit, J. (2014). Is there a cost to convenience? An experimental comparison of
data quality in laboratory and online studies. Journal of Experimental Political Science,
1(2), 120–131.
Courvoisier, D. S., Eid, M., & Lischetzke, T. (2012). Compliance to a cell phone-based
ecological momentary assessment study: The effect of time and personality
characteristics. Psychological Assessment. https://doi.org/10.1037/a0026733
Cowan, N. (2012). Working Memory Capacity (0 ed.). Psychology Press.
https://doi.org/10.4324/9780203342398
Credé, M. (2010). Random responding as a threat to the validity of effect size estimates in
correlational research. Educational and Psychological Measurement, 70(4), Article 4.
https://doi.org/10.1177/0013164410366686
Creswell, J. W., & Poth, C. N. (2016). Qualitative inquiry and research design: Choosing among
five approaches. Sage publications.
Csikszentmihalyi, M. (1992). The experience of psychopathology: Investigating mental disorders
in their natural settings. Cambridge University Press.
Csikszentmihalyi, M. (2014). Flow and the Foundations of Positive Psychology: The Collected
Works of Mihaly Csikszentmihalyi (1st ed. 2014). Springer Netherlands : Imprint:
Springer. https://doi.org/10.1007/978-94-017-9088-8
Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data.
Journal of Experimental Social Psychology, 66, 4–19.
https://doi.org/10.1016/j.jesp.2015.07.006
Dao, K. P., De Cocker, K., Tong, H. L., Kocaballi, A. B., Chow, C., & Laranjo, L. (2021).
Smartphone-Delivered Ecological Momentary Interventions Based on Ecological
Momentary Assessments to Promote Health Behaviors: Systematic Review and Adapted
Checklist for Reporting Ecological Momentary Assessment and Intervention Studies.
JMIR MHealth and UHealth, 9(11), e22890. https://doi.org/10.2196/22890
Denison, A. J. (2022). Prevalence and Predictors of Careless Responding in Experience
Sampling Research.
Dietrich, J. J., Hornschuh, S., Khunwane, M., Makhale, L. M., Otwombe, K., Morgan, C.,
Huang, Y., Lemos, M., Lazarus, E., Kublin, J. G., Gray, G. E., Laher, F., & Andrasik, M.
(2020). A mixed methods investigation of implementation barriers and facilitators to a
daily mobile phone sexual risk assessment for young women in Soweto, South Africa.
PLoS ONE. https://doi.org/10.1371/journal.pone.0231086
Doherty, K., Balaskas, A., & Doherty, G. (2020). The Design of Ecological Momentary
Assessment Technologies. Interacting with Computers, 32(3), 257–278.
https://doi.org/10.1093/iwcomp/iwaa019
Duncan, D. T., Kapadia, F., Kirchner, T. R., Goedel, W. C., Brady, W. J., & Halkitis, P. N.
(2017). Acceptability of ecological momentary assessment among young men who have
sex with men. JOURNAL OF LGBT YOUTH, 14(4), 436–444.
https://doi.org/10.1080/19361653.2017.1365038
Duncan, D. T., Park, S. H., Goedel, W. C., Sheehan, D. M., Regan, S. D., & Chaix, B. (2019).
Acceptability of smartphone applications for global positioning system (GPS) and
ecological momentary assessment (EMA) research among sexual minority men. PLOS
ONE, 14(1). https://doi.org/10.1371/journal.pone.0210240
Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Theilgard, N. (2018). Intra-individual
Response Variability as an Indicator of Insufficient Effort Responding: Comparison to
Other Indicators and Relationships with Individual Differences. Journal of Business and
Psychology, 33(1), Article 1. https://doi.org/10.1007/s10869-016-9479-0
Dunton, G. F., Liao, Y., Kawabata, K., & Intille, S. (2012). Momentary assessment of adults’
physical activity and sedentary behavior: Feasibility and validity. Frontiers in
Psychology, 3, 9. https://doi.org/10.3389/fpsyg.2012.00260
Dunton, G., Intille, S., Wolch, J., & Pentz, M. (2012). Children’s perceptions of physical activity
environments captured through ecological momentary assessment: A validation study.
PREVENTIVE MEDICINE, 55(2), 119–121. https://doi.org/10.1016/j.ypmed.2012.05.015
Dzubur, E. (2017). Understanding the Methodological Limitations in the Ecological Momentary
Assessment of Physical Activity [PhD Thesis]. University of Southern California.
Eisele, G. V. (2021). The Influence of Methodological Choices on Data Quality and Quantity in
Experience Sampling Studies.
Farzan, R., DiMicco, J. M., Millen, D. R., Dugan, C., Geyer, W., & Brownholtz, E. A. (2008).
Results from deploying a participation incentive mechanism within the enterprise.
Proceeding of the Twenty-Sixth Annual CHI Conference on Human Factors in
Computing Systems - CHI ’08, 563. https://doi.org/10.1145/1357054.1357145
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–
874. https://doi.org/10.1016/j.patrec.2005.10.010
Fleischer, A., Mead, A. D., & Huang, J. (2015). Inattentive Responding in MTurk and Other
Online Samples. Industrial and Organizational Psychology, 8(2), 196–202.
https://doi.org/10.1017/iop.2015.25
Friedenreich, C. M., Ryder‐Burbidge, C., & McNeil, J. (2021). Physical activity, obesity and
sedentary behavior in cancer etiology: Epidemiologic evidence and biologic mechanisms.
Molecular Oncology, 15(3), 790–800.
Fuller, J., Gonzales, M., & Rice, K. (2015). Physical activity levels among on campus and online
college students. 8(3), 21.
Fuller-Tyszkiewicz, M., Hartley-Clark, L., Cummins, R. A., Tomyn, A. J., Weinberg, M. K., &
Richardson, B. (2017). Using dynamic factor analysis to provide insights into data
reliability in experience sampling studies. Psychological Assessment, 29(9), 1120–1128.
https://doi.org/10.1037/pas0000411
Geeraerts, J., & Kuppens, P. (2020). Investigating Careless Responding Detection Techniques in
Experience Sampling Methods (ESM).
Gilman, T. L., Shaheen, R., Nylocks, K. M., Halachoff, D., Chapman, J., Flynn, J. J., Matt, L.
M., & Coifman, K. G. (2017). A film set for the elicitation of emotion in research: A
comprehensive catalog derived from four decades of investigation. Behavior Research
Methods, 49(6), 2061–2082. https://doi.org/10.3758/s13428-016-0842-x
Gooding, H. C., Milliren, C., Shay, C. M., Richmond, T. K., Field, A. E., & Gillman, M. W.
(2016). Achieving Cardiovascular Health in Young Adulthood—Which Adolescent
Factors Matter? Journal of Adolescent Health, 58(1), 119–121.
https://doi.org/10.1016/j.jadohealth.2015.09.011
Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big-Five
personality domains. Journal of Research in Personality, 37(6), 504–528.
https://doi.org/10.1016/S0092-6566(03)00046-1
Greszki, R., Meyer, M., & Schoen, H. (2015). Exploring the effects of removing “too fast”
responses and respondents from web surveys. Public Opinion Quarterly, 79(2), 471–503.
Gummer, T., Roßmann, J., & Silber, H. (2021). Using instructed response items as attention
checks in web surveys: Properties and implementation. Sociological Methods &
Research, 50(1), 238–264.
Halpin, H. A., Morales-Suárez-Varela, M. M., & Martin-Moreno, J. M. (2010). Chronic disease
prevention and the new public health. Public Health Reviews, 32(1), 120–154.
Hauser, D. J., & Schwarz, N. (2015). It's a trap! Instructional manipulation checks prompt systematic thinking on "tricky" tasks. Sage Open, 5(2), 2158244015584617.
Heimpel, S. A., Elliot, A. J., & Wood, J. V. (2006). Basic Personality Dispositions, Self-Esteem,
and Personal Goals: An Approach-Avoidance Analysis. Journal of Personality, 74(5),
1293–1320. https://doi.org/10.1111/j.1467-6494.2006.00410.x
Helgeson, V. S., Palladino, D. K., Reynolds, K. A., Becker, D. J., Escobar, O., & Siminerio, L.
(2014). Relationships and health among emerging adults with and without Type 1
diabetes. Health Psychology: Official Journal of the Division of Health Psychology,
American Psychological Association, 33(10), 1125–1133.
https://doi.org/10.1037/a0033511
Hong, M., Steedle, J. T., & Cheng, Y. (2020). Methods of Detecting Insufficient Effort
Responding: Comparisons and Practical Recommendations. Educational and
Psychological Measurement, 80(2), 312–345. https://doi.org/10.1177/0013164419865316
Hsieh, G., Li, I., Dey, A., Forlizzi, J., & Hudson, S. E. (2008). Using visualizations to increase
compliance in experience sampling. Proceedings of the 10th International Conference on
Ubiquitous Computing - UbiComp ’08, 164. https://doi.org/10.1145/1409635.1409657
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and
Deterring Insufficient Effort Responding to Surveys. Journal of Business and
Psychology, 27(1), Article 1. https://doi.org/10.1007/s10869-011-9231-8
Huang, J. L., Liu, M., & Bowling, N. A. (2015). Insufficient effort responding: Examining an
insidious confound in survey data. Journal of Applied Psychology, 100(3), 828–845.
https://doi.org/10.1037/a0038510
Hufford, M. R., & Shiffman, S. (2003). Assessment Methods for Patient-Reported Outcomes:
Disease Management & Health Outcomes, 11(2), 77–86.
https://doi.org/10.2165/00115677-200311020-00002
Hyman, I. E., Boss, S. M., Wise, B. M., McKenzie, K. E., & Caggiano, J. M. (2009). Did you see
the unicycling clown? Inattentional blindness while walking and talking on a cell phone.
Applied Cognitive Psychology, 24(5), 597–607. https://doi.org/10.1002/acp.1638
Intille, S., Haynes, C., Maniar, D., Ponnada, A., & Manjourides, J. (2016). μEMA:
Microinteraction-based ecological momentary assessment (EMA) using a smartwatch.
Proceedings of the 2016 ACM International Joint Conference on Pervasive and
Ubiquitous Computing, 1124–1128. https://doi.org/10.1145/2971648.2971717
Intille, S. S., Stone, A., Shiffman, S., Atienza, A., & Nebeling, L. (2007). Technological
innovations enabling automatic, context-sensitive ecological momentary assessment. The
Science of Real-Time Data Capture: Self-Reports in Health Research, 308–337.
Islami, F., Torre, L. A., & Jemal, A. (2015). Global trends of lung cancer mortality and smoking
prevalence. Translational Lung Cancer Research, 4(4), 327.
Jaso, B. A., Kraus, N. I., & Heller, A. S. (2021). Identification of careless responding in
ecological momentary assessment research: From posthoc analyses to real-time data
monitoring. Psychological Methods. https://doi.org/10.1037/met0000312
Jeon, K. J., Lee, O., Kim, H.-K., & Han, S. N. (2011). Comparison of the dietary intake and
clinical characteristics of obese and normal weight adults. Nutrition Research and
Practice, 5(4), 329–336. https://doi.org/10.4162/nrp.2011.5.4.329
Johnson, J. A. (2005). Ascertaining the validity of individual protocols from Web-based
personality inventories. Journal of Research in Personality, 39(1), 103–129.
https://doi.org/10.1016/j.jrp.2004.09.009
Johnston, D. W., & Johnston, M. (2013). Useful theories should apply to individuals. British
Journal of Health Psychology, 18(3), 469–473. https://doi.org/10.1111/bjhp.12049
Jones, A., Remmerswaal, D., Verveer, I., Robinson, E., Franken, I. H. A., Wen, C. K. F., &
Field, M. (2019). Compliance with ecological momentary assessment protocols in
substance users: A meta-analysis. Addiction, 114(4), Article 4.
https://doi.org/10.1111/add.14503
Judge, T. A., & Ilies, R. (2002). Relationship of personality to performance motivation: A meta-
analytic review. Journal of Applied Psychology, 87(4), 797.
Kanfer, R., Frese, M., & Johnson, R. E. (2017). Motivation related to work: A century of
progress. Journal of Applied Psychology, 102(3), 338.
Kannel, W. B., & Vasan, R. S. (2009). Is age really a non-modifiable cardiovascular risk factor?
American Journal of Cardiology, 104(9), 1307–1310.
Kim, D. S., McCabe, C. J., Yamasaki, B. L., Louie, K. A., & King, K. M. (2018). Detecting
random responders with infrequency scales using an error-balancing threshold. Behavior
Research Methods, 50(5), Article 5. https://doi.org/10.3758/s13428-017-0964-9
Kim, J., Marcusson-Clavertz, D., Yoshiuchi, K., & Smyth, J. M. (2019). Potential benefits of
integrating ecological momentary assessment data into mHealth care systems.
BioPsychoSocial Medicine, 13(1), 19. https://doi.org/10.1186/s13030-019-0160-5
Kim, Y., Dykema, J., Stevenson, J., Black, P., & Moberg, D. P. (2019). Straightlining: Overview
of Measurement, Comparison of Indicators, and Effects in Mail–Web Mixed-Mode
Surveys. Social Science Computer Review, 37(2), 214–233.
https://doi.org/10.1177/0894439317752406
Kimura, D., & Hampson, E. (1994). Cognitive Pattern in Men and Women Is Influenced by
Fluctuations in Sex Hormones. Current Directions in Psychological Science, 3(2), 57–61.
https://doi.org/10.1111/1467-8721.ep10769964
Kittur, A., Nickerson, J. V., Bernstein, M., Gerber, E., Shaw, A., Zimmerman, J., Lease, M., &
Horton, J. (2013). The future of crowd work. Proceedings of the 2013 Conference on
Computer Supported Cooperative Work - CSCW ’13, 1301.
https://doi.org/10.1145/2441776.2441923
Kohl, H. W., Fulton, J. E., & Caspersen, C. J. (2000). Assessment of Physical Activity among
Children and Adolescents: A Review and Synthesis. Preventive Medicine, 31(2), S54–
S76. https://doi.org/10.1006/pmed.1999.0542
Kuncel, R. B., & Fiske, D. W. (1974). Stability of Response Process and Response. Educational
and Psychological Measurement, 34(4), 743–755.
https://doi.org/10.1177/001316447403400401
Kurian, A. K., & Cardarelli, K. M. (2007). Racial and ethnic differences in cardiovascular
disease risk factors: A systematic review. Ethnicity and Disease, 17(1), 143.
Kwapil, T. R., Silvia, P. J., Myin-Germeys, I., Anderson, A. J., Coates, S. A., & Brown, L. H.
(2009). The social world of the socially anhedonic: Exploring the daily ecology of
asociality. Journal of Research in Personality, 43(1), 103–106.
https://doi.org/10.1016/j.jrp.2008.10.008
Lai, L. C. H., Cummins, R. A., & Lau, A. L. D. (2013). Cross-Cultural Difference in Subjective
Wellbeing: Cultural Response Bias as an Explanation. Social Indicators Research,
114(2), 607–619. https://doi.org/10.1007/s11205-012-0164-z
Land-Zandstra, A. M., Devilee, J. L. A., Snik, F., Buurmeijer, F., & van den Broek, J. M. (2016).
Citizen science on a smartphone: Participants’ motivations and learning. Public
Understanding of Science, 25(1), 45–60. https://doi.org/10.1177/0963662515602406
Larson, R., & Csikszentmihalyi, M. (2014). The Experience Sampling Method. In M.
Csikszentmihalyi (Ed.), Flow and the Foundations of Positive Psychology: The Collected
Works of Mihaly Csikszentmihalyi (pp. 21–34). Springer Netherlands.
https://doi.org/10.1007/978-94-017-9088-8_2
Lau, A. L. D., Cummins, R. A., & Mcpherson, W. (2005). An Investigation into the Cross-
Cultural Equivalence of the Personal Wellbeing Index. Social Indicators Research, 72(3),
403–430. https://doi.org/10.1007/s11205-004-0561-z
Lee, J. W., Jones, P. S., Mineyama, Y., & Zhang, X. E. (2002). Cultural differences in responses
to a likert scale. Research in Nursing & Health, 25(4), 295–306.
https://doi.org/10.1002/nur.10041
Leventhal, H. (1973). Changing attitudes and habits to reduce risk factors in chronic disease. The
American Journal of Cardiology, 31(5), 571–580.
Liao, Y., Intille, S. S., & Dunton, G. F. (2015). Using ecological momentary assessment to
understand where and with whom adults’ physical and sedentary activity occur.
International Journal of Behavioral Medicine, 22(1), 51–61.
Liono, J., Salim, F. D., van Berkel, N., Kostakos, V., & Qin, A. K. (2019). Improving
Experience Sampling with Multi-view User-driven Annotation Prediction. 2019 IEEE
International Conference on Pervasive Computing and Communications (PerCom), 1–11.
https://doi.org/10.1109/PERCOM.2019.8767394
Litman, L., Robinson, J., & Rosenzweig, C. (2015). The relationship between motivation,
monetary compensation, and data quality among US-and India-based workers on
Mechanical Turk. Behavior Research Methods, 47(2), 519–528.
Lukowicz, P., Pentland, S., & Ferscha, A. (2012). From Context Awareness to Socially Aware
Computing. IEEE Pervasive Computing, 11(1), 32–41.
https://doi.org/10.1109/MPRV.2011.82
Mackesy-Amiti, M. E., & Boodram, B. (2018). Feasibility of ecological momentary assessment
to study mood and risk behavior among young people who inject drugs. Drug and
Alcohol Dependence. https://doi.org/10.1016/j.drugalcdep.2018.03.016
Madden, T. J., Ellen, P. S., & Ajzen, I. (1992). A Comparison of the Theory of Planned Behavior
and the Theory of Reasoned Action. Personality and Social Psychology Bulletin, 18(1),
3–9. https://doi.org/10.1177/0146167292181001
Maniaci, M. R., & Rogge, R. D. (2014). Caring about carelessness: Participant inattention and its
effects on research. Journal of Research in Personality, 48(1), Article 1.
https://doi.org/10.1016/j.jrp.2013.09.008
Marjanovic, Z., Holden, R., Struthers, W., Cribbie, R., & Greenglass, E. (2015). The inter-item
standard deviation (ISD): An index that discriminates between conscientious and random
responders. Personality and Individual Differences, 84, 79–83.
https://doi.org/10.1016/j.paid.2014.08.021
Maslach, C., & Leiter, M. P. (2008). The truth about burnout: How organizations cause personal
stress and what to do about it. John Wiley & Sons.
McCarron, P., Smith, G. D., Okasha, M., & McEwen, J. (2000). Blood pressure in young
adulthood and mortality from cardiovascular disease. The Lancet, 355(9213), 1430–1431.
https://doi.org/10.1016/S0140-6736(00)02146-2
McGonagle, A. K. (2015). Participant Motivation: A Critical Consideration. Industrial and
Organizational Psychology, 8(2), 208–214. https://doi.org/10.1017/iop.2015.27
McGrath, R. E., Mitchell, M., Kim, B. H., & Hough, L. (2010). Evidence for response bias as a
source of error variance in applied assessment. Psychological Bulletin, 136(3), 450–470.
https://doi.org/10.1037/a0019216
McKee, H. C., Ntoumanis, N., & Taylor, I. M. (2014). An ecological momentary assessment of
lapse occurrences in dieters. Annals of Behavioral Medicine, 48(3), 300–310.
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data.
Psychological Methods, 17(3), Article 3. https://doi.org/10.1037/a0028085
Mehrotra, A., Vermeulen, J., Pejovic, V., & Musolesi, M. (2015). Ask, but don’t interrupt: The
case for interruptibility-aware mobile experience sampling. Proceedings of the 2015
ACM International Joint Conference on Pervasive and Ubiquitous Computing and
Proceedings of the 2015 ACM International Symposium on Wearable Computers -
UbiComp ’15, 723–732. https://doi.org/10.1145/2800835.2804397
Moore, S. C., Crompton, K., Van Goozen, S., Van Den Bree, M., Bunney, J., & Lydall, E.
(2013). A feasibility study of short message service text messaging as a surveillance tool
for alcohol consumption and vehicle for interventions in university students. BMC Public
Health. https://doi.org/10.1186/1471-2458-13-1011
Musthag, M., Raij, A., Ganesan, D., Kumar, S., & Shiffman, S. (2011). Exploring micro-
incentive strategies for participant compensation in high-burden studies. Proceedings of
the 13th International Conference on Ubiquitous Computing - UbiComp ’11, 435.
https://doi.org/10.1145/2030112.2030170
Myin-Germeys, I., & Kuppens, P. (Eds.). (2021). The open handbook of experience sampling
methodology: A step-by-step guide to designing, conducting, and analyzing ESM studies.
Amazon Italia Logistica.
Myin-Germeys, I., Oorschot, M., Collip, D., Lataster, J., Delespaul, P., & van Os, J. (2009).
Experience sampling research in psychopathology: Opening the black box of daily life.
Psychological Medicine, 39(9), 1533–1547. https://doi.org/10.1017/S0033291708004947
Nahum-Shani, I., Shaw, S. D., Carpenter, S. M., Murphy, S. A., & Yoon, C. (2022). Engagement
in digital interventions. American Psychologist.
Nam, S., Griggs, S., Ash, G. I., Dunton, G. F., Huang, S., Batten, J., Parekh, N., & Whittemore,
R. (2021). Ecological momentary assessment for health behaviors and contextual factors
in persons with diabetes: A systematic review. Diabetes Research and Clinical Practice,
174, 108745.
National Center for Chronic Disease Prevention and Health Promotion. (n.d.). About Chronic
Diseases. https://www.cdc.gov/chronicdisease/about/index.htm
Newell, S. A., Girgis, A., Sanson-Fisher, R. W., & Savolainen, N. J. (1999). The accuracy of
self-reported health behaviors and risk factors relating to cancer and cardiovascular
disease in the general population. American Journal of Preventive Medicine, 17(3), 211–
229. https://doi.org/10.1016/S0749-3797(99)00069-0
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in
web-based questionnaires: Which method to use? Journal of Research in Personality, 63,
1–11. https://doi.org/10.1016/j.jrp.2016.04.010
Ono, M., Schneider, S., Junghaenel, D. U., & Stone, A. A. (2019). What Affects the Completion
of Ecological Momentary Assessments in Chronic Pain Research? An Individual Patient
Data Meta-Analysis. Journal of Medical Internet Research, 21(2), e11398.
https://doi.org/10.2196/11398
Oppenheimer, D. M., Meyvis, T., & Davidenko, N. (2009). Instructional manipulation checks:
Detecting satisficing to increase statistical power. Journal of Experimental Social
Psychology, 45(4), 867–872. https://doi.org/10.1016/j.jesp.2009.03.009
Paterson, C., Primeau, C., & Lauder, W. (2020). What Are the Experiences of Men Affected by
Prostate Cancer Participating in an Ecological Momentary Assessment Study? Cancer
Nursing, 43(4), 300–310. https://doi.org/10.1097/NCC.0000000000000699
Pejovic, V., Lathia, N., Mascolo, C., & Musolesi, M. (2015). Mobile-Based Experience Sampling
for Behaviour Research. https://doi.org/10.48550/ARXIV.1508.03725
Pew Research Center. (2017). Mobile Fact Sheet. Pew Research Center: Internet & Technology.
Pinneau, S. R., & Milton, A. (1958). The ecological veracity of the self-report. The Journal of
Genetic Psychology, 93(2), 249–276.
Ponnada, A., Li, J., Wang, S., Wang, W.-L., Do, B., Dunton, G. F., & Intille, S. S. (2022).
Contextual Biases in Microinteraction Ecological Momentary Assessment (μEMA) Non-
response. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous
Technologies, 6(1), 1–24.
Raento, M., Oulasvirta, A., & Eagle, N. (2009). Smartphones: An Emerging Tool for Social
Scientists. Sociological Methods & Research, 37(3), 426–454.
https://doi.org/10.1177/0049124108330005
Ragsdale, J. M., Beehr, T. A., Grebner, S., & Han, K. (2011). An integrated model of weekday
stress and weekend recovery of students. International Journal of Stress Management,
18(2), 153–180. https://doi.org/10.1037/a0023190
Raphael, K. (1987). Recall bias: A proposal for assessment and control. International Journal of
Epidemiology, 16(2), 167–170.
Rintala, A., Wampers, M., Lafit, G., Myin-Germeys, I., & Viechtbauer, W. (2021). Perceived
disturbance and predictors thereof in studies using the experience sampling method.
Current Psychology. https://doi.org/10.1007/s12144-021-01974-3
Rintala, A., Wampers, M., Myin-Germeys, I., & Viechtbauer, W. (2020). Momentary predictors
of compliance in studies using the experience sampling method. Psychiatry Research,
286, 112896. https://doi.org/10.1016/j.psychres.2020.112896
van Roekel, E., Keijsers, L., & Chung, J. M. (2019). A Review of Current Ambulatory Assessment
Studies in Adolescent Samples and Practical Recommendations. Journal of Research on
Adolescence, 29(3), 560–577. https://doi.org/10.1111/jora.12471
Saint-Maurice, P. F., Graubard, B. I., Troiano, R. P., Berrigan, D., Galuska, D. A., Fulton, J. E.,
& Matthews, C. E. (n.d.). Estimated Number of Deaths Prevented Through Increased
Physical Activity Among US Adults. JAMA Internal Medicine.
Sallis, J. F., Buono, M. J., Roby, J. J., Micale, F. G., & Nelson, J. A. (1993). Seven-day recall
and other physical activity self-reports in children and adolescents: Medicine & Science
in Sports & Exercise, 25(1), 99–108. https://doi.org/10.1249/00005768-199301000-
00014
Schembre, S. M., Liao, Y., O’Connor, S. G., Hingle, M. D., Shen, S.-E., Hamoy, K. G., Huh, J.,
Dunton, G. F., Weiss, R., Thomson, C. A., & Boushey, C. J. (2018). Mobile Ecological
Momentary Diet Assessment Methods for Behavioral Research: Systematic Review.
JMIR MHealth and UHealth, 6(11), e11170. https://doi.org/10.2196/11170
Schmidt, C., Collette, F., Cajochen, C., & Peigneux, P. (2007). A time to think: Circadian
rhythms in human cognition. Cognitive Neuropsychology, 24(7), 755–789.
https://doi.org/10.1080/02643290701754158
Schmidt, H. (2016). Chronic disease prevention and health promotion. Public Health Ethics:
Cases Spanning the Globe, 137–176.
Schroeders, U., Schmidt, C., & Gnambs, T. (2021). Detecting Careless Responding in Survey
Data Using Stochastic Gradient Boosting. Educational and Psychological Measurement,
001316442110047. https://doi.org/10.1177/00131644211004708
Schwartz, S. J., & Petrova, M. (2019). Prevention Science in Emerging Adulthood: A Field
Coming of Age. Prevention Science, 20(3), 305–309. https://doi.org/10.1007/s11121-
019-0975-0
Scollon, C. N., Kim-Prieto, C., & Scollon, C. N. (2003). Experience Sampling: Promises and
Pitfalls, Strengths and Weaknesses. Journal of Happiness Studies, 4(1), 5–34.
https://doi.org/10.1023/A:1023605205115
Shiffman, S., Stone, A. A., & Hufford, M. R. (2008). Ecological Momentary Assessment.
Annual Review of Clinical Psychology, 4(1), 1–32.
https://doi.org/10.1146/annurev.clinpsy.3.022806.091415
Shiffman, S., & Waters, A. J. (2004). Negative Affect and Smoking Lapses: A Prospective
Analysis. Journal of Consulting and Clinical Psychology, 72(2), 192–201.
https://doi.org/10.1037/0022-006X.72.2.192
Shiroma, E. J., & Lee, I.-M. (2010). Physical activity and cardiovascular health: Lessons learned
from epidemiological studies across age, gender, and race/ethnicity. Circulation, 122(7),
743–752.
Shrout, P. E., Stadler, G., Lane, S. P., McClure, M. J., Jackson, G. L., Clavél, F. D., Iida, M.,
Gleason, M. E. J., Xu, J. H., & Bolger, N. (2018). Initial elevation bias in subjective
reports. Proceedings of the National Academy of Sciences, 115(1).
https://doi.org/10.1073/pnas.1712277115
Smyth, J. M., & Heron, K. E. (2012). Health psychology. In Handbook of research methods for
studying daily life. (pp. 569–584). The Guilford Press.
Smyth, J. M., Jones, D. R., Wen, C. K. F., Materia, F. T., Schneider, S., & Stone, A. (2021).
Influence of ecological momentary assessment study design features on reported
willingness to participate and perceptions of potential research studies: An experimental
study. BMJ Open, 11(7), e049154. https://doi.org/10.1136/bmjopen-2021-049154
Sofija, E., Harris, N., Phung, D., Sav, A., & Sebar, B. (2020). Does Flourishing Reduce
Engagement in Unhealthy and Risky Lifestyle Behaviours in Emerging Adults?
International Journal of Environmental Research and Public Health, 17(24), 9472.
https://doi.org/10.3390/ijerph17249472
Stone, A. A., Broderick, J. E., Schwartz, J. E., Shiffman, S., Litcher-Kelly, L., & Calvanese, P.
(2003). Intensive momentary reporting of pain with an electronic diary: Reactivity,
compliance, and patient satisfaction. Pain, 104(1), 343–351.
Stone, A. A., Kessler, R. C., & Haythornthwaite, J. A. (1991). Measuring Daily Events and
Experiences: Decisions for the Researcher. Journal of Personality, 59(3), 575–607.
https://doi.org/10.1111/j.1467-6494.1991.tb00260.x
Stone, A. A., & Shiffman, S. (1994). Ecological Momentary Assessment (Ema) in Behavioral
Medicine. Annals of Behavioral Medicine, 16(3), Article 3.
https://doi.org/10.1093/abm/16.3.199
Stone, A. A., Shiffman, S. S., & DeVries, M. W. (1999). Ecological momentary assessment.
Suffoletto, B., Goyal, A., Puyana, J. C., & Chung, T. (2017). Can an app help identify
psychomotor function impairments during drinking occasions in the real world? A
mixed-method pilot study. Substance Abuse.
https://doi.org/10.1080/08897077.2017.1356797
Suh, H., Shahriaree, N., Hekler, E. B., & Kientz, J. A. (2016). Developing and Validating the
User Burden Scale: A Tool for Assessing User Burden in Computing Systems.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems,
3988–3999. https://doi.org/10.1145/2858036.2858448
Tourangeau, R. (2018). The survey response process from a cognitive viewpoint. Quality
Assurance in Education, 26(2), 169–181. https://doi.org/10.1108/QAE-06-2017-0034
Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The Psychology of Survey Response (1st ed.).
Cambridge University Press. https://doi.org/10.1017/CBO9780511819322
Turner, C. M., Arayasirikul, S., Trujillo, D., Lê, V., & Wilson, E. C. (2019). Social inequity and
structural barriers to completion of ecological momentary assessments for young men
who have sex with men and trans women living with HIV in San Francisco. JMIR
MHealth and UHealth. https://doi.org/10.2196/13241
van Berkel, N., Ferreira, D., & Kostakos, V. (2018). The Experience Sampling Method on
Mobile Devices. ACM Computing Surveys, 50(6), 1–40. https://doi.org/10.1145/3123988
van Berkel, N., Goncalves, J., Wac, K., Hosio, S., & Cox, A. L. (2020). Human accuracy in
mobile data collection. International Journal of Human Computer Studies, 137(January),
Article January. https://doi.org/10.1016/j.ijhcs.2020.102396
van Roekel, E., Goossens, L., Verhagen, M., Wouters, S., Engels, R. C. M. E., & Scholte, R. H.
J. (2014). Loneliness, Affect, and Adolescents’ Appraisals of Company: An Experience
Sampling Method Study. Journal of Research on Adolescence, 24(2), 350–363.
https://doi.org/10.1111/jora.12061
Walsh, W. B. (1967). Validity of self-report. Journal of Counseling Psychology, 14(1), 18.
Wang, S., Intille, S., Ponnada, A., Do, B., Rothman, A., & Dunton, G. (2022). Investigating
Microtemporal Processes Underlying Health Behavior Adoption and Maintenance:
Protocol for an Intensive Longitudinal Observational Study. JMIR Research Protocols,
11(7), e36666. https://doi.org/10.2196/36666
Wang, T. W., Tynan, M. A., Hallett, C., Walpert, L., Hopkins, M., Konter, D., & King, B. A.
(2018). Smoke-Free and Tobacco-Free Policies in Colleges and Universities ― United
States and Territories, 2017. MMWR. Morbidity and Mortality Weekly Report, 67(24),
686–689. https://doi.org/10.15585/mmwr.mm6724a4
Ward, M. K., & Meade, A. W. (2018). Applying Social Psychology to Prevent Careless
Responding during Online Surveys. Applied Psychology, 67(2), Article 2.
https://doi.org/10.1111/apps.12118
Welling, J., Fischer, R.-L., & Schinkel-Bielefeld, N. (2021). Is it Possible to Identify Careless
Responses with Post-hoc Analysis in EMA Studies? Adjunct Proceedings of the 29th
ACM Conference on User Modeling, Adaptation and Personalization, 150–156.
https://doi.org/10.1145/3450614.3462237
Wen, C. K. F., Schneider, S., Stone, A. A., & Spruijt-Metz, D. (2017). Compliance with mobile
ecological momentary assessment protocols in children and adolescents: A systematic
review and meta-analysis. Journal of Medical Internet Research, 19(4), Article 4.
https://doi.org/10.2196/jmir.6641
Winkleby, M. A., & Cubbin, C. (2004). Changing patterns in health behaviors and risk factors
related to chronic diseases, 1990–2000. American Journal of Health Promotion, 19(1),
19–27.
Yearick, K. A. (2017). Experience sampling methods (ESM) in organizations: A review.
Zawadzki, M. J., Scott, S. B., Almeida, D. M., Lanza, S. T., Conroy, D. E., Sliwinski, M. J.,
Kim, J., Marcusson-Clavertz, D., Stawski, R. S., Green, P. M., Sciamanna, C. N.,
Johnson, J. A., & Smyth, J. M. (2019). Understanding stress reports in daily life: A
coordinated analysis of factors associated with the frequency of reporting stress. Journal
of Behavioral Medicine, 42(3), 545–560. https://doi.org/10.1007/s10865-018-00008-x
Zhang, X., Pina, L. R., & Fogarty, J. (2016). Examining Unlock Journaling with Diaries and
Reminders for In Situ Self-Report in Health and Wellness. Proceedings of the 2016 CHI
Conference on Human Factors in Computing Systems, 5658–5664.
https://doi.org/10.1145/2858036.2858360
Zink, J., Belcher, B., Dzubur, E., Ke, W., O’Connor, S., Huh, J., Lopez, N., Maher, J., & Dunton,
G. (2018). Association Between Self-Reported and Objective Activity Levels by
Demographic Factors: Ecological Momentary Assessment Study in Children. JMIR
mHealth and uHealth, 6(6). https://doi.org/10.2196/mhealth.9592
Appendix A: Attention Check Questions
Question | Answer 1 | Answer 2 | Answer 3 | Answer 4 | Answer 5 | Correct Answer
What color is the sky? | Blue | Green | Red | Sixteen | Pliers | Blue
How many inches in a foot? | Four hundred | 1 | 12 | Snails | 10 | 12
A pig says? | Oink Oink | Snails Snails | Moo Moo | Meow Meow | Bananas Bananas | Oink Oink
The word GREEN has ____ letters | 2 | 3 | 4 | 5 | 6 | 5
Valentine’s Day is in | February | March | July | Eggshells | Monday | February
Apple is a | Cow | Fruit | Flamethrower | Dumpster fire | Guitar | Fruit
Which of the following is NOT a color? | Existence | Green | Red | Blue | Yellow | Existence
Which one of these is NOT an exercise? | Sit-ups | Push-ups | Squats | Running | Blueberry | Blueberry
Which one of these is a former president of the United States of America? | Banana Tree | Abraham Lincoln | Micky Mouse | Tubelight | Cactus plant | Abraham Lincoln
Which of these is a coin? | A penny | New Delhi | Paris | Rock climbing | The Statue of Liberty | A penny
Which of these is a city? | Washington D.C. | Stomach | Banana bread | Michael Jackson | Russia | Washington D.C.
Which of these is an animal? | Polar Bear | iPhone 7 | Chai Tea Latte | Snowfall | Microwave | Polar Bear
What can you NOT do on a paper? | Take notes | Scribble | Draw flowers | Paint with colors | Rock climbing | Rock climbing
Which of these is a planet? | Mars | Coffee | Pencil | Sunglasses | Water bottle | Mars
Which of these is NOT a city? | New York City | Boston | San Francisco | Chicago | Icecream Sandwich | Icecream Sandwich
One dozen = ____? | 12 | Compter | SpaceX | Rock music | Tacos | 12
Which of these is a plant? | Cactus | Pressure cooker | Camels | Aeroplane | Smartphone | Cactus
Which of these is NOT a school subject? | Chemistry | Physics | Chasing squirrels | Mathematics | Biology | Chasing squirrels
Which of these is a famous painter? | Pablo Picasso | Harry Potter | Jurassic Park | Pad Thai Noodles | Robocop | Pablo Picasso
Which of the following is a famous scientist? | Ninja Turtle | Donald Duck | Mickey Mouse | Albert Einstein | Bob the builder | Albert Einstein
Which of the following is a country/nation? | Bucket | Mug | Spoon | Bottle | Canada | Canada
Sun rises in the ____? | East | West | North | South | Icecream | East
Which of the following is a state in the United States of America? | Bubble bath | Donuts | Toothbrush | California | Pillow | California
Which of the following is a part of the human body? | Street lights | Monkeys | Towers | Buildings | Legs | Legs
Which of the following is NOT an OUTDOOR ACTIVITY? | Hiking | Running | Walking | Surfing | Being indoors all day | Being indoors all day
Which of the following is a wild animal? | Painting | Lion | Alphabet | Book | Pizza | Lion
Which of the following is a famous astronaut? | Neil Armstrong | Harry Potter | Lord Voldemort | The Death Eaters | Agent X | Neil Armstrong
Which of the following is a FOUR LETTER word? | One | Two | Three | Four | Seven | Four
Which of the following is a THREE LETTER word? | Ink | Envelope | Study | Armchair | Photography | Ink
Which of the following is a famous musician? | Mozart | Mickey Mouse | Donald Duck | Goofy | Ninja Turtle | Mozart
Which of the following is NOT a country? | India | Canada | Russia | France | Starbucks | Starbucks
Which of the following is the LARGEST? | Ant | Sparrow | Butterfly | Snail | Elephant | Elephant
Which of the following is used to make clothes? | Silk | Mud | Ice | Pizza | Tacos | Silk
Please select MANGO from this list. | Apple | Banana | Mango | Grape | Pear | Mango
Please select the FOURTH option from the list below | First | Second | Third | Fourth | Fifth | Fourth
What is 5+5 = ___? | 500 | 3.14 | 10 | 99 | 0.0000005 | 10
Which of these is SMALLEST? | Elephant | Blue Whale | Great White Shark | Ant | Grizzly Bear | Ant
Which of these can you carry in your wallet? | One dollar bill | A big truck | A giant elephant | A washing machine | A dishwasher | One dollar bill
Which of these is NOT a type of beverage? | Coffee | Tea | Red wine | A large pepperoni pizza | Water | A large pepperoni pizza
When is Thanksgiving celebrated? | November | Cats | Dogs | Google | Microsoft | November
Which of the following is a type of TREE? | Oak | New York City | Amsterdam | Paris | Chicago | Oak
Which of the following is a type of SPORT? | Jeff Bezos | Football | Elon Musk | Bill Gates | Mark Zuckerberg | Football
Which of the following is a type of METAL? | Silk | Cotton | Wool | Iron | Egg | Iron
Which of the following is a type of SODA? | Coke | Egg | Banana Bread | Cuckoo Clock | Toyota Prius | Coke
Which of the following is a type of VEGETABLE? | Table | Chair | Carrot | Eagle | Mountain Lion | Carrot
Which of the following DOES NOT start with the letter P? | Paris | Pillow | Saturday | Pine | Pastry | Saturday
Which of the following STARTS WITH the letter F? | Tomato | Bird | Friday | Pine | Good | Friday
Which of the following DOES NOT start with the letter C? | City | Car | Carrot | Chocolate | Alabama | Alabama
Which of the following is NOT a type of flower? | Rose | Lily | Tulip | Sunflower | Garlic Bread | Garlic Bread
Which of the following is NOT in a round shape? | Wheel | Globe | Circle | Basketball | A cubicle box | A cubicle box
Which of the following is NOT a number? | Chicago | Ten | Fifty | Twenty Five | One | Chicago
Which of the following is a color? | Brown | Cow | Mushroom | Alice | Rope | Brown
Which of the following has wheels? | Caterpillar | Pillar | Rome | July | Car | Car
Which of the following has spots? | NYC | Cow | LA | Rome | August | Cow
Which of the following is an animal? | Monkey | Money | Dollar | Environment | Metal | Monkey
Which of the following is a fabric? | Brush | Chair | Computer | Cotton | Wood | Cotton
Which can you drink? | Silver | Gold | Uranium | Water | Oxygen | Water
Which begins with the letter O? | Selena | Rain | Earth | Fire | Olive | Olive
Which is a flower? | Tulip | Bread | Candy | Purple | Sunset | Tulip
Which is a single letter? | Bed | I | Sheet | Comforter | Light | I
Which is a country? | Japan | Jeff Bezos | Washington Irving | George Washington | Art | Japan
Which begins with the letter P? | Photograph | Cauliflower | Air Fryer | George | Kansas | Photograph
Which is a type of instrument? | Ballet | Picture frame | Jacuzzi | Piano | KFC | Piano
Which is a type of weather? | Hug | Jan | Rain | Tan | Ken | Rain
Which is a color? | Red | Baby | Lisp | Lips | Line | Red
Which is a part of the human body? | Lace | Leg | And | The | Car | Leg
Which is a liquid at room temperature? | Banana | Regret | Hand | Water | Brita Filter | Water
Which is a verb? | Marshmallow | Target | Dance | Coat | Shirt | Dance
Which is a state? | Florida | Paris | London | Los Angeles | Grass | Florida
Which is a noun? | Run | Fly | Carry | Carrot | Fold | Carrot
Which is an adjective? | Yodel | Read | Happy | Apple | Olive | Happy
What is 3+3 =? | 1 | 2 | 3 | 6 | Tall | 6
What is a metal? | Iron | Silk | Cotton | Skin | Georgetown | Iron
What is a job? | Airplane | Tea | Teacup | Porcelain | Teacher | Teacher
What is a plant? | Tree | Airpods | Stick Notes | Tape | Notion | Tree
What is a number? | Green | J | D | 56 | Alan | 56
What is a type of music? | Jazz | Towel | Tea Towel | Cup | Glass | Jazz
What is a type of art? | Painting | Green Grass | Blue Car | Capital letter | Jelly Bean | Painting
What is a letter? | 5 | 7 | 2 | L | 8 | L
Click on "The" | A | B | At | The | Task | The
Click on "Target" | Hally | Holiday | French | Italian | Target | Target
Abstract
This dissertation was methodological in nature and consisted of three interrelated studies that aimed to address the prevalence and predictors of inattentive responding (IR) in an intensive ecological momentary assessment (EMA) study. The studies established models of how, when, and why inattentive responding is likely to occur and provided insights into preventive strategies to increase the validity of self-report measures. The aims were to: 1a) apply various IR detection indices used with retrospective cross-sectional self-report methods to EMA methods and compare accuracy across those indices; 1b) develop a predictive model for IR using the indices to estimate the prevalence of IR in EMA; 2) investigate person-level (e.g., demographics, personality, approach motivation, perceptions of burden) and survey-level (e.g., study day, time of day, location, activity level) predictors of IR; and 3) collect qualitative data from participants in the EMA study to understand the process of IR and the factors leading to IR in EMA surveys.
Study 1 found a low prevalence of IR in the EMA study data (3%). Inattentive responding detection indices from cross-sectional survey research did not translate well to EMA data, and revised cut-off scores were suggested. The best model for detecting IR in EMA combined between-subject response variability with between-subject and within-subject total response times as predictors. Study 2 revealed that sex at birth and reward responsiveness were the most important person-level factors related to attentiveness. Contrary to the existing literature, other demographic factors and personality showed no significant relationships with participant attentiveness to EMA surveys. Of the contextual factors, surveys delivered on weekends, during earlier weeks of the study, and at home had the highest likelihood of attentive responding. The qualitative data in Study 3 provided preliminary evidence that social context may be an under-considered factor underlying decreased data quality, as well as insights into the acceptability of attention check questions. These findings also offer potential explanations for the observed decreases in response variability over time and could guide future research to improve participant engagement.
As EMA study designs become more sophisticated, the measurement of factors that contribute to declines in response accuracy will grow in importance. Overall, results from this dissertation suggest that inattentive responding may be more strongly related to study-level factors than to person-level factors. The results provide insights into factors that may improve the reliability and validity of EMA data, as well as considerations for researchers designing studies to assess and maximize data quality. Given the low prevalence of IR found, this dissertation also helps to reduce concern about IR as a limitation of real-time longitudinal data capture methods such as ecological momentary assessment for assessing health behavior.
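As a rough illustration of the survey-level indices described above, the sketch below computes within-survey response variability and flags surveys with zero variability or implausibly fast completion. The column names, example values, and cutoffs are assumptions made for illustration only; they are not the dissertation's actual measures, data, or thresholds.

```python
# Minimal sketch (with assumed column names and cutoffs) of two survey-level
# inattentive-responding indices: within-survey response variability and total
# response time. All values here are invented for illustration only.
import numpy as np
import pandas as pd

surveys = pd.DataFrame({
    "participant_id": ["p01", "p01", "p02"],
    "survey_id": [1, 2, 3],
    # Responses to, e.g., ten 5-point items within a single EMA survey.
    "item_responses": [
        [3, 4, 2, 3, 4, 3, 2, 4, 3, 3],
        [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],   # straight-lining: zero variability
        [2, 3, 1, 4, 2, 3, 5, 2, 3, 1],
    ],
    "response_time_s": [45.0, 6.0, 52.0],  # total seconds to complete the survey
})

# Intra-individual standard deviation across items within each survey.
surveys["within_survey_sd"] = surveys["item_responses"].apply(np.std)

MIN_SD = 0.0        # assumed: zero spread suggests straight-lining
MIN_SECONDS = 10.0  # assumed minimum plausible completion time

surveys["flag_ir"] = (surveys["within_survey_sd"] <= MIN_SD) | (
    surveys["response_time_s"] < MIN_SECONDS
)
print(surveys[["participant_id", "survey_id", "within_survey_sd",
               "response_time_s", "flag_ir"]])
```

A simple rule-based flag like this is only a starting point; combining several indices in a predictive model, as described in the abstract, is generally more robust than any single cutoff.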
Asset Metadata
Creator
Wang, Shirlene D. (author)
Core Title
Developing and testing novel strategies to detect inattentive responding in ecological momentary assessment studies
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Preventive Medicine (Health Behavior)
Degree Conferral Date
2023-08
Publication Date
07/10/2023
Defense Date
06/22/2023
Publisher
University of Southern California (original); University of Southern California. Libraries (digital)
Tag
data quality, ecological momentary assessment, OAI-PMH Harvest, survey methodology
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Dunton, Genevieve (committee chair), Intille, Stephen (committee member), Kirkpatrick, Matthew (committee member), Mason, Tyler (committee member), Pang, Raina (committee member)
Creator Email
shirlenedwang@gmail.com,shirlenw@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113262995
Unique identifier
UC113262995
Identifier
etd-WangShirle-12048.pdf (filename)
Legacy Identifier
etd-WangShirle-12048
Document Type
Dissertation
Rights
Wang, Shirlene D.
Internet Media Type
application/pdf
Type
texts
Source
20230710-usctheses-batch-1064 (batch); University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu