THE IMPACT OF STATISTICAL METHOD CHOICE: EVALUATION OF THE
SANO RANDOMIZED CLINICAL TRIAL USING TWO NON-TRADITIONAL
STATISTICAL METHODS
by
Christianne Joy Lane
Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(PSYCHOLOGY)
August 2009
Copyright 2009 Christianne Joy Lane
Dedication
While I have been blessed with knowing many people who have believed in me
and told me I can achieve anything (including this dissertation work), only a
couple never let me get away with giving up. To these gentlemen, I dedicate this
work. To my Dad, who has always believed in me much more than I believed in
myself, and to Stan, who steadfastly refused to accept that I did not want to
finish.
Acknowledgements
I always read the acknowledgement pages of books and think of all those
people who helped make it all happen. And I reflect on how lucky the author was
to have such support. And now I get my own chance to say thank you to those
people, and I am humbled at how supremely blessed I have been to receive
more love and support than I ever could have wished for, and certainly more than
I deserved.
I would like to thank my parents, Ed and Sharon, who love to brag about
me and who always told me I could be anything I wanted to when I grow up.
Someday, I hope to figure out what that may be. And my most wonderful Aunt
Carol, an outlier of the best kind, who rolls her eyes at the HUA of the world and
who has been with me through every happy, sad, angry, giddy, and fearful
moment.
I have so many dear friends who have listened to me complain, cry, laugh,
and say more curse words than ever should be spoken aloud. Doug, you have
talked me off the ledge more than once, and then given me permission to jump. I
could never have done it without you (or Sam Adams). Susan, you have watched
this progress from the old days of nine-line and have shared your keen
observances of how things really work which have changed my point of view. For
this, and for the fantabulous poofy hat, I am very grateful. Robin, you have never
wavered in your support and love. I still miss you terribly and can’t wait till you
come to your senses and move out West again. Eri, you are the best drinking
buddy a girl could ask for! May we enjoy many non-juice cocktails in the future!
Anjanette, I keep our Barbies watching over me to remind me of your never-say-
die-kick-ass-and-take-names attitude. You are a true inspiration. Elena, you have
fed me emotionally, intellectually, and bodily and without that, I would have
withered away to a waif in all ways. Thanks for always making me smile when I
think of you. And to my GNO crew (Q, Anastasia, Louise, & Alia) for all the
laughter, the righteous indignation on my behalf, and for all the support during
this crazy time. You girls are the best. To you all, I owe much glögg for years to
come!
I would also like to thank Rand Wilcox, who took on a lost cause with
enthusiasm and patience. I cannot express enough how much it meant to me. I
also want to give thanks to Steve Read, Zhong-Lin Lu, and Stan Azen for serving
on my committee which meant thoroughly reading “the beast.” I am sure the
good folks at Starbucks are grateful for the amount of caffeine they sold to make
that happen. And for Irene Takaragawa who has always generously walked me
through all the ins and outs of the paperwork, listened to my drama, and found
ways around all sorts of unbreakable rules. For the SANO team who worked so
hard on this study, and to Michael Goran (PI) for letting me use this data, thank
you! I hope there are enough pretty pictures for you! For Arthur and Roger, my R
gurus, I appreciate your generosity in helping me move through this crazy
language. And to the wonderful folks who offered to read my dissertation---aren’t
you happy plans changed?
Finally, to Martin, my partner, my love, and my soon-to-be co-pilot. Thank
you for making my life so much more than ordinary. I am happy to be your
second opinion any day (as long as you’ll agree that I am always right). I love
you.
Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abbreviations
Abstract
Chapter 1. Purpose
Chapter 2. Overview of the Strength and Nutrition Outcomes in Latino Adolescents Study (SANO)
    Results of ANCOVA model
    Challenges to Assumptions in SANO
    Small sample size
    Non-normality
    Skewness
    Heavy Tails
    Heteroskedasticity
    Heterogeneous population sampled
Chapter 3. Approach #1: Robust ANCOVA Alternative
    What makes an estimator robust?
    Research Question for Approach #1:
Chapter 4. Robust ANCOVA Methods
    Strategies for Applying Robust ANCOVA Alternatives
    Median (Mdn)
    Trimmed Mean (M_T%)
    Strategy #1: Omnibus Comparison of Regression Lines (ancmg function)
    Strategy #2: Pairwise Comparison at Different Design Points (ancova function)
    Strategy #3: Comparison of Regression Depths (ancsm and qancsm functions)
    Effect Sizes
    Figures of Running Interval Smoother
    Comparing effectiveness of various robust methods
Chapter 5. Robust ANCOVA Results
    Models
    Construct specific results
    Strength
    Dietary Intake
    Motivation
    Body Composition
    Insulin & Glucose Dynamics
    Effect sizes
    Comparison of Results of different methods
    Was this technique effective in achieving a statistical result?
    Was this method easy to use?
    Would the explanation of these statistics be acceptable to a medical audience? (I.e., could you publish these results in a medical journal?)
    Summary of Robust ANCOVA Approach
Chapter 6. Approach #2: Latent Profile Analysis
    Research Questions:
Chapter 7. Latent Profile Analyses Methods
    Development of the Hybrid LPA Model
    Model Fitting
Chapter 8. Latent Profile Model Results
    Strength
    Hybrid model
    Profile
    Dietary Intake
    Hybrid model
    Profile
    Motivation
    Hybrid model
    Profile
    Body Composition
    Hybrid model
    Profile
    Insulin & Glucose Dynamics
    Hybrid model
    Profile
    Summary of Hybrid Latent Profile Model Approach
Chapter 9. Conclusion
    Did statistical method choice matter?
    A new theoretical model?
    Strengths and Weaknesses of Non-traditional Approaches
    Future Directions
Bibliography
Appendix A. SANO Variables
Appendix B. Pre- & Post-Test Value by Intervention Group
Appendix C. Equations for Robust Estimators
Appendix D. Robust Descriptives
Appendix E. Effect Sizes
Appendix F. Fit Indices for Mixed Models
List of Tables
Table 1. Hypotheses and ANCOVA Results SANO Construct Variables
Table 2. Robust ANCOVA Results: Bench Press (kg)
Table 3. Robust ANCOVA Results: Leg Press (kg)
Table 4. Robust ANCOVA Results: Total Energy Intake (kcal)
Table 5. Robust ANCOVA Results: Total Fiber (g)
Table 6. Robust ANCOVA Results: Total Sugar (g)
Table 7. Robust ANCOVA Results: Added Sugar (g)
Table 8. Robust ANCOVA Results: Relative Autonomy Index (RAI)
Table 9. Robust ANCOVA Results: Autonomous
Table 10. Robust ANCOVA Results: Controlled
Table 11. Robust ANCOVA Results: Amotivation
Table 12. Robust ANCOVA Results: BMI Z-score
Table 13. Robust ANCOVA Results: Fat Weight (kg)
Table 14. Robust ANCOVA Results: Lean Tissue Weight (kg)
Table 15. Robust ANCOVA Results: Insulin Sensitivity (SI)
Table 16. Robust ANCOVA Results: Acute Insulin Response
Table 17. Robust ANCOVA Results: Disposition Index
Table 18. Overlap of Construct Classes
Table 19. Means, Trimmed Means, and Medians
Table 20. Effect Sizes (akp.effect function)
List of Figures
Figure 1. SANO Theoretical Model
Figure 2. Raw SI Across Intervention Groups
Figure 3. Latent Profile Analysis (LPA)
Figure 4. Latent Class Growth Analysis (LCGA)
Figure 5. Hybrid model
Figure 6. Class = Treatment
Figure 7. Example of Class Maps for 1-3 Classes
Figure 8. Fit Indices and Class N’s: Strength
Figure 9. Strength Classes
Figure 10. Profile Bench Press
Figure 11. Profile Leg Press
Figure 12. Leg Press Change Class x Intervention (M+SE)
Figure 13. Fit Indices and Class N's: Dietary Intake
Figure 14. Dietary Intake Classes
Figure 15. Profile Energy Intake
Figure 16. Profile Fiber Intake
Figure 17. Profile Total Sugar Intake
Figure 18. Profile Added Sugar Intake
Figure 19. Energy Intake (kcal) Change Class x Intervention (M+SE)
Figure 20. Motivation Classes
Figure 21. Fit Indices and Class N's: Motivation
Figure 22. Profile RAI
Figure 23. Profile Autonomous Factor
Figure 24. Profile Controlled Factor
Figure 25. Profile Amotivated Factor
Figure 26. Amotivation Pre-test Class x Intervention (M+SE)
Figure 27. Amotivation Change Class x Intervention (M+SE)
Figure 28. Controlled Pre-test Class x Intervention (M+SE)
Figure 29. Controlled Change Class x Intervention (M_T + SE_t20)
Figure 30. Fit Indices and Class N's: Body Composition
Figure 31. Body Composition
Figure 32. Fit Indices and Class N's: Body Composition
Figure 33. Profile BMI Z-score
Figure 34. Profile Fat Weight
Figure 35. Profile Lean Weight
Figure 36. BMI Z-score Pre-test Class x Intervention (M+SE)
Figure 37. Fat Weight (kg) Pre-test Class x Intervention (M+SE)
Figure 38. Hybrid Model Fits: Glucose and Insulin Dynamics
Figure 39. Glucose and Insulin:
Figure 40. Profile SI
Figure 41. Profile AIR
Figure 42. Profile DI
Figure 43. AIR Change Class x Intervention (M+SE)
Figure 44. DI Change Class x Intervention (M+SE)
Figure 45. Model Tested by ANCOVA analyses
Figure 46. Changes to Model Based on ANCOVA Results
Figure 47. Changes to Model Based on Robust ANCOVA Results
Figure 48. Changes to Model Based on LPA Results
Figure 49. Bench Press Pre- and Post-Test
Figure 50. Leg Press Pre- and Post-Test
Figure 51. Total Energy Intake Pre- and Post-Test
Figure 52. Total Fiber Intake Pre- and Post-Test
Figure 53. Total Sugar Intake Pre- and Post-Test
Figure 54. Added Sugar Intake Pre- and Post-Test
Figure 55. Motivation for Exercise: Relative Autonomy Pre- and Post-Test
Figure 56. Motivation for Fruits & Vegetables: Autonomous Factor Pre- and Post-Test
Figure 57. Motivation for Fruits & Vegetables: Controlled Factor Pre- and Post-Test
Figure 58. Motivation for Fruits & Vegetables: Amotivation Factor Pre- and Post-Test
Figure 59. BMI Z-Score Pre- and Post-Test
Figure 60. Fat Weight Pre- and Post-Test
Figure 61. Lean Tissue Weight Pre- and Post-Test
Figure 62. Insulin Sensitivity (SI) Pre- and Post-Test
Figure 63. Acute Insulin Response (AIR) Pre- and Post-Test
Figure 64. Disposition Index (DI) Pre- and Post-Test
Abbreviations
1-RM 1 repetition maximum strength test
AIC Akaike Information Criterion
AIR Acute insulin response
ANCOVA Analysis of covariance
ANOVA Analysis of variance
BIC Bayesian Information Criterion
BLRT Bootstrapped Likelihood Ratio Test
BMI Body Mass Index
C Class
CI 95% Confidence Interval
CI_Δ 95% Confidence Interval for difference
Con Control group
DI Disposition index
DP_E Empirically estimated design point
DP_F Female design point
DP_M Male design point
FSIVGTT Frequently sampled intravenous glucose tolerance test
GCRC General clinical research center
LCGA Latent class growth analysis
LPA Latent profile analysis
MAD Median absolute deviation
MADN Normalized MAD
Mdn Median
M_T% % trimmed mean
Nut Nutrition education only group
Nut+ST Nutrition + Strength Training group
RAI Relative Autonomy Index (SRQ-E)
SANO Strength and Nutrition Outcomes in Latino Adolescents study
SE Standard Error
SI Insulin sensitivity
SRQ-E Motivation for Exercise scale
TSRQ-HD Intervention Self-Regulation Questionnaire: Healthy Diet
Abstract
When the findings of a randomized clinical trial are null, and yet the
assumptions of the statistical model are not met, is it appropriate to conclude
there is not an effect of the intervention? The purpose of this study is to examine
statistical method choice for the analysis of a randomized clinical trial of a
strength training and nutrition intervention in a sample of overweight Latino
adolescents (N = 54). Results of analysis of covariance (ANCOVA) models were
overwhelmingly null; however, there were several concerns about the underlying
assumptions. Two non-traditional statistical approaches that do not carry the
same assumptions as traditional ANCOVA were used to reanalyze these data.
The first approach uses a robust analog of ANCOVA that is based on fewer
restrictions, and which can increase power while maintaining Type I error rate,
even with small samples. Using these robust techniques, the conclusions
regarding the effectiveness of the intervention differed markedly from those of
traditional ANCOVA, as several significant intervention effects were found with
these robust methods. In the second approach, developed for this study, a hybrid
of two common latent profile analysis models was created to generate a profile of
which participants benefited from the intervention. This model tests whether the
sample is homogeneous. With this model, it was shown that gender and pre-test
values had more influence than the intervention on outcomes, and the
intervention appeared to modify these influences. These results suggest that the
use of traditional ANCOVA models in the face of assumption violations may lead
to missing important effects of an intervention. Expanding the scope of standard
techniques for analyzing randomized clinical trials would likely result in a different
literature landscape for many disciplines, though the acceptability of the use of
these results poses challenges for publication of papers using them. New
guidelines may need to be incorporated into recommendations for which
methods to use for analyzing randomized clinical trials.
Chapter 1. Purpose
When conducting a randomized clinical trial, great care is given to
defining hypotheses, choosing measures that are reliable and valid, and
ensuring consistency in how the trial is conducted so that the data obtained is
of the best quality possible. Once the data is collected, it is common practice
to use standard, traditional statistical methods to analyze this data without
much consideration of whether these methods are actually the best to answer
the hypotheses. While substantial advances have been made in the field of
statistics since these standard methods were developed, applied research
does not often make effective use of this progress when data are analyzed in
applied settings. Using outdated and insufficient methods to test hypotheses
can lead to incorrect or incomplete conclusions when the assumptions are not
met, a common occurrence with “real” data. Statistical choices can have far
reaching consequences. They have an impact on conclusions as to whether
hypotheses were supported or not, which in turn affects what research is
published, where it is published, and what future projects are funded. It is the
responsibility of researchers to choose the most appropriate statistical
strategies to answer the research questions with the available data. However,
in many research fields, the use of non-traditional statistics is not generally
accepted. Thus, even if these statistics are better scientifically, the challenges
of publishing such data make the effort used in obtaining these statistics a vain
endeavor. This study sought to demonstrate that the use of non-traditional
methods has benefits (including the practical aspect of more significant p-
values), which may make the effort worthwhile.
Intent to treat analyses are typically planned such that mean level at
post-test is compared across intervention group assignment using a form of
analysis of variance (ANOVA) model (one-way for change scores or repeated
measures). These models include assumptions of random assignment,
normality, and homogeneity. When these assumptions are satisfied, the
ANOVA model performs as well as any other method in terms of Type I error
and power (Wilcox, 1998b). However, even slight departures from these
assumptions can have drastic effects on power to detect group differences.
While significant results in classic models are likely to reflect true group
differences between distributions, when assumptions are violated, the power
to detect group differences is limited and thus it is more likely that clinically
relevant findings are missed. Outliers, skewness, and heterogeneity can be a
recipe for null findings, even when sample sizes are large. When covariates
are included, and an analysis of covariance (ANCOVA) model is used, these
matters are complicated by additional assumptions regarding the conditional
variances and the specific type of relationship covariates have with outcome
measures. Additionally, there is an assumption that all relevant constructs are
being measured and included in the analyses. However, there is always a risk
of unmeasured heterogeneity in the sample. Even if this heterogeneity reflects
that of the population, such unmeasured systematic variation in the sample
may constitute an important factor influencing intervention response. Given all
these assumptions, a case can be made that ANOVA and ANCOVA models
may not be ideal for testing intervention effects in many randomized clinical
trials. And if power is low, a null result may not actually mean that there was
no intervention effect.
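The way a single outlier can erase an apparent group difference can be illustrated with a short sketch. The scores below are hypothetical and chosen only for easy arithmetic, and `welch_t` is a helper name introduced here; nothing in this block comes from the SANO data or analysis plan.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t statistic using separate (unpooled) sample variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Hypothetical scores, invented for illustration only.
control = [1, 2, 3, 4, 5]
treated = [4, 5, 6, 7, 8]
# Same treated group, but the largest score is replaced by an extreme value.
treated_with_outlier = [4, 5, 6, 7, 40]

t_clean = welch_t(control, treated)
t_outlier = welch_t(control, treated_with_outlier)

print(f"t without outlier: {t_clean:.2f}")
print(f"t with outlier:    {t_outlier:.2f}")
```

In this toy example the outlier increases the difference in means, yet the t statistic falls from 3.0 to roughly 1.35 because the standard error grows faster than the mean difference.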
This study explored two alternative approaches to analyzing
randomized clinical trials by applying two non-traditional statistical methods to
data from the Strength and Nutrition Outcomes in Latino Adolescents Study
(SANO; Davis et al., In Press). SANO tested the efficacy of an innovative
strength training and nutrition education intervention program to improve the
metabolic parameters in overweight Latino adolescents that put them at risk
for type 2 diabetes. Despite two successful pilot studies (Davis et al., 2007;
Shaibi et al., 2006), the intervention effects of the key variables of the SANO
study were overwhelmingly null when analyzed using the traditional ANCOVA
models as specified in the research plan. This study explored why these
results were not statistically significant by modeling the data using two very
different approaches: (1) The use of robust estimators to test the hypotheses
of intervention effects across groups with statistics that do not carry the
assumption load of classic ANCOVA; and (2) The exploration of unmeasured
population heterogeneity using latent class analysis. This project asks whether
the null findings were due to no effect of the intervention (the conclusion drawn
from the ANCOVA results), the use of non-optimal statistical methods, or were
due to the influence of unmeasured heterogeneity among participants on
intervention effect. These approaches were compared in terms of their effect
on conclusions to the SANO hypotheses, ramifications to the theoretical
model, ease of use, and potential acceptability to a medical audience.
Chapter 2. Overview of the Strength and Nutrition Outcomes
in Latino Adolescents Study (SANO)
The SANO study was a randomized clinical trial of a 16-week strength
training and nutrition education program designed to improve glucose and
insulin dynamics in overweight (> 85th percentile of BMI) Latino adolescents
aged 14-17. Details of the sample are reported in Davis et al. (In Press). Sixty-
six participants were randomized to one of three groups: (1) Control (Con); (2)
Nutrition Education (Nut) who received once per week nutrition education
emphasizing decreasing sugar and increasing fiber consumption; or (3)
Strength Training + Nutrition Education (Nut+ST) who received twice per week
resistance training and once per week nutrition education. Both intervention
groups additionally received several sessions of motivational interviewing
(Rubak, Sandbaek, Lauritzen, & Christensen, 2005) over the course of the
16 weeks. These sessions addressed the motivation behind making the
lifestyle changes addressed in the weekly sessions. Boys and girls were
separated for all training and education classes. There were 54 evaluable
participants (28 boys and 26 girls) at the end of the study. The main outcome
variables for this study were the glucose and insulin parameters from a
frequently sampled intravenous glucose tolerance test (FSIVGTT).
Additionally, outcomes related to the intervention (strength and nutrition
improvements), and body composition (weight, BMI, lean and fat mass) were
reported in the main outcome paper (Davis et al., In Press). Appendix A details
how these variables were collected.
The theoretical model behind the SANO study is shown in Figure 1. In
this model, the intervention (nutrition education, strength training, and
motivational interviewing) is hypothesized to influence insulin and glucose
parameters through increases in strength, increases in motivation, and
changes in diet (specifically increases in fiber consumption and decreases in
sugar consumption). These changes affect body composition, which in turn
influence glucose and insulin dynamics. The SANO study further hypothesized
an ordinal response to intervention, such that those in the Nut-only group
would see improvements over the Con group, and the Nut+ST group would improve
more than the Nut-only group.
Figure 1. SANO Theoretical Model
Results of ANCOVA model
Table 1 summarizes the hypotheses for the SANO variables under
consideration in this study. Additionally, the results for the ANCOVA models
with change (post-test – pre-test) as the dependent variable and pre-test and
gender as covariates are reported. This model assumes that for the jth group,
Equation 1
$$Y_{ij} = \beta_{0j} + \beta_1(\text{Pre-test}) + \beta_2(\text{Gender}) + \varepsilon_{ij}$$
where $\beta_{0j}$ is the mean of group j, controlled for the covariates. It is also
assumed that $\varepsilon_{ij}$ is independent of the covariates and is normally distributed
with µ = 0. Additionally, it is assumed that there is homoskedasticity of
variance across groups, and that the relationship of covariates with Y is linear.
If these assumptions are met, then the hypothesis of primary interest is
whether the intercepts of the groups are the same:
Equation 2
$$H_0: \beta_{01} = \beta_{02} = \beta_{03}$$
As these ANCOVA results will be used as a comparison point for further
analyses, these variables reflect the raw value, though several of the variables
were log transformed in the published paper. Additionally, in Davis et al. (In
Press), there were several covariates used that differed by dependent
variables that are not included here. Finally, outliers removed in the original
ANCOVAs are included here for the purpose of comparing the effectiveness of
the alternative methods considered here.
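One standard way to test the equal-intercepts hypothesis in Equation 2 is to compare the full model of Equation 1 (a separate intercept per group) with a reduced model that forces a common intercept. The sketch below assumes J = 3 groups, the two covariates of Equation 1, and complete data on all N = 54 evaluable participants; the symbols SSE_F and SSE_R (residual sums of squares for the full and reduced models) are notation introduced here for illustration.

$$F = \frac{(SSE_R - SSE_F)/(J - 1)}{SSE_F/(N - J - 2)} \;\sim\; F_{J-1,\; N-J-2} \quad \text{under } H_0$$

Under these assumptions the full model estimates J + 2 = 5 parameters, so the reference distribution would have 2 and 49 degrees of freedom. The F distribution holds only when the normality, homoskedasticity, and linearity assumptions listed above are met.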
Table 1. Hypotheses and ANCOVA Results SANO Construct Variables

                                    Hypotheses            ANCOVA
Variable                            Con   Nut   Nut+ST    p-value    Post-hoc
Strength
  Bench Press                       =     =     +         < 0.001    Con vs Nut; Nut vs Nut+ST
  Leg Press                         =     =     +         0.008      Nut vs Nut+ST
Dietary Intake
  Energy                            =     -     -         0.038      Con vs Nut
  Total Sugar                       =     -     -         0.489
  Added Sugar                       =     -     -         0.222
  Fiber                             =     +     +         0.655
Motivation
  Fruits & Vegetables               =     +     +
    Autonomous                                            0.057
    Controlled                                            0.383
    Amotivation                                           0.303
  Exercise: Relative Autonomy Index =     +     ++        0.394
Body Composition
  BMI                               =     -     --        0.725
  Total Fat (kg)                    =     -     --        0.914
  Total Lean Tissue (kg)            =     +     ++        0.905
Glucose & Insulin
  Insulin Sensitivity (SI)          =     +     ++        0.654
  Acute Insulin Response (AIR)      =     +     ++        0.956
  Disposition Index (DI)            =     +     ++        0.535

Notes: = no change; + increase; - decrease
Con = Control group; Nut = Nutrition Education group; Nut+ST = Nutrition and
Strength Training group
P-values are for ANCOVA models of change (post-test – pre-test) controlling for
pre-test and gender
As can be seen in Table 1, there were no significant intervention effects
found for the glucose and insulin dynamics (insulin sensitivity (SI), acute
insulin response (AIR), and disposition index (DI)). The significance found in
the strength training variables indicates that the resistance training portion of
the intervention was effective. The nutrition goals of lowering sugar and
improving fiber were not reflected by significant p-values, though total energy
significantly declined in the Nut+ST group. There was also a trend for the
Autonomous factor in the Intervention Self-Regulation Questionnaire: Healthy
Diet (TSRQ-HD; p = 0.057). The conclusion drawn from these results was that
the intervention protocol was not effective in improving glucose/insulin
dynamics or body composition, even though the measures of the intervention
modality (strength and some nutrition variables) showed that participants did
respond to the intervention.
Upon closer examination of raw scores, it becomes apparent that the lack of
significant findings does not necessarily mean that there were no intervention
effects. For example, insulin sensitivity (SI), the variable that power
calculations were based on, shows that the intervention was effective for many
participants. Figure 2 shows the raw pre- and post-test data sorted by change
score across all participants (post-pre). The arrows reflect the hypothesized
directions of intervention effect. Thus, those on the far right of the graph
increased their SI the most, as hypothesized, while those on the left
decreased their SI. As can be seen, most of the Con group changed little or
decreased, with the exception of two who had the greatest increases across
all groups. The Nut group was approximately divided into thirds: decreasing,
no change, and increasing. The Nut+ST group primarily increased or showed
no change, with one participant with a large decline. Upon inspection of these
graphs, it does appear that the interventions may have improved the SI of
participants vs. control, but the outliers with the greatest changes on either
end may be masking this effect. The data shown in Figure 2 were not unique.
The raw pre-post values for all the variables under consideration are shown in
Appendix B. These highlight interesting changes in several variables that
would be dismissed by traditional ANCOVA results (Table 1). Thus, the SANO
data set offers an excellent opportunity to explore the effects of non-traditional
statistical models in a setting that challenges most (if not all) of the
assumptions of classic ANCOVA. In the SANO key outcomes paper (Davis et
al., In Press) these were handled in the usual ways as described below.
Figure 2. Raw SI Across Intervention Groups
Challenges to Assumptions in SANO
Small sample size
The sample size required to achieve sufficient power (80%) to detect
significant intervention effects (α = 0.05) was determined using the significant
difference in SI change found in a pilot study of strength training only (no
nutrition education) versus controls in a sample of boys (Shaibi et al., 2006). It
was concluded that 20 participants per group (10 boys/10 girls) would be
sufficient to detect differences in SI change similar to those seen in the pilot
study. Gender was not a factor in the sample size determination. In the SANO
study, the attrition rate from randomized to evaluable was 18% and two of the
three groups (Con and Nut+ST) were left with evaluable samples less than 20.
The evaluable groups are Con (n = 16), Nut (n = 21), and Nut+ST (n = 17).
This level of attrition is not surprising given the challenges of conducting an
intensive intervention with adolescents involving multiple hospital stays and
four months of classes and training; however, the a priori power calculations
were challenged.
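The attrition figures quoted above can be checked with simple arithmetic. The counts are those reported in the text; the variable names are introduced here for illustration.

```python
# Counts reported in the text.
randomized = 66
evaluable = {"Con": 16, "Nut": 21, "Nut+ST": 17}
planned_per_group = 20  # target from the a priori power calculation

n_evaluable = sum(evaluable.values())
attrition = (randomized - n_evaluable) / randomized

# Groups whose evaluable sample fell below the planned size.
below_plan = [group for group, n in evaluable.items() if n < planned_per_group]

print(f"Evaluable N: {n_evaluable}")
print(f"Attrition:   {attrition:.1%}")
print(f"Below plan:  {below_plan}")
```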
Non-normality
Most of the variables used in Davis et al. (In Press) were not normally
distributed. This is not uncommon in applied research. Micceri (1989)
compared 440 distributions from 89 different populations; all data sets were
large (all N’s >400) and all of them departed from normality. This is even more
likely given a small sample (Wilcox, 1998a, 2002b). Given the low power to
detect distribution difference in small samples, this number of non-normal
variables in the SANO data indicates severe deviations from normality.
Examination of box plots, skewness, and kurtosis estimates indicates that much
of the non-normality was due to heavy tailed distributions. In the SANO data,
these were treated in the following way. First, normality statistics tested at the
0.05 level (Kolmogorov-Smirnov and Shapiro-Wilk) were performed with all
data. If the distribution was non-normal, then potential outliers as indicated
from boxplots were removed from the sample and the tests for normality were
rerun. If the normality statistics indicated normality, then the analyses
proceeded with the smaller sample, though this violates the assumption of
independence among observations. If still non-normal, then a series of Box-
Cox transformations were implemented. Some distributions could not be
normalized with any transformations, and so log transformations were used as
a default due to the practicality of exponentiating results for better
comprehension, as these data are not clinically meaningful in transformed
form. The violated assumption of normality may be a large part of why the
SANO results were not significant, given that non-normality can lead to very
low power (e.g., Mosteller & Tukey, 1977; Staudte & Sheather, 1990; Tukey,
1960; Wilcox, 2005).
Skewness
Maintaining sufficient power can be a serious challenge especially
when comparing groups with different skewness (Wilcox, 1998b, 2002; Wilcox
& Keselman, 2003). One way to deal with skewness is to transform the data,
and this was done in the case of most of the SANO variables. However,
transformations can be problematic because standard transformations often
fail to address the effects of outliers (Wilcox, 1998a, 2002). It can happen that
skewness remains when all scores are transformed in the same way (Wilcox,
2002). Additionally interpretability is often lost with readers wondering what
clinical relevance transformed variables have. These problems are amplified
when two distributions being compared differ in skewness (Wilcox &
Keselman, 2003).
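The point that a standard transformation may not remove the influence of an outlier can be sketched numerically. The data below are hypothetical, and `skewness` is a helper introduced here that uses the simple moment-based estimator, which may differ from the estimator used in the SANO analyses.

```python
from math import log


def skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical scores with one gross outlier, invented for illustration only.
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10000]
logged = [log(x) for x in raw]

print(f"Skewness before log transform: {skewness(raw):.2f}")
print(f"Skewness after log transform:  {skewness(logged):.2f}")
```

In this toy example the log transform pulls the outlier closer to the rest of the data, but the transformed scores remain strongly right-skewed.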
Another way of managing skewness is to rely on the central limit
theorem. However, many conclusions regarding the efficacy of this approach
were based on light-tailed distributions (Wilcox, 2005). With light-tailed
distributions it was estimated that N of 20-25 is sufficient to use the central
limit theorem. This may not be true with heavy tails; sometimes even much
larger N’s are insufficient when gross outliers are present (e.g., Harwell,
Rubenstein, Hayes, & Olds, 1992; Wilcox, 1998b). In the case of the SANO
study, relying on the central limit theorem was clearly not an option due to
sample size.
Finally, it may be that non-normality may not be reflective of a poorly
drawn sample or a contaminated sample. It may indeed accurately reflect the
true population distribution (Wilcox, 1998b). In this case, a non-normal
distribution is to be expected and embraced, perhaps illuminating qualities in
the samples that are worth exploring.
Heavy Tails
Another challenge that is common in real data is that of heavy tails
(Keselman, Algina, Lix, Wilcox, & Deering, 2008; Tukey,
1960; Wilcox, 1998a, 1998b, 2002; Wilcox & Keselman, 2003). Often this is the
result of outliers. While outliers are potentially interesting (Wilcox, 1998b) they
can also be a nuisance when one is trying to ascertain a “typical” response.
Traditional exploration for outliers includes examination of boxplots, and
looking at the standard deviation distance from mean. Then the “outliers” are
eliminated from the data and the analyses proceed as if they never existed.
These standard techniques were employed with the SANO data. This is
problematic due to the fact that standard deviations include outliers in the
estimation, which can influence the categorizing of a value as an outlier. Thus,
even if a value is very far from the rest of the data if may not be flagged as an
outlier due to itself (see Wilcox, 1998a, 2002), a situation called masking
(Wilcox & Keselman, 2003). Additionally, estimates obtained from a sample
where outliers are eliminated violate the assumption that the observations are
independent and the standard errors calculated are therefore incorrect
(Wilcox, 1998a). Better ways of dealing with outliers include an empirical
exploration of potential outliers using m-estimators or the median absolute
deviation, which is discussed below.
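Masking can be made concrete with a small sketch. The Python code below is an
illustration only and is not the R code used in this study; the data are
invented, the function names are hypothetical, and the cutoffs (2 standard
deviations for the classic rule, 2.24 for the MAD-based rule) are assumed
conventional choices rather than values taken from the SANO analyses.

```python
import statistics

def sd_rule(values, cutoff=2.0):
    """Flag values more than `cutoff` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) / sd > cutoff for v in values]

def mad_median_rule(values, cutoff=2.24):
    """Flag values far from the median, scaled by MAD / 0.6745.

    Assumes the MAD is non-zero (otherwise the ratio is undefined).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    madn = mad / 0.6745  # rescaled to estimate the SD under normality
    return [abs(v - med) / madn > cutoff for v in values]

# Invented data: eight ordinary scores and two gross outliers.
data = [2, 2, 3, 3, 3, 4, 4, 4, 100000, 100000]

print("SD rule:        ", sd_rule(data))
print("MAD-median rule:", mad_median_rule(data))
```

In this invented sample the two extreme values inflate the standard deviation
so much that they sit fewer than 2 standard deviations from the mean, so the
classic rule flags nothing, while the median-based rule flags both.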
Heteroskedasticity
There were vast differences in variances across groups, large enough
to be observable by the naked eye, chiefly in the change scores. This is not
particularly surprising in a trial where different interventions are expected
to result in different effect sizes. However, it does pose challenges for
computing accurate confidence intervals. Wilcox (1998a) shows that with classic
methods, heteroskedasticity leads to an incorrect standard error,
resulting in poor power and wide confidence intervals, even when the
normality assumption is met (Wilcox, 2002; Wilcox & Keselman, 2003).
Standard tests for equality of variances (e.g., Kolmogorov) can be performed,
and were in the SANO study, but they themselves have low power to detect
differences (Wilcox, 1998a, 2002). These problems are only exacerbated
when the two distributions being compared also differ in skew (Wilcox &
Keselman, 2003).
Heterogeneous population sampled
When the study was designed, it was assumed that the effect of the
intervention would be similar for boys and girls. There were two pilot studies
conducted before the main trial. A 12-week strength training intervention was
given to boys only (Shaibi et al., 2006) and a nutrition education training (no
control group) was given to girls (Davis et al., 2007). In SANO, all intervention
components were held in unisex environments. Preliminary examination of
analyses stratified by gender showed that boys and girls responded differently
to the intervention (results not shown; Lane, 2007). However, the sample was
powered only to analyze combined groups, and thus gender was treated as a
covariate in all analyses. Pre-test values did not differ significantly
across groups at baseline; however, the small sample size may have limited
the power to detect differences.
When covariates are included in a model, the assumptions discussed
above still hold, and additional assumptions relating to the conditional variance
of the dependent variable given the covariate are added. These assumptions
include homoskedasticity of the conditional variances and the same (linear)
slope across groups (Wilcox, 2005) so that the distance between the
regression lines (1 covariate case), or regression planes (2+ covariates)
between groups is maintained no matter where the comparison is made along
the distribution of the covariate. When this assumption of parallel lines does
not hold up, then the choice of covariate points at which to compare the
intervention group means is critical to determining the group effect (Wilcox,
2005). ANCOVA typically makes this comparison at the average of the
covariates, which may not reflect a typical difference. The non-traditional
approaches used here do not make this assumption.
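Why the comparison point matters when the lines are not parallel can be shown
with a short sketch. The intercepts and slopes below are hypothetical numbers
chosen for illustration, not SANO estimates, and the function name is an
assumption of this example.

```python
# Hypothetical regression lines (intercept, slope) for two groups.
# The slopes differ, so the lines are not parallel.
CONTROL = (10, 1)
TREATED = (-10, 2)

def group_difference(x):
    """Treated minus control fitted value at covariate value x."""
    control_fit = CONTROL[0] + CONTROL[1] * x
    treated_fit = TREATED[0] + TREATED[1] * x
    return treated_fit - control_fit

for x in (10, 20, 30):
    print(f"covariate = {x}: difference = {group_difference(x)}")
```

With these invented lines the estimated group difference is negative at a low
covariate value, zero at the middle value, and positive at a high value, so a
single comparison at the covariate average would not describe the other
regions.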
Given the challenges of the SANO data set, it is not particularly
surprising that most of the results of the study were null. However, these null
results leave open the possibility that re-examining these data using
non-traditional methods may reveal something that is masked by violations of
the assumptions of ANCOVA. The following approaches are alternatives to
traditional ANCOVA models.
Chapter 3. Approach #1: Robust ANCOVA Alternative
When it comes to hypothesis testing, the standard methods taught in
most basic statistics classes are dated and limited, especially when used with
applied data where assumptions are often unmet. This includes the ANCOVA
models commonly used to test intent-to-treat hypotheses in randomized
clinical trials, where even one extreme data point can make findings
non-significant. The first non-traditional approach uses robust ANCOVA
alternatives to test the intervention effect.
Robust estimators are statistics that are not greatly influenced by
changes to a small number of data points (Hampel, Ronchetti, Rousseeuw, &
Stahel, 1986; Huber, 1981). They have improved parameter estimation and
power properties as compared with conventional statistics when assumptions
are violated. When assumptions are not violated, they perform
comparably to traditional statistics (Wilcox, 2005). Therefore, one of the
main benefits of using robust methods is that one loses very little under
standard assumptions: if a classic method is significant, a robust one typically
is significant as well (Wilcox, 2002). Alternatively, if there really is no difference
across groups, then all methods control Type I error reasonably well. As
reported above, the SANO data posed several challenges to the
assumptions of an ANCOVA model; there may therefore be benefits to
using robust statistics that do not require these assumptions. While every
attempt was made in the SANO data to deal with these challenges using
standard methods of transformation and outlier elimination, it is worthwhile
to explore measures that do not rely on meeting the assumptions of ANCOVA.
Thus, the purpose of these analyses is to explore a robust alternative to
classic ANCOVA for the variables presented in Table 1.
What makes an estimator robust?
Given the problems with many standard estimators, an important
question is: What are we looking for in good estimators? A good estimator will
have smaller standard errors, meaning shorter confidence intervals, leading to
increased power to detect differences across groups (Wilcox, 1998a).
Additionally, a good estimator will be robust in the face of violations of the
assumptions in traditional methods listed above (Wilcox, 2002). A robust
measure is one where small changes in distribution do not mean large
changes in estimation, power or probability coverage (Wilcox, 1998a).
Wilcox (2005) reviews the characteristics of a robust estimator. These
include qualitative and quantitative robustness. An estimate has qualitative
robustness if the changes in the statistic reflect proportionate changes made
to the data. That is, a small change in the data does not make a large change
in the statistic. Quantitative robustness is represented by the breakdown point,
a practical measure of the fraction of the data that can be non-typical before
the estimator is affected. An estimator with a larger breakdown point has more
quantitative robustness (Wilcox, 1998b). For example, the breakdown point for
the population mean is zero: a single observation can drastically change this
parameter. The breakdown point of the median is 0.5, meaning that nearly half
of the data can be altered without ruining the point estimate of central
tendency.
What an ideal breakdown point may be is a point of debate. However,
somewhere between 0.1 and 0.2 seems preferable under many conditions
(Huber, 1993; Reed, 1998; Wilcox, 1998b). Prescott (2005) describes validity
robustness and efficiency robustness. Validity robustness is obtained when an
estimator maintains its Type I error rate when assumptions are violated.
Efficiency robustness is obtained when power is maintained under such
violations. Both are attractive qualities to have in an estimator. As can be
seen, robust statistics
are a logical next step to understanding the null findings of SANO.
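The breakdown points of the mean and the median described above can be
illustrated with a brief sketch. The data are invented and the helper
`contaminate` is a hypothetical function written only for this example; it is
not part of the software used in the study.

```python
import statistics

# Invented sample of nine well-behaved scores.
clean = [10, 11, 12, 13, 14, 15, 16, 17, 18]

def contaminate(values, k, bad=1_000_000):
    """Return a sorted copy with the k largest values replaced by `bad`."""
    ordered = sorted(values)
    return ordered[:len(ordered) - k] + [bad] * k

for k in range(6):
    sample = contaminate(clean, k)
    print(k, round(statistics.mean(sample), 1), statistics.median(sample))
```

In this sample a single replaced value already moves the mean far from the
bulk of the data, whereas the median stays at 14 until five of the nine values
(more than half) have been replaced.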
Research Question for Approach #1:
Do the results of the intent-to-treat analysis of SANO differ when robust
methods that do not share the assumptions of ANCOVA are used in its
place?
Chapter 4. Robust ANCOVA Methods
Given the small sample sizes, non-normally distributed data, and
heteroskedasticity in the key variables of SANO, robust estimators may offer
better power and more accuracy than traditional methods to test the
intervention effects. If there truly were no intervention effects, then these
methods should not conflict with the ANCOVA findings. However, if group
differences are masked by failure to meet the assumptions, these methods
may reveal that and give additional insight into the effectiveness of the SANO
intervention. Robust methods test a very similar hypothesis to ANCOVA. In
these analyses, it is assumed that:
Equation 3
\[ Y_{ij} = m_j(x) + \varepsilon_{ij} \]
where j denotes the group, i the individual in group j, x is a vector of
covariates, and $m_j(x)$ is a conditional measure of location for Y given the
covariates. The omnibus null hypothesis for these models is that
Equation 4
\[ H_0: m_1(x_k) = \cdots = m_J(x_k) \]
at any $x_k$, where J is the number of groups. This hypothesis, as shown in
Equation 4, also assumes that the
error, $\varepsilon_{ij}$, is independent of x and equals zero under the
measure of
location used for Y (Wilcox, 2008). However, unlike ANCOVA models, there is
no assumption that the error terms across groups are homogeneous.
The data used for these analyses are those reported in Table 1 and
include measures of the intervention objectives, body composition, motivation,
and glucose and insulin dynamics. The results for intervention effects from the
ANCOVA analyses will be used as a baseline comparison point for all
analyses. Additionally, when applicable, the robust ANCOVA functions will be
run with zero trimming to compare the performance of the methods. These
reference points reflect the “worst case” scenario in which the estimates fail
all criteria for robustness (Wilcox, 1998a, 1998b) and the data do not conform
to a single one of the assumptions of the method, as reported above. For the
purpose of these analyses, we are interested in assessing intervention effects
across three groups as shown in Equation 4.
Strategies for Applying Robust ANCOVA Alternatives
There are several robust ANCOVA strategies available to researchers
that can be applied to the comparison of treatment groups. Each involves
certain considerations. The first is the number of groups
that can be compared simultaneously. Most robust ANCOVA alternatives have
been proposed for comparisons of two groups (Bowman & Young, 1996;
Delgado, 1993; Dette & Neumeyer, 2001; Hall, Huber, & Speckman, 1997;
Kulasekera, 1995; Kulasekera & Wang, 1997; Neumeyer & Dette, 2003;
Young & Bowman, 1995); few can easily handle more
than two groups. The SANO study has three groups, which is a limiting factor
for many of these robust statistics. Thus, for relevant robust ANCOVA
methods that are limited to two groups, models will be run pairwise.
There is also the issue of the number of covariates that can be included
in the model. Most of these robust statistics do not account for
more than one covariate at a time, another limiting factor (Bowman & Young,
1996; Delgado, 1993; Dette & Neumeyer, 2001; Hall, Huber, & Speckman,
1997; Kulasekera, 1995; Kulasekera & Wang, 1997; Munk & Dette, 1998;
Neumeyer & Dette, 2003; Young & Bowman, 1995). When a particular robust
ANCOVA function allowed only one covariate, pre-test values were used,
and comparisons were then made for the total sample, as well as for girls and
boys separately. All models were run with R ("The R Project for Statistical
Computing") utilizing code referenced in Wilcox (2005).
Assumptions about the nature of the relationship between the covariate and
the outcome measure must also be considered. If a linear
relationship is assumed, and the regression lines are indeed parallel, then the
location of the comparison along the distribution of the covariate is not
important. However, if this assumption is not made, and the group differences
can vary along the covariate, then the location of the comparison is very
important, and the method of choosing such points becomes relevant.
One of the main benefits of this robust alternative to ANCOVA is the
flexibility of the measure of location; $m_j$ can be any measure of location,
not just the mean. To compare the SANO intervention groups, medians and
trimmed means were tested using the various robust ANCOVA techniques
described below. These measures of location have been demonstrated to be
more robust than the mean in cases where assumptions are not met (Wilcox,
2005). Additionally, these estimators were chosen because they are easily
comprehensible to researchers in the medical field who are not statisticians.
While there are other measures of location (e.g., M-estimators), these have
been shown to have limitations when it comes to hypothesis testing with small
samples (Wilcox, 1998b). Also, because the actual variable values are of
interest as outcomes, rank-based non-parametric methods were not utilized,
owing to the limitations in interpreting their results (Wilcox, 2002).
Median (Mdn)
Whereas the mean has a breakdown point of zero, the median has a
breakdown point of 0.5. That is, more than half of the data must be altered to
ruin the median (Wilcox, 1998a). A criticism of the median is that it may not be
as powerful when there are few outliers, though the reverse may also be true
(Wilcox, 1998b). As the SANO data do not follow a normal distribution, this is
not an anticipated problem.
Trimmed Mean ($M_{T\%}$)
Trimmed means are computed by ordering all the values and then
removing a percent of the data symmetrically from the tails. The amount of
trimming is determined a priori. Thus, the trimmed mean is
Equation 5
\[ M_{T\%} = \frac{X_{(g+1)} + \cdots + X_{(n-g)}}{n - 2g} \]
where $g = \gamma n$, rounded down to the nearest integer, and $\gamma$ is
the proportion of trimming (Wilcox, 2005). $M_{T\%}$ is the mean of the values
that remain after the g largest and g smallest values are removed. To compute
the standard errors for $M_{T\%}$, it is
necessary to adjust for the dependence created in the sample by trimming the
largest and smallest values. Details of this are reported in Appendix C.
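The trimmed mean in Equation 5 can be sketched in a few lines of Python. This
is an illustration under stated assumptions, not the R code used for the SANO
analyses: the function name `trimmed_mean`, the symbol `gamma` for the
trimming proportion, and the scores are all invented for the example.

```python
import math

def trimmed_mean(values, gamma=0.2):
    """Mean after dropping the g = floor(gamma * n) smallest and largest values."""
    if not 0 <= gamma < 0.5:
        raise ValueError("gamma must be at least 0 and less than 0.5")
    ordered = sorted(values)
    n = len(ordered)
    g = math.floor(gamma * n)   # number trimmed from each tail
    kept = ordered[g:n - g]     # X_(g+1), ..., X_(n-g)
    return sum(kept) / len(kept)

# Invented scores with one extreme value.
scores = [3, 5, 6, 7, 8, 9, 10, 11, 12, 90]

for gamma in (0.0, 0.1, 0.2):
    print(gamma, trimmed_mean(scores, gamma))
```

With zero trimming the function returns the ordinary mean, which the single
extreme score pulls upward; with 10% or 20% trimming that score is removed
before averaging.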
Trimmed means can be robust estimators, especially when used in
conjunction with Winsorized variances (see Appendix C) and a variety of
bootstrapping procedures (Keselman, Wilcox, & Lix, 2003; Othman,
Keselman, Padmanabhan, Wilcox, & Fradette, 2004; Wilcox, 1998a; Wilcox,
Keselman, & Kowalchuk, 1998). This should be beneficial in creating
estimates from the small N in the SANO data. Trimmed means have the
further benefit of allowing a direct comparison of means and of the
directionality of intervention effects (Wilcox, 2001). $M_{T\%}$ can have
smaller standard errors due to the dependence of the order statistics, an
advantage in applied work. The benefit of trimmed means can be limited if there
are more outliers than the percent of trimming (Wilcox, 2002). Twenty percent
trimming is a popular choice and seems to be the best trade-off between no
trimming and maximum trimming (the median). Additionally, it loses little
efficiency under normality, and is much more efficient than the mean with
heavy tails. In this study, three different levels of trimming were compared:
none (M), 20% ($M_{T20}$), and a midway point between these, 10% ($M_{T10}$).
One strategy is to compare estimated typical values across groups,
generally a measure of central tendency (Wilcox, 1997). In this strategy, the
estimate of location for each group ($M_j$), adjusting for covariates (gender
$x_1$ and pre-test $x_2$), is computed using regression:
Equation 6
\[ M_j = \beta_{0j} + \beta_1 X_1 + \beta_2 X_2 \]
Then the estimates of the slopes and intercepts are used to calculate
confidence intervals for each $M_j$, which are examined for overlap across
groups. This
method has the benefit of being able to use multiple covariates, but has the
limitation of assuming linear relationships between the covariates and Y. If this
assumption is not met, then the result will depend on the covariates chosen.
Because the assumption of linearity with the covariates (that is, parallel
regression lines) may not be valid, this study utilizes another strategy for
examining group differences at various coordinates (design points) along the
covariate axes. First, the design points are determined, either empirically or
from theory; then robust estimators are computed at each design point for
each intervention group; and finally these estimators are compared.
The covariate used in these analyses, pre-test scores, does not have
theoretically relevant design points, so the design points were estimated
empirically using a running interval smoother. This smoother finds the values
close to a potential design point $x_k$, where closeness is determined by the
span (f) applied to the median absolute deviation statistic (MAD; Wilcox,
1997; Wilcox, 2008). The MAD is an
estimate of the variability around the median: it is the median of the
absolute differences between each value and the sample median (denoted M in
Equation 7).
Equation 7
\[ \mathrm{MAD} = \mathrm{Mdn}\{|x_1 - M|, \ldots, |x_n - M|\} \]
The span is a constant, f, which scales the MAD to define a distance. Under
normality, MAD can be transformed to estimate the standard deviation by
applying the following formula
Equation 8
\[ \mathrm{MADN} = \mathrm{MAD}/0.6745 \]
Thus, if f = 1, the data points would be within one standard deviation of our
design points under normality. The running interval smoother has been shown
to be an effective smoother, especially when sample sizes are small, tails are
heavy, and there are multiple covariates, though the choice of span f may
affect the mean squared errors (e.g., Wilcox, 2005). This running interval
smoother does not assume any parametric form, including linearity (Wilcox,
2008). For the purpose of this study, f values of 0.8 and 1.0 were used, which
have been shown to be reasonable for many situations (Wilcox, 1997; Wilcox,
2005).
Design points can also be determined for more than one covariate, using a
method based on Tukey’s half-space depth (Tukey, 1975), which is an
extension of rank ordering to multivariate space.
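How the span selects the observations used at a design point can be sketched
as follows. This is a minimal Python illustration, not the R smoother used in
the study. It assumes that a covariate value counts as close to a candidate
design point when its distance from that point is at most f times MADN; the
pre-test values and the names `madn` and `near` are hypothetical.

```python
import statistics

def madn(values):
    """MAD rescaled by 0.6745 so that it estimates the SD under normality."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return mad / 0.6745

def near(x, point, f=0.8):
    """Indices of covariate values within f * MADN of a candidate design point."""
    limit = f * madn(x)
    return [i for i, xi in enumerate(x) if abs(xi - point) <= limit]

# Invented pre-test scores for one group.
pretest = [70, 75, 80, 85, 90, 95, 100, 105, 110, 150]

for f in (0.8, 1.0):
    idx = near(pretest, 95, f)
    print(f"f = {f}: {len(idx)} observations near 95 -> {idx}")
```

The number of indices returned is the local sample size available at that
design point, which is why a smaller span or a smaller group can leave too few
observations for a comparison.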
Choosing the number of design points is challenging as well.
Wilcox (1997) suggests a manner in which adequate sample size is assured
(12+ per group for 2 samples) when comparing trimmed means (Yuen, 1974).
Wilcox (2007) used sample sizes of $n_{jk} > 8$ in an examination of small
samples. In that article, simulations were performed with group sample sizes
of 30 to 50. With $n_{jk} = 30$ there was a 30% probability of not finding
design points using this method. This posed a serious concern for the SANO
study, where samples are even smaller. When possible, $n_{jk}$ was manipulated
to increase the chance of empirically estimated design points (DPE) being
found. However, there were many times that, even with $n_{jk}$ of 4, no DPE
could be found.
Keselman, Cribbie, and Wilcox (2002) describe pairwise comparisons
in a one-way randomized design in which confidence intervals for the
difference between trimmed means control the family-wise error rate.
They manipulated the simulated data to reflect several different challenges
common with real data, including unequal sample sizes, heteroskedasticity,
and non-normality. Their robust pairwise test (Keselman, Cribbie, & Wilcox,
2002; Keselman, Lix, & Kowalchuk, 1998) was computed with
Equation 9
\[ t_W = \frac{\bar{Y}_{tj} - \bar{Y}_{tj'}}{\sqrt{d_j + d_{j'}}} \]
where $\bar{Y}_{tj}$ is the trimmed mean in group j, and $d_j$ is the estimate
of the squared Winsorized standard error. This test statistic is compared
against a t-distribution with degrees of freedom estimated with
Equation 10
\[ \nu_W = \frac{(d_j + d_{j'})^2}{d_j^2/(h_j - 1) + d_{j'}^2/(h_{j'} - 1)} \]
where $h_j$ is the trimmed sample size. They additionally used step-down
bootstrapping methods to obtain critical values, though this bootstrapping
technique was not very successful under many of the different conditions,
even when samples were increased.
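The pairwise test statistic and its degrees of freedom can be sketched in
Python as below. This is an illustration under stated assumptions rather than
the code used in the study: it assumes the usual Yuen-type definition
$d_j = (n_j - 1) s_{wj}^2 / [h_j (h_j - 1)]$, where $s_{wj}^2$ is the
Winsorized variance, which this chapter defers to Appendix C. The function
names and the data are hypothetical.

```python
import math

def trimmed_stats(values, gamma=0.2):
    """Return (trimmed mean, d, h) for one group.

    d is the assumed squared standard error of the trimmed mean,
    (n - 1) * Winsorized variance / (h * (h - 1)); h is the trimmed n.
    """
    ordered = sorted(values)
    n = len(ordered)
    g = math.floor(gamma * n)
    h = n - 2 * g
    trimmed_mean = sum(ordered[g:n - g]) / h
    # Winsorize: pull each tail in to the nearest retained value.
    low, high = ordered[g], ordered[n - g - 1]
    winsorized = [min(max(v, low), high) for v in ordered]
    w_mean = sum(winsorized) / n
    w_var = sum((w - w_mean) ** 2 for w in winsorized) / (n - 1)
    d = (n - 1) * w_var / (h * (h - 1))
    return trimmed_mean, d, h

def yuen_welch(group_a, group_b, gamma=0.2):
    """Test statistic and estimated degrees of freedom for two groups."""
    m_a, d_a, h_a = trimmed_stats(group_a, gamma)
    m_b, d_b, h_b = trimmed_stats(group_b, gamma)
    t = (m_a - m_b) / math.sqrt(d_a + d_b)
    df = (d_a + d_b) ** 2 / (d_a ** 2 / (h_a - 1) + d_b ** 2 / (h_b - 1))
    return t, df

# Invented change scores for two groups.
group_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
group_2 = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
print(yuen_welch(group_1, group_2, gamma=0.2))
```

With `gamma=0.0` nothing is trimmed or Winsorized, and the sketch reduces to
the familiar unequal-variance (Welch) t statistic, which gives a simple check
on the arithmetic.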
Strategy #1: Omnibus Comparison of Regression Lines (ancmg function)
The R function ancmg was used to compute omnibus comparisons of
medians and trimmed means while controlling for both covariates. This
function has some benefits that the other functions do not. First, it allows the
comparison of more than two groups at a time. Additionally, multiple
covariates can be used simultaneously. Finally, $n_{jk}$ can be
changed. The default minimum sample size of $n_{jk} = 8$ will be used, and then
decreased if none of the statistics can be estimated.
For these analyses, comparisons were made for the median and for
$M_{T20}$. Additionally, the options for multiple comparisons across groups and
percentile bootstrap estimation were used. This function also has the option of
adjusting the minimum N per group used for the comparisons. Decreasing this
minimum N can help the function find a solution. Comparisons were first made
with $n_{jk} = 8$; however, no comparisons were possible. Then $n_{jk} = 6$
was used, though again this did not result in estimates. Finally, $n_{jk} = 4$
was used, which helped, though many of the models still did not run. This was
a major challenge for using this method.
Strategy #2: Pairwise Comparison at Different Design Points (ancova function)
Comparisons of this sort were performed using the ancova function.
This strategy makes no assumptions about normality or homoskedasticity,
nor does it assume a specific parametric relationship of the covariates with Y.
It compares well to ANCOVA when all these assumptions are met. It also has
the benefit of being useful with any measure of location, such as trimmed
means or medians. There are two disadvantages to this approach for the
SANO data. The first is that only two groups can be compared at once.
Because of this, pairwise comparisons will be made between all groups. The
second disadvantage is that this method allows only one covariate. Pre-test
was the covariate of greater importance, so it was chosen for analysis.
Additionally, the sample was stratified by gender and the models were run
separately for boys and girls.
The ancova function was modeled with the DPE first. However, to
ensure that there were design points at which comparisons could be made,
the ancova models were also performed with design points that represented
the robust measure of central location for males (DPM) and females (DPF). In
this way, at least 2 design points were tested for each sample. DPM and DPF
were designated for the total sample at the levels of males and females such
that the level of trimming of the design points matched that of the measure
being compared. Thus, when $M_{T20}$ was compared, the two design points used
reflect the $M_{T20}$ for boys and the $M_{T20}$ for girls, and when $M_{T10}$
was compared, the boys' and girls' design points reflect $M_{T10}$ for each
group.
At these design points, the ancova function was modeled for M,
$M_{T10}$, and $M_{T20}$. Just as with ANCOVA, an F-statistic is computed to
test the intervention effect. Details of the calculation of the F-statistics
for the median and the trimmed mean are reported in Appendix C. The ancova
model provides p-values and confidence intervals for the difference between
groups at each design point tested. Adjustments were made for multiple
comparisons so as to control the probability of at least one Type I error,
which can be particularly important when samples are non-normal or of unequal
sample size (Westfall, Tobias, Rom, Wolfinger, & Hochberg, 1999).
Strategy #3: Comparison of Regression Depths (ancsm and qancsm functions)
Pairwise comparisons of regression depth for trimmed means and
medians will be computed using the ancsm and qancsm functions, respectively.
These functions provide a global comparison of two regression lines (or
hyperplanes if there are more than two covariates) estimated from a running
interval smoother, so there is no assumption about linearity. Comparisons of
depths are made by comparing the rank of each regression line. The rank is
the number of data points that need to be removed to make the line a “non-fit”.
A non-fit is a partition of the x values where all the residual values below it
are in one direction (either positive or negative). That is, a non-fit is
present if none of the data points run through the line. Thus, the rank
assesses the number of points one would need to remove to create a fulcrum in
the data, or in practical terms, the number of residuals that would need to
change signs.
As with the ancmg function, multiple covariates can be controlled for
simultaneously. When comparing depth, there are no assumptions about
identical covariates between groups, homoskedasticity, or the distribution of
the error term. Because only two groups may be compared at one time,
pairwise comparisons will be performed.
Effect Sizes
Given the small sample sizes in the SANO data, significance may be difficult
to achieve even with robust methods. To gain a further
understanding of whether effects are non-significant because they are
small or because of the sample size, various effect sizes were estimated for
the pairwise comparisons. Effect sizes were estimated with the akp.effect
function, which computes a robust effect size for trimmed means as described
by Algina, Keselman, and Penfield (2005). This follows a similar method of
calculation to Cohen’s $\delta$ (Wilcox, 2004) and can be interpreted in the
same way. In this method, robust measures of location (trimmed means) are used
in place of means, and the Winsorized standard deviation is used in place of
the pooled SD. The ratio of the difference in $M_T$ to $SD_W$ is then weighted
by 0.642, which ensures that the robust $\delta_R$ is comparable to $\delta$
when the population is normally distributed and the variances are equal. While
there are several other robust
estimators of effect size, this one was chosen because it is most likely to be
accepted by a medical audience due to its similarity to Cohen’s $\delta$.
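The calculation described above can be sketched in Python. This is a hedged
illustration, not the akp.effect function itself: it assumes 20% trimming
(the level to which the 0.642 weight corresponds) and a pooled Winsorized
variance weighted by each group's degrees of freedom. The function names and
data are invented for the example.

```python
import math

def trim_and_winsorize(values, gamma=0.2):
    """Return (trimmed mean, Winsorized variance, n) for one group."""
    ordered = sorted(values)
    n = len(ordered)
    g = math.floor(gamma * n)
    trimmed_mean = sum(ordered[g:n - g]) / (n - 2 * g)
    low, high = ordered[g], ordered[n - g - 1]
    winsorized = [min(max(v, low), high) for v in ordered]
    w_mean = sum(winsorized) / n
    w_var = sum((w - w_mean) ** 2 for w in winsorized) / (n - 1)
    return trimmed_mean, w_var, n

def robust_effect_size(group_a, group_b, weight=0.642):
    """Difference in 20% trimmed means over the pooled Winsorized SD, reweighted."""
    m_a, v_a, n_a = trim_and_winsorize(group_a)
    m_b, v_b, n_b = trim_and_winsorize(group_b)
    # Assumed pooling: degrees-of-freedom weighted average of the variances.
    pooled_var = ((n_a - 1) * v_a + (n_b - 1) * v_b) / (n_a + n_b - 2)
    return weight * (m_a - m_b) / math.sqrt(pooled_var)

# Invented data: the second group is the first shifted upward by 5 units.
group_a = list(range(1, 11))
group_b = [v + 5 for v in group_a]
print(robust_effect_size(group_a, group_b))
```

The sign of the result follows the order of the arguments, so a negative value
here simply means the first group's trimmed mean is the lower of the two.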
Figures of Running Interval Smoother
Because these methods do not carry the assumption of a linear
relationship between covariates and dependent variables, it can be difficult to
understand the points of comparison. Thus, figures were created which show
graphical representation of the running interval smoother (function
runmean2g) so that the design points can be visualized.
Comparing effectiveness of various robust methods
To compare the effectiveness of each of the robust methods used for
testing the SANO study hypothesis, the methods were compared using the
criteria listed below. These criteria include statistical evaluation as well
as practical concerns that would be relevant to a medical audience, such as
those who would be reviewers:
1. Was this technique effective in achieving a statistical result?
2. Does this result make sense in light of the theoretical model?
3. Which method had the smallest standard errors?
4. Was this method easy to use?
5. Would the explanation of these statistics be acceptable to a
medical audience? (I.e., could one publish these results in a
medical journal?)
Chapter 5. Robust ANCOVA Results
Models
Descriptive statistics were computed for means, trimmed means (10%
and 20%) and medians. These are reported in Tables 2 to 17. While
proceeding through the analyses, many of the robust statistics would not run
given the sample sizes. For this reason, several of the analyses could not be
completed. This was particularly true with any function that utilized the running
interval smoother, such as ancova, which attempts to find five design points at
which to make comparisons. This issue was largely resolved when the a priori
gender specific design points were included. Most of the functions only
allowed one covariate. Pre-test values were always included as covariates.
When possible, given sample sizes, gender stratified models were completed
as well.
Construct specific results
Strength
The outcome variables for strength are bench press and leg press
(Tables 2 and 3). There were no design points found when testing omnibus
effects for bench press with the running interval smoother, which limited the
analyses possible in terms of simultaneously comparing all three means and
medians (ancmg function). For the ancova function, which performs pairwise
comparisons of trimmed means, two design points were found, 95 and 100
(see Table 2). Results were consistent across different levels of trimming
(20%, 10%, 0%), whereby the Con participants increased more in bench press
than Nut, and the Nut+ST participants increased more than both Con and Nut.
When design points were specified, some differences were found. There was a
difference between Con and Nut at the boys' $M_{T20}$ and M, and a trend at
the girls' M. There was a difference between Con and Nut+ST at the girls' M,
and a trend at the girls' $M_{T20}$ and the boys' M. There was a difference
between Nut and Nut+ST at both $M_{T20}$ and M. Models comparing $M_{T10}$ did
not run. Analysis of the depth (ancsm function) differed from these results,
with a difference being found only between Nut and Nut+ST, though the analyses
for Con vs. Nut+ST would not run, a limiting factor. As hypothesized, the
Nut+ST group gained more in strength.
As with bench press, there were no design points found for leg press
with the running interval smoother, which limited the possible analyses in
terms of the omnibus tests of means and medians (ancmg function). This also was
a challenge for the ancova function, which could not be completed when the
program estimated the design points (Table 3). When design points were
specified, there was a difference between Nut and Nut+ST at the boys'
$M_{T10}$ and M, and at the girls' M. Models comparing $M_{T20}$ did not run.
Analysis of the depth (ancsm function) showed no differences across groups
when trimmed means were compared, though there were differences between Con
and Nut+ST, and between Nut and Nut+ST, when medians were used.
Table 2. Robust ANCOVA Results: Bench Press (kg)

ancmg: # DPE = 0

ancova: cell entries are p (CI) at each design point

Comparison       Measure     DPE = 95             DPE = 100               DPM                     DPF
Con vs. Nut      $M_{T20}$   0.01 (1.4, 28.6)     0.02 (-0.8, 27.9)       <0.01 (5.0, 32.2)       0.20 (-4.6, 14.9)
                 $M_{T10}$   <0.01 (2.3, 27.7)    0.01 (1.2, 26.4)        ---                     ---
                 M           <0.01 (2.7, 27.3)    <0.01 (1.1, 26.1)       <0.01 (3.0, 31.2)       0.09 (-3.4, 21.4)
Con vs. Nut+ST   $M_{T20}$   ---                  0.01 (-30.1, 4.7)       0.22 (-20.9, 6.9)       0.08 (-36.9, 5.4)
                 $M_{T10}$   ---                  0.03 (-29.1, 2.8)       ---                     ---
                 M           ---                  0.01 (-30.1, -0.1)      0.09 (-27.3, 4.3)       0.01 (-29.3, -2.9)
Nut vs. Nut+ST   $M_{T20}$   ---                  <0.01 (-44.3, -8.2)     <0.01 (-39.8, -11.5)    0.03 (-42.0, 0.4)
                 $M_{T10}$   ---                  <0.01 (-43.4, -10.6)    ---                     ---
                 M           ---                  <0.01 (-44.9, -12.6)    <0.01 (-44.3, -12.9)    0.01 (-40.5, -9.7)
Table 2, Continued
Con vs. Nut Con vs Nut+ST Nut vs. Nut+ST
ancsm
qancsm
Total
0.20
0.03
Male
---
Female
---
Total
---
Male
---
Female
---
Total
---
Male
---
Female
---
--- no estimation; DPE-design point estimated empirically; DPM/F -design points male/female; CI's are adjusted for multiple
comparisons
Table 3. Robust ANCOVA Results: Leg Press (kg)

ancmg: # DPE = 0

ancova: cell entries are p (CI) at each design point

Comparison       Measure     DPE    DPM                    DPF
Con vs. Nut      $M_{T20}$   ---    ---                    ---
                 $M_{T10}$   ---    0.45 (-219, 355)       0.36 (-159, 323)
                 M           ---    0.45 (-223, 359)       0.43 (-169, 313)
Con vs. Nut+ST   $M_{T20}$   ---    ---                    ---
                 $M_{T10}$   ---    0.14 (-427, 118)       0.97 (-240, 247)
                 M           ---    0.13 (-430, 114)       0.90 (-255, 232)
Nut vs. Nut+ST   $M_{T20}$   ---    ---                    ---
                 $M_{T10}$   ---    <0.01 (-341, -104)     0.06 (-176, 19)
                 M           ---    <0.01 (-358, -95)      0.05 (-182, 16)

ancsm and qancsm: cell entries are p

Comparison       Sample    ancsm    qancsm
Con vs. Nut      Total     0.83     0.36
                 Male      ---      ---
                 Female    ---      ---
Con vs. Nut+ST   Total     0.42     0.02
                 Male      ---      ---
                 Female    ---      ---
Nut vs. Nut+ST   Total     0.23     0.01
                 Male      ---      ---
                 Female    0.28     0.82

--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points for males/females; CIs are adjusted for multiple comparisons
39
Dietary Intake
The outcome variables for dietary intake are total energy intake, fiber
intake, total sugar intake and added sugar intake (Tables 4 – 7). The running
interval smoother found no design points for any of these variables, which
limited the analyses possible for the omnibus tests of means and medians
(ancmg function). This was likewise a challenge for the ancova function, which
worked only sporadically for these variables. For total energy, one design point
could be found for the comparison of Con vs. Nut, and the groups did not differ
(Table 4). When design points were specified, there was a difference between
Con and Nut at girls M_T20, and a trend at girls M_T10. There was a difference
between Con and Nut+ST at girls M_T20 and M_T10, and a trend for boys at those
levels as well as at girls M. There was a difference between Nut and Nut+ST at
boys M_T20 and a trend at boys M_T10 and M. None of the depth comparisons, for
trimmed means or medians, differed across groups.
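The role of design points can be illustrated with a minimal Python sketch. It screens candidate covariate values and keeps only those with enough nearby observations in both groups, which is one plausible reason no design points are found in small samples. The closeness rule (a multiple of the normalized MAD), the `span` and `min_n` defaults, and all function names are assumptions made for illustration; they are not taken from the functions or settings used in this analysis.

```python
import statistics


def madn(values):
    """Median absolute deviation, rescaled to estimate sigma under normality."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values) / 0.6745


def near(x_values, x0, span=1.0):
    """Indices of observations whose covariate lies within span * MADN of x0."""
    width = span * madn(x_values)
    return [i for i, x in enumerate(x_values) if abs(x - x0) <= width]


def usable_design_points(x1, x2, candidates, span=1.0, min_n=12):
    """Keep candidate covariate values with at least min_n neighbours in both groups.

    span and min_n are hypothetical defaults chosen for this sketch.
    """
    return [
        x0
        for x0 in candidates
        if len(near(x1, x0, span)) >= min_n and len(near(x2, x0, span)) >= min_n
    ]
```

With only a handful of participants per group, no candidate value can reach the minimum neighbour count, so the list of usable design points comes back empty.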
For total fiber, no design points could be found for the ancova function,
and when they were specified, there were no group differences (Table 5). All
measures of depth were likewise non-significant. Findings for total and added
sugar were similar, indicating that much of the sugar cut by participants came
from sugar added to processed foods. For total sugar, design points were found
for the ancova function for the Con vs. Nut and Nut vs. Nut+ST comparisons
(Table 6). These did not differ, and when design points were specified there
were no differences between Con and Nut, nor between Nut and Nut+ST. There was
a difference
Table 4. Robust ANCOVA Results: Total Energy Intake (kcal)
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE 1914.4 | DPM | DPF
Con vs. Nut | M_T20 | 0.09 (-343,1413) | 0.32 (-624,1426) | <0.01 (113,1029)
Con vs. Nut | M_T10 | 0.08 (-266,1215) | 0.44 (-656,1244) | 0.11 (-196,1064)
Con vs. Nut | M | 0.18 (-469,1347) | 0.58 (-643,1012) | 0.25 (-421,1208)
Con vs. Nut+ST | M_T20 | --- | 0.06 (-210,1843) | <0.01 (310,1323)
Con vs. Nut+ST | M_T10 | --- | 0.10 (-300,1639) | 0.02 (6,1331)
Con vs. Nut+ST | M | --- | 0.14 (-323,1400) | 0.11 (-259,1423)
Nut vs. Nut+ST | M_T20 | --- | 0.04 (-53,884) | 0.22 (-235,726)
Nut vs. Nut+ST | M_T10 | --- | 0.11 (-168,919) | 0.22 (-214,683)
Nut vs. Nut+ST | M | --- | 0.11 (-156,864) | 0.33 (-264,641)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.49 | 0.25
Con vs. Nut | Male | 0.80 | 0.98
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.19 | 0.30
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.44 | 0.58
Nut vs. Nut+ST | Male | 0.81 | ---
Nut vs. Nut+ST | Female | 0.87 | 0.81
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
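Table 5. Robust ANCOVA Results: Total Fiber (g)
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.67 (-8.7,6.2) | 0.67 (-8.7,6.2)
Con vs. Nut | M_T10 | --- | 0.80 (-7.0,8.6) | 0.87 (-8.2,7.2)
Con vs. Nut | M | --- | 0.86 (-8.6,7.5) | 0.86 (-8.6,7.5)
Con vs. Nut+ST | M_T20 | --- | 0.73 (-12.8,9.7) | 0.73 (-12.8,9.7)
Con vs. Nut+ST | M_T10 | --- | 0.69 (-8.4,11.8) | 0.78 (-11.7,9.3)
Con vs. Nut+ST | M | --- | 0.83 (-9.0,10.8) | 0.83 (-9.0,10.8)
Nut vs. Nut+ST | M_T20 | --- | 0.95 (-11.0,10.4) | 0.95 (-10.0,10.4)
Nut vs. Nut+ST | M_T10 | --- | 0.80 (-7.9,9.6) | 0.85 (-10.0,8.6)
Nut vs. Nut+ST | M | --- | 0.69 (-7.6,10.5) | 0.69 (-7.6,10.5)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.88 | 0.82
Con vs. Nut | Male | 0.30 | 0.97
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.75 | 0.26
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.96 | 0.12
Nut vs. Nut+ST | Male | 0.45 | ---
Nut vs. Nut+ST | Female | 0.81 | 0.83
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons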
Table 6. Robust ANCOVA Results: Total Sugar (g)
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE 112.3 | DPE 98.9 | DPM | DPF
Con vs. Nut | M_T20 | 0.36 (-40.0,78.0) | --- | 0.35 (-31.0,68.9) | 0.60 (-51.6,78.0)
Con vs. Nut | M_T10 | 0.46 (40.6,69.7) | --- | 0.27 (-28.3,75.9) | 0.38 (-32.7,69.1)
Con vs. Nut | M | 0.41 (-42.2,77.7) | --- | 0.54 (-38.3,65.2) | 0.37 (-30.9,69.1)
Con vs. Nut+ST | M_T20 | --- | --- | 0.04 (-4.0,102.0) | 0.56 (-51.5,83.3)
Con vs. Nut+ST | M_T10 | --- | --- | 0.05 (-7.0,101.0) | 0.25 (-29.7,85.1)
Con vs. Nut+ST | M | --- | --- | 0.06 (-8.5,88.5) | 0.22 (-24.5,80.0)
Nut vs. Nut+ST | M_T20 | --- | 0.69 (-53.8,70.5) | 0.13 (-16.9,76.9) | 0.88 (-42.6,47.9)
Nut vs. Nut+ST | M_T10 | --- | 0.60 (-42.5,61.4) | 0.17 (-16.7,63.1) | 0.60 (-34.7,53.6)
Nut vs. Nut+ST | M | --- | 0.63 (-41.4,58.8) | 0.18 (-19.2,72.3) | 0.63 (-34.2,51.7)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.90 | 0.57
Con vs. Nut | Male | --- | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 1.0 | 0.30
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.84 | 0.53
Nut vs. Nut+ST | Male | 0.69 | ---
Nut vs. Nut+ST | Female | 0.86 | 0.98
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
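Table 7. Robust ANCOVA Results: Added Sugar (g)
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.31 (-33.8,82.6) | 0.30 (-32.6,77.6)
Con vs. Nut | M_T10 | --- | 0.39 (-41.0,87.5) | 0.31 (-30.8,74.6)
Con vs. Nut | M | --- | 0.32 (-38.5,93.8) | 0.32 (-31.2,74.9)
Con vs. Nut+ST | M_T20 | --- | --- | ---
Con vs. Nut+ST | M_T10 | --- | --- | ---
Con vs. Nut+ST | M | --- | --- | ---
Nut vs. Nut+ST | M_T20 | --- | --- | ---
Nut vs. Nut+ST | M_T10 | --- | --- | ---
Nut vs. Nut+ST | M | --- | --- | ---
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.16 | 0.60
Con vs. Nut | Male | 0.75 | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.59 | 0.78
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.13 | 0.66
Nut vs. Nut+ST | Male | 0.63 | ---
Nut vs. Nut+ST | Female | --- | 0.85
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons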
between Con and Nut+ST at boys M_T20 and M_T10, and trends at boys M and girls
M_T20. There were no group differences in regression depths. Models of added
sugar similarly showed no difference in ancova models, and had problems with
estimating points, even when design points were estimated, though there was a
trend between Nut and Nut+ST at girls M_T20 (Table 7). Depths were all
non-significant.
Motivation
The outcome variables for motivation are the relative autonomy index
(RAI) for exercise (Table 8), and three factors for eating fruits and vegetables:
autonomous, controlled, and amotivation (Tables 9 – 11). For RAI the omnibus
comparison showed some points of difference across groups (Table 8),
especially when percentile bootstrapping was used, and the pairwise comparisons
confirmed this, indicating the differences were between Con and both of the
intervention groups. The ancova models found a couple of design points, but
there were no group differences. When design points were estimated, none of
the differences between groups were significant. When regression depths were
compared, there were no significant effects for either trimmed means or medians.
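As a rough illustration of the percentile bootstrapping mentioned above, the following Python sketch builds a percentile bootstrap confidence interval for the difference between two 20% trimmed means. It is a generic textbook version, not the procedure inside the ancmg function; the function names, the number of resamples, and the fixed seed are assumptions made for the example.

```python
import random
import statistics


def trimmed_mean(values, trim=0.2):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    xs = sorted(values)
    g = int(trim * len(xs))
    return statistics.fmean(xs[g:len(xs) - g])


def percentile_bootstrap_ci(a, b, trim=0.2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for trimmed_mean(a) - trimmed_mean(b).

    Each group is resampled with replacement independently; the CI is read
    off the sorted bootstrap differences. n_boot and seed are illustrative.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(trimmed_mean(resample_a, trim) - trimmed_mean(resample_b, trim))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An interval that excludes zero would be read as a group difference at the chosen alpha level; no distributional assumption about the outcome is needed, which is why the approach suits small, skewed samples.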
For all of the motivation to eat fruits and vegetables factors, no design
points could be found for the omnibus ancmg models (Tables 9 – 11). For the
ancova models, models would not run for the autonomous and controlled factors.
While a couple of design points were found for the controlled factor, none of the
Table 8. Robust ANCOVA Results: Relative Autonomy Index (RAI)
ancmg omnibus: # DPE = 1; M_T20 p = 0.07; Mdn p = 0.02
ancmg pairwise: p
Comparison | M_T20 | Mdn
Con vs. Nut | <0.01 | <0.01
Con vs. Nut+ST | 0.03 | 0.11
Nut vs. Nut+ST | 0.42 | 0.44
Table 8, Continued
Con vs. Nut Con vs. Nut+ST Nut vs. Nut+ST
ancova
DPE
8.75 p
(CI!)
10.5 p
(CI!)
11.5 p
(CI!)
DPM p
(CI!)
DPF p
(CI!)
M
T20
0.25
(-6.5,2.7)
---
---
0.37
(-5.5,2.5)
0.38
(-5.1,2.3)
M
T10
0.19
(-7.3,2.5)
---
---
0.29
(-6.0,2.3)
0.35
(-6.1,2.7)
M
0.24
(-7.2,2.8)
---
---
0.35
(-6.1,2.6)
0.40
(-6.2,2.9)
ancsm
qancsm
Total
0.74
0.03
Male
0.54
---
Female
---
---
Total
0.73
---
Male
---
---
Female
---
---
Total
0.83
---
Male
0.99
---
Female
0.50
---
--- no estimation; DPE-design point estimated empirically; DPM/F -design points male/female; CI's are adjusted for multiple comparisons
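Table 9. Robust ANCOVA Results: Autonomous
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.37 (-2.3,1.1) | 0.47 (-1.4,0.8)
Con vs. Nut | M_T10 | --- | 0.17 (-2.2,0.6) | 0.38 (-1.4,0.6)
Con vs. Nut | M | --- | 0.16 (-2.3,0.6) | 0.57 (-2.6,1.8)
Con vs. Nut+ST | M_T20 | --- | 0.16 (-2.7,0.8) | 0.51 (-1.4,0.8)
Con vs. Nut+ST | M_T10 | --- | 0.09 (-2.5,0.4) | 0.31 (-1.5,0.6)
Con vs. Nut+ST | M | --- | 0.09 (-2.5,0.4) | 0.52 (-2.7,1.7)
Nut vs. Nut+ST | M_T20 | --- | 0.54 (0.4,-1.3) | 0.80 (0.3,-0.8)
Nut vs. Nut+ST | M_T10 | --- | 0.57 (0.3,-1.1) | 0.69 (0.3,-0.9)
Nut vs. Nut+ST | M | --- | 0.48 (-1.1,0.6) | 0.85 (-0.8,0.7)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.73 | 0.07
Con vs. Nut | Male | --- | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.63 | 0.81
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.96 | 0.31
Nut vs. Nut+ST | Male | --- | ---
Nut vs. Nut+ST | Female | --- | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons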
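Table 10. Robust ANCOVA Results: Controlled
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.23 (-1.5,0.5) | 0.10 (-1.5,0.3)
Con vs. Nut | M_T10 | --- | 0.14 (-1.4,0.3) | 0.15 (-1.4,0.3)
Con vs. Nut | M | --- | 0.09 (-1.4,0.2) | 0.09 (-1.4,0.2)
Con vs. Nut+ST | M_T20 | --- | 0.37 (-1.2,0.6) | 0.28 (-1.4,0.5)
Con vs. Nut+ST | M_T10 | --- | 0.21 (-1.2,0.4) | 0.22 (-1.2,0.4)
Con vs. Nut+ST | M | --- | 0.24 (-1.2,0.4) | 0.18 (-1.1,0.3)
Nut vs. Nut+ST | M_T20 | --- | 0.70 (-0.8,1.1) | 0.60 (-1.8,1.1)
Nut vs. Nut+ST | M_T10 | --- | 0.76 (-0.8,1.1) | 0.70 (-0.8,1.1)
Nut vs. Nut+ST | M | --- | 0.59 (-0.7,1.2) | 0.58 (-0.7,1.1)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.54 | 0.80
Con vs. Nut | Male | --- | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.31 | 0.29
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.51 | 0.38
Nut vs. Nut+ST | Male | --- | ---
Nut vs. Nut+ST | Female | --- | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons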
Table 11. Robust ANCOVA Results: Amotivation
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE 2.0 | DPE 2.7 | DPE 3.0 | DPE 3.3 | DPM | DPF
Con vs. Nut | M_T20 | 0.54 (-1.7,0.5) | --- | --- | --- | 0.35 (-1.7,0.8) | 0.42 (-2.0,1.0)
Con vs. Nut | M_T10 | --- | --- | --- | --- | 0.56 (-1.5,0.9) | 0.79 (-1.8,1.4)
Con vs. Nut | M | 0.56 (-1.6,1.1) | --- | --- | --- | 0.82 (-1.4,1.2) | 0.87 (-1.7,1.5)
Con vs. Nut+ST | M_T20 | --- | --- | --- | --- | 0.59 (-1.4,0.8) | 0.80 (-1.3,1.7)
Con vs. Nut+ST | M_T10 | --- | --- | --- | --- | 0.70 (-1.3,0.9) | 0.61 (-1.1,1.9)
Con vs. Nut+ST | M | --- | --- | --- | --- | 0.70 (-1.0,1.4) | 0.53 (-1.0,1.9)
Nut vs. Nut+ST | M_T20 | --- | 0.42 (-1.1,1.9) | 0.59 (-1.9,2.7) | 0.94 (-2.2,2.3) | 0.59 (-0.5,0.9) | 0.27 (-0.6,1.9)
Nut vs. Nut+ST | M_T10 | --- | 0.57 (-1.3,1.9) | 0.61 (-1.6,2.4) | 0.97 (-1.9,2.0) | 0.86 (-1.0,1.2) | 0.36 (-1.7,1.8)
Nut vs. Nut+ST | M | --- | 0.50 (-1.1,1.8) | 0.48 (-1.3,2.3) | 0.92 (-1.8,1.7) | 0.50 (-0.7,1.4) | 0.32 (-0.6,1.6)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | --- | 0.05
Con vs. Nut | Male | --- | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | --- | ---
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.90 | 0.67
Nut vs. Nut+ST | Male | --- | ---
Nut vs. Nut+ST | Female | --- | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
comparisons indicated group differences. When design points were specified,
there was a trend between Con and Nut for the controlled factor at girls M_T20
and at boys and girls M. There was a trend toward a difference between Con and
Nut+ST at boys M_T10 and M for the autonomous factor. There were no
differences found between any of the groups for the amotivation factor.
Comparisons of regression depths for trimmed means likewise showed no
effects, though there were some indications of a Con vs. Nut difference in the
autonomous and amotivation factors when medians were compared (Tables 9 and 11).
Body Composition
The main outcome variables for body composition are BMI Z-score, total
fat weight and total lean weight (Tables 12 – 14). Omnibus comparisons of BMI
Z-score were significant for trimmed means and medians (Table 12); pairwise
comparisons showed the differences were between the Nut and the other two
groups. The ancova models found no design points. When design points were
specified, there were trends found between Con and Nut and a significant effect
between Con and Nut+ST at all girl levels (M_T20, M_T10, and M). Comparisons of
regression depths were all non-significant. For total body fat, there was some
indication of an effect in the trimmed mean and median global tests, which was
seen in the Nut vs. the other two groups, as with BMI Z-score (Table 13). Only
one design point could be found for the ancova model between Nut and Nut+ST,
which was not significant. When design points were specified, there were no
differences found. Additionally, there was a trend toward a point of difference
between Con and Nut+ST in M_T10 and one in M. Despite this, none of the depths,
Table 12. Robust ANCOVA Results: BMI Z-score
ancmg omnibus: # DPE = 1; M_T20 p = 0.01; Mdn p < 0.01
ancmg pairwise: p
Comparison | M_T20 | Mdn
Con vs. Nut | <0.01 | 0.01
Con vs. Nut+ST | 0.49 | 0.67
Nut vs. Nut+ST | <0.01 | <0.01
ancova: p (CI) at each design point
Comparison | Estimator | DPE | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.09 (0.03,-0.1) | 0.07 (0.03,-0.1)
Con vs. Nut | M_T10 | --- | 0.71 (-0.1,0.1) | 0.76 (-0.1,0.1)
Con vs. Nut | M | --- | 0.72 (-0.1,0.1) | 0.89 (-0.1,0.1)
Con vs. Nut+ST | M_T20 | --- | 0.29 (-0.1,0.03) | 0.02 (-0.1,-0.01)
Con vs. Nut+ST | M_T10 | --- | 0.75 (0.05,-0.1) | 0.02 (0.02,-0.1)
Con vs. Nut+ST | M | --- | 0.75 (-0.1,0.1) | 0.03 (-0.1,0.0)
Nut vs. Nut+ST | M_T20 | --- | 0.20 (-0.1,0.03) | 0.11 (-0.1,0.03)
Nut vs. Nut+ST | M_T10 | --- | 0.95 (-0.1,0.1) | 0.14 (-0.1,0.02)
Nut vs. Nut+ST | M | --- | 0.94 (-0.1,0.1) | 0.12 (-0.1,0.03)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.63 | 0.52
Con vs. Nut | Male | 0.65 | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.24 | 0.77
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.52 | ---
Nut vs. Nut+ST | Male | --- | ---
Nut vs. Nut+ST | Female | 0.43 | 0.34
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
Table 13. Robust ANCOVA Results: Fat Weight (kg)
ancmg omnibus: # DPE = 1; M_T20 p = 0.06; Mdn p = 0.13
ancmg pairwise: p
Comparison | M_T20 | Mdn
Con vs. Nut | <0.01 | 0.01
Con vs. Nut+ST | 0.49 | 0.77
Nut vs. Nut+ST | <0.01 | <0.01
ancova: p (CI) at each design point
Comparison | Estimator | DPE 31.7 | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.35 (-2.6,5.9) | 0.15 (-1.5,6.0)
Con vs. Nut | M_T10 | --- | 0.31 (-3.7,9.2) | 0.19 (-1.8,6.6)
Con vs. Nut | M | --- | 0.73 (-4.7,6.2) | 0.40 (-3.7,7.8)
Con vs. Nut+ST | M_T20 | --- | 0.14 (-1.1,4.6) | 0.14 (-1.2,5.0)
Con vs. Nut+ST | M_T10 | --- | 0.16 (-2.4,9.0) | 0.38 (-2.2,4.7)
Con vs. Nut+ST | M | --- | 0.34 (-2.6,6.1) | 0.39 (-2.9,6.3)
Nut vs. Nut+ST | M_T20 | 0.78 (-4.5,5.5) | 0.93 (-4.0,4.3) | 0.82 (-4.0,3.3)
Nut vs. Nut+ST | M_T10 | 0.77 (-4.8,5.9) | 0.77 (-4.0,5.1) | 0.49 (-5.0,2.7)
Nut vs. Nut+ST | M | 0.48 (-5.1,8.7) | 0.70 (-4.8,6.7) | 0.89 (-6.6,5.9)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.48 | 0.39
Con vs. Nut | Male | 0.84 | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.81 | 0.80
Con vs. Nut+ST | Male | --- | 0.98
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.76 | 0.57
Nut vs. Nut+ST | Male | 0.65 | 0.83
Nut vs. Nut+ST | Female | 0.43 | 0.85
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
Table 14. Robust ANCOVA Results: Lean Tissue Weight (kg)
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE 55.7 | DPE 57.4 | DPM | DPF
Con vs. Nut | M_T20 | 0.99 (-4.3,3.9) | 1.0 (-4.5,4.0) | 0.83 (-4.7,5.6) | 0.94 (-3.0,3.1)
Con vs. Nut | M_T10 | 0.89 (-4.3,3.9) | 0.88 (-4.5,4.0) | 0.97 (-4.4,4.5) | 0.85 (-3.3,3.9)
Con vs. Nut | M | 0.83 (-4.2,3.6) | 0.82 (-4.4,3.7) | 0.96 (-4.3,4.5) | 0.84 (-2.8,3.4)
Con vs. Nut+ST | M_T20 | --- | --- | 0.69 (-3.7,5.0) | 0.04 (-5.5,0.3)
Con vs. Nut+ST | M_T10 | --- | --- | 0.54 (-2.6,4.4) | 0.16 (-5.3,1.4)
Con vs. Nut+ST | M | --- | --- | 0.43 (-2.5,5.0) | 0.12 (-4.8,0.9)
Nut vs. Nut+ST | M_T20 | --- | --- | 0.91 (-4.2,4.6) | 0.03 (-5.4,0.0)
Nut vs. Nut+ST | M_T10 | --- | --- | 0.64 (-3.3,4.9) | 0.06 (-4.9,0.5)
Nut vs. Nut+ST | M | --- | --- | 0.54 (-3.2,5.5) | 0.05 (-4.7,0.3)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.62 | 0.58
Con vs. Nut | Male | --- | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.76 | 0.27
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.60 | 0.17
Nut vs. Nut+ST | Male | --- | ---
Nut vs. Nut+ST | Female | --- | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
for either trimmed means or medians, showed group differences. For lean tissue
weight, very few design points could be determined, so none of the omnibus
models ran. For the ancova models, there were two points of comparison
between Con and Nut; neither was significant (Table 14). When design points
were specified, there was a difference between Con and Nut+ST at girls M_T20, as
well as between Nut and Nut+ST at M_T20 and M, and a trend at M_T10. None of the
depths were significant.
Insulin & Glucose Dynamics
The main outcome variables for glucose and insulin dynamics are insulin
sensitivity (SI), acute insulin response (AIR), and the disposition index (DI)
(Tables 15 – 17). Omnibus tests of trimmed means and medians for SI showed
one point of significant difference when the percentile bootstrapping was used
(Table 15). Pairwise comparisons of trimmed means and medians indicated a
difference at the one design point estimated between the Nut and Nut+ST groups,
whereby the Nut group improved and the Nut+ST group declined, which is the
opposite of what was hypothesized. Only one design point was possible for the
ancova function for the Con vs. Nut comparison when estimated by the program,
and the result differed depending on which level of trimming was used: p = 0.08
when M_T20 was used, p = 0.03 when M_T10 was used, and p = 0.18 when M was
used. The Nut group improved while the Con group declined in SI. When
design points were specified, there was a difference between Con and Nut at
M_T10 and trends for M_T20. Additionally, there was a difference between Nut and
Nut+ST at boys M_T10, and a trend at that level for girls. None of the M comparisons
were significant. Comparison of depths for SI for trimmed means and medians
showed no effects.
Omnibus tests of trimmed means and medians for AIR could not select
design points. Only one design point was possible for the ancova function for the
Nut vs. Nut+ST comparison when estimated by the program, which showed no
between-group difference (Table 16). When design points were specified, there
were no differences between any of the groups at any of the points. Likewise,
there were no differences in regression depths across groups in trimmed means
or medians.
For DI, there were 14 design points that could be estimated by the running
interval smoother in the omnibus tests (Table 17). A few indicated differences
across groups, though in pairwise comparisons, there was only one point that
stood out, which was different between the Nut and Nut+ST groups in both the
trimmed means and medians. In the ancova model, there were three points that
were compared though there were no differences between any of the groups.
When design points were specified, there was a trend between Con and Nut+ST
at boys M, but none of the other comparisons indicated a difference between any
of the groups. Likewise, there were no differences in regression depths across
groups in trimmed means or medians.
Effect sizes
Robust effect sizes are reported in Appendix E. Examination of these
effects shows that many of the differences between groups could be deemed
moderate to large, even when the robust ANCOVA results were not significant.
Table 15. Robust ANCOVA Results: Insulin Sensitivity (SI)
ancmg omnibus: # DPE = 1; M_T20 p = 0.16; Mdn p = 0.23
ancmg pairwise: p
Comparison | M_T20 | Mdn
Con vs. Nut | 0.60 | 0.93
Con vs. Nut+ST | 0.13 | 0.22
Nut vs. Nut+ST | 0.01 | 0.05
ancova: p (CI) at each design point
Comparison | Estimator | DPE 1.66 | DPM | DPF
Con vs. Nut | M_T20 | 0.08 (-1.03,0.22) | 0.08 (-1.0,0.1) | 0.06 (-1.1,0.1)
Con vs. Nut | M_T10 | 0.03 (-0.97,0.10) | 0.03 (-0.93,0.03) | 0.03 (-0.93,0.03)
Con vs. Nut | M | 0.48 (-1.21,0.72) | 0.48 (-1.07,0.58) | 0.48 (-1.07,0.58)
Con vs. Nut+ST | M_T20 | --- | 0.72 (-0.8,1.0) | 0.72 (-0.8,1.0)
Con vs. Nut+ST | M_T10 | --- | 0.89 (-0.50,0.56) | 0.96 (-0.59,0.56)
Con vs. Nut+ST | M | --- | 0.70 (-0.72,1.01) | 0.70 (-0.72,1.01)
Nut vs. Nut+ST | M_T20 | --- | 0.15 (-0.4,1.5) | 0.11 (-0.3,1.5)
Nut vs. Nut+ST | M_T10 | --- | 0.05 (-0.07,1.03) | 0.08 (-0.15,1.03)
Nut vs. Nut+ST | M | --- | 0.15 (-0.24,1.01) | 0.15 (-0.24,1.01)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.21 | 0.85
Con vs. Nut | Male | 0.70 | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.86 | 0.98
Con vs. Nut+ST | Male | 0.47 | 0.83
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.81 | 0.19
Nut vs. Nut+ST | Male | 0.91 | 0.97
Nut vs. Nut+ST | Female | --- | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
Table 16. Robust ANCOVA Results: Acute Insulin Response
ancmg: # DPE = 0
ancova: p (CI) at each design point
Comparison | Estimator | DPE 1148.3 | DPM | DPF
Con vs. Nut | M_T20 | --- | 0.60 (-354,534) | 0.54 (-359,588)
Con vs. Nut | M_T10 | --- | 0.19 (-291,985) | 0.60 (-352,545)
Con vs. Nut | M | --- | 0.30 (-630,1409) | 0.26 (-303,831)
Con vs. Nut+ST | M_T20 | --- | 0.81 (-419,508) | 0.60 (-379,582)
Con vs. Nut+ST | M_T10 | --- | 0.38 (-425,918) | 0.95 (-538,565)
Con vs. Nut+ST | M | --- | 0.75 (-935,1211) | 0.95 (-732,772)
Nut vs. Nut+ST | M_T20 | 0.78 (-450,371) | 0.74 (-388,295) | 0.93 (-374,348)
Nut vs. Nut+ST | M_T10 | 0.67 (-635,470) | 0.57 (-525,325) | 0.67 (-550,385)
Nut vs. Nut+ST | M | 0.41 (-1021,561) | 0.36 (-914,410) | 0.35 (-864,376)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.74 | 0.99
Con vs. Nut | Male | 0.58 | ---
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 0.38 | 0.81
Con vs. Nut+ST | Male | 0.25 | 0.09
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.68 | 0.82
Nut vs. Nut+ST | Male | 0.34 | 0.23
Nut vs. Nut+ST | Female | 0.90 | ---
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
Table 17. Robust ANCOVA Results: Disposition Index
ancmg omnibus: # DPE = 14; M_T20 p = 0.44; Mdn p = 0.75
ancmg: p at each of the 14 design points, listed in table order
Test | Comparison | Estimator | p
Omnibus | All groups | M_T20 | 0.44, 0.89, 0.07, 0.95, 0.06, 0.95, 0.44, 0.44, 0.44, 0.07, 0.95, 0.44, 0.03, 0.44
Omnibus | All groups | Mdn | 0.75, 0.99, 0.33, 0.97, 0.15, 0.97, 0.75, 0.75, 0.75, 0.33, 0.97, 0.75, 0.01, 0.75
Pairwise | Con vs. Nut | M_T20 | 0.17, 0.91, 0.84, 0.99, 0.84, 0.99, 0.17, 0.17, 0.17, 0.84, 0.99, 0.17, 0.61, 0.17
Pairwise | Con vs. Nut | Mdn | 0.49, 0.83, 0.58, 0.84, 0.58, 0.84, 0.49, 0.49, 0.49, 0.58, 0.84, 0.49, 0.48, 0.49
Pairwise | Con vs. Nut+ST | M_T20 | 0.12, 0.87, 0.22, 0.87, 0.24, 0.87, 0.12, 0.12, 0.12, 0.22, 0.87, 0.12, 0.13, 0.12
Pairwise | Con vs. Nut+ST | Mdn | 0.37, 0.93, 0.59, 0.89, 0.50, 0.89, 0.37, 0.37, 0.37, 0.59, 0.89, 0.37, 0.34, 0.56
Pairwise | Nut vs. Nut+ST | M_T20 | 0.70, 0.73, 0.10, 0.91, 0.09, 0.91, 0.70, 0.70, 0.70, 0.10, 0.91, 0.70, <0.01, 0.70
Pairwise | Nut vs. Nut+ST | Mdn | 0.56, 0.82, 0.12, 0.68, 0.12, 0.68, 0.56, 0.56, 0.56, 0.12, 0.58, 0.56, <0.01, 0.56
ancova: p (CI) at each design point
Comparison | Estimator | DPE 1108.7 | DPE 1175.0 | DPE 1190.6 | DPM | DPF
Con vs. Nut | M_T20 | --- | --- | --- | 0.57 (-1556,983) | 0.63 (-1378,931)
Con vs. Nut | M_T10 | --- | --- | --- | 0.34 (-1491,644) | 0.34 (-1491,644)
Con vs. Nut | M | --- | --- | --- | 0.24 (-1593,548) | 0.22 (-1680,533)
Con vs. Nut+ST | M_T20 | --- | --- | --- | 0.30 (-1752,724) | 0.35 (-1573,724)
Con vs. Nut+ST | M_T10 | --- | --- | --- | 0.15 (-1721,415) | 0.19 (-1709,496)
Con vs. Nut+ST | M | --- | --- | --- | 0.07 (-1960,237) | 0.15 (-1763,413)
Nut vs. Nut+ST | M_T20 | 0.72 (-1001,778) | 0.72 (-1001,778) | 0.67 (-1020,752) | 0.53 (-1111,657) | 0.60 (-1149,746)
Nut vs. Nut+ST | M_T10 | 0.73 (-916,715) | 0.73 (-916,715) | 0.69 (-931,698) | 0.50 (-1035,576) | 0.61 (-1046,681)
Nut vs. Nut+ST | M | 0.90 (-917,838) | 0.90 (-917,838) | 0.87 (-930,824) | 0.28 (-1067,389) | 0.79 (-995,791)
ancsm and qancsm: p
Comparison | Sample | ancsm | qancsm
Con vs. Nut | Total | 0.28 | 0.82
Con vs. Nut | Male | 1.0 | 0.23
Con vs. Nut | Female | --- | ---
Con vs. Nut+ST | Total | 1.0 | 0.50
Con vs. Nut+ST | Male | --- | ---
Con vs. Nut+ST | Female | --- | ---
Nut vs. Nut+ST | Total | 0.72 | 0.65
Nut vs. Nut+ST | Male | 0.76 | 0.26
Nut vs. Nut+ST | Female | 0.89 | 0.48
--- no estimation; DPE = design point estimated empirically; DPM/DPF = design points male/female; CIs are adjusted for multiple comparisons
Strength and dietary intake variables showed effects in the moderate to large
range. Motivation, especially for eating fruits and vegetables, and body
composition effects were large, even though this was not necessarily reflected in
the results of the ANCOVA models. The glucose and insulin dynamics effects
were moderate for AIR and DI, but large for SI. These effect sizes between the
groups highlight the fact that, while there are many benefits of using robust
statistics, they are not a panacea for small sample size, and that even robust
statistics need to be put into context. These effect sizes have the benefit of being
estimated for every pairwise trimmed mean comparison, though the estimates
are made at only one point for each comparison. Adding this descriptive element
to the examination of group differences provides information that the statistics do
not.
Comparison of Results of different methods
Was this technique effective in achieving a statistical result?
The most successful methods, in terms of obtaining a statistic, were
analyses of depths (ancsm and qancsm), and ancova models when design
points were specified. Given the sample sizes under consideration, this is a
positive finding. The SANO intervention was intense and expensive to conduct,
as are many medical studies, which severely limits sample sizes. Thus, to be
able to obtain significant results with such a small sample is a great strength of
the robust ANCOVA and demonstrates the increased power of these models
over traditional ANCOVA. Additionally, all findings that were significant with
ANCOVA were likewise significant with robust methods, so no possible
significant findings were missed. An interesting point to note is that the mean
most often had the smallest SE: 42% of the smallest SEs (see Appendix D) were
for the mean, followed by the 20% trimmed mean (36%). However, it should be
noted that the standard errors were often very close across the different
estimators. Also, when there were clear outliers, the trimmed mean, especially
M_T20, performed the best. The standard error of the median was never the
smallest.
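The comparison of standard errors can be illustrated with a short Python sketch. It uses the usual Winsorized-variance formula for the standard error of a trimmed mean, s_w / ((1 - 2γ)√n), where γ is the trimming proportion and s_w is the standard deviation of the Winsorized sample. Whether this is exactly how the SEs in Appendix D were computed is an assumption; the function names and the toy data are illustrative only.

```python
import math
import statistics


def trimmed_mean(values, trim=0.2):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    xs = sorted(values)
    g = int(trim * len(xs))
    return statistics.fmean(xs[g:len(xs) - g])


def winsorize(values, trim=0.2):
    """Pull the g smallest and g largest values in to the nearest retained value."""
    xs = sorted(values)
    n = len(xs)
    g = int(trim * n)
    low, high = xs[g], xs[n - g - 1]
    return [min(max(x, low), high) for x in xs]


def trimmed_mean_se(values, trim=0.2):
    """Standard error of the trimmed mean based on the Winsorized SD."""
    n = len(values)
    s_w = statistics.stdev(winsorize(values, trim))
    return s_w / ((1 - 2 * trim) * math.sqrt(n))


def mean_se(values):
    """Ordinary standard error of the mean."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Toy data with one clear outlier (hypothetical values, not SANO data).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(mean_se(scores), trimmed_mean_se(scores))
```

With a single extreme value the trimmed mean's standard error is much smaller than that of the mean, which matches the pattern described above for variables with clear outliers; without outliers the two are typically close.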
Was this method easy to use?
All of these methods required similar amounts of programming as found in
typical syntax files. For a lay user, however, this may be more difficult than
the graphical user interfaces in many popular programs. On the other hand, these
codes offer much flexibility in the modeling, which is a great benefit.
Would the explanation of these statistics be acceptable to a medical audience?
(I.e., could you publish these results in a medical journal?)
While these results showed many intervention effects that were missed
with traditional ANCOVA, these models still may not be very acceptable to a
mainstream medical audience. Given that most of the reviewers and readers of
these journals have limited knowledge and appreciation of more modern
methods, it is likely that these results will be looked upon with some suspicion.
However, the depth of the results, and the possibility of finding significance
where none was found before may be just the lure that this audience would need
to start to investigate robust methods, including those used here. That, coupled
with a strong theory and explanation of results would be a good first step to
convincing this audience to expand their statistical horizons.
Summary of Robust ANCOVA Approach
Due to the requirement of fewer assumptions, the robust ANCOVA
approach had the power to demonstrate treatment effects not found with
traditional ANCOVA. The significant increases in the strength variables in the
Nut+ST group, and the decrease in total energy in the Nut group found with
traditional ANCOVA were seen also with the robust ANCOVA methods. This
demonstrates one of the strengths of using robust statistics: significant findings
were not lost. In addition, several other intervention effects that were not
found with traditional ANCOVA were significant with the robust methods. These
robust alternatives highlight the improvement in the Nut group, though not the
Nut+ST group, in the dietary variables, as well as a decrease in amotivation for
eating fruits and vegetables. And while the Nut group lost more fat than the Con
group, the Nut+ST participants gained more lean tissue than the Nut group. Thus,
depending on whether it is lean tissue or decreases in fat that are the goal,
recommendations as to which intervention is more effective can be made for
future research. Finally, there were improvements in SI in the Nut group, as
compared with the Nut+ST group, though the Nut+ST group showed improvements in
DI. These findings in glucose and insulin variables found with the robust ANCOVA
approaches present the SANO intervention in a more positive light than with
traditional ANCOVA. Examination of these results demonstrates that traditional
ANCOVA was not the optimal statistical choice for the analysis of these data,
and that its use masked significant findings.
Chapter 6. Approach #2: Latent Profile Analysis
The methods in the previous section assume that all relevant variables are
measured, collected, and used in the analyses, and that any lack of significant
findings is either due to no intervention effect or statistical methods that are
lacking. In this second approach to testing treatment effects, this assumption is
explored by testing whether there are unmeasured classes of participants in this
sample, representing true heterogeneity in the population that influences the
effectiveness of the SANO intervention. That is, this approach tests whether the
sample is made up of a mixture of populations that have systematic differences
in measured variables, and if they are related to the intervention. Hence, these
models are known as mixture models (McLachlan & Peel, 2000). This type of
model is similar to a multiple group analysis, with the exception that the
population heterogeneity is unmeasured, and is reflected in the pattern of
responses across variables (Lubke & Muthen, 2005; Muthen, 2002). Here, a new
model was developed and tested as a means to assess intervention effect in a
randomized clinical trial.
As reported above, examination of the raw data plots for the key variables
in the SANO data showed considerable variability in response to intervention in
each of the groups. Even the Control group showed improvement in some tests,
which was not expected at all. Certain individuals made positive changes
independent of intervention, while others did not respond much to an intensive
intervention given every opportunity. This second approach will explore whether
the change in dependent variables is due to characteristics of the participants
that can be described by their pre-test and change after 16 weeks by conducting
a latent profile analysis (LPA) on these scores. LPA is a form of structural
equation modeling, a flexible and powerful modeling technique (Muthen &
Curran, 1997). If there are unmeasured latent classes, they would be evidenced
by patterns of responses across key outcome variables. While one variable alone
may not be telling, LPA uses the synergy of examining the pattern across several
variables to distinguish if patterns are evident that create a profile of different
types, or classes, of participants.
Research Questions:
(1) Are there different latent classes of participants reflected in pre-test
and change scores? (2) Do these classes reflect the intervention or are they
independent of intervention? (3) What are the characteristics of these classes?
Chapter 7. Latent Profile Analyses Methods
There are many different types of mixture (categorical latent variable)
models, though mixture models are not commonly used in randomized controlled
trials. Though Muthen and Asparouhov (2008) highlight the potential for using
mixture models in clinical trials, a literature search for examples of latent class
analyses in randomized trials resulted in only a handful of articles. Most of these
reported performing latent class modeling at baseline and then used these
groupings as effect modifiers of intervention (e.g., Haro et al., 2006). Alternately,
latent classes were determined for an endpoint, and repeated measures before
that are related to that distal outcome (e.g., Ventura, Loken, & Birch, 2006).
Another approach was to perform latent class growth analysis (LCGA) in which
growth curves are estimated from multiple observations of the outcome variable
and classes are identified based on these growth parameters (e.g., Lennon,
McAllister, Kuang, & Herman, 2005). Segawa, Yanhong, Li, Flay, and Aya (2005)
used this technique to examine classes of growth curves in an intervention aimed
at preventing homelessness in men with mental illness. In this study, participants
were randomized into a control and intervention group, and then followed for 18
months and measured for how often they were homeless, and the time period
they were homeless, during this period. For this analysis, the LCGA models were
stratified by groups, so that the classes were described separately for controls
and intervention. This stratification of analyses does not allow for the possibility
that the same classes were observed in both groups. Muthen et al. (2002)
present an intervention trial in which children were randomized at the classroom
level into either a control group or an aggression prevention intervention in
first grade and followed until they were 18 years old. Other researchers have
used latent classes to explore compliance in randomized trials (e.g., Dunn,
Maracy, & Tomenson, 2005; Jo, 2002).
Development of the Hybrid LPA Model
While all of these approaches are interesting, none really fit the SANO
study. Therefore, a new model was developed that could answer the questions
specific to this study. One of the main strengths of using structural equation
models is the flexibility to combine and modify models (Muthen, Collins, & Sayer,
2001; Muthen, Hancock, & Samuelsen, 2007) to test unique and new
hypotheses. For this study, a hybrid of two more commonly used mixture models
will be used. These include the LCGA described above and latent profile analysis
(LPA).
Figure 3. Latent Profile Analysis (LPA)
LPA is the form of latent class analyses used with continuous variables. It
is a variation of general mixture modeling in which classes of individuals are
inferred from patterns of relationships among different observed dependent
variables. As can be seen in Figure 3, this model includes a categorical latent
variable C that categorizes participants into classes based on the values of
several observed continuous variables ($Y_1$, $Y_2$, $Y_3$). Similar to factor or cluster
analysis, the latent class provides membership classification for individuals. In a
general sense, LPA is performed by creating a measurement model that
describes the relationship between categorical latent variables and dependent
variables where the dependent variables are interval level data. This can be
expressed as a joint density of the latent class indicators, $f(y)$, which is a
mixture of class-specific densities (Equation 11).
Equation 11
$$f(y) = \sum_{x=1}^{C} P(x)\, f(y \mid \mu_x, \Sigma_x)$$
Each of these latent classes, x, has its own mean vector $\mu_x$ and covariance
matrix $\Sigma_x$. There is the option of allowing these means and covariance matrices
to vary across groups, or be constrained to be invariant. $P(x)$ is the proportion of
individuals in each of the classes. This model assumes that within each class,
the variables are independent, that is, the class describes the relationship
between them (Lubke & Neale, 2006). Relationships are described as regression
equations specific to the type of dependent variables. In the case of this study,
dependent variables are continuous, so the latent class indicators will be a set of
linear regression equations. There is an assumption of multivariate normality of
the observed variables (Y) conditional on class (Bauer & Curran, 2003; Lubke &
Neale, 2006; Muthen, 2002). This means that the distribution of the mixture (i.e.,
the total sample) may be non-normal, as seen in SANO.
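To make Equation 11 concrete, the following minimal sketch evaluates the mixture density for a single observation under the within-class independence assumption described above, and then applies Bayes' rule to obtain posterior class probabilities. The function names and all numeric values (class proportions, means, variances, and the observation) are hypothetical illustration values, not SANO estimates.

```python
import math

def normal_pdf(y, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(y - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def class_density(y, means, variances):
    """Class-specific density with a diagonal covariance matrix,
    i.e. the indicators are independent within a class."""
    dens = 1.0
    for y_j, m_j, v_j in zip(y, means, variances):
        dens *= normal_pdf(y_j, m_j, v_j)
    return dens

def mixture_density(y, proportions, class_means, class_variances):
    """Equation 11: sum over classes of P(x) times the class-specific density."""
    return sum(p * class_density(y, m, v)
               for p, m, v in zip(proportions, class_means, class_variances))

def posterior_probabilities(y, proportions, class_means, class_variances):
    """Posterior probability of each class given y (Bayes' rule)."""
    joint = [p * class_density(y, m, v)
             for p, m, v in zip(proportions, class_means, class_variances)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class example with three indicators (Y1, Y2, Y3).
proportions = [0.6, 0.4]
class_means = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
class_variances = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

y = [1.8, 2.1, 1.9]
density = mixture_density(y, proportions, class_means, class_variances)
post = posterior_probabilities(y, proportions, class_means, class_variances)
```

Because the invented observation lies close to the second class mean, the second posterior probability dominates even though that class has the smaller proportion.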
While latent class growth analysis (LCGA) also includes a latent class
variable, it is a mixture model in which several measurements of one dependent
variable are modeled in such a way that the growth curve parameters (intercept
and slope) are estimated as latent variables. Then, a class variable C is included
to classify different growth trajectories based on these latent growth parameters.
This is shown in Figure 4.
Figure 4. Latent Class Growth Analysis (LCGA)
Thus, the class indicators in Equation 11 are estimated by the observations.
There is both an intercept (starting value) and a slope (rate of change). There is
no assumption that change is linear, as the slope from each time to the next is
estimated. Thus an individual's score at a given time (t) is estimated as a function
of their individual change parameters, intercept ($\eta_{0i}$) and slope ($\eta_{1i}$), as shown in
Equation 12.
Equation 12
$$y_{ti} = \eta_{0i} + \eta_{1i} T_t + \varepsilon_{ti}$$
These change parameters are estimated by the group's intercept and slope ($\alpha_0$
and $\alpha_1$, respectively) as well as the influence of the class of the individual ($\gamma_0$ and
$\gamma_1$), as shown in Equations 13 (intercept) and 14 (slope).
Equation 13
$$\eta_{0i} = \alpha_0 + \gamma_0 x_i + \zeta_{0i}$$
Equation 14
$$\eta_{1i} = \alpha_1 + \gamma_1 x_i + \zeta_{1i}$$
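The structure of Equations 12 to 14 can be sketched in a few lines: an individual's intercept and slope are each the sum of a group-level value, a class effect, and a residual, and the score at a given time is the intercept plus the slope times the time score plus a residual. The function and argument names below are descriptive stand-ins chosen for this sketch, the class indicator is assumed to be coded 0/1, and all numbers are hypothetical.

```python
def individual_growth_parameters(class_indicator,
                                 group_intercept, class_effect_intercept,
                                 group_slope, class_effect_slope,
                                 intercept_residual=0.0, slope_residual=0.0):
    """Equations 13 and 14: individual intercept and slope as a function of
    the group-level parameters, the class effect, and a residual."""
    intercept_i = (group_intercept
                   + class_effect_intercept * class_indicator
                   + intercept_residual)
    slope_i = (group_slope
               + class_effect_slope * class_indicator
               + slope_residual)
    return intercept_i, slope_i

def score_at_time(time_score, intercept_i, slope_i, residual=0.0):
    """Equation 12: score at a given time from the individual's parameters."""
    return intercept_i + slope_i * time_score + residual

# Hypothetical values: a participant in the class coded 1, residuals set to zero.
intercept_i, slope_i = individual_growth_parameters(
    class_indicator=1,
    group_intercept=10.0, class_effect_intercept=5.0,
    group_slope=1.0, class_effect_slope=2.0)
trajectory = [score_at_time(t, intercept_i, slope_i) for t in (0, 1, 2)]
```

With residuals fixed at zero the sketch returns the model-implied trajectory for a class rather than a simulated individual.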
Figure 5. Hybrid model
Because the SANO study only has two measurement points, pre-test and
post-test, it is not possible to estimate classic growth curves with latent growth
parameters as in LCGA. However, if the pre-test and change scores are
substituted for intercept and slope, respectively, then this model would be an
abbreviated form of a LCGA. Further, many of the variables of interest in SANO
do not stand alone, but reflect different aspects of complex biological and
psychological phenomena represented as constructs in Figure 1. Therefore, it is
of interest to test whether there are patterns that can be discerned in the
pre-test and change scores across related variables simultaneously. Thus, this
study proposes to use a hybrid model, which is shown in Figure 5. As can be
seen, this model includes observed parameters of change (pre-test and change)
for multiple variables ($Y_1$, $Y_2$, $Y_3$) and then tests whether there exists in the
sample a mixture of participants who have similar patterns. For the purpose of
this study, the groupings of variables will be determined by the variables
representing the constructs in Figure 1. This is done to examine if there are
classes defining each construct in the theoretical model, as well as the practical
reason of limiting the number of variables in each model, given the small sample
size.
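As a minimal sketch of how the observed indicators of the hybrid model could be assembled, the function below turns pre-test and post-test scores into a pre-test value and a change score for each variable in a construct. The function name, the dictionary layout, and the scores are assumptions made for illustration; bench press and leg press are used only because they are the strength variables of this study.

```python
def hybrid_indicators(pre, post):
    """Build the observed indicators for one participant.

    pre, post: dicts mapping variable name -> score at pre-test / post-test.
    Returns a dict with a pre-test value and a change score per variable.
    """
    indicators = {}
    for name in pre:
        indicators[name + "_pre"] = pre[name]
        indicators[name + "_change"] = post[name] - pre[name]
    return indicators

# Hypothetical participant (values invented for illustration only).
pre = {"bench_press": 80.0, "leg_press": 400.0}
post = {"bench_press": 95.0, "leg_press": 430.0}
row = hybrid_indicators(pre, post)
```

Each construct with k variables therefore contributes 2k indicators per participant, which is one reason the number of variables per model is kept small.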
With this hybrid model there exists the possibility of testing the hypothesis
of an intervention effect (Equation 4) because the model shown in Figure 5 can
be extended to examine predictors of class. One such predictor would be
intervention group, which is expected to be related to change ($Y_{1\Delta}$ to $Y_{3\Delta}$). If the
assumption of no group difference in pre-test scores is true, as found in
preliminary analyses, then the only difference across groups would be in change.
And if the intervention is the source of the group differences in change, then the
classes would be described perfectly by the intervention group, as shown in
Figure 6. In this way, with the addition of intervention as a covariate, this model can be
used as an alternate way to test the hypothesis of an intervention effect for multiple
dependent variables simultaneously (similar to MANCOVA).
Figure 6. Class = Treatment
Thus, one of the strengths of this hybrid model is that it allows for the
classes being the intervention itself. If this is the case, the latent classes are not
necessary, indicating that all heterogeneity in the sample can be accounted for. If
there are indeed no differences in pre-test values across intervention group, then
the only difference between the groups would be the change. This difference
would be accounted for in the intervention factors. That is, there would be three
classes, and the class designation for each individual would match the
intervention group to which they were randomized. If, however, the latent classes
are not related to the intervention group, then what determines the classes, and
how the classes responded to the intervention is of interest. In these models,
each class is determined by the profile of the variables in the model. Thus a
profile of the scores for each class will be described so that the classes can be
differentiated. Additionally, how the class relates to other variables, or classes for
different constructs, is of interest. So if intervention group and class do not align,
then what may predict the classes is a question of interest, and will be described.
Model Fitting
The hybrid model will be run for the constructs in the SANO theoretical
model (Figure 1) using the variables in Table 1. This paper will focus on creating
a latent profile model as shown in Figure 5 to explore whether there are
differences across groups (the class C) in pre-test and change as the dependent
variables. In these models each class is determined by the profile of the variables
in the model. Thus, a profile of these variables will be created so that the classes
can be differentiated and described. Additionally, how the classes relate to other
variables, including classes for other constructs, is of interest and will be
explored.
First a 1-class model will be tested. This provides a reference point for the
fit of the multiple class models. If the 1-class model is deemed the best, then
there are no differences across the dependent variables in this sample. This
would indicate no treatment effect. Next a 2-class model will be fit, then a 3-class
model, and so on until fit is no longer improved (as described below) or the model
does not converge. Lack of convergence is a common problem in latent class models,
especially with small sample sizes. To help with this issue, a number of random
starts will be used. A model will not be considered to have converged unless the
best fit, as determined by the highest log likelihood value, is replicated at least
twice, indicating that a global solution has been reached. As all models are
nested with the addition of more classes in each successive model, direct
comparisons between the models are possible. Statistical modeling of the latent
class models, and larger mixture models (with predictors of class), will be
completed in Mplus (Muthen & Muthen, 1998-2007).
The Mplus program allows for flexibility in the type of observed variables
(Muthen & Satorra, 1995), as well as for bootstrapped standard errors and
confidence intervals, which improve the estimation. It utilizes maximum likelihood
estimation (see Goodman, 1974a, 1974b) with the EM algorithm (Dempster,
Laird, & Rubin, 1977) for optimization of the models.
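The random-start strategy described above can be illustrated with a small, self-contained EM routine. This is a sketch under stated assumptions and not the Mplus implementation: it fits a mixture with independent normal indicators within each class, starts each run from randomly chosen observations, and counts how many runs reach the highest log likelihood. The function names, the number of starts, the tolerances, the variance floor, and the simulated two-cluster data are all assumptions made for illustration.

```python
import math
import random

def log_normal_pdf(y, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)

def em_run(data, n_classes, rng, max_iter=200, tol=1e-8, min_var=1e-3):
    """One EM run from a random start. Returns (loglik, proportions, means)."""
    n, d = len(data), len(data[0])
    # Random start: class means are randomly chosen observations.
    means = [list(row) for row in rng.sample(data, n_classes)]
    grand = [sum(row[j] for row in data) / n for j in range(d)]
    pooled = [max(sum((row[j] - grand[j]) ** 2 for row in data) / n, min_var)
              for j in range(d)]
    variances = [list(pooled) for _ in range(n_classes)]
    props = [1.0 / n_classes] * n_classes
    loglik = -math.inf
    for _ in range(max_iter):
        # E-step: posterior class probabilities and current log likelihood.
        resp, new_loglik = [], 0.0
        for row in data:
            logs = [math.log(props[k])
                    + sum(log_normal_pdf(row[j], means[k][j], variances[k][j])
                          for j in range(d))
                    for k in range(n_classes)]
            top = max(logs)
            log_total = top + math.log(sum(math.exp(v - top) for v in logs))
            new_loglik += log_total
            resp.append([math.exp(v - log_total) for v in logs])
        converged = abs(new_loglik - loglik) < tol
        loglik = new_loglik
        if converged:
            break
        # M-step: update proportions, means, and (floored) variances.
        for k in range(n_classes):
            weight = max(sum(r[k] for r in resp), 1e-12)
            props[k] = weight / n
            for j in range(d):
                means[k][j] = sum(r[k] * row[j]
                                  for r, row in zip(resp, data)) / weight
                variances[k][j] = max(
                    sum(r[k] * (row[j] - means[k][j]) ** 2
                        for r, row in zip(resp, data)) / weight, min_var)
    return loglik, props, means

def fit_with_random_starts(data, n_classes, n_starts=20, seed=1, match_tol=1e-4):
    """Run EM from several random starts; report the best run and how many
    runs reached (within match_tol) the highest log likelihood."""
    rng = random.Random(seed)
    runs = [em_run(data, n_classes, rng) for _ in range(n_starts)]
    best = max(runs, key=lambda run: run[0])
    replications = sum(1 for run in runs if abs(run[0] - best[0]) < match_tol)
    return best, replications

# Simulated toy data: two well-separated groups of 30, two indicators each.
data_rng = random.Random(0)
data = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
        + [[data_rng.gauss(10, 1), data_rng.gauss(10, 1)] for _ in range(30)])
best, replications = fit_with_random_starts(data, n_classes=2)
solution_replicated = replications >= 2
```

Here `solution_replicated` encodes one reading of the replication rule, namely that at least two starts arrive at the same highest log likelihood; a stricter rule would only change the threshold.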
Choosing the number of classes is one of the most challenging aspects of
mixture modeling. When two models appear to fit the data equally well,
guidelines for picking a model involve evaluating the parsimony of the model
and the theory driving the analyses. The most parsimonious and yet sufficient
model is deemed the best. However, if the model makes no theoretical sense,
even if it fits well, then it is not necessarily a good model. The classical likelihood
ratio test used to compare nested models in structural equation modeling has
been shown to be a poor measure of fit for latent class models (Aitkin, Anderson,
& Hinde, 1981; Aitkin & Rubin, 1985; Clogg, 1995; Lo, Mendell, & Rubin, 2001;
McLachlan & Peel, 2000). In a Monte Carlo simulation study, Nylund,
Asparouhov, and Muthen (2006) demonstrated that two fit indices were superior
for determining the number of classes (C). These were the Bayesian information
criterion (BIC; Schwarz, 1978) and the bootstrapped likelihood ratio test (BLRT;
McLachlan & Peel, 2000). The BIC can be used to assess the fit of a single
model, and be compared with the BIC of other, nested, models. The best fitting
model has the lowest BIC. The BLRT tests the fit of a model with C classes to
that of a model with C-1 classes. This test yields a p-value to test the difference
between models. To aid in determining the best model, these indices will be
evaluated along with Akaike’s information criterion (AIC; Akaike, 1987; Loehlin,
1998) and entropy (Nagin, 1999; Ramaswamy, DeSarbo, Reibstein, & Robins,
1993). AIC is a measure of misfit in a model, and includes a parsimony
adjustment that favors a simpler model. As it is a measure of misfit, the smallest
AIC reflects the best fitting model. Entropy is a summary measure of how distinct
the classification of participants is into the number of classes in the given model.
Entropy ranges from 0 to 1, with 1 representing a matrix of k x k classes in which
the diagonal is all 1 and the off-diagonals are all 0, indicating perfect separation
of the participants into classes. That is, each participant loads perfectly into only
one class. Finally, to aid in evaluating fit, graphical indicators of fit will also be
used to examine at what number of classes the log likelihood values plateau
(Garrett & Zeger, 2000). The examination of multiple fit indices is especially
important with small samples, such as SANO, because small sample size can
cause instability of fit (Yang, 2006). Because latent class models are known to be
easily subject to local maxima, multiple random starts will be used to make sure
that the highest log likelihood is replicated, indicating that a true solution, and not a
local solution, has been reached (Muthen & Muthen, 1998-2007). Equations for
these fit indices are reported in Appendix D.
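For orientation, the fit indices discussed above can be computed from a model's log likelihood, number of free parameters, and sample size. The sketch below uses the standard textbook definitions of AIC and BIC, the likelihood ratio statistic that the BLRT bootstraps, and the usual relative entropy summary; Appendix D is not reproduced here, so these formulas are an assumption about what it contains. The example values are the 1-class and 2-class strength results reported in Figure 8 (N = 51).

```python
import math

def aic(loglik, n_params):
    """Akaike's information criterion: smaller values indicate less misfit."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: the penalty grows with sample size."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def lr_statistic(loglik_fewer_classes, loglik_more_classes):
    """Likelihood ratio statistic comparing C-1 classes with C classes.
    The BLRT obtains its p-value by bootstrapping this statistic."""
    return 2.0 * (loglik_more_classes - loglik_fewer_classes)

def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means every participant is assigned
    to exactly one class with probability 1."""
    n, k = len(posteriors), len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))

# Values taken from the strength models (Figure 8).
aic_1 = aic(-1126.2, 10)
bic_1 = bic(-1126.2, 10, 51)
lr_2_vs_1 = lr_statistic(-1126.2, -1109.5)

# Hypothetical posterior probability matrices for the entropy summary.
perfect = relative_entropy([[1.0, 0.0], [0.0, 1.0]])
uninformative = relative_entropy([[0.5, 0.5], [0.5, 0.5]])
```

The computed AIC, BIC, and likelihood ratio values agree with the tabled 2272.4, 2291.8, and 33.4 up to the rounding of the reported log likelihoods.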
Additionally, how individuals change classes as the number of classes
increases will be examined to determine if the number of classes is really distinct.
This is similar to examination of factor loadings for eigenvalues greater than one
in factor analysis, where it is possible to have a univariate factor with an
eigenvalue > 1 that does not represent a distinct factor. This will be accomplished by
mapping the classes into which individuals fit in each valid model. Figure 7
shows an example of this mapping with two examples for a 3-class model for 54
participants. The numbers in the boxes represent N’s. On the left there are three
distinct classes, and on the right there are three indistinct classes. A distinct
class is characterized by being composed of different individuals than previous
classes, that is, a lower number of classes may not capture the information
sufficiently. An indistinct class may not lose much information by using a smaller
number of classes, such as the example on the right where the 24 individuals in
class 1 of the 2-class model are also the 24 individuals in the first class of the
3-class model, while the 30 individuals who make up class 2 in the 2-class model
split into classes 2 and 3 of the 3-class model. These classes may or may not
be distinct, and this warrants further exploration.
Figure 7. Example of Class Maps for 1-3 Classes
Left panel: 3 Distinct Classes. Right panel: 3 Indistinct Classes.
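The class mapping described above amounts to a cross-tabulation of each participant's class in one model against their class in the next larger model. A minimal sketch follows; the function names are invented for this example, the 24/30 split comes from the indistinct example in the text, and the 18/12 split of the second class is a hypothetical value because the figure's counts are not available here.

```python
from collections import Counter

def class_map(previous, current):
    """Count how participants move from the classes of one model
    to the classes of the next (larger) model."""
    return Counter(zip(previous, current))

def source_classes(mapping, new_class):
    """Classes of the smaller model that feed a given class of the larger model."""
    return sorted({prev for (prev, cur), n in mapping.items()
                   if cur == new_class and n > 0})

# 54 participants: class labels in the 2-class and 3-class models.
two_class = [1] * 24 + [2] * 30
three_class = [1] * 24 + [2] * 18 + [3] * 12   # 18/12 split is hypothetical
mapping = class_map(two_class, three_class)
feeds_class_3 = source_classes(mapping, 3)
```

When a new class draws all of its members from a single earlier class, as class 3 does here, the added class may simply be subdividing an existing one, which is the situation flagged above as warranting further exploration.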
Once the number of classes is determined for each construct, then the
profile for the classes will be examined. These profiles will look at the distribution
of each variable in the model across classes. These will include measures of
central tendency, spread, and covariation and will allow for the creation of a
profile of the classes. Part of this profile includes what may be associated with
class membership. Of foremost interest is whether the intervention group to
which an individual was randomized predicts the class. Other predictors include
gender and age. To have remained in the study, there were cutoffs for the
number of weeks of the intervention attended, so evaluable participants had
similar total participation rates. This makes the number of make-up classes taken
important. It is possible that many participants' compliance was driven by the
study staff, and may reflect different interest and commitment to the program that
may impact the effectiveness of the study.
Finally, it is of interest to see how the classes of the different constructs
relate to each other. Nagin and Tremblay (2001) did something similar where
they created LCGA models for related constructs and then examined the
probability of overlap in the construct classes. The overlap in classes will be
compared by chi-square tests for nominal or ordinal variables (e.g. intervention
group, gender) and ANOVA or robust alternative tests for continuous variables
(e.g. age).
Chapter 8. Latent Profile Model Results
Strength
Hybrid model
The two strength variables, bench press and leg press, were tested for a
2-, 3-, and 4-class model. There was complete strength data for 51 participants.
The 5-class model would not converge so modeling was halted. The fit indices
indicate that the 3-class model fit the best (Figure 8); however, the third class
included only two participants. Additionally, as the number of classes increased,
most participants tended to remain together in two groups, so the 2-class model
was chosen as the best model for the strength variables. Class 1 had 31 participants
(60.8%) and class 2 had 20 participants (39.2%).
Profile
These two classes were not significantly associated with randomization
group ($\chi^2(2) = 4.51$, $p = .11$), though they were significantly associated with gender
($\chi^2(1) = 18.14$, $p < .001$). Class 1 was comprised of primarily females (n=22, 71.0%)
and class 2 was comprised of primarily males (n=18, 90.0%). When stratified by
gender, there was a randomization effect for males ($\chi^2(2) = 6.29$, $p = .043$) but not
for females ($\chi^2(2) = 3.64$, $p = .16$). These class distributions are shown in Figure 9.
Further breakdown of the sample showed that all females in the control and
nutrition education groups were in class 1, as were most of the girls in strength
training. However, all girls in class 2 were in the Nut+ST group. All boys in the
Nut+ST group were in class 2, while most of the control boys and about half of
the nutrition boys were in class 2. This indicates that there was a gender x
intervention group interaction with the strength variables such that boys seemed
likely to improve no matter what the group, but would definitely improve in the
strength training + nutrition group. Girls, on the other hand, only showed strength
improvements if they received strength training.
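The gender-by-class association for the strength classes can be checked by hand with a Pearson chi-square on the 2 x 2 table of counts implied by the text (22 of the 31 participants in class 1 were female and 18 of the 20 participants in class 2 were male). The sketch below is an illustration rather than the software used in the thesis; the function names are invented, no continuity correction is applied, and the p-value helper only covers the closed-form cases of 1 and 2 degrees of freedom.

```python
import math

def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_square_p_value(stat, df):
    """Upper-tail p-value using the closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Rows: class 1, class 2. Columns: females, males.
gender_by_class = [[22, 9], [2, 18]]
stat, df = pearson_chi_square(gender_by_class)
p_value = chi_square_p_value(stat, df)

# p-value for the reported randomization effect among males (chi-square 6.29, df 2).
p_males = chi_square_p_value(6.29, 2)
```

The statistic computed from these counts reproduces the reported value of 18.14 on 1 degree of freedom.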
Figure 8. Fit Indices and Class N's: Strength
Variables: Bench Press, Leg Press
                      Number of classes
Fit Indices           1          2          3          4
N                     51         31/20      27/22/2    30/17/3/1
LnL                   -1126.2    -1109.5    -1095.9    -1086.4
# free parameters     10         15         20         25
AIC                   2272.4     2249.0     2231.7     2222.8
BIC                   2291.8     2278.0     2270.3     2271.1
Bootstrap -2LR Δ      NA         33.4       27.3       18.9
df                    NA         5          5          5
p-value               NA         .000       .000       .113
Entropy               NA         .821       .899       .941
All comparisons of the components of the strength classes were significantly
different between classes (see Figures 10 & 11). Class 1 had lower pre-test bench
press ($p_M$ = .003, $p_{M_{T20}}$ < .001) and leg press ($p_M$ = .004, $p_{M_{T20}}$ < .001), following
class gender distribution. Class 1 increased less than Class 2 on both bench
press ($p_M$ < .001, $p_{M_{T20}}$ < .001) and leg press ($p_M$ < .001, $p_{M_{T20}}$ = .024). Those who
started the strongest (primarily boys) increased the most in raw values. This
remains when looking at proportional increases as well, where class 2's
increases were much higher than class 1's increases (or occasionally decreases).
Examination of interactions of class and intervention group showed a
significant interaction for leg press change ($p_M$ = .016) where class 2 showed very
high increases in the Con and Nut+ST groups, but a decrease in the Nut group
(Figure 12). Trimmed mean comparisons of this 2-way interaction were not significant.
Figure 9. Strength Classes Across Intervention Group (panels A, B, C)
Figure 10. Profile Bench Press
            Pre-Test                Change
Class       1           2           1           2
M           8.2         12.3        6.8         2.0
SD          18.5        33.2        13.8        16.1
p           .003                    <.001
Class       1           2           1           2
$M_{T20}$   79.2        114.2       6.6         19.6
Wvar        153.9       205.0       37.8        49.7
p           <.001                   <.001
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 11. Profile Leg Press
            Pre-Test                Change
Class       1           2           1           2
M           401.5       653.0       13.5        103.8
SD          14.4        17.3        89.7        193.4
p           .004                    <.001
Class       1           2           1           2
$M_{T20}$   394.3       617.5       1.7         94.2
Wvar        8937.7      11768.7     193.7       6502.4
p           <.001                   .024
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 12. Leg Press Change Class x Intervention (M+SE)
Dietary Intake
Hybrid model
There were four dietary variables included in the hybrid models: total
energy (kcal), fiber (g), total sugar (g), and added sugar (g). There was complete
dietary data for 49 participants. Hybrid models with 2, 3, 4, and 5 classes were
fit, though the 5-class model did not converge. There was a monotonic
improvement of fit with the addition of classes, thus the 4-class model was the
best fit to these data (Figure 14). The sample was divided fairly evenly between
classes 1-3 (n = 18 (36.7%), 15 (30.6%), and 11 (22.4%), respectively) with 5
(10.2%) participants in class 4.
Profile
There was not a significant relationship of class with intervention group
($\chi^2(6) = 5.60$, $p = .47$). There was a significant gender difference across classes
($\chi^2(3) = 10.70$, $p = .01$). Class 1 was mostly female (n=14, 77.8%), classes 2 and 4
were primarily male (n=11, 73.3% and n=4, 80.0% respectively). Class 3 was
evenly split (n=6 males (54.5%) and n=5 females (45.5%)). When stratified by
gender, there was a randomization effect for males ($\chi^2(6) = 14.51$, $p = .024$) but
not for females ($\chi^2(6) = 7.07$, $p = .31$). Controls and nutrition education boys were
spread across classes 2, 3, and 4, while strength training + nutrition education
boys were in classes 1 & 2. Only one girl (a control) was in class 4; all others
were spread across classes 1, 2, and 3. There were significant omnibus
effects for class for all variable means and trimmed means in the dietary
intake model.
Figure 13. Fit Indices and Class N's: Dietary Intake
Variables: Total Energy, Total Sugar, Added Sugar, Fiber
                      Number of classes
Fit Indices           1          2          3          4
N                     49         31/18      24/14/11   18/15/11/5
LnL                   -2136.9    -2093.1    -2058.0    -2039.4
# free parameters     20         29         38         47
AIC                   4313.9     4244.1     4192.1     4172.8
BIC                   4351.7     4298.9     4263.9     4261.7
Bootstrap -2LR Δ      NA         87.751     70.505     38.190
df                    NA         9          9          9
p-value               NA         .000       .000       .000
Entropy               NA         .917       .948       .965
Exploration of an interaction of intervention group with dietary intake
classes showed a significant effect for mean change in energy only ($p_M$ = .012).
As seen in Figure 19, there was much more class differentiation in the Con group
than in the Nut or Nut+ST groups. However, the small cell sizes make these
statistics indications of effects, more than hard evidence. These cell size
limitations do not allow for an examination of a 2-way ANCOVA.
Figure 14. Dietary Intake Classes Across Intervention Group (panels A, B, C)
At pre-test class 1 had lower mean energy intake than all other classes,
though for the trimmed means, it was lower than classes 2 and 3, but not class 4.
This group decreased in energy intake, though not as much as class 3, which had
the highest pre-test intake and decreased the most. In fact, every participant
Figure 15. Profile Energy Intake
            Pre-Test                                        Change
Class       1           2           3           4           1           2           3           4
M           1384.9      1933.1      2635.2      2081.6      -263.8      316.4       -98.5       82.9
SD          347.5       43.4        355.7       693.2       407.0       387.6       43.2        768.6
p           <.001                                           <.001
Post-hoc    2,3,4       1,3         1,2         1           2,3,4       1,3         1,2,4       1,3
Class       1           2           3           4           1           2           3           4
$M_{T20}$   1362.1      1941.7      2608.6      2059.4      -269.2      365.9       -903.6      654.3
Wvar        67034.0     107798.8    31957.0     8044.7      77241.4     97008.2     87465.7     116279
p           <.001                                           .002
Pairwise    2,3         1,3         1,2                     2,3,4       1,3         1,2,4       1,3
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 16. Profile Fiber Intake
            Pre-Test                                        Change
Class       1           2           3           4           1           2           3           4
M           12.4        14.7        21.7        14.9        2.8         4.9         -7.7        5.3
SD          5.4         3.9         9.9         7.3         8.6         7.7         6.9         8.8
p           .007                                            .001
Post-hoc    3                       1                       3           3           1,2,4       3
Class       1           2           3           4           1           2           3           4
$M_{T20}$   12.0        14.2        18.3        13.9        2.4         4.6         -7.1        4.8
Wvar        12.8        9.9         4.7         1.3         38.5        2.5         14.7        6.4
p           .020                                            .006
Pairwise    3                       1                       3           3           1,2,4       3
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 17. Profile Total Sugar Intake
            Pre-Test                                        Change
Class       1           2           3           4           1           2           3           4
M           77.05       102.98      167.03      135.73      -19.84      3.46        -83.41      71.19
SD          28.46       31.85       49.55       5.59        24.83       21.94       46.59       46.78
p           <.001                                           <.001
Post-hoc    3,4         3           1,2         1           2,3,4       1,3         1,2,4       1,3
Class       1           2           3           4           1           2           3           4
$M_{T20}$   77.81       101.86      159.99      148.05      -18.85      3.18        -77.88      66.01
Wvar        438.20      575.58      1327.10     622.67      205.06      211.80      708.96      373.6
p           .012                                            <.001
Pairwise    3           3           1,2                     2,3,4       1,3         1,2,4       1,3
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 18. Profile Added Sugar Intake
            Pre-Test                                        Change
Class       1           2           3           4           1           2           3           4
M           47.57       66.66       12.33       103.24      -19.92      16.94       -7.50       54.16
SD          28.11       3.09        55.54       36.35       3.27        32.64       49.79       47.73
p           <.001                                           <.001
Post-hoc    3,4         3           1,2         1           2,3,4       1,3         1,2,4       1,3
Class       1           2           3           4           1           2           3           4
$M_{T20}$   44.17       64.38       115.71      107.25      -17.68      18.91       -71.39      52.94
Wvar        287.76      371.07      1194.58     605.59      382.76      259.99      364.62      1347
p           .017                                            <.001
Pairwise    3           3           1,2,4       3           2,3,4       1,3         1,2,4       1,3
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
in class 3 declined in kcals by a considerable amount (by at least 467 calories
per day). This is largely driven by a decrease in total sugar (Figure 17) and
added sugar (Figure 18). Interestingly, class 3 also declined in fiber (Figure 16),
so while these participants were complying with the message about lowering
sugars, they were not complying with increasing dietary fiber. One explanation
for this may be that this class was made up primarily of Con participants who did not
attend the classes about fiber, but perhaps were aware of low-carbohydrate
messages in the media. However, this is not the case, as this class was
comprised of 8 participants who had received the nutrition class (5 Nut and 3
Nut+ST) and 3 controls. Classes 2 and 4, which were in the middle of the pack for
pre-test energy intake, both increased in energy intake at post-test, largely
attributable to increases in sugar, though these participants also increased in
fiber intake. Class 1 had a mix of increases and decreases in fiber.
Exploration of an interaction of intervention group with dietary intake
classes found a significant effect for mean change in energy only ($p_M$ = .012). As
seen in Figure 19, there was much more class differentiation in the Con group
than in the Nut or Nut+ST groups. While these statistics have limited usefulness
given the limited sample sizes of some of the cells, they are indicative of
differences. These cell limitations do not allow for an examination of a 2-way
ANOVA for the trimmed means.
Figure 19. Energy Intake (kcal) Change Class x Intervention (M+SE)
Motivation
Hybrid model
There were four variables included in the motivation model: Relative
Autonomy Index (RAI) for exercise, and three locus of control factors of eating
fruits and vegetables: autonomous, controlled, and amotivation. There was
complete motivation data available for 49 participants. Hybrid models were fit for
2-, 3-, and 4-class models. The 5-class model would not converge. There was an
early division of the final class (class 4 in the 4-class model) that remained
distinct from the rest of the data, while classes 1 to 3 (in the 4-class model) clung
together. Actually, by the class map, the 3-class model was the most distinct;
however, this was not supported by the fit indices. Taken together, these indices
indicate a 2-class model is the best fit
for these data. Class 1 included 35 participants and class 2 included 14
participants.
Profile
There was no significant relationship between intervention group and
motivation class ($\chi^2(2) = .88$, $p = .64$) or gender ($\chi^2(1) = 2.37$, $p = .12$), as
shown in Figure 20. Unlike many of the other class models, there were few
significant differences in the class components in the motivation classes.
There was a trend for a difference in mean RAI pre-test ($p_M$ = .08) where class
2 started a bit higher (indicating more autonomy for exercising) than class 1,
but this effect disappeared when trimmed means were compared ($p_{M_{T20}}$ = .13).
There was no difference between classes in RAI change ($p_M$ = .17, $p_{M_{T20}}$ = .27;
Figure 22). There was also a trend for the autonomous pre-test factor ($p_M$ = .06,
$p_{M_{T20}}$ = .08), whereby class 1 had higher autonomy for eating fruits and
vegetables than class 2, but again not for change ($p_M$ = .14, $p_{M_{T20}}$ = .11). There
was a significant difference between classes in mean pre-test of the
controlled factor ($p_M$ = .03) but only a trend for trimmed means ($p_{M_{T20}}$ = .07). Class 2
reported being controlled more by others in their eating of fruits and vegetables,
which aligned with the reverse class effect for the autonomous factor. In addition,
those who had the highest controlled factor scores at pre-test (class 2) declined
in this by post-test, which was significant for mean comparison ($p_M$ = .002) and a
trend for trimmed mean comparison ($p_{M_{T20}}$ = .06). The final factor for the
Motivation for Eating Fruits and Vegetables, the amotivation factor, appeared to be
driving the classes. There were highly significant differences of mean and
trimmed mean pre-test and change scores (Figure 25). Interestingly, class 2, who
had the highest pre-test scores (i.e. were the least motivated), declined in
amotivation, meaning they became increasingly motivated, while class 1, who
had lower amotivation at pre-test, tended to increase in amotivation (become less
motivated), an apparent regression to the mean.
Figure 20. Motivation Classes Across Intervention Group (panels A, B)
Figure 21. Fit Indices and Class N's: Motivation
Variables: RAI, Autonomous, Controlled, Amotivation
                      Number of classes
Fit Indices           1          2          3          4
N                     49         35/14      7/29/13    15/15/12/7
LnL                   -767.394   -752.302   -740.23    -726.617
# free parameters     20         29         38         47
AIC                   NA         1562.605   1556.46    1547.23
BIC                   NA         1617.468   1628.35    1636.15
Bootstrap -2LR Δ      NA         30.18      24.15      26.47
df                    NA         9          9          9
p-value               NA         .0000      .3077      .0400
Entropy               NA         .912       .889       .892
Examination of class by intervention interactions showed only a significant
mean interaction for change in amotivation ($p_M$ = .048). Within class 2, which
started the most amotivated, Con participants showed the greatest decreases in
amotivation, followed by the Nut group, and Nut+ST (Figure 28). This may be
explained by the Con group getting excited about their chance at a month of
classes and having a personal trainer after waiting for 16 weeks. However, the
more intense the class, the smaller the decrease in amotivation for the groups
who received the intervention. However, for class 1, who started with lower
amotivation (meaning they were more motivated at the beginning), there was
little increase in amotivation for the Con and Nut+ST groups, but a small increase for
the Nut group. There were trends for the controlled factors at pre-test ($p_M$ = .088,
$p_{M_{T20}}$ = .074). As seen in Figures 18 and 19, at pre-test class 2 had higher
controlled factors in the Con and Nut groups, but were overlapping in the Nut+ST
group.
Figure 22. Profile RAI
            Pre-Test                Change
Class       1           2           1           2
M           7.36        9.85        .30         -1.99
SD          4.37        4.59        5.21        5.20
p           .082                    .170
Class       1           2           1           2
$M_{T20}$   7.06        9.62        .20         -1.42
Wvar        1.86        11.90       7.33        9.80
p           .129                    .270
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 23. Profile Autonomous Factor
            Pre-Test                Change
Class       1           2           1           2
M           4.96        4.10        .18         .83
SD          1.43        1.36        1.28        1.56
p           .059                    .139
Class       1           2           1           2
$M_{T20}$   5.10        4.25        .29         .82
Wvar        .91         .97         .35         .48
p           .083                    .111
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 24. Profile Controlled Factor
            Pre-Test                Change
Class       1           2           1           2
M           2.59        3.53        .20         -.94
SD          1.18        1.60        .99         1.39
p           .029                    .002
Class       1           2           1           2
$M_{T20}$   2.40        3.63        .15         -.94
Wvar        .44         2.29        .26         1.66
p           .068                    .058
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 25. Profile Amotivated Factor
            Pre-Test                Change
Class       1           2           1           2
M           1.95        4.31        .64         -2.19
SD          .77         .87         1.03        1.23
p           <.001                   <.001
Class       1           2           1           2
$M_{T20}$   1.92        4.33        .55         -2.30
Wvar        .45         .49         .33         .76
p           <.001                   <.001
Note: Participants on right side of figure responded the best to the intervention. Significant
post hoc comparisons using Bonferroni's correction are reported. Significant pairwise
comparisons using a critical p-value of .009 are reported. Arrows indicate hypothesized
direction of effects.
Figure 26. Amotivation Pre-test Class x Intervention (M+SE)
Figure 27. Amotivation Change Class x Intervention (M+SE)
Figure 28. Controlled Pre-test Class x Intervention (M+SE)
Figure 29. Controlled Change Class x Intervention (M_T + SE_t20)
Body Composition
Hybrid model
Three variables were included in the Body Composition model: BMI
Z-score, total fat, and total lean tissue. Complete data were available from 53
participants. Hybrid models were run for 2-, 3-, 4-, and 5-class
models. The best log likelihood was not replicated for the 5-class model. Fit
indices indicate that a 4-class model fits the data best. The distinctness of these
classes was confirmed by examining the N's as additional classes were added
(Figure 32). There were 10 participants in class 1, nine in class 2, 12 in class 3,
and 22 in class 4.
Figure 30. Fit Indices and Class N's: Body Composition
Number of classes
Variables Fit Indices 1 2 3 4
N 53 39/14 11/32/10 10/9/12/22
LnL -763.9 -732.1 -709.5 -695.9
# free parameters 15 22 29 36
AIC 1557.9 1508.3 1477.0 1463.9
BIC 1587.4 1551.6 1534.2 1534.8
Bootstrap -2LR! NA 63.6 45.3 27.2
df NA 7 7 7
p-value NA <.0001 <.0001 <.0001
BMI
Total Body Fat (kg)
Total Lean Tissue Mass (kg)
Entropy NA .941 .938 .961
Profile
There was not a significant relationship of class with intervention group
(χ²(6) = 2.78, p = .84) or a significant gender difference across classes
(χ²(3) = .86, p = .84), as shown in Figure 31. Given the small number of
participants who changed much in body composition, the classes are primarily
defined by pre-test level (see Figures 33 - 35). For BMI Z-score there was a
significant omnibus effect of class on pre-test level (p_M < .001, p_MT20 < .001).
All the groups were statistically different from each other with post-hoc (M) or
pairwise comparisons (M_T20). In order from lowest to highest pre-test BMI Z-score
were class 2, class 3, class 4, and class 1. There were no differences across
classes in change in BMI Z-scores. There was also a significant class difference in
fat weight pre-test scores (p_M < .001, p_MT20 < .001), following the same pattern
seen in BMI Z-score. There was a significant difference in change in mean fat
weight (p_M = .020); post-hoc comparisons showed this difference to be between
classes 1 and 3. Class 1 showed the most decline in fat weight and class 3 was the
only group with a mean increase in fat weight. However, when trimmed means were
compared, there was no class difference in change in fat weight (p_MT20 = .12).
Lean tissue weight followed the same pattern as fat weight and BMI Z-score, with
class 2 having the lowest lean tissue weight, followed by class 3, class 4, and
class 1. This indicates that there was not a vast difference in body composition
across classes, and that the class differences are possibly largely due to weight.
There were significant differences in pre-test lean weight across classes
(p_M < .001, p_MT20 = .029), though differences between the various classes were
largely driven by class 1. There was a difference in mean change in lean tissue
(p_M = .015), with post-hoc comparisons showing a significant difference between
class 1, who increased most in lean tissue, and class 3, who decreased in lean
tissue. There was no significant difference in change in lean tissue when trimmed
means were examined (p_MT20 = .166; see Figure 35).
Figure 31. Body Composition Classes Across Intervention Group (panels A and B)
Exploration into an interaction of class with intervention group showed a
significant effect for mean pre-test fat weight (p_M = .025) only. This appears
to be driven by class 1, which was higher in the Con group than the other groups
(Figure 37). The other three classes were fairly stable across intervention group.
The BMI Z-score interaction was significant for trimmed means (p_MT20 = .003), where
the level differences varied a bit across intervention groups (Figure 36).
Figure 32. Fit Indices and Class N's: Body Composition
Number of classes
Variables Fit Indices 1 2 3 4
N 53 39/14 11/32/10 10/9/12/22
LnL -763.9 -732.1 -709.5 -695.9
# free parameters 15 22 29 36
AIC 1557.9 1508.3 1477.0 1463.8
BIC 1587.4 1551.6 1534.2 1534.7
Bootstrap -2LR! NA 63.573 45.261 27.181
df NA 7 7 7
p-value NA <.0001 <.0001 <.0001
BMI
Total Body Fat (kg)
Total Lean Tissue Mass (kg)
Entropy NA .941 .938 .961
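The fit indices in this table can be checked against the reported log likelihoods. The sketch below assumes the standard definitions AIC = -2LnL + 2k and BIC = -2LnL + k ln(N), with N = 53 participants, and takes the bootstrap statistic to be the usual -2 times the log likelihood ratio between adjacent models; the function and variable names are illustrative and are not taken from the original analysis code. Small discrepancies from the table (for example 1557.8 versus 1557.9) would be expected because the reported LnL values are rounded.

```python
import math

def fit_indices(log_lik, n_params, n_obs):
    """Return (AIC, BIC) using the standard definitions."""
    aic = -2 * log_lik + 2 * n_params
    bic = -2 * log_lik + n_params * math.log(n_obs)
    return aic, bic

# Values as reported for the 1- to 4-class body composition models.
log_liks = [-763.9, -732.1, -709.5, -695.9]
n_params = [15, 22, 29, 36]
N_OBS = 53  # participants with complete body composition data

results = [fit_indices(ll, k, N_OBS) for ll, k in zip(log_liks, n_params)]

# -2 * log likelihood ratio between each model and the one with one fewer class.
lr_stats = [2 * (log_liks[i] - log_liks[i - 1]) for i in range(1, len(log_liks))]

for n_classes, (aic, bic) in enumerate(results, start=1):
    print(f"{n_classes}-class: AIC = {aic:.1f}, BIC = {bic:.1f}")
print("-2LR:", [round(s, 1) for s in lr_stats])
```

Under these assumptions the 4-class model works out to an AIC of about 1463.8 and a BIC of about 1534.7, and the 3- versus 4-class comparison gives a -2LR of about 27.2.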
Figure 33. Profile BMI Z-score
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 2.813 1.161 1.773 2.292 -.034 -.013 .010 .01
SD .181 .250 .164 .157 .099 .198 .082 .11
p <.001 .358
Post-hoc 2,3,4 1,3,4 1,2,4 1,2,3
Class 1 2 3 4 1 2 3 4
M_T20 2.810 1.191 1.785 2.291 -.010 -.028 .019 .020
Wvar .016 .007 .018 .010 .001 .014 .003 .002
P <.001 .486
Pairwise 2,3,4 1,3,4 1,2,4 1,2,3
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 34. Profile Fat Weight
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 61.1 17.1 25.5 35.6 -6.4 -.6 4.9 -.9
SD 9.7 4.4 4.7 7.0 11.7 2.3 8.1 4.9
p <.001 .020
Post-hoc 2,3,4 1,3,4 1,2,4 1,2,3 3 1
Class 1 2 3 4 1 2 3 4
M_T20 59.6 17.2 26.1 35.3 -5.1 -.7 3.1 -.2
Wvar 35.9 17.7 12.0 27.3 19.2 4.9 15.8 6.6
P <.001 .117
Pairwise 2,3,4 1,3,4 1,2,4 1,2,3
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 35. Profile Lean Weight
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 67.6 49.9 53.1 57.4 3.8 .2 -2.0 1.2
SD 9.8 8.6 9.6 8.3 7.3 1.7 5.9 3.7
p <.001 .015
Post-hoc 2,3,4 1 1 1 3 1
Class 1 2 3 4 1 2 3 4
M_T20 66.1 5.4 51.6 57.6 4.7 .3 -1.3 1.2
Wvar 47.5 63.9 3.3 4.2 15.4 1.6 8.6 7.5
P .029 .166
Pairwise 2,3 1 1
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 36. BMI Z-score Pre-test Class x Intervention (M+SE)
Figure 37. Fat Weight (kg) Pre-test Class x Intervention (M+SE)
Insulin & Glucose Dynamics
Hybrid model
The glucose and insulin variables included were insulin sensitivity (SI),
acute insulin response (AIR), and disposition index (DI). There were 52
participants with complete pre- and post-test data for this model. These variables
were tested for 2-, 3-, 4-, and 5-class models (see Figure 38). There was an
improvement in fit in all indices through the 4-class model. While AIC and BIC
were better in the 5-class model, the BLRT did not indicate a significant
improvement. Additionally, the 5-class model includes a class with only one
person, thus the 4-class model is deemed the best fit. In this model, most of the
participants (n=31, 59.6%) were in class 3, followed by class 1 (n=13, 25.0%),
and the rest were split between class 2 (n=5, 9.6%) and class 4 (n=3, 5.8%).
While the sample sizes for these latter classes are small, these classes were
distinguished early on (see Figure 38) and remained consistent as the number of
classes increased. Classes 1 and 3 were combined through the 3-class model
and then split into distinct classes. Examination of the profile for these classes
(see below) shows how these classes are distinct from each other.
Profile
There was no significant relationship between class and intervention group
(χ²(6) = 4.89, p = .56), and there is representation of each group in each
class (Figure 39A). There was no gender difference in distribution across classes
(χ²(3) = 3.90, p = .27), though class 4 (n=3) included only boys (Figure 39B).
Figure 38. Hybrid Model Fits: Glucose and Insulin Dynamics
Number of classes
Variables Fit Indices 1 2 3 4 5
N 52 45/7 5/44/3 13/5/31/3 23/13/9/1/
-LnL -1811.5 -1782.6 -1782.6 -1754.1 -1739.8
# free parameters 15 22 29 36 43
AIC 3652.941 3609.221 3609.221 3580.225 3565.7
BIC 3682.210 3652.149 3652.149 3650.470 3649.6
Bootstrap -2LR! NA 57.812 3.471 31.347 27.142
df NA 10 10 10 10
p-value NA <.0001 <.0001 .0200 .1397
SI
AIR
DI
Entropy NA .976 .984 .907 .936
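The entropy row summarizes how cleanly participants are assigned to classes. The text does not give the formula used, so the sketch below assumes the relative entropy statistic commonly reported by mixture modeling software, E = 1 - [sum over participants and classes of (-p ln p)] / (N ln K), where p is a posterior class probability. The posterior probabilities shown are hypothetical and serve only to illustrate the calculation.

```python
import math

def relative_entropy(posteriors):
    """Relative entropy of a classification (1 = perfectly separated classes).

    posteriors: one list of posterior class probabilities per participant.
    Assumes the commonly used definition; the thesis does not state its formula.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = 0.0
    for probs in posteriors:
        for p in probs:
            if p > 0:  # p * ln(p) is taken as 0 when p == 0
                uncertainty -= p * math.log(p)
    return 1 - uncertainty / (n * math.log(k))

# Hypothetical posterior probabilities for four participants in a 4-class model.
hypothetical_posteriors = [
    [0.97, 0.01, 0.01, 0.01],
    [0.02, 0.95, 0.02, 0.01],
    [0.01, 0.01, 0.97, 0.01],
    [0.05, 0.05, 0.10, 0.80],
]
entropy = relative_entropy(hypothetical_posteriors)
print(f"relative entropy = {entropy:.3f}")
```

Values near 1, such as those reported in the table, would indicate that most participants have a posterior probability close to 1 for a single class.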
Examination of the variables demonstrates the patterns of differences across
classes. The pre- and post-test data and statistics for pre-test and change scores
(the components of the latent profile models) are reported in Figures 40 - 42. For
the primary outcome of SI (see Figure 40) there was a significant difference in
pre-test across classes (p < .001), with the main differences being between class 1,
which had the highest SI pre-test, and the other classes. Pairwise comparisons of
M_T20 also showed a significant effect between classes 2 and 3. Comparison of change
in SI across classes showed a significant difference (p_M = .03), driven by class 1
(which declined in SI) and class 3 (which improved the most in SI). However, this
was not found when trimmed means were compared (p_MT20 = .55). Likewise, there were
significant class effects for AIR pre-test (p_M < .001, p_MT20 = .04), but for
change only with means (p_M < .001), with a trend for trimmed mean change
(p_MT20 = .07). Pairwise comparisons showed some differences, particularly with
class 4, whose members all improved greatly in AIR. The differences in pre-test are
primarily due to class 2 being so much higher than the other groups (Figure 41).
DI, which is a composite of SI and AIR, showed a combination of these effects.
Class 1 started the highest, as with SI, and class 4 improved the most (as with
AIR), especially when compared with class 1. In summary, the classes for the glucose
and insulin variables appear to be driven primarily by pre-test level, though there
were three participants (class 4) who started with very low SI and average AIR and
DI, and improved greatly in their insulin response (AIR), and subsequently DI.
Class 3 improved in SI and DI, but their AIR changed little and was mixed in
direction of change.
Figure 39. Glucose and Insulin: Class Distribution Across Intervention Group (panels A and B)
Examination of possible interaction of intervention group with classes
showed an interaction of class with randomization for AIR change and DI change
with a 2-way ANOVA (p_M = .04 and .01, respectively). For AIR (Figure 43),
classes 1 – 3 showed little difference across intervention group, whereas class 4
was different. However, this must be interpreted with great caution as there was
only one class 4 participant in each intervention group. DI likewise showed that
the class differentiation was greatest in the Nut group (Figure 44). While these
statistics have limited usefulness given the limited sample sizes of some of the
cells, they are indicative of differences. These cell limitations do not allow for
an examination of a 2-way ANOVA for the trimmed means.
Figure 40. Profile SI
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 2.56 .49 1.46 .83 -.49 .15 .36 .12
SD 1.05 .17 .73 .42 1.01 .41 .77 .57
p <.001 .026
Post-hoc 2,3,4 1 1 1 3 1
Class 1 2 3 4 1 2 3 4
M_T20 2.42 .46 1.36 .83 -.39 .24 .26 .12
Wvar .41 .01 .24 .17 .82 .07 .14 .33
P <.001 .547
Pairwise 2,3,4 1,3 1,2 1
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 41. Profile AIR
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 1267 3380 9567 1625 8.3 114.5 -45.1 1857.5
SD 541.5 81.8 49.1 687.2 353.5 427.2 322.6 783.6
p <.001 <.001
Post-hoc 2 1,3,4 2 2 4 4 4 1,2,3
Class 1 2 3 4 1 2 3 4
M_T20 1222.8 3438.1 915.3 1625.3 -39.6 153.3 -52.0 1857.5
Wvar 226521 436160 125668 472236 68831 87234 25099 613982
P .038 .068
Pairwise 2 1,3,4 2,4 2,3 4 4 4 1,2,3
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 42. Profile DI
Pre-Test Change
Class 1 2 3 4 1 2 3 4
M 2817.7 1643.0 1219.2 1207.4 -441.6 485.7 198.0 2128.0
SD 539.1 521.9 539.9 364.5 958.4 153.2 481.2 993.6
p <.001 <.001
Post-hoc 2,3,4 1 1 1 4 4 4 1,2,3
Class 1 2 3 4 1 2 3 4
M_T20 2806.1 1598.5 1262.3 1207.4 -457.1 952.6 144.3 2128.0
Wvar 112119 118127 175846 132827 725706 700199 94044 9872812
P .001 .084
Pair-wise 2,3,4 1 1 1 4 1
Note: Participants on the right side of the figure responded best to the intervention.
Significant post hoc comparisons using Bonferroni's correction are reported. Significant
pairwise comparisons, which used a critical p-value of .009, are reported. Arrows indicate
the hypothesized direction of effects.
Figure 43. AIR Change Class x Intervention (M+SE)
Figure 44. DI Change Class x Intervention (M+SE)
Relationship of Classes Across Constructs
Table 18 reports the χ² test for overlap of the classes from the various
constructs. There was a significant difference in distribution in glucose and
insulin classes across dietary classes (p = .007). Those participants who started
out with the healthiest dietary intake and yet improved (dietary intake class 1)
were most likely to improve in their metabolic parameters (glucose and insulin
class 3), along with those who decreased most in sugar (dietary intake class 3).
Of these participants, most were in the Nut group (47.7%), adding to the
evidence that the nutrition only education plan was especially efficacious in those
who were already eating healthier. Even though the Nut+ST group also received
the nutrition education classes, only about one third of these improvers in diet
who also improved in metabolic parameters were in the Nut+ST (31.6%) which
would be expected if there were no effect of randomization group. There was
also a trend to a difference in glucose and insulin classes across strength as well
(p = .071). Those in the weaker group (strength class 1) improved the most in
glucose and insulin, though this may be a gender effect given the distinct gender
distributions in the strength classes. When intervention was added in, this effect
seems most distinct in the Con and Nut groups, whereas the Nut+ST group
improvers in glucose and insulin were equally distributed across strength
classes, indicating that this effect may be improved in stronger participants when
they received the strength training.
Table 18. Overlap of Construct Classes
(Rows are classes of the named construct; column blocks are separated by "|".)

Insulin & Glucose
   Strength | Dietary Intake | Motivation | Body Composition
   C1 C2 | C1 C2 C3 C4 | C1 C2 | C1 C2 C3 C4
C1 6 5 | 3 5 2 0 | 6 4 | 2 4 2 5
C2 2 3 | 3 0 1 1 | 3 2 | 3 0 0 2
C3 22 9 | 11 9 8 2 | 21 8 | 3 5 10 12
C4 0 3 | 0 0 0 2 | 3 0 | 1 0 0 2
χ²(3) = 7.0, p = .07 | χ²(9) = 22.7, p < .01 | χ²(3) = 2.1, p = .55 | χ²(9) = 13.2, p = .15

Strength
   Dietary Intake | Motivation | Body Composition
   C1 C2 C3 C4 | C1 C2 | C1 C2 C3 C4
C1 13 6 8 1 | 19 9 | 2 6 9 13
C2 4 8 3 4 | 14 4 | 7 3 2 8
χ²(3) = 7.7, p = .05 | χ²(1) = .5, p = .47 | χ²(3) = 7.7, p = .05

Dietary Intake
   Motivation | Body Composition
   C1 C2 | C1 C2 C3 C4
C1 11 6 | 4 6 4 3
C2 9 4 | 2 4 5 2
C3 7 3 | 4 2 2 2
C4 4 1 | 2 2 1 0
χ²(3) = .4, p = .93 | χ²(9) = 12.1, p = .21

Motivation
   Body Composition
   C1 C2 C3 C4
C1 5 4 10 16
C2 5 3 1 5
χ²(3) = 5.2, p = .16
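As an illustration of how the χ² values in Table 18 arise, the sketch below computes a Pearson χ² test of independence for the Insulin & Glucose by Dietary Intake block of the table. It assumes the reported statistics are ordinary (uncorrected) Pearson χ² values; the function name is illustrative rather than part of the original analysis.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: Insulin & Glucose classes 1-4; columns: Dietary Intake classes 1-4.
glucose_by_diet = [
    [3, 5, 2, 0],
    [3, 0, 1, 1],
    [11, 9, 8, 2],
    [0, 0, 0, 2],
]
stat, df = chi_square_independence(glucose_by_diet)
print(f"chi2({df}) = {stat:.1f}")
```

With these counts the statistic comes to about 22.7 on 9 degrees of freedom, matching the table. Because many expected cell counts fall well below 5, the χ² approximation for the p-value is rough, which is consistent with the caution the text applies to small cells.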
There was also a significant relationship between dietary and strength
classes (p = .053) such that the weaker participants (strength class 1, i.e. girls)
were likely to improve their diets. As with the glucose and insulin by class
distribution, this effect was mainly seen in the Nut group. There was also a
significant difference in body composition across strength classes (p = .052)
whereby the stronger participants (class 2, i.e. boys) improved in body composition
the most. However, when stratified by intervention group, this effect is mainly in
the Nut+ST group, whereas in the Nut group, it is the weaker participants
(strength class 1) who improved most.
Given the information about the classes above, it could be suggested that
the individuals most likely to improve in glucose and insulin parameters (the main
goal of the SANO intervention) are girls, who are physically weaker but still show
improvements, and participants who started off eating less sugar and more fiber and
still improved. Participants who were stronger (primarily boys) were most likely to
show improvement in body composition. Motivation, as measured by these indices, did
not appear to have much of an effect on the main outcome. It is interesting to note
that in the 2-way models, class effects were almost always stronger than
intervention for the change scores.
Summary of Hybrid Latent Profile Model Approach
Results from the hybrid latent profile model indicate that the population
from which the SANO sample was drawn was not homogeneous, despite
stringent inclusion criteria designed to assure this was the case. Additionally, this
heterogeneity was systematic and often had a greater influence on change in
these variables from pre- to post-test than did the intervention.
Gender was a particularly important influence on the strength and dietary
measures, which were the target of the intervention. That boys were much more
likely to gain strength (and more of it) than girls is not surprising; however, the
demonstration that, without being in the Nut+ST arm, girls are not going to
increase much is of interest. It indicates that, to increase muscle mass, girls
would need to receive a structured strength training intervention, whereas
boys may seek this out on their own. The opposite was true for nutrition, where
girls were more likely to improve their diets no matter what group they were in,
while boys in the combined group showed the most improvement. Additionally,
the profiles highlight the influence of pre-test values in direction and amount of
change across all the variables. Finally, the examination of how the profiles for
different constructs overlapped demonstrated that the theoretical model was
indeed correct, and that participants who made the greatest improvements in
strength and dietary intake benefitted most in body composition and glucose and
insulin variables.
Examination of these results demonstrates the limitation of traditional
ANCOVA for analyzing these complex data, by showing that the intervention was
differentially effective due to individual differences in the participants. The
detailed results from the class profiles provide a useful tool to understand the
intervention effects and further the science. Profiles such as those achieved with
the hybrid model can be used to create a priori intervention modalities for future
research whereby participant characteristics and pre-test scores can be used to
design the best intervention for each individual.
Chapter 9. Conclusion
One statistic does not fit all (Prescott, 2005), and this was very true of this
exploration of two non-traditional approaches to analyzing the SANO randomized
clinical trial. These analyses sought to determine why there were such limited
findings in the main outcomes of the SANO project. Were these null findings
really due to an ineffective intervention? Or was the intervention effective, and
the statistical choices limited in power or ability to detect effects that varied
across participants? A triangulation of several different non-traditional statistical
techniques was used to answer these questions.
To review, there were differences found in traditional ANCOVA for the
strength variables, bench press and leg press, and for total energy intake. There
were no significant effects for the main outcomes of glucose and insulin
dynamics (SI, AIR, DI). Nor were there effects for the dietary targets (fiber and
sugar intake), motivation factors, or body composition. Thus, the conclusion was
that the intervention was largely unsuccessful in changing the outcome variables.
So, when examining the modern statistics used here, the first question to answer
was whether this same conclusion was supported or whether the lack of
significance was due to the statistical method choice.
Did statistical method choice matter?
The robust ANCOVA approach demonstrated several significant
intervention effects not found with traditional methods. In addition to the
increases in strength and decreases in total dietary intake found in the traditional
ANCOVA, it was demonstrated that there were decreases in amotivation,
improvements in body composition, and indication of some improvements in SI
and DI. The influences were different between the two intervention groups.
Dietary improvements and decrease in fat were seen in the Nut group, whereas
strength improvements and increase in lean tissue were seen in the Nut+ST
group. And while the Nut group improved in SI over the Con group, there is
indication that DI was better in Nut+ST than Nut only. However, these group
effects depended on where along the pre-test axis the comparison was made.
Results of the hybrid LPA models also show the importance of pre-test
values in determining treatment effect. Where an individual started out was very
important to the success he or she had in the intervention. These pre-test values
often were a driving force in determining the classes, and the change was related
to this starting point. The profiling of “successful” classes allows for examination
of how the different variables within a construct, and across constructs, may work
together to create improvement, something the ANCOVA models (traditional or
robust) do not do. Thus, when looking at who had the greatest improvements
in the metabolic parameters, it was shown that the theoretical model did hold in
that those who decreased most in sugar, improved most in glucose and insulin
dynamics, though these people were not distributed evenly along the pre-test
axes. That is, the amount of change was related to pre-test score. The profiles
also brought out the differences seen across gender, in particular in how boys
and girls had differential effects for strength and body composition variables in
the two intervention groups.
One important difference between traditional ANCOVA and the
approaches used here was the treatment of outliers, and this affected the results
found. The heavy tailed distribution of many of the variables decreased the
power of traditional ANCOVA. With the robust ANCOVA approaches, the effect
of outliers is minimized. This had the effect of changing some of the results in
this small sample, where a couple of participants had a large influence on
estimators. This could be seen as beneficial or not, depending on the
outcome. The two intervention participants who lost the most fat (by a
considerable amount) were marginalized in these robust methods, which
removed two success stories from the analyses. However, this potential limitation
was seen as a benefit for the SI variable where two controls improved the most
out of all participants (Figure 2). Whether this elimination of outliers was seen to
support the hypotheses or not, the robust approaches to ANCOVA uncovered the
more systematic intervention effects better than traditional ANCOVA. Alternatively,
the LPA model used the information from outliers and examined whether
these were indicative of classes. When examined within the distribution of the
classes, the outliers were not necessarily extremes, just indicative of a mixture of
distributions. The LPA modeling makes use of the individual differences found in
the sample, a potential strength of this method.
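The way trimming limits the influence of one or two extreme participants can be sketched in a few lines. The example assumes that M_T20 denotes a 20% trimmed mean and Wvar a 20% Winsorized variance, as is common in robust methods; the change scores are hypothetical and are not SANO data.

```python
import math

def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of observations."""
    xs = sorted(values)
    g = math.floor(proportion * len(xs))  # number trimmed from each tail
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

def winsorized_variance(values, proportion=0.2):
    """Sample variance after pulling each tail in to the nearest retained value."""
    xs = sorted(values)
    n = len(xs)
    g = math.floor(proportion * n)
    low, high = xs[g], xs[n - g - 1]
    wins = [min(max(x, low), high) for x in xs]
    mean_w = sum(wins) / n
    return sum((x - mean_w) ** 2 for x in wins) / (n - 1)

# Hypothetical change scores with one extreme responder (25.0).
hypothetical_changes = [-1.0, 0.0, 0.5, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0, 25.0]

ordinary_mean = sum(hypothetical_changes) / len(hypothetical_changes)
print(f"mean = {ordinary_mean:.2f}")
print(f"20% trimmed mean = {trimmed_mean(hypothetical_changes):.2f}")
print(f"20% Winsorized variance = {winsorized_variance(hypothetical_changes):.2f}")
```

In this made-up sample the single extreme score pulls the ordinary mean well above the bulk of the data, while the trimmed mean stays close to the typical change. The same mechanism removes genuine "success stories" from the estimate, which is the trade-off described above.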
Based on these new analyses, it can be argued that the SANO
intervention did have a positive effect for several participants. And while some of
these positive changes could be attributed to intervention group, especially the
Nut assignment, there were other factors that need to be taken into account to
fully understand the impact the SANO intervention had on participants. The
results of these non-traditional analyses underscore the complex relationship of
the biological and psychological forces at play, and show why traditional
statistics (and even robust statistics) may not reach statistical significance due to
population heterogeneity.
A new theoretical model?
The aim of these analyses was to pinpoint which of three possible
outcomes could be responsible for the null findings of the SANO study when
using ANCOVA models. The first possibility was that the intervention was not
effective. This conclusion has changed given the results of these non-traditional
approaches. For some participants, the intervention was successful. Further,
these analyses provide more detailed information on which participants the
intervention was successful for, which is a benefit for future work. The second
possibility of the non-significant statistics using traditional ANCOVA was that
there just was insufficient power given the limitations of traditional ANCOVA. This
was definitely supported by these analyses. While it is a challenge to figure out
which results are the best here, the traditional ANCOVA clearly falls at the
bottom of the list. Finally, there is the possibility that the intervention was
differentially effective due to population heterogeneity. This was also supported,
and the profiles resulting from the LPA hybrid models were the most informative
about the processes. They were also the most time consuming and challenging
to perform.
After evaluating these non-traditional approaches, the theoretical model
(Figure 1) can be revisited and evaluated to see if any adjustments are
warranted. That is, how would this model change if based on results of each
method? The ANCOVA model tested the effect of the intervention on each of the
components, as shown in Figure 45. This is a departure from Figure 1 because
the ANCOVA models do not test the mediating effects of the constructs. In this
operationalized model, it is hypothesized that the intervention affected each of
the constructs as shown in Table 1. Given the results of the ANCOVA model (as
reported in Table 1), this tested model would change to resemble Figure 46. Only
strength was increased. The dashed line to Lower Sugar & Increased Fiber
represents the effects seen in total energy intake, but not sugar and fiber. Thus,
as a result of the ANCOVA analyses, the theoretical model is almost entirely
unsupported. The robust ANCOVA model supports many more of the
hypothesized differences across group, as shown in the summary in Figure 47.
There was an influence on at least some of the variables in each of the
parameters that could be traced to the treatment. Based on these results, the
theoretical model could be extended to be more specific to the type of differences
found.
Figure 45. Model Tested by ANCOVA analyses
Figure 46. Changes to Model Based on ANCOVA Results
Figure 47. Changes to Model Based on Robust ANCOVA Results
Figure 48. Changes to Model Based on LPA Results
The LPA models change the theoretical model in Figure 1 the most.
Instead of the intervention driving the results, starting values on the measures
and gender drove them. There is some indication that the intervention modified
these relationships. Based on the results from these models, gender specific
interventions would be supported. Thus, if recommendations were to be made to
the development of SANO II, based on the results of these non-traditional
methods, the theoretical model would be modified to be stratified for gender, with
gender specific paths included. Additionally, it would be recommended that the
motivation component be modified as it seemed to glean few results in any of the
methods. Additionally, perhaps a larger focus on calorie intake would be even
more beneficial, or at least performing a strength-only (no nutrition) group to see
if the combined is too much for these adolescents to integrate into their lives
successfully. Finally, these results highlight the need for controls to be monitored
more carefully, or perhaps completing some type of pseudo intervention so that
they did not go out and seek out training.
Strengths and Weaknesses of Non-traditional Approaches
There were many benefits seen by analyzing the SANO data using these
methods. These include the ability to analyze untransformed data, which aids in
interpretation. Additionally, the results provide richer information than traditional
ANCOVA. These analyses show not only that there were treatment effects, but
help to pinpoint where they are, and why traditional methods found non-
significance. Thus, the use of these methods aids in updating the theoretical
model, something the traditional ANCOVA results did not do. Even though the
sample size was restrictive with the SANO data, it was possible to get richer
information about the effects of the intervention than would be possible with
traditional ANCOVA. Both of the non-traditional approaches used here provided
information lost with traditional ANCOVA.
However, viewed from a cost-benefit perspective, it could be argued that
the intensity of this modeling may not have resulted in much more knowledge
gained in this particular example using “real” data. Even these robust analyses
were hampered by the sample sizes, something that could not be changed. This
limitation is a tricky one to overcome for any study. One solution to this in regards
to SANO would be to combine the treatment groups, however when combined
the effects washed out (results not shown). As the Nut and Nut+ST participants
responded differently, and not in the hypothesized ordinal manner, this finding is
not surprising. Another limitation of these non-traditional methods was the time
they took to complete and interpret. These models were much more time
consuming to run than traditional ANCOVA, and required programming that was
rather complicated when compared to the ANCOVA models possible with
standard statistical packages. The results are likewise much more difficult to
interpret. This may make them even more unsatisfactory to a medical audience,
which is the intended target of papers about the SANO study. This is likely the
most serious limitation of using non-traditional statistics. However, the fact that
significant effects were found where before there were none, may bring some
researchers around.
Future Directions
Future directions of research include additions to the current research, as
well as to the future of non-traditional methods. The SANO data used here can
be extended beyond the original study in two ways. First, a maintenance program
has been conducted for the participants in the SANO treatment groups. The
addition of this data to that presented here would allow the examination of the
possibility that the effects of the intervention solidified (or were eliminated) by the
passage of more time. Additionally, a parallel study has been conducted in
African American teenagers, and the data will soon be ready for analyses. Given
the findings of the exploratory analyses of the SANO data, a priori hypotheses
regarding profiles of who the intervention was successful for are now possible.
Additionally, the potentially important aspect of ethnic differences can be added.
The robust ANCOVA results add to the literature by demonstrating the added power of
robust statistics. This can highlight the usefulness of non-traditional
statistics for analyzing randomized clinical trials to a medical audience. And while
these methods certainly have usefulness in many fields, the hybrid LPA model
developed here has the potential to influence how randomized control trials are
conducted. The flexibility of the model, coupled with the detailed information of
the profile, can be useful to many different fields of study. This model can be
applied as an exploratory method, as it was here, to create profiles and
determine which participants respond best to an intervention. These profiles can
be used to create individualized treatments for participants which will maximize
the intervention efforts. Those with existing data sets can examine how variables
that appear unrelated may be systematically related to each other. Additionally,
the hybrid LPA model can be used as a confirmatory model to test a priori
hypotheses of which participants would respond best to a treatment modality.
On a personal note, future plans include the consumption of excessive
amounts of alcohol and catching up on this season of 24, all of the episodes
being unwatched on my TiVo. Additionally, I will travel to Sweden and the UK,
where I will continue my plan of killing brain cells. Later this summer, I will
become addicted to Lost, along with my friend Alia who also wants to be able to
converse to the masses. Finally, I will again learn how to sleep through the night
without waking with my heart pounding and thoughts racing through all the things
I must include in this dissertation. Finally, I have found a new long-term project to
keep me out of trouble. I will be building an airplane and taking flying lessons,
having discovered my inner Rosie the Riveter. Oh, and a new job, because my
contract is up in a year. That would be good.
This study examined the crossroads of the lofty and exacting goals of pure
science and statistics with practicalities of real world research. A change in
paradigm is needed where applied statistics are concerned. Too much valuable
information is being lost with the use of traditional methods. Given the time and
effort (and money) spent to conduct the study, using inappropriate and limited
statistics is not scientifically responsible. The non-traditional approaches used
here may not be the best solution to this issue; however, they are certainly better
than traditional ANCOVA. The greatest limiting factor to this change in paradigm
is the audience. Thus, non-traditional statistics need better publicity. The best
way to get that is to show the richness and depth, not to mention missed
significance, found when using these methods. This study successfully achieved
that with the data from the SANO study.
Bibliography
Aitkin, M., Anderson, D., & Hinde, J. (1981). Statistical modeling of the data on teaching
styles (with discussion). Journal of the Royal Statistical Society. Series A
(General), 144(4), 419-461.
Aitkin, M., & Rubin, D. (1985). Estimation and hypothesis testing in finite mixture
models. Journal of the Royal Statistical Society. Series B (Methodological), 47,
67-75.
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52(3), 317-332.
Algina, J., Keselman, H. J., & Penfield, R. D. (2005). An Alternative to Cohen's
Standardized Mean Difference Effect Size: A Robust Parameter and Confidence
Interval in the Two Independent Groups Case. Psychological Methods, 10(3),
317-328.
Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture
models: Implications for overextraction of latent trajectory classes. Psychological
Methods, 8(3), 338-363.
Boston, R., Stefanovski, D., Moate, P., Sumner, A., Watanabe, R., & Bergman, R.
(2003). MINMOD Millennium: A computer program to calculate glucose
effectiveness and insulin sensitivity from the frequently sampled intravenous
glucose tolerance test. Diabetes Technology and Therapeutics, 5(6), 1003-1015.
Bowman, A., & Young, S. (1996). Graphical comparison of nonparametric curves.
Applied Statistics, 45, 83-98.
Clogg, C. C. (1995). Latent Class Models. In G. Arminger, C. C. Clogg & M. E. Sobel
(Eds.), Handbook of Statistical Modeling for the Social and Behavioral Sciences
(pp. 311-359). New York: Plenum Press.
Davis, J., Ventura, E., Alexander, K., Salguero, L., Weigensberg, M., Crespo, N., et al.
(2007). Feasibility of a home-based versus classroom-based nutrition
intervention to reduce obesity and type 2 diabetes in Latino youth. International
Journal of Pediatric Obesity, 2, 22-3.
Davis, J. N., Kelly, L. A., Lane, C. J., Ventura, E., Byrd-Williams, C., Alexander, K., et al.
(In Press). Randomized control trial of nutrition education and strength training to
reduce risk factors for obesity related diseases in overweight Latino adolescents.
Obesity.
Delgado, M. A. (1993). Testing the equality of nonparametric regression curves.
Statistics and Probability Letters, 17, 199-204.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood estimation
from incomplete data via the EM algorithm. Journal of the Royal Statistical
Society. Series B (Methodological), 39(1), 1-38.
Dette, H., & Neumeyer, N. (2001). Nonparametric analysis of covariance. Annals of
Statistics, 29, 1361-1400.
Dunn, G., Maracy, M., & Tomenson, B. (2005). Estimating treatment effects from
randomized clinical trials with noncompliance and loss to follow-up: The role of
instrumental variable methods. Statistical Methods in Medical Research, 14(4),
369.
Faigenbaum, A., Milliken, L., & Westcott, W. (2003). Maximal strength testing in healthy
children. Journal of Strength & Conditioning Research, 17, 162-166.
Fields, D., & Goran, M. (2000). Body composition techniques and the four-compartment
model in children. Journal of Applied Physiology, 89, 613-62.
Fields, D., Goran, M., & McCrory, M. (2002). Body-composition assessment via air-
displacement plethysmography in adults and children: A review. American Journal
of Clinical Nutrition, 75, 453-467.
Garrett, E. S., & Zeger, S. L. (2000). Latent class model diagnosis.
Biometrics, 56(4), 1055-1067.
Goodman, L. A. (1974a). Exploring latent structure analysis using both identifiable and
unidentifiable models. Biometrika, 61, 215-231.
Goodman, L. A. (1974b). The analysis of systems of qualitative variables when some of
the variables are unobserved. Part I: A modified latent structure approach.
American Journal of Sociology, 79, 1179-1259.
Hall, P., Huber, C., & Speckman, P. L. (1997). Covariate-matched-one-sided tests for
the difference between functional means. Journal of the American Statistical
Association, 92, 1074-1083.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., & Stahel, W. A. (1986). Robust
statistics: The approach based on influence functions. New York: Wiley.
Haro, J. M., van Os, J., Vieta, E., Reed, C., Lorenzo, M., Goetz, I., et al. (2006).
Evidence for three distinct classes of 'typical', 'psychotic' and 'dual' mania:
Results from the EMBLEM study. Acta Psychiatrica Scandinavica, 113(2), 112-
12.
Harwell, M. R., Rubenstein, E. N., Hayes, W. S., & Olds, C. C. (1992). Summarizing
Monte Carlo results in methodological research: The one- and two-factor fixed
effects ANOVA cases. Journal of Educational Statistics, 17, 315-339.
Huber, P. (1981). Robust Statistics. New York: Wiley.
Jo, B. (2002). Estimation of intervention effects with noncompliance: Alternative model
specifications. Journal of Educational and Behavioral Statistics, 27(4), 385-409.
Kass, R. E., & Raftery, A. E. (1993). Bayes Factors. Journal of the American Statistical
Association, 90, 773-795.
Keselman, H. J., Algina, J., Lix, L. M., Wilcox, R. R., & Deering, K. N. (2008). A
generally robust approach for testing hypotheses and setting confidence intervals
for effect sizes. Psychological Methods, 13(2), 110-129.
Keselman, H. J., Cribbie, R. A., & Wilcox, R. R. (2002). Pairwise multiple comparison
tests when data are nonnormal. Educational and Psychological Measurement,
62(3), 420-434.
Keselman, H. J., Lix, L. M., & Kowalchuk, R. K. (1998). Multiple comparison procedures
for trimmed means. Psychological Methods, 3(1), 123-141.
Keselman, H. J., Wilcox, R. R., & Lix, L. M. (2003). A generally robust approach to
hypothesis testing in independent and correlated groups designs.
Psychophysiology, 40(4), 586-596.
Kulasekera, K. B. (1995). Comparison of regression curves using quasi-residuals.
Journal of the American Statistical Association, 90, 1085-1093.
Kulasekera, K. B., & Wang, J. (1997). Smoothing parameter selection for power
optimality in testing regression curves. Journal of the American Statistical
Association, 92, 500-511.
Lane, C. J. (2007). SANO-LA Key Variable Analyses: University of Southern California.
Lennon, M. C., McAllister, W., Kuang, L., & Herman, D. B. (2005). Capturing
Intervention Effects Over Time: Reanalysis of a Critical Time Intervention for
Homeless Mentally Ill Men. American Journal of Public Health, 95(10), 1760-176.
Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a
normal mixture. Biometrika, 88(3), 767-778.
Loehlin, J. C. (1998). Latent Variable Models: An Introduction to Factor, Path, and
Structural Analysis (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Assoc.
Lubke, G., & Neale, M. C. (2006). Distinguishing between latent classes and continuous
factors: Resolution by maximum likelihood? Multivariate Behavioral Research,
41(4), 499-532.
Lubke, G. H., & Muthen, B. (2005). Investigating population heterogeneity with factor
mixture models. Psychological Methods, 10(1), 21-39.
McCutcheon, A. L. (2002). Basic concepts and procedures in single- and multi-group
latent class analysis. In A. L. McCutcheon (Ed.), Advances in Latent Class
Models (pp. 56-85). West Nyack, NY: Cambridge University Press.
McLachlan, G., & Peel, D. (2000). Finite Mixture Models. New York: John Wiley.
McLauchlan, G., & Peel, D. (2000). Finite Mixture Models. New York: John Wiley &
Sons.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures.
Psychological Bulletin, 111, 156-166.
Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression. Reading, MA:
Addison-Wesley.
Munk, A., & Dette, H. (1998). Nonparametric comparison of several regression
functions: Exact and asymptotic theory. Annals of Statistics, 26, 2339-2368.
Muthen, B., & Asparouhov, T. (2008). Growth Mixture Modeling: Analysis with Non-
Gaussian Random Effects. In G. Fitzmaurice, M. Davidian, G. Verbeke & G.
Molenberghs (Eds.), Longitudinal Data Analysis (pp. 143-165). Boca Raton:
Chapman & Hall/CRC Press.
Muthen, B., Brown, C. H., Masyn, K., Jo, B., Khoo, S., Yang, C., et al. (2002). General
growth mixture modeling for randomized preventive interventions. Biostatistics,
3(4), 459-475.
Muthen, B., Collins, L. M., & Sayer, A. (2001). Second-generation structural equation
modeling with a combination of categorical and continuous latent variables. In
Anonymous (Ed.), New Methods for the Analysis of Change (pp. 291-322).
Washington, DC: APA.
Muthen, B., & Curran, P. J. (1997). General longitudinal modeling of individual
differences in experimental design: A latent variable framework for analysis and
power estimation. Psychological Methods, 2(4), 371-402.
Muthen, B., Hancock, G. R., & Samuelsen, K. M. (2007). Latent variable hybrids:
Overview of Old and New Models. In Anonymous (Ed.), Advances in Latent
Variable Mixture Models. (pp.???-???). Greenwich, CT: Information Age
Publishing, Inc.
Muthen, B. O. (2002). Beyond SEM: General latent variable modeling. Behaviormetrika,
29(1), 81-117.
Muthen, B. O., & Muthen, L. K. (1998-2007). Mplus version 4.1.
Muthen, B. O., & Satorra, A. (1995). Technical aspects of Muthen's Liscomp approach
to estimation of latent variable relations with a comprehensive measurement
model. Psychometrika, 60(4), 489-503.
Muthen, L. K., & Muthen, B. O. (1998). Mplus User's Guide. Los Angeles, CA: Muthen &
Muthen.
Muthen, L. K., & Muthen, B. O. (1998-2007). Mplus User's Guide. (4th ed.). Los
Angeles, CA: Muthen & Muthen.
Nagin, D. S. (1999). Analyzing developmental trajectories: A semiparametric, group-
based approach. Psychological Methods, 4(2), 139-139.
Nagin, D. S., & Tremblay, R. E. (2001). Analyzing developmental trajectories of distinct
but related behaviors: A group-based method. Psychological Methods, 6(1), 18-18.
Neumeyer, N., & Dette, H. (2003). Nonparametric comparison of regression curves: An
empirical process approach. Annals of Statistics, 31, 880-92.
Nylund, K. L., Asparouhov, T., & Muthen, B. (2006). Deciding on the number of classes
in latent class analysis and growth mixture modeling: A Monte Carlo simulation
study (pp. 58): University of California, Los Angeles.
Othman, A. R., Keselman, H. J., Padmanabhan, A. R., Wilcox, R. R., & Fradette, K.
(2004). Comparing measures of the 'typical' score across treatment groups.
British Journal of Mathematical and Statistical Psychology, 57(2), 215-234.
Prescott, P. (2005). Robustness, Encyclopedia of Biostatistics: Wiley InterScience.
Ramaswamy, V., DeSarbo, W., Reibstein, D., & Robins, W. (1993). An empirical pooling
approach for estimating marketing mix elasticities with PIMS data. Marketing
Science, 12(1), 103-124.
Reed, J. F. (1998). Contributions to adaptive estimation. Journal of Applied Statistics,
25(5), 651-669.
Resnicow, K. (2005). Novel theoretical approaches to behavior change moving beyond
the usual suspects: Self Determination Theory and Chaos Theory. Paper
presented at the International Society for Behavioral Nutrition and Physical
Activity, Amsterdam.
Robert J Vallerand, G. F. L. (1999). An Integrative Analysis of Intrinsic and Extrinsic
Motivation in Sport. Journal of Applied Sport Psychology, 11, 142-169.
Rubak, S., Sandbaek, A., Lauritzen, T., & Christiansen, B. (2005). Motivational
Interviewing: A systematic review and meta-analysis. British Journal of General
Practice, 55, 305-312.
Ryan, R. M., & Connell, A. (1989). Perceived locus of causality and internalization:
Examining reasons for acting in two domains. Journal of Personality and Social
Psychology, 57(5), 749-761.
Ryan, R. M., & Deci, E. (2007). Active human nature: Self-determination theory and the
promotion and maintenance of sport, exercise, and health. In M. S. Hagger & N.
L. D. Chatzisarantis (Eds.), Intrinsic motivation and self-determination in exercise
and sport (pp. 1-19). Champaign, IL: Human Kinetics.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2),
461-464.
Segawa, E., Yanhong, J. E., Li, Y., Flay, B. R., & Aya, A. (2005). Evaluation of the
effects of the Aban Aya youth project in reducing violence among African
American adolescent males using latent class growth mixture modeling
techniques. Evaluation Review, 29, 128-148.
Shaibi, G., Cruz, M., Weigensberg, M., Salem, G., Crespo, N., & Goran, M. (2006).
Effects of resistance training on insulin sensitivity in overweight Latino adolescent
males. Medicine & Science in Sports & Exercise, 38(7), 1208-1215.
Staudte, R. G., & Sheather, S. J. (1990). Robust Estimation and Testing. New York:
Wiley.
Stute, W., Manteiga, W. G., & Quindimil, M. P. (1998). Bootstrap approximations in model
checks for regression. Journal of the American Statistical Association, 93(441),
141-149.
The R Project for Statistical Computing. from http://www.r-project.org
Tukey, J. W. (1960). A survey of sampling from contaminated normal distributions. In I.
Olkin (Ed.), Contributions to Probability and Statistics (pp. 448-486). Stanford,
CA: Stanford University Press.
Tukey, J. W. (1975). Mathematics and the picturing of data. Paper presented at the
International Congress of Mathematicians, Vancouver.
Ventura, A. K., Loken, E., & Birch, L. L. (2006). Risk profiles for metabolic syndrome in
a nonclinical sample of adolescent girls. Pediatrics, 118(6), 2434-2442.
Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., & Hochberg, Y. (1999).
Multiple comparisons and multiple tests. Cary, NC: SAS Institute.
Wilcox, R. R. (1997). ANCOVA based on comparing a robust measure of location at
empirically determined design points. British Journal of Mathematical and
Statistical Psychology, 50(1), 93-103.
Wilcox, R. R. (1998a). How many discoveries have been lost by ignoring modern
statistical methods? American Psychologist, 53(3), 300-314.
Wilcox, R. R. (1998b). The goals and strategies of robust methods. British Journal of
Mathematical and Statistical Psychology, 51(1), 1-39.
Wilcox, R. R. (2001). Pairwise comparisons of trimmed means for two or more groups.
Psychometrika, 66(3), 343-356.
Wilcox, R. R. (2002). Understanding the practical advantages of modern ANOVA
methods. Journal of Clinical Child and Adolescent Psychology, 31(3), 399-412.
Wilcox, R. R. (2004). Kernel Density Estimators: An Approach to Understanding How
Groups Differ. Understanding Statistics, 3(4), 333-348.
Wilcox, R. R. (2005). A comparison of six smoothers when there are multiple predictors.
Statistical Methodology, 2(1), 49-57.
Wilcox, R. R. (2005). An Approach to ANCOVA that Allows Multiple Covariates,
Nonlinearity, and Heteroscedasticity. Educational and Psychological
Measurement, 65(3), 442-45.
Wilcox, R. R. (2005). Introduction to Robust Estimation and Hypothesis Testing. San
Diego: Elsevier Academic Press.
Wilcox, R. R. (2007). Robust ANCOVA: Some small-sample results where there are
multiple groups and multiple covariates. Journal of Applied Statistics, 34(3), 353-
364.
Wilcox, R. R. (2008). Robust ANCOVA using a smoother with bootstrap bagging
[Electronic Version]. British Journal of Mathematical & Statistical Psychology,
[Epub ahead of print].
Wilcox, R. R., & Keselman, H. J. (2003). Modern Robust Data Analysis Methods:
Measures of Central Tendency. Psychological Methods, 8(3), 254-274.
Wilcox, R. R., Keselman, H. J., & Kowalchuk, R. K. (1998). Can tests for treatment
group equality be improved?: The bootstrap and trimmed means conjecture.
British Journal of Mathematical and Statistical Psychology, 51(1), 123-134.
Williams, G. C., Grow, V. M., Freedman, Z. R., Ryan, R. M., & Deci, E. L. (1996).
Motivational predictors of weight loss and weight-loss maintenance. Journal of
Personality and Social Psychology, 70(1), 115-126.
Yang, C. C. (2006). Evaluating latent class analysis models in qualitative phenotype
identification. Computational Statistics & Data Analysis, 50, 1090-1090-1104.
Young, S. G., & Bowman, A. W. (1995). Nonparametric analysis of covariance.
Biometrics, 51, 920-931.
Yuen, K. K. (1974). The two sample trimmed t for unequal population variances.
Biometrika, 61, 165-17.
Appendix A. SANO Variables
Strength
Before the hospital visits, strength was assessed using the established
procedures for 1 repetition maximum (1-RM; Faigenbaum, Milliken, & Westcott,
2003). Bench press 1-RM assessed upper-body strength while leg press 1-RM
assessed lower body strength.
Dietary Data
Participants completed 3-day diet records at home and brought these to
the FSIVGTT inpatient visit where the records were clarified by research staff.
Nutrition data were analyzed using the Nutrition Data System for Research
(NDS-R version 5.0_35). For these analyses, total energy (kcal), total sugar,
added sugar, and fiber were included. These were the main focus of the nutrition
portion of the intervention.
Motivation
There were two scales measuring motivation used in this study, which
covered the intervention constructs of diet and exercise and measured locus of
control over these behaviors. Participants completed these scales during their
testing visits, while waiting in the hospital. The scales were the Reasons for
Healthy Diet scale from the Intervention Self-Regulation Questionnaire (TSRQ-HD;
Resnicow, 2005; Ryan & Connell, 1989; Williams, Grow, Freedman, Ryan, & Deci,
1996) and the SRQE (Robert J Vallerand, 1999; Ryan & Deci, 2007), which
measured motivation for exercise. The TSRQ-HD includes three distinct factors of
motivation: autonomous, controlled, and amotivation. Each of these is treated
separately with no summary score. The SRQE contains a summary factor, the relative
autonomy index (RAI), that reflects the proportion of the factors that go into the
measure. These include external regulation, introjected regulation, identified
regulation, and intrinsic motivation. They are combined using the following
formula:
\[ \text{RAI} = 2 \times \text{Intrinsic} + \text{Identified} - \text{Introjected} - 2 \times \text{External} \]
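As a minimal sketch of the RAI formula above, the following Python function combines four subscale scores. The function name, argument names, and example scores are illustrative assumptions and are not taken from the SRQE scoring materials.

```python
def relative_autonomy_index(intrinsic, identified, introjected, external):
    """RAI = 2*Intrinsic + Identified - Introjected - 2*External."""
    return 2 * intrinsic + identified - introjected - 2 * external


# Hypothetical subscale scores, for illustration only.
example_rai = relative_autonomy_index(
    intrinsic=5.0, identified=4.0, introjected=3.0, external=2.0
)
print(example_rai)
```

Higher values reflect relatively more autonomous (intrinsic and identified) regulation, because those subscales enter with positive weights.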
Body Composition
Weight and height were measured to the nearest .1 kg and .1 cm,
respectively, using a beam medical scale and wall-mounted stadiometer. From
these measurements, BMI was computed using the following formula:
\[ \text{BMI} = \frac{\text{Wt (kg)}}{\text{Ht (cm)}^2} \times 10{,}000 \]
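The BMI formula above can be sketched in Python as follows. The function name and the example weight and height are hypothetical; the factor of 10,000 converts squared centimeters to squared meters.

```python
def bmi_from_kg_cm(weight_kg, height_cm):
    """BMI = weight (kg) / height (cm)^2 * 10,000, i.e. kg per square meter."""
    if height_cm <= 0:
        raise ValueError("height_cm must be positive")
    return weight_kg / height_cm ** 2 * 10_000


# Hypothetical participant: 81.0 kg and 180.0 cm.
example_bmi = bmi_from_kg_cm(81.0, 180.0)
print(round(example_bmi, 2))
```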
Whole body fat and soft lean tissue were measured by Bod Pod, a
technique where the participant is placed inside an enclosed machine where the
air volume is known. Body composition is computed through examination of air
displacement plethysmography (Fields, Goran, & McCrory, 2002). This is a safe
technique for body composition measurement that does not expose participants
to radiation, as some other techniques do, nor does it require placement in a
water tank (Fields & Goran, 2000).
Glucose and Insulin Dynamics
Glucose and insulin dynamics used for this study were assessed during a
frequently sampled intravenous glucose tolerance test (FSIVGTT). The FSIVGTT
was completed at an in-patient visit after an overnight fast. Participants arrived at
the General Clinical Research Center (GCRC) in the evening and were served
standardized food, which was followed by an overnight fast. In the morning, an
insulin-modified frequently sampled intravenous glucose tolerance test
(FSIVGTT) was performed using standard procedures for collection and
processing of blood samples (see J. N. Davis et al., In Press). Insulin sensitivity
(SI), acute insulin response (AIR), and disposition index (DI) were obtained
through analysis of the plasma values processed by the MINMOD Millennium
2003 computer program (version 5.16; Boston et al., 2003).
Appendix B. Pre- & Post-Test Value by Intervention Group
Figure 49. Bench Press Pre- and Post-Test
Figure 50. Leg Press Pre- and Post-Test
Figure 51. Total Energy Intake Pre- and Post-Test
Figure 52. Total Fiber Intake Pre- and Post-Test
Figure 53. Total Sugar Intake Pre- and Post-Test
Figure 54. Added Sugar Intake Pre- and Post-Test
Figure 55. Motivation for Exercise: Relative Autonomy Pre- and Post-Test
Figure 56. Motivation for Fruits & Vegetables: Autonomous Factor Pre- and Post-Test
Figure 57. Motivation for Fruits & Vegetables: Controlled Factor Pre- and Post-Test
Figure 58. Motivation for Fruits & Vegetables: Amotivation Factor Pre- and Post-Test
Figure 59. BMI Z-Score Pre- and Post-Test
Figure 60. Fat Weight Pre- and Post-Test
Figure 61. Lean Tissue Weight Pre- and Post-Test
Figure 62. Insulin Sensitivity (SI) Pre- and Post-Test
Figure 63. Acute Insulin Response (AIR) Pre- and Post-Test
Figure 64. Disposition Index (DI) Pre- and Post-Test
Appendix C. Equations for Robust Estimators
All equations used for these analyses can be found in Wilcox (2007). For the
following equations:
J = # of groups
Mdn = Median
N = # of participants
X = covariates 1,…,k
Y = dependent variable
l = [gN] is the level of trimming
Median
Estimate of the standard error of the median:
\[ S^2 = \left( \frac{Y_{(N-k+1)} - Y_{(k)}}{2 z_{0.995}} \right)^2 \]
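The squared standard error of the median above can be sketched in Python. The appendix does not define the order-statistic index k used in this equation, so the sketch ASSUMES the McKean-Schrader rule, k = round((N + 1)/2 - z_0.995 * sqrt(N/4)) with a floor of 1; that rule, the function name, and the example data are assumptions rather than details taken from this document.

```python
import math
from statistics import NormalDist


def median_squared_se(values):
    """Squared SE of the median: ((Y_(N-k+1) - Y_(k)) / (2 * z_0.995))^2.

    ASSUMPTION: k = round((N + 1) / 2 - z_0.995 * sqrt(N / 4)), floored at 1
    (McKean-Schrader rule); the appendix does not state how k is chosen.
    """
    y = sorted(values)
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    z = NormalDist().inv_cdf(0.995)          # z_0.995, roughly 2.576
    k = max(1, round((n + 1) / 2 - z * math.sqrt(n / 4)))
    upper = y[n - k]                         # Y_(N-k+1), 1-based order statistic
    lower = y[k - 1]                         # Y_(k)
    return ((upper - lower) / (2 * z)) ** 2


# Hypothetical scores, for illustration only.
scores = list(range(1, 22))
print(median_squared_se(scores))
```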
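F-Statistic for Median
\[ w_{jk} = \frac{1}{S_{jk}^2} \]
\[ U_k = \sum_j w_{jk} \]
\[ \tilde{M}_k = \frac{1}{U_k} \sum_j w_{jk}\, Mdn_{jk} \]
\[ A_k = \frac{1}{J-1} \sum_j w_{jk} \left( Mdn_{jk} - \tilde{M}_k \right)^2 \]
\[ B_k = \frac{2(J-2)}{J^2-1} \sum_j \frac{\left( 1 - w_{jk}/U_k \right)^2}{n_{jk}-1} \]
\[ F_{mk} = \frac{A_k}{1+B_k} \]
This may be tested against an F-distribution with degrees of freedom
\[ \nu_1 = J - 1, \qquad \nu_2 = \infty \]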
Trimmed Means
Compute trimmed means for each design point
\[ \bar{Y}_t = \frac{1}{N-2l} \sum_{i=l+1}^{N-l} Y_{(i)} \]
Compute Winsorized Y values for fixed j and k
\[ U_{ijk} = \begin{cases} Y_{(g+1)jk}, & \text{if } Y_{ijk} \le Y_{(g+1)jk} \\ Y_{ijk}, & \text{if } Y_{(g+1)jk} < Y_{ijk} < Y_{(n-g)jk} \\ Y_{(n-g)jk}, & \text{if } Y_{ijk} \ge Y_{(n-g)jk} \end{cases} \]
Compute Winsorized variance
\[ s_{wjk}^2 = \frac{1}{n_{jk}-1} \sum_i \left( U_{ijk} - \bar{U}_{jk} \right)^2 \]
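The trimmed mean, Winsorized values, and Winsorized variance above can be sketched for a single group as follows. The function names and example data are hypothetical, and the sketch assumes the number of observations trimmed from each tail is the integer part of (trimming proportion x sample size).

```python
import math


def trim_count(n, proportion):
    """Observations trimmed per tail: integer part of proportion * n."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    return math.floor(proportion * n)


def trimmed_mean(values, proportion=0.2):
    """Mean of the order statistics left after trimming both tails."""
    y = sorted(values)
    n = len(y)
    g = trim_count(n, proportion)
    kept = y[g:n - g]                      # Y_(g+1), ..., Y_(n-g)
    return sum(kept) / len(kept)           # len(kept) == n - 2g


def winsorize(values, proportion=0.2):
    """Pull values beyond Y_(g+1) and Y_(n-g) in to those order statistics."""
    y = sorted(values)
    n = len(y)
    g = trim_count(n, proportion)
    low, high = y[g], y[n - g - 1]         # Y_(g+1) and Y_(n-g)
    return [min(max(v, low), high) for v in values]


def winsorized_variance(values, proportion=0.2):
    """Sample variance (n - 1 denominator) of the Winsorized values."""
    u = winsorize(values, proportion)
    u_bar = sum(u) / len(u)
    return sum((x - u_bar) ** 2 for x in u) / (len(u) - 1)


# Hypothetical data with one extreme value, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(trimmed_mean(data), winsorized_variance(data))
```

With 20% trimming, the extreme value 100 is dropped from the trimmed mean and replaced by 8 in the Winsorized sample, which is why these estimators are less sensitive to outliers than the ordinary mean and variance.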
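F-statistic for trimmed mean
\[ d_{jk} = \frac{(n_{jk}-1)\, s_{wjk}^2}{h_{jk} \times (h_{jk}-1)} \]
\[ w_{jk} = \frac{1}{d_{jk}} \]
\[ U_k = \sum_j w_{jk} \]
\[ \tilde{Y}_k = \frac{1}{U_k} \sum_j w_{jk}\, \bar{Y}_{tjk} \]
\[ A_k = \frac{1}{J-1} \sum_j w_{jk} \left( \bar{Y}_{tjk} - \tilde{Y}_k \right)^2 \]
\[ B_k = \frac{2(J-2)}{J^2-1} \sum_j \frac{\left( 1 - w_{jk}/U_k \right)^2}{h_{jk}-1} \]
\[ F_{tk} = \frac{A_k}{1+B_k} \]
\[ \nu_1 = J - 1, \qquad \nu_2 = \left[ \frac{3}{J^2-1} \sum_j \frac{\left( 1 - w_{jk}/U_k \right)^2}{h_{jk}-1} \right]^{-1} \]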
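The F-statistic for trimmed means above can be sketched in Python for a single design point. This is an illustrative sketch, not the code used for the analyses: the function name and example data are hypothetical, and it assumes that h is the number of observations left after trimming (n - 2g) with g the integer part of (proportion x n). Obtaining a p-value would additionally require an F-distribution routine, which is not shown.

```python
import math


def welch_trimmed_f(groups, proportion=0.2):
    """Welch-type F statistic for trimmed means; returns (F, nu1, nu2)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    n_groups = len(groups)                       # J
    if n_groups < 2:
        raise ValueError("need at least two groups")
    weights, t_means, kept = [], [], []
    for values in groups:
        y = sorted(values)
        n = len(y)
        g = math.floor(proportion * n)           # trimmed per tail (assumed)
        h = n - 2 * g                            # left after trimming (assumed)
        if h < 2:
            raise ValueError("too few observations after trimming")
        low, high = y[g], y[n - g - 1]           # Y_(g+1) and Y_(n-g)
        u = [min(max(v, low), high) for v in y]  # Winsorized values
        u_bar = sum(u) / n
        s2_w = sum((x - u_bar) ** 2 for x in u) / (n - 1)
        d = (n - 1) * s2_w / (h * (h - 1))
        if d == 0:
            raise ValueError("a group has zero Winsorized variance")
        weights.append(1 / d)
        t_means.append(sum(y[g:n - g]) / h)
        kept.append(h)
    u_total = sum(weights)                       # U
    y_tilde = sum(w * m for w, m in zip(weights, t_means)) / u_total
    a = sum(w * (m - y_tilde) ** 2
            for w, m in zip(weights, t_means)) / (n_groups - 1)
    c = sum((1 - w / u_total) ** 2 / (h - 1) for w, h in zip(weights, kept))
    b = 2 * (n_groups - 2) / (n_groups ** 2 - 1) * c
    nu1 = n_groups - 1
    nu2 = 1 / (3 / (n_groups ** 2 - 1) * c)
    return a / (1 + b), nu1, nu2


# Three hypothetical groups that differ only by a shift.
example = [list(range(1, 11)), list(range(11, 21)), list(range(21, 31))]
print(welch_trimmed_f(example))
```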
Appendix D. Robust Descriptives
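Table 19. Means, Trimmed Means, and Medians
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Strength
Bench Press (kg) | CON | T1 | 16 | 93.44 | 7.17 | 88.50 | 6.14 | 9.00 | 6.38 | 85.00 | NA
Bench Press (kg) | CON | Δ | 16 | 11.56 | 2.18 | 12.00 | 2.40 | 11.79 | 2.54 | 1.00 | NA
Bench Press (kg) | NUT | T1 | 21 | 93.81 | 5.69 | 93.46 | 6.83 | 92.94 | 6.23 | 95.00 | 8.74
Bench Press (kg) | NUT | Δ | 20 | .50 | 2.40 | 1.25 | 2.36 | .94 | 2.14 | .00 | 2.91
Bench Press (kg) | NUT+ST | T1 | 17 | 97.35 | 9.95 | 91.36 | 7.19 | 92.00 | 8.01 | 85.00 | 29.12
Bench Press (kg) | NUT+ST | Δ | 17 | 26.18 | 3.52 | 91.36 | 7.19 | 92.00 | 8.01 | 85.00 | 29.12
Leg Press (kg) | CON | T1 | 16 | 512.50 | 48.08 | 489.50 | 51.75 | 512.14 | 55.68 | 477.50 | NA
Leg Press (kg) | CON | Δ | 14 | 68.93 | 45.82 | 38.00 | 44.56 | 45.00 | 38.52 | 35.00 | NA
Leg Press (kg) | NUT | T1 | 21 | 511.24 | 45.99 | 502.00 | 44.73 | 496.24 | 43.68 | 501.00 | 59.20
Leg Press (kg) | NUT | Δ | 20 | -26.30 | 24.11 | -16.33 | 23.55 | -2.38 | 2.28 | .00 | 31.06
Leg Press (kg) | NUT+ST | T1 | 17 | 476.47 | 48.96 | 465.00 | 53.35 | 459.00 | 43.91 | 50.00 | 14.73
Leg Press (kg) | NUT+ST | Δ | 17 | 12.88 | 29.75 | 102.73 | 28.96 | 115.33 | 32.97 | 10.00 | 72.79
Dietary Intake
Total Energy (kcal) | CON | T1 | 14 | 1957.72 | 192.94 | 1926.38 | 289.97 | 1946.66 | 237.92 | 1789.14 | NA
Total Energy (kcal) | CON | Δ | 14 | 188.87 | 279.02 | 303.65 | 298.11 | 21.96 | 281.64 | 578.62 | NA
Total Energy (kcal) | NUT | T1 | 20 | 1954.52 | 151.79 | 197.02 | 183.22 | 1952.21 | 162.44 | 1896.64 | 232.52
Total Energy (kcal) | NUT | Δ | 20 | -202.44 | 102.75 | -18.12 | 121.05 | -195.01 | 117.06 | -122.64 | 159.79
Total Energy (kcal) | NUT+ST | T1 | 15 | 1788.05 | 117.72 | 177.45 | 144.99 | 1769.73 | 138.46 | 1828.62 | NA
Total Energy (kcal) | NUT+ST | Δ | 15 | -351.98 | 145.96 | -377.49 | 131.32 | -362.46 | 154.39 | -373.81 | NA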
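Table 19, Continued
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Total Fiber (g) | CON | T1 | 14 | 14.24 | 1.28 | 14.78 | 1.76 | 14.53 | 1.51 | 14.61 | NA
Total Fiber (g) | CON | Δ | 14 | 2.82 | 2.18 | 2.55 | 2.85 | 2.64 | 2.29 | 2.57 | NA
Total Fiber (g) | NUT | T1 | 20 | 16.83 | 2.21 | 14.80 | 1.42 | 15.08 | 1.79 | 13.98 | 1.82
Total Fiber (g) | NUT | Δ | 20 | 1.04 | 2.12 | 1.18 | 1.72 | 1.29 | 1.58 | 1.49 | 2.43
Total Fiber (g) | NUT+ST | T1 | 15 | 14.80 | 1.24 | 14.39 | 1.08 | 14.92 | 1.20 | 14.76 | NA
Total Fiber (g) | NUT+ST | Δ | 15 | .36 | 2.62 | -.97 | 3.38 | -.28 | 2.93 | -5.00 | NA
Total Sugar (g) | CON | T1 | 14 | 12.91 | 14.12 | 118.03 | 19.19 | 119.51 | 17.24 | 117.18 | NA
Total Sugar (g) | CON | Δ | 14 | -2.42 | 21.55 | 2.89 | 24.34 | .58 | 19.53 | 14.07 | NA
Total Sugar (g) | NUT | T1 | 20 | 111.11 | 13.15 | 104.83 | 9.24 | 104.01 | 12.31 | 109.12 | 12.59
Total Sugar (g) | NUT | Δ | 20 | -1.13 | 11.49 | -5.36 | 8.93 | -6.30 | 11.47 | -8.80 | 12.01
Total Sugar (g) | NUT+ST | T1 | 15 | 102.17 | 9.12 | 99.75 | 11.11 | 101.09 | 1.99 | 94.00 | NA
Total Sugar (g) | NUT+ST | Δ | 15 | -15.02 | 12.01 | -16.66 | 15.45 | -15.47 | 14.04 | -18.88 | NA
Added Sugar (g) | CON | T1 | 14 | 92.60 | 14.36 | 85.51 | 18.79 | 88.10 | 15.72 | 73.98 | NA
Added Sugar (g) | CON | Δ | 14 | -8.42 | 18.80 | -4.75 | 18.73 | -4.70 | 16.79 | 6.18 | NA
Added Sugar (g) | NUT | T1 | 20 | 75.20 | 1.93 | 7.26 | 1.99 | 7.15 | 9.65 | 74.94 | 12.98
Added Sugar (g) | NUT | Δ | 20 | -17.61 | 11.51 | -13.34 | 9.04 | -18.14 | 11.78 | -15.22 | 11.77
Added Sugar (g) | NUT+ST | T1 | 15 | 59.69 | 7.76 | 56.71 | 3.23 | 58.36 | 9.42 | 55.60 | NA
Added Sugar (g) | NUT+ST | Δ | 15 | -9.27 | 1.24 | -12.00 | 11.59 | -9.53 | 11.70 | -1.12 | NA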
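Table 19, Continued
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Motivation
RAI (Exercise) | CON | T1 | 16 | 8.54 | 1.05 | 8.23 | 1.13 | 8.45 | .96 | 8.50 | NA
RAI (Exercise) | CON | Δ | 15 | -1.74 | .81 | -1.44 | .71 | -1.66 | .83 | -1.00 | NA
RAI (Exercise) | NUT | T1 | 21 | 7.46 | 1.03 | 7.23 | 1.35 | 7.32 | 1.22 | 6.25 | 1.67
RAI (Exercise) | NUT | Δ | 21 | .57 | 1.27 | .68 | 1.11 | .76 | 1.15 | .50 | 1.46
RAI (Exercise) | NUT+ST | T1 | 17 | 8.91 | 1.13 | 8.70 | 1.58 | 8.88 | 1.27 | 7.75 | 2.28
RAI (Exercise) | NUT+ST | Δ | 15 | -.17 | 1.51 | .36 | 1.36 | .35 | 1.61 | -1.25 | NA
Autonomous (Fruits & Vegetables) | CON | T1 | 16 | 4.88 | .27 | 5.02 | .38 | 4.94 | .31 | 5.33 | NA
Autonomous (Fruits & Vegetables) | CON | Δ | 16 | -.27 | .37 | -.03 | .51 | -.20 | .45 | .25 | NA
Autonomous (Fruits & Vegetables) | NUT | T1 | 21 | 5.23 | .29 | 5.31 | .37 | 5.30 | .34 | 5.00 | .49
Autonomous (Fruits & Vegetables) | NUT | Δ | 21 | .38 | .27 | .34 | .19 | .35 | .17 | .33 | .24
Autonomous (Fruits & Vegetables) | NUT+ST | T1 | 16 | 3.88 | .36 | 4.12 | .51 | 3.94 | .45 | 4.17 | NA
Autonomous (Fruits & Vegetables) | NUT+ST | Δ | 15 | 1.09 | .28 | .96 | .26 | 1.05 | .31 | 1.00 | NA
Controlled (Fruits & Vegetables) | CON | T1 | 16 | 3.24 | .31 | 3.11 | .48 | 3.21 | .39 | 2.94 | NA
Controlled (Fruits & Vegetables) | CON | Δ | 16 | -.59 | .25 | -.45 | .21 | -.51 | .27 | -.31 | NA
Controlled (Fruits & Vegetables) | NUT | T1 | 21 | 2.64 | .32 | 2.38 | .33 | 2.53 | .40 | 2.25 | .41
Controlled (Fruits & Vegetables) | NUT | Δ | 21 | .03 | .33 | .14 | .32 | .04 | .35 | .25 | .41
Controlled (Fruits & Vegetables) | NUT+ST | T1 | 16 | 2.84 | .32 | 2.72 | .38 | 2.77 | .34 | 2.69 | NA
Controlled (Fruits & Vegetables) | NUT+ST | Δ | 15 | .15 | .25 | .26 | .26 | .23 | .24 | .21 | NA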
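Table 19, Continued
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Motivation
Amotivation (Fruits & Vegetables) | CON | T1 | 16 | 2.75 | .36 | 2.53 | .47 | 2.67 | .43 | 2.33 | NA
Amotivation (Fruits & Vegetables) | CON | Δ | 16 | -.65 | .46 | -.57 | .55 | -.69 | .52 | -.33 | NA
Amotivation (Fruits & Vegetables) | NUT | T1 | 21 | 2.35 | .26 | 2.23 | .39 | 2.25 | .29 | 2.33 | .52
Amotivation (Fruits & Vegetables) | NUT | Δ | 21 | .15 | .39 | .42 | .38 | .26 | .39 | .33 | .52
Amotivation (Fruits & Vegetables) | NUT+ST | T1 | 16 | 2.85 | .34 | 2.67 | .44 | 2.81 | .40 | 2.67 | NA
Amotivation (Fruits & Vegetables) | NUT+ST | Δ | 15 | -.16 | .35 | .02 | .35 | -.10 | .43 | .00 | NA
Body Composition
BMI Z-score | CON | T1 | 16 | 2.05 | .17 | 2.07 | .12 | 2.08 | .19 | 2.09 | NA
BMI Z-score | CON | Δ | 16 | .03 | .03 | -.01 | .01 | .01 | .02 | -.02 | NA
BMI Z-score | NUT | T1 | 21 | 1.99 | .11 | 2.03 | .13 | 2.00 | .13 | 2.09 | .18
BMI Z-score | NUT | Δ | 21 | -.02 | .03 | -.02 | .03 | -.02 | .03 | -.01 | .04
BMI Z-score | NUT+ST | T1 | 17 | 2.18 | .14 | 2.24 | .17 | 2.19 | .17 | 2.36 | .26
BMI Z-score | NUT+ST | Δ | 17 | .01 | .04 | .04 | .01 | .02 | .04 | .03 | .05
Fat Weight (kg) | CON | T1 | 16 | 34.89 | 4.98 | 29.91 | 3.87 | 33.19 | 5.66 | 28.45 | NA
Fat Weight (kg) | CON | Δ | 16 | .08 | 1.54 | -.14 | 1.29 | -.44 | 1.21 | .40 | NA
Fat Weight (kg) | NUT | T1 | 21 | 32.25 | 2.81 | 31.59 | 3.33 | 32.12 | 3.22 | 3.50 | 4.46
Fat Weight (kg) | NUT | Δ | 21 | -.11 | 1.64 | -.68 | 1.15 | -.73 | 1.33 | -.20 | 1.42
Fat Weight (kg) | NUT+ST | T1 | 17 | 37.70 | 3.69 | 35.67 | 5.04 | 36.72 | 4.14 | 31.90 | 8.06
Fat Weight (kg) | NUT+ST | Δ | 16 | -1.74 | 2.47 | -.08 | .81 | -.42 | 1.79 | .10 | NA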
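Table 19, Continued
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Body Composition
Lean Weight (kg) | CON | T1 | 16 | 57.66 | 2.59 | 56.49 | 2.41 | 56.89 | 2.77 | 56.55 | NA
Lean Weight (kg) | CON | Δ | 16 | .43 | 1.47 | .78 | 1.15 | .84 | 1.04 | .30 | NA
Lean Weight (kg) | NUT | T1 | 21 | 55.51 | 2.02 | 56.04 | 2.63 | 55.88 | 2.20 | 58.60 | 3.46
Lean Weight (kg) | NUT | Δ | 21 | 1.18 | 1.00 | .89 | 1.17 | 1.04 | .93 | .57 | 1.57
Lean Weight (kg) | NUT+ST | T1 | 17 | 57.61 | 3.02 | 56.31 | 3.69 | 57.03 | 3.51 | 58.70 | 7.53
Lean Weight (kg) | NUT+ST | Δ | 16 | .67 | 1.32 | 1.28 | .83 | 1.02 | 1.12 | 1.05 | NA
Glucose & Insulin
SI | CON | T1 | 15 | 1.82 | .32 | 1.67 | .34 | 1.71 | .30 | 1.58 | NA
SI | CON | Δ | 15 | .08 | .29 | -.02 | .14 | .00 | .26 | .02 | NA
SI | NUT | T1 | 21 | 1.58 | .18 | 1.53 | .21 | 1.55 | .21 | 1.51 | .27
SI | NUT | Δ | 20 | .23 | .17 | .31 | .13 | .32 | .12 | .29 | .16
SI | NUT+ST | T1 | 17 | 1.46 | .23 | 1.32 | .21 | 1.37 | .22 | 1.19 | .61
SI | NUT+ST | Δ | 17 | -.01 | .19 | .08 | .21 | .08 | .16 | -.03 | .25
AIR | CON | T1 | 16 | 1308.48 | 232.73 | 1076.63 | 13.14 | 117.77 | 176.31 | 1005.80 | NA
AIR | CON | Δ | 16 | 96.30 | 142.37 | 11.04 | 10.53 | 35.70 | 107.13 | -59.40 | NA
AIR | NUT | T1 | 21 | 1118.16 | 189.77 | 918.25 | 182.06 | 996.22 | 197.68 | 877.70 | 246.77
AIR | NUT | Δ | 20 | 62.48 | 9.49 | 4.37 | 73.39 | 29.28 | 93.92 | -16.50 | 136.58
AIR | NUT+ST | T1 | 17 | 1529.98 | 211.11 | 1409.32 | 194.04 | 1415.34 | 177.03 | 1439.80 | 644.08
AIR | NUT+ST | Δ | 17 | 77.02 | 185.15 | -91.23 | 93.84 | -49.01 | 131.50 | -115.30 | 578.65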
Table 19, Continued
Variable | Group | Time | N | M | SE | M_T20 | SE_T20 | M_T10 | SE_T10 | Mdn | SE_Mdn
Glucose & Insulin
DI | CON | T1 | 15 | 1944.40 | 276.62 | 1928.07 | 284.53 | 1947.81 | 318.14 | 1734.90 | NA
DI | CON | Δ | 15 | -146.96 | 234.49 | -133.93 | 282.93 | -105.66 | 264.70 | -61.20 | NA
DI | NUT | T1 | 21 | 126.35 | 132.12 | 1226.71 | 152.15 | 1225.46 | 131.68 | 1175.00 | 203.43
DI | NUT | Δ | 20 | 425.89 | 191.84 | 219.60 | 133.19 | 291.67 | 171.21 | 307.44 | 277.66
DI | NUT+ST | T1 | 17 | 1832.56 | 196.15 | 1818.47 | 235.27 | 1837.19 | 231.06 | 1841.40 | 359.38
DI | NUT+ST | Δ | 17 | 17.36 | 256.16 | 282.67 | 213.34 | 175.99 | 291.96 | 286.00 | 427.61
Appendix E. Effect Sizes
Table 20. Effect Sizes (akp.effect function)
Variable    δ_R (M)    δ_R (T20 M)    δ_R (T10 M)
Bench
Con vs. N .58 .72 .80
Con vs. N+ST .39 .46 .49
Nut vs N+ST .49 .58 .61
Leg
Con vs. N .60 .67 .72
Con vs. N+ST .39 .46 .48
Nut vs N+ST .50 .57 .60
Total Energy (kcal)
Con vs. N .54 .65 .72
Con vs. N+ST .68 .78 .87
Nut vs N+ST .81 .94 1.06
Total Fiber (g)
Con vs. N .50 .55 .56
Con vs. N+ST .41 .44 .47
Nut vs N+ST .55 .60 .65
Total Sugar (g)
Con vs. N .68 .76 .80
Con vs. N+ST .73 .84 .88
Nut vs N+ST .72 .80 .87
Added Sugar (g)
Con vs. N .66 .70 .73
Con vs. N+ST .68 .65 .66
Nut vs N+ST .84 .77 .80
Relative Autonomy Index (Exercise)
Con vs. N .85 .83 .90
Con vs. N+ST .75 .81 .89
Nut vs N+ST .58 .59 .60
Table 20, Continued
Variable    δ_R (M)    δ_R (T20 M)    δ_R (T10 M)
Autonomous (Fruits & Vegetables)
Con vs. N 1.18 1.39 1.61
Con vs. N+ST 1.02 1.11 1.23
Nut vs N+ST 1.05 1.19 1.32
Controlled (Fruits & Vegetables)
Con vs. N 1.85 1.52 1.67
Con vs. N+ST 2.09 1.77 1.70
Nut vs N+ST 1.48 1.25 1.38
Amotivation (Fruits & Vegetables)
Con vs. N 1.28 1.39 1.37
Con vs. N+ST 1.48 1.52 1.57
Nut vs N+ST 1.34 1.29 1.41
SI
Con vs. N 2.03 1.91 1.62
Con vs. N+ST 1.96 2.18 1.82
Nut vs N+ST 1.96 1.98 1.71
AIR
Con vs. N .54 .55 .45
Con vs. N+ST .56 .59 .54
Nut vs N+ST .64 .70 .64
DI
Con vs. N .57 .59 .60
Con vs. N+ST .43 .51 .53
Nut vs N+ST .45 .46 .49
BMI Z-score
Con vs. N 2.78 3.33 3.55
Con vs. N+ST 2.52 3.02 3.47
Nut vs N+ST 2.66 3.16 3.44
Table 20, Continued
Variable    δ_R (M)    δ_R (T20 M)    δ_R (T10 M)
Total Fat (kg)
Con vs. N .75 .78 .76
Con vs. N+ST .68 .76 .82
Nut vs N+ST .74 .71 .75
Total Lean Tissue (kg)
Con vs. N -7.97 -8.07 -7.25
Con vs. N+ST -9.66 -9.70 -7.29
Nut vs N+ST -8.79 -8.32 -7.46
Appendix F. Fit Indices for Mixed Models
Bayesian Information Criterion (BIC)
\[ BIC = -2 \log L + r \ln n \]
r : the number of free parameters in the model
(Kass & Raftery, 1993; B. Muthen et al., 2002; Schwarz, 1978)
Bootstrapped Likelihood Ratio Test (BLRT)
\[ \mathrm{LRT} = 2 \times \left[ \log L(\mathrm{Model}_{K}) - \log L(\mathrm{Model}_{K-1}) \right] \]
K = number of classes; bootstrapping samples determine the \(\chi^2\) distribution
(McLachlan & Peel, 2000; L. K. Muthen & Muthen, 1998)
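The final step of a bootstrapped likelihood ratio test can be sketched as follows. This is a minimal sketch, not the procedure used by any particular software: it assumes the bootstrap LRT values have already been obtained by simulating data under the (K - 1)-class model and refitting both models to each simulated sample, which is not shown. The function names and the add-one p-value convention are assumptions.

```python
def lrt_statistic(loglik_k, loglik_k_minus_1):
    """LRT = 2 * [log L(Model_K) - log L(Model_{K-1})]."""
    return 2.0 * (loglik_k - loglik_k_minus_1)


def bootstrap_p_value(observed_lrt, bootstrap_lrts):
    """Share of bootstrap LRT values at least as large as the observed one.

    Uses the (1 + count) / (B + 1) convention (an assumption) so that the
    estimated p-value is never exactly zero.
    """
    exceed = sum(1 for value in bootstrap_lrts if value >= observed_lrt)
    return (1 + exceed) / (len(bootstrap_lrts) + 1)


# Hypothetical log-likelihoods and bootstrap draws, for illustration only.
observed = lrt_statistic(loglik_k=-240.0, loglik_k_minus_1=-250.0)
p_value = bootstrap_p_value(observed, [1.0, 5.0, 25.0, 3.0])
```

A small p-value suggests that the K-class model fits better than the (K - 1)-class model by more than would be expected if the smaller model were true.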
Akaike’s Information Criterion (AIC)
\[ \mathrm{AIC} = \chi^2 + 2q \]
q: the number of unknown parameters being solved for
(Akaike, 1987; Loehlin, 1998)
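For completeness, the chi-square form of the AIC can be sketched in Python as below. The function name and the input values are hypothetical and serve only to show the arithmetic.

```python
def aic_from_chi_square(chi_square, n_unknown_params):
    """AIC = chi-square + 2q; smaller values indicate a better trade-off."""
    return chi_square + 2 * n_unknown_params


# Hypothetical fit statistics for two models (illustrative values only).
aic_a = aic_from_chi_square(chi_square=30.0, n_unknown_params=8)
aic_b = aic_from_chi_square(chi_square=25.0, n_unknown_params=12)
```

Here the second model lowers the chi-square by 5 but adds four parameters, a penalty of 8, so the first model has the lower AIC.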
Entropy
\[ E_k = 1 - \frac{\sum_{k} \sum_{i} \left( -\hat{p}_{ik} \ln \hat{p}_{ik} \right)}{n \ln K} \]
\(\hat{p}_{ik}\): estimated conditional probability for individual i in class k
(Nagin, 1999; Ramaswamy, DeSarbo, Reibstein, & Robins, 1993)
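The entropy index can be computed directly from a matrix of estimated class probabilities. The sketch below assumes one row per individual and one column per class, with each row summing to 1; the function name is hypothetical, and terms with a probability of zero are treated as contributing nothing to the sum.

```python
import math


def relative_entropy(posteriors):
    """Entropy index: 1 - sum(-p_ik ln p_ik) / (n ln K).

    Values near 1 indicate clear class separation; values near 0 indicate
    that individuals are spread almost evenly across classes.
    """
    n = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p ln p as p -> 0 is 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(n_classes))


# Two extreme cases with two individuals and two classes.
perfect = relative_entropy([[1.0, 0.0], [0.0, 1.0]])
uninformative = relative_entropy([[0.5, 0.5], [0.5, 0.5]])
```

The two extreme cases bracket the index: perfectly separated classes give a value of 1, and completely uninformative probabilities give a value of 0.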
Abstract
When the findings of a randomized clinical trial are null, yet the assumptions of the statistical model are not met, is it appropriate to conclude that the intervention has no effect? The purpose of this study is to examine statistical method choice for the analysis of a randomized clinical trial of a strength training and nutrition intervention in a sample of overweight Latino adolescents (N = 54). Results of analysis of covariance (ANCOVA) models were overwhelmingly null; however, there were several concerns about the underlying assumptions. Two non-traditional statistical approaches that do not carry the same assumptions as traditional ANCOVA were used to reanalyze these data. The first approach uses a robust analog of ANCOVA that imposes fewer restrictions and can increase power while maintaining the Type I error rate, even with small samples. With these robust techniques, the conclusions regarding the effectiveness of the intervention differed markedly from those of traditional ANCOVA: several significant intervention effects were found. In the second approach, developed for this study, a hybrid of two common latent profile analysis models was created to profile which participants benefited from the intervention. This model tests whether the sample is homogeneous. With this model, it was shown that gender and pre-test values had more influence on outcomes than the intervention did, and the intervention appeared to modify these influences. These results suggest that using traditional ANCOVA models in the face of assumption violations may lead researchers to miss important effects of an intervention. Expanding the scope of standard techniques for analyzing randomized clinical trials would likely result in a different literature landscape for many disciplines, though the acceptability of results from such techniques poses challenges for publishing papers that use them.
Asset Metadata
Creator
Lane, Christianne Joy (author)
Core Title
The impact of statistical method choice: evaluation of the SANO randomized clinical trial using two non-traditional statistical methods
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Psychology
Publication Date
07/07/2009
Defense Date
05/11/2009
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
latent profile analyses,OAI-PMH Harvest,robust ANCOVA,Sano
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Wilcox, Rand R. (committee chair), Azen, Stanley Paul (committee member), Lu, Zhong-Lin (committee member), Read, Stephen J. (committee member)
Creator Email
christiannelane@yahoo.com,clane@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m2330
Unique identifier
UC152733
Identifier
etd-Lane-2893 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-567072 (legacy record id),usctheses-m2330 (legacy record id)
Legacy Identifier
etd-Lane-2893.pdf
Dmrecord
567072
Document Type
Dissertation
Rights
Lane, Christianne Joy
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu