Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Comparing robustness to outliers and model misspecification between robust Poisson and log-binomial models
(USC Thesis Other)
Comparing robustness to outliers and model misspecification between robust Poisson and log-binomial models
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
COMPARING ROBUSTNESS TO OUTLIERS AND MODEL MISSPECIFICATION BETWEEN ROBUST POISSON AND LOG-BINOMIAL MODELS by Wansu Chen ______________________________________________________ A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BIOSTATISTICS) December 2015 Copyright 2015 Wansu Chen i Acknowledgements I would like to extend my deep appreciation to Dr. Stanley Azen for his tremendous encouragement and support from the very beginning of my study at USC. I can still remember the conversation with Dr. Azen about 6 years ago regarding the opportunity of studying at USC. For a very long time, I was considering going back to school to actualize an unfulfilled dream. It was Dr. Azen’s encouragement and confidence in me that made all this possible. My deepest thanks goes to Dr. Meredith Franklin, the co-Chair of my dissertation committee for her advice, encouragement and facilitation. I would also like to thank other members of my dissertation and/or qualifying examine committees (Dr. Mark Krailo, Dr. Rand Wilcox, and Dr. Myles Cockburn) for their very useful scholarly advice. I would also like to express my appreciation to my family for their love, support and encouragement. I am also grateful to other faculties and staff of the Biostatistics Division of the Department of Preventive Medicine for their advice and assistance. ii Table of Contents Acknowledgements ........................................................................................................ i List of Tables ............................................................................................................... iv List of Figures .............................................................................................................. vi Abbreviations .............................................................................................................. vii Abstract ...................................................................................................................... viii Chapter 1 Introduction ...................................................................................................1 1.1 Background ..................................................................................................1 1.2 Rationale ......................................................................................................2 1.2.1 Robustness to outliers ...................................................................2 1.2.2 Robustness to model misspecification ..........................................2 1.3 Research contribution ..................................................................................4 1.4 Organization of the chapters ........................................................................7 Chapter 2 Estimation......................................................................................................8 2.1 General description of generalized linear models (GLM) ...........................8 2.2 Log-binomial regression ..............................................................................8 2.3 Robust Poisson regression .........................................................................13 2.3.1 Quasi-likelihood methods ...........................................................13 2.3.2 Using quasi-likelihood methods and Poisson Distribution to model a binary outcome .....................................14 2.4 Comparison of the two estimation methods...............................................17 Chapter 3 Model Comparisons with Respect to Robustness to Outliers and Model Misspecification Using Simulation ..................................................................18 3.1 Robustness to outliers ................................................................................18 3.1.1 Simulation methods ....................................................................18 3.1.2 Measures of model performance .................................................21 3.1.3 Simulation results........................................................................22 3.1.4 Discussion ...................................................................................30 3.1.5 Citation of the published paper ...................................................35 3.2 Robustness to model misspecification .......................................................36 3.2.1 Simulation methods ....................................................................36 3.2.2 Design and generation of simulation datasets .............................45 3.2.3 Measures of model performance .................................................46 3.2.4 Simulation results........................................................................47 3.2.5 Discussion ...................................................................................60 iii Chapter 4 Applications ................................................................................................65 4.1 A study of relationship between FeNO and asthma impairment ...............65 4.2 A study of relationship between EOS and asthma exacerbation ...............69 Chapter 5 Conclusions and Future Research Directions ..............................................73 Bibliography ................................................................................................................77 Appendix I ...................................................................................................................81 Appendix II ..................................................................................................................83 Appendix III .................................................................................................................84 Appendix IV.................................................................................................................85 Appendix V ..................................................................................................................88 iv List of Tables Table 1-1: Medline search exclusion criteria and number of articles excluded ..................................................................5 Table 1-2: Number of publications in 2000-2014 using log-binomial and/or robust Poisson models by study type and outcome rate ...................................................................6 Table 3-1: Relative Bias (%) in log scale in the presence of outliers (n = 1500)………………………………….…24 Table 3-2: Standard error in log scale in the presence of outliers (n = 1500)………………...……………..……27 Table 3-3: Mean square error (MSE) in log scale in the presence of outliers (n = 1500)……………………..……………...28 Table 3-4: Expected values of the contaminated outcomes (Y) by the original outcome (10%, 25% and 40%)……………………..……..……………...30 Table 3-5: Parameters of DS1-DS3 for generating Yk for unexposed subjects (k = 1, 2, 3 and 4) .................................................40 Table 3-6: Parameters of DS4-DS6 for generating Yk for unexposed subjects (k = 1, 2, 3 and 4) .................................................42 Table 3-7: Parameters of DS7-DS8 for generating Yk for unexposed subjects (k = 1, 2, 3 and 4) .................................................44 Table 3-8: Relative bias (%) in log scale with and without model misspecification (n=1,000) .............................................................49 Table 3-9: Standard error in log scale with and without ..............................................53 model misspecification (n=1,000) Table 3-10: Mean square error in log scale with and without model misspecification (n=1,000) ...........................................................54 Table 4-1: Comparison of robust Poisson and log-binomial models in estimating relative risks (RR) of >7 SABA canisters dispensed in the past year………………………………………68 v Table 4-2: Comparison of robust Poisson and log-binomial models in estimating relative risks (RR) of asthma exacerbation in subsequent year ......................................................................................72 Table III-1: Description of 24 Scenarios for Data Generation ..................................... 84 Table IV-1: Relative bias (%) in log scale in the presence of outliers (n=500)...........85 Table IV-2: Standard error in log scale in the presence of outliers (n=500) ...............86 Table IV-3: Mean square error in log scale in the presence of outliers (n=500) .........87 Table V-1: Relative bias (%) in log scale with and without model misspecification (n=500) ...............................................................88 Table V-2: Standard error in log scale with and without model misspecification (n=500) ..........................................................................89 Table V-3: Mean square error in log scale with and without model misspecification (n=500)................................................................90 vi List of Figures Figure 3-1: Relative bias (%) in log scale in the presence of outliers (n=1,000) ........................................................25 Figure 3-2: Relative bias (%) in log scale when the models were correctly or incorrectly specified (n=1,000) .............................................51 Panel A: Distribution of P(Y=1) with varying max P(Y=1) ....................................51 Panel B: Distribution of P(Y=1) with varying beta of Z2 ........................................51 Panel C: Distribution of P(Y=1) with varying underlying distribution of of simulated data .......................................................................................51 Figure 3-3A: Distribution of P(Y=1) with varying max P(Y=1) .................................56 Figure 3-3B: Distribution of P(Y=1) with varying beta of Z2 .....................................57 Figure 3-3C: Distribution of P(Y=1) with varying underlying distribution of simulated data.............................................................................................................. 59 vii Abbreviations RR: Relative risk OR: Odds ratio CI: Confidence interval GLM: Generalized linear model GEE: Generalized estimating equation ZIP: Zero-inflated Poisson MLE: Maximum likelihood estimator IRLS: Iterative reweighted least square IWLS: Iterative weighted least square SE: Standard error MSE: Mean square error NLP: Non-linear programming FeNO: Fractional exhaled nitric oxide SABA: Short-acting beta-agonist ACT: Asthma control test EOS: Eosinophil counts HEDIS: Healthcare Effectiveness Data and Information Set GINA: Global Initiative of Asthma viii Abstract To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Both models are commonly used in medical research including clinical trials. However, the performance of the two models under model misspecification has not been well studied. In this dissertation, I first explore the theories behind the models and derive the equations to estimate beta (log of relative risk (RR)) from log- binomial models iteratively. Next, I demonstrate the lack of robustness of the log- binomial models under the presence of outliers and various types of model misspecification using two simulation studies. Finally, to examine the differences in the models in an applied setting, I apply both models to two asthma studies. The findings of the outlier study indicate that for data coming from a population where the true relationship between the outcome and the continuous covariate contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The findings of the model mis- specification study suggest that the point estimates of the log-binomial models can be biased when the underlying data fail to follow a log distribution (either the link function was mis-specified or the response depends on covariates through a non-linear combination of predictors). In conclusion, the robust Poisson models are more robust to outliers and model misspecification compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. From this, it is ix recommended that for large samples one should consider a robust Poisson model. For small to mid-sized samples, users may consider running both models. If the point estimates are comparable, use the results from the log-binomial model because the estimates from the log-binomial models may be more efficient (narrower confidence intervals). When applied to the first asthma study, the two models yielded large differences, with the largest difference in point estimates being 19%. The differences in point estimates in the 2 nd asthma study were small. However, the difference changed the interpretation for one of the covariates. 1 Chapter 1. Introduction 1.1 Background When the outcome of a study is binary, the most common method to estimate the effect(s) is to calculate an odds ratio (OR) as an estimate of relative risk (RR) using a logistic regression. When the outcome prevalence is high (>10%), the OR can still be estimated by using the logistic regression model, but the OR is no longer an acceptable estimate for RR (Greenland, 1987). Interpreting the OR as being equivalent to the RR still occurs in medical research, leading to overstated effect in the study findings (Agrawal, 2005; Mirjam et al., 2012). The degree of overstatement depends on the outcome rate (e.g., disease prevalence). A higher outcome rate leads to higher degree of exaggeration. Zhang and Yu proposed a formula to convert the adjusted OR derived from the logistic regression model to a risk ratio in studies with common outcomes (Zhang and Yu, 1998). However, the method was noted by McNutt et al. to produce biases in both point estimates and confidence intervals (McNutt et al., 2003). Miettinen suggested the doubling-of-cases method to estimate OR as an approximation of RR using logistic regression based on a modified dataset (Miettinen, 1982). Schouten et al. improved the method by applying robust standard errors (Schouten et al., 1993). However, this method was mentioned by Skov et al. to produce prevalence greater than one (Skov et al., 1998). Several model-based methods have been proposed to estimate RR and its confidence interval (CI) directly (Greenland, 2004). The most popular ones are the robust (also known as modified) Poisson model (Barros and Hirakata, 2003; Zou, 2004) and the log- binomial model (Skov, et al., 1998; Barros and Hirakata, 2003; Wacholder, 1986). The 2 performance based on simulations seemed to be equally good between the log-binomial model and the robust Poisson model (Mirjam, et al., 2012; Barros and Hirakata, 2003; Zou, 2004; Petersen and Deddens, 2008) when sample sizes are reasonably large. Out of the two models, it was reported that the robust Poisson may be less affected by outliers compared to the log-binomial method (Lumley et al., 2014). However, to my knowledge there is no study comparing the two models when the models are mis-specified. 1.2 Rationale 1.2.1 Robustness to outliers Outliers appear frequently in data of real life. Users are not always able to detect and remove outliers due to methodological limitations. For example, robust methods to detect outliers especially high leverage points for logistic regression models are available in popular statistical software packages (Hosmer and Lemeshow, 2000; Imon and Hadi, 2013; Syaiba and Habshah, 2013). However, similar approaches for log-binomial models are not yet available in commercial or non-commercial software. Thus, understanding the impact of outliers on model performance can provide guidance in terms of appropriate model selection. The purpose of the simulation study is to evaluate the performance of the log-binomial models and the robust Poisson methods in several scenarios when outliers exist. 1.2.2 Robust to model misspecification Similar to outlier detection, model misspecification is difficult to detect. Although various tests have been attempted to test for model misspecification, the power 3 of these tests were not satisfactory (Blizzard and Hosmer, 2006; White, 1982; Presnell and Boos, 2004; Capanu and Presnell, 2008). Three tests to assess model fit achieved reasonable type I errors. However, the power was low to moderate (Blizzard and Hosmer, 2006). Therefore, understanding the impact of model specification is critical to guide the selection of the most appropriate model. First, this study will provide a closer look on the impact of model misspecification by exploring the theories supporting each model. The theories described in Chapter 2 provide an evidence to support a higher level of sensitivity to model-misspecification when comparing log-binomial models to robust Poisson models. The mathematical theories behind the two regression models and the cause(s) of the differences in point estimates will be presented and discussed in the manuscript resulting from this study. Next, I will take a more systematic approach to demonstrate the differences between the two models when the models are mis-specified. As presented in Chapter 2, three assumptions need to be held for GLM to work. The design of the simulation was to violate two out of the three assumptions. The first approach to generate model mis- specification is to alter the link function (e.g. using robust Poisson model or log-binomial models (both with link function=”log”) to estimate effects while the underlying data follow logit or probit distribution). Another approach to cause model mis-specification is to change the distribution of the linear predictor (e.g. truncating the distribution of linear predictors). 4 1.3 Research contribution A Medline search of articles published between 2005 and 2014 was conducted to understand the popularity of the log-binomial and the robust Poisson models. The reason that I started from 2005 is because the modified or robust Poisson model was not commonly recognized as an approach to estimate risk ratio before Zou’s paper was published in 2004 (Zou, 2004), in which the method was applied using SAS. The search phrases were “log binomial regression” for log-binomial models, and (“modified Poisson regression”) OR (“robust Poisson regression”) for robust Poisson models. This resulted in the same results of Boolean search based on the following search terms: “log” AND “binomial” AND “regression” for log-binomial models, and (“modified” AND “Poisson” AND “regression”) OR (“robust” AND “Poisson” AND “regression”) for robust Poisson models. The Medline search engine is designed to find papers with the matching keywords defined by the authors, even if the key words do not appear consecutively or not at all in the abstracts or in the main body of the articles. The search was limited to “human” studies. A total of 461 and 564 articles were initially identified for log-binomial regression models and for modified or robust Poisson regression models, respectively. A manual review was performed to exclude studies that utilized statistical methods that are not related to log-binomial or robust Poisson models (e.g. negative binomial regression models or zero-inflated Poisson regression models). Studies that applied log-binomial or robust Poisson models for purposes other than estimating risk ratios were also excluded (e.g. Poisson regression models to estimate counts, rates, rate ratios or differences). 5 Moreover, extensions of log-binomial models or robust Poisson models to handle repeated or clustered measures were removed because the use is beyond the scope of the current study. The reasons for exclusion and the number of articles being excluded were listed in Table 1-1. Table 1-1. Medline search exclusion criteria and number of articles excluded “log-binomial regression” Negative binomial regression models Statistical methods research Log-binomial models with generalized estimating equation (GEE) Logistic regression models Log-binomial with a hierarchical approach Other (i.e. identified because of the appearance of key words) 36 24 23 9 5 4 “robust Poisson regression” or “modified Poisson regression” Poisson regression models to estimate counts, rates, rate ratios or differences Poisson two stage, multi-level, hierarchical, random effect or mixed effect models Robust or modified Poisson models with generalized estimating equation (GEE) Statistical methods research on robust or modified Poisson models Statistical methods research on other types of Poisson models (excludes ZIP) Zero-inflated Poisson (ZIP) Regression with distributions other than Poisson (e.g. Logistic) Poisson regression to estimate risk ratio. Unclear how the variance was estimated Other with the key words including 1 review paper and 2 papers on meta- analysis 73 27 16 12 15 8 9 8 20 Table 1-2 displayed the number of articles that utilized log-binomial regression models and robust Poisson regression models per year between 2005 and 2014 after the exclusions listed in Table 1-1. The volumes increased significantly in the past 11 years despite the fact that not much guidance has been provided in medical literature in terms of which model to choose. Three studies performed both and thus they were counted in both columns in Table 1-2. Four out of 360 (376) studies in which log-binomial (robust 6 Poisson) models were applied were clinical trials. Given the popularity of both models, it is critical to understand the performance of both models and to use the information to guide the selection of models in future medical/health research. Table 1-2. Number of publications in 2005-2014 using log-binomial and/or robust Poisson models Year Log-binomial models 1 Robust Poisson models 2 2005 8 2 2006 10 9 2007 16 8 2008 20 21 2009 29 20 2010 39 27 2011 38 46 2012 58 70 2013 67 88 2014 75 85 Total 360 376 1. Four out of 360 were clinical trials and the rest were observational studies. 2. Four out of 376 were clinical trials and the rest were observational studies. My research in this area will contribute to the overall knowledge in the following three areas: (1) The mathematical foundation that supports and details the key differences between the log-binomial and robust Poisson models. (2) The statistical robustness of the log-binomial and robust Poisson models to outliers and model misspecifications through simulation studies. (3) Recommendations as to which model to select under various circumstances when estimating relative risks (risk ratios) for common binary outcomes. 7 1.4 Organization of the chapters In Chapter 2 the estimation methods for log binomial and robust Poisson models are presented; in Chapter 3 the models are compared with respect to their robustness to outliers and model misspecification using simulation; in Chapter 4, the models are applied to two real-life datasets in order to demonstrate the potential differences in point estimates between the two models. Recommendations for application of these models in general are made. 8 Chapter 2. Estimation Methods 2.1 General description of generalized linear models (GLM) Generalized linear models (GLM) originated from a significant extension of traditional linear regression models (McCullagh and Nelder, 1989). A GLM consists of three components: (1) A random component, specifying the conditional distribution of the response variable (Yi) from an exponential family, given the values of the explanatory variables in the model, (2) a linear predictor, which is a linear function of the explanatory variables, ηi = α + β1Xi1 + β2Xi2 +· · ·+βkXik, and (3) a smooth and invertible link function g(·), which transforms the expectation of the response variable, μi ≡ E(Yi ), to the linear predictor: g(μi ) = ηi = α + β1Xi1 + β2Xi2 +· · ·+βkXik. For example, in a logistic regression model, the link function is log ( 1 i i µ µ − ). In a Poisson model, the link function is log ( i µ ), where i µ = E(Yi )=exp( i η ) = exp(α + β1Xi1 + β2Xi2 +· · ·+βkXik). The link function for a log-binomial model is also log ( i µ ); however, i µ , in this case, is often denoted as pi, because E(Yi ) is a probability with a value between 0 and 1. Any violation of the above assumptions could cause the issues of model misspecification. 2.2 Log-binomial regression The maximum likelihood approach is used for estimating parameters of log- binomial regression models (Skov et al., 1998; Wacholder, 1986). In the GLM framework, the conditional distribution of the response variable (Yi), given the values of the explanatory variables in the model, follows the binomial distribution with the mean 9 response related to the explanatory variables by the link function ‘log’. Although there are other methods to obtain efficient estimators, the maximum likelihood approach is an intuitive approach to obtain asymptotically efficient estimators. The theory and the process of finding MLEs of log-binomial models are presented below. 10 Let’s define a binary outcome, i y , with means ( 1|x) () i i i Prob y p β = = , where 12 x ' ( , ,..., ) i i i iK xx x = is the row vector comprised of the K covariates of the i-th individual, 01 ( , ,..., ) K β ββ β = are regression parameters and i=1,2,…,n. In a log-binomial model, log( ( )) x ' ii pββ = and since 0 1,x ' 0 ii p β ≤≤ ≤ (constrained), The log-likelihood is given by 11 ( ) log( ( )) (1 )log(1 ( )) nn ii i i ii y p y p β β β = = = +− − ∑ ∑ 11 () ( ()) (1 ) ( ()) () (1 ()) nn i i i i ii ji j i j y dp y dp pd p d ββ β β ββ β β = = ∂− = ⋅− ⋅ ∂− ∑ ∑ 1 (1 ) ( ( )) () (1 ()) n i ii i ii j y y dp p p d β β ββ = − = −⋅ − ∑ 1 ( ( )) ( ( )) () (1 ()) n ii i i ii j y p dp p pd ββ β ββ = − = ⋅ ⋅− ∑ Since 0 1 ( ) exp(x ) exp( ) k i i ij j j p x β ββ β = ′ = = + ∑ , 0 1 ( ( )) exp( ) ( ) k i ij j ij i ij i j dp x xp x d β ββ β β = = + ⋅= ⋅ ∑ , and thus 1 () ( ()) () () (1 ()) n ii i ij i ji i yp px pp ββ β β β β = ∂− = ⋅⋅ ∂ ⋅− ∑ 1 ( ( )) (1 ( )) n ii ij i i yp x p β β = − = ⋅ − ∑ , where j=1,..,K. (1) To find the expected second derivatives, we can first rearrange the 1 st derivatives derived above. 1 1 () ( ()) ( 1) (1 ()) (1 ( )) (1 ( )) n n ii i i ij ij i i ji i yp y p xx pp ββ β ββ β = = ∂ − − + − = ⋅= ⋅ ∂− − ∑ ∑ 11 11 1 1 () nn i ij ij ii i y xx p β = = − = ⋅+ − ∑∑ 2 1 () 1 ( )0 1 () n i ij i jl l i y x p β ββ β β = ∂∂ − = ⋅+ ∂∂ ∂ − ∑ 2 1 1 ( ( )) ( 1) (1 ( )) n i i ij i i l dp yx pd β ββ = = −⋅ ⋅ ⋅ − ∑ Since 0 1 ( ( )) exp( ) ( ) k i il l il i il l l dp x xp x d β ββ β β = = + ⋅= ⋅ ∑ , 2 22 11 () 1 ( 1) () ( 1) ( ) (1 ( )) (1 ( )) nn i i i ij i il ij il i i jl i i yp y x p x xx pp ββ β ββ β β = = ∂− = −⋅ ⋅ ⋅ ⋅ = ⋅ ∂∂ − − ∑∑ , and thus 2 1 () () () 1 () n i ij il i jl i p E xx p ββ ββ β = ∂ − = ∂∂ − ∑ , where 1≤j, l≤K (2) Statistical software utilizes iterative reweighted least squares (IRLS) approach or variations of IRLS to find MLEs for generalized linear models. IRLS is also referred to as iterative weighted least squares (IWLS). Fisher scoring tells us that the MLE for β can be carried out by the following iteration. ( 1) () () 1 () [ ''( )] '( ) t t t t E ββ β β +− = +− ⋅ , where is the log-likelihood function for the entire sample and ' ( '' ) is the first (second) derivatives. (3) Now rearrange the step of (3) above to make it resemble IRLS. First rewrite it as ( 1) () 1 () () () [ ''( )] {[ ''( )] '( )} t t tt t EE β β ββ β + − = −− + (4) 12 In the matrix form, () ( ()) '( ) ' () YP XW P ββ β ββ ∂− = = ⋅ ∂ and ''( ) ' E X WX β −= (see the proof in Appendix I), where W (referred to as weight) = ( ) () 1 ( ) i i p Diag p β β − . Putting it all together, (4) above becomes ( 1) 1 (' ) (' ) t X WX X Wz β + − = , where () () () ( ( )) () t t t YP zX P β β β − = + . (5) The iteration based on (5) continues until β stabilizes. The estimated variance-covariate matrix for ˆ β is the final 1 (' ) X WX − after β stabilizes. The SE’s of ˆ β are the square-roots of the diagonal elements of this matrix. The analyses can be performed using SAS’s PROC GENMOD, R’s GLM or STATA’s GLM procedure. However, for many situations in which continuous covariates exist, the MLE can be on the boundary of the parameter space (i.e. the predicted probability of the outcome equals to 1), leading to the difficulty of finding the MLE. To address the non-convergence issue, a SAS macro called “COPY” was developed (Deddens et al., 2003) and later enhanced (Petersen and Deddens, 2009) to increase the chance of finding an approximate MLE inside the interior of the parameter space using PROC GENMOD. The COPY method uses the results of the log-binomial regression if the model converges. When the log-binomial regression fails to converge, the COPY 13 method creates C (usually a large number) copies of the original data and one copy of the original data with the outcome variable reversed (Y converted to 1-Y) and then takes the modified data to generate the estimates (Deddens et al., 2003; Petersen and Deddens, 2009). The choice of C impacts the accuracy and the efficiency of the estimates as well as the model convergence. A larger C produces results that are closer to MLEs yet leads to a higher level of difficulty in finding the MLEs due to failure of convergence (Petersen and Deddens, 2009). 2.3 Robust Poisson regression 2.3.1 Quasi-likelihood method The quasi-likelihood is a technique used to estimate regression coefficients without fully specifying the distribution of the observed data (Wedderburn 1974, McCullagh, 1983; McCullagh and Nelder, 1989). Assume that there are n uncorrelated observations with ( ) () i E Yi µβ = and () ( ) ii Var Y V φ µ = ⋅ where 12 x ' ( , ,..., ) i i i iK xx x = is the row vector comprised of the K covariates of the i-th individual, 01 ( , ,..., ) K β ββ β = are regression parameters, φ is the dispersion parameter, V is any form of variance function and i=1,2,…,n. Please note that the assumptions above do not require the specification of the distribution of the conditional outcome variable ( i y ). Only a relationship between the mean and the variance is defined in which the variance is specified as a function of the mean. 14 The quasi-likelihood is defined as 1 ( ;) () i i n i y i yt Q y dt Vt µ β φ = − = ⋅ ∑ ∫ (Wedderburn 1974), and thus the quasi-score function is 1 ( ;) () n ii i j i j i j Qy y U V β µµ β φ µ β = ∂ −∂ = = ⋅ ∂ ⋅∂ ∑ , where j=1,2,…,K. i U has the following properties that are analog to that of a true score function that comes from a distribution within the exponential family. Thus it is not a surprise to see that the quasi-score function behaves similarly to a true score function. () 0 i E U = 1 () () i i Var U V φµ = 1 () () i i i U E V µ φµ ∂ − = ∂ Quasi-likelihood method is often used to handle Poisson or binomial data with dispersion, or data with an unknown distribution. It can be proven that the quasi-score function is identical to the score function associated with the maximum likelihood of a GLM if i y comes from an exponential family (see proof in Appendix II). Under mild regularity conditions, the quasi-likelihood estimators are consistent and asymptotically unbiased (White, 1982). However, they are not maximally and asymptotically efficient [32]. 2.3.2 Using quasi-likelihood method and Poisson distribution to model a binary outcome 15 It is natural to consider the Poisson distribution, as an alternative to the binomial, the true distribution of the outcome variable ( i y ). First, both Poisson and binomial distributions are from the exponential family and both are discrete. Both have the same link function (log). Lastly, the two distributions are related by np λ = . However, when the Poisson distribution is used to model a binomial outcome, dispersion is an issue because for i y that follow a binomial distribution, the mean does not equal to the variance most of time. This is why the concept of quasi-likelihood comes into play, because the relationship between the mean and the variance can be easily specified for the binary outcome. As mentioned above, the quasi-score function is identical to the score function associated with the maximum likelihood of a GLM if i y comes from an exponential family. When the Poisson distribution is chosen, the quasi-score function above can be simplified to 1 1 () n j i i ij i U yx µ φ = = − ∑ (see the proof in the text box below). It is not a surprise to see that the resulting quasi-score estimating equations, 1 () ( ) 0 n i i ij i S yx βµ = = −= ∑ , are the same as the estimating equations of the Poisson regression models. Please note that ˆ β does not depend on φ , the dispersion parameter. This is the reason that a GLM with the Poisson distribution can be applied to generate the point estimate of a quasi-likelihood based model in which the outcome follows a binomial distribution. 16 Since 1 ( ) exp( ) n i i ij j i EY x µβ = = = ∑ , 1 exp( ) n i ij j ij i ij i j d x x x d µ β µ β = = ⋅= ⋅ ∑ , For Poisson distribution, () i i Vµµ = Thus, 11 1 ( ;) 1 () () nn n ii i ii j i ij i i ij ii i j i j i Qy y y U x yx V β µµ µ µµ β φ µ β φµ φ = = = ∂ −∂ − = = ⋅ = ⋅⋅ = − ∂ ⋅∂ ⋅ ∑∑ ∑ As mentioned above, quasi-likelihood estimators are not maximally and asymptotically efficient (McCullagh and Nelder, 1989). Therefore, the robust Poisson regression model uses the classical sandwich estimator under the generalized estimation equation (GEE) framework to provide accurate standard errors (Huber, 1967; White, 1982; Zeger and Liang, 1986).The variance-covariance matrix is 1 11 11 1 [ [ ( )]] [ [( ( ) ( ) ]] [ [ ( )]] nn n T i ii i ii i EI E S S EI β ββ β − −− = = = ∑∑ ∑ , (6) where ( ) ( ) i i S I β β β ∂ = − ∂ is the information matrix [29]. A consistent estimate of the variance can be obtained by evaluating (6) at ˆ β . The robust variance mentioned above can be simply generated by using the REPEATED statement in SAS Proc GENMOD (Zhou, 2004) or the ROBUST option in STATA’s Poisson procedure (Barros and Hirakata, 2003). 17 2.4 Comparison of the two estimation methods For the log-binomial models, the denominators of the diagonal elements of the weight matrix,1 ( ) i p β − , can make the estimates of β coefficients very sensitive to observations with large values of ( ) i p β . In contrast, for the robust Poisson models, the point estimators based on the quasi-score estimating equations 1 ( ) 0 n i i ij i yx µ = −= ∑ should be relatively more stable compared to those of log-binomial models due to the smaller impact of large values of () i µβ . 18 Chapter 3. Model Comparisons with Respect to Robustness to Outliers and Model Misspecification Using Simulation 3.1 Robustness to Outliers 3.1.1 Simulation methods 3.1.1.1 Design of uncontaminated datasets Based on the experience gained from my own analyses and literature review, I expected the differences between the two models to be greatest when there were continuous confounders and when the models were complicated. Thus the study was designed to have at least one continuous confounder with the confounder(s) being associated with the outcome either in a linear or quadratic form. The rest of the section describes the design of the uncontaminated datasets. Let Y be a binary common outcome from a population with the probability of Y = 1 varying from 10%, 25% to 40%. Let X be a binary treatment/exposure variable (X = 1 for treatment/exposure and X = 0 for non-treatment/non-exposure) from a binomial distribution with the probability of X = 1 fixed at 50%. Let Z be a continuous confounder following the Beta (6, 2) distribution. The Beta distribution offers a wide range of possible forms. The parameters chosen (α = 6 and β = 2) provided the distribution with a mean of approximately 0.75. For example, if Z × 100 represents the ages of study subjects, they came from an elderly population with the average age being 75 years old. The relationship between X and Z is defined by the equation logit (px)=a0 + a1Z, where a1 = log(2) or log (4) to indicate moderate or strong association, respectively. The 19 relationship between Y and X is log-linear with adjusted RR being 1.5 or 2. Z is designed as a linear or quadratic confounder. Linear confounder Z: log(py) = β0 + β1X + β2Z, or Quadratic confounder Z: log(py) = β0 + β1X + β2Z + (0.5 × β2)Z 2 , where β1 = log (1.5) or log (2), and β 2 = log (2) or log (4) The value of β1 in the models denotes the effect due to exposure on the log-relative risk scale (i.e. β1 = log (RR)). To simplify the design, we set β2 = α1. There were a total of 24 scenarios (2 RRs, 3 outcomes, 2 levels of association, 2 types of confounders). A detailed description of the 24 scenarios is listed in Appendix III. 3.1.1.2 Rationale for choosing the values of the parameters The values of the parameters were chosen to reflect the distributions and the associations of the variables in the fractional exhaled nitric oxide (FeNO) asthma study described in Chapter 4. For example, in the asthma study, the rate of the outcome, 7 or greater short-acting beta-agonist (SABA) canisters dispensed in 12 months, was 22%. Thus, the simulation study was designed with the outcome rate ranging from 10%, 25% to 40%, covering a range of plausible values of a common outcome. In the same asthma study, the exposure variable (FeNO measures) was categorized into quartiles (4 categories, each contains 25% of the study subjects). To simplify the design of the simulation study, a binary exposure variable with an exposure rate of 50% (2 categories, each contains about 50% of the study subjects) was generated. This binary exposure 20 variable represents the grouping of the lower two categories and the higher two categories of the original FeNO measures in the real-life asthma dataset. 3.1.1.3 Data generating process First the random variable Z following the Beta (6, 2) distribution was generated for 1,500 subjects. Then the exposure variable X based on the subject-specific probability of exposure was created for each subject. Finally, the outcome conditional on the exposure status and the covariate was randomly generated with the log-binomial distribution. For each of the 24 scenarios described above, the values of α0 and β0 were searched iteratively among all the possible values to guarantee that the expose and the outcome rates reached the designed level (50% for exposure and 10%, 25% or 40% for outcome). The α0 and β0 for each scenario can be found in Appendix I. To study the differences between the two models in samples of moderate sizes, the same process was repeated to generate another set of data with n = 500. 3.1.1.4 Contaminating the datasets with outliers Most of time, people generate outliers by adding a small percentage of records from a distribution that is different from the underlying one, with the expected means being the same. However, the simulation study was designed by using a flipping approach to produce outliers that are more likely to be leverage points, which are expected to have a bigger impact on estimates. 21 For each of the 24 scenarios mentioned above, the simulation datasets were contaminated by flipping the outcomes (Y becomes 1-Y) of the records that had the extreme probabilities of having Y = 1 (i.e. either very low or very high py). The corresponding X’s and Z’s remained unchanged. Models were assessed by using the original simulated datasets and the datasets contaminated at the following level. Contamination rates: 0% – original datasets 2% – flipping the outcomes of records with py at bottom or top 1% 5% – flipping the outcomes of records with py at bottom or top 2.5% Although it is not realistic to expect outliers to solely come from observations with very low or very high probabilities of having the outcome, outliers resulting from flipping the outcome are possible due to documentation or data entry errors. For example, “0” (“not having the disease”) may be erroneously entered into the study database as “1” (“the disease was present”). Compared to adding outliers from a distribution that is different from the underlying one, the flipping approach produced observations that are more likely to be leverage points. In other words, they are more likely to impact the estimates. 3.1.2 Measures of model performance For each scenario, the simulation process was repeated 1,000 times to estimate the relative bias, standard error and mean square error (MSE) for the log-binomial model and the robust Poisson model. The relative bias was calculated as the average of the 1,000 22 estimated RR in log scale minus the log of the true RR divided by the log of the true RR. Standard error was defined as the empirical standard error of the estimated RR in log scale over all 1,000 simulations. MSE was calculated by taking the sum of the squared bias in log scale and the variances. To understand the impact of the data contamination on the expected values of the outcome variable (Y), the mean of the outcome variable (Y) at both levels of data contamination (2% and 5%) was examined among 1.5 million observations (=1,000 samples x 1,500 observations of each sample), for each of the 24 scenarios defined in 3.1.1.3. Due to the non-convergence issue in standard statistical software, the COPY method was used to generate the estimates of the log-binomial models. However, the accuracy of parameter estimates depends on the number of virtual copies. For this evaluation the number of copies was set to 1,000,000, the default of the COPY method (Petersen and Deddens, 2009). All the datasets were generated and analyzed using SAS Version 9.2 (SAS Institute, 2008). 3.1.3 Simulation results 3.1.3.1 Relative bias Table 3-1 and Figure 3-1 revealed the relative biases of the estimated RR in log scale from the two models in each of the 24 scenarios mentioned above when n = 1,500. As expected, both models accurately estimated β1 or log (RR) when the simulated databases were not contaminated (level of contamination = 0%). When the data were contaminated (level of contamination > 0%), the relative biases were all negative, 23 indicating that the point estimates were biased towards null. For a fixed RR and an outcome rate, the absolute value of the relative biases increased quickly with the level of contamination. However, the pace of such an increase varied by the outcome rate. Scenarios with lower outcome rates seemed to be associated with more elevated absolute relative biases. Models with non-linear confounder Z With the non-linear confounder Z, the robust Poisson model outperformed the log-binomial model when the data was contaminated (Table 3-1, Figure 3-1: C and D). The difference between the two models was visible even when the level of contamination was only 2%. The magnitude of difference between the two models also varied with the outcome rate. A larger difference was associated with smaller outcomes. 24 Table 3-1. Relative bias (%) in log scale in the presence of outliers (n=1,500) RR Prob (Y = 1) Contamina- tion Rate Association btw Z and Y: Linear Association btw Z and Y: Linear Association btw Z and Y: Non-Linear Association btw Z and Y: Non-Linear Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 1.8 1.8 2.0 2.0 0.0 0.0 −0.6 −0.8 2% −17.8 −16.7 −16.0 −16.0 −30.0 −19.5 −28.5 −20.4 5% −43.4 −36.7 −41.8 −36.9 −51.7 −41.3 −49.1 −43.0 25% 0% −0.4 −0.4 0.4 0.3 0.5 0.5 −0.8 −0.8 2% −10.4 −10.9 −10.1 −11.4 −14.1 −11.8 −18.6 −16.2 5% −25.3 −24.4 −26.6 −27.3 −34.6 −28.2 −43.4 −36.2 40% 0% 0.2 0.2 −0.7 −0.7 −0.0 −0.1 0.9 0.7 2% −7.5 −8.0 −9.8 −10.8 −11.0 −10.5 −14.9 −13.1 5% −19.2 −19.4 −23.4 −24.6 −26.9 −24.7 −36.9 −32.2 2.0 10% 0% 0.4 0.4 1.5 1.5 −0.2 −0.2 0.6 0.7 2% −18.9 −18.1 −17.1 −16.8 −26.3 −19.0 −25.0 −18.0 5% −42.0 −37.5 −40.2 −36.7 −47.9 −39.5 −47.4 −39.6 25% 0% 0.5 0.5 0.3 0.3 0.3 0.3 0.9 0.8 2% −9.0 −9.2 −9.2 −9.9 −12.3 −10.4 −13.7 −11.7 5% −23.3 −22.2 −23.7 −23.6 −30.0 −24.7 −34.5 −28.5 40% 0% −0.1 −0.1 0.1 0.1 0.3 0.3 −0.3 −0.3 2% −6.7 −7.0 −7.2 −7.8 −8.4 −8.0 −10.9 −10.2 5% −16.9 −16.6 −18.5 −19.0 −21.2 −19.5 −26.6 −24.5 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Relative bias was defined as the average of the 1,000 estimated RR in log scale minus the log of the true RR divided by the log of the true RR. 25 Figure 3-1. Relative bias (%) in log scale in the presence of outliers (n=1,500). A. Association between Z and Y: Linear; Level of association between Z and X, Z and Y: Moderate. B. Association between Z and Y: Linear; Level of association between Z and X, Z and Y: Strong. C. Association between Z and Y: Non-Linear; Level of association between Z and X, Z and Y: Moderate. D. Association between Z and Y: Non-Linear; Level of association between Z and X, Z and Y: Strong. Robust Poisson: Red dotted lines. Log-Binomial: Blue solid lines. Models with linear confounder Z When the models contained a linear confounder Z, the two models yielded comparable relative biases (Table 3-1, Figures 3-1: A and B) except for a few scenarios in which the biases from the log-binomial models were slightly larger than those of the 26 robust Poisson models. The largest differences occurred when the contamination rate was 5% and the outcome rate was 10%, in which the log-binomial models had 3.5%-6.7% higher biases compared to the corresponding robust Poisson models. 3.1.3.2 Standard error (SE) When the simulated databases were not contaminated (level of contamination = 0%), the SEs of the two models were identical at the 2nd decimal point (Table 3-2). When the levels of contamination were greater than 0%, the estimated SEs remained comparable between the two models when the outcome rates were high (25% or 40%). However, when the outcome rate was low (10%), the SEs derived from the log-binomial models seemed to be slightly higher than those of robust Poisson in some of the scenarios. 3.1.3.3 Mean square error (MSE) The same pattern was observed for the MSEs as those of relative biases (Table 3- 3). When the models contained a linear confounder Z, the two models had comparable MSEs except when the outcome rate was 10% and the contamination reached 5%. When the models contained a non-linear confounder Z, the two models diverged even when the degree of contamination was only 2% (Table 3-3). 27 Table 3-2. Standard error in log scale in the presence of outliers (n = 1,500) RR Prob (Y = 1) Contamina- tion Rate Association btw Z and Y: Linear Association btw Z and Y: Linear Association btw Z and Y: Non-Linear Association btw Z and Y: Non-Linear Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 0.16 0.16 0.16 0.16 0.16 0.16 0.16 0.16 2% 0.14 0.14 0.14 0.14 0.15 0.14 0.14 0.14 5% 0.13 0.12 0.13 0.12 0.15 0.13 0.15 0.13 25% 0% 0.10 0.10 0.09 0.09 0.09 0.09 0.09 0.09 2% 0.09 0.09 0.09 0.09 0.09 0.09 0.09 0.09 5% 0.08 0.08 0.08 0.08 0.09 0.09 0.08 0.08 40% 0% 0.06 0.06 0.06 0.07 0.06 0.06 0.06 0.06 2% 0.06 0.06 0.06 0.06 0.06 0.06 0.06 0.06 5% 0.06 0.06 0.06 0.06 0.06 0.06 0.06 0.06 2.0 10% 0% 0.17 0.17 0.16 0.16 0.16 0.16 0.17 0.17 2% 0.15 0.15 0.15 0.15 0.16 0.15 0.15 0.15 5% 0.14 0.13 0.13 0.13 0.15 0.13 0.16 0.13 25% 0% 0.10 0.10 0.10 0.10 0.09 0.10 0.10 0.10 2% 0.09 0.09 0.10 0.10 0.09 0.09 0.09 0.09 5% 0.09 0.09 0.09 0.09 0.09 0.08 0.08 0.09 40% 0% 0.07 0.07 0.07 0.07 0.07 0.07 0.07 0.07 2% 0.07 0.07 0.07 0.07 0.07 0.07 0.07 0.07 5% 0.07 0.07 0.07 0.07 0.07 0.07 0.07 0.07 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Standard error was defined as the empirical standard error of the estimated RR in log scale over all 1,000 simulations. 28 Table 3-3. Mean square error (MSE) in log scale in the presence of outliers (n = 1,500) RR Prob (Y = 1) Contamina -tion Rate Association btw Z and Y: Linear Association btw Z and Y: Linear Association btw Z and Y: Non-Linear Association btw Z and Y: Non-Linear Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong Level of association btw Z and X, Z and Y: Moderate Level of association btw Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 0.025 0.025 0.027 0.027 0.025 0.025 0.024 0.024 2% 0.026 0.025 0.025 0.024 0.036 0.026 0.033 0.026 5% 0.049 0.038 0.045 0.038 0.066 0.044 0.063 0.046 25% 0% 0.008 0.008 0.008 0.008 0.009 0.009 0.008 0.008 2% 0.010 0.010 0.009 0.010 0.011 0.010 0.013 0.012 5% 0.018 0.017 0.019 0.019 0.027 0.020 0.038 0.028 40% 0% 0.004 0.004 0.004 0.004 0.004 0.004 0.004 0.004 2% 0.005 0.005 0.006 0.006 0.006 0.006 0.008 0.007 5% 0.010 0.010 0.013 0.014 0.016 0.014 0.027 0.021 2.0 10% 0% 0.028 0.028 0.027 0.027 0.027 0.027 0.029 0.029 2% 0.040 0.038 0.035 0.035 0.057 0.039 0.053 0.038 5% 0.105 0.085 0.096 0.081 0.134 0.091 0.135 0.093 25% 0% 0.009 0.009 0.010 0.010 0.009 0.009 0.009 0.010 2% 0.012 0.013 0.014 0.014 0.016 0.014 0.018 0.016 5% 0.034 0.031 0.013 0.035 0.051 0.036 0.065 0.047 40% 0% 0.005 0.005 0.005 0.005 0.005 0.005 0.005 0.005 2% 0.007 0.007 0.007 0.008 0.008 0.008 0.011 0.010 5% 0.018 0.018 0.021 0.022 0.026 0.023 0.039 0.034 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Mean square error was calculated by taking the sum of the squared bias in log scale and the variances. 29 3.1.3.4. Results from moderate sample sizes (n=500) When the simulation was conducted based on samples of moderate sizes (n = 500), the same pattern was observed (data not shown). As expected, the SEs based on the small samples were larger compared to those derived from large samples (n = 1,500). The results summarized from samples of size 500 were included in Appendix IV. 3.1.3.5. Impact of data contamination on the expected value of the outcome variable (Y) Compared to the original outcome variable (10%, 25% and 40% by design), the mean of the outcome variable (Y) in the contaminated data increased slightly (Table 3-4). The level of increase depended on the distribution of the outcome variable (Y) and the contamination rate. As expected, the lower the original outcome rate and the higher the contamination rate, the larger the impact. When the outcome rate was relatively low (10%) and the contamination rate was high (5%), the impact of data contamination could be as large of 4% (increasing from 10% to up to 14%). However, when the outcome rate was high (40%), the increase was minimal (either unchanged or changed from 40% to 41%). When the outcome rate was 25%, the contamination increased the average outcome rate to a range of 25.8%-27.6%. 30 Table 3-4. Mean of the contaminated outcomes (Y) by the original outcome (10%, 25% and 40%) RR Original Outcome (%) Contamination Rate Mean of the Contaminated Outcomes (Y) (%) Association btw Z and Y: Linear Level of association btw Z and X, Z and Y: Moderate Association btw Z and Y: Linear Level of association btw Z and X, Z and Y: Strong Association btw Z and Y: Non- Linear Level of association btw Z and X, Z and Y: Moderate Association btw Z and Y: Non- Linear Level of association btw Z and X, Z and Y: Strong 1.5 10 2% 5% 11.6 14.0 11.6 14.0 11.6 14.0 11.5 13.8 25 2% 5% 26.1 27.6 26.0 27.5 26.0 27.4 25.8 27.1 40 2% 5% 40.4 41.0 40.4 40.9 40.3 40.8 40.1 40.2 2.0 10 2% 5% 11.6 14.0 11.6 13.9 11.6 13.9 11.4 13.8 25 2% 5% 25.9 27.4 25.9 27.3 25.9 27.3 25.6 27.0 40 2% 5% 40.4 40.9 40.3 40.8 40.2 40.7 40.1 40.2 Association between Z and X always linear 3.1.4 Discussion In this study, evaluation was performed on the statistical properties of the two most popular model-based approaches to estimate RR for common binary outcomes in various scenarios when outliers existed. The results suggest that for data coming from a population in which the true relationship between the outcome and the covariate is not in 31 a simple form (e.g. containing a higher order term), the robust Poisson model consistently outperforms the log-binomial model even when the level of contamination is low (e.g. 2%). Statistical software utilizes iterative reweighted least squares (IRLS) approach or variations of IRLS to find MLEs for generalized linear models. For log-binomial models, the weights used by the IRLS approach contain the term 1/(1-p), where p = exp (X T β) with a range from 0 to 1 [19]. Lumley et al. pointed out that the MLE of a log-binomial model is likely to be too sensitive to outliers because a very large p (referred to as μ by the authors) has a large influence on the weights, even though the sum of the covariate values are still bounded [15]. In our study both the MLE, generated by the log-binomial models and the quasi-likelihood estimators, produced by the robust Poisson models, were deteriorated when outliers were introduced. However, the level of deterioration differed when the relationships between the confounder and the outcome was not in a simple form, possibly due to the bigger “μ”s yielded by the log-binomial model when the higher- order term of Z was added into the model. Deddens and Petersen compared the log-binomial and robust Poisson models by using three real-life examples (Deddens and Petersen, 2008). Out of the three examples, two produced different point estimates and SEs. In one of these two examples, the difference was nearly two folds for both point estimates and SEs for one of the covariates. The authors concluded that “the decision on which method to use is very important” (Deddens and Petersen, 2008); however, since the truth was unknown, it is unlikely to tell that between the log-binomial model and the robust Poisson mode, which one can yield estimates that are closer to the truth. In one of the two examples in which 32 differences between the two models were observed, the model contained a higher-order (quadratic) term. It is unclear whether or not the complexity of the model degenerated the performance of the models, especially for the log-binomial model. Of the two methods, the log-binomial method is generally preferred due to the fact that the MLEs estimated by the log-binomial models are more efficient compared to the pseudo-likelihood estimators used by the robust Poisson models (Kauermann and Carroll, 2001, page 2300). Spiegelman and Hertzmark recommend using the log- binomial models over the robust Poisson models when convergence is not an issue (Spiegelman and Hertzmark, 2005). Very small differences were observed in a simulation study with a sample size of 100 and a single independent variable with a uniform distribution (Petersen and Deddens, 2008). When data perfectly follows a log-binomial distribution (i.e. without outliers), the current study did not observe any difference in either biases or variances between the log-binomial and the robust Poisson models for large (n = 1,500) and moderate (n = 500) sample sizes. It appears that the gain in efficiency is beneficial to log-binomial models only for samples of small sizes. It is not a surprise to observe negative biases when the simulation datasets were contaminated, because flipping the outcomes of the records that have the very low or very high probabilities tend to weaken the associations between the exposure and the outcome leading the associations towards null. The observation of more elevated biases when outcome rate = 10% compared to those of 25% and 40% comes with no surprise either since the impact of flipping on the estimates is expected to be more significant for scenarios with more extreme outcomes (close to 0% or 100%). 33 Robust methods to detect outliers especially high leverage points for logistic regression models are available in popular statistical software packages (Hosmer and Lemeshow, 2000; Imon and Hadi, 2013; Syaiba and Habshah, 2010). However, similar approaches for log-binomial models are not yet available in commercial software packages although the adoption of the diagnostic statistics from those of logistic regression models were demonstrated and applied in a real life example (Blizzard and Hosmer, 2006). Efforts to develop goodness of fit tests resulted in reasonable type I errors yet low to moderate power (Blizzard and Hosmer, 2006). For this reason, the robust Poisson model seems to be a more attractive choice due to its capability of providing more robust results when outliers are undetected. Occasionally, failure of convergence remains to be an issue for log-binomial models even if the COPY method is applied. Petersen and Deddens (Petersen and Deddens, 2008) included a continuous exposure variable (referred to as the continuous covariate by the authors) and applied the COPY method in the simulation and reported a range of convergence between 70% and 100%. In this study where C, the number of the virtue copies, was set to be 1,000,000, the COPY method converged in all 1,000 simulations in all the 48 scenarios with the linear confounder, and in 30 of out 48 scenarios with the non-linear confounder. In the 18 scenarios in which the COPY method did not converge in all 1,000 simulations (failed in 1 or more simulations), the converge rates ranged from 0.983 to 0.999 (median 0.996). If the COPY method fails to converge and the maximum likelihood-based estimators are desired, one can choose the Non- 34 Linear Programming (NLP) procedure in SAS [26]. The NLP method is computationally expensive. However, it does not encounter any convergence issues. Given the lack of robustness of log-binomial models, the authors recommend using robust Poisson models to estimate RR when there are continuous covariates in the model, especially when the covariates are not in a simple linear form. Due to the concern of lack of efficiency for the robust Poisson models for small samples, log-binomial may still be the choice when the sample size is small. A potential limitation of this study is that complex forms between the confounder and the outcome were generated by a quadratic equation. It is not clear whether or not the findings can be generalized to other complex situations. In addition, all of the outliers generated occurred to records with very low or very high probabilities and such outliers are more likely to be leverage points. The impact of the outliers generated by this study could be more significant compared to that of another study with outliers that were differently distributed. There are many approaches to generate “outliers”. One of the common approaches is to generate distributions with desired characteristics by combining simple distributions. For example, to generate a normal distribution N (100,4) with 5% outliers, we could generate 95% of the sample from a normal distribution with mean 100 and standard deviation 4, and generate 5% of the sample from another normal distribution with mean 100 and standard deviation 16. Traditionally, in simulation people tend to generate outliers with the expected value to be the same as the original distribution (i.e. no shift with regards to the mean). In this study, the mean of the outcome variable did 35 shift to the right slightly when the “outliers” were introduced (see Table 3-4), especially when the outcome rate was low (10%). For the sake of more consistent terminology, it would be better to refer this type of outliers as “outliers with shifted means”. In summary, the current study revealed the evidence to support the robustness of the robust Poisson model in various scenarios. Further research should focus on the model misspecification due to deviations of underlying probabilities. It is desirable for future studies to develop methods to identify leverage points and efficient goodness-of-fit test for log-binomial models. 3.1.5 Citation of the published paper Title: Comparison of robustness to outliers between robust Pisson models and log- binomial models when estimating relative risks for common binary outcomes: a simulation study Authors: Wansu Chen, Jiaxiao Shi, Lei Qian and Stanley P Azen Journal: BMC Medical Research Methodology 2014,14:82 Link: http://www.biomedcentral.com/1471-2288/14/82 36 3.2 Robustness to model misspecification 3.2.1. Simulation methods 3.2.1.1 Design and generation of simulation datasets The idea is to create three sets of simulation datasets to examine the impact of wrong link functions, predictors that violate linear assumptions and both. As noted in Chapter 2 (Sections 2.2 and 2.4), the estimates of β coefficients can be very sensitive to observations with large ( ) i p β , due to the term (1- ( ) i p β ) in the denominator of the weight matrix. Therefore, three sets of data with varying maximum ( ) i p β s, varying average ( ) i p β s and altered link functions were generated. Let Y be a binary common outcome (Y =1 for response/disease and Y =0 for non- response/non-disease) and X be a binary treatment/exposure variable (X = 1 for treatment/exposure and X = 0 for non-treatment/non-exposure). First, uncorrelated random variables Z1 and Z2 following the Bernoulli (0.5) and the Uniform [0, 1] distributions, respectively, were generated for 1,000 subjects. Then for each subject the exposure variable X based on the subject-specific probability of exposure defined by the equation logit (P (X=1| Z1, Z2) )= -1.0 + Z1 + Z2 with E(P(X=1))=0.5 was created. The outcome variable Y took various forms described in the nine datasets below to enable the examination of the impact of model mis-specification. The adjusted RR (i.e. P (Y =1| X = 1, Z1, Z2) ) / P (Y =1| X = 0, Z1, Z2)) was fixed at 3.0. 37 Datasets to study the impact of large P(Y=1) First, three data sets DS1, DS2, and DS3 were created using the log link function in which the maximum values of P (Y1 =1| X = 1, Z1, Z2)) were set to 0.75, 0.85 and 0.95, respectively (Table 3-5) to study the impact of maximum P (Y1 =1| X = 1, Z1, Z2))’s. To study the “volume” impact of these large Ps on the model performance, three additional outcome variables (Y2, Y3, Y4) conditional on the exposure status and the covariates were randomly generated within each dataset. Unlike Y1, which always contains perfect linear predictors on the right side of the equations, Y2, Y3 and Y4 are “altered” outcome variables such that the values of P (Yk=1| X = 0, Z1, Z2) depend on whether or not (Z1 + 3 * Z2) reaches a threshold (k=2, 3 and 4). The thresholds being selected were 0.15, 0.30 and 0.60 for Y2, Y3 and Y4, respectively. 0.15, 0.30 and 0.60 are called “thresholds” because (Z1 + 3 * Z2) needs to be greater than 0.15, 0.30 and 0.60 to impact P (Y2 =1| X = 0, Z1, Z2)), P (Y3 =1| X = 0, Z1, Z2)) and P (Y4 =1| X = 0, Z1, Z2)), respectively. The application of the thresholds yielded a spike of observations at the maximums for both exposed and unexposed subjects. For a given maximum value (e.g. 0.95), the larger the threshold, the higher the spike. The percentages of exposed subjects who were at the maximum values set above (0.75, 0.85 and 0.95) were 1.4%, 2.8% and 5.8% (Table 3-5). These values were derived by simulation and represented the various levels of alteration of the linear predictors. Recall that P (Yk =1 | X = 1, Z1, Z2) ) / P (Yk =1 | X = 0, Z1, Z2)) =3.0 and the maximum P (Yk =1 | X = 1, Z1, Z2) was set to 0.75, 0.85 and 0.95, for DS1, DS2 and DS3, respectively. The intercepts in Table 3-5 were manually calculated to satisfy P (Yk =1 | X 38 = 0, Z1, Z2) ≤ 0.75/3, 0.85/3 and 0.95/3, respectively, for DS1, DS2 and DS3 (k=1, 2, 3, and 4). For example, for the equation log (P (Y1=1| X = 0, Z1, Z2) ) = α – Z1 – 3 * Z2 in DS1, α = log(0.75/3)= log(0.25) = -1.38 since the logarithm is an increasing function and hence the maximum of P (Y1 =1| X = 0, Z1, Z2) is achieved when Z1 = 0 and Z2 = 0. For the same reason, for the equation log (P (Y2=1| X = 0, Z1, Z2) ) = α – Z1 – max (Z 1 + 3 * Z 2, 0.15) in DS1, α = log(0.75/3) + 0.15 = -1.38+0.15= -1.23. Table 3-5 displays the parameters to generate P (Yk=1 | X = 0, Z1, Z2) for unexposed subjects. For exposed subjects, P (Yk=1| X = 1, Z1, Z2) = 3*P (Yk =1| X = 0, Z1, Z2), where k = 1, 2, 3 and 4. Under the design, log-binomial or robust Poisson models based on outcome Y1 are assumed to be correctly specified, while the two models with outcomes Y2, Y3 and Y4 are considered to be mis-specified. Datasets to Study the Impact of Average P(Y=1) To study the impact of the entire distribution of P(Y=1) of a dataset when large P(Y=1)’s exist, three sets of data (DS4, DS5, DS6) were produced with varying average P(Y=1)’s while the maximum values of P (Yk =1| X = 1, Z1, Z2) and P (Yk =1| X = 0, Z1, Z2) were fixed at 0.95 and 0.3167 (= 0.95/3), respectively (k=1, 2, 3 and 4). This was achieved by varying the beta coefficient of Z 2 (Table 3-6), as the distribution of P(Y=1)’s is shifted towards zero as the beta coefficient of Z 2 increases. The beta coefficient of Z2 was set to be 2, 3 and 4 in DS4, DS5 and DS6, respectively. DS5 is identical to DS3; however, it was given a new DS name in Table 3-6 for easier reference. The intercepts in 39 Table 3-6 were manually calculated so that the P (Yk =0| X = 1, Z1, Z2)) does not exceed 0.3167 (= 0.95/3), using the same approach described in the previous section. 40 Table 3-5. Parameters of DS1-DS3 for generating Y k for unexposed subjects (k = 1, 2, 3 and 4) a Maximum P (Y k =1| X = 1, Z 1, Z 2) Dataset name Scenario Models to generate simulation datasets Evaluation model mis- specified? Max P of exposed a Beta of Z 2 Link Function % of exposed at Max P DS1 0.75 3 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.38 – Z 1 –3 * Z 2 No 0.75 3 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.23 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.75 3 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -1.08 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.75 3 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.78 – max (Z 1 + 3 * Z 2, 0.60) Yes DS2 0.85 3 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.26 – Z 1 –3 * Z 2 No 0.85 3 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.11 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.85 3 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.96 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.85 3 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.66 – max (Z 1 + 3 * Z 2, 0.60) Yes DS3 0.95 3 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.15 – Z 1 –3 * Z 2 No 0.95 3 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.00 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.95 3 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.85 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.95 3 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.55 – max (Z 1 + 3 * Z 2, 0.60) Yes 41 For the same reason described above, four outcome variables (Y1, Y2, Y3, Y4) were created within each data set (DS4, DS5, DS6) to study the “volume” impact of large Ps on the model performance. The thresholds in DS4 (0.10, 0.20, 0.40) and DS6 (0.20, 0.40, 0.80) were manually selected such that the percentages of exposed subjects reaching maximum P (Yk =1| X = 1, Z1, Z2) in DS4 and in DS6 are same as the corresponding ones in DS5 (see the selection of thresholds in the previous section). Because Z2 follows the uniform distribution, the thresholds increase proportionally with the beta coefficients. For example, the threshold to make 1.4% of exposed subjects reach the maximum P (Y2 =1| X = 1, Z1, Z2) is 0.1 when the beta of Z2 is 2 and increases to 0.2 when the beta of Z2 becomes 4. Table 3-6 displays the parameters to generate P (Yk=0 | X = 0, Z1, Z2). P (Yk=1 | X = 1, Z1, Z2) = 3*P (Yk =1| X = 0, Z1, Z2), where k = 1, 2, 3 and 4. Under the design, log-binomial or robust Poisson models based on outcome Y1 are assumed to be correctly specified, while the two models with outcomes Y2, Y3 and Y4 are considered to be mis-specified. Datasets to Study the Impact of Mis-Specified Link Functions Finally, the link function was altered from ‘log’ (DS7) to ‘logit’ (DS8) and ‘probit’ (DS9) to assess the model performance when the link-functions is mis-specified. Again, the maximum value of P (Yk =1)s were set to 0.95 and 0.3167 (= 0.95/3) for exposed and unexposed subjects, respectively, where k=1,2,3 and 4. DS3 or DS5 generated in previous sections is renamed to DS7 in this section for easier reference. 42 Table 3-6. Parameters of DS4-DS6 for generating Y k for unexposed subjects (k = 1, 2, 3 and 4) DS5 is identical to DS3 in Table 3-5 a Maximum P (Y k =1| X = 1, Z 1, Z 2) Dataset name Scenario Models to generate simulation datasets Evaluation model mis- specified? Max P of exposed a Beta of Z 2 Link Function % of exposed at Max P DS4 0.95 2 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.15 – Z 1 – 2 * Z 2 No 0.95 2 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.05 – max (Z 1 + 2 * Z 2, 0.10) Yes 0.95 2 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.95 – max (Z 1 + 2 * Z 2, 0.20) Yes 0.95 2 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.75 – max (Z 1 + 2 * Z 2, 0.40) Yes DS5 0.95 3 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.15 – Z 1 – 3 * Z 2 No 0.95 3 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.00 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.95 3 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.85 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.95 3 log 5.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.55 – max (Z 1 + 3 * Z 2, 0.60) Yes DS6 0.95 4 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.15 – Z 1 – 4 * Z 2 No 0.95 4 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -0.95 – max (Z 1 + 4 * Z 2, 0.20) Yes 0.95 4 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.75 – max (Z 1 + 4 * Z 2, 0.40) Yes 0.95 4 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.35 – max (Z 1 + 4 * Z 2, 0.80) Yes 43 Again, four outcome variables (Y1, Y2, Y3, Y4) were created within each data set (DS7, DS8, DS9) to study the “volume” impact of large Ps. Within each dataset (DS7, DS8, DS9), Y1 is always the one that perfectly follows the distribution that is indicated by the link function. For example, in DS9, P (Y1=1) follows the distribution of a ‘probit’ function. Y2, Y3 and Y4 are altered outcome variables such that P (Yk =1| X = 0, Z1, Z2)) depend on whether or not (Z1 + 3 * Z2) reaches a threshold (k=2, 3 and 4). The thresholds being selected in DS8 and DS9 were identical to those of DS7 (0.15, 0.30, 0.60 for Y2, Y3 and Y4, respectively), to guarantee the same proportion of subjects whose Ps reached the maximum value (0.95 for exposed subjects or 0.95/3 for unexposed subjects) as those in DS7. The intercepts in Table 3-7 were manually calculated so that the P (Yk =0| X = 1, Z1, Z2)) does not exceed 0.3167 (= 0.95/3), using the same approach described in the previous section. For example, for equation logit (P (Y2=1| X = 0, Z1, Z2) )= α – max (Z1 + 3 * Z2, 0.15) in DS8, since a ‘logit’ is also increasing function and hence the maximum of P (Y =1| X = 0, Z1, Z2) is achieved when Z1 = 0 and Z2 = 0. α = logit (0.3167) + 0.15 ≈ -0.61. Table 3-7 displays the parameters to generate P (Yk=0| X = 0, Z1, Z2). P (Yk=1| X = 1, Z1, Z2))= 3*P (Yk =1| X = 0, Z1, Z2), where k = 1, 2, 3 and 4. By design, log-binomial or robust Poisson models based on all the outcomes in DS7, DS8 and DS9 are assumed to be mis-specified, except Y1 in DS7. In DS8 and DS9, Y2-Y4 had mis-specified (i.e. incorrect) link functions and mis-specified linear predictors (i.e. the assumption of linear predictors was violated). 44 Table 3-7. Parameters of DS7-DS9 for generating Y k for unexposed subjects (k = 1, 2, 3 and 4) DS7 is identical to DS3 in Table 3-5 and DS5 in Table 3-6 a Maximum P (Y k =1| X = 1, Z 1, Z 2) Dataset name Scenario Models to generate simulation datasets Evaluation model mis- specified? Max P of exposed a Beta of Z 2 Link Function % of exposed at Max P DS7 0.95 3 log 0 log (P (Y 1=1| X = 0, Z 1, Z 2) )= -1.15 – Z 1 – 3 * Z 2 No 0.95 3 log 1.4 log (P (Y 2=1| X = 0, Z 1, Z 2) )= -1.00 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.95 3 log 2.8 log (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.85 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.95 3 log 5.8 log (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.55 – max (Z 1 + 3 * Z 2, 0.60) Yes DS8 0.95 3 logit 0 logit (P (Y 1=1| X = 0, Z 1, Z 2) )= -0.76 – Z 1 – 3 * Z 2 Yes 0.95 3 logit 1.4 logit (P (Y 2=1| X = 0, Z 1, Z 2) )= -0.61 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.95 3 logit 2.8 logit (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.46 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.95 3 logit 5.8 logit (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.16 – max (Z 1 + 3 * Z 2, 0.60) Yes DS9 0.95 3 probit 0 probit (P (Y 1=1| X = 0, Z 1, Z 2) )= -0.48 – Z 1 – 3 * Z 2 Yes 0.95 3 probit 1.4 probit (P (Y 2=1| X = 0, Z 1, Z 2) )= -0.33 – max (Z 1 + 3 * Z 2, 0.15) Yes 0.95 3 probit 2.8 probit (P (Y 3=1| X = 0, Z 1, Z 2) )= -0.18 – max (Z 1 + 3 * Z 2, 0.30) Yes 0.95 3 probit 5.8 probit (P (Y 4=1| X = 0, Z 1, Z 2) )= -0.12 – max (Z 1 + 3 * Z 2, 0.60) Yes 45 3.2.2 Rationales for choosing the values of the parameters and distributions Similar to the “Robustness to Outliers” study described in 3.1 above, the values of the parameters in the study described in 3.2 were chosen to reflect the distributions and the associations of the variables in the FeNO study as closely as possible. For example, in the asthma study, the exposure variable (FeNO measures) was categorized into quartiles (4 categories, each contains 25% of the study subjects). In the simulation study, a binary exposure variable with a mean exposure rate of 50% (2 categories, each contains about 50% of the study subjects) was generated. Again, this binary exposure variable represents the grouping of the lower two categories and the higher two categories of the original FeNO measures in the real-life dataset. The outcome rates in the simulated datasets will be controlled by the intercepts and the beta coefficients of Z2, one of the two covariates, such that they will stay in the range of 10%-25%, with the exception of the datasets following the ‘probit’ distribution in which one of the outcome rate will drop to 7%. The range of 10%-25% is practically meaningful to represent non-rare outcomes. Although the adjusted relative risks estimated from the asthma study ranged from 1.23 to 2.51 (Table 4-1), 3.0 was selected as the true values of the adjusted relative risk for the simulation datasets to ease the illustration. A lower level of relative risk (e.g. 2.0) will also work. However, the differences between the two models are expected to be smaller, due to the design of the simulation study. An adjusted relative risk of 3.0 is common in real-life examples. 46 The choice of ‘logit’ and ‘probit’ as the distributions deviated from the ‘log’ distribution is not a surprise. Both can be used to model binary outcomes and in fact logistic regression, with the use of a ‘logit’ distribution, is commonly used to estimate odds ratios. Truncation of a distribution may appear in many real-life datasets. For example, a viral load under 50 copies/mL is undetectable. A typical scale used in clinics/hospitals can measure height up to 200 cm and weight up to 250 kg. The thresholds (e.g. 0.15/0.30/0.60, or 0.20/0.40/0.80) were selected because they truncate approxiamately1.4%, 2.8% and 5.8% of the exposed subjects by setting their P(Y=1) at the maximum (e.g. 0.95). The truncation rates (1.4%, 2.8% and 5.8%) are plausible values that users can relate to their real-life applications. 3.2.3 Measures of model performance For each of the 28 scenarios described above (7 unique datasets and 4 outcome variables within each dataset), the simulation process was repeated 1,000 times. In each of the 1,000 simulated datasets for each of the 28 scenarios, the log risk ratio was estimated from the log-binomial model and the robust Poisson model, respectively, using the methods described in Sections 2.2 and 2.3. For each scenario, the relative bias, standard error and mean square error (MSE) in log scale for all three measures were calculated by summarizing the results from the 1,000 datasets for each regression model. Relative bias was defined as the average of the 1,000 estimated RR in log scale minus the log of the true RR (3.0) divided by the log of the true RR (3.0). If ˆ i θ is denoted as the 47 estimated log risk ratio from the ith dataset using either the log-binomial model or the robust Poisson model, the relative bias is then defined as 1,000 1 ˆ 1 log3 ( ) *100% 1,000 log3 i i θ = − ∑ . Standard error was defined as the empirical standard error of the estimated risk ratio in log scale over all 1,000 simulations. The MSE was calculated by taking the sum of the squared bias in log scale and the variances, in which the bias was specified as 1,000 1 1 ˆ ( log3) 1,000 i i θ = − ∑ . Because both standard errors and MSEs depend on the sample size, the process described above was repeated for samples of size 500. The COPY method was used to generate the point estimates of the log-binomial models (Petersen and Deddens, 2009). For this evaluation, the number of virtual copies was set to 10,000. To ensure a fair comparison, the evaluation for the two models was conducted by only using the results based on exactly the same simulated data. If the COPY method did not converge for a dataset, the same dataset was then removed before the performance of the robust Poison models were evaluated. All the datasets were generated and analyzed using SAS Version 9.3 (SAS Institute, 2008). 3.2.4 Simulation Results 3.2.4.1 Relative bias (n=1,000) Table 3-8 and Figure 3-2 revealed the relative biases of the estimated RR in log scale from the two models in each of the 28 scenarios mentioned above when n = 1,000. Recall that DS3, DS5 and DS7 are identical datasets, and thus there were seven unique 48 datasets. Within each dataset, there were four outcome variables (Y1, Y2, Y3 and Y4), corresponding to 0%, 1.4%, 2.8% and 5.8% subjects at the max value of P, respectively. As expected, both models accurately estimated β1 or log (RR) when they were correctly specified (unshaded area in Table 3-8, and points labeled as “% exposed at max P = 0 and Link function = log in Figure 3-2) regardless of the value of the maximum P. When the models were mis-specified, the relative biases of the robust Poisson models were always quite small and stable, while those of log-binomial models tended to negatively bias away from null. For the log-binomial models, the magnitude of biases increased in all scenarios when the level of mis-specification, measured by the percentage of exposed subjects at maximum P, increased. The next three paragraphs focus on the performances of the log-binomial models. Impact of large P(Y=1) on mis-specified log-binomial models The results demonstrating the impact of large P(Y=1)s on biases were presented in the first three rows of Table 3-8 (labeled as DS1, DS2 and DS3) and by panel A of Figure 3-2. Having large P(Y=1)s was associated with an increased level of biases when the models were mis-specified. However, the impact depended on the level of mis- specified linear predictors. When the percentage of exposed subjects whose maximum P (Yk =1| X = 1, Z1, Z2) = 5.8, the relative bias changed from -4.1% to -11.7% when the maximum P (Yk =1| X = 1, Z1, Z2) raised from 0.75 in DS1 to 0.95 in DS3. However, when the percentage of exposed subjects whose maximum P (Yk =1| X = 1, Z1, Z2) = 1.4, the change of relative bias was minor (from 1.0% to -2.3%) when the maximum P (Y =1| X = 1, Z1, Z2) went up from 0.75 to 0.95. 49 Table 3-8. Relative bias a (%) in log scale with and without model misspecification (n=1,000) a Relative bias was defined as the average of the 1,000 estimated RR in log scale minus the log of the true RR divided by the log of the true RR. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets Unshaded: Models corrected specified Light shaded: Mis-specified linear predictors or mis-specified link function Dark shaded: Mis-specified linear predictors and mis-specified link function LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 0.4 0.6 1.0 1.5 -1.4 0.7 -4.1 1.1 DS2 0.85 3 log 0.4 0.3 -0.5 0.8 -3.0 1.0 -7.1 0.9 DS3 0.95 3 log 0.9 1.4 -2.3 1.0 -6.1 0.8 -11.7 0.5 DS4 0.95 2 log 0.2 0.6 -1.5 0.4 -3.8 0.2 -6.4 1.0 DS5 0.95 3 log 0.9 1.4 -2.3 1.0 -6.1 0.8 -11.7 0.5 DS6 0.95 4 log 0.1 0.7 -4.6 -0.2 -8.5 1.2 -15.4 1.2 DS7 0.95 3 log 0.9 1.4 -2.3 1.0 -6.1 0.8 -11.7 0.5 DS8 0.95 3 logit -1.7 0.6 -5.0 0.0 -6.8 1.2 -11.0 0.9 DS9 0.95 3 probit -5.9 2.2 -12.8 0.9 -18.0 0.4 -21.9 1.0 50 Impact of average P(Y=1) on the mis-specified log-binomial models The impact of the average P(Y=1) was displayed in the 2 nd three rows of Table 3- 8 (labeled as DS4, DS5 and DS6) and by panel B of Figure 3-2. For a fixed level of maximum P (Y =1| X = 1, Z1, Z2) and a fixed level of mis-specified linear predictors, the log-binomial models became more vulnerable (i.e. larger absolute value of the relative biases) when the beta coefficient increased (i.e. average P (Y =1| X = 1, Z1, Z2) decreased). For example, when the % of exposed subjects whose maximum P (Yk =1| X = 1, Z1, Z2) = 2.8, the relative bias changed jumped from -3.8% to -8.5% when the beta coefficient increased from 2 to 4. As I mentioned in Section 3.2.1.1, a higher beta coefficient of Z2 is associated with a lower level of average P (Y =1| X = 1, Z1, Z2). This indicates that not only the values of the large Ps, but also the value of the average Ps, impact the performance of the log-binomial models. Impact of mis-specified link functions on log-binomial models The impact of mis-specified link functions was shown in the 3 rd three rows of Table 3-8 and in panel C of Figure 3-2. When the underlying distribution of data was ‘logit’, mis-specifying the link function as ‘log’ did not influence the relative bias much. However, when the data followed a ‘probit’ distribution, the bias (-12.8%) was large even when the predictors were properly specified. Mis-specifying the link function from ‘probit’ to ‘log’ in the presence of mis-specified linear predictors had a serious consequence. The relative bias was almost -18% when the percentage of exposed subjects at the maximum P was only 2.8. 51 Figure 3-2. Relative bias (%) in log scale when the models were correctly or incorrectly specified (n=1,000). A. Varying max P(Y=1). Beta of Z2 is fixed at 3, Link function was fixed as log. B. Varying beta of Z2. Max P was fixed at 0.95. Link function was fixed as log. C. Varying underlying distribution of simulated data. Max P was fixed at 0.95, Beta of Z2 was fixed at 3. Robust Poisson: Red dotted lines. Log-Binomial: Blue solid lines. B C A 52 3.2.4.2 Standard error (SE) (n=1,000) In all the scenarios being simulated, the standard errors (SE) of the two models were comparable (Table 3-9) when n=1,000. At the 2 nd decimal point, the SEs derived from the log-binomial models were either the same or slightly smaller compared to those of robust Poisson models. The largest difference, 0.03 (=0.23-0.20), occurred to dataset DS9, in which data distribution was ‘probit’, maximum P was 0.95 and beta of Z2 was 3. When the models correctly specified, the SE of the two models were the same except for one scenario (DS6) in which the SE from log-binomial models was smaller by 0.01. The SEs of the log-binomial models were always smaller compared to those of robust Poisson models at the 3 rd decimal place, even if the models were correctly specified (data not shown). 3.2.4.2 Mean square error (MSE) (n=1,000) When the underlying distribution of data was ‘log’ or ‘logit’, the MSEs of the two models were comparable unless the level of the mis-specification of the linear predictors and the maximum P were large (e.g. % of exposed subjects at max P=5.8% and max P=0.95) (Table 3-10). However, when the data followed a ‘probit’ distribution, the robust Poisson models outperformed the log-binomial models even when the level of mis- specification of the linear predictors was small. For example, when the percentage of exposed subjects at the 0.95 (max P) was only 1.4, the MSEs were 0.051 and 0.038, from the log-binomial model and the robust Poisson model, respectively. 53 Table 3-9. Standard error a in log scale with and without model misspecification (n=1,000) a Standard error was defined as the empirical standard error of the estimated RR in log scale over all 1,000 simulations. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets Unshaded: Models corrected specified Light shaded: Mis-specified linear predictors or mis-specified link function Dark shaded: Mis-specified linear predictors and mis-specified link function LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 0.20 0.20 0.18 0.18 0.16 0.17 0.14 0.15 DS2 0.85 3 log 0.18 0.18 0.16 0.17 0.15 0.15 0.13 0.14 DS3 0.95 3 log 0.17 0.17 0.15 0.16 0.14 0.15 0.12 0.13 DS4 0.95 2 log 0.13 0.13 0.13 0.13 0.12 0.13 0.11 0.12 DS5 0.95 3 log 0.17 0.17 0.15 0.16 0.14 0.15 0.12 0.12 DS6 0.95 4 log 0.18 0.19 0.15 0.16 0.15 0.16 0.12 0.13 DS7 0.95 3 log 0.17 0.17 0.15 0.16 0.14 0.15 0.12 0.13 DS8 0.95 3 logit 0.14 0.15 0.13 0.14 0.13 0.14 0.11 0.12 DS9 0.95 3 probit 0.20 0.23 0.18 0.19 0.16 0.17 0.14 0.15 54 Table 3-10. Mean square error a (MSE) in log scale with and without model misspecification (n=1,000) a Mean square error was calculated by taking the sum of the squared bias in log scale and the variances. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets Unshaded: Models corrected specified Light shaded: Mis-specified linear predictors or mis-specified link function Dark shaded: Mis-specified linear predictors and mis-specified link function LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 0.040 0.040 0.032 0.033 0.026 0.028 0.022 0.022 DS2 0.85 3 log 0.032 0.033 0.026 0.028 0.023 0.024 0.023 0.019 DS3 0.95 3 log 0.028 0.031 0.022 0.025 0.023 0.021 0.031 0.016 DS4 0.95 2 log 0.017 0.018 0.016 0.018 0.016 0.016 0.017 0.014 DS5 0.95 3 log 0.028 0.031 0.022 0.025 0.023 0.021 0.031 0.016 DS6 0.95 4 log 0.032 0.036 0.026 0.027 0.031 0.026 0.044 0.018 DS7 0.95 3 log 0.028 0.031 0.022 0.025 0.023 0.021 0.031 0.016 DS8 0.95 3 logit 0.019 0.022 0.020 0.019 0.021 0.019 0.027 0.014 DS9 0.95 3 probit 0.046 0.052 0.051 0.038 0.063 0.028 0.076 0.021 55 3.2.4.3 Distribution of P(Y=1) in the simulated datasets (n=1,000) Figures 3-3 (A, B, C) revealed the distribution of P(Y=1) for all the data simulated in the study (one million data points for each scenario). As expected, the spikes (i.e. the vertical thin lines in all the figures excepts those with “0% at the max P”) for both exposed and unexposed subjects grew higher when the percentage of exposed subjects at the maximum P (or the percentage of unexposed subjects at the one-third of the maximum P) increased for all the panels (A, B and C) in Figures 3-3. An increase of maximum P(Y=1) from 0.75 to 0.95 while the link function and beta of Z2 were fixed at ‘log’ and 3, respectively, mildly stretched the distributions of P(Y=1) to the right and thus slightly lowered the peaks (Figure 3-3 A). Recall that in panel A of Figure 3-2 or Table 3-8 the relative bias was the highest when P=0.95 and lowest when P=0.75, for a fixed level of mis-specified linear preditors (e.g 2.8% of exposed subjects at the max P). Recall that biases occurred only when the linear predictors were incorrectly specified (Figure 3-2 A and Table 3-8). 56 Figure 3-3 A. Distribution of P(Y=1) with varying max P(Y=1). The x-axis is the value of P(Y=1). Beta of Z 2 was fixed at 3, Link function being used to generate the simulation data was log. From the left to the right panels, the max P(Y=1) increased from 0.75, 0.85 to 0.95. From the top to the bottom panels, the % of exposed subjects at the max P(Y=1) increase from 0%, 1.4%, 2.8% to 5.8%. When the beta of Z2 increased from 2, 3 to 4 while the max P(Y=1) was fixed at 0.95 and the link function was fixed as log, the distribution of P(Y=1) became taller, thinner and shifted towards zero (Figure 3-3 B). Recall that in panel B of Figure 3-2 or Table 3-8 the setting with beta of Z2 = 4 had the highest bias while the one with beta of Z2 = 2 had the least bias, for a fixed level of mis-specification of linear predictors. Recall that biases occurred only when the linear predictors were incorrectly specified (Figure 3- 57 2 B and Table 3-8). Figure 3-3 B provided the evidence to attribute the increase of biases to the decrease in the distribution of P(Y=1) (i.e. decrease in the prevalence of the outcome variable Y). Figure 3-3 B. Distribution of P(Y=1) with varying beta of Z 2. The x-axis is the value of P(Y=1). Max P(Y=1) was fixed at 0.95. Link function being used to generate the simulation data was log. From the left to the right panels, the beta of Z 2 increased from 2, 3 to 4. From the top to the bottom panels, the % of exposed subjects at the max P(Y=1) increase from 0%, 1.4%, 2.8% to 5.8%. 58 The distributions of P(Y=1) for data following ‘log’ and ‘logit’ functions appeared to be similar; however, the distributions for data following ‘probit’ distributions distinguished themselves significantly from those of ‘log’ or ‘logit’ functions (Figure 3-3 C). This obsevation supported the larger biases seen in panel C of Figure 3-2 and Table 3-8 when the underlying distrubiution of data was ‘probit’, compared to those of ‘log’ and ‘logit’ distributions even when the predictors were perfectly linear. 3.2.4.4 Results of simulation for moderate sample size (n=500) When the simulation was conducted based on samples of moderate sizes (n = 500), the same pattern was observed in terms of relative biases and standard errors. The summarized data were shown in Tables V-1, V-2 and V-3 in Appendix V. As expected, the SEs based on the small samples were larger compared to those derived from large samples (n = 1,000). The same pattern was also observed for MSEs; however, the differences between the two models were not as substantial as seen in the samples of sizes 1,000. 59 Figure 3-3 C. Distribution of P(Y=1) with varying underlying distribution of simulated data. The x-axis is the value of P(Y=1). Max P(Y=1) was fixed at 0.95. Beta of Z2 was fixed at 3. From the left to the right panels, the link function being used to generate the simulation data changed from log, logit to probit.. From the top to the bottom panels, the % of exposed subjects at the max P(Y=1) increase from 0%, 1.4%, 2.8% to 5.8%. 60 3.2.5 Discussion In this study, a comparison was made to study the statistical performance of the two most popular model-based approaches using simulation to estimate RR for common binary outcomes under various scenarios when the outcome models are mis-specified. The findings suggest that the point estimates of the log-binomial models could be biased when the underlying data fail to follow a log distribution or the response depends on covariates through a non-linear combination of predictors. When both types of mis- specification are present, the biases may be large. Lumley et al. noted the potential impact of a large p (referred to as μ by the authors) on the point estimates of log-binomial models (Lumley et al., 2014). The current study demonstrated the impact of both large ps and the average p through simulations. MLEs of log-binomial models are derived from an iterative reweighted least squares (IRLS) approach or variations of IRLS in statistical software. The weights being used in the IRLS approach contain () i p β in the numerator and 1 () i p β − in the denominator (refer to Section 2.2), where () i p β = exp (X T β) with a range from 0 to 1. This implies that not only the magnitude of the () i p β , but also the relative distance between () i p β and the average () p β [= 1 1 () n i i p n β = ∑ ] determines the relative contribution of the ith individual. An individual with a very large () i p β in a dataset with a low average () p β influences the point estimates more than an individual with the same magnitude of () i p β s in another dataset with a high average () p β . The interpretation based on the theory is consistent with the results of the simulation in which the deterioration of the log-binomial models 61 was obvious when the average () p β was low and large () i p β s existed. For example, when the distribution was ‘probit’ and maximum () i p β reached 0.95, the bias reached - 12.8% when only 1.4% of the exposed subjects whose distribution were distorted. Recall that the ‘probit’ distribution had lower average () p β compared to that of ‘log’ distribution, when other factors remained fixed (please refer to Figure 3-3). On the other hand, the point estimates from the robust Poisson models were rather stable in all the scenarios examined, even when they were applied to the data that were generated by ‘probit’, a distribution that is quite different from that of a ‘log’ distribution, and/or when the distribution of 5.8% of the exposed subjects were altered. In the previous study (please refer to the “outlier” study in Section 3.1), I showed that both the MLEs, generated by the log-binomial models and the quasi-likelihood estimators, produced by the robust Poisson models, were deteriorated when outliers were introduced. However, in the current study, the point estimates from the robust Poisson models were fairly stable, even when both link functions and predictors were incorrectly specified. This interesting phenomenon can be explained by a major difference in the design of the two studies. In the “outlier” study, the association between the exposure and the outcome was weakened when the “outliers” were introduced, and thus the negative biases were observed for the robust Poisson models. In the current study, because the true RR was maintained at 3.0, even when the underlying data did not follow a ‘log’ distribution and/or when part of the distributions were truncated. Recall that the quasi-likelihood is a technique used to estimate regression coefficients without fully specifying the distribution of the observed 62 data (refer to Section 2.3). It seems that for robust Poisson, the violation of the log distribution of the observed data did not hinder its ability to find the true RR. There are many approaches to generate “mis-specified models”. The design focused on two types of mis-specification that may cause biased estimators based on the foundation of the GLM framework. The choice of ‘logit’ and ‘probit’ as the distributions deviated from the ‘log’ distribution is not a surprise. Both distributions, along with many others can be used to model binary outcomes. The former is close, while the latter is quite different from ‘log’ in terms of their distributions. It is known that the relationship between ‘log’ and ‘logit’ distributions is nearly linear when the () i p β s are not too far away from 0.5. However, the two distributions are quite different for () i p β s that are located at the two ends (i.e. very low or very high). The similar results from the simulation can be well explained by the similarity between the two distributions. In contrast, when the underlying data followed ‘probit’ distribution, the biases were significantly larger compared to those of ‘log’ and ‘logit’. The significant differences between the ‘probit’ distribution and the ‘log’/’logit’ distributions were displayed by Figure 3-3. Although I did not evaluate data based on other distributions that are also suitable for model binary outcomes (e.g. complementary log-log or log-log), it is expected that the results have similar patterns. Truncation of a distribution may appear in many real-life datasets. For example, a viral load under 50 copies/mL is undetectable. A typical scale used in clinics/hospitals can measure height up to 200 cm and weight up to 250 kg. The truncation rates (1.4%, 2.8% and 5.8%) are plausible values that readers can relate to their real-life applications. 63 In contrast to the “outlier” study, the current study found small differences at the 2 nd decimal point between the log-binomial models the robust Poisson models in terms of variances in some of the scenarios for both sizes of samples (n=1,000 and n=500), even when the models were correctly specified. This finding is consistent with that of Petersen & Deddens which was based on a sample of size 100 and a single independent variable with a uniform distribution (Petersen and Deddens, 2008). Kauermann and Carroll showed that variances of sandwich estimators are generally less efficient than variance estimates derived from parametric models (Kauermann and Carroll, 2001). This weakness impacts the coverage probability, the probability that a confidence interval covers the true RR, and thus the ability to reject a null when the alternative is true. Hence, the log-binomial models are preferred over the robust Poisson models when MLE estimators derived from log-binomial models are not biased. COPY method was reported to have convergence issue when there are continuous covariates in the model (Petersen and Deddens, 2008). However, convergence was barely an issue in this study. The COPY method converged completely (i.e. 1,000 out of 1,000 simulations) in 23 out of 28 scenarios when the sample size was 1,000, and 21 out of 28 scenarios when the sample size dropped to 500. In the 12 scenarios (5 for sample size 1,000 and 7 for sample size 500) in which the COPY method did not completely converge in all 1,000 simulations, there was only one out of 1,000 simulations failed to converge for each scenario. The number of virtue copies used in the study, 10,000, was reported to be accurate to three decimal places (Petersen and Deddens, 2009). 64 Mis-specification tests were developed (White, 1982; Presnell and Boos, 2004) and were proven to be able to maintain reasonable size across various settings in simulation when they were applied to logistic and beta-binomial regression models (Capanu and Presnell, 208). However, the power to detect the types of alternatives commonly observed in practice (e.g. alternative link functions) was low (Capanu and Presnell, 208). Blizzard and Hosmer assessed model-fit of log-binomial models by applying the Hosmer-Lemeshow test (a commonly used goodness-of-fit test for logistic regression models), the Pearson chi-square test, and the unweight sum of squares test. All three tests exhibited acceptable type I errors yet low to moderate power (Blizzard and Hosmer, 2006). Due to the lack of powerful diagnostic tools to detect any forms of model mis-specification, Poisson models may be considered a good choice because of its robustness to model mis-specification. In summary, the current study revealed the evidence to support the robustness of the robust Poisson model in various scenarios when models were mis-specified. It is desirable for future studies to develop model misspecification and/or goodness-of-fit test that are powerful and convenient to apply for log-binomial models. 65 Chapter 4. Applications 4.1 A study of relationship between fractional exhaled nitric oxide (FeNO) and asthma impairment The log binomial and robust Poisson approaches were applied to data collected as part of a multicenter cross-sectional study assessing the association of fractional exhaled nitric oxide (FeNO), a marker of airway inflammation, and asthma burden among persistent asthma patients who were treated with inhaled corticosteroids (ICS). High FeNO levels were hypothesized to be associated with greater asthma burden. The primary outcome of interest was 7 or greater short-acting beta-agonist (SABA) canisters dispensed in 12 months, which is a validated administrative data surrogate for asthma impairment. The continuous exposure variable, FeNO, was categorized into four quartiles. Additional variables adjusted in the multivariable analysis included age, gender, race/ethnicity, number of aeroallergen sensitivities, clinical center, FEV1% predicted (≥80% vs. <80%) and asthma control test (ACT) score (>19, 16-19, <16). Age remained to be a continuous variable in the model. All analyses were conducted by using SAS (version 9.2 for Windows; SAS Institute, Cary, NC). The crude and adjusted relative risks and their 95% confidence intervals (CI) were reported in Table 4-1. Noticeable differences in the point estimates of categorized FeNo values between the two models were found (highlighted in yellow in Table 4-1). Using the robust Poisson model, the risk ratio of having ≥7 SABA canisters over the past 12 months was 2.05 (95% CI, 1.03-4.05) in patients whose FeNO value was in the 2 nd quartile, compared to patients whose FeNO value was in the 1 st quartile. However, when 66 the analysis was repeated by using the log-binomial model, the corresponding risk ratio was reduced to 1.67 (95% CI, 0.83-3.57), a 19% deduction from 2.05. Although the point estimates from the two models were in the same direction (i.e. both greater than 1.0), the interpretation of the results varied by the statistical model being chosen, because the confidence internal derived from the robust Poisson model did not cover one while the one from the log-binomial model included one. Compared to patients with FeNo value in the 1 st quartile, patients whose FeNo value was in the 3 rd quartile did not have a higher risk at the 95% level for both models (risk ratio = 1.40, 95% CI = 0.68-2.85 for the robust Poisson model and risk ratio = 1.23, 95% CI = 0.58-2.72 for the log-binomial model). For patients whose FeNo value was in the highest quartile, the risks were significantly higher no matter which model was chosen (risk ratio = 2.51, 95% CI = 1.31-4.79 for the robust Poisson model; risk ratio = 2.11, 95% CI = 1.10-4.45 for the log-binomial model), compared to patients whose FeNo was in the 1 st quartile. In summary, there were striking differences (12-19%) in the point estimates and the 95% confidence intervals between the two models. Although the truth is unknown in this real-life example, one may suspect that model misspecification or the existences of outliers may contribute to the large differences in the estimates. The model misspecification may be due to the mismatch between the link function being specified and the underlying distribution of the outcome variable. It may also be caused by the violation of the linearity assumption of the predictors. 67 The results published were based on the robust Poisson model (Zeiger et al., 2011), as they were considered to be more robust compared to those of log-binomial models. 68 Table 4-1. Comparison of robust Poisson and log-binomial models in estimating risk ratio (RR) of >7 SABA canisters dispensed in the past year Factors at baseline Robust Poisson RR (95% CI) Log-Binomial RR (95% CI) Unadjusted Adjusted Unadjusted Adjusted FeNO (ppb): reference 1 st quartile (7-19 ppb) P for trend ‡ <0.0001 P for trend ‡ =0.0011 P for trend ‡ <0.0001 P for trend ‡ =0.0243 2 nd quartile (20-28 ppb) 1.94 (0.95 - 3.99) 2.05 (1.03 - 4.05) 1.94 (0.97 - 4.17) 1.67 (0.83 - 3.57) 3 rd quartile (29-47 ppb) 1.56 (0.73 - 3.31) 1.40 (0.68 - 2.85) 1.56 (0.74 - 3.43) 1.23 (0.58 - 2.72) 4 th quartile (48-215 ppb) 3.37 (1.77 - 6.42) 2.51 (1.31 - 4.79) 3.37 (1.84 - 6.84) 2.11 (1.10 - 4.45) FEV1% predicted: reference ≥80% P for trend ‡ =0.2280 P for trend ‡ =0.2243 P for trend ‡ =0.2037 P for trend ‡ =0.1851 <80% 1.36 (0.91 – 2.05) 1.28 (0.84 - 1.94) 1.36 (0.90 - 2.04) 1.32 (0.85 - 1.98) ACT score: >19 (controlled), reference P for trend ‡ =0.0856 P for trend ‡ =0.2768 P for trend ‡ =0.0891 P for trend ‡ =0.3537 16-19 (not well controlled) 1.08 (0.64 - 1.81) 0.84 (0.51 - 1.37) 1.08 (0.61 - 1.77) 0.86 (0.49 - 1.46) <16 (very poorly controlled) 1.26 (0.76 - 2.08) 1.02 (0.61 - 1.72) 1.26 (0.74 - 2.03) 0.98 (0.61 - 1.60) *Variables adjusted in addition to FEV1% predicted and asthma control test (ACT) were age, gender, race/ethnicity, number of aeroallergen sensitivities, and clinical center. ‡ P-values for trend were based on tests for slope in the univariate or multivariable model when FeNO, FEV1% predicted and ACT score appeared as continuous variables. Bold: indicates statistically significant at the 95% level. 69 4.2 A study of relationship between blood eosinophil counts and asthma exacerbation Based on a randomized controlled trail, Green et al. reported that sputum eosinophil counts serve as an early indicator of asthma exacerbation in moderate to severe asthma patients (Green et al., 2002). A group of investigators at Kaiser Permanente Southern California initiated a study to examine the relationship between blood eosinophil counts (EOS in mm 3 ) and asthma exacerbation in a group of adult persistent asthma patients. The learned information could be used to guide the clinical practice if high EOS does increase the risk of asthma exacerbation later on. Asthma exacerbation, obtained from patient electronic medical records, was defined as a binary variable (yes/no). EOS was categorized according to various cutoff points (e.g. 150, 200, 300 and 400/mm 3 ). Because asthma exacerbation is a common outcome, robust Poisson regression model and log-binomial model were applied to estimate the adjusted risk ratios and their 95% CIs, instead of using a logistic regression model to estimate an adjusted odd ratio. Covariates at baseline being considered in the analysis were age (a continuous variable), gender, race/ethnicity, obesity (yes vs. no), geocoded education level, enrollment duration, smoking status, insurance type, Charlson comorbidity index (Deyo, Cherkin and Ciol, 1992), specific comorbidities (gastroesophageal reflux, rhinitis, chronic sinusitis, nasal polyps, eczema), medication dispensing (leukotriene modifiers, theophylline, omalizumab, and ≥7 SABA canisters dispensed), the Healthcare Effectiveness Data and Information Set (HEDIS)-defined asthma treatment adherence (9+ vs <9 controllers per year), Global Initiative of Asthma 70 (GINA) step-care level, and history of asthma exacerbations. The geocoded educated level was derived from a geocoding process in which the address of a patient was mapped to a census block group and then the education level of the residents 25+ years of age living in the same block group was assigned to the patient being studied. The geocoding approach was described in the publication by Chen et al. (Chen, Petitti, and Enger, 2004). Age, gender, obesity, geocoded education, and EOS were fixed covariates in the models. For other covariates, quasi-likelihood under the independence model criterion was used to compare any two models in a stepwise approach (Pan, 2001). Covariates were added into the model one at a time and the quasi-likelihood of the model with the covariate was compared to the one without the covariate. The model with the smaller quasi-likelihood under the independence model criterion was preferred. The covariates selected for the final robust Poisson model were applied to the final log-binomial regression model. Interaction (cross product) terms were attempted. All analyses were conducted by using SAS (version 9.2 for Windows; SAS Institute, Cary, NC). Asthma exacerbation occurred in 32.5% and 23.8% of persistent asthma patients whose EOS was ≥400 mm 3 and those whose EOS was <400 mm 3 , respectively. The covariates that remained in the final models were age, gender, race/ethnicity, BMI, history of asthma exacerbation, GINA step-care level, ≥7 SABA canisters dispensed and EOS. Table 4-2 displays the estimated risk ratios of asthma exacerbation in subsequent year from both models, when the exposure variable, EOS was categorized as ≥400 mm 3 and <400 mm 3 . Compared to patients with EOS <400 mm 3 , patients with elevated EOS values (EOS ≥400 mm 3 ) were at an increased risk of developing asthma exacerbation in 71 subsequent year. Please note that the estimated risk ratios of asthma exacerbation from the two models were identical at the 2 nd decimal point and the confidence intervals were quite comparable. The largest difference in point estimates occurred to “Asian vs. non- Hispanic whites” with the 95% confidence interval from the log-Binomial model covering one while the one from robust Poison did not (highlighted in yellow in Table 4- 2). The results that were based on the robust Poisson model were published (Zeiger et al. 2014). Although the truth remains unknown in this real-life study, the almost identical results from the robust Poisson and the log-binomial models increased my confidence on the robustness of the results. The models were probably correctly specified and the impact of outliers was minimal, even if outliers did exist. 72 Table 4-2. Comparison of robust Poisson and log-binomial models in estimating risk ratio (RR) of asthma exacerbation in subsequent year Patient demographics and characteristics in baseline year Robust Poisson Adjusted RR (95% CI) Log-binomial Adjusted RR (95% CI) Blood eosinophil counts (EOS) ≥ 400 mm 3 vs. < 400 mm 3 1.22 (1.04-1.41) 1.22 (1.06-1.41) Female 1.24 (1.06-1.46) 1.20 (1.02-1.40) Race/ethnicity Black vs. non-Hispanic whites Asian vs. non-Hispanic whites 1.33 (1.11-1.59) 1.30 (1.02-1.67) 1.32 (1.11-1.57) 1.24 (0.98-1.56) BMI 30 or higher vs. <30 1.16 (0.77-1.75) 1.14 (0.76-1.73) History asthma exacerbation 2.49 (2.16-2.87) 2.50 (2.17-2.88) GINA step-care level Levels 4-5 vs. levels 1-2 Level 3 vs. levels 1-2 1.27 (1.06-1.53) 1.12 (0.92-1.37) 1.23 (1.03-1.48) 1.07 (0.88-1.31) SABA canisters dispensed 7 or more vs. < 7 1.18 (1.03-1.35) 1.15 (1.01-1.31) *Variables adjusted in the final models were gender, race/ethnicity, BMI, history of asthma exacerbation, GINA step-care level and SABA canisters dispensed. 73 Chapter 5. Conclusions and Future Research Directions Conclusions Outliers and model misspecifications frequently occur in practice. Outliers could be introduced by measurement, reporting, data collection or data entry errors. Misspecifications of GLM models could be the result of violation of linear predictors (e.g. truncation of the distribution due to the minimum or maximum value allowed for a scale or a laboratory measure), or the improper application of the ‘log’ link function to model underlying response data that do not follow a log distribution. The simulation studies showed that in some scenarios the impact of the outliers and model misspecification can be large. Unfortunately, robust methods to identify outliers or powerful tests to detect model misspecifications are not available yet for log-binomial models. Therefore, the choice of robust Poisson models is a wise alternative to avoid the potential biases that could be produced with the blind application of a log-binomial model. Given the lack of robustness of log-binomial models when models are mis- specified or when datasets contain outliers, a robust Poisson model should be considered as the preferred choice for estimating risk ratios. This is especially the case when the prevalence of the outcome is low or the models are complicated (e.g. containing higher order terms of a continuous independent variable). Due to concern about lack of efficiency of the robust Poisson model for small samples, a log-binomial model may be considered as an alternative when the sample size is small. My recommendation is to use a robust Poisson model when the sample size is large, and when the sample size is small, 74 perform both models and compare the results. If the point estimates of the two models are comparable, use the results from the log-binomial model because the confidence intervals from the log-binomial model tend to be slightly narrower. Otherwise, use the results from the robust Poisson model. The design of the simulation studies have a few limitations. First, only records with very low or very high probabilities of having the outcome could become outliers. The impact of the outliers generated by the “flipping” approach could be more significant compared to studies with outliers that were distributed differently. Second, the distribution of the contaminated datasets had “shifted means” and the magnitude of the shift could be as large as 40% when the outcome was low (e.g. 10%). Subsequently, the impact of outliers demonstrated in the simulation may not be exhibited in situations in which the means remain the same. Third, the complex forms between the confounder and the outcome were generated by a quadratic equation. It is not clear whether or not these findings can be generalized to other situations, such as more complex non-linear relationships or interactions. Finally, demonstration of the model performance when models were mis- specified was established by using datasets where the RR, the relationship between the exposure and the outcome, was fixed at 3.0. It is possible that the differences between the two models could be smaller when the RR is closer to 1. Future Research: A. Development of reliable methods to detect outliers 75 Robust methods to detect outliers, namely high leverage points for logistic regression models, are available in popular statistical software packages (Hosmer and Lemeshow, 2000; Imon and Hadi, 2013; Syaiba and Habshah, 2010). However, similar approaches for log-binomial models are not yet available in commercial software packages, although the adoption of the diagnostic statistics from those of logistic regression models were demonstrated and applied in a real life example (Blizzard and Hosmer, 2006). It is recommended that future studies look into the potential approaches to accurately identify outliers. B. Development of powerful model misspecification tests White proposed an Information Matrix (IM) test in 1982 (White, 1982). Presnell and Boos developed the IOS test, a competitor of IM (Presnell and Boos, 2004). The IOS test is a general-purpose goodness-of-fit test based on the ratio of in-sample and out-of- sample likelihoods (Presnell and Boos, 2004). Capanu and Presnell applied the IOS and IM tests to binomial and beta-binomial models and compared their performance through the use of goodness-of-fit tests and simulations (Capanu and Presnell, 2008). Their findings suggest that both IM and IOS tests were proven to be able to maintain reasonable type I errors across various settings in simulation when they were applied to logistic and beta-binomial regression models (Capanu and Presnell, 2008). However, the power to detect the types of alternatives commonly observed in practice (e.g. alternative link functions) was low (Capanu and Presnell, 2008). Blizzard and Hosmer assessed model-fit of log-binomial models by applying the Hosmer-Lemeshow test (a commonly used goodness-of-fit test for logistic regression models), the Pearson chi-square test, and 76 the unweight sum of squares test. All three tests exhibited acceptable type I errors yet low to moderate power (Blizzard and Hosmer, 2006). Thus, it is of interest to develop powerful model misspecification tests with appropriate size for a more robust comparison of the log-binomial and robust Poisson models. C. Extension of risk ratio estimation to cluster-correlated binary outcome data Comparisons between the log binomial and robust Poisson models based on simulations have so far been limited to independent data. Cluster-correlated data often appear in longitudinal studies whereby information on the same individual is collected repeatedly at multiple time points. Clustering also happens when study subjects are classified into a number of distinct groups or clusters (e.g. patients recruited from the same center for a multi-center study can be considered being in the same cluster). Studies have shown that robust Poisson models can be extended to risk ratio estimation for cluster-correlated binary outcome data though the use of GEE method (Yelland et al., 2011; Zou and Donner, 2011). In these studies it was found that the performance of robust Poisson models was comparable to that of log-binomial models when each was correctly specified, and as long as the total number of clusters is at least 50 (Donner, 2011). It is very likely that the robust Poisson model will continue to be more robust to outliers and model misspecification when the outcome data are cluster-correlated, compared to the log-binomial model. 77 Bibliography Agrawal D. Inappropriate interpretation of the odds ratio: oddly not that uncommon. Pediatrics. 2005;116:1612–1613. Barros AJ, Hirakata VN. Alternatives for logistic regression in cross-sectional studies- an empirical comparison of models that directly estimate the prevalence ratio. BMC Med Res Methodol. 2003;3(21):1–13. Blizzard L, Hosmer DW. Parameter estimation and goodness-of-fit in log binomial regression. Biometrical J. 2006;48(1):5–22. Capanu M, Presnell B. Misspecification Tests for Binomial and Beta-Binomial Models. Stat Med. 2008;27:2536-2554. Carter RE, Lipsitz SR, Tilley BC. Quasi-likelihood estimation for relative risk regression models. Biostatistics. 2005;6: 39–44. Chen W, Petitti DB, Enger S. Limitations and potential uses of census-based data on ethnicity in a diverse community. Ann Epidemiol. 2004;14:339-345 Chen W, Shi J, Qian L, Azen SP. Comparison of robustness to outliers between robust Poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study. BMC Med Res Methodol. 2014;14:82. Deddens JA, Petersen MR, Lei X. Estimation of Prevalence Ratios When PROC GENMOD Does not Converge, In Proceedings at the SAS Users Group International Proceedings: March 30 SUGI28, Seattle, Washington. 2003. Deddens JA, Petersen MR. Approaches for estimating prevalence ratios. Occup Environ Med. 2008;65:501–506. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992;45(6):613-619. Green RH, Brightling CE, McKenna S, Hargadon B, Parker D, Bradding P, Wardlaw AJ, Pavord ID. Asthma exacerbations and sputum eosinophil counts: a randomized controlled trial. Lancet. 2002;360:1715–1721. Greenland S. Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case–control studies. Am J Epidemiol. 2004;160(4):301–305. 78 Greenland S. Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol. 1987;125(5):761–768. Hosmer DW, Lemeshow S. Applied Logistic Regression. Second edition. New York: John Wiley & Sons; 2000. Imon AHMR, Hadi AS. Identification of multiple high leverage points in logistic regression. J Appl Stat. 2013;40(12):2601–2616. Kauermann G, Carroll RJ. A note on the efficiency of sandwich covariance matrix estimation. J Am Stat Assoc. 2001;96:1387–1396. Lumley T, Kronmal RA, Ma S. Relative risk regression in medical research: models, contrasts, estimators and algorithms. UW Biostatistics Working Paper Series 293. http://www.bepress.com/uwbiostat/paper293. Accessed June 20, 2014. Seattle, WA: University of Washington. McCullagh P. Quasi-likelihood functions. Ann Stat. 1983;11:59-67. McCullagh P, Nelder J. Generalized Linear Models, Second Edition. Chapman and Hall/CRC. 1989. McNutt LA, Wu C, Xue X, Hafner JP. Estimating the relative risk in cohort studies and clinical trials of common outcomes. Am J Epidemiol. 2003;157(10):940–943. Miettinen O. Design options in epidemiologic research. An update. Scand J Work Env Hea. 1982;8(Suppl 1):7–14.7. Mirjam JK, Cessie SL, Algra A, Vandenbroucke JP. Overestimation of risk ratios by odds ratios in trials and cohort studies: alternatives to logistic regression. Can Med Assoc J. 2012;184(8):895–899. Pan W. Akaike’s information criterion in generalized estimating equations. Biometrics. 2001;57 (1):120–125. Petersen MR, Deddens JA. A comparison of two methods for estimating prevalence ratios. BMC Med Res Methodol. 2008;8(1):9. Petersen MR, Deddens JA. A revised SAS macro for maximum likelihood estimation of prevalence ratios using the COPY method [letter]. Occup Environ Med. 2009;66(9):639. Presnell B, Boos DD. The IOS Test for Model Misspecification. J Am Stat Assoc. 2004;99:216-227. 79 SAS Institute: Software Version 9.2. Cary, North Carolina: SAS Institute; 2008. Schouten EG, Dekker JM, Kok FJ, Le Cessie S, Van Houwelingen HC, Pool J, Vanderbroucke JP. Risk ratio and rate ratio estimation in case-cohort designs: hypertension and cardiovascular mortality. Stat Med. 1993;12:1733–1745. Skov T, Deddens J, Petersen MR, Endahl L. Prevalence proportion ratios: estimation and hypothesis testing. Int J Epidemiol. 1998;27:91–95. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol. 2005;162:199–200. Syaiba BA, Habshah M. Robust logistic diagnostic for the identification of high leverage points in logistic regression model. J Appl Sci. 2010;10(23):2042–3050. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol. 1986;123:174–184. Wedderburn RW. Quasi-likelihood functions, generalized linear models, and the Gauss- Newton method. Biometrica. 1974;61:439-447. White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50:1-26. Yelland LN, Salter AB, Ryan P. Performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data. Am J Epidemiol. 2011;174:984–992. Yu B, Wang Z. Estimating relative risks for common outcome using PROC NLP. Comp Meth Prog Bio. 2008;90(2):179–186. Zeiger RS, Schatz M, Zhang F, Crawford WW, Kaplan MS, Roth RM, Chen, W. Association of Exhaled Nitric Oxide to Asthma Burden in Asthmatics on Inhaled Corticosteroids. J Asthma. 2011;48(1):8-17. Zeiger RS, Schatz M, Li Q, Chen W, Khaty D, Dossage D, Tran TN. High blood eosinophil count is a risk factor for future asthma exacerbations in adult persistent asthma. J Allergy Clin Immunol Pract. 2014;2:741-50. Zeger SL, Liang KP. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42(1):121-130. Zhang J, Yu KF. What’s relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA. 1998;280:1690–1691. 80 Zou GY, Donner A. Extension of the modified Poisson regression model to prospective studies with correlated binary data [published online November 8, 2011]. Stat Methods Med Res. 2011. Zou G. A modified poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702–706. 81 Appendix I ( ( )) ' * ( ) YP XW P β β − 1 11 1 11 12 1 2 22 21 22 2 2 12 ( ) ( ) 1 1( ) ( ) .. ( ) ( ) .. 1 2 ( ) ( ) .. .. .. .. .. .. .. ( ) ( ) ( ) 1 ( ) i n n i k k kn nn n n n p yp p p xx x p yp xx x p p xx x yp p p p β β β β ββ ββ β β β β − − − − = − − 12 11 11 12 1 1 22 12 21 22 2 12 12 ( ) ( ) ( ) ( ) .. 1 1 ( ) 1 2 ( ) 1 ( ) ( ) ( ) ( ) ( ) ( ) .. 1 1 ( ) 1 2 ( ) 1 ( ) .. .. .. .. ( ) ( ) ( ) .. 1 1 ( ) 1 2 ( ) 1 ( ) n n ii n n n ii n n k k kn ii n p pp yp xx x pp p p p yp pp xx x p pp p p pp x x x pp p β ββ β ββ β β ββ ββ ββ β β ββ ββ β − −− − − −− − = −− − 2 ( ) .. ( ) ( ) nn n yp p β β β − 1 1 2 1 1 ( ( )) (1 ( )) ( ( )) (1 ( )) .. ( ( )) (1 ( )) n ii i i i n ii i i i n ii ik i i yp x p yp x p yp x p β β β β β β = = = − ⋅ − − ⋅ − = − ⋅ − ∑ ∑ ∑ 1 2 ( ) ( ) .. ( ) k β β β β β β ∂ ∂ ∂ ∂ = ∂ ∂ = '( ) l β 82 ' X WX 1 1 11 12 1 11 21 1 2 21 22 2 12 22 2 2 1 2 12 ( ) 1 () .. .. ( ) .. .. 1 ( ) .. .. .. .. .. .. .. .. .. .. .. ( ) 1 ( ) nk n k k k kn n n kn n n p p x xx x xx p x x x xx x p x x x xx x p p β β β β β β − − = − 12 11 12 1 12 11 21 1 12 21 22 2 12 22 2 12 12 12 12 ( ) ( ) ( ) .. 1 ( ) 1 ( ) 1 ( ) .. ( ) ( ) ( ) .. .. 1 ( ) 1 ( ) 1 ( ) .. .. .. .. .. .. .. ( ) ( ) ( ) .. 1 () 1 ( ) 1 ( ) n n n k n n k n n k k kn n p pp xx x pp p xx x p pp xx x xx x pp p p pp x x x pp p β ββ ββ β β ββ ββ β β ββ ββ β −− − −− − = −− − 12 .. .. n n kn xx x 11 1 2 1 11 1 12 11 1 ( ) ( ) ( ) .. 1 ( ) 1 ( ) 1 ( ) .. .. .. .. .. .. .. .. ( ) ( ) ( ) .. 1 ( ) 1 ( ) 1 ( ) nn n ii i i i i i i ki ii i ii i nn n ii i ki i ki i ki ki ii i ii i kxk pp p xx xx xx pp p pp p x x x x xx pp p β β β β β β β β β β β β = = = = = = − − − = − − − ∑∑ ∑ ∑∑ ∑ 22 2 11 1 2 1 22 2 12 ( ) ( ) ( ) ( ) ( ) .. ( ) .. .. .. .. .. .. .. .. ( ) ( ) ( ) ( ) ( ) .. ( ) k k k k k kxk EE E E E E β β β ββ ββ ββ β β β ββ ββ ββ ∂∂ ∂ −− − ∂∂ ∂∂ ∂∂ = ∂∂ ∂ − − − ∂∂ ∂∂ ∂∂ ''( ) El β = − <End of proof> 83 Appendix II Let’s define independent variables xi’=(xi1, xi2,…, xik) to be the row vector comprised of the k covariates of the i-th individual and a binary outcome, i y , with density from the exponential family () ( ; ,) exp ( ,) () yb f y c y a θθ θφ φ φ − = + , where θ is the canonical parameter, φ is the dispersion parameter and functions a( ), b( ) and c( , ) are known for i=1,2,…,n. Let’s further define ( ) x’ i ηβ β = i , where 01 ( , ,..., ) k β ββ β = are regression parameters. The link function ( ) ( ) x’ ii g µ ηβ β = = i links the linear predictors and i µ , where () i i EY µ = , the mean of i y . It can be shown that ( ) '( ) EY b µθ = = and ( ) ( ) ''( ) Var Y a b φθ = . The log-likelihood is given by 1 () ( ) log ( ; ,) ( ,) () n i i i i i yb Ly c y a θθ β θφ φ φ = − = = + ∑ and the score function for MLE is 1 n i ii i j i ii d d θµ β θ µβ = ∂∂ ∂ = ⋅⋅ ∂ ∂∂ ∑ 1 st term above= '( ) () () i i i ii i yb y aa θµ θφ φ ∂− − = = ∂ 2 nd term above= 1 () ''( ) ( ) i ii i da d b Var y θφ µθ = = , because '( ) ii b µθ = and ( ) ( ) ''( ) ii Var y a b φθ = Putting it all together, 1 () () ( ) n ii i i j ii ya a Var y µ φ µ β φ β = ∂− ∂ = ⋅⋅ ∂∂ ∑ 11 ( ) * ( ) nn ii i ii i ii ii ii yy Var y V µµ µ µ β φ µβ = = −∂ − ∂ = ⋅= ⋅ ∂∂ ∑∑ , which is the same as the quasi-score function provided in Chapter II. 84 Appendix III Table III-1. Description of 24 Scenarios for Data Generation β 1=log(RR) P(Y=1) Association between Z and Y: Linear 1 Association between Z and Y: Non-Linear 2 Level of association between Z and X, Z and Y Level of association between Z and X, Z and Y Moderate: α 1= β 2=log(2) Strong: α 1= β 2=log(4) Moderate: α 1= β 2=log(2) Strong: α 1= β 2=log(4) α 0 β 0 α 0 β 0 α 0 β 0 α 0 β 0 log(1.5) 10% -0.5188 -3.0518 -1.0425 -3.5885 -0.5188 -3.2636 -1.0425 -4.0302 25% -2.1349 -2.6714 -2.3473 -3.1136 40% -1.6649 -2.2016 -1.8769 -2.6440 log(2.0) 10% -3.2346 -3.7722 -3.4472 -4.2161 25% -2.3191 -2.8561 -2.5300 -3.3006 40% -1.8485 -2.3863 -2.0618 -2.8298 X is a binary treatment/exposure variable (X=1 for treatment/exposure and X=0 for non-treatment/non-exposure) from a binomial distribution with the probability of X=1 fixed at 50%. Y is a binary common outcome from a population with the probability of Y=1 varying from 10%, 25% to 40%. Z ~ Beta (6,2) For all scenarios: 1 Linear confounder Z: 2 Non-linear confounder Z: 85 Appendix IV . Relative bias (%), standard error (SE) and mean square error (MSE) in the presence of outliers (n=500) Table IV-1. Relative bias (%) in log scale in the presence of outliers (n=500) RR Prob (Y = 1) Contamination Rate Association bet Z and Y: Linear Association bet Z and Y: Linear Association bet Z and Y: Non- Linear Association bet Z and Y: Non-Linear Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 3.1 3.2 4.4 4.4 0.4 0.4 0.1 −0.1 2% −17.3 −15.5 −14.2 −13.8 −28.1 −17.8 −24.7 −17.2 5% −43.0 −35.3 −40.6 −35.1 −51.8 −39.7 −46.8 −39.0 25% 0% 0.0 −0.4 1.3 1.2 3.4 3.4 −0.5 −0.6 2% −9.9 −10.3 −9.5 −10.7 −11.3 −8.8 −18.1 −15.4 5% −25.4 −24.2 −27.0 −27.3 −32.2 −25.3 −43.6 −35.9 40% 0% 0.2 0.1 0.2 0.2 1.0 0.8 1.4 1.2 2% −7.9 −8.2 −8.8 −10.0 −9.6 −9.1 −14.2 −12.2 5% −19.7 −19.6 −22.4 −23.5 −26.1 −23.7 −36.0 −31.2 2.0 10% 0% 2.1 2.1 4.0 4.0 2.4 2.4 2.6 2.7 2% −17.7 −16.6 −15.6 −15.0 −21.5 −14.4 −22.0 −15.4 5% −40.4 −35.6 −39.7 −35.3 −43.8 −35.5 −45.5 −37.0 25% 0% 0.1 0.1 0.3 0.2 0.3 0.3 2.0 1.8 2% −9.7 −9.8 −9.2 −9.7 −12.2 −10.1 −12.3 −10.0 5% −24.0 −22.6 −23.8 −23.4 −29.4 −24.0 −33.1 −26.9 40% 0% −0.9 −0.9 0.6 0.5 0.2 0.2 −0.3 −0.1 2% −7.7 −7.9 −6.7 −7.3 −8.3 −7.7 −11.0 −10.0 5% −17.8 −17.5 −18.2 −18.6 −21.0 −19.1 −26.7 −24.1 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Relative bias was defined as the average of the 1,000 estimated RR in log scale minus the log of the true RR divided by the log of the true RR. 86 Table IV-2. Standard error in log scale in the presence of outliers (n = 500) RR Prob (Y = 1) Contamination Rate Association bet Z and Y: Linear Association bet Z and Y: Linear Association bet Z and Y: Non-Linear Association bet Z and Y: Non-Linear Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 0.27 0.27 0.29 0.29 0.29 0.29 0.28 0.28 2% 0.24 0.24 0.25 0.25 0.26 0.26 0.25 0.25 5% 0.23 0.21 0.23 0.22 0.26 0.23 0.24 0.22 25% 0% 0.17 0.17 0.16 0.16 0.17 0.17 0.16 0.16 2% 0.17 0.17 0.16 0.16 0.16 0.16 0.16 0.16 5% 0.16 0.16 0.15 0.15 0.16 0.15 0.15 0.15 40% 0% 0.11 0.11 0.11 0.11 0.11 0.11 0.11 0.11 2% 0.11 0.11 0.11 0.11 0.11 0.11 0.11 0.11 5% 0.11 0.11 0.11 0.11 0.11 0.11 0.11 0.11 2.0 10% 0% 0.30 0.30 0.30 0.30 0.30 0.30 0.30 0.30 2% 0.27 0.26 0.26 0.25 0.27 0.26 0.26 0.26 5% 0.24 0.23 0.24 0.22 0.25 0.23 0.25 0.23 25% 0% 0.17 0.17 0.17 0.17 0.17 0.17 0.17 0.18 2% 0.16 0.16 0.17 0.17 0.17 0.16 0.17 0.17 5% 0.16 0.15 0.16 0.16 0.16 0.15 0.16 0.16 40% 0% 0.12 0.12 0.12 0.12 0.12 0.12 0.12 0.12 2% 0.12 0.12 0.11 0.12 0.12 0.12 0.12 0.12 5% 0.11 0.11 0.11 0.11 0.12 0.11 0.12 0.12 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Standard error was defined as the empirical standard error of the estimated RR in log scale over all 1,000 simulations. 87 Table IV-3. Mean square error (MSE) in log scale in the presence of outliers (n = 500) RR Prob (Y = 1) Contamination Rate Association bet Z and Y: Linear Association bet Z and Y: Linear Association bet Z and Y: Non- Linear Association bet Z and Y: Non-Linear Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong Level of association bet Z and X, Z and Y: Moderate Level of association bet Z and X, Z and Y: Strong LB RP LB RP LB RP LB RP 1.5 10% 0% 0.073 0.073 0.082 0.082 0.083 0.083 0.077 0.077 2% 0.063 0.061 0.066 0.066 0.082 0.072 0.072 0.067 5% 0.083 0.065 0.079 0.068 0.111 0.079 0.094 0.074 25% 0% 0.028 0.028 0.027 0.027 0.028 0.028 0.026 0.026 2% 0.029 0.029 0.026 0.027 0.028 0.027 0.030 0.028 5% 0.035 0.034 0.034 0.034 0.042 0.034 0.053 0.043 40% 0% 0.013 0.013 0.012 0.012 0.012 0.012 0.013 0.013 2% 0.013 0.013 0.013 0.014 0.013 0.013 0.016 0.015 5% 0.018 0.018 0.020 0.021 0.022 0.020 0.034 0.028 2.0 10% 0% 0.087 0.087 0.089 0.089 0.092 0.092 0.091 0.091 2% 0.088 0.083 0.077 0.075 0.098 0.082 0.092 0.079 5% 0.138 0.113 0.131 0.110 0.156 0.113 0.164 0.119 25% 0% 0.029 0.029 0.029 0.030 0.029 0.029 0.031 0.031 2% 0.031 0.031 0.032 0.032 0.034 0.031 0.036 0.034 5% 0.053 0.049 0.052 0.051 0.066 0.051 0.079 0.061 40% 0% 0.015 0.015 0.014 0.014 0.015 0.015 0.014 0.014 2% 0.017 0.017 0.015 0.016 0.017 0.017 0.020 0.019 5% 0.028 0.027 0.028 0.029 0.035 0.031 0.048 0.041 Association between Z and X always linear LB: Log-Binomial Model RP: Robust Poisson Mean square error was calculated by taking the sum of the squared bias in log scale and the variances. 88 Appendix V. Relative bias (%), standard error (SE) and mean square error (MSE) with and without model mis-specification (n=500) Table V-1. Relative bias a (%) in log scale with and without model misspecification (n=500) a Relative bias was defined as the average of the 1,000 estimated RR in log scale minus the log of the true RR divided by the log of the true RR. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets Unshaded: Models corrected specified Light shaded: Mis-specified linear predictors or mis-specified link function Dark shaded: Mis-specified linear predictors and mis-specified link function LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 1.4 1.8 2.0 2.5 -0.7 1.0 -3.5 1.6 DS2 0.85 3 log 1.6 1.6 0.9 2.1 -1.9 1.9 -5.8 2.1 DS3 0.95 3 log 1.7 2.3 -2.2 0.7 -4.2 1.7 -10.8 0.9 DS4 0.95 2 log 0.3 0.9 -1.1 0.7 -2.7 1.0 -5.7 1.2 DS5 0.95 3 log 1.7 2.3 -2.2 0.7 -4.2 1.7 -10.8 0.9 DS6 0.95 4 log 1.2 2.0 -2.9 1.1 -6.0 2.4 -13.4 2.1 DS7 0.95 3 log 1.7 2.3 -2.2 0.7 -4.2 1.7 -10.8 0.9 DS8 0.95 3 logit -1.4 1.1 -4.3 0.2 -5.5 2.0 -10.6 0.9 DS9 0.95 3 probit -5.1 2.0 -10.0 1.9 -15.4 1.2 -19.8 1.8 89 Table V-2. Standard error a in log scale with and without model misspecification (n=500) a Standard error was defined as the empirical standard error of the estimated RR in log scale over all 1,000 simulations. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 0.30 0.30 0.26 0.27 0.24 0.24 0.20 0.21 DS2 0.85 3 log 0.26 0.27 0.23 0.24 0.22 0.22 0.18 0.19 DS3 0.95 3 log 0.24 0.25 0.21 0.23 0.20 0.21 0.17 0.18 DS4 0.95 2 log 0.13 0.13 0.13 0.13 0.12 0.13 0.11 0.12 DS5 0.95 3 log 0.17 0.17 0.15 0.16 0.14 0.15 0.12 0.13 DS6 0.95 4 log 0.18 0.19 0.15 0.16 0.15 0.16 0.12 0.13 DS7 0.95 3 log 0.24 0.25 0.21 0.23 0.20 0.21 0.17 0.18 DS8 0.95 3 logit 0.20 0.22 0.19 0.20 0.18 0.19 0.16 0.17 DS9 0.95 3 probit 0.29 0.32 0.26 0.28 0.22 0.25 0.20 0.21 90 Table V-3. Mean square error a (MSE) in log scale with and without model misspecification (n=500) a Mean square error was calculated by taking the sum of the squared bias in log scale and the variances. b Maximum P (Y k =1| X = 1, Z 1, Z 2) c Link function used in the model from which the simulated data were generated. DS3, DS5 and DS7 are identical datasets Unshaded: Models corrected specified Light shaded: Mis-specified linear predictors or mis-specified link function Dark shaded: Mis-specified linear predictors and mis-specified link function LB: log-binomial, RP: robust Poisson Dataset Scenario Percentage of Exposed Subjects at Maximum P 0% 1.4% 2.8% 5.8% Max P of exposed b Beta of Z 2 Link Function c LB RP LB RP LB RP LB RP DS1 0.75 3 log 0.090 0.093 0.069 0.072 0.056 0.060 0.041 0.045 DS2 0.85 3 log 0.066 0.071 0.054 0.060 0.047 0.051 0.038 0.038 DS3 0.95 3 log 0.057 0.064 0.046 0.053 0.042 0.046 0.043 0.033 DS4 0.95 2 log 0.036 0.039 0.033 0.036 0.031 0.034 0.029 0.027 DS5 0.95 3 log 0.057 0.064 0.046 0.053 0.042 0.046 0.043 0.033 DS6 0.95 4 log 0.068 0.075 0.049 0.056 0.053 0.056 0.056 0.039 DS7 0.95 3 log 0.057 0.064 0.046 0.053 0.042 0.046 0.043 0.033 DS8 0.95 3 logit 0.042 0.047 0.038 0.040 0.034 0.036 0.041 0.030 DS9 0.95 3 probit 0.092 0.106 0.079 0.082 0.078 0.061 0.086 0.045
Abstract (if available)
Abstract
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Both models are commonly used in medical research including clinical trials. However, the performance of the two models under the presence of outliers and model misspecification has not been well studied. In this dissertation, I first explore the theories behind the models and derive the equations to estimate beta (log of relative risk (RR)) from log-binomial models iteratively. Next, I demonstrate the lack of robustness of the log-binomial models under the presence of outliers and various types of model misspecification using two simulation studies. Finally, to examine the differences in the models in an applied setting, I apply both models to two asthma studies. ❧ The findings of the outlier study indicate that for data coming from a population where the true relationship between the outcome and the continuous covariate contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The findings of the model misspecification study suggest that the point estimates of the log-binomial models can be biased when the underlying data fail to follow a log distribution (either the link function was misspecified or the response depends on covariates through a non-linear combination of predictors). In conclusion, the robust Poisson models are more robust to outliers and model misspecification compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. From this, it is recommended that for large samples one should consider a robust Poisson model. For small to mid-sized samples, users may consider running both models. If the point estimates are comparable, use the results from the log-binomial model because the estimates from the log-binomial models may be more efficient (narrower confidence intervals). When applied to the first asthma study, the two models yielded large differences, with the largest difference in point estimates being 19%. The differences in point estimates in the 2nd asthma study were small. However, the difference changed the interpretation for one of the covariates.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Phase I clinical trial designs: range and trend of expected toxicity level in standard A+B designs and an extended isotonic design treating toxicity as a quasi-continuous variable
PDF
Prohorov Metric-Based Nonparametric Estimation of the Distribution of Random Parameters in Abstract Parabolic Systems with Application to the Transdermal Transport of Alcohol
PDF
An assessment of necrosis grading in childhood osteosarcoma: the effect of initial treatment on prognostic significance
PDF
Attrition in longitudinal twin studies: a comparative study of SEM estimation methods
PDF
Three essays on linear and non-linear econometric dependencies
PDF
Correcting for shared measurement error in complex dosimetry systems
PDF
Nonparametric estimation of an unknown probability distribution using maximum likelihood and Bayesian approaches
PDF
Nonlinear modeling and machine learning methods for environmental epidemiology
PDF
A general equilibrium model for exchange rates and asset prices in an economy subject to jump-diffusion uncertainty
PDF
Assessment of the mortality burden associated with ambient air pollution in rural and urban areas of India
PDF
Essays on econometrics analysis of panel data models
PDF
Model selection methods for genome wide association studies and statistical analysis of RNA seq data
PDF
New methods for asymmetric error classification and robust Bayesian inference
PDF
Modeling graphene: magnetic, transport and optical properties
PDF
Uncertainty quantification in extreme gradient boosting with application to environmental epidemiology
PDF
Age related macular degeneration in Latinos: risk factors and impact on quality of life
PDF
Modeling and simulation of complex recovery processes
PDF
Inverse modeling and uncertainty quantification of nonlinear flow in porous media models
PDF
Chronic eye disease epidemiology in the multiethnic ophthalmology cohorts of California study
PDF
On the interplay between stochastic programming, non-parametric statistics, and nonconvex optimization
Asset Metadata
Creator
Chen, Wansu
(author)
Core Title
Comparing robustness to outliers and model misspecification between robust Poisson and log-binomial models
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Biostatistics
Publication Date
11/10/2017
Defense Date
10/13/2015
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
generalized estimating equation,generalized linear model,iterative reweighted least square,log-binomial regression model,maximum likelihood estimator,model misspecification,modified Poisson regression model,OAI-PMH Harvest,outliers,quasi-likelihood estimator,relative risk,risk ratio,robust Poisson regression model,robustness
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Azen, Stanley P. (
committee chair
), Franklin, Meredith (
committee chair
), Krailo, Mark (
committee member
), Wilcox, Rand R. (
committee member
)
Creator Email
eumgwxc@gmail.com,wansu.chen@kp.org
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-197809
Unique identifier
UC11278368
Identifier
etd-ChenWansu-4029.pdf (filename),usctheses-c40-197809 (legacy record id)
Legacy Identifier
etd-ChenWansu-4029.pdf
Dmrecord
197809
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Chen, Wansu
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
generalized estimating equation
generalized linear model
iterative reweighted least square
log-binomial regression model
maximum likelihood estimator
model misspecification
modified Poisson regression model
outliers
quasi-likelihood estimator
relative risk
risk ratio
robust Poisson regression model
robustness