Observed and underlying associations in nicotine dependence
OBSERVED AND UNDERLYING ASSOCIATIONS IN NICOTINE DEPENDENCE
by Won Ho Lee

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (STATISTICAL GENETICS AND GENETIC EPIDEMIOLOGY)

August 2012
Copyright 2012 Won Ho Lee

DEDICATION
Mom, you are more beautiful and remarkable than I will ever fully grasp.
Dad, I can only hope to be half the man you were.

ACKNOWLEDGEMENTS
David, you never gave up on me. You made this possible.
Jinghua, your perseverance and tenacity are humbling. You brought joy into these years.

TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
INTRODUCTION
  Gender Differences
  Pharmacogenetics of Nicotine Addiction and Treatment Consortium
  Study Design
  Gene and single nucleotide polymorphism selection
  Treatments and Genes
  Biomarkers
  Smoking Cessation
  Conventional Analysis Methods
  Gene x Gene and Gene x Environment Interactions
  Latent Variable Framework
  Nonparametric Approach
  Dissertation Objectives
  Chapter 1 – Conventional Regression – Gender-specific Genetic Associations on Smoking Abstinence
  Chapter 2 – Conventional Regression – Nicotine Metabolizer-specific Genetic Associations on Smoking Abstinence
  Chapter 3 – Nonparametric Latent Variable Framework
  Chapter 4 - Summary
  Introduction References
CHAPTER 1 – GENDER STRATIFIED GENE AND GENE–TREATMENT INTERACTIONS IN SMOKING CESSATION
  Abstract
  Introduction
  Materials and Methods
  Study sample and design
  Statistical analysis
  Results
  Associations with abstinence at EOT and 6-month follow-up
  Discussion
  Chapter 1 References
CHAPTER 2 – DRD1 ASSOCIATIONS WITH SMOKING ABSTINENCE ACROSS SLOW AND NORMAL NICOTINE METABOLIZERS
  Abstract
  Introduction
  Study Sample
  Genotyping procedures
  SNP Analysis
  Results
  Discussion
  Chapter 2
  References
CHAPTER 3 – LATENT VARIABLE FRAMEWORK
  Introduction
  TTURC and PNAT
  Previous Results
  Conventional Regression
  Aims
  Latent Variable Framework
  Nonparametric Latent Variable Framework
  Framework Evaluation
  Parameter Estimation
  Prediction – Conventional Regression
  Prediction – Latent Variable Framework
  Area Under the Curve
  Simulations
  Real Data Prediction
  Conventional Regression
  Latent Variable Framework
  Simulation Results
  K = 2
  K = 3
  K = 7
  Real Data Results
  Latent Cluster Characterization – Full Data
  Prediction Results
  Latent Cluster Characterization - N est
  AUC
  Framework Extensions
  Multiple Variables
  Multiple Latent Clusters
  Conclusions
  Chapter 3 References
CONCLUSION
  Further Applications
  Multiple Latent Clusters and Variables
  Model Selection
  Time-varying Component
  Conclusion References
BIBLIOGRAPHY

LIST OF TABLES
Table 1: TTURC Study Characteristics
Table 2: Male significant gender stratified marginal single nucleotide polymorphisms (SNP) results for abstinence rates
Table 3: Female significant gender stratified marginal single nucleotide polymorphisms (SNP) results for abstinence rates
Table 4: Gender stratified SNP × Tx interaction results at end of treatment
Table 5: Gender stratified SNP × Tx interaction results at 6-month follow-up
Table 6: Interaction between DRD1 and CHRNA5-CHRNA3-CHRNB4 Chromosome 15 nAChR region SNPs and nicotine metabolism rate (slow vs.
normal metabolizers) on abstinence at end of treatment
Table 7a: Conventional Regressions – Univariate effect estimates between smoking abstinence at 6-month follow-up and NMR, CYP2A6, FTND and Gender, adjusted for treatment
Table 7b: Conventional Regression – Joint effect estimates between smoking abstinence at 6-month follow-up and NMR, CYP2A6, adjusted for treatment
Table 7c: Conventional Regression – Univariate effect estimates between NMR and CYP2A6, FTND and Gender
Table 8a: K = 2, AUCs, LC and outcome prediction, for simulated conditions of θ, σ, ρ and β
Table 8b: K = 2, Parameter estimates for simulated conditions of θ, σ, ρ and β
Table 9a: K = 2, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ, β, and γ
Table 9b: K = 2, Parameter estimates for simulated conditions of θ, σ, ρ, β, and γ
Table 10a: K = 3, AUCs, LC prediction, for simulated conditions of θ, σ, ρ and β
Table 10b: K = 3, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ and β
Table 10c: K = 3, Mean NMR for simulated conditions of θ, σ, ρ and β
Table 10d: K = 3, Effect estimates for simulated conditions of θ, σ, ρ and β
Table 11a: K = 3, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ, β, and γ
Table 11b: K = 3, Mean NMR for simulated conditions of θ, σ, ρ, β, and γ
Table 11c: K = 3, Effect estimates for simulated conditions of θ, σ, ρ, β, and γ
Table 12: Latent cluster allocation probabilities, overall and by CYP2A6 and NMR
Table 13: Latent variable framework cluster means
Table 14a: Areas under the curve for a joint regression model without CYP2A6
Table 14b: Areas under the curve for a joint regression model with CYP2A6
Table 15: Average areas under the curve for LV Framework conditions
Table 16: Average areas under the curve, multiple covariates in LC estimation
Table 17: Average areas under the curve, multiple covariates, conventional regression
Table 18: Latent
cluster allocation probabilities and NMR values for an unconstrained number of latent clusters
Table 19: AUC comparison of the unconstrained LV framework to the constrained LV framework and conventional regression

LIST OF FIGURES
Figure 1: Abstinence Rates for rs6702335 (EPB41) and rs806365 (CNR1)
Figure 2: P-values for imputed SNPs in EPB41
Figure 3: P-values for imputed SNPs in nAChR
Figure 4: DRD1 LD plot
Figure 5: DRD1 abstinence rates
Figure 6: NMR distribution by quartiles
Figure 7: Observed smoking abstinence probabilities, by NMR and treatment
Figure 8a: Conventional regression smoking abstinence probabilities, by NMR group, CYP2A6, and treatment
Figure 8b: Observed smoking abstinence probabilities, by NMR group, CYP2A6, and treatment
Figure 9: Latent Variable Framework
Figure 10: Latent Variable Framework incorporating the Dirichlet process and CYP2A6
Figure 11: Latent cluster distributions across NMR distribution
Figure 12: Latent cluster smoking abstinence probabilities, stratified by treatment
Figure 13: Nonparametric framework incorporating CYP2A6 and multiple covariates

ABSTRACT
Tobacco-related morbidity and mortality remain among the costliest worldwide public health issues and the most preventable. Nicotine dependence is the primary factor in the persistence of this problem, leading to ongoing efforts to characterize nicotine dependence. Ongoing efforts have also led to the development of treatments to deal with this addictive drug. Nicotine replacement therapies, such as transdermal and nasal spray treatments, and other drugs, such as bupropion, targeted at neurotransmitters affected by nicotine, have had varying degrees of effectiveness for smoking cessation.
Studies investigating genetic variants involved in the dopamine reward pathway, nicotine metabolism, and related mechanisms and pathways have shown that specific genetic profiles have different levels of nicotine dependence and respond differentially to treatments for nicotine dependence. For example, differential treatment effects for bupropion have been observed for variants within genes involved in the dopamine reward pathway on smoking cessation, with bupropion inhibiting the reuptake of dopamine. Bupropion has also been shown to be an antagonist for the nicotinic acetylcholine receptors, which bind nicotine. These two actions in concert highlight the interplay between treatments and genes. The metabolism of nicotine has also been of particular interest. Specifically, CYP2A6 has been shown to be a primary mechanism in the metabolism of nicotine to cotinine and then 3-hydroxycotinine (3HC). The nicotine metabolite ratio (NMR), the ratio of 3HC to cotinine, has subsequently been shown to be a reliable proxy for nicotine metabolism. Moreover, those carrying the CYP2A6 variant have been shown to have lower NMR levels, and be slow metabolizers, and those wildtype for CYP2A6 have been shown to be normal or fast metabolizers.

This intricate web of dependence, genetic variation, cessation treatments, and nicotine metabolism has led us to develop a latent variable framework that attempts to capture an underlying process that characterizes a nicotine profile. Latent variable models have been used in the social sciences as a means of estimating constructs that synthesize indices of behavior. Indices for smoking may not capture the extent or degree of dependence on their own, but a latent variable, in describing relationships between measured variables, may ascertain otherwise unobservable effects. The potential problem then may be the interpretation of a latent variable.
A hypothetical construct of “nicotine dependence” or “biological pathway X” may or may not be reliable in capturing desired effects. We present a framework that incorporates a Dirichlet process that does not constrain this underlying process to a single distribution, but can flexibly cluster individuals into profiles based on observed biomarker levels, genetic variation and other biological and psychological characteristics. This non- or semiparametric model made up of a mixture of parametric distributions is able to allocate observed measurements into k clusters. It is non- or semiparametric in the sense that each cluster comes from a discrete distribution, but each cluster has a parametric (typically normal) distribution. This provides flexibility in the estimation of some unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards respective cluster effects rather than a grand mean. In doing so, groups can be distinguished from one another, rather than constraining all observations to a single distribution. The association between this latent variable and a smoking related outcome can be estimated, with the eventual purpose of utilizing these risk profiles to predict the outcome of interest.

In order to assess the performance of our framework, we conducted simulations and compared it with conventional regressions on the actual data. Simulations performed as expected. In the real data, latent clusters were similar to the slow, normal and fast nicotine metabolism categories based on NMR levels, regardless of whether we constrained the number of estimated latent clusters to 2 or 3 or allowed the framework to determine the number of latent clusters. This is no surprise given that our latent variable was estimated using NMR.
What was of interest was that, when incorporating CYP2A6, NMR levels had an impact in clustering individuals within those carrying the CYP2A6 variant, but not those wildtype for CYP2A6, suggesting that CYP2A6 and NMR play complementary, but not completely dependent, roles in nicotine metabolism. Moreover, while our framework did not outperform conventional regressions in predicting smoking cessation, it was comparable, with our framework having the advantage of more refined characterization of individuals and dimension reduction when including more variables.

INTRODUCTION
In the U.S., 20% of adults are current smokers, an additional 21% smoked at some point in their lives, and at least 30% of all cancer deaths and nearly 80% of mortalities from lung cancer and chronic obstructive pulmonary disease can be attributed to smoking. 1, 2 The totality of these health effects has resulted in an economic burden of nearly $200 billion per year. 3 These detrimental effects of smoking have motivated research on nicotine dependence. Studies suggest that 50-70% of nicotine dependence may be heritable, 4-8 highlighting the importance of genes in addiction. Independent of a genetic effect, treatments have focused on nicotine addiction as a means of achieving smoking cessation. Available treatments have been nominally successful, but 70-80% of those involved in cessation studies relapse into old smoking behaviors. 9 Thus, there is a natural intersection between genetic effects and treatments for addiction. Recent waves of research have invested resources into pharmacogenetics, where the role of genetic variability in treatment response (e.g., drug metabolism and drug targets) is examined with hopes of tailoring treatments for different genetic profiles.
10-12 The Transdisciplinary Tobacco Use Research Center (TTURC) and Pharmacogenetics of Nicotine Addiction and Treatment (PNAT) Consortium have conducted multi-center randomized clinical trials for smoking cessation examining the efficacy of nicotine replacement therapies (NRT), such as transdermal nicotine or nicotine nasal spray therapies, and sustained-release bupropion in concert with genetic mechanisms and targets with a priori functional evidence. 12

Gender Differences
Earlier efforts to elucidate the variability between women and men in smoking behavior and these effects focused on smoking frequency and inhalation differences. Men smoked with greater frequency and inhaled more deeply, but among regular to heavy smokers, there were no gender differences in cigarette volume and inhalation patterns. 13, 14 This was supported when, among smokers, nicotine blood levels after smoking one cigarette did not differ between women and men. 15

Subsequent studies characterized gender differences in smoking behavior and nicotine addiction through pharmacological and smoking cessation studies. Women reported greater withdrawal symptoms, including higher levels of negative affect (e.g., anxiety, nausea, depression, neuroticism). 16, 17 Also, smoking behavior in women was more strongly reinforced by external social and situational cues, and concern about the “consequences” of smoking cessation (e.g., weight gain, social isolation). 16, 18 In contrast, smoking behavior in men was more reinforced by the nicotine dose, and pharmacological treatments were more effective for men than women. 17, 19, 20 Given the greater sensitivity to the effects of nicotine in men, along with lower relapse rates following nicotine replacement therapies and bupropion, we would anticipate a stronger effect of genes related to the impact of nicotine in men.

These gender differences in response to nicotine may have a physiological basis.
In a study of the effects of sex hormones on nicotine metabolism, females, especially those taking oral contraceptives, metabolized nicotine more quickly than males. 21 Estrogen upregulates CYP2A6 activity and glucuronidation activity, two drug metabolizing activities essential for nicotine metabolism. 22-24

Animal models may help elucidate gender differences in response to nicotine observed in humans. Female rodents reached a threshold tolerance with less nicotine due to greater nicotine sensitivity, and subsequently, did not discriminate as well as male rodents between varying levels of nicotine. 25, 26 This differential response to nicotine arises from the regulation of dopamine through ovarian, not testicular, hormones, 20 with a greater increase in dopamine concentration in female rodents in response to nicotine. 26 Moreover, female mice were less sensitive to the pain and anxiety reducing effects of nicotine, attributable to female sex hormones (progesterone and estradiol) and not male sex hormones (testosterone) acting as nicotinic receptor antagonists. 27

Pharmacogenetics of Nicotine Addiction and Treatment Consortium

Study Design
Individuals were enrolled in one of two randomized clinical trials conducted through the Transdisciplinary Tobacco Use Research Center (TTURC). 12 The first study (bupropion) was a double-blind randomized clinical trial conducted between April 1999 and October 2001 at Georgetown University (Washington, DC, USA) and SUNY Buffalo (New York, USA), where smokers were randomly assigned to the placebo or bupropion treatment arm. The second study (NRT) was an open-label randomized clinical trial conducted between February 2000 and August 2003 at Georgetown University (Washington, DC, USA) and the University of Pennsylvania (Philadelphia, PA, USA), where smokers were randomly assigned to receive one of two nicotine replacement therapies (NRT): transdermal nicotine treatment (i.e., patch) and nicotine nasal spray.
The assigned treatment in the bupropion study (placebo or bupropion) was blinded, but the delivery nature of the two treatments in the NRT study (transdermal and spray) was known to the subjects. Aside from these differences in medication delivery, the studies had similar designs and procedures with subjects in both studies recruited using identical methods, making them directly comparable for analysis. Specific details of each study have been previously described. 12, 28 In brief, after applying exclusion criteria (pregnancy, a history of DSM-IV axis I psychiatric disorder, seizure disorder, current use of antidepressants or any psychotropic medications) to smokers > 18 years of age who reported regular smoking (> 10 cigarettes per day over the past 12 months), 534 subjects in the placebo-controlled bupropion study and 577 subjects in the NRT study consented to treatment and provided a blood sample for genotyping. We limited our analyses to self-identified Caucasians with phenotype and genotype data (n = 412 and 381, respectively) in order to avoid the potential confounding and heterogeneity of effect estimates arising from differential linkage disequilibrium across ethnic groups.

In both studies, potential participants were smokers ≥ 18 years old who reported smoking ≥ 10 cigarettes per day over the prior 12 months. Through a medical and psychiatric screening, slightly different exclusion criteria were applied in each study and are detailed elsewhere. Treatment included 7 group behavioral counseling sessions plus study medication. Participants were instructed to quit smoking on their target quit date (TQD), which occurred 1-2 weeks after pre-quit counseling. Participants in the bupropion study initiated treatment during the first week of the study period. Those in the NRT study began treatment at TQD. In both studies we focused on smoking abstinence at two endpoints: (1) end of treatment (EOT), assessed 8 weeks post-TQD, and (2) 6 months post-TQD.
For both endpoints, those self-reporting smoking abstinence for the 7 days prior to assessment were biochemically verified using saliva cotinine concentrations measured by gas-liquid chromatography. Subjects who self-reported abstinence over that prior week and had cotinine levels ≤ 15 ng/ml were classified as abstinent.

Gene and single nucleotide polymorphism selection
A candidate gene study was carried out as part of the Pharmacogenetics of Nicotine Addiction and Treatment Consortium (PNAT). 12, 28 Utilizing expert opinion, literature searches, and databases of biological pathways, 58 candidate genes were selected for genotyping. These genes were selected for their roles in nicotine metabolism, the brain-reward pathway (e.g., nicotinic acetylcholine receptors, dopamine receptors and transporters), and their secondary role in interactions with the brain-reward pathway. We expanded each gene region 10 kb upstream and downstream, and combined adjacent genes (ANKK1 and DRD2; CHRNA6 and CHRNB3; the nAChR region on chromosome 15 encompassing CHRNA5, CHRNA3, CHRNB4; MAOA and MAOB). Thus, we took 53 gene regions into consideration rather than 58 genes. We performed SNP selection within each gene region with Snagger software, using HapMap SNP data for the CEPH and Chinese populations (Genome Build 35). From the 1295 SNPs selected, 118 were chosen because of their putative function, and 1185 additional SNPs were selected to capture the underlying genetic structure using a previously described algorithm. An additional 233 ancestry informative markers (AIMs) were chosen to distinguish four parental populations (African, European, American Indian, and East Asian) within each subject and to estimate coefficients of ancestry and principal components. In total we genotyped 1528 SNPs.

Treatments and Genes
The choice of therapies and genes was motivated by the effects of nicotine.
The neurochemical effect of nicotine plays a primary role in entrapping individuals in a cycle of smoking dependence. Nicotine, the addictive component of cigarettes, acts swiftly on central nervous system receptors and stimulates the release of neurotransmitters, such as dopamine. 22 These neurotransmitters are involved in the brain reward pathway, resulting in pleasure arousal and enhanced mood. 29, 30 Persistent exposure to nicotine results in the development of tolerance to its effects and a need for increased intake to achieve the same level of reward. 31 Subsequent withdrawal symptoms (e.g., irritability, hunger, anxiety) in the absence of nicotine and their severity are inextricably tied to the extent of dependence. 32

By exploring potential mechanisms of cessation, the following treatments allow for potential elucidation of nicotine addiction across different genetic profiles.

Transdermal Nicotine Therapy (NRT)
Transdermal nicotine therapy has been hypothesized to be effective for smokers with low to moderate nicotine dependence. It acts through a gradual release of nicotine that plateaus after 4-6 hours, minimizing withdrawal symptoms and nicotine seeking behavior over an extended period of time. It would seem effective for lower dependencies where cravings are not acute and that steady release of nicotine is enough to curb occasional and mild flares in smoking behavior. 33, 34

Nicotine Nasal Spray Therapy (NRT)
Nicotine nasal spray therapy has been hypothesized to be effective for smokers with high nicotine dependence. It is a self-administered treatment that leads to a rapid increase in nicotine levels, and mimics the instant rewarding effects of cigarettes. Those with greater nicotine dependence experience more acute cravings that may not be effectively treated by a steady, yet relatively low, infusion of nicotine. Thus, a sharp increase in nicotine levels under severe symptoms may be the most effective means of curbing smoking behavior.
33, 34

Sustained Release Bupropion
Bupropion is an antidepressant medication with both pharmacodynamic (i.e., the physiological and psychological effects of a substance) and pharmacokinetic (i.e., the metabolic processes carried out on a substance) properties. It has a weak pharmacodynamic effect of inhibiting dopamine and norepinephrine reuptake, leading to a greater presence of neurotransmitters and positive psychobiological effects. This effect is consistent with prior studies where bupropion was found to have a beneficial impact on postcessation withdrawal symptoms and mood. Bupropion is metabolized to hydroxybupropion by cytochrome P450 2B6 (CYP2B6), and CYP2B6 is active in nicotine metabolism in the presence of elevated nicotine levels. Accordingly, studies have shown that the effectiveness of bupropion is mediated by different genetic profiles, via pharmacodynamic and pharmacokinetic processes. 34, 35

Dopamine Genes
Several genes involved in dopamine signaling and transport have been studied for their pharmacogenetic effects on nicotine therapies. Neurotransmitters, such as dopamine, are central in the brain-reward process that is affected by addictive substances. Nicotine elicits the release of dopamine. These substances artificially alter levels and the activity of neurotransmitters, making the genes involved in these processes potential targets for these substances. 36

The dopamine D2 receptor (DRD2) gene has a few polymorphisms of interest. In examining the efficacy of nicotine transdermal therapy, carriers of the lower-activity variants (A1 allele for DRD2 Taq1A; Del C allele for DRD2 -141 Ins/Del, rs1799732) were shown to have increased smoking cessation rates compared to those homozygous for normal activity (A2 allele for DRD2 Taq1A; Ins C allele for DRD2 -141 Ins/Del).
8 However, in bupropion studies, those homozygous for normally functioning alleles and receiving bupropion had higher cessation rates compared to those in the placebo group, while there were no treatment effects among low-activity carriers. 37 These genes have been suggested to affect the availability of dopamine D2 receptors. Another DRD2 variant, C957T (rs6277), showed similar results in NRT studies, but no association was observed in bupropion studies. 8 So smokers receiving NRT with fewer available dopamine receptors may have greater cessation rates because fewer neurotransmitter targets need to be appeased, whereas the burden of need for neurotransmitters remains large in smokers with normal availability of receptors. However, smokers receiving bupropion experience an increased presence of neurotransmitters, so those with greater receptor availability (i.e., normal receptor activity) may be able to experience an increased exposure to beneficial effects.

Nicotinic Acetylcholine Receptors
Nicotinic acetylcholine receptors (nAChRs) are located in the central nervous system (CNS) and bind nicotine and the endogenous neurotransmitter acetylcholine (ACh). 38 Once bound, nAChRs play a key role in the release of dopamine, a key neurotransmitter in the brain-reward pathway. Endogenous levels of acetylcholine and the subsequent release of dopamine via nAChRs are homeostatically regulated, but the chronic use of nicotine leads to increased nicotine tolerance and subsequent nicotine dependence through both a desensitization and an increase in the number of these receptors. 39 There are a number of hypotheses to explain the paradoxical state of nAChRs induced by nicotine, where more receptors are present but with decreased sensitivity to the binding of ligands (e.g., nicotine, ACh), 38, 40 but what remains clear is that nAChRs play a key role in nicotine addiction.
Twelve subunits have been identified (α2-α10, β2-β4), with the α4β2 heteromeric subtype being the most abundant in the brain. 38

Molecular biology studies have helped elucidate the role these specific subunits may play. Knock-out mice without a functional β2 subunit had a loss of sensitivity to nicotine, leading to unchanged dopamine levels in the presence of nicotine. Knock-in mice with a hypersensitive α4 subunit were more susceptible to nicotine addiction, requiring lower doses of nicotine to activate dopamine release and subsequent addiction. 39

In vitro studies using cell lines suggest a dual role of bupropion involving nAChRs. Bupropion was found to act as an antagonist for the α3β2 and α3β4 nAChR subunits, competing with nicotine for the binding of these receptor sites. It was also found to reduce the reuptake of dopamine and norepinephrine via neurotransmitter transporters. 41 Taken in concert, bupropion binds to nAChRs targeted by nicotine, thus limiting nicotine induced dopamine release, and reducing the reuptake of neurotransmitters that have been released, leading to beneficial physiological effects.

Cytochrome P450 2A6 and 2B6
Two well-studied genes in the cytochrome P450 enzyme family play key complementary roles in the study of nicotine dependence.

Cytochrome P450 2A6 (CYP2A6), in a pharmacokinetic response to nicotine, accounts for 70-80% of nicotine metabolism via a mechanism that results in the secondary metabolite, cotinine. While other pathways and mechanisms account for this conversion, only CYP2A6 metabolizes cotinine to trans-3’-hydroxycotinine (3HC). Nicotine metabolite ratio (NMR), the ratio of 3HC to cotinine, has been shown to be a reliable index of nicotine metabolism, and more importantly, a biomarker quantifying CYP2A6 activity. 42 CYP2A6 activity has subsequently been associated with smoking cessation and withdrawal symptoms, and shown to be altered in animal models following treatment.
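As a minimal illustration of the ratio just defined, the sketch below computes NMR as 3HC divided by cotinine. The function name, argument names, and the example concentrations are hypothetical and chosen only for illustration; the only requirement is that both concentrations are expressed in the same units.

```python
def nicotine_metabolite_ratio(hydroxycotinine: float, cotinine: float) -> float:
    """Return NMR, the ratio of 3HC to cotinine (same units for both inputs).

    Hypothetical helper for illustration; not the dissertation's code.
    """
    if hydroxycotinine < 0:
        raise ValueError("3HC concentration cannot be negative")
    if cotinine <= 0:
        raise ValueError("cotinine concentration must be positive")
    return hydroxycotinine / cotinine


# Illustrative (made-up) concentrations in ng/ml.
example_nmr = nicotine_metabolite_ratio(hydroxycotinine=90.0, cotinine=300.0)
print(f"NMR = {example_nmr:.2f}")
```

A lower ratio corresponds to slower conversion of cotinine to 3HC, which the text associates with reduced CYP2A6 activity.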
Genetic variants of CYP2A6 conferring higher nicotine metabolism have been associated with increased smoking relapse and worsened withdrawal symptoms. 43 Inactive variants of CYP2A6 have been more difficult to characterize. While slower nicotine clearance and metabolism have been associated with these variants, studies have found them to be associated with both increased and decreased smoking behavior. 44 Finally, monkey models receiving nicotine treatment have decreased levels of CYP2A6 protein. 45 Taken together, even the characterization of one gene involves the interplay of more than 20 variants with other factors and a mechanism of nicotine metabolism that is a cog in the more complex pathway of smoking behavior.

Cytochrome P450 2B6 (CYP2B6) has a pharmacokinetic and potential pharmacodynamic response to nicotine. Nicotine is metabolized to nornicotine in the brain via N-demethylation, primarily through CYP2B6, which leads to the desensitization of nicotine receptors. 42 Smokers had higher levels of CYP2B6 in the brain than non-smokers, suggesting that nicotine induces CYP2B6 expression. 42 It has also been shown to metabolize bupropion to hydroxybupropion, contributing to the effect of bupropion as a weak inhibitor of dopamine and norepinephrine reuptake and effective treatment for nicotine addiction. In clinical trials using bupropion, the inactive variant of CYP2B6 conferred higher relapse rates and increased withdrawal symptoms independent of treatment. 8 To highlight the complex interplay of genetic and treatment effects, an interaction with DRD2 was reported where normal expression of DRD2 conferred increased abstinence and decreased withdrawal symptoms in the presence of bupropion and the inactive variant of CYP2B6.
35

Biomarkers
The literature suggests that the etiology of nicotine dependence can be further elucidated through quantitative and qualitative measurements that reflect pharmacokinetic (i.e., the metabolic processes carried out on a substance) and pharmacodynamic (i.e., the physiological and psychological effects of a substance) responses to nicotine.

Pharmacokinetic – Cotinine
Cotinine serves as a potential biomarker of nicotine dependence because it is a primary metabolite of nicotine. It is estimated that 80% of nicotine is converted to cotinine via CYP2A6. 42 It is preferable to nicotine itself for assessing nicotine metabolism because of its longer half-life (16 hours vs. 2 hours) and the ability to measure it non-invasively (i.e., through saliva). 46 Cotinine is preferable to CPD because CPD does not take into account differences in inhalation severity, whereby two individuals can smoke the same number of cigarettes, but the one more dependent on nicotine may inhale more deeply and/or frequently in an attempt to take in more nicotine. Also, there may be misreporting of CPD. 47

Pharmacokinetic – Cigarettes Per Day (CPD) and Cotinine-CPD Ratio
The number of cigarettes smoked per day (CPD) is the least invasive and most easily obtained measurement of smoking. In drawing associations between CPD and disease outcomes, it has been a robust measure in assessing lung cancer risk. 48 However, differences in nicotine content between cigarettes and the intensity with which one smokes introduce potentially unreliable measurements of nicotine intake when assessing associations with less robust variables of smoking dependence, biomarkers and genetic factors. 47 Discrepancies between CPD and measurements of cotinine levels, especially across races, point to measurements of nicotine and their subsequent biomarkers (cotinine and nicotine metabolite ratios) as more reliable measures of smoking intake.
[49] However, assuming the reliability of CPD measurements, these differences allow for a composite measurement that may also reflect smoking dependence. Namely, the ratio between cotinine and CPD can be used as a “smoking intensity” variable. Smokers with higher cotinine-to-CPD ratios may have a greater dependence on nicotine that leads to deeper and more frequent inhalation of the cigarettes that they do smoke.

Pharmacokinetic – NMR

Nicotine metabolite ratio (NMR) is the ratio of trans-3’-hydroxycotinine (3HC) to cotinine, the secondary and primary metabolites, respectively, of nicotine. Evidence for the role of CYP2A6 in the metabolism of nicotine has been well documented. Other pathways, such as N-oxidation through glucuronidation and the enzymatic activity of flavin monooxygenase, have been shown to metabolize nicotine, and racial differences have highlighted the increased role of CYP2A6 and N-oxidation in non-Hispanic whites. However, CYP2A6 is of primary importance because, despite ethnic differences, it has been shown to account for 70-80% of nicotine metabolism through a mechanism that has received much attention. Cotinine, the primary metabolite of nicotine via CYP2A6 and other mechanisms, is a preferred biomarker of nicotine because it has a longer half-life than nicotine (16 hours vs. 2 hours), making it more feasible to measure.[43] Moreover, CYP2A6 activity can further be quantified by measuring the presence of the secondary metabolite (3HC) that is broken down from cotinine. Although 3HC accounts for ~38% of nicotine metabolites, it is a particularly useful metabolite for two reasons: (1) its half-life is roughly equivalent to that of cotinine, making the ratio relatively constant over time; and (2) the metabolism of cotinine to 3HC by CYP2A6 is the only known mechanism through which 3HC is formed. So while other pathways may account for the metabolism of nicotine to cotinine, NMR is a direct measure of nicotine metabolism through CYP2A6.
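The two ratio-based measures described in this section can be made concrete with a short sketch. It assumes NMR is computed as 3HC divided by cotinine and the “smoking intensity” variable as cotinine divided by CPD; the subject values, function names, and the lowest-quartile cut-point for slow metabolizers are hypothetical choices made only for illustration, not the study's data or code.

```python
import statistics

def nicotine_metabolite_ratio(three_hc, cotinine):
    """NMR taken as 3HC divided by cotinine (same units, e.g. ng/ml)."""
    if cotinine <= 0:
        raise ValueError("cotinine must be positive")
    return three_hc / cotinine

def cotinine_cpd_ratio(cotinine, cpd):
    """Cotinine per cigarette smoked per day (the 'smoking intensity' ratio)."""
    if cpd <= 0:
        raise ValueError("CPD must be positive")
    return cotinine / cpd

# Hypothetical subjects: (id, cotinine ng/ml, 3HC ng/ml, cigarettes per day)
subjects = [
    ("s1", 200.0, 20.0, 10), ("s2", 200.0, 40.0, 20),
    ("s3", 200.0, 60.0, 15), ("s4", 200.0, 80.0, 25),
    ("s5", 200.0, 100.0, 20), ("s6", 200.0, 120.0, 30),
    ("s7", 200.0, 140.0, 10), ("s8", 200.0, 160.0, 40),
]

nmr = {sid: nicotine_metabolite_ratio(hc, cot) for sid, cot, hc, _ in subjects}
intensity = {sid: cotinine_cpd_ratio(cot, cpd) for sid, cot, _, cpd in subjects}

# Illustrative cut-point (an assumption of this sketch): subjects at or below
# the first NMR quartile are labelled slow metabolizers.
q1 = statistics.quantiles(nmr.values(), n=4)[0]
slow = sorted(sid for sid, value in nmr.items() if value <= q1)
```

Because both metabolites are measured in the same sample, the ratio is unit-free, which is part of what makes it convenient as a phenotypic index of CYP2A6 activity.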
[46]

Pharmacodynamic – Withdrawal Symptoms and Fagerström Test for Nicotine Dependence

The genetic and pharmacokinetic profiles of nicotine metabolism paint only a partial picture of the effect of nicotine. Defining nicotine dependence has been an extensive undertaking that has relied on self-reported measures of smoking behavior and patterns, as well as withdrawal symptoms. These pharmacodynamic effects were assessed at baseline throughout the study. We measured withdrawal symptoms through a battery of questions making up an 18-item scale that assesses the effect of abstinence on participants’ eating and sleeping patterns, daily physical functioning, and smoking urges.[12] The Fagerström Test for Nicotine Dependence (FTND) is a validated six-item self-reported metric that assesses two aspects of dependence: (1) the need to restore nicotine levels after waking up and (2) the frequency of smoking throughout the day. It has been utilized to assess these qualitative (e.g., desire and motive in reward-seeking behavior and persistence in this behavior) and quantitative (e.g., frequency of this behavior) aspects of dependence.[50] In doing so, we attempt to capture potential psychosocial contexts or psychological cues that may contribute to dependence.

Smoking Cessation

Smoking cessation was assessed at the end of the treatment period (EOT) and 6 months following the target quit date (week 3 of the study). Abstinence was defined as 7 consecutive days of smoking abstinence prior to follow-up (i.e., point prevalence abstinence), biochemically verified through carbon monoxide levels (< 10 ppm CO). This allows for brief relapses during the follow-up period leading up to the 7 days prior to verification. Cessation at EOT and at 6-month follow-up have slightly different implications.
Genetic effects observed for EOT cessation would suggest a direct influence of nicotine therapies, whereas genetic effects observed for cessation at 6-month follow-up would suggest the role of long-term physiological modulation arising from nicotine therapies.[12]

Conventional Analysis Methods

The wealth of genetic, physiological and psychological data presents a challenge for statistical analysis. The primary interest of this study is the identification of genes that may be associated with smoking cessation, and potential interactions with treatment, biomarkers, and other factors that may help elucidate the nature of this association. Conventional regressions follow the basic form

g(μ_Y) = α + βX + γW + ε

where g(·) is the link function defining the generalized linear model for the mean μ_Y of an outcome (e.g., smoking abstinence, biomarker measurements), and effect estimates (α, β, γ) capture the association between covariates of interest and the outcome, while adjusting for potential confounders. Of particular interest are SNP associations with smoking abstinence and biomarkers. Follow-up analyses within this standard linear regression framework include model selection of genes, genetic interaction analyses with treatment, gender, and biomarkers on smoking cessation, as well as haplotype analyses of promising genetic regions.

Gene x Gene and Gene x Environment Interactions

Genes rarely cause disease independent of other factors. Genetic effects often vary across different environmental conditions or other genetic profiles. In the absence of a priori knowledge regarding potential interactions, two-stage methods have been developed to identify statistically significant interactions, reducing the number of variables that need to be analyzed.[51]

Latent Variable Framework

Latent variable models have been established in the social sciences as a means of estimating constructs that synthesize indices of behavior.
For example, indices of duration, frequency, and quantity of smoking can be synthesized into a construct of “nicotine dependence” that either quantitatively or qualitatively assesses the degree of addiction. The rationale for latent variables is that they can reduce the number of parameters that need to be estimated. To return to the “nicotine dependence” example, one unmeasured variable that aims to capture the effects of multiple measured variables reduces the degrees of freedom that need to be accounted for in a regression framework. A potentially more attractive property of latent variables is that they may capture unmeasured effects that imperfectly measured metrics miss. Indices for smoking may not capture the extent or degree of dependence on their own, but a latent variable, in describing relationships between measured variables, may ascertain otherwise unobservable effects. The potential problem then may be the interpretation of a latent variable. A hypothetical construct of “nicotine dependence” or “biological pathway X” may or may not be reliable in capturing desired effects.[52]

A hierarchical framework to model a latent variable may elucidate the underlying relationship between observed measurements. Following Richardson and Gilks (1993) and Thomas (2007), we approach the estimation of an underlying relationship as a measurement error problem, in which there is an imperfectly measured surrogate, Z, for the true underlying exposure or process, X.[52, 53] The potential advantage of this approach lies in the estimation of the latent variable. The following univariate regressions:
- Y ~ W
- Y ~ Z
and the multivariate regression:
- Y ~ W + Z
may be susceptible to measurement error in W and Z.
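To illustrate why regressing on a surrogate can mislead, the following is a minimal simulation sketch rather than the analysis used in this work. It assumes a continuous outcome with a linear dependence on a true exposure X, and a surrogate Z equal to X plus independent noise (a classical error model, chosen here only because it makes the attenuation easy to see). All names and parameter values are hypothetical.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, beta=1.0, error_sd=1.0, seed=1):
    """Return the slope of Y on the true X and the slope of Y on the surrogate Z."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]          # true exposure X
    z = [xi + rng.gauss(0.0, error_sd) for xi in x]      # surrogate Z = X + error
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]    # outcome depends on X only
    return ols_slope(x, y), ols_slope(z, y)

slope_true, slope_surrogate = simulate()
# Under these assumptions the surrogate slope shrinks toward
# beta * var(X) / (var(X) + error_sd**2), i.e. about half of beta here.
```

With unit variance for both X and the added noise, the slope estimated from Z is roughly half the slope estimated from X, which is the kind of attenuation a measurement error formulation is meant to address.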
In the study of complex biological pathways, biomarkers are imperfect proxies and gene functionality is only partially known for many processes that have not been completely elucidated. Regressing them in univariate models might result in biased or attenuated unadjusted effect estimates, and a joint model would provide marginal effects that account for the other variable but not measurement error in that respective variable. The appeal of this hierarchical approach is the unbiased estimate of the “true exposure”, X, under two assumptions: (1) the relationship between Y and X follows a linear disease model; and (2) X and Z follow a Berkson error structure.[52]

Nonparametric Approach

We modify this framework by incorporating a Dirichlet process. The Dirichlet process (DP) is a “nonparametric” model made up of a mixture of parametric distributions that allocates observed measurements into k clusters.[54, 55] It is non- or semiparametric in the sense that cluster assignments come from a discrete distribution, while each cluster has a parametric (typically normal) distribution. This provides flexibility in the estimation of some unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards respective cluster effects rather than a grand mean. In doing so, groups can be distinguished from one another, rather than constraining all observations to a single distribution.

Dissertation Objectives

Given the clinical data from the TTURC study and genetic data obtained through PNAT, and building upon prior studies, this dissertation will attempt to elucidate risk factors in nicotine dependence and alternative methods to assess these risk factors. Conventional regressions and the development of a latent variable framework will be explored in the following chapters.
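The clustering behaviour of the Dirichlet process described under the Nonparametric Approach can be sketched by drawing from its prior. The example below is an illustrative assumption, not the estimation procedure of this dissertation: it uses the Chinese restaurant process representation with a hypothetical concentration parameter `alpha`, a normal base distribution for cluster effects, and normal within-cluster variation.

```python
import random

def crp_assignments(n, alpha, rng):
    """Assign n observations to clusters one at a time (Chinese restaurant process)."""
    counts = []   # counts[k] = number of observations currently in cluster k
    labels = []
    for i in range(n):
        # An existing cluster k is chosen with probability counts[k] / (i + alpha);
        # a new cluster is opened with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)              # default: open a new cluster
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts

def sample_dp_mixture(n=200, alpha=1.0, base_sd=3.0, within_sd=0.5, seed=7):
    """Draw cluster labels, cluster effects, and observations from the prior."""
    rng = random.Random(seed)
    labels, counts = crp_assignments(n, alpha, rng)
    means = [rng.gauss(0.0, base_sd) for _ in counts]          # cluster effects
    values = [rng.gauss(means[k], within_sd) for k in labels]  # observations
    return labels, counts, means, values

labels, counts, means, values = sample_dp_mixture()
```

The number of clusters is not fixed in advance: it is determined by the draws, and larger values of `alpha` tend to produce more clusters. Each observation is centred on its own cluster effect rather than on a single grand mean.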
Chapter 1 – Conventional Regression – Gender-specific Genetic Associations on Smoking Abstinence

Gender-specific genetic associations with smoking abstinence will be assessed through conventional regression methods. Treatment-specific effects will also be assessed.

Chapter 2 – Conventional Regression – Nicotine Metabolizer-specific Genetic Associations on Smoking Abstinence

Nicotine metabolizer-specific genetic associations with smoking abstinence will be assessed through conventional regression methods. Nicotine metabolite ratio will be used to categorize individuals as slow (lowest NMR quartile) or normal-fast (upper three NMR quartiles) nicotine metabolizers.

Chapter 3 – Nonparametric Latent Variable Framework

A. Simulations

Simulations will be performed using parameters to generate data similar to NMR, CYP2A6, and smoking abstinence. The effect of varying these parameters will be explored in order to determine the utility of this framework and describe the interplay of these variables in estimating underlying processes. Moreover, we will explore the ability of this framework to predict outcomes by estimating the area under the curve.

B. Real Data

The relationship between NMR and CYP2A6, and their respective roles in nicotine dependence, make them natural candidates for analysis within a latent variable framework. Other potentially illuminating covariates, such as gender and the Fagerström Test for Nicotine Dependence, will be incorporated into the framework. Using a nonparametric approach to the latent variable framework, we will attempt to quantify and define an underlying process that predicts smoking abstinence.

Chapter 4 - Summary

Concluding remarks on nicotine dependence research and our latent variable framework, and potential future directions for the development of the framework.

Introduction References

1. State-specific prevalence of current cigarette smoking among adults--United States, 2003.
MMWR Morb Mortal Wkly Rep 2004; 53(44): 1035-1037. 2. CDC. Cigarette smoking among adults--United States, 2002. MMWR Morb Mortal Wkly Rep 2004; 53(20): 427-431. 3. CDC. Smoking-attributable mortality, years of potential life lost, and productivity losses-- United States, 2000-2004. MMWR Morb Mortal Wkly Rep 2008; 57(45): 1226-1228. 4. Heath AC, Martin NG. Genetic models for the natural history of smoking: evidence for a genetic influence on smoking persistence. Addict Behav 1993; 18(1): 19-34. 5. Sullivan PF, Kendler KS. The genetic epidemiology of smoking. Nicotine Tob Res 1999; 1 Suppl 2: S51-57; discussion S69-70. 6. Vink JM, Beem AL, Posthuma D, Neale MC, Willemsen G, Kendler KS, et al. Linkage analysis of smoking initiation and quantity in Dutch sibling pairs. Pharmacogenomics J 2004; 4(4): 274-282. 7. Xian H, Scherrer JF, Madden PA, Lyons MJ, Tsuang M, True WR, et al. The heritability of failed smoking cessation and nicotine withdrawal in twins who smoked and attempted to quit. Nicotine Tob Res 2003; 5(2): 245-254. 8. Lerman CE, Schnoll RA, Munafo MR. Genetics and smoking cessation improving outcomes in smokers at risk. Am J Prev Med 2007; 33(6 Suppl): S398-405. 9. Fiore MC, Jaen CR, Baker TB, Bailey WC, Benowitz N, Curry SJ, et al (2008). Treating Tobacco Use and Dependence: 2008 Update, Clinical Practice Guideline. U.S. Department of Health and Human Services PHS (ed): Rockville (MD). 10. Johnstone EC, Yudkin PL, Hey K, Roberts SJ, Welch SJ, Murphy MF, et al. Genetic variation in dopaminergic pathways and short-term effectiveness of the nicotine patch. Pharmacogenetics 2004; 14(2): 83-90. 11. Jorenby DE, Hays JT, Rigotti NA, Azoulay S, Watsky EJ, Williams KE, et al. Efficacy of varenicline, an alpha4beta2 nicotinic acetylcholine receptor partial agonist, vs placebo or sustained-release bupropion for smoking cessation: a randomized controlled trial. JAMA 2006; 296(1): 56-63. 12. Lerman C, Jepson C, Wileyto EP, Epstein LH, Rukstalis M, Patterson F, et al. 
Role of functional genetic variation in the dopamine D2 receptor (DRD2) in response to bupropion and nicotine replacement therapy for tobacco dependence: results of two randomized clinical trials. Neuropsychopharmacology 2006; 31(1): 231-242. 13. Russell MA, Wilson C, Taylor C, Baker CD. Smoking habits of men and women. Br Med J 1980; 281(6232): 17-20. 14. Waingrow SM, Horn D, Ikard FF. Dosage patterns of cigarette smoking in American adults. Am J Public Health Nations Health 1968; 58(1): 54-70. 15. Russell MA, Jarvis M, Iyer R, Feyerabend C. Relation of nicotine yield of cigarettes to blood nicotine concentrations in smokers. Br Med J 1980; 280(6219): 972-976. 16. Hogle JM, Curtin JJ. Sex differences in negative affective response during nicotine withdrawal. Psychophysiology 2006; 43(4): 344-356. 17. Schnoll RA, Patterson F. Sex heterogeneity in pharmacogenetic smoking cessation clinical trials. Drug Alcohol Depend 2009; 104 Suppl 1: S94-99. 18. Burgess DJ, Fu SS, Noorbaloochi S, Clothier BA, Ricards J, Widome R, et al. Employment, gender, and smoking cessation outcomes in low-income smokers using nicotine replacement therapy. Nicotine Tob Res 2009; 11(12): 1439-1447. 19. Perkins KA, Jacobs L, Sanders M, Caggiula AR. Sex differences in the subjective and reinforcing effects of cigarette nicotine dose. Psychopharmacology (Berl) 2002; 163(2): 194-201. 20. Pogun S, Yararbas G. Sex differences in nicotine action. Handb Exp Pharmacol 2009;(192): 261-291. 21. Benowitz NL, Lessov-Schlaggar CN, Swan GE, Jacob P, 3rd. Female sex and oral contraceptive use accelerate nicotine metabolism. Clin Pharmacol Ther 2006; 79(5): 480-488. 22. Benowitz NL, Hukkanen J, Jacob P, 3rd. Nicotine chemistry, metabolism, kinetics and biomarkers. Handb Exp Pharmacol 2009;(192): 29-60. 23. Harrington WR, Sengupta S, Katzenellenbogen BS. Estrogen regulation of the glucuronidation enzyme UGT2B15 in estrogen receptor-positive breast cancer cells. Endocrinology 2006; 147(8): 3843-3850. 24.
Higashi E, Fukami T, Itoh M, Kyo S, Inoue M, Yokoi T, et al. Human CYP2A6 is induced by estrogen via estrogen receptor. Drug Metab Dispos 2007; 35(10): 1935-1941. 25. Hatchell PC, Collins AC. Influences of genotype and sex on behavioral tolerance to nicotine in mice. Pharmacol Biochem Behav 1977; 6(1): 25-30. 26. Isiegas C, Mague SD, Blendy JA. Sex differences in response to nicotine in C57Bl/6:129SvEv mice. Nicotine Tob Res 2009; 11(7): 851-858. 27. Damaj MI. Influence of gender and sex hormones on nicotine acute pharmacological effects in mice. J Pharmacol Exp Ther 2001; 296(1): 132-140. 28. Conti DV, Lee W, Li D, Liu J, Van Den Berg D, Thomas PD, et al. Nicotinic acetylcholine receptor beta2 subunit gene implicated in a systems-based candidate gene study of smoking cessation. Hum Mol Genet 2008; 17(18): 2834-2848. 29. Balfour DJ. The neurobiology of tobacco dependence: a preclinical perspective on the role of the dopamine projections to the nucleus accumbens [corrected]. Nicotine Tob Res 2004; 6(6): 899-912. 30. Glautier S. Measures and models of nicotine dependence: positive reinforcement. Addiction 2004; 99 Suppl 1: 30-50. 31. Perkins KA. Chronic tolerance to nicotine in humans and its relationship to tobacco dependence. Nicotine Tob Res 2002; 4(4): 405-422. 32. Baker TB, Piper ME, McCarthy DE, Majeskie MR, Fiore MC. Addiction motivation reformulated: an affective processing model of negative reinforcement. Psychol Rev 2004; 111(1): 33-51. 33. Ray R, Schnoll RA, Lerman C. Pharmacogenetics and smoking cessation with nicotine replacement therapy. CNS Drugs 2007; 21(7): 525-533. 34. Ray R, Schnoll RA, Lerman C. Nicotine dependence: biology, behavior, and treatment. Annu Rev Med 2009; 60: 247-260. 35. Lerman C, Shields PG, Wileyto EP, Audrain J, Hawk LH, Jr., Pinto A, et al. Effects of dopamine transporter and receptor polymorphisms on smoking cessation in a bupropion clinical trial. Health Psychol 2003; 22(5): 541-548. 36. Di Chiara G.
Role of dopamine in the behavioural actions of nicotine related to addiction. Eur J Pharmacol 2000; 393(1-3): 295-314. 37. David SP, Strong DR, Munafo MR, Brown RA, Lloyd-Richardson EE, Wileyto PE, et al. Bupropion efficacy for smoking cessation is influenced by the DRD2 Taq1A polymorphism: analysis of pooled data from two clinical trials. Nicotine Tob Res 2007; 9(12): 1251-1257. 38. Tapper AR, McKinney SL, Nashmi R, Schwarz J, Deshpande P, Labarca C, et al. Nicotine activation of alpha4* receptors: sufficient for reward, tolerance, and sensitization. Science 2004; 306(5698): 1029-1032. 39. Marubio LM, Gardier AM, Durier S, David D, Klink R, Arroyo-Jimenez MM, et al. Effects of nicotine in the dopaminergic system of mice lacking the alpha4 subunit of neuronal nicotinic acetylcholine receptors. Eur J Neurosci 2003; 17(7): 1329-1337. 40. Ortells MO, Barrantes GE. Tobacco addiction: a biochemical model of nicotine dependence. Med Hypotheses 2010; 74(5): 884-894. 41. Miller DK, Sumithran SP, Dwoskin LP. Bupropion inhibits nicotine-evoked [(3)H]overflow from rat striatal slices preloaded with [(3)H]dopamine and from rat hippocampal slices preloaded with [(3)H]norepinephrine. J Pharmacol Exp Ther 2002; 302(3): 1113-1122. 42. Yamanaka H, Nakajima M, Fukami T, Sakai H, Nakamura A, Katoh M, et al. CYP2A6 AND CYP2B6 are involved in nornicotine formation from nicotine in humans: interindividual differences in these contributions. Drug Metab Dispos 2005; 33(12): 1811-1818. 43. Lerman C, Tyndale R, Patterson F, Wileyto EP, Shields PG, Pinto A, et al. Nicotine metabolite ratio predicts efficacy of transdermal nicotine for smoking cessation. Clin Pharmacol Ther 2006; 79(6): 600-608. 44. Benowitz NL, Perez-Stable EJ, Fong I, Modin G, Herrera B, Jacob P, 3rd. Ethnic differences in N-glucuronidation of nicotine and cotinine. J Pharmacol Exp Ther 1999; 291(3): 1196-1203. 45. Malaiyandi V, Goodz SD, Sellers EM, Tyndale RF.
CYP2A6 genotype, phenotype, and the use of nicotine metabolites as biomarkers during ad libitum smoking. Cancer Epidemiol Biomarkers Prev 2006; 15(10): 1812-1819. 46. Dempsey D, Tutka P, Jacob P, 3rd, Allen F, Schoedel K, Tyndale RF, et al. Nicotine metabolite ratio as an index of cytochrome P450 2A6 metabolic activity. Clin Pharmacol Ther 2004; 76(1): 64-72. 47. Benowitz NL, Dains KM, Hall SM, Stewart S, Wilson M, Dempsey D, et al. Progressive commercial cigarette yield reduction: biochemical exposure and behavioral assessment. Cancer Epidemiol Biomarkers Prev 2009; 18(3): 876-883. 48. Benowitz NL, Perez-Stable EJ, Herrera B, Jacob P, 3rd. Slower metabolism and reduced intake of nicotine from cigarette smoking in Chinese-Americans. J Natl Cancer Inst 2002; 94(2): 108-115. 49. Perez-Stable EJ, Herrera B, Jacob P, 3rd, Benowitz NL. Nicotine metabolism and intake in black and white smokers. JAMA 1998; 280(2): 152-156. 50. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. 51. Murcray CE, Lewinger JP, Gauderman WJ. Gene-environment interaction in genome-wide association studies. Am J Epidemiol 2009; 169(2): 219-226. 52. Thomas DC. Multistage sampling for latent variable models. Lifetime Data Anal 2007; 13(4): 565-581. 53. Richardson S, Gilks WR. Conditional independence models for epidemiological studies with covariate measurement error. Stat Med 1993; 12(18): 1703-1722. 54. Ferguson TS. A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1973; 1(2): 209-230. 55. Ohlssen DI, Sharples LD, Spiegelhalter DJ. Flexible random-effects models using Bayesian semi-parametric models: applications to institutional comparisons. Stat Med 2007; 26(9): 2088-2112.
CHAPTER 1 – GENDER STRATIFIED GENE AND GENE–TREATMENT INTERACTIONS IN SMOKING CESSATION

Chapter 1 has been previously published in The Pharmacogenomics Journal (2011).

Abstract

We conducted gender-stratified analyses on a systems-based candidate gene study of 53 regions involved in nicotinic response and the brain-reward pathway in two randomized clinical trials of smoking cessation treatments (placebo, bupropion, transdermal and nasal spray nicotine replacement therapy). We adjusted P-values for multiple correlated tests, and used a Bonferroni-corrected α-level of 5×10⁻⁴ to determine system-wide significance. Four SNPs (rs12021667, rs12027267, rs6702335, rs12039988; r² > 0.98) in erythrocyte membrane protein band 4.1 (EPB41) had a significant male-specific marginal association with smoking abstinence (OR=0.5; 95% CI 0.3-0.6) at end of treatment (adjusted P < 6×10⁻⁵). rs806365 in cannabinoid receptor 1 (CNR1) had a significant male-specific gene-treatment interaction at 6-month follow-up (adjusted P=3.9×10⁻⁵); within males using nasal spray, rs806365 was associated with a decrease in odds of abstinence (OR=0.04; 95% CI 0.01-0.2). While the role of CNR1 in substance abuse has been well studied, we report EPB41 for the first time in the nicotine literature.

Introduction

Nicotine, the addictive component of cigarettes, plays a primary role in entrapping individuals in a cycle of tobacco dependence. It acts swiftly on central nervous system receptors and stimulates the release of neurotransmitters (e.g., dopamine). These neurotransmitters are involved in the brain reward pathway, resulting in pleasurable arousal and enhanced mood.[1, 2] Persistent exposure to nicotine results in the development of tolerance to its effects and a need for increased intake to achieve the same level of reward.[1-3] Subsequent withdrawal symptoms (e.g., irritability, hunger, anxiety) in the absence of nicotine and their severity are tied to the extent of dependence.
[3-5] Earlier efforts to elucidate the variability between women and men in smoking behavior and these effects focused on smoking frequency and inhalation differences.[6, 7] These data suggest that men smoke with greater frequency and inhale more deeply. However, among regular to heavy smokers, there were no gender differences in cigarette volume and inhalation patterns.[7] Among smokers, nicotine blood levels after smoking one cigarette did not differ between women and men.[8] Gender-specific characteristics in smoking determinants and cessation have also been characterized. Women reported greater withdrawal symptoms, including higher levels of negative affect (e.g., anxiety, nausea, depression, neuroticism) than men.[9, 10] Smoking behavior in women was more strongly reinforced by external social and situational cues, and concern about the “consequences” of smoking cessation (e.g., weight gain, social isolation).[9, 11] In contrast, smoking behavior in men was reinforced more by nicotine dosage,[12] and pharmacological treatments were more effective for men than women.[10, 12, 13]

Nicotine replacement therapies (e.g., nicotine patch, gum, spray) have been shown to be efficacious regardless of gender, increasing abstinence rates almost two-fold.[14] However, given the greater sensitivity to the effects of nicotine in men, along with lower relapse rates following nicotine replacement therapies and bupropion,[11, 15] we would anticipate a stronger pharmacogenetic effect in men. These gender differences in response to nicotine may have a physiological basis. In a study of the effects of sex hormones on nicotine metabolism, Benowitz et al. showed that females, especially those taking oral contraceptives, metabolized nicotine more quickly than men.[16] Estrogen upregulates CYP2A6 activity[17] and glucuronidation activity,[18] two drug-metabolizing activities essential for nicotine metabolism.
[19] Animal models focusing on gender differences in response to nicotine showed that female rodents reached a threshold tolerance with less nicotine due to greater nicotine sensitivity, and subsequently, did not discriminate as well as male rodents between varying levels of nicotine.[20-22] This differential response to nicotine arises from the regulation of dopamine through ovarian, not testicular, hormones,[13] with a greater increase in dopamine concentration in female rodents in response to nicotine.[21] Moreover, female mice were less sensitive to the pain- and anxiety-reducing effects of nicotine, attributable to female sex hormones (progesterone, estradiol) and not male sex hormones (testosterone) acting as nicotinic receptor antagonists.[23]

These differences motivated a gender-stratified analysis to investigate genes with potentially distinct roles in addiction specific to males and females. We utilized the same phenotype and genotype data from two comparable pharmacogenetic trials of smoking cessation treatment.[24, 25] As previously reported, a number of genetic variants were identified as predictors of cessation and/or therapeutic response in these studies, including the -141C Ins/Del and C957T in the dopamine D2 receptor gene (DRD2),[25] CYP2A6 and CYP2B6,[26-31] a VNTR and SNPs in SLC6A3,[24, 32, 33] and a SNP (rs2072661) within the 3’ UTR of the nicotinic acetylcholine receptor (nAChR) β2 subunit gene (CHRNB2).[24] We present gender-stratified marginal effects and gender-stratified gene-treatment interactions across both clinical trials for 1,198 SNPs in 53 gene regions, in order to identify variants of interest within each gender. SNPs with interesting effects within each gender were also assessed for differences between genders.

Materials and Methods

Study sample and design

We analyzed subjects enrolled in two randomized clinical trials conducted by the University of Pennsylvania Transdisciplinary Tobacco Use Research Center.
[24, 25] The first study (Bupropion) was a double-blind randomized clinical trial comparing the efficacy of bupropion to placebo. This study population and SNPs within the candidate genes presented here were previously investigated for marginal SNP and SNP-treatment effects.[27] The second study (NRT) was an open-label randomized clinical trial comparing transdermal (patch) nicotine replacement therapy (NRT) to nicotine nasal spray NRT. The studies had similar designs, making subjects comparable for analysis. We limited our analyses to individuals who self-identified as white non-Hispanic race/ethnicity (n=411 and 378, respectively) to avoid the potential confounding and heterogeneity of effect estimates arising from differential linkage disequilibrium across ethnic groups.

In both studies, potential participants were smokers ≥ 18 years old who reported smoking ≥ 10 cigarettes per day over the prior 12 months. Through a medical and psychiatric screening, slightly different exclusion criteria were applied in each study and are detailed elsewhere.[25] Treatment included 7 group behavioral counseling sessions plus study medication. Participants were instructed to quit smoking on their target quit date (TQD), which occurred 1-2 weeks after pre-quit counseling. Participants in the Bupropion study initiated treatment during the first week of the study period. Those in the NRT study began treatment at TQD.

In both studies we focused on smoking abstinence at two endpoints: (1) end of treatment (EOT), assessed 8 weeks post-TQD, and (2) 6 months post-TQD. For both endpoints, those self-reporting smoking abstinence for the 7 days prior to assessment were biochemically verified using saliva cotinine concentrations measured by gas-liquid chromatography.[34] Subjects who self-reported abstinence over that prior week and had cotinine levels ≤ 15 ng/ml were classified as abstinent.
[24, 25]

A candidate gene study was carried out as part of the Pharmacogenetics of Nicotine Addiction Treatment Consortium. We investigated previously genotyped SNPs within 53 genes involved in nicotine metabolism and the brain-reward pathway, including nicotinic acetylcholine receptors and dopamine candidate genes (see Supplemental Tables for complete list of genes and SNPs). Selection of the 1,528 SNPs has been described previously.[24, 25, 35, 36] Among the 1,528 total SNPs, we excluded 41 SNPs with a call rate of zero and 57 SNPs with MAFs < 0.01, leaving 1,198 SNPs (118 with putative functional evidence, 1,080 to capture underlying LD structure) within the 53 gene regions for association analysis and 232 ancestry informative markers (AIMs). Analysis of these AIMs using STRUCTURE[37] showed negligible admixture in our self-identified white non-Hispanic study population, confirming their genetic homogeneity. Self-reported gender was verified through 41 SNPs located within X-chromosome genes. We estimated heterozygosity and excluded four individuals who self-reported male but were heterozygous for more than 30% of the X-chromosome SNPs. A complete description of the quality control procedures has been previously published.[24]

Imputation was carried out on selected gene regions using the haplotypes from the European populations (EUR) from the August release of the 1000 Genomes Project[38] and the program IMPUTE2.[39, 40] Ten regions with multiple associated SNPs or a priori evidence of association (EPB41, CHRNB2, CNR1, CHRNA2, the CHRNB3;CHRNA6 region, the ANKK1;DRD2 region, CHRNA7, the chromosome 15 nAChR CHRNA5;CHRNA3;CHRNB4 region, CHRNA4, MAPK1) were targeted and expanded 50 kb up- and downstream to encompass potential SNPs in LD with those lying in the respective regions. Using the 245 SNPs across those regions, we imputed an additional 7,957 SNPs. We excluded 1,716 imputed SNPs with imputation certainty scores < 0.9.
Based on best genotype calls, we excluded 4,232 imputed SNPs with an observed MAF < 0.01, and an additional 7 with a P-value < 0.0001 from an exact test of Hardy-Weinberg proportions, leaving 2,002 imputed SNPs for additional analysis. LocusZoom was used to plot P-values from the analysis of these expanded regions.[41]

Statistical analysis

SNP association: To estimate gender-stratified SNP associations at EOT and 6-month follow-up, we pooled both studies and used logistic regression to estimate odds ratios for marginal SNP effects and SNP x treatment interaction effects. For each SNP we tested either an additive or dominant genetic model obtained from previously reported analyses.[24] The most common genotype served as the referent. We adjusted for study and treatment when estimating marginal effects, and used dummy variables indicating treatment assignment to assess within-treatment effects of bupropion, patch, or spray vs. placebo. For all models we adjusted for age and the Fagerström Test for Nicotine Dependence.[42] For gender-stratified marginal SNP effects we performed a 1-df likelihood ratio test (LRT) within each gender to identify significant SNPs. For gender-specific gene-treatment interaction effects, we performed a 3-df LRT of gene-treatment interaction terms for each gender analysis. We did not test the treatment-specific effects within each gender. Analyses were performed using the R Statistical Program.[43]

Correcting for correlated tests within a gene and determining system-level significance: Marginal 1-df LRT P-values were adjusted to account for the correlation and the number of tests performed across the SNPs within a gene region.[24, 44] The 3-df LRT P-values for the gene-treatment interaction were adjusted using an extension of the approach from Conneely and Boehnke[44] in which observed test statistics were compared to their asymptotic distribution through numerical integration.
We used the correlation of individual contributions to score statistics for the 3-df tests to approximate the correlation structure. This approach performs similarly to computationally intensive permutation tests. 45 P-values are reported for each SNP adjusted for the number of correlated SNPs within each gene region. Adjusted P-values below an α-level of 0.05 were considered significant within a gene region (i.e., region-wide significance). Overall significance was determined using an additional Bonferroni correction across the 53 gene regions and two genders. This gives a system-wide α-level of 0.05/(53×2)≈5×10⁻⁴ to determine the significance of adjusted P-values in four independent tests: marginal effects at EOT, marginal effects at 6-month follow-up, interaction effects at EOT, and interaction effects at 6-month follow-up. As a conservative benchmark, study-wise significance across the 53 independent gene regions, two outcomes of interest, two genders, and two sets of hypotheses tested (gender-stratified marginal tests and gender-stratified SNP-treatment interactions) yields an α-level of 0.05/(53×2×2×2)≈1.2×10⁻⁴.

Results

Our analyses were restricted to subjects who self-identified as, and were confirmed to be of, European ancestry (Table 1).

Table 1 TTURC Study Characteristics
All subjects were confirmed European ancestry.
a Fagerström Test for Nicotine Dependence
| Characteristic | Male, N (%) | Female, N (%) |
|---|---|---|
| N | 386 | 403 |
| Age (mean ± SD) | 46.1 ± 10.8 | 44.8 ± 12.1 |
| Treatment: Placebo | 88 (23%) | 106 (26%) |
| Treatment: Bupropion | 97 (25%) | 120 (30%) |
| Treatment: Patch | 107 (28%) | 81 (20%) |
| Treatment: Spray | 94 (24%) | 96 (24%) |
| FTND a: 0 | 2 (1%) | 6 (1%) |
| FTND a: 1 | 13 (3%) | 22 (5%) |
| FTND a: 2 | 18 (5%) | 28 (7%) |
| FTND a: 3 | 32 (8%) | 37 (9%) |
| FTND a: 4 | 54 (14%) | 55 (14%) |
| FTND a: 5 | 63 (16%) | 70 (17%) |
| FTND a: 6 | 62 (16%) | 69 (17%) |
| FTND a: 7 | 68 (18%) | 58 (14%) |
| FTND a: 8 | 41 (11%) | 40 (10%) |
| FTND a: 9 | 26 (7%) | 14 (3%) |
| FTND a: 10 | 7 (2%) | 4 (1%) |
| FTND a: Mean ± SD | 5.6 ± 2.1 | 5.1 ± 2.2 |
| End of treatment, abstinent: All treatments | 130 (34%) | 105 (26%) |
| End of treatment, abstinent: Placebo | 24 (27%) | 18 (17%) |
| End of treatment, abstinent: Bupropion | 35 (36%) | 35 (29%) |
| End of treatment, abstinent: Patch | 42 (39%) | 25 (31%) |
| End of treatment, abstinent: Spray | 29 (31%) | 27 (28%) |
| 6-month follow-up, abstinent: All treatments | 87 (23%) | 78 (19%) |
| 6-month follow-up, abstinent: Placebo | 16 (18%) | 18 (17%) |
| 6-month follow-up, abstinent: Bupropion | 27 (28%) | 29 (24%) |
| 6-month follow-up, abstinent: Patch | 26 (24%) | 11 (14%) |
| 6-month follow-up, abstinent: Spray | 18 (19%) | 20 (21%) |

Fagerström Test for Nicotine Dependence (FTND) scores for women (5.1, SD=2.2) were lower than those for men (5.6, SD=2.1). Across all treatments, at EOT, 26% of women were abstinent compared with 34% of men (OR=0.7, 95% CI 0.5-0.9). At 6-month follow-up, 19% of women were abstinent compared with 23% of men (OR=0.82, 95% CI 0.6-1.2). However, within the spray treatment arm, a slightly higher proportion of women remained abstinent (OR=1.1, 95% CI 0.5-2.3).

Associations with abstinence at EOT and 6-month follow-up

Gender-stratified SNP associations are reported in Tables 2 and 3. Gender-stratified SNP x treatment associations at EOT and 6-month follow-up are reported in Tables 4 and 5, respectively. Results are presented for males and females for all SNPs with adjusted P-values < 0.05 in either gender-stratified analysis.

Table 2 Male significant gender stratified marginal single nucleotide polymorphism (SNP) results for abstinence rates
All models were adjusted for age, FTND, and treatment. SNPs above the dashed line have adjusted P-values < 0.05 within males. SNPs below the dashed line have adjusted P-values < 0.05 within females.
a P-values adjusted for the number of correlated tests within respective gene regions.
b Adjusted SNP x Gender Interaction P-value < 0.05.

Significant SNPs within Males

| SNP | Gene Region | Chromosome | Male MAF | Male SNP OR (95% CI) | Male Observed P | Male Adjusted P a | Female MAF | Female SNP OR (95% CI) | Female Observed P | Female Adjusted P a |
|---|---|---|---|---|---|---|---|---|---|---|
| End of treatment | | | | | | | | | | |
| rs6702335 b | EPB41 | 1 | 0.39 | 0.46 (0.32-0.64) | 2.30×10⁻⁶ | 3.27×10⁻⁵ | 0.39 | 0.93 (0.67-1.30) | 6.67×10⁻¹ | 1 |
| rs12021667 b | EPB41 | 1 | 0.4 | 0.45 (0.32-0.64) | 2.01×10⁻⁶ | 3.65×10⁻⁵ | 0.39 | 0.96 (0.69-1.34) | 8.15×10⁻¹ | 1 |
| rs12027267 b | EPB41 | 1 | 0.4 | 0.46 (0.33-0.64) | 2.38×10⁻⁶ | 4.04×10⁻⁵ | 0.39 | 0.95 (0.68-1.33) | 7.62×10⁻¹ | 1 |
| rs12039988 b | EPB41 | 1 | 0.39 | 0.47 (0.33-0.65) | 3.40×10⁻⁶ | 5.77×10⁻⁵ | 0.39 | 1.00 (0.72-1.40) | 9.97×10⁻¹ | 9.97×10⁻¹ |
| rs203278 b | EPB41 | 1 | 0.35 | 2.27 (1.44-3.60) | 3.30×10⁻⁴ | 5.57×10⁻³ | 0.34 | 0.79 (0.50-1.25) | 3.20×10⁻¹ | 9.74×10⁻¹ |
| rs150089 | EPB41 | 1 | 0.32 | 1.71 (1.23-2.39) | 1.48×10⁻³ | 2.20×10⁻² | 0.31 | 0.86 (0.61-1.23) | 4.09×10⁻¹ | 9.91×10⁻¹ |
| rs2985322 | EPB41 | 1 | 0.32 | 1.71 (1.23-2.39) | 1.48×10⁻³ | 2.22×10⁻² | 0.31 | 0.87 (0.61-1.24) | 4.38×10⁻¹ | 9.94×10⁻¹ |
| rs578776 | CHRNA3;CHRNA5 | 15 | 0.24 | 1.75 (1.21-2.54) | 2.86×10⁻³ | 2.85×10⁻² | 0.21 | 0.94 (0.63-1.39) | 7.47×10⁻¹ | 9.99×10⁻¹ |
| rs4654390 | EPB41 | 1 | 0.5 | 0.62 (0.45-0.85) | 2.48×10⁻³ | 3.50×10⁻² | 0.49 | 0.98 (0.72-1.34) | 8.97×10⁻¹ | 1 |
| rs10915216 | EPB41 | 1 | 0.5 | 0.62 (0.45-0.85) | 2.68×10⁻³ | 3.72×10⁻² | 0.49 | 0.97 (0.71-1.33) | 8.70×10⁻¹ | 1 |
| 6-month Follow-up | | | | | | | | | | |
| rs6702335 b | EPB41 | 1 | 0.39 | 0.46 (0.31-0.67) | 3.44×10⁻⁵ | 6.97×10⁻⁴ | 0.39 | 1.11 (0.77-1.59) | 5.87×10⁻¹ | 1 |
| rs12027267 b | EPB41 | 1 | 0.4 | 0.46 (0.31-0.68) | 3.89×10⁻⁵ | 7.95×10⁻⁴ | 0.39 | 1.17 (0.81-1.69) | 3.97×10⁻¹ | 9.91×10⁻¹ |
| rs12021667 b | EPB41 | 1 | 0.4 | 0.48 (0.32-0.70) | 8.25×10⁻⁵ | 1.61×10⁻³ | 0.39 | 1.16 (0.80-1.66) | 4.37×10⁻¹ | 9.96×10⁻¹ |
| rs12039988 b | EPB41 | 1 | 0.39 | 0.40 (0.24-0.66) | 2.89×10⁻⁴ | 5.26×10⁻³ | 0.39 | 1.53 (0.89-2.63) | 1.20×10⁻¹ | 7.83×10⁻¹ |
| rs2238687 | FOSB | 19 | 0.14 | 2.17 (1.33-3.55) | 2.42×10⁻³ | 1.48×10⁻² | 0.14 | 1.57 (0.99-2.49) | 6.30×10⁻² | 2.62×10⁻¹ |
| rs11101694 | CALY | 10 | 0.13 | 0.42 (0.21-0.85) | 1.01×10⁻² | 3.74×10⁻² | 0.12 | 0.84 (0.44-1.59) | 5.89×10⁻¹ | 5.89×10⁻¹ |
| rs16837840 | EPB41 | 1 | 0.11 | 2.40 (1.38-4.17) | 2.46×10⁻³ | 3.78×10⁻² | 0.1 | 0.70 (0.35-1.41) | 3.01×10⁻¹ | 9.71×10⁻¹ |

Table 3 Female significant gender stratified marginal single nucleotide polymorphism (SNP) results for abstinence rates
All models were adjusted for age, FTND, and treatment. SNPs above the dashed line have adjusted P-values < 0.05 within males. SNPs below the dashed line have adjusted P-values < 0.05 within females.
a P-values adjusted for the number of correlated tests within respective gene regions.

Significant SNPs within Females

| SNP | Gene Region | Chromosome | Male MAF | Male SNP OR (95% CI) | Male Observed P | Male Adjusted P a | Female MAF | Female SNP OR (95% CI) | Female Observed P | Female Adjusted P a |
|---|---|---|---|---|---|---|---|---|---|---|
| End of treatment | | | | | | | | | | |
| rs4809549 | CHRNA4 | 20 | 0.47 | 1.00 (0.74-1.35) | 9.96×10⁻¹ | 9.96×10⁻¹ | 0.48 | 0.59 (0.42-0.83) | 1.99×10⁻³ | 1.69×10⁻² |
| rs7123797 | ANKK1 | 11 | 0.34 | 0.95 (0.62-1.46) | 8.23×10⁻¹ | 1 | 0.34 | 0.46 (0.29-0.73) | 8.58×10⁻⁴ | 1.86×10⁻² |
| rs4938012 | ANKK1 | 11 | 0.33 | 0.97 (0.63-1.48) | 8.74×10⁻¹ | 1 | 0.33 | 0.47 (0.29-0.74) | 1.09×10⁻³ | 2.33×10⁻² |
| rs17115439 | ANKK1 | 11 | 0.33 | 0.94 (0.61-1.44) | 7.67×10⁻¹ | 1 | 0.33 | 0.47 (0.30-0.75) | 1.45×10⁻³ | 3.02×10⁻² |
| rs4938015 | ANKK1 | 11 | 0.33 | 0.94 (0.61-1.44) | 7.67×10⁻¹ | 1 | 0.33 | 0.47 (0.30-0.75) | 1.49×10⁻³ | 3.10×10⁻² |
| 6-month Follow-up | | | | | | | | | | |
| rs3766927 | CHRNB2 | 1 | 0.31 | 0.86 (0.53-1.39) | 5.30×10⁻¹ | 7.77×10⁻¹ | 0.29 | 0.45 (0.26-0.76) | 2.24×10⁻³ | 1.92×10⁻² |
| rs1127314 | CHRNB2 | 1 | 0.31 | 0.88 (0.62-1.27) | 4.99×10⁻¹ | 7.99×10⁻¹ | 0.29 | 0.53 (0.34-0.82) | 2.48×10⁻³ | 2.07×10⁻² |
| rs2834600 | CLIC6 | 21 | 0.13 | 0.89 (0.51-1.57) | 6.93×10⁻¹ | 9.87×10⁻¹ | 0.14 | 2.29 (1.34-3.91) | 2.86×10⁻³ | 3.08×10⁻² |
| rs17759598 | MAPK1 | 22 | 0.17 | 0.87 (0.55-1.39) | 5.67×10⁻¹ | 9.82×10⁻¹ | 0.14 | 0.44 (0.23-0.83) | 5.60×10⁻³ | 3.71×10⁻² |
| rs2131902 | CHRNB2 | 1 | 0.32 | 0.82 (0.50-1.33) | 4.20×10⁻¹ | 8.24×10⁻¹ | 0.3 | 0.49 (0.29-0.82) | 5.66×10⁻³ | 4.33×10⁻² |

Marginal associations

Within males at EOT, marginal associations achieving system-wide significance were observed for four SNPs
(rs6702335, rs12021667, rs12027267, rs12039988; r²>0.98) in the erythrocyte membrane protein band 4.1 gene (EPB41) (P<6×10⁻⁵). Two of these SNPs (rs6702335, rs12027267) had adjusted P-values that approached system-wide significance with abstinence at 6-month follow-up (P≤8.0×10⁻⁴). Within males, these four SNPs were associated with a decrease in odds of abstinence at EOT (e.g., OR for rs6702335 = 0.46, 95% CI 0.32-0.64). Similar effects were observed at 6-month follow-up for the two SNPs approaching system-wide significance (OR for rs6702335 = 0.46, 95% CI 0.31-0.67). Within females at EOT and 6-month follow-up, these SNPs had no effect, and all four had region-wide significant SNP x gender interactions (adjusted P for test of heterogeneity < 0.05).

Figure 1 Abstinence Rates for rs6702335 (EPB41) and rs806365 (CNR1)
Abstinence rates comparing individuals with one (Heterozygote-GA) and two minor alleles (Variant-GG) for rs6702335 (EPB41) (lighter and unfilled bars, respectively) to those with two major alleles (WT-AA), and those with at least one minor allele (Carrier-TT,TC) for rs806365 (CNR1) (unfilled bars) to those with two major alleles (WT-CC), estimated at each time point (end of treatment and 6-month follow-up). Abstinence rates for rs806365 (CNR1) were stratified by treatment (placebo, bupropion, patch, or spray). The additive and dominant genetic models for rs6702335 and rs806365, respectively, were determined in previously described analyses.

Abstinence rates within males at EOT and 6-month follow-up for each rs6702335 genotype are shown in Figure 1a. At EOT, 41% of males with two major alleles (WT-AA) were abstinent, but only 20% of heterozygotes and 15% of those with two minor alleles (Variant-GG) remained abstinent (OR=0.4, 95% CI 0.2-0.6; OR=0.25, 95% CI 0.1-0.5; respectively).
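To show how abstinence proportions relate to odds ratios, the sketch below computes crude (unadjusted) odds ratios from the rounded EOT abstinence rates quoted above for rs6702335 in males. This is only an illustration: the reported estimates come from logistic models adjusted for age, FTND, study, and treatment, so crude values computed from rounded percentages need not match them exactly. The function names are hypothetical.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) into odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_group, p_reference):
    """Unadjusted odds ratio of a group relative to the referent group."""
    return odds(p_group) / odds(p_reference)


# Rounded EOT abstinence rates in males by rs6702335 genotype (from the text).
p_wt = 0.41       # WT-AA, the referent genotype
p_het = 0.20      # Heterozygote-GA
p_variant = 0.15  # Variant-GG

print(round(crude_odds_ratio(p_het, p_wt), 2))
print(round(crude_odds_ratio(p_variant, p_wt), 2))
```

Both crude values fall well below 1, in line with the direction of the adjusted estimates reported for heterozygotes and minor-allele homozygotes.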
At 6-month follow-up, 33% of males with two major alleles were abstinent, but only 19% of heterozygotes and 10% of those with two minor alleles remained abstinent (OR=0.4, 95% CI 0.3-0.8; OR=0.2, 95% CI 0.1-0.5; respectively).

[Figure 1a: rs6702335 abstinence rates (%) in males by genotype. End of treatment: WT-AA 41.2, Heterozygote-GA 19.5, Variant-GG 15.4. 6-month follow-up: WT-AA 32.6, Heterozygote-GA 18.5, Variant-GG 9.8.]

[Figure 1b: rs806365 abstinence rates (%) in males by treatment, WT-CC vs. Carrier-TT,TC. End of treatment: placebo 22.7 vs. 28.8, bupropion 21.7 vs. 39.7, patch 22.2 vs. 45.6, spray 54.3 vs. 16.9. 6-month follow-up: placebo 18.2 vs. 18.2, bupropion 17.4 vs. 30.8, patch 14.8 vs. 27.8, spray 45.7 vs. 3.4.]

Figure 2 P-values for imputed SNPs in EPB41
Plot of P-values for the male-specific marginal SNP association between genotyped and imputed SNPs in EPB41 and smoking abstinence at EOT. Genotyped SNPs are designated by squares, imputed SNPs are designated by circles. SNPs were imputed using data from the 1000 Genomes Project and IMPUTE2.

Imputation using data from the 1000 Genomes Project lends support to the association between smoking cessation at EOT and EPB41 within males (Figure 2).
[Figure 2 plot: EPB41, main effect, males, EOT. -log10(P-value) and recombination rate (cM/Mb) plotted against position on chr1 (29 to 29.4 Mb); index SNP rs6702335; points shaded by r² with the index SNP; genes shown: OPRD1, EPB41, TMEM200B, SFRS4, MECR.]

For EPB41, along with the four genotyped SNPs showing association (rs6702335, rs12021667, rs12027267, rs12039988; r²>0.98), nine other imputed non-coding SNPs (rs4654328, rs2930833, rs35013556, rs7546832, rs34474391, rs10429865, rs12734106, rs35423185, rs35579088) in LD with our strongest signal (rs6702335, r²>0.80) were strongly associated with smoking abstinence. Two other SNPs lying between OPRD1 and EPB41 (rs61783628, rs1485471) were also strongly associated with smoking abstinence and in moderate LD with rs6702335 (0.6<r²<0.80). The association between rs6702335 and cessation at EOT remained the strongest.
At 6-month follow-up, associations for imputed SNPs within EPB41 were similar to those at EOT.

Four SNPs (rs7123797, rs4938012, rs17115439, rs4938015; r²>0.95) in ankyrin repeat and kinase domain containing 1 (ANKK1), a gene immediately proximal to DRD2 on chromosome 11q, 46 achieved region-wide significance in association with abstinence at EOT within females (OR=0.5, 95% CI 0.3-0.7, adjusted P<0.05). There was no significant SNP x gender interaction among these four SNPs (adjusted P for test of heterogeneity > 0.05).

Several SNPs located in or adjacent to nAChR subunit genes had marginal effects achieving region-wide significance. Within males, rs578776 47 in the CHRNA5-CHRNA3-CHRNB4 gene cluster was associated with abstinence at EOT. Within females, rs4809549 in the nAChR α4 subunit gene (CHRNA4) was associated with abstinence at EOT, and SNPs in ADAR, immediately distal to CHRNB2 (rs3766927, rs1127314, rs2131902; r²=0.94), were associated with abstinence at 6-month follow-up. There was no significant SNP x gender interaction among these SNPs (adjusted P for test of heterogeneity > 0.05).

Table 4 Gender stratified SNP x treatment interaction results at end of treatment
All models were adjusted for age and FTND. SNPs above the dashed line have adjusted P-values < 0.05 within males. SNPs below the dashed line have adjusted P-values < 0.05 within females.
a Observed P-values were obtained using a 3-df test of the gene-treatment interaction effect. P-values were adjusted for the correlation of individual contributions to score statistics for the 3-df tests within respective gene regions.
b Adjusted SNP x Treatment x Gender Interaction P-value < 0.05.
| SNP | Gene Region | Gender | MAF | Placebo SNP OR (95% CI) | Bupropion SNP OR (95% CI) | Patch SNP OR (95% CI) | Spray SNP OR (95% CI) | Test of Interaction a: Observed P | Test of Interaction a: Adjusted P |
|---|---|---|---|---|---|---|---|---|---|
| Significant SNPs within Males | | | | | | | | | |
| rs806365 b | CNR1 | Male | 0.46 | 1.38 (0.44-4.30) | 2.31 (0.77-6.96) | 3.07 (1.11-8.47) | 0.17 (0.07-0.45) | 7.01×10⁻⁵ | 1.05×10⁻³ |
| | | Female | 0.44 | 0.40 (0.14-1.13) | 0.64 (0.28-1.49) | 0.67 (0.25-1.84) | 1.82 (0.63-5.24) | 2.15×10⁻¹ | 7.70×10⁻¹ |
| rs806369 b | CNR1 | Male | 0.31 | 3.53 (1.23-10.09) | 1.33 (0.57-3.10) | 3.11 (1.33-7.27) | 0.30 (0.11-0.79) | 5.56×10⁻⁴ | 7.08×10⁻³ |
| | | Female | 0.3 | 0.35 (0.11-1.06) | 0.87 (0.39-1.93) | 0.68 (0.26-1.80) | 1.83 (0.73-4.57) | 1.29×10⁻¹ | 6.27×10⁻¹ |
| rs10107450 | CHRNA6 | Male | 0.23 | 2.11 (0.81-5.49) | 2.43 (1.02-5.80) | 0.34 (0.14-0.80) | 0.62 (0.25-1.53) | 3.09×10⁻³ | 9.66×10⁻³ |
| | | Female | 0.22 | 0.69 (0.23-2.03) | 0.87 (0.39-1.95) | 0.72 (0.27-1.92) | 0.62 (0.23-1.69) | 9.61×10⁻¹ | 9.61×10⁻¹ |
| rs2072661 | CHRNB2 | Male | 0.26 | 0.26 (0.09-0.74) | 0.33 (0.13-0.80) | 1.30 (0.59-2.85) | 2.14 (0.87-5.24) | 2.00×10⁻³ | 1.68×10⁻² |
| | | Female | 0.22 | 0.71 (0.23-2.20) | 0.54 (0.22-1.29) | 1.35 (0.51-3.57) | 0.37 (0.14-1.00) | 3.02×10⁻¹ | 7.64×10⁻¹ |
| Significant SNPs within Females | | | | | | | | | |
| rs950776 | CHRNB4 | Male | 0.33 | 1.65 (0.64-4.24) | 1.37 (0.56-3.31) | 0.69 (0.31-1.53) | 0.55 (0.23-1.35) | 2.62×10⁻¹ | 7.68×10⁻¹ |
| | | Female | 0.35 | 1.71 (0.58-5.02) | 0.38 (0.17-0.87) | 0.49 (0.18-1.30) | 3.61 (1.34-9.74) | 1.52×10⁻³ | 1.09×10⁻² |
| rs871058 | CHRNA5 | Male | 0.33 | 1.21 (0.47-3.11) | 0.89 (0.37-2.12) | 0.69 (0.31-1.53) | 0.41 (0.17-1.01) | 4.07×10⁻¹ | 9.17×10⁻¹ |
| | | Female | 0.37 | 1.31 (0.45-3.86) | 0.24 (0.10-0.55) | 0.59 (0.22-1.56) | 2.44 (0.95-6.28) | 1.78×10⁻³ | 1.24×10⁻² |
| rs514743 | CHRNA3;CHRNA5 | Male | 0.35 | 1.00 (0.39-2.55) | 1.16 (0.49-2.77) | 0.49 (0.21-1.09) | 0.51 (0.21-1.25) | 3.80×10⁻¹ | 8.98×10⁻¹ |
| | | Female | 0.39 | 1.07 (0.36-3.15) | 0.35 (0.15-0.81) | 0.69 (0.26-1.83) | 3.96 (1.40-11.23) | 2.79×10⁻³ | 1.87×10⁻² |
| rs4275821 | CHRNA5 | Male | 0.35 | 1.25 (0.48-3.24) | 0.89 (0.37-2.12) | 0.71 (0.32-1.57) | 0.47 (0.19-1.16) | 5.14×10⁻¹ | 9.67×10⁻¹ |
| | | Female | 0.38 | 1.10 (0.37-3.23) | 0.27 (0.12-0.62) | 0.54 (0.20-1.44) | 2.58 (0.98-6.79) | 3.49×10⁻³ | 2.29×10⁻² |
| rs3743075 | CHRNA3 | Male | 0.35 | 1.27 (0.49-3.25) | 1.09 (0.46-2.61) | 0.52 (0.23-1.17) | 0.51 (0.21-1.25) | 3.37×10⁻¹ | 8.60×10⁻¹ |
| | | Female | 0.38 | 1.11 (0.37-3.26) | 0.32 (0.14-0.73) | 0.50 (0.19-1.34) | 2.66 (1.01-7.02) | 6.41×10⁻³ | 3.92×10⁻² |

Table 5 Gender stratified SNP x treatment interaction results at 6-month follow-up
All models were adjusted for age and FTND. SNPs above the dashed line have adjusted P-values < 0.05 within males. SNPs below the dashed line have adjusted P-values < 0.05 within females.
a Observed P-values were obtained using a 3-df test of the gene-treatment interaction effect. P-values were adjusted for the correlation of individual contributions to score statistics for the 3-df tests within respective gene regions.
b Adjusted SNP x Treatment x Gender Interaction P-value < 0.05.

| SNP | Gene Region | Gender | MAF | Placebo SNP OR (95% CI) | Bupropion SNP OR (95% CI) | Patch SNP OR (95% CI) | Spray SNP OR (95% CI) | Test of Interaction a: Observed P | Test of Interaction a: Adjusted P |
|---|---|---|---|---|---|---|---|---|---|
| Significant SNPs within Males | | | | | | | | | |
| rs806365 b | CNR1 | Male | 0.46 | 1.06 (0.30-3.75) | 2.05 (0.62-6.78) | 2.41 (0.74-7.84) | 0.04 (0.01-0.21) | 3.76×10⁻⁶ | 3.86×10⁻⁵ |
| | | Female | 0.44 | 0.30 (0.11-0.87) | 0.80 (0.33-1.92) | 0.53 (0.14-1.95) | 1.32 (0.43-4.12) | 2.66×10⁻¹ | 7.34×10⁻¹ |
| rs684513 | CHRNA5 | Male | 0.17 | 1.56 (0.51-4.74) | 0.57 (0.19-1.74) | 4.00 (1.53-10.42) | 0.11 (0.02-0.50) | 7.74×10⁻⁵ | 6.98×10⁻⁴ |
| | | Female | 0.15 | 1.56 (0.51-4.72) | 1.84 (0.71-4.73) | 1.19 (0.31-4.50) | 1.59 (0.55-4.59) | 9.64×10⁻¹ | 9.64×10⁻¹ |
| rs4887069 | CHRNA3;CHRNB4 | Male | 0.2 | 1.83 (0.72-4.61) | 0.62 (0.22-1.72) | 2.63 (1.24-5.55) | 0.17 (0.04-0.61) | 2.83×10⁻⁴ | 2.30×10⁻³ |
| | | Female | 0.17 | 1.22 (0.46-3.27) | 0.91 (0.39-2.15) | 1.51 (0.46-5.01) | 0.69 (0.29-1.66) | 7.19×10⁻¹ | 9.73×10⁻¹ |
| rs938682 | CHRNA3 | Male | 0.2 | 2.10 (0.79-5.57) | 0.62 (0.23-1.72) | 2.24 (1.07-4.67) | 0.17 (0.04-0.61) | 5.43×10⁻⁴ | 4.20×10⁻³ |
| | | Female | 0.16 | 1.26 (0.47-3.36) | 1.33 (0.58-3.06) | 1.08 (0.31-3.79) | 0.75 (0.32-1.78) | 7.88×10⁻¹ | 9.48×10⁻¹ |
| rs637137 | CHRNA5 | Male | 0.2 | 1.85 (0.70-4.93) | 0.67 (0.24-1.86) | 2.38 (1.14-4.97) | 0.17 (0.04-0.61) | 6.30×10⁻⁴ | 4.81×10⁻³ |
| | | Female | 0.16 | 1.54 (0.60-3.98) | 1.33 (0.58-3.06) | 1.17 (0.31-4.43) | 0.73 (0.32-1.71) | 6.51×10⁻¹ | 9.76×10⁻¹ |
| rs3743078 | CHRNA3;CHRNA5 | Male | 0.21 | 1.97 (0.81-4.81) | 0.71 (0.27-1.87) | 2.24 (1.07-4.67) | 0.17 (0.04-0.61) | 6.79×10⁻⁴ | 5.13×10⁻³ |
| | | Female | 0.16 | 1.13 (0.42-3.03) | 1.33 (0.58-3.06) | 1.17 (0.31-4.43) | 0.75 (0.32-1.78) | 8.07×10⁻¹ | 9.59×10⁻¹ |
| rs667282 | CHRNA5 | Male | 0.2 | 1.85 (0.69-4.92) | 0.67 (0.24-1.86) | 2.32 (1.11-4.86) | 0.17 (0.04-0.61) | 7.10×10⁻⁴ | 5.30×10⁻³ |
| | | Female | 0.16 | 1.54 (0.60-3.98) | 1.33 (0.58-3.06) | 1.17 (0.31-4.43) | 0.73 (0.32-1.71) | 6.51×10⁻¹ | 9.76×10⁻¹ |
| rs806369 b | CNR1 | Male | 0.31 | 2.16 (0.97-4.79) | 1.31 (0.64-2.71) | 1.68 (0.82-3.42) | 0.14 (0.03-0.63) | 7.65×10⁻⁴ | 6.80×10⁻³ |
| | | Female | 0.3 | 0.55 (0.23-1.32) | 0.80 (0.43-1.49) | 0.45 (0.13-1.52) | 1.42 (0.63-3.22) | 3.14×10⁻¹ | 7.88×10⁻¹ |
| rs6495309 | CHRNA3;CHRNB4 | Male | 0.19 | 2.19 (0.83-5.81) | 0.67 (0.24-1.86) | 1.98 (0.87-4.53) | 0.17 (0.05-0.61) | 1.37×10⁻³ | 9.26×10⁻³ |
| | | Female | 0.15 | 1.45 (0.54-3.86) | 0.96 (0.40-2.31) | 0.81 (0.21-3.16) | 0.80 (0.33-1.92) | 8.33×10⁻¹ | 9.55×10⁻¹ |
| Significant SNPs within Females | | | | | | | | | |
| rs10423854 b | FOSB | Male | 0.43 | 1.12 (0.35-3.61) | 1.05 (0.41-2.72) | 0.38 (0.15-0.96) | 1.78 (0.57-5.54) | 1.69×10⁻¹ | 4.26×10⁻¹ |
| | | Female | 0.41 | 0.19 (0.06-0.55) | 0.48 (0.20-1.14) | 6.33 (0.76-52.37) | 1.14 (0.40-3.19) | 2.65×10⁻³ | 1.34×10⁻² |
| rs950776 | CHRNB4 | Male | 0.33 | 1.28 (0.43-3.82) | 1.22 (0.47-3.16) | 0.75 (0.30-1.85) | 0.73 (0.26-2.06) | 7.77×10⁻¹ | 8.67×10⁻¹ |
| | | Female | 0.35 | 3.29 (1.00-10.88) | 0.41 (0.17-0.97) | 0.38 (0.10-1.44) | 2.46 (0.85-7.11) | 3.42×10⁻³ | 2.11×10⁻² |
| rs7178270 | CHRNA3;CHRNB4 | Male | 0.37 | 0.97 (0.33-2.88) | 1.45 (0.53-3.94) | 0.79 (0.31-1.97) | 1.11 (0.39-3.14) | 8.46×10⁻¹ | 9.29×10⁻¹ |
| | | Female | 0.42 | 2.83 (0.75-10.66) | 0.31 (0.13-0.75) | 0.25 (0.07-0.97) | 1.64 (0.56-4.76) | 4.24×10⁻³ | 2.56×10⁻² |
| rs7543174 b | CHRNB2 | Male | 0.14 | 0.31 (0.06-1.48) | 2.20 (0.80-6.06) | 1.53 (0.57-4.12) | 1.71 (0.56-5.23) | 1.37×10⁻¹ | 3.35×10⁻¹ |
| | | Female | 0.18 | 4.07 (1.41-11.75) | 0.76 (0.30-1.93) | 0.15 (0.02-1.22) | 0.92 (0.31-2.69) | 7.61×10⁻³ | 3.14×10⁻² |
| rs7539745 b | CHRNB2 | Male | 0.13 | 0.61 (0.16-2.37) | 3.10 (1.08-8.91) | 1.37 (0.49-3.83) | 1.81 (0.59-5.59) | 2.80×10⁻¹ | 4.55×10⁻¹ |
| | | Female | 0.15 | 4.54 (1.56-13.25) | 0.65 (0.24-1.79) | 0.19 (0.02-1.54) | 1.18 (0.40-3.49) | 7.55×10⁻³ | 3.17×10⁻² |

Gene x treatment interactions

Within males at 6-month follow-up, a gene x treatment interaction achieving system-wide significance was observed for rs806365 in the cannabinoid receptor 1 gene (CNR1) (adjusted P=3.9×10⁻⁵; Table 5). The odds ratio for rs806365 within the placebo arm was 1.06 (95% CI 0.30-3.75), while the odds of abstinence increased more than two-fold in the bupropion and patch arms (OR=2.05, 95% CI 0.62-6.78; OR=2.41, 95% CI 0.74-7.84; respectively). Most notable, however, was the 25-fold decrease in odds of abstinence within the spray arm (OR=0.04, 95% CI 0.01-0.21). Estimates of effect were consistent at EOT for rs806365 within males, with an adjusted interaction P-value near system-wide significance (P=0.001).

Gene x treatment interaction analysis also identified rs806369 within CNR1. Within males, rs806369 had adjusted interaction P-values that reached region-wide significance at EOT and 6-month follow-up (adjusted Ps=0.007). These interactions were driven mostly by decreases in odds within the spray arm (OR=0.30, 95% CI 0.11-0.79; OR=0.14, 95% CI 0.03-0.63, respectively). For both cessation endpoints, rs806365 and rs806369 (r²=0.5) had region-wide significant SNP x treatment x gender interactions (adjusted P for test of heterogeneity < 0.05). Their effects within females were in the opposite direction from those in males, with increased odds of abstinence within the spray arm for both cessation endpoints. Imputation of SNPs 50 kb up- and downstream of CNR1 with 1000 Genomes Project data did not identify other SNPs with stronger SNP x treatment interactions, or SNPs in LD with our strongest signal in this region (rs806365).
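The significance levels invoked above follow directly from the Bonferroni corrections defined in the statistical analysis. The short sketch below reproduces that arithmetic and compares the adjusted interaction P-values reported for rs806365 in males (Tables 4 and 5) against each threshold. The variable and function names are illustrative, and the region-wide adjustment for correlated SNPs is taken as already applied to the P-values.

```python
ALPHA = 0.05
N_GENE_REGIONS = 53
N_GENDERS = 2
N_OUTCOMES = 2         # EOT and 6-month follow-up
N_HYPOTHESIS_SETS = 2  # marginal tests and SNP x treatment interactions

# System-wide level: Bonferroni across gene regions and genders.
system_wide_alpha = ALPHA / (N_GENE_REGIONS * N_GENDERS)

# Study-wise benchmark: additionally across outcomes and hypothesis sets.
study_wise_alpha = ALPHA / (
    N_GENE_REGIONS * N_GENDERS * N_OUTCOMES * N_HYPOTHESIS_SETS
)


def significance_level(adjusted_p):
    """Classify a region-adjusted P-value by the strictest level it meets."""
    if adjusted_p < study_wise_alpha:
        return "study-wise"
    if adjusted_p < system_wide_alpha:
        return "system-wide"
    if adjusted_p < ALPHA:
        return "region-wide"
    return "not significant"


# Adjusted interaction P-values for rs806365 within males.
print(f"{system_wide_alpha:.2e}", f"{study_wise_alpha:.2e}")
print("6-month follow-up:", significance_level(3.86e-5))
print("EOT:", significance_level(1.05e-3))
```

Under these thresholds the 6-month interaction for rs806365 clears both the system-wide and the study-wise level, whereas the EOT interaction is region-wide significant only.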
Treatment-stratified abstinence rates within males for rs806365 carriers and non-carriers are shown in Figure 1b. For both study endpoints within the bupropion and patch arms, males with at least one minor allele (Carrier-TT,TC) had a roughly two-fold increase in odds of abstinence. Within the placebo arm, those with at least one minor allele had slightly increased odds of abstinence at EOT, but this disappeared at 6-month follow-up. However, overall abstinence rates decreased in these three treatment arms from EOT to 6-month follow-up. Within the spray arm, the proportion of males abstinent remained relatively consistent for males with two major alleles (WT-CC) (54% vs. 46%), but the proportion of abstinent males with at least one minor allele decreased (17% to 3%).

Three SNPs in MAPK1 (rs9610375, rs1063311, rs2266966) achieved region-wide significant treatment interactions within females at EOT. Within males at 6-month follow-up, another five SNPs in MAPK1 (rs743409, rs1892848, rs3788332, rs5999550, rs11704205) achieved region-wide significance for SNP x treatment and SNP x treatment x gender interactions (adjusted P for test of heterogeneity < 0.05).

A number of nAChR SNPs achieved region-wide significance for gender-stratified gene x treatment interaction effects, including two that have been published previously (rs3743075, 47 rs2072661 24 ). Also, within males at 6-month follow-up, rs684513 48 (chromosome 15 nAChR region) had an adjusted gene x treatment interaction P-value that approached system-wide significance (P=0.0007). Within females, rs950776 had significant gene x treatment interactions at EOT and 6-month follow-up (adjusted P≤0.02). Other previously reported SNPs within the nAChR chromosome 15 region (rs1051730, rs1317286, rs578776) 4, 47, 49 had null marginal and interaction effects.
While two other previously reported SNPs (rs12914385, rs16969968) 47, 49 were not genotyped in this study, rs16969968 is in high LD (r²>0.95) with rs1051730 and rs1317286.

Discussion

Reported gender differences in smoking behavior and cessation motivated this gender-stratified analysis in two pharmacogenetic smoking cessation trials. Previous research identified SNPs and genes within the two separate clinical trials comprising this study. 24-28, 30-32, 50 This analysis builds upon those results by pooling both studies to look within genders across all four treatments and 53 candidate gene regions. Instead of focusing on estimated heterogeneity between genders, our aim was to identify SNPs with effects specific to males and/or females.

Gender-stratified SNP marginal effects revealed a male-specific association between a region of high LD within EPB41 and smoking abstinence at EOT and 6-month follow-up. Within males, we identified four non-coding SNPs in EPB41 (rs6702335, rs12021667, rs12027267, rs12039988) in strong LD (r²>0.98). rs12021667 lies in the 5'-UTR, rs12039988 in the 3'-UTR, and rs12027267 and rs6702335 both in intronic regions. These SNPs achieved system-level significance in association with smoking abstinence at EOT (adjusted P<6×10⁻⁵). At 6-month follow-up, rs6702335 and rs12027267 approached system-level significance (P≤8.0×10⁻⁴) while rs12021667 and rs12039988 achieved region-level significance (adjusted Ps=0.002 and 0.005, respectively). In males, these SNPs were associated with a more than two-fold decrease in odds of abstinence at both cessation endpoints.

Erythrocyte membrane protein band 4.1 (EPB41), known as protein 4.1R, is critical for red blood cell morphology and membrane function.
51-54 This protein and its homologues have not been associated with nicotine dependence and smoking, but they were shown to stabilize the localization of dopamine receptors to the plasma membrane, 55 and protein 4.1R had a specific organizational role in the arrangement of postsynaptic molecules. 54 Our data suggest that genetic variation in EPB41 has gender-specific effects on smoking abstinence, potentially mediated through differential effects on the localization or function of dopamine receptors and further downstream effects on the brain reward pathway.

EPB41 is adjacent to delta opioid receptor 1 (OPRD1), which is part of the family of opioid receptor genes associated with substance abuse. 56 Using HapMap SNP genotypes from the CEPH population (Release #28, NCBI Build 36) 36 and Haploview, 57 we found that none of the SNPs in EPB41 that achieved system-wide significance are in LD with any SNPs in OPRD1 (r²<0.35). However, a SNP in EPB41 achieving region-wide significance (rs16837840) is in moderate LD (r²=0.6) with a SNP in OPRD1 (rs12404612). After imputation of EPB41, rs12404612 (certainty score > 0.9) had male-specific marginal associations with abstinence at EOT and 6-month follow-up (P=0.02 and 0.03, respectively). Interestingly, other imputed SNPs in OPRD1 with imputation certainty scores < 0.9 showed strong associations with smoking abstinence.

Gender-stratified SNP x treatment interaction analyses revealed a male-specific association between smoking abstinence and two SNPs in CNR1 in weak LD (rs806365, rs806369, r²=0.5). rs806365 lies in the 3' flanking region of CNR1, while, owing to multiple CNR1 transcript variants, 58 rs806369 lies either in an intron or in the 5' flanking region. For abstinence at 6-month follow-up, the male-specific interaction between rs806365 and treatment achieved system-level significance (adjusted interaction LRT P=3.9×10⁻⁵).
The effect of this SNP was most prominent in males within the spray arm, where it was associated with a 25-fold decrease in odds of abstinence. Effects at EOT were similar, where rs806365 was associated with a nearly 6-fold decrease in abstinence odds for males in the spray arm (adjusted interaction LRT P=0.001). Males with two major alleles in the spray arm had the highest abstinence rates, and this remained relatively unchanged from EOT to 6-month follow-up; males with at least one minor allele had the lowest abstinence rates across all males. The relationship between cannabinoid receptor 1 (CNR1) and addictive behaviors has been well-characterized. 58-61 Along with its primary role in mediating the effects of marijuana, 58 it modulates dopamine release in response to other substances. 62 The CNR1 antagonist, rimonabant, was effective in suppressing smoking relapse and attenuating reward seeking behaviors during abstinence. 59, 60 There was also a female-specific association between SNPs and haplotypes in CNR1 and both nicotine dependence and smoking initiation. 63 Although there is no overlap of SNPs investigated in that study with our current investigation, both point to gender-specific associations of smoking behaviors with CNR1. Our analysis suggests that gene associations are not only gender-specific but treatment- and gender-specific. Of the treatments administered, the nasal spray most closely mimics the effects of smoking, 50 introducing a sharp, rapid increase in nicotine levels. 64 Since males are more responsive to nicotine dosage and pharmacological reinforcement by nicotine, they may be more sensitive to the effects of the spray. Within the spray arm, males with two major alleles for rs806365 and rs806369 had higher abstinence rates than those with at least one minor allele. Opposite and variable effects of these SNPs in females may be attributed to gender differences in the pharmacological effects of nicotine. 
12, 65

It is worth highlighting that for both marginal and interaction effects, no SNPs achieving region- or system-wide significance within one gender stratum achieved significance within the other gender stratum. For marginal effects, there was no overlap in gene regions between genders, with male-specific significant SNPs in EPB41, and female-specific SNPs in ANKK1 and the α4 and β2 nAChR subunit genes. Across both abstinence outcomes, gene x treatment interactions identified associations across women and men in the nAChR subunit regions and MAPK1. This consistency in gene regions adds evidence of their putative roles in nicotine dependence and smoking behavior. 47, 49, 66-71 These regions, along with the others without overlap, have shown prior evidence of association with nicotine dependence and smoking cessation, and require further study to validate gender-specific effects.

Figure 3 P-values for imputed SNPs in nAChR
Plot of P-values for the male-specific SNP x treatment association between genotyped and imputed SNPs in the nAChR region and smoking abstinence at 6-month follow-up. Genotyped SNPs are designated by squares, imputed SNPs are designated by circles. SNPs were imputed using data from the 1000 Genomes Project and IMPUTE2.

The use of 1000 Genomes Project data for imputation helped to broaden the areas for potential follow-up for the specific putatively causal variant within our top gene regions. Additionally, we imputed within the previously reported region of chromosome 15, a region linked to smoking dependence 72 and other smoking-related diseases. 49, 73-75 For this region, imputation identified two gene regions proximal to the chromosome 15 α5-α3-β4 nAChR complex associated with smoking abstinence (Supplemental Figure 2).
AGPHD1 and PSMA4 have imputed SNPs in LD (r²>0.6) with our strongest signal in the nAChR gene region (rs684513) that had comparable or more significant SNP x treatment interactions on smoking abstinence at 6-month follow-up within males (1×10⁻⁵ < P < 5×10⁻⁵). An intronic SNP in AGPHD1, rs12441354, had the strongest association (P=1.3×10⁻⁵).

[Figure 3 plot: Chr15 nAChR, interaction, males, 6-month follow-up. -log10(P-value) and recombination rate (cM/Mb) plotted against position on chr15 (76.55 to 76.85 Mb); index SNP rs684513; points shaded by r² with the index SNP; genes shown: IREB2, AGPHD1, PSMA4, CHRNA5, CHRNA3, CHRNB4, ADAMTS7.]
Strengths and limitations of this study have been described previously, 24, 25 but we highlight a few here. We did not have to address potential biases in determining smoking abstinence, since information was collected prospectively and those claiming to be abstinent were subjected to biochemical verification. The similar designs of the two studies suggest that their subjects are comparable. However, differential responses in abstinence rates across genders and treatments could have arisen from a key difference in exclusion criteria. 25 Only participants in the NRT study were excluded for drug or alcohol dependence or any subsequent treatment. Subjects in the Bupropion study may have used other substances to compensate for any adverse effects from smoking cessation. In our analysis, we adjusted P-values to account for multiple correlated tests within respective gene regions. A Bonferroni correction was then applied across the 53 gene regions and two genders. This correction was less stringent than a uniform Bonferroni correction across all SNPs, while still accounting for independent gene regions and both genders. Thus, a SNP was significant within a respective region and gender at a P-value < 0.05, while system-wide significance was set at an α-level of 0.05/(53×2) ≈ 5×10⁻⁴. While our determination of significance was conservative, the results require independent replication. Identified SNPs with the strongest evidence of association have not been replicated in other studies, and the region with the strongest marginal SNP effects, EPB41, has not previously been associated with nicotine dependence or smoking. Confirmation from other studies, especially of gender-specific results, would lend support to our findings.
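The two significance levels described above reduce to simple arithmetic. The sketch below works through it, assuming the P-values passed in have already been adjusted for correlated tests within a gene region; the function and label names are hypothetical.

```python
def system_wide_alpha(family_alpha: float, n_regions: int, n_strata: int = 1) -> float:
    """Bonferroni threshold across independent gene regions and strata."""
    if n_regions < 1 or n_strata < 1:
        raise ValueError("need at least one region and one stratum")
    return family_alpha / (n_regions * n_strata)


# 53 gene regions and two genders, as stated in the text.
ALPHA_SYSTEM = system_wide_alpha(0.05, n_regions=53, n_strata=2)


def significance_level(region_adjusted_p: float,
                       region_alpha: float = 0.05,
                       system_alpha: float = ALPHA_SYSTEM) -> str:
    """Label a region-adjusted P-value (labels are illustrative, not the authors')."""
    if region_adjusted_p < system_alpha:
        return "system-wide"
    if region_adjusted_p < region_alpha:
        return "region-wide"
    return "not significant"
```

The unrounded threshold is 0.05/106, slightly below the rounded 5×10⁻⁴ quoted in the text, so P-values close to the cutoff should be compared against the unrounded value.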
However, our study has identified a novel gene region, EPB41, which may be associated with smoking cessation, along with gene regions in CNR1 that may be targeted to further elucidate the etiology of gender differences in smoking behaviors.

Chapter 1 References
1. Benowitz NL. Nicotine addiction. N Engl J Med 2010; 362(24): 2295-2303. 2. Hogg RC, Bertrand D. Neuroscience. What genes tell us about nicotine addiction. Science 2004; 306(5698): 983-985. 3. Zbikowski SM, Swan GE, McClure JB. Cigarette smoking and nicotine dependence. Med Clin North Am 2004; 88(6): 1453-1465, x. 4. Berrettini WH, Lerman CE. Pharmacotherapy and pharmacogenetics of nicotine dependence. Am J Psychiatry 2005; 162(8): 1441-1451. 5. Ray R, Schnoll RA, Lerman C. Nicotine dependence: biology, behavior, and treatment. Annu Rev Med 2009; 60: 247-260. 6. Russell MA, Feyerabend C, Cole PV. Plasma nicotine levels after cigarette smoking and chewing nicotine gum. Br Med J 1976; 1(6017): 1043-1046. 7. Russell MA, Wilson C, Taylor C, Baker CD. Smoking habits of men and women. Br Med J 1980; 281(6232): 17-20. 8. Russell MA, Jarvis M, Iyer R, Feyerabend C. Relation of nicotine yield of cigarettes to blood nicotine concentrations in smokers. Br Med J 1980; 280(6219): 972-976. 9. Hogle JM, Curtin JJ. Sex differences in negative affective response during nicotine withdrawal. Psychophysiology 2006; 43(4): 344-356. 10. Schnoll RA, Patterson F. Sex heterogeneity in pharmacogenetic smoking cessation clinical trials. Drug Alcohol Depend 2009; 104 Suppl 1: S94-99. 11. Burgess DJ, Fu SS, Noorbaloochi S, Clothier BA, Ricards J, Widome R, et al. Employment, gender, and smoking cessation outcomes in low-income smokers using nicotine replacement therapy. Nicotine Tob Res 2009; 11(12): 1439-1447. 12. Perkins KA, Jacobs L, Sanders M, Caggiula AR. Sex differences in the subjective and reinforcing effects of cigarette nicotine dose. Psychopharmacology (Berl) 2002; 163(2): 194-201. 13. Pogun S, Yararbas G.
Sex differences in nicotine action. Handb Exp Pharmacol 2009;(192): 261-291. 14. Munafo MR, Shields AE, Berrettini WH, Patterson F, Lerman C. Pharmacogenetics and nicotine addiction treatment. Pharmacogenomics 2005; 6(3): 211-223. 15. Perkins KA, Scott J. Sex differences in long-term smoking cessation rates due to nicotine patch. Nicotine Tob Res 2008; 10(7): 1245-1250. 16. Benowitz NL, Lessov-Schlaggar CN, Swan GE, Jacob P, 3rd. Female sex and oral contraceptive use accelerate nicotine metabolism. Clin Pharmacol Ther 2006; 79(5): 480-488. 17. Higashi E, Fukami T, Itoh M, Kyo S, Inoue M, Yokoi T, et al. Human CYP2A6 is induced by estrogen via estrogen receptor. Drug Metab Dispos 2007; 35(10): 1935-1941. 18. Harrington WR, Sengupta S, Katzenellenbogen BS. Estrogen regulation of the glucuronidation enzyme UGT2B15 in estrogen receptor-positive breast cancer cells. Endocrinology 2006; 147(8): 3843-3850. 19. Benowitz NL, Hukkanen J, Jacob P, 3rd. Nicotine chemistry, metabolism, kinetics and biomarkers. Handb Exp Pharmacol 2009;(192): 29-60. 20. Hatchell PC, Collins AC. Influences of genotype and sex on behavioral tolerance to nicotine in mice. Pharmacol Biochem Behav 1977; 6(1): 25-30. 21. Isiegas C, Mague SD, Blendy JA. Sex differences in response to nicotine in C57Bl/6:129SvEv mice. Nicotine Tob Res 2009; 11(7): 851-858. 22. Lajtha A, Sershen H. Nicotine: alcohol reward interactions. Neurochem Res 2010; 35(8): 1248-1258. 23. Damaj MI. Influence of gender and sex hormones on nicotine acute pharmacological effects in mice. J Pharmacol Exp Ther 2001; 296(1): 132-140. 24. Conti DV, Lee W, Li D, Liu J, Van Den Berg D, Thomas PD, et al. Nicotinic acetylcholine receptor beta2 subunit gene implicated in a systems-based candidate gene study of smoking cessation. Hum Mol Genet 2008; 17(18): 2834-2848. 25. Lerman C, Jepson C, Wileyto EP, Epstein LH, Rukstalis M, Patterson F, et al.
Role of functional genetic variation in the dopamine D2 receptor (DRD2) in response to bupropion and nicotine replacement therapy for tobacco dependence: results of two randomized clinical trials. Neuropsychopharmacology 2006; 31(1): 231-242. 26. Lee AM, Jepson C, Hoffmann E, Epstein L, Hawk LW, Lerman C, et al. CYP2B6 genotype alters abstinence rates in a bupropion smoking cessation trial. Biol Psychiatry 2007; 62(6): 635-641. 27. Strasser AA, Malaiyandi V, Hoffmann E, Tyndale RF, Lerman C. An association of CYP2A6 genotype and smoking topography. Nicotine Tob Res 2007; 9(4): 511-518. 28. Malaiyandi V, Goodz SD, Sellers EM, Tyndale RF. CYP2A6 genotype, phenotype, and the use of nicotine metabolites as biomarkers during ad libitum smoking. Cancer Epidemiol Biomarkers Prev 2006; 15(10): 1812-1819. 29. Yamanaka H, Nakajima M, Fukami T, Sakai H, Nakamura A, Katoh M, et al. CYP2A6 and CYP2B6 are involved in nornicotine formation from nicotine in humans: interindividual differences in these contributions. Drug Metab Dispos 2005; 33(12): 1811-1818. 30. Lerman C, Tyndale R, Patterson F, Wileyto EP, Shields PG, Pinto A, et al. Nicotine metabolite ratio predicts efficacy of transdermal nicotine for smoking cessation. Clin Pharmacol Ther 2006; 79(6): 600-608. 31. Patterson F, Schnoll RA, Wileyto EP, Pinto A, Epstein LH, Shields PG, et al. Toward personalized therapy for smoking cessation: a randomized placebo-controlled trial of bupropion. Clin Pharmacol Ther 2008; 84(3): 320-325. 32. Lerman C, Shields PG, Wileyto EP, Audrain J, Hawk LH, Jr., Pinto A, et al. Effects of dopamine transporter and receptor polymorphisms on smoking cessation in a bupropion clinical trial. Health Psychol 2003; 22(5): 541-548. 33. Bergen AW, Conti DV, Van Den Berg D, Lee W, Liu J, Li D, et al. Dopamine genes and nicotine dependence in treatment-seeking and community smokers. Neuropsychopharmacology 2009; 34(10): 2252-2264. 34. Feyerabend C, Russell MA.
A rapid gas-liquid chromatographic method for the determination of cotinine and nicotine in biological fluids. J Pharm Pharmacol 1990; 42(6): 450-452. 35. Edlund CK, Lee WH, Li D, Van Den Berg DJ, Conti DV. Snagger: a user-friendly program for incorporating additional information for tagSNP selection. BMC Bioinformatics 2008; 9: 174. 36. Thorisson GA, Smith AV, Krishnan L, Stein LD. The International HapMap Project Web site. Genome Res 2005; 15(11): 1592-1593. 37. Pritchard JK, Stephens M, Rosenberg NA, Donnelly P. Association mapping in structured populations. Am J Hum Genet 2000; 67(1): 170-181. 38. Durbin RM, Abecasis GR, Altshuler DL, Auton A, Brooks LD, Gibbs RA, et al. A map of human genome variation from population-scale sequencing. Nature 2010; 467(7319): 1061-1073. 39. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 2009; 5(6): e1000529. 40. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet 2007; 39(7): 906-913. 41. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 2010; 26(18): 2336-2337. 42. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. 43. R Development Core Team. R: A Language and Environment for Statistical Computing. 2010. 44. Conneely KN, Boehnke M. So Many Correlated Tests, So Little Time! Rapid Adjustment of P Values for Multiple Correlated Tests. Am J Hum Genet 2007; 81(6). 45. Li D (2010). Multiple degree of freedom p-value adjustment using correlation of score statistics. Lee W (ed): Los Angeles, CA. 46.
Perkins KA, Lerman C, Grottenthaler A, Ciccocioppo MM, Milanak M, Conklin CA, et al. Dopamine and opioid gene variants are associated with increased smoking reward and reinforcement owing to negative mood. Behav Pharmacol 2008; 19(5-6): 641-649. 47. Keskitalo K, Broms U, Heliovaara M, Ripatti S, Surakka I, Perola M, et al. Association of serum cotinine level with a cluster of three nicotinic acetylcholine receptor genes (CHRNA3/CHRNA5/CHRNB4) on chromosome 15. Hum Mol Genet 2009; 18(20): 4007-4012. 48. Amos CI, Gorlov IP, Dong Q, Wu X, Zhang H, Lu EY, et al. Nicotinic acetylcholine receptor region on chromosome 15q25 and lung cancer risk among African Americans: a case-control study. J Natl Cancer Inst 2010; 102(15): 1199-1205. 49. Le Marchand L, Derby KS, Murphy SE, Hecht SS, Hatsukami D, Carmella SG, et al. Smokers with the CHRNA lung cancer-associated variants are exposed to higher levels of nicotine equivalents and a carcinogenic tobacco-specific nitrosamine. Cancer Res 2008; 68(22): 9137-9140. 50. Lee AM, Jepson C, Shields PG, Benowitz N, Lerman C, Tyndale RF. CYP2B6 genotype does not alter nicotine metabolism, plasma levels, or abstinence with nicotine replacement therapy. Cancer Epidemiol Biomarkers Prev 2007; 16(6): 1312-1314. 51. Han BG, Nunomura W, Takakuwa Y, Mohandas N, Jap BK. Protein 4.1R core domain structure and insights into regulation of cytoskeletal organization. Nat Struct Biol 2000; 7(10): 871-875. 52. Shiffer KA, Goodman SR. Protein 4.1: its association with the human erythrocyte membrane. Proc Natl Acad Sci U S A 1984; 81(14): 4404-4408. 53. Tran YK, Bogler O, Gorse KM, Wieland I, Green MR, Newsham IF. A novel member of the NF2/ERM/4.1 superfamily with growth suppressing properties in lung cancer. Cancer Res 1999; 59(1): 35-43. 54. Scott C, Keating L, Bellamy M, Baines AJ. Protein 4.1 in forebrain postsynaptic density preparations: enrichment of 4.1 gene products and detection of 4.1R binding proteins. Eur J Biochem 2001; 268(4): 1084-1094. 
55. Binda AV, Kabbani N, Lin R, Levenson R. D2 and D3 dopamine receptor cell surface localization mediated by interaction with protein 4.1N. Mol Pharmacol 2002; 62(3): 507-513. 56. Gelernter J, Gueorguieva R, Kranzler HR, Zhang H, Cramer J, Rosenheck R, et al. Opioid receptor gene (OPRM1, OPRK1, and OPRD1) variants and response to naltrexone treatment for alcohol dependence: results from the VA Cooperative Study. Alcohol Clin Exp Res 2007; 31(4): 555-563. 57. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21(2): 263-265. 58. Zhang PW, Ishiguro H, Ohtsuki T, Hess J, Carillo F, Walther D, et al. Human cannabinoid receptor 1: 5' exons, candidate regulatory regions, polymorphisms, haplotypes and association with polysubstance abuse. Mol Psychiatry 2004; 9(10): 916-931. 59. De Vries TJ, Schoffelmeer AN. Cannabinoid CB1 receptors control conditioned drug seeking. Trends Pharmacol Sci 2005; 26(8): 420-426. 60. Forget B, Hamon M, Thiebot MH. Cannabinoid CB1 receptors are involved in motivational effects of nicotine in rats. Psychopharmacology (Berl) 2005; 181(4): 722-734. 61. Tanda G, Goldberg SR. Cannabinoids: reward, dependence, and underlying neurochemical mechanisms--a review of recent preclinical data. Psychopharmacology (Berl) 2003; 169(2): 115-134. 62. Mascia MS, Obinu MC, Ledent C, Parmentier M, Bohme GA, Imperato A, et al. Lack of morphine-induced dopamine release in the nucleus accumbens of cannabinoid CB(1) receptor knockout mice. Eur J Pharmacol 1999; 383(3): R1-2. 63. Chen X, Williamson VS, An SS, Hettema JM, Aggen SH, Neale MC, et al. Cannabinoid receptor 1 gene association with nicotine dependence. Arch Gen Psychiatry 2008; 65(7): 816-824. 64. Lerman C, Kaufmann V, Rukstalis M, Patterson F, Perkins K, Audrain-McGovern J, et al. Individualizing nicotine replacement therapy for the treatment of tobacco dependence: a randomized trial. Ann Intern Med 2004; 140(6): 426-433. 65.
Robinson JD, Cinciripini PM, Tiffany ST, Carter BL, Lam CY, Wetter DW. Gender differences in affective response to acute nicotine administration and deprivation. Addict Behav 2007; 32(3): 543-561. 66. Baker TB, Weiss RB, Bolt D, von Niederhausern A, Fiore MC, Dunn DM, et al. Human neuronal acetylcholine receptor A5-A3-B4 haplotypes are associated with multiple nicotine dependence phenotypes. Nicotine Tob Res 2009; 11(7): 785-796. 67. Marubio LM, Gardier AM, Durier S, David D, Klink R, Arroyo-Jimenez MM, et al. Effects of nicotine in the dopaminergic system of mice lacking the alpha4 subunit of neuronal nicotinic acetylcholine receptors. Eur J Neurosci 2003; 17(7): 1329-1337. 68. Picciotto MR, Zoli M, Rimondini R, Lena C, Marubio LM, Pich EM, et al. Acetylcholine receptors containing the beta2 subunit are involved in the reinforcing properties of nicotine. Nature 1998; 391(6663): 173-177. 69. Saccone SF, Hinrichs AL, Saccone NL, Chase GA, Konvicka K, Madden PA, et al. Cholinergic nicotinic receptor genes implicated in a nicotine dependence association study targeting 348 candidate genes with 3713 SNPs. Hum Mol Genet 2007; 16(1): 36-49. 70. Wada T, Naito M, Kenmochi H, Tsuneki H, Sasaoka T. Chronic nicotine exposure enhances insulin-induced mitogenic signaling via up-regulation of alpha7 nicotinic receptors in isolated rat aortic smooth muscle cells. Endocrinology 2007; 148(2): 790-799. 71. Chen RJ, Ho YS, Guo HR, Wang YJ. Long-term nicotine exposure-induced chemoresistance is mediated by activation of Stat3 and downregulation of ERK1/2 via nAChR and beta-adrenoceptors in human bladder cancer cells. Toxicol Sci 2010; 115(1): 118-130. 72. Saccone NL, Saccone SF, Hinrichs AL, Stitzel JA, Duan W, Pergadia ML, et al. Multiple distinct risk loci for nicotine dependence identified by dense coverage of the complete family of nicotinic receptor subunit (CHRN) genes. Am J Med Genet B Neuropsychiatr Genet 2009; 150B(4): 453-466. 73.
Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet 2008; 40(5): 616-622. 74. Gago-Dominguez M, Jiang X, Conti DV, Castelao JE, Stern MC, Cortessis VK, et al. Genetic variations on chromosomes 5p15 and 15q25 and bladder cancer risk: findings from the Los Angeles-Shanghai bladder case-control study. Carcinogenesis 2011; 32(2): 197-202. 75. Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature 2008; 452(7187): 633-637.

CHAPTER 2 – DRD1 ASSOCIATIONS WITH SMOKING ABSTINENCE ACROSS SLOW AND NORMAL NICOTINE METABOLIZERS
Chapter 2 has been previously published in Pharmacogenetics and Genomics (2012).

Abstract
Nicotine metabolism and genetic variation have an impact on nicotine addiction and smoking abstinence, but further research is required. The nicotine metabolite ratio (NMR) is a robust biomarker of nicotine metabolism used to categorize slow and normal nicotine metabolizers (slow metabolizers fall in the lowest NMR quartile). In two randomized clinical trials of smoking abstinence treatments, we conducted NMR-stratified analyses on smoking abstinence across 13 regions coding for nicotinic acetylcholine receptors and proteins involved in the dopamine reward system. Gene × NMR interaction P-values were adjusted for multiple correlated tests, and we used a Bonferroni-corrected α-level of 0.004 to determine system-wide significance. Three SNPs in DRD1 (rs11746641, rs2168631, rs11749035) had significant interactions (0.001 ≤ adjusted P-values ≤ 0.004), with increased odds of abstinence within slow metabolizers (ORs=3.1-3.5, 95% CI 1.7-6.7). Our findings support the role of DRD1 in nicotine dependence, and identify genetic and nicotine metabolism profiles that may interact to impact nicotine dependence.
Keywords: Genetic association studies, heterogeneity, smoking abstinence, nicotine metabolism, nicotine metabolite ratio, DRD1

Introduction
Nicotine addiction is a persistent global public health issue, with long-term quitting success achieved by only a small percentage of smokers. 1 Research continues to identify and characterize genetic markers and biological processes that impact nicotine addiction and smoking abstinence. The cytochrome P450 enzyme CYP2A6 converts 80-90% of nicotine to cotinine, and subsequently metabolizes cotinine to 3-hydroxycotinine (3-HC). The nicotine metabolite ratio (NMR), the ratio of 3-HC to cotinine, is a stable phenotypic marker of nicotine metabolism, 2 and variation in NMR has been shown to be related to genetic variation in the CYP2A6 gene 2 and to factors such as sex and hormone levels that alter nicotine metabolic rate. 3 The nicotinic acetylcholine receptor (nAChR) gene regions and genes involved in the dopamine reward system have been associated with nicotine dependence. 4-6 The nAChRs influence the dopamine pathways, 7 and nicotine stimulates dopamine release in the brain reward circuits. 8 A previous study did not find an association between smoking rate and the interaction between the chr15q25.1 nAChR region and NMR. 9 However, their independent associations make the interplay between nicotine metabolism and genetic variants involved in nAChR signaling and dopamine transmission on smoking abstinence of particular interest. In this study, we assessed the interaction between NMR status (slow metabolizers, defined as the lowest NMR quartile, versus normal metabolizers) and a priori candidate genes in the nicotinic receptor and dopaminergic pathways on smoking abstinence.
Study Sample
We pooled subjects enrolled in two smoking abstinence pharmacogenetic effectiveness trials conducted by the University of Pennsylvania Transdisciplinary Tobacco Use Research Center assessing the efficacy of alternate forms of nicotine replacement (NRT) and bupropion therapy (BUP). 5 Smokers in the NRT trial were randomized (open-label) to transdermal nicotine (patch) or nicotine nasal spray. The BUP trial was double-blind and randomized, with smokers receiving placebo or bupropion. The studies had similar designs, with subjects recruited using identical methods, making them directly comparable for analysis. 5 After applying exclusion criteria, 1,111 subjects consented to treatment and provided a blood sample for genotyping and NMR measurement. We limited analyses to self-identified Caucasians with phenotype and genotype data (N=626) to avoid potential confounding and heterogeneity of effect estimates. Females comprised 51% of participants, 46% were college graduates, the mean age was 45 years (SD=11), the average number of cigarettes smoked per day was 23 (SD=10), and the mean Fagerström Test for Nicotine Dependence (FTND) 10 score was 5.5 (SD=2.2). Of 626 participants, 164 were slow metabolizers (lowest NMR quartile; NMR≤0.28) and 462 were normal metabolizers (upper three NMR quartiles; NMR>0.28). The NRT and BUP trials provided medication and group behavioral smoking abstinence counseling. NRT participants (N=298) began assigned treatments at target quit date (TQD) and continued for 8 weeks. BUP participants (N=318) initiated assigned treatments two weeks prior to TQD for a total of 10 weeks. The primary outcome was biochemically confirmed seven-day point-prevalence abstinence at the end of treatment (EOT), assessed eight weeks post-TQD. Per convention, 11 participants were classified as non-abstinent if they reported smoking within seven days prior to EOT, failed to provide a saliva sample, or had carbon monoxide levels >10ppm (NRT) or cotinine levels >15ng/ml (BUP).
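The classification rules above (metabolizer group from NMR, and abstinence at EOT) can be written out as a small sketch. This is an illustration under stated assumptions rather than the study's actual code: the function names are hypothetical, a missing biochemical measurement is represented as `None`, and both metabolites are assumed to be measured in the same units.

```python
from typing import Optional

NMR_SLOW_CUTOFF = 0.28  # lowest-quartile cutoff reported in the text


def nicotine_metabolite_ratio(three_hc: float, cotinine: float) -> float:
    """NMR = 3-hydroxycotinine / cotinine (same units for both)."""
    if cotinine <= 0:
        raise ValueError("cotinine must be positive")
    return three_hc / cotinine


def metabolizer_group(nmr: float) -> str:
    """'slow' for NMR <= 0.28, otherwise 'normal'."""
    return "slow" if nmr <= NMR_SLOW_CUTOFF else "normal"


def abstinent_at_eot(trial: str,
                     smoked_last_7_days: bool,
                     co_ppm: Optional[float] = None,
                     cotinine_ng_ml: Optional[float] = None) -> bool:
    """Seven-day point-prevalence abstinence with biochemical confirmation.

    Assumption: a missing measurement (None) counts as non-abstinent,
    mirroring the rule for participants who did not provide a sample.
    """
    if smoked_last_7_days:
        return False
    if trial == "NRT":
        return co_ppm is not None and co_ppm <= 10
    if trial == "BUP":
        return cotinine_ng_ml is not None and cotinine_ng_ml <= 15
    raise ValueError("trial must be 'NRT' or 'BUP'")
```

For example, `metabolizer_group(nicotine_metabolite_ratio(20.0, 100.0))` places a participant with hypothetical metabolite levels in the slow group.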
Among 626 participants, 183 (29%) were abstinent at EOT.

Genotyping procedures
Within a larger candidate gene study, we focused on 13 gene regions: six coding for nicotinic acetylcholine receptors (nAChRs – CHRNA2, CHRNA4, CHRNA5-CHRNA3-CHRNB4, CHRNA7, CHRNB2, CHRNB3-CHRNA6) and seven involved in the dopamine reward system (the dopamine receptor gene family – DRD1, DRD2, DRD3, DRD4, DRD5; catechol-O-methyltransferase COMT; DRD1 interacting protein gene CALCYON). We genotyped 281 SNPs across these genes (44 with a priori putative function, 237 to capture underlying genetic structure) at the University of Southern California Epigenetics Center using the GoldenGate assay (Illumina, San Diego, CA, USA). Among the 281 SNPs, 12 with genotype call rates <95% were excluded from analyses. One additional SNP deviated significantly from Hardy-Weinberg equilibrium (adjusted threshold P=1.9×10⁻⁴) (Supplementary Table 1). Complete SNP selection and quality control procedures have been described previously. 4

SNP Analysis
To estimate NMR-stratified genetic associations with abstinence at EOT, we used logistic regression to obtain odds ratios for marginal SNP effects within strata of slow and normal nicotine metabolizers. For each SNP we tested an additive or dominant genetic model consistent with previously reported analyses. The most common genotype served as the referent. All models were adjusted for gender, age, treatment and FTND. We performed a 1-df likelihood ratio test (LRT) on SNP × NMR interaction terms. Analyses were performed using the R Statistical Program. 12

Correlated tests adjustment and system-level significance
Interaction 1-df LRT P-values were adjusted to account for the correlation and number of tests performed across SNPs within a gene region. Test statistics were modeled as asymptotically multivariate normal, with a covariance structure estimated from the correlation of SNPs. 4 Final observed and adjusted P-values are reported.
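The two steps just described, a 1-df LRT P-value followed by an adjustment for correlated tests within a region, can be sketched as follows. This is a simplified illustration and not the procedure used in the analysis: it swaps in plain Monte Carlo simulation of correlated standard normal statistics for whatever numerical integration the original adjustment used, and all function names, the SNP correlation matrix, and the log-likelihood values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def lrt_pvalue_1df(loglik_null: float, loglik_full: float) -> float:
    """P-value of a 1-df likelihood ratio test.

    For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    return math.erfc(math.sqrt(stat / 2.0))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def adjusted_pvalue(p_observed: float, corr, n_draws: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(max |Z_j| >= z_obs) under the null, where Z is
    multivariate normal with correlation matrix `corr` (one entry per SNP tested)."""
    if not 0.0 < p_observed < 1.0:
        raise ValueError("p_observed must lie strictly between 0 and 1")
    rng = random.Random(seed)
    z_obs = NormalDist().inv_cdf(1.0 - p_observed / 2.0)  # two-sided threshold
    lower = cholesky(corr)
    n = len(corr)
    exceed = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        if max(abs(v) for v in z) >= z_obs:
            exceed += 1
    return exceed / n_draws


# Hypothetical example: three SNPs in strong pairwise correlation.
corr_example = [[1.0, 0.9, 0.9], [0.9, 1.0, 0.9], [0.9, 0.9, 1.0]]
p_raw = lrt_pvalue_1df(loglik_null=-310.0, loglik_full=-306.0)
p_adj = adjusted_pvalue(p_raw, corr_example)
```

With highly correlated SNPs the adjusted P-value stays close to the observed one, whereas with independent SNPs it approaches the Bonferroni-style bound; very small observed P-values would need far more draws than the default for a stable estimate.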
Overall significance was determined using an additional Bonferroni correction across the 13 gene regions, giving a system-wide α-level of 0.05/13≈0.004. 4

Supplementary Figure 1 – DRD1 LD Plot

                          Slow Metabolizer      Normal Metabolizer    Interaction P-value
SNP           Hap Block   OR (95% CI) a         OR (95% CI) a         Observed       Adjusted b
DRD1
rs1310277     1           1.98 (1.03 – 3.83)    0.53 (0.32 – 0.87)    1.32 × 10⁻³    2.26 × 10⁻²
rs10476156    3           2.05 (1.10 – 3.80)    0.64 (0.41 – 1.00)    1.25 × 10⁻³    2.22 × 10⁻²
rs4867796     b/w 4-5     2.75 (1.53 – 4.94)    0.91 (0.61 – 1.37)    1.48 × 10⁻³    2.40 × 10⁻²
rs11746641    5           3.10 (1.67 – 5.79)    0.82 (0.53 – 1.25)    2.06 × 10⁻⁴    4.05 × 10⁻³
rs2168631     5           3.30 (1.75 – 6.22)    0.87 (0.58 – 1.30)    1.96 × 10⁻⁴    3.96 × 10⁻³
rs11749035    5           3.49 (1.84 – 6.67)    0.79 (0.52 – 1.21)    5.95 × 10⁻⁵    1.25 × 10⁻³
CHRNA5-A3-B4
rs2036527     1           1.52 (0.94 – 2.45)    0.81 (0.59 – 1.11)    3.88 × 10⁻²    2.85 × 10⁻¹
rs1051730     2           1.54 (0.95 – 2.48)    0.85 (0.62 – 1.16)    4.23 × 10⁻²    2.99 × 10⁻¹
rs1317286     2           1.50 (0.92 – 2.44)    0.85 (0.62 – 1.16)    4.81 × 10⁻²    3.24 × 10⁻¹
rs7178270     3           0.46 (0.23 – 0.91)    1.07 (0.68 – 1.68)    3.37 × 10⁻²    2.57 × 10⁻¹

Table 6 Interaction between DRD1 and CHRNA5-CHRNA3-CHRNB4 Chromosome 15 nAChR region SNPs and nicotine metabolism rate (slow vs. normal metabolizers) on abstinence at end of treatment.
a Odds ratio of quitting associated with the minor allele at the SNP in the specific NMR group
b P-value corrected for the correlation structure within the respective gene region

Results
In our study, six SNPs located in DRD1 had significant gene × NMR interactions on abstinence (Table 6). One SNP was located in haplotype block 1 (rs1310277 [merged into dbSNP rs266001]), one SNP in block 3 (rs10476156), one SNP between blocks 4 and 5 (rs4867796), and three SNPs in block 5 (rs11746641, rs2168631, rs11749035; r²≥0.9) (Figure 4). The interactions for the three SNPs located in block 5 achieved system-wide significance (0.001≤adjusted P-values≤0.004).
Within slow metabolizers, the minor allele was associated with increased odds of abstinence (OR=3.1-3.5, 95% CI 1.7-6.7), but the association was null within normal metabolizers (OR=0.8-0.9, 95% CI 0.5-1.3). Abstinence rates (Figure 5) reflect these associations, and were higher for slow metabolizers carrying the minor allele for each of these SNPs (46-57%) compared to the other three metabolizer/genotype groups (19-30%).

Figure 5 DRD1 abstinence rates
Abstinence rates across different DRD1 genotype groups within slow and normal nicotine metabolizers (N = 164 and 462, respectively). Abstinence rates for six SNPs in DRD1 with adjusted gene × NMR interaction P-values < 0.05. Data shown across four genotype/nicotine metabolizer groups:
- Normal metabolizers, carrying two major alleles (i.e., WT) – Unlined, White bars
- Normal metabolizers, carrying at least one minor allele (i.e., Variant) – Unlined, Grey bars
- Slow metabolizers, carrying two major alleles (i.e., WT) – Lined, White bars
- Slow metabolizers, carrying at least one minor allele (i.e., Variant) – Lined, Grey bars

Of note are gene × NMR interactions (unadjusted P-values=0.03-0.05) for four SNPs (rs7178270, rs2036527, rs1051730, rs1317286) in the chr15q25.1 CHRNA5-CHRNA3-CHRNB4 nAChR region (Table 6), three of which are in strong LD (rs2036527, rs1051730, rs1317286; r²≥0.9). Although they do not achieve region-wide significance after adjustment for correlated tests, two have strong a priori associations with nicotine dependence (rs1051730 6 and rs1317286 13 ). Within slow metabolizers, the minor alleles for these SNPs are associated with suggestive increases in odds of abstinence (OR=1.5, 95% CI 0.9-2.5), but slight decreases within normal metabolizers (OR=0.85, 95% CI 0.6-1.2). For rs7178270, the minor allele is associated with decreased odds of abstinence within slow metabolizers (OR=0.46, 95% CI 0.2-0.9), but the association is null within normal metabolizers (OR=1.1, 95% CI 0.7-1.7).
For the three SNPs in strong LD, the minor allele is associated with a suggestive increase in odds of abstinence within slow metabolizers (OR=1.5, 95% CI 1.0-2.5), but the association is null within normal metabolizers (OR=0.9, 95% CI 0.6-1.2).

Discussion
In summary, six SNPs in DRD1 have significant gene × nicotine metabolite ratio interactions with smoking abstinence at EOT. The minor alleles for these SNPs were associated with significantly increased abstinence rates within slow metabolizers. We find that gene × nicotine metabolism interactions are more strongly associated with smoking abstinence than unstratified gene effects. Prior studies have also shown associations between DRD1 polymorphisms and nicotine dependence. 14 The SNPs in DRD1 that interacted with NMR in our study have neither been reported in prior nicotine dependence studies nor are in LD with SNPs reported in previous studies, but they may be in LD with an undiscovered functional variant for nicotine dependence and D1 dopamine receptor expression. In the absence of information on the functional consequences of the relevant DRD1 variants, the mechanism of the interaction with the rate of nicotine metabolism is unknown. Nicotine exposure has been shown to upregulate D1 dopamine receptor expression and activity in key brain regions important for nicotine reward. 15 Such upregulation may contribute to the level of nicotine dependence, and the extent of upregulation may be influenced by the rate of nicotine metabolism. The association for rs1051730 in the chr15q25.1 nAChR region replicates previous findings between this SNP and nicotine dependence. An interaction was reported between rs1051730 and CYP2A6, where cigarette consumption and FTND both increased for those in increasing risk categories (homozygous for the rs1051730 minor allele and/or normal nicotine metabolizers as determined by CYP2A6 genotype).
6 This is consistent with our finding, in which normal nicotine metabolizers (as assessed by NMR) carrying the rs1051730 minor allele are more likely to relapse. Strengths and limitations have been described previously. 4, 5 Strengths include bias reduction through baseline biomarker measurements and the prospective assessment of abstinence. Also, our P-value adjustment for multiple correlated tests is less conservative than a Bonferroni adjustment. However, while gene effects have been shown to differ across treatments, 4, 5 we lack a sufficient sample to detect small gene × NMR × treatment interaction effects. In summary, we observe significant gene × NMR interactions in which six DRD1 SNPs are associated with increased odds of smoking abstinence within slow nicotine metabolizers. We also replicate previous findings for an interaction between rs1051730 in the chr15q25.1 nAChR region and the rate of nicotine metabolism. Independent validation of our results is necessary before further conclusions can be drawn from these findings.

Chapter 2 References
1. Lerman CE, Schnoll RA, Munafo MR. Genetics and smoking cessation improving outcomes in smokers at risk. Am J Prev Med 2007; 33(6 Suppl): S398-405. 2. Dempsey D, Tutka P, Jacob P, 3rd, Allen F, Schoedel K, Tyndale RF, et al. Nicotine metabolite ratio as an index of cytochrome P450 2A6 metabolic activity. Clin Pharmacol Ther 2004; 76(1): 64-72. 3. Benowitz NL, Lessov-Schlaggar CN, Swan GE, Jacob P, 3rd. Female sex and oral contraceptive use accelerate nicotine metabolism. Clin Pharmacol Ther 2006; 79(5): 480-488. 4. Conti DV, Lee W, Li D, Liu J, Van Den Berg D, Thomas PD, et al. Nicotinic acetylcholine receptor beta2 subunit gene implicated in a systems-based candidate gene study of smoking cessation. Hum Mol Genet 2008; 17(18): 2834-2848. 5. Lerman C, Jepson C, Wileyto EP, Epstein LH, Rukstalis M, Patterson F, et al.
Role of functional genetic variation in the dopamine D2 receptor (DRD2) in response to bupropion and nicotine replacement therapy for tobacco dependence: results of two randomized clinical trials. Neuropsychopharmacology 2006; 31(1): 231-242. 6. Wassenaar CA, Dong Q, Wei Q, Amos CI, Spitz MR, Tyndale RF. Relationship between CYP2A6 and CHRNA5-CHRNA3-CHRNB4 variation and smoking behaviors and lung cancer risk. J Natl Cancer Inst 2011; 103(17): 1342-1346. 7. Livingstone PD, Wonnacott S. Nicotinic acetylcholine receptors and the ascending dopamine pathways. Biochem Pharmacol 2009; 78(7): 744-755. 8. Hong LE, Gu H, Yang Y, Ross TJ, Salmeron BJ, Buchholz B, et al. Association of nicotine addiction and nicotine's actions with separate cingulate cortex functional circuits. Arch Gen Psychiatry 2009; 66(4): 431-441. 9. Falcone M, Jepson C, Benowitz N, Bergen AW, Pinto A, Wileyto EP, et al. Association of the nicotine metabolite ratio and CHRNA5/CHRNA3 polymorphisms with smoking rate among treatment-seeking smokers. Nicotine Tob Res 2011; 13(6): 498-503. 10. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. 11. SRNT. Biochemical verification of tobacco use and cessation. Nicotine Tob Res 2002; 4(2): 149-159. 12. R Development Core Team (2010). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. 13. Li MD, Xu Q, Lou XY, Payne TJ, Niu T, Ma JZ. Association and interaction analysis of variants in CHRNA5/CHRNA3/CHRNB4 gene cluster with nicotine dependence in African and European Americans. Am J Med Genet B Neuropsychiatr Genet 2010; 153B(3): 745-756. 14. Huang W, Ma JZ, Payne TJ, Beuten J, Dupont RT, Li MD. Significant association of DRD1 with nicotine dependence. Hum Genet 2008; 123(2): 133-140. 15. Bahk JY, Li S, Park MS, Kim MO.
Dopamine D1 and D2 receptor mRNA up-regulation in the caudate-putamen and nucleus accumbens of rat brains by smoking. Prog Neuropsychopharmacol Biol Psychiatry 2002; 26(6): 1095-1104.

CHAPTER 3 – LATENT VARIABLE FRAMEWORK
Introduction
Latent variable models have been used in the social sciences as a means of estimating constructs that synthesize indices of behavior. For example, indices of duration, frequency, and quantity of smoking can be synthesized into a construct of “nicotine dependence” that either quantitatively or qualitatively assesses the degree of addiction. A single unmeasured variable that captures the effects of multiple measured variables reduces the degrees of freedom that need to be accounted for in a regression framework. A potentially more attractive property of latent variables is that they may capture unmeasured effects that imperfectly measured metrics miss. 1, 2 Indices for smoking may not capture the extent or degree of dependence on their own, but a latent variable, in describing relationships between measured variables, may ascertain otherwise unobservable effects. 1, 2
The potential problem then may be the interpretation of a latent variable. A hypothetical construct of “nicotine dependence” or “biological pathway X” may or may not be reliable in capturing desired effects. A non- or semiparametric model may address the interpretation of the latent variable 3, 4 and provide a more robust estimate of the underlying process. 5 Such a model is made up of a mixture of parametric distributions and is able to allocate observed measurements into k clusters. It is non- or semiparametric in the sense that cluster membership comes from a discrete distribution, while each cluster itself has a parametric (typically normal) distribution. 6 This provides flexibility in the estimation of some unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards respective cluster effects rather than a grand mean.
In doing so, clusters can be characterized as groups with specific levels and genotypes depending on the variables used to estimate the latent clusters. 6 Moreover, while studies have regressed latent variables on an outcome of interest, 7 we propose an additional component where latent cluster profiles can be used in risk prediction of a given outcome (e.g., smoking cessation) and to tailor treatments that yield increased probabilities of success for specific profiles.
TTURC and PNAT
In order to develop and assess the effectiveness of a nonparametric latent variable framework, we used data from two randomized clinical trials conducted through the Transdisciplinary Tobacco Use Research Center (TTURC) of the University of Pennsylvania. 8, 9 Smokers older than 18 years of age who reported smoking more than 10 cigarettes per day over the previous year were recruited and excluded under identical procedures. Subjects included in the trials underwent similar 10-week treatment and counseling sessions, with all subjects instructed to quit smoking on their target quit date (TQD) 1-2 weeks into the study period. Smoking abstinence was biochemically verified at the end of the 10-week treatment (EOT) and at 6 months post-TQD. Thus, subjects in these clinical trials were directly comparable for analysis.
The notable difference between these two clinical trials was the nature of delivery of the treatments. The trial comparing the efficacy of bupropion to a placebo was double-blind randomized; in the trial assessing transdermal nicotine treatment (i.e., patch) and nicotine nasal spray, the delivery nature of the treatments was known to subjects.
Genes and Biomarkers
With these two clinical trials, a candidate gene study was carried out as part of the Pharmacogenetics of Nicotine Addiction and Treatment Consortium.
8, 9 Using expert opinion, literature searches, and databases of biological pathways, 1528 single nucleotide polymorphisms in 53 candidate gene regions were genotyped and retained after data cleaning. These gene regions were selected for their roles in nicotine metabolism, the brain-reward pathway (e.g., nicotinic acetylcholine receptors, dopamine receptors and transporters), and their secondary role in interactions with the brain-reward pathway. In addition, we genotyped selected variable number tandem repeats (VNTRs) in gene regions that are well-characterized for their roles in nicotine metabolism and the brain-reward pathway.
Specifically, cytochrome P450 2A6 (CYP2A6), in a pharmacokinetic response to nicotine, has a crucial role in nicotine metabolism. CYP2A6 is of primary importance because it has been shown to account for 70-80% of nicotine metabolism through a mechanism that has received much attention. CYP2A6 metabolizes nicotine to its primary metabolite, cotinine. Moreover, CYP2A6 activity can further be quantified by measuring the presence of the secondary metabolite, trans-3’-hydroxycotinine (3HC), that is broken down from cotinine. Although 3HC accounts for ~38% of nicotine metabolites, it is a particularly useful metabolite because the metabolism of cotinine to 3HC by CYP2A6 is the only known mechanism through which 3HC is formed. Nicotine metabolite ratio (NMR) is the ratio of 3HC to cotinine. While other pathways may account for the metabolism of nicotine to cotinine, NMR is a direct measure of nicotine metabolism through CYP2A6. Consequently, NMR has been shown to be a reliable index of nicotine metabolism and, more importantly, a biomarker quantifying CYP2A6 activity. 10-12 Higher NMR levels indicate faster nicotine metabolism, 12 which is in turn associated with lower cessation rates.
11, 13 Those with CYP2A6 genotypes conferring 100% (“normal”) activity have higher NMR levels than those with genotypes conferring 75% (“intermediate”) and < 50% (“slow”) activity. 12 In assessing the association between NMR and smoking abstinence, those with NMR levels in the lowest quartile had the highest abstinence rates, with those in the upper three quartiles having similar abstinence rates. 13 Another study found that those with NMR levels in the highest quartile had a substantial decrease in abstinence rates compared to those in the lower three quartiles. 11 These studies led to the categorization of nicotine metabolizers according to NMR levels, with “slow” metabolizers having NMR levels in the lowest quartile, “fast” metabolizers having NMR levels in the highest quartile, and “normal” metabolizers having NMR levels in the middle two quartiles. 11, 13
Along with these pharmacokinetic measures of nicotine dependence, the Fagerstrom Test for Nicotine Dependence (FTND) is a pharmacodynamic measure that utilizes self-reported metrics to assess two aspects of dependence: (1) the need to restore nicotine levels after waking up and (2) the frequency of smoking throughout the day. 14
Previous Results
The motivation for utilizing a latent variable framework comes from observed associations between NMR, CYP2A6, smoking abstinence at 6-month follow-up, and the treatment received by subjects in the clinical trials. After adjusting for treatment, observed associations (Tables 7a-c) suggest an interplay between these variables that may not be detected through the analysis of observable relationships.
Abstinence at 6-Month Follow-Up
            N      OR (95% CI)         LRT p
NMR         612    0.67 (0.44-1.03)    0.07
CYP2A6      612    0.63 (0.35-1.11)    0.10
FTND        612    0.80 (0.53-1.20)    0.28
Gender      612    0.94 (0.64-1.40)    0.76
Table 7a Conventional Regressions – Univariate effect estimates between smoking abstinence at 6-month follow-up and NMR, CYP2A6, FTND and Gender, adjusted for treatment

Abstinence at 6-Month Follow-Up, Joint
            N      OR (95% CI)         LRT p
NMR         612    0.57 (0.37-0.90)    0.02
CYP2A6      612    0.51 (0.28-0.93)    0.02
Table 7b Conventional Regression – Joint effect estimates between smoking abstinence at 6-month follow-up and NMR, CYP2A6, adjusted for treatment

NMR
            N      Mean Change Per Unit Change (SE)    LRT p
CYP2A6      612    -0.14 (0.02)                        $4.1 \times 10^{-9}$
FTND        612    -0.01 (0.02)                        0.76
Gender      612    0.09 (0.02)                         $4.3 \times 10^{-7}$
Table 7c Conventional Regression – Univariate effect estimates between NMR and CYP2A6, FTND and Gender

Specifically, univariate associations with smoking abstinence at 6-month follow-up for NMR and CYP2A6 are suggestive (Table 7a, P = 0.07 and 0.10, respectively). However, modeling NMR and CYP2A6 jointly (Table 7b) yields significant, non-collinear associations (Ps = 0.02). As expected from the biological evidence previously described, there is a strong marginal association between NMR and CYP2A6 (P = $4.1 \times 10^{-9}$). There is also a strong association between NMR and gender (P = $4.3 \times 10^{-7}$). Although observed associations with FTND are not significant, we have shown that genetic effects on smoking abstinence vary across levels of FTND (In preparation). Given these suggestive associations and a priori associations, we can incorporate these variables to take advantage of the dimension reduction within our latent variable framework.
Conventional Regression
Before discussing a latent variable framework, we must address a conventional approach that models observed values of NMR, CYP2A6, and treatment on smoking abstinence.
We use the a priori definition of nicotine metabolizers based on NMR to categorize individuals as slow (below the 25th percentile, 0.28), normal (between the 25th and 75th percentiles, 0.28 to 0.54), and fast (above the 75th percentile, 0.54) nicotine metabolizers (Figure 6). 11, 13

Figure 6 NMR distribution by quartiles
[Figure 6: x-axis NMR, 0.0 to 1.5; 1st quartile = 0.28, 3rd quartile = 0.54; regions labeled Slow Metabolizers, Normal Metabolizers, and Fast Metabolizers.]

Using these NMR categories, we fit the following conventional joint regression, where $\mu$ is the intercept, $\beta$ is the effect of categorized NMR, $\gamma$ is the vector of effects for the treatment groups, and $\rho$ is the vector of effects for the NMR × Treatment interaction:
$$\operatorname{logit}\left(\Pr(\text{Smoking Abstinence})\right) = \mu + \beta\,\mathrm{NMR} + \gamma\,\mathrm{Tx} + \rho\,\mathrm{NMR} \times \mathrm{Tx}$$
There is a significant difference in smoking abstinence probabilities for metabolism categories across the different treatments (P-interaction = 0.04). There are also significant within-treatment marginal effects. Within the placebo group, fast metabolizers have significantly decreased odds of abstinence compared to slow (OR=0.21, 95% CI 0.05-0.84, P=0.03) and normal (OR=0.27, 95% CI 0.07-0.97, P=0.04) metabolizers. Within the patch treatment, compared to slow metabolizers, fast (OR=0.16, 95% CI 0.04-0.65, P=0.01) and normal (OR=0.37, 95% CI 0.15-0.93, P=0.03) metabolizers have significantly decreased odds of smoking abstinence.

Figure 7 Observed smoking abstinence probabilities, by NMR group and treatment

Figure 7 shows observed smoking abstinence probabilities across the four treatments and metabolism groups. These abstinence probabilities are consistent with predicted probabilities from regression. Within the placebo group, slow and normal nicotine metabolizers have similar smoking abstinence probabilities (0.25 and 0.21) while fast metabolizers have a markedly decreased abstinence probability (0.07).
Within the patch group, the abstinence probability within normal metabolizers is half of that in slow metabolizers (0.17 vs. 0.36) and the abstinence probability within fast metabolizers is half of that in normal metabolizers (0.09 vs. 0.17). Within the spray group, abstinence probabilities within groups of metabolizers are not appreciably different (slow=0.24, normal=0.19, fast=0.22). However, the trend of decreasing abstinence probabilities as metabolism rates increase is reversed within the bupropion group; slow metabolizers had the lowest abstinence probabilities (0.17) with normal and fast metabolizers having increased probabilities (0.27 and 0.29, respectively).

[Figure 7 bar labels, Pr(Smoking Abstinence) with group size:
Slow:   Placebo 0.25 (N=36), Bupropion 0.17 (N=36), Patch 0.36 (N=33), Spray 0.24 (N=46)
Normal: Placebo 0.21 (N=72), Bupropion 0.27 (N=82), Patch 0.17 (N=75), Spray 0.19 (N=74)
Fast:   Placebo 0.07 (N=46), Bupropion 0.29 (N=49), Patch 0.09 (N=35), Spray 0.22 (N=32)]

Expanding this joint regression model to include CYP2A6, where $\alpha$ is the effect of CYP2A6:
$$\operatorname{logit}\left(\Pr(\text{Smoking Abstinence})\right) = \mu + \alpha\,\mathrm{CYP2A6} + \beta\,\mathrm{NMR} + \gamma\,\mathrm{Tx} + \rho\,\mathrm{NMR} \times \mathrm{Tx}$$

Figure 8a Conventional regression smoking abstinence probabilities, by NMR group, CYP2A6, and treatment
[Figure 8a bar labels, predicted Pr(Smoking Abstinence), CYP2A6 wildtype / CYP2A6 variant:
Slow Metabolizer:   Placebo 0.31 / 0.18, Bupropion 0.19 / 0.10, Patch 0.42 / 0.26, Spray 0.28 / 0.16
Normal Metabolizer: Placebo 0.22 / 0.12, Bupropion 0.28 / 0.16, Patch 0.18 / 0.10, Spray 0.20 / 0.11
Fast Metabolizer:   Placebo 0.07 / 0.03, Bupropion 0.29 / 0.16, Patch 0.09 / 0.05, Spray 0.23 / 0.13]

The significant difference in smoking abstinence probabilities for metabolism categories across the different treatments remains (P-interaction = 0.03), with CYP2A6 variant carriers having significantly decreased odds of abstinence compared with those wildtype for CYP2A6 (OR=0.49, 95% CI 0.27-0.91, P = 0.02). Significant within-treatment marginal effects also remain.
Within the placebo group, the odds of abstinence between fast and slow metabolizers decreases further (OR=0.16, 95% CI 0.04-0.64, P=0.01); fast metabolizers, again, have decreased odds of abstinence compared with normal metabolizers (OR=0.25, 95% CI 0.07-0.91, P=0.04). Within the patch treatment, compared to slow metabolizers, fast (OR=0.14, 95% CI 0.04-0.57, P=0.01) and normal (OR=0.31, 95% CI 0.12-0.80, P=0.02) metabolizers, again, have significantly decreased odds of smoking abstinence.
Predicted smoking abstinence probabilities, stratified by CYP2A6, reflect this significant marginal effect with a consistent 50% decrease in abstinence probabilities between genotypes (Figure 8a). Slow metabolizers on the patch have the highest abstinence rate across the entire study (0.36), and further stratification by CYP2A6 shows that slow metabolizers wildtype for CYP2A6 have the highest abstinence rate (0.42).

Figure 8b Observed smoking abstinence probabilities, by NMR group, CYP2A6, and treatment

However, observed abstinence probabilities, stratified by CYP2A6, treatment and metabolism groups, show marked departures from values predicted by regression. From the univariate association between NMR and CYP2A6, CYP2A6 variant carriers have lower NMR levels than those wildtype for CYP2A6. This is clearly observed across metabolism groups, where the number of variant carriers in the fast metabolism group (N=12, 7% of fast metabolizers) is smaller than the number in the slow and normal metabolism groups (N = 57 and 40, 38% and 13% of slow and normal metabolizers, respectively). The univariate and joint associations between CYP2A6 and smoking abstinence show that carrying the CYP2A6 variant is associated with decreased smoking abstinence, whereas the association between NMR and CYP2A6 would suggest the opposite. Observed abstinence probabilities provide an explanation.
The small number of non-abstinent CYP2A6 variant carriers among the fast metabolizers drives this association between CYP2A6 and smoking abstinence. Moreover, across the other metabolism groups, the small number of variant carriers across treatments reduces the power of the interaction across CYP2A6 and nicotine metabolism groups.

[Figure 8b bar labels, observed Pr(Smoking Abstinence) with group size, CYP2A6 wildtype / CYP2A6 variant:
Slow:   Placebo 0.37 (N=19) / 0.12 (N=17), Bupropion 0.24 (N=25) / 0 (N=11), Patch 0.38 (N=21) / 0.33 (N=12), Spray 0.28 (N=29) / 0.18 (N=17)
Normal: Placebo 0.21 (N=62) / 0.20 (N=10), Bupropion 0.26 (N=72) / 0.30 (N=10), Patch 0.18 (N=65) / 0.10 (N=10), Spray 0.20 (N=64) / 0.10 (N=10)
Fast:   Placebo 0.07 (N=45) / 0 (N=1), Bupropion 0.29 (N=48) / 0 (N=1), Patch 0.10 (N=29) / 0 (N=6), Spray 0.25 (N=28) / 0 (N=4)]

Those wildtype for CYP2A6 reveal trends consistent with previously reported treatment effects across metabolism groups. Those in the placebo group show an expected decrease in abstinence probabilities from slow metabolizers (37%) to normal and fast metabolizers (21% and 7%, respectively); this also suggests that slow metabolizers may not need any treatment to effectively quit smoking. Those in the bupropion group have consistent abstinence probabilities (24% to 29%), with a disproportionate number of slow metabolizers being allocated to bupropion compared to fast metabolizers (N = 25 and 48, respectively). Abstinence probabilities for those in the patch and spray groups show remarkable consistency with reported effects. Slow metabolizers in the patch group have the overall highest abstinence probability (38%); however, the patch is less effective for normal metabolizers (18%) and poor for fast metabolizers (10%). The spray is relatively efficacious for slow and normal metabolizers (28% and 20%, respectively), and is 2.5-fold more effective for fast metabolizers, compared to fast metabolizers in the patch group (25% vs. 10%).
The differences between predicted and observed relationships between these variables highlight the inability of conventional regression to tease apart complex interactions and relationships. In order to address these deficiencies, we propose a statistical framework aimed at capturing the complex associations.
Aims
This paper will utilize this nonparametric latent variable framework to address two goals:
(1) We will use NMR levels and CYP2A6 genotypes to estimate and characterize latent clusters with regard to NMR and CYP2A6, as well as smoking abstinence profiles for these latent clusters.
(2) We will evaluate how well these latent clusters predict smoking abstinence.
Using a nonparametric hierarchical latent variable framework, our aim is to leverage genetic and biomarker data to estimate an underlying profile for categories of individuals that describe risk of nicotine addiction. We will carry out simulations under conditions similar to the actual data to explore this framework and how it allocates values based on observed variables, varying parameters of gene-latent variable associations, latent variable-biomarker associations, and latent variable-outcome associations. We will then apply this framework to our data to assess how individuals are allocated into clusters that capture a relationship between NMR and CYP2A6 and explain characteristics influencing smoking abstinence that conventional regression does not. We will discuss extensions of this framework to multiple variables, such as gender, other biomarkers (FTND), genes, latent variables, and outcomes, to develop a framework that can flexibly incorporate and define numerous variables and their relationships.
Latent Variable Framework
Fundamental Latent Variable Framework

Figure 9 Latent Variable Framework
[Figure 9: diagram linking the determinant W, the latent variable X, the measurement Z, and the outcome Y through the effects α (W on X), γ (X on Z), and β (X on Y).]

We present a hierarchical framework based on Richardson and Gilks 2 and Thomas, 1 a latent variable model composed of three submodels (Figure 9):
“Process” Model: $X \sim \alpha W$
“Measurement” Model: $Z \sim \gamma X$
“Disease” Model: $Y \sim \beta X$
Within this general framework, the latent variable X describes an underlying exposure or process that captures a “true” state of nature (e.g., nicotine dependence or a biological pathway), W is a quantified “formative” exposure that has a direct effect on the latent variable (e.g., cigarettes per day or genes within a pathway), Z is a measured “reflective” indicator that may be a proxy for the latent variable (e.g., withdrawal symptoms or biomarker measurements), and Y is the outcome of interest (e.g., smoking abstinence or cancer). This model implies that, conditional on X, the other three variables are independent of one another (i.e., W, Z, and Y are associated with one another through X).
Univariate and multivariate regressions with Y as the outcome may be susceptible to measurement error in W and Z. In the study of complex biological pathways, biomarkers are imperfect proxies and gene functionality is only partially known for many processes that have not been completely elucidated. Regressing them in univariate models might result in biased or attenuated unadjusted effect estimates, and a joint model would provide marginal effects that account for the other variable but not for measurement error in that respective variable.
Nonparametric Latent Variable Framework
If we desire a latent variable that is not constrained to a single distribution, a semi- or nonparametric framework can be constructed, where the latent variable is made up of a mixture of parametric distributions, and observed measurements can be allocated into k clusters. Thus, individuals can be allocated into groups that share similar characteristics for W and Z.
The Dirichlet process (DP) is a “nonparametric” model made up of a mixture of parametric distributions that allocates observed measurements into k clusters. 6, 15 Cluster membership is discrete, and each cluster typically has a normal distribution. This provides flexibility in the estimation of some unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards respective cluster effects rather than a grand mean. In doing so, groups can be distinguished from one another, rather than constraining all observations to a single distribution.
Cluster allocation using the Dirichlet process follows:
$$X_i \sim \mathrm{Categorical}(P_k)$$
$$P_k \sim \mathrm{DP}(\alpha P_{k=1})$$
$$\alpha \sim \mathrm{Uniform}(a, b)$$
where the latent variable, $X_i$, comes from a posterior categorical distribution in which allocation into cluster $k$ has some probability $P_k$. The Dirichlet process is the conjugate prior distribution of the categorical distribution, with $P_k$ a function of the baseline probability of allocation into the first latent category, $P_1$, and the hyperparameter $\alpha$. $\alpha$ comes from a uniform distribution and governs how readily observations are allocated to additional clusters, and thus the number of clusters, $k$. The Dirichlet process and the impact of $\alpha$ on clustering can be described through the DP stick-breaking process.
Allocation into latent clusters $(1, 2, \ldots, k)$ comes from a random draw of beta-distributed probabilities, $(\xi_1, \xi_2, \ldots, \xi_k)$, which are combined so that the resulting allocation probabilities satisfy $\sum_k P_k = 1$:
$$\xi_k \sim \mathrm{Beta}(1, \alpha)$$
$$f(\xi_k) = \alpha (1 - \xi_k)^{\alpha - 1}$$
$$E(\xi_k) = (1 + \alpha)^{-1}$$
Given a continuous variable, $Z$, that is a surrogate for the latent variable and has values from $-\infty$ to $+\infty$, allocation into the first latent cluster is $P_1 = \xi_1$ and corresponds to the interval $(-\infty, z_1)$. This piece of the “stick” is broken off, and allocation into the second latent cluster is a function of the probability of not being in the first latent cluster, $(1 - \xi_1)$, and the random probability draw of allocation into the second latent cluster, $\xi_2$, giving $P_2 = (1 - \xi_1)\xi_2$ and corresponding to the interval $[z_1, z_2)$. This stick-breaking process continues, giving the probability of allocation into cluster $k$:
$$P_k = \xi_k \prod_{j<k} (1 - \xi_j)$$
From this process, we can see the impact of $\alpha$. Smaller values of $\alpha$ will lead to larger probabilities of being allocated into initial latent clusters, decreasing the probability of allocation into further latent clusters. Conversely, larger values of $\alpha$ will lead to smaller allocation probabilities into initial latent clusters, increasing the probability of allocation into further latent clusters.
The flexibility of this framework lies in the non-deterministic nature of $k$ and the specification of the allocation probability, $P_k$. The number of clusters, $k$, may or may not be pre-specified, allowing the researcher to set the number of clusters if known a priori, or allowing $k$ to be estimated from the observed data. Instead of coming from a Beta distribution, random probability distributions for each cluster, $\xi_k$, can also come from discrete or continuous prior distributions, with non-informative or informative parameters. Moreover, the hyperparameter, $\alpha$, can have a Uniform or Beta distribution with parameters that constrain the number of clusters and extent of clustering.
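As a minimal illustration of the stick-breaking construction described above, the following Python sketch draws allocation probabilities from Beta(1, α) stick fractions and compares the average probability of the first cluster under a small and a large α. It is not the JAGS implementation used in this work. The function names, the truncation at a fixed maximum number of clusters (with the last cluster taking the remainder of the stick), and the values of α are assumptions made only for illustration.

```python
import random


def stick_breaking_weights(alpha, k_max, rng):
    """Return k_max allocation probabilities from truncated stick-breaking."""
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(k_max - 1):
        xi = rng.betavariate(1.0, alpha)  # xi_k ~ Beta(1, alpha)
        weights.append(remaining * xi)    # P_k = xi_k * prod_{j<k} (1 - xi_j)
        remaining *= 1.0 - xi
    # Truncation (an assumption of this sketch): the last cluster takes the rest.
    weights.append(remaining)
    return weights


def mean_first_weight(alpha, k_max=10, n_draws=5000, seed=1):
    """Monte Carlo average of P_1 for a given alpha."""
    rng = random.Random(seed)
    total = sum(stick_breaking_weights(alpha, k_max, rng)[0] for _ in range(n_draws))
    return total / n_draws


example = stick_breaking_weights(alpha=1.0, k_max=5, rng=random.Random(0))
small_alpha_p1 = mean_first_weight(alpha=0.3)   # expected value near 1 / (1 + 0.3)
large_alpha_p1 = mean_first_weight(alpha=10.0)  # expected value near 1 / (1 + 10)
print("example weights:", [round(w, 3) for w in example])
print("mean P_1 with alpha = 0.3:", round(small_alpha_p1, 3))
print("mean P_1 with alpha = 10:", round(large_alpha_p1, 3))
```

Under these assumptions, the average of the first allocation probability should be close to 1/(1 + α), so the smaller α concentrates mass in the initial cluster while the larger α spreads it over further clusters.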
We propose a modification of the fundamental framework that incorporates the Dirichlet process in latent variable estimation (Figure 10).

Figure 10 Latent Variable Framework incorporating the Dirichlet process and CYP2A6
[Figure 10: diagram linking CYP2A6, the latent variable $X_k$, NMR, treatment, and smoking abstinence through the parameters α, P, θ, λ, β, γ, and ρ defined in the text.]

The Dirichlet process is incorporated into the component corresponding to the “Process” model:
$$X_i \sim \mathrm{Categorical}(P_{\mathrm{CYP2A6},k})$$
$$P_{\mathrm{CYP2A6},k} \sim \mathrm{DP}(\alpha P_{\mathrm{CYP2A6},k=1})$$
$$\alpha \sim \mathrm{Uniform}(a, b)$$
We leverage genetic information from CYP2A6 by incorporating it into cluster allocation probabilities. Cluster allocation probabilities for a given CYP2A6 genotype, $g$, come from the random draw of beta-distributed probabilities, $(\xi_{\mathrm{CYP2A6}=g,1}, \xi_{\mathrm{CYP2A6}=g,2}, \ldots, \xi_{\mathrm{CYP2A6}=g,k})$, so that within each genotype the allocation probabilities sum to one, $\sum_k P_{\mathrm{CYP2A6}=g,k} = 1$. This allows for different allocation probabilities into the same latent cluster $k$ across different genotypes. Although allocation probabilities, $P_{\mathrm{CYP2A6},k}$, are estimated for $g \times k$ CYP2A6 and cluster combinations, we still estimate $k$ latent clusters, $X_{i \subseteq k}$. There is a similar dimension reduction when incorporating another variable, $W$, with $w$ possible values, where allocation probabilities, $P_{W,\mathrm{CYP2A6},k}$, are estimated for $w \times g \times k$ cluster combinations but only $k$ latent clusters are estimated.
The corresponding component to the “Measurement” model is:
$$\mathrm{NMR}_k \sim N(\theta_{X_k}, \tau^2)$$
$$\theta_{X_k} = \theta_{X_{k-1}} + \lambda_{k-1}$$
where $\theta_{X_k}$ is the mean NMR value for latent cluster $k$, and $\lambda_{k-1}$ is the difference in NMR means between latent cluster $k$ and the latent cluster preceding it, $k-1$. Thus, $\theta_{X_k}$ would be the NMR mean over the interval of NMR values $[\mathrm{NMR}_{k-1}, \mathrm{NMR}_k)$, where $\mathrm{NMR}_{k-1}$ is the upper-bound NMR for cluster $k-1$ and $\mathrm{NMR}_k$ is the upper-bound NMR for cluster $k$.
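To make the “Measurement” component concrete, the short Python sketch below builds cluster means from a first mean and successive differences, then draws NMR values around the mean of each individual's cluster. The function names and all numeric values are hypothetical, and the sketch assumes positive differences λ so that the cluster means are ordered; it illustrates only the recursion above, not the fitted model.

```python
import random
from itertools import accumulate


def cluster_means(theta_1, lambdas):
    """theta_k = theta_{k-1} + lambda_{k-1}: cumulative sums starting at theta_1."""
    return list(accumulate(lambdas, initial=theta_1))


def simulate_nmr(cluster_labels, means, tau, rng):
    """Draw one NMR value per individual from N(theta of their cluster, tau^2)."""
    return [rng.gauss(means[x - 1], tau) for x in cluster_labels]  # labels are 1-based


# Hypothetical values chosen only for illustration.
means = cluster_means(theta_1=0.2, lambdas=[0.2, 0.3])
labels = [1, 1, 2, 2, 3, 3]
nmr = simulate_nmr(labels, means, tau=0.05, rng=random.Random(0))
print("cluster means:", [round(m, 2) for m in means])
print("simulated NMR:", [round(v, 2) for v in nmr])
```

Parameterizing the means through differences rather than directly is one way to keep clusters ordered along the NMR scale, which matches the interval interpretation given above.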
The regression of the outcome on the latent variable, the “Disease” model, then follows:
$$\operatorname{logit}\left(\Pr(\text{Smoking Abstinence} \mid X_k, \mathrm{Tx})\right) = \beta_1 + \beta X_k + \gamma\,\mathrm{Tx} + \rho\,X_k \times \mathrm{Tx}$$
where $\beta_1$ is the estimate for those in the first latent cluster ($X_1$) and in the placebo group, $\beta$ is the vector of effects for subsequent latent clusters ($X_2, X_3, \ldots, X_k$) in the placebo group, $\gamma$ is the vector of effects for those in the first latent category and receiving a treatment (bupropion, patch, spray), and $\rho$ is the vector of effects for the latent variable × treatment interaction. This disease arm can also be specified as a model for a continuous outcome or as a Cox proportional hazards model.
Framework Evaluation
There are two aspects to evaluating our framework: parameter estimation and prediction. Of particular interest in parameter estimation is the effect of latent clusters on smoking abstinence, across the different treatments. We can then evaluate the strength of association between variables of interest. Perhaps a more comprehensive application of this framework is its potential for prediction. Given a representative sample, we can estimate posterior latent cluster allocation probabilities and posterior outcome probabilities conditional on these cluster allocation probabilities. Posterior allocation and outcome distributions can subsequently be estimated for a sample similar to this representative sample. While prediction offers a potentially comprehensive evaluation of our framework, prior research has suggested that it may not be as powerful as parameter estimation in detecting associations of interest. 16 Thus, both approaches are valuable in evaluating our framework.
Parameter Estimation
The most direct evaluation of our framework is the association between our estimated latent clusters and smoking abstinence, across the different treatments. We fit our framework and obtain parameter estimates using standard Markov Chain Monte Carlo (MCMC) methods in the software Just Another Gibbs Sampler (JAGS).
17 We obtain a distribution for each parameter of interest (e.g., $\beta$, $\theta_{X_k}$) across all MCMC iterations in framework estimation. We use the mean and standard deviation of a parameter’s respective distribution to obtain the effect estimate for that parameter. Confidence intervals can then be constructed to determine the significance and, subsequently, the strength of association for a given parameter.
Prediction – Conventional Regression
In order to assess the performance of our framework in prediction, we compare it to predicted probabilities obtained using conventional regressions. Given a representative sample for estimation, $N_{est}$, and a sample for prediction, $N_{pred}$, each with an outcome variable, $Y$, a measured surrogate for some underlying process, $Z$, and a determinant of the underlying process and the surrogate, $W$, we obtain effect estimates $\beta_2$ and $\beta_3$ using $N_{est}$ for the association of $Y$ on $W$ and $Z$, respectively:
$$\operatorname{logit}\left(\Pr(Y \mid W, Z)\right) = \beta_1 + \beta_2 W + \beta_3 Z$$
Then, using $N_{pred}$ and these effect estimates, we can obtain predicted probabilities of $Y$:
$$E\left[\Pr(Y \mid W, Z)\right] = \operatorname{expit}\left(\hat{\beta}_1 + \hat{\beta}_2 W + \hat{\beta}_3 Z\right)$$
Prediction – Latent Variable Framework
Using our framework, for $N_{est}$, we can estimate cluster allocation probabilities for the latent variable, $X_k$, conditional on $W$ having value $w$ and $Z$ having value $z$:
$$\Pr\left(X_k = \hat{X} \mid W = w, Z = z\right) = \begin{cases} \Pr(X_i = 1 \mid W = w, Z = z) \\ \Pr(X_i = 2 \mid W = w, Z = z) \\ \quad\vdots \\ \Pr(X_i = k \mid W = w, Z = z) \end{cases}$$
Subsequently, conditional on cluster allocation, we can estimate the effect of the latent variable on the probability of the outcome $Y$:
$$\operatorname{logit}\left[\Pr(Y \mid X_k = k)\right] = \beta X_k$$
where $\beta$ is the vector of effects for each latent cluster. Then, for $N_{pred}$, conditional on values for $W$ and $Z$, we can obtain posterior estimates for cluster allocation probabilities.
Conditional on these posterior allocation probabilities and corresponding effect estimates of the latent variable on the outcome, we can obtain posterior probabilities for the outcome:
$$E\left[\Pr(Y \mid X_k)\right] = \operatorname{expit}\left(\hat{\beta} X_k\right)$$
Area Under the Curve
The area under the curve (AUC), used to assess the prediction of latent clusters and the outcome, is estimated using a method described by Mason and Graham. 18 It is calculated as:
$$\mathrm{AUC} = 1 - \frac{1}{e'e} \sum_{i=1}^{e'} f_i$$
Given a dataset with variable $Q$, where 1 indicates the occurrence of an event and 0 indicates the absence of an event, and the corresponding probabilities that an event occurs for each individual $j$, $\Pr(Q_j = 1)$, we order the data in descending order of probabilities. $e'$ is the total number of individuals for whom $Q = 1$, and $e$ is the total number of individuals for whom $Q = 0$. Beginning with individual $i = 1$, who has the highest $\Pr(Q = 1)$ among those for whom an event occurs, $f_i$ is the number of individuals who do not have the event (i.e., $Q = 0$) but have a greater probability of the event, $\Pr(Q = 1)$. Then, for $i = 2$, who has the next highest $\Pr(Q = 1)$ among those with the event, $f_i$ is computed in the same way: if $f_1 = 2$, then $f_2$ counts those same two individuals plus any non-event individuals whose probabilities fall between those of individuals 1 and 2. This continues in descending order of $\Pr(Q = 1)$ for all those with the event.
Simulations
In order to assess the predictive ability of our framework compared to conventional regressions, we apply the steps in predicting posterior probabilities and estimating AUCs described above to simulations. We simulate data under the fundamental latent variable framework (Figure 9). Corresponding with the “Process” model describing the relationship between the determinant $W$ and the latent variable $X$:
$$W \sim \mathrm{Bin}(N, 0.2)$$
$$\Pr(X) \sim \operatorname{expit}(\rho W)$$
$$X_k \sim \mathrm{Multinom}(N, k, \Pr(X))$$
The probability of 0.2 for $W$ comes from the frequency of the CYP2A6 variant. $\rho$ is the log odds ratio between $W$ and $X$, yielding probabilities for the latent variable.
The latent variable, $X_k$, is a function of the number of latent clusters, $k$, and the probabilities for those latent clusters. We simulate the measurement variable, $Z$, conditional on the latent variable:
$$Z \sim N(0.5 X_k, \sigma)$$
Each latent cluster has a cluster mean of $0.5k$, so that the first latent cluster has a mean of 0.5, the second latent cluster has a mean of 1.0, and so on. This fixes the distance between latent clusters, $\lambda$, at 0.5. $\sigma$ is the standard deviation around the latent clusters. Thus, a small $\sigma$ will yield a multimodal $Z$ with $k$ peaks, and a large $\sigma$ will yield a $Z$ that is approximately normally distributed.
The outcome, $Y$, is generated under one of two conditions:
(1) $\Pr(Y) \sim \operatorname{expit}(\beta X)$
(2) $\Pr(Y) \sim \operatorname{expit}(\beta X + \gamma W)$
$$Y \sim \mathrm{Bin}(N, \Pr(Y))$$
Equation (1) assumes that the outcome is a function of only the latent variable, while equation (2) assumes that there is an independent effect of the determinant, $W$, on the outcome. $\beta$ and $\gamma$ are the log odds ratios of the associations between the outcome and the latent variable and determinant, respectively. The outcome $Y$ is a function of these outcome probabilities.
In addition to fixing the frequency of $W$ at 0.2 and the distance between latent clusters at 0.5, within the latent variable framework, we fix $\alpha \sim \mathrm{Uniform}(0.3, 10)$ according to studies that found this prior to allow for an unconstrained and flexible amount of clustering.
The Uniform(0.3, 10) prior on α follows reference [6].

To investigate predictive ability, we vary the number of latent clusters (k), the standard deviation around clusters (σ), the association between W and X (ρ), the association between X and Y (β), and, under equation (2), the association between W and Y (γ), such that:

k = [2, 3, 7]
σ = [0.25, 0.5]
ρ = [log(1.5), log(4)]
β = [log(1.5), log(4)]
γ = [log(1.5), log(4)]

The values of k cover a simple number of clusters (2), the number of categories of nicotine metabolizers (3), and the maximum number of latent clusters that can be estimated without individuals going unassigned to a given latent cluster and thus causing issues in parameter estimation (7). We simulate moderate to strong variability in latent clusters (σ = 0.25 to 0.5) and moderate to strong associations (log(1.5) to log(4)). For each combination of parameters, we perform 20 replicates, each with N_est = 500 and N_pred = 500, and average the AUCs for the prediction of latent clusters and of the outcome.

Real Data Prediction

To obtain a sample for estimation (N_est) and a sample for prediction (N_pred) among the 616 individuals with complete data in the clinical trials, we sampled individuals proportionally across strata of treatment, CYP2A6, and smoking abstinence, so that within each stratum (e.g., bupropion, CYP2A6 wildtype, and abstinent at 6-month follow-up) five individuals were randomly selected for estimation for every one randomly selected for prediction. This gave 509 individuals for estimation (N_est) and 107 for prediction (N_pred).
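The stratified split described above could be implemented along the following lines. This is a sketch under stated assumptions rather than the procedure actually used: the record fields (`tx`, `cyp2a6`, `abstinent`), the function name `stratified_split`, and the rule that any remainder in a stratum goes to the estimation sample are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(records, strata_keys, ratio=5, seed=0):
    """Split records into estimation and prediction samples within strata.

    Within each stratum, one record is assigned to prediction for every
    `ratio` records assigned to estimation. Leftover records stay in the
    estimation sample (an assumption; the text does not say how remainders
    were handled).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in strata_keys)].append(rec)

    est, pred = [], []
    for members in strata.values():
        rng.shuffle(members)                      # random selection within the stratum
        n_pred = len(members) // (ratio + 1)
        pred.extend(members[:n_pred])
        est.extend(members[n_pred:])
    return est, pred


# Toy data: 4 treatments x 2 genotypes x 2 outcomes, 12 individuals per stratum.
records = [
    {"id": f"{tx}-{g}-{a}-{i}", "tx": tx, "cyp2a6": g, "abstinent": a}
    for tx in ("placebo", "bupropion", "patch", "spray")
    for g in (0, 1)
    for a in (0, 1)
    for i in range(12)
]
est, pred = stratified_split(records, ("tx", "cyp2a6", "abstinent"))
```

In this toy example each of the 16 strata contributes 10 individuals to estimation and 2 to prediction. Repeating the call with different seeds would correspond to the repeated sampling of N_est and N_pred.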
Conventional Regression

In order to assess the performance of our framework, we first obtained effect estimates under two conventional joint regression models using the 509 individuals (N_est):

(1) $\operatorname{logit}\left(\Pr(\text{Smoking Abstinence} \mid \text{NMR}, \text{Tx})\right) = \beta_1 + \beta\,\text{NMR} + \gamma\,\text{Tx} + \rho\,\text{NMR} \times \text{Tx}$

(2) $\operatorname{logit}\left(\Pr(\text{Smoking Abstinence} \mid \text{CYP2A6}, \text{NMR}, \text{Tx})\right) = \beta_1 + \beta_2\,\text{CYP2A6} + \beta\,\text{NMR} + \gamma\,\text{Tx} + \rho\,\text{NMR} \times \text{Tx}$

NMR was categorized into 2 categories (the lowest quartile versus the upper three quartiles) or 3 categories (lowest quartile, 25th-75th percentiles, highest quartile) for comparison with K = 2 or 3 for cluster allocation. Treatment is included in each model because we observe a treatment effect on smoking abstinence in the clinical trial. Marginal analyses show that each of these variables is associated with smoking abstinence. We are then interested in the effect of NMR and the NMR-treatment interaction (1), and in the full model including CYP2A6, NMR and the NMR-treatment interaction (2), which assumes an independent gene effect, conditional on NMR. Then, for the 107 individuals in the prediction subset (N_pred), the probability of smoking abstinence was estimated for each respective model such that:

$$E\left[\Pr(\text{Smoking Abstinence} \mid C)\right] = \operatorname{expit}\left(\hat{\beta}_1 + \hat{\beta} C\right)$$

where C is the set of variables regressed in a given model and $\hat{\beta}$ is the vector of estimated effects for these variables. Because we have full information on N_pred, we can then estimate the AUC for each respective model. We repeat this procedure of sampling N_est and N_pred 20 times and average the AUCs for each model.

Latent Variable Framework

Following our modified framework (Figure 5), for CYP2A6 genotype g and NMR value nmr, we estimate latent cluster allocation probabilities for the 509 individuals in N_est such that:

$$\Pr\left(X_k = \hat{X} \mid \text{CYP2A6} = g, \text{NMR} = nmr\right) = \begin{cases} \Pr\left(X_i = 1 \mid \text{CYP2A6} = g, \text{NMR} = nmr\right) \\ \Pr\left(X_i = 2 \mid \text{CYP2A6} = g, \text{NMR} = nmr\right) \\ \quad\vdots \\ \Pr\left(X_i = k \mid \text{CYP2A6} = g, \text{NMR} = nmr\right) \end{cases}$$

We then estimate the effect of the latent variable, treatment and CYP2A6 on the probability of smoking abstinence under one of two disease models:

(1) $\operatorname{logit}\left[\Pr(\text{Smoking Abstinence} \mid X_k, \text{Tx})\right] = \beta_1 + \beta X_k + \gamma\,\text{Tx} + \delta X_k \text{Tx}$

(2) $\operatorname{logit}\left[\Pr(\text{Smoking Abstinence} \mid X_k, \text{CYP2A6}, \text{Tx})\right] = \beta_1 + \beta X_k + \phi\,\text{CYP2A6} + \gamma\,\text{Tx} + \delta X_k \text{Tx}$

These two models correspond to the models assessed in the conventional regressions above. The first model assumes that the latent variable captures the effect of CYP2A6 on smoking abstinence, while the second model assumes that, conditional on the latent variable, CYP2A6 still has an independent effect on smoking abstinence. We model the latent variable-treatment interaction because conventional regressions show that the effect of the surrogate measure, NMR, varies across different treatments.

For the 107 individuals in the prediction subset (N_pred), conditional on CYP2A6 and NMR, we can obtain posterior probabilities of latent cluster allocation. We can then estimate the posterior probabilities of smoking abstinence, conditional on latent cluster allocation and treatment (1) or on cluster allocation, treatment and CYP2A6 (2), and their corresponding effect estimates:

(1) $E\left[\Pr(\text{Smoking Abstinence} \mid X_k, \text{Tx})\right] = \operatorname{expit}\left(\hat{\beta}_1 + \hat{\beta} X_k + \hat{\gamma}\,\text{Tx} + \hat{\delta} X_k \text{Tx}\right)$

(2) $E\left[\Pr(\text{Smoking Abstinence} \mid X_k, \text{CYP2A6}, \text{Tx})\right] = \operatorname{expit}\left(\hat{\beta}_1 + \hat{\beta} X_k + \hat{\phi}\,\text{CYP2A6} + \hat{\gamma}\,\text{Tx} + \hat{\delta} X_k \text{Tx}\right)$

Again, because we have full information, we can estimate the AUC for the prediction of smoking abstinence. We repeat the sampling of N_est and N_pred 20 times to obtain parameter estimates and average AUCs under these two disease models.

Simulation Results

K = 2

When 2 latent clusters are simulated, we compare AUCs for latent cluster allocation and subsequent outcome probabilities to AUCs obtained for outcome probabilities using conventional regression (Tables 8a and 9a). Parameter estimates β,
γ, and θ are also reported (Tables 8b and 9b):

Table 8a K = 2, AUCs, LC and outcome prediction, for simulated conditions of θ, σ, ρ and β

Simulated parameters: θ1 = 0.5, θ2 = 1.0. ρ and β are shown as odds ratios. Entries are AUC (SD).

| σ, Cluster SD | ρ, LC-G | β, Y-LC | LC Prediction | Outcome Prediction, LV Framework | Outcome Prediction, Conventional Regression |
|---|---|---|---|---|---|
| 0.25 | 1.5 | 1.5 | 0.903 (0.03) | 0.552 (0.04) | 0.546 (0.03) |
| 0.25 | 1.5 | 4 | 0.916 (0.01) | 0.636 (0.03) | 0.616 (0.03) |
| 0.25 | 4 | 1.5 | 0.93 (0.01) | 0.543 (0.03) | 0.538 (0.03) |
| 0.25 | 4 | 4 | 0.924 (0.01) | 0.634 (0.03) | 0.614 (0.02) |
| 0.5 | 1.5 | 1.5 | 0.726 (0.03) | 0.522 (0.02) | 0.518 (0.02) |
| 0.5 | 1.5 | 4 | 0.74 (0.03) | 0.579 (0.03) | 0.567 (0.02) |
| 0.5 | 4 | 1.5 | 0.775 (0.02) | 0.529 (0.03) | 0.52 (0.03) |
| 0.5 | 4 | 4 | 0.776 (0.02) | 0.595 (0.02) | 0.577 (0.02) |

From simulations, we observe that latent clusters are accurately predicted (AUC > 0.9) when the cluster standard deviation is low (σ = 0.25), and still well predicted (AUC ~ 0.75) at the greater standard deviation (σ = 0.5); both of these standard deviations yield approximately normal distributions for our simulated measurement variable, Z. Moreover, when there is a strong simulated effect between the gene and latent clusters (ρ = log(4)), latent cluster prediction improves under both cluster standard deviation conditions. Thus, while cluster allocation is largely a function of Z, as seen in the impact of cluster standard deviation, modeling the gene (G) in cluster estimation improves cluster allocation, especially when the gene and clusters are associated.

Under simulations, outcome prediction using the latent variable framework is better than conventional regression for all conditions. Prediction improves as simulated associations increase, with a smaller cluster standard deviation (σ) and larger Y-LC (β) association giving AUCs of 0.63 and 0.62 for the LV framework and conventional regression, respectively.
Also of interest is that the LC-G (ρ) association has a nominal impact on outcome prediction when the cluster standard deviation is small, but a more pronounced impact on outcome prediction in the latent variable framework when the cluster standard deviation is larger (σ = 0.5): as ρ increases from log(1.5) to log(4), the AUC rises from 0.52 to 0.53 when β=log(1.5) and from 0.58 to 0.60 when β=log(4). Thus, as expected, stronger simulated effects between the latent cluster and outcome lead to improved outcome prediction, and when latent cluster prediction is attenuated (σ = 0.5), a stronger association between the gene and latent clusters (ρ=log(4)) improves outcome prediction.

Table 8b K = 2, Parameter estimates for simulated conditions of θ, σ, ρ and β

Parameter estimates (Table 8b) for cluster means are closer to simulated conditions when associations are strong, but even weaker conditions yield parameter estimates approaching the simulated parameters. When the simulated standard deviation around clusters is smaller (σ = 0.25), θ1 and θ2 are near 0.5 and 1.0, respectively, with standard deviations close to the simulated cluster standard deviations (0.2-0.25); this is reflected in corresponding LC prediction AUCs greater than 0.9. Under simulated conditions with the weakest associations (σ = 0.5, ρ=log(1.5), β=log(1.5)), θ1 and θ2 are shifted towards greater values (0.67 and 1.72, respectively). However, stronger associations between latent clusters and the gene (ρ=log(4)), or between the outcome and latent clusters (β=log(4)), both shrink cluster means back towards simulated values (0.53-0.56 and 1.2-1.3, respectively). So the impact of other associations is observed under conditions where there is greater variability in cluster means, with stronger associations contributing to improved cluster definition.
This is reflected in the corresponding LC prediction AUCs, where stronger associations between latent clusters, the gene and the outcome improve AUCs when the cluster standard deviations are larger.

Estimated parameters (simulated parameters: θ1 = 0.5, θ2 = 1.0; ρ and β are shown as odds ratios):

| σ, Cluster SD | ρ, LC-G | β, Y-LC | θ1 - Mean (SD) | θ2 - Mean (SD) | β - OR (95% CI) |
|---|---|---|---|---|---|
| 0.25 | 1.5 | 1.5 | 0.56 (0.25) | 1.16 (0.18) | 1.35 (0.62-2.9) |
| 0.25 | 1.5 | 4 | 0.47 (0.21) | 1.04 (0.2) | 3.35 (1.85-6.08) |
| 0.25 | 4 | 1.5 | 0.46 (0.21) | 1.04 (0.21) | 1.32 (0.88-1.97) |
| 0.25 | 4 | 4 | 0.48 (0.21) | 1.03 (0.21) | 3.46 (2.08-5.76) |
| 0.5 | 1.5 | 1.5 | 0.67 (0.52) | 1.72 (0.3) | 1.12 (0.49-2.6) |
| 0.5 | 1.5 | 4 | 0.52 (0.44) | 1.36 (0.35) | 2.01 (0.71-5.74) |
| 0.5 | 4 | 1.5 | 0.56 (0.48) | 1.15 (0.45) | 1.26 (0.66-2.4) |
| 0.5 | 4 | 4 | 0.53 (0.48) | 1.21 (0.45) | 2.18 (0.95-5.03) |

The estimated association between latent clusters and the outcome (β) is attenuated when latent clusters are not as well estimated. The estimated odds ratio approaches the simulated values of 1.5 and 4.0 when the cluster standard deviation is smaller (σ = 0.25) and the association between latent clusters and the gene is strong (ρ=log(4)), with estimated odds ratios of 1.3 (95% CI 0.9-2.0) and 3.5 (95% CI 2.1-5.8). However, decreasing the association between latent clusters and the gene (ρ=log(1.5)) increases the variability around these estimates (OR = 1.35, 95% CI 0.6-3.0, and OR = 3.4, 95% CI 1.9-6.1, respectively). Increasing the cluster standard deviation (σ = 0.5) further attenuates β (OR = 1.1-1.2 and 2.0-2.2, for simulated values of 1.5 and 4.0, respectively), with a stronger latent cluster-gene association having a similar impact in decreasing the variability around these estimates.
Table 9a K = 2, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ, β, and γ

Simulated parameters: θ1 = 0.5, θ2 = 1.0. ρ, β and γ are shown as odds ratios. Entries are AUC (SD) for outcome prediction. The row marked with an asterisk is the condition judged closest to the real data.

| σ, Cluster SD | ρ, LC-G | β, Y-LC | γ, Y-G | LV Framework | Conventional Regression |
|---|---|---|---|---|---|
| 0.25 | 1.5 | 1.5 | 1.5 | 0.567 (0.04) | 0.566 (0.03) |
| 0.25 | 1.5 | 1.5 | 4 | 0.607 (0.03) | 0.61 (0.02) |
| 0.25 | 1.5 | 4 | 1.5 | 0.648 (0.03) | 0.632 (0.03) |
| 0.25 | 1.5 | 4 | 4 | 0.689 (0.03) | 0.673 (0.02) |
| 0.25 | 4 | 1.5 | 1.5 | 0.569 (0.02) | 0.562 (0.02) |
| 0.25 | 4 | 1.5 | 4 | 0.637 (0.02) | 0.633 (0.02) |
| 0.25 | 4 | 4 | 1.5 | 0.648 (0.02) | 0.631 (0.02) |
| 0.25 | 4 | 4 | 4 | 0.706 (0.02) | 0.688 (0.02) |
| 0.5 * | 1.5 | 1.5 | 1.5 | 0.561 (0.02) | 0.556 (0.02) |
| 0.5 | 1.5 | 1.5 | 4 | 0.615 (0.02) | 0.61 (0.02) |
| 0.5 | 1.5 | 4 | 1.5 | 0.596 (0.03) | 0.585 (0.02) |
| 0.5 | 1.5 | 4 | 4 | 0.65 (0.02) | 0.638 (0.01) |
| 0.5 | 4 | 1.5 | 1.5 | 0.552 (0.04) | 0.549 (0.03) |
| 0.5 | 4 | 1.5 | 4 | 0.616 (0.02) | 0.616 (0.02) |
| 0.5 | 4 | 4 | 1.5 | 0.606 (0.02) | 0.597 (0.02) |
| 0.5 | 4 | 4 | 4 | 0.666 (0.03) | 0.655 (0.02) |

When we simulate an independent gene effect on the outcome (γ), the prediction of latent clusters does not change appreciably (data not shown), but outcome prediction improves further, and our latent variable framework generally continues to outperform conventional regression. Including this independent Y-G effect improves prediction under each respective condition. For example, under the strongest simulated conditions without the gene effect (σ=0.25, ρ=log(4), β=log(4)), the outcome prediction AUC for the latent variable framework and conventional regression is 0.63 and 0.61, respectively. Adding even a weak Y-G association (γ=log(1.5)) increases the AUC to 0.65 and 0.63, respectively, and a strong Y-G association (γ=log(4)) further increases the AUC to 0.71 and 0.69, respectively.

However, we are most interested in the simulation conditions closest to those in our data. From marginal associations (Table 7) and the NMR distribution, we can surmise that the simulation conditions marked with an asterisk in Table 9a (σ=0.5, ρ=log(1.5), β=log(1.5), γ=log(1.5)) are closest to the real data. Under these conditions, we would expect reasonable prediction of latent clusters (AUC=0.73) and slightly improved outcome prediction (AUC=0.56).
Table 9b K = 2, Parameter estimates for simulated conditions of θ, σ, ρ, β, and γ

Simulating an independent effect of the gene on the outcome has only a modest impact on the previously described results from the framework without this independent effect. Cluster means (θ1 and θ2) when the simulated cluster standard deviation is smaller (σ=0.25) are near the simulated values of 0.5 and 1.0, respectively, across all values of ρ, β, and γ, with estimated cluster standard deviations roughly equal to the simulated value (0.2-0.25). Compared to the framework without the independent gene-outcome effect, estimates of β are slightly attenuated when cluster standard deviations are smaller (σ=0.25) and both the simulated latent cluster-gene and outcome-latent cluster effects are strong (ρ=log(4), β=log(4)) (OR=3.5 without the gene-outcome effect, OR=3.1-3.2 with the gene-outcome effect). However, when cluster standard deviations are larger (σ=0.5) and both the simulated latent cluster-gene and outcome-latent cluster effects are strong (ρ=log(4), β=log(4)), including the gene-outcome effect gives slightly larger estimates of β (OR=2.2 without the gene-outcome effect in Table 8b, OR=2.6-2.7 with the gene-outcome effect in Table 9b).
Estimated parameters (simulated parameters: θ1 = 0.5, θ2 = 1.0; ρ, β and γ are shown as odds ratios):

| σ, Cluster SD | ρ, LC-G | β, Y-LC | γ, Y-G | θ1 - Mean (SD) | θ2 - Mean (SD) | β - OR (95% CI) | γ - OR (95% CI) |
|---|---|---|---|---|---|---|---|
| 0.25 | 1.5 | 1.5 | 1.5 | 0.54 (0.24) | 1.11 (0.18) | 1.19 (0.56-2.54) | 1.44 (0.9-2.29) |
| 0.25 | 1.5 | 1.5 | 4 | 0.54 (0.25) | 1.14 (0.19) | 1.27 (0.56-2.88) | 3.29 (1.85-5.85) |
| 0.25 | 1.5 | 4 | 1.5 | 0.48 (0.22) | 1.05 (0.2) | 3.41 (1.58-7.37) | 1.31 (0.83-2.07) |
| 0.25 | 1.5 | 4 | 4 | 0.47 (0.22) | 1.05 (0.2) | 2.88 (1.36-6.07) | 3.31 (1.94-5.63) |
| 0.25 | 4 | 1.5 | 1.5 | 0.46 (0.21) | 1.04 (0.21) | 1.34 (0.85-2.12) | 1.3 (0.84-2.04) |
| 0.25 | 4 | 1.5 | 4 | 0.47 (0.21) | 1.04 (0.21) | 1.33 (0.79-2.27) | 3.65 (2.12-6.3) |
| 0.25 | 4 | 4 | 1.5 | 0.47 (0.21) | 1.03 (0.21) | 3.12 (1.84-5.29) | 1.32 (0.83-2.11) |
| 0.25 | 4 | 4 | 4 | 0.47 (0.22) | 1.04 (0.21) | 3.18 (1.83-5.53) | 3.52 (2-6.2) |
| 0.5 | 1.5 | 1.5 | 1.5 | 0.69 (0.53) | 1.73 (0.3) | 1.18 (0.43-3.24) | 1.37 (0.85-2.21) |
| 0.5 | 1.5 | 1.5 | 4 | 0.68 (0.52) | 1.58 (0.31) | 1.16 (0.38-3.53) | 3.36 (1.78-6.35) |
| 0.5 | 1.5 | 4 | 1.5 | 0.58 (0.48) | 1.48 (0.32) | 2.27 (0.74-6.97) | 1.38 (0.83-2.29) |
| 0.5 | 1.5 | 4 | 4 | 0.56 (0.46) | 1.36 (0.38) | 1.99 (0.62-6.37) | 2.85 (1.51-5.37) |
| 0.5 | 4 | 1.5 | 1.5 | 0.65 (0.52) | 1.16 (0.46) | 1.29 (0.56-2.97) | 1.36 (0.79-2.35) |
| 0.5 | 4 | 1.5 | 4 | 0.62 (0.51) | 1.18 (0.42) | 1.35 (0.44-4.08) | 3.41 (1.55-7.54) |
| 0.5 | 4 | 4 | 1.5 | 0.4 (0.41) | 1.17 (0.42) | 2.62 (1.03-6.65) | 1.45 (0.82-2.56) |
| 0.5 | 4 | 4 | 4 | 0.5 (0.45) | 1.23 (0.44) | 2.68 (0.92-7.82) | 3.26 (1.51-7.04) |

K = 3

When 3 latent clusters are simulated, the patterns in prediction (Tables 10a-b and 11a) and parameter estimation (Tables 10c-d and 11b-c) are similar to those for 2 simulated latent clusters:

Table 10a K = 3, AUCs, LC prediction, for simulated conditions of θ, σ, ρ and β

Table 10b K = 3, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ and β

Our latent variable framework, without a simulated effect between the gene (G) and outcome (Y), has higher outcome prediction AUCs than conventional regression under every simulation condition. Cluster prediction AUCs for the first and third latent clusters are higher when simulated cluster standard deviations are smaller (σ=0.25, AUC > 0.9; σ=0.50, AUC of approximately 0.8-0.85).
AUC - Cluster Prediction (K = 3; ρ and β are shown as odds ratios; entries are AUC (SD)):

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | LC=1 | LC=2 | LC=3 |
|---|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 0.944 (0.01) | 0.674 (0.12) | 0.938 (0.02) |
| 0.5 | 0.25 | 1.5 | 4 | 0.938 (0.02) | 0.596 (0.07) | 0.942 (0.02) |
| 0.5 | 0.25 | 4 | 1.5 | 0.948 (0.01) | 0.809 (0.06) | 0.944 (0.01) |
| 0.5 | 0.25 | 4 | 4 | 0.949 (0.01) | 0.726 (0.11) | 0.947 (0.01) |
| 0.5 | 0.5 | 1.5 | 1.5 | 0.794 (0.03) | 0.539 (0.03) | 0.804 (0.02) |
| 0.5 | 0.5 | 1.5 | 4 | 0.801 (0.02) | 0.529 (0.03) | 0.803 (0.02) |
| 0.5 | 0.5 | 4 | 1.5 | 0.831 (0.02) | 0.571 (0.05) | 0.842 (0.02) |
| 0.5 | 0.5 | 4 | 4 | 0.82 (0.02) | 0.541 (0.04) | 0.835 (0.02) |

AUC - Outcome Prediction (K = 3; ρ and β are shown as odds ratios; entries are AUC (SD)):

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | LV Framework | Conventional Regression |
|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 0.554 (0.02) | 0.548 (0.03) |
| 0.5 | 0.25 | 1.5 | 4 | 0.677 (0.02) | 0.672 (0.02) |
| 0.5 | 0.25 | 4 | 1.5 | 0.564 (0.03) | 0.555 (0.03) |
| 0.5 | 0.25 | 4 | 4 | 0.698 (0.02) | 0.679 (0.02) |
| 0.5 | 0.5 | 1.5 | 1.5 | 0.541 (0.03) | 0.528 (0.03) |
| 0.5 | 0.5 | 1.5 | 4 | 0.631 (0.02) | 0.619 (0.02) |
| 0.5 | 0.5 | 4 | 1.5 | 0.54 (0.03) | 0.532 (0.03) |
| 0.5 | 0.5 | 4 | 4 | 0.65 (0.02) | 0.634 (0.02) |

However, an interesting trend becomes apparent for the prediction of the second latent cluster; the AUC is up to 8 percentage points lower when the simulated effect between latent clusters and the outcome is stronger (β=log(4)). This can be attributed to a stronger simulated latent cluster-outcome association increasing the effect of the third latent cluster. Thus, when the framework estimates latent clusters concurrently with their effect on the outcome, this increased difference in effect estimates between the third and first latent clusters biases observations in the second latent cluster towards these clusters.

Outcome prediction AUCs are higher when simulated effects are stronger, with the highest AUCs (LV Framework = 0.70, Conventional Regression = 0.68) obtained under the strongest simulated condition (σ=0.25, ρ=log(4), β=log(4)). Similar to K=2, stronger simulated effects between latent clusters and the determinant and outcome (ρ=log(4), β=log(4)) increase AUCs by 10-14%.
Of interest is that simulating three latent clusters results in higher AUCs for all simulation conditions compared to two latent clusters, for which the AUC under the strongest conditions only reached 0.63 for the latent variable framework. Estimation of more latent clusters, with effects that increase additively, may be more informative, leading to improved outcome prediction under even moderate simulated conditions.

Table 10c K = 3, Mean NMR for simulated conditions of θ, σ, ρ and β

Table 10d K = 3, Effect estimates for simulated conditions of θ, σ, ρ and β

Parameter estimates for the framework without a simulated effect between G and Y are biased away from their simulated values under all conditions. Cluster means for the first and third clusters (θ1, θ3) shrink toward the overall mean of the measurement variable (Z) of 1.0, with stronger simulated effects attenuating this bias. Consistent with the observed cluster AUCs, where prediction for the second latent cluster was poorer when the simulated outcome-latent cluster effect was stronger (β=log(4)), θ2 is near the simulated value of 1.0 when β=log(1.5), but shrinks towards the first latent cluster when β=log(4).
Estimated cluster means (K = 3; simulated parameters: θ1 = 0.5, θ2 = 1.0, θ3 = 1.5; ρ and β are shown as odds ratios):

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | θ1 | θ2 | θ3 |
|---|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 0.67 (0.08) | 0.91 (0.14) | 1.29 (0.07) |
| 0.5 | 0.25 | 1.5 | 4 | 0.65 (0.09) | 0.81 (0.11) | 1.29 (0.04) |
| 0.5 | 0.25 | 4 | 1.5 | 0.63 (0.08) | 0.99 (0.13) | 1.37 (0.06) |
| 0.5 | 0.25 | 4 | 4 | 0.57 (0.08) | 0.88 (0.11) | 1.37 (0.06) |
| 0.5 | 0.5 | 1.5 | 1.5 | 0.61 (0.16) | 0.92 (0.3) | 1.35 (0.53) |
| 0.5 | 0.5 | 1.5 | 4 | 0.55 (0.12) | 0.78 (0.14) | 1.33 (0.08) |
| 0.5 | 0.5 | 4 | 1.5 | 0.65 (0.12) | 1.01 (0.2) | 1.68 (0.96) |
| 0.5 | 0.5 | 4 | 4 | 0.5 (0.13) | 0.81 (0.16) | 1.39 (0.26) |

Estimated effects (K = 3; odds ratios with 95% CI):

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | β2 | β3 |
|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 0.86 (0.31-2.43) | 1.62 (0.86-3.06) |
| 0.5 | 0.25 | 1.5 | 4 | 0.54 (0.12-2.48) | 5.9 (2.07-16.85) |
| 0.5 | 0.25 | 4 | 1.5 | 1.06 (0.44-2.58) | 1.56 (0.88-2.77) |
| 0.5 | 0.25 | 4 | 4 | 0.98 (0.27-3.56) | 6.73 (2.66-16.99) |
| 0.5 | 0.5 | 1.5 | 1.5 | 0.86 (0.3-2.45) | 1.35 (0.65-2.83) |
| 0.5 | 0.5 | 1.5 | 4 | 0.56 (0.14-2.28) | 3.96 (1.39-11.32) |
| 0.5 | 0.5 | 4 | 1.5 | 0.98 (0.38-2.55) | 1.4 (0.7-2.83) |
| 0.5 | 0.5 | 4 | 4 | 0.65 (0.16-2.71) | 4.06 (1.49-11.12) |

Effect estimates for the latent cluster on the outcome explain the overall increase in outcome prediction. We simulated an additive effect of the latent clusters. Although the estimated effect of the second latent cluster is null, the third latent cluster approaches simulated effects of log(2*4) under the strongest conditions (log(6.7)). Even when the third latent cluster has an estimated effect less than the simulated additive effect of log(2*1.5) or log(2*4), its effect is still stronger than estimated effects under equivalent conditions in K=2, resulting in improved overall AUCs.

Table 11a K = 3, AUCs, outcome prediction, for simulated conditions of θ, σ, ρ, β, and γ

Similar to K=2, incorporating an independent gene effect on the outcome, γ, leads to an additional increase in AUC for outcome prediction. Consistent with results for K=3 without this independent gene effect, AUCs increase across all simulated conditions, compared to K=2; the AUC under the strongest conditions (σ=0.25, ρ=log(4), β=log(4), γ=log(4)) is 0.75 for our framework compared to 0.71 for the conventional regression. Also, when the
simulated gene effect is strong (γ=log(4)), our framework has AUCs that are markedly larger than those of conventional regressions (5-8%), even after accounting for the standard deviation in these AUC estimates.

AUC - Outcome Prediction (K = 3; simulated parameters: θ1 = 0.5, θ2 = 1.0, θ3 = 1.5; ρ, β and γ are shown as odds ratios; entries are AUC (SD)):

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | γ, Y-G | LV Framework | Conventional Regression |
|---|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 1.5 | 0.594 (0.03) | 0.565 (0.04) |
| 0.5 | 0.25 | 1.5 | 1.5 | 4 | 0.629 (0.02) | 0.553 (0.03) |
| 0.5 | 0.25 | 1.5 | 4 | 1.5 | 0.69 (0.02) | 0.686 (0.01) |
| 0.5 | 0.25 | 1.5 | 4 | 4 | 0.719 (0.02) | 0.669 (0.01) |
| 0.5 | 0.25 | 4 | 1.5 | 1.5 | 0.591 (0.02) | 0.57 (0.03) |
| 0.5 | 0.25 | 4 | 1.5 | 4 | 0.643 (0.02) | 0.594 (0.02) |
| 0.5 | 0.25 | 4 | 4 | 1.5 | 0.705 (0.02) | 0.683 (0.03) |
| 0.5 | 0.25 | 4 | 4 | 4 | 0.753 (0.03) | 0.707 (0.02) |
| 0.5 | 0.5 | 1.5 | 1.5 | 1.5 | 0.562 (0.03) | 0.536 (0.03) |
| 0.5 | 0.5 | 1.5 | 1.5 | 4 | 0.619 (0.02) | 0.531 (0.02) |
| 0.5 | 0.5 | 1.5 | 4 | 1.5 | 0.632 (0.02) | 0.618 (0.02) |
| 0.5 | 0.5 | 1.5 | 4 | 4 | 0.669 (0.04) | 0.618 (0.03) |
| 0.5 | 0.5 | 4 | 1.5 | 1.5 | 0.584 (0.02) | 0.547 (0.03) |
| 0.5 | 0.5 | 4 | 1.5 | 4 | 0.636 (0.02) | 0.559 (0.03) |
| 0.5 | 0.5 | 4 | 4 | 1.5 | 0.668 (0.02) | 0.635 (0.02) |
| 0.5 | 0.5 | 4 | 4 | 4 | 0.708 (0.03) | 0.639 (0.02) |

Table 11b K = 3, Mean NMR for simulated conditions of θ, σ, ρ, β, and γ

Simulated parameters: θ1 = 0.5, θ2 = 1.0, θ3 = 1.5.

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | γ, Y-G | θ1 | θ2 | θ3 |
|---|---|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 1.5 | 0.66 (0.08) | 0.9 (0.15) | 1.35 (0.34) |
| 0.5 | 0.25 | 1.5 | 1.5 | 4 | 0.64 (0.09) | 0.91 (0.27) | 1.41 (0.46) |
| 0.5 | 0.25 | 1.5 | 4 | 1.5 | 0.6 (0.09) | 0.81 (0.09) | 1.3 (0.05) |
| 0.5 | 0.25 | 1.5 | 4 | 4 | 0.62 (0.09) | 0.81 (0.09) | 1.27 (0.04) |
| 0.5 | 0.25 | 4 | 1.5 | 1.5 | 0.63 (0.07) | 0.98 (0.13) | 1.38 (0.1) |
| 0.5 | 0.25 | 4 | 1.5 | 4 | 0.61 (0.08) | 0.95 (0.14) | 1.38 (0.18) |
| 0.5 | 0.25 | 4 | 4 | 1.5 | 0.6 (0.1) | 0.86 (0.11) | 1.34 (0.05) |
| 0.5 | 0.25 | 4 | 4 | 4 | 0.59 (0.09) | 0.86 (0.12) | 1.34 (0.05) |
| 0.5 | 0.5 | 1.5 | 1.5 | 1.5 | 0.62 (0.19) | 0.98 (0.46) | 1.46 (0.9) |
| 0.5 | 0.5 | 1.5 | 1.5 | 4 | 0.63 (0.18) | 0.99 (0.52) | 1.62 (1.27) |
| 0.5 | 0.5 | 1.5 | 4 | 1.5 | 0.51 (0.14) | 0.74 (0.13) | 1.29 (0.07) |
| 0.5 | 0.5 | 1.5 | 4 | 4 | 0.56 (0.14) | 0.81 (0.14) | 1.35 (0.18) |
| 0.5 | 0.5 | 4 | 1.5 | 1.5 | 0.64 (0.14) | 0.99 (0.19) | 1.48 (0.53) |
| 0.5 | 0.5 | 4 | 1.5 | 4 | 0.62 (0.13) | 0.95 (0.19) | 1.4 (0.23) |
| 0.5 | 0.5 | 4 | 4 | 1.5 | 0.54 (0.15) | 0.82 (0.16) | 1.37 (0.1) |
| 0.5 | 0.5 | 4 | 4 | 4 | 0.51 (0.15) | 0.79 (0.14) | 1.39 (0.19) |

Table 11c K = 3, Effect estimates for simulated conditions of θ, σ, ρ, β, and γ

Entries are odds ratios with 95% CI.

| λ | σ, Cluster SD | ρ, LC-G | β, Y-LC | γ, Y-G | β2 | β3 | γ |
|---|---|---|---|---|---|---|---|
| 0.5 | 0.25 | 1.5 | 1.5 | 1.5 | 0.85 (0.3-2.39) | 1.53 (0.77-3.04) | 1.4 (0.85-2.29) |
| 0.5 | 0.25 | 1.5 | 1.5 | 4 | 0.69 (0.23-2.07) | 1.52 (0.73-3.17) | 3.06 (1.71-5.46) |
| 0.5 | 0.25 | 1.5 | 4 | 1.5 | 0.52 (0.13-2.08) | 5.44 (1.88-15.76) | 1.15 (0.68-1.93) |
| 0.5 | 0.25 | 1.5 | 4 | 4 | 0.37 (0.09-1.6) | 4.82 (1.53-15.2) | 2.51 (1.12-5.61) |
| 0.5 | 0.25 | 4 | 1.5 | 1.5 | 0.99 (0.41-2.38) | 1.72 (0.85-3.48) | 1.36 (0.8-2.34) |
| 0.5 | 0.25 | 4 | 1.5 | 4 | 0.81 (0.27-2.4) | 1.89 (0.82-4.36) | 2.79 (1.41-5.51) |
| 0.5 | 0.25 | 4 | 4 | 1.5 | 0.65 (0.17-2.47) | 5.69 (2.07-15.64) | 1.17 (0.67-2.04) |
| 0.5 | 0.25 | 4 | 4 | 4 | 0.57 (0.13-2.64) | 5.84 (1.8-18.9) | 2.57 (1.13-5.87) |
| 0.5 | 0.5 | 1.5 | 1.5 | 1.5 | 0.87 (0.3-2.55) | 1.24 (0.59-2.6) | 1.34 (0.83-2.16) |
| 0.5 | 0.5 | 1.5 | 1.5 | 4 | 0.8 (0.24-2.66) | 1.24 (0.51-2.98) | 3.25 (1.72-6.15) |
| 0.5 | 0.5 | 1.5 | 4 | 1.5 | 0.45 (0.11-1.9) | 3.54 (1.23-10.22) | 1.14 (0.66-1.95) |
| 0.5 | 0.5 | 1.5 | 4 | 4 | 0.45 (0.1-1.97) | 3.16 (1.01-9.91) | 2.46 (1.11-5.44) |
| 0.5 | 0.5 | 4 | 1.5 | 1.5 | 0.92 (0.33-2.61) | 1.51 (0.66-3.46) | 1.35 (0.76-2.39) |
| 0.5 | 0.5 | 4 | 1.5 | 4 | 0.83 (0.26-2.7) | 1.65 (0.63-4.36) | 3.13 (1.4-6.96) |
| 0.5 | 0.5 | 4 | 4 | 1.5 | 0.58 (0.14-2.43) | 3.88 (1.3-11.6) | 1.32 (0.69-2.52) |
| 0.5 | 0.5 | 4 | 4 | 4 | 0.47 (0.11-2.09) | 3.99 (1.22-13.04) | 2.8 (1.13-6.91) |

Cluster means after incorporating the independent gene effect on the outcome are consistent with cluster means without this effect, with cluster means being shrunk towards the second latent cluster. Consistent with results from K=2, the estimated gene effect on the outcome (γ) is independent of parameters pertaining to the latent clusters. As previously described, the estimated effect of the third latent cluster, though attenuated from the simulated additive effect (log(2*β)), is still stronger than estimated effects when K=2. This gene effect, together with the higher estimated effects from the third latent cluster, results in stronger overall associations with the outcome, giving higher overall AUCs across all simulation conditions for K=3.

K = 7

Simulating 7 latent clusters supports the trend in improved prediction observed when K=3. Parameter estimates provide an explanation for this improved prediction. When more latent clusters are simulated, our framework has high predictive ability for latent clusters in the extremes (LC=1 and LC=7, AUCs > 0.94), regardless of the simulated strengths of association.
However, AUCs for latent clusters between these extremes are variable, with at least half of the latent clusters being estimated with some reliability (AUCs > 0.6). Thus, when multiple clusters are present, our framework is able to identify extreme clusters well and does a reasonable job of identifying clusters lying between these extremes, even when the overall distribution for the measurement variable (Z) is unimodal.

Consistent with results observed from K=2 and 3, our framework has better outcome predictive ability than conventional regressions under all simulated conditions. Moreover, outcome prediction improves across all simulated conditions compared with K=2 and 3, with AUCs surpassing 0.80 for our framework under optimal conditions (σ=0.25, ρ=log(4), β=log(4)), and not dropping below 0.60, even under moderate conditions (σ=0.5, ρ=log(1.5), β=log(1.5)).

Estimated cluster means for the first three clusters shrink towards the overall mean of the measurement variable (Z) of 2.0, but the cluster means of higher latent clusters have increased variability, with one exception: a strong simulated association between the latent cluster and outcome (β=log(4)) or decreased variability in simulated clusters (σ=0.25) results in a seventh latent cluster that has decreased variability and is nearer to the simulated mean of 3.5.

Examining the association between latent clusters and the outcome, and in line with observed estimates from K=3, we obtain an explanation for the overall increase in outcome predictive ability. While the effects of the first six latent clusters are null, the effect of the seventh latent cluster is strong. It does not approach the simulated effect of log(7*β), but as was observed when K=3, the association is stronger than effect estimates from K = 2 and 3, with odds ratios near 2 and greater than 5 for simulated associations of log(1.5) and log(4), respectively.
Incorporating an independent gene effect for a greater number of latent clusters further improves the outcome prediction of our framework, with an AUC over 0.85 for optimal conditions (σ=0.25, ρ=log(4), β=log(4), γ=log(4)). As with K=3, our framework outperforms conventional regressions, with AUCs over 0.6 for all simulation conditions, and at least 0.7 when either the independent gene effect on the outcome or the effect of the latent clusters on the outcome is strong (log(4)).

Finally, as expected, parameter estimates for the latent clusters after including the independent gene effect are consistent with parameter estimates before including this effect. Likewise, parameter estimates for the independent gene effect are independent of simulated conditions for the latent clusters.

In summary, our latent variable framework performs comparably with conventional regressions when there are fewer latent clusters. However, the advantage of our framework comes when more latent clusters are present. We are not only able to identify multiple latent clusters, but their effects on a given outcome, even when moderate, can be estimated reasonably well within our framework, allowing for improved prediction.

Real Data Results

Latent Cluster Characterization - Full Data

Before assessing the predictive ability of our framework, we use the full dataset to characterize latent cluster allocation. In applying our latent variable framework to the real data, we constrained the number of latent clusters to three (k=3), such that NMR is clustered into slow, normal and fast metabolizers. This leads to the following distributions of NMR values across the three latent clusters (Figure 11), and latent cluster allocation probabilities (Table 12):

Figure 11 Latent cluster distributions across NMR distribution

[Figure 11: panels for slow, normal and fast metabolizers; x-axis NMR (0.0 to 1.5). Annotated cluster means θ1 = 0.28, θ2 = 0.42, θ3 = 0.94; distances between clusters λ1 = 0.14, λ2 = 0.52.]
: 345:7' 116 Table 12 Latent cluster allocation probabilities, overall and by CYP2A6 and NMR The first latent category of slow metabolizers (Pr(LC=Slow)=0.19) consists mostly of those carrying the CYP2A6 variant (Pr(LC=Slow | CYP2A6=Variant) = 0.88); individuals allocated to this latent category have mean NMR levels (θ 1 ) of 0.28, which is equivalent to the first NMR quartile. The second latent category of normal metabolizers (Pr(LC=Normal)=0.72) consists mostly of those wildtype for CYP2A6 (Pr(LC=Normal | CYP2A6=Wildtype) = 0.86); individuals allocated to this latent category have mean NMR levels (θ 2 ) of 0.42, which is roughly equivalent to the median NMR level (0.40). The third latent category of fast metabolizers (Pr(LC=Fast)=0.09) consists mostly of with higher NMR levels, with a higher proportion carrying the CYP2A6 variant; individuals allocated to this latent category have mean NMR levels (θ 3 ) of 0.94, and encompasses the upper 10 th percentile of the NMR distribution (0.73-1.69). !"#$%&'()*+ ,$#"-#+ .$#"/&01+2,.3 402,.3 !"#$%& 402,.5!"#$%&3 6!7+ .$#"/&01 402' ( 8,.5!"#$%&9 6!73 :+;<= ;<;> ?+;<= ;<;@ :+;<= ;<A> ?+;<= ;<BA :+;<= ;<AC ?+;<= ;<DE :+;<= ;<;C ?+;<= ;<E; :+;<= ;<;; ?+;<= ;<E> :+;<= ;<;; ?+;<= ;<CE C ; ;<E; E ;<;> ;<;A @ ; ;<DF E ;<;B ;<>@ E ; ;<;= E ;<DD ;<EA 117 Figure 12 Latent cluster smoking abstinence probabilities, stratified by treatment Smoking abstinence probabilities differ for latent clusters across the placebo and three treatment groups (Figure 7). Across all treatments, the normal metabolizers cluster have the highest abstinence probabilities (≥20%), and, except for the bupropion group, the fast metabolizers cluster have the lowest abstinence probabilities (≤12%). The normal metabolizers cluster consist primarily of those without the CYP2A6 variant (86%) who have NMR levels between the 25 th and 75 th percentile. 
The slow metabolizers cluster consists primarily of CYP2A6 variant carriers (88%), while the fast metabolizers cluster consists primarily of those with high NMR levels. Thus, clustering and subsequent abstinence probabilities are consistent with results from conventional regressions, where the CYP2A6 variant and higher NMR levels are associated with markedly decreased odds of smoking abstinence. Moreover, the fast metabolizers cluster has a higher proportion of CYP2A6 variant carriers (31% variant carriers, 17% wildtype), which is consistent with overall smoking abstinence probabilities that are lower than those of the other two clusters. So, the nonparametric framework clustered individuals into groups with CYP2A6 and NMR profiles that have similar abstinence probabilities.

[Figure 12 values, Pr(Smoking Abstinence) for placebo, bupropion, patch and spray, respectively: slow metabolizers 0.18, 0.19, 0.17, 0.17; normal metabolizers 0.20, 0.26, 0.21, 0.23; fast metabolizers 0.12, 0.21, 0.06, 0.06.]

Similar to the conventional regression, abstinence probabilities for fast metabolizers within the bupropion group are high compared with both fast metabolizers in the other treatment groups and slow metabolizers. This suggests that bupropion is an effective treatment, especially for those who are the least likely to remain abstinent. However, the NRTs (patch and spray) are ineffective for these same individuals who are at the greatest risk of relapse, perhaps exacerbating this risk.

Unlike conventional interaction analyses, where individuals are not stratified beyond observed measurements (e.g., genotype and biomarker levels), our nonparametric framework appears to distinguish between individuals with the same genotype and biomarker levels (e.g., CYP2A6 variant, high NMR levels), clustering them as slow or fast metabolizers who subsequently have different abstinence probabilities.
The latent variable × treatment interaction makes our analysis more informative, showing how abstinence probabilities for these latent clusters differ across treatment, while providing a biological profile based on genotype and biomarker levels that can be harmonized with marginal observed associations.

Prediction Results

Latent Cluster Characterization - N_est

Before examining the predictive ability of our framework, we note that latent cluster characterization remains largely unchanged from results using the entire dataset. Under the different latent variable framework conditions, each respective replicate gave similar NMR distributions by latent cluster allocation probabilities and CYP2A6. When two latent clusters are estimated, those who do not carry the CYP2A6 variant have a very small probability of being allocated to the first latent cluster (Pr(LC=1)), with NMR levels having little impact on these allocation probabilities. Among those carrying the CYP2A6 variant, those with lower NMR levels have a higher Pr(LC=1), while higher NMR levels lead to a higher probability of being allocated to the second latent cluster (Pr(LC=2)).

When three latent clusters are estimated, we observe a similar pattern, where those who do not carry the CYP2A6 variant have a very small Pr(LC=1). Also, NMR levels for these individuals do not have an impact on Pr(LC=1), compared with allocation to the second or third (Pr(LC=3)) latent clusters. Pr(LC=1) is around 0.8 for those carrying the variant and with NMR levels less than the 3rd quartile (0.54); conversely, Pr(LC=2) is less than 0.2 for those with NMR levels less than the 3rd quartile.

Also of importance is the observed overlap between the first and second latent clusters. The Dirichlet stick-breaking process estimates allocation probabilities for non-overlapping latent clusters.
However, by including CYP2A6 in cluster estimation, separate allocation probabilities are estimated for CYP2A6 wildtype and CYP2A6 variant carriers. Those wildtype for CYP2A6 are allocated to the second or third latent cluster, even at low NMR levels, while the allocation of CYP2A6 variant carriers is dependent on NMR levels. Thus, the overlap can be attributed to those who are CYP2A6 wildtype. Cluster means (θ) and the distance between clusters (λ) are approximately equivalent to quartile and median NMR values (Table 5):

NMR 25th Percentile   NMR Median   NMR 75th Percentile
0.28                  0.4          0.54

K   θ1 (SD)       θ2 (SD)       θ3 (SD)
2   0.29 (0.15)   0.47 (0.23)   -
3   0.27 (0.12)   0.41 (0.15)   0.96 (0.19)

Table 13 Latent variable framework cluster means

When two latent clusters are estimated, the first latent cluster mean is estimated around the 25th percentile (θ1 = 0.29 (0.15)) and the second latent cluster mean lies between the median and 75th percentile NMR values (θ2 = 0.47 (0.23)). When three latent clusters are estimated, the first latent cluster is still estimated around the 25th percentile (θ1 = 0.27 (0.12)), the second latent cluster is estimated around the median (θ2 = 0.41 (0.15)), and the third latent cluster is estimated around the 97th percentile (θ3 = 0.96 (0.19)).

AUC

Using the prediction framework, the conventional model without CYP2A6 yields the following AUC for abstinence prediction (Table 14a):

AUC All        AUC (<25th) "Slow"   AUC (25th-75th) "Normal"   AUC (>75th) "Fast"
0.537 (0.05)   0.5 (0.09)           0.5 (0.05)                 0.637 (0.14)

Table 14a Areas under the curve for a joint regression model without CYP2A6

Overall, this joint regression model has an AUC of 0.54 for abstinence prediction, with its predictive power coming from fast nicotine metabolizers (0.64).
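The overall and stratum-specific AUCs reported above can be illustrated with a minimal sketch. It assumes the AUC is computed as the Mann-Whitney probability that a randomly chosen abstainer receives a higher predicted probability than a randomly chosen non-abstainer; the text does not state how the reported AUCs were obtained. The function names and the toy data are hypothetical, and the stratum cut points are the NMR quartile values from Table 13 (0.28 and 0.54).

```python
def auc(scores, labels):
    """AUC as the Mann-Whitney probability that a random abstainer (label 1)
    scores higher than a random non-abstainer (label 0); ties count 1/2.
    Returns None when a group is empty, since the AUC is then undefined."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nmr_stratum(nmr, q25=0.28, q75=0.54):
    """Label an NMR value as slow (<25th), normal (25th-75th), or fast (>75th).
    Default cut points are the quartile values reported in Table 13."""
    if nmr < q25:
        return "slow"
    if nmr > q75:
        return "fast"
    return "normal"


def stratified_auc(scores, labels, nmr_values):
    """AUC for everyone and separately within each NMR stratum."""
    result = {"all": auc(scores, labels)}
    for name in ("slow", "normal", "fast"):
        idx = [i for i, v in enumerate(nmr_values) if nmr_stratum(v) == name]
        result[name] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Hypothetical toy data: predicted abstinence probabilities, observed
# abstinence (1 = abstinent), and NMR levels for six individuals.
toy_scores = [0.1, 0.3, 0.5, 0.5, 0.9, 0.2]
toy_labels = [0, 1, 0, 1, 0, 1]
toy_nmr = [0.2, 0.2, 0.4, 0.4, 0.6, 0.6]
toy_result = stratified_auc(toy_scores, toy_labels, toy_nmr)
```

An AUC of 0.5 within a stratum, as in the "Slow" and "Normal" columns of Table 14a, means the predicted probabilities do not rank abstainers above non-abstainers any better than chance within that stratum.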
The joint model including CYP2A6 yields the following AUC for abstinence prediction (Table 14b):

AUC All       AUC (<25th) "Slow"   AUC (25th-75th) "Normal"   AUC (>75th) "Fast"
0.58 (0.04)   0.58 (0.09)          0.518 (0.06)               0.704 (0.12)

Table 14b Areas under the curve for a joint regression model with CYP2A6

The overall abstinence predictive probability increases from 0.54 to 0.58, with increases in the slow and fast metabolizers (0.5 to 0.58 and 0.64 to 0.70, respectively). These increases in AUC also reflect the independent marginal effect of CYP2A6 on smoking abstinence. For the 20 replicates across each respective framework condition (2 or 3 latent clusters; the disease model with and without CYP2A6), we averaged the area under the curve (AUC), and compared the prediction of our latent variable framework with that of the comparable conventional regression framework (Table 15).

K       Disease Model   AUC All        AUC (<25th)    AUC (>25th)    AUC (>75th)
K = 2   With CYP2A6     0.568 (0.01)   0.528 (0.1)    0.583 (0.04)   -
K = 2   No CYP2A6       0.548 (0.02)   0.537 (0.11)   0.568 (0.05)   -
K = 3   With CYP2A6     0.554 (0.02)   0.555 (0.11)   0.522 (0.08)   0.612 (0.16)
K = 3   No CYP2A6       0.548 (0.04)   0.544 (0.1)    0.501 (0.05)   0.666 (0.16)

Table 15 Average areas under the curve for LV Framework conditions

Overall, for conventional regressions (Tables 2 and 3) and the latent variable framework (Table 6), AUCs remain consistently low and comparable (between 0.55 and 0.60) regardless of the number of estimated latent clusters. We highlight AUCs when three latent clusters are estimated (K=3), and the higher AUCs (≥0.6) in the upper quartile (>75th), compared to AUCs (<0.6) in the lower quartile (<25th). Previous literature has shown that those with lower NMR levels can be categorized as slow nicotine metabolizers, and those with higher levels can be categorized as normal or fast nicotine metabolizers. From univariate associations between CYP2A6 and NMR, the CYP2A6 variant is associated with a decrease in NMR; as such, those carrying the variant should be slow nicotine metabolizers and those without the variant should be normal or fast metabolizers. This delineation is observed in the latent cluster allocation probabilities for those without the variant: Pr(LC=1) (i.e., the probability of being a "slow" metabolizer) is low; Pr(LC=2) (i.e., the probability of being a "normal" metabolizer) is higher between the 25th and 75th percentiles of NMR levels and decreases as NMR increases; Pr(LC=3) (i.e., the probability of being a "fast" metabolizer) increases steadily as NMR levels increase, especially above the 75th percentile. For those without the variant, NMR levels did not affect Pr(LC=1) and led to the expected decreases in Pr(LC=2) and increases in Pr(LC=3) as NMR increases. In contrast, for those carrying the CYP2A6 variant, NMR levels only have an impact on Pr(LC=1) and Pr(LC=3), with Pr(LC=2) remaining low. Thus, our framework is distinguishing CYP2A6 variant carriers with high NMR levels in a way that conventional regressions may not, allocating them as either slow or fast nicotine metabolizers.

Framework Extensions

Multiple Variables

An advantage of this framework is that multiple covariates can be incorporated in the estimation of latent clusters (Figure 9).

Figure 13 Nonparametric framework incorporating CYP2A6 and multiple covariates

We have previously observed gender × gene and FTND × gene interactions, making gender and FTND ideal candidates for incorporation into our framework.
Incorporation of these variables in the framework leads to more refined allocation probabilities by NMR, CYP2A6, gender and FTND. Allocation probabilities for 2 estimated latent clusters change after including gender and FTND. NMR levels impact allocation probabilities for those wildtype for CYP2A6 and those carrying the variant. This is different from the framework without gender and FTND, where those wildtype for CYP2A6 were allocated into the first latent cluster regardless of NMR levels. Moreover, allocation probabilities are entirely a function of NMR levels; regardless of CYP2A6 genotype, gender or FTND level, those with NMR levels less than the third quartile are allocated into the first latent cluster, and those with NMR levels greater than the third quartile are allocated into the second latent cluster. We observe similar patterns when 3 latent clusters are estimated. Regardless of the CYP2A6 genotype, gender and FTND level, those with NMR levels less than the median are allocated into the first latent cluster, those with NMR levels around the median and 75th percentile are allocated into the second latent cluster, and those with NMR levels in the upper quartile are allocated into the third latent cluster. For both two and three estimated latent clusters, allocation probabilities as a function of NMR levels, and not any of the other variables, may be due to the reduced sample size within each stratum, leading to cluster characterization that becomes a function of NMR.
Applying our prediction framework using multiple variables in LC estimation leads to small changes in abstinence probability prediction (Tables 16-17):

K       Disease Model       Covariates in LC Estimation   AUC All        AUC (<25th)    AUC (>25th)    AUC (>75th)
K = 2   No CYP2A6           CYP2A6                        0.548 (0.02)   0.537 (0.11)   0.568 (0.05)   -
K = 2   With CYP2A6         CYP2A6                        0.566 (0.01)   0.525 (0.08)   0.598 (0.05)   -
K = 2   CYP2A6, Sex, FTND   CYP2A6, Sex, FTND             0.557 (0.02)   0.549 (0.05)   0.56 (0.03)    -
K = 3   No CYP2A6           CYP2A6                        0.548 (0.04)   0.544 (0.1)    0.501 (0.05)   0.666 (0.16)
K = 3   With CYP2A6         CYP2A6                        0.565 (0.01)   0.519 (0.08)   0.54 (0.06)    0.707 (0.14)
K = 3   CYP2A6, Sex, FTND   CYP2A6, Sex, FTND             0.571 (0.02)   0.557 (0.07)   0.529 (0.05)   0.669 (0.1)

Table 16 Average areas under the curve, multiple covariates in LC estimation

AUCs for the model including gender and FTND are within a standard deviation of AUCs for the frameworks without them. This modest increase in AUCs is also observed in comparable conventional regressions (Table 17):

Comparison   Disease Model       AUC All        AUC (<25th)    AUC (25th-75th)   AUC (>75th)
K = 2        No CYP2A6           0.56 (0.04)    0.5 (0.09)     0.581 (0.04)      -
K = 2        With CYP2A6         0.566 (0.03)   0.561 (0.09)   0.568 (0.04)      -
K = 2        CYP2A6, Sex, FTND   0.591 (0.03)   0.63 (0.06)    0.574 (0.03)      -
K = 3        No CYP2A6           0.537 (0.05)   0.5 (0.09)     0.5 (0.05)        0.637 (0.14)
K = 3        With CYP2A6         0.58 (0.04)    0.58 (0.09)    0.518 (0.06)      0.704 (0.12)
K = 3        CYP2A6, Sex, FTND   0.596 (0.03)   0.615 (0.07)   0.503 (0.04)      0.686 (0.08)

Table 17 Average areas under the curve, multiple covariates, conventional regression

For the full conventional joint model including gender and FTND, AUCs again are within a standard deviation of the other models. From simulations, we observed increases in AUCs after including a variable with a strong association with the outcome. Gender and FTND only had modest associations with smoking abstinence (Table 1), possibly explaining the relatively unchanged AUCs after including them in the framework.

Multiple Latent Clusters

For the framework where only CYP2A6 is used in estimation of latent clusters but not included in the disease model, we also explored how many latent clusters would be estimated if we did not constrain the number to 2 or 3, along with subsequent AUCs. Looking first at allocation probabilities and NMR values for the estimated latent clusters, we see that they are similar to a framework constrained to three latent clusters (Table 18):

                LC=1         LC=2          LC=3         LC=4         LC=5          LC=6
Pr(LC)          0.176        0.621         0.139        0.055        0.006         0.003
NMR Mean (SD)   0.25 (0.1)   0.38 (0.11)   0.7 (0.07)   0.98 (0.1)   1.34 (0.08)   1.65 (0.17)
NMR Quantile    20%          45%           87%          97%          99.60%        99.90%

Table 18 Latent cluster allocation probabilities and NMR values for an unconstrained number of latent clusters

Leaving the number of clusters unconstrained results in allocation into 6 latent clusters. However, individuals are largely allocated into one of the first three latent clusters, with only 6% allocated into the last three. Allocation probabilities and NMR means for the first three latent clusters are similar to those observed for a framework constrained to three latent clusters (Tables 4 and 5, respectively). The impact of estimating more latent clusters on allocation probabilities is observed in the second and third latent clusters, with Pr(LC=2) reduced from 0.72 to 0.61, Pr(LC=3) increased from 0.09 to 0.14, and the remaining 6% of individuals being allocated into the last three latent clusters. The impact on cluster means is observed in the third latent cluster, with the NMR mean for this latent cluster decreasing from 0.96 to 0.7, and the last three latent clusters having cluster means above the 95th NMR percentile.

AUCs under this framework, where the number of latent clusters is unconstrained, are similar to those of the constrained framework (Table 19):

                             AUC All        AUC (<25th)    AUC (25th-75th)   AUC (>75th)
Conventional Regression      0.537 (0.05)   0.5 (0.09)     0.5 (0.05)        0.637 (0.14)
Constrained LV Framework     0.548 (0.04)   0.544 (0.1)    0.501 (0.05)      0.666 (0.16)
Unconstrained LV Framework   0.549 (0.04)   0.482 (0.12)   0.555 (0.06)      0.603 (0.14)

Table 19 AUC comparison of the unconstrained LV framework to the constrained LV framework and conventional regression

The overall AUC for the unconstrained framework is similar to that of the constrained framework (0.55). However, the AUC within the second and third NMR quartiles (AUC (25th-75th)) increases from 0.5 to 0.55. Perhaps the reallocation of individuals in the second latent cluster of the constrained framework to the third latent cluster of the unconstrained framework led to latent cluster characterization with more accurate smoking abstinence prediction.

Conclusions

We propose a nonparametric latent variable framework utilizing the Dirichlet process prior to cluster individuals into risk categories. The conventional approach of categorizing individuals as slow, normal, and fast nicotine metabolizers according to their NMR levels motivated this nonparametric approach. In order to evaluate this approach, we utilized a prediction framework to obtain the area under the curve for the probability of predicting smoking abstinence. Simulations showed that our latent variable framework reliably predicted latent clusters, even when the standard deviation around these clusters was moderate (AUC > 0.7). Simulations also showed that the latent variable framework outperformed conventional regressions under comparable simulation conditions.
However, the prediction probabilities were moderate, with AUCs not exceeding 0.7. In the real data, the latent variable framework performed similarly to conventional regressions with regard to smoking abstinence prediction. However, our framework offers the potential for dimension reduction and flexibility in the number of estimated latent clusters, which can be advantageous compared to conventional regressions.

Chapter 3 References

1. Thomas DC. Multistage sampling for latent variable models. Lifetime Data Anal 2007; 13(4): 565-581.
2. Richardson S, Gilks WR. Conditional independence models for epidemiological studies with covariate measurement error. Stat Med 1993; 12(18): 1703-1722.
3. Muller P, Roeder K. A Bayesian Semiparametric Model for Case-Control Studies with Errors in Variables. Biometrika 1997; 84(3): 523-537.
4. Yang M, Dunson DB. Bayesian Semiparametric Structural Equation Models with Latent Variables. Psychometrika 2010; 75(4): 675-693.
5. Yan G, Sedransk J. The Effect of Sample Composition on Inference for Random Effects Using Normal and Dirichlet Process Models. Journal of Data Science 2010; 8: 579-595.
6. Ohlssen DI, Sharples LD, Spiegelhalter DJ. Flexible random-effects models using Bayesian semi-parametric models: applications to institutional comparisons. Stat Med 2007; 26(9): 2088-2112.
7. Molitor J, Papathomas M, Jerrett M, Richardson S. Bayesian profile regression with an application to the National Survey of Children's Health. Biostatistics 2010; 11(3): 484-498.
8. Conti DV, Lee W, Li D, Liu J, Van Den Berg D, Thomas PD, et al. Nicotinic acetylcholine receptor beta2 subunit gene implicated in a systems-based candidate gene study of smoking cessation. Hum Mol Genet 2008; 17(18): 2834-2848.
9. Lerman C, Jepson C, Wileyto EP, Epstein LH, Rukstalis M, Patterson F, et al. Role of functional genetic variation in the dopamine D2 receptor (DRD2) in response to bupropion and nicotine replacement therapy for tobacco dependence: results of two randomized clinical trials. Neuropsychopharmacology 2006; 31(1): 231-242.
10. Dempsey D, Tutka P, Jacob P, 3rd, Allen F, Schoedel K, Tyndale RF, et al. Nicotine metabolite ratio as an index of cytochrome P450 2A6 metabolic activity. Clin Pharmacol Ther 2004; 76(1): 64-72.
11. Lerman C, Tyndale R, Patterson F, Wileyto EP, Shields PG, Pinto A, et al. Nicotine metabolite ratio predicts efficacy of transdermal nicotine for smoking cessation. Clin Pharmacol Ther 2006; 79(6): 600-608.
12. Malaiyandi V, Goodz SD, Sellers EM, Tyndale RF. CYP2A6 genotype, phenotype, and the use of nicotine metabolites as biomarkers during ad libitum smoking. Cancer Epidemiol Biomarkers Prev 2006; 15(10): 1812-1819.
13. Schnoll RA, Patterson F, Wileyto EP, Tyndale RF, Benowitz N, Lerman C. Nicotine metabolic rate predicts successful smoking cessation with transdermal nicotine: a validation study. Pharmacol Biochem Behav 2009; 92(1): 6-11.
14. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127.
15. Ferguson TS. A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1973; 1(2): 209-230.
16. Vickers AJ, Cronin AM, Begg CB. One statistical test is sufficient for assessing new predictive markers. BMC Med Res Methodol 2011; 11: 13.
17. Plummer M (2012). JAGS - Just Another Gibbs Sampler. Plummer M (ed).
18. Mason SJ, Graham NE. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quarterly Journal of the Royal Meteorological Society 2002; 128: 2145-2166.
CONCLUSION

Tobacco-related morbidity and mortality remain among the costliest worldwide public health issues and the most preventable [1-3]. Nicotine dependence is the primary factor in the persistence of this problem, leading to ongoing efforts to characterize nicotine dependence. For instance, differences in nicotine dependence between females and males have been well characterized, with females tending to be more influenced by external cues and male smoking behavior more reinforced by nicotine dose [4-8]. Itemized scales assessing nicotine dependence, such as the Fagerstrom Test for Nicotine Dependence [9], have also been developed and validated to characterize whether individuals have low or high nicotine dependence.

Ongoing efforts have also led to the development of treatments to deal with this addictive drug. Nicotine replacement therapies, such as transdermal and nasal spray treatments, and other drugs, such as bupropion, targeted at neurotransmitters affected by nicotine, have had varying degrees of effectiveness for smoking cessation [10-12]. Transdermal therapies, which act through a gradual release of nicotine, have been shown to be efficacious for those with low nicotine dependence [11, 12]. The nasal spray therapy, which involves an acute and rapid increase in nicotine levels, has been more effective for those with moderate or high nicotine dependence [11, 12].

Studies investigating genetic variants involved in the dopamine reward pathway, nicotine metabolism, and related mechanisms and pathways have shown that specific genetic profiles have different levels of nicotine dependence and respond differentially to treatments for nicotine dependence [10-14]. Differential treatment effects for bupropion have been observed for variants within genes involved in the dopamine reward pathway on smoking cessation, with bupropion inhibiting the reuptake of dopamine [10]. Bupropion has also been shown to be an antagonist for the nicotinic acetylcholine receptors, which bind nicotine [10, 12]. These two actions in concert highlight the interplay between treatments and genes.

The metabolism of nicotine has also been of particular interest. Specifically, CYP2A6 has been shown to be a primary mechanism in the metabolism of nicotine to cotinine and then 3-hydroxycotinine (3HC) [15]. The nicotine metabolite ratio (NMR), the ratio of 3HC to cotinine, has subsequently been shown to be a reliable proxy for nicotine metabolism, with higher levels (>25th percentile) reflecting normal or fast metabolism and lower levels (<25th percentile) reflecting slow nicotine metabolism [16]. Moreover, those carrying the CYP2A6 variant have been shown to have lower NMR levels and to be slow metabolizers, and those wildtype for CYP2A6 have been shown to be normal or fast metabolizers [17, 18].

This intricate web of dependence, genetic variation, cessation treatments, and nicotine metabolism has led us to develop a latent variable framework that attempts to capture an underlying process that characterizes a nicotine profile. Latent variable models have been used in the social sciences as a means of estimating constructs that synthesize indices of behavior [19]. For example, indices of duration, frequency, and quantity of smoking can be synthesized into a construct of "nicotine dependence" that either quantitatively or qualitatively assesses the degree of addiction. One unmeasured variable that aims to capture the effects of multiple measured variables reduces the degrees of freedom that need to be accounted for in a regression framework. A potentially more attractive property of latent variables is that they may capture unmeasured effects that imperfectly measured metrics miss.
Indices for smoking may not capture the extent or degree of dependence on their own, but a latent variable, in describing relationships between measured variables, may ascertain otherwise unobservable effects. The potential problem then may be the interpretation of a latent variable. A hypothetical construct of "nicotine dependence" or "biological pathway X" may or may not be reliable in capturing desired effects. A hierarchical framework to model a latent variable may elucidate that underlying relationship between observed measurements. We approach the estimation of an underlying relationship as a measurement error problem, in which there is an imperfectly measured surrogate, Z, for that true underlying exposure or process, X [20, 21]. Different approaches to estimation of the latent variable, X, have been explored [19, 22, 23]. Parametric approaches assume that X ~ N(µ, σ²), X ~ Bin(π, n), or that X otherwise comes from a single distribution. However, assuming a single distribution may be restrictive: it cannot identify individuals that share similar characteristics for W and Z and, subsequently, does not distinguish between clusters of individuals that are different from other clusters.

We have modified this framework by incorporating a Dirichlet process [24, 25] that does not constrain this underlying process to a single distribution, but can flexibly cluster individuals into profiles based on observed biomarker levels, genetic variation and other biological and psychological characteristics. This non- or semiparametric model, made up of a mixture of parametric distributions, is able to allocate observed measurements into k clusters. It is non- or semiparametric in the sense that cluster allocation follows a discrete distribution, while each cluster has a parametric (typically normal) distribution.
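The mixture structure described above (a discrete distribution over clusters, with a normal distribution within each cluster) can be sketched generatively. This is a minimal illustration, assuming a truncated stick-breaking construction of the cluster weights; it is not the fitted model from this work, which was estimated by Gibbs sampling. The function names, the concentration value, and the seed are hypothetical, and the cluster means and standard deviations are borrowed from the K = 3 row of Table 13 purely as example values.

```python
import random


def stick_breaking_weights(alpha, k, rng):
    """Truncated stick-breaking weights for k clusters.
    Each break v_j ~ Beta(1, alpha) takes a fraction of the remaining stick;
    the last cluster receives whatever is left so the weights sum to 1."""
    weights = []
    remaining = 1.0
    for _ in range(k - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def draw_latent_values(weights, means, sds, n, rng):
    """For each of n individuals, draw a cluster from the discrete
    distribution defined by weights, then a value from that cluster's normal."""
    draws = []
    for _ in range(n):
        cluster = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append((cluster, rng.gauss(means[cluster], sds[cluster])))
    return draws


rng = random.Random(2012)  # fixed seed, chosen arbitrarily for repeatability
w = stick_breaking_weights(alpha=1.0, k=3, rng=rng)
sample = draw_latent_values(
    w, means=[0.27, 0.41, 0.96], sds=[0.12, 0.15, 0.19], n=5, rng=rng
)
```

Smaller values of the concentration parameter put most of the weight on the first few clusters, while larger values spread it across more clusters, which is why leaving the number of clusters unconstrained can still yield only a handful of well-populated clusters.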
This provides flexibility in the estimation of some unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards respective cluster effects rather than a grand mean. In doing so, groups can be distinguished from one another, rather than constraining all observations to a single distribution. The association between this latent variable and a smoking related outcome can be estimated, with the eventual purpose of utilizing these risk profiles to predict the outcome of interest.

In order to assess the performance of our framework, we conducted simulations based on NMR levels and CYP2A6 genotypes to estimate latent clusters, and determined how well these latent clusters predicted an outcome based on smoking cessation. The nonparametric latent variable framework outperformed conventional regressions. As expected, stronger simulated conditions (the association between the variables used in estimating the latent variable and the association between the latent variable and outcome) result in improved prediction of latent clusters and the outcome, with AUCs for latent cluster prediction greater than 90%, and AUCs for the outcome exceeding 80% under the best conditions. Under simulated conditions that most closely reflect associations in the real data, our latent variable framework has a predictive probability around 60%.

Of interest are scenarios of misspecification of the latent variable. When fewer latent clusters are simulated, but we do not constrain the number of estimated latent clusters, our framework estimates the number of simulated latent clusters fairly reliably. Subsequent AUCs are similar to scenarios when we constrain the number of estimated latent clusters to the number of simulated latent clusters.
However, when we constrain the number of estimated latent clusters to be fewer than the number of simulated clusters, the estimated latent variable is grouped into clusters with similar effects, capturing the effects of the simulated clusters, and leading to AUCs comparable to those when many clusters are simulated and estimated. Thus, our framework is able to overcome misspecification in the estimation of the latent variable.

Applying our framework to the real data and incorporating treatment, latent clusters were similar to the slow, normal and fast nicotine metabolism categories based on NMR levels, regardless of whether we constrained the number of estimated latent clusters to 2 or 3 or allowed the framework to determine the number of latent clusters. This is no surprise given that our latent variable was estimated using NMR. What was of interest was that, after incorporating CYP2A6, NMR levels had an impact on clustering among those carrying the CYP2A6 variant, but not among those wildtype for CYP2A6, suggesting that CYP2A6 and NMR play complementary, but not completely dependent, roles in nicotine metabolism. Moreover, while our framework did not outperform conventional regressions in predicting smoking cessation, it was comparable, with our framework having the advantage of more refined characterization of individuals and dimension reduction when including more variables.

Further Applications

From these issues arise potential extensions to our framework that can be explored through further research:

- More latent clusters and variables, both observed and latent.
- A model selection process.
- A time-varying component in the disease arm and in estimation of the latent variable.

Multiple Latent Clusters and Variables

The performance of our latent variable framework in application to the real data points to a less tenable issue. There may be underlying factors in the data that cannot be captured without more data, or an expanded approach to the framework.
This extension can be carried out through one of two means:

(1) We can incorporate more simulated or observed variables, and subsequently estimate more latent clusters. Simulations showed that more latent clusters, even with modest effects, had markedly improved predictive abilities. We explored this briefly in the real data by not constraining the number of estimated latent clusters, but we only applied this when CYP2A6 and NMR were the only variables used in latent variable estimation:

\[ X_i \sim \mathrm{Categorical}(P_{CYP2A6,k}) \]
\[ P_{CYP2A6,k} \sim \mathrm{DP}(\alpha P_{CYP2A6,k=1}) \]
\[ \alpha \sim \mathrm{Uniform}(a, b) \]

In using more variables to estimate latent clusters, we constrained the number of estimated latent clusters. The natural progression is to use more variables in latent cluster estimation while leaving the number of latent clusters unconstrained.

(2) We can estimate multiple latent variables, using variables with a priori relationships in the estimation of respective latent variables. For example, we have used CYP2A6 and NMR in the estimation of a latent variable because they are intricately tied in the same pathway. We can do the same for pharmacodynamic measurements such as FTND and withdrawal symptoms, creating a latent variable capturing the physiological result of nicotine addiction. This may capture an additional aspect of nicotine dependence that can be used to improve prediction of smoking abstinence.

Model Selection

A model selection component can also be incorporated into our framework.
Given multiple measures of affect, genetic data, and study characteristics (e.g., gender, race) that can serve as "reflective" proxies of the latent variable (i.e., Z within the "measurement" model) and "formative" exposures that can have direct effects on the latent variable (i.e., W within the "process" model), there will be a priori relationships that should be included in the model (e.g., gene "A" is known to be involved in the biochemical process of addiction and ought to be modeled as a "formative" exposure). There will also be an exploratory nature to the analyses, where we determine whether a variable should be modeled, and whether it should be incorporated into the "measurement" or "process" model. We will develop model selection procedures using different criteria (e.g., Bayes factors, BIC) to determine the optimal combinations of phenotypes and genotypes to be modeled in the frameworks.

Time-varying Component

In our framework, we have only applied a generalized linear model in the disease arm to identify associations with smoking abstinence. Given data with a time component, this arm could also be expanded to incorporate a Cox proportional hazard model. In comparison to conventional analyses, we could construct a latent variable framework that leverages the phenotypic, genetic, treatment, and abstinence data across multiple timepoints to estimate some underlying process that captures the unmeasured relationships between these variables, and subsequently serves as a robust predictor of nicotine dependence and smoking abstinence.
In brief, the fundamental framework follows:

- "Process" Model: X ~ αW
- "Measurement" Model: Z ~ γX
- "Disease" Model: Y ~ βX

Within this general framework, the latent variable X describes an underlying exposure or process that captures a "true" state of nature (e.g., nicotine dependence or a biological pathway), W is a quantified "formative" exposure that has a direct effect on the latent variable (e.g., cigarettes per day or genes within a pathway), Z is a measured "reflective" indicator that may be a proxy for the latent variable (e.g., withdrawal symptoms or biomarker measurements), and Y is the outcome of interest (e.g., smoking abstinence). This model implies that, conditional on X, the other three variables are independent of one another (i.e., W, Z, and Y are associated with one another through X).

A conventional Cox proportional hazard model specifies some time-to-event variable, T_0, which can be time to smoking relapse or abstinence. There is also a time to censoring variable, I, where an individual leaves the study due to multiple reasons (e.g., death, loss to follow-up). We can define the survival variable, T, and some censoring indicator, δ, as such:

\[ T = \min\{T_0, I\}, \qquad \delta = \begin{cases} 1 & \text{if } T_0 > I \\ 0 & \text{if } T_0 \le I \end{cases} \]

At a given time, t, for those who have not been censored, the log form of the proportional hazards model follows the basic form:

\[ \log(h(t)) = \log(h_0(t)) + \beta W \]

where the hazard function, h(t), is conditional on the baseline hazard function, h_0(t), and some set of covariates, W. We present latent variable frameworks that incorporate estimation using this time-varying component.

We introduce time-varying components into the latent variable framework by incorporating estimation of the hazard function.
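The survival variable, censoring indicator, and log hazard defined above can be made concrete with a short sketch. It follows the coding given in the text, in which δ = 1 marks a censored individual (T_0 > I); note that many survival analysis tools use the opposite coding, with 1 marking an observed event, so the indicator may need to be flipped before fitting. The function names and the example values (times, coefficient, baseline log hazard) are hypothetical, and a single covariate W is assumed for simplicity.

```python
import math


def observed_time(t0, i):
    """Return (T, delta) where T = min(T0, I) and delta = 1 if T0 > I
    (the individual is censored before the event), else 0."""
    return min(t0, i), 1 if t0 > i else 0


def log_hazard(log_h0, beta, w):
    """log h(t) = log h0(t) + beta * W for a single covariate W,
    evaluated at a time t where the baseline log hazard is log_h0."""
    return log_h0 + beta * w


def hazard_ratio(beta, w_a, w_b):
    """Ratio of hazards for covariate values w_a versus w_b.
    The baseline hazard cancels, which is the proportional hazards property."""
    return math.exp(beta * (w_a - w_b))


# Hypothetical example: relapse at week 30 with follow-up ending at week 52,
# and relapse that would occur at week 60 but is censored at week 52.
relapsed = observed_time(30, 52)
censored = observed_time(60, 52)
```

Because the baseline hazard cancels in the ratio, the coefficient β can be interpreted without specifying h_0(t), which is what allows the latent variable to enter the model as an ordinary covariate.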
The latent variable (X), the "formative" exposures (W), the "reflective" proxy for the latent variable (Z), and the hyperparameters used in their respective prior distributions now vary across each time point, t, with each model (Process and Measurement) being estimated for each time point.

\[ Y_t = Y_t^{*} + \varepsilon_t \qquad (5) \]
\[ Y_t^{*} = \varphi_0 + \varphi_1 t \qquad (6) \]
\[ \log(h(t \mid Y_t^{*}, X_t)) = \log(h_0(t)) + \beta_1 Y_t^{*} + \beta_2 X_t \qquad (7) \]

Given time-varying outcomes of interest, Y_t (e.g., smoking relapse at time t), we assume they follow a linear growth model with residuals, ε_t, (5) and Y_t^* a function of time, t, (6). Given these time-dependent variables, we can estimate the hazard function in the "Disease" model (7), where the hazard function is conditional on the baseline hazard function, h_0(t), and the time-dependent latent variable, X_t. Moreover, the hazard function is conditional on Y_t^*, which describes the trend in the outcome of interest at time t.

This time-varying component can also be incorporated into estimation of the latent variable. The latent variable (X), the "formative" exposures (W), the "reflective" proxy for the latent variable (Z), the hyperparameters specific to this framework (θ_{X_i}, λ_k, ξ_{W,k}), and the allocation probabilities for each latent cluster (P_{W,k}) can vary across each time point, t, and these parameters are estimated for each time point.

Conclusion References

1. State-specific prevalence of current cigarette smoking among adults--United States, 2003. MMWR Morb Mortal Wkly Rep 2004; 53(44): 1035-1037.
2. CDC. Cigarette smoking among adults--United States, 2002. MMWR Morb Mortal Wkly Rep 2004; 53(20): 427-431.
3. CDC. Smoking-attributable mortality, years of potential life lost, and productivity losses--United States, 2000-2004. MMWR Morb Mortal Wkly Rep 2008; 57(45): 1226-1228.
4. Burgess DJ, Fu SS, Noorbaloochi S, Clothier BA, Ricards J, Widome R, et al.
Employment, gender, and smoking cessation outcomes in low-income smokers using nicotine replacement therapy. Nicotine Tob Res 2009; 11(12): 1439-1447. 5. Hogle JM, Curtin JJ. Sex differences in negative affective response during nicotine withdrawal. Psychophysiology 2006; 43(4): 344-356. 6. Perkins KA, Jacobs L, Sanders M, Caggiula AR. Sex differences in the subjective and reinforcing effects of cigarette nicotine dose. Psychopharmacology (Berl) 2002; 163(2): 194-201. 7. Pogun S, Yararbas G. Sex differences in nicotine action. Handb Exp Pharmacol 2009;(192): 261-291. 8. Schnoll RA, Patterson F. Sex heterogeneity in pharmacogenetic smoking cessation clinical trials. Drug Alcohol Depend 2009; 104 Suppl 1: S94-99. 9. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. 10. Lerman C, Shields PG, Wileyto EP, Audrain J, Hawk LH, Jr., Pinto A, et al. Effects of dopamine transporter and receptor polymorphisms on smoking cessation in a bupropion clinical trial. Health Psychol 2003; 22(5): 541-548. 11. Ray R, Schnoll RA, Lerman C. Pharmacogenetics and smoking cessation with nicotine replacement therapy. CNS Drugs 2007; 21(7): 525-533. 12. Ray R, Schnoll RA, Lerman C. Nicotine dependence: biology, behavior, and treatment. Annu Rev Med 2009; 60: 247-260. 13. David SP, Brown RA, Papandonatos GD, Kahler CW, Lloyd-Richardson EE, Munafo MR, et al. Pharmacogenetic clinical trial of sustained-release bupropion for smoking cessation. Nicotine Tob Res 2007; 9(8): 821-833. 144 14. David SP, Strong DR, Munafo MR, Brown RA, Lloyd-Richardson EE, Wileyto PE, et al. Bupropion efficacy for smoking cessation is influenced by the DRD2 Taq1A polymorphism: analysis of pooled data from two clinical trials. Nicotine Tob Res 2007; 9(12): 1251-1257. 15. Yamanaka H, Nakajima M, Fukami T, Sakai H, Nakamura A, Katoh M, et al. 
CYP2A6 AND CYP2B6 are involved in nornicotine formation from nicotine in humans: interindividual differences in these contributions. Drug Metab Dispos 2005; 33(12): 1811-1818. 16. Malaiyandi V, Goodz SD, Sellers EM, Tyndale RF. CYP2A6 genotype, phenotype, and the use of nicotine metabolites as biomarkers during ad libitum smoking. Cancer Epidemiol Biomarkers Prev 2006; 15(10): 1812-1819. 17. Dempsey D, Tutka P, Jacob P, 3rd, Allen F, Schoedel K, Tyndale RF, et al. Nicotine metabolite ratio as an index of cytochrome P450 2A6 metabolic activity. Clin Pharmacol Ther 2004; 76(1): 64-72. 18. Lerman C, Tyndale R, Patterson F, Wileyto EP, Shields PG, Pinto A, et al. Nicotine metabolite ratio predicts efficacy of transdermal nicotine for smoking cessation. Clin Pharmacol Ther 2006; 79(6): 600-608. 19. Bollen KA. Latent variables in psychology and the social sciences. Annu Rev Psychol 2002; 53: 605-634. 20. Richardson S, Gilks WR. Conditional independence models for epidemiological studies with covariate measurement error. Stat Med 1993; 12(18): 1703-1722. 21. Thomas DC. Multistage sampling for latent variable models. Lifetime Data Anal 2007; 13(4): 565-581. 22. Krueger RF, Markon KE, Patrick CJ, Iacono WG. Externalizing psychopathology in adulthood: a dimensional-spectrum conceptualization and its implications for DSM-V. J Abnorm Psychol 2005; 114(4): 537-550. 23. Krueger RF, Piasecki TM. Toward a dimensional and psychometrically-informed approach to conceptualizing psychopathology. Behav Res Ther 2002; 40(5): 485-499. 24. Ferguson TS. A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1973; 1(2): 209-230. 25. Ohlssen DI, Sharples LD, Spiegelhalter DJ. Flexible random-effects models using Bayesian semi-parametric models: applications to institutional comparisons. Stat Med 2007; 26(9): 2088- 2112. 145 BIBLIOGRAPHY Amos CI, Gorlov IP, Dong Q, Wu X, Zhang H, Lu EY, et al. 
Nicotinic acetylcholine receptor region on chromosome 15q25 and lung cancer risk among African Americans: a case-control study. J Natl Cancer Inst 2010; 102(15): 1199-1205. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet 2008; 40(5): 616- 622. Bahk JY, Li S, Park MS, Kim MO. Dopamine D1 and D2 receptor mRNA up-regulation in the caudate-putamen and nucleus accumbens of rat brains by smoking. Prog Neuropsychopharmacol Biol Psychiatry 2002; 26(6): 1095-1104. Baker TB, Piper ME, McCarthy DE, Majeskie MR, Fiore MC. Addiction motivation reformulated: an affective processing model of negative reinforcement. Psychol Rev 2004; 111(1): 33-51. Baker TB, Weiss RB, Bolt D, von Niederhausern A, Fiore MC, Dunn DM, et al. Human neuronal acetylcholine receptor A5-A3-B4 haplotypes are associated with multiple nicotine dependence phenotypes. Nicotine Tob Res 2009; 11(7): 785-796. Balfour DJ. The neurobiology of tobacco dependence: a preclinical perspective on the role of the dopamine projections to the nucleus accumbens [corrected]. Nicotine Tob Res 2004; 6(6): 899- 912. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21(2): 263-265. Benowitz NL, Dains KM, Hall SM, Stewart S, Wilson M, Dempsey D, et al. Progressive commercial cigarette yield reduction: biochemical exposure and behavioral assessment. Cancer Epidemiol Biomarkers Prev 2009; 18(3): 876-883. Benowitz NL, Hukkanen J, Jacob P, 3rd. Nicotine chemistry, metabolism, kinetics and biomarkers. Handb Exp Pharmacol 2009;(192): 29-60. Benowitz NL, Lessov-Schlaggar CN, Swan GE, Jacob P, 3rd. Female sex and oral contraceptive use accelerate nicotine metabolism. Clin Pharmacol Ther 2006; 79(5): 480-488. Benowitz NL, Perez-Stable EJ, Fong I, Modin G, Herrera B, Jacob P, 3rd. Ethnic differences in N-glucuronidation of nicotine and cotinine. 
J Pharmacol Exp Ther 1999; 291(3): 1196-1203. Benowitz NL, Perez-Stable EJ, Herrera B, Jacob P, 3rd. Slower metabolism and reduced intake of nicotine from cigarette smoking in Chinese-Americans. J Natl Cancer Inst 2002; 94(2): 108-115. Benowitz NL. Nicotine addiction. N Engl J Med 2010; 362(24): 2295-2303. 146 Bergen AW, Conti DV, Van Den Berg D, Lee W, Liu J, Li D, et al. Dopamine genes and nicotine dependence in treatment-seeking and community smokers. Neuropsychopharmacology 2009; 34(10): 2252-2264. Berrettini WH, Lerman CE. Pharmacotherapy and pharmacogenetics of nicotine dependence. Am J Psychiatry 2005; 162(8): 1441-1451. Binda AV, Kabbani N, Lin R, Levenson R. D2 and D3 dopamine receptor cell surface localization mediated by interaction with protein 4.1N. Mol Pharmacol 2002; 62(3): 507-513. Bollen KA. Latent variables in psychology and the social sciences. Annu Rev Psychol 2002; 53: 605-634. Burgess DJ, Fu SS, Noorbaloochi S, Clothier BA, Ricards J, Widome R, et al. Employment, gender, and smoking cessation outcomes in low-income smokers using nicotine replacement therapy. Nicotine Tob Res 2009; 11(12): 1439-1447. CDC. Cigarette smoking among adults--United States, 2002. MMWR Morb Mortal Wkly Rep 2004; 53(20): 427-431. CDC. Smoking-attributable mortality, years of potential life lost, and productivity losses--United States, 2000-2004. MMWR Morb Mortal Wkly Rep 2008; 57(45): 1226-1228. Chen RJ, Ho YS, Guo HR, Wang YJ. Long-term nicotine exposure-induced chemoresistance is mediated by activation of Stat3 and downregulation of ERK1/2 via nAChR and beta- adrenoceptors in human bladder cancer cells. Toxicol Sci 2010; 115(1): 118-130. Chen X, Williamson VS, An SS, Hettema JM, Aggen SH, Neale MC, et al. Cannabinoid receptor 1 gene association with nicotine dependence. Arch Gen Psychiatry 2008; 65(7): 816-824. Conneely KN, Boehnke M. So Many Correlated Tests, So Little Time! Rapid Adjustment of P Values for Multiple Correlated Tests. Am J Hum Genet 2007; 81(6). 
Conti DV, Lee W, Li D, Liu J, Van Den Berg D, Thomas PD, et al. Nicotinic acetylcholine receptor beta2 subunit gene implicated in a systems-based candidate gene study of smoking cessation. Hum Mol Genet 2008; 17(18): 2834-2848. Damaj MI. Influence of gender and sex hormones on nicotine acute pharmacological effects in mice. J Pharmacol Exp Ther 2001; 296(1): 132-140. David SP, Brown RA, Papandonatos GD, Kahler CW, Lloyd-Richardson EE, Munafo MR, et al. Pharmacogenetic clinical trial of sustained-release bupropion for smoking cessation. Nicotine Tob Res 2007; 9(8): 821-833. David SP, Strong DR, Munafo MR, Brown RA, Lloyd-Richardson EE, Wileyto PE, et al. Bupropion efficacy for smoking cessation is influenced by the DRD2 Taq1A polymorphism: analysis of pooled data from two clinical trials. Nicotine Tob Res 2007; 9(12): 1251-1257. 147 De Vries TJ, Schoffelmeer AN. Cannabinoid CB1 receptors control conditioned drug seeking. Trends Pharmacol Sci 2005; 26(8): 420-426. Dempsey D, Tutka P, Jacob P, 3rd, Allen F, Schoedel K, Tyndale RF, et al. Nicotine metabolite ratio as an index of cytochrome P450 2A6 metabolic activity. Clin Pharmacol Ther 2004; 76(1): 64-72. Di Chiara G. Role of dopamine in the behavioural actions of nicotine related to addiction. Eur J Pharmacol 2000; 393(1-3): 295-314. Durbin RM, Abecasis GR, Altshuler DL, Auton A, Brooks LD, Gibbs RA, et al. A map of human genome variation from population-scale sequencing. Nature 2010; 467(7319): 1061-1073. Edlund CK, Lee WH, Li D, Van Den Berg DJ, Conti DV. Snagger: a user-friendly program for incorporating additional information for tagSNP selection. BMC Bioinformatics 2008; 9: 174. Falcone M, Jepson C, Benowitz N, Bergen AW, Pinto A, Wileyto EP, et al. Association of the nicotine metabolite ratio and CHRNA5/CHRNA3 polymorphisms with smoking rate among treatment-seeking smokers. Nicotine Tob Res 2011; 13(6): 498-503. Ferguson TS. A Bayesian analysis of some nonparametric problems. 
The Annals of Statistics 1973; 1(2): 209-230. Feyerabend C, Russell MA. A rapid gas-liquid chromatographic method for the determination of cotinine and nicotine in biological fluids. J Pharm Pharmacol 1990; 42(6): 450-452. Fiore MC, Jaen CR, Baker TB, Bailey WC, Benowitz N, Curry SJ, et al (2008). Treating Tobacco Use and Dependence: 2008 Update, Clinical Practice Guideline. U.S. Department of Health and Human Services PHS (ed): Rockville (MD). Forget B, Hamon M, Thiebot MH. Cannabinoid CB1 receptors are involved in motivational effects of nicotine in rats. Psychopharmacology (Berl) 2005; 181(4): 722-734. Gago-Dominguez M, Jiang X, Conti DV, Castelao JE, Stern MC, Cortessis VK, et al. Genetic variations on chromosomes 5p15 and 15q25 and bladder cancer risk: findings from the Los Angeles-Shanghai bladder case-control study. Carcinogenesis 2011; 32(2): 197-202. Gelernter J, Gueorguieva R, Kranzler HR, Zhang H, Cramer J, Rosenheck R, et al. Opioid receptor gene (OPRM1, OPRK1, and OPRD1) variants and response to naltrexone treatment for alcohol dependence: results from the VA Cooperative Study. Alcohol Clin Exp Res 2007; 31(4): 555-563. Glautier S. Measures and models of nicotine dependence: positive reinforcement. Addiction 2004; 99 Suppl 1: 30-50. Han BG, Nunomura W, Takakuwa Y, Mohandas N, Jap BK. Protein 4.1R core domain structure and insights into regulation of cytoskeletal organization. Nat Struct Biol 2000; 7(10): 871-875. 148 Harrington WR, Sengupta S, Katzenellenbogen BS. Estrogen regulation of the glucuronidation enzyme UGT2B15 in estrogen receptor-positive breast cancer cells. Endocrinology 2006; 147(8): 3843-3850. Hatchell PC, Collins AC. Influences of genotype and sex on behavioral tolerance to nicotine in mice. Pharmacol Biochem Behav 1977; 6(1): 25-30. Heath AC, Martin NG. Genetic models for the natural history of smoking: evidence for a genetic influence on smoking persistence. Addict Behav 1993; 18(1): 19-34. 
Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. Heatherton TF, Kozlowski LT, Frecker RC, Fagerstrom KO. The Fagerstrom Test for Nicotine Dependence: a revision of the Fagerstrom Tolerance Questionnaire. Br J Addict 1991; 86(9): 1119-1127. Higashi E, Fukami T, Itoh M, Kyo S, Inoue M, Yokoi T, et al. Human CYP2A6 is induced by estrogen via estrogen receptor. Drug Metab Dispos 2007; 35(10): 1935-1941. Hogg RC, Bertrand D. Neuroscience. What genes tell us about nicotine addiction. Science 2004; 306(5698): 983-985. Hogle JM, Curtin JJ. Sex differences in negative affective response during nicotine withdrawal. Psychophysiology 2006; 43(4): 344-356. Hong LE, Gu H, Yang Y, Ross TJ, Salmeron BJ, Buchholz B, et al. Association of nicotine addiction and nicotine's actions with separate cingulate cortex functional circuits. Arch Gen Psychiatry 2009; 66(4): 431-441. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 2009; 5(6): e1000529. Huang W, Ma JZ, Payne TJ, Beuten J, Dupont RT, Li MD. Significant association of DRD1 with nicotine dependence. Hum Genet 2008; 123(2): 133-140. Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature 2008; 452(7187): 633-637. Isiegas C, Mague SD, Blendy JA. Sex differences in response to nicotine in C57Bl/6:129SvEv mice. Nicotine Tob Res 2009; 11(7): 851-858. Johnstone EC, Yudkin PL, Hey K, Roberts SJ, Welch SJ, Murphy MF, et al. Genetic variation in dopaminergic pathways and short-term effectiveness of the nicotine patch. Pharmacogenetics 2004; 14(2): 83-90. 149 Jorenby DE, Hays JT, Rigotti NA, Azoulay S, Watsky EJ, Williams KE, et al. 
Efficacy of varenicline, an alpha4beta2 nicotinic acetylcholine receptor partial agonist, vs placebo or sustained-release bupropion for smoking cessation: a randomized controlled trial. JAMA 2006; 296(1): 56-63. Keskitalo K, Broms U, Heliovaara M, Ripatti S, Surakka I, Perola M, et al. Association of serum cotinine level with a cluster of three nicotinic acetylcholine receptor genes (CHRNA3/CHRNA5/CHRNB4) on chromosome 15. Hum Mol Genet 2009; 18(20): 4007-4012. Krueger RF, Markon KE, Patrick CJ, Iacono WG. Externalizing psychopathology in adulthood: a dimensional-spectrum conceptualization and its implications for DSM-V. J Abnorm Psychol 2005; 114(4): 537-550. Krueger RF, Piasecki TM. Toward a dimensional and psychometrically-informed approach to conceptualizing psychopathology. Behav Res Ther 2002; 40(5): 485-499. Lajtha A, Sershen H. Nicotine: alcohol reward interactions. Neurochem Res 2010; 35(8): 1248- 1258. Le Marchand L, Derby KS, Murphy SE, Hecht SS, Hatsukami D, Carmella SG, et al. Smokers with the CHRNA lung cancer-associated variants are exposed to higher levels of nicotine equivalents and a carcinogenic tobacco-specific nitrosamine. Cancer Res 2008; 68(22): 9137- 9140. Lee AM, Jepson C, Hoffmann E, Epstein L, Hawk LW, Lerman C, et al. CYP2B6 genotype alters abstinence rates in a bupropion smoking cessation trial. Biol Psychiatry 2007; 62(6): 635-641. Lee AM, Jepson C, Shields PG, Benowitz N, Lerman C, Tyndale RF. CYP2B6 genotype does not alter nicotine metabolism, plasma levels, or abstinence with nicotine replacement therapy. Cancer Epidemiol Biomarkers Prev 2007; 16(6): 1312-1314. Lerman C, Jepson C, Wileyto EP, Epstein LH, Rukstalis M, Patterson F, et al. Role of functional genetic variation in the dopamine D2 receptor (DRD2) in response to bupropion and nicotine replacement therapy for tobacco dependence: results of two randomized clinical trials. Neuropsychopharmacology 2006; 31(1): 231-242. 
Lerman C, Kaufmann V, Rukstalis M, Patterson F, Perkins K, Audrain-McGovern J, et al. Individualizing nicotine replacement therapy for the treatment of tobacco dependence: a randomized trial. Ann Intern Med 2004; 140(6): 426-433. Lerman C, Shields PG, Wileyto EP, Audrain J, Hawk LH, Jr., Pinto A, et al. Effects of dopamine transporter and receptor polymorphisms on smoking cessation in a bupropion clinical trial. Health Psychol 2003; 22(5): 541-548. Lerman C, Tyndale R, Patterson F, Wileyto EP, Shields PG, Pinto A, et al. Nicotine metabolite ratio predicts efficacy of transdermal nicotine for smoking cessation. Clin Pharmacol Ther 2006; 79(6): 600-608. 150 Lerman CE, Schnoll RA, Munafo MR. Genetics and smoking cessation improving outcomes in smokers at risk. Am J Prev Med 2007; 33(6 Suppl): S398-405. Li D (2010). Multiple degree of freedom p-value adjustment using correlation of score statistics. Lee W (ed): Los Angeles, CA. Li MD, Xu Q, Lou XY, Payne TJ, Niu T, Ma JZ. Association and interaction analysis of variants in CHRNA5/CHRNA3/CHRNB4 gene cluster with nicotine dependence in African and European Americans. Am J Med Genet B Neuropsychiatr Genet 2010; 153B(3): 745-756. Livingstone PD, Wonnacott S. Nicotinic acetylcholine receptors and the ascending dopamine pathways. Biochem Pharmacol 2009; 78(7): 744-755. Malaiyandi V, Goodz SD, Sellers EM, Tyndale RF. CYP2A6 genotype, phenotype, and the use of nicotine metabolites as biomarkers during ad libitum smoking. Cancer Epidemiol Biomarkers Prev 2006; 15(10): 1812-1819. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome- wide association studies by imputation of genotypes. Nat Genet 2007; 39(7): 906-913. Marubio LM, Gardier AM, Durier S, David D, Klink R, Arroyo-Jimenez MM, et al. Effects of nicotine in the dopaminergic system of mice lacking the alpha4 subunit of neuronal nicotinic acetylcholine receptors. Eur J Neurosci 2003; 17(7): 1329-1337. 
Mascia MS, Obinu MC, Ledent C, Parmentier M, Bohme GA, Imperato A, et al. Lack of morphine-induced dopamine release in the nucleus accumbens of cannabinoid CB(1) receptor knockout mice. Eur J Pharmacol 1999; 383(3): R1-2. Mason SJ, Graham NE. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quarterly Journal of the Royal Meteorological Society 2002; 128: 2145-2166. Miller DK, Sumithran SP, Dwoskin LP. Bupropion inhibits nicotine-evoked [(3)H]overflow from rat striatal slices preloaded with [(3)H]dopamine and from rat hippocampal slices preloaded with [(3)H]norepinephrine. J Pharmacol Exp Ther 2002; 302(3): 1113-1122. Molitor J, Papathomas M, Jerrett M, Richardson S. Bayesian profile regression with an application to the National Survey of Children's Health. Biostatistics 2010; 11(3): 484-498. Muller P, Roeder K. A Bayesian Semiparametric Model for Case-Control Studies with Errors in Variables. Biometrika 1997; 84(3): 523-537. Munafo MR, Shields AE, Berrettini WH, Patterson F, Lerman C. Pharmacogenetics and nicotine addiction treatment. Pharmacogenomics 2005; 6(3): 211-223. Murcray CE, Lewinger JP, Gauderman WJ. Gene-environment interaction in genome-wide association studies. Am J Epidemiol 2009; 169(2): 219-226. 151 Ohlssen DI, Sharples LD, Spiegelhalter DJ. Flexible random-effects models using Bayesian semi- parametric models: applications to institutional comparisons. Stat Med 2007; 26(9): 2088-2112. Ortells MO, Barrantes GE. Tobacco addiction: a biochemical model of nicotine dependence. Med Hypotheses 2010; 74(5): 884-894. Patterson F, Schnoll RA, Wileyto EP, Pinto A, Epstein LH, Shields PG, et al. Toward personalized therapy for smoking cessation: a randomized placebo-controlled trial of bupropion. Clin Pharmacol Ther 2008; 84(3): 320-325. Perez-Stable EJ, Herrera B, Jacob P, 3rd, Benowitz NL. Nicotine metabolism and intake in black and white smokers. 
JAMA 1998; 280(2): 152-156. Perkins KA, Jacobs L, Sanders M, Caggiula AR. Sex differences in the subjective and reinforcing effects of cigarette nicotine dose. Psychopharmacology (Berl) 2002; 163(2): 194-201. Perkins KA, Lerman C, Grottenthaler A, Ciccocioppo MM, Milanak M, Conklin CA, et al. Dopamine and opioid gene variants are associated with increased smoking reward and reinforcement owing to negative mood. Behav Pharmacol 2008; 19(5-6): 641-649. Perkins KA, Scott J. Sex differences in long-term smoking cessation rates due to nicotine patch. Nicotine Tob Res 2008; 10(7): 1245-1250. Perkins KA. Chronic tolerance to nicotine in humans and its relationship to tobacco dependence. Nicotine Tob Res 2002; 4(4): 405-422. Picciotto MR, Zoli M, Rimondini R, Lena C, Marubio LM, Pich EM, et al. Acetylcholine receptors containing the beta2 subunit are involved in the reinforcing properties of nicotine. Nature 1998; 391(6663): 173-177. Plummer M (2012). JAGS - Just Another Gibbs Sampler. Plummer M (ed). Pogun S, Yararbas G. Sex differences in nicotine action. Handb Exp Pharmacol 2009;(192): 261- 291. Pritchard JK, Stephens M, Rosenberg NA, Donnelly P. Association mapping in structured populations. Am J Hum Genet 2000; 67(1): 170-181. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 2010; 26(18): 2336-2337. R Development Core Team. R: A Language and Environment for Statistical Computing. 2010. Ray R, Schnoll RA, Lerman C. Nicotine dependence: biology, behavior, and treatment. Annu Rev Med 2009; 60: 247-260. Ray R, Schnoll RA, Lerman C. Pharmacogenetics and smoking cessation with nicotine replacement therapy. CNS Drugs 2007; 21(7): 525-533. 152 Richardson S, Gilks WR. Conditional independence models for epidemiological studies with covariate measurement error. Stat Med 1993; 12(18): 1703-1722. Robinson JD, Cinciripini PM, Tiffany ST, Carter BL, Lam CY, Wetter DW. 
Gender differences in affective response to acute nicotine administration and deprivation. Addict Behav 2007; 32(3): 543-561. Russell MA, Feyerabend C, Cole PV. Plasma nicotine levels after cigarette smoking and chewing nicotine gum. Br Med J 1976; 1(6017): 1043-1046. Russell MA, Jarvis M, Iyer R, Feyerabend C. Relation of nicotine yield of cigarettes to blood nicotine concentrations in smokers. Br Med J 1980; 280(6219): 972-976. Russell MA, Wilson C, Taylor C, Baker CD. Smoking habits of men and women. Br Med J 1980; 281(6232): 17-20. Saccone NL, Saccone SF, Hinrichs AL, Stitzel JA, Duan W, Pergadia ML, et al. Multiple distinct risk loci for nicotine dependence identified by dense coverage of the complete family of nicotinic receptor subunit (CHRN) genes. Am J Med Genet B Neuropsychiatr Genet 2009; 150B(4): 453- 466. Saccone SF, Hinrichs AL, Saccone NL, Chase GA, Konvicka K, Madden PA, et al. Cholinergic nicotinic receptor genes implicated in a nicotine dependence association study targeting 348 candidate genes with 3713 SNPs. Hum Mol Genet 2007; 16(1): 36-49. Schnoll RA, Patterson F, Wileyto EP, Tyndale RF, Benowitz N, Lerman C. Nicotine metabolic rate predicts successful smoking cessation with transdermal nicotine: a validation study. Pharmacol Biochem Behav 2009; 92(1): 6-11. Schnoll RA, Patterson F. Sex heterogeneity in pharmacogenetic smoking cessation clinical trials. Drug Alcohol Depend 2009; 104 Suppl 1: S94-99. Scott C, Keating L, Bellamy M, Baines AJ. Protein 4.1 in forebrain postsynaptic density preparations: enrichment of 4.1 gene products and detection of 4.1R binding proteins. Eur J Biochem 2001; 268(4): 1084-1094. Shiffer KA, Goodman SR. Protein 4.1: its association with the human erythrocyte membrane. Proc Natl Acad Sci U S A 1984; 81(14): 4404-4408. SRNT. Biochemical verification of tobacco use and cessation. Nicotine Tob Res 2002; 4(2): 149- 159. State-specific prevalence of current cigarette smoking among adults--United States, 2003. 
MMWR Morb Mortal Wkly Rep 2004; 53(44): 1035-1037. Strasser AA, Malaiyandi V, Hoffmann E, Tyndale RF, Lerman C. An association of CYP2A6 genotype and smoking topography. Nicotine Tob Res 2007; 9(4): 511-518. 153 Sullivan PF, Kendler KS. The genetic epidemiology of smoking. Nicotine Tob Res 1999; 1 Suppl 2: S51-57; discussion S69-70. Tanda G, Goldberg SR. Cannabinoids: reward, dependence, and underlying neurochemical mechanisms--a review of recent preclinical data. Psychopharmacology (Berl) 2003; 169(2): 115- 134. Tapper AR, McKinney SL, Nashmi R, Schwarz J, Deshpande P, Labarca C, et al. Nicotine activation of alpha4* receptors: sufficient for reward, tolerance, and sensitization. Science 2004; 306(5698): 1029-1032. Thomas DC. Multistage sampling for latent variable models. Lifetime Data Anal 2007; 13(4): 565-581. Thorisson GA, Smith AV, Krishnan L, Stein LD. The International HapMap Project Web site. Genome Res 2005; 15(11): 1592-1593. Tran YK, Bogler O, Gorse KM, Wieland I, Green MR, Newsham IF. A novel member of the NF2/ERM/4.1 superfamily with growth suppressing properties in lung cancer. Cancer Res 1999; 59(1): 35-43. Vickers AJ, Cronin AM, Begg CB. One statistical test is sufficient for assessing new predictive markers. BMC Med Res Methodol 2011; 11: 13. Vink JM, Beem AL, Posthuma D, Neale MC, Willemsen G, Kendler KS, et al. Linkage analysis of smoking initiation and quantity in Dutch sibling pairs. Pharmacogenomics J 2004; 4(4): 274- 282. Wada T, Naito M, Kenmochi H, Tsuneki H, Sasaoka T. Chronic nicotine exposure enhances insulin-induced mitogenic signaling via up-regulation of alpha7 nicotinic receptors in isolated rat aortic smooth muscle cells. Endocrinology 2007; 148(2): 790-799. Waingrow SM, Horn D, Ikard FF. Dosage patterns of cigarette smoking in American adults. Am J Public Health Nations Health 1968; 58(1): 54-70. Wassenaar CA, Dong Q, Wei Q, Amos CI, Spitz MR, Tyndale RF. 
Relationship between CYP2A6 and CHRNA5-CHRNA3-CHRNB4 variation and smoking behaviors and lung cancer risk. J Natl Cancer Inst 2011; 103(17): 1342-1346. Xian H, Scherrer JF, Madden PA, Lyons MJ, Tsuang M, True WR, et al. The heritability of failed smoking cessation and nicotine withdrawal in twins who smoked and attempted to quit. Nicotine Tob Res 2003; 5(2): 245-254. Yamanaka H, Nakajima M, Fukami T, Sakai H, Nakamura A, Katoh M, et al. CYP2A6 AND CYP2B6 are involved in nornicotine formation from nicotine in humans: interindividual differences in these contributions. Drug Metab Dispos 2005; 33(12): 1811-1818. 154 Yan G, Sedransk J. The Effect of Sample Composition on Inference for Random Effects Using Normal and Dirichlet Process Models. Journal of Data Science 2010; 8: 579-595. Yang M, Dunson DB. Bayesian Semiparametric Structural Equation Models with Latent Variables. Psychometricka 2010; 75(4): 675-693. Zbikowski SM, Swan GE, McClure JB. Cigarette smoking and nicotine dependence. Med Clin North Am 2004; 88(6): 1453-1465, x. Zhang PW, Ishiguro H, Ohtsuki T, Hess J, Carillo F, Walther D, et al. Human cannabinoid receptor 1: 5' exons, candidate regulatory regions, polymorphisms, haplotypes and association with polysubstance abuse. Mol Psychiatry 2004; 9(10): 916-931.
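The growth and hazard equations (5) to (7) can be illustrated with a small sketch. It is an assumption-laden simplification: the trend parameters in (6) are fitted here by ordinary least squares purely for illustration, whereas the framework in the text estimates all quantities jointly within the latent variable model. The function names, the toy data, and the coefficient values are hypothetical.

```python
import math


def fit_linear_growth(times, outcomes):
    """Least-squares estimates of (phi0, phi1) in Y_t = phi0 + phi1 * t + eps_t."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(outcomes) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, outcomes))
    phi1 = sxy / sxx
    phi0 = y_bar - phi1 * t_bar
    return phi0, phi1


def log_hazard_t(log_h0_t, beta1, y_star_t, beta2, x_t):
    """Equation (7): log h(t | Y*_t, X_t) = log h0(t) + beta1 * Y*_t + beta2 * X_t."""
    return log_h0_t + beta1 * y_star_t + beta2 * x_t


# Toy outcome trajectory (hypothetical values, exactly linear for clarity).
times = [0, 1, 2, 3]
outcomes = [1.0, 3.0, 5.0, 7.0]

phi0, phi1 = fit_linear_growth(times, outcomes)
y_star = [phi0 + phi1 * t for t in times]          # equation (6)
residuals = [y - ys for y, ys in zip(outcomes, y_star)]  # eps_t in equation (5)

# Hypothetical baseline log hazard, coefficients, and latent value at t = 2.
print(log_hazard_t(math.log(0.05), 0.3, y_star[2], -0.4, 1.5))
```

Separating the trend $Y_t^{*}$ from the residual $\varepsilon_t$ means the hazard in (7) responds to the smoothed trajectory of the outcome rather than to its measurement noise.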
Abstract
Tobacco-related morbidity and mortality remain among the costliest, and most preventable, public health problems worldwide. Nicotine dependence is the primary factor in the persistence of this problem, which has prompted ongoing efforts both to characterize nicotine dependence and to develop treatments for it. Nicotine replacement therapies, such as transdermal and nasal spray treatments, and other drugs, such as bupropion, that target neurotransmitters affected by nicotine have had varying degrees of effectiveness for smoking cessation. Studies investigating genetic variants involved in the dopamine reward pathway, nicotine metabolism, and related mechanisms and pathways have shown that individuals with specific genetic profiles have different levels of nicotine dependence and respond differentially to treatments for nicotine dependence. For example, differential effects of bupropion on smoking cessation have been observed for variants within genes involved in the dopamine reward pathway, a pathway on which bupropion acts by inhibiting the reuptake of dopamine. Bupropion has also been shown to be an antagonist of the nicotinic acetylcholine receptors, which bind nicotine. These two actions in concert highlight the interplay between treatments and genes. The metabolism of nicotine has also been of particular interest. Specifically, CYP2A6 has been shown to be a primary mechanism in the metabolism of nicotine to cotinine and then to 3-hydroxycotinine (3HC). The nicotine metabolite ratio (NMR), the ratio of 3HC to cotinine, has subsequently been shown to be a reliable proxy for nicotine metabolism. Moreover, those carrying the CYP2A6 variant have been shown to have lower NMR levels and to be slow metabolizers, whereas those wildtype for CYP2A6 have been shown to be normal or fast metabolizers.

This intricate web of dependence, genetic variation, cessation treatments, and nicotine metabolism has led us to develop a latent variable framework that attempts to capture an underlying process characterizing a nicotine profile. Latent variable models have been used in the social sciences as a means of estimating constructs that synthesize indices of behavior. Indices for smoking may not capture the extent or degree of dependence on their own, but a latent variable, in describing relationships between measured variables, may capture otherwise unobservable effects. A potential problem is then the interpretation of the latent variable: a hypothetical construct of “nicotine dependence” or “biological pathway X” may or may not be reliable in capturing the desired effects.

We present a framework that incorporates a Dirichlet process, which does not constrain this underlying process to a single distribution but can flexibly cluster individuals into profiles based on observed biomarker levels, genetic variation, and other biological and psychological characteristics. This non- or semiparametric model, made up of a mixture of parametric distributions, allocates observed measurements into k clusters. It is non- or semiparametric in the sense that allocation to clusters follows a discrete distribution, while observations within each cluster follow a parametric (typically normal) distribution. This provides flexibility in the estimation of an unmeasured variable or parameter, yielding a multimodal distribution that shrinks observations towards their respective cluster effects rather than towards a grand mean. In doing so, groups can be distinguished from one another, rather than all observations being constrained to a single distribution. The association between this latent variable and a smoking-related outcome can then be estimated, with the eventual purpose of using these risk profiles to predict the outcome of interest.
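To make the clustering idea concrete, the following is a minimal sketch of one common way to represent a Dirichlet process mixture: a truncated stick-breaking construction with normal cluster distributions. This is an assumption for illustration; the abstract does not state which representation or truncation the dissertation uses. The sketch only simulates from the prior and performs no posterior inference, and the function names, the concentration parameter `alpha`, the truncation level, and all numeric values are hypothetical.

```python
import random


def stick_breaking_weights(alpha, num_clusters, rng):
    """Truncated stick-breaking: break off Beta(1, alpha) fractions of a unit stick."""
    weights = []
    remaining = 1.0
    for k in range(num_clusters):
        # The final cluster takes whatever is left so the weights sum to 1.
        fraction = 1.0 if k == num_clusters - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


def simulate_profiles(n, alpha, num_clusters, base_mean, base_sd, within_sd, seed=0):
    """Simulate n latent values from a truncated Dirichlet process mixture of normals."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, num_clusters, rng)
    # Cluster effects are drawn from a normal base distribution.
    cluster_means = [rng.gauss(base_mean, base_sd) for _ in range(num_clusters)]
    # Each individual is allocated to a cluster (a discrete draw) ...
    labels = rng.choices(range(num_clusters), weights=weights, k=n)
    # ... and varies around that cluster's effect, not around a grand mean.
    values = [rng.gauss(cluster_means[z], within_sd) for z in labels]
    return weights, cluster_means, labels, values


weights, cluster_means, labels, values = simulate_profiles(
    n=200, alpha=1.0, num_clusters=10, base_mean=0.0, base_sd=2.0, within_sd=0.3, seed=1
)
occupied = sorted(set(labels))
print(len(occupied), "of", len(weights), "clusters are occupied")
```

Smaller values of `alpha` concentrate the weight on a few clusters, while larger values spread individuals across more of them, which is how the number of occupied clusters can be left for the data to inform rather than fixed in advance.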

To assess the performance of our framework, we conducted simulations and compared the framework with conventional regressions on the actual data. Simulations performed as expected. In the real data, latent clusters were similar to the slow, normal, and fast nicotine metabolism categories based on NMR levels, regardless of whether we constrained the number of estimated latent clusters to 2 or 3 or allowed the framework to determine the number of latent clusters. This is no surprise given that our latent variable was estimated using NMR. Of greater interest, once CYP2A6 was incorporated, NMR levels influenced clustering among individuals carrying the CYP2A6 variant but not among those wildtype for CYP2A6, suggesting that CYP2A6 and NMR play complementary, but not completely dependent, roles in nicotine metabolism. Moreover, while our framework did not outperform conventional regressions in predicting smoking cessation, it was comparable, with the added advantages of a more refined characterization of individuals and of dimension reduction when more variables are included.
Asset Metadata
Creator
Lee, Won H. (author)
Core Title
Observed and underlying associations in nicotine dependence
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Statistical Genetics and Genetic Epidemiology
Publication Date
08/02/2014
Defense Date
04/20/2012
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Dirichlet process, dopamine reward pathway, genetic association, nicotine dependence, nicotine metabolism, nonparametric latent variable framework, OAI-PMH Harvest
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Conti, David V. (committee chair), Gauderman, William James (committee member), Knowles, James (committee member), Leventhal, Adam M. (committee member), Thomas, Duncan C. (committee member)
Creator Email
wonhlee@usc.edu,wonho77@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-85457
Unique identifier
UC11289269
Identifier
usctheses-c3-85457 (legacy record id)
Legacy Identifier
etd-LeeWonH-1124.pdf
Dmrecord
85457
Document Type
Dissertation
Rights
Lee, Won H.
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA