Shortcomings of the Genetic Risk Score in the
Analysis of Disease-related Quantitative Traits
By Zhu Chen
A Thesis Presented to the
FACULTY of THE GRADUATE SCHOOL UNIVERSITY OF
SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the
Degree MASTER OF SCIENCE
(BIOSTATISTICS)
May 2017
Table of Contents
Abstract
1. Introduction
1.1 Complex Diseases in the GWAS Era
1.2 The Genetic Risk Score for Complex Disease in the Post-GWAS Era
1.3 Type 2 Diabetes Mellitus (T2DM) as an Illustrative Example
2. Materials and Methods
2.1 Subject Recruitment
2.2 Clinical Protocols
2.3 Molecular Analysis
2.4 Data Analysis
3. Results
4. Discussion
4.1 Comparison of Results from the GRS with Multiple Regression Parsimonious Model
4.2 The Effect of SNPs in the Model of Each T2DM-Related Trait
4.3 Overlapping SNPs in the Model of Each T2DM-Related Trait
5. Conclusions
6. References
7. Appendix
Abstract
Background: Genome-wide association studies (GWAS) have identified
genetic variants underlying complex diseases such as type 2 diabetes
mellitus (T2DM). The genetic risk score (GRS), the total number of risk
variants carried by an individual, is a widely used tool for testing
association between genetic variation and disease risk or disease-related
quantitative traits. However, a GRS created for a complex disease may have
limitations when applied to traits related to that disease. This study used
T2DM as an example of a complex disease and evaluated the performance of a
GRS created for T2DM in the analysis of T2DM-related quantitative traits
(QTs), compared with traditional multiple regression.
Methods: A total of 1044 subjects from the BetaGene study were analyzed.
BetaGene consists of Mexican American families of probands with or
without previous gestational diabetes mellitus (GDM). Fifty-six T2DM risk
variants identified in previously published GWAS were included in the
analysis. A GRS was created from the 56 T2DM-associated SNPs and tested for
association with T2DM-related QTs using a linear mixed-effects model that
incorporated the kinship matrix to adjust for the correlation between family
members. Additionally, stepwise model selection was used to obtain a model
for testing the association between T2DM-related QTs and the individual
T2DM risk SNPs. The GRS and the model chosen by stepwise selection were
compared in terms of the variation explained and the significance of
association.
Results: For both analyses, a nominal p<0.05 was used to define statistical
significance. The GRS was associated with 8 of the 24 T2DM-related QTs
tested and explained 0.01-3% of the trait variation. In contrast, stepwise
regression selected 8-17 SNPs for each QT; the selected SNPs were associated
with the QT and explained 3-13% of the trait variation. Furthermore, in
cases where both the GRS and stepwise regression showed evidence of
association, the significance of association was stronger for stepwise
regression.
Conclusion: These results suggest that the GRS may not be an appropriate
tool for assessing association between disease risk variants and
disease-related QTs. The GRS may lead to misinterpretation of the
physiologic implications of disease risk variants, because it may miss true
associations between disease risk variants and complex disease-related QTs.
We conclude that the GRS should not be used to test association between
disease risk variants and disease-related QTs, because of the pleiotropic
and heterogeneous effects of risk variants on each disease-related trait.
1 Introduction
1.1 Complex Diseases in the GWAS Era
Complex diseases, such as cancer, diabetes mellitus (DM), cardiovascular
disease (CVD) and Alzheimer’s disease, are among the most prevalent diseases
worldwide and impose a heavy burden on national health systems and
economies. Unlike single-gene (Mendelian) diseases, which are caused by a
mutation in a single gene, complex diseases are caused by the effects of
multiple genes acting in complex interaction with lifestyle and
environmental factors.
Because complex diseases do not follow standard Mendelian patterns of
inheritance, and genetic factors account for only part of the risk
associated with complex disease phenotypes [3], a person carrying genetic
risk variants for a disease is not destined to develop it. Carrying such
variants indicates only that the person is at higher risk (susceptibility)
of developing the disease than people who do not carry them. Although genes
cannot be changed, these diseases could be prevented, or their onset
delayed, by altering lifestyle and environmental factors or through medical
intervention. Understanding the genetic architecture and biological
mechanisms of a complex disease is therefore crucial for prevention, early
diagnosis and early intervention, and it remains a challenge for scientists.
Until recently, gene discovery relied mainly on the identification of
nonsynonymous mutations through the candidate gene approach and family-based
linkage studies [11]. With the advent of the common disease-common
variant hypothesis, the completion of the HapMap Project, and technological
advances in DNA genotyping and sequencing, research has shifted from
genetics to genomics by screening millions of single nucleotide
polymorphisms (SNPs) throughout the genome. Genome-wide association
studies (GWAS), which do not rely on a priori knowledge of the disease,
can find loci that have not previously been identified and may yield more
comprehensive knowledge of the biological mechanisms underlying complex
diseases. In the past two decades, numerous genetic variants have been
reported to be associated with complex diseases through GWAS.
SNPs shown to be significantly associated with disease risk or
disease-related quantitative traits are maintained in the NHGRI-EBI GWAS
Catalog (http://www.ebi.ac.uk/gwas/). As of November 2016, this catalog
included 2,610 studies and 29,382 unique SNP-trait associations. However,
88% of these SNPs are located in intronic or intergenic non-coding regions
and have small effect sizes, making the clinical translation of GWAS
findings extremely difficult [4].
1.2 The Genetic Risk Score for Complex Disease in the Post-GWAS Era
Since the discovery of multiple genetic variants associated with complex
diseases, the genetic risk score (GRS) has been proposed as a common
approach for translating emerging genomic knowledge into clinical practice.
The GRS combines the effects of multiple associated SNPs, each with a small
effect size, into a single metric: it counts the total number of risk
alleles carried by an individual to represent that individual’s genetic
predisposition to disease [10]. People with a high GRS for a given complex
disease are presumed to be at high risk of developing it and should
benefit from early prevention and intervention.
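In symbols, an unweighted GRS of this kind can be written as follows. The notation (the genotype count g_ij and the number of SNPs m) is introduced here for illustration only and does not appear in the original text.

```latex
\mathrm{GRS}_i \;=\; \sum_{j=1}^{m} g_{ij},
\qquad g_{ij} \in \{0, 1, 2\},
```

where g_ij is the number of risk alleles that individual i carries at SNP j. Under this definition the score ranges from 0 to 2m, and every SNP contributes with the same weight and the same sign.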
However, evidence indicates that the predictive power of the GRS is weak
[10, 20, 21]. The GRS, a single summary of the net effect of genetic
variation, may not be the best representation of the genetic architecture
underlying intermediate and disease-related phenotypes. These phenotypes
may also be markers of disease subtypes, or even risk factors for the
complex disease. Characterizing the relationship between complex
disease-associated SNPs and complex disease-related traits should provide
insight into the etiology and pathogenesis of complex diseases. This study
aimed to evaluate the performance of a GRS created for a complex disease in
the analysis of complex disease-related QTs, compared with traditional
multiple regression.
1.3 Type 2 Diabetes Mellitus (T2DM) as an Illustrative Example
T2DM is a serious, chronic, heterogeneous metabolic disease characterized by
hyperglycemia, obesity, insulin resistance, and pancreatic β-cell
dysfunction [1]. It accounts for about 90%-95% of DM cases. Because
hyperglycemia in T2DM develops over time, onset of this form of diabetes
typically occurs after the age of 40.
Effective approaches are available to delay the onset of T2DM and
gestational diabetes mellitus (GDM) or to manage the disease. Early
diagnosis and treatment of T2DM could prevent severe complications and
premature death. At-risk individuals could therefore benefit from
prevention and early intervention through lifestyle and/or pharmacological
measures [5-7].
Because the GRS is widely used to represent an individual’s genetic
predisposition to T2DM, and because T2DM-related quantitative traits such as
BMI are important markers in T2DM research, we chose T2DM as an example for
evaluating how well a disease GRS represents complex disease-related traits,
in terms of the variation explained by the predictors (the GRS or SNPs) and
the significance of association.
2 Materials and Methods
2.1 Subject Recruitment
This study used data from the BetaGene Study [22], which consists of
Mexican American families of a proband with or without a previous diagnosis
of GDM. Details regarding family recruitment have been published [22].
All protocols for the BetaGene Study were approved by the institutional
review boards of participating institutions and all participants provided
written informed consent prior to participation.
2.2 Clinical Protocols
In this study, we measured the following phenotypes: fasting, 30-min and
120-min plasma glucose and insulin concentrations; liver enzymes (ALT and
AST); lipids (cholesterol, triglycerides, low-density lipoprotein (LDL) and
high-density lipoprotein (HDL)); BMI; body adiposity index (BAI); body fat
percent; waist-hip ratio; glucose effectiveness (SG); insulin sensitivity
(SI); the acute insulin response to glucose (AIR); the insulin clearance
rate; the disposition index (DI); the 30-minute DI (DI30); and systolic and
diastolic blood pressure. Phenotyping was performed on two separate visits
to the University of Southern California General Clinical Research Center.
During the first visit, each participant completed a physical examination,
DNA collection, and a 75 g 2-hour oral glucose tolerance test (OGTT) with
30-min blood samples to measure the plasma glucose and insulin
concentrations. The fasting blood samples were also used to measure liver
enzymes and lipid levels. Participants in GDM families with fasting glucose
<126 mg/dl and non-GDM probands with normal fasting and 2-hour
glucose levels were invited for a second visit, which consisted of a
dual-energy X-ray absorptiometry (DEXA) scan for body composition and
an insulin-modified intravenous glucose tolerance test (IVGTT), which was
analyzed using the Minimal Model [12-13] to estimate SG, SI and AIR.
The DI, a measure of pancreatic β-cell compensation for insulin resistance,
was calculated as SI × AIR; the 30-minute DI was calculated as SI × (30-min
insulin - fasting insulin), as a measure of pancreatic β-cell compensation
from the OGTT [22].
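The two disposition-index formulas above can be sketched in a few lines of Python. The function and argument names are hypothetical, and the thesis itself reports using R, so this is an illustration of the arithmetic rather than the study's code.

```python
def disposition_index(si, air):
    """IVGTT-based disposition index: DI = SI x AIR."""
    return si * air


def disposition_index_30(si, insulin_30min, insulin_fasting):
    """OGTT-based 30-minute index: DI30 = SI x (30-min insulin - fasting insulin)."""
    return si * (insulin_30min - insulin_fasting)
```

Both functions assume that SI, AIR and the insulin concentrations are supplied in whatever units the study used; no unit conversion is attempted.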
2.3 Molecular Analysis
All samples were genotyped on the Illumina HumanOmniExpress BeadChip,
and the data were read with GenomeStudio software (Illumina, San Diego, CA).
Only samples with call rates >0.98 and SNPs with call rates >0.99 and minor
allele frequency (MAF) >0.001 were included in the BetaGene study [19].
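A minimal sketch of these quality-control thresholds is given below. It assumes genotypes coded as 0, 1 or 2 copies of one allele with None for a missing call, and it assumes samples are filtered before SNPs; neither detail is stated in the text, and all function names are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of non-missing genotype calls (missing calls are None)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def qc_filter(matrix, sample_min=0.98, snp_min=0.99, maf_min=0.001):
    """matrix[i][j] is the genotype of sample i at SNP j.

    Returns the indices of the samples and SNPs that pass the thresholds.
    Samples are filtered first, then SNPs (an assumed order).
    """
    keep_samples = [i for i, row in enumerate(matrix) if call_rate(row) > sample_min]
    kept_rows = [matrix[i] for i in keep_samples]
    keep_snps = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in kept_rows]
        # The call-rate check runs first, so MAF is never computed on an
        # all-missing column.
        if column and call_rate(column) > snp_min and minor_allele_frequency(column) > maf_min:
            keep_snps.append(j)
    return keep_samples, keep_snps
```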
SNPs reported to be associated with T2DM were selected from the literature.
In total, we identified 109 different SNPs from 106 different genes
(Table 1, [23-46]).
Table 1 List of identified T2DM susceptibility SNPs
SNP | Chr | Gene | SNP | Chr | Gene
rs17106184 | 1 | FAF1 | rs2796441* | 9 | TLE1
rs4660293 | 1 | MACF1 | rs13292136 | 9 | TLE4, CHCHD9
rs10923931 | 1 | NOTCH2 | rs12779790 | 10 | CAMK1D, CDC123
rs340874 | 1 | PROX1 | rs10886471* | 10 | GRK5
rs243021* | 2 | BCL11A | rs1111875 | 10 | HHEX, IDE
rs780094* | 2 | GCKR | rs7903146* | 10 | TCF7L2
rs3923113 | 2 | GRB14 | rs1802295* | 10 | VPS26A
rs2943641* | 2 | IRS1 | rs12571751 | 10 | ZMIZ1
rs7560163 | 2 | RBM43, RND3 | rs1552224* | 11 | CENTD2
rs7593730 | 2 | RBMS1 | rs2334499* | 11 | DUSP8
rs7578597* | 2 | THADA | rs3842770 | 11 | INS
rs6723108* | 2 | TMEM163 | rs5215* | 11 | KCNJ11
rs4607103* | 3 | ADAMTS9 | rs5219* | 11 | KCNJ11
rs11708067* | 3 | ADCY5 | rs2237892* | 11 | KCNQ1
rs4402960* | 3 | IGF2BP2 | rs231362* | 11 | KCNQ1
rs6769511 | 3 | IGF2BP2 | rs10830963* | 11 | MTNR1B
rs6808574* | 3 | LPP | rs1387153* | 11 | MTNR1B
rs1801282* | 3 | PPARG | rs2074356 | 12 | C12orf51
rs831571 | 3 | PSMD6 | rs11065756 | 12 | CCDC63
rs16861329 | 3 | ST6GAL1 | rs11063069* | 12 | CCND2
rs7612463* | 3 | UBE2E2 | rs1531343 | 12 | HMGA2
rs6780569* | 3 | UBE2E2 | rs7957197 | 12 | HNF1A
rs6815464 | 4 | MAEA | rs10842994* | 12 | KLHDC5
rs6813195 | 4 | TMEM154 | rs4275659* | 12 | MPHOSPH9
rs10010131* | 4 | WFS1 | rs1727313 | 12 | MPHOSPH9
rs1801214 | 4 | WFS1 | rs7961581* | 12 | TSPAN8, LGR5
rs459193 | 5 | ANKRD55 | rs9552911 | 13 | SGCG
rs702634* | 5 | ARL15 | rs1359790 | 13 | SPRY2
rs35658696 | 5 | PAM, PPIP5K2 | rs61736969 | 13 | TBC1D4
rs4457053* | 5 | ZBED3 | rs2028299* | 15 | AP3S2
rs7754840* | 6 | CDKAL1 | rs7172432 | 15 | C2CD4A/B
rs10440833 | 6 | CDKAL1 | rs7178572 | 15 | HMG20A
rs4712525 | 6 | CDKAL1 | rs8042680* | 15 | PRC1
rs1535500 | 6 | KCNK16 | rs8042680 | 15 | PRC1
rs3132524 | 6 | POU5F1, TCF19 | rs7403531* | 15 | RASGRP1
rs3130501* | 6 | POU5F1, TCF19 | rs11634397* | 15 | ZFAND6
rs9502570 | 6 | SSR1, RREB1 | rs7202877* | 16 | BCAR1
rs9505118* | 6 | SSR1, RREB1 | rs9936385* | 16 | FTO
rs9470794 | 6 | ZFAND3 | rs8050136* | 16 | FTO
rs2191349 | 7 | DGKB/TMEM195 | rs4430796* | 17 | HNF1B
rs6467136 | 7 | GCC1, PAX4 | rs11651052 | 17 | HNF1B
rs4607517* | 7 | GCK | rs13342692* | 17 | SLC16A11
rs864745 | 7 | JAZF1 | rs312457 | 17 | SLC16A13, SLC16A11
rs972283* | 7 | KLF14 | rs391300* | 17 | SRR
rs791595* | 7 | MIR129, LEP | rs12454712* | 18 | BCL2
rs10229583* | 7 | PAX4 | rs8090011 | 18 | LAMA1
rs515071 | 8 | ANK1 | rs12970134* | 18 | MC4R
rs13266634* | 8 | SLC30A8 | rs10401969* | 19 | CILP2
rs896854 | 8 | TP53INP1 | rs3794991 | 19 | GATAD2A, CILP2, PBX4
rs10811661 | 9 | CDKN2A, CDKN2A2B | rs8108269* | 19 | GIPR
rs2383208 | 9 | CDKN2A, CDKN2A2B | rs3786897* | 19 | PEPD
rs7041847 | 9 | GLIS3 | rs4812829 | 20 | HNF4A
rs11787792 | 9 | GPSM1 | rs738409 | 22 | PNPLA3
rs17584499* | 9 | PTPRD | rs5945326* | X | DUSP9
 | | | rs12010175 | X | FAM58A
* SNPs present in the BetaGene genotype data and therefore included in the current study.
2.4 Data Analysis
We restricted our analyses to samples with complete data across all SNPs
and all T2DM-related quantitative traits to ensure fair comparison among
approaches. Genotype data were tested for deviation from Hardy-Weinberg
equilibrium (HWE) and for non-Mendelian inheritance using PEDSTATS
V0.6.4 [47].
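For readers unfamiliar with the HWE check, the sketch below shows a simple one-degree-of-freedom chi-square version for a single biallelic SNP. It is a stand-in for illustration: it treats individuals as unrelated and is not a reproduction of the procedure PEDSTATS applies to pedigree data. The function name and argument names are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are the observed genotype counts for one SNP.
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Genotype counts of 25, 50 and 25 match the Hardy-Weinberg expectation exactly and give a statistic of zero, whereas a sample with no heterozygotes at all produces a very large statistic.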
Quantitative traits (BMI, BAI, body fat percent, waist-hip ratio, cholesterol,
HDL, LDL, triglycerides, ALT, AST, SG, fasting glucose, 30-min glucose,
120-min glucose, insulin clearance rate, SI, AIR, DI, DI30, fasting insulin,
30-min insulin, 120-min insulin, systolic blood pressure and diastolic blood
pressure) were inverse normal transformed to approximate univariate
normality prior to analysis. SNPs were coded under an additive genetic model
as the number of copies of the presumed “risk” allele, so that all SNPs had
the same direction of effect on T2DM risk. The GRS was computed
for each individual by summing risk alleles from all candidate SNPs.
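The two preprocessing steps just described can be sketched as follows. The rank-based transform uses the Blom offset of 3/8 with average ranks for ties, which is one common convention; the thesis does not say which variant was used, so that choice, like the function names, is an assumption.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=3.0 / 8.0):
    """Rank-based inverse normal transform (Blom offset assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


def genetic_risk_score(risk_allele_counts):
    """Unweighted GRS: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(risk_allele_counts)
```

With 56 SNPs, the score produced by genetic_risk_score can range from 0 to 112.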
Association with T2DM-related quantitative traits was assessed using linear
mixed-effects models that incorporated the kinship matrix to account for the
correlation among related individuals. The kinship matrix was estimated
from the full BetaGene pedigrees using the kinship2 R package for pedigree
data [14]. The GRS was tested for association with each T2DM-related trait,
adjusting for the effects of age and sex. As an alternative to the GRS, we
also performed stepwise regression with the individual SNPs as predictors.
We used backward-and-forward model selection based on the Akaike
Information Criterion (AIC) to identify the parsimonious model that best
captured the variation in each T2DM-related quantitative trait. The GRS
model and the parsimonious multiple regression model were compared on
goodness of fit and on the variation explained by the GRS or the SNPs,
respectively.
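One common way to write a kinship-adjusted linear mixed model is shown below. The text does not give its model equation, so this form and all of its symbols are assumptions made for illustration.

```latex
y = X\beta + u + \varepsilon,
\qquad u \sim N\!\left(0,\; 2\Phi\,\sigma_g^2\right),
\qquad \varepsilon \sim N\!\left(0,\; I\,\sigma_e^2\right)
```

Here y is the vector of transformed trait values, X holds the intercept, age, sex and the genetic predictor (either the GRS or the selected SNPs), Φ is the kinship matrix computed from the pedigrees, and σ_g² and σ_e² are the polygenic and residual variance components. The random effect u is what lets relatives have correlated trait values.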
Statistical analysis was performed using R version 3.3.2 [9].
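The AIC-based backward-and-forward selection can be illustrated with the simplified sketch below. It is written in Python rather than the R used in the study, it fits ordinary least squares and so ignores the kinship adjustment and the age and sex covariates, and every function name is hypothetical. It also assumes the predictors are not perfectly collinear and that the fit is not exact.

```python
import math


def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus the columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def aic(columns, y):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    n, k = len(y), len(columns) + 1
    return n * math.log(ols_rss(columns, y) / n) + 2 * k


def stepwise_aic(snps, y):
    """snps maps a SNP name to its list of genotype values.

    At each step, try adding every unselected SNP and dropping every
    selected one; keep the single change that lowers AIC the most.
    """
    if not snps:
        return []
    selected = []
    best = aic([], y)
    improved = True
    while improved:
        improved = False
        candidates = []
        for name in snps:
            if name in selected:
                trial = [s for s in selected if s != name]
            else:
                trial = selected + [name]
            candidates.append((aic([snps[s] for s in trial], y), trial))
        score, trial = min(candidates, key=lambda t: t[0])
        if score < best - 1e-9:
            best, selected, improved = score, trial, True
    return selected
```

Because each accepted step strictly lowers the AIC and there are finitely many SNP subsets, the loop always terminates.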
3 Results
In total, 7590 subjects in 547 families were included in the BetaGene study,
and 1232 samples were genotyped on the Illumina HumanOmniExpress
BeadChip. After data cleaning, 1044 subjects and 56 candidate SNPs were
included in our analysis. To evaluate how well the GRS represents the
genetic risk for T2DM-related traits, the model using the GRS was compared
with the model generated by stepwise model selection from the 56 candidate
SNPs (Table 2). In contrast to the GRS, which uses information from all 56
SNPs, stepwise regression selected 8-17 SNPs for each trait examined.
Stepwise regression explained 3-13% of the trait variation, compared with
only 0.01-3.00% for the GRS.
Among the 24 T2DM-related traits tested, only 8 (SG, fasting glucose,
120-min glucose, AIR, DI, DI30, 30-min insulin and systolic blood pressure)
exhibited statistically significant associations with the GRS (Table 2). In
contrast, stepwise regression showed every trait to be associated with some
combination of the 56 SNPs examined in this analysis. The selected models
ranged from 8 SNPs for triglycerides (P = 3.06×10^-5) and ALT
(P = 3.11×10^-5) to 17 SNPs for AIR (P < 1×10^-16), 30-min insulin
(P = 1.78×10^-14) and 120-min insulin (P = 9.70×10^-8).
Table 2 Comparison of quantitative trait associations using stepwise
regression and the GRS for 56 known T2DM SNPs.
Trait | # SNPs† | Multiple regression: % variation‡ | Multiple regression: P-value§ | GRS: % variation‡ | GRS: P-value§
BMI | 12 | 4.39% | 1.44×10^-5 | 0.16% | 0.219
BAI | 12 | 4.23% | 8.29×10^-8 | 0.07% | 0.334
Body Fat Percent | 13 | 2.81% | 2.31×10^-6 | 0.12% | 0.141
Waist Hip Ratio | 10 | 3.15% | 2.53×10^-5 | 0.01% | 0.686
Cholesterol | 11 | 5.18% | 7.87×10^-8 | 0.06% | 0.438
HDL | 13 | 5.90% | 2.12×10^-8 | 0.00% | 0.949
LDL | 11 | 4.53% | 2.06×10^-6 | 0.17% | 0.189
Triglycerides | 8 | 3.06% | 3.06×10^-5 | 0.02% | 0.667
ALT | 8 | 3.06% | 3.11×10^-5 | 0.02% | 0.729
AST | 13 | 3.21% | 7.90×10^-4 | 0.02% | 0.668
Glucose Effectiveness | 16 | 6.35% | 6.92×10^-8 | 1.00% | 0.002
Fasting Glucose | 11 | 5.10% | 7.99×10^-7 | 0.87% | 0.004
30-min Glucose | 10 | 3.20% | 5.58×10^-4 | 0.25% | 0.123
120-min Glucose | 9 | 4.15% | 1.41×10^-6 | 0.42% | 0.039
Insulin Clearance Rate | 14 | 6.85% | 2.90×10^-9 | 0.07% | 0.421
Insulin Sensitivity | 11 | 5.00% | 9.85×10^-7 | 0.03% | 0.585
Acute Insulin Response | 17 | 12.81% | <1×10^-16 | 3.00% | 5.44×10^-8
Disposition Index | 15 | 9.55% | 5.55×10^-15 | 2.88% | 6.72×10^-8
DI30 | 11 | 4.35% | 7.50×10^-6 | 1.21% | 0.001
Fasting Insulin | 15 | 8.17% | 5.13×10^-11 | 0.03% | 0.605
30-min Insulin | 17 | 10.03% | 1.78×10^-14 | 1.66% | 5.97×10^-5
120-min Insulin | 17 | 6.46% | 9.70×10^-8 | 0.03% | 0.622
Systolic Blood Pressure | 11 | 4.11% | 1.49×10^-6 | 0.47% | 0.022
Diastolic Blood Pressure | 15 | 4.83% | 3.47×10^-6 | 0.09% | 0.325
† Number of SNPs selected by stepwise model selection
‡ The percentage of the total trait variation explained by either SNPs or the GRS
§ P-value for the goodness-of-fit test of the effect of either SNPs or the GRS
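As a quick check of the tally reported in the text, the GRS P-values from Table 2 can be compared against the nominal 0.05 threshold. The snippet below only restates numbers from the table; the variable names are arbitrary.

```python
# GRS P-values copied from Table 2
grs_p_values = {
    "BMI": 0.219, "BAI": 0.334, "Body Fat Percent": 0.141,
    "Waist Hip Ratio": 0.686, "Cholesterol": 0.438, "HDL": 0.949,
    "LDL": 0.189, "Triglycerides": 0.667, "ALT": 0.729, "AST": 0.668,
    "Glucose Effectiveness": 0.002, "Fasting Glucose": 0.004,
    "30-min Glucose": 0.123, "120-min Glucose": 0.039,
    "Insulin Clearance Rate": 0.421, "Insulin Sensitivity": 0.585,
    "Acute Insulin Response": 5.44e-8, "Disposition Index": 6.72e-8,
    "DI30": 0.001, "Fasting Insulin": 0.605, "30-min Insulin": 5.97e-5,
    "120-min Insulin": 0.622, "Systolic Blood Pressure": 0.022,
    "Diastolic Blood Pressure": 0.325,
}

ALPHA = 0.05  # nominal significance threshold used in the study
significant = [trait for trait, p in grs_p_values.items() if p < ALPHA]
```

The resulting list contains the same 8 traits named in the Results text.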
4 Discussion
4.1 Comparison of Results from the GRS with Multiple Regression
Parsimonious Model
The relative ranking of traits was consistent between the GRS and the
parsimonious model: both methods showed the strongest association, and
explained the most variation, for AIR. However, the parsimonious multiple
regression model outperformed the GRS both in the trait variation explained
and in the significance of the model fit. Because the GRS is computed from
all candidate SNPs, it includes irrelevant and redundant information. On the
other hand, the GRS explained only 0.01-3.00% of the trait variation,
compared with 3-13% for the parsimonious SNP model chosen by stepwise
selection, which indicates that information was also lost when the GRS was
used to represent T2DM-related traits. This result can seem contradictory at
first glance, and we discuss it in detail in the next section.
Most of the T2DM-related traits that were significantly associated with the
GRS were glucose- and insulin-related traits. Taken alone, the GRS results
would lead to the wrong conclusion that the 56 SNPs used to generate the GRS
are involved only in glucose- and insulin-related pathways.
4.2 The Effect of SNPs in the Model of Each T2DM-Related Trait
It is also notable that, although every candidate SNP was coded by its
presumed “risk” allele and might therefore be expected to have the same
direction of effect (inhibiting or promoting) on T2DM-related traits as
well, in reality the SNPs can have opposite directions of effect on a given
T2DM-related trait. This partly explains the information loss and lack of
association with the GRS: adding up alleles with opposite directions of
effect offsets the net effect. It is why the GRS failed to capture the
association with fasting insulin (Table 2), even though the selected SNPs
were strongly associated with it (P = 5.13×10^-11). Of the 15 T2DM risk SNPs
included in the model, 9 were associated with lower fasting insulin and the
remaining 6 with higher fasting insulin (Supplementary Table 14). The GRS
sums these SNPs as though, given their effect on T2DM, they all had the same
effect on fasting insulin, but the net effect is diluted by their opposite
directions of effect. The results for HDL (P = 2.12×10^-8 for the multiple
regression model, P = 0.949 for the GRS) can be explained in the same way:
of the 13 SNPs in the parsimonious model, 7 were associated with lower HDL
concentration and 6 with higher HDL concentration (Supplementary Table 6).
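The cancellation can be made explicit with a simplified calculation. The setup below (mutually independent SNPs, purely additive effects, and the symbols β_j and v_j) is an illustrative assumption and not the model fitted in this study.

```latex
\begin{aligned}
y &= \sum_{j=1}^{m} \beta_j g_j + \varepsilon,
  \qquad \operatorname{Var}(g_j) = v_j,
  \qquad \mathrm{GRS} = \sum_{j=1}^{m} g_j, \\[4pt]
R^2_{\mathrm{SNP}} &= \frac{\sum_j \beta_j^2 v_j}{\operatorname{Var}(y)},
  \qquad
R^2_{\mathrm{GRS}} = \frac{\operatorname{Cov}(\mathrm{GRS}, y)^2}
                          {\operatorname{Var}(\mathrm{GRS})\,\operatorname{Var}(y)}
                   = \frac{\bigl(\sum_j \beta_j v_j\bigr)^2}
                          {\bigl(\sum_j v_j\bigr)\operatorname{Var}(y)}, \\[4pt]
\Bigl(\sum_j \beta_j v_j\Bigr)^2
  &\le \Bigl(\sum_j \beta_j^2 v_j\Bigr)\Bigl(\sum_j v_j\Bigr)
  \quad\Longrightarrow\quad
  R^2_{\mathrm{GRS}} \le R^2_{\mathrm{SNP}} .
\end{aligned}
```

The inequality is the Cauchy-Schwarz inequality, with equality only when all β_j are equal. When the β_j have mixed signs, as with the 9 decreasing and 6 increasing SNPs for fasting insulin, the terms in the sum of β_j v_j cancel, so the variance explained by the GRS can fall close to zero even when the individual SNPs jointly explain a substantial share.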
Many SNPs were related to multiple traits, and the results for individual
SNPs are also informative. For example, a SNP in ADCY5 (rs11708067) was
included in the models for 13 traits (Table 3). It was associated with
higher glucose levels and lower insulin levels and activity, which is
consistent with increased risk of T2DM. On the other hand, this SNP was
associated with lower values of obesity-related traits (BAI, BMI, body fat
percent) and with higher insulin sensitivity, which would be protective
against T2DM. These results indicate that a single SNP may have pleiotropic
effects and/or that there is a feedback regulatory system.
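The P values listed for rs11708067 in Table 3 appear consistent with a two-sided normal test on the Z score. The text does not state how they were obtained, so the conversion below is an assumption offered for orientation; the function name is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal Z score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2))
```

For example, the Z score of 3.06 reported for 120-min glucose converts to a P value of about 0.0022, which matches the table.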
Table 3 The effect of rs11708067 in the model of each trait that included it.
Trait | Z score | P value
Acute Insulin Response | -4.42 | 0.00001
Fasting Insulin | -2.63 | 0.0086
30-min Insulin | -3.6 | 0.00032
DI | -3.47 | 0.00052
DI30 | -1.9 | 0.058
120-min Glucose | 3.06 | 0.0022
BAI | -2.68 | 0.0073
BMI | -2.42 | 0.016
Body Fat Percent | -1.65 | 0.099
Insulin Sensitivity | 2.72 | 0.0066
HDL | 1.91 | 0.056
Systolic Blood Pressure | -1.99 | 0.047
Diastolic Blood Pressure | -1.54 | 0.12
4.3 Overlapping SNPs in the Model of Each T2DM-Related Trait
Since many traits shared the same SNPs in the model, and SNPs were
involved in multiple traits, we examined the overlap of SNPs among the
different traits. We coalesced the individual traits into related groups:
obesity-related, lipid-related, glucose-level-related, insulin-level-related,
insulin-activity-related, liver-enzyme-related and blood-pressure-related, and
produced Venn diagrams to assess the overlap in SNPs among these traits
(Figure 1 & Supplementary Figure 1-6).
Many T2DM-related SNPs exhibit overlapping associations with
obesity-related traits. However, these associations are often heterogeneous
[15]. BMI and BAI (body adiposity index) are two indices widely used to
quantify fat mass and overall obesity in an individual. Directly measured
body fat percent is a more accurate indicator of adiposity, because it takes
both lean and fat mass into account [16]. Waist-hip ratio is a measure of
fat distribution. It is not surprising that 8 of the 12 SNPs in each model
are shared by BMI and BAI (Figure 1, Supplementary Tables 1 & 2). The
differences in the SNPs selected for each trait also suggest that, although
BMI and BAI attempt to capture the same construct, analyses using them may
give different results. Care should be taken in choosing an appropriate
measure of fat mass. The Venn diagram for obesity-related traits also shows
that BMI, BAI and body fat percent are closer to one another (10 of the 12
SNPs in the BMI model are shared with the models of BAI and body fat
percent, as are 10 of 12 for BAI and 10 of 13 for body fat percent).
Waist-hip ratio differs more from the other three (Figure 1; only 5 of its
10 SNPs are shared with BMI, BAI and body fat percent). These findings are
consistent with the observation that waist-hip ratio is associated with
T2DM risk independently of BMI [17,18].
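Overlap counts of this kind can be computed directly from the sets of SNPs selected for each trait. In the sketch below, the SNP identifiers are placeholders and not the SNPs actually selected in this study (those are listed in the supplementary tables); the function name is also hypothetical.

```python
from itertools import combinations


def pairwise_overlap(models):
    """models maps a trait name to the set of SNP ids in its selected model.

    Returns a dict mapping each pair of traits to the SNPs they share.
    """
    return {
        (a, b): models[a] & models[b]
        for a, b in combinations(sorted(models), 2)
    }


# Placeholder data for illustration only
example_models = {
    "BMI": {"snp1", "snp2", "snp3"},
    "BAI": {"snp2", "snp3", "snp4"},
    "WHR": {"snp4", "snp5"},
}
shared = pairwise_overlap(example_models)
```

The size of each shared set corresponds to the count in the matching two-way intersection of a Venn diagram such as Figure 1.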
Figure 1 Venn diagram of the intersection of SNPs associated with different
obesity-related traits.
5 Conclusions
The advent of GWAS has provided an opportunity to incorporate novel
genetic variants into risk prediction for complex diseases. Although the
GRS is widely used in clinical practice to represent individual risk, it has
many limitations. The pathogenesis of a complex disease does not follow the
linear model assumed by the GRS; it involves a complex, dynamic and
integrated metabolic system regulated by feedback loops, which cannot be
represented by a single metric. In this study, we evaluated how well the
GRS represents the genetic risk for T2DM-related quantitative traits.
Because of the pleiotropic and heterogeneous effects of SNPs on each
T2DM-related trait, results obtained with the GRS may be misleading.
6 References
[1] Definition, Diagnosis and Classification of Diabetes Mellitus and its
Complications. Part 1: Diagnosis and Classification of Diabetes Mellitus
(WHO/NCD/NCS/99.2). Geneva: World Health Organization; 1999.
[2] Causes of Diabetes. National Institute of Diabetes and Digestive and
Kidney Diseases. June 2014. Retrieved 10 February 2016.
[3] Craig, J. Complex diseases: Research and applications. Nature
Education 2008 1(1):184.
[4] Hindorff LA, Sethupathy P, Junkins HA et al. Potential etiologic and
functional implications of genome-wide association loci for human
diseases and traits. Proc Natl Acad Sci USA. 2009;106(23):9362-7.
[5] Li G, Zhang P, Wang J et al. The long-term effect of lifestyle
interventions to prevent diabetes in the China Da Qing Diabetes
Prevention Study: A 20-year follow-up study. Lancet. 2008; 371: 1783–
9.
[6] Lindstrom J, Ilanne-Parikka P, Peltonen M et al. Sustained reduction
in the incidence of type 2 diabetes by lifestyle intervention: Follow-up of
the Finnish Diabetes Prevention Study. Lancet. 2006; 368: 1673–9.
[7] Paulweber B, Valensi P, Lindstrom J et al. A European
evidence-based guideline for the prevention of type 2 diabetes. Horm
Metab Res. 2010; 42 (Suppl. 1): S3–36.
[8] Echouffo-Tcheugui JB, Dieffenbach SD, Kengne AP. Added value of
novel circulating and genetic biomarkers in type 2 diabetes prediction: A
systematic review. Diabetes Res Clin Pract. 2013; 101: 255–69.
[9] R Development Core Team (2008). R: A language and environment
for statistical computing. R Foundation for Statistical Computing, Vienna,
Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
[10] Xueyin W, Garrett S, Yonghua H et al. Genetic markers of type 2
diabetes: Progress in genome-wide association studies and clinical
application for risk prediction. Journal of Diabetes 8: 24–35 (2016).
[11] Pranavchand R, Reddy BM. Genomics era and complex disorders:
Implications of GWAS with special reference to coronary artery disease,
type 2 diabetes mellitus, and cancers. J Postgrad Med. 2016
Jul-Sep;62(3):188-98.
[12] Bergman RN, Ider YZ, Bowden CR, Cobelli C. Quantitative
estimation of insulin sensitivity. Am J Physiol Endocrinol Metab
Gastrointest Physiol 236: E667–E677, 1979.
[13] Finegood DT, Hramiak IM, Dupre J. A modified protocol for
estimation of insulin sensitivity with the minimal model of glucose
kinetics in patients with insulin-dependent diabetes. J Clin Endocrinol
Metab 70: 1538–1549, 1990.
[14] Sinnwell JP, Therneau TM, Schaid DJ. The kinship2 R package for
pedigree data. Hum Hered. 2014;78(2):91-3.
[15] Karaderi T., Drong A. W., Lindgren C. M. (2015). Insights into the
genetic susceptibility to type 2 diabetes from genome-wide association
studies of obesity-related traits. Curr. Diab. Rep. 15, 83.
[16] Kilpelainen TO, Zillikens MC, Stancakova A, et al. Genetic variation
near IRS1 associates with reduced adiposity and an impaired metabolic
profile. Nat Genet. 2011;43(8):753–60.
[17] Carey VJ, Walters EE, Colditz GA, et al. Body fat distribution and
risk of non-insulin-dependent diabetes mellitus in women. The Nurses’
Health Study. Am J Epidemiol. 1997;145(7):614–9.
[18] Wang Y, Rimm EB, Stampfer MJ, et al. Comparison of abdominal
adiposity and overall obesity in predicting risk of type 2 diabetes among
men. Am J Clin Nutr. 2005;81(3):555–63.
[19] Palmer ND, Goodarzi MO, Langefeld CD et al. Genetic Variants
Associated With Quantitative Glucose Homeostasis Traits Translate to
Type 2 Diabetes in Mexican Americans: The GUARDIAN (Genetics
Underlying Diabetes in Hispanics) Consortium. Diabetes 64, 1853–66
(2015).
[20] Bao W, Hu FB, Rong S et al. Predicting risk of type 2 diabetes
mellitus with genetic risk models on the basis of established
genome-wide association markers: A systematic review. Am J Epidemiol.
2013; 178: 1197–207.
[21] Janssens AC, Gwinn M, Bradley LA, et al. A critical appraisal of the
scientific basis of commercial genomic profiles used to assess health
risks and personalize health interventions. Am J Hum Genet
2008;82(3):593–599.
[22] Watanabe RM, Allayee H, Xiang AH, Trigo E, Hartiala J, Lawrence
JM, Buchanan TA. Transcription factor 7-like 2 (TCF7L2) is associated
with gestational diabetes mellitus and interacts with adiposity to alter
insulin secretion in Mexican Americans. Diabetes 56:1481–1485 (2007).
[23] Scott LJ, Mohlke KL, Bonnycastle LL et al. A genome-wide
association study of type 2 diabetes in Finns detects multiple
susceptibility variants. Science. 2007; 316: 1341–5.
[24] Zeggini E, Scott LJ, Saxena R et al. Meta-analysis of genome-wide
association data and large-scale replication identifies additional
susceptibility loci for type 2 diabetes. Nat Genet. 2008; 40: 638–45.
[25] Dupuis J, Langenberg C, Prokopenko I et al. New genetic loci
implicated in fasting glucose homeostasis and their impact on type 2
diabetes risk. Nat Genet. 2010; 42: 105–16.
[26] Rung J, Cauchi S, Albrechtsen A et al. Genetic variant near IRS1 is
associated with type 2 diabetes, insulin resistance and hyperinsulinemia.
Nat Genet. 2009; 41: 1110–5.
[27] Voight BF, Scott LJ, Steinthorsdottir V et al. Twelve type 2 diabetes
susceptibility loci identified through large-scale association analysis. Nat
Genet. 2010; 42: 579–89.
[28] Morris AP, Voight BF, Teslovich TM et al. Large-scale association
analysis provides insights into the genetic architecture and
pathophysiology of type 2 diabetes. Nat Genet. 2012; 44: 981–90.
[29] Mahajan A, Go MJ, Zhang W et al. Genome-wide trans-ancestry
meta-analysis provides insight into the genetic architecture of type 2
diabetes susceptibility. Nat Genet. 2014; 46: 234–44.
[30] Gudmundsson J, Sulem P, Steinthorsdottir V et al. Two variants on
chromosome 17 confer prostate cancer risk, and the one in TCF2 protects
against type 2 diabetes. Nat Genet. 2007; 39: 977–83.
[31] Saxena R, Elbers CC, Guo Y et al. Large-scale gene-centric
meta-analysis across 39 studies identifies type 2 diabetes loci. Am J Hum
Genet. 2012; 90: 410–25.
[32] Yamauchi T, Hara K, Maeda S et al. A genome-wide association
study in the Japanese population identifies susceptibility loci for type 2
diabetes at UBE2E2 and C2CD4A-C2CD4B. Nat Genet. 2010; 42: 864–8.
[33] Cho YS, Chen CH, Hu C et al. Meta-analysis of genome-wide
association studies identifies eight new loci for type 2 diabetes in east
Asians. Nat Genet. 2012; 44: 67–72.
[34] Ma RC, Hu C, Tam CH et al. Genome-wide association study in a
Chinese population identifies a susceptibility locus for type 2 diabetes at
7q32 near PAX4. Diabetologia. 2013; 56: 1291–305.
[35] Hara K, Fujita H, Johnson TA et al. Genome-wide association study
identifies three novel loci for type 2 diabetes. Hum Mol Genet. 2014; 23:
239–46.
[36] Imamura M, Maeda S, Yamauchi T et al. A single-nucleotide
polymorphism in ANK1 is associated with susceptibility to type 2
diabetes in Japanese populations. Hum Mol Genet. 2012; 21: 3042–9.
[37] Tsai FJ, Yang CF, Chen CC et al. A genome-wide association study
identifies susceptibility variants for type 2 diabetes in Han Chinese. PLoS
Genet. 2010; 6: e1000847.
[38] Li H, Gan W, Lu L et al. A genome-wide association study identifies
GRK5 and RASGRP1 as type 2 diabetes loci in Chinese Hans. Diabetes.
2013; 62: 291–8.
[39] Yasuda K, Miyake K, Horikawa Y et al. Variants in KCNQ1 are
associated with susceptibility to type 2 diabetes mellitus. Nat Genet. 2008;
40: 1092–7.
[40] Shu XO, Long J, Cai Q et al. Identification of new genetic risk
variants for type 2 diabetes. PLoS Genet. 2010; 6: e1001127.
[41] Kooner JS, Saleheen D, Sim X et al. Genome-wide association study
in individuals of South Asian ancestry identifies six new type 2 diabetes
susceptibility loci. Nat Genet. 2011; 43: 984–9.
[42] Tabassum R, Chauhan G, Dwivedi OP et al. Genome-wide
association study for type 2 diabetes in Indians identifies a new
susceptibility locus at 2q21. Diabetes. 2013; 62: 977–86.
[43] Palmer ND, McDonough CW, Hicks PJ et al. A genome-wide
association search for type 2 diabetes genes in African Americans. PLoS
ONE. 2012; 7: e29202.
[44] Williams AL, Jacobs SB, Moreno-Macias H et al. Sequence variants
in SLC16A11 are a common risk factor for type 2 diabetes in Mexico.
Nature. 2014; 506: 97–101.
[45] Sandhu MS, Weedon MN, Fawcett KA et al. Common variants in
WFS1 confer risk of type 2 diabetes. Nat Genet. 2007; 39: 951–3.
[46] Saxena R, Voight BF, Lyssenko V et al. Genome-wide association
analysis identifies loci for type 2 diabetes and triglyceride levels. Science.
2007; 316: 1331–6.
[47] Wigginton JE, Abecasis GR. PEDSTATS: descriptive statistics,
graphics and quality assessment for gene mapping data. Bioinformatics.
2005; 21: 3445–7.
7. Appendix
Supplementary Table 1 The multiple regression model of BMI
Value Std Error Z score P-value
(Intercept) -0.30812904 0.306836636 -1 3.20E-01
sex 0.10214164 0.062539564 1.63 1.00E-01
age 0.02106244 0.004038478 5.22 1.80E-07
rs243021 0.11286118 0.044195117 2.55 1.10E-02
rs6723108 -0.09950526 0.067632018 -1.47 1.40E-01
rs11708067 -0.11595879 0.047915771 -2.42 1.60E-02
rs4402960 -0.11439265 0.05037739 -2.27 2.30E-02
rs4457053 0.08477915 0.046994619 1.8 7.10E-02
rs1387153 0.10815186 0.05253761 2.06 4.00E-02
rs7961581 0.08920427 0.058160895 1.53 1.30E-01
rs11634397 -0.09732951 0.043911505 -2.22 2.70E-02
rs2028299 -0.09128733 0.057602204 -1.58 1.10E-01
rs9936385 0.08019903 0.05138526 1.56 1.20E-01
rs12454712 -0.08223069 0.046187162 -1.78 7.50E-02
rs10401969 -0.13844933 0.093593185 -1.48 1.40E-01
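The Z scores and P-values in these supplementary tables appear consistent with a Wald-type test, in which Z is the coefficient estimate divided by its standard error and the P-value is two-sided under a standard normal distribution. The tables do not state this explicitly, so the Python sketch below is a check under that assumption rather than the author's code; the function name `wald_z_and_p` is hypothetical.

```python
import math
from typing import Tuple


def wald_z_and_p(estimate: float, std_error: float) -> Tuple[float, float]:
    """Return (Z, two-sided P) assuming Z = estimate / SE is standard normal.

    This is an assumed reading of the table columns, not a documented method.
    """
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / std_error
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rows copied from Supplementary Table 1 (BMI model).
# Reported in the table: age Z = 5.22, P = 1.80E-07; rs243021 Z = 2.55, P = 1.10E-02.
z_age, p_age = wald_z_and_p(0.02106244, 0.004038478)
z_snp, p_snp = wald_z_and_p(0.11286118, 0.044195117)
print(f"age:      Z = {z_age:.2f}, P = {p_age:.2e}")
print(f"rs243021: Z = {z_snp:.2f}, P = {p_snp:.2e}")
```

Under this reading, a reported P-value of 0.00E+00 simply means the value was too small for the display format used in the tables.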
Supplementary Table 2 The multiple regression model of BAI
Value Std Error Z score P-value
(Intercept) -2.10547942 0.2020298 -10.42 0.00E+00
sex 1.17499574 0.053814932 21.83 0.00E+00
age 0.01558809 0.003406785 4.58 4.70E-06
rs243021 0.07927367 0.037286649 2.13 3.30E-02
rs11708067 -0.10873288 0.040529217 -2.68 7.30E-03
rs4402960 -0.13022138 0.042479653 -3.07 2.20E-03
rs4607103 -0.06826488 0.038665588 -1.77 7.70E-02
rs4457053 0.07215764 0.039519713 1.83 6.80E-02
rs2796441 -0.05756213 0.036796425 -1.56 1.20E-01
rs1387153 0.11028874 0.044355732 2.49 1.30E-02
rs2237892 -0.07774439 0.040625414 -1.91 5.60E-02
rs11634397 -0.08595863 0.037066025 -2.32 2.00E-02
rs2028299 -0.08405927 0.048520294 -1.73 8.30E-02
rs7403531 0.06644774 0.038934987 1.71 8.80E-02
rs12454712 -0.09002894 0.0388896 -2.31 2.10E-02
Supplementary Table 3 The multiple regression model of waist-hip-ratio
Value Std Error Z score P-value
(Intercept) 0.75686896 0.290094269 2.61 9.10E-03
sex -1.05257704 0.061161779 -17.21 0.00E+00
age 0.02212931 0.003564504 6.21 5.40E-10
rs243021 0.10752532 0.03915108 2.75 6.00E-03
rs780094 -0.10033513 0.042748691 -2.35 1.90E-02
rs7612463 0.14754347 0.082415134 1.79 7.30E-02
rs4607517 0.12287949 0.052848234 2.33 2.00E-02
rs2796441 -0.09438203 0.03890893 -2.43 1.50E-02
rs10830963 0.07604088 0.046332818 1.64 1.00E-01
rs4275659 -0.06116587 0.039787504 -1.54 1.20E-01
rs11634397 -0.07256773 0.039078049 -1.86 6.30E-02
rs10401969 0.13407915 0.083592088 1.6 1.10E-01
rs8108269 -0.06786603 0.040640926 -1.67 9.50E-02
Supplementary Table 4 The multiple regression model of body fat percentage
Value Std Error Z score P-value
(Intercept) -2.4427674 0.20227606 -12.08 0.00E+00
sex 1.46346138 0.047340437 30.91 0.00E+00
age 0.01316033 0.003011642 4.37 1.20E-05
rs243021 0.0563642 0.033002735 1.71 8.80E-02
rs6723108 -0.09445334 0.05055624 -1.87 6.20E-02
rs11708067 -0.05915947 0.03582827 -1.65 9.90E-02
rs4402960 -0.12430611 0.037580904 -3.31 9.40E-04
rs4607103 -0.07093203 0.034105039 -2.08 3.80E-02
rs4457053 0.05296051 0.035058962 1.51 1.30E-01
rs972283 -0.05219562 0.033761737 -1.55 1.20E-01
rs10830963 0.08945205 0.038982857 2.29 2.20E-02
rs2237892 -0.07389086 0.035966067 -2.05 4.00E-02
rs7961581 0.06493962 0.043616785 1.49 1.40E-01
rs11634397 -0.07013161 0.032746564 -2.14 3.20E-02
rs12454712 -0.06644959 0.034380549 -1.93 5.30E-02
rs12970134 0.0826106 0.047222668 1.75 8.00E-02
Supplementary Table 5 The multiple regression model of cholesterol level
Value Std Error Z score P-value
(Intercept) -1.12443648 0.282091874 -3.99 6.70E-05
sex -0.36082425 0.062642995 -5.76 8.40E-09
age 0.03854997 0.003883541 9.93 0.00E+00
rs780094 -0.1434188 0.04587097 -3.13 1.80E-03
rs4402960 -0.07975576 0.048645419 -1.64 1.00E-01
rs972283 0.10030066 0.043512816 2.31 2.10E-02
rs10830963 0.1375836 0.050009175 2.75 5.90E-03
rs2237892 0.07903378 0.04694982 1.68 9.20E-02
rs2334499 -0.10459445 0.043411069 -2.41 1.60E-02
rs4275659 -0.06256483 0.043221847 -1.45 1.50E-01
rs8050136 0.08763372 0.049755753 1.76 7.80E-02
rs13342692 -0.08924713 0.047118662 -1.89 5.80E-02
rs10401969 0.16134306 0.089877342 1.8 7.30E-02
rs3786897 0.12521513 0.049087635 2.55 1.10E-02
Supplementary Table 6 The multiple regression model of HDL level
Value Std Error Z score P-value
(Intercept) -0.715489113 0.330850788 -2.16 0.031
sex 0.669834071 0.063323757 10.58 0
age 0.002516839 0.003918288 0.64 0.52
rs243021 -0.136076577 0.043122797 -3.16 0.0016
rs2943641 -0.092246738 0.056109161 -1.64 0.1
rs11708067 0.089065353 0.046569957 1.91 0.056
rs702634 -0.13841987 0.056229693 -2.46 0.014
rs3130501 -0.105449419 0.051497302 -2.05 0.041
rs9505118 0.081861735 0.045677529 1.79 0.073
rs972283 -0.099643959 0.044211546 -2.25 0.024
rs2796441 0.122811644 0.042248775 2.91 0.0037
rs7903146 0.108779637 0.051830377 2.1 0.036
rs1552224 -0.173641032 0.100151258 -1.73 0.083
rs4430796 -0.065170503 0.043754508 -1.49 0.14
rs12454712 0.096920769 0.044973527 2.16 0.031
rs3786897 0.091444528 0.049705115 1.84 0.066
Supplementary Table 7 The multiple regression model of LDL level
Value Std Error Z score P-value
(Intercept) -0.74623365 0.33691855 -2.21 2.70E-02
sex -0.42264648 0.063815104 -6.62 3.50E-11
age 0.03077328 0.003952855 7.79 7.00E-15
rs2943641 0.09670554 0.056002528 1.73 8.40E-02
rs7578597 -0.1967002 0.089770734 -2.19 2.80E-02
rs1801282 0.10573284 0.066335533 1.59 1.10E-01
rs972283 0.13792653 0.04432011 3.11 1.90E-03
rs10830963 0.12503519 0.050922027 2.46 1.40E-02
rs2334499 -0.1117472 0.043432758 -2.57 1.00E-02
rs4275659 -0.07705783 0.043942079 -1.75 7.90E-02
rs7202877 0.1287914 0.081383916 1.58 1.10E-01
rs8050136 0.09246266 0.050884937 1.82 6.90E-02
rs13342692 -0.09685739 0.047878532 -2.02 4.30E-02
rs3786897 0.09878009 0.049871236 1.98 4.80E-02
Supplementary Table 8 The multiple regression model of triglycerides level
Value Std Error Z score P-value
(Intercept) -0.48986159 0.324214954 -1.51 1.30E-01
sex -0.46843921 0.062628542 -7.48 7.40E-14
age 0.03060916 0.003925551 7.8 6.30E-15
rs243021 0.07049075 0.042805732 1.65 1.00E-01
rs780094 -0.18162823 0.046089222 -3.94 8.10E-05
rs3130501 0.0744594 0.051305506 1.45 1.50E-01
rs9505118 -0.1021457 0.045403692 -2.25 2.40E-02
rs2796441 -0.08092489 0.04243638 -1.91 5.70E-02
rs2334499 -0.08171048 0.043067731 -1.9 5.80E-02
rs10401969 0.16854734 0.091045817 1.85 6.40E-02
rs7578597 0.12824593 0.088829975 1.44 1.50E-01
Supplementary Table 9 The multiple regression model of ALT
Value Std Error Z score P-value
(Intercept) 1.492733244 0.246885999 6.05 1.50E-09
sex -0.861129937 0.063345037 -13.59 0.00E+00
age 0.009311774 0.003794523 2.45 1.40E-02
rs2943641 0.08682451 0.05411476 1.6 1.10E-01
rs4402960 -0.124279155 0.047495832 -2.62 8.90E-03
rs4607517 -0.098684848 0.05603142 -1.76 7.80E-02
rs791595 0.139325008 0.060837041 2.29 2.20E-02
rs2796441 -0.062695709 0.041223136 -1.52 1.30E-01
rs10830963 0.09908834 0.049146284 2.02 4.40E-02
rs7202877 -0.164847982 0.0787589 -2.09 3.60E-02
rs5945326 -0.095788519 0.036830432 -2.6 9.30E-03
Supplementary Table 10 The multiple regression model of AST
Value Std Error Z score P-value
(Intercept) 0.22504058 0.320357261 0.7 0.48
sex -0.70658202 0.065423517 -10.8 0
age 0.01287133 0.003908217 3.29 0.00099
rs243021 0.06432882 0.042712901 1.51 0.13
rs780094 -0.08697053 0.046453787 -1.87 0.061
rs1801282 0.09339946 0.06536258 1.43 0.15
rs4402960 -0.08428455 0.049003366 -1.72 0.085
rs4607103 0.07144343 0.044724842 1.6 0.11
rs6780569 0.1254065 0.083583908 1.5 0.13
rs702634 0.0872884 0.055238911 1.58 0.11
rs9505118 0.06684998 0.045734281 1.46 0.14
rs791595 0.09954729 0.062465206 1.59 0.11
rs2237892 0.08835683 0.046394905 1.9 0.057
rs4275659 -0.0767584 0.043444152 -1.77 0.077
rs11634397 -0.08944533 0.042629939 -2.1 0.036
rs9936385 -0.09275481 0.049488493 -1.87 0.061
Supplementary Table 11 The multiple regression model of fasting glucose
Value Std Error Z score P-value
(Intercept) -1.10268005 0.32904623 -3.35 8.00E-04
sex -0.26578608 0.065710882 -4.04 5.20E-05
age 0.01980607 0.004021637 4.92 8.40E-07
rs243021 0.08857595 0.044343885 2 4.60E-02
rs2943641 0.12463528 0.057626735 2.16 3.10E-02
rs6780569 0.16046842 0.085592303 1.87 6.10E-02
rs702634 0.09033038 0.057559567 1.57 1.20E-01
rs2796441 -0.07267537 0.043467743 -1.67 9.50E-02
rs10830963 0.14993615 0.051966433 2.89 3.90E-03
rs11063069 0.102217 0.070205061 1.46 1.50E-01
rs7403531 0.1588057 0.046041271 3.45 5.60E-04
rs7202877 -0.12924291 0.082906655 -1.56 1.20E-01
rs3786897 0.09320458 0.05105633 1.83 6.80E-02
rs5945326 0.06624846 0.038819056 1.71 8.80E-02
Supplementary Table 12 The multiple regression model of 30-min glucose
Value Std Error Z score P-value
(Intercept) -0.89008448 0.328424456 -2.71 6.70E-03
sex -0.36172801 0.067588675 -5.35 8.70E-08
age 0.01364231 0.004040102 3.38 7.30E-04
rs2943641 0.10742899 0.057657568 1.86 6.20E-02
rs1801282 0.16130622 0.067972842 2.37 1.80E-02
rs13266634 0.08503647 0.050030955 1.7 8.90E-02
rs2796441 -0.08255048 0.043929296 -1.88 6.00E-02
rs10830963 0.21198978 0.09917446 2.14 3.30E-02
rs1387153 -0.16042265 0.100345398 -1.6 1.10E-01
rs1552224 0.14901023 0.102894444 1.45 1.50E-01
rs11063069 0.10581945 0.07140866 1.48 1.40E-01
rs7403531 0.11191914 0.046402393 2.41 1.60E-02
rs8042680 0.07260254 0.048093912 1.51 1.30E-01
Supplementary Table 13 The multiple regression model of 120-min glucose
Value Std Error Z score P-value
(Intercept) -2.17987367 0.256598384 -8.5 0.00E+00
sex 0.21986894 0.066334811 3.31 9.20E-04
age 0.03112765 0.003893662 7.99 1.30E-15
rs11708067 0.14281303 0.046734825 3.06 2.20E-03
rs1801282 0.10821737 0.065718352 1.65 1.00E-01
rs4457053 0.06770331 0.045340772 1.49 1.40E-01
rs702634 0.19843814 0.055335006 3.59 3.40E-04
rs10886471 -0.10464183 0.044233696 -2.37 1.80E-02
rs7903146 0.11591387 0.051093867 2.27 2.30E-02
rs231362 -0.09733152 0.046606737 -2.09 3.70E-02
rs7403531 0.08351551 0.044769496 1.87 6.20E-02
rs4430796 0.06678988 0.044130674 1.51 1.30E-01
Supplementary Table 14 The multiple regression model of glucose effectiveness
Value Std Error Z score P-value
(Intercept) 0.97364165 0.287646654 3.38 7.10E-04
sex -0.0943275 0.068486647 -1.38 1.70E-01
age -0.01603343 0.003888443 -4.12 3.70E-05
rs243021 -0.09014261 0.042481122 -2.12 3.40E-02
rs6723108 0.12875091 0.066776109 1.93 5.40E-02
rs780094 -0.11646902 0.047082661 -2.47 1.30E-02
rs6808574 -0.09807231 0.056094594 -1.75 8.00E-02
rs10010131 0.07628065 0.0483953 1.58 1.10E-01
rs3130501 0.08408228 0.050983364 1.65 9.90E-02
rs2796441 0.06879466 0.042569434 1.62 1.10E-01
rs10830963 -0.28903201 0.096133152 -3.01 2.60E-03
rs1387153 0.16548293 0.097885101 1.69 9.10E-02
rs5215 -1.49071282 0.481394984 -3.1 2.00E-03
rs5219 1.36395165 0.482458319 2.83 4.70E-03
rs7403531 -0.08463688 0.044770513 -1.89 5.90E-02
rs8050136 0.49179511 0.285525952 1.72 8.50E-02
rs9936385 -0.51321436 0.283839508 -1.81 7.10E-02
rs4430796 -0.09712444 0.044433616 -2.19 2.90E-02
rs5945326 -0.0552213 0.038393971 -1.44 1.50E-01
Supplementary Table 15 The multiple regression model of fasting insulin
Value Std Error Z score P-value
(Intercept) 0.516738332 0.402050547 1.29 0.2
sex 0.092116636 0.064736553 1.42 0.15
age -0.001380392 0.003964034 -0.35 0.73
rs243021 0.138434507 0.04355942 3.18 0.0015
rs2943641 0.177304301 0.056682057 3.13 0.0018
rs6723108 -0.112561213 0.067058416 -1.68 0.093
rs11708067 -0.124126715 0.047235834 -2.63 0.0086
rs7612463 0.13960907 0.091597895 1.52 0.13
rs3130501 0.137941466 0.052323635 2.64 0.0084
rs7754840 -0.12669908 0.046484697 -2.73 0.0064
rs9505118 -0.092482897 0.04636287 -1.99 0.046
rs2796441 -0.081906815 0.042979479 -1.91 0.057
rs7903146 -0.084757452 0.05227922 -1.62 0.1
rs1552224 -0.157719043 0.101440943 -1.55 0.12
rs11634397 -0.09105359 0.043561044 -2.09 0.037
rs7202877 -0.297037413 0.082450587 -3.6 0.00032
rs9936385 0.095410268 0.050535214 1.89 0.059
rs4457053 0.065273283 0.04587548 1.42 0.15
Supplementary Table 16 The multiple regression model of 30-min insulin
Value Std Error Z score P-value
(Intercept) 2.0366787 0.321648685 6.33 2.40E-10
sex -0.06253905 0.062716025 -1 3.20E-01
age -0.01697512 0.003918404 -4.33 1.50E-05
rs243021 0.10357693 0.043241019 2.4 1.70E-02
rs2943641 0.10325113 0.055977515 1.84 6.50E-02
rs780094 0.07331642 0.046273606 1.58 1.10E-01
rs11708067 -0.16789418 0.04662301 -3.6 3.20E-04
rs4402960 -0.09935573 0.049283263 -2.02 4.40E-02
rs6808574 -0.08654328 0.056845338 -1.52 1.30E-01
rs7754840 -0.18546497 0.04596117 -4.04 5.50E-05
rs10229583 -0.1256472 0.059065684 -2.13 3.30E-02
rs972283 0.08293786 0.044058741 1.88 6.00E-02
rs7903146 -0.15148886 0.051790182 -2.93 3.40E-03
rs2237892 -0.07644725 0.046891909 -1.63 1.00E-01
rs7961581 0.1099626 0.057247779 1.92 5.50E-02
rs11634397 -0.06947202 0.04308208 -1.61 1.10E-01
rs2028299 -0.18161487 0.055955183 -3.25 1.20E-03
rs7202877 -0.32089452 0.080241932 -4 6.40E-05
rs12454712 -0.10139546 0.044828035 -2.26 2.40E-02
rs5945326 -0.06090494 0.038108031 -1.6 1.10E-01
Supplementary Table 17 The multiple regression model of 120-min insulin
Value Std Error Z score P-value
(Intercept) -1.45753718 0.367453444 -3.97 7.30E-05
sex 0.36883107 0.066975021 5.51 3.70E-08
age 0.01101524 0.003958016 2.78 5.40E-03
rs243021 0.09858756 0.04355732 2.26 2.40E-02
rs2943641 0.12415938 0.057091981 2.17 3.00E-02
rs4402960 -0.08887346 0.049642079 -1.79 7.30E-02
rs4607103 0.10577608 0.045575427 2.32 2.00E-02
rs7612463 0.18898597 0.091337392 2.07 3.90E-02
rs702634 0.11574625 0.056258488 2.06 4.00E-02
rs3130501 0.12150842 0.052001557 2.34 1.90E-02
rs7754840 -0.08602345 0.046776754 -1.84 6.60E-02
rs9505118 -0.07134016 0.046767982 -1.53 1.30E-01
rs10886471 -0.10219302 0.044708018 -2.29 2.20E-02
rs1802295 0.10571066 0.050899076 2.08 3.80E-02
rs10830963 0.12423413 0.051469448 2.41 1.60E-02
rs1552224 -0.1471042 0.101070354 -1.46 1.50E-01
rs231362 -0.08115092 0.047202634 -1.72 8.60E-02
rs11634397 -0.09628403 0.043546243 -2.21 2.70E-02
rs9936385 0.08597176 0.050270995 1.71 8.70E-02
rs12454712 -0.09024593 0.045632838 -1.98 4.80E-02
Supplementary Table 18 The multiple regression model of insulin sensitivity
Value Std Error Z score P-value
(Intercept) -0.304103519 0.339334957 -0.9 0.37
sex 0.135627823 0.068253207 1.99 0.047
age -0.008845129 0.004063605 -2.18 0.03
rs243021 -0.125821825 0.044614524 -2.82 0.0048
rs6723108 0.136545697 0.069325974 1.97 0.049
rs11708067 0.131667615 0.048457597 2.72 0.0066
rs4402960 0.077043048 0.051310574 1.5 0.13
rs7612463 -0.142096595 0.094255291 -1.51 0.13
rs7754840 0.131028467 0.048114986 2.72 0.0065
rs972283 -0.084846245 0.045788078 -1.85 0.064
rs7903146 0.089824409 0.053430127 1.68 0.093
rs5215 -0.07643055 0.045865358 -1.67 0.096
rs7961581 -0.125806554 0.05946101 -2.12 0.034
rs7202877 0.218608545 0.084695841 2.58 0.0098
Supplementary Table 19 The multiple regression model of acute insulin response to glucose
Value Std Error Z score P-value
(Intercept) 3.71693241 0.330058311 11.26 0.00E+00
sex -0.15829372 0.063136535 -2.51 1.20E-02
age -0.0240666 0.003807093 -6.32 2.60E-10
rs11708067 -0.2020994 0.045762583 -4.42 1.00E-05
rs4402960 -0.09348282 0.048181385 -1.94 5.20E-02
rs4457053 -0.08946303 0.044498569 -2.01 4.40E-02
rs7754840 -0.14056489 0.045126558 -3.11 1.80E-03
rs13266634 -0.12595913 0.047409795 -2.66 7.90E-03
rs17584499 -0.08191157 0.049473851 -1.66 9.80E-02
rs7903146 -0.15937796 0.050414169 -3.16 1.60E-03
rs10830963 -0.30809627 0.093681874 -3.29 1.00E-03
rs1387153 0.14791047 0.094478537 1.57 1.20E-01
rs1552224 -0.33881396 0.097252048 -3.48 4.90E-04
rs2237892 -0.16417894 0.046777592 -3.51 4.50E-04
rs2334499 -0.07332972 0.042970337 -1.71 8.80E-02
rs2028299 -0.19581034 0.055134536 -3.55 3.80E-04
rs8042680 -0.08607675 0.045819457 -1.88 6.00E-02
rs7202877 -0.20364071 0.079307722 -2.57 1.00E-02
rs4430796 -0.10696997 0.043233609 -2.47 1.30E-02
rs5945326 -0.12368738 0.037120895 -3.33 8.60E-04
Supplementary Table 20 The multiple regression model of DI
Value Std Error Z score P-value
(Intercept) 3.25803034 0.347764998 9.37 0.00E+00
sex -0.0701925 0.06482591 -1.08 2.80E-01
age -0.03253048 0.003771231 -8.63 0.00E+00
rs11708067 -0.15799103 0.04551295 -3.47 5.20E-04
rs4457053 -0.09522703 0.044148344 -2.16 3.10E-02
rs702634 -0.11507376 0.05390235 -2.13 3.30E-02
rs10886471 0.07876801 0.043115024 1.83 6.80E-02
rs7903146 -0.10315162 0.049695626 -2.08 3.80E-02
rs10830963 -0.37555128 0.093685697 -4.01 6.10E-05
rs1387153 0.17342345 0.094780378 1.83 6.70E-02
rs1552224 -0.28482064 0.096342062 -2.96 3.10E-03
rs2237892 -0.19701082 0.04546528 -4.33 1.50E-05
rs5215 -0.08815434 0.042819725 -2.06 4.00E-02
rs2028299 -0.14112274 0.054352497 -2.6 9.40E-03
rs7403531 -0.08363936 0.04347276 -1.92 5.40E-02
rs4430796 -0.09049905 0.042982646 -2.11 3.50E-02
rs10401969 -0.14057256 0.088939584 -1.58 1.10E-01
rs5945326 -0.11886768 0.037146554 -3.2 1.40E-03
Supplementary Table 21 The multiple regression model of DI30
Value Std Error Z score P-value
(Intercept) 1.83081605 0.273826427 6.69 2.30E-11
sex 0.02346145 0.067059385 0.35 7.30E-01
age -0.02586542 0.003978895 -6.5 8.00E-11
rs11708067 -0.09037069 0.047677338 -1.9 5.80E-02
rs6808574 -0.10163682 0.057690219 -1.76 7.80E-02
rs702634 -0.08435984 0.057037171 -1.48 1.40E-01
rs7754840 -0.07245214 0.04713336 -1.54 1.20E-01
rs10229583 -0.1213632 0.060342417 -2.01 4.40E-02
rs7903146 -0.11500654 0.052212467 -2.2 2.80E-02
rs2237892 -0.11258486 0.047696739 -2.36 1.80E-02
rs5219 -0.11691204 0.045414584 -2.57 1.00E-02
rs2028299 -0.10474969 0.057813574 -1.81 7.00E-02
rs8042680 0.10461263 0.048194153 2.17 3.00E-02
rs12454712 -0.07287745 0.046354276 -1.57 1.20E-01
Supplementary Table 22 The multiple regression model of insulin clearance rate
Value Std Error Z score P-value
(Intercept) -0.252871774 0.360430282 -0.7 0.48
sex 0.092770059 0.068054487 1.36 0.17
age -0.006732055 0.004015127 -1.68 0.094
rs243021 -0.114844427 0.044328122 -2.59 0.0096
rs2943641 -0.130750115 0.057908534 -2.26 0.024
rs6723108 0.145932005 0.068444313 2.13 0.033
rs4402960 0.11639659 0.050546826 2.3 0.021
rs7612463 -0.182280315 0.092889364 -1.96 0.05
rs3130501 -0.097896892 0.052747182 -1.86 0.063
rs7754840 0.175442116 0.04728009 3.71 0.00021
rs9505118 0.105916219 0.047338657 2.24 0.025
rs4607517 0.120515218 0.059405674 2.03 0.042
rs231362 0.095704428 0.047762569 2 0.045
rs7961581 -0.125826818 0.05883943 -2.14 0.032
rs11634397 0.06682543 0.044004083 1.52 0.13
rs7202877 0.162032754 0.083994149 1.93 0.054
rs8108269 0.088041031 0.045641027 1.93 0.054
Supplementary Table 23 The multiple regression model of systolic blood pressure
Value Std Error Z score P-value
(Intercept) 1.83083243 0.261291597 7.01 2.40E-12
sex -0.93029218 0.062612455 -14.86 0.00E+00
age 0.01807071 0.003694288 4.89 1.00E-06
rs243021 -0.08342262 0.040581159 -2.06 4.00E-02
rs6723108 -0.12911664 0.063121196 -2.05 4.10E-02
rs11708067 -0.08773304 0.044185623 -1.99 4.70E-02
rs6808574 0.0760579 0.053456581 1.42 1.50E-01
rs972283 -0.06393951 0.041543704 -1.54 1.20E-01
rs1802295 -0.0745839 0.047675461 -1.56 1.20E-01
rs10830963 0.07313227 0.048096853 1.52 1.30E-01
rs2334499 -0.07756089 0.040983805 -1.89 5.80E-02
rs4275659 -0.07620803 0.041329206 -1.84 6.50E-02
rs2028299 -0.19801893 0.053441904 -3.71 2.10E-04
rs8042680 -0.14540676 0.044772207 -3.25 1.20E-03
Supplementary Table 24 The multiple regression model of diastolic blood pressure
Value Std Error Z score P-value
(Intercept) 1.37175176 0.308783561 4.44 8.90E-06
sex -0.77940614 0.063914344 -12.19 0.00E+00
age 0.02439115 0.003734566 6.53 6.50E-11
rs2943641 -0.07961055 0.053778204 -1.48 1.40E-01
rs6723108 -0.11600442 0.064069418 -1.81 7.00E-02
rs780094 -0.0962468 0.044929949 -2.14 3.20E-02
rs11708067 -0.06895575 0.044842377 -1.54 1.20E-01
rs3130501 0.13412084 0.0494529 2.71 6.70E-03
rs972283 -0.06480614 0.042084496 -1.54 1.20E-01
rs7903146 0.0860753 0.04905371 1.75 7.90E-02
rs10830963 0.08364902 0.048607082 1.72 8.50E-02
rs2028299 -0.13523154 0.05410511 -2.5 1.20E-02
rs8042680 -0.06697261 0.045080776 -1.49 1.40E-01
rs7202877 -0.17256457 0.078431068 -2.2 2.80E-02
rs12454712 -0.06165916 0.043401215 -1.42 1.60E-01
rs3786897 -0.069392 0.047529219 -1.46 1.40E-01
rs8108269 0.09504203 0.042596475 2.23 2.60E-02
rs5945326 0.05471733 0.036633932 1.49 1.40E-01
Supplementary Figure 1 Venn diagram of the intersection of SNPs
associated with different lipid-related traits.
Supplementary Figure 2 Venn diagram of the intersection of SNPs
associated with different glucose-level-related traits.
Supplementary Figure 3 Venn diagram of the intersection of SNPs
associated with different insulin-level-related traits.
Supplementary Figure 4 Venn diagram of the intersection of SNPs
associated with different insulin-activity-related traits.
Supplementary Figure 5 Venn diagram of the intersection of SNPs
associated with different liver-enzyme-related traits.
Supplementary Figure 6 Venn diagram of the intersection of SNPs
associated with different blood-pressure-related traits.
Abstract
Background: Genome-wide association studies (GWAS) have identified genetic variation underlying complex diseases such as type 2 diabetes (T2D). The genetic risk score (GRS), the total number of risk variants carried by an individual, is widely used to test for association between genetic variation and disease risk or disease-related quantitative traits (QTs). However, a GRS built for a complex disease may be of limited use for the traits related to that disease. This study used T2D as an example of a complex disease and compared the performance of a T2D GRS with that of traditional multiple regression in the analysis of T2D-related QTs.
Methods: A total of 1044 subjects from the BetaGene study were analyzed. BetaGene consists of Mexican American families of probands with or without previous gestational diabetes mellitus (GDM). Fifty-six T2D risk variants identified in previously published GWAS were included in the analysis. A GRS was created from the 56 T2D-associated SNPs and tested for association with T2D-related QTs using a linear mixed-effects model that incorporated the kinship matrix to adjust for the correlation between family members. In addition, stepwise model selection was used to obtain, for each QT, a multiple regression model of T2D risk SNPs. The GRS and the stepwise-selected models were compared in terms of the trait variation explained and the significance of association.
Results: For both analyses, a nominal p<0.05 was used to define statistical significance. The GRS was associated with 8 of the 24 T2D-related QTs tested and explained 0.01-3% of the trait variation. In contrast, the stepwise regression model for each QT retained 8-17 SNPs, was associated with the trait, and explained 3-13% of the trait variation. Furthermore, in cases where both the GRS and stepwise regression showed evidence of association, the significance of association was stronger for stepwise regression.
Conclusion: These results suggest that the GRS may not be an appropriate tool for assessing association between disease risk variants and disease-related QTs. Because it cannot capture the individual associations between risk variants and QTs, the GRS may lead to misinterpretation of the physiologic implications of disease risk variants. We conclude that the GRS should not be used to test association between disease risk variants and disease-related QTs, because risk variants have pleiotropic and heterogeneous effects on each disease-related trait.
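The abstract defines the GRS as the total number of risk variants an individual carries and attributes its weak performance to heterogeneous SNP effects on each trait. The Python sketch below illustrates that mechanism on constructed toy data, not on BetaGene data: two hypothetical SNPs have equal and opposite effects on a trait, so summing their risk alleles cancels the signal. The function names and the toy genotypes are assumptions for illustration, and the sketch uses ordinary least squares rather than the kinship-adjusted linear mixed-effects model used in the thesis.

```python
from typing import Sequence


def genetic_risk_score(risk_allele_counts: Sequence[int]) -> int:
    """Unweighted GRS: total risk alleles carried, each SNP coded 0, 1, or 2."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must be coded as 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of variance in y explained by simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


# Hypothetical toy data: every genotype combination of two SNPs, one subject each.
genotypes = [(g1, g2) for g1 in (0, 1, 2) for g2 in (0, 1, 2)]
# Assumed effects: SNP 1 raises the trait, SNP 2 lowers it by the same amount.
trait = [0.5 * g1 - 0.5 * g2 for g1, g2 in genotypes]

grs = [genetic_risk_score(g) for g in genotypes]
snp1 = [g[0] for g in genotypes]

r2_grs = r_squared(grs, trait)    # opposite effects cancel in the summed score
r2_snp1 = r_squared(snp1, trait)  # a single SNP still explains half the variance
print(r2_grs, r2_snp1)            # prints: 0.0 0.5
```

In this constructed case a model with both SNPs as separate predictors would explain all of the trait variance while the GRS explains none, which is an extreme version of the pattern the results describe.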
Asset Metadata
Creator: Chen, Zhu (author)
Core Title: Shortcomings of the genetic risk score in the analysis of disease-related quantitative traits
School: Keck School of Medicine
Degree: Master of Science
Degree Program: Biostatistics
Publication Date: 02/09/2017
Defense Date: 12/17/2016
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: cogenetic risk score, complex disease, OAI-PMH Harvest, quantitative traits, T2DM
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Watanabe, Richard (committee chair), Mack, Wendy (committee member), Stern, Mariana (committee member)
Creator Email: zhuchen@usc.edu, zhuchen87@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c40-330675
Unique identifier: UC11258217
Identifier: etd-ChenZhu-5013.pdf (filename), usctheses-c40-330675 (legacy record id)
Legacy Identifier: etd-ChenZhu-5013.pdf
Dmrecord: 330675
Document Type: Thesis
Rights: Chen, Zhu
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA