STATISTICAL ALGORITHMS FOR EXAMINING GENE AND ENVIRONMENTAL
INFLUENCES ON HUMAN AGING
by
Morgan Elyse Levine
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(GERONTOLOGY)
May 2015
Copyright 2014 Morgan Elyse Levine
Dedication
To my husband, Zachary.
Acknowledgements
I would like to express my deepest thanks and gratitude to my adviser and mentor Eileen
Crimmins for her continued support and guidance. You have served as an amazing role-model,
providing me with an example of the type of researcher I aspire to become—esteemed, devoted, and a
true leader in the field. I can’t express how appreciative I am for all the hours you spent diligently
supervising and providing input and guidance for my projects and manuscripts; going above and
beyond to seek out opportunities for me; and encouraging and supporting my ability to attend meetings
all over the world. Your diligent guidance has undoubtedly started me on the path towards becoming
a true multidisciplinary scientist.
I would also like to express my gratitude to my other committee members and mentors—Tuck
Finch, Hassy Cohen, Edward Schneider, and Jennifer Ailshire. Tuck, your endless support and teaching
has opened my mind to topics and research questions that I never knew existed. Our discussions over
lunch have truly shaped many of the questions I tried to answer while at USC, as well as questions I
hope to pursue in the years to come. There is no doubt in my mind that what I have learned from you
has truly influenced the way I approach my work and how I think about the aging process. I feel so
lucky to have had the opportunity to collaborate with one of the greatest minds in the field of
biogerontology. Hassy, words can’t express how thankful I am for all the time and guidance you have
provided to me. You have always listened and encouraged me, helped me shape my ideas, and reliably
guided me in the right direction. Your vision for the school and the future of the field are truly
inspiring, and I want to thank you for including me in the amazing opportunities that you’re responsible
for developing. My ideas have benefited so much from having you as a mentor. Ed, thank you for your
nearly three decades of support. You’re one of the reasons I attended USC as an undergraduate, and
your impact and leadership in the field inspired me to become a gerontologist. Researchers, including
me, owe so much to your unique vision and forward-thinking, which was responsible for putting the
Davis school and the field of gerontology on the map. Jennifer, thank you for always being available
to answer any question I walked through your door with, for being both a friend and advisor, and for
always offering assistance and support, whether with networking, editing papers, organizing
student meetings, writing code, putting together grant applications, or preparing for presentations. I
am very grateful to have had the opportunity to work with and learn from you.
There are also many other people I would like to thank—Maria Henke, for your enduring
dedication and constant support of the program and students at the Davis school; Linda Hall, Linda
Broder and May Ng for always offering to help me with any problem I had; to Valter Longo, Susan
Enguidanos, and to all the other co-authors who contributed to my various projects; to the National
Institute on Aging for my funding over four years, and to the USC High Performance Computing
Cluster, which enabled me to carry out the work for my dissertation.
I would also like to give special thanks to my family. To my mother, Dr. Kate Wilber, who
inspires me every day with her brilliance and dedication, and who is the reason that I want to pursue
a career in academia. To my father, who has always offered the best advice, and has taught me the
importance of loving life and finding joy in everything I do. To my sister and best friend, Gabby,
whose creativity, kindness, and passion continue to inspire me. And finally, to my husband,
Zachary—my biggest fan, my collaborator, my teacher, and the love of my life. I am so grateful to
have someone who both supports and challenges me; who will be my partner as we discover what
life is about—both scientifically and personally. I am so excited to discover what our future holds,
and I know that as long as I have you by my side, it’s going to be a beautiful ride.
Table of Contents
Dedication
Acknowledgements
Abstract
Chapter 1: Introduction
1.1 Aging as a Conserved Biological Process
1.2 Biological Theories of Aging
1.3 Gene and Environment
1.4 Modeling Biological Aging and Its Causes
Chapter 2: Modeling the Rate of Senescence
2.1 Introduction
2.2 Materials and Methods
2.2.1 Study Population
2.2.2 Selection of Biomarkers
2.2.3 BA Estimates
2.2.4 Mortality
2.2.5 Validation and Comparisons of BA Algorithms
2.3 Results
2.3.1 Sample Characteristics
2.3.2 Algorithm Results
2.3.3 BA Estimates and Mortality
2.4 Discussion
Chapter 3: Evidence of Accelerated Aging among African Americans and its Implications for Mortality
3.1 Introduction
3.2 Materials and Methods
3.2.1 Study Population
3.2.2 Biological Age Measure
3.2.3 Sociodemographic Characteristics
3.2.4 Mortality
3.2.5 Statistical Analysis
3.3 Results
3.3.1 Sample Description
3.3.2 Biological Age by Race
3.3.3 Mortality Disparities and Biological Aging
3.4 Discussion
Chapter 4: Is 60 the New 50? Examining Changes in Biological Aging over the Past Two Decades
4.1 Introduction
4.2 Materials and Method
4.2.1 Study Population
4.2.2 Biological Age Measure
4.2.3 Behavioral Characteristics
4.2.4 Medications
4.2.5 Sociodemographic Characteristics
4.2.6 Statistical Analysis
4.3 Results
4.3.1 Sample Description
4.3.2 Period Differences in Biological Age
4.3.3 Smoking, Obesity, and Biological Aging
4.3.4 Changes in BMI and Smoking Explain Changes in Biological Age
4.3.5 Medication Use and Changes in Biological Age
4.4 Discussion
Chapter 5: Not All Smokers Die Young: A Model for Hidden Heterogeneity within the Human Population
5.1 Introduction
5.2 Materials and Methods
5.2.1 Study Population
5.2.2 Smoking History
5.2.3 Mortality
5.2.4 Physiological Status
5.2.5 Potential Confounders
5.2.6 Statistical Analysis
5.3 Results
5.3.1 Sociodemographic Characteristics by Age and Smoking Status
5.3.2 Age Effects of Smoking on Mortality
5.3.3 Age Effects of Smoking on Physiological Health
5.4 Discussion
Chapter 6: A Genetic Network Associated with Stress Resistance, Longevity, and Cancer in Humans
6.1 Introduction
6.2 Methods
6.2.1 Discovery and Validation Samples
6.2.2 Genotyping and Quality Control
6.2.3 Functional Interaction Network
6.2.4 Polygenic Risk Score
6.2.5 Statistical Analysis
6.3 Results
6.3.1 Genome-Wide Association Study
6.3.2 Network and Pathway Analysis
6.3.3 Polygenic Risk Score Based Validation
6.4 Discussion
Chapter 7: Conclusions and Outlook
7.1 Discussion of Study Results
7.2 Proposal of a Novel Method for Identifying Aging Gene Networks
7.3 Network Theory of Aging
7.4 Closing Remarks
Bibliography
Appendix A
List of Tables
Table 2.1 Pearson Correlation between Chronological Age and Biomarkers
Table 2.2 Characteristics for the Full Sample and by Age Group
Table 2.3 Mean Age Estimates for Chronological Age and Biological Age
Table 2.4 ROC Curve Comparisons between CA and Estimates of BA by Age
Table 2.5 Individual Cox Proportional Hazard Models Containing CA and One of Five BAs
Table 3.1 Sample Characteristics for the Full Sample and by Race
Table 3.2 Mean Biological Age by Race
Table 3.3 BA Mediates Racial Disparities in All-Cause, CVD and Cancer Mortality
Table 4.1 Sample Characteristics
Table 5.1 Demographic Characteristics by Age and Smoking Status
Table 5.2 Mortality Effects of Smoking and Age, and the Influence of Daily Smoking Quantity
Table 5.3 Hazard Ratios of Current Smoking and Heavy Smoking by Age
Table 5.4 Regression Coefficients of the Association between Current Smoking and Biomarkers
Table 5.5 Associations between Biomarkers and Mortality for Current and Never Smokers
Table 6.1 Odds Ratios for PRS from a Multinomial Regression Model for Longevity using the Validation Sample
Table 6.2 Comparing the Performance of Various PRS for Predicting Longevity in the Validation Sample
Table 6.3 Random Effects Logistic Regression Models of the Association between PRS and Disease Prevalence
Table S6.1 Enriched Pathways in the Network of 215 Genes
Table S6.2 SNPs and Weights used to Generate the Polygenic Risk Score
List of Figures
Figure 1.1 Gompertz Mortality Curve Based on Data from the Social Security Administration’s Period Life Table, 2010
Figure 3.1 Racial Differences in Adjusted Mean BA by 10-year CA Groups
Figure 4.1 Changes in biological age between period 1 and period 2 by sex and age
Figure 4.2 Changes in the frequency of smoking and obesity between the two periods
Figure 4.3 Additions to biological age related to smoking and obesity
Figure 4.4 Contributions of BMI and smoking to decreases in biological age between period 1 and period 2
Figure 4.5 Medication use and decreases in biological age
Figure 5.1 Age Trends in the Association between Smoking and Biomarkers
Figure 6.1 Study Approach
Figure 6.2 GWAS Results
Figure 6.3 Functional Interaction Network
Figure 6.4 Associations between PRS and Longevity
Figure 7.1 Adaptation of transcriptomic WGCNA to Genomic WGCNA
Figure 7.2 Theoretical Multilevel Network
Abstract
Aging is an exceedingly complex process, and for centuries scientists have strived to
uncover the mechanisms that contribute to differences in the rate of the physiological decline that
characterizes this process. Over the years it has become apparent that the rate of physiological
decline that an organism undergoes over time is likely regulated by complex interactions of genes
and environment. In animal models, nutritional interventions and genetic knockouts have produced
dramatic lifespan extensions. It has been shown that genetic variation also influences
how animals respond to and cope with environmental perturbations. In humans, environmental
factors such as nutrition and exposure to cigarette smoke have been found to significantly affect
lifespan. Nevertheless, these effects are not consistent for everyone, as evidenced by the history of
smoking behavior amongst the very old.
Given the complexities involved in both the aging process and its regulators, complex
statistical models will likely facilitate our understanding of these multifaceted interactions—
particularly in humans. While the relatively short lifespan of laboratory animals makes them ideal
for studying the aging process, our ability to impact our own health requires studies on biological
aging processes to also be conducted in humans. Waiting to examine subjects at the end of life or
collecting data over many decades may not be ideal for uncovering the causes of aging and lifespan
heterogeneity in humans. Thus, the development of mathematical algorithms that are able to
approximate the degree of aging an individual has undergone could allow researchers to test the
efficacy of aging interventions in human populations. Additionally, advanced statistical methods
may also allow us to model the complex interactions between genes and environment that
contribute to within- and between-species differences in aging and lifespan.
In Chapter 1 of my dissertation, I outline some of the work that has been done in biology
and demography that contributes to our understanding of the aging process. I then go on to offer
further evidence for the importance of more complex statistical modeling in studying human aging.
In Chapter 2, I present and provide validation for an algorithm that estimates biological age. I show
that the residuals in the difference between biological and chronological age are strongly
associated with mortality and that biological age is a more accurate predictor of remaining life
expectancy than chronological age.
In Chapters 3 and 4, I provide further proof of concept for the biological age algorithm.
Chapter 3 examines whether differences in biological age account for the racial disparities in
mortality that are present in the U.S. I find that biological age is successful in identifying the most
at-risk individuals in a population and that adjustment for biological age at baseline completely
accounts for race differences in all-cause, cardiovascular disease, and cancer mortality. In Chapter
4, I examine how the aging of the population has changed over the past two decades and examine
the relative contributions of population changes in the prevalence of smoking, obesity, and
medication-use. Results in Chapter 4 suggest that aging has slowed for older and middle-aged
adults, especially men, and that this was likely due to decreases in smoking and increased
medication use. Unfortunately, younger women experienced very little improvement, which likely
resulted from their increasing rates of obesity.
Chapters 5 and 6 examined why some individuals are able to reach extreme old age even
in the presence of clearly high exposure to damaging factors. In Chapter 5, I tested whether long-lived
smokers represented a biologically resilient phenotype that could facilitate our understanding
of heterogeneity in the aging process. Results showed that while smoking significantly increased
mortality in most age groups, it did not increase the mortality risk for those who were age 80 and
over at baseline. Additionally, when comparing the adjusted means of biomarkers between never
and current smokers, long-lived smokers (80+) were found to have similar inflammation, HDL,
and lung function levels to never smokers. Given the evidence that these individuals represent an
innately resilient group, in Chapter 6, I used this phenotype to identify genomic networks that
contributed to stress resistance and longevity. Overall, using a unique phenotype and incorporating
prior knowledge of biological networks, I identified a cluster of 215 single nucleotide
polymorphisms that together appear to be associated with human aging, stress resistance, cancer,
and longevity.
In the final chapter of my dissertation, I propose a novel method that, in moving forward
with my research, I plan to use to identify complex genetic networks associated with a multitude
of age-related conditions. I suggest how this method could also be incorporated into gene by
environment or multi-level networks that utilize various types of omics data, social and behavioral
data, and aging-related outcomes. I conclude the dissertation by outlining a new theory of aging
that incorporates multi-level networks, programmed and stochastic theories of aging, and the
second law of thermodynamics—which in future work, I plan to test using stochastic simulations.
Chapter 1: Introduction
1.1 Aging as a Conserved Biological Process
Over time, biological organisms experience a progressive decline of cellular structure and
function, resulting in a decreased ability of cells to respond adequately to environmental
perturbations and thus maintain homeostasis. This process, known as aging, is one of the most
complex biological mysteries we can observe, encompassing a multitude of coordinated
physiological alterations across numerous interacting systems, and leading to the ultimate demise
of the organism. In particular, aging in humans is often characterized by processes such as the
accumulation of atherosclerotic plaques, vascular stiffening, reduced vital capacity, impaired
creatinine clearance, dysregulation of mitochondrial energy metabolism, impaired glucose
functioning, mutation accumulation, increased presence of senescent cells, protein aggregation,
decreases in white matter volume, reduced protein turnover, and reduced nutritional absorption (1-
3). Together these changes contribute to an individual’s increasing susceptibility to a number of
distinct conditions such as cardiovascular disease, cancer, diabetes, neurodegenerative diseases,
sarcopenia, lung disease, and frailty, as well as the increasing risk of mortality over time (4).
Slowing the pace of aging on a molecular level would delay death and disease and result in longer
healthy life (5).
Traditionally, mortality rates have been used to measure the pace of aging of a population
(6). The mortality acceleration that occurs over the lifespan of a group is a relatively conserved
phenomenon that can often be modeled mathematically. In humans, as with many model
organisms, such as worms, mice, and flies, the mortality rate increases exponentially with age,
following development. This observation is termed the Gompertz-Makeham law of mortality, for
which the Gompertz function represents an age-dependent component of mortality (1).
$$h(x) = \alpha e^{\beta x} \qquad (1.1)$$
In this equation, h(x) is the mortality rate at age x, α sets the baseline level of mortality, and the exponential coefficient (β) can be taken to represent the population’s
senescent component. This component varies significantly between species—those with rapid
senescence (e.g. yeast, worms, and flies) having the highest values, followed by those with gradual
senescence (e.g. mice, and humans), and finally those with negligible senescence (e.g. rougheye
rockfish, bristlecone pine, and many long-lived tortoises) (1). Even among humans, differences
in the senescent component are observed between the sexes, with females showing a somewhat higher
coefficient than males in the data shown in Figure 1.1.
Figure 1.1: Gompertz Mortality Curve Based on Data from the Social Security
Administration’s Period Life Table, 2010. Observed male and female mortality
trends follow the Gompertz mathematical function very closely (R² ≈ 0.995). From
about age 30 to age 100 the mortality rate increases exponentially for both sexes—with
exponential coefficients of 0.084 for males and 0.091 for females.
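
As a rough illustration of Equation 1.1, the sketch below evaluates the Gompertz mortality rate, converts an exponential coefficient into a mortality rate doubling time (ln 2 / β), and recovers α and β from age-specific rates by a straight-line fit to the logarithm of the rates. The function names, the baseline value `ALPHA_ASSUMED`, and the synthetic rates are assumptions made for illustration only; they are not the Social Security Administration data or the fitting procedure used for Figure 1.1. Only the coefficients 0.084 and 0.091 are taken from the figure caption.

```python
import math


def gompertz_hazard(age, alpha, beta):
    """Gompertz mortality rate h(x) = alpha * exp(beta * x)."""
    return alpha * math.exp(beta * age)


def doubling_time(beta):
    """Years needed for the mortality rate to double: ln(2) / beta."""
    return math.log(2) / beta


def fit_gompertz(ages, rates):
    """Ordinary least-squares fit of ln h(x) = ln(alpha) + beta * x.

    Returns the pair (alpha, beta).
    """
    n = len(ages)
    log_rates = [math.log(r) for r in rates]
    mean_x = sum(ages) / n
    mean_y = sum(log_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, log_rates))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


# Hypothetical, noise-free rates generated from assumed parameters (not SSA data).
ALPHA_ASSUMED = 1e-4  # illustrative baseline level, not taken from the source
BETA_MALE = 0.084     # exponential coefficient quoted in the Figure 1.1 caption
BETA_FEMALE = 0.091   # exponential coefficient quoted in the Figure 1.1 caption

ages = list(range(30, 101, 5))
rates = [gompertz_hazard(x, ALPHA_ASSUMED, BETA_MALE) for x in ages]
alpha_hat, beta_hat = fit_gompertz(ages, rates)

print(f"recovered alpha={alpha_hat:.6g}, beta={beta_hat:.4f}")
print(f"doubling time (beta=0.084): {doubling_time(BETA_MALE):.2f} years")
print(f"doubling time (beta=0.091): {doubling_time(BETA_FEMALE):.2f} years")
```

Under these assumptions, the coefficients in the caption correspond to mortality rate doubling times of roughly 8.3 and 7.6 years. With real life-table data the fit would be restricted to the age range where the log-linear pattern holds (about 30 to 100 in Figure 1.1).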
1.2 Biological Theories of Aging
Our ability to alter the pace of aging relies on our understanding of the mechanisms that
regulate it; thus, a number of biological theories have been proposed to describe how changes on a
physiological level contribute to what is seen at the population level. These theories can often be
broken down into two groups—the programmed theories of aging and the stochastic theories of
aging (7). The stochastic theories assert that aging is a result of damage that accumulates over the
lifetime (8-10). Proponents of these damage accumulation theories suggest that our physiological
systems encounter cumulative insults, which are artifacts of physiological functioning and the
environment we live in. More specifically, the free radical theory, which was first proposed by
Harman in 1956 (11), postulates that reactive oxygen species (byproducts of endogenous
metabolism and exogenous environmental factors such as pollutants, smoke, or ultraviolet
radiation) contribute to aging by causing chronic damage to cellular components. However, the
question still remains: how does the accumulation of damage from factors that are, for the most
part, universal generate the significant variations we observe in the lifespan of different species?
One explanation is that species allocate different proportions of energy towards
mechanisms associated with maintenance and repair. August Weismann and then Thomas
Kirkwood proposed that organisms make trade-offs regarding energy allocation (12-14). Because
of the force of natural selection, which is mostly not influenced by survival past reproduction, and
with segregation of the germ and the soma, species will invest greater energy in germ cells,
directing resources away from the soma. This theory, later termed the Disposable Soma Theory by
Kirkwood and extended to include growth and reproduction as components that compete with somatic
maintenance and repair, explains how variations in fecundity, parental investment, growth rate,
and environmental selection forces (predation, nutrient availability, crowding) contribute to
differences in the finite lifespan of populations (15). The theory also accounts for within-group
differences—a result of stochastic damage that goes unresolved when energy is allocated away
from somatic maintenance and repair.
Programmed theories of aging offer a more deterministic view of the causes of aging. Like
the factors that regulate growth and development, factors that regulate senescence will operate
according to a biological timetable—contributing to changes in gene expression, methylation and
chromatin structure, activation of transposable elements, inflammation, and hormone levels—
which may influence systems involved in maintenance and repair (16). Such factors that direct
the innate timing of age-related decline are believed to be genetically conferred and may operate
regardless of external signals or may also be environmentally triggered. The programmed theories
suggest that aging ‘evolved’ via an adaptive process in which limiting survival benefits the
population as a whole (17). While there are a number of arguments against group selection, within-
and between-group heterogeneity in the timing of death and decline suggest that an interaction
likely exists between 1) the stochastic environmental causes of aging outlined in the accumulative-
damage theories and 2) the genetically-determined variations in allocation of somatic maintenance
and repair which underlie the programmed theories of aging.
1.3 Gene and Environment
Substantial evidence exists to suggest that the aging process is influenced by complex
dynamics and interactions between genes and the environment (18). The influence of
environmental factors such as diet, exposure to environmental toxins, nutrition, and psychological
stress has been well documented (19-21). In animal models from yeast to mice, interventions such
as caloric restriction have been utilized to drastically delay the incidence of death and age-related
conditions (22). It is hypothesized that reductions in nutrient availability send biochemical signals
to up-regulate or allocate more energy towards maintenance and repair mechanisms. Nevertheless
there is evidence that environmental contributions to aging may be genetically dependent. In
mouse models for instance, the benefits of caloric restriction have been shown to be strain-specific
(23)—eliciting large gains in lifespan among some, yet little to no gains for others.
In humans, environmental and behavioral factors such as nutrition/obesity, inhalation of
cigarette smoke or environmental toxins, socioeconomic status, alcohol consumption, and
psychosocial stress have been shown to have very large effects on health, aging, and mortality
risk. Stressors such as cigarette smoking are predicted to decrease the typical lifespan of an
individual by an average of 10 years—likely via an acceleration of the aging process (20).
Nevertheless, some smokers are able to survive to very old age (24), including the longest-lived
human on record, Jeanne Calment, who reportedly survived to the age of 122 despite smoking for
nearly 100 years (25). This, along with the findings on calorie restriction, supports the idea that
complex genetic networks may regulate how the environment interacts with and modifies
biological processes.
1.4 Modeling Biological Aging and Its Causes
Aging is a multi-system process that is inherently difficult to quantify. Nevertheless, to
date, many of the measures proposed to study the aging process at the biological level have relied
on single-system or individual measures of physiological functioning. While reductionist
approaches are crucial for making causal inferences regarding connections between components
of a system, the draw-back of these approaches for studying the aging organism as a whole is that
many of the multiplicative or additive effects across physiological systems are not captured.
Changes in systems likely influence one another and may take on complex functional forms over
time. While damage may originate faster in one system, the effects likely cascade through the
larger multi-system network that makes up an organism. Thus it is important to model biological
aging and mortality risk using methods that account for changes across a multitude of
physiological systems.
The genetic contributors to aging are also likely to exhibit complex network-based
characteristics. Evidence from model organisms substantiates the premise that aging is influenced
by multiple interacting genes within a small number of biological pathways (26). Given the
intrinsic connectivity between genes, it is hypothesized (27) that the effects of individual gene
mutations may disseminate through these networks which may give rise to multiple phenotypes—
a phenomenon known as ‘pleiotropy’. The pleiotropic nature of the aging phenotype—as pointed
out previously regarding the numerous age-related outcomes, such as lifespan, multiple chronic
diseases, and loss of functioning—further validates the idea that aging is likely regulated by a
complex system of interacting factors.
As a whole, our ability to accurately model genetic and environmental contributions to age-
related health outcomes is vital to our understanding of the underlying mechanisms that contribute
to differences in the pace of aging. Thus, in Chapter 2, I will present an algorithm that can be used
to estimate the biological age of individuals (28). I also provide validation of this measure by
showing that the residuals between biological and chronological age are significantly associated
with mortality risk and that this algorithm produces measures that, within a human population, are
better predictors of remaining lifespan than chronological age. This is carried over to my work in
Chapter 3 (29), in which I compare two groups with well-established life expectancy
differentials—Non-Hispanic Whites and Non-Hispanic Blacks—and present evidence to illustrate
that disparities in all-cause, cardiovascular disease, and cancer mortality are fully accounted for
by differences in biological age.
The relatively short lifespans of model organisms are one of the many appeals for using
them to study aging in lieu of humans. Although the eight decade average lifespan of humans
makes it difficult to study the developmental aging process from cradle to grave, algorithms such
as the one presented in this work, could allow researchers to determine how the pace of aging is
regulated by various environmental and genetic factors without needing to rely on mortality as an
outcome. I provide further evidence for this in Chapter 4 by showing how historical changes in
health behaviors, such as smoking, obesity, and medication use, contribute to population-level
differences in biological aging. I show that along with obesity, smoking accelerates the aging
process, as measured using the algorithm for biological age. Nevertheless, studies using animal
models suggest that environmental influences on aging likely depend upon their interactions with
genes. Thus, in Chapters 5 (24) and 6, I present a long-lived smoker phenotype that provides
evidence for innate within-group heterogeneity regarding resilience to environmental stressors,
and use this phenotype to develop a network-based measure of genetic load that, along with
resiliency, is associated with extreme longevity and lower risk of cancer in the general population.
Chapter 2: Modeling the Rate of Senescence
2.1 Introduction
Aging is often defined as the gradual functional and structural decline of an organism,
resulting in an increasing risk of disease, impairment and mortality over the lifespan (30).
Although aging can be seen in nearly all species, the rate of age-related decline is not universal
(7). Heterogeneity arises within and between species due to variations in exposure to damaging
properties (diverse behaviors and environments), as well as the body’s innate ability to cope with
such stressors (31). Consequently, age, when measured chronologically, may not be a reliable
indicator of the body’s rate of decline or physiological breakdown, but rather, may serve only as a
proxy for the rate of aging. Nevertheless, in order to better assess an individual’s degree of aging,
and thus residual lifespan or susceptibility to disease, new approaches need to be developed that
provide predictive power beyond what is gained from measuring chronological age (CA) alone.
The idea that age-related biological changes could be measured was first proposed by Alex
Comfort in 1969 (32). Given the number of cellular and systemic changes that accompany the
aging process, it is believed that such changes could be quantified through the identification and
measurement of biomarkers of aging. Over the years, significant work has gone into trying to
identify biomarkers of aging that could be used to study senescence in humans or animal models
(33); however, there has been limited success thus far. It has been suggested that, due to the
complexity of the aging process—particularly in humans—no single biomarker is likely to be
identified that accurately measures the rate of biological aging (34). On the other hand, unlike
individual biomarkers of aging, Biological Age (BA) estimates facilitate the merging of multiple
biomarkers into a single latent variable, which may better account for the complexity of the aging
process. The coalescing of various measures into a single multifaceted biomarker may prove useful
in both biological research—to study the relative contributions of genes, environment, and
stochasticity to the pace of aging—as well as in public health research or clinical practice—to
identify individuals at increased risk of disability, death, and/or disease.
Although several papers have been published on the measurement of biological age (BA),
there is little consensus regarding the method in which BA should be calculated. Over the years, a
number of varying mathematical algorithms have been suggested, such as multiple linear
regression (MLR) (35-38), principal component analysis (PCA) (39-42), and more recently, a
novel method proposed by Klemera and Doubal (43). However, validation of such estimates has
been limited, particularly when it comes to utilizing BA for predicting mortality.
Given that the intrinsic value of BA is impossible to measure, the validation of calculated
estimates proves difficult. Nevertheless, the reliability and validity of BA measurements should
be evaluated using common criteria. For example, BA calculations should produce realistic
measurements, within the limits of recorded lifespan. BA estimations should also be able to
identify at-risk individuals prior to them entering a disease state. Many of the methods currently
used in identifying at-risk individuals rely upon indexes of disease, frailty, or cumulative deficits
of biomarkers reaching a predetermined cutoff (44-46). However, these estimates may not be
useful in examining young or middle aged adults and therefore, may not be ideal for use in
prevention early in the lifespan. Finally, BA should satisfy the criteria set forth for biomarkers of
aging, which states that: 1) A biomarker needs to be a better predictor of multiple age-associated
biological and functional outcomes than is chronological age; 2) Biomarkers should be able to
predict both remaining longevity and disease-specific mortality in a population for which 90% of
the individuals are still alive; and 3) The method of measurement should not affect life expectancy
or any future age-related measurements (47).
Using these criteria, the focus of this study is to compare BA measures, estimated using
various methods that have been proposed in the literature, with the goal of determining their
validity and usefulness in predicting mortality outcomes within a large nationally representative
human sample.
2.2 Materials and Methods
2.2.1 Study Population
The study population included subjects from the third National Health and Nutrition
Examination Survey (NHANES III), a nationally representative, cross-sectional study conducted
by the National Center for Health Statistics (NCHS) between 1988 and 1994. Data for NHANES
III were collected from at-home interviews and examinations taking place at a Mobile Examination
Center (MEC). Further details of recruitment, procedures, population characteristics and study
design are available through the Centers for Disease Control and Prevention. The current study
was limited to adults aged 30-75, in order to ensure subjects were old enough to be experiencing
detectable age-related changes in biomarkers, yet not so old as to represent a select group with
above average health and longevity. Of the 12,517 adults aged 30-75 in NHANES, our final
analytic sample included 9,389 subjects. Excluded participants consisted of those with missing
data on one or more of the biomarker measures.
2.2.2 Selection of Biomarkers
Biomarkers were selected based upon knowledge regarding their role or dependency on
the aging process, independence, use in previous BA or biomarkers of aging studies (41, 48), their
availability, and the statistical significance and strength of their relationship with CA. The 21
biomarkers considered in our analysis can be classified into seven domains: 1) Metabolic
Function: Glycated Hemoglobin (Hba1c), Total Cholesterol, and High Density Lipoprotein
(HDL); 2) Cardiac Function: Systolic Blood Pressure, Diastolic Blood Pressure, and Pulse; 3)
Lung Function: Forced Expiratory Volume (FEV); 4) Kidney Function: Serum Creatinine and
Serum Urea Nitrogen; 5) Liver Function: Serum Alkaline Phosphatase and Serum Albumin; 6)
Immune Function and Inflammation: C-reactive Protein (CRP), Cytomegalovirus optical density
(CMV), Lymphocyte percent, Mononuclear percent, and Granulocyte percent; and 7) Cell Blood
Count (CBC): White Blood Cell count (WBC), Red Blood Cell count (RBC), Platelet count,
Hemoglobin, and Hematocrit. Pearson Correlations were then used to assess the relationships of
the 21 potential biomarkers with age (Table 2.1).
Table 2.1 Pearson Correlation Coefficients between Chronological Age and Biomarkers
Biomarker                                   Correlation with CA (r)
CRP (mg/dL)                                  0.122***
Serum Creatinine (mg/dL)                     0.148***
Glycated Hemoglobin (%)                      0.261***
Serum Albumin (g/dL)                        -0.220***
Serum Total Cholesterol (mg/dL)              0.288***
Cytomegalovirus optical density (CMV)        0.261***
Serum Urea Nitrogen (mg/dL)                  0.296***
Serum Alkaline Phosphatase SI (U/L)          0.218***
Forced Expiratory Volume (FEV) (ml)         -0.535***
Systolic Blood Pressure                      0.501***
Serum High Density Lipoproteins (mg/dL)      0.026**
Hemoglobin (g/dL)                           -0.052***
Lymphocyte Percent                          -0.033**
White Blood Cell Count                      -0.020*
Hematocrit (%)                              -0.036**
Red Blood Cell Count                        -0.096***
Mononuclear percent                          0.074***
Granulocyte percent                          0.010
Platelet Count                              -0.046***
Pulse (beats/min)                            0.054***
Diastolic Blood Pressure                     0.047***
***p<.0001, **p<.01, *p<.05
Ten biomarkers that significantly correlated with CA at |r| > 0.10 were selected for inclusion into
the BA estimates. These biomarkers included: CRP, Serum Creatinine, Hba1c, Systolic Blood
Pressure, Serum Albumin, Total Cholesterol, CMV, Serum Alkaline Phosphatase, FEV, and
Serum Urea Nitrogen.
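The selection rule can be illustrated with a short Python sketch. It uses the coefficients reported in Table 2.1 and assumes the cutoff applies to the absolute value of r, since Serum Albumin and FEV are negatively correlated with CA yet appear in the selected set. The significance requirement is not re-checked here; the dictionary keys and the function name are illustrative only.

```python
# Pearson correlations with chronological age, transcribed from Table 2.1.
CORRELATIONS = {
    "CRP": 0.122,
    "Serum Creatinine": 0.148,
    "Glycated Hemoglobin": 0.261,
    "Serum Albumin": -0.220,
    "Total Cholesterol": 0.288,
    "CMV": 0.261,
    "Serum Urea Nitrogen": 0.296,
    "Serum Alkaline Phosphatase": 0.218,
    "FEV": -0.535,
    "Systolic Blood Pressure": 0.501,
    "HDL": 0.026,
    "Hemoglobin": -0.052,
    "Lymphocyte Percent": -0.033,
    "White Blood Cell Count": -0.020,
    "Hematocrit": -0.036,
    "Red Blood Cell Count": -0.096,
    "Mononuclear Percent": 0.074,
    "Granulocyte Percent": 0.010,
    "Platelet Count": -0.046,
    "Pulse": 0.054,
    "Diastolic Blood Pressure": 0.047,
}

THRESHOLD = 0.10  # cutoff stated in the text; applied to |r| (assumption)


def select_biomarkers(correlations, threshold=THRESHOLD):
    """Return the biomarkers whose absolute correlation with CA exceeds the cutoff."""
    return [name for name, r in correlations.items() if abs(r) > threshold]


selected = select_biomarkers(CORRELATIONS)
print(len(selected), "biomarkers selected:", ", ".join(selected))
```

Under this reading, the filter returns the same ten biomarkers listed above.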
2.2.3 BA Estimates
2.2.3.1 Principal Component Analysis (PCA)
PCA is a method used to reduce a set of variables to a small number of factors, called
principal components, while optimizing the amount of variance explained. BA was calculated in
accordance with the method proposed by Nakamura et al. (41). The 1st principal component score
was used to represent a BA score (BAS). Given that BAS is not in units of years, scores were
transformed to allow for comparisons with CA. Finally, BA models were further adjusted by
adding a z-score to the BA estimates, as suggested by Nakamura et al. (41), in order to account for
systematic errors that may cause over or under estimations of BA.
2.2.3.2 Multiple Linear Regression (MLR)
Although MLR remains one of the most commonly used methods for the calculation of
BA, it has encountered criticism given the risk of multicollinearity in the models, as well as the
potential for estimates to regress towards the mean (49). Using MLR, BA is assumed to be equal
to the predicted CA of an individual (2), and is based upon the relationship of true (measured) CA
and several biomarkers (m).
$$BA_i = \text{Predicted } CA_i = a_0 + \sum_{j=1}^{m} b_j x_{ji} \tag{2}$$
Two BA scores were calculated using sex-stratified MLR. The first incorporated all ten
biomarkers and the second used only those selected by the PCA. The results from the equations
were then standardized so that the mean BA of subjects of a given age was equal to CA.
2.2.3.3 Klemera and Doubal’s Method (KDM)
In their paper, “A new approach to the concept and computation of biological age”,
Klemera and Doubal present a new mathematical algorithm, claiming that it is the optimum
method for the calculation of BA (43). The BA estimates are based upon minimizing the distance
between m regression lines and m biomarker points, within an m dimensional space of all
biomarkers. In their paper, the authors used computer-generated simulations to validate the
method they propose. They defined BA as equal to CA, plus some random variable, $R_{BA}$, with a
mean of zero and a variance $s_{BA}^2$. Klemera and Doubal presented two alternative methods for
calculating the optimum estimates of BA (equation 3 and equation 4), in which the latter method
utilizes CA in the final equation and was shown to be superior in simulations.
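$$BA_E = \frac{\sum_{j=1}^{m} (x_j - q_j) \frac{k_j}{s_j^2}}{\sum_{j=1}^{m} \left( \frac{k_j}{s_j} \right)^2} \tag{3}$$

$$BA_{EC} = \frac{\sum_{j=1}^{m} (x_j - q_j) \frac{k_j}{s_j^2} + \frac{CA}{s_{BA}^2}}{\sum_{j=1}^{m} \left( \frac{k_j}{s_j} \right)^2 + \frac{1}{s_{BA}^2}} \tag{4}$$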
In order to produce an estimate for BA using equation 4, $s_j^2$ and $s_{BA}^2$ have to be calculated.
The value $s_j$ represents the root mean squared error of a biomarker regressed on BA. However,
given that BA is not measurable, root mean squared errors (RMSE) from the regressions between
each biomarker and CA, rather than BA, were used, as suggested by Haeng Cho et al. (49). Finally,
in order to calculate $s_{BA}^2$, equation 3, as well as the following two equations, were used sequentially.
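$$r_{char} = \frac{\sum_{j=1}^{m} \frac{r_j^2}{\sqrt{1 - r_j^2}}}{\sum_{j=1}^{m} \frac{r_j}{\sqrt{1 - r_j^2}}} \tag{5}$$

$$s_{BA}^2 = \left( \frac{\sum_{i=1}^{n} \left( (BA_{Ei} - CA_i) - \frac{\sum_{j=1}^{n} (BA_{Ej} - CA_j)}{n} \right)^2}{n} \right) - \left( \frac{1 - r_{char}^2}{r_{char}^2} \right) \times \left( \frac{(CA_{max} - CA_{min})^2}{12m} \right) \tag{6}$$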
The value $r_j^2$, used to calculate the characteristic correlation coefficient from equation 5, refers to
the variance explained by the regression between CA and the $j$th of the m biomarkers. Finally, in
accordance with the assumption made by Klemera and Doubal, $s_{BA}^2$ was transformed so that $s_{BA}$
maintained the same mean, but was now linearly increasing with age, with a difference of 5 between
subjects at $CA_{min}$ and $CA_{max}$. As with the MLR approach, two BA scores were calculated: one based
on all ten biomarkers and one based on the biomarkers selected by PCA. For further details of the methods
and equations refer to the paper by Klemera and Doubal (43).
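Equations 3 through 6 can be sketched in plain Python as follows. This is a minimal illustration rather than the code used in the study: the function names and the example values of $q_j$, $k_j$, and $s_j$ are hypothetical, and $r_j$ is assumed to be supplied as a non-negative value (the square root of the variance explained).

```python
import math


def ba_e(x, q, k, s):
    """Equation 3: BA estimate from m biomarkers, without chronological age."""
    num = sum((xj - qj) * kj / sj ** 2 for xj, qj, kj, sj in zip(x, q, k, s))
    den = sum((kj / sj) ** 2 for kj, sj in zip(k, s))
    return num / den


def ba_ec(x, q, k, s, ca, s_ba2):
    """Equation 4: BA estimate that also uses chronological age (CA)."""
    num = sum((xj - qj) * kj / sj ** 2 for xj, qj, kj, sj in zip(x, q, k, s))
    den = sum((kj / sj) ** 2 for kj, sj in zip(k, s))
    return (num + ca / s_ba2) / (den + 1.0 / s_ba2)


def r_char(r):
    """Equation 5: characteristic correlation coefficient.

    Assumes each r_j is non-negative and strictly below 1.
    """
    num = sum(rj ** 2 / math.sqrt(1 - rj ** 2) for rj in r)
    den = sum(rj / math.sqrt(1 - rj ** 2) for rj in r)
    return num / den


def s_ba2(ba_e_values, ca_values, rchar, m):
    """Equation 6: variance of the random component of BA."""
    n = len(ba_e_values)
    diffs = [b - c for b, c in zip(ba_e_values, ca_values)]
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / n
    ca_range = max(ca_values) - min(ca_values)
    return var_diff - ((1 - rchar ** 2) / rchar ** 2) * (ca_range ** 2 / (12 * m))


# Hypothetical regression parameters for three biomarkers regressed on CA:
# intercepts q, slopes k, and root mean squared errors s.
q = [100.0, 4.5, 3800.0]
k = [0.5, -0.01, -25.0]
s = [15.0, 0.3, 600.0]

# A subject whose biomarkers sit exactly on the regression lines at age 50.
x = [qj + kj * 50 for qj, kj in zip(q, k)]

print(ba_e(x, q, k, s))                       # close to 50
print(ba_ec(x, q, k, s, ca=60, s_ba2=100.0))  # pulled from 50 toward CA = 60
```

When every biomarker lies on its regression line at a given age, equation 3 returns that age, and equation 4 returns a weighted compromise between that value and CA, with the weight on CA governed by $s_{BA}^2$.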
2.2.4 Mortality
Mortality follow-up was based on linked data from records taken from the National Death
Index through 2006, provided through NHANES III. Data on mortality status were available for all
subjects. During analysis, deaths due to HIV, violence, or accidents, were censored given that
mortality is being used to validate measures of aging, and thus variables should be evaluated on
their ability to predict age-related, rather than stochastic deaths. Finally, given that subjects took
part in NHANES III at different points in time between 1988 and 1994, potential mortality follow-up
time ranged from 12-18 years, causing some subjects who were alive in 2006, to be censored
at less than 18 years. Nevertheless, time of enrollment in NHANES III was random and should
therefore not confound results.
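The censoring rule described above can be sketched as a small helper. The cause-of-death labels and the function name are hypothetical and do not correspond to actual NHANES III or National Death Index codes.

```python
# Causes treated as stochastic rather than age-related (hypothetical labels).
EXCLUDED_CAUSES = {"hiv", "violence", "accident"}


def event_indicator(died, cause=None):
    """Return 1 for an age-related death (event) and 0 for a censored record.

    Subjects alive at the end of follow-up are censored, as are subjects
    who died from HIV, violence, or accidents.
    """
    if not died:
        return 0
    return 0 if cause in EXCLUDED_CAUSES else 1


records = [
    {"id": 1, "died": False, "cause": None},
    {"id": 2, "died": True, "cause": "cardiovascular"},
    {"id": 3, "died": True, "cause": "accident"},
]
events = [event_indicator(r["died"], r["cause"]) for r in records]
print(events)
```

A record censored in this way still contributes its follow-up time to the analysis; only its outcome is treated as unobserved.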
2.2.5 Validation and Comparisons of BA Algorithms
Receiver Operating Characteristics (ROC) curves were used to determine the sensitivity
of CA and the five BA estimates in predicting mortality up to 18 years after baseline. To test the
sensitivity of the variables for different age cohorts, CA and BA estimates were first compared
using the entire age sample and then rerun using two age-stratified groups: those 30-59 years old
and those 60-75 years old. Next, five Cox Proportional Hazard Models, containing both CA and
one of the BA estimates, were used to investigate which one had more predictive power when
included in the same model. Given that BA estimates were calculated separately for males and
females, all analyses were run controlling for sex.
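The area under the ROC curve (AUC) used in these comparisons can be computed as the fraction of (deceased, survivor) pairs in which the deceased subject has the higher age estimate. The sketch below assumes mortality is treated as a binary outcome and uses made-up ages; it does not reproduce the statistical test used to compare curves.

```python
def auc(scores, died):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical subjects: chronological age, biological age, and vital status.
ca = [35, 48, 52, 60, 67, 71]
ba = [33, 45, 58, 57, 70, 75]
died = [0, 0, 1, 0, 1, 1]

print("AUC for CA:", round(auc(ca, died), 3))
print("AUC for BA:", round(auc(ba, died), 3))
```

In this toy data the BA values separate deaths from survivors more cleanly than CA does, which is the pattern the ROC comparison is designed to detect.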
2.3 Results
2.3.1 Sample Characteristics
Sample characteristics are shown in Table 2.2. Approximately half (51.83%) of the
subjects were female, and ranged in age from 30-75, with a mean of 47.46 years. Additionally,
78.5% of subjects were between the ages of 30 and 59, while 21.5% were between the ages of 60
and 75. Overall, 1,843 subjects died between baseline and follow-up. Due to NHANES procedures,
subjects who were presumed alive did not have equivalent follow-up times and were therefore
considered censored during analysis. In addition to those assumed alive, 88 subjects were censored
due to deaths from HIV, violence, or accidents. For living subjects, total person-years was 112,734
years. For those who were deceased total person-years was 16,643 years.
Table 2.2 Characteristics for the full sample and by age group
Characteristic                      Full Sample (N=9,389)   Ages 30-59 (N=6,603)   Ages 60-75 (N=2,786)
Age (years), mean                   47.46 (14.05)           42.14 (9.65)           66.86 (4.21)
Female (%)                          51.83                   51.23                  54.01
Died, (N)                           1,843                   566                    1,277
Censored, (N)                       7,546                   6,037                  1,509
Person Years (Mean)                 14.17 (3.35)            14.65 (2.83)           12.43 (3.92)
CRP (mg/dL), mean                   0.42 (0.67)             0.39 (0.60)            0.53 (0.79)
Creatinine (mg/dL), mean            1.07 (0.29)             1.05 (0.29)            1.14 (0.27)
Glycated Hemoglobin (%), mean       5.42 (1.08)             5.33 (1.08)            5.76 (1.02)
Albumin (g/dL), mean                4.16 (0.38)             4.19 (0.40)            4.06 (0.30)
Total Cholesterol (mg/dL), mean     209.36 (46.20)          204.99 (47.14)         225.29 (40.46)
Cytomegalovirus (CMV), mean         1.78 (1.29)             1.66 (1.38)            2.25 (0.93)
Alkaline Phosphatase (U/L), mean    81.66 (32.59)           79.24 (31.90)          90.50 (32.84)
FEV (ml), mean                      3102.16 (1002.91)       3304.69 (979.62)       2363.82 (705.09)
Urea Nitrogen (mg/dL), mean         14.38 (5.28)            13.71 (4.97)           16.82 (5.34)
Systolic Blood Pressure, mean       123.13 (18.77)          119.47 (17.18)         136.49 (16.90)
2.3.2 Algorithm Results
2.3.2.1 PCA
The ten biomarkers selected from the Pearson Correlation were included in the PCA and
run separately for males and females. Of the biomarkers included, seven significantly loaded on
the 1st principal component, for both males and females. The biomarkers that reached significance
for males were CRP, Hba1c, Serum Albumin, CMV, Serum Alkaline Phosphatase, FEV, and systolic
blood pressure. For females, the biomarkers that reached significance were CRP, Hba1c, total
cholesterol, Serum Alkaline Phosphatase, FEV, serum urea nitrogen, and systolic blood pressure.
CA was then loaded and unloaded to test the stability of the candidate biomarkers and the
relationship between age and the 1st principal component. Using the variable set consisting of the
nine biomarkers and CA, the 1st principal component had an eigenvalue of 2.61 for males and 3.23
for females. Furthermore, CA had PCA loadings of 0.694 and 0.740 for males and females,
respectively. In the final PCA, with the nine biomarkers, excluding CA, the 1st principal
component for males had an eigenvalue of 2.08; while for females, the 1st principal component
had an eigenvalue of 2.67. Next, regressing the nine variables produced BA score (BAS) equations 7
for males and 8 for females.
$$BAS = 0.382 + 0.451(\text{CRP}) + 0.230(\text{Hba1c}) - 0.746(\text{Albumin}) + 0.175(\text{CMV}) + 0.008(\text{Alkaline Phosphatase}) - 0.0004(\text{FEV}) + 0.014(\text{SBP}) \tag{7}$$

$$BAS = -4.10 + 0.229(\text{CRP}) + 0.220(\text{Hba1c}) + 0.005(\text{Cholesterol}) + 0.008(\text{Alkaline Phosphatase}) - 0.0004(\text{FEV}) + 0.034(\text{Urea Nitrogen}) + 0.015(\text{SBP}) \tag{8}$$
BAS estimates were then transformed to years by multiplying them by the standard deviation of
CA and adding the mean of CA, as shown in equations 9 for males and 10 for females.
$$BA = (BAS \times 14.18) + 47.15 \tag{9}$$

$$BA = (BAS \times 13.92) + 47.75 \tag{10}$$
Finally, true BA (TBA) was calculated by adding z scores, calculated as $z = (y_i - \hat{y}) \times (1 - b)$,
to BA values, where $y_i$ is the individual’s CA, $\hat{y}$ is the mean CA for the group, and b is the
coefficient of BA regressed on CA.
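As a worked illustration of equations 7 and 9 and the z-score correction, the sketch below applies the male coefficients to the full-sample biomarker means from Table 2.2. Those means pool both sexes, so the result is illustrative only; the value of b is not reported in the text and is a hypothetical placeholder, as are the function and key names.

```python
# Male BAS coefficients from equation 7.
MALE_BAS_INTERCEPT = 0.382
MALE_BAS_COEF = {
    "crp": 0.451,
    "hba1c": 0.230,
    "albumin": -0.746,
    "cmv": 0.175,
    "alk_phos": 0.008,
    "fev": -0.0004,
    "sbp": 0.014,
}
MALE_CA_SD, MALE_CA_MEAN = 14.18, 47.15  # constants from equation 9


def bas_male(biomarkers):
    """Equation 7: biological age score (unitless) for males."""
    return MALE_BAS_INTERCEPT + sum(
        coef * biomarkers[name] for name, coef in MALE_BAS_COEF.items()
    )


def ba_years(bas, ca_sd=MALE_CA_SD, ca_mean=MALE_CA_MEAN):
    """Equation 9: rescale the score to years."""
    return bas * ca_sd + ca_mean


def corrected_ba(ba, ca, mean_ca, b):
    """Add the z-score correction z = (CA - mean CA) * (1 - b)."""
    return ba + (ca - mean_ca) * (1 - b)


# Full-sample means from Table 2.2 (both sexes pooled, illustrative input).
subject = {
    "crp": 0.42, "hba1c": 5.42, "albumin": 4.16, "cmv": 1.78,
    "alk_phos": 81.66, "fev": 3102.16, "sbp": 123.13,
}
bas = bas_male(subject)
ba = ba_years(bas)
b_hypothetical = 0.8  # placeholder; the fitted slope is not given in the text
tba = corrected_ba(ba, ca=60.0, mean_ca=MALE_CA_MEAN, b=b_hypothetical)
print(round(bas, 3), round(ba, 2), round(tba, 2))
```

Because the correction is proportional to the distance between a subject's CA and the mean CA, it vanishes for a subject of exactly average age and grows toward the extremes of the age range.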
2.3.2.2 MLR
Sex-stratified MLR was performed utilizing the 10 biomarker variables as predictors of
CA. Correlations between biomarkers were run to examine multicollinearity, which is a common
concern of the MLR method. Among the 10 biomarkers, all correlation coefficients were found to
be within acceptable levels (less than r=0.40). The biomarkers in the MLR models accounted for
50.9% and 58.8% of the variance in CA for males and females, respectively ($r^2_{males}=0.509$;
$r^2_{females}=0.588$). As noted by Ingram (50), unexplained variance is necessary to capture the
differences in BA among individuals of a given CA. From the regression models, the following
equations were used to generate BA estimates for males (equation 11) and females (equation 12).
$$BA_E = 54.73 - 0.297(\text{CRP}) + 0.989(\text{Creatinine}) + 0.620(\text{Hba1c}) - 6.413(\text{albumin}) + 0.016(\text{total cholesterol}) + 1.083(\text{CMV}) - 0.009(\text{alkaline phosphatase}) - 0.008(\text{FEV}) + 0.505(\text{Urea Nitrogen}) + 0.205(\text{SBP}) \tag{11}$$

$$BA_E = 34.17 - 1.231(\text{CRP}) - 0.595(\text{Creatinine}) + 0.128(\text{Hba1c}) - 1.512(\text{albumin}) + 0.047(\text{total cholesterol}) + 0.589(\text{CMV}) - 0.006(\text{alkaline phosphatase}) - 0.008(\text{FEV}) + 0.526(\text{Urea Nitrogen}) + 0.196(\text{SBP}) \tag{12}$$
The results from the equations were then standardized so that the mean BA for subjects of
a given age was equal to CA. Finally, the MLR method was then run again, using only those
variables selected from the PCA, thus, producing a new equation for males (equation 13) and
females (equation 14).
$$BA_E = 65.274 - 0.170(\text{CRP}) + 0.764(\text{Hba1c}) - 6.135(\text{albumin}) + 1.023(\text{CMV}) - 0.013(\text{alkaline phosphatase}) - 0.007(\text{FEV}) + 0.215(\text{SBP}) \tag{13}$$

$$BA_E = 28.951 - 1.102(\text{CRP}) + 0.156(\text{Hba1c}) + 0.047(\text{total cholesterol}) + 0.008(\text{alkaline phosphatase}) - 0.009(\text{FEV}) + 0.506(\text{urea nitrogen}) + 0.199(\text{SBP}) \tag{14}$$
Like the results from the first regression equations, results were then standardized to assure that
the mean BA of subjects of a given age was equal to CA. We will refer to the results from the
MLR method using the 10 biomarkers and 7 biomarkers as MLR1 and MLR2, respectively.
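To make the regression equations concrete, the following sketch evaluates the male ten-biomarker model (equation 11) at the full-sample biomarker means from Table 2.2. The input pools both sexes and the final standardization step is omitted, so the output is only a rough illustration; the dictionary keys and function name are hypothetical.

```python
# Male MLR1 coefficients from equation 11.
MLR1_MALE_INTERCEPT = 54.73
MLR1_MALE_COEF = {
    "crp": -0.297,
    "creatinine": 0.989,
    "hba1c": 0.620,
    "albumin": -6.413,
    "total_cholesterol": 0.016,
    "cmv": 1.083,
    "alk_phos": -0.009,
    "fev": -0.008,
    "urea_nitrogen": 0.505,
    "sbp": 0.205,
}


def predicted_ca(biomarkers, intercept=MLR1_MALE_INTERCEPT, coef=MLR1_MALE_COEF):
    """Equation 2 form: BA = a0 + sum over j of b_j * x_j, before standardization."""
    return intercept + sum(b * biomarkers[name] for name, b in coef.items())


# Full-sample means from Table 2.2 (both sexes pooled, illustrative input).
sample_means = {
    "crp": 0.42, "creatinine": 1.07, "hba1c": 5.42, "albumin": 4.16,
    "total_cholesterol": 209.36, "cmv": 1.78, "alk_phos": 81.66,
    "fev": 3102.16, "urea_nitrogen": 14.38, "sbp": 123.13,
}
ba_mlr1 = predicted_ca(sample_means)
print(round(ba_mlr1, 2))
```

The result falls in the mid-forties, in the neighborhood of the sample's mean CA, as expected for a profile built from average biomarker values.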
2.3.2.3 KDM
The $r_{char}$ calculated using equation 5 for KDM1 (10 biomarkers) was 0.360 and 0.432 for
males and females, respectively, while the $r_{char}$ for KDM2 (7 biomarkers) was 0.399 and 0.474
for males and females, respectively. For KDM1, $s_{BA}^2$, which was used to calculate $BA_{EC}$ in equation
4, ranged from 57.66-158.58 with a mean of 99.12 for males, and ranged from 43.13-133.79 with
a mean of 77.67 for females, while for KDM2, $s_{BA}^2$ ranged from 72.45-182.55 with a mean of
118.03 for males, and ranged from 36.23-121.40 with a mean of 68.44 for females.
2.3.3 BA Estimates and Mortality
Means, standard deviations, and ranges for CA and the five BA estimates are listed in Table
2.3. The algorithms produced very similar means for BA; however, ranges varied considerably. The
BA calculated using PCA ranged from 19 to 185 years, while the two BA estimates calculated by
MLR ranged from about 15 to 110 years when all ten biomarkers were used in the estimates, and
from 17 to 106 years when only the seven biomarkers selected by PCA were used. The estimates
for KDM1 and KDM2 ranged from 24 to 110 years and 22 to 101 years, respectively.
Table 2.3 Mean Age Estimates for Chronological Age (CA) and Biological Age (BA)
Measure                            Mean (SD)        Min      Max
Chronological Age                  47.46 (14.05)    30.00    75.00
BA from PCA (7 Biomarkers)         47.93 (14.08)    19.07    185.18
BA from MLR1 (10 Biomarkers)       47.46 (15.63)    15.12    110.29
BA from MLR2 (7 Biomarkers)        47.46 (15.63)    15.50    106.33
BA from KDM1 (10 Biomarkers)       47.47 (15.07)    23.80    110.28
BA from KDM2 (7 Biomarkers)        47.46 (15.08)    20.98    100.64
Note: Principal Component Analysis=PCA; Multiple Linear Regression with ten variables=MLR1; Multiple Linear
Regression with seven variables=MLR2; Klemera and Doubal method with ten variables=KDM1; Klemera and
Doubal method with seven variables =KDM2
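The plausibility criterion stated in the introduction, that BA estimates should fall within the limits of recorded lifespan, can be checked directly against the maxima in Table 2.3. The sketch below assumes 122 years (the age attributed to Jeanne Calment in Chapter 1) as the upper limit; the variable names are illustrative.

```python
MAX_RECORDED_LIFESPAN = 122  # years; assumed upper bound for a plausible BA

# Maximum BA produced by each algorithm, transcribed from Table 2.3.
MAX_BA = {
    "PCA": 185.18,
    "MLR1": 110.29,
    "MLR2": 106.33,
    "KDM1": 110.28,
    "KDM2": 100.64,
}

implausible = [name for name, top in MAX_BA.items() if top > MAX_RECORDED_LIFESPAN]
print("Algorithms exceeding recorded lifespan:", implausible)
```

Under this assumption only the PCA-based estimate produces values beyond the recorded human lifespan.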
Results from the ROC curve comparisons for the whole sample and for the age-stratified
sub-samples are listed in Table 2.4. For the model containing all subjects (ages 30-75), each of the
five BA estimates produced significantly (p<.05) better mortality predictions than CA. Similar
results were found when comparing BA estimates to CA within young and old age groups. Using
ROC curves, the best performing BA estimates were those employing the KDM algorithm. For the
whole sample and the young group, KDM2 had the highest sensitivity (AUC for ages 30-75=0.851 and
AUC for ages 30-59=0.779), compared to CA (AUC for ages 30-75=0.827 and AUC for ages 30-59=0.731), while KDM1
had the highest sensitivity (AUC=0.735) in the older group compared to CA (AUC=0.670).
Table 2.4 ROC Curve Comparisons between CA and Estimates of BA by Age
         Full Sample                   Ages 30-59                    Ages 60-75
         AUC (SE)          p-value     AUC (SE)          p-value     AUC (SE)          p-value
CA       0.827 (0.0052)    reference   0.731 (0.0108)    reference   0.670 (0.0067)    reference
PCA      0.840 (0.0050)    <.0001      0.773 (0.0105)    <.0001      0.712 (0.0062)    <.0001
MLR1     0.847 (0.0050)    <.0001      0.762 (0.0106)    <.0001      0.727 (0.0061)    <.0001
MLR2     0.849 (0.0049)    <.0001      0.772 (0.0105)    <.0001      0.727 (0.0077)    <.0001
KDM1     0.853 (0.0049)    <.0001      0.774 (0.0104)    <.0001      0.743 (0.0059)    <.0001
KDM2     0.854 (0.0049)    <.0001      0.779 (0.0103)    <.0001      0.737 (0.0077)    <.0001
Note: Area under the ROC curve=AUC; Principal Component Analysis=PCA; Multiple Linear Regression with ten
variables=MLR1; Multiple Linear Regression with seven variables=MLR2; Klemera and Doubal method with ten
variables=KDM1; Klemera and Doubal method with seven variables =KDM2
Table 2.5 Individual Cox Proportional Hazard Models Containing CA and One of Five BAs
Model                Log Likelihood   Nagelkerke R2   CA HR (95% CI)         CA S.E.   BA HR (95% CI)         BA S.E.
Model 1 (BA=PCA)     -15,025.53       0.258           1.06 (1.06-1.07)***    0.003     1.03 (1.03-1.04)***    0.001
Model 2 (BA=MLR1)    -15,038.34       0.256           1.03 (1.02-1.04)***    0.004     1.07 (1.06-1.07)***    0.003
Model 3 (BA=MLR2)    -15,014.63       0.260           1.02 (1.01-1.03)***    0.004     1.08 (1.07-1.08)***    0.003
Model 4 (BA=KDM1)    -14,974.76       0.267           1.01 (1.01-1.02)***    0.004     1.08 (1.07-1.09)***    0.003
Model 5 (BA=KDM2)    -14,975.42       0.267           1.01 (0.99-1.02)       0.004     1.09 (1.08-1.09)***    0.004
N=9,439 and N of Events=1,843
Results for individual Cox Proportional Hazard Models are listed in Table 2.5. Overall, BA
estimated by KDM2 produced the most robust results. When BA and CA were both included in
the same model, the hazard ratio for the BA estimate, calculated using KDM2, was statistically
significant and found to be higher than the hazard ratio for CA (HR for KDM2 BA: 1.09, 95% CI:
1.08-1.09; HR for CA: 1.01, 95% CI: 0.99-1.02). Moreover, KDM2 had the most robust predictive
power of any of the BA estimates and was the only one that produced a null association between CA
and mortality. Although the results were not as robust, CA was also found to have less predictive
power than KDM1 (HR for KDM1 BA: 1.08, 95% CI: 1.07-1.09; HR for CA: 1.01, 95% CI: 1.01-1.02), MLR1
(HR for MLR1 BA: 1.07, 95% CI: 1.06-1.07; HR for CA: 1.03, 95% CI: 1.02-1.04), and MLR2 (HR for MLR2 BA:
1.08, 95% CI: 1.07-1.08; HR for CA: 1.02, 95% CI: 1.01-1.03); however, in all three models, CA
remained statistically significant. Finally, in the model that included CA and PCA, both were
statistically significant; however, CA was found to be more robust (HR for PCA BA: 1.03, 95% CI:
1.03-1.04; HR for CA: 1.06, 95% CI: 1.06-1.07).
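Because the hazard ratios are small in absolute terms, it can help to see how they compound. The sketch below assumes each hazard ratio is expressed per one-year increase in BA or CA and that the effect is log-linear, as in a standard Cox model; under those assumptions the hazard ratio for a difference of d years is the per-year ratio raised to the power d. The function name is illustrative.

```python
def hr_for_difference(hr_per_year, years):
    """Hazard ratio implied by a difference of `years`, given a per-year ratio."""
    return hr_per_year ** years


# Model 5 (KDM2) point estimates from Table 2.5.
hr_ba_kdm2 = 1.09
hr_ca = 1.01

ten_year_ba = hr_for_difference(hr_ba_kdm2, 10)
ten_year_ca = hr_for_difference(hr_ca, 10)
print(round(ten_year_ba, 2), round(ten_year_ca, 2))
```

Read this way, a ten-year difference in KDM2 biological age corresponds to more than a doubling of the mortality hazard at a fixed CA, whereas a ten-year difference in CA at a fixed BA corresponds to an increase of roughly ten percent.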
2.4 Discussion
The KDM algorithm, particularly when variables are chosen by PCA, performed the best
overall. KDM2 produced plausible estimates of BA and was a more reliable predictor of mortality
than CA or any of the other BA algorithms in multiple age cohorts. Furthermore, KDM2
outperformed CA when included in the same model—accounting for the entire association
between CA and mortality. The method of selecting variables through PCA to include in the
KDM2 estimate is in accordance with the model’s assumptions. Klemera and Doubal suggest that
all the biomarkers included in the algorithm be functionally uncorrelated and that factor analysis
(FA) or PCA be used to reach this goal (43).
While the current study provides evidence for the usefulness of the KDM algorithm for
estimating BA, there is potential for improvement. Advancements in technology and increasing
knowledge regarding the aging process have facilitated the identification and measurement of
more sophisticated and theoretically conceptual age-associated biomarkers, such as telomere
length, measures of oxidative damage, mitochondrial oxygen consumption, neuroendocrine
secretion levels, and cyclin-dependent kinase inhibitor 2A (p16Ink4a) expression (33, 51-53).
Furthermore, among the oldest-old, a number of physical performance and blood cell count
measures have been shown to be useful biomarkers of aging (54). The equation proposed by
Klemera and Doubal provides researchers with the potential to combine many of these distinct
biomarkers into a single measure, better capturing the complexity of the aging process.
The risks for a range of chronic conditions increase significantly over the lifespan, given
the diversity of structures and systems in which age-related degradation operates (30).
According to evolutionary theories of aging, particularly Kirkwood’s Disposable Soma theory
(55), living is accompanied by exposure to damaging properties, which innate protective and repair
mechanisms are set up to guard against. However, the degree of protection is optimized to increase
evolutionary fitness, and investment that surpasses reproductive needs is thus avoided. As a result,
some degree of damage is accumulated at varying structural and functional levels over the lifetime,
increasing as fecundity declines.
BA estimates are meant to measure an individual’s level of damage accumulation, and
when measured longitudinally, can be used to track the trajectory of damage over a period of time.
Consequently, reliable BA estimates may facilitate the investigation of a number of questions
related to the biology of aging. For instance, changes in BA should mirror changes in the rate of
aging as a result of genes or environmental conditions—such as caloric restriction, crowding, heat
shock, psychological stress, or exposure to reactive oxygen species (ROS)—which alter energy
allocation for maintenance and repair, or the degree of damaging properties (56-60). Finally, while
most experiments have relied on average or maximum lifespan to serve as a measure for studying
how various factors affect the rate of aging, BA allows for the investigation of alterations in the
rate of aging at points other than the end of life and may facilitate the identification of robust versus
frail individuals prior to death (61).
Although BA estimated by KDM2 was found to be a highly sensitive and specific predictor
of mortality, there are limitations within the current study that should be discussed. First, the use
of cross-sectional data means that mortality selection has changed the sample and this may
confound the results. However, to mitigate the potential bias, the sample was limited to those ages
75 and younger. Second, although other useful physiological measures exist, biomarkers used in
BA calculations were limited to those available in NHANES III. Third, biomarker data was
available for 75% of the NHANES sample ages 30-75, potentially resulting in a selection bias.
Given that subjects may not be missing at random, additional analyses were run to check for
differences between included and excluded participants. When compared to the analytic sample,
excluded individuals had a similar sex distribution, but were found to be an average of 3 years
older and were more likely to die between baseline and follow-up. Finally, although we used a
large nationally representative sample to generate the equation for BA, it may not be appropriate
when examining other populations with dissimilar environmental or genetic characteristics. For
this reason, more work is needed to identify a population that would be the most appropriate from
which to generate an equation for BA that represents human aging in general.
In a large representative sample, the algorithm proposed by Klemera and Doubal was able
to predict mortality better than more commonly used methods, such as PCA and MLR.
Furthermore, the estimates using KDM2 produced significantly more information regarding the
risk of mortality than is generated by CA alone. Given its ability to use a single measure to
combine a number of varying biomarkers, KDM accounts for the complexity of aging in its
measurement. In moving forward, BA estimates, similar to the one proposed by Klemera and
Doubal, may be useful phenotypic traits for examining behavioral, environmental, or heritable
factors that affect the heterogeneity of aging and lifespan. Finally, the development and validation
of a BA construct is valuable given its impact on our theoretical understanding of the aging
process, and may facilitate future development of preventative interventions with implications for
health and longevity.
Chapter 3: Evidence of Accelerated Aging
among African Americans and its
Implications for Mortality
3.1 Introduction
Race is linked to striking health disparities in the United States. Overall, blacks experience
death and disease much earlier in the life course than do whites, which may suggest that on average
blacks are aging faster (62). Because the progression of physiological deterioration that
accompanies aging may be strongly related to environmental factors (63), it is conceivable that the
various social, economic, mental, and physical factors encountered by many racial minorities
throughout their lives may be capable of causing an acceleration of the aging process.
A number of factors have been shown to contribute to racial differences in morbidity and
mortality: socioeconomic status (SES) (62, 64), neighborhood (65, 66), availability of quality
healthcare (67, 68), behaviors (69), and psychological stress (70). Over time, these factors have
the ability to get “under the skin” and alter physiological functioning (71). Blacks also experience
more discrimination, have less economic security and often live in worse neighborhoods, offering
fewer nutritional options, worse air quality, and less access to recreational activities (72-74). These
experiences may lead to higher levels of both physical and psychological stress with the potential
to cause a myriad of biological changes with implications for aging. Finally, the higher prevalence
of dangerous health behaviors, such as obesity, among blacks relative to whites (75) is also
believed to contribute to progressive breakdowns in biological tissues and systems, leading to
widening gaps in physiological function. Growing disparity in physiological functioning due to
the continual exposure to adverse conditions is the premise of the “Weathering Hypothesis”, which
suggests that the negative effects of exposure to hazardous physical, social, and economic
environments of socially disadvantaged racial groups accumulate over the lifespan and contribute
to premature health deterioration, which may be indicative of an acceleration of the biological
aging process (76).
The pace of age-related deterioration, potentially resulting from the accumulation of tissue
and cellular damage to molecules like DNA and proteins, may be strongly influenced by the
amount of wear and tear the body undergoes over time (77). As a result, individuals exposed to
hazardous environments may presumably age quicker, causing them to appear biologically older
at a given chronological age. In fact, previous research examining race differences in cumulative
biological risk has shown that on average, blacks have the same number of “high-risk” (indicated
using clinically established cutoffs) physiological indicators as whites who are significantly older
chronologically (76, 78).
The earlier onset of aging-related deteriorations in physiological functioning is believed to
also give rise to premature incidence of mortality. It has been reported that life expectancy for
blacks is about 5 years lower than for whites (79), a gap that corresponds to approximately 100,000 excess
deaths per year (80). Additionally, dramatic racial disadvantages have been found across multiple
domains of health. Even in middle-age, blacks have been shown to have significantly higher
prevalence of both fatal and non-fatal chronic conditions than whites (62) and being black is often
associated with earlier onset of many age-related chronic diseases (81). This shift in the age
curve—in which the onset of death and disease occurs earlier in life—is thought to explain why
racial disparities in mortality risks cross-over in late life, resulting from mortality selection at
earlier ages (82, 83). Taken together, this may suggest that a majority of those in the black
population may be aging faster than the white population in the U.S.
Biological age measures were developed to quantify multi-system age-related changes on
a physiological level and may be useful as proxies for the pace or extent of aging of an individual.
While the concept of combining multiple measures into a single variable to model the rate of aging
was proposed over fifty years ago, recent techniques have been found to be promising predictors
of aging-related health outcomes (28, 49). These measures utilize information from multiple
biomarkers to determine where an individual lies on an aging trajectory. Typically, the trajectory
is determined using a data driven approach that calculates age-associated differences in the various
markers within a large representative sample (37, 38, 40). As a result, biological age reflects the
chronological age which on average is characterized by the specified biological profile. For
example, someone with a biological age of 50 has the physiological functioning of the average
50-year-old within the population. An individual’s chronological age can be subtracted from
biological age to determine whether the pace of aging for an individual or group is accelerated (i.e.,
they are older biologically than they are chronologically). For this reason, although it is not an
actual marker of mortality risk, the concept of biological age may be useful for examining health
disparities, as it allows us to directly estimate the degree of aging, or the difference between
biological and chronological age, of disadvantaged groups, as well as compare biological age for
race groups at varying chronological ages.
Using data from the National Health and Nutrition Examination Survey (NHANES III),
this study examines 1) the racial difference in the pace of aging across ages and by ten-year age
groups, to determine if blacks are aging biologically faster than whites and whether disparities in
the pace of aging decline or cross-over in later life; and 2) whether these differences in the pace of
aging account for racial disparities in age-specific risks of all-cause mortality, cardiovascular
disease (CVD) mortality, and cancer mortality. Overall, we hypothesize that blacks will have
higher levels of accelerated aging compared to whites. However, these differences should decrease
with age since the most disadvantaged are selected out of the population earlier. Finally, we
hypothesize that racial differences in pace of aging will account for the higher mortality risk among
blacks.
3.2 Materials and Methods
3.2.1 Study Population
We use data from NHANES III, a nationally representative, cross-sectional study
conducted by the National Center for Health Statistics (NCHS) between 1988 and 1994. Data were
collected from at-home interviews and examinations taking place at a Mobile Examination Center
(MEC). Further details of recruitment, procedures, population characteristics and study design are
available through the Centers for Disease Control and Prevention. Our analytic sample (N=7,587)
was restricted to black and white subjects ages 30-89. Hispanics were excluded because, although
they have slightly higher life expectancy than whites, nativity is believed to be a major
factor in this observation, and there is evidence that these differences may be explained
by the “salmon hypothesis,” which suggests that many Hispanics may return to their country of
origin once they become ill and thus their mortality is not observed (78). Those over age 89 were
excluded given that NHANES III top-codes age at 90. Complete biomarker data was available for
approximately 70% of the age-eligible sample. However, excluded subjects were more likely to
be black, had lower education, were older, and were more likely to die between baseline and
follow-up.
3.2.2 Biological Age Measure
Our estimation was calculated using information for ten biomarkers—C-Reactive Protein
(CRP), Serum Creatinine, Glycosylated Hemoglobin (HbA1c), Systolic Blood Pressure, Serum
Albumin, Total Cholesterol, Cytomegalovirus Optical Density (CMV), Serum Alkaline
Phosphatase, Forced Expiratory Volume at 1 second (FEV1), and Serum Urea Nitrogen. These
markers were selected because they had been suggested as potential biomarkers of aging, used in
prior estimations of Biological Age using the NHANES III sample (28), or had been found to
significantly correlate with chronological age at r > 0.10. Together, these biomarkers provide an
indication of metabolic, cardiovascular, inflammatory, kidney, liver, and lung functioning.
Biological age was calculated in accordance with the method proposed by Klemera and
Doubal (43), which in Chapter 2 was shown to predict death more accurately than other
well-known Biological Age algorithms, such as Multiple Linear Regression and Principal Component
Analysis, and was found to be a better indicator of mortality risk than chronological age (28).
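For readers unfamiliar with the approach, the following is a minimal sketch of the Klemera and Doubal estimator as it is commonly written. It is not the code used in this study. Each biomarker j is regressed on chronological age to obtain a slope k_j, an intercept q_j, and a residual standard deviation s_j, and these are combined with chronological age itself. The function names, the synthetic biomarkers, and the value of s_ba (the assumed spread of biological age around chronological age) are illustrative assumptions.

```python
import numpy as np

def fit_biomarkers_on_age(x, ca):
    """Regress each biomarker column of x on chronological age (OLS).

    Returns slopes k, intercepts q, and residual SDs s, one per biomarker.
    """
    k, q = np.polyfit(ca, x, 1)              # row 0: slopes, row 1: intercepts
    resid = x - (q + np.outer(ca, k))
    s = np.sqrt((resid ** 2).sum(axis=0) / (len(ca) - 2))
    return k, q, s

def kdm_biological_age(x, ca, k, q, s, s_ba):
    """Combine biomarkers and chronological age into one age estimate.

    s_ba is an assumed SD of biological age around chronological age;
    its value here is hypothetical and not taken from the thesis.
    """
    num = ((x - q) * (k / s ** 2)).sum(axis=1) + ca / s_ba ** 2
    den = ((k / s) ** 2).sum() + 1.0 / s_ba ** 2
    return num / den

# Synthetic example: three made-up biomarkers that drift linearly with age.
rng = np.random.default_rng(0)
ca = rng.uniform(30, 75, size=200)
true_k = np.array([0.5, -0.02, 0.3])
true_q = np.array([100.0, 5.0, 60.0])
noise = rng.normal(0.0, [8.0, 0.4, 6.0], size=(200, 3))
x = true_q + np.outer(ca, true_k) + noise

k, q, s = fit_biomarkers_on_age(x, ca)
ba = kdm_biological_age(x, ca, k, q, s, s_ba=5.0)
acceleration = ba - ca                        # positive: biologically older than CA
```

In this sketch the mean of the estimated biological ages equals the mean chronological age by construction, because the regression residuals average to zero; individuals differ only in how far their biomarker profile sits from what is expected at their age.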
3.2.3 Sociodemographic Characteristics
Chronological age, race, sex, education, and smoking were based on self-reports. Subjects
were categorized into two race groups—Non-Hispanic White and Non-Hispanic Black. Education
was used as an indicator of SES. Reported school years completed were used to create four
education groups: <12 years (less than a high school education), 12 years (high school degree),
13-15 years (some college), and 16+ years (college degree). Next, three smoking groups were
created based on subjects’ answers to two questions: non-smokers (reported not having smoked at
least 100 cigarettes during their life time), former-smokers (not currently smoking but reported
having smoked at least 100 cigarettes during their life time), and current smokers. Finally, BMI
was calculated as measured weight (in kg) divided by measured height (in meters) squared, and
used to classify participants as underweight (BMI<18.5), normal weight (BMI 18.5-24.9),
overweight (BMI 25-29.9), and obese (BMI 30+).
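The classification rules above can be sketched as follows. This is an illustration only: the function and argument names are hypothetical, the BMI cut-points are treated as half-open intervals (so a value such as 24.95 is counted as normal weight), and a subject who has not smoked 100 cigarettes is treated as a non-smoker regardless of the second answer.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2

def bmi_category(value):
    """Map a BMI value to the four weight groups used in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obese"

def smoking_group(smoked_100_cigarettes, smokes_now):
    """Combine the two self-report answers into three smoking groups."""
    if not smoked_100_cigarettes:
        return "non-smoker"
    return "current smoker" if smokes_now else "former smoker"

example_bmi = bmi(70.0, 1.75)                 # about 22.9
example_group = bmi_category(example_bmi)     # "normal weight"
```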
3.2.4 Mortality
Mortality follow-up and person-months of follow-up were available for all participants
from linked records from the National Death Index through 2006. Information was provided for
113 potential underlying causes of death (UCOD-113), and used to code for all-cause mortality,
cardiovascular disease (CVD) mortality, and cancer mortality. For all mortality analyses, violent,
accidental or HIV related deaths were censored given that our study is concerned with age-related
mortality that can be linked to chronic diseases.
3.2.5 Statistical Analysis
Using OLS regression, we compared the biological ages of blacks and whites adjusting for
chronological age and sex, and then again after adjusting for chronological age, sex, and additional
covariates such as education, BMI and smoking. Results from these models were then used to
estimate adjusted mean biological age for the two groups. Biological ages of blacks and whites
were then compared in ten-year age groups to determine whether differences converged later in
life. Finally, we examined whether Biological Age could explain racial disparities in mortality
using Cox Proportional Hazard Models. First, models were run only controlling for chronological
age and sex to determine the association between race and mortality (baseline models). Next, we
reran models, adjusting for biological age to determine whether it mediated the association
between race and mortality, and then reran these models with the inclusion of covariates such as
education, BMI and smoking. Cause-specific mortality was examined using a competing-risks
framework. All analyses were run in STATA and used sample weights and appropriate survey
procedures for dealing with complex sampling design.
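The first step, comparing biological age by race with adjustment for chronological age and sex, can be illustrated with a weighted least-squares sketch. It is not the STATA analysis itself: it reproduces only weighted point estimates, not the design-based standard errors that survey procedures provide, and the data, weights, and coefficient values below are synthetic assumptions chosen so the answer is known.

```python
import numpy as np

def weighted_ols(X, y, w):
    """Weighted least squares: minimize sum_i w_i * (y_i - X_i @ b) ** 2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic, noise-free data with a built-in 3-year race gap.
rng = np.random.default_rng(0)
n = 500
ca = rng.uniform(30, 89, n)                   # chronological age
female = rng.integers(0, 2, n).astype(float)
black = rng.integers(0, 2, n).astype(float)
w = rng.uniform(0.5, 2.0, n)                  # stand-in for sample weights
ba = 1.0 + 1.0 * ca - 0.5 * female + 3.0 * black

X = np.column_stack([np.ones(n), ca, female, black])
beta = weighted_ols(X, ba, w)

# Adjusted means: predictions at the weighted-average age and sex mix.
xbar = np.average(X[:, :3], axis=0, weights=w)
mean_white = xbar @ beta[:3]
mean_black = mean_white + beta[3]
```

The adjusted means differ by exactly the race coefficient, which is why the adjusted difference and the regression estimate coincide in a model of this form.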
3.3 Results
3.3.1 Sample Description
As shown in Table 3.1, both chronological and biological age had means of 50.2 years;
however, as expected the standard deviation was slightly larger for biological age (15.5) than
chronological age (14.9). The sample is mostly made up of whites, with only about 11% blacks.
Approximately 10% of the sample never attended high school, one-eighth attended but did not
graduate from high school, one-third completed high school, one-fifth had some college
education, and just under one-quarter completed at least four years of college. Just over half of the subjects
are female (53%).
Table 3.1 Sample Characteristics for the Full Sample and by Race
Characteristic                    Full Sample    Whites         Blacks
                                  (N=7,587)      (N=4,851)      (N=2,736)
Biological Age, mean (s.d.)       50.2 (15.5)    50.2 (15.5)    50.0 (15.0)
Chronological Age, mean (s.d.)    50.2 (14.9)    50.6 (15.0)    47.0 (13.6)
Female (%)                        52.9           52.5           56.3
0-8 years schooling (%)           9.5            8.8            15.0
Some High School (%)              12.5           11.7           19.2
High School Degree (%)            34.8           34.7           35.6
Some College (%)                  20.0           20.1           18.6
College Degree (%)                23.2           24.6           11.6
Underweight (%)                   1.8            1.74           1.8
Normal BMI (%)                    38.6           39.5           31.4
Overweight (%)                    35.1           35.2           34.5
Obese (%)                         24.5           23.6           32.3
Former Smoker (%)                 30.8           32.2           19.8
Current Smoker (%)                27.6           26.6           36.1
All-Cause Mortality (%)           19.8           19.9           19.5
CVD Mortality (%)                 8.7            8.8            7.9
Cancer Mortality (%)              5.5            5.4            5.9
Person-Years (Total)              104,641        93,495         11,146
Overall, the majority of subjects had a normal BMI (38.6%), while 35.1% were overweight,
24.5% were obese, and approximately 2% were underweight. Over half the sample has a history
of smoking, with 31% reporting they were former smokers, and 28% reporting that they were
current smokers. Finally, over the 18-year follow-up, 20% of subjects died overall, 9% died from
CVD, and 5.5% died from cancer. The analysis is based on a total of 97,557 person-years of
exposure for 7,587 subjects.
3.3.2 Biological Age by Race
Adjusted means for Biological Age by race are shown in Table 3.2. When controlling for
chronological age and sex, blacks were found to have biological ages that were significantly higher
than whites (P<.001). On average, the biological age for blacks was 53.16 years, which was more
than 3 years higher than that of whites, who had an average biological age of 49.84 years. Next, in
addition to controlling for chronological age and sex, means were also adjusted for SES (as
measured by years of education), BMI, and smoking, to determine if these accounted for the
difference in biological age between blacks and whites. Results showed that even when controlling
for SES and health behaviors, blacks were still found to have higher biological ages than whites
(P<.001). Nevertheless, the difference was slightly reduced (2.83 years), with blacks having a
mean of 52.72 years compared to 49.89 years for whites.
Table 3.2 Mean Biological Age by Race (N=7,587)
Mean Biological Age (S.E.)
                       Adjusted for Chronological    Adjusted for Chronological Age,
                       Age and Sex                   Sex, SES, BMI, and Smoking
Non-Hispanic White     49.84                         49.89
Non-Hispanic Black     53.16                         52.72
Difference             3.32 (P<.001)                 2.83 (P<.001)
Overall, this suggests that on average blacks have the physiological functioning of whites who are
more than three years older chronologically. Furthermore, accounting for the effects of
SES, obesity, and smoking attenuated some of the difference; however, with chronological
age, SES, and health behaviors controlled, blacks were still biologically older than whites.
Figure 3.1: Racial Differences in Adjusted Mean BA by 10-year CA Groups
The difference in biological age between blacks and whites increased with chronological age
prior to the age of 70. Blacks in their thirties, forties, fifties, and sixties had biological ages
that were 2.28, 3.63, 4.59, and 4.82 years, respectively, higher than whites. However, for those
in their seventies and eighties racial differences in biological age decreased to 2.94 and 1.17,
respectively, and were no longer significant for persons ages 80-89. Models were adjusted for
age, sex, education, BMI, and smoking. Bars represent standard errors of adjusted means.
To test whether differences in biological age by race varied across the age range, we
compared adjusted means for Biological Age by race within ten year age groups (controlling for
chronological age and sex). As shown in Figure 3.1, race differences in biological age increased
up until ages 60-69—when blacks were found to be 4.82 years older biologically than whites
(P<.001). However, after ages 60-69 the difference in biological age between blacks and whites
steadily declined until there was only a 1.17 year difference (P=.054) for those 80-89 years old.
3.3.3 Mortality Disparities and Biological Aging
Cox proportional hazard models, controlling for chronological age and sex, were run to
determine the overall mortality and disease-specific mortality risks associated with being black.
As shown in Table 3.3, subjects who are black were 46% more likely to die overall (HR: 1.46,
95%CI: 1.30-1.65), 40% more likely to die from CVD (HR: 1.40, 95%CI: 1.18-1.67), and 51%
more likely to die from cancer (HR: 1.51, 95%CI: 1.24-1.85) when compared to subjects who are
white. Additionally, a one year increase in chronological age was associated with an 11% increase
in all-cause mortality (HR: 1.11, 95%CI: 1.10-1.11), a 12% increase in CVD mortality (HR: 1.12,
95%CI: 1.11-1.13), and an 8% increase in cancer mortality (HR: 1.08, 95%CI: 1.07-1.09). This
suggests that the increased risk of mortality associated with being black is equivalent to the risks
associated with a 3-6 year increase in chronological age.
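The text does not state how the 3-6 year range was obtained. One reading, assuming proportional hazards so that log-hazards add, is to divide the log hazard ratio for race by the log hazard ratio per year of chronological age. The sketch below applies that reading to the estimates quoted above; the function name is hypothetical and confidence intervals are ignored.

```python
import math

def equivalent_years(hr_group, hr_per_year):
    """Years of age whose compounded per-year hazard ratio equals hr_group."""
    return math.log(hr_group) / math.log(hr_per_year)

# (HR for being black, HR per one-year increase in chronological age)
estimates = {
    "all-cause": (1.46, 1.11),
    "CVD": (1.40, 1.12),
    "cancer": (1.51, 1.08),
}
years = {cause: equivalent_years(*hrs) for cause, hrs in estimates.items()}
```

Under these assumptions the three outcomes give values between roughly 3 years (CVD) and a little over 5 years (cancer), in line with the range reported in the text.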
When models were run controlling for biological age, it completely accounted for racial
disparities in all-cause, CVD, and cancer mortality. While a one year increase in biological age
was associated with having an 11% greater risk for all-cause mortality (HR: 1.11, 95%CI: 1.09-
1.12), being black no longer significantly increased the risk of all-cause mortality (HR: 0.97,
95%CI: 0.85-1.11). Similarly, being one year older biologically was associated with an 11%
increase in the risk of CVD mortality (HR: 1.11, 95%CI: 1.10-1.13) and a 6% increase in the risk
of cancer mortality (HR: 1.06, 95%CI: 1.04-1.08); however, there were no longer significant
differences in the risk of either CVD or cancer mortality for blacks relative to whites.
Table 3.3: BA Mediates Racial Disparities in All-Cause, CVD and Cancer Mortality (N=7,587)
Hazard Ratio (95% Confidence Interval)

All-Cause Mortality
                                  Model 1             Model 2             Model 3
Black                             1.46 (1.30-1.65)    0.97 (0.85-1.11)    0.97 (0.85-1.11)
Chronological Age (years)         1.11 (1.10-1.11)    1.00 (0.99-1.01)    1.02 (1.01-1.03)
Sex (Female=1)                    0.67 (0.60-0.75)    0.54 (0.48-0.60)    0.58 (0.51-0.66)
Biological Age (years)            --                  1.11 (1.09-1.12)    1.10 (1.09-1.11)
Education (Reference=College)
  0-8 Years Education             --                  --                  1.36 (1.11-1.66)
  Some High School                --                  --                  1.48 (1.21-1.82)
  Completed High School           --                  --                  1.42 (1.17-1.72)
  Some College                    --                  --                  1.29 (1.04-1.60)
BMI (Reference=Normal)
  Underweight                     --                  --                  1.89 (1.36-2.60)
  Overweight                      --                  --                  0.85 (0.75-0.97)
  Obese                           --                  --                  0.93 (0.79-1.08)
Smoking (Reference=Never Smoker)
  Former Smoker                   --                  --                  1.28 (1.13-1.47)
  Current Smoker                  --                  --                  2.04 (1.73-2.40)

CVD Mortality
                                  Model 1             Model 2             Model 3
Black                             1.40 (1.18-1.67)    0.91 (0.75-1.11)    0.90 (0.74-1.11)
Chronological Age (years)         1.12 (1.11-1.13)    1.01 (0.99-1.03)    1.02 (1.01-1.04)
Sex (Female=1)                    0.61 (0.52-0.72)    0.49 (0.41-0.57)    0.49 (0.41-0.59)
Biological Age (years)            --                  1.11 (1.10-1.13)    1.11 (1.09-1.12)
Education (Reference=College)
  0-8 Years Education             --                  --                  1.32 (0.96-1.81)
  Some High School                --                  --                  1.46 (1.05-2.01)
  Completed High School           --                  --                  1.42 (1.06-1.92)
  Some College                    --                  --                  1.38 (0.99-1.93)
BMI (Reference=Normal)
  Underweight                     --                  --                  1.30 (0.80-2.09)
  Overweight                      --                  --                  0.87 (0.71-1.06)
  Obese                           --                  --                  1.01 (0.81-1.27)
Smoking (Reference=Never Smoker)
  Former Smoker                   --                  --                  1.05 (0.86-1.27)
  Current Smoker                  --                  --                  1.59 (1.26-2.02)

Cancer Mortality
                                  Model 1             Model 2             Model 3
Black                             1.51 (1.24-1.85)    1.20 (0.97-1.50)    1.24 (0.99-1.55)
Chronological Age (years)         1.08 (1.07-1.09)    1.02 (0.99-1.04)    1.04 (1.02-1.07)
Sex (Female=1)                    0.65 (0.52-0.80)    0.57 (0.46-0.72)    0.69 (0.54-0.88)
Biological Age (years)            --                  1.06 (1.04-1.08)    1.05 (1.02-1.07)
Education (Reference=College)
  0-8 Years Education             --                  --                  1.19 (0.79-1.78)
  Some High School                --                  --                  1.30 (0.87-1.94)
  Completed High School           --                  --                  1.39 (0.96-2.02)
  Some College                    --                  --                  1.20 (0.80-1.80)
BMI (Reference=Normal)
  Underweight                     --                  --                  1.21 (0.49-2.97)
  Overweight                      --                  --                  0.92 (0.71-1.18)
  Obese                           --                  --                  1.06 (0.79-1.42)
Smoking (Reference=Never Smoker)
  Former Smoker                   --                  --                  1.99 (1.51-2.63)
  Current Smoker                  --                  --                  3.41 (2.51-4.63)
Furthermore, chronological age was no longer significantly associated with all-cause, CVD
or cancer mortality. Even when SES, BMI and smoking were included in the hazard models,
biological age was found to be a strong predictor of all-cause, CVD and cancer mortality, while
race remained non-significant. Finally, we examined the interaction between race and biological
age and also ran race-stratified mortality models to determine whether the association between
biological age and mortality differed for blacks and whites. Interactions between race and
biological age were not statistically significant for any of the mortality outcomes. The stratified
models also showed very few race differences across the three mortality outcomes. For instance,
a one-year increase in biological age was associated with a 10% increase in all-cause mortality for
whites, and an 8% increase in all-cause mortality for blacks. For CVD and cancer mortality, the
risks for whites increased by 11% and 5%, respectively, with every one-year increase in biological
age, while the risks for blacks increased by 9% and 3%, respectively, with every one-year increase
in biological age.
3.4 Discussion
Our results suggest that being black is associated with significantly higher biological age
and that this is a pathway to early death overall, and from CVD or cancer. We have long known
that race is linked to earlier mortality and morbidity. Life expectancy at age 25 is about 5-6 years
lower for U.S. blacks than it is for whites (79). Furthermore, disease incidence has also been found
to occur significantly earlier for blacks (81). However, this study is novel in offering evidence that
racial differences in the pace of aging—as signified by biological age—may be a central
mechanism for the earlier overall and disease-specific mortality of black individuals.
Given that the physiological changes associated with the aging process lead to an increase
in susceptibility to disease onset and death (30), mortality and morbidity are likely to occur
significantly earlier for individuals who are aging faster. Our results showed that on average blacks
tend to be more than 3 years older biologically than whites. This is consistent with findings from
previous studies reporting that blacks tend to have levels of biological risk factors that are
indicative of someone significantly older chronologically (76)—providing further evidence that
the pace of aging may be accelerated.
Everyday stressors associated with being black may negatively impact physiological
functioning and, under chronic exposure, accumulate over the lifespan and contribute to growing
disparities in biological risk. Furthermore, if such environmental, behavioral and mental factors
contribute to an acceleration of the aging process, we would expect that persons who are aging the
fastest should have the highest risk of mortality and thus be selected out of the population at
younger ages due to their lower life expectancy. The presence of mortality selection should also
lead to a convergence in biological risk between advantaged and disadvantaged populations (84).
This is consistent with the current study, which showed that racial disparities in biological age
systematically varied across the age range. We found that with increasing age, the gap in biological
age between blacks and whites widened up until participants were nearing old age, after which
point it began to converge. This suggests that the most disadvantaged blacks may be accumulating
poorer and poorer health as they age; however, as those who are worst off are selected out of the
population by mortality, disparities between blacks and whites steadily decrease. Similar findings
have been shown for differences in biological risk by SES (85).
We also showed that differences in biological age completely accounted for the increased
risk of all-cause, CVD, and cancer mortality experienced by blacks, thus implying that the black
participants with the highest biological ages may also be the ones contributing to the increased
mortality risk among the black population. This is supported by our result showing no interaction
between race and biological age. Higher biological age was associated with higher mortality risks
among blacks and whites, suggesting that biological age operates similarly for both races. Finally,
when comparing the mortality between blacks and whites who have equivalent biological ages, no
significant differences are present.
While not a direct marker of mortality risk, the concept of biological age may do a good
job estimating an individual’s degree of physiological decline and dysregulation. Conventionally,
cumulative risk scores or allostatic load have been used to examine disparities in physiological
function (76, 86, 87). However, racial differences in such measures have only been able to account
for part of the association between race and mortality (88). Given that cumulative risk or
allostatic load measures rely on counts of biomarkers for which a subject falls into a “high risk”
category (89), they may lose some of the information that could be gained from continuous
measures. For instance, aging-related breakdowns within various physiological systems tend to
increase progressively over the lifespan, and as a result, race differences in the timing and age
patterns of disease and mortality may be better explained by continuous measures that are more
highly associated with the pace of biological aging.
Given that physiological declines associated with social inequalities are believed to
accumulate and build over the lifecourse, biological aging may be a useful measure for studying
health disparities, particularly from a cumulative disadvantage framework. The theory of
cumulative disadvantage describes how disadvantage, beginning in early life, or in prior
generations, may intensify over time, leading to a divergence in health among various social
groups (90). While measures of allostatic load or cumulative risk based on clinical cut-points
capture dysregulation after it has reached a critical point, they ignore the process of decline leading
up to it, as well as the continuous progression of decline thereafter. From a life course perspective,
measures of biological age may allow us to examine the shape of aging trajectories at different
stages in life, to determine the role of early life or prenatal conditions, as well as the accumulation
of disadvantage over time.
Additionally, differences in biological age could also reflect genetic differences between
groups, or gene by environment interactions. For instance, populations with different ancestries
may possess different frequencies of protective or risk alleles. As a result, biological age measures
may serve as a useful phenotype for examining the genetics of human aging. Additionally, they may also
allow us to identify resilient persons—those who appear biologically younger than expected.
Ultimately, this could facilitate our ability to study how genetic or environmental factors enable
some individuals to cope with disadvantage.
There are limitations in the present study that should be acknowledged. First, biomarker
data for NHANES respondents were only available for a single time point, preventing us from
looking at trajectories of biological age. Next, due to missing biomarker data, our analytic sample
only included approximately 70% of NHANES participants ages 30-89 who were Non-Hispanic
black or Non-Hispanic white. Overall, those excluded from our analysis were 54% more likely to
be black, were 7 years older, had one less year of schooling, and were 2.5 times as likely to die. As a
result, our estimates of race differences are likely to be somewhat conservative.
The findings presented here provide evidence that blacks may be aging at a faster pace than
whites and that biological aging is an important factor in explaining racial disparities in overall,
CVD, and cancer mortality. In moving forward, the use of biological age may allow us to examine
how social, behavioral, environmental, and economic factors contribute to health disparities, and
how these disparities are affected by the accumulation of disadvantage over the life course.
Chapter 4: Is 60 the New 50? Examining Changes in Biological Aging over the Past Two Decades
4.1 Introduction
Life expectancy has been rapidly increasing over the past sixty years (91). Because
mortality schedules are often used to estimate the rate of aging, researchers have taken this to
signify that the pace of aging may be slowing (6, 92). However, mortality may not always serve as
a reliable proxy for how fast an individual or population is aging on a biological level. For instance,
medical interventions and treatments that are enacted after a diagnosis has occurred have the
potential to extend lifespan while not affecting the pace of aging-related physiological decline (93).
In such instances, the extra years of life gained may not be coupled with an extension of healthy
lifespan or a compression of morbidity and therefore, are probably not reflective of a deceleration
of the aging process (94). As a result, it may be more beneficial to try to estimate biological aging
directly, in order to determine whether the pace of aging has slowed in a given population.
Biological age measures were developed to quantify the degree of physiological aging an
individual has undergone (31, 95), and as a result, may enable us to more accurately estimate
changes in the pace of aging for different historical periods. Measures of biological age combine
information from multiple physiological systems to estimate where an individual is on the aging
trajectory (28, 43). For instance, an individual with a biological age of fifty is estimated to have a
physiological status which characterizes someone who is fifty years old chronologically.
Furthermore, if the given individual is forty years old chronologically and fifty years old
biologically, it may suggest that he/she is aging at an accelerated rate. As a result, biological age
measures have been found to be reliable predictors of both morbidity and mortality and are thought
to provide more precise estimates of an individual’s true remaining life expectancy (28, 49).
The pace of biological aging is believed to be strongly influenced by environmental factors,
genetic characteristics, and some level of stochasticity (15, 96). While for the most part, the
genetic makeup of a population does not change much from year to year (97) and therefore would
not lead to differences in the pace of aging over a short period of time, environments and behaviors
do change more rapidly, and as a result may lead to changes in how quickly populations age. For
instance, there have been a number of recent changes in the prevalence of smoking and obesity that
could have altered the pace of aging over the past few decades. Smoking prevalence has decreased
dramatically since the 1980s, which has the potential to cause the pace of aging to decelerate (98).
On the other hand, during this time obesity rates have more than doubled, which may counteract
the declines in smoking, causing an acceleration of the aging process (99).
Nevertheless, there is evidence that the changes in smoking and obesity prevalence have
not been equivalent across different subpopulations—for instance for men and women. In the mid
to late twentieth century, smoking contributed to significantly more excess deaths for men than for
women (100). However, the prevalence of smoking among the sexes has started to equalize and as
a result, the degree of deceleration in the pace of aging that can be attributed to smoking cessation
should be greater for males. This is one explanation for the decrease in the longevity gender gap
since the 1980s, with males gaining more additional years of life than females (101). Additionally,
mean Body Mass Index (BMI) has increased faster for females than for males. BMI was higher for
males during the latter half of the 20th century; however, it is estimated that between 1994 and
1999, the average BMI of females surpassed that of males (102).
Finally, the use of pharmaceutical drugs to control blood pressure and cholesterol has also
increased substantially over the past few decades, potentially attenuating some of the age-related
declines in physiological functioning related to cardiovascular and metabolic health. For instance,
over the decade between 1988 and 2000 both treatment and control of hypertension significantly
increased among US adults (103). Additionally, statin use to control high cholesterol increased by
almost ten-fold, from about 2% to 25%, among US adults ages 45 and over. Nevertheless, the use
of such drugs has been more common among men than women. While the number of Americans
on antihypertensive medication grew between 1988-1991 and 1999-2000, most of this growth was due to
increases among males. Furthermore, among adults ages 65-74 in 2005-2008 approximately half
of men, but only one-third of women used statins over the past month.
Using nationally representative data, the goal of this paper is to examine how biological
aging changed between 1988 and 2010 for U.S. males and females, while also estimating the
contribution of changes in smoking, obesity, and medication use.
4.2 Materials and Method
4.2.1 Study Population
Our analytic sample included 21,575 subjects ages 20-79 from the third and fourth waves
of the National Health and Nutrition Examination Survey: NHANES III (1988-1994) and NHANES IV
(2007-2010). Data for NHANES were collected from at-home interviews as well as examinations which
took place at a Mobile Examination Center (MEC). Complete biomarker data were available for
approximately 70% of the age-eligible sample. Excluded subjects were older, more likely to be
black, and had lower levels of education.
4.2.2 Biological Age Measure
Our biological age estimation was calculated using information on eight factors that
collectively indicate metabolic, cardiovascular, inflammatory, kidney, liver, and lung functioning.
These biomarkers include glycosylated hemoglobin, total cholesterol, systolic blood pressure, the ratio
of forced expiratory volume at 1 second (FEV1) to forced vital capacity, serum creatinine, serum
alkaline phosphatase, serum albumin, and C-reactive protein (CRP). These markers have been used
in prior estimations of biological age and have been found to correlate significantly with
chronological age at r > 0.10 (28).
Biological age was estimated using an algorithm proposed by Klemera and Doubal (43). In
Chapter 2, estimates from this method were shown to predict mortality more accurately than
chronological age or biological age estimated using other methods (28). To reiterate, the
calculation produces estimates that are linearly related to chronological age, with a slope of 1,
intercept of 0, and residual deviation. As a result, for the population, mean biological age should
equal mean chronological age. The equation used to calculate biological age (Eq. 4) combines
information on the participants’ measured biomarker values (x_j), as well as the slope (k_j), intercept
(q_j), and root mean squared error (s_j) from the equation of chronological age regressed on each
biomarker. The equation also incorporates information on the variance (S²_BA) of the
random variable, R_BA, which represents the difference between participants’ biological and
chronological ages. Its calculation takes into account the variability in the first half of the equation,
the mean variance of the biomarkers that is explained by chronological age, the range of
chronological age, and the number of biomarkers included in the analysis.
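The combination step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this study. It assumes the form of the estimator published by Klemera and Doubal in which chronological age enters as an additional marker, and it assumes that k_j, q_j, and s_j come from regressing each biomarker on chronological age. The function name, the argument names, and the choice to pass the variance of R_BA in as a precomputed number (s2_ba) are all hypothetical; the estimation of that variance is not shown.

```python
from typing import Sequence


def kd_biological_age(
    x: Sequence[float],
    k: Sequence[float],
    q: Sequence[float],
    s: Sequence[float],
    chron_age: float,
    s2_ba: float,
) -> float:
    """Klemera-Doubal style biological age with chronological age as an extra marker.

    x[j]   measured value of biomarker j
    k[j]   slope of biomarker j on chronological age (assumed direction of regression)
    q[j]   intercept of that regression
    s[j]   root mean squared error of that regression
    s2_ba  variance of R_BA, supplied by the caller (its estimation is not shown here)
    """
    if not (len(x) == len(k) == len(q) == len(s)):
        raise ValueError("x, k, q and s must have the same length")
    if s2_ba <= 0:
        raise ValueError("s2_ba must be positive")

    # Each biomarker contributes its deviation from the intercept, weighted by k_j / s_j^2.
    numerator = sum((xj - qj) * kj / sj**2 for xj, kj, qj, sj in zip(x, k, q, s))
    # Normalising term: sum of (k_j / s_j)^2 across biomarkers.
    denominator = sum((kj / sj) ** 2 for kj, sj in zip(k, s))

    # Chronological age is added as one more marker with weight 1 / s2_ba.
    numerator += chron_age / s2_ba
    denominator += 1.0 / s2_ba
    return numerator / denominator
```

With made-up regression parameters, a person whose biomarkers sit exactly on the regression lines for age 50 receives a biological age of 50, while raising one marker above its expected value pushes the estimate above the chronological age.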
4.2.3 Behavioral Characteristics
Smoking status was self-reported in responses to two questions—“Have you smoked at least
100 cigarettes during your lifetime?” and “Do you currently smoke cigarettes?” Based on their
answers to these two questions, participants were classified as current smokers if they answered
yes to both questions; former smokers if they reported smoking at least 100 cigarettes during their
lifetime, but did not currently smoke; and never smokers if they answered no to both questions.
Body mass index (BMI) was calculated as measured weight (kg) divided by measured height
(meters) squared. Participants with a BMI between 25 and 29.9 were classified as overweight, while
those with a BMI of 30 or above were classified as obese.
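The classification rules above can be written out as a short Python sketch. The function names and category labels are hypothetical. Two choices are assumptions not stated in the text: a BMI between 29.9 and 30 is treated as overweight, and the combination of never having smoked 100 cigarettes while currently smoking is returned as unclassified.

```python
from typing import Optional


def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify BMI (kg / m^2) using the cut-points described in the text."""
    bmi = weight_kg / height_m**2
    if bmi >= 30:
        return "obese"
    if bmi >= 25:
        # Assumption: values between 29.9 and 30 are grouped with overweight.
        return "overweight"
    return "normal"  # everything below 25; the text does not separate underweight


def smoking_status(ever_100_cigarettes: bool, currently_smokes: bool) -> Optional[str]:
    """Combine the two self-report questions into current / former / never."""
    if ever_100_cigarettes and currently_smokes:
        return "current"
    if ever_100_cigarettes:
        return "former"
    if not currently_smokes:
        return "never"
    # Assumption: an inconsistent answer pattern is left unclassified.
    return None
```

For example, `bmi_category(85, 1.75)` falls in the overweight group and `smoking_status(True, False)` is a former smoker.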
4.2.4 Medications
Medication use was determined using self-reports. NHANES participants were asked
whether or not they were currently taking prescribed medication for 1) high blood pressure and 2)
high cholesterol. Using their answers to these two questions, we recalculated biological age using
imputed values for systolic blood pressure and total cholesterol. If participants reported that they
were currently taking prescribed medication for high blood pressure, their biological age was
recalculated substituting 140 for their systolic blood pressure, if the level they had when measured
was ≤140. If participants answered “yes” that they were currently taking prescribed medication for
high cholesterol, their biological age was recalculated substituting 200 for their total cholesterol, if
the level they had when measured was ≤200. Those reporting that they took both medications had
their biological age recalculated using imputed levels for both systolic blood pressure and total
cholesterol.
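The substitution rule described above can be illustrated with a small Python sketch. The function and constant names are hypothetical, and the inputs are assumed to be in the same units as the measured values, since units are not stated here. Only the two biomarker values are adjusted; recalculating biological age from them is a separate step.

```python
from typing import Tuple

SBP_CUTOFF = 140.0  # substituted systolic blood pressure for treated participants
TC_CUTOFF = 200.0   # substituted total cholesterol for treated participants


def impute_for_medication(
    sbp: float,
    total_chol: float,
    on_bp_med: bool,
    on_chol_med: bool,
) -> Tuple[float, float]:
    """Return (systolic blood pressure, total cholesterol) after the substitution rule.

    A treated participant whose measured value is at or below the cut-off is
    assigned the cut-off; values above the cut-off are left as measured.
    """
    if on_bp_med and sbp <= SBP_CUTOFF:
        sbp = SBP_CUTOFF
    if on_chol_med and total_chol <= TC_CUTOFF:
        total_chol = TC_CUTOFF
    return sbp, total_chol
```

A participant on blood pressure medication with a measured systolic pressure of 125 would be assigned 140, while an untreated participant with the same reading keeps 125.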
4.2.5 Sociodemographic Characteristics
Race, education, sex, and chronological age were self-reported. Subjects were categorized
into four race/ethnicity groups: Non-Hispanic White, Non-Hispanic Black, Hispanic, and Other.
Years of schooling, reflecting the highest grade attended, was used to create four education
groups: those with less than 12 years of schooling, those with exactly 12 years of schooling, those
with 13-15 years of schooling, and those with 16 or more years of schooling. A dummy variable for
sex was created with males coded as 0 and females coded as 1. Finally, subjects were categorized
into three twenty-year age categories based on whether they were young (20-39), middle-aged
(40-59), or old (60-79).
4.2.6 Statistical Analysis
All analyses were run controlling for covariates such as chronological age, race/ethnicity,
and education. Ordinary Least Squares (OLS) regression, with the sample stratified into 20-year
age categories, was used to model biological age as a function of time period (1988-1994 and
2007-2010), sex, and their interaction, to determine 1) what the biological ages
were for young, middle-aged, and older males and females during the two time periods, 2) whether
persons in period 2 were biologically older or younger than those in period 1, and 3) whether
period differences in biological age were similar for both sexes (Eq. 15). Next, interactions with
obesity and smoking were sequentially added into sex- and age-stratified OLS models of the
association between biological age and period to examine whether changes in the levels and effects
of smoking and obesity could partially account for changes in biological aging (Eq. 16, 17). Finally,
Eq. 15 was rerun using the biological age measure with imputed levels for participants on
antihypertensive and cholesterol-lowering medications, to determine how much improvement in
biological aging could be attributed to increased pharmaceutical use.
(14)
(15)
(16)
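As a reading aid, the models described in Section 4.2.6 could take forms like the following. This is only a sketch consistent with the verbal description, not the exact specification used in the analysis. The coefficient symbols, the covariate vector Z_i (chronological age, race/ethnicity, education), and the indicator names are assumptions introduced for illustration.

```latex
% Within each 20-year age stratum: period, sex, and their interaction
BA_i = \beta_0 + \beta_1\,\mathrm{Period}_i + \beta_2\,\mathrm{Female}_i
     + \beta_3\,(\mathrm{Period}_i \times \mathrm{Female}_i)
     + \gamma^{\top} Z_i + \varepsilon_i

% Within each sex-by-age stratum: adding BMI category and its interaction with period
BA_i = \beta_0 + \beta_1\,\mathrm{Period}_i + \delta^{\top}\mathrm{BMI}_i
     + \theta^{\top}(\mathrm{Period}_i \times \mathrm{BMI}_i)
     + \gamma^{\top} Z_i + \varepsilon_i

% Within each sex-by-age stratum: adding smoking status and its interaction with period
BA_i = \beta_0 + \beta_1\,\mathrm{Period}_i + \lambda^{\top}\mathrm{Smoke}_i
     + \phi^{\top}(\mathrm{Period}_i \times \mathrm{Smoke}_i)
     + \gamma^{\top} Z_i + \varepsilon_i
```

Under this reading, the period coefficient in the first form gives the period difference for males, and the interaction coefficient tests whether that difference is the same for females. Comparing the period coefficient across the three forms indicates how much of the period difference is associated with changes in BMI or smoking.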
4.3 Results
4.3.1 Sample Description
As shown in Table 4.1, both chronological and biological age had means of 43.9 years; however,
as expected, the standard deviation was slightly larger for biological age (17.0) than for
chronological age (15.5). Overall, the sample is mostly made up of whites (74.3%), with only about
10% Non-Hispanic black, 11% Hispanic, and 4.4% other. Approximately 21% of the sample never
completed high school, 30% had a high school degree or GED, 25% had some college education,
and another 25% completed at least four years of college. Just over half of participants are female
(50.6%). The largest share of subjects had a BMI less than 25 (39%), while 33.4% were
overweight and 27.4% were obese. Just over half the sample had a history of smoking, with 25.1%
reporting they were former smokers and 26.4% reporting that they were current smokers.
Approximately 15.8% of participants reported being on antihypertensive medications and 7.5%
reported being on cholesterol-lowering medications. Finally, 59.5% of participants took part in
NHANES between 1988 and 1994, while the other 40.5% took part in NHANES between 2007
and 2010.
Table 4.1 Sample Characteristics (N=21,575)
Characteristic                                  Statistic
Chronological Age, mean (s.d.)                  43.91 (15.5)
Biological Age, mean (s.d.)                     43.91 (17.0)
Female (%)                                      50.6
Race/Ethnicity (%)
    Non-Hispanic White                          74.3
    Non-Hispanic Black                          10.1
    Hispanic                                    11.2
    Other                                       4.4
Education (%)
    <12 Years                                   21.0
    High School Degree/GED                      29.9
    Some College                                24.7
    College Degree                              24.5
BMI (%)
    Overweight                                  33.4
    Obese                                       27.4
Smoking (%)
    Former                                      25.1
    Current                                     26.4
Taking Anti-Hypertensive Medication (%)         15.8
Taking Cholesterol-lowering Medication (%)      7.5
Period (%)
    1 (1988-1994)                               59.5
    2 (2007-2010)                               40.5
Figure 4.1 Changes in biological age between period 1 and period 2 by sex and age.
Although there were decreases in biological age for all sex by age groups (P<.001), sex differences
were more pronounced at younger ages (fig. 1a). Furthermore, for adults ages 20-39 the decreases
for males were significantly greater than the decreases for females (P=.033). Finally, for adults
ages 40-59 and 60-79 (fig. 1b, 1c), there were no significant sex differences in the decrease between
period 1 and period 2.
4.3.2 Period Differences in Biological Age
Biological age decreased for all age and sex groups over the twenty year time period (Figure
4.1). Additionally, during both periods, sex differences in biological age were larger at younger
ages. Among participants ages 20-39, males had biological ages of 31.8 years and 30.5 years for
period 1 and period 2, respectively, which were significantly higher than the biological ages of
women during period 1 (28.3 years) and period 2 (27.7 years). Nevertheless, while males and
females, ages 20-39, had significant decreases in biological age, the decrease for males was
significantly larger (p=.003), contributing to a reduction in the gender gap over time.
No sex differences in biological age were found for subjects age 60-79 at either time period.
However, both males and females had similar reductions in biological age of about four years. For
males, mean biological age was 70.3 years at period 1 and 66.0 years at period 2, while for females,
mean biological age was 69.0 years at period 1 and 65.4 years at period 2.
Figure 4.2 Changes in the frequency of smoking and obesity between period 1 and 2
The prevalence of obesity significantly increased for males and females of all ages (panel a).
Conversely, the prevalence of current smoking was significantly reduced for males of
all ages, as well as younger and older females.
4.3.3 Smoking, Obesity, and Biological Aging
Based on predicted probabilities, controlling for age, race/ethnicity, and education, the
prevalence of current smoking and obesity were estimated for each age by sex category (Figure
4.2). Among both males and females, obesity was significantly higher for all age groups in period
2 compared to period 1. From 1988-1994 until 2007-2010, obesity prevalence increased for males
from 14.8% to 30.9% for those ages 20-39, from 26.0% to 35.5% for those ages 40-59, and from 22.8% to
40.4% for those ages 60-79. For females, the prevalence of obesity increased from 20.1% to 33.3% for
20-39 year olds, from 29.7% to 36.8% for 40-59 year olds, and from 26.7% to 42.6% for 60-79 year olds.
Among males, the proportion of current smokers significantly decreased for all age groups,
with the largest decreases taking place among middle-aged men. In 1988-1994, current smokers
made up 35.9%, 32.9%, and 17.2% of the male population ages 20-39, 40-59, and 60-79,
respectively. However, by 2007-2010, current smokers accounted for 34%, 25.8%, and 14.6% of
males ages 20-39, 40-59, and 60-79, respectively. Among females, the prevalence of current
smoking decreased for those ages 20-39 (from 30.4% to 25.7%) and for those ages 60-79 (from
14.4% to 11.4%) between the two time periods. However, there was no change in current smoking
for females ages 40-59, for whom the prevalence of smoking was 22.5% in both time periods.
The associations between biological age and smoking and BMI, after adjusting for covariates
such as sex, chronological age, race/ethnicity, and education, are shown in Figure 4.3. Both smoking
and BMI were associated with significant increases in biological age. Furthermore, when
considering them simultaneously, results suggest that they have an additive effect on biological age.
Compared to never smokers with a normal BMI, never smokers who were overweight were 1.2
years older biologically, while never smokers who were obese were 2.4 years older biologically.
Similarly, former smokers who were normal weight were about half a year older biologically, while
current smokers who had normal BMIs were 1.2 years older biologically. Finally, compared to
normal weight participants who had never smoked, overweight former smokers were 1.6 years
older biologically, overweight current smokers were 2.4 years older biologically, obese former
smokers were 2.8 years older biologically, and obese current smokers were over 3.7 years older
biologically.
Figure 4.3 Additions to biological age related to smoking and obesity
Biological age was increased for current and former smokers, as well as participants who were
overweight or obese. Furthermore, there appeared to be an additive effect between smoking and
BMI. On average, subjects who were current smokers and obese had the highest biological age,
which was almost 4.5 years more than the biological age of normal weight individuals who had
never smoked.
4.3.4 Changes in BMI and Smoking Explain Changes in Biological Age
The association between changes in BMI/smoking prevalence and changes in biological
age are shown in Figure 4.4. For younger adults, reductions in the prevalence of smoking did not
contribute to the decreases in biological age between period 1 and period 2; however, increases in
BMI during this time counteracted the decrease in biological age. Overall, males and females ages
20-39 were 1.28 and 0.64 years younger biologically in period 2 compared to period 1,
respectively. However, when controlling for differences in the levels and effects of BMI between
the two periods, young males and females were 1.80 and 1.15 years younger biologically in period
2 compared to period 1, respectively—suggesting that if the distribution of BMI had not changed,
males ages 20-39 would have had an additional 40.6% decrease in biological age, while females
ages 20-39 would have had an additional 79.7% decrease in biological age.
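The percentages above follow from comparing the BMI-adjusted and unadjusted period differences. A minimal Python sketch of that arithmetic, using the values reported in this paragraph (the function name is hypothetical):

```python
def additional_decrease_pct(unadjusted: float, adjusted: float) -> float:
    """Extra decrease, as a percentage of the unadjusted decrease in biological age."""
    return (adjusted - unadjusted) / unadjusted * 100.0


# Period differences in years (period 2 younger than period 1), ages 20-39.
males = additional_decrease_pct(unadjusted=1.28, adjusted=1.80)
females = additional_decrease_pct(unadjusted=0.64, adjusted=1.15)

print(f"males:   {males:.1f}%")    # males:   40.6%
print(f"females: {females:.1f}%")  # females: 79.7%
```

For young males, the adjusted decrease exceeds the observed decrease by 0.52 years, which is 40.6% of the observed 1.28 years. For young females, the gap of 0.51 years is 79.7% of the observed 0.64 years.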
Similarly, middle-aged females (40-59 years) did not benefit from reductions in smoking,
but were hurt by increases in BMI. Overall, their biological age was 2.36 years lower in period 2
than in period 1. However, if BMI had remained constant across the two periods, the group would
have had an extra 10% reduction in biological age. In contrast, middle-aged males benefited from
reductions in smoking but were hurt by increases in BMI. Overall, middle-aged males had
reductions in biological age of about 2.65 years between the two periods. Controlling for
smoking suggests that approximately 10% of their reductions in biological age could be accounted
for by decreases in smoking prevalence and its effects. However, increases in BMI appeared to
counteract the decreases in smoking. Controlling for BMI in the model suggests that if BMI levels
had not changed between the two periods, males ages 40-59 would have had an additional 7%
reduction in biological age.
Model 1: Adjusted for covariates (race/ethnicity, SES, age)
Model 2: Adjusted for covariates plus interaction with BMI
Model 3: Adjusted for covariates plus interaction with smoking
Figure 4.4 The Contributions of BMI and Smoking to Decreases in Biological Age between
Period 1 (1988-1994) and Period 2 (2007-2010)
Changes in smoking prevalence did not appear to influence the decreases in biological age of
females or of younger males (4a). However, declines in smoking were associated with
approximately 10% of the decreases in biological age among 40-59 year old males (4b) and 8% of the
decreases in biological age among 60-79 year old males (4c). On the other hand, obesity was found
to counteract the decreases in biological age for younger males and females (4a), as well as
middle-aged females (4b). When controlling for differences in the distribution of BMI between period 1
and period 2, males ages 20-39 had an additional 41% decrease in biological age, while females
ages 20-39 and 40-59 had an additional 80% and 10% decrease in biological age, respectively.
Finally, decreases in smoking also appeared to have a large influence on changes in
biological age for older males. Our results showed that the decreasing prevalence and effects of
smoking accounted for 8% of the reductions in biological age of males ages 60-79. On the other
hand, changes in BMI among males age 60-79 had no significant association with changes in
biological age. Nevertheless, for older women, decreases in smoking did not account for the
decreases in biological age, while increases in obesity lessened the declines in biological age by
about 7%.
4.3.5 Medication Use and Changes in Biological Age
Biological age was re-estimated for participants self-reporting that they took medications
for either hypertension or hypercholesterolemia, by setting total cholesterol and systolic blood
pressure values just above the cut-offs used for prescribing such treatments. Using OLS regression,
controlling for age, race/ethnicity, and education, we compared the sex-specific differences in
biological age between the two periods using our original biological age estimate and the estimate
incorporating medication usage (Figure 4.5).
Overall, it appears increased medication use accounted for some of the decreases in
biological age at every age for both sexes. However, medications seemed to have the largest
influence on older adults, especially older males. While drugs to combat hypertension and
hypercholesterolemia were only associated with 0.12 and 0.14 years of the reduction in biological
age for males and females ages 20-39, respectively, among those age 40-59 they were associated
with a 0.57 year reduction in biological age of males and a 0.52 year reduction in the biological
age of females, and among those ages 60-79, medication use was associated with a 1.33 year
reduction in the biological age of males, and a 0.99 year reduction in the biological age of females.
Figure 4.5 Medication Use and Decreases in Biological Age
Period differences in biological age estimated with and without adjustments for medication
use did not vary significantly for males and females ages 20-39. However, among middle-aged
and older adults, period differences in biological age were significantly larger before adjusting for
medication use, suggesting that the increased prevalence of persons on hypertension
and/or hypercholesterolemia medication in 2007-2010 relative to 1988-1994 contributed to a
proportion of the decreases in biological age for males and females ages 40-59 and 60-79.
This was most apparent for the oldest age group, particularly males.
4.4 Discussion
Over the past twenty years, the pace of biological aging appears to have slowed for males
and females across the age range. However, the degree of change has not been the same for men
and women or by age. Our results showed that young males experienced a greater decrease in
biological age than did young females. This may explain why early adult mortality has decreased
more for males than females, contributing to a narrowing of the gender mortality gap. Additionally,
decreases in biological age were also larger for older adults than for younger adults.
The differences in the association between biological age and changes in both behaviors
and medication use may partially explain why older adults had more dramatic decreases in
biological aging between 1988-1994 and 2007-2010 compared to younger adults. For instance, our
results suggest that decreases in smoking may have disproportionately benefited participants who
were older, especially older men. Decreases in smoking prevalence were most pronounced for
middle-aged and older men, accounting for a significant proportion of their improvement in
biological age. Conversely, the biological ages of younger adults, especially females, were the most
affected by increases in BMI. According to our results, if the distribution of BMI had not shifted
upward over time, younger males would have had an additional 40% decrease in biological age,
while younger females might have had decreases in biological age that were 80% larger than what
they actually experienced.
Another explanation for the differences in change over the two periods between younger
and older adults is medication use. Similar to what has been reported previously (104, 105), we
showed that the proportion of persons taking cholesterol- and blood pressure-lowering medications,
especially among middle-aged and older adults, has increased significantly over recent decades. Given
that medications are typically administered at secondary or tertiary prevention stages, younger
individuals will not experience as much benefit (106). This was evidenced when we compared
biological ages before and after adjusting for medication usage. While adjusting for medication
use lessened the differences in biological age between the two periods only slightly among the
20-39 year old population, differences shrank much more for middle-aged and older adults, with the
largest decrease occurring among males ages 60-79.
Finally, another reason biological aging may have slowed more for older adults is
that the variance in biological age increases with chronological age. Given that the
negative effects of environment and genes accumulate over the lifetime (107), biological age may
not differentiate individuals to the same degree at younger ages. For example, in a heterogeneous
population, it may be harder to detect differences in the pace of aging earlier in life; however, as
age and damage increase over the life course, the physiological profile of frail individuals (those
with accelerated aging) and robust individuals (those with decelerated aging) may increasingly
diverge. Finally, at younger ages, when negative effects have not yet manifested, only so much
improvement can occur.
While behavioral factors like smoking, obesity, and medication usage explained some of
the period differences in biological age, a significant proportion of the decreases over time are
unaccounted for. Other explanations for the suggested slowing of the aging process that we were
unable to test are improvements in early life and prenatal conditions, or reductions in infectious
disease. Studies have consistently supported the links between early life conditions and later life
health and aging (108-111). Over the 20th century, US childhood mortality drastically decreased
(112), which suggests an overall improvement in childhood health during this time. Declines in
death rates were accompanied by reductions in exposure to infectious diseases (113, 114), better
nutrition, and advances in clinical medicine (115). These improvements in early life health may
directly affect the physiological functioning of these cohorts, slowing the pace of biological aging.
There are limitations in the present study that should be acknowledged. First, due to missing
biomarker data, our analytic sample included approximately 70% of NHANES participants ages
20-79, and those excluded from our analysis were older, more likely to be members of race/ethnic
minority groups, and had fewer years of schooling. Although this could affect our estimates, there
does not appear to be a difference in the patterns of missing data between the two periods, and
therefore, this should not bias our conclusions regarding changes in biological age over time.
Finally, because NHANES only collects cross-sectional data, we are unable to compare changes
across individuals or adjust for mortality selection. Nevertheless, our study is strengthened by
its use of a large, nationally representative dataset that includes multiple biomarker,
sociodemographic, and behavioral measures.
In conclusion, we showed that the biological age of the population has declined over the
past twenty years and that the largest improvements have been for males and older adults. We also
showed that changes in smoking, obesity, and medication use may explain part of the decrease and
why improvements have not been as dramatic for females and young adults. Moving forward,
it may be useful to examine how cumulative disadvantage linked to socioeconomic factors and
psychosocial stressors has influenced changes in biological aging over time. Overall, research
examining how changing environments impact aging and health is important given that our ability
to extend healthy lifespan is influenced by our understanding of the factors that regulate the pace
of aging.
Chapter 5: Not All Smokers Die Young: A Model for Hidden Heterogeneity within the Human Population
5.1 Introduction
The rate of aging and subsequent mortality risk is hypothesized to result from the balance
between the body’s exposure to harmful environmental factors, and its genetically-determined
ability to repair and protect against damage (30). Thus, the ability of some individuals to reach
extreme old age, particularly in the presence of clearly high exposure to damaging factors, may
signal an innate resiliency that could be related to slower rates of aging. Genetic and environmental
factors impact the rate of aging via a number of downstream physiological processes, for example:
inflammation, oxidative stress, the accumulation of advanced glycation end products that
contribute to the cross-linking of proteins, loss of homeostatic control, and damage to DNA (116-
118). Cigarette smoking has been identified as an environmental factor with the ability to
exacerbate a number of these processes (119, 120) and as a result, smoking has been associated
with accelerated rates of physiological decline, increased disease incidence, and reductions in life
expectancy (121, 122). Nevertheless, some smokers do survive to extreme ages and these
individuals may provide an opportunity to examine a resilient subgroup of the population and
uncover the factors that impact susceptibility to physiological stressors.
Harman’s free-radical theory of aging proposes that exposure to reactive oxygen species
(ROS) is one of the major contributors to aging, and has been linked to increased risk of diseases
such as cancer, cardiovascular disease, diabetes, and dementia (11). Nevertheless, there is evidence
of variation in the susceptibility to such damage. Studies on animal models suggest that longer-
lived animals may possess innate stress resistance mechanisms allowing them to limit the amount
of oxidative damage (123). Additionally, oxidative stress-associated inflammatory responses to
endogenous and exogenous stressors may also contribute to differences in lifespan given their
implications for the accumulation of cellular damage. Consequently, variations in innate immune
response, either as a result of genetic or epigenetic factors, may have the potential to influence the
aging process, to the degree that individuals with diminished pro-inflammatory activation may
experience increases in longevity. Links between longevity and inflammation associated cellular
damage are consistent with Kirkwood’s disposable soma theory, which suggests that increased
energy allocation towards physiological processes involved in somatic maintenance and repair,
and away from those involved in growth and reproduction, contributes to life extension (124).
Therefore, given that smoking is associated with increased ROS exposure and pro-inflammatory
cytokine activation, individuals with genotypes associated with down-regulation of inflammatory
processes and up-regulation of processes associated with cellular protection and regeneration
may be less prone to the negative effects of cigarette exposure, thus enabling them to survive
longer than other smokers.
Evidence of longevity associated resiliency to stressors has recently been documented in
studies of centenarians (125, 126). Results from these studies suggest that protection from
oxidative stress and decreased production of pro-inflammatory cytokines may promote longevity
in humans. Studies have also found significantly higher levels of high density lipoprotein
cholesterol (HDL) among centenarian offspring compared to age-matched controls (127). High-
density lipoproteins (HDLs) have been shown to have antioxidant and anti-inflammatory
properties and are associated with survival in late-life (128, 129). Finally, although HDL levels
are often reduced by smoking (130)—presumably contributing to increased risks for
atherosclerosis—individuals with predisposed resiliency may not experience these declines.
Because the prevalence of individuals with high levels of resiliency may be small,
differences in vulnerability to physiological stressors may be hard to detect in younger populations.
This results from hidden heterogeneity, which refers to variability in the susceptibility to death
within a population (131). For a younger population which includes a large number of non-resilient
individuals, the overall mortality risk will be representative of the general, non-resilient, sub-
population (84). However, as the frailer (more susceptible) individuals are selected out of the
population via mortality, the resilient individuals begin to make up a larger proportion of the
population, and the risk estimates for the group will start to resemble those of the resilient sub-
population. Consequently, mortality selection may provide a convenient way to visualize hidden
heterogeneity. While at younger ages we would expect smokers to have much higher physiological
dysregulation and mortality than non-smokers—given that most of the smoking population is non-
resilient—when comparing older smokers to non-smokers mortality should have already selected
out the individuals that are not resilient to smoking, and as a result, the smokers who remain should
be less susceptible to the negative effects of cigarette exposure.
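The selection argument above can be illustrated with a minimal numerical sketch. It is not part of the study's analysis: it assumes, hypothetically, that never-smokers follow a Gompertz hazard, that a small share of smokers is fully resilient (same hazard as never-smokers), and that the remaining smokers face a constant hazard multiplier. All parameter values and function names are illustrative.

```python
import math


def gompertz_survival(t, a, b, multiplier=1.0):
    """Probability of surviving t years under the hazard multiplier * a * exp(b * t)."""
    return math.exp(-multiplier * (a / b) * (math.exp(b * t) - 1.0))


def observed_hazard_ratio(t, a, b, resilient_share, frail_multiplier):
    """Smoker vs. never-smoker hazard ratio among smokers still alive after t years."""
    alive_resilient = resilient_share * gompertz_survival(t, a, b, 1.0)
    alive_frail = (1.0 - resilient_share) * gompertz_survival(t, a, b, frail_multiplier)
    frail_fraction = alive_frail / (alive_frail + alive_resilient)
    # The baseline hazard cancels, so the group-level ratio is the
    # survivor-weighted mean of the two individual-level multipliers.
    return frail_fraction * frail_multiplier + (1.0 - frail_fraction) * 1.0


A, B = 0.005, 0.09  # hypothetical Gompertz parameters; time is measured from age 50
for years_since_50 in (0, 10, 20, 30, 40):
    hr = observed_hazard_ratio(years_since_50, A, B,
                               resilient_share=0.10, frail_multiplier=4.0)
    print(f"age {50 + years_since_50}: observed hazard ratio = {hr:.2f}")
```

Under these assumptions the group-level hazard ratio declines toward 1 with age even though no individual's risk changes, which is the pattern that mortality selection would be expected to produce.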
Although the adverse effects of smoking on health have been well documented, little is
known about whether individuals vary in their vulnerability to biological stressors, such as
smoking. Using data from the National Health and Nutrition Examination Survey (NHANES III),
this study aims to uncover 1) whether differences in mortality and levels of physiological
dysregulation of smokers and non-smokers converge with age—signifying greater resilience
among long-lived smokers, and 2) whether indicators of physiological dysregulation can be used
to uncover hidden heterogeneity among smokers.
5.2 Materials and Methods
5.2.1 Study Population
The study was based on data from the third National Health and Nutrition Examination
Survey (NHANES III), and included 4,655 adults ages 50 and over. Excluded subjects (n=850)
were those who reported past (but not current) smoking, and those with missing biomarker data.
NHANES III is a nationally representative, cross-sectional study conducted by the National Center
for Health Statistics (NCHS) between 1988 and 1994. Data for NHANES III were collected during
at-home interviews, and physician examinations, which took place in a Mobile Examination
Center (MEC). Biomarker, smoking status, and sociodemographic data were available for a single
time-point when a participant was interviewed between 1988 and 1994. However, mortality
follow-up was available for all participants through 2006. Further details on recruitment,
procedures and study design are available through the Centers for Disease Control and Prevention.
5.2.2 Smoking History
In order to test whether individuals chronically exposed to biological stressors, but
surviving into extreme old age are more resilient, only two groups were compared—never-smokers
and current smokers. Those reporting smoking in the past were excluded given the evidence that
some of the negative effects of smoking can be reversed after cessation (132). Persons who reported
having smoked fewer than one hundred cigarettes in their lifetime were classified as never-smokers,
while persons who reported smoking at the time of interview were classified as current
smokers. In addition, years of cigarette use and average number of cigarettes smoked per day were
calculated for current smokers. The number of years of smoking was estimated as the difference
between the age at which the subject started smoking and his/her current age. Periods of
nonsmoking were also reported, and any period during which subjects reported cessation
was subtracted.
Daily smoking quantity was calculated based on smokers’ answers to five questions—1)
“About how many cigarettes do you smoke per day?”; 2) “For approximately how many years
have you smoked this amount?”; 3) “Was there ever a period of a year or more when you smoked
more than (number previously reported) cigarettes per day?”; 4) “During the period when you were
smoking the most, about how many cigarettes per day did you usually smoke?”; and 5) “For how
many years did you smoke that amount?”. Given that smoking patterns tend to change over the
lifetime, both current and highest smoking rates were used to calculate average reported cigarette
use. This was estimated by summing the number of cigarettes currently smoked per day (multiplied
by the number of years smoking that quantity) and the number of cigarettes smoked per day at its
highest (multiplied by the number of years smoking that quantity) and then dividing by the total
number of years reported on.
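\[
\text{quantity} = \frac{(\text{quantity}_{\text{current}} \times \text{years}_{\text{current}}) + (\text{quantity}_{\text{highest}} \times \text{years}_{\text{highest}})}{\text{years}_{\text{current}} + \text{years}_{\text{highest}}}
\]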
A variable for heavy smoking was created based on whether subjects started smoking prior to age
30 and reported smoking at least a pack or more (20+ cigarettes) per day. Never smokers were
coded as a zero and used as the reference group in analyses.
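The smoking measures described in this section can be sketched in a few lines of code. This is only an illustration of the stated definitions; the function and argument names are hypothetical and do not correspond to NHANES variable names.

```python
def years_smoked(current_age, age_started, years_quit=0.0):
    """Years of smoking: time since uptake minus any reported periods of cessation."""
    return current_age - age_started - years_quit


def average_cigarettes_per_day(current_qty, current_years, highest_qty, highest_years):
    """Average daily quantity, weighting each reported rate by its years of duration."""
    total_years = current_years + highest_years
    if total_years <= 0:
        raise ValueError("at least one reported period must have a positive duration")
    return (current_qty * current_years + highest_qty * highest_years) / total_years


def is_heavy_smoker(age_started, cigarettes_per_day):
    """Heavy smoking: uptake before age 30 and a pack or more (20+ cigarettes) per day."""
    return age_started < 30 and cigarettes_per_day >= 20


# Hypothetical respondent: 15 cigarettes/day for 10 years, 30/day for 20 years.
avg = average_cigarettes_per_day(15, 10, 30, 20)
print(avg, is_heavy_smoker(age_started=18, cigarettes_per_day=avg))
```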
5.2.3 Mortality
Data for mortality follow-up was available via linked mortality files from National Death
Index records through 2006. During analysis, violent, accidental, and HIV deaths were censored
as these should not be related to smoking-attributable mortality. Person months of follow-up were
provided and converted to years by dividing by twelve. Because participants took part in NHANES
III at different times between 1988 and 1994, potential follow-up time was variable, ranging from
12–18 years. To ensure all subjects had the potential to be followed for the same amount of time,
10 year survival was used.
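A minimal sketch of how the follow-up rules above might be applied to a single record is given below. The cause-of-death labels and the function name are hypothetical; the actual linked mortality files use their own coding, which is not reproduced here.

```python
# Hypothetical labels for the causes of death that are censored in the analysis.
CENSORED_CAUSES = {"violent", "accidental", "hiv"}


def ten_year_outcome(person_months, died, cause=None, horizon_years=10.0):
    """Return (follow-up years, event flag) for a 10-year survival analysis."""
    years = person_months / 12.0            # convert person-months to years
    event = died and cause not in CENSORED_CAUSES
    if years > horizon_years:
        # Anything happening after the common horizon is not counted as an event.
        return horizon_years, False
    return years, event


print(ten_year_outcome(150, True, "cancer"))      # death after the 10-year window
print(ten_year_outcome(60, True, "accidental"))   # censored cause of death
print(ten_year_outcome(60, True, "cancer"))       # counted as an event
```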
5.2.4 Physiological Status
In order to examine links between smoking exposure and physiological resiliency,
indicators of physiological functioning were selected a priori which, in previous research, have
been shown to be affected by cigarette exposure and are also associated with processes related to
longevity. For instance, given the inflammatory response to cigarette exposure (133, 134) and the
links between chronic inflammation and accelerated-aging (123), we examined measures related
to immune activation and inflammation such as CRP, total leukocyte number, lymphocyte number,
granulocyte number, and monocyte number.
CRP is a protein produced by the liver in response to acute cytokine activation, and as a
result is often used as a convenient marker of general systemic inflammation (135). Measures of
CRP were log-transformed in order to improve their distribution. Leukocytes, also known as white
blood cells, are immune cells involved in host defense and are composed of various types,
including: lymphocytes (T cells, B cells, NK cells), granulocytes (neutrophils, basophils,
eosinophils), and monocytes. Total leukocyte count and its components are increased in response to
smoking (136, 137) and have implications for a number of age-related diseases, including: heart
disease, stroke, neurodegenerative diseases, cancer, lung disease and diabetes (138-142).
In addition, we also examined the associations between smoking, resiliency, and measures
of HDL cholesterol and lung function, for which high levels are thought to be beneficial and yet
have been shown to be lowered as a result of chronic cigarette exposure (143, 144). HDL is a
lipoprotein which facilitates lipid transport and is protective against cardiovascular disease,
neurodegeneration, diabetes, and cancer (145-147). Since the lungs are one of the first areas to
interact with the chemicals found in cigarettes, lung function may provide a useful estimate of the
amount of tissue damage inflicted by cigarette smoking (122). Lung function was measured as the
ratio between Forced Expiratory Volume at one second and Forced Vital Capacity (FEV1/FVC),
which has been shown to correlate with measures of frailty (148, 149).
5.2.5 Potential Confounders
Age, race/ethnicity, education, sex, and body mass index (BMI) were used as controls in
all analyses because these have been related to smoking, physiological outcomes and mortality.
Age was top-coded at 90 in the original NHANES data set to protect confidentiality of respondents.
This should not affect results since, for the majority of the analysis, persons are classified into four
age groups (50-59 years, 60-69 years, 70-79 years, and 80+). Three race/ethnicity categories are
included: non-Hispanic whites, non-Hispanic blacks, and Hispanics, most of whom are Mexican
Americans. In analyses, Non-Hispanic whites are used as the reference category. Education is
measured as years of schooling completed and is included as a continuous variable. Sex was
indicated with a dichotomous variable, with females coded as 1. Finally, BMI was calculated as
weight in kilograms divided by height in meters squared.
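As an illustration of the covariate coding, the sketch below computes BMI using the standard definition (weight in kilograms divided by the square of height in meters) and assigns the four age groups used in the analysis. The function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def age_group(age):
    """Assign one of the four analysis age groups (sample is restricted to ages 50+)."""
    if age < 50:
        raise ValueError("the analytic sample includes only adults ages 50 and over")
    if age < 60:
        return "50-59"
    if age < 70:
        return "60-69"
    if age < 80:
        return "70-79"
    return "80+"  # includes ages top-coded at 90


print(round(bmi(81.0, 1.80), 1), age_group(90))
```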
5.2.6 Statistical Analysis
All analyses were run using sample weights and controlling for potential confounders,
including race/ethnicity, education, sex, and BMI. Sample weights, which are calculated and
provided by NHANES, account for the complex sampling design employed by the survey.
Weights are assigned to each participant in order to represent the number of persons in the U.S.
population with given sociodemographic characteristics. As a result, when weights are used in
analysis, a sample can be said to be representative of the U.S. population. The association between
mortality and smoking, age, and an interaction for age by smoking was modeled using a
proportional hazard model with a Gompertz distribution (150). Based on these results, age-
stratified mortality models were used to estimate the hazard of smoking in each age group. These
models were first run with the inclusion of all smokers and then rerun, limiting the smoking sample
to heavy smokers. This was done to ensure that the proportion of light smokers in the old age group
was not driving results. Next, ordinary least squares regression models were used to examine the
association between biomarkers and age by smoking interactions. From these models, predicted
means for HDL, log CRP, leukocyte number, lymphocyte number, granulocyte number, monocyte
number, and FEV1/FVC ratio were estimated and compared between smokers and non-smokers
within each age group. Finally, in order to examine whether biomarkers were associated with
survival among current and never smokers, proportional hazard models were run testing for
associations between biomarkers and mortality in smokers and never smokers, controlling for age,
sex, education, race/ethnicity, and BMI.
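For reference, a proportional hazard model with a Gompertz baseline and an age-by-smoking interaction can be written as follows. The notation is an assumption made here for illustration (the standard parameterization), not a specification taken from the analysis: $S$ is an indicator for current smoking, $A_g$ are indicators for the age groups above the reference group, and $\mathbf{z}$ holds the remaining covariates.

```latex
\[
h(t \mid S, A, \mathbf{z}) = \lambda \, e^{\gamma t}
\exp\!\Big( \beta_S S + \sum_{g} \beta_g A_g + \sum_{g} \beta_{gS} A_g S
+ \boldsymbol{\theta}^{\top} \mathbf{z} \Big)
\]
\[
\mathrm{HR}_{\text{smoking} \mid g} = \exp\left( \beta_S + \beta_{gS} \right)
\]
```

Under this form, an interaction hazard ratio below 1 for an age group indicates that the relative mortality risk associated with smoking is smaller in that group than in the reference group.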
5.3 Results
5.3.1 Sociodemographic Characteristics by Age and Smoking Status
Within our population, smoking prevalence was highest for those ages 50-59 (40%) and
was lower in each subsequent age group, becoming fairly rare among those 80+ (8%) (Table 5.1).
Overall, differences by sex and socioeconomic status (SES) were consistent with what would be
expected—the smoking group included a higher proportion of males and individuals with low
education, while at the same time, older cohorts were made up of smaller proportions of males and
individuals with low education. Based on these frequencies, SES and sex did not appear to play a
greater role in survival for current smokers compared to never smokers. This assumption was
tested empirically by examining interactions between 1) sex, smoking, and age category, and 2)
education, smoking, and age category using proportional hazard models of mortality, and for both
models, interactions were not found to be statistically significant.
Table 5.1: Demographic Characteristics by Age and Smoking Status (N=4,655)

                  50-59 years (N=1,188)   60-69 years (N=1,471)   70-79 years (N=1,075)   80+ years (N=921)
                  Never       Current     Never       Current     Never       Current     Never       Current
                  Smokers     Smokers     Smokers     Smokers     Smokers     Smokers     Smokers     Smokers
Subjects (%)      61.9        38.1        65.5        34.5        79.3        20.7        92.1        7.9
Years Smoking     --          38.3 (5.0)  --          46.4 (6.3)  --          54.6 (8.2)  --          63.0 (8.8)
Smoking (Start)   --          16.4 (4.4)  --          17.8 (5.5)  --          19.4 (7.8)  --          19.7 (7.8)
Heavy Smoker (%)  --          74.6        --          66.6        --          55.5        --          41.8
Female (%)        71.1        41.3        68.0        51.4        77.8        49.5        78.7        55.2
White (%)         81.9        81.6        82.3        81.7        87.4        87.0        88.7        88.5
Black (%)         9.3         13.4        9.1         12.1        9.3         9.2         8.3         5.8
Hispanic (%)      8.8         5.0         8.6         6.2         3.3         3.8         2.9         5.7
Education         12.4 (3.4)  11.5 (3.1)  11.5 (3.8)  10.8 (3.3)  10.7 (3.7)  10.4 (3.5)  10.0 (3.8)  10.1 (4.1)
BMI (a)           28.5 (5.8)  26.4 (5.1)  27.9 (5.4)  25.8 (5.4)  27.2 (5.8)  24.7 (4.9)  25.2 (4.6)  22.9 (3.8)
Died (%)          4.3         13.6        12.8        36.5        33.9        53.1        72.8        77.0

(a) BMI: Body Mass Index. All values are run using sample weights
5.3.2 Age Effects of Smoking on Mortality
A proportional hazard model (Gompertz distribution), controlling for race/ethnicity,
education, sex, and BMI was used to examine the association between smoking and mortality for
each age group. Overall, we found that while both higher age and smoking were related to an
increased risk of mortality, the association between smoking and mortality was significantly
reduced in the oldest age group (HR: 0.40; 95% CI: 0.21-0.78) (Table 5.2).
Given that significant age by smoking interactions were found for mortality, we used age-
stratified proportional hazard models to determine the hazard ratio for current smokers versus
never smokers, within each age group. Results showed that the relative mortality risk associated
with smoking was extremely high for younger age groups; however, it lessened considerably for
older age groups, to the point where smoking no longer significantly contributed to increased
mortality risk for subjects who were 80 years of age and older (Table 5.3).
Table 5.2 Mortality Effects of Smoking and Age, and the Influence of Daily Smoking Quantity

                        Hazard Ratio    95% Confidence Interval
Female                  0.77            0.66-0.90
Education               0.97            0.95-0.99
Black                   1.28            1.09-1.51
Hispanic                0.67            0.45-1.01
BMI                     1.00            0.99-1.02
Age (60 years)          3.06            1.84-5.09
Age (70 years)          9.07            5.64-14.60
Age (80 years)          29.76           18.70-47.35
Smoking                 3.01            1.73-5.23
Age (60) by Smoking     0.98            0.52-1.85
Age (70) by Smoking     0.59            0.32-1.09
Age (80) by Smoking     0.40            0.21-0.78

(a) Proportional Hazard model was run with mortality as the outcome, with person-years of exposure included.
(b) Overall, 2,393 deaths occurred over a total of 52,144 person-years
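As a worked illustration of how the interaction terms in Table 5.2 are read, the sketch below multiplies the main-effect hazard ratio for smoking by each age-by-smoking interaction hazard ratio to obtain the smoking hazard ratio implied for each age group. This assumes the usual multiplicative interpretation of interaction terms in a proportional hazard model, with ages 50-59 as the reference group; the variable names are illustrative.

```python
# Point estimates taken from Table 5.2.
smoking_hr = 3.01
interaction_hr = {"50-59": 1.00, "60-69": 0.98, "70-79": 0.59, "80+": 0.40}

# Implied hazard ratio of smoking within each age group.
implied_hr = {group: smoking_hr * ratio for group, ratio in interaction_hr.items()}

for group, hr in implied_hr.items():
    print(f"ages {group}: implied smoking hazard ratio = {hr:.2f}")
```

These implied values need not equal the age-stratified estimates in Table 5.3, because stratified models allow the baseline hazard and the covariate effects to differ by age group.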
Among subjects ages 50-59, current smokers had a more than 4-fold increase in mortality risk
compared to never smokers (HR: 4.16; P<.001). The risk of mortality from smoking was slightly
lower for those ages 60-69, with current smokers being more than 3 times as likely to die as never
smokers (HR: 3.36; P<.001). For those ages 70-79, current smoking was associated with a 73%
increase in the risk of mortality (HR: 1.73; P<.001). Nevertheless, among those in the oldest age
group, no significant increase in mortality risk was found for current smokers relative to never
smokers (HR: 1.31; P=.079).
Table 5.3 Hazard Ratios of Current Smoking and Heavy Smoking by age

                        Hazard Ratio (P-value)
                        Ages 50-59      Ages 60-69      Ages 70-79      Ages 80+
N Deaths                223             604             698             868
Person-Years            16,518          18,379          11,188          6,060
Current Smoking (a)     4.16 (<.001)    3.36 (<.001)    1.73 (<.001)    1.31 (.079)
Heavy Smoking (a,b)     5.04 (<.001)    3.77 (<.001)    2.50 (<.001)    1.57 (.062)

(a) Reference group is never smokers
(b) Heavy smoking defined as smoking uptake prior to age 30 and smoking at least a pack or more (20+ cigarettes)
per day. Models were run controlling for sex, race/ethnicity, education, BMI, and age.
Finally, to ensure that lower smoking-related mortality risks at older ages did not result
from an increased proportion of light smokers or those who started later in life among the 80+ age
group, models were rerun including only never smokers and heavy smokers, who we defined as
current smokers who began smoking prior to age 30 and reported smoking an average of 20 or
more cigarettes per day (Table 5.3). Similar results were found to those reported above. Overall,
the relative risks associated with smoking were highest at younger ages and were no longer
significant for subjects ages 80+. Heavy smokers ages 50-59 had a more than 5 fold increase in
the risk of mortality compared to never smokers (P<.001). Heavy smokers in their sixties and
seventies were approximately 3.8 and 2.5 times as likely to die as never smokers (P<.001),
respectively; and finally among those age 80 and above, there was no significant increase in
mortality for heavy smokers versus never smokers (P=.062).
5.3.3 Age Effects of Smoking on Physiological Health
Separate regression models, one per biomarker, were used to examine the age-effects of smoking on
indicators of physiological health, measured by levels of HDL, log CRP, leukocyte number,
lymphocyte number, granulocyte number, monocyte number, and FEV1/FVC ratio (Table 5.4).
Results showed that overall both smoking and age were significantly associated with worse
physiological status. However, statistically significant interactions for smoking by age were also
found, suggesting that, overall, smokers and non-smokers appeared to have very different age
trends, which may be a result of differential mortality selection within the two groups. At younger
ages, smoking was related to worse biomarker levels—lower HDL and FEV1/FVC and higher log
CRP, leukocyte number, lymphocyte number, granulocyte number, and monocyte number.
However, for older subjects, the differences in HDL, CRP, leukocyte number, lymphocyte number,
granulocyte number, and monocyte number between smokers and non-smokers were significantly
reduced or eliminated (P<.05)—suggesting that, at ages 80 and above, current smokers may have
similar physiological statuses to never smokers.
From these models, adjusted levels of each marker were calculated for the eight smoking
by age groups, controlling for race/ethnicity, education, and sex (Figure 5.1). These results showed
that as the age of the groups increased, differences between smokers and non-smokers were less
pronounced or even reversed. For instance, a cross-over effect was found when comparing HDL
of current and never smokers over the age range (Figure 5.1a). At younger ages, never-smokers
were found to have significantly higher HDL (P=.006)—53.91 mg/dl for never-smokers ages 50-
59, compared to only 50.12 for current smokers ages 50-59. However, for each subsequent age
group, the difference in HDL by smoking status was smaller, and became no longer significant
among those ages 60 and above. Furthermore, there was evidence of a cross-over effect given that
for subjects eighty and over, the predicted HDL was higher for current smokers (53.04 mg/dl) than
for never smokers. However, this did not reach statistical significance.
Table 5.4 Regression Coefficients of the Association between Current Smoking and Biomarkers

                        FEV1/FVC     HDL          CRP (b)      Leukocyte    Monocyte     Lymphocyte   Granulocyte
Sex (Female=1)          0.023***     10.686***    0.100***     -0.208*      -0.035***    0.074        -0.230**
Education               0.000        0.232**      -0.006       -0.009       -0.001       -0.007       -0.002
Black                   0.210***     7.365***     0.173***     -0.996***    -0.067***    0.086*       -1.033***
Hispanic                0.220***     -0.802       0.009        0.174        0.028        0.100        0.043
BMI                     0.003***     -0.822***    0.036***     0.063***     0.005***     0.023***     0.036***
Age (60-69)             -0.020***    -0.988       0.047        0.105        0.014        0.025        0.060
Age (70-79)             -0.038***    -2.553**     0.092*       0.448***     0.027        -0.033       0.437***
Age (80+)               -0.039***    -2.606*      0.187***     0.960***     0.075***     0.027        0.841***
Smoking (a)             -0.075***    -3.785**     0.231***     2.210***     0.140***     0.437***     1.636***
Age (60-69) by Smoking  -0.011       2.210        -0.009       -0.524*      -0.029       -0.024       -0.471*
Age (70-79) by Smoking  0.002        4.449*       -0.010       -0.849**     -0.058*      -0.125       -0.688**
Age (80+) by Smoking    0.012        5.523*       -0.129       -1.538***    -0.089*      -0.344*      -1.102***
Constant                0.689        5.049        65.926       0.121        5.049        1.516        3.223
R-squared               .203         .172         .088         .149         .081         .042         .157
N                       4075         4366         4334         4404         4323         4404         4323

* p<0.05; ** p<0.01; *** p<0.001
Results Based on separate OLS Regression Models;
(a) Smoking refers to current smoking;
(b) CRP is log-transformed
When comparing log CRP by smoking, current smokers had higher levels at each age
(Figure 5.1b). Among subjects in their fifties and sixties, current smokers had about 0.23 mg/l
(P<.001) and 0.22 mg/l (P<.001) higher predicted log CRP than never smokers, respectively. For
those ages 70-79, log CRP remained significantly higher for smokers—1.41 mg/l compared to
1.19 mg/l for never smokers (P=.007). However, for subjects age 80 and over the differences
decreased and were no longer significant. For current smokers ages 80 and above, log CRP was
1.39 mg/l, which was only 0.10 mg/l higher (P=.319) than log CRP for never smokers in this age
range (1.29 mg/l).
Similar patterns were found when examining differences in leukocyte number, lymphocyte
number, granulocyte number, and monocyte number (Figure 5.1c-f). For subjects in their fifties,
leukocyte numbers were 2.21 × 10³ cells/µl higher for smokers compared to non-smokers,
lymphocyte numbers were 0.437 × 10³ cells/µl higher for smokers compared to non-smokers,
granulocyte numbers were 1.64 × 10³ cells/µl higher for smokers compared to non-smokers, and
monocyte numbers were 0.139 × 10³ cells/µl higher for smokers compared to non-smokers.
However, the differences were smaller for each subsequent age group. When comparing never and
current smokers ages 60-69, 70-79, and 80 and above, differences in leukocyte numbers (× 10³
cells/µl) were 1.69, 1.36 and 0.67, respectively; differences in lymphocyte numbers (× 10³ cells/µl)
were 0.41, 0.31, 0.09, respectively; differences in granulocyte numbers (× 10³ cells/µl) were 1.17,
0.95, 0.53, respectively; and differences in monocyte numbers (× 10³ cells/µl) were 0.11, 0.08, and
0.05, respectively. Overall, these differences were significant for ages 50-59, 60-69, and 70-79.
However, among those ages 80 and above, differences were only significant for leukocyte number
(P=.04).
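The age-specific differences reported above appear to follow from the coefficients in Table 5.4. As a sketch, and assuming the smoker versus never-smoker gap in each age group equals the smoking main effect plus the corresponding age-by-smoking interaction (with ages 50-59 as the reference), the gaps can be recomputed as follows. The dictionary and function names are illustrative.

```python
# Coefficients from Table 5.4 (cell counts are in units of 10^3 cells/µl).
smoking_effect = {
    "Leukocyte": 2.210, "Lymphocyte": 0.437, "Granulocyte": 1.636, "Monocyte": 0.140,
}
interaction = {
    "60-69": {"Leukocyte": -0.524, "Lymphocyte": -0.024, "Granulocyte": -0.471, "Monocyte": -0.029},
    "70-79": {"Leukocyte": -0.849, "Lymphocyte": -0.125, "Granulocyte": -0.688, "Monocyte": -0.058},
    "80+": {"Leukocyte": -1.538, "Lymphocyte": -0.344, "Granulocyte": -1.102, "Monocyte": -0.089},
}


def smoker_gap(marker, group):
    """Smoker minus never-smoker difference for a marker within an age group."""
    if group == "50-59":  # reference group: main effect only
        return smoking_effect[marker]
    return smoking_effect[marker] + interaction[group][marker]


for marker in smoking_effect:
    gaps = [smoker_gap(marker, g) for g in ("50-59", "60-69", "70-79", "80+")]
    print(marker, " ".join(f"{gap:.2f}" for gap in gaps))
```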
Finally, never-smokers had significantly higher FEV1/FVC, regardless of age (Figure 5.1g).
Overall, differences in FEV1/FVC between never and current smokers remained relatively stable
for the four age groups, with differences in the ratio of 0.07 for subjects ages 50-59, 0.09 for
subjects 60-69, 0.07 for subjects ages 70-79, and 0.06 for subjects ages 80 and over.
5.3.4 Associations between Biomarkers and Survival
To determine whether variations in biomarkers, which could be a sign of resiliency, are
associated with susceptibility to death, proportional hazard models were run, controlling for age,
sex, race/ethnicity, education and BMI, for never smokers and current smokers to estimate the
associations between biomarkers and mortality risk within the two groups (Table 5.5). Levels of
log CRP, leukocyte numbers, monocyte numbers, and granulocyte numbers were associated with
mortality in both never smokers and current smokers. However, the strength of these associations
was larger in the smoking group. A one unit increase in log CRP was associated with a 32% increase
in mortality risk for current smokers (HR:1.32; P <.001), and a 21% increase in mortality risk for
never smokers (HR:1.21; P <.001). Similarly, one unit increases in Leukocyte, Monocyte, and
Granulocyte numbers were significantly associated with 10%, 84%, and 12% increases in
mortality risk for current smokers, respectively, and 3%, 47%, and 11% increases in mortality risk
for never smokers. Finally, although they were not associated with mortality risks for never
smokers, among current smokers, FEV1/FVC was significantly associated with mortality risk
(P<.001) and Lymphocyte number was marginally associated with mortality risk (P =.09).
Figure 5.1 Age Trends in the Association between Smoking and Biomarkers: A cross-over effect
was found when comparing HDL by smoking status and age (a) with non-smokers having higher HDL
at younger ages, and smokers having higher HDL at older ages. For CRP, leukocyte number,
monocyte number, lymphocyte number, and granulocyte number the difference between smokers and
non-smokers was largest for subjects in their fifties (b-f). However, these differences appeared to
converge with age and were not significantly different for CRP, monocyte number, lymphocyte
number, and granulocyte number after age 80. Finally, FEV1/FVC was lower for smokers of all
ages, and the difference remained significant (g).
Table 5.5: Associations between Biomarkers and Mortality for Current and Never Smokers

               Current Smokers              Never Smokers
               Hazard Ratio   P Value       Hazard Ratio   P Value
FEV1/FVC       0.04           <.001         0.83           0.7049
HDL            0.99           0.6627        0.99           0.0017
Log CRP        1.32           <.001         1.21           <.001
Leukocyte      1.10           <.001         1.03           0.0057
Monocyte       1.84           0.0206        1.47           0.0029
Lymphocyte     1.09           0.0922        0.99           0.57
Granulocyte    1.12           <.001         1.11           <.001
5.4 Discussion
Based on our results, the risk of death associated with smoking is significantly lower at
older ages, to the point where smoking no longer significantly increases mortality for individuals
who survive to age 80 and beyond. Furthermore, this does not appear to be a result of cohort differences in
smoking habits, as similar patterns are found when comparing only heavy smokers to never
smokers. Differences in physiological health by smoking status also converged at older age groups.
In younger populations, current smokers had significantly elevated levels of inflammation,
immune activation and lower HDL and lung function compared to never smokers. However, at
older ages differences between current and never smokers were significantly lower or non-existent.
Furthermore, mortality among smokers was strongly related to differences in inflammation and
immune activation, as well as lung function.
Given that older subjects had significantly more years of cigarette exposure, one would
presume that in a homogeneous population, as years of smoking increased, disparities in health
between non-smokers and smokers would widen. However, the increasing similarity between
smokers and non-smokers with age, suggests that surviving smokers may represent a distinct sub-
population who may possess physiological factors that allow them to either avoid or repair the
damage imposed by cigarettes. For instance, compared to shorter-lived smokers, long-lived
smokers may exhibit different immunologic responses to biological stressors. We showed that
levels of CRP, leukocytes, monocytes, and granulocytes strongly predicted survival, especially
among current smokers, which may explain why smokers and non-smokers look more similar as
age increases. Furthermore, as expected, smoking was associated with increased immune
activation and inflammatory processes for most age groups, as evidenced by the significantly
higher CRP, leukocyte, monocyte, lymphocyte, and granulocyte levels among current smokers
relative to never smokers. However, long-lived smokers had CRP, monocyte, lymphocyte, and
granulocyte levels that were statistically equivalent to those of long-lived persons who had never
smoked.
Genetically linked differences in inflammatory and immune responses to stimuli have been
reported in the literature (151). There are a large number of genes involved in the inflammatory
pathways, with significant genomic variation. For instance, the +896G+ TLR4 polymorphism was
shown to be associated with higher IL-10 levels—an anti-inflammatory cytokine which limits
inflammatory signal and response—and lower IL-6—a pro-inflammatory cytokine involved in the
recruitment of leukocytes (152). Additionally, studies have also shown that single nucleotide
polymorphisms (SNPs) in -765GC COX-2 are associated with decreased circulating plasma CRP
levels (153). Given that vascular injury from cigarette smoking has been shown to initiate an
immunologic response (122), long-lived smokers may have a genetic predisposition that enables
them to maintain low levels of inflammation, attenuating their likelihood of accruing additional
tissue damage.
Like inflammation, FEV1/FVC levels among smokers were significantly associated with
survival. It has been shown that lung injury is often a result of reactive oxygen species (ROS) that
cause oxidative damage to proteins, lipids, and DNA (154). Membrane lipid peroxidation has the
potential to increase cellular damage, decreasing lung function and impacting a number of disease
states (155). Additionally, ROS have also been shown to cause apoptosis, stimulate mucus
secretion, and disrupt the extracellular matrix and blood vessels (156). Finally, given the large
surface area and blood supply of the lungs, when exposed to exogenous oxidants such as cigarette
smoke, tissue may be particularly vulnerable to oxidative stress and damage (157). As a result,
smokers who have innate mechanisms to reduce or offset ROS-induced damage may maintain
better lung functioning regardless of cigarette exposure, and given the large differences in survival
by FEV1/FVC, lung function may be a useful proxy for resiliency among smokers.
A number of animal models have highlighted the associations between stress resistance
and longevity. It has been hypothesized that associations between increased resistance to biological
stressors and lifespan extension may be due to stronger antioxidant systems activity. For instance,
increased enzymatic antioxidant expression is linked to decreases in damage from ROS and has
been shown to increase longevity (158-161). Additionally, superoxide dismutase (SOD) has been
shown to act as an initial defense mechanism against damage from ROS, and deletions in SOD
genes significantly decrease lifespan in flies, mice, and yeast [51-60]. Nevertheless, more work is
still needed to understand the role antioxidants play in the aging process.
Given that 1) mortality was not significantly increased for smokers who had survived to age 80 and
beyond, 2) smoking was found to have less impact on inflammation for long-lived individuals, and
3) lung function and inflammation were strongly associated with survival among smokers,
more research is needed to identify factors that allow some smokers to survive to
extreme old age, in spite of sixty or more years of cigarette exposure. In human populations,
genetic factors have been estimated to account for approximately 25% of the variation in longevity;
however, for those living into their 90s and 100s the force of heritability on lifespan is predicted
to be even higher (162). Furthermore, long-lived mutant strains have been identified for a number
of species, including the nematode Caenorhabditis elegans (C. elegans), Drosophila, and mice
(163-165), and many of these mutations have been found to be associated with increased levels of
stress resistance. Future studies that examine genetic factors such as single nucleotide
polymorphisms (SNPs), gene-networks, or gene expression—paying particular attention to
processes and pathways involved in inflammation and oxidative stress—may be important for
identifying such factors.
There are limitations in the present study that should be acknowledged. The use of cross-
sectional biomarker data prevents us from examining changes or trajectories in physiological
characteristics of long-lived and short-lived smokers. Also, the small sample size of individuals,
particularly older smokers, prevented us from comparing groups at even older ages. Third, data
for smoking quantity was based on retrospective self-reports and asked only about current and
heaviest smoking levels. As a result, our estimates of smoking quantity may be somewhat biased.
Finally, age cohort and gender patterns in smoking history are not random, and therefore hinder
our ability to accurately compare between age groups or make estimates or predictions of past or
future mortality rates.
Our study is novel in defining a sub-population that may possess high levels of innate
physiological resiliency. It presents evidence that long-lived smokers represent a distinct and
biologically advantaged group, who are less susceptible to the negative side effects of smoking.
Given what we know about smoking and the aging process, the investigation of long-lived smokers
provides a natural experiment to examine the ways in which deterministic and stochastic processes
interact to impact the rate of aging and the susceptibility to death and disease. In moving forward,
more research is needed to facilitate our understanding of the interaction between environmental
and genetic mechanisms that influence the degree of degradation with age and to enhance our
understanding of factors which influence resiliency and its effect on longevity.
Chapter 6: A Genetic Network Associated
with Stress Resistance, Longevity, and
Cancer in Humans
6.1 Introduction
Over time, nearly all biological organisms experience a progressive decline of cellular
structure and function, resulting in a decreased ability for systems to adequately respond to
environmental perturbations and maintain homeostasis. This process, known as aging, is the number
one risk factor for mortality among humans, contributing to an individual’s susceptibility to a
number of distinct conditions such as cardiovascular disease, cancer, diabetes, neurodegenerative
diseases, sarcopenia, lung disease, vision/hearing impairment and frailty (166, 167). Accordingly,
it has been suggested that slowing the aging process would not only increase lifespan, but also
postpone most major illnesses and disability (168).
Since the discovery of mutations with the potential to double the lifespan of model
organisms (169-171), the search for genes that regulate human aging and longevity has gained
significant interest. Using twin data, researchers have estimated that genetic differences account
for 20-30% of the variance in human lifespan, with the remainder being under the influence of
environmental or stochastic factors (96, 172). Yet, it has been suggested that the degree of genetic
influence may vary as a function of environment (173). Evidence from animal models suggests
that genetic factors which influence longevity may be linked to innate stress resistance (123, 174-176);
if so, genetic endowment may play a larger role in longevity under adverse conditions. Under
optimal environmental conditions, such genes may not be essential for survival; however, for
organisms encountering chronic environmental stressors, these genes may facilitate the organism’s
ability to limit physiological damage and maintain cellular structure and function. For instance,
centenarians have been shown to exhibit the same poor health behaviors as other members of their
birth cohort, and as a result, their longevity may be more attributable to advantageous gene variants
than to optimal environment (177).
Smoking is one of the most consistent biological stressors among humans and has been
shown to have drastic consequences for lifespan and disease progression, most notably heart
disease and cancer (178, 179). It is suggested that cigarette exposure may impact the risk of death
and disease via its acceleration of the aging process (121, 122). Yet, not all smokers experience
earlier mortality—in fact, a small proportion manage to survive to extreme ages and there is reason
to believe that these long-lived smokers may represent a biologically distinct group, endowed with
genetic variants allowing them to respond differentially to environmental stressors. In previous
work, we showed that current heavy smokers who had survived to age 80 and beyond had mortality
risks and inflammatory levels similar to non-smoking individuals of the same age—suggesting
that they may be innately equipped to offset the harmful effects of cigarette exposure (24).
Many of the genes associated with stress resistance and longevity in animal models have
been found within pathways, such as the IGF-I/Insulin signaling pathway, that are
evolutionarily conserved among yeast, Drosophila, C. elegans, mice, and humans (176, 180). We
hypothesize that multiple genes within these conserved pathways may influence longevity in a
polygenic manner. While most genome-wide association studies (GWAS) of human longevity
investigate the individual influences of single nucleotide polymorphisms (SNPs), we believe that
incorporating a priori information on networks may allow us to identify functionally related genes
whose effects are too small to observe individually, yet jointly influence aging, longevity, and
disease risks. Therefore, the current study aims to 1) investigate genes associated with the long-
lived smoker phenotype, drawing on previous knowledge of functional interaction networks and
pathways, in order to conceptualize GWAS results; 2) generate a Polygenic Risk Score (PRS)
based on GWAS and network selected SNPs; 3) examine how the genetic score is related to age
in the non-smoking population of middle-aged and older adults; 4) examine how the genetic score
is related to prevalence of disease within both the smoking and non-smoking populations.
6.2 Methods
6.2.1 Discovery and Validation Samples
Participants were part of the 2006 and 2008 waves of the Health and Retirement Study
(HRS), a nationally-representative longitudinal study of health and aging in the U.S. (181). Our
discovery sample was limited to white current smokers only. Cases (N=90) were participants who
reported that they currently smoked and who had survived to at least age 80 at the last wave they
were interviewed, while controls (N=730) were participants who reported that they currently
smoked and who were less than 70 years of age at the last wave they were interviewed. These two
groups were chosen based on prior evidence that suggests younger smokers tend to have the high
mortality risks and physiological declines that we typically associate with prolonged smoking
behavior, whereas smokers who have survived to age 80 and beyond appeared to have mortality
levels and physiological functioning consistent with never-smokers their age—suggesting that this
may be a biologically resilient group (24).
Our validation sample (N=6,447) was made up of HRS participants who self-reported as
non-smokers at the time of their last interview, were ages 52 and above, and who had complete
genetic data from which to generate a PRS. Participants under age 52 were excluded given that
HRS collects data on a nationally-representative sample of older adults (ages 52 and over), and
their spouses, and as a result, younger participants represent spouses of persons 52 and over and
therefore may not be representative of the population their age. Of the validation sample, 4,501 had
missing genotype information for at least one of the SNPs used to create the PRS. When comparing
excluded individuals (ages 50 and over) to our validation sample we found that they did not
significantly differ in age, sex, or smoking status (former versus never). However, our validation
sample was made up of significantly more participants who self-reported their race as white—86%
in the validation sample, and 83% in the excluded sample.
6.2.2 Genotyping and Quality Control
Genotyping was performed for participants who provided saliva samples and signed
consent forms in 2006 and 2008 and was carried out by the NIH Center for Inherited Disease
Research using the Illumina Human Omni-2.5 Quad beadchip, with coverage of approximately 2.5
million single nucleotide polymorphisms (SNPs). Quality assurance and quality control filters
were performed by the Center for Inherited Disease Research (CIDR) and the Genetics
Coordinating Center of the University of Washington (UWGCC). These filters consisted of: 1)
SNP probes, 2) Intensity-only SNPs, 3) CIDR technical filters, 4) Duplicate SNPs, 5) Missing call
rates >=2%, 6) >4 discordant calls in 423 study duplicates, 7) >1 Mendelian error, 8) Hardy-Weinberg
Equilibrium P-values <$10^{-4}$ in European or African samples, 9) Sex differences in
allelic frequency >=0.2, and 10) Sex differences in heterozygosity >0.3. As a result, 2,201,371
SNPs remained. However, given our small sample of cases which could inflate P-values for SNPs
with small minor allele frequencies (MAFs), we set our MAF cutoff at 0.05, which left us with a
total of 1,224,285 SNPs for our analysis.
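As a rough illustration of the minor allele frequency filter described above, the following sketch computes the MAF for each SNP from genotypes coded as 0, 1, or 2 copies of an allele and keeps SNPs at or above the 0.05 cutoff. The SNP names and genotype values are invented, the function names are hypothetical, and treating the cutoff as inclusive is an assumption; the text does not state which software performed this step.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for one SNP from 0/1/2 allele counts; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))   # frequency of the counted allele
    return min(freq, 1 - freq)               # the minor allele is the rarer one


def filter_by_maf(snp_genotypes, cutoff=0.05):
    """Keep SNPs whose MAF is at least `cutoff` (inclusive cutoff is an assumption)."""
    return {snp: g for snp, g in snp_genotypes.items()
            if minor_allele_frequency(g) >= cutoff}


# Hypothetical genotypes for 20 people at three SNPs.
toy = {
    "snp_common": [0, 1, 2, 1, 0, 1, 1, 0, 2, 1, 0, 0, 1, 1, 0, 2, 1, 0, 1, 0],
    "snp_rare": [0] * 19 + [1],   # one minor allele among 40 chromosomes
    "snp_missing": [0, None, 1, 0, 0, 1, None, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}
kept = filter_by_maf(toy)
print(sorted(kept))
```

In this toy example the rare SNP is dropped because a single minor allele among 40 chromosomes gives a MAF of 0.025, below the cutoff.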
Principal components analysis (PCA) was conducted by the HRS to account for population
structure in accordance with the methods described by Patterson et al. (182). This analysis
produced sample eigenvectors (EV). A scree plot generated by HRS showed that the 20
components produced by the PCA only accounted for a small fraction of the overall genetic
variance (<4%) for the full HRS genetic sample and that most of this was contained within the first
two components. We used a logistic regression model to examine the relationship between the
20 EV and our phenotype, and found that none of the EV were significantly associated with being
a long-lived smoker. Nevertheless, we ultimately decided to adjust for the first four eigenvectors
in all subsequent analyses. More information on QC checks and the PCA is provided by HRS
(http://hrsonline.isr.umich.edu/sitedocs/genetics/HRS_QC_REPORT_MAR2012.pdf).
6.2.3 Functional Interaction Network
PLINK’s gene report command (183) was used to map SNPs with $P<5\times10^{-3}$ to genes based
on GRCh37/hg19 coordinates. Using the Cytoscape plugin Reactome FI (184), we examined
functional interaction networks for two gene sets: genes containing SNPs with $P<5\times10^{-3}$ and genes
containing SNPs with $P<5\times10^{-4}$. These two thresholds were selected to allow for more lenient
significance criteria, which may address the problem of missing heritability (185), while limiting the
potential for over-fitting, which could weaken predictive ability in validation studies (186). The
subset with $P<5\times10^{-3}$ was selected for further analysis and validation, given that the network
composed of genes from SNPs with $P<5\times10^{-4}$ had too few SNPs to generate
meaningful polygenic scores. Using the network of genes with SNPs passing the $P<5\times10^{-3}$
significance threshold, we ran pathway enrichment analysis using Reactome FI, which
examines whether the network gene set is overrepresented in Reactome pathways in excess of what
would be expected by chance alone.
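The notion of over-representation "in excess of what would be expected by chance alone" can be sketched with a hypergeometric tail probability. This is a generic illustration only: the text does not state which test Reactome FI applies internally, and the background size, pathway size, and overlap below are hypothetical numbers chosen for the example.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, set_size, overlap):
    """P(X >= overlap) when `set_size` genes are drawn from `background` genes,
    of which `pathway_size` belong to the pathway of interest."""
    upper = min(pathway_size, set_size)
    total = comb(background, set_size)
    tail = sum(comb(pathway_size, k) * comb(background - pathway_size, set_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical inputs: 10,000 annotated genes, a 300-gene pathway,
# and a 215-gene network of which 20 fall inside the pathway.
expected_overlap = 215 * 300 / 10_000   # genes expected in the pathway by chance
p_value = hypergeom_enrichment_p(10_000, 300, 215, 20)
print(f"expected overlap = {expected_overlap:.2f}, P = {p_value:.2e}")
```

Under these assumed inputs, observing 20 pathway genes when about 6 are expected by chance yields a very small tail probability, which is the sense in which a pathway is called enriched.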
6.2.4 Polygenic Risk Score
Polygenic Risk Scores (PRS) were developed as a means of examining the aggregate
influence of multiple genetic markers (187). A PRS can be thought of as a measure of ‘genetic
burden’ (188) and has become increasingly used to facilitate understanding of genetic associations
with complex traits. To generate a PRS, the 215 genes in our final network were mapped back to
the original SNPs. For those mapping to more than one SNP, the SNP with the lowest P-value was
selected to represent that gene. Next, polygenic risk scores (PRS) were calculated for the discovery
(smokers) and the validation samples (non-smokers) from the HRS population.
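The gene-to-SNP step described above, in which each gene is represented by its most significant SNP, might be sketched as follows. The SNP identifiers, gene names, and P-values are hypothetical, and the function name is an assumption introduced for illustration.

```python
def pick_representative_snps(gwas_hits):
    """For each gene, keep the mapped SNP with the lowest GWAS P-value.

    `gwas_hits` is an iterable of (snp_id, gene, p_value) tuples.
    """
    best = {}
    for snp_id, gene, p_value in gwas_hits:
        if gene not in best or p_value < best[gene][1]:
            best[gene] = (snp_id, p_value)
    return {gene: snp for gene, (snp, _) in best.items()}


# Hypothetical hits: GENE_A maps to two SNPs, GENE_B to one.
hits = [
    ("rsA1", "GENE_A", 3.1e-3),
    ("rsA2", "GENE_A", 4.0e-4),
    ("rsB1", "GENE_B", 2.2e-3),
]
representatives = pick_representative_snps(hits)
print(representatives)
```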
The PRS assumes a dose-response effect, where for each SNP, persons who are
homozygous for the negatively associated allele (major allele if the beta coefficient was positive,
and minor allele if the beta coefficient was negative) are coded as 0, persons who are heterozygous
are coded as 1, and persons who are homozygous for the positively associated allele are coded as
2. Finally, the allele counts for each SNP were weighted by the log of their Odds Ratio from the
GWAS and summed across the 215 SNPs from our FI network, to generate the total and component
PRS. Scores were then standardized (mean=0, s.d.=1).
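A minimal sketch of the scoring rule described above is given below, assuming genotypes are stored as minor-allele counts and that each odds ratio refers to the minor allele. When the odds ratio is below 1, the major allele is the positively associated one, so the count is flipped and the absolute log odds ratio is used as the weight. The SNP names, odds ratios, genotypes, and function names are hypothetical.

```python
from math import log, sqrt


def raw_prs(genotypes, odds_ratios):
    """Weighted count of positively associated alleles for one person.

    `genotypes` maps SNP -> minor-allele count (0, 1, 2); `odds_ratios` maps
    SNP -> GWAS odds ratio for the minor allele (both layouts are assumptions).
    """
    score = 0.0
    for snp, count in genotypes.items():
        beta = log(odds_ratios[snp])
        # Count copies of whichever allele is positively associated.
        effect_count = count if beta >= 0 else 2 - count
        score += abs(beta) * effect_count
    return score


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 (population s.d.)."""
    mean = sum(scores) / len(scores)
    sd = sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [(s - mean) / sd for s in scores]


# Hypothetical data: two SNPs, three people.
ors = {"rs1": 1.5, "rs2": 0.5}
people = [
    {"rs1": 2, "rs2": 0},   # two favorable alleles at each SNP
    {"rs1": 1, "rs2": 1},   # heterozygous at both
    {"rs1": 0, "rs2": 2},   # no favorable alleles
]
raw = [raw_prs(p, ors) for p in people]
z = standardize(raw)
print([round(v, 3) for v in z])
```

Whether the standardization in the study used the sample or population standard deviation is not stated; with thousands of participants the difference would be negligible.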
6.2.5 Statistical Analysis
Study methodology is outlined in Figure 6.1. A case-control GWAS (long-lived vs. normal-lived
smokers) was used to identify SNPs that are potentially associated with longevity and
biological stress resistance. Moderately significant SNPs from the GWAS were then mapped to
genes and used to build a genetic network based on a priori experimental proteomic evidence of
identified genetic pathways and gene interactions. SNPs included within the gene network were
used to calculate composite PRS for the entire HRS genetic sample. Using multinomial logistic
regression, controlling for the first four EV, and sex, we examined the association between PRS
and longevity—operationalized as the probability of being in an older age group: ages 80-89, 90-
99, or 100+, relative to being ages 50-79 during the most recent wave interviewed—using a
validation sample of non-smokers from the HRS (n=6,447). To examine whether using the
network-based approach to SNP selection for inclusion in the PRS improved predictive ability, we
compared the association between being very old and our final PRS to the associations between
being very old and four other PRS which utilized other SNP selection criteria: top hits, a random
subset with $P<5\times10^{-3}$, a random subset of the 784 SNPs with $P<5\times10^{-3}$ that also mapped to genes,
and the top hits of the 784 SNPs ($P<5\times10^{-3}$) that also mapped to genes.
Figure 6.1 Study Approach. The study utilized data from the Health and Retirement Study
(HRS) to run a GWAS and network analysis using long-lived smokers as the phenotype of
interest. SNPs identified through these processes were then used to create polygenic risk scores
(PRS). For validation and replication, we examined the association between the score and age
or longevity among non-smokers in the nationally-representative population in HRS.
To compare the PRS, logistic regression models controlling for the first 4 EV and sex were
used, with participant ages 50-79 coded as 0 and participants ages 90 and above coded as 1. The
cut-off age was increased to 90, versus 80 which was used in the initial GWAS, given that our
validation sample was made up of non-smokers for whom survival to age 80 is much more likely.
Finally, we tested the association between PRS and disease prevalence for three major diseases of
aging: heart disease, cancer (other than skin), and diabetes. Over ten waves spanning from 1992
to 2010, participants were asked whether they had ever been diagnosed with each condition. Three
logistic regression models incorporating the panel data and adjusting for repeated observations
using random effects were run to assess the association between PRS and each of the three
conditions. These models were run controlling for age, sex, the first four EV, self-reported race,
education, smoking status at each wave, body mass index at each wave, and sample classification
(discovery cases, discovery controls, or validation sample).
6.3 Results
6.3.1 Genome-Wide Association Study
Our GWAS was run controlling for sex and four eigenvectors which control for population
stratification. Although no SNPs met the genome-wide significance threshold (which is not
surprising given our small sample size), 20 SNPs met the threshold for “suggestive” association
(Figure 6.2a). Also, as shown in our Q-Q plot (Figure 6.2b), we observed a moderate departure from
the null hypothesis of no association, beginning between $P=10^{-3}$ and $10^{-4}$. Studies of ‘missing
heritability’ have suggested that while, for the most part, the SNPs meeting statistical significance
cut-offs in GWAS only account for a relatively small proportion of the variance in a phenotype,
there is evidence that the additional consideration of less significant loci may capture more of the
association with phenotypic heterogeneity.
Figure 6.2 GWAS Results. We found that while no SNPs met the criteria for genome-wide
significance, a number of SNPs had “suggestive” association with longevity among smokers
(2a). Additionally, our Q-Q plot shows that we had more SNPs that had P-values $<5\times10^{-4}$ than
might be expected by chance (2b).
Two P-value thresholds were considered ($P<5\times10^{-3}$ and $P<5\times10^{-4}$) based on previous studies
suggesting that the proportion of the variance explained for a phenotype increases when allowing
the P-value threshold to relax to such levels (185), but that allowing for variables with higher
P-values than this reduced predictive power (186).
6.3.2 Network and Pathway Analysis
SNP locations were mapped to genes for SNPs with $P<5\times10^{-3}$ and $P<5\times10^{-4}$ from the
GWAS. Overall, there were 535 SNPs with $P<5\times10^{-4}$, which mapped to 115 genes, and 5,184 SNPs
with $P<5\times10^{-3}$, which mapped to 784 unique genes. The Cytoscape plugin Reactome FI was used to
construct functional interaction networks and run subsequent pathway enrichment analyses.
Reactome FI was designed to identify network patterns that relate to disease. The database covers
over 50% of human proteins which are used to build functional interaction networks based on a
set of input genes.
Using the 2013 FI network build, we found that 215 of our 784 genes ($P<5\times10^{-3}$) made up
functional networks that had five or more genes each, the largest of which comprised
202 genes (Figure 6.3). The other 569 genes were either not functionally connected to any other
genes that had SNPs with $P<5\times10^{-3}$ or formed networks of three or fewer genes. On the other hand,
only 3 genes were included in the network that utilized a P-value cutoff of $P<5\times10^{-4}$; therefore,
the network with $P<5\times10^{-3}$ was selected for use in further investigation and validation.
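The selection of networks with five or more genes can be pictured as finding connected components in an undirected interaction graph and discarding the small ones. The sketch below assumes the functional interactions are available as a list of gene pairs; the gene names and edges are hypothetical, and this is not a description of how Reactome FI builds its networks internally.

```python
from collections import defaultdict


def connected_components(edges):
    """Group genes into connected components of an undirected interaction graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                      # depth-first traversal from `start`
            gene = stack.pop()
            if gene in seen:
                continue
            seen.add(gene)
            component.add(gene)
            stack.extend(neighbors[gene] - seen)
        components.append(component)
    return components


def networks_of_min_size(edges, min_size=5):
    """Keep only components with at least `min_size` genes."""
    return [c for c in connected_components(edges) if len(c) >= min_size]


# Hypothetical interactions: one 5-gene network and one 3-gene network.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5"), ("G1", "G5"),
         ("G6", "G7"), ("G7", "G8")]
large_networks = networks_of_min_size(edges)
print(large_networks)
```

Genes with no interaction partner never appear in the edge list, so they are excluded automatically, which mirrors the treatment of unconnected genes in the text.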
Next, we ran Reactome FI’s pathway enrichment analysis for these 215 genes in the
network using $P<5\times10^{-3}$ as the significance threshold and found 21 pathways that were enriched
at FDR $<5\times10^{-3}$. The ten most enriched pathways included: PI3K-Akt signaling, pathways in
cancer, signaling by PDGF, glutamatergic synapse, Ras signaling, Rap1 signaling, L1CAM
interactions, Focal adhesion, Netrin-1 signaling, and Netrin-mediated signaling (Table S1).
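For readers unfamiliar with false discovery rate (FDR) thresholds, the following sketch applies the Benjamini-Hochberg adjustment to a set of pathway P-values and keeps those with an adjusted value below $5\times10^{-3}$. The text does not say which FDR procedure Reactome FI uses, so Benjamini-Hochberg is an assumption here, and the P-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment P-values for five pathways.
pathway_p = [0.0001, 0.0004, 0.002, 0.03, 0.2]
q_values = benjamini_hochberg(pathway_p)
enriched = [i for i, q in enumerate(q_values) if q < 5e-3]
print(q_values, enriched)
```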
Figure 6.3 Functional Interaction Network. After mapping SNPs with P-values $<5\times10^{-3}$ to
genes, we found that 215 of them were included within functional interaction networks of five
or more genes: 200 genes were in a single network, and 15 genes made up 3 networks of 5
genes each. Of the 215 SNPs (represented by their respective genes in the network), the majority
had P-values between $5\times10^{-4}$ and $5\times10^{-3}$, associations that would have been overlooked using
a normal GWAS approach.
Figure 6.4 Associations between PRS and Longevity. Overall, the weighted PRS was found
to be fairly evenly distributed and ranged between -3.68 and 6.02 (4a). When comparing the
scores of our GWAS cases and controls, we found that there was no overlap between the two
groups (4b): cases all had scores of 2.34 or greater (with a mean of 4.17), while controls had
scores ranging between -3.41 and 2.32 (with a mean of -0.55). A multinomial logistic regression
model was used to examine the association between PRS and age in a validation sample
(n=6,447). Results were used to predict the proportion of centenarians in the population by PRS
(4c). We found that among individuals with a PRS of -2.0 (two standard deviations below the
mean), only 3.2 in 100,000 persons are predicted to be centenarians. On the other hand, for
individuals with a PRS of 2.0 (two standard deviations above the mean), 340.3 in 100,000
persons are predicted to be centenarians.
6.3.3 Polygenic Risk Score Based Validation
A standardized PRS was generated based on a weighted composite score of the 215 SNPs
from the selected interaction network, and was evenly distributed (Figure 6.4a), with a range from
-3.68 to 6.02 in the overall HRS population (discovery and validation sample). Mean PRS were
compared between our original cases and controls (smokers ages 80+, and smokers under age 70,
respectively) to determine how much of the variation in the original phenotype was explained using
a composite SNP score of only 215 SNPs from the original 1,224,285 SNPs. Results showed that
the score completely accounted for group membership, with no overlap between the two groups
(Figure 6.4b). Of our original 90 cases of long-lived smokers, 49 had complete data on all the
SNPs needed to generate the overall PRS, and we found that for this group, PRS ranged from 2.34
to 6.02, with a mean of 4.17 and a standard deviation of 0.78. Among the 730 controls, 422 had
no missing genotype data for the 215 SNPs, and these participants had PRS ranging from -3.41 to
2.32, with a mean of -0.55 and a standard deviation of 0.95. Not only were scores significantly
higher for the long-lived group, but scores also appeared to be more homogeneous.
Next, using our validation sample, we performed a multinomial logistic regression,
controlling for the four eigenvectors, sex, and race to determine if the PRS was associated with the
probability of being in an older age group—ages 80-89 (n=1,290), 90-99 (n=253), and 100+
(n=4)—relative to those ages 52-79 (n=4,900). Our results showed (Table 6.1) that among these
6,447 participants, a higher PRS was associated with an increased likelihood of being age 90-99
or being a centenarian, relative to being in the youngest age group (52-79). Results showed that a
one unit increase was associated with a 20% greater likelihood of being ages 90-99 compared to
52-79 (OR: 1.20, P=.007), and a 3.3-fold increase in the likelihood of being a centenarian (OR: 3.27,
P=.027). Based on the parameters from Table 6.1, we estimated the predicted proportion of
centenarians in the population of non-smokers, by PRS (Figure 6.4c). We found that for individuals
with a PRS that was two standard deviations below the mean (PRS=-2), only 3.2 in 100,000 were
predicted to be centenarians. For individuals with a mean PRS (PRS=0), 33.2 in 100,000 were
predicted to be centenarians, and for individuals with a PRS that was two standard deviations
above the mean (PRS=2), 340.3 in 100,000 were predicted to be centenarians.
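The relationship among these three predicted rates can be approximated by shifting the rate at the mean PRS along the odds scale using the odds ratio for the 100+ category. This is a simplified two-category approximation, not the multinomial model itself: it ignores the other age categories and uses the rounded odds ratio, so it is only expected to land near the reported values. The function name is hypothetical.

```python
def rescale_rate(rate_per_100k, odds_ratio, delta_prs):
    """Shift a baseline rate by `delta_prs` score units on the odds scale."""
    p0 = rate_per_100k / 100_000
    odds = p0 / (1 - p0) * odds_ratio ** delta_prs
    return 100_000 * odds / (1 + odds)


baseline = 33.2   # predicted centenarians per 100,000 at PRS = 0 (from the text)
or_100 = 3.27     # odds ratio per unit PRS for ages 100+ (Table 6.1)
low = rescale_rate(baseline, or_100, -2)
high = rescale_rate(baseline, or_100, 2)
print(f"PRS=-2: {low:.1f} per 100,000; PRS=+2: {high:.1f} per 100,000")
```

The approximation gives about 3.1 and 354 per 100,000, close to the reported 3.2 and 340.3; the remaining gap is plausibly due to the multinomial normalization over all age categories and to rounding of the odds ratio.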
Table 6.1 Odds Ratios for PRS from a Multinomial Regression Model for Longevity
using the validation sample (N=6,447)
Age Category                Odds Ratio    P-Value
80-89                       1.04          0.204
90-99                       1.20          0.007
100+                        3.27          0.027
<80 (Reference Category)
PRS: Polygenic Risk Score
Model run adjusting for sex and the first four eigenvectors
To provide evidence that using a network based approach to select candidate SNPs
improved our predictive measure, we compared the strength of the association between longevity
and our measure to the associations between longevity and four other weighted PRSs from 215
SNPs selected via other means: top hits (the 215 most significant SNPs from the GWAS), a
random subset of 215 SNPs with $P<5\times10^{-3}$ from the GWAS, a random subset of 215 SNPs that
were part of the 784 SNPs that had $P<5\times10^{-3}$ from the GWAS and that also mapped to genes, and
the 215 most significant SNPs out of the subset of 784 SNPs (those with $P<5\times10^{-3}$ that also mapped
to genes). To validate our approach we used five separate logistic regression models—one per
individual PRS—with the outcome being 1 if an individual was 90+ years old and 0 if he/she was
ages 52-79. Models were run using our validation sample only and controlling for the four
eigenvectors, sex, and self-reported race (Table 6.2). Of the five models, our original measure
which utilized the functional interaction network to select SNPs for inclusion in the PRS was the
only PRS variable found to be statistically significant (OR: 1.22, P=.005)—and it remained
significant even after Bonferroni adjustment (P<.01). On the other hand, the four other PRSs were
not significantly associated with being age 90 and over, and had Odds Ratios between 0.83 and
1.10 with P-values ranging from .122 to .872. Lists of SNPs and corresponding gene sets used for
the other PRS measures are available upon request.
Table 6.2 Comparing the Performance of Various PRS for Predicting Longevity in the
validation sample (N=6,447)
PRS Method                  Odds Ratio    P-Value
Network Based PRS (a)       1.21          .005
Top SNPs (b)                0.95          .649
Random SNPs (c)             0.83          .122
Random Genes (d)            0.98          .872
Top Genes (e)               1.10          .253
(a) Original PRS generated from SNPs within the Functional Interaction Network
(b) PRS of the 215 top hits from the GWAS
(c) PRS from a random subset of 215 SNPs with $P<5\times10^{-3}$
(d) PRS from a random subset of 215 SNPs with $P<5\times10^{-3}$ that also mapped to genes
(e) PRS of the 215 most significant hits that also mapped to genes
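The Bonferroni comparison across the five scores can be reproduced in a few lines. The P-values are those reported in Table 6.2; the family-wise alpha of 0.05 is an assumption inferred from the stated adjusted threshold of P<.01 for five tests.

```python
ALPHA = 0.05   # assumed family-wise error rate

# P-values as reported in Table 6.2.
p_values = {
    "Network Based PRS": 0.005,
    "Top SNPs": 0.649,
    "Random SNPs": 0.122,
    "Random Genes": 0.872,
    "Top Genes": 0.253,
}
threshold = ALPHA / len(p_values)   # Bonferroni: divide alpha by the number of tests
significant = [name for name, p in p_values.items() if p < threshold]
print(f"threshold = {threshold:.3f}; significant: {significant}")
```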
Finally, we examined the association between our original PRS and disease prevalence
measures for heart disease, cancer, and diabetes. Over the ten waves, which spanned approximately
18 years, 20% of participants self-reported having been diagnosed with heart disease, 15.5% self-
reported having been diagnosed with diabetes, and 12% self-reported having been diagnosed with
cancer other than skin cancer at some point during their lifetime.
Table 6.3 Random Effects Logistic Regression Models of the Association between PRS and
Disease Prevalence
                          Cancer (Non-Skin)         Heart Disease             Diabetes
                          Observations=49,891       Observations=49,931       Observations=49,920
                          N=6,434                   N=6,434                   N=6,434
                          Odds Ratio   P-Value      Odds Ratio   P-Value      Odds Ratio   P-Value
PRS                       0.61         0.002        0.902        0.38         0.955        0.745
Age (Years)               2.573        <0.001       2.641        <0.001       2.035        <0.001
EV1                       1.0E+134     <0.001       2.10E+08     0.693        1.30E+73     0.006
EV2                       2.65E-13     0.253        7.89E-10     0.21         2.40E+107    <0.001
EV3                       7.02E+43     <0.001       6.89E+25     <0.001       1.11E+37     <0.001
EV4                       1.16E+35     <0.001       2.20E+14     0.012        5.71E-12     0.129
BMI
  Underweight (<18.5)     2.522        0.146        0.935        0.89         0.54         0.406
  Overweight (25-29.9)    1.339        0.141        0.846        0.29         1.339        0.107
  Obese (30+)             2.248        0.003        1.309        0.176        3.088        <0.001
Education
  GED                     5.056        0.049        56.717       <0.001       3.404        0.101
  High-School Graduate    1.499        0.365        2.268        0.011        0.719        0.435
  Some College            0.783        0.613        3.947        <0.001       1.41         0.432
  College and Above       15.766       <0.001       1.524        0.249        0.885        0.784
Race (Self-Reported)
  Black/AA                <0.001       <0.001       2.891        0.413        58.857       0.011
  Other                   <0.001       <0.001       14.508       0.001        4.29         0.185
Sex (Female=1)            0.019        <0.001       0.058        <0.001       0.144        <0.001
Ever Smoked               1.307        0.411        5.748        <0.001       1.336        0.299
Currently Smoke           0.114        <0.001       0.043        <0.001       0.508        0.009
Sub-Sample
  Discovery Cases         6.08E-07     <0.001       8.01E-07     <0.001       4.73E-07     <0.001
  Validation              3.24E-06     <0.001       0.001        <0.001       <0.001       <0.001
Constant                  1.15E-33     <0.001       5.64E-32     <0.001       1.22E-24     <0.001
                          OR           S.E.         OR           S.E.         OR           S.E.
$\ln(\sigma_u^2)$         6.257        0.032        5.764        0.034        5.597        0.034
$\sigma_u$                22.841       0.37         17.848       0.3          16.42        0.276
rho                       0.994        <0.001       0.99         <0.001       0.988        <0.001
OR= Odds Ratio, S.E.=Standard Error, PRS= Polygenic Risk Score; EV= Eigenvector; BMI=Body Mass
Index; GED=General Educational Development Degree; AA= African American. Disease prevalence run
as three separate logistic regression models. The reference categories for independent variables in each
model were: Normal weight (18.5-24.9) for BMI; No high school degree or equivalent for education;
White/Caucasian for self-reported race; male for sex; and discovery controls for sub-sample.
Logistic regression models were run including both the discovery and the validation sample, given
that smokers make up a large proportion of the disease prevalence for these three conditions.
Nevertheless, we included dummy variables to adjust for whether participants were cases from our
discovery set, controls from our discovery set, or part of the validation sample. We also examined
whether interactions between the sub-sample classification and the PRS were significant, which
would suggest that the association between PRS and the disease phenotype varied in magnitude
between the three groups. Repeated measures data across all waves were pooled over time and
included an indicator to correct for related sample observations. Our results showed that higher
PRS was significantly associated with lower prevalence of cancer (Table 6.3). For each one
standard deviation increase in the PRS, the likelihood that an individual had ever been diagnosed
with cancer was reduced by nearly 40% (OR: 0.61; P=.002; 95%CI: 0.44-0.84). Furthermore, we
found no significant interactions between PRS and the three sub-sample classifications (P=.814
for the interaction between PRS and being a long-lived smoker relative to a normal-aged smoker; P=.450
for the interaction between PRS and being a non-smoker relative to a normal-aged smoker). This
suggests that the trend between PRS and cancer prevalence held for individuals regardless of
whether they were long-lived smokers, normal-aged smokers, or non-smokers. When
examining the association between PRS and either heart disease or diabetes, we found no
statistically significant relationships (Table 6.3).
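As a consistency check on the cancer result, the reported odds ratio and 95% confidence interval can be converted into an approximate standard error on the log-odds scale and a two-sided Wald P-value. This is a back-of-the-envelope sketch that assumes a symmetric interval on the log scale with a critical value of 1.96; it is not the model's own test, and the function name is hypothetical.

```python
from math import erfc, log, sqrt


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided Wald P-value implied by an OR and its 95% CI."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)   # s.e. of log(OR)
    z = log(odds_ratio) / se
    return erfc(abs(z) / sqrt(2))                      # two-sided normal tail


odds_ratio, ci_low, ci_high = 0.61, 0.44, 0.84   # values reported in the text
percent_lower_odds = (1 - odds_ratio) * 100
p_approx = p_from_or_ci(odds_ratio, ci_low, ci_high)
print(f"{percent_lower_odds:.0f}% lower odds, approximate P = {p_approx:.4f}")
```

The implied P-value is roughly 0.003, in line with the reported P=.002 given the rounding of the interval, and the odds ratio corresponds to 39% lower odds per standard deviation of PRS.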
6.4 Discussion
For most individuals, environment may play a major role in their probability of postponing
disease and reaching old age. However, for those under chronic exposure to exogenous stressors,
such as cigarette smoke, genetic variants may act as key factors in determining whether individuals
are able to delay the age-related progressive decline in physiological functioning by offsetting
damage through activation of somatic maintenance and repair mechanisms. As a result, survival
among smokers may serve as a unique model for examining the genetics of stress resistance, aging,
and longevity. Using long-lived smokers as our phenotype, we were able to identify a network of
SNPs that, collectively, were strongly associated with extreme survival and lower cancer rates in
an independent validation sample of nationally-representative non-smokers.
Our findings suggest that a one standard deviation increase in the genetic load of the 215
SNPs we identified was associated with a 20% increase in the likelihood that an individual was ages
90-99 and an over three-fold increase in the likelihood of being a centenarian. Additionally, our
model predicted that approximately 340 in 100,000 individuals who had a PRS that was two
standard deviations above the mean would be a centenarian, compared to only 33 and 3 in 100,000
who had a PRS that was at the mean or two standard deviations below the mean, respectively.
According to the U.S. Census Bureau, in 2010 there were just over 17 centenarians per 100,000
people in the U.S., which appears similar to our estimate for mean PRS, taking into account that
our validation sample is made up of participants ages 52 and over and only includes non-smokers.
However, additional studies utilizing different samples that include larger numbers of centenarians
should be conducted to better understand the association between the PRS we generated in the
current study and an individual’s likelihood of surviving to age 100 and beyond.
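As a rough consistency check on these figures, the per-standard-deviation ratio implied by the predicted rates can be computed directly. The sketch below assumes that, because centenarians are rare, ratios of predicted rates approximate odds ratios, and that the effect is log-linear in the PRS. The function name is illustrative and is not part of the original analysis.

```python
# Predicted centenarians per 100,000, as reported in the text,
# keyed by PRS in standard deviations from the mean.
per_100k = {-2: 3, 0: 33, 2: 340}


def per_sd_ratio(low_rate, high_rate, sd_gap):
    """Geometric per-SD ratio implied by two rates that are sd_gap SDs apart.

    Assumes a log-linear effect of the PRS (an assumption made for this check).
    """
    return (high_rate / low_rate) ** (1.0 / sd_gap)


# Mean to +2 SD, and -2 SD to mean: each spans two standard deviations.
ratio_upper = per_sd_ratio(per_100k[0], per_100k[2], 2)
ratio_lower = per_sd_ratio(per_100k[-2], per_100k[0], 2)

print(round(ratio_upper, 2), round(ratio_lower, 2))
```

Both ratios come out slightly above 3, in line with the over three-fold increase per standard deviation reported above.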
One of the major physiological risks of the exogenous genotoxic exposure that accompanies
smoking is the accumulation of DNA damage (189). However, it is likely that long-lived smokers
possess variants which prevent genomic instability and allow them to survive to more extreme
ages. Genomic instability also happens to be one of the hallmarks of cancer pathogenesis (190);
thus, the same genes that may promote survival among smokers may also be important for cancer
prevention. This is consistent with our findings, which showed that the genes we identified through
our GWAS and network analysis on long-lived smokers were collectively associated with a nearly
40% lower cancer prevalence in the general population. Additionally, our functional interaction
network of 215 genes was significantly enriched with Pathways in cancer, as well as Ras-signaling,
Rap1 signaling pathways, and Signaling by PDGF, all of which have implications for cancer
pathogenesis (191-193).
Pathways which are believed to be potential regulators of the aging process were also
enriched in our network. Overall, results showed that the PI3K/AKT signaling pathway had the
highest enrichment score. This pathway has previously been shown to comprise genes related to
stress resistance, DNA repair, cell death, protein turnover, and antioxidants (194). The PI3K/AKT
pathway is activated via insulin/insulin-like growth factor (IGF) signaling (IIS). IIS is
evolutionarily conserved and has been shown to exert a strong influence on lifespan in model
organisms (175), and there is further evidence to suggest it may play an important role in human
longevity (195). In worms, the transcription factor DAF-16 (abnormal Dauer Formation-16) is a key
regulator of IIS, and has been found to be fundamental for extreme lifespan extension (196). The
FOXO family of transcription factors are the human homologs of DAF-16, and FOXO3a has been
shown to be one of the most consistently cited longevity genes in human populations (197). Our
network analysis and PRS included SNP rs12203834 (Chr6:108975562), which is an intron variant
in FOXO3a. While rs12203834 has not been previously cited, according to CEU HapMap data, it
is in perfect linkage disequilibrium (r² = 1) with two SNPs which have been previously associated
with extreme longevity in two distinct populations: Hawaiian men of Japanese ancestry
(rs13217795) and German men and women (rs9400239) (198, 199). This suggests that FOXO3a
may be an important hub gene in pathways regulating longevity and aging. Nevertheless, it is likely
that additional hubs may simultaneously be important for extreme survival, especially under
adverse conditions.
Previous studies have provided evidence suggesting that lifespan is a polygenic trait (200),
influenced by multiple alleles with individually small effects. However, many SNPs associated with
a given phenotype will go unnoticed in traditional association studies which rely on strict
significance criteria, thus contributing to what is being termed the “missing heritability problem”
(201). There is an urgent need both for methods that allow the examination of
cumulative associations across SNPs and for reliable methods of selecting SNPs for inclusion
in predictive measures. Our results and those of others illustrate the usefulness of polygenic measures (186,
202); nevertheless, best practices for SNP selection in the creation of these measures have been less
concrete. It has been suggested that network-based analyses may facilitate variant selection for
polygenic traits (203). Given the evidence that phenotypes like longevity may be influenced by
genes within specific pathways and networks (180), we believe that the use of prior knowledge,
such as functional interaction network analysis, provides better inclusion criteria for composite
scores than methods that only consider top GWAS hits. This is consistent with the present study,
which showed that PRSs composed of SNPs identified using other means (top hits) were not
significantly associated with longevity, while the PRS made up of SNPs in a functional interaction
network was found to be a significant predictor of whether an individual was 90 years or older.
Furthermore, the strength of this association increased further when predicting whether a
participant was a centenarian, remaining significant even with a very small sample size. This is
consistent with previous studies reporting that genes may be more important for extreme
longevity than for variations in lifespan within typical ranges (199).
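To make the idea of a composite score concrete, a polygenic score can be sketched as a weighted count of alleles that is then standardized, so that associations can be expressed per standard deviation. This is only an illustration: the genotype coding (0, 1, or 2 copies of an allele), the weights, the toy data, and the function name are assumptions, not the exact construction used for the PRS in this study.

```python
from statistics import mean, pstdev


def polygenic_scores(genotypes, weights):
    """Return standardized polygenic scores, one per individual.

    genotypes: list of per-person allele counts (0, 1, or 2) for each SNP.
    weights: one weight per SNP (hypothetical values in this sketch).
    """
    # Raw score: weighted sum of allele counts across the selected SNPs.
    raw = [sum(w * g for w, g in zip(weights, person)) for person in genotypes]
    # Standardize so that one unit equals one standard deviation of the score.
    mu, sd = mean(raw), pstdev(raw)
    return [(r - mu) / sd for r in raw]


# Toy data: three individuals, three SNPs (illustrative only).
toy_genotypes = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
toy_weights = [0.1, 0.2, -0.1]
z_scores = polygenic_scores(toy_genotypes, toy_weights)
```

A network-based score and a top-hits score differ only in which SNPs enter the sum, which is why the inclusion criteria discussed above matter.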
Through our use of a unique phenotype, functional interaction networks to select SNPs,
and methods allowing examination of associations with aging-related phenotypes using composite
measures of multiple genetic variants, we developed a genetic risk score that was significantly
associated with an individual’s likelihood of surviving to extreme old age and was also found to predict
lower cancer prevalence. Overall, our findings suggest that longevity may be under the regulation
of complex genetic networks which influence stress resistance and genomic stability. In moving
forward, it will be important to examine how functional variants associated with the SNPs in our
score interact with one another to impact signaling within their respective pathways and how these
alterations translate into differences in lifespan and cancer risk.
Chapter 7: Conclusions and Outlook
7.1 Discussion of Study Results
In this dissertation I have shown that complex statistical models 1) can aid in developing
accurate measures of the human aging process that reflect differences in environment, and 2) can
be used to quantify genetic profiles to study the influence of genomic networks on aging and
longevity phenotypes. In Chapters 2 through 4, using mathematical algorithms, I
provided evidence that biological aging can be reliably quantified in humans and that these
measures predict mortality more accurately than chronological age, account for life expectancy
differences between groups and individuals, and reflect the impact of environmental stressors, like
smoking and excess nutritional intake.
In moving forward, using a multi-system approach to quantify the complex progressive
deterioration that the body undergoes with time will enable us to examine the mechanisms that
lead to aging heterogeneity within a population. This is particularly beneficial for human aging
studies, since testing the effects of environmental or molecular manipulations would take decades,
especially if reliance on mortality outcomes is necessary for quantifying differences in the pace of
aging. The work presented here shows that biological age measures could serve as reliable proxies
for life expectancy, and thus examining how changing environmental or behavioral characteristics
influence aging and health span, or determining the efficacy of aging therapies, could be carried
out in a more feasible time frame.
Such measures may also prove useful for the development of more targeted prevention
strategies, given that they are able to identify at-risk individuals prior to the occurrence of a
negative health event. My work in Chapter 3 showed that biological age differences accurately
identified at-risk individuals in the population and provided evidence that interventions that
reduced disparities in biological age would also diminish differences in lifespan in the decades to
come. Additionally, in Chapter 4, I showed that even among young adults, biological age
reflected the negative health consequences of obesity and smoking. Thus interventions that target
individuals early in life—when variations in the pace of aging begin to diverge—could have
dramatic benefits for the future health of the population.
In this dissertation I also present a method for quantifying complex genetic networks that
add to our understanding of the biological mechanisms that contribute to aging, while also allowing
us to identify individuals who may respond differentially to given environments. The great majority of
existing genetic studies on aging and longevity focus solely on the identification of individual
mutations, with the intention of finding a singular ‘longevity gene’. However, we know that genes
and their protein products typically do not operate in isolation, but rather are likely to interact
with one another in complex functional networks (204). Thus, it is likely that genetic variants
influence age-related decline and lifespan in a polygenic (additive) or epistatic (interactive)
manner. This hypothesis was supported by the results described in Chapter 6, which showed that
while none of the 1.2 million SNPs met genome-wide significance (P < 5×10⁻⁸) on their own, when
information was combined across a cluster of functionally inter-connected SNPs, the score
generated from this network was strongly associated with increased longevity and decreased
cancer prevalence. Furthermore, this network was significantly enriched with genes that were part
of evolutionarily conserved pathways which have been shown to have dramatic effects on the
lifespan of yeast, flies, worms, and mice (180).
7.2 Proposal of a Novel Method for Identifying Aging Gene Networks
While the method presented in this dissertation for quantifying genetic risk based on
networks did produce significant predictions for aging-related mortality outcomes such as
longevity and cancer, these methods can continue to be improved upon. Due to the
multidimensional nature of aging, genes that together influence the pace of senescence should also
be significantly associated with differences in the risk of a multitude of morbidity outcomes. The
method presented in Chapter 6 of this dissertation relied on previously reported information in
functional interaction databases that are admittedly biased toward reporting diseases such as cancer,
which may explain why the genetic network that was discovered was associated with cancer and
not diabetes or heart disease. In order to overcome this, future SNP-based gene networks
will need to be generated based on statistical inference rather than literature-based approaches.
While such methods have been employed using transcriptomic data, by relying on correlations
between the expression levels of gene pairs (205, 206), such methods will not work when it comes
to identifying genomic networks. A correlation matrix generated between SNPs would not reflect
functional relatedness as it does for RNA-sequencing data, but rather, as a result of
recombination and linkage disequilibrium, would represent chromosomal location, adding little to
our understanding of the functional genomics of complex traits like aging.
To overcome this, in the next stage of my research, building upon the work I presented in
this dissertation, I am proposing to adapt results from GWAS so that they can be incorporated into
the methods used for weighted gene co-expression network analysis (WGCNA). I propose to
accomplish this by running multiple GWAS (potentially up to 40) across a variety of datasets and
aging phenotypes: longevity, biological age, cognitive decline, cancer, diabetes, heart disease,
disability, etc. Next, instead of testing correlations between expression levels across study subjects
(as is done in traditional transcriptomics-based WGCNA) I will test for correlations between the
coefficients for each gene across the 40 studies (Figure 7.1). This will then produce a correlation
matrix that represents the covariance of genes with respect to multiple aging and longevity-related
phenotypes. This matrix can then be transformed into an adjacency matrix, from which topological
overlap measure (TOM) and hierarchical clustering can be used to group networked genes, or
modules, from which genetic load measures (eigengenes) can be generated (206).
Figure 7.1: Adaptation of transcriptomic WGCNA to Genomic WGCNA. In traditional
WGCNA that relies on transcriptomics, input datasets are made up of a data matrix where
subjects are typically assigned to columns and genes are assigned to rows—values in each cell
represent the subject’s expression level for the given gene. In order to run these types of models
using genomics data, I propose to use a data matrix where genes are still listed in rows, but now
GWAS studies (different phenotypes or datasets) are listed in columns. Therefore, values in
each cell will now represent the given gene’s coefficient from a particular GWAS.
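The proposed pipeline can be sketched in a few lines, assuming NumPy and SciPy are available. The sketch takes a genes-by-GWAS coefficient matrix, computes gene-gene correlations across studies, raises them to a soft-threshold power to obtain an adjacency matrix, derives the topological overlap, and clusters on 1 - TOM. The soft-threshold power, the number of modules, the unsigned adjacency, the average-linkage choice, and the synthetic coefficients are all assumptions made for illustration; eigengene estimation is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def genomic_wgcna_modules(coef, beta=6, n_modules=2):
    """Assign genes to modules from a (n_genes, n_studies) coefficient matrix.

    beta and n_modules are hypothetical tuning values for this sketch.
    """
    coef = np.asarray(coef, dtype=float)
    # Gene-by-gene correlation of GWAS coefficients across studies.
    corr = np.corrcoef(coef)
    # Unsigned soft-threshold adjacency with no self-connections.
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)
    # Topological overlap: shared neighbours plus the direct link,
    # scaled by the smaller connectivity of the two genes.
    k = adj.sum(axis=1)
    tom = (adj @ adj + adj) / (np.minimum.outer(k, k) + 1.0 - adj)
    # Average-linkage hierarchical clustering on the dissimilarity 1 - TOM.
    dist = 1.0 - tom
    dist = (dist + dist.T) / 2.0  # guard against floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_modules, criterion="maxclust")


# Synthetic example: 10 genes across 40 GWAS, built from two distinct patterns.
rng = np.random.default_rng(0)
t = np.arange(40)
pattern_a = np.sin(2 * np.pi * t / 40)
pattern_b = np.cos(2 * np.pi * t / 40)
coef = np.vstack(
    [pattern_a + 0.1 * rng.normal(size=40) for _ in range(5)]
    + [pattern_b + 0.1 * rng.normal(size=40) for _ in range(5)]
)
labels = genomic_wgcna_modules(coef)
```

In this toy case the first five genes share one coefficient pattern across studies and the last five share another, so the two groups should be recovered as separate modules.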
Once the genomic networks, or eigengenes, are estimated, they can be incorporated into
multi-level networks that include genomic, transcriptomic, proteomic, environmental, and
phenotypic data. Given the complexity of the aging process, as well as the causal interactions at both
the molecular and environmental level, approaches from systems biology which utilize complex and
nonlinear models may be helpful for systematically studying and categorizing the various genetic
and environmental contributions to aging. Networks provide an intuitive approach for modeling
the interactions between biology, phenotype, and the environment, and from them accurate
predictive models can be generated to test the contributions the various network components have
in producing the final phenotypic outcomes (Figure 7.2).
Figure 7.2: Theoretical Multilevel Network. Networks likely exist at multiple levels. Thus
lower-level network functioning will have implications for higher-level networks, and
ultimately for outcomes like mortality.
7.3 Network Theory of Aging
Billions of systems or networks of various sizes and complexities combine to make up a
larger system, which represents the organism. Thus it is likely that with aging, the complexity of
these networks decreases, as dysregulation increases—the network interaction and connectivity
degrades as nodes lose the ability to respond to signals from one another. From physics, we know
that over time, disorder, also known as entropy, increases within a system—causing the system to
transition from an initial (complex) state to a final (simpler) state characterized by uniform chaos.
This is consistent with the idea that the various multi-level networks within an organism, and
between the organism and its environment, become less regulated and more disordered with age.
Endogenous and exogenous stressors—such as reactive oxygen species—may further disrupt the
systems, thus increasing entropy.
Nevertheless, according to the second law of thermodynamics, the time it takes for a
system—or organism—to transition from the initial to the final state can also be modified via
energy allocation. Work is required to maintain or produce order (207). Similar to what is
suggested by Kirkwood’s Disposable Soma Theory, evolutionary forces may determine how
much energy is allocated to slowing the march of entropy by maintaining these complex networks
(14).
Another factor that may influence system degradation is the initial size and
complexity of the networks. This may also explain why certain organisms live significantly longer
than others. There are both pros and cons to increasing network complexity and size when it
comes to the longevity of the overall system. While larger, more complex networks may be harder
to maintain (requiring more energy to decrease entropy), small perturbations to a few nodes will
likely take longer to propagate through the network and disrupt the entire system. Thus some
network features, such as size, complexity, and maintenance may in fact be programmed
contributors to lifespan. Finally, the idea that damage may start in individual nodes and then work
its way through the network is consistent with observations from individuals and populations
which show that declines start slowly and then begin to accumulate until finally surpassing some
critical threshold. This is apparent when examining the exponential mortality increases in a
population, as well as the physiological functioning deficits in an individual. This theory also
supports the observation that aging contributes to a number of degenerative diseases that often
accumulate together. In examining this theory further, stochastic simulations may allow
such theories to be tested, as long as the parameters (network complexity, size, organization,
maintenance from energy, entropy, etc.) can be defined as either levels or distributions and the
predicted outcomes, such as lifespan or mortality curves, can be compared to real data.
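A minimal sketch of such a stochastic simulation is given below. It is one possible toy model, not a validated one: nodes fail spontaneously, failure spreads from failed neighbours, a repair probability stands in for energy spent on maintenance, and the system "dies" once the failed fraction passes a critical threshold. Every parameter name and default value is hypothetical.

```python
import random


def simulate_lifespan(n_nodes=200, degree=4, base_fail=0.002, spread=0.05,
                      repair=0.01, threshold=0.5, max_steps=5000, seed=0):
    """Return the step at which the failed fraction first exceeds threshold.

    Returns max_steps if the threshold is never crossed (censored lifespan).
    """
    rng = random.Random(seed)
    # Random undirected network: each node links to `degree` sampled nodes.
    neighbours = [set() for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in rng.sample(range(n_nodes), degree):
            if j != i:
                neighbours[i].add(j)
                neighbours[j].add(i)
    failed = [False] * n_nodes
    for step in range(1, max_steps + 1):
        nxt = list(failed)
        for i in range(n_nodes):
            if failed[i]:
                # Maintenance: a failed node is restored with probability `repair`.
                if rng.random() < repair:
                    nxt[i] = False
            else:
                nbrs = neighbours[i]
                frac = sum(failed[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
                # Damage arises spontaneously and spreads from failed neighbours.
                if rng.random() < base_fail + spread * frac:
                    nxt[i] = True
        failed = nxt
        if sum(failed) / n_nodes > threshold:
            return step
    return max_steps


# Simulated lifespans for a small cohort of networks that differ only by seed.
lifespans = [simulate_lifespan(n_nodes=50, max_steps=2000, seed=s) for s in range(5)]
```

Repeating the run over many seeds and parameter settings would yield simulated lifespan distributions that could, in principle, be compared with observed mortality curves.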
7.4 Closing Remarks
The use of mathematical and statistical algorithms is essential for our understanding of
the aging process. The ever-growing availability of large high-dimensional datasets, advances
in biotechnology, and the increasing accessibility and scalability of computing models which facilitate
big data analytic strategies will enable us to dig deeper in uncovering the mechanisms that control
aging and lifespan. The bridging of social and behavioral sciences to the STEM fields,
capitalizing on statistical methods that combine environmental data with omics data, will
eventually allow us to translate what is discovered in the lab to what we observe in large
populations. Given the complexities of human biology and aging, these models are not only
essential, but represent some of the most efficient and practical ways to improve our understanding
of the basic mechanisms of life. While we will all succumb to old age, it is possible that such
models might provide insight that will someday allow us to meaningfully intervene in the lives of
individuals with the goal of extending healthy life expectancy far beyond what exists today.
Bibliography
1. Finch C. The biology of human longevity : inflammation, nutrition, and aging in the
evolution of life spans. 1st ed. Burlington, MA: Academic Press; 2007.
2. Moseley JB. Cellular Aging: Symmetry Evades Senescence. Current Biology.
2013;23:R871-R873.
3. Arking R, Baker GT. Human Aging - Biological Perspectives - Digiovanna,Ag. The
Gerontologist. 1995;35:845-845.
4. Harman D. The aging process: Major risk factor for disease and death. Res Leg Med.
2002;27:53-77.
5. Fries JF, Bruce B, Chakravarty E. Compression of morbidity 1980-2011: a focused
review of paradigms and progress. Journal of aging research. 2011;2011:261702.
6. Yashin AI, Ukraintseva SV, Boiko SI, Arbeev KG. Individual aging and mortality rate:
How are they related? Soc Biol. 2002;49:206-217.
7. Hayflick L. Biological aging is no longer an unsolved problem. Ann Ny Acad Sci.
2007;1100:1-13.
8. Partridge L. The new biology of ageing. Philos T R Soc B. 2010;365:147-154.
9. Rattan SIS. Theories of biological aging: Genes, proteins, and free radicals. Free radical
research. 2006;40:1230-1238.
10. Hayflick L. Entropy explains aging, genetic determinism explains longevity, and
undefined terminology explains misunderstanding both. PLoS genetics. 2007;3:2351-2354.
11. Harman D. Aging: a theory based on free radical and radiation chemistry. J Gerontol.
1956;11:298-300.
12. Weismann A. Ueber die Dauer des Lebens, ein Vortrag. Jena: G. Fischer; 1882.
13. Goldsmith TC. Aging as an evolved characteristic - Weismann's theory reconsidered.
Med Hypotheses. 2004;62:304-308.
14. Kirkwood TBL, Holliday R. Evolution of Aging and Longevity. Proc R Soc Ser B-Bio.
1979;205:531-546.
15. Kirkwood TB. Evolution of ageing. Mechanisms of ageing and development.
2002;123:737-745.
16. de Magalhaes JP. Programmatic features of aging originating in development: aging
mechanisms beyond molecular damage? Faseb J. 2012;26:4821-4826.
17. Longo VD, Mitteldorf J, Skulachev VP. Opinion - Programmed and altruistic ageing.
Nature Reviews Genetics. 2005;6.
18. Mackay TF. Aging in the post-genomic era: simple or complex? Genome Biology.
2000;1:reports4018.4011 - reports4018.4016.
19. Levine ME, Suarez JA, Brandhorst S, Balasubramanian P, Cheng CW, Madia F, et al.
Low protein intake is associated with a major reduction in IGF-1, cancer, and overall mortality in
the 65 and younger but not older population. Cell metabolism. 2014;19:407-417.
20. Doll R, Peto R, Boreham J, Sutherland I. Mortality in relation to smoking: 50 years'
observations on male British doctors. Bmj. 2004;328:1519.
21. Russ TC, Stamatakis E, Hamer M, Starr JM, Kivimaki M, Batty GD. Association
between psychological distress and mortality: individual participant pooled analysis of 10
prospective cohort studies. Bmj. 2012;345:e4933.
22. Colman RJ, Beasley TM, Kemnitz JW, Johnson SC, Weindruch R, Anderson RM.
Caloric restriction reduces age-related and all-cause mortality in rhesus monkeys. Nature
communications. 2014;5:3557.
23. Liao CY, Rikke BA, Johnson TE, Diaz V, Nelson JF. Genetic variation in the murine
lifespan response to dietary restriction: from life extension to life shortening. Aging cell.
2010;9:92-95.
24. Levine M, Crimmins E. Not all smokers die young: a model for hidden heterogeneity
within the human population. PloS one. 2014;9:e87403.
25. Jeune B, Robine J-M, Young R, Desjardins B, Skytthe A, Vaupel J. Jeanne Calment and
her successors. Biographical notes on the longest living humans. In: Maier H, Gampe J, Jeune B,
Robine J-M, Vaupel JW, eds. Supercentenarians. Demographic Research Monographs: Springer
Berlin Heidelberg; 2010:285-323.
26. Longo VD, Fabrizio P. Regulation of longevity and stress resistance: a molecular strategy
conserved from yeast to humans? Cellular and molecular life sciences : CMLS. 2002;59:903-
908.
27. Chavali S, Barrenas F, Kanduri K, Benson M. Network properties of human disease
genes with pleiotropic effects. BMC systems biology. 2010;4:78.
28. Levine ME. Modeling the rate of senescence: can estimated biological age predict
mortality more accurately than chronological age? The journals of gerontology Series A,
Biological sciences and medical sciences. 2013;68:667-674.
29. Levine ME, Crimmins EM. Evidence of accelerated aging among African Americans and
its implications for mortality. Social science & medicine. 2014;118:27-32.
30. Yin D, Chen K. The essential mechanisms of aging: Irreparable damage accumulation of
biochemical side-reactions. Experimental gerontology. 2005;40:455-465.
31. Comfort A. Test-Battery to Measure Ageing-Rate in Man. Lancet. 1969;2:1411-&.
32. Kirkwood TBL. Alex Comfort and the measure of aging. Experimental gerontology.
1998;33:135-140.
33. Johnson TE. Recent results: Biomarkers of aging. Experimental gerontology.
2006;41:1243-1246.
34. Sprott RL. Biomarkers of aging and disease: Introduction and definitions. Experimental
gerontology. 2010;45:2-4.
35. Hollings.Jw, Hashizum.A, Jablon S. Correlations between Tests of Aging in Hiroshima
Subjects-an Attempt to Define Physiologic Age. Yale J Biol Med. 1965;38:11-&.
36. Takeda H, Inada H, Inoue M, Yoshikawa H, Abe H. Evaluation of Biological Age and
Physical Age by Multiple-Regression Analysis. Med Inform. 1982;7:221-227.
37. Kroll J, Saxtrup O. On the use of regression analysis for the estimation of human
biological age. Biogerontology. 2000;1:363-368.
38. Bae CY, Kang YG, Kim S, Cho C, Kang HC, Yu BY, et al. Development of models for
predicting biological age (BA) with physical, biochemical, and hormonal parameters. Arch
Gerontol Geriat. 2008;47:253-265.
39. Hofecker G, Skalicky M, Kment A, Niedermuller H. Models of the Biological Age of the
Rat .1. A Factor Model of Age Parameters. Mechanisms of ageing and development.
1980;14:345-359.
40. Nakamura E, Miyao K. A method for identifying biomarkers of aging and constructing
an index of biological age in humans. J Gerontol a-Biol. 2007;62:1096-1105.
41. Nakamura E, Miyao K, Ozeki T. Assessment of Biological Age by Principal Component
Analysis. Mechanisms of ageing and development. 1988;46:1-18.
42. MacDonald SWS, Dixon RA, Cohen AL, Hazlitt JE. Biological age and 12-year
cognitive change in older adults: Findings from the Victoria Longitudinal Study. Gerontology.
2004;50:64-81.
43. Klemera P, Doubal S. A new approach to the concept and computation of biological age.
Mechanisms of ageing and development. 2006;127:240-248.
44. Mitnitski AB, Graham JE, Mogilner AJ, Rockwood K. Frailty, fitness and late-life
mortality in relation to chronological and biological age. BMC geriatrics. 2002;2:1.
45. Kulminski AM, Ukraintseva SV, Culminskaya IV, Arbeev KG, Land KC, Akushevich L,
et al. Cumulative Deficits and Physiological Indices as Predictors of Mortality and Long Life. J
Gerontol a-Biol. 2008;63:1053-1059.
46. Seplaki CL, Goldman N, Glei D, Weinstein M. A comparative analysis of measurement
approaches for physiological dysregulation in an older population. Experimental gerontology.
2005;40:438-449.
47. Butler RN, Sprott R, Warner H, Bland J, Feuers R, Forster M, et al. Biomarkers of aging:
From primitive organisms to humans. J Gerontol a-Biol. 2004;59:560-567.
48. Crimmins E, Vasunilashorn S, Kim JK, Alley D. Biomarkers Related to Aging in Human
Populations. Adv Clin Chem. 2008;46:161-216.
49. Cho IH, Park KS, Lim CJ. An empirical comparative study on biological age estimation
algorithms with an application of Work Ability Index (WAI). Mechanisms of ageing and
development. 2010;131:69-78.
50. Ingram DK. Key Questions in Developing Biomarkers of Aging. Experimental
gerontology. 1988;23:429-434.
51. Starr JM, Shiels PG, Harris SE, Pattie A, Pearce MS, Relton CL, et al. Oxidative stress,
telomere length and biomarkers of physical aging in a cohort aged 79 years from the 1932
Scottish Mental Survey. Mechanisms of ageing and development. 2008;129:745-751.
52. Balaban RS, Nemoto S, Finkel T. Mitochondria, oxidants, and aging. Cell. 2005;120:483-
495.
53. Krishnamurthy J, Torrice C, Ramsey MR, Kovalev GI, Al-Regaiey K, Su LS, et al.
Ink4a/Arf expression is a biomarker of aging. J Clin Invest. 2004;114:1299-1307.
54. Martin-Ruiz C, Jagger C, Kingston A, Collerton J, Catt M, Davies K, et al. Assessment
of a large panel of candidate biomarkers of ageing in the Newcastle 85+ study. Mechanisms of
ageing and development. 2011;132:496-502.
55. Kirkwood TB, Austad SN. Why do we age? Nature. 2000;408:233-238.
56. Masoro EJ. Caloric restriction and aging: an update. Experimental gerontology.
2000;35:299-305.
57. Kenyon C. A conserved regulatory system for aging. Cell. 2001;105:165-168.
58. Verbeke P, Fonager J, Clark BF, Rattan SI. Heat shock response and ageing: mechanisms
and applications. Cell biology international. 2001;25:845-857.
59. McEwen BS. Sex, stress and the hippocampus: allostasis, allostatic load and the aging
process. Neurobiology of aging. 2002;23:921-939.
60. Pham-Huy LA, He H, Pham-Huy C. Free radicals, antioxidants in disease and health.
International journal of biomedical science : IJBS. 2008;4:89-96.
61. Karasik D, Demissie S, Cupples LA, Kiel DP. Disentangling the genetic determinants of
human aging: biological age as an alternative to the use of survival measures. The journals of
gerontology Series A, Biological sciences and medical sciences. 2005;60:574-587.
62. Hayward MD, Crimmins EM, Miles TP, Yang Y. The significance of socioeconomic
status in explaining the racial gap in chronic health conditions. Am Sociol Rev. 2000;65:910-
930.
63. Finch CE, Tanzi RE. Genetics of aging. Science. 1997;278:407-411.
64. Franks P, Muennig P, Lubetkin E, Jia HM. The burden of disease associated with being
African-American in the United States and the contribution of socio-economic status. Social
science & medicine. 2006;62:2469-2478.
65. Williams DR, Collins C. Racial residential segregation: A fundamental cause of racial
disparities in health. Public Health Rep. 2001;116:404-416.
66. Acevedo-Garcia D, Osypuk TL, McArdle N, Williams DR. Toward a policy-relevant
analysis of geographic and racial/ethnic disparities in child health. Health affairs. 2008;27:321-
333.
67. Mayberry RM, Mili F, Ofili E. Racial and ethnic differences in access to medical care.
Medical care research and review : MCRR. 2000;57 Suppl 1:108-145.
68. United States. Department of Health and Human Services., United States. Agency for
Healthcare Research and Quality. 2005 national healthcare disparities report. Rockville, Md.:
U.S. Department of Health and Human Services, Agency for Healthcare Research and Quality;
2005.
69. Jackson JS, Knight KM, Rafferty JA. Race and Unhealthy Behaviors: Chronic Stress, the
HPA Axis, and Physical and Mental Health Disparities Over the Life Course. American journal
of public health. 2010;100:933-939.
70. McEwen BS. Protective and damaging effects of stress mediators. New Engl J Med.
1998;338:171-179.
71. Taylor SE, Repetti RL, Seeman T. Health psychology: What is an unhealthy environment
and how does it get under the skin? Annu Rev Psychol. 1997;48:411-447.
72. Krieger N, Rowley DL, Herman AA, Avery B, Phillips MT. Racism, Sexism, and Social-
Class - Implications for Studies of Health, Disease, and Well-Being. Am J Prev Med. 1993;9:82-
122.
73. Ellen IG, Mijanovich T, Dillman KN. Neighborhood effects on health: Exploring the
links and assessing the evidence. J Urban Aff. 2001;23:391-408.
74. Bell ML, Ebisu K. Environmental inequality in exposures to airborne particulate matter
components in the United States. Environmental health perspectives. 2012;120:1699-1704.
75. Flegal KM, Carroll MD, Kit BK, Ogden CL. Prevalence of Obesity and Trends in the
Distribution of Body Mass Index Among US Adults, 1999-2010. Jama-J Am Med Assoc.
2012;307:491-497.
76. Geronimus AT, Hicken M, Keene D, Bound J. "Weathering" and age patterns of
allostatic load scores among blacks and whites in the United States. American journal of public
health. 2006;96:826-833.
77. Everitt AV, Burgess JA. Hypothalamus, pituitary, and aging. Springfield, Ill.: Thomas;
1976.
78. Crimmins EM, Kim JK, Alley DE, Karlamangla A, Seeman T. Hispanic paradox in
biological risk profiles. American journal of public health. 2007;97:1305-1310.
79. Arias E. United States life tables, 2004. National vital statistics reports : from the Centers
for Disease Control and Prevention, National Center for Health Statistics, National Vital
Statistics System. 2007;56:1-39.
80. Levine RS, Foster JE, Fullilove RE, Fullilove MT, Briggs NC, Hull PC, et al. Black-
white inequalities in mortality and life expectancy, 1933-1999: Implications for healthy people
2010. Public Health Rep. 2001;116:474-483.
81. Bibbins-Domingo K, Pletcher MJ, Lin F, Vittinghoff E, Gardin JM, Arynchyn A, et al.
Racial Differences in Incident Heart Failure among Young Adults. New Engl J Med.
2009;360:1179-1190.
82. Johnson NE. The racial crossover in comorbidity, disability, and mortality. Demography.
2000;37:267-283.
83. Manton KG, Poss SS, Wing S. Black-White Mortality Crossover - Investigation from the
Perspective of the Components of Aging. The Gerontologist. 1979;19:291-300.
84. Vaupel JW, Yashin AI. Heterogeneity Ruses - Some Surprising Effects of Selection on
Population-Dynamics. Am Stat. 1985;39:176-185.
85. Crimmins EM, Kim JK, Seeman TE. Poverty and Biological Risk: The Earlier "Aging"
of the Poor. J Gerontol a-Biol. 2009;64:286-292.
86. Karlamangla AS, Merkin SS, Crimmins EM, Seeman TE. Socioeconomic and Ethnic
Disparities in Cardiovascular Risk In the United States, 2001-2006. Annals of epidemiology.
2010;20:617-628.
87. Merkin SS, Basurto-Davila R, Karlamangla A, Bird CE, Lurie N, Escarce J, et al.
Neighborhoods and Cumulative Biological Risk Profiles by Race/Ethnicity in a National Sample
of US Adults: NHANES III. Annals of epidemiology. 2009;19:194-201.
88. Duru OK, Harawa NT, Kermah D, Norris KC. Allostatic Load Burden and Racial
Disparities in Mortality. J Natl Med Assoc. 2012;104:89-95.
89. Seeman TE, McEwen BS, Rowe JW, Singer BH. Allostatic load as a marker of
cumulative biological risk: MacArthur studies of successful aging. Proceedings of the National
Academy of Sciences of the United States of America. 2001;98:4770-4775.
90. Merton RK. The Matthew Effect in Science: The reward and communication systems of
science are considered. Science. 1968;159:56-63.
91. Oeppen J, Vaupel JW. Demography - Broken limits to life expectancy. Science.
2002;296:1029-1031.
92. Vaupel JW. Biodemography of human ageing. Nature. 2010;464:536-542.
93. Rosen M, Haglund B. From healthy survivors to sick survivors--implications for the
twenty-first century. Scandinavian journal of public health. 2005;33:151-155.
94. Crimmins EM, Beltran-Sanchez H. Mortality and Morbidity Trends: Is There
Compression of Morbidity? J Gerontol B-Psychol. 2011;66:75-86.
95. Mooradian AD. Biomarkers of Aging - Do We Know What to Look For. J Gerontol.
1990;45:B183-B186.
96. Finch C, Kirkwood TBL. Chance, development, and aging. New York: Oxford
University Press; 2000.
97. Hardy GH. Mendelian proportions in a mixed population. 1908. Yale J Biol Med.
2003;76:79-80.
98. Wang HD, Preston SH. Forecasting United States mortality using cohort smoking
histories. Proceedings of the National Academy of Sciences of the United States of America.
2009;106:393-398.
99. Finucane MM, Stevens GA, Cowan MJ, Danaei G, Lin JK, Paciorek CJ, et al. National,
regional, and global trends in body-mass index since 1980: systematic analysis of health
examination surveys and epidemiological studies with 960 country-years and 9.1 million
participants. Lancet. 2011;377:557-567.
100. Preston SH, Wang HD. Sex mortality differences in the United States: The role of cohort
smoking patterns. Demography. 2006;43:631-646.
101. Crimmins EM, Preston SH, Cohen B, National Research Council (U.S.). Panel on
Understanding Divergent Trends in Longevity in High-Income Countries. Explaining divergent
levels of longevity in high-income countries. Washington, D.C.: National Academies Press;
2011.
102. Wang Y, Beydoun MA. The obesity epidemic in the United States - Gender, age,
socioeconomic, Racial/Ethnic, and geographic characteristics: A systematic review and meta-
regression analysis. Epidemiol Rev. 2007;29:6-28.
103. Hajjar I, Kotchen TA. Trends in prevalence, awareness, treatment, and control of
hypertension in the United States, 1988-2000. Jama-J Am Med Assoc. 2003;290:199-206.
104. Cohen JD, Cziraky MJ, Cai QA, Wallace A, Wasser T, Crouse JR, et al. 30-Year Trends
in Serum Lipids Among United States Adults: Results from the National Health and Nutrition
Examination Surveys II, III, and 1999-2006. Am J Cardiol. 2010;106:969-975.
105. Psaty BM, Manolio TA, Smith NL, Heckbert SR, Gottdiener JS, Burke GL, et al. Time
trends in high blood pressure control and the use of antihypertensive medications in older adults -
The cardiovascular health study. Archives of internal medicine. 2002;162:2325-2332.
106. Goetzel RZ. Do Prevention Or Treatment Services Save Money? The Wrong Debate.
Health affairs. 2009;28:37-41.
107. Phoenix C, de Grey ADNJ. A model of aging as accumulated damage matches observed
mortality patterns and predicts the life-extending effects of prospective interventions. Age.
2007;29:133-189.
108. Crimmins EM, Finch CE. Infection, inflammation, height, and longevity. Proceedings of
the National Academy of Sciences of the United States of America. 2006;103:498-503.
109. Finch CE, Crimmins EM. Inflammatory exposure and historical changes in human life-
spans. Science. 2004;305:1736-1739.
110. Hayward MD, Gorman BK. The long arm of childhood: The influence of early-life social
conditions on men's mortality. Demography. 2004;41:87-107.
111. Heckman JJ. Skill formation and the economics of investing in disadvantaged children.
Science. 2006;312:1900-1902.
112. Singh GK, Yu SM. US childhood mortality, 1950 through 1993: Trends and
socioeconomic differentials. American journal of public health. 1996;86:505-512.
113. Mortality statistics, subports. Public Health Rep. 1906;21:880-880.
114. Guyer B, Hoyert DL, Martin JA, Ventura SJ, MacDorman MF, Strobino DM. Annual
summary of vital statistics - 1998. Pediatrics. 1999;104:1229-1246.
115. Healthier mothers and babies - 1900-1999 (Reprinted from MMWR, vol 48, pg 849-857,
1999). Jama-J Am Med Assoc. 1999;282:1807-1810.
116. Finch CE. Evolution of the human lifespan and diseases of aging: Roles of infection,
inflammation, and nutrition. Proceedings of the National Academy of Sciences of the United
States of America. 2010;107:1718-1724.
117. Lee HC, Wei YH. Mitochondria and Aging. Adv Exp Med Biol. 2012;942:311-327.
118. Del Turco S, Basta G. An update on advanced glycation endproducts and atherosclerosis.
Biofactors. 2012;38:266-274.
119. Valavanidis A, Vlachogianni T, Fiotakis K. Tobacco Smoke: Involvement of Reactive
Oxygen Species and Stable Free Radicals in Mechanisms of Oxidative Damage, Carcinogenesis
and Synergistic Effects with Other Respirable Particles. Int J Env Res Pub He. 2009;6:445-462.
120. Nicholl ID, Bucala R. Advanced glycation endproducts and cigarette smoking. Cellular
and molecular biology. 1998;44:1025-1033.
121. Valdes AM, Andrew T, Gardner JP, Kimura M, Oelsner E, Cherkas LF, et al. Obesity,
cigarette smoking, and telomere length in women. Lancet. 2005;366:662-664.
122. Csiszar A, Podlutsky A, Wolin MS, Losonczy G, Pacher P, Ungvari Z. Oxidative stress
and accelerated vascular aging: implications for cigarette smoking. Front Biosci-Landmrk.
2009;14:3128-3144.
123. Finch CE, Morgan TE, Longo VD, de Magalhaes JP. Cell resilience in species life spans:
a link to inflammation? Aging cell. 2010;9:519-526.
124. Barbieri M, Rizzo MR, Manzella D, Grella R, Ragno E, Carbonella M, et al. Glucose
regulation and oxidative stress in healthy centenarians. Experimental gerontology. 2003;38:137-
143.
125. Franceschi C, Bonafe M, Valensin S, Olivieri F, De Luca M, Ottaviani E, et al. Inflamm-
aging. An evolutionary perspective on immunosenescence. Annals of the New York Academy of
Sciences. 2000;908:244-254.
126. Franceschi C, Olivieri F, Marchegiani F, Cardelli M, Cavallone L, Capri M, et al. Genes
involved in immune response/inflammation, IGF1/insulin pathway and response to oxidative
stress play a major role in the genetics of human longevity: the lesson of centenarians.
Mechanisms of ageing and development. 2005;126:351-361.
127. Barzilai N, Gabriely I, Gabriely M, Iankowitz N, Sorkin JD. Offspring of centenarians
have a favorable lipid profile. Journal of the American Geriatrics Society. 2001;49:76-79.
128. Barter PJ, Nicholls S, Rye KA, Anantharamaiah GM, Navab M, Fogelman AM.
Antiinflammatory properties of HDL. Circ Res. 2004;95:764-772.
129. Rahilly-Tierney C, Sesso HD, Gaziano JM, Djousse L. High-Density Lipoprotein and
Mortality Before Age 90 in Male Physicians. Circ-Cardiovasc Qual. 2012;5:381-386.
130. Garrison RJ, Kannel WB, Feinleib M, Castelli WP, McNamara PM, Padgett SJ. Cigarette
smoking and HDL cholesterol: the Framingham offspring study. Atherosclerosis. 1978;30:17-25.
131. Yashin AI, Manton KG, Vaupel JW. Mortality and Aging in a Heterogeneous Population
- a Stochastic-Process Model with Observed and Unobserved Variables. Theor Popul Biol.
1985;27:154-175.
132. Pirie K, Peto R, Reeves GK, Green J, Beral V, Collaborators MWS. The 21st century
hazards of smoking and benefits of stopping: a prospective study of one million women in the
UK. Lancet. 2013;381:133-141.
133. Santos S, Rooke TW, Bailey KR, McConnell JP, Kullo IJ. Relation of markers of
inflammation (C-reactive protein, white blood cell count, and lipoprotein-associated
phospholipase A2) to the ankle-brachial index. Vasc Med. 2004;9:171-176.
134. Gan WQ, Man SF, Sin DD. The interactions between cigarette smoking and reduced lung
function on systemic inflammation. Chest. 2005;127:558-564.
135. Watson J, Round A, Hamilton W. Raised inflammatory markers. Bmj. 2012;344:e454.
136. Flouris AD, Poulianiti KP, Chorti MS, Jamurtas AZ, Kouretas D, Owolabi EO, et al.
Acute effects of electronic and tobacco cigarette smoking on complete blood count. Food and
chemical toxicology : an international journal published for the British Industrial Biological
Research Association. 2012;50:3600-3603.
137. Smith MR, Kinmonth AL, Luben RN, Bingham S, Day NE, Wareham NJ, et al. Smoking
status and differential white cell count in men and women in the EPIC-Norfolk population.
Atherosclerosis. 2003;169:331-337.
138. Conen D. Inflammation, blood pressure and cardiovascular disease: heading east. Journal
of human hypertension. 2013;27:71.
139. Lu H, Ouyang W, Huang C. Inflammation, a key event in cancer development. Molecular
cancer research : MCR. 2006;4:221-233.
140. King GL. The role of inflammatory cytokines in diabetes and its complications. Journal
of periodontology. 2008;79:1527-1534.
141. Brown GC, Neher JJ. Inflammatory neurodegeneration and mechanisms of microglial
killing of neurons. Molecular neurobiology. 2010;41:242-247.
142. O'Donnell R, Breen D, Wilson S, Djukanovic R. Inflammatory cells in the airways in
COPD. Thorax. 2006;61:448-454.
143. Erhardt L. Cigarette smoking: an undertreated risk factor for cardiovascular disease.
Atherosclerosis. 2009;205:23-32.
144. Anthonisen NR, Connett JE, Kiley JP, Altose MD, Bailey WC, Buist AS, et al. Effects of
Smoking Intervention and the Use of an Inhaled Anticholinergic Bronchodilator on the Rate of
Decline of Fev(1) - the Lung Health Study. Jama-J Am Med Assoc. 1994;272:1497-1505.
145. McGrowder D, Riley C, Morrison EY, Gordon L. The role of high-density lipoproteins in
reducing the risk of vascular diseases, neurogenerative disorders, and cancer. Cholesterol.
2011;2011:496925.
146. Barter PJ. High density lipoprotein: a therapeutic target in type 2 diabetes. Endocrinology
and metabolism. 2013;28:169-177.
147. Walter M. Interrelationships Among HDL Metabolism, Aging, and Atherosclerosis.
Arterioscl Throm Vas. 2009;29:1244-1250.
148. Fragoso CAV, Enright PL, McAvay G, Van Ness PH, Gill TM. Frailty and Respiratory
Impairment in Older Persons. Am J Med. 2012;125:79-86.
149. Hubbard RE, O'Mahony MS, Woodhouse KW. Characterising frailty in the clinical
setting - a comparison of different approaches. Age Ageing. 2009;38:115-119.
150. El-Gohary A, Alshamrani A, Al-Otaibi AN. The generalized Gompertz distribution. Appl
Math Model. 2013;37:13-24.
151. Vasto S, Candore G, Balistreri CR, Caruso M, Colonna-Romano G, Grimaldi MP, et al.
Inflammatory networks in ageing, age-related diseases and longevity. Mechanisms of ageing and
development. 2007;128:83-91.
152. Balistreri CR, Candore G, Colonna-Romano G, Lio D, Caruso M, Hoffmann E, et al.
Role of Toll-like receptor 4 in acute myocardial infarction and longevity. Jama-J Am Med
Assoc. 2004;292:2339-2340.
153. Papafili A, Hill MR, Brull DJ, McAnulty RJ, Marshall RP, Humphries SE, et al.
Common promoter variant in cyclooxygenase-2 represses gene expression - Evidence of role in
acute-phase inflammatory response. Arterioscl Throm Vas. 2002;22:1631-1636.
154. Puri BK, Treasaden IH, Cocchi M, Tsaluchidu S, Tonello L, Ross BM. A comparison of
oxidative stress in smokers and non-smokers: an in vivo human quantitative study of n-3 lipid
peroxidation. Bmc Psychiatry. 2008;8.
155. Dalle-Donne I, Rossi R, Colombo R, Giustarini D, Milzani A. Biomarkers of oxidative
damage in human disease. Clin Chem. 2006;52:601-623.
156. Rahman I, Marwick J, Kirkham P. Redox modulation of chromatin remodeling: impact
on histone acetylation and deacetylation, NF-kappaB and pro-inflammatory gene expression.
Biochem Pharmacol. 2004;68:1255-1267.
157. Rahman I. Oxidative stress, chromatin remodeling and gene transcription in inflammation
and chronic lung diseases. J Biochem Mol Biol. 2003;36:95-109.
158. Vermeulen CJ, Loeschcke V. Longevity and the stress response in Drosophila.
Experimental gerontology. 2007;42:153-159.
159. Vanfleteren JR. Oxidative Stress and Aging in Caenorhabditis-Elegans. Biochem J.
1993;292:605-608.
160. Kurz CL, Tan MW. Regulation of aging and innate immunity in C-elegans. Aging cell.
2004;3:185-193.
161. Brown-Borg HM. Hormonal regulation of longevity in mammals. Ageing Res Rev.
2007;6:28-45.
162. Herskind AM, McGue M, Holm NV, Sorensen TI, Harvald B, Vaupel JW. The
heritability of human longevity: a population-based study of 2872 Danish twin pairs born 1870-
1900. Human genetics. 1996;97:319-323.
163. Lithgow GJ, Andersen JK. The real Dorian Gray mouse. BioEssays : news and reviews in
molecular, cellular and developmental biology. 2000;22:410-413.
164. Kenyon C. The plasticity of aging: insights from long-lived mutants. Cell. 2005;120:449-
460.
165. Lin YJ, Seroude L, Benzer S. Extended life-span and stress resistance in the Drosophila
mutant methuselah. Science. 1998;282:943-946.
166. Harman D. The aging process: major risk factor for disease and death. Proceedings of the
National Academy of Sciences of the United States of America. 1991;88:5360-5363.
167. Niccoli T, Partridge L. Ageing as a Risk Factor for Disease. Current Biology.
2012;22:R741-R752.
168. Goldman DP, Cutler D, Rowe JW, Michaud PC, Sullivan J, Peneva D, et al. Substantial
Health And Economic Returns From Delayed Aging May Warrant A New Focus For Medical
Research. Health affairs. 2013;32:1698-1705.
169. Johnson TE. Increased Life-Span of Age-1 Mutants in Caenorhabditis-Elegans and
Lower Gompertz Rate of Aging. Science. 1990;249:908-912.
170. Kenyon C, Chang J, Gensch E, Rudner A, Tabtiang R. A C-Elegans Mutant That Lives
Twice as Long as Wild-Type. Nature. 1993;366:461-464.
171. Gems D, Sutton AJ, Sundermeyer ML, Albert PS, King KV, Edgley ML, et al. Two
pleiotropic classes of daf-2 mutation affect larval arrest, adult behavior, reproduction and
longevity in Caenorhabditis elegans. Genetics. 1998;150:129-155.
172. Herskind AM, McGue M, Iachine IA, Holm N, Sorensen TI, Harvald B, et al. Untangling
genetic influences on smoking, body mass index and longevity: a multivariate study of 2464
Danish twins followed for 28 years. Human genetics. 1996;98:467-475.
173. Hjelmborg JvB, Iachine I, Skytthe A, Vaupel JW, McGue M, Koskenvuo M, et al. Genetic
influence on human lifespan and longevity. Human genetics. 2006;119:312-321.
174. Johnson TE, Henderson S, Murakami S, de Castro E, de Castro SH, Cypser J, et al.
Longevity genes in the nematode Caenorhabditis elegans also mediate increased resistance to
stress and prevent disease. J Inherit Metab Dis. 2002;25:197-206.
175. Gems D, Partridge L. Genetics of Longevity in Model Organisms: Debates and Paradigm
Shifts. Annu Rev Physiol. 2013;75:621-644.
176. Vijg J, Suh Y. Genetics of longevity and aging. Annu Rev Med. 2005;56:193-212.
177. Rajpathak SN, Liu YH, Ben-David O, Reddy S, Atzmon G, Crandall J, et al. Lifestyle
Factors of People with Exceptional Longevity. Journal of the American Geriatrics Society.
2011;59:1509-1512.
178. Cancer Prevention Study II. The American Cancer Society Prospective Study. Statistical
bulletin. 1992;73:21-29.
179. Preston SH, Stokes A, Mehta NK, Cao BC. Projecting the Effect of Changes in Smoking
and Obesity on Future Life Expectancy in the United States. Demography. 2014;51:27-49.
180. Longo VD, Finch CE. Evolutionary medicine: from dwarf model systems to healthy
centenarians? Science. 2003;299:1342-1346.
181. Sonnega A, Faul JD, Ofstedal MB, Langa KM, Phillips JW, Weir DR. Cohort Profile: the
Health and Retirement Study (HRS). Int J Epidemiol. 2014;43:576-585.
182. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS genetics.
2006;2:e190.
183. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK:
A tool set for whole-genome association and population-based linkage analyses. Am J Hum
Genet. 2007;81:559-575.
184. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: A
software environment for integrated models of biomolecular interaction networks. Genome Res.
2003;13:2498-2504.
185. Zhang G, Karns R, Sun GY, Indugula SR, Cheng H, Havas-Augustin D, et al. Finding
Missing Heritability in Less Significant Loci and Allelic Heterogeneity: Genetic Variation in
Human Height. PloS one. 2012;7.
186. Allen HL, Estrada K, Lettre G, Berndt SI, Weedon MN, Rivadeneira F, et al. Hundreds
of variants clustered in genomic loci and biological pathways affect human height. Nature.
2010;467:832-838.
187. Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease
from genome-wide association studies. Genome Res. 2007;17:1520-1528.
188. Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk of complex
disease. Curr Opin Genet Dev. 2008;18:257-263.
189. Soares JP, Cortinhas A, Bento T, Leitao JC, Collins AR, Gaivao I, et al. Aging and DNA
damage in humans: a meta-analysis study. Aging (Albany NY). 2014;6:432-439.
190. Negrini S, Gorgoulis VG, Halazonetis TD. Genomic instability--an evolving hallmark of
cancer. Nature reviews Molecular cell biology. 2010;11:220-228.
191. Shields JM, Pruitt K, McFall A, Shaub A, Der CJ. Understanding Ras: 'it ain't over 'til it's
over'. Trends in cell biology. 2000;10:147-154.
192. Largaespada DA. A bad rap: Rap1 signaling and oncogenesis. Cancer cell. 2003;4:3-4.
193. Wang Z, Ahmad A, Li Y, Kong D, Azmi AS, Banerjee S, et al. Emerging roles of PDGF-
D signaling pathway in tumor development and progression. Biochimica et biophysica acta.
2010;1806:122-130.
194. Mercken EM, Crosby SD, Lamming DW, JeBailey L, Krzysik-Walker S, Villareal DT, et
al. Calorie restriction in humans inhibits the PI3K/AKT pathway and induces a younger
transcription profile. Aging cell. 2013;12:645-651.
195. Suh Y, Atzmon G, Cho MO, Hwang D, Liu B, Leahy DJ, et al. Functionally significant
insulin-like growth factor I receptor mutations in centenarians. Proceedings of the National
Academy of Sciences of the United States of America. 2008;105:3438-3442.
196. Kwon ES, Narasimhan D, Yen K, Tissenbaum HA. A new DAF-16 isoform regulates
longevity. Nature. 2010;466:498-502.
197. Tazearslan C, Cho M, Suh Y. Discovery of Functional Gene Variants Associated With
Human Longevity: Opportunities and Challenges. J Gerontol a-Biol. 2012;67:376-383.
198. Willcox BJ, Donlon TA, He Q, Chen R, Grove JS, Yano K, et al. FOXO3A genotype is
strongly associated with human longevity. Proceedings of the National Academy of Sciences of
the United States of America. 2008;105:13987-13992.
199. Flachsbart F, Caliebe A, Kleindorp R, Blanche H, von Eller-Eberstein H, Nikolaus S, et
al. Association of FOXO3A variation with human longevity confirmed in German centenarians.
Proceedings of the National Academy of Sciences of the United States of America.
2009;106:2700-2705.
200. Yashin AI, Wu DQ, Arbeev KG, Ukraintseva SV. Joint influence of small-effect genetic
variants on human longevity. Aging-Us. 2010;2:612-620.
201. Dudbridge F. Power and Predictive Accuracy of Polygenic Risk Scores. PLoS genetics.
2013;9.
202. Peterson RE, Maes HH, Holmans P, Sanders AR, Levinson DF, Shi JX, et al. Genetic
risk sum score comprised of common polygenic variation is associated with body mass index.
Human genetics. 2011;129:221-230.
203. Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. Network analysis of
GWAS data. Curr Opin Genet Dev. 2013;23:602-610.
204. Chesler EJ, Lu L, Shou SM, Qu YH, Gu J, Wang JT, et al. Complex trait analysis of gene
expression uncovers polygenic and pleiotropic networks that modulate nervous system function.
Nature genetics. 2005;37:233-242.
205. Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, et al. Analysis of
oncogenic signaling networks in glioblastoma identifies ASPM as a molecular target.
Proceedings of the National Academy of Sciences of the United States of America.
2006;103:17402-17407.
206. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network
analysis. Bmc Bioinformatics. 2008;9.
207. Vonbertalanffy L. The Theory of Open Systems in Physics and Biology. Science.
1950;111:23-29.
Appendix A
Table S6.1: Enriched Pathways in the Network of 215 Genes
Pathway | Enrichment | FDR | Genes
PI3K-Akt signaling pathway | 0.0349 | <1.000e-03 | OSMR,LPAR1,GNG4,RXRA,RPTOR,PPP2R5E,EFNA5,LAMA1,SOS2,COL6A2,THBS2,IL7,CD19,FGF14,FGF12,FOXO3,PDGFC,COL24A1,FGFR1,TEK,ITGA1,ITGA4,COL4A2,COL5A3,COL5A1,JAK1
Pathways in cancer | 0.033 | <5.000e-04 | RXRA,MECOM,CTNNA3,CTNNA2,DVL2,SMAD3,LAMA1,CBLB,ETS1,MITF,ZBTB16,SOS2,STAT1,FGF14,FGF12,FGFR1,RUNX1,EPAS1,PLCG1,PLCG2,LEF1,DCC,TPM3,COL4A2,JAK1
Signaling by PDGF | 0.0177 | <3.333e-04 | ADCY2,ADCY8,PDE1B,COL6A2,THBS2,RASA1,STAT1,PLG,ITPR2,CD19,ADCY9,FOXO3,PDGFC,FGFR1,PLCG1,NRG3,NRG1,COL4A2
Glutamatergic synapse | 0.0117 | <2.500e-04 | ADCY2,ADCY8,GRIN2B,GNG4,GRIN2A,HOMER2,PLA2G4C,GRM7,DLGAP1,GRIA4,ITPR2,ADCY9,GRIK4,CACNA1C
Ras signaling pathway | 0.0229 | <2.000e-04 | GRIN2B,GNG4,GRIN2A,EFNA5,ETS1,PLA2G4C,SOS2,RASA1,FGF14,FGF12,PDGFC,FGFR1,MRAS,TEK,PLCG1,PLCG2,RAP1A,PAK2,PAK1
Rap1 signaling pathway | 0.0215 | <1.667e-04 | ADCY2,ADCY8,LPAR1,GRIN2B,MAGI1,GRIN2A,EFNA5,APBB1IP,ADCY9,FGF14,FGF12,PDGFC,FGFR1,MRAS,TEK,PLCG1,RAP1A,ARAP2
L1CAM interactions | 0.0095 | <1.429e-04 | LAMA1,NRP2,DNM3,ANK2,ANK3,SPTB,FGFR1,ITGA1,SPTBN1,PAK1,SCN7A,NFASC
Focal adhesion | 0.0208 | <1.250e-04 | LAMA1,PARVB,SOS2,COL6A2,THBS2,PDGFC,COL24A1,DOCK1,ITGA1,ITGA4,RAP1A,MYLK,PAK2,PAK1,COL4A2,COL5A3,COL5A1
Netrin-1 signaling | 0.0042 | 0.0001111 | PITPNA,TRIO,SLIT2,ROBO1,DOCK1,PLCG1,DCC,ABLIM2
Netrin-mediated signaling events | 0.0032 | 0.0002 | PITPNA,TRIO,DOCK1,PLCG1,PAK1,DCC,MAP1B
ECM-receptor interaction | 0.0087 | 0.0005455 | LAMA1,COL6A2,THBS2,COL24A1,ITGA1,ITGA4,SV2C,COL4A2,COL5A3,COL5A1
Dilated cardiomyopathy | 0.0091 | 0.001083 | ADCY2,ADCY8,CACNB2,ADCY9,CACNA2D1,ITGA1,ITGA4,CACNA2D3,CACNA1C,TPM3
Signalling by NGF | 0.0281 | 0.001 | ADCY2,ADCY8,ABR,TRIO,MCF2L,ARHGEF3,PDE1B,SOS2,DNM3,ITPR2,CD19,ADCY9,FOXO3,FGFR1,PLCG1,RAP1A,NRG3,NRG1
Arrhythmogenic right ventricular cardiomyopathy (ARVC) | 0.0075 | 0.001071 | CTNNA3,CTNNA2,CACNB2,CACNA2D1,ITGA1,ITGA4,CACNA2D3,CACNA1C,LEF1
DAG and IP3 signaling | 0.0031 | 0.0016 | ADCY2,ADCY8,PDE1B,ITPR2,ADCY9,PLCG1
Signaling by Robo receptor | 0.0032 | 0.001688 | SLIT2,SOS2,ROBO1,CLASP1,PAK2,PAK1
Integration of energy metabolism | 0.0106 | 0.002059 | ADCY2,ADCY8,GNG4,CACNB2,ITPR2,ADCY9,ABCC8,RAP1A,CACNA1C,PRKAG2
Axon guidance | 0.0128 | 0.001944 | EFNA5,EPHA3,SLIT2,RASA1,PLXNA1,PLXNA2,ROBO1,PAK2,PAK1,DCC,ABLIM2
Calcium signaling pathway | 0.0183 | 0.002789 | ADCY2,ADCY8,GRIN2A,PDE1B,PTGFR,ITPR2,ADCY9,ITPKC,PLCG1,PLCG2,ADRA1A,CACNA1C,MYLK
Regulation of RAC1 activity | 0.0038 | 0.0042 | ABR,TRIO,CHN2,DOCK2,DOCK1,PREX1
Table S6.2: SNPs and Weights used to Generate the Polygenic Risk Score
Gene SNP A1 Weight P-value
ABCC8 rs916827 A 0.5463858 0.001228
ABLIM2 rs75071016 A 0.7701083 0.001315
ABR rs76616688 A 0.7889118 0.004876
ADAR rs6699825 G -0.5161732 0.002392
ADCY2 rs1428526 A 0.7109871 0.0001315
ADCY8 rs12334868 G -0.7001718 0.001637
ADCY9 rs2601805 A 0.4983481 0.00263
ADRA1A rs1390512 A 0.5205779 0.004957
ALK rs12618086 A 0.4824262 0.003642
ALMS1 rs2037814 A 0.5805382 0.004549
ANK2 rs35784458 G 0.6956441 0.004281
ANK3 rs1934750 A -0.47965 0.003573
APBB1IP rs1775233 A 0.5013812 0.002042
ARAP2 rs1397422 A 0.5411609 0.00164
ARHGAP21 rs71493393 A -0.7526849 0.001206
ARHGEF3 rs76601653 G 0.7471618 0.003314
ARNTL rs3789327 G 0.4762342 0.00359
ASB7 rs59065565 G 0.4643627 0.004336
B3GAT1 rs10894807 G 0.5312163 0.0008023
B4GALT7 rs2046511 C -0.5862678 0.0006609
BMP1 rs7830627 A 0.7100043 0.003127
BMP6 rs9505270 A 0.5596158 0.001031
BMPR2 rs4032752 A 0.6070445 0.002596
BRD7 rs4595801 A 0.5993858 0.002783
CAB39L rs73186871 G 0.8342126 0.001461
CACNA1C rs7297582 A -0.5332416 0.001999
CACNA2D1 rs6942458 G 0.5900064 0.001209
CACNA2D3 rs6784395 G -0.5539071 0.002847
CACNB2 rs74120235 C 0.7687184 0.0006153
CBFA2T3 rs12923300 G 0.4567917 0.004916
CBLB rs35581896 G -0.8196178 0.002336
CCAR1 rs17462632 A 0.7668623 0.002738
CD19 rs2070962 A 0.5068176 0.00117
CDH13 rs74031991 A 0.6688545 0.002184
CDH23 rs1227051 G 0.5223588 0.003724
CDH4 rs2427147 A -0.5109923 0.001825
CHN2 rs3793252 A 0.5822156 0.001816
CHRD rs16858780 C -0.9752449 0.0004284
CLASP1 rs12613842 A 0.8078142 0.002705
CNTN4 rs9830036 G -0.5898688 0.0006997
COL13A1 rs2894303 G -0.6485562 0.001815
COL24A1 rs12145024 A 0.8510053 0.0002968
COL4A2 rs199927500 G -1.200313 0.004466
COL5A1 rs3124936 A -0.9046092 0.001159
COL5A3 rs746052 A 0.5187938 0.001026
COL6A2 rs4458293 G 0.5211719 0.00158
CRHR1 rs1880756 A -0.5875267 0.001631
CRMP1 rs17748512 G 0.7929925 0.002814
CTNNA2 rs4331529 A 0.678541 0.0002783
CTNNA3 rs10733830 A -0.5851901 0.002603
DAAM1 rs1958180 G 0.614104 0.0008917
DAB1 rs197110 A 0.5353231 0.001574
DBNL rs34863001 A 0.6621724 0.003155
DCC rs17681615 A -0.5850105 0.002178
DLC1 rs111309706 A 0.9122827 0.001136
DLGAP1 rs8092707 A 0.6544064 0.001455
DNM3 rs6673646 A 0.552735 0.0007193
DOCK1 rs11816982 C 0.7193021 0.0003029
DOCK2 rs10063533 A 0.506215 0.001776
DSCAM rs2837467 A 0.4675005 0.002729
DVL2 rs72839770 A -0.5234045 0.002902
ECE1 rs3026820 T 0.5816568 0.000329
EEF1D rs28606985 A 0.5019866 0.002689
EFNA5 rs1835111 G 0.5176026 0.004098
EIF3H rs67991519 A 0.5515836 0.00316
EPAS1 rs1992846 A 0.5423243 0.001762
EPHA3 rs7374904 G -0.9784321 0.003678
ETS1 rs7125213 A -0.9233154 0.001827
FAM13A rs2085600 A 0.5359084 0.003259
FDFT1 rs1736057 G -0.5335826 0.002336
FGF12 rs9859577 A 0.6344582 0.0008385
FGF14 rs9518604 G -1.137873 0.002218
FGFR1 rs2280846 A 0.632335 0.004997
FOXO3 rs12203834 A 0.585562 0.004585
FPR3 rs73058873 G 0.8441501 0.001909
FRAS1 rs391275 G 0.5423243 0.002314
GAD2 rs11015008 A -0.7356372 0.002038
GARNL3 rs3934537 G 0.7504722 0.0000898
GCK rs758988 A -0.8509713 0.002343
GNG4 rs488618 G -0.4866209 0.003497
GPC5 rs4773668 G 0.4787155 0.003474
GPC6 rs61963748 G 0.6162661 0.0006279
GRIA4 rs1562221 G 0.9242589 0.0006274
GRID2 rs6848888 A 0.6075892 0.0002238
GRIK4 rs11218014 A 0.7797835 0.004885
GRIN2A rs9938172 A -0.5454171 0.002407
GRIN2B rs888149 A -0.6727566 0.003761
GRM7 rs3804935 G 0.8368145 0.003582
GXYLT2 rs62249904 A 0.4517123 0.004571
HDAC4 rs10211599 A 0.686626 0.001039
HELLS rs12248434 G -0.7695963 0.004831
HIVEP1 rs9885932 G 0.6146449 0.0006581
HK2 rs656489 A 0.5241367 0.001448
HLA-DQB1 rs2854272 G -0.5474896 0.001421
HLA-DRB1 rs3830135 A 0.8565407 0.0001031
HOMER2 rs17360083 A 0.4631049 0.004235
HS3ST2 rs1054028 G -0.8347108 0.00452
IL7 rs2583763 A 0.6544064 0.003589
IRF2 rs6816525 A 0.544067 0.001612
ITGA1 rs2454582 A 0.5128236 0.002267
ITGA4 rs3770129 A 0.7256144 0.001525
ITGBL1 rs9557678 G 0.5241367 0.004871
ITPKC rs890934 A -0.74423 0.0000735
ITPR2 rs12228503 A -0.5060039 0.004949
JAK1 rs59566971 A 0.6307397 0.001439
JAM2 rs73161749 G 0.7100043 0.003099
KLC1 rs201203389 A 0.4995624 0.002114
LAMA1 rs12454984 G 0.5794184 0.0008898
LEF1 rs1291490 A 0.6423801 0.0007475
LIMD1 rs2578693 A -0.6632001 0.0006293
LPAR1 rs62570358 G 0.5911145 0.004027
LRSAM1 rs2243767 A 0.6518042 0.0000563
LYZ rs1800973 A 0.8224195 0.00313
MAGI1 rs17370143 C 0.7575294 0.00473
MAML2 rs74383109 G 0.7006192 0.001164
MAP1B rs112222884 A -1.234432 0.003754
MATN2 rs2444895 G 0.4768551 0.004809
MC1R rs1805005 A 0.6360476 0.002374
MCF2L rs10665 G 0.7193021 0.001284
MECOM rs73172059 G 0.7409846 0.001616
MED27 rs7848165 A -0.7747912 0.004142
MITF rs12639523 A 0.5128236 0.002846
MRAS rs1199334 A 0.5544597 0.002604
MYBPC1 rs2958098 A -0.5383681 0.001304
MYLK rs820371 A -0.6923475 0.002467
NCOA2 rs71523172 G 0.6507615 0.003894
NCOR2 rs11057590 G 0.7691818 0.002537
NELL1 rs4923452 G 0.6333972 0.0001182
NFASC rs2802847 G 0.4780958 0.002649
NLRC5 rs35756276 A 0.9210793 0.001602
NRG1 rs4733342 G -1.087376 0.0007454
NRG3 rs74146712 G 0.6097656 0.004807
NRP2 rs849574 A 0.7385984 0.001773
OSMR rs525735 A -0.7700282 0.001884
PAK1 rs116884829 G 0.9873082 0.000459
PAK2 rs6774504 A -0.4913498 0.003728
PARK2 rs7745686 G 0.5469647 0.001118
PARVB rs1007863 G 0.693647 0.0000243
PCBD1 rs35541340 A 0.5158131 0.003666
PDE1B rs3782403 G 0.4965239 0.002244
PDGFC rs72683373 A 0.4731238 0.00455
PDS5A rs17619024 G 0.585562 0.002862
PIK3C2G rs10743273 G -0.6249279 0.001089
PIP5K1A rs4971013 A 0.5579 0.001357
PITPNA rs9890892 G -0.5400828 0.003971
PLA2G4C rs1366442 C -0.4825621 0.00449
PLCG1 rs6124323 A -0.4550756 0.0048
PLCG2 rs7195470 G -0.5566951 0.002795
PLG rs4252170 G 0.7542422 0.002562
PLXNA1 rs79072379 A 0.9392257 0.001175
PLXNA2 rs11118974 A 0.6673164 0.004938
POLR2J rs1131384 A 0.8333439 0.001639
POU2F2 rs4802131 G 0.8721298 0.001208
PPP2R5E rs10144281 G 0.8687798 0.0005325
PREX1 rs55979552 A 0.7691818 0.001325
PRKAG2 rs12703165 A -0.7637842 0.003236
PSMD4 rs55877187 A 0.5475432 0.001537
PTGFR rs3766355 A 0.662688 0.001647
PTPRE rs61873692 A 0.8011043 0.001953
PTPRG rs9837811 A 0.5788581 0.001548
PTPRK rs9482850 A 0.585562 0.004577
RAP1A rs3767595 A 0.5630385 0.002997
RAP1GAP2 rs8072031 A 0.7197891 0.001647
RASA1 rs10045850 G 0.5434864 0.0009501
RFWD2 rs7517503 A -0.5188578 0.002584
RFX5 rs1752387 A 0.4995624 0.004883
RGS17 rs4599660 A 0.5411609 0.0008719
RGS7 rs4660068 A 0.4668738 0.004827
RHOBTB1 rs2893869 A -0.9016485 0.001664
RIMS1 rs2807535 A -0.5882468 0.001225
ROBO1 rs7616065 G -0.6468363 0.0008196
RORA rs922782 C 0.5538851 0.0006946
RORC rs949969 A -0.6326167 0.00434
RPTOR rs719781 A -0.6829988 0.001331
RUNX1 rs56045941 G 0.6054083 0.001216
RXRA rs10776909 A 0.506215 0.004339
SCN7A rs7570585 A 0.5229518 0.003488
SCP2 rs10437066 A 0.5982869 0.004115
SDC2 rs2575735 G -0.6647541 0.001103
SGMS1 rs4935712 A 0.4498009 0.003951
SKI rs12119470 A 0.6270074 0.003325
SLC26A3 rs41668 G -0.644738 0.002041
SLC2A4 rs5418 G 0.4649911 0.003982
SLIT2 rs7680945 G 0.4983481 0.003287
SMAD3 rs1530060 G -0.6167419 0.002966
SMAD7 rs12967019 G -0.5061698 0.002311
SORBS1 rs7080061 G -0.5593158 0.004659
SOS2 rs56162847 A 0.5971868 0.001645
SOX5 rs11046975 A 0.6021278 0.0007572
SPTB rs229596 G 0.7518877 0.0002395
SPTBN1 rs115140859 G 0.8078142 0.004341
SRRM1 rs11249151 G 0.5007753 0.002544
STAC rs11717255 C 0.6647477 0.001124
STAT1 rs36077929 A 0.5777363 0.003895
STAT4 rs7573832 G 0.6734545 0.0005227
SUMO1 rs59293950 A 0.711969 0.0004984
SV2C rs6882321 C -1.096614 0.003275
SYT2 rs4364933 A 0.4681269 0.00326
TEK rs578327 G -0.6663105 0.003728
THBS2 rs9406326 A -0.860856 0.001689
THRB rs80264341 A 0.8246139 0.004364
TLE1 rs10780524 G 0.608678 0.001325
TLK1 rs3821087 A 0.9662231 0.0005882
TMOD1 rs10982602 G 0.6749832 0.002865
TOP1 rs6072249 G -0.4650559 0.004024
TPM3 rs12028949 G 0.8144794 0.000244
TRIM9 rs12883270 A 0.6564832 0.0009469
TRIO rs26092 A 0.5636078 0.00281
TUBGCP2 rs72864794 A 0.7584667 0.002284
UNC13B rs10465027 G -0.5462801 0.001816
UNC13C rs11071043 G 0.4718769 0.003536
USP7 rs4076904 A 0.5335652 0.00284
UTRN rs7765923 A -0.665532 0.001362
ZBTB16 rs622200 G -1.131652 0.0002872
ZNF160 rs4801954 C -0.6790471 0.0007198
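As a rough illustration of how the weights in Table S6.2 could be combined into a score, the sketch below computes a weighted sum of A1 allele counts, which is the common form of a polygenic risk score. It makes several assumptions that this appendix does not confirm: that A1 is the counted allele, that genotypes are coded as 0, 1, or 2 copies, that missing genotypes are skipped, and that no further scaling is applied. The three weights are copied from the table; the function name and the example genotypes are hypothetical.

```python
# Illustrative sketch only: a weighted-sum polygenic risk score.
# Weights come from Table S6.2; the genotype counts below are made up.

# SNP -> (A1 allele, weight); three rows copied from Table S6.2.
WEIGHTS = {
    "rs916827": ("A", 0.5463858),    # ABCC8
    "rs6699825": ("G", -0.5161732),  # ADAR
    "rs1428526": ("A", 0.7109871),   # ADCY2
}


def polygenic_risk_score(allele_counts, weights=WEIGHTS):
    """Return the sum over SNPs of weight * number of A1 alleles carried."""
    score = 0.0
    for snp, (_a1, weight) in weights.items():
        count = allele_counts.get(snp)
        if count is None:
            # Assumption: a missing genotype contributes nothing to the score.
            continue
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += weight * count
    return score


# Hypothetical individual: two copies of A at rs916827, one copy of G at
# rs6699825, and no copies of A at rs1428526.
example = {"rs916827": 2, "rs6699825": 1, "rs1428526": 0}
print(polygenic_risk_score(example))
```

Under these assumptions, SNPs with positive weights raise the score for each A1 allele carried and SNPs with negative weights lower it.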
Abstract
Aging is an exceedingly complex process, and for centuries scientists have strived to uncover the mechanisms that contribute to differences in the rate of the physiological decline that characterizes it. Over the years it has become apparent that the rate of physiological decline an organism undergoes is likely regulated by complex interactions of genes and environment. In animal models, nutritional interventions and genetic knockouts have produced dramatic lifespan extensions. Genetic variation has also been shown to influence how animals respond to and cope with environmental perturbations. In humans, environmental factors such as nutrition and exposure to cigarette smoke have been found to significantly affect lifespan. Nevertheless, these effects are not consistent for everyone, as evidenced by the history of smoking behavior amongst the very old.
Given the complexities involved in both the aging process and its regulators, complex statistical models will likely facilitate our understanding of these multifaceted interactions, particularly in humans. While the relatively short lifespan of laboratory animals makes them ideal for studying the aging process, our ability to impact our own health requires that studies of biological aging processes also be conducted in humans. Waiting to examine subjects at the end of life or collecting data over many decades may not be ideal for uncovering the causes of aging and lifespan heterogeneity in humans. Thus, the development of mathematical algorithms that are able to approximate the degree of aging an individual has undergone could allow researchers to test the efficacy of aging interventions in human populations. Additionally, advanced statistical methods may allow us to model the complex interactions between genes and environment that contribute to within- and between-species differences in aging and lifespan.
In Chapter 1 of my dissertation, I outline some of the work that has been done in biology and demography that contributes to our understanding of the aging process. I then go on to offer further evidence for the importance of more complex statistical modeling in studying human aging. In Chapter 2, I present and provide validation for an algorithm that estimates biological age. I show that the residuals in the difference between biological and chronological age are strongly associated with mortality and that biological age is a more accurate predictor of remaining life expectancy than chronological age.
In Chapters 3 and 4, I provide further proof of concept for the biological age algorithm. Chapter 3 examines whether differences in biological age account for the racial disparities in mortality that are present in the U.S. I find that biological age is successful in identifying the most at-risk individuals in a population and that adjustment for biological age at baseline completely accounts for race differences in all-cause, cardiovascular disease, and cancer mortality. In Chapter 4, I examine how the aging of the population has changed over the past two decades and examine the relative contributions of population changes in the prevalence of smoking, obesity, and medication use. Results in Chapter 4 suggest that aging has slowed for older and middle-aged adults, especially men, and that this was likely due to decreases in smoking and increased medication use. Unfortunately, younger women experienced very little improvement, which likely resulted from their increasing rates of obesity.
Chapters 5 and 6 examine reasons why some individuals are able to reach extreme old age even in the presence of clearly high exposure to damaging factors. In Chapter 5, I tested whether long-lived smokers represented a biologically resilient phenotype that could facilitate our understanding of heterogeneity in the aging process. Results showed that while smoking significantly increased mortality in most age groups, it did not increase the mortality risk for those who were age 80 and over at baseline. Additionally, when comparing the adjusted means of biomarkers between never and current smokers, long-lived smokers (80+) were found to have similar inflammation, HDL, and lung function levels to never smokers. Given the evidence that these individuals represent an innately resilient group, in Chapter 6, I used this phenotype to identify genomic networks that contributed to stress resistance and longevity. Overall, using a unique phenotype and incorporating prior knowledge of biological networks, I identified a cluster of 215 single nucleotide polymorphisms that together appear to be associated with human aging, stress resistance, cancer, and longevity.
In the final chapter of my dissertation, I propose a novel method that I plan to use in future research to identify complex genetic networks associated with a multitude of age-related conditions. I suggest how this method could also be incorporated into gene by environment or multi-level networks that utilize various types of omics data, social and behavioral data, and aging-related outcomes. I conclude the dissertation by outlining a new theory of aging that incorporates multi-level networks, programmed and stochastic theories of aging, and the second law of thermodynamics, which I plan to test in future work using stochastic simulations.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Social determinants of physiological health and mortality in China
Aging in a high infection society
Air pollution neurotoxicity throughout the lifespan: studies on the mechanism of toxicity and interactions with effects of sex and genetic background
A biodemographic approach to understanding sociodemographic disparities in kidney functioning on three dimensions: individual, population, and cross-national
Health differences between the elderly in Japan and the United States
Computational approaches to identify genetic regulators of aging and late-life mortality
Biomarkers of age-related health changes: associations with health outcomes and disparities
Internet communication use, psychological functioning and social connectedness at older ages
International sex and age differences in physical function and disability
Estimating survival in the face of pain: evidence from the health and retirement study
The regulation, roles, and mechanism of action of mitochondrial-derived-peptides (MDPs) in aging
Age integration in late life: sociodemographic & psychosocial correlates of intergenerational-only, peer-only, and age-integrated social networks
Self-perceptions of Aging in the Context of Neighborhood and Their Interplay in Late-life Cognitive Health
Longitudinal assessment of neural stem-cell aging
Three essays on modifiable determinants of shingles: risk factors for shingles incidence and factors affecting timing of vaccine uptake
Is stress exposure enough? Race/ethnic differences in the exposure and appraisal of chronic stressors among older adults
Alpha-ketoglutarate, an endogenous metabolite, extends lifespan and compresses morbidity in aging mice
Residential care in Los Angeles: policy and planning for an aging population
Essays on health and aging with focus on the spillover of human capital
Investigating brain aging and neurodegenerative diseases through omics data
Asset Metadata
Creator
Levine, Morgan Elyse (author)
Core Title
Statistical algorithms for examining gene and environmental influences on human aging
School
Leonard Davis School of Gerontology
Degree
Doctor of Philosophy
Degree Program
Gerontology
Publication Date
01/16/2015
Defense Date
12/15/2014
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
aging,biomarkers,genetics,longevity,mortality,OAI-PMH Harvest
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Crimmins, Eileen M. (committee chair), Ailshire, Jennifer (committee member), Cohen, Pinchas (committee member), Finch, Caleb E. (committee member), Schneider, Edward (committee member)
Creator Email
canon@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-522802
Unique identifier
UC11297931
Identifier
etd-LevineMorg-3118.pdf (filename),usctheses-c3-522802 (legacy record id)
Legacy Identifier
etd-LevineMorg-3118.pdf
Dmrecord
522802
Document Type
Dissertation
Rights
Levine, Morgan Elyse
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA