ESSAYS IN PANEL DATA ANALYSIS
by
Yanyu Wu
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ECONOMICS)
August 2012
Copyright 2012 Yanyu Wu
Acknowledgments
This dissertation would not have been possible without the help, support and
encouragement of a small group of amazing people who possess the combination of
experience and skills that enabled me to make this labor of intellectual curiosity
possible. I am fortunate to have had such a tremendous experience, both as a
researcher and a professional, and I am grateful to all who have helped me along the
way.
My greatest appreciation goes to my advisor, Professor Cheng Hsiao. I am thankful
to him for introducing me to the field of health care. He taught me the importance
of careful empirical work, and helped shape my thoughts into logical ideas. With his
focus and attention to my work, I avoided many errors. Moreover, Professor Hsiao’s
kind personality, rigorous scholarship, and sophisticated modeling skills have won
my profound admiration.
To Professor Neeraj Sood, whose expertise in health care research allowed me to
rapidly immerse myself in this area, I give my sincere thanks. He generously shared
his time and expertise with me, and assisted me at every step of my research:
digging for references, searching through data, providing constructive suggestions,
and reading and revising innumerable drafts.
And with uncommon kindness, Professor Jeffrey Nugent helped me to develop my
presentation skills. I have been greatly energized by his unmatched enthusiasm for
both my dissertation and our collaboration. With his encouragement and passionate
support for my career development, Professor Nugent has gone above and beyond
what I expected.
Many other faculty members outside of my dissertation committee also had an
impact on my work. I am indebted to Professor Roger Moon, who was invaluable
during the early stage of my research. He, too, frequently pointed me towards new
opportunities. Professor Robert Dekle and Professor John Ham also contributed expert
feedback on my research.
Finally, I would like to thank the University of Southern California Department of
Economics for the opportunity to conduct my research at one of the top institutions
in the world. The thought leaders with whom I have worked are inspiring, and the
collective wisdom of the faculty is truly remarkable.
Table of Contents
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: The Impact of Insurance and HIV Treatment Technology on HIV Testing
1.1 Introduction
1.2 Conceptual Framework
1.2.1 A Simple Behavioral Model
1.2.2 Research Questions and Propositions
1.3 Empirical Strategy
1.3.1 A Model with Structural Break in 1996
1.3.2 A Recursive Bivariate Probit Model
1.3.3 Identification and Estimation of a Recursive Bivariate Probit Model
1.3.4 Empirical Hypothesis and Estimation of Average Treatment Effects
1.4 Data
1.4.1 Behavioral Risk Factor Surveillance System (BRFSS)
1.4.2 Restriction Exclusions
1.5 Results
1.6 Conclusions
Chapter 2: Factor Analysis on U.S. Housing Price Indexes
2.1 Introduction
2.2 Determining the Number of Factors
2.3 Examining the Links between Common Factors in HPI and Other Business Cycles
2.4 Data
2.5 Factor and Factor Loadings
2.5.1 Number of Factors
2.5.2 Factor Loadings
2.6 Goodness of Fit
2.6.1 Testing the Fitness of Macroeconomic Variables and Financial Market Indices
2.6.2 Lag, Number of Factors and Goodness of Fit
2.7 Conclusion
References
Appendices
Appendix A: Reasons of HIV Testing
Appendix B: Definitions of Variables for the First Chapter
Appendix C: Definitions of Variables for the Second Chapter
List of Tables
Table 1 State Characteristics by Changes of the Proportion of 1115 Waiver Beneficiaries
Table 2 State Characteristics by Changes in Firm Size
Table 3 Descriptive Statistics by Risk Group
Table 4 Recursive Bivariate Probit Regression
Table 5 ATE from Recursive Bivariate Probit Model
Table 6 Recursive Bivariate Probit Regression with State Controls
Table 7 ATE from Recursive Bivariate Probit Model with State Controls
Table 8 HPA Correlation (12 MSAs) for the period 1991Q1 to 2010Q4
Table 9 Estimated Numbers of Factors (r): kmax=8
Table 10 Goodness of Fit ( and 2 )
Table 11 Reasons of Testing
Table 12 Definitions of Variables for the First Chapter
Table 13 Macroeconomic Variables (Macro)
List of Figures
Figure 1 Decision Tree
Figure 2 HPI of top 5 states with lowest foreclosure rate in 2007
Figure 3 HPI of Bottom 5 states with lowest foreclosure rate in 2007
Figure 4 Co-movements between Macroeconomic Aggregates and House Prices
Figure 5 Housing Price Appreciation (HPA) and its Volatility at MSA level
Figure 6 1st Factor (OFHEO) vs. National Index (OFHEO): Standardized HPA
Figure 7 First 4 Factors (OFHEO)
Figure 8 1st Factor (MRAC) vs. National Index (OFHEO)
Figure 9 Factor Loadings (OFHEO)
Figure 10 Number of Factors (r) and R-square (R²)
Figure 11 Number of Factors (r) and Correlation with Macroeconomic Series ( )
Abstract
This dissertation consists of two empirical studies on panel data models. In the first
chapter, we study a pseudo panel data set with binary variables as outcome variables
and variables of interest. We discuss the identification of a recursive bivariate probit
model. We estimate the average treatment effects of health insurance and new
treatment technology on the probability of being tested for HIV. In the second
chapter, we study genuine panel data and apply a factor analysis to Metropolitan
Statistical Area (MSA)-level Housing Price Indexes (HPIs). With this, we estimate
the number of common factors using the information criteria of Bai and Ng (2002), and
measure the closeness of the links between macroeconomic variables and the set of
common factors.
The first chapter investigates the effects of health insurance and new antiviral
treatments on HIV testing rates among the U.S. general population. A theoretical
model is developed in which an agent decides whether or not to undergo HIV
testing. This decision is determined by the value of early treatment and the value of
identifying HIV-negative status. We test the predictions from the theoretical model
by using nationally representative data from the Behavioral Risk Factor Surveillance
System (BRFSS) for the years 1993 to 2002. We estimate a recursive bivariate probit
model, with insurance coverage and HIV testing as the dependent variables. We use
changes in Medicaid eligibility and distribution of firm size over time within a state
as restriction exclusions for insurance coverage. Using a bootstrap method, we
estimate robust confidence intervals of average treatment effects. Consistent with the
theoretical model, the results suggest that (a) insurance coverage increases HIV
testing rates, (b) insurance coverage increases HIV testing rates more among the
high-risk population, and (c) the advent of Highly Active Antiretroviral Therapy
(HAART) increases the effects of insurance coverage on HIV testing for high-risk
populations.
The second chapter aims to identify common factors underlying fluctuations in the
Metropolitan Statistical Area (MSA)-level Housing Price Indexes (HPIs). More
importantly, we examine whether the observed macroeconomic variables are exact
factors. For robustness, we study the two most popular housing price indexes: the
Office of Federal Housing Enterprise Oversight (OFHEO) repeat sales index, and
the Mortgage Risk Assessment Corporation (MRAC) median home price index. The
methodology follows a two-step procedure: first, we look at several information
criteria to determine the number of common factors that underlie fluctuations in the
MSA-level HPIs. Next, we measure the overall closeness of the links between each
macroeconomic variable and the set of common factors, using the factors estimated
in the first step. Our results suggest that only a small number of factors capture the
main co-movements of housing price time series data in the U.S.: the IC and PC
criteria of Bai and Ng (2002) both suggest the presence of four factors, while the
criteria of Ahn and Horenstein (2009) find only one latent factor in the OFHEO panel. The
first factor, called the “summary measure,” closely tracks the national index.
The degree of closeness between this summary measure and the national index
reflects the accuracy of weights assigned to census divisions in constructing the
national index. We find a geographical pattern of factor loadings, which is useful in
defining submarkets. Our comparison study shows that the MRAC HPIs are more
volatile than the OFHEO HPIs. As a result, a larger number of factors are extracted.
Finally, findings show that GDP, personal consumption, fixed private investment,
employment growth, unemployment rates, Treasury bill market rates, and certificate
of deposit market rates have strong correlations with the housing market. Financial
markets have contemporaneous effects on the housing market while investment and
consumption show lagged effects. The measures in the second step reconfirm the
presence of four latent factors in the housing price markets.
Chapter 1: The Impact of Insurance and HIV Treatment
Technology on HIV Testing
1.1 Introduction
The number of uninsured Americans has risen steadily during the last decade. In 1999,
about 40 million Americans lacked health insurance, and in 2009 more than 50 million
were uninsured. The recent economic downturn has exacerbated the problem: the
number of uninsured has risen faster than historical trends would suggest, with 6
million more uninsured since 2007. Health care reform in the U.S. seeks to reverse this
trend by providing subsidies for insurance coverage to low income persons and by
creating insurance exchanges to promote competition among health insurers. This
naturally raises the question: How will changes in uninsurance affect the health of
Americans now and in the future? It is well understood that lack of health insurance can
affect health status by reducing access to treatment, especially new and expensive
treatments (Jayanta Bhattacharya, Goldman and Sood 2003; David Card, Dobkin and
Maestas 2008; David Card, Dobkin and Maestas 2009; Michael J. McWilliams 2009).
However, it is less clear how uninsurance affects health related behaviors which might
affect long term population health. In this paper, we seek to investigate this issue in the
context of HIV/AIDS—a disease that has claimed more than half a million lives in the
U.S. and about 25 million lives worldwide. CDC estimates that approximately 50,000
people are newly infected with HIV each year in the United States.
There are several reasons that make HIV/AIDS an interesting case study for examining
the effects of health insurance on population health and health related behaviors. First,
insurance coverage might have competing effects on health and health related behaviors
in the context of HIV. On one hand, insurance coverage saves lives by improving access
to expensive but efficacious HIV treatments (Jayanta Bhattacharya, Goldman and Sood
2003) [1]. On the other hand, insurance coverage might increase the spread of HIV by
increasing risky sexual behavior among the HIV positive (Darius Lakdawalla, Sood and
Goldman 2006). However, prior work does not address how insurance coverage might
affect HIV testing rates, a question we address in the current paper. Second, new but
expensive treatments for HIV, known as Highly Active Antiretroviral Therapy (HAART), were
introduced in the mid-1990s. The introduction of HAART allows us to understand how
the effects of insurance coverage on health related behavior are influenced by health care
innovation. In particular, we study how the effects of insurance coverage on HIV testing
differed in the pre- and post-HAART era. Finally, changes in HIV testing rates induced
by changes in health insurance might have significant externalities in terms of changing
the dynamics of the HIV epidemic. Thus, the results of the analysis are relevant for
understanding the welfare implications of government-financed health insurance
expansions in the U.S. as well as in several African countries ravaged by HIV.
[1] Although there is still pervasive discrimination against HIV+ people, such discrimination is illegal under
the Americans with Disabilities Act of 1990 and the Federal Rehabilitation Act of 1973 and the laws of
most states.
To examine the potential implications of insurance on HIV testing, one needs to
understand the potential motivations for HIV testing or not testing. Prior work suggests
two motivations for testing (Tomas J. Philipson and Posner 1995). First, HIV testing is a
way to signal quality or HIV status to potential partners in the market for mutually
beneficial sexual trades. Second, HIV testing is motivated by the desire to seek early
treatment: if a person knows that he is infected then he can start treatment early. We
argue that which of these two motivations dominates depends on risk status. High risk
individuals would primarily benefit from initiation of treatment while low risk
individuals would benefit from signaling HIV negative status.
We develop a stylized theoretical model that incorporates both these motivations for HIV
testing. The model predicts that health insurance coverage encourages testing for both
high risk and low risk populations. Insurance encourages testing among the high risk
population by reducing out of pocket costs of treating infection today. Similarly,
insurance encourages testing among low risk populations by reducing the expected costs
of treating future infections due to increased sexual activity if the person tests negative [2].
Thus, the extent to which insurance coverage might influence testing will depend on risk
of infection [3].
New treatments that are effective but expensive might mediate the effects of insurance
coverage on incentives for HIV testing in complex ways. New expensive treatments
increase the value of subsidized treatments available through insurance: i.e., insurance is
more valuable if treatment costs $15,000 rather than $5,000. This implies that the effect
of insurance on HIV testing would increase with the advent of new expensive treatments
such as HAART. However, new treatments also reduce the signaling value of a negative
test as potential partners are less worried about getting infected. This implies a smaller
increase in the risk of infection and decreased expected burden of future infections. Thus,
if the reduction in signaling value dominates then new treatments might reduce the effect
of insurance on HIV testing. Again the effects might be heterogeneous and depend on the
risk of infection or the probability of testing positive today. For high risk populations, the
former effect will dominate and the effect of insurance on HIV testing will likely increase
after the introduction of new treatments. For low risk populations, by contrast, the signaling
effect will dominate and new treatments will likely lower the effects of insurance on HIV
testing.
[2] In theory, knowing one’s HIV negative status might increase or decrease risk of future infection. However,
available empirical evidence suggests that knowing HIV negative status is associated with either an
increase in risk of infection or no change in risk of infection (M. W. Otten Jr, et al. 1993; L. S. Weinhardt, et
al. 1999).
[3] Insurance coverage can also reduce the costs of HIV testing, but we ignore this effect as many uninsured
consumers can get a free HIV test from safety net providers and HIV support groups.
We test the predictions from our theoretical model by using nationally representative data
for the years 1993 to 2002 from the Behavioral Risk Factor Surveillance System
(BRFSS). We estimate recursive bivariate probit models with insurance coverage and
HIV testing as the dependent variables. We use changes in Medicaid eligibility and
distribution of firm size over time within a state as restriction exclusions for the health
insurance equation. The validity of these variables is discussed in detail in Sections 1.4
and 1.5. Consistent with the theoretical model, the results suggest that insurance
coverage increases HIV testing among both the high risk and low risk populations. The
results also suggest that the advent of HAART increases the effects of insurance coverage
on HIV testing for high risk populations but lowers the effects of insurance coverage on
HIV testing for low risk populations.
Overall these results suggest that insurance coverage has the potential to have significant
effects on health related behaviors. The paper also contributes to the literature on the
economic epidemiology of HIV. Past work has shown that increased insurance coverage
saves lives and improves the welfare of the infected by improving access to HIV treatment.
The results from this paper show that insurance coverage might also have long term
effects on the welfare of the currently uninfected by altering HIV testing rates and hence the
dynamics of the epidemic. The results of this paper also add to our understanding of the
motivations for HIV testing. For example, a recent paper in this field uses data from a
randomized experiment to show that small financial incentives or subsidies for HIV
testing can have significant effects on HIV testing rates in poor countries (Rebecca L.
Thornton 2008). Our results suggest that HIV testing rates can be improved not only by
subsidizing HIV testing but also by improving access to treatment.
The rest of the paper proceeds as follows: Section 1.2 develops the theoretical model;
Section 1.3 presents the empirical strategy; Section 1.4 describes the data used in the
estimation of the empirical model; Section 1.5 presents the empirical results; and
Section 1.6 concludes.
1.2 Conceptual Framework
1.2.1 A Simple Behavioral Model
In this section we develop a stylized model for analyzing how insurance coverage might
affect incentives for HIV testing.
Consider the problem of a person who has a positive probability of being HIV+ but whose
HIV status is unknown. This at-risk agent must decide whether or not to get tested for
HIV. We assume that the at-risk agent can keep his test results confidential. If the agent
decides to get tested and tests positive, he starts treatment and reduces the level (say,
number of partners) and riskiness (say, sex without a condom) of sexual activity [4]. If he tests
negative (uninfected), we assume that he increases the number of sexual partners and
reduces the riskiness of sexual activity (Tomas J. Philipson and Posner 1995). If the agent
does not undergo testing, he continues with his current risk level of sexual exposure and
does not start treatment or starts treatment late when his HIV status is known (Figure 1).
[4] In theory, knowing one’s HIV+ status can either increase or decrease riskiness of sexual activity. For
example, altruistic HIV+ persons might reduce risky sexual activity on learning their status in order to
reduce risk of infection for potential partners. On the other hand, knowing one’s HIV+ status might
increase riskiness of sexual activity as one has less to lose from risky sexual activity if one is already
infected with HIV. However, existing empirical evidence suggests that knowing one’s HIV+ status is
associated with a decrease in risky sexual activity (Gary Marks, et al. 2005; Gary Marks, et al. 2005; Gary
Marks, Crepaz and Janssen 2006; Rebecca L. Thornton 2008).
Figure 1 Decision Tree
As illustrated in Figure 1, the person decides to test if the payoff from testing, $V_T$,
exceeds the payoff from not testing, $V_N$:
$$V_T > V_N. \tag{1}$$
Let superscripts $+$ and $-$ represent the HIV positive (infected) and HIV negative
(uninfected) states respectively, and let subscripts $T$ and $N$ indicate testing and not
testing. The payoff of not testing is a weighted average of the utility of being HIV
negative with unknown HIV status, $U_N^-$, and the utility of being HIV positive with
unknown HIV status, $U_N^+$. These utilities are weighted by the probability of
infection, $p$, where $p$ is a function of observable characteristics ($X$):
$$V_N = p\,U_N^+ + (1-p)\,U_N^-. \tag{2}$$
Similarly, the payoff of testing is a weighted average of the utility of being HIV negative
and knowing one’s HIV negative status, $U_T^-$, and the utility of being HIV positive and
knowing one’s HIV positive status, $U_T^+$:
$$V_T = p\,U_T^+ + (1-p)\,U_T^- - c, \tag{3}$$
where $c$ denotes the monetary and psychological costs of testing. An individual will decide to test if
$$p\,(U_T^+ - U_N^+) + (1-p)\,(U_T^- - U_N^-) > c, \tag{4}$$
i.e., the individual tests if the benefits exceed the costs. There are two competing effects
on utility for those who test positive [5]. First, those who test positive enjoy the benefits of
treatment net of costs. We call this the “treatment effect”:
$$B = H - M, \tag{5}$$
where $B$ is the benefit of treatment net of costs, $H$ is the health benefit of treatment and
$M$ are the treatment costs. Second, those who test positive reduce the riskiness and level of
sexual activity. In theory, the level of risky sexual activity can either increase or decrease.
However, empirical evidence suggests that the level of sexual activity falls sharply due to
testing HIV+ (Gary Marks, Crepaz and Janssen 2006). Therefore, we assume a reduction
in the level and riskiness of sexual activity. We assume that this change in riskiness and
level of sexual activity reduces the level of pleasure derived from sex by $L$. We call this
reduced sexual activity among the HIV+ the “protective effect” of testing. Therefore, the
change in utility from testing for the HIV+ population can be expressed as
$$U_T^+ - U_N^+ = B - L. \tag{6}$$
[5] For simplicity we ignore cross partials and assume that utility is separable in consumption, health and
pleasure from sexual activity.
We now turn to the effects of testing on the HIV negative population. An HIV negative
test increases utility as it allows the person to increase the level of sexual activity by
signaling or verifying HIV negative status. The HIV negative person might also respond
by reducing the riskiness of sexual activity so the net effect on risk of future infections is
unclear (Tomas J. Philipson and Posner 1995). However, empirical evidence that uses
sexually transmitted infections as a biomarker for risk of HIV infection suggests that
HIV testing increases the risk of infection among the HIV negative population (M. W.
Otten Jr, et al. 1993; L. S. Weinhardt, et al. 1999). Therefore, we assume that an HIV
negative test result increases the risk of future infection. This change in utility from the
signaling effect of testing is shown below. It represents the change in utility from
increased sexual activity due to known HIV negative status net of the expected costs of
treating future infections:
$$U_T^- - U_N^- = S - \Delta p\,M_F, \tag{7}$$
where $S$ represents the change in utility from increased sexual activity, $\Delta p$ represents the
increased risk of future infection and $M_F$ represents the discounted value of costs of
treating future infections. Substituting equations (6) and (7) into expression (4) we obtain
that a person tests for HIV if
$$\Delta V \equiv p\,(B - L) + (1-p)\,(S - \Delta p\,M_F) > c. \tag{8}$$
In (8), $\Delta V$ captures the treatment effect, the protective effect and the signaling effect of
testing. The treatment effect, $B$, and the protective effect, $L$, are weighted by the
probability of being infected, $p$, and the signaling effect, $S - \Delta p\,M_F$, is weighted by the
probability of being uninfected, $1-p$. Consumers decide to undergo an HIV test if $\Delta V$ exceeds
the monetary and psychological costs of testing, $c$.
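The decision rule in (8) can be made concrete with a small numerical sketch. The Python below is only an illustration under stated assumptions: the class, the field names and every parameter value are hypothetical labels chosen here, not notation or estimates from this chapter, and utility is treated as additively separable as in the text.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical parameters for one at-risk agent (illustrative only)."""
    p: float            # probability of currently being infected
    health: float       # health benefit of treatment
    cost: float         # cost of treatment
    loss: float         # pleasure lost through the protective effect
    signal: float       # utility gain from signaling HIV-negative status
    dp: float           # rise in risk of future infection after a negative test
    future_cost: float  # discounted cost of treating a future infection
    test_cost: float    # monetary and psychological cost of testing


def net_benefit(a: Agent) -> float:
    """Probability-weighted gain from testing, before the cost of testing."""
    positive_branch = (a.health - a.cost) - a.loss       # treatment minus protective effect
    negative_branch = a.signal - a.dp * a.future_cost    # signaling effect
    return a.p * positive_branch + (1 - a.p) * negative_branch


def decides_to_test(a: Agent) -> bool:
    """The agent tests when the net benefit exceeds the cost of testing."""
    return net_benefit(a) > a.test_cost


# Made-up values, chosen only to exercise the rule.
example = Agent(p=0.2, health=10.0, cost=4.0, loss=1.0,
                signal=2.0, dp=0.1, future_cost=3.0, test_cost=1.0)
print(round(net_benefit(example), 2), decides_to_test(example))
```

With these made-up values the weighted gain exceeds the cost of testing, so the agent tests; a sufficiently large `test_cost` would reverse the decision.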
1.2.2 Research Questions and Propositions
Using this stylized theoretical framework, we aim to answer the following questions:
(1) How does health insurance affect HIV testing? And how does the effect of
insurance on HIV testing differ for high risk and low risk people?
(2) How does the introduction of HAART affect the impact of insurance on HIV
testing? And how is this effect of HAART different for high risk people and low
risk people?
In order to evaluate the above questions we need to first evaluate how insurance status
might influence the expression in (8). We posit that insured consumers enjoy the same
health benefit from treatment, $H$, but pay only a fraction, $\theta$, of the treatment costs, $M$,
and of the discounted costs of treating future infections, $M_F$, where
$0 \le \theta \le 1$ is the coinsurance rate for the insurance policy. Higher values of $\theta$ imply less
generous insurance and $\theta = 1$ implies no insurance. We also assume that insurance
coverage does not affect the protective effect, $L$, and the utility from signaling, $S$, as there is no obvious direct mechanism through which
insurance coverage affects pleasure from sexual activity. Although insurance coverage
can reduce the monetary costs of HIV testing, we ignore this effect as many uninsured
consumers can get a free HIV test from safety net providers and HIV support groups.
Given this setup, it is easy to see how insurance coverage might affect incentives for HIV
testing and how these effects might differ by risk group. Propositions 1 and 2 listed below
describe these effects.
Proposition 1: Health insurance increases incentives for HIV testing.
This proposition follows from the fact that, for $\theta < 1$,
$$p\,(H - \theta M - L) + (1-p)\,(S - \Delta p\,\theta M_F) > p\,(H - M - L) + (1-p)\,(S - \Delta p\,M_F). \tag{9}$$
Insurance coverage increases incentives for testing through two mechanisms. First, it
increases incentives for initiation of treatment as insured consumers pay only a fraction
of the treatment costs. Second, it increases the signaling value of an HIV test, keeping
sexual activity constant, as it lowers expected treatment costs of future infections. It is
important to note that utility maximizing consumers might increase sexual activity in
response to lower expected treatment costs. However, by the envelope theorem the increase
in pleasure from sexual activity will always offset the increase in expected costs for
utility maximizing consumers. So the net effect of insurance is still an increase in
signaling value.
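Proposition 1 can be checked numerically. The sketch below holds sexual activity fixed and assumes, as in the text, that an insured agent pays only a fraction of current and future treatment costs. The function name, the argument names and the default values are hypothetical and chosen only for illustration.

```python
def net_benefit(p, theta, health=10.0, cost=4.0, loss=1.0,
                signal=2.0, dp=0.1, future_cost=3.0):
    """Net benefit of testing when the agent pays a share `theta` of costs.

    theta = 1.0 means no insurance; theta = 0.0 means full coverage.
    All default values are made up for illustration.
    """
    positive_branch = health - theta * cost - loss       # treatment and protective effects
    negative_branch = signal - dp * theta * future_cost  # signaling effect
    return p * positive_branch + (1 - p) * negative_branch


# Move from no insurance (theta = 1) towards full coverage (theta = 0).
thetas = [1.0, 0.75, 0.5, 0.25, 0.0]
values = [net_benefit(p=0.2, theta=t) for t in thetas]

# Each step towards more generous coverage should raise the incentive to test.
increasing = all(later > earlier for earlier, later in zip(values, values[1:]))
print(increasing)
```

The check relies on positive treatment costs and a positive risk of current or future infection; with zero costs, insurance would leave the incentive unchanged.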
Proposition 2: Health insurance increases incentives for HIV testing more for high risk
than low risk consumers as long as sexual activity of the uninfected does not increase
with insurance.
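Using the expression in (9), the change in incentives for testing due to insurance coverage,
keeping sexual activity constant, can be expressed as the difference between its two sides,
with $1-\theta$ the share of treatment costs covered by insurance:
$$(1-\theta)\,\bigl[\,p\,M + (1-p)\,\Delta p\,M_F\,\bigr]. \tag{10}$$
Taking the derivative of the above expression with respect to the risk of infection, $p$, yields:
$$(1-\theta)\,(M - \Delta p\,M_F) > 0. \tag{11}$$
The above expression is greater than zero since $\Delta p$ is less than one, provided that the
discounted cost of treating a future infection, $M_F$, does not exceed the current treatment
cost, $M$. Hence, the change in
incentives for testing due to insurance coverage increases with the risk of infection.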
Intuitively, both HIV positive and HIV negative individuals benefit from insurance
coverage due to reduction in treatment costs. However, HIV negative individuals enjoy a smaller
benefit as they only benefit from reduction in treatment costs if they get infected in the
future. Since higher risk implies greater probability of being HIV positive it also implies
a greater benefit of insurance and greater change in incentives for testing due to insurance
coverage.
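The comparative static behind Proposition 2 can be illustrated the same way. The sketch below computes the gain in testing incentives from insurance at several risk levels, holding sexual activity fixed. It assumes the discounted cost of a future infection does not exceed the current treatment cost; all names and numbers are hypothetical.

```python
def insurance_effect(p, theta, cost=4.0, dp=0.1, future_cost=3.0):
    """Gain in the incentive to test from paying share `theta` instead of all costs.

    With sexual activity held fixed, only the cost terms differ between the
    insured and the uninsured agent. Default values are made up.
    """
    current_savings = p * cost                   # infected today: saves on treatment now
    future_savings = (1 - p) * dp * future_cost  # uninfected: saves only if infected later
    return (1 - theta) * (current_savings + future_savings)


risks = [0.01, 0.05, 0.20, 0.50]
effects = [insurance_effect(p, theta=0.2) for p in risks]

# The effect should rise with the risk of infection when cost > dp * future_cost.
rises_with_risk = all(b > a for a, b in zip(effects, effects[1:]))
print(rises_with_risk)
```

If `dp * future_cost` were larger than `cost`, the ordering would flip, which is why the condition on future costs matters for the result.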
The above result would be ambiguous if sexual activity of the uninfected increases in
response to insurance coverage. In particular the expression in (11) would have an
additional term which reflects the costs of treating future infections due to increased
sexual activity induced by insurance coverage net of the pleasure derived from this
increased sexual activity. While there is little empirical evidence to support the result that
sexual activity increases with insurance coverage the result is certainly theoretically
plausible.
Our next task is to understand how the introduction of HAART, which greatly increased
treatment costs, but also provided significant health benefits, influences the impact of
insurance on incentives for HIV testing (Proposition 1). As a corollary we also want to
examine how these effects of HAART may vary by risk of infection (Proposition 2). In
the context of our model, we posit that HAART increases the health benefits of treatment
and treatment costs and that HAART is welfare enhancing on net as health benefits of
treatment increase by more than treatment costs. We also posit that after the advent of
HAART potential sexual partners are less worried about getting HIV. This implies that
the increase in partners or sexual activity due to signaling HIV negative status is lower
with HAART. Writing $H$ for the health benefit of treatment, $M$ for the treatment cost and
$\Delta p$ for the increase in risk of future infection after a negative test, and marking
post-HAART values with a prime, these potential effects of HAART are stated below:
$$H' - H > M' - M > 0, \qquad \Delta p' < \Delta p. \tag{12}$$
Given these potential effects we can derive the effects of HAART on the impact of
insurance on incentives for HIV testing.
Proposition 3: HAART has an ambiguous effect on the impact of insurance on HIV
testing.
Expression (10) shows how insurance coverage affects the incentives for testing.
Evaluating the expression before and after HAART reveals two competing effects of
HAART. On the one hand HAART increases treatment cost which increases the value of
insurance and the effect of insurance on HIV testing. On the other hand, HAART reduces
the value of signaling, therefore reduces the magnitude of increase in sexual partners if
one tests negative, which reduces the chance of future infection. The reduced probability
of future infection reduces the value of insurance and the effect of insurance on HIV
testing. Thus, HAART has an ambiguous effect on the impact of insurance on HIV
testing. However, HAART is likely to affect the impact of insurance on testing differently
for high risk and low risk individuals. HAART likely increases the impact of insurance
coverage on incentives for testing for high risk individuals since high risk individuals
primarily benefit from HIV testing due to the treatment value of testing. Since both
HAART and insurance increase the treatment value of testing it is likely that HAART
increases the impact of insurance coverage on incentives for testing for the high risk
group.
1.3 Empirical Strategy
1.3.1 A Model with Structural Break in 1996
The model is represented by two equations. The first equation models the decision to
undergo an HIV test (T). The second equation models insurance status, i.e., whether the
person is insured or uninsured (HI). We approximate the utility function of HIV testing by
a reduced form specification, i.e., a latent HIV testing equation,
∗
if an individual has some kind of health care coverage, including health insurance,
prepaid plans such as HMOs, or government plans such as Medicare, and,
∗
if an individual has no health care coverage. X denotes the vector of observable factors
and post is a dummy variable for post-HAART years (i.e., following the introduction of
HAART in 1996),
1 1996
0 1996
We denote HI* as the latent utility of having health insurance. Similarly, it is
approximated by a reduced form specification such that HI = 1 if HI* > 0 and HI = 0
otherwise, where z denotes the additional observable factors.
The error terms of the three latent equations are assumed to be jointly normally distributed
with zero means and a covariance matrix with unit variances on the diagonal.
In particular, we observe two discrete outcomes: having health insurance and
undergoing HIV testing, which take the form of (HI, T). T = 1 indicates that an
individual has tested for HIV within a year and T = 0 otherwise. HI = 1 indicates
that an individual has some kind of health care coverage.
This model specification leads to an endogenous switching model, with the three latent
equations to be estimated. For simplicity, we impose restrictions under which this model
becomes a bivariate probit model, with the HIV testing equation being the same for people
with and without insurance other than the intercept.
1.3.2 A Recursive Bivariate Probit Model
We observe two discrete outcomes: having health insurance and undergoing
HIV testing, which take the form of (HI, T). In a bivariate probit model these outcomes
are modeled using a latent variable approach where we only observe whether the latent
variable is above or below zero. The latent variables underlying HIV testing and health
insurance are given below. We estimate separate models for each risk group
k ∈ {HighRisk, LowRisk}. We discuss the empirical definitions of the risk groups in the
data section of the paper.
We use a Recursive Bivariate Probit model to estimate the causal effect of health
insurance coverage on the probability of testing for HIV. The recursive structure builds
on a first equation for the potentially endogenous dummy, insurance status, and a second
equation determining the outcome of interest, the decision to undergo an HIV test.
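\[
HI^{*} = \alpha^{k}_{0} + \alpha^{k}_{1} X + \alpha^{k}_{2} z + \upsilon, \tag{13}
\]
\[
HI = 1\left(HI^{*} > 0\right), \tag{14}
\]
\[
T^{*} = \beta^{k}_{0} + \beta^{k}_{1} HI + \beta^{k}_{2} Post + \beta^{k}_{3}\, Post \times HI + \beta^{k}_{4} X + \varepsilon, \tag{15}
\]
\[
T = 1\left(T^{*} > 0\right). \tag{16}
\]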
In (13)-(16), T* is the latent variable for HIV testing, HI* is the latent variable for
health insurance, and Post is a dummy variable for post-HAART years (i.e., following the
introduction of HAART in 1996). X comprises three components: demographic variables
including age, gender, education, race, and income level; state fixed effects, which
control for time-invariant unobserved heterogeneity across states; and year fixed effects,
which control for secular time trends. In particular, we assume that (ε, υ) is independent
of z and distributed as bivariate normal with mean zero and unit variances, with
ρ = corr(ε, υ). Standard errors are clustered at the state level.
In this model, insurance status and HIV testing are linked for two reasons. First,
insurance status enters as a regressor in the HIV testing equation – that is, insurance is
causally linked to HIV testing. Second, unobserved determinants of HIV testing and
insurance status (the error terms in each equation) are correlated. For example,
individuals engaged in risky behaviors might be more likely to undergo an HIV test but
less likely to have insurance.
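The two channels described above (insurance entering the testing equation, plus correlated errors) can be made concrete with a small simulation. The following is only a sketch under stated assumptions: it uses one stand-in covariate x, one stand-in instrument z, a randomly assigned Post indicator, and made-up coefficient values; none of these come from the dissertation's data or estimates, and the function name is hypothetical.

```python
import math
import random

def simulate_recursive_biprobit(n, alpha, beta, rho, seed=0):
    """Simulate outcomes from a toy version of equations (13)-(16).

    alpha = (a0, a1, a2): intercept, coefficient on x, coefficient on z (HI equation).
    beta  = (b0, b1, b2, b3, b4): intercept, HI, Post, Post*HI, x (T equation).
    rho   = corr(eps, ups). All values passed in are illustrative, not estimates.
    """
    rng = random.Random(seed)
    a0, a1, a2 = alpha
    b0, b1, b2, b3, b4 = beta
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)       # single stand-in covariate (assumption)
        z = rng.gauss(0.0, 1.0)       # single stand-in excluded instrument (assumption)
        post = rng.randint(0, 1)      # post-HAART indicator
        ups = rng.gauss(0.0, 1.0)
        # eps has unit variance and correlation rho with ups
        eps = rho * ups + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        hi = 1 if a0 + a1 * x + a2 * z + ups > 0 else 0                      # (13)-(14)
        t_star = b0 + b1 * hi + b2 * post + b3 * post * hi + b4 * x + eps    # (15)
        t = 1 if t_star > 0 else 0                                           # (16)
        rows.append({"HI": hi, "T": t, "Post": post, "x": x, "z": z})
    return rows
```

With a negative rho, the unobserved factors that raise testing lower insurance take-up, which is the pattern the example in the text describes for individuals engaged in risky behaviors.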
1.3.3 Identification and Estimation of a Recursive Bivariate Probit Model
Our model is a straightforward extension of a fully simultaneous bivariate probit model.
Incoherence, which can be interpreted as a form of model misspecification, arises in such
simultaneous games. Heckman (1978) refers to the coherence condition as the condition for
the existence of the model. The condition for the internal consistency of a simultaneous
probit model with endogenous truncated variables found by Schmidt (1981) is essentially
that it be recursive, which is similar to Heckman’s (1978) linear model result, except that
nonlinearity permits the direction of causality to vary across observations. Our model
therefore satisfies the coherence condition.
Our model also belongs to the general class of simultaneous equation models with both
continuous and discrete endogenous variables introduced by Heckman (1978).
In this general context, Heckman (1978) argues that only the full rank of the
regressor matrix is needed to identify the parameters. Maddala (1983), in contrast,
asserts that the parameters of the second equation are not identified if there are no
exclusion restrictions on the exogenous variables. However, Wilde (2000) shows
that this assertion is not true and that exclusion restrictions are not necessary for
identification as long as each equation contains at least one exogenous regressor, i.e.,
theoretical identification does not require exclusion restrictions if there is sufficient
variation in the data.
Although the model is identified by its non-linear functional form even in the absence of
exclusion restrictions, identification by functional form relies heavily on the assumption
of bivariate normality. Under distributional misspecification, exclusion restrictions might
help to make the estimation results more robust. In a Monte Carlo simulation study,
Monfardini and Radice (2008) show that, even under correct distributional
assumptions, the lack of availability of a valid instrument will make exogeneity tests
unreliable.
It is therefore a common practice to impose exclusion restrictions to improve
identification. These exclusion restrictions (instruments), z, should be causally linked to
insurance status and should affect HIV testing only through their effect on insurance
status. We use expansion in Medicaid coverage and changes in the distribution of firm
size within states over time as instruments for insurance coverage. They are discussed in
greater detail in the data section.
The model jointly estimates both the HIV testing equation and the health insurance
equation using maximum likelihood estimation. Consistent and asymptotically efficient
parameter estimates are obtained by maximum likelihood estimation of the bivariate
probit model. This is based on a likelihood function consisting of a product of individual
contributions of the type:
\[
L_i = P(T, HI \mid X, z) = P(T \mid HI, X, z)\, P(HI \mid X, z)
\]
The second part of the likelihood is simply a probit for HI. The first part of the individual
likelihood contributions is given as,
1 | 1,
0 |
0
1
where ϕ and Φ denote respectively the density and the distribution function of the N(0,1)
distribution. The ratio ϕ(·)/Φ(·) is the inverse Mills ratio. Of course, P(T = 0 | H = 1, X) can
be approximated by one minus this expression. When conditioning on H = 0 a similar
approximation holds, replacing ϕ(·)/Φ(·) by −ϕ(·)/(1 − Φ(·)).
The log-likelihood function becomes,
, | , , ,
1 1
1
1 1 1
1
1
Where
Φ
1Φ
To simplify the computation of the above MLE, a two-step procedure can be used: in the
first step we estimate a probit model for the binary endogenous regressor, while in the
second step we estimate the probit model of interest with an additional explanatory variable.
This model is qualitatively different from the standard bivariate probit model because health
insurance appears on the right-hand side of the HIV testing equation. It is a recursive,
simultaneous-equations model. Surprisingly, the endogenous nature of one of the
variables on the right-hand side of the HIV testing equation can be ignored in formulating
the log-likelihood (Greene, 5th ed., chapter 21, p. 715). It therefore can be fit as an ordinary
bivariate probit model with the additional right-hand side variable in the HIV testing
equation, ignoring the simultaneity. The full information maximum likelihood estimation can
therefore be executed using the statistical software Stata version 11.
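To illustrate why the recursive model can be fit as an ordinary bivariate probit, the sketch below writes out the textbook bivariate probit log-likelihood with the insurance dummy simply added to the testing index. It is not the dissertation's Stata code: the single covariate x, the single instrument z, the function names, and the Simpson's-rule approximation of the bivariate normal distribution function are all assumptions made for a self-contained example.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bvn_cdf(a, b, rho, n=400, lower=-8.0):
    """P(U <= a, V <= b) for a standard bivariate normal with correlation rho.

    Uses Phi2(a, b) = int_{-inf}^{a} phi(u) * Phi((b - rho*u) / sqrt(1 - rho^2)) du,
    evaluated by composite Simpson's rule on [lower, a]. Requires |rho| < 1, n even.
    """
    a = min(a, 8.0)                      # the integrand is negligible beyond +/- 8
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / n

    def f(u):
        return norm_pdf(u) * norm_cdf((b - rho * u) / s)

    total = f(lower) + f(a)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(lower + j * h)
    return total * h / 3.0

def recursive_biprobit_loglik(data, alpha, beta, rho):
    """Log-likelihood over rows (T, HI, Post, x, z); HI enters the T index directly."""
    a0, a1, a2 = alpha
    b0, b1, b2, b3, b4 = beta
    ll = 0.0
    for t, hi, post, x, z in data:
        idx_t = b0 + b1 * hi + b2 * post + b3 * post * hi + b4 * x
        idx_h = a0 + a1 * x + a2 * z
        q_t, q_h = 2 * t - 1, 2 * hi - 1   # sign flips for the observed outcomes
        ll += math.log(bvn_cdf(q_t * idx_t, q_h * idx_h, q_t * q_h * rho))
    return ll
```

In practice this function would be handed to a numerical optimizer; a production implementation would also use a more careful bivariate normal routine than the quadrature shown here.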
1.3.4 Empirical Hypothesis and Estimation of Average Treatment Effects
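The key parameters of interest are those related to the causal effect of health
insurance coverage on HIV testing. In particular, $\beta^{k}_{1}$, the coefficient on health
insurance in (15), specifies the causal effect of health insurance on HIV testing for risk
group k ∈ {HighRisk, LowRisk} in the pre-HAART era. Similarly, $\beta^{k}_{1} + \beta^{k}_{3}$,
where $\beta^{k}_{3}$ is the coefficient on the interaction of Post and health insurance,
specifies the causal effect of health insurance on HIV testing in the post-HAART era.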
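Based on the estimates, we can compute the average treatment effect (ATE) under the
counterfactual condition. In our model, the ATE is the average effect of health insurance on
HIV testing for a randomly selected individual in the U.S. general population. The
ATEs in the pre-HAART era and post-HAART era, denoted by $\Delta^{k}_{pre}$ and
$\Delta^{k}_{post}$, k ∈ {HighRisk, LowRisk}, respectively, can be computed by
\[
\Delta^{k}_{pre} \equiv \left.\frac{\Delta T}{\Delta HI}\right|_{Post=0}
= P(T=1 \mid HI=1, Post=0) - P(T=1 \mid HI=0, Post=0)
\]
and
\[
\Delta^{k}_{post} \equiv \left.\frac{\Delta T}{\Delta HI}\right|_{Post=1}
= P(T=1 \mid HI=1, Post=1) - P(T=1 \mid HI=0, Post=1)
\]
We take the difference between $\Delta^{k}_{post}$ and $\Delta^{k}_{pre}$, denoted as
$\Delta^{k}_{HAART}$, to quantify the impact of HAART on the effect of health insurance on
HIV testing,
\[
\Delta^{k}_{HAART} \equiv \Delta^{k}_{post} - \Delta^{k}_{pre} = \frac{\Delta T}{\Delta HI\,\Delta Post}
= \left[P(T=1 \mid HI=1, Post=1) - P(T=1 \mid HI=0, Post=1)\right]
- \left[P(T=1 \mid HI=1, Post=0) - P(T=1 \mid HI=0, Post=0)\right]
\]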
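The ATE definitions above can be sketched numerically. The example below assumes the usual probit form for the counterfactual probabilities, that is, the standard normal distribution function evaluated at the testing index with HI switched on and off, averaged over observed covariates. The single scalar covariate, the coefficient ordering (intercept, HI, Post, Post×HI, x), and the function name are assumptions for illustration, not the dissertation's actual computation.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ate_by_period(beta, xs):
    """Return (pre-HAART ATE, post-HAART ATE, post minus pre).

    beta = (b0, b1, b2, b3, b4): intercept, HI, Post, Post*HI, x (illustrative ordering).
    xs   = covariate values over which the effect is averaged.
    """
    b0, b1, b2, b3, b4 = beta

    def ate(post):
        diffs = []
        for x in xs:
            base = b0 + b2 * post + b4 * x                     # index with HI = 0
            with_hi = base + b1 + b3 * post                    # index with HI = 1
            diffs.append(norm_cdf(with_hi) - norm_cdf(base))
        return sum(diffs) / len(diffs)

    pre, post = ate(0), ate(1)
    return pre, post, post - pre
```

For example, `ate_by_period((-1.0, 0.2, 0.1, 0.1, 0.0), [0.0])` returns a positive pre-period effect and a larger post-period effect, because the interaction coefficient in that made-up vector is positive.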
Motivated by our conceptual framework we are interested in testing the following
empirical hypotheses related to these parameters.
Empirical Hypothesis 1: Health insurance increases HIV testing rates for each group
in both the pre and post-HAART era.
This hypothesis implies that the pre-HAART and post-HAART ATEs are both positive,
$\Delta^{k}_{pre} > 0$ and $\Delta^{k}_{post} > 0$, for each risk group k. That is, the causal effect of
insurance on HIV testing is positive for both risk groups in both the pre and post HAART
era.
Empirical Hypothesis 2: Health insurance increases HIV testing rates more for the high
risk than the low risk group in both the pre and post HAART era.
This hypothesis implies that $\Delta^{HighRisk}_{pre} > \Delta^{LowRisk}_{pre}$ and
$\Delta^{HighRisk}_{post} > \Delta^{LowRisk}_{post}$. That is,
the causal effect of insurance on HIV testing is greater for the high risk group in both the
pre and post HAART era.
Empirical Hypothesis 3: HAART increases the effects of insurance on HIV testing for
the high risk group. However, HAART decreases the effects of insurance on HIV testing
for low risk group.
As discussed earlier, the sign of $\Delta^{k}_{HAART}$, the change in the ATE between the pre-
and post-HAART eras, is a priori ambiguous. On the one hand, HAART
increases treatment cost which increases the value of insurance and the effect of
insurance on HIV testing. On the other hand, HAART reduces the magnitude of increase
in sexual partners if one tests negative, which reduces the chance of future infection. The
reduced probability of future infection reduces the value of insurance and the effect of
insurance on HIV testing. However, we expect high risk individuals, whose primary
motivation for testing is treatment initiation, to have positive values of $\Delta^{k}_{HAART}$, as
HAART makes both treatment and insurance more valuable.
We next describe the data used for estimating the bivariate probit model and testing the
above hypotheses.
1.4 Data
1.4.1 Behavioral Risk Factor Surveillance System (BRFSS)
BRFSS is a population-based, random-digit-dialed telephone survey administered yearly
to a representative sample of non-institutionalized U.S. adults aged 18 years and older
that inquires about various health behaviors associated with premature morbidity and
mortality. The survey has been approved by institutional review boards in each state, and
participants provide oral consent to be interviewed. Interviewers record answers using
computer software. We used data from the 1993 to 2002 surveys. Below we describe the
variables from BRFSS that we used in our analysis.
HIV Testing of Adults Under 65
We measured HIV testing as the self-report of an HIV test within 12 months before the
interview. The HIV/AIDS section of the current BRFSS core questionnaire collects the
following information: whether the respondent was ever tested for HIV and, if so, the
month and year of the last test and the facility where last tested. Respondents were coded
as having tested (tested = 1) if they reported that they tested for HIV sometime in the year
preceding their interview date; we assigned a value of 0 otherwise. Respondents who had
never been tested for HIV were coded as 0 because they had not been tested in the
preceding 1 year.
6
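The coding rule for the testing variable can be sketched as follows. This is an illustration only: the (year, month) date representation, the function names, and the choice to leave a reported test with a missing date unclassified are assumptions, since the text does not say how such cases were handled.

```python
def months_between(earlier, later):
    """Number of calendar months from `earlier` to `later`, each a (year, month) pair."""
    return (later[0] - earlier[0]) * 12 + (later[1] - earlier[1])

def code_tested(ever_tested, interview, last_test=None):
    """Return 1 if the last HIV test fell within 12 months before the interview, else 0.

    Returns None when a test is reported but its date is missing (an assumption).
    """
    if not ever_tested:
        return 0            # never tested, so not tested in the preceding year
    if last_test is None:
        return None         # assumption: undated tests are left unclassified
    gap = months_between(last_test, interview)
    return 1 if 0 <= gap <= 12 else 0
```

Because the survey records only the month and year of the last test, a 12-month window measured in calendar months is one reasonable reading of "within 12 months before the interview"; other boundary conventions are possible.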
HIV-Risk Group: Self-perceived HIV Risk
Subject’s self-perception of HIV risk was based on their response to the core BRFSS
question that asked: “What are your chances of getting infected with HIV, the virus that
causes AIDS?” Responses were categorized as High, Medium, Low, None, Not applicable
or Refused. We defined those who reported having a high or medium risk of HIV
infection as the high risk group and those who evaluated themselves as having low or no
risk as the low risk group.
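The risk-group definition amounts to a simple lookup from survey response to group. The sketch below assumes that "Not applicable" and "Refused" responses are left unclassified, which the text does not state explicitly; the group labels and function name are illustrative.

```python
# Response categories as listed in the text; group labels are illustrative names.
RISK_GROUP = {
    "High": "HighRisk",
    "Medium": "HighRisk",
    "Low": "LowRisk",
    "None": "LowRisk",
}

def risk_group(response):
    """Return the risk group for a survey response, or None if it is not classified."""
    return RISK_GROUP.get(response)
```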
Demographics and Health Insurance Status
Demographic characteristics in the BRFSS include gender, age, education (Less than a
high school degree, High school degree, Some college or AA degree), marital status,
ethnicity (Non-white or Hispanic). The BRFSS also includes information on whether the
respondent is employed and their income level (Less than 200% of federal poverty line
(FPL) or More than 200% of FPL). Finally, BRFSS asks respondents “Do you have any
kind of health care coverage, including health insurance, prepaid plans such as HMOs, or
government plans such as Medicare?” Respondents that answered yes to this question are
coded as insured.
6
The only difficulty concerns the inclusion of blood tests. In the pre-1998 period, the BRFSS does not
differentiate between HIV tests conducted during a blood donation (and thus unrelated to a decision to
screen for HIV) and other tests. During the later period, however, the BRFSS asks separately about tests
during blood donation and other tests. However, we can construct a consistent variable indicating whether
an individual had any HIV test during the last 12 months.
1.4.2 Exclusion Restrictions
We used two sets of exclusion restrictions for insurance choice: Medicaid expansion
and firm size distribution. These are similar to the instruments used by
Bhattacharya, Goldman and Sood (2003), who estimate the causal effect of insurance on
HIV related mortality.
The first set of exclusion restrictions captures the availability of public insurance through the
Medicaid program. There has been a significant expansion of Medicaid eligibility, with
significant variation across states in the pace at which these expansions have occurred.
Prior research documents a strong association between Medicaid expansions and
insurance coverage despite evidence that public insurance crowds out private insurance
coverage (Gruber and Simon 2008). However, other research finds no evidence
of a crowding out effect of Medicaid expansion (Ham, Ozbeklik and Shore-Sheppard 2011;
Shore-Sheppard, Buchmueller and Jensen 2000). Medicaid is also
an important source of coverage for HIV+ individuals, with about half of insured HIV+
individuals receiving coverage from Medicaid (Bhattacharya, Goldman and Sood 2003). Medicaid
eligibility criteria vary from state to state and change over time, but the eligibility criteria
are mandated by a Federal statute as the federal government pays about half of Medicaid
expenditures. We measure changes in Medicaid eligibility within a state by estimating the
percentage of Medicaid 1115 waiver beneficiaries over the total Medicaid beneficiaries
7
in every state and year. The reason we choose the fraction of 1115 beneficiaries is that
section 1115 demonstration authority is one of the ways that states can expand eligibility
for the Medicaid program beyond what is authorized under federal law (Jordan, Adamo
and Ehrmann 2000). Waivers allow states to provide coverage and deliver services to the
low-income population by using federal Medicaid funds in ways that do not conform to
existing federal standards and options. Medicaid section 1115 waivers have recently been
promoted as a way to expand coverage without committing new federal resources.
8
About 13 states have utilized section 1115 demonstrations to increase Medicaid enrollment by
expanding eligibility for state-sponsored health insurance (Coughlin and
Zuckerman 2008). To date, ten states have also applied for Section 1115 waivers to
expand Medicaid coverage to people living with HIV who are not legally considered
disabled, and three HIV 1115 waivers have been approved in the District of Columbia,
Maine, and Massachusetts.
9
7
These data are obtained from Centers for Medicaid and Medicare Services (CMS): Medicaid Beneficiaries
by Maintenance Assistance Status, available online at
https://www.cms.gov/MedicaidDataSourcesGenInfo/MSIS/list.asp.
8
Section 1115 Waivers at a Glance: Summary of Recent Medicaid and SCHIP Waiver Activity, Kaiser
Commission on Key Facts, April, 2003.
The second set of exclusion restrictions captures the availability of private insurance. In
particular, we use data on the distribution of firm size in every state and year to construct
two variables at the state-year level
10
(Bhattacharya et al. 2009): (1) the
percentage of workers employed in firms with 100 to 499 employees, and (2) the
percentage of workers employed in firms with 500 or more employees. These two
indexes are strong predictors of insurance coverage as large firms are much more likely
to offer insurance coverage. For example, data from the 2008 Current Population Survey
show that 32% of workers in firms with fewer than 25 employees are uninsured, whereas only
13% of workers in private firms with more than 500 employees are uninsured. For our
analysis we interact these exclusion restrictions with poverty status, as prior research
suggests that the effects of availability of insurance on take-up of insurance might vary
with income (Ham, Ozbeklik and Shore-Sheppard 2011).
9
“Disease Management and Medicaid Waiver Services for HIV/AIDS Patients”,
http://aspe.hhs.gov/health/reports/09/hivmgt/report.pdf.
10
These data are obtained from the Statistics of U.S. Businesses (SUSB) available online at
http://www.sba.gov/advo/research/data.html.
The proposed instruments are valid under two conditions. First, they should be strong
predictors of insurance coverage. Second, they should affect HIV testing only through
their effect on insurance choice. In the next section, we show that the instruments are
strong predictors of insurance coverage. The second assumption cannot be directly tested.
However, it seems unlikely that changes in firm size distribution within a state or the timing
of adoption of 1115 waivers (our models have state fixed effects) would be related to
HIV testing, except through insurance coverage. One remaining concern is that variation
in our instruments might be correlated with state economic conditions or other
characteristics correlated with HIV testing. To address this concern, Table 1 compares
five important state characteristics (unemployment rate, disposable income, poverty rate,
percent White and age distribution of population) between the 19 states with an increasing
proportion of 1115 waiver beneficiaries over total Medicaid beneficiaries from 1993 to
2003 and the 31 states without an 1115 waiver or with a decreasing proportion of 1115 waiver
beneficiaries over the total Medicaid beneficiaries in the same period. We find that not
only are the levels of these characteristics similar across these two types of states but
trends in these characteristics over the sample period are also similar. This allays
concerns about systematic differences in characteristics of states with high versus low
values of our instruments for public insurance coverage. Table 2 compares the same state
characteristics between the 41 states with a rising proportion of employment in medium and
large firms from 1993 to 2003 and the 9 states with a falling proportion of employment in
medium and large firms over the same period. Again, the data show little difference in
levels or trends in state characteristics across states with high versus low values of our
instruments for private insurance coverage. In another similar test of the validity of our
instrumental variables, we estimate alternative models which include these state-year
level variables as covariates. We find that our results are robust to the inclusion of these
state-year level variables as covariates. The results section provides more details of this
specification test.
Table 1 State Characteristics by Changes of the Proportion of 1115 Waiver Beneficiaries (note 11)
Public Insurance Instrument
                                    States with an increasing      States with no change or
                                    proportion of 1115 waiver      decreasing proportion of 1115
                                    beneficiaries                  waiver beneficiaries
                                    (19 states; note 12)           (31 states)
State characteristics in 1993
  Unemployment rate                 6.49%                          6.17%
  Disposable income                 $18,714                        $18,109
  Poverty rate                      13.20%                         15.03%
  Age less than 20                  29.16%                         28.93%
  Age 20-64                         58.63%                         58.09%
  Percent White (2000)              80.12%                         83.52%
State characteristics in 2003
  Unemployment rate                 5.50%                          5.66%
  Disposable income                 $28,748                        $28,064
  Poverty rate                      11.39%                         12.13%
  Age less than 20                  27.84%                         27.70%
  Age 20-64                         59.80%                         59.65%
  Percent White                     79.70%                         83.09%
Change in state characteristics in 1993 to 2003
  Unemployment rate                 -0.99%                         -0.52%
  Disposable income                 54.04%                         55.12%
  Poverty rate                      -1.82%                         -2.90%
  Age less than 20                  -1.32%                         -1.24%
  Age 20-64                         1.17%                          1.56%
  Percent White                     -0.42%                         -0.43%
11
Data on poverty rate, percent White and population age structure are from the Census Bureau. Data on
unemployment rate are from the Bureau of Labor Statistics. Data on disposable income are from the Bureau of
Economic Analysis.
12
The 19 states that passed section 1115 waivers or had an increasing proportion of 1115 waiver beneficiaries
as a share of total Medicaid population include Alabama, Arizona, California, Delaware, Hawaii, Illinois,
Maine, Maryland, New Jersey, New Mexico, New York, Rhode Island, South Carolina, Utah, Vermont,
Virginia, Washington, Wisconsin, Wyoming.
Table 2 State Characteristics by Changes in Firm Size
Private Insurance Instrument
                                    States with increasing         States with decreasing
                                    proportion of employment       proportion of employment in
                                    in medium and large firms      medium and large firms
                                    (41 states)                    (9 states; note 13)
State characteristics in 1993
  Unemployment rate                 6.19%                          6.74%
  Disposable income                 $18,260                        $18,682
  Poverty rate                      14.25%                         14.82%
  Age less than 20                  29.03%                         28.93%
  Age 20-64                         58.20%                         58.72%
  Percent White (2000)              82.52%                         81.00%
State characteristics in 2003
  Unemployment rate                 5.57%                          5.71%
  Disposable income                 $28,164                        $29,040
  Poverty rate                      11.80%                         12.12%
  Age less than 20                  27.73%                         27.84%
  Age 20-64                         59.67%                         59.86%
  Percent White                     82.11%                         80.51%
Change in state characteristics in 1993 to 2003
  Unemployment rate                 -0.62%                         -1.03%
  Disposable income                 54.57%                         55.39%
  Poverty rate                      -2.45%                         -2.70%
  Age less than 20                  -1.30%                         -1.09%
  Age 20-64                         1.47%                          1.14%
  Percent White                     -0.41%                         -0.49%
13
The 9 states that have a decreasing proportion of employment in medium and large size firms include
Alaska, Colorado, Florida, Georgia, Massachusetts, Minnesota, North Carolina, New Jersey and Utah.
1.5 Results
Table 3 shows the descriptive statistics for the analytic sample. The average age of
respondents is 40 years and slightly more than half the respondents are female. About
60% have some college education, more than a third have incomes below 200% of the
federal poverty line and 85% have health insurance. Overall, about 16% of the
respondents report testing for HIV in the previous year and, not surprisingly, the testing
rate among high risk individuals is 10 percentage points higher than the rate among low
risk individuals. High risk individuals also are less likely to be white, female, and single.
High risk individuals have similar insurance coverage and education.
Table 3 Descriptive Statistics by Risk Group (note 14)
                                      Low Risk        High Risk       All
                                      N = 662,283     N = 49,267
                                      (93%)           (7%)
Covariates
  Age                                 39.86           36.55           39.63
  Non-White or Hispanic               18.57%          28.25%          19.24%
  Female                              57.04%          53.10%          56.7%
  Married                             41.70%          58.12%          42.84%
  Income below 200% FPL               37.62%          43.98%          38.06%
  Education level
    Less than HS degree               2.20%           3.78%           2.31%
    High school degree                38.02%          38.77%          38.07%
    Some college or AA degree         29.42%          31.84%          29.59%
    College degree                    30.36%          25.61%          30.03%
  Have health plan                    85.72%          81.45%          85.42%
Exclusion Restrictions
  1115 Waiver                         1.57%           1.49%           1.56%
  Employment at medium size firms     14.57%          14.56%          14.57%
  Employment at large size firms      46.32%          46.60%          46.34%
Dependent Variable
  Tested for HIV in past 12 months    15.64%          25.92%          16.35%
14
Data on covariates, risk status and HIV testing are from the Behavioral Risk Factor Surveillance System
1993 to 2003. Data on 1115 waivers are from Centers for Medicaid and Medicare Services (CMS): Medicaid
Beneficiaries by Maintenance Assistance Status, available online at
https://www.cms.gov/MedicaidDataSourcesGenInfo/MSIS/list.asp. Data on firm size are obtained from the
Statistics of U.S. Businesses (SUSB) available online at http://www.sba.gov/advo/research/data.html.
Table 4 reports results from the recursive bivariate probit model of HIV testing and
health insurance coverage. We estimate separate regressions for the low and high risk
groups. Key ATEs and their confidence intervals are reported in Table 5. We use the
bootstrap to estimate confidence intervals of these average treatment effects
15
(Krinsky and Robb 1986; Krinsky and Robb 1990).
15
We first drew a sample of coefficients from the multivariate normal distribution with mean equal to the
estimated coefficients and covariance equal to the estimated covariance matrix. We estimated the ATEs using
these coefficients over 1,000 iterations. We used these draws to calculate mean ATEs and their confidence
intervals.
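The simulation procedure described in this footnote can be sketched as follows. The example is a toy version under stated assumptions: a two-coefficient probit (intercept and insurance dummy) stands in for the full model, the mean vector and covariance matrix are made up, the Cholesky routine and function names are written for this illustration, and the interval is formed as mean ± 1.96 standard deviations of the simulated ATEs.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cholesky(cov):
    """Lower-triangular L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_coefficients(mean, chol, rng):
    """One draw from N(mean, chol * chol^T)."""
    e = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * e[k] for k in range(i + 1)) for i, m in enumerate(mean)]

def simulated_ate(mean, cov, draws=1000, seed=0):
    """Mean, standard deviation and normal-based 95% interval of the simulated ATE."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    ates = []
    for _ in range(draws):
        b0, b1 = draw_coefficients(mean, chol, rng)      # toy model: intercept, HI
        ates.append(norm_cdf(b0 + b1) - norm_cdf(b0))    # effect of switching HI on
    m = sum(ates) / draws
    sd = math.sqrt(sum((a - m) ** 2 for a in ates) / (draws - 1))
    return m, sd, (m - 1.96 * sd, m + 1.96 * sd)
```

Calling `simulated_ate([-1.0, 0.2], [[0.01, 0.0], [0.0, 0.0025]])` gives a mean close to Φ(−0.8) − Φ(−1.0), with the spread reflecting the assumed coefficient uncertainty.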
Table 4 Recursive Bivariate Probit Regression
*p<.10, **p<.05, ***p<.01
                                      Self evaluated high risk              Self evaluated low risk
                                      Tested for HIV in  Health Plan        Tested for HIV in  Health Plan
                                      Past 12 Months                        Past 12 Months
Age                                   -0.00441           0.0028512          -0.01759***        -0.0015
                                      (0.0040332)        (0.0042165)        (0.00203)          (0.001526)
Age^2                                 -0.0001176**       0.0001047*         -1.2E-05           0.000105***
                                      (0.0000535)        (0.0000548)        (2.38E-05)         (1.87E-05)
Non-White or Hispanic                 0.1206372***       -0.0745605***      0.27511***         -0.06353***
                                      (0.0278905)        (0.0263186)        (0.026768)         (0.017478)
Female                                -0.0902754***      0.1771266***       -0.11405***        0.134873***
                                      (0.0166471)        (0.0180991)        (0.009364)         (0.012042)
Married                               0.1554186***       -0.3823977***      0.156786***        -0.38122***
                                      (0.0332024)        (0.0254022)        (0.011776)         (0.012401)
Income below 200% FPL                 0.0719764**        -1.864541***       0.099087***        -1.32331***
                                      (0.0328822)        (0.3771449)        (0.017174)         (0.245194)
High School Degree                    0.07992*           0.400258***        0.012429           0.282589***
                                      (0.0433959)        (0.0515101)        (0.028286)         (0.03575)
Some college or tech school           0.1949218***       0.6668733***       0.059199**         0.486135***
                                      (0.051416)         (0.0480076)        (0.028023)         (0.036946)
College graduate or higher            0.2223715***       0.8535916***       0.04968*           0.702641***
                                      (0.0570605)        (0.0515084)        (0.02881)          (0.037507)
Health plan coverage                  0.3903225*                            0.179839**
                                      (0.2199806)                           (0.090972)
Post97*Health plan                    0.0672901**                           -0.03383***
                                      (0.0354586)                           (0.010926)
Exclusion Restrictions
1115 Waiver                                              0.2965837**                           0.083016
                                                         (0.1616104)                           (0.111885)
Employment at medium size firm                           -3.576105                             3.377685**
                                                         (2.538708)                            (1.777654)
Employment at large size firm                            0.0907749                             1.208295*
                                                         (1.425739)                            (0.720941)
1115 Waiver*Poor                                         0.1897171                             0.446019**
                                                         (0.3399313)                           (0.197146)
Employment at medium size firm*Poor                      5.938942***                           3.110923***
                                                         (1.873143)                            (1.189398)
Employment at large size firm*Poor                       0.823227**                            0.323573
                                                         (0.3393542)                           (0.236484)
Constant                              -1.112364***       1.031995           -0.72135***        -0.09589
                                      (0.1903717)        (0.9604056)        (0.111287)         (0.552774)
                                      High risk model                       Low risk model
State controls                        No                                    No
Year dummies                          Yes                                   Yes
Log likelihood                        -47617.85                             -505986.3
rho                                   -0.1618973**                          -.0318648
Chi-2 test for joint significance     18.01***                              42.84***
of exclusion restrictions             (p value: 0.0062)                     (p value: 0.0000)
Number of obs                         49,267                                662,283
Table 5 ATE from Recursive Bivariate Probit Model
                                  ATE             Bootstrap Std. Err.   Normal-based [95% Conf. Interval] (note 16)
High risk (N=49,267)
  Health insurance-Pre-HAART      .0270241***     .0103723              0.00669, 0.04735
  Health insurance-Post-HAART     .0484517***     .0077029              0.03335, 0.06355
  Change in ATE Post HAART        .0214276**      .0120659              -0.00222, 0.04508
Low risk (N=662,283)
  Health insurance-Pre-HAART      .0262222***     .002381               0.02156, 0.03089
  Health insurance-Post-HAART     .0184173***     .0020959              0.01431, 0.02253
  Change in ATE Post HAART        -.0078049***    .0031429              -0.01397, -0.00164
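The normal-based intervals in Table 5 follow from each ATE and its standard error as estimate ± z × standard error. The sketch below reproduces that arithmetic with Python's standard library; the function names are illustrative, and the two-sided p-value is the usual normal approximation rather than anything reported in the dissertation.

```python
from statistics import NormalDist

def normal_ci(estimate, std_err, level=0.95):
    """Normal-based confidence interval: estimate +/- z * std_err."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_err, estimate + z * std_err

def two_sided_p(estimate, std_err):
    """Two-sided p-value for H0: effect = 0 under a normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(estimate / std_err)))
```

For the high risk pre-HAART row, `normal_ci(0.0270241, 0.0103723)` rounds to (0.00669, 0.04735), matching the table. For the high risk "Change in ATE" row, the same formula gives an interval that spans zero, as the table's reported bounds also show.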
Overall the results are consistent with the three empirical hypothesis discussed earlier.
First, the results suggest that health insurance coverage significantly increases the
probability of testing for HIV in both the pre and post HAART period (Hypothesis 1).
For example, among the low risk population insurance coverage increases the probability
of HIV testing by 2.6 percentage points in the pre-HAART period. Similarly, among the
high risk population insurance coverage increases the probability of HIV testing by 2.7
percentage points in the pre-HAART period. Second, the results suggest that the effects
of insurance coverage on HIV testing are larger for the high risk group in both the
pre-HAART and post-HAART periods (Hypothesis 2). For example, in the post-HAART period
insurance increases the probability of HIV testing for the high risk group by 4.8
percentage points. In contrast, in the post-HAART period insurance increases the
probability of HIV testing for the low risk group by 1.8 percentage points. Third, the
results suggest that HAART increases the effects of insurance on HIV testing for the high
risk group (Hypothesis 3). Specifically, the ATE of insurance on HIV testing for the high
risk group increases from 2.7 percentage points in the pre-HAART period to 4.8
percentage points in the post-HAART period. We do not observe a similar increase in the
average treatment effect of insurance for the low risk group. In fact, the evidence
suggests a modest decrease in the effect of insurance on HIV testing for the low risk
group.
[16] We calculate the confidence intervals using the bootstrap method, in the following steps:
1. Draw a random sample with replacement
2. Estimate the model with this new sample
3. Estimate the ATE for this draw and save it
4. Repeat steps (1) to (3) 1,000 times
5. Report the standard deviation of the ATEs across the 1,000 bootstrap samples
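The bootstrap steps described in footnote 16 can be sketched as follows. This is only an illustration in Python with NumPy, not the code used for the reported estimates: the names `bootstrap_se` and `diff_in_means` are hypothetical, the data are simulated, and a simple difference in testing rates stands in for the ATE from the recursive bivariate probit, which would have to be re-estimated on every draw.

```python
import numpy as np

def bootstrap_se(data, estimator, n_boot=1000, seed=0):
    """Bootstrap standard error of estimator(data); rows of data are observations."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # step 1: resample rows with replacement
        draws[b] = estimator(data[idx])      # steps 2-3: re-estimate and save
    return draws.std(ddof=1), draws          # step 5: std. dev. across draws

def diff_in_means(data):
    """Placeholder estimator (assumption): testing rate of insured minus uninsured."""
    tested, insured = data[:, 0], data[:, 1]
    return tested[insured == 1].mean() - tested[insured == 0].mean()

# Simulated data (hypothetical): insurance raises the testing rate by 3 points.
rng = np.random.default_rng(42)
n = 5000
insured = rng.integers(0, 2, size=n)
tested = (rng.random(n) < 0.10 + 0.03 * insured).astype(float)
sample = np.column_stack([tested, insured])

point = diff_in_means(sample)
se, draws = bootstrap_se(sample, diff_in_means, n_boot=1000, seed=1)
ci = (point - 1.96 * se, point + 1.96 * se)  # normal-based 95% interval
print(point, se, ci)
```

The normal-based interval at the end mirrors the "Normal-based [95% Conf. Interval]" column of the ATE tables: point estimate plus or minus 1.96 bootstrap standard errors.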
Restriction Exclusions
The results in Table 4 suggest that our restriction exclusions are statistically significantly
related to insurance coverage. Firm size distribution is a strong predictor of insurance
coverage for the low risk population and Medicaid eligibility expansion is a strong
predictor of insurance coverage for the high risk population. We also find that availability of
both public and private employer provided insurance has a stronger effect on insurance
coverage for poor than for rich households. In both the high risk and low risk models
the restriction exclusions are jointly significant, with Chi-2 statistics of 18 (p-value =
0.0062) and 42.84 (p-value = 0.0000), respectively.[17]
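As a quick consistency check on these p-values, the sketch below evaluates the chi-square survival function in closed form. It assumes 6 degrees of freedom, one per restriction exclusion listed in the regression tables; the text does not state the degrees of freedom, so this is an assumption, and the helper name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # For even df the survival function is a finite Poisson-type sum.
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

p_high = chi2_sf_even_df(18.0, 6)    # high risk model
p_low = chi2_sf_even_df(42.84, 6)    # low risk model
print(round(p_high, 4), round(p_low, 4))
```

Under the 6-degrees-of-freedom assumption the two values round to 0.0062 and 0.0000, which agrees with the p-values reported above.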
Since our instruments vary at the state year level, one concern is that they might be
correlated with other state year level determinants of HIV testing. To address this
concern, Tables 6 and 7 report results from models that include additional state-year
level covariates. The additional covariates are poverty rate, unemployment rate, per
capita income and population age structure. The results are robust to the inclusion of
these state controls. As before, we find that health insurance
increases the likelihood of testing for HIV and that the effects of insurance coverage on
HIV testing are larger for the high risk group. Consistent with Hypothesis 3, we
also find that the advent of HAART increases the effects of insurance coverage on HIV
testing for the high risk population but not for the low risk population.
Finally, in a second indirect test of instrument validity, we checked the robustness of our
results to the inclusion or exclusion of certain sets of restriction exclusions. This test is in
the spirit of the Hausman over-identification test and is based on the principle that if all
our instruments are valid then the estimates obtained by using only a subset of
instruments should differ only as a result of sampling error (Jerry A. Hausman 1978).
Thus, for this test we estimated two different sets of models. The first set of models only
used the instruments related to availability of private insurance and the second set of
models only used the instruments related to availability of public insurance. The results
from both these sets of models were virtually identical to the model that used both sets of
instruments. Thus, these results also suggest that our instruments are valid.
[17] We conduct Monte Carlo simulations to check whether the restriction exclusions/instruments are strong.
We find that the relative bias of our instrumental variables coefficients is 6.1%, which is less than the
Douglas Staiger and Stock (1997) rule of thumb threshold of 10% for strong instruments. We also find that
the rejection rate for alpha = 0.05 is 11.6%, which is slightly higher than the Douglas Staiger and Stock
(1997) rule of thumb threshold of 10% for strong instruments.
Table 6 Recursive Bivariate Probit Regression with State Controls
                                      Self evaluated high risk                            Self evaluated low risk
                                      Tested for HIV in         Health Plan               Tested for HIV in         Health Plan
                                      Past 12 Months                                      Past 12 Months
Age                                   -0.00439 (0.004046)       0.002877 (0.004224)       -0.01758*** (0.002031)    -0.00153 (0.001525)
Age^2                                 -0.00012** (5.37E-05)     0.000104** (0.000055)     -1.2E-05 (2.38E-05)       0.000106*** (1.88E-05)
Non-White or Hispanic                 0.120571*** (0.027911)    -0.07481*** (0.026261)    0.275224*** (0.026785)    -0.06364*** (0.017447)
Female                                -0.09029*** (0.016695)    0.177123*** (0.018078)    -0.11414*** (0.009369)    0.134758*** (0.012016)
Married                               0.155219*** (0.033247)    -0.38207*** (0.025428)    0.157134*** (0.01166)     -0.38135*** (0.012414)
Income below 200% FPL                 0.071533** (0.033455)     -1.86339*** (0.376387)    0.099705*** (0.016848)    -1.33768*** (0.242714)
High School Degree                    0.079695* (0.043674)      0.4007*** (0.051954)      0.011215 (0.028147)       0.283337*** (0.035796)
Some college or tech school           0.194768*** (0.051756)    0.667502*** (0.048572)    0.057859** (0.02791)      0.486705*** (0.037006)
College graduate or higher            0.221955*** (0.057823)    0.854054*** (0.051983)    0.04819* (0.02865)        0.703368*** (0.037522)
Health plan coverage                  0.389482* (0.220517)                                0.183024** (0.08887)
Post97*Health plan                    0.067967** (0.035237)                               -0.03275*** (0.010427)
Restriction Exclusions
1115 Waiver                                                     0.240392 (0.18773)                                  0.029446 (0.10847)
Employment at medium size firm                                  -3.50809 (2.681891)                                 3.879333*** (1.530893)
Employment at large size firm                                   0.259895 (1.587199)                                 1.651469*** (0.659517)
1115 Waiver*Poor                                                0.208704 (0.339713)                                 0.470232** (0.202217)
Employment at medium size firm*Poor                             5.927808*** (1.87447)                               3.135998*** (1.179203)
Employment at large size firm*Poor                              0.823269*** (0.338207)                              0.345312 (0.234563)
Constant                              -2.87796 (3.154479)       -0.67556 (4.324606)       -1.80055 (2.159406)       -0.91881 (1.504703)
State controls                        Yes                                                 Yes
Year dummies                          Yes                                                 Yes
Log likelihood                        -47612.285                                          -505904.01
rho                                   -.1632921**                                         -.0338877
Chi-2 Test for joint significance of  16.47**                                             62.17***
restriction exclusions                (p value: 0.0114)                                   (p value: 0.0000)
Number of obs                         49267                                               662283
*p<.10, **p<.05, ***p<.01
Table 7 ATE From Recursive Bivariate Probit Model with State Controls
ATE                            Observed Coef.   Bootstrap Std. Err.   Normal-based [95% Conf. Interval]
High risk (N=49,267)
Health insurance-Pre-HAART     .0267755***      .010194               0.00680, 0.04676
Health insurance-Post-HAART    .0484143***      .0075589              0.03360, 0.06323
Change in ATE Post HAART       .0216388**       .0117057              -0.0013, 0.04458
Low risk (N=662,283)
Health insurance-Pre-HAART     .0260579***      .0023705              0.02141, 0.03070
Health insurance-Post-HAART    .0185034***      .0020897              0.01441, 0.02260
Change in ATE Post HAART       -.0075545***     .0031261              -0.01368, -0.00143
1.6 Conclusions
In this paper we analyzed the effects of insurance coverage on HIV testing behavior and
how this link between testing and insurance coverage changed with HIV treatment
innovations. We developed a theoretical model that makes predictions about the impact
of insurance coverage on HIV testing. We test these predictions from the theoretical
model by using nationally representative data for the years 1993 to 2002 from the
Behavioral Risk Factor Surveillance Survey (BRFSS). Consistent with the theoretical
model, the results suggest that (a) insurance coverage increases HIV testing among both
the high risk and low risk populations, (b) insurance has larger effects on HIV testing for
the high risk population, and (c) the advent of effective HIV treatment increased the
effects of insurance coverage on HIV testing for the high risk group.
Overall these results suggest that providing insurance or subsidized treatment can be an
effective strategy for increasing HIV testing rates. These results have important
implications for developing economies which are considering subsidizing treatment and
for developed economies like the U.S. where budget pressures might force several states
to reduce the generosity of public insurance coverage for HIV.
The results suggest that providing subsidized treatment not only improves the health of
the infected but also has important effects on the dynamics of the HIV epidemic. Prior
research suggests that knowing one’s HIV status can reduce risky sexual activity. For
example, a meta-analytic review of published research from 1985 to 1997 found that after
testing and counseling, HIV positive participants reduced unprotected intercourse and
increased condom use. A more recent review (Gary Marks, et al. 2005) found that the
prevalence of high-risk sexual behavior is reduced substantially after people become
aware they are HIV+. Following these findings, Marks et al. (2006) estimate the
proportion of sexual transmission of HIV attributable to HIV-positive aware and unaware
persons in the USA. They found that the transmission rate from the unaware group was
3.5 times that of the aware group. Similarly, Thornton (2008) found in an experiment in
rural Malawi that sexually active HIV-positive individuals who learned their results are
three times more likely to purchase condoms two months later than sexually active HIV-
positive individuals who did not learn their results. These prior studies indicate increased
HIV testing can potentially have a large impact on HIV transmission. Juxtaposing these
results from the prior literature with the results from this study suggests that insurance
coverage might reduce risky sexual behavior and reduce the spread of HIV.
The results of this study also improve our understanding of the motivation for HIV
testing and how changes in treatment technology can influence HIV testing behavior. The
results are consistent with the notion that high risk individuals are motivated to test
primarily due to the desire to seek early treatment. Therefore, improvements in treatment
technology increase the incentives for HIV+ persons to test. On the other hand, the results
suggest that low risk persons are motivated to test primarily to signal HIV negative status
to potential partners. The value of this signal and the incentive to test diminishes with
advancements in treatment technology as potential partners are less worried about
contracting HIV.
Overall, the lessons learnt from this research might have wider applicability. They
suggest that public policy and health care innovation can have important and complex
effects on health related behaviors. Policymakers should be cognizant of such effects as
they design policies to improve health.
Chapter 2: Factor Analysis on U.S. Housing Price Indexes
2.1 Introduction
A wave of foreclosures has flooded the U.S. market since 2007. These troubles in housing
precipitated the credit crisis in the summer of that year, and now, as predicted, have
caused a significant decline in the economy. This decline is associated closely with
historical housing price appreciation, as in Nevada, where home prices soared,
creating a housing boom, and then dropped rapidly (Figure 2 and Figure 3). In contrast,
the states with low foreclosure rates tend to be states that missed the boom in housing
prices, such as South Dakota. The importance of housing price dynamics to what is called
the business cycle has recently attracted great attention in academic circles. House price
volatility, although generally lower than the volatility of financial asset prices, is well
known to have important effects on economic activity and financial stability.
Figure 2 HPI of top 5 states with lowest foreclosure rate in 2007
[Line chart of HPI, 1975 to 2006, vertical scale 0 to 700. Series: Nevada, Florida, Michigan, California, Colorado.]
Figure 3 HPI of Bottom 5 states with lowest foreclosure rate in 2007
[Line chart of HPI, 1975 to 2006, vertical scale 0 to 600. Series: South Dakota, Vermont, Maine, West Virginia, North Dakota.]
Observations on the dynamics of housing prices and macroeconomic variables have
prompted interest in quantifying their relationships. Changes in house prices may be
caused by a variety of factors that affect both the supply of and the demand for housing.
What are the factors that drive house price dynamics? An important question is whether
housing market fluctuations are an independent source of shocks or whether they just
reflect macroeconomic fluctuations. A large body of literature attempts to build
models that link house prices to variables thought to be fundamental determinants of house
prices (K. E. Case et al. (2003); Cho (1996); Topel and Rosen (1988)). These
fundamentals include real disposable income (per capita growth), per capita output,
housing affordability, interest rates (short term, long term), real credit (growth),
residential investment, stock price, oil price, population growth, bank crisis, and
demographic variables. However, in these regression analyses, the independent variables
have been chosen without a formal analysis of the strengths of the proxy variables.
The most popular method used in research to overcome this problem is the “diffusion
index method”. This method selects a few predictors to pool the information in all the
candidate predictors, averaging away idiosyncratic variations in the individual series. A
small set of crucial latent factors can be used to measure common co-movements within a
large set of macroeconomic variables. This smaller number of indexes can be constructed
by principal component analysis.
Intuitively, if we include all the macroeconomic variables to construct the estimated
common factors, it is possible that we will have a lot of redundant information that is not
actually related to housing price dynamics. Therefore, if we observe the factors instead of
estimating them, we can reduce the forecast error variance. In finite samples, this may
yield important reductions in prediction error variance. We discuss the method for
statistically selecting the actual macroeconomic variables that are exact factors in
predicting housing prices. Using this approach, we argue that if observable economic
variables are indeed good proxies of the unobserved factors, then these proxies can be
used in place of the factors in the diffusion index model, which is in turn used for
prediction. Once the set of factor proxies is fixed, we effectively eliminate the
incremental increase in forecast error variance associated with the use of estimated
factors. The motivation behind this approach is that in the context of forecasting,
estimation errors associated with factor construction can be avoided to some extent by
using directly observable economic variables as factor proxies.
We propose to test a large set of economic variables using the common factors that drive
the co-movement of U.S. housing price indexes. Kallberg, Liu and Pasquariello (2009)
found that the explanatory power of the common factors driving residential real estate
prices increased between 1992 and 2008. This implied an increasing integration of
regional residential real estate markets in the U.S. and suggested that future shocks to any
of those regional markets are more likely to have nationwide, prolonged effects than they
did in the past. However, little research has been done on the measurement and
explanation of the level, dynamics and implications of these common factors, which
represent the real estate price co-movements.
While a growing body of literature uses factor models to capture common trends, the
literature to date has often assumed the number of factors rather than determining it by
the data. To fill these gaps, this study uses factor analysis to evaluate the number of
underlying latent variables in U.S. housing price indexes and to investigate whether the
observed macroeconomic factors and financial indices are exact factors that drive
housing price dynamics.
This paper contributes to the literature mainly by using recently developed tools to
determine the number of factors that underlie the panels of U.S. housing price indexes.
We evaluate, using latent variables, whether some observable economic variables are in
fact the underlying causes for changes in housing prices. We estimate factor models for
two U.S. HPIs: the OFHEO HPI for 1991:1–2010:4 and the MRAC HPI for 1976:1-
2007:2. The methodology follows a two-step procedure. We first use several information
criteria to determine the number of common factors that underlie fluctuations in MSA-
level HPI. Using principal components, we estimate the common factors and the factor
loadings for each MSA. The factor loadings provide estimates of the links between each
MSA-level HPI and each common factor. Using the factors estimated in the first step, we
evaluate the macroeconomic variables and financial indices and identify which observed
variable is closest to the common factors. The two statistics defined in Section 3, an
R-squared measure and a sample correlation, measure the overall closeness of the links
between each macroeconomic variable and the set of common factors.
We have several empirical findings. Our study provides evidence that only a small
number of factors capture the main co-movements of housing price time series data. We
find that the panel of MSA HPIs can be summarized with a few common factors which
represent the national business cycle. In fact, the eigenvalue ratio estimator proposed by
Ahn and Horenstein (2009) finds only one common factor present in the panel of
housing price indexes at the MSA level in the U.S. This single factor, also called the
“summary measure”, follows the same trend as the national index. The degree of
coincidence between the “summary measure” and the national index reflects the accuracy
of weights assigned to census divisions in constructing the national index. Also, this
result indicates that national conditions will clearly affect the price movement for a
number of small sub regions. Associated with each factor is a set of loadings that indicate
the extent to which each MSA HPI is related to the corresponding factor. According to
the factor loadings, there is a great deal of heterogeneity in the strength, and sometimes
the direction, of the links between the MSA HPI and the national HPI. We found a
geographical pattern of factor loadings for housing price appreciation at MSA level. This
geographical pattern is useful in defining submarkets. Our comparison study shows that
the MRAC HPIs are more volatile than the OFHEO HPIs. As a result, a larger number of
factors are extracted. Finally, we found that GDP, personal consumption, fixed private
investment, employment growth, unemployment rates, Treasury bill market rates, and
certificate of deposit market rates have stronger correlations with the housing market.
Financial markets have contemporaneous effects on the housing market while investment and
consumption show lagged effects. These tests of fitness of macroeconomic variables
reconfirm the presence of four latent factors in the housing price markets.
The remainder of the paper proceeds as follows: Section 2 introduces the first-step
criteria in determining the number of common factors. Section 3 proposes the second-
step criteria to evaluate the observed macroeconomic variables. Section 4 describes two
datasets used in the paper. Section 5 presents and interprets the empirical results. Section
6 concludes.
2.2 Determining the Number of Factors
Table 8 shows non-negligible correlation among Housing Price Appreciations (HPAs)
in 12 selected MSAs sorted by population size. We observed high correlations among
these MSAs, especially between the populous MSAs. This high correlation is generally
associated with a common loading on major factors, or common shocks. Factor models
have been widely used for studying asset returns with strong cross-sectional correlations.
The major difficulty in implementing multi-factor models is the identification of common
and relevant factors; identifying the number of common factors is one of the major tasks
of factor analysis. Our first goal is therefore to estimate the number of underlying factors
(r) present in U.S. HPA at the MSA level.
Table 8 HPA Correlation (12 MSAs) for the period 1991Q1 to 2010Q4
MSAs are ordered by population size, from big (1) to small (12).
                                    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)     (12)
(1) New York                        1
(2) Los Angeles                     0.85381  1
(3) Chicago-Naperville-Joliet       0.83599  0.85985  1
(4) Boston-Quincy                   0.85299  0.68262  0.66959  1
(5) Seattle-Bellevue-Everett        0.34353  0.55098  0.54841  0.15522  1
(6) San Diego-Carlsbad-San Marcos   0.82756  0.91334  0.84540  0.81119  0.43001  1
(7) Colorado Springs                0.38734  0.24198  0.50130  0.38341  0.11066  0.32184  1
(8) Oakland-Fremont-Hayward         0.81804  0.85530  0.87019  0.77382  0.54973  0.90249  0.46098  1
(9) Columbia                        0.14974  0.14526  0.25873  0.13736  0.11946  0.17069  0.27050  0.20308  1
(10) Santa Rosa-Petaluma            0.74106  0.79805  0.82786  0.74596  0.53316  0.89656  0.39863  0.95579  0.21274  1
(11) Edison (NJ)                    0.95233  0.84047  0.79980  0.84556  0.24704  0.82120  0.35481  0.77962  0.16397  0.71733  1
(12) Bellingham                     0.27111  0.48034  0.53509  0.00263  0.62438  0.38451  0.10874  0.37683  0.11230  0.40404  0.26618  1
In our model, we assume that a few latent factors, $F_t$, drive the co-movements of a
high-dimensional vector of time-series variables, HPI at MSA level, which is also affected by
a vector of mean-zero idiosyncratic disturbances, $e_t$.
Let $X_{it}$ be the housing price index of the $i$th MSA at time $t$, for $i = 1, 2, \ldots, N$ (N is
the total number of MSAs included in the analysis) and $t = 1, 2, \ldots, T$ (T is the total number
of quarters). Suppose that $X_{it}$ admits a static factor model representation with r
common factors $F_t$, i.e., assume that the variation of the HPI of all MSAs can be
explained by a small set of r unobservable factors contained in the matrix $F$,
and related to $X$ through a matrix of factor loadings $\Lambda$. Equation (17) is a generic
representation:
$$X_{it} = \lambda_{i1}F_{1t} + \lambda_{i2}F_{2t} + \cdots + \lambda_{ir}F_{rt} + e_{it} \qquad (17)$$
where $F_{1t}, F_{2t}, \ldots, F_{rt}$ are the common factors that determine $X_{it}$;
$\lambda_{i1}, \lambda_{i2}, \ldots, \lambda_{ir}$ are the factor loadings associated with $F_t$; and $e_{it}$ is the
idiosyncratic component of $X_{it}$. The products $\lambda_{i1}F_{1t}, \lambda_{i2}F_{2t}, \ldots, \lambda_{ir}F_{rt}$ are the
common components of $X_{it}$.
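To obtain the estimates of $F^k$ and $\Lambda^k$, we solve the following minimization problem,
$$\min_{\Lambda^k, F^k} \; \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(X_{it} - \lambda_i^{k\prime}F_t^k\right)^2 \qquad (18)$$
$$\text{s.t.} \quad F^{k\prime}F^k / T = I_k \qquad (19)$$
The superscript k in $F^k$ and $\Lambda^k$ denotes the number of factors included in the estimation. The
solution of the minimization problem (18) can be derived by the principal component
method: the estimated factor matrix $\tilde F^k$ is $\sqrt{T}$ times the eigenvectors corresponding to the
k largest eigenvalues of the $T \times T$ matrix $XX'$. Given $\tilde F^k$, the factor loadings are
obtained through ordinary least squares: $\tilde\Lambda^{k\prime} = \tilde F^{k\prime}X / T$ is the corresponding matrix
of factor loadings.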
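The principal component solution described above can be sketched in a few lines. This is a minimal illustration, assuming the panel is stored as a T x N NumPy array; the function name `estimate_factors` and the simulated two-factor data are hypothetical and are not taken from the study.

```python
import numpy as np

def estimate_factors(X, k):
    """Principal-component estimates of k factors for a T x N panel X."""
    T, N = X.shape
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)   # eigh returns ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]          # positions of the k largest
    F = np.sqrt(T) * eigvecs[:, top]             # T x k, normalized so F'F/T = I_k
    Lam = X.T @ F / T                            # N x k loadings, i.e. Lam' = F'X/T
    return F, Lam

# Simulated panel (hypothetical): two common factors plus idiosyncratic noise.
rng = np.random.default_rng(0)
T, N, r = 80, 60, 2
F0 = rng.standard_normal((T, r))
L0 = rng.standard_normal((N, r))
X = F0 @ L0.T + rng.standard_normal((T, N))

F_hat, Lam_hat = estimate_factors(X, r)
common = F_hat @ Lam_hat.T                       # estimated common component
print(np.round(F_hat.T @ F_hat / T, 6))          # numerically the identity matrix
```

The factors are identified only up to rotation, so the sensible comparison with the simulated truth is through the common component rather than through the individual factor series.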
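To estimate the true number of factors, r, Bai and Ng (2002) propose to minimize the
criterion functions. This is done by minimizing a penalized likelihood or log sum of
squares where the penalty, $k\,g(N,T)$, increases linearly in the number of factors.
Among the preferred criteria suggested in that paper, we consider two panel information
criteria ($IC_p$) with different penalty functions, which do well in simulations and are more
desirable in practice.
$$IC_{p1}(k) = \ln V(k,\tilde F^k) + k\left(\frac{N+T}{NT}\right)\ln\left(\frac{NT}{N+T}\right)$$
$$IC_{p2}(k) = \ln V(k,\tilde F^k) + k\left(\frac{N+T}{NT}\right)\ln C_{NT}^2, \qquad k = 0, 1, \ldots, k_{max}$$
where $V(k,\tilde F^k)$ is the value of the objective function (18) with k factors, $C_{NT}^2 = \min\{N, T\}$,
and $k_{max}$ is the maximum possible number of factors. Considering that $BIC_3$ has good
properties in the presence of cross section correlation and gives a smaller estimate of the
number of factors (the penalty term has a greater weight), we also include $BIC_3$ in
the study.
$$BIC_3(k) = V(k,\tilde F^k) + k\,\hat\sigma^2\,\frac{(N+T-k)\ln(NT)}{NT}$$
where $\hat\sigma^2 = V(k_{max},\tilde F^{k_{max}})$.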
However, the Bai & Ng criteria have been shown to overestimate the true number of common
factors when panel data have considerable serial dependence (Greenaway-McGrevy, Han
and Sul). Overestimation impairs forecast accuracy. Greenaway-McGrevy et al.
suggested filtering the data before applying the Bai & Ng method. Two filtering methods
are applied in our analysis: first differencing ($\phi = 1$) and AR(1) fitting ($\phi < 1$):
$$\tilde X_{it} = X_{it} - \phi X_{i,t-1}$$
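The two filters can be sketched as below. This is a rough illustration under stated assumptions: the panel is a T x N NumPy array, and the AR(1) coefficient is estimated by a single pooled regression on series-demeaned data, which is one plausible reading of "AR(1) fitting" rather than a confirmed description of the procedure used here. The function names and the simulated panel are hypothetical.

```python
import numpy as np

def first_difference(X):
    """Filter with phi = 1: X_t - X_{t-1} for every series (column)."""
    return X[1:] - X[:-1]

def ar1_filter(X):
    """Filter with a pooled AR(1) coefficient estimated from the data (assumption)."""
    Xd = X - X.mean(axis=0)                                   # remove each series' own mean
    phi = (Xd[1:] * Xd[:-1]).sum() / (Xd[:-1] ** 2).sum()     # pooled least squares slope
    return X[1:] - phi * X[:-1], phi

# Simulated persistent panel (hypothetical): every series is an AR(1) with phi = 0.8.
rng = np.random.default_rng(0)
T, N, phi0 = 200, 50, 0.8
X = np.zeros((T, N))
for t in range(1, T):
    X[t] = phi0 * X[t - 1] + rng.standard_normal(N)

dX = first_difference(X)
X_f, phi_hat = ar1_filter(X)
print(dX.shape, X_f.shape, round(phi_hat, 3))
```

Either filtered panel would then be passed to the factor-number criteria in place of the raw series; both filters cost one observation per series.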
The Bai & Ng criteria also overestimate the true number of common factors when either the
number of cross sections or the number of time series observations is small (Ahn and Horenstein
(2009)). Ahn and Horenstein (2009) propose new estimators, the eigenvalue ratio estimator
(ER) and the growth ratio estimator (GR), which are shown to outperform the Bai & Ng
estimators in small panels. These new estimators only use the eigenvalues of sample
covariance matrices of response variables and are not sensitive to the choice of the
maximum possible number of factors ($k_{max}$). $r$ is estimated as the maximizer of the ratio
of adjoining eigenvalues, i.e., this corresponds to finding the edge of the cliff in the scree
plot.
$$ER(k) = \frac{\tilde\mu_k}{\tilde\mu_{k+1}}$$
$$GR(k) = \frac{\ln(1+\tilde\mu_k^*)}{\ln(1+\tilde\mu_{k+1}^*)}$$
where $\tilde\mu_k$ is the $k$th largest eigenvalue of the sample covariance matrix of $X$, and
$\tilde\mu_k^* = \tilde\mu_k / \sum_{j=k+1}^{\min(N,T)} \tilde\mu_j$.
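A compact sketch of the two ratio estimators follows. It assumes the standard Ahn and Horenstein definitions, with the eigenvalues taken from XX'/(NT) for a T x N panel; the function name `er_gr` and the simulated data are hypothetical, and any demeaning or filtering of the panel is left to the caller.

```python
import numpy as np

def er_gr(X, kmax):
    """Eigenvalue-ratio and growth-ratio estimates of the number of factors."""
    T, N = X.shape
    m = min(N, T)
    # Nonzero eigenvalues of XX'/(NT), sorted from largest to smallest.
    mu = np.sort(np.linalg.eigvalsh(X @ X.T / (N * T)))[::-1][:m]
    ks = np.arange(1, kmax + 1)
    er = mu[ks - 1] / mu[ks]                                    # ER(k) = mu_k / mu_{k+1}
    tail = np.array([mu[k:].sum() for k in range(1, kmax + 2)]) # sum of eigenvalues beyond k
    mu_star = mu[:kmax + 1] / tail                              # mu*_k for k = 1..kmax+1
    gr = np.log1p(mu_star[:-1]) / np.log1p(mu_star[1:])         # GR(k)
    return int(ks[er.argmax()]), int(ks[gr.argmax()])

# Simulated panel (hypothetical) with two strong factors.
rng = np.random.default_rng(0)
T, N, r = 80, 60, 2
X = rng.standard_normal((T, r)) @ rng.standard_normal((N, r)).T + rng.standard_normal((T, N))
k_er, k_gr = er_gr(X, kmax=8)
print(k_er, k_gr)
```

Both estimators look for the largest drop between adjacent eigenvalues, which is why the choice of `kmax` matters little as long as it exceeds the true number of factors and stays below min(N, T) - 1.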
Considering our case as a small panel with high serial correlation and cross section
dependence, we choose eight criteria: $IC_{p1}$ with first differencing, $IC_{p2}$ with first
differencing, $BIC_3$ with first differencing, $IC_{p1}$ with AR(1) fitting, $IC_{p2}$ with AR(1)
fitting, $BIC_3$ with AR(1) fitting, ER, and GR. Table 9 reports the estimated number
of factors according to each of the above eight criteria.
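To make the first two criteria in this list concrete, the sketch below applies the Bai and Ng (2002) information criteria to a first-differenced panel. It is an illustration only: the penalty terms follow the standard published forms, the function name `bai_ng_ic` is hypothetical, and the data are simulated so that the differenced panel has exactly two factors.

```python
import numpy as np

def bai_ng_ic(X, kmax):
    """Number of factors chosen by IC_p1 and IC_p2 for a T x N panel X."""
    T, N = X.shape
    eig = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]
    ks = np.arange(kmax + 1)
    # V(k): average squared residual after removing the first k principal components.
    V = np.array([(eig.sum() - eig[:k].sum()) / (N * T) for k in ks])
    scale = (N + T) / (N * T)
    ic1 = np.log(V) + ks * scale * np.log(N * T / (N + T))
    ic2 = np.log(V) + ks * scale * np.log(min(N, T))
    return int(ic1.argmin()), int(ic2.argmin())

# Simulated levels (hypothetical): cumulated sums of a two-factor panel.
rng = np.random.default_rng(1)
T, N, r = 81, 60, 2
growth = rng.standard_normal((T, r)) @ rng.standard_normal((N, r)).T + rng.standard_normal((T, N))
levels = growth.cumsum(axis=0)

dX = np.diff(levels, axis=0)          # first-differencing filter
k_ic1, k_ic2 = bai_ng_ic(dX, kmax=8)
print(k_ic1, k_ic2)
```

Swapping `np.diff` for an AR(1) filter gives the corresponding "AR(1) fitting" variants; the criteria themselves are unchanged.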
Above we have derived a set of implicit factors endogenously. The difficulty then lies in
the economic interpretation of the implied factors. In the next section, we move on to test
whether the observed factors which we believe or know to be relevant are highly
correlated to the common factors.
2.3 Examining the Links between Common Factors in HPI and other
Business Cycles
Housing markets can be viewed as an extension of capital markets and residential
property may be seen as a potential institutional investment class. This is not only
because housing is an alternative portfolio choice available to investors but also
because financing terms available on capital markets have a significant effect on the
return on housing. However, housing markets diverge from capital markets in a number
of ways. They face high governmental intervention, and investments are illiquid, indivisible,
and structurally and locationally heterogeneous. More importantly, housing prices are much
more sensitive to economic conditions. The finance literature identifies highly successful
factors, such as the risk premium and the Fama-French factors, to price asset returns. These factors
are all derived from asset returns alone. However, in the real estate market, failing to
consider the stronger dependency of housing on macroeconomic fundamentals will lead
to biases in estimation.
Housing prices are assumed to respond to external forces, as proxied by a set of
macroeconomic variables. Figure 4 shows the co-movements of national OFHEO HPA
and key macroeconomic variables[18] over the years 1975 to 2007. The strongest
relationship seems to be that between house prices and consumption of nondurable
goods. It is not surprising that we see a negative relationship between the growth rates of
fixed investment and HPA, and a positive relationship between the growth rates of
nondurable goods consumption and HPA.
[18] All variables have been de-trended by taking logs and then regressing on a constant and a linear trend.
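The de-trending described in footnote 18 amounts to keeping the residuals from a regression of the logged series on a constant and a time trend. A minimal sketch, assuming a quarterly series held in a NumPy array; the function name `log_linear_detrend` and the example series are hypothetical.

```python
import numpy as np

def log_linear_detrend(series):
    """Residuals of log(series) regressed on a constant and a linear time trend."""
    y = np.log(np.asarray(series, dtype=float))
    t = np.arange(y.size, dtype=float)
    Z = np.column_stack([np.ones_like(t), t])        # regressors: constant, trend
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)     # least squares fit
    return y - Z @ coef                              # log deviation from trend

# Hypothetical index: 1% trend growth per quarter plus a slow cycle.
quarters = np.arange(40)
index = 100.0 * np.exp(0.01 * quarters + 0.05 * np.sin(quarters / 4.0))
cycle = log_linear_detrend(index)
print(np.round(cycle[:4], 4))
```

Because the residuals are in logs, they can be read approximately as percentage deviations from trend, which is the unit on the vertical axes of Figure 4.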
Figure 4 Co-movements between Macroeconomic Aggregates and House Prices
[Figure 4: eighteen panels, each plotting HPI against one macroeconomic series as % deviation from trend, 1975 to 2005. Panels: 3.1 HPI and Nominal GDP; 3.2 HPI and Real GDP; 3.3 HPI and Nominal Personal Consumption; 3.4 HPI and Real Personal Consumption; 3.5 HPI and Nominal Durable Consumption; 3.6 HPI and Real Durable Consumption; 3.7 HPI and Nominal Nondurable Consumption; 3.8 HPI and Real Nondurable Consumption; 3.9 HPI and Residential Fixed Investment; 3.10 HPI and Real Residential Fixed Investment; 3.11 HPI and Nonresidential Fixed Investment; 3.12 HPI and Real Nonresidential Fixed Investment; 3.13 HPI and Civilian Unemployment Rate; 3.14 HPI and Civilian Employment; 3.15 HPI and Disposable Personal Income; 3.16 HPI and Real Disposable Personal Income; 3.17 HPI and Personal Income; 3.18 HPI and Oil Price.]
The link between housing markets and the rest of the economy operates primarily
through the effects of house price fluctuations, as they represent the main source of
fluctuations in housing wealth. If that relationship is stable, then fundamentals can
explain house prices. Given the importance of housing in household wealth, it also seems
reasonable to conjecture that the observed ups and downs in housing prices could have
substantial macroeconomic impacts. It would be interesting to explain these movements
in the housing market, and to determine to what extent they are related to macroeconomic
movements in business cycles. The question we raise here is then whether the common shocks to
economic activity are also the common shocks to housing price dynamics. A second step
analysis will help to identify the individual variables that explain the highest proportion
of the variation in the housing price data.
Suppose we observe $Macro_t$, an $m \times 1$ vector of macroeconomic variables. Let
$Macro_{jt}$ be an element of the m vector $Macro_t$. We are interested in the relationship
between $Macro_{jt}$ and $F_t$, but we do not observe $F_t$. If $Macro_{jt}$ is a
good proxy for $F_t$, it should explain $X_{it}$, so we can simply regress $X_{it}$ on $Macro_{jt}$ and
use a postestimation test to assess the explanatory power of $Macro_{jt}$. However, even if
$Macro_{jt}$ equals $F_t$ exactly, $Macro_{jt}$ might still be weakly correlated with $X_{it}$ if the
variance of the idiosyncratic error $e_{it}$ is large.
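To evaluate the observed factors against the latent factors more effectively, Bai and
Ng (2006) have proposed several criteria that indicate the extent to which the two sets of
factors differ. These criteria have good properties only when the sample size is large in
both the cross-section and the time series dimensions. Considering that the panel of HPI is
small, we construct two straightforward statistics to test the fitness. As before, let
$Macro_{jt}$ be an element of the $m \times 1$ vector of observed economic variables $Macro_t$.
The null hypothesis is that $Macro_{jt}$ is an exact factor, or more precisely,
that there exists a $\delta_j$ such that $Macro_{jt} = \delta_j' F_t$ for all $t$. Consider the following
regression,
$$Macro_{jt} = \gamma_{j1}\tilde F_{1t} + \gamma_{j2}\tilde F_{2t} + \cdots + \gamma_{jr}\tilde F_{rt} + \varepsilon_{jt}$$
Let $\hat\gamma_j$ be the least squares estimate of $\gamma_j$ and let $\widehat{Macro}_{jt} = \hat\gamma_j'\tilde F_t$. We consider two
descriptive statistics: $R^2(j)$ and the sample correlation $\rho(j)$ between $Macro_{jt}$ and
$\widehat{Macro}_{jt}$.
$$R^2(j) = \frac{\hat\gamma_j'\left(\frac{1}{T}\sum_{t=1}^{T}\tilde F_t\tilde F_t'\right)\hat\gamma_j}{\frac{1}{T}\sum_{t=1}^{T}Macro_{jt}^2}$$
$$\rho(j) = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(Macro_{jt}-\overline{Macro}_j\right)\left(\widehat{Macro}_{jt}-\overline{\widehat{Macro}}_j\right)}{\sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(Macro_{jt}-\overline{Macro}_j\right)^2}\,\sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\widehat{Macro}_{jt}-\overline{\widehat{Macro}}_j\right)^2}}$$
where $\overline{Macro}_j = \frac{1}{T}\sum_{t=1}^{T}Macro_{jt}$ and $\overline{\widehat{Macro}}_j = \frac{1}{T}\sum_{t=1}^{T}\widehat{Macro}_{jt}$.
Using the above two criteria, we test the macroeconomic variables one at a time. The
results are reported in Table 10.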
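The second-step check can be sketched as follows: regress one observed series on the estimated factors and compute the share of variation explained and the correlation between the series and its fitted value. This is a minimal illustration assuming demeaned inputs; the function name `factor_fit_stats` and the simulated series are hypothetical, and the exact definitions used in the study may differ in detail.

```python
import numpy as np

def factor_fit_stats(macro, F):
    """R-squared and correlation from regressing one macro series on the factors F."""
    macro = macro - macro.mean()                         # work with demeaned data
    F = F - F.mean(axis=0)
    gamma, *_ = np.linalg.lstsq(F, macro, rcond=None)    # least squares coefficients
    fitted = F @ gamma
    r2 = (fitted ** 2).sum() / (macro ** 2).sum()
    corr = np.corrcoef(macro, fitted)[0, 1]
    return r2, corr

# Hypothetical example: one series that is an exact factor and one noisy proxy.
rng = np.random.default_rng(0)
T = 80
F = rng.standard_normal((T, 2))
exact = F @ np.array([0.5, -1.0])
noisy = exact + 2.0 * rng.standard_normal(T)

r2_exact, corr_exact = factor_fit_stats(exact, F)
r2_noisy, corr_noisy = factor_fit_stats(noisy, F)
print(round(r2_exact, 4), round(r2_noisy, 4))
```

A series that is an exact linear combination of the factors gives both statistics equal to one; the further a proxy is from the factor space, the closer both fall toward zero.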
2.4 Data
The data used in this research are two balanced panels of quarterly housing price indexes
at MSA-level: repeat transactions home price indexes estimated by the Office of Federal
Housing Enterprise Oversight (OFHEO) for 1991:1–2010:4 and MRAC Single Family
Residence Home Price Index for 1976:1-2007:2.
OFHEO has estimated repeat sales price indexes for U.S. metropolitan areas, census
regions and states. These data, which are updated and released quarterly, exploit
variations in geographical distribution. Our focus is on the housing prices for the 381
different metropolitan housing markets. These prices at the MSA level were normalized
to 100 in the first quarter of 1995 [19]. Although the observations start from the first
quarter of 1975, the OFHEO effort was not undertaken continuously until 1996, resulting
in many missing observations in the early years. Because our analysis is based on a
balanced panel, we selected the MSAs that have a continuous record of observations
ending in the third quarter of 2008 and beginning no later than the fourth
quarter of 1989. As a result, all 40 MSAs without continuous observations are excluded
from the study. This requirement that MSAs have a continuous record introduces
selection bias into the sample. To minimize this bias, our analysis also includes the
quarterly series of the 9 census divisions [20].
In contrast to the OFHEO HPI, the MRAC Single Family Residence Home Price Index is
based on mean or median housing prices. Median-based estimators are less influenced by
extreme observations and therefore can provide a more accurate measure of the expected
appreciation rate for a typical property than the standard repeat sales estimator (McMillen
(2008)). As a result, industry organizations such as the National Association of Realtors
and the International Association of Assessing Officers continue to prefer median price
indices. Unlike the OFHEO HPI, the MRAC HPI is a balanced panel in which all MSAs
have continuous observations, starting from the first quarter of 1976 and ending in the
second quarter of 2007. This sample, therefore, does not suffer from selection bias. For
robustness, we test both datasets, since neither of them can exactly represent house
price dynamics.
[19] The difference in normalization dates has no impact on appreciation rates obtained from the index.
[20] Each of the 381 MSAs is in one of these divisions.
Figure 5 describes the house price trends and cycles from 1980Q2 to 2007Q4, using the
OFHEO HPI. Real house prices are volatile and fluctuate over time,
with an average standard deviation of the growth rate of around 1 percent per quarter. In
recent years, while house prices have been buoyant, their volatility has declined
markedly, a phenomenon which seems to be worldwide [21].
[21] See “World Economic Outlook”, September 2004, IMF.
Figure 5 Housing Price Appreciation (HPA) and its Volatility at MSA level
[Plot: OFHEO HPA rolling quarterly log return at MSA level, 1980-2007 on the x-axis, values
from -0.2 to 0.3 on the y-axis. Series: Max, Average change, Min, Average + 1 standard
deviation, Average - 1 standard deviation.]
Much of the research focuses on how macroeconomic indicators react to each
other under policy shocks; we are interested in how the macroeconomic indicators
help us to understand housing price movements. To evaluate the macroeconomic
variables, we rely on a panel of quarterly observations [22] of U.S. macroeconomic variables
and financial indexes, which measure production (GDP, industrial production index),
consumption (durable goods, nondurable goods, services), investment (residential fixed
investment, non-residential fixed investment), savings, employment (employment growth
and unemployment rate), prices (oil price, PPI, CPI, GDP deflator, money supply), and
financial asset returns (Treasury bill, CD, FFR, bond yields, etc.). See the Appendix for
descriptions of the macroeconomic indicators and asset class benchmarks included in the
study.
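The data preparation described in the footnotes (monthly series averaged within each quarter,
natural logs, growth rates and standardization) can be sketched as follows. This is a minimal
illustration, not the processing actually used in the study: the helper names and the monthly
index values are hypothetical, and first differencing of logs is assumed as the detrending step.

```python
import math
import statistics

def monthly_to_quarterly(monthly):
    """Average each block of three consecutive months into one quarterly value."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) / 3 for i in range(0, len(monthly), 3)]

def log_growth(levels):
    """First difference of natural logs, i.e. the quarterly growth rate."""
    return [math.log(b) - math.log(a) for a, b in zip(levels, levels[1:])]

def standardize(values):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical monthly index covering four quarters.
monthly_index = [100, 101, 102, 103, 105, 107, 108, 108, 109, 111, 114, 117]
quarterly = monthly_to_quarterly(monthly_index)
growth = standardize(log_growth(quarterly))
print(quarterly)
print([round(g, 3) for g in growth])
```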
2.5 Factor and Factor Loadings
2.5.1 Number of Factors
Two filtering methods, first differencing [23] and AR(1) fitting, are used before we apply
the information criteria.
Table 9 shows the results of estimating the number of factors. kmax is set at 8. The
sample period is 1991Q1-2010Q4 for the OFHEO panel and 1976:1-2007:2 for the
MRAC panel. In the OFHEO panel, 4 mutually orthogonal principal components, which
explain most of the cross-sectional variance of the OFHEO HPAs, are extracted by ICp1
and ICp2 after first differencing or after AR(1) fitting. BIC3, a stricter criterion, shows a
lower value than ICp1 and ICp2: 3 factors are extracted by BIC3. Strikingly but not
surprisingly, the ER estimator and the GR estimator capture only 1 factor, because the second
largest eigenvalue drops significantly.
[22] If quarterly data are not available, monthly data are transformed to quarterly data by taking the average
within each quarter. The data are first transformed to natural logs and then de-trended.
[23] First differencing generates housing price appreciation. Considering the high volatility of the data,
standardization is also applied before the analysis.
Table 9 Estimated Numbers of Factors (r): kmax = 8
Dataset                         OFHEO            MRAC
Sample Period                   1991Q1-2010Q3    1976Q2-2007Q2
# Observations                  349              381
ICp1 with first differencing    4                5
ICp1 with AR(1) fitting         4                5
ICp2 with first differencing    4                5
ICp2 with AR(1) fitting         4                5
BIC3 with first differencing    3                2
BIC3 with AR(1) fitting         3                2
ER                              1                1
GR                              1                1
Considering the sample selection bias in the OFHEO HPIs, we include a comparative
study of the MRAC HPIs. This is a complete and balanced panel consisting of all 381
MSAs and 122 quarters. More factors are extracted using the Bai & Ng criteria. ICp1 with
first differencing, ICp2 with first differencing, ICp1 with AR(1) fitting and ICp2 with
AR(1) fitting all choose 5 factors. We thus find more factors in the MRAC HPAs than in
the OFHEO HPAs. One interpretation of this result is that the MRAC HPIs are more
volatile than the OFHEO HPIs, and high-volatility data are more likely to be contaminated by
noise. However, BIC3 with first differencing or AR(1) fitting suggests the presence of
only 2 factors. ER and GR again both choose only 1 factor.
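To make the selection rules concrete, the sketch below computes the ICp1 and ICp2 criteria of
Bai and Ng (2002) and the eigenvalue ratio (ER) and growth ratio (GR) estimators of Ahn and
Horenstein (2009) from the eigenvalues of a panel. It is a simplified illustration under stated
assumptions: the function names are hypothetical, the panel is simulated with two strong
factors rather than taken from the housing data, candidate values start at one factor, and
BIC3 and the filtering steps are omitted.

```python
import numpy as np

def eigenvalues(X):
    """Eigenvalues of X'X / (N T), largest first, for a T x N panel."""
    T, N = X.shape
    values = np.linalg.eigvalsh(X.T @ X / (N * T))
    return values[::-1]

def ic_p(X, kmax, variant=1):
    """Number of factors in 1..kmax minimizing ICp1 (variant=1) or ICp2 (variant=2)."""
    T, N = X.shape
    mu = eigenvalues(X)
    scale = (N + T) / (N * T)
    log_term = np.log(1.0 / scale) if variant == 1 else np.log(min(N, T))
    # V(k): mean squared residual after removing the first k principal components.
    scores = [np.log(mu[k:].sum()) + k * scale * log_term for k in range(1, kmax + 1)]
    return int(np.argmin(scores)) + 1

def er_gr(X, kmax):
    """Numbers of factors in 1..kmax chosen by the eigenvalue ratio and growth ratio."""
    mu = eigenvalues(X)
    er = mu[:kmax] / mu[1:kmax + 1]
    tail = np.array([mu[k:].sum() for k in range(kmax + 2)])  # V(0), ..., V(kmax + 1)
    gr = np.log(tail[:-2] / tail[1:-1]) / np.log(tail[1:-1] / tail[2:])
    return int(np.argmax(er)) + 1, int(np.argmax(gr)) + 1

# Simulated panel: T = 100 periods, N = 50 series, 2 common factors plus noise.
rng = np.random.default_rng(0)
T, N = 100, 50
X = rng.standard_normal((T, 2)) @ rng.standard_normal((N, 2)).T
X = X + 0.5 * rng.standard_normal((T, N))
print(ic_p(X, 8, variant=1), ic_p(X, 8, variant=2), er_gr(X, 8))
```

All four rules work off the same ordered eigenvalues; they differ only in how they trade the
drop in unexplained variance against the number of factors, which is why they can disagree on
real panels as in Table 9.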
Figure 6 plots the first factor extracted from the OFHEO panel, its Hodrick-Prescott (HP)
filter and the OFHEO national index. Plotting the first factor and the national index together,
we see a rough overlap, i.e. the national index tracks the first factor closely. The
evidence suggests that the first factor is closely related to the national housing price
index. In other words, the national housing price index can represent the first factor.
However, this single index does not capture the movements of the second factor, the third
factor and so on. The overlap between the first factor and the national index has been
especially close since 2002. One interpretation of this result is that idiosyncratic shocks were more
important before 2002, but common shocks have become more important in recent years
as sources of economic fluctuations.
Figure 6 1st Factor (OFHEO) vs. National Index (OFHEO): Standardized HPA
[Plot: x-axis ticks at 1990, 1995, 2000 and 2005; y-axis from -3 to 2. Series: 1st Factor HP
Filter, 1st Factor, National HPI.]
Figure 7 presents an overview of the four most significant factors. The solid lines are the
corresponding Hodrick-Prescott (HP) filters [24], which represent the trends of the
movements of the latent factors. The second factor HP filter moves in the opposite
direction to the first factor HP filter. We therefore call the second factor a
“counteracting factor”. The third factor HP filter is a “lag factor” in relation to the first
factor HP filter. The fourth factor HP filter is much weaker and does not show a
significant link with the first factor. We expect the values of the subsequent
factors to become smaller and smaller and to display weaker and weaker correlations with the
first factor. We call the fourth and subsequent factors “disturbances”. The latent factors in
the OFHEO HPAs thus include a summary indicator (aggregate mean), a counteracting
factor, a lag factor and disturbances.
[24] The thickest bold line is the HP filter of the first factor; the second thickest bold line is the HP filter of the
second factor; the third thickest bold line is the HP filter of the third factor; the thinnest bold line is the HP
filter of the fourth factor.
Figure 7 First 4 Factors (OFHEO)
[Plot: x-axis ticks at 1990, 1995, 2000 and 2005; y-axis from -3 to 3. Series: 1st Factor HP
Filter, 1st Factor, 2nd Factor HP Filter, 2nd Factor, 3rd Factor HP Filter, 3rd Factor,
4th Factor HP Filter, 4th Factor.]
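The HP trends shown in Figures 6 and 7 can be computed by solving one linear system. The
sketch below is a generic implementation of the Hodrick-Prescott trend, not the code used for
the figures: the function name `hp_trend` and the simulated series are hypothetical, and the
smoothing parameter 1600 is the conventional choice for quarterly data, assumed here because
the text does not report the value used.

```python
import numpy as np

def hp_trend(series, lamb=1600.0):
    """Hodrick-Prescott trend: minimizes sum (y - tau)^2 + lamb * sum (second diff of tau)^2."""
    y = np.asarray(series, dtype=float)
    T = y.size
    # (T - 2) x T second-difference operator.
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    # First-order condition of the minimization: (I + lamb * D'D) tau = y.
    return np.linalg.solve(np.eye(T) + lamb * D.T @ D, y)

# Hypothetical standardized factor: a slow cycle plus noise over 80 quarters.
rng = np.random.default_rng(0)
quarters = np.arange(80)
factor = np.sin(quarters / 12.0) + 0.4 * rng.standard_normal(80)
trend = hp_trend(factor)
cycle = factor - trend
print(round(float(np.std(trend)), 3), round(float(np.std(cycle)), 3))
```

A larger smoothing parameter gives a smoother trend; a series that is exactly linear in time is
returned unchanged, because its second differences are zero.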
According to the ER and GR criteria, only one common factor is present in the U.S.
housing price indexes at the MSA level. ER and GR have been shown to perform better in
small panels than the ICp and BIC3 criteria. This better performance, together with the
observation in Figure 6 that the national index overlaps with the first factor, suggests that
historically the national index has provided a good summary of overall housing price
dynamics. This finding also provides strong evidence of consistency in the OFHEO
housing price indexes. The national index is a weighted combination of the nine Census
division indexes, in which the weights reflect the share of the housing stock in each of the
divisions. This weighting approach is less susceptible to distortions related to geographic
shifts in transaction sales volumes, which can bias index measures because sales volumes
and prices are frequently related. In Figure 6, we
observe a strong correspondence between the weighted index and the first factor extracted
from the OFHEO MSA panel, especially after 2003. The weight assignments
therefore appear to have worked better after 2003.
Figure 8 compares the summary measure (the first factor) of the MRAC HPAs [25] and the
appreciation of the OFHEO national index. The two lines diverge significantly from each
other, especially after 1990. The quarterly growth rates of the MRAC HPI are more
volatile, and the summary measure (the first factor) has higher peaks and lower troughs in
most periods than the OFHEO HPAs. This fact implies potential biases caused by
different methods of housing price construction. It raises the possibility
that OFHEO HPAs underestimate both appreciations and depreciations. If this were so,
during a housing boom the wealth effect could be larger, and the impact on GDP greater,
than current estimates obtained from OFHEO indexes. During a decline in housing
prices, the negative wealth effect and the drag on GDP would probably be greater than
expected.
[25] We do not have the MRAC national index.
Figure 8 1st Factor (MRAC) vs. National Index (OFHEO)
[Plot: x-axis ticks every five years from 1980 to 2005; y-axis from -3 to 4. Series: 1st Factor
(MRAC) HP Filter, 1st Factor (MRAC), OFHEO National Index.]
2.5.2 Factor Loadings
Our estimations also yield a factor loading for each MSA-level HPI. For example, the
first factor loading for the Los Angeles HPA measures the co-movement between the Los
Angeles HPA and the first common factor, which represents the aggregate of the MSA-level
HPAs. The set of MSAs’ factor loading vectors provides us with convenient
measures of MSA-level differences in the co-movement between the business cycles of
the MSAs and the country as a whole. Intuitively, a particular MSA-level HPA with
“large” (“small”) factor loadings is strongly (weakly) linked to the national HPA, as
fluctuations in the common factors representing the national HPA are associated with
large (small) fluctuations in the MSA-level HPA.
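As an illustration of how factors and loadings are obtained jointly, the sketch below applies
the standard principal components estimator to a panel, using the normalization that the
estimated factors have an identity second-moment matrix. The function name
`principal_components`, the simulated panel and the choice of four factors are assumptions
made for the example; they do not reproduce the OFHEO estimates.

```python
import numpy as np

def principal_components(X, r):
    """Estimate r factors (T x r) and loadings (N x r) from a T x N panel."""
    T, N = X.shape
    values, vectors = np.linalg.eigh(X @ X.T)
    order = np.argsort(values)[::-1][:r]           # indices of the r largest eigenvalues
    factors = np.sqrt(T) * vectors[:, order]       # normalization: F'F / T = I
    loadings = X.T @ factors / T                   # one row of loadings per series (MSA)
    return factors, loadings

# Hypothetical standardized panel: 80 quarters, 30 series, 4 common factors plus noise.
rng = np.random.default_rng(0)
T, N, r = 80, 30, 4
panel = rng.standard_normal((T, r)) @ rng.standard_normal((N, r)).T
panel = panel + 0.5 * rng.standard_normal((T, N))
panel = (panel - panel.mean(axis=0)) / panel.std(axis=0)
factors, loadings = principal_components(panel, r)
print(factors.shape, loadings.shape)
print(np.round(loadings[0], 2))   # loadings of the first series on the four factors
```

Each row of `loadings` plays the role of one MSA's loading vector: large absolute entries mean
the series moves strongly with the corresponding factor. The signs of the factors and loadings
are not identified separately, only their product.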
Figure 9 plots the factor loadings sorted by Census division, which allows us to observe
the geographic patterns of the factor loadings. The first factor is the most important in terms
of explanatory power for MSA-level HPA. It has the strongest effect and influences
the house prices in all divisions in the same direction. The second factor has its strongest
effect on Census divisions PAC, ENC and MT. It causes the house prices in ENC, ESC,
MT and WNC to move in the same direction, opposite to the movements in MA, PAC and
NE. The effect weakens with the third factor, the fourth factor, and so on. The
subsequent factors show weak regional correlations.
Figure 9 Factor Loadings (OFHEO) [26]
[Four panels: Factor Loadings for the First Factor, Factor Loadings for the Second Factor,
Factor Loadings for the Third Factor, Factor Loadings for the Fourth Factor.]
[26] The U.S. is divided into nine Census divisions:
New England (NE): Connecticut, Massachusetts, Maine, New Hampshire, Rhode Island, Vermont
Mid-Atlantic (MA): New Jersey, New York, Pennsylvania
South Atlantic (SA): Washington, D.C., Delaware, Florida, Georgia, Maryland, North Carolina, South Carolina,
Virginia, West Virginia
East North Central (ENC): Illinois, Indiana, Michigan, Ohio, Wisconsin
West North Central (WNC): Iowa, Kansas, Minnesota, Missouri, North Dakota, South Dakota, Nebraska
East South Central (ESC): Alabama, Kentucky, Mississippi, Tennessee
West South Central (WSC): Arkansas, Louisiana, Oklahoma, Texas
Mountain (MT): Arizona, Colorado, Idaho, Montana, New Mexico, Nevada, Utah, Wyoming
Pacific (PAC): Alaska, California, Hawaii, Oregon, Washington
Factor loadings are informative for defining mortgage submarkets. This definition is crucial
for mortgage bankers seeking to minimize concentration risk by diversifying a portfolio, because
one type of diversification is to pick loans in different regions. The key to this is the
housing price correlation, because large regional housing price shocks do occur
periodically and can seriously hurt investors if their real estate portfolios are not well
diversified. However, for this purpose, it is not clear how to determine the regions. The
factor loadings help to create distinct regions better than do alternatives such as the OFHEO
census regions. OFHEO has divided the U.S. into four regions according to
geography. This division may not be the best for concentration risk management because it
ignores the fact that there are big cities, such as Los Angeles and New York, that share similar
HPA paths but lie in different regions.
2.6 Goodness of Fit
2.6.1 Testing the Fitness of Macroeconomic Variables and Financial Market Indices
We have observed significant co-movements between macroeconomic aggregates and
housing prices (Figure 4). These dynamic interrelations between housing volatility and
economic and demographic variables have been widely studied. In this section, we
empirically test whether the common shocks to economic activity are also the common
shocks to housing price dynamics.
Empirical asset pricing research assumes that the returns of securities can be represented as
linear combinations of factors. Factor models relate the returns of securities to a set of
factors. A growing body of research models real estate returns using traditional asset
pricing models, which rely on information about asset returns alone. However, failing to
consider the stronger dependency of housing on macroeconomic fundamentals will lead
to biases in estimation. The following fitness test serves as a data mining technique to
identify the hidden factors from historical data.
We first take the log difference of both the housing prices and the macroeconomic series
to obtain the growth rates. According to the result in the previous section, we set r at 4 in
the analysis of the OFHEO panel. Since we are interested in how well macroeconomic
variables forecast housing price movements, we run separate regressions for
macroeconomic variables in the contemporaneous period, with a one-quarter lag and with a
one-year lag. Results are reported in Table 10.
One-by-one Test on Each Macroeconomic Variable
The correlation coefficients are around 20-60% for most series. The $R^2$s range
from 2% to 50%.
Table 10 Goodness of Fit (Correlation and $R^2$)
                                      Correlation                         $R^2$
Variables [27]                        Contemp.   1 Q lag    1 year lag    Contemp.   1 Q lag    1 year lag
Real Personal Income                  0.362097   0.318642   0.376751      0.131114   0.101533   0.141941
Real Disposable Income                0.224209   0.140047   0.212889      0.05027    0.019613   0.045322
GDP                                   0.46133    0.53185    0.396135      0.212826   0.282865   0.156923
Real Personal Consumption             0.521761   0.552033   0.45386       0.272234   0.30474    0.205989
Consumption Durable Goods             0.292452   0.358481   0.311782      0.085528   0.128509   0.097208
Consumption Nondurable Goods          0.272844   0.360632   0.20816       0.074444   0.130056   0.043331
Consumption Services                  0.605988   0.569055   0.517657      0.367221   0.323824   0.267969
Private Residential Fixed Invest      0.661672   0.709017   0.67179       0.43781    0.502705   0.451301
Private Nonresidential Fixed Invest   0.42093    0.429545   0.476256      0.177182   0.184509   0.22682
Fixed Private Invest                  0.563432   0.583214   0.575839      0.317456   0.340139   0.33159
Private Saving                        0.088201   0.275148   0.167298      0.007779   0.075707   0.027989
Industrial Production                 0.47086    0.388571   0.507634      0.221709   0.150988   0.257692
Civilian Employment                   0.557507   0.57031    0.597914      0.310814   0.325254   0.357501
Civilian Unemployment                 0.560675   0.493095   0.558225      0.314356   0.243143   0.311615
Oil Price                             0.354537   0.287393   0.210651      0.125697   0.082595   0.044374
PPI                                   0.307205   0.337913   0.269798      0.094375   0.114185   0.072791
CPI                                   0.246145   0.316411   0.288026      0.060587   0.100116   0.082959
M1                                    0.229805   0.236256   0.355186      0.05281    0.055817   0.126157
GDP Deflator                          0.569031   0.62143    0.586837      0.323796   0.386175   0.344378
TreasuryBill3m                        0.463742   0.421224   0.513732      0.215057   0.17743    0.263921
TreasuryBill6m                        0.629008   0.571557   0.552969      0.395651   0.326678   0.305775
TreasuryBill1y                        0.665909   0.568175   0.501412      0.443434   0.322823   0.251414
Treasury5y                            0.608257   0.416059   0.24479       0.369977   0.173105   0.059922
Treasury10y                           0.628717   0.398877   0.19686       0.395285   0.159103   0.038754
FFR                                   0.512322   0.47053    0.525303      0.262473   0.221399   0.275944
AAA                                   0.483762   0.255116   0.250204      0.234026   0.065084   0.062602
BAA                                   0.279223   0.205421   0.221841      0.077966   0.042198   0.049213
CD1m                                  0.527128   0.563154   0.483802      0.277864   0.317143   0.234064
CD3m                                  0.532198   0.604327   0.476406      0.283234   0.365211   0.226962
CD6m                                  0.526468   0.553004   0.450711      0.277169   0.305813   0.203141
[27] Data are transformed to natural logs and first differenced to obtain the growth rate, and then standardized.
House prices and income have been argued to be linked by a stable long-run relationship.
The gap between the two may be a useful indicator of when house prices are above or
below their equilibrium values, and therefore a useful predictor of future house-price
changes. However, we do not observe a strong correlation between housing prices
and income. Consumption has a much stronger coherence with the common factors
underlying the OFHEO HPAs. This finding suggests that inflows of money do not significantly
affect housing purchase behavior, whereas outflows of money are more
important in determining housing demand. The effects of housing market wealth on
consumption are well documented. K. E. Case et al. (2003) find strong evidence that variations in
housing market wealth have important effects upon consumption. A deflating housing
bubble can be expected to depress consumption through a long list of channels.
When house prices fall, homeowners’ total wealth declines and the value
of mortgage equity is lowered. Consumers also face constraints due to declines in the
stock market and the tightening of lending terms by depository institutions. The
resulting rise in the prices of energy, food, and other commodities taxes the disposable
incomes of households and restrains consumer spending. Households with significant
mortgage debt may need to adjust non-durable consumption when confronted by a
negative, unanticipated economic shock, which is called the “lock-in” effect. More
importantly, changes in housing prices may have substantial effects on other
macroeconomic variables through private consumption. These effects can be reciprocal,
that is, consumption can also affect housing price movements.
The strong correlations between investment, production and the common factors of the
OFHEO HPAs reveal that investment and production are able to explain a high
proportion of housing price movements. This finding indicates that investment and
production generate a large share of housing demand, thereby affecting
housing price movements. The fitness tests show that fixed investment, with a one-quarter
lag, is the best proxy for the common factors underlying MSA-level U.S.
housing prices: for private residential fixed investment, $R^2$ is 50%.
There is hardly any evidence for a relation between the other price indexes (including the
producer price index, consumer price index and oil price) and the housing price indexes.
A likely reason is that high-frequency or high-volatility data are more likely to be contaminated
by noise. For example, the highly volatile oil price is a less reliable proxy for the
systematic variations in the data.
The financial variables we include are mainly benchmarks of returns on riskless and
low-risk assets: interest rates and bond yields. According to the theoretical
frameworks that link house prices to interest rates, interest rates and housing prices
should be strongly correlated. For example, the standard Gordon Growth Model
implies a convex relationship between house prices and interest rates: the lower interest
rates are, the bigger the percentage increase in house prices when interest rates fall by one
percentage point. Most series perform well in the test. Correlations range from 50% to 70%,
except for the BAA bond yield.
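The convexity implied by the Gordon Growth Model can be checked with simple arithmetic. In
the sketch below the price is the cash flow divided by the discount rate minus the growth rate;
the 2% growth rate, the unit cash flow and the three starting interest rates are illustrative
assumptions, not values taken from the data.

```python
def gordon_price(cash_flow, discount_rate, growth_rate):
    """Gordon Growth Model: price = cash flow / (discount rate - growth rate)."""
    if discount_rate <= growth_rate:
        raise ValueError("discount rate must exceed growth rate")
    return cash_flow / (discount_rate - growth_rate)

def pct_gain_from_one_point_cut(discount_rate, growth_rate, cash_flow=1.0):
    """Percentage price increase when the discount rate falls by one percentage point."""
    before = gordon_price(cash_flow, discount_rate, growth_rate)
    after = gordon_price(cash_flow, discount_rate - 0.01, growth_rate)
    return 100.0 * (after / before - 1.0)

# Assumed rent growth of 2% per year; compare high and low starting interest rates.
for rate in (0.08, 0.06, 0.05):
    print(rate, round(pct_gain_from_one_point_cut(rate, 0.02), 1))
```

Under these assumptions the same one-point cut raises the price by about 20% when the rate
starts at 8%, by about 33% at 6% and by about 50% at 5%, so the response grows as rates fall.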
2.6.2 Lag, Number of Factors and Goodness of Fit
Lag and Goodness of Fit
Most research argues that housing market wealth has important effects upon other
macroeconomic variables such as consumption. We are more interested in how well
macroeconomic variables forecast housing price movements. To this end, we
run separate regressions for macroeconomic variables in the contemporaneous period, with a
one-quarter lag and with a one-year lag. We find that financial variables such as Treasury bill
market rates and bond yields have contemporaneous effects on the housing market, while
consumption, investment and production show lagged effects. It is unsurprising that interest
rates have an immediate effect on housing prices. House prices, like other asset prices,
are very sensitive to interest rates because interest rates directly determine the value and
the cost of holding housing assets. Consumption, investment and production have been
argued to have a long-run relationship with housing prices. The transmission of changes
in consumption, investment and production to housing prices takes
time.
Number of Factors and Goodness of Fit
Figure 10 plots the values of $R^2$ for each macroeconomic variable, and Figure 11 plots the
corresponding correlations. The x-axis represents the number of latent factors. Naturally,
the more factors included in the regression ($r$), the higher the $R^2$ and the correlation
are. Ahn and Horenstein (2009) estimate $r$ as the maximizer of the ratio of adjacent
eigenvalues. Similarly, we want to find the edge of the cliff in the scree plot of $R^2$ and
the correlation. The increase in $R^2$ drops significantly at $r = 4$ in most plots,
especially for the financial variables. This finding coincides with
the number of factors estimated using the Bai and Ng (2002) criteria. It reconfirms the presence of
4 common factors in the OFHEO HPIs.
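The curves in these figures can be produced by repeating the fit regression with one more
factor each time. The sketch below is illustrative only: the function name `r_squared_path` is
hypothetical, and the data are simulated so that the macro series depends on the first four of
eight factors, which makes the flattening after four factors visible.

```python
import numpy as np

def r_squared_path(macro, factors):
    """R^2 from regressing macro on the first r columns of factors, for r = 1, ..., kmax."""
    y = np.asarray(macro, dtype=float)
    y = y - y.mean()
    F = np.asarray(factors, dtype=float)
    F = F - F.mean(axis=0)
    path = []
    for r in range(1, F.shape[1] + 1):
        gamma_hat, *_ = np.linalg.lstsq(F[:, :r], y, rcond=None)
        fitted = F[:, :r] @ gamma_hat
        path.append(float(fitted @ fitted / (y @ y)))
    return np.array(path)

# Simulated example: 80 quarters, 8 candidate factors, only the first 4 matter.
rng = np.random.default_rng(2)
T, kmax = 80, 8
factors = rng.standard_normal((T, kmax))
macro = factors[:, :4].sum(axis=1) + rng.standard_normal(T)
path = r_squared_path(macro, factors)
gains = np.diff(path)   # increase in R^2 from adding the next factor
print(np.round(path, 3))
print(np.round(gains, 3))
```

Because the regressions are nested, $R^2$ can never fall as factors are added; the quantity of
interest is where the increments in `gains` become negligible.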
Figure 10 Number of Factors (r) and R-square ($R^2$)
[Plot: one panel per series, with the number of factors (axis ticks at 2, 4, 6, 8) on the x-axis
and $R^2$ on the y-axis. Panels: Personal Income, Disposable Income, GDP, Personal Consumption,
C: Durable Goods, C: Nondurable Goods, C: Services, Residential Fixed Invest, Nonresidential
Fixed Invest, Fixed Invest, Private Saving, Industrial Production, Employment, Unemployment,
Oil Price, PPI, CPI, M1, GDP Deflator, Treasury Bill 3y, Treasury Bill 6m, Treasury Bill 1y,
Treasury Constant 5y, Treasury Constant 10y, FFR, AAA, BAA, CD 1m, CD 3m, CD 6m.]
Figure 11 Number of Factors (r) and Correlation with Macroeconomic Series
[Plot: one panel per series, with the number of factors (axis ticks at 2, 4, 6, 8) on the x-axis
and the correlation on the y-axis. Panels: Personal Income, Disposable Income, GDP, Personal
Consumption, C: Durable Goods, C: Nondurable Goods, C: Services, Residential Fixed Invest,
Nonresidential Fixed Invest, Fixed Invest, Private Saving, Industrial Production, Employment,
Unemployment, Oil Price, PPI, CPI, M1, GDP Deflator, Treasury Bill 3y, Treasury Bill 6m,
Treasury Bill 1y, Treasury Constant 5y, Treasury Constant 10y, FFR, AAA, BAA, CD 1m, CD 3m,
CD 6m.]
The nexus between macroeconomic factors and housing prices has been well documented.
However, the low $R^2$ suggests that a single macroeconomic factor cannot fully capture
housing price dynamics. The heterogeneity of the housing market is crucial in interpreting
this result, i.e., heterogeneity makes it difficult to distinguish between aggregate and
individual price variations.
2.7 Conclusion
Our paper uses several information criteria to identify the number of common factors
underlying fluctuations in the Metropolitan Statistical Area (MSA)-level Housing Price
Indexes (HPIs) and, more importantly, examines whether the observed macroeconomic
variables are exact factors. This methodology allows us to extract interpretable common
information about unobserved asset returns. The results confirm the importance of many
aspects in affecting the course of housing prices.
Our study provides empirical evidence that only a small number of factors capture the
main co-movements of housing price time series data. We find that the national HPI can
be summarized with a few common factors representing the national business cycle. In
the OFHEO panel, 4 mutually orthogonal principal components are extracted by both
ICp1 and ICp2. BIC3 chooses 3 factors. In the MRAC panel, more factors are extracted by
the Bai & Ng criteria. ICp1 and ICp2 both choose 5 factors and BIC3 suggests the presence of
only 2 factors. The eigenvalue ratio estimator and growth ratio estimator proposed by
Ahn and Horenstein (2009) find only one common factor present in both the OFHEO
HPI and the MRAC HPI at the MSA level in the U.S.
This first factor, called the “summary measure”, shares the same trend as the national
index. The degree of coincidence between this “summary measure” and the national
index indicates that the aggregated index can capture only one dimension of housing
price movements in a specific area. To forecast housing prices more accurately, however,
we want to incorporate as much information as possible. Including the cross-sectional
information embodied in the principal components, i.e. using these common factors to
forecast housing prices, can improve forecasting performance (Stock and Watson (2002)).
Stock and Watson (2002) estimate the indexes and construct forecasts using an
approximate dynamic factor model, in which the predictors are summarized using a
small number of indexes (unobserved factors) constructed by principal component
analysis. This method, known as “diffusion index” forecasting, allows the information
in a large number of variables to be used while keeping the dimension of the forecasting
model small.
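A two-step diffusion index forecast of this kind can be sketched as follows: extract a few
principal-component factors from the panel, then regress the target at a later date on the
current factors. The function name, the simulated panel and the target series are assumptions
made for illustration; the sketch omits lags of the target and any selection of the number of
factors.

```python
import numpy as np

def diffusion_index_forecast(X, y, r, horizon=1):
    """Forecast y `horizon` periods past the sample from r factors of the T x N panel X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, N = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)            # standardize each series
    values, vectors = np.linalg.eigh(Z @ Z.T)
    factors = np.sqrt(T) * vectors[:, np.argsort(values)[::-1][:r]]
    # Regress y_{t+h} on a constant and the factors at t, using the overlapping sample.
    regressors = np.column_stack([np.ones(T - horizon), factors[:-horizon]])
    beta, *_ = np.linalg.lstsq(regressors, y[horizon:], rcond=None)
    return float(np.concatenate([[1.0], factors[-1]]) @ beta)

# Simulated example: one common factor drives 60 series and leads the target by one period.
rng = np.random.default_rng(1)
T, N = 120, 60
f = rng.standard_normal(T)
X = np.outer(f, rng.uniform(0.5, 1.5, N)) + 0.2 * rng.standard_normal((T, N))
y = np.empty(T)
y[0] = 0.0
y[1:] = 2.0 * f[:-1] + 0.1 * rng.standard_normal(T - 1)
forecast = diffusion_index_forecast(X, y, r=1)
print(round(forecast, 3), round(2.0 * f[-1], 3))
```

The forecasting regression has only r + 1 coefficients however many series enter the panel,
which is the sense in which the dimension of the model stays small.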
Forecasting housing prices is crucial for both individual investors and mortgage
companies. Understanding the future path of house prices in relation to economic stresses
is critical to successful strategic planning and risk management. It is widely recognized
that households could reduce and diversify house price risk via a functioning market in house
price derivatives such as futures and options. The integrity of such markets depends on the
accurate measurement of price levels and volatilities. A broad housing futures market
would necessarily be based upon the housing price indexes published by large
organizations. Most of these housing price indexes are constructed from actual
transaction data. All housing price indexes are unavoidably subject to either
unavailability of information or biases caused by estimation methods. The repeat sales
method introduced by Bailey, Muth and Nourse (1963) and the weighted repeat sales model
extended by K. E. Case and Shiller (1987) are the most widely used because they control for
the heterogeneity of structural and location characteristics while requiring only
transaction prices and sales dates. However, numerous recent studies have examined
various potential biases in the repeat sales house price indexes [28]. Their results suggest
that these measurement problems severely impede these indexes’ ability to capture true
risk. The entire historical path of each of the OFHEO indexes is also subject to
quarterly revision. However, Deng and Quigley (2008) found little evidence that the revisions to
these indexes were strongly predictable. To solve these measurement problems, various
alternatives have been proposed (B. Case and Quigley (1991); Quigley (1995); Englund,
Quigley and Redfearn (1998); Deng, McMillen and Sing (2010)). However, these
alternatives all require new information besides the transaction prices.
[28] These biases include: (1) “Renovation Bias” due to the inability to account for the possibility of structure
changes between two sales; (2) “Hedonic Bias” due to the inability to account for depreciation,
maintenance and improvements; (3) “Trading-frequency Bias” caused by the relative infrequency of
sales; (4) “Sample Selection Bias” due to the fact that it only includes properties transacted more than
once; (5) “Aggregation Bias” caused by the specific interval employed (Cho (1996)).
We suggest that a new perspective on constructing housing price indexes is to consider
the integration of the housing market and macroeconomic variables. An alternative
methodology for measuring housing price risk is to look broadly at financial and
macroeconomic dynamics to help reduce the biases in current indexes constructed
from transaction data. Housing price forecasts will also be more effective when they take
market fundamentals into account. Our results show that GDP, personal consumption,
fixed private investment, employment growth, unemployment rates, Treasury bill market
rates, and certificate of deposit market rates have strong correlations with the housing
market. Financial markets have contemporaneous effects on the housing market, while
investment and consumption show lagged effects. This study may stimulate further research
exploring a corresponding “Fama and French Factor Model” in the real estate market.
References
Ahn, S. C., Horenstein, A. R., 2009. Eigenvalue ratio test for the number of factors.
Manuscript, Arizona State University.
Bai, J., Ng, S., 2002. Determining the number of factors in approximate factor models.
Econometrica 70, 191-221.
Bai, J., Ng, S., 2006. Evaluating latent and observed factors in macroeconomics and
finance. Journal of Econometrics 131, 507-537.
Bailey, M. J., Muth, R. F., Nourse, H. O., 1963. A regression method for real estate price
index construction. Journal of the American Statistical Association 58, pp. 933-942.
Bhattacharya, Jayanta, Kate Bundorf, Noemi Pace, and Neeraj Sood. 2009. "Does Health
Insurance make You Fat?" (NBER Working Paper 15163).
Bhattacharya, Jayanta, Dana Goldman, and Neeraj Sood. 2003. "The Link between
Public and Private Insurance and HIV-Related Mortality" Journal of health
economics, 22(6): 1105-1122.
Card, David, Carlos Dobkin, and Nicole Maestas. 2009. "Does Medicare Save Lives?"
The Quarterly Journal of Economics, 124(2): 597-636.
------. 2008. "The Impact of nearly Universal Insurance Coverage on Health Care
Utilization: Evidence from Medicare" American Economic Review, 98(5): 2242-
2258.
Case, K. E., Quigley, J. M., Shiller, R. J., 2003. Home-buyers, housing and the
macroeconomy. In: Richards, A., Robinson, T. (Eds.), Asset prices and
monetary policy. Reserve Bank of Australia, pp. 149-188.
Case, K. E., Shiller, R. J., 1989. The efficiency of the market for single-family homes.
The American Economic Review 79, pp. 125-137.
Case, B., Quigley, J. M., 1991. The dynamics of real estate prices. The Review of
Economics and Statistics 73, pp. 50-58.
Case, K. E., Shiller, R. J., 1987. Prices of single family homes since 1970: New indexes
for four cities.
Cho, M., 1996. House price dynamics: A survey of theoretical and empirical issues.
Journal of Housing Research 7, 145-72.
Coughlin, Teresa A. and Stephen Zuckerman. 2008. "State Responses to New Flexibility
in Medicaid" Milbank Quarterly, 86(2): 209-240.
Deng, Y., McMillen, D. P., Sing, T. F., 2010. Private residential price indices in
Singapore: A matching approach. NUS Institute of Real Estate Studies Working
Paper.
Deng, Y., Quigley, J., 2008. Index revision, house price risk, and the market for house
price derivatives. The Journal of Real Estate Finance and Economics 37, 191-209.
Englund, P., Quigley, J. M., Redfearn, C. L., 1998. Improved price indexes for real estate:
Measuring the course of Swedish housing prices. Journal of Urban Economics 44,
171-196.
Greene, William H. 2002. "Chapter 21: Models for Discrete Choice." In Econometric
Analysis, 5th ed., 715. New Jersey: Prentice-Hall, Inc.
Greenaway-McGrevy, R., Han, C., Sul, D., Estimating the number of common factors in
serially dependent approximate factor models.
Gruber, Jonathan and Kosali Simon. 2008. "Crowd-Out 10 Years Later: Have Recent
Public Insurance Expansions Crowded Out Private Health Insurance?" Journal of
Health Economics, 27(2): 201-217.
Ham, John C., Serkan Ozbeklik, and Lara D. Shore-Sheppard. 2011. "Estimating
Heterogeneous Take-Up and Crowd-Out Responses to Marginal and Non-
Marginal Medicaid Expansions" IZA Discussion Paper No. 5779.
Hausman, Jerry A. 1978. "Specification Tests in Econometrics" Econometrica, 46(6):
1251-1271.
Heckman, James J. 1978. "Dummy Endogenous Variables in a Simultaneous Equation
System" Econometrica, 46(4): pp. 931-959.
Jordan, J., A. Adamo, and T. Ehrmann. 2000. "Innovations in Section 1115
Demonstrations." Health Care Financing Review, 22(2): 49-59.
Kallberg, J. G., Liu, C. H., Pasquariello, P., 2009. On the price comovement of U.S.
residential real estate markets. SSRN eLibrary.
Krinsky, Itzhak and A. L. Robb. 1986. "On Approximating the Statistical Properties of
Elasticities" The Review of Economics and Statistics, 68(4): 715-719.
------. 1990. "On Approximating the Statistical Properties of Elasticities: A Correction"
The Review of Economics and Statistics, 72(1): 189-190.
Lakdawalla, Darius, Neeraj Sood, and Dana Goldman. 2006. "HIV Breakthroughs and
Risky Sexual Behavior" Quarterly Journal of Economics, 121(3): 1063-1102.
Maddala, G. S. 1983. Limited-Dependent and Qualitative Variables in
Econometrics, 122. Cambridge University Press.
Marks, Gary, Nicole Crepaz, and Robert S. Janssen. 2006. "Estimating Sexual
Transmission of HIV from Persons Aware and Unaware that they are Infected with
the Virus in the USA" AIDS, 20(10): 1447-1450.
doi: 10.1097/01.aids.0000233579.79714.8d.
Marks, Gary, Nicole Crepaz, J. W. Senterfitt, and Robert S. Janssen. 2005. "Meta-
Analysis of High-Risk Sexual Behavior in Persons Aware and Unaware they are
Infected with HIV in the United States: Implications for HIV Prevention
Programs" JAIDS Journal of Acquired Immune Deficiency Syndromes, 39(4): 446-
453.
McMillen, D. P., 2008. Changes in the distribution of house prices over time: Structural
characteristics, neighborhood, or coefficients? Journal of Urban Economics 64,
573-589.
McWilliams, Michael J. 2009. "Health Consequences of Uninsurance among Adults in
the United States: Recent Evidence and Implications" Milbank Quarterly, 87(2):
443-494.
Monfardini, Chiara and Rosalba Radice. 2008. "Testing Exogeneity in the Bivariate
Probit Model: A Monte Carlo Study" Oxford Bulletin of Economics and Statistics,
70(2): 271-282.
Otten, M. W., Jr., A. A. Zaidi, J. E. Wroten, J. J. Witte, and T. A. Peterman. 1993.
"Changes in Sexually Transmitted Disease Rates After HIV Testing and Posttest
Counseling, Miami, 1988 to 1989." American Journal of Public Health, 83(4):
529-533.
Philipson, Tomas J. and Richard A. Posner. 1995. "A Theoretical and Empirical
Investigation of the Effects of Public Health Subsidies for STD Testing" The
Quarterly Journal of Economics, 110(2): 445-474.
Quigley, J. M., 1995. A simple hybrid model for estimating real estate price indexes.
Journal of Housing Economics 4, 1-12.
Schmidt, P., 1981, Constraints on the parameters in simultaneous Tobit and probit models,
in: C. Manski and D. McFadden, eds., Structural analysis of discrete data with
econometric applications (MIT Press, Cambridge, MA) 422-434.
Shore-Sheppard, Lara, Thomas C. Buchmueller, and Gail A. Jensen. 2000. "Medicaid
and Crowding Out of Private Insurance: A Re-Examination using Firm Level
Data" Journal of Health Economics, 19(1): 61-91.
Staiger, Douglas and James H. Stock. 1997. "Instrumental Variables Regression with
Weak Instruments" Econometrica, 65(3): pp. 557-586.
Stock, J. H., Watson, M. W., 2002. Forecasting using principal components from a large
number of predictors. Journal of the American Statistical Association 97, 1167-
1179.
Thornton, Rebecca L. 2008. "The Demand for, and Impact of, Learning HIV Status"
American Economic Review, 98(5): 1829-1863.
Topel, R., Rosen, S., 1988. Housing investment in the United States. The Journal of
Political Economy 96, pp. 718-740.
Weinhardt, L. S., M. P. Carey, B. T. Johnson, and N. L. Bickham. 1999. "Effects of HIV
Counseling and Testing on Sexual Risk Behavior: A Meta-Analytic Review of
Published Research, 1985-1997." American Journal of Public Health, 89(9): 1397-
1405.
Wilde, Joachim. 2000. "Identification of Multiple Equation Probit Models with
Endogenous Dummy Regressors" Economics Letters, 69(3): 309-312.
Appendix A: Reasons for HIV Testing
Table 11 Reasons for Testing
| Test Reasons (all reasons, %) | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 |
| For hospitalization or surgical procedure | 6.86 | 5.77 | 6.04 | 5.77 | 5.86 | 5.53 |
| To apply for health insurance | 2.41 | 2.2 | 2.41 | 2.2 | 2.17 | 1.65 |
| To apply for life insurance | 6.6 | 6.45 | 6.15 | 6.45 | 6.11 | 5.82 |
| For employment | 4.47 | 4.71 | 5.39 | 4.71 | 4.91 | 4.01 |
| To apply for a marriage license | 2.99 | 0.9 | 1.16 | 0.9 | 0.81 | 3.58 |
| For military induction or military service | 4.91 | 5.53 | 6.73 | 5.53 | 5.9 | 4.47 |
| For immigration | 1.71 | 0.72 | 1.07 | 0.72 | 1.09 | 2.02 |
| Just to find out if you were infected | 20.92 | 18.33 | 19.35 | 18.33 | 17.85 | 20.85 |
| Because of referral by a doctor | 1.09 | 1.15 | 1.62 | 1.15 | 1.21 | 1.19 |
| Because of pregnancy | 10.53 | 15.33 | 14.79 | 15.33 | 15.55 | 15.36 |
| Referred by your sex partner | 0.72 | 0.76 | 0.97 | 0.76 | 0.97 | 1.13 |
| Because it was part of a blood donation process | 10.22 | 1.02 | 1.23 | 1.02 | 1.26 | 0.14 |
| For routine checkup | 14.49 | 23.82 | 19.6 | 23.82 | 22.38 | 18.08 |
| Because of occupational exposure | 3.09 | 2.67 | 3.06 | 2.67 | 2.59 | 2.84 |
| Because of illness | 2.12 | 2.01 | 2.21 | 2.01 | 1.92 | 1.76 |
| Because I am at risk for HIV | 0.84 | 1 | 1.11 | 1 | 1.09 | 1.41 |
| Don't know/Not sure | 0.58 | 0.43 | 0.44 | 0.43 | 0.66 | 1.02 |
| Other | 4.74 | 6.57 | 6.13 | 6.57 | 6.94 | 7.16 |
| Refused | 0.72 | 0.6 | 0.54 | 0.6 | 0.76 | 1.97 |
Appendix B: Definitions of Variables for the First Chapter
Table 12 Definitions of Variables for the First Chapter
| Variable | Description |
| Non-White or Hispanic | 1 if non-white or Hispanic, 0 otherwise |
| Female | 1 for female and 0 for male |
| Married | 1 if married, 0 otherwise |
| Income below 200% FPL | 1 if income is below 200% of the federal poverty line, 0 otherwise |
| High School Degree | 1 if the highest qualification is a high school degree, 0 otherwise |
| Some college or tech school | 1 if the highest qualification is some college or tech school degree, 0 otherwise |
| College graduate or higher | 1 if the highest qualification is a college degree or higher, 0 otherwise |
| Health plan coverage | 1 if an individual has some kind of health care coverage, including health insurance, prepaid plans such as HMOs, or government plans such as Medicare, 0 otherwise |
| HIV Testing | 1 if an individual was tested for HIV sometime in the year preceding their interview date, 0 otherwise |
| 1115 Waiver | the percentage of Medicaid 1115 waiver beneficiaries among total Medicaid beneficiaries |
| Employment at medium size firm | the percentage of workers employed in firms with 100 to 499 employees |
| Employment at large size firm | the percentage of workers employed in firms with 500 or more employees |
Appendix C: Definitions of Variables for the Second Chapter
Table 13 Macroeconomic Variables (Macro)
| Category | Notation | Detail |
| Income | RealPersonalIncome | Real Personal Income |
| | RealDisposableIncome | Real Disposable Personal Income |
| GDP | GDP | Gross Domestic Product |
| Consumption | RealPersonalConsumption | Real Personal Consumption Expenditures |
| | ConsumptionDurableGoods | Personal Consumption Expenditures: Durable Goods |
| | ConsumptionNondurableGoods | Personal Consumption Expenditures: Nondurable Goods |
| | ConsumptionServices | Personal Consumption Expenditures: Services |
| Investment | PrivateResidentialFixedInvest | Private Residential Fixed Investment |
| | PrivateNonresidentialFixedInvest | Private Nonresidential Fixed Investment |
| | FixedPrivateInvest | Fixed Private Investment |
| | PrivateSaving | Gross Private Saving |
| | IndustrialProduction | Industrial Production Index |
| Employment | CivilianEmployment | Civilian Employment of All Persons in United States |
| | CivilianUnemploymentRate | Civilian Unemployment Rate |
| Prices | OilPrice | Spot Oil Price: West Texas Intermediate |
| | PPI | Producer Price Index: All Commodities |
| | CPI | Consumer Price Index of All Items in United States |
| | M1 | M1 Money Stock |
| | GDPDeflator | Gross Domestic Product: Implicit Price Deflator |
| Finance | TreasuryBill3m | 3-Month Treasury Bill: Secondary Market Rate |
| | TreasuryBill6m | 6-Month Treasury Bill: Secondary Market Rate |
| | TreasuryBill1y | 1-Year Treasury Bill: Secondary Market Rate |
| | Treasury5y | 5-Year Treasury Constant Maturity Rate |
| | Treasury10y | 10-Year Treasury Constant Maturity Rate |
| | FFR | Effective Federal Funds Rate |
| | AAA | Moody's Seasoned Aaa Corporate Bond Yield |
| | BAA | Moody's Seasoned Baa Corporate Bond Yield |
| | CD1m | 1-Month Certificate of Deposit: Secondary Market Rate |
| | CD3m | 3-Month Certificate of Deposit: Secondary Market Rate |
| | CD6m | 6-Month Certificate of Deposit: Secondary Market Rate |
Abstract
This dissertation consists of two empirical studies on panel data models. In the first chapter, we study a pseudo panel data set in which both the outcome variables and the variables of interest are binary. We discuss the identification of a recursive bivariate probit model and estimate the average treatment effects of health insurance and new treatment technology on the probability of being tested for HIV. In the second chapter, we study genuine panel data and apply factor analysis to Metropolitan Statistical Area (MSA)-level Housing Price Indexes (HPIs). We estimate the number of common factors using the information criteria of Bai and Ng (2002), and measure the closeness of the links between macroeconomic variables and the set of common factors.

The first chapter investigates the effects of health insurance and new antiviral treatments on HIV testing rates among the U.S. general population. A theoretical model is developed in which an agent decides whether or not to undergo HIV testing. This decision is determined by the value of early treatment and the value of identifying HIV-negative status. We test the predictions of the theoretical model using nationally representative data from the Behavioral Risk Factor Surveillance System (BRFSS) for the years 1993 to 2002. We estimate a recursive bivariate probit model, with insurance coverage and HIV testing as the dependent variables. We use changes in Medicaid eligibility and in the distribution of firm size over time within a state as exclusion restrictions for insurance coverage. Using a bootstrap method, we estimate robust confidence intervals for the average treatment effects. Consistent with the theoretical model, the results suggest that (a) insurance coverage increases HIV testing rates, (b) insurance coverage increases HIV testing rates more among the high-risk population, and (c) the advent of Highly Active Antiretroviral Therapy (HAART) increases the effects of insurance coverage on HIV testing for high-risk populations.

The second chapter aims to identify common factors underlying fluctuations in the Metropolitan Statistical Area (MSA)-level Housing Price Indexes (HPIs). More importantly, we examine whether the observed macroeconomic variables are exact factors. For robustness, we study the two most popular housing price indexes: the Office of Federal Housing Enterprise Oversight (OFHEO) repeat sales index, and the Mortgage Risk Assessment Corporation (MRAC) median home price index. The methodology follows a two-step procedure. First, we look at several information criteria to determine the number of common factors that underlie fluctuations in the MSA-level HPIs. Next, we measure the overall closeness of the links between each macroeconomic variable and the set of common factors, using the factors estimated in the first step. Our results suggest that only a small number of factors capture the main co-movements of housing price time series data in the U.S.: the IC and PC criteria of Bai and Ng (2002) both suggest the presence of four factors, while the criterion of Ahn and Horenstein (2009) finds only one latent factor in the OFHEO panel. The first factor, called the "summary measure," closely tracks the national index. The degree of closeness between this summary measure and the national index reflects the accuracy of the weights assigned to census divisions in constructing the national index. We find a geographical pattern of factor loadings, which is useful in defining submarkets. Our comparison study shows that the MRAC HPIs are more volatile than the OFHEO HPIs. As a result, a larger number of factors are extracted.

Finally, our findings show that GDP, personal consumption, fixed private investment, employment growth, unemployment rates, Treasury bill market rates, and certificate of deposit market rates are strongly correlated with the housing market. Financial markets have contemporaneous effects on the housing market, while investment and consumption show lagged effects. The measures in the second step reconfirm the presence of four latent factors in the housing price markets.
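The first step of the two-step procedure can be illustrated with a short sketch. Assuming the eigenvalues of XX'/(NT) for a standardized N-by-T panel X have already been computed, it evaluates the IC_p2 criterion of Bai and Ng (2002) and the eigenvalue ratio of Ahn and Horenstein (2009). The function names, the choice of IC_p2 among the available criteria, and the example eigenvalues are assumptions made for illustration; this is not the code used in the dissertation.

```python
import math


def ic_p2(eigenvalues, n_units, n_periods, k):
    """IC_p2(k) = ln V(k) + k * ((N + T) / (N * T)) * ln(min(N, T)).

    `eigenvalues` must be sorted in descending order. With principal
    components, V(k) equals the sum of the eigenvalues left after the
    k largest have been removed.
    """
    v_k = sum(eigenvalues[k:])
    penalty = k * (n_units + n_periods) / (n_units * n_periods)
    penalty *= math.log(min(n_units, n_periods))
    return math.log(v_k) + penalty


def select_by_ic_p2(eigenvalues, n_units, n_periods, k_max):
    """Number of factors in 0..k_max that minimizes IC_p2."""
    eig = sorted(eigenvalues, reverse=True)
    return min(range(k_max + 1),
               key=lambda k: ic_p2(eig, n_units, n_periods, k))


def select_by_eigenvalue_ratio(eigenvalues, k_max):
    """Number of factors in 1..k_max that maximizes mu_k / mu_{k+1}."""
    eig = sorted(eigenvalues, reverse=True)
    return max(range(1, k_max + 1), key=lambda k: eig[k - 1] / eig[k])


# Hypothetical eigenvalues for a panel with N = 20 and T = 100:
# two dominant eigenvalues followed by a flat tail.
example_eigenvalues = [8.0, 4.0] + [0.1] * 18
k_ic = select_by_ic_p2(example_eigenvalues, n_units=20, n_periods=100, k_max=5)
k_er = select_by_eigenvalue_ratio(example_eigenvalues, k_max=5)
```

Both functions require k_max to be smaller than the number of eigenvalues supplied. The two criteria need not agree on real data, as the OFHEO results above show.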
Asset Metadata
Creator
Wu, Yanyu (author)
Core Title
Essays in panel data analysis
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
08/13/2012
Defense Date
08/11/2012
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
factor analysis,HIV testing,housing prices,OAI-PMH Harvest,panel data
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Hsiao, Cheng (committee chair), Nugent, Jeffrey B. (committee member), Sood, Neeraj (committee member)
Creator Email
yanyuwu@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-89210
Unique identifier
UC11288371
Identifier
usctheses-c3-89210 (legacy record id)
Legacy Identifier
etd-WuYanyu-1154.pdf
Dmrecord
89210
Document Type
Dissertation
Rights
Wu, Yanyu
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA