Two essays on financial econometrics
(USC Thesis Other)
TWO ESSAYS ON FINANCIAL ECONOMETRICS

By Junbo Wang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BUSINESS ADMINISTRATION)

August 2014

Copyright 2014 Junbo Wang

Dedication

To my beloved parents, Guohua Li and Yicheng Wang, and to my lovely girlfriend Xia Meng, who helped and encouraged me throughout my dissertation.

Acknowledgements

I am grateful to my advisor, Wayne Ferson, as well as Professors Chris Jones, David Solomon, Scott Joslin, Jinchi Lv, Selale Tuzel, Andreas Stathopoulos and Roger Moon for their guidance and valuable comments on my dissertation. I also want to thank Professors Richard Roll and Kuntara Pukthuanthong for coauthoring the second chapter of this dissertation. I would like to express my appreciation to Professors Oguz Ozbas, Aris Protopapadakis, C.F. Lee, Martijn Cremers, John Bai, Jerchern Lin, Derek Horstmeyer, Haitao Mo, Tong Wang and seminar participants at University of Southern California, California State University, Fullerton, Rutgers Business School, Fordham University, Brattle Group, Fannie Mae, Research Affiliates and FMA 2013 Doctoral Student Consortium, for helpful suggestions.
Table of Contents

Acknowledgements
List of Tables
Abstract
Chapter 1: Can weight-based measures distinguish between informed and uninformed fund managers?
1.1: Introduction
1.2: The panel predictive regression system and performance measures
1.2.1 The panel predictive regression system
1.2.2 Estimating the predictive system
1.2.3 The classical weight-based performance measures and the predictive system
1.2.4 The power of the weight-based performance measure and WLS measures
1.3: Data and summary statistics
1.4: Simulation methods and results
1.4.1 Simulation methods
1.4.2 Simulation results for uninformed strategies
1.4.3 The simulation results for informed strategies
1.5: Empirical results using mutual fund data
1.5.1 Model misspecification and the relative size of the stocks
1.5.2 Persistence of the bias-adjusted DGTW CS performance measures
1.5.3 Revisiting previous results
1.6: Conclusion
Chapter 2: Resolving the Errors-in-Variables Bias in Risk Premium Estimation
2.1: Introduction
2.2: The Instrumental Variable Approach
2.2.1 Assumptions and classical methods
2.2.2 The instrumental variable method
2.2.3 An improved IV method for risk premium estimation
2.2.4 Theil's Adjustment
2.2.5 GMM
2.3: Asymptotic Distributions
2.3.1 The asymptotic distribution of the IV method
2.3.2 A simple way to calculate standard errors
2.4: Data and Simulation Results
2.4.1 Simulations
2.4.2 Simulation Results for the Fama-French Three-Factor (traded factors) Model
2.4.3 Macroeconomic Factor premiums
2.4.4 Time-varying factor loadings and regression residuals
2.4.5 Standard Errors
2.4.6 Traded Factors
2.4.7 Macro Factors
2.4.8 T Statistics
2.5: Application of these methods
2.6: Conclusion

List of Tables

Table 1.1: Summary statistics of the mutual funds data
Table 1.2: The means and T-statistics of different performance measures: buy-and-hold strategy (quarterly)
Table 1.3: The means and T-statistics of different performance measures: momentum strategy (quarterly)
Table 1.4: The means and T-statistics of different performance measures: random selection strategy (quarterly)
Table 1.5: The means and T-statistics of different performance measures: buy-and-hold strategy, value weighted initial portfolios (quarterly)
Table 1.6: The means and T-statistics of different performance measures: momentum strategy, value weighted initial portfolios (quarterly)
Table 1.7: The means and T-statistics of different performance measures: random selection strategy, value weighted initial portfolios (quarterly)
Table 1.8: Estimated value and standard errors of the informed strategy (quarterly)
Table 1.9: Rejection ratio of the informed strategy (quarterly)
Table 1.10: Estimated value and standard errors of the informed strategy (created monthly)
Table 1.11: Rejection ratios of the informed strategy (created monthly)
Table 1.12: Estimated value and standard errors of the informed strategy (monthly)
Table 1.13: Rejection ratio of the informed strategy (monthly)
Table 1.14: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative size of the stocks (quarterly)
Table 1.15: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative size of the stocks (monthly)
Table 1.16: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative book-to-market value of the stocks (quarterly)
Table 1.17: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative book-to-market value of the stocks (monthly)
Table 1.18: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative past returns of the stocks (quarterly)
Table 1.19: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative past returns of the stocks (monthly)
Table 1.20: The value, asymptotic standard deviation and T-statistics of different measures grouped by past performance (monthly, 5 years)
Table 1.21: The value, asymptotic standard deviation and T-statistics of different measures grouped by return gap (quarterly, from 1984 to 2007)
Table 1.22: The value, asymptotic standard deviation and T-statistics of different measures grouped by return gap (monthly, from 1984 to 2007)
Table 1.23: Estimated value and standard errors of an adjusted informed strategy (quarterly)
Table 1.24: Rejection ratio of an adjusted informed strategy (quarterly)
Table 1.25: Estimated value and standard errors of an adjusted informed strategy (created monthly)
Table 1.26: Rejection ratios of an adjusted informed strategy (created monthly)
Table 1.27: Estimated value and standard errors of an adjusted informed strategy (monthly)
Table 1.28: Rejection ratio of an adjusted informed strategy (monthly)
Table 2.1: Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (25 portfolios, Monthly Data)
Table 2.2: Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (149 portfolios, Monthly Data)
Table 2.3: Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (4970 stocks, Monthly Data)
Table 2.4: Three methods to estimate macro three-factor risk premiums using the bootstrap (25 portfolios, Monthly Data)
Table 2.5: Three methods to estimate macro three-factor risk premiums using the bootstrap (149 portfolios, Monthly Data)
Table 2.6: Three methods to estimate macro three-factor risk premiums using the bootstrap (4970 portfolios, Monthly Data)
Table 2.7: Various methods to estimate Fama-French three-factor risk premiums with time-varying factor loadings and regression residuals (4970 stocks Monthly Data)
Table 2.8: Three methods to estimate macro three-factor risk premiums using the bootstrap with time-varying factor loadings and regression residuals (4970 portfolios, Monthly Data)
Table 2.9: The standard error for Fama-French three-factor risk premiums (25 portfolios Monthly Data)
Table 2.10: The standard error for Fama-French three-factor risk premiums (149 portfolios Monthly Data)
Table 2.11: The standard error for Fama-French three-factor risk premiums (4970 portfolios Monthly Data)
Table 2.12: The standard error for macro-factor risk premiums (25 portfolios Monthly Data)
Table 2.13: The standard error for macro-factor risk premiums (149 portfolios Monthly Data)
Table 2.14: The standard error for macro-factor risk premiums (4970 portfolios Monthly Data)
Table 2.15: The rejection ratio of t-statistics with non-hypothesis (4970 portfolios Monthly Data)
Table 2.16: Estimated risk premiums for three macro factors (OLS)
Table 2.17: Estimated risk premiums for three macro factors (GLS)

Abstract

My dissertation contains two chapters. The first chapter is "Can weight-based measures distinguish between informed and uninformed fund managers?" This is based on my job market paper. This paper studies weight-based mutual fund performance measures in a panel predictive regressions framework, where future stock returns are regressed on a fund's portfolio weights. Existing performance measures suffer biases related to benchmark misspecification and are statistically inefficient. We introduce bias-adjusted and weighted least squares (WLS) measures. Simulations show that new methods can effectively control bias and improve power, compared with existing measures.
We also apply the existing and newly introduced measures in empirical examples. Using bias-adjusted measures and efficient measures can lead to different conclusions about managers' abilities. Disclaimer: The power results depend on the way in which I calibrated an informed manager, and if this is a poor representation of reality then the power results may not be reliable.

The second chapter is "Resolving the Errors-in-Variables Bias in Risk Premium Estimation", which is co-authored with Kuntara Pukthuanthong and Richard Roll. The Fama-MacBeth (1973) rolling-β method is widely used for estimating risk premiums, but its inherent errors-in-variables bias remains an unresolved problem, particularly when using individual assets or macroeconomic factors. We propose a solution with a particular instrumental variable, calculated from alternate observations. The resulting estimators are unbiased. In simulations, we compare this new approach with several existing methods. The new approach corrects the bias even when the sample period is limited. Moreover, our proposed standard errors are unbiased and lead to correct rejection size in finite samples. With this approach, we find that macro factors, such as consumption growth, can significantly affect stock returns.

Chapter 1

Can weight-based measures distinguish between informed and uninformed fund managers?

1.1 Introduction

The mutual fund industry is gaining increasing importance in the U.S. economy. At the end of 2011, there existed some 14,000 mutual funds with over 13 trillion dollars in assets under management. One particular question of interest is whether the managers of these delegated portfolios have superior information. If so, we would expect them to put more weight on the stocks with higher future returns, i.e., their portfolio weights in the current period can predict future stock returns (Grinblatt and Titman (1989b)).
Although several weight-based measures [1] have been developed and widely used to capture managers' ability, there is no coherent framework that integrates them. Also, perhaps surprisingly, the literature has not thoroughly discussed the inherent issues that can arise when using these measures in mutual fund studies. This paper attempts to fill this gap.

[1] Researchers also use return-based measures to study managers' ability. Weight-based measures have gained increasing attention because they capture funds' "hypothetical returns", while return-based measures capture after-cost returns.

An effective weight-based performance measure should distinguish informed managers from uninformed ones. On the one hand, various strategies might be considered uninformed, and an unbiased measure should assign no ability to managers choosing such strategies. On the other hand, an efficient measure should have enough power to identify informed strategies. In this paper, we find that the existing measures are likely to misclassify some commonly accepted uninformed strategies, and that the measures have relatively low power to detect informed strategies. Moreover, with the framework that we will introduce, we propose bias-adjusted and weighted least squares (WLS) measures to control for these issues.

We first introduce the existing weight-based measures and discuss the issues associated with them. The most commonly used measure is the DGTW (Daniel, Grinblatt, Titman, and Wermers (1997)) characteristics selectivity (CS) measure, which is constructed in the following way: First, for each stock held by a fund, they form a "matching" portfolio of stocks with similar characteristics such as size, value and past returns. Then for any fund, one takes the difference between the return of each stock in the fund and its matching benchmark return [2], and the weighted sum of these differences gives the fund's DGTW CS measure.
Intuitively, a manager is informed if the stocks she selects can beat the matching portfolios, thus producing a positive DGTW CS measure. However, there is still heterogeneity in size (market capitalization) of the stocks in each DGTW benchmark portfolio. As a result, if the smaller stocks within a portfolio outperform the larger ones, this leads to a positive DGTW benchmark-adjusted return even for uninformed portfolios that place a higher weight on relatively smaller stocks [3]. Consider a random selection strategy where the manager starts with a randomly selected portfolio in each period with equally weighted stocks. Such an uninformed strategy overweights the relatively smaller stocks relative to the DGTW benchmark portfolio [4], resulting in a positive DGTW CS measure. However, a positive DGTW CS measure does not reflect a manager's information; rather it reflects the bias of the measure itself. A similar logic holds for other uninformed strategies such as buy-and-hold and momentum strategies. In fact, any misspecified benchmark can lead to biased evaluation. If the average return of a stock differs from its benchmark portfolio return, any strategy placing high weights on stocks with positive average benchmark-adjusted returns can be positively biased [5].

[2] We will call the benchmark return the DGTW benchmark, and the difference between the return of each stock in the fund and its matching benchmark return the DGTW benchmark-adjusted return.

[3] We use the term "relatively small (large)" to refer to stocks of smaller (larger) market capitalization than that of the same benchmark portfolio.

[4] Recall that in each DGTW benchmark portfolio, the stocks are weighted based on their market capitalization.

Another measure we focus on is the weight-change measure (Grinblatt and Titman (1993)), which is the difference in the next-period returns between the manager's current portfolio and the one held at some point in the past.
Using this measure, a manager is informed when her current portfolio achieves higher next-period returns than did the past portfolio. This measure can also be biased. For example, if an uninformed manager engages in a buy-and-hold strategy, stocks with high average returns account for larger weights in the current portfolio than in the past portfolio. Thus, misspecified benchmarks can also lead to a positively biased weight-change measure.

Although these benchmark issues differ across measures, they can be controlled in the same way. In this paper, we introduce a panel predictive regression system where stock returns or excess returns for the next quarter are regressed on the portfolio weights of a fund in the current quarter and a dummy variable for each stock, i.e., a firm fixed effect. We show algebraically the relationship between the existing measures and the slope coefficient of the regression. We also illustrate, using simulations, the bias associated with each measure, and offer solutions for correcting the biases.

We show that the DGTW CS measure is equivalent to the numerator of the slope coefficient obtained by regressing the difference between the future stock return and its DGTW benchmark return on the current portfolio weights, without the stock dummy in the panel predictive regression system.

[5] For example, in the CAPM, one regresses the stock's excess return on the market portfolio return and a constant term, and uses the product of its factor loading and the market portfolio return as the benchmark return. If the CAPM is misspecified, the constant term, α, captures the model misspecification. In the context of a buy-and-hold strategy, the portfolio weights become higher for the stocks with higher returns as time goes by. If the stock return is also positively correlated with α, the correlation between the weights and the α's leads to non-zero values for the performance measure, even for an uninformed buy-and-hold strategy.
However, since the average values of the DGTW benchmark-adjusted returns are different from zero (misspecified benchmarks), the estimator in the predictive regression without dummies can be biased. With the introduction of the stock dummies, the difference estimator of the panel regression can remove the fixed effect from the DGTW benchmark-adjusted returns, thus correcting the bias of the performance measures. However, the difference estimator still suffers from a form of the "Stambaugh bias" (Stambaugh (1999)), and we provide a method to correct it by using the lagged weight change as an instrumental variable. Similarly, we build the connection between the weight-change measure and a panel regression, and provide an estimator that is free of biases for this measure.

We also discuss the efficiency and power of performance measures. In particular, since existing measures correspond to Ordinary Least Squares (OLS) estimators in the panel regression framework, they implicitly assume that errors in the regression are homogeneous. However, this is unlikely since the standard deviations of stock returns vary across stocks. Weighted least squares (WLS) estimators, which normalize the returns and weights by the standard deviations of the stock returns, can significantly improve efficiency.

We use simulations to illustrate these issues. Specifically, to test whether the measures might misclassify uninformed managers, we simulate three types of strategies for an uninformed manager: (1) a buy-and-hold strategy, (2) a momentum strategy and (3) a random selection strategy, where the manager randomly rebalances her portfolio. The simulation results are compelling: the weight-change measure exhibits a positive bias for buy-and-hold and momentum strategies. For instance, in the momentum strategy, the weights drift toward the stocks with higher past returns and create a positive bias for the weight-change measure. The bias will also affect the distribution of the T-statistics.
The simulations reveal that the 95% empirical critical value of the T-statistics can be as large as 3.3. Such results call into question the findings of many existing studies that managers have abilities. The DGTW measure is also biased in our simulations. In particular, the critical values of the T-statistics are much larger than those for the weight-change measure. The results for the random selection strategy are noteworthy: the simulations reveal that the biases for most of the existing performance measures are much smaller, but are still positive (especially for the DGTW CS measures), and the critical values for the T-statistics also differ from those of the standard normal distribution. We show that the bias-adjusted measures effectively control the bias of the existing measures, and that the distributions of their T-statistics are closer to a standard normal.

We also evaluate the statistical power of the existing performance measures using simulations. The WLS measures show higher statistical power than the unadjusted measures for an informed strategy. Moreover, simulations show that using the quarterly holdings to create monthly weights and monthly performance measures (Kacperczyk, Sialm and Zheng (2008), Busset and Tong (2011)) does improve the efficiency of the performance measures.

In addition to the simulations, we also apply different performance measures to real data. We first show that the bias can empirically affect our judgement about a manager's information. If the relative sizes can lead to biased estimates for the DGTW CS measure, then managers holding relatively small stocks should outperform managers holding relatively large stocks (based on the biased measures). We find that the funds holding relatively small stocks "significantly outperform" the funds holding relatively large stocks, using the original DGTW CS measure.
However, with the bias-adjusted WLS measure, the difference between the funds in the above two groups shrinks to one-seventh of that obtained with the DGTW CS measure, and is insignificant at the 10% level. The relative book-to-market value and the relative past returns also affect the DGTW CS measure. We find that the funds with value or high past-return stocks outperform the funds with growth or low past-return stocks using the DGTW CS measure, and the results disappear with the bias-adjusted measures.

Next, we turn to the persistence of the DGTW CS measures. To this end, we group the funds according to different past performance measures, and examine the next-period performance measures. Our findings show that the bias-adjusted measures are persistent, suggesting that the managers who choose stocks based on information can consistently do so, and thus this is detectable when the measures are unbiased.

As a final exercise, we apply the new measures to a recent empirical finding, the return gap (Kacperczyk, Sialm and Zheng (2008)). Grouping the funds according to their return gaps, we estimate the performance measures for each fund group and the difference in performance measures between the high and low groups. With the bias-adjusted DGTW measure, we find that the funds with higher return gaps do not significantly outperform those with lower return gaps, which contradicts the result obtained with the original DGTW measure.

Our paper is among the few that study the statistical properties of mutual fund weight-based performance measures: Ferson and Khang (2002) use simulations to examine conditional weight-based performance measures. Jiang, Yao and Tong (2007) use simulations to study weight-based timing measures. However, to the best of our knowledge, there is no paper to date that examines the statistical properties of the various measures in a unified framework, which permits comparisons of the various measures with different strategies. Our paper fills this gap.
We also develop the unbiased WLS measure and show that it has improved power. This complements the earlier findings on power by Kothari and Warner (2001), who compare the power of the DGTW CS measure with that of return-based performance measures.

Our paper expands the literature on the benchmark choice in performance measures, going back to Roll (1978), Lehman and Modest (1989) and others. Recently, Cremers, Petajisto and Zitzewitz (2010) show that the equally weighted Fama-French factors can lead to "positive performance" for passive managers in return-based measures. Chan, Dimmock and Lakonishok (2009) study the effect of benchmarks on weight-based measures, and they find that different methods for constructing the benchmarks can lead to different performances for the same funds. Our paper is also related to these benchmark issues, but we focus on whether specific strategies can magnify the impact of misspecified benchmarks, which leads to a bias in judging managers' abilities. We also provide a method to correct the bias.

Our work also contributes to the existing literature on predictive regressions (see Stambaugh (1999), Pastor and Stambaugh (2009), Amihud and Hurvich (2004) and Amihud et al. (2009)). The paper in this area that is most closely related to our work is Hjalmarsson (2007), who develops a panel predictive regression framework and shows that the Stambaugh bias can affect the mean estimator of the coefficient. We extend the result to the difference estimator, for which we also provide an instrumental variable method to correct its bias. We believe such bias corrections can be useful for future research.

The remainder of this paper is organized as follows: Section 2 introduces the panel regression framework and shows its relation to the different performance measures. The bias and efficiency issues of the existing measures are presented, and we offer methods to control these issues. Section 3 discusses the data sources.
Section 4 presents the estimated values, T-statistics and power of the performance measures with simulated data. Section 5 applies the existing and the new methods to real data. Section 6 concludes.

1.2 The panel predictive regression system and performance measures

1.2.1 The panel predictive regression system

An informed manager should predict future stock returns. Weight-based measures examine whether the weights of the stocks the manager chooses can predict future stock returns. In this paper, we introduce a predictive regression framework for weight-based performance measures. Specifically, for each fund manager, we use the following predictive regression (assuming that the portfolio contains $N$ stocks and lasts $T$ periods):

$$r^i_{t+1} = \alpha_i D_i + \beta w^i_t + \varepsilon^i_{t+1}. \tag{1}$$

In this regression, $r^i_{t+1}$ is the return of stock $i$ at time $t+1$, and $w^i_t$ is the weight of stock $i$ held by the manager at time $t$ (weights are zero when the stocks are not held by the manager). Moreover, the slope coefficient $\beta$ captures the manager's ability to use weights to predict future stock returns; thus, the estimated $\beta$ should be positive when the manager is informed [6].

[6] Empirically, the manager may predict the future returns of some stocks, but not others. In this case, $\beta$ represents the average ability to predict future stock returns. On the other hand, one can consider a panel regression with a different $\beta_i$ for each stock $i$. This case can be studied in future work.

One important feature of this regression is the stock dummy variable $D_i$. It takes the value of one if the return belongs to stock $i$ and zero otherwise. The coefficient of this dummy variable, $\alpha_i$, represents the expected return of stock $i$ under the null hypothesis ($\beta = 0$). We use this dummy variable to replace the constant term in the regression since the stocks have different expected returns. In addition, we assume that the standard deviations of the error terms $\varepsilon^i_{t+1}$ are heterogeneous for different stocks, and these error terms can be cross-sectionally correlated, i.e., $\mathbf{e}_{t+1} \sim (0, \Sigma)$, where $\mathbf{e}_{t+1} = [\varepsilon^1_{t+1}, \ldots, \varepsilon^N_{t+1}]$.

The dependent variables in the panel regression can be the excess returns with respect to certain benchmarks. For example, if $r^{D_{i,t}}_{t+1}$ is a benchmark return for stock $i$ at time $t+1$, the dependent variable in the regression becomes $r^i_{t+1} - r^{D_{i,t}}_{t+1}$ (the excess return of stock $i$). One of the most widely used benchmarks is the DGTW benchmark, and the construction of the benchmark return will be presented in the next subsection.

To capture the fact that the manager's weights are highly autocorrelated, we assume that the manager's weights follow an AR(1) model:

$$w^i_{t+1} = \rho w^i_t + \eta^i_{t+1}. \tag{2}$$

Equations 1 and 2 form a "predictive system" (e.g., Pastor and Stambaugh (2009)) that facilitates our analysis. We make two standard assumptions about this predictive system.

Assumption 1: For all $j > 0$, $\varepsilon^i_{t+j}$ is uncorrelated with $w^i_t$.

Assumption 2: For all $j > 0$, $\eta^i_{t+j}$ is uncorrelated with $w^i_t$.

1.2.2 Estimating the predictive system

The predictive panel regression, i.e., equation 1, can be estimated with different methods. One approach is the mean estimator, as discussed in Hjalmarsson (2007). Our focus is on another approach, the difference estimator. Take the difference between equation 1 and the same equation $\tau$ periods before:

$$r^i_{t+1} - r^i_{t+1-\tau} = \beta (w^i_t - w^i_{t-\tau}) + \varepsilon^i_{t+1} - \varepsilon^i_{t+1-\tau}. \tag{3}$$

The difference estimator is

$$\hat{\beta}_{diff} = \frac{\sum_i \sum_t (w^i_t - w^i_{t-\tau})(r^i_{t+1} - r^i_{t+1-\tau})}{\sum_i \sum_t (w^i_t - w^i_{t-\tau})(w^i_t - w^i_{t-\tau})}. \tag{4}$$

We can also obtain difference estimators for the coefficient $\beta$ when we replace the returns in equation 1 with the DGTW adjusted returns.

Both the mean and difference estimators suffer from a "Stambaugh bias". Hjalmarsson (2007) discusses the bias of the mean estimator, and the same argument can be applied to the difference estimator. The numerator of the difference estimator is:
$$\sum_i \sum_t (w^i_t - w^i_{t-\tau})(r^i_{t+1} - r^i_{t+1-\tau}). \tag{5}$$

If we plug $r^i_{t+1} = \alpha_i D_i + \beta w^i_t + \varepsilon^i_{t+1}$ into equation 5, and impose the null hypothesis $\beta = 0$, then the result, which should have an expected value of $0$, becomes:

$$\sum_i \sum_t (w^i_t - w^i_{t-\tau})(\varepsilon^i_{t+1} - \varepsilon^i_{t+1-\tau}). \tag{6}$$

Using equation 2,

$$w^i_t - w^i_{t-\tau} = \sum_{j=0}^{\tau-1} \rho^j \eta^i_{t-j} - (1 - \rho^\tau) w^i_{t-\tau}. \tag{7}$$

With equation 7, equation 6 becomes:

$$\sum_i \sum_t \Big( \sum_{j=0}^{\tau-1} \rho^j \eta^i_{t-j} - (1 - \rho^\tau) w^i_{t-\tau} \Big)(\varepsilon^i_{t+1} - \varepsilon^i_{t+1-\tau}). \tag{8}$$

Therefore, if $\varepsilon^i_{t+1}$ and $\eta^i_{t+1}$ are correlated, and if $\rho$ is nonzero, we can conclude that $w^i_t$ is correlated with $\varepsilon^i_{t+1-\tau}$ for $\tau > 1$; thus, there is a bias for the difference estimator.

This paper provides a bias-adjusted method for the difference estimator: we use the lagged difference of the weights as the instrumental variable, and define the bias-adjusted difference estimator as

$$\hat{\beta}_{b\text{-}adj} = \frac{\sum_i \sum_t (w^i_{t-\tau} - w^i_{t-2\tau})(r^i_{t+1} - r^i_{t+1-\tau})}{\sum_i \sum_t (w^i_{t-\tau} - w^i_{t-2\tau})(w^i_t - w^i_{t-\tau})}. \tag{9}$$

When we plug in $r^i_{t+1} = \alpha_i D_i + \beta w^i_t + \varepsilon^i_{t+1}$, the numerator of this estimator can be written as

$$\sum_i \sum_t (w^i_{t-\tau} - w^i_{t-2\tau})\big(\beta (w^i_t - w^i_{t-\tau}) + (\varepsilon^i_{t+1} - \varepsilon^i_{t+1-\tau})\big). \tag{10}$$

The estimator is unbiased since $\varepsilon^i_{t+1} - \varepsilon^i_{t+1-\tau}$ is not correlated with $w^i_{t-\tau} - w^i_{t-2\tau}$ for $\tau > 1$. To show this, note that equation 10, divided by the number of periods, converges to (when the number of periods is large enough)

$$\beta \sum_i cov\big((w^i_t - w^i_{t-\tau}), (w^i_{t-\tau} - w^i_{t-2\tau})\big).$$

Since

$$cov\big((w^i_t - w^i_{t-\tau}), (w^i_{t-\tau} - w^i_{t-2\tau})\big) = cov(w^i_t, w^i_{t-\tau}) - cov(w^i_t, w^i_{t-2\tau}) - cov(w^i_{t-\tau}, w^i_{t-\tau}) + cov(w^i_{t-\tau}, w^i_{t-2\tau}),$$

from equation 2 and Assumption 2, the above expression becomes:

$$cov\big((w^i_t - w^i_{t-\tau}), (w^i_{t-\tau} - w^i_{t-2\tau})\big) = -(1 - \rho^\tau)^2 cov(w^i_t, w^i_t). \tag{11}$$

The last equation assumes stationarity: $cov(w^i_t, w^i_t) = cov(w^i_{t-\tau}, w^i_{t-\tau})$. From this equation,

$$\beta \sum_i cov\big((w^i_t - w^i_{t-\tau}), (w^i_{t-\tau} - w^i_{t-2\tau})\big) = -\beta (1 - \rho^\tau)^2 \sum_i cov(w^i_t, w^i_t).$$

For the same reason, the denominator of equation 9 converges to

$$\sum_i cov\big((w^i_t - w^i_{t-\tau}), (w^i_{t-\tau} - w^i_{t-2\tau})\big) = -(1 - \rho^\tau)^2 \sum_i cov(w^i_t, w^i_t).$$

Therefore, when the number of stocks is large, equation 9 converges to
$$ \frac{-\beta \sum_i (1-\rho^{\tau})^{2}\, cov(w^i_t, w^i_t)}{-\sum_i (1-\rho^{\tau})^{2}\, cov(w^i_t, w^i_t)} = \beta. $$

1.2.3 The classical weight-based performance measures and the predictive system

In this subsection, we define the classical weight-based performance measures and show how they relate to the predictive system framework. In addition, we discuss the biases of these measures and provide methods to correct them within this framework.

1.2.3.1 Weight-based performance measures

Denote the weights and returns of stocks at time $t$ by $w_t = [w^1_t, \ldots, w^N_t]$ and $r_t = [r^1_t, \ldots, r^N_t]$, respectively. The weight-based measures stem from the covariance measure proposed by Grinblatt and Titman (1989b) and Wermers (2006). Theoretically, they propose that the manager's skill or information should be captured by the covariance between the current weights and the future stock returns, i.e. $cov(w_t, r_{t+1})$. The covariance can be written equivalently as:

$$ cov(w_t, r_{t+1}) = E\big(w_t'\,(r_{t+1} - E(r_{t+1}))\big) = E\big((w_t - E(w_t))'\, r_{t+1}\big). $$

These equivalent expressions suggest different ways of constructing weight-based measures. The first approach is to assign a return benchmark to each stock and construct the measure as the weighted average of the benchmark-adjusted returns. We call this the benchmark return measure. Similarly, we can assign a benchmark weight to each stock held by the manager, and construct the benchmark-adjusted weight as the difference between the fund manager's weight and the benchmark weight. The benchmark weight measure is defined according to the third equivalent expression above. Following this general idea, one measure in each of these two categories has been widely used in the mutual fund literature.
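Before turning to those measures, the equivalence of the three covariance expressions above can be checked numerically. The following is a minimal sketch, assuming weights and next-period returns are stored as `(T, N)` arrays and that expectations are replaced by sample means; the function name `covariance_forms` and the array layout are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def covariance_forms(w, r_next):
    """Sample analogues of the three covariance expressions.

    w, r_next: arrays of shape (T, N); row t holds w_t and r_{t+1}.
    (Hypothetical layout chosen for this illustration.)
    """
    T = w.shape[0]
    w_bar = w.mean(axis=0)        # sample estimate of E(w_t), stock by stock
    r_bar = r_next.mean(axis=0)   # sample estimate of E(r_{t+1})
    # Sum over stocks of the sample covariance between w^i_t and r^i_{t+1}
    direct = ((w - w_bar) * (r_next - r_bar)).sum() / T
    # Benchmark-return form: E[w_t'(r_{t+1} - E r_{t+1})]
    bench_return = (w * (r_next - r_bar)).sum() / T
    # Benchmark-weight form: E[(w_t - E w_t)' r_{t+1}]
    bench_weight = ((w - w_bar) * r_next).sum() / T
    return direct, bench_return, bench_weight

# Example with arbitrary simulated inputs
rng = np.random.default_rng(0)
w = rng.random((40, 5))
w /= w.sum(axis=1, keepdims=True)
r_next = rng.normal(0.01, 0.05, size=(40, 5))
print(covariance_forms(w, r_next))
```

With sample means used as the benchmarks, the three numbers coincide up to floating-point error. The practical measures differ because they replace $E(r_{t+1})$ or $E(w_t)$ with an observable benchmark.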
The first measure, the DGTW CS measure, is defined as follows:

$$ DGTW = \frac{1}{T} \sum_{t=1}^{T} w_t'\,(r_{t+1} - r^{D_t}_{t+1}), $$

where $r^{D_t}_{t+1} = [r^{D_{1,t}}_{t+1}, \ldots, r^{D_{N,t}}_{t+1}]$ is the vector of DGTW benchmark returns at time $t+1$. This measure belongs to the benchmark return measure category. Intuitively, a manager is informed if the stocks she selects beat the benchmark portfolios.

The DGTW benchmark return at each time point is constructed as follows. First, rank the stocks by firm size and divide them into five size groups, each with the same number of stocks. Second, within each size group, order the stocks by market-to-book value and subdivide them into five market-to-book groups, giving 25 stock groups in total. Third, within each of these 25 groups, sort the stocks by their average returns from 12 to 2 months before the current month, and split them into five groups according to average past returns. This yields 125 stock groups, each containing the same number of stocks. The value-weighted return of the stocks in each of the 125 groups is the DGTW benchmark return for the individual stocks in that group.

Another widely used measure, the weight-change measure (denoted by $WC$), introduced by Grinblatt and Titman (1993), is defined as:

$$ WC = \frac{1}{T} \sum_{t=1}^{T} (w_t - w_{t-\tau})'\, r_{t+1}. $$

Note that the benchmark weight in this measure is $w_{t-\tau}$, the weight of the fund $\tau$ periods before the current period. As the definition shows, the weight-change measure is an example of the benchmark weight measure. With this measure, a manager is informed when her current portfolio achieves higher next-period hypothetical returns than the past portfolio would have earned.

The choice of $\tau$ is noteworthy. Grinblatt and Titman (1993) use $\tau = 4$ with quarterly data. This is because when $\tau$ is too small, the past weights might contain information about future stock returns.
Thus, there is a positive correlation between the past weights and future stock returns, and such a correlation can lead to an underestimation of the manager's information. When $\tau$ is too large, the portfolios' systematic risks may change. We use the same criterion as Grinblatt and Titman (1993): our estimated measures from real data do not change significantly when $\tau$ is close to 4 with quarterly data or close to 12 with monthly data, but they are much smaller for smaller $\tau$. Therefore, we choose $\tau = 4$ and $\tau = 12$ with quarterly and monthly data, respectively. Moreover, most of our qualitative conclusions in sections 4 and 5 (simulation and empirical results) do not change with different $\tau$.

1.2.3.2 Issues of the weight-based performance measures, and bias-correction method using the predictive system

In this subsection, we discuss the issues with the existing measures and propose bias-adjusted measures in the panel predictive regression framework.

The DGTW CS measure

The issue with the DGTW CS measure comes from benchmark misspecification. The DGTW benchmark assigned to a stock is misspecified if $E(r^i_{t+1} - r^{D_{i,t}}_{t+1}) \neq 0$. Empirically, the average return of a stock differs from the average return of its benchmark over a reasonably long sample period (for example, from 1975 to 2010). The misspecification can lead to a bias in the DGTW measure for uninformed managers. To be more specific, consider a manager who randomly selects some stocks initially and adopts a buy-and-hold strategy afterwards. This manager is uninformed. However, since the stocks are randomly selected, some stocks will have positive misspecification, while other stocks will have negative misspecification. If the manager holds these stocks forever, the stocks with positive misspecification will constitute larger and larger shares of the portfolio, since they are likely to be the stocks with higher returns.
For the same reason, the shares of stocks with negative misspecification are likely to shrink. As a result, the manager's portfolio will eventually be tilted toward stocks with high misspecification. This produces a positive covariance between the weights and the misspecification, and leads to a positive DGTW CS measure. However, this result is not due to the manager's information, which indicates a bias in the DGTW CS measure.

The issue of benchmark misspecification can be shown in the predictive system. If we replace the return with the DGTW benchmark-adjusted return in the predictive system, the estimator of the panel predictive regression is related to the DGTW CS measure. In equation 1, if we use the DGTW adjusted return in place of the raw return, the equation becomes

$$ r^i_{t+1} - r^{D_{i,t}}_{t+1} = \alpha_i D_i + \beta\, w^i_t + \varepsilon^i_{t+1}. \tag{12} $$

When there are no dummy variables $D_i$, the regression becomes

$$ r^i_{t+1} - r^{D_{i,t}}_{t+1} = \beta\, w^i_t + \varepsilon^i_{t+1}. \tag{13} $$

The estimated $\beta$ of this regression is

$$ \hat{\beta}_{no\text{-}dum} = \frac{\sum_{i,t} w^i_t\,(r^i_{t+1} - r^{D_{i,t}}_{t+1})}{\sum_{i,t} (w^i_t)^{2}}. \tag{14} $$

Up to the scaling by the number of periods, the numerator of $\hat{\beta}_{no\text{-}dum}$ is identical to the DGTW CS measure. Excluding the dummy variables in the above regression neglects the fixed effects of the DGTW benchmark-adjusted returns, $\alpha_i$. For example, when a manager is uninformed, i.e. $\beta = 0$, equation 12 becomes

$$ r^i_{t+1} - r^{D_{i,t}}_{t+1} = \alpha_i D_i + \varepsilon^i_{t+1}. $$

In this case, $\alpha_i$ is the benchmark misspecification. If the benchmark is correctly specified, the value of $\alpha_i$ should be zero. However, the benchmarks are in general misspecified. Thus, based on the above equation, the original DGTW CS measure at time $t$ can be decomposed as follows:

$$ \sum_i w^i_t\,(r^i_{t+1} - r^{D_{i,t}}_{t+1}) = \sum_i w^i_t\, \varepsilon^i_{t+1} + \sum_i w^i_t\, \alpha_i. $$

By assumption, the first part has zero expectation (in a regression, the error term $\varepsilon^i_{t+1}$ is assumed to be uncorrelated with the regressor $w^i_t$). The bias in the DGTW CS measure comes from the second part.
The expected value of the second part, $E(\sum_i w^i_t \alpha_i)$, is the weighted average of the misspecification. If we do not control for the misspecification, a higher value of the performance measure may be driven by the incorrect benchmark rather than by the manager's superior information in portfolio selection. For example, a strategy (such as the buy-and-hold strategy) that overweights the stocks with high misspecification can be expected to have a positive DGTW CS measure, even if the manager cannot predict future stock returns.

In the following sections, we use simulations to show that not only the buy-and-hold and momentum strategies, but also a random selection strategy, can have positive DGTW measures. These measures are positive because the DGTW benchmarks are value-weighted. If a manager holds larger shares of the stocks that are relatively small compared to the benchmark portfolios, and if the relatively small stocks have positive $\alpha_i$, then the weighted average of $\alpha_i$ is positive:

$$ E\Big(\sum_i w^i_t\, \alpha_i\Big) > 0. $$

Consider the random selection strategy, in which the manager randomly rebalances the stocks from a pool. She is equally likely to select each stock and the expected weights of the stocks are equal, so the performance measure converges to $\frac{1}{N} \sum_i E(r^i_{t+1} - r^{D_{i,t}}_{t+1})$. Empirically, $\frac{1}{N} \sum_i E(r^i_{t+1} - r^{D_{i,t}}_{t+1}) > 0$, since the weights of the relatively small stocks are larger in equally weighted portfolios than in value-weighted benchmark portfolios. Thus, the estimated DGTW CS measure for this uninformed strategy is positive due to benchmark misspecification.

To control for the bias from benchmark misspecification, the dummy variables should be included in the regression, and researchers should make use of the difference estimator. We define the bias-adjusted DGTW CS measure as the numerator of the difference estimator of regression 12. It can be written as
$$ \sum_{i,t} (w^i_t - w^i_{t-\tau})\Big((r^i_{t+1} - r^{D_{i,t}}_{t+1}) - (r^i_{t+1-\tau} - r^{D_{i,t-\tau}}_{t+1-\tau})\Big). $$

This measure not only eliminates the issue of benchmark misspecification, but also adjusts for the Stambaugh bias, so it captures managers' true skill. More specifically, following a derivation similar to that in the previous section, the bias-adjusted measure, scaled by the number of periods, converges to

$$ \beta \sum_i cov\big(w^i_t - w^i_{t-\tau},\, w^i_t - w^i_{t-\tau}\big) = 2\beta\,(1-\rho^{\tau}) \sum_i cov(w^i_t, w^i_t). $$

We know that $|\rho| < 1$. Thus, if the weights can predict the stock returns (the manager is informed), i.e. $\beta > 0$, then the above expression is positive. If instead the true value is $\beta = 0$ (the manager is uninformed), then the above expression is zero. Therefore, the bias-adjusted measure is zero when the manager is uninformed and nonzero when the manager is informed, and it does not misclassify uninformed managers.

This measure is relatively simple to construct from its formula. Intuitively, information is characterized by managers increasing their weights on stocks that have higher DGTW adjusted returns than their expected levels (following the intuition in Grinblatt and Titman (1993), we use the future adjusted returns to represent the expected adjusted returns).

In this paper, we focus on the bias of the DGTW CS measure. In general, any misspecified benchmark can lead to a biased evaluation of the manager's information. The issue and the bias-adjusted method are not limited to the DGTW CS measure.

The weight-change measure

We now turn to the weight-change measure. The issue with the weight-change measure comes from the misspecification of the assigned weight benchmark, i.e. the past weights. For a buy-and-hold manager, stocks with higher (lower) average returns are likely to have current weights that are higher (lower) than their past weights.
This also creates a positive correlation between the weight differences (the difference between the current weight and the past weight, $w_t - w_{t-\tau}$) and average returns, which is the key reason that the uninformed strategy appears to be informed.

This bias can also be seen in the predictive system. We show that the difference estimator of the coefficient $\beta$ is related to the weight-change measure. More specifically, the numerator of the difference estimator is

$$ \sum_{i,t} (w^i_t - w^i_{t-\tau})(r^i_{t+1} - r^i_{t+1-\tau}). $$

We can decompose it into two parts:

$$ \sum_{i,t} (w^i_t - w^i_{t-\tau})\, r^i_{t+1} - \sum_{i,t} (w^i_t - w^i_{t-\tau})\, r^i_{t+1-\tau}. $$

The first part of the decomposition is equivalent to the weight-change measure. Taking the expected value of the first part, we have

$$ E\Big(\sum_{i,t} (w^i_t - w^i_{t-\tau})\, r^i_{t+1}\Big) = \sum_{i,t} \Big( E(w^i_t - w^i_{t-\tau})\, E(r^i_{t+1}) + cov(w^i_t - w^i_{t-\tau},\, r^i_{t+1}) \Big). $$

If the fund weights have no information about the future returns, then

$$ cov(w^i_t - w^i_{t-\tau},\, r^i_{t+1}) = 0. $$

Therefore, if a trading strategy leads to

$$ \sum_{i,t} E(w^i_t - w^i_{t-\tau})\, E(r^i_{t+1}) \neq 0, $$

then the weight-change measure can be expected to have a bias.

The bias in the weight-change measure can be controlled by including the second term of the decomposition, following the same argument that we use for the DGTW CS measure. Thus, we use $\sum_{i,t} (w^i_t - w^i_{t-\tau})(r^i_{t+1} - r^i_{t+1-\tau})$ as the bias-adjusted weight-change measure.

1.2.4 The power of the weight-based performance measure and WLS measures

Two important statistical issues are efficiency and statistical power. The power of return-based performance measures and the DGTW CS measure has been discussed by Kothari and Warner (2001). Power is also discussed in this subsection. Instead of comparing the power of the different weight-based measures (see footnote 7), we introduce Weighted Least Squares (WLS) measures to improve the efficiency and power of the weight-based performance measures in the framework of the predictive regression system.
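The weight-change measure and its bias-adjusted counterpart from Section 1.2.3.2 can be illustrated numerically. The sketch below is a hedged example only: the function name `weight_change_measures`, the `(T, N)` array layout, and the two-stock buy-and-hold example with constant returns are assumptions made for illustration, not the data or code used in this paper.

```python
import numpy as np

def weight_change_measures(w, r, tau):
    """Time-averaged weight-change measure and its bias-adjusted version.

    w, r: arrays of shape (T, N); w[t] is set at t and r[t + 1] is the
    return earned afterwards. (Hypothetical layout for this sketch.)
    """
    T = w.shape[0]
    # Dates for which w[t - tau], r[t + 1] and r[t + 1 - tau] all exist
    t = np.arange(tau, T - 1)
    dw = w[t] - w[t - tau]
    # Weight-change measure: average of (w_t - w_{t-tau})' r_{t+1}
    wc = (dw * r[t + 1]).sum(axis=1).mean()
    # Bias-adjusted version: also subtracts (w_t - w_{t-tau})' r_{t+1-tau}
    adjusted = (dw * (r[t + 1] - r[t + 1 - tau])).sum(axis=1).mean()
    return wc, adjusted

# Uninformed buy-and-hold manager, two stocks with different constant returns
T = 40
growth = np.array([1.01, 1.03])
value = 0.5 * growth ** np.arange(T)[:, None]
w = value / value.sum(axis=1, keepdims=True)
r = np.tile(growth - 1.0, (T, 1))
wc, adjusted = weight_change_measures(w, r, tau=4)
print(wc, adjusted)
```

In this stylized case the weights carry no information, yet the weight-change measure is positive because the weight of the higher-return stock drifts upward. The bias-adjusted version is zero because the return differences vanish.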
Equation 4 provides the difference estimators of the predictive regression system. However, these estimators are based on OLS estimation and do not take into account the heterogeneous volatilities of the stock returns. Therefore, these estimators cannot be efficient. To improve the efficiency of the estimator, we use a WLS measure. More specifically, we divide the stock returns and the portfolio weights by the standard deviations of the stock returns, and use the adjusted returns and weights to construct the performance measures. For example, for the bias-adjusted weight-change measure, the WLS measure is defined as

$$ \frac{1}{T} \sum_{t=1}^{T} (w_t - w_{t-\tau})'\, \Omega^{-1} (r_{t+1} - r_{t+1-\tau})\; \frac{(w_t - w_{t-\tau})'(w_t - w_{t-\tau})}{(w_t - w_{t-\tau})'\, \Omega^{-1} (w_t - w_{t-\tau})}. $$

Here, $\Omega = diag(var_1, \ldots, var_N)$, where $diag(\cdot)$ represents the diagonal matrix, and the diagonal elements are the variances of the $N$ stock returns. Similarly, we can construct the WLS version of the DGTW CS measure from the above expression by replacing the stock returns with the DGTW benchmark-adjusted returns (or the difference of the stock returns with the difference of the DGTW benchmark-adjusted returns), and replacing the variance of the stock returns with the variance of the DGTW benchmark-adjusted returns.

Footnote 7: It would be interesting to compare the power of various weight-based measures for the same strategy. Moreover, we could also compare the power of a given weight-based measure across different informed strategies. We leave these comparisons for future work.

The term

$$ \frac{(w_t - w_{t-\tau})'(w_t - w_{t-\tau})}{(w_t - w_{t-\tau})'\, \Omega^{-1} (w_t - w_{t-\tau})} $$

in the WLS measure is a normalization term. If the number of stocks is large,

$$ \frac{(w_t - w_{t-\tau})'\, \Omega^{-1} (r_{t+1} - r_{t+1-\tau})}{(w_t - w_{t-\tau})'\, \Omega^{-1} (w_t - w_{t-\tau})} \approx \beta, \qquad \frac{(w_t - w_{t-\tau})'(r_{t+1} - r_{t+1-\tau})}{(w_t - w_{t-\tau})'(w_t - w_{t-\tau})} \approx \beta. $$

Hence, with the normalization term, the WLS measure should converge to
$$ E\big((w_t - w_{t-\tau})'(r_{t+1} - r_{t+1-\tau})\big). $$

Therefore, if the performance measures are unbiased, the WLS measures should converge to the measures without the efficiency adjustment when the number of stocks is large enough. Moreover, the WLS measure is a special case of the Generalized Least Squares (GLS) measure. Because the WLS measures account for the heterogeneous volatilities of the stock returns, they should have smaller estimated standard errors than the measures without the efficiency adjustment. Given the lower standard errors, we expect the WLS measures to have higher power.

Note that we cannot create a full GLS measure when the number of stocks is large. Specifically, we make two simplifications. First, instead of using the covariance of the error term $\varepsilon^i_t$, we use the covariance of the stock returns or of the DGTW benchmark-adjusted returns. Under the null hypothesis, the covariance of the error term is the same as the covariance of the stock returns or of the DGTW benchmark-adjusted returns. Second, we only estimate the variance of the returns (or of the DGTW benchmark-adjusted returns) for each stock, and set the covariances between the stock returns to zero. The reason for this simplification is that when the number of stocks is large, we cannot construct a covariance matrix that is invertible. Moreover, if we use the DGTW benchmark-adjusted returns, the covariances between different stocks are relatively small. Therefore, the WLS measures should display higher power with the DGTW measures, since they are closer to the full GLS measure, which is efficient.

1.3 Data and summary statistics

We obtain quarterly or semi-annual holdings data for mutual funds from Thomson-Reuters. The holdings data cover 1980 to 2010. There are in total 4400 equity funds in this sample. To avoid some standard biases, we employ several screens to filter the holdings data. We first exclude the data before 1984, since Fama and French (2010) indicate a selection bias issue for the data before 1984.
Moreover, since most research papers using holdings data focus on US equity funds, we also exclude other types of mutual funds. Evans (2010) discusses an incubation bias in fund performance measures, and following his suggestions, we exclude observations before the reported date of fund organization or when a fund has a TNA (total net asset value) of less than 15 million dollars.

In addition to the holdings data, we also acquire monthly prices and returns of individual stocks (1970 to 2011) from the CRSP monthly stock file. We also obtain the delisting returns to deal with the case when firms leave the market. The sample contains 27624 stocks with at least one month of returns from 1970 to 2011. The DGTW benchmark returns are collected from Russ Wermers' website, and are then combined with the monthly stock file. The merged data contain 15677 stocks (see footnote 8).

Footnote 8: We only select stocks with non-missing values of the returns and DGTW benchmark returns; the stocks that can be selected into the DGTW benchmarks should have at least two years of data on book values, returns and market capitalization. Ritter and Zhang (2007) show that investment-bank affiliated funds may have to take cold IPO stocks. Thus, including these stocks can be detrimental to the measured performance of managers in these funds.

In order to empirically test the different weight-based performance measures, we group the funds according to their return gaps. The concept of the return gap is introduced by Kacperczyk, Sialm and Zheng (2008). It is the difference between the raw return (calculated using the weights of stocks and their corresponding returns) and the before-fee return (calculated as the fund's reported return plus the expense ratio). We obtain the return gap data, provided by the authors, from the website of the Review of Financial Studies. After we merge the asset holdings data with the return gap data, around 2200 funds remain.

We use holdings and stock prices to construct monthly weights for the mutual funds. Let the holdings of the stocks (measured as the number of shares held) and the stock prices at time $t$ be $h_t$ and $p_t$, where $h_t = [h_{1t}, \ldots, h_{Nt}]$ and $p_t = [p_{1t}, \ldots, p_{Nt}]$. Hence, the weights of the stocks in the fund portfolio are $w_t = [w_{1t}, \ldots, w_{Nt}]$, where

$$ w_{it} = \frac{h_{it}\, p_{it}}{h_t'\, p_t}. $$

Since the holdings data have a lower frequency than the price data (in general, the mutual funds report their holdings every three or six months, but CRSP has price data every month), we create monthly weights by assuming that between two consecutive reporting dates the funds keep the same holdings. This method of constructing the weights assumes that the manager uses a buy-and-hold strategy during the periods when we do not observe the holdings data. Therefore, the monthly weights at month $t$ can be constructed from the product of a fund's preceding reported holdings and the month-$t$ prices of the held stocks. They are then normalized to make the total weight equal to 1, i.e. $w_t = [w_{1t}, \ldots, w_{Nt}]$, where

$$ w_{it} = \frac{h_{i,t_r}\, p_{it}}{h_{t_r}'\, p_t}, $$

and $t_r$ is the preceding reporting time. Note that although the holdings do not change between consecutive reporting dates, the weights do change since the stock prices fluctuate from month to month. Monthly weights constructed in this way are used by Kacperczyk, Sialm and Zheng (2008), so we study their properties in our empirical section.

The summary statistics of the funds are shown in table 1. On average, each fund has total assets of $656 million and holds 143 stocks. We also calculate the statistics of the return gap for these funds. As shown by Kacperczyk, Sialm and Zheng (2008), on average, before-cost reported returns are similar to the raw returns estimated using fund weights and stock returns.
However, the maximum and minimum values of the return gap are very different across funds, indicating that the unobserved actions of managers vary across funds.

With the weight and return data, we can estimate the parameters of the predictive regression system for each fund. The summary statistics for these estimated parameters of the individual funds are presented in table 1, and their empirical distributions are plotted in figure 1. These estimated parameters suggest that the fund weights are highly autocorrelated (with the estimated $\rho$ as large as 0.84). In addition, the correlation of the two errors ($\varepsilon^i_t$ and $\eta^i_t$) in the predictive regression system can be either positive or negative; the maximum and minimum values are 0.75 and -0.47, respectively. Therefore, if $\rho$ is large and $\varepsilon^i_t$ and $\eta^i_t$ are correlated, the estimated performance measures suffer from Stambaugh biases unless we use the bias-adjusted methods.

As a reality check on the AR(1) assumption for the weight process in our predictive regression system, we also estimate the coefficients $\rho_j$ in the following regressions:

$$ w^i_{t+1} = \rho_j\, w^i_{t+1-j} + u^i_{t+1}, \tag{15} $$

where $j$ runs from 1 to 5. If the weight follows an AR(1) process, we should observe that $\rho_j = \rho_1^{\,j}$. The summary statistics for the estimated coefficients are shown in table 1. The average of the estimated $\rho_j$ decreases as $j$ increases. Moreover, if we calculate $\rho_1^{\,j}$, it is close to the value of $\rho_j$ (for example, $\rho_1^{\,2} = 0.70$ and $\rho_2 = 0.73$). Therefore, it is reasonable to assume that the weight process is an AR(1).

1.4 Simulation methods and results

1.4.1 Simulation methods

1.4.1.1 Simulating the stock returns

In this section, we use simulations to show that the existing performance measures can be biased and inefficient, and that these issues can be controlled with the new measures proposed in this paper. To show these results, we need to simulate the stock returns and the weights of various strategies. We use a bootstrap method to simulate stock returns.
The goals are to closely approximate the true statistical distribution of the stock returns, and to avoid the normality assumption of a Monte-Carlo simulation. In each trial, we bootstrap 15667 stock returns for 504 months, the same length as the real data (from 1971 to 2011).

In the real data, most stocks have returns that exist for a number of consecutive months and then disappear. We wish to replicate this feature in our bootstrapped returns, for two reasons. On the one hand, we need the returns to be continuous because most of the strategies we simulate assume that the managers hold most stocks for some number of consecutive months, consistent with the persistent weights in the real data. On the other hand, some short-lived stocks have high average returns and large misspecifications with respect to the DGTW benchmark. If these stocks exist for too long in the simulation, this could lead to overestimated average benchmark misspecifications. Thus, we strive to keep the number of months that a stock exists the same in the simulations as in the real data. Moreover, the cross-sectional covariance among different stocks should be preserved in the simulations.

In order to maintain these features, the classical bootstrap method that randomly selects the returns of stocks from their time series cannot be used, because there are many missing values in the time series. For example, suppose that a firm went public in January 1980 and left the market in December 1990. The return data for this stock exist from 1980 to 1990, but the data are missing both from 1971 to 1979 and after 1990. If the bootstrapped returns are randomly selected from both the missing and the non-missing returns, the non-missing values in the bootstrapped data cannot be consecutive. In addition, if we independently select returns for each stock, we destroy the cross-sectional covariance among different stocks.
We address these issues by bootstrapping stock returns with the following method. The bootstrap is based on the following decomposition:

$$ r^i_t = r^{D_{i,t-1}}_t + (r^i_t - r^{D_{i,t-1}}_t), $$

where $r^{D_{i,t-1}}_t$ is the DGTW benchmark return and $r^i_t - r^{D_{i,t-1}}_t$ is the DGTW benchmark-adjusted return. The first step is to keep the periods in which both the return and the DGTW benchmark return of each stock exist in the real data. Moreover, we also preserve the correspondence between the stocks and the benchmarks in each month. Next, we construct a pool of the 125 DGTW benchmark returns, $r^{D_{i,t-1}}_t$. We also create another pool of the DGTW benchmark-adjusted returns, $r^i_t - r^{D_{i,t-1}}_t$. There are periods in which the DGTW benchmark-adjusted returns do not exist; the pool only contains the existing values. Finally, the bootstrap is conducted with the following procedure:

(1) In each month, we select all 125 DGTW benchmark returns at a randomly selected time point from the benchmark pool. Note that we draw all 125 benchmark returns from the same time point in the pool, thus keeping the cross-sectional covariance between the benchmark returns.

(2) For each stock that exists at this time point in the real data, we bootstrap the corresponding DGTW adjusted return from the pool of adjusted returns, picking a separate time period at random. If a stock does not exist on the date selected for the DGTW benchmark in the first step, the adjusted return for that stock is set as missing at the second step.

(3) The bootstrapped stock return is the sum of its bootstrapped DGTW benchmark return and its bootstrapped DGTW adjusted return.

Although the simulation procedure bootstraps the DGTW adjusted returns for different stocks independently, the returns of stocks are correlated through their DGTW benchmark.
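The three bootstrap steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the results: the function name `bootstrap_returns` and the array layout (`bench`, `adj`, `group`) are hypothetical, and the sketch assumes that a stock's existence and benchmark assignment in simulated month `m` follow the real data for month `m`, which is one possible reading of step (2).

```python
import numpy as np

def bootstrap_returns(bench, adj, group, rng):
    """One bootstrap trial of stock returns.

    bench: (T, G) benchmark returns by month and benchmark portfolio.
    adj:   (T, N) benchmark-adjusted returns, NaN where a stock is missing.
    group: (T, N) integer index of each stock's benchmark in each month.
    (All three layouts are assumptions made for this sketch.)
    """
    T, N = adj.shape
    sim = np.full((T, N), np.nan)
    # Pool of existing adjusted returns for each stock
    pools = [adj[~np.isnan(adj[:, i]), i] for i in range(N)]
    for m in range(T):
        # Step 1: one random date shared by all benchmark returns, which
        # keeps the cross-sectional covariance between the benchmarks
        d = rng.integers(T)
        # Stocks that exist in month m of the real data (assumption above)
        alive = np.flatnonzero(~np.isnan(adj[m]))
        for i in alive:
            # Step 2: adjusted return drawn from a separate random date
            drawn_adj = rng.choice(pools[i])
            # Step 3: simulated return = benchmark return + adjusted return
            sim[m, i] = bench[d, group[m, i]] + drawn_adj
    return sim
```

Under this reading, the missing-value pattern of the simulated panel matches the real panel, so each stock keeps its consecutive months of existence, while stocks remain correlated through the shared benchmark draw.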
Moreover, in the procedure above, the DGTW adjusted return captures the misspecification of the DGTW benchmark, which enables us to further examine whether certain strategies are biased when the benchmarks are misspecified.

1.4.1.2 Simulating a fund's uninformative holdings

In addition to stock returns, we also need to simulate the fund holdings for various strategies. Under the null hypothesis, the holdings are uninformed about future stock returns. The buy-and-hold, momentum (by assumption) and random selection strategies fall into this category. To make these strategies realistic, we need to keep the weights positive, and to keep the number of stocks in these strategies similar to that in the real data. We keep the number of stocks similar because it can affect the size of the bias. To this end, in each trial, we randomly select 1000 stocks as the pool, and assume that the manager chooses the stocks from this pool as occurs in the real data. The average number of stocks held by the real managers is 143. With a pool of 1000 stocks, we can maintain a roughly similar number of stocks: on average, there are 150 to 200 stocks in the portfolio for these strategies. We assume that the funds start to hold stocks from $t = 120$, and continue to hold stocks for 360 months, which is consistent with the starting period and the length of the holdings data in Thomson-Reuters.

Buy-and-hold strategy weights

The weights of the buy-and-hold strategy are simulated as follows. At time $t = 120$, the manager selects the stocks in the pool that exist, and these stocks are equally weighted. The manager does not hold the stocks that do not exist (see footnote 9). This choice of holdings or weights does not contain information, because in each trial the stock returns are also simulated, and the covariance between the simulated weights and the stock returns is zero when the number of simulations is large enough.
Next, for each time point $t > 120$, some stocks start to exist and some stocks disappear. For the stocks in the pool that start to exist, the manager holds the stocks, and the weights are random numbers between 0% and 5%. For stocks in the pool that disappear, the manager does not hold these stocks. For stocks that exist in the previous period and do not disappear, the manager continues to hold them without changing the holdings (the weights change because prices fluctuate). The weights thus contain only past price information.

Regulation (the Investment Company Act of 1940) requires that the weight of each stock cannot be larger than 5% without triggering reporting requirements. Therefore, if the weight of one stock exceeds 5%, the manager decreases the weight of the stock until it is equal to 5%. In general, there are multiple stocks with weights above 5%, so we use the following procedure to adjust the weights of the portfolio. We first rank the stocks by their weights in the portfolio. Then we select the stock with the highest weight and denote this weight by $w_1$ (by assumption, $w_1 > 5\%$). Next, we decrease the weight of this stock to 5%, and let the weights of the lower-ranked stocks increase by $(w_1 - 5\%)/(N - 1)$ ($N$ is the number of stocks in the portfolio; since this stock has the highest rank, the number of stocks with lower ranks is $N - 1$). Next, we select the stock with the second highest weight (denote this weight by $w_2$). If $w_2 > 5\%$, then we decrease the weight of this stock to 5% and let the weights of the lower-ranked stocks increase by $(w_2 - 5\%)/(N - 2)$ (there are $N - 2$ stocks with lower rank). We repeat this process until all the stocks have weights less than or equal to 5%. The only exception is the rare case when there are fewer than 20 stocks in the portfolio. In this case, we do not adjust the weights.

Footnote 9: The delisting returns are included in the return data; therefore, there is no selection bias in the bootstrapped data.
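The weight-capping procedure described above can be sketched as a single pass over the ranked weights. This is an illustrative sketch: the function name `cap_weights` and its parameters are hypothetical, and it assumes the input weights are non-negative and sum to one.

```python
import numpy as np

def cap_weights(weights, cap=0.05, min_stocks=20):
    """Cap each weight at `cap`, spreading the excess over lower-ranked stocks.

    Portfolios with fewer than `min_stocks` stocks are returned unchanged,
    since the cap cannot hold for every stock in that case.
    """
    w = np.asarray(weights, dtype=float).copy()
    if w.size < min_stocks:
        return w
    # Rank stocks from highest to lowest weight. Equal increments preserve
    # this ranking among the remaining stocks, so one pass is enough.
    order = np.argsort(-w)
    for k, idx in enumerate(order):
        lower = order[k + 1:]
        excess = w[idx] - cap
        if excess <= 0 or lower.size == 0:
            break  # every remaining stock is already at or below the cap
        w[idx] = cap
        w[lower] += excess / lower.size
    return w

# Example: two overweight stocks in a 30-stock portfolio
example = np.array([0.30, 0.20] + [0.5 / 28] * 28)
print(cap_weights(example).max())
```

Because the excess is shared equally among all lower-ranked stocks, the total weight is unchanged at each step and the original ranking is preserved.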
To remove the effect of the initial conditions, we also let the funds start 20 months earlier (from $t = 100$). However, the results do not change significantly for any of the strategies we simulate.

Momentum strategy weights

Similar to the buy-and-hold strategy, the manager with the momentum strategy holds all the stocks in the pool that have non-missing returns at time $t = 120$. After the first month, the manager rebalances her portfolio. She ranks the stocks by average returns over the past 2 to 12 months, and removes the 6% of stocks with the lowest past returns (see footnote 10) as well as the stocks with missing future returns. The stocks that have been removed are replaced by stocks with positive average past returns. Moreover, for each of these stocks, the weight that is added to the rebalanced portfolio is equal to 6% times the ratio of its average past return to the sum of the average past returns of all stocks that have positive past returns.

Footnote 10: We find that, on average, each manager changes 6% of her total holdings every month.

Random selection strategy weights

The random selection strategy is simulated as follows. The manager holds an equally weighted portfolio of stocks in the first period. During all the following periods, she rebalances 6% of the portfolio randomly. The manager randomly selects some stocks from her holdings, and sells all the shares of these stocks. The total weight of these stocks is around 6%. Next, she randomly selects the same number of stocks from the pool, and buys an equally weighted portfolio of these stocks. The total weight of these added stocks is the same as the total weight of the stocks she sells; thus, the total weight of the portfolio is always 100%.

1.4.1.3 Simulating an informative strategy

When a manager is informed, her trading strategy should predict future stock returns. Thus, to simulate the weights of an informed manager, we assume that the DGTW benchmark-adjusted returns are correlated with the manager's weights.
Specifically, we assume that the DGTW-adjusted stock returns are simulated from the following dynamics (footnote 11):

$$r^i_{t+1} - r^{D_t,i}_{t+1} = \alpha_i + \beta s^i_t + \varepsilon^i_{t+1}, \qquad (16)$$

where $s^i_t$ will be used to construct the informed portfolio weights. Its process is

$$s^i_{t+1} = \rho s^i_t + \eta^i_{t+1}.$$

The monthly $\rho$ is set to 0.95, which is approximately $\sqrt[3]{0.84}$, the cube root of the quarterly value of $\rho_1$ in table 1. Recall that $\rho_1$ is the coefficient from regressing the weights on the past weights.

The process $\{s^i_t\}_t$ is simulated as follows. First, the starting random variable of this process, $s^i_1$, is bootstrapped from a pool of demeaned DGTW benchmark adjusted returns, i.e. $\{\, r^i_t - r^{D_{t-1},i}_t - \frac{1}{T}\sum_t (r^i_t - r^{D_{t-1},i}_t) \,\}$. Next, the subsequent random variables in this process are simulated following the dynamics of $s^i_t$, with $\eta^i_t$ bootstrapped independently from the pool of demeaned DGTW-adjusted stock returns. In order to make the process stationary, $\eta^i_t$ is scaled by $\sqrt{1-\rho^2}$ after it is randomly selected from the pool, i.e. given the AR(1) process of the signal,

$$\mathrm{var}(s^i_{t+1}) = \rho^2\,\mathrm{var}(s^i_t) + (1-\rho^2)\,\mathrm{var}\Big(r^i_t - r^{D_{t-1},i}_t - \frac{1}{T}\sum_t (r^i_t - r^{D_{t-1},i}_t)\Big).$$

Since

$$\mathrm{var}(s^i_1) = \mathrm{var}\Big(r^i_t - r^{D_{t-1},i}_t - \frac{1}{T}\sum_t (r^i_t - r^{D_{t-1},i}_t)\Big),$$

it follows that the $\mathrm{var}(s^i_t)$ are the same for all $t$. The error term $\varepsilon^i_{t+1}$ in the process (equation 16) is independently bootstrapped from the pool of demeaned DGTW benchmark adjusted returns and scaled by $\sqrt{1-\beta^2}$.

We calculate the DGTW adjusted returns from the dynamics above, assuming that $\beta$ takes different values from 0 to 0.25. The value of $\beta$ represents the manager’s information: the larger the value of $\beta$, the more information the manager possesses.

Footnote 11: There is more than one way to simulate informed strategies. For example, a timing strategy is also informed. We simulate this particular strategy to show the efficiency and power of the WLS measures. Comparing the power of different measures under different strategies would be interesting future work.
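As an illustration of the simulation just described, the following Python sketch draws a persistent signal and the corresponding adjusted returns for one stock. It is a minimal sketch, not the thesis code: the function names, the tiny `pool` of demeaned adjusted returns, and the values of `alpha` and `beta` are all hypothetical, and the scaling of the return error by the square root of one minus the squared information coefficient is an assumption.

```python
import random


def simulate_signal(pool, rho, n_periods, rng):
    """AR(1) signal whose innovations are bootstrapped from `pool`."""
    scale = (1.0 - rho ** 2) ** 0.5      # keeps the variance constant over time
    signal = [rng.choice(pool)]          # starting value drawn from the pool
    for _ in range(n_periods - 1):
        innovation = scale * rng.choice(pool)
        signal.append(rho * signal[-1] + innovation)
    return signal


def simulate_adjusted_returns(signal, pool, alpha, beta, rng):
    """Adjusted return for the period after each signal: alpha + beta * s + error."""
    scale = (1.0 - beta ** 2) ** 0.5     # assumed scaling of the error term
    return [alpha + beta * s + scale * rng.choice(pool) for s in signal]


rng = random.Random(0)
pool = [-0.02, -0.01, 0.0, 0.01, 0.02]   # hypothetical demeaned adjusted returns
path = simulate_signal(pool, rho=0.95, n_periods=120, rng=rng)
returns = simulate_adjusted_returns(path, pool, alpha=0.001, beta=0.25, rng=rng)
```

Because the starting value and the scaled innovations come from the same pool, the variance of the signal is the same in every period, which is the stationarity property the scaling is meant to deliver.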
The value of $\beta$ is determined by regressing the real DGTW adjusted returns on real fund weights following equation 16. The maximal value, 0.25, is the average of the estimated $\beta$ over the funds with the largest estimated $\beta$. Since $\alpha_i$ represents the benchmark misspecification, we set it to $\frac{1}{T}\sum_t (r^i_t - r^{D_{t-1},i}_t)$ to preserve the benchmark misspecifications. After simulating the DGTW adjusted returns from equation 16, we independently simulate the DGTW benchmark returns as in the previous subsection; the simulated stock returns are the sum of the two.

The simulated weights for informed managers are based on the simulated process $\{s^i_t\}$. In general, funds do not short-sell stocks, and even if they do, the reported weights do not contain short positions; we therefore assume that the weights cannot be negative. In each period, the manager selects only the stocks with $s^i_t > 0$, and we set the weight of stock $i$ at time $t$ to zero when $s^i_t \le 0$. In addition, since the weights must sum to one, the weight of each selected stock is its signal divided by $\sum_i s^i_t$, the sum of the positive signals.

1.4.1.4 Using the simulations

Using the bootstrapped returns and the simulated informed strategies, we can estimate the weight-change, DGTW, bias-adjusted and WLS measures, and calculate the standard errors and T-statistics of the different performance measures for each simulation. We calculate these statistics using the quarterly weights from our simulations and the quarterly returns obtained by summing the simulated monthly returns. By simulating the returns and strategies 1000 times, we have 1000 point estimates, standard errors, and T-statistics for each case. We calculate the mean of the estimated performance measure as the sample average of the 1000 estimated performance measures.
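The mapping from signals to long-only weights in section 1.4.1.3 can be sketched as follows. The function name `informed_weights` is hypothetical, and returning all zeros when no signal is positive is an assumption made only so that the sketch handles that edge case.

```python
def informed_weights(signals):
    """Long-only weights: positive signals scaled to sum to one, others set to zero."""
    positive = [max(s, 0.0) for s in signals]
    total = sum(positive)
    if total == 0.0:
        # Assumption for this sketch: with no positive signal, hold nothing.
        return [0.0] * len(signals)
    return [p / total for p in positive]


# Hypothetical signals for three stocks in one period.
weights = informed_weights([0.02, -0.01, 0.06])
```

Here the stock with the negative signal receives zero weight and the other two split the portfolio in proportion to their signals.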
If the estimated T-statistics have a standard normal distribution, the 5%, 10%, 90% and 95% critical values of the simulated distributions should be close to the corresponding critical values of the standard normal distribution. Based on this idea, we rank the 1000 simulated T-statistics and select the 25th, 50th, 950th and 975th values (the 2.5%, 5%, 95% and 97.5% quantiles, that is, two-sided critical values) as the 5%, 10%, 90% and 95% critical values of the distribution. These values can be compared with the critical values of the standard normal distribution.

Moreover, we test whether the distribution of the 1000 T-statistics for each performance measure is normal, using the Jarque-Bera (JB) test and the Lilliefors test (for example, Jarque and Bera (1986) and Lilliefors (1967)). The intuition behind the JB test is that the skewness and kurtosis of a sample distribution should match those of a normal distribution when the sample is normally distributed. The main idea of the Lilliefors test is that if the sample distribution is normal, then the difference between the CDF of the sample distribution and that of a normal distribution should be small. Therefore, both tests reject the normality of a sample if the differences between these distributional properties of the sample and those of a normal sample are large enough.

The power of different performance measures is characterized by the rejection frequency when the null hypothesis is false. The rejection ratios are constructed as follows: we rank all 1000 simulated T-statistics, count the number of values that are larger than the 95% critical value from the simulated distribution of T-statistics when $\beta = 0$, and divide this number by 1000. The ratio should be larger if the power is larger (footnote 12). Since the power is related to the value of $\beta$ (the covariance between the returns and the weights), we create a table that contains different values of $\beta$ from 0.1 to 0.5.
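The ranking rule for the critical values and the rejection ratio can be sketched in Python as follows. The function names are hypothetical, and the inputs in the usage lines are artificial numbers chosen only so that the ranks are easy to read; they are not simulation output.

```python
def ranked_value(sorted_values, fraction):
    """Value at 1-indexed rank round(fraction * n), e.g. the 25th of 1000 for 0.025."""
    rank = max(1, round(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def simulated_critical_values(t_stats):
    """The 25th, 50th, 950th and 975th ranked values when there are 1000 T-statistics."""
    ranked = sorted(t_stats)
    return {
        "5%": ranked_value(ranked, 0.025),
        "10%": ranked_value(ranked, 0.05),
        "90%": ranked_value(ranked, 0.95),
        "95%": ranked_value(ranked, 0.975),
    }


def rejection_ratio(t_stats_alternative, t_stats_null):
    """Share of T-statistics above the simulated 95% critical value under the null."""
    critical = simulated_critical_values(t_stats_null)["95%"]
    rejections = sum(1 for t in t_stats_alternative if t > critical)
    return rejections / len(t_stats_alternative)


# Artificial inputs: the "null" draws are 1, 2, ..., 1000, the others are shifted up.
null_draws = [float(k) for k in range(1, 1001)]
shifted_draws = [k + 100.0 for k in null_draws]
critical_values = simulated_critical_values(null_draws)
ratio = rejection_ratio(shifted_draws, null_draws)
```

Using the simulated null distribution rather than 1.96 as the cutoff means the rejection ratio is not inflated when the null distribution of the T-statistics is shifted away from a standard normal.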
The power can also be affected by the sample size; thus, we also create tables that have either 120 (quarterly) or 360 (monthly) sample periods (footnote 13).

Footnote 12: We do not count the number of values that are larger than 1.96 because the null distribution (when $\beta = 0$) of the T-statistics may not be standard normal. Therefore, its 95% critical value can be different from 1.96.

Footnote 13: We assume that the manager still rebalances her portfolios every month, but researchers only have quarterly holdings data (in this case, the manager’s monthly interim trading is simulated but cannot be captured by weight-based measures).

1.4.2 Simulation results for uninformed strategies

We first examine the simulated mean and T-statistics for the buy-and-hold strategy. The results for various measures are presented in table 2. The simulation results are consistent with the predictions in the earlier sections. The mean of the weight-change measure over the 1000 simulations is 0.43%, which implies that the bias of the measure is positive. In addition, as we discussed in the previous section, there is a positive bias for the DGTW CS measure. This is reflected in table 2, where the average of the DGTW measure is 1.35%.

The bias affects the critical values of the T-statistics. For the weight-change measure, the 5%, 10%, 90% and 95% critical values are all higher than the corresponding critical values for the standard normal distribution. Moreover, the critical values of the T-statistics for the DGTW measure are even larger than those for the weight-change measure and much larger than the critical values for the standard normal distribution. For example, for a fund’s T-statistic to be significant at the 5% level under the null, its value should be greater than 7.08 (which is much larger than the corresponding value for a standard normal distribution). In our simulations, the benchmarks such as past weights and DGTW benchmarks are misspecified.
Therefore, a strategy that overweights stocks with large misspecifications can increase the estimated value for the performance measures. As we discussed in previous sections, the buy-and-hold strategy belongs to such a category. Hence, even if the manager does not predict future stock returns ($\beta = 0$), the weight-change measure and DGTW CS measure are both positive. Thus, these measures misclassify this uninformed strategy.

The bias-adjusted methods perform well. Both the adjusted weight-change and DGTW measures have T-statistics with critical values that are close to the corresponding values for the standard normal distribution. In addition, the means of the performance measures are close to zero. Thus the newly proposed methods can control the bias in the existing methods.

Although the original methods are subject to bias, the distributions of the simulated T-statistics are affected mainly in their location. The JB and Lilliefors tests show that, for most of the bias-adjusted and unadjusted measures, the distributions of the T-statistics are close to normal. This result, together with the critical values of T-statistics, indicates that the distributions of T-statistics for the bias-adjusted measures are close to a standard normal, which is a desired statistical property for the performance measures under the null hypothesis.

These results are not limited to the buy-and-hold strategy, but are also found for the momentum strategy. The estimated measures and the T-statistics, as shown in table 3, are similar to those with the buy-and-hold strategy. For example, the biases are positive for both the weight-change measure and the DGTW measure, and they have the same effect on the T-statistics as for the buy-and-hold strategy. These biases can be controlled with the bias-adjusted measures, and the T-statistics have distributions similar to that of a standard normal. The results of the random selection strategy are presented in table 4.
As discussed in section 2, the biases for all the performance measures are smaller than when we use the other two uninformed strategies, but the value of the DGTW CS measure is still much larger than zero. Similarly, for the DGTW CS measure, although the critical values for the T-statistics are closer to the critical values of the standard normal distribution than those for the other two strategies, they are still larger than those of the standard normal distribution. The bias-adjusted measures eliminate the bias and produce close-to-standard-normal T-statistics for this uninformed strategy.

In these simulations, we assume that the starting portfolio is equally weighted. One concern is that the starting portfolio may affect our simulation results. To address this issue, we assume instead that the starting portfolio is value-weighted, where size is defined as the average total market capitalization from 1975 to 2010. The statistics of the weight-based measures are estimated based on the strategies with the new initial portfolio. The results are shown in table 5, table 6 and table 7. With different initial weights, the main results do not change significantly. This implies that the simulations are robust to the value-weighted starting portfolio.

We also estimate different performance measures with the simulated monthly weights (assuming that researchers can obtain managers’ monthly holdings). Similar to the cases with the quarterly weights, there are biases for the existing measures, and the biases are much smaller for the bias-adjusted measures. The critical values and the distributions of the T-statistics are also similar to those with quarterly weights.

In sum, the simulations show that the original performance measures are biased under various uninformed strategies, and the sizes of the biases are not the same.
However, the bias-adjusted measures can control for biases under different assumptions about the uninformed strategies, and produce appropriate distributions for the T-statistics.

1.4.3 The simulation results for informed strategies

1.4.3.1 The estimated mean and standard errors

We compare the estimated values and the standard errors for various measures when managers only report quarterly holdings (table 8). Not surprisingly, the estimated weight-based measures increase with $\beta$, which represents the manager’s information. We also find that standard errors are generally smaller for the WLS measures. As we explained, this is because the WLS measures are more efficient. Interestingly, the standard errors are similar for the bias-adjusted DGTW measure and the weight-change measure, although they are smaller for the bias-adjusted DGTW measure. This result leads to similar power for these two measures when their estimated values are similar. We explain this finding in the appendix. The main reason is that the estimated variances of both measures are approximated by the fourth moment of the DGTW adjusted returns.

We also compare the estimated means and standard errors of the performance measures shown in table 8, which use the quarterly holdings, with those using monthly data, assuming that the monthly holdings are reported. The results (shown in table 10 and normalized to quarterly values) are surprising: the estimated standard errors are not smaller with higher data frequency. The reason is as follows. Although there is interim trading during the quarter, the weights are persistent even with the informed strategies (we assume $\rho = 0.95$ in our simulations). This implies that the risk of the portfolio does not change significantly during the quarter. Hence, the standard deviation of the quarterly measures should be similar to the normalized standard deviation of the monthly measures.
However, because observed monthly weights for an informed strategy contain more information than quarterly weights, the true value of the performance measures using monthly weights (normalized to quarterly values) is larger than the true value of the performance measures using the quarterly data. Therefore, the measures are in general larger with the monthly weights (in table 10) than with the quarterly weights (in table 8).

1.4.3.2 The power of different performance measures

The results on the power of the performance measures are presented in table 9. In this table, the data frequency is quarterly and the DGTW benchmark return is assumed to be misspecified. As expected, for all the different measures, the power increases when the covariance of the return and the weight is larger ($\beta$ is larger). This is because $\beta$ measures how informative the managers are: a higher value of $\beta$ implies that the managers are more informed about future stock returns. Meanwhile, for every performance measure, the WLS version has greater power. For example, when $\beta = 0.25$, the rejection ratio for the bias-adjusted WLS DGTW measure is 9.6%, which is much larger than that of the bias-adjusted DGTW measure, 7.1%.

Table 11 shows the statistical rejection ratios of these measures for the monthly data. These ratios are much larger than the values in table 9. The reason has been explained in the previous subsection: it is not due to smaller estimated standard errors with a larger sample size, but rather because we have more information with the monthly holdings data (the estimated performance measures are larger).

1.4.3.3 Power of the performance measures when researchers create monthly weights with quarterly holdings

In order to use monthly returns data to create monthly weights, we assume that the manager adopts a buy-and-hold strategy during the months in which we do not observe the holdings; thus, we can create monthly weights of mutual funds as in section 3.
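One way to picture the created monthly weights is the standard buy-and-hold drift of portfolio weights: with no trading, each weight grows with its own stock return and the weights are then rescaled to sum to one. The Python sketch below assumes this drift formula and uses hypothetical function names and inputs; the exact construction in section 3 is not reproduced here.

```python
def drift_weights(weights, returns):
    """Weights one month later when the holdings (share counts) do not change."""
    grown = [w * (1.0 + r) for w, r in zip(weights, returns)]
    total = sum(grown)
    return [g / total for g in grown]


def created_monthly_weights(reported_weights, monthly_returns):
    """Reported quarter-end weights followed by the drifted weights of each later month."""
    path = [list(reported_weights)]
    for month_returns in monthly_returns:
        path.append(drift_weights(path[-1], month_returns))
    return path


# Hypothetical two-stock fund: reported weights, then two months of stock returns.
path = created_monthly_weights([0.5, 0.5], [[0.10, -0.10], [0.0, 0.0]])
```

In this example the first stock's weight rises to 55% after the first month purely because of its return, which is how realized monthly returns enter the created weights.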
An important question is whether incorporating monthly data can improve the power of the bias-adjusted measures. In this section, we use simulation to show the power of various measures when researchers use quarterly holdings data to create monthly weights. The stock returns and the manager’s strategies are created in the same way as before.

From the simulation, we find that if researchers create monthly weights with the buy-and-hold strategy to estimate performance measures, the rejection ratio of the statistics is in general higher than when they use the quarterly weights. For example, the rejection ratio of the bias-adjusted DGTW measure when $\beta$ is 0.25 is 7.1% with the quarterly weights (in table 9) and 43.3% with the created monthly weights (in table 11). As before, the higher rejection ratio with the created monthly weights is not due to lower estimated standard errors of the measures when the sample size is larger (the created monthly weight data contain 360 periods and the quarterly weight data contain 120 periods), but to the larger estimated values of the measures (the created monthly weights contain more information to predict stock returns). To further illustrate this point, table 8 and table 10 present the estimated values and standard errors for these performance measures. For example, the estimated bias-adjusted DGTW measures when $\beta$ is 0.25 for quarterly and created monthly weights are 0.4% and 1.71%, but the estimated standard errors are 1.04 and 0.91, respectively. It is clear that the estimated values are much larger with the created monthly weights, but the standard errors are similar.

The reason that the created monthly weights can contain more information is as follows. Since a fund’s weights in the previous month can predict current stock returns, and the fund’s weights are highly autocorrelated, the current returns are also correlated with the current weights.
Because the current weights also predict stock returns in the next month, the current stock returns, which are correlated with current weights, contain information about next-month stock returns. Hence, by incorporating the current stock returns, the created monthly weights (under the buy-and-hold assumption) can contain more information to predict next-month stock returns than the quarterly weights. This can lead to higher estimated values of the performance measures as well as higher rejection ratios for the null hypothesis.

On the other hand, the rejection ratio is much larger when researchers observe monthly weight data. For example, the rejection ratio of the bias-adjusted DGTW measure when $\beta$ is 0.25 can be as large as 87.2% (in table 13), which is much higher than the ratios when the monthly weight data are unavailable. The much higher rejection ratio is due to larger estimated values, not to smaller estimated standard errors. This finding indicates that when there is a persistent component of stock returns, estimated measures using the created monthly weights can partly capture the interim trading within each quarter. However, since the true monthly weights are different from the created monthly weights, the measures with created monthly weights do not outperform the ones with observed monthly data (see table 10 and table 12).

1.5 Empirical results using mutual fund data

1.5.1 Model misspecification and the relative size of the stocks

As we have discussed in previous sections, the average DGTW benchmark adjusted return taken over all stocks and all times is positive. Since the DGTW benchmarks are value-weighted, the stocks with relatively smaller size in the benchmark portfolio have larger DGTW benchmark adjusted returns. Hence, if a manager holds more relatively large stocks in her portfolio, the DGTW measure should be negatively biased. Therefore, we first test whether the DGTW CS measures are affected by the relative sizes of the stocks.
To this end, we define the relative size of stocks as follows. First, we divide the stocks in each DGTW benchmark portfolio into five groups by market cap. The first group contains the stocks with the lowest market cap, and the fifth group contains the stocks with the highest market cap. Next, we combine the stocks in the first groups from all 125 DGTW benchmark portfolios into the relatively small group, and the stocks in the fifth groups from all 125 benchmark portfolios into the relatively large group. For each relative size group, we calculate the average size of the stocks as the “relative size” of that group. The funds are sorted by the weighted average of these relative sizes, and the different weight-based measures are estimated for the funds in each group.

Our findings are shown in table 14. In this table, the funds that hold the largest portion of the relatively small stocks (low group) have positive and significant DGTW measures. The T-statistics can be as large as 4.6. The funds that hold the largest portions of the relatively large stocks (high group) have negative and significant DGTW measures. The difference between the high and low groups is also significantly negative. The results are similar with the WLS DGTW measure.

If we apply the bias-adjusted DGTW measure, the estimated value for the low group funds is much smaller than that with the original DGTW measure, and for the high group funds, the estimated measure is much larger than that with the original DGTW measure. The difference in the estimated bias-adjusted DGTW measures between the high and low groups is only 1/3 of that found with the original DGTW measure, and the significance is also much lower. Using the bias-adjusted WLS DGTW measure, the estimated value for the high-minus-low group is only 1/7 of that with the original DGTW measure, and the T-statistic is insignificant at the 10% level.
From these results, the funds holding relatively small stocks perform “better” under the DGTW CS measure, since the DGTW benchmark adjusted returns are, on average, positive for relatively smaller stocks (the DGTW benchmark returns are misspecified). However, using the bias-adjusted measures can mitigate such benchmark misspecification effects.

Next, we want to evaluate whether the relative sizes are indeed related to the benchmark misspecification ($\alpha_i$) of the stocks. In particular, $\alpha_i$ refers to the misspecification of the DGTW benchmarks in modeling the expected returns of stocks; thus, we use the average of the DGTW adjusted returns, $\frac{1}{24}\sum_{s=t+1}^{t+24}(r^i_s - r^{D_{s-1},i}_s)$, to estimate the alpha for stock $i$ at time $t$ (specifically, we assume that $E\big(\frac{1}{24}\sum_{s=t+1}^{t+24}(r^i_s - r^{D_{s-1},i}_s)\big) = \alpha_i$). Empirically, $\alpha_i$ could be time-varying; thus, it is more appropriate to use returns that are not too remote from the current period (Fama and MacBeth (1973) first apply this idea to estimate the time-varying beta). Next, we regress the average DGTW adjusted returns $\frac{1}{24}\sum_{s=t+1}^{t+24}(r^i_s - r^{D_{s-1},i}_s)$ on the relative size at time $t$ using a panel regression (footnote 14). The estimated coefficient in this regression is negative and significant (with a T-statistic of 56.95). Therefore, we conclude that the benchmark misspecification is affected by the relative size of the stocks. This further explains why a manager who holds relatively large stocks will have negatively biased DGTW CS measures.

Footnote 14: This regression is a panel predictive regression. However, it does not suffer from a Stambaugh bias when the number of stocks is large. The key reason is that it is a panel regression with a constant term (instead of dummy variables), i.e.

$$r^i_t - r^{D_{t-1},i}_t = \alpha + \gamma\, size^i_t + \varepsilon^i_t,$$

where $size^i_t$ represents the relative size of the stock and $\alpha$ is a constant term. In this case, the estimated coefficient can be written as

$$\hat{\gamma} = \frac{\sum_{i,t}\Big(r^i_t - r^{D_{t-1},i}_t - \frac{1}{NT}\sum_{i,t}(r^i_t - r^{D_{t-1},i}_t)\Big)\Big(size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t\Big)}{\sum_{i,t}\Big(size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t\Big)^2}.$$

Here $N$ is the number of stocks. Substituting the dynamics of $r^i_t - r^{D_{t-1},i}_t$, the estimated coefficient becomes

$$\hat{\gamma} = \gamma + \frac{\sum_{i,t}\Big(\varepsilon^i_t - \frac{1}{NT}\sum_{i,t}\varepsilon^i_t\Big)\Big(size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t\Big)}{\sum_{i,t}\Big(size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t\Big)^2}.$$

From the expression above, the Stambaugh bias depends on the correlation between $\frac{1}{NT}\sum_{i,t}\varepsilon^i_t$ and $size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t$. If the error terms in the regression for stock $i$ are not highly correlated with the size of other stocks (i.e. $\varepsilon^i_{t_1}$ is not highly correlated with $size^j_{t_2}$ for $j \neq i$ and $t_1, t_2$ from time 1 to time $T$) compared with the variance of the error term itself, the correlation between $\frac{1}{NT}\sum_{i,t}\varepsilon^i_t$ and $size^i_t - \frac{1}{NT}\sum_{i,t} size^i_t$ converges to zero when the number of stocks, $N$, is large. Since the $\varepsilon^i_{t_1}$ represent the error term in a regression where the dependent variables are the DGTW benchmark adjusted returns, they are not highly correlated with the returns of other stocks or the size of other stocks compared with the variance of the error terms. We have at least 2000 stocks with existing DGTW benchmark adjusted returns in each period, which is large enough to make the Stambaugh bias negligible.

We also create monthly weights from the monthly returns and quarterly holdings, with the assumption that managers adopt a buy-and-hold strategy between two consecutive reporting dates. Using the created monthly weights, we can also estimate the monthly performance measures for different groups of funds sorted by the weighted average of relative sizes. The results are shown in table 15. Similar to the quarterly weights, the funds holding large stocks have negatively biased DGTW CS measures, and these biases can be mitigated through the bias-adjusted measures. The difference between the funds holding the relatively small stocks and the funds holding the relatively large stocks is much larger with the DGTW CS measure than with the bias-adjusted measures. Interestingly, most of the estimated standard errors using the quarterly weights are within 20% of those estimated with the created monthly weights. This is consistent with the simulation results, which show that the asymptotic standard deviation does not change significantly with a higher frequency of the data.

In addition to the relative size, we also group the funds according to the relative value and past returns, and compare the performance measures of the funds holding relatively high value (high past return) stocks with those of the funds holding relatively low value (low past return) stocks. The results are shown in table 17, table 16, table 19 and table 18. For the DGTW measure, the funds with relatively high value stocks significantly outperform the funds with relatively growth-oriented stocks, and the difference between the funds with relatively high past returns and the funds with relatively low past returns is weakly significant (with created monthly weights and monthly data only). If we apply the bias-adjusted DGTW measures, the differences become insignificant.

1.5.2 Persistence of the bias-adjusted DGTW CS performance measures

In this subsection, we examine the persistence of different DGTW performance measures (the weight-change measure does not capture stock momentum; therefore, the persistence of these measures is due to the widespread use of the momentum strategy (Grinblatt, Titman and Wermers (1995))). To test for persistence, we group the funds according to past performance measures.
The procedure is as follows. First, for the DGTW CS measure, we estimate the average values over the past five years (we also estimate the average performance measures over the past three years, and the results are similar). For the bias-adjusted measure, we only use the average values from five years to one year before the current time point. This is because future DGTW benchmark adjusted returns are used for calculating the measure; thus, it is unrealistic to estimate the bias-adjusted measure within 12 months unless we know the future stock returns. Second, we use the past performance measure to group the funds: the first group contains the funds with the lowest past performance, and the fifth group contains the funds with the highest past performance.

After grouping, we estimate the next-period performance measures (DGTW measure, bias-adjusted measure, WLS measure, bias-adjusted WLS measure, the reported returns and the Carhart four-factor alpha based on reported returns) for each group. The performance of funds in each group is estimated by taking the average of the performance measures over time. Moreover, we calculate the difference in performance measures between the high (fifth) and low (first) groups.

The results are shown in table 19. In this table, the original DGTW measure and the WLS DGTW measure cannot predict themselves, but the bias-adjusted measure and the bias-adjusted WLS measure can predict their future values. The performance measure differences between the high and low groups are much larger and significant for the bias-adjusted measures. This result suggests that the managers who choose stocks based on information may consistently do so.

None of these performance measures can predict the return-based measures. This is consistent with Berk and Green (2004): the return-based measures are after-cost measures, but the weight-based measures are before-cost measures. The informed manager can extract rents through fees and roughly cover costs.
Therefore, the after-cost performance may not be predictable based on the before-cost weight-based performance measures.

1.5.3 Revisiting previous results

We apply the existing measures as well as the bias-adjusted and the WLS measures to estimate the performance of funds grouped by funds’ return gaps. The return gap (introduced by Kacperczyk, Sialm and Zheng (2008)) is defined in section 3. In their paper, the authors group the funds according to the return gap and calculate the performance measures for each group and the differences in the performance measures between the high and low groups. With the created monthly weights, they find that both the DGTW CS measure and the weight-change measure are significantly larger for the high return gap group, although the significance level is much smaller than that for the return-based measures. Ferson and Mo (2013) use the quarterly weights, but they find no sign of a difference with the DGTW CS measure. Intuitively, since the return gap captures the manager’s interim trading between the reported holding periods, the weight-based measures, which exclude the effect of interim trading, may be uncorrelated with the return gap.

We also apply the DGTW CS measure, the weight-change measure and the corresponding bias-adjusted or WLS measures to this empirical question. The estimated values, standard errors and T-statistics of the different measures are shown in table 21. First, the estimated performance measures are different: for example, the estimated values for the DGTW CS measure are much larger for the funds in the low return gap groups, but smaller for the funds in the high return gap group. This implies that the manager in the low return gap group holds a larger portion of negatively misspecified stocks. Second, the WLS measures have smaller estimated standard errors than the unadjusted measures. This is due to the fact that the WLS measures are more efficient.
Moreover, for the DGTW CS measure, we find that the T-statistics for all five groups and for the difference between the high and low groups are insignificantly different from zero. The T-statistics are similar for the bias-adjusted or the bias-adjusted and WLS DGTW measures as well as for the different weight-change measures. These results are consistent with our prediction that the weight-based measures may be uncorrelated with the return gaps.

We also create monthly weights and estimate the monthly performance measures for the return gap groups. The results are presented in table 22. An interesting observation is that the T-statistics for most performance measures increase compared with the measures using quarterly data. This observation is consistent with the simulation results, where the higher T-statistics are determined by the higher estimated values of the performance measures. Specifically, in most cases, the estimated standard errors are not significantly different whether we use the quarterly weights or the created monthly weights, but the estimated values of the performance measures with the created monthly weights are higher than those with the quarterly weights. Empirically, these values are larger because the monthly stock returns incorporated in the created monthly weights may contain information about the manager’s ability, as shown in the simulation studies.

When using the created monthly weights, Kacperczyk, Sialm and Zheng (2008) find that with the DGTW measure, the funds with high return gaps perform better than the funds with low return gaps. The original DGTW CS measure for the difference between the high and low groups is 0.19%, and the T-statistic is significant at the 10% level. Similarly, the results for the weight-change measures also suggest that the funds with high return gaps are still more informed than the funds with low return gaps.
The bias-adjusted weight-change measure for the funds in the high minus low group is larger than the original weight-change measure, although both are significant. However, neither the bias-adjusted nor the original weight-change measure captures the size, value and momentum effects. Therefore, the findings using these measures can be similar if the difference between the high and low groups is due to a trading strategy based on these stock properties. The most interesting finding is that when using the bias-adjusted DGTW measure, such differences in the high minus low group disappear. The bias-adjusted DGTW CS measure for the difference between the high and low groups is 0.08% (much smaller than the 0.19% obtained with the original measure), and is insignificant. This is due to the fact that, for the low return gap funds, the bias-adjusted measure is much higher than the original DGTW measure. This implies that the managers whose funds have low return gaps may also be the managers who select negatively misspecified stocks; hence, the performance of these managers may be undervalued unless we use the bias-adjusted measure.

1.6 Conclusion

In this paper, we build the connection between weight-based performance measures and the panel predictive regression framework. The existing measures exhibit bias and inefficiency in the panel regression framework. To deal with these issues, bias-adjusted measures and WLS measures are introduced. We use simulations to show that biases are economically large for popular weight-based performance measures when confronted with uninformed strategies, and that the T-statistics of these measures are unreliable. The bias-adjusted measures do not suffer from these issues. Moreover, we simulate an informed strategy and study the power of various weight-based performance measures.
Finally, the bias-adjusted measures and WLS measures are applied to the real data, and these new measures lead to new empirical evidence relative to the existing literature. The measures introduced in this paper can be applied to the growing literature studying mutual fund managers’ performance. We can use the bias-adjusted measures to distinguish the informed managers from uninformed ones. For example, many papers in the current literature tend to find that funds with certain properties outperform others. However, since the existing measures suffer from a bias, it is not clear whether the outperforming funds are informed or whether the outperformance is due to the bias. The bias-adjusted measure can help researchers to find the truly informed fund managers. This method is not limited to mutual funds, but can also be used for hedge funds or individual investors when holdings data are available. In addition to the bias-adjusted measures, the WLS measures are more efficient at distinguishing the informed managers; thus, these measures are more likely to help researchers as well as practitioners in identifying informed managers.

References

Amihud Yakov and Hurvich Clifford, 2004, Predictive Regressions: A Reduced-Bias Estimation Method, Journal of Financial and Quantitative Analysis 39, 813-841.
Amihud Yakov, Hurvich Clifford and Yi Wang, 2008, Multiple-Predictor Regressions: Hypothesis Testing, Review of Financial Studies 22, 414-434.
Amihud Yakov, Hurvich Clifford and Yi Wang, 2010, Predictive Regression with Order-p Autoregressive Predictors, Journal of Empirical Finance 17, 513-525.
Chan Louis, Dimmock Stephen, and Lakonishok Josef, 2009, Benchmarking money manager performance: Issues and evidence, Review of Financial Studies 22(11), 4553-4599.
Cremers Martijn and Petajisto Antti, 2009, How active is your fund manager? A new measure that predicts performance, Review of Financial Studies 22, 3329-3365.
Cremers Martijn, Petajisto Antti and Zitzewitz Eric, 2010, Should Benchmark Indices Have Alpha?
Revisiting Performance Evaluation, Critical Finance Review, forthcoming.
Cohen Randolph, Joshua Coval and Lubos Pastor, 2005, Judging Fund Managers by the Company They Keep, Journal of Finance 60, 1057-1096.
Daniel Kent, Mark Grinblatt, Sheridan Titman, and Russ Wermers, 1997, Measuring Mutual Fund Performance with Characteristic Based Benchmarks, Journal of Finance 52, 1035-1058.
Fama Eugene F, 1990, Stock returns, expected returns, and real activity, Journal of Finance 45, 1089-1108.
Fama Eugene, and Kenneth French, 1988a, Dividend yields and expected stock returns, Journal of Financial Economics 22, 3-25.
Fama Eugene, and Kenneth French, 1988b, Permanent and temporary components of stock prices, Journal of Political Economy 96, 246-273.
Fama Eugene, and Kenneth French, 1989, Business conditions and expected returns on stocks and bonds, Journal of Financial Economics 25, 23-49.
Farnsworth Heber, Wayne Ferson, David Jackson, and Steven Todd, 2002, Performance Evaluation with Stochastic Discount Factors, Journal of Business 75, 473-504.
Ferson Wayne, 2012, Ruminations on Investment Performance Measurement, European Financial Management.
Ferson Wayne and Kenneth Khang, 2002, Conditional Performance Measurement Using Portfolio Weights: Evidence for Pension Funds, Journal of Financial Economics 65, 249-282.
Ferson Wayne and Haitao Mo, 2013, Measuring Performance with Market and Volatility timing and Selectivity, working paper.
Ferson Wayne, Timothy Simin and Sergei Sarkissian, 2003, Spurious regressions in Financial Economics, Journal of Finance 58, 1393-1414.
Granger Clive, and Paul Newbold, 1974, Spurious regressions in econometrics, Journal of Econometrics 2, 111-120.
Grinblatt Mark, Sheridan Titman, 1989a, Mutual fund performance: an analysis of quarterly portfolio holdings, Journal of Business 62, 393-416.
Grinblatt Mark, Sheridan Titman, 1989b, Portfolio performance evaluation: old issues and new insights, Review of Financial Studies 2, 393-422.
Grinblatt Mark, Sheridan Titman, 1993, Performance measurement without benchmarks: an examination of mutual fund returns, Journal of Business 66, 47-68.
Grinblatt Mark, Sheridan Titman, and Russ Wermers, 1995, Momentum Investment Strategies, Portfolio Performance, and Herding: A Study of Mutual Fund Behavior, American Economic Review 85, 1088-1105.
Hjalmarsson Erik, 2007, The Stambaugh Bias in Panel Predictive Regressions, International Finance Discussion Papers.
Jarque Carlos, Bera Anil, 1980, Efficient tests for normality, homoscedasticity and serial independence of regression residuals, Economics Letters 6(3), 255-259.
Jiang George, Tong Yao and Tong Yu, 2007, Do Mutual Funds Time the Market? Evidence from Portfolio Holdings, Journal of Financial Economics 86, 724-758.
Kacperczyk Marcin, Clemens Sialm, and Lu Zheng, 2008, Unobserved actions of mutual funds, Review of Financial Studies 21, 2379-2416.
Kothari S.P., Jerold Warner, 2001, Evaluating Mutual Fund Performance, Journal of Finance 56, 1985-2010.
Lilliefors H, 1967, On the Kolmogorov-Smirnov test for normality with mean and variance unknown, Journal of the American Statistical Association 62, 399-402.
Pastor Lubos and Robert Stambaugh, 2009, Predictive Systems: Living with Imperfect Predictors, Journal of Finance 64, 1583-1628.
Petajisto Antti, 2013, Active Share and Mutual Fund Performance, working paper.
Petersen Mitchell, 2009, Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches, Review of Financial Studies 22, 435-480.
Ritter Jay and Donghang Zhang, 2007, Affiliated Mutual Funds and the Allocation of Initial Public Offerings, Journal of Financial Economics 86, 337-368.
Stambaugh Robert, 1999, Predictive regressions, Journal of Financial Economics 54, 375-421.
Wermers Russ, 2004, Is Money Really ’Smart’? New Evidence on the Relation Between Mutual Fund Flows, Manager Behavior, and Performance Persistence, working paper.
Wermers Russ, 2006, Performance Evaluation with Portfolio Holdings Information, North American Journal of Economics and Finance 17, 207-230.
Wermers Russ, 2011, Performance Measurement of Mutual Funds, Hedge Funds, and Institutional Accounts, Annual Review of Financial Economics, 537-574.

Appendix: The standard errors for different measures for informed strategies

We show in this section that this result is theoretically valid. The power of a measure is characterized by its variance. The variances of the bias-adjusted DGTW CS measure and the weight-change measure are

$$Var\big((w_t - w_{t-1})'((r_{t+1} - r^{D}_{t+1,t}) - (\bar{r}_{t} - \bar{r}^{D}_{t+1,t}))\big), \qquad Var\big((w_t - w_{t-1})'(r_{t+1} - \bar{r}_{t})\big),$$

respectively. In this paper, we assume that the stock returns follow the dynamics

$$r_{t+1} = r^{D}_{t+1,t} + \varepsilon_{t+1}.$$

If we plug this into the expressions for the variances, they become

$$Var\big((w_t - w_{t-1})'(\varepsilon_{t+1} - \bar{\varepsilon}_{t})\big), \qquad Var\big((w_t - w_{t-1})'((\varepsilon_{t+1} - \bar{\varepsilon}_{t}) + (r^{D}_{t+1,t} - \bar{r}^{D}_{t+1,t}))\big).$$

Since the error $\varepsilon_{t+1}$ is independent of the benchmark return $r^{D}_{t+1,t}$, the variance of the weight-change measure can be decomposed into two parts:

$$Var\big((w_t - w_{t-1})'(\varepsilon_{t+1} - \bar{\varepsilon}_{t})\big) + Var\big((w_t - w_{t-1})'(r^{D}_{t+1,t} - \bar{r}^{D}_{t+1,t})\big).$$

The first part is the variance of the DGTW CS measure and the second part is positive. This implies that the variance of the weight-change measure is larger, i.e. the power of the weight-change measure is smaller.

Interestingly, the difference between the variances of the two measures is not very large with the simulated informed strategies. To further test this argument, I also simulate an informed strategy in which managers choose only the stocks with a relatively high variance of the benchmark returns, $Var(r^{D}_{t+1,t})$. This strategy can enlarge $Var(w_t' r^{D}_{t+1,t})$, $Var(w_{t-1}' r^{D}_{t+1,t})$ and the variance of the weight-change measure.
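The decomposition argument above can be checked numerically. The sketch below is a simplified, single-stock illustration and not the simulation design of section 1.4: it assumes the weight change, the benchmark return and the error are drawn independently from normal distributions with hypothetical parameters, and it omits any mean adjustment. Under those assumptions the variance of the weight-change term should be close to the sum of the DGTW-type term and the benchmark term.

```python
import random
import statistics


def simulate_variances(n=20000, seed=1):
    """Return sample variances of the DGTW-type, benchmark and weight-change terms."""
    rng = random.Random(seed)
    dgtw_terms, bench_terms, wc_terms = [], [], []
    for _ in range(n):
        dw = rng.gauss(0.0, 0.02)        # hypothetical weight change w_t - w_{t-1}
        r_bench = rng.gauss(0.01, 0.05)  # hypothetical benchmark return
        eps = rng.gauss(0.0, 0.08)       # error, independent of the benchmark return
        dgtw_terms.append(dw * eps)              # benchmark-adjusted return only
        bench_terms.append(dw * r_bench)         # benchmark component
        wc_terms.append(dw * (r_bench + eps))    # raw return
    return (
        statistics.variance(dgtw_terms),
        statistics.variance(bench_terms),
        statistics.variance(wc_terms),
    )


var_dgtw, var_bench, var_wc = simulate_variances()
print(f"Var(DGTW term)          = {var_dgtw:.3e}")
print(f"Var(benchmark term)     = {var_bench:.3e}")
print(f"Var(weight-change term) = {var_wc:.3e}")
print(f"Sum of the two parts    = {var_dgtw + var_bench:.3e}")
```

Raising the standard deviation of `r_bench` in this sketch widens the gap between the two variances, which mirrors the high-benchmark-variance strategy described in the text.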
However, we still find that the powers of these two measures are not significantly different under this strategy, although the weight-change measure still has larger standard errors in most cases.

Table 1.1: Summary statistics of the mutual funds data

This table presents the summary statistics of our data sets. For each fund, we calculate the time-series average of the assets (Assets), the average number of stocks (num stocks), and the return gap (the difference between the raw return, calculated using the weights of stocks and their corresponding returns, and the before-fee return, calculated as the fund's reported return plus the expense ratio), as well as the estimated coefficients in the regression $w^{i}_{t+1} = \rho_{j} w^{i}_{t+1-j} + \varepsilon^{i}_{t+1}$, where j is from 1 to 5, and the correlations (Error Corr) between the errors in the predictive system (the error in the weight regression above and the error in the predictive regression in equation 1). Mean, std, max, min, skewness and kurtosis are taken across individual funds (2200 funds).

                        mean        std      max       min     skewness  kurtosis
Assets ($million)       656         2471     54079     10      13        213
num stocks              143.9       270.5    3192.4    14.7    6.7       58.0
Return gap (% returns)  4.22e-003   0.30     2.89      -3.60   -0.23     33.29
rho_1                   0.84        0.17     1.27      0       -3.22     16.07
rho_2                   0.73        0.21     1.75      -0.13   -1.48     5.98
rho_3                   0.64        0.25     1.20      -0.58   -1.01     3.82
rho_4                   0.57        0.27     1.22      -0.63   -0.64     2.87
rho_5                   0.51        0.31     1.55      -3.80   -1.64     20.55
Error Corr              -0.0022     0.060    0.75      -0.47   0.44      24.01

Table 1.2: The means and T-statistics of different performance measures, buy-and-hold strategy (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure.
Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.

Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               1.35               3.18     3.44     6.68     7.08     0    1
  WLS (DGTW)         0.74               1.78     2.15     5.95     6.29     0    0
Weight-Change measure
  WC                 0.43               0.66     1.10     4.54     4.80     1    0
  WLS (WC)           0.51               0.73     1.21     4.79     5.06     1    1

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        -2.6e-3            -2.14    -1.81    1.79     2.14     0    0
  Bias-WLS (DGTW)    0.01               -1.98    -1.63    1.79     2.10     0    0
Weight-Change measure
  Bias (WC)          0.06               -1.71    -1.44    2.01     2.22     0    0
  Bias-WLS (WC)      0.07               -1.55    -1.24    1.89     2.23     0    0

Table 1.3: The means and T-statistics of different performance measures, momentum strategy (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure. Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.
Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               1.32               1.94     2.18     5.40     5.75     0    0
  WLS (DGTW)         0.55               0.30     0.86     4.65     5.03     0    0
Weight-Change measure
  WC                 0.56               -0.74    -0.42    2.96     3.30     0    0
  WLS (WC)           0.63               -0.77    -0.45    3.12     3.35     0    0

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        -0.04              -2.28    -1.91    1.77     2.06     0    0
  Bias-WLS (DGTW)    1.42e-003          -2.08    -1.73    1.86     2.29     0    0
Weight-Change measure
  Bias (WC)          0.13               -1.82    -1.57    2.09     2.49     0    0
  Bias-WLS (WC)      0.15               -1.80    -1.54    2.10     2.49     0    0

Table 1.4: The means and T-statistics of different performance measures, random selection strategy (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure. Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.
Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               0.38               -1.31    -0.97    2.29     2.62     0    0
  WLS (DGTW)         0.11               -1.77    -1.51    1.96     2.23     1    0
Weight-Change measure
  WC                 0.23               -1.63    -1.35    1.97     2.27     0    0
  WLS (WC)           4.58e-003          -1.95    -1.67    1.81     2.10     0    0

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        0.08               -2.21    -1.95    2.06     2.36     0    0
  Bias-WLS (DGTW)    0.07               -2.20    -1.88    1.91     2.23     1    0
Weight-Change measure
  Bias (WC)          0.07               -2.13    -1.98    1.97     2.37     1    1
  Bias-WLS (WC)      0.04               -2.29    -1.87    1.90     2.28     0    0

Table 1.5: The means and T-statistics of different performance measures, buy-and-hold strategy, value weighted initial portfolios (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure. Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.
Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               1.35               3.15     3.48     7.12     7.43     0    0
  WLS (DGTW)         0.70               1.78     2.12     6.31     6.53     0    0
Weight-Change measure
  WC                 0.35               0.26     0.62     4.00     4.38     0    0
  WLS (WC)           0.29               -0.56    -0.32    3.26     3.64     0    0

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        -0.01              -2.24    -1.85    1.76     2.14     0    0
  Bias-WLS (DGTW)    -0.01              -2.16    -1.69    1.68     2.06     0    0
Weight-Change measure
  Bias (WC)          0.05               -1.7     -1.42    1.90     2.23     0    0
  Bias-WLS (WC)      0.05               -1.72    -1.42    1.77     2.04     0    0

Table 1.6: The means and T-statistics of different performance measures, momentum strategy, value weighted initial portfolios (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure. Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.
Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               1.33               1.92     2.23     5.62     5.98     0    0
  WLS (DGTW)         0.58               0.62     0.90     4.77     5.19     0    0
Weight-Change measure
  WC                 0.51               -0.86    -0.56    2.79     3.07     0    0
  WLS (WC)           0.12               -1.34    -1.14    2.03     2.36     1    1

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        -0.06              -2.39    -2       1.77     1.97     0    0
  Bias-WLS (DGTW)    0.01               -1.92    -1.67    1.75     2.03     0    0
Weight-Change measure
  Bias (WC)          0.03               -2.13    -1.74    1.89     2.31     0    0
  Bias-WLS (WC)      -0.37              -2.20    -1.95    1.36     1.71     1    1

Table 1.7: The means and T-statistics of different performance measures, random selection strategy, value weighted initial portfolios (quarterly)

This table presents the estimated mean of the various performance measures, the bias-adjusted measures, WLS measures and the corresponding critical values (5%, 10%, 90% and 95%) of the T-statistics with simulated stock returns and mutual fund holdings. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change and DGTW measures. WLS (WC) or (DGTW) is the WLS weight-change or DGTW measure. Bias-WLS (WC) and bias-WLS (DGTW) are the bias-adjusted and WLS weight-change and DGTW measures. Finally, we show the normality tests (JB test and Lilliefors test). For these tests, 1 represents rejecting normality at the 95% level.
Panel A: Original measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  DGTW               0.68               -1.01    -0.60    2.96     3.25     0    0
  WLS (DGTW)         0.26               -1.51    -1.19    2.38     2.60     0    1
Weight-Change measure
  WC                 0.20               -1.69    -1.41    1.96     2.25     0    0
  WLS (WC)           -0.10              -2.22    -1.82    1.53     1.90     0    0

Panel B: Bias-adjusted measures
                     Mean (% returns)   5% T     10% T    90% T    95% T    JB   Lilliefors
DGTW measure
  Bias (DGTW)        0.04               -2.34    -1.95    2.04     2.34     0    0
  Bias-WLS (DGTW)    0.03               -2.22    -1.92    2.01     2.42     0    0
Weight-Change measure
  Bias (WC)          -0.03              -2.36    -1.95    1.96     2.44     0    0
  Bias-WLS (WC)      -0.03              -2.48    -2.01    1.89     2.33     0    0

Table 1.8: Estimated value and standard errors of the informed strategy (quarterly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change or DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias (DGTW)
  Mean (% returns)         -0.06    0.02     0.08     0.15     0.28     0.4
  Std                      1.12     1.1      1.08     1.07     1.08     1.04
Weight-change measure: Bias (WC)
  Mean (% returns)         -0.03    0.05     0.11     0.18     0.31     0.43
  Std                      1.13     1.12     1.09     1.09     1.09     1.05

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias-WLS (DGTW)
  Mean (% returns)         0        0        0.05     0.14     0.24     0.43
  Std                      0.5      0.49     0.5      0.48     0.48     0.48
Weight-change measure: Bias-WLS (WC)
  Mean (% returns)         0.02     0.01     0.06     0.15     0.26     0.44
  Std                      0.56     0.56     0.57     0.54     0.54     0.55

Table 1.9: Rejection ratio of the informed strategy (quarterly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified.
The bias (WC) and (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and (DGTW) are the bias-adjusted WLS weight-change or DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias (DGTW)              2.4      3.6      3.6      4.8      5.8      7.1
Weight-Change measure
  Bias (WC)                2.4      3.2      2.9      4.3      4.9      6.9

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias-WLS (DGTW)          2.4      2.8      3.2      4.5      6.4      9.6
Weight-Change measure
  Bias-WLS (WC)            2.4      2.9      2.7      4.2      6.3      9.6

Table 1.10: Estimated value and standard errors of the informed strategy (created monthly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change or DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias (DGTW)
  Mean (% returns)         -0.07    0.38     0.63     0.91     1.29     1.71
  Std                      0.92     0.96     0.93     0.94     0.92     0.91
Weight-change measure: Bias (WC)
  Mean (% returns)         -0.04    0.41     0.66     0.94     1.32     1.74
  Std                      0.94     0.98     0.95     0.96     0.94     0.93

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias-WLS (DGTW)
  Mean (% returns)         -0.04    0.27     0.56     0.91     1.3      1.72
  Std                      0.52     0.52     0.52     0.52     0.52     0.52
Weight-change measure: Bias-WLS (WC)
  Mean (% returns)         -0.03    0.3      0.6      0.93     1.3      1.75
  Std                      0.59     0.59     0.59     0.59     0.59     0.59

Table 1.11: Rejection ratios of the informed strategy (created monthly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified.
The bias (WC) and (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and (DGTW) are the bias-adjusted WLS weight-change or DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias (DGTW)              2.4      5        10.1     16.9     26.4     42.2
Weight-Change measure
  Bias (WC)                2.4      5.8      10.4     18       26.5     43.3

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias-WLS (DGTW)          2.4      8        16.3     33.7     54.3     79.3
Weight-Change measure
  Bias-WLS (WC)            2.4      5.8      12.8     23.7     44.6     67.5

Table 1.12: Estimated value and standard errors of the informed strategy (monthly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified. The bias (WC) and bias (DGTW) are the bias-adjusted weight-change or DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias (DGTW)
  Mean (% returns)         0.04     0.77     1.59     2.34     3.24     4.01
  Std                      1.12     1.1      1.1      1.11     1.08     1.12
Weight-change measure: Bias (WC)
  Mean (% returns)         0.07     0.8      1.61     2.37     3.27     4.04
  Std                      1.13     1.11     1.11     1.12     1.09     1.13

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure: Bias-WLS (DGTW)
  Mean (% returns)         0        0.77     1.59     2.4      3.24     4.05
  Std                      0.51     0.51     0.5      0.5      0.5      0.5
Weight-change measure: Bias-WLS (WC)
  Mean (% returns)         0.03     0.78     1.59     2.42     3.25     4.06
  Std                      0.57     0.57     0.57     0.57     0.56     0.57

Table 1.13: Rejection ratio of the informed strategy (monthly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. In these simulations, we assume that the DGTW benchmark is misspecified.
The bias (WC) and (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and (DGTW) are the bias-adjusted WLS weight-change or DGTW measures. The values of the predictability parameter are 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias (DGTW)              2.4      10.3     31       58       81       87.2
Weight-Change measure
  Bias (WC)                2.4      11       31.5     58.2     81.5     87.3

Panel B: WLS measures
Predictability parameter   0        0.05     0.1      0.15     0.2      0.25
DGTW measure
  Bias-WLS (DGTW)          2.4      23.5     73.5     98.2     99.6     100
Weight-Change measure
  Bias-WLS (WC)            2.4      20       63.1     94.8     99.2     99.9

Table 1.14: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative size of the stocks (quarterly)

This table presents the value, standard errors and the T-statistics for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative size of the stocks chosen by the funds. We define the relative size of stocks as follows: First, for the stocks in each DGTW benchmark portfolio, we divide them into five groups by market cap. Therefore, the first group contains the stocks with the lowest market cap, and the fifth group contains the stocks with the highest market cap. Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively small group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively large group. For each relative size group, we calculate the average size of the stocks as the relative size of that group. The funds are sorted by the weighted average of these relative sizes.
                      Low      2        3        4        High     High-Low
DGTW
  Mean (% returns)    0.79     0.08     -0.17    -0.19    -0.23    -1.02
  Std                 1.68     0.99     0.95     0.73     0.92     2.11
  T-stat              4.61     0.80     -1.76    -2.62    -2.45    -4.73
Bias-adjusted DGTW
  Mean (% returns)    0.43     0.03     0.02     0.01     0.14     -0.29
  Std                 1.29     1.25     1.04     1.02     1.16     1.29
  T-stat              3.22     0.25     0.20     0.057    1.18     -2.16
WLS DGTW
  Mean (% returns)    0.97     0.33     0.22     0.16     0.23     -0.74
  Std                 1.67     1.13     0.97     0.70     1.03     1.80
  T-stat              5.7      2.87     2.18     2.19     2.15     -4.06
Bias-adjusted WLS DGTW
  Mean (% returns)    0.13     0.01     -0.08    -0.06    -0.02    -0.15
  Std                 0.74     0.57     0.51     0.51     0.72     0.87
  T-stat              1.66     0.14     -1.43    -1.05    -0.25    -1.64
WC
  Mean (% returns)    0.73     0.24     0.07     0.03     0.01     -0.71
  Std                 1.71     1.71     1.61     1.60     1.86     1.46
  T-stat              4.09     1.32     0.42     0.15     0.077    -4.70
Bias-adjusted WC
  Mean (% returns)    0.80     0.27     0.14     0.14     0.21     -0.59
  Std                 2.43     2.28     2.03     2.39     2.92     1.86
  T-stat              3.17     1.13     0.66     0.55     0.71     -3.03
WLS WC
  Mean (% returns)    0.32     0.11     0.02     0.01     -0.03    -0.36
  Std                 0.62     0.45     0.47     0.55     0.95     0.94
  T-stat              4.99     2.41     0.38     0.12     -0.32    -3.64
Bias-adjusted WLS WC
  Mean (% returns)    0.25     0.11     -0.01    0.04     0.04     -0.21
  Std                 0.74     0.64     0.61     0.76     1.17     0.93
  T-stat              3.21     1.65     -0.11    0.49     0.34     -2.11

Table 1.15: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative size of the stocks (monthly)

This table presents the value, standard errors and the T-statistics (normalized to the quarterly numbers) for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative size of the stocks chosen by the funds. We define the relative size of stocks as follows: First, for the stocks in each DGTW benchmark portfolio, we divide them into five groups by market cap. Therefore, the first group contains the stocks with the lowest market cap, and the fifth group contains the stocks with the highest market cap.
Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively small group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively large group. For each relative size group, we calculate the average size of the stocks as the relative size of that group. The funds are sorted by the weighted average of these relative sizes.

                      Low      2        3        4        High     High-Low
DGTW
  Mean (% returns)    1.05     0        -0.27    -0.33    -0.44    -1.49
  Std                 1.43     1.03     0.94     0.80     0.86     1.65
  T-stat              7.17     -0.018   -2.79    -4.07    -4.96    -8.83
Bias-adjusted DGTW
  Mean (% returns)    0.49     0.10     0.004    0.07     0.13     -0.36
  Std                 1.32     1.38     1.20     1.15     1.21     1.16
  T-stat              3.56     0.71     0.36     0.55     1.01     -2.98
WLS DGTW
  Mean (% returns)    1.16     0.27     0.11     0.07     0.10     -1.06
  Std                 1.57     1.12     0.87     0.77     1.31     1.68
  T-stat              7.24     2.35     1.25     0.84     0.77     -6.19
Bias-adjusted WLS DGTW
  Mean (% returns)    0.15     -0.03    -0.07    -0.04    -0.03    -0.18
  Std                 0.76     0.68     0.60     0.58     0.79     0.74
  T-stat              1.93     -0.47    -1.17    -0.73    -0.40    -2.38
WC
  Mean (% returns)    0.76     0.22     0.08     0.06     -0.01    -0.77
  Std                 1.60     1.80     1.57     1.48     1.56     1.08
  T-stat              4.53     1.18     0.48     0.36     -0.063   -6.78
Bias-adjusted WC
  Mean (% returns)    0.79     0.29     0.15     0.18     0.16     -0.63
  Std                 2.11     2.16     1.87     1.93     2.29     1.61
  T-stat              3.53     1.24     0.77     0.87     0.66     -3.68
WLS WC
  Mean (% returns)    0.28     0.04     -0.01    -0.02    -0.04    -0.32
  Std                 0.61     0.54     0.49     0.55     0.88     0.74
  T-stat              4.43     0.75     -0.28    -0.27    -0.46    -4.22
Bias-adjusted WLS WC
  Mean (% returns)    0.18     0.04     -0.04    0.01     0.01     -0.17
  Std                 0.55     0.62     0.59     0.71     0.96     0.83
  T-stat              3.13     0.65     -0.66    0.098    0.14     -1.91

Table 1.16: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative book-to-market value of the stocks (quarterly)

This table presents the value, standard errors and the T-statistics for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative book-to-market value of the stocks chosen by the funds.
We define the relative book-to-market value of stocks as follows: First, for the stocks in each DGTW benchmark portfolio, we divide them into five groups by book-to-market value. Therefore, the first group contains the stocks with the lowest value, and the fifth group contains the stocks with the highest value. Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively low group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively high group. For each relative book-to-market value group, we calculate the average book-to-market value of the stocks as the relative book-to-market value of that group. The funds are sorted by the weighted average of these relative book-to-market values.

                      Low      2        3        4        High     High-Low
DGTW
  Mean (% returns)    -0.58    -0.07    0.01     0.06     0.35     0.94
  Std                 2.09     1.37     0.98     0.94     1.47     3.03
  T-stat              -2.73    -0.53    0.15     0.6      2.36     3.02
Bias-adjusted DGTW
  Mean (% returns)    -0.06    0.01     0.11     0.06     0.24     0.3
  Std                 1.69     1.35     1.14     1        0.88     1.83
  T-stat              -0.35    0.08     0.94     0.56     2.63     1.59
WLS DGTW
  Mean (% returns)    0.11     0.19     0.18     0.18     0.43     0.32
  Std                 1.55     0.95     0.82     1.04     2.18     2.81
  T-stat              0.73     1.91     2.12     1.66     1.95     1.12
Bias-adjusted WLS DGTW
  Mean (% returns)    0.11     0.19     0.18     0.18     0.43     0.32
  Std                 1.55     0.95     0.82     1.04     2.18     2.81
  T-stat              0.73     1.91     2.12     1.66     1.95     1.12
WC
  Mean (% returns)    -0.08    0.1      0.2      0.12     0.27     0.34
  Std                 2.72     2.11     1.78     1.43     0.92     2.4
  T-stat              -0.28    0.44     1.09     0.78     2.76     1.37
Bias-adjusted WC
  Mean (% returns)    0.05     0.25     0.37     0.19     0.31     0.26
  Std                 3.72     3.12     2.69     1.96     1.41     3.25
  T-stat              0.13     0.77     1.31     0.93     2.13     0.77
WLS WC
  Mean (% returns)    -0.04    -0.01    0.01     0        0.09     0.13
  Std                 1.14     0.77     0.67     0.48     0.47     1.02
  T-stat              -0.37    -0.1     0.11     0        1.84     1.27
Bias-adjusted WLS WC
  Mean (% returns)    -0.01    0.05     0.05     0        0.06     0.08
  Std                 1.34     1.01     0.9      0.69     0.62     1.12
  T-stat              -0.1     0.51     0.48     -0.05    1        0.67

Table 1.17: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative book-to-market value of the stocks
(monthly)

This table presents the value, standard errors and the T-statistics (normalized to the quarterly numbers) for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative book-to-market value of the stocks chosen by the funds. We define the relative book-to-market value of stocks as follows: First, for the stocks in each DGTW benchmark portfolio, we divide them into five groups by book-to-market value. Therefore, the first group contains the stocks with the lowest value, and the fifth group contains the stocks with the highest value. Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively low group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively high group. For each relative book-to-market value group, we calculate the average book-to-market value of the stocks as the relative book-to-market value of that group. The funds are sorted by the weighted average of these relative book-to-market values.
                            Low        2        3        4     High High-Low
DGTW
  mean (% returns)        -0.75    -0.16    -0.01     0.08     0.37     1.11
  Std                      2.04      1.4     1.02      0.9     1.48     3.03
  T-stat                  -3.58    -1.11    -0.11     0.89     2.44     3.61
Bias-adjusted DGTW
  mean (% returns)         0.02      0.1      0.1     0.08     0.21     0.19
  Std                       1.8     1.52     1.25     1.02     0.83      1.6
  T-stat                   0.13      0.6     0.75     0.73     2.49     1.14
WLS DGTW
  mean (% returns)        -0.04     0.12     0.14     0.17     0.44     0.48
  Std                      1.54     0.98     1.01      1.1     2.22     2.76
  T-stat                  -0.25     1.24      1.4     1.53     1.93     1.69
Bias-adjusted WLS DGTW
  mean (% returns)        -0.12    -0.05    -0.08    -0.03        0     0.13
  Std                      1.01      0.8     0.68     0.59     0.63     0.95
  T-stat                  -1.15    -0.56    -1.15    -0.54     0.07     1.27
WC
  mean (% returns)        -0.08     0.11     0.16     0.11      0.2     0.28
  Std                      2.29     2.06     1.73     1.35     0.82     1.89
  T-stat                  -0.33     0.51     0.89     0.77     2.38     1.44
Bias-adjusted WC
  mean (% returns)         0.08     0.27     0.29     0.16     0.27     0.19
  Std                      3.02     2.71     2.34     1.71     1.11      2.5
  T-stat                   0.25     0.95     1.17     0.92     2.28     0.72
WLS WC
  mean (% returns)        -0.08    -0.03    -0.04    -0.02     0.02      0.1
  Std                      1.03     0.77     0.68      0.5     0.44     0.85
  T-stat                  -0.74    -0.36    -0.53    -0.45      0.5     1.17
Bias-adjusted WLS WC
  mean (% returns)        -0.06     0.02    -0.02    -0.03     0.02     0.08
  Std                      1.01     0.95     0.82     0.62     0.44     0.89
  T-stat                  -0.56     0.19    -0.19    -0.42     0.41     0.83

Table 1.18: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative past returns of the stocks (quarterly)

This table presents the values, standard errors and T-statistics for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative past returns of the stocks chosen by the funds. We define the relative past returns of stocks as follows. First, we divide the stocks in each DGTW benchmark portfolio into five groups by the average past returns from one year ago to two months before, so that the first group contains the stocks with the highest past returns and the fifth group contains the stocks with the lowest past returns.
Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively high past returns group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively low past returns group. For each relative past returns group, we calculate the average past return of the stocks as the relative past returns of that group. The funds are sorted by the weighted average of these relative past returns.

                           High        2        3        4      Low Low-High
DGTW
  mean (% returns)         0.24     0.13     0.05    -0.05    -0.09    -0.34
  Std                      1.53     0.74     0.65     0.91     1.94     2.85
  T-stat                   1.56     1.67     0.68    -0.55    -0.47    -1.15
Bias-adjusted DGTW
  mean (% returns)         0.11     0.04     0.02     0.12     0.34     0.24
  Std                      1.09     0.89     0.95     1.08     1.95     2.12
  T-stat                   0.93     0.39     0.16     1.11     1.69     1.08
WLS DGTW
  mean (% returns)         0.68     0.31     0.24     0.18     0.18     -0.5
  Std                      1.83     0.87     0.79     0.76     1.29      2.1
  T-stat                   3.62     3.46     2.99     2.27     1.34    -2.34
Bias-adjusted WLS DGTW
  mean (% returns)         0.03    -0.02    -0.05    -0.06     0.03        0
  Std                      0.71      0.5     0.49     0.58     0.99     1.17
  T-stat                   0.33    -0.35    -0.92    -0.93     0.24        0
WC
  mean (% returns)         0.09     0.12     0.14     0.25     0.47     0.38
  Std                      1.06     1.18     1.38     1.77     3.07     2.68
  T-stat                   0.82     0.96     0.98     1.35     1.47     1.36
Bias-adjusted WC
  mean (% returns)         0.21     0.18      0.2     0.36     0.61      0.4
  Std                      1.51     1.65     2.08     2.66      4.4     3.99
  T-stat                   1.31     1.03     0.92     1.31     1.33     0.97
WLS WC
  mean (% returns)         0.08     0.03     0.04     0.04     0.16     0.07
  Std                      0.54     0.46      0.5     0.69     1.13      1.1
  T-stat                   1.47     0.69     0.76     0.55     1.32     0.63
Bias-adjusted WLS WC
  mean (% returns)         0.08     0.03     0.05     0.08     0.19     0.11
  Std                      0.72     0.62     0.72     0.96     1.29     1.22
  T-stat                   1.04      0.5     0.69     0.79      1.4     0.86

Table 1.19: The value, asymptotic standard deviation and T-statistics of different measures grouped by the weighted average of relative past returns of the stocks (monthly)

This table presents the values, standard errors and T-statistics (normalized to quarterly numbers) for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the weighted average of the relative past returns of the stocks chosen by the
funds. We define the relative past returns of stocks as follows. First, we divide the stocks in each DGTW benchmark portfolio into five groups by the average past returns from one year ago to two months before, so that the first group contains the stocks with the highest past returns and the fifth group contains the stocks with the lowest past returns. Next, we combine the stocks in the first groups from all 125 benchmark portfolios as the relatively high past returns group, and the stocks in the fifth groups from all 125 benchmark portfolios as the relatively low past returns group. For each relative past returns group, we calculate the average past return of the stocks as the relative past returns of that group. The funds are sorted by the weighted average of these relative past returns.

                           High        2        3        4      Low Low-High
DGTW
  mean (% returns)         0.32     0.02    -0.01    -0.09     -0.2    -0.52
  Std                      1.56     0.79     0.75     0.95     2.03     2.94
  T-stat                   2.01     0.24     -0.1    -0.97    -0.94    -1.72
Bias-adjusted DGTW
  mean (% returns)         0.15     0.05     0.12     0.15     0.36     0.21
  Std                      1.05     0.94     1.05     1.36      2.1     2.02
  T-stat                   1.38     0.48     1.08     1.06     1.64     0.99
WLS DGTW
  mean (% returns)         0.68      0.2      0.2     0.11     0.09    -0.59
  Std                      1.72     0.98     0.96     1.01     1.33     2.28
  T-stat                   3.88     2.03     2.04     1.03     0.69    -2.52
Bias-adjusted WLS DGTW
  mean (% returns)         0.01    -0.03    -0.02    -0.05     0.03     0.02
  Std                      0.77     0.54     0.56     0.73     1.23     1.33
  T-stat                   0.13    -0.58    -0.27    -0.59     0.24     0.15
WC
  mean (% returns)         0.15     0.08     0.14     0.26     0.45     0.31
  Std                      1.25     1.24     1.32     1.79     2.81     2.56
  T-stat                   1.11     0.61     1.04     1.39     1.55     1.16
Bias-adjusted WC
  mean (% returns)         0.21     0.13     0.25     0.37     0.62      0.4
  Std                      1.43     1.47     1.72     2.42     3.66     3.34
  T-stat                   1.43     0.88     1.38     1.45     1.61     1.15
WLS WC
  mean (% returns)         0.05        0        0     0.03     0.09     0.04
  Std                      0.66      0.5      0.5     0.74     1.24     1.25
  T-stat                   0.71    -0.04     0.03     0.44     0.67     0.29
Bias-adjusted WLS WC
  mean (% returns)         0.01        0     0.02     0.07     0.12     0.11
  Std                      0.67     0.55     0.62     0.89     1.12     1.17
  T-stat                   0.18    -0.04     0.36     0.75     1.05     0.90

Table 1.20: The value, asymptotic standard deviation and T-statistics of different measures grouped by past
performance (monthly, 5 years)

This table presents the values, standard errors and T-statistics (normalized to quarterly numbers) for the existing measures, the bias-adjusted measures, the WLS measures, the reported returns and the Carhart four-factor α, when we group the funds according to different past performance measures. There are five groups: the high group contains the funds with the highest value of the past performance and the low group contains the funds with the lowest value of the past performance. We also estimate the difference in value and T-statistics between the high and low groups.

                            Low        2        3        4     High High-Low
Sort by DGTW
DGTW
  mean (% returns)         0.07     0.05     0.02     0.05     0.09     0.02
  Std                      0.76     0.51     0.38     0.51     1.01     1.26
  T-stat                   1.36     1.29     0.94     1.47     1.29     0.22
Bias-adjusted DGTW
  mean (% returns)         0.06     0.07     0.02     0.06     0.10     0.04
  Std                      0.63     0.64     0.65     0.78     0.96     0.56
  T-stat                   1.29     1.54     0.53     1.13     1.50     1.11
WLS DGTW
  mean (% returns)         0.15     0.10     0.08     0.11     0.17     0.02
  Std                      0.84     0.60     0.50     0.52     0.62     0.95
  T-stat                   2.56     2.35     2.23     2.99     3.91     0.28
Bias-adjusted WLS DGTW
  mean (% returns)         0.02     0.02        0     0.01     0.06     0.04
  Std                      0.46     0.37     0.35     0.44     0.59     0.39
  T-stat                   0.71     0.74    -0.06     0.34     1.49     1.43
Reported returns
  mean (% returns)         1.08     0.97     0.93     0.92     0.93    -0.15
  Std                      4.25     4.03     4.03     4.33     4.94     1.70
  T-stat                   3.62     3.45     3.29     3.02     2.69    -1.24
Reported α
  mean (% returns)        -0.07    -0.08    -0.06    -0.10    -0.08    -0.01
  Std                      0.81     0.55     0.44     0.58     0.90     0.81
  T-stat                  -1.30    -2.15    -2.12    -2.43    -1.33    -0.19

Sort by Bias-adjusted DGTW
DGTW
  mean (% returns)         0.05     0.06     0.04     0.03     0.04    -0.01
  Std                      0.53     0.40     0.40     0.47     0.86     0.59
  T-stat                   1.46     2.08     1.35     1.01     0.69    -0.31
Bias-adjusted DGTW
  mean (% returns)         0.03     0.06     0.03     0.05     0.11     0.08
  Std                      0.70     0.69     0.69     0.76     1.11     0.59
  T-stat                   0.53     1.20     0.70     0.90     1.41     2.03
WLS DGTW
  mean (% returns)         0.11     0.10     0.08     0.09     0.13     0.02
  Std                      0.56     0.53     0.52     0.54     0.61     0.42
  T-stat                   2.78     2.62     2.29     2.31     3.11     0.82
Bias-adjusted WLS DGTW
  mean (% returns)        -0.02     0.02     0.02     0.01     0.04     0.06
  Std                      0.39     0.36     0.38     0.39     0.58     0.36
  T-stat                  -0.87     0.86     0.56     0.39     0.95     2.51
Reported returns
  mean (% returns)         0.99     0.99     0.95     0.94     0.97    -0.03
  Std                      4.21     3.98     4.03     4.07     4.79     1.15
  T-stat                   3.36     3.55     3.39     3.28     2.88    -0.32
Reported α
  mean (% returns)        -0.11    -0.08    -0.11    -0.10    -0.11        0
  Std                      0.69     0.57     0.51     0.56     0.83     0.50
  T-stat                  -2.32    -1.99    -3.07    -2.55    -1.92    -0.02

Sort by WLS DGTW
DGTW
  mean (% returns)         0.08     0.05     0.03     0.04     0.09     0.01
  Std                      0.69     0.48     0.38     0.53     1.03     1.22
  T-stat                   1.69     1.43     1.00     1.02     1.27     0.11
Bias-adjusted DGTW
  mean (% returns)         0.04     0.06     0.05     0.04     0.12     0.09
  Std                      0.63     0.62     0.67     0.78     0.96     0.53
  T-stat                   0.78     1.29     1.07     0.69     1.79     2.31
WLS DGTW
  mean (% returns)         0.15     0.09     0.09     0.09     0.18     0.04
  Std                      0.86     0.62     0.48     0.50     0.64     0.99
  T-stat                   2.49     2.05     2.63     2.65     4.11     0.51
Bias-adjusted WLS DGTW
  mean (% returns)         0.01     0.02     0.02        0     0.07     0.06
  Std                      0.47     0.35     0.36     0.43     0.61     0.38
  T-stat                   0.16     0.67     0.66    0.002     1.50     2.36
Reported returns
  mean (% returns)         1.07     0.95     0.91     0.92     0.96    -0.11
  Std                      4.12     3.91     4.07     4.38     5.07     1.62
  T-stat                   3.72     3.47     3.19     3.01     2.71    -0.97
Reported α
  mean (% returns)        -0.06    -0.09    -0.08    -0.10    -0.07    -0.01
  Std                      0.78     0.52     0.44     0.57     0.94     0.78
  T-stat                  -1.13    -2.34    -2.64    -2.58    -1.08    -0.17

Sort by Bias-adjusted WLS DGTW
DGTW
  mean (% returns)         0.05     0.05     0.04     0.03     0.06     0.01
  Std                      0.59     0.42     0.38     0.53     0.84     0.61
  T-stat                   1.16     1.83     1.39     0.93     1.03     0.28
Bias-adjusted DGTW
  mean (% returns)         0.04     0.05     0.04     0.05     0.12     0.07
  Std                      0.81     0.66     0.65     0.81     1.14     0.51
  T-stat                   0.71     1.13     0.88     0.77     1.42     2.02
WLS DGTW
  mean (% returns)         0.11     0.09     0.07     0.09     0.15     0.04
  Std                      0.62     0.55     0.51     0.53     0.56     0.42
  T-stat                   2.59     2.36     1.95     2.31     3.82     1.32
Bias-adjusted WLS DGTW
  mean (% returns)        -0.01     0.01        0        0     0.04     0.05
  Std                      0.44     0.35     0.34     0.42     0.58     0.33
  T-stat                  -0.26     0.23     0.06     0.01     0.99     1.95
Reported returns
  mean (% returns)         0.95     0.92     0.86     0.87     0.97     0.02
  Std                      4.30     3.92     3.83     4.15     4.79     1.05
  T-stat                   3.16     3.35     3.21     3.01     2.89     0.25
Reported α
  mean (% returns)        -0.10    -0.07    -0.08    -0.10    -0.08     0.02
  Std                      0.74     0.55     0.44     0.60     0.86     0.52
  T-stat                  -2.04    -1.94    -2.73    -2.45    -1.37     0.61

Table 1.21: The value, asymptotic
standard deviation and T-statistics of different measures grouped by return gap (quarterly, from 1984 to 2007)

This table presents the values, standard errors and T-statistics (normalized to quarterly numbers) for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the absolute value of the return gap (Kacperczyk, Sialm and Zheng (2008)). There are five groups: the high group contains the funds with the highest absolute value of the return gap and the low group contains the funds with the lowest absolute value of the return gap. We also estimate the difference in value and T-statistics between the high and low groups.

                            Low        2        3        4     High High-Low
DGTW
  mean (% returns)         0.05     0.13     0.09     0.07     0.10     0.05
  Std                      0.93     0.64     0.64     0.87     1.17     0.66
  T-stat                   0.48     1.96     1.37     0.81     0.80     0.74
Bias-adjusted DGTW
  mean (% returns)         0.14     0.06     0.08     0.05     0.16     0.02
  Std                      1.28     0.85     0.78     1.02     1.48     0.78
  T-stat                   1.01     0.65     0.98     0.50     0.99     0.22
WLS DGTW
  mean (% returns)         0.30     0.29     0.27     0.27     0.34     0.04
  Std                      0.68     0.65     0.67     0.66     0.58     0.41
  T-stat                   4.18     4.27     3.78     3.96     5.50     0.92
Bias-adjusted WLS DGTW
  mean (% returns)         0.05        0        0     0.01     0.04    -0.01
  Std                      0.54     0.37     0.33     0.42     0.62     0.44
  T-stat                   0.87    -0.04    -0.11     0.24     0.59    -0.23
WC
  mean (% returns)         0.29     0.20     0.16     0.23     0.36     0.07
  Std                      1.70     1.31     1.26     1.57     2.44     1.06
  T-stat                   1.58     1.45     1.20     1.35     1.38     0.65
Bias-adjusted WC
  mean (% returns)         0.38     0.29     0.28     0.31     0.58     0.19
  Std                      2.25     1.84     1.75     2.22     3.27     1.45
  T-stat                   1.58     1.49     1.51     1.28     1.64     1.24
WLS WC
  mean (% returns)         0.11     0.05     0.03     0.07     0.12     0.01
  Std                      0.58     0.36     0.33     0.43     0.68     0.37
  T-stat                   1.80     1.38     0.80     1.52     1.62     0.14
Bias-adjusted WLS WC
  mean (% returns)         0.11     0.07     0.05     0.09     0.16     0.05
  Std                      0.77     0.50     0.46     0.60     1.00     0.54
  T-stat                   1.32     1.24     1.02     1.42     1.46     0.83

Table 1.22: The value, asymptotic standard deviation and T-statistics of different measures grouped by return gap (monthly, from 1984 to 2007)

This table presents the values, standard errors and T-statistics (normalized to the
quarterly numbers) for the existing measures, the bias-adjusted measures, and the WLS measures when we group the funds according to the absolute value of the return gap (Kacperczyk, Sialm and Zheng (2008)). There are five groups: the high group contains the funds with the highest absolute value of the return gap and the low group contains the funds with the lowest absolute value of the return gap. We also estimate the difference in value and T-statistics between the high and low groups.

                            Low        2        3        4     High High-Low
DGTW
  mean (% returns)         0.04     0.09     0.13     0.12     0.22     0.19
  Std                      1.18     0.82     0.73     0.93     1.42     1.09
  T-stat                   0.29     1.08     1.68     1.19     1.51     1.66
Bias-adjusted DGTW
  mean (% returns)         0.16     0.10     0.15     0.13     0.25     0.08
  Std                      1.30     0.93     0.81     1.03     1.46     0.69
  T-stat                   1.18     1.06     1.70     1.15     1.59     1.16
WLS DGTW
  mean (% returns)         0.26     0.24     0.26     0.26     0.36     0.10
  Std                      0.80     0.78     0.78     0.76     0.78     0.56
  T-stat                   3.13     3.02     3.18     3.26     4.43     1.66
Bias-adjusted WLS DGTW
  mean (% returns)         0.01        0     0.02     0.02     0.07     0.06
  Std                      0.67     0.42     0.38     0.48     0.68     0.44
  T-stat                   0.12    -0.06     0.40     0.47     0.90     1.20
WC
  mean (% returns)         0.31     0.23     0.20     0.29     0.47     0.16
  Std                      1.85     1.32     1.17     1.50     2.27     0.83
  T-stat                   1.57     1.60     1.64     1.83     1.93     1.78
Bias-adjusted WC
  mean (% returns)         0.38     0.29     0.33     0.35     0.68     0.30
  Std                      2.17     1.66     1.50     1.90     2.88     1.21
  T-stat                   1.65     1.66     2.07     1.73     2.22     2.33
WLS WC
  mean (% returns)         0.09     0.04     0.03     0.07     0.13     0.03
  Std                      0.71     0.39     0.33     0.46     0.77     0.40
  T-stat                   1.26     0.97     0.77     1.41     1.58     0.82
Bias-adjusted WLS WC
  mean (% returns)         0.07     0.04     0.05     0.07     0.19     0.12
  Std                      0.79     0.48     0.43     0.56     0.99     0.52
  T-stat                   0.82     0.80     1.12     1.13     1.79     2.20

Table 1.23: Estimated value and standard errors of an adjusted informed strategy (quarterly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix. In these simulations, we assume that the DGTW benchmark is misspecified.
Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)        -0.04     0.07     0.01     0.17     0.26     0.44
  Std                      1.58     1.55     1.57     1.58     1.54      1.6
Weight-change measure, Bias (WC)
  mean (% returns)        -0.01     0.08     0.04      0.2     0.29     0.47
  Std                      1.59     1.57     1.59     1.59     1.55     1.61

Panel B: WLS measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)         0.01        0    -0.01     0.13     0.27     0.43
  Std                      0.79     0.81     0.78      0.8     0.78     0.78
Weight-change measure, Bias (WC)
  mean (% returns)         0.02     0.02        0     0.15     0.29     0.43
  Std                      0.89     0.94     0.87     0.91     0.87     0.89

Table 1.24: Rejection ratio of an adjusted informed strategy (quarterly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix. In these simulations, we assume that the DGTW benchmark is misspecified. Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4      2.9      1.7      4.1      3.1        5
Weight-change measure, Bias (DGTW)  2.4      2.8      2.3      4.8      4.1      5.5

Panel B: WLS measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4      2.8      2.3      3.6      4.5      6.6
Weight-change measure, Bias (DGTW)  2.4        3      1.7      3.1      3.6      5.9

Table 1.25: Estimated value and standard errors of an adjusted informed strategy (created monthly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix.
In these simulations, we assume that the DGTW benchmark is misspecified. Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)        -0.01     0.32     0.65     0.98     1.54     2.02
  Std                      1.22     1.22     1.21     1.22      1.2     1.23
Weight-change measure, Bias (WC)
  mean (% returns)         0.02     0.34     0.68     1.01     1.56     2.05
  Std                      1.24     1.24     1.23     1.24     1.22     1.25

Panel B: WLS measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)        -0.03     0.28      0.6     0.98     1.52     1.91
  Std                      0.74     0.75     0.74     0.74     0.74     0.74
Weight-change measure, Bias (WC)
  mean (% returns)         0.01     0.31     0.61     1.01     1.54     1.94
  Std                      0.82     0.83     0.83     0.83     0.83     0.82

Table 1.26: Rejection ratios of an adjusted informed strategy (created monthly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix. In these simulations, we assume that the DGTW benchmark is misspecified. Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.
Panel A: Inefficient measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4      3.3      6.5     10.3     22.4     30.3
Weight-change measure, Bias (DGTW)  2.4      3.1        7     10.2     21.8     29.6

Panel B: WLS measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4      5.9     10.5     21.1     41.3     58.9
Weight-change measure, Bias (DGTW)  2.4      4.5      8.2     18.7     32.5     49.5

Table 1.27: Estimated value and standard errors of an adjusted informed strategy (monthly)

This table presents the estimated values, standard errors and rejection ratios for various weight-based performance measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix. In these simulations, we assume that the DGTW benchmark is misspecified. Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures. Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)         0.06      0.9     1.85     2.62     3.68     4.59
  Std                      1.63      1.6     1.58     1.61     1.57     1.63
Weight-change measure, Bias (WC)
  mean (% returns)         0.09     0.94     1.89     2.66     3.73     4.63
  Std                      1.64     1.61     1.59     1.62     1.58     1.64

Panel B: WLS measures
Predictability                0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)
  mean (% returns)         0.02     0.86     1.81     2.63     3.65     4.53
  Std                      0.83     0.83     0.82     0.82     0.81     0.82
Weight-change measure, Bias (WC)
  mean (% returns)         0.06     0.89     1.82     2.64     3.68     4.56
  Std                      0.92     0.92     0.92     0.92      0.9     0.92

Table 1.28: Rejection ratio of an adjusted informed strategy (monthly)

This table presents the rejection ratios for various measures under the alternative where the manager is informed. The simulated informed strategy is described in the appendix. In these simulations, we assume that the DGTW benchmark is misspecified. Bias (WC) and Bias (DGTW) are the bias-adjusted weight-change and DGTW measures.
Bias-WLS (WC) and Bias-WLS (DGTW) are the bias-adjusted WLS weight-change and DGTW measures. The predictability parameter takes the values 0, 0.05, 0.1, 0.15, 0.2 and 0.25.

Panel A: Inefficient measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4      9.6     26.9     46.4     71.3     81.6
Weight-change measure, Bias (DGTW)  2.4      9.7     25.5     44.3     70.6     81.4

Panel B: WLS measures
Predictability                        0     0.05      0.1     0.15      0.2     0.25
DGTW measure, Bias (DGTW)           2.4     13.2     41.7     73.6     94.4     99.1
Weight-change measure, Bias (DGTW)  2.4     14.1     40.9     70.6     93.2     98.3

Chapter 2

Resolving the Errors-in-Variables Bias in Risk Premium Estimation

by Kuntara Pukthuanthong, Richard Roll, and Junbo Wang

Authors' Coordinates
Pukthuanthong: Department of Finance, University of Missouri, Columbia, MO 65211. Voice: (619) 807-6124. E-mail: pukthuanthongk@missouri.edu
Roll: Division of the Humanities and Social Science, California Institute of Technology. Voice: 626-395-3890. E-mail: rroll@caltech.edu
Wang: Marshall School of Business, University of Southern California. Voice: (781) 258-8806. E-mail: junbo@usc.edu

2.1 Introduction

The methods introduced by Black, Jensen and Scholes (1972) (BJS) and refined by Fama and Macbeth (1973) (FM) are widely employed to estimate risk premiums in linear factor models. This approach involves two-pass regressions: the first pass is a time series regression of returns on the factors for each asset, which produces estimates of factor loadings, widely called "betas" in the finance literature. The second pass regresses asset returns cross-sectionally on the estimated betas. As pointed out initially by Black, Jensen and Scholes (1972), risk premium estimates from the second pass cross-sectional regression contain an inherent errors-in-variables bias because of estimation errors in the betas from the first pass. Given the luxury of a large number $N$ of individual assets, one can form diversified portfolios organized by particular asset characteristics.
Black, Jensen and Scholes (1972), Blume and Friend (1973), and Fama and Macbeth (1973) show that this portfolio approach reduces estimation errors in the betas because portfolio betas are less affected by idiosyncratic risk, so the errors-in-variables bias is mitigated (and eliminated in the limit $N \to \infty$). Although the finite sample properties of the portfolio grouping procedure are weak even when $N$ is reasonably large ($N > 2000$), many papers still employ FM with portfolios to estimate risk premiums. But more troubling is the fact that portfolio diversification can mask effects that exist in individual assets. Taking a naïve example, many investors seem to believe that some assets are overpriced and others are underpriced, but any portfolio grouping by an attribute other than price itself could diversify away the mispricing, rendering it undetectable.

A more egregious defect from portfolio masking involves the cross-sectional relation between mean returns and factor exposures ("betas"). Take the single-factor CAPM as an illustration (though the same effect is at work for any linear factor model). The cross-sectional relation between expected returns and betas holds exactly if and only if the market index used for computing betas is on the mean/variance frontier of the individual asset universe. Deviations from the beta/return line, either positive or negative, imply that the index is not on the frontier. But if the individual assets are grouped into portfolios sorted by portfolio beta and the individual errors are not related to beta, the analogous line fitted to portfolio means and betas will display much smaller errors. This could lead to a mistaken inference that the index is on the efficient frontier.

This paper contributes to the literature by proposing a method to reduce the bias in risk premium estimates without relying on portfolio grouping.
We also provide simulations to gauge the magnitude of the bias in finite samples and to compare our new method with previous approaches. To put the problem into perspective, we show, using macro factors with a large number of assets ($N \approx 5000$) and sample periods ($T \approx 600$), that the risk premium bias can be 60 to 70% with BJS and up to 90% with FM. Similar biases are present when using individual assets rather than portfolios. The method introduced in this paper can virtually eliminate the bias in either case.

Our method builds on the FM rolling-$\beta$ method and the techniques introduced in Griliches and Hausman (1986) and Biorn (2000). In the first pass, $\beta$'s are estimated from two sets of non-overlapping observations. Then the $\beta$'s from one set are used as instruments for the $\beta$'s from the second set in the second pass estimation. The asymptotic distribution of the estimated risk premiums is derived using Shanken's (1992) method.

We use simulations to evaluate hypothesis tests involving estimated risk premiums, which typically boil down to a null hypothesis that the intercept in the second pass regression is zero. Simulation results suggest that, with the exception of our instrumental variables approach and the covariance from Theorem 3.2 below, all other methods reject a true hypothesis too often. This high rejection rate is due to a downward bias in estimated standard errors and an upward bias in the second pass intercept.

We also apply the new approach, as well as the classical approaches, to estimating the risk premiums for the macroeconomic factors. With our methods, consumption growth has a positive and significant effect on stock returns. None of the other methods leads to the same conclusion.

This paper contributes to a large literature about the errors-in-variables bias.
As the length of the sample period ($T$) grows indefinitely, Shanken (1992) shows that the errors-in-variables bias becomes negligible because the estimation errors in the betas become small. Shanken also derives an asymptotic adjustment for the standard errors. Jagannathan and Wang (1998) extend this asymptotic result to the case of conditionally heterogeneous errors in the time series regression. Kan, Robotti and Shanken (2012) and Shanken and Zhou (2007) extend the result to a misspecified model. Chen and Kan (2004) investigate the finite sample properties of the cross-sectional regression, and find that the bias can be material even if $T$ is reasonably large ($T = 600$) when using macroeconomic factors such as consumption growth.

The two papers most closely related to this paper are Kim (1995) and Gagliardini, Ossola and Scaillet (2011). Kim (1995) corrects the errors-in-variables bias using lagged $\beta$ as an instrument to derive a closed-form solution for the MLE estimator of the risk premiums under the assumption that the error terms are homogeneous. The solution by Kim (1995) is based on Theil's adjustment (Theil (1971); cf. Litzenberger and Ramaswamy (1979) and Shanken (1992)). Theil's adjustment can mitigate errors-in-variables bias when the cross-sectional residuals are weakly dependent and the number of assets is large. A limitation is its dependence on an estimate of the standard error of the regression residuals, which can introduce new biases. Since our method uses instrumental variables to estimate risk premiums directly via the second pass regression, it is not subject to the same difficulty.

Gagliardini, Ossola and Scaillet (2011) show that when $T$ and $N$ are close to each other and both converge to infinity, the errors-in-variables bias in the estimated risk premiums in the BJS method converges to zero. Following their method, this paper derives asymptotic distributions by assuming that both $T$ and $N$ go to infinity.
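However, with the macro-factor model, even when both $T$ and $N$ are large, the simulations suggest that the estimator from our IV method has a much smaller bias than the BJS method because the convergence rate for estimated risk premiums is slower in the BJS method than in the IV method.

2.2 The Instrumental Variable Approach

2.2.1. Assumptions and classical methods

Let $\mathbf{r}_t$ denote the $1 \times N$ row vector of excess returns on $N$ assets in period $t$ (see footnote 15), $\mathbf{f}_t$ denote the $1 \times K$ vector of factor realizations, $\boldsymbol{\beta}_t$ denote the $K \times N$ matrix of factor exposures, and $\boldsymbol{\varepsilon}_t$ denote the $1 \times N$ row vector of idiosyncratic return disturbances. For convenience and without loss of generality, it is customary to assume that the factors and disturbances have means of zero, $E(\mathbf{f}_t) = E(\boldsymbol{\varepsilon}_t) = \mathbf{0}$. Consequently, the system in period $t$ can be expressed as

$$\mathbf{r}_t = E(\mathbf{r}_t) + \mathbf{f}_t \boldsymbol{\beta}_t + \boldsymbol{\varepsilon}_t. \tag{2.1}$$

The no arbitrage condition of the Arbitrage Pricing Theory (APT) (Ross (1976)) stipulates

$$E(\mathbf{r}_t) = \boldsymbol{\gamma}_t \boldsymbol{\beta}_t, \tag{2.2}$$

where $\boldsymbol{\gamma}_t$ is a $1 \times K$ vector of risk premiums associated with the factors. Since 2.1 and 2.2 hold in every period, over a sample of $T$ periods, $t = 1, \ldots, T$, the vectors stacked alongside each other become $\mathbf{R} = [\mathbf{r}_1', \ldots, \mathbf{r}_T']'$, $\boldsymbol{\Omega} = [\boldsymbol{\varepsilon}_1', \ldots, \boldsymbol{\varepsilon}_T']'$, and $\mathbf{FB} = [(\mathbf{f}_1 \boldsymbol{\beta}_1)', \ldots, (\mathbf{f}_T \boldsymbol{\beta}_T)']'$, so that the overall sample can be expressed compactly as

$$\mathbf{R} = E(\mathbf{R}) + \mathbf{FB} + \boldsymbol{\Omega}, \tag{2.1a}$$

and, similarly, defining $\boldsymbol{\Gamma}\mathbf{B} = [(\boldsymbol{\gamma}_1 \boldsymbol{\beta}_1)', \ldots, (\boldsymbol{\gamma}_T \boldsymbol{\beta}_T)']'$,

$$E(\mathbf{R}) = \boldsymbol{\Gamma}\mathbf{B}. \tag{2.2a}$$

If the factor exposures and risk premiums are time invariant, the system is simplified; i.e., for $\boldsymbol{\beta}_1 = \cdots = \boldsymbol{\beta}_T = \boldsymbol{\beta}$ and $\boldsymbol{\gamma}_1 = \cdots = \boldsymbol{\gamma}_T = \boldsymbol{\gamma}$, combining 2.1a and 2.2a, we have

$$\mathbf{R} = [(\mathbf{f}_1 + \boldsymbol{\gamma})', \ldots, (\mathbf{f}_T + \boldsymbol{\gamma})']' \boldsymbol{\beta} + \boldsymbol{\Omega}. \tag{2.1b}$$

Footnote 15: Matrices and vectors are indicated by bold face italic.

Expression 2.1b represents a set of seemingly unrelated regressions that can be used in principle to test the APT's no arbitrage condition or estimate the risk premiums. If $\boldsymbol{\beta}$ is known, 2.1b becomes a cross-sectional regression for estimating the risk premiums because $E(\mathbf{f}_t) = \mathbf{0}$.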
If the factors are known (or assumed to be known), 2.1b becomes a time series regression for estimating $\boldsymbol{\beta}$. In each regression, the intercept must be zero if there is no arbitrage.

In a time series regression, the sample means of the factors are estimates of the risk premiums if the factors are traded portfolios or indexes with market returns; see Gibbons, Ross and Shanken (1989). However, if the factors are not traded assets (e.g., if they are macroeconomic variables), sample means of the factors are not necessarily mean returns. Also, if a joint test of the individual intercepts rejects the hypothesis that they are all zero, the accuracy of estimated risk premiums is called into question.

Another method to test Equation 2.1b is to use GMM or MLE (e.g., Gibbons (1982) and Stambaugh (1982)). However, the number of parameters, the number of moment conditions, and the nonlinear estimation make these methods hard to implement when $N$ is large (see the subsection on GMM below).

A third method, based on Black, Jensen and Scholes (1972) and Fama and Macbeth (1973), uses the two-pass regression approach mentioned earlier. The first pass relates returns of each asset to pre-specified factors in a time-series regression, thereby calculating estimates of $\boldsymbol{\beta}$. The second pass is a cross-sectional regression of returns on the $\boldsymbol{\beta}$ estimates from the first pass. This can be done repeatedly for a time series of cross-sections; then the time series means of the cross-sectional coefficients in the second pass are estimates of the risk premiums. The second pass cross-sectional regression is inherently subject to errors-in-variables bias because its explanatory variables are estimates from the first pass. These errors introduce bias in the estimated risk premiums, the coefficients in the second pass. In addition, the error-induced noise can affect the estimated sampling variance of the risk premiums.
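As a rough illustration of the two-pass procedure just described, the following Python sketch handles the single-factor case only. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ols` and `two_pass` are hypothetical, the first pass uses the full sample rather than rolling windows, and no standard errors are computed.

```python
def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_pass(returns, factor):
    """Two-pass estimate of (alpha, gamma) for one factor (illustrative only).

    returns[t][i] is the excess return of asset i in period t;
    factor[t] is the factor realization in period t.
    """
    n_assets = len(returns[0])
    # First pass: one time-series regression per asset gives its estimated beta.
    betas = [ols(factor, [row[i] for row in returns])[1] for i in range(n_assets)]
    # Second pass: one cross-sectional regression per period on the estimated betas.
    fits = [ols(betas, row) for row in returns]
    # The time-series means of the cross-sectional coefficients are the estimates.
    alpha = sum(a for a, _ in fits) / len(fits)
    gamma = sum(g for _, g in fits) / len(fits)
    return alpha, gamma, betas


# Noise-free toy data (an assumption for illustration): r = (f + gamma) * beta.
toy_factor = [-0.02, 0.01, 0.03, -0.01, -0.01]
toy_betas = [0.5, 1.0, 1.5, 2.0]
toy_returns = [[(f + 0.005) * b for b in toy_betas] for f in toy_factor]
alpha_hat, gamma_hat, beta_hat = two_pass(toy_returns, toy_factor)
```

With no idiosyncratic noise the first-pass betas are exact, so the second pass is free of errors-in-variables bias. Adding noise to the toy returns makes the estimated betas noisy, which is the situation the text is concerned with.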
Heretofore, applications of BJS and FM, including the original contributions, used portfolio groupings to mitigate these problems. We propose a procedure that can be implemented with individual assets; no portfolios are required.

We rely on two major assumptions:

Assumption 1: $\boldsymbol{\varepsilon}_t$ is independent and identically distributed and is independent of $\boldsymbol{\beta}_t$ and $\mathbf{f}_t$. The covariance matrices of $\boldsymbol{\varepsilon}_t$ and $\mathbf{f}_t$ are $\boldsymbol{\Sigma}$ and $\boldsymbol{\Sigma}_F$, respectively.

Assumption 2: $\mathbf{f}_t$ and $\boldsymbol{\beta}_t$ are stationary processes and are independent of each other.

Under Assumption 1, the estimated factor loadings are asymptotically consistent in the first pass regression though they are not "admissible" in the James/Stein sense for finite samples. Moreover, the i.i.d. assumption simplifies the asymptotic standard errors for the estimated risk premium. However, as we show below, the consistency of these estimators does not require the identical distribution of errors. Assumption 2 allows us to derive the unconditional asymptotic distribution of the estimated risk premiums.

Suppose the length of the FM rolling window is $\tau$ and $\mathbf{F}_t$ is the transpose of the submatrix consisting of columns $t-\tau+1$ to $t$ of the factor observations; i.e., $\mathbf{F}_t = [\mathbf{f}_{t-\tau+1}', \ldots, \mathbf{f}_t']'$. Similarly, let $\mathbf{R}_t$ and $\boldsymbol{\Omega}_t$ designate the corresponding columns of $\mathbf{R}$ and $\boldsymbol{\Omega}$. The first pass time series regression produces the estimates

$$\hat{\boldsymbol{\beta}}_t = (\mathbf{F}_t' \mathbf{F}_t)^{-1} \mathbf{F}_t' \mathbf{R}_t.$$

With a total sample size of $T$, there are $T - \tau$ sequential overlapping rolling windows of size $\tau$ (see footnote 16). The estimation error in the first pass is

$$\hat{\boldsymbol{\beta}}_t - \boldsymbol{\beta}_t = (\mathbf{F}_t' \mathbf{F}_t)^{-1} \mathbf{F}_t' \boldsymbol{\Omega}_t,$$

which depends only on the error terms from time $t-\tau+1$ to $t$ (under Assumption 1). The dependent variable in the second pass cross-sectional regression could be any return vector $\mathbf{r}_s$ for a disjoint period $s \notin [t-\tau+1, t]$, though it is often simply $s = t+1$. This regression can be written

$$\mathbf{r}_s = \hat{\alpha} \mathbf{1}_N + \hat{\boldsymbol{\gamma}} \hat{\boldsymbol{\beta}}_t + \boldsymbol{\xi}_s,$$

where $\hat{\alpha}$ is a common intercept and $\mathbf{1}_N$ is a unit vector of length $N$.
Since the true model is $r_s = \beta(f_s + \gamma) + \varepsilon_s$, the cross-sectional residuals are $\xi_s = (\beta - \hat\beta_t)(f_s + \gamma) + \varepsilon_s$ and hence are correlated with the explanatory variables, the $\hat\beta$'s. This induces a complicated bias in the expected values of $\hat\gamma$ and $\hat\alpha$; under the APT no arbitrage condition, the true value of $\alpha$ is 0. The bias is less severe when the estimation of β is more precise. Assuming that the true value of β is time invariant, one way to improve precision is to make $\tau$ large when T is large. Asymptotically, there is ever smaller measurement error in the first stage estimates and hence little bias in the second stage.

[16] Previous work typically uses sequential and equal-length rolling windows, but there is no mathematical necessity for such a procedure and we shall suggest a different approach below.

2.2.2 The instrumental variable method

When the number of assets is large, i.e., when $N \to \infty$ with a fixed rolling window (the time series sample period could be large), the errors-in-variables bias could still be substantial. Instrumental variables (IV) provide a method for correcting the bias induced by the first stage estimation error, but as always with the IV approach, the choice of instruments is crucial. In this case, however, there are some natural candidates; viz., $\hat\beta$'s estimated from sample observations that are non-contiguous with the sample ending at t. These could be lagged observations. For example, if the original $\hat\beta$'s are estimated with observations ending at observation $2\tau$, the instruments could be $\hat\beta$'s estimated with the observations from 1 to $\tau$. Specifically, if we define $\widehat{\mathbf{1}\beta}_t = [\mathbf{1}, \hat\beta_t]$, the OLS estimator is:

$$\hat\gamma_{t+\tau}' = \big(\widehat{\mathbf{1}\beta}_t'\,\widehat{\mathbf{1}\beta}_{t+\tau}\big)^{-1}\big(\widehat{\mathbf{1}\beta}_t'\, r_{t+\tau}\big).$$

Similarly, the GLS estimator is:

$$\hat\gamma_{t+\tau}' = \big(\widehat{\mathbf{1}\beta}_t'\,\Sigma^{-1}\,\widehat{\mathbf{1}\beta}_{t+\tau}\big)^{-1}\big(\widehat{\mathbf{1}\beta}_t'\,\Sigma^{-1}\, r_{t+\tau}\big).$$

Of course, any other non-overlapping observations could be employed, including subsequent ones.
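A single IV cross-sectional regression of the kind just described can be sketched as follows. This is an illustrative OLS-type version only; the function name `iv_cross_section` and its argument names are hypothetical, and the two sets of loadings are assumed to have been estimated beforehand from non-overlapping samples.

```python
import numpy as np

def iv_cross_section(beta_instr, beta_reg, r):
    """IV second pass sketch: (Z'X)^{-1} Z'r with Z = [1, beta_instr], X = [1, beta_reg].

    beta_instr: loadings from the instrument sample (length N, or N x K)
    beta_reg:   loadings from the regressor sample (same shape)
    r:          length-N vector of returns for one cross-section
    """
    N = len(r)
    Z = np.column_stack([np.ones(N), beta_instr])   # instruments
    X = np.column_stack([np.ones(N), beta_reg])     # error-ridden regressors
    return np.linalg.solve(Z.T @ X, Z.T @ r)        # [intercept, risk premiums]
```

Passing the same array for both loadings reduces the function to the ordinary second pass OLS, which makes it easy to compare the two estimators on simulated data.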
The IV method requires instruments that are highly correlated with the true explanatory variables and uncorrelated with the residuals. Since most asset returns are weakly correlated over time, residuals from factor regressions are virtually uncorrelated over time as well. This satisfies the second condition for strong instruments. The first condition is trivially satisfied if the true betas are time invariant, provided that the estimation samples are sufficiently long. In such a circumstance, factor exposure estimates from non-contiguous samples will be strongly related. On the other hand, if there is some time variation in the true β's, samples from non-contiguous observations widely separated in time bring the risk of weakened instruments. We next suggest a way to counteract this possibility.

2.2.3 An improved IV method for risk premium estimation

The IV method is not limited to lagged instruments. Any factor loadings estimated from non-overlapping observations can be used as the instruments. But the instruments could be weak if the factor exposures are changing over time and the non-overlapping samples are relatively far apart. This suggests a procedure that uses non-contiguous samples constructed to be as nearly coincident as possible in calendar time. Here is one proposed scheme: For each asset, divide the total sample into three subsamples. The first subsample contains returns and factors for observations 1, 4, 7, …, T-2. The second subsample contains returns and factors for observations 2, 5, …, T-1, and the third subsample contains returns and factors for observations 3, 6, …, T (taking T to be a multiple of three). Given that factor model residuals are uncorrelated across time, any of the three subsamples can be used to estimate factor loadings ($\hat\beta$'s), while either of the other two subsamples can be used to estimate instruments for the loadings.
Then, the second stage FM cross-sectional regression can be estimated for each observation in the third subsample without having the returns related in any way to the errors in the $\hat\beta$'s or in their instruments. The sample means of the cross-sectional coefficients then become unbiased estimates of the risk premiums. Since any permutation of the three subsamples is equally suitable, all three could be used, and there seems to be nothing wrong with averaging the cross-sectional coefficients over all three permutations. To be more specific, if we define $\widehat{\mathbf{1}\beta}_{sample} = [\mathbf{1}, \hat\beta_{sample}]$, where sample can be the first, second or third subsample, the OLS estimator is:

$$\hat\gamma' = \big(\widehat{\mathbf{1}\beta}_{first}'\,\widehat{\mathbf{1}\beta}_{second}\big)^{-1}\big(\widehat{\mathbf{1}\beta}_{first}'\,\bar r_{third}\big).$$

Similarly, the GLS estimator is:

$$\hat\gamma' = \big(\widehat{\mathbf{1}\beta}_{first}'\,\Sigma^{-1}\,\widehat{\mathbf{1}\beta}_{second}\big)^{-1}\big(\widehat{\mathbf{1}\beta}_{first}'\,\Sigma^{-1}\,\bar r_{third}\big).$$

Here $\bar r_{third}$ represents the sample average of the stock returns over the third subsample. In fact, since the returns from both the second and third subsamples are uncorrelated with $\widehat{\mathbf{1}\beta}_{first}$, we can replace $\bar r_{third}$ with $\bar r_{second,third}$ (obtained by taking the average over both of these subsamples) in the estimator. This estimator is consistent when the number of stocks N is large. This will be the 3-group estimator in our simulation and empirical sections.

The three-group method has an advantage. Since slow variation in the true factor loadings would evolve over the entire sample, each subsample would include roughly the same variation, thereby strengthening the instruments relative to using, say, lagged or leading non-contiguous observations.

Finally, we note that it might not be optimal to divide up the subsamples equally. Depending on the volatility of the factors and the factor model residuals, one could imagine improvements based on unequal divisions; e.g., estimating the loadings and instruments with half of the available observations and conducting the cross-sectional regressions with the other half.
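The equal three-way split and the OLS form of the three-group estimator can be illustrated with the sketch below. It covers one permutation of the subsamples only (loadings from the second, instruments from the first, average returns from the third); the function name `three_group_iv`, the array layout, and the intercept in the first pass regressions are assumptions made for this example rather than details taken from the text.

```python
import numpy as np

def three_group_iv(R, F):
    """3-group IV sketch. R: T x N returns, F: T x K factors (assumed layout)."""
    T, N = R.shape
    # Interleaved subsamples: observations 1,4,7,... / 2,5,8,... / 3,6,9,... (0-based here).
    first, second, third = (np.arange(g, T, 3) for g in range(3))

    def loadings(idx):
        # Time series OLS on one subsample (intercept included as an assumption).
        X = np.column_stack([np.ones(len(idx)), F[idx]])
        coef, *_ = np.linalg.lstsq(X, R[idx], rcond=None)
        return coef[1:].T                                # N x K

    Z = np.column_stack([np.ones(N), loadings(first)])   # instruments
    X = np.column_stack([np.ones(N), loadings(second)])  # regressors
    r_bar = R[third].mean(axis=0)                        # average return, third subsample
    return np.linalg.solve(Z.T @ X, Z.T @ r_bar)         # [intercept, risk premiums]
```

Averaging the output over the other permutations of the three subsamples, as suggested in the text, would only require calling the same steps with the roles of the index sets rotated.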
We reserve this refinement for later study though and stick here to a tri-partite procedure. 2.2.4 Theil’s Adjustment To compare, we now discuss two other methods to correct the bias. The first method is Theil’s adjustment, which essentially modifies the BJS method. In BJS method, when the second pass is OLS (the method is similar when the second pass is GLS), the estimated risk premium is 103 ) ' 1 ˆ ( ) 1 ˆ ' 1 ˆ ( 1 r β β β γ , where t r r is the average of the returns and ] ˆ , [ 1 ˆ N 1 β 1 β . We can show that, A N N T β β β β 1 ' 1 1 1 ˆ ' 1 ˆ 1 with 1 2 iT 2 i1 1 1 k k 1 ) )( δ 0 δ ' ( ) ' ( 0 F F' F F F F 0 0 T A , where 2 it δ is the summation of the variances of the regression residuals ] ε , , ε [ tN t1 t ε . The term A T represents the covariance of the error term in the estimated factor loadings. Since A T is positive semi-definite, 1 ' 1 N 1 1 ˆ ' 1 ˆ N 1 β β β β . This leads to a negative bias for the estimated risk premium. In order to correct this bias, Theil (1971) and Litzenberger and Ramaswamy (1979) suggest using the following method: ) ' 1 ˆ ( ) ˆ N 1 ˆ ' 1 ˆ ( 1 r β T β β γ A , with 1 2 iT 2 i1 1 1 k k 1 ) )( δ ˆ 0 δ ˆ ' ( ) ' ( 0 ˆ F F' F F F F 0 0 T A , where 2 it δ ˆ is the summation of the estimated variances of the regression residuals. Shanken (1992) shows that this estimator is consistent when N is large under the assumption that the summation of the estimated variances of the regression residuals converges to its true value with large number of stocks. In Section 5, we also find that the finite sample bias is small with 104 Theil’s adjustment when this assumption is valid. However, the above assumption may not always be true. For example, suppose that the summation of the variances of the regression residuals 2 it δ is time-varying, but we estimate it by summing of the average variance of all regression residuals, i.e. 
/T δ ˆ δ ˆ N T, 1 1.i t 2 it 2 it for all time, then, ) δ δ ˆ ( 2 it 2 it does not converge to zero since the true value 2 it δ is time varying and the estimator 2 it δ ˆ is constant. In this scenario, if ) δ δ ˆ ( 2 it 2 it is correlated with the factors, then 0 F F' F F F F 0 0 T T 1 2 iT 2 iT 2 i1 2 i1 1 1 k k 1 ) )( ) δ δ ˆ ( 0 ) δ δ ˆ ( ' ( ) ' ( 0 E ) ˆ ( E A A . Therefore, the Theil’s estimator can create a new bias when estimated variances of the residuals do not converge to the true variances. Such issue will not affect the instrumental variable method since the only assumption for this method is that the residuals are not auto correlated. There is another issue associated with the Thiel's adjustment in estimating the standard errors of the regression residues 2 it δ , when the factor loading β is time varying. Specifically, if T 2 1 , , β β β are not identical, the estimated variane is T 1 t 2 1 t 1 t t t t T 1 t 2 1 t 2 t ) ' ' ) ( ' ' ( T 1 ) ' ' ( T 1 ε F F F f FB F F F f ε β f R F F F f R δ t . 105 Here T T 2 2 1 1 β f β f β f FB . The above equation can be further decomposed into T 1 t 1 t t t 1 t t T 1 t 2 1 t t t T 1 t 2 1 t t )) ( ' ' )( ' ' ( T 1 )) ( ' ' ( T 1 ) ' ' ( T 1 FB F F F f f β ε F F F f ε FB F F F f β f ε F F F f ε . The first part of the decomposition is a consistent estimator of the variance of the regression residues. Moreover, when the regression residues are uncorrelated with the factors and the loadings, the expected value of the third part is zero. However, the second part is a function of t β , and its expected value is not zero when the factor loadings are time-varying. Thus, the estimated errors contain a bias, and the bias can affect the estimated risk premiums. The time-varying factor loadings do not create such an issue for the instrumental variable approach since there is no need to estimate the standard errors of the regression residues. 
2.2.5 GMM

Following equation (12.23) of Cochrane (2005), one can use the following moment conditions in GMM estimation:

$$E\big(F'(r - \alpha - BF)\big) = 0, \qquad E(r - B\Gamma) = 0.$$

For N assets, there are N(K+1)+N moment conditions and NK+N+K parameters. Hence, the GMM is overidentified if N > K.

However, there are two limitations in implementing GMM. First, the method cannot be easily implemented when N is even moderately large. Suppose N = 149 and K = 3 (a 3-factor model); then there are 745 moment conditions and 599 parameters to estimate, and it becomes problematic to find the global minimum of the objective function. If N = 5000 (as for individual stocks), then there are N(K+1)+N = 25000 moment conditions. Hence, T must be more than 25000. Usually, we do not have data with such a large T; hence, the GMM is not implementable.

Another problem is the efficiency of the estimator. In an iterated GMM process, it is difficult to construct an efficient and invertible weighting matrix for the moment conditions from the parameters estimated in the previous iteration. Even when N = 25 (the weighting matrix is 125 × 125), we can only use the identity matrix as a weighting matrix. However, using the identity matrix does not yield efficient parameter estimates.

To deal with these two limitations, Shanken and Zhou (2007) make an adjustment to this method. First, they estimate β with the time series regression. Second, they use the estimated β to form the moment conditions. In this case, there are only K parameters remaining in the second pass of their two-pass GMM. This adjusted method is essentially similar to the BJS method with GLS estimation in the second pass.

2.3 Asymptotic Distributions

2.3.1 The asymptotic distribution of the IV method

In this section, we show the consistency of the estimator and obtain its asymptotic distribution.
Theorem 3.1 (a) The estimated risk premiums ) , ( ' ˆ t t f 0 γ (rolling-β IV estimator) and 107 ) , ( ' ˆ third f 0 γ (3-group β IV estimator) are consistent. (b) Assume that when N ∞, N / 1 1 1 β Σ β converges to an invertible matrix (denote this matrix by ' 1 b b Σ ). In addition, assume that ] ξ β , , ξ β [ tN 1 N t1 1 1 (where 1 1 N 1 1 1 = ] β , , β [ Σ β ) satisfies a Lindeberg condition, then the asympotic distribution for the estimated risk premiums using the lagged IV (rolling-β IV) is: , ) (0, ) ) , ( ' ˆ ( N 1 1 t t BA A f γ 0 γ N where ' = 1 b b Σ A , , )) ~ ' ( = τ t 0, 1 0 L b b Σ B c where 1 τ t τ t τ t τ τ t 1 τ t τ 1 k k 1 τ t 0, ) ' ( ' ) ' ( 0 = ~ F F F L F F F 0 0 L t and ) ' ) ' )( ( τ 1 τ 2 ) ( ) ( ' ) ' )( ( 1 = 0 t 1 t t t t 1 t t t τ t 1 t t t 0 l F F F f γ f γ F ' F F L F F F f γ c with ) τ 1 ,.. τ 1 , τ 1 1 ( = 0 l a τ -vector and τ 1 τ τ 1 τ 1 τ 1 τ 1 τ τ 1 τ 1 τ 1 τ 1 τ = τ L . The asymptotic distribution for the estimated risk premiums using the 3-group IV method is: ) (0, ) ) , ( ' ˆ ( N 1 1 third BA A f γ 0 γ N . 108 Here ' = 1 b b Σ A , , )) ~ ' ( = 1 0 L b b Σ B c where 1 first first first τ first 1 first first 1 k k 1 ) ' ( ' ) ' ( 0 = ~ 1 F F F L F F F 0 0 L and ) ' ) ' )( ( τ τ 1 τ 2 ) ( ) ( ' ) ' )( ( τ 1 = 2 0 second 1 second second third 3 2 2 third 1 second second second τ second 1 second second third 3 0 2 l F F F f γ f γ F ' F F L F F F f γ c , with ) τ 1 ,.. τ 1 , τ 1 1 ( = 2 2 2 2 0 l a 2 τ -vector and i i i i i i i i i i i i τ τ 1 τ τ 1 τ 1 τ 1 τ 1 τ τ 1 τ 1 τ 1 τ 1 τ = i L , assuming that subsample i contains i τ periods. Moreover, third f is the average of the factors over the third subsample. The proof is in the appendix. When we use the average returns of stocks over both the second and third subsample in the three-group estimator, the asymptotic distribution has the same formula by replacing third f with d secondthir f . In cross-sectional regressions, the second pass can either be OLS or GLS. 
The Theorem presents the asymptotic distribution when the second pass is GLS. The OLS estimation is a special case (when $\Sigma = \delta^2 I$). Theorem 3.1 provides an estimation of the risk premiums conditional on the value of the factors. The Theorem says that $\hat\gamma_t$ is a consistent estimator of the expected risk premium plus the unexpected factor realization at time t. If the factors are unexpected shocks or demeaned factors ($E(f) = 0$), then $\hat\gamma_t' - (0, f_t)$ is a consistent estimator of $\gamma$.

One important assumption in this Theorem is that $\mathbf{1}\beta'\Sigma^{-1}\mathbf{1}\beta/N$, with $\mathbf{1}\beta = [\mathbf{1}, \beta]$, converges to an invertible matrix. Grouping can violate this assumption. If we group the individual stocks into well diversified portfolios according to characteristics of the firms, these portfolios may all have a market β close to 1. Hence, $\mathbf{1}\beta'\Sigma^{-1}\mathbf{1}\beta/N$ is not invertible, and one cannot use the rolling-β method with the instrumental variables. This situation is similar to the useless factor case in Kan and Zhang (1999), who show that the errors-in-variables bias can be amplified in finite samples.

There are several approaches to control this problem. One approach is to group the stocks according to market β as well; hence, $\mathbf{1}\beta'\Sigma^{-1}\mathbf{1}\beta/N$ is invertible. Another method is to drop the constant term in the second pass. More specifically, in the second pass, one can regress returns on the $\hat\beta$ on book-to-market and the $\hat\beta$ on size without the $\hat\beta$ on market. This method implicitly assumes that the multi-factor model is true and the intercept is 0. A third approach is to use individual stock returns to estimate the risk premiums. When N is reasonably large, we will show that using instrumental variables from non-overlapping observations is effective in correcting the bias in finite samples.

When the sample period T is large enough, the sample average of the estimated risk premiums using the lagged instrument, $\frac{1}{T-2\tau+1}\sum_{t=2\tau}^{T}\hat\gamma_t$, is a consistent estimator.
In the next Theorem, we provide the asymptotic distribution of the sample average of the estimated risk premiums. Notice that since the estimated time series } ˆ { t γ is autocorrelated up to τ , the asymptotic variance of the sample average of the the estimated risk premiums contains these autocorrelations. 110 Theorem 3.2 The sample average of the estimated risk premiums using rolling-β IV method is a consistent estimator of the risk premiums, i.e. when T is large, . ) (0, ' ˆ 1 2 τ T 1 t T 2 τ = t γ γ (a) If } { t f is a stationary process, the unconditional asymptotic distribution of the estimated risk premiums is: , ) (0, ) ) (0, ) ) (0, ˆ ( 1 2 τ T 1 ( N T t t T 2 τ = t V γ f γ N if ))) ~ ' ( ( ( = τ t t1, 1 t1 1 τ 1 τ = t1 f L b b Σ c E B in which t1 c ’s are constants and τ t t1, ~ L ’s are the 1) K ( 1) K ( matrices 17 exists, , = 1 1 f f f A B A V where ' = 1 b b Σ A f . (b) Excluding the term , ) (0, t f we have the same asymptotic distribution of the estimated risk premiums as Theorem 2 in Shanken (1992): ) ~ (0, ) ) (0, ' ˆ 1 2 τ T 1 ( T t T 2 τ = t F N Σ γ γ , where F F Σ 0 0 Σ 1 3 3 1 0 ~ . The proof of this Theorem is in the appendix. We will call V the NT -asymptotic covariance and F Σ ~ the T -asymptotic covariance since they represent covariances with different convergent rates. Shanken (1992) derives the asymptotic distribution of the BJS method with convergent rate of T . Gagliardini, Ossola and 17 The formulas of τ t t1, ~ L and t1 c will be defined in the proof. See appendix for the details. 111 Scaillet (2011) show that when both T and N large, the estimated risk premiums in the BJS method converge to the true value at the speed of ) NT 1 O( ) T 1 O( 18 . If O(T) > N , then the rate of convergence is ) T 1 O( . Theorem 3.2 shows that if one uses the IV method and subtracts the sample average ' 1 2 τ T 1 t T 2 τ = t f from ' ˆ 1 2 τ T 1 t T 2 τ = t γ , the rate of convergence for this estimator is ) NT 1 O( . 
This method does not depend on the relative size of T and N.

[18] Here, for any real number X, O(X) is defined as follows: there exist two positive numbers M and N, such that MX < O(X) < NX.

Notice that in the derivation of the asymptotic distributions when the second pass is GLS, the covariance matrix of the error terms, Σ, is assumed to be known. In reality, this matrix is unknown. Hence, one needs a feasible version of GLS. One classical approach is to assume that Σ is some function of the factors, so that estimating Σ reduces to estimating the coefficients of that function. There are two other methods of implementing GLS. The first was introduced by Shanken (1985): if T > N+K, one can estimate Σ by the sample average of the cross products of the sample error terms, which can be calculated from the second pass OLS regression. The second is the weighted GLS of Ferson and Harvey (1999). This paper adopts the Ferson and Harvey (1999) approach, since most of the cross-sectional correlations between the idiosyncratic risks are small.

2.3.2 A simple way to calculate standard errors

In Theorem 3.2, we derive the asymptotic covariance of the estimated risk premiums. If the factors are demeaned, or are shocks relative to their conditional expected values, then one can use $\frac{1}{T-2\tau+1}\sum_{t=2\tau}^{T}\big(\hat\gamma_t - (0, f_t)\big)$ as the estimator of $\gamma'$, and V is the covariance matrix with the $\sqrt{NT}$ convergence rate. Fama and MacBeth also provide a simpler way to calculate the asymptotic covariance, by taking the sample covariance of the estimated risk premiums. This method is also applicable with the IV method. Define $\hat\gamma^*_t = \hat\gamma_t - (0, f_t)$ and $\hat\gamma^* = \frac{1}{T-2\tau+1}\sum_{t=2\tau}^{T}\big(\hat\gamma_t - (0, f_t)\big)$ (the sample average of $\hat\gamma^*_t$). In this case, the Fama-MacBeth sample asymptotic covariance (with autocovariances up to the length of the rolling window) is:
) ˆ ˆ ( ) ˆ ˆ ( 1 t1 2 τ T 1 N * * t1 t * * t T t1 2 τ = t τ τ = t1 γ γ γ γ Another version of the Fama-Macbeth sample covariance does not contain the autocovariances i.e., one can use ) ˆ ˆ ( ) ˆ ˆ ( 1 2 τ T 1 N * * t * * t T 2 τ = t γ γ γ γ to estimate the covariance of the estimated risk premiums. In section 4, we will analysis different sample covariances as well as the asymptotic covariance from Theorem 3.2. The differences are small in the Fama-French three factor model, but can be large for the macro-factor model. 2.4 Data and Simulation Results This section uses Fama-French portfolios and macro factors to compare empirically different methods of estimating risk premiums. We compare four methods: (1) BJS estimation without rolling betas; (2) the rolling- method of Fama and Macbeth (1973); (3) the rolling- method using β ’s estimated with non-overlapping observations as the instrumental variable; (4) Theil’s 113 adjustment. Fama and French’s three factors 1964 to 2009 are available on Kenneth French’s data library. We use these factors along with 100 size and book-to-market portfolios plus 49 industry portfolios as well as the 25 size and book-to-market portoflios. In addition to these 25 and 149 portfolios, we obtain returns for individual stocks from CRSP, which has 4970 stocks with that have fewer than 20% missing observations from 2000 to 2009. The macro variables are chosen following Chen, Roll and Ross (1986). We use the following macros: unexpected consumption growth, unexpected inflation and the unexpected change in industrial production. The raw series are obtained from the Federal Reserve at St Louis. To measure consumption, we add the consumption of nondurables to services and divide by the US population. The the Consumer Price Index for All Urban Consumers is our measuredprice level. Rates of change in terms of the log differences of these level variables become the raw growth rates. 
The macro factors are shocks relative to conditional expected values estimated from a vector autoregression (VAR). Specifically, define $X_t \equiv [\Delta C_t : \Delta IP_t : \Delta CPI_t]'$, where $\Delta C_t$, $\Delta IP_t$ and $\Delta CPI_t$ are raw consumption growth, industrial production growth and the inflation rate, respectively. Then $X_t$ is modeled as a first-order vector autoregression, $X_t = A + BX_{t-1} + \zeta_t$, where $\zeta_t$ denotes the 3 × 1 vector of VAR innovations. The fitted value $\hat X_t = \hat A + \hat B X_{t-1}$ is our conditional expected value of $X_t$. The shock or unexpected value $X_t - \hat X_t$ is assumed to be the driving factor for assets.

2.4.1 Simulations

For the three Fama-French factors, we assume that the true risk premiums are the sample means of the excess return factors. Shanken (1992) shows that the sample mean is a consistent estimator of the risk premiums. Using data from 1964 to 2009, we regress returns on the factors to estimate the β's for each portfolio and calculate the regression residuals using the returns, factors and estimated β's. In the bootstrap simulation, the factors and the error terms are re-sampled from the pool of observed factors and the pool of error terms generated by the above regressions. The re-sampled factors and error terms, together with the estimated β's, are used to generate the simulated (re-sampled) portfolio returns.

However, as in Kan and Zhang (1999) or Kleibergen (2009), the regression in the second pass has a multicollinearity problem because the estimated β's on the market are close to 1 for all N simulated portfolios. To alleviate this problem, we use the following method: in the first pass, we regress returns on all three factors; in the second pass, since the estimated β on the market is close to 1 for all portfolios, we regress the returns only on the estimated β's on the market, the book-to-market and the size factors and omit the intercept. This essentially assumes that the Fama-French three-factor model is true and that its constant term is zero.
Of course, if the model is not true, this estimation procedure is likely to produce biased risk premiums for all three factors. This is an inherent problem with portfolio grouping if the market betas of the portfolios are all close to 1.0.

For the Fama-French three-factor model, we generate returns and factors 10,000 times. Each of the 10,000 samples has T of 60 or 600 and N of 25 or 149, which allows us to examine performance in various finite samples. We use the BJS method, the rolling-β method and the IV method to estimate the risk premiums. Hence, there are 10,000 estimated risk premiums for each of the methods. The reported risk premiums are the averages of these estimates. We also present the T-ratio of the difference between the reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from the 10,000 trials.

In addition to the portfolios, we use individual stocks (with N = 4970) to estimate the risk premiums in the Fama-French three-factor model. With individual stocks, the β on the market is not a redundant variable in the second pass; hence, we run a cross-sectional regression of the returns on a constant and the estimated β's of all three factors in the second pass.

We also simulate macro factors and concomitant errors and investigate them with the three methods. Since the macro factors are less volatile than the traded factors, the linear factor model has larger idiosyncratic risk; therefore, the errors-in-variables bias is likely to be much larger. We examine this case in the context of monthly returns for 4970 individual stocks from CRSP. In this examination, T is either 60 or 600 and N takes on three values: 25, 149 and 4970. Each combination of parameters is replicated 10,000 times. To fix the "true" risk premiums of the macro factors in the simulation, we average estimated risk premiums from the literature.
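One replication of the bootstrap design for the traded-factor case can be sketched as below. This is a hedged illustration of the resampling step only. The function name `simulate_returns` is hypothetical, and two details are assumptions not stated in the text: factor dates and residual dates are drawn independently with replacement, and each residual draw takes a whole cross-section (one row) at a time.

```python
import numpy as np

def simulate_returns(beta_hat, factors, resid, T_sim, rng):
    """One bootstrap sample.

    beta_hat: N x K estimated loadings, treated as the true loadings
    factors:  T x K pool of observed factors
    resid:    T x N pool of first pass regression residuals
    """
    f_idx = rng.integers(0, factors.shape[0], size=T_sim)  # resampled factor dates
    e_idx = rng.integers(0, resid.shape[0], size=T_sim)    # resampled residual dates (independent draw: assumption)
    F_sim = factors[f_idx]
    R_sim = F_sim @ beta_hat.T + resid[e_idx]              # simulated returns
    return R_sim, F_sim
```

Repeating this call 10,000 times and applying each estimator to every `(R_sim, F_sim)` pair would give the distribution of estimates from which averages and T-ratios like those described above can be computed.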
[19]

2.4.2 Simulation Results for the Fama-French Three-Factor (traded factors) Model

Table 1 presents the average estimated risk premiums of the three-factor model using 25 portfolios. With 25 portfolios, the estimated risk premiums have errors-in-variables bias under both the BJS and the Fama-MacBeth rolling beta methods. When the sample period is only 60, this bias is large. Interestingly, even when T = 600 and the rolling window is 60, the Fama-MacBeth rolling beta method produces risk premiums that are still about 20 percent smaller than their true values. Evidently, either the rolling window or the sample period is too short to eliminate the bias. In contrast, the β IV method produces accurate estimated risk premiums whether the sample period is 60 or 600. The T-ratios of the difference are small for all estimated risk premiums, indicating that the bias is negligible in a statistical sense as well.

[19] These papers include Chen, Roll and Ross (1986), Ferson and Harvey (1991), Chan, Chen and Hsieh (1985), Jagannathan and Wang (1996) and Kramer (1994).

For 149 portfolios, results are presented in Table 2. The errors-in-variables bias is small, though it is larger than the bias with 25 portfolios (Table 1). The explanation is that grouping stocks into 25 well-diversified portfolios diversifies away idiosyncratic risk better than grouping the same stocks into 149 portfolios. Judging by the T-ratios of the difference, the bias for the classical methods (the BJS method and the Fama-MacBeth method) is significant when T = 600.

Estimated risk premiums using 4970 individual stocks are shown in Table 3. For T = 600, the BJS method, the β IV method and Theil's adjustment produce consistent estimated risk premiums for the three-factor model. But estimated risk premiums from the Fama-MacBeth method can have a bias as large as 25%. For T = 60, the estimated risk premiums have the smallest bias using the β IV method.
The bias is largest for the size factor, about 20% with the β IV method, but this is smaller than that produced by the BJS method (about 40%) and by the Fama-MacBeth method (more than 60%). Judging by the larger T-ratios, the bias for the classical methods is larger with 4970 stocks than with 25 or 149 portfolios. Theil's adjustment has a smaller bias for the size factor than the β IV method, but a larger bias for the book-to-market factor.

These simulations imply that the finite sample bias in the estimated risk premium is relatively large when the sample period is small and the idiosyncratic risk is high. The β IV method can reduce the finite sample bias, especially when the sample period is small and the number of portfolios or stocks is large. Theil's adjustment is also successful in adjusting for the finite sample bias.

2.4.3 Macroeconomic Factor premiums

In addition to traded factors, we also compare the estimated risk premiums of macroeconomic factors, in Tables 4, 5 and 6.

The simulation results with 4970 stocks are shown in Table 6. Since there is substantial idiosyncratic risk for individual stocks, the estimation error in β is large in the first pass and the bias is large for both the BJS method and the Fama-MacBeth method. The estimated risk premiums are about one third and one tenth of the true values for these two methods, respectively. The T-ratios are in general much larger for the classical methods with macro factors, indicating that the bias is larger in a statistical sense as well. However, the estimators from the BJS method with Theil's adjustment and from the β IV method are much closer to the true values. The T-ratios are insignificant for the β IV method, though they can be significant for Theil's adjustment when T = 60.

In addition to the 4970 individual stocks, we also compare the estimated risk premiums for 25 and 149 portfolios. The results are shown in Tables 4 and 5.
The estimated risk premiums for the BJS method and the Fama-MacBeth method have larger bias than with the 4970 stocks, suggesting that grouping might not result in a smaller finite sample bias. But the results differ for the β IV method and Theil's adjustment. Estimated risk premiums using these two methods are far from the true risk premiums. These two methods can produce unreasonable risk premiums for some simulations because $\hat\beta_{t-\tau}'\hat\beta_t$ or $\widehat{\mathbf{1}\beta}'\widehat{\mathbf{1}\beta} - N\hat A_T$ is not positive definite when N is not large enough and the instrumental variables are weak. One way to deal with weak instruments is to remove the stocks whose instruments and factor loadings have different signs. However, further reducing the number of stocks is not feasible when this number is already small. Thus, these methods can produce consistent estimators only when the number of stocks or portfolios is reasonably large. Using simulations, we find that when the number of stocks is larger than 2000, these methods can adjust for the bias. (These simulation results are available from the authors upon request.)

2.4.4 Time-varying factor loadings and regression residuals

The above simulations are all based on constant factor loadings and constant volatility of the regression residuals. We also want to examine the various methods with time-varying factor loadings and regression residuals. To do this, we estimate the factor loadings from stock returns and factors using a rolling window of 30 months. The estimated factor loadings are assumed to be the true factor loadings in the simulations. Since the factor loadings estimated using rolling windows are time-varying, the true factor loadings in the simulations are also time-varying. Similarly, we can estimate the residuals from the regression and create a pool of residuals.
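The rolling-window estimation of time-varying loadings mentioned above can be sketched as follows. The function name `rolling_loadings`, the array layout and the intercept in each window regression are assumptions made for the example; only the 30-month window length comes from the text.

```python
import numpy as np

def rolling_loadings(R, F, window=30):
    """Loadings per rolling window. R: T x N returns, F: T x K factors (assumed layout)."""
    T, N = R.shape
    out = []
    for end in range(window, T + 1):
        sl = slice(end - window, end)                     # observations in this window
        X = np.column_stack([np.ones(window), F[sl]])     # intercept included as an assumption
        coef, *_ = np.linalg.lstsq(X, R[sl], rcond=None)  # (K+1) x N
        out.append(coef[1:].T)                            # N x K loadings for this window
    return np.stack(out)                                  # (T - window + 1) x N x K
```

Each slice of the returned array could then serve as the "true" loading matrix for the corresponding date in a simulation of the kind described in this subsection.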
To bootstrap the residuals, we first select a time point and then randomly select, from the residual pool, one of the estimated residuals in a neighborhood (within either 15 or 30 months) of this time point. This provides time-varying residuals in the simulations. With these simulated time-varying factor loadings and regression residuals, we can compare the various methods. In particular, we are interested in comparing the IV method with the existing methods. The advantage of the 3-group IV method is that it is not significantly affected by time-varying factor loadings. The results, shown in Tables 7 and 8, indicate that the 3-group IV method yields the best estimates among the alternatives.

2.4.5 Standard Errors

In this subsection, we compare the asymptotic standard errors of the estimated risk premiums from three methods: the BJS method, the IV method and the Fama-MacBeth rolling-β method.[20] First we construct the "true" standard error of the risk premiums using the bootstrap; i.e., we take the standard deviation of the estimated risk premiums across the 10,000 replications to be the true standard error.[21] We compare this true standard error with the method-specific estimated standard errors, including Fama-MacBeth standard errors with and without the adjustment for autocovariances, Shanken's adjustment, and the estimated standard errors from Theorem 3.2.

[20] The asymptotic standard error of Theil's adjustment is the Shanken adjustment, which is the same as for the BJS method.

[21] The constructed "true" standard errors are similar for 1,000 and 10,000 simulations, indicating that the standard error has converged by approximately 1,000 simulations.

There are two different asymptotic covariances from Theorem 3.2. We can obtain the NT-asymptotic covariance V from

$$\sqrt{NT}\left(\frac{1}{T-2\tau+1}\sum_{t=2\tau}^{T}\big(\hat\gamma_t - (0, f_t)\big) - (0, \gamma)\right) \xrightarrow{d} N(0, V).$$

We can also obtain the T-asymptotic covariance $\tilde\Sigma_F$ using

$$\sqrt{T}\left(\frac{1}{T-2\tau+1}\sum_{t=2\tau}^{T}\hat\gamma_t' - (0, \gamma)\right) \xrightarrow{d} N(0, \tilde\Sigma_F).$$
Since an estimator with a faster convergence rate is more appealing, we choose the first estimator and construct $\sqrt{NT}$-convergent standard errors. The results for standard errors cover six cases: Tables 9, 10 and 11 present the standard errors for the three Fama-French factors with 25, 149 and 4970 stocks, respectively; Tables 12, 13 and 14 present the standard errors for the macro factors with 25, 149 and 4970 stocks.

2.4.6 Traded Factors

First, we consider the standard errors for the Fama-French three-factor model. In Table 9 (T = 600 and N = 25), for the BJS method, the standard error estimated through Shanken's adjustment is closer to the true standard error than the Fama-MacBeth standard error without autocovariance. For the IV method, the Fama-MacBeth standard errors (with or without autocovariance) and the standard error from Theorem 3.2 are all close to the true standard error. For the Fama-MacBeth rolling-β method, the Fama-MacBeth standard errors (with or without autocovariances) are smaller than the true value. The results are similar for both 149 portfolios (Table 10) and 4970 stocks (Table 11).

When T = 60, the estimated standard errors are much farther from the true standard errors than in the T = 600 case, because estimated standard errors are less accurate with smaller T. However, Shanken's covariance and the covariance implied by Theorem 3.2 are still the closest to the true covariance for the BJS method and the IV method, respectively. An exception is N = 4970 and T = 60; in this case, the Shanken-adjusted standard errors are much smaller than the true value with the BJS method. This is because the idiosyncratic residuals of individual stocks are much larger than the idiosyncratic residuals of portfolios.

2.4.7 Macro Factors

Tables 12 and 13 present standard errors for macro-factor models. Compared with the Fama-French three-factor model, the standard errors are larger.
The reason is that, for 25 and 149 portfolios, the idiosyncratic errors are larger in the macro-factor model than in the Fama-French three-factor model. We find similar results for the BJS method and the rolling-β method. For the rolling-β IV method, since the estimated risk premiums can be extremely large in magnitude, the standard errors are unreliable.

The more interesting results are shown in Table 14 with 4970 individual stocks. When T = 60, all of the estimated standard errors are much smaller than the true standard errors, suggesting that they are inaccurate when T is small. When T = 600, the Shanken adjustment and the Fama-MacBeth standard error with autocovariance are closer to the true value for the BJS and the Fama-MacBeth rolling-β IV methods, respectively, but they are still much smaller than the true value. The results indicate that for macro-factor models, even if T = 600, the estimated standard errors are not accurate. [22] The standard error from Theorem 3.2 is the closest to the true value, and with the IV method the Fama-MacBeth standard error with autocovariance is also close to the standard error from Theorem 3.2.

Ang, Liu, and Schwarz (2010) show that if one groups stocks into portfolios to estimate risk premiums, the asymptotic covariance of the estimated risk premiums is larger. The simulated standard errors in Tables 12, 13 and 14 are consistent with their theoretical findings. More specifically, the standard errors decrease with the number of portfolios.

The comparison of the standard errors leads to the following conclusions. First, one should use a large enough T to bring the estimated standard errors closer to the true standard errors. For the Fama-French three-factor model, T = 600 is large enough, and for the macro-factor model, T needs to be much larger than 600 (e.g., with daily data).
For smaller T (e.g., with monthly rather than daily data) but reasonably large N, using the IV method together with the standard errors from Theorem 3.2 or the Fama-MacBeth standard error with autocovariance can be more appropriate for the macro-factor model.

2.4.8 T Statistics

Lewellen, Nagel and Shanken (2010) show that explanatory power can be misleading for some asset pricing models. In this paper, we examine another important issue, the size of cross-sectional regression tests. Specifically, we want to examine the probability of rejecting an asset pricing model when the model is correctly specified. Following Ferson and Foster (1994) and Shanken and Zhou (2007), we compute the probability of rejecting the null hypothesis, $\alpha = 0$ ($\alpha$ is the constant term), when it is true. To do this, for each simulation, we calculate the t-ratios for the different methods and compare them with the two-sided 5% critical value of the standard normal distribution, 1.96. Then, we count the number of simulations in which the absolute value of the t-ratio is greater than 1.96 and divide it by the total number of simulations to obtain the probability of rejecting the true null hypothesis (the rejection rate).

[22] We did run simulations with T = 3000; the estimated standard errors are much closer to the true values.

Since the bias of the point estimators is large for some methods, we use a "bias-corrected" t-ratio. To do this, we estimate the risk premiums in the first 1000 trials and take the difference between the true value and the sample average of the estimated risk premiums as the bias. Then we remove this bias from the simulated risk premiums to obtain the bias-corrected t-ratios for another 1000 trials.

We compare the t-ratios across different methods for the point estimate and different methods of estimating standard errors, in the Fama-French three-factor model and the macro-factor model with N = 4970 and T = 600.
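The rejection-rate calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `estimate_bias` and `rejection_rate` are hypothetical, the bias is measured as the average estimate minus the true value, and each simulation is assumed to supply one point estimate together with one estimated standard error.

```python
def estimate_bias(estimates, true_value=0.0):
    """Bias from a first batch of trials: mean estimate minus true value."""
    return sum(estimates) / len(estimates) - true_value


def rejection_rate(estimates, std_errors, bias=0.0, null_value=0.0,
                   critical=1.96):
    """Share of trials whose bias-corrected |t-ratio| exceeds `critical`.

    `estimates` and `std_errors` come from a second, separate batch of
    trials; `bias` is the value returned by `estimate_bias` on the first
    batch (an assumption about how the two batches are combined).
    """
    if len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    rejections = 0
    for est, se in zip(estimates, std_errors):
        t_ratio = (est - bias - null_value) / se
        if abs(t_ratio) > critical:
            rejections += 1
    return rejections / len(estimates)
```

With a correctly sized test, the value returned by `rejection_rate` under a true null should be near 0.05; values well above that indicate over-rejection of the kind discussed in the text.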
We choose large values of N and T because these are the cases with the smallest bias for the BJS and the instrumental variables methods. The results are shown in Table 15. According to this table, the rejection rate is too large for most of the tests. For the Fama-French three-factor model, the IV method and the BJS method have lower rejection rates than the Fama-MacBeth rolling-β IV method; e.g., the largest rejection rate for the BJS method and the IV method is smaller than 10%, but it is more than 15% for the Fama-MacBeth rolling-β IV method. This is because the estimated standard errors are smaller than the true standard errors. Most empirical research finds that the existing models are generally misspecified with the Fama-MacBeth rolling-β IV method. [23] This paper suggests that this finding may be due to statistical issues. For the IV method, the standard error from Theorem 3.2 leads to a rejection rate of 4.70%, which is close to the nominal rejection rate of 5%. All other standard errors lead to larger rejection rates. For the BJS method, the rejection rate is also larger than 5%. The conclusions are similar for the macro-factor model. The only exception is the BJS method with the Shanken adjustment, where the rejection rate is too small (3.3%).

[23] Here, we consider the bias-corrected t-ratio. If the bias is not corrected, one can expect a higher rejection rate with the Fama-MacBeth rolling-β method and the BJS method.

To conclude, the most reliable approach for testing whether an asset pricing model is misspecified is to use the standard error from Theorem 3.2 with the IV method. Using the Fama-MacBeth rolling-β method can result in misleading interpretations of α.

2.5 Application of these methods

According to the simulation results, the instrumental variable approach can effectively remove the bias in cross-sectional regressions.
Thus, it is natural to examine whether applying this method can help researchers identify the factors that explain expected stock returns. Instead of traded factors, we apply this method to macro factors. The macro factors arguably affect stock returns. However, as we can see from the simulations, the risk premiums estimated with classical methods are much more biased for macro factors than for traded factors, making it harder for researchers to find significant risk premiums. The instrumental variable method, which can adjust for the bias, is more likely to identify these risk factors. Moreover, the risk premiums of the macro factors can only be estimated using the cross-sectional regression approach, while the risk premiums of the traded factors can be estimated using the sample average of the excess returns (Shanken (1992)). Therefore, the macro-factor model provides an ideal setting in which to study various cross-sectional regression methods.

We create the same macro factors from 1964 to 2010 as before. The monthly individual stock returns for the same period are available from CRSP. Since many stocks exist only for a short horizon, we exclude these short-lived stocks, i.e., stocks with fewer than 90 months of return data. This is because we need enough data (for example, 30 months) to estimate factor loadings in the 3-sample method. Moreover, it is difficult to believe that the macro factors can significantly affect the returns of short-lived stocks, since the majority of these returns should be influenced by idiosyncratic shocks. With these returns and the factors, we estimate the risk premiums and t-statistics of different cross-sectional regression methods: the BJS method, the Fama-MacBeth rolling-β method, the lagged instrumental variable method, the 3-group method and Theil's adjustment. For all three instrumental variable methods, the issue of weak instrumental variables can affect the estimated risk premiums.
Thus, we drop the observations of stocks for which the estimated β instruments and the estimated factor loadings from two subsamples have opposite signs. The estimated risk premiums and t-statistics are shown in Tables 16 and 17.

We find that the estimated risk premiums with the classical methods are generally smaller than those from the instrumental variables approach. For example, when the second pass is OLS, the risk premium estimates for consumption growth are 0.006 and -0.001 with the BJS and Fama-MacBeth approaches, respectively. By contrast, the estimated risk premiums for the rolling-β IV method and the 3-group method are 0.097 and 0.028, which are much larger. The t-statistics are also larger. For example, we find that consumption growth significantly affects stock returns with the two instrumental variable methods, but we find no such result with the classical approaches.

The findings from Theil's adjustment are also noteworthy. When the second pass is OLS, although some estimates are even larger than those from the instrumental variable approaches, the t-ratios do not lead to rejection of the null hypothesis. Moreover, with GLS as the second pass, we find that consumption growth negatively affects stock returns, which is inconsistent with the findings from the OLS estimators.

Similarly, we find that inflation shocks negatively affect stock returns, though the only significant coefficients are estimated using the lagged instrumental variable approach. Surprisingly, the shock to industrial production has a significantly negative effect on stock returns for almost all methods. The only exception is Theil's adjustment, for which the conclusions differ depending on whether OLS or GLS is used in the second pass.

2.6 Conclusion

This paper suggests an adjustment for the Fama-MacBeth IV method. Estimated β's from non-overlapping observations can serve as effective instruments and mitigate or entirely eliminate the errors-in-variables bias.
For the cases of constant β and time-varying β, we prove consistency and derive the asymptotic distribution of the estimated risk premium when the number of portfolios N is large. When β is a general function of conditioning information, we use simulations to compare the IV method with the traditional BJS method and the traditional Fama-MacBeth method. For macro factors, the bias is large for the latter two methods when the number of portfolios is large and the total sample period T is small. The IV method and Theil's adjustment method are consistent, and the estimated risk premiums are much closer to the true value even when T is small (e.g., T = 60). This is due to the $\sqrt{NT}$ convergence rate of these estimators. As long as N is large, the estimated risk premiums have much smaller bias even if T is small. Moreover, the traditional and still widely used Fama-MacBeth method has the largest bias among the three methods. Using the lagged estimated β as the instrumental variable corrects this bias significantly.

In addition, this paper provides standard error estimators (Theorem 3.2) that, along with the Fama-MacBeth standard error with autocovariance, are superior to alternative estimators when T and N are relatively large. We conduct simulations to evaluate the various standard errors and t-ratios for tests of the null hypothesis that $\alpha = 0$. The results show that most of the tests reject the null hypothesis too often in Fama-French and macro-factor models. The IV method, combined with the standard error from Theorem 3.2, provides well-specified tests.

Finally, the empirical applications show that the new approach can indeed adjust for the bias and help researchers identify the factors that affect stock returns, while the conclusions drawn from other methods are either insignificant or inconsistent.

References

Ang, Andrew, Jun Liu, and Krista Schwarz, 2010, Using Stocks or Portfolios in Tests of Factor Models, working paper.
Bansal, Ravi, 2004, Long Run Risks and Risk Compensation in Equity Markets, working paper, Duke University.
Bansal, Ravi, and Amir Yaron, 2004, Risks for the Long Run: A Potential Resolution of Asset Pricing Puzzles, Journal of Finance 59, 1481-1509.
Biørn, Erik, 2000, Panel data with measurement errors: instrumental variables and GMM procedures combining levels and differences, Econometric Reviews 19, 391-424.
Black, Fischer, Michael C. Jensen, and Myron Scholes, 1972, The capital asset pricing model: Some empirical tests, in Michael C. Jensen, ed.: Studies in the Theory of Capital Markets, 79-121.
Blume, Marshall, and Irwin Friend, 1973, A New Look at the Capital Asset Pricing Model, Journal of Finance 28, 19-34.
Braun, Phillip, Daniel Nelson, and Alain Sunier, 1995, Good news, bad news, volatility and betas, Journal of Finance 50, 1575-1604.
Chan, K. C., Nai-Fu Chen, and David A. Hsieh, 1985, An Exploratory Investigation of the Firm Size Effect, Journal of Financial Economics 14, 451-471.
Chen, Robert, and Raymond Kan, 2004, Finite Sample Analysis of Two-Pass Cross-Sectional Regressions.
Chen, Nai-Fu, Richard Roll, and Stephen A. Ross, 1986, Economic Forces and the Stock Market, Journal of Business 59, 383-403.
Cochrane, John H., 2001, Asset Pricing (Princeton University Press).
Fama, Eugene F., and Kenneth R. French, 1992, The cross-section of expected stock returns, Journal of Finance 47, 427-465.
Fama, Eugene F., and James D. MacBeth, 1973, Risk, return and equilibrium: Empirical tests, Journal of Political Economy 81, 607-636.
Ferson, Wayne E., and Stephen R. Foster, 1994, Finite sample properties of the generalized method of moments in tests of conditional asset pricing models, Journal of Financial Economics 36, 29-55.
Ferson, Wayne E., and Campbell Harvey, 1991, The variation of economic risk premiums, Journal of Political Economy 99, 385-415.
Ferson, Wayne E., and Robert A. Korajczyk, 1995, Do arbitrage pricing models explain the predictability of stock returns?
Journal of Business 68, 309-349.
Ferson, Wayne E., and Campbell R. Harvey, 1999, Conditioning Variables and Cross-section of Stock Returns, Journal of Finance 54, 1325-1360.
Foster, Dean P., and Dan B. Nelson, 1996, Continuous Record Asymptotics for Rolling Sample Variance Estimators, Econometrica 64, 139-174.
Griliches, Zvi, and Jerry A. Hausman, 1986, Errors in variables in panel data, Journal of Econometrics 31, 93-118.
Gibbons, Michael R., 1982, Multivariate tests of financial models: A new approach, Journal of Financial Economics 10, 3-27.
Gibbons, Michael R., Stephen A. Ross, and Jay Shanken, 1989, A test of the efficiency of a given portfolio, Econometrica 57, 1121-1152.
Ghysels, Eric, 1998, On stable factor structures in the pricing of risk: Do time-varying betas help or hurt? Journal of Finance 53, 549-573.
Hansen, Lars P., 1982, Large sample properties of generalized method of moments estimators, Econometrica 50, 1029-1054.
Hansen, Lars P., John Heaton, and Amir Yaron, 1996, Finite-sample properties of some alternative GMM estimators, Journal of Business and Economic Statistics 14, 262-280.
Hansen, Lars P., and Ravi Jagannathan, 1991, Implications of security market data for models of dynamic economies, Journal of Political Economy 99, 225-262.
Hansen, Lars P., and Ravi Jagannathan, 1997, Assessing specification errors in stochastic discount factor models, Journal of Finance 52, 557-590.
Hansen, Lars P., and Scott F. Richard, 1987, The role of conditioning information in deducing testable restrictions implied by dynamic asset pricing models, Econometrica 55, 587-613.
Hansen, Lars P., and Kenneth J. Singleton, 1982, Generalized instrumental variables estimation of nonlinear rational expectations models, Econometrica 50, 1269-1286.
Jagannathan, Ravi, and Zhenyu Wang, 1996, The conditional CAPM and the cross-section of expected returns, Journal of Finance 51, 3-53.
Jagannathan, Ravi, and Zhenyu Wang, 1998, An asymptotic theory for estimating beta-pricing models using cross-sectional regression, Journal of Finance 53, 1285-1309.
Jagannathan, Ravi, and Zhenyu Wang, 2002, Empirical evaluation of asset-pricing models: A comparison of the SDF and beta methods, Journal of Finance 57, 2337-2367.
Kan, Raymond, Cesare Robotti, and Jay Shanken, 2010, Pricing model performance and the two-pass cross-sectional regression methodology.
Kan, Raymond, and Chu Zhang, 1999, Two-pass tests of asset pricing models with useless factors, Journal of Finance 54, 203-235.
Kan, Raymond, and Guofu Zhou, 1999, A critique of the stochastic discount factor methodology, Journal of Finance 54, 1021-1048.
Kramer, Charles, 1994, Macroeconomic Seasonality and the January Effect, Journal of Finance 49, 1883-1891.
Kleibergen, Frank, 2009, Tests of risk premia in linear factor models, Journal of Econometrics 149, 149-173.
Lewellen, Jonathan, Stefan Nagel, and Jay Shanken, 2010, A skeptical appraisal of asset pricing tests, Journal of Financial Economics 96, 175-194.
Litzenberger, Robert H., and Krishna Ramaswamy, 1979, The effect of personal taxes and dividends on capital asset prices: Theory and empirical evidence, Journal of Financial Economics 7, 163-196.
Merton, Robert C., 1973, An intertemporal asset pricing model, Econometrica 41, 867-888.
Ross, Stephen A., 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341-360.
Rosenberg and Marathe, 1975, Test of asset pricing hypothesis, Research in Finance 1, 115-223.
Shanken, Jay, 1985, Multivariate tests of the zero-beta CAPM, Journal of Financial Economics 14, 327-348.
Shanken, Jay, 1987, Multivariate proxies and asset pricing relations: Living with the Roll critique, Journal of Financial Economics 18, 91-110.
Shanken, Jay, 1992, On the estimation of beta-pricing models, Review of Financial Studies 5, 1-33.
131 Shanken Jay, and Guofu Zhou, 2007, Estimating and testing beta pricing models: alternative methods and their perfor- mance in simulations, Journal of Financial Economics, 84, 40-86. Welch Ivo, 2000, Views of financial economists on the equity premium and on professional controversies, Journal of Business 73, 501-537. Welch Ivo, 2008, The consensus estimate for the equity premium by academic financial economists in december 2007, Working Paper, Brown University. 132 Appendix: Proof of the Theorems Proof of Theorem 3.1. We prove the consistency and asymptotic theorem for the case using the lagged estimated facor loadings as instrumental variable. The case using the general instrumental variable can be proved in the exactly same way. To illustrate the prove, let the second step be OLS, i.e. I Σ = one has: . ) ' 1 ˆ ( ) ' 1 ˆ 1 ˆ ( = ) ) (0, ' ˆ t τ t 1 t τ t t t ξ β β β f γ γ The consistency is established since 0 β ξ = ) ' 1 ˆ ' ( τ t t E and the Lindeberg condition. Since N / β β converges to ' bb when N ∞ and ] ξ β , , ξ β [ tN N t1 1 satisfies the Lindeberg condition, one can apply the Lindeberg Central Limit Theorem. Define t t 1 t t t ' ) ' ( = Ω F F F u . To get the asymptotic covariance, notice that as N ∞, , ' ) ' 1 ˆ 1 ˆ ( N 1/ t τ bb β β t and on the other hand, ) | ) ' 1 ˆ ' 1 ˆ ( N (1/ τ t t t τ t F β ξ ξ β E ) | ) 1 ) ) ( ( ) ) ( 1( ( N 1 ( = t t t t t t F β ε u f γ ε u f γ β E ) | ) ' ) ) ( ( ) ) ( ( ( N 1 ( τ t t t t t t t τ t F u ε u f γ ε u f γ u E 133 1 ) | ) ) ( ( ) ) ( (( 1 N 1 t t t t t t β F ε u f γ ε u f γ β E . ) | ' ~ ) | ) ) ( ( ) ) ( (( ~ ( N 1 τ t t t t t t t τ t τ t F u F ε u f γ ε u f γ u E E ) | ( τ t F E takes the expected value of a random variable at time τ t conditioning on all the information F and ] , ' [ = ' ~ τ t N 1 τ t u 0 u . 
One can show that 24 ) | ) ) ( ( ) ) ( (( t t t t t t τ t F ε u f γ ε u f γ E , = ) | ) ) ( ( ) ) ( (( = 0 t t t t t t I F ε u f γ ε u f γ c E hence, ) | ) ' 1 ˆ ' 1 ˆ ( 1/N ( τ t t t τ t F β ξ ξ β E . ~ ' τ t 0, 0 0 L bb c c Then the asymptotic covariance matrix can be written as . ) ' )( ~ ' ( ) ' ( = ) | )) (0, ( ˆ ( Acov 1 τ t 0, 1 0 t t bb L bb bb F f γ γ c 24 We will show that in the next page. 134 In the second step is GLS, one has: . ) ' 1 ˆ ( ) ' 1 ˆ 1 ˆ ( = ) ) (0, ' ˆ t 1 τ t 1 t 1 τ t t t ξ Σ β β Σ β f γ γ The consistency can be proved in same way as before. To derive the asymptotic covariance, one can show that ) | ' 1 ˆ ' 1 ˆ N 1 ( τ t 1 t t 1 τ F β Σ ξ ξ Σ β t E , ) ~ ' ( ) | ' ~ ~ N 1 1 1 N 1 ( = τ t 0, 1 0 τ t 1 τ t 1 L b b Σ F u Σ u β Σ β c E so the conditional asymptotic distribution can be derived similarly as before. Note that the key step in this proof is that β β F β ε u f γ ε u f γ β 0 t t t t t t = ) | ) ) ( ( ) ) ( ( ( c E . This result follows the proof in Shanken (1992). The details are shown below: Since t t 1 t t t ' ) ' ( = F F F u , , ) ~ ( ) ' ) ' )( ( ( = ) ( t t 1 t t t t t E F F F f γ I f γ u Vec where ) ~ ( Vec t E reshapes the N τ matrix t ~ E into the a N τ column vector, i.e 135 , ) ε ε , , ε ε , , ε ε , , ε ε , ε ε , , ε ε ( = ) ~ ( Vec t N N t, t N N τ, t t 2 t,2 t 2 τ , 2 t t 1 t,1 t 1 τ , 1 t t E with t i ε the average of i t, i τ, t ε , , ε . Applying this formula, one has ) | ) ( ) ( ' ( t t t t F E u f γ f γ u ) ' ) ' )( ( )( )( ' ) ' )( ( ( = t 1 t t t τ t 1 t t t F F F f γ I L I F F F f γ I . ) ' ) ' )( (( ) ' ) ' )( ( = t 1 t t t τ t 1 t t t I F F F f γ L F F F f γ Using this method, one has , ' = ) | 1 ) ) ( ( ) ) ( 1( N 1 ( 0 t t t t t t bb F β ε u f γ ε u f γ β c E similarly, one can show that 1 τ t τ t τ t τ τ t 1 τ t τ t 0 τ t t t t t t t τ t ) ' ( ' ) ' ( = ) | ' ) ) ( ( ) ) ( ( N 1 ( F F F L F F F F u ε u f γ ε u f γ u c E . 136 Proof of Theorem 3.2. 
From Theorem 3.1, , ) (0, 1 2 τ T 1 = ' ˆ 1 2 τ T 1 t T 2 τ = t t T 2 τ = t f γ γ the consistency is established because 0 ) (0, 1 2 τ T 1 t T 2 τ = t f when T is large. When the second step is OLS, to derive the unconditional asymptotic distribution, notice that ) ' ˆ ( 1 2 τ T 1 t T 2 τ = t γ γ . ) ) (0, ) ' 1 ˆ ( ) ' 1 ˆ 1 ˆ (( 1 2 τ T 1 = t t τ t 1 t τ t T 2 τ = t f ξ β β β where ' ) ( ' = ' t t t t ε f γ u ξ and ' ~ = ' ˆ τ t τ t u β β . By the assumption that t f is a stationary process and β satisfies the Lindeberg condition, it is clear that one can apply the Central Limit Theorem to derive the asymptotic distribution of ) ) (0, ˆ ( t t T 2 τ = t f γ , i.e. . ) (0, ) ) (0, ) ) (0, ˆ ( ( 1 2 τ T 1 T N t t T 2 τ = t V γ f γ N The asymptotic variance can be written as the summation of variance and autocovariance of the 137 error term. To be more specific, first, notice that conditional on F : ) | ' ˆ ) ) ( ( ) ) ( ( ˆ N 1 ( τ t t t t t t t τ t F β ε u f γ ε u f γ β E . ~ ' τ t 0, 0 0 L bb c c In addition, for any integer 1 t between 1 and 1 τ , ) | ˆ ) ) ( ( ) ) ( ( ˆ N 1 ( t1 τ t t1 t t1 t t1 t t t t τ t F β ε u f γ ε u f γ β E , ~ ' 1 τ t t1, t1 t1 L bb c c where ) ' ) ' )( ( ) ( ~ ) (( = 1 t1 t 1 t t t t1 t t t1, t t1 1 F F F f γ f γ L f γ c . Here, , 0 = ~ 1 τ t t1, 1 k k 1 1 τ t t1, L 0 0 L , ) ' ( ' ) ' ( = 1 t1 τ t t1 τ t t1 τ t τ t 1 τ t τ t 1 τ t t1, F F F M F F F L x where | t1 | τ | t1 | τ ) t1 ( ) t1 ( = L l L l M x with 138 , τ t1 τ τ 1 τ t1 τ τ 1 1 τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ 1 1 τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ 1 1 τ t1 τ τ t1 τ τ 1 τ t1 τ τ 1 τ t1 τ τ t1 τ τ t1 τ τ 1 = τ τ 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 | t1 | τ L the term 2 τ t1 τ τ 1 1 in this matrix are the ) t1 τ τ, ( , , 2,2) (t1 , 1,1) (t1 ’th elements, 2 τ t1 τ ’s are in the upper triangular portion of the τ τ matrix. 
(The triangular matrix is from 1) | t1 | τ (1, to τ) (t1, to τ) (1, ’th element.), and , ) τ t1 τ , , τ t1 τ ,1 , τ t1 τ ( = 2 2 2 1 | t1 | 1 where one is the 1 | t1 | ’th element. In addition, l is an indicator function: 1 = ) x ( l if 0 > x , 0 = ) x ( l if 0 < x and 2 1 = ) x ( l if 0 = x . The final asymptotic variance is as follows: . ) ' )))( ~ ' ( ( ( ) ' ( = 1 τ t t1, t1 1 τ 1 τ = t1 1 bb L bb bb V c E When the second pass estimation is GLS, ) | ' 1 ˆ ) ) ( ( ) ) ( ( 1 ˆ ( N 1 ( t1 τ t 1 t1 t t1 t t1 t t t t 1 τ t F β Σ ε u f γ ε u f γ Σ β E 139 , ) | ) ' ' ( N 1 ( ' ( t1 τ t 1 τ t 1 t1 F u Σ u b b Σ E c This implies that . ) ' ))( ~ ' ( ( ( ) ' ( = 1 1 τ - t t1, 1 t1 1 τ 1 τ = t1 1 1 b b Σ L b b Σ b b Σ V c E In addition, since , ) N 1 O( = )) ' 1 ˆ ( ) ' 1 ˆ ˆ (( 1 2 τ T 1 T t τ t 1 t τ t T 2 τ = t ξ β β β it is easy to show that ) ~ (0, )) (0, ˆ ( T F N Σ γ γ . 140 Table 2.1 Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (25 portfolios, Monthly Data) This Table uses the FF three-factor model to generate stock returns. The first pass is a time series regression of returns on market, size, and book-to-market factors for each asset, which producesbeta estimates. The second pass regresses asset returns cross-sectionally on the market β , size β and book-to market β . This Table presents the estimated risk premiums with 25 portfolios using the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data. 25 portfolios include 25 size and book-to-market portfolios for 1964 to 2009. In applying the rolling beta method, we assume the rolling window is 15 for T=60 and 60 for T=600. The true risk premiums are the sample means of excess return factors. We also report the T-ratio (T-Diff) of the difference between reported risk premiums and their true values. 
The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.

T = 60, rolling window = 15
Factor     Market    Size      BM
True        0.41     0.26     0.42
BJS         0.43     0.25     0.40
T-Diff      0.23    -0.16    -0.14
Rolling     0.45     0.23     0.35
T-Diff      0.44    -0.32    -0.49
Roll IV     0.41     0.26     0.43
T-Diff     -0.05    -0.03     0.08

T = 600, rolling window = 60
Factor     Market    Size      BM
True        0.41     0.26     0.42
BJS         0.42     0.26     0.41
T-Diff      0.25    -0.16    -0.16
Rolling     0.43     0.25     0.40
T-Diff      0.69    -0.49    -0.43
Roll IV     0.42     0.26     0.42
T-Diff     -0.04     0.01     0.02

Table 2.2 Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (149 portfolios, Monthly Data)

This Table uses the FF three-factor model to generate stock returns. The first pass is a time series regression of returns on market, size, and book-to-market factors for each asset, which produces beta estimates. The second pass regresses asset returns cross-sectionally on the market β, size β and book-to-market β. This Table presents the estimated risk premiums with 149 portfolios using the BJS estimation without rolling beta (BJS), the Fama-MacBeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data. The 149 portfolios include 100 size and book-to-market portfolios combined with 49 industry portfolios. In applying the rolling beta method, we assume the rolling window is 15 for T=60 and 60 for T=600. The true risk premiums are the sample means of excess return factors. We also report the T-ratio (T-Diff) of the difference between reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.
T = 60, rolling window = 15
Factor     Market    Size      BM
True        0.41     0.26     0.42
BJS         0.44     0.25     0.36
T-Diff      0.53    -0.22    -0.63
Rolling     0.51     0.18     0.22
T-Diff      1.19    -0.70    -1.21
Roll IV     0.41     0.27     0.43
T-Diff     -0.06     0.05     0.09

T = 600, rolling window = 60
Factor     Market    Size      BM
True        0.41     0.26     0.42
BJS         0.42     0.25     0.41
T-Diff      1.77    -0.73    -2.23
Rolling     0.44     0.24     0.35
T-Diff      1.62    -1.10    -2.31
Roll IV     0.41     0.26     0.42
T-Diff     -0.04    -0.04    -0.02

Table 2.3 Three methods to estimate Fama-French three-factor risk premiums using the bootstrap (4970 stocks, Monthly Data)

This Table uses the FF three-factor model to generate stock returns. The first pass is a time series regression of returns on market, size, and book-to-market factors for each asset, which produces beta estimates. The second pass regresses asset returns cross-sectionally on the market β, size β and book-to-market β. This Table presents the estimated risk premiums with 4970 individual stocks using the BJS estimation without rolling beta (BJS), the Fama-MacBeth rolling beta method (Rolling), the lagged beta IV method (Roll IV), and Theil's adjustment (Theil). The estimation is based on monthly data from 1964 to 2009. In applying the rolling beta method, we assume the rolling window is 15 for T=60 and 60 for T=600. The true risk premiums are the sample means of excess return factors. We also report the T-ratio (T-Diff) of the difference between reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.
T = 60, rolling window = 15
Factor     constant   Market    Size      BM
True        0          0.41     0.26     0.42
BJS         0.070      0.32     0.15     0.29
T-Diff      1.20      -1.16    -0.97    -1.06
Rolling     0.17       0.17     0.09     0.12
T-Diff      0.99      -1.04    -0.68    -1.13
Roll IV     0.017      0.36     0.19     0.47
T-Diff      0.47      -1.15    -1.08     0.77
Theil      -0.0032     0.36     0.22     0.53
T-Diff     -0.05      -0.67    -0.38     0.96

T = 600, rolling window = 60
Factor     constant   Market    Size      BM
True        0          0.41     0.26     0.42
BJS         0.0068     0.41     0.25     0.40
T-Diff      1.02      -0.60    -1.60    -1.64
Rolling     0.055      0.37     0.19     0.27
T-Diff      2.90      -1.72    -1.89    -3.68
Roll IV     0.0008     0.41     0.26     0.42
T-Diff      0.02      -0.04     0.03    -0.05
Theil      -0.0007     0.42     0.26     0.40
T-Diff     -0.02       1.07     0.27    -1.74

Table 2.4 Three methods to estimate macro three-factor risk premiums using the bootstrap (25 portfolios, Monthly Data)

This Table presents the estimated risk premiums with 25 portfolios using the BJS estimation without rolling beta (BJS), the Fama-MacBeth rolling beta method (Rolling), the lagged beta IV method (Roll IV), and Theil's adjustment (Theil). The estimation is based on monthly data. The 25 portfolios include 25 size and book-to-market portfolios for 1964 to 2009. In applying the rolling beta method, we assume the rolling window is 15 for total period T=60 and 60 for T=600. The three factors are ΔC: consumption growth, ΔCPI: change in inflation, ΔIP: change in industrial production. We also report the T-ratio (T-Diff) of the difference between reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.
T = 60, rolling window = 15
Factor     constant    ΔC        ΔCPI      ΔIP
True        0          0.2      -0.1       1.2
BJS        -0.028      0.016    -0.027     0.089
T-Diff     -0.18      -3.39      1.72     -5.39
Rolling    -0.014      0.0038   -0.0081    0.023
T-Diff     -0.15      -4.95      2.76     -8.28
Roll IV    -0.17       0.31      0.93     -1.43
T-Diff     -0.00       0.01      0.01     -0.01
Theil      -0.26       0.66     -1.09      0.97
T-Diff     -0.54       1.01     -1.78      3.11

T = 600, rolling window = 60
Factor     constant    ΔC        ΔCPI      ΔIP
True        0          0.2      -0.1       1.2
BJS        -0.037      0.097    -0.013     0.57
T-Diff     -1.50      -3.79      2.33     -6.49
Rolling    -0.067      0.017    -0.026     0.094
T-Diff     -4.08     -16.09      9.94    -24.63
Roll IV    17.62      -7.35     -3.70      9.18
T-Diff      0.00      -0.00     -0.01      0.01
Theil       0.30       0.15     -0.84      1.56
T-Diff      0.32       0.76     -0.39      0.79

Table 2.5 Three methods to estimate macro three-factor risk premiums using the bootstrap (149 portfolios, Monthly Data)

This Table presents the estimated risk premiums with 149 portfolios using the BJS estimation without rolling beta (BJS), the Fama-MacBeth rolling beta method (Rolling), the rolling beta IV method (Roll IV), and Theil's adjustment (Theil). The estimation is based on monthly data from 1964 to 2009. The 149 portfolios include 100 size and book-to-market portfolios combined with 49 industry portfolios. In applying the rolling beta method, we assume the rolling window is 15 for total period T=60 and 60 for T=600. The three factors are ΔC: consumption growth, ΔCPI: change in inflation, ΔIP: change in industrial production. We also report the T-ratio (T-Diff) of the difference between reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.
T=60, rolling window = 15
Factor      constant    ΔC        ΔCPI      ΔIP
True        0           0.2       -0.2      1.2
BJS         -0.083      0.045     -0.038    0.20
T-Diff      -0.93       -5.16     3.15      -7.88
Rolling     -0.081      0.013     -0.013    0.053
T-Diff      -0.86       -6.15     3.72      -10.46
Roll IV     1.44        -0.55     0.12      -0.76
T-Diff      0.01        -0.01     0.00      -0.01
Theil       -0.36       0.17      0.20      0.26
T-Diff      -3.96       -0.73     7.46      -7.01

T=600, rolling window = 60
True        0           0.2       -0.2      1.2
BJS         -0.062      0.15      -0.14     0.79
T-Diff      -1.50       -4.00     2.52      -6.41
Rolling     -0.12       0.044     -0.036    0.19
T-Diff      -3.93       -16.09    9.94      -24.63
Roll IV     0.0085      0.18      0.49      2.40
T-Diff      0.00        -0.01     0.01      0.00
Theil       0.013       0.21      -0.21     1.25
T-Diff      0.30        0.45      -0.31     0.80

Table 2.6 Three methods to estimate macro three-factor risk premiums using the bootstrap (4970 portfolios, Monthly Data)

This Table presents the estimated risk premiums with 4970 individual stocks using the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), the rolling beta IV method (Roll IV), and Theil’s adjustment (Theil). The estimation is based on monthly data from 1964 to 2009. In applying the rolling beta method, we assume the rolling window is 15 for total period T=60 and 60 for T=600. The three factors are ΔC: consumption growth, ΔCPI: change in inflation, ΔIP: change in industrial production. We also report the T-ratio (T-Diff) of the difference between the reported risk premiums and their true values. The standard errors used to construct the T-ratio are based on the sample covariance of the estimated risk premiums from 10,000 trials.
T=60, rolling window = 15
Factor      constant    ΔC        ΔCPI      ΔIP
True        0           0.2       -0.2      1.2
BJS         -0.045      0.073     -0.062    0.43
T-Diff      -1.51       -5.88     3.21      -6.40
Rolling     -0.041      0.024     -0.023    0.14
T-Diff      -1.25       -6.64     3.90      -9.84
Roll IV     -0.033      0.21      -0.21     1.31
T-Diff      -0.49       0.27      -0.09     0.45
Theil       -0.029      0.23      -0.25     1.42
T-Diff      -0.97       1.39      -1.16     1.83

T=600, rolling window = 60
True        0           0.2       -0.2      1.2
BJS         -0.0060     0.17      -0.16     1.02
T-Diff      -0.69       -9.55     5.96      -9.35
Rolling     0.35        0.074     -0.059    0.45
T-Diff      36.89       -17.00    10.22     -19.26
Roll IV     -0.0031     0.20      -0.20     1.20
T-Diff      -0.05       -0.02     -0.04     0.04
Theil       -0.0029     0.20      -0.20     1.21
T-Diff      -0.13       0.16      -0.04     0.09

Table 2.7 Various methods to estimate Fama-French three-factor risk premiums with time-varying factor loadings and regression residuals (4970 stocks Monthly Data)

This Table uses the FF three-factor model to generate stock returns with time-varying factor loadings and regression residuals, and presents the estimated risk premiums with 4970 individual stocks using the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), the rolling beta IV method (Roll IV) and 3-group IV. We assume that the true risk premiums are the sample mean of each excess return factor. The total periods are T=60 and 600. In applying the rolling beta method, we assume the rolling window is 15 for T=60 and 60 for T=600.
T=60, rolling window = 15
Factor        constant    Market    Size     BM
True          0           0.41      0.26     0.42
BJS           0.11        0.28      0.12     0.25
Rolling       0.20        0.13      0.08     0.10
Roll IV       0.021       0.33      0.17     0.49
3-group IV    0.015       0.38      0.24     0.40
Theil         -0.0032     0.33      0.18     0.53

T=600, rolling window = 60
True          0           0.41      0.26     0.42
BJS           0.018       0.39      0.23     0.38
Rolling       0.095       0.35      0.18     0.23
Roll IV       0.0029      0.39      0.23     0.39
3-group IV    0.0009      0.41      0.25     0.42
Theil         -0.0021     0.38      0.23     0.40

Table 2.8 Three methods to estimate macro three-factor risk premiums using the bootstrap with time-varying factor loadings and regression residuals (4970 portfolios, Monthly Data)

This Table uses the three macro-factor model to generate stock returns with time-varying factor loadings and regression residuals, and presents the estimated risk premiums with 4970 individual stocks using the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), the rolling beta IV method (Roll IV) and the 3-group IV method. The total periods are T=60 and 600. In applying the rolling beta method, we assume the rolling window is 15 for total period T=60 and 60 for T=600. The three factors are ΔC: consumption growth, ΔCPI: change in inflation, ΔIP: change in industrial production.

T=60, rolling window = 15
Factor        constant    ΔC        ΔCPI      ΔIP
True          0           0.2       -0.2      1.2
BJS           -0.065      0.063     -0.052    0.33
Rolling       -0.071      0.021     -0.019    0.12
Roll IV       -0.039      0.22      -0.24     1.38
3-group IV    -0.029      0.21      -0.22     1.28
Theil         -0.039      0.25      -0.27     1.48

T=600, rolling window = 60
True          0           0.2       -0.2      1.2
BJS           -0.0080     0.15      -0.13     0.92
Rolling       0.55        0.066     -0.053    0.40
Roll IV       -0.0038     0.21      -0.19     1.18
3-group IV    -0.0029     0.20      -0.19     1.20
Theil         -0.0029     0.21      -0.18     1.11

Table 2.9 The standard error for Fama-French three-factor risk premiums (25 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data.
The 25 portfolios are the 25 size and book-to-market portfolios for 1964 to 2009. The true standard error is calculated using the bootstrapped covariance of the estimated risk premiums. The estimated standard error is the average of the bootstrapped Fama-Macbeth standard errors. The Fama-Macbeth standard errors contain the autocovariance (Auto) of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also present Fama-Macbeth standard errors ignoring the autocovariance. For the BJS method, we also present the Shanken adjustment in estimated standard errors.

T=60, rolling window = 15
Factor                      Market    Size      BM
BJS Method
  True                      0.0663    0.0904    0.1082
  Estimated (No Auto)       0.0593    0.0805    0.0974
  Estimated (Shanken)       0.0619    0.0841    0.1017
Rolling Method
  True                      0.0797    0.1083    0.1320
  Estimated (No Auto)       0.0622    0.0837    0.1002
  Estimated (With Auto)     0.0563    0.0763    0.0916
Rolling IV method
  True                      0.1060    0.1455    0.1779
  Estimated (No Auto)       0.0851    0.1179    0.1461
  Estimated (With Auto)     0.0680    0.0944    0.1165
  Estimated (Theory 3.2)    0.1258    0.1735    0.2132

T=600, rolling window = 60
BJS Method
  True                      0.0206    0.0281    0.0337
  Estimated (No Auto)       0.0198    0.0269    0.0327
  Estimated (Shanken)       0.0203    0.0276    0.0335
Rolling Method
  True                      0.0220    0.0301    0.0360
  Estimated (No Auto)       0.0203    0.0275    0.0333
  Estimated (With Auto)     0.0198    0.0269    0.0325
Roll IV method
  True                      0.0237    0.0327    0.0392
  Estimated (No Auto)       0.0219    0.0300    0.0364
  Estimated (With Auto)     0.0212    0.0290    0.0353
  Estimated (Theory 3.2)    0.0239    0.0327    0.0398

Table 2.10 The standard error for Fama-French three-factor risk premiums (149 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data from 1964 to 2009.
The 149 portfolios are 100 size and book-to-market portfolios combined with 49 industry portfolios. The true standard error is calculated by taking the covariance of the estimated risk premiums across simulations. The estimated standard error is the average of the Fama-Macbeth standard errors across simulations. The Fama-Macbeth standard errors contain the autocovariance of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also compare the Fama-Macbeth standard errors without the autocovariance. For the BJS method, we also compare the Shanken adjustment in estimated standard errors. The three factors are ΔC: consumption growth, ΔCPI: change in inflation, ΔIP: change in industrial production.

T=60, rolling window = 15
Factor                      Market    Size      BM
BJS Method
  True                      0.0481    0.0688    0.0870
  Estimated (No Auto)       0.0398    0.0560    0.0689
  Estimated (Shanken)       0.0413    0.0581    0.0715
Rolling Method
  True                      0.0804    0.1223    0.1617
  Estimated (No Auto)       0.0461    0.0682    0.0859
  Estimated (With Auto)     0.0418    0.0582    0.0708
Roll IV method
  Estimated (No Auto)       0.0658    0.0963    0.1286
  Estimated (With Auto)     0.0527    0.0770    0.1026
  Estimated (Theory 3.2)    0.0913    0.1325    0.1699

T=600, rolling window = 60
BJS Method
  True                      0.0141    0.0201    0.0248
  Estimated (No Auto)       0.0137    0.0194    0.0242
  Estimated (Shanken)       0.0140    0.0199    0.0248
Rolling Method
  True                      0.0157    0.0226    0.0283
  Estimated (No Auto)       0.0137    0.0194    0.0238
  Estimated (With Auto)     0.0138    0.0197    0.0246
Roll IV method
  True                      0.0167    0.0240    0.0303
  Estimated (No Auto)       0.0156    0.0223    0.0282
  Estimated (With Auto)     0.0151    0.0216    0.0273
  Estimated (Theory 3.2)    0.0169    0.0243    0.0306

Table 2.11 The standard error for Fama-French three-factor risk premiums (4970 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The true standard error is calculated by taking the covariance of the estimated risk premiums across simulations. The estimated standard error is the average of the Fama-Macbeth standard errors across simulations. The Fama-Macbeth standard errors contain the autocovariance of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also compare the Fama-Macbeth standard errors without the autocovariance. For the BJS method, we also compare the Shanken adjustment in estimated standard errors.

T=60, rolling window = 15
Factor                      constant    Market    Size      BM
BJS Method
  True                      0.0570      0.0795    0.1176    0.1160
  Estimated (No Auto)       0.0173      0.0201    0.0195    0.0207
  Estimated (Shanken)       0.0176      0.0204    0.0199    0.0211
Rolling Method
  True                      0.1738      0.2392    0.2476    0.2578
  Estimated (No Auto)       0.0417      0.0566    0.0585    0.0617
  Estimated (With Auto)     0.0795      0.1112    0.1181    0.1234
Roll IV method
  True                      0.0356      0.0464    0.0688    0.0720
  Estimated (No Auto)       0.0294      0.0392    0.0585    0.0596
  Estimated (With Auto)     0.0236      0.0314    0.0458    0.0468
  Estimated (Theory 3.2)    0.0407      0.0503    0.0574    0.0601

T=600, rolling window = 60
BJS Method
  True                      0.0068      0.0082    0.0090    0.0098
  Estimated (No Auto)       0.0061      0.0072    0.0079    0.0081
  Estimated (Shanken)       0.0062      0.0073    0.0080    0.0083
Rolling Method
  True                      0.0200      0.0281    0.0379    0.0403
  Estimated (No Auto)       0.0065      0.0077    0.0083    0.0088
  Estimated (With Auto)     0.0128      0.0177    0.0253    0.0258
Roll IV method
  True                      0.0076      0.0094    0.0114    0.0112
  Estimated (No Auto)       0.0070      0.0086    0.0109    0.0108
  Estimated (With Auto)     0.0068      0.0083    0.0105    0.0106
  Estimated (Theory 3.2)    0.0076      0.0093    0.0115    0.0115

Table 2.12 The standard error for macro-factor risk premiums (25 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data.
The 25 portfolios are the 25 size and book-to-market portfolios for 1964 to 2009. The true standard error is calculated by taking the covariance of the estimated risk premiums across simulations. The estimated standard error is the average of the Fama-Macbeth standard errors across simulations. The Fama-Macbeth standard errors contain the autocovariance of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also compare the Fama-Macbeth standard errors without the autocovariance. For the BJS method, we also compare the Shanken adjustment in estimated standard errors.

T=60, rolling window = 15
Factor                      constant    ΔC         ΔCPI       ΔIP
BJS Method
  True                      0.1948      0.0571     0.0993     0.2113
  Estimated (No Auto)       0.1700      0.0431     0.0760     0.1554
  Estimated (Shanken)       0.1791      0.0471     0.0899     0.2107
Rolling Method
  True                      0.2021      0.0402     0.0701     0.1436
  Estimated (No Auto)       0.1681      0.0257     0.0444     0.0913
  Estimated (With Auto)     0.1476      0.0257     0.0438     0.0910
Roll IV method
  True                      162.4623    41.3216    62.5254    118.4471
  Estimated (No Auto)       162.4809    41.3156    62.5188    118.5278
  Estimated (With Auto)     107.0083    27.8322    45.3984    86.0411
  Estimated (Theory 3.2)    DNE [25]    DNE        1.5550     DNE

T=600, rolling window = 60
BJS Method
  True                      0.0984      0.0437     0.0904     0.1786
  Estimated (No Auto)       0.0707      0.0305     0.0606     0.1235
  Estimated (Shanken)       0.0988      0.0510     0.0824     0.2420
Rolling Method
  True                      0.0641      0.0186     0.0326     0.0675
  Estimated (No Auto)       0.0581      0.0148     0.0262     0.0534
  Estimated (With Auto)     0.0574      0.0156     0.0275     0.0570
Roll IV method
  True                      1168.5      444.6      490.4      1458.0
  Estimated (No Auto)       1168.5      444.6      490.4      1458.0
  Estimated (With Auto)     1119.5      425.2      469.7      1394.7
  Estimated (Theory 3.2)    DNE         DNE        DNE        DNE

[25] DNE means "does not exist": the covariance is negative.

Table 2.13 The standard error for macro-factor risk premiums (149 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data from 1964 to 2009. The 149 portfolios are 100 size and book-to-market portfolios combined with 49 industry portfolios. The true standard error is calculated by taking the covariance of the estimated risk premiums across simulations. The estimated standard error is the average of the Fama-Macbeth standard errors across simulations. The Fama-Macbeth standard errors contain the autocovariance of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also compare the Fama-Macbeth standard errors without the autocovariance. For the BJS method, we also compare the Shanken adjustment in estimated standard errors.
T=60, rolling window = 15
Factor                      constant    ΔC         ΔCPI       ΔIP
BJS Method
  True                      0.0894      0.0301     0.0515     0.1276
  Estimated (No Auto)       0.0666      0.0142     0.0249     0.0529
  Estimated (Shanken)       0.0751      0.0153     0.0287     0.0645
Rolling Method
  True                      0.0941      0.0304     0.0503     0.1097
  Estimated (No Auto)       0.0685      0.0108     0.0181     0.0391
  Estimated (With Auto)     0.0634      0.0155     0.0254     0.0570
Roll IV method
  True                      186.9138    63.8955    74.4945    189.5792
  Estimated (No Auto)       187.0165    63.8286    74.9956    188.8114
  Estimated (With Auto)     150.9858    50.7170    55.5052    145.5986
  Estimated (Theory 3.2)    6.4380      2.0918     2.3569     7.5204

T=600, rolling window = 60
BJS Method
  True                      0.0412      0.0132     0.0257     0.0632
  Estimated (No Auto)       0.0252      0.0080     0.0151     0.0350
  Estimated (Shanken)       0.0426      0.0135     0.0254     0.0620
Rolling Method
  True                      0.0294      0.0097     0.0165     0.0410
  Estimated (No Auto)       0.0230      0.0050     0.0087     0.0188
  Estimated (With Auto)     0.0244      0.0070     0.0120     0.0292
Roll IV method
  True                      36.0784     13.1653    49.0655    222.0078
  Estimated (No Auto)       36.0820     13.1583    49.0376    222.0221
  Estimated (With Auto)     33.6768     12.4871    45.7321    206.9324
  Estimated (Theory 3.2)    DNE         DNE        DNE        DNE

Table 2.14 The standard error for macro-factor risk premiums (4970 portfolios Monthly Data)

This Table presents the standard error of the estimated risk premiums of 4970 individual stocks for the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The true standard error is calculated by taking the covariance of the estimated risk premiums across simulations. The estimated standard error is the average of the Fama-Macbeth standard errors across simulations. The Fama-Macbeth standard errors contain the autocovariance of estimated risk premiums in each period up to the length of the rolling window. The length of the rolling window is 15 for total period T=60 and 60 for T=600. We also compare the Fama-Macbeth standard errors without the autocovariance. For the BJS method, we also compare the Shanken adjustment in estimated standard errors.
T=60, rolling window = 15
Factor                      constant    ΔC        ΔCPI      ΔIP
BJS Method
  True                      0.0298      0.0216    0.0430    0.1202
  Estimated (No Auto)       0.0136      0.0021    0.0036    0.0076
  Estimated (Shanken)       0.0190      0.0029    0.0051    0.0106
Rolling Method
  True                      0.0318      0.0273    0.0445    0.1113
  Estimated (No Auto)       0.0157      0.0061    0.0105    0.0239
  Estimated (With Auto)     0.0179      0.0124    0.0216    0.0483
Roll IV method
  True                      0.0670      0.0370    0.1131    0.2447
  Estimated (No Auto)       0.0354      0.0251    0.0862    0.1907
  Estimated (With Auto)     0.0589      0.0402    0.1235    0.1984
  Estimated (Theory 3.2)    0.0272      0.0136    0.0350    0.0849

T=600, rolling window = 60
BJS Method
  True                      0.0087      0.0031    0.0067    0.0193
  Estimated (No Auto)       0.0047      0.0010    0.0019    0.0040
  Estimated (Shanken)       0.0095      0.0021    0.0038    0.0081
Rolling Method
  True                      0.0095      0.0074    0.0138    0.0389
  Estimated (No Auto)       0.0049      0.0012    0.0021    0.0056
  Estimated (With Auto)     0.0085      0.0067    0.0122    0.0356
Roll IV method
  True                      0.0114      0.0038    0.0078    0.0163
  Estimated (No Auto)       0.0055      0.0020    0.0039    0.0084
  Estimated (With Auto)     0.0101      0.0033    0.0066    0.0140
  Estimated (Theory 3.2)    0.0122      0.0037    0.0069    0.0148

Table 2.15 The rejection ratio of t-statistics with null hypothesis α₀ (4970 portfolios Monthly Data)

This Table presents the t-ratio for α₀ of 4970 individual stocks with the BJS estimation without rolling beta (BJS), the Fama-Macbeth rolling beta method (Rolling), and the lagged beta IV method (Roll IV). The estimation is based on monthly data from 1964 to 2009. For each simulation, we calculate the t-ratios for the different methods and compare them with the 95% critical value. We then count the simulations in which the absolute value of the t-ratio is greater than the critical value and divide this count by the number of simulations to get the probability of rejecting the true null hypothesis (the rejection ratio). Different methods have different standard errors and t-ratios. For the BJS method, we use the Fama-Macbeth standard error without autocovariance and Shanken’s standard error.
For the Fama-Macbeth rolling-beta method, we use the Fama-Macbeth standard error (with and without autocovariance). For the beta IV method, we use the Fama-Macbeth standard error and the standard error derived from Theorem 3.2. There are two cases: the Fama-French three-factor model and the macro-factor model, both with N = 4970 and T = 600, since these are the cases with the smallest bias for the BJS and lagged beta IV methods. FF3 represents the Fama-French three-factor model and MF represents the macro-factor model.

T=600, rolling window = 60
Std method                  Model    Reject Ratio    Model    Reject Ratio
BJS Method
  Estimated (No Auto)       FF3      0.0750          MF       0.2970
  Estimated (Shanken)       FF3      0.0700          MF       0.0330
Rolling Method
  Estimated (No Auto)       FF3      0.5360          MF       0.3540
  Estimated (With Auto)     FF3      0.1640          MF       0.1610
Roll IV method
  Estimated (Theory 3.2)    FF3      0.0470          MF       0.0490
  Estimated (No Auto)       FF3      0.1740          MF       0.3240
  Estimated (With Auto)     FF3      0.0720          MF       0.1620

Table 2.16 Estimated risk premiums for three macro factors (OLS)

This table presents the estimated risk premiums and T-ratios for the three macro factor model using individual stock returns with the BJS estimation (BJS), the Fama-Macbeth rolling beta method (Rolling), the lagged beta IV method (Roll IV), Theil's adjustment and the 3-group method. We use the standard errors from Fama-Macbeth (FM), Theorem 3.1 and Theorem 3.2 to calculate T-ratios. The length of the rolling window is 60 for total period T=552.
T=552, rolling window = 60
Factor                      constant    ΔC        ΔCPI      ΔIP
BJS Method
  Risk Premium              1.086       0.006     0.000     -0.006
  T-ratio (FM)              5.772       0.535     0.107     -0.544
Rolling Method
  Risk Premium              0.815       -0.001    -0.001    -0.064
  T-ratio (FM)              4.952       -0.024    -0.036    -1.738
Roll IV method
  Risk Premium              1.063       0.097     -0.038    -0.132
  T-ratio (FM)              11.72       4.405     -1.865    -2.524
  T-ratio (Theorem 3.2)     4.774       2.777     -1.424    -2.091
Theil's adjustment
  Risk Premium              0.787       0.122     -0.217    0.009
  T-ratio (FM)              3.283       0.529     -0.573    0.601
3-group method
  Risk Premium              0.314       0.028     -0.001    -0.036
  T-ratio (FM)              4.459       1.993     -0.251    -2.591
  T-ratio (Theorem 3.1)     4.348       1.698     -0.075    -1.292

Table 2.17 Estimated risk premiums for three macro factors (GLS)

This table presents the estimated risk premiums and T-ratios for the three macro factor model using individual stock returns with the BJS estimation (BJS), the Fama-Macbeth rolling beta method (Rolling), the lagged beta IV method (Roll IV), Theil's adjustment and the 3-group method. We use the standard errors from Fama-Macbeth (FM), Theorem 3.1 and Theorem 3.2 to calculate T-ratios. The length of the rolling window is 60 for total period T=552.

T=552, rolling window = 60
Factor                      constant    ΔC        ΔCPI      ΔIP
BJS Method
  Risk Premium              0.264       0.015     -0.001    -0.029
  T-ratio (FM)              7.536       1.419     -0.383    -2.348
Rolling Method
  Risk Premium              0.291       0.039     -0.015    -0.074
  T-ratio (FM)              5.865       1.129     -0.871    -1.780
Roll IV method
  Risk Premium              0.909       0.072     -0.044    -0.188
  T-ratio (FM)              11.34       2.601     -1.634    -2.665
  T-ratio (Theorem 3.2)     6.185       2.125     -1.669    -3.263
Theil's adjustment
  Risk Premium              0.384       -0.002    -0.000    0.002
  T-ratio (FM)              7.066       -1.620    -0.159    2.386
3-group method
  Risk Premium              0.182       0.043     -0.006    -0.089
  T-ratio (FM)              5.916       2.759     -1.586    -4.687
  T-ratio (Theorem 3.1)     4.475       2.595     -0.574    -3.667
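
The summary statistics that the table captions describe (T-Diff, the Fama-Macbeth standard error with and without autocovariance, and the rejection ratio) can be sketched in Python as follows. This is a minimal illustration and not the code behind the tables. The function names are hypothetical. Reading T-Diff as (mean estimate minus true value) divided by the standard deviation across trials is an inference from the reported numbers (for example, -0.0060 / 0.0087 is about -0.69 for the BJS constant at T=600 in Tables 2.6 and 2.14). The unweighted sum of autocovariances, the 1/T normalization, and the 1.96 critical value are assumptions.

```python
import numpy as np


def t_diff(estimates, true_values):
    """T-Diff: (mean estimate - true value) / std. dev. across trials.

    estimates has shape (n_trials, n_params). Treating the cross-trial
    standard deviation as the standard error is an assumed reading.
    """
    est = np.asarray(estimates, dtype=float)
    return (est.mean(axis=0) - np.asarray(true_values)) / est.std(axis=0, ddof=1)


def fm_standard_error(gammas, max_lag=0):
    """Fama-Macbeth standard error of the mean of per-period premiums.

    max_lag=0 ignores autocovariance ("No Auto"); max_lag>0 adds
    unweighted autocovariances up to that lag ("With Auto", assumed
    form). Returns NaN when the implied variance is negative.
    """
    g = np.asarray(gammas, dtype=float)
    t = g.shape[0]
    d = g - g.mean()
    long_run_var = d @ d / t
    for lag in range(1, max_lag + 1):
        long_run_var += 2.0 * (d[lag:] @ d[:-lag]) / t
    if long_run_var < 0:
        return float("nan")
    return float(np.sqrt(long_run_var / t))


def rejection_ratio(estimates, std_errors, null_value=0.0, critical=1.96):
    """Share of trials whose |t-ratio| exceeds the critical value."""
    t = (np.asarray(estimates, dtype=float) - null_value) / np.asarray(std_errors, dtype=float)
    return float(np.mean(np.abs(t) > critical))
```

With a correctly sized standard error, `rejection_ratio` applied to estimates generated under the null should come out close to the nominal 5% level, which is the comparison Table 2.15 makes across standard-error methods.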
Abstract
My dissertation contains two chapters. The first chapter is "Can weight-based measures distinguish between informed and uninformed fund managers?" It is based on my job market paper. The paper studies weight-based mutual fund performance measures in a panel predictive regression framework, where future stock returns are regressed on a fund’s portfolio weights. Existing performance measures suffer from biases related to benchmark misspecification and are statistically inefficient. We introduce bias-adjusted and weighted least squares (WLS) measures. Simulations show that the new methods can effectively control bias and improve power compared with existing measures. We also apply the existing and newly introduced measures in empirical examples. Using bias-adjusted and efficient measures can lead to different conclusions about managers’ abilities.

Disclaimer: The power results depend on the way in which I calibrated an informed manager, and if this is a poor representation of reality then the power results may not be reliable.

The second chapter is "Resolving the Errors-in-Variables Bias in Risk Premium Estimation", which is co-authored with Kuntara Pukthuanthong and Richard Roll. The Fama-Macbeth (1973) rolling method is widely used for estimating risk premiums, but its inherent errors-in-variables bias remains an unresolved problem, particularly when using individual assets or macroeconomic factors. We propose a solution with a particular instrumental variable, calculated from alternate observations. The resulting estimators are unbiased. In simulations, we compare this new approach with several existing methods. The new approach corrects the bias even when the sample period is limited. Moreover, our proposed standard errors are unbiased and lead to correct rejection size in finite samples. With this approach, we find that macro factors, such as consumption growth, can significantly affect stock returns.
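
As a rough illustration of the instrumental-variable idea in the second chapter, the Python sketch below simulates a one-factor market, estimates each stock's beta separately from odd and even months, and uses one set of betas as the instrument for the other in the cross-sectional regressions. It is not the estimator from the dissertation. The single-factor data-generating process, all parameter values, the split into an estimation window and a hold-out window, and the names `ts_beta` and `simulate_and_estimate` are assumptions made for illustration.

```python
import numpy as np


def ts_beta(factor, returns):
    """Time-series OLS slope of each return column on the factor."""
    fc = factor - factor.mean()
    return fc @ returns / (fc @ fc)


def simulate_and_estimate(seed=0, n_stocks=2000, t_est=60, t_hold=120):
    """Compare OLS and IV cross-sectional slopes on simulated data."""
    rng = np.random.default_rng(seed)
    t_all = t_est + t_hold
    f = rng.normal(1.0, 2.0, size=t_all)            # factor realizations
    beta = rng.normal(1.0, 0.5, size=n_stocks)      # true loadings
    eps = rng.normal(0.0, 8.0, size=(t_all, n_stocks))
    r = np.outer(f, beta) + eps                     # returns, shape (T, N)

    est = np.arange(t_est)
    b_full = ts_beta(f[est], r[est])                # all estimation months
    b_even = ts_beta(f[est[0::2]], r[est[0::2]])    # alternate observations
    b_odd = ts_beta(f[est[1::2]], r[est[1::2]])

    r_hold = r[t_est:].T                            # shape (N, t_hold)
    ones = np.ones(n_stocks)
    x_ols = np.column_stack([ones, b_full])
    x_iv = np.column_stack([ones, b_even])          # noisy regressor
    z_iv = np.column_stack([ones, b_odd])           # instrument

    # One cross-sectional regression per hold-out month, solved jointly.
    g_ols = np.linalg.solve(x_ols.T @ x_ols, x_ols.T @ r_hold)
    g_iv = np.linalg.solve(z_iv.T @ x_iv, z_iv.T @ r_hold)

    return {
        "target": float(f[t_est:].mean()),  # ex post premium in hold-out
        "ols": float(g_ols[1].mean()),
        "iv": float(g_iv[1].mean()),
    }
```

Calling `simulate_and_estimate()` returns the hold-out mean of the factor together with the averaged OLS and IV slopes. Under these assumptions the OLS slope should be attenuated toward zero because the estimated betas are noisy, while the IV slope should sit closer to the target because the estimation errors in the odd-month and even-month betas are independent of each other.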
Asset Metadata
Creator
Wang, Junbo
(author)
Core Title
Two essays on financial econometrics
School
Marshall School of Business
Degree
Doctor of Philosophy
Degree Program
Business Administration
Publication Date
07/11/2014
Defense Date
06/09/2014
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
errors-in-variables bias,Fama-Macbeth regression,financial econometrics,mutual fund,OAI-PMH Harvest,weight-based measures
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Ferson, Wayne (committee chair), Jones, Christopher S. (committee member), Joslin, Scott (committee member), Lv, Jinchi (committee member), Solomon, David (committee member)
Creator Email
junbo@usc.edu,wangjunbo2007@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-437210
Unique identifier
UC11287690
Identifier
etd-WangJunbo-2659.pdf (filename),usctheses-c3-437210 (legacy record id)
Legacy Identifier
etd-WangJunbo-2659.pdf
Dmrecord
437210
Document Type
Dissertation
Rights
Wang, Junbo
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA