EMPIRICAL STUDIES OF MONETARY ECONOMICS

by

Eleonora Granziera

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ECONOMICS)

August 2010

Copyright 2010 Eleonora Granziera

Acknowledgements

My deepest gratitude goes to my advisor, Professor Hyungsik Roger Moon, for his guidance and support. Without him I would not have achieved my goals for this dissertation.

I am indebted to Frank Schorfheide for his suggestions that have helped tremendously in shaping the second chapter of my dissertation.

I would like to acknowledge James Bullard, Robert Dekle, Graham Elliot, Kirstin Hubrich, Christopher Jones, Micheal Owyang, Michael MacCracken, Salvatore Miglietta and Martin Weidner for the many valuable discussions that helped me progress with my dissertation.

I am grateful to Morgan Ponder and Young Miller for coordinating and overseeing the administrative concerns that made it possible for me to complete my degree.

Finally, I would like to thank my family members, especially my parents Giuliana and Angelo and my grandmother Caterina. They have been there for me every step of the way, have always loved me unconditionally, and have aided me through all of my tough decisions.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
Chapter 2: Assessing the Importance of Learning
  2.1. Introduction
  2.2. Motivating Example: a Small Model with Learning
  2.3. Econometric Model
  2.4. Obtaining a History of States
  2.5. Obtaining the Hyperparameters
  2.6. A Small Monetary Model for the US Economy
  2.7. Importance of Learning
    2.7.1. Impulse Responses
    2.7.2. Marginal Effect
    2.7.3. Counterfactual Experiment
  2.8. Conclusions
Chapter 3: A Predictability Test for a Small Number of Nested Models
  3.1. Introduction
  3.2. Basic Mathematical Framework
    3.2.1. Special Case 1: Alternative Models Nested Within Each Other
    3.2.2. Special Case 2: Non-Nested Alternative Models
    3.2.3. General Case: Alternative Models Nested Within Groups
    3.2.4. Alternative Tests
  3.3. Computation of Critical Values
  3.4. Monte Carlo Simulation
  3.5. Forecasting US Inflation
  3.6. Conclusions
References

List of Tables

Table 1: Marginal Effect of Learning
Table 2: Descriptive Statistics for Actual and Simulated Data
Table 3: Exact Power, Experiment One
Table 4: Exact Power, Experiment Two
Table 5: Size and Unadjusted Power, Experiment Two: Normality
Table 6: Size and Unadjusted Power, Experiment Two: Subsampling
Table 7: Test of Equal Forecast Accuracy for US Inflation: Normality
Table 8: Test of Equal Forecast Accuracy for US Inflation: Subsampling

List of Figures

Figure 1: IRF of Inflation to Monetary Shock
Figure 2: IRF of Output Growth to Monetary Shock
Figure 3: IRF of FFR to Monetary Shock
Figure 4: IRF of Inflation to Monetary Shock, Selected Horizons
Figure 5: IRF of GDP Growth to Monetary Shock, Selected Horizons
Figure 6: IRF of FFR to Monetary Shock, Selected Horizons
Figure 7: IRF of Inflation to Monetary Shock, 81:Q3 and 07:Q3
Figure 8: IRF of GDP Growth to Monetary Shock, 81:Q3 and 07:Q3
Figure 9: IRF of FFR to Monetary Shock, 81:Q3 and 07:Q3
Figure 10: Actual and Simulated Inflation Series
Figure 11: Actual and Simulated GDP Growth Series
Figure 12: Actual and Simulated FFR Series
Figure 13: Actual and Forecast for US Inflation, 1979:01-1983:12
Figure 14: Actual and Forecast for US Inflation, 1984:01-2007:12

Abstract

This dissertation collects three essays on empirical monetary economics.

The first chapter outlines the major issues and illustrates some of the challenges in conducting monetary policy.

The second uses a time-varying coefficient vector autoregression to assess the importance of the learning component in the US postwar economy. The random coefficients are assumed to follow a mean-reverting process around an unconditional mean that can be interpreted as the estimates of the coefficients from the reduced form of a rational expectation equilibrium model. The deviations from the unconditional mean are attributed to the learning behaviour of the agents about the value of the coefficients which regulate the economy. I estimate a monetary model for the
post-WWII U.S. economy, including inflation, output growth and the federal funds rate. I document the presence of learning dynamics and find that the importance of the learning mechanism is somewhat limited for inflation and output growth, but substantial in explaining the dynamics of the federal funds rate.

The third chapter deals with forecast evaluation. In particular, it introduces likelihood-ratio-type tests for one-sided multivariate hypotheses to evaluate the null that a parsimonious model performs as well as a small number of models which nest the benchmark. The size and power performance of the tests are compared with those of two existing tests: a chi-squared test, as described in Clark and West (2007) and applied to multivariate comparison in Hubrich and West (2008), and the maximum of correlated normals test outlined in Hubrich and West (2008). The test statistics do not have a standard limiting distribution. Critical values are therefore obtained through simulations, either adopting the assumption of normality or through subsampling. Under the normality assumption the LRT and correlated normal test are undersized, but the size distortion decreases with the in-sample/out-of-sample ratio. The Monte Carlo experiments reveal that the chi-squared test performs poorly in terms of power, as it disregards the one-sided nature of the test, while the ranking between the likelihood-ratio-type test and the correlated normal test depends on the simulation settings. The results from the subsampling are sensitive to the choice of the block size. The tests are applied to evaluate the forecast accuracy of competing models for forecasting the yearly US aggregate inflation rate.

Chapter 1

Introduction

The behavior of key US macroeconomic series such as interest rates, output growth and inflation poses challenges for the conduct of macroeconomic policy, both for modelling and forecasting.
The aforementioned series exhibit high mean and volatility in the '60s and '70s; from the mid-80s onwards these statistics decrease. Previous studies argue that the heterogeneous behavior of the series over the two samples is due to breaks in the coefficients that regulate the law of motion of the variables. The second chapter of this dissertation therefore deals with how to model the time-varying relationship between variables in an econometric model that makes it possible to draw policy conclusions and to justify the empirical model as the counterpart of the reduced form of a theoretical model. As far as forecasting is concerned, the change in the mean, persistence and volatility of the series has implications for their predictability. In particular, in periods in which the variables are less persistent and volatile, a parsimonious forecasting model might be more effective in prediction, while in periods of higher volatility richer models might outperform a parsimonious model. The last chapter therefore introduces a new predictability test for nested models, suggesting how to pick the model that best predicts a variable of interest.

In the second chapter of my dissertation I evaluate to what extent learning by the private agents and by the policymaker about the structure of the economy can explain the evolution of US postwar inflation, GDP growth and the short-term interest rate. In fact, if learning accounts for a large part of the volatility of the series, then this volatility can be significantly decreased by facilitating the learning process, for example through better communication of monetary policy. Hence in this essay I show how to represent learning in an empirical model and I fit this small monetary model to the US postwar economy. I document the presence of learning dynamics and I find that the importance of the learning mechanism is somewhat limited for real activity but substantial in explaining the dynamics of inflation and the interest rate.
Recently, learning has caught the attention of both the theoretical and the empirical literature. At the theoretical level, learning matters when the rational expectation hypothesis is replaced by the assumption of bounded rationality, i.e. agents lack complete knowledge of the structure of the economy, specifically of the so-called structural parameters. For example, consider the Taylor rule: under bounded rationality agents know that the interest rate is responsive to the output and inflation gaps, but they do not know how sensitive the interest rate is to these gaps. The parameters that represent this sensitivity are an example of structural parameters. At the empirical level, learning serves as a rationale for the growing use in estimation of vector autoregressions with time-varying coefficients. These models are particularly suited to capture the time-varying nature of the business cycle. To date a clear link between the two literatures is missing. The contribution of this essay is then twofold: first, it illustrates how the assumption of bounded rationality in a theoretical model translates empirically into a vector autoregression that features time-varying parameters. Second, this chapter applies the empirical model to US postwar data to assess the importance of learning for inflation, output growth and the interest rate. When estimating the model it is necessary to assume a law of motion for the time-varying coefficients. Previous empirical studies assume that the coefficients evolve as random walks. This specification does not help in relating the empirical to the theoretical models of learning. Moreover, it does not allow distinguishing between two possible causes of the changes in the estimated coefficients: breaks in the structural parameters of the underlying theoretical model versus variations in the beliefs of the agents about the structural parameters.
Because the objective of the essay is to uncover the importance of learning, the objects of interest are changes in the estimated coefficients that are due exclusively to variations in the beliefs of the agents about the structural parameters. The random walk specification thus does not prove adequate for this purpose. Hence, the coefficients are modelled as a stationary process that fluctuates around an unconditional mean. The unconditional mean can then be interpreted as the value that the coefficients would take in a model solved under rational expectations, while deviations of the coefficients from the mean are interpreted as evidence of learning. Another advantage of the stationary representation for the coefficients is that it allows the vector autoregression to be rewritten isolating the learning component. I assess the importance of learning in two ways: first, I compute the contribution of the learning component to the variance of the series, and I find this contribution to be substantial for the interest rate, while it is lower for inflation and output growth. Next, I run a counterfactual experiment in which data are generated in the absence of learning: agents are endowed with a prior belief on the value of the coefficients, obtained on pre-sample data, but they do not update these beliefs as new data become available, i.e. they do not engage in learning. This exercise shows that if learning did not take place, inflation would have been consistently higher than the actual series and the interest rate would have been consistently lower than the actual series. This suggests that in the absence of learning monetary policy would have been looser and inflation would have been higher, by up to 2.5 percentage points, than the realized ones. However, real activity is not affected much by the learning dynamics, as the simulated and actual series for output growth are about the same. This leads to the conclusion that learning is substantial in explaining the behavior of nominal variables, but it plays a limited role in the behavior of real activity.
In the third chapter of my dissertation I introduce a new test to evaluate the performance of different econometric models in terms of their ability to predict a variable of interest. In fact, economic theory provides many plausible specifications to model and forecast macroeconomic variables; it is then an empirical issue to verify which model proves more useful for prediction given the data at hand. In order to do so, we develop likelihood-ratio-type tests for one-sided multivariate hypotheses to evaluate the null that a parsimonious model performs as well as a small number of models which nest the benchmark. Suppose that the policymaker is interested in forecasting the US inflation rate and she is using a model that includes only an autoregressive term of order one, a second model that uses, in addition to the autoregressive term, inflation for the food component, and a third model that also adds the energy component. The benchmark model is the most parsimonious because it includes the smallest set of regressors. The benchmark model is then nested in the alternative models, as the benchmark can be obtained from the alternative models by restricting the parameters in the alternative models. To conduct a forecasting exercise it is necessary to select a loss function that penalizes the distance of the forecasts from the realized data. Forecast evaluation then asks whether the differentials in the loss function across models are significantly different from zero. The test suggested in this essay is one-sided because under the alternative the differential in the loss function between the models is positive; it is multivariate because it tests jointly that the loss differential between the benchmark and each of the alternative models is not significantly different from zero. The size and power performance of the tests are compared through Monte Carlo experiments with those of other existing tests.
Since the test statistics do not have a standard limiting distribution, critical values are obtained through simulations, either adopting the assumption of normality or through subsampling. The Monte Carlo investigation reveals that, under the assumption of normality of the difference in MSPE, the chi-squared test is slightly oversized, while the other tests are undersized, but the size distortion decreases with the in-sample to out-of-sample ratio, i.e. when the number of observations in the estimation sample grows relative to the observations in the forecasting sample. As far as the power of the test is concerned, regardless of the assumption under which the critical values are derived, the chi-squared test performs poorly as it disregards the one-sided nature of the test, while the ranking between the likelihood-ratio-type test and the correlated normal test depends on the simulation settings. Results when the critical values are obtained through subsampling depend on the block size. The test is then applied to evaluate the forecast accuracy of competing models for forecasting US aggregate inflation. An autoregressive model of order one is used as benchmark, while the two competing models add the food component (for the first alternative) and energy (for the second alternative); the test fails to reject the null of equal predictive ability, suggesting that a richer model cannot improve over a parsimonious model for forecasting inflation.

Chapter 2

Assessing the Importance of Learning

2.1 Introduction

The time-varying nature of the business cycle and the change in dynamics of key macro variables for the U.S. over the last sixty years have been extensively documented. In particular, many studies report a decrease in the mean and variance of inflation (e.g. Stock and Watson, 2007) and output growth (e.g. Blanchard and Simon 2000, Stock and Watson 2003) starting from the early 1980s. A growing literature adopts time-varying parameter vector autoregressions as the empirical specification to identify the determinants of these changes.
Starting with Cogley and Sargent (2001), who suggested it as an estimation methodology, vector autoregressions with time-varying coefficients have been used in studying the change in the persistence of inflation and unemployment (Cogley and Sargent, 2005) and of the inflation gap (Cogley, Primiceri and Sargent, 2008), in shedding light on the bad luck versus bad policy debate (Primiceri, 2005, Canova and Gambetti, 2004) and in analyzing the causes of persistence of inflation (Cogley and Sbordone, 2008). The rationale underlying their specification is provided by learning of the policymaker and private agents about the economy. For example, the central bank might adjust its target inflation rate in view of changes in beliefs about the effectiveness of monetary policy, and the agents might slowly learn about the policy change. Learning mechanisms have also been introduced in dynamic stochastic general equilibrium (DSGE) models to generate endogenous persistence in macro series (Milani 2008). To account for learning in these models, the assumption of rational expectations is replaced by adaptive expectations, that is, agents form their expectations on future values of the variables based on past data. To date a clear link between the empirical and the theoretical literature on learning is missing.

The aim of this paper is twofold: first, to illustrate the link between DSGE models with learning and time-varying parameter VARs; second, to assess the importance of the learning mechanism in explaining the dynamics of U.S. postwar data.

The existing literature on time-varying parameter VARs characterizes the evolution of the time-varying coefficients as a driftless random walk. This parsimonious specification is suited to capture sharp changes in the coefficients. However, despite the emphasis on the learning mechanism as the engine for time variation in the coefficients, the random walk specification does not allow one to distinguish variations due to structural changes in the underlying economy from variations due to learning dynamics.
Being unable to disentangle these two sources of variation in the coefficients, this specification does not provide a measure of the impact of the learning component on the dynamics of the macro variables under analysis.

We suggest a way to assess the importance of the learning mechanism by assuming that parameters change over time in a continuous fashion, but differently from previous studies we assume that the coefficients evolve according to a stationary process rather than as random walks. We interpret the unconditional mean as the value of the coefficients that would result from the solution of a rational expectation model, and the deviation of the coefficients from the mean as a consequence of learning dynamics.

The proposed specification allows us to decompose our vector autoregression model, separating the learning component (the deviation of the coefficients from their unconditional mean) from the rational expectation component, i.e. the term related to the unconditional mean of the coefficients. The empirical model is then consistent with the reduced form of a DSGE model with adaptive expectations in which agents know the correct specification but not the `true' value of the parameters. Agents recognize that the actual law of motion of the model is characterized by time-varying parameters, and they obtain estimates of these time-varying parameters through Kalman filtering of the data. Instead, in a model solved under the assumption of rational expectations the agents are endowed with knowledge about the law of motion of the economy, which would be characterized by constant coefficients. Because we do not map the parameters from the empirical model into structural parameters from a theoretical macro model of learning, we cannot impose restrictions on the values of the parameters in the empirical model in order to guarantee determinacy and learnability. However, by modeling the law of motion of the parameters as a mean-reverting process we guarantee that the coefficients do not depart for too long from their unconditional mean, i.e.
from their rational expectation equilibrium value. At the same time, the drifting coefficients will not converge to the unconditional mean, so agents are engaged in perpetual learning. In our model we neglect the possibility of structural changes in the parameters and we focus our attention exclusively on learning dynamics as a possible explanation of the changes in dynamics of the macro series.

A primary objective of the paper is then to quantify the learning component in the US postwar data and the extent to which learning might explain the changes in dynamics of inflation, output growth and the interest rate. In order to do so, we proceed in two ways: first, we suggest measuring the proportion of the variance of each equation that is accounted for by the learning component. Second, we run a counterfactual example in which we simulate the data in the absence of learning. The empirical analysis delivers the following results: through the first exercise, we find that the importance of the learning mechanism is somewhat limited for inflation and output growth but substantial in explaining the dynamics of the federal funds rate, as the learning component accounts for about two thirds of the variance of the interest rate. From the counterfactual example it emerges that in the absence of learning, the inflation series would have been consistently higher than the actual one, while the interest rate series would have lain below the actual series. The magnitude of the discrepancies between actual and simulated series varies over time, with peaks occurring during the Volcker chairmanships: the value of the simulated interest rate for 1982:Q2 is 5.5%, about 3.5 percentage points lower than the actual value, and for inflation the simulated series is 2 percentage points higher than the actual series from 1977:Q3 to 1980:Q1. For output growth, the chosen measure of real activity, the actual and simulated series overlap for virtually every date.
The results from this empirical analysis suggest that the learning behaviour of the monetary authority might explain the change in dynamics of the nominal variables, while the learning mechanism is unable to characterize the changes in real activity.

The outline of the chapter is as follows: section 2 presents a New-Keynesian model with learning that serves as an example to clarify the interpretation of learning provided in this paper; in section 3 the econometric methodology is illustrated. Sections 3-5 describe our small empirical monetary model. Sections 6-7 apply the methodology discussed in section 3 to assess the importance of the learning mechanism in the US post-WWII economy and discuss the results; section 8 concludes.

2.2 Motivating Example: a Small Model with Learning

Although this paper does not map the parameters from the empirical model into the structural parameters derived from a theoretical macro model, we present in this section a simple New-Keynesian model to illustrate our definition of learning and to facilitate the interpretation of the learning dynamics in the empirical model.

Consider a New-Keynesian model summarized by the following log-linearized equations:

π_t = κ x_t + β Ẽ_t π_{t+1} + u_t   (1)

x_t = Ẽ_t x_{t+1} − σ(i_t − Ẽ_t π_{t+1}) + g_t   (2)

i_t = ρ i_{t−1} + (1 − ρ)[π_t + ψ_π(Ẽ_t π_{t+1} − π_t) + ψ_x Ẽ_t x_{t+1}] + ε_t   (3)

where π_t is inflation, x_t is the output gap and i_t is the nominal interest rate, and u_t, g_t, ε_t are exogenous processes. Ẽ_t denotes subjective expectations. Equations (1) through (3) represent a forward-looking Phillips curve, a log-linearized Euler equation and a Taylor rule for the monetary authority, respectively. Variations of this small monetary model are used, for example, in Milani (2006, 2008) and Del Negro and Schorfheide (2004). The solution of the model and the notion of equilibrium depend on the formation of expectations: a rational expectation equilibrium is achieved when the agents take expectations over the distribution of the actual stochastic processes that generate the data.
Agents are then endowed with knowledge about the correct structure of the model, including the values of the structural parameters. A growing literature, surveyed by Evans and Honkapohja (2008), criticizes the assumption of rational expectations as too restrictive and suggests replacing it with bounded rationality. This implies that agents know the structure of the rational expectation equilibrium (that is, the solution under bounded rationality includes the same variables as the Minimum State Variable solution under rational expectations) but they lack knowledge of the value of the parameters that govern the economy, and, like econometricians, they learn about these parameters by forming estimates based on past data. The implication of this assumption for the New-Keynesian model above is that agents form their expectations using the following 'Perceived Law of Motion':

Z_t = Φ_{0,t} + Φ_{1,t} Z_{t−1} + ν_t   (4)

where Z_t ≡ {π_t, x_t, i_t} collects the variables included in the Minimum State Variable solution, ν_t ≡ {u_t, g_t, ε_t} contains the exogenous processes and the matrices Φ_{0,t} and Φ_{1,t} are time-varying coefficients, whose estimates, Φ̂_{0,t} and Φ̂_{1,t}, are updated every period. After having estimated the parameters, the agents use (4) to form their expectations:

Ẽ_t Z_{t+1} = Φ̂_{0,t} + Φ̂_{1,t} Z_t.

Substituting the agents' expectations back into the log-linearized model described by equations (1) to (3) yields the Actual Law of Motion of the model, which is therefore characterized by time-varying parameters. This result motivates our choice of a time-varying coefficient vector autoregression as the empirical model. Many papers discuss restrictions on the structural parameters under which the equilibrium under learning converges to the rational expectation equilibrium.
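The period-by-period updating of the Perceived Law of Motion can be sketched numerically. The snippet below is a minimal illustration, not the chapter's estimator: agents re-estimate a scalar AR(1) perceived law of motion by constant-gain recursive least squares, a common device in the learning literature. The gain value, dimensions and simulated data are illustrative assumptions.

```python
import numpy as np

def constant_gain_rls(z, gain=0.02):
    """Recursively update estimates of z_t = phi0 + phi1 z_{t-1} + noise."""
    phi = np.zeros(2)                # current estimates [phi0, phi1]
    R = np.eye(2)                    # running moment matrix of the regressors
    path = []
    for t in range(1, len(z)):
        x = np.array([1.0, z[t - 1]])            # regressors
        R = R + gain * (np.outer(x, x) - R)      # update second moments
        err = z[t] - x @ phi                     # one-step forecast error
        phi = phi + gain * np.linalg.solve(R, x) * err
        path.append(phi.copy())
    return np.array(path)

rng = np.random.default_rng(0)
z = np.empty(500)
z[0] = 0.0
for t in range(1, 500):                          # true AR(1) with phi1 = 0.8
    z[t] = 0.8 * z[t - 1] + rng.normal()
est = constant_gain_rls(z)
print(est[-1])   # estimates hover near [0, 0.8] but keep fluctuating
```

With a constant gain the estimates never settle down, which mirrors the perpetual-learning feature emphasized later in the chapter.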
In this paper we do not map our reduced form parameters into the parameters from a theoretical model; therefore, instead of imposing the learnability or determinacy conditions discussed in Bray and Savin (1986), Evans and Honkapohja (2001) or Bullard and Mitra (2002), we impose stability by modeling the law of motion of the coefficients as a mean-reverting process. Hence, we let the coefficients fluctuate around their unconditional mean, which we interpret as the value taken by the coefficients under rational expectations, but not depart too much from it. Also, we assume that the variance-covariance matrix of the estimated coefficients is constant over time; this specification is then consistent with perpetual learning, as it implies that the coefficients will never converge to the value they take under rational expectations.

2.3 Econometric Model

Given the motivating example discussed in section (2), consider the following vector autoregression model with time-varying parameters:

y_t = Θ_{0,t} + Θ_{1,t} y_{t−1} + … + Θ_{p,t} y_{t−p} + ε_t   (5)

where y_t is an N×1 vector of endogenous variables, Θ_{0,t} is a vector of intercepts, Θ_{i,t}, i = 1,…,p, are matrices of time-varying autoregressive coefficients, and ε_t is a vector of normally distributed errors. Rewrite the model in the following form:

y_t = X′_t B_t + ε_t   (6)

where X′_t is an N×K matrix collecting a constant and lagged values of the endogenous variables and B_t is a K×1 vector of time-varying coefficients:

X′_t = I_N ⊗ [1, y′_{t−1}, y′_{t−2}, …, y′_{t−p}]

B_t = (φ⁰_{1,t}, φ¹_{1,t}, …, φᵖ_{1,t}, …, φ⁰_{N,t}, φ¹_{N,t}, …, φᵖ_{N,t})′

with φʲ_{i,t}, i = 1,…,N, j = 1,…,p, being the i-th row of the matrix of coefficients Θ_{j,t} (and φ⁰_{i,t} the i-th intercept), and K = N(Np + 1). We specify the following stationary law of motion for the vector of random coefficients:

B_t = (I − Λ)B̄ + Λ B_{t−1} + v_t   (7)

with Λ a K×K matrix of coefficients, I an identity matrix of suitable dimension and v_t a vector of errors. B_t is a mean-reverting process with unconditional mean B̄, provided that the roots of the characteristic polynomial of Λ lie outside the unit circle. Note that (7) nests the driftless random walk specification if Λ equals the identity matrix.
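A small simulation can illustrate the law of motion (7). This sketch assumes a diagonal Λ and made-up values for B̄ and the innovation scale; it simply checks that with autoregressive entries inside the unit circle the simulated coefficients revert to their unconditional mean, whereas Λ = I would deliver the driftless random walk special case.

```python
import numpy as np

# Illustrative parameters (not estimates from the chapter)
rng = np.random.default_rng(1)
K = 3
Bbar = np.array([0.5, -0.2, 0.9])       # unconditional mean of the coefficients
Lam = np.diag([0.7, 0.4, 0.2])          # stationary case: entries inside unit circle

B = Bbar.copy()
draws = []
for t in range(5000):
    v = 0.05 * rng.normal(size=K)       # coefficient innovations v_t
    # law of motion (7): B_t = (I - Lam) Bbar + Lam B_{t-1} + v_t
    B = (np.eye(K) - Lam) @ Bbar + Lam @ B + v
    draws.append(B.copy())
draws = np.array(draws)
print(draws.mean(axis=0))   # sample mean close to Bbar
```

Replacing `Lam` with `np.eye(K)` makes the mean-reversion term vanish, so the same loop then generates the random walk used in the earlier TVP-VAR literature.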
The errors from the state and measurement equations are distributed according to:

[ε_t, v_t]′ ~ i.i.d. N(0, V),   V = [ R  0
                                      0  Q ]

so that R is the N×N covariance matrix of the innovations in equation (6) and Q is the K×K covariance matrix of the innovations in equation (7). Note that in this specification the error variances are assumed to be time-invariant and there is no correlation between the errors from the two equations. While the second assumption is standard in the time-varying parameter VAR literature, a time-varying representation for R is more widely used (Primiceri 2005, Cogley and Sargent 2005), although Cogley and Sargent (2001) and Canova and Gambetti (2004) specify the variance-covariance function of the errors to be constant across time.

Define B̃_t ≡ B_t − B̄ as the deviation of B_t from its unconditional mean and rewrite equation (7) as:

B̃_t = Λ B̃_{t−1} + v_t   (8)

The persistence of these deviations is determined by the matrix of autoregressive parameters Λ. Then equation (6) can be rewritten as:

y_t = X′_t B̃_t + X′_t B̄ + ε_t.   (9)

The system of equations (8) and (9) has a Gaussian state space representation, where equation (9) is the measurement equation and the equation describing the law of motion for B̃_t is the state equation. In a time-varying framework B̃_t and B_t are usually called the parameters, while B̄, Λ, Q and R are called the hyperparameters. We estimate the state space model with Bayesian methods and make use of the Kalman filter to retrieve the values of the time-varying coefficients. Our state space representation is non-linear as we impose a stability condition on the roots of the coefficients in the VAR. Following the previous literature, we estimate our model using Bayesian methods.
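Conditional on the hyperparameters, the filtered states of (8)-(9) can be obtained with a standard Kalman filter. The sketch below is a bare-bones illustration under known hyperparameters (in the chapter they are instead drawn inside a Gibbs sampler, and the stability truncation is ignored here); the dimensions, initialization of the state covariance and the simulated data are assumptions for the example.

```python
import numpy as np

def kalman_filter(y, X, Bbar, Lam, Q, R):
    """Filter Btilde_t in (8)-(9); y[t] is (N,), X[t] is the (N, K) design X_t'."""
    K = len(Bbar)
    b = np.zeros(K)                  # E[Btilde_0] = 0
    P = Q.copy()                     # an illustrative initial state covariance
    filtered = []
    for t in range(len(y)):
        b = Lam @ b                  # predict: Btilde_t = Lam Btilde_{t-1} + v_t
        P = Lam @ P @ Lam.T + Q
        H = X[t]
        e = y[t] - H @ (b + Bbar)    # innovation from y_t = X_t'(Btilde_t + Bbar) + eps_t
        S = H @ P @ H.T + R
        G = P @ H.T @ np.linalg.inv(S)
        b = b + G @ e                # update
        P = P - G @ H @ P
        filtered.append(b + Bbar)    # filtered B_t = Btilde_t + Bbar
    return np.array(filtered)

# Tiny example: univariate AR(1) data generated with constant coefficients
rng = np.random.default_rng(2)
T = 200
Bbar = np.array([0.1, 0.6])
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = Bbar[0] + Bbar[1] * y[t - 1] + 0.2 * rng.normal()
X = [np.array([[1.0, y[t - 1]]]) for t in range(1, T + 1)]
Bhat = kalman_filter(y[1:].reshape(-1, 1), X, Bbar,
                     0.9 * np.eye(2), 1e-4 * np.eye(2), np.array([[0.04]]))
print(Bhat[-1])   # close to Bbar, since the data have constant coefficients
```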
Let B^T = [B′_1, …, B′_T]′ denote the history of the coefficients B_t up to time T; we are interested in characterizing the joint posterior distribution of the history of parameters and the posterior distribution of the hyperparameters:

p(B^T, B̄, Λ, V | Y^T).

The joint posterior for states and hyperparameters can be simulated through Gibbs sampling by iterating on the conditional distributions in two steps:

step 1: conditional on data and hyperparameters, draw a history of states from p(B^T | Y^T, B̄, Λ, V);

step 2: conditional on data and states, draw the hyperparameters from p(B̄, Λ, V | B^T, Y^T).

2.4 Obtaining a History of States

The evolution of the states given the hyperparameters and the data is characterized as:

p(B_{t+1} | B_t, Y^T, B̄, Λ, V) ∝ I(B_{t+1}) p_U(B_{t+1} | B_t, Y^T, B̄, Λ, V)   (10)

where I(B_{t+1}) is an indicator function that takes the value 1 if the eigenvalues of the companion matrix associated with (6) are within the unit circle and zero otherwise. This guarantees that at no date does the VAR exhibit explosive roots. Allowing for unstable roots in the vector autoregression would imply an infinite variance for the macro series included in the model, and therefore an implausible representation of the data. The second term in (10) represents the unrestricted posterior density of B_{t+1}. Given the normality assumption for the errors in the state equation and the law of motion (7),

p_U(B_{t+1} | B_t, Y^T, B̄, Λ, V) ~ N((I − Λ)B̄ + Λ B_t, Q).

The truncation is implemented in the simulation by disregarding the draws that violate the stability condition: if any draw B_τ, τ = 1,…,T, gives rise to unstable roots, the whole history of draws for B is rejected. Inference on the state space model above is implemented by conditioning on the hyperparameters and applying the Kalman filter to the state equation (8) after having initialized the state vector B̃_0.

2.5 Obtaining the Hyperparameters

Conditional on the history of states, one needs to sample from the posterior distribution of the hyperparameters.
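The stability indicator I(B_{t+1}) amounts to checking the eigenvalues of the companion matrix implied by a drawn coefficient vector. A minimal sketch, with an assumed stacking of the coefficients (intercept first, then the lag matrices, equation by equation) and made-up bivariate VAR(2) draws:

```python
import numpy as np

def is_stable(B, N, p):
    """Return True if the VAR(p) implied by coefficient vector B is stable."""
    coefs = B.reshape(N, N * p + 1)[:, 1:]          # drop the intercepts
    companion = np.zeros((N * p, N * p))
    companion[:N, :] = coefs                        # [A1, A2, ..., Ap] on top
    companion[N:, :-N] = np.eye(N * (p - 1))        # identity blocks below
    return np.all(np.abs(np.linalg.eigvals(companion)) < 1.0)

# a stable bivariate VAR(2) draw vs. an explosive one
B_ok  = np.concatenate([[0.0, 0.5, 0.1, 0.1, 0.0],
                        [0.0, 0.0, 0.4, 0.0, 0.1]])
B_bad = np.concatenate([[0.0, 1.1, 0.0, 0.2, 0.0],
                        [0.0, 0.0, 1.1, 0.0, 0.2]])
print(is_stable(B_ok, N=2, p=2), is_stable(B_bad, N=2, p=2))
```

In the sampler described above, a draw failing this check would cause the whole candidate history of states to be rejected.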
Note that, given the stationary autoregressive specification for the law of motion of the states, it is necessary to estimate two additional hyperparameters relative to the existing literature on TVP VARs, which assumes a random walk law of motion for B_t. In particular, we have to obtain the posterior distributions for the unconditional mean B̄ and the matrix of autoregressive coefficients Λ. Again, note that our specification nests the random walk one, and that a posterior distribution for Λ centered at the identity matrix would provide evidence toward a random walk characterization of the evolution of B_t. We assume a hierarchical prior and posterior distribution for the hyperparameters:

p(B̄, Λ, Q, R | Y^T, B^T) = p(B̄, Λ | Q, R, Y^T, B^T) p(Q, R | Y^T, B^T)
                          = p(B̄, Λ | Q, R, Y^T, B^T) p(Q | Y^T, B^T) p(R | Y^T, B^T)

The vectors of time-varying coefficients B_t and B̃_t are of dimension K×1, where K = N(Np + 1). In our small model for the US post-WWII economy, with three endogenous variables and two lags, K = 21, so that both Λ and Q are matrices of dimension K×K = 21×21. The high dimensionality of these matrices is then a concern for estimation. To overcome the dimensionality issue we impose diagonality of Λ and Q. This implies that the random coefficients evolve independently. We assume a normal-inverted gamma prior for each i-th element of the vector B̄ and of the diagonals of Λ and Q:

[B̄_i, λ_ii] | Q_ii ~ N(μ, Q_ii / ν)

Q_ii | ν, δ, B_{i,t}, …, B_{i,1} ~ Γ⁻¹(ν, δ_i)

A conjugate prior delivers a conjugate posterior.
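Sampling from a normal-inverse-gamma is straightforward: draw the variance from the inverse gamma, then the coefficients from a normal scaled by that variance. The sketch below is an illustration of that two-stage draw only; the parameter values, and the helper name `draw_nig`, are made up for the example and are not the chapter's posterior quantities.

```python
import numpy as np

rng = np.random.default_rng(4)

def draw_nig(mu, nu, alpha, delta):
    """One draw of (Q_ii, [lambda_ii, Bbar_i]) from a normal-inverse-gamma."""
    # Q_ii ~ inverse-gamma(alpha, delta): invert a gamma draw
    q = delta / rng.gamma(shape=alpha, scale=1.0)
    # [lambda_ii, Bbar_i] | Q_ii ~ N(mu, (Q_ii / nu) I)
    coef = rng.normal(loc=mu, scale=np.sqrt(q / nu))
    return q, coef

mu = np.array([0.5, 0.1])            # illustrative location parameter
draws = np.array([draw_nig(mu, nu=44, alpha=100, delta=0.01)[1]
                  for _ in range(2000)])
print(draws.mean(axis=0))            # centred near mu
```

Within the Gibbs sampler, one such draw per diagonal element would deliver the updated Q_ii together with the corresponding (λ_ii, B̄_i) pair.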
For each $i$th equation in (8), the relationship between the parameters of the prior (indicated with the subscript 0) and those of the posterior (indicated by the subscript $T$) is described by the following equations:

$$\mu_{i,T} = \left[\frac{T_0\,\hat\Phi_{ii,0} + T\,\hat\Phi_{ii}}{T_0+T},\ \frac{T_0\,\bar B_{i,0} + T\,\hat{\bar B}_i}{T_0+T}\right], \qquad \nu_T = \nu_0 + T,$$

$$\delta_{i,T} = \delta_{i,0} + \frac{1}{2}\sum_{t=1}^{T}\left(B_{i,t} - (1-\hat\Phi_{ii})\hat{\bar B}_i - \hat\Phi_{ii} B_{i,t-1}\right)^2 + \frac{T_0\,T}{T_0+T}\left[\hat\Phi_{ii}-\hat\Phi_{ii,0},\ \hat{\bar B}_i-\bar B_{i,0}\right]\begin{bmatrix}\hat\Phi_{ii}-\hat\Phi_{ii,0} \\ \hat{\bar B}_i-\bar B_{i,0}\end{bmatrix}$$

where $T_0$ and $T$ are the sizes of the training sample and of the estimation sample respectively, $\hat\Phi_{ii}$ and $\hat{\bar B}_i$ are estimated from $B_{i,t} = (1-\Phi_{ii})\bar B_i + \Phi_{ii} B_{i,t-1} + \nu_{i,t}$, and the parameters $\delta_{i,0}$, $\bar B_{i,0}$, $\hat\Phi_{ii,0}$ are chosen arbitrarily.

The prior and the posterior distributions for the variance covariance matrix of the measurement equation, $R$, take an inverted Wishart form, where the posterior distribution parameters can be derived from the prior parameters as:

$$R \sim IW(\alpha_R \bar R_T,\ \alpha_R), \qquad \alpha_R = \alpha_0 + T, \qquad \bar R_T = \frac{\alpha_0 R_0 + T \hat R_T}{\alpha_0 + T}, \qquad \hat R_T = \frac{1}{T}\sum_{t=1}^{T}\hat\varepsilon_t\hat\varepsilon_t'$$

where $R_0$ is an arbitrary $N \times N$ matrix and $\hat\varepsilon_t$ are the residuals from the measurement equation.

2.6 A Small Monetary Model for the US Economy

We consider a small empirical model for US post-WWII data, which focuses on variables relevant for monetary policy analysis: the model includes inflation, output growth and a short term interest rate. Inflation is computed as the annual percentage change in the consumer price index; output growth is obtained as the annual percentage change in real GDP; the Federal Funds Rate is used as the measure of the short term interest rate. Data are collected at the quarterly frequency from 1955:Q3 to 2009:Q1. The VAR specification includes 2 lags and an intercept term. The coefficients are estimated through a Gibbs sampling algorithm that involves 45,000 iterations, with the first 5,000 discarded to allow for burn-in.
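As an illustration, one conditional step of such a Gibbs sampler, the inverted-Wishart update for $R$ derived in section 2.5, can be sketched as follows. The function name is illustrative; only the mapping from prior parameters and residuals to posterior parameters comes from the text.

```python
import numpy as np

def r_posterior(resid, R0, alpha0):
    """Posterior parameters of the inverse-Wishart for the measurement
    covariance R: combine the prior (R0, alpha0) with T x N residuals."""
    T = resid.shape[0]
    R_hat = resid.T @ resid / T                        # (1/T) sum eps_t eps_t'
    alpha_R = alpha0 + T                               # posterior dof
    R_bar = (alpha0 * R0 + T * R_hat) / (alpha0 + T)   # posterior scale
    return R_bar, alpha_R
```

A draw of $R$ would then be taken from $IW(\alpha_R \bar R_T, \alpha_R)$ using any inverse-Wishart sampler.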
We use a training sample of 11 years (forty-four observations, from 1954:Q3 to 1965:Q3) in order to obtain priors for $\bar B$, $\Phi$, $Q$ and $R$ and to initialize the state vector. The prior mean and variance for $\bar B$ are obtained as the MLE estimate and its variance from a vector autoregression with constant coefficients on the pre-sample. We impose a standard prior for the diagonal elements in $Q$: $Q_{ii} \sim \Gamma^{-1}(\lambda,\ g_Q \cdot VB\_OLS_{ii})$, where $VB\_OLS$ is the long run variance of the coefficients obtained from an OLS regression of a constant coefficients VAR of order two on the training sample, scaled by a factor $g_Q = 0.0001$, and $\lambda$ is the size of the training sample. The prior mean for $R$ is the variance covariance matrix of the residuals from the constant coefficients VAR on the training sample.

The choice of a prior mean for $\Phi$ is less trivial. We arbitrarily assume that deviations from the unconditional mean are larger for the time varying coefficients describing the effect of lagged inflation on itself, of lagged output on itself and of lagged interest rate on itself, so we allow for a higher prior mean ($0.7$) for the corresponding entries in $\Phi$. We experimented with different values for $\Phi$ and we document that choosing values for the prior mean of $\Phi$ larger than $0.7$ results in no draw for the history of $B_t$ that satisfies the stability condition. The prior means for the autoregressive coefficients which characterize the evolution of the intercepts are set to $0.4$. All the other entries in the diagonal of $\Phi$ equal $0.2$. We initialize $\tilde B_0$ to zero, i.e. we assume that $B_t$ is equal to its unconditional mean up to the beginning of the estimation sample; we also assume that the initial state $\tilde B_0$ and the hyperparameters are independent.

2.7 Importance of Learning

We investigate the importance of learning dynamics in three ways: first, we look at the impact of a shock to monetary policy and we ask whether this effect is the same for each period. Any heterogeneity in the impulse response functions is imputed to deviations of the coefficients from their unconditional mean and therefore, given our interpretation, it is evidence in favor of learning dynamics.
Second, we quantify the importance of learning by computing the contribution of the learning component to the variance of the variables in the VAR. Last, we run a counterfactual experiment by simulating the data in the absence of learning, and we compare the actual and the simulated data.

2.7.1 Impulse Responses

We study the effect of a tightening of monetary policy on inflation, output growth and the short term interest rate. Identification of the monetary shock is achieved through sign restrictions, rather than through exclusion restrictions, in the spirit of Uhlig (2005). We assume that after a contractionary monetary policy shock the federal funds rate increases and both inflation and output decrease. We repeat the analysis imposing the restrictions for up to H=2 and H=4 horizons after the shock. In a time varying framework, in order to assess the importance of learning dynamics, rather than looking at a representative impulse response function over the sample, we derive impulse response functions for each date in the estimation sample. Canova and Gambetti (2004) provide a formal definition of impulse response functions in the case of time varying coefficient VARs; following their definition, the impulse response functions are constructed taking into account future projections of the time varying coefficients. Figures 1 through 3 show the impulse responses of inflation, output growth and the federal funds rate to a shock of one standard deviation in size for each date for horizons 1 through 20 (the bounds of the credible set are not plotted to make the figures legible) when the sign restriction is imposed only for the first 2 periods after the shock; the responses obtained by imposing the restrictions for up to horizon 4 are analogous to the ones shown.

Figure 1. IRF of Inflation to Monetary Shock [surface plot over horizon and time; axes: horizon, time, percent]

Figure 2. IRF of Output Growth to Monetary Shock [surface plot over horizon and time; axes: horizon, time, percent]

Figure 3.
IRF of FFR to Monetary Shock [surface plot over horizon and time; axes: horizon, time, percent]

In order to highlight differences in the uncertainty around the median impulse response over time, figures 4 through 6 display the median and the lower and upper bounds of the 90th-percent highest posterior density interval of the impulse response functions for each date at selected horizons: on impact, after four quarters, after 2 years and after five years.

Figure 4. IRF of Inflation to Monetary Shock, Selected Horizons

From theoretical models we expect a tightening of monetary policy to increase the short term interest rate, decrease prices and reduce real output. Because of the sign restrictions imposed, our empirical results confirm the predictions for the variables up to the second horizon, but they also show an anomaly in the response of output growth at longer horizons. A contractionary monetary shock has a significant negative effect on output growth on impact. However, starting from the third quarter after the shock, output growth increases sharply for a few quarters and peaks at about 1 percent one year after the shock; it then declines rapidly.

Figure 5. IRF of GDP Growth to Monetary Shock, Selected Horizons

Figure 6. IRF of FFR to Monetary Shock, Selected Horizons

Finally, the effect of the shock fades away after 10 quarters. Contrary to theoretical predictions, then, output growth is positive for one year. An increase in real activity after a tightening of monetary policy has been documented also in Uhlig (2005) for a constant coefficient VAR. The contractionary monetary shock decreases the inflation rate on impact, but this effect is short lived, and from the second quarter after the shock inflation sluggishly goes back to its initial value. After 2 years the effect of the monetary shock on inflation disappears. The federal funds rate increases on impact by about 0.6 percentage points and rapidly goes back to its initial value. We have just described some features of the impulse response functions that are common throughout the sample.
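Mechanically, the responses at a fixed date come from powers of the companion matrix of the VAR at that date. The sketch below is illustrative only: it ignores the sign-restriction identification and the projections of future coefficient draws used in the text, and the function name is an assumption.

```python
import numpy as np

def irf(A_lags, horizons, shock):
    """Impulse responses of an n-variable VAR(p) at a fixed date.
    A_lags: list of p (n x n) lag coefficient matrices;
    shock:  n-vector of impact effects on the variables."""
    n, p = A_lags[0].shape[0], len(A_lags)
    comp = np.zeros((n * p, n * p))            # companion form
    comp[:n, :] = np.hstack(A_lags)
    comp[n:, :-n] = np.eye(n * (p - 1))
    state = np.zeros(n * p)
    state[:n] = shock
    out = []
    for _ in range(horizons):
        out.append(state[:n].copy())           # response at this horizon
        state = comp @ state                   # propagate one quarter
    return np.array(out)                       # horizons x n matrix
```

In the time varying case one such computation is repeated for every date in the estimation sample, using that date's coefficient draw.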
The object of the analysis, however, is to highlight the differences in the impulse responses across time. From the three-dimensional figures 1 to 3, as well as from figures 4 through 6, it emerges that for all the variables considered the median of the impulse response function exhibits little heterogeneity across the sample. This finding is consistent with Primiceri (2005), which derives impulse response functions of inflation and unemployment to a contractionary monetary policy shock in a trivariate model. Using a time varying parameter VAR in which the coefficients follow a random walk specification and the variance covariance matrix of the errors is assumed to be time varying, he constructs impulse response functions for inflation and the unemployment rate at three dates in the estimation sample (75:Q1, 81:Q3 and 96:Q1) and finds that the responses are very similar, particularly for the unemployment series. Figures 7 through 9 depict the median as well as the lower and upper bounds of the 90th-percent highest posterior density interval of the sampled impulse response functions of the variables for horizons of up to 5 years after the shock for 1981:Q3 and 2007:Q3. The first date coincides with the peak in the interest rate series, and it is also an NBER business cycle peak date, while 2007:Q3 is the last observation before the beginning of the current recession. The differences across periods are limited for all the series.

Figure 7. IRF of Inflation to Monetary Shock, 81:Q3 and 07:Q3 [line plot; horizons 0-20 quarters]

The medians are relatively stable, while the confidence bounds show more heterogeneity across periods. In particular, for all variables they are wider for 81:Q3, suggesting that more uncertainty about the behavior of the variables is associated with that period.

Figure 8. IRF of GDP Growth to Monetary Shock, 81:Q3 and 07:Q3 [line plot; horizons 0-20 quarters]

Figure 9.
IRF of FFR to Monetary Shock, 81:Q3 and 07:Q3 [line plot; horizons 0-20 quarters]

To summarize, the impulse response functions show some heterogeneity across the sample. Any difference in the impulse responses across time is due to the deviations of the estimated coefficients from their unconditional mean and is therefore attributable to learning dynamics. Our findings therefore seem to suggest that learning dynamics do play a role in the behavior of the U.S. postwar series under analysis.

2.7.2 Marginal Effect

In the previous section, we documented through a graphical analysis of impulse responses that learning dynamics are limited in the data. We now consider a different strategy to investigate the extent of learning in the data, and we quantify the importance of learning by computing how much of the overall sample variance of the variables under analysis is explained by the learning component. In order to do so, we need to disentangle the contribution of the learning component from the contribution of the rational expectations component to the variance of $y_t$. Recall from (9) that the equation that describes the evolution of $y_t$ can be decomposed into two separate components: one that is interpreted as the rational expectations component, $X_t'\bar B$, and the learning component, $X_t'\tilde B_t$. Despite this break-down of the measurement equation, the correlation between $X_t'\tilde B_t$ and $X_t'\bar B$ does not allow the variance of $y_t$ to be broken down as sharply. We propose to isolate the marginal effect of the learning component on the variance of $y_t$ by proceeding in two steps. Denote $\tilde Z_t = X_t'\tilde B_t$ and $\bar Z_t = X_t'\bar B$, and denote by $Y$, $\tilde Z$ and $\bar Z$ the matrices that stack $y_t'$, $\tilde Z_t'$ and $\bar Z_t'$, respectively.

In the first step, we regress $Y$ on $\bar Z$ and regress $\tilde Z$ on $\bar Z$, and obtain the residuals $M_{\bar Z} Y$ and $M_{\bar Z}\tilde Z$, where $M_{\bar Z} = I_T - \bar Z(\bar Z'\bar Z)^{-1}\bar Z'$.
In the second step, we compute the sample correlation between $M_{\bar Z} Y$ and $M_{\bar Z}\tilde Z$. Its square is the $R^2$ of the regression of $M_{\bar Z} Y$ on $M_{\bar Z}\tilde Z$, and we may interpret it as a marginal $R^2$ of $\tilde Z_t$ on $y_t$, that is, the proportion of the variation of $y_t$ explained marginally by $\tilde Z_t$. Note that the measure we propose purges out the effect of the interaction between $X_t'\tilde B_t$ and $X_t'\bar B$, and hence the contribution of the learning component to the variance of $y_t$ could actually be underestimated.

Table 1 shows the marginal effect of the learning component in each of the three equations included in the vector autoregression.

Table 1. Marginal Effect of Learning (Sample 66:Q1-09:Q1)

  Equation      M.E.
  Inflation     0.1687
  GDP growth    0.2202
  FFR           0.6850

The learning component accounts for slightly less than 17% of the variation of inflation and for about 22% of the variation in output, but it plays a much bigger role in the equation for the federal funds rate, accounting for more than two-thirds of the variation in our measure of the short term interest rate. This analysis suggests that learning dynamics are important in explaining the evolution of the variables and that the equation capturing the behaviour of the monetary authority is the one most subject to learning by agents.

2.7.3 Counterfactual Experiment

An alternative way to evaluate the importance of learning is to run a counterfactual experiment in which the data are simulated as if learning dynamics did not take place. The experiment is implemented as follows: first, for each date in the estimation sample we compute the residuals from the model $y_t = X_t'\tilde B_t + X_t'\bar B + \varepsilon_t$; the simulated data $\bar y_t$ are then obtained by iterating on $\bar y_t = \bar X_t'\bar B + \hat\varepsilon_t$, with $\hat\varepsilon_t$ being the residuals computed in the first step and $\bar X_t$ including lagged values of $\bar y_t$: $\bar X_t' = I_N \otimes (1, \bar y_{t-1}', \bar y_{t-2}', \ldots, \bar y_{t-p}')$.

Figures 10 through 12 plot the actual and simulated series of inflation, output growth and the federal funds rate for the sample 1966:Q1-2009:Q1 in the upper panel and the difference between the two series in the lower panel.
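The counterfactual iteration just described can be sketched in a univariate simplification (one equation rather than the full vector system). The function and variable names are illustrative, not from the dissertation.

```python
import numpy as np

def simulate_no_learning(coef, resid, y_init):
    """Iterate y_bar_t = c + a_1*y_bar_{t-1} + ... + a_p*y_bar_{t-p} + eps_hat_t,
    feeding back the *simulated* lags, as in the counterfactual experiment.
    coef   : (c, a_1, ..., a_p), the unconditional-mean coefficients
    resid  : residuals eps_hat from the estimated full model
    y_init : p pre-sample values (most recent last)"""
    p = len(coef) - 1
    path = list(y_init)
    for e in resid:
        x = [1.0] + path[-1:-p - 1:-1]          # constant and p lags, newest first
        path.append(float(np.dot(coef, x)) + e)
    return np.array(path[len(y_init):])
```

In the actual experiment the same recursion runs jointly for the three equations, with $\bar B$ in place of the scalar coefficients.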
For most of the sample, the simulated inflation series lies above the actual series, implying that without learning the inflation rate would have been higher. The discrepancy is remarkably large in the first part of the sample, where the simulated series is more than 2 percentage points higher than the actual series.

Figure 10. Actual and Simulated Inflation Series

Except for the subsample 68:Q1-69:Q1, the actual interest rate series is higher than the simulated one. Again, the difference between the series is accentuated in the first part of the sample, peaking at 3.6 percentage points in 1983:Q1. This behavior is consistent with our findings for the inflation series: learning dynamics induce the monetary authority to set the interest rate higher than it would in the absence of learning, and this in turn keeps inflation lower than in the absence of learning.

Figure 11. Actual and Simulated GDP Growth Series

The actual and simulated series for GDP growth are almost overlapping, suggesting that learning dynamics did not affect the evolution of our measure of real activity. However, the loose monetary policy of the late '70s would imply a drop in output growth of up to 0.6 percentage points in the sample 1976:Q1 to 1980:Q1.

Figure 12. Actual and Simulated FFR Series

Table 2 reports some descriptive statistics of the simulated and of the actual data for the whole sample and for two subsamples of equal size: 66:Q1-87:Q4 and 88:Q1-09:Q1. The table confirms that the differences between the actual and simulated series are more marked for inflation and the federal funds rate. For the overall sample, the mean of the actual inflation rate is about 80% of the mean of the simulated data, while for the interest rate the mean of the simulated data is 15% smaller than the mean of the actual series. Actual inflation is less volatile than the simulated series, while the converse holds for the federal funds rate. Means and standard deviations are about the same for the actual and simulated output growth data.

Table 2.
Descriptive Statistics for Actual and Simulated Data

  66:Q1-09:Q1       Simulated                 Actual
                 INF     GDP     FFR       INF     GDP     FFR
  Mean          2.366   2.891   5.366     1.934   2.997   6.370
  Std. Dev.     1.541   2.131   2.737     1.178   2.163   3.387

  66:Q1-87:Q4       Simulated                 Actual
                 INF     GDP     FFR       INF     GDP     FFR
  Mean          3.290   3.015   6.615     2.545   3.210   8.109
  Std. Dev.     1.667   2.602   2.792     1.307   2.650   3.399

  88:Q1-09:Q1       Simulated                 Actual
                 INF     GDP     FFR       INF     GDP     FFR
  Mean          1.431   2.765   4.102     1.292   2.791   4.561
  Std. Dev.     0.483   1.521   2.015     0.509   1.515   2.261

A comparison of the statistics over sub-samples confirms that in the second part of the sample the means and standard deviations of the simulated series are closer to those of the actual series for all three variables considered.

2.8 Conclusions

In this essay we use a time varying coefficient vector autoregression to assess the importance of the learning component in the US postwar economy and its ability to explain the changes in the dynamics of key macroeconomic series. The random coefficients are assumed to follow a mean reverting process around an unconditional mean that can be interpreted as the estimate of the coefficients from the reduced form of a rational expectations equilibrium model. The deviations from the unconditional mean are attributed to the learning behavior of the agents about the values of the coefficients which regulate the economy. The proposed specification allows us to decompose our vector autoregression model, separating the learning component (the deviation of the coefficients from their unconditional mean) from the rational expectations component, i.e. the term related to the unconditional mean of the coefficients.

We estimate a monetary model for the post-WWII U.S. economy including inflation, output growth and the federal funds rate.
We document the presence of learning dynamics and we assess their importance by measuring the proportion of the variance of each equation that is accounted for by the learning component and by running a counterfactual experiment in which the data are generated in the absence of learning. We find that the importance of the learning mechanism is somewhat limited for inflation and output growth but is substantial in explaining the dynamics of the federal funds rate. Our results suggest that the learning behaviour of the monetary authority might explain the change in the dynamics of the nominal variables, while the learning mechanism is unable to characterize the changes in real activity.

Chapter 3
A Predictability Test for a Small Number of Nested Models

3.1 Introduction

Evaluation of forecast accuracy usually requires comparing the expected loss of the forecasts obtained from a set of models of interest. Testing whether the models deliver the same forecast performance represents a test of equal predictive ability. Diebold and Mariano (1995) and West (1996) suggested a framework to test for equal predictive ability in the case of non-nested models. Diebold and Mariano (1995) make inference on a vector of moments of predictions or prediction errors generated by two models and prove that under the null of equal predictive ability the distribution of the sample mean loss differential is asymptotically standard normal. Their framework accommodates various loss functions but assumes that the forecasts do not rely on regression estimates. West (1996) discusses the effect of parameter estimation error on inference about the mean squared prediction error (MSPE): he shows that the asymptotic distribution of the difference in sample mean squared forecast errors differs from the distribution of the population mean squared forecast errors. He then provides conditions under which the parameter estimation error is asymptotically irrelevant, so that inference on the MSPE can proceed as described in Diebold and Mariano.
The analysis conducted in Diebold and Mariano (1995) and West (1996) does not apply to nested models because of a rank condition that is not satisfied in this framework. The problem with nested models arises because under the null of equal predictive accuracy the errors of the different models are the same, and hence the variance covariance matrix of the estimator is not of full rank. A formal characterization of limiting distributions for the comparison of two nested models has been attained by McCracken (2004) and Clark and McCracken (2001, 2005) when the parameters are estimated through nonlinear least squares. In this environment the test statistic used to evaluate the null of equal predictive ability is derived as a functional of Brownian motions and is asymptotically pivotal if certain additional conditions hold: the forecast horizon is one and the forecast errors are conditionally homoskedastic, or the larger model contains only one additional regressor. Clark and West (2006, 2007), hereafter CW, argue that for nested models the sample MSPE difference is positive and introduce an adjustment term to center the statistic around zero. They also provide Monte Carlo evidence to justify the assumption of normality of the adjusted MSPE difference.

The above mentioned papers compare only two models. There are two existing out-of-sample procedures to compare a small number of nested models, outlined in Hubrich and West (2008), hereafter HW. The first one is a direct extension of the work of Diebold and Mariano (1995) and West (1996) (DMW) and consists of a chi-squared statistic; the other examines the maximum of correlated normals. Both tests adjust the MSPE differences as advocated in CW.

A related literature investigates the merit of using in-sample versus out-of-sample tests to evaluate predictability (see for example Inoue and Kilian (2004) and Chen (2005)). We abstract from this discussion and focus only on out-of-sample tests.

We propose a one-sided likelihood ratio type predictability test for the comparison of a small number of nested models.
A rigorous definition of 'small' is not provided, but as a practical rule we suggest that the number of models should be smaller than the size of the out-of-sample period. We distinguish among three different cases according to the structure of the alternative models considered: in the first case the models are nested within each other; in the second there is no nesting relation between the models; in the third, more general case, the models can be grouped such that within each group the models are nested, but there is no nesting relation among groups. In all cases, the alternative models nest the benchmark model.

We evaluate the size and power properties of the test via Monte Carlo simulations for 1-step ahead forecasts under different assumptions on the limiting distributions of the statistics. As mentioned above, the limiting distribution of the sample MSPE differential under the null is a functional of Brownian motion, and critical values cannot be derived analytically. However, Clark and McCracken (2001) and Clark and West (2007) show through simulations that the size distortion of the test based on critical values derived under the normality assumption for the MSPE is not large. The empirical size is close to the nominal size for large values of the in-sample to out-of-sample ratio. There are then two ways to obtain critical values: the first is to assume normality of the sample MSPE differential; the second is through subsampling of the statistics. The Monte Carlo investigation reveals that, under the assumption of normality of the MSPE differences, the chi-squared test is slightly oversized, while the other tests are undersized; the size distortion decreases with the out-of-sample to in-sample ratio, as previously found by Clark and McCracken and Clark and West.
As far as the power of the test is concerned, regardless of the assumptions under which the critical values are derived, the chi-squared test performs poorly in terms of power, as it disregards the one-sided nature of the test, while the ranking between the likelihood-ratio type test and the correlated normals test depends on the simulation settings. The results from subsampling are sensitive to the choice of the block size: the chi-squared test has empirical size equal to nominal size for smaller block sizes, while the correlated normals test and the LRT do for larger block sizes.

The outline of the chapter is as follows: section 3.2 introduces the notation and the forecasting environment and presents the tests. Section 3.3 discusses inference on the tests. In section 3.4 the Monte Carlo simulation experiment is described and the size and power properties of the tests are commented on. Section 3.5 presents an empirical application, forecasting aggregate US inflation. Section 3.6 concludes.

3.2 Basic Mathematical Framework

We refer to the work of HW for the description of the environment. We are interested in forecasting a scalar $y_t$ through $M+1$ linear models estimated by least squares. The benchmark model, denoted as '0', and the alternative models, denoted as 'm' with $m = 1, \ldots, M$, can be written as:

$$y_t = X_{0,t}'\beta_0 + u_{0,t}$$
$$\vdots$$
$$y_t = X_{m,t}'\beta_m + u_{m,t}$$
$$\vdots$$
$$y_t = X_{M,t}'\beta_M + u_{M,t},$$

where $y_t$ is a stationary time series variable, the $u_{i,t}$ are i.i.d. random variables satisfying $E(u_{i,t} X_{i,t}) = 0$, and $X_{0,t}, \ldots, X_{M,t}$ are vectors of regressors such that $X_{0,t} = x_{0,t}$ is of dimension $k_0 \times 1$ and $X_{m,t} = (x_{0,t}', x_{m,t}')'$ is of dimension $k_m \times 1$ with $k_0 < k_m$. Under the null, model 0 is the true model, and hence each model $m$ includes $k_m - k_0$ excess parameters: $\beta_m = (\beta_0', 0_{k_m-k_0}')'$ for all $m = 1, \ldots, M$; moreover, under the null the errors are identical: $u_{0,t} = u_{1,t} = \cdots = u_{M,t}$. Under the alternative, however, the additional parameters estimated are non-zero in population. For simplicity we focus
We assume the parameters m to be constant over time, therefore we do not allow for structural breaks. Let T+1 be the total sample size, R the size of the sample used to generate the initial estimates and P the ob- servations used for out of sample evaluation. Denote by ^ y 0;t+1 ;::;^ y m;t+1 ;::;^ y M;t+1 the one period ahead forecasts obtained from the estimated models either through theexpandingwindowortherollingschemefort=R,..,T.Intheexpandingwindow schemethesizeoftheestimationsamplegrowswhileintherollingschemeitstays constant. Following CW we denote as f m;t+1 the difference of the loss functions between the benchmark and alternative modelm; in this case the difference in the squared prediction errors (SPE):f m;t+1 = u 2 0;t+1 u 2 m;t+1 , whereu m;t+1 = y t+1 y f m;t+1 andy f m;t+1 istheone-stepaheadforecastfrommodelmwhentheparametersofthe modelsaresettotheirpopulationvalues. 47 Collect the SPE differences in the vector f t+1 and dene by the mean SPE (MSPE),theexpectedvalueoff t+1 : f t+1 = (f 1;t+1 ;:::;f M;t+1 ) 0 =E(f t+1 ) = 2 0 2 1 ;:::; 2 0 2 M 0 with 2 i E u 2 i;t being the population variance of the forecast error, which is assumedtobeastationaryprocess. Let ^ u i;t+1 = y t+1 ^ y i;t+1 be the 1-step ahead forecast error from the estimated modeliwithi = 0;::;M. Thesampleanalogoff m;t+1 ,denotedby ^ f m;t+1 ,isgiven by: ^ f m;t+1 = (y t+1 ^ y 0;t+1 ) 2 (y t+1 ^ y m;t+1 ) 2 = (^ u 0;t+1 ) 2 (^ u m;t+1 ) 2 andthesampleanalogof ^ f t+1 ;calledf t+1 isthevectorobtainedbystackingtogether thesampleSPEdifferences: ^ f t+1 = ^ f 1;t+1 ;:::; ^ f M;t+1 0 : Thesamplecounterpartof;thesampleMSPE,isgivenbythevector f thatcollects 48 the sample averages of the elements in ^ f t+1 computed over the forecasting sample R+1throughP: f =P 1 T X t=R ^ f 1;t+1 ;::; T X t=R ^ f m;t+1 ;::; T X t=R ^ f M;t+1 ! 
However, CW argue that under the null that model 0 is the correctly specified model, the sample MSPE from the parsimonious model will generally be lower than the sample MSPE from the alternative model, so it may be the case that $P^{-1}\sum_{t=R}^{T}\hat f_{m,t+1} < 0$. Hence, they suggest the following adjustment to center the sample MSPE difference around zero:

$$\hat f_{m,t+1}^{adj} = (y_{t+1} - \hat y_{0,t+1})^2 - \left[(y_{t+1} - \hat y_{m,t+1})^2 - (\hat y_{0,t+1} - \hat y_{m,t+1})^2\right] = (\hat u_{0,t+1})^2 - (\hat u_{m,t+1})^2 + (\hat y_{0,t+1} - \hat y_{m,t+1})^2.$$

Quantities analogous to those defined above for $\hat f_{m,t+1}$ can be derived from $\hat f_{m,t+1}^{adj}$:

$$\hat f_{t+1}^{adj} = (\hat f_{1,t+1}^{adj}, \ldots, \hat f_{M,t+1}^{adj})'$$
$$\bar f^{adj} = P^{-1}\left(\sum_{t=R}^{T}\hat f_{1,t+1}^{adj},\ \ldots,\ \sum_{t=R}^{T}\hat f_{m,t+1}^{adj},\ \ldots,\ \sum_{t=R}^{T}\hat f_{M,t+1}^{adj}\right)'.$$

Following Clark and West's suggestion, we define

$$\phi_m^{adj} = \phi_m + E\left(y_{0,t+1}^f - y_{m,t+1}^f\right)^2.$$

Since $\phi_m^{adj} = 2\phi_m$, we conclude that in the population the adjustment does not alter the nature of the problem stated in terms of the unadjusted MSPE. We will specify the null hypothesis as $H_0: \phi = 0$, or equivalently $H_0: \phi^{adj} = 0$, while the specification of the alternative hypothesis will depend on the assumptions on the structure of the alternative models. We will distinguish between three cases: in the first the models are nested within each other; in the second there is no nesting relation between the alternative models; in the last the models are nested within groups.

3.2.1 Special Case 1: Alternative Models Nested Within Each Other

We characterize the case in which each model $m-1$ is nested in model $m$ by imposing that model $m$ includes $k_m - k_{m-1}$ additional regressors: $X_{m,t} = (X_{m-1,t}', x_{m,t}')'$, so that $k_0 < \cdots < k_m < \cdots < k_M$.

Given the structure of the problem, when considering the MSPE we know that if model $m^*$ is the true model, then for models $m = 1, \ldots, m^*-1$ it will hold that $\sigma_{m^*}^2 < \sigma_{m^*-1}^2 < \cdots < \sigma_1^2$, and hence $0 < \phi_1 = \sigma_0^2 - \sigma_1^2 < \cdots < \sigma_0^2 - \sigma_{m^*-1}^2 = \phi_{m^*-1} < \sigma_0^2 - \sigma_{m^*}^2 = \phi_{m^*}$, while for models $m^*+1, \ldots, M$ it will be the case that $\sigma_{m^*}^2 = \sigma_{m^*+1}^2 = \cdots = \sigma_M^2$, which implies $\phi_{m^*} = \phi_{m^*+1} = \cdots = \phi_M$. This ordering is invariant to the introduction of the CW adjustment.
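The CW adjustment above is straightforward to compute from the realized series and the two forecast paths. The sketch below is illustrative; the array names are assumptions.

```python
import numpy as np

def cw_adjusted_spe(y, yhat0, yhatm):
    """Clark-West adjusted SPE differentials for one alternative model:
    f_adj_t = u0_t^2 - um_t^2 + (yhat0_t - yhatm_t)^2."""
    u0, um = y - yhat0, y - yhatm
    return u0 ** 2 - um ** 2 + (yhat0 - yhatm) ** 2

# toy example: realized values, benchmark and alternative forecasts
y = np.array([1.0, 2.0, 3.0])
f0 = np.array([0.5, 2.5, 2.0])
f1 = np.array([1.0, 2.0, 2.5])
f_adj = cw_adjusted_spe(y, f0, f1)
```

Averaging `f_adj` over the $P$ forecast dates gives the element of $\bar f^{adj}$ for that model.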
Then the null and the alternative hypotheses can be expressed as:

$$H_0: \phi = 0 \qquad (11)$$
$$H_1: 0 \le \phi_1 \le \phi_2 \le \cdots \le \phi_M,\quad \phi \ne 0,$$

or equivalently with respect to $\phi^{adj}$, since $\phi^{adj} = 2\phi$:

$$H_0: \phi^{adj} = 0 \qquad (12)$$
$$H_1: 0 \le \phi_1^{adj} \le \phi_2^{adj} \le \cdots \le \phi_M^{adj},\quad \phi^{adj} \ne 0.$$

Hence we test equal forecast accuracy against the alternative that at least one of the models performs better than the benchmark. We consider a one-sided alternative, as first suggested by Ashley, Granger and Schmalensee (1980) and subsequently assumed in many studies (CW, HW).

The test we propose to evaluate the null of equal predictive ability is a likelihood-ratio type test of the form:

$$T_{LRT\_D} = P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi:\ D\phi \ge 0} P\,(\bar f^{adj} - \phi)'\,\hat v^{-1}\,(\bar f^{adj} - \phi)$$

where

$$\hat v = P^{-1}\sum_{t=R}^{T}\left(\hat f_{t+1}^{adj} - \bar f^{adj}\right)\left(\hat f_{t+1}^{adj} - \bar f^{adj}\right)' \qquad (13)$$

and

$$D = \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 \\ -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix}.$$

$D$ is picked such that the parameter set implied by the alternative can be expressed as $H_1: D\phi \ge 0$, i.e. the alternative models follow the structure $0 \le \phi_1 \le \phi_2 \le \cdots \le \phi_M$.

Instead of (11), a weaker alternative might be considered:

$$H_0 \cup H_1: \phi_1 \ge 0,\ \phi_2 \ge 0,\ \ldots,\ \text{and } \phi_M \ge 0,\quad \phi \ne 0.$$

In this case the associated LR-type statistic can be formulated as:

$$T_{LRT\_I} = P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi:\ I\phi \ge 0} P\,(\bar f^{adj} - \phi)'\,\hat v^{-1}\,(\bar f^{adj} - \phi)$$

where $\hat v$ is defined in (13) and $I$ is an $M \times M$ identity matrix.

3.2.2 Special Case 2: Non-Nested Alternative Models

In this case there is no nesting relation between the alternative models, but each of them still nests the benchmark. We test

$$H_0: \phi = 0$$

against

$$H_0 \cup H_1: \phi_1 \ge 0,\ \ldots,\ \text{or } \phi_M \ge 0.$$

Denote by $A$ the region such that

$$A = \{\phi: \phi_1 \ge 0, \ldots, \text{ or } \phi_M \ge 0\}, \qquad A_m = \{\phi: \phi_m \ge 0\}.$$

In this case, the likelihood ratio is given by:

$$T_{LRT}^{\max} = \max\left\{P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi \in A_1} P\,(\bar f^{adj}-\phi)'\,\hat v^{-1}\,(\bar f^{adj}-\phi),\ \ldots,\ P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi \in A_M} P\,(\bar f^{adj}-\phi)'\,\hat v^{-1}\,(\bar f^{adj}-\phi)\right\}$$

3.2.3 General Case: Alternative Models Nested Within Groups

Now we consider a general case. Suppose that the alternative models can be grouped according to the following relations: within each group the models are nested; however, across different groups the models are not nested. In particular, consider $K$ groups such that within each group $G_k$: $\phi_{k,1} \le \phi_{k,2} \le \cdots \le \phi_{k,M_k}$, with $M_k$ the number of models included in group $k$.
Then, for each group $k$, define the set $A_k$ as:

$$A_k = \left\{\phi:\ 0 \le \phi_{k,1} \le \phi_{k,2} \le \cdots \le \phi_{k,M_k}\right\}.$$

For this case we propose a likelihood-ratio type test that combines the two tests outlined in the previous sections:

$$T_{LRT}^{\max K} = \max\left\{P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi \in A_1} P\,(\bar f^{adj}-\phi)'\,\hat v^{-1}\,(\bar f^{adj}-\phi),\ \ldots,\ P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj} - \min_{\phi \in A_K} P\,(\bar f^{adj}-\phi)'\,\hat v^{-1}\,(\bar f^{adj}-\phi)\right\}$$

where now

$$A_k = \left\{\phi \in \mathbb{R}^M : \tilde D_k \phi \ge 0\right\} \quad \text{for } k = 1, \ldots, K,$$

and $\tilde D_k$ applies the ordering restrictions only to the elements of $\phi$ belonging to group $k$, with zero blocks elsewhere:

$$\tilde D_k = \begin{bmatrix} 0 & 0 & 0 \\ 0 & D_k & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad D_k = \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 \\ -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix},$$

with $\tilde D_k$ of dimension $M \times M$ and $D_k$ of dimension $M_k \times M_k$; the rows and columns containing $D_k$ correspond to models $\sum_{j=1}^{k-1}M_j + 1$ through $\sum_{j=1}^{k}M_j$, and the zero blocks are conformable.

3.2.4 Alternative Tests

We consider two alternative forecast accuracy tests for nested multi-model comparison proposed in the existing literature: the first one is a chi-squared test, originally designed for a bi-model comparison in CW and extended to multivariate comparison by HW; the second one is the correlated normals test, proposed by HW.

CW consider a Wald-type test involving the statistic $T_{chi} = P\,\bar f^{adj\prime}\,\hat v^{-1}\,\bar f^{adj}$. As discussed above, it focuses on the adjusted MSPE in order to center the statistic around zero. This test does not take into account the one-sided nature of the alternative and is therefore expected to have low power. HW exploit the one-sided nature of the alternative and adopt as test statistic

$$T_{maxt} = \max_{1\le m\le M}\left[\sqrt{P}\,\bar f_m^{adj}\big/\sqrt{\hat v_m}\right],$$

which is the maximum of t-statistics, with $\hat v_m$ the sample variance of $\hat f_m^{adj}$: $\hat v_m = P^{-1}\sum_{t=R}^{T}\big(\hat f_{m,t+1}^{adj} - \bar f_m^{adj}\big)^2$ and $\bar f_m^{adj} = P^{-1}\sum_{t=R}^{T}\hat f_{m,t+1}^{adj}$.

3.3 Computation of Critical Values

In this section we discuss how to compute critical values for the test statistics outlined in the previous sections.

The limiting distributions of the tests described above rely on the distribution of the adjusted MSPE.
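For concreteness, the two benchmark statistics of section 3.2.4 can be computed directly from the matrix of adjusted SPE differentials. This sketch assumes that matrix is already available; the function name is illustrative.

```python
import numpy as np

def hw_statistics(F, P):
    """F: P x M matrix of adjusted SPE differentials (one column per
    alternative model). Returns the Wald-type T_chi and the max-t T_maxt."""
    fbar = F.mean(axis=0)                   # sample adjusted MSPE vector
    dev = F - fbar
    v = dev.T @ dev / P                     # sample covariance of f_adj
    t_chi = P * fbar @ np.linalg.inv(v) @ fbar
    t_maxt = np.max(np.sqrt(P) * fbar / np.sqrt(np.diag(v)))
    return t_chi, t_maxt
```

The same `F` and `v` would also feed the constrained minimizations defining the likelihood-ratio type statistics.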
Observe that the adjusted MSPE can be rewritten for each model m as:

f̄^adj_m = 2 P⁻¹ Σ_{t=R}^{T} û_{0,t+1} (û_{0,t+1} − û_{m,t+1}).

According to Clark and McCracken (2001) the limiting distribution of f̄^adj_m/√(v̂_m) is non-normal under the null when the two models are nested if lim_{P,R→∞} P/R = π, π > 0, i.e. if the size of the estimation sample grows at the same rate as the out-of-sample. They show that the asymptotic distribution is a functional of Brownian motion which depends on the number of excess parameters in model m, k_m, on π, the large-sample limit of the ratio between the out-of-sample and in-sample size, and on the estimation scheme used (expanding window or rolling).

However, simulation experiments show that for one-step-ahead forecasts and homoskedastic prediction errors, applying standard normal inference to f̄^adj_m/√(v̂_m) leads to only slightly undersized tests. The standard normal approximation also performs reasonably well in a heteroskedastic environment when the number of additional regressors, k_m, is equal to one. This finding is confirmed by the simulations in CW, which find an empirical size between 0.05 and 0.1 for a 10% nominal size, for both heteroskedastic and homoskedastic forecast errors, for both the expanding-window and rolling estimation schemes, and for values of π ranging between one third and six.

They also compare the performance of the test when using simulated or bootstrapped critical values rather than asymptotic normal critical values, and they do not find substantial size or power improvements. This is taken as justification for the assumption of normality of f̄^adj_m/√(v̂_m) in HW, which extends the work of CW to a multi-model comparison setting.

Based on the above discussion, one may conjecture that the limiting distribution of the likelihood ratio would also be a functional of Brownian motion, whose distribution is hard to compute. Because of this we propose two alternative approaches to derive critical values: (i) under the asymptotic normality assumption; (ii) through subsampling.
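The rewriting of the adjusted MSPE at the start of this section rests on the algebraic identity û_0² − û_m² + (ŷ_0 − ŷ_m)² = 2 û_0 (û_0 − û_m). The sketch below, with made-up arrays, computes the adjusted differentials from the Clark-West definition and checks the identity numerically:

```python
import numpy as np

def cw_adjusted_differentials(y, yhat0, yhatm):
    """Clark-West adjusted loss differentials for one alternative model:
    u0^2 - um^2 + (yhat0 - yhatm)^2, which equals 2*u0*(u0 - um)."""
    u0, um = y - yhat0, y - yhatm
    return u0**2 - um**2 + (yhat0 - yhatm)**2

rng = np.random.default_rng(1)
y = rng.standard_normal(100)
yhat0 = np.zeros(100)                      # benchmark forecasts (placeholder)
yhatm = 0.1 * rng.standard_normal(100)     # alternative forecasts (placeholder)
f_adj = cw_adjusted_differentials(y, yhat0, yhatm)

u0, um = y - yhat0, y - yhatm
assert np.allclose(f_adj, 2 * u0 * (u0 - um))   # the rewriting above
f_bar_adj = f_adj.mean()                        # = 2 P^{-1} sum u0*(u0 - um)
```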
When f^adj_t ∼ N(0, v), the limiting distribution of T_LRT under the null is known to be a mixture of independent chi-squared distributions: T_LRT ⇒ ω_1 χ²_1 + … + ω_M χ²_M, where M, the number of alternative models considered, is the size of the vector θ^adj and ω_m = ω_m(M, v) is the probability that exactly m of the M components are strictly positive (see Perlman, 1969). For a given significance level α, the test rejects the null when T_LRT > c_α, where c_α is such that α = Pr[ω_1 χ²_1 + … + ω_M χ²_M ≥ c_α].

Critical values c_α, necessary to evaluate T_LRT, can be derived through Monte Carlo simulations once a consistent estimator for v is obtained. In the presence of homoskedastic and uncorrelated forecast errors, as is the case for one-step-ahead forecasts, a consistent estimator for v is given simply by the sample covariance

v̂ = P⁻¹ Σ_{t=R}^{T} (f̂^adj_{t+1} − f̄^adj)(f̂^adj_{t+1} − f̄^adj)′.

For the T_chi statistic the asymptotic normality assumption implies that inference can be conducted through critical values from a chi-squared distribution with M degrees of freedom, χ²_M; while for Hubrich and West the critical value needed to evaluate T_max-t for a given nominal size α can be found by solving ∫_{−∞}^{c(α)} g_z(z) dz = 1 − α, where g_z(z) denotes the density of the largest of M standard normal variables with correlation ρ̂.

When the assumption of normality is discarded, the critical values can be derived through subsampling of the data. The subsampling procedure is consistent under weak convergence conditions on the subsampled statistic and when B/P → 0, where B is the selected block size. The test is constructed by computing the test statistic on the blocks of data {f̂^adj_t, …, f̂^adj_{t+B−1}}, t = R+1, …, T−B+2. The critical values of the test are obtained as the (1−α) quantile of the subsampled distribution. Details on how to construct critical values with these two methods are given in Section 3.4.

3.4 Monte Carlo Simulation

We now outline in detail the experimental design for the Monte Carlo simulation and the procedures used to obtain the critical values.
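As an illustration of the Monte Carlo route to c_α, the sketch below simulates the chi-bar-squared mixture for T_LRT_I in the special case of a diagonal v̂, where the restricted minimizer is simply the positive part of the draw (for a non-diagonal v̂ the inner minimization is a quadratic program). The diagonal assumption and all names are ours:

```python
import numpy as np

def lrt_i_critical_value(v_hat_diag, alpha=0.10, draws=100_000, seed=1):
    """Simulate the (1-alpha) quantile of sum_m max(Z_m, 0)^2 / v_m,
    the null distribution of T_LRT_I when Z ~ N(0, diag(v_hat_diag))."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((draws, len(v_hat_diag))) * np.sqrt(v_hat_diag)
    stats = np.sum(np.maximum(z, 0.0) ** 2 / v_hat_diag, axis=1)
    return np.quantile(stats, 1.0 - alpha)

# M = 2 independent components: mixture weights (1/4, 1/2, 1/4) on
# chi2_0, chi2_1, chi2_2, so the 10% critical value is roughly 2.95
c_10 = lrt_i_critical_value(np.array([1.0, 1.0]))
```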
We present two sets of experiments: one simple design which has a general formulation and can be applied to many empirical studies, and one more specific design, suited to the evaluation of forecasts for inflation. The evaluation of the tests is implemented with critical values derived through simulations for three settings: first based on the assumption of normality of the adjusted MSPE, second through subsampling of the data, and third using the exact critical values.

The first part of the simulation exercise requires the design of the DGP for the size and the power experiments.

In the first design the process chosen as DGP for the size experiment is a first-order autoregressive process of the form:

y_t = c + ρ y_{t−1} + ε_t                                          (14)

with c = 1, ρ = 0.2 and ε_t ∼ N(0,1). The process chosen as DGP for the power experiment is the following:

y_t = c + ρ y_{t−1} + β x_{1t} + ε_t                               (15)

with c = 1, ρ = 0.2, ε_t ∼ N(0,1). The experiment is repeated for three different values of β = (0.2, 0.1, 0.05). The exogenous variable x_{1t} is determined by:

x_{1t} = a + u_{1t}                                                (16)

with a = 1, u_{1t} ∼ N(0,1) and u_{1t} ⊥ ε_t. Next we need to select the regression models. The model used as benchmark is the following:

M_0: y_t = c + ρ y_{t−1} + η_t                                     (17)

There are M = 2 alternative models of the form:

M_m: y_t = c_m + ρ_m y_{t−1} + β′_m x_{(m),t} + η_{m,t},   m = 1, 2,   (18)

where x_{(m),t} collects the first m exogenous regressors and the extra regressor x_{2t} is generated from:

x_{2t} = a + u_{2t}                                                (19)

with u_{2t} ⊥ ε_t and u_{2t} ⊥ u_{1t}. Model 2 then nests not only the benchmark but also model 1; this is equivalent to the scenario analyzed in the first case presented in Section 3.2.

In the design of the second experiment we follow HW and assume the series y_t is an aggregate variable obtained as the sum of a small number of components: y_t = Σ_{l=1}^{L} x_{l,t}. The disaggregate variables follow a VAR(1) process:

x_t = a + Φ x_{t−1} + ε_t

with x_t = (x_{1,t}, …, x_{l,t}, …, x_{L,t})′, a an L×1 vector of constants and ε_t ∼ N(0, I). The aggregate is the sum of L = 3 disaggregate series, and a is a vector of ones. We consider the following regression models.
The model used as benchmark includes a constant and an autoregressive component of order one:

M_0: y_t = c + ρ y_{t−1} + η_t.                                    (20)

The two alternative models are of the form:

M_1: y_t = c_1 + ρ_1 y_{t−1} + β_1 x_{1,t} + η_{1,t}               (21)
M_2: y_t = c_2 + ρ_2 y_{t−1} + β_1 x_{1,t} + β_2 x_{2,t} + η_{2,t} (22)

Again, both model 1 and model 2 nest the benchmark; moreover, model 1 can be obtained from model 2 by setting β_2 = 0. The estimates are carried out through OLS with a rolling scheme, such that each estimation sample has the same size R = {40, 100, 200, 400}. The forecasts are produced for horizon h = 1 and the size of the out-of-sample is P = {40, 100, 200, 400}.

The VAR(1) regression coefficient varies between the size and the power experiment; for the size experiment we consider the components to be independent:

Φ = [ 0.5   0     0
      0     0.5   0
      0     0     0.5 ],

while for the power experiment we allow for interactions between the components:

Φ = [ 0.5   0.6   0
      0.4   0.3   0
      0     0     0.5 ].                                           (23)

Note that in the size experiment the aggregate process is an AR(1) process with autoregressive parameter ρ = 0.5 and constant c = 3, while in the power experiment the aggregate is an ARMA(3,2) process. To proceed with the forecast evaluation, the quantity f̂^adj_{m,t+1} is computed for each model and stacked in the vector f̂^adj_{t+1} = (f̂^adj_{1,t+1}, …, f̂^adj_{m,t+1}, …, f̂^adj_{M,t+1})′. The sample average of f̂^adj_{t+1} is denoted f̄^adj. The forecast evaluation is carried out for three different tests: chi-squared, correlated normals and likelihood ratio.

The chi-squared test considers the statistic:

T_chi = P f̄^adj′ v̂⁻¹ f̄^adj                                        (24)

where v̂ = P⁻¹ Σ_{t=R}^{T} (f̂^adj_{t+1} − f̄^adj)(f̂^adj_{t+1} − f̄^adj)′. The statistic for the correlated normals test is given by:

T_max-t = max[ P^{1/2} f̄^adj_1/√(v̂_1), …, P^{1/2} f̄^adj_M/√(v̂_M) ]   (25)

The statistic for the likelihood ratio test is given by:

T_LRT = P f̄^adj′ v̂⁻¹ f̄^adj − min_{Λθ≥0} P (f̄^adj − θ)′ v̂⁻¹ (f̄^adj − θ)

where Λ can be the matrix D, the matrix D_k, or the identity matrix, depending on the structure of the alternative models.
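On the data side, the first design's size DGP (14) and the rolling one-step-ahead OLS forecasts from the benchmark (17) can be sketched as follows (illustrative names; a sketch, not the simulation code used in this chapter):

```python
import numpy as np

def simulate_ar1(n, c=1.0, rho=0.2, seed=0):
    """Size DGP (14): y_t = c + rho*y_{t-1} + eps_t, eps_t ~ N(0,1)."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    y[0] = c / (1.0 - rho)               # start at the unconditional mean
    for t in range(1, n):
        y[t] = c + rho * y[t - 1] + rng.standard_normal()
    return y

def rolling_ar1_forecasts(y, R):
    """One-step-ahead forecasts of y[t], t = R..n-1, each estimated by OLS
    on the R-1 most recent (y_{s-1}, y_s) pairs (rolling scheme)."""
    preds = []
    for t in range(R, len(y)):
        Y = y[t - R + 1:t]                                   # y_{t-R+1},...,y_{t-1}
        X = np.column_stack([np.ones(R - 1), y[t - R:t - 1]])
        b = np.linalg.lstsq(X, Y, rcond=None)[0]
        preds.append(b[0] + b[1] * y[t - 1])
    return np.array(preds)

y = simulate_ar1(300)
preds = rolling_ar1_forecasts(y, R=100)
errors = y[100:] - preds          # P = 200 out-of-sample forecast errors
mse = np.mean(errors**2)          # close to the innovation variance (1)
```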
As discussed above, the adjusted MSPE does not have a normal limiting distribution. However, following Clark and McCracken (2001), CW and HW, we first derive critical values under the normality assumption for the adjusted MSPE. Then we conduct two further experiments to obtain the critical values: one is based on subsampling; the other simulates the exact distribution of the statistic.

Under the normality assumption, the critical values for T_chi are given by a χ²_M distribution, with M being the number of alternative models. For T_max-t the critical values depend on the correlation matrix of the forecast errors and need to be simulated. The procedure works as follows: first the sample correlation matrix of the f^adj vector, ρ̂, is estimated. Then a vector of size M is drawn from a multivariate normal with zero mean and sample correlation ρ̂, and the maximum element of this vector is selected. The experiment is repeated d times. The (1−α) percentile of the simulated distribution is the α-percent critical value.

For T_LRT the critical values are generated as follows: first a vector Z of size M×1 is generated from a multivariate normal distribution with mean zero and variance-covariance matrix v̂: Z ∼ N(0, v̂), with v̂, as defined above, being the sample covariance matrix of the M×P matrix of prediction errors from the models of interest. Next the LRT statistic is computed by solving Z′ v̂⁻¹ Z − min_{Λθ≥0} (Z − θ)′ v̂⁻¹ (Z − θ). We label this LRT_D when imposing a nested structure on the alternative models, or LRT_I when no structure is imposed on the alternative models. The procedure is repeated d times and the critical value c_α is derived as the (1−α)-th quantile of the simulated distribution.

The subsampling method approximates the sampling distribution of the test statistic by recomputing it over subsamples of smaller size. The procedure, outlined in Politis et al. (1999), is based on blocks of B consecutive observations; the first block includes observations 1 through B, the last one observations P−B+1 through P.
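This block scheme, together with the quantile step, can be sketched as follows (illustrative names; T_chi is used as the recomputed statistic, but any of the three statistics could be plugged in):

```python
import numpy as np

def t_chi(block):
    """Chi-squared-type statistic computed on one (b, M) block."""
    b = block.shape[0]
    f_bar = block.mean(axis=0)
    dev = block - f_bar
    v_hat = dev.T @ dev / b
    return b * f_bar @ np.linalg.inv(v_hat) @ f_bar

def subsample_critical_value(f_adj, B, stat_fn, alpha=0.10):
    """(1-alpha) quantile of stat_fn over the P-B+1 overlapping blocks of size B."""
    P = f_adj.shape[0]
    stats = [stat_fn(f_adj[s:s + B]) for s in range(P - B + 1)]
    return np.quantile(stats, 1.0 - alpha)

rng = np.random.default_rng(2)
f_adj = rng.standard_normal((200, 2))        # placeholder differentials
cv = subsample_critical_value(f_adj, B=40, stat_fn=t_chi)
```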
For each subsample the test statistics T_chi, T_max-t and T_LRT are computed, and the evaluation of T_chi, T_max-t and T_LRT is performed through the quantiles of the subsampled statistics. Note that the choice of the block size is such that B/P → 0, and it determines the number of subsamples, which is equal to P−B+1. In our experiment we consider various block sizes, with B/P ratios ranging from 0.1 to 0.7.

The simulation results are shown in Tables 3 through 6. Tables 3 and 4 provide results for the size-adjusted power at a 10 percent significance level for experiments 1 and 2 respectively.

Table 3. Exact Power, Experiment One

R    test   P=40   P=100  P=200  P=400 | R    test   P=40   P=100  P=200  P=400
40   chi2   0.212  0.348  0.526  0.776 | 200  chi2   0.238  0.452  0.738  0.938
     CN     0.378  0.557  0.728  0.905 |      CN     0.506  0.758  0.922  0.988
     LRT_I  0.320  0.479  0.664  0.866 |      LRT_I  0.386  0.651  0.861  0.974
     LRT_D  0.316  0.490  0.678  0.875 |      LRT_D  0.399  0.664  0.876  0.980
100  chi2   0.225  0.435  0.653  0.902 | 400  chi2   0.258  0.496  0.766  0.959
     CN     0.468  0.692  0.855  0.970 |      CN     0.530  0.791  0.947  0.993
     LRT_I  0.368  0.606  0.797  0.949 |      LRT_I  0.402  0.657  0.883  0.983
     LRT_D  0.372  0.623  0.812  0.956 |      LRT_D  0.402  0.682  0.903  0.988

Table 4. Exact Power, Experiment Two

R    test   P=40   P=100  P=200  P=400 | R    test   P=40   P=100  P=200  P=400
40   chi2   0.551  0.873  0.987  0.999 | 200  chi2   0.655  0.964  0.999  1
     CN     0.693  0.918  0.993  1     |      CN     0.832  0.992  1      1
     LRT_I  0.652  0.920  0.995  1     |      LRT_I  0.776  0.987  1      1
     LRT_D  0.727  0.943  0.997  1     |      LRT_D  0.842  0.994  1      1
100  chi2   0.643  0.945  0.997  1     | 400  chi2   0.646  0.968  1      1
     CN     0.814  0.984  0.999  1     |      CN     0.831  0.993  1      1
     LRT_I  0.765  0.976  0.999  1     |      LRT_I  0.777  0.990  1      1
     LRT_D  0.837  0.987  1      1     |      LRT_D  0.853  0.994  1      1

Table 5.
Size and Unadjusted Power, Experiment Two: Normality

R    test   P=40          P=100         P=200         P=400
            size   power  size   power  size   power  size   power
40   chi2   0.116  0.584  0.103  0.875  0.110  0.988  0.112  1
     CN     0.092  0.673  0.081  0.904  0.099  0.993  0.100  1
     LRT_I  0.099  0.650  0.080  0.907  0.091  0.994  0.094  1
     LRT_D  0.083  0.686  0.067  0.921  0.080  0.996  0.082  1
100  chi2   0.120  0.678  0.110  0.950  0.105  0.997  0.103  1
     CN     0.091  0.798  0.077  0.975  0.079  0.999  0.077  1
     LRT_I  0.094  0.755  0.080  0.968  0.076  0.999  0.078  1
     LRT_D  0.078  0.796  0.062  0.976  0.058  0.999  0.057  1
200  chi2   0.122  0.694  0.115  0.968  0.109  0.999  0.098  1
     CN     0.098  0.829  0.082  0.989  0.072  1      0.071  1
     LRT_I  0.104  0.784  0.087  0.984  0.074  1      0.070  1
     LRT_D  0.088  0.822  0.065  0.988  0.056  1      0.052  1
400  chi2   0.131  0.701  0.116  0.973  0.104  1      0.101  1
     CN     0.104  0.837  0.088  0.992  0.076  1      0.071  1
     LRT_I  0.109  0.794  0.086  0.987  0.077  1      0.070  1
     LRT_D  0.088  0.830  0.067  0.991  0.060  1      0.052  1

Table 5 reports the size and the unadjusted power for the aggregation experiment under the assumption of normality of the limiting distribution of the statistics. Table 6 shows the size and power properties obtained through subsampling for different values of the block size B.

Table 6. Size and Unadjusted Power, Experiment Two: Subsampling

test    B=7           B=13          B=16          B=19
        size   power  size   power  size   power  size   power
chi2    0.010  0.118  0.101  0.443  0.166  0.583  0.230  0.674
CN      0.104  0.709  0.155  0.811  0.185  0.848  0.213  0.871
LRT_I   0.017  0.184  0.067  0.508  0.102  0.628  0.131  0.707
LRT_D   0.019  0.205  0.059  0.533  0.078  0.638  0.102  0.717

In the aggregation experiment under the normality assumption the CN and the LRT_I tests are slightly undersized in most cases, and the ranking of the tests depends on the particular combination of R and P considered. The LRT_D performs poorly with respect to size, its empirical size being at most 0.088 for a 10 percent nominal size. The chi-squared test is almost always oversized, as also found in Hubrich and West. For all the tests the empirical size gets closer to the nominal size as the ratio P/R decreases.
This is consistent with the theoretical framework of Clark and McCracken (2001): the normal approximation holds asymptotically if the ratio P/R approaches zero as the total sample size increases and if rolling estimation is used. When the critical values are derived through subsampling of the statistic, the size properties are found to be sensitive to the block size: the CN test is correctly sized when the block size is small, the LRT performs best for larger values of the block size B, and the chi-squared for intermediate values. The results provided in Table 6 are derived for R=400 and P=40. Simulations for different values of R and P provide similar results. We report results for this particular combination of R and P because it is consistent with the size of the samples in the empirical applications.

For all tests the power increases with the size of the out-of-sample for a given in-sample size, and it almost always increases with the in-sample size for a given out-of-sample. The performance of the chi-squared test is overall disappointing: it always ranks last, and for the first experiment with critical values derived from the exact distribution the power of the chi-squared test is roughly half the power of the competitor tests for P=40. For tests performed under the normality assumption the correlated normals and the LRT_D tests have comparable performance, but in the exact power exercise for experiment 2 the LRT_D outperforms the correlated normals. For experiment 1 the correlated normals test ranks first in the exact power exercise, while the LRT_D and the LRT_I are almost equivalent.

When the critical values are derived through subsampling, the power increases with the block size for each test. To compare power across different tests, we need to take into account that for different block sizes the tests exhibit very different empirical sizes.
Hence, for each test we choose the block size that minimizes the size distortion (B=7 for CN, B=13 for chi-squared, B=16 for LRT_I and B=19 for LRT_D) and we compare the power of the tests across those different block sizes. Then the ranking of the tests is the same as in the case in which the critical values are computed under the assumption of normality: the chi-squared test performs the worst, LRT_I ranks second, while LRT_D and CN improve substantially over the other two tests.

3.5 Forecasting US Inflation

In this section we apply our test to the evaluation of equal predictive ability for forecasting the US inflation rate. The variable of interest is yearly US CPI inflation, all items, which is the weighted average of four components: services, commodities, food and energy. We consider two different estimation samples: the first includes the observations 1959:01-1978:12, the second spans 1984:01 through 2002:12. The remaining years (1979-1983 for the first sample, and 2003-2007 for the second sample) are used for forecast evaluation. Inflation exhibits very different characteristics over the two periods: in the first it is very high and volatile, while in the second it is much more stable and has a lower mean. This led us to split the data as described above, as the different behavior is possibly due to parameter instability, which is ruled out in our framework. In both samples the ratio of out-of-sample to in-sample size is about one fourth. For similar values of the in-sample (R=228 or 240) and of the out-of-sample (P=60 in both samples), the simulation results summarized in Table 5 suggest that the tests are well behaved in terms of size and power.

The models we consider are an AR(1) benchmark:

M_0: y_t = c + ρ y_{t−1} + η_t,

and two alternatives including a lag of the dependent variable plus the food component for model 1, and plus the food and energy components for model 2. Consistently with the simulation settings, both models nest the benchmark and the second alternative model nests the first one.
M_1: y_t = c_1 + ρ_1 y_{t−1} + φ_{1F} x_{F,t−1} + η_{1,t}
M_2: y_t = c_2 + ρ_2 y_{t−1} + φ_{2F} x_{F,t−1} + φ_{2E} x_{E,t−1} + η_{2,t}

Food and energy are the most volatile components and are excluded from the computation of core inflation. Here we ask whether those two components have any additional predictive ability over a model that includes only aggregate inflation. Again, the estimation technique is OLS with a rolling scheme, applied to the monthly inflation rate, and the forecast horizon of interest is one. Once the monthly inflation rate forecasts are obtained, they are transformed into yearly rates.

Figures 13 and 14 plot the actual series and the forecasts obtained from the benchmark and alternative models for the two forecast samples considered.

[Figure 13. Actual and Forecasts for US Inflation, 1979:01-1983:12]

The figures clearly show the different behavior of the series in the two subsamples, in particular the higher mean and persistence of the series in the first sample.

[Figure 14. Actual and Forecasts for US Inflation, 2003:01-2007:12]

We ran the chi-squared, the correlated normals and the likelihood ratio test for nested alternative models; the critical values are obtained first assuming normality of the expected value of the adjusted loss differentials and then through subsampling. The test results for the 10 percent level of significance are shown in Table 7 for normality and Table 8 for subsampling.

For the case in which the critical values are derived under the assumption of normality, all tests fail to reject the null of equal predictive accuracy, both for the great moderation period (the second sample) and for the first sample. This is in contrast with the results in HW, where the tests reject the null in the first sample, implying that in times of high inflation and volatility the information embedded in the components is important for forecasting the aggregate. However, the HW framework includes four alternative models, while ours incorporates only two; moreover, HW also conduct two distinct bivariate comparisons, one for the benchmark against an alternative model including food and a second one for the benchmark against an alternative model including energy.
In both of these bivariate comparisons the hypothesis of equal predictive ability is not rejected.

Table 7. Test of Equal Forecast Accuracy for US Inflation: Normality

est. 59:01-78:12, for. 79:01-83:12
             chi2    CN      LRT
test stat    0.488   -0.037  0.092
cv (10%)     4.605   1.596   3.00

est. 84:01-02:12, for. 03:01-07:12
             chi2    CN      LRT
test stat    3.797   -1.555  0
cv (10%)     4.605   1.613   3.107

For the case in which subsampling is used to obtain the critical values, the results are consistent across the different block sizes used (B = 10, 15, 20, 25): as in the case in which the critical values are obtained under the normality assumption, all tests fail to reject the null of equal predictive ability. The only exception is the chi-squared test for B=15 and B=20 in the second sample. The critical values differ greatly according to the block size.

Table 8. Test of Equal Forecast Accuracy for US Inflation: Subsampling

est. 59:01-78:12 (R=240), for. 79:01-83:12 (P=60)
       test stat   critical value (size = 10%)
                   B=10    B=15    B=20    B=25
chi2   0.488       9.303   6.545   3.821   4.442
CN     -0.037      3.345   2.366   1.664   1.801
LRT    0.092       7.964   4.250   2.760   3.271

est. 84:01-02:12 (R=228), for. 03:01-07:12 (P=60)
       test stat   critical value (size = 10%)
                   B=10    B=15    B=20    B=25
chi2   3.796       3.843   3.398   3.246   3.905
CN     -1.555      1.869   1.748   1.323   1.523
LRT    0           9.469   5.339   3.827   3.914

3.6 Conclusions

In this chapter we introduced a likelihood-ratio type predictability test for the comparison of a small number of models nested within each other. We distinguished among three cases according to the structure of the alternative models considered: a general case, in which the models can be grouped such that within each group the models are nested but there is no nesting relation among groups, and two extreme cases, one in which the models are all nested within each other and one in which there is no nesting relation between the models. We evaluated the size and power properties of the test via Monte Carlo simulations for one-step-ahead forecasts, under different assumptions on the limiting distributions of the statistics and for two simulation settings.
The Monte Carlo investigation reveals that the chi-squared test performs poorly in terms of power, as it disregards the one-sided nature of the test, while the ranking between the likelihood-ratio type test and the correlated normals test depends on the simulation framework. The normal approximation of the vector of MSPE differences, assumed in previous studies of multi-model comparison, proves to be reasonable for P/R going to zero, as found in CW and earlier in Clark and McCracken (2001, 2005). Using subsampling to conduct inference on the MSPE proves to be too sensitive to the block size, but confirms that under the simulation settings we adopted the LRT provides more power. Nevertheless, there is no uniformly most powerful test for one-sided multiple testing, so we acknowledge that the relative performance of the LRT and the HW test depends on the parameterization of the Monte Carlo experiment.

Last, we applied the tests to the analysis of US aggregate inflation and found that the yearly rate of inflation can be better forecast by a parsimonious model not only in the great moderation period, but also in the earlier, more volatile period.

References

Andrews, D.W.K. (2000), 'Inconsistency of the Bootstrap When the Parameter is on the Boundary of the Parameter Space', Econometrica, vol. 68, n. 2.

Ashley, R., C.W.J. Granger and R. Schmalensee (1980), 'Advertising and Aggregate Consumption: an Analysis of Causality', Econometrica, vol. 48, n. 5.

Blanchard, O.J. and J. Simon (2000), 'The Long and Large Decline in U.S. Output Volatility', Brookings Papers on Economic Activity.

Bray, M. and N. Savin (1986), 'Rational Expectations, Learning and Model Specification', Econometrica, 1129-1160.

Bullard, J. (1992), 'Time-Varying non Convergence to Rational Expectations Equilibrium under Least Squares Learning', Economics Letters, 40, 159-166.

Bullard, J. and J. Suda (2008), 'The Stability of Macroeconomic Systems with Bayesian Learners', Federal Reserve Bank of St. Louis, w.p. 2008-043B.

Bullard, J. and K.
Mitra (2002), 'Learning about Monetary Policy Rules', Journal of Monetary Economics, 49, 1105-1129.

Canova, F. and L. Gambetti (2004), 'On Time Variations of U.S. Monetary Policy: Who is Right?', Money Macro and Finance (MMF) Research Group Conference, 96.

Chen, S.S. (2005), 'A Note on In-Sample and Out-of-Sample Tests for Granger Causality', Journal of Forecasting, vol. 24, 435-464.

Clark, T.E. and M.W. McCracken (2001), 'Tests of Equal Forecast Accuracy and Encompassing for Nested Models', Journal of Econometrics, vol. 105, 85-110.

Clark, T.E. and K. West (2006), 'Using Out-of-Sample Mean Squared Prediction Errors to Test the Martingale Difference Hypothesis', Journal of Econometrics, vol. 135, 155-186.

Clark, T.E. and K. West (2007), 'Approximately Normal Tests for Equal Predictive Accuracy in Nested Models', Journal of Econometrics, vol. 138, 291-311.

Cogley, T., G.E. Primiceri and T.J. Sargent (2008), 'Inflation-Gap Persistence in the U.S.', NBER w.p. 13749.

Cogley, T. and T.J. Sargent (2001), 'Evolving Post-WWII U.S. Inflation Dynamics', NBER Macroeconomics Annual.

Cogley, T. and T.J. Sargent (2005), 'Drifts and Volatilities: Monetary Policies and Outcomes in the Post-WWII U.S.', Review of Economic Dynamics, 8, 262-302.

Cogley, T. and A. Sbordone (2008), 'Trend Inflation and Inflation Persistence in the New Keynesian Phillips Curve', American Economic Review, 98, 2101-2126.

Del Negro, M. and F. Schorfheide (2004), 'Priors from General Equilibrium Models for VARs', International Economic Review, 45(2), 190-217.

Diebold, F.X. and R.S. Mariano (1995), 'Comparing Predictive Accuracy', Journal of Business and Economic Statistics, vol. 13, 253-263.

Evans, G. and S. Honkapohja (2008), 'Expectations, Learning and Monetary Policy: an Overview of Recent Research', CDMA Working Paper Series.

Hubrich, K. and K.D. West (2008), 'Forecast Evaluation of Small Nested Model Sets', forthcoming, Journal of Applied Econometrics.

Inoue, A. and L. Kilian (2004), 'In-Sample or Out-of-Sample Tests of Predictability: Which One Should We Use?', Econometric Reviews, vol. 23, 371-402.

Koop, G.M. and S.
Potter (2008), 'Time Varying VARs with Inequality Restrictions', Working Papers, University of Strathclyde.

McGough, B. (2003), 'Statistical Learning with Time-Varying Parameters', Macroeconomic Dynamics, 7, 119-139.

Milani, F. (2006), 'The Evolution of the Fed's Inflation Target in an Estimated Model under RE and Learning', mimeo.

Milani, F. (2008), 'Learning, Monetary Policy Rules, and Macroeconomic Stability', Journal of Economic Dynamics and Control, 32, 3148-3165.

Perlman, M.D. (1969), 'One-Sided Testing Problems in Multivariate Analysis', The Annals of Mathematical Statistics, vol. 40, 549-567.

Politis, D. and J. Romano (1994), 'The Stationary Bootstrap', Journal of the American Statistical Association, vol. 89, 1303-1313.

Primiceri, G.E. (2005), 'Time Varying Structural Vector Autoregressions and Monetary Policy', The Review of Economic Studies, 72, 821-852.

Sargent, T.J. and N. Williams (2005), 'Impacts of Priors on Convergence and Escapes from Nash Inflation', Review of Economic Dynamics, 8(2), 360-391.

Sims, C.A. (1986), 'Are Forecasting Models Usable for Policy Analysis?', Minneapolis Federal Reserve Bank Quarterly Review, 2-16.

Sims, C.A. (1992), 'Interpreting the Macroeconomic Time Series Facts: the Effects of Monetary Policy', European Economic Review, 36, 975-1011.

Slobodyan, S. and R. Wouters (2008), 'Estimating a Medium-Scale DSGE Model with Expectations Based on Small Forecasting Models', mimeo.

Stock, J.H. and M.W. Watson (2003), 'Has the Business Cycle Changed and Why?', NBER Macroeconomics Annual, 17.

Stock, J.H. and M.W. Watson (2007), 'Has Inflation Become Harder to Forecast?', Journal of Money, Credit and Banking, 39, 3-34.

Uhlig, H. (2005), 'What Are the Effects of Monetary Policy on Output? Results from an Agnostic Identification Procedure', Journal of Monetary Economics, 52, 381-419.

West, K.D. (1996), 'Asymptotic Inference about Predictive Ability', Econometrica, vol. 64, 1067-1084.

White, H. (2000), 'A Reality Check for Data Snooping', Econometrica, vol. 68, 1097-1126.
Abstract
This dissertation collects three essays on empirical monetary economics.