STATISTICAL INFERENCE FOR SECOND ORDER ORDINARY DIFFERENTIAL EQUATION DRIVEN BY ADDITIVE GAUSSIAN WHITE NOISE

by Jian Wang

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS)

August 2018

Copyright 2018 Jian Wang

Dedication

To my parents, for their love and support.

Acknowledgments

First, I would like to express my sincere gratitude to my advisor, Prof. Lototsky, for his continuous support throughout my years in graduate school, and for his patience and immense knowledge. His guidance helped me throughout the research and steered me in the right direction. Besides my advisor, I would like to thank the rest of my oral and thesis committee: Prof. Lv, Prof. Mikulevicius, Prof. Zhang and Prof. Minsker, for their insightful comments. These comments widened my research from various perspectives, and discussions with them helped me overcome obstacles during the research. Finally, I would like to express my gratitude to my parents and other family members, for supporting me spiritually and in life in general. This accomplishment would not have been possible without them. Thank you.

Table of Contents

Dedication
Acknowledgments
Abstract
List of Symbols
Chapter 1: Introduction
Chapter 2: Classification
Chapter 3: Roots Outside the Unit Circle
Chapter 4: Conjugate Roots on the Unit Circle
  4.1 Estimation without Damping
  4.2 Simultaneous and Sequential Double Asymptotics
    4.2.1 Simultaneous Double Asymptotics
    4.2.2 T goes to infinity, then h goes to 0
    4.2.3 h goes to 0, then T goes to infinity
  4.3 Test for Damping
    4.3.1 h goes to 0, then T goes to infinity
  4.4 Summary
Chapter 5: Double Roots on the Unit Circle
Chapter 6: Distinct Roots Inside the Unit Circle
Chapter 7: Double Roots Inside the Unit Circle
Chapter 8: Mixed Roots
Chapter 9: Comparison to Continuous Time Model
  9.1 Ergodic Case
  9.2 Distinct Real Roots
  9.3 Double Roots
  9.4 Conjugate Roots
Chapter 10: Summary
Chapter 11: Appendix
Bibliography

Abstract

This thesis studies the statistical inference problem for a second order linear stochastic ordinary differential equation observed at discrete times. All cases of the equation are analyzed; in every case the limiting distributions of the estimation errors are obtained and compared with the corresponding continuous-time results. The thesis also discusses simultaneous and sequential double asymptotics, which serve as a bridge between the discrete and the continuous-time model.
List of Symbols

B: matrix; W₁, W₂: Brownian motions; X: underlying process studied; Y, Z: processes; g₁, g₂: solutions to the ODEs associated with the problem; h: sampling interval; r: parameter related to θ₁, θ₂; θ₁, θ₂: parameters; α_h, β_h: parameters related to θ₁, θ₂; σ: scale of the Brownian motion driving the ODE; ξ, η: white noises; σ_h: scale of noise; ϕ_h: parameter; λ₁, λ₂: eigenvalues of the matrix B; γ_h(0), γ_h(1): autocovariance functions; φ(z): characteristic polynomial.

Chapter 1: Introduction

The purpose of this thesis is to study the estimation of the parameters θ₁ and θ₂ of the stochastic ODE

Ẍ(t) = θ₁Ẋ(t) + θ₂X(t) + σẆ(t), t > 0, X(0) = Ẋ(0) = 0, (1.1)

from discrete observations, where W(t) is a standard Brownian motion. This is an important model for a broad range of problems in mechanics and electric circuit theory; for example, when θ₁ = 0 and θ₂ < 0 the process is an undamped harmonic oscillator. Lin and Lototsky (2011) studied the estimation of the parameters of the undamped harmonic oscillator from continuous observations of both {X(t)}_{t≥0} and {Ẋ(t)}_{t≥0}, but the assumption of continuous observation is often unrealistic; moreover, in most cases Ẋ(t) cannot be observed directly at all. This thesis overcomes these two obstacles by building a statistical inference theory on discrete observations of the process {X(t)}_{t≥0} only. All cases of arbitrary θ₁ and θ₂ are considered, and the limiting distributions are derived and compared with the corresponding continuous-time results.

By variation of parameters, the solution of (1.1) can be written out explicitly (see the Appendix for details):

X(t) = σ∫₀ᵗ g₁(t−s)dW(s), (1.2)

where g₁ solves the initial value problem

g̈₁ − θ₁ġ₁ − θ₂g₁ = 0, g₁(0) = 0, ġ₁(0) = 1.

To discretize X(t), another function g₂ with the following initial conditions is introduced:

g̈₂ − θ₁ġ₂ − θ₂g₂ = 0, g₂(0) = 1, ġ₂(0) = 0.

Set r = √|θ₂ + θ₁²/4|. The solutions of the two linear ODEs are

g₁(t) = (sin(rt)/r)e^{θ₁t/2} if −θ₂ > θ₁²/4; te^{θ₁t/2} if −θ₂ = θ₁²/4; (sinh(rt)/r)e^{θ₁t/2} if −θ₂ < θ₁²/4;

g₂(t) = [cos(rt) − (θ₁/2r)sin(rt)]e^{θ₁t/2} if −θ₂ > θ₁²/4; [1 − θ₁t/2]e^{θ₁t/2} if −θ₂ = θ₁²/4; [cosh(rt) − (θ₁/2r)sinh(rt)]e^{θ₁t/2} if −θ₂ < θ₁²/4.

Suppose the process is sampled at equal intervals of length h, so the sampling times are t_k = kh; write X_k = X(t_k) and Ẋ_k = Ẋ(t_k). The solution (1.2) is discretized as

X(t) = X_k g₂(t−t_k) + Ẋ_k g₁(t−t_k) + σ∫_{t_k}^t g₁(t−s)dW(s).

Plugging in t = t_{k+1} and t = t_{k−1}, this becomes

X(t_{k+1}) = X_k g₂(h) + Ẋ_k g₁(h) + σ∫_{t_k}^{t_{k+1}} g₁((k+1)h−s)dW(s), (1.3)
X(t_{k−1}) = X_k g₂(−h) + Ẋ_k g₁(−h) + σ∫_{t_k}^{t_{k−1}} g₁((k−1)h−s)dW(s). (1.4)

For h ≠ 0, g₁(h) = 0 is possible only when θ₁²/4 + θ₂ < 0. For those specific values of h, (1.3) becomes an AR(1) model, which has been studied thoroughly and is not discussed here [see Brockwell and Davis (1991)]. For the nontrivial situation in which the sampling interval h is chosen so that g₁(h) ≠ 0, combining (1.3) and (1.4) and eliminating the Ẋ_k term yields the model

X_{k+1} = α_h X_k + β_h X_{k−1} + ξ_{k+1}, k ≥ 1, (1.5)

in which

α_h = g₂(h) − (g₁(h)/g₁(−h))g₂(−h), β_h = g₁(h)/g₁(−h). (1.6)
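The three regimes in (1.6) are easy to mishandle numerically. The following short sketch (Python; hypothetical helper names, not part of the thesis) computes g₁, g₂ and the exact-discretization coefficients α_h, β_h for given θ₁, θ₂, h, covering all three sign cases of θ₁²/4 + θ₂:

```python
import numpy as np

def g1_g2(t, theta1, theta2):
    """Solutions of g'' - theta1 g' - theta2 g = 0 with
    g1(0)=0, g1'(0)=1 and g2(0)=1, g2'(0)=0 (cf. Chapter 1)."""
    disc = theta1**2 / 4 + theta2          # its sign selects the regime
    e = np.exp(theta1 * t / 2)
    if disc < 0:                           # -theta2 > theta1^2/4: oscillatory
        r = np.sqrt(-disc)
        return np.sin(r*t)/r * e, (np.cos(r*t) - theta1/(2*r)*np.sin(r*t)) * e
    elif disc == 0:                        # double root (exact float equality: sketch only)
        return t * e, (1 - theta1*t/2) * e
    else:                                  # -theta2 < theta1^2/4: hyperbolic
        r = np.sqrt(disc)
        return np.sinh(r*t)/r * e, (np.cosh(r*t) - theta1/(2*r)*np.sinh(r*t)) * e

def discretization_coeffs(theta1, theta2, h):
    """alpha_h and beta_h of (1.5)-(1.6); requires g1(h) != 0."""
    g1h, g2h = g1_g2(h, theta1, theta2)
    g1mh, g2mh = g1_g2(-h, theta1, theta2)
    beta = g1h / g1mh
    alpha = g2h - beta * g2mh
    return alpha, beta

# Undamped oscillator theta1=0, theta2=-1, h=0.1: alpha_h = 2cos(rh), beta_h = -1
print(discretization_coeffs(0.0, -1.0, 0.1), 2*np.cos(0.1), -1.0)
```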
The initial conditions are X₀ = 0 and X₁ = ξ₁ = σ∫₀ʰ g₁(h−s)dW(s). In general, the noises have a one-step dependence, as can be seen from

ξ_{k+1} = σ[∫_{t_k}^{t_{k+1}} g₁(t_{k+1}−s)dW(s) − β_h∫_{t_{k−1}}^{t_k} g₁(t_{k−1}−s)dW(s)]. (1.7)

For n ≥ m, the autocovariance function of the noises is

Eξ_nξ_m = 0 if n−m ≥ 2;
Eξ_nξ_m = −σ²β_h∫_{−h}^0 g₁(s)g₁(s+h)ds =: γ_h(1) if n−m = 1;
Eξ_nξ_m = σ²(∫₀ʰ g₁²(s)ds + β_h²∫_{−h}^0 g₁²(s)ds) =: γ_h(0) if n−m = 0.

By Wold's decomposition theorem [Theorem 42 of Pesaran (2015)], ξ_n can be represented as η_n + ϕ_hη_{n−1} with η_n ~ i.i.d. N(0, σ_h²). The values of ϕ_h and σ_h satisfy

Eξ_n² = (1+ϕ_h²)σ_h², Eξ_nξ_{n−1} = ϕ_hσ_h².

Solving these equations,

σ_h = (√(γ_h(0)+2γ_h(1)) + √(γ_h(0)−2γ_h(1)))/2,
ϕ_h = (√(γ_h(0)+2γ_h(1)) − √(γ_h(0)−2γ_h(1)))/(√(γ_h(0)+2γ_h(1)) + √(γ_h(0)−2γ_h(1))).

By the Cauchy–Schwarz inequality,

γ_h(1) = −β_h∫_{−h}^0 g₁(s)g₁(s+h)ds ≤ [∫_{−h}^0 β_h²g₁²(s)ds · ∫_{−h}^0 g₁²(s+h)ds]^{1/2}
= [∫_{−h}^0 β_h²g₁²(s)ds · ∫₀ʰ g₁²(s)ds]^{1/2} ≤ (1/2)[∫_{−h}^0 β_h²g₁²(s)ds + ∫₀ʰ g₁²(s)ds] = γ_h(0)/2,

and, considering the three possible forms of g₁(t), the first inequality is in fact strict. Thus σ_h > 0 and |ϕ_h| < 1. Then (1.5) can be rewritten as

X_{k+1} = α_hX_k + β_hX_{k−1} + η_{k+1} + ϕ_hη_k, k ≥ 1. (1.8)

For ϕ_h ≠ 0 this is an ARMA(2,1) model; the rest of the thesis works with (1.8). Recall from (1.6) that α_h and β_h are functions of θ₁ and θ₂. Instead of estimating θ₁ and θ₂ directly, this thesis estimates α_h and β_h from the discrete observations {X_k}_{k≥0}. Three types of estimators are introduced below.

The first type is the least squares estimators, derived by minimizing

Σ_{k=1}^n (X_{k+1} − α_hX_k − β_hX_{k−1})². (1.9)

Taking partial derivatives with respect to α_h and β_h,

∂/∂α_h Σ_{k=1}^n (X_{k+1} − α_hX_k − β_hX_{k−1})² = 2α_hΣX_k² + 2β_hΣX_{k−1}X_k − 2ΣX_kX_{k+1},
∂/∂β_h Σ_{k=1}^n (X_{k+1} − α_hX_k − β_hX_{k−1})² = 2α_hΣX_{k−1}X_k + 2β_hΣX_{k−1}² − 2ΣX_{k−1}X_{k+1},

and setting both to 0 gives the estimators

(α̂_h, β̂_h)ᵀ = [Σ_{k=1}^n (X_k, X_{k−1})ᵀ(X_k, X_{k−1})]^{−1} Σ_{k=1}^n (X_k, X_{k−1})ᵀX_{k+1}. (1.10)

Multiplying (1.5) by X_k and X_{k−1} leads to

X_kX_{k+1} = X_k(α_hX_k + β_hX_{k−1} + ξ_{k+1}), X_{k−1}X_{k+1} = X_{k−1}(α_hX_k + β_hX_{k−1} + ξ_{k+1}),

so the true values of α_h and β_h satisfy

(α_h, β_h)ᵀ = [Σ_{k=1}^n (X_k, X_{k−1})ᵀ(X_k, X_{k−1})]^{−1} Σ_{k=1}^n (X_k, X_{k−1})ᵀ(X_{k+1} − ξ_{k+1}),

and the estimation errors are

(α̂_h−α_h, β̂_h−β_h)ᵀ = [Σ_{k=1}^n (X_k, X_{k−1})ᵀ(X_k, X_{k−1})]^{−1} Σ_{k=1}^n (X_k, X_{k−1})ᵀξ_{k+1}. (1.11)

Set 𝐗_{k+1} = (X_{k+1}, X_k)ᵀ, 𝛏_{k+1} = (ξ_{k+1}, 0)ᵀ and B = [α_h, β_h; 1, 0]; then the matrix form of (1.5) is

𝐗_{k+1} = B𝐗_k + 𝛏_{k+1}. (1.12)

In matrix form, the error terms of the least squares estimator are

(α̂_h−α_h, β̂_h−β_h)ᵀ = (Σ_{k=1}^n 𝐗_k𝐗_kᵀ)^{−1}(Σ_{k=1}^n 𝐗_{k−1}𝛏_kᵀ). (1.13)
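As a quick sanity check of (1.8) and (1.10), the following sketch (hypothetical, not part of the thesis) simulates the ARMA(2,1) recursion for a stable choice of (α_h, β_h, ϕ_h) and computes the least squares estimator. For correlated noise (ϕ_h ≠ 0) the estimate does not converge to the truth; Theorem 3.4 quantifies this bias.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, alpha, beta, phi, sigma_h):
    """X_{k+1} = alpha X_k + beta X_{k-1} + eta_{k+1} + phi eta_k, model (1.8)."""
    eta = rng.normal(0.0, sigma_h, n + 1)
    X = np.zeros(n + 1)
    X[1] = eta[1]                        # X_0 = 0, X_1 = xi_1
    for k in range(1, n):
        X[k+1] = alpha*X[k] + beta*X[k-1] + eta[k+1] + phi*eta[k]
    return X

def lse(X):
    """Least squares estimator (1.10) of (alpha_h, beta_h)."""
    Y, Z1, Z2 = X[2:], X[1:-1], X[:-2]   # X_{k+1}, X_k, X_{k-1}
    G = np.array([[Z1 @ Z1, Z1 @ Z2], [Z1 @ Z2, Z2 @ Z2]])
    b = np.array([Z1 @ Y, Z2 @ Y])
    return np.linalg.solve(G, b)

X = simulate(200_000, alpha=0.9, beta=-0.5, phi=-0.27, sigma_h=1.0)
print(lse(X))   # close to, but (for phi != 0) not exactly, (0.9, -0.5)
```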
The second type of estimator can be derived from the log-likelihood function

l(α_h, β_h | X₀, X₁, ..., X_n) = ln L(α_h, β_h | X₀, ..., X_n) = ln p(X₀, ..., X_n; α_h, β_h)
= ln[p(X_n | X_{n−1}, ..., X₀) ··· p(X₂ | X₁, X₀)p(X₁ | X₀)], (1.14)

where p(·) is the (conditional) probability density. The problem is that if X̃_n = α_hX_{n−1} + β_hX_{n−2} is used to predict X_n, then

X_n − X̃_n = η_n + ϕ_hη_{n−1}, X_{n−1} − X̃_{n−1} = η_{n−1} + ϕ_hη_{n−2}.

There is a one-step overlap of the noises: the prediction errors X_n − X̃_n and X_{n−1} − X̃_{n−1} are not independent, and the log-likelihood function cannot be written out easily. The Durbin–Levinson algorithm and the innovations algorithm [Brockwell and Davis (1991), Propositions 5.2.1 and 5.2.2] give a way of dealing with this problem. Instead of predicting X_n by α_hX_{n−1} + β_hX_{n−2}, use the predictor

X̂_n = α_hX_{n−1} + β_hX_{n−2} + ϕ_h(X_{n−1} − X̂_{n−1}), starting with X̂₀ = 0,

so that

X₁ = η₁, X̂₁ = 0; X₂ = α_hX₁ + β_hX₀ + η₂ + ϕ_hη₁, X̂₂ = α_hX₁ + β_hX₀ + ϕ_hη₁; and so on.

In this way X_n − X̂_n = η_n, so the prediction errors are independent, and from the recursion it is clear that X̂_n ∈ F_{n−1}. Writing it out explicitly,

X̂_i = [(α_h+ϕ_h)X_{i−1} + β_hX_{i−2}] − Σ_{j=1}^{i−2} ϕ_hʲ[(α_h+ϕ_h)X_{i−1−j} + β_hX_{i−2−j}]. (1.15)

Then p(X_n | X_{n−1}, ..., X₀) = (2πσ_h²)^{−1/2}exp{−(X_n−X̂_n)²/(2σ_h²)}, and the log-likelihood (1.14) becomes

l(α_h, β_h | X₀, ..., X_n) = −(n/2)ln(2πσ_h²) − (1/(2σ_h²))Σ_{i=1}^n (X_i − X̂_i)².

The log-likelihood is maximized by minimizing Σ_{i=1}^n (X_i − X̂_i)² with respect to α_h, β_h, and the estimators satisfy

M(α̂_h, β̂_h)ᵀ = Σ_{i=2}^n (M₁(i−1), M₁(i−2))ᵀ(X_i − ϕ_hX_{i−1} + Σ_{j=1}^{i−2}ϕ_h^{j+1}X_{i−1−j}), (1.16)

where the filtered variables and the matrix M are defined by

M₁(i−1) = X_{i−1} − Σ_{j=1}^{i−2}ϕ_hʲX_{i−1−j}, M₁(i−2) = X_{i−2} − Σ_{j=1}^{i−2}ϕ_hʲX_{i−2−j},
M = Σ_{i=2}^n [M₁²(i−1), M₁(i−1)M₁(i−2); M₁(i−1)M₁(i−2), M₁²(i−2)].

The true values satisfy

M(α_h, β_h)ᵀ = Σ_{i=2}^n (M₁(i−1), M₁(i−2))ᵀ(X_i − ϕ_hX_{i−1} + Σ_{j=1}^{i−2}ϕ_h^{j+1}X_{i−1−j} − η_i), (1.17)

so from (1.16) and (1.17) the estimation errors are

(α̂_h−α_h, β̂_h−β_h)ᵀ = M^{−1}Σ_{i=2}^n (M₁(i−1), M₁(i−2))ᵀη_i. (1.18)

To sum up, the goal is to estimate the parameters of (1.1) from discrete observations. After discretizing the equation, three types of estimators were introduced: the least squares estimators, the log-likelihood estimators, and the Durbin–Levinson type estimators. The rest of the thesis studies only the least squares estimators, because they can be written out explicitly, which makes the limiting distributions of the estimation errors, the main focus of this thesis, easier to derive. The rest of the thesis is arranged as follows: Chapter 2 classifies all the possible cases; Chapters 3 through 8 treat each case separately and obtain all the limiting distributions of the estimation errors; Chapter 9 compares the results with those for the CAR(2) model discussed in Lin and Lototsky (2014); Chapter 10 concludes.

Chapter 2: Classification

Let L be the lag operator; model (1.8) can be written as

(1 − α_hL − β_hL²)X_t = (1 + ϕ_hL)η_t.

The characteristic polynomial of the autoregressive part is φ(z) = 1 − α_hz − β_hz². The asymptotic behaviour of the least squares estimators (α̂_h, β̂_h) depends on the positions of the roots of φ(z) relative to the unit circle. As discussed in the last chapter, the solutions g₁ and g₂ take three different forms depending on the sign of θ₁²/4 + θ₂, which means (α_h, β_h) come in three different types.
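Locating the roots of φ(z) numerically is mechanical given (α_h, β_h); before the formal classification below, a small sketch (hypothetical, not from the thesis) that reports the position of the roots relative to the unit circle:

```python
import numpy as np

def classify_roots(alpha, beta, tol=1e-10):
    """Position of the roots of phi(z) = 1 - alpha z - beta z^2
    relative to the unit circle (cf. Chapter 2)."""
    # np.roots takes coefficients from the highest degree: -beta z^2 - alpha z + 1
    roots = np.roots([-beta, -alpha, 1.0]) if abs(beta) > tol else np.array([1.0/alpha])
    where = ["inside" if abs(z) < 1 - tol else
             "on" if abs(z) < 1 + tol else "outside" for z in roots]
    return roots, where

# Undamped oscillator, h=0.1: alpha = 2cos(0.1), beta = -1, conjugate roots on the circle
print(classify_roots(2*np.cos(0.1), -1.0))
```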
The model is therefore classified first into three main categories according to the sign of θ₁²/4 + θ₂, and then into types according to the positions of the roots of φ(z) relative to the unit circle.

Category I. For θ₁²/4 + θ₂ > 0, g₁(t) = (sinh(rt)/r)e^{θ₁t/2}, so g₁(t) ≠ 0 for t ≠ 0 and β_h = g₁(h)/g₁(−h), and hence α_h, is well defined. The model is

X_{k+1} = 2cosh(rh)e^{θ₁h/2}X_k − e^{θ₁h}X_{k−1} + ξ_{k+1}. (2.1)

The roots of φ(z) are [cosh(rh) ± sinh(rh)]e^{−θ₁h/2}, i.e. e^{(±r−θ₁/2)h}; both roots are real and positive. Since (r−θ₁/2)(−r−θ₁/2) = −θ₂, the sign of θ₂ determines the positions of the roots relative to 1, so three subcategories based on the sign of θ₂ are introduced.

1. If θ₂ = 0, the roots are 1 and e^{−θ₁h}. Notice that in this category θ₁²/4 + θ₂ > 0, so θ₁ cannot be 0; depending on its sign there are two cases:
I-1: θ₁ > 0, θ₂ = 0; one root at 1, the other less than 1.
I-2: θ₁ < 0, θ₂ = 0; one root at 1, the other greater than 1.
2. If θ₂ < 0, then ±r − θ₁/2 have the same sign, so the two roots are both greater than or both less than 1:
I-3: θ₁ > 0, θ₂ < 0, θ₁²/4 + θ₂ > 0; distinct roots both less than 1.
I-4: θ₁ < 0, θ₂ < 0, θ₁²/4 + θ₂ > 0; distinct roots both greater than 1.
3. If θ₂ > 0, then ±r − θ₁/2 have opposite signs:
I-5: any θ₁, θ₂ > 0, θ₁²/4 + θ₂ > 0; one root less than 1, the other greater than 1.

The autocovariance formula for this category when θ₁ ≠ 0 is

Eξ_nξ_m = 0 if n−m ≥ 2;
Eξ_nξ_m = −σ²β_h[4r sinh(hθ₁/2)cosh(rh) − 2θ₁cosh(hθ₁/2)sinh(rh)]/(θ₁³r − 4θ₁r³) if n−m = 1;
Eξ_nξ_m = σ²[e^{θ₁h}(θ₁²cosh(2rh) − θ₁² − 2rθ₁sinh(2rh) + 4r²) − 4r²]/(2θ₁³r² − 8θ₁r⁴)
+ σ²β_h²[e^{−θ₁h}(−θ₁²cosh(2rh) + θ₁² − 2rθ₁sinh(2rh) − 4r²) + 4r²]/(2θ₁³r² − 8θ₁r⁴) if n−m = 0. (2.2)

For θ₁ = 0, the autocovariance is

Eξ_nξ_m = 0 if n−m ≥ 2; −σ²β_h[sinh(rh) − rh cosh(rh)]/(2r³) if n−m = 1; σ²(1+β_h²)[sinh(2rh) − 2rh]/(4r³) if n−m = 0. (2.3)

Category II. For θ₁²/4 + θ₂ = 0, g₁(t) = te^{θ₁t/2}, so for h ≠ 0, α_h is always well defined. The model is

X_{k+1} = 2e^{θ₁h/2}X_k − e^{θ₁h}X_{k−1} + ξ_{k+1}. (2.4)

The characteristic polynomial has a double root, e^{−θ₁h/2}, which depends only on the sign of θ₁ (and θ₂ = −θ₁²/4 is determined by θ₁):

II-1: θ₁ < 0, θ₂ < 0, θ₁²/4 + θ₂ = 0; double root greater than 1.
II-2: θ₁ > 0, θ₂ < 0, θ₁²/4 + θ₂ = 0; double root less than 1.
II-3: θ₁ = 0, θ₂ = 0, θ₁²/4 + θ₂ = 0; double root at 1.

For this category, the autocovariance formula when θ₁ ≠ 0 is

Eξ_nξ_m = 0 if n−m ≥ 2;
Eξ_nξ_m = −σ²β_h[4sinh(hθ₁/2) − 2θ₁h cosh(hθ₁/2)]/θ₁³ if n−m = 1;
Eξ_nξ_m = σ²[e^{θ₁h}(θ₁h(θ₁h−2)+2) − 2]/θ₁³ + σ²β_h²[2 − e^{−θ₁h}(θ₁h(θ₁h+2)+2)]/θ₁³ if n−m = 0, (2.5)

and for θ₁ = 0 it is

Eξ_nξ_m = 0 if n−m ≥ 2; σ²β_h h³/6 if n−m = 1; σ²(1+β_h²)h³/3 if n−m = 0. (2.6)

Category III. The third category comes from the situation θ₁²/4 + θ₂ < 0. Here g₁(t) = (sin(rt)/r)e^{θ₁t/2}, so it is possible that g₁(h) = 0 for some h; for those values of h, α_h = g₁(h)/g₁(−h)-based elimination is not available. In fact, when g₁(h) = 0, recall (1.3): the discretization itself gives an AR(1) model, so no elimination of Ẋ_k is needed, and neither is α_h.

III-0: For g₁(h) = 0, i.e. rh = mπ, m ∈ Z, the model is

X_{k+1} = cos(rh)e^{θ₁h/2}X_k + ξ_{k+1}. (2.7)
The root of the characteristic polynomial is [cos(rh)e^{θ₁h/2}]^{−1}, and its magnitude is determined by the sign of θ₁.

If θ₁ < 0, θ₂ < 0, θ₁²/4 + θ₂ < 0, the magnitude of the root of the characteristic polynomial is greater than 1:
III-0-1: rh = 2mπ; the model is X_{k+1} = e^{θ₁h/2}X_k + ξ_{k+1}.
III-0-2: rh = 2mπ + π; the model is X_{k+1} = −e^{θ₁h/2}X_k + ξ_{k+1}.
If θ₁ = 0, θ₂ < 0, θ₁²/4 + θ₂ < 0, the magnitude of the root equals 1:
III-0-3: rh = 2mπ; the model is X_{k+1} = X_k + ξ_{k+1}.
III-0-4: rh = 2mπ + π; the model is X_{k+1} = −X_k + ξ_{k+1}.
If θ₁ > 0, θ₂ < 0, θ₁²/4 + θ₂ < 0, the magnitude of the root is less than 1:
III-0-5: rh = 2mπ; the model is X_{k+1} = e^{θ₁h/2}X_k + ξ_{k+1}.
III-0-6: rh = 2mπ + π; the model is X_{k+1} = −e^{θ₁h/2}X_k + ξ_{k+1}.

All six cases are AR(1) models and have been thoroughly studied [see Hamilton (1994)]; they are not considered here.

Now consider the more general situation θ₁²/4 + θ₂ < 0 and rh ≠ mπ. The model is

X_{k+1} = 2cos(rh)e^{θ₁h/2}X_k − e^{θ₁h}X_{k−1} + ξ_{k+1}. (2.8)

The roots are [cos(rh) ± i sin(rh)]e^{−θ₁h/2}; they are conjugate, and their magnitude is determined by the sign of θ₁:

III-1: rh ≠ mπ, θ₁ < 0, θ₂ < 0, θ₁²/4 + θ₂ < 0; the roots are outside the unit circle.
III-2: rh ≠ mπ, θ₁ = 0, θ₂ < 0, θ₁²/4 + θ₂ < 0; the roots are on the unit circle and not equal to ±1.
III-3: rh ≠ mπ, θ₁ > 0, θ₂ < 0, θ₁²/4 + θ₂ < 0; the roots are inside the unit circle.

For types III-1, 2, 3, the autocovariance formula for θ₁ = 0 reduces to

Eξ_nξ_m = 0 if n−m ≥ 2; −σ²β_h[rh cos(rh) − sin(rh)]/(2r³) if n−m = 1; σ²(1+β_h²)[2rh − sin(2rh)]/(4r³) if n−m = 0, (2.9)

and the general autocovariance formula for θ₁ ≠ 0 is

Eξ_nξ_m = 0 if n−m ≥ 2;
Eξ_nξ_m = σ²β_h e^{−hθ₁/2}[θ₁(e^{hθ₁}+1)sin(rh) − 2r(e^{hθ₁}−1)cos(rh)]/(rθ₁³ + 4θ₁r³) if n−m = 1;
Eξ_nξ_m = σ²[e^{θ₁h}(θ₁² − θ₁²cos(2rh) − 2θ₁r sin(2rh) + 4r²) − 4r²]/(2r²θ₁³ + 8θ₁r⁴)
+ σ²β_h²[e^{−θ₁h}(−θ₁² + θ₁²cos(2rh) − 2θ₁r sin(2rh) − 4r²) + 4r²]/(2r²θ₁³ + 8θ₁r⁴) if n−m = 0. (2.10)

The models considered in this thesis are ARMA(2,1) models if ϕ_h ≠ 0, and AR(2) models if ϕ_h = 0; the estimation of (α_h, β_h) in the AR(2) case is obtained by simply setting ϕ_h = 0 in the ARMA(2,1) results. This thesis studies the problem of estimating (α_h, β_h) from discrete observations {X_k}_{k≥0} of the process {X(t)}_{t≥0}. The rest of the thesis is arranged as follows. Chapter 3 deals with the ergodic case, in which both roots of the characteristic polynomial are outside the unit circle; it covers cases I-4, II-1 and III-1. Chapter 4 deals with case III-2, conjugate roots on the unit circle. Chapter 5 deals with case II-3, a double root at 1. Chapter 6 deals with cases I-3 and III-3, distinct roots inside the unit circle. Chapter 7 deals with case II-2, a double root inside the unit circle. Chapter 8 deals with the mixed-root cases: Section 1 treats case I-2 (one root at 1, the other greater than 1); Section 2 treats case I-1 (one root at 1, the other less than 1); Section 3 treats case I-5 (one root less than 1, the other greater than 1). Chapter 9 summarizes and compares the results with the CAR(2) model studied by Lin and Lototsky (2014).
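The closed forms (2.9)–(2.10) come from integrating the defining expressions of Chapter 1. A quick numerical cross-check of the θ₁ = 0 case (2.9) by quadrature of γ_h(0) = σ²(∫₀ʰg₁² + β_h²∫_{−h}^0 g₁²) and γ_h(1) = −σ²β_h∫_{−h}^0 g₁(s)g₁(s+h)ds (hypothetical sketch, not part of the thesis):

```python
import numpy as np
from scipy.integrate import quad

sigma, r, h = 1.0, 2.0, 0.3          # theta1 = 0, theta2 = -r^2, so beta_h = -1
beta = -1.0
g1 = lambda t: np.sin(r*t)/r         # g1 for theta1 = 0, theta2 < 0

gamma0_num = sigma**2*(quad(lambda s: g1(s)**2, 0, h)[0]
                       + beta**2*quad(lambda s: g1(s)**2, -h, 0)[0])
gamma1_num = -sigma**2*beta*quad(lambda s: g1(s)*g1(s+h), -h, 0)[0]

# closed forms (2.9)
gamma0 = sigma**2*(1+beta**2)*(2*r*h - np.sin(2*r*h))/(4*r**3)
gamma1 = -sigma**2*beta*(r*h*np.cos(r*h) - np.sin(r*h))/(2*r**3)
print(gamma0_num, gamma0, gamma1_num, gamma1)   # each pair should agree
```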
Chapter 3: Roots Outside the Unit Circle

This chapter considers the situation in which both roots of the characteristic polynomial have modulus greater than 1. This includes three cases: I-4, distinct roots outside the unit circle; II-1, a double root outside the unit circle; and III-1, conjugate roots outside the unit circle. The asymptotic distributions for the three cases are the same, and combining the restrictions on θ₁ and θ₂, they amount to θ₁ < 0, θ₂ < 0.

Recall (1.8): the model is

X_{k+1} = α_hX_k + β_hX_{k−1} + η_{k+1} + ϕ_hη_k, k ≥ 1, η_k ~ N(0, σ_h²), (3.1)

or, in the matrix form (1.12), 𝐗_{k+1} = B𝐗_k + 𝛏_{k+1}. The estimation errors, from (1.11), are

(α̂_h−α_h, β̂_h−β_h)ᵀ = [ΣX_k², ΣX_kX_{k−1}; ΣX_kX_{k−1}, ΣX_{k−1}²]^{−1}(ΣX_kξ_{k+1}, ΣX_{k−1}ξ_{k+1})ᵀ. (3.2)

Using the lag operator, (3.1) can be rewritten as

(1 − α_hL − β_hL²)X_k = (1 + ϕ_hL)η_k. (3.3)

By the computations of Chapter 1, (1 − α_hz − β_hz²) and (1 + ϕ_hz) do not vanish for |z| ≤ 1, so the model is causal and invertible, hence strictly stationary and ergodic. The following lemma on functionals of an ergodic process [Taniguchi and Kakizawa (2000), Theorem 1.3.3] is introduced.

Lemma 3.1. Suppose a process {X_t, t ∈ Z} is strictly stationary and ergodic and f: R^∞ → R^k is measurable. Define Y_t = f(X_t, X_{t−1}, ...). Then {Y_t, t ∈ Z} is strictly stationary and ergodic. [See also Brockwell and Davis (1991), Theorems 3.1.1 and 3.1.2.]

This lemma is used to prove that the first matrix on the right side of (3.2) converges to a matrix with finite entries.

Theorem 3.2. (1/n)Σ_{k=1}^n X_k² and (1/n)Σ_{k=1}^n X_kX_{k−1} converge a.s. to finite limits.

Proof. By Lemma 3.1, X_k² and X_kX_{k−1} are strictly stationary and ergodic. Multiplying both sides of (3.1) by X_{k+1}, X_k and X_{k−1}, replacing X_{k+1} by (3.1) where needed, and taking expectations gives

EX_{k+1}X_{k+1} = E[α_hX_kX_{k+1} + β_hX_{k−1}X_{k+1} + η_{k+1}X_{k+1} + ϕ_hη_kX_{k+1}],
EX_{k+1}X_k = E[α_hX_kX_k + β_hX_{k−1}X_k + η_{k+1}X_k + ϕ_hη_kX_k], (3.4)
EX_{k+1}X_{k−1} = E[α_hX_kX_{k−1} + β_hX_{k−1}X_{k−1} + η_{k+1}X_{k−1} + ϕ_hη_kX_{k−1}].

Set γ(h) = E[X_kX_{k−h}], h ≥ 0. Since X_{k−1} and ξ_k are independent, EX_{k−h}η_k = 0 for h ≥ 1. Thus the two terms EX_kη_k and EX_{k+1}η_k, needed in (3.4), can be calculated:

EX_kη_k = E[(α_hX_{k−1}+β_hX_{k−2}+η_k+ϕ_hη_{k−1})η_k] = σ_h²,
EX_{k+1}η_k = E[(α_hX_k+β_hX_{k−1}+η_{k+1}+ϕ_hη_k)η_k] = α_hEX_kη_k + ϕ_hσ_h² = σ_h²(α_h+ϕ_h).

Plugging these into (3.4) leads to

[1, −α_h, −β_h; −α_h, 1−β_h, 0; −β_h, −α_h, 1](γ(0), γ(1), γ(2))ᵀ = (σ_h²(1+α_hϕ_h+ϕ_h²), σ_h²ϕ_h, 0)ᵀ.

Inverting the matrix on the left gives

(γ(0), γ(1), γ(2))ᵀ = {1/((1+β_h)[(1−β_h)²−α_h²])}
· [1−β_h, α_h(1+β_h), β_h(1−β_h); α_h, 1−β_h², α_hβ_h; α_h²+β_h(1−β_h), α_h(1+β_h), 1−α_h²−β_h](σ_h²(1+α_hϕ_h+ϕ_h²), σ_h²ϕ_h, 0)ᵀ.

By Lemma 3.1,

(1/n)Σ_{k=1}^n X_k² → γ(0), (1/n)Σ_{k=1}^n X_kX_{k−1} → γ(1) a.s. (3.5) ∎

From Theorem 3.2,

(1/n)[ΣX_k², ΣX_kX_{k−1}; ΣX_kX_{k−1}, ΣX_{k−1}²] → H₁ = [γ(0), γ(1); γ(1), γ(0)] a.s.,

and H₁ can be written as

H₁ = {1/((1+β_h)[(1−β_h)²−α_h²])}[(1−β_h)γ_h(0)+2α_hγ_h(1), α_hγ_h(0)+(1+α_h²−β_h²)γ_h(1); α_hγ_h(0)+(1+α_h²−β_h²)γ_h(1), (1−β_h)γ_h(0)+2α_hγ_h(1)].

This shows that the matrix H₁ depends only on α_h, β_h and the autocovariance of the noise {ξ_k}_{k≥0}.
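The 3×3 linear system in the proof of Theorem 3.2 is more convenient to solve numerically than through the displayed inverse. A sketch (hypothetical) computing γ(0), γ(1), γ(2) and H₁ from (α_h, β_h, ϕ_h, σ_h):

```python
import numpy as np

def stationary_autocov(alpha, beta, phi, sigma_h):
    """Solve the Yule-Walker-type system from the proof of Theorem 3.2
    for gamma(0), gamma(1), gamma(2) of the ARMA(2,1) model (3.1)."""
    A = np.array([[1.0,   -alpha,      -beta],
                  [-alpha, 1.0 - beta,  0.0],
                  [-beta, -alpha,       1.0]])
    b = sigma_h**2 * np.array([1 + alpha*phi + phi**2, phi, 0.0])
    g0, g1, g2 = np.linalg.solve(A, b)
    H1 = np.array([[g0, g1], [g1, g0]])
    return (g0, g1, g2), H1

print(stationary_autocov(0.9, -0.5, -0.27, 1.0)[1])
```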
For the other term Σ_{k=1}^n 𝐗_{k−1}ξ_k in (3.2), the idea is to replace ξ_k by η_k + ϕ_hη_{k−1} and extract a martingale difference sequence:

Σ_{k=1}^n 𝐗_{k−1}ξ_k = Σ_{k=1}^n ((α_hX_{k−2}+β_hX_{k−3}+η_{k−1}+ϕ_hη_{k−2})(η_k+ϕ_hη_{k−1}); X_{k−2}(η_k+ϕ_hη_{k−1}))
= (Σ_{k=1}^{n−1}[ϕ_hα_hX_{k−1} + (α_h+ϕ_hβ_h)X_{k−2} + β_hX_{k−3}]η_k + Σ_{k=1}^n ξ_{k−1}ξ_k + (α_hX_{n−2}+β_hX_{n−3})η_n;
Σ_{k=1}^{n−1}(ϕ_hX_{k−1}+X_{k−2})η_k + X_{n−2}η_n). (3.6)

Theorem 3.3. (1/√n)[Σ_{k=1}^n 𝐗_{k−1}ξ_k − (Σ_{k=1}^n ξ_kξ_{k−1}, 0)ᵀ] converges in distribution to a normal law.

Proof. The idea is to use the characteristic triplet of a process to prove convergence in distribution. Define

Y_n := (Σ_{k=1}^{n−1}{[α_hϕ_hX_{k−1}+(α_h+ϕ_hβ_h)X_{k−2}+β_hX_{k−3}]/√n}η_k + {(α_hX_{n−2}+β_hX_{n−3})/√n}η_n;
Σ_{k=1}^{n−1}{(ϕ_hX_{k−1}+X_{k−2})/√n}η_k + {X_{n−2}/√n}η_n).

Then the covariation matrix Ĉ can be calculated using Theorem 3.2:

Ĉⁿ₁,₁ = [Y_{n,1}, Y_{n,1}] = (σ_h²/n)Σ_{k=1}^{n−1}[α_hϕ_hX_{k−1}+(α_h+ϕ_hβ_h)X_{k−2}+β_hX_{k−3}]² + (σ_h²/n)(α_hX_{n−2}+β_hX_{n−3})²
→ Ĉ₁,₁ = σ_h²{[(α_hϕ_h)² + (α_h+ϕ_hβ_h)² + β_h²]γ(0) + 2(α_h+ϕ_hβ_h)(β_h+ϕ_hα_h)γ(1) + 2ϕ_hα_hβ_hγ(2)}
= [(α_h²+β_h²)γ_h(0) + 2α_hβ_hγ_h(1)]γ(0) + 2[α_hβ_hγ_h(0) + (α_h²+β_h²)γ_h(1)]γ(1) + 2α_hβ_hγ_h(1)γ(2) a.s.;

Ĉⁿ₁,₂ = Ĉⁿ₂,₁ = [Y_{n,1}, Y_{n,2}] = (σ_h²/n)Σ_{k=1}^{n−1}[ϕ_hα_hX_{k−1}+(α_h+ϕ_hβ_h)X_{k−2}+β_hX_{k−3}][ϕ_hX_{k−1}+X_{k−2}] + o(1)
→ Ĉ₁,₂ = Ĉ₂,₁ = [α_hγ_h(0)+β_hγ_h(1)]γ(0) + [β_hγ_h(0)+2α_hγ_h(1)]γ(1) + β_hγ_h(1)γ(2) a.s.;

Ĉⁿ₂,₂ = [Y_{n,2}, Y_{n,2}] = (σ_h²/n)Σ_{k=2}^{n−1}(ϕ_hX_{k−1}+X_{k−2})² + o(1) → Ĉ₂,₂ = γ_h(0)γ(0) + 2γ_h(1)γ(1) a.s.

Applying Theorem VIII.3.8 of Jacod and Shiryaev (2002), Y_n →d N(0, Ĉ), with Ĉ as above. ∎

Theorem 3.4. The estimators are biased, and after bias correction, the estimation errors are asymptotically normal with convergence rate √n.

Proof. By the ergodic theorem,

(1/n)(Σ_{k=1}^n ξ_kξ_{k−1}, 0)ᵀ → (γ_h(1), 0)ᵀ a.s., (3.7)

which means

(α̂_h−α_h, β̂_h−β_h)ᵀ → H₁^{−1}(γ_h(1), 0)ᵀ a.s.,

so the estimators are biased. After bias correction, Theorems 3.2 and 3.3 give the limiting distribution

√n[(α̂_h−α_h, β̂_h−β_h)ᵀ − H₁^{−1}(γ_h(1), 0)ᵀ] →d H₁^{−1}N(0, Σ₁),

where Σ₁ = Ĉ, with Ĉ as in Theorem 3.3. ∎
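A simulation sketch of Theorem 3.4 (hypothetical, not from the thesis): for an ergodic parameter choice, the raw LSE error converges to the bias H₁^{−1}(γ_h(1), 0)ᵀ with γ_h(1) = ϕ_hσ_h². The helpers `simulate`, `lse` and `stationary_autocov` are the ones sketched earlier.

```python
import numpy as np

alpha, beta, phi, sigma_h = 0.9, -0.5, -0.27, 1.0
(g0, g1, g2), H1 = stationary_autocov(alpha, beta, phi, sigma_h)

# theoretical a.s. limit of the estimation error (Theorem 3.4)
bias = np.linalg.solve(H1, np.array([phi * sigma_h**2, 0.0]))

X = simulate(500_000, alpha, beta, phi, sigma_h)
err = lse(X) - np.array([alpha, beta])
print(err, bias)        # the empirical error approaches the theoretical bias
```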
Chapter 4: Conjugate Roots on the Unit Circle

This chapter treats case III-2: θ₁ = 0, θ₂ < 0, so r = √|θ₁²/4+θ₂| = √(−θ₂). In this case the model is

X_{k+1} = 2cos(rh)X_k − X_{k−1} + ξ_{k+1}, (4.1)

and the process is an undamped harmonic oscillator. For brevity, set φ_h = rh. When φ_h = mπ, m ∈ Z, as discussed in Chapter 2 there are two situations:

• φ_h = 2mπ: the model is X_{k+1} = X_k + ξ_{k+1};
• φ_h = 2mπ + π: the model is X_{k+1} = −X_k + ξ_{k+1};

which are trivial and not considered here. This chapter considers the situation φ_h ≠ mπ, so the roots of the characteristic polynomial of the AR part are conjugate roots on the unit circle, not equal to ±1. Under this setting, the main result is:

Theorem 4.1. Given observations {X_k}_{k≥0} at t_k = kh, 0 ≤ k ≤ n = T/h, where T is the terminal time of the process:

(a) The least squares estimators of α_h and β_h are

(α̂_h, β̂_h)ᵀ = (Σ_{k=1}^n 𝐗_k𝐗_kᵀ)^{−1}(Σ_{k=1}^n 𝐗_{k−1}·X_k),

and for the undamped harmonic oscillator (θ₁ = 0, θ₂ < 0),

lim_{n→∞} n(α̂_h−α_h, β̂_h−β_h)ᵀ = lim_{n→∞}(n^{−2}Σ𝐗_k𝐗_kᵀ)^{−1}(n^{−1}Σ𝐗_{k−1}·ξ_k)
= 2[∫₀¹W₁²(s)ds + ∫₀¹W₂²(s)ds]^{−1}[sinφ_h, cosφ_h; 0, −1]^{−1}{(∫₀¹W₁dW₂ − ∫₀¹W₂dW₁; ∫₀¹W₁dW₁ + ∫₀¹W₂dW₂) + (ϕ_h/sinφ_h)(1; −cosφ_h)}.

(b) For the LSE with lag 1, i.e. multiplying both sides of (1.12) by 𝐗_{k−1} instead of 𝐗_k, summing in k, and then solving, the estimators are

(α̃_h, β̃_h)ᵀ = (Σ_{k=1}^n 𝐗_{k−1}𝐗_kᵀ)^{−1}(Σ_{k=1}^n 𝐗_{k−2}·X_k),

and the limiting distribution of the estimation errors is

lim_{n→∞} n(α̃_h−α_h, β̃_h−β_h)ᵀ = 2[∫₀¹W₁²(s)ds + ∫₀¹W₂²(s)ds]^{−1}[sinφ_h, cosφ_h; 0, −1]^{−1}(∫₀¹W₁dW₂ − ∫₀¹W₂dW₁; ∫₀¹W₁dW₁ + ∫₀¹W₂dW₂).

In Section 4.1, estimation of θ₂ given θ₁ = 0 (no damping) is studied. In Section 4.2, simultaneous and sequential double asymptotics of α̂_h − α_h are studied and related to the maximum likelihood estimator of θ₂. Section 4.3 presents a method of testing θ₁ = 0 against θ₁ < 0 (damping is present). Section 4.4 concludes.

4.1 Estimation without damping

Consider the case in which θ₁ = 0 is given; then (1.1) becomes Ẍ(t) = θ₂X(t) + σẆ(t), t > 0, and the corresponding exact discretization model is

X_{k+1} = 2cos(rh)X_k − X_{k−1} + ξ_{k+1} (α_h = 2cos(rh), β_h = −1), (4.2)

in which α_h is to be estimated. For independent noises η_k ~ i.i.d. N(0,1), Chan and Wei (1988) proved that the model

X_{k+1} = 2cos(rh)X_k − X_{k−1} + η_{k+1}, X₀ = 0,

has solution

X_k = {1/(1 − 2cos(rh)L + L²)}η_k = (1/sinφ_h)Σ_{j=0}^k sin((k+1−j)φ_h)η_j
= (1/sinφ_h)(cos(kφ_h), sin(kφ_h))[cosφ_h, sinφ_h; −sinφ_h, cosφ_h](−Σ_{j=1}^k sin(jφ_h)η_j; Σ_{j=1}^k cos(jφ_h)η_j),

where L stands for the lag operator. A central limit theorem related to the last vector, proved by Chan and Wei (1988), states: for η_j ~ i.i.d. N(0,1),

(−(√2/√n)Σ_{j=1}^{⌊nx⌋}sin(jφ_h)η_j, (√2/√n)Σ_{j=1}^{⌊ny⌋}cos(jφ_h)η_j)ᵀ →d (W₁(x), W₂(y))ᵀ, (4.3)

in which W₁(x), W₂(y) are independent standard Brownian motions. Bierens (2001) extended the result to correlated noises: for

X_{k+1} = 2cos(rh)X_k − X_{k−1} + ξ_{k+1}, ξ_k = ψ(L)η_k, X₀ = 0,

the solution X_k is

X_k = (1/sinφ_h)(cos(kφ_h), sin(kφ_h))[cosφ_h, sinφ_h; −sinφ_h, cosφ_h]
· [−Re ψ(e^{iφ_h}), Im ψ(e^{iφ_h}); Im ψ(e^{iφ_h}), Re ψ(e^{iφ_h})](−Σ_{j=1}^k sin(jφ_h)η_j; Σ_{j=1}^k cos(jφ_h)η_j). (4.4)

From (4.4), the last vector needs to be studied to determine the distribution of X_k. Define

S_n(t) = Σ_{k=1}^{⌊nt⌋}cos(kφ_h)ξ_k, T_n(t) = Σ_{k=1}^{⌊nt⌋}sin(kφ_h)ξ_k.

The following lemma, the extension of (4.3) to correlated noises, gives the limiting distribution of (S_n(t), T_n(s)).

Lemma 4.2. (√2/√n)(S_n(t), T_n(s)) →d √(σ_h²(1+ϕ_h²+2ϕ_hcosφ_h))(W₁(t), W₂(s)).

Proof. Replacing ξ_k by η_k + ϕ_hη_{k−1}, the claim can be rewritten as

(√2/√n)(Σ_{i=1}^{⌊nt⌋−1}[cos(iφ_h)+ϕ_hcos((i+1)φ_h)]η_i + cos(⌊nt⌋φ_h)η_{⌊nt⌋},
Σ_{j=1}^{⌊ns⌋−1}[sin(jφ_h)+ϕ_hsin((j+1)φ_h)]η_j + sin(⌊ns⌋φ_h)η_{⌊ns⌋}) →d √(σ_h²(1+ϕ_h²+2ϕ_hcosφ_h))(W₁(t), W₂(s)).
Set

M_{n,1}(t) = (√2/√n){Σ_{i=1}^{⌊nt⌋−1}[cos(iφ_h)+ϕ_hcos((i+1)φ_h)]η_i + cos(⌊nt⌋φ_h)η_{⌊nt⌋}},
M_{n,2}(s) = (√2/√n){Σ_{j=1}^{⌊ns⌋−1}[sin(jφ_h)+ϕ_hsin((j+1)φ_h)]η_j + sin(⌊ns⌋φ_h)η_{⌊ns⌋}}.

The covariation of M_{n,1} with itself is

⟨M_{n,1}, M_{n,1}⟩(t) = (2σ_h²/n)Σ_{i=1}^{⌊nt⌋−1}[cos(iφ_h)+ϕ_hcos((i+1)φ_h)]²
= (2σ_h²/n){Σ_{i=1}^{⌊nt⌋−1}[(1+ϕ_h²)cos²(iφ_h) + 2ϕ_hcos(iφ_h)cos((i+1)φ_h)] + cos²(φ_h) + ϕ_h²cos²(⌊nt⌋φ_h)}
= σ_h²(1+ϕ_h²+2ϕ_hcos(φ_h))⌊nt⌋/n + O(n^{−1}), (4.5)

and, in the same way,

⟨M_{n,2}, M_{n,2}⟩(s) = σ_h²(1+ϕ_h²+2ϕ_hcos(φ_h))⌊ns⌋/n + O(n^{−1}), (4.6)

while

⟨M_{n,1}(t), M_{n,2}(s)⟩ = (2σ_h²/n)Σ_{j=1}^{⌊nt⌋∧⌊ns⌋−1}[cos(jφ_h)+ϕ_hcos((j+1)φ_h)][sin(jφ_h)+ϕ_hsin((j+1)φ_h)]
= (2σ_h²/n)Σ_{j=1}^{⌊nt⌋∧⌊ns⌋−1}[(1/2)sin(2jφ_h) + ϕ_hsin((2j+1)φ_h) + (1/2)ϕ_h²sin((2j+2)φ_h)]
= (σ_h²/n)[sin((⌊nt⌋∧⌊ns⌋)φ_h)sin((⌊nt⌋∧⌊ns⌋−1)φ_h)/sinφ_h + ϕ_h²sin((⌊nt⌋∧⌊ns⌋)φ_h)sin((⌊nt⌋∧⌊ns⌋+1)φ_h)/sinφ_h + 2ϕ_hsin²((⌊nt⌋∧⌊ns⌋)φ_h)/sinφ_h] = O(n^{−1}). (4.7)

Define

Y_t^n := √(2/n)(Σ_{i=1}^{⌊nt⌋}cos(iφ_h)(η_i + ϕ_hη_{i−1}); Σ_{j=1}^{⌊ns⌋}sin(jφ_h)(η_j + ϕ_hη_{j−1})),

and let h(x): R² → R² be a bounded function equal to x in a neighbourhood of 0. Define

Ỹⁿ(h)_t := (Σ_{i=1}^{⌊nt⌋}{√(2/n)cos(iφ_h)(η_i+ϕ_hη_{i−1}) − h₁(√(2/n)cos(iφ_h)(η_i+ϕ_hη_{i−1}))};
Σ_{j=1}^{⌊ns⌋}{√(2/n)sin(jφ_h)(η_j+ϕ_hη_{j−1}) − h₂(√(2/n)sin(jφ_h)(η_j+ϕ_hη_{j−1}))}),

and another function

Yⁿ(h)_t := (Σ_{i=1}^{⌊nt⌋}h₁(√(2/n)cos(iφ_h)(η_i+ϕ_hη_{i−1})); Σ_{j=1}^{⌊ns⌋}h₂(√(2/n)sin(jφ_h)(η_j+ϕ_hη_{j−1}))).
Then (v −1 n P n ,n −1/2 Q n ,n −1/2 v −1 n n−1 X k=1 P n ( k n )Y k+1 ) d → (H,W 2 , Z 1 0 HdW 2 ). (4.9) Where v n =nu n and H(t) = R t 0 R(u)du in (i) and H(t) =W 1 (t) in (ii). The following lemma is an extension of part (ii) of Theorem 2.4 from Chan and Wei (1988). Lemma 4.4. 2n −1 n X k=1 Sk−1 n (0) h Tk+1 n (0)−Tk n (0) i d →σ 2 h (1 +ϕ 2 h + 2ϕ h cosφ h ) Z 1 0 W 1 (s)dW 2 (s). Proof. The proof of Lemma 4.4 is essentially the same as the proof of Theorem 2.4 of Chan and Wei (1988). Let D[0, 1] be the space of càdlàg functions on [0, 1] equipped with the Skorokhod topology. Lemma 4.4 is to prove s 2 n S n (1), s 2 n T n (1), 2 n n X k=1 S n ( k− 1 n ) h T n ( k + 1 n )−T n ( k n ) i d → q C h W 1 (1), q C h W 2 (1),C h Z 1 0 W 1 dW 2 , whereC h =σ 2 h (1 +ϕ 2 h + 2ϕ h cosφ h ). According to proof of Theorem 2.4, by the Sko- rokhod representation theorem, there are a probability space Ω and random elements S n ,T n in D[0, 1] such that 24 (S n ,T n )− q C h (W 1 ,W 2 ) ∞ → 0 a.s. and (S n ,T n ) d = ( s 2 n S n , s 2 n T n ). Let G n = n X k=1 S n ( k− 1 n ) T n ( k + 1 n )−T n ( k n ) , G n = 2 n n X k=1 Sk−1 n (0) h Tk+1 n (0)−Tk n (0) i . Then (S n ,T n ,G n ) d = ( s 2 n S n (0), s 2 n T n (0),G n ). To prove Lemma 3.3, it is sufficient to prove G n d →C h R 1 0 W 1 dW 2 . By Egorov’s theo- rem, given > 0, there is an event Ω ∈ Ω such thatP(Ω )≥ 1− and sup n k(S n (ω),T n (ω))− (W 1 (ω),W 2 (ω))k ∞ :ω∈ Ω o =δ n → 0. Choose integers N(n)→∞ such that N(n)δ 2 n → 0 and N(n)/n→ 0. (4.10) For each n, further choose a partition{t 0 ,...,t N(n) } of [0, 1] such that 0 =t 0 <t 1 = n 1 n <t 2 = n 2 n <...<t N(n) = n N(n) n = 1, max{|t i+1 −t i | : 0≤i≤N(n)− 1} =o(1). For the simplicity of writing, set all functions with negative indice, such as S n (− 1 n ), to be 0. It can be shown that G n = N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i +o p (1). (4.11) Let 25 J n =G n − N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i = N(n) X k=1 n k X i=n k−1 +1 S n ( i− 2 n ) h T n ( i n )−T n ( i− 1 n ) i − N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i = N(n) X k=1 n k X i=n k−1 +1 S n ( i− 2 n ) h T n ( i n )−T n ( i− 1 n ) i − N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i = N(n) X k=1 n k X i=n k−1 +1 S n ( i− 2 n ) h T n ( i n )−T n ( i− 1 n ) i − N(n) X k=1 S n (t k−2 ) n k X i=n k−1 +1 h T n ( i n )−T n ( i− 1 n ) i = N(n) X k=1 n k X i=n k−1 +1 h S n ( i− 2 n )−S n (t k−2 ) ih T n ( i n )−T n ( i− 1 n ) i . Thus EJ 2 n = N(n) X k=1 n k X i=n k−1 +1 E h S n ( i− 2 n )−S n (t k−2 ) i 2 h T n ( i n )−T n ( i− 1 n ) i 2 + N(n) X k=1 n k X i=n k−1 +1 E h S n ( i− 2 n )−S n (t k−2 ) ih S n ( i− 1 n )−S n (t k−2 ) i · h T n ( i + 1 n )−T n ( i n ) ih T n ( i n )−T n ( i− 1 n ) i ≤ N(n) X k=1 n k X i=n k−1 +1 C h i− 2 n −t k−2 1 n +C h i− 2 n −t k−2 1 n = 2C h N(n) X k=1 n k X i=n k−1 +1 i− 2 n − n k−2 n 1 n ≤ 2C h N(n) X k=1 n k X i=n k−1 +1 n k −n k−2 n 1 n = 2C h N(n) X k=1 n k −n k−1 n n k −n k−2 n → 0. By the Markov inequality, (4.11) is proved. Next is to show 26 I Ω N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i =I Ω N(n) X k=1 W 1 (t k−2 ) h T n (t k )−T n (t k−1 ) i +o p (1). 
(4.12) And it is true because E I Ω N(n) X k=1 S n (t k−2 ) h T n (t k )−T n (t k−1 ) i −I Ω N(n) X k=1 W 1 (t k−2 ) h T n (t k )−T n (t k−1 ) i 2 =E N(n) X k=1 I Ω h S n (t k−2 )−W 1 (t k−2 ) ih T n (t k )−T n (t k−1 ) i 2 ≤E N(n) X k=1 I Ω h S n (t k−2 )−W 1 (t k−2 ) i 2 · N(n) X k=1 h T n (t k )−T n (t k−1 ) i 2 ≤N(n)δ 2 n N(n) X k=1 E h T n (t k )−T n (t k−1 ) i 2 ≤N(n)δ 2 n N(n) X k=1 C h (t k −t k−1 ) =N(n)δ 2 n C h → 0. The last step is due to the choice of N(n) in (4.10). Since by rearranging the terms, N(n) X k=1 W 1 (t k−2 ) h T n (t k )−T n (t k−1 ) i =W 1 (t N(n−2) )T n (t N(n) ) + N(n)−1 X k=1 T n (t k ) h W 1 (t k−2 )−W 1 (t k−1 ) i . By a similar argument, T n (t k ) can be replaced by W 2 (t k ), which means I Ω N(n) X k=1 W 1 (t k−2 ) h T n (t k )−T n (t k−1 ) i =I Ω N(n) X k=1 W 1 (t k−2 ) h W 2 (t k )−W 2 (t k−1 ) i +o p (1). The final step is to show I Ω N(n) X k=1 W 1 (t k−2 ) h W 2 (t k )−W 2 (t k−1 ) i =I Ω Z 1 0 W 1 dW 2 +o p (1). (4.13) It is proved by showing 27 E N(n) X k=1 W 1 (t k−2 ) h W 2 (t k )−W 2 (t k−1 ) i − Z 1 0 W 1 dW 2 2 =E N(n) X k=1 W 1 (t k−2 ) Z t k t k−1 dW 2 − N(n) X k=1 Z t k t k−1 W 1 dW 2 2 =E N(n) X k=1 Z t k t k−1 h W 1 (t k−2 )−W 1 (t) i dW 2 2 = N(n) X k=1 E Z t k t k−1 h W 1 (t k−2 )−W 1 (t) i dW 2 2 = N(n) X k=1 Z t k t k−1 (t−t k−2 )dt≤ h N(n) X k=1 (t k −t k−1 ) i · max(t k −t k−2 )→ 0. Combining (4.11), (4.12) and (4.13) G n d →C h Z 1 0 W 1 dW 2 . With Lemma 4.2 and Lemma 4.4, the following theorem, which serves as the cornerstoneforthischapter, canbeproved. Forthesimplicityofwriting,Sk n (0),Tk n (0) are written as S k (0),T k (0) in the rest of this chapter. Define C h =σ 2 h (1 +ϕ 2 h + 2ϕ h cosφ), H 1 = Z 1 0 W 1 (s)dW 2 (s)− Z 1 0 W 2 (s)dW 1 (s), G = Z 1 0 W 2 1 (s)ds + Z 1 0 W 2 2 (s)ds, H 2 = Z 1 0 W 1 (s)dW 1 (s) + Z 1 0 W 2 (s)dW 2 (s). Theorem 4.5. For fixed sampling interval h, as n→∞, • (a) n −2 P n k=1 X 2 k d → C h 4 sin 2 φ h G, • (b) n −1 P n k=1 X k ξ k+1 d → C h 2 sinφ h H 1 +σ 2 h ϕ h , • (c) n −2 P n k=1 X k−1 X k d → cosφ C h 4 sin 2 φ h G, • (d) n −1 P n k=1 X k−1 ξ k+1 d → C h 2 sinφ h n cosφ h ·H 1 − sinφ h ·H 2 , o • (e) n −1 P n k=1 X k−1 ξ k+1 d → C h 2 sinφ h n (2 cos 2 φ h − 1)H 1 − 2 sinφ h cosφ h H 2 o , • (f) n −2 P n k=1 X k−2 X k d → (2 cos 2 φ h − 1) C h 4 sin 2 φ h G. 28 Proof. From Lemma 3.3.2 of Chan and Wei (1988), n −2 n X k=1 X 2 k = 1 4 sin 2 φ h 2n −2 n X k=1 S 2 k (0) +T 2 k (0) +o p (1) = 1 4 sin 2 φ h 1 n n X k=1 √ 2n −1/2 S k (0) 2 + 1 n n X k=1 √ 2n −1/2 T k (0) 2 +o p (1) d → C h 4 sin 2 φ h G. The first equality from Chan and Wei (1988) used straightforward applications of trigonometric identities, and the convergence at the last step is due to Lemma 3.1 and the continuous mapping theorem. For part (b), again from Lemma 3.3.2 of Chan and Wei (1988), n −1 n X k=1 X k ξ k+1 = 1 sinφ h n −1 n X k=1 S k (0) h T k+1 (0)−T k (0) i − n X k=1 T k (0) h S k+1 (0)−S k (0) i . In order to get independence, replace S k (0) by S k−1 (0) + cos(kφ)ξ k . Then n −1 n X k=1 X k ξ k+1 = 1 sinφ h n −1 n X k=1 S k−1 (0) h T k+1 (0)−T k (0) i − n X k=1 T k−1 (0) h S k+1 (0)−S k (0) i + 1 sinφ h n −1 [ n X k=1 sin(k + 1)φ h cos(kφ h )ξ k ξ k+1 − n X k=1 cos(k + 1)φ h sin(kφ h )ξ k ξ k+1 ] ⇒ C h 2 sinφ h H 1 +σ 2 h ϕ h . For part (c), multiply both side of (4.2) by X k , sum from k = 1 to n and divide by n 2 , n −2 n X k=2 X k−1 X k +n −2 n X k=2 X k−2 X k−1 = 2 cosφ h ·n −2 n X k=2 X 2 k−1 +n −2 n X k=2 X k−1 ξ k . 
as n→∞, by part (b), the second term on right goes to 0, and the two terms on the left have the same limit, thus by part (a), part (c) is proved. For part (d), from Lemma 3.3.2 of Chan and Wei (1988), 29 n −1 sinφ h n X k=1 X k−1 ξ k+1 =n −1 cosφ h n X k=1 S k−1 (0) h T k+1 (0)−T k (0) i − n X k=1 T k−1 (0) h S k+1 (0)−S k (0) i −n −1 sinφ h n X k=1 S k−1 (0) h S k+1 (0)−S k (0) i − n X k=1 T k−1 (0) h T k+1 (0)−T k (0) i . As n→∞ n −1 n X k=1 X k−1 ξ k+1 d → C h 2 sinφ h n cosφ h ·H 1 − sinφ h ·H 2 o . For part (e), use (4.2) to replace X k , n X k=1 X k ξ k+1 = n X k=1 (2 cosφ h X k−1 −X k−2 +ξ k )ξ k+1 = 2 cosφ h n X k=1 X k−1 ξ k+1 − n X k=1 X k−1 ξ k+1 + n X k=1 ξ k ξ k+1 . By law of large numbers, n −1 P n k=1 ξ k ξ k+1 → σ 2 h ϕ h , together with part (b) and (d), part (e) is proved. Proof of part (f) is similar to that of part (c). Multiply both sides of (4.2) by X k+1 and sum from k = 1 to n− 1, then n −2 n X k=2 X 2 k = 2 cosφ h ·n −2 n X k=2 X k−2 X k−1 −n −2 n X k=2 X k−2 X k +n −2 n X k=2 X k ξ k . By (b), (d) and (e), together with (4.2),n −2 P n k=2 X k ξ k → 0, then use (a) and (c), part (f) is proved. Back to the model defined in (4.2) the least squares estimator (LSE) of α h is ˆ α h = ( n X k=1 X 2 k ) −1 [ n X k=1 X k (X k+1 +X k−1 )], with estimation error being 30 ˆ α h −α h = ( n X k=1 X 2 k ) −1 [ n X k=1 X k ξ k+1 ]. WhenEξ k+1 ξ k 6= 0, namely ϕ h 6= 0, thenEX k ξ k+1 6= 0, this phenomenon is reflected in the following theorem. Theorem 4.6. n(ˆ α h −α h ) d → 2 sinφ H 1 + ϕ h 1+ϕ 2 h +2ϕ h cosφ · 2 sin(φ) G . Proof. The result comes easily from Theorem 4.5 and the continuous mapping theo- rem. Proof of Theorem 4.1 is also an application of Theorem 4.5. Proof. For part(a), use(4.4)and Theorem 4.5, togetherwiththecontinuousmapping theorem n ˆ α h −α h ˆ β h −β h = (n −2 n X k=1 X k X > k ) −1 (n −1 n X k=1 X k−1 ·ξ k ) d → σ 2 |ψ(e iφ h )| 2 4 sin 2 (φ h ) 1 cos(φ h ) cos(φ h ) 1 G −1 · σ 2 |ψ(e iφ h )| 2 2 sin(φ h ) 1 0 cosφ h − sinφ h H 1 H 2 + ϕ h 0 = 1 2 sin(φ h ) 1 cos(φ h ) cos(φ h ) 1 G −1 · 1 0 cos(φ h ) − sin(φ h ) H 1 H 2 + ϕ h 0 = 2G −1 · sin(φ h ) cos(φ h ) 0 −1 · H 1 H 2 + ϕ h sin(φ h ) 1 − cos(φ h ) . Thus part (a) is proved. For part (b) where the LSE with lag 1 is considered, the limiting distribution of the errors are: 31 n ˜ α h −α h ˜ β h −β h = (n −2 n X k=1 X k−1 X > k ) −1 (n −1 n X k=1 X k−2 ·ξ k ) d → σ 2 |ψ(e iφ h )| 2 4 sin 2 φ h cosφ h 1 2 cos 2 φ h − 1 cosφ h G −1 · σ 2 |ψ(e iφ h )| 2 2 sinφ h cosφ h − sinφ h 2 cos 2 φ h − 1 −2 sinφ h cosφ h H 1 H 2 = 1 2 sinφ h cosφ h 1 2 cos 2 φ h − 1 cosφ h G −1 · cosφ h − sinφ h 2 cos 2 φ h − 1 −2 sinφ h cosφ h H 1 H 2 = 2 sinφ h 1− cos 2 φ h cosφ h −1 1− 2 cos 2 φ h cosφ h cosφ h − sinφ h 2 cos 2 φ h − 1 −2 sinφ h cosφ h ·G −1 · H 1 H 2 = 2G −1 · sinφ h cosφ h 0 −1 · H 1 H 2 . 4.2 Simultaneous and Sequential Double Asymp- totics This section works on double asymptotics for the model. The idea comes from Wang and Yu (2016), and it assumes sampling interval h shrinks to 0 and the time spanT diverges. The simultaneous double asymptotics considers the case thath→ 0 andT→∞ simultaneously, while the sequential double asymptotics considersh→ 0 followed by T→∞, as well as T→∞ followed by h→ 0. InWangandYu(2016), theyextended resultsfromPhillips(1987b), Perron(1991) and Phillips and Magdalinos (2007), and showed that for an explosive model, double 32 asymptotics better approximates the finite sample distribution than the asymptotic distribution that is independent of the initial condition. 
The double asymptotics connects the continuous time model and the discretized autoregressive model, which is related to the purpose of this chapter. Anditneedstobeemphasizedthatdoubleasymptoticsdistributionmayexplicitly depends on the initial condition, and that is one reason why it better approximates the finite sample distribution. Though for the model considered in this thesis, the initial condition is X 0 = 0, this theory might be useful for future study of the same model with nonzero initial condition. This section will consider the simultaneous/sequential double asymptotic distribu- tions for ˆ α h −α h , and the results show that the three double asymptotics distributions are related to the limiting distribution of estimation error with continuous observa- tions considered in Lin and Lototsky (2011). For the undamped harmonic oscillator, namely θ 1 = 0, θ 2 < 0. (1.5) becomes X k+1 = 2 cos(rh)X k −X k−1 +ξ k+1 , ξ k =η k +ϕ h η k−1 , η k ∼ i.i.d N(0,σ 2 h ) In this case, the autocovariance is γ h (1) =σ 2 0 rh cos(rh)− sin(rh) 2r 3 ,γ h (0) =σ 2 0 2rh− sin(2rh) 2r 3 . And as h goes to 0, γ h (1)/γ h (0) h→0 → −1/4. The values of σ h and ϕ h are σ h = ( q γ h (0) + 2γ h (1) + q γ h (0)− 2γ h (1))/2, ϕ h = q γ h (0) + 2γ h (1)− q γ h (0)− 2γ h (1) q γ h (0) + 2γ h (1) + q γ h (0)− 2γ h (1) . Divide the equation by σ h , then x k+1 = 2 cos(rh)x k −x k−1 + k+1 +ϕ h k (ζ k+1 := k+1 +ϕ h k ), (4.14) where x k =X k /σ h and k ∼N(0, 1) is standard white noise. ϕ h →ϕ =−2 + √ 3 as h→ 0. Define 33 S 0 n (t) = bntc X k=1 cos(kφ h )ζ k , T 0 n (t) = bntc X k=1 sin(kφ h )ζ k . Then Lemma 4.2 is rescaled to be √ 2n −1/2 S 0 n (t),T 0 n (s) → q (1 +ϕ 2 h + 2ϕ h cosφ h ) W 1 (t),W 2 (s) . 4.2.1 Simultaneous Double Asymptotics Recall the proof in Theorem 4.4, and that sin(rh) h h→0 → r. So as h→ 0,T →∞ simultaneously, h 2 n −2 n X k=1 x 2 k = h 2 4 sin 2 φ h 2n −2 n X k=1 S 02 k (0) +T 02 k (0) +o p (h 2 ) = h 2 4 sin 2 φ h 1 n n X k=1 √ 2n −1/2 S k (0) 2 + 1 n n X k=1 √ 2n −1/2 T k (0) 2 +o p (h 2 ) d → 1 −4θ 2 (1 +ϕ) 2 G. And hn −1 n X k=1 x k ξ k+1 = h sinφ h n −1 n X k=1 S 0 k (0) h T 0 k+1 (0)−T 0 k (0) i − n X k=1 T 0 k (0) h S 0 k+1 (0)−S 0 k (0) i = h sinφ h n −1 n X k=1 S 0 k−1 (0) h T 0 k+1 (0)−T 0 k (0) i − n X k=1 T 0 k−1 (0) h S 0 k+1 (0)−S 0 k (0) i + h sinφ h n −1 n X k=1 cos(kφ) sin[(k + 1)φ]ζ k ζ k+1 − n X k=1 cos[(k + 1)φ] sin(kφ)ζ k ζ k+1 d → 1 2 √ −θ 2 (1 +ϕ) 2 H 1 . By continuous mapping theorem, as h→ 0 and T d →∞ simultaneously, 34 T h 2 (ˆ α h −α h ) d → 2 q −θ 2 H 1 G . Back to the continuous time model defined by (1.1), the estimator of θ 2 and the estimation error given continuous observation{(X t , ˙ X t )} T t=0 are ˆ θ 2 = R T 0 X(t)d ˙ X(t) R T 0 X 2 (t)dt , ˆ θ 2 −θ 2 = R T 0 X(t)dW (t) R T 0 X 2 (t)dt , Its Euler approximate estimator for θ 2 is ˆ θ 2,Euler = P n k=1 X k (X k+1 − 2X k +X k−1 ) P n k=1 X 2 k h 2 . The relationship between ˆ α h and ˆ θ 2,Euler is ˆ θ 2,Euler = (ˆ α h − 2)/h 2 . Thus ˆ α h −α h h 2 = ˆ α h − 2 h 2 − α h − 2 h 2 = ˆ θ 2,Euler −θ− α h − 2−θh 2 h 2 = ˆ θ 2,Euler −θ +O(h 2 ). Impose an additional condition of Th 2 → 0, just as in Shimizu (2009). Then the simultaneous double asymptotic of T ( ˆ θ 2,Euler −θ 2 ) is T ( ˆ θ 2,Euler −θ 2 ) =nh( ˆ θ 2,Euler −θ 2 ) = n h (ˆ α h −α h ) +O(nh 3 ) = hn −1 P n k=1 x k ζ k+1 h 2 n −2 P n k=1 x 2 k +O(nh 3 ) d → 2 q −θ 2 H 1 G . 
While from α h = 2 cos(rh) = 2 cos( √ −θ 2 h), another estimator of θ 2 is ˆ θ 2,α h = −( 1 h arccos ˆ α h 2 ) 2 , then T ( ˆ θ 2,α h −θ 2 ) =− T h 2 (arccos ˆ α h 2 ) 2 − (arccos α h 2 ) 2 =− T h 2 · 2 arccos α 0 h 2 · −1 q 1− ( α 0 h 2 ) 2 · (ˆ α h −α h ) 35 = 2 arccos α 0 h 2 q 1− ( α 0 h 2 ) 2 · T h 2 (ˆ α h −α h ). In which α 0 h is between α h and ˆ α h . And for h→ 0,α 0 h p → 2. So as h→ 0 and n→∞ simultaneously, T ( ˆ θ 2,α h −θ 2 ) d → 8 q −θ 2 H 1 G . The difference between limiting distributions of T ( ˆ θ 2,α h −θ 2 ) and T ( ˆ θ 2,Euler −θ 2 ) is that for the first one, no additional restriction of Th 2 → 0 is needed. And the reason is, ˆ θ 2,Euler is an approximation of θ 2 , and it is subject to discretization error, the restriction Th 2 → 0 allows the error asymptotically goes to 0. While ˆ θ 2,α h is related to exact discretization of the model and not subject to this type of error, then the additional restriction is not needed. 4.2.2 T goes to infinity, then h goes to 0 From Section 3, n(ˆ α h −α h ) d → 2 sinφ h H 1 + ϕ h 1+ϕ 2 h +2ϕ h cosφ h · 2 sinφ h G . With the additional condition Th 2 → 0, lim h→0 lim T→∞ T ( ˆ θ 2,Euler −θ 2 ) = lim h→0 lim n→∞ nh( ˆ θ 2,Euler −θ 2 ) = lim h→0 lim n→∞ n h (ˆ α h −α h ) +O(nh 3 ) = 2 q −θ 2 H 1 G . The result is consistent with Lin and Lototsky (2011). And lim h→0 lim T→∞ T ( ˆ θ 2,α h −θ 2 ) = 8 q −θ 2 H 1 G . 36 4.2.3 h goes to 0, then T goes to infinity The main result is: for fixed T, as h→ 0 T h 2 (ˆ α h −α h ) d → r R 1 0 I(s)dJ (s)−J (s)dI(s) R 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds =T ( ˆ θ 2,T −θ 2 ). (4.15) WithI(s) andJ (s) being I(s) = 1 √ rT Z Ts 0 cos(rz)dW (z), J (s) = 1 √ rT Z Ts 0 sin(rz)dW (z). The equality on the right side comes from (2.9) of Lin and Lototsky (2011). And lim T→∞ T ( ˆ θ 2,T −θ 2 ) = 2 q −θ 2 H 1 G . (4.16) This result shows that the asymptotic distribution of T h 2 (ˆ α h −α h ), which comes from the discrete-time least squares estimator, is the same as the exact distribu- tion of the continuous time estimation error as h→ 0. This justifies studying the distributional properties of T ( ˆ θ 2,T −θ 2 ) as an approximation to T h 2 (ˆ α h −α h ). The idea of the proof comes from Theorem 1 in Perron (1991). For simplicity, consider a sequence{h 1 ,h 2 ,...h n } such that h n → 0 and N n =T/h n is an integer. Recall equation (4.14) x k+1 = 2 cos(rh)x k −x k−1 + k+1 +ϕ h k , (ζ k+1 := k+1 +ϕ h k ) where x k =X k /σ h and k ∼N(0, 1) is standard white noise. LEMMA 4.7. Consider{{x n,k } Nn k=1 } ∞ n=1 , a triangular array of random variables defined by x n,k+1 = 2 cos(rh n )x n,k −x n,k−1 +ζ n,k+1 , (ζ n,k+1 = n,k+1 +ϕ n n,k , n,k ∼N(0, 1)) (4.17) in which α hn = 2 cos(rh n ). For the simplicity of writing, set φ hn =rh n . Define 37 ˜ S n,k (0), ˜ T n,k (0) := k X i=1 cos(iφ hn )ζ n,i , k X i=1 sin(iφ hn )ζ n,i . As n→∞ (h n =T/N n → 0 with T fixed) • (a) N −1/2 n ˜ S n,bNntc (0), ˜ T n,bNnsc (0) d → (1 +ϕ) √ r I(t),J (s) , • (b) N −3/2 n x n,Nn d → (1 +ϕ) √ rT h sin(rT )I(1)− cos(rT )J (1) i , • (c) N −4 n Nn X k=1 x 2 n,k−1 d → (1 +ϕ) 2 rT 2 Z 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds, • (d) N −2 n Nn X k=1 x n,k−1 ζ n,k d → (1 +ϕ) 2 T Z 1 0 I(s)dJ (s)−J (s)dI(s) . Proof. For 0≤t,s≤ 1, N −1/2 n ˜ S n,bNntc (0), ˜ T n,bNnsc (0) =N −1/2 n bNntc X k=1 cos(kφ hn )ζ n,k , bNnsc X k=1 sin(kφ hn )ζ n,k = 1 √ N n √ N n √ T bNntc X k=1 cos(r T N n k) s T N n ( n,k +ϕ n n,k−1 ), bNnsc X k=1 sin(r T N n k) s T N n ( n,k +ϕ n n,k−1 ) d → 1 +ϕ √ T Z Tt 0 cos(rz)dW (z), Z Ts 0 sin(rz)dW (z) = (1 +ϕ) √ r I(t),J (s) . 
Part (a) is proved. For (b), start with (3.3.4) of Chan and Wei (1988) N −3/2 n x n,Nn =N −3/2 n sin −1 (φ n ) sin h (N n + 1)φ hn i ˜ S n,Nn (0)− cos h (N n + 1)φ hn i ˜ T n,Nn (0) = 1 N n sin(φ hn ) sin h (N n + 1)φ hn i ˜ S n,Nn (0) √ N n − cos h (N n + 1)φ hn i ˜ T n,Nn (0) √ N n d → (1 +ϕ) √ r rT sin(rT )I(1)− cos(rT )J (1) . 38 The last step is due to part (a). In general, for 0≤z≤ 1, N −3/2 n x n,bNnzc d → (1 +ϕ) √ r rT n sin(rTz)I(z)− cos(rTz)J (z) o . Use the above formula, N −4 n Nn X k=1 x 2 n,k−1 = Nn X k=1 (N −3/2 n x n,k−1 ) 2 1 N n d → (1 +ϕ) 2 rT 2 Z 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds. part (c) is proved. From Lemma 3.3.2 of Chan and Wei (1988), sin(φ hn ) Nn X k=1 x n,k−1 ζ n,k = Nn X k=1 ˜ S n,k (0) h ˜ T n,k+1 (0)− ˜ T n,k (0) i − ˜ T n,k (0) h ˜ S n,k+1 (0)− ˜ S n,k (0) i . Thus N −2 n Nn X k=1 x n,k−1 ζ n,k = 1 N n sin(φ n ) Nn X k=1 ˜ S n,k (0) √ N n h ˜ T n,k+1 (0) √ N n − ˜ T n,k (0) √ N n i − ˜ T n,k (0) √ N n h ˜ S n,k+1 (0) √ N n − ˜ S n,k (0) √ N n i d → (1 +ϕ) 2 r rT Z 1 0 I(s)dJ (s)−J (s)dI(s) . Immediately from (c) and (d), lim hn→0 N 2 n (ˆ α hn −α hn ) = lim hn→0 N −2 n P Nn k=1 x n,k−1 ξ n,k N −4 n P Nn k=1 x 2 n,k−1 = rT R 1 0 I(s)dJ (s)−J (s)dI(s) R 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds . 39 Thus combine (4.16), lim T→∞ lim h→0 T h 2 (ˆ α h −α h ) = 2 q −θ 2 H 1 G . 4.3 Test for Damping Consider the case that θ 2 is known, and θ 1 = 0 is tested against θ 1 < 0 (damping is present). In the continuous-time observation setting by Lin and Lototsky (2011), the result is lim T→∞ T ( ˆ θ 1,T −θ 1 ) = lim T→∞ T −1 R T 0 ˙ X(t)dW (t) T −2 R T 0 ˙ X 2 (t)dt = lim T→∞ R 1 0 I(s)dI(s) + R 1 0 J (s)dJ (s) R 1 0 (cos(rTs)I(s) + sin(rTs)J (s)) 2 ds = 2 H 2 G = 2−W 2 1 (1)−W 2 2 (1) G . While testing for θ 1 = 0 is tested against θ 1 < 0 with θ 2 unknown, the result is lim T→∞ T ( ˆ θ 1,T −θ 1 ) = lim T→∞ 1 1−D T T −1 R T 0 ˙ X(t)dW (t) T −2 R T 0 ˙ X 2 (t)dt − T −1 R T 0 X(t)dW (t)· (4T 2 ) −1 X 2 (T ) T −2 R T 0 ˙ X 2 (t)dt·T −2 R T 0 X 2 (t)dt = 2−W 2 1 (1)−W 2 2 (1) G , lim T→∞ T ( ˆ θ 2,T −θ 2 ) = lim T→∞ 1 1−D T T −1 R T 0 X(t)dW (t) T −2 R T 0 X 2 (t)dt − T −1 R T 0 ˙ X(t)dW (t)· (4T 2 ) −1 X 2 (T ) T −2 R T 0 ˙ X 2 (t)dt·T −2 R T 0 X 2 (t)dt = 2 q −θ 2 H 1 G . AndD T = R T 0 X(t) ˙ X(t)dt 2 R T 0 ˙ X 2 (t)dt· R T 0 X 2 (t)dt . This shows that in continuous-time case, withθ 2 given or not, the estimators are different, though the limiting distributions for T ( ˆ θ 1,T −θ 1 ) are the same. But for the discretized model considered in this paper, the model is X k+1 = 2 cos(rh)X k −e θ 1 h X k−1 +ξ k+1 , 40 with r = q |θ 2 +θ 2 1 /4|. This indicates that when θ 2 given, it is still not possible to estimateβ h =−e θ 1 h only sincer is unknown. Instead, regardless ofθ 2 known or not, both α h = 2 cos(rh) and β h =−e θ 1 h need to be estimated jointly. This leads back to Section 2, and the result is: For fixed time interval h, as T→∞, the limiting distributions of the estimation errors of the least squares estimator (LSE), n( ˆ β h −β h ) d →−2G −1 H 2 + cos(φ h ) sin(φ h ) ϕ . And the limiting distribution of the estimation errors of the least squares estimator with lag 1 (LSE with lag 1) is n( ˆ β h −β h ) d →−2G −1 H 2 . In order to test θ 1 = 0 against θ 1 < 0, φ h =rh shouldn’t be involved, thus LSE with lag 1 shall be used. 4.3.1 h goes to 0, then T goes to infinity The main result is: for fixed T, as h→ 0 T h 2 ( ˆ β h −β h ) d → R 1 0 I(s)dI(s) +J (s)dJ (s) R 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds =T ( ˆ θ 1,T −θ 1 ). 
(4.18) Andtheproofissimilartothatofsection4.2. Usingthesametriangulararraydefined by (4.17) in Lemma 4.7, the estimation errors can be written as: ˆ α h −α h ˆ β h −β h = P Nn k=1 x 2 n,k P Nn k=1 x n,k−1 x n,k P Nn k=1 x n,k−1 x n,k P Nn k=1 x 2 n,k −1 P Nn k=1 x n,k ζ n,k+1 P Nn k=1 x n,k−1 ζ n,k+1 (4.19) From Lemma 3.3.2 of Chan and Wei (1988), sin(φ hn ) Nn X k=2 x n,k−2 ζ n,k 41 = cos(φ hn ) Nn X k=1 ˜ S n,k−2 (0) h ˜ T n,k (0)− ˜ T n,k−1 (0) i − ˜ T n,k−2 (0) h ˜ S n,k (0)− ˜ S n,k−1 (0) i − sin(φ hn ) Nn X k=1 ˜ S n,k−2 (0) h ˜ S n,k (0)− ˜ S n,k−1 (0) i + ˜ T n,k−2 (0) h ˜ T n,k (0)− ˜ T n,k−1 (0) i . And since Nn X k=1 x n,k−1 x n,k = Nn X k=1 x n,k−1 2 cos(φ hn )x n,k−1 −x n,k−2 +ζ n,k , it can be seen that as N n →∞ lim Nn→∞ N −4 n Nn X k=1 x n,k−1 x n,k = lim Nn→∞ N −4 n Nn X k=1 x n,k−2 x n,k−1 = lim Nn→∞ N −4 n cos(φ hn ) Nn X k=1 x 2 n,k−1 +N −4 n Nn X k=1 x n,k−1 ζ n,k . This means lim Nn→∞ N −4 n Nn X k=1 x 2 n,k = lim Nn→∞ N −4 n Nn X k=1 x n,k−1 x n,k , which indicates that the first matrix on the right side of (4.19) will be rank 1 and not invertible. So instead of considering the limit of the two matrices in (4.19) inde- pendently, it is needed to do the tedious calculation and invert the matrix, then set N n →∞. And the result is lim T→∞ lim h→0 T h 2 ( ˆ β h −β h ) = lim T→∞ R 1 0 I(s)dI(s) +J (s)dJ (s) R 1 0 sin(rTs)I(s)− cos(rTs)J (s) 2 ds = lim T→∞ T ( ˆ θ 1,T −θ 1 ) = 2−W 2 1 (1)−W 2 2 (1) G . 4.4 Summary In this chapter, the situationθ 1 = 0,θ 2 < 0 is considered. Under this assumption, the characterization polynomial of the discretized model has conjugate roots on the unit circle. The main contributions of this chapter are: 42 • (1) The rate of convergence of the least squares estimators of α h and β h when θ 1 = 0,θ 2 < 0 (undamped harmonic oscillator) is established. • (2) For θ 1 = 0,θ 2 < 0, simultaneous and sequential double asymptotics for ˆ α h −α h are considered and related to the rate of convergence of the maximum likelihood estimator of θ 2 . • (3) Testingθ 1 = 0 againstθ 1 < 0 (damping is present) is considered via param- eter β h . 43 Chapter 5 Double Roots on the Unit Circle This chapter deals with case II-3: double roots at 1. Under the assumption that θ 1 =θ 2 = 0, the model is X t − 2X t−1 +X t−2 =ξ t . (5.1) Using the lag operator, the model can be written as (1−L) 2 X t =ξ t . (5.2) Define u t (0) = (1−L) 2 X t = ξ t , u t (1) = (1−L)X t , u t (2) = X t . Then the following equation holds, 1 0 1 −1 X t X t−1 = u t (2) u t (1) . (5.3) A few lemmas related tou t (j) will firstly be proved, then the limiting distribution of the estimation errors defined in (1.13), will be stated. LEMMA 5.1 1 √ n P bntc k=1 u t (0) d → σ h (1 + ϕ h )W (t), where W (t) stands for a standard Brownian Motion. Proof. Since 1 √ n P bntc k=1 u t (0) = 1 √ n P bntc k=1 ξ t = 1 √ n P bntc k=1 (η t +ϕ h η t−1 ) = 1 √ n P bntc−1 k=1 (1 + ϕ h )η t +η bntc , By Donsker’s theorem [Jacod and Shiryaev (2003), P271] it converges to σ h (1 +ϕ h )W (t) in distribution. Lemma5.3provesafunctionalcentrallimittheorem. Inordertoprovethislemma, a mapping theorem from [Chan and Wei (1988), Theorem 2.3] is firstly introduced: LEMMA 5.2. Letx n andx be random elements taking values inR m . For each s∈ [0, 1] and any continuous function f :R m →R, define y(s) = (x 1 (s),...,x m (s)) and z(t) = Z t 0 f(y(s))ds. Similarly, define y n and z n . If x n d →x, then (x n ,z n ) d → (x,z). 44 LEMMA 5.3. For j = 1 and 2, n 1 2 −j u bntc (j) d → σ h (1 +ϕ h ) R t 0 F j−1 (s)ds. 
Lemma 5.3 leads to the following result.

THEOREM 5.4. Define $H_3=(H_3^{ij})_{i,j=1}^2$, where $H_3^{ij}=\int_0^1F_{i-1}(s)F_{j-1}(s)\,ds$ and $F_j(s)$ is defined in Lemma 5.3. Then
\[
\begin{pmatrix}n^{-2}&0\\0&n^{-1}\end{pmatrix}\begin{pmatrix}1&0\\1&-1\end{pmatrix}\Big(\sum_{k=1}^{n-1}\mathbf X_k\mathbf X_k^\top\Big)\begin{pmatrix}1&0\\1&-1\end{pmatrix}^{\!\top}\begin{pmatrix}n^{-2}&0\\0&n^{-1}\end{pmatrix}
\xrightarrow{d}\sigma_h^2(1+\varphi_h)^2H_3.
\]

Proof. The $(i,j)$-th term of the left side is $n^{-(i+j)}\sum_{k=1}^{n-1}u_k(i)u_k(j)$. By Lemma 5.2 and Lemma 5.3,
\[
n^{-(i+j)}\sum_{k=1}^{n-1}u_k(i)u_k(j) = n^{-1}\sum_{k=1}^{n-1}n^{\frac12-i}u_k(i)\,n^{\frac12-j}u_k(j)
\xrightarrow{d}\sigma_h^2(1+\varphi_h)^2H_3^{ij}.
\]

Recall Theorem 4.3, which is an analogue of defining an integral as the limit of a Riemann sum in calculus. With this theorem, the following can now be proved.

THEOREM 5.5. Define
\[
\zeta = \begin{pmatrix}\sigma_h^2(1+\varphi_h)^2\int_0^1F_1(s)\,dW(s)\\[3pt]\sigma_h^2(1+\varphi_h)^2\int_0^1F_0(s)\,dW(s)+\sigma_h^2\varphi_h\end{pmatrix};
\]
then
\[
\begin{pmatrix}n^{-2}&0\\0&n^{-1}\end{pmatrix}\begin{pmatrix}1&0\\1&-1\end{pmatrix}\sum_{k=1}^n\mathbf X_{k-1}\xi_k\xrightarrow{d}\zeta.
\]

Proof.
\[
\begin{pmatrix}n^{-2}&0\\0&n^{-1}\end{pmatrix}\begin{pmatrix}1&0\\1&-1\end{pmatrix}\sum_{k=1}^n\mathbf X_{k-1}\xi_k
= \begin{pmatrix}n^{-2}&0\\0&n^{-1}\end{pmatrix}\sum_{k=1}^n\begin{pmatrix}u_{k-1}(2)\\u_{k-1}(1)\end{pmatrix}\xi_k.
\]
The idea is to replace $\xi_k$ by $\eta_k+\varphi_h\eta_{k-1}$ and separate out martingale difference sequences:
\[
\sum_{k=1}^nu_{k-1}(2)\xi_k = \sum_{k=1}^nX_{k-1}(\eta_k+\varphi_h\eta_{k-1})
= \sum_{k=1}^nX_{k-1}\eta_k+\sum_{k=1}^n(2X_{k-2}-X_{k-3}+\eta_{k-1}+\varphi_h\eta_{k-2})\varphi_h\eta_{k-1}
\]
\[
= \sum_{k=1}^{n-1}\big[X_{k-1}+\varphi_h(2X_{k-1}-X_{k-2})\big]\eta_k+X_{n-1}\eta_n+\sum_{k=1}^n\eta_{k-1}^2+\sum_{k=1}^n\varphi_h\eta_{k-1}\eta_{k-2}
\]
\[
= \sum_{k=1}^{n-1}\big[(1+\varphi_h)X_{k-1}+\varphi_h(X_{k-1}-X_{k-2})\big]\eta_k+X_{n-1}\eta_n+\sum_{k=1}^n\eta_{k-1}^2+\sum_{k=1}^n\varphi_h\eta_{k-1}\eta_{k-2}
\]
\[
= \sum_{k=1}^{n-1}\big[(1+\varphi_h)u_{k-1}(2)+\varphi_h u_{k-1}(1)\big]\eta_k+X_{n-1}\eta_n+\sum_{k=1}^n\eta_{k-1}^2+\sum_{k=1}^n\varphi_h\eta_{k-1}\eta_{k-2}.
\]
The first term is a martingale transform, and for the last two terms
\[
n^{-2}\Big(\sum_{k=1}^n\eta_{k-1}^2+\sum_{k=1}^n\varphi_h\eta_{k-1}\eta_{k-2}\Big)\to0\quad\text{a.s.}
\]
From Lemma 5.3,
\[
n^{\frac12-2}\big[(1+\varphi_h)u_{\lfloor ns\rfloor}(2)+\varphi_h u_{\lfloor ns\rfloor}(1)\big]\xrightarrow{d}\sigma_h(1+\varphi_h)^2F_1(s),
\]
since the $u(1)$ contribution vanishes under this normalization. Using this together with Theorem 4.3,
\[
n^{-2}\sum_{k=1}^nu_{k-1}(2)\xi_k\xrightarrow{d}\sigma_h^2(1+\varphi_h)^2\int_0^1F_1(s)\,dW(s).
\]
For the other term, by definition $u_k(1)=X_k-X_{k-1}=X_{k-1}-X_{k-2}+\xi_k=u_{k-1}(1)+\xi_k$, so
\[
\sum_{k=1}^nu_{k-1}(1)\xi_k = \sum_{k=1}^nu_{k-2}(1)\xi_k+\sum_{k=1}^n\xi_{k-1}\xi_k
= \sum_{k=1}^nu_{k-2}(1)(\eta_k+\varphi_h\eta_{k-1})+\sum_{k=1}^n\xi_{k-1}\xi_k
= \sum_{k=1}^{n-1}\big[u_{k-2}(1)+\varphi_h u_{k-1}(1)\big]\eta_k+u_{n-2}(1)\eta_n+\sum_{k=1}^n\xi_{k-1}\xi_k.
\]
By the same argument as above,
\[
n^{-1}\sum_{k=1}^nu_{k-1}(1)\xi_k\xrightarrow{d}\sigma_h^2(1+\varphi_h)^2\int_0^1F_0(s)\,dW(s)+\sigma_h^2\varphi_h.
\]

Combining Theorems 5.4 and 5.5, the limiting distributions of the estimation errors can be attained.

THEOREM 5.6. The limiting distributions of the estimation errors are
\[
\begin{pmatrix}n^2&0\\0&n\end{pmatrix}\begin{pmatrix}1&1\\0&-1\end{pmatrix}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
\xrightarrow{d}H_3^{-1}\begin{pmatrix}\int_0^1F_1(s)\,dW(s)\\ \int_0^1F_0(s)\,dW(s)+\varphi_h/(1+\varphi_h)^2\end{pmatrix},
\]
where $H_3$ is defined in Theorem 5.4: $H_3=(H_3^{ij})_{i,j=1}^2$, $H_3^{ij}=\int_0^1F_{i-1}(s)F_{j-1}(s)\,ds$, with $F_0(t)=W(t)$ and $F_1(t)=\int_0^tF_0(s)\,ds=\int_0^tW(s)\,ds$.
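The following sketch (an illustration, not part of the thesis) checks the two rates in Theorem 5.6 on simulated data from model (5.1): after the change of coordinates by $\begin{pmatrix}1&1\\0&-1\end{pmatrix}$, the first error component is $O(n^{-2})$ and the second $O(n^{-1})$. The noise parameters are arbitrary assumptions.

```python
# Monte Carlo check of the n^2 / n rates in Theorem 5.6.
# True parameters for the double unit root: alpha_h = 2, beta_h = -1.
import numpy as np

rng = np.random.default_rng(2)
sigma_h, phi_h = 1.0, 0.4

for n in (5_000, 50_000, 500_000):
    eta = sigma_h * rng.standard_normal(n + 1)
    xi = eta[1:] + phi_h * eta[:-1]
    X = np.zeros(n)
    for k in range(2, n):
        X[k] = 2 * X[k - 1] - X[k - 2] + xi[k]
    y, R = X[2:], np.column_stack([X[1:-1], X[:-2]])
    a, b = np.linalg.solve(R.T @ R, R.T @ y)            # least squares
    e = np.array([[1, 1], [0, -1]]) @ np.array([a - 2.0, b + 1.0])
    print(n, n**2 * e[0], n * e[1])   # both scaled components stay O(1)
```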
Chapter 6
Distinct Roots Inside the Unit Circle

This chapter deals with the cases where both roots of the characteristic polynomial are inside the unit circle; it comprises two cases:

• I-3: $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2>0$;
• III-3: $rh\ne m\pi$, $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2<0$.

For the roots $z_{1,2}$ of the characteristic polynomial $\phi(z)=1-\alpha_hz-\beta_hz^2$, the first case corresponds to two distinct real roots inside the unit circle, and the second to conjugate complex roots inside the unit circle.

Recall (1.12); the model in matrix form is
\[
\mathbf X_{k+1} = B\mathbf X_k+\boldsymbol\xi_{k+1},
\]
in which $\mathbf X_k=(X_{k+1},X_k)^\top$, $\boldsymbol\xi_{k+1}=(\xi_{k+1},0)^\top$, and
\[
B = \begin{pmatrix}\alpha_h&\beta_h\\1&0\end{pmatrix}.
\]
$B$ can be diagonalized as $KBK^{-1}=\Lambda$, where $\Lambda=\operatorname{diag}(\lambda_1,\lambda_2)$,
\[
\lambda_{1,2} = \frac{\alpha_h\pm\sqrt{\alpha_h^2+4\beta_h}}{2}
\]
are the eigenvalues of $B$, and
\[
K = \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}1&-\lambda_2\\-1&\lambda_1\end{pmatrix}.
\]
The relationship between $\lambda_{1,2}$ and $z_{1,2}$ is $\lambda_{1,2}=1/z_{1,2}$, so under the present assumptions $|\lambda_{1,2}|>1$.
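As a quick numerical illustration of this decomposition (not part of the thesis; the values of $\alpha_h$, $\beta_h$ are arbitrary), the sketch below builds the companion matrix $B$, checks $KBK^{-1}=\Lambda$, and confirms $\lambda_{1,2}=1/z_{1,2}$.

```python
# Verify the diagonalization K B K^{-1} = Lambda and lambda = 1/z for the
# companion matrix of X_{k+1} = alpha_h X_k + beta_h X_{k-1} + xi_{k+1}.
import numpy as np

alpha_h, beta_h = 2.7, -1.8                    # gives lambda = 1.5 and 1.2
B = np.array([[alpha_h, beta_h], [1.0, 0.0]])

disc = np.sqrt(alpha_h**2 + 4 * beta_h)
lam1, lam2 = (alpha_h + disc) / 2, (alpha_h - disc) / 2
Lam = np.diag([lam1, lam2])
K = np.array([[1.0, -lam2], [-1.0, lam1]]) / (lam1 - lam2)

print(np.allclose(K @ B @ np.linalg.inv(K), Lam))      # True
# Roots z of phi(z) = 1 - alpha_h z - beta_h z^2 are the reciprocals:
z = np.roots([-beta_h, -alpha_h, 1.0])                  # -b z^2 - a z + 1
print(np.sort(1.0 / z), np.sort([lam1, lam2]))          # same values
```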
The estimation error matrix is
\[
\hat B_n-B = \Big(\sum_{i=1}^n\boldsymbol\xi_i\mathbf X_{i-1}^\top\Big)\Big(\sum_{i=1}^n\mathbf X_{i-1}\mathbf X_{i-1}^\top\Big)^{-1}.
\]
Write $\mathbf X_n$ as a sum of noises:
\[
\mathbf X_n = B\mathbf X_{n-1}+\boldsymbol\xi_n = \dots = \sum_{i=1}^nB^{n-i}\boldsymbol\xi_i.
\]
In these two cases $\|\mathbf X_n\|$ may blow up, which means $\sum_{i=1}^n\mathbf X_{i-1}\mathbf X_{i-1}^\top$ may blow up and its limiting distribution cannot be attained directly. The idea is to multiply the $\mathbf X_n$ term by $B^{-(n-1)}$ and then derive the asymptotic distribution with respect to $B^{-(n-1)}\mathbf X_n$ instead of $\mathbf X_n$. For brevity, define
\[
U_n = \sum_{i=1}^n\boldsymbol\xi_i\mathbf X_{i-1}^\top,\qquad V_n = \sum_{i=1}^n\mathbf X_{i-1}\mathbf X_{i-1}^\top.
\]
For the rest of this chapter, the limiting distributions of $U_n$ and $V_n$ will be studied. Further define
\[
J_n = B^{-(n-2)}\mathbf X_{n-1} = \sum_{i=1}^{n-1}B^{-(i-1)}\boldsymbol\xi_i,\qquad
J = \sum_{i=1}^{\infty}B^{-(i-1)}\boldsymbol\xi_i, \tag{6.1}
\]
\[
F_n = J_nJ_n^\top+B^{-1}J_nJ_n^\top(B^\top)^{-1}+\dots+B^{-(n-1)}J_nJ_n^\top(B^\top)^{-(n-1)}, \tag{6.2}
\]
\[
G_n = \boldsymbol\xi_nJ_n^\top+\boldsymbol\xi_{n-1}J_n^\top(B^\top)^{-1}+\dots+\boldsymbol\xi_1J_n^\top(B^\top)^{-(n-1)}. \tag{6.3}
\]
It will first be proved, in Lemmas 6.1 and 6.2, that the differences between $B^{-(n-2)}V_n(B^\top)^{-(n-2)}$ and $F_n$, as well as between $U_n(B^\top)^{-(n-2)}$ and $G_n$, go to 0 in probability. So instead of dealing with the limiting distributions of $B^{-(n-2)}V_n(B^\top)^{-(n-2)}$ and $U_n(B^\top)^{-(n-2)}$ directly, $F_n$ and $G_n$ will be considered, which are easier.

LEMMA 6.1. $B^{-(n-2)}V_n(B^\top)^{-(n-2)}-F_n\to0$ in probability.

Proof. By definition, $J_n=B^{-(n-2)}\boldsymbol\xi_{n-1}+B^{-(n-3)}\boldsymbol\xi_{n-2}+\dots+\boldsymbol\xi_1$. The difference between $J_n$ and $J_i$ is
\[
J_n-J_i = B^{-(n-2)}\boldsymbol\xi_{n-1}+B^{-(n-3)}\boldsymbol\xi_{n-2}+\dots+B^{-(i-1)}\boldsymbol\xi_i.
\]
This property can be used to rewrite the difference:
\[
B^{-(n-2)}V_n(B^\top)^{-(n-2)}-F_n
= B^{-(n-2)}V_n(B^\top)^{-(n-2)}-\sum_{i=0}^{n-1}B^{-(n-i-1)}J_nJ_n^\top(B^\top)^{-(n-i-1)} \tag{6.4}
\]
\[
= \sum_{i=0}^{n-1}B^{-(n-i-1)}B^{-(i-1)}\mathbf X_i\mathbf X_i^\top(B^\top)^{-(i-1)}(B^\top)^{-(n-i-1)}-\sum_{i=0}^{n-1}B^{-(n-i-1)}J_nJ_n^\top(B^\top)^{-(n-i-1)}
= \sum_{i=0}^{n-1}B^{-(n-i-1)}\big(J_{i+1}J_{i+1}^\top-J_nJ_n^\top\big)(B^\top)^{-(n-i-1)}. \tag{6.5}
\]
While $-J_{i+1}J_{i+1}^\top+J_nJ_n^\top$ expands into terms of the form $B^{-(n-2)}\boldsymbol\xi_{n-1}\boldsymbol\xi_{n-1}^\top(B^\top)^{-(n-2)}+\dots+B^{-i}\boldsymbol\xi_{i+1}\boldsymbol\xi_{i+1}^\top(B^\top)^{-i}$, the expectation of the norm of this term can be bounded from above:
\[
\mathbb E\|J_{i+1}J_{i+1}^\top-J_nJ_n^\top\| \le \|B^{-(n-2)}\|^2\mathbb E\|\boldsymbol\xi_{n-1}\boldsymbol\xi_{n-1}^\top\|+\dots+\|B^{-i}\|^2\mathbb E\|\boldsymbol\xi_{i+1}\boldsymbol\xi_{i+1}^\top\|
= O\big([\max(\lambda_1^{-1},\lambda_2^{-1})]^{2(n-2)}\big)+\dots+O\big([\max(\lambda_1^{-1},\lambda_2^{-1})]^{2i}\big)\to0. \tag{6.6}
\]
Inserting (6.6) back into (6.5), an upper bound for (6.4) is attained:
\[
\mathbb E\|B^{-(n-2)}V_n(B^\top)^{-(n-2)}-F_n\| \le \sum_{i=0}^{n-1}\|B^{-(n-i-1)}\|^2\big(\|B^{-(n-2)}\|^2+\dots+\|B^{-i}\|^2\big),
\]
and this shows that the original expression converges to 0 in probability.

LEMMA 6.2. $U_n(B^\top)^{-(n-2)}-G_n\to0$ in probability.

Proof. The idea is similar to that of Lemma 6.1: separate out the $J_n-J_i$ term and show that there is a vanishing upper bound.
\[
\sum_{i=1}^n\boldsymbol\xi_i\mathbf X_{i-1}^\top(B^\top)^{-(n-2)}-\sum_{i=0}^{n-1}\boldsymbol\xi_{n-i}J_n^\top(B^\top)^{-i}
= \sum_{i=1}^n\Big[\boldsymbol\xi_i\mathbf X_{i-1}^\top(B^\top)^{-(i-2)}(B^\top)^{-(n-i)}-\boldsymbol\xi_iJ_n^\top(B^\top)^{-(n-i)}\Big]
= \sum_{i=1}^n\boldsymbol\xi_i\big(J_i^\top-J_n^\top\big)(B^\top)^{-(n-i)}.
\]
Using (6.6), the expectation of the norm of the difference is bounded by
\[
\mathbb E\Big\|\sum_{i=1}^n\boldsymbol\xi_i\mathbf X_{i-1}^\top(B^\top)^{-(n-2)}-\sum_{i=0}^{n-1}\boldsymbol\xi_{n-i}J_n^\top(B^\top)^{-i}\Big\|
\le \mathbb E\Big\|\sum_{i=1}^n\boldsymbol\xi_i(J_i^\top-J_n^\top)(B^\top)^{-(n-i)}\Big\|
\le \|(B^\top)^{-(n-i)}\|\Big(\mathbb E\|\boldsymbol\xi_i\|^2\sum_{i=1}^n\mathbb E\|J_i^\top-J_n^\top\|^2\Big)^{\frac12}.
\]

Define $\Gamma=(\Gamma_{i,j})$, in which
\[
\Gamma_{i,j} = \frac{1}{1-\lambda_i^{-1}\lambda_j^{-1}}.
\]
Let $\tilde J_n$ be the diagonal matrix whose $i$-th diagonal element is the $i$-th element of $KJ_n$. The following theorem gives the limit of $F_n$.

THEOREM 6.3. $F_n-K^{-1}\tilde J_n\Gamma\tilde J_n(K^\top)^{-1}\to0$ in probability.

Proof.
\[
F_n = J_nJ_n^\top+K^{-1}\Lambda^{-1}KJ_nJ_n^\top K^\top\Lambda^{-1}(K^\top)^{-1}+\dots+K^{-1}\Lambda^{-(n-1)}KJ_nJ_n^\top K^\top\Lambda^{-(n-1)}(K^\top)^{-1}
\]
\[
= K^{-1}\Big(KJ_nJ_n^\top K^\top+\Lambda^{-1}KJ_nJ_n^\top K^\top\Lambda^{-1}+\dots+\Lambda^{-(n-1)}KJ_nJ_n^\top K^\top\Lambda^{-(n-1)}\Big)(K^\top)^{-1}.
\]
The terms inside the parentheses share the common part $KJ_nJ_n^\top K^\top$. The $(i,j)$-th term of $\Lambda^{-k}KJ_nJ_n^\top K^\top\Lambda^{-k}$ is $(KJ_n)_i\frac{1}{(\lambda_i\lambda_j)^k}(KJ_n)_j^\top$. As $n\to\infty$,
\[
1+\frac{1}{\lambda_i\lambda_j}+\dots+\frac{1}{(\lambda_i\lambda_j)^{n-1}} = \frac{1-(\lambda_i\lambda_j)^{-n}}{1-(\lambda_i\lambda_j)^{-1}}\to\frac{1}{1-(\lambda_i\lambda_j)^{-1}},
\]
and $\Gamma$ is defined precisely so that $\Gamma_{i,j}=\frac{1}{1-(\lambda_i\lambda_j)^{-1}}$; hence the $(i,j)$-th element in the parentheses goes to $(KJ)_i\frac{1}{1-(\lambda_i\lambda_j)^{-1}}(KJ)_j^\top$. That leads to the result: $F_n-K^{-1}\tilde J_n\Gamma\tilde J_n(K^\top)^{-1}\to0$ in probability.
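A small numeric check of this geometric-series limit may be helpful (illustrative only; the values of $\lambda_{1,2}$ and the vector $v$ standing in for $KJ_n$ are arbitrary):

```python
# Compare sum_{k=0}^{n-1} Lam^{-k} v v^T Lam^{-k} with the claimed limit
# built from Gamma_{ij} = 1/(1 - (lambda_i lambda_j)^{-1}).
import numpy as np

lam = np.array([1.5, 1.2])                 # both eigenvalues outside the unit circle
v = np.array([0.7, -1.3])                  # stand-in for KJ_n
n = 200

S = sum(np.diag(lam**-k) @ np.outer(v, v) @ np.diag(lam**-k) for k in range(n))
Gamma = 1.0 / (1.0 - 1.0 / np.outer(lam, lam))
print(np.allclose(S, np.diag(v) @ Gamma @ np.diag(v)))    # True
```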
THEOREM 6.4. Define the $j$-th column of $L'_n$ to be
\[
(L'_n)_{\cdot,j} = \boldsymbol\xi_n+\lambda_j^{-1}\boldsymbol\xi_{n-1}+\dots+\lambda_j^{-(n-1)}\boldsymbol\xi_1.
\]
Then $G_n-L'_n\tilde J_n(K^\top)^{-1}\to0$ in probability.

Proof. The proof is similar to that of Theorem 6.3. Rewrite $G_n$ as
\[
G_n = \boldsymbol\xi_nJ_n^\top+\boldsymbol\xi_{n-1}J_n^\top(B^\top)^{-1}+\dots+\boldsymbol\xi_1J_n^\top(B^\top)^{-(n-1)}
= \big(\boldsymbol\xi_nJ_n^\top K^\top+\boldsymbol\xi_{n-1}J_n^\top K^\top\Lambda^{-1}+\dots+\boldsymbol\xi_1J_n^\top K^\top\Lambda^{-(n-1)}\big)(K^\top)^{-1}.
\]
Then $G_n-L'_n\tilde J_n(K^\top)^{-1}\to0$ in probability.

THEOREM 6.5. Set $J'_n=KJ_n$. For $j=1$ and $2$, $\big((L'_n)_{\cdot,j},J'_n\big)\to\big(L'_{\cdot,j},J'\big)$, and $(L'_{\cdot,j},J')$ are independent bivariate normals.

Proof. Since $J'_n$ and $(L'_n)_{\cdot,j}$ are both sums of bivariate white noises, they are both bivariate Gaussian. First, the independence of $L'_{\cdot,j}$ and $J'$ is proved using moment generating functions; the cross terms vanish in the limit, so the exponent factors:
\[
\lim_{n\to\infty}m_{(L'_n)_{\cdot,j}+J'_n}(t)
= \lim_{n\to\infty}m_{\sum_{i=1}^{n-1}\big(\lambda_j^{-(n-i)}+\varphi_h\lambda_j^{-(n-i-1)}+\Lambda^{-(i-1)}K+\varphi_h\Lambda^{-i}K\big)\boldsymbol\eta_i+\big(1+\Lambda^{-(n-1)}K\big)\boldsymbol\eta_n}(t)
\]
\[
= \lim_{n\to\infty}\exp\Big\{-\frac{\sigma_h^2}{2}t^\top\Big[\sum_{i=1}^{n-1}\big(\lambda_j^{-(n-i)}+\varphi_h\lambda_j^{-(n-i-1)}\big)\big(\lambda_j^{-(n-i)}+\varphi_h\lambda_j^{-(n-i-1)}\big)^\top+1\Big]t\Big\}
\]
\[
\cdot\exp\Big\{-\frac{\sigma_h^2}{2}t^\top\Big[\sum_{i=1}^{n-1}\big(\Lambda^{-(i-1)}K+\varphi_h\Lambda^{-i}K\big)\big(\Lambda^{-(i-1)}K+\varphi_h\Lambda^{-i}K\big)^\top+\big(\Lambda^{-(n-1)}K\big)\big(\Lambda^{-(n-1)}K\big)^\top\Big]t\Big\}
= m_{L'_{\cdot,j}}(t)\cdot m_{J'}(t).
\]
Thus $L'_{\cdot,j}$ and $KJ$ are independent.

Replacing $\boldsymbol\xi_i$ by $\boldsymbol\eta_i+\varphi_h\boldsymbol\eta_{i-1}$,
\[
J'_n = \sum_{i=1}^n\Lambda^{-(i-1)}K\boldsymbol\xi_i
= \frac{1}{\lambda_1-\lambda_2}\sum_{i=1}^n\begin{pmatrix}\lambda_1^{-(i-1)}&0\\0&\lambda_2^{-(i-1)}\end{pmatrix}\begin{pmatrix}1&-\lambda_2\\-1&\lambda_1\end{pmatrix}\begin{pmatrix}\xi_i\\0\end{pmatrix}
= \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}\sum_{i=1}^n\lambda_1^{-(i-1)}\xi_i\\-\sum_{i=1}^n\lambda_2^{-(i-1)}\xi_i\end{pmatrix} \tag{6.7}
\]
\[
= \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}\sum_{i=1}^{n-1}(\lambda_1+\varphi_h)\lambda_1^{-i}\eta_i+\lambda_1^{-(n-1)}\eta_n+\varphi_h\eta_0\\-\sum_{i=1}^{n-1}(\lambda_2+\varphi_h)\lambda_2^{-i}\eta_i-\lambda_2^{-(n-1)}\eta_n-\varphi_h\eta_0\end{pmatrix}.
\]
The covariance matrix $\Sigma_4$ of the vector in the parentheses above can then be computed:
\[
\Sigma_4^{1,1}(n) = \sigma_h^2\Big[\sum_{i=1}^{n-1}(\lambda_1+\varphi_h)^2\lambda_1^{-2i}+\lambda_1^{-2(n-1)}+\varphi_h^2\Big]
= \sigma_h^2\Big[(\lambda_1+\varphi_h)^2\frac{1-\lambda_1^{-2(n-1)}}{\lambda_1^2-1}+\lambda_1^{-2(n-1)}+\varphi_h^2\Big]
\to\frac{\lambda_1^2\gamma_h(0)+2\lambda_1\gamma_h(1)}{\lambda_1^2-1},
\]
with $\Sigma_4^{2,2}(n)\to\frac{\lambda_2^2\gamma_h(0)+2\lambda_2\gamma_h(1)}{\lambda_2^2-1}$ by the same computation, and
\[
\Sigma_4^{1,2}(n) = \Sigma_4^{2,1}(n)
= \sigma_h^2\Big[-\sum_{i=1}^{n-1}(\lambda_1+\varphi_h)(\lambda_2+\varphi_h)(\lambda_1\lambda_2)^{-i}-(\lambda_1\lambda_2)^{-(n-1)}-\varphi_h^2\Big]
\to-\frac{\lambda_1\lambda_2\gamma_h(0)+(\lambda_1+\lambda_2)\gamma_h(1)}{\lambda_1\lambda_2-1}.
\]
Letting $n$ go to infinity, the covariance matrix is
\[
\Sigma_4 = \begin{pmatrix}\dfrac{\lambda_1^2\gamma_h(0)+2\lambda_1\gamma_h(1)}{\lambda_1^2-1}&-\dfrac{\lambda_1\lambda_2\gamma_h(0)+(\lambda_1+\lambda_2)\gamma_h(1)}{\lambda_1\lambda_2-1}\\[8pt]-\dfrac{\lambda_1\lambda_2\gamma_h(0)+(\lambda_1+\lambda_2)\gamma_h(1)}{\lambda_1\lambda_2-1}&\dfrac{\lambda_2^2\gamma_h(0)+2\lambda_2\gamma_h(1)}{\lambda_2^2-1}\end{pmatrix},
\]
and $J'\sim N\big(0,\frac{1}{(\lambda_1-\lambda_2)^2}\Sigma_4\big)$. For $L'_n$, the second row of the matrix is 0. Define the transpose of the first row of $L'_n$ as $L_n$:
\[
L_n = (L'_n)_{1,\cdot}^\top
= \begin{pmatrix}\xi_n+\xi_{n-1}\lambda_1^{-1}+\dots+\xi_1\lambda_1^{-(n-1)}\\ \xi_n+\xi_{n-1}\lambda_2^{-1}+\dots+\xi_1\lambda_2^{-(n-1)}\end{pmatrix}
= \begin{pmatrix}\sum_{i=1}^{n-1}(1+\varphi_h\lambda_1)\lambda_1^{-(n-i)}\eta_i+\eta_n+\varphi_h\lambda_1^{-(n-1)}\eta_0\\ \sum_{i=1}^{n-1}(1+\varphi_h\lambda_2)\lambda_2^{-(n-i)}\eta_i+\eta_n+\varphi_h\lambda_2^{-(n-1)}\eta_0\end{pmatrix}.
\]
The limit is a bivariate normal, $L\sim N(0,\Sigma_5)$, and the covariance matrix $\Sigma_5=\Sigma_4$.

THEOREM 6.6. If $(L'_n,J_n)$ has the limiting distribution $(L',J)$, then
\[
(\hat B_n-B)B^n\xrightarrow{d}L'\Gamma^{-1}\tilde J^{-1}K.
\]
Since the parameters to be estimated are just $\alpha_h$ and $\beta_h$, their estimation errors satisfy
\[
(B^\top)^n\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}\xrightarrow{d}K^\top\tilde J^{-1}\Gamma^{-1}L,
\]
where
\[
K = \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}1&-\lambda_2\\-1&\lambda_1\end{pmatrix},\qquad \tilde J = \begin{pmatrix}J'_1&0\\0&J'_2\end{pmatrix},
\]
and $J'\sim N\big(0,\frac{1}{(\lambda_1-\lambda_2)^2}\Sigma_4\big)$ and $L\sim N(0,\Sigma_4)$ are independent bivariate normals; $J'_1,J'_2$ are the first and second elements of the vector $J'$.
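To see the $(B^\top)^n$ normalization of Theorem 6.6 at work, here is a short Monte Carlo sketch (an illustration with arbitrary parameter values, not from the thesis). The scaled errors settle to a nondegenerate ratio-of-normals (Cauchy-type) law, visible in the instability of the sample mean relative to the quantiles.

```python
# Simulate the explosive case and apply the (B^T)^n normalization.
import numpy as np

rng = np.random.default_rng(3)
alpha_h, beta_h, phi_h, n = 2.7, -1.8, 0.3, 60     # lambda = 1.5, 1.2
B = np.array([[alpha_h, beta_h], [1.0, 0.0]])
Bn = np.linalg.matrix_power(B.T, n)

scaled = []
for _ in range(2000):
    eta = rng.standard_normal(n + 1)
    xi = eta[1:] + phi_h * eta[:-1]
    X = np.zeros(n)
    for k in range(2, n):
        X[k] = alpha_h * X[k - 1] + beta_h * X[k - 2] + xi[k]
    y, R = X[2:], np.column_stack([X[1:-1], X[:-2]])
    a, b = np.linalg.solve(R.T @ R, R.T @ y)
    scaled.append(Bn @ np.array([a - alpha_h, b - beta_h]))

scaled = np.array(scaled)
# Heavy tails: the sample mean is unstable while the median is not.
print(np.median(np.abs(scaled), axis=0), np.mean(np.abs(scaled), axis=0))
```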
Chapter 7
Double Roots inside the Unit Circle

This chapter treats case II-2: $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2=0$, when the two roots of the characteristic polynomial are equal and less than 1. The model is
\[
X_k = 2\lambda_1X_{k-1}-\lambda_1^2X_{k-2}+\xi_k,
\]
and in the matrix form (1.12),
\[
B = \begin{pmatrix}2\lambda_1&-\lambda_1^2\\1&0\end{pmatrix},
\]
whose only eigenvalue is $\lambda_1$. Lemmas 6.1 and 6.2 from the last chapter still hold, since no decomposition of $B$ is involved in their proofs, but the limiting distributions of $F_n$ and $G_n$, as defined in (6.2) and (6.3), will be different. The Jordan decomposition of $B$ is $KBK^{-1}=\Lambda$, in which
\[
K = \begin{pmatrix}1&-\lambda_1\\0&1\end{pmatrix},\qquad K^{-1} = \begin{pmatrix}1&\lambda_1\\0&1\end{pmatrix},\qquad \Lambda = \begin{pmatrix}\lambda_1&0\\1&\lambda_1\end{pmatrix}.
\]
First, the limiting distribution of $J'_n=KJ_n$ is calculated.

LEMMA 7.1. The limit of $J'_n$ is $J'$, which has a normal distribution.

Proof. Recall (6.7) from Theorem 6.5:
\[
J'_n = KJ_n = \sum_{i=1}^n\Lambda^{-(i-1)}K\boldsymbol\xi_i
= \sum_{i=1}^n\begin{pmatrix}\lambda_1^{-(i-1)}&0\\-(i-1)\lambda_1^{-i}&\lambda_1^{-(i-1)}\end{pmatrix}\begin{pmatrix}1&-\lambda_1\\0&1\end{pmatrix}\begin{pmatrix}\xi_i\\0\end{pmatrix}
= \sum_{i=1}^n\begin{pmatrix}\lambda_1^{-(i-1)}\xi_i\\-(i-1)\lambda_1^{-i}\xi_i\end{pmatrix}
\]
\[
= \begin{pmatrix}\sum_{i=1}^{n-1}(\lambda_1+\varphi_h)\frac{\eta_i}{\lambda_1^i}+\varphi_h\eta_0+\frac{\eta_n}{\lambda_1^{n-1}}\\-\sum_{i=1}^{n-1}\big[\big(1+\frac{\varphi_h}{\lambda_1}\big)i-1\big]\frac{\eta_i}{\lambda_1^i}-(n-1)\frac{\eta_n}{\lambda_1^n}\end{pmatrix}.
\]
Clearly this converges to a bivariate normal distribution with mean 0. For the elements of the covariance matrix $\Sigma_6$:
\[
\Sigma_6^{1,1} = \sigma_h^2\Big[(\lambda_1+\varphi_h)^2\sum_{i=1}^{\infty}\lambda_1^{-2i}+\varphi_h^2\Big] = \frac{\lambda_1^2\gamma_h(0)+2\lambda_1\gamma_h(1)}{\lambda_1^2-1},
\]
\[
\Sigma_6^{1,2} = \sigma_h^2\sum_{i=1}^{\infty}(\lambda_1+\varphi_h)\Big[\Big(1+\frac{\varphi_h}{\lambda_1}\Big)i-1\Big]\lambda_1^{-2i} = \frac{\lambda_1\gamma_h(0)+(\lambda_1^2+1)\gamma_h(1)}{(\lambda_1^2-1)^2},
\]
\[
\Sigma_6^{2,2} = \sigma_h^2\sum_{i=1}^{\infty}\Big[\Big(1+\frac{\varphi_h}{\lambda_1}\Big)i-1\Big]^2\lambda_1^{-2i} = \frac{(1+\lambda_1^2)\gamma_h(0)+4\lambda_1\gamma_h(1)}{(\lambda_1^2-1)^3}.
\]
Thus $J'_n=KJ_n\xrightarrow{d}N(0,\Sigma_6)$.
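The computation above relies on the closed form for powers of the inverse Jordan block, $\Lambda^{-k}=\begin{pmatrix}\lambda_1^{-k}&0\\-k\lambda_1^{-(k+1)}&\lambda_1^{-k}\end{pmatrix}$. A two-line numeric check (with an arbitrary $\lambda_1$, purely illustrative):

```python
# Verify Lambda^{-k} = [[lam^-k, 0], [-k lam^-(k+1), lam^-k]] for the
# Jordan block Lambda = [[lam, 0], [1, lam]].
import numpy as np

lam, k = 1.3, 7
Lam = np.array([[lam, 0.0], [1.0, lam]])
closed = np.array([[lam**-k, 0.0], [-k * lam**-(k + 1), lam**-k]])
print(np.allclose(np.linalg.matrix_power(np.linalg.inv(Lam), k), closed))  # True
```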
With the limiting distribution of $KJ_n$ attained, the limiting distributions of $F_n$ and $G_n$ can be calculated in a similar way.

LEMMA 7.2. The limiting distribution of $F_n$ is of Cauchy type.

Proof. Recall (6.2):
\[
F_n = K^{-1}\big[KJ_nJ_n^\top K^\top+\dots+\Lambda^{-(n-1)}KJ_nJ_n^\top K^\top(\Lambda^\top)^{-(n-1)}\big](K^\top)^{-1}.
\]
The columns of the $k$-th term in the brackets are
\[
\big[\Lambda^{-k}KJ_n(KJ_n)^\top(\Lambda^\top)^{-k}\big]_{\cdot,1}
= \begin{pmatrix}(KJ_n)_1^2\lambda_1^{-2k}\\(KJ_n)_1\big[-(KJ_n)_1k\lambda_1^{-1}+(KJ_n)_2\big]\lambda_1^{-2k}\end{pmatrix},
\]
\[
\big[\Lambda^{-k}KJ_n(KJ_n)^\top(\Lambda^\top)^{-k}\big]_{\cdot,2}
= \begin{pmatrix}(KJ_n)_1\big[-(KJ_n)_1k\lambda_1^{-1}+(KJ_n)_2\big]\lambda_1^{-2k}\\\big[-(KJ_n)_1k\lambda_1^{-1}+(KJ_n)_2\big]^2\lambda_1^{-2k}\end{pmatrix},
\]
and the terms in the brackets sum to
\[
\sum_{k=0}^{n-1}\Lambda^{-k}KJ_n(KJ_n)^\top(\Lambda^\top)^{-k}
= \begin{pmatrix}(KJ_n)_1&0\\(KJ_n)_2&(KJ_n)_1\end{pmatrix}\sum_{k=0}^{n-1}\begin{pmatrix}\lambda_1^{-2k}&-k\lambda_1^{-(2k+1)}\\-k\lambda_1^{-(2k+1)}&k^2\lambda_1^{-(2k+2)}\end{pmatrix}\begin{pmatrix}(KJ_n)_1&(KJ_n)_2\\0&(KJ_n)_1\end{pmatrix}.
\]
Thus the limit of $F_n$ is
\[
F_n\to K^{-1}P(K^{-1})^\top, \tag{7.1}
\]
where
\[
P = \begin{pmatrix}(KJ)_1&0\\(KJ)_2&(KJ)_1\end{pmatrix}
\begin{pmatrix}\frac{\lambda_1^2}{\lambda_1^2-1}&-\frac{\lambda_1}{(\lambda_1^2-1)^2}\\-\frac{\lambda_1}{(\lambda_1^2-1)^2}&\frac{\lambda_1^2+1}{(\lambda_1^2-1)^3}\end{pmatrix}
\begin{pmatrix}(KJ)_1&(KJ)_2\\0&(KJ)_1\end{pmatrix}.
\]

THEOREM 7.3. $G_n$ converges to a Cauchy-type distribution.

Proof. In the same way as in Lemma 7.2, rewrite $G_n$:
\[
G_n^\top = \sum_{i=0}^{n-1}B^{-i}J_n\boldsymbol\xi_{n-i}^\top
= K^{-1}\big[KJ_n\boldsymbol\xi_n^\top+\Lambda^{-1}KJ_n\boldsymbol\xi_{n-1}^\top+\dots+\Lambda^{-(n-1)}KJ_n\boldsymbol\xi_1^\top\big]
\]
\[
= K^{-1}\sum_{k=0}^{n-1}\begin{pmatrix}(KJ_n)_1&0\\(KJ_n)_2&(KJ_n)_1\end{pmatrix}\begin{pmatrix}\lambda_1^{-k}\\-k\lambda_1^{-(k+1)}\end{pmatrix}\begin{pmatrix}\xi_{n-k}&0\end{pmatrix}
= K^{-1}\begin{pmatrix}(KJ_n)_1&0\\(KJ_n)_2&(KJ_n)_1\end{pmatrix}\begin{pmatrix}\sum_{k=1}^n\lambda_1^{-(n-k)}\xi_k&0\\-\sum_{k=1}^n(n-k)\lambda_1^{-(n-k+1)}\xi_k&0\end{pmatrix}.
\]
Before deducing the limiting distribution of $G_n$, the following lemma is proved to show that $J'_n=KJ_n$ and
\[
L_n = \begin{pmatrix}\sum_{k=1}^n\lambda_1^{-(n-k)}\xi_k\\-\sum_{k=1}^n(n-k)\lambda_1^{-(n-k)}\xi_k\end{pmatrix}
\]
become independent as $n$ goes to infinity.

LEMMA 7.4. $\zeta_1=\sum_{i=1}^{n-1}\eta_i/\lambda_1^i$ and $\zeta_2=\sum_{i=1}^{n-1}\eta_i/\lambda_1^{n-i}$ are asymptotically independent as $n$ goes to infinity.

Proof. The proof uses moment generating functions:
\[
m_{\zeta_1+\zeta_2}(t) = m_{(\lambda_1^{-(n-1)}+\lambda_1^{-1})\eta_1}(t)\cdots m_{(\lambda_1^{-1}+\lambda_1^{-(n-1)})\eta_{n-1}}(t)
= \prod_{i=1}^{n-1}\exp\Big\{-\frac{(\sigma_ht)^2}{2}\big(\lambda_1^{-(n-i)}+\lambda_1^{-i}\big)^2\Big\}
\]
\[
= \exp\Big\{-\frac{\sigma_h^2t^2}{2}\Big(2\sum_{i=1}^{n-1}\lambda_1^{-2i}+2\sum_{i=1}^{n-1}\lambda_1^{-i}\lambda_1^{-(n-i)}\Big)\Big\}
= \exp\Big\{-\sigma_h^2t^2\Big(\frac{\lambda_1^{-2}-\lambda_1^{-2n}}{1-\lambda_1^{-2}}+\frac{n-1}{\lambda_1^n}\Big)\Big\}.
\]
While $m_{\zeta_1}(t)=m_{\zeta_2}(t)=\exp\big\{-\frac{\sigma_h^2t^2}{2}\frac{\lambda_1^{-2}-\lambda_1^{-2n}}{1-\lambda_1^{-2}}\big\}$, as $n$ goes to infinity $m_{\zeta_1+\zeta_2}(t)=m_{\zeta_1}(t)m_{\zeta_2}(t)$, which proves that $\zeta_1$ and $\zeta_2$ are asymptotically independent.

LEMMA 7.5. $J'_n$ is independent of $L_n$ as $n$ goes to infinity.

Proof. Since
\[
\begin{pmatrix}\sum_{k=1}^n\lambda_1^{-(n-k)}\xi_k\\ \sum_{k=1}^n(n-k)\lambda_1^{-(n-k)}\xi_k\end{pmatrix}
= \begin{pmatrix}\sum_{k=1}^{n-1}(1+\varphi_h\lambda_1)\lambda_1^{-(n-k)}\eta_k+\eta_n+\varphi_h\lambda_1^{-(n-1)}\eta_0\\ \sum_{k=1}^{n-1}\big[(n-k)\lambda_1^{-1}+(n-k-1)\varphi_h\big]\lambda_1^{-(n-k)}\eta_k+\varphi_h(n-1)\lambda_1^{-(n-1)}\eta_0\end{pmatrix},
\]
modifying the proof of Lemma 7.4 shows the two terms are asymptotically independent.

Proof (continuation of the proof of Theorem 7.3). In the limit, the vector above is bivariate normal with mean 0, and its covariance matrix needs to be calculated:
\[
\Sigma_7^{1,1} = \sigma_h^2\Big[\sum_{k=1}^{n-1}(1+\varphi_h\lambda_1)^2\lambda_1^{-2(n-k)}+1\Big]
= \sigma_h^2\Big[\frac{\lambda_1^{-2}-\lambda_1^{-2n}}{1-\lambda_1^{-2}}(1+\varphi_h\lambda_1)^2+1\Big]
\to\sigma_h^2\frac{(1+\varphi_h^2)\lambda_1^2+2\varphi_h\lambda_1}{\lambda_1^2-1} = \frac{\lambda_1^2\gamma_h(0)+2\lambda_1\gamma_h(1)}{\lambda_1^2-1},
\]
\[
\Sigma_7^{1,2} = \sigma_h^2\sum_{k=1}^{n-1}(1+\varphi_h\lambda_1)\big[(n-k)\lambda_1^{-1}+(n-k-1)\varphi_h\big]\lambda_1^{-2(n-k)}
\to\sigma_h^2\frac{(1+\varphi_h\lambda_1)(\lambda_1+\varphi_h)}{(\lambda_1^2-1)^2} = \frac{\lambda_1\gamma_h(0)+(\lambda_1^2+1)\gamma_h(1)}{(\lambda_1^2-1)^2},
\]
\[
\Sigma_7^{2,2} = \sigma_h^2\sum_{k=1}^{n-1}\big[(n-k)\lambda_1^{-1}+(n-k-1)\varphi_h\big]^2\lambda_1^{-2(n-k)}
\to\sigma_h^2\frac{(1+\varphi_h^2)(\lambda_1^2+1)+4\varphi_h\lambda_1}{(\lambda_1^2-1)^3} = \frac{(\lambda_1^2+1)\gamma_h(0)+4\lambda_1\gamma_h(1)}{(\lambda_1^2-1)^3}.
\]
Then $L_n\xrightarrow{d}L\sim N(0,\Sigma_7)$. Notice the covariance matrix is the same as the covariance matrix of $J'$.

In conclusion,
\[
(B^\top)^{(n-2)}\begin{pmatrix}\hat\alpha_h-2\lambda_1\\ \hat\beta_h+\lambda_1^2\end{pmatrix}
\xrightarrow{d}\begin{pmatrix}1&0\\-\lambda_1&1\end{pmatrix}\begin{pmatrix}J'_1&J'_2\\0&J'_1\end{pmatrix}^{-1}\begin{pmatrix}\frac{\lambda_1^2}{\lambda_1^2-1}&-\frac{\lambda_1}{(\lambda_1^2-1)^2}\\-\frac{\lambda_1}{(\lambda_1^2-1)^2}&\frac{\lambda_1^2+1}{(\lambda_1^2-1)^3}\end{pmatrix}^{-1}L,
\]
where $J'\sim N(0,\Sigma)$ and $L\sim N(0,\Sigma)$ are independent bivariate normals with the same covariance matrix, $\Sigma=\Sigma_6$ as defined in Lemma 7.1.
Chapter 8
Mixed Roots

This chapter considers the cases of mixed roots. It covers three cases in category I, the category with $\frac{\theta_1^2}{4}+\theta_2>0$, in which both roots of the characteristic polynomial of the model (1.8) are real and positive:

• I-1: $\theta_1>0$, $\theta_2=0$, one root at 1, the other less than 1;
• I-2: $\theta_1<0$, $\theta_2=0$, one root at 1, the other greater than 1;
• I-5: $\forall\theta_1$, $\theta_2>0$, one root less than 1, the other greater than 1.

Let $\lambda_1,\lambda_2$ be the roots of $\lambda^2-\alpha_h\lambda-\beta_h=0$. The idea is to write $(1-\alpha_hL-\beta_hL^2)=(1-\lambda_1L)(1-\lambda_2L)$ and define
\[
Y_k = \phi(L)(1-\lambda_1L)^{-1}X_k = (1-\lambda_2L)X_k = X_k-\lambda_2X_{k-1}, \tag{8.1}
\]
\[
Z_k = \phi(L)(1-\lambda_2L)^{-1}X_k = (1-\lambda_1L)X_k = X_k-\lambda_1X_{k-1}. \tag{8.2}
\]
Then the following holds:
\[
\begin{pmatrix}1&-\lambda_2\\1&-\lambda_1\end{pmatrix}\begin{pmatrix}X_k\\X_{k-1}\end{pmatrix} = \begin{pmatrix}Y_k\\Z_k\end{pmatrix},
\]
and $(1-\lambda_1L)Y_k=\xi_k$, $(1-\lambda_2L)Z_k=\xi_k$: $Y_k$ and $Z_k$ satisfy two ARMA(1,1) models. Since ARMA(2,1) models were discussed in the previous chapters, the results for the two ARMA(1,1) models can be attained similarly, using the lemmas proved earlier. Instead of considering (1.13),
\[
\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix} = \Big(\sum_{i=1}^{n-1}\mathbf X_i\mathbf X_i^\top\Big)^{-1}\Big(\sum_{i=1}^{n-1}\mathbf X_{i-1}\xi_i\Big),
\]
consider the modified terms
\[
\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
= \Bigg[\sum_{i=1}^{n-1}\begin{pmatrix}1&-\lambda_2\\1&-\lambda_1\end{pmatrix}\mathbf X_i\mathbf X_i^\top\begin{pmatrix}1&-\lambda_2\\1&-\lambda_1\end{pmatrix}^{\!\top}\Bigg]^{-1}\sum_{i=1}^{n-1}\begin{pmatrix}1&-\lambda_2\\1&-\lambda_1\end{pmatrix}\mathbf X_{i-1}\xi_i
= \Bigg[\sum_{i=1}^{n-1}\begin{pmatrix}Y_i\\Z_i\end{pmatrix}\begin{pmatrix}Y_i&Z_i\end{pmatrix}\Bigg]^{-1}\sum_{i=1}^{n-1}\begin{pmatrix}Y_{i-1}\\Z_{i-1}\end{pmatrix}\xi_i.
\]
The limiting distributions of the two bracketed terms above will be discussed in this chapter. The results of this chapter are combinations of results from previous chapters, and the important part is to prove that the cross-product terms go to 0. A full proof for the first case is given in Lemma 8.1; for the other two cases, only small modifications of Lemma 8.1 are needed to prove the corresponding results.
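To make the decomposition concrete, here is a small sketch (not from the thesis; $\lambda_1,\lambda_2$ and the noise parameter are illustrative choices) verifying that $Y_k=X_k-\lambda_2X_{k-1}$ and $Z_k=X_k-\lambda_1X_{k-1}$ each satisfy a first-order recursion driven by the same $\xi_k$:

```python
# Check (1 - lambda_1 L)Y_k = xi_k and (1 - lambda_2 L)Z_k = xi_k for
# X_k = alpha_h X_{k-1} + beta_h X_{k-2} + xi_k with
# alpha_h = lambda_1 + lambda_2, beta_h = -lambda_1 lambda_2.
import numpy as np

rng = np.random.default_rng(4)
lam1, lam2 = 1.0, 0.8                       # e.g. case I-1: one unit root
alpha_h, beta_h = lam1 + lam2, -lam1 * lam2

n = 1000
eta = rng.standard_normal(n + 1)
xi = eta[1:] + 0.3 * eta[:-1]               # MA(1) noise, phi_h = 0.3 assumed
X = np.zeros(n)
for k in range(2, n):
    X[k] = alpha_h * X[k - 1] + beta_h * X[k - 2] + xi[k]

Y = X[1:] - lam2 * X[:-1]
Z = X[1:] - lam1 * X[:-1]
print(np.allclose(Y[2:] - lam1 * Y[1:-1], xi[3:]))   # True
print(np.allclose(Z[2:] - lam2 * Z[1:-1], xi[3:]))   # True
```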
8.1 $\lambda_1=1$, $\lambda_2<1$

This section deals with the case where one root of the characteristic polynomial $\phi(z)=1-\alpha_hz-\beta_hz^2$ is 1 and the other is greater than 1. For the $Y_k$ part, define
\[
Y_k(0) = (1-L)Y_k = \xi_k,\qquad Y_k(1) = Y_k = \sum_{i=1}^kY_i(0). \tag{8.3}
\]
Then, from Lemma 5.1,
\[
\frac{1}{\sqrt n}Y_{\lfloor nt\rfloor}(1) = \frac{1}{\sqrt n}\sum_{k=0}^{\lfloor nt\rfloor}Y_k(0) = \frac{1}{\sqrt n}\sum_{k=0}^{\lfloor nt\rfloor}\xi_k\xrightarrow{d}\sigma_h(1+\varphi_h)W(t). \tag{8.4}
\]
Using Lemma 5.2, it holds that
\[
n^{-2}\sum_{k=1}^{n-1}Y_k^2(1) = \frac1n\sum_{k=1}^{n-1}\big(Y_k(1)/\sqrt n\big)^2\xrightarrow{d}\sigma_h^2(1+\varphi_h)^2\int_0^1W^2(s)\,ds. \tag{8.5}
\]
For the $n^{-1}\sum_{t=1}^{n-1}Y_{t-1}\xi_t$ part, use (8.3) and separate out a martingale difference sequence:
\[
n^{-1}\sum_{t=1}^{n-1}Y_{t-1}\xi_t = n^{-1}\sum_{t=1}^{n-1}\Big(\sum_{k=1}^{t-1}\xi_k\Big)\xi_t
= n^{-1}\sum_{t=1}^{n-1}\Big(\sum_{k=1}^{t-1}\xi_k\Big)(\eta_t+\varphi_h\eta_{t-1})
\]
\[
= n^{-1}\sum_{t=1}^{n-1}\Big(\sum_{k=1}^{t-1}\xi_k\Big)\eta_t+n^{-1}\sum_{t=1}^{n-1}\Big(\sum_{k=1}^{t-2}\xi_k\Big)\varphi_h\eta_{t-1}+n^{-1}\sum_{t=1}^{n-1}\xi_{t-1}\varphi_h\eta_{t-1}. \tag{8.6}
\]
For the first two terms, by Theorem 5.5 it can be checked that together they converge to $\sigma_h^2(1+\varphi_h)^2\int_0^1W(s)\,dW(s)$ in distribution. For the third term, by the ergodic theorem,
\[
n^{-1}\sum_{t=1}^{n-1}\xi_{t-1}\varphi_h\eta_{t-1} = n^{-1}\sum_{t=1}^{n-1}(\eta_{t-1}+\varphi_h\eta_{t-2})\varphi_h\eta_{t-1}\xrightarrow{a.s.}\sigma_h^2\varphi_h.
\]
Combining (8.5) and (8.6) gives
\[
n\Big(\sum_{t=1}^{n-1}Y_t^2\Big)^{-1}\Big(\sum_{t=1}^{n-1}Y_{t-1}\xi_t\Big)
\xrightarrow{d}\frac{\int_0^1W(s)\,dW(s)+\gamma_h(1)/\big(\gamma_h(0)+2\gamma_h(1)\big)}{\int_0^1W^2(s)\,ds}.
\]
For the ergodic part, by definition $(1-\lambda_2L)Z_t=\xi_t$, and the quantity to be studied, $\big(\frac1n\sum_{t=1}^{n-1}Z_t^2\big)^{-1}\big(\frac{1}{\sqrt n}\sum_{t=1}^{n-1}Z_{t-1}\xi_t\big)$, is comparable to what was done in Chapter 3. The results are
\[
\frac1n\sum_{t=1}^{n-1}Z_t^2\xrightarrow{a.s.}\frac{\gamma_h(0)+2\lambda_2\gamma_h(1)}{1-\lambda_2^2},\qquad
\frac1n\sum_{t=1}^{n-1}Z_tZ_{t-1}\xrightarrow{a.s.}\frac{\lambda_2\gamma_h(0)+(1+\lambda_2^2)\gamma_h(1)}{1-\lambda_2^2}. \tag{8.7}
\]
For the $\frac{1}{\sqrt n}\sum_{t=1}^{n-1}Z_{t-1}\xi_t$ part, replace $\xi_t$ by $\eta_t+\varphi_h\eta_{t-1}$ to form a martingale difference sequence:
\[
\sum_{t=1}^{n-1}Z_{t-1}\xi_t = \lambda_2\sum_{t=1}^{n-1}Z_{t-2}\xi_t+\sum_{t=1}^{n-1}\xi_{t-1}\xi_t
= \lambda_2\sum_{t=1}^{n-2}(\varphi_hZ_{t-1}+Z_{t-2})\eta_t+Z_{n-3}\eta_{n-1}+\sum_{t=1}^{n-1}\xi_{t-1}\xi_t.
\]
Following the same proof as in Theorem 3.4, the result is
\[
\frac{1}{\sqrt n}\Big(\sum_{t=1}^{n-1}Z_{t-1}\xi_t-n\gamma_h(1)\Big)\xrightarrow{d}\lambda_2N\big(0,\gamma_h(0)\gamma(0)+2\gamma_h(1)\gamma(1)\big).
\]
Another issue is the cross-product term, and the following lemma proves that it vanishes as $n$ goes to infinity. The idea comes from Lemma 3.4.3 in Chan and Wei (1988).

LEMMA 8.1. The cross product $n^{-\frac32}\sum_{t=1}^{n-1}Y_tZ_t$ goes to 0.

Proof. First, a related result is proved: $\mathbb E\big|n^{-\frac32}\sum_{t=1}^nY_t\xi_{t-i}\big|\to0$ for any fixed $i$. By definition $Y_t=Y_{t-1}+\xi_t=\dots=\sum_{j=0}^t\xi_{t-j}$; thus
\[
\mathbb E\Big|n^{-\frac32}\sum_{t=1}^nY_t\xi_{t-i}\Big|
= \mathbb E\Big|n^{-\frac32}\sum_{t=1}^n\Big(Y_{t-i-1}+\sum_{j=0}^i\xi_{t-j}\Big)\xi_{t-i}\Big|
\le \mathbb E\Big|n^{-\frac32}\sum_{t=1}^nY_{t-i-1}\xi_{t-i}\Big|+\mathbb E\Big|n^{-\frac32}\sum_{t=1}^n\sum_{j=0}^i\xi_{t-j}\xi_{t-i}\Big|
\]
\[
\le n^{-\frac32}\sigma_h^2(1+\varphi_h^2)\sqrt{\sum_{t=1}^nY_{t-i-1}^2}+n^{-\frac32}\sqrt{\sum_{j=0}^i\Big(\sum_{t=1}^n\xi_{t-j}^2\Big)\Big(\sum_{t=1}^n\xi_{t-i}^2\Big)}.
\]
The first part goes to 0 by (8.5), and the second part also goes to 0 for any fixed $i$. With this proved, return to $n^{-\frac32}\sum_{t=1}^{n-1}Y_tZ_t$. The definition of $Z_t$ gives $Z_t=\lambda_2^kZ_{t-k}+\sum_{j=0}^{k-1}\lambda_2^j\xi_{t-j}$. Thus
\[
\mathbb E\Big|n^{-\frac32}\sum_{t=1}^{n-1}Y_tZ_t\Big|
= \mathbb E\Big|n^{-\frac32}\sum_{t=1}^{n-1}Y_t\Big(\lambda_2^kZ_{t-k}+\sum_{j=0}^{k-1}\lambda_2^j\xi_{t-j}\Big)\Big|
\le \mathbb E\Big|n^{-\frac32}\lambda_2^k\sum_{t=1}^{n-1}Y_tZ_{t-k}\Big|+\mathbb E\Big|n^{-\frac32}\sum_{t=1}^{n-1}Y_t\sum_{j=0}^{k-1}\lambda_2^j\xi_{t-j}\Big|
\]
\[
\le \lambda_2^kn^{-\frac32}\sqrt{\Big(\sum_{t=1}^{n-1}Y_t^2\Big)\Big(\sum_{t=1}^{n-1}Z_{t-k}^2\Big)}+n^{-\frac32}\sum_{j=0}^{k-1}|\lambda_2|^j\,\mathbb E\Big|\sum_{t=1}^{n-1}Y_t\xi_{t-j}\Big|.
\]
First fix $k$ and let $n$ go to infinity: the second part goes to 0. Then let $k$ go to infinity: since $|\lambda_2|<1$, the first part also goes to 0. Thus the lemma is proved.

THEOREM 8.2. For the estimators $(\hat\alpha_h,\hat\beta_h)$, the limiting distribution of the estimation errors is
\[
\begin{pmatrix}n&0\\0&n^{1/2}\end{pmatrix}\Bigg[\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}-\begin{pmatrix}0\\ \frac{(1-\lambda_2^2)\gamma_h(1)}{\gamma_h(0)+2\lambda_2\gamma_h(1)}\end{pmatrix}\Bigg]
\xrightarrow{d}\begin{pmatrix}\dfrac{\int_0^1W(s)\,dW(s)+\frac{\gamma_h(1)}{\gamma_h(0)+2\gamma_h(1)}}{\int_0^1W^2(s)\,ds}\\[10pt]\lambda_2N(0,\Sigma_8)\end{pmatrix},
\]
where $\Sigma_8=\gamma_h(0)\gamma(0)+2\gamma_h(1)\gamma(1)$.

8.2 $\lambda_1=1$, $\lambda_2>1$

For the case where one root of the characteristic polynomial is on the unit circle and the other is inside it, only the second part of the proof in Section 8.1 needs to be modified. Define $J_n$ similarly to (6.1):
\[
J_n = \sum_{i=1}^n\lambda_2^{-(i-1)}\xi_i = \sum_{i=1}^n\lambda_2^{-(i-1)}(\eta_i+\varphi_h\eta_{i-1})
= \sum_{i=1}^{n-1}(\lambda_2+\varphi_h)\lambda_2^{-i}\eta_i+\varphi_h\eta_0+\lambda_2^{-(n-1)}\eta_n.
\]
Then the limit of $J_n$ is normal with mean 0 and variance
\[
\sigma_h^2\Big[\sum_{i=1}^{n-1}(\lambda_2+\varphi_h)^2\lambda_2^{-2i}+\varphi_h^2+\lambda_2^{-2(n-1)}\Big]
\to\sigma_h^2\frac{(1+\varphi_h^2)\lambda_2^2+2\varphi_h\lambda_2}{\lambda_2^2-1}
= \frac{\gamma_h(0)\lambda_2^2+2\gamma_h(1)\lambda_2}{\lambda_2^2-1},
\]
so $J_n\xrightarrow{d}J$, $J\sim N\big(0,\frac{\gamma_h(0)\lambda_2^2+2\gamma_h(1)\lambda_2}{\lambda_2^2-1}\big)$. As discussed in the explosive case in Chapter 7,
\[
\lambda_2^{-2n-2}\sum_{t=1}^{n-1}Z_t^2\xrightarrow{d}J\frac{\lambda_2^2}{\lambda_2^2-1}J,
\]
and $\lambda_2^{-n-2}\sum_{t=1}^{n-1}Z_{t-1}\xi_t\to LJ$, in which $L$ is the limit of $L_n$, defined similarly to Theorem 6.4:
\[
L_n := \xi_n+\xi_{n-1}\lambda_2^{-1}+\dots+\xi_1\lambda_2^{-(n-1)}
= \sum_{i=1}^{n-1}(1+\varphi_h\lambda_2)\lambda_2^{-(n-i)}\eta_i+\eta_n+\varphi_h\lambda_2^{-(n-1)}\eta_0.
\]
The limit is $L\sim N\big(0,\frac{\gamma_h(0)\lambda_2^2+2\gamma_h(1)\lambda_2}{\lambda_2^2-1}\big)$, and it is independent of $J_n$ by a proof similar to Theorem 6.5. In conclusion,
\[
\lambda_2^{n-2}\Big(\sum_{t=1}^{n-1}Z_t^2\Big)^{-1}\Big(\sum_{t=1}^{n-1}Z_{t-1}\xi_t\Big)\xrightarrow{d}\frac{\lambda_2^2-1}{\lambda_2^2}\,\frac LJ.
\]
As in Section 8.1, it also needs to be proved that the cross-product term goes to 0.

LEMMA 8.3. The cross-product term $\lambda_2^{-n}n^{-\frac12}\sum_{t=1}^nY_tZ_t$ converges to 0 in probability.

Proof. The proof is similar to Lemma 8.1, except that instead of substituting backward via $Z_t=\lambda_2^kZ_{t-k}+\sum_{j=0}^{k-1}\lambda_2^j\xi_{t-j}$, one substitutes forward, since now $|\lambda_2|>1$: replace $Z_t$ by $\lambda_2^{-k}Z_{t+k}-\sum_{j=1}^k\lambda_2^{-j}\xi_{t+j}$, and the rest of the proof proceeds as in Lemma 8.1.

THEOREM 8.4.
\[
\begin{pmatrix}n&0\\0&\lambda_2^n\end{pmatrix}\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
\xrightarrow{d}\begin{pmatrix}\dfrac{\int_0^1W(s)\,dW(s)+\frac{\gamma_h(1)}{\gamma_h(0)+2\gamma_h(1)}}{\int_0^1W^2(s)\,ds}\\[10pt]\dfrac{\lambda_2^2-1}{\lambda_2^2}\dfrac LJ\end{pmatrix}, \tag{8.8}
\]
in which $L$ and $J$ are the independent normal limits defined above; since they share the same variance, the ratio $L/J$ is the same as a ratio of independent standard normals.

8.3 $\lambda_1<1$, $\lambda_2>1$

In this case one root of the characteristic polynomial is less than 1 and the other is greater than 1. The result is a combination of the results of the previous two sections.

THEOREM 8.5.
\[
\begin{pmatrix}n^{1/2}&0\\0&\lambda_2^n\end{pmatrix}\Bigg[\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}-\begin{pmatrix}\frac{(1-\lambda_1^2)\gamma_h(1)}{\gamma_h(0)+2\lambda_1\gamma_h(1)}\\0\end{pmatrix}\Bigg]
\xrightarrow{d}\begin{pmatrix}\lambda_1N(0,\Sigma_9)\\[4pt]\dfrac{\lambda_2^2-1}{\lambda_2^2}\dfrac LJ\end{pmatrix},
\]
where $\Sigma_9=\gamma_h(0)\gamma(0)+2\gamma_h(1)\gamma(1)=\Sigma_8$.

LEMMA 8.6. The cross-product term goes to 0.

Proof. The idea is the same as in Lemma 8.1. For the explosive part, substitute $Z_t=\lambda_2^{-k}Z_{t+k}-\sum_{j=1}^k\lambda_2^{-j}\xi_{t+j}$ and mimic the proof of Lemma 8.1.
Chapter 9
Comparison to Continuous Time Model

This thesis studies the statistical inference problem of a second-order stochastic ODE with discrete observations. The same underlying equation with continuous observations is studied by Lin and Lototsky (2014). The results of this thesis are compared to the corresponding results from Lin and Lototsky (2014), and they are indeed consistent. This is reasonable, since the model considered in this thesis is the exact discretization of the CAR(2) model. In Lin and Lototsky (2014), the stochastic ordinary differential equation
\[
\ddot X(t) = \theta_1\dot X(t)+\theta_2X(t)+\sigma\dot W(t),\qquad t>0, \tag{9.1}
\]
is studied. Set
\[
\mathbf X(t) = \begin{pmatrix}X(t)\\ \dot X(t)\end{pmatrix},\qquad \Theta = \begin{pmatrix}0&1\\ \theta_2&\theta_1\end{pmatrix},\qquad \boldsymbol\sigma = \begin{pmatrix}0\\ \sigma\end{pmatrix};
\]
then it is a two-dimensional Ornstein-Uhlenbeck process:
\[
d\mathbf X(t) = \Theta\mathbf X(t)\,dt+\boldsymbol\sigma\,dW(t). \tag{9.2}
\]
The maximum likelihood estimator of $\Theta$ is studied in Basak and Lee (2008) using the continuous-time observations $\mathbf X$:
\[
\hat\Theta_T = \Big(\int_0^Td\mathbf X(t)\,\mathbf X^\top(t)\Big)\Big(\int_0^T\mathbf X(t)\mathbf X^\top(t)\,dt\Big)^{-1}, \tag{9.3}
\]
and the estimation errors are
\[
\hat\Theta_T-\Theta = \Big(\int_0^T\boldsymbol\sigma\,\mathbf X^\top(t)\,dW(t)\Big)\Big(\int_0^T\mathbf X(t)\mathbf X^\top(t)\,dt\Big)^{-1}. \tag{9.4}
\]
Let $p,q$ be the roots of $x^2-\theta_1x-\theta_2=0$. For the discretized model considered in this thesis, let $\lambda_{1,2}$ be the roots of $\lambda^2-\alpha_h\lambda-\beta_h=0$ and $z_{1,2}$ the roots of the characteristic polynomial $\phi(z)=1-\alpha_hz-\beta_hz^2$. Depending on the values of $p,q$, the CAR(2) model is categorized into nine cases, just as in this thesis. In the following sections, the results of this thesis are compared to the corresponding results of the CAR(2) model from Lin and Lototsky (2014); the exact discretization preserves the characteristics of the process, which is why the limiting distributions of the estimation errors come out similar in the discrete and continuous-time cases. The results are grouped into four categories according to the values of $p$ and $q$: the ergodic case, distinct real roots, double roots, and conjugate roots. All cases are compared in the following sections. In these results, $\eta_1,\eta_2$ stand for independent normal random variables, and $W_1(t),W_2(t)$ are two independent Brownian motions.

9.1 Ergodic Case

In the CAR(2) model, this is the case $\theta_1<0$, $\theta_2<0$, and it corresponds to three cases in this thesis:

• I-4: $\theta_1<0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2>0$, distinct roots both greater than 1;
• II-1: $\theta_1<0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2=0$, double roots both greater than 1;
• III-1: $\theta_1<0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2<0$, conjugate roots outside the unit circle.

In all cases, $\beta_h=-e^{\theta_1h}$; and for I-4, $\alpha_h=2\cosh(rh)e^{\theta_1h/2}$; for II-1, $\alpha_h=2e^{\theta_1h/2}$; for III-1, $\alpha_h=2\cos(rh)e^{\theta_1h/2}$.
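These discretization formulas are easy to tabulate. The following sketch (an illustration, not part of the thesis) maps $(\theta_1,\theta_2,h)$ to $(\alpha_h,\beta_h)$ across the three sub-cases, with $r=\sqrt{|\theta_2+\theta_1^2/4|}$ as in Chapter 4:

```python
# Map CAR(2) parameters (theta_1, theta_2) and step h to the coefficients
# (alpha_h, beta_h) of the exact discretization; a sketch following the
# case formulas above.
import math

def discretize(theta1: float, theta2: float, h: float) -> tuple[float, float]:
    d = theta1**2 / 4 + theta2
    r = math.sqrt(abs(d))
    if d > 0:                        # distinct real roots (e.g. I-4)
        alpha = 2 * math.cosh(r * h) * math.exp(theta1 * h / 2)
    elif d == 0:                     # double roots (e.g. II-1)
        alpha = 2 * math.exp(theta1 * h / 2)
    else:                            # conjugate roots (e.g. III-1)
        alpha = 2 * math.cos(r * h) * math.exp(theta1 * h / 2)
    return alpha, -math.exp(theta1 * h)

print(discretize(-1.0, -0.1, 0.01))   # ergodic, distinct real roots
print(discretize(-1.0, -0.25, 0.01))  # ergodic, double roots
print(discretize(-1.0, -2.0, 0.01))   # ergodic, conjugate roots
```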
In the CAR(2) model, the result is
\[
\lim_{T\to\infty}\sqrt{|\theta_1|T}\,(\hat\theta_{1,T}-\theta_1)\overset{d}{=}\sqrt{2|\theta_1|}\,\eta_1,\qquad
\lim_{T\to\infty}\sqrt{|\theta_1|T}\,(\hat\theta_{2,T}-\theta_2)\overset{d}{=}\sqrt{2|\theta_2||\theta_1|}\,\eta_2,
\]
where $\eta_1,\eta_2$ are iid standard normal random variables. In discrete time,
\[
\sqrt n\Bigg[\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}-H_1^{-1}\begin{pmatrix}\gamma_h(1)\\0\end{pmatrix}\Bigg]\xrightarrow{d}H_1^{-1}N(0,\Sigma_1),
\]
where $\gamma_h(1)$ is the autocovariance of the noise at lag 1, and the matrices $H_1$ and $\Sigma_1$ are defined in Chapter 3. The two results are comparable since the limiting distributions are both normal. But unlike the CAR(2) result, which is rather simple and elegant, the result for the discrete model is rather complex, especially the expression for $H_1^{-1}$, making it hard to analyze how the result depends on $\alpha_h$ and $\beta_h$.

9.2 Distinct Real Roots

This section shows that when $p,q$ are distinct and real, the roots of the characteristic polynomial of the discretized model are also distinct and real. The four cases in CAR(2) correspond to the four non-ergodic cases in category I. Throughout this section, $\alpha_h=2\cosh(rh)e^{\theta_1h/2}$ and $\beta_h=-e^{\theta_1h}$.

1. The case $p>0$, $q<0$ in the CAR(2) model corresponds to the following case of the thesis:

I-5: $\forall\theta_1$, $\theta_2>0$, $\frac{\theta_1^2}{4}+\theta_2>0$, one root less than 1, the other greater than 1.

In CAR(2), the result is
\[
\lim_{T\to\infty}\sqrt{|q|T}\,(\hat\theta_{1,T}-\theta_1)\overset{d}{=}-\frac1p\lim_{T\to\infty}\sqrt{|q|T}\,(\hat\theta_{2,T}-\theta_2)\overset{d}{=}\sqrt{2|q|}\,\eta.
\]
For the discretized model, $\lambda_1=e^{(\theta_1/2-r)h}<1$, $\lambda_2=e^{(\theta_1/2+r)h}>1$, and the limiting distribution is
\[
\begin{pmatrix}n^{1/2}&0\\0&\lambda_2^n\end{pmatrix}\Bigg[\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}-\begin{pmatrix}\frac{(1-\lambda_1^2)\gamma_h(1)}{\gamma_h(0)+2\lambda_1\gamma_h(1)}\\0\end{pmatrix}\Bigg]
\xrightarrow{d}\begin{pmatrix}\lambda_1N\big(0,\gamma_h(0)\gamma(0)+2\gamma_h(1)\gamma(1)\big)\\[4pt]\dfrac{\lambda_2^2-1}{\lambda_2^2}\dfrac{\eta_1}{\eta_2}\end{pmatrix}.
\]
This is an interesting case: for the CAR(2) model, the limiting distributions of both estimation errors are normal, while in the discretized model the combination $\lambda_2(\hat\alpha_h-\alpha_h)+(\hat\beta_h-\beta_h)$ converges to a Cauchy-type distribution at an exponential rate. One possible reason is that $\beta_h=-e^{\theta_1h}$: although $\hat\theta_{1,T}-\theta_1$ converges to a normal limit at rate $\sqrt T$, the exponential function in $\beta_h$ transforms the limiting distribution in discrete time into a Cauchy type.

2. The case $p>q>0$ in CAR(2) corresponds to

I-3: $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2>0$, both roots less than 1.

In CAR(2), the result is
\[
\lim_{T\to\infty}e^{qT}(\hat\theta_{1,T}-\theta_1)\overset{d}{=}-\frac1p\lim_{T\to\infty}e^{qT}(\hat\theta_{2,T}-\theta_2)\overset{d}{=}\frac{2(p+q)q}{p-q}\,\frac{\eta}{\xi+c},
\]
while for the discretized case $\lambda_{1,2}=e^{(\theta_1/2\pm r)h}>1$, and the result is
\[
(B^\top)^n\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}\xrightarrow{d}K^\top\tilde J^{-1}\Gamma^{-1}L,
\]
where
\[
B = \begin{pmatrix}\alpha_h&\beta_h\\1&0\end{pmatrix},\qquad K = \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}1&-\lambda_2\\-1&\lambda_1\end{pmatrix},\qquad \tilde J = \begin{pmatrix}J'_1&0\\0&J'_2\end{pmatrix},
\]
and $J'\sim N\big(0,\frac{1}{(\lambda_1-\lambda_2)^2}\Sigma_4\big)$, $L\sim N(0,\Sigma_4)$ are independent bivariate normals, with $\Sigma_4$ as in Chapter 6. The two results are comparable, since they both have exponential convergence rates and both converge to Cauchy-type distributions.

3. The case $p=0$, $q<0$ in CAR(2) corresponds to

I-2: $\theta_1<0$, $\theta_2=0$, $\frac{\theta_1^2}{4}+\theta_2>0$, one root at 1, the other greater than 1.

For the CAR(2) model, the result is
\[
\lim_{T\to\infty}\sqrt{|q|T}\,(\hat\theta_{1,T}-\theta_1)\overset{d}{=}\sqrt{2|q|}\,\eta,\qquad
\lim_{T\to\infty}T(\hat\theta_{2,T}-\theta_2)\overset{d}{=}|q|\,\frac{W^2(1)-1}{2\int_0^1W^2(s)\,ds},
\]
while in the discretized model $\lambda_1=1$, $\lambda_2=e^{\theta_1h}<1$, and the result is
\[
\begin{pmatrix}n&0\\0&n^{1/2}\end{pmatrix}\Bigg[\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}-\begin{pmatrix}0\\ \frac{(1-\lambda_2^2)\gamma_h(1)}{\gamma_h(0)+2\lambda_2\gamma_h(1)}\end{pmatrix}\Bigg]
\xrightarrow{d}\begin{pmatrix}\dfrac{\int_0^1W(s)\,dW(s)+\frac{\gamma_h(1)}{\gamma_h(0)+2\gamma_h(1)}}{\int_0^1W^2(s)\,ds}\\[10pt]\lambda_2N\big(0,\gamma_h(0)\gamma(0)+2\gamma_h(1)\gamma(1)\big)\end{pmatrix}.
\]
In CAR(2), at rate $\sqrt T$ the estimation error converges to a normal limit, comparable to the discretized model, where at rate $\sqrt n$ the estimation error converges to a normal limit. For the other component, the rates $T$ and $n$ are comparable, and the limiting distributions are both functionals of Brownian motion. Notice that in the discretized model, if the autocorrelation of the noise is 0, then the $\gamma_h(1)/\gamma_h(0)$ terms disappear.

4. The continuous-time case $p>0$, $q=0$ corresponds to

I-1: $\theta_1>0$, $\theta_2=0$, $\frac{\theta_1^2}{4}+\theta_2>0$, one root at 1, the other less than 1.

The result for the CAR(2) model is
\[
\lim_{T\to\infty}\theta_1T(\hat\theta_{1,T}-\theta_1)\overset{d}{=}-\lim_{T\to\infty}T(\hat\theta_{2,T}-\theta_2)\overset{d}{=}\theta_1\,\frac{W^2(1)-1}{2\int_0^1W^2(s)\,ds}.
\]
In the discretized model, $\lambda_1=1$, $\lambda_2=e^{\theta_1h}>1$, and the result is
\[
\begin{pmatrix}n&0\\0&\lambda_2^n\end{pmatrix}\begin{pmatrix}1&1\\-\lambda_2&-\lambda_1\end{pmatrix}^{-1}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
\xrightarrow{d}\begin{pmatrix}\dfrac{\int_0^1W(s)\,dW(s)+\frac{\gamma_h(1)}{\gamma_h(0)+2\gamma_h(1)}}{\int_0^1W^2(s)\,ds}\\[10pt]\dfrac{\lambda_2^2-1}{\lambda_2^2}\dfrac{\eta_1}{\eta_2}\end{pmatrix}. \tag{9.5}
\]
Notice the difference: for the CAR(2) model both estimation errors converge to functionals of Brownian motion, while for the discretized model one converges to a Brownian functional and the other to a Cauchy-type distribution. The reason may be the same as in case 1.

9.3 Double Roots

This section considers the case where the roots of the CAR(2) model are double roots. The two cases correspond to category II, in which the characteristic polynomial also has double roots. Throughout this section, $\alpha_h=2e^{\theta_1h/2}$ and $\beta_h=-e^{\theta_1h}$.

1. The case $p=q>0$ in CAR(2) corresponds to

II-2: $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2=0$, double roots less than 1.

The CAR(2) model's result is
\[
\lim_{T\to\infty}qTe^{qT}(\hat\theta_{1,T}-\theta_1)\overset{d}{=}-\frac1q\lim_{T\to\infty}qTe^{qT}(\hat\theta_{2,T}-\theta_2)\overset{d}{=}4\sqrt2q\,\frac{\eta}{\xi+c},
\]
where $c=\sqrt{2q}\big(\dot X(0)-pX(0)\big)/\sigma$. In the discretized model, $\lambda_1=\lambda_2=e^{\theta_1h/2}>1$, and the result is
\[
(B^\top)^{(n-2)}\begin{pmatrix}\hat\alpha_h-2\lambda_1\\ \hat\beta_h+\lambda_1^2\end{pmatrix}
\xrightarrow{d}\begin{pmatrix}1&0\\-\lambda_1&1\end{pmatrix}\begin{pmatrix}J'_1&J'_2\\0&J'_1\end{pmatrix}^{-1}\begin{pmatrix}\frac{\lambda_1^2}{\lambda_1^2-1}&-\frac{\lambda_1}{(\lambda_1^2-1)^2}\\-\frac{\lambda_1}{(\lambda_1^2-1)^2}&\frac{\lambda_1^2+1}{(\lambda_1^2-1)^3}\end{pmatrix}^{-1}L,
\]
where $J'\sim N(0,\Sigma)$ and $L\sim N(0,\Sigma)$ are independent bivariate normals with the same covariance matrix $\Sigma=\Sigma_6$, as defined in Lemma 7.1. The results are comparable since the limiting distributions are all of Cauchy type.

2. The continuous-time case $p=q=0$ corresponds to

II-3: $\theta_1=\theta_2=0$, double roots at 1.

The result of the CAR(2) model is
\[
\lim_{T\to\infty}T\hat\theta_{1,T}\overset{d}{=}\frac{2f_3\big(W^2(1)-1\big)-2f_1^2\big(W(1)f_1-f_2\big)}{4f_2f_3-f_1^4},\qquad
\lim_{T\to\infty}T^2\hat\theta_{2,T}\overset{d}{=}\frac{4f_2\big(W(1)f_1-f_2\big)-f_1^2\big(W^2(1)-1\big)}{4f_2f_3-f_1^4},
\]
where $W=W(s)$, $0\le s\le1$, is a standard Brownian motion and
\[
f_1 = \int_0^1W(s)\,ds,\qquad f_2 = \int_0^1W^2(s)\,ds,\qquad f_3 = \int_0^1\Big(\int_0^sW(t)\,dt\Big)^2ds.
\]
In the discretized model $\lambda_{1,2}=1$, and the result is
\[
\begin{pmatrix}n^2&0\\0&n\end{pmatrix}\begin{pmatrix}1&1\\0&-1\end{pmatrix}\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
\xrightarrow{d}H_3^{-1}\begin{pmatrix}\int_0^1F_1(s)\,dW(s)\\ \int_0^1F_0(s)\,dW(s)+\frac{\gamma_h(1)}{\gamma_h(0)+2\gamma_h(1)}\end{pmatrix},
\]
where $H_3$ is defined in Theorem 5.4: $H_3=(H_3^{ij})_{i,j=1}^2$, $H_3^{ij}=\int_0^1F_{i-1}(s)F_{j-1}(s)\,ds$, with $F_0(t)=W(t)$ and $F_1(t)=\int_0^tF_0(s)\,ds=\int_0^tW(s)\,ds$. The results are comparable since in CAR(2) the convergence rates are $T^2$ and $T$, while for the discretized model the rates are $n^2$ and $n$.
9.4 Conjugate Roots

This section considers the cases where the roots of the CAR(2) model are conjugate; the corresponding discretized models also have conjugate roots. Throughout this section, $\alpha_h=2\cos(rh)e^{\theta_1h/2}$ and $\beta_h=-e^{\theta_1h}$.

1. The case $p=\sqrt{-1}\,\nu$, $\nu>0$, in the CAR(2) model corresponds to

III-2: $rh\ne m\pi$, $\theta_1=0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2<0$, conjugate roots on the unit circle, not equal to 1 or $-1$.

The result of the CAR(2) model is
\[
\lim_{T\to\infty}T\hat\theta_{1,T}\overset{d}{=}\frac{2-W_1^2(1)-W_2^2(1)}{\int_0^1W_1^2(t)\,dt+\int_0^1W_2^2(t)\,dt},\qquad
\lim_{T\to\infty}T(\hat\theta_{2,T}-\theta_2)\overset{d}{=}\frac{2\nu\big[\int_0^1W_1(t)\,dW_2(t)-\int_0^1W_2(t)\,dW_1(t)\big]}{\int_0^1W_1^2(t)\,dt+\int_0^1W_2^2(t)\,dt}.
\]
The result for the discretized model is:

(a) For the least squares estimator
\[
\begin{pmatrix}\hat\alpha_h\\ \hat\beta_h\end{pmatrix} = \Big(\sum_{k=1}^n\mathbf X_k\mathbf X_k^\top\Big)^{-1}\Big(\sum_{k=1}^n\mathbf X_{k-1}\cdot X_k\Big),
\]
the limiting distribution is
\[
\lim_{n\to\infty}n\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}
= \lim_{n\to\infty}\Big(n^{-2}\sum_{k=1}^n\mathbf X_k\mathbf X_k^\top\Big)^{-1}\Big(n^{-1}\sum_{k=1}^n\mathbf X_{k-1}\cdot\xi_k\Big)
\]
\[
= 2\Big(\int_0^1W_1^2(s)\,ds+\int_0^1W_2^2(s)\,ds\Big)^{-1}\begin{pmatrix}\sin(\phi_h)&\cos(\phi_h)\\0&-1\end{pmatrix}
\Bigg[\begin{pmatrix}\int_0^1W_1(s)\,dW_2(s)-\int_0^1W_2(s)\,dW_1(s)\\ \int_0^1W_1(s)\,dW_1(s)+\int_0^1W_2(s)\,dW_2(s)\end{pmatrix}+\frac{\varphi_h}{\sin(\phi_h)}\begin{pmatrix}1\\-\cos(\phi_h)\end{pmatrix}\Bigg].
\]

(b) For the LSE with lag 1, the estimators are
\[
\begin{pmatrix}\tilde\alpha_h\\ \tilde\beta_h\end{pmatrix} = \Big(\sum_{k=1}^n\mathbf X_{k-1}\mathbf X_k^\top\Big)^{-1}\Big(\sum_{k=1}^n\mathbf X_{k-2}\cdot X_k\Big),
\]
and the limiting distribution is
\[
\lim_{n\to\infty}n\begin{pmatrix}\tilde\alpha_h-\alpha_h\\ \tilde\beta_h-\beta_h\end{pmatrix}
= \lim_{n\to\infty}\Big(n^{-2}\sum_{k=1}^n\mathbf X_{k-1}\mathbf X_k^\top\Big)^{-1}\Big(n^{-1}\sum_{k=1}^n\mathbf X_{k-2}\cdot\xi_k\Big)
= 2\Big(\int_0^1W_1^2(s)\,ds+\int_0^1W_2^2(s)\,ds\Big)^{-1}\begin{pmatrix}\sin(\phi_h)&\cos(\phi_h)\\0&-1\end{pmatrix}
\begin{pmatrix}\int_0^1W_1(s)\,dW_2(s)-\int_0^1W_2(s)\,dW_1(s)\\ \int_0^1W_1(s)\,dW_1(s)+\int_0^1W_2(s)\,dW_2(s)\end{pmatrix}.
\]
The results are comparable since the limiting distributions are all functionals of two independent Brownian motions. Notice that the denominators in both settings are the same integral, and the same integrals of $W_1,W_2$ show up in both results.

2. The case $p,q=\mu\pm\sqrt{-1}\,\nu$, $\mu>0$, $\nu>0$, in CAR(2) corresponds to

III-3: $rh\ne m\pi$, $\theta_1>0$, $\theta_2<0$, $\frac{\theta_1^2}{4}+\theta_2<0$, conjugate roots inside the unit circle.

The result of the CAR(2) model is: the families $\{e^{\lambda T}(\hat\theta_{i,T}-\theta_i),\ T>0\}$ are relatively compact, and the limit distributions are of the form
\[
\frac{\xi_c\eta_c+\xi_s\eta_s}{\xi_c^2+\xi_s^2},
\]
where $(\xi_c,\xi_s)$ and $(\eta_c,\eta_s)$ are independent bivariate normal vectors, $\mathbb E\eta_c=\mathbb E\eta_s=0$, and the means of $\xi_c$ and $\xi_s$ depend on the initial condition $(X(0),\dot X(0))$. For the discretized model, the result is
\[
(B^\top)^n\begin{pmatrix}\hat\alpha_h-\alpha_h\\ \hat\beta_h-\beta_h\end{pmatrix}\xrightarrow{d}K^\top\tilde J^{-1}\Gamma^{-1}L,
\]
where
\[
K = \frac{1}{\lambda_1-\lambda_2}\begin{pmatrix}1&-\lambda_2\\-1&\lambda_1\end{pmatrix},\qquad \tilde J = \begin{pmatrix}J'_1&0\\0&J'_2\end{pmatrix},
\]
and $J'\sim N\big(0,\frac{1}{(\lambda_1-\lambda_2)^2}\Sigma_4\big)$, $L\sim N(0,\Sigma_4)$ are independent bivariate normals; $\Sigma_4$ is defined in Chapter 6.

To sum up, in seven out of the nine cases the discrete-time and CAR(2) results are comparable; namely, the convergence rates and limiting distributions are similar. For the other two cases, cases 1 and 4 in the category of distinct real roots, the results seem to contradict each other. One explanation is that $\beta_h=-e^{\theta_1h}$ modifies the character of $\theta_1$, pushing the result into the explosive category. This special case needs further research and is not discussed here.
Chapter 10
Summary

This thesis considers the statistical inference problem for the discretized CAR(2) model. For the CAR(2) model, namely the second-order stochastic ODE (9.1) with two unknown parameters, the MLE considered in Lin and Lototsky (2014) is strongly consistent in all cases. For the difference equation (1.8) considered in this thesis, when one or both roots of the characteristic polynomial lie outside the unit circle, the corresponding part of the estimator is not consistent. This is due to the correlation of the noises. Overall, the results of this thesis are all comparable with the corresponding cases of the CAR(2) model, which is reasonable since the model (1.8) is the exact discretization of the process, so the characteristics of the process are preserved. But, as discussed in Chapter 1, the MLE in the CAR(2) model assumes continuous observations of $\{X(t)\}_{t\ge0}$ and $\{\dot X(t)\}_{t\ge0}$, while the LSE considered in this thesis only needs discrete observations of the process, making statistical inference through the LSE more feasible.

Apart from the limiting distributions of the least squares estimators, double asymptotics of the estimators are discussed in Chapter 4. The double asymptotics serve as a bridge between the difference equation and the CAR(2) model. Although in this thesis the initial condition is set to $X(0)=\dot X(0)=0$, for general initial conditions the double asymptotics are expected to incorporate the initial conditions into the final result and to be more precise than the current result; for more details, see Wang and Yu (2016). This is a possible area of future research.

In conclusion, the main contributions of this thesis are:
(1) Transformed the statistical inference problem of CAR(2) into a time series problem, relaxing the assumption of continuous observations to discrete ones.
(2) Attained the limiting distributions for all the different cases.
(3) Compared the results to the corresponding CAR(2) model and verified the consistency between the MLE in CAR(2) and the LSE in the discrete model.
Chapter 11
Appendix

The notation used in this appendix applies only here and is not related to the main text. To get the solution of the stochastic ordinary differential equation
\[
\ddot X(t)-\theta_1\dot X(t)-\theta_2X(t) = \sigma\dot W(t),\qquad X(0)=\dot X(0)=0,
\]
first consider the solution of the associated homogeneous ODE:
\[
\ddot Y(t)-\theta_1\dot Y(t)-\theta_2Y(t) = 0.
\]
The appendix only considers the situation $\frac{\theta_1^2}{4}+\theta_2>0$; the other two situations can be handled in the same way. The solutions of the homogeneous ODE are $Y_1(t)=e^{pt}$ and $Y_2(t)=e^{qt}$, where $p,q$ are the solutions of $x^2-\theta_1x-\theta_2=0$. By variation of parameters, the solution will be $X(t)=u_1(t)Y_1(t)+u_2(t)Y_2(t)$, where $u_1(t),u_2(t)$ are determined by the system
\[
u_1'Y_1+u_2'Y_2 = 0,\qquad u_1'Y_1'+u_2'Y_2' = \sigma\dot W.
\]
That gives
\[
u_1(t) = u_1(0)+\int_0^t\frac{e^{-ps}\sigma\,dW(s)}{p-q},\qquad
u_2(t) = u_2(0)+\int_0^t\frac{e^{-qs}\sigma\,dW(s)}{q-p}.
\]
Then
\[
X(t) = \Big(u_1(0)+\int_0^t\frac{e^{-ps}\sigma\,dW(s)}{p-q}\Big)e^{pt}+\Big(u_2(0)+\int_0^t\frac{e^{-qs}\sigma\,dW(s)}{q-p}\Big)e^{qt}
= \big(u_1(0)e^{pt}+u_2(0)e^{qt}\big)+\int_0^t\frac{e^{p(t-s)}-e^{q(t-s)}}{p-q}\,\sigma\,dW(s).
\]
Plugging in $X(0)=\dot X(0)=0$ gives $u_1(0)=u_2(0)=0$, so the solution is
\[
X(t) = \int_0^t\frac{e^{p(t-s)}-e^{q(t-s)}}{p-q}\,\sigma\,dW(s).
\]
Notice that the two exponential functions are solutions of the homogeneous equation. So set
\[
g(t) = \frac{e^{pt}-e^{qt}}{p-q};
\]
then $\ddot g-\theta_1\dot g-\theta_2g=0$, with initial conditions $g(0)=0$, $\dot g(0)=1$. This proves that the solution of the stochastic ODE is
\[
X(t) = \sigma\int_0^tg(t-s)\,dW(s).
\]
For the other two situations, $\frac{\theta_1^2}{4}+\theta_2=0$ and $\frac{\theta_1^2}{4}+\theta_2<0$, a similar calculation leads to the same result.
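As an illustration of this representation (a sketch with arbitrary parameter values, not part of the thesis), one can evaluate $X(T)=\sigma\int_0^Tg(T-s)\,dW(s)$ by a Riemann sum over the Brownian increments and compare it with an Euler discretization of the system (9.2) driven by the same increments:

```python
# Simulate X(T) = sigma * int_0^T g(T-s) dW(s), g(t) = (e^{pt}-e^{qt})/(p-q),
# and compare with an Euler scheme for dX = V dt, dV = (th1 V + th2 X)dt + s dW.
import numpy as np

rng = np.random.default_rng(5)
theta1, theta2, sigma, T, N = -1.0, -0.1, 0.5, 10.0, 20_000
dt = T / N
r = np.sqrt(theta1**2 / 4 + theta2)          # case theta1^2/4 + theta2 > 0
p, q = theta1 / 2 + r, theta1 / 2 - r

dW = np.sqrt(dt) * rng.standard_normal(N)
t = np.arange(1, N + 1) * dt

def g(u):
    return (np.exp(p * u) - np.exp(q * u)) / (p - q)

# Convolution form at the final time (left-endpoint Riemann sum):
X_conv = sigma * np.sum(g(T - t + dt) * dW)

# Euler scheme with the same Brownian increments:
X = V = 0.0
for k in range(N):
    X, V = X + V * dt, V + (theta1 * V + theta2 * X) * dt + sigma * dW[k]
print(X_conv, X)   # the two values are close for small dt
```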
Bibliography

[1] Aït-Sahalia, Y., Mykland, P. A. and Zhang, L., 2005. How Often to Sample a Continuous-time Process in the Presence of Market Microstructure Noise. Review of Financial Studies, 18(2), pp. 351-416.
[2] Anderson, T. W., 1959. On Asymptotic Distributions of Estimates of Parameters of Stochastic Difference Equations. Annals of Mathematical Statistics, 30(3), pp. 676-687.
[3] Basak, G. K. and Lee, P., 2008. Asymptotic Properties of an Estimator of the Drift Coefficients of Multidimensional Ornstein-Uhlenbeck Processes that Are Not Necessarily Stable. Electronic Journal of Statistics, 2, pp. 1309-1344.
[4] Bierens, H. J., 2001. Complex Unit Roots and Business Cycles: Are They Real?. Econometric Theory, 17(5), pp. 962-983.
[5] Brockwell, P. J. and Davis, R. A., 1991. Time Series: Theory and Methods. Second Edition, Springer.
[6] Chan, N. H. and Wei, C. Z., 1988. Limiting Distributions of Least Squares Estimates of Unstable Autoregressive Processes. The Annals of Statistics, 16(1), pp. 367-401.
[7] Hall, A., 1989. Testing for a Unit Root in the Presence of Moving Average Errors. Biometrika, 76(1), pp. 49-56.
[8] Hamilton, J. D., 1994. Time Series Analysis (Vol. 2). Princeton: Princeton University Press.
[9] Jacod, J. and Shiryaev, A. N., 2003. Limit Theorems for Stochastic Processes. Second Edition, Springer.
[10] Lin, N. and Lototsky, S. V., 2011. Undamped Harmonic Oscillator Driven by Additive Gaussian White Noise: A Statistical Analysis. Communications on Stochastic Analysis, 5(1), pp. 233-250.
[11] Lin, N. and Lototsky, S. V., 2014. Second-order Continuous-time Non-stationary Gaussian Autoregression. Statistical Inference for Stochastic Processes, 17(1), pp. 19-49.
[12] Mickens, R. E., 1994. Nonstandard Finite Difference Models of Differential Equations. World Scientific.
[13] Mickens, R. E., Oyedeji, K. and Rucker, S., 2005. Exact Finite Difference Scheme for Second-order, Linear ODEs Having Constant Coefficients. Journal of Sound and Vibration, 287(4), pp. 1052-1056.
[14] Mikulevicius, R. and Platen, E., 1991. Rate of Convergence of the Euler Approximation for Diffusion Processes. Mathematische Nachrichten, 151(1), pp. 233-239.
[15] Mikulevicius, R. and Zhang, C., 2011. On the Rate of Convergence of Weak Euler Approximation for Nondegenerate SDEs Driven by Lévy Processes. Stochastic Processes and their Applications, 121(8), pp. 1720-1748.
[16] Mishura, Y. and Munchak, Y., 2016. Functional Limit Theorems for Additive and Multiplicative Schemes in the Cox-Ingersoll-Ross Model. arXiv preprint arXiv:1604.01584.
[17] Perron, P., 1991. A Continuous Time Approximation to the Unstable First-order Autoregressive Process: the Case Without an Intercept. Econometrica: Journal of the Econometric Society, pp. 211-236.
[18] Pesaran, M. H., 2015. Time Series and Panel Data Econometrics. Oxford University Press.
[19] Phillips, P. C., 1987. Towards a Unified Asymptotic Theory for Autoregression. Biometrika, 74(3), pp. 535-547.
[20] Phillips, P. C. and Magdalinos, T., 2007. Limit Theory for Moderate Deviations From a Unit Root. Journal of Econometrics, 136(1), pp. 115-130.
[21] Phillips, P. C. and Perron, P., 1988. Testing for a Unit Root in Time Series Regression. Biometrika, 75(2), pp. 335-346.
[22] Phillips, P. C., Wu, Y. and Yu, J., 2011. Explosive Behavior in the 1990s NASDAQ: When Did Exuberance Escalate Asset Values? International Economic Review, 52(1), pp. 201-226.
[23] Shimizu, Y., 2009. Notes on Drift Estimation for Certain Non-recurrent Diffusion Processes From Sampled Data. Statistics & Probability Letters, 79(20), pp. 2200-2207.
[24] Talay, D. and Tubaro, L., 1990. Expansion of the Global Error for Numerical Schemes Solving Stochastic Differential Equations. Stochastic Analysis and Applications, 8(4), pp. 483-509.
[25] Taniguchi, M. and Kakizawa, Y., 2000. Asymptotic Theory of Statistical Inference for Time Series. First Edition, Springer.
[26] Wang, X. and Yu, J., 2015. Limit Theory for an Explosive Autoregressive Process. Economics Letters, 126, pp. 176-180.
[27] Wang, X. and Yu, J., 2016. Double Asymptotics for Explosive Continuous Time Models. Journal of Econometrics, 193(1), pp. 35-53.