LINEAR FILTERING AND ESTIMATION IN CONDITIONALLY GAUSSIAN MULTI-CHANNEL MODELS

by

Li Xu

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, in Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS)

August 2013

Copyright 2013 Li Xu

Table of Contents

List of Figures
Abstract
Chapter 1 PRELIMINARIES
1.1 Parameter Estimation
1.2 Optimal Filtering
1.2.1 Nonlinear filtering for diffusion Markov processes
1.2.2 The Kalman-Bucy Filter
1.3 Conditionally Gaussian Processes
1.4 Multi-Channel System
1.5 Asymptotic Behavior of the Solution to the Riccati Equation
Chapter 2 DIAGONAL STOCHASTIC PARABOLIC EQUATIONS
2.1 Introduction
2.2 The One-Dimensional Stochastic Heat Equation with Constant Coefficient
2.3 Sieve Estimate for the One-Dimensional Stochastic Heat Equation with Time-Dependent Coefficient
2.4 Kernel Estimator for the One-Dimensional Stochastic Heat Equation with Time-Dependent Coefficient
2.5 Optimal Filtering of Stochastic Parabolic Equations
Chapter 3 DIAGONAL STOCHASTIC HYPERBOLIC EQUATIONS
Chapter 4 FIRST ORDER MULTI-CHANNEL MODEL
4.1 The Linear Filtering
4.2 Moments and Related Results
4.3 Asymptotic Efficiency
Chapter 5 SECOND ORDER MULTI-CHANNEL MODEL
5.1 One-Dimensional Stochastic Wave Equation
5.2 Main Problem
5.3 Nonlinear Filtering of Diagonalizable Equations
5.4 Multi-Channel Model and Linear Filtering
5.5 Second Order Linear Stochastic Ordinary Differential Equation
5.6 Conditional Moments
5.6.1 Upper Bound and Lower Bound
5.6.2 Asymptotic Efficiency
5.7 Comparison Theorem
5.8 Other Possible Approaches
Chapter 6 MORE DIAGONALIZABLE STOCHASTIC EQUATIONS
6.1 The One-Dimensional Stochastic Parabolic Equation
References

List of Figures

1.1 Phase portrait of the Riccati equation, $\gamma > 0$
1.2 Phase portrait of the Riccati equation, $\gamma = 0$, $\alpha > 0$
1.3 Phase portrait of the Riccati equation, $\gamma = 0$, $\alpha < 0$

Abstract

The aim of this thesis is to estimate the random coefficient parameter of first and second order conditionally Gaussian multi-channel models and to study asymptotic properties of those estimators.
The connection between certain stochastic partial differential equations and the multi-channel model serves as the initial motivation of this dissertation. In the first chapter we give some preliminaries, including necessary terminology and the asymptotic behavior of the solution to the Riccati equations developed for the analysis of the estimation or filtering error. In the second chapter we introduce the maximum likelihood estimator of the constant coefficients in parabolic and hyperbolic stochastic PDEs. The main objective is to investigate the spectral method used for computing MLEs based on finite-dimensional approximations to solutions of such SPDEs, and to state the known results on the usual range of asymptotic problems in parameter estimation: consistency, asymptotic normality and asymptotic efficiency as the dimension of the approximation goes to infinity. It turns out that, with the tool of the spectral method, we can reduce a parameter estimation problem for a stochastic PDE to a parameter estimation problem for a multi-channel system of stochastic ODEs. In the third chapter we discuss the situation in which the parameter $\theta$ is a deterministic function of $t$: $\theta = \theta(t)$, $0 \le t \le T$.

With the motivation described in the second and third chapters, the remaining chapters extend the study to the cases with stochastic parameter $\theta(\omega)$ and $\theta_t(\omega)$. When the unknown parameters are stochastic, we are actually dealing with an optimal filtering problem. However, with a random $\theta$ (a single random variable or a stochastic process), consistency is not guaranteed. Even though the so-called finite-dimensional approximations do not converge to the true solution of the original stochastic PDE, optimal filtering with multi-channel observation is still of interest in its own right. The fourth chapter is devoted to the one-dimensional stochastic heat equation and the first order multi-channel model. The stochastic wave equation case and the second order multi-channel model are contained in Chapter 5. More diagonalizable stochastic equations are investigated in the last chapter as a further discussion.

Chapter 1 PRELIMINARIES

The study of parameter estimation for stochastic ordinary differential equations was active in the 1980's (see e.g. [12], [13]). Later, in the 1990's, parameter estimation for stochastic partial differential equations began to receive a fair amount of attention (see e.g. [2], [3]). Maximum likelihood estimators for parabolic stochastic PDEs have been studied by several authors. M. Huebner and B. L. Rozovskii [11] worked on the cases in which the parameter subject to estimation is a scalar, using a spectral method for computing MLEs based on finite-dimensional approximations to solutions of parabolic SPDEs. For the problem of estimating a bounded (deterministic) coefficient function (of time) in SPDEs, M. Huebner and S. Lototsky [10] applied the method of sieves to obtain the maximum likelihood estimate. They also constructed a kernel estimator for the same problem [9].

1.1 Parameter Estimation

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$ be a filtered complete probability space satisfying the usual hypotheses. For ease of exposition, we restrict ourselves to the one-dimensional situation for now. Let $\xi = (\xi_t, \mathcal{F}_t)$, $0 \le t \le T$, be an observable process whose distribution depends on an unknown parameter $\theta$. The problem is: given the observed values of $\xi$ up to time $t$, $0 \le t \le T$, namely, given $\mathcal{F}^\xi_t := \sigma(\xi_s : 0 \le s \le t)$, can one design a reasonable method of finding or estimating $\theta \in \Theta \subseteq \mathbb{R}$?
In other words, for given $t$, we want to find a function $f_t(\xi_{[0,t]})$ so as to make $|\theta - f_t(\xi_{[0,t]})|$ as small as possible in a certain sense. One common approach is to use the maximum likelihood estimator (MLE) of $\theta$ based on $\{\xi_s : 0 \le s \le t\}$. For this purpose, we need to construct a certain likelihood ratio of the measures generated by the processes $\xi^a = (\xi^a_s, \mathcal{F}_s)$, $0 \le s \le t$, where $\xi^a_s = \xi_s|_{\theta = a}$, for $a \in \Theta$.

Another widely used criterion for an estimate being "best" is to minimize the mean-square error, $E|\theta - f_t(\xi_{[0,t]})|^2$, for each given $t \in [0,T]$. Basic probability theory indicates that choosing $f_t(\xi_{[0,t]}) = E[\theta \mid \mathcal{F}^\xi_t]$ actually minimizes $E|\theta - f_t(\xi_{[0,t]})|^2$ among all $\mathcal{F}^\xi_t$-measurable estimators. However, as $\theta$ is deterministic, $E[\theta \mid \mathcal{F}^\xi_t] = \theta$ is a "mathematically legal" solution, yet practically useless because we do not know $\theta$. How can we state mathematically that we do not know the value of a fixed constant? There is probably no way to answer this question. In spite of this, there are several ways to formulate the above problem mathematically so that the trivial solution is excluded. One of them is to find $f_t(\xi_{[0,t]})$ which minimizes $\sup_{\theta \in \Theta} E|\theta - f_t(\xi_{[0,t]})|^2$ (the so-called minimax setting). Another, usually referred to as the Bayesian approach, is to assume that $\theta$ itself is random: first we choose randomly $\theta = a \in \Theta$ and then run the process with this $\theta = a$. In this Bayesian setting the problem is to minimize $E|\theta - f_t(\xi_{[0,t]})|^2$. Here, the best mean-square estimate of $\theta$, known as the minimum mean square error (MMSE) estimator, is given by
\[
E(\theta \mid \mathcal{F}^\xi_t) = E(\theta \mid \xi_{[0,t]}) =
\begin{cases}
\sum_{a \in \Theta} a\, P(\theta = a \mid \xi_{[0,t]}), & \text{if } \Theta \text{ is countable}, \\[1ex]
\int_\Theta a\, \pi(a \mid \xi_{[0,t]})\, da, & \text{if } \pi(a \mid \xi_{[0,t]}) \text{ exists},
\end{cases}
\]
where $\pi(a \mid \xi_{[0,t]})$ is the posterior density. As we can see, the main problem in finding the MMSE estimator in the Bayesian setting is to compute or generate the posterior distribution of the parameter $\theta$ given a certain realization of the process $\xi = (\xi_t, \mathcal{F}_t)$. In practice, we even want the posterior distribution to be generated in an efficient way, say recursively. But we stop this topic here, because this thesis mainly deals with the situation where the unknown $\theta$ is an unobservable process $\theta = (\theta_t, \mathcal{F}_t)$, $0 \le t \le T$.

1.2 Optimal Filtering

We define $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$ as before. Let $(\theta, \xi)$ be a two-dimensional partially observable random process, where $\theta = (\theta_t, \mathcal{F}_t)$, $0 \le t \le T$, is an unobservable component and $\xi = (\xi_t, \mathcal{F}_t)$, $0 \le t \le T$, is an observable component. It is natural to ask whether we can estimate the evolution of $\theta$ based on the observation of $\xi$; more generally, we are interested in some $\mathcal{F}_t$-measurable function $h_t$ of $(\theta, \xi)$. The problem of optimal filtering for a partially observable process $(\theta, \xi)$ consists in constructing, for each instant $t$, $0 \le t \le T$, an optimal mean-square estimate of some $\mathcal{F}_t$-measurable function $h_t$ of $(\theta, \xi)$ on the basis of the observation results $\xi_s$, $0 \le s \le t$.

Assuming $E|h_t|^2 < \infty$, the optimal mean-square estimate is evidently the conditional expectation $m_t(h) = E[h_t \mid \mathcal{F}^\xi_t]$. To determine $m_t(h)$, we need some special assumptions on the structure of the process $(h, \xi)$. Under relatively loose assumptions on the process type, $m_t(h)$ can be characterized by a stochastic differential equation, which is called the optimal nonlinear filtering equation and can be found in many fundamental filtering references (e.g. Liptser and Shiryaev [14], Fujisaki, Kallianpur and Kunita [7], Stratonovich [21]).
However, this basic equation of (optimal nonlinear) filtering is more theoretical than practical. The main problem is that the representation of the solution, $m_t(h)$, is in general hard to find or write in closed form. Using the basic equation of filtering as the primary tool, two classical filtering models, nonlinear diffusion and linear Gaussian, have been developed and extensively studied from both theoretical and applied points of view. The following subsections are devoted to a summary of the main results in these two categories.

1.2.1 Nonlinear filtering for diffusion Markov processes

Let us consider a two-dimensional diffusion Markov process $(\theta, \xi) = (\theta_t, \xi_t)$, $0 \le t \le T$, satisfying some necessary regularity and integrability assumptions. We shall assume that the conditional distribution $P(\theta_t \le x \mid \mathcal{F}^\xi_t)$, $0 \le t \le T$, has the density
\[
\rho_x(t) = \frac{dP(\theta_t \le x \mid \mathcal{F}^\xi_t)}{dx}, \quad x \in \mathbb{R},
\]
which is a measurable function of $(t, x, \omega)$ (this is analogous to the Bayesian setting). Under some difficult-to-check assumptions, the conditional density $\rho_x(t)$, $x \in \mathbb{R}$, $0 \le t \le T$, satisfies a stochastic partial differential equation. For simplicity, and to be more practical, we confine ourselves to a fairly simple (but nevertheless non-trivial!) case of the process $(\theta, \xi)$ for which the conditional density $\rho_x(t)$ exists and is the unique solution of the corresponding equation. It will be assumed that the random process $(\theta, \xi) = [(\theta_t, \xi_t), \mathcal{F}_t]$, $0 \le t \le T$, satisfies the stochastic differential equations
\[
d\theta_t = a(\theta_t)\,dt + dW_1(t), \tag{1.1}
\]
\[
d\xi_t = A(\theta_t)\,dt + dW_2(t), \tag{1.2}
\]
where the random variable $\theta_0$ and the Wiener processes $W_i = (W_i(t), \mathcal{F}_t)$, $i = 1, 2$, are independent, $P(\xi_0 = 0) = 1$, $E\theta_0^2 < \infty$.

Theorem 1.2.1. Suppose that

(I) the functions $a(x)$, $A(x)$ are uniformly bounded, together with their derivatives $a'(x)$, $a''(x)$, $a'''(x)$, $A'(x)$ and $A''(x)$ (by a constant $K$);

(II) $|A''(x) - A''(y)| \le K|x - y|$, $|a'''(x) - a'''(y)| \le K|x - y|$;

(III) the distribution function $F(x) = P(\theta_0 \le x)$ has a twice continuously differentiable density $f(x) = dF(x)/dx$.

Then for each $t$, $0 \le t \le T$, there exists ($P$-a.s.)
\[
\rho_x(t) = \frac{dP(\theta_t \le x \mid \mathcal{F}^\xi_t)}{dx},
\]
which is an $\mathcal{F}^\xi_t$-measurable (for each $t$, $0 \le t \le T$) solution of the equation
\[
d_t \rho_x(t) = L^* \rho_x(t)\,dt + \rho_x(t)\left[ A(x) - \int_{-\infty}^{\infty} A(y)\rho_y(t)\,dy \right]\left[ d\xi_t - \left( \int_{-\infty}^{\infty} A(y)\rho_y(t)\,dy \right) dt \right], \tag{1.3}
\]
with $\rho_x(0) = f(x)$ and
\[
L^* \rho_x(t) = -\frac{\partial}{\partial x}\left[a(x)\rho_x(t)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\rho_x(t)\right].
\]
In the class of functions $U_x(t)$, measurable in $(t, x, \omega)$, twice continuously differentiable in $x$, $\mathcal{F}^\xi_t$-measurable for each $t$, $0 \le t \le T$, and satisfying the condition
\[
P\left\{ \int_0^T \left( \int_{-\infty}^{\infty} A(x)U_x(t)\,dx \right)^2 dt < \infty \right\} = 1, \tag{1.4}
\]
the solution to Equation (1.3) is unique in the following sense: if $U^{(1)}_x(t)$ and $U^{(2)}_x(t)$ are two such solutions, then
\[
P\left( \sup_{0 \le t \le T} |U^{(1)}_x(t) - U^{(2)}_x(t)| > 0 \right) = 0, \quad -\infty < x < \infty. \tag{1.5}
\]

1.2.2 The Kalman-Bucy Filter

We shall consider a two-dimensional Gaussian random process $(\theta_t, \xi_t)$, $0 \le t \le T$, satisfying the stochastic differential equations
\[
d\theta_t = a(t)\theta_t\,dt + b(t)\,dW_1(t), \tag{1.6}
\]
\[
d\xi_t = A(t)\theta_t\,dt + B(t)\,dW_2(t), \tag{1.7}
\]
where $W_1 = (W_1(t), \mathcal{F}_t)$ and $W_2 = (W_2(t), \mathcal{F}_t)$ are two independent Wiener processes and $\theta_0$, $\xi_0$ are $\mathcal{F}_0$-measurable. It will be assumed that the deterministic measurable functions $a(t)$, $b(t)$, $A(t)$ and $B(t)$ are such that
\[
\int_0^T |a(t)|\,dt < \infty, \quad \int_0^T b^2(t)\,dt < \infty, \tag{1.8}
\]
\[
\int_0^T |A(t)|\,dt < \infty, \quad \int_0^T B^2(t)\,dt < \infty. \tag{1.9}
\]
These assumptions guarantee that the linear equations (1.6), (1.7) have unique continuous solutions. Furthermore, $\theta_t$ can be expressed explicitly.
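Indeed, by the variation-of-constants formula for linear SDEs (the same computation is carried out for the multi-channel model in Section 4.2),
\[
\theta_t = \exp\left(\int_0^t a(r)\,dr\right)\left( \theta_0 + \int_0^t \exp\left(-\int_0^s a(r)\,dr\right)b(s)\,dW_1(s) \right).
\]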
The problem of optimal linear non-stationary filtering (of $\theta_t$ from $\xi_{[0,t]}$) examined by Kalman and Bucy consists in the following. Suppose $\theta_t$, $0 \le t \le T$, is inaccessible to observation; one can only observe the values $\xi_t$, $0 \le t \le T$, containing incomplete (due to the presence of $A(t)$ and the noise $\int_0^t B(s)\,dW_2(s)$) information on the values $\theta_t$. At each moment, it is required to estimate in an "optimal" way the value $\theta_t$ based on the observed process $\xi_{[0,t]} = \{\xi_s : 0 \le s \le t\}$.

Taking optimality of estimation in the minimum mean-square sense, the optimal estimate of $\theta_t$ given $\xi_{[0,t]}$ is the conditional expectation $m_t = E(\theta_t \mid \mathcal{F}^\xi_t)$, by basic probability theory. The error of estimation (of filtering) used here is denoted by
\[
\gamma_t := E\left[(\theta_t - m_t)^2 \mid \mathcal{F}^\xi_t\right] = E\left[(\theta_t - m_t)^2\right].
\]
The second equality here is due to the fact that $(\theta_t, \xi_t)$ is Gaussian.

The assumption that $(\theta_t, \xi_t)$ is Gaussian is crucial in the Kalman-Bucy method. As we know, the a posteriori mean $m_t = E(\theta_t \mid \mathcal{F}^\xi_t)$ is, in general, nonlinear. But thanks to the normal correlation property of the Gaussian process $(\theta_t, \xi_t)$, $0 \le t \le T$, the optimal estimate $m_t$ turns out to be linear. Moreover, the general equations of filtering easily lead to a closed system of dynamic equations, mainly because of the well-known relations between the moments of a Gaussian random variable (or vector). The scheme originally suggested by Kalman and Bucy was not deduced from the general equations of filtering, and there are quite a few different ways to arrive at the Kalman-Bucy filtering equations. Here, we only state this famous result without proof.

Theorem 1.2.2. Let $(\theta_t, \xi_t)$, $0 \le t \le T$, be a two-dimensional Gaussian process described by the system of equations (1.6) and (1.7). Let (1.8) and (1.9) also be satisfied, and require further that
\[
\int_0^T A^2(t)\,dt < \infty, \quad B^2(t) \ge C > 0, \quad 0 \le t \le T.
\]
Then the conditional expectation $m_t = E(\theta_t \mid \mathcal{F}^\xi_t)$ and the mean-square filtering error $\gamma_t = E(\theta_t - m_t)^2$ satisfy the system of equations
\[
dm_t = a(t)m_t\,dt + \frac{\gamma_t A(t)}{B^2(t)}\left(d\xi_t - A(t)m_t\,dt\right), \tag{1.10}
\]
\[
\dot{\gamma}_t = 2a(t)\gamma_t - \frac{A^2(t)\gamma_t^2}{B^2(t)} + b^2(t), \tag{1.11}
\]
with $m_0 = E(\theta_0 \mid \xi_0)$, $\gamma_0 = E(\theta_0 - m_0)^2$. The system of equations (1.10) and (1.11) has a unique continuous solution (for $\gamma_t$, in the class of non-negative functions).

Equation (1.11) is a deterministic Riccati equation for $\gamma_t$, and $\gamma_t$ can be regarded as a measure of filtering efficiency. Later in this thesis we will need some results about the Riccati equation for the efficiency analysis; this is done in Section 1.5.
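As a minimal numerical sketch of Theorem 1.2.2 (an illustration only; the constant coefficients $a$, $b$, $A$, $B$ and all numerical values below are assumptions chosen for the example), the recursions (1.10)-(1.11) can be discretized alongside an Euler-Maruyama simulation of (1.6)-(1.7):

```python
import numpy as np

# Minimal Euler discretization of the Kalman-Bucy equations (1.10)-(1.11).
# The constant coefficients a, b, A, B and all numerical values are
# illustrative assumptions, not taken from the text.
rng = np.random.default_rng(0)
T, n = 1.0, 10_000
dt = T / n
a, b, A, B = -1.0, 0.5, 1.0, 0.2

theta = 1.0                   # unobserved state, theta_0 = 1
m, gamma = 0.0, 1.0           # filter mean m_0 and variance gamma_0

for _ in range(n):
    dW1 = rng.normal(0.0, np.sqrt(dt))
    dW2 = rng.normal(0.0, np.sqrt(dt))
    dxi = A * theta * dt + B * dW2            # observation increment, eq. (1.7)
    theta += a * theta * dt + b * dW1         # state increment, eq. (1.6)
    # filter update, eqs. (1.10)-(1.11)
    m += a * m * dt + (gamma * A / B**2) * (dxi - A * m * dt)
    gamma += (2 * a * gamma - (A**2) * gamma**2 / B**2 + b**2) * dt

print(f"theta_T = {theta:.4f}, m_T = {m:.4f}, gamma_T = {gamma:.4f}")
```

In this sketch $\gamma_t$ evolves deterministically, as (1.11) predicts, while $m_t$ tracks the hidden state through the innovation $d\xi_t - A(t)m_t\,dt$.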
1.3 Conditionally Gaussian Processes

In deducing the Kalman-Bucy filtering equations (1.10) and (1.11) from the general equations of filtering, one encounters an essential difficulty: in order to find $E(\theta_t \mid \mathcal{F}^\xi_t)$, it is necessary to know the conditional moments of higher order, $E(\theta_t^2 \mid \mathcal{F}^\xi_t)$, $E(\theta_t^3 \mid \mathcal{F}^\xi_t)$. Therefore we are forced to search for additional relations between the moments of higher orders so as to obtain a closed system. In Section 1.2.2, the partially observable random process $(\theta, \xi)$ is Gaussian, which yields the well-known relation
\[
E(\theta_t^3 \mid \mathcal{F}^\xi_t) = 3E(\theta_t \mid \mathcal{F}^\xi_t)E(\theta_t^2 \mid \mathcal{F}^\xi_t) - 2\left[E(\theta_t \mid \mathcal{F}^\xi_t)\right]^3, \tag{1.12}
\]
and ensures the closed system of equations given by (1.10) and (1.11). A reasonable extension of the Kalman-Bucy filtering equations can thus be carried out by searching for another class of random processes $(\theta, \xi) = (\theta_t, \xi_t)$, $0 \le t \le T$, possessing additional relations between their moments.

Definition 1.3.1. A random process $(\theta, \xi) = [(\theta_t, \xi_t), \mathcal{F}_t]$, $0 \le t \le T$, is called a conditionally Gaussian process if for any $t$ and any $0 \le t_0 < t_1 < \dots < t_n \le t$, the conditional distributions
\[
F_{\xi_{[0,t]}}(x_0, \dots, x_n) = P(\theta_{t_0} \le x_0, \dots, \theta_{t_n} \le x_n \mid \mathcal{F}^\xi_t)
\]
are ($P$-a.s.) Gaussian.

If $(\theta, \xi)$ is conditionally Gaussian, the relation (1.12) still holds, and the solution of the filtering problem can be obtained similarly to the Gaussian case. But it should be pointed out that, with $(\theta, \xi)$ conditionally Gaussian, $\gamma_t = E[(\theta_t - m_t)^2 \mid \mathcal{F}^\xi_t]$ is in general not the same as $E(\theta_t - m_t)^2$; also, the estimator $m_t = E(\theta_t \mid \mathcal{F}^\xi_t)$ is then, generally speaking, nonlinear.

1.4 Multi-Channel System

Let the process $\theta = (\theta_t, \mathcal{F}_t)$, $0 \le t \le T$, be inaccessible to observation, and let $\{\xi_n = (\xi_n(t), \mathcal{F}_t),\ 0 \le t \le T\}_{n=1}^\infty$ be a sequence of observable processes whose values contain incomplete information about the process $\theta = (\theta_t, \mathcal{F}_t)$, $0 \le t \le T$. For any fixed integer $N$, $(\theta_t, \xi_1(t), \dots, \xi_N(t))$, $0 \le t \le T$, forms an $(N+1)$-dimensional partially observed process. We are interested (at each moment $t$) in estimating (or filtering) in an "optimal" way, say in the minimum mean-square error sense, the value $\theta_t$ on the basis of the observed processes $\{\xi_{1,[0,t]}, \dots, \xi_{N,[0,t]}\}$, where $\xi_{n,[0,t]} = \{\xi_n(s),\ 0 \le s \le t\}$.

In other words, if we define, for each $N$,
\[
m_N(t) = E[\theta_t \mid \mathcal{F}^\xi_{N,t}], \quad \gamma_N(t) := E[(\theta_t - m_N(t))^2 \mid \mathcal{F}^\xi_{N,t}],
\]
where $\mathcal{F}^\xi_{N,t} = \sigma(\{\xi_n(s) : 0 \le s \le t,\ n = 1, \dots, N\})$, we want to find or characterize $m_N(t)$ and $\gamma_N(t)$ for any $t$ and $N$. Furthermore, we are also interested in the asymptotic behavior of $m_N(t)$ and $\gamma_N(t)$ as more and more observations are at hand, i.e., as $N \to \infty$. The filtering error of the multi-channel system $(\theta_t, \{\xi_n(t)\}_{n=1}^N)$, $0 \le t \le T$, $N = 1, 2, \dots$, might vanish as more and more channels are added (i.e., $\gamma_N(t) \to 0$ as $N \to \infty$). With vanishing error $\gamma_N(t)$, one can go on to study the efficiency or convergence rate of $m_N(t)$ towards $\theta_t$.

1.5 Asymptotic Behavior of the Solution to the Riccati Equation

Let $x_n(t)$, $n = 1, 2, \dots$, be the non-negative solutions of the following sequence of Riccati equations:
\[
\dot{x}_n(t) = A(t)x_n(t) - B_n(t)x_n^2(t) + C(t), \quad t \in (t_0, T], \tag{1.13}
\]
where $A(t)$, $C(t)$ are assumed to be bounded on $[t_0, T]$, and $C(t) \ge 0$. Furthermore, we assume $B_{n+1}(t) \ge B_n(t) \ge 0$ for all $t \in [t_0, T]$ and any $n \in \mathbb{N}$, and the $B_n(t)$, $n \ge 1$, are continuous in $t$. We may later also allow $B_n(t)$ to be stochastic, in order to generalize our results.

Some of our interest lies in sufficient conditions for the convergence of $x_n(t)$ to zero, and the type of convergence. We want $x_n(t)$ to converge to zero in the "uniform" sense, i.e.,
\[
\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t) = 0. \tag{1.14}
\]
To guarantee (1.14), two natural guesses for a sufficient condition are:

1) $\int_{t_0}^T B_n(t)\,dt \to \infty$ as $n \to \infty$;

2) $B_n(t) \to \infty$ as $n \to \infty$, for any $t \in [t_0, T]$.

Remark. In the case of $B_n(t)$ being random processes, (1.14) is written as
\[
P\left( \omega : \lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t)(\omega) = 0 \right) = 1.
\]
The following example immediately shows that the first condition is not sufficient.

Example 1.5.1. Suppose $x_n(t)$ is the non-negative solution of
\[
\dot{x}_n(t) = \alpha x_n(t) - B_n(t)x_n^2(t) + \gamma, \tag{1.15}
\]
where $\alpha > 0$, $\gamma > 0$, and the $B_n(t)$, $n \ge 1$, satisfy the following properties. For a fixed partition $t_0 < t_1 < t_2 < t_3 < t_4 < T$:

1) $B_n(t) \le \beta$ on $[t_0, t_1] \cup [t_4, T]$, for all $n \ge 1$ and some $\beta \ge 0$;

2) $B_n(t) = n$ on $[t_2, t_3]$;

3) on $[t_1, t_2] \cup [t_3, t_4]$, the $B_n(t)$, $n \ge 1$, are smooth and $B_{n+1}(t) \ge B_n(t)$.

Obviously, with the above properties,
\[
\int_{t_0}^T B_n(t)\,dt \ge n(t_3 - t_2) \to \infty, \quad \text{as } n \to \infty.
\]
For simplicity, let us suppose $x_n(t_0) = 0$ for all $n \ge 1$. For $t \in [t_0, t_1]$, compare the solution $x_n(t)$ of (1.15), using Lemma 1.5.2, with the solution $x(t)$ of the equation
\[
\dot{x}(t) = \alpha x(t) - \beta x^2(t) + \gamma, \quad t \in (t_0, t_1), \qquad x(t_0) = 0.
\]
Then we have, for any $t \in (t_0, t_1]$,
\[
x_n(t) \ge x(t) > 0, \quad \forall n;
\]
the last inequality is based on the explicit expression for the solution of the Riccati equation with constant coefficients; see Lemma 1.5.3 and the phase portraits in Theorem 1.5.6 for details. Hence
\[
\sup_{t_0 \le t \le T} x_n(t) \ge \sup_{t_0 \le t \le t_1} x_n(t) \ge \sup_{t_0 \le t \le t_1} x(t) > 0, \quad \forall n.
\]
Therefore we cannot have uniform convergence to 0. Furthermore, the sequence of solutions $\{x_n(t)\}_{n=1}^\infty$ does not converge to 0 on $[t_4, T]$ either. If we compare each $x_n(t)$ on $[t_4, T]$, using Lemma 1.5.2, with the solution $y(t)$ of the equation
\[
\dot{y}(t) = \alpha y(t) - \beta y^2(t) + \gamma, \quad t \in (t_4, T), \qquad y(t_4) = \inf_{n \ge 1} x_n(t_4),
\]
then for any $t \in (t_4, T]$,
\[
x_n(t) \ge y(t) > 0, \quad \forall n.
\]
(For the last inequality, notice that $y(t_4) = \inf_{n \ge 1} x_n(t_4) \ge 0$ and $\gamma > 0$. According to Lemma 1.5.3 and the phase portraits in Theorem 1.5.6, $y(t) > 0$ for all $t \in (t_4, T]$.)

We note that $\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t) = 0$ implies $\lim_{n \to \infty} x_n(t_0) = 0$. So from now on, we always assume that the initial conditions of equations (1.13) satisfy $\lim_{n \to \infty} x_n(t_0) = 0$.

To prepare for the main results, we list some related facts.

Lemma 1.5.2. Assume that $x = x(t) \ge 0$ is a solution of
\[
\dot{x}(t) = \alpha_1(t)x(t) - \beta_1(t)x^2(t) + \gamma_1(t),
\]
and $y = y(t) \ge 0$ is a solution of
\[
\dot{y}(t) = \alpha_2(t)y(t) - \beta_2(t)y^2(t) + \gamma_2(t).
\]
If $x(t_0) \ge y(t_0) \ge 0$ and, for all $t \ge t_0$,
\[
\alpha_1(t) \ge \alpha_2(t), \quad \beta_2(t) \ge \beta_1(t) \ge 0, \quad \gamma_1(t) \ge \gamma_2(t) \ge 0,
\]
then $x(t) \ge y(t)$ for all $t \ge t_0$.

Proof. Define $z(t) = x(t) - y(t)$; then direct computation shows that
\[
\dot{z}(t) = F(t)z(t) + G(t),
\]
where $F(t) = \alpha_1(t) - \beta_1(t)[x(t) + y(t)]$ and
\[
G(t) = [\alpha_1(t) - \alpha_2(t)]\,y(t) + [\beta_2(t) - \beta_1(t)]\,y^2(t) + \gamma_1(t) - \gamma_2(t).
\]
By assumption, $G(t) \ge 0$ for $t \ge t_0$ and $z(t_0) \ge 0$; it follows that $z(t) \ge 0$ for all $t \ge t_0$. □

Fact 1. If the non-negative sequence $\{x_n(t_0)\}_{n \ge 1}$ is decreasing, then the sequence of solutions $\{x_n(t)\}_{n \ge 1}$ of equations (1.13) is also decreasing in $n$, for all $t \ge t_0$.

Proof. Note that for any $n \ge 1$, $x_n(t_0) \ge x_{n+1}(t_0) \ge 0$ and $B_{n+1}(t) \ge B_n(t) \ge 0$ for all $t \ge t_0$. Then $x_n(t) \ge x_{n+1}(t)$ for all $t \ge t_0$, which follows from Lemma 1.5.2. □

Fact 2. With $\sup_{n \ge 1} x_n(t_0) < \infty$, the sequence $\{x_n(t)\}_{n \ge 1}$ is bounded on $[t_0, T]$.

Proof. Consider the solution $x(t)$ of the equation
\[
\dot{x}(t) = \alpha x(t) + \gamma, \qquad x(t_0) = \sup_{n \ge 1} x_n(t_0), \tag{1.16}
\]
where $\alpha := \sup_{t_0 \le t \le T} A(t)$ and $\gamma := \sup_{t_0 \le t \le T} C(t) \ge 0$ ($A(t)$, $C(t)$ as in equation (1.13)). Direct application of Lemma 1.5.2 shows that $x_n(t) \le x(t)$ for all $t \in [t_0, T]$ and each $n \ge 1$, and
\[
x(t) =
\begin{cases}
x(t_0)e^{\alpha(t - t_0)} + \dfrac{\gamma}{\alpha}\left(e^{\alpha(t - t_0)} - 1\right), & \alpha \ne 0, \\[1ex]
x(t_0) + \gamma(t - t_0), & \alpha = 0.
\end{cases} \qquad \Box
\]

Lemma 1.5.3. Let $x(t)$ be the non-negative solution of
\[
\dot{x}(t) = \alpha x(t) - R^2 x^2(t) + \gamma, \tag{1.17}
\]
with $x(0) = x_0 \ge 0$, $\gamma \ge 0$. Then for any $t > 0$, $x(t) \to 0$ as $R \to \infty$.

Proof. Equation (1.17) has the solution
\[
x(t) = \frac{\alpha_1 - K\alpha_2\exp((\alpha_2 - \alpha_1)R^2 t)}{1 - K\exp((\alpha_2 - \alpha_1)R^2 t)},
\]
where
\[
\alpha_1 = \frac{1}{R^2}\left( \frac{\alpha}{2} - \sqrt{\frac{\alpha^2}{4} + R^2\gamma} \right), \quad
\alpha_2 = \frac{1}{R^2}\left( \frac{\alpha}{2} + \sqrt{\frac{\alpha^2}{4} + R^2\gamma} \right), \quad
K = \frac{x_0 - \alpha_1}{x_0 - \alpha_2}.
\]
Note that if $x_0 > 0$, then $K \to 1$ as $R \to \infty$; and if $x_0 = 0$, then $K \to -1$ as $R \to \infty$. By directly investigating $x(t)$, we have, for $t > 0$ and $\gamma > 0$,
\[
x(t) \sim \frac{1}{R^2}\left( \frac{\alpha}{2} + \sqrt{\frac{\alpha^2}{4} + R^2\gamma} \right) \sim \frac{\sqrt{\gamma}}{R},
\]
so $x(t) \to 0$ as $R \to \infty$. If $\gamma = 0$, then
\[
\exp\left((\alpha_2 - \alpha_1)R^2 t\right) = \exp(\alpha t), \quad \alpha_1 = 0, \quad \alpha_2 = \frac{\alpha}{R^2}, \quad K = \frac{x_0}{x_0 - \alpha/R^2},
\]
and we have
\[
x(t) = \frac{-K\alpha\,e^{\alpha t}}{R^2 - R^2 K e^{\alpha t}}
= \frac{-\dfrac{x_0}{x_0 - \alpha/R^2}\cdot\dfrac{\alpha}{R^2}\,e^{\alpha t}}{1 - \dfrac{x_0}{x_0 - \alpha/R^2}\,e^{\alpha t}}.
\]
So $x(t) \equiv 0$ if $x_0 = 0$, and $x(t) \to 0$ as $R \to \infty$ if $x_0 > 0$. □
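A quick numerical sanity check of Lemma 1.5.3 (an illustrative sketch; the values of $\alpha$, $\gamma$, $x_0$ below are arbitrary assumptions): integrating (1.17) for increasing $R$ should show $x(T)$ shrinking at the rate $\sqrt{\gamma}/R$.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Numerical illustration of Lemma 1.5.3: solutions of the Riccati equation
# x' = alpha*x - R^2*x^2 + gamma decay like sqrt(gamma)/R as R grows.
# alpha, gamma, x0 are arbitrary illustrative choices.
alpha, gamma, x0, T = 1.0, 2.0, 0.5, 1.0

for R in [10.0, 100.0, 1000.0]:
    sol = solve_ivp(lambda t, x: alpha * x - R**2 * x**2 + gamma,
                    (0.0, T), [x0], method="Radau", rtol=1e-8, atol=1e-12)
    print(f"R = {R:7.1f}:  x(T) = {sol.y[0, -1]:.6f},  "
          f"sqrt(gamma)/R = {np.sqrt(gamma)/R:.6f}")
```

The stiff solver (`Radau`) is used because the relaxation rate of (1.17) grows like $R\sqrt{\gamma}$.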
Now we state one sufficient condition for the convergence of $x_n(t)$ to zero.

Theorem 1.5.4. Consider a sequence of Riccati equations
\[
x_n'(t) = A(t)x_n(t) - B_n(t)x_n^2(t) + C(t), \quad t_0 < t < T, \quad n \in \mathbb{N}, \tag{1.18}
\]
where $C(t) \ge 0$, $B_n(t) \ge 0$, $B_{n+1}(t) \ge B_n(t)$ for all $t \in [t_0, T]$. Assume further that the initial conditions satisfy $x_n(t_0) \searrow 0$. Let $\{x_n(t)\}_{n \ge 1}$ be the sequence of corresponding non-negative solutions. If for any $N > 0$ there exist some $B_m(t)$ ($m = m(N)$) and $k \in \mathbb{N}$ such that
\[
B_m(t) \ge N, \quad \forall t \in \left[t_0 + \tfrac{1}{k},\, T\right],
\]
then
\[
\sup_{t_0 \le t \le T} x_n(t) \to 0, \quad \text{as } n \to \infty.
\]

Proof. Let $y_n(t)$ be the non-negative solution of
\[
\dot{y}_n(t) = \alpha y_n(t) + \gamma_n, \quad t_0 < t < t_0 + \tfrac{1}{k},
\]
with $y_n(t_0) = x_n(t_0) \ge 0$, the same initial condition as in equation (1.18), where
\[
\alpha = \sup_{t_0 \le t \le T} A(t), \quad \gamma_n = \max\left\{ \sup_{t_0 \le t \le T} C(t),\ -\alpha y_n(t_0) \right\}.
\]
One easily sees that, if $\alpha \ge 0$, then $\gamma_n = \sup_{t_0 \le t \le T} C(t) =: \gamma$; and if $\alpha < 0$, then $\gamma_n \searrow \gamma$ as $n \to \infty$ (since $y_n(t_0) = x_n(t_0) \searrow 0$ as $n \to \infty$).

For each $n \ge 1$, by Lemma 1.5.2, we have
\[
y_n(t) \ge x_n(t) \ge 0, \quad \forall t \in \left[t_0,\, t_0 + \tfrac{1}{k}\right].
\]
If $\alpha \ne 0$, we have
\[
y_n(t) = y_n(t_0)e^{\alpha(t - t_0)} + \frac{\gamma_n}{\alpha}\left(e^{\alpha(t - t_0)} - 1\right),
\]
with
\[
y_n'(t) = e^{\alpha(t - t_0)}\left[\alpha y_n(t_0) + \gamma_n\right] \ge 0,
\]
so we obtain
\[
\sup_{[t_0,\, t_0 + \frac{1}{k}]} y_n(t) = y_n\!\left(t_0 + \tfrac{1}{k}\right) = y_n(t_0)e^{\alpha/k} + \frac{\gamma_n}{\alpha}\left(e^{\alpha/k} - 1\right).
\]
If $\alpha = 0$, then $y_n(t) = y_n(t_0) + \gamma_n(t - t_0)$, which is also increasing on $\left[t_0,\, t_0 + \tfrac{1}{k}\right]$, and
\[
\sup_{[t_0,\, t_0 + \frac{1}{k}]} y_n(t) = y_n\!\left(t_0 + \tfrac{1}{k}\right) = y_n(t_0) + \frac{\gamma_n}{k}.
\]
Now let us consider the interval $\left[t_0 + \tfrac{1}{k},\, T\right]$. For each fixed $n \ge m(N)$, we have $B_n(t) \ge N$. Let $z_n(t)$ be the non-negative solution of
\[
\dot{z}_n(t) = \alpha z_n(t) - N z_n^2(t) + \gamma, \quad t_0 + \tfrac{1}{k} < t < T, \qquad z_n\!\left(t_0 + \tfrac{1}{k}\right) = x_n\!\left(t_0 + \tfrac{1}{k}\right),
\]
where $\alpha = \sup_{t_0 \le t \le T} A(t)$, $\gamma = \sup_{t_0 \le t \le T} C(t)$. Applying Lemma 1.5.2, we have
\[
z_n(t) \ge x_n(t) \ge 0, \quad \forall t \in \left[t_0 + \tfrac{1}{k},\, T\right].
\]
Then, using Lemma 1.5.3, we have
\[
\sup_{[t_0 + \frac{1}{k},\, T]} z_n(t) \le \frac{l_n}{\sqrt{N}} \quad (N \text{ sufficiently large}),
\]
for some constant $l_n$ depending on $\gamma$, $t_0$, $T$ and the value of $z_n(t_0 + \tfrac{1}{k}) = x_n(t_0 + \tfrac{1}{k})$. By Fact 2 and the details in Lemma 1.5.3, we know $l_n$ is bounded, i.e., $l_n \le l$ for each $n \ge m(N)$, which implies
\[
\sup_{[t_0 + \frac{1}{k},\, T]} x_n(t) \le \frac{l}{\sqrt{N}}.
\]
Combining all the reasoning above, we get
\[
\sup_{[t_0, T]} x_n(t) \le \max\left\{ y_n\!\left(t_0 + \tfrac{1}{k}\right),\ \frac{l}{\sqrt{N}} \right\}, \quad \forall n \ge m(N).
\]
Letting $N \to \infty$ and $k \to \infty$, with $n \ge m(N)$ so that $n \to \infty$, we obtain
\[
\frac{l}{\sqrt{N}} \to 0, \quad y_n\!\left(t_0 + \tfrac{1}{k}\right) \to 0;
\]
the second limit follows from
\[
y_n\!\left(t_0 + \tfrac{1}{k}\right) \to y_n(t_0) = x_n(t_0) \searrow 0, \quad \text{as } k \to \infty,\ n \to \infty.
\]
Therefore,
\[
\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t) = 0. \qquad \Box
\]

Fact 3. For each $n \ge 1$, suppose $x_n(t)$ is the non-negative solution of
\[
x_n'(t) = A(t)x_n(t) - B_n(t)x_n^2(t) + C(t), \quad t \in (t_0, T],
\]
where $C(t) \ge 0$ and $B_{n+1}(t) \ge B_n(t)$ for all $n \ge 1$, $t \in [t_0, T]$. Under the assumption "$x_n(t_0) \searrow 0$ as $n \to \infty$", the following two statements are equivalent:

(i) $x_n(t)$ converges to zero uniformly on $[t_0, T]$;

(ii) $x_n(t)$ converges to zero everywhere on $[t_0, T]$.

Proof. (i) ⇒ (ii): trivial.

(ii) ⇒ (i): Assume that (ii) is true. Then, under the assumption $x_n(t_0) \searrow 0$ as $n \to \infty$, it follows from Fact 1 that
\[
x_n(t) \ge x_{n+1}(t), \quad \forall t \in [t_0, T], \quad \forall n \ge 1.
\]
So the sequence $\left\{\sup_{t_0 \le t \le T} x_n(t)\right\}_{n \ge 1}$ is decreasing and bounded below by 0; hence the limit $\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t)$ exists. But if $\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t) \ne 0$, then there exists $\epsilon > 0$ such that
\[
\sup_{t_0 \le t \le T} x_n(t) > \epsilon, \quad \forall n.
\]
Thus, for each $n$, there is some point $t_n \in [t_0, T]$ with $x_n(t_n) > \epsilon/2$. The points $\{t_n\}_{n \ge 1}$ lie in $[t_0, T]$, so we can find a convergent subsequence $\{t_{n_k}\}_{k \ge 1}$ with $t_{n_k} \to t^* \in [t_0, T]$ as $k \to \infty$. For simplicity, we still use $\{t_n\}_{n \ge 1}$ to denote the subsequence, and $\{x_n\}_{n \ge 1}$ the corresponding functions.
Because $\{x_n(t)\}$ is monotone decreasing in $n$,
\[
x_n(t_m) \ge x_m(t_m) > \frac{\epsilon}{2}, \quad \forall m > n.
\]
Then, by continuity of $x_n(t)$, letting $m \to \infty$,
\[
x_n(t^*) \ge \frac{\epsilon}{2}, \quad \forall n.
\]
This contradicts (ii). □

Now let $\{x_n(t)\}_{n \ge 1}$ be the same as in Fact 3, but with $C(t) \equiv 0$ on $[t_0, T]$; that is, $x_n(t)$ satisfies
\[
x_n'(t) = A(t)x_n(t) - B_n(t)x_n^2(t), \quad \forall t \in (t_0, T].
\]
One can easily verify that, for any $t \in [t_0, T]$,
\[
x_n(t) = \frac{x_n(t_0)\exp\left(\int_{t_0}^t A(r)\,dr\right)}{1 + x_n(t_0)\int_{t_0}^t \exp\left(\int_{t_0}^s A(r)\,dr\right)B_n(s)\,ds}.
\]
Notice that $A(t)$ is bounded on $[t_0, T]$, hence $\exp\left(\int_{t_0}^t A(r)\,dr\right)$ is also bounded. If $x_n(t_0) \to 0$ as $n \to \infty$, then
\[
x_n(t) \to 0, \quad \text{as } n \to \infty, \quad \forall t \in (t_0, T].
\]
Furthermore, if $\{x_n(t_0)\}_{n \ge 1}$ is monotone decreasing in $n$ (recall that $\{B_n(t)\}_{n \ge 1}$ is monotone increasing in $n$), then according to Fact 3,
\[
\lim_{n \to \infty} \sup_{t_0 \le t \le T} x_n(t) = 0.
\]
Remark. The above analysis says that, in the trivial case $C(t) \equiv 0$ on $[t_0, T]$, the initial condition "$x_n(t_0) \searrow 0$ as $n \to \infty$" is sufficient for the uniform convergence of $x_n(t)$ to zero; no additional assumption on $\{B_n(t)\}_{n \ge 1}$ is needed.

Next we are going to see that in the non-trivial case, $C(t) \not\equiv 0$, the assumption in Theorem 1.5.4 that $B_n(t)$ be unbounded is necessary.

Theorem 1.5.5. For the same equations as in Theorem 1.5.4,
\[
x_n'(t) = A(t)x_n(t) - B_n(t)x_n^2(t) + C(t),
\]
where $C(t) \ge 0$, $B_{n+1}(t) \ge B_n(t) \ge 0$ for all $t \in [t_0, T]$, $n \ge 1$: if there is some interval $[a, b] \subset [t_0, T]$ such that, for some constant $L > 0$,
\[
B_n(t) \le L, \quad \forall t \in [a, b], \quad \forall n \ge 1,
\]
and $C(t) \not\equiv 0$ on $[a, b]$, then
\[
\sup_{t_0 \le t \le T} x_n(t) \nrightarrow 0, \quad n \to \infty.
\]

Proof. Consider
\[
\dot{z}(t) = \alpha z(t) - L z^2(t) + C(t), \quad \text{where } \alpha = \inf_{t_0 \le t \le T} A(t).
\]
If $x_n(a) \nrightarrow 0$, we are done. So assume $x_n(a) \to 0$ as $n \to \infty$. Then for each fixed $n$, with $z(a) = 0 \le x_n(a)$, we apply Lemma 1.5.2 to obtain
\[
x_n(t) \ge z(t) \ge 0, \quad \forall t \in [a, b], \quad \forall n,
\]
where $z(t)$ is a non-trivial solution, since $C(t) \not\equiv 0$ on $[a, b]$. And $z(t) \not\equiv 0$ on $[a, b]$ implies $\delta := \sup_{a \le t \le b} z(t) > 0$. In summary, the following inequality holds for each $n$:
\[
\sup_{t_0 \le t \le T} x_n(t) \ge \sup_{a \le t \le b} x_n(t) \ge \sup_{a \le t \le b} z(t) = \delta > 0. \qquad \Box
\]

Theorem 1.5.6. Let $\{x_n(t) : t \in [t_0, T]\}_{n=1}^\infty$ be the sequence of non-negative solutions of the equations
\[
\dot{x}_n(t) = A(t)x_n(t) - B_n(t)x_n^2(t) + C(t), \quad n \ge 1, \tag{1.19}
\]
where, for any $t \in [t_0, T]$, $A(t)$ and $C(t)$ are bounded, $C(t) \ge 0$; the $B_n(t)$ are continuous and
\[
B_{n+1}(t) \ge B_n(t) \ge 0, \quad \forall t \in [t_0, T], \quad \forall n \ge 1.
\]
If there exist $\delta > 0$, $\bar{\delta} > 0$ such that
\[
\lim_{n \to \infty} \inf_{[t_0 + \delta,\, T - \bar{\delta}]} B_n(t) = \infty,
\]
then, for arbitrary $\tilde{\delta} > \delta$ with $\tilde{\delta} < T - \bar{\delta} - t_0$, we have
\[
\lim_{n \to \infty} \left( \sup_{[t_0 + \tilde{\delta},\, T - \bar{\delta}]} x_n(t) \right) = 0.
\]

Remark. This theorem places no restriction on the non-negative initial conditions $\{x_n(t_0)\}_{n=1}^\infty$.

Proof. On the interval $[t_0, t_0 + \delta]$, define $y_n(t)$ to be the non-negative solution of
\[
\dot{y}_n(t) = \alpha_0 y_n(t) + \gamma_n^0, \qquad y_n(t_0) = x_n(t_0), \tag{1.20}
\]
where
\[
\alpha_0 := \sup_{[t_0,\, t_0 + \delta]} A(t), \quad \gamma_n^0 := \max\left\{ \sup_{[t_0,\, t_0 + \delta]} C(t),\ -\alpha_0 y_n(t_0) \right\} \ge 0.
\]
Now, for each $n \ge 1$, by Lemma 1.5.2, we have
\[
y_n(t) \ge x_n(t) \ge 0, \quad \forall t \in [t_0, t_0 + \delta].
\]
If $\alpha_0 \ne 0$, we know
\[
y_n(t) = y_n(t_0)e^{\alpha_0(t - t_0)} + \frac{\gamma_n^0}{\alpha_0}\left(e^{\alpha_0(t - t_0)} - 1\right),
\]
with $y_n'(t) \ge 0$ on $[t_0, t_0 + \delta]$, so
\[
\sup_{[t_0,\, t_0 + \delta]} y_n(t) = y_n(t_0 + \delta) = y_n(t_0)e^{\alpha_0\delta} + \frac{\gamma_n^0}{\alpha_0}\left(e^{\alpha_0\delta} - 1\right).
\]
If $\alpha_0 = 0$, then $y_n(t) = y_n(t_0) + \gamma_n^0(t - t_0)$, and we also have
\[
\sup_{[t_0,\, t_0 + \delta]} y_n(t) = y_n(t_0 + \delta) = y_n(t_0) + \gamma_n^0\,\delta.
\]
To summarize,
\[
\sup_{[t_0,\, t_0 + \delta]} y_n(t) = y_n(t_0 + \delta) \le C_1\left(x_n(t_0) + 1\right),
\]
for some constant $C_1$ depending on $\sup_t A(t)$, $\sup_t C(t)$ and $\delta$; that is,
\[
C_1 \propto e^{\alpha_0\delta} \vee \frac{\left(e^{\alpha_0\delta} - 1\right)\sup_t C(t)}{\alpha_0} \vee \delta\,\sup_t C(t) \vee 1.
\]
We will construct $y_n(t)$ for $t \in (t_0 + \delta, T - \bar{\delta}]$ later. Next, we will need some results for the constant-coefficient Riccati equation.
Consider $x(t)$, the non-negative solution of
\[
\dot{x}(t) = \alpha x(t) - R^2 x^2(t) + \gamma, \tag{1.21}
\]
where $x(t_0) = x_0 \ge 0$, $\gamma \ge 0$, $R \ge 0$. Here $x(t)$ can be written explicitly as
\[
x(t) = \frac{\alpha_1 - K\alpha_2\exp((\alpha_2 - \alpha_1)R^2(t - t_0))}{1 - K\exp((\alpha_2 - \alpha_1)R^2(t - t_0))}, \quad t \ge t_0,
\]
where
\[
\alpha_1 = \frac{1}{R^2}\left( \frac{\alpha}{2} - \sqrt{\frac{\alpha^2}{4} + R^2\gamma} \right) \le 0, \quad
\alpha_2 = \frac{1}{R^2}\left( \frac{\alpha}{2} + \sqrt{\frac{\alpha^2}{4} + R^2\gamma} \right) \ge 0, \quad
K = \frac{x_0 - \alpha_1}{x_0 - \alpha_2}.
\]
Equation (1.21) is autonomous. Investigating its phase portrait, we can easily see the following behavior of the solution trajectories (see Figures 1.1-1.3).

[Figure 1.1: Phase portrait of the Riccati equation, $\gamma > 0$.]

[Figure 1.2: Phase portrait of the Riccati equation, $\gamma = 0$, $\alpha > 0$.]

[Figure 1.3: Phase portrait of the Riccati equation, $\gamma = 0$, $\alpha < 0$.]

For any $x(t_0) \ge 0$: if $x(t_0) \le \alpha_2$, then we have $\sup_{t \ge t_0} x(t) \le \alpha_2$. If $x(t_0) \ge \alpha_2$, then since $x'(t) < 0$ as long as $x(t) > \alpha_2$, for any $z \in (\alpha_2, x_0)$, with $x_0 := x(t_0)$, there exists $t^* > t_0$ such that
\[
x(t) \le x(t^*) = z, \quad \forall t \ge t^*.
\]
Now, for fixed $x_0 > \alpha_2$ and $z \in (\alpha_2, x_0)$, let us solve
\[
x(t) = \frac{\alpha_1 - K\alpha_2\exp((\alpha_2 - \alpha_1)R^2(t - t_0))}{1 - K\exp((\alpha_2 - \alpha_1)R^2(t - t_0))} = z. \tag{1.22}
\]
With
\[
\alpha_2 - \alpha_1 = \frac{2}{R^2}\sqrt{\frac{\alpha^2}{4} + R^2\gamma},
\]
we find that equation (1.22) is equivalent to
\[
\log\frac{\alpha_1 - z}{K(\alpha_2 - z)} = (\alpha_2 - \alpha_1)R^2(t - t_0) = 2\sqrt{\frac{\alpha^2}{4} + R^2\gamma}\,(t - t_0).
\]
Note that
\[
\frac{\alpha_1 - z}{K(\alpha_2 - z)} = \frac{(x_0 - \alpha_2)(z - \alpha_1)}{(x_0 - \alpha_1)(z - \alpha_2)} > 0.
\]

Lemma 1.5.7. For positive numbers $a, b, c > 0$,
\[
1 < \frac{(b + c)(a + b)}{(a + b + c)b} < \frac{a + b}{b}.
\]
Proof. The result is easily obtained by direct computation. □

Taking $a = \alpha_2 - \alpha_1$, $b = z - \alpha_2$, $c = x_0 - z$, we have
\[
1 < \frac{(x_0 - \alpha_2)(z - \alpha_1)}{(x_0 - \alpha_1)(z - \alpha_2)} < \frac{z - \alpha_1}{z - \alpha_2},
\]
\[
0 < 2\sqrt{\frac{\alpha^2}{4} + R^2\gamma}\,(t^* - t_0) = \log\frac{z - \alpha_1}{K(z - \alpha_2)} < \log\frac{z - \alpha_1}{z - \alpha_2},
\]
where $t^*$ is chosen such that $x(t^*) = z$. Hence
\[
t^* < \frac{1}{2\sqrt{\frac{\alpha^2}{4} + R^2\gamma}}\,\log\frac{z - \alpha_1}{z - \alpha_2} + t_0. \tag{1.23}
\]
Thus, for any
\[
t > \frac{1}{2\sqrt{\frac{\alpha^2}{4} + R^2\gamma}}\,\log\frac{z - \alpha_1}{z - \alpha_2} + t_0,
\]
we have $x(t) < z$.

Now let us construct $y_n(t)$, $t \in (t_0 + \delta, T - \bar{\delta}]$, as the non-negative solution of
\[
\dot{y}_n(t) = \alpha y_n(t) - \beta_n y_n^2(t) + \gamma, \quad t_0 + \delta < t < T - \bar{\delta},
\]
with $y_n(t_0 + \delta)$ obtained from the solution of (1.20), where
\[
\alpha := \sup_{[t_0 + \delta,\, T - \bar{\delta}]} A(t), \quad \beta_n := \inf_{[t_0 + \delta,\, T - \bar{\delta}]} B_n(t), \quad \gamma := \sup_{[t_0 + \delta,\, T - \bar{\delta}]} C(t).
\]
By definition, $\beta_n \nearrow \infty$ and $\gamma \ge 0$. Applying Lemma 1.5.2, since $y_n(t_0 + \delta) \ge x_n(t_0 + \delta)$ (from the analysis on the first interval), we have
\[
y_n(t) \ge x_n(t) \ge 0, \quad \forall t \in [t_0 + \delta,\, T - \bar{\delta}].
\]
For each $n \ge 1$, define $\alpha_{n,2} \ge 0 \ge \alpha_{n,1}$ as follows:
\[
\alpha_{n,1} = \frac{1}{\beta_n}\left( \frac{\alpha}{2} - \sqrt{\frac{\alpha^2}{4} + \beta_n\gamma} \right), \quad
\alpha_{n,2} = \frac{1}{\beta_n}\left( \frac{\alpha}{2} + \sqrt{\frac{\alpha^2}{4} + \beta_n\gamma} \right).
\]
According to the results just stated for the constant-coefficient Riccati equation:

If $y_n(t_0 + \delta) \le \alpha_{n,2}$, then $\sup_{t \ge t_0 + \delta} y_n(t) \le \alpha_{n,2}$.

If $y_n(t_0 + \delta) > \alpha_{n,2}$, let $z_n = \alpha_{n,2} + \frac{1}{\beta_n}$. If $y_n(t_0 + \delta) \in (\alpha_{n,2}, z_n]$, then $y_n(t)$ is decreasing on $[t_0 + \delta,\, T - \bar{\delta}]$, so $\sup_{t \ge t_0 + \delta} y_n(t) \le z_n$. If $y_n(t_0 + \delta) > z_n$, there exists $t_n > t_0 + \delta$ such that
\[
y_n(t_n) = z_n, \quad y_n(t) \le z_n, \quad \forall t > t_n,
\]
and, by inequality (1.23),
\[
t_n < \frac{1}{2\sqrt{\frac{\alpha^2}{4} + \beta_n\gamma}}\,\log\frac{z_n - \alpha_{n,1}}{z_n - \alpha_{n,2}} + t_0 + \delta.
\]
Denote
\[
\tau_n = \frac{1}{2\sqrt{\frac{\alpha^2}{4} + \beta_n\gamma}}\,\log\frac{z_n - \alpha_{n,1}}{z_n - \alpha_{n,2}}
= \frac{1}{2\sqrt{\frac{\alpha^2}{4} + \beta_n\gamma}}\,\log\left(1 + \beta_n\alpha_{n,2} - \beta_n\alpha_{n,1}\right)
= \frac{1}{2\sqrt{\frac{\alpha^2}{4} + \beta_n\gamma}}\,\log\left(1 + 2\sqrt{\frac{\alpha^2}{4} + \beta_n\gamma}\right) \to 0,
\]
as $n \to \infty$, since $\beta_n \to \infty$ as $n \to \infty$. Therefore, for any $\epsilon > 0$ there exists $N$ such that $0 < \tau_n < \epsilon$ whenever $n > N$. So $t_n < t_0 + \delta + \epsilon$ for all $n > N$, and
\[
y_n(t_0 + \delta + \epsilon) < z_n.
\]
Furthermore, we have
\[
\sup_{[t_0 + \delta + \epsilon,\, T - \bar{\delta}]} x_n(t) \le \sup_{[t_0 + \delta + \epsilon,\, T - \bar{\delta}]} y_n(t) \le z_n, \quad \forall n > N.
\]
Taking limits on both sides, we get
\[
\limsup_{n \to \infty} \left( \sup_{[t_0 + \delta + \epsilon,\, T - \bar{\delta}]} x_n(t) \right) \le \limsup_{n \to \infty} z_n = 0.
\]
Since $x_n(t)$ is non-negative, we have proved the desired result, with $\tilde{\delta} = \delta + \epsilon$, $\epsilon$ an arbitrarily small positive number. □
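Theorem 1.5.6 is easy to observe numerically. The sketch below (an illustration only; $A(t)$, $C(t)$, the interval, the blow-up profile of $B_n$ and the initial value are all arbitrary assumptions) integrates (1.19) for a sequence $B_n$ blowing up on the interior of $[t_0, T]$ and reports the supremum of $x_n$ over an interior interval, which shrinks even though $x_n(t_0)$ stays fixed:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustration of Theorem 1.5.6: with B_n blowing up on the interior of
# [t0, T], the sup of x_n over an interior interval goes to 0 even though
# the initial conditions x_n(t0) do not. All choices are illustrative.
t0, T, delta = 0.0, 2.0, 0.25
A = lambda t: np.cos(t)          # bounded A(t)
C = lambda t: 1.0                # C(t) >= 0, not identically zero

def B(n, t):
    # B_n = n on [t0+delta, T-delta], bounded near the endpoints
    return n * ((t >= t0 + delta) & (t <= T - delta)) + 1.0

for n in [1, 10, 100, 1000]:
    rhs = lambda t, x, n=n: A(t) * x - B(n, t) * x**2 + C(t)
    sol = solve_ivp(rhs, (t0, T), [1.0], method="Radau", dense_output=True)
    tt = np.linspace(t0 + 2 * delta, T - delta, 400)   # interior interval
    print(f"n = {n:5d}:  sup over interior = {sol.sol(tt)[0].max():.5f}")
```

The printed suprema decay roughly like $\sqrt{\gamma/n}$, matching the bound $z_n = \alpha_{n,2} + 1/\beta_n$ from the proof.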
Chapter 2 DIAGONAL STOCHASTIC PARABOLIC EQUATIONS

2.1 Introduction

Parameter estimation for ordinary stochastic differential equations has received a fair amount of attention since the 1980's (see e.g. [12] and [13]), and the corresponding problem for stochastic partial differential equations started to be investigated mainly in the early 1990's (see e.g. [2], [3], [8], [11], and references therein).

Consider parabolic SPDEs of the form
\[
du(t,x) + (A_0 + \theta A_1)u(t,x)\,dt = dW(t,x), \quad 0 < t \le T,\ x \in G, \tag{2.1}
\]
with appropriate initial and boundary conditions, e.g. $u(0,x) = u_0(x)$, $D^\gamma u(t,x)|_{\partial G} = 0$ for any multi-index $\gamma$ with $|\gamma| \le \operatorname{ord}(A_0 + \theta A_1) - 1$, where $A_0$ and $A_1$ are partial differential operators, $\theta$ is a scalar parameter subject to estimation, and $W(t,x)$ is a cylindrical Brownian motion (C.B.M.) in $L_2(G)$; $G$ is a bounded domain in $\mathbb{R}^d$. For further discussion, let us denote $m_0 = \operatorname{ord}(A_0)$, $m_1 = \operatorname{ord}(A_1)$. The above problem is understood in the sense of distributions.

Let $u_0(x) = 0$ for all $x \in G$. Assume that the operators $A_0$ and $A_1$ have a common system of eigenfunctions $\{h_k(x),\ k \ge 1\}$ (an orthonormal basis in the Hilbert space $L_2(G)$):
\[
A_0 h_k = \rho_k h_k, \quad A_1 h_k = \nu_k h_k.
\]
Then the C.B.M. $W(t,x)$ can be interpreted as
\[
dW(t,x) = \sum_{k \ge 1} h_k(x)\,dw_k(t),
\]
where the $w_k(t)$ are independent standard Brownian motions. Define $u_k(t)$, $0 \le t \le T$, $k = 1, 2, \dots$, to be the solution of
\[
du_k(t) = -(\rho_k + \theta\nu_k)u_k(t)\,dt + dw_k(t), \qquad u_k(0) = 0,
\]
and define
\[
\hat{\theta}_N = -\frac{\sum_{k=1}^N \nu_k\int_0^T \left( u_k(t)\,du_k(t) + \rho_k u_k^2(t)\,dt \right)}{\sum_{k=1}^N \nu_k^2\int_0^T u_k^2(t)\,dt}. \tag{2.2}
\]

Theorem 2.1.1. If
\[
\sum_{k=1}^\infty \frac{\nu_k^2}{\rho_k + \theta\nu_k} = \infty,
\]
then
\[
\lim_{N \to \infty} \hat{\theta}_N = \theta \quad \text{with probability } 1,
\]
\[
\lim_{N \to \infty} \left( \sum_{k \le N} \frac{\nu_k^2}{\rho_k + \theta\nu_k} \right)^{1/2}(\hat{\theta}_N - \theta) \stackrel{d}{=} \mathcal{N}\!\left(0, \frac{2}{T}\right),
\]
and the measures generated by the solutions of equation (2.1) with $u(0,x) = 0$ are mutually singular for different values of $\theta$.

Remark. It is shown that the divergence of $\sum_k \frac{\nu_k^2}{\rho_k + \theta\nu_k}$ is necessary and sufficient for consistency and asymptotic normality of $\hat{\theta}_N$. As suggested by Huebner and Rozovskii in [11], (2.2) is, in fact, the maximum likelihood estimator of $\theta$ based on the observations $\{u_k(t);\ 0 \le t \le T,\ k = 1, \dots, N\}$.

2.2 The One-Dimensional Stochastic Heat Equation with Constant Coefficient

Consider the following specific example:
\[
du(t,x) = \theta u_{xx}(t,x)\,dt + dW(t,x), \quad 0 < t \le T,\ x \in (0,\pi), \tag{2.3}
\]
\[
u(0,x) = 0, \quad \forall x \in [0,\pi], \qquad u(t,0) = u(t,\pi) = 0, \quad \forall t \in [0,T],
\]
where $\theta > 0$ is an unknown real number and $W(t,x)$ is a C.B.M. with representation
\[
dW(t,x) = \sum_{k \ge 1} h_k(x)\,dw_k(t), \tag{2.4}
\]
where $h_k = \sqrt{2/\pi}\,\sin(kx)$, $k \ge 1$, $0 \le x \le \pi$, and the $w_k(t)$ are independent standard Brownian motions. If we express $u(t,x)$ as a Fourier series
\[
u(t,x) = \sum_{k \ge 1} u_k(t)h_k(x),
\]
then
\[
\frac{\partial u(t,x)}{\partial t} = \sum_{k \ge 1} u_k'(t)h_k(x), \quad \frac{\partial^2 u(t,x)}{\partial x^2} = \sum_{k \ge 1} -k^2 u_k(t)h_k(x).
\]
Substituting these series and (2.4) into (2.3) suggests that for each $k \ge 1$,
\[
du_k(t) = -k^2\theta u_k(t)\,dt + dw_k(t), \quad 0 < t \le T, \qquad u_k(0) = 0. \tag{2.5}
\]
In practice, we only have finitely many trajectories of observation, $\{u_k(t);\ k = 1, \dots, N,\ t \in [0,T]\}$, at hand. The original stochastic PDE is thus reduced to a multi-channel decoupled system of stochastic ODEs. Then (2.2) suggests that the MLE of $\theta$ based on $\{u_k(t)\}_{k=1}^N$ is
\[
\hat{\theta}_N = -\frac{\sum_{k=1}^N \int_0^T k^2 u_k(t)\,du_k(t)}{\sum_{k=1}^N \int_0^T k^4 u_k^2(t)\,dt},
\]
and obviously
\[
\hat{\theta}_N - \theta = -\frac{\sum_{k=1}^N \int_0^T k^2 u_k(t)\,dw_k(t)}{\sum_{k=1}^N \int_0^T k^4 u_k^2(t)\,dt}.
\]
By the law of large numbers, $\lim_{N \to \infty}(\hat{\theta}_N - \theta) = 0$ with probability 1. By the central limit theorem, we have
\[
\lim_{N \to \infty} N^{3/2}(\hat{\theta}_N - \theta) \stackrel{d}{=} \mathcal{N}\!\left(0, \frac{6\theta}{T}\right).
\]
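As an illustration (a simulation sketch, not part of the analysis above; the value of $\theta$, the grid size and the replacement of Itô integrals by forward Euler sums are all assumptions made for the example), one can simulate the channels (2.5) and check that $\hat{\theta}_N$ concentrates near the true $\theta$:

```python
import numpy as np

# Simulation sketch: estimate theta in du_k = -k^2*theta*u_k dt + dw_k
# via the MLE of Section 2.2, with Ito integrals replaced by forward
# Euler sums. theta, T, N and the grid size are illustrative choices.
rng = np.random.default_rng(1)
theta, T, N, n = 0.3, 1.0, 50, 20_000
dt = T / n
k = np.arange(1, N + 1)[:, None]           # channel indices, shape (N, 1)

u = np.zeros((N, n + 1))
dw = rng.normal(0.0, np.sqrt(dt), size=(N, n))
for i in range(n):                          # Euler-Maruyama for each channel
    u[:, i + 1] = u[:, i] - k[:, 0]**2 * theta * u[:, i] * dt + dw[:, i]

du = np.diff(u, axis=1)
num = np.sum(k**2 * u[:, :-1] * du)         # sum_k k^2 int u_k du_k
den = np.sum(k**4 * u[:, :-1]**2 * dt)      # sum_k k^4 int u_k^2 dt
print(f"theta_hat_N = {-num/den:.4f}  (true theta = {theta})")
```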
Furthermore, note that (2.5) has a unique strong solution (for each $k$),
\[
u_k(t) = e^{-k^2\theta t}\int_0^t e^{k^2\theta s}\,dw_k(s).
\]
Then we have the estimate
\[
E[u_k^2(t)] = \int_0^t e^{-2k^2\theta(t - s)}\,ds \le \frac{1}{2\theta k^2}.
\]
Hence $u(t,x) = \sum_{k \ge 1} u_k(t)h_k(x)$ is the unique solution of (2.3), and
\[
E\|u(\cdot, t)\|^2_{L_2((0,\pi))} = \sum_{k=1}^\infty E[u_k^2(t)] \le \sum_{k=1}^\infty \frac{1}{2\theta k^2} = \frac{\pi^2}{12\theta}.
\]
The convergence of $u(t,x) = \sum_{k \ge 1} u_k(t)h_k(x)$ guarantees that the $\theta$ we estimate for the multi-channel decoupled system of stochastic ODEs (2.5) is exactly the same $\theta$ subject to estimation in the stochastic heat equation (2.3).

2.3 Sieve Estimate for the One-Dimensional Stochastic Heat Equation with Time-Dependent Coefficient

If the unknown coefficient $\theta$ in equation (2.1) is a deterministic time-dependent function, then under appropriate assumptions on the operators $A_0$, $A_1$ and on $\theta(t)$, different approaches to constructing the estimate can be used (e.g. the kernel estimator in [9], the sieve estimator in [10]). Here, we use the one-dimensional stochastic heat equation to illustrate the method of sieves, which combines the methods used by Huebner and Rozovskii (1995) (see [2]) and Nguyen and Pham (1982) (see [19]).

Consider a general parabolic SPDE with time-dependent coefficient:
\[
du(t,x) = (A_0 + \theta_0(t)A_1)u(t,x)\,dt + dW(t,x), \quad t \in (0,T],\ x \in G, \qquad u(0,x) = u_0(x), \tag{2.6}
\]
where $W = W(t,x)$ is a C.B.M. We always assume the following.

Assumption 2.3.1.

(H1) There is a complete orthonormal system $\{h_k\}_{k \ge 1}$ in $L_2(G)$ such that
\[
A_0 h_k = \rho_k h_k, \quad A_1 h_k = \nu_k h_k;
\]

(H2) The eigenvalues $\nu_k$ and $\rho_k$ satisfy $|\nu_k| \sim k^{m_1/d}$ and $|\rho_k| \sim k^{m_0/d}$, and
\[
\mu_k := -(\rho_k + \theta(t)\nu_k) \sim k^{2m/d},
\]
uniformly in $t \in [0,T]$, where $m_0 = \operatorname{ord}(A_0)$, $m_1 = \operatorname{ord}(A_1)$ and $2m = \max(m_0, m_1)$.

Remark. Assumption 2.3.1 holds in many models, for instance when $A_0$ and $A_1$ commute and either $A_0$ or $A_1$ is uniformly elliptic and formally self-adjoint.

Suppose the process $u(t,x)$, $t \in [0,T]$, $x \in [0,\pi]$, is generated by the following equation:
\[
du(t,x) = \theta_0(t)u_{xx}(t,x)\,dt + dW(t,x), \quad 0 < t \le T,\ x \in (0,\pi), \tag{2.7}
\]
\[
u(0,x) = 0, \quad \forall x \in [0,\pi], \qquad u(t,0) = u(t,\pi) = 0, \quad \forall t \in [0,T].
\]
In this example, $d = 1$, $m_0 = 0$, $m_1 = 2$, $m = 1$ (where $2m = \max(m_0, m_1)$). As explained in Section 2.2,
\[
h_k = \sqrt{\frac{2}{\pi}}\,\sin(kx), \quad k \ge 1,\ 0 \le x \le \pi,
\]
is the common system of eigenfunctions of $A_0$, $A_1$, i.e., the spatial Fourier basis. The C.B.M. is
\[
dW(t,x) = \sum_{k \ge 1} h_k(x)\,dw_k(t) \tag{2.8}
\]
with independent standard Brownian motions $\{w_k(t)\}_{k \ge 1}$, and the spatial Fourier coefficients of the solution of equation (2.7) are
\[
u_k(t) = \int_0^t \exp\left(-k^2\int_s^t \theta_0(r)\,dr\right)dw_k(s).
\]
The sieve maximum likelihood estimate $\hat{\theta}_N$ is obtained by maximizing the likelihood ratio function based on the $N$ Fourier coefficients. The maximization is carried out over a sieve $\Theta_N$, that is, a finite-dimensional subspace of $\Theta$ (the set of admissible functions $\theta_0$).¹ The family of spaces $\{\Theta_N,\ N \ge 1\}$ needs to be chosen so that the approximation error vanishes as $N$ increases. If we use linear nested sieves, we assume
\[
\Theta = \left\{ \theta : \theta = \theta(t) = \sum_{j=1}^\infty \theta_j e_j(t) \right\},
\]
where $\{e_j(t),\ j \ge 1\}$ are known functions, orthonormal on $[0,T]$, and we can choose a subspace
\[
\Theta_N = \operatorname{span}\{e_1(t), \dots, e_{d_N}(t)\}, \tag{2.9}
\]
where $d_N$ denotes the dimension of the approximating space $\Theta_N$, depending on $N$. For example, let us consider the following function space:
\[
\Theta_\gamma(0,T) = \left\{ \theta = \theta(t) : |\theta_j|^2 \le \frac{L}{j^{\gamma + 1}} \right\},
\]
where $\theta_j = \int_0^T \theta(t)e_j(t)\,dt$, $\gamma > 0$ and $L$ is a constant.

¹ For the general equation (2.6), $\Theta$ should be the collection of functions for which assumption (H2) holds. If $A_1$ is not the leading operator, then $\Theta$ is the set of bounded measurable functions on $[0,T]$. If $A_1$ is the leading operator, then $\theta(t) \in \Theta$ should be positive and bounded away from 0.
Assuming $\theta_0(t) \in \Theta_\gamma(0,T)$ and $0 < C_1 \le \theta(t) \le C_2$ for all $t \in [0,T]$, for any admissible function $\theta(t)$ and some constants $C_1$, $C_2$, we can easily verify that $C_1 k^2 \le \mu_k(t) \le C_2 k^2$ ((H1) and (H2) are fulfilled). Then the sieve estimator $\hat{\theta}_N(t) = \sum_{j=1}^{d_N} \hat{\theta}_j e_j(t)$, or in vector form $\hat{\theta}_N = (\hat{\theta}_1, \dots, \hat{\theta}_{d_N})$, is the solution of a system of linear equations
\[
J(N)\hat{\theta}_N = a_N,
\]
where
\[
a_N = \left( -\sum_{k=1}^N k^2\int_0^T e_j(t)u_k(t)\,du_k(t) \right)_{j = 1, \dots, d_N}
\]
and
\[
J(N) = \left( \sum_{k=1}^N k^4\int_0^T e_i(t)e_j(t)u_k^2(t)\,dt \right)_{i,j = 1, \dots, d_N}.
\]
One can show that $J(N)$ is non-singular almost surely. If we use the cosine basis
\[
e_1(t) = \frac{1}{\sqrt{T}}, \quad e_j(t) = \sqrt{\frac{2}{T}}\,\cos\left(\frac{\pi(j-1)}{T}\,t\right), \quad j > 1,
\]
then, according to Theorem 3.1 and Theorem 3.2 in [10], we need
\[
\frac{d_N^2}{N} \to 0, \quad \frac{N^3}{d_N^\gamma} \to 0
\]
to guarantee the consistency and asymptotic normality of $\hat{\theta}_N(t)$ (in a weaker sense). Thus, we can take
\[
d_N \sim N^\alpha, \quad \text{where } \alpha \in \left(\frac{3}{\gamma},\, 1\right).
\]
Remark. The sieve estimator depends closely on the sieve we use: the admissible function space $\Theta$ and its subspaces $\Theta_N$, as well as the basis $\{e_j\}_{j=1}^{d_N}$. For example, if we choose Legendre polynomials (for simplicity, let $(0,T)$ be $(0,1)$),
\[
e_{j+1}(t) = \sqrt{2j + 1}\,p_j(t), \quad \text{where } p_j(t) = \frac{1}{2^j j!}\frac{d^j(t^2 - 1)^j}{dt^j}, \quad j = 0, 1, \dots,
\]
then to have a consistent and asymptotically normal estimate we take
\[
d_N \sim N^\alpha, \quad \text{where } \alpha \in \left(\frac{3}{\gamma},\, \frac{1}{2}\right).
\]

2.4 Kernel Estimator for the One-Dimensional Stochastic Heat Equation with Time-Dependent Coefficient

Using standard tools from density estimation (see, e.g., [6], [18]), one can also construct a kernel-type estimator for a time-dependent parameter in a stochastic parabolic equation (see [9]). Let us again take the one-dimensional stochastic heat equation (2.7) as an example, where $\theta_0(t)$ is a bounded measurable function on $[0,T]$; furthermore, $\theta_0(t)$ is smooth (infinitely differentiable). As in Section 2.3, we assume (H1) and (H2) are fulfilled. With
\[
h_k = \sqrt{\frac{2}{\pi}}\,\sin(kx), \quad k \ge 1,\ 0 \le x \le \pi,
\]
the solution of equation (2.7) can be written as
\[
u(t,x) = \sum_{k \ge 1} u_k(t)h_k(x),
\]
where $u_k(t)$, $k = 1, \dots, N$, satisfies
\[
du_k(t) = -k^2\theta_0(t)u_k(t)\,dt + dw_k(t), \qquad u_k(0) = 0, \quad k = 1, \dots, N.
\]
The series for the solution converges in $L_2(\Omega \times (0,T); L_2((0,\pi)))$. Since the $u_k(t)$ are independent random processes ($k = 1, \dots, N$), the kernel method can be used to construct an estimator of $\theta_0(t)$.

Recall that a function $R = R(t)$, $t \in \mathbb{R}$, is called a compactly supported kernel of order $K \ge 1$ if $R$ has the following properties:

a. $\exists C > 0$ such that $R(t) = 0$ for $|t| > C$;

b. $\int_{\mathbb{R}} R(t)\,dt = 1$;

c. $\int_{\mathbb{R}} t^j R(t)\,dt = 0$, $j = 1, \dots, K$.

In this example, let us choose
\[
R(t) = \frac{3(3 - 5t^2)}{8}\cdot 1_{\{|t| \le 1\}}.
\]
It is a compactly supported kernel of order 3. According to Theorem 3.1 of [9], the kernel estimator of $\theta_0(t)$ is given by
\[
\hat{\theta}_N(t) = \frac{1}{h_N F_{\nu,N}}\sum_{k=1}^N \int_0^T R\!\left(\frac{s - t}{h_N}\right)U_{k,N}(s)\,du_k(s), \tag{2.10}
\]
where $F_{\nu,N} = -\sum_{k=1}^N k^2$,
\[
U_{k,N}(t) =
\begin{cases}
\dfrac{1}{u_k(t)}, & |u_k(t)| > v_N, \\[1ex]
\dfrac{1}{v_N}, & |u_k(t)| \le v_N,
\end{cases}
\]
and $h_N$ is a bandwidth such that $0 < h_N \to 0$, $Nh_N \to \infty$ as $N \to \infty$; $\{v_N,\ N \ge 1\}$ is a sequence of positive real numbers with $v_N \downarrow 0$ as $N \to \infty$. The integral in (2.10) is well defined even where the Fourier coefficients $u_k(t)$ vanish. Here we can take $h_N = N^{-3/17}$, $v_N = N^{-4/17}$; then we have the following mean-square convergence:
\[
\sup_{t_0 \le t \le t_1} E\left|\hat{\theta}_N(t) - \theta_0(t)\right|^2 \le C(t_0, t_1)\,N^{-24/17},
\]
where $C(t_0, t_1)$ is a constant depending on $0 \le t_0 < t_1 \le T$. The kernel estimator $\hat{\theta}_N(t)$ and its convergence depend directly on the selection of the kernel and the choice of bandwidth. With $h_N = 100N^{-3/17}$ and $v_N = 0.1N^{-4/17}$, for example, we have the same asymptotic result. It is expected that taking a higher-order kernel would yield a better rate of convergence, but this also increases the computational complexity.
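A simulation sketch of (2.10) follows (not part of the original construction; the time-varying truth $\theta_0(t)$, the grid, and the replacement of the stochastic integral by a forward Euler sum are all illustrative assumptions):

```python
import numpy as np

# Sketch of the kernel estimator (2.10) on simulated channels of (2.7),
# with the Ito integral replaced by a forward Euler sum. The function
# theta0 and all grid choices are illustrative assumptions.
rng = np.random.default_rng(2)
T, n, N = 1.0, 20_000, 40
dt = T / n
s = np.linspace(0.0, T, n + 1)
theta0 = lambda t: 0.5 + 0.2 * np.sin(2 * np.pi * t / T)   # assumed truth

k = np.arange(1, N + 1)[:, None]
u = np.zeros((N, n + 1))
dw = rng.normal(0.0, np.sqrt(dt), size=(N, n))
for i in range(n):
    u[:, i + 1] = u[:, i] - k[:, 0]**2 * theta0(s[i]) * u[:, i] * dt + dw[:, i]

R = lambda t: 3 * (3 - 5 * t**2) / 8 * (np.abs(t) <= 1)    # order-3 kernel
h_N, v_N = N**(-3/17), N**(-4/17)
F = -np.sum(k**2)                                          # F_{nu,N}
U = np.where(np.abs(u[:, :-1]) > v_N, 1.0 / u[:, :-1], 1.0 / v_N)
du = np.diff(u, axis=1)

t_eval = 0.5
theta_hat = np.sum(R((s[:-1] - t_eval) / h_N) * U * du) / (h_N * F)
print(f"theta_hat({t_eval}) = {theta_hat:.4f}, true = {theta0(t_eval):.4f}")
```

For small $N$ the estimate is noisy, consistent with the rate $N^{-24/17}$ being an asymptotic statement.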
2.5 Optimal Filtering of Stochastic Parabolic Equations

So far, we have presented work on parameter estimation for parabolic equations (in particular, the heat equation) with deterministic (time-invariant and time-dependent) coefficient. A natural extension is to consider the case in which the parameter $\theta$ becomes random. This additional randomness in $\theta$ not only makes the model
\[
du(t,x) = (A_0 + \theta A_1)u(t,x)\,dt + dW(t,x), \quad 0 < t < T,\ x \in G, \qquad u(0,x) = u_0, \tag{2.11}
\]
more general, but also poses new and interesting mathematical challenges. When $\theta = \theta(t)$ is random, this estimation problem becomes a filtering problem, with the observation process being the solution of the SPDE (2.11). As in the deterministic cases, to have existence and uniqueness of the solution of equation (2.11), the unknown process $\theta$ must be uniformly bounded and is therefore modeled by a nonlinear diffusion equation with degenerating coefficients, while the right-hand side of (2.11) is linear in $u$. Thus, the filtering problem for equation (2.11) is neither a linear Gaussian filtering problem nor a nonlinear diffusion filtering problem (the two classical filtering models). The solution of equation (2.11) is still understood in the sense of distributions. We assume equation (2.11) is diagonalizable in the following sense:

(D1) There is a complete orthonormal system $\{h_k,\ k \ge 1\}$ in $L_2(G)$ such that
\[
A_0 h_k = \kappa_k h_k, \quad A_1 h_k = \nu_k h_k.
\]

(D2) There exist positive finite limits
\[
\lim_{k \to \infty} |\nu_k|\cdot k^{-m_1/d} < \infty, \quad \lim_{k \to \infty} |\kappa_k|\cdot k^{-m_0/d} < \infty,
\]
where $m_0 = \operatorname{ord}(A_0)$, $m_1 = \operatorname{ord}(A_1)$.

(D3) There exist positive real numbers $C_1$, $C_2$ such that, for all $t \in [0,T]$ and $\omega \in \Omega$,
\[
-C_1 < \liminf_{k \to \infty} \frac{\kappa_k + \theta(t)\nu_k}{k^{2m/d}}, \quad \limsup_{k \to \infty} \frac{\kappa_k + \theta(t)\nu_k}{k^{2m/d}} < -C_2,
\]
where $2m = \max(m_0, m_1)$.

With some assumptions on the initial condition $u_0$, the diagonalizable equation (2.11) is shown to have a unique solution in a certain Hilbert space; more details can be found in [20], [17]. Under conditions (D1)-(D3), equation (2.11) is equivalent to the following uncoupled system of stochastic ordinary differential equations:
\[
du_k(t) = (\kappa_k + \theta(t)\nu_k)u_k(t)\,dt + dw_k(t), \quad 0 < t < T, \quad k = 1, 2, \dots, \qquad u_k(0) = u_{0,k}, \tag{2.12}
\]
and the solution of (2.11) can be written as a Fourier series
\[
u(t,x) = \sum_{k \ge 1} u_k(t)h_k(x),
\]
converging in the corresponding Hilbert space.

Consider the problem of estimating $\theta$ from observations of the first $N$ Fourier coefficients (2.12). Condition (D3) implies that the process $\theta$ is uniformly bounded, that is, there exist real numbers $a_\theta$, $b_\theta$ such that, for all $\omega \in \Omega$,
\[
\inf_{0 \le t \le T} \theta(t) \ge a_\theta, \quad \sup_{0 \le t \le T} \theta(t) \le b_\theta. \tag{2.13}
\]
In particular, if $A_1$ is the leading operator, that is, $m_0 < m_1 = 2m$, then (D3) also implies $a_\theta > 0$, i.e., $\theta$ must be uniformly positive. A possible model for $\theta$ is the Itô diffusion equation
\[
d\theta(t) = B(t, \theta(t))\,dt + r(t, \theta(t))\,dV(t), \tag{2.14}
\]
where $B$ and $r$ are sufficiently regular functions and the Wiener process $V$ is independent of $W$ (for simplicity). To ensure condition (2.13), we can make the following modification. Define $\rho = \rho(x)$ to be a smooth, compactly supported function on $\mathbb{R}$ such that:

(i) there exist finite non-zero limits
\[
\lim_{x \to a_\theta} \frac{\rho(x) - \rho(a_\theta)}{x - a_\theta}, \quad \lim_{x \to b_\theta} \frac{\rho(x) - \rho(b_\theta)}{x - b_\theta} \ne 0;
\]

(ii) $\rho(x) > 0$ for $x \in (a_\theta, b_\theta)$, and $\rho(a_\theta) = \rho(b_\theta) = 0$;

(iii) $\rho(x) = 1$ on $[a_\theta + \delta, b_\theta - \delta]$ for some sufficiently small $\delta > 0$.

With the above $\rho$, we consider the following modification of equation (2.14):
\[
d\theta(t) = \rho(\theta(t))B(t, \theta(t))\,dt + \rho(\theta(t))r(t, \theta(t))\,dV(t), \tag{2.15}
\]
with some initial condition $\theta_0$, independent of $V$ and $W$. Under suitable regularity assumptions on the functions $B$ and $r$, equation (2.15) has a unique strong solution for every square-integrable initial condition; and if $\theta_0$ is a random variable taking values in $[a_\theta, b_\theta]$, then the solution of (2.15) satisfies (2.13). Therefore, we have a problem of nonlinear filtering of diagonalizable equations:
\[
d\theta(t) = \rho(\theta(t))B(t, \theta(t))\,dt + \rho(\theta(t))r(t, \theta(t))\,dV(t),
\]
\[
du_k(t) = (\kappa_k + \theta(t)\nu_k)u_k(t)\,dt + dw_k(t), \quad k = 1, \dots, N. \tag{2.16}
\]
The filtering problem for (2.16) consists in computing the conditional density of $\theta(t)$ given the observations $\{u_k(s),\ 0 \le s \le t\}_{k=1}^N$. Under certain regularity assumptions, one can write out the corresponding Kushner equation and Zakai equation; see [17] for more details.

Recall that the filtering density for (2.16) is a random field $\Pi = \Pi(t,x)$ such that, for every bounded measurable function $F = F(x)$,
\[
E\left(F(\theta(t)) \mid u_k(s),\ k = 1, \dots, N;\ 0 < s < t\right) = \int_{\mathbb{R}} F(x)\Pi(t,x)\,dx.
\]
Under condition (2.13), it is natural to expect $\Pi$ to be supported in $[a_\theta, b_\theta]$ for all $t$. Let
\[
(\mathcal{L}f)(t,x) = \rho(x)B(t,x)f'(x) + \frac{1}{2}\rho^2(x)r^2(t,x)f''(x)
\]
be the generator of $\theta$. If the functions $B$ and $r$ are sufficiently smooth in $x$, then the adjoint $\mathcal{L}^*$ of $\mathcal{L}$ is defined by
\[
(\mathcal{L}^* f)(t,x) = -\frac{\partial}{\partial x}\left(\rho(x)B(t,x)f(x)\right) + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left(\rho^2(x)r^2(t,x)f(x)\right).
\]

Theorem 2.5.1. Let the following conditions be fulfilled:

1. The functions $B$ and $r$ are infinitely differentiable in $x$ on $[a_\theta, b_\theta]$, and each derivative with respect to $x$ is uniformly bounded as a function of $t$ and $x$.

2. There exists an $\epsilon > 0$ such that $r^2(t,x) \ge \epsilon$ for all $t \in [0,T]$ and $x \in [a_\theta, b_\theta]$.

3. The Wiener process $V$ is independent of $W$.

4. The initial condition $\theta_0$ is independent of $V$ and $W$ and has a density $\Pi_0 \in C_0^\infty((a_\theta, b_\theta))$.

Then the filtering density $\Pi = \Pi(t,x)$ for (2.16) exists and has the following properties:

(1) For every $t \in [0,T]$ and $P$-a.s. $\omega \in \Omega$, the support of $\Pi$ is $[a_\theta, b_\theta]$, and the function $\Pi$ is infinitely differentiable with respect to $x$, with all derivatives vanishing at the points $a_\theta$ and $b_\theta$.

(2) The function $\Pi$ is a path-wise solution of the nonlinear equation
\[
d\Pi(t,x) = (\mathcal{L}^*\Pi)(t,x)\,dt + \sum_{k=1}^N \left[ (\kappa_k + x\nu_k)\Pi(t,x) - \bar{H}_k(t)\Pi(t,x) \right] u_k(t)\left( du_k(t) - \bar{H}_k(t)\,dt \right) \tag{2.17}
\]
with initial condition $\Pi_0$, where
\[
\bar{H}_k(t) = \int_{\mathbb{R}} (\kappa_k + x\nu_k)\Pi(t,x)\,dx.
\]

Instead of equation (2.16), we may consider a linear filtering problem:
\[
d\theta(t) = a(t)\theta(t)\,dt + b(t)\,dV(t),
\]
\[
du_k(t) = (\kappa_k + \theta(t)\nu_k)u_k(t)\,dt + dw_k(t), \quad k = 1, \dots, N, \tag{2.18}
\]
where $a = a(t)$ and $b = b(t)$ are measurable and bounded functions on $[0,T]$. If $\theta_0$ is a Gaussian random variable, then the solution $\theta = \theta(t)$ of the equation
\[
d\theta(t) = a(t)\theta(t)\,dt + b(t)\,dV(t), \quad 0 < t \le T, \qquad \theta(0) = \theta_0,
\]
is a Gaussian process. Obviously, such a process does not satisfy condition (2.13) and therefore cannot appear as a coefficient in (2.11). But we can still consider the multi-channel linear filtering problem: the finite system (2.18) has a unique strong solution, and the corresponding optimal filter has a much simpler structure. We therefore consider (2.18) under the following assumptions:

(L1) The functions $a = a(t)$ and $b = b(t)$ are measurable and bounded on $[0,T]$;

(L2) The Wiener processes $V$ and $w_k$, $k = 1, \dots, N$, are independent;

(L3) The initial conditions $(\theta_0, u_{0,1}, \dots, u_{0,N})$ are independent of the Wiener processes $V$ and $w_k$, $k = 1, \dots, N$;

(L4) $E\left(\theta_0^4 + \sum_{k=1}^N u_{0,k}^4\right) < \infty$;

(L5) The conditional distribution of $\theta_0$ given $u_{0,1}, \dots, u_{0,N}$ is $P$-a.s. Gaussian.
It turns out that under these assumptions the conditional distribution of $\theta$ given the observations $u_k$ is Gaussian, and the best mean-square estimate of $\theta(t)$ given the $u_k(s)$ can be computed from a generalized Kalman-Bucy filter.

Theorem 2.5.2. Under assumptions (L1)-(L5), the conditional distribution of $\theta(t)$ given $\mathcal{F}^u_{N,t} = \sigma(u_k(s),\ k = 1, \dots, N,\ 0 \le s \le t)$ is $P$-a.s. Gaussian with parameters
\[
\hat{\theta}_N(t) = E(\theta(t) \mid \mathcal{F}^u_{N,t}), \quad \gamma_N(t) = E\left( (\theta(t) - \hat{\theta}_N(t))^2 \mid \mathcal{F}^u_{N,t} \right).
\]
The functions $\hat{\theta}_N(t)$ and $\gamma_N(t)$ satisfy the following system of equations:
\[
d\hat{\theta}_N(t) = a(t)\hat{\theta}_N(t)\,dt + \gamma_N(t)\sum_{k=1}^N \nu_k u_k(t)\left( du_k(t) - \left(\kappa_k u_k(t) + \nu_k u_k(t)\hat{\theta}_N(t)\right)dt \right),
\]
\[
\dot{\gamma}_N(t) = 2a(t)\gamma_N(t) + b^2(t) - \gamma_N^2(t)\sum_{k=1}^N \nu_k^2 u_k^2(t), \tag{2.19}
\]
with initial conditions
\[
\hat{\theta}_N(0) = E(\theta_0 \mid u_{0,1}, \dots, u_{0,N}), \quad \gamma_N(0) = E\left( (\theta_0 - \hat{\theta}_N(0))^2 \mid u_{0,1}, \dots, u_{0,N} \right).
\]

One central question in the study of parameter estimation is the asymptotic behavior of the filter variance $\gamma_N(t)$ as $N \to \infty$. For the filtering problem (2.18), with certain assumptions about $\theta_0$ and $u_0$, if
\[
q := \frac{2(m_1 - m)}{d} \ge -1,
\]
then
\[
\lim_{N \to \infty} \gamma_N(t_0) = 0, \quad \forall t_0 \in (0,T], \quad \text{w.p. } 1, \tag{2.20}
\]
where $\gamma_N(t)$ follows equation (2.19) with $\gamma_N(0) > 0$. If, in addition, $\inf_{0 \le t \le T} |b(t)| > 0$, then for every $t_0 \in (0,T]$ there exists a finite positive limit
\[
\lim_{N \to \infty} \psi_N\,\gamma_N(t_0), \quad \text{w.p. } 1,
\]
where
\[
\psi_N =
\begin{cases}
\sqrt{\ln N}, & q = -1, \\
N^{(q+1)/2}, & q > -1.
\end{cases}
\]
In summary, under the condition $q \ge -1$, the variance of the linear filter tends to zero as more and more of the spatial Fourier coefficients of the solution of (2.11) become accessible in the observation processes.

Example 2.5.3. With $\kappa_k = 0$ and $\nu_k = -k^2$, the system of filtering equations corresponding to equation (2.18) is
\[
d\hat{\theta}_N(t) = a(t)\hat{\theta}_N(t)\,dt - \gamma_N(t)\sum_{k=1}^N k^2 u_k(t)\left( du_k(t) + k^2 u_k(t)\hat{\theta}_N(t)\,dt \right),
\]
\[
\dot{\gamma}_N(t) = 2a(t)\gamma_N(t) + b^2(t) - \gamma_N^2(t)\sum_{k=1}^N k^4 u_k^2(t).
\]
Then, with probability one, $\lim_{N \to \infty} \gamma_N(t_0) = 0$ for every $0 < t_0 \le T$. If, in addition, $\inf_{0 \le t \le T} |b(t)| > 0$, then for every $0 < t_0 \le T$ there exists, with probability one, a finite positive limit $\lim_{N \to \infty} N^{3/2}\gamma_N(t_0)$.
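A minimal numerical sketch of the filter of Example 2.5.3 follows (an illustration only; the constants $a$, $b$, $\theta(0)$, $\gamma_N(0)$ and the grid are assumptions, with $\gamma_N(0)$ taken small so the explicit Euler scheme stays stable):

```python
import numpy as np

# Sketch of the multi-channel filter of Example 2.5.3 (kappa_k = 0,
# nu_k = -k^2), discretized by an explicit Euler scheme. All constants
# are illustrative assumptions; gamma_N(T) should shrink as N grows.
rng = np.random.default_rng(3)
T, n = 0.5, 20_000
dt = T / n
a, b = -0.1, 0.05

for N in [5, 15, 30]:
    k = np.arange(1, N + 1)
    theta, u = 0.4, np.zeros(N)
    m, gamma = 0.4, 0.01
    for _ in range(n):
        dw = rng.normal(0.0, np.sqrt(dt), size=N)
        du = -k**2 * theta * u * dt + dw               # channels (2.18)
        m += a * m * dt - gamma * np.sum(k**2 * u * (du + k**2 * u * m * dt))
        gamma += (2 * a * gamma + b**2 - gamma**2 * np.sum(k**4 * u**2)) * dt
        gamma = max(gamma, 0.0)                        # positivity guard
        theta += a * theta * dt + b * rng.normal(0.0, np.sqrt(dt))
        u += du
    print(f"N = {N:2d}:  gamma_N(T) = {gamma:.6f},  "
          f"N^(3/2)*gamma_N(T) = {N**1.5 * gamma:.4f}")
```

The last column is a crude numerical counterpart of the finite positive limit $\lim_{N\to\infty} N^{3/2}\gamma_N(t_0)$ stated above.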
Assume thatA i , B i are positive-definite, self-adjoint elliptic differential or pseudo-differential operators, either on a smooth bounded domain inR d with suitable bound- ary conditions, or on a smooth compactd−dimensional manifold. Then (1) the maximum likelihood estimator of θ 1 is consistent and asymptotically normal in the limitN →∞ if and only if order(A 1 )≥ order(A 0 +θ 1 A 1 )+order(B 0 +θ 2 B 1 )−d 2 ; (3.2) (2) the maximum likelihood estimator of θ 2 is consistent and asymptotically normal in the limitN →∞ if and only if order(B 1 )≥ order(B 0 +θ 2 B 1 )−d 2 . (3.3) Similar to the parabolic case, this result extends to a multichannel model with multiple param- eters, which , in the equivalent operator formulation can be written as ¨ u+ n X i=0 θ 1i A i u = m X j=0 θ 2j B j ˙ u+ ˙ W. 49 For example, the coefficientθ 1p can be consistently estimated if and only if order(A p )≥ order( P n i=0 θ 1i A i )+order( P m j=0 θ 2j B j )−d 2 . Example 3.0.5. Consider the wave equation with zero initial and boundary conditions and the space-time Gaussian white noise ˙ W(t,x) as the driving force: u tt (t,x) =θu xx (t,x)+ ˙ W(t,x), 0<t≤T, 0<x<π, u(0,x) =u t (0,x) = 0, u(t,0) =u(t,π) = 0, , (3.4) whereθ > 0 is an unknown real number subject to estimation. The Fourier series expansion of the solution to equation (3.4) is written as u(t,x) = r 2 π X k≥1 u k (t)sin(kx). Substitution of this series into (3.4) suggests that eachu k should satisfy d˙ u k (t) =−k 2 θu k (t)dt+dw k (t), 0<t≤T with initial conditionu k (0) = ˙ u k (0) = 0. If the trajectories ofu k (t), ˙ u k (t) are observed for all 0 < t < T and allk = 1,··· ,N, then the maximum likelihood estimator ofθ based on these observations is ˆ θ N =− P N k=1 R T 0 k 2 u k (t)dv k (t) P N k=1 R T 0 k 4 u 2 k (t)dt , where v k (t) := du k (t) dt . 50 It is proved that lim N→∞ ˆ θ N =θ, with Probability 1. and lim N→∞ N 3/2 ( ˆ θ N −θ) d =N 0, 12θ T 2 . 51 Chapter 4 FIRST ORDER MULTI-CHANNEL MODEL As demonstrated in previous two chapters, the original motivation is to estimate the unknown parameters in a Stochastic Partial Differential Equation. By searching for the solution of the SPDE as a Fourier series, we attempt to investigate an “equivalent” (if possible) system of Stochastic Ordinary Differential Equations driven by the same unknown parameters. However, if the deterministic unknown parameter is replaced by a random coefficient (or even further a stochastic process), the solution of the original SPDE as a Fourier series may not exist in general, this can be easily illustrated by the following example. Example 4.0.6. The one-dimension stochastic heat equation. Consider the following stochas- tic equation du(t,x) =θu xx (t,x)dt+dW(t,x), 0<t≤T, x∈ (0,π), (4.1) with zero initial and boundary conditions, whereθ∼N(0,σ 2 ) is an unknown random variable anddW(t,x) is the time-space Gaussian white noise term. Taking{h k (x) = q 2 π sin(kx),k∈ N},dW can be interpreted as dW(t,x) = X k≥1 h k (x)dw k (t), 52 where w k (t)’s are independent standard Brownian motions, and also independent of θ. We look for the solution of (4.1) as a Fourier series u(t,x) = X k≥1 u k (t)h k (x), (4.2) with eachu k (t), k≥ 1, satisfying du k (t) =−k 2 θu k (t)dt+dw k (t), 0<t≤T, u k (0) = 0. 
(4.3) Direct computation gives u k (t) = exp[−k 2 θt] Z t 0 exp[k 2 θs]dw k (s) = Z t 0 exp[−k 2 θ(t−s)]dw k (s), which implies E[u 2 k (t)|σ(θ)] = Z t 0 exp[−2k 2 θ(t−s)]ds E[u 2 k (t)] = Z t 0 E[exp[−2k 2 (t−s)θ]]ds = Z t 0 exp σ 2 2 (−2k 2 (t−s)) 2 ds = Z t 0 exp(2σ 2 k 4 s 2 )ds. For any fixedt> 0, there exists integerK =K(t)> 0, so that 2σ 2 k t 3 2 ≥ 1, ∀k≥K. 53 Thus, we know E[u 2 k (t)]≥ Z t t/3 exp(k 3 )ds = exp(k 3 )· 2t 3 , ∀k≥K =K(t/3). Notice that{h k ,k ≥ 1} is an orthonormal basis inL 2 ((0,π)). The formal sum in (4.2) does not converge in the sense of square mean since, for anyt> 0, Ekuk 2 L 2 ((0,π)) (t) = X k≥1 E[u 2 k (t)]≥ X k≥K E[u 2 k (t)]≥ X k≥K exp(k 3 )· 2t 3 =∞. whereK is an integer depending on t. Even though the existence of the solution to original SPDE is not guaranteed in general (not to mention the validation of the formal Fourier expression), the parameter estimation in multi-channel model (e.g. (4.3)) is still of its own interest. 4.1 The Linear Filtering The objective of this section is to study the following filtering problem. For 0<t<T , dθ(t) =a(t)θ(t)dt+b(t)dV(t), du k (t) =−k 2 θ(t)u k (t)dt+dw k (t), k = 1,··· ,N (4.4) with certain initial conditionu k (0), k = 1,··· ,N,θ(0)∼N(µ 0 ,σ 2 0 ). Let us consider (4.4) under the following assumptions: (H1) The deterministic functions a = a(t) and b = b(t) are measurable and bounded on [0,T]; (H2) The Wiener processesV andw k , k = 1,··· ,N are independent; 54 (H3) The initial conditions(θ(0),u 1 (0),··· ,u N (0)) are independent of the Wiener processes V andw k , k = 1,··· ,N; (H4) E |θ(0)| 4 + P N k=1 |u k (0)| 4 <∞; (H5) The conditional distribution ofθ(0) given (u 1 (0),··· ,u N (0)) isP−a.s. Gaussian. Theorem 4.1.1. Under assumption (H1)-(H5), the conditional distribution of θ(t) given F u N,t =σ({u k (s), k = 1,··· ,N, 0≤s≤t}), isP−a.s. Gaussian with mean and variance m N (t) =E θ(t)|F u N,t , γ N (t) =E (θ(t)−m N (t)) 2 |F u N,t . For each N ≥ 1, the random processes m N (t) and γ N (t) satisfy the following system of equations: dm N (t) =a(t)m N (t)−γ N (t) P N k=1 k 2 u k (t)[du k (t)+k 2 u k (t)m N (t)dt], ˙ γ N (t) = 2a(t)γ N (t)+b 2 (t)−γ 2 N (t) P N k=1 k 4 u 2 k (t), (4.5) with initial conditions m N (0) = E[θ(0)|u 1 (0),··· ,u N (0)], γ N (0) = E (θ(0)−m N (0)) 2 |u 1 (0),··· ,u N (0) . Proof. The equation (4.5) comes from the generalized Kalman-Bucy filter for the best mean- square estimate ofθ(t) given{u k (s) : k = 1,··· ,N, 0≤s≤t}. Based on Theorem 8.1 in [14] . Theorem 12.6 and Theorem 12.7 in [15], it suffices to verify that, for eachk≥ 1, Z T 0 E|u k (t)θ(t)| 2 dt<∞. (4.6) 55 Noticing that the unique strong solutionθ(t) to (4.4) is a Gaussian process, independent of all w k , with explicit expression: θ(t) = exp Z t 0 a(r)dr · θ(0)+ Z t 0 exp − Z s 0 a(r)dr b(s)dV(s) ; and the observation equation in (4.4) has a unique strong solution u k (t) = exp −k 2 Z t 0 θ(r)dr · u k (0)+ Z t 0 exp k 2 Z s 0 θ(r)dr dw k (s) . ThusE|θ(t)| 4 andE|u k (t)| 4 both exist (see details in the next section) and are continuous int. Condition (4.6) holds by Cauchy-Schwarz. 4.2 Moments and Related Results In sake of studying the asymptotic efficiency of the linear filtering constructed in Theorem 4.1.1, we need to gather more properties of θ(t) as well as u k (t), k = 1,··· ,N. Solving (4.4), θ(t) = exp Z t 0 a(r)dr · θ(0)+ Z t 0 exp − Z s 0 a(r)dr b(s)dV(s) . 
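Before continuing with the moment computations, here is a hedged numerical illustration of Theorem 4.1.1: an Euler discretization of the signal–observation system (4.4) together with the filter (4.5). The constant coefficients a and b, the number of channels, the initial laws, and the step size are all illustrative assumptions, and the explicit scheme is only a sketch (the k⁴u_k² term makes the Riccati equation stiff, so a small step is used).

```python
import numpy as np

# Hedged Euler discretization of the pair (4.4) and the conditionally
# Gaussian filter (4.5).  All numerical choices are illustrative.

rng = np.random.default_rng(1)
T, n_steps, N = 1.0, 4000, 8
dt = T / n_steps
ks = np.arange(1, N + 1).astype(float)
a = lambda t: -0.2
b = lambda t: 0.3

theta = rng.normal(1.0, 0.3)              # theta(0) ~ N(1, 0.3^2)
u = rng.normal(0.0, 0.1, size=N)          # independent Gaussian u_k(0)
m, gamma = 1.0, 0.09                      # m_N(0) = E[theta(0)], gamma_N(0)

for i in range(n_steps):
    t = i * dt
    dw = rng.normal(0.0, np.sqrt(dt), size=N)
    du = -ks**2 * theta * u * dt + dw              # observed increments
    innov = du + ks**2 * u * m * dt                # innovation terms in (4.5)
    m += a(t) * m * dt - gamma * np.sum(ks**2 * u * innov)
    gamma += (2 * a(t) * gamma + b(t)**2
              - gamma**2 * np.sum(ks**4 * u**2)) * dt
    gamma = max(gamma, 0.0)                        # guard the explicit step
    u += du
    theta += a(t) * theta * dt + b(t) * rng.normal(0.0, np.sqrt(dt))

print(f"theta(T) = {theta:.4f},  m_N(T) = {m:.4f},  gamma_N(T) = {gamma:.2e}")
```

Returning to the explicit solution above: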
Ifθ(0) is a Gaussian random variable,θ(t),t≥ 0, is a Gaussian process, and E(θ(t)) = exp Z t 0 a(r)dr E[θ(0)], E|θ(t)| 2 = exp 2 Z t 0 a(r)dr · E|θ(0)| 2 + Z t 0 exp −2 Z s 0 a(r)dr b 2 (s)ds . 56 As we denotedµ 0 =E(θ(0)),σ 2 0 = Var(θ(0)), we also denote µ t :=E(θ(t)), σ 2 t := Var(θ(t)). Then, we have µ t = µ 0 ·exp Z t 0 a(r)dr , σ 2 t = E|θ(t)| 2 −|E(θ(t))| 2 = exp 2 Z t 0 a(r)dr E|θ(0)| 2 −|E(θ(0))| 2 +exp 2 Z t 0 a(r)dr Z t 0 exp −2 Z s 0 a(r)dr b 2 (s)ds = exp 2 Z t 0 a(r)dr σ 2 0 + Z t 0 exp 2 Z t s a(r)dr b 2 (s)ds Since for any fixedt≥ 0,θ(t)∼N(µ t ,σ 2 t ), E[θ(t) 2 ] = µ 2 t +σ 2 t ; E[θ(t) 3 ] = µ 3 t +3µ t σ 2 t ; E[θ(t) 4 ] = µ 4 t +6µ 2 t σ 2 t +3σ 4 t , etc.. Solving (4.5), u k (t) = exp −k 2 Z t 0 θ(r)dr · u k (0)+ Z t 0 exp k 2 Z s 0 θ(r)dr dw k (s) . 57 We cannot directly compute the expectation or higher moments of u k (t) because the ran- domness of θ(t) is involved. But we can still consider the conditional expectation. Denote F θ t :=σ(θ(s) : 0≤s≤t). Noticingθ(t) andw k (t), k≥ 1 are independent, E[u k (t)|F θ t ] = exp −k 2 Z t 0 θ(r)dr E[u k (0)|F θ t ] = exp −k 2 Z t 0 θ(r)dr E[u k (0)|θ(0)] E[u 2 k (t)|F θ t ] = exp −2k 2 Z t 0 θ(r)dr E[u 2 k (0)|F θ t ] +E 2u k (0) Z t 0 exp k 2 Z s 0 θ(r)dr dw k (s) F θ t +E " Z t 0 exp k 2 Z s 0 θ(r)dr dw k (s) 2 F θ t #) ≤ 2exp −2k 2 Z t 0 θ(r)dr E[u 2 k (0)|θ(0)]+ Z t 0 exp 2k 2 Z s 0 θ(r)dr ds . Here we introduce a powerful inequality of Gaussian processes. This result was due, inde- pendently, and with very different proofs, to Borell [4] and Tsirelson, Ibragimor and Sudakov (TIS) [5]. Hence, according to Adler and Taylor (see [1]), here we refer to this inequality as Borell-TIS inequality, and state it as the following. Let(Ω,F,P) be a complete probability space andT a topological space, let a measurable mappingX : Ω→R T be a real valued Gaussian random field. Theorem 4.2.1. ( Borell-TIS Inequality ) LetX t be a centered Gaussian process, a.s. bounded onT . DenoteX ∗ T = sup t∈T X t . Then we have the following results, (i) E{X ∗ T }<∞; (ii) For everyx> 0, P{X ∗ T −E(X ∗ T )>x}≤ exp(−x 2 /2σ 2 T ), 58 whereσ 2 T , sup t∈T E{X 2 t }. We skip the proof here (A detailed proof can be found in [1]) but state two immediate and trivial consequences for future use. Corollary 4.2.2. Under the same assumptions of Theorem 4.2.1,∀x>E{X ∗ T }, P{X ∗ T >x}≤ exp{−(x−EX ∗ T ) 2 /2σ 2 T }. Corollary 4.2.3. Under the same assumptions of Theorem 4.2.1, for a large enoughx, P{X ∗ T ≥x}≤ exp{Cx−x 2 /2σ 2 T }, whereC is a constant depending only onEX ∗ T . Remark. For the unique strong solutionθ(t) to equation (4.4), if we assumeθ(0)∼N(0,σ 2 ), thenθ(t), 0≤t≤T , is a centered Gaussian process, a.s. bounded on[0,T]. Theorem 4.2.1 is applicable. Further more, ifθ(0)∼N(µ,σ 2 ) withµ 6= 0, consider ˜ θ(t) =θ(t)−µ ·e R t 0 a(r)dr , then d ˜ θ(t) =a(t) ˜ θ(t)dt+b(t)dV(t), with ˜ θ(0)∼N(0,σ 2 ). Similar results can be applied to ˜ θ(t). 4.3 Asymptotic Efficiency One central question in the study of the filtering equation (4.5) is the asymptotic behavior of the filter varianceγ N (t) as more and more observations become available. Theorem 4.3.1. Consider the filtering problem (4.4) under the assumptions (H1)-(H5) and suppose that initial observations u k (0), k = 1,··· ,N, are independent Gaussian random 59 variables, andθ(0)∼N(0,σ 2 ). Defineγ N (t) according to (4.5) with someγ N (0)≥ 0. Then for anyǫ> 0, P lim N→∞ sup [ǫ,T] γ N (t) ! = 0 ! = 1. (4.7) Proof. 
First, according to Theorem 1.5.6, we can see it is sufficient to prove: for anyt> 0, lim N→∞ N X k=1 k 4 u 2 k (t) ! =∞, with prob. 1. (4.8) Supposing (4.8) holds true for all t > 0, noticing the series P N k=1 k 4 u 2 k (t) is non- decreasing inN, then with probability 1, lim N→∞ inf δ≤t≤T N X k=1 k 4 u 2 k (t) !! =∞, ∀δ> 0. (4.9) Applying Theorem 1.5.6 in Section 1.5 toγ N (t) described by equation (4.5) under the condi- tion (4.9), it immediately gives, with probability 1, lim N→∞ sup [δ+ǫ,T] γ N (t) ! = 0, for arbitraryǫ> 0. (4.10) Sinceδ> 0 andǫ> 0 are both arbitrary, we combine them to get (4.7). To prove (4.8), for any fixedt> 0 and fixed integerM > 0, we define A N :={ω : N X k=1 k 4 u 2 k (t)≤M}. Then, we have P(A N ) = Z C[0,t] P(A N |θ [0,t] =x [0,t] )dµ θ (x), (4.11) 60 whereµ θ is the measure on space(C[0,t],B(C[0,t])), generated by processθ(s),0≤s≤t; i.e. for anyA∈B(C(0,t)), µ θ (A) =P(θ [0,t] ∈A). Hereµ θ =µ t θ withµ t θ =P(θ [0,t] ∈A). To simplify notation, we omit superscripts where there is no confusion. Then, we can estimate the conditional probability, P A N |θ [0,t] =x [0,t] ≤ P u 2 1 (t)≤M, 2 4 u 2 2 (t)≤M,··· ,N 4 u 2 N (t)≤M|θ [0,t] =x [0,t] = N Y k=1 P k 4 u 2 k (t)≤M|θ [0,t] =x [0,t] . (4.12) The last equality is becauseu k (t), k = 1,··· ,N, are independent givenθ [0,t] = x [0,t] , under the assumption thatu k (0), k = 1,··· ,N, are independent. This can be verified directly from equation (4.4) and its solution’s explicit expression (See section 3.2 for details). Furthermore, assumingu k (0), k = 1,··· ,N, are Gaussian random variables, then u k (t)| θ [0,t] =x [0,t] is Gaussian with mean and variance as µ k (x) := exp −k 2 R t 0 x(r)dr E(u k (0)|θ(0) =x(0)), σ 2 k (x) := exp −2k 2 R t 0 x(r)dr × n Var(u k (0)|θ(0) =x(0))+ R t 0 exp 2k 2 R s 0 x(r)dr ds o . (4.13) To proceed, we recall two facts about normal random variable and the processθ(t). 61 Fact 4. IfZ ∼N(µ,σ 2 ), then P(|Z|≤ǫ) = 1 √ 2πσ Z ǫ −ǫ exp − (x−µ ) 2 2σ 2 dx≤ 2ǫ √ 2πσ = r 2 π · ǫ σ , which is P(|Z|≤ǫ)≤ min 1, r 2 π · ǫ σ ! , ∀ǫ≥ 0. Therefore, P(k 4 u 2 k (t)≤M|θ [0,t] =x [0,t] ) = P |u k (t)|≤ √ M k 2 θ [0,t] =x [0,t] ! ≤ r 2M π · 1 k 2 σ k (x) , (4.14) whereσ k (x) is defined as in (4.13). Fact 5. As we assumedθ(0)∼N(0,σ 2 ), the solutionθ(s) to the equation (4.4) is a centered Gaussian process, a.s. bounded on [0,T]. Notice that we already fixt> 0, and denoteθ ∗ t := sup 0≤s≤t θ(s). Then, according to Theorem 4.2.1 (Borell-TIS inequality)and its corollaries, we knowE[θ ∗ t ]<∞, and for allδ>E[θ ∗ t ], P{θ ∗ t >δ}≤ exp(−(δ−α t ) 2 /2σ 2 t ), whereα t :=E[θ ∗ t ], σ 2 t := sup 0≤s≤t E|θ(s)| 2 = sup 0≤s≤t e 2 R s 0 a(r)dr σ 2 + Z s 0 e −2 R τ 0 a(r)dr b 2 (τ)dτ . We choose certain δ > α t ∨ 1. Given the realization of θ [0,t] is x [0,t] ∈ C[0,t], let us 62 consider the following three cases: Case I: Denotec(x) := sup 0≤s≤t x(s)≤ 0. Then for all integersk≥ 1, by definition (4.13) σ 2 k (x) = exp −2k 2 Z t 0 x(r)dr Var(u k (0)|θ(0) =x(0)) + Z t 0 exp −2k 2 Z t s x(r)dr ds ≥ a k exp(−2k 2 ·c(x)·t)+ Z t 0 exp(−2k 2 ·c(x)·s)ds ≥ a k +t ≥ t, wherea k := Var(u k (0)|θ(0) =x(0)). Remark. c(x) should actually be c(x,t) = sup 0≤s≤t x(s). Since t > 0 is already fixed, to simplify notation, we writec(x) where there is no confusion. Case II: 0< sup 0≤s≤t x(s)≤δ, i.e. 0<c(x)≤δ. For all integersk≥ 1, σ 2 k (x) ≥ Z t 0 exp −2k 2 Z t s x(r)dr ds ≥ Z t 0 exp(−2k 2 c(x)·s)ds ≥ Z t 0 exp(−2k 2 δs)ds = 1−exp(−2k 2 δt) 2k 2 δ . 
Let us define K δ := " r ln2 2tδ # +1, (4.15) 63 then one can easily verify that σ 2 k (x)≥ 1 4k 2 δ , as long ask≥K δ . Case III: sup 0≤s≤t x(s)>δ> 0, i.e.c(x)>δ> 0, σ 2 k (x)≥ Z t 0 exp(−2k 2 c(x)·s)ds = 1−exp(−2k 2 ·c(x)·t) 2k 2 c(x) . Define K(c(x)) := "s ln2 2t·c(x) # +1, then, σ 2 k ≥ 1 4k 2 c(x) , ∀k≥K(c(x)). Notice thatK(c(x)) is decreasing inc(x) andc(x)>δ. As in Case II, we will useK δ (defined in (4.15)), then for anyk≥K δ , we havek≥K(c(x)) and σ 2 k (x)≥ 1 4k 2 c(x) , as long ask≥K δ . In the computation of P ∞ N=1 P(A N ) coming later, we need another quantity E[(θ ∗ t ) p · 1 {θ ∗ t >δ} ], forp> 0. It is presented as follows. 64 Ifp> 0, δ>α t ∨1, where α t =E(θ ∗ t ), E[(θ ∗ t ) p ·1 {θ ∗ t >δ} ], = E[ θ ∗ t ·1 {θ ∗ t >δ} p ] = Z ∞ 0 py p−1 ·P θ ∗ t ·1 {θ ∗ t >δ} >y dy = Z δ 0 py p−1 ·P(θ ∗ t >y∨δ)dy + Z ∞ δ py p−1 ·P(θ ∗ t >y∨δ)dy = Z δ 0 py p−1 ·P(θ ∗ t >δ)dy + Z ∞ δ py p−1 ·P(θ ∗ t >y)dy Noticing fory>δ>α t ,P(θ ∗ t >y)≤ exp(−(y−α t ) 2 /2σ 2 t ). E[(θ ∗ t ) p ·1 {θ ∗ t >δ} ] ≤ P(θ ∗ t >δ)δ p + Z ∞ δ py p−1 exp −(y−α t ) 2 /2σ 2 t dy = P(θ ∗ t >δ)δ p + √ 2πσ t ·p·E[Y p−1 ;{Y ≥δ}], whereY ∼N(α t ,σ 2 t ). Thus, E (θ ∗ t ) p ·1 {θ ∗ t >δ} ≤δ p + √ 2πσ t p 2 E|Y| p−1 (4.16) and we know E|Y| p−1 ≤ C 1 (p−1) p−1 (α t +σ t ) p−1 , p≥ 1 C 2 , 0<p< 1 65 whereC 1 does not depend onα t , σ t butC 2 does. Therefore, forp> 0, δ>α t ∨1, E[(θ ∗ t ) p |·1 {θ ∗ t >δ} ]≤δ p +Cp[(p−1) p−1 (α t +σ t ) p−1 ·1 {p≥1} +1 {0<p<1} ] (4.17) whereC is some constant depending onα t , σ t . Combining (4.11), (4.12), (4.13), (4.14), for anym ≥ K δ , withK δ defined earlier as in (4.15), ∞ X N=m P(A N ) = ∞ X N=m Z C[0,t] P(A N |θ [0,t] =x [0,t] )dµ θ (x) ≤ ∞ X N=m Z C[0,t] N Y k=1 P(k 4 u 2 k (t)≤M|θ [0,t] =x [0,t] )dµ θ (x) ≤ ∞ X N=m Z C[0,t] N Y k=m P |u k (t)|≤ √ M k 2 θ [0,t] =x [0,t] ! dµ θ (x) ≤ ∞ X N=m Z C[0,t] N Y k=m r 2M π · 1 k 2 σ k (x) dµ θ (x) = ∞ X N=m 2M π N−m+1 2 {I 1 +I 2 +I 3 }, (4.18) where I 1 = Z { sup 0≤s≤t x(s)≤0} N Y k=m 1 k 2 σ k (x) dµ θ (x), I 2 = Z {0< sup 0≤s≤t x(s)≤δ} N Y k=m 1 k 2 σ k (x) dµ θ (x), I 3 = Z { sup 0≤s≤t x(s)>δ} N Y k=m 1 k 2 σ k (x) dµ θ (x). 66 Recalling the analysis forσ k (x) in those three cases and keeping in mind thatm≥K δ , I 1 ≤ Z { sup 0≤s≤t x(s)≤0} N Y k=m 1 k 2 √ t dµ θ (x) = 1 √ t N−m+1 · (m−1)! N! 2 ·P sup 0≤s≤t θ(s)≤ 0 ≤ 1 √ t N−m+1 · (m−1)! N! 2 I 2 ≤ Z {0< sup 0≤s≤t x(s)≤δ} N Y k=m 2k √ δ k 2 dµ θ (x) = 2 √ δ N−m+1 (m−1)! N! P(0< sup 0≤s≤t θ(s)≤δ) = 2 √ δ N−m+1 (m−1)! N! I 3 ≤ Z { sup 0≤s≤t x(s)>δ} N Y k=m 2k p c(x) k 2 dµ θ (x) = 2 N−m+1 ·(m−1)! N! Z { sup 0≤s≤t x(s)>δ} sup 0≤s≤t x(s) (N−m+1)/2 dµ θ (x) = 2 N−m+1 ·(m−1)! N! E (θ ∗ t ) (N−m+1)/2 ·1 {θ ∗ t >δ} ≤ 2 N−m+1 ·(m−1)! N! n δ N−m+1 2 +C(N−m+1) " N−m−1 2 N−m−1 2 ·(α t +σ t ) N−m−1 2 ·1 {N−m−1≥0} + 1 {0< N−m+1 2 <1} io 67 From (4.18), ∞ X N=m P(A N )≤ ∞ X N=m 2M π N−m+1 2 (I 1 +I 2 +I 3 ). By Stirling’s formula, one can easily verify that, for any constantc ∞ X n=1 c n (n!) 2 <∞, ∞ X n=1 c n n! <∞, ∞ X n=1 c n n 2 n 2 n! <∞. Therefore, P ∞ N=m P(A N )<∞. Hence, P ∞ N=1 P(A N )<∞. By Borel-Cantelli Lemma,P(A N , i.o.) = 0, i.e., P N X k=1 k 4 u 2 k (t)≤M i.o. ! = 0. we have P N X k=1 k 4 u 2 k (t)>M for sufficiently large N ! = 1, and notice thatM > 0 is an arbitrary integer, then with the arbitrarily chosent> 0, P lim N→∞ N X k=1 k 4 u 2 k (t) =∞ ! = 1. This is the condition (4.8), thus the proof is completed. Remark. Here we assumed θ(0) ∼ N(0,σ 2 ) to ensure θ(s), 0 ≤ s ≤ t, being centered Gaussian process. 
In more general situation, ifθ(0)∼N(µ,σ 2 ) withµ 6= 0, one can consider ˜ θ(t) =θ(t)−µe R t 0 a(r)dr , then d ˜ θ(t) =a(t) ˜ θ(t)dt+b(t)dV(t), 68 with ˜ θ(0)∼N(0,σ 2 ). The Borell-TIS inequality can be applied to ˜ θ(t). The same asymptotic efficiency result (Theorem 4.3.1) still can be established. Remark. Using the same scheme, one can actually show that, under the assumption of Theo- rem 4.3.1 , for anyt> 0, anyα> 2, P lim N→∞ N X k=1 k α u 2 k (t) =∞ ! = 1. 69 Chapter 5 SECOND ORDER MULTI-CHANNEL MODEL 5.1 One-dimension Stochastic Wave Equation Similar to the previous chapter, the second order multi-channel model is derived from the intention to investigate the parameter estimation problem for hyperbolic equation withθ being a (random) process. LetF = (Ω,F,{F t } t≥0 ,P) be a stochastic basis satisfying the usual hypothesis. Con- sider the following one-dimension stochastic wave equation u tt (t,x) =θ(t)u xx (t,x)+ ˙ W(t,x), 0<t≤T, 0<x<π, u(0,x) =u t (0,x) = 0, u(t,0) =u(t,π) = 0, , (5.1) where ˙ W(t,x) is the time-space Gaussian white noise, θ = θ(t) is the unknown (adapted) random process subject to estimation. It is natural to look for solution of (5.1) as a Fourier series u(t,x) = X k≥1 u k (t)h k (x) (5.2) 70 whereh k (x) = q 2 π sin(kx),k≥ 1. Taking into account dW(t,x) = X k≥1 h k (x)dw k (t), where thew k ’s are independent standard Brownian motions, equations (5.1) and (5.2) suggest that eachu k should satisfy du ′ k (t) =−k 2 θ(t)u k (t)dt+dw k (t), 0<t≤T u k (0) =u ′ k (0) = 0. (5.3) Sinceθ =θ(t) is a random process, we are dealing with a filtering problem with observations of infinite dimension. Instead, we consider a set of finite observations{u k (t),u ′ k (t)} N k=1 , hop- ing that the filter we construct later can converge to the trueθ (in certain sense) as more and more observations are available. 5.2 Main Problem I. If the series (5.2) converges to the solution of (5.1), the equivalence of (5.1) and (5.3) directly follows. So we need to check whether Ekuk 2 L 2 ((0,π)) (t) = ∞ X k=1 E[u 2 k (t)] ? <∞. (5.4) II. The method of filtering depends closely on the structure ofθ (e.g. linear, nonlinear). So we need to adapt appropriate assumptions for processθ. 71 III. For every t ∈ (0,T], we want to construct the estimate of θ(t) given observations {u k , u ′ k (s) : 0≤s≤t} up to timet. IV . With the estimate constructed later, we are interested in the usual range of asymptotic problems in parameter estimation: consistency, asymptotic normality and asymptotic efficiency. For problem I, the convergence of P ∞ k=1 E[u 2 k (t)] is in general not guaranteed. For instance, one can consider the trivial case whereθ(t) = θ ∼ N(0,1). Apart from the assumptions of processθ(t),E[u 2 k (t)] relies strongly onθ(t). Therefore it seems too wishful to seek a direct connection between (5.1) and (5.3). However model (5.3) is of interest on its own. Let us start with the multi-channel filtering problem with certain assumptions forθ(t). du ′ k (t) =−k 2 θ(t)u k (t)dt+dw k (t), 0<t≤T, k = 1,··· ,N, u k (0) =u ′ k (0) = 0, k = 1,··· ,N. (5.5) There are two classical filtering models, linear Gaussian and nonlinear diffusion. If we assume θ follows a linear stochastic differential equation, then the series in (5.4) cannot converge becauseθ(t) may take negative values “quite often”. 
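To quantify "quite often": if θ solves a linear equation with, say, constant coefficients (an assumed special case, for illustration), then θ(t) is Gaussian with explicitly computable mean and variance, and P(θ(t) < 0) > 0 for every t > 0. A minimal sketch:

```python
from math import erf, exp, sqrt

# Illustration of "theta(t) may take negative values": for the linear model
# d theta = a * theta dt + b dV with constant a, b and theta(0) ~ N(mu0, s0^2)
# (illustrative choices below), the marginal law of theta(t) is Gaussian,
# so P(theta(t) < 0) is strictly positive for every t > 0.

a, b, mu0, s0 = -0.5, 0.3, 1.0, 0.2

def prob_negative(t):
    mu_t = mu0 * exp(a * t)
    var_t = s0**2 * exp(2 * a * t) + b**2 * (exp(2 * a * t) - 1) / (2 * a)
    return 0.5 * (1 + erf(-mu_t / sqrt(2 * var_t)))   # Phi(-mu_t / sigma_t)

for t in (0.1, 0.5, 1.0, 2.0):
    print(f"t = {t:3.1f}:  P(theta(t) < 0) ~ {prob_negative(t):.4f}")
```

On the event where θ is negative, the conditional second moments of the channels grow exponentially in k (compare the bound J_0^t in Section 5.6 below), which is what rules out the convergence in (5.4).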
If we want $\theta$ to be bounded away from $0$, then $\theta$ must follow a nonlinear stochastic differential equation; nonlinear diffusion filtering then leads to a nonlinear stochastic partial differential equation for the conditional density of $\theta(t)$ given the observations up to time $t$. The next section provides a brief result on nonlinear filtering, but we will not discuss it in depth in this thesis.

5.3 Nonlinear Filtering of Diagonalizable Equations

Consider the problem of estimating the random process $\theta$ on the basis of the observations of the first $N$ Fourier coefficients (5.3) of the solution to the diagonalizable equation (5.1). Diagonalizability implies that the process $\theta$ is uniformly bounded: there exist real numbers $a_\theta, b_\theta$ such that, for all $\omega\in\Omega$,

\inf_{0\le t\le T}\theta(t)\ge a_\theta, \qquad \sup_{0\le t\le T}\theta(t)\le b_\theta.   (5.6)

In particular, for our stochastic wave equation, $a_\theta>0$.

A possible model for $\theta$ is the Itô diffusion equation

d\theta(t)=B(t,\theta(t))\,dt+r(t,\theta(t))\,dV(t),   (5.7)

where $B$ and $r$ are sufficiently regular functions and the Wiener process $V$ is independent of $W$ (or of $w_k$, $k\ge1$). To ensure condition (5.6), we need to modify equation (5.7), in addition to making appropriate choices of $B$, $r$, and the initial condition.

Let $\rho=\rho(x)$ be a smooth, compactly supported function on $\mathbb{R}$ such that

1. there exist finite non-zero limits $\lim_{x\to a_\theta}\frac{\rho(x)-\rho(a_\theta)}{x-a_\theta}$ and $\lim_{x\to b_\theta}\frac{\rho(x)-\rho(b_\theta)}{x-b_\theta}$;

2. $\rho(x)>0$ for $x\in(a_\theta,b_\theta)$, and $\rho(a_\theta)=\rho(b_\theta)=0$;

3. $\rho(x)=1$ on $[a_\theta+\delta,\,b_\theta-\delta]$ for some sufficiently small $\delta>0$.

A modification of (5.7) is

d\theta(t)=\rho(\theta(t))B(t,\theta(t))\,dt+\rho(\theta(t))r(t,\theta(t))\,dV(t),   (5.8)

with some initial condition $\theta_0$ independent of $V$ and $W$ (or of $w_k$, $k\ge1$). This model has already been discussed in Section 3 of [17]. By Proposition 3.1 of [17], with $B=B(t,x)$ and $r=r(t,x)$ deterministic, bounded, and Lipschitz continuous in $x$ uniformly in $(t,x)$, equation (5.8) has a unique strong solution for every square-integrable initial condition. Furthermore, if the initial condition $\theta_0$ is a random variable whose distribution is supported in $[a_\theta,b_\theta]$, then the solution of (5.8) satisfies (5.6).

The filtering problem can now be stated as the following multi-channel model:

d\theta(t)=\rho(\theta(t))B(t,\theta(t))\,dt+\rho(\theta(t))r(t,\theta(t))\,dV(t),
dv_k(t)=-k^2\theta(t)\Big(\int_0^t v_k(s)\,ds\Big)\,dt+dw_k(t), \quad k=1,\dots,N,
v_k(0)=0, \quad k=1,\dots,N,   (5.9)

where $v_k(t)=u_k'(t)$.

Denote by $\Pi(t,x)$ the conditional density of $\theta(t)$ given the observations up to time $t$, that is,

\Pi(t,x)=\frac{\partial}{\partial x}\,P\big(\theta(t)\le x\,\big|\,v_k(s),\ k=1,\dots,N;\ 0<s\le t\big);

equivalently, $\Pi$ is a random field such that, for every bounded measurable function $F=F(x)$,

E\big(F(\theta(t))\,\big|\,v_k(s),\ k=1,\dots,N,\ 0<s\le t\big)=\int_{\mathbb{R}}F(x)\Pi(t,x)\,dx.

Let

(\mathcal{L}f)(t,x)=\rho(x)B(t,x)f'(x)+\tfrac12\rho^2(x)r^2(t,x)f''(x)

be the generator of $\theta$. Assuming $B$ and $r$ are sufficiently smooth in $x$, the adjoint $\mathcal{L}^*$ of $\mathcal{L}$ is defined by

(\mathcal{L}^*f)(t,x)=-\frac{\partial}{\partial x}\big(\rho(x)B(t,x)f(x)\big)+\frac12\frac{\partial^2}{\partial x^2}\big(\rho^2(x)r^2(t,x)f(x)\big).

The general theory of diffusion filtering states that, under certain regularity assumptions (the precise statement of these technical assumptions has little relevance to the current discussion and will be omitted; the interested reader can find them in standard texts such as [20]), which are satisfied by our choices of $B$, $r$, and the initial condition $\theta_0$, the conditional density $\Pi(t,x)$ satisfies the stochastic PDE

d\Pi(t,x)=(\mathcal{L}^*\Pi)(t,x)\,dt-\Pi(t,x)\sum_{k=1}^{N}k^2u_k(t)\big(x-H_k(t)\big)\big(du_k'(t)+k^2u_k(t)H_k(t)\,dt\big)

with initial condition $\Pi_0\in C_0^\infty((a_\theta,b_\theta))$, where $H_k(t)=\int_{\mathbb{R}}x\,\Pi(t,x)\,dx$.
And the above equation has pathwise solution Π(t,x) with the following property: for everyt∈ [0,T] andP-a.s. w ∈ Ω, the support of Π is [a θ ,b θ ] and the function Π is infinitely differentiable w.r.tx with all derivatives vanishing ata θ andb θ . 5.4 Multi-Channel Model and Linear Filtering Consider the following linear multi-channel filtering problem: dθ(t) =a(t)θ(t)dt+b(t)dV(t), 0<t≤T, du ′ k (t) =−k 2 θ(t)u k (t)dt+dw k (t), 0<t≤T, k = 1,··· ,N, u k (0) =u ′ k (0) = 0, k = 1,··· ,N. (5.10) For simplicity, let us assume Wiener processesV andw k , k = 1··· ,N are independent. Definev k =u ′ k (t), then u k (t) =u k (0)+ Z t 0 v k (s)ds = Z t 0 v k (s)ds, 75 model (5.10) can be written as dθ(t) =a(t)θ(t)dt+b(t)dV(t), 0<t≤T, dv k (t) =−k 2 θ(t) R t 0 v k (s)ds dt+dw k (t), 0<t≤T, k = 1,··· ,N, v k (0) = 0, k = 1,··· ,N. (5.11) Or, if we denote ξ N (t) = (v 1 (t),··· ,v N (t)) T , W N (t) = (w 1 (t),··· ,w N (t)) T , we have the vector format. dθ(t) =a(t)θ(t)dt+b(t)dV(t), 0<t≤T, dξ N (t) =A(t,ξ N )θ(t)dt+dW N (t), 0<t≤T, ξ N (0) = ~ 0. (5.12) To this problem, we need some conditions to apply the generalized Kalman-Bucy filtering: (H1): R T 0 [|a(t)|+|b(t)|]dt<∞; (H2): R T 0 E R t 0 v k (s)ds·θ(t) dt<∞, for eachk = 1,··· ,N; (H3): P R T 0 h R t 0 v k (s)ds·E θ(t)|F ξ N t i 2 dt<∞ = 1, for k = 1,··· ,N, where F ξ N t :=σ(ξ N (s), 0≤s≤t). (H4): Wiener processesV andw k ,k = 1,··· ,N are independent; (H5): The conditional distribution ofθ 0 givenξ N (0) isP−a.s. Gaussian; 76 One can verify that, according to Theorem 12.6 of [15], if the conditions (H1)-(H5) hold, we have the following results. The random processes (θ,ξ N ) = (θ(t), (v 1 (t),··· ,v N (t))) satisfying the system of equations given by (5.12) is conditionally Gaussian (P−a.s.), i.e. for anyt andt j , with 0≤t 0 <t 1 <···<t n ≤t, the conditional distribution F ξ N [0,t] (x 0 ,··· ,x n ) =P n θ t 0 ≤x 0 ,··· ,θ tn ≤x n |F ξ N t o , isP−a.s. Gaussian. Furthermore, we may need the following conditions to proceed (H1*): E[θ 4 0 ]<∞; (H2*): R T 0 b 4 (s)ds<∞; (H3*): R T 0 E[u 2 k (t)θ 2 (t)]dt<∞,∀k = 1,··· ,N. It turns out that under the conditions (H1)-(H5) and (H1*)-(H3*), the best mean-square esti- mate ofθ(t) givenξ N (t), 0≤s≤t, can be computed from a generalized Kalman-Bucy filter. For simplicity, we denote that m N (t) :=E h θ(t)|F ξ N t i , γ N (t) :=E h (θ(t)−m N (t)) 2 |F ξ N t i . According to the general theory of filtering (Theorem 8.1 in [14], Theorem 12.6 and Theorem 77 12.7 in [15]), a straightforward verification show that, under conditions (H1)-(H5) and (H1*)- (H3*), the functionsm N t andγ N t satisfy the following system of equations: dm N (t) =a(t)m N (t)dt+γ N (t) n P N k=1 k 2 u k (t)dv k (t)− P N k=1 k 4 u 2 k (t)m N (t)dt o , ˙ γ N (t) =b 2 (t)+2a(t)γ N (t)−(γ N (t)) 2 P N k=1 k 4 u 2 k (t), (5.13) with initial conditions: m N (0) =E[θ(0)|F ξ N 0 ], γ N (0) =E[(θ(0)−m N (0)) 2 |F ξ N 0 ]. Remark. Conditions (H2), (H3) and (H3*) are difficult to verify directly. They depend on properties of the system of equations (5.10) or (5.11). Due to the nature of processesθ(t) and u k (t), k = 1,··· ,N, we can find some sufficient conditions which can be more readily verified to replace (H1)-(H5), (H1*)-(H3*). More precisely, ifE|u k (t)| 4 exists and is continuous int (for allk), then conditions (H2), (H3), (H3*) are fulfilled. Therefore we will study the higher moments ofu k (t), which will come in the subsequent sections. For now, we just summarize and state the filtering result. Theorem 5.4.1. 
Let us consider (5.10) under the following assumptions: (H1) The deterministic functions a = a(t) and b = b(t) are measurable and bounded on [0,T]; (H2) The Wiener processesV andw k , k = 1,··· ,N, are independent; (H3) The initial conditions(θ(0),u 1 (0),··· ,u N (0)) are independent of the Wiener processes V andw k , k = 1,··· ,N; (H4) E(|θ(0)| 4 )<∞; 78 (H5) The conditional distribution ofθ(0) given (u 1 (0),··· ,u N (0)) isP−a.s. Gaussian. Then, the conditional distribution ofθ(t) given F u N,t =σ({u k (s), k = 1,··· ,N, 0≤s≤t}), isP−a.s. Gaussian with mean and variance m N (t) =E θ(t)|F u N,t , γ N (t) =E (θ(t)−m N (t)) 2 |F u N,t . For eachN ≥ 1, the random processesm N (t) andγ N (t) satisfy the system of equations (5.13) with initial conditions m N (0) = E[θ(0)|u 1 (0),··· ,u N (0)], γ N (0) = E (θ(0)−m N (0)) 2 |u 1 (0),··· ,u N (0) . Proof. Note that with zero initial conditions in equation (5.10), F ξ N t =F v 1 ,···,v N t =F (u 1 ,v 1 ),···,(u N ,v N ) t =F u 1 ,···,u N t . It suffices to show that R T 0 E[u 2 k (t)θ 2 (t)]dt < ∞, ∀k = 1,··· ,N, which follows from the fact that θ = (θ(t),F t ),0 ≤ t ≤ T is a Gaussian process and thatE|u k (t)| 4 exists and is continuous int (for allk). The latter will be demonstrated in Section 5.6 . Remark. The existence and uniqueness of the observationu k (t) as a solution to SDE in (5.10) will be studied in the next section. One may notice that the randomness of θ(t) makes the classical regularity results about SDE not applicable for proving the existence and uniqueness ofu k (t)’s. Other approaches are needed. 79 5.5 Second Order Linear Stochastic Ordinary Differential Equation Consider the following non-homogeneous linear SODE and its corresponding homogeneous equation, y ′′ +p(t)y ′ +q(t)y = g(t) dW(t) dt , t 0 <t≤T, (5.14) y ′′ +p(t)y ′ +q(t)y = 0, t 0 <t≤T, (5.15) wherep(t), q(t) andg(t) are continuous int, and dW(t) dt is Gaussian white noise (p(t), q(t) andg(t) can be deterministic or stochastic depending on context). Remark. The derivative of Wiener process w.r.t. t does not exist. Here we just use dW(t) dt to denote the Gaussian white noise formally. Theorem 5.5.1. Denoteϕ 1 (t) andϕ 2 (t) as a set of fundamental solutions to (5.15), then (1). the general solution of (5.14) is given by y(t) =c 1 ϕ 1 (t)+c 2 ϕ 2 (t)+ϕ ∗ (t), wherec 1 andc 2 are arbitrary constants, and ϕ ∗ (t) :=−ϕ 1 (t) Z t t 0 ϕ 2 (s) Q(s) g(s)dW(s)+ϕ 2 (t) Z t t 0 ϕ 1 (s) Q(s) g(s)dW(s), (5.16) whereQ(s) is the Wronskian ofϕ 1 , ϕ 2 : Q(s) :=Q[ϕ 1 ,ϕ 2 ](s) = ϕ 1 (s) ϕ 2 (s) ϕ ′ 1 (s) ϕ ′ 2 (s) =ϕ 1 (s)ϕ ′ 2 (s)−ϕ 2 (s)ϕ ′ 1 (s). 80 (2). With initial conditiony(t 0 ) =y 0 ,y ′ (t 0 ) =y 1 , (5.14) has a unique solution. Proof. Let us first consider the case in whichp(t), q(t) andg(t) are deterministic. Note that p(t), q(t) are continuous, the existence and uniqueness of solution to (5.15) is guaranteed. Sinceϕ 1 (t), ϕ 2 (t) are fundamental set of solutions to (5.15), for simplicity, let us denote Q(t) = Q(t 0 )exp Z t t 0 p(s)ds 6= 0, for any 0≤t 0 ≤t≤T, A(t) = − Z t t 0 ϕ 2 (s) Q(s) g(s)dW(s), B(t) = Z t t 0 ϕ 1 (s) Q(s) g(s)dW(s). So,ϕ ∗ (t) can be written as ϕ ∗ (t) =ϕ 1 (t)A(t)+ϕ 2 (t)B(t). Then, by Ito’s formula, d[ϕ ∗ (t)] = ϕ ′ 1 (t)dt·A(t)+ϕ 1 (t)dA(t)+dhϕ 1 ,Ai t +ϕ ′ 2 (t)dt·B(t)+ϕ 2 (t)dB(t)+dhϕ 2 ,Bi t = [ϕ ′ 1 (t)A(t)+ϕ ′ 2 (t)B(t)]dt That is to say, ˙ ϕ ∗ (t) =ϕ ′ 1 (t)A(t)+ϕ ′ 2 (t)B(t). 
81 Using the Ito’s formula to ˙ ϕ ∗ (t), we have d[ ˙ ϕ ∗ (t)] = ϕ ′′ 1 (t)dt·A(t)+ϕ ′ 1 (t)dA(t)+dhϕ ′ 1 ,Ai t +ϕ ′′ 2 (t)dt·B(t)+ϕ ′ 2 (t)dB(t)+dhϕ ′ 2 ,Bi t = [ϕ ′′ 1 (t)A(t)+ϕ ′′ 2 (t)B(t)]dt+ ϕ 1 (t)ϕ ′ 2 (t)−ϕ 2 (t)ϕ ′ 1 (t) Q(t) g(t)dW(t) = [ϕ ′′ 1 (t)A(t)+ϕ ′′ 2 (t)B(t)]dt+g(t)dW(t) Sinceϕ 1 (t), ϕ 2 (t) both satisfy equation (5.15), a straightforward computation shows that d[ ˙ ϕ ∗ (t)]+p(t) ˙ ϕ ∗ (t)dt+q(t)ϕ ∗ (t)dt =g(t)dW(t). which means thatϕ ∗ (t) is a specific solution to equation (5.14). So y(t) =c 1 ϕ 1 (t)+c 2 ϕ 2 (t)+ϕ ∗ (t) is the general solution to equation (5.14). For uniqueness, supposey(t) and ˜ y(t) are solutions to (5.14) satisfying y(t 0 ) =y 0 , y ′ (t 0 ) =y 1 , ˜ y(t 0 ) =y 0 , ˜ y ′ (t 0 ) =y 1 . ThenY(t) =y(t)−˜ y(t) satisfy equation (5.15) withY(t 0 ) =Y ′ (t 0 ) = 0. Note that, ˜ Y(t)≡ 0 is a solution to equation (5.15) with the same zero initial conditions, then by the uniqueness of the solution to equation (5.15), Y(t) = ˜ Y(t)≡ 0, i.e.y(t) = ˜ y(t),∀t∈ [0,T]. 82 If some of p(t), q(t) and g(t) areF t −adapted random process(es), then the continuity condition becomes “continuity int∈ [0,T],P−a.s.” Thus, for almost allω∈ Ω, p(t) =p(t,ω), q(t) =q(t,ω), are continuous int ∈ [0,T], the existence and uniqueness of the solution to equation (5.15) still holdsP−a.s. Therefore, we still have ϕ 1 (t) =ϕ 1 (t,ω), ϕ 2 (t) =ϕ 2 (t,ω), for almost allω∈ Ω, path-wisely defined (P-a.s.) to be a fundamental set of solutions to (5.15). Then with the random processes ϕ 1 = (ϕ 1 (t),F t ) and ϕ 2 = (ϕ 2 (t),F t ), 0 ≤ t ≤ T (ϕ 1 (t), ϕ 2 (t) areF p,q t −measurable, thusF t -measurable. And because of the (P-a.s.) conti- nuity of processesp,q and the usual hypothesis of{F t } 0≤t≤T ,ϕ ′ 1 (t),ϕ ′ 2 (t),ϕ ′′ 1 (t) andϕ ′′ 2 (t) are allF p,q t − andF t − adapted), we can use formula (5.16) to define processϕ ∗ = (ϕ ∗ (t),F t ) based on the definition of stochastic integral. Hence all the results directly follow. Furthermore, one may notice that, ifp(t), q(t) are independent ofW(t), thenϕ 1 (t), ϕ 2 (2) (as well as their derivatives) are also independent ofW(t). Till the end of this section, we specifically consider the following initial value problem: y ′′ +p(t)y ′ +q(t)y = dW(t) dt , t> 0, y(0) =y ′ (0) = 0. . (5.17) Applying Theorem 5.5.1, the solution is of the form y(t) =c 1 ϕ 1 (t)+c 2 ϕ 2 (t)+ϕ ∗ (t), 83 wherec 1 , c 2 are constants to be determined later,ϕ 1 (t), ϕ 2 (t) are a set of fundamental solution to the corresponding homogeneous equation, andϕ ∗ (t) is defined as in (5.16). Zero initial conditions in (5.17) implies thatc 1 =c 2 = 0, the solution to (5.17) is y(t) = ϕ ∗ (t) =−ϕ 1 (t) Z t t 0 ϕ 2 (s) Q(s) dW(s)+ϕ 2 (t) Z t t 0 ϕ 1 (s) Q(s) dW(s) = Z t 0 ϕ 1 (s)ϕ 2 (t)−ϕ 1 (t)ϕ 2 (s) Q(s) dW(s) (5.18) Define R(t,s) := ϕ 1 (s)ϕ 2 (t)−ϕ 1 (t)ϕ 2 (s) Q(s) , 0≤s≤t 0, s>t (5.19) By direct computation, one can easily verify thatR(t,s) has the following properties, (1) R(t,t) = 0, ∀t∈ [0,T]; (2) ∂R ∂t (t,s) t=s = 1, ∂R ∂s (t,s) s=t =−1, t≥s; (3) ∂ 2 R ∂t 2 (t,s)+p(t) ∂R ∂t (t,s)+q(t)R(t,s) = 0, t≥s (t<s also true); (4) ∂ 2 R ∂t∂s (t,s) t=s =p(s). From properties (1)-(3), we see for each fixeds≥ 0,R(t,s) is the solution to an initial value problem; or say, R(t,s) satisfies a linear second order ordinary differential equation with respect to the first variablet. Yet seeking for a simple form of ordinary differential equation whichR(t,s) may fit in with respect to the second variables, requires additional assumptions. 
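Since ϕ1, ϕ2, and hence R(t,s), rarely admit closed forms, R can at least be computed numerically from properties (1)–(3), which for fixed s specify an initial value problem in the first variable. Below is a minimal sketch; the coefficients p and q are illustrative placeholders, and the crude Euler scheme is only for demonstration.

```python
import numpy as np

# A numerical sketch of R(t, s): by properties (1)-(3), for fixed s the
# map t -> R(t, s) solves  R'' + p(t) R' + q(t) R = 0  with R(s, s) = 0
# and (dR/dt)(t, s)|_{t=s} = 1.  The coefficients p, q are illustrative.

p = lambda t: 0.1 * t
q = lambda t: 1.0 + np.sin(t)

def R(t, s, n=20000):
    h = (t - s) / n
    r, dr, tau = 0.0, 1.0, s          # initial data from properties (1), (2)
    for _ in range(n):
        r, dr = r + h * dr, dr + h * (-p(tau) * dr - q(tau) * r)
        tau += h
    return r

for t, s in [(1.0, 0.0), (2.0, 0.5), (3.0, 1.0)]:
    print(f"R({t}, {s}) ~ {R(t, s):+.5f}")
```

By the Itô isometry applied to (5.18), the integral of R²(t,s) over s in [0,t] gives the variance of y(t); this is precisely the quantity bounded in Section 5.6.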
Alternatively, we defineG(t,s) = R(t,s)Q(s), and for simplicity, consider the fundamental setϕ 1 (t), ϕ 2 (t) to be such that ϕ 1 (0) = 1, ϕ ′ 1 (0) = 0; ϕ 2 (0) = 0, ϕ ′ 2 (0) = 1. 84 By Abel’s Identity,Q(t) =Q(0)e − R t 0 p(r)dr =e − R t 0 p(r)dr . G(t,s) =R(t,s)Q(s) = ϕ 1 (s)ϕ 2 (t)−ϕ 1 (t)ϕ 2 (s), 0≤s≤t, 0, s>t. (5.20) We summarize the properties ofG(t,s) in the upcoming Lemma omitting the obvious proof. Lemma 5.5.2. G(t,s) =R(t,s)Q(s) , described as in (5.20), has the following properties: (1) For each fixeds≥ 0,G(t,s) is the solution to ∂ 2 ∂t 2 G(t,s)+p(t) ∂ ∂t G(t,s)+q(t)G(t,s) = 0, t>s, G(t,s)| t=s = 0, ∂ ∂t G(t,s) t=s =Q(s), whereQ(s) =Q(0)·e − R s 0 p(r)dr =e − R s 0 p(r)dr (with specifiedϕ 1 , ϕ 2 ). (2) For each fixedt> 0,G(t,s) is the solution to ∂ 2 ∂s 2 G(t,s)+p(s) ∂ ∂s G(t,s)+q(s)G(t,s) = 0, 0≤s<t, G(t,s)| s=t = 0, ∂ ∂s G(t,s) s=t =−Q(t). Remark. (1) The specification of Q(0) does not actually affect the form of the properties above, since we can always rescale by a constant factor. (2) One may notice a special case in whichp(t)≡ 0, t≥ 0, thenG(t,s) =R(t,s). We as a result have a linear second order ordinary differential equation forR(t,s) with respect to its second variables. This case will get more attention later on when we move on to our multi-channel model. 85 (3) By defining ˜ G(t,r) = G(t,t−r) = G(t,s), with r = t−s, we have an initial value problem for ˜ G(t,r). ∂ 2 ∂r 2 ˜ G(t,r)−p(t−r) ∂ ∂r ˜ G(t,r)+q(t−r) ˜ G(t,r) = 0, 0≤r≤t, ˜ G(t,r)| r=0 = 0, ∂ ∂r ˜ G(t,r) r=0 =Q(t). Before stating the main result of this section, we need the following three lemmas. Lemma 5.5.3. Lety(t) be the solution to problem (5.17), theny(t) has the following proper- ties. (1) R t 0 y(r)d˙ y(r) =y(t)˙ y(t)− R t 0 (˙ y(r)) 2 dr; (2) R t 0 ˙ y(r)d˙ y(r) = (˙ y(t)) 2 −t 2 ; (3) R t 0 y(r)˙ y(r)dr = y 2 (t) 2 . Proof. With the zero initial conditions, integration by parts and Ito’s formula, the results can be verified by direct computation. Lemma 5.5.4. Ifp(t) andq(t) are deterministic, then(y(t), ˙ y(t)) is jointly normal distributed with E[y(t)] =E[˙ y(t)] = 0, and the covariance matrix Cov(y(t), ˙ y(t)) = R t 0 R 2 (t,s)ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 ∂R(t,s) ∂t 2 ds . 86 Proof. Noticing the expression (5.18), (5.19) and integrating by parts, y(t) = Z t 0 R(t,s)dW(s) = R(t,t)W(t)−R(t,0)W(0)− Z t 0 ∂R(t,s) ∂s W(s)ds = − Z t 0 ∂R(t,s) ∂s W(s)ds ˙ y(t) = − Z t 0 ∂ 2 R(t,s) ∂t∂s W(s)ds+ ∂R(t,s) ∂s s=t ·W(t) = − Z t 0 ∂ 2 R(t,s) ∂t∂s W(s)ds+W(t) = − W(t) ∂R(t,s) ∂t s=t − Z t 0 ∂R(t,s) ∂t dW(s) +W(t) = Z t 0 ∂R(t,s) ∂t dW(s) Many equalities here are due to the properties ofR(t,s). Thus,E[y(t)] =E[˙ y(t)] = 0, and for anyc 1 , c 2 ∈R, we have c 1 y(t)+c 2 ˙ y(t) = Z t 0 c 1 R(t,s)+c 2 ∂R(t,s) ∂t dW(s), Sincep(t) andq(t) are deterministic,R(t,s) is also deterministic; hence (y(t), ˙ y(t)) is jointly normal. Additionally, by Ito Isometry, Cov(y(t), ˙ y(t)) = E[|y(t)| 2 ] E[y(t)˙ y(t)] E[y(t)˙ y(t)] E[|˙ y(t)| 2 ] = R t 0 R 2 (t,s)ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 ∂R(t,s) ∂t 2 ds . 87 Lemma 5.5.5. Consider on a given stochastic basisF = (Ω,F,{F t } t≥0 ,P). (1) For any two independent random variables ξ,η and a measurable function α = α(x,y),x,y∈R, then E[α(ξ,η)|ξ] =E[α(x,η)]| x=ξ ; (2) Let ξ = (ξ t ,F t ), η = (η t ,F t ),t ≥ 0, be two independent random processes. α = (α t (x,y),B t ×B t ) 1 ,t≥ 0, be a nonanticipative measurable functional. Then E[α t (ξ,η)|F ξ t ] =E[α t (x,η)]| x=ξ [0,t] . Proof. 
The proof to (1) and (2) share the same trick. For simplicity, we just prove (2). Fix t ∈ [0,T]. For any bounded nonanticipative Borel functional λ t (x),x ∈ C([0,T]), we have E[α t (ξ,η)λ t (ξ)] = Z Z α t (x,y)λ t (x)dµ t,ξ (x)dµ t,η (y) = Z λ t (x) Z α t (x,y)dµ t,η (y) dµ t,ξ (x) = Z λ t (x)E[α t (x,η)]dµ t,ξ (x) = E n λ t (ξ)E[α t (x,η)]| x=ξ [0,t] o , where µ t,ξ and µ t,η are measures on space (C[0,t],B t (C[0,t])), generated by processes ξ(s), 0≤s≤t, andη(s), 0≤s≤t, respectively. Then by the definition of conditional expectation, E[α t (ξ,η)|F ξ t ] =E[α t (x,η)]| x=ξ [0,t] . 1 B t = σ{x : x(s),s≤ t} wherex belongs to the space of continuous functions. 88 Theorem 5.5.6. Ifp(t) andq(t) are stochastic, but independent ofW(t), then for everyt ∈ [0,T], given the information of{p(s), q(s)|0≤s≤t}, (y(t), ˙ y(t)) is jointly normal. That is to say,∀c 1 , c 2 ∈R, [c 1 y(t)+c 2 ˙ y(t)]| F p,q t is Gaussian, whereF p,q t =σ({p(s),q(s)|0≤s≤t}). Furthermore, the expectations ofy(t) and ˙ y(t) equal to zero. The conditional covariance matrix is Cov(y(t), ˙ y(t)|F p,q t ) = R t 0 R 2 (t,s)ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 ∂R(t,s) ∂t 2 ds . whereR(t,s) is defined as in (5.19). Remark. Withp(t) andq(t) being stochastic, we get stochastic fundamental set of solutions, ϕ 1 = ϕ 1 (t,ω),ϕ 2 = ϕ 2 (t,ω), to the corresponding homogeneous equation of (5.17). And ϕ 1 ,ϕ 2 are (P-a.s.) pathwise continuous as illustrated in Theorem(5.5.1). ThusR(t,s) is also pathwise continuous on 0 ≤ s ≤ t ≤ T . All the properties we found for R(t,s) and in Lemma(5.5.3) still hold true, pathwise (P-a.s.). Proof of Theorem 5.5.6. The solution,y(t), to following initial value problem y ′′ +p(t)y ′ +q(t)y = dW(t) dt , 0<t≤T, y(0) =y ′ (0) = 0 89 and its derivative ˙ y(t) are obviously determined uniquely processes by p(s), q(s) and W(s), 0≤s≤t. Precisely, (y(t), ˙ y(t)) =H t (p,q,W) for some nonanticipative measurable functionalH t . For any~ c = (c 1 ,c 2 ) T ∈R 2 , E e i~ c·Ht(p,q,W) |F p,q t =E e i~ c·Ht(x 1 ,x 2 ,W) | x 1 =p,x 2 =q where x 1 ,x 2 ∈ C([0,T]), H t (x 1 ,x 2 ,W) denotes the corresponding (˜ y(t),˜ y ′ (t)) satisfying ˜ y ′′ (t)+x 1 (t)˜ y ′ (t)+x 2 (t)˜ y(t) = dW(t) dt with ˜ y(0) = ˜ y ′ (0) = 0. So, (˜ y(t),˜ y ′ (t)) is jointly normal, and their covariance matrix is given by Lemma 5.5.4 (with coefficientsx 1 (t),x 2 (t) in the place ofp(t), q(t)). Thus, due to the results of conditional expectation in Lemma 5.5.5, we have: E e i~ c·Ht(p,q,W) |F p,q t =e i~ c·~ m(t)− 1 2 ~ c T Σ(t)~ c where ~ m(t) = E Z t 0 R(t,s)dW(s) F p,q t , E Z t 0 ∂R(t,s) ∂t dW(s) F p,q t T = (0,0) T Σ(t) = R t 0 R 2 (t,s)ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 ∂R(t,s) ∂t 2 ds . 90 R(t,s) is defined as in (5.19), with ϕ 1 ,ϕ 2 being a fundamental set of solutions to y ′′ (t) + p(t)y ′ (t)+q(t)y(t) = 0. Remark. As we pointed out earlier, computing the higher moments ofy(t) (or,u k (t), in par- ticular) is necessary. Now we found that (y(t), ˙ y(t)) conditioned onF p,q t is jointly Gaussian, and their higher moments depend on the fundamental solutions, ϕ 1 (t,ω) and ϕ 2 (t,ω), to corresponding homogeneous equation (5.15). In general, we cannot write out the explicit expression ofϕ 1 ,ϕ 2 orR(t,s). Alternative approach is needed. 5.6 Conditional Moments Proposition 5.6.1. 
Let (θ(t),(u 1 (t),··· ,u N (t));F t ), 0 ≤ t ≤ T be the process defined by (5.10), under the assumptions (H1)-(H5) in theorem 5.4.1, assumingu k (0) =u ′ k (0) = 0, k = 1,··· ,N; then for eachk, (u k (t),u ′ k (t))| F θ t is jointly normal with E[u k (t)] =E[u ′ k (t)] = 0, and conditional covariance matrix Cov (u k (t),u ′ k (t)) F θ t = R t 0 R 2 (t,s)ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 R(t,s) ∂R(t,s) ∂t ds R t 0 ∂R(t,s) ∂t 2 ds , whereR(t,s) =R k (t,s), (to simplify notation, we omit the subscriptk) satisfies the following second order linear ODE ∂ 2 R ∂t 2 (t,s)+k 2 θ(t)R(t,s) = 0, s<t≤T R(t,s)| t=s = 0, ∂R ∂t (t,s)| t=s = 1. 91 Proof. The result directly follows from Theorem 5.5.6 and the properties ofR(t,s) described after (5.19). Remark. For each k, u k (t)| F θ t is normal. Thus the existence of finite E[u k (t)|F θ t ] and E[u 2 k (t)|F θ t ] ensure the finite higher moments ofu k (t) conditioned onF θ t . The remaining of this section will devote to the computation of upper and lower bounds of E[u 2 k (t)|F θ t ] (since as we pointed out earlier, exact computation seems hopeless). Let us now fixt∈ (0,T], k∈{1,2,··· ,N}, E[u 2 k (t)] =E{E[u 2 k (t)|F θ t ]} = Z C[0,t] E[u 2 k (t)|θ [0,t] =q [0,t] ]dµ θ (q), (5.21) whereµ θ is the measure on space(C[0,t],B(C[0,t])), generated by processθ(s), 0≤s≤t; i.e. for anyA∈B(C[0,t]) µ θ (A) =P(θ [0,t] ∈A), also, E[u 2 k (t)|θ [0,t] =q [0,t] ] = Z t 0 R 2 (t,s)ds, (5.22) whereq is a continuous and deterministic function, andR(τ,s) =R q (τ,s) satisfies ∂ 2 R ∂τ 2 (τ,s)+k 2 q(τ)R(τ,s) = 0, s<τ R(τ,s)| τ=s = 0, ∂R ∂τ (τ,s)| τ=s = 1. (5.23) Still, to simplify notation, we omit the subscriptq for the rest of this section. As indicated in (5.21), we are interested in R t 0 R 2 (t,s)ds, the integral of R 2 (t,s) with respect to the second variable. But the above equation (5.23) is aboutR(t,s) and its partial derivatives with respect to the first variable. Based on Lemma 5.5.2 and its remarks, we can 92 describe R(t,s) using another similar linear ordinary differential equation. For each fixed t> 0, ∂ 2 ∂s 2 R(t,s)+k 2 q(s)R(t,s) = 0, 0≤s≤t, R(t,s)| s=t = 0, ∂ ∂s R(t,s) s=t =−1. (5.24) Or, if we consider ˜ R(t,s) :=R(t,t−r) =R(t,s), withr =t−s, we have ˜ R(t,s) being the solution to the following initial value problem, ∂ 2 ∂r 2 ˜ R(t,r)+k 2 q(t−r) ˜ R(t,r) = 0, 0≤r≤t, ˜ R(t,r)| r=0 = 0, ∂ ∂r ˜ R(t,r) r=0 = 1. Also obviously, R t 0 R 2 (t,s)ds = R t 0 | ˜ R(t,r)| 2 dr. The following computation mainly depends on the comparison theorem of second order linear ODE’s developed in the section coming later. q(τ), 0≤ τ ≤ t, is a certain realization of processθ up to timet. Asq(τ)∈C[0,t], it is bounded. Denotec = min 0≤τ≤t q(τ), ¯ c = max 0≤τ≤t q(τ). (One may notice thatc andc also depend ont). Case I. 0<c≤ ¯ c I t 0 (k 2 ¯ c)≤ Z t 0 | ˜ R(t,r)| 2 dr≤I t 0 (k 2 c), where I t s (λ) :=− 1 4λ 3/2 sin(2 √ λ(t−s))+ t−s 2λ , λ> 0. (For details ofI t s (λ) and its properties see the proof in Lemma 5.7.3). With R t 0 R 2 (t,s)ds = R t 0 | ˜ R(t,r)| 2 dr, we have I t 0 (k 2 ¯ c)≤ Z t 0 R 2 (t,s)ds≤I t 0 (k 2 c), (5.25) 93 where I t 0 (k 2 c) =− 1 4k 3 c 3/2 ·sin(2k √ ct)+ t 2k 2 c = t 2k 2 c 1− sin(2k √ ct) 2k √ ct . (5.26) Case II.c = 0< ¯ c I t 0 (k 2 ¯ c)≤ Z t 0 | ˜ R(t,r)| 2 dr≤ t 3 3 , or, I t 0 (k 2 ¯ c)≤ Z t 0 R 2 (t,s)ds≤ t 3 3 . Case III.c< 0< ¯ c, I t 0 (k 2 ¯ c)≤ Z t 0 | ˜ R(t,r)| 2 dr≤J t 0 (k 2 c), where J t s (λ) := 1 8(−λ) 3/2 n e 2 √ −λ(t−s) −e −2 √ −λ(t−s) −4 √ −λ(t−s) o = − t−s 2λ ( e 2 √ −λ(t−s) −e −2 √ −λ(t−s) 4 √ −λ(t−s) −1 ) , λ< 0. 
(For details ofJ t s (λ) and its properties see the proof of Lemma 5.7.3) With R t 0 R 2 (t,s)ds = R t 0 | ˜ R(t,r)| 2 dr, we have I t 0 (k 2 ¯ c)≤ Z t 0 R 2 (t,s)ds≤J t 0 (k 2 c), where J t 0 (k 2 c) =− t 2k 2 c ( e 2k √ −ct −e −2k √ −ct 4k √ −c·t −1 ) , c< 0. (5.27) Case IV.c< 0 = ¯ c, t 3 3 ≤ Z t 0 | ˜ R(t,r)| 2 dr≤J t 0 (k 2 c), 94 or, t 3 3 ≤ Z t 0 R 2 (t,s)ds≤J t 0 (k 2 c). Case V.c≤ ¯ c< 0, J t 0 (k 2 ¯ c)≤ Z t 0 | ˜ R(t,r)| 2 dr≤J t 0 (k 2 c), or, J t 0 (k 2 ¯ c)≤ Z t 0 R 2 (t,s)ds≤J t 0 (k 2 c). 5.6.1 Upper Bound and Lower Bound For eachm, n∈N, we define the setA t m,n ∈B(C[0,t]) in the following way, A t m,n = {q(τ)∈C[0,t]|m≤ min 0≤τ≤t q(τ)<m+1; n−1< max 0≤τ≤t q(τ)≤n}, m≤n ∅, m>n . ThusA t m,n ∩A t k,l =∅, form6=k orn6=l. For a fixedk∈{1,··· ,N} andt∈ (0,T], as in (5.21) E[u 2 k (t)] = E{E[u 2 k (t)|F θ t ]} = Z C[0,t] E(u 2 k (t)|θ [0,t] =q [0,t] )dµ θ (q) = ∞ X m,n=−∞ Z A t m,n E(u 2 k (t)|θ [0,t] =q [0,t] )dµ θ (q) (5.28) = S 1 +S 2 +S 3 +S 4 +S 5 +S 6 , 95 where S 1 := ∞ X m=−∞ Z A t m,m Z t 0 |R q (t,s)| 2 dsdµ θ (q), S 2 := X 0<m<n Z A t m,n Z t 0 |R q (t,s)| 2 dsdµ θ (q), S 3 := ∞ X n=1 Z A t 0,n Z t 0 |R q (t,s)| 2 dsdµ θ (q), S 4 := X m<0<n Z A t m,n Z t 0 |R q (t,s)| 2 dsdµ θ (q), S 5 := −∞ X m=−1 Z A t m,0 Z t 0 |R q (t,s)| 2 dsdµ θ (q), S 6 := X m<n<0 Z A t m,n Z t 0 |R q (t,s)| 2 dsdµ θ (q). R q (t,s) is described in (5.23) or (5.24) for corresponding continuous function q. Based on those inequalities we just developed in the previous section, we can obtain the upper and lower bounds forS i , i = 1,··· ,6, further. S 1 = ∞ P m=1 R A t m,m I t 0 (k 2 m)dµ θ (q)+ R A t 0,0 t 3 3 dµ θ (q)+ −∞ P m=−1 R A t m,m J t 0 (k 2 m)dµ θ (q), P 0<m<n R A t m,n I t 0 (k 2 n)dµ θ (q)≤S 2 ≤ P 0<m<n R A t m,n I t 0 (k 2 m)dµ θ (q), ∞ P n=1 R A t 0,n I t 0 (k 2 n)dµ θ (q)≤S 3 ≤ ∞ P n=1 R A t 0,n t 3 3 dµ θ (q), P m<0<n R A t m,n I t 0 (k 2 n)dµ θ (q)≤S 4 ≤ P m<0<n R A t m,n J t 0 (k 2 m)dµ θ (q), −∞ P m=−1 R A t m,0 t 3 3 dµ θ (q)≤S 5 ≤ −∞ P m=−1 R A t m,0 J t 0 (k 2 m)dµ θ (q), P m<n<0 R A t m,n J t 0 (k 2 n)dµ θ (q)≤S 6 ≤ P m<n<0 R A t m,n J t 0 (k 2 m)dµ θ (q), (5.29) 96 whereI t 0 andJ t 0 are as defined in (5.26) and (5.27). I t 0 (k 2 c) = t 2k 2 c 1− sin(2k √ ct) 2k √ ct , c> 0, t> 0. For fixedt ∈ (0,T], for eachc > 0, there exists integerK = K(t,c) (K decreases int, c), such that 1 2 ≤ 1− sin(2k √ ct) 2k √ ct ≤ 1, ∀k≥K, so, t 4k 2 c ≤I t 0 (k 2 c)≤ t 2k 2 c , ∀k≥K. (5.30) We know J t 0 (k 2 c) =− t 2k 2 c ( e 2k √ −ct −e −2k √ −ct 4k √ −c·t −1 ) , c< 0, t> 0. For fixedt ∈ (0,T], for eachc < 0, there exists integerK = K(t,c) (K decrease int, |c|), such that 1 2 · e 2k √ −ct 4k √ −ct ≤ e 2k √ −ct −e −2k √ −ct 4k √ −ct −1≤ e 2k √ −ct 4k √ −ct , ∀k≥K. Then, we have 1 2 · e 2k √ −ct 8k 3 (−c) 3/2 ≤J t 0 (k 2 c)≤ e 2k √ −ct 8k 3 (−c) 3/2 , ∀k≥K. (5.31) Remark. Noticing the property of K = K(t,c), for each t > 0, we can further find K 1 ∈ N, M 1 ∈R, such that the inequalities (5.30) and (5.31) both hold as long as|c|≥M 1 , k ≥ K 1 ; or we can findδ> 0, K 2 ∈N, M 2 ∈R, such that the inequalities (5.30) and (5.31) both hold as long ast>δ, |c|≥M 2 , k≥K 2 . 97 If we are still seeking the solution of equation (5.1) as a Fourier series (5.2), we need E|u(t,x)| 2 = ∞ X k=1 E|u 2 k (t)|<∞, (5.32) as pointed out in (5.4). 
However, based on above computation and bounds result, it is necessary to have: on each A t m,n withµ θ (A t m,n )> 0, the following series converges ∞ X k=1 Z A t m,n E(u 2 k (t)|θ [0,t] =q [0,t] )dµ θ (q)<∞, whereE(u 2 k (t)|θ [0,t] =q [0,t] ) relies onI t 0 , J t 0 and t 3 3 through (5.28) and (5.29). Then according to the inequalities (5.30) and (5.31), to ensure (5.32) it is natural and sufficient to expect P(min 0≤τ≤t θ(τ)≤ 0) = 0. (5.33) With the process θ = (θ(t),F t ), 0 ≤ t ≤ T , described in (5.10), this θ is a Gaussian process, thus the condition (5.33) cannot be true for anyt∈ (0,T]. 5.6.2 Asymptotic Efficiency For simplicity, we are assuming that the initial condition of processθ(t) in (5.10) is normal with zero mean, i.e. θ(0) ∼ N(0,σ 2 ). Then θ(t), 0 ≤ t ≤ T , is a centered Gaussian process, a.s. bounded on [0,T]. Theorem 4.2.1 is applicable. (For more general case, if θ(0)∼N(µ,σ 2 ) withµ 6= 0, one can consider ˜ θ(t) =θ(t)−µ ·e R t 0 a(r)dr .) 98 Theorem 5.6.2. Consider (5.10) under the assumptions of Theorem 5.4.1. Ifθ(0)∼N(0,σ 2 ), then for anyt> 0, P lim N→∞ N X k=1 k 4 u 2 k (t) ! =∞ ! = 1. Proof. By Proposition 5.6.1,u k (t)| θ [0,t] =q [0,t] is normal with zero mean and its variance is σ 2 k (q) :=E u 2 k (t)| θ [0,t] =q [0,t] , here σ 2 k (q) = σ 2 k (t,q) and since we start with a fixed t > 0, we are going to use σ 2 k (q) for notation simplicity. Using the same notation of A t m,n as in Section 5.6.1, and recalling expressions (5.28), (5.29), we have 1. ifq [0,t] ∈A t m,n , withn> 0,m<n σ 2 k (q)≥I t 0 (k 2 n); 2. ifq [0,t] ∈A t m,0 , withm< 0, σ 2 k (q)≥ t 3 3 ; 3. ifq [0,t] ∈A t m,n , withn< 0,m<n, σ 2 k (q)≥J t 0 (k 2 n). As in the proof of Theorem 4.3.1, for the fixedt> 0, and any fixed integerM > 0, we define A N :={ω : N X k=1 k 4 u 2 k (t)≤M}. 99 Then ∞ X N=l P(A N ) = ∞ X N=l Z C[0,t] P(A N |θ [0,t] =q [0,t] )dµ θ (q) = ∞ X N=l Z C[0,t] N Y k=l P |u k (t)|≤ √ M k 2 θ [0,t]=q [0,t] ! dµ θ (q) ≤ ∞ X N=l Z C[0,t] N Y k=l r 2M π · 1 k 2 σ k (q) dµ θ (q) = ∞ X N=l 2M π N−l+1 2 {I 1 +I 2 +I 3 }, (5.34) where I 1 = ∞ X n=1 n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 ·σ k (q) dµ θ (q), I 2 = −1 X m=−∞ Z A t m,0 N Y k=l 1 k 2 ·σ k (q) dµ θ (q), I 3 = −1 X n=−∞ n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 σ k (q) dµ θ (q). According to (5.30), (5.31) and the remark, there exists an integerK =K(t), such that for all integersn> 0, t 4k 2 n ≤I t 0 (k 2 n)≤ t 2k 2 n , ∀k≥K, 1≤ 1 2 · e 2k √ nt 8k 3 ·n 3/2 ≤J t 0 (k 2 ·(−n))≤ e 2k √ nt 8k 3 ·n 3/2 , ∀k≥K. 100 Therefore, assumingl≥K, we have I 1 ≤ ∞ X n=1 n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 p I t 0 (k 2 n) dµ θ (q) ≤ ∞ X n=1 n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 · 2k √ n √ t dµ θ (q) = ∞ X n=1 n−1 X m=−∞ 2 √ n √ t N−l+1 · (l−1)! N! P(θ [0,t] ∈A t m,n ) = ∞ X n=1 2 √ n √ t N−l+1 · (l−1)! N! P(θ [0,t] ∈ n−1 [ m=−∞ A t m,n ) = 2 √ t N−l+1 · (l−1)! N! ∞ X n=1 n N−l+1 2 P(n−1< sup 0≤τ≤t θ(τ)≤n), where ∞ X n=1 n N−l+1 2 P(n−1< sup 0≤τ≤t θ(τ)≤n) ≤ E h (θ ∗ t +1) N−l+1 2 ;θ ∗ t > 0 i = E h (θ ∗ t +1) N−l+1 2 ;θ ∗ t >δ i +E h (θ ∗ t +1) N−l+1 2 ;0<θ ∗ t ≤δ i , whereθ ∗ t := sup 0≤s≤t θ(s),α t :=E(θ ∗ t ),δ>α t ∨1. The same as in Theorem 4.3.1, we use Borell-TIS inequality (Theorem 4.2.1) and its corol- laries to obtain thatE[θ ∗ t ]<∞, and for allδ>E[θ ∗ t ] =α t , P(θ ∗ t >δ)≤ exp −(δ−α t ) 2 /2σ 2 t , where σ 2 t := sup 0≤s≤t E|θ(s)| 2 = sup 0≤s≤t e 2 R s 0 a(r)dr [σ 2 + Z s 0 e −2 R τ 0 a(r)dr b 2 (τ)dτ] . 
101 Recalling (4.16), we can easily see that E h (θ ∗ t +1) N−l+1 2 ;θ ∗ t > 0 i ≤ 2 N−l+1 2 E h (θ ∗ t ) N−l+1 2 ;θ ∗ t >δ i +(δ +1) N−l+1 2 ≤ 2 N−l+1 2 n δ N−l+1 2 +C(N−l +1)· " N−l−1 2 N−l−1 2 ·(α t +σ t ) N−l−1 2 ·1 {N−l−1≥0} +1 {0< N−l+1 2 <1} #) +(δ +1) N−l+1 2 =: B(N,l,δ,α t ,σ t ), whereC is some constant depending onα t , σ t . Then we have I 1 ≤ 2 √ t N−l+1 · (l−1)! N! E h (θ ∗ t +1) N−l+1 2 ;θ ∗ t > 0 i ≤ 2 √ t N−l+1 · (l−1)! N! ·B(N,l,δ,α t ,σ t ). Similarly, we have I 2 ≤ −1 X m=−∞ Z A t m,0 N Y k=l √ 3 k 2 ·t 3/2 dµ θ (q) = 3 t 3 N−l+1 2 · (l−1)! N! 2 ·P(θ [0,t] ∈ −1 [ m=−∞ A t m,0 ) = 3 t 3 N−l+1 2 · (l−1)! N! 2 ·P(−1< sup 0≤s≤t θ(s)≤ 0) ≤ 3 t 3 N−l+1 2 · (l−1)! N! 2 . 102 I 3 ≤ −1 X n=−∞ n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 p J t 0 (k 2 n) dµ θ (q) ≤ −1 X n=−∞ n−1 X m=−∞ Z A t m,n N Y k=l 1 k 2 dµ θ (q) = (l−1)! N! 2 ·P( sup 0≤s≤t θ(s)≤−1) ≤ (l−1)! N! 2 . From (5.34), ∞ X N=l P(A N )≤ ∞ X N=l 2M π N−l+1 2 {I 1 +I 2 +I 3 } By Stirling’s formula, one can easily verify that, for any constantC, ∞ X n=1 C n n! <∞, ∞ X n=1 C n n 2 n 2 n! <∞, ∞ X n=1 C n (n!) 2 <∞. Therefore, P ∞ N=l P(A N )<∞. Hence, P ∞ N=1 P(A N )<∞. By Borel-Cantelli Lemma,P(A N , i.o.) = 0, i.e. P N X k=1 k 4 u 2 k (t)≤M i.o. ! = 0. We have P N X k=1 k 4 u 2 k (t)>M; N is large enough ! = 1, and notice thatM > 0 is an arbitrary integer, then with the arbitrarily chosent> 0, P lim N→∞ N X k=1 k 4 u 2 k (t) =∞ ! = 1. 103 Remark. Using the same scheme, one can actually show that, under the same assumption of Theorem 5.6.2 , for anyt> 0, anyα> 2, P lim N→∞ N X k=1 k α u 2 k (t) =∞ ! = 1. Theorem 5.6.3. Consider (5.10) under the assumptions of Theorem 5.4.1. Let θ(0) ∼ N(0,σ 2 ), defineγ N (t) according to (5.13) with someγ N (0)≥ 0. Then for anyǫ> 0, P lim N→∞ sup [ǫ,T] γ N (t) ! = 0 ! = 1. Proof. Similar to Theorem 4.3.1, the proof here is a straight consequence of applying Theorem 1.5.6 and Theorem 5.6.2. 5.7 Comparison Theorem In this section we are going to develop a comparison theorem (namely Theorem 5.7) to bound the second moment of u k (t), since there seems no way to compute the value of the second moment ofu k (t) exactly. First, we introduce the famous Sturm-Picone Comparison Theorem as a start. Theorem 5.7.1. Comparison Theorem I (Sturm-Picone): Consider the linear second-order ordinary differential equations of the form (p 1 (t)u ′ ) ′ +q 1 (t)u = 0, (5.35) and (p 2 (t)v ′ ) ′ +q 2 (t)v = 0, (5.36) 104 wherep i , q i , i = 1,2 are continuous real-valued functions defined on a given intervalI, with 0<p 2 (t)≤p 1 (t) and q 1 (t)≤q 2 (t). Letu be a non-trivial solution of (5.35) with consecutive zeros ata andb (a < b). Then any non-trivial solutionv of (5.36) has one of the following properties. (A) There exists at 0 ∈ (a,b), such thatv(t 0 ) = 0; (B) there exists some constantλ∈R, such thatv(t) =λu(t), on [a,b]. Proof. The proof is based on the Picone identity. One can easily verify that for u, v being solutions of (5.35) and (5.36) respectively, withv(t)6= 0, the following identity holds: u v (p 1 u ′ v−p 2 uv ′ ) ′ = (q 2 −q 1 )u 2 +(p 1 −p 2 )(u ′ ) 2 +p 2 u ′ −v ′ u v 2 . According to basic differential computation, h u v (p 1 u ′ v−p 2 uv ′ ) i ′ = p 1 u ′ u−p 2 v ′ · u 2 v ′ = (−q 1 u)u+p 1 (u ′ ) 2 −(−q 2 v) u 2 v −p 2 v ′ · 2uu ′ v−u 2 v ′ v 2 = (q 2 −q 1 )u 2 +(p 1 −p 2 )(u ′ ) 2 +p 2 (u ′ ) 2 −2u ′ v ′ · u v + v ′ · u v 2 = (q 2 −q 1 )u 2 +(p 1 −p 2 )(u ′ ) 2 +p 2 u ′ −v ′ · u v 2 To prove the main result of Sturm-Picone Comparison Theorem, we use contradiction. 
105 Suppose to the contrary of situation (A), v(t) has no zeros in (a,b). Integrating Picone identity froma tob, we have LHS := Z b a h u v (p 1 u ′ v−p 2 uv ′ ) i ′ dt = Z b a (q 2 −q 1 )u 2 +(p 1 −p 2 )(u ′ ) 2 +p 2 u ′ −v ′ · u v 2 dt Noting thatu(a) =u(b) = 0, we have LHS = u(t) v(t) (p 1 (t)u ′ (t)v(t)−p 2 (t)u(t)v ′ (t)) b a = 0 Sincep 1 (t) ≥ p 2 (t) > 0,q 2 (t) ≥ q 1 (t) for allt ∈ [a,b], andu is a non-trivial solution, i.e., u(t)6= 0 on [a,b]. Then,we obtain LHS = Z b a (q 2 −q 1 )u 2 +(p 1 −p 2 )(u ′ ) 2 +p 2 u ′ −v ′ · u v 2 dt ≥ 0. (5.37) The equality holds if and only ifq 1 ≡q 2 , p 1 ≡p 2 andu ′ −v ′ · u v ≡ 0 on interval (a,b). This implies that, on interval [a,b],v(t) is a multiple ofu(t). On the contrary, when the equality in (5.37) fails, we have a contradiction, which means v(t) does have (at least one) zero point in (a,b). Corollary 5.7.2. Let a 1 (t), a 0 (t), A 0 (t) be continuous real-valued functions defined on a given interval I. Suppose u and v are non-trivial solutions of the following linear second order ODE’s, respectively, u ′′ +a 1 (t)u ′ +a 0 (t)u = 0 (5.38) v ′′ +a 1 (t)v ′ +A 0 (t)v = 0 (5.39) 106 Ifa, b(a<b) are two consecutive zeros ofu(t), and A 0 (t)≥a 0 (t), ∀t∈ [a,b], then one of the following properties hold, (A) There exists at 0 ∈ (a,b), such thatv(t 0 ) = 0; (B) there exists some constantλ∈R, such thatv(t) =λu(t) on [a,b]. Proof. Define p(t) = exp Z t 0 a 1 (s)ds , q 1 (t) =a 0 (t)p(t), q 2 (t) =A 0 (t)p(t), then d dt (p(t)u ′ )+p(t)a 0 (t)u = 0, d dt (p(t)v ′ )+p(t)A 0 (t)v = 0, or (p(t)u ′ (t)) ′ +q 1 (t)u(t) = 0, (p(t)v ′ (t)) ′ +q 2 (t)v(t) = 0, withp(t)> 0, q 2 (t)≥q 1 (t) for allt. Therefore, the result follows from a direct application of Sturm-Picone Comparison The- orem. Motivated by the Sturm-Picone Comparison Theorem, we are about to construct a theorem 107 comparing the solutions in the sense ofL 2 norm. We first introduce a Lemma which serves as the inspiration for the main theorem. Lemma 5.7.3. Letq 1 ≥q 2 be two real numbers. Supposex andy are solutions of the following linear second order ODE’s, receptively, x ′′ (t)+q 1 ·x(t) = 0, s<t≤T, y ′′ (t)+q 2 ·y(t) = 0, s<t≤T. Ifx andy share the same initial conditions x(s) =y(s) = 0, x ′ (s) =y ′ (s), then we have Z T s x 2 (t)dt≤ Z T s y 2 (t)dt. Proof. It is enough and straightforward to investigate the concrete solution expression. Let x i (t), i = 1,2,3, be the solutions of the following ODE, x ′′ i (t)+λ i ·x i (t) = 0, s<t≤T, x i (s) = 0, x ′ i (s) =b, whereλ 1 >λ 2 = 0>λ 3 . The solutionx i has explicit expression x 1 (t) = b √ λ 1 sin( p λ 1 (t−s)), x 2 (t) = b(t−s), x 3 (t) = b 2 √ −λ 3 e √ −λ 3 (t−s) −e − √ −λ 3 (t−s) . 108 Obviously, |x 1 (t)| 2 = b(t−s)· sin( √ λ 1 (t−s)) √ λ 1 (t−s) 2 ≤|b(t−s)| 2 =|x 2 (t)| 2 , |x 3 (t)| 2 = b 2 −4λ 3 n e 2 √ −λ 3 (t−s) +e −2 √ −λ 3 (t−s) −2 o ≥ b 2 (t−s) 2 =|x 2 (t)| 2 . The second inequality can easily be verified by second derivative test. Thus Z T s |x 1 (t)| 2 dt≤ Z T s |x 2 (t)| 2 dt≤ Z T s |x 3 (t)| 2 dt, where Z T s |x 2 (t)| 2 dt = b 2 (T −s) 3 3 . LetI(λ) := R T s b √ λ sin( √ λ(t−s)) 2 dt, λ> 0, we knowI(λ 1 ) = R T s |x 1 (t)| 2 dt, and I(λ) = − b 2 4λ 3/2 sin(2 √ λ(T −s))+ b 2 λ · T −s 2 = b 2 T −s 2λ − 1 4λ 3/2 sin(2 √ λ(T −s)) dI(λ) dλ = b 2 − T −s 2λ 2 + 3 8λ 5/2 sin(2 √ λ(T −s))− T −s 4λ 2 cos(2 √ λ(T −s)) = b 2 (T −s) 2λ 2 ( 3 2 · sin(2 √ λ(T −s)) 2 √ λ(T −s) − 1 2 cos(2 √ λ(T −s))−1 ) . Noting that function 3 2 · sinx x − 1 2 cosx−1≤ 0, ∀x≥ 0. 
Motivated by the Sturm-Picone comparison theorem, we now construct a theorem comparing the solutions in the sense of the $L^2$ norm. We first establish a lemma which serves as the inspiration for the main theorem.

Lemma 5.7.3. Let $q_1\ge q_2$ be two real numbers. Suppose $x$ and $y$ are solutions of the following linear second-order ODEs, respectively:
\[
x''(t)+q_1\,x(t)=0,\quad s<t\le T,\qquad y''(t)+q_2\,y(t)=0,\quad s<t\le T.
\]
If $x$ and $y$ share the same initial conditions
\[
x(s)=y(s)=0,\qquad x'(s)=y'(s),
\]
then
\[
\int_s^T x^2(t)\,dt\le\int_s^T y^2(t)\,dt.
\]

Proof. It suffices, and is straightforward, to examine the explicit solutions. Let $x_i(t)$, $i=1,2,3$, solve
\[
x_i''(t)+\lambda_i\,x_i(t)=0,\quad s<t\le T,\qquad x_i(s)=0,\quad x_i'(s)=b,
\]
where $\lambda_1>\lambda_2=0>\lambda_3$. Explicitly,
\[
x_1(t)=\frac{b}{\sqrt{\lambda_1}}\sin\big(\sqrt{\lambda_1}(t-s)\big),\qquad
x_2(t)=b(t-s),\qquad
x_3(t)=\frac{b}{2\sqrt{-\lambda_3}}\Big(e^{\sqrt{-\lambda_3}(t-s)}-e^{-\sqrt{-\lambda_3}(t-s)}\Big).
\]
Obviously,
\[
|x_1(t)|^2=\big|b(t-s)\big|^2\,\Big|\frac{\sin(\sqrt{\lambda_1}(t-s))}{\sqrt{\lambda_1}(t-s)}\Big|^2\le|b(t-s)|^2=|x_2(t)|^2,
\]
\[
|x_3(t)|^2=\frac{b^2}{-4\lambda_3}\Big\{e^{2\sqrt{-\lambda_3}(t-s)}+e^{-2\sqrt{-\lambda_3}(t-s)}-2\Big\}\ge b^2(t-s)^2=|x_2(t)|^2,
\]
where the second inequality is easily verified by the second derivative test. Thus
\[
\int_s^T|x_1(t)|^2dt\le\int_s^T|x_2(t)|^2dt\le\int_s^T|x_3(t)|^2dt,\qquad\text{where}\quad
\int_s^T|x_2(t)|^2dt=\frac{b^2(T-s)^3}{3}.
\]
It remains to compare solutions within each sign of the coefficient. Let
\[
I(\lambda):=\int_s^T\Big|\frac{b}{\sqrt{\lambda}}\sin\big(\sqrt{\lambda}(t-s)\big)\Big|^2dt,\qquad\lambda>0,
\]
so that $I(\lambda_1)=\int_s^T|x_1(t)|^2dt$, and
\[
I(\lambda)=-\frac{b^2}{4\lambda^{3/2}}\sin\big(2\sqrt{\lambda}(T-s)\big)+\frac{b^2}{\lambda}\cdot\frac{T-s}{2}
=b^2\Big\{\frac{T-s}{2\lambda}-\frac{1}{4\lambda^{3/2}}\sin\big(2\sqrt{\lambda}(T-s)\big)\Big\},
\]
\[
\frac{dI(\lambda)}{d\lambda}=b^2\Big\{-\frac{T-s}{2\lambda^2}+\frac{3}{8\lambda^{5/2}}\sin\big(2\sqrt{\lambda}(T-s)\big)-\frac{T-s}{4\lambda^2}\cos\big(2\sqrt{\lambda}(T-s)\big)\Big\}
=\frac{b^2(T-s)}{2\lambda^2}\Big\{\frac32\cdot\frac{\sin(2\sqrt{\lambda}(T-s))}{2\sqrt{\lambda}(T-s)}-\frac12\cos\big(2\sqrt{\lambda}(T-s)\big)-1\Big\}.
\]
Noting that
\[
\frac32\cdot\frac{\sin x}{x}-\frac12\cos x-1\le 0,\qquad\forall x\ge 0,
\]
with equality if and only if $x=0$ (the function is defined by continuity at $x=0$), we have
\[
\frac{dI(\lambda)}{d\lambda}\le 0,\qquad\forall\lambda>0,
\]
so $I(\lambda)$ is decreasing in $\lambda>0$.

Similarly, define
\[
J(\lambda)=\int_s^T\Big(\frac{b}{2\sqrt{-\lambda}}\,e^{\sqrt{-\lambda}(t-s)}-\frac{b}{2\sqrt{-\lambda}}\,e^{-\sqrt{-\lambda}(t-s)}\Big)^2dt,\qquad\lambda<0,
\]
so that $J(\lambda_3)=\int_s^T|x_3(t)|^2dt$, and
\[
J(\lambda)=\frac{b^2}{-4\lambda}\Big\{\frac{1}{2\sqrt{-\lambda}}\Big(e^{2\sqrt{-\lambda}(T-s)}-1\Big)+\frac{1}{2\sqrt{-\lambda}}\Big(1-e^{-2\sqrt{-\lambda}(T-s)}\Big)-2(T-s)\Big\}
=\frac{b^2}{-4\lambda}\Big\{\frac{1}{2\sqrt{-\lambda}}\Big(e^{2\sqrt{-\lambda}(T-s)}-e^{-2\sqrt{-\lambda}(T-s)}\Big)-2(T-s)\Big\},
\]
\[
\frac{dJ(\lambda)}{d\lambda}=\frac{3b^2}{16}(-\lambda)^{-5/2}\Big[e^{2\sqrt{-\lambda}(T-s)}-e^{-2\sqrt{-\lambda}(T-s)}\Big]
-\frac{b^2(T-s)}{8}(-\lambda)^{-2}\Big[e^{2\sqrt{-\lambda}(T-s)}+e^{-2\sqrt{-\lambda}(T-s)}\Big]-\frac{b^2(T-s)}{2\lambda^2}
\]
\[
=\frac{b^2(T-s)}{8\lambda^2}\Big\{\frac32\cdot\frac{e^{2\sqrt{-\lambda}(T-s)}-e^{-2\sqrt{-\lambda}(T-s)}}{\sqrt{-\lambda}\,(T-s)}-\Big(e^{2\sqrt{-\lambda}(T-s)}+e^{-2\sqrt{-\lambda}(T-s)}+4\Big)\Big\}.
\]
Noting that
\[
\frac{3(e^x-e^{-x})}{x}-\big(e^x+e^{-x}+4\big)\le 0,\qquad\forall x\ge 0,
\]
with equality if and only if $x=0$ (again, the function is defined by continuity at $x=0$), we have
\[
\frac{dJ(\lambda)}{d\lambda}\le 0,\qquad\forall\lambda<0,
\]
so $J(\lambda)$ is decreasing in $\lambda<0$. This completes the proof. □
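Lemma 5.7.3 is easy to probe numerically. In the sketch below (an illustration with arbitrarily chosen $s$, $T$, $b$), the energy $\int_s^T x^2\,dt$ of the solution of $x''+qx=0$, $x(s)=0$, $x'(s)=b$, is computed for several constants $q$ and observed to be decreasing in $q$, as the lemma asserts.

```python
# Numerical check of Lemma 5.7.3: the L^2 energy is decreasing in the
# constant coefficient q (illustration only; s, T, b are assumed values).
import numpy as np
from scipy.integrate import solve_ivp, trapezoid

s, T, b = 0.0, 2.0, 1.0
t = np.linspace(s, T, 4001)

def energy(q):
    sol = solve_ivp(lambda tt, z: [z[1], -q * z[0]], [s, T], [0.0, b],
                    t_eval=t, rtol=1e-10, atol=1e-12)
    return trapezoid(sol.y[0]**2, t)

qs = [-4.0, -1.0, 0.0, 1.0, 4.0, 9.0]
vals = [energy(q) for q in qs]
for q, val in zip(qs, vals):
    print(f"q = {q:5.1f}   int x^2 dt = {val:10.4f}")
assert all(vals[i] >= vals[i + 1] for i in range(len(vals) - 1))  # decreasing in q
```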
Theorem 5.7.4. Let $p(t)$, $q(t)$ be piecewise continuous real-valued functions defined on a given interval $[s,T]$. Suppose $x$ and $y$ are solutions of the following linear second-order ODEs, respectively:
\[
x''(t)+p(t)x(t)=0,\quad s<t\le T,\qquad y''(t)+q(t)y(t)=0,\quad s<t\le T.
\]
If the following conditions are satisfied:
1. $p(t)\le q(t)$ for all $t\in[s,T]$;
2. $x(s)=y(s)$, $x'(s)=y'(s)$;
3. $x(t)$, $y(t)$ do not have zeros on the interval $(s,T]$;
then
\[
x^2(t)\ge y^2(t),\quad\forall t\in[s,T],\qquad\text{and consequently}\qquad
\int_s^T x^2(t)\,dt\ge\int_s^T y^2(t)\,dt.
\]

Proof. We prove this using the calculus of variations. First, consider $p(t), q(t)\in C^\infty(s,T)$. Join $p(t)$ and $q(t)$ by the path
\[
p_\epsilon(t)=p(t)+\epsilon\,[q(t)-p(t)]=p(t)+\epsilon f(t),\qquad 0\le\epsilon\le 1,
\]
where $f(t)=q(t)-p(t)\ge 0$. For any $0\le\epsilon\le 1$, let $x_\epsilon(t)$ be the solution of the ODE
\[
x_\epsilon''(t)+p_\epsilon(t)x_\epsilon(t)=0,\quad s<t\le T,\qquad x_\epsilon(s)=a,\quad x_\epsilon'(s)=b,\tag{5.40}
\]
where $a:=x(s)=y(s)$, $b:=x'(s)=y'(s)$, and the notation $'$ and $''$ refers to differentiation in $t$. Note that $x_\epsilon|_{\epsilon=0}=x$ and $x_\epsilon|_{\epsilon=1}=y$. Sending $\epsilon\to 0^+$, we have $p_\epsilon\to p$ in sup norm, hence $x_\epsilon\to x$ in sup norm and in $L^2$; sending $\epsilon\to 1^-$, we have $p_\epsilon\to q$, hence $x_\epsilon\to y$ in sup norm and in $L^2$. Therefore $x_\epsilon$, $0\le\epsilon\le 1$, is a path joining $x(t)$ and $y(t)$.

Define
\[
\Phi(\epsilon):=\int_s^T|x_\epsilon(t)|^2dt,\qquad 0\le\epsilon\le 1.
\]
We want to show that $\Phi(0)\ge\Phi(1)$; it suffices to show $d\Phi(\epsilon)/d\epsilon\le 0$ for all $\epsilon\in[0,1]$. We have
\[
\frac{d}{d\epsilon}\Phi(\epsilon)=\int_s^T 2x_\epsilon(t)\,\frac{dx_\epsilon(t)}{d\epsilon}\,dt.\tag{5.41}
\]
Differentiating both sides of equation (5.40) in $\epsilon$,
\[
\Big(\frac{d}{d\epsilon}x_\epsilon(t)\Big)''+f(t)x_\epsilon(t)+p_\epsilon(t)\,\frac{d}{d\epsilon}x_\epsilon(t)=0.\tag{5.42}
\]
Define $u_\epsilon(t):=\frac{d}{d\epsilon}x_\epsilon(t)$; then (5.42) becomes
\[
u_\epsilon''(t)+p_\epsilon(t)u_\epsilon(t)=-f(t)x_\epsilon(t),\tag{5.43}
\]
an inhomogeneous ODE whose corresponding homogeneous equation,
\[
x_\epsilon''(t)+p_\epsilon(t)x_\epsilon(t)=0,\tag{5.44}
\]
has solution $x_\epsilon(t)$. We can therefore use reduction of order to solve for $u_\epsilon$. Let $u_\epsilon(t)=x_\epsilon(t)v_\epsilon(t)$, keeping in mind that $'$ and $''$ refer to $t$. Equation (5.43) becomes
\[
x_\epsilon''v_\epsilon+2x_\epsilon'v_\epsilon'+x_\epsilon v_\epsilon''+p_\epsilon x_\epsilon v_\epsilon=-f\,x_\epsilon,
\]
and, noticing that $x_\epsilon''v_\epsilon+p_\epsilon x_\epsilon v_\epsilon=v_\epsilon(x_\epsilon''+p_\epsilon x_\epsilon)=0$,
\[
2x_\epsilon'v_\epsilon'+x_\epsilon v_\epsilon''=-f\,x_\epsilon
\;\Longrightarrow\;
2x_\epsilon x_\epsilon'v_\epsilon'+x_\epsilon^2v_\epsilon''=-f\,x_\epsilon^2
\;\Longrightarrow\;
\big(x_\epsilon^2v_\epsilon'\big)'=-f\,x_\epsilon^2.
\]
Hence
\[
x_\epsilon^2(t)v_\epsilon'(t)=-\int_s^t f(r)x_\epsilon^2(r)\,dr+C_1,\qquad
v_\epsilon(t)=\int_s^t\frac{-\int_s^\tau f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau+\int_s^t\frac{C_1}{x_\epsilon^2(\tau)}\,d\tau+C_2,
\]
and thus
\[
u_\epsilon(t)=x_\epsilon(t)\int_s^t\frac{-\int_s^\tau f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau
+C_1\,x_\epsilon(t)\int_s^t\frac{d\tau}{x_\epsilon^2(\tau)}+C_2\,x_\epsilon(t).\tag{5.45}
\]
The initial conditions for $u_\epsilon$ are
\[
u_\epsilon(s)=\frac{d}{d\epsilon}x_\epsilon(t)\Big|_{t=s}=\frac{d}{d\epsilon}a=0,\tag{5.46}
\]
\[
u_\epsilon'(s)=\Big(\frac{d}{d\epsilon}x_\epsilon(t)\Big)'\Big|_{t=s}=\frac{d}{d\epsilon}x_\epsilon'(s)=\frac{d}{d\epsilon}b=0.\tag{5.47}
\]
Note also that $x_\epsilon(s)=a$, $x_\epsilon'(s)=b$, and $x_\epsilon''(s)=-p_\epsilon(s)x_\epsilon(s)=-p_\epsilon(s)a$.

If $a\neq 0$, all the integrals in (5.45) are well defined. Otherwise, if $a=0$, further analysis is needed to see whether those integrals are legitimate. Suppose now that $a=0$. The Taylor expansion of $x_\epsilon(t)$ around $t=s$ is then of the form
\[
x_\epsilon(t)=b(t-s)+a_3(t-s)^3+a_4(t-s)^4+\cdots
\]
for some constants $a_3,a_4,\dots$ (the quadratic term vanishes because $x_\epsilon''(s)=-p_\epsilon(s)a=0$). Consequently (the constants $a_i$ below differ from line to line),
\[
x_\epsilon^2(t)=b^2(t-s)^2+a_4(t-s)^4+\cdots,\qquad
f(t)x_\epsilon^2(t)=a_2(t-s)^2+a_3(t-s)^3+\cdots,
\]
\[
\int_s^t-f(r)x_\epsilon^2(r)\,dr=a_3(t-s)^3+a_4(t-s)^4+\cdots,\qquad
\frac{\int_s^t-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(t)}=a_1(t-s)+a_2(t-s)^2+\cdots,
\]
\[
\int_s^t\frac{\int_s^\tau-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau=a_2(t-s)^2+a_3(t-s)^3+\cdots,\qquad
x_\epsilon(t)\int_s^t\frac{\int_s^\tau-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau=a_3(t-s)^3+a_4(t-s)^4+\cdots.
\]
Hence
\[
x_\epsilon(t)\int_s^t\frac{\int_s^\tau-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau\bigg|_{t=s}=0,\qquad
\bigg(x_\epsilon(t)\int_s^t\frac{\int_s^\tau-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau\bigg)'\bigg|_{t=s}=0.
\]
Combining this with (5.45), (5.46) and (5.47), we conclude that the constants $C_1$, $C_2$ in (5.45) are both $0$, since the first term in (5.45) already satisfies the initial conditions for $u_\epsilon(t)$. Therefore
\[
\frac{d}{d\epsilon}x_\epsilon(t)=u_\epsilon(t)=x_\epsilon(t)\int_s^t\frac{\int_s^\tau-f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau,
\]
\[
\frac{d}{d\epsilon}\Phi(\epsilon)=\int_s^T 2x_\epsilon(t)\frac{dx_\epsilon(t)}{d\epsilon}\,dt
=-2\int_s^T x_\epsilon^2(t)\int_s^t\frac{\int_s^\tau f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau\,dt.
\]
Since $f(t)\ge 0$ on $[s,T]$, the integrand satisfies
\[
\frac{d}{d\epsilon}|x_\epsilon(t)|^2=2x_\epsilon(t)\frac{dx_\epsilon(t)}{d\epsilon}\le 0,\qquad\forall t\in[s,T],\ \forall\epsilon\in[0,1].
\]
Hence $d\Phi(\epsilon)/d\epsilon\le 0$ for all $\epsilon\in[0,1]$. Then $x^2(t)\ge y^2(t)$ for all $t\in[s,T]$, which also implies $\int_s^T x^2(t)\,dt\ge\int_s^T y^2(t)\,dt$.

Now suppose $p(t)$, $q(t)$ are only piecewise continuous. There exist sequences of $C^\infty(s,T)$ functions $\{p_n(t)\}_{n=1}^\infty$ and $\{q_n(t)\}_{n=1}^\infty$ with
\[
p_n(t)\to p(t),\qquad q_n(t)\to q(t),\qquad\text{as } n\to\infty,
\]
in sup norm. Without loss of generality, we can also assume $p_n(t)\le q_n(t)$ for all $t\in[s,T]$ and all $n$. Let $x_n(t)$, $y_n(t)$ be the solutions of the linear ODEs
\[
x_n''(t)+p_n(t)x_n(t)=0,\quad s<t\le T,\qquad y_n''(t)+q_n(t)y_n(t)=0,\quad s<t\le T,
\]
with the same initial conditions as $x(t)$, $y(t)$, i.e.
\[
x_n(s)=y_n(s)=a:=x(s)=y(s),\qquad x_n'(s)=y_n'(s)=b:=x'(s)=y'(s).
\]
By the basic theory of ordinary differential equations, $x_n\to x$ and $y_n\to y$ in sup norm as $n\to\infty$. By the preceding argument,
\[
|x_n(t)|^2\ge|y_n(t)|^2,\ \forall t\in[s,T],\ \forall n;\qquad
\int_s^T|x_n(t)|^2dt\ge\int_s^T|y_n(t)|^2dt,\ \forall n.
\]
Sending $n\to\infty$,
\[
|x(t)|^2\ge|y(t)|^2,\ \forall t\in[s,T];\qquad
\int_s^T|x(t)|^2dt\ge\int_s^T|y(t)|^2dt.\qquad\square
\]

Remark. The condition that '$x(t)$, $y(t)$ do not have zeros on the interval $(s,T]$' is necessary for defining some of the integrals in the proof.
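The following sketch illustrates Theorem 5.7.4 on one concrete pair of coefficients (my own choice, not from the text): $p(t)=-1\le q(t)=\sin^2 t$ on $[0,1.5]$, with $x(s)=y(s)=1$ and $x'(s)=y'(s)=0$. Here $x(t)=\cosh t$, and $y$ stays positive on the interval, so condition 3 holds; the pointwise comparison $x^2\ge y^2$ is then confirmed numerically.

```python
# Numerical check of Theorem 5.7.4 with assumed coefficients p <= q.
import numpy as np
from scipy.integrate import solve_ivp

s, T = 0.0, 1.5
t = np.linspace(s, T, 2001)

def solve(coef):
    rhs = lambda tt, z: [z[1], -coef(tt) * z[0]]
    return solve_ivp(rhs, [s, T], [1.0, 0.0], t_eval=t, rtol=1e-10).y[0]

x = solve(lambda tt: -1.0)           # p(t) = -1, so x(t) = cosh(t)
y = solve(lambda tt: np.sin(tt)**2)  # q(t) = sin(t)^2 >= p(t)

assert (np.abs(x) > 0).all() and (np.abs(y) > 0).all()  # no zeros on [s, T]
assert (x**2 >= y**2 - 1e-12).all()                     # pointwise comparison
print("max of y^2 - x^2:", float((y**2 - x**2).max()))  # <= 0 up to roundoff
```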
It is natural to think about extending the results of Lemma 5.7.3 and Theorem 5.7.4 to more general cases. However, so far I do not have a solid proof, even though I have found no counterexample. I therefore present the following claim only as a conjecture.

Conjecture. Let $p(t)$, $q(t)$ be piecewise continuous real-valued functions defined on a given interval $[s,T]$. Suppose $x$ and $y$ are solutions of the following linear second-order ODEs, respectively:
\[
x''(t)+p(t)x(t)=0,\quad s<t\le T,\qquad y''(t)+q(t)y(t)=0,\quad s<t\le T.
\]
If $x$ and $y$ share the same initial conditions
\[
x(s)=y(s)=0,\qquad x'(s)=y'(s),
\]
and moreover $p(t)\le q(t)$ for all $t\in[s,T]$, then
\[
\int_s^T x^2(t)\,dt\ge\int_s^T y^2(t)\,dt.
\]

Remark. If one of $x(t)$, $y(t)$ has zeros on $(s,T]$, then $x_\epsilon(t)$ will have zeros on $(s,T]$ for some $\epsilon$, and the integral
\[
\int_s^t\frac{-\int_s^\tau f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(\tau)}\,d\tau
\]
may not be well defined. Instead, one can show that in this case $u_\epsilon(t)=x_\epsilon(t)\,[h_\epsilon(t)-h_\epsilon(s)]$, where $h_\epsilon(t)$ is any function such that
\[
h_\epsilon'(t)=\frac{-\int_s^t f(r)x_\epsilon^2(r)\,dr}{x_\epsilon^2(t)}.
\]
This $h_\epsilon(t)$ itself may not be continuous on $[s,T]$, but $x_\epsilon(t)h_\epsilon(t)$ is well defined and smooth enough. However, the strong conclusion,
\[
\frac{d}{d\epsilon}|x_\epsilon(t)|^2=2x_\epsilon(t)\frac{dx_\epsilon(t)}{d\epsilon}\le 0,\qquad\forall t\in[s,T],\ \forall\epsilon\in[0,1],
\]
may fail, which makes the weak conclusion, $\frac{d}{d\epsilon}\Phi(\epsilon)\le 0$ for all $\epsilon\in[0,1]$, much harder to prove.

Remark. The condition '$x(s)=y(s)=0$' seems necessary: consider $x_\epsilon''(t)+\epsilon x_\epsilon(t)=0$ with $x_\epsilon(0)=1$, $x_\epsilon'(0)=0$.
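While I have no proof of the conjecture, it can at least be probed numerically. The experiment below (my own, and of course not a proof) draws random piecewise-constant pairs $p\le q$ on $[0,3]$, solves both equations with $x(0)=0$, $x'(0)=1$, and counts violations of the conjectured inequality; none are expected.

```python
# Numerical probe of the conjecture (an experiment, not a proof).
import numpy as np
from scipy.integrate import solve_ivp, trapezoid

rng = np.random.default_rng(0)
s, T = 0.0, 3.0
t = np.linspace(s, T, 3001)

def energy(vals, edges):
    # piecewise-constant coefficient: vals[i] on [edges[i], edges[i+1])
    coef = lambda tt: vals[min(np.searchsorted(edges, tt, side='right') - 1,
                               len(vals) - 1)]
    rhs = lambda tt, z: [z[1], -coef(tt) * z[0]]
    x = solve_ivp(rhs, [s, T], [0.0, 1.0], t_eval=t, rtol=1e-9,
                  max_step=0.01).y[0]
    return trapezoid(x**2, t)

violations = 0
for _ in range(200):
    edges = np.sort(np.append(rng.uniform(s, T, 4), s))
    p_vals = rng.uniform(-5.0, 5.0, 5)
    q_vals = p_vals + rng.uniform(0.0, 3.0, 5)   # guarantees p <= q piecewise
    ep, eq = energy(p_vals, edges), energy(q_vals, edges)
    if ep < eq - 1e-6 * max(1.0, abs(eq)):       # tolerance for roundoff
        violations += 1
print("violations of the conjectured inequality in 200 trials:", violations)
# expected: 0, consistent with the conjecture
```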
5.8 Other Possible Approaches

As explained in the previous sections, we need the higher moments of $u_k(t)$, such as $E|u_k(t)|^2$, $E|u_k(t)|^4$, etc. Since we cannot compute them directly, we try to analyze them asymptotically as $k\to\infty$.

Approach 1: Consider
\[
y''(t)+p(t)y'(t)+k^2q(t)y(t)=0,\qquad k=1,2,3,\dots\tag{5.48}
\]
with $p(t)$, $q(t)$ both continuous on $[0,T]$ (pathwise continuous if they are random processes).

Question: can we find functions $x(t)$, $z(t)$ such that, for every $t\in[0,T]$, $P$-a.s.,
\[
x(t)\le y_k(t)\le z(t)\quad\text{for sufficiently large }k,
\]
where $y_k(t)$ is a solution of equation (5.48) with index $k$?

To the author's best knowledge, there is no useful reference for this question. A reasonable starting point, however, is the "control" of $q(t)$ (we can choose an appropriate $\theta(t)$ to study). For example:

1) Consider $dq(t)=aq(t)\,dt+b\,dV(t)$, and choose $q(0)$ appropriately so that $\int_0^t q(s)\,ds>0$ for $t\in[0,T]$.

2) For each $k$, consider
\[
dq_k(t)=aq_k(t)\,dt+b_kq_k(t)\,dV(t)
\]
with a sequence $b_k\to 0$ as $k\to\infty$. Choose a suitable initial condition $q_k(0)=q(0)>0$ for all $k\ge 1$, such that for every $0\le t\le T$, $\int_0^t q_k(s)\,ds>0$ for $k\ge K(t)$.

Approach 2: "Freeze" the equation. Note that if $p(t)\equiv p$ and $q(t)\equiv q$ are constant, then $R(t,s)=\phi(t-s)$ for some $\phi(t)$ satisfying
\[
\phi''(t)+p\phi'(t)+q\phi(t)=0,\qquad \phi(0)=0,\ \phi'(0)=1,
\]
and, moreover, $\phi(t)$ can be expressed explicitly.

Let us make a partition $0=t_0<t_1<\dots<t_n=T$. On $[t_j,t_{j+1}]$, consider $y_j(t)$ to be the solution of
\[
y_j''(t)+p(t_j)y_j'(t)+q(t_j)y_j(t)=\frac{dW(t)}{dt},\quad t_j<t\le t_{j+1},\ j=0,\dots,n-1,
\]
\[
y_j(t_j)=y_{j-1}(t_j),\quad y_j'(t_j)=y_{j-1}'(t_j),\quad j=1,\dots,n-1,
\]
\[
y_0(t_0)=0,\quad y_0'(t_0)=0.
\]
Denote by $\varphi_{j,1}$ and $\varphi_{j,2}$ the fundamental solutions of
\[
x''(t)+p(t_j)x'(t)+q(t_j)x(t)=0,\quad t_j<t\le t_{j+1},\ j=0,\dots,n-1,
\]
with
\[
\varphi_{j,1}(t_j)=0,\quad\varphi_{j,1}'(t_j)=1;\qquad
\varphi_{j,2}(t_j)=1,\quad\varphi_{j,2}'(t_j)=0;\qquad j=0,\dots,n-1.
\]
The Wronskian of $\varphi_{j,1}$ and $\varphi_{j,2}$ is then
\[
Q_j(t):=Q_j[\varphi_{j,1},\varphi_{j,2}](t)=Q_j(t_j)\,e^{-\int_{t_j}^t p(t_j)\,ds}=-e^{-p(t_j)(t-t_j)},\qquad t_j\le t\le t_{j+1},
\]
and
\[
R_j(t,s)=\frac{\varphi_{j,1}(s)\varphi_{j,2}(t)-\varphi_{j,1}(t)\varphi_{j,2}(s)}{Q_j(s)}=\phi_j(t-s),\qquad t_j\le s\le t\le t_{j+1}.
\]
Define $Y_n(t)=\sum_{j=0}^{n-1}y_j(t)\chi_{[t_j,t_{j+1})}(t)+y_{n-1}(T)\chi_{\{T\}}(t)$, $0\le t\le T$. Then it is reasonable to expect that
\[
Y_n(t)\to y(t),\qquad\text{as }n\to\infty,
\]
and
\[
E|y(t)|^2=\lim_{n\to\infty}\Big\{\sum_{j=0}^{n-1}E|y_j(t)|^2\chi_{[t_j,t_{j+1})}(t)+E|y_{n-1}(T)|^2\chi_{\{T\}}(t)\Big\}.
\]
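The freezing idea is easy to test by Monte Carlo. In the sketch below (the coefficients $p$, $q$ and all constants are stand-ins chosen for illustration, not the model studied above), the second moment $E|y(T)|^2$ computed with coefficients frozen at the left endpoints of a coarse partition is compared against the fully time-dependent simulation; the two should agree as the partition is refined, up to Monte Carlo error.

```python
# Monte Carlo sketch of the "freezing" approach for y'' + p(t)y' + q(t)y = dW/dt.
import numpy as np

rng = np.random.default_rng(1)
T, n_coarse, n_fine, n_paths = 1.0, 20, 2000, 5000
dt = T / n_fine
p = lambda t: 0.5 + 0.2 * np.sin(2 * np.pi * t)
q = lambda t: 4.0 + np.cos(2 * np.pi * t)

def second_moment(freeze):
    y = np.zeros(n_paths)
    yp = np.zeros(n_paths)
    for i in range(n_fine):
        t = i * dt
        # with freezing, coefficients are evaluated at the left endpoint
        # of the coarse cell containing t
        tc = np.floor(t * n_coarse / T) * T / n_coarse if freeze else t
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        # Euler-Maruyama step for (y, y'); the noise enters the y' component,
        # and the simultaneous assignment uses the pre-update values
        y, yp = y + yp * dt, yp + (-p(tc) * yp - q(tc) * y) * dt + dw
    return np.mean(y**2)

print("frozen:", second_moment(True), " time-dependent:", second_moment(False))
# the two estimates should agree as n_coarse grows, up to Monte Carlo error
```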
Chapter 6

MORE DIAGONALIZABLE STOCHASTIC EQUATIONS

As before, let $(\Omega,\mathcal F,\{\mathcal F_t\}_{0\le t\le T},P)$ be a filtered probability space satisfying the usual hypotheses. In this part, we investigate some other models involving an unknown parameter process that is to be estimated. For simplicity, we confine ourselves to the time-space domain $[0,T]\times[0,\pi]$ and use the CONS $\{h_k(x)=\sqrt{2/\pi}\,\sin(kx),\ k\ge1\}$ of $L^2([0,\pi])$. Let $\theta=(\theta(t),\mathcal F_t)$, $0\le t\le T$, be an unobservable process satisfying the stochastic differential equation
\[
d\theta(t)=a(t)\theta(t)\,dt+b(t)\,dV(t),\tag{6.1}
\]
where $V=(V(t),\mathcal F_t)$, $0\le t\le T$, is a Wiener process. The assumptions on $a(t)$, $b(t)$ and the initial condition $\theta(0)$ will be specified later, when we present the filtering results. (For now, let us assume they guarantee the existence of a unique strong solution $\theta(t)$.)

6.1 The One-dimension Stochastic Parabolic Equation

Consider the stochastic equation
\[
du(t,x)=\big(u_{xx}(t,x)+\theta(t)u(t,x)\big)dt+dW(t,x),\quad 0\le t\le T,\ x\in(0,\pi),\tag{6.2}
\]
with zero boundary conditions and a given initial condition, where $dW(t,x)$ is the noise term: as explained earlier,
\[
dW(t,x)=\sum_{k\ge1}h_k(x)\,dw_k(t),\tag{6.3}
\]
where $w_k=(w_k(t),\mathcal F_t)$, $0\le t\le T$, are independent standard Brownian motions. Looking for the solution of (6.2) as a Fourier series
\[
u(t,x)=\sum_{k\ge1}u_k(t)h_k(x)\tag{6.4}
\]
suggests that each $u_k(t)$, $k\ge1$, should satisfy
\[
du_k(t)=\big(-k^2u_k(t)+\theta(t)u_k(t)\big)dt+dw_k(t),\quad 0<t\le T,\tag{6.5}
\]
with initial condition $u_k(0)$ equal to the corresponding Fourier coefficient in
\[
u(0,x)=\sum_{k=1}^{\infty}u_k(0)h_k(x),\quad x\in[0,\pi].
\]
Direct computation with the help of Ito's formula leads to the explicit expression
\[
u_k(t)=e^{-k^2t}\Big(u_k(0)\,e^{\int_0^t\theta(r)dr}+\int_0^t e^{k^2s}\,e^{\int_s^t\theta(r)dr}\,dw_k(s)\Big).\tag{6.6}
\]
Assume that $\theta(0)$ and $V=(V(t),\mathcal F_t)$, $0\le t\le T$, are independent of the family of standard Brownian motions $w_k=(w_k(t),\mathcal F_t)$, $0\le t\le T$, $k=1,2,3,\dots$; since the solution of equation (6.1) is the Gaussian process
\[
\theta(t)=e^{\int_0^t a(r)dr}\Big(\theta(0)+\int_0^t e^{-\int_0^s a(r)dr}\,b(s)\,dV(s)\Big),\tag{6.7}
\]
the process $\theta$ is then independent of $w_k$, $k=1,2,\dots$. Furthermore, assume that the initial conditions $(\theta(0),u_1(0),\dots,u_N(0))$ are independent of the Wiener processes $V$ and $w_k$, $k=1,\dots,N$, for any integer $N$.

Taking expectations in
\[
u_k^2(t)=e^{-2k^2t}\Big\{u_k^2(0)\,e^{2\int_0^t\theta(r)dr}
+2u_k(0)\,e^{\int_0^t\theta(r)dr}\int_0^t e^{k^2s}e^{\int_s^t\theta(r)dr}dw_k(s)
+\Big(\int_0^t e^{k^2s}e^{\int_s^t\theta(r)dr}dw_k(s)\Big)^2\Big\},
\]
we obtain
\[
E[u_k^2(t)]=e^{-2k^2t}\Big\{E\big[u_k^2(0)\,e^{2\int_0^t\theta(r)dr}\big]+\int_0^t e^{2k^2s}\,E\big[e^{2\int_s^t\theta(r)dr}\big]\,ds\Big\}.
\]
Note that $u_k(0)$ may depend on $\theta(0)$; by Cauchy-Schwarz,
\[
E\big[u_k^2(0)\,e^{2\int_0^t\theta(r)dr}\big]\le\sqrt{E|u_k(0)|^4}\,\sqrt{E\big[e^{4\int_0^t\theta(r)dr}\big]}.
\]
Since $\theta=(\theta(t),\mathcal F_t)$, $0\le t\le T$, is Gaussian, $\psi_s^t:=\int_s^t\theta(r)dr$, $0\le s<t\le T$, is Gaussian as well, so that
\[
E\big[e^{2\psi_s^t}\big]=e^{2\mu(s,t)+2\sigma^2(s,t)},\qquad
E\big[e^{4\psi_0^t}\big]=e^{4\mu(0,t)+8\sigma^2(0,t)},
\]
where
\[
\mu(s,t):=E[\psi_s^t]=\int_s^t E[\theta(r)]\,dr=E[\theta(0)]\int_s^t e^{\int_0^r a(\tau)d\tau}dr,
\]
\[
\sigma^2(s,t)=E\big[(\psi_s^t)^2\big]-\big(E\psi_s^t\big)^2=f\big(E\theta(0),E|\theta(0)|^2,a_{[0,t]},b_{[0,t]}\big).
\]
If we assume that $a(t)$, $b(t)$ are both measurable and bounded functions on $[0,T]$, then $\mu(s,t)$ and $\sigma^2(s,t)$ are bounded for all $0\le s<t\le T$. Hence, for $0\le t\le T$,
\[
E[u_k^2(t)]\le C\Big\{e^{-2k^2t}\sqrt{E|u_k(0)|^4}+\frac{1-e^{-2k^2t}}{2k^2}\Big\}
\le C\Big\{e^{-2k^2t}\sqrt{E|u_k(0)|^4}+\frac{1}{2k^2}\Big\},\tag{6.8}
\]
where
\[
C=\max\Big\{\sup_{0\le t\le T}e^{2\mu(0,t)+4\sigma^2(0,t)},\ \sup_{0\le s\le t\le T}e^{2\mu(s,t)+2\sigma^2(s,t)}\Big\},
\]
i.e. $C$ depends on $\theta(0)$ and the functions $a_{[0,T]}$, $b_{[0,T]}$.

Now, summarizing these conditions, we are ready to state our first result about the following multi-channel filtering problem:
\[
d\theta(t)=a(t)\theta(t)\,dt+b(t)\,dV(t),
\]
\[
du_k(t)=\big(-k^2u_k(t)+u_k(t)\theta(t)\big)dt+dw_k(t),\quad k=1,\dots,N,\tag{6.9}
\]
where $\theta=(\theta(t),\mathcal F_t)$, $0\le t\le T$, is not accessible for observation; the value of $\theta(t)$ is to be estimated on the basis of the observed processes $u_{k,[0,t]}:=\{u_k(s),\ 0\le s\le t\}$, $k=1,2,\dots,N$.

Theorem 6.1.1. Consider the multi-channel filtering problem (6.9) under the following assumptions:
(H1) the functions $a=a(t)$ and $b=b(t)$ are measurable and bounded on $[0,T]$;
(H2) the Wiener processes $V$ and $w_k$, $k=1,\dots,N$, are independent;
(H3) the initial conditions $(\theta(0),u_1(0),\dots,u_N(0))$ are independent of the Wiener processes $V$ and $w_k$, $k=1,\dots,N$;
(H4) $E\big(|\theta(0)|^4+\sum_{k=1}^N|u_k(0)|^4\big)<\infty$;
(H5) the conditional distribution of $\theta(0)$ given $u_1(0),\dots,u_N(0)$ is $P$-a.s. Gaussian.
Then the conditional distribution of $\theta(t)$ given
\[
\mathcal F^u_{N,t}=\sigma\big(u_k(s),\ k=1,\dots,N,\ 0\le s\le t\big)
\]
is $P$-a.s. Gaussian with parameters
\[
m_N(t)=E\big[\theta(t)\,\big|\,\mathcal F^u_{N,t}\big],\qquad
\gamma_N(t)=E\big[(\theta(t)-m_N(t))^2\,\big|\,\mathcal F^u_{N,t}\big].
\]
The functions $m_N(t)$ and $\gamma_N(t)$ satisfy the following system of equations:
\[
dm_N(t)=a(t)m_N(t)\,dt+\gamma_N(t)\Big\{\sum_{k=1}^N u_k(t)\,du_k(t)+\sum_{k=1}^N\big(k^2-m_N(t)\big)u_k^2(t)\,dt\Big\},
\]
\[
\dot\gamma_N(t)=2a(t)\gamma_N(t)+b^2(t)-\gamma_N^2(t)\sum_{k=1}^N u_k^2(t),\tag{6.10}
\]
with the initial conditions
\[
m_N(0)=E[\theta(0)\,|\,u_1(0),\dots,u_N(0)],\qquad
\gamma_N(0)=E[(\theta(0)-m_N(0))^2\,|\,u_1(0),\dots,u_N(0)].
\]

Proof. Note that $\theta=(\theta(t),\mathcal F_t)$, $0\le t\le T$, is a Gaussian process independent of all $w_k$, with the representation (6.7); also, $u_k(t)$ is described explicitly by formula (6.6). As in Chapter 4, one can easily verify that $E|\theta(t)|^4$ and $E|u_k(t)|^4$ both exist and are continuous functions of $t$, so by the Cauchy-Schwarz inequality,
\[
\int_0^T E|u_k(t)\theta(t)|^2\,dt<\infty.
\]
The result then follows from Theorem 8.1 in [14] and Theorems 12.6 and 12.7 in [15]. □
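A minimal simulation sketch of the filter (6.10) may clarify how the system is used in practice (illustration only; the constants $a$, $b$, $\sigma$, the number of channels and the step size are my own choices). The signal (6.1) and the channels (6.5) are generated by Euler-Maruyama, and $m_N$, $\gamma_N$ are advanced by the same scheme, using the increments $u_k\,du_k$ from the simulated paths.

```python
# Euler-Maruyama simulation of the multi-channel filter (6.10); sketch only.
import numpy as np

rng = np.random.default_rng(2)
T, n, N = 1.0, 20000, 10
dt = T / n
a, b, sigma = -0.5, 0.3, 1.0

theta = rng.normal(0.0, sigma)          # theta(0) ~ N(0, sigma^2), cf. (H5)
u = rng.normal(0.0, 0.1, N)             # u_k(0), independent of theta(0)
k2 = (np.arange(1, N + 1) ** 2).astype(float)
m, g = 0.0, sigma**2                    # m_N(0), gamma_N(0)

for _ in range(n):
    dw = rng.normal(0.0, np.sqrt(dt), N)
    du = (-k2 * u + theta * u) * dt + dw                        # channels (6.5)
    theta += a * theta * dt + b * rng.normal(0.0, np.sqrt(dt))  # signal (6.1)
    # filter equations (6.10), with the pre-update u and the observed du
    m += a * m * dt + g * (np.sum(u * du) + np.sum((k2 - m) * u**2) * dt)
    g += (2 * a * g + b**2 - g**2 * np.sum(u**2)) * dt
    u += du

print("theta(T) =", round(theta, 4), " m_N(T) =", round(m, 4),
      " gamma_N(T) =", round(g, 6))
```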
Proposition 6.1.2. If there exist $\alpha>0$, $p\in[0,1)$ and some integer $K\ge1$ such that
\[
E|u_k(0)|^4\le e^{\alpha k^{1+p}},\qquad\forall k\ge K,\tag{6.11}
\]
then there exists a unique solution of equation (6.2) in $L^2((0,T);\Omega\times(0,\pi))$. Moreover, under condition (6.11), the optimal filtering variance $\gamma_N(t)$ does not converge to zero as $N\to\infty$.

Proof. With the Fourier representation (6.4) of $u(t,x)$,
\[
u(t,x)=\sum_{k\ge1}u_k(t)h_k(x),
\]
each $u_k(t)=(u,h_k)_{L^2((0,\pi))}(t)$ satisfies equation (6.5), and (6.5) has a unique solution given by (6.6). According to the computation of $E[u_k^2(t)]$ and inequality (6.8),
\[
E\|u\|^2_{L^2((0,\pi))}(t)=\sum_{k=1}^{\infty}E[u_k^2(t)]\le\sum_{k=1}^{\infty}C\Big(e^{-2k^2t}\sqrt{E|u_k(0)|^4}+\frac{1}{2k^2}\Big),
\]
where the constant $C$ depends on $\theta(0)$ and the functions $a(t)$, $b(t)$. Since, for any $t>0$, with $p\in[0,1)$ and $\alpha>0$,
\[
\sum_{k\ge K}C\,e^{-2k^2t}\sqrt{E|u_k(0)|^4}
\le\sum_{k\ge K}C\,e^{-2k^2t}\,e^{\frac{\alpha}{2}k^{1+p}}
\le C\sum_{k\ge K}e^{k^{1+p}\left(\frac{\alpha}{2}-2t\,k^{1-p}\right)}<\infty,
\]
we conclude that
\[
E\|u\|^2_{L^2((0,\pi))}(t)<\infty,\qquad\forall t\in(0,T].
\]
Also, $\sum_{k=1}^{\infty}E[u_k^2(t)]<\infty$ for all $t\in(0,T]$ implies
\[
P\Big(\sum_{k=1}^{\infty}u_k^2(t)<\infty\Big)=1,\qquad\forall t\in(0,T].
\]
It then follows from the asymptotic results about the Riccati equation in Section 1.5 that the non-negative solution $\gamma_N(t)$ of equation (6.10) has the property: for arbitrary $\tau_1$, $\tau_2$ with $0<\tau_1<\tau_2\le T$,
\[
\sup_{[\tau_1,\tau_2]}\gamma_N(t)\nrightarrow 0,\qquad\text{as }N\to\infty.\qquad\square
\]

This says that the optimal filtering error does not go to zero even as more and more observations become available.

Remark. Theorem 6.1.1 and Proposition 6.1.2 state that we can rewrite our original filtering problem, whose observation satisfies the SPDE (6.2), "equivalently" as the multi-channel filtering problem (6.9), whose observations satisfy a sequence of SODEs. Even though we can obtain the optimal linear filter, the filtering error does not vanish as more and more channels of information become available. On the other hand, if we want a consistent filter ($\gamma_N\to 0$ as $N\to\infty$), we need the condition $\sum_{k=1}^N u_k^2(t)\to\infty$, as $N\to\infty$, in some sense. However, this would not guarantee the convergence of the Fourier series
\[
u(t,x)=\sum_{k=1}^{\infty}u_k(t)h_k(x).
\]
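The dichotomy behind Proposition 6.1.2 can be visualized directly on the Riccati equation in (6.10). Freezing the observation energy $\sum_k u_k^2(t)$ at a constant level $S_N$ (a simplification for illustration; all constants below are my own choices), $\gamma$ settles at the positive root of $2a\gamma+b^2-\gamma^2S_N=0$, which vanishes only in the limit $S_N\to\infty$.

```python
# Illustration of the Riccati dichotomy: gamma' = 2a*gamma + b^2 - gamma^2 * S.
import numpy as np

a, b, T, n = -0.5, 0.3, 5.0, 100000
dt = T / n

def gamma_final(S):
    g = 1.0
    for _ in range(n):
        g += (2 * a * g + b**2 - g**2 * S) * dt   # forward Euler step
    return g

for S in [1.0, 10.0, 100.0, 1000.0]:
    print(f"S_N = {S:7.1f}  ->  gamma(T) = {gamma_final(S):.6f}")
# With S_N bounded, gamma(T) stabilizes at the positive root of
# 2a*g + b^2 - g^2*S = 0; only S_N -> infinity forces gamma(T) -> 0.
```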
References

[1] Robert J. Adler and Jonathan E. Taylor, Random fields and geometry, Springer Monographs in Mathematics, Springer, New York, 2007.
[2] S. I. Aihara, Regularized maximum likelihood estimate for an infinite-dimensional parameter in stochastic parabolic systems, SIAM J. Control Optim. 30 (1992), no. 4, 745–764.
[3] A. Bagchi and V. Borkar, Parameter identification in infinite-dimensional linear systems, Stochastics 12 (1984), no. 3-4, 201–213.
[4] Christer Borell, The Brunn-Minkowski inequality in Gauss space, Invent. Math. 30 (1975), no. 2, 207–216.
[5] B. S. Cirel'son, I. A. Ibragimov, and V. N. Sudakov, Norms of Gaussian sample functions, Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), pp. 20–41, Lecture Notes in Math., Vol. 550.
[6] Luc Devroye, A course in density estimation, Progress in Probability and Statistics, vol. 14, Birkhäuser Boston Inc., Boston, MA, 1987.
[7] Masatoshi Fujisaki, G. Kallianpur, and Hiroshi Kunita, Stochastic differential equations for the non linear filtering problem, Osaka J. Math. 9 (1972), 19–40.
[8] M. Hübner, R. Khasminskii, and B. L. Rozovskii, Two examples of parameter estimation for stochastic partial differential equations, Stochastic processes, Springer, New York, 1993, pp. 149–160.
[9] M. Huebner and S. Lototsky, Asymptotic analysis of a kernel estimator for parabolic SPDE's with time-dependent coefficients, Ann. Appl. Probab. 10 (2000), no. 4, 1246–1258.
[10] M. Huebner and S. Lototsky, Asymptotic analysis of the sieve estimator for a class of parabolic SPDEs, Scand. J. Statist. 27 (2000), no. 2, 353–370.
[11] M. Huebner and B. L. Rozovskii, On asymptotic properties of maximum likelihood estimators for parabolic stochastic PDE's, Probab. Theory Related Fields 103 (1995), no. 2, 143–163.
[12] Yu. A. Kutoyants, Parameter estimation for stochastic processes, Research and Exposition in Mathematics, vol. 6, Heldermann Verlag, Berlin, 1984. Translated from the Russian and edited by B. L. S. Prakasa Rao.
[13] Yu. A. Kutoyants, Identification of dynamical systems with small noise, Mathematics and its Applications, vol. 300, Kluwer Academic Publishers Group, Dordrecht, 1994.
[14] Robert S. Liptser and Albert N. Shiryaev, Statistics of random processes. I: General theory, expanded ed., Applications of Mathematics (New York), vol. 5, Springer-Verlag, Berlin, 2001. Translated from the 1974 Russian original by A. B. Aries.
[15] Robert S. Liptser and Albert N. Shiryaev, Statistics of random processes. II: Applications, expanded ed., Applications of Mathematics (New York), vol. 6, Springer-Verlag, Berlin, 2001. Translated from the 1974 Russian original by A. B. Aries.
[16] W. Liu and S. V. Lototsky, Parameter estimation in hyperbolic multichannel models, Asymptot. Anal. 68 (2010), no. 4, 223–248.
[17] S. V. Lototsky, Optimal filtering of stochastic parabolic equations, Recent developments in stochastic analysis and related topics, World Sci. Publ., Hackensack, NJ, 2004, pp. 330–353.
[18] Hans-Georg Müller, Smooth optimum kernel estimators of densities, regression curves and modes, Ann. Statist. 12 (1984), no. 2, 766–774.
[19] Hung T. Nguyen and Tuan D. Pham, Identification of nonstationary diffusion model by the method of sieves, SIAM J. Control Optim. 20 (1982), no. 5, 603–611.
[20] B. L. Rozovskii, Stochastic evolution systems: Linear theory and applications to nonlinear filtering, Mathematics and its Applications (Soviet Series), vol. 35, Kluwer Academic Publishers Group, Dordrecht, 1990. Translated from the Russian by A. Yarkho.
[21] R. L. Stratonovich, Conditional Markov processes and their application to the theory of optimal control, Modern Analytic and Computational Methods in Science and Mathematics, No. 7, American Elsevier Publishing Co., Inc., New York, 1968. Translated from the Russian by R. N. and N. B. McDonough for Scripta Technica, Inc.