A fully discrete approach for estimating local volatility in a generalized Black-Scholes setting
(USC Thesis Other)
A FULLY DISCRETE APPROACH FOR ESTIMATING LOCAL VOLATILITY IN A GENERALIZED BLACK-SCHOLES SETTING

by Oleksandr Lytvak

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS)

August 2008

Copyright 2008 Oleksandr Lytvak

Dedication

This work is dedicated to my parents whose love, support and encouragement I can always count on.

Acknowledgements

I would like to thank my advisor Prof. Gary Rosen for his enormous help and guidance. Without him this work would not have been possible. I cannot thank him enough for all the help and advice he has offered. I am also very grateful to Prof. Jaksa Cvitanic who introduced me to the field of Mathematical Finance and initially was my co-advisor. I wish to thank Prof. Sergey Lototsky, Prof. Mark Westerfield, Prof. Chunming Wang and Prof. Remigijus Mikulevicius for their time and helpful comments. Last, but not least, I want to thank my brother for his constant interest in my work and great help and support.

Table of Contents

Dedication
Acknowledgements
Abstract
Chapter 1: Introduction
1.1 General Formulation and Importance of the Problem
1.2 Review of Literature
1.3 Outline of the Thesis
Chapter 2: The Parameter Estimation Problem and its Abstract Formulation
2.1 Transformation of the Black-Scholes Equation
2.2 An Abstract Formulation of Black-Scholes
2.3 Wellposedness and The Abstract Evolution System
2.4 Parameter Estimation Problem: Formulation and Existence of a Solution
Chapter 3: Fully Discretized System
3.1 Semidiscrete Approximation
3.2 Review of Krein's Factor-Method and its Application to Finite-Difference Schemes for Evolution Equations
3.3 Application of the Factor-Method to the Transformed Black-Scholes Equation
Chapter 4: Application of the Adjoint Method for Computing the Gradient
Chapter 5: Future Work
References
Appendix: Results from Analysis

Abstract

We consider a generalized Black-Scholes model which is used for pricing derivative securities. A fully discrete approximation framework based on the factor-method developed by Krein is presented for the solution of the associated inverse problem. The scheme allows one to estimate a parameter, local volatility, which is extremely important in the theory and practice of financial markets. Volatility is a function of the spatial and temporal variables which appear in the Black-Scholes partial differential equation. Theoretical convergence results are established. A numerical scheme utilizing the advantage of full discretization along with the adjoint method is presented and discussed. The usage of the adjoint method allows for the computation of the gradient of the cost functional precisely in a computationally efficient manner.

Chapter 1
Introduction

1.1 General Formulation and Importance of the Problem

In their famous work Black and Scholes [6] suggested a model for the dynamics of a stock price and derived an explicit formula for the price of the European call option written on this stock. They assumed that the stock price follows a diffusion process of the form $dS_t = S_t(\mu\,dt + \sigma\,dW_t)$, where $S_t$ is the stock price at time $t$, $\mu$ is a drift, $\sigma$ is a constant volatility (standard deviation) of the return on the stock and $W_t$ is a standard Brownian motion.
Then they showed that the option price should satisfy the following partial differential equation (PDE):

$$\frac{\partial V(S,t)}{\partial t} + rS\frac{\partial V(S,t)}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V(S,t)}{\partial S^2} = rV(S,t), \tag{1.1}$$

with the terminal condition

$$V(S,T) = \max(S-K,0), \quad S \ge 0, \tag{1.2}$$

where $V(S,t)$ is the price of the option at time $t$ when the stock price is $S$; $r$ is the constant continuously compounded interest rate, $T$ is the option expiration date (maturity) and $K$ is its exercise (strike) price. Solving PDE (1.1)-(1.2) they obtained the formula for the option price:

$$V(S,t) = S\,N(d_1) - Ke^{-r(T-t)}N(d_2), \tag{1.3}$$

$$d_1 = \frac{\ln(S/K) + (r + \frac{1}{2}\sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \quad d_2 = \frac{\ln(S/K) + (r - \frac{1}{2}\sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \quad N(d) = \int_{-\infty}^{d}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx.$$

The only parameter in formula (1.3) which cannot be directly observed is the volatility $\sigma$. It would seem reasonable to estimate the volatility using historical data. But Black and Scholes [5] showed that their model overpriced (underpriced, respectively) options on a stock with a high (low, respectively) historical volatility. Among other problems with volatility estimation from historical data are the facts that the variance is non-stationary (extending an observation period can make matters worse) and that the frequency of observations cannot be increased. More discussion with references regarding systematic bias of estimates of option prices based on historical volatilities can be found in Musiela and Rutkowski [18].

An alternative approach would be not to price European call options based on formula (1.3) but rather to invert it, i.e. find a value for $\sigma$, called the implied volatility $\sigma_{imp}$, such that formula (1.3) with this $\sigma_{imp}$ produces values that coincide with the corresponding prices of options traded on an exchange. If the Black-Scholes model were correct then the implied volatility for a particular stock would remain constant regardless of the strike price $K$ or maturity $T$ of the option. However this is not the case.
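To make the inversion of formula (1.3) concrete, the following is a minimal Python sketch, not part of the original thesis. The function names `bs_call` and `implied_vol`, the bisection bracket `[1e-6, 5.0]` and the tolerance are illustrative assumptions; bisection is used only because the call price in (1.3) is increasing in $\sigma$.

```python
import math


def norm_cdf(d):
    """Standard normal distribution function N(d) from formula (1.3)."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price (1.3); tau = T - t > 0 is the time to maturity."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, S, K, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Hypothetical helper: invert (1.3) for sigma by bisection.

    Assumes the quoted price lies between the prices at sigma = lo and sigma = hi.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, tau) < price:
            lo = mid  # model price too low, so volatility must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    quote = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(quote, implied_vol(quote, 100.0, 100.0, 0.05, 1.0))
```

Applying `implied_vol` to market quotes across several strikes and maturities, rather than to a model-generated price as in the usage lines, is what produces a non-constant implied volatility in practice.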
Dependence of implied volatility on the strike price is called the 'smile' (due to its U-shape) whereas its dependence on the time to maturity is called the 'term structure'. In the literature this misspecification of the Black-Scholes model with regard to the constant volatility assumption is called the smile effect.

The importance of knowledge of the volatility is difficult to overemphasize. Practitioners need it for deciding their trading strategies (especially in derivative markets), making economic forecasts, implementing risk management, etc. Knowledge of volatility is also required to consistently price exotic (custom-made) derivatives that are not actively traded on exchanges. Volatility plays so important a role in the financial markets that traders quote option prices in volatility units rather than dollars (or some other currency).

Taking into account the importance of the volatility, a considerable amount of research has been done in order to reconcile the model with the observed smile effect. A brief review of some approaches designed to this end will be presented in the next section.

1.2 Review of Literature

There are three basic approaches to overcome the shortcomings (with respect to the assumption regarding constant volatility) of the Black-Scholes model. One of them, called stochastic volatility models, suggests modeling the stock price volatility itself as a stochastic process. In this case a second source of randomness is incorporated in the model to better explain the bias in the observed prices. This results, however, in a two-factor model: it becomes impossible to perfectly hedge the option using just the underlying asset (the market becomes incomplete) and thus there is more than one fair price for the option.

The second approach is called jump-diffusion models. In this case the price of the underlying asset is allowed to jump. This would correspond to the arrival of unexpected news and the reaction to it by the market.
As in the previous approach this model contains two sources of randomness: one from the diffusion and one from jumps. Models again become incomplete, making it impossible to perfectly hedge the option by using just the underlying asset.

Finally, the third approach, called complete market (single-factor, no-arbitrage) models, enables one to perfectly hedge the option by using the underlying asset, hence allowing arbitrage pricing and hedging (to have just one fair price for the option). In this approach volatility is usually assumed to be an unknown but deterministic function of time $t$ and stock price $S$. This function is called a local volatility. Since our method falls into this category we will give a more detailed overview of this approach.

Dupire [11] showed that a local volatility function can be uniquely determined if option prices are known for all possible strike prices $K > 0$ and maturities $T > 0$. Unfortunately, this is not the case since options are traded for only a limited number of strike prices and maturities. Therefore the market does not provide enough information to uniquely define a local volatility function.

One way of determining local volatility is a so-called implied tree approach. Derman and Kani [10], Dupire [11] and Rubinstein [21] showed how to construct a discrete approximation to the continuous risk-neutral process for the underlying stock in the form of a binomial or trinomial tree.

Another possibility to uncover local volatility is to use the Black-Scholes and/or Dupire PDE. Lagnado and Osher [16] proposed a new computational technique for solving the inverse calibration problem. They presented a minimization method to fit the smile to the option prices.
In order to overcome the ill-posedness of the problem (due to the facts that there are only a limited number of observed prices and volatility is not continuously dependent on market data) they introduced a Tikhonov regularization method in the minimization process to restrict the local volatility $\sigma(S,t)$ to be a "smooth" function. Their procedure, which is computationally demanding, produced good numerical results and could be applied to more complex pricing models.

Using the same idea Crepey [9] proved the stability of this approach and its convergence towards a minimum norm solution of the calibration problem (assuming it exists). Jackson, Suli and Howison [13] and Coleman, Li and Verma [7] used splines to construct smooth local volatility functions.

Finally we mention the paper by Achdou and Pironneau [1] in which they used the finite element method and the Dupire equation for least squares optimization. They computed the gradient of the cost functional exactly by solving the adjoint problem.

Our approach adopts the same idea as in [1], [9] and [16], which consists of solving a minimization problem and finding a local volatility subject to the Black-Scholes PDE governing option prices. In this thesis we propose a method for solving this inverse problem in mathematical finance by fully discretizing (in the time and spatial variables) the Black-Scholes PDE. We then present a scheme that will allow one to numerically implement this approach and we prove its theoretical convergence. We also present an adjoint method for calculating the gradient of the cost functional which is not only precise but also very cost efficient. Using the adjoint method would make numerical implementation much faster than in [16].

1.3 Outline of the Thesis

In Chapter 2 we present a generalized Black-Scholes partial differential equation and transform it in order to get a forward PDE with homogeneous boundary conditions on the interval $[0,1]$.
Then we specify functional spaces in which we consider this PDE and give its weak formulation. Using an assumption regarding the volatility function we show that this PDE can be regarded as an evolution equation and prove that it has a unique solution. Finally we formulate the inverse calibration problem in terms of minimizing a cost functional subject to a given PDE and show that it has a solution. We also establish convergence results.

In Chapter 3 we first describe an approximation scheme in the semidiscrete case (discretization in the spatial variable) and prove its convergence. Then we discuss advantages of full discretization and show how it can be done using a general framework (Krein's approach) that allows us to establish convergence results.

In Chapter 4 the adjoint method for the numerical implementation of our scheme is presented.

Finally in Chapter 5 we discuss future work. It contains some suggestions for extending this research, including proposed numerical studies.

Chapter 2
The Parameter Estimation Problem and its Abstract Formulation

2.1 Transformation of the Black-Scholes Equation

Let us now consider the generalization of the Black-Scholes partial differential equation:

$$\frac{\partial C(S,t)}{\partial t} + (r-q)S\frac{\partial C(S,t)}{\partial S} + \frac{1}{2}\sigma^2(S,t)S^2\frac{\partial^2 C(S,t)}{\partial S^2} = rC(S,t), \tag{2.1}$$

where $C(S,t)$ is the theoretical price of the option at time $t$ when the price of the underlying asset (stock, index etc.) is $S$; $r$ is the riskless continuously compounded interest rate and $q$ is the continuous dividend yield on the underlying asset; $\sigma(S,t)$ is the volatility of the return on the underlying asset, a deterministic function of $S$ and $t$. For a European call option we can write the following terminal and boundary conditions:

$$C(S,T) = \max(S-K,0), \quad S \ge 0, \tag{2.2}$$

$$C(0,t) = 0, \quad 0 \le t \le T, \tag{2.3}$$

$$\frac{\partial C(S,t)}{\partial S} \to e^{-q(T-t)} \ \text{ as } S \to \infty, \quad 0 \le t \le T, \tag{2.4}$$

where $K > 0$ is the option strike price and $T$ is its expiration date (maturity).
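In order to transform the PDE (2.1) into a forward equation we define new variables $x$, $\tau$ and $\tilde{C}(x,\tau)$ by $S = Ke^{x}$, $t = T-\tau$ and $C(S,t) = K\tilde{C}(x,\tau)$, $-\infty < x < \infty$, $0 \le \tau \le T$, or $x = \ln(S/K)$, $\tau = T-t$. Then

$$\frac{\partial C}{\partial t} = K\frac{\partial\tilde{C}}{\partial\tau}\frac{\partial\tau}{\partial t} = -K\frac{\partial\tilde{C}}{\partial\tau}, \qquad \frac{\partial C}{\partial S} = K\frac{\partial\tilde{C}}{\partial x}\frac{\partial x}{\partial S} = \frac{K}{S}\frac{\partial\tilde{C}}{\partial x},$$

$$\frac{\partial^2 C}{\partial S^2} = \frac{\partial}{\partial S}\left(\frac{K}{S}\frac{\partial\tilde{C}}{\partial x}\right) = K\left(-\frac{1}{S^2}\frac{\partial\tilde{C}}{\partial x} + \frac{1}{S}\frac{\partial}{\partial S}\left(\frac{\partial\tilde{C}}{\partial x}\right)\right) = \frac{K}{S^2}\left(-\frac{\partial\tilde{C}}{\partial x} + \frac{\partial^2\tilde{C}}{\partial x^2}\right).$$

Then PDE (2.1) becomes

$$-K\frac{\partial\tilde{C}}{\partial\tau} + (r-q)K\frac{\partial\tilde{C}}{\partial x} + \frac{1}{2}\sigma^2(Ke^{x},T-\tau)\,S^2\,\frac{K}{S^2}\left(-\frac{\partial\tilde{C}}{\partial x} + \frac{\partial^2\tilde{C}}{\partial x^2}\right) = rK\tilde{C}$$

or

$$\frac{\partial\tilde{C}}{\partial\tau} = \frac{1}{2}\sigma^2(Ke^{x},T-\tau)\frac{\partial^2\tilde{C}}{\partial x^2} + \left(r-q-\frac{1}{2}\sigma^2(Ke^{x},T-\tau)\right)\frac{\partial\tilde{C}}{\partial x} - r\tilde{C}. \tag{2.5}$$

Introducing the variable $\tilde{\sigma}(x,\tau) = \sigma(S,t)$, equation (2.5) becomes:

$$\frac{\partial\tilde{C}}{\partial\tau} = \frac{1}{2}\tilde{\sigma}^2(x,\tau)\frac{\partial^2\tilde{C}}{\partial x^2} + \left(r-q-\frac{1}{2}\tilde{\sigma}^2(x,\tau)\right)\frac{\partial\tilde{C}}{\partial x} - r\tilde{C}. \tag{2.6}$$

Conditions (2.2)-(2.4) in the new variables are then given by:

$$\tilde{C}(x,0) = \max(e^{x}-1,0), \quad -\infty < x < \infty, \tag{2.7}$$

$$\tilde{C}(x,\tau) \to 0 \ \text{ as } x \to -\infty, \quad 0 \le \tau \le T, \tag{2.8}$$

$$\frac{1}{e^{x}}\frac{\partial\tilde{C}}{\partial x} \to e^{-q\tau} \ \text{ as } x \to \infty, \quad 0 \le \tau \le T. \tag{2.9}$$

Truncating the domain of $x$ from $(-\infty,\infty)$ to $[x_0,x_1]$ (this corresponds to truncating the domain for $S$ from $[0,\infty)$ to $[S_0,S_1]$, $S_0 > 0$) we get approximations for conditions (2.7)-(2.9):

$$\tilde{C}(x,0) = \max(e^{x}-1,0), \quad x_0 \le x \le x_1, \tag{2.10}$$

$$\tilde{C}(x_0,\tau) = 0, \quad 0 \le \tau \le T, \tag{2.11}$$

$$\frac{1}{e^{x_1}}\frac{\partial\tilde{C}}{\partial x}(x_1,\tau) = e^{-q\tau}, \quad 0 \le \tau \le T. \tag{2.12}$$

In order to consider equation (2.6), (2.10)-(2.12) on the interval $[0,1]$ instead of $[x_0,x_1]$ we introduce $y = \frac{x-x_0}{x_1-x_0}$ for $x \in [x_0,x_1]$. Clearly, $0 \le y \le 1$ and $x = y(x_1-x_0)+x_0$. Let $b := x_1-x_0 > 0$ and

$$C(y,\tau) = \tilde{C}(by+x_0,\tau) = \tilde{C}(x,\tau), \qquad \sigma(y,\tau) = \tilde{\sigma}(by+x_0,\tau) = \tilde{\sigma}(x,\tau).$$

Then

$$\frac{\partial\tilde{C}}{\partial x} = \frac{\partial C}{\partial y}\frac{\partial y}{\partial x} = \frac{1}{b}\frac{\partial C}{\partial y}, \qquad \frac{\partial^2\tilde{C}}{\partial x^2} = \frac{1}{b^2}\frac{\partial^2 C}{\partial y^2}.$$

Therefore PDE (2.6) and the initial and boundary conditions (2.10)-(2.12) can be written as

$$\frac{\partial C}{\partial\tau} = \frac{1}{2b^2}\sigma^2(y,\tau)\frac{\partial^2 C}{\partial y^2} + \frac{1}{b}\left(r-q-\frac{1}{2}\sigma^2(y,\tau)\right)\frac{\partial C}{\partial y} - rC \tag{2.13}$$

and

$$C(y,0) = \max(e^{by+x_0}-1,0), \quad 0 \le y \le 1, \tag{2.14}$$

$$C(0,\tau) = 0, \quad 0 \le \tau \le T, \tag{2.15}$$

$$\frac{\partial C}{\partial y}(1,\tau) = be^{x_1-q\tau}, \quad 0 \le \tau \le T, \tag{2.16}$$

respectively.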
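The change of variables above can be sanity-checked numerically. The sketch below is an illustration, not part of the thesis: it assumes a constant volatility, for which the closed-form dividend-adjusted call price solves (2.1) on the untruncated domain, maps that price into the $(y,\tau)$ variables, and evaluates the residual of (2.13) by central finite differences. All function names, the truncation $x_0=-2$, $x_1=2$ and the step size are hypothetical choices.

```python
import math


def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def call_with_dividend(S, t, K, r, q, sigma, T):
    """Closed-form European call for constant sigma and dividend yield q."""
    tau = T - t
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * math.exp(-q * tau) * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def to_unit_interval(S, t, K, T, x0, x1):
    """(S, t) -> (y, tau) with y = (ln(S/K) - x0) / b and tau = T - t."""
    b = x1 - x0
    return (math.log(S / K) - x0) / b, T - t


def from_unit_interval(y, tau, K, T, x0, x1):
    """(y, tau) -> (S, t) with S = K * exp(b*y + x0) and t = T - tau."""
    b = x1 - x0
    return K * math.exp(b * y + x0), T - tau


def transformed_price(y, tau, K, r, q, sigma, T, x0, x1):
    """C(y, tau) = C(S, t) / K in the scaled variables."""
    S, t = from_unit_interval(y, tau, K, T, x0, x1)
    return call_with_dividend(S, t, K, r, q, sigma, T) / K


def residual_2_13(y, tau, K, r, q, sigma, T, x0, x1, h=1e-3):
    """Left side minus right side of (2.13), using central differences."""
    b = x1 - x0
    C = lambda yy, tt: transformed_price(yy, tt, K, r, q, sigma, T, x0, x1)
    C_tau = (C(y, tau + h) - C(y, tau - h)) / (2.0 * h)
    C_y = (C(y + h, tau) - C(y - h, tau)) / (2.0 * h)
    C_yy = (C(y + h, tau) - 2.0 * C(y, tau) + C(y - h, tau)) / h ** 2
    rhs = sigma ** 2 / (2.0 * b ** 2) * C_yy + (r - q - 0.5 * sigma ** 2) / b * C_y - r * C(y, tau)
    return C_tau - rhs


if __name__ == "__main__":
    print(residual_2_13(0.55, 0.5, 100.0, 0.05, 0.02, 0.3, 1.0, -2.0, 2.0))
```

A residual close to zero at interior points indicates that the signs and the factors of $b$ in (2.13) are consistent with (2.1); the check says nothing about the truncated boundary conditions (2.15)-(2.16), which are approximations.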
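Finally, in order to get homogeneous boundary conditions we write $C(y,\tau) = C_1(y,\tau) + C_2(y,\tau)$, where $C_1(y,\tau)$ satisfies boundary conditions (2.15)-(2.16). Choosing

$$C_1(y,\tau) = be^{x_1-q\tau}y \qquad \left(C_1(0,\tau) = 0, \ \frac{\partial C_1}{\partial y}(1,\tau) = be^{x_1-q\tau}\right)$$

we get

$$\frac{\partial C}{\partial\tau} = -bqe^{x_1-q\tau}y + \frac{\partial C_2}{\partial\tau}, \qquad \frac{\partial C}{\partial y} = be^{x_1-q\tau} + \frac{\partial C_2}{\partial y}, \qquad \frac{\partial^2 C}{\partial y^2} = \frac{\partial^2 C_2}{\partial y^2}.$$

Therefore (2.13) becomes

$$-bqe^{x_1-q\tau}y + \frac{\partial C_2}{\partial\tau} = \frac{1}{2b^2}\sigma^2(y,\tau)\frac{\partial^2 C_2}{\partial y^2} + \frac{1}{b}\left(r-q-\frac{1}{2}\sigma^2(y,\tau)\right)\left(be^{x_1-q\tau} + \frac{\partial C_2}{\partial y}\right) - r\left(be^{x_1-q\tau}y + C_2\right)$$

or

$$\frac{\partial C_2}{\partial\tau} = \frac{1}{2b^2}\sigma^2(y,\tau)\frac{\partial^2 C_2}{\partial y^2} + \frac{1}{b}\left(r-q-\frac{1}{2}\sigma^2(y,\tau)\right)\frac{\partial C_2}{\partial y} - rC_2 + e^{x_1-q\tau}\left(r-q-\frac{1}{2}\sigma^2(y,\tau) - bry + bqy\right). \tag{2.17}$$

Conditions (2.14)-(2.16) can be written as

$$C_2(y,0) = \max(e^{by+x_0}-1,0) - be^{x_1}y, \quad 0 \le y \le 1, \tag{2.18}$$

$$C_2(0,\tau) = 0, \quad 0 \le \tau \le T, \tag{2.19}$$

$$\frac{\partial C_2}{\partial y}(1,\tau) = 0, \quad 0 \le \tau \le T. \tag{2.20}$$

It is worth noting the relations between $C(S,t)$ and $C_2(y,\tau)$, and between $\sigma(S,t)$ and $\sigma(y,\tau)$ (the first follows from $C = K\tilde{C}$ and $C_2 = C - C_1$):

$$C_2(y,\tau) = \frac{C(S,t)}{K} - be^{x_1-q(T-t)}y, \qquad \sigma^2(y,\tau) = \sigma^2(S,t), \qquad y = \frac{\ln(S/K)-x_0}{b}, \qquad \tau = T-t.$$

2.2 An Abstract Formulation of Black-Scholes

We make the following assumptions regarding the function $\sigma(y,\tau)$:

A1: $\sigma(y,\tau)$ is bounded above and below by positive constants, i.e. $0 < m \le \sigma(y,\tau) \le K$ for all $y \in [0,1]$, $\tau \in [0,T]$, for some positive constants $m$ and $K$;

A2: $\sigma(y,\tau)$ is differentiable with respect to $y$ and its derivative is bounded: $\left|\frac{\partial\sigma(y,\tau)}{\partial y}\right| \le M$ for all $y \in [0,1]$, $\tau \in [0,T]$, for some constant $M > 0$;

A3: $\sigma(y,\tau)$ is uniformly Lipschitz continuous with respect to $\tau$: $|\sigma(y,\tau)-\sigma(y,s)| \le L|\tau-s|$ for all $y \in [0,1]$, $\tau,s \in [0,T]$, for some $L > 0$;

A4: $\frac{\partial\sigma^2(y,\tau)}{\partial y}$ is uniformly Lipschitz continuous with respect to $\tau$: $\left|\frac{\partial\sigma^2(y,\tau)}{\partial y} - \frac{\partial\sigma^2(y,s)}{\partial y}\right| \le L_1|\tau-s|$ for all $y \in [0,1]$, $\tau,s \in [0,T]$, for some $L_1 > 0$;

A5: $\sigma(y,\tau)$ is differentiable with respect to $\tau$ and its derivative is bounded: $\left|\frac{\partial\sigma(y,\tau)}{\partial\tau}\right| \le M_1$ for all $y \in [0,1]$, $\tau \in [0,T]$, for some constant $M_1 > 0$.

We want to rewrite (2.17)-(2.20) as an abstract evolution equation in the appropriate space.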
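Before turning to the abstract formulation, the homogenization step of Section 2.1 can be checked with a few lines of Python. This is only a sketch under assumptions: the names `lift`, `initial_C2` and `price_to_C2` are hypothetical, and the truncation is assumed to satisfy $x_0 < 0 < x_1$ so that the initial data is compatible with (2.19)-(2.20).

```python
import math


def lift(y, tau, q, x0, x1):
    """C_1(y, tau) = b * exp(x1 - q*tau) * y with b = x1 - x0."""
    b = x1 - x0
    return b * math.exp(x1 - q * tau) * y


def initial_C2(y, x0, x1):
    """Initial condition (2.18) for the homogenized unknown C_2."""
    b = x1 - x0
    return max(math.exp(b * y + x0) - 1.0, 0.0) - b * math.exp(x1) * y


def price_to_C2(C_price, S, t, K, T, q, x0, x1):
    """Map a price C(S, t) to (C_2, y, tau) via C_2 = C/K - C_1(y, tau)."""
    b = x1 - x0
    y = (math.log(S / K) - x0) / b
    tau = T - t
    return C_price / K - lift(y, tau, q, x0, x1), y, tau


if __name__ == "__main__":
    x0, x1 = -2.0, 2.0            # assumed truncation of x = ln(S/K)
    h = 1e-6
    slope_at_1 = (initial_C2(1.0, x0, x1) - initial_C2(1.0 - h, x0, x1)) / h
    print(initial_C2(0.0, x0, x1), slope_at_1)   # both should be near zero
```

The one-sided difference quotient at $y = 1$ is only first-order accurate, so it is expected to be small rather than exactly zero.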
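Define $V = \{\varphi \text{ absolutely continuous on } [0,1] : \varphi(0) = 0, \ \varphi' \in L^2[0,1]\}$ with the inner product $(\varphi,\psi)_V = \int_0^1 \varphi\psi\,dx + \int_0^1 D\varphi\,D\psi\,dx$ and norm $\|\varphi\|_V = \sqrt{(\varphi,\varphi)_V}$, where $D = \frac{\partial}{\partial x}$ is the usual differentiation operator.

Let $H = L^2[0,1]$ with the usual inner product $\langle\varphi,\psi\rangle_H = \int_0^1 \varphi\psi\,dx$ and norm $\|\varphi\|_H = \sqrt{\langle\varphi,\varphi\rangle_H}$. We know that $V \subset H$ and $\|v\|_H \le \|v\|_V$ for all $v \in V$, and since we can write $V = \{\varphi \in H^1[0,1] : \varphi(0) = 0\}$, where $H^1[0,1]$ is a Sobolev space, $V$ is dense and continuously embedded in $H$: $V \hookrightarrow H$. Hence we have $H' \subset V'$ ($V' = \{g : V \to \mathbb{R}, \ g \text{ continuous linear}\}$ is the dual space to $V$), and identifying $H$ and $H'$ by means of the Riesz theorem (see [15]) applied to the Hilbert space $H$, we get $V \hookrightarrow H = H' \hookrightarrow V'$.

Define the $\tau$-dependent bilinear (in $u,v$) form $a(\tau;u,v) : V \times V \to \mathbb{R}$ by

$$a(\tau;u,v) = \langle Du, D(\alpha v) - \beta v\rangle_H - \langle u,\gamma v\rangle_H, \tag{2.21}$$

where

$$\alpha = \alpha(\sigma;y,\tau) = \frac{\sigma^2(y,\tau)}{2b^2}, \tag{2.22}$$

$$\beta = \beta(\sigma;y,\tau) = \frac{1}{b}\left(r-q-\frac{1}{2}\sigma^2(y,\tau)\right), \tag{2.23}$$

$$\gamma = -r \tag{2.24}$$

are the coefficients in the PDE (2.17).

Our next goal is to show that $a(\tau;u,v)$ determines an operator $A(\tau,\sigma) : Dom(A) \to H$, where $Dom(A) \subset V$ is dense in $V$. For this we will need the following lemmas.

Lemma 1 (Boundedness). There exists a constant $\omega > 0$ such that $|a(\tau;u,v)| \le \omega\|u\|_V\|v\|_V$ for all $u,v \in V$ and $\tau \in [0,T]$.

Proof. Using the triangle and Cauchy-Schwarz inequalities we can write:

$$|a(\tau;u,v)| = |\langle Du, D(\alpha v)-\beta v\rangle - \langle u,\gamma v\rangle| \le |\langle Du, D(\alpha v)-\beta v\rangle| + |\langle u,\gamma v\rangle|$$
$$\le \sqrt{\langle Du,Du\rangle}\sqrt{\langle D(\alpha v)-\beta v, D(\alpha v)-\beta v\rangle} + \sqrt{\langle u,u\rangle}\sqrt{\langle\gamma v,\gamma v\rangle}.$$

For the first term we have $\sqrt{\langle Du,Du\rangle} = \|Du\|_H$ and

$$\sqrt{\langle D(\alpha v)-\beta v, D(\alpha v)-\beta v\rangle} = \|D(\alpha v)-\beta v\|_H = \|\alpha Dv + vD\alpha - \beta v\|_H = \|\alpha Dv + (D\alpha-\beta)v\|_H \le \|\alpha Dv\|_H + \|(D\alpha-\beta)v\|_H.$$

Using the definitions of $\alpha$ and $\beta$, (2.22)-(2.23), and Assumptions A1-A2, we have the following estimates:

$$|\alpha| = \frac{\sigma^2}{2b^2} \le \frac{K^2}{2b^2},$$

$$|D\alpha-\beta| = \left|\frac{2\sigma}{2b^2}\frac{\partial\sigma}{\partial y} - \frac{1}{b}\left(r-q-\frac{1}{2}\sigma^2\right)\right| \le \frac{1}{b^2}|\sigma|\left|\frac{\partial\sigma}{\partial y}\right| + \frac{1}{b}|r-q| + \frac{1}{2b}\sigma^2 \le \frac{KM}{b^2} + \frac{1}{b}|r-q| + \frac{K^2}{2b}.$$

Therefore,

$$\|\alpha Dv\|_H + \|(D\alpha-\beta)v\|_H \le K_1\|Dv\|_H + K_1\|v\|_H,$$

where $K_1 := \max\left(\frac{K^2}{2b^2}, \ \frac{KM}{b^2} + \frac{1}{b}|r-q| + \frac{K^2}{2b}\right)$.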
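Taking into account the definition of $\gamma$ given by (2.24), we get:

$$\sqrt{\langle u,u\rangle}\sqrt{\langle\gamma v,\gamma v\rangle} = \|u\|_H\|\gamma v\|_H = r\|u\|_H\|v\|_H.$$

Therefore,

$$|a(\tau;u,v)| \le \|Du\|_H(K_1\|Dv\|_H + K_1\|v\|_H) + r\|u\|_H\|v\|_H$$
$$\le K_2(\|Du\|_H\|Dv\|_H + \|Du\|_H\|v\|_H + \|u\|_H\|v\|_H)$$
$$\le K_2(\|Du\|_H\|Dv\|_H + \|Du\|_H\|v\|_H + \|u\|_H\|v\|_H + \|u\|_H\|Dv\|_H)$$
$$= K_2(\|u\|_H + \|Du\|_H)(\|v\|_H + \|Dv\|_H)$$
$$= K_2\left(\sqrt{\int u^2} + \sqrt{\int |Du|^2}\right)\left(\sqrt{\int v^2} + \sqrt{\int |Dv|^2}\right)$$
$$\le K_2\sqrt{2}\sqrt{\int (u^2 + (Du)^2)}\;\sqrt{2}\sqrt{\int (v^2 + (Dv)^2)} = \omega\|u\|_V\|v\|_V,$$

where $\omega = 2K_2 = 2\max(K_1,r)$.

Lemma 2 ($V$-$H$ Coercivity). There exist constants $\delta > 0$ and $k$ such that

$$a(\tau;u,u) \ge \delta\|u\|_V^2 - k\|u\|_H^2$$

for all $u \in V$ and $\tau \in [0,T]$.

Proof.

$$a(\tau;u,u) = \langle Du, D(\alpha u)-\beta u\rangle - \langle u,\gamma u\rangle = \langle Du,\alpha Du\rangle + \langle Du,uD\alpha\rangle - \langle Du,\beta u\rangle - \langle u,\gamma u\rangle.$$

Using the definition of $\alpha$ and the assumptions on $\sigma$ we have

$$\langle Du,\alpha Du\rangle \ge \frac{m}{2b^2}\langle Du,Du\rangle = \frac{m}{2b^2}\|Du\|_H^2.$$

For the second and third terms we have the following estimates ($m$ and $b$ are positive):

$$\langle Du,uD\alpha\rangle = \left\langle\sqrt{\tfrac{m}{4b^2}}\,Du, \sqrt{\tfrac{4b^2}{m}}\,uD\alpha\right\rangle \ge -\left\|\sqrt{\tfrac{m}{4b^2}}\,Du\right\|_H\left\|\sqrt{\tfrac{4b^2}{m}}\,uD\alpha\right\|_H$$
$$\ge -\frac{1}{2}\left(\left\|\sqrt{\tfrac{m}{4b^2}}\,Du\right\|_H^2 + \left\|\sqrt{\tfrac{4b^2}{m}}\,uD\alpha\right\|_H^2\right) = -\frac{1}{2}\left(\frac{m}{4b^2}\|Du\|_H^2 + \frac{4b^2}{m}\|uD\alpha\|_H^2\right),$$

$$\langle Du,\beta u\rangle = \left\langle\sqrt{\tfrac{m}{4b^2}}\,Du, \sqrt{\tfrac{4b^2}{m}}\,\beta u\right\rangle \le \left\|\sqrt{\tfrac{m}{4b^2}}\,Du\right\|_H\left\|\sqrt{\tfrac{4b^2}{m}}\,\beta u\right\|_H$$
$$\le \frac{1}{2}\left(\left\|\sqrt{\tfrac{m}{4b^2}}\,Du\right\|_H^2 + \left\|\sqrt{\tfrac{4b^2}{m}}\,\beta u\right\|_H^2\right) = \frac{1}{2}\left(\frac{m}{4b^2}\|Du\|_H^2 + \frac{4b^2}{m}\|\beta u\|_H^2\right).$$

Therefore,

$$a(\tau;u,u) \ge \frac{m}{2b^2}\|Du\|_H^2 - \frac{m}{8b^2}\|Du\|_H^2 - \frac{2b^2}{m}\|uD\alpha\|_H^2 - \frac{m}{8b^2}\|Du\|_H^2 - \frac{2b^2}{m}\|\beta u\|_H^2 - \langle u,\gamma u\rangle$$
$$= \frac{m}{4b^2}\|Du\|_H^2 - \frac{2b^2}{m}\left(\|uD\alpha\|_H^2 + \|\beta u\|_H^2\right) - \langle u,\gamma u\rangle.$$

Using the estimates for $D\alpha$ and $\beta$ from the previous lemma and the definition of $\gamma$, (2.24), we have

$$a(\tau;u,u) \ge \frac{m}{4b^2}\|Du\|_H^2 - \frac{2b^2}{m}\left(\frac{K^2M^2}{b^4} + \frac{1}{b^2}\left(|r-q| + \frac{1}{2}K^2\right)^2\right)\|u\|_H^2 + r\|u\|_H^2$$
$$= \frac{m}{4b^2}\left(\|Du\|_H^2 + \|u\|_H^2\right) - \left(\frac{m}{4b^2} + \frac{2K^2M^2}{mb^2} + \frac{2}{m}\left(|r-q| + \frac{1}{2}K^2\right)^2 - r\right)\|u\|_H^2.$$

Since $\|u\|_V^2 = \|u\|_H^2 + \|Du\|_H^2$, taking the constants $\delta = \frac{m}{4b^2} > 0$ and $k = \frac{m}{4b^2} + \frac{2K^2M^2}{mb^2} + \frac{2}{m}\left(|r-q| + \frac{1}{2}K^2\right)^2 - r$ we get

$$a(\tau;u,u) \ge \delta\|u\|_V^2 - k\|u\|_H^2.$$

Define the operator $\mathcal{A}(\tau) : V \to V'$ by $a(\tau;u,v) = (\mathcal{A}(\tau)u)v = \langle\mathcal{A}(\tau)u,v\rangle_H$, $u,v \in V$, where $V' = \{g : V \to \mathbb{R}, \ g \text{ continuous linear}\}$ is the dual space to $V$ (the operator into $V'$ is written $\mathcal{A}$ here to distinguish it from its restriction $A$ with values in $H$).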
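For fixed $\tau$ the operator $\mathcal{A}(\tau)$ from $V$ to $V'$ is linear, since the corresponding form $a(\tau;u,v)$ is bilinear, and bounded (by the definition of $\mathcal{A}$ and Lemma 1) and hence continuous. Therefore $\mathcal{A} \in L(V,V')$. It is shown in Tanabe [25] that $Dom(A(\tau))$ is independent of $\tau$.

Define $Dom(A) = \{u \in V : \mathcal{A}u \in H\}$ and the operator $A : Dom(A) \to H$ by $Au = \mathcal{A}u$, $u \in Dom(A) \subset V \subset H$. It can be proved (see Tanabe [25], p.27) that $Dom(A)$ is dense in $V$ and in $H$.

Let us write PDE (2.17)-(2.20) in terms of $\alpha$, $\beta$ and $\gamma$ given by (2.22)-(2.24):

$$\dot{C}_2 = \alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F, \qquad C_2(0) = DC_2(1) = 0, \tag{2.25}$$

where $F = e^{x_1-q\tau}\left(r-q-\frac{1}{2}\sigma^2(y,\tau) - bry + bqy\right)$ and $\dot{C}_2 = \frac{\partial C_2}{\partial\tau}$. Here and in what follows, with a slight abuse of notation, when we write $C_2(\cdot)$ we mean $C_2(\cdot,\tau)$.

Suppose $C_2$ is the solution to (2.25) and $C_2 \in H^2[0,1]$. Then $C_2 \in H^2[0,1] \cap V$ and for $\varphi \in V$ we have:

$$\langle\dot{C}_2,\varphi\rangle = \langle\alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F,\varphi\rangle$$
$$= \alpha\varphi DC_2\big|_0^1 - \int_0^1 DC_2\,D(\alpha\varphi)\,dx + \langle\beta DC_2,\varphi\rangle + \langle\gamma C_2,\varphi\rangle + \langle F,\varphi\rangle$$
$$= -\langle DC_2,D(\alpha\varphi)\rangle + \langle\beta DC_2,\varphi\rangle + \langle\gamma C_2,\varphi\rangle + \langle F,\varphi\rangle,$$

where the boundary term vanishes because $\varphi(0) = 0$ and $DC_2(1) = 0$. Therefore if $C_2 \in H^2[0,1]$ is the solution to (2.25), it satisfies

$$\langle\dot{C}_2,\varphi\rangle = -\langle DC_2,D(\alpha\varphi)\rangle + \langle DC_2,\beta\varphi\rangle + \langle\gamma C_2,\varphi\rangle + \langle F,\varphi\rangle \quad \forall\varphi \in V. \tag{2.26}$$

Conversely, assume that $C_2 \in H^2[0,1] \cap V$ satisfies (2.26). Since $C_0^\infty(0,1) \subset V$, the last equality is true for all $\varphi \in C_0^\infty(0,1)$ (infinitely differentiable functions with compact support in $(0,1)$). Integrating by parts we obtain:

$$\langle\dot{C}_2,\varphi\rangle = -DC_2\,\alpha\varphi\big|_0^1 + \langle D^2C_2,\alpha\varphi\rangle + \langle DC_2,\beta\varphi\rangle + \langle\gamma C_2,\varphi\rangle + \langle F,\varphi\rangle \quad \forall\varphi \in C_0^\infty(0,1).$$

Since $\varphi \in C_0^\infty$, $\varphi(0) = \varphi(1) = 0$ and we get

$$\langle\dot{C}_2,\varphi\rangle = \langle\alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F,\varphi\rangle \quad \forall\varphi \in C_0^\infty(0,1). \tag{2.27}$$

Since $C_2 \in H^2[0,1]$, $F \in L^2[0,1]$ and $\alpha,\beta,\gamma$ are bounded, we have $(\alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F) \in H = L^2[0,1]$ and (2.27) implies that

$$\dot{C}_2 = \alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F \tag{2.28}$$

in the sense of distributions on $(0,1)$. Here we adopt the definition of distribution from Showalter [23], i.e. a distribution on $G$ is defined to be a conjugate linear functional on $C_0^\infty(G)$.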
Also, $C_2(0) = 0$ (since $C_2 \in H^2(0,1) \cap V$) and

$$\langle\alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F,\varphi\rangle = -\alpha\varphi DC_2\big|_0^1 + \langle D^2C_2,\alpha\varphi\rangle + \langle DC_2,\beta\varphi\rangle + \langle\gamma C_2,\varphi\rangle + \langle F,\varphi\rangle \quad \forall\varphi \in V.$$

Therefore $-\alpha\varphi DC_2\big|_0^1 = 0$, or $DC_2(1)(\alpha\varphi)(1) = 0$ for all $\varphi \in V$ (since $\varphi(0) = 0$). Since $\alpha > 0$ by Assumption A1 and $\varphi(1)$ can take any real value as $\varphi$ ranges over $V$, we get $DC_2(1) = 0$.

Therefore we establish the following correspondence between PDE (2.25) and its weak formulation (2.26):

If $C_2 \in H^2(0,1)$ is a solution to (2.25) then $C_2 \in V$ and $C_2$ satisfies (2.26) for all $\varphi \in V$.

If $C_2 \in H^2(0,1) \cap V$ and $C_2$ satisfies (2.26) for all $\varphi \in V$ then $\dot{C}_2 = \alpha D^2C_2 + \beta DC_2 + \gamma C_2 + F$ in the sense of distributions on $(0,1)$ and $C_2(0) = 0$, $DC_2(1) = 0$.

Therefore the weak formulation of (2.25) can be given using the bilinear form $a(\tau;u,v)$: find $C_2 \in V$ such that

$$\langle\dot{C}_2,\varphi\rangle = -a(\tau;C_2,\varphi) + \langle F,\varphi\rangle \quad \forall\varphi \in V, \tag{2.29}$$

$$C_2(0) = \max(e^{by+x_0}-1,0) - be^{x_1}y, \quad 0 \le y \le 1, \tag{2.30}$$

holds (in (2.30), $C_2(0)$ denotes the initial value $C_2(y,0)$).

2.3 Wellposedness and The Abstract Evolution System

In view of the discussion in Section 2.2, PDE (2.17)-(2.20) can be written as an abstract evolution equation

$$\dot{C}_2(\tau) + A(\tau,\sigma)C_2(\tau) = F(\tau,\sigma), \quad 0 < \tau \le T, \tag{2.31}$$

$$C_2(0) = \max(e^{by+x_0}-1,0) - be^{x_1}y \tag{2.32}$$

in a Hilbert space $H$. Using Lemmas 1 and 2 we have:

Theorem 1. For each $\tau \in [0,T]$, $-A(\tau,\sigma)$ generates an analytic semigroup in both $H$ and $V'$.

Proof. The proof is given in [25] (p.76).

To show that (2.31)-(2.32) has a unique solution, we require the following lemmas.

Lemma 3. The form $a(\tau;u,v)$ is Hölder continuous in $\tau$ in the following sense:

$$|a(t;u,v) - a(s;u,v)| \le K_1|t-s|\,\|u\|_V\|v\|_V$$

for all $s,t \in [0,T]$, $u,v \in V$, where $K_1$ is a positive constant.

Proof. Using the definitions of $\alpha$, $\beta$ and $\gamma$ given by (2.22)-(2.24) and Assumptions A1-A4, we have the following estimates:

$$|\alpha(t)-\alpha(s)| = \left|\frac{\sigma^2(y,t)}{2b^2} - \frac{\sigma^2(y,s)}{2b^2}\right| = \frac{|(\sigma(y,t)-\sigma(y,s))(\sigma(y,t)+\sigma(y,s))|}{2b^2} \le \frac{2KL|t-s|}{2b^2} = \frac{KL}{b^2}|t-s|,$$

$$|\beta(t)-\beta(s)| = \left|\frac{\sigma^2(y,t)}{2b} - \frac{\sigma^2(y,s)}{2b}\right| \le \frac{KL}{b}|t-s|,$$

$$|D(\alpha(t))-D(\alpha(s))| = \frac{1}{2b^2}\left|D(\sigma^2(y,t)) - D(\sigma^2(y,s))\right| \le \frac{L_1}{2b^2}|t-s|.$$
Then

$$|a(t;u,v)-a(s;u,v)| = |\langle Du,D(\alpha(t)v)-\beta(t)v\rangle - \langle u,\gamma v\rangle - \langle Du,D(\alpha(s)v)-\beta(s)v\rangle + \langle u,\gamma v\rangle|$$
$$= |\langle Du,(\beta(s)-\beta(t))v\rangle + \langle Du,(D\alpha(t)-D\alpha(s))v\rangle + \langle Du,(\alpha(t)-\alpha(s))Dv\rangle|$$
$$\le \|Du\|_H\|(\beta(s)-\beta(t))v\|_H + \|Du\|_H\|(D\alpha(t)-D\alpha(s))v\|_H + \|Du\|_H\|(\alpha(t)-\alpha(s))Dv\|_H$$
$$\le \left(\frac{KL}{b}\|Du\|_H\|v\|_H + \|Du\|_H\frac{L_1}{2b^2}\|v\|_H + \frac{KL}{b^2}\|Du\|_H\|Dv\|_H\right)|t-s|$$
$$= \|Du\|_H\left(\left(\frac{KL}{b} + \frac{L_1}{2b^2}\right)\|v\|_H + \frac{KL}{b^2}\|Dv\|_H\right)|t-s|$$
$$\le \|Du\|_H K_0(\|v\|_H + \|Dv\|_H)|t-s| \le K_1\|u\|_V\|v\|_V|t-s|,$$

where $K_1 = \sqrt{2}K_0 = \sqrt{2}\max\left(\frac{KL}{b} + \frac{L_1}{2b^2}, \ \frac{KL}{b^2}\right)$.

Lemma 4. The function $\tau \to F(\tau,y)$ is Hölder continuous in $\tau$, i.e.

$$|F(t,y)-F(s,y)| \le K_2|t-s|$$

for all $s,t \in [0,T]$, $y \in [0,1]$, where the constant $K_2 > 0$.

Proof. Applying the mean value theorem to $F(\tau,y)$ on $[t,s]$, we obtain

$$|F(t,y)-F(s,y)| = \left|\frac{\partial F}{\partial t}(\tau,y)\right||t-s|, \quad \tau \in [t,s].$$

Also we have the following estimate

$$\left|\frac{\partial F}{\partial t}(\tau,y)\right| = \left|e^{x_1-q\tau}\left(-\sigma(y,\tau)\frac{\partial\sigma(y,\tau)}{\partial t}\right) - qe^{x_1-q\tau}\left(r-q-\frac{1}{2}\sigma^2(y,\tau)\right)\right|$$
$$\le e^{x_1-q\tau}\left(|\sigma(y,\tau)|\left|\frac{\partial\sigma}{\partial t}(y,\tau)\right| + q|r-q| + \frac{q}{2}\sigma^2(y,\tau)\right) \le e^{x_1}\left(KM_1 + q|r-q| + \frac{q}{2}K^2\right) := K_2,$$

where the last inequality is obtained by using the assumptions on $\sigma$. Therefore, we have

$$|F(t,y)-F(s,y)| \le K_2|t-s|.$$

Now we can state the main result of this section.

Theorem 2 (Existence and Uniqueness of a Solution). Equation (2.31)-(2.32) has a unique solution given by

$$C_2(\tau) = U(\tau,0)C_2(0) + \int_0^\tau U(\tau,s)F(s)\,ds, \tag{2.33}$$

where $U(\tau,s)$ is the evolution operator corresponding to the operator $A(\tau,\sigma)$.

Remark: We say that a function $C_2(\tau)$ is a solution of (2.31)-(2.32) if it satisfies these equations and if $C_2 \in C(0,T;H) \cap C^1(0,T;H)$, $C_2(\tau) \in D(A(\tau))$ for each $\tau$ and $AC_2 \in C(0,T;H)$. Here the space $C(0,T;H)$ consists of all continuous functions $u : [0,T] \to H$ with

$$\|u\|_{C(0,T;H)} := \max_{0 \le t \le T}\|u(t)\|_H < \infty.$$

Proof. Since the bilinear form $a(\tau;u,v)$ is bounded, coercive and Hölder continuous in $\tau$ (Lemmas 1, 2 and 3 of this thesis), $F$ is a Hölder continuous function with values in $H$ (Lemma 4) and $C_2(0) \in H$, we can apply Theorem 5.4.3 from [25], which assures the existence and uniqueness of a solution of equation (2.31)-(2.32).
Applying results from Lions [17] (Theorem III.1.2) we have that the system (2.29)-(2.30) has a unique solution $C_2 \in C(0,T;H)$ with $C_2 \in L^2(0,T;V)$ and $\dot{C}_2 \in L^2(0,T;V')$.

2.4 Parameter Estimation Problem: Formulation and Existence of a Solution

We can now give a precise formulation of the problem of estimating the volatility $\sigma(S,t)$ (or equivalently $\sigma(y,\tau)$). We consider options written on the given underlying asset $S$ (or $y$ in the new variables) with different maturities $T_1,\dots,T_l$ and different strike prices $K_{ij}$, $i = 1,\dots,l$, $j = 1,\dots,m_i$, that are actively traded on an exchange. We assume that we know their market quotes (bid and ask prices) $c^b_{ij}$ and $c^a_{ij}$ (or $c^b_{2ij}$ and $c^a_{2ij}$ in the new variables) at the present time $t = 0$ corresponding to the spot asset price $S_0$ (or $\tau = T_i$ and $y = \frac{\ln(S_0/K_{ij})-x_0}{b}$ in the new variables). We want to find the volatility $\sigma$ such that the solution of (2.29)-(2.30), $C_2(y,T_i,K_{ij},\sigma)$ at $\tau = T_i$ and $y = \frac{\ln(S_0/K_{ij})-x_0}{b}$, satisfies

$$c^b_{2ij} \le C_2(y,T_i,K_{ij},\sigma) \le c^a_{2ij} \tag{2.34}$$

for $i = 1,\dots,l$, $j = 1,\dots,m_i$. To satisfy these inequality constraints we will minimize the functional

$$J(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}|C_2(y,T_i,K_{ij},\sigma) - c_{2ij}|^2, \tag{2.35}$$

where $C_2(y,T_i,K_{ij},\sigma)$ is a solution at $\tau = T_i$ and $y = \frac{\ln(S_0/K_{ij})-x_0}{b}$ to (2.29)-(2.30) corresponding to expiration date $T_i$ and strike price $K_{ij}$, and $c_{2ij} = \frac{1}{2}(c^a_{2ij} + c^b_{2ij})$ is the arithmetic mean of the appropriately transformed bid and ask prices. In other words, we want to minimize the functional $J(\sigma)$ subject to (2.29)-(2.30). To do this we need to choose the admissible parameter space for $\sigma(y,\tau)$.

Assuming that $\frac{\partial\sigma^2}{\partial y}$ is continuous with respect to $y$ we can use the fundamental theorem of calculus to write:

$$\sigma^2(y,\tau) = \sigma^2(0,\tau) + \int_0^y\frac{\partial(\sigma^2(z,\tau))}{\partial z}\,dz.$$

Let $f(\tau) = \sigma^2(0,\tau)$ and $h(z,\tau) = \frac{\partial(\sigma^2(z,\tau))}{\partial z}$; thus $\sigma^2(y,\tau) = f(\tau) + \int_0^y h(z,\tau)\,dz$. From the assumptions on $\sigma$ we have that $f(\tau) \in C^1([0,T])$ and $h(z,\tau) \in C^{0,1}([0,1]\times[0,T])$.
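The parameterization $\sigma^2 = f + \int_0^y h\,dz$ and the least-squares functional (2.35) are straightforward to express in code. The following Python sketch is illustrative only: the composite trapezoid quadrature, the function names and the sample $f$, $h$ and quotes are assumptions, and the model values $C_2$ are taken as given rather than computed from (2.29)-(2.30).

```python
def sigma_squared(f, h, y, tau, n=200):
    """sigma^2(y, tau) = f(tau) + integral_0^y h(z, tau) dz (composite trapezoid rule)."""
    if y == 0.0:
        return f(tau)
    dz = y / n
    total = 0.5 * (h(0.0, tau) + h(y, tau))
    for i in range(1, n):
        total += h(i * dz, tau)
    return f(tau) + dz * total


def cost(model_values, bid, ask):
    """J in (2.35): squared misfit to the mid quotes c = (ask + bid) / 2."""
    return sum((c - 0.5 * (a + b)) ** 2 for c, b, a in zip(model_values, bid, ask))


if __name__ == "__main__":
    f = lambda tau: 0.04 + 0.01 * tau          # hypothetical sigma^2(0, tau)
    h = lambda z, tau: 0.02 * z                # hypothetical d(sigma^2)/dy
    print(sigma_squared(f, h, 0.5, 1.0))       # exact value: 0.05 + 0.01 * 0.25
    print(cost([0.10, 0.20], [0.09, 0.19], [0.11, 0.23]))
```

Working with the pair $(f,h)$ keeps the derivative $\partial\sigma^2/\partial y$ available directly as $h$, which is the quantity entering $D\alpha$ in the bilinear form (2.21).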
Instead of estimating the parameter $\sigma$ (or $\sigma^2$) we will estimate the pair $(f(\tau),h(y,\tau)) \in C^1([0,T]) \times C^{0,1}([0,1]\times[0,T])$. Therefore the parameter estimation problem (2.35) becomes:

Minimize

$$J(f,h) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}|C_2(y,T_i,K_{ij},f,h) - c_{2ij}|^2, \tag{2.36}$$

subject to equations (2.31)-(2.32), where $f \in C^1([0,T])$, $h \in C^{0,1}([0,1]\times[0,T])$, $T = \max_{1 \le i \le l}T_i$. In order to be able to say that the minimization problem (2.36) has a solution we will use the fact that a continuous functional on a compact set attains its minimum (and, as a matter of fact, its maximum as well) value.

Consider the space $D = C^1([0,T]) \times C^{0,1}([0,1]\times[0,T])$ with metric $d = d_1 + d_2$, where $d_1$ is the usual max metric on $C^1([0,T])$ and $d_2$ is the max metric on $C^{0,1}([0,1]\times[0,T])$, i.e.

$$d_1(f_1,f_2) = \max_{\tau \in [0,T]}|f_1(\tau)-f_2(\tau)| + \max_{\tau \in [0,T]}\left|\dot{f}_1(\tau)-\dot{f}_2(\tau)\right|, \quad f_1,f_2 \in C^1([0,T]),$$

$$d_2(h_1,h_2) = \max_{y \in [0,1],\,\tau \in [0,T]}|h_1(y,\tau)-h_2(y,\tau)| + \max_{y \in [0,1],\,\tau \in [0,T]}\left|\dot{h}_1(y,\tau)-\dot{h}_2(y,\tau)\right|, \quad h_1,h_2 \in C^{0,1}([0,1]\times[0,T]),$$

where $\dot{f}_i = \frac{df_i}{d\tau}$ and $\dot{h}_i = \frac{\partial h_i}{\partial\tau}$. Let $Q_1$ be a compact subset of $C^1([0,T])$ and $Q_2$ be a compact subset of $C^{0,1}([0,1]\times[0,T])$. Then $Q = Q_1 \times Q_2$ is a compact subset of $D$. A characterization of the compact subsets of the space of continuous functions $C([a,b])$ is given by the Arzela-Ascoli theorem (see Appendix). The general version of the Arzela-Ascoli theorem (see Appendix) can be used to show that $Q$ is not a trivial set containing only 0.

Remark: the reason for choosing to estimate the pair $(f(\tau),h(y,\tau))$ instead of estimating $\sigma^2(y,\tau)$ is the relative simplicity of the numerical implementation of the former compared to the latter. However, from a theoretical point of view the problems of estimating $\sigma^2$ and $(f,h)$ are equivalent and we can and will use them interchangeably.

To prove that the solution $C_2$ of (2.31)-(2.32) continuously depends on $(f,h)$ we will need the following lemma.

Lemma 5 (Continuity).
The bilinear form $a(f,h)(u,v)$ continuously depends on the parameters $(f,h)$, i.e.

$$|a(f_1,h_1)(u,v) - a(f_2,h_2)(u,v)| \le d((f_1,h_1),(f_2,h_2))\|u\|_V\|v\|_V \quad \forall\{(f_1,h_1),(f_2,h_2)\} \subset Q, \ \forall\{u,v\} \subset V.$$

Proof. Using the definition of $a$ given by (2.21) we have

$$|a(f_1,h_1)(u,v) - a(f_2,h_2)(u,v)| = |\langle Du,D(\alpha_1v)-\beta_1v\rangle - \langle u,\gamma v\rangle - \langle Du,D(\alpha_2v)-\beta_2v\rangle + \langle u,\gamma v\rangle|$$
$$= |\langle Du,D((\alpha_1-\alpha_2)v) - (\beta_1-\beta_2)v\rangle| \le \|Du\|_H\|D((\alpha_1-\alpha_2)v) - (\beta_1-\beta_2)v\|_H$$
$$= \|Du\|_H\|vD(\alpha_1-\alpha_2) + (\alpha_1-\alpha_2)Dv - (\beta_1-\beta_2)v\|_H$$
$$\le \|Du\|_H\left(\|vD(\alpha_1-\alpha_2)\|_H + \|(\alpha_1-\alpha_2)Dv\|_H + \|(\beta_1-\beta_2)v\|_H\right).$$

Recalling the definitions of $\alpha$ and $\beta$, (2.22)-(2.23), and of $\sigma^2$ in terms of $(f,h)$ we have:

$$|\alpha_1-\alpha_2| = \left|\frac{1}{2b^2}\sigma_1^2(y,\tau) - \frac{1}{2b^2}\sigma_2^2(y,\tau)\right| = \frac{1}{2b^2}\left|f_1(\tau) + \int_0^y h_1(z,\tau)\,dz - f_2(\tau) - \int_0^y h_2(z,\tau)\,dz\right|$$
$$\le \frac{1}{2b^2}\left(|f_1(\tau)-f_2(\tau)| + \int_0^y|h_1(z,\tau)-h_2(z,\tau)|\,dz\right)$$
$$\le \frac{1}{2b^2}\left(\max_{\tau \in [0,T]}|f_1(\tau)-f_2(\tau)| + \int_0^y\max_{z \in [0,1],\,\tau \in [0,T]}|h_1(z,\tau)-h_2(z,\tau)|\,dz\right)$$
$$\le \frac{1}{2b^2}\left(d_1(f_1,f_2) + y\,d_2(h_1,h_2)\right) \le \frac{1}{2b^2}\left(d_1(f_1,f_2) + d_2(h_1,h_2)\right) = \frac{1}{2b^2}d((f_1,h_1),(f_2,h_2)),$$

where the last inequality holds because $y \in [0,1]$;

$$|D(\alpha_1-\alpha_2)| = \left|D\left(\frac{1}{2b^2}(\sigma_1^2-\sigma_2^2)\right)\right| = \frac{1}{2b^2}|h_1(y,\tau)-h_2(y,\tau)| \le \frac{1}{2b^2}\max_{y \in [0,1],\,\tau \in [0,T]}|h_1(y,\tau)-h_2(y,\tau)|$$
$$\le \frac{1}{2b^2}d_2(h_1,h_2) \le \frac{1}{2b^2}d((f_1,h_1),(f_2,h_2));$$

$$|\beta_1-\beta_2| = \left|\frac{1}{b}\left(r-q-\frac{1}{2}\sigma_1^2(y,\tau)\right) - \frac{1}{b}\left(r-q-\frac{1}{2}\sigma_2^2(y,\tau)\right)\right| = \frac{1}{2b}\left|\sigma_2^2(y,\tau)-\sigma_1^2(y,\tau)\right| \le \frac{1}{2b}d((f_1,h_1),(f_2,h_2)).$$

Then

$$\|vD(\alpha_1-\alpha_2)\|_H + \|(\alpha_1-\alpha_2)Dv\|_H + \|(\beta_1-\beta_2)v\|_H$$
$$\le \frac{1}{2b^2}d((f_1,h_1),(f_2,h_2))\|v\|_H + \frac{1}{2b^2}d((f_1,h_1),(f_2,h_2))\|Dv\|_H + \frac{1}{2b}d((f_1,h_1),(f_2,h_2))\|v\|_H$$
$$\le \left(\frac{1}{2b} + \frac{1}{2b^2}\right)d((f_1,h_1),(f_2,h_2))(\|v\|_H + \|Dv\|_H) \le \sqrt{2}\left(\frac{1}{2b} + \frac{1}{2b^2}\right)d((f_1,h_1),(f_2,h_2))\|v\|_V$$
$$\le d((f_1,h_1),(f_2,h_2))\|v\|_V.$$

The last inequality is true for large enough $b$ (which is of course the case since $b$ is the size of the truncation for the price $S$).
Therefore,
$$|a(f_1,h_1)(u,v)-a(f_2,h_2)(u,v)|\le\|Du\|_H\,d((f_1,h_1),(f_2,h_2))\|v\|_V\le d((f_1,h_1),(f_2,h_2))\|u\|_V\|v\|_V.$$

Theorem 3. Suppose $\{(f_n,h_n)\}_{n=1}^\infty$ is a sequence in $Q$ with $(f_n,h_n)\to(f,h)\in Q$ as $n\to\infty$. Let $u_n$ denote the solution to (2.29)-(2.30) corresponding to $(f_n,h_n)$ and $u$ the solution to (2.29)-(2.30) corresponding to $(f,h)$. Then if $u\in C(0,T;V)$ we have $\|u_n(t)-u(t)\|_H\to0$ as $n\to\infty$ (convergence in $C(0,T;H)$) and $\int_0^t\|u_n(s)-u(s)\|_V^2\,ds\to0$ as $n\to\infty$ (convergence in $L^2(0,T;V)$) for each $t\in[0,T]$.

Proof. From Lemma 2 we have:
$$\delta\|u_n-u\|_V^2\le a_n(u_n-u,u_n-u)+k\|u_n-u\|_H^2,$$
where $a_n(u,v)=a((f_n,h_n);u,v)$. In our notation for the bilinear form $a$ we do not explicitly show its dependence on time $t$, in order to simplify the presentation. Using the properties of the bilinear form $a$ we get
$$\begin{aligned}
a_n(u_n-u,u_n-u)+k\|u_n-u\|_H^2
&=a_n(-u,u_n-u)-a(-u,u_n-u)+a(-u,u_n-u)+a_n(u_n,u_n-u)+k\|u_n-u\|_H^2\\
&=a_n(-u,u_n-u)-a(-u,u_n-u)-a(u,u_n-u)+a_n(u_n,u_n-u)+k\|u_n-u\|_H^2.
\end{aligned}$$
Applying Lemma 5 to the first two terms and (2.29) to the third and fourth terms of the last expression we obtain
$$\begin{aligned}
a_n(u_n-u,u_n-u)+k\|u_n-u\|_H^2
&\le d((f_n,h_n),(f,h))\|{-u}\|_V\|u_n-u\|_V+\langle\dot u,u_n-u\rangle_H-\langle F,u_n-u\rangle_H\\
&\quad-\langle\dot u_n,u_n-u\rangle_H+\langle F_n,u_n-u\rangle_H+k\|u_n-u\|_H^2\\
&=d((f_n,h_n),(f,h))\|u\|_V\|u_n-u\|_V+\langle\dot u-\dot u_n,u_n-u\rangle_H+\langle F_n-F,u_n-u\rangle_H+k\|u_n-u\|_H^2\\
&=d((f_n,h_n),(f,h))\|u\|_V\|u_n-u\|_V-\frac12\frac{d}{dt}\left(\|u_n-u\|_H^2\right)+\langle F_n-F,u_n-u\rangle_H+k\|u_n-u\|_H^2\\
&\le d((f_n,h_n),(f,h))\left(\frac{1}{2\varepsilon}\|u\|_V^2+\frac{\varepsilon}{2}\|u_n-u\|_V^2\right)-\frac12\frac{d}{dt}\left(\|u_n-u\|_H^2\right)\\
&\quad+\|F_n-F\|_H\|u_n-u\|_H+k\|u_n-u\|_H^2,
\end{aligned}$$
where in the last step we used Cauchy's and Holder's inequalities with $\varepsilon>0$. Using the definitions of $F$ and $F_n$ and Assumption A1 we get
$$\begin{aligned}
\|F_n-F\|_H&=\left\|\frac12e^{x_1-q\tau}\left(\sigma^2(y,\tau)-\sigma_n^2(y,\tau)\right)\right\|_H\le M_1\left\|\sigma_n^2(y,\tau)-\sigma^2(y,\tau)\right\|_H\\
&=M_1\left\|f_n-f+\int_0^y(h_n(z,\tau)-h(z,\tau))\,dz\right\|_H\le M_1\left(\|f_n-f\|_H+\left\|\int_0^y(h_n(z,\tau)-h(z,\tau))\,dz\right\|_H\right)\\
&\le M_1\left(d_1(f_n,f)+d_2(h_n,h)\right)=M_1\,d((f_n,h_n),(f,h)).
\end{aligned}$$
Therefore
$$\langle F_n-F,u_n-u\rangle_H\le M_1\left(\frac{1}{2\varepsilon_1}d^2((f_n,h_n),(f,h))+\frac{\varepsilon_1}{2}\|u_n-u\|_H^2\right)\le M_1\left(\frac{1}{2\varepsilon_1}d^2((f_n,h_n),(f,h))+\frac{\varepsilon_1}{2}\|u_n-u\|_V^2\right),$$
for some $\varepsilon_1>0$. Thus we have
$$\frac{d}{dt}\left(\|u_n-u\|_H^2\right)+\left(2\delta-\varepsilon\,d((f_n,h_n),(f,h))-M_1\varepsilon_1\right)\|u_n-u\|_V^2\le\frac1\varepsilon d((f_n,h_n),(f,h))\|u\|_V^2+\frac{M_1}{\varepsilon_1}d^2((f_n,h_n),(f,h))+2k\|u_n-u\|_H^2.$$
For $k>0$ and small enough $\varepsilon,\varepsilon_1>0$ we have
$$\begin{aligned}
\frac{d}{dt}\left(\|u_n-u\|_H^2\right)&+\left(2\delta-\varepsilon\,d((f_n,h_n),(f,h))-M_1\varepsilon_1\right)\|u_n-u\|_V^2\\
&\le\frac1\varepsilon d((f_n,h_n),(f,h))\|u\|_V^2+\frac{M_1}{\varepsilon_1}d^2((f_n,h_n),(f,h))\\
&\quad+2k\left(\|u_n-u\|_H^2+\left(2\delta-\varepsilon\,d((f_n,h_n),(f,h))-M_1\varepsilon_1\right)\int_0^t\|u_n(s)-u(s)\|_V^2\,ds\right).
\end{aligned}$$
Let
$$\rho_n(t):=\|u_n-u\|_H^2+\left(2\delta-\varepsilon\,d((f_n,h_n),(f,h))-M_1\varepsilon_1\right)\int_0^t\|u_n(s)-u(s)\|_V^2\,ds;$$
then $\rho_n(0)=0$. Now the last inequality can be written in terms of $\rho_n$:
$$\dot\rho_n(t)\le2k\rho_n(t)+\frac1\varepsilon d((f_n,h_n),(f,h))\|u\|_V^2+\frac{M_1}{\varepsilon_1}d^2((f_n,h_n),(f,h)). \tag{2.37}$$
Applying Gronwall's inequality in differential form (see Appendix) to (2.37) we get:
$$\rho_n(t)\le e^{2kt}\,d((f_n,h_n),(f,h))\int_0^t\left(\frac1\varepsilon\|u\|_V^2+\frac{M_1}{\varepsilon_1}d((f_n,h_n),(f,h))\right)ds.$$
Since $0\le t\le T$ and $\|u\|_V^2<\infty$, letting $n\to\infty$ we get $\rho_n(t)\to0$ for each $t\in[0,T]$, that is, $\|u_n-u\|_H^2\to0$ and $\int_0^t\|u_n-u\|_V^2\,ds\to0$ as $n\to\infty$ for each $t\in[0,T]$, which completes the proof.

However, since the least-squares performance index $J$ involves observations which are pointwise in the spatial variable $y$, we will need convergence in the stronger $V$ norm:

Theorem 4. Suppose $\{(f_n,h_n)\}_{n=1}^\infty$ is a sequence in $Q$ with $(f_n,h_n)\to(f,h)\in Q$ as $n\to\infty$. Let $u_n$ denote the solution to (2.29)-(2.30) corresponding to $(f_n,h_n)$ and $u$ the solution to (2.29)-(2.30) corresponding to $(f,h)$. Then if $u\in H^1(0,T;H^1)$, we have $\|u_n(t)-u(t)\|_V\to0$ as $n\to\infty$ for each $t\in[0,T]$ (convergence in $C(0,T;V)$).

Remark: We denote by $H^1$ the usual Sobolev space on $[0,1]$. We say $u\in H^1(0,T;H^1)$ if $u\in L^2(0,T;H^1)$ and $\dot u\in L^2(0,T;H^1)$.

Proof.
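From (2.29)-(2.30) we have
$$\langle\dot u_n,\varphi\rangle+a_n(u_n,\varphi)=\langle F_n,\varphi\rangle,\quad\varphi\in V \tag{2.38}$$
$$u_n(0)=\psi \tag{2.39}$$
and
$$\langle\dot u,\varphi\rangle+a(u,\varphi)=\langle F,\varphi\rangle,\quad\varphi\in V \tag{2.40}$$
$$u(0)=\psi, \tag{2.41}$$
where $\psi=\max(e^{by+x_0}-1,0)-be^{x_1}y$. Subtracting (2.40) from (2.38) and (2.41) from (2.39) we get:
$$\langle\dot u_n-\dot u,\varphi\rangle+a_n(u_n,\varphi)-a(u,\varphi)=\langle F_n-F,\varphi\rangle,\quad\varphi\in V \tag{2.42}$$
$$(u_n-u)(0)=0. \tag{2.43}$$
We can write
$$a_n(u_n,\varphi)-a(u,\varphi)=a_n(u_n,\varphi)-a_n(u,\varphi)+a_n(u,\varphi)-a(u,\varphi)=a_n(u_n-u,\varphi)+a_n(u,\varphi)-a(u,\varphi).$$
Letting $z_n=u_n-u$, (2.42)-(2.43) become
$$\langle\dot z_n,\varphi\rangle+a_n(z_n,\varphi)=\langle F_n-F,\varphi\rangle+a(u,\varphi)-a_n(u,\varphi),\quad\varphi\in V \tag{2.44}$$
$$z_n(0)=0. \tag{2.45}$$
Choosing $\varphi=\dot z_n$, we can rewrite (2.44) as
$$\langle\dot z_n,\dot z_n\rangle+a_n(z_n,\dot z_n)=\langle F_n-F,\dot z_n\rangle+a(u,\dot z_n)-a_n(u,\dot z_n).$$
Using the definition of the bilinear forms $a$ and $a_n$ given by (2.21)-(2.24), we have:
$$\begin{aligned}
\|\dot z_n\|_H^2&+\langle Dz_n,(D\alpha_n-\beta_n)\dot z_n\rangle+\langle Dz_n,\alpha_nD\dot z_n\rangle-\langle z_n,\gamma\dot z_n\rangle\\
&=\langle F_n-F,\dot z_n\rangle+\langle Du,(D\alpha-\beta)\dot z_n\rangle+\langle Du,\alpha D\dot z_n\rangle-\langle u,\gamma\dot z_n\rangle\\
&\quad-\langle Du,(D\alpha_n-\beta_n)\dot z_n\rangle-\langle Du,\alpha_nD\dot z_n\rangle+\langle u,\gamma\dot z_n\rangle
\end{aligned}$$
or
$$\begin{aligned}
\|\dot z_n\|_H^2&+\langle Dz_n,(D\alpha_n-\beta_n)\dot z_n\rangle+\frac12\frac{d}{dt}\langle Dz_n,\alpha_nDz_n\rangle-\frac12\langle Dz_n,\dot\alpha_nDz_n\rangle+\frac12\frac{d}{dt}\left\|\sqrt r\,z_n\right\|_H^2\\
&=\langle F_n-F,\dot z_n\rangle+\langle Du,(D\alpha-D\alpha_n-\beta+\beta_n)\dot z_n\rangle\\
&\quad+\frac{d}{dt}\langle(\alpha-\alpha_n)Du,Dz_n\rangle-\langle(\dot\alpha-\dot\alpha_n)Du,Dz_n\rangle-\langle(\alpha-\alpha_n)D\dot u,Dz_n\rangle.
\end{aligned}$$
Repeated application of the inequality $\langle u,v\rangle\le c\|u\|_H^2+\frac{1}{4c}\|v\|_H^2$, $c>0$, yields
$$\begin{aligned}
\|\dot z_n\|_H^2&+\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\\
&\le c_1\|Dz_n\|_H^2+\frac{1}{4c_1}\|(D\alpha_n-\beta_n)\dot z_n\|_H^2+\frac12\langle Dz_n,\dot\alpha_nDz_n\rangle+c_2\|\dot z_n\|_H^2+\frac{1}{4c_2}\|F_n-F\|_H^2\\
&\quad+c_3\|\dot z_n\|_H^2+\frac{1}{4c_3}\|(D\alpha-D\alpha_n-\beta+\beta_n)Du\|_H^2+\frac{d}{dt}\langle(\alpha-\alpha_n)Du,Dz_n\rangle\\
&\quad+c_4\|Dz_n\|_H^2+\frac{1}{4c_4}\|(\dot\alpha-\dot\alpha_n)Du\|_H^2+c_5\|Dz_n\|_H^2+\frac{1}{4c_5}\|(\alpha-\alpha_n)D\dot u\|_H^2.
\end{aligned}$$
From the assumptions on $\sigma$, Lemmas 1 and 5, and Theorem 3 we have the following estimates:
$$|D\alpha_n-\beta_n|\le K_1,\qquad|\dot\alpha_n|\le K_2,\qquad\|F_n-F\|\le K_3\,d((f_n,h_n),(f,h)),$$
$$|D\alpha-D\alpha_n-\beta+\beta_n|\le K_4\,d((f_n,h_n),(f,h)),\qquad|\alpha-\alpha_n|\le K_5\,d((f_n,h_n),(f,h)).$$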
From the hypotheses of this theorem it follows that
$$|\dot\alpha-\dot\alpha_n|\le K_6\,d((f_n,h_n),(f,h)).$$
Therefore the last inequality can be rewritten:
$$\begin{aligned}
\|\dot z_n\|_H^2&+\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\\
&\le c_1\|Dz_n\|_H^2+\frac{K_1^2}{4c_1}\|\dot z_n\|_H^2+\frac{K_2}{2}\|Dz_n\|_H^2+c_2\|\dot z_n\|_H^2+\frac{K_3^2}{4c_2}d^2((f_n,h_n),(f,h))\\
&\quad+c_3\|\dot z_n\|_H^2+\frac{K_4^2}{4c_3}d^2((f_n,h_n),(f,h))\|Du\|_H^2+\frac{d}{dt}\langle(\alpha-\alpha_n)Du,Dz_n\rangle\\
&\quad+c_4\|Dz_n\|_H^2+\frac{K_6^2}{4c_4}d^2((f_n,h_n),(f,h))\|Du\|_H^2+c_5\|Dz_n\|_H^2+\frac{K_5^2}{4c_5}d^2((f_n,h_n),(f,h))\|D\dot u\|_H^2.
\end{aligned}$$
Choosing $c_1$, $c_2$ and $c_3$ such that $\frac{K_1^2}{4c_1}+c_2+c_3<1$, we can eliminate the terms involving $\|\dot z_n\|_H^2$ from the above expression to obtain
$$\begin{aligned}
\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)
&\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\|Dz_n\|_H^2\\
&\quad+\left(\frac{K_3^2}{4c_2}+\frac{K_4^2}{4c_3}\|Du\|_H^2+\frac{K_5^2}{4c_5}\|D\dot u\|_H^2+\frac{K_6^2}{4c_4}\|Du\|_H^2\right)d^2((f_n,h_n),(f,h))\\
&\quad+\frac{d}{dt}\langle(\alpha-\alpha_n)Du,Dz_n\rangle.
\end{aligned}$$
Integrating the last inequality from $0$ to $t$ and taking into account that $z_n(0)=Dz_n(0)=0$, we have
$$\begin{aligned}
\frac12\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)
&\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\int_0^t\|Dz_n\|_H^2\,ds\\
&\quad+d^2((f_n,h_n),(f,h))\int_0^t\left(\frac{K_3^2}{4c_2}+\frac{K_4^2}{4c_3}\|Du\|_H^2+\frac{K_5^2}{4c_5}\|D\dot u\|_H^2+\frac{K_6^2}{4c_4}\|Du\|_H^2\right)ds\\
&\quad+\langle(\alpha-\alpha_n)Du,Dz_n\rangle.
\end{aligned}$$
Using again the inequality $\langle u,v\rangle\le c\|u\|_H^2+\frac{1}{4c}\|v\|_H^2$ and the estimate from above we have:
$$\begin{aligned}
\frac12\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)
&\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\int_0^t\|Dz_n\|_H^2\,ds\\
&\quad+d^2((f_n,h_n),(f,h))\int_0^t\left(\frac{K_3^2}{4c_2}+\frac{K_4^2}{4c_3}\|Du\|_H^2+\frac{K_5^2}{4c_5}\|D\dot u\|_H^2+\frac{K_6^2}{4c_4}\|Du\|_H^2\right)ds\\
&\quad+c_6\|Dz_n\|_H^2+\frac{K_5^2}{4c_6}d^2((f_n,h_n),(f,h))\|Du\|_H^2.
\end{aligned}$$
From Assumption A1 and (2.22) we have that $\alpha_n\ge\frac{m^2}{2b^2}$, and choosing $c_6$ such that $\frac{m^2}{2b^2}-2c_6>0$ we get:
$$\begin{aligned}
\left(\frac{m^2}{2b^2}-2c_6\right)\|Dz_n\|_H^2+r\|z_n\|_H^2
&\le(2c_1+K_2+2c_4+2c_5)\int_0^t\|Dz_n\|_H^2\,ds+d^2((f_n,h_n),(f,h))\,\cdot\\
&\quad\left(\int_0^t\left(\frac{K_3^2}{2c_2}+\frac{K_4^2}{2c_3}\|Du\|_H^2+\frac{K_5^2}{2c_5}\|D\dot u\|_H^2+\frac{K_6^2}{2c_4}\|Du\|_H^2\right)ds+\frac{K_5^2}{2c_6}\|Du\|_H^2\right).
\end{aligned}$$
Denoting $l:=\min\left(\frac{m^2}{2b^2}-2c_6,\,r\right)>0$ (if $r=0$, the same argument as below shows that $\|Dz_n\|_H^2\to0$ as $n\to\infty$ which, together with the result of Theorem 3, gives the desired result) we obtain
$$\begin{aligned}
\|Dz_n\|_H^2+\|z_n\|_H^2
&\le\frac{2c_1+K_2+2c_4+2c_5}{l}\int_0^t\left(\|z_n\|_H^2+\|Dz_n\|_H^2\right)ds+\frac{d^2((f_n,h_n),(f,h))}{l}\,\cdot\\
&\quad\left(\int_0^t\left(\frac{K_3^2}{2c_2}+\frac{K_4^2}{2c_3}\|Du\|_H^2+\frac{K_5^2}{2c_5}\|D\dot u\|_H^2+\frac{K_6^2}{2c_4}\|Du\|_H^2\right)ds+\frac{K_5^2}{2c_6}\|Du\|_H^2\right).
\end{aligned}$$
Applying Gronwall's inequality in the integral form (see Appendix) and noting that $d((f_n,h_n),(f,h))\to0$ as $n\to\infty$ we obtain
$$\|Dz_n\|_H^2+\|z_n\|_H^2=\|u_n-u\|_V^2\to0,\quad n\to\infty,\ t\in[0,T].$$

In the following chapters, to keep the presentation in line with the results of other researchers that we have used, we make the following change of notation: we will use $u(x,t)$ instead of $C_2(y,\tau)$ and $\sigma(x,t)$ instead of $\sigma(y,\tau)$. The relations between the original variables and the new ones $(C_2(y,\tau),\sigma(y,\tau))$ are given at the end of Section 2.1. To avoid confusion, whenever necessary we will indicate the values of the spatial/temporal variables at which the functions $u$ and $\sigma$ are considered.

Chapter 3: Fully Discretized System

3.1 Semidiscrete Approximation

In this chapter we present an approximation scheme to solve the parameter estimation problem involving the full discretization (in time and space) of equation (2.29)-(2.30), and we prove its convergence. Before doing that we show how a semidiscrete approach (approximation in the spatial variable only) works. This is done using ideas and techniques described in Banks and Rosen [4].

Full discretization involves a two-stage approximation process. We first use a Ritz-Galerkin approach to approximate the infinite dimensional state equation (2.29)-(2.30) by a sequence of finite dimensional ordinary differential equations. For each $n=1,2,\dots$ let $V_n$ be a finite dimensional subspace of $H$ satisfying $V_n\subset V$.
Define $P_n:H\to V_n$ to be the orthogonal projection of $H$ onto $V_n$ with respect to the $H$ inner product. We make the following assumption about the approximating properties of the subspaces $V_n$:

A6: For each $\varphi\in V$, $\|\varphi-P_n\varphi\|_V\to0$ as $n\to\infty$, with $\|\varphi-P_n\varphi\|_V\le k\|\varphi\|_V$ for some constant $k$ independent of $n$ and $\varphi$.

The Galerkin equations in $V_n$ corresponding to the system (2.29)-(2.30) are given by
$$\langle\dot u_n,\varphi_n\rangle+a(u_n,\varphi_n)=\langle F,\varphi_n\rangle,\quad\varphi_n\in V_n \tag{3.1}$$
$$u_n(0)=P_n\psi, \tag{3.2}$$
where $u_n(t)\in V_n$, $t\ge0$ and $\psi=\max(e^{by+x_0}-1,0)-be^{x_1}y$. We will show that sufficiently smooth solutions to (2.29)-(2.30) are approximated by solutions to (3.1)-(3.2) with a certain degree of uniformity in the parameters $(f,h)$ (Theorem 5).

Then we discretize the set of admissible parameters $Q$. As a result we obtain a sequence of optimization problems involving the minimization of a least-squares performance index $J$ over a compact subset of Euclidean space subject to finite dimensional constraints.

Theorem 5. Suppose $\{(f_n,h_n)\}_{n=1}^\infty$ is a sequence in $Q$ with $(f_n,h_n)\to(f,h)\in Q$ as $n\to\infty$. Let $u_n$ denote the solution to (3.1)-(3.2) corresponding to $(f_n,h_n)$ and $u$ the solution to (2.29)-(2.30) corresponding to $(f,h)$. Then if $u\in H^1(0,T;H^1)$ and A6 holds we have $\|u_n(t)-u(t)\|_V\to0$ as $n\to\infty$ for each $t\in[0,T]$ (convergence in $C(0,T;V)$).

Proof. We know that
$$\|u_n(t)-u(t)\|_V\le\|u_n(t)-P_nu(t)\|_V+\|P_nu(t)-u(t)\|_V.$$
The regularity assumptions on $u$ and Assumption A6 imply that the second term on the right-hand side of the above estimate tends to $0$ as $n\to\infty$ for each $t\in[0,T]$. We therefore need only estimate the term $\|u_n(t)-P_nu(t)\|_V$. From (2.29)-(2.30) and (3.1)-(3.2) we have
$$\langle\dot u,\varphi\rangle+a(u,\varphi)=\langle F,\varphi\rangle,\quad\varphi\in V \tag{3.3}$$
$$u(0)=\psi \tag{3.4}$$
and
$$\langle\dot u_n,\varphi_n\rangle+a_n(u_n,\varphi_n)=\langle F_n,\varphi_n\rangle,\quad\varphi_n\in V_n \tag{3.5}$$
$$u_n(0)=P_n\psi. \tag{3.6}$$
Subtracting (3.3) from (3.5) and taking into account that $V_n\subset V$ we get:
$$\langle\dot u_n-\dot u,\varphi_n\rangle+a_n(u_n,\varphi_n)-a(u,\varphi_n)=\langle F_n-F,\varphi_n\rangle,\quad\varphi_n\in V_n.$$
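(3.7)

We can write
$$a_n(u_n,\varphi_n)-a(u,\varphi_n)=a_n(u_n-P_nu,\varphi_n)+a_n(P_nu,\varphi_n)-a(u,\varphi_n).$$
Letting $z_n=u_n-P_nu$ and using the properties of the orthogonal projection $P_n$, (3.7) can be rewritten as
$$\langle\dot z_n,\varphi_n\rangle+a_n(z_n,\varphi_n)=\langle F_n-F,\varphi_n\rangle+a(u,\varphi_n)-a_n(P_nu,\varphi_n),\quad\varphi_n\in V_n \tag{3.8}$$
and also we have
$$z_n(0)=0. \tag{3.9}$$
Choosing $\varphi_n=\dot z_n$, we can rewrite (3.8) as
$$\langle\dot z_n,\dot z_n\rangle+a_n(z_n,\dot z_n)=\langle F_n-F,\dot z_n\rangle+a(u,\dot z_n)-a_n(P_nu,\dot z_n).$$
Using the definition of the bilinear forms $a$ and $a_n$ given by (2.21)-(2.24), we have:
$$\begin{aligned}
\|\dot z_n\|_H^2&+\langle Dz_n,(D\alpha_n-\beta_n)\dot z_n\rangle+\langle Dz_n,\alpha_nD\dot z_n\rangle-\langle z_n,\gamma\dot z_n\rangle\\
&=\langle F_n-F,\dot z_n\rangle+\langle Du,(D\alpha-\beta)\dot z_n\rangle+\langle Du,\alpha D\dot z_n\rangle-\langle u,\gamma\dot z_n\rangle\\
&\quad-\langle D(P_nu),(D\alpha_n-\beta_n)\dot z_n\rangle-\langle D(P_nu),\alpha_nD\dot z_n\rangle+\langle P_nu,\gamma\dot z_n\rangle
\end{aligned}$$
or
$$\begin{aligned}
\|\dot z_n\|_H^2&+\langle Dz_n,(D\alpha_n-\beta_n)\dot z_n\rangle+\frac12\frac{d}{dt}\langle Dz_n,\alpha_nDz_n\rangle-\frac12\langle Dz_n,\dot\alpha_nDz_n\rangle+\frac12r\frac{d}{dt}\|z_n\|_H^2\\
&=\langle F_n-F,\dot z_n\rangle+\langle Du(D\alpha-\beta)-D(P_nu)(D\alpha_n-\beta_n),\dot z_n\rangle\\
&\quad+\frac{d}{dt}\langle\alpha Du-\alpha_nD(P_nu),Dz_n\rangle-\langle\dot\alpha Du-\dot\alpha_nD(P_nu),Dz_n\rangle-\langle\alpha D\dot u-\alpha_nD(P_n\dot u),Dz_n\rangle.
\end{aligned}$$
Repeated application of the inequality $\langle u,v\rangle\le c\|u\|_H^2+\frac{1}{4c}\|v\|_H^2$, $c>0$, yields
$$\begin{aligned}
\|\dot z_n\|_H^2&+\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\\
&\le c_1\|Dz_n\|_H^2+\frac{1}{4c_1}\|(D\alpha_n-\beta_n)\dot z_n\|_H^2+\frac12\langle Dz_n,\dot\alpha_nDz_n\rangle+c_2\|\dot z_n\|_H^2+\frac{1}{4c_2}\|F_n-F\|_H^2\\
&\quad+c_3\|\dot z_n\|_H^2+\frac{1}{4c_3}\|Du(D\alpha-\beta)-D(P_nu)(D\alpha_n-\beta_n)\|_H^2+\frac{d}{dt}\langle\alpha Du-\alpha_nD(P_nu),Dz_n\rangle\\
&\quad+c_4\|Dz_n\|_H^2+\frac{1}{4c_4}\|\dot\alpha Du-\dot\alpha_nD(P_nu)\|_H^2+c_5\|Dz_n\|_H^2+\frac{1}{4c_5}\|\alpha D\dot u-\alpha_nD(P_n\dot u)\|_H^2.
\end{aligned}$$
From the assumptions on $\sigma$, Lemmas 1 and 5, and Theorem 3 we have the following estimates:
$$|D\alpha_n-\beta_n|\le K_1,\qquad|\dot\alpha_n|\le K_2,\qquad\|F_n-F\|\le K_3\,d((f_n,h_n),(f,h)),$$
$$|D\alpha-D\alpha_n-\beta+\beta_n|\le K_4\,d((f_n,h_n),(f,h)),\qquad|\alpha-\alpha_n|\le K_5\,d((f_n,h_n),(f,h)),\qquad|\alpha_n|\le K_7.$$
From the hypotheses of this theorem it follows that
$$|\dot\alpha-\dot\alpha_n|\le K_6\,d((f_n,h_n),(f,h)).$$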
Using these estimates and the definitions of the norms in $H$ and $V$ we have:
$$\begin{aligned}
\|Du(D\alpha-\beta)-D(P_nu)(D\alpha_n-\beta_n)\|_H
&\le\|Du(D\alpha-\beta-D\alpha_n+\beta_n)\|_H+\|(Du-D(P_nu))(D\alpha_n-\beta_n)\|_H\\
&\le K_4\,d((f_n,h_n),(f,h))\|Du\|_H+K_1\|D(u-P_nu)\|_H\\
&\le K_4\,d((f_n,h_n),(f,h))\|Du\|_H+K_1\|u-P_nu\|_V;
\end{aligned}$$
$$\begin{aligned}
\|\dot\alpha Du-\dot\alpha_nD(P_nu)\|_H
&\le\|(\dot\alpha-\dot\alpha_n)Du\|_H+\|\dot\alpha_n(Du-D(P_nu))\|_H\\
&\le K_6\,d((f_n,h_n),(f,h))\|Du\|_H+K_2\|D(u-P_nu)\|_H\\
&\le K_6\,d((f_n,h_n),(f,h))\|Du\|_H+K_2\|u-P_nu\|_V;
\end{aligned}$$
$$\begin{aligned}
\|\alpha D\dot u-\alpha_nD(P_n\dot u)\|_H
&\le\|(\alpha-\alpha_n)D\dot u\|_H+\|\alpha_nD(\dot u-P_n\dot u)\|_H\\
&\le K_5\,d((f_n,h_n),(f,h))\|D\dot u\|_H+K_7\|D(\dot u-P_n\dot u)\|_H\\
&\le K_5\,d((f_n,h_n),(f,h))\|D\dot u\|_H+K_7\|\dot u-P_n\dot u\|_V.
\end{aligned}$$
Therefore the above inequality can be rewritten:
$$\begin{aligned}
\|\dot z_n\|_H^2+\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)
&\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\|Dz_n\|_H^2+\left(\frac{K_1^2}{4c_1}+c_2+c_3\right)\|\dot z_n\|_H^2\\
&\quad+\frac{d}{dt}\langle\alpha Du-\alpha_nD(P_nu),Dz_n\rangle+\rho_n,
\end{aligned}$$
where
$$\begin{aligned}
\rho_n(t)&=\frac{K_3^2}{4c_2}d^2((f_n,h_n),(f,h))+\frac{1}{2c_3}\left(K_4^2d^2((f_n,h_n),(f,h))\|Du\|_H^2+K_1^2\|u-P_nu\|_V^2\right)\\
&\quad+\frac{1}{2c_4}\left(K_6^2d^2((f_n,h_n),(f,h))\|Du\|_H^2+K_2^2\|u-P_nu\|_V^2\right)\\
&\quad+\frac{1}{2c_5}\left(K_5^2d^2((f_n,h_n),(f,h))\|D\dot u\|_H^2+K_7^2\|\dot u-P_n\dot u\|_V^2\right).
\end{aligned}$$
From Assumption A6 and the convergence $(f_n,h_n)\to(f,h)$, we can conclude that $\rho_n(t)\to0$ as $n\to\infty$ for each $t\in[0,T]$.

Choosing $c_1$, $c_2$ and $c_3$ such that $\frac{K_1^2}{4c_1}+c_2+c_3<1$, we can eliminate the terms involving $\|\dot z_n\|_H^2$ from the above expression to obtain
$$\frac12\frac{d}{dt}\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\|Dz_n\|_H^2+\frac{d}{dt}\langle\alpha Du-\alpha_nD(P_nu),Dz_n\rangle+\rho_n.$$
Integrating the last inequality from $0$ to $t$ and recalling (3.9) we find
$$\frac12\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\int_0^t\|Dz_n\|_H^2\,ds+\langle\alpha Du-\alpha_nD(P_nu),Dz_n\rangle+\int_0^t\rho_n(s)\,ds.$$
Using again the inequality $\langle u,v\rangle\le c\|u\|_H^2+\frac{1}{4c}\|v\|_H^2$ and
$$\|\alpha Du-\alpha_nD(P_nu)\|_H\le K_5\,d((f_n,h_n),(f,h))\|Du\|_H+K_7\|u-P_nu\|_V$$
we have:
$$\frac12\left(\|\sqrt{\alpha_n}Dz_n\|_H^2+r\|z_n\|_H^2\right)\le\left(c_1+\frac{K_2}{2}+c_4+c_5\right)\int_0^t\|Dz_n\|_H^2\,ds+c_6\|Dz_n\|_H^2+\theta_n,$$
where
$$\theta_n(t)=\frac{K_5^2}{2c_6}d^2((f_n,h_n),(f,h))\|Du\|_H^2+\frac{K_7^2}{2c_6}\|u-P_nu\|_V^2+\int_0^t\rho_n(s)\,ds.$$
Assumption A6 together with the Dominated Convergence Theorem applied to $\rho_n(t)$ yields $\theta_n(t)\to0$ as $n\to\infty$ for each $t\in[0,T]$. From Assumption A1 and (2.22) we have that $\alpha_n\ge\frac{m^2}{2b^2}$, and choosing $c_6$ such that $\frac{m^2}{2b^2}-2c_6>0$ we get:
$$\left(\frac{m^2}{2b^2}-2c_6\right)\|Dz_n\|_H^2+r\|z_n\|_H^2\le(2c_1+K_2+2c_4+2c_5)\int_0^t\|Dz_n\|_H^2\,ds+2\theta_n.$$
Denoting $l:=\min\left(\frac{m^2}{2b^2}-2c_6,\,r\right)>0$ (if $r=0$, we prove that $\|Dz_n\|_H^2\to0$ as $n\to\infty$, and arguments similar to those used in Theorem 3 yield $\|z_n\|_H^2\to0$ as $n\to\infty$ for each $t\in[0,T]$) we obtain
$$\|Dz_n\|_H^2+\|z_n\|_H^2\le\frac{2c_1+K_2+2c_4+2c_5}{l}\int_0^t\left(\|z_n\|_H^2+\|Dz_n\|_H^2\right)ds+\frac{2\theta_n}{l}.$$
Applying the integral version of Gronwall's inequality (see Appendix) and noting that $\theta_n(t)\to0$ for each $t\in[0,T]$ as $n\to\infty$ we have that
$$\|Dz_n\|_H^2+\|z_n\|_H^2=\|u_n-P_nu\|_V^2\to0,\quad n\to\infty,$$
which, combined with the estimate of $\|P_nu-u\|_V$ at the beginning of the proof, gives the claim.

Theorem 5 enables us to claim that solutions to the state discretized system (3.1)-(3.2) converge to the solution of the original equation (2.29)-(2.30) with a certain degree of uniformity in the parameters. We want, however, to have similar results for a completely discrete (in time and in space) system.

We motivate our insistence on fully discretizing system (2.29)-(2.30) as follows. For numerical computation we have to minimize the functional $J$. For that we use numerical methods (e.g. steepest descent) that require the gradient $\nabla J$ of the functional $J$. The naive approach of computing the gradient by finite-difference approximation would be computationally very expensive. It would also yield inaccurate gradients. This inaccuracy can cause the method to converge very slowly or not at all.
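The sensitivity of finite-difference gradients to the step size can be seen already on a toy problem. The sketch below is an illustration only, not the thesis model: it assumes a scalar equation $u'=-\theta u$, $u(0)=1$, discretized by implicit Euler, and a hypothetical cost $J(\theta)=(u_n(\theta)-c)^2$ whose discrete gradient can be written down by hand. All names and parameter values are assumptions made for the example.

```python
# Toy illustration (assumed problem, not the thesis model):
# implicit Euler for u' = -theta*u, u(0) = 1, cost J = (u_n - c)^2.

def solve_implicit_euler(theta, n=100, T=1.0):
    """Return u_n from the recursion u_{k+1} = u_k / (1 + dt*theta)."""
    dt = T / n
    u = 1.0
    for _ in range(n):
        u = u / (1.0 + dt * theta)
    return u

def cost(theta, c=0.3):
    """Least-squares misfit against a single hypothetical observation c."""
    return (solve_implicit_euler(theta) - c) ** 2

def exact_discrete_gradient(theta, c=0.3, n=100, T=1.0):
    """Derivative of the discrete cost, from u_n = (1 + dt*theta)^(-n)."""
    dt = T / n
    u_n = solve_implicit_euler(theta, n, T)
    du_dtheta = -n * dt * u_n / (1.0 + dt * theta)
    return 2.0 * (u_n - c) * du_dtheta

def forward_difference(theta, h):
    """One-sided difference quotient; needs one extra solve per parameter."""
    return (cost(theta + h) - cost(theta)) / h

theta0 = 0.5
g_exact = exact_discrete_gradient(theta0)
errors = {h: abs(forward_difference(theta0, h) - g_exact)
          for h in (1e-1, 1e-6, 1e-15)}
for h, err in errors.items():
    print(f"h = {h:.0e}: gradient error = {err:.3e}")
```

A large step suffers from truncation error and a very small step from cancellation, so the step has to be tuned; with many parameters each component of the gradient also costs an additional forward solve.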
If we fully discretize system (2.29)-(2.30) we can overcome these difficulties by using the so-called adjoint method (described later in full detail), which allows us to calculate the gradient precisely (up to roundoff errors) for a fully discrete system. Computing $\nabla J$ with the adjoint method is also very efficient and inexpensive.

In order to achieve these goals (i.e. to have convergence results for the fully discrete system) we adopt the approach known as the factor-method of Krein [14]. First we describe the ideas and results of this method in the general situation and then indicate how it can be used in our case.

3.2 Review of Krein's Factor-Method and its Application to Finite-Difference Schemes for Evolution Equations

Consider a Banach space $E$ and its subspace $E_1$. For any element $u\in E$ the set of all elements $u+x$, where $x\in E_1$, is called a residue class $\bar u$ of the element $u$ relative to the subspace $E_1$. The collection of all residue classes, in which it is natural to introduce the operations of addition and multiplication by a scalar, is a linear space and is called the factor-space $E/E_1$ of the space $E$ by the subspace $E_1$. A norm may be introduced in the factor-space $E/E_1$ by putting
$$\|\bar u\|_{E/E_1}=\inf_{v\in\bar u}\|v\|_E. \tag{3.10}$$
In this norm the space $E/E_1$ is a Banach space, and this norm is called the natural norm (other norms may be defined on $E/E_1$ as well). The relation $u\mapsto\bar u$ is a homomorphism of $E$ onto $E/E_1$ which we denote by $\varphi_1$: $\varphi_1u=\bar u$.

Now suppose that a sequence of subspaces $E_n$ has been distinguished in the space $E$. We construct the factor-spaces $E/E_n$ and the homomorphisms $\varphi_n$ and norms $\|\cdot\|_{E/E_n}$ corresponding to them.

Definition 1. The sequence $\{\bar u_n\}$, $\bar u_n\in E/E_n$, factor-converges to the element $u\in E$ if
$$\lim_{n\to\infty}\|\varphi_nu-\bar u_n\|_{E/E_n}=0.$$

Of course, with an arbitrary choice of the subspaces $E_n$ and norms in the factor-spaces $E/E_n$ the concept of factor-convergence may fail to correspond to any natural concept of closeness of the elements $u$ and $\bar u_n$.
We impose a restriction on this choice.

Lemma 6. Suppose that either the condition
$$\lim_{n\to\infty}\|\varphi_nv\|_{E/E_n}=\|v\|_E\quad\forall v\in E$$
or the condition
$$E_{n+1}\subset E_n,\qquad\bigcap_{n=1}^\infty E_n=\{0\}$$
is satisfied, and that the norms in the factor-spaces $E/E_n$ are the natural ones. Then a sequence $\{\bar u_n\}$ cannot factor-converge to two distinct limits.

The connection between approximation of an element in the sense of factor-convergence and in the usual sense is established by the following assertion:

Lemma 7. Suppose that a natural norm has been introduced into each of the factor-spaces and that the sequence $\{\bar u_n\}$ factor-converges to the element $u\in E$. Then from each class $\bar u_n$ one can select $u_n\in E$ such that
$$\lim_{n\to\infty}\|u-u_n\|_E=0.$$

Consider the equation
$$Tu=f, \tag{3.11}$$
where $T$ is a linear operator defined on the linear set $D(T)$ of a Banach space $E$ and acting into the Banach space $F$. Suppose that in $E$ and $F$ there have been distinguished sequences of subspaces $E_n$ and $F_n$ and that the factor-spaces $E/E_n$ and $F/F_n$ have been constructed with the corresponding homomorphisms $\varphi_n$ and $\psi_n$. On a linear subset of the space $E/E_n$ containing the image of the domain $D(T)$ of the operator $T$, define an operator $T_n$ approximating $T$ in some sense, and consider the approximate equation
$$T_n\bar u_n=\psi_nf. \tag{3.12}$$
A solution $\bar u_n\in E/E_n$ of (3.12) is regarded as approximating the solution of equation (3.11). The passage from equation (3.11) to equation (3.12) will be called the factor-method of approximate solution of equation (3.11).

The basic characteristics of the factor-method are the properties of approximation and stability.

Definition 2. The operators $T_n$ approximate the operator $T$ at the element $v\in D(T)$ if
$$\lim_{n\to\infty}\|\psi_nTv-T_n\varphi_nv\|_{F/F_n}=0,$$
i.e. if $T_n\varphi_nv$ factor-converges to $Tv$.

Definition 3. The factor-method is said to be stable if for all $n$ beyond some $n_0$ there exist bounded linear inverse operators $T_n^{-1}$ defined on the spaces $F/F_n$ and such that
$$\left\|T_n^{-1}\right\|_{F/F_n\to E/E_n}\le k,\quad n\ge n_0,$$
where the constant $k$ does not depend on $n$.

Theorem 6.
Suppose that $u$ is a solution of (3.11). If the operators $T_n$ approximate the operator $T$ on the solution $u$, and the stability condition is satisfied, then the approximate solutions $\bar u_n$ of (3.12) factor-converge to the exact solution $u$.

Proof. From the stability condition, we have for $n\ge n_0$ that
$$\|\varphi_nu-\bar u_n\|_{E/E_n}=\left\|T_n^{-1}T_n\varphi_nu-T_n^{-1}\psi_nf\right\|_{E/E_n}=\left\|T_n^{-1}(T_n\varphi_nu-\psi_nTu)\right\|_{E/E_n}\le k\,\|T_n\varphi_nu-\psi_nTu\|_{F/F_n}.$$
It follows from the approximation condition that the right side tends to zero as $n\to\infty$.

Now we show how the factor-method described above can be used in investigating the finite difference approximation of evolution equations. Consider the evolution equation in a Banach space $E$:
$$\frac{du}{dt}=A(t)u+f(t)\quad(0\le t\le T) \tag{3.13}$$
with the initial condition
$$u(0)=u_0\in D(A), \tag{3.14}$$
where $A(t)$ is a closed linear operator with domain $D(A)$ which is dense in $E$ and does not depend on $t$. We suppose that $A(t)$ is strongly continuous on $D(A)$, has bounded inverse $A^{-1}(t)$ and that the function $f$ is continuous.

If a model for the equation (3.13) is a PDE, then its full discretization is done in two stages. First we replace the differential operator $A(t)$ connected with the space coordinates by a finite-difference or finite element expression. Then we replace the derivative with respect to time $t$ by a difference quotient.

We choose a uniform grid on the interval $[0,T]$ given by the points $t_k=k\Delta_nt=k\frac Tn$ $(k=0,1,\dots,n)$. We shall find approximate values of the solution $u(t)$ at the points $t_k$. The time derivative at the points $t_k$ we replace by $\frac{u(t_{k+1})-u(t_k)}{\Delta_nt}$.

In the space $E$ we select a subspace $L_n$, form the factor-space $E/L_n$ and approximate the operator $A(t_i)$ by the bounded operators $A_n(t_i)$ acting in the factor-space $E/L_n$.
In place of equation (3.13) and initial condition (3.14) we consider the system
$$\frac{\bar u^{(n)}_{i+1}-\bar u^{(n)}_i}{\Delta_nt}=A_n(t_i)\bar u^{(n)}_i+\bar f^{(n)}_i\quad(i=0,1,\dots,n-1) \tag{3.15}$$
or
$$\frac{\bar u^{(n)}_{i+1}-\bar u^{(n)}_i}{\Delta_nt}=A_n(t_{i+1})\bar u^{(n)}_{i+1}+\bar f^{(n)}_{i+1}\quad(i=0,1,\dots,n-1) \tag{3.16}$$
and
$$\bar u^{(n)}_0=\varphi_nu_0, \tag{3.17}$$
where $\varphi_n$ is the natural homomorphism of $E$ onto $E/L_n$ and $\bar f^{(n)}_i=\varphi_nf(t_i)$. Equation (3.15) corresponds to an explicit scheme while equation (3.16) represents an implicit scheme. It is the latter which will turn out to be the most appropriate in our case.

The solution of the system (3.15), (3.17) (or (3.16)-(3.17)) is a collection of $(n+1)$ elements $\{\bar u^{(n)}_0,\bar u^{(n)}_1,\dots,\bar u^{(n)}_n\}$ in the space $E/L_n$. This solution is regarded as approximating the solution of (3.13)-(3.14) at the grid points $t_0,t_1,\dots,t_n$. Since this approximation lies in $E/L_n$ and the solution to (3.13)-(3.14) lies in $E$, we need to define convergence.

Definition 4. We will say that the approximate solution $\bar u^{(n)}_k$ converges to the function $u(t)$ if
$$\max_{k=0,1,\dots,n}\left\|\varphi_nu(t_k)-\bar u^{(n)}_k\right\|_{E/L_n}\to0\quad\text{as }n\to\infty.$$

It is shown in [14] (p.342-344) how to treat the passage from the problem (3.13)-(3.14) to the problem (3.15), (3.17) as the passage from the equation (3.11) to (3.12). We will adopt the same method for the passage from the original problem (3.13)-(3.14) to the implicit scheme (3.16)-(3.17).

Consider the space $C(E)$ of continuous functions with values in $E$, provided with the usual norm
$$\|u\|_{C(E)}=\max_{0\le t\le T}\|u(t)\|_E,$$
and the space $F(E)=E\times C(E)$ with the norm
$$\|(u_0,f(t))\|_{F(E)}=\max\{\|u_0\|_E,\|f\|_{C(E)}\}.$$
We denote by $D(T)$ the set of all functions $u(t)$ of $C(E)$ which are continuously differentiable on $[0,T]$ and such that the functions $A(t)u(t)$ are defined and continuous on $[0,T]$. Each function of $D(T)$ is a solution of the problem (3.13)-(3.14) for some $u_0\in D(A)$ and $f\in C(E)$. On $D(T)$ we define an operator $T$ by the formula
$$Tu=\left\{u(0),\ \frac{du}{dt}-A(t)u\right\}.$$
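The practical difference between the explicit scheme (3.15) and the implicit scheme (3.16) can be sketched on a single scalar mode. The example assumes $A_n=-\lambda$ with a large hypothetical $\lambda$ (standing in for one large-magnitude eigenvalue of a discretized second-order operator) and $f=0$; the values of $\lambda$ and the time step are illustrative choices, not quantities from the thesis.

```python
# Scalar stand-in for one stiff mode of a spatially discretized operator:
# A_n = -lam with lam large (assumed value, for illustration only).

def explicit_step(u, dt, a, f=0.0):
    """Scheme (3.15): u_{i+1} = u_i + dt*(a*u_i + f_i)."""
    return u + dt * (a * u + f)

def implicit_step(u, dt, a, f=0.0):
    """Scheme (3.16): u_{i+1} = (u_i + dt*f_{i+1}) / (1 - dt*a)."""
    return (u + dt * f) / (1.0 - dt * a)

def march(step, u0, dt, a, n_steps):
    """Apply the given one-step scheme n_steps times."""
    u = u0
    for _ in range(n_steps):
        u = step(u, dt, a)
    return u

lam = 1000.0   # hypothetical stiffness
dt = 0.01      # time step chosen so that dt*lam = 10
u_explicit = march(explicit_step, 1.0, dt, -lam, 50)
u_implicit = march(implicit_step, 1.0, dt, -lam, 50)
print("explicit:", u_explicit)
print("implicit:", u_implicit)
```

In this setting the explicit amplification factor $1-\Delta_nt\,\lambda$ has modulus larger than one, so the iterates grow, while the implicit factor $(1+\Delta_nt\,\lambda)^{-1}$ stays below one for every step size.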
The linear operator $T$ acts from $D(T)$ into the space $F(E)$.

Now we consider the operator generated by the problem (3.16)-(3.17). On the set of all collections $\hat u=\{\bar u^{(n)}_0,\bar u^{(n)}_1,\dots,\bar u^{(n)}_n\}$ of $n+1$ elements of $E/L_n$ we introduce the norm
$$\|\hat u\|=\max_{i=0,\dots,n}\|\bar u_i\|_{E/L_n}.$$
We define an operator $T_n$ on this set by means of the formula
$$T_n\hat u=\left\{\bar u_0,\ \frac{\bar u_1-\bar u_0}{\Delta_nt}-A_n(t_1)\bar u_1,\ \dots,\ \frac{\bar u_n-\bar u_{n-1}}{\Delta_nt}-A_n(t_n)\bar u_n\right\}. \tag{3.18}$$
Then we can find the inverse operator $T_n^{-1}$. We have
$$T_n\hat u=\hat f\equiv\{\bar g_0,\bar f_1,\dots,\bar f_n\}$$
and, therefore:
$$\bar u_0=\bar g_0,\qquad\frac{\bar u_{k+1}-\bar u_k}{\Delta_nt}=A_n(t_{k+1})\bar u_{k+1}+\bar f_{k+1}\quad(k=0,\dots,n-1).$$
Solving for $\bar u_{k+1}$, we obtain
$$\bar u_{k+1}=(I-\Delta_ntA_n(t_{k+1}))^{-1}\bar u_k+(I-\Delta_ntA_n(t_{k+1}))^{-1}\Delta_nt\,\bar f_{k+1}, \tag{3.19}$$
or finally
$$(T_n^{-1}\hat f)_i=\bar u_i=\prod_{j=1}^i(I-\Delta_ntA_n(t_j))^{-1}\bar u_0+\Delta_nt\sum_{k=1}^i\prod_{j=k}^i(I-\Delta_ntA_n(t_j))^{-1}\bar f_k. \tag{3.20}$$
In view of the boundedness of the operators $A_n(t_j)$ the operator $T_n^{-1}$ will be bounded.

Alternatively, denote by $C_n(E)$ the collection of all those functions $u(t)$ of $C(E)$ for which $u(t_k)\in L_n$ for all $k=0,1,\dots,n$. Since $L_n$ is closed it follows from the form of the norm in $C(E)$ that $C_n(E)$ is a subspace of $C(E)$. We note that two functions $u(t)$ and $v(t)$ of $C(E)$ fall into one residue class relative to $C_n(E)$ if and only if for all $k=0,1,\dots,n$ the equation $\varphi_nu(t_k)=\varphi_nv(t_k)$ holds. Hence to each residue class of the space $C(E)$ relative to $C_n(E)$ there corresponds a collection $\{\varphi_nu(0),\varphi_nu(\Delta_nt),\dots,\varphi_nu(T)\}$ of $n+1$ elements of $E/L_n$. If the natural norm is introduced in the factor-spaces $E/L_n$ then the natural norms in the spaces $C(E)/C_n(E)$ are calculated by the formula
$$\|\Phi_nu\|_{C(E)/C_n(E)}=\max_{i=0,1,\dots,n}\|\varphi_nu(t_i)\|_{E/L_n}, \tag{3.21}$$
where $\Phi_n$ is the natural homomorphism of $C(E)$ onto $C(E)/C_n(E)$. We may consider other norms in the factor-spaces $C(E)/C_n(E)$ and $E/L_n$ as well. However, we will always suppose that the norms are consistent with equation (3.21).
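As a sanity check on the algebra, the step-by-step recursion (3.19) and the closed-form expression (3.20) can be compared numerically. The sketch below assumes a scalar, time-dependent stand-in `a_n(t)` for $A_n(t)$ and a hypothetical forcing `f_bar(t)`; because scalars commute, it does not test the ordering of the operator products in (3.20).

```python
import math

def a_n(t):
    """Hypothetical scalar stand-in for the operator A_n(t)."""
    return -(1.0 + t)

def f_bar(t):
    """Hypothetical forcing term."""
    return math.cos(t)

def march_recursion(u0, n, T=1.0):
    """Apply (3.19) step by step; returns the list [u_0, ..., u_n]."""
    dt = T / n
    u = [u0]
    for k in range(n):
        t_next = (k + 1) * dt
        r = 1.0 / (1.0 - dt * a_n(t_next))   # (I - dt*A_n(t_{k+1}))^{-1}
        u.append(r * u[k] + r * dt * f_bar(t_next))
    return u

def closed_form(u0, n, i, T=1.0):
    """Evaluate the i-th component of (3.20) directly."""
    dt = T / n
    # r[j] = (I - dt*A_n(t_j))^{-1} for j = 1..n; index 0 is unused.
    r = [None] + [1.0 / (1.0 - dt * a_n(j * dt)) for j in range(1, n + 1)]
    homogeneous = u0 * math.prod(r[1:i + 1])
    forced = dt * sum(math.prod(r[k:i + 1]) * f_bar(k * dt)
                      for k in range(1, i + 1))
    return homogeneous + forced

n = 20
u_rec = march_recursion(2.0, n)
max_gap = max(abs(u_rec[i] - closed_form(2.0, n, i)) for i in range(n + 1))
print("max |recursion - closed form| =", max_gap)
```

In practice one would use the recursion, which needs one solve per time step; the closed form is mainly useful for the stability estimates that follow.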
Thus, one may consider the factor-space $C(E)/C_n(E)$ as the set of all collections of $n+1$ elements of the space $E/L_n$ with a norm equal to the maximum of the norms of the components.

We may distinguish in the space $F(E)$ the subspace $F_n(E)$ consisting of the pairs $(v,u(t))$ such that $v\in L_n$ and $u(t_k)\in L_n$, $k=1,2,\dots,n$. The factor-space $F(E)/F_n(E)$ is also isomorphic to the space of all collections of $n+1$ elements of the space $E/L_n$. We introduce a norm in $F(E)/F_n(E)$ by a formula analogous to (3.21):
$$\|\Psi_n(v,u)\|_{F(E)/F_n(E)}=\max\{\|\varphi_nv\|_{E/L_n},\|\varphi_nu(0)\|_{E/L_n},\dots,\|\varphi_nu(t_{n-1})\|_{E/L_n}\},$$
where $\Psi_n$ is the natural homomorphism of $F(E)$ onto $F(E)/F_n(E)$.

The operator $T_n$ defined by formula (3.18) will now be considered as a bounded operator acting from $C(E)/C_n(E)$ into $F(E)/F_n(E)$. It has a bounded inverse $T_n^{-1}$, found from the formula (3.20).

We may treat the passage from the problem (3.13)-(3.14) to the problem (3.16)-(3.17) as the passage from the equation
$$Tu=f, \tag{3.22}$$
where $u\in D(T)$, $f=(u_0,f(t))\in F(E)$, to the equation
$$T_n\hat u^{(n)}=\hat f^{(n)}, \tag{3.23}$$
where $\hat u^{(n)}\in C(E)/C_n(E)$ and $\hat f^{(n)}=\Psi_nf\in F(E)/F_n(E)$.

Now we turn to the question of stability and approximation properties of the factor-method.

Theorem 7. For the stability of the factor-method it is sufficient that the condition
$$\left\|\prod_{j=k}^i(I-\Delta_ntA_n(t_j))^{-1}\right\|_{E/L_n}\le M\quad(1\le k\le i\le n) \tag{3.24}$$
be satisfied, where $M$ is a constant not depending on $n$, $i$ and $k$.

Proof. It follows from the definition of stability that we need to prove the uniform boundedness of the operators $T_n^{-1}$. Suppose that (3.24) is satisfied. Then from formula (3.20) we obtain
$$\begin{aligned}
\left\|(T_n^{-1}\hat f)_i\right\|_{E/L_n}
&\le M\|\bar u_0\|_{E/L_n}+\Delta_nt\,M\sum_{k=1}^i\left\|\bar f_k\right\|_{E/L_n}\\
&\le M\|\hat f\|_{F(E)/F_n(E)}+\Delta_nt\,M\|\hat f\|_{F(E)/F_n(E)}\,i\\
&\le M\|\hat f\|_{F(E)/F_n(E)}+\Delta_nt\,M\|\hat f\|_{F(E)/F_n(E)}\,n=M(1+T)\|\hat f\|_{F(E)/F_n(E)},
\end{aligned}$$
since
$$\|\hat f\|_{F(E)/F_n(E)}=\max\{\|\bar g_0\|_{E/L_n},\|\bar f_1\|_{E/L_n},\dots,\|\bar f_n\|_{E/L_n}\}.$$
Hence
$$\left\|T_n^{-1}\hat f\right\|_{C(E)/C_n(E)}=\max\left\{\left\|(T_n^{-1}\hat f)_0\right\|_{E/L_n},\left\|(T_n^{-1}\hat f)_1\right\|_{E/L_n},\dots,\left\|(T_n^{-1}\hat f)_n\right\|_{E/L_n}\right\}\le M(1+T)\|\hat f\|_{F(E)/F_n(E)}$$
and the stability condition follows:
$$\left\|T_n^{-1}\right\|_{F(E)/F_n(E)\to C(E)/C_n(E)}\le K,$$
where $K=M(1+T)$.

It is difficult to verify the stability condition (3.24). Therefore the following theorem provides a simpler sufficient condition:

Theorem 8. For the stability of the factor-method it is sufficient that the condition
$$\left\|(I-\Delta_ntA_n(t_j))^{-1}\right\|_{E/L_n}\le1+a\Delta_nt \tag{3.25}$$
be satisfied, where $a\ge0$ does not depend on $n$ and $j$.

Proof. It follows from (3.25) that
$$\left\|\prod_{j=k}^i(I-\Delta_ntA_n(t_j))^{-1}\right\|_{E/L_n}\le\prod_{j=k}^i\left\|(I-\Delta_ntA_n(t_j))^{-1}\right\|_{E/L_n}\le\prod_{j=k}^i(1+a\Delta_nt)=(1+a\Delta_nt)^{i-k+1}\le(1+a\Delta_nt)^n=\left(1+\frac{aT}{n}\right)^n\le e^{aT}.$$

Let us turn to the derivation of conditions under which the operators $T_n$ approximate the operator $T$. We suppose that the operators $A_n(t)$ approximate the operator $A(t)$ uniformly in $t$ on the solution $u(t)$ of the equation (3.13), i.e. that the condition
$$\lim_{n\to\infty}\max_{i=0,1,\dots,n}\|\varphi_nA(t_i)u(t_i)-A_n(t_i)\varphi_nu(t_i)\|_{E/L_n}=0 \tag{3.26}$$
is satisfied for any solution $u(t)$ of (3.13). Since $u(t)$ is the solution of equation (3.13),
$$Tu=(u_0,f(t))$$
and
$$\Psi_nTu=(\varphi_nu_0,\varphi_nf(t_1),\dots,\varphi_nf(t_n)).$$
By the construction of the operator $T_n$
$$T_n\Phi_nu=\left(\varphi_nu_0,\ \varphi_n\frac{u(t_1)-u_0}{\Delta_nt}-A_n(t_1)\varphi_nu(t_1),\ \varphi_n\frac{u(t_2)-u(t_1)}{\Delta_nt}-A_n(t_2)\varphi_nu(t_2),\ \dots,\ \varphi_n\frac{u(T)-u(t_{n-1})}{\Delta_nt}-A_n(T)\varphi_nu(T)\right).$$
Therefore we get
$$\|T_n\Phi_nu-\Psi_nTu\|_{F(E)/F_n(E)}=\max_{i=0,\dots,n-1}\left\|\varphi_n\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-A_n(t_{i+1})\varphi_nu(t_{i+1})-\varphi_nf(t_{i+1})\right\|_{E/L_n}. \tag{3.27}$$
For the estimation of this norm we note that in view of equation (3.13)
$$\left\|\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-A(t_{i+1})u(t_{i+1})-f(t_{i+1})\right\|_E=\left\|\frac{1}{\Delta_nt}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds\right\|_E.$$
We suppose that in the factor-space $E/L_n$ the norm has been introduced in such a way that
$$\|\varphi_nz\|_{E/L_n}\le c\|z\|_E,\quad z\in E.$$
(3.28)

Then
$$\begin{aligned}
&\left\|\varphi_n\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-A_n(t_{i+1})\varphi_nu(t_{i+1})-\varphi_nf(t_{i+1})\right\|_{E/L_n}\\
&\quad\le\left\|\varphi_n\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-\varphi_nA(t_{i+1})u(t_{i+1})-\varphi_nf(t_{i+1})\right\|_{E/L_n}+\|\varphi_nA(t_{i+1})u(t_{i+1})-A_n(t_{i+1})\varphi_nu(t_{i+1})\|_{E/L_n}\\
&\quad\le c\left\|\frac{1}{\Delta_nt}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds\right\|_E+\|\varphi_nA(t_{i+1})u(t_{i+1})-A_n(t_{i+1})\varphi_nu(t_{i+1})\|_{E/L_n}.
\end{aligned}$$
The solution $u(t)$ is continuously differentiable and therefore the first term becomes arbitrarily small uniformly in $t$ as $n\to\infty$ because of the uniform continuity of $u'(t)$ on $[0,T]$. The second term becomes arbitrarily small in view of (3.26). We then obtain from (3.27) the following:
$$\|T_n\Phi_nu-\Psi_nTu\|_{F(E)/F_n(E)}\to0\quad\text{as }n\to\infty.$$
Therefore using Theorems 6 and 7 we can conclude the following:

Theorem 9. Suppose that the condition (3.28), the approximation condition (3.26) and the stability condition (3.24) or (3.25) are satisfied. Then the approximate solutions $\bar u^{(n)}_k$ converge to the exact solution of the problem (3.13)-(3.14) if such a solution exists.

Remark: We can establish the rate of convergence of the term $\left\|\frac{1}{\Delta_nt}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds\right\|_E$. Indeed,
$$\frac{1}{\Delta_nt}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds=\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-u'(t_{i+1})$$
and using Taylor's series we have:
$$\frac{u(t_{i+1})-u(t_i)}{\Delta_nt}-u'(t_{i+1})=-\frac{u''(t_i)}{2}\Delta_nt+O((\Delta_nt)^2).$$
Therefore the rate of convergence of $\left\|\frac{1}{\Delta_nt}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds\right\|_E$ is of order $\Delta_nt$.

To improve the rate of convergence we could use, instead of the explicit scheme (3.15) as in [14] or the fully implicit scheme (3.16), the Crank-Nicolson method:
$$\frac{\bar u^{(n)}_{i+1}-\bar u^{(n)}_i}{\Delta_nt}=\frac12A_n(t_i)\bar u^{(n)}_i+\frac12A_n(t_{i+1})\bar u^{(n)}_{i+1}+\frac{\bar f^{(n)}_i+\bar f^{(n)}_{i+1}}{2},\quad i=0,1,\dots,n-1.$$
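The first-order rate of the implicit scheme (3.16) and the improvement expected from the Crank-Nicolson scheme can be checked empirically on a problem with a known solution. The sketch below assumes the scalar test equation $u'=-u+\cos t+\sin t$, $u(0)=0$, whose exact solution is $u(t)=\sin t$; this test problem is chosen for illustration and is not taken from the thesis. The error is measured in the maximum over grid points, in the spirit of Definition 4.

```python
import math

def forcing(t):
    # Chosen so that u(t) = sin(t) solves u' = -u + forcing(t), u(0) = 0.
    return math.cos(t) + math.sin(t)

def max_error(scheme, n, T=1.0):
    """Largest grid error max_k |u(t_k) - u_k| for the given scheme."""
    dt = T / n
    a = -1.0                      # scalar stand-in for A_n
    u, worst = 0.0, 0.0
    for k in range(n):
        t0, t1 = k * dt, (k + 1) * dt
        if scheme == "implicit":
            # Fully implicit step, as in (3.16).
            u = (u + dt * forcing(t1)) / (1.0 - dt * a)
        else:
            # Crank-Nicolson step: average of old and new right-hand sides.
            rhs = (1.0 + 0.5 * dt * a) * u \
                + 0.5 * dt * (forcing(t0) + forcing(t1))
            u = rhs / (1.0 - 0.5 * dt * a)
        worst = max(worst, abs(u - math.sin(t1)))
    return worst

# Halving the step: a ratio near 2 suggests first order, near 4 second order.
ratio_implicit = max_error("implicit", 50) / max_error("implicit", 100)
ratio_cn = max_error("cn", 50) / max_error("cn", 100)
print("implicit Euler error ratio:", ratio_implicit)
print("Crank-Nicolson error ratio:", ratio_cn)
```

Halving the time step should roughly halve the implicit Euler error and reduce the Crank-Nicolson error by a factor of about four, provided the solution is smooth enough for the Taylor expansions used in the remark.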
Defining the operator $T_n$ now by means of the formula

$$T_n\hat u = \Big\{\bar u_0,\ \frac{\bar u_1-\bar u_0}{\Delta_n t} - \frac12 A_n(t_1)\bar u_1 - \frac12 A_n(t_0)\bar u_0,\ \ldots,\ \frac{\bar u_n-\bar u_{n-1}}{\Delta_n t} - \frac12 A_n(t_n)\bar u_n - \frac12 A_n(t_{n-1})\bar u_{n-1}\Big\}$$

and using the same approach as above we would get a formula analogous to (3.19):

$$\bar u_{k+1} = \Big(I-\frac{\Delta_n t}{2}A_n(t_{k+1})\Big)^{-1}\Big(I+\frac{\Delta_n t}{2}A_n(t_k)\Big)\bar u_k + \Big(I-\frac{\Delta_n t}{2}A_n(t_{k+1})\Big)^{-1}\Delta_n t\,\frac{\bar f_{k+1}+\bar f_k}{2}$$

or

$$\bar u_i = \prod_{j=1}^{i}\Big(I-\frac{\Delta_n t}{2}A_n(t_j)\Big)^{-1}\Big(I+\frac{\Delta_n t}{2}A_n(t_{j-1})\Big)\bar u_0 + \frac{\Delta_n t}{2}\Big(I-\frac{\Delta_n t}{2}A_n(t_i)\Big)^{-1}(\bar f_i+\bar f_{i-1})$$
$$\qquad + \frac{\Delta_n t}{2}\Big(I-\frac{\Delta_n t}{2}A_n(t_i)\Big)^{-1}\sum_{k=1}^{i-1}\prod_{j=k}^{i-1}\Big[\Big(I-\frac{\Delta_n t}{2}A_n(t_j)\Big)^{-1}\Big(I+\frac{\Delta_n t}{2}A_n(t_j)\Big)\Big](\bar f_k+\bar f_{k-1}).$$

Then a sufficient condition for the stability of the factor-method in this case (analogous to Theorem 7) would be

$$\Big\|\prod_{j=k}^{i}\Big[\Big(I-\frac{\Delta_n t}{2}A_n(t_j)\Big)^{-1}\Big(I+\frac{\Delta_n t}{2}A_n(t_j)\Big)\Big]\Big\|_{E/L_n} \le M, \quad 1\le k\le i\le n,$$

and

$$\Big\|\Big(I-\frac{\Delta_n t}{2}A_n(t_j)\Big)^{-1}\Big\| \le L, \quad 1\le j\le n,$$

where $M$ is a constant not depending on $n$, $i$ and $k$, and $L$ is a constant not depending on $n$ and $j$.

Keeping the same approximation condition (3.26) we would get:

$$\Big\|\phi_n\frac{u(t_{i+1})-u(t_i)}{\Delta_n t} - \frac12 A_n(t_{i+1})\phi_n u(t_{i+1}) - \frac12 A_n(t_i)\phi_n u(t_i) - \frac12\phi_n f(t_{i+1}) - \frac12\phi_n f(t_i)\Big\|_{E/L_n}$$
$$\le c\,\Big\|\frac{1}{2\Delta_n t}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds + \frac{1}{2\Delta_n t}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_i)]\,ds\Big\|_{E}$$
$$\qquad + \frac12\|\phi_n A(t_{i+1})u(t_{i+1}) - A_n(t_{i+1})\phi_n u(t_{i+1})\|_{E/L_n} + \frac12\|\phi_n A(t_i)u(t_i) - A_n(t_i)\phi_n u(t_i)\|_{E/L_n}.$$

Now we can examine the rate of convergence of

$$\Big\|\frac{1}{2\Delta_n t}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_{i+1})]\,ds + \frac{1}{2\Delta_n t}\int_{t_i}^{t_{i+1}}[u'(s)-u'(t_i)]\,ds\Big\|_{E} = \Big\|\frac{1}{\Delta_n t}\int_{t_i}^{t_{i+1}}\Big[u'(s)-\frac{u'(t_i)+u'(t_{i+1})}{2}\Big]ds\Big\|_{E} = \Big\|\frac{u(t_{i+1})-u(t_i)}{\Delta_n t} - \frac{u'(t_i)+u'(t_{i+1})}{2}\Big\|_{E}.$$

Using Taylor's expansion of $u$ around the point $t = \frac{t_i+t_{i+1}}{2}$ we get:

$$\frac{u(t_{i+1})-u(t_i)}{\Delta_n t} - \frac{u'(t_i)+u'(t_{i+1})}{2} = u'''\Big(\frac{t_i+t_{i+1}}{2}\Big)\Big(\frac{\Delta_n t}{2}\Big)^{2}\Big(\frac{1}{3!}-\frac{1}{2!}\Big) + O((\Delta_n t)^3).$$
Therefore the rate of convergence is now of order $(\Delta_n t)^2$, and the use of the Crank-Nicolson method would be advantageous, especially when we need to solve the forward PDE in the adjoint method (Chapter 4).

3.3 Application of the Factor-Method to the Transformed Black-Scholes Equation

Now we show how Theorem 9 can be applied in our case. As before we consider an abstract evolution equation

$$\frac{du}{dt} = A(t,\sigma)u + f(t,\sigma) \tag{3.29}$$

with initial condition

$$u(x,0) = \phi(x) \tag{3.30}$$

in the space $E = H = L^2[0,1]$. Here the operator $A(t,\sigma): \mathrm{Dom}\,A \to E$ is defined as

$$A(t,\sigma)u = \alpha(t,\sigma)D^2u + \beta(t,\sigma)Du + \gamma(t,\sigma)u,$$

where $\alpha(t,\sigma)$, $\beta(t,\sigma)$, $\gamma(t,\sigma)$ are given by formulas (2.22)-(2.24), the nonhomogeneous term is given by $f(t,\sigma) = e^{x_1-qt}\big(r-q-\tfrac12\sigma^2(x,t)-brx+bqx\big)$, the initial condition by $\phi(x) = \max(e^{bx+x_0}-1,\,0) - b\,e^{x_1}x$, $\mathrm{Dom}\,A = \{u\in H^2(0,1) : u(0) = Du(1) = 0\}$, and the constants $x_0$, $x_1$ and $b$ have been defined earlier.

The parameter estimation problem consists of finding $\sigma\in Q$ that minimizes the least-squares functional

$$J(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}|u(x_{ij},T_i,K_{ij},\sigma) - u_{ij}|^2, \tag{3.31}$$

where $u$ is a solution of (3.29)-(3.30) at $t = T_i$ and $x_{ij} = \frac{\ln(S_0/K_{ij})-x_0}{b}$ corresponding to $\sigma$. Crucial to our further investigation is the fact that $Q$ here is a compact set (for more information on $J$ and $Q$ please refer to Section 2.4).

Let the approximation spaces $V^n$ be the spans of cubic spline basis elements which have been modified so as to satisfy the homogeneous boundary conditions, i.e. if $u\in V^n$ then $u(0) = Du(1) = 0$. This can be done in the following way, using the same approach as in [3]. For any integer $n > 0$, let $x^n_j = j/n$, $j = -3,\ldots,n+3$, and let $S^n_j$, $j = -1,\ldots,n+1$, be the cubic spline that vanishes outside $(x^n_{j-2}, x^n_{j+2})$, has value 4 and slope 0 at $x^n_j$, value 1 and slope $3n$ at $x^n_{j-1}$, and value 1 and slope $-3n$ at $x^n_{j+1}$. Since for some values of $j$ the spline $S^n_j$ does not satisfy the boundary conditions, we have to modify the basis elements.
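The description of $S^n_j$ above can be checked numerically. The sketch below is an illustration rather than the thesis's code: it assumes the standard piecewise formula for the cubic B-spline on unit knots, scaled so that its peak value is 4, and the mesh parameter $n = 4$ is an arbitrary choice. It confirms the stated values and slopes at the three interior knots and lists which splines fail $u(0) = 0$ or $Du(1) = 0$.

```python
def bspline(s):
    """Cubic B-spline on unit knots, scaled so that B(0) = 4 and B(+-1) = 1."""
    r = abs(s)
    if r >= 2.0:
        return 0.0
    if r >= 1.0:
        return (2.0 - r) ** 3
    return (2.0 - r) ** 3 - 4.0 * (1.0 - r) ** 3


def bspline_prime(s):
    """Derivative of bspline with respect to s."""
    r = abs(s)
    if r >= 2.0:
        return 0.0
    d = -3.0 * (2.0 - r) ** 2
    if r < 1.0:
        d += 12.0 * (1.0 - r) ** 2
    # d is the derivative with respect to |s|; flip the sign on the left half.
    return d if s >= 0 else -d


def S(n, j, x):
    """S_j^n(x): the spline centered at the knot x_j = j/n."""
    return bspline(n * x - j)


def DS(n, j, x):
    """First derivative of S_j^n at x (chain rule gives the factor n)."""
    return n * bspline_prime(n * x - j)


n = 4   # illustrative mesh parameter
j = 2   # an interior spline
knots = [(j - 1) / n, j / n, (j + 1) / n]
values = tuple(S(n, j, x) for x in knots)
slopes = tuple(DS(n, j, x) for x in knots)

# Splines S_j^n, j = -1, ..., n+1, that violate u(0) = 0 or Du(1) = 0.
violates_left = [jj for jj in range(-1, n + 2) if S(n, jj, 0.0) != 0.0]
violates_right = [jj for jj in range(-1, n + 2) if DS(n, jj, 1.0) != 0.0]

print("values at x_{j-1}, x_j, x_{j+1}:", values)
print("slopes at x_{j-1}, x_j, x_{j+1}:", slopes)
print("violate u(0) = 0: ", violates_left)
print("violate Du(1) = 0:", violates_right)
```

Only the splines centered at or next to an endpoint are affected: three of them are nonzero at $x = 0$ and two have nonzero slope at $x = 1$, which is why only the basis elements near the boundary need to be modified.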
The modified basis elements $\varphi^n_j$ are the restrictions to $[0,1]$ of the following functions: $\varphi^n_0 = S^n_0 - 4S^n_{-1}$, $\varphi^n_1 = S^n_0 - 4S^n_1$, $\varphi^n_j = S^n_j$ $(j = 2,\ldots,n-2)$, $\varphi^n_{n-1} = S^n_{n-1} + S^n_{n+1}$, and $\varphi^n_n = S^n_n$. The approximating subspaces $V^n$ of $V$ are then given by $V^n = \mathrm{span}\{\varphi^n_0,\ldots,\varphi^n_n\}$.

Clearly $V^n\subset \mathrm{Dom}\,A$ for each $n = 1,2,\ldots$ and $V^n$ is a finite dimensional subspace of $H$ with $V^n\subset V$. Then by the Projection Theorem (see [15]) we can write $H$ as a direct sum $H = V^n\oplus(V^n)^{\perp}$, where $(V^n)^{\perp}$ is the orthogonal complement of $V^n$ with respect to the $H$ inner product. The orthogonal projection $P^n: H\to V^n$ is characterized by the property $\langle P^n u - u, \varphi^n_j\rangle = 0$ for $u\in H$, $j = 0,1,\ldots,n$.

Take $L_n = (V^n)^{\perp}$. Then $E/L_n = H/(V^n)^{\perp}$ is isomorphic to $V^n$, and the natural homomorphism $\phi_n$ of $H$ onto $V^n$ becomes the orthogonal projection $P^n$ ($\phi_n = P^n$) with respect to the $H$ inner product. If we then consider the natural norm in the factor-space $H/(V^n)^{\perp}\simeq V^n$, we get $\|u\|_{H/(V^n)^{\perp}} = \|P^n u\|_H$ for any $u\in H$. With this norm, condition (3.28) is automatically satisfied with $c = 1$.

Consider in the space $V^n$ the fully discrete version of the equation (3.29)-(3.30) given by (3.19):

$$u^{i,j}_{k+1} = \Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}u^{i,j}_{k} + \frac{T_i}{n}\Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}P^n f^{i,j}_{k+1}, \qquad u^{i,j}_0 = P^n\phi^{i,j}, \tag{3.32}$$

where the bounded operators $A_n: V^n\to V^n$ are given by $A_n(t,\sigma) = P^n A(t,\sigma)$ and the superscripts $i,j$ indicate correspondence of the appropriate variable ($u$, $f$ or $\phi$) to the maturity $T_i$ $(i = 1,\ldots,l)$ and strike price $K_{ij}$ $(j = 1,\ldots,m_i)$. Here there is still some ambiguity regarding the relation between $T_i$ and $n$ that will be clarified later.

Associated with (3.32) is an approximate parameter estimation problem: find $\sigma_n\in Q$ that minimizes

$$J_n(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\big|u^{i,j}_{n,T_i}(\sigma) - u_{ij}\big|^2, \tag{3.33}$$

where $u^{i,j}_{n,T_i}(\sigma)$ is the $n$th approximation of $u^{i,j}(t,\sigma)$ at time $t = T_i$, $x = x_{ij}$.

First we show that the stability condition (3.25) is satisfied for the above defined operator $A_n$.
We need the following definitions and results from [20] (p. 399, 404).

Definition 5. We shall say that a rational function $r(z)$ is acceptable with respect to the set $\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}$, or equivalently a member of the class $\mathcal{A}_{\mathrm{Re}\,z\le 0}$, if

$$|r(z)-e^{z}| = O(|z|^{q+1}), \quad z\to 0,\ q\ge 1, \tag{3.34}$$

$$|r(z)|\le 1, \quad z\in\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}. \tag{3.35}$$

Definition 6. A set $L\subset\mathbb{C}$ (completed by the point at infinity) will be called a spectral set for the linear operator $T$ on a Hilbert space $H$ if (a) it is closed, (b) $L\supseteq\sigma(T)$ and (c) for every rational function $u(z)$ satisfying the inequality $|u(z)|\le 1$ for all $z\in L$, we have that $\|u(T)\|\le 1$.

Theorem 10. A necessary and sufficient condition that the halfplane $\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}$ be a spectral set for the bounded linear operator $T$ is that $\mathrm{Re}\,\langle Tf,f\rangle\le 0$ for all $f\in H$.

Lemma 8. Suppose (1) $C(z)\in\mathcal{A}_{\mathrm{Re}\,z\le 0}$, (2) $\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}$ is a spectral set for $T-\beta I$, where $\beta > 0$ and $T$ is a bounded linear operator on a Hilbert space $H$. Then $\|C((r/N)T)\|\le 1+\beta Kr/N$ for some constant $K$ which is independent of $N$.

Choose the function $C(z)$ to be $(1-z)^{-1}$. Then using Taylor's expansion we have

$$|C(z)-e^{z}| = \Big|1+z+z^2+\ldots-\Big(1+z+\frac{z^2}{2!}+\ldots\Big)\Big| = \Big|z^2\Big(1-\frac{1}{2!}\Big)+\ldots\Big| = O(|z|^2)$$

and

$$|C(z)| = \Big|\frac{1}{1-z}\Big| = \frac{1}{|1-(x+iy)|} = \frac{1}{\sqrt{(1-x)^2+y^2}}\le 1$$

for $z\in\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}$. Therefore, by the definition, the function $C(z) = (1-z)^{-1}\in\mathcal{A}_{\mathrm{Re}\,z\le 0}$.

An operator $A_n(t,\sigma): V^n\to V^n$ can be considered on the Hilbert space $V^n$ with the $L^2$ inner product. Then for all $f\in V^n$ and for the constant $k$ from Lemma 2 (if $k$ is negative there, we can use $|k|$) we have

$$\langle(A_n-kI)f,f\rangle = \langle P^n A(t,\sigma_n)f,f\rangle - k\langle f,f\rangle = \langle A(t,\sigma_n)f,f\rangle - k\langle f,f\rangle.$$

Taking into account the relation between the operator $A$ and the bilinear form $a(\cdot,\cdot)$ from Chapter 2 we have $\langle A(t,\sigma_n)f,f\rangle = -a_n(f,f)$ and therefore, using Lemma 2,

$$\langle(A_n-kI)f,f\rangle = -a_n(f,f)-k\|f\|^2_H\le-\delta\|f\|^2_V\le 0.$$

Therefore by Theorem 10, $\{z\in\mathbb{C} : \mathrm{Re}\,z\le 0\}$ is a spectral set for the operator $A_n(t,\sigma_n)-kI$ and all conditions of Lemma 8 are satisfied.
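The two requirements (3.34)-(3.35) of Definition 5 can be spot-checked numerically for $C(z) = (1-z)^{-1}$. The sketch below is only an illustration and proves nothing: the sampling grid, its radius and the two small test arguments are arbitrary choices. For comparison it applies the same checks to the rational function $(1-\frac{z}{2})^{-1}(1+\frac{z}{2})$ associated with the Crank-Nicolson method.

```python
import cmath
import math


def implicit(z):
    """C(z) = (1 - z)^(-1), associated with the fully implicit scheme."""
    return 1.0 / (1.0 - z)


def crank_nicolson(z):
    """C(z) = (1 - z/2)^(-1) (1 + z/2), associated with Crank-Nicolson."""
    return (1.0 + 0.5 * z) / (1.0 - 0.5 * z)


def max_modulus(r, radius=20.0, m=80):
    """Largest |r(z)| over a grid of sample points with Re z <= 0."""
    best = 0.0
    for a in range(m + 1):
        for b in range(-m, m + 1):
            z = complex(-radius * a / m, radius * b / m)
            best = max(best, abs(r(z)))
    return best


def local_order(r):
    """Estimate p in |r(z) - e^z| ~ C|z|^p near 0; p corresponds to q + 1 in (3.34)."""
    z1, z2 = -1.0e-2, -5.0e-3
    e1 = abs(r(z1) - cmath.exp(z1))
    e2 = abs(r(z2) - cmath.exp(z2))
    return math.log(e1 / e2, 2)


for name, r in (("implicit", implicit), ("Crank-Nicolson", crank_nicolson)):
    print(f"{name}: max |C(z)| on grid = {max_modulus(r):.6f}, "
          f"local order = {local_order(r):.2f}")
```

On the sampled points neither modulus should exceed 1 (up to rounding on the imaginary axis for Crank-Nicolson), and the estimated exponents should be close to 2 and 3, i.e. $q = 1$ for the implicit scheme and $q = 2$ for Crank-Nicolson.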
Then we have

$$\Big\|C\Big(\frac{T}{n}A_n(t_j)\Big)\Big\| = \|(I-\Delta_n t\,A_n(t_j))^{-1}\|\le 1+kL\,\Delta_n t$$

and Theorem 8 guarantees the stability of the factor-method.

Remark: Using the same approach as above and taking $C(z) = (1-\frac{z}{2})^{-1}(1+\frac{z}{2})$ we can obtain the first condition for the stability of the Crank-Nicolson method. The second condition can also be satisfied using the estimate for the resolvent from [2].

Now we need to show that the approximation condition (3.26) is satisfied. The following result from the theory of partial differential equations, similar to the one presented in [3], will be required. Consider an abstract evolution equation

$$\frac{du}{dt} = A(t,\sigma)u + f(t,\sigma) \tag{3.36}$$

with the initial condition

$$u(x,0) = \phi(x), \tag{3.37}$$

where the operator $A$ is the same as in equation (3.29) and the function $f$ is such that

$$f(0,x=0,\sigma(0,0)) = Df(0,x=1,\sigma(0,1)) = 0. \tag{3.38}$$

Then we have the following lemma.

Lemma 9. If the initial condition $\phi\in I = H^6\cap H^3_0$, then the solution to (3.36) corresponding to the initial data $\phi$ is such that $u(t,x)\in H^4\cap\{u\in H^1 : u(0) = Du(1) = 0\}$.

A sketch of the proof of this lemma, which requires results from [12], is given in [3]. The following lemma providing estimates for spline approximation is also needed.

Lemma 10. Let $u\in H^4$. Then

$$\|u-P^n u\|\le\frac{c_0}{n^4}\|D^4u\|, \tag{3.39}$$

$$\|D(u-P^n u)\|\le\frac{c_1}{n^3}\|D^4u\|, \tag{3.40}$$

$$\|D^2(u-P^n u)\|\le\frac{c_2}{n^2}\|D^4u\|, \tag{3.41}$$

where $c_0$, $c_1$, $c_2$ are constants independent of $u$ and $n$.

The proof of this lemma can be found in [22]. Now we can prove the following theorem.

Theorem 11. Consider an abstract evolution equation (3.36) with nonhomogeneous term satisfying (3.38) and an initial condition $\phi\in I$. The operators $A_n(t,\sigma_n) = P^n A(t,\sigma_n)$ approximate the operator $A(t,\sigma)$ if $\sigma_n\to\sigma$ as $n\to\infty$.

Proof. We just need to check that the condition (3.26) is satisfied.
$$\|\phi_n A(t_i)u(t_i)-A_n(t_i)\phi_n u(t_i)\| = \|P^n A(t_i,\sigma)u(t_i)-P^n A(t_i,\sigma_n)P^n u(t_i)\|\le\|A(t_i,\sigma)u(t_i)-A(t_i,\sigma_n)P^n u(t_i)\|$$
$$= \big\|\alpha(t_i,\sigma)D^2u(t_i)+\beta(t_i,\sigma)Du(t_i)+\gamma(t_i,\sigma)u(t_i) - \big(\alpha(t_i,\sigma_n)D^2P^n u(t_i)+\beta(t_i,\sigma_n)DP^n u(t_i)+\gamma(t_i,\sigma_n)P^n u(t_i)\big)\big\|$$
$$\le\|\alpha D^2u-\alpha_n D^2P^n u\|+\|\beta Du-\beta_n DP^n u\|+\|\gamma u-\gamma_n P^n u\|$$
$$\le\|(\alpha-\alpha_n)D^2u\|+\|\alpha_n(D^2u-D^2P^n u)\|+\|(\beta-\beta_n)Du\|+\|\beta_n(Du-DP^n u)\|+\|(\gamma-\gamma_n)u\|+\|\gamma_n(u-P^n u)\|.$$

It was shown in Lemma 5 (using the definitions of $\alpha$, $\beta$ and $\gamma$, (2.22)-(2.24), and the assumptions regarding $\sigma$) that

$$|\alpha-\alpha_n|\le K_1 d(\sigma,\sigma_n), \quad |\beta-\beta_n|\le K_2 d(\sigma,\sigma_n), \quad |\gamma-\gamma_n| = 0, \quad |\alpha_n|\le K, \quad |\beta_n|\le K, \quad |\gamma_n|\le K.$$

These inequalities together with the results from the last two lemmas yield

$$\|\phi_n A(t_i)u(t_i)-A_n(t_i)\phi_n u(t_i)\|\le K_1 d(\sigma,\sigma_n)\|D^2u\|+K_2 d(\sigma,\sigma_n)\|Du\|+K\frac{c_2}{n^2}\|D^4u\|+K\frac{c_1}{n^3}\|D^4u\|+K\frac{c_0}{n^4}\|D^4u\|.$$

Taking the maximum of the last inequality with respect to $i = 0,1,\ldots,n$ and letting $n\to\infty$ produces the desired result.

Therefore we have established that for an abstract evolution equation (3.36) with the initial condition $\phi$ from $I$ and $f$ satisfying (3.38), $u_{n,k}(\sigma_n)\to u(t,\sigma)$ if $\sigma_n\to\sigma$ as $n\to\infty$, i.e. $\max_{k=0,1,\ldots,n}\|u_{n,k}(\sigma_n)-P^n u(t_k,\sigma)\|\to 0$ as $n\to\infty$. It is worth emphasizing that the last convergence holds in the $H = L^2$ norm.

Theorem 12. Suppose that $\sigma_n\to\sigma$, where $\sigma_n$ and $\sigma$ are arbitrary in $Q$. Then $u_{n,k}(\sigma_n)\to u(t,\sigma)$, i.e. $\max_{k=0,1,\ldots,n}\|u_{n,k}(\sigma_n)-P^n u(t_k,\sigma)\|\to 0$ as $n\to\infty$.

Proof. Since our initial condition $\phi\in H^1$ but $\phi\notin I$, and $f\in H^1$ but $f$ does not satisfy (3.38), we can approximate them by $\hat\phi\in I$ and $\hat f$ satisfying (3.38) by using a density argument, i.e. the quantities $\|\hat\phi-\phi\|$ and $\|\hat f(s)-f(s)\|$ can be made as small as desired. Therefore we have:

$$\|u_{n,k}(\sigma_n,f,\phi)-P^n u(t_k,\sigma,f,\phi)\|\le\|u_{n,k}(\sigma_n,f,\phi)-u_{n,k}(\sigma_n,\hat f,\hat\phi)\|+\|u_{n,k}(\sigma_n,\hat f,\hat\phi)-P^n u(t_k,\sigma,\hat f,\hat\phi)\|+\|P^n u(t_k,\sigma,\hat f,\hat\phi)-P^n u(t_k,\sigma,f,\phi)\|.$$

As was mentioned before, the second term goes to 0 as $n\to\infty$.
Since the operator $A$ generates an evolution operator $U$ on $H$, applying the results of Theorem 2, every solution of (3.29)-(3.30) can be written as

$$u(t,\sigma,f,\phi) = U(t,0)\phi+\int_0^t U(t,s)f(s)\,ds. \tag{3.42}$$

Then

$$\|u(t,\sigma,\hat f,\hat\phi)-u(t,\sigma,f,\phi)\|\le\|U(t,0)\|\,\|\hat\phi-\phi\|+\int_0^t\|U(t,s)\|\,\|\hat f(s)-f(s)\|\,ds\le c\,\|\hat\phi-\phi\|+c\int_0^t\|\hat f(s)-f(s)\|\,ds,$$

where for the last estimate we used the property of the evolution operator $\|U(t,s)\|\le Me^{w(t-s)}$ and the fact that $t,s\in[0,T]$. Now we can make the right hand side of the last inequality arbitrarily small. Therefore the term $\|P^n u(t_k,\sigma,\hat f,\hat\phi)-P^n u(t_k,\sigma,f,\phi)\|$ can be made arbitrarily small as well. We can also make the term $\|u_{n,k}(\sigma_n,f,\phi)-u_{n,k}(\sigma_n,\hat f,\hat\phi)\|$ as small as we want using the stability property and (3.20). Thus the convergence result is established.

To prove the next result (similar to the one from [3]), which indicates how an approximate estimation problem can be used to obtain solutions of the original problem, we need to modify the cost functionals $J$ and $J_n$. This has to be done because the functionals $J$ and $J_n$ require a pointwise evaluation (in the spatial variable $x$) of the function $u$. This is not possible since we have established convergence results (Theorem 12) only in the $H = L^2$ norm.

Define an "average" of the function $u$ in a neighborhood of a point $x$ $(0 < x < 1)$:

$$\hat u(x,\varepsilon) = \frac{1}{2\varepsilon}\int_{x-\varepsilon}^{x+\varepsilon}u(y)\,dy$$

for some small $\varepsilon > 0$. Similarly we can define for $x = 0$, $\hat u(0,\varepsilon) = \frac{1}{\varepsilon}\int_0^{\varepsilon}u(y)\,dy$, and for $x = 1$, $\hat u(1,\varepsilon) = \frac{1}{\varepsilon}\int_{1-\varepsilon}^{1}u(y)\,dy$. Similarly define $\hat u^{i,j}_{n,T_i,\varepsilon}$ at a point $x_{ij}$:

$$\hat u^{i,j}_{n,T_i,\varepsilon} = \frac{1}{2\varepsilon}\int_{x_{ij}-\varepsilon}^{x_{ij}+\varepsilon}u^{i,j}_{n,T_i}(y)\,dy.$$

Then we can define the functional $\hat J$ as

$$\hat J(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}|\hat u(x_{ij},\varepsilon,T_i,K_{ij},\sigma)-u_{ij}|^2, \tag{3.43}$$

and the functional $\hat J_n$ as

$$\hat J_n(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\big|\hat u^{i,j}_{n,T_i,\varepsilon}(\sigma)-u_{ij}\big|^2. \tag{3.44}$$

Then we can prove the following result.

Theorem 13. Let $\{\sigma_n\}$ be given, where each $\sigma_n$ is a solution to the approximate parameter estimation problem (3.32)-(3.44).
Then there exist $\sigma^*\in Q$ and a subsequence $\{\sigma_{n_v}\}$ such that $\sigma_{n_v}\to\sigma^*$ and $\sigma^*$ is a solution to the original estimation problem (3.29)-(3.43); in fact, for any convergent subsequence $\sigma_{n_v}\to\sigma$, the limit $\sigma$ is a solution to this problem. If the problem has a unique solution, then the sequence $\{\sigma_n\}$ itself converges to this solution.

Proof. Since $Q$ is compact, convergence of a subsequence $\{\sigma_{n_v}\}$ to some $\sigma^*$ is ensured. For each $i = 1,\ldots,l$, $j = 1,\ldots,m_i$ we have

$$\|u^{i,j}_{n,T_i}(\sigma_n)-u(T_i,\sigma)\|\le\|u^{i,j}_{n,T_i}(\sigma_n)-P^n u(T_i,\sigma)\|+\|P^n u(T_i,\sigma)-u(T_i,\sigma)\|.$$

From Theorem 12 we know that $\max_{k=0,1,\ldots,n}\|u_{n,k}(\sigma_n)-P^n u(t_k,\sigma)\|\to 0$ as $n\to\infty$, and therefore the first term in the above inequality goes to 0 as $n\to\infty$. The second term goes to 0 as well, since we can use an estimate from Lemma 10 (even if $u$ in our case is not in $H^4$, it can be approximated by $\tilde u\in H^4$ using a density argument). Therefore $\|u^{i,j}_{n,T_i}(\sigma_n)-u(T_i,\sigma)\|\to 0$ as $n\to\infty$. Using the definitions of $\hat u(x,\varepsilon)$ and $\hat u^{i,j}_{n,T_i,\varepsilon}$ and the Cauchy-Schwarz inequality we obtain:

$$|\hat u^{i,j}_{n,T_i,\varepsilon}(\sigma_n)-\hat u(T_i,\sigma,x_{ij},\varepsilon)|\le\frac{1}{2\varepsilon}\int_{x_{ij}-\varepsilon}^{x_{ij}+\varepsilon}|u^{i,j}_{n,T_i}(\sigma_n,y)-u(T_i,\sigma,y)|\,dy$$
$$\le\frac{1}{\sqrt{2\varepsilon}}\Big(\int_{x_{ij}-\varepsilon}^{x_{ij}+\varepsilon}|u^{i,j}_{n,T_i}(\sigma_n,y)-u(T_i,\sigma,y)|^2\,dy\Big)^{1/2}\le\frac{1}{\sqrt{2\varepsilon}}\|u^{i,j}_{n,T_i}(\sigma_n)-u(T_i,\sigma)\|.$$

Therefore $|\hat u^{i,j}_{n,T_i,\varepsilon}(\sigma_n)-\hat u(T_i,\sigma,x_{ij},\varepsilon)|\to 0$ as $n\to\infty$. From (3.43),

$$\hat J(\sigma^*) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}|\hat u(x_{ij},\varepsilon,T_i,K_{ij},\sigma^*)-u_{ij}|^2 = \lim_{n_v\to\infty}\sum_{i=1}^{l}\sum_{j=1}^{m_i}\big|\hat u^{i,j}_{n_v,T_i,\varepsilon}(\sigma_{n_v})-u_{ij}\big|^2 = \lim_{n_v\to\infty}\hat J_{n_v}(\sigma_{n_v})\le\lim_{n_v\to\infty}\hat J_{n_v}(\sigma)$$

for any $\sigma\in Q$. But Theorem 12 is also valid with the constant sequence $\sigma$, so that $\max_{k=0,1,\ldots,n}\|u_{n,k}(\sigma)-P^n u(t_k,\sigma)\|\to 0$ as $n\to\infty$; we are thus guaranteed that $\lim_{n_v\to\infty}\hat J_{n_v}(\sigma) = \hat J(\sigma)$, so that $\hat J(\sigma^*)\le\hat J(\sigma)$ for any $\sigma\in Q$. If the problem of minimizing $\hat J$ over $Q$ has a unique solution $\sigma^*$, then standard subsequential arguments yield convergence of $\{\sigma_n\}$ itself.
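The Cauchy-Schwarz step in the proof, which bounds the difference of two local averages by $\frac{1}{\sqrt{2\varepsilon}}$ times the $L^2$ distance of the functions, can be illustrated numerically. The following sketch is not from the thesis: the two functions, the point $x$, the half-width $\varepsilon$ and the midpoint quadrature rule are all illustrative assumptions.

```python
import math


def integrate(g, a, b, m=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))


def local_average(u, x, eps):
    """(1 / (2 eps)) * integral of u over [x - eps, x + eps], for eps <= x <= 1 - eps."""
    return integrate(u, x - eps, x + eps) / (2.0 * eps)


def l2_norm(u):
    """Norm of u in L^2(0, 1)."""
    return math.sqrt(integrate(lambda y: u(y) ** 2, 0.0, 1.0))


def u(y):
    """Illustrative 'exact' state."""
    return math.sin(math.pi * y)


def w(y):
    """Illustrative approximation: u plus an oscillatory perturbation."""
    return u(y) + 0.05 * math.cos(7.0 * math.pi * y)


x, eps = 0.4, 0.05
gap = abs(local_average(w, x, eps) - local_average(u, x, eps))
bound = l2_norm(lambda y: w(y) - u(y)) / math.sqrt(2.0 * eps)

print(f"|average(w) - average(u)| = {gap:.5f}")
print(f"L2 distance / sqrt(2 eps) = {bound:.5f}")
```

The factor $\frac{1}{\sqrt{2\varepsilon}}$ grows as $\varepsilon$ shrinks, so the estimate is useful only for a fixed averaging width, which is how $\varepsilon$ enters the modified functionals $\hat J$ and $\hat J_n$.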
Remark: As was mentioned before, we had to change the cost functionals $J$ and $J_n$ because of the required pointwise (in the spatial variable) evaluation of $u$. Another way to address this issue would be to prove convergence results (as in Theorem 12) in the stronger $V$-norm. For that we would apply the factor-method described above in the space $E = V = H^1[0,1]$ with the corresponding Sobolev space norm. Then $\phi_n = P^n_V$ would be a projection of $V$ onto $V^n$. In order to prove convergence in the $V$-norm we would need to show that the stability and approximation conditions are satisfied. It is not difficult to prove that the operator $A_n(t,\sigma_n) = P^n_V A(t,\sigma_n)$ approximates the operator $A$ in the $V$-norm (the proof, which is similar to the proof of Theorem 11, would require the estimate $\|D^3(u-P^n u)\|_{L^2}\le\frac{c_3}{n}\|D^4u\|_{L^2}$ for $u\in H^4$, which can be obtained by using the results of Lemma 10, estimates from [24] and the Schmidt inequality [22]). However, the stability condition is much harder to prove, and as of yet we have been unsuccessful in establishing the requisite estimates in the $V$-norm.

Chapter 4
Application of the Adjoint Method for Computing the Gradient

Now we show how the gradient of $J_n(\sigma)$ for the approximate parameter estimation problem (3.33) can be computed in a very efficient and precise way using the adjoint method (see for example [19]).

We will assume that all maturities $T_i$ are arranged in ascending order, i.e. $0 < T_1 < T_2 < \ldots < T_l$. We divide the interval $[0,T_l]$ into $n$ equal subintervals such that all $T_i$'s are grid points: $0 = t_0 < t_1 < \ldots < t_n = T_l$ and $\{T_1,\ldots,T_l\}\subset\{t_0,\ldots,t_n\}$.

For a numerical implementation we would have to discretize $\sigma(t,x)$ (for example, $\sigma(t,x) = \sum_i\sum_j c_{ij}\phi_{ij}(t,x)$, where the $\phi_{ij}$ can be tensor products of splines, etc.). Our goal would then be to determine the set of coefficients $c_{ij}$. However, for the purposes of presentation we keep using $\sigma$ as the only parameter we need to determine.
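Then the approximate minimization problem (3.32)-(3.33) becomes: find $\sigma^*_n$ which minimizes the functional

$$J_n(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\Big|u^{i,j}_{\frac{nT_i}{T_l}}-u_{ij}\Big|^2,$$

where for each $(i,j)$, $u^{i,j}_k$ is given by

$$u^{i,j}_{k+1} = \Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}u^{i,j}_k+\frac{T_i}{n}\Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}P^n f^{i,j}_{k+1}, \qquad u^{i,j}_0 = P^n\phi^{i,j}, \tag{4.1}$$

for $k = 0,1,\ldots,\frac{nT_i}{T_l}-1$, $j = 1,\ldots,m_i$, $i = 1,\ldots,l$ and $t_k = \frac{kT_l}{n}$. To simplify the presentation and emphasize the dependence of variables on the unknown parameter $\sigma$ we will write

$$C^{i,j}_{k+1}(\sigma) = \Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}, \qquad F^{i,j}_{k+1}(\sigma) = \frac{T_i}{n}\Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}P^n f^{i,j}_{k+1}.$$

Then (4.1) becomes:

$$u^{i,j}_{k+1}(\sigma) = C^{i,j}_{k+1}(\sigma)u^{i,j}_k(\sigma)+F^{i,j}_{k+1}(\sigma), \qquad u^{i,j}_0 = P^n\phi^{i,j}, \tag{4.2}$$

for $k = 0,1,\ldots,\frac{nT_i}{T_l}-1$, $j = 1,\ldots,m_i$, $i = 1,\ldots,l$ and $t_k = \frac{kT_l}{n}$. The functional $J_n(\sigma)$ can be rewritten as

$$J_n(\sigma) = \sum_{k=1}^{n}\sum_{i=1}^{l}\sum_{j=1}^{m_i}\big|u^{i,j}_k-u_{ij}\big|^2\,\delta_{k,\frac{nT_i}{T_l}}, \tag{4.3}$$

where $\delta_{i,j}$ is the Kronecker delta. Differentiating (4.3) and (4.2) with respect to $\sigma$, we obtain:

$$\nabla J_n(\sigma) = \sum_{k=1}^{n}\sum_{i=1}^{l}\sum_{j=1}^{m_i}2(u^{i,j}_k-u_{ij})^T\frac{\partial u^{i,j}_k}{\partial\sigma}\,\delta_{k,\frac{nT_i}{T_l}} = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{n}2(u^{i,j}_k-u_{ij})^T\frac{\partial u^{i,j}_k}{\partial\sigma}\,\delta_{k,\frac{nT_i}{T_l}}$$

and

$$\frac{\partial u^{i,j}_{k+1}(\sigma)}{\partial\sigma} = C^{i,j}_{k+1}(\sigma)\frac{\partial u^{i,j}_k(\sigma)}{\partial\sigma}+\frac{\partial C^{i,j}_{k+1}(\sigma)}{\partial\sigma}u^{i,j}_k(\sigma)+\frac{\partial F^{i,j}_{k+1}(\sigma)}{\partial\sigma}.$$

Define for each $i$ and $j$ ($i = 1,\ldots,l$, $j = 1,\ldots,m_i$) the co-states $z^{i,j}_k$ and the co-state or adjoint equations:

$$z^{i,j}_{k-1}(\sigma) = (C^{i,j}_{k+1}(\sigma))^T z^{i,j}_k(\sigma)+v^{i,j}_k(\sigma), \qquad z^{i,j}_{\frac{nT_i}{T_l}} = 0, \tag{4.4}$$

$k = 1,\ldots,\frac{nT_i}{T_l}$, where the source term carries the Kronecker delta from (4.3) and is therefore nonzero only at the observation index:

$$v^{i,j}_k = 2\,(u^{i,j}_k-u_{ij})\,\delta_{k,\frac{nT_i}{T_l}}.$$

It then follows that

$$\nabla J_n(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{n}(v^{i,j}_k)^T\frac{\partial u^{i,j}_k}{\partial\sigma} = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{\frac{nT_i}{T_l}}(v^{i,j}_k)^T\frac{\partial u^{i,j}_k}{\partial\sigma}$$
$$= \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{\frac{nT_i}{T_l}}\big(z^{i,j}_{k-1}-(C^{i,j}_{k+1})^T z^{i,j}_k\big)^T\frac{\partial u^{i,j}_k}{\partial\sigma}$$
$$= \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{\frac{nT_i}{T_l}}(z^{i,j}_{k-1})^T\frac{\partial u^{i,j}_k}{\partial\sigma}-\sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=1}^{\frac{nT_i}{T_l}}(z^{i,j}_k)^T C^{i,j}_{k+1}\frac{\partial u^{i,j}_k}{\partial\sigma}$$
$$= \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=0}^{\frac{nT_i}{T_l}-1}(z^{i,j}_k)^T\frac{\partial u^{i,j}_{k+1}}{\partial\sigma}-\sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=0}^{\frac{nT_i}{T_l}-1}(z^{i,j}_k)^T C^{i,j}_{k+1}\frac{\partial u^{i,j}_k}{\partial\sigma}+\sum_{i=1}^{l}\sum_{j=1}^{m_i}(z^{i,j}_0)^T C^{i,j}_1\frac{\partial u^{i,j}_0}{\partial\sigma}$$
$$= \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=0}^{\frac{nT_i}{T_l}-1}(z^{i,j}_k)^T\Big[\frac{\partial u^{i,j}_{k+1}}{\partial\sigma}-C^{i,j}_{k+1}\frac{\partial u^{i,j}_k}{\partial\sigma}\Big] = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=0}^{\frac{nT_i}{T_l}-1}(z^{i,j}_k)^T\Big[\frac{\partial C^{i,j}_{k+1}}{\partial\sigma}u^{i,j}_k+\frac{\partial F^{i,j}_{k+1}}{\partial\sigma}\Big],$$

where we used $z^{i,j}_{nT_i/T_l} = 0$ in the re-indexing and the fact that $u^{i,j}_0$ is independent of $\sigma$.

Therefore, to compute the gradient $\nabla J_n(\sigma)$ for a given value of $\sigma\in Q$, the following calculations are needed for each $i = 1,\ldots,l$, $j = 1,\ldots,m_i$:

1. Integrate the forward model (4.2) once and save the states $u^{i,j}_k$, $k = 0,1,\ldots,\frac{nT_i}{T_l}$.
2. Integrate the adjoint system (4.4) backward once and save the states $z^{i,j}_k$, $k = 0,1,\ldots,\frac{nT_i}{T_l}$.
3. Compute the gradient using the saved states via the formula

$$\nabla J_n(\sigma) = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\sum_{k=0}^{\frac{nT_i}{T_l}-1}(z^{i,j}_k)^T\Big[\frac{\partial C^{i,j}_{k+1}}{\partial\sigma}u^{i,j}_k+\frac{\partial F^{i,j}_{k+1}}{\partial\sigma}\Big].$$

Remark: If we want to add a Tikhonov regularization term (as in [16]) to the least squares performance index, this can easily be done and will not have an impact on the adjoint method. The regularization term depends explicitly on $\sigma$, and its derivative can be added to the above formula for the gradient.

In order to carry out the required computations, matrix representations for the operators $C^{i,j}_k(\sigma)$ and functions $F^{i,j}_k(\sigma)$ have to be computed.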
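The three-step procedure (forward sweep, backward adjoint sweep, gradient assembly) can be sketched for a single observation in the scalar case. Everything specific in the sketch is an assumption made for illustration: the scalar "generator" `a`, the forcing `f`, the step count, the observation value and the parameter value are hypothetical stand-ins, and $\sigma$ is a single number. The adjoint source term is taken to be nonzero only at the final index, reflecting the Kronecker delta in (4.3), and the result is compared with a central finite difference.

```python
N = 50            # number of time steps; plays the role of n*T_i/T_l
T_END = 1.0       # illustrative maturity
H = T_END / N     # time step
U0 = 1.0          # initial state, independent of sigma
U_OBS = 0.5       # illustrative observed value


def a(k, s):
    """Hypothetical scalar stand-in for A_n(t_k, sigma)."""
    return -s * s * (1.0 + k * H)


def da(k, s):
    """Derivative of a with respect to sigma."""
    return -2.0 * s * (1.0 + k * H)


def f(k, s):
    """Hypothetical forcing f(t_k, sigma)."""
    return 1.0 - 0.5 * s * s


def df(k, s):
    """Derivative of f with respect to sigma."""
    return -s


def C(k, s):
    """C_k = (1 - h*a_k)^(-1)."""
    return 1.0 / (1.0 - H * a(k, s))


def dC(k, s):
    """dC_k/dsigma = h * C_k * (da_k/dsigma) * C_k."""
    return H * C(k, s) * da(k, s) * C(k, s)


def F(k, s):
    """F_k = h * C_k * f_k."""
    return H * C(k, s) * f(k, s)


def dF(k, s):
    """Product rule applied to F_k."""
    return H * (dC(k, s) * f(k, s) + C(k, s) * df(k, s))


def forward(s):
    """Step 1: u_{k+1} = C_{k+1} u_k + F_{k+1}, saving all states."""
    u = [U0]
    for k in range(N):
        u.append(C(k + 1, s) * u[k] + F(k + 1, s))
    return u


def cost(s):
    return (forward(s)[N] - U_OBS) ** 2


def adjoint_gradient(s):
    u = forward(s)
    z = [0.0] * (N + 1)                 # terminal condition z_N = 0
    for k in range(N, 0, -1):           # step 2: backward sweep
        v_k = 2.0 * (u[k] - U_OBS) if k == N else 0.0
        z[k - 1] = C(k + 1, s) * z[k] + v_k
    # Step 3: assemble the gradient from the saved states.
    return sum(z[k] * (dC(k + 1, s) * u[k] + dF(k + 1, s)) for k in range(N))


sigma = 0.3       # illustrative parameter value
delta = 1.0e-6
grad_adjoint = adjoint_gradient(sigma)
grad_fd = (cost(sigma + delta) - cost(sigma - delta)) / (2.0 * delta)
print(f"adjoint gradient:           {grad_adjoint:.8f}")
print(f"central finite difference:  {grad_fd:.8f}")
```

One forward and one backward sweep give the full derivative here, whereas finite differences need two forward solves per parameter, which is the efficiency argument for the adjoint method once $\sigma$ is represented by many coefficients. In the setting of the thesis the factors $C^{i,j}_k$ and $F^{i,j}_k$ are operators and functions rather than numbers, so the same three steps require their matrix representations.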
To this end we use Galerkin approximations given by (3.1), derived from the weak form of the abstract evolution equation (2.29):

$$\langle\dot u^{i,j}_n,\varphi_r\rangle = -a(u^{i,j}_n,\varphi_r)+\langle F^{i,j},\varphi_r\rangle, \quad r = 1,2,\ldots,v_n,$$

where the time dependent bilinear form $a(\cdot,\cdot)$ is given by (2.21) and $\{\varphi_r\}_{r=1}^{v_n}$ are the modified cubic spline basis elements described above. Setting $u^{i,j}_n(t) = \sum_{r=1}^{v_n}X^{i,j}_r(t)\varphi_r(x)$ in the last equation, we get:

$$\sum_{r=1}^{v_n}\dot X^{i,j}_r(t)\langle\varphi_r,\varphi_p\rangle = -\sum_{r=1}^{v_n}X^{i,j}_r(t)\,a(\varphi_r,\varphi_p)+\langle F^{i,j},\varphi_p\rangle, \quad p = 1,2,\ldots,v_n,$$

and the matrix representation for the operator $A^{i,j}_n(t_{k+1})$ can be determined as $A^{i,j}_n(t_{k+1}) = (G_n)^{-1}E^{i,j}_{n,k+1}(\sigma)$, where the matrix $G_n$ is given by $[G_n]_{r,p} = \langle\varphi_p,\varphi_r\rangle$ and the matrix $E^{i,j}_{n,k+1}$ by $[E^{i,j}_{n,k+1}]_{r,p} = -a(\varphi_p,\varphi_r)$, and where the time dependent form $a(\cdot,\cdot)$ is evaluated at $t = t_{k+1} = \frac{(k+1)T_l}{n}$. The matrix representations for the operators $C^{i,j}_{k+1}$ and vectors $F^{i,j}_{k+1}$ can be evaluated as follows:

$$C^{i,j}_{k+1} = \Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1} = \Big(I-\frac{T_i}{n}(G_n)^{-1}E^{i,j}_{n,k+1}\Big)^{-1} = \Big(G_n-\frac{T_i}{n}E^{i,j}_{n,k+1}\Big)^{-1}G_n,$$

$$F^{i,j}_{k+1} = \frac{T_i}{n}\Big(I-\frac{T_i}{n}A_n(t_{k+1})\Big)^{-1}P^n f^{i,j}_{k+1} = \Big(\frac{n}{T_i}G_n-E^{i,j}_{n,k+1}\Big)^{-1}\big[\langle f^{i,j}_{k+1},\varphi_1\rangle,\ldots,\langle f^{i,j}_{k+1},\varphi_{v_n}\rangle\big]^T.$$

In order to compute the partial derivatives with respect to $\sigma$ required to calculate $\nabla J$ we use the fact that $\frac{\partial A^{-1}}{\partial\sigma} = -A^{-1}\frac{\partial A}{\partial\sigma}A^{-1}$. Then

$$\frac{\partial C^{i,j}_{k+1}}{\partial\sigma} = \frac{T_i}{n}C^{i,j}_{k+1}\frac{\partial A_n(t_{k+1})}{\partial\sigma}C^{i,j}_{k+1} = \frac{T_i}{n}C^{i,j}_{k+1}(G_n)^{-1}\frac{\partial E^{i,j}_{n,k+1}}{\partial\sigma}C^{i,j}_{k+1} = \frac{T_i}{n}\Big(G_n-\frac{T_i}{n}E^{i,j}_{n,k+1}\Big)^{-1}\frac{\partial E^{i,j}_{n,k+1}}{\partial\sigma}\Big(G_n-\frac{T_i}{n}E^{i,j}_{n,k+1}\Big)^{-1}G_n.$$

The formula for $\frac{\partial F^{i,j}_{k+1}}{\partial\sigma}$ can be obtained in a similar fashion.

Chapter 5
Future Work

There are a number of problems that can be addressed to extend this research. Firstly, we have not attempted to prove the legitimacy of truncating the domain of the spatial variable in the Black-Scholes equation from $[0,\infty)$ to $[S_0,S_1]$.
Another improvement would be the extension of the theory presented here to make it applicable to broader classes of problems. One way of achieving this goal would be to reformulate the approximation condition (3.26) to make it easier to check.

As we mentioned before, another extension of this work would be proving convergence results in the stronger $H^1[0,1]$ norm. We have indicated how this can be done, but the question of the stability of the factor-method in that case is still not fully addressed.

Along with the fully implicit scheme described in this thesis, we also presented the Crank-Nicolson method, which could substantially improve the rate of convergence. The explicit scheme described in Krein's book [14] corresponds to approximating the exponential function $e^z$ by $(1+z)$, the fully implicit scheme described in this work approximates $e^z$ by $(1-z)^{-1}$, and the Crank-Nicolson method approximates $e^z$ by $\frac{1+z/2}{1-z/2}$. This suggests that in the future we may try to use other Padé rational functions to approximate the exponential and gain a substantial improvement in the rate of convergence.

A numerical implementation of the method described in this thesis would be necessary to demonstrate the viability of our approach. For that, one could first carry out the estimation procedure in the classical Black-Scholes setting (constant volatility). Observations could be generated from a known solution and our technique applied to recover the volatility. This test would show how our approach works in the simplest situation. Then one could consider a more complex example: the absolute diffusion model of Cox and Ross [8], in which the volatility is of the form $\sigma(S,t) = \frac{C}{S}$ ($C$ is a constant). In this case the PDE (2.1) can be solved explicitly, and the observations of call prices needed for the calibration can again be generated from the closed-form solution. Finally, one could use real market data (index options) for testing our approach.
The performance of our method could then be compared with the observed volatility (bid-ask spread).

References

[1] Y. Achdou and O. Pironneau, Volatility smile by multilevel least square, International Journal of Theoretical and Applied Finance 5 (2002) 619–643.
[2] H.T. Banks and K. Ito, A unified framework for approximation in inverse problems for distributed parameter systems, Control Theory and Advanced Technology 4 (1) (1988) 73–90.
[3] H.T. Banks and P.D. Lamm, Estimation of variable coefficients in parabolic distributed systems, IEEE Transactions on Automatic Control AC-30 (4) (1985) 386–398.
[4] H.T. Banks and I.G. Rosen, Numerical schemes for the estimation of functional parameters in distributed models for mixing mechanisms in lake and sea sediment cores, Inverse Problems 3 (1987) 1–23.
[5] F. Black and M. Scholes, The valuation of option contracts and a test of market efficiency, J. Finance 27 (1972) 399–417.
[6] F. Black and M. Scholes, The pricing of options and corporate liabilities, J. Pol. Econ. 81 (1973) 637–654.
[7] T. Coleman, Y. Li and A. Verma, Reconstructing the unknown volatility function, Journal of Computational Finance 2 (1999) 77–102.
[8] J.C. Cox and S.A. Ross, The valuation of options for alternative stochastic processes, Journal of Financial Economics 3 (1976) 145–166.
[9] S. Crepey, Calibration of the local volatility in a generalized Black-Scholes model using Tikhonov regularization, SIAM J. Math. Anal. 34 (5) (2003) 1183–1206.
[10] E. Derman and I. Kani, Riding on a smile, Risk 7 (2) (1994) 32–39.
[11] B. Dupire, Pricing with a smile, Risk 7 (1) (1994) 18–20.
[12] A. Friedman, Partial differential equations of parabolic type, Prentice-Hall, Englewood Cliffs, N.J., 1964.
[13] N. Jackson, E. Suli and S. Howison, Computation of deterministic volatility surfaces, Journal of Computational Finance 2 (2) (1998) 5–32.
[14] S.G. Krein, Linear differential equations in Banach space, Translations of Mathematical Monographs, Vol. 29, Amer. Math. Soc., Providence, Rhode Island, 1971.
[15] E. Kreyszig, Introductory functional analysis with applications, John Wiley & Sons, New York, 1978.
[16] R. Lagnado and S. Osher, A technique for calibrating derivative security pricing models: numerical solution of an inverse problem, The Journal of Computational Finance 1 (1) (1997) 13–25.
[17] J.L. Lions, Optimal control of systems governed by partial differential equations, Springer, New York, 1971.
[18] M. Musiela and M. Rutkowski, Martingale methods in financial modelling, Applications of Mathematics, Stochastic Modelling and Applied Probability, 36, New York, 1998.
[19] G. Rosen, C. Wang, G. Hajj, X. Pi, B. Wilson, An adjoint based approach to data assimilation for a distributed parameter model for the ionosphere, Proceedings of the 40th IEEE Conference on Decision and Control, Orlando, Florida, December 4-7, 2001, Volume 5, 4406–4408.
[20] I.G. Rosen, A discrete approximation framework for hereditary systems, Journal of Differential Equations 40 (3) (1981) 377–449.
[21] M. Rubinstein, Implied binomial trees, Journal of Finance 49 (1994) 771–818.
[22] M.H. Schultz, Spline analysis, Prentice-Hall, Englewood Cliffs, N.J., 1973.
[23] R.G. Showalter, Hilbert space methods for partial differential equations, Pitman, London, 1977.
[24] B.K. Swartz and R.S. Varga, Error bounds for spline and L-spline interpolation, Journal of Approximation Theory 6 (1) (1972) 6–49.
[25] H. Tanabe, Equations of evolution, Pitman, London, 1973.

Appendix: Results from Analysis

Arzela-Ascoli characterization of compact sets

Theorem (Arzela-Ascoli for $[a,b]$). For a closed set $F\subset C([a,b])$ to be compact it is necessary and sufficient that
1) the family $F$ is uniformly bounded, i.e. $\exists L\in\mathbb{R}\ \forall f\in F\ \forall t\in[a,b] : |f(t)|\le L$;
2) the family $F$ is equicontinuous: $\forall\varepsilon > 0\ \exists\delta > 0\ \forall f\in F\ \forall\{t_1,t_2\}\subset[a,b],\ d(t_1,t_2) < \delta : |f(t_1)-f(t_2)| < \varepsilon$.
Theorem (Arzela-Ascoli, general version). Let $X$ be a compact metric space and $Y$ a complete metric space. Then a subset $F$ of $C(X,Y)$ is compact if and only if it is equicontinuous, pointwise relatively compact (i.e. $\forall x\in X$, the set $\{f(x) : f\in F\}$ is relatively compact in $Y$) and closed.

Gronwall's inequality

Theorem (Differential form). Let $\eta(\cdot)$ be a nonnegative, absolutely continuous function on $[0,T]$ which satisfies for a.e. $t$ the differential inequality

$$\dot\eta(t)\le\phi(t)\eta(t)+\psi(t),$$

where $\phi(t)$ and $\psi(t)$ are nonnegative, summable functions on $[0,T]$. Then

$$\eta(t)\le e^{\int_0^t\phi(s)\,ds}\Big[\eta(0)+\int_0^t\psi(s)\,ds\Big]$$

for all $0\le t\le T$.

Theorem (Integral form). If a continuous function $f$ on $[a,b]$ satisfies the inequality

$$f(t)\le g(t)+\int_a^t h(s)f(s)\,ds, \quad t\in[a,b],$$

where $g$ is continuous, $h\in L^1[a,b]$, and $h(t)\ge 0$ a.e., then

$$f(t)\le g(t)+\int_a^t g(s)h(s)\exp\Big(\int_s^t h(\tau)\,d\tau\Big)ds, \quad \forall t\in[a,b].$$
Abstract
We consider a generalized Black-Scholes model which is used for pricing derivative securities. A fully discrete approximation framework based on the factor-method developed by Krein is presented for the solution of the associated inverse problem. The scheme allows one to estimate a parameter, local volatility, which is extremely important in the theory and practice of financial markets. Volatility is a function of the spatial and temporal variables which appear in the Black-Scholes partial differential equation. Theoretical convergence results are established. A numerical scheme that exploits the full discretization together with the adjoint method is presented and discussed. The adjoint method allows the gradient of the cost functional to be computed precisely and in a computationally efficient manner.
Asset Metadata
Creator: Lytvak, Oleksandr (author)
Core Title: A fully discrete approach for estimating local volatility in a generalized Black-Scholes setting
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Applied Mathematics
Publication Date: 07/31/2008
Defense Date: 06/18/2008
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: adjoint method, Black-Scholes model, factor-method for evolution equations, inverse problem, local volatility, OAI-PMH Harvest
Language: English
Advisor: Rosen, Gary (committee chair), Lototsky, Sergey Vladimir (committee member), Westerfield, Mark (committee member)
Creator Email: lytvak@gmail.com, lytvak@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m1489
Unique identifier: UC1275360
Identifier: etd-Lytvak-2248 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-89646 (legacy record id), usctheses-m1489 (legacy record id)
Legacy Identifier: etd-Lytvak-2248.pdf
Dmrecord: 89646
Document Type: Dissertation
Rights: Lytvak, Oleksandr
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu