MIXING CONDITIONS AND RETURN TIMES ON MARKOV TOWERS

by

Yiannis Psiloyenis

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MATHEMATICS)

August 2008

Copyright 2008 Yiannis Psiloyenis

Acknowledgments

This thesis summarizes the main results of the research I carried out during my Ph.D. at the University of Southern California; it was written under the supervision and mentorship of Dr. Nicolai Haydn. I want to take this opportunity to express my sincere gratitude and heartfelt appreciation to Dr. Nicolai Haydn for his generous advice, enthusiastic guidance, encouragement and support at all levels throughout the entire research and writing process.

I would also like to express my deepest thanks to Professors Peter Baxendale, Larry Goldstein, Sergey Lototsky and Edmond Jonckheere for their service on my Guidance and Ph.D. committee.

I must also acknowledge that this research would not have been possible without the financial assistance of the University of Southern California and, in particular, the Department of Mathematics, which opened doors to science and research for me and provided me with an entire world of opportunities.

Finally, I would like to thank my family for their life-long love and support. I especially owe much to my parents, who have always believed in me and fully supported my decisions, and whose unconditional love and sacrifice have brought me this far.

Table of Contents

Acknowledgments
List of Figures
Abstract

Chapter 1  Mixing Conditions on Markov Towers
  1.1 Introduction
    1.1.1 Measure Theory
    1.1.2 Mixing and Correlations
  1.2 Decay of Correlations on Markov Towers
    1.2.1 Construction of the Markov Tower
    1.2.2 Regularity conditions on the Markov Tower
    1.2.3 The Perron-Frobenius operator and Function Spaces
    1.2.4 Existence of an a.c. invariant measure and Decay of Correlations
    1.2.5 Examples
  1.3 Mixing Properties derived on the Markov Tower
    1.3.1 Main Results
    1.3.2 Proof of the Theorem

Chapter 2  The Stein Method
  2.1 Introduction
    2.1.1 The Framework
  2.2 Poisson Approximation

Chapter 3  Multiple Return Times
  3.1 Definitions and Notation
  3.2 Main Results
    3.2.1 Return Times
    3.2.2 Hitting Times
    3.2.3 Return Times on Markov Towers
    3.2.4 An Example / Application
  3.3 Method of Proof
  3.4 Preliminary Calculations
    3.4.1 Bounds for the Stein Solution
    3.4.2 The Framework
    3.4.3 The Independent case
  3.5 α-Mixing Case
    3.5.1 Estimates on the measure of a cylinder
    3.5.2 Error term due to dependence
  3.6 Proof of the Main Theorem
    3.6.1 Return Times
    3.6.2 Hitting Times
  3.7 Return Times on Markov Towers

Bibliography

List of Figures

Figure 1.1  The Markov Tower structure
Figure 1.2  An Example of a Markov Tower

Abstract

This dissertation discusses mixing properties derived on non-uniformly hyperbolic dynamical systems which admit a Markov-Tower structure. For systems with exponential and polynomial decay of the tails of the return map we derive α-mixing conditions at exponential and polynomial rates, respectively. The motivation is to use these mixing conditions, on eligible systems, to approximate the law of the hitting and return times to a set of small measure. In the first chapter we set up the problem and derive the mixing conditions. In the second chapter we give a brief discussion of the Stein method, and in Chapter 3 we study the hitting and multiple-return times for α-mixing dynamical systems in general. Under the given rates of mixing we use the Stein method to show that the return time of order k to a cylinder A can be approximated by a simple distribution, for which sharp error bounds, independent of the order k, are obtained as well. Additionally, we conclude that the distribution of hitting times, suitably rescaled, can be approximated by an exponential distribution with mean 1. As an application we show that the findings for the return and hitting times can be applied, through the construction of a Markov Tower, to the Gaspard-Wang map, which is broadly used in physics.

Chapter 1
Mixing Conditions on Markov Towers

1.1 Introduction

The aim of this chapter is to study the decay of correlations for a specific class of dynamical systems, namely those that admit a Markov Tower structure, and to derive for these systems mixing properties with specific rates of mixing. The motivation is to use the derived mixing properties to study statistical properties of eligible dynamical systems, such as hitting and return times, which is what we do in Chapter 3.

It is known that uniformly hyperbolic systems have exponential decay of correlations. In what follows, following the spirit of Lai-Sang Young's paper [Y2], we see that for non-uniformly hyperbolic dynamical systems which retain some degree of hyperbolicity, if we choose a suitable reference set, a neighborhood in the phase space, then after a number of iterations of the map it is possible to reveal a Markov structure lying underneath. For a suitably chosen partition of the reference set, each element of the partition makes a "full", or Markov, return to the reference set.
Using the scheme proposed in [Y2], on the resulting system — the Markov Tower that is constructed — one can successfully study and prove properties such as the existence of Sinai-Ruelle-Bowen measures and decay of correlations. The findings can then be passed back to the original dynamical system.

Leaving more precise definitions and results for later, we mention here that the main factor determining the speed of the decay of correlations, and therefore the rates of mixing we investigate in this thesis, turns out to be the size of the tail of the return-time function. Exponentially decaying tails of the return-time function imply exponential decay of correlations, and polynomially decaying tails imply polynomial decay of correlations.

Below we give a brief presentation of the set-up and provide all the definitions we will be using throughout this chapter.

1.1.1 Measure Theory

Definition 1. (σ-Algebra) For a set $\Omega$, a σ-algebra $\mathcal{F}$ is a collection of subsets of $\Omega$ such that the following properties hold:

i) $\Omega$ belongs to $\mathcal{F}$;

ii) for any set $A \in \mathcal{F}$ its complement $\Omega \setminus A$ belongs to $\mathcal{F}$ as well;

iii) $\mathcal{F}$ is closed under countable unions, i.e. if $\{A_n\}_n$ is a sequence of elements of $\mathcal{F}$ then $\bigcup_n A_n$ belongs to $\mathcal{F}$ as well.

Having defined the notion of a σ-algebra we can now define the notion of a measure.

Definition 2. (Measure) A measure $\mu : \mathcal{F} \to \mathbb{R}^+$ is a non-negative real-valued function defined on a σ-algebra $\mathcal{F}$ which satisfies the following properties:

i) $\mu(\emptyset) = 0$;

ii) $\mu$ is countably additive, i.e. if $\{A_n\}_n$ is a countable disjoint collection of elements of $\mathcal{F}$, then

  $\mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n)$.   (1.1)

If, in addition, $\mu(\Omega) = 1$ then $\mu$ is said to be a probability measure.

Definition 3. (Invariant Measure) A measure $\mu$ defined on the σ-algebra $\mathcal{F}(X)$ is said to be invariant with respect to a map $F : X \to X$ if

  $\mu(F^{-1}(A)) = \mu(A)$ for all sets $A \in \mathcal{F}$.   (1.2)

For a map $F$ on a given space $X$ equipped with a reference measure $m$ it is possible that there are many invariant measures. One is usually interested, however, in an invariant measure that is most compatible with the original reference measure $m$ when studying key properties of the system. These are usually the absolutely continuous invariant measures (a.c.i.m.) with respect to $m$, whose existence is always highly sought after but not always guaranteed. The Lebesgue measure is an example of an invariant measure for the Bernoulli map $2x \bmod 1$ on $(0,1]$. The Lebesgue measure is also an invariant measure for the map

  $f(x) = \begin{cases} 2x, & \text{if } 0 \le x \le \frac{1}{2} \\ 2(1-x), & \text{if } \frac{1}{2} < x \le 1 \end{cases}$   (1.3)

defined on $[0,1]$.

Definition 4. (Absolute Continuity of Measures) A measure $\mu$ is said to be absolutely continuous with respect to another measure $\nu$, denoted by $\mu \ll \nu$, if $\mu(A) = 0$ for every set $A$ with $\nu(A) = 0$.

If $\mu$ is absolutely continuous with respect to $\nu$ and $\nu$ is σ-finite then, according to the Radon-Nikodym theorem, $\mu$ has a density, or Radon-Nikodym derivative, with respect to $\nu$; that is, there exists a $\nu$-measurable positive real-valued function $\rho\ (= \frac{d\mu}{d\nu})$ such that

  $\mu(A) = \int_A \rho\, d\nu$, for all $\nu$-measurable sets $A$.   (1.4)

Definition 5. (Ergodicity) A measure $\mu$ is said to be ergodic with respect to a map $F$ if

  $F^{-1}(A) = A \Rightarrow \mu(A) = 0$ or $\mu(A) = 1$.   (1.5)

Ergodicity tells us that $F$ moves almost all sets all over the space. In other words, the space $X$ can be viewed essentially as one single element, meaning that the only sets that do not interact with the rest of the space are either negligible, i.e. have measure 0, or have the size of the entire space.
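Before continuing with the example below, here is a minimal numerical sketch of Definitions 3 and 5 for the Bernoulli map $2x \bmod 1$. It assumes Python with NumPy; the interval $A$, the observable, the seeds and all parameters are illustrative choices, not taken from the text. Note that iterating the doubling map naively in floating point collapses to 0 (each step discards one binary digit), so the orbit is generated exactly on the binary digits of a random seed integer instead.

```python
import math
import random
import numpy as np

# Invariance (Definition 3): for A = (a, b], the Lebesgue measures of A and
# of its preimage T^{-1}(A) under T(x) = 2x mod 1 agree (Monte Carlo check).
rng = np.random.default_rng(0)
T = lambda x: (2.0 * x) % 1.0
x = rng.random(10**6)
a, b = 0.2, 0.5
print(np.mean((x > a) & (x <= b)))        # mu(A)         ~ 0.3
print(np.mean((T(x) > a) & (T(x) <= b)))  # mu(T^{-1}(A)) ~ 0.3

# Ergodicity (Definition 5) in action: Birkhoff time averages along a
# typical orbit match the space average.  T acts on binary digits as a
# shift, so we iterate exactly on a random B-bit integer.
random.seed(0)
B, n_iter = 60_000, 50_000
N = random.getrandbits(B) | 1             # random binary digit string
mask = (1 << B) - 1
total = 0.0
for _ in range(n_iter):
    N = (N << 1) & mask                   # one exact application of T
    xk = (N >> (B - 53)) / 2.0**53        # top 53 digits as a float in [0,1)
    total += math.cos(2 * math.pi * xk)
print(total / n_iter)                     # ~ 0 = integral of cos(2 pi x)
```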
An example of an ergodic measure is the Lebesgue measure with respect to the map $x \mapsto 2x \bmod 1$ on the unit interval.

1.1.2 Mixing and Correlations

A central concept in the field of dynamical systems and measure theory is the notion of mixing of a measure and of correlations of functions. The precise definitions and the importance of these notions become clearer throughout the text.

Definition 6. (Cylinder sets) Given a measurable partition $\eta$ of the space $\Omega$, finite or countably infinite with $H(\eta) < \infty$, and for each $n \in \mathbb{N}$, the elements of

  $\mathcal{C}(n) = \bigvee_{i=0}^{n-1} F^{-i}\eta = \left\{ A_0 \cap F^{-1}A_1 \cap \cdots \cap F^{-(n-1)}A_{n-1} \;\middle|\; A_0, A_1, \ldots, A_{n-1} \in \eta \right\}$   (1.6)

are called $n$-cylinders.

In what follows, the σ-algebra $\mathcal{F}$ in a typical dynamical system $(\Omega, \mathcal{F}, \nu, F)$ will be the σ-algebra generated by the cylinders of all orders $n \ge 0$.

Example: Consider a subshift of finite type with a finite alphabet $V = \{v_1, v_2, \ldots, v_m\}$, an $m \times m$ adjacency matrix $A$ with entries in $\{0,1\}$, and the set of all admissible one-sided sequences $\Sigma_A^+ = \{(x_0, x_1, \ldots) : x_i \in V \text{ and } A_{x_i x_{i+1}} = 1\ \forall i \in \mathbb{N}^*\}$. If $T$ is the shift operator from $\Sigma_A^+$ onto $\Sigma_A^+$, acting on a sequence by shifting all its symbols to the left, i.e. $(T(x))_i = x_{i+1}$, then a typical $n$-cylinder is

  $C([a_0, a_1, \ldots, a_{n-1}]) = \{(x_0, x_1, \ldots) \in \Sigma_A^+ : x_0 = a_0,\ x_1 = a_1,\ \ldots,\ x_{n-1} = a_{n-1}\}$.   (1.7)

Given a partition $\eta$ of a space $\Omega$, the set of $n$-cylinders $\mathcal{C}(n)$, for each $n \in \mathbb{N}$, forms a new partition of the space, a refinement of the original partition.

Definition 7. (Mixing) In a measure-preserving dynamical system $(\Omega, \mathcal{F}, \mu, F)$, $\mu$ is said to be mixing with respect to the map $F$ if

  $|\mu(A \cap F^{-n}(B)) - \mu(A)\mu(B)| \to 0$   (1.8)

as $n \to \infty$, for all measurable sets $A, B \in \mathcal{F}$.

A mixing dynamical system is a system in which the preimages of any set $B$ under the map $F$, namely $F^{-n}(B)$, tend to spread all over the entire space as $n \to \infty$. To make this clearer we may rewrite (1.8) as

  $\left| \frac{\mu(A \cap F^{-n}(B))}{\mu(B)} - \mu(A) \right| \to 0$ as $n \to \infty$.   (1.9)

The above expression suggests that the preimages of $B$ will not only spread over the entire space $X$ but will, in fact, spread evenly over it.

Once we have defined mixing, it is only natural to become interested in how fast mixing occurs, that is, in the speed at which the quantity in (1.9) converges to zero. In that context, we give the following definitions pertaining to the speed of convergence. Let $\{\alpha(l)\}_{l\in\mathbb{N}}$ and $\{\phi(l)\}_{l\in\mathbb{N}}$ be two decreasing sequences of positive real numbers that converge to 0. Then we define:

i) α-mixing. $(\Omega, \mathcal{F}, \mu, F)$ is said to be α-mixing if

  $|\mu(A \cap F^{-n-l}B) - \mu(A)\mu(B)| \le \alpha(l)\,\mu(A)$, $\forall A \in \mathcal{C}(n)$ and $B \in \mathcal{F}$.   (1.10)

ii) φ-mixing. $(\Omega, \mathcal{F}, \mu, F)$ is said to be φ-mixing if

  $|\mu(A \cap F^{-n-l}B) - \mu(A)\mu(B)| \le \phi(l)\,\mu(A)\mu(B)$, $\forall A \in \mathcal{C}(n)$ and $B \in \mathcal{F}$.   (1.11)

Remark: Note that φ-mixing implies α-mixing and both, in turn, imply ergodicity. For more on properties and examples of mixing dynamical systems we refer to [D2].

Definition 8. (The Correlation function) The correlation function of two real-valued, $\mu$-measurable functions $\phi, \psi : \Omega \to \mathbb{R}$ with respect to a map $F$ is given by

  $C_n(\phi, \psi) = \int (\phi \circ F^n)\,\psi\, d\mu - \int \phi\, d\mu \int \psi\, d\mu$,   (1.12)

which is essentially the covariance of $\phi$ and $\psi$ when $\phi$ is being iterated under the map $F$. As we will see later on, the main interest is in the asymptotic behavior of the correlation function as $n \to \infty$ and, for a specific class of observables in which $\phi$ and $\psi$ lie, in the rates at which the correlations decay to zero; a small numerical illustration follows below.
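As a concrete illustration, the following sketch (Python with NumPy; all choices illustrative) estimates $C_n(\phi,\phi)$ for the doubling map $F(x) = 2x \bmod 1$ under Lebesgue measure with $\phi(x) = x$; a short direct integration shows that in this case $C_n = 2^{-n}/12$, so the computed values should halve with each $n$ — an exponential decay of correlations. A quadrature grid replaces a long floating-point orbit, which would degrade.

```python
import numpy as np

# Correlations (1.12) for F(x) = 2x mod 1, mu = Lebesgue, phi = psi = x:
#   C_n = E[F^n(x) * x] - E[x]^2,  with exact value 2^{-n} / 12.
# Midpoint quadrature; F^n(x) = 2^n x mod 1 is computed in a single step.
M = 1 << 22
x = (np.arange(M) + 0.5) / M              # midpoint grid on (0, 1)
for n in range(1, 13):
    Fn = np.ldexp(x, n) % 1.0             # 2^n * x mod 1, exact in floats
    Cn = np.mean(Fn * x) - 0.25
    print(n, Cn, 2.0**-n / 12)            # the two columns agree
```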
Another concept we need to familiarize ourselves with before we proceed further is the notion of the Jacobian. Given a differentiable function $F : \mathbb{R}^n \to \mathbb{R}^n$ with

  $F(x_1, x_2, \ldots, x_n) = (f_1(x_1, x_2, \ldots, x_n), \ldots, f_n(x_1, x_2, \ldots, x_n))$,

the Jacobi matrix of the function $F$ is defined to be the matrix given by

  $A_{ij} = \frac{\partial f_i}{\partial x_j}$.

We can then calculate its determinant, known as the Jacobian, whose absolute value at a point $p$ gives the rate at which the function $F$ expands or contracts volumes near $p$ with respect to the Lebesgue measure on $\mathbb{R}^n$.

Inspired by that, we make an analogous definition for the Jacobian of a map $F : \Omega \to \Omega$ with respect to a measure $\mu$. In this case the Jacobian, which we formally denote by $JF_\mu$ (the subscript $\mu$ will be omitted when there is no risk of confusion), gives the rate at which the map $F$ expands or shrinks volumes when volumes are measured using the measure $\mu$.

It then becomes clear, by means of a change of variables, that if $F$ is a bijective map and $\mu$ has a density function $\rho$, then

  $\frac{\rho}{JF} = \rho \circ F \iff \rho(x) = \frac{\rho(F^{-1}(x))}{JF(F^{-1}(x))}$ $\forall x \in \Omega$,   (1.13)

or, in a similar way, if $F$ is non-bijective, that

  $\rho(x) = \sum_{y \in F^{-1}(x)} \frac{\rho(y)}{JF(y)}$ $\forall x \in \Omega$.   (1.14)

The representation (1.14) above will be the key equation for defining the Perron-Frobenius operator, whose properties will play a crucial role in obtaining many of the results discussed in this thesis.

1.2 Decay of Correlations on Markov Towers

First we start with the definition of a Markov map.

Definition 9. (Markov Map) A Markov map $F : \Delta \to \Delta$ is a map for which we can find a countable partition $\eta = \{\eta_i\}_{i\in\mathbb{N}}$ of $\Delta$ such that $F$ maps each element $\eta_i$ of the partition bijectively onto $\Delta$, i.e.

  $F|_{\eta_i} : \eta_i \to \Delta$ is a bijection $\forall i \in \mathbb{N}$.

Example: An example of a Markov map is the Bernoulli map $F(x) = 2x \bmod 1$ from $[0,1]$ to $[0,1]$. The partition of $[0,1)$ into the two parts $[0,1/2)$ and $[1/2,1)$ obviously satisfies the Markov property under the action of $F$. Similarly, the partition of the unit interval into the $m$ intervals $[i/m, (i+1)/m)$ is a Markov partition for the uniformly piecewise expanding endomorphism $E_m$.

The Markov property is in general a property of uniformly expanding endomorphisms and of Anosov diffeomorphisms of compact Riemannian manifolds, such as Arnold's cat map. Specifically, in 1968 Sinai [Si1] showed that one can construct Markov partitions for all Anosov diffeomorphisms, and later, in 1970, Bowen [B] constructed Markov partitions for all Axiom A diffeomorphisms. For Anosov and Axiom A diffeomorphisms, via Markov partitions, exponential decay of correlations for Hölder continuous functions and the existence of Sinai-Ruelle-Bowen measures are proved in [Si2] and in ([Si2], [R]), respectively.

Besides providing a means of proving key statistical properties such as decay of correlations, the existence of a Markov partition is of great importance in the study of dynamical systems. In its presence, one can use symbolic dynamics to model a topological or smooth dynamical system by a discrete space consisting of infinite sequences of abstract symbols, each of which corresponds to a state of the system, with the evolution given by the shift operator. This identification reduces the study of the original system to the study of a purely symbolic system and often allows one to draw striking conclusions about the original system.

Young's approach, whose motivation is to study non-uniformly hyperbolic systems, does not make use of Markov partitions.
Her approach, which covers a broader class of maps than the class of maps with the Markov property, is based on a tower construction, the result of a process that captures the renewal properties (return times to a reference set) of a dynamical system. In particular, statistical properties are extracted from the distribution of the return maps. The concept of Markov towers is illustrated in the following section.

1.2.1 Construction of the Markov Tower

Given a space $X$ and a map $f : X \to X$, assume there exists a neighborhood $\Delta_0$ for which we can find a partition $\eta_0 = \{\Delta_{0,i}\}_{i\in\mathbb{N}}$ and a function $R : \Delta_0 \to \mathbb{Z}^+$, constant on each element of $\eta_0$, such that $f^{R_i}$ maps $\Delta_{0,i}$ bijectively onto $\Delta_0$. The function $R$, for which $R(x) = R_i$ $\forall x \in \Delta_{0,i}$, we call the return time. The map $f$, the set $\Delta_0$ and the return time $R$ will be the key ingredients for constructing the Markov tower with base $\Delta_0$.

Definition 10. (Markov Tower) Given the set $\Delta_0$ and the return time $R$ we define the tower $\Delta$ to be

  $\Delta := \{(x, n) : x \in \Delta_0 \text{ and } n \in \mathbb{Z}^+ \text{ with } n < R(x)\}$.   (1.15)

Then a map $F$ on the tower, $F : \Delta \to \Delta$, is defined by

  $F(x, l) = \begin{cases} (x, l+1), & \text{if } l+1 < R(x) \\ (f^{R(x)}(x), 0), & \text{if } l+1 = R(x). \end{cases}$   (1.16)

Essentially what we do here is pass from the original dynamical system $f : X \to X$ to another system $F : \Delta \to \Delta$, which we think of as the abstract model of $f$. The reason for doing so is that the new system $(F, \Delta)$, which has the original system as a factor, is a much simpler object to study than the original one. The induced system's statistical properties are easier to derive and, as we see further below, the statistical properties of $(F, \Delta)$ carry over easily to $(f, X)$.

The tower $\Delta$, as the name suggests, is a tower or skyscraper with base $\Delta_0$ and infinitely many horizontal levels lying on top of the base, one corresponding to every positive integer. The map $F$ takes each point of the base $\Delta_0$ one level up upon each iteration until it reaches the highest level of the tower, which is determined by the return time, and then finally returns it to the bottom $\Delta_0$. Figure 1.1 helps clarify the mechanics on the tower $\Delta$: the point $z$, which lies on the base floor of the tower, is moved upwards one step at each iteration of $F$ and, when it reaches the top after four iterations, it is mapped back to level zero where it started. In fact, the entire dotted interval, an entire element of the partition, makes a full return, meaning that it is mapped bijectively onto $\Delta_0$. This way, given the countable partition $\{\Delta_{0,i}\}$ of the base $\Delta_0$, a partition of the entire tower $\Delta$ is naturally induced; we refer to this induced partition as $\eta = \{\Delta_{l,i}\}_{i\in\mathbb{N},\, l<R_i}$. Under the action of $F$, each element $\Delta_{0,i}$ is moved upwards to $\Delta_{1,i}, \Delta_{2,i}, \Delta_{3,i}, \ldots$ until $\Delta_{R_i-1,i}$, and on the next iteration it is mapped bijectively down to $\Delta_0$. The map $F^R$ from $\Delta_0$ to $\Delta_0$, which satisfies

  $F^{R_i}|_{\Delta_{0,i}} : \Delta_{0,i} \to \Delta_0$ is a bijection $\forall i \in \mathbb{N}$,   (1.17)

is called the return map. (A small computational sketch of the tower map appears below.)

For what follows we assume that there exists a reference measure $m$ on $\Delta$, not necessarily finite or $F$-invariant. The map $F$, however, and its inverses are assumed to be measurable and non-singular with respect to $m$. Also we assume that the partition $\eta = \{\Delta_{l,i}\}$ is separating, i.e. it eventually separates all points, in the sense that for all $x$ and $y$ there exists a finite $n_0$ such that $F^{n_0}(x)$ and $F^{n_0}(y)$ lie in different elements $\Delta_{l,i}$ of the partition.
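The following is a minimal Python sketch of the tower map $F$ of (1.16). The base dynamics f and the return-time function R below are placeholders chosen only so the code runs; a genuine tower additionally requires that R be constant on each partition element and that $f^{R_i}$ map $\Delta_{0,i}$ bijectively onto $\Delta_0$, which is assumed rather than checked here.

```python
# Sketch of the tower map F of (1.16): points are pairs (x, l) with x in the
# base Delta_0 = (0, 1] and level l < R(x).
def make_tower(f, R):
    def F(point):
        x, l = point
        if l + 1 < R(x):
            return (x, l + 1)        # climb one level of the tower
        y = x                        # top level reached: return to the base
        for _ in range(R(x)):        # ... via the return map f^{R(x)}
            y = f(y)
        return (y, 0)
    return F

f = lambda x: (2.0 * x) % 1.0        # placeholder base dynamics
R = lambda x: 1 + int(1.0 / x) // 2  # placeholder return-time function
F = make_tower(f, R)

p = (0.23, 0)
for _ in range(6):                   # watch the point climb and drop back
    print(p)
    p = F(p)
```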
The separating property allows us to introduce a metric that measures distances between points on the tower $\Delta$, as in the following definition.

Definition 11. (Separation Time) For points $x, y$ in $\Delta_0$, the separation time between $x$ and $y$, denoted by $s_0(x,y)$, is defined by

  $s_0(x, y) = \inf\{n : (F^R)^n(x) \text{ and } (F^R)^n(y) \text{ lie in distinct } \Delta_{0,j}\text{'s}\}$,   (1.18)

i.e. $s_0(x,y)$ is the first time that $x$ and $y$ separate upon consecutive iterations of the map $F^R$. We can then extend this definition to the separation time function $s(x,y)$, defined for any pair of points $x$ and $y$ on the entire tower, as

  $s(x, y) = \inf\{n : (F^R)^n(F^{-l}(x)) \text{ and } (F^R)^n(F^{-l}(y)) \text{ lie in distinct } \Delta_{0,j}\text{'s}\}$   (1.19)

if $x, y$ lie in the same $\Delta_{l,j}$, and $s(x,y) = 1$ if $x$ and $y$ belong to different $\Delta_{i,j}$'s.

Figure 1.1: The Markov Tower structure. Once the point $z$ reaches the top, after 4 iterations of $F$, the point $z$, as well as the whole dotted segment, is mapped back to $\Delta_0$ via $f^R$; in fact, this mapping is a bijection onto $\Delta_0$. Note that for the point $z$, as well as for the entire partition element that $z$ lies in, the return-time function $R = 5$.

The separation time $s(x,y)$ enables us to define a metric on $\Delta$ that measures the distance between $x$ and $y$.

Definition 12. (Metric on the Tower) For $x, y \in \Delta$ we define $d(x,y) = \beta^{s(x,y)}$, with $\beta$ arbitrarily chosen in $(0,1)$, to be the metric that captures the symbolic distance of $x$ and $y$ on the tower $\Delta$.

One can verify that this is indeed a metric. In particular, what makes it a metric is the fact that the partition is separating: no pair of distinct points stays together forever, and therefore $d(x,y) = 0 \iff x = y$.

1.2.2 Regularity conditions on the Markov Tower

We assume that $F$ carries $m|_{\Delta_{l,i}}$ to $m|_{\Delta_{l+1,i}}$ without distortion if $l < R_i - 1$; hence, if we denote the Jacobian of $F$ by $JF$, then $JF = 1$ on $\Delta \setminus F^{-1}\Delta_0$. For the top of the tower, $F^{-1}\Delta_0$, we impose a regularity condition on $JF|_{F^{-1}\Delta_0}$, or equivalently on $JF^R|_{\Delta_0}$, as follows. The return map restricted to each element of the partition, $F^R|_{\Delta_{0,i}} : \Delta_{0,i} \to \Delta_0$, and its local inverses are measurable and non-singular with respect to the reference measure $m$, so that measures with densities with respect to $m$ are transformed under $F$ to measures with densities with respect to $m$. Additionally, as a regularity condition on the Jacobian at the top of the tower, we ask that $\log JF^R|_{\Delta_0}$ be uniformly Lipschitz in the following sense: there exist $C > 0$ and $0 < \beta < 1$ fixed such that

  $\left| \frac{JF^R(x)}{JF^R(y)} - 1 \right| \le C\, d(x,y) = C\beta^{s(x,y)}$ for all $i$ and for all $x, y \in \Delta_{0,i}$.   (1.20)

As we mentioned earlier, $JF(x)$ is a measure of expansion or contraction at the point $x$ under $F^R$, so how far the ratio of the Jacobians at $x$ and $y$ is from 1 essentially captures the distortion at the points $x$ and $y$. Hence condition (1.20) is indeed a regularity condition: it ensures that the distortion at "neighboring" points is controlled by their distance.

1.2.3 The Perron-Frobenius operator and Function Spaces

On a dynamical system $F : X \to X$, the Perron-Frobenius operator associated with $g$, written $\mathcal{L}_g$ or simply $\mathcal{L}$ when there is no ambiguity, acting on the space of functions $\{\phi : X \to \mathbb{R}\}$, is defined as

  $\mathcal{L}(\phi)(x) = \sum_{y \in F^{-1}x} g(y)\,\phi(y)$,   (1.21)

where $g : X \to \mathbb{R}^+$ is a potential (valuation) function. The Perron-Frobenius operator, also known as the Ruelle operator or transfer operator, encodes how densities of measures evolve under consecutive iterations of $F$.
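As a quick numerical illustration (a sketch in Python/NumPy, with the grid discretization and the starting density as arbitrary choices), here is the operator (1.21) for the doubling map with potential $g = 1/|F'| = 1/2$: iterating $\mathcal{L}$ drives an arbitrary density toward the invariant density, which for this map is the constant 1.

```python
import numpy as np

# Transfer operator (1.21) for F(x) = 2x mod 1 and g = 1/|F'| = 1/2:
#   (L phi)(x) = ( phi(x/2) + phi((x+1)/2) ) / 2,
# discretized on a midpoint grid with linear interpolation.
M = 1024
x = (np.arange(M) + 0.5) / M

def L(phi_vals):
    phi = lambda t: np.interp(t, x, phi_vals)
    return 0.5 * (phi(x / 2.0) + phi((x + 1.0) / 2.0))

rho = 1.0 + 0.8 * np.cos(2 * np.pi * x)   # an arbitrary starting density
for _ in range(8):
    rho = L(rho)
print(rho.min(), rho.max())   # both ~ 1: L^n rho -> the invariant density
```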
For the $n$-th power of the Perron-Frobenius operator we have

  $\mathcal{L}^n(\phi)(x) = \mathcal{L}(\mathcal{L}^{n-1}(\phi))(x) = \sum_{y \in F^{-n}x} g_n(y)\,\phi(y)$   (1.22)

with

  $g_n(x) = g(x)\, g(F(x))\, g(F^2(x)) \cdots g(F^{n-1}(x))$.   (1.23)

A special case in which we are interested is when the function $g$ in (1.22) is the Jacobian reciprocal $\frac{1}{|JF|}$ with respect to the reference measure $m$ on $X$. The Perron-Frobenius operator then becomes

  $\mathcal{L}(\phi)(x) = \sum_{y \in F^{-1}x} \frac{1}{|JF(y)|}\,\phi(y)$.   (1.24)

In the rest of this thesis, without loss of generality, the Jacobian will be assumed positive, so the absolute values in (1.24) will be omitted. For this choice of $g$, a fixed point of the Perron-Frobenius operator is an invariant density under the dynamics of the system.

Let $\mathcal{H}^*$ denote the family of Lipschitz continuous functions on $\Delta$, i.e.

  $\mathcal{H}^* = \{\varphi : \Delta \to \mathbb{R} \mid \exists C > 0 \text{ s.t. } |\varphi(x) - \varphi(y)| \le C\, d(x,y)\ \forall x, y \in \Delta\}$,   (1.25)

where $d(x,y)$ is the metric previously defined on the tower, $d(x,y) = \beta^{s(x,y)}$. For functions $\phi \in \mathcal{H}^*$ we define the semi-norm $|\cdot|_h$ by

  $|\phi|_h = \inf\{C : |\phi(x) - \phi(y)| \le C\, d(x,y)\ \forall x, y \in \Delta\}$.   (1.26)

We then define a full norm on the space of real-valued functions on $\Delta$ as

  $\|\phi\|_h = |\phi|_h + \|\phi\|_{L^1}$,   (1.27)

where the $\|\cdot\|_{L^1}$ norm is defined as $\|\phi\|_{L^1} = \int_\Delta |\phi|\, dm$.

All observables satisfying $\|\phi\|_h < \infty$ make up the space of Hölder continuous observables, i.e.

  $\mathcal{H} = \{\phi : \Delta \to \mathbb{R} \text{ such that } \|\phi\|_h < \infty\}$.   (1.28)

By $L^1(m)$ we denote the space of all $m$-integrable observables on $\Delta$, defined as

  $L^1(m) = \left\{\phi : \Delta \to \mathbb{R} \text{ such that } \int_\Delta |\phi|\, dm < \infty\right\}$.   (1.29)

1.2.4 Existence of an a.c. invariant measure and Decay of Correlations

As we mentioned earlier, what determines the rate of decay of correlations of the dynamical system $F : \Delta \to \Delta$ is the decay rate of the tail of the return times. In her original paper [Y2], Young proves exponential decay of correlations for exponentially decaying tails of the return times and later, in [Y3], she proves an analogous result for polynomial decay. The findings for the abstract tower model $F : \Delta \to \Delta$ can then be passed back to the original system. The following proposition summarizes these results.

Proposition 1. ([Y2], [Y3]) If $R$ is the return time and $m$ is the reference measure on $\Delta$, then the following properties for the tower $F : \Delta \to \Delta$ and the original system $\Phi : X \to X$, from which the tower is derived, hold:

1. If $\int_\Delta R\, dm < \infty$ and the greatest common divisor of $\{R_i\}_i$ is 1, then

 i) $F$ admits an invariant probability measure $\nu$ which is equivalent to $m$;

 ii) $\Phi$ admits an SRB measure $\mu$.

2. If $m(R > n) = O(\theta^n)$ for some $0 < \theta < 1$, then $\exists C > 0$ such that

 i) for some $0 < \theta' < 1$ and $\forall f \in \mathcal{H}$ and $g \in L^1(\nu)$,

  $\left| \int f (g \circ F^n)\, d\nu - \int f\, d\nu \int g\, d\nu \right| \le C\theta'^n \|f\|_h \|g\|_{L^1}$;   (1.30)

 ii) for some $0 < \theta'' < 1$ and $\forall f \in \mathcal{H}$ and $g \in L^1(\mu)$,

  $\left| \int f (g \circ \Phi^n)\, d\mu - \int f\, d\mu \int g\, d\mu \right| \le C\theta''^n \|f\|_h \|g\|_{L^1}$,   (1.31)

where $\|\cdot\|_h$ and $\|\cdot\|_{L^1}$ are as defined in (1.28) and (1.29), respectively.

3. If $m(R > n) = O(n^{-\alpha})$ for some $\alpha > 1$, then $\exists C > 0$ such that

 i) for all $f \in \mathcal{H}$ and all $g \in L^1(\nu)$,

  $\left| \int f (g \circ F^n)\, d\nu - \int f\, d\nu \int g\, d\nu \right| \le C n^{-\alpha+1} \|f\|_h \|g\|_{L^1}$;   (1.32)

 ii) for all $f \in \mathcal{H}$ and all $g \in L^1(\mu)$,

  $\left| \int f (g \circ \Phi^n)\, d\mu - \int f\, d\mu \int g\, d\mu \right| \le C n^{-\alpha+1} \|f\|_h \|g\|_{L^1}$.   (1.33)

In 2(ii) and 3(ii) the spaces $\mathcal{H}$ and $L^1(\mu)$ are defined in a way analogous to (1.28) and (1.29), on $X$ and with respect to the measure $\mu$.

1.2.5 Examples

Success of this approach, through the Markov Tower model, depends of course on being able to show that it is implementable and that it gives interesting results in concrete situations outside the class of uniformly hyperbolic systems.
This scheme is generally thought to work for systems with "enough hyperbolicity", that is, for systems which are expanding or hyperbolic on large parts, but not necessarily all, of their phase spaces. Specifically, the construction of the Markov Tower has been successfully carried out for various classes of examples of interest, such as dispersing billiards and certain logistic and Hénon-type maps. These systems were shown to have exponential decay of correlations in [Y2]. Later, in [Y3], polynomial decay of correlations was shown for piecewise expanding 1-dimensional maps with neutral fixed points. A detailed analysis of most of these examples is quite technical and cumbersome and, being beyond the scope of this thesis, is not presented here. However, the following examples help illustrate the idea.

Example 1: The Bernoulli map $2x \bmod 1$ on $(0,1]$ trivially assumes a Markov Tower structure ($R = 1$ on all of $(0,1]$) with $\{(0,1/2], (1/2,1]\}$ as the partition.

Example 2: The subshift of finite type is another example. Given a finite alphabet $V = \{v_1, v_2, \ldots, v_m\}$ and an $m \times m$ adjacency matrix with entries in $\{0,1\}$, the set of all admissible one-sided sequences is $\Sigma_A^+ = \{(x_0, x_1, \ldots) : x_i \in V \text{ and } A_{x_i x_{i+1}} = 1\ \forall i \in \mathbb{N}^*\}$. If $T$ is the shift operator from $\Sigma_A^+$ onto $\Sigma_A^+$, acting on a sequence by shifting all its symbols to the left, i.e. $(T(x))_i = x_{i+1}$, then the sets $\Sigma_A^k = \{(v_k, x_1, \ldots) : x_i \in V,\ A_{x_i x_{i+1}} = 1\}$, with $k = 1, 2, \ldots, m$, form a Markov partition of the set $\Sigma_A^+$. This can be thought of as a Markov Tower with return time universally equal to 1.

Example 3: (Gaspard-Wang map) A non-trivial example is the following non-uniformly expanding 1-dimensional parabolic map on the interval, with a neutral fixed point. The reference measure is the Lebesgue measure. Specifically, on $(0,1]$ we define

  $T(x) = \begin{cases} x + 2^\alpha x^{1+\alpha}, & \text{if } x \in (0, \frac{1}{2}] \\ 2x - 1, & \text{if } x \in (\frac{1}{2}, 1] \end{cases}$   (1.34)

with $0 < \alpha < 1$.

Figure 1.2: An Example of a Markov Tower. For the map $Tx = x + 2^\alpha x^{1+\alpha}$ if $x \in (0, \frac{1}{2}]$, and $Tx = 2x - 1$ on $x \in (1/2, 1]$, the interval $(0,1]$ is partitioned into the subintervals $A_i$, as shown in the picture, which satisfy $TA_i = A_{i-1}$, $\forall i \ge 1$, and $TA_0 = (0,1]$.

The above map, defined on $(0,1]$, is indeed non-uniformly expanding, as its derivative on the left branch, $T'(x) = 1 + 2^\alpha(1+\alpha)x^\alpha$, gets arbitrarily close to 1 near 0.

We show, for values of $\alpha$ in $(0,1)$, that this map fits the abstract Markov-Tower model discussed earlier. This identification can be used to derive statistical properties of this system on which other methods of proof would fail.

We start by finding a suitable partition of $(0,1]$. Let $\{A_i\}_{i\in\mathbb{N}^*}$ be defined by

  $A_i = \begin{cases} (\frac{1}{2}, 1], & \text{if } i = 0 \\ (a_i, a_{i-1}], & \text{if } i \ge 1 \end{cases}$   (1.35)

where the sequence $\{a_i\}_{i\in\mathbb{N}^*}$ is such that

  $T(a_1) = a_0 = \frac{1}{2}$ and $T(a_i) = a_{i-1}$ $\forall i \in \mathbb{N}$.   (1.36)

With this choice of partition (see Fig. 1.2), each interval $A_i$ is mapped onto $A_{i-1}$ under the action of $T$, and $A_0 = (\frac{1}{2}, 1]$ is mapped bijectively onto the entire interval $(0,1]$. This implies that each interval $A_i$ is eventually mapped bijectively onto $(0,1]$, specifically after $i+1$ iterations of the map (see Fig. 1.2).

We can consequently, in part, identify this system with the abstract model of the Markov Tower. The return time $R$ is given by $R|_{A_i} = i + 1$. To further verify that the rest of the properties are satisfied as well, we need to look at the tail of the return-time function.
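To see the tail numerically before computing it, the following sketch (Python; the value $\alpha = 0.4$ and the numerical tolerances are illustrative assumptions) computes the partition endpoints $a_i$ of (1.35)-(1.36) by inverting the increasing left branch, and compares $a_{n-1}$ — which, as shown next, equals $\mu(R > n)$ — with the rate $n^{-1/\alpha}$.

```python
# Numerical sketch of the return-time tail for the Gaspard-Wang map (1.34),
# assuming alpha = 0.4.  The endpoints a_i solve T(a_i) = a_{i-1} on the
# increasing left branch, found here by bisection.
alpha = 0.4
T_left = lambda x: x + 2**alpha * x**(1 + alpha)

def preimage_left(y, lo=1e-15, hi=0.5):
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if T_left(mid) < y else (lo, mid)
    return lo

a = [0.5]                                  # a_0 = 1/2
for i in range(1, 2001):
    a.append(preimage_left(a[-1]))

for n in (10, 100, 1000, 2000):
    print(n, a[n - 1], a[n - 1] * n**(1 / alpha))  # ratio roughly stabilizes
```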
Specifically, we look at the size of the sets $\mu(R > n)$, where $\mu$ is the Lebesgue measure. One has

  $\mu(R > n) = \sum_{i=n}^{\infty} \mu(A_i)$.   (1.37)

To complete this calculation we need the measure of each interval $A_i$; alternatively, the sum in (1.37) corresponds to the part of the interval $(0,1]$ that lies to the left of $a_{n-1}$, and therefore equals $a_{n-1}$. This can be shown to be of the order of $n^{-\frac{1}{\alpha}}$, and therefore, by Proposition 1, the system has an invariant measure equivalent to the Lebesgue measure and polynomial decay of correlations of the order of $n^{-\frac{1}{\alpha}+1}$.

1.3 Mixing Properties derived on the Markov Tower

1.3.1 Main Results

Recall that for each $n \in \mathbb{N}$ the elements of the joins $\mathcal{C}(n) = \bigvee_{i=0}^{n-1} F^{-i}\eta$ of a given measurable partition $\eta$ are called $n$-cylinders. The σ-algebra $\mathcal{F}$ generated by the $n$-cylinders, for all $n \ge 0$, is the σ-algebra of any system $(\Delta, \mathcal{F}, \nu)$ we consider in this thesis. Observe that two points $x$ and $y$ in $\Delta$ belong to the same $n$-cylinder if and only if they remain together for at least $n$ iterations of $F$, i.e. if $s(x,y) \ge n$. For each $n \in \mathbb{N}$ the $n$-cylinders $\mathcal{C}(n)$ form a new partition of the space, a refinement of the original partition.

Since the cylinder sets are the basis for the σ-algebra of a typical dynamical system, these sets are always of special interest. In particular, the aim of this chapter is to obtain specific rates of mixing in terms of the cylinder sets; later on, in Chapter 3, we study hitting and return times to sets of this nature.

Theorem 1. Let $(\Omega, \mathcal{F}, \Phi, m)$ be a dynamical system that admits a Markov Tower structure with return times satisfying $\int_\Omega R\, dm < \infty$, and let $\nu$ be an SRB measure on $\Omega$ (whose existence is guaranteed by Proposition 1). If $A = \bigcup_{i=1}^m A_i$, with $A_1, A_2, \ldots, A_m \in \mathcal{C}(n)$, is any finite union of $n$-cylinders and $B \in \mathcal{F}$, the following mixing conditions hold:

i) If $m(R > n) = O(\theta'^n)$ for some $0 < \theta' < 1$, then $\exists C > 0$ and $0 < \theta < 1$ such that

  $\left| \nu(A \cap F^{-n-k}B) - \nu(A)\nu(B) \right| \le C\theta^k \nu(B)$.   (1.38)

ii) If $m(R > n) = O(n^{-\alpha})$ for some $\alpha > 1$, then $\exists C > 0$ such that

  $\left| \nu(A \cap F^{-n-k}B) - \nu(A)\nu(B) \right| \le Ck^{-\alpha+1} \nu(B)$.   (1.39)

1.3.2 Proof of the Theorem

Lemma 1. Under the assumption of the bounded distortion property (1.20) for the Jacobian, there exist constants $C_1$ and $C_2$, independent of $n$, such that $\forall n \in \mathbb{N}$ and $A \in \mathcal{C}(n)$,

  $C_1 \le \frac{JF^n(x)}{JF^n(y)} \le C_2$ for all pairs $x, y$ in the same $A$,   (1.40)

where $JF^n(x) = JF(x)\, JF(F(x))\, JF(F^2(x)) \cdots JF(F^{n-1}(x))$.

Proof. Given $x, y$ in the same $A \in \mathcal{C}(n)$, by the definition of $JF^n$ we have

  $\left| \log \frac{JF^n(x)}{JF^n(y)} \right| = \left| \sum_{i=0}^{n-1} \log \frac{JF(F^i(x))}{JF(F^i(y))} \right| \le \sum_{i=0}^{n-1} \left| \log \frac{JF(F^i(x))}{JF(F^i(y))} \right|$
  $\le \sum_{i=0}^{n-1} K\, d(F^i(x), F^i(y))$  (using the distortion property (1.20))
  $= K \sum_{i=0}^{n-1} \beta^{s(F^i(x), F^i(y))} = K \sum_{i=0}^{n-1} \beta^{s(F^{n-1}(x), F^{n-1}(y)) + n - i} = K\beta^{s(F^{n-1}(x), F^{n-1}(y))} \sum_{i=0}^{n-1} \beta^{n-i}$
  $= K\, d(F^{n-1}(x), F^{n-1}(y)) \sum_{i=1}^{n} \beta^i \le K\, d(F^{n-1}(x), F^{n-1}(y)) \sum_{i=1}^{\infty} \beta^i \le K \frac{\beta}{1-\beta}$.   (1.41)

The constants $C_1 = e^{-K\frac{\beta}{1-\beta}}$ and $C_2 = e^{K\frac{\beta}{1-\beta}}$ satisfy our inequality.

Lemma 2. With the bounded distortion property (1.20) for the Jacobian still in place, there exists a constant $M > 0$ such that $\|\mathcal{L}^n 1\|_\infty \le M$ $\forall n \in \mathbb{N}$.

Proof. Using the bounded distortion property, and in particular the form derived in the first part of the proof of Theorem 2, there exists $K > 0$, independent of $n$, such that

  $\left| \frac{\mathcal{L}^n 1(x)}{\mathcal{L}^n 1(y)} - 1 \right| \le K\, d(x,y)$ $\forall n \in \mathbb{N}$ and $\forall x, y$ in the same level $\Delta_l$,   (1.42)

and, in particular, for a constant $K_1$ also independent of $n$,

  $\mathcal{L}^n 1(x) \le K_1 \mathcal{L}^n 1(y)$ $\forall x, y \in \Delta_l$.   (1.43)
Therefore, keeping $x \in \Delta_l$ fixed and integrating over $\Delta_l$ gives

  $\mathcal{L}^n 1(x) \le K_1 \frac{1}{\nu(\Delta_l)} \int_{\Delta_l} \mathcal{L}^n 1(y)\, d\nu(y)$   (1.44)
  $\le K_1 \frac{1}{\nu(\Delta_l)} \int_{\Delta} \mathcal{L}^n 1(y)\, d\nu(y)$   (1.45)
  $= K_1 \frac{1}{\nu(\Delta_l)} \int_{\Delta} 1(y)\, d\nu(y)$   (1.46)
  $= \frac{K_1}{\nu(\Delta_l)}$,   (1.47)

after the measure on $\Delta$ has been normalized to $\nu(\Delta) = 1$. Therefore, up to a fixed level $\Delta_{l_0}$, with $l_0 \ge 2$, $\mathcal{L}^n 1(x)$ is uniformly bounded by

  $K_1 \sup_{l \le l_0} \left\{ \frac{1}{\nu(\Delta_l)} \right\}$.   (1.48)

However, we need to look into what happens as $l$ increases to $\infty$, where $\nu(\Delta_l)$ shrinks to 0 and the bound $\frac{K_1}{\nu(\Delta_l)}$ therefore explodes to $\infty$. Notice that, since $JF = 1$ on $\Delta \setminus F^{-1}\Delta_0$, for $l \ge 2$ we have $\sup_{x\in\Delta_l} |\mathcal{L}1(x)| = 1$ for the transfer operator, and as a result the following estimate follows: if by $M_n$ we denote $M_n = \sup_{x\in\Delta} \mathcal{L}^n 1(x)$, then $\forall x \in \Delta_l$ with $l \ge 2$,

  $\mathcal{L}^{n+1} 1(x) \le \mathcal{L}1(x) \sup_\Delta \mathcal{L}^n 1 \le 1 \cdot M_n = M_n$
  $\Rightarrow M_{n+1} \le \max\left( K \sup_{l \le 2} \frac{1}{\nu(\Delta_l)},\; M_n \right)$   (1.49)
  $\Rightarrow M_n \le \max\left( K \sup_{l \le 2} \frac{1}{\nu(\Delta_l)},\; M_1 \right) \equiv M$ $\forall n \in \mathbb{N}$
  $\Rightarrow \|\mathcal{L}^n 1\|_\infty \le M$ $\forall n \in \mathbb{N}$.   (1.50)

Note that, by construction, $M$ does not depend on $n$.

Theorem 2. Under the regularity assumption of the bounded distortion property for the Jacobian, for all $n \in \mathbb{N}$ and any $n$-cylinder $A \in \mathcal{C}(n)$ the function $\mathcal{L}^n 1_A \in \mathcal{H}$ and, moreover, there exists a real positive constant $L$, independent of $n$, such that $\|\mathcal{L}^n 1_A\|_h \le L$, where $1_A$ denotes the indicator function of the set $A$, defined as

  $1_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$   (1.51)

Note that the constant $L$ in the above theorem is independent of $n$, the order of the cylinder set $A$.

Proof. We want to estimate $|\mathcal{L}^n 1_A|_h$. In the course of doing so we need to look at the difference

  $d_n(x,y) \equiv |\mathcal{L}^n 1_A(x) - \mathcal{L}^n 1_A(y)|$ for $x, y \in \Delta$.   (1.52)

Below we make use of the following result: for $0 < k < k' < \infty$ there exist constants $C$ and $C'$ (which depend on $k, k'$) such that

  $C|\log x| \le |1 - x| \le C'|\log x|$ $\forall x \in [k, k']$.   (1.53)

To see this, note that both functions $|1-x|$ and $|\log x|$ are bounded and continuous on $[k, k']$ and share their only zero at $x = 1$; moreover, the ratio $|1-x|/|\log x|$ extends continuously to $x = 1$ (with value 1), so it is bounded above and below by positive constants on $[k, k']$.

To estimate the quantity $d_n(x,y)$ in (1.52), two cases arise:

a) $x$ and $y$ belong to distinct $\Delta_{i,j}$'s;

b) $x$ and $y$ belong to the same $\Delta_{i,j}$.

In the first case, if $x$ and $y$ belong to distinct $\Delta_{i,j}$'s, we have $d(x,y) = 1$ and therefore the result follows trivially, by virtue of Lemma 2, with $L \le 2M$.

If $x$ and $y$ belong to the same element of the partition $\Delta_{i,j}$, for some $i, j \in \mathbb{N}$, notice that either both $\mathcal{L}^n 1_A(x)$ and $\mathcal{L}^n 1_A(y)$ are equal to zero, in which case we are done with $L$ being any positive real, or they are both positive. One can easily see this by observing that for an $n$-cylinder $A$ its $n$-th image under $F$ is the union of one or more $\Delta_{i,j}$'s. As a consequence, if $x, y \in \Delta_{i,j}$ for some $i, j \in \mathbb{N}$ and $F^{-n}(x) \cap A \ne \emptyset$, then there will be $y' \in A$ such that $F^n(y') = y$ and therefore $F^{-n}(y) \cap A \ne \emptyset$ as well.

As a result, the only interesting case that needs to be studied is when $x$ and $y$ belong to the same $\Delta_{i,j}$ and both $\mathcal{L}^n 1_A(x)$ and $\mathcal{L}^n 1_A(y)$ are non-zero. We consider such a pair $x$ and $y$ and look at the difference $d_n(x,y)$. We have

  $d_n(x,y) = |\mathcal{L}^n 1_A(x) - \mathcal{L}^n 1_A(y)| = \mathcal{L}^n 1_A(x) \left| 1 - \frac{\mathcal{L}^n 1_A(y)}{\mathcal{L}^n 1_A(x)} \right| = \mathcal{L}^n 1_A(x) \left| 1 - \frac{\sum_{y' \in F^{-n}y} \frac{1}{JF^n(y')}\, 1_A(y')}{\sum_{x' \in F^{-n}x} \frac{1}{JF^n(x')}\, 1_A(x')} \right|$,   (1.54)

where $JF^n(x) = JF(x)\, JF(F(x))\, JF(F^2(x)) \cdots JF(F^{n-1}(x))$.

Here, notice that while $F^{-n}x$ typically results in infinitely many branches, for this particular choice of $x$ and $y$ there will be one and only one point in the $n$-th pull-back of $x$ and of $y$, respectively, that lies in the cylinder $A$.
This is true because there are no two points $x'$ and $x''$ in $A$ such that $F^n(x') = F^n(x'')$: by the bijective nature of $F$ on each $\Delta_{i,j}$, for two points to finally "converge" they would first have to separate, and we know that this does not happen for $x'$ and $x''$ until at least $n$ iterations of $F$ later.

Therefore, denoting the unique $n$-preimages of $x$ and $y$ in $A$ by $x'$ and $y'$, respectively, we have

  $d_n(x,y) = \mathcal{L}^n 1_A(x) \left| 1 - \frac{JF^n(x')}{JF^n(y')} \right|$.   (1.55)

Using this representation, and by utilizing Lemma 2 and the inequalities (1.40) and (1.53), we have

  $d_n(x,y) \le \|\mathcal{L}^n 1_A\|_\infty \left| \log \frac{JF^n(x')}{JF^n(y')} \right| \le M \left| \log \frac{JF(x')\, JF(F(x')) \cdots JF(F^{n-1}(x'))}{JF(y')\, JF(F(y')) \cdots JF(F^{n-1}(y'))} \right|$
  $= M \left| \sum_{i=0}^{n-1} \log \frac{JF(F^i(x'))}{JF(F^i(y'))} \right| \le M \sum_{i=0}^{n-1} \left| \log \frac{JF(F^i(x'))}{JF(F^i(y'))} \right|$.   (1.56)

Notice that in the above sum most of the terms are equal to zero, since $JF = 1$ on $\Delta \setminus F^{-1}\Delta_0$, and for the rest the bounded distortion property applies. In addition, the separation times of $x'$ and $y'$ along their $F$-trajectory are related as follows:

  $s(F^j(x'), F^j(y')) = s(x,y) + n - j$ $\forall j = 0, 1, \ldots, n-1$.   (1.57)

We can then extend inequality (1.56) as follows:

  $d_n(x,y) \le M \sum_{i=0}^{n-1} K\, d(F^i(x'), F^i(y')) = MK \sum_{i=0}^{n-1} \beta^{s(F^i(x'), F^i(y'))} = MK \sum_{i=0}^{n-1} \beta^{s(x,y)+n-i}$
  $= MK \beta^{s(x,y)} \sum_{i=0}^{n-1} \beta^{n-i} = MK\, d(x,y) \sum_{i=1}^{n} \beta^i \le MK\, d(x,y) \sum_{i=1}^{\infty} \beta^i = MK \frac{\beta}{1-\beta}\, d(x,y)$,   (1.58)

where the constant $MK\frac{\beta}{1-\beta}$ is independent of $n$. This completes the proof: we have shown that for all $n \in \mathbb{N}$ and all $x, y \in \Delta$ there exists a positive constant $L = \max\{2M,\, MK\frac{\beta}{1-\beta}\}$, independent of $n$, such that $|\mathcal{L}^n 1_A(x) - \mathcal{L}^n 1_A(y)| \le L\, d(x,y)$, and therefore

  $|\mathcal{L}^n 1_A|_h \le L$ $\forall n \in \mathbb{N}$ and $A \in \mathcal{C}(n)$.   (1.59)

Theorem 3. Let $(\Omega, \mathcal{F}, \Phi, m)$ be a dynamical system that admits a Markov Tower structure with return times satisfying $\int_\Omega R\, dm < \infty$. If $\nu$ is an SRB measure on $\Omega$ (whose existence is guaranteed by Proposition 1), and if $A \in \mathcal{C}(n)$ and $B \in \mathcal{F}$, the following mixing conditions hold:

i) If $m(R > n) = O(\theta'^n)$ for some $0 < \theta' < 1$, then $\exists C > 0$ and $0 < \theta < 1$ such that

  $\left| \nu(A \cap F^{-(n+k)}B) - \nu(A)\nu(B) \right| \le C\theta^k \nu(B)$.   (1.60)

ii) If $m(R > n) = O(n^{-\alpha})$ for some $\alpha > 1$, then $\exists C > 0$ such that

  $\left| \nu(A \cap F^{-(n+k)}B) - \nu(A)\nu(B) \right| \le Ck^{-\alpha+1} \nu(B)$.   (1.61)

Proof. We do the first part (i), with the exponential decay; the proof of the second part is the same. By Proposition 1, part 2(i), we have

  $\left| \int f (g \circ F^k)\, d\nu - \int f\, d\nu \int g\, d\nu \right| \le C\theta^k \|f\|_h \|g\|_{L^1}$ $\forall f \in \mathcal{H}$ and $g \in L^1(\nu)$.   (1.62)

Taking $f = \mathcal{L}^n 1_A$ and $g = 1_B$ and plugging them into the left-hand side of (1.62) gives

  $\left| \int f (g \circ F^k)\, d\nu - \int f\, d\nu \int g\, d\nu \right| = \left| \int \mathcal{L}^n 1_A\, (1_B \circ F^k)\, d\nu - \nu(A)\nu(B) \right|$
  $= \left| \int 1_A\, (1_B \circ F^{n+k})\, d\nu - \nu(A)\nu(B) \right| = \left| \nu(A \cap F^{-(n+k)}B) - \nu(A)\nu(B) \right|$.   (1.63)

Equation (1.62) then becomes

  $\left| \nu(A \cap F^{-(n+k)}B) - \nu(A)\nu(B) \right| \le C\theta^k \|\mathcal{L}^n 1_A\|_h \|1_B\|_{L^1}$   (1.64)

and therefore, by invoking Theorem 2, we get

  $\left| \nu(A \cap F^{-(n+k)}B) - \nu(A)\nu(B) \right| \le CL\theta^k \nu(B)$.   (1.65)

To prove the main theorem, a refinement of Theorem 3 which covers the case where $A$ is a finite union of $n$-cylinders, we will need the following lemma.

Lemma 3. Suppose $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n$ are positive reals which satisfy

  $\left| 1 - \frac{a_i}{b_i} \right| \le \epsilon$ for all $i = 1, 2, \ldots, n$.   (1.66)

Then

  $\left| 1 - \frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} \right| \le \epsilon$ as well.   (1.67)

Proof. $\left| 1 - \frac{a_i}{b_i} \right| \le \epsilon \iff 1 - \epsilon \le \frac{a_i}{b_i} \le 1 + \epsilon \iff (1-\epsilon) b_i \le a_i \le (1+\epsilon) b_i$. Summing over $i$ gives

  $(1-\epsilon) \sum_{i=1}^n b_i \le \sum_{i=1}^n a_i \le (1+\epsilon) \sum_{i=1}^n b_i$   (1.68)

and, therefore,

  $(1-\epsilon) \le \frac{\sum_{i=1}^n a_i}{\sum_{i=1}^n b_i} \le (1+\epsilon) \iff \left| 1 - \frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} \right| \le \epsilon$.   (1.69)

Main Theorem

Theorem 1 (restated).
Let $(\Omega, \mathcal{F}, \Phi, m)$ be a dynamical system that admits a Markov Tower structure with return times satisfying $\int_\Omega R\, dm < \infty$, and let $\nu$ be an SRB measure on $\Omega$ (whose existence is guaranteed by Proposition 1). If $A = \bigcup_{i=1}^m A_i$, with $A_1, A_2, \ldots, A_m \in \mathcal{C}(n)$, is any finite union of $n$-cylinders and $B \in \mathcal{F}$, the following mixing conditions hold:

i) If $m(R > n) = O(\theta'^n)$ for some $0 < \theta' < 1$, then $\exists C > 0$ and $0 < \theta < 1$ such that

  $\left| \nu(A \cap F^{-n-k}B) - \nu(A)\nu(B) \right| \le C\theta^k \nu(B)$.   (1.70)

ii) If $m(R > n) = O(n^{-\alpha})$ for some $\alpha > 1$, then $\exists C > 0$ such that

  $\left| \nu(A \cap F^{-n-k}B) - \nu(A)\nu(B) \right| \le Ck^{-\alpha+1} \nu(B)$.   (1.71)

The bound, as in Theorem 3, depends neither on the order $n$ of the cylinders nor on their number $m$.

Proof. We mimic the proof of Theorem 3, this time with $f = \mathcal{L}^n 1_{\cup_{i=1}^m A_i}$. As a preliminary step we prove, as in Theorem 2, that $\|\mathcal{L}^n 1_{\cup_{i=1}^m A_i}\|_h \le L$ for all $m$ and all $n \in \mathbb{N}$. Repeating the steps of Theorem 2, it suffices to show that the quantity

  $\left| 1 - \frac{\sum_{y' \in F^{-n}y} \frac{1}{JF^n(y')}\, 1_{\cup_{i=1}^m A_i}(y')}{\sum_{x' \in F^{-n}x} \frac{1}{JF^n(x')}\, 1_{\cup_{i=1}^m A_i}(x')} \right|$   (1.72)

is uniformly bounded. Note that, while in the case of a single cylinder we had a unique $n$-preimage of $x$ and of $y$ lying in $A$, in this case we will have (at most) $m$ preimages of $x$ and $m$ preimages of $y$ that lie in $A = \cup_{i=1}^m A_i$. We remark that these $n$-preimages always come in pairs, meaning that if $y$ has an $n$-preimage in $A_i$ then so does $x$. We denote these pairs by $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, with the index $i$ indicating that $x_i, y_i \in A_i$. Therefore the quantity (1.72) becomes

  $\left| 1 - \frac{\frac{1}{JF^n(y_1)} + \frac{1}{JF^n(y_2)} + \cdots + \frac{1}{JF^n(y_m)}}{\frac{1}{JF^n(x_1)} + \frac{1}{JF^n(x_2)} + \cdots + \frac{1}{JF^n(x_m)}} \right|$.   (1.73)

From the proof of Theorem 2 we have, individually for each pair $(x_i, y_i)$, that

  $\left| 1 - \frac{1/JF^n(y_i)}{1/JF^n(x_i)} \right| \le K \frac{\beta}{1-\beta}$ $\forall i = 1, 2, \ldots, m$.   (1.74)

We can then conclude, by using Lemma 3, that

  $\left| 1 - \frac{\sum_{y' \in F^{-n}y} \frac{1}{JF^n(y')}\, 1_{\cup_{i=1}^m A_i}(y')}{\sum_{x' \in F^{-n}x} \frac{1}{JF^n(x')}\, 1_{\cup_{i=1}^m A_i}(x')} \right| = \left| 1 - \frac{\frac{1}{JF^n(y_1)} + \cdots + \frac{1}{JF^n(y_m)}}{\frac{1}{JF^n(x_1)} + \cdots + \frac{1}{JF^n(x_m)}} \right|$   (1.75)
  $\le K \frac{\beta}{1-\beta}$   (1.76)

as well, and this ends the proof of the theorem.

Chapter 2
The Stein Method

2.1 Introduction

It is very common in probability theory to look to approximate complicated distributions by other known, simpler ones; the Central Limit Theorem is a classical such example. Stein's method is one of the tools available today for doing this, and there is a wide variety of applications in which the method is applied with great success. What makes Stein's method popular and interesting among such methods is not only that it works effectively in situations with dependence, where tools such as Fourier analysis typically become quite awkward, but also that it provides sharp bounds for the estimation. In addition, Stein's method comes with no restrictions on the distribution to be approximated; there is already a plethora of examples where the method is used to provide approximations by the normal as well as the Poisson distribution. The Poisson distribution is what we are going to use, as well, to derive our results. Below, before we proceed, we give all the definitions, notation and results that we are going to use during the course of this chapter.

2.1.1 The Framework

Let $(S, \mathcal{S}, \mu)$ be a probability space, let $\chi$ be the set of measurable functions $h : S \to \mathbb{R}$, and let $\chi_0 \subset \chi$ be a set of $\mu$-integrable functions; e.g. $\chi_0$ can be the set of all indicator functions $\{I_A;\ A \in \mathcal{S}\}$.
Now imagine that, for a given $h$ in the class $\chi_0$ above, one wants to calculate the integral $\int_S h\, d\mu$. Depending on the structure of $\mu$ this may or may not be feasible. In case $\mu$ is such that it is impossible to calculate $\int_S h\, d\mu$ exactly, we do the following: we replace the measure $\mu$ by a known, simpler measure $\mu_0$ that is close enough to $\mu$, calculate $\int_S h\, d\mu_0$, and finally try to estimate the error made in this approximation.

Stein was the first to carry this out successfully, in 1972 [S], in the context of normal approximation. Later on, in 1975 [C], Chen showed that this can be adapted to approximation by the Poisson distribution as well. Over time the method has been extended and worked out for distributions beyond the normal and Poisson, such as the compound Poisson and the Poisson point process.

The method, then, works as follows. Find a set of functions $\mathcal{F}_0$ and a mapping $T_0 : \chi \to \chi$ such that, for each $h \in \chi_0$, the equation

  $T_0 f = h - \int_S h\, d\mu_0$   (2.1)

has a solution $f \in \mathcal{F}_0$. If indeed there exists such a solution $f$, then this gives

  $\int_S (T_0 f)\, d\mu = \int_S h\, d\mu - \int_S h\, d\mu_0$.   (2.2)

Notice that in (2.2) the right-hand side displays the difference between the integral of $h$ with respect to the initial measure $\mu$ and the integral with respect to the known measure $\mu_0$. Remember that what we are trying to do here is approximate the integral $\int_S h\, d\mu$ by the integral $\int_S h\, d\mu_0$. Therefore the absolute value of the left-hand side of (2.2) is exactly the error of this approximation, and hence all one focuses on, once the measure $\mu_0$ is determined and fixed, is estimating the size of $|\int_S (T_0 f)\, d\mu|$.

The mapping $T_0$ depends on the measure $\mu_0$ and is called the Stein operator for the distribution $\mu_0$. Equation (2.1) is called the Stein equation, and the solution $f$ of (2.1) is called the Stein transform of $h$. To be more precise, $f$ is called a Stein transform, since, as we will see below, for a given $h$ the solution $f$ is unique only up to its value at a single point. How a Stein operator $T_0$ can be constructed for a given $\mu_0$ is beyond the scope of this thesis; we rather take the known results that interest us as given and build upon them. We mention here, however, that in principle there is no known method or prescribed recipe that is guaranteed to work all the time, and therefore ad hoc techniques may be required.

2.2 Poisson Approximation

Poisson approximation is what we use in our problem. It is the natural choice when it comes to approximating the distribution of a sum of indicator (0-1) functions which exhibit some dependence, and where each indicator has a small probability of assuming the value 1.

Using the same notation as in Section 2.1.1, in the case of the Poisson approximation we have $(S, \mathcal{S}, \mu) = (\mathbb{Z}^+, \mathcal{B}_{\mathbb{Z}^+}, \mu)$, where $(S, \mathcal{S}) = (\mathbb{Z}^+, \mathcal{B}_{\mathbb{Z}^+})$ is the set of non-negative integers equipped with the power σ-algebra. Additionally, and still adhering to the notation of Section 2.1.1, we take $\mu_0 = \mathcal{P}_0(\lambda)$, where $\mathcal{P}_0(\lambda)$ represents a Poisson-distribution measure with mean $\lambda$. Recall that a Poisson-distributed random variable $X$ with mean $\lambda$, denoted $\mathcal{P}(\lambda)$, has the discrete distribution with probability function

  $P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!}$, $\forall k \in \mathbb{Z}^+$.   (2.3)

Also let $\mathcal{F}_0 = \chi$ be the set of all real-valued functions on $\mathbb{Z}^+$. Define the Stein operator $T_0 : \chi \to \chi$ by

  $(T_0 f)(k) = \lambda f(k+1) - k f(k)$, $\forall k \in \mathbb{Z}^+$.   (2.4)

Below we list a number of properties of the Stein operator $T_0$; first, a short numerical sketch.
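The sketch below (Python with NumPy; the parameter λ = 3, the set E = {0, 1} and the test function are arbitrary illustrative choices) checks the two facts we use about the operator (2.4): the characterizing identity of Theorem 5 below, and that the explicit formula (2.6) of Theorem 4 indeed solves the Stein equation (2.1).

```python
import math
import numpy as np

lam = 3.0

# (a) For Z ~ P(lam) and bounded f, E[ lam f(Z+1) - Z f(Z) ] = 0  (cf. (2.8)).
rng = np.random.default_rng(1)
Z = rng.poisson(lam, size=10**6)
g = lambda k: np.sin(k) / (1.0 + k)          # an arbitrary bounded function
print(np.mean(lam * g(Z + 1) - Z * g(Z)))    # ~ 0

# (b) The Stein solution for h = 1_E via the explicit formula (2.6).
E = {0, 1}
pmf = lambda i: math.exp(-lam) * lam**i / math.factorial(i)
Eh = sum(pmf(i) for i in E)                  # integral of h against P(lam)

def f_stein(k):                              # f(0) is arbitrary; set to 0
    if k == 0:
        return 0.0
    s = sum(((i in E) - Eh) * lam**i / math.factorial(i) for i in range(k))
    return math.factorial(k - 1) / lam**k * s

# f solves (2.1): lam f(k+1) - k f(k) = h(k) - Eh for every k.
residuals = [lam * f_stein(k + 1) - k * f_stein(k) - ((k in E) - Eh)
             for k in range(12)]
print(max(abs(r) for r in residuals))        # ~ 0
```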
Theorem 4. The Stein equation

  $T_0 f = h - \int_{\mathbb{Z}^+} h\, d\mu_0$,   (2.5)

for the Poisson Stein operator in (2.4), has a solution $f$ for each $\mu_0$-integrable $h \in \chi$. The solution $f$ is unique except for $f(0)$, which can be chosen arbitrarily. $f$ can be computed recursively from the Stein equation, and one can obtain an explicit form as follows:

  $f(k) = \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{k-1} \left( h(i) - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!}$   (2.6)
  $= -\frac{(k-1)!}{\lambda^k} \sum_{i=k}^{\infty} \left( h(i) - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!}$, $\forall k \in \mathbb{N}$.   (2.7)

Remark 1. Notice that if $h$ is bounded then so is the Stein solution $f$.

Proof. We follow the proof given in [BC1]. The first representation can easily be verified by direct computation. The second representation follows from the equality

  $\frac{(k-1)!}{\lambda^k} \sum_{i=0}^{k-1} \left( h(i) - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!} + \frac{(k-1)!}{\lambda^k} \sum_{i=k}^{\infty} \left( h(i) - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!}$
  $= \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{\infty} \left( h(i) - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!} = \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{\infty} h(i) \frac{\lambda^i}{i!} - \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{\infty} \left( \int_{\mathbb{Z}^+} h\, d\mu_0 \right) \frac{\lambda^i}{i!}$
  $= e^\lambda \frac{(k-1)!}{\lambda^k} \left( \sum_{i=0}^{\infty} h(i) \frac{e^{-\lambda}\lambda^i}{i!} - \int_{\mathbb{Z}^+} h\, d\mu_0 \sum_{i=0}^{\infty} \frac{e^{-\lambda}\lambda^i}{i!} \right) = e^\lambda \frac{(k-1)!}{\lambda^k} \left( \int_{\mathbb{Z}^+} h\, d\mu_0 - \int_{\mathbb{Z}^+} h\, d\mu_0 \right) = 0$.

Theorem 5. A probability measure $\mu$ on $(\mathbb{Z}^+, \mathcal{B}_{\mathbb{Z}^+})$ is Poisson $\mathcal{P}_0(\lambda)$ if and only if

  $\int_{\mathbb{Z}^+} (T_0 f)\, d\mu = 0$ for all bounded functions $f : \mathbb{Z}^+ \to \mathbb{R}$.   (2.8)

Proof. We follow the proof given in [BC1].

"⇒" By direct computation.

"⇐" Fix a set $A \subset \mathbb{Z}^+$ and take $h$ to be the indicator function of the set $A$, i.e. $h = I_A$. Then simple integration of the Stein equation (2.1) with respect to $\mu$ gives

  $\mu(A) - \mu_0(A) = \int_{\mathbb{Z}^+} (T_0 f)\, d\mu = 0$.   (2.9)

Since $A$ was arbitrary, we have $\mu \equiv \mu_0$.

When we approximate a probability measure $\mu$ on $(\mathbb{Z}^+, \mathcal{B}_{\mathbb{Z}^+})$ by a Poisson distribution $\mathcal{P}_0(\lambda)$, the error made, for a given $A \subset \mathbb{Z}^+$, is given by

  $|\mu(A) - \mu_0(A)| = \left| \int_{\mathbb{Z}^+} (T_0 f)\, d\mu \right| = \left| \int_{\mathbb{Z}^+} (\lambda f(k+1) - k f(k))\, d\mu \right|$,   (2.10)

where $f$ is the Stein solution that corresponds to the indicator function $1_A$. Sharp bounds for the quantity on the right-hand side of (2.10) are what one is after when the Stein method is used for Poisson approximation. In the next chapter we get more insight into how this technique works, as approximating the law of the return times is a direct application of the Stein method.

Chapter 3
Multiple Return Times

3.1 Definitions and Notation

In this chapter we study the distribution of hitting and multiple return times for $n$-cylinders. For what follows, $(\Omega, \mathcal{F}, \mu, F)$ is an ergodic, $\mu$-measure-preserving dynamical system. We start with the following definitions.

Definition 13. (Hitting and Return Times) For a set $U \subset \Omega$ the hitting time $\tau_U : \Omega \to \mathbb{N} \cup \{\infty\}$ is a random variable defined on the entire set $\Omega$ as follows:

  $\tau_U(x) = \inf\{k \ge 1 : F^k(x) \in U\}$.   (3.1)

If we narrow the domain of $\tau_U$ down to the set $U$, then $\tau_U$ is called the return time or first-return time. We can then define the $k$-th return time, denoted by $\tau_U^k$, as the number of iterations of $F$ it takes until the system enters the set $U$ for the $k$-th time. Mathematically, $\tau_U^k$ is defined by induction in the following way:

  $\tau_U^k(x) = \begin{cases} \tau_U(x) & \text{for } k = 1 \\ \inf\{l > \tau_U^{k-1}(x) : F^l(x) \in U\} & \text{for } k \ge 2. \end{cases}$   (3.2)

Definition 14. (Recurrence Time) For $A \subset \Omega$ we define the recurrence time of $A$, under the map $F$, to be

  $r_A = \inf\{1 \le n \in \mathbb{N} \mid A \cap F^{-n}(A) \ne \emptyset\}$.   (3.3)

Proposition 2. (Poincaré Recurrence Theorem, 1890) Let $(\Omega, \mathcal{F}, \nu, F)$ be a measure-preserving dynamical system and $A \in \mathcal{F}$ a measurable set with $\nu(A) > 0$. Then the set of those points $x \in A$ such that $F^n(x) \notin A$ for all $n > 0$ has zero measure. That is, almost every point in $A$ returns to $A$ and, in fact, returns to $A$ infinitely often. Specifically,

  $\nu(\{x \in A : \exists N > 0 \text{ such that } F^n(x) \notin A \text{ for all } n > N\}) = 0$.   (3.4)
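(As an aside before the proof: hitting and return times are straightforward to compute along a finite orbit. The sketch below does so for Definition 13; the map — the logistic map $x \mapsto 4x(1-x)$, chosen because its floating-point orbits behave well — the target set $U$ and the starting point are illustrative placeholders.)

```python
# Sketch of Definition 13: the successive return times tau^1_U < tau^2_U < ...
# recorded along a finite orbit of a placeholder map.
def visit_times(F, x0, in_U, n_steps, k_max):
    times, x = [], x0
    for n in range(1, n_steps + 1):
        x = F(x)
        if in_U(x):
            times.append(n)          # n = tau^{len(times)}_U(x0)
            if len(times) == k_max:
                break
    return times

F = lambda x: 4.0 * x * (1.0 - x)    # logistic map: a float-friendly example
print(visit_times(F, 0.123, lambda x: x <= 0.05, 10**6, 5))
```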
Proof. We follow the proof given in [BS]. Let

  $B = \{x \in A : F^n(x) \notin A \text{ for all } n \in \mathbb{N}\}$.   (3.5)

Notice that $B$ can equivalently be expressed as

  $B = A \setminus \bigcup_{n \in \mathbb{N}} F^{-n}(A)$.   (3.6)

The set $B$ is measurable and so are its preimages. The preimages $F^{-n}B$, $n \ge 0$, are pairwise disjoint: a point $x \in F^{-m}B \cap F^{-n}B$ with $m < n$ would give a point $F^m(x) \in B$ whose $(n-m)$-th iterate lies in $B \subset A$, contradicting the definition of $B$. Additionally, by the invariance of the measure, the preimages $F^{-n}B$ all have the same measure as $B$ itself, and since the total measure of $\Omega$ is finite, $B$ must have measure 0.

Note that, even though the Poincaré Recurrence Theorem states that the system returns to the set $A$ within a finite horizon, it does not guarantee that the system, starting at a random point outside $A$, will also visit $A$ in finite time.

Proposition 3. [K] (Kac's Theorem) Let $(\Omega, \mathcal{F}, \nu, F)$ be an ergodic dynamical system and $A$ a measurable set with $\nu(A) > 0$. For the return time $\tau_A$ it holds that

  $E_A(\tau_A) = \frac{1}{\nu(A)}$,   (3.7)

where $E_A$ denotes the conditional expectation with respect to $A$.

Kac's theorem quantifies the conclusion of the Poincaré Recurrence Theorem for ergodic measures. It asserts that not only is the return time $\tau_A$ finite but it is on average equal to the inverse of the measure of $A$, $\nu(A)$, as intuition suggests. (A quick empirical check of (3.7) follows below.)

Definition 15. [BS] (Entropy of a partition) In a probability space $(\Omega, \mathcal{F}, \nu)$, the entropy of a partition $\zeta = \{C_a : a \in I\}$ is defined to be

  $H(\zeta) = -\sum_{C_a \in \zeta} \nu(C_a) \log \nu(C_a)$,   (3.8)

where $\log$ is the natural logarithm, with the convention $0 \log 0 = 0$.

Definition 16. [BS] (Entropy of a transformation) In a measure-preserving dynamical system $(\Omega, \mathcal{F}, \nu, F)$, the metric entropy of the transformation $F$ relative to a partition $\zeta = \{C_a : a \in I\}$ with finite entropy is defined to be

  $h(F, \zeta) = \lim_{n\to\infty} \frac{1}{n} H(\zeta^n)$,   (3.9)

where

  $\zeta^n = \zeta \vee F^{-1}(\zeta) \vee \cdots \vee F^{-(n-1)}(\zeta)$   (3.10)

is the partition that consists of the $n$-cylinders generated from $\zeta$.

Definition 17. [BS] (Measure-Theoretic entropy) The measure-theoretic entropy of a measure-preserving dynamical system $(\Omega, \mathcal{F}, \nu, F)$ is defined to be

  $h(\nu) = \sup_{\zeta \text{ finite measurable partition of } \Omega} h(F, \zeta)$,   (3.11)

with $h(F, \zeta)$ as in Definition 16. If a partition $\zeta^*$ is generating (i.e. the elements of $\mathcal{C}(\infty)$ are single points) and finite, or countably infinite with $H(\zeta^*) < \infty$, one can show that

  $h(F, \zeta^*) = \sup_{\zeta \text{ finite measurable partition of } \Omega} h(F, \zeta) = h(\nu)$.   (3.12)

Proposition 4. ([BS], [M]) (Shannon-McMillan-Breiman Theorem) Let $(\Omega, \mathcal{F}, \nu, F)$ be an ergodic, measure-preserving dynamical system. If the underlying partition $\zeta$ (which generates $\mathcal{F}$) is finite, or countably infinite with $H(\zeta) < \infty$, then for $\nu$-a.e. $x \in \Omega$, if $A_n(x) \in \mathcal{C}(n)$ is the $n$-cylinder that contains the point $x$, the following property holds:

  $\lim_{n\to\infty} -\frac{1}{n} \log \nu(A_n(x)) = h(F, \zeta)$.   (3.13)

Proposition 4 implies that, for a typical $x \in \Omega$, the quantity $|\log \nu(A_n(x))|$ grows asymptotically like $n\, h(F, \zeta)$. Specifically, given the partition $\zeta$, there is a set $E$ of measure 0 such that for any fixed point $x \in \Omega \setminus E$ there exists a constant $C > 0$ such that

  $|\log \nu(A_n(x))| \le Cn$ $\forall n \in \mathbb{N}$.   (3.14)

The constant $C$ in (3.14) possibly depends on the point $x$.
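Returning for a moment to Kac's formula (3.7), it is easy to test empirically. The sketch below uses the logistic map $F(x) = 4x(1-x)$, whose absolutely continuous invariant measure has the known density $1/(\pi\sqrt{x(1-x)})$, so $\nu(A)$ is available in closed form; the interval $A$ and the seed are illustrative choices.

```python
import numpy as np

# Empirical check of Kac's theorem (3.7) for the logistic map.  Gaps between
# successive visits to A along a typical orbit average to E_A(tau_A).
F = lambda x: 4.0 * x * (1.0 - x)
a, b = 0.4, 0.6                            # A = (0.4, 0.6), a placeholder set
nu_A = (np.arcsin(np.sqrt(b)) - np.arcsin(np.sqrt(a))) * 2.0 / np.pi

rng = np.random.default_rng(2)
x, last_visit, gaps = rng.random(), None, []
for n in range(2 * 10**6):
    x = F(x)
    if a < x < b:
        if last_visit is not None:
            gaps.append(n - last_visit)
        last_visit = n
print(np.mean(gaps), 1.0 / nu_A)           # the two numbers agree
```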
3.2 Main Results

3.2.1 Return Times

Theorem 6. Let $(\Omega, \mathcal{F}, \nu, F)$ be an α-mixing¹, with respect to a partition $\eta$ (finite or countably infinite with $H(\eta) < \infty$), measure-preserving dynamical system, let $A$ be a cylinder set of order $n$, and let $\tau_A^k$ be the $k$-th time the system visits the set $A$, with $k \ge 2$. Additionally, let $\tilde{C}$ be a positive fixed constant. Then, for all $A_n \in \mathcal{C}(n)$² which satisfy $|\log \nu(A_n)| \le \tilde{C}n$³ and $r_{A_n} > \frac{n}{2}$⁴, the following results hold true:

i) Exponential mixing rate. If $\alpha(n) = \beta^n$, with $0 < \beta < 1$, there exist $\gamma = \gamma(\beta) > 0$ and $C = C(\beta, \tilde{C}) > 0$ such that

  $\left| P\left( \tau_{A_n}^k > \frac{t}{\nu(A_n)} \right) - \sum_{i=0}^{k-1} e^{-t}\frac{t^i}{i!} \right| \le Ct(t\vee 1)e^{-\gamma n}$, $\forall t > 0$ and $\forall n \in \mathbb{N}$.   (3.15)

ii) Polynomial mixing rate. If $\alpha(n) = \frac{1}{n^\beta}$, with $\beta > 2$, there exists $C = C(\beta, \tilde{C}) > 0$ such that

  $\left| P\left( \tau_{A_n}^k > \frac{t}{\nu(A_n)} \right) - \sum_{i=0}^{k-1} e^{-t}\frac{t^i}{i!} \right| \le Ct(t\vee 1)\frac{1}{n^{\beta-2}}$, $\forall t > 0$ and $\forall n \in \mathbb{N}$.   (3.16)

¹ The α-mixing condition is as defined in Definition 7.
² $\mathcal{C}(n)$ is the set of all $n$-cylinders; see Definition 6.
³ See (iv) in the explanatory remarks following the theorem.
⁴ See (ii) in the explanatory remarks following the theorem. Also, $r_A$ is as in Definition 14.

Explanatory Remarks on Theorem 6

(i) On the sharpness of the upper bounds in Theorem 6 we note that

  $Ct(t\vee 1)\frac{1}{n^{\beta-2}} \to 0$ and $Ct(t\vee 1)e^{-\gamma n} \to 0$, as $n \to \infty$.   (3.17)

Therefore these estimates are meaningful in approximating the distribution of return times only for rare events, that is, for sets of small measure. Also note that these bounds, in both the exponential and the polynomial case, do not depend on $k$, the order of the return time. Additionally, the constant $C$ does not depend on the particular choice of $A$, as long as we stay within the set of cylinders that satisfy the assumptions of the theorem for the same constant $\tilde{C}$.

Analysis of the theorem's assumptions:

(ii) The assumption of Theorem 6 that the recurrence time $r_A$ be greater than $\frac{n}{2}$ can, as the proof shows, be substituted with any other quantity of the order of $n$. This assumption is in place to ensure that the reference cylinder $A$ does not exhibit periodic behavior. By its very definition, the set $A$ consists of points that travel together for at least $n$ iterates of the map $F$. In view of this property, if the set $A$ revisited itself too early through a single point $x$, that would cause an entire neighborhood of $A$ to fall into $A$ at that same iterate. In the extreme case, if the entire set falls into $A$ at the same iterate of $F$, then $A$ is rendered periodic; in this case the set $A$ would act like a "trap". By asking that more time pass before any of $A$'s points comes back to $A$, we ensure that the system is nearer to the time when the set starts spreading all over the space, by virtue of the mixing properties that govern the dynamics.

(iii) In particular, cylinders around periodic points with period $m < \frac{n}{2}$ do not satisfy the assumption of this theorem. Haydn and Vaienti, in [HV3], study the behaviour of return times at periodic points and show that the limiting distribution is a compound Poisson distribution. They also derive error terms for the convergence to the limiting distribution.

(iv) Commenting on the assumption that $|\log \nu(A_n)| \le \tilde{C}n$, recall that by the Shannon-McMillan-Breiman Theorem (Proposition 4), for a.e. point $x \in \Omega$ and the family of $n$-cylinders $\{A_n(x)\}_{n\in\mathbb{N}}$ centered at $x$, there exists $C > 0$ such that

  $|\log(\nu(A_n(x)))| \le Cn$ $\forall n \in \mathbb{N}$.   (3.18)

Also, the assumption that $|\log \nu(A_n)| < \tilde{C}n$, for some $\tilde{C} > 0$, can be substituted with $|\log \nu(A_n)| < \tilde{C}n^\delta$ with $\delta > 1$. In the polynomial case the exponent $\delta$ can be taken as large as $\beta - 1$, where $\beta$ is the polynomial exponent of the $\alpha$ function, whereas in the exponential case $\delta$ can be taken arbitrarily large (at the cost of a smaller $\gamma$).

For fixed $K > 0$ and $\delta > 1$, let $\mathcal{C}_\delta(n) \subset \mathcal{C}(n)$ be the set of all $n$-cylinders that satisfy $|\log \nu(A_n)| \le Kn^\delta$. The set $\mathcal{C}_\delta(n)$ is the "good" set, for it contains the cylinders that satisfy the assumption of the theorem. We show that the "bad" set, namely $B(n) = \Omega \setminus \mathcal{C}_\delta(n)$, is a small set.
3.2.2 Hitting Times

Corollary 1. In an $\alpha$-mixing, measure-preserving dynamical system $(\Omega, \mathcal{F}, \nu, F)$, with underlying partition $\eta$ that is finite or countably infinite with $H(\eta) < \infty$, if $A_n$ is a cylinder set of order $n$ and $\tau_A$ is the hitting time of the set $A$, i.e. the first time the system enters $A$, then the distribution of $\tau_A$, suitably rescaled, can be approximated by an exponential distribution with mean 1.

Specifically, given a constant $\tilde{C} > 0$, for all $A_n \in \mathcal{C}(n)$ which satisfy $|\log \nu(A_n)| \le \tilde{C} n$ and $r_{A_n} > \frac{n}{2}$, the following estimates hold true:

i) Exponential mixing rate. If $\alpha(n) = \beta^n$, with $0 < \beta < 1$, there exist $\gamma = \gamma(\beta) > 0$ and $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) e^{-\gamma n} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.25} \]

ii) Polynomial mixing rate. If $\alpha(n) = \frac{1}{n^{\beta}}$, with $\beta > 2$, there exists $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-2}} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.26} \]

3.2.3 Return Times on Markov Towers

Theorem 7. Let $\Phi : M \to M$ be a dynamical system that admits a Markov Tower structure $F : \Delta \to \Delta$ with a reference measure $m$ and a return time function $R$. Also let $\nu$ be the absolutely continuous invariant measure for $\Phi$. Then, given a positive fixed constant $\tilde{C}$, for all $A_n \in \mathcal{C}(n)$ which satisfy $|\log \nu(A_n)| \le \tilde{C} n$ and $r_{A_n} > \frac{n}{2}$, the following results hold true:

i) Exponentially decaying return-map tails. If $m(R > n) = O(\theta^n)$, for some $0 < \theta < 1$, there exist $\gamma = \gamma(\theta) > 0$ and $C = C(\theta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) e^{-\gamma n} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.27} \]

ii) Polynomially decaying return-map tails. If $m(R > n) = O(n^{-\beta})$, for some $\beta > 3$, there exists $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-3}} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.28} \]

The above result extends equally to the $k$-th return times $\tau^k_A$ (the $k$-th time the system visits $A$), with the truncated Poisson sums of Theorem 6 in place of the exponential and with the same error estimates.

3.2.4 An Example / Application

As we have seen in Section 1.2.5, the non-uniformly expanding 1-dimensional parabolic map on the interval (Gaspard-Wang map), defined as

\[ T(x) = \begin{cases} x + 2^{\alpha} x^{1+\alpha}, & \text{if } x \in (0, \tfrac{1}{2}], \\ 2x - 1, & \text{if } x \in (\tfrac{1}{2}, 1], \end{cases} \tag{3.29} \]

admits a Markov Tower structure with a return map that satisfies $m(R > n) = O(n^{-\frac{1}{\alpha}})$, as long as $0 < \alpha < 1$ (see Fig. 2).

Theorem 7 (polynomial case (ii)) tells us that if we further restrict $\alpha$ to $(0, \frac{1}{3})$, which implies that $\frac{1}{\alpha} > 3$ as the theorem requires, then we can approximate the return times $\tau^k_{A_n}$ by the Poisson distribution as in (3.28).
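The polynomial tail $m(R > n) = O(n^{-1/\alpha})$ can also be seen empirically. The sketch below is an illustration only: it assumes that the tower base is taken to be $Y = (\frac{1}{2}, 1]$ with $m$ the Lebesgue measure restricted to $Y$, and estimates the tail of the first-return time of the Gaspard-Wang map to $Y$ by direct iteration.

```python
import random

def T(x, alpha):
    # Gaspard-Wang map of (3.29)
    return x + (2.0 ** alpha) * x ** (1.0 + alpha) if x <= 0.5 else 2.0 * x - 1.0

def tail_estimate(alpha=0.5, points=50000, n_max=200, seed=1):
    """Empirical m(R > n) for first returns to the base Y = (1/2, 1]."""
    rng = random.Random(seed)
    exceed = [0] * n_max                 # exceed[n] counts orbits with R > n
    for _ in range(points):
        x = 0.5 + 0.5 * rng.random()     # m-random start in Y
        r = 0
        while r < n_max:
            x, r = T(x, alpha), r + 1
            if x > 0.5:                  # returned to Y after r steps
                break
        for n in range(min(r, n_max)):
            exceed[n] += 1
    for n in (10, 50, 100):
        print(f"n={n}:  m(R>n) ~ {exceed[n] / points:.4f},  n^(-1/alpha) = {n ** (-1.0 / alpha):.4f}")

tail_estimate()
```

For $\alpha = \frac{1}{2}$ the empirical tail tracks $n^{-2}$ up to a multiplicative constant, consistent with $m(R > n) = O(n^{-1/\alpha})$.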
3.3 Method of Proof

The aim of this chapter is to approximate the distribution of $\tau^k_A$; namely, we want to approximate $\mathbb{P}(\tau^k_A \le t)$ for all $k \ge 1$ and all $t \in \mathbb{R}^+$. If by $W^A_n(x)$ we denote the number of visits of the orbit $\{F(x), F^2(x), \ldots, F^n(x)\}$ to the set $A$, i.e.

\[ W^A_n(x) = \sum_{j=1}^{n} 1_A(F^j(x)), \tag{3.30} \]

then we have

\[ \mathbb{P}(\tau^k_A \le t) = 1 - \mathbb{P}(\tau^k_A > t) = 1 - \mathbb{P}(\tau^k_A > [t]) = 1 - \mathbb{P}(W^A_{[t]} < k), \tag{3.31–3.33} \]

where $[t]$ denotes the integer part of $t$ and, as seen earlier, $1_A$ is the indicator function of the set $A$, defined as

\[ 1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \tag{3.34} \]

Therefore, our problem of approximating the distribution of $\tau^k_A$ becomes equivalent to approximating the distribution of $W^A_m$ for all $m \in \mathbb{N}$. For the rest of this chapter we will assume $A$ to be an $n$-cylinder, i.e. $A \in \mathcal{C}(n)$, and, for simplicity, we will suppress the superscript $A$, simply writing $W_m$ for $W^A_m$.

3.4 Preliminary Calculations

As we have seen above, calculating $\mathbb{P}(\tau^k_A \le t)$ is equivalent to estimating the distribution of $W_m(x) = \sum_{j=1}^{m} 1_A(F^j(x))$ for all $m \in \mathbb{N}$. To calculate the distribution of $W_m$, as a sum of dependent indicator functions, we will implement Stein's method as described in Chapter 2.

3.4.1 Bounds for the Stein Solution

Lemma 4. For the Poisson distribution $\mathcal{P}(\lambda)$, the Stein solution of the Stein equation (2.1) that corresponds to the indicator function $h = 1_E$, with $E \subset \mathbb{Z}^+$, satisfies

\[ |f_{1_E}(k)| \le \begin{cases} 1, & \text{if } k \le \lambda, \\ \frac{2+\lambda}{k}, & \text{if } k > \lambda. \end{cases} \tag{3.35} \]

Proof. We consider two different cases: first $k > \lambda$, and then $k \le \lambda$.

For $h = 1_E$, from the representation (2.7) for the Stein solution we have

\[ f_{1_E}(k) = - \frac{(k-1)!}{\lambda^k} \sum_{i=k}^{\infty} \left( h(i) - \int_{\mathbb{Z}^+} h \, d\mu_0 \right) \frac{\lambda^i}{i!}. \tag{3.36} \]

Therefore,

\[ |f_{1_E}(k)| \le \frac{(k-1)!}{\lambda^k} \sum_{i=k}^{\infty} \left| h(i) - \int_{\mathbb{Z}^+} h \, d\mu_0 \right| \frac{\lambda^i}{i!} \le \frac{(k-1)!}{\lambda^k} \sum_{i=k}^{\infty} \frac{\lambda^i}{i!} = \frac{(k-1)!}{\lambda^k} \frac{\lambda^k}{k!} \left( 1 + \sum_{i=1}^{\infty} \frac{\lambda^i}{(k+1)\cdots(k+i)} \right) = \frac{1}{k} \left( 1 + \sum_{i=1}^{\infty} \frac{\lambda}{k+1} \cdot \frac{\lambda}{k+2} \cdots \frac{\lambda}{k+i} \right). \tag{3.37} \]

Provided that $k > \lambda$, each factor $\frac{\lambda}{k+j}$ with $j > \lambda$ is no greater than $\frac{1}{2}$, so each term of index $i > \lambda$ in the infinite sum in (3.37) is no greater than $(\frac{1}{2})^{i-\lambda}$. For $i \le \lambda$, all terms in the sum in (3.37) are clearly no greater than 1. Hence, splitting the sum accordingly yields

\[ |f_{1_E}(k)| \le \frac{1}{k} \left( 1 + \lambda + \sum_{i=1}^{\infty} \frac{1}{2^i} \right) = \frac{2+\lambda}{k}. \tag{3.38–3.40} \]

Note that the above estimate holds only for $k > \lambda$, and this proves the second part of the desired inequality (3.35).

If, on the other hand, $k \le \lambda$, then different estimates hold, as follows. Using the alternative representation (2.6) for the Stein solution $f_{1_E}$, this time we get

\[ |f_{1_E}(k)| \le \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{k-1} \left| h(i) - \int_{\mathbb{Z}^+} h \, d\mu_0 \right| \frac{\lambda^i}{i!} \le \frac{(k-1)!}{\lambda^k} \sum_{i=0}^{k-1} \frac{\lambda^i}{i!}. \tag{3.41} \]

Now observe that the sequence $\{\frac{\lambda^j}{j!}\}_{j \in \mathbb{N}}$ is increasing for $j \le \lambda$ and decreasing for $j > \lambda$. Indeed, looking at the ratio of the $(j+1)$-th over the $j$-th term of the sequence, we get

\[ \frac{\lambda^{j+1}/(j+1)!}{\lambda^j/j!} = \frac{\lambda}{j+1} \begin{cases} \ge 1, & \text{if } j < \lambda, \\ < 1, & \text{if } j \ge \lambda. \end{cases} \tag{3.42} \]

This observation helps us control the size of the summands in (3.41): each is at most the last one, $\frac{\lambda^{k-1}}{(k-1)!}$, since $k - 1 < \lambda$. Continuing from (3.41), we get

\[ |f_{1_E}(k)| \le \frac{(k-1)!}{\lambda^k} \cdot \frac{\lambda^{k-1}}{(k-1)!} \sum_{i=0}^{k-1} 1 = \frac{k}{\lambda} \le 1 \quad \text{since } k \le \lambda, \tag{3.43–3.45} \]

and this completes the proof of the first part of inequality (3.35). ∎

Corollary 2. For the Stein solution $f_{1_E}$ of the Stein equation (2.1) the following inequality holds:

\[ \sum_{i=1}^{k} |f_{1_E}(i)| \le \begin{cases} k, & \text{if } k \le \lambda, \\ \lambda + (2+\lambda) \log\frac{k}{\lambda}, & \text{if } k > \lambda, \end{cases} \tag{3.46} \]

where $\lambda$ is the parameter of the Poisson distribution $\mathcal{P}(\lambda)$ (the measure $\mu_0$ in (2.1)) and $\log$ is the natural logarithm with base $e$.

Proof. For $k \le \lambda$ the result follows trivially from Lemma 4, whereas if $k > \lambda$ we combine Lemma 4 with the simple inequality

\[ \sum_{i=\lambda+1}^{k} \frac{1}{i} \le \int_{\lambda}^{k} \frac{1}{t} \, dt = \log\frac{k}{\lambda}. \tag{3.47–3.48} \]

∎
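Lemma 4 is also easy to probe numerically. The sketch below computes $f_{1_E}$ from the tail representation (3.36), accumulating the weights $\frac{(k-1)!\,\lambda^i}{\lambda^k\, i!}$ iteratively to avoid overflow and cancellation, and checks the bound (3.35); the choices of $E$ and $\lambda$ are arbitrary illustrations, and the sign convention of (3.36) is assumed.

```python
import math

def mu0(E, lam):
    # Poisson(lam) mass of a finite set E
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in E)

def stein_f(k, E, lam, tail_terms=400):
    """Stein solution f_{1_E}(k) for Poisson(lam), via the tail sum (3.36).
    w tracks (k-1)! * lam^i / (lam^k * i!), starting at 1/k for i = k."""
    m0, f, w = mu0(E, lam), 0.0, 1.0 / k
    for i in range(k, k + tail_terms):
        f -= ((1.0 if i in E else 0.0) - m0) * w
        w *= lam / (i + 1)
    return f

lam, E = 3.0, {0, 2, 5}
for k in range(1, 30):
    bound = 1.0 if k <= lam else (2.0 + lam) / k
    assert abs(stein_f(k, E, lam)) <= bound + 1e-12, k
print("bound (3.35) verified for k = 1..29")
```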
3.4.2 The Framework

In our problem we approximate the distribution of $W_m$ with the Poisson distribution $\mathcal{P}(\lambda)$, where the parameter $\lambda$ is the expected value of $W_m$. If we define $p_i$ to be

\[ p_i \equiv \mathbb{P}(F^i x \in A) \quad \forall i = 1, 2, \ldots \tag{3.49} \]

then

\[ \lambda = \mathbb{E}(W_m) = \sum_{i=1}^{m} \mathbb{E}(1_A \circ F^i) = \sum_{i=1}^{m} \mathbb{P}(F^i x \in A) \equiv \sum_{i=1}^{m} p_i. \tag{3.50–3.53} \]

To make the connection with Chapter 2, in the Stein equation (2.1) we take $\mu_0$ to be the Poisson($\lambda$) distribution measure and $h = 1_E$, with $E$ an arbitrary subset of the positive integers, $E \subset \mathbb{Z}^+$. Then, if $\mu$ is the true distribution measure of $W_m$, (2.2) becomes

\[ \int_{\mathbb{Z}^+} (T_0 f) \, d\mu = \int_{\mathbb{Z}^+} h \, d\mu - \int_{\mathbb{Z}^+} h \, d\mu_0 = \mathbb{P}(W_m \in E) - \mu_0(E), \tag{3.54} \]

and in turn, since the Stein operator $T_0$ for the Poisson distribution is known and given by (2.4), equation (3.54) becomes

\[ |\mathbb{P}(W_m \in E) - \mu_0(E)| = |\mathbb{E}(\lambda f(W_m + 1) - W_m f(W_m))| \quad \forall E \subset \mathbb{Z}^+. \tag{3.55} \]

Notice that the difference $|\mathbb{P}(W_m \in E) - \mu_0(E)|$ above gives exactly the error of the Poisson approximation in question. Therefore, if we use $d(m, \lambda)$ to denote this error term, i.e.

\[ d(m, \lambda) = |\mathbb{P}(W_m \in E) - \mu_0(E)|, \tag{3.56} \]

using (3.55) we establish the following equality:

\[ d(m, \lambda) = |\mathbb{E}(\lambda f(W_m + 1) - W_m f(W_m))| = \left| \lambda \mathbb{E} f(W_m + 1) - \mathbb{E}\left( \sum_{i=1}^{m} I_i f(W_m) \right) \right| = \left| \lambda \mathbb{E} f(W_m + 1) - \sum_{i=1}^{m} \mathbb{E}(I_i f(W_m)) \right| = \left| \sum_{i=1}^{m} p_i\, \mathbb{E} f(W_m + 1) - \sum_{i=1}^{m} p_i\, \mathbb{E}(f(W_m) \mid I_i = 1) \right| = \left| \sum_{i=1}^{m} p_i \big( \mathbb{E} f(W_m + 1) - \mathbb{E}(f(W_m) \mid I_i = 1) \big) \right|, \tag{3.57} \]

and, by expanding the expectations in (3.57), one gets

\[ d(m, \lambda) = \left| \sum_{i=1}^{m} p_i \left( \sum_{a=0}^{m} f(a+1)\, \mathbb{P}(W_m = a) - \sum_{a=1}^{m} f(a)\, \mathbb{P}(W_m = a \mid I_i = 1) \right) \right| = \left| \sum_{i=1}^{m} p_i \sum_{a=0}^{m} f(a+1) \big( \mathbb{P}(W_m = a) - \mathbb{P}(W_m = a+1 \mid I_i = 1) \big) \right| \le \sum_{i=1}^{m} p_i \left( \sum_{a=0}^{m} |f(a+1)|\, \epsilon_a \right) \tag{3.58} \]

(note that on $\{I_i = 1\}$ we have $W_m \ge 1$, so the $a = 0$ term of the conditional sum vanishes and the sum may be reindexed), where

\[ \epsilon_a = |\mathbb{P}(W_m = a) - \mathbb{P}(W_m = a+1 \mid I_i = 1)| \tag{3.59} \]

(which also depends on $i$; we suppress this in the notation). We recall that the function $f$ above is the solution of the Stein equation (2.1) that corresponds to the indicator function $h = 1_E$ in Stein's method as described in Chapter 2. Hence, by Remark 1, since $h$ is bounded, $f$ is bounded as well. More precisely, not only is the solution $f$ bounded, but we have obtained exact bounds for it in Corollary 2.

Now, in view of the new representation for $|\mathbb{P}(W_m \in E) - \mu_0(E)|$, as in (3.58), we need to look at the difference

\[ \epsilon_a = |\mathbb{P}(W_m = a) - \mathbb{P}(W_m = a+1 \mid I_i = 1)| \tag{3.60} \]

more closely.

3.4.3 The Independent Case

We first consider the scenario under which the indicator functions are independent, so as to build the skeleton for the proof in the mixing case. Once we have worked out the details under the independence assumption, we refine the proof accordingly to obtain the analogous result in the $\alpha$-mixing case.

Specifically, if the $I_i$'s were independent, and if by $W^i_m$ we denote $W_m$ minus the $i$-th indicator, then a simple calculation,

\[ \mathbb{P}(W_m = a+1 \mid I_i = 1) = \frac{\mathbb{P}(\{W_m = a+1\} \cap \{I_i = 1\})}{\mathbb{P}(I_i = 1)} = \frac{\mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\})}{\mathbb{P}(I_i = 1)} = \frac{\mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1)}{\mathbb{P}(I_i = 1)} = \mathbb{P}(W^i_m = a), \tag{3.61–3.64} \]

gives

\[ \epsilon_a = |\mathbb{P}(W_m = a) - \mathbb{P}(W_m = a+1 \mid I_i = 1)| = |\mathbb{P}(W_m = a) - \mathbb{P}(W^i_m = a)|, \tag{3.65–3.66} \]

and, therefore, from the set inclusion

\[ \{W_m = a\} \subset \{W^i_m = a\} \cup \{I_i = 1\}, \tag{3.67} \]

together with the corresponding reverse inclusion, one concludes

\[ \epsilon_a \le \mathbb{P}(I_i = 1) = p_i. \tag{3.68} \]

More specifically, in our case, recall that

\[ p_i = \mathbb{P}(I_i = 1) = \mathbb{P}(x : F^i x \in A) = \nu(A) \quad \text{for all } i = 1, 2, \ldots, m, \tag{3.69} \]

by the invariance property of the measure $\nu$. Therefore,

\[ \sup_{E \subset \mathbb{Z}^+} |\mathbb{P}(W_m \in E) - \mu_0(E)| \le \sum_{i=1}^{m} p_i \left( \sum_{a=0}^{m} |f(a+1)|\, \epsilon_a \right) \le \sum_{i=1}^{m} p_i \left( \sum_{a=0}^{m} |f(a+1)|\, \nu(A) \right) = m \nu(A)^2 \sum_{a=0}^{m} |f(a+1)| \le m \nu(A)^2 \left( \lambda + (2+\lambda) \log\frac{m}{\lambda} \right), \tag{3.70–3.73} \]

by Corollary 2, provided that $m > \lambda$.
As a result,

\[ \left| \mathbb{P}(W_m \le k) - \sum_{i=0}^{k} e^{-\lambda} \frac{\lambda^i}{i!} \right| \le \sup_{E \subset \mathbb{Z}^+} |\mathbb{P}(W_m \in E) - \mu_0(E)| \le m \nu(A)^2 \left( \lambda + (2+\lambda) \log\frac{m}{\lambda} \right), \quad \text{if } m > \lambda. \tag{3.74} \]

Translating (3.74) in terms of $\tau^k_A$ using the equality

\[ \mathbb{P}(\tau^k_A > m) = \mathbb{P}(W_m \le k-1), \tag{3.75} \]

we obtain

\[ \left| \mathbb{P}(\tau^k_A > m) - \sum_{i=0}^{k-1} e^{-\lambda} \frac{\lambda^i}{i!} \right| \le m \nu(A)^2 \left( \lambda + (2+\lambda) \log\frac{m}{\lambda} \right), \quad \text{for } m > \lambda. \tag{3.76} \]

However, when looking at the return times $\tau^k_A$, one is rather interested in rescaling the return time to the set $A$ by dividing the time $t$ by the measure of the set in question. The idea is that the waiting time until the system enters a set $A$ is on average twice the waiting time that corresponds to a set $A'$ with twice the size of $A$. This also becomes intuitive from Kac's Theorem, which states that $\int_A \tau_A \, d\nu = 1$. Specifically, the distribution one is interested in approximating is $\mathbb{P}(\tau^k_A > \frac{t}{\nu(A)})$ rather than $\mathbb{P}(\tau^k_A > t)$. For an arbitrary $t > 0$,

\[ \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) = \mathbb{P}\left( \tau^k_A > \left[ \frac{t}{\nu(A)} \right] \right) = \mathbb{P}\left( W_{[\frac{t}{\nu(A)}]} < k \right). \tag{3.77–3.78} \]

Therefore in equation (3.76) we use $[\frac{t}{\nu(A)}]$ for $m$, and for $\lambda$, the mean of the Poisson distribution with which we are approximating the law of $\tau^k_A$, we have

\[ \lambda = \sum_{i=1}^{m} p_i = \sum_{i=1}^{[\frac{t}{\nu(A)}]} \nu(A) = \left[ \frac{t}{\nu(A)} \right] \nu(A). \tag{3.79} \]

Since the Poisson parameter $[\frac{t}{\nu(A)}] \nu(A)$ is not exactly equal to $t$, for the sake of simplicity we can find $t^*$ such that $\frac{t^*}{\nu(A)} = [\frac{t}{\nu(A)}]$. This means, for an arbitrary $t > 0$, we have

\[ \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) = \mathbb{P}\left( \tau^k_A > \frac{t^*}{\nu(A)} \right) = \mathbb{P}\left( W_{\frac{t^*}{\nu(A)}} < k \right), \tag{3.80–3.82} \]

where $t^* = [\frac{t}{\nu(A)}] \nu(A)$. Notice that the proxy $t^*$ is only within a distance of size $\nu(A)$ away from the original $t$, as the following inequality shows:

\[ \left( \frac{t}{\nu(A)} - 1 \right) \nu(A) \le \left[ \frac{t}{\nu(A)} \right] \nu(A) \le \frac{t}{\nu(A)}\, \nu(A) \;\Rightarrow\; t - \nu(A) \le t^* \le t. \tag{3.83} \]

With this particular choice of $m$ and $\lambda$, and with $\frac{t^*}{\nu(A)} = [\frac{t}{\nu(A)}]$, we have

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-[\frac{t}{\nu(A)}]\nu(A)} \frac{\left( [\frac{t}{\nu(A)}]\nu(A) \right)^i}{i!} \right| = \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right|, \tag{3.84–3.85} \]

and, therefore, inequality (3.76) becomes

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right| \le \frac{t^*}{\nu(A)}\, \nu(A)^2 \left( t^* + (2 + t^*) \log\frac{t^*/\nu(A)}{t^*} \right) = t^* \nu(A) \big( t^* + (2 + t^*) |\log \nu(A)| \big). \tag{3.86–3.88} \]

Note that the error term on the right-hand side of (3.88) is independent of $k$, the order of the return time. Also note that, as desired, this error term shrinks to zero as the size of the cylinder $A$ goes to zero. For a neater expression, the proxy $t^*$, defined as $t^* = [\frac{t}{\nu(A)}]\nu(A)$, can be substituted with the original $t$. However, before doing so, one needs to check the size of the error this substitution incurs and compare it with the principal error term of the approximation. This is something we do in the proof of Theorem 6 (see inequality (3.197)).
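In the genuinely independent setting the quality of the Poisson approximation can be computed exactly, since then $W_m$ is Binomial$(m, \nu(A))$. The sketch below (an illustration with arbitrary sample values of $m$ and $p$) evaluates the total variation distance $\sup_{E \subset \mathbb{Z}^+} |\mathbb{P}(W_m \in E) - \mu_0(E)|$ and compares it with the bound of (3.73); the bound holds comfortably, although its logarithmic factor makes it cruder than the classical Chen-Stein bound $m p^2$.

```python
import math

def log_binom_pmf(m, p, a):
    return (math.lgamma(m + 1) - math.lgamma(a + 1) - math.lgamma(m - a + 1)
            + a * math.log(p) + (m - a) * math.log(1.0 - p))

def log_poisson_pmf(lam, a):
    return -lam + a * math.log(lam) - math.lgamma(a + 1)

def independent_case(m=1000, p=0.004):
    lam = m * p
    # total variation distance = sup over sets E of |P(W_m in E) - mu0(E)|
    tv = 0.5 * sum(abs(math.exp(log_binom_pmf(m, p, a)) - math.exp(log_poisson_pmf(lam, a)))
                   for a in range(m + 1))
    bound = m * p * p * (lam + (2.0 + lam) * math.log(m / lam))
    print(f"TV distance = {tv:.2e},  bound (3.73) = {bound:.2e}")

independent_case()
```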
3.5 α-Mixing Case

In the previous subsection we considered the case where the indicators were independent. In our problem, however, where the indicators $I_i$ signify the system being within the set $A$ at the $i$-th iterate of the map, i.e. $I_i(x) = 1_A(F^i(x))$, we clearly do not have this property. Hence for the error term $\epsilon_a$, as introduced in (3.59), one should expect different estimates. Specifically, in Proposition 6 we derive these bounds for $\epsilon_a$. What follows prepares the ground for the proof of Proposition 6.

3.5.1 Estimates on the measure of a cylinder

Before we proceed, we establish the following upper bound on the measure of a cylinder.

Lemma 5. Let $(\Delta, \mathcal{F}, \nu, F)$ be an $\alpha$-mixing, measure-preserving dynamical system. There exist strictly positive constants $K$ and $\Lambda$ such that for any integer $n \in \mathbb{N}$ and any $n$-cylinder $A$ the following inequality holds:

\[ \nu(A) \le K e^{-\Lambda n}. \tag{3.89} \]

Proof. We follow the proof in Galves-Schmitt [GS]. For an $n$-cylinder $A$ and any set $B \in \mathcal{F}$ we have

\[ |\nu(A \cap F^{-(n+k)} B) - \nu(A)\, \nu(B)| \le \alpha(k)\, \nu(A), \tag{3.90} \]

which, in turn, implies

\[ \nu(A \cap F^{-(n+k)} B) \le \nu(A) (\alpha(k) + \nu(B)). \tag{3.91} \]

If for the cylinder $A$ of order $n$ we use the representation

\[ A = A_0 \cap F^{-1} A_1 \cap F^{-2} A_2 \cap \cdots \cap F^{-(n-1)} A_{n-1}, \quad \text{with } A_i \in \eta \ \forall i = 0, 1, \ldots, n-1, \tag{3.92} \]

then, for $n > n_0$ for some $n_0 \in \mathbb{N}$, we may apply (3.91) repeatedly: at each step we drop the coordinates preceding the time $j n_0$, $j = [\frac{n-1}{n_0}], [\frac{n-1}{n_0}] - 1, \ldots, 1$, so as to create a gap of $n_0 - 1$ iterates, and peel off the single coordinate at that time. Iterating this $[\frac{n-1}{n_0}]$ times yields

\[ \nu(A) \le \nu(A_0) \big( \alpha(n_0 - 1) + \rho \big)^{[\frac{n-1}{n_0}]} \le \big( \alpha(n_0 - 1) + \rho \big)^{[\frac{n-1}{n_0}]}, \tag{3.93} \]

where

\[ \rho = \sup\{ \nu(B) : B \in \eta \}. \tag{3.94} \]

Since $\rho < 1$ and $\alpha(k) \to 0$, there exists $n_0$ such that $\alpha(n_0 - 1) + \rho < 1$. This, together with the fact that the parameters $n_0$ and $\rho$ in (3.93) do not depend on the choice of the cylinder $A$, proves the result. ∎

Lemma 6. Let $(\Delta, \mathcal{F}, \nu, F)$ be an $\alpha$-mixing, measure-preserving dynamical system. There exists a positive constant $C$ such that for all $n \in \mathbb{N}$ and all cylinders $A \in \mathcal{C}(n)$ the following inequality holds:

\[ \nu(A \cap F^{-k} A) \le 2\, \nu(A) \max\left\{ e^{-Ck},\, \alpha\!\left( \lceil \tfrac{k}{2} \rceil \right) \right\}, \quad \forall k \in \mathbb{N}. \tag{3.95} \]

Proof. If $A \in \mathcal{C}(n)$ is an $n$-cylinder with the representation

\[ A = A_0 \cap F^{-1} A_1 \cap F^{-2} A_2 \cap \cdots \cap F^{-(n-1)} A_{n-1}, \quad \text{with } A_i \in \eta, \tag{3.96} \]

we have

\[ \nu(A \cap F^{-k} A) \le \nu\left( A \cap F^{-(n + \lceil \frac{k}{2} \rceil)} B^{(\lfloor \frac{k}{2} \rfloor)}_n \right), \tag{3.97} \]

where $\lceil \frac{k}{2} \rceil$ and $\lfloor \frac{k}{2} \rfloor$ are the upper and lower integer parts of $\frac{k}{2}$, respectively, and $B^{(\lfloor \frac{k}{2} \rfloor)}_n$ is a cylinder set of order $\lfloor \frac{k}{2} \rfloor$ defined as

\[ B^{(\lfloor \frac{k}{2} \rfloor)}_n = \bigcap_{j=0}^{\lfloor \frac{k}{2} \rfloor - 1} F^{-j} A_{n - \lfloor \frac{k}{2} \rfloor + j}. \tag{3.98} \]

To see this, it suffices to verify that each member of the intersection $A \cap F^{-(n + \lceil \frac{k}{2} \rceil)} B^{(\lfloor \frac{k}{2} \rfloor)}_n$ also appears as a component of $A \cap F^{-k} A$. Using the mixing property ($\alpha$-mixing) on (3.97) we get

\[ \nu(A \cap F^{-k} A) \le \nu(A)\, \nu\left( B^{(\lfloor \frac{k}{2} \rfloor)}_n \right) + \alpha\!\left( \lceil \tfrac{k}{2} \rceil \right) \nu(A) = \nu(A) \left( \nu\left( B^{(\lfloor \frac{k}{2} \rfloor)}_n \right) + \alpha\!\left( \lceil \tfrac{k}{2} \rceil \right) \right). \tag{3.99–3.101} \]

The set $B^{(\lfloor \frac{k}{2} \rfloor)}_n$ is a $\lfloor \frac{k}{2} \rfloor$-cylinder and, therefore, Lemma 5 can be used to estimate its measure. The desired inequality (3.95) follows at once, and the constant $C$ does not depend on $A$ or $k$. ∎

3.5.2 Error term due to dependence

Due to dependence, the error term $\epsilon_a$ defined in (3.59) is different from what we saw in the independent case in subsection 3.4.3. This is where we utilize the mixing condition to get sharp estimates on $\epsilon_a$ and, consequently, estimates on the error term of the Poisson approximation. Specifically, starting from (3.59) we have

\[ \epsilon_a = |\mathbb{P}(W_m = a) - \mathbb{P}(W_m = a+1 \mid I_i = 1)| = \left| \mathbb{P}(W_m = a) - \frac{\mathbb{P}(\{W_m = a+1\} \cap \{I_i = 1\})}{\mathbb{P}(I_i = 1)} \right| = \left| \mathbb{P}(W_m = a) - \frac{\mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\})}{\mathbb{P}(I_i = 1)} \right|. \tag{3.102–3.105} \]

Unlike in the independent case, for the term $\mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\})$ we have

\[ \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) = \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) + \epsilon'_a, \tag{3.106} \]

where $\epsilon'_a$ is the error term due to dependence. The error term $\epsilon_a$ then becomes

\[ \epsilon_a = \left| \mathbb{P}(W_m = a) - \frac{\mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) + \epsilon'_a}{\mathbb{P}(I_i = 1)} \right| \le \left| \mathbb{P}(W_m = a) - \mathbb{P}(W^i_m = a) \right| + \frac{|\epsilon'_a|}{\mathbb{P}(I_i = 1)} \le \left| \mathbb{P}(W_m = a) - \mathbb{P}(W^i_m = a) \right| + \frac{\xi_a}{\mathbb{P}(I_i = 1)}, \tag{3.107–3.108} \]

where $\xi_a$ is an upper bound for the error term $\epsilon'_a$ in (3.106), i.e. $\xi_a$ is such that

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le \xi_a. \tag{3.109} \]
The breakdown of $\epsilon_a$ as in (3.108) shows that we have to look at two terms. The first term is what we had in the independent case, and which we bounded as

\[ \left| \mathbb{P}(W_m = a) - \mathbb{P}(W^i_m = a) \right| \le \mathbb{P}(I_i = 1) = \nu(A). \tag{3.110} \]

The second term, namely $\xi_a$, is the error due to dependence, for which we obtain estimates in Proposition 6 below. For now, let us recall that the recurrence time of the set $A$, under the map $F$, is defined as

\[ r_A = \inf\{ 1 \le n \in \mathbb{N} \mid A \cap F^{-n}(A) \neq \emptyset \}. \tag{3.111} \]

Lemma 7. (Theorem 4.4, [A3]) Given an $\alpha$-mixing, measure-preserving dynamical system $(\Delta, \mathcal{F}, \nu, F)$, with $\alpha(\cdot)$ summable, there exist positive constants $C_1, C_2, \Lambda_1$ and $\Lambda_2$ such that for each $n$-cylinder set $A \in \mathcal{C}(n)$ the following distribution estimate for the hitting time holds for all $t > n$:

\[ \left| \mathbb{P}_A(\tau_A > t) - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \right| \le C_1 \inf_{2n \le \Delta \le \frac{1}{\nu(A)}} \left[ \Delta (\nu(A) + \alpha(n)) + \alpha(\Delta - 2n) \right] + C_2 \inf_{1 \le w \le n} n \left( \nu(A^{(w)}) + \alpha(n - w) \right), \tag{3.112–3.113} \]

where $\xi_A$, a parameter that possibly depends on the choice of $A$, is uniformly bounded, $\Lambda_1 < \xi_A < \Lambda_2$, and $A^{(w)}$, with $w \le n$, is the $w$-cylinder that contains the original $n$-cylinder $A$.

Lemma 8. Let $(\Delta, \mathcal{F}, \nu, F)$ be an $\alpha$-mixing, measure-preserving dynamical system with $\alpha(\cdot)$ summable. Also let $A$ be an $n$-cylinder. There exist positive constants $C, k, \Lambda_1, \Lambda_2$, all independent of $A$, such that

\[ \left| \mathbb{P}_A(\tau_A > t) - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \right| \le C n \max\{ \alpha(\tfrac{n}{2}), e^{-kn} \} \quad \forall t > n, \tag{3.114} \]

where $\Lambda_1 < \xi_A < \Lambda_2$.

Proof. We use Lemma 7. Specifically, for the first infimum in (3.112), choosing the optimizing parameter to be $\Delta = 3n$, we have

\[ \inf_{2n \le \Delta \le \frac{1}{\nu(A)}} \left[ \Delta(\nu(A) + \alpha(n)) + \alpha(\Delta - 2n) \right] \le 4n(\nu(A) + \alpha(n)) \le C n \left( e^{-\Lambda n} + \alpha(n) \right), \quad \text{by Lemma 5.} \tag{3.115} \]

For the second term in (3.112), choosing $w = \frac{n}{2}$, we get

\[ \inf_{1 \le w \le n} n\left( \nu(A^{(w)}) + \alpha(n - w) \right) \le n\left( \nu(A^{(\frac{n}{2})}) + \alpha(\tfrac{n}{2}) \right) \le C n \left( e^{-\Lambda \frac{n}{2}} + \alpha(\tfrac{n}{2}) \right), \tag{3.116} \]

also by Lemma 5. Combining (3.115) and (3.116) yields the result immediately. As per Lemma 7 and Lemma 5, the constants $C, k, \Lambda_1$ and $\Lambda_2$ are all independent of $A$. ∎

Lemma 9. Let $(\Delta, \mathcal{F}, \nu, F)$ be an $\alpha$-mixing, measure-preserving dynamical system with $\alpha(\cdot)$ summable. Also let $A$ be an $n$-cylinder. If $r_A = \inf\{ l \in \mathbb{N} : A \cap F^{-l} A \neq \emptyset \}$ is the recurrence time of the set $A$, there exists a positive constant $C$, independent of $A$, such that

\[ \mathbb{P}_A(\tau_A = r_A) \le 2 \max\left\{ e^{-C r_A},\, \alpha\!\left( \tfrac{r_A}{2} \right) \right\}. \tag{3.117} \]

Proof. We have

\[ \mathbb{P}_A(\tau_A = r_A) \le \mathbb{P}_A(F^{r_A} x \in A) = \frac{\nu(A \cap F^{-r_A} A)}{\nu(A)} \le \frac{2\, \nu(A) \max\{ e^{-C r_A}, \alpha(\frac{r_A}{2}) \}}{\nu(A)} = 2 \max\left\{ e^{-C r_A},\, \alpha\!\left( \tfrac{r_A}{2} \right) \right\}, \tag{3.118–3.121} \]

by Lemma 6. The constant $C$, as in Lemma 6, is independent of $A$. ∎

Proposition 5. Let $(\Delta, \mathcal{F}, \nu, F)$ be an $\alpha$-mixing, measure-preserving dynamical system with $\alpha(\cdot)$ summable. There exist $C > 0$ and $k > 0$, both independent of $A$ and $n$, such that for all $n$-cylinders $A$ with $r_A > \frac{n}{2}$ the following estimate holds:

\[ \mathbb{P}_A(\tau_A \le t) \le C \left( \nu(A)\, t + n \max\{ \alpha(\tfrac{n}{2}), e^{-kn} \} \right), \quad \forall t > n, \tag{3.122} \]

where $\tau_A$ is the hitting time of the set $A$ and $\mathbb{P}_A(\cdot)$ denotes the conditional probability with respect to the set $A$.

Proof. One has

\[ \mathbb{P}_A(\tau_A \le t) = 1 - \mathbb{P}_A(\tau_A > t) + \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \le \left| \mathbb{P}_A(\tau_A > t) - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \right| + \left| 1 - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \right| = L_1 + \left| 1 - (1 - \mathbb{P}_A(\tau_A \le r_A))\, e^{-\xi_A \nu(A) t} \right| \le L_1 + \left| 1 - e^{-\xi_A \nu(A) t} \right| + \mathbb{P}_A(\tau_A \le r_A) = L_1 + L_2 + L_3, \]

where

\[ L_1 = \left| \mathbb{P}_A(\tau_A > t) - \mathbb{P}_A(\tau_A > r_A)\, e^{-\xi_A \nu(A) t} \right| \le K_1\, n \max\{ \alpha(\tfrac{n}{2}), e^{-k_1 n} \} \tag{3.123} \]

by virtue of Lemma 8,

\[ L_2 = \left| 1 - e^{-\xi_A \nu(A) t} \right| \le K_2\, \nu(A)\, t \quad \text{for some } K_2 > 0, \tag{3.124} \]

by a simple calculus argument (since $1 - e^{-x} \le x$ for $x \ge 0$, one may take $K_2 = \Lambda_2 \ge \xi_A$), and, finally,

\[ L_3 = \mathbb{P}_A(\tau_A \le r_A) \le 2 \max\left\{ e^{-k_2 n},\, \alpha(\tfrac{n}{2}) \right\} \tag{3.125} \]

by Lemma 9 (note that $\tau_A \ge r_A$ on $A$, so $\mathbb{P}_A(\tau_A \le r_A) = \mathbb{P}_A(\tau_A = r_A)$). Therefore, putting (3.123), (3.124) and (3.125) together yields the desired inequality

\[ \mathbb{P}_A(\tau_A \le t) \le C \left( \nu(A)\, t + n \max\{ \alpha(\tfrac{n}{2}), e^{-kn} \} \right). \tag{3.126} \]

∎
Proposition 6. In an $\alpha$-mixing, measure-preserving dynamical system $(\Delta, \mathcal{F}, \nu, F)$, with $\alpha(\cdot)$ summable, if $W^A_m(x) \equiv W_m(x)$ counts the number of "hits" of the orbit $\{Fx, F^2 x, \ldots, F^m x\}$ on the cylinder set $A \in \mathcal{C}(n)$, namely $W_m(x) = \sum_{j=1}^{m} 1_A(F^j(x))$, and if $W^i_m(x) = \sum_{1 \le j \le m,\, j \neq i} 1_A(F^j(x))$, then there exists a positive constant $C$, independent of $n$ and $A$, such that under the assumption that $r_A > \frac{n}{2}$ the following estimate holds true:

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le C \inf_{\Delta > n} \left( \Delta \nu(A)^2 + n \nu(A) \max\{ \alpha(\tfrac{n}{2}), e^{-kn} \} + \alpha(\Delta) \right). \tag{3.127} \]

(This is precisely a bound for the quantity $\xi_a$ of (3.109).)

Proof. In the proof of this proposition we use the following notation:

\[ W^{i,-}_m(x) = \sum_{j=1}^{i-(\Delta+1)} 1_A(F^j(x)), \qquad W^{i,+}_m(x) = \sum_{j=i+\Delta+1}^{m} 1_A(F^j(x)), \tag{3.128} \]

\[ W^{i,0,-}_m(x) = \sum_{j=i-\Delta}^{i-1} 1_A(F^j(x)), \qquad W^{i,0,+}_m(x) = \sum_{j=i+1}^{i+\Delta} 1_A(F^j(x)), \tag{3.129} \]

\[ W^{i,0}_m(x) = W^{i,0,-}_m + W^{i,0,+}_m, \qquad \tilde{W}^i_m(x) = W^i_m(x) - W^{i,0}_m(x) = W^{i,-}_m(x) + W^{i,+}_m(x), \tag{3.130–3.131} \]

and, as usual, $I_j(x) = 1_A(F^j(x))$ for all $j = 1, 2, \ldots, m$.

With these partial sums we distinguish between the hits that occur near the $i$-th iterate, namely $W^{i,0,-}_m$ and $W^{i,0,+}_m$, and the hits that occur away from the $i$-th iterate, namely $W^{i,-}_m$ and $W^{i,+}_m$.

Recall that here our $I_i$'s are not independent, but we do have a mixing condition. In order to utilize this mixing condition properly, we open up a gap of size $\Delta$ around $I_i$ so that we allow mixing to occur. The size $\Delta$ of this gap will be chosen optimally, meaning that it should be large enough that we get sharp mixing estimates, but at the same time small enough so as not to spoil the estimates with whatever we are left with from the middle interval around $I_i$.

We then have, for $0 \le a \le m-1$, $a \in \mathbb{N}_0$, that

\[ \mathbb{P}(\{W_m = a+1\} \cap \{I_i = 1\}) = \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}), \tag{3.132} \]

with

\[ \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) = \sum_{\substack{\vec{a} = (a^-, a^{0,-}, a^{0,+}, a^+) \\ |\vec{a}| = a}} \mathbb{P}\left( \{W^{i,-}_m = a^-\} \cap \{W^{i,0,-}_m = a^{0,-}\} \cap \{I_i = 1\} \cap \{W^{i,0,+}_m = a^{0,+}\} \cap \{W^{i,+}_m = a^+\} \right). \tag{3.133} \]

For $0 \le a \le m-1$ we have

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le \underbrace{\left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) \right|}_{(A)} + \underbrace{\left| \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\tilde{W}^i_m = a)\, \mathbb{P}(I_i = 1) \right|}_{(B)} + \underbrace{\left| \mathbb{P}(\tilde{W}^i_m = a) - \mathbb{P}(W^i_m = a) \right| \mathbb{P}(I_i = 1)}_{(C)}. \tag{3.134} \]

In the above inequality we have three terms to bound. We mainly use the mixing property and, for each one individually, we obtain the following bounds.

Bounds for the term (A) in (3.134): First observe that the following set inclusions hold:

\[ \{W^i_m = a\} \cap \{I_i = 1\} \subset \left( \{\tilde{W}^i_m = a\} \cap \{I_i = 1\} \right) \cup \left( \{W^{i,0}_m > 0\} \cap \{I_i = 1\} \right) \tag{3.135} \]

and

\[ \{\tilde{W}^i_m = a\} \cap \{I_i = 1\} \subset \left( \{W^i_m = a\} \cap \{I_i = 1\} \right) \cup \left( \{W^{i,0}_m > 0\} \cap \{I_i = 1\} \right), \tag{3.136} \]

and this, by taking probabilities, gives

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) \right| \le \mathbb{P}(\{W^{i,0}_m > 0\} \cap \{I_i = 1\}). \tag{3.137} \]

Now notice that if $W^{i,0}_m > 0$ then either $W^{i,0,+}_m > 0$ or $W^{i,0,-}_m > 0$. Therefore, if we use $b^-_i$ and $b^+_i$ to denote

\[ b^-_i = \mathbb{P}(\{W^{i,0,-}_m > 0\} \cap \{I_i = 1\}) \quad \text{and} \quad b^+_i = \mathbb{P}(\{W^{i,0,+}_m > 0\} \cap \{I_i = 1\}), \tag{3.138} \]

then

\[ \mathbb{P}(\{W^{i,0}_m > 0\} \cap \{I_i = 1\}) \le b^-_i + b^+_i. \tag{3.139–3.140} \]

The two new terms, $b^-_i$ and $b^+_i$, we bound separately, as follows. We have

\[ \{W^{i,0,-}_m > 0\} = \bigcup_{k=1}^{\Delta} \{I_{i-k} = 1\}, \tag{3.141} \]

and so

\[ \{W^{i,0,-}_m > 0\} \cap \{I_i = 1\} = \bigcup_{k=1}^{\Delta} \left( \{I_{i-k} = 1\} \cap \{I_i = 1\} \right), \tag{3.142} \]

with

\[ b^-_i = \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_{i-k} = 1\} \cap \{I_i = 1\} \right). \tag{3.143} \]

We show the following symmetry property:

\[ \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_i = 1\} \cap \{I_{i-k} = 1\} \right) = \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_i = 1\} \cap \{I_{i+k} = 1\} \right), \tag{3.144} \]
or, equivalently, under the following notation,

\[ J_{i,k} = \{I_i = 1\} \cap \{I_{i-k} = 1\}, \qquad S_i = \bigcup_{k=1}^{\Delta} J_{i,k}, \]
\[ \tilde{J}_{i,k} = \{I_i = 1\} \cap \{I_{i+k} = 1\}, \qquad \tilde{S}_i = \bigcup_{k=1}^{\Delta} \tilde{J}_{i,k}, \]

that

\[ \mathbb{P}(S_i) = \mathbb{P}(\tilde{S}_i). \tag{3.145} \]

To do that, we start by rewriting $S_i$ as a disjoint union, as follows:

\[ S_i = \bigcup_{k=1}^{\Delta} J_{i,k} = \bigcup_{k=1}^{\Delta} \left( J_{i,k} \setminus \bigcup_{j=1}^{k-1} (J_{i,k} \cap J_{i,j}) \right) = \bigcup_{k=1}^{\Delta} V_{i,k}, \tag{3.146–3.148} \]

where we define the sets $V_{i,k}$ as

\[ V_{i,k} \equiv J_{i,k} \setminus \bigcup_{j=1}^{k-1} (J_{i,k} \cap J_{i,j}) \quad \text{for all } i \in \mathbb{N}. \tag{3.149} \]

Now notice that the $\{V_{i,k}\}_{k=1}^{\Delta}$ are indeed disjoint and, therefore,

\[ \mathbb{P}(S_i) = \mathbb{P}\left( \dot{\bigcup}_{k=1}^{\Delta} V_{i,k} \right) = \sum_{k=1}^{\Delta} \mathbb{P}(V_{i,k}). \tag{3.150} \]

The goal is to show that $\mathbb{P}(S_i) = \mathbb{P}(\tilde{S}_i)$, and to do so, starting from (3.150), we construct $\mathbb{P}(\tilde{S}_i)$ by working backwards. Indeed, if we define $\tilde{V}_{i,k}$ as

\[ \tilde{V}_{i,k} \equiv \tilde{J}_{i,k} \setminus \bigcup_{j=1}^{k-1} (\tilde{J}_{i,k} \cap \tilde{J}_{i,j}), \quad i \in \mathbb{N}, \tag{3.151} \]

we show that $F^{-k} V_{i,k} = \tilde{V}_{i,k}$. To see this, we first observe that

\[ F^{-k} J_{i,k} = \tilde{J}_{i,k} \quad \text{and} \quad F^{-k}(J_{i,k} \cap J_{i,j}) = \tilde{J}_{i,k} \cap \tilde{J}_{i,k-j}, \quad 0 \le j \le k-1. \tag{3.152} \]

Using this, we can indeed show that

\[ F^{-k} V_{i,k} = F^{-k} J_{i,k} \setminus \bigcup_{j=1}^{k-1} F^{-k}(J_{i,k} \cap J_{i,j}) = \tilde{J}_{i,k} \setminus \bigcup_{j=1}^{k-1} (\tilde{J}_{i,k} \cap \tilde{J}_{i,k-j}) = \tilde{J}_{i,k} \setminus \bigcup_{l=1}^{k-1} (\tilde{J}_{i,k} \cap \tilde{J}_{i,l}) = \tilde{V}_{i,k}, \tag{3.153} \]

by (3.152) and the substitution $l = k - j$. Therefore, using the invariance of the measure, (3.153) yields

\[ \mathbb{P}(\tilde{V}_{i,k}) = \mathbb{P}(V_{i,k}). \tag{3.154} \]

Finally, the sets $\tilde{V}_{i,k}$ are disjoint as well, and this, along with (3.150) and (3.154), gives

\[ \mathbb{P}(S_i) = \sum_{k=1}^{\Delta} \mathbb{P}(V_{i,k}) = \sum_{k=1}^{\Delta} \mathbb{P}(\tilde{V}_{i,k}) = \mathbb{P}\left( \dot{\bigcup}_{k=1}^{\Delta} \tilde{V}_{i,k} \right) = \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \tilde{J}_{i,k} \right) = \mathbb{P}(\tilde{S}_i), \quad \text{as desired.} \tag{3.155–3.159} \]

Using these calculations, we obtain

\[ b^-_i = \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_{i-k} = 1\} \cap \{I_i = 1\} \right) = \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_i = 1\} \cap \{I_{i+k} = 1\} \right) \quad \text{(by (3.159))} \]
\[ = \mathbb{P}(\{W^{i,0,+}_m > 0\} \cap \{I_i = 1\}) = \mathbb{P}(W^{i,0,+}_m > 0 \mid I_i = 1)\, \mathbb{P}(I_i = 1) = \mathbb{P}_A(\tau_A \le \Delta)\, \mathbb{P}(I_i = 1) \]
\[ \le C\left( \nu(A)\, \Delta + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} \right) \nu(A) \quad \text{(by Proposition 5)} \;=\; C\left( \nu(A)^2 \Delta + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} \right). \tag{3.160} \]

To bound $b^+_i$ we argue directly. In particular,

\[ b^+_i = \mathbb{P}(\{W^{i,0,+}_m > 0\} \cap \{I_i = 1\}) = \mathbb{P}(W^{i,0,+}_m > 0 \mid I_i = 1)\, \mathbb{P}(I_i = 1) = \mathbb{P}_A(\tau_A \le \Delta)\, \mathbb{P}(I_i = 1) \le C\left( \nu(A)^2 \Delta + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} \right), \tag{3.161} \]

using Proposition 5, and this proves

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) \right| \le \mathbb{P}(\{W^{i,0}_m > 0\} \cap \{I_i = 1\}) \le b^-_i + b^+_i \le C\left( \nu(A)^2 \Delta + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} \right). \tag{3.162} \]

Bounds for the term (C) in (3.134): For the third term in (3.134) we work in a similar way as we did for the first term. We have

\[ \left| \mathbb{P}(\tilde{W}^i_m = a)\, \mathbb{P}(I_i = 1) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| = \mathbb{P}(I_i = 1) \left| \mathbb{P}(\tilde{W}^i_m = a) - \mathbb{P}(W^i_m = a) \right|, \tag{3.163–3.164} \]

and by means of the set inclusions

\[ \{W^i_m = a\} \subset \{\tilde{W}^i_m = a\} \cup \{W^{i,0}_m > 0\} \quad \text{and} \quad \{\tilde{W}^i_m = a\} \subset \{W^i_m = a\} \cup \{W^{i,0}_m > 0\}, \tag{3.165–3.166} \]

we get

\[ \left| \mathbb{P}(\tilde{W}^i_m = a) - \mathbb{P}(W^i_m = a) \right| \le \mathbb{P}(W^{i,0}_m > 0) \le 2\, \mathbb{P}\left( \bigcup_{k=1}^{\Delta} \{I_{i+k} = 1\} \right) \le 2 \Delta\, \nu(A). \tag{3.167} \]

Therefore the bound for the third term simplifies to

\[ \left| \mathbb{P}(\tilde{W}^i_m = a)\, \mathbb{P}(I_i = 1) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le 2 \Delta\, \nu(A)^2. \tag{3.168} \]

Bounds for the term (B) in (3.134): Bounding the second term in (3.134) is a little more involved.
Recall that

\[ \tilde{W}^i_m(x) = W^{i,-}_m(x) + W^{i,+}_m(x) \tag{3.169} \]

and

\[ \left| \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\tilde{W}^i_m = a)\, \mathbb{P}(I_i = 1) \right| = \left| \sum_{\substack{\vec{a} = (a^-, a^+) \\ |\vec{a}| = a}} \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\} \cap \{I_i = 1\}) - \sum_{\substack{\vec{a} = (a^-, a^+) \\ |\vec{a}| = a}} \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\})\, \mathbb{P}(I_i = 1) \right|. \tag{3.170–3.171} \]

For a particular choice of $\vec{a} = (a^-, a^+)$ such that $|\vec{a}| = a$ we have

\[ \left| \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\} \cap \{I_i = 1\}) - \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\})\, \mathbb{P}(I_i = 1) \right| \le (B_1) + (B_2) + (B_3), \tag{3.172} \]

where

\[ (B_1) = \left| \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\} \cap \{I_i = 1\}) - \mathbb{P}(\{W^{i,+}_m = a^+\} \cap \{I_i = 1\})\, \mathbb{P}(W^{i,-}_m = a^-) \right|, \]
\[ (B_2) = \left| \mathbb{P}(\{W^{i,+}_m = a^+\} \cap \{I_i = 1\}) - \mathbb{P}(W^{i,+}_m = a^+)\, \mathbb{P}(I_i = 1) \right| \mathbb{P}(W^{i,-}_m = a^-), \]
\[ (B_3) = \left| \mathbb{P}(W^{i,+}_m = a^+)\, \mathbb{P}(W^{i,-}_m = a^-) - \mathbb{P}(\{W^{i,-}_m = a^-\} \cap \{W^{i,+}_m = a^+\}) \right| \mathbb{P}(I_i = 1). \]

Three new terms are now to be bounded, namely $(B_1)$, $(B_2)$ and $(B_3)$ in (3.172).

Bounds for $(B_1)$: Using the mixing property (the event $\{W^{i,-}_m = a^-\}$ involves iterates up to $i - \Delta - 1$ only, leaving a gap of $\Delta$ before the iterate $i$), we have

\[ (B_1) \le \alpha(\Delta)\, \mathbb{P}(W^{i,-}_m = a^-). \]

This gives

\[ \sum_{\substack{\vec{a} = (a^-, a^+) \\ |\vec{a}| = a}} (B_1) \le \sum_{a^- = 0}^{a} \alpha(\Delta)\, \mathbb{P}(W^{i,-}_m = a^-) \le \alpha(\Delta). \tag{3.173} \]

Bounds for $(B_2)$: For $(B_2)$ we have

\[ (B_2) = \mathbb{P}(W^{i,-}_m = a^-) \left| \mathbb{P}(\{W^{i,+}_m = a^+\} \cap \{I_i = 1\}) - \mathbb{P}(W^{i,+}_m = a^+)\, \mathbb{P}(I_i = 1) \right| \le \alpha(\Delta)\, \mathbb{P}(W^{i,-}_m = a^-)\, \mathbb{P}(I_i = 1), \]

and therefore

\[ \sum_{\substack{\vec{a} = (a^-, a^+) \\ |\vec{a}| = a}} (B_2) \le \sum_{a^-} \alpha(\Delta)\, \mathbb{P}(W^{i,-}_m = a^-)\, \mathbb{P}(I_i = 1) \le \alpha(\Delta)\, \nu(A). \tag{3.174} \]

Bounds for $(B_3)$: Similarly, for the term $(B_3)$ in (3.172) the bound we get is

\[ \sum_{\substack{\vec{a} = (a^-, a^+) \\ |\vec{a}| = a}} (B_3) \le \alpha(2\Delta)\, \nu(A). \tag{3.175} \]

Gathering (3.173), (3.174) and (3.175) gives

\[ \left| \mathbb{P}(\{\tilde{W}^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(\tilde{W}^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le C \alpha(\Delta), \tag{3.176} \]

which completes the estimates for the term (B) in (3.134).

Finally, putting the error terms (3.162), (3.168) and (3.176) together gives

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le C \left( \Delta \nu(A)^2 + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \alpha(\Delta) \right) \quad \text{for some } C, k \in \mathbb{R}^+, \]

and, since this is true for all $\Delta > n$, we in fact get

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le C \inf_{\Delta > n} \left( \Delta \nu(A)^2 + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \alpha(\Delta) \right), \tag{3.177} \]

as desired. The constants $C$ and $k$ in (3.177) are independent of $A$. ∎
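The trade-off governing the gap $\Delta$ can be made concrete numerically. The sketch below is purely illustrative, with all constants of (3.177) set to 1 and $\nu(A)$ chosen ad hoc as $2^{-n}$ (e.g. cylinders of a full 2-shift): it scans $\Delta > n$ for a polynomial rate $\alpha(j) = j^{-\beta}$ and reports the minimizing gap, which grows like $\nu(A)^{-2/(\beta+1)}$, in line with the choice made in the proof of Theorem 6 below.

```python
import math

def gap_tradeoff(nu_A, n, beta, k=1.0, delta_max=10**5):
    """Scan the bound Delta*nu^2 + n*nu*max(alpha(n/2), e^{-kn}) + alpha(Delta)
    of (3.177) over gaps Delta > n, for alpha(j) = j**(-beta)."""
    alpha = lambda j: j ** (-beta)
    middle = n * nu_A * max(alpha(n / 2.0), math.exp(-k * n))
    value, gap = min((d * nu_A ** 2 + middle + alpha(d), d)
                     for d in range(n + 1, delta_max))
    return value, gap

for n in (8, 12, 16):
    nu_A = 2.0 ** (-n)
    value, gap = gap_tradeoff(nu_A, n, beta=3.0)
    print(f"n={n}: optimal Delta = {gap}, ~ nu_A^(-1/2) = {nu_A ** -0.5:.0f}, bound = {value:.2e}")
```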
3.6 Proof of the Main Theorem (Theorem 6)

3.6.1 Return Times

From Proposition 6 we have that

\[ \left| \mathbb{P}(\{W^i_m = a\} \cap \{I_i = 1\}) - \mathbb{P}(W^i_m = a)\, \mathbb{P}(I_i = 1) \right| \le C \inf_{\Delta > n} \left( \Delta \nu(A)^2 + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \alpha(\Delta) \right) \tag{3.178} \]

and therefore, in terms of the notation used above in (3.108), we have

\[ \xi_a \le C \inf_{\Delta > n} \left( \Delta \nu(A)^2 + n \nu(A) \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \alpha(\Delta) \right) \tag{3.179} \]

and, in turn, for the term $\epsilon_a$ in (3.107) we have

\[ \epsilon_a \le \nu(A) + \frac{\xi_a}{\nu(A)} \le \nu(A) + C \inf_{\Delta > n} \left( \Delta \nu(A) + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \frac{\alpha(\Delta)}{\nu(A)} \right) = C \inf_{\Delta > n} \left( \Delta \nu(A) + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \frac{\alpha(\Delta)}{\nu(A)} \right) \tag{3.180–3.182} \]

(the leading term $\nu(A)$ is absorbed since $\Delta \nu(A) \ge \nu(A)$). With the new estimates for the error term $\epsilon_a$ in hand, we can now work in a similar way as we did in the independent scenario when we derived inequality (3.88). Using the new estimates for $\epsilon_a$, the analogue of (3.88) is

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right| \le C t^* \big( t^* + (2 + t^*) |\log \nu(A)| \big) \inf_{\Delta > n} \left( \Delta \nu(A) + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \frac{\alpha(\Delta)}{\nu(A)} \right) \]
\[ \le C t^* (t^* \vee 1) \inf_{\Delta > n} \left( \Delta \nu(A) + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \frac{\alpha(\Delta)}{\nu(A)} \right) |\log \nu(A)| \]
\[ \le C t (t \vee 1) \inf_{\Delta > n} \left( \Delta \nu(A) + n \max\{\alpha(\tfrac{n}{2}), e^{-kn}\} + \frac{\alpha(\Delta)}{\nu(A)} \right) |\log \nu(A)|, \tag{3.183} \]

where $t^* = [\frac{t}{\nu(A)}] \nu(A)$.

We are considering two different mixing rates, namely (i) polynomial and (ii) exponential.

(i) Polynomial mixing: In the polynomial case, namely for $\alpha(k) = k^{-\beta}$ with $\beta > 2$, inequality (3.183) becomes

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right| \le C t (t \vee 1) \inf_{\Delta > n} \left( \Delta \nu(A) + \frac{1}{n^{\beta-1}} + \frac{\Delta^{-\beta}}{\nu(A)} \right) |\log \nu(A)|, \tag{3.184} \]

where we recall that $t^*$ has previously been defined as $t^* = [\frac{t}{\nu(A)}] \nu(A)$.

To find an explicit representation of the above infimum that does not depend on $\Delta$ might not be possible, but we can get an upper bound for it which is sharp enough in the sense that, as a function of the measure $\nu(A)$, it tends to zero as $\nu(A) \to 0$ or, equivalently, as $n \to \infty$. To do that, for an exponent $0 < \omega < 1$, we make the particular choice

\[ \Delta = \left( \frac{1}{\nu(A_n)} \right)^{\omega}; \tag{3.185} \]

then for the infimum in (3.184) we get

\[ \inf_{\Delta \in \mathbb{N}} \left( \Delta \nu(A_n) + \frac{1}{n^{\beta-1}} + \frac{\Delta^{-\beta}}{\nu(A_n)} \right) \le \nu(A_n)^{1-\omega} + \frac{1}{n^{\beta-1}} + \nu(A_n)^{\beta\omega - 1}, \tag{3.186} \]

and since this is true for all $0 < \omega < 1$ we can actually rewrite (3.186) in the form

\[ \inf_{\Delta \in \mathbb{N}} \left( \Delta \nu(A_n) + \frac{1}{n^{\beta-1}} + \frac{\Delta^{-\beta}}{\nu(A_n)} \right) \le \inf_{0 < \omega < 1} \left( \nu(A_n)^{1-\omega} + \frac{1}{n^{\beta-1}} + \nu(A_n)^{\beta\omega - 1} \right). \tag{3.187} \]

Observe that the first term on the right-hand side of (3.187) is an increasing function of $\omega$ while the third term is a decreasing function of $\omega$. The goal is to make sure that

\[ \inf_{0 < \omega < 1} \left( \nu(A_n)^{1-\omega} + \frac{1}{n^{\beta-1}} + \nu(A_n)^{\beta\omega - 1} \right) |\log \nu(A_n)| \to 0, \quad \text{as } n \to \infty. \tag{3.188} \]

Indeed, equating the two exponents gives

\[ 1 - \omega = \beta\omega - 1 \iff \omega = \frac{2}{\beta + 1}, \tag{3.189} \]

which gives

\[ 1 - \omega^* = \beta\omega^* - 1 = \frac{\beta - 1}{\beta + 1}, \quad \text{which does lie in } (0, 1). \tag{3.190} \]

Under these conditions, and by virtue of Lemma 5, there exists $K > 0$ such that

\[ \inf_{0 < \omega < 1} \left( \nu(A_n)^{1-\omega} + \frac{1}{n^{\beta-1}} + \nu(A_n)^{\beta\omega - 1} \right) \le 2\, \nu(A_n)^{\frac{\beta-1}{\beta+1}} + \frac{1}{n^{\beta-1}} \le \frac{K}{n^{\beta-1}} \quad \forall n \in \mathbb{N}. \tag{3.191} \]

By the assumption of the theorem there exists a constant $\tilde{C} > 0$ such that

\[ |\log \nu(A_n)| \le \tilde{C} n \quad \forall n \in \mathbb{N}. \tag{3.192} \]

Using (3.191) and (3.192) we conclude that there exists $C > 0$ such that, for $\beta > 2$ and all $n \in \mathbb{N}$,

\[ \inf_{0 < \omega < 1} \left( \nu(A_n)^{1-\omega} + \frac{1}{n^{\beta-1}} + \nu(A_n)^{\beta\omega - 1} \right) |\log \nu(A_n)| \le C \frac{1}{n^{\beta-2}}. \tag{3.193} \]

Finally, (3.184) becomes

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A_n)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-2}}, \quad \beta > 2. \tag{3.194} \]

Lastly, we show that one can substitute $t^*$ with $t$ with no harm to the estimates. Specifically, we show that the difference

\[ \left| \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} - \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!} \right| \tag{3.195} \]

is negligible compared to the error of the approximation. Indeed, consider the function $h(t) = \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!}$. As a function of $t$, $h(\cdot)$ is differentiable with a uniformly (independent of both $t$ and $k$) bounded derivative:

\[ |h'(t)| = \left| \sum_{i=1}^{k-1} e^{-t} \frac{t^{i-1}}{(i-1)!} - \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!} \right| \le 2 \quad \forall t \in \mathbb{R}^+, \forall k \in \mathbb{N}. \tag{3.196} \]

Therefore $h(\cdot)$ satisfies a Lipschitz condition with Lipschitz constant 2, and this proves

\[ \left| \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} - \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!} \right| \le 2 |t^* - t| \le 2 \nu(A), \quad \text{by (3.83)}. \tag{3.197} \]

The error made is asymptotically smaller than the approximation error made in (3.194), and therefore (3.194) can be rewritten as

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A_n)} \right) - \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-2}}, \quad \beta > 2. \tag{3.198} \]
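The exponent balancing in (3.189) is quickly confirmed numerically; the following one-off check (with arbitrary sample values of $\beta$ and $\nu$) minimizes $\max(\nu^{1-\omega}, \nu^{\beta\omega-1})$, the dominant part of (3.187), over a grid of $\omega$ values.

```python
def balance_exponents(beta=3.0, nu=1e-6):
    # The minimizer of max(nu^(1-w), nu^(beta*w - 1)) over 0 < w < 1
    # should be w* = 2/(beta + 1), as in (3.189).
    f = lambda w: max(nu ** (1.0 - w), nu ** (beta * w - 1.0))
    w_best = min((i / 1000.0 for i in range(1, 1000)), key=f)
    print(f"numerical argmin = {w_best:.3f},  2/(beta+1) = {2.0 / (beta + 1.0):.3f}")

balance_exponents()
```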
(ii) Exponential mixing: In the exponential case, we consider $\alpha(k) = e^{-\lambda k}$ with $\lambda > 0$. Then inequality (3.183) becomes

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A)} \right) - \sum_{i=0}^{k-1} e^{-t^*} \frac{(t^*)^i}{i!} \right| \le C t (t \vee 1) \inf_{\Delta > n} \left( \Delta \nu(A) + n e^{-kn} + \frac{e^{-k\Delta}}{\nu(A)} \right) |\log \nu(A)|. \tag{3.199} \]

In this case, for the infimum on the right-hand side of (3.199), we will get an upper bound with a different choice of the parameter $\Delta$. Specifically, if we choose $\Delta$ to be

\[ \Delta = \frac{1+\epsilon}{k} |\log \nu(A)| \quad \text{for some } \epsilon > 0, \tag{3.200} \]

we get

\[ \inf_{\Delta > n} \left( \Delta \nu(A) + n e^{-kn} + \frac{e^{-k\Delta}}{\nu(A)} \right) \le \frac{1+\epsilon}{k} |\log \nu(A)|\, \nu(A) + n e^{-kn} + \frac{1}{\nu(A)} e^{-k \frac{1+\epsilon}{k} |\log \nu(A)|} = \frac{1+\epsilon}{k} |\log \nu(A)|\, \nu(A) + n e^{-kn} + \nu(A)^{\epsilon}, \]

and therefore

\[ \inf_{\Delta > n} \left( \Delta \nu(A) + n e^{-kn} + \frac{e^{-k\Delta}}{\nu(A)} \right) |\log \nu(A)| \le \left( \frac{1+\epsilon}{k} |\log \nu(A)|\, \nu(A) + n e^{-kn} + \nu(A)^{\epsilon} \right) |\log \nu(A)|. \tag{3.201} \]

In view of the basic inequality

\[ |\log x| = O\left( \frac{1}{x^{\delta}} \right) \quad \text{as } x \to 0, \text{ for all } 0 < \delta < 1, \tag{3.202} \]

we have that for all $0 < \delta < 1$,

\[ |\log \nu(A)| \le C \frac{1}{\nu(A)^{\delta}} \quad \text{for some } C = C(\delta), \text{ independent of } A. \tag{3.203} \]

This proves that we can find $C > 0$ and $\zeta > 0$ such that $|\log \nu(A)|^2\, \nu(A) \le C \nu(A)^{\zeta}$. Additionally, we can find $\zeta' > 0$ such that $\nu(A)^{\epsilon} |\log \nu(A)| \le C \nu(A)^{\zeta'}$. Lastly, by the assumption of the theorem we have

\[ n e^{-kn} |\log \nu(A_n)| \le \tilde{C} n^2 e^{-kn}. \tag{3.204} \]

Using the last three inequalities, and Lemma 5, which guarantees exponential decay of the measure of the cylinder sets as $n \to \infty$, yields the desired result: strictly positive constants $C$ and $\gamma$ exist, both independent of $A$, such that

\[ \inf_{\Delta > n} \left( \Delta \nu(A) + n e^{-kn} + \frac{e^{-k\Delta}}{\nu(A)} \right) |\log \nu(A)| \le C e^{-\gamma n} \quad \text{for all } n\text{-cylinders } A, \tag{3.205} \]

and, in particular, after we drop the $t^*$ as we did in the polynomial case using the estimates (3.197), inequality (3.199) becomes

\[ \left| \mathbb{P}\left( \tau^k_A > \frac{t}{\nu(A_n)} \right) - \sum_{i=0}^{k-1} e^{-t} \frac{t^i}{i!} \right| \le C t (t \vee 1) e^{-\gamma n} \quad \text{for all } n \in \mathbb{N}. \tag{3.206} \]

Finally, we remark that the assumption of the theorem that $|\log \nu(A)| \le \tilde{C} n$ can be substituted by

\[ |\log \nu(A)| \le \tilde{C} n^{\delta} \quad \text{for some } \delta > 1. \tag{3.207} \]

Reproducing the proof with this new assumption in place, one can see that in the exponential case the parameter $\delta$ in (3.207) can be taken arbitrarily large, which will of course affect the values of $C$ and $\gamma$ in the estimates (a choice of a larger $\delta$ results in a larger $C$ and a smaller $\gamma$), whereas in the polynomial $\alpha$-mixing case $\delta$ can be taken as large as $\beta - 1$, where $\beta$ is the polynomial exponent of the $\alpha$ function. Larger values of the parameter $\delta$ broaden the family of cylinder sets for which the estimates hold.

3.6.2 Hitting Times

Corollary 3. In an $\alpha$-mixing, measure-preserving dynamical system $(\Omega, \mathcal{F}, \nu, F)$, with underlying partition $\eta$ that is finite or countably infinite with $H(\eta) < \infty$, if $A_n$ is a cylinder set of order $n$ and $\tau_A$ is the hitting time of the set $A$, i.e. the first time the system enters $A$, then the distribution of $\tau_A$, suitably rescaled, can be approximated by an exponential distribution with mean 1. Specifically, given a constant $\tilde{C} > 0$, for all $A_n \in \mathcal{C}(n)$ which satisfy $|\log \nu(A_n)| \le \tilde{C} n$ and $r_{A_n} > \frac{n}{2}$, the following estimates hold true:

i) Exponential mixing rate. If $\alpha(n) = \beta^n$, with $0 < \beta < 1$, there exist $\gamma = \gamma(\beta) > 0$ and $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) e^{-\gamma n} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.208} \]

ii) Polynomial mixing rate. If $\alpha(n) = \frac{1}{n^{\beta}}$, with $\beta > 2$, there exists $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-2}} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.209} \]

Proof. In terms of the auxiliary function $W_m$, which we earlier defined as $W_m(x) = \sum_{j=1}^{m} 1_A(F^j(x))$, the probability $\mathbb{P}(\tau_A > t)$ can be expressed as

\[ \mathbb{P}(\tau_A > t) = \mathbb{P}(W_{[t]} = 0) \quad \forall t > 0, \tag{3.210} \]

and using this, and the same approach as in the proof of Theorem 6, the result follows. ∎

3.7 Return Times on Markov Towers

Corollary 4. Let $\Phi : M \to M$ be a dynamical system that admits a Markov Tower structure $F : \Delta \to \Delta$ with a reference measure $m$ and a return time function $R$. Also let $\nu$ be the absolutely continuous invariant measure for $\Phi$. Then, given a positive fixed constant $\tilde{C}$, for all $A_n \in \mathcal{C}(n)$ which satisfy $|\log \nu(A_n)| \le \tilde{C} n$ and $r_{A_n} > \frac{n}{2}$, the following results hold true:

i) Exponentially decaying return-map tails.
If $m(R > n) = O(\theta^n)$, for some $0 < \theta < 1$, there exist $\gamma = \gamma(\theta) > 0$ and $C = C(\theta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) e^{-\gamma n} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.211} \]

ii) Polynomially decaying return-map tails. If $m(R > n) = O(n^{-\beta})$, for some $\beta > 3$, there exists $C = C(\beta, \tilde{C}) > 0$ such that

\[ \left| \mathbb{P}(\tau_{A_n} > t) - e^{-\nu(A_n) t} \right| \le C t (t \vee 1) \frac{1}{n^{\beta-3}} \quad \forall t > 0 \text{ and } \forall n \in \mathbb{N}. \tag{3.212} \]

The above result extends equally to the $k$-th return times $\tau^k_A$ (the $k$-th time the system visits $A$), with the truncated Poisson sums of Theorem 6 in place of the exponential and with the same error estimates.

Proof. In the presence of an admissible Markov tower structure with a return time function $R$ whose tails decay at an exponential or polynomial rate, Theorem 1 guarantees that one can derive an $\alpha$-mixing condition for $\Phi$ with respect to its absolutely continuous invariant measure $\nu$. One can then invoke Theorem 6 to approximate the return and hitting times of $n$-cylinders.

If the return-time tails decay exponentially fast, then the system mixes exponentially fast ($\alpha$-mixing) and exponential error bounds for the Poisson approximation can be obtained. In the polynomial case, however, the polynomial exponent in Theorem 6 has to be $> 2$ which, in view of Theorem 1, forces $\beta$ in (3.212) to be $> 3$. ∎
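To close, the rescaled exponential law itself is easy to see in simulation. The sketch below is an illustration outside the chapter's tower setting: it assumes the doubling map $T(x) = 2x \bmod 1$ with Lebesgue measure, and targets the $n$-cylinder with symbolic word $10\cdots0$, i.e. $A = [\frac{1}{2}, \frac{1}{2} + 2^{-n})$. This word has no short self-overlap, so $r_A = n > \frac{n}{2}$ and the periodic-point obstruction of Remark (iii) is avoided; orbits are sampled exactly via i.i.d. binary digits.

```python
import math
import random

def hitting_time(n, rng):
    # tau_A = first t >= 1 whose digit window d_{t+1} .. d_{t+n} reads 1 0 ... 0,
    # i.e. the trailing-zero run of the digit stream equals exactly n - 1.
    run = 0
    for _ in range(n):                       # digits d_1 .. d_n of the start point
        run = run + 1 if rng.getrandbits(1) == 0 else 0
    t = 0
    while True:
        t += 1
        run = run + 1 if rng.getrandbits(1) == 0 else 0   # digit d_{n+t}
        if run == n - 1:
            return t

def exponential_law_check(n=8, trials=10000, seed=2):
    rng = random.Random(seed)
    nu_A = 2.0 ** (-n)
    taus = [hitting_time(n, rng) for _ in range(trials)]
    for t in (0.5, 1.0, 2.0):
        emp = sum(1 for s in taus if s > t / nu_A) / trials
        print(f"t = {t}: empirical P(tau > t/nu) = {emp:.3f},  e^-t = {math.exp(-t):.3f}")

exponential_law_check()
```

The empirical survival probabilities track $e^{-t}$, as the corollaries predict for cylinders satisfying the recurrence-time assumption.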
Bibliography

[A1] M. Abadi, Exponential approximation for hitting times in mixing processes, Math. Phys. Electron. J. 7 (2001), 343–363.

[A2] M. Abadi, Sharp error terms and necessary conditions for exponential hitting times in mixing processes, Ann. Probab. 32 (2004), 243–264.

[A3] M. Abadi, Instantes de ocorrência de eventos raros em processos misturadores, Ph.D. Thesis, Instituto de Matemática e Estatística, Universidade de São Paulo (2001).

[ADU] J. Aaronson, M. Denker and M. Urbański, Ergodic theory for Markov fibred systems and parabolic rational maps, Trans. Amer. Math. Soc. 337 (1993), 495–548.

[AG] M. Abadi and A. Galves, Inequalities for the occurrence times of rare events in mixing processes. The state of the art, Markov Process. Related Fields 7 (2001), 97–112.

[AGG] R. Arratia, L. Goldstein and L. Gordon, Poisson approximation and the Chen-Stein method, Statist. Sci. 5, No. 4 (1990), 403–425.

[B] R. Bowen, Markov partitions for Axiom A diffeomorphisms, Amer. J. Math. 92 (1970), 725–749.

[BC1] A. D. Barbour and Louis H. Y. Chen, An Introduction to Stein's Method, Lecture Notes Series, Institute for Mathematical Sciences, National University of Singapore, Vol. 4 (2005).

[BC2] A. D. Barbour and Louis H. Y. Chen, Stein's Method and Applications, Lecture Notes Series, Institute for Mathematical Sciences, National University of Singapore, Vol. 5 (2005).

[BS] M. Brin and G. Stuck, Introduction to Dynamical Systems, Cambridge University Press, New York (2002), 1st edition.

[C] Louis H. Y. Chen, Poisson approximation for dependent trials, Ann. Probab. 3 (1975), 534–545.

[CY] N. Chernov and L.-S. Young, Decay of correlations for Lorentz gases and hard balls, Encyclopaedia of Mathematical Sciences, Math. Phys. II, Vol. 101, ed. Szász (2001), 89–120.

[D1] R. Durrett, Probability: Theory and Examples, Wadsworth (1996), 2nd edition.

[D2] P. Doukhan, Mixing: Properties and Examples, Lecture Notes in Statistics 85, Springer-Verlag, New York (1994).

[GS] A. Galves and B. Schmitt, Inequalities for hitting times in mixing dynamical systems, Random Comput. Dynam. 5 (1997), 337–347.

[HV1] N. Haydn and S. Vaienti, Fluctuations of the metric entropy for mixing measures, Stoch. Dyn. 4, No. 4 (2004), 595–627.

[HV2] N. Haydn and S. Vaienti, The limiting distribution and error terms for return times of hyperbolic maps, Discrete Contin. Dyn. Syst. 10 (2004), 584–616.

[HV3] N. Haydn and S. Vaienti, The compound Poisson distribution and return times in dynamical systems, to appear in Probab. Theory Related Fields.

[K] M. Kac, On the notion of recurrence in discrete stochastic processes, Bull. Amer. Math. Soc. 53 (1947), 1002–1010.

[M1] R. Mañé, Ergodic Theory and Differentiable Dynamics, Springer-Verlag, New York (1985).

[M2] V. Maume-Deschamps, Projective metrics and mixing properties on towers, Trans. Amer. Math. Soc. 353, No. 8 (2001), 3371–3389.

[R] D. Ruelle, A measure associated with Axiom A attractors, Amer. J. Math. 98 (1976), 619–654.

[S] C. Stein, A bound for the error in the normal approximation to the distribution of a sum of dependent random variables, Proc. 6th Berkeley Sympos. Math. Statist. Probab. 2 (1972), 583–602.

[Si1] Ya. G. Sinai, Construction of Markov partitions, Functional Anal. Appl. 2 (1968), 245–253.

[Si2] Ya. G. Sinai, Gibbs measures in ergodic theory, Russian Math. Surveys 27 (1972), 21–69.

[W] P. Walters, An Introduction to Ergodic Theory, Springer-Verlag, New York (1982).

[Y1] L.-S. Young, Decay of correlations for certain quadratic maps, Commun. Math. Phys. 146 (1992), 123–138.

[Y2] L.-S. Young, Statistical properties of dynamical systems with some hyperbolicity, Ann. of Math. 147 (1998), 585–650.

[Y3] L.-S. Young, Recurrence times and rate of mixing, Israel J. Math. 110 (1999), 153–188.