Return Time Distributions of $n$-Cylinders and Infinitely Long Strings

by Jason Tomas Dungca

A Thesis Submitted to the Faculty of The USC Graduate School, University of Southern California, In Partial Fulfillment of the Requirements for the Degree Master of Arts (Applied Mathematics)

August 2014

Copyright 2014 Jason Tomas Dungca

Contents

1 Acknowledgements
2 Abstract
3 Introduction
4 Introductory Definitions and a Lemma
5 Distribution of Return Times
  5.1 The Main Result on Return Time Distributions
  5.2 Hitting and Return Time Distributions
6 Defining a Special Probability Sequence $\lambda_l(A)$ and More Definitions
7 Example
  7.1 A Markov Chain
  7.2 Normalized Case of Stochastic Processes
  7.3 Existence of Probability Measure
  7.4 Special Cases
  7.5 A Specific Distribution
8 Case Study of Return Times of $n$-Cylinders
  8.1 Case 1
  8.2 Case 2
9 Appendix
10 Conclusion

1 Acknowledgements

I greatly thank my thesis advisor, Professor Nicolai Haydn, for the years of guidance (since 2011) he has provided me throughout my undergraduate years, my Masters, and my thesis research and writing process. Furthermore, I would like to thank him for helping me become a much more independent mathematician. I would also like to earnestly thank Professor Remigijus Mikulevicius for helping me improve my abilities in Real Analysis, advising me during my Masters, and helping with the thesis revision process. I also thank Professor Sergey Lototsky for helping revise this paper, and Dr. Triet Pham for his help with LaTeX technical difficulties. Finally, I would like to thank James-Michael Leahy for his input on my thesis as well as his motivational support.

2 Abstract

Consider an invariant probability measure and a shift space made of symbolic strings (sequences of symbols, which are treated as probabilistic events). Within this shift space, we analyze the behavior of strings under the left-shift map, which moves each symbol of a string one position to the left; for instance, the second symbol of a string becomes its first symbol under the left-shift map. An introduction to terminology from ergodic theory, including return time, hitting time, and period, will be given. A major analytic condition called $\phi$-mixing, which captures the independence of strings over time, will also be discussed in this paper due to its significant impact on our strings.
The main part of this thesis proves and analyzes the probability distribution of return times; in particular, we find that the return time distributions are exponential. The analysis of these distributions proceeds mainly through many extensions of an example on Markov chains. The final piece of analysis is a case study of the effect of the periodicity of a nested sequence of $n$-cylinders on the convergence of return time distributions.

3 Introduction

To ease the reader into the content of this thesis, we give a brief expository discussion of ergodic theory and then an overview of the objectives of the paper. Ergodic theory, the study of dynamical systems with an invariant measure, is essentially the intersection of dynamical systems and probabilistic analysis. From the work of Poincaré to von Neumann, ergodic theory has certainly made an impact on dynamical systems over the 19th-21st centuries. Poincaré's work in the late 19th century sparked the formation of ergodic theory and, in particular, the study of return time (the time at which a point of a set $A$ reappears in that same set under iteration of a map $T$). Of particular importance is the Poincaré recurrence theorem, discovered by Poincaré in 1890.

Before we state Poincaré's theorem, let us introduce some necessary definitions and notation. Consider $(\Omega, \mathcal{F}, \mu)$, composed of a probability space $\Omega$ (our shift space, the set of strings under the shift map $T$ that moves the symbols of a string one step to the left), a $\sigma$-algebra $\mathcal{F}$ generated by strings, and a probability measure $\mu$. For a subset $U \subset \Omega$, the return time is rigorously defined as
$$\tau_U(x) = \min\{j \geq 1 : T^j(x) \in U\} \quad \text{for } x \in U.$$
Note that a measure $\mu$ is called $T$-invariant if $\mu(T^{-1}U) = \mu(U)$ for all measurable sets $U$. The statement and proof of the Poincaré recurrence theorem, proven by Poincaré [9] in 1890, are below:

Theorem 3.1. Let $T : \Omega \to \Omega$ and let $\mu$ be a $T$-invariant probability measure. If $\mu(U) > 0$, then $\tau_U(x) < \infty$ for almost every $x \in U$.

Proof. Let $U_n = \bigcup_{j=n}^{\infty} T^{-j}U$ be the set of points $x \in \Omega$ that enter $U$ at least once after time $n$. Obviously $U_0 \supset U_1 \supset U_2 \supset \cdots$, and also $U_n = T^{-1}U_{n+1}$, which implies by the invariance of the measure that $\mu(U_n) = \mu(T^{-1}U_{n+1}) = \mu(U_{n+1})$, and consequently $\mu(U_0) = \mu(U_n)$ for all $n$. Now $W = \bigcap_{n=1}^{\infty} U_n = \{x \in \Omega : x \text{ enters } U \text{ infinitely often}\}$. Let us consider the set $U \setminus W = \{x \in U : x \text{ enters } U \text{ only finitely often}\}$ componentwise, i.e., through the sets $U_n$. Most importantly, we know that $\mu(U \setminus U_n) \leq \mu(U_0 \setminus U_n) = 0$, since $U \subset U_0$ for every $n$ and $\mu(U_0) = \mu(U_n)$. Thus $\mu(U \setminus U_n) = 0$. Because this equality is valid for every $n$ and $W = \bigcap_{n=1}^{\infty} U_n$, by De Morgan's laws,
$$\mu\Big(\bigcup_{n=1}^{\infty}(U \setminus U_n)\Big) = \mu\Big(U \setminus \bigcap_{n=1}^{\infty} U_n\Big) = \mu(U \setminus W).$$
Since $\mu(U \setminus U_n) = 0$ for every $n$, by subadditivity, $\mu(\bigcup_{n=1}^{\infty}(U \setminus U_n)) = 0$. Hence $\mu(U \setminus W) = 0$, so $\mu(\{x \in U : T^j(x) \notin U \text{ for all } j > n\}) = 0$. Because $\{x \in U : T^j(x) \notin U \text{ for all } j > n\}$ is a null set, for almost every $x \in U$, $T^j(x)$ must be in $U$ for some $j \geq 1$. Therefore, by definition, $\tau_U(x) < \infty$ for almost every $x \in U$.
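To make Theorem 3.1 concrete, here is a minimal numerical sketch (an illustration, not part of the thesis): an irrational rotation of the circle preserves Lebesgue measure, so almost every point of an interval $U$ returns to $U$ in finite time. The choice of rotation angle, interval, and the helper `return_time` are all illustrative assumptions.

```python
import math

# Irrational rotation T(x) = x + alpha (mod 1) preserves Lebesgue measure,
# so the Poincare recurrence theorem applies: almost every point of
# U = [0, 0.1) returns to U in finite time.
alpha = math.sqrt(2) - 1
U = (0.0, 0.1)

def T(x):
    return (x + alpha) % 1.0

def return_time(x, max_iter=100_000):
    """First j >= 1 with T^j(x) in U (cf. tau_U); None if not seen."""
    y = x
    for j in range(1, max_iter + 1):
        y = T(y)
        if U[0] <= y < U[1]:
            return j
    return None

# Every sampled starting point of U comes back to U in finite time.
for x in [0.0, 0.025, 0.05, 0.075, 0.099]:
    print(x, return_time(x))
```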
Note 1. The Poincaré recurrence theorem does not apply to spaces of infinite measure: if it were applicable, we could apply it to any positive measure set $U$ in a space of infinite measure. Consider the Lebesgue measure $m$ on $\mathbb{R}$ with the dynamical system $Tx = x + 1$ and the positive measure set $U = [a, b]$, $a, b \in \mathbb{R}$. Of course, $m(U) = |b - a|$. Under $T$, an arbitrary value $x \in U$ is mapped to $T(x) = x + 1$, and $T^{-1}x = x - 1$. We also know that $T^{-1}(U) = [a-1, b-1]$, so $m(U) = m(T^{-1}U)$ for this specific $U$. In general, $m$ is $T$-invariant because proving $T$-invariance for these closed sets (which generate the Borel $\sigma$-algebra) proves $T$-invariance on $\mathbb{R}$. Usually, we would like $T$-invariant measures that come with a given dynamical system. However, in this case our dynamical system leads to some trouble, as there does not exist a point $x \in U$ such that $T^j(x) = x + j = x$ for any $j \in \mathbb{N}$. In particular, $T^j(U) = [a+j, b+j]$, so $T^j(U)$ is never identically equal to $U$. Thus, we observe that not all $x \in U$ satisfy the conclusion of the Poincaré recurrence theorem that $T^j(x)$ must again lie in $U$; the Poincaré recurrence theorem does not apply. Hence, it is necessary to work on finite measure spaces in order to use the Poincaré recurrence theorem and analyze return times.

Poincaré's work has led to original research by many ergodic theorists, including Nicolai Haydn, Michael Abadi, and Nicolas Vergne. The basis of the research done for this paper is Abadi and Vergne's paper, Sharp Error Terms for Return Time Statistics under Mixing Conditions [2]. Note that a string is a specific sequence of symbols drawn from a given set. Suppose $A$ is a cylinder set; i.e., given an arbitrary string $\{x_0 x_1 \ldots x_{n-1}\}$ in $A$,
$$U(x_0 x_1 \ldots x_{n-1}) = \{y \in \Omega : y_0 \ldots y_{n-1} = x_0 \ldots x_{n-1}\}.$$
In layman's terms, a cylinder set is the set of all strings that coincide with the first $n$ symbols (positions $0$ through $n-1$) of a given string $x$.

Abadi and Vergne's paper is mainly about the distribution of return times, denoted $\tau_A$ (given a set $A$) in their notation, for infinitely long strings (i.e., sequences of symbols). We will discuss and prove two main results from [2] as well as examples related to their results. The main theorem and lemma are significant because they describe the measure, or probability, of these special collections of symbols, called $n$-cylinders, when a condition called $\phi$-mixing is imposed on each element $X_n$ of the string. Before we discuss these findings, let us roughly define the hitting and return time distributions. The hitting time distribution is the probability function
$$P(\tau_A > t) = P(\{x \in \Omega : \tau_A(x) > t\})$$
for a set $A$, with $x$ ranging over $\Omega$. Similarly, the return time distribution is the probability function
$$P_A(\tau_A > t) = P(\{x \in A : \tau_A(x) > t\})$$
for a set $A$, with $x$ restricted to $A$. The difference between the hitting and return time distributions is that the return time distribution is the hitting time distribution with the domain of $x$ restricted to $A$.

In summary, the first main result is that the distribution of return times is approximately exponential. The key word in the previous sentence is "approximately," because Abadi and Vergne introduce the error term $\epsilon(A)$ to show rigorously that the distribution of return times is close to a special exponential function, $e^{-\zeta_A P(A)t}$, for a given set $A$ and time $t > 0$.
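As a quick illustration of this first main result (a Monte Carlo sketch, not the thesis's code), one can estimate the return time law of an $n$-cylinder under an i.i.d. Bernoulli process and compare it with the approximant $\zeta_A e^{-\zeta_A P(A)(t-\tau(A))}$. The parameter $1/2$, the pattern $1000$, and the sample counts below are assumed choices.

```python
import random, math

# Monte Carlo sketch: for the 4-cylinder A = {X_0 X_1 X_2 X_3 = 1000} of an
# i.i.d. Bernoulli(1/2) process, the return time law is close to
# zeta_A * exp(-zeta_A * P(A) * (t - tau(A)))  (cf. Theorem 3.2 below).
random.seed(1)
theta = 0.5
pattern = [1, 0, 0, 0]                 # tau(A) = 4: no shorter self-overlap
PA = theta * (1 - theta) ** 3          # P(A) = 1/16

def sample_return_time(max_t=500_000):
    window = list(pattern)             # condition on starting inside A
    for t in range(1, max_t + 1):
        window.pop(0)
        window.append(1 if random.random() < theta else 0)
        if window == pattern:
            return t
    return max_t

samples = [sample_return_time() for _ in range(20_000)]
zeta = sum(s > 4 for s in samples) / len(samples)   # zeta_A = P_A(tau_A > tau(A))
for t in [4, 8, 16, 32, 64]:
    empirical = sum(s > t for s in samples) / len(samples)
    predicted = zeta * math.exp(-zeta * PA * (t - 4))
    print(t, round(empirical, 3), round(predicted, 3))
```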
Here is the statement of the theorem as reference for the reader:

Theorem 3.2. Let $\{X_m\}_{m \in \mathbb{N}_0}$ be a $\phi$-mixing process and $f_A = \frac{1}{2P(A)}$. Then, for all $A \in \mathcal{C}_n$, $n \in \mathbb{N}_0$, the following holds:
For $t < \tau(A)$:
$$P_A(\tau_A > t) - 1 = 0.$$
For $\tau(A) \leq t \leq f_A$:
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t - \tau(A))}\big| \leq \tfrac{9}{2}\,\epsilon(A) P(A)\, t.$$
For $t > f_A$:
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t - \tau(A))}\big| \leq 58\,\epsilon(A) f(A,t).$$

In order to prove this main result for different values of $t$, they establish various propositions, which involve estimates of the return and hitting time
$$\tau_A(x) = \inf\{k \geq 1 : T^k(x) \in A\}$$
for any $x \in \Omega$. Additionally, Abadi and Vergne prove a lemma stating that the following results are equivalent: the distributions of the return time and hitting time are nearly equal to each other, $\zeta_A \to 1$, and the distributions of these times are exponential. Equivalently, the lemma also applies these results to $n$-strings that grow longer and longer; this result is applied to special sequences of $n$-cylinders in Section 8. To establish this lemma, they need the results of the above theorem. As reference for the reader, we provide the statement of this important lemma below:

Lemma 3.3. Let the process $\{X_m\}_{m \in \mathbb{N}_0}$ be $\phi$-mixing. There exists a constant $C > 0$ such that, for all $A \in \mathcal{C}_n$, $n \in \mathbb{N}_0$, and all $t > 0$, the following conditions are equivalent:
(a) $|P_A(\tau_A > t) - e^{-P(A)t}| \leq C\,\epsilon(A) f(A,t)$.
(b) $|P_A(\tau_A > t) - P(\tau_A > t)| \leq C\,\epsilon(A) f(A,t)$.
(c) $|P(\tau_A > t) - e^{-P(A)t}| \leq C\,\epsilon(A) f(A,t)$.
(d) $|\zeta_A - 1| \leq C\,\epsilon(A)$.
Moreover, if $\{A_n\}_{n \in \mathbb{N}_0}$ is a sequence of strings such that $P(A_n) \to 0$ as $n \to \infty$, the following conditions are equivalent:
(i) The return time law of $A_n$ converges to a parameter one exponential law.
(ii) The return time law and the hitting time law of $A_n$ converge to the same law.
(iii) The hitting time law of $A_n$ converges to a parameter one exponential law.
(iv) The sequence $(\zeta_{A_n})_{n \in \mathbb{N}_0}$ converges to one.

Besides the main result of their paper, which concerns exponential return time distributions, they also prove various approximations for the expectation of the return time and for the distribution of the sojourn time (the last time an iterate by a positive multiple of the period of a set keeps a point $x \in A$ inside the set $A$). We do not have enough space to cover examples of these phenomena in this paper, but a brief discussion of these concepts appears in the conclusion.

This thesis is based on Abadi and Vergne's paper as described above. Section 4 concentrates on introducing various important definitions, such as return time and hitting time. Section 5 concentrates on giving a thorough analysis of Abadi and Vergne's results and proofs about exponentially distributed return times. Section 6 concentrates on giving essential definitions, including ones for sojourn time and ergodicity (a special measure-theoretic property involving measure invariant sets); more importantly, this section defines a probability sequence, connected to the concept of sojourn time, that will be analyzed through various examples, such as one on Markov chains and another on a specific given distribution. Section 7 introduces a specific Markov chain that generates a subshift of finite type, which will be defined later, and gives explicit calculations of the probability sequence $\lambda_i(A)$ for this example. Note that the specific Markov chain (or random walk) we introduce is one that only allows us to proceed or "walk" stepwise on $\mathbb{N}_0$ or return to $0$. Basically, this section is the analysis of the limit $\lambda(S_n) = \lim_{i \to \infty} \lambda_i(S_n)$ for the given Markov chain under specific and general probability distributions.
Section 8 is a case study of periodicity and cylinder sets, which involves analyzing the limiting behavior of $\lambda_{A_n(x)}$, a probability sequence, connected to return time, that will be defined later. Section 9 is the Appendix, containing the full proof of exponential return time distributions. The final section is the conclusion, which explains other work Abadi and Vergne have done and how we can further extend the work done by them, Professor Haydn, and myself.

4 Introductory Definitions and a Lemma

In order to make sense of the main results on hitting and return time distributions, we must rigorously define and explain the necessary concepts and details behind these findings. We proceed in this section via construction of the fundamental spaces, sets, and maps. First, we define the probability space, including its elements (strings) and its special subsets called $n$-cylinders. Then, we define the topology and $\sigma$-algebra. Afterwards, we construct the Markov measure, which is essential to our main result due to its nice behavior (an example of which is measure invariance). Finally, we introduce the ubiquitous components of the two main results (briefly discussed in the introduction), which are $n$-cylinders generated by a given set $C$, hitting time, return time, and, respectively, the distributions of the preceding.

4.0.0.1 Construction of the Probability Space and Left Shift Map

Suppose $C$ is a set with a finite number of elements. Set the probability space (or measure space) $\Omega = C^{\mathbb{N}_0}$, the set of infinite sequences $\{x_0 x_1 x_2 \ldots\}$ such that $x_j \in C$. Alternatively, these sequences are called strings. For every $x = (x_n)_{n \in \mathbb{N}_0} \in \Omega$ and $n \in \mathbb{N}_0$, let $X_n : \Omega \to C$ be the $n$-th coordinate projection, i.e., $X_n(x) = x_n$. Let $T : \Omega \to \Omega$ be the one-step-left-shift operator defined by
$$(T(x))_n = x_{n+1} \quad \text{for all } n \in \mathbb{N}_0.$$

4.0.0.2 Example of a String Under the One-Step-Left-Shift Operator $T$

Let $C = \{0, 1\}$ with the typical definition of $\Omega$. Consider a finite string
$$x = \{101010101010\}.$$
Then the string under the shift operator is
$$Tx = \{01010101010*\}.$$
Note that the $*$ in the 12th position of $Tx$ could be anything in $C$. As the reader may notice, $(T(x))_n = x_{n+1}$; for instance, $(T(x))_0 = 0 = x_1$.

4.0.0.3 Cylinder Sets

Before introducing the topology on $\Omega$, let us introduce the following definition:

Definition 4.1. The cylinder set given by a string $\{x_0 x_1 \ldots x_{n-1}\}$ is
$$U(x_0 x_1 \ldots x_{n-1}) = \{y \in \Omega : y_0 \ldots y_{n-1} = x_0 \ldots x_{n-1}\}.$$
Note that $U(x_0 x_1 \ldots x_{n-1})$ is called an $n$-cylinder, as the given string has length $n$.

4.0.0.4 Example of Elements of a Cylinder Set

As a general example, consider the infinite strings
$$x = \{\underbrace{111\ldots10}_{x_0 \ldots x_{n-1}}00000000\ldots\} \quad \text{and} \quad y = \{\underbrace{111\ldots10}_{y_0 \ldots y_{n-1}}1010101\ldots\}.$$
Then $y$ belongs to the $n$-cylinder generated by the first $n$ elements of $x$, $U(x_0 x_1 \ldots x_{n-1})$, because $\{x_0 \ldots x_{n-1}\} = \{y_0 \ldots y_{n-1}\}$. By definition, $U(x_0 x_1 \ldots x_{n-1})$ is composed of all strings in $\Omega$ that share the same property, namely the coinciding property $y$ has with $x$. Note that an $n$-cylinder can be generated by elements of a string: for instance, given a string $a = \{x_0 \ldots x_4\}$, one can consider $T^{-5}(U(a)) = T^{-5}(U(x_0 \ldots x_4))$, the cylinder set generated by the 5th position of $x$.

4.0.1 Topology of $\Omega$

The topology is given by the following metric: fix $\lambda \in (0, 1)$ and put $d(x,y) = \lambda^{n(x,y)}$, where $n(x,y) = \min\{|j| : x_j \neq y_j\}$. The metric basically states that the distance between two strings is a number between 0 and 1 raised to the power of the first position at which the strings disagree. A basis for the topology is made of cylinder sets.
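The following minimal Python sketch (illustrative, not from the thesis) implements the shift map, cylinder membership, and the metric $d$ on finite prefixes of strings; the prefix lengths and $\lambda = \frac{1}{2}$ are assumed choices.

```python
# Sketch of the shift space machinery on finite prefixes over C = {0, 1}.
def shift(x):
    """One-step-left-shift operator: (T(x))_n = x_{n+1}."""
    return x[1:]

def in_cylinder(y, word):
    """Is y in the n-cylinder U(word), i.e. y_0...y_{n-1} = word?"""
    return list(y[:len(word)]) == list(word)

def d(x, y, lam=0.5):
    """Metric d(x, y) = lam**n(x, y); n(x, y) = first index where x, y differ."""
    n = next((j for j in range(min(len(x), len(y))) if x[j] != y[j]),
             min(len(x), len(y)))
    return lam ** n

x = [1, 1, 1, 1, 1, 1, 0] + [0] * 9            # agrees with y on x_0...x_6
y = [1, 1, 1, 1, 1, 1, 0] + [1, 0, 1, 0, 1, 0, 1, 0, 1]
print(in_cylinder(y, x[:7]))                    # True: same 7-symbol prefix
print(d(x, y))                                  # (1/2)**7 = 1/128
```

The printed value $\frac{1}{128}$ matches the worked example that follows.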
4.0.1.1 Example Using the Given Topology

Consider the strings
$$x = \{\underbrace{111\ldots10}_{x_0 \ldots x_6}00000000\ldots\} \quad \text{and} \quad y = \{\underbrace{111\ldots10}_{y_0 \ldots y_6}1010101\ldots\}.$$
Then, for $\lambda = \frac{1}{2}$ (choosing this value is standard convention),
$$d(x,y) = \lambda^{n(x,y)} = \left(\tfrac{1}{2}\right)^7 = \tfrac{1}{128}.$$

4.0.2 $\sigma$-Algebra on $\Omega$ for the Strings

Before we define the measure for these strings, let $\mathcal{F}$ be the $\sigma$-algebra on $\Omega$ generated by the strings. Let $\mathcal{F}_I$ be the $\sigma$-algebra generated by strings with coordinates from $I$, $I \subset \mathbb{N}_0$.

4.0.2.1 Stochastic Matrix

Definition 4.2. A stochastic matrix is a matrix $R$ such that each of its rows sums to 1, i.e., $\sum_{j=1}^{n} R_{ij} = 1$ for $i = 1, \ldots, n$.

4.0.2.2 Example of a Stochastic Matrix

Consider an $n \times n$ matrix $R$ with $R_{i1} = \frac{1}{2}$, $R_{in} = \frac{1}{2}$, and $R_{ij} = 0$ for $j \neq 1, n$. Namely,
$$R = \begin{pmatrix} \frac{1}{2} & 0 & \cdots & 0 & \frac{1}{2} \\ \frac{1}{2} & 0 & \cdots & 0 & \frac{1}{2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \frac{1}{2} & 0 & \cdots & 0 & \frac{1}{2} \end{pmatrix}.$$
Each row of $R$ sums to 1 because
$$\sum_{j=1}^{n} R_{ij} = R_{i1} + R_{in} = \tfrac{1}{2} + \tfrac{1}{2} = 1 \quad \text{for } i = 1, \ldots, n.$$
Note: Equivalently, one can define a matrix to be stochastic in terms of its columns. If a matrix is stochastic in terms of both its rows and its columns, it is called bi-stochastic.

4.0.2.3 State Space

Definition 4.3. The state space $S$ is the set of values each random variable (i.e., function) $X_m$ can attain. Note that the state space is a countable set.

4.0.3 Markov Measure

Let us examine a special $T$-invariant probability measure $P$ over $\mathcal{F}$. This measure, called a Markov measure, is constructed below. Consider an $n \times n$ matrix $B$ composed of entries that are either 0 or 1, and suppose $T : \Omega \to \Omega$ is our left shift, which is called the subshift of finite type because our $X_m$ can only take values in our state space. Suppose $R$ is a stochastic matrix, which means that $R\mathbf{1} = \mathbf{1}$ (where $\mathbf{1} = [1, 1, \ldots, 1]$), and $pR = p$ for a left eigenvector $p = [p_1, \ldots, p_n]$ (in the case that $C$ has $n$ elements), corresponding to the eigenvalue 1, that is positive and has components summing to one, $\sum_j p_j = 1$. We also require $R_{ij} = 0$ when $B(i,j) = 0$.

This creates the measure $P$ on $\Omega$, invariant under $T$, such that cylinder sets have the following measure:
$$P(U(x_1 x_2 \ldots x_n)) = p_{x_1} R_{x_1 x_2} R_{x_2 x_3} \cdots R_{x_{n-1} x_n}.$$
Note that these cylinder sets generate the $\sigma$-algebra. By the Kolmogorov extension theorem, the construction of the measure on cylinder sets gives the measure for the whole $\sigma$-algebra. The invariance of $P$, as proven in [10], is below:

Lemma 4.4. The Markov measure is invariant under the one-step-left-shift map $T$.

Proof.
$$P(U(x_{m+1} x_{m+2} \ldots x_n)) = p_{x_{m+1}} R_{x_{m+1} x_{m+2}} R_{x_{m+2} x_{m+3}} \cdots R_{x_{n-1} x_n}$$
$$= \Big(\sum_{x_m:\,B(x_m, x_{m+1}) = 1} p_{x_m} R_{x_m x_{m+1}}\Big) R_{x_{m+1} x_{m+2}} \cdots R_{x_{n-1} x_n} \qquad (\text{since } pR = p)$$
$$= \sum_{x_m:\,B(x_m, x_{m+1}) = 1} P(U(x_m x_{m+1} \ldots x_n)) = P(T^{-1}U(x_{m+1} \ldots x_n)).$$
By observing the indices, we conclude that the result indeed follows.

An example using the Markov measure will be discussed in Section 7.

Notation: Given measurable sets $V$ and $W$, let $P(V \mid W) = P_W(V) = \frac{P(V \cap W)}{P(W)}$ be the conditional measure of $V$ given $W$. We write $P(V; W) = P(V \cap W)$.

4.0.4 Analysis of These Measurable Sets

Let us now consider measurable subsets of $\Omega$.

Definition 4.5. Given $A \subset \Omega$, $A \in \mathcal{C}_n$ if and only if $A = \{X_0 = a_0, \ldots, X_{n-1} = a_{n-1}\}$ with $a_i \in C$ for $0 \leq i \leq n-1$.

4.0.4.1 Example of Sets that Create $\mathcal{C}_n$

The reader may notice a similarity between the definitions of $\mathcal{C}_n$ and $n$-cylinders. In fact, the $n$-cylinders generated by $C$ create $\mathcal{C}_n$, as shown below: the sets $U(a_0 \ldots a_{n-1})$ with $a_i \in C$ for $0 \leq i \leq n-1$ form $\mathcal{C}_n$. For instance, take $\{a_0 \ldots a_{n-1}\} = \{111\ldots1\} = \{x_0 \ldots x_{n-1}\}$, so $x$ and $y$ from the $n$-cylinder example are elements of $U(a_0 \ldots a_{n-1})$. Hence, $x$ and $y$ are elements of an element of $\mathcal{C}_n$.
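As a small computational sketch (not from the thesis), the Markov measure of a cylinder and the invariance identity of Lemma 4.4 can be checked numerically; the two-state matrix $R$ and stationary vector $p$ below are assumed example values satisfying $pR = p$.

```python
import numpy as np

# Two-state stochastic matrix R and its stationary left eigenvector p
# (pR = p, sum(p) = 1), as in the Markov measure construction.
R = np.array([[0.7, 0.3],
              [0.4, 0.6]])
p = np.array([4/7, 3/7])              # solves pR = p for this R
assert np.allclose(p @ R, p)

def markov_measure(word):
    """P(U(x_1...x_n)) = p_{x_1} * R_{x_1 x_2} * ... * R_{x_{n-1} x_n}."""
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= R[a, b]
    return m

# T-invariance on a cylinder (Lemma 4.4): P(T^{-1} U(w)) sums over the
# possible first symbols of the preimage and equals P(U(w)).
w = (0, 1, 1)
pullback = sum(markov_measure((c,) + w) for c in (0, 1))
print(markov_measure(w), pullback)    # the two values agree
```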
Other necessary definitions are given below:

4.0.4.2 Period

Definition 4.6. Suppose $A \subset \Omega$. The period of $A$ (with respect to $T$) is the number $\tau(A)$:
$$\tau(A) = \min\{k \in \mathbb{N} \mid A \cap T^{-k}(A) \neq \emptyset\}.$$

4.0.4.3 Example To Illustrate Period

Let $C = \{0, 1\}$ and define $A \subset \Omega$ the usual way. Consider $A = \{1000100010001000\}$. Now, let us consider the behavior of $A$ under pullbacks of the one-step-left-shift map $T$:
$$A = \{1000100010001000\}$$
$$T^{-1}(A) = \{*1000100010001000\}$$
$$T^{-2}(A) = \{**1000100010001000\}$$
$$T^{-3}(A) = \{***1000100010001000\}$$
$$T^{-4}(A) = \{****1000100010001000\}.$$
Here $*$ represents any element of $C$, as there is no way of knowing which exact element it represents. Notice that $A \cap T^{-4}(A) \neq \emptyset$ and $T^{-4}$ is the smallest pullback (inverse iterate) that makes this true. Thus, in this case, $\tau(A) = 4$.

Note that if $A$ does not satisfy Definition 4.6, i.e., $A \cap T^{-k}(A) = \emptyset$ for all $k \in \mathbb{N}$, $A$ is said to have infinite period, $\tau(A) = \infty$. According to Professor Haydn, if $P(A) > 0$, then $\tau(A) < \infty$.

4.0.4.4 Hitting and Return Time

Definition 4.7. Given $A \subset \Omega$, the hitting time $\tau_A : \Omega \to \mathbb{N}_0 \cup \{\infty\}$ is the random variable defined as follows: for an arbitrary $x \in \Omega$,
$$\tau_A(x) = \inf\{k \geq 1 : T^k(x) \in A\}.$$
Note that the return time is the hitting time with $x \in A$.

4.0.4.5 Example To Illustrate Hitting Time

Let $C = \{0, 1\}$ and define $\Omega$ the usual way. Consider a string $x \in A$ such that $A = T^{-5}(U(a)) = T^{-5}(U(x_0 \ldots x_4))$ and $x_4 = 1$. Then, let us look at the string $y = \{0000000010\ldots\} \in \Omega$:
$$y = \{0000000010\ldots\}$$
$$T(y) = \{0000000100\ldots\}$$
$$T^2(y) = \{0000001000\ldots\}$$
$$T^3(y) = \{0000010000\ldots\}$$
$$T^4(y) = \{0000100000\ldots\}$$
$$T^5(y) = \{0001000000\ldots\}$$
Since $T^5(y) \in A$, by definition, $\tau_A(y) = 5$. Thus, the hitting time of $y$ is 5.

4.0.4.6 Example To Illustrate Return Time

Again, let $C = \{0, 1\}$ and define $\Omega$ the usual way. Consider a string $x \in A$ such that
$$A = \{\text{strings with at least one zero in positions that are multiples of seven}\}.$$
For instance, $\{0010000000\ldots\} \in A$. Let us take a string $y = \{11111111111110\ldots\} \in A$. Then:
$$y = \{11111111111110\}, \quad T(y) = \{1111111111110\}, \quad T^2(y) = \{111111111110\}, \quad T^3(y) = \{11111111110\},$$
$$T^4(y) = \{1111111110\}, \quad T^5(y) = \{111111110\}, \quad T^6(y) = \{11111110\}, \quad T^7(y) = \{1111110\}.$$
Since $T^7(y) \in A$ and $y \in A$, by definition, the return time of $y$ is 7.

4.0.4.7 The Relationship Between Return Time and Period

Lemma 4.8. Let $A \subset \Omega$. Return times before the period $\tau(A)$ are not possible.

Proof. Suppose, for contradiction, that $\tau_A(x) = j$ for some $x \in A$ and $\tau(A) = k$ with $j < k$, i.e., $\tau_A < \tau(A)$. Since $\tau(A) = k$, we have $A \cap T^{-k}(A) \neq \emptyset$ and $A \cap T^{-l}(A) = \emptyset$ for $l < k$, as $\tau(A) = \min\{k \in \mathbb{N} \mid A \cap T^{-k}(A) \neq \emptyset\}$. Now, given $\tau_A(x) = j$, we have $T^j(x) \in A$ with $x \in A$, so $x \in A \cap T^{-j}(A)$. This contradicts $A \cap T^{-j}(A) = \emptyset$ for $j < k$. Therefore, $P_A(\tau_A < \tau(A)) = 0$.

4.0.5 Remark 1

Thus, we can alternatively define the period in this way: $\tau(A) = \inf\{\tau_A(x) : x \in A\}$.

4.0.5.1 Definition of Hitting and Return Time Distributions

Definition 4.9. Consider a set $A \subset \Omega$ and the hitting time $\tau_A(x)$, $x \in \Omega$. Given $t > 0$,
$$P(\tau_A > t) = P(\{x \in \Omega : \tau_A(x) > t\})$$
is called the hitting time distribution.

Analogously, there is a corresponding definition for return time, called the return time distribution:

Definition 4.10. Consider a set $A \subset \Omega$ and the return time $\tau_A(x)$, $x \in A$. Given $t > 0$,
$$P_A(\tau_A > t) = P_A(\{x \in A : \tau_A(x) > t\})$$
is called the return time distribution.

Examples of hitting and return time distributions will be analyzed in subsequent sections. Note that hitting and return time distributions are usually described in a non-normalized setup: instead of having the hitting or return time greater than $t$, we would have it greater than $\frac{s}{\mu(B)}$, where $s$ is the time parameter, $\mu$ is the given probability measure, and $B$ is a set from the probability space $\Omega$.
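The hitting and return times above are easy to compute empirically. The sketch below (an illustration, not the thesis's code) finds $\tau_A$ for the assumed cylinder $A = U(1000)$ along a finite sample string.

```python
import random

# Empirical hitting time tau_A(x) = inf{k >= 1 : T^k(x) in A} for the
# cylinder A = U(1, 0, 0, 0), computed on a finite prefix of a string.
pattern = (1, 0, 0, 0)

def hitting_time(x):
    """First k >= 1 with T^k(x) in U(pattern); None if the prefix is too short."""
    n = len(pattern)
    for k in range(1, len(x) - n + 1):
        if tuple(x[k:k + n]) == pattern:   # T^k(x) starts at index k
            return k
    return None

random.seed(2)
x = [random.randint(0, 1) for _ in range(50)]
print("hitting time:", hitting_time(x))

# For a return time, condition the start of the string on A itself:
y = list(pattern) + [random.randint(0, 1) for _ in range(50)]
print("return time:", hitting_time(y))
```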
4.0.5.2 An Important Factor Called $\zeta_A$

Definition 4.11. For $A \subset \Omega$, define the term
$$\zeta_A = P_A(\tau_A \geq \tau(A) + 1).$$
Note that $\zeta_A$ has a particularly significant impact on the hitting and return time distributions of $n$-cylinders.

5 Distribution of Return Times

Before proving the main theorem and lemma, we must give a few more essential definitions. We start this section by introducing the concepts of stochastic process, Bernoulli process, and $\phi$-mixing. The main theorem relies on the fact that these strings are generated by random variables in a random way: we are considering a system of special strings, $n$-cylinders, in a stochastic (random) rather than a deterministic fashion. The random nature of these $n$-cylinders gives the generality we want for the theorem. The Bernoulli process is introduced to give the reader an example of a stochastic process as well as a way to transition between and connect the sections of this paper. Most importantly, we discuss $\phi$-mixing. This probabilistic condition on the $n$-cylinders is necessary because without it our $n$-cylinders would not necessarily have exponentially distributed return times; as discussed in the conclusion, that result was proven by Lacroix and Kupsa, two ergodic theorists. In addition, we briefly discuss strong and weak mixing as a way to further supplement the reader's understanding of mixing. We also discuss a probabilistic quantity, $\zeta_A$, closely related to the distributional results on $n$-cylinders in that it has a certain convergence property; these connections are investigated by a case study in a later section. Finally, we discuss and prove the main theorem and lemma on return and hitting time distributions.

5.0.5.3 Stochastic Processes

Definition 5.1. A stochastic process $\{X_m\}_{m \in \mathbb{N}_0}$ with state space $S$ is a set of random variables on the probability space $(\Omega, \mathcal{F}, P)$.

5.0.5.4 Remark

For the rest of this paper, whenever we refer to the term "process," we are considering a stochastic process.

Definition 5.2. Consider a process $\{X_m\}_{m \in \mathbb{N}_0}$ and let $\{Y_m\}_{m \in \mathbb{N}_0} = \{X_{n+m}\}_{m \in \mathbb{N}_0}$ for an arbitrary $n \in \mathbb{N}_0$. If $\{Y_m\}_{m \in \mathbb{N}_0}$ has the same distribution as $\{X_m\}_{m \in \mathbb{N}_0}$ for every $n$, then $\{X_m\}_{m \in \mathbb{N}_0}$ is what we call a stationary process.

As a consequence, given a set $A$ generated by a stationary stochastic process, the one-step-left-shift map $T$, and the corresponding measure $P$, we have $P(T^{-i}(A)) = P(A)$.

Definition 5.3. The Bernoulli process $\{X_m\}_{m \in \mathbb{N}_0}$ with state space $S$ is a set of independent random variables, each with Bernoulli distribution, on the probability space $(\Omega, \mathcal{F}, P)$. As a reminder to the reader, the Bernoulli distribution is a measure of success or failure of outcomes. To be more precise, it is a distribution taking two values from the state space $S = \{0, 1\}$ with $P(X_i = 1) = \vartheta$ and $P(X_i = 0) = 1 - \vartheta$ for some $0 < \vartheta < 1$. We call $\vartheta$ our parameter. The Bernoulli process has stationarity (which is another way of saying that it is a stationary process). The associated measure for the Bernoulli process is called the Bernoulli measure. The Bernoulli measure is a special case of the Markov measure, where the matrix $B$ in the construction is composed entirely of 1s (rather than 0s and 1s). As a side note, the Bernoulli process is discussed as an example later for a sequence of probability distributions. Notationally, since the Bernoulli measure is a special case of the Markov measure, we will use $P$ for both measures.
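To make the "special case" remark concrete, here is a small sketch (an illustration with an assumed parameter $\vartheta = 0.6$): the Bernoulli measure arises from the Markov-measure formula when every row of $R$ equals the probability vector $(1-\vartheta, \vartheta)$.

```python
# Bernoulli measure as a special case of the Markov measure on C = {0, 1}:
# take B all ones and every row of R equal to p = (1 - theta, theta).
theta = 0.6
p = [1 - theta, theta]
R = [p, p]                  # identical rows => the symbols are i.i.d.

def markov_measure(word):
    """p_{x_1} R_{x_1 x_2} ... R_{x_{n-1} x_n} for the word x_1...x_n."""
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= R[a][b]
    return m

word = (1, 1, 0, 1)
product = 1.0
for c in word:              # direct i.i.d. (Bernoulli) computation
    product *= p[c]
print(markov_measure(word), product)   # the two measures agree
```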
5.0.5.5 Types of Mixing

We will begin the discussion by defining the two major types of mixing in ergodic theory: weak and strong mixing. We will consider the general definitions for weak and strong mixing in the context of measure preserving maps rather than the specific one-step-left-shift map. By convention, we take $\Omega$ to be our probability space.

Definition 5.4. Consider a measure preserving map $T : \Omega \to \Omega$ on $(\Omega, \mathcal{F}, P)$. We say that $T$ is weak mixing if, given arbitrary $A, B \in \mathcal{F}$,
$$\frac{1}{N} \sum_{n=0}^{N-1} \big|P(T^{-n}(A) \cap B) - P(A)P(B)\big| \to 0 \quad \text{as } N \to +\infty.$$
In other terms, weak mixing means that as one follows pullbacks as $n$ increases (in other terms, the negative orbit, the set of inverse iterates of the map), the average of the difference $|P(T^{-n}(A) \cap B) - P(A)P(B)|$ over the negative orbit goes to 0: the average measure of the pullbacks of $A$ intersected with $B$ converges to the product of the measures of $A$ and $B$.

Definition 5.5. Consider a measure preserving map $T : \Omega \to \Omega$ on $(\Omega, \mathcal{F}, P)$. We say that $T$ is strong mixing if, given arbitrary $A, B \in \mathcal{F}$,
$$P(T^{-n}(A) \cap B) \to P(A)P(B) \quad \text{as } n \to +\infty.$$
As a remark, the terminology for strong mixing can differ from text to text; strong mixing has simply been referred to as mixing.

Definition 5.6. The process $\{X_m\}_{m \in \mathbb{N}_0}$ is called $\phi$-mixing if the sequence $\phi(l)$ satisfies
$$\phi(l) = \sup |P_B(C) - P(C)| \to 0 \quad \text{as } l \to \infty,$$
where the supremum is over $B$ and $C$ with $B \in \mathcal{F}_{\{0,\ldots,n\}}$, $n \in \mathbb{N}_0$, $P(B) > 0$, and $C \in \mathcal{F}_{\{m \in \mathbb{N}_0 \mid m \geq n+l+1\}}$.

In layman's terms, $\phi$-mixing means that as the "gap" between strings respectively generated from the above different $\sigma$-algebras grows, the dependence between these strings goes to 0 (in other words, the strings become independent). Note that if the process is $\phi$-mixing, the respective measure (or map) is $\phi$-mixing; this statement applies respectively to the other types of mixing, and, furthermore, its converse holds.

Lemma 5.7. If a measure $P$ is strong mixing, it is weak mixing.

Proof. The proof can be found in Pollicott's text [10].

5.0.5.6 An Example of Strong Mixing, $\phi$-Mixing, and Weak Mixing

The Markov measure is strong mixing. As stated by Haydn [4], a measure or process that is strong mixing also satisfies the other types of mixing; for instance, since the Markov measure is strong mixing, it is $\phi$-mixing and weak mixing.

Definition 5.8. A refined definition of $\zeta_A$ follows from Lemma 4.8. Given $A \subset \Omega$, let
$$\zeta_A = P_A(\tau_A \neq \tau(A)) = P_A(\tau_A > \tau(A)).$$

Notation: Let $A \in \mathcal{C}_n$. Then
$$\epsilon(A) = \inf_{0 \leq w \leq n}\big[(2n + \tau(A))\,P(A^{(w)}) + \phi(n - w)\big].$$

Notation: Let $f(A,t) = P(A)\,t\,e^{-(\zeta_A - 16\epsilon(A))P(A)t}$.

5.1 The Main Result on Return Time Distributions

The following important theorem shows that return times are approximately exponentially distributed for every value of $t$. Note that this form of the proof of Theorem 5.9 is shortened; the full-length version can be found in the Appendix.

Theorem 5.9. Let $\{X_m\}_{m \in \mathbb{N}_0}$ be a $\phi$-mixing process and $f_A = \frac{1}{2P(A)}$. Then, for all $A \in \mathcal{C}_n$, $n \in \mathbb{N}_0$, the following holds:
For $t < \tau(A)$: $P_A(\tau_A > t) - 1 = 0$.
For $\tau(A) \leq t \leq f_A$: $\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t$.
For $t > f_A$: $\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq 58\,\epsilon(A)f(A,t)$.

The theorem is split into three cases to better organize the corresponding proof.

Proof.

5.1.1 For $t < \tau(A)$

The equality follows from Lemma 4.8. Since $\tau_A$ is always at least $\tau(A)$, or, in other terms, the return time is at least the period, $t < \tau(A) \leq \tau_A$. Then $P_A(\tau_A > t) = 1$, and the equality follows.

5.1.2 For $\tau(A) \leq t \leq f_A$

The main steps of this proof are rewriting our hitting time distribution $P(\tau_A > t)$ and return time distribution $P_A(\tau_A > t)$ as products and, most importantly, applying the inequality
$$\Big|\prod a_i - \prod b_i\Big| \leq \max_i |a_i - b_i| \cdot (\#i) \cdot \max_i\{a_i, b_i\}^{\#i - 1} \quad \text{for all } 0 \leq a_i, b_i \leq 1,$$
where
the $a_i$ and $b_i$ come from the product representations of our return time distribution and of $e^{-\zeta_A P(A)(t-\tau(A))}$, which are formed below. We further bound the left side of the inequality to gain our desired final result. Now, let us consider the product representations of our return and hitting times.

5.1.2.1 Product Representation of Hitting and Return Times

We will now rewrite $P(\tau_A > t)$ and $P_A(\tau_A > t)$ as products. Let
$$p_i = \frac{P_A(\tau_A > i-1)}{P(\tau_A > i-1)}.$$
Now, write
$$P_A(\tau_A > t) = \frac{P_A(\tau_A > t)}{P(\tau_A > t)}\,P(\tau_A > t) = p_{t+1}\,P(\tau_A > t)$$
and
$$P(\tau_A > t) = P(\tau_A > t,\ \tau_A > \tau(A)) = \prod_{i=\tau(A)+1}^{t} P(\tau_A > i \mid \tau_A > i-1) \quad (1)$$
$$= \prod_{i=\tau(A)+1}^{t} \big(1 - P(\tau_A \leq i \mid \tau_A > i-1)\big) = \prod_{i=\tau(A)+1}^{t} \big(1 - P(T^{-i}(A) \mid \tau_A > i-1)\big) \quad (2)$$
$$= \prod_{i=\tau(A)+1}^{t} \big(1 - p_i P(A)\big). \quad (3)$$

Note that:
(1) follows from stationarity.
(2) is proven here: if $\tau_A \leq i$, then $T^k(x) \in A$ for some $k \leq i$, i.e., $x \in T^{-k}(A)$ for some $k \leq i$. Now, since $\tau_A > i-1$, $x \notin T^{-k}(A)$ for $k \leq i-1$. Combining these two results, $T^i(x) \in A$, so $x \in T^{-i}(A)$. The equality follows.
(3) is proven in the Appendix; it is a simple argument based on stationarity together with two facts, the inverse conditional probability formula and a special reversibility lemma, introduced there.

5.1.2.2 Bounding $\max|a_i - b_i|$

As a reminder to the reader, the product inequality is
$$\Big|\prod a_i - \prod b_i\Big| \leq \max_i |a_i - b_i| \cdot (\#i) \cdot \max\{a_i, b_i\}^{\#i-1} \quad \text{for all } 0 \leq a_i, b_i \leq 1,$$
where the $a_i$ and $b_i$ come from the product representations of our return time and of $e^{-\zeta_A P(A)(t-\tau(A))}$. Take $a_i = 1 - p_i P(A)$ and $b_i = e^{-\zeta_A P(A)}$; our choice of $a_i$ and $b_i$ will become clear later. We must now bound the term $\max|a_i - b_i|$ from the product inequality. Consider
$$|a_i - b_i| = \big|1 - p_i P(A) - e^{-\zeta_A P(A)}\big| = \big|{-(p_i P(A) - \zeta_A P(A))} + 1 - \zeta_A P(A) - e^{-\zeta_A P(A)}\big|$$
$$\leq |p_i - \zeta_A|\,P(A) + \big|1 - \zeta_A P(A) - e^{-\zeta_A P(A)}\big|.$$
Now that we have bounded $\max|a_i - b_i|$, we bound each modulus on the right-hand side individually.

5.1.2.2.1 Bounding $|p_i - \zeta_A|$

Since the term $\max|a_i - b_i|$ has $|p_i - \zeta_A|$ as part of its bound, we must bound the latter. Using Proposition 4.1(b) from [2], which gives, for all $i \geq \tau(A)$, $i \in \mathbb{N}$,
$$\big|P_A(\tau_A > i) - \zeta_A P(\tau_A > i)\big| \leq 2\epsilon(A),$$
and the fact that $P(\tau_A > i) \geq \frac{1}{2}$, since $i \leq f_A = \frac{1}{2P(A)}$, we have
$$|p_i - \zeta_A| = \left|\frac{P_A(\tau_A > i-1)}{P(\tau_A > i-1)} - \zeta_A\right| = \frac{\big|P_A(\tau_A > i-1) - \zeta_A P(\tau_A > i-1)\big|}{P(\tau_A > i-1)} \leq \frac{2\epsilon(A)}{P(\tau_A > i-1)} \leq \frac{2\epsilon(A)}{P(\tau_A > i)} \leq 4\epsilon(A).$$
The last two inequalities follow from the facts that $P(\tau_A > i-1) \geq P(\tau_A > i)$ and $P(\tau_A > i) \geq \frac{1}{2}$. Thus $|p_i - \zeta_A| \leq 4\epsilon(A)$.

5.1.2.2.2 Bounding $|1 - \zeta_A P(A) - e^{-\zeta_A P(A)}|$

Since the term $\max|a_i - b_i|$ has $|1 - \zeta_A P(A) - e^{-\zeta_A P(A)}|$ as part of its bound, we must bound the latter. Note that
$$|1 - x - e^{-x}| \leq \frac{x^2}{2} \quad (4)$$
for all $0 \leq x \leq 1$ (this range is obtained from the Taylor series representation of $e^{-x}$), as proven below. The Taylor series expansion of $e^{-x}$ is $1 - x + \frac{x^2}{2} - O(x^3)$, so for $0 \leq x \leq 1$ the alternating series bounds give $1 - x \leq e^{-x} \leq 1 - x + \frac{x^2}{2}$. Then $0 \leq -1 + x + e^{-x} \leq \frac{x^2}{2}$. By applying absolute value, (4) follows.
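Both elementary bounds just used, the product inequality and (4), can be sanity-checked numerically. The sketch below (illustrative only) tests them on random inputs.

```python
import math, random

# Numeric sanity check of the two elementary bounds used in the proof:
# (i)  |prod(a) - prod(b)| <= max|a_i - b_i| * N * max(a, b)**(N - 1)
# (ii) |1 - x - exp(-x)|  <= x**2 / 2   for 0 <= x <= 1
random.seed(0)
for _ in range(1000):
    N = random.randint(1, 50)
    a = [random.random() for _ in range(N)]
    b = [random.random() for _ in range(N)]
    lhs = abs(math.prod(a) - math.prod(b))
    rhs = max(abs(ai - bi) for ai, bi in zip(a, b)) * N * max(a + b) ** (N - 1)
    assert lhs <= rhs + 1e-12

for k in range(101):
    x = k / 100
    assert abs(1 - x - math.exp(-x)) <= x * x / 2 + 1e-12
print("both inequalities hold on all samples")
```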
Since $0 \leq \zeta_A P(A) \leq 1$ (as $0 \leq \zeta_A \leq 1$ and $0 \leq P(A) \leq 1$), we can use (4) with the substitution $x = \zeta_A P(A)$. This substitution yields
$$\big|1 - \zeta_A P(A) - e^{-\zeta_A P(A)}\big| \leq \frac{(\zeta_A P(A))^2}{2}. \quad (5)$$

5.1.2.3 A Collection of Bounds for $|1 - p_i P(A) - e^{-\zeta_A P(A)}|$

The purpose of this part of the proof is to combine all the calculations done earlier in order to bound $|1 - p_i P(A) - e^{-\zeta_A P(A)}|$. Combining (4), (5), and the inequality obtained earlier,
$$\big|1 - p_i P(A) - e^{-\zeta_A P(A)}\big| \leq |p_i - \zeta_A|P(A) + \big|1 - \zeta_A P(A) - e^{-\zeta_A P(A)}\big| \leq 4\epsilon(A)P(A) + \frac{(\zeta_A P(A))^2}{2}$$
$$= 4\epsilon(A)P(A) + \frac{(P_A(\tau_A > \tau(A)))^2 P(A)}{2}\,P(A) \leq 4\epsilon(A)P(A) + \frac{1}{2}\epsilon(A)P(A) = \frac{9}{2}\epsilon(A)P(A),$$
using $\zeta_A \leq 1$ and $P(A) \leq \epsilon(A)$. Thus,
$$\big|1 - p_i P(A) - e^{-\zeta_A P(A)}\big| \leq \frac{9}{2}\epsilon(A)P(A) \quad \text{for all } i = \tau(A)+1, \ldots, f_A.$$

5.1.3 Final Combination of Results To Prove the Desired Exponential Return Time Distribution

In order to prove our desired inequality about exponentially distributed return times, let us consider the product inequality provided earlier, the product representations of return and hitting times, and finally the bounds for the product inequality. We first prove that hitting times are exponentially distributed; then we show that, with a few adjustments, return times are exponentially distributed. Consider
$$\Big|\prod a_i - \prod b_i\Big| \leq \max|a_i - b_i| \cdot (\#i) \cdot \max\{a_i, b_i\}^{\#i-1} \quad (6)$$
for all $0 \leq a_i, b_i \leq 1$. Using the previous results and the above inequality, let $\prod a_i = \prod_{i=\tau(A)+1}^{t}(1 - p_i P(A)) = P(\tau_A > t)$ with $a_i = 1 - p_i P(A)$, and let $\prod b_i = \prod_{i=\tau(A)+1}^{t} e^{-\zeta_A P(A)} = e^{-\zeta_A P(A)(t-\tau(A))}$ with $b_i = e^{-\zeta_A P(A)}$. Note that this choice of $a_i$ and $b_i$ satisfies the condition $0 \leq a_i, b_i \leq 1$. Using (6),
$$\Big|\prod_{i=\tau(A)+1}^{t}(1 - p_i P(A)) - \prod_{i=\tau(A)+1}^{t} e^{-\zeta_A P(A)}\Big| \leq \max\big|1 - p_i P(A) - e^{-\zeta_A P(A)}\big| \cdot (t - \tau(A)) \leq \frac{9}{2}\epsilon(A)P(A)t. \quad (7)$$
Here (7) follows from the facts that $|1 - p_i P(A) - e^{-\zeta_A P(A)}| \leq \frac{9}{2}\epsilon(A)P(A)$, $t - \tau(A) \leq t$, and $\max\{1 - p_i P(A),\, e^{-\zeta_A P(A)}\}^{t-\tau(A)-1} \leq 1$ (as $1 - p_i P(A) \leq 1$ and $e^{-\zeta_A P(A)} \leq 1$). Since $\prod_{i=\tau(A)+1}^{t}(1 - p_i P(A)) = P(\tau_A > t)$ and $\prod_{i=\tau(A)+1}^{t} e^{-\zeta_A P(A)} = e^{-\zeta_A P(A)(t-\tau(A))}$,
$$\big|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t.$$
Now, the above inequality looks very similar to the inequality we want to show. First, notice that $P_A(\tau_A > t) = p_{t+1}P(\tau_A > t)$ and $\zeta_A \leq 1$. Then
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| = \big|p_{t+1}P(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big|$$
$$\leq \max\{p_{t+1}, \zeta_A\}\,\big|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \big|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}\big|,$$
as $\max\{p_{t+1}, \zeta_A\} \leq 1$. Therefore, since $|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}| \leq \frac{9}{2}\epsilon(A)P(A)t$,
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t.$$

5.1.4 For $t > f_A$

This is the longest, most technical part of the proof. For the sake of clarity, we have moved the full proof of this case to the Appendix; this shortened version highlights the important elements of the proof. We prove that return times are exponentially distributed for $t > f_A$. To prove our desired inequality for $|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}|$, we must use the following triangle inequality to break our modulus into four pieces. The first and last moduli on the right side of the triangle inequality are minor adjustments to the tail of the modulus on the left side. The second modulus is the mixing term. Finally, the third modulus is the exponential term. The second and third moduli are the most important parts of this proof because they involve the behavior of the process and our return times.

Note: the most important parts of the following triangle inequality for $|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}|$ (used to bound this case) are the exponential term (10) and the mixing term (9).
For the purposes of bounding this case, we emphasize bounding the first three terms of the right-hand side of the triangle inequality together rather than bounding the exponential and mixing terms alone. This methodology for finding the bound is chosen for simplicity. The triangle inequality for our modulus is given below. Let $t = k f_A + r$ with $k \in \mathbb{Z}^+$ and $0 \leq r \leq f_A$. Consider the inequality
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \big|P_A(\tau_A > k f_A + r) - P_A(\tau_A > k f_A)\,P(\tau_A > r)\big| \quad (8)$$
$$+ \big|P_A(\tau_A > k f_A) - P_A(\tau_A > f_A)\,P(\tau_A > f_A)^{k-1}\big|\,P(\tau_A > r) \quad (9)$$
$$+ \big|P_A(\tau_A > f_A)\,P(\tau_A > f_A)^{k-1} - \zeta_A e^{-\zeta_A \frac{k}{2}}\big|\,P(\tau_A > r) \quad (10)$$
$$+ \zeta_A e^{-\zeta_A \frac{k}{2}}\,\big|P(\tau_A > r) - e^{-\zeta_A P(A)(r-\tau(A))}\big|. \quad (11)$$
Here (11) results from a bit of algebraic manipulation and the substitution of an essential equality for $k$. Let us bound the right side of the triangle inequality componentwise.

5.1.4.1 Bounding (8)

Since (8) appears on the right side of our return time triangle inequality, we must bound this term to get our desired bound. Let us consider (8):
$$\big|P_A(\tau_A > k f_A + r) - P_A(\tau_A > k f_A)\,P(\tau_A > r)\big| \leq 2\epsilon(A)\,P_A(\tau_A > k f_A - 2n) \quad (12)$$
$$\leq 2\epsilon(A)\,\big(P(\tau_A > f_A - 2n) + \phi(n)\big)^{k-1}. \quad (13)$$
Note that (12) is exactly Proposition 4.1(a) of [2] given the conditions of this theorem. (13) follows from Lemma 4.3 of [2], which states that, given the conditions of the theorem,
$$P_A(\tau_A > k f_A - 2n) \leq \big(P(\tau_A > f_A - 2n) + \phi(n)\big)^{k-1}.$$
Thus, we have bounded (8).

5.1.4.2 Bounding (9)

Since (9) appears on the right side of our return time triangle inequality, we must bound this modulus to get our desired bound. This modulus is the mixing term because Abadi and Vergne used the $\phi$-mixing property of the process to bound it. Let us bound the modulus of (9):
$$\big|P_A(\tau_A > k f_A) - P_A(\tau_A > f_A)\,P(\tau_A > f_A)^{k-1}\big| \leq 2\epsilon(A)(k-1)\,P_A(\tau_A > f_A - 2n)\big(P(\tau_A > f_A - 2n) + \phi(n)\big)^{k-2}.$$
Given the conditions of the theorem, this bound of the modulus of (9) is exactly Proposition 4.2 of Abadi and Vergne [2].

5.1.4.3 Bounding the Combination of (8) and (9)

To bound the sum of (8) and (9), we use the calculations found earlier and, most importantly, $\phi$-mixing. Consider the sum of (8) and (9):
$$\big|P_A(\tau_A > k f_A + r) - P_A(\tau_A > k f_A)P(\tau_A > r)\big| + \big|P_A(\tau_A > k f_A) - P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1}\big|P(\tau_A > r)$$
$$\leq 2\epsilon(A)\big(P(\tau_A > f_A - 2n) + \phi(n)\big)^{k-1} + 2\epsilon(A)(k-1)P_A(\tau_A > f_A - 2n)\big(P(\tau_A > f_A - 2n) + \phi(n)\big)^{k-2}$$
$$= 2\epsilon(A)\big[P(\tau_A > f_A - 2n) + \phi(n)\big]^{k-2}\big[P(\tau_A > f_A - 2n) + \phi(n) + (k-1)P_A(\tau_A > f_A - 2n)\big]$$
$$\leq 2\epsilon(A)\big[P(\tau_A > f_A - 2n) + \phi(n)\big]^{k-2}\big[k + \phi(n)\big].$$
Note that since the process is $\phi$-mixing, $|P(\tau_A > f_A - 2n) - P_A(\tau_A > f_A - 2n)|$ is small, and $P_A(\tau_A > f_A - 2n) \leq 1$; combining these two facts proves the last inequality.

5.1.4.3.1 Bounding $P(\tau_A > f_A - 2n) + \phi(n)$

Since $P(\tau_A > f_A - 2n) + \phi(n)$ appears in the bound for (8) and (9), we bound this term. This procedure will give us the exponential factor we need for the final bound.
Let us use $|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}| \leq \frac{9}{2}\epsilon(A)P(A)t$ with $t = f_A - 2n$:
$$\big|P(\tau_A > f_A - 2n) - e^{-\zeta_A P(A)(f_A - 2n - \tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)\,[f_A - 2n].$$
This means that
$$\Big|P(\tau_A > f_A - 2n) - e^{-\zeta_A P(A)\left(\frac{1}{2P(A)} - 2n - \tau(A)\right)}\Big| \leq \frac{9}{2}\epsilon(A)P(A)\Big[\frac{1}{2P(A)} - 2n\Big] \leq \frac{9}{4}\epsilon(A).$$
A little manipulation yields, given the previous calculations,
$$\Big|P(\tau_A > f_A - 2n) - e^{-\frac{\zeta_A}{2} + \zeta_A(2n + \tau(A))P(A)}\Big| \leq \frac{9}{4}\epsilon(A).$$
Using the Mean Value Theorem, which gives $|f(b) - f(a)| \leq (b-a)f'(c)$ for some $c \in (a,b)$, with $f(x) = e^{-x}$, $a = \frac{\zeta_A}{2} - \zeta_A(2n + \tau(A))P(A)$, and $b = \frac{\zeta_A}{2}$:
$$\Big|e^{-\frac{\zeta_A}{2} + \zeta_A(2n + \tau(A))P(A)} - e^{-\frac{\zeta_A}{2}}\Big| \leq (2n + \tau(A))P(A)\,e^{(2n + \tau(A))P(A)}. \quad (14)$$
Thus,
$$\big|P(\tau_A > f_A - 2n) + \phi(n) - e^{-\frac{\zeta_A}{2}}\big| \leq 4\epsilon(A). \quad (15)$$
Note that (15) follows from the facts that $\phi(n) \leq \frac{3}{4}\epsilon(A)$ (as it is negligible) and that $(2n + \tau(A))P(A)\,e^{(2n + \tau(A))P(A)}$ is comparable to $(2n + \tau(A))P(A) + \phi(0)$, a term appearing in the exact expression for $\epsilon(A)$, hence bounded by $\epsilon(A)$. Therefore,
$$P(\tau_A > f_A - 2n) + \phi(n) \leq e^{-\frac{\zeta_A}{2}} + 4\epsilon(A).$$

5.1.4.4 Further Bounding of $2\epsilon(A)[P(\tau_A > f_A - 2n) + \phi(n)]^{k-2}[k + \phi(n)]$

Now, using that inequality,
$$2\epsilon(A)\big[P(\tau_A > f_A - 2n) + \phi(n)\big]^{k-2}\big[k + \phi(n)\big] \leq 4\epsilon(A)\,k\,\big[e^{-\frac{\zeta_A}{2}} + 4\epsilon(A)\big]^{k-2},$$
as $\phi(n)$ is negligible.

5.1.4.4.1 Bound for $e^{-\frac{\zeta_A}{2}} + 4\epsilon(A)$

Since this exponential expression appears in the bound above for the sum of (8) and (9), we bound it to get our desired inequality. To bound this term, we use a Taylor series and an equivalent form of $k$. By Taylor series expansion,
$$e^{-\frac{\zeta_A}{2}} \leq e^{-\frac{\zeta_A}{2}} + 4\epsilon(A) \leq e^{-\frac{\zeta_A}{2}} \sum_{n=0}^{\infty} \frac{(8\epsilon(A))^n}{n!} = e^{-\frac{\zeta_A}{2}}\,e^{8\epsilon(A)} = e^{-(\frac{\zeta_A}{2} - 8\epsilon(A))}.$$
Now, since $t = k f_A + r$, we have $t = \frac{k}{2P(A)} + r$, so $k = (t - r)\,2P(A)$. Thus, after a bit of algebraic manipulation with $k$,
$$e^{-\frac{\zeta_A}{2}(k-2)} = e^{-\zeta_A P(A)t + \zeta_A(P(A)r + 1)}.$$
Note that since $r \leq f_A \leq \frac{1}{2P(A)}$, $\zeta_A(P(A)r + 1) \leq \frac{3}{2}$. Hence,
$$e^{-\frac{\zeta_A}{2}(k-2)} = e^{-\zeta_A P(A)t + \zeta_A(P(A)r + 1)} \leq e^{-\zeta_A P(A)t + \frac{3}{2}}.$$
Similarly, again taking $k = (t - r)\,2P(A)$, with similar algebraic manipulations,
$$e^{-(\frac{\zeta_A}{2} - 8\epsilon(A))(k-2)} = e^{-(\zeta_A - 16\epsilon(A))P(A)t + (\zeta_A - 16\epsilon(A))(P(A)r + 1)}.$$
With a little algebraic manipulation and use of inequalities, $(\zeta_A - 16\epsilon(A))(P(A)r + 1) \leq \frac{3}{2}$ for large $n$, as $\epsilon(A)$ becomes negligible for big $n$. Thus,
$$e^{-(\frac{\zeta_A}{2} - 8\epsilon(A))(k-2)} \leq e^{-(\zeta_A - 16\epsilon(A))P(A)t + \frac{3}{2}}$$
for large enough $n$. Hence, using $k \leq 2P(A)t$ and $8e^{3/2} \leq 36$,
$$4\epsilon(A)\,k\,\big[e^{-\frac{\zeta_A}{2}} + 4\epsilon(A)\big]^{k-2} \leq 8\epsilon(A)P(A)t\,e^{\frac{3}{2}}\,e^{-(\zeta_A - 16\epsilon(A))P(A)t} \leq 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 16\epsilon(A))P(A)t}.$$
This thereby proves that the sum of (8) and (9) satisfies
$$\big|P_A(\tau_A > k f_A + r) - P_A(\tau_A > k f_A)P(\tau_A > r)\big| + \big|P_A(\tau_A > k f_A) - P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1}\big|P(\tau_A > r)$$
$$\leq 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 16\epsilon(A))P(A)t}.$$

5.1.5 Bound for (10)

This is the exponential term of our triangle inequality. We must first bound $|P(\tau_A > f_A) - e^{-\frac{\zeta_A}{2}}|$ because it arises in the product inequality for (10). Consider
$$\big|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t \quad \text{and} \quad \big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t$$
with $t = f_A$, so
$$\big|P(\tau_A > f_A) - e^{-\frac{\zeta_A}{2}}\big| \leq 3\epsilon(A).$$
Note that the preceding inequality follows from the facts that $e^{-\frac{\zeta_A}{2}} \leq 1$, that $|e^{\zeta_A P(A)\tau(A)} - 1| \leq nP(A)$ for large $n$, and that $nP(A) \leq \frac{3}{4}\epsilon(A)$ (since $nP(A)$ appears in the definition of $\epsilon(A)$).
Thus,
$$\big|P(\tau_A > f_A) - e^{-\frac{\zeta_A}{2}}\big| \leq \big|P(\tau_A > f_A) - e^{-\zeta_A P(A)(f_A - \tau(A))}\big| + e^{-\frac{\zeta_A}{2}}\big|e^{\zeta_A P(A)\tau(A)} - 1\big| \leq \frac{9}{4}\epsilon(A) + nP(A) \leq 3\epsilon(A),$$
and, similarly,
$$\big|P_A(\tau_A > f_A) - \zeta_A e^{-\frac{\zeta_A}{2}}\big| \leq \big|P_A(\tau_A > f_A) - \zeta_A e^{-\zeta_A P(A)(f_A - \tau(A))}\big| + e^{-\frac{\zeta_A}{2}}\big|e^{\zeta_A P(A)\tau(A)} - 1\big| \leq \frac{9}{4}\epsilon(A) + nP(A) \leq 3\epsilon(A).$$

5.1.5.1 Product Inequality for (10)

We now use the product inequality and our previous calculations to bound the exponential term (10). Applying $|P(\tau_A > f_A) - e^{-\frac{\zeta_A}{2}}| \leq 3\epsilon(A)$, $|P_A(\tau_A > f_A) - \zeta_A e^{-\frac{\zeta_A}{2}}| \leq 3\epsilon(A)$, and $|\prod a_i - \prod b_i| \leq \max|a_i - b_i|(\#i)\max\{a_i, b_i\}^{\#i-1}$ for all $0 \leq a_i, b_i \leq 1$, one gets for the modulus in (10):
$$\big|P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1} - \zeta_A (e^{-\frac{\zeta_A}{2}})^{k}\big| = \big|p_{t+1}P(\tau_A > f_A)^{k} - \zeta_A (e^{-\frac{\zeta_A}{2}})^{k}\big|$$
$$\leq \max\big|P(\tau_A > f_A) - e^{-\frac{\zeta_A}{2}}\big| \cdot k \cdot \max\{P(\tau_A > f_A),\, e^{-\frac{\zeta_A}{2}}\}^{k-1} \leq 3\epsilon(A)\,k\,\big(3\epsilon(A) + e^{-\frac{\zeta_A}{2}}\big)^{k-1}.$$
Now, the Taylor series representation yields
$$e^{-\frac{\zeta_A}{2}} \leq e^{-\frac{\zeta_A}{2}} + 3\epsilon(A) \leq e^{-\frac{\zeta_A}{2}} \sum_{n=0}^{\infty} \frac{(6\epsilon(A))^n}{n!} = e^{-\frac{\zeta_A}{2}}\,e^{6\epsilon(A)} = e^{-(\frac{\zeta_A}{2} - 6\epsilon(A))}.$$
Also, $k = (t - r)\,2P(A)$, and a basic substitution for $k$ leads to the equality
$$e^{-\frac{\zeta_A}{2}(k-1)} = e^{-\zeta_A P(A)t + \zeta_A P(A)r + \frac{\zeta_A}{2}}.$$
Note that $\zeta_A P(A)r \leq \frac{1}{2}$, so
$$e^{-\frac{\zeta_A}{2}(k-1)} \leq e^{-\zeta_A P(A)t + \frac{1}{2} + \frac{1}{2}} = e^{-\zeta_A P(A)t + 1}.$$
Similarly, again taking $k = (t - r)\,2P(A)$,
$$e^{-(\frac{\zeta_A}{2} - 6\epsilon(A))(k-1)} = e^{-(\zeta_A - 12\epsilon(A))P(A)t + (\zeta_A - 12\epsilon(A))(P(A)r + \frac{1}{2})}.$$
Consider that, with a little algebraic manipulation and use of inequalities, $(\zeta_A - 12\epsilon(A))(P(A)r + \frac{1}{2}) \leq 1$ for large $n$, as $\epsilon(A)$ becomes negligible for big $n$. Note that the preceding inequality is valid because $\zeta_A \leq 1$, $f_A = \frac{1}{2P(A)}$, and $P(A) \leq 1$. Thus,
$$e^{-(\frac{\zeta_A}{2} - 6\epsilon(A))(k-1)} \leq e^{-(\zeta_A - 12\epsilon(A))P(A)t + 1}$$
for large enough $n$. Hence, using $k \leq 2P(A)t$,
$$3\epsilon(A)\,k\,\big[e^{-\frac{\zeta_A}{2}} + 3\epsilon(A)\big]^{k-1} \leq 6e^{1/2}\epsilon(A)P(A)t\,e^{-(\zeta_A - 12\epsilon(A))P(A)t} \leq 14\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 12\epsilon(A))P(A)t},$$
where the negligible big-$O$ correction coming from expanding the bracket has been absorbed. This thereby proves that (10) satisfies
$$\big|P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1} - \zeta_A e^{-\zeta_A \frac{k}{2}}\big|\,P(\tau_A > r) \leq 14\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 12\epsilon(A))P(A)t},$$
as $P(\tau_A > r) \leq 1$.

5.1.6 Bounding (11)

We will now bound the last modulus, (11). Let us again consider
$$\big|P(\tau_A > t) - e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)P(A)t.$$
Now,
$$\zeta_A e^{-\zeta_A \frac{k}{2}}\,\big|P(\tau_A > r) - e^{-\zeta_A P(A)(r-\tau(A))}\big| \leq \zeta_A e^{-\zeta_A \frac{k}{2}}\Big(\frac{9}{2}\epsilon(A)P(A)r\Big) \leq \zeta_A e^{-\zeta_A \frac{k}{2}}\Big(\frac{9}{4}\epsilon(A)\Big) < \zeta_A e^{-\zeta_A \frac{k}{2}}\Big(\frac{9}{2}\epsilon(A)\Big).$$
If $r < \tau(A)$, then (11) is equivalent to
$$e^{-\zeta_A P(A)(r-\tau(A))} - 1 + P(\tau_A \leq r) \leq 2nP(A),$$
which is justified by using $1 - P(\tau_A \leq r) = P(\tau_A > r)$, the fact that $nP(A)$ appears in the expression for $\epsilon(A)$, and previous calculations. Note that $\zeta_A P(A)r \leq \zeta_A P(A)f_A = \zeta_A P(A)\frac{1}{2P(A)} = \frac{\zeta_A}{2} \leq \frac{1}{2}$. Thus, using these calculations, (11) is bounded:
$$\zeta_A e^{-\zeta_A \frac{k}{2}}\,\big|P(\tau_A > r) - e^{-\zeta_A P(A)(r-\tau(A))}\big| \leq \frac{9}{2}\epsilon(A)\,e^{-\zeta_A \frac{k}{2}} = \frac{9}{2}\epsilon(A)\,e^{-\zeta_A P(A)(t-r)} = \frac{9}{2}\epsilon(A)\,e^{-\zeta_A P(A)t + \zeta_A P(A)r}$$
$$\leq \frac{9}{2}\epsilon(A)\,e^{-\zeta_A P(A)t}\,e^{\frac{1}{2}} = \frac{9}{2}e^{\frac{1}{2}}\epsilon(A)\,e^{-\zeta_A P(A)t} \leq 8\epsilon(A)\,e^{-\zeta_A P(A)t} \leq 8\epsilon(A)P(A)t\,e^{-\zeta_A P(A)t}.$$
Note that the last inequality follows from $t \geq f_A$, as shown below: for large enough $t$, $t \geq \frac{1}{P(A)} > f_A$; then $P(A)t \geq 1$, thus $e^{-\zeta_A P(A)t} \leq P(A)t\,e^{-\zeta_A P(A)t}$, and the inequality follows.

5.1.7 Final Calculations of the Inequality

Now, we gather all of the inequalities we proved via technical manipulation, mixing properties, and exponential estimates in order to prove our final inequality. Finally,
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \big|P_A(\tau_A > k f_A + r) - P_A(\tau_A > k f_A)P(\tau_A > r)\big|$$
$$+ \big|P_A(\tau_A > k f_A) - P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1}\big|\,P(\tau_A > r)$$
$$+ \big|P_A(\tau_A > f_A)P(\tau_A > f_A)^{k-1} - \zeta_A e^{-\zeta_A \frac{k}{2}}\big|\,P(\tau_A > r) + \zeta_A e^{-\zeta_A \frac{k}{2}}\big|P(\tau_A > r) - e^{-\zeta_A P(A)(r-\tau(A))}\big|$$
$$\leq 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 16\epsilon(A))P(A)t} + 14\,\epsilon(A)P(A)t\,e^{-(\zeta_A - 12\epsilon(A))P(A)t} + 8\,\epsilon(A)P(A)t\,e^{-\zeta_A P(A)t} \leq 58\,\epsilon(A)f(A,t).$$
The final inequality is valid because $\epsilon(A)$ is an error term, so each exponential factor is at most $e^{-(\zeta_A - 16\epsilon(A))P(A)t}$ and the three constants sum to 58.
Thus, the return time distribution is exponential for $t \geq \tau(A)$.

5.2 Hitting and Return Time Distributions

In connection with the previous section, the return time distribution can give more information about the hitting time distribution and the term $\zeta_A$. The following lemma proves equivalent conditions for the distribution of hitting times, the distribution of return times, and the distribution of return times strictly greater than the period. The lemma states that the return time distribution and hitting time distribution approximately have the same exponential distribution and, furthermore, that this is equivalent to the conditional probability $\zeta_A = P_A(\tau_A > \tau(A))$ converging to 1. These equivalent results apply to sequences of strings of increasing length as long as their measure converges to 0; the decay of their measure guarantees their nice exponential distribution.

Lemma 5.10. Let the process $\{X_m\}_{m \in \mathbb{N}_0}$ be $\phi$-mixing. There exists a constant $C > 0$ such that, for all $A \in \mathcal{C}_n$, $n \in \mathbb{N}_0$, and all $t > 0$, the following conditions are equivalent:
(a) $|P_A(\tau_A > t) - e^{-P(A)t}| \leq C\epsilon(A)f(A,t)$.
(b) $|P_A(\tau_A > t) - P(\tau_A > t)| \leq C\epsilon(A)f(A,t)$.
(c) $|P(\tau_A > t) - e^{-P(A)t}| \leq C\epsilon(A)f(A,t)$.
(d) $|\zeta_A - 1| \leq C\epsilon(A)$.
Moreover, if $\{A_n\}_{n \in \mathbb{N}_0}$ is a sequence of strings such that $P(A_n) \to 0$ as $n \to \infty$, the following conditions are equivalent:
(i) The return time law of $A_n$ converges to a parameter one exponential law.
(ii) The return time law and the hitting time law of $A_n$ converge to the same law.
(iii) The hitting time law of $A_n$ converges to a parameter one exponential law.
(iv) The sequence $(\zeta_{A_n})_{n \in \mathbb{N}_0}$ converges to one.

Proof. I will call the following statement Theorem 1, as given by Abadi [4]. Note that Abadi has another paper [1], which states: given the same conditions as this lemma and $s$ as the time variable,
$$\Big|P\Big(\tau_A > \frac{s}{\zeta_A P(A)}\Big) - e^{-s}\Big| \leq \epsilon(A)e^{-s}\min\{s, 1\} \leq C\epsilon(A)e^{-s}\min\{s, 1\}$$
with $\zeta_A \in (0, \infty)$. Also, $|P(\tau_A > \frac{s}{\zeta_A P(A)}) - e^{-s}| \leq \frac{C}{3}\epsilon(A)f(A, \frac{t}{\zeta_A P(A)})$. Let us prove the relations between (a), (b), (c), and (d).

5.2.1 (a) $\Longleftrightarrow$ (d)

Let us first consider (d) implies (a).

5.2.1.1 (d) $\Longrightarrow$ (a)

We are given that $|\zeta_A - 1| \leq C\epsilon(A)$. Given Theorem 5.9, for $\tau(A) \leq t$ (as the other case has been proven in Theorem 5.9),
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq 54\,\epsilon(A)f(A,t).$$
Now, this proof comes down to proving that $|\zeta_A e^{-\zeta_A P(A)(t-\tau(A))} - e^{-P(A)t}|$ is bounded, as done below:
$$\big|\zeta_A e^{-\zeta_A P(A)(t-\tau(A))} - e^{-P(A)t}\big| \leq \epsilon(A)P(A)\,O(1) \leq \tilde{C}\epsilon(A)f(A,t).$$
The above inequality is justified by using the Mean Value Theorem on the function $e^{-x}$ and the fact that $\tau(A)P(A) \leq \epsilon(A)$ (since these quantities appear in the representation of $\epsilon(A)$). Thus, by the triangle inequality,
$$\big|P_A(\tau_A > t) - e^{-P(A)t}\big| \leq \big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| + \big|\zeta_A e^{-\zeta_A P(A)(t-\tau(A))} - e^{-P(A)t}\big|$$
$$\leq (54 + \tilde{C})\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t), \quad \text{with } 54 + \tilde{C} \leq C.$$

5.2.1.2 (a) $\Longrightarrow$ (d)

We are given that $|P_A(\tau_A > t) - e^{-P(A)t}| \leq C\epsilon(A)f(A,t)$. Theorem 5.9 states that
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq 54\,\epsilon(A)f(A,t).$$
By combining the above two inequalities,
$$\big|e^{-P(A)t} - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq \big|e^{-P(A)t} - P_A(\tau_A > t)\big| + \big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big|$$
$$\leq (54 + \hat{C})\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t).$$
Since the above quantity is bounded by an arbitrarily small value, it follows that $|\zeta_A - 1| \leq C\epsilon(A)$. Thus, this proves (a) $\Longleftrightarrow$ (d).

5.2.2 (b) $\Longrightarrow$ (a) and (b) $\Longrightarrow$ (c)

First, let us consider (b) $\Longrightarrow$ (a).

5.2.2.1 (b) $\Longrightarrow$ (a)

We are given that $|P_A(\tau_A > t) - P(\tau_A > t)| \leq \bar{C}\epsilon(A)f(A,t)$. Also, we are given that for $\tau(A) \leq t$ (again we consider this case for the same reason as last time):
$$\big|P_A(\tau_A > t) - \zeta_A e^{-\zeta_A P(A)(t-\tau(A))}\big| \leq 54\,\epsilon(A)f(A,t),$$
and we are given Theorem 1:
$$\Big|P\Big(\tau_A > \frac{s}{\zeta_A P(A)}\Big) - e^{-s}\Big| \leq \epsilon(A)e^{-s}\min\{s,1\} \leq C\epsilon(A)e^{-s}\min\{s,1\}$$
with
$\zeta_A \in (0, \infty)$. Let $t = \frac{s}{\zeta_A P(A)}$, so that $s = \zeta_A P(A)t$; then
$$\big|P(\tau_A > t) - e^{-\zeta_A P(A)t}\big| \leq \epsilon(A)e^{-\zeta_A P(A)t} \leq \frac{C}{3}\epsilon(A)f(A,t),$$
as $\min\{s,1\} \leq 1$ and for large enough $C$. Then
$$\big|P_A(\tau_A > t) - e^{-P(A)t}\big| \leq \big|P_A(\tau_A > t) - P(\tau_A > t)\big| + \big|P(\tau_A > t) - e^{-\zeta_A P(A)t}\big| + \big|e^{-\zeta_A P(A)t} - e^{-P(A)t}\big|$$
$$\leq \bar{C}\epsilon(A)f(A,t) + \epsilon(A)e^{-\zeta_A P(A)t} + (1-\zeta_A)e^{1-\zeta_A} \leq \bar{C}\epsilon(A)f(A,t) + \frac{C}{3}\epsilon(A)f(A,t) + \frac{C}{3}\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t)$$
for big enough $C$. Note that the bound $(1-\zeta_A)e^{1-\zeta_A}$ comes from applying the Mean Value Theorem to $|e^{-\zeta_A P(A)t} - e^{-P(A)t}|$.

5.2.2.2 (b) $\Longrightarrow$ (c)

We are given that $|P_A(\tau_A > t) - P(\tau_A > t)| \leq \bar{C}\epsilon(A)f(A,t)$ and what we found earlier:
$$\big|P(\tau_A > t) - e^{-\zeta_A P(A)t}\big| \leq \epsilon(A)e^{-\zeta_A P(A)t} \leq \frac{C}{3}\epsilon(A)f(A,t).$$
Combining the above two inequalities, we get
$$\big|P(\tau_A > t) - e^{-P(A)t}\big| \leq \big|P_A(\tau_A > t) - P(\tau_A > t)\big| + \big|P(\tau_A > t) - e^{-\zeta_A P(A)t}\big| \leq \bar{C}\epsilon(A)f(A,t) + \frac{C}{3}\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t).$$
Thus, this proves the parts (b) $\Longrightarrow$ (a) and (b) $\Longrightarrow$ (c). Let us now consider (a) $\Longrightarrow$ (b) and (a) $\Longrightarrow$ (c). First, consider (a) $\Longrightarrow$ (b).

5.2.2.3 (a) $\Longrightarrow$ (b)

We are given that $|P_A(\tau_A > t) - e^{-P(A)t}| \leq \dot{C}\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t)$. By the triangle inequality,
$$\big|P_A(\tau_A > t) - P(\tau_A > t)\big| \leq \big|P_A(\tau_A > t) - e^{-P(A)t}\big| + \big|e^{-P(A)t} - e^{-\zeta_A P(A)t}\big| + \big|e^{-\zeta_A P(A)t} - P(\tau_A > t)\big|$$
$$\leq \dot{C}\epsilon(A)f(A,t) + (1-\zeta_A)e^{1-\zeta_A} + \epsilon(A)f(A,t) \leq \dot{C}\epsilon(A)f(A,t) + \frac{C}{3}\epsilon(A)f(A,t) + \epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t).$$
Now, let us consider (a) $\Longrightarrow$ (c).

5.2.2.4 (a) $\Longrightarrow$ (c)

We are given that $|P_A(\tau_A > t) - e^{-P(A)t}| \leq \dot{C}\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t)$. Again, by the triangle inequality,
$$\big|P(\tau_A > t) - e^{-P(A)t}\big| \leq \big|P(\tau_A > t) - e^{-\zeta_A P(A)t}\big| + \big|e^{-\zeta_A P(A)t} - e^{-P(A)t}\big|$$
$$\leq \epsilon(A)f(A,t) + (1-\zeta_A)e^{1-\zeta_A} \leq \epsilon(A)f(A,t) + \frac{C}{3}\epsilon(A)f(A,t) \leq C\epsilon(A)f(A,t).$$
Thus, we have proven (a) $\Longrightarrow$ (b) and (a) $\Longrightarrow$ (c). Considering that this was the last relation to prove, we have proven the first part of the lemma.

5.2.3 Equivalent Conditions for Sequences of Strings

The final part of the lemma is a result about sequences of strings. This result clearly follows from the first part of the lemma, as shown below. Given $\{A_n\}_{n \in \mathbb{N}}$, a sequence of strings of arbitrary length such that $P(A_n) \to 0$ as $n \to \infty$, we have $f(A_n, t) \to 0$ as $n \to \infty$, since
$$f(A_n, t) = P(A_n)\,t\,e^{-(\zeta_{A_n} - 16\epsilon(A_n))P(A_n)t} \to 0 \quad \text{as } P(A_n) \to 0 \text{ when } n \to \infty.$$
Thus, substitute $A_n$ into the previous part of the lemma and use $f(A_n, t) \to 0$ to get the desired result for these strings of arbitrary length.

6 Defining a Special Probability Sequence $\lambda_l(A)$ and More Definitions

Before starting with the example, we will give motivation for its purpose. We first introduce sojourn time. Then, a probability sequence related to sojourn time, as well as some necessary concepts, such as a Markov chain, are introduced. The significance of the probability sequence is not realized until the next section: in Section 7, we discuss a specific Markov chain and measure and, furthermore, apply the probability sequence to this scenario. We also briefly explain, in that section, the connections of the main results, Theorem 5.9 and Lemma 5.10, to the example.

6.0.3.1 Sojourn Time

Definition 6.1. Suppose $A \subset \Omega$. The sojourn time on $A$ is the random variable $S_A : \Omega \to \mathbb{N} \cup \{\infty\}$ given by
$$S_A(x) = \sup\{k \in \mathbb{N} \mid x \in A \cap T^{-j\tau(A)}(A),\ j = 1, \ldots, k\},$$
with $S_A(x) = 0$ when the sup is taken over the empty set.

6.0.4 Remark 2

Since the sojourn time is defined by $S_A(x)$, let us consider the sojourn time of strings $x \in A$. By definition, the sojourn time is the largest time, i.e., the biggest $j$, such that $x \in A \cap T^{-j\tau(A)}(A)$. Equivalently, the sojourn time is the highest iterate $j$ (or time) at which a string $x$ still belongs to $A$. In other terms, the sojourn time is the time at which a string leaves the set.
This is shown below: let $j$ be the sojourn time on the set $A$. By definition, this means that $x \in A \cap T^{-j\tau(A)}(A)$, which is equivalent to: given $x \in A$, $T^{j\tau(A)}(x) \in A$. Since $j$ is the largest power such that an iterate by a multiple of the period returns to $A$, the string $x$ cannot return to $A$ at any subsequent multiple of the period. Mathematically, this is equivalent to $x \notin T^{-k\tau(A)}(A)$ for every period multiple $k > j$.

6.0.4.1 Example of Sojourn Time

Again, let $C = \{0, 1\}$ and define $\Omega$ the usual way. Consider a string $x \in A$ such that
$$A = \{\text{strings with zero in positions that are multiples of } 7\}.$$
Let $y = \{0010000000\ldots\}$; for instance, $U(\{0010000000\}) \subset A$. Let us take the set $B = U(\{11111101111110\}) \subset A$. Then:
$$B = U(\{11111101111110\}), \quad T(B) = U(\{1111101111110\}), \quad T^2(B) = U(\{111101111110\}),$$
$$T^3(B) = U(\{11101111110\}), \quad T^4(B) = U(\{1101111110\}), \quad T^5(B) = U(\{101111110\}),$$
$$T^6(B) = U(\{01111110\}), \quad T^7(B) = U(\{1111110\}).$$
As usual, $*$ stands for an element of $C$. Since $T^7(B) \subset A$, $B \subset A$, and 7 is the largest iterate multiple of the period $\tau(A)$ of $T$ such that $T^7(B) \subset A$, by definition the sojourn time is $S_A(x) = 1$. This is a good example of sojourn time because after one multiple of the period $\tau(A) = 7$, under the left-shift map $T$, the $n$-cylinder $B$ (generated by the string $y$) clearly leaves the set $A$.

6.0.4.2 Intuitive Meaning of Sojourn Time

The preceding example shows us that the sojourn time tells us when a string leaves (under the given map) its original set. In a sense, sojourn time is a form of measurement of invariance.

6.0.4.3 Continuity Property for the Measure $P$

A continuity property for $P$ conditioned on specific pullbacks is given below:

Definition 6.2. For all $A \in \mathcal{C}_n$, we have the probability sequence $(\lambda_i(A))_{i \in \mathbb{N}}$ given by
$$\lambda_i(A) = P\Big(A \,\Big|\, \bigcap_{j=1}^{i} T^{-j\tau(A)}(A)\Big).$$
If the limit $\lim_{i \to \infty} \lambda_i(A)$ exists, its value is written $\lambda(A)$. In general, the preceding equality is equivalent to
$$\lambda_i(A) = \frac{P\big(A \cap \bigcap_{j=1}^{i} T^{-j\tau(A)}(A)\big)}{P\big(\bigcap_{j=1}^{i} T^{-j\tau(A)}(A)\big)}.$$

6.0.5 Remark 3

The sequence $\lambda_i(A) = P(A \mid \bigcap_{j=1}^{i} T^{-j\tau(A)}(A))$ is the measure of all strings $x$ such that $x \in T^{-i\tau(A)}(A)$ and $x \in A$. It is a measure of all $x \in A$ that create the sojourn time, but the sequence measures much more than those elements: it is a measure of all $x \in A$ given that $x$ lies in all backward iterates by multiples of the period, not just the highest period multiple $i\tau(A)$, of $T$. Essentially, it is a measure of the elements of $A$ that create the hitting time for each value $j < i$ such that $i, j \in \mathbb{N}$.

Notation: An $n$-string $A = \{X_{\{0,\ldots,n-1\}} = 1\} = \{X_0 = 1, \ldots, X_{n-1} = 1\}$.

6.0.5.1 Example of $\lambda_i(A)$ Applied to a Probability Distribution

Given an independent and identically distributed Bernoulli process $(X_i)_{i \in \mathbb{N}_0}$ with parameter $0 < \vartheta = P(X_i = 1) = 1 - P(X_i = 0)$ for every $i$, and the $n$-string $A = \{X_{\{0,\ldots,n-1\}} = 1\}$, we have
$$\lambda_i(A) = P\Big(A \,\Big|\, \bigcap_{j=1}^{i} T^{-j\tau(A)}(A)\Big) = P\Big(\{X_{\{0,\ldots,n-1\}} = 1\} \,\Big|\, \bigcap_{j=1}^{i} T^{-j\tau(A)}\big(\{X_{\{0,\ldots,n-1\}} = 1\}\big)\Big).$$
Again, we will use $*$ to denote unknown elements that come from $S = \{0, 1\}$. Note that since $A = \{X_{\{0,\ldots,n-1\}} = 1\}$, we have $\tau(A) = 1$, as shown below:
$$A = \{X_0 = 1, \ldots, X_{n-1} = 1\}, \qquad T^{-1}(A) = \{*, X_1 = 1, \ldots, X_n = 1\}.$$
Since the constraints of $T^{-1}(A)$ on coordinates 1 through $n-1$ coincide with those of $A$, $A \cap T^{-1}(A) \neq \emptyset$. Hence, our desired result follows. Now, let us consider all pullbacks of $A$ up to the power $i$:
$$A = \{X_0 = 1, \ldots, X_{n-1} = 1\}$$
$$T^{-1}(A) = \{*, X_1 = 1, \ldots, X_n = 1\}$$
$$T^{-2}(A) = \{*, *, X_2 = 1, \ldots, X_{n+1} = 1\}$$
$$\vdots$$
6.0.4.3 Continuity Property for the Measure P

A continuity property for $P$, conditioned on specific pullbacks, is given below.

Definition 6.2. For all $A\in\mathcal{C}_n$, define the probability sequence $(\lambda_i(A))_{i\in\mathbb{N}}$ by
\[ \lambda_i(A) = P\Big(A \,\Big|\, \bigcap_{j=1}^{i} T^{-j\tau(A)}(A)\Big). \]
If the limit $\lim_{i\to\infty}\lambda_i(A)$ has a value, it is written $\lambda(A)$. In general, the preceding equality is equivalent to
\[ \lambda_i(A) = \frac{P\big(A\cap\bigcap_{j=1}^{i}T^{-j\tau(A)}(A)\big)}{P\big(\bigcap_{j=1}^{i}T^{-j\tau(A)}(A)\big)}. \]

6.0.5 Remark 3

The sequence $\lambda_i(A)=P(A\mid\bigcap_{j=1}^{i}T^{-j\tau(A)}(A))$ is the conditional measure of the strings $x\in A$ that also lie in every backward iterate $T^{-j\tau(A)}(A)$, $j=1,\dots,i$ — not only in the highest period multiple $T^{-i\tau(A)}(A)$. It measures the elements of $A$ that create the sojourn time, but it measures more than those elements: essentially, it measures the elements of $A$ that create the hitting time for each value $j<i$ with $i,j\in\mathbb{N}$.

Notation: an $n$-string is written $A=\{X_{\{0,\dots,n-1\}}=1\}=\{X_0=1,\dots,X_{n-1}=1\}$.

6.0.5.1 Example of $\lambda_i(A)$ Applied to a Probability Distribution

Given an independent and identically distributed Bernoulli process $(X_i)_{i\in\mathbb{N}_0}$ with parameter $0<\theta=P(X_i=1)=1-P(X_i=0)$ for every $i$, and the $n$-string $A=\{X_{\{0,\dots,n-1\}}=1\}$, we have
\[ \lambda_i(A) = P\Big(\{X_{\{0,\dots,n-1\}}=1\}\,\Big|\,\bigcap_{j=1}^{i} T^{-j\tau(A)}\{X_{\{0,\dots,n-1\}}=1\}\Big). \]
Again, $*$ denotes an unknown symbol from $S=\{0,1\}$. Note first that $\tau(A)=1$:
\[ A=\{X_0=1,\dots,X_{n-1}=1\},\qquad T^{-1}(A)=\{*,X_0=1,\dots,X_{n-1}=1\}, \]
and since the overlapping symbols of $A$ and $T^{-1}(A)$ coincide (all ones), $A\cap T^{-1}(A)\ne\emptyset$; hence the period is 1.

Now consider all pullbacks of $A$ up to the power $i$: $T^{-j}(A)$ consists of the strings with $j$ arbitrary symbols followed by $n$ ones, i.e. $T^{-j}(A)=\{X_j=1,\dots,X_{j+n-1}=1\}$. Taking intersections (each intersection keeps the longest run of ones),
\[ A\cap\bigcap_{j=1}^{i}T^{-j}(A) = \{X_0=1,\dots,X_{n+i-1}=1\},\qquad \bigcap_{j=1}^{i}T^{-j}(A) = \{*,X_1=1,\dots,X_{n+i-1}=1\}. \]
Using this information about intersections, $\lambda_i(A)$ becomes
\[ \lambda_i(A) = \frac{P\big(A\cap\bigcap_{j=1}^{i}T^{-j}(A)\big)}{P\big(\bigcap_{j=1}^{i}T^{-j}(A)\big)} = \frac{P(X_0=1)\cdots P(X_{n+i-1}=1)}{P(X_1=1)\cdots P(X_{n+i-1}=1)} = P(X_0=1) = \theta. \]
We get an even more interesting result when we observe that $\theta = P(X_i=1) = 1-P(X_i=0) = 1-\zeta_A = 1-\zeta(A)$. Indeed $\zeta_A = P(X_i=0)$: since $\tau(A)=1$,
\[ 1-\zeta_A = 1-P_A(\tau_A>\tau(A)) = P_A(\tau_A=1), \]
and returning after one step means that the next symbol is again a one, an event of probability $\theta$; hence $\zeta_A = 1-\theta = P(X_i=0)$. Thus $\lambda_i(A)=1-\zeta(A)$, and therefore, by definition,
\[ \lambda(A) = \lim_{i\to\infty}\lambda_i(A) = \lim_{i\to\infty}\big(1-\zeta(A)\big) = 1-\zeta(A) = \theta. \]
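The identity $\lambda_i(A)=\theta$ can be checked by simulation. A small Monte Carlo sketch (sample sizes and parameters are assumptions for illustration):

```python
# A quick Monte-Carlo sketch (illustration only): estimate
# lambda_i(A) = P(A | T^{-1}A ∩ ... ∩ T^{-i}A) for the n-string of ones
# under an i.i.d. Bernoulli(theta) process, and compare with theta.
import random

def estimate_lambda_i(theta, n, i, trials=200_000, seed=0):
    rng = random.Random(seed)
    hits_cond = hits_joint = 0
    for _ in range(trials):
        x = [1 if rng.random() < theta else 0 for _ in range(n + i)]
        # conditioning event: ones in coordinates 1 .. n+i-1
        if all(x[1:]):
            hits_cond += 1
            # event A on top of it: coordinate 0 is also a one
            if x[0] == 1:
                hits_joint += 1
    return hits_joint / hits_cond

print(estimate_lambda_i(theta=0.7, n=3, i=2))  # ~0.7 = theta
```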
6.0.5.2 Other Important Definitions for the Following Example

As a note before the following definition: a Markov chain is $T$-invariant if its corresponding map $T$ is measure invariant with respect to the given measure.

Definition 6.3. A stationary, $T$-invariant Markov chain with a countable state space (in this case $\mathbb{N}_0$) is a sequence $X_0,X_1,X_2,\dots$ of random variables with values in $\mathbb{N}_0$ such that for all states $n$ and $p$ and each time $m=0,1,2,\dots$,
\[ P(X_{m+1}=p\mid X_m=n)=q_{np}, \]
where $q_{np}$ depends only on the states $p$ and $n$ and not on the time or on the preceding states $X_{m-1},X_{m-2},X_{m-3},\dots$. The $q_{np}$ are called the chain's transition probabilities; note that $0\le q_{np}\le1$. Each $q_{np}$ gives the probability of moving between elements of the state space.

Definition 6.4. A stochastic transition probability matrix is the matrix $Q$ with the $q_{np}$ as its $(n,p)$ entries, each row of $Q$ summing to 1. Note that this makes $Q\mathbb{1}=\mathbb{1}$ for the all-ones vector $\mathbb{1}$.

Definition 6.5. A square $\{0,1\}$ matrix $\tilde Q$ is irreducible if for every pair $(n,p)$ there exists an $l>0$ such that $(\tilde Q^l)_{np}>0$.

6.0.5.3 Example of an Irreducible Matrix

Consider
\[ \tilde Q = \begin{pmatrix}0&1\\1&0\end{pmatrix}. \]
Then $\tilde Q^2$ is the $2\times2$ identity matrix. Notice that $\tilde Q_{12}=1>0$, $\tilde Q_{21}=1>0$, $(\tilde Q^2)_{11}=1>0$, and $(\tilde Q^2)_{22}>0$. Since there is an $l$ for every pair $(n,p)$ with $(\tilde Q^l)_{np}>0$, $\tilde Q$ is irreducible. Note that irreducible matrices are not necessarily invertible: for instance, the $2\times2$ matrix with all entries equal to 1 is irreducible (take $l=1$) but has zero determinant.

7 Example

We provide a thorough analysis of the probability sequence $\lambda_l(A)$ (for an appropriate $A$) in this section. First, a special Markov chain is introduced: it is built from a random walk that moves stepwise on $\mathbb{N}_0$, each move of the walk carrying a transition probability. These transition probabilities, which create a stochastic transition matrix, are then normalized. We next introduce the concept of ergodicity for measures and prove that the Markov measure generated by this Markov chain is ergodic. Note that since we work with the Markov measure and a $\sigma$-algebra containing the $n$-cylinders, the important results, Theorem 5.9 and Lemma 5.10, apply. We then consider what happens to the Markov measure when its corresponding stochastic process is normalized (in other terms, any non-zero value in the state space becomes 1). Next, an analysis of the probability sequence $\lambda_i(S_n)$ — the probability sequence for the $n$-string of ones, $S_n$ — is done for this specific Markov measure. We then look at the components of $\lambda_i(S_n)$: it is given in terms of a sum of products of the $q_j$. We take a general sequence of $q_j$ (composed of subpolynomial, i.e. rational, terms of the form $\frac{1}{j^\beta}$) and consider the convergence of $\lambda_i(S_n)$; if convergence occurs, we try to calculate the value of $\lambda(S_n)$. Finally, we consider a specific probability distribution and then calculate $\lambda_i(S_n)$ as well as $\lambda(S_n)$.

7.1 A Markov Chain

Now we introduce the Markov chain and lay the foundations for calculating $\lambda_l(A)$; furthermore, we discuss the ergodic Markov measure generated by the chain. Suppose the state space is $\mathbb{N}_0$ and let $0<q_n<1$. Let $\{X_n\}_{n\in\mathbb{N}_0}$ be a Markov chain on $\mathbb{N}_0$ with transition probabilities
\[ Q(n,n+1)=q_n,\qquad Q(n,0)=1-q_n \quad (n\ge0), \]
and $Q(i,j)=0$ otherwise. The $q_n$ are the given probabilities of going from $n$ to $n+1$, where $n$ ranges over $\mathbb{N}_0$. A finite-dimensional section of the stochastic transition probability matrix $Q$ is shown below for the reader:
\[ Q = \begin{pmatrix} 1-q_0 & q_0 & 0 & \cdots & 0\\ 1-q_1 & 0 & q_1 & \cdots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots\\ 1-q_N & 0 & \cdots & 0 & q_N \end{pmatrix}. \]
If we identify each non-zero entry of $Q$ with 1 and each zero entry with 0, $Q$ becomes a $\{0,1\}$ matrix; denote this matrix by $\tilde Q$. A finite-dimensional section of $\tilde Q$ is
\[ \tilde Q = \begin{pmatrix} 1 & 1 & 0 & \cdots & 0\\ 1 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots\\ 1 & 0 & \cdots & 0 & 1 \end{pmatrix}, \]
so that $\tilde Q(n,n+1)=1$ and $\tilde Q(n,0)=1$.

Definition 7.1. Let $\Sigma=\{x\in\mathbb{N}_0^{\mathbb{N}_0} : \tilde Q(x_n,x_{n+1})=1\ \forall n\in\mathbb{N}_0\}$. The associated left shift map $T:\Sigma\to\Sigma$ is called the subshift of finite type.

Lemma 7.2. $\tilde Q$ is an irreducible $\{0,1\}$ matrix.

Proof. First note that, by construction, $\tilde Q$ is a $\{0,1\}$ matrix. Kitchens's text [6] shows that each power of $\tilde Q$ represents a step taken in the random walk. For instance, $\tilde Q^2$ has $(\tilde Q^2)(n,n+2)=1$ and $(\tilde Q^2)(n,1)=1$, since one can only go stepwise up or back to 0 in the random walk. In general, for any $l\in\mathbb{N}$, the entries of $\tilde Q^l$ include $(\tilde Q^l)(n,n+l)=1$ and $(\tilde Q^l)(n,l-1)=1$ (drop to 0 on the first step, then climb). Since the exponent $l$ may be taken arbitrarily large, for every $i$ and $j$ in $\mathbb{N}_0$ there is an $l$ with $(\tilde Q^l)_{ij}>0$; hence $\tilde Q$ is irreducible. $\square$

The motivation for $\tilde Q$ is the normalization of stochastic processes, to be explained later.
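To make the construction concrete, here is a small numeric sketch (the particular $q_n$ are an assumption chosen for illustration, not from the thesis): it builds a finite truncation of $Q$, checks that it is stochastic, and checks the irreducibility of the $\{0,1\}$ skeleton $\tilde Q$ by accumulating powers.

```python
# A sketch: finite truncation of Q(n,n+1) = q_n, Q(n,0) = 1 - q_n.
# The q_n below are an illustrative assumption; a self-loop is added at the
# last state so the truncated matrix remains stochastic.
import numpy as np

N = 6
q = np.array([0.25 + 0.5 / (n + 1) for n in range(N + 1)])  # any 0 < q_n < 1

Q = np.zeros((N + 1, N + 1))
Q[:, 0] = 1 - q
for n in range(N):
    Q[n, n + 1] = q[n]
Q[N, N] = q[N]                        # truncation artifact only

skeleton = (Q > 0).astype(int)        # the {0,1} matrix Q~
power = np.eye(N + 1, dtype=int)
reach = np.zeros_like(skeleton)
for _ in range(N + 2):                # powers 1 .. N+2 suffice here
    power = (power @ skeleton > 0).astype(int)
    reach |= power

print(np.allclose(Q.sum(axis=1), 1.0))  # True: each row of Q sums to 1
print(bool((reach > 0).all()))          # True: Q~ is irreducible
```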
Now let us again consider the non-normalized transition probability matrix $Q$. A stationary distribution for $Q$ is a left (row) eigenvector $x=[x_0,x_1,x_2,\dots]$ with $xQ=x$, i.e. satisfying
\[ (1-q_0)x_0+(1-q_1)x_1+\cdots = x_0 \qquad\text{and}\qquad x_kq_k = x_{k+1} \quad (k\in\mathbb{N}_0). \]
The left eigenvector is therefore made of the entries $x_k=x_0P_k$, where
\[ P_k = \prod_{j=0}^{k-1}q_j \qquad (P_0=1), \]
and $x_0$ is picked to make $x$ a probability vector:
\[ x_0^{-1} = \sum_{k=0}^{\infty}P_k = \sum_{k=0}^{\infty}\prod_{j=0}^{k-1}q_j < \infty. \]
Thus the probability measure $P$ from earlier, which we now identify as $\mu$, is defined only if
\[ \sum_{k=0}^{\infty}\prod_{j=0}^{k-1}q_j < \infty. \]
Given this measure, the preceding defines $\mu$ on $\Sigma$ by generators, i.e. cylinder sets: using the left eigenvector $x$ and the transition matrix found earlier, for an admissible word $a_0\cdots a_{n-1}$,
\[ \mu(U(a_0\cdots a_{n-1})) = x_{a_0}\,Q(a_0,a_1)\cdots Q(a_{n-2},a_{n-1}); \]
in particular, for the increasing word $01\cdots(n-1)$,
\[ \mu(U(01\cdots(n-1))) = x_0\,q_0\cdots q_{n-2} = x_0P_{n-1} = x_{n-1}. \]
Note that $x_0 = \big(\sum_{k=0}^{\infty}P_k\big)^{-1}$ is the normalizing term of the measure $\mu$.
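As a numerical companion (the particular $q_j$ below is an assumption chosen so that $\sum_k P_k<\infty$, namely $q_j=\frac{j+1}{j+3}$, which gives $P_k=\frac{2}{(k+1)(k+2)}$): build the truncated chain and verify $xQ\approx x$.

```python
# A sketch with a concrete admissible choice (assumption, not from the
# thesis): q_j = (j+1)/(j+3), so P_k = 2/((k+1)(k+2)) and sum_k P_k = 2.
import numpy as np

K = 2000
j = np.arange(K)
q = (j + 1.0) / (j + 3.0)

P = np.concatenate(([1.0], np.cumprod(q)))   # P_0, ..., P_K
x = P / P.sum()                              # x_k = x_0 P_k

Q = np.zeros((K + 1, K + 1))
Q[:-1, 0] = 1 - q                            # last row left empty: truncation
for n in range(K):
    Q[n, n + 1] = q[n]

resid = np.abs(x @ Q - x)
print(resid[1:].max())   # ~0: x_{k+1} = x_k q_k holds exactly
print(resid[0])          # tiny: only the truncated tail of column 0 is missing
```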
7.1.1 Ergodicity of the Measure

Now one can analyze properties of this measure.

Definition 7.3. Suppose $(\Omega,\mathcal F,\mu)$ is a probability space ($\mathcal F$ the $\sigma$-algebra defined earlier) and $T:\Omega\to\Omega$ is a map that preserves the measure. $\mu$ is ergodic if for each $T$-invariant set $S\in\mathcal F$, $\mu(S)$ is 0 or 1.

Theorem 7.4. The definition of ergodicity leads to these results:
1. $\mu$ is ergodic if and only if every $T$-invariant measurable function is constant (a.e.).
2. If $\mu$ is ergodic and $f\in L^1$, then $\frac1n\sum_{j=0}^{n-1}f(T^j)\to\mu(f)$ almost surely.
3. $\mu$ is ergodic if and only if for all $U,V\in\mathcal F$ one has $\frac1n\sum_{j=0}^{n-1}\mu(U\cap T^{-j}(V))\to\mu(U)\mu(V)$. (The converse direction is the condition for weak mixing.)
4. $\mu$ is ergodic if and only if for all $A\in\mathcal F$ with $\mu(A)>0$ one has $\bigcup_{j=0}^{\infty}T^{-j}(A)=\Omega$ up to nullsets.
5. $\mu$ is ergodic if and only if for all $U,V\in\mathcal F$ with $\mu(U),\mu(V)>0$ there exists a $j$ with $\mu(U\cap T^{-j}(V))>0$.

Proof. A proof can be found in [5].

Lemma 7.5. The measure $\mu$ defined above is ergodic.

Proof. Typically, proving that a measure is ergodic involves proving that the generators of the $\sigma$-algebra in the given measure space $(\Omega,\mathcal F,\mu)$ satisfy property three or five of the preceding theorem; the other properties can be used as well, depending on the scenario. In this case, since one has a subshift of finite type with respect to the matrix $\tilde Q$, one uses property three. Since cylinder sets generate the $\sigma$-algebra, it suffices to prove property three for cylinder sets: for cylinders $W$ and $V$,
\[ \frac1N\sum_{j=0}^{N-1}\mu(W\cap T^{-j}(V)) \to \mu(W)\mu(V) \]
as $N\to\infty$. Let $W=U(x_0\cdots x_{n-1})$ and $V=U(y_0\cdots y_{m-1})$; then for $j\ge n$,
\[ W\cap T^{-j}(V) = \bigcup_{z_1\cdots z_{j-n}} U(x_0\cdots x_{n-1}z_1\cdots z_{j-n}y_0\cdots y_{m-1}), \]
where the union is over all words $z_1\cdots z_{j-n}$ of length $j-n$ which allow for transitions, $\tilde Q_{x_{n-1}z_1}=\cdots=\tilde Q_{z_{j-n}y_0}=1$. Thus
\[ \mu(W\cap T^{-j}(V)) = \sum_{z_1\cdots z_{j-n}}\mu\big(U(x_0\cdots x_{n-1}z_1\cdots z_{j-n}y_0\cdots y_{m-1})\big) = x_{x_0}q_{x_0}\cdots q_{x_{n-2}}\,(Q^{j-n})_{x_{n-1}y_0}\,q_{y_0}\cdots q_{y_{m-2}}, \]
and consequently
\[ \frac1N\sum_{j=0}^{N-1}\mu(W\cap T^{-j}(V)) = x_{x_0}q_{x_0}\cdots q_{x_{n-2}}\Big(\frac1N\sum_{j=n}^{N-1}Q^{j-n}\Big)_{x_{n-1}y_0} q_{y_0}\cdots q_{y_{m-2}}, \]
which converges to $\mu(V)\mu(W)$ as $N\to\infty$, since the averaged term in brackets converges (by Lemmas 19 and 20 of Haydn) to the stationary weight $x_{y_0}$, and the first $n$ terms in the averaging do not affect the limit. $\square$

7.2 Normalized Case of Stochastic Processes

We now normalize the elements of our random walk, i.e. of the state space. After normalizing, we consider an induced measure and the effect this normalization has on the $n$-string $A=\{11\cdots1\}$ when applied to $\lambda_i(S_n)$; furthermore, we find an expression for the limit of $\lambda_i(S_n)$. Consider the stochastic process
\[ Y_n = \begin{cases}0, & X_n=0,\\ 1, & X_n\ne0.\end{cases} \]
Let $\Sigma\subset\mathbb{N}_0^{\mathbb{N}_0}$ as before and $\bar\Sigma=\{0,1\}^{\mathbb{N}_0}$. Notice that the projection map $\pi:\Sigma\to\bar\Sigma$ maps $X_n$ to $Y_n$. Consider the string $\{0\underbrace{1\cdots1}_{j}\}$ (a zero followed by $j$ consecutive ones). Given the projection map, $\pi(\{0123\cdots j\})=\{0\underbrace{1\cdots1}_{j}\}$; one can also take the inverse projection, so conversely $\pi^{-1}(\{0\underbrace{1\cdots1}_{j}\})=\{0123\cdots j\}$. The measure $\mu$ induces a measure on $\bar\Sigma$: the projection map induces $\nu := \mu\circ\pi^{-1}$, so that for a string $A$, $\nu(A)=\mu(\pi^{-1}(A))$.

Consider $S_n=\{Y_{\{0,\dots,n-1\}}=1\}$, the case that every $Y_j$, $j=0,\dots,n-1$, is exactly equal to 1; thus $S_n\subset\bar\Sigma$. This means the complement of $S_n$ is
\[ S_n^c = \bigcup_{j=0}^{n-1}V_{n,j} = \{\exists\,j\in\{0,\dots,n-1\}\text{ s.t. } Y_j=0\},\qquad V_{n,j} = T^{-(n-1-j)}\big\{Y_{\{n-1-j,\dots,n-1\}}=\{0\underbrace{1\cdots1}_{j}\}\big\}, \]
with the $V_{n,j}$ distinct and pairwise disjoint by construction. One can consider the measure of $V_{n,j}$: this set lives on $\bar\Sigma$, so its measure is $\nu$, and
\[ \nu(V_{n,j}) = \nu(\{0\underbrace{1\cdots1}_{j}\}) = \mu\big(\pi^{-1}\{0\underbrace{1\cdots1}_{j}\}\big) = \mu(\{0123\cdots j\}) = x_0P_j, \]
as explained above; the last equalities follow from $\pi(\{0123\cdots j\})=\{0\underbrace{1\cdots1}_{j}\}$, which implies $\pi^{-1}(\{0\underbrace{1\cdots1}_{j}\})=\{0123\cdots j\}$, so the measures of these sets agree. Now the calculation of the measure of $S_n$:
\[ \nu(S_n) = 1-\nu(S_n^c) = 1-\sum_{j=0}^{n-1}\nu(V_{n,j}) = 1-\frac{\sum_{k=0}^{n-1}P_k}{\sum_{j=0}^{\infty}P_j} = \frac{1}{\sum_{j=0}^{\infty}P_j}\Big[\sum_{k=0}^{\infty}P_k-\sum_{k=0}^{n-1}P_k\Big]. \]
Finally, $\nu(S_n)=\frac{\sum_{k=n}^{\infty}P_k}{\sum_{j=0}^{\infty}P_j}$. Now one can calculate the quantity $\lambda_i(S_n)$:
\[ \lambda_i(S_n) = \frac{\nu\big(S_n\cap\bigcap_{j=1}^{i}T^{-j}S_n\big)}{\nu\big(\bigcap_{j=1}^{i}T^{-j}S_n\big)} = \frac{\nu(S_{n+i})}{\nu(T^{-1}S_{n+i-1})} = \frac{\nu(S_{n+i})}{\nu(S_{n+i-1})} \]
due to the assumed invariance of the probability measure. Note that the smallest set — i.e. the string with the most ones — is selected when intersections are taken. Hence
\[ \lambda_i(S_n) = \frac{\sum_{k=n+i}^{\infty}P_k}{\sum_{k=n+i-1}^{\infty}P_k} = 1-\frac{P_{n+i-1}}{\sum_{k=n+i-1}^{\infty}P_k}. \]
Now let $l=n+i-1$. This substitution yields
\[ \lim_{i\to\infty}\lambda_i(S_n) = 1-\lim_{l\to\infty}\frac{P_l}{\sum_{k=l}^{\infty}P_k} = 1-\lim_{l\to\infty}\frac{1}{\sum_{k=l}^{\infty}\frac{P_k}{P_l}}. \]
A short calculation simplifies this expression: $P_k=\prod_{j=0}^{k-1}q_j$ implies
\[ \frac{P_k}{P_l}=\frac{\prod_{j=0}^{k-1}q_j}{\prod_{j=0}^{l-1}q_j}=\prod_{j=l}^{k-1}q_j, \]
which thereby proves that
\[ \lim_{i\to\infty}\lambda_i(S_n) = 1-\frac{1}{\lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j}. \]

7.3 Existence of Probability Measure

In the previous section we defined the probability sequence $\lambda_l(S_n)$. One question to ask: does $\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j$ converge for every $l\in\mathbb{N}$? A quick first analysis shows that the sum converges whenever the products $\prod_{j=l}^{k-1}q_j$ decay fast enough in $k$ (for instance by a $p$-test). In general, $\sum_{k=l}^{\infty}P_k/P_l=\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j$ converges for various choices of $q_j$, depending on whether the sum behaves like a common geometric sum or is finite under the integral test. For that reason, we consider specific distributions and specific $q_j$ to analyze the convergence of $\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j$ and of $\lambda_l(S_n)$; this case-by-case approach is what we use to determine the convergence of this measure-theoretic sum.
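Since $\lambda_l(S_n)$ depends on the $q_j$ only through $1-P_l/\sum_{k\ge l}P_k$, it is easy to evaluate numerically. A small sketch (the choice of $q_j$ is an assumption for illustration; for this particular $q_j$ one can check the closed form $\lambda_l(S_n)=1-\frac{1}{l+2}$):

```python
# A sketch (hypothetical q_j): numerically evaluate
# lambda_l(S_n) = 1 - P_l / sum_{k>=l} P_k, with P_k = prod_{j<k} q_j,
# truncating the tail sum at a large cutoff.
import numpy as np

def lambda_l(q, l, cutoff=100_000):
    # q: function j -> q_j;  cumprod gives P_k / P_l for k = l+1, l+2, ...
    ratios = np.cumprod([q(j) for j in range(l, cutoff)])
    tail = 1.0 + ratios.sum()        # sum_{k=l}^{cutoff} P_k / P_l
    return 1.0 - 1.0 / tail

q = lambda j: (j + 1.0) / (j + 3.0)  # same toy choice as before
print([round(lambda_l(q, l), 4) for l in (1, 5, 50, 500)])
# -> 0.6667, 0.8571, 0.9808, 0.998  (i.e. 1 - 1/(l+2))
```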
7.4 Special Cases

Let us consider the behavior of $\lambda_i(S_n)$ for the special case of $q_j$ where
\[ q_j = 1-\frac{c_j}{j^{\beta}} \]
for $\beta>0$ and $c_j=c$ for some $c\in(0,1)$ and all $j>N$, $N\in\mathbb{N}$. (Appropriate choices of the auxiliary constants $\tilde c_j$ are made in the following calculations.) To study this situation, we consider $\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j=\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}\big(1-\frac{c_j}{j^{\beta}}\big)$ for varying values of $\beta$.

7.4.1 Case $\beta=1$

For this case, one can find an upper and a lower bound for the product. Upper bound:
\[ \prod_{j=l}^{k-1}q_j = \prod_{j=l}^{k-1}\Big(1-\frac{c}{j}\Big) = \exp\sum_{j=l}^{k-1}\log\Big(1-\frac{c}{j}\Big) < \exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_0}{j}\Big) \]
for $j>N$. Now, by the integral test,
\[ \exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_0}{j}\Big) \le \exp\Big(-\tilde c_0\int_{l}^{k-1}\frac{dj}{j}\Big) = \exp\Big(-\tilde c_0\log\frac{k-1}{l}\Big) = \Big(\frac{l}{k-1}\Big)^{\tilde c_0}. \]
Here are the calculations for the lower bound of the product:
\[ \prod_{j=l}^{k-1}q_j = \exp\sum_{j=l}^{k-1}\log\Big(1-\frac{c}{j}\Big) > \exp\sum_{j=l}^{k-1}\Big(-\frac{\tilde c_1}{j}-O\Big(\frac{1}{j^2}\Big)\Big) \ge \bar C_1\exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_1}{j}\Big), \]
where $\bar C_1\le\exp\big(-\sum_{j=l}^{k-1}O(\frac{1}{j^2})\big)$ for $j>N$. Furthermore,
\[ \bar C_1\exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_1}{j}\Big) \ge \bar C_1\exp\Big(-\tilde c_1\log\frac{k-1}{l}\Big) = \bar C_1\Big(\frac{l}{k-1}\Big)^{\tilde c_1}. \]
The final piece of the analysis for $\beta=1$ is the estimation of $\lambda(S_n)$, done step by step below:
\[ \bar C_1\sum_{k=l}^{\infty}\Big(\frac{l}{k-1}\Big)^{\tilde c_1} < \sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j < \sum_{k=l}^{\infty}\Big(\frac{l}{k-1}\Big)^{\tilde c_0}. \tag{16} \]
Using the integral test on the left- and right-hand sides of (16), one can estimate the value of $\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j$. Notice that for any $\tilde c_0$ and $\tilde c_1$ (the choice is arbitrary, given what turns out to be the divergence of the sum),
\[ \bar C_1\sum_{k=l}^{\infty}\Big(\frac{l}{k-1}\Big)^{\tilde c_1} \ge \tilde C_1\int_{l}^{\infty}\Big(\frac{l}{k-1}\Big)^{\tilde c_1}dk = \tilde C_1\,l^{\tilde c_1}\Big[\frac{(k-1)^{1-\tilde c_1}}{1-\tilde c_1}\Big]_{l}^{\infty} = \tilde C_1\,\frac{l^{\tilde c_1}(l-1)^{1-\tilde c_1}}{\tilde c_1-1}, \]
and similarly
\[ \sum_{k=l}^{\infty}\Big(\frac{l}{k-1}\Big)^{\tilde c_0} \ge \tilde C_0\,\frac{l^{\tilde c_0}(l-1)^{1-\tilde c_0}}{\tilde c_0-1}, \]
where the constants satisfy
\[ \tilde C_1 \le \frac{\bar C_1\sum_{k=l}^{\infty}(\frac{l}{k-1})^{\tilde c_1}\,(\tilde c_1-1)}{l^{\tilde c_1}(l-1)^{1-\tilde c_1}}, \qquad \tilde C_0 \le \frac{\sum_{k=l}^{\infty}(\frac{l}{k-1})^{\tilde c_0}\,(\tilde c_0-1)}{l^{\tilde c_0}(l-1)^{1-\tilde c_0}}. \]
Finally, using the preceding estimates, (16) becomes
\[ \tilde C_1\,\frac{l^{\tilde c_1}(l-1)^{1-\tilde c_1}}{\tilde c_1-1} < \sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j < \tilde C_0\,\frac{l^{\tilde c_0}(l-1)^{1-\tilde c_0}}{\tilde c_0-1}. \]
Taking the limit as $l\to\infty$, both bounding expressions grow like a constant multiple of $l$, so
\[ \lim_{l\to\infty}\tilde C_1\,\frac{l^{\tilde c_1}(l-1)^{1-\tilde c_1}}{\tilde c_1-1} = \infty \quad\text{and}\quad \lim_{l\to\infty}\tilde C_0\,\frac{l^{\tilde c_0}(l-1)^{1-\tilde c_0}}{\tilde c_0-1} = \infty, \]
whence $\lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j=\infty$. Therefore
\[ \lambda(S_n) = \lim_{l\to\infty}\lambda_l(S_n) = 1-\frac{1}{\lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j} = 1-0 = 1. \]
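A numeric illustration of the divergence driving this case (the constants are an assumption; since the tail sum diverges, the truncated value grows with the cutoff, pushing $\lambda_l(S_n)$ toward 1):

```python
# Sketch for beta = 1, q_j = 1 - c/j with c in (0,1): lambda_l(S_n) -> 1.
# The cutoff truncates a divergent tail sum, so the printed values are
# already very close to 1 and increase further as the cutoff grows.
import numpy as np

def lambda_l_power(l, c, beta, cutoff=2_000_000):
    j = np.arange(l, cutoff, dtype=float)
    ratios = np.cumprod(1.0 - c / j ** beta)   # P_k / P_l for k = l+1, ...
    return 1.0 - 1.0 / (1.0 + ratios.sum())

for l in (10, 100, 1000):
    print(l, lambda_l_power(l, c=0.5, beta=1.0))   # all very close to 1
```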
7.4.2 Case $\beta>1$

For this case, one can again find an upper and a lower bound for the product. Upper bound:
\[ \prod_{j=l}^{k-1}q_j = \prod_{j=l}^{k-1}\Big(1-\frac{c}{j^{\beta}}\Big) = \exp\sum_{j=l}^{k-1}\log\Big(1-\frac{c}{j^{\beta}}\Big) < \exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_2}{j^{\beta}}\Big) \]
for $j>N$, and by the integral test (absorbing the factor $\frac{1}{\beta-1}$ into $\tilde c_2$),
\[ \exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_2}{j^{\beta}}\Big) \le \exp\Big(-\tilde c_2\int_{l}^{k-1}\frac{dj}{j^{\beta}}\Big) = \exp\Big(\tilde c_2\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac{1}{l}\Big)^{\beta-1}\Big)\Big). \]
Lower bound, with $\bar C_3\le\exp\big(-\sum_{j=l}^{k-1}O(\frac{c}{j^{\beta+1}})\big)$ for $j>N$:
\[ \prod_{j=l}^{k-1}q_j \ge \exp\sum_{j=l}^{k-1}\Big(-\frac{\tilde c_3}{j^{\beta}}-O\Big(\frac{c}{j^{\beta+1}}\Big)\Big) \ge \bar C_3\exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_3}{j^{\beta}}\Big) \ge \bar C_3\exp\Big(\tilde c_3\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac{1}{l}\Big)^{\beta-1}\Big)\Big). \]
The final piece of the analysis for $\beta>1$ is the estimation of $\lambda(S_n)$:
\[ \bar C_3\sum_{k=l}^{\infty}\exp\Big(\tilde c_3\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac1l\Big)^{\beta-1}\Big)\Big) < \sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j < \sum_{k=l}^{\infty}\exp\Big(\tilde c_2\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac1l\Big)^{\beta-1}\Big)\Big). \tag{17} \]
The choices of $\tilde c_3$ and $\tilde c_2$ are discussed below. One can estimate $\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j$ by using the integral test on the left- and right-hand sides of (17): comparing each sum with the corresponding integral, and inserting the factor $(k-1)^{-\beta}\,\tilde c\,(\beta-1)$ needed to integrate exactly, one obtains
\[ \bar C_3\sum_{k=l}^{\infty}\exp\Big(\tilde c_3\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac1l\Big)^{\beta-1}\Big)\Big) \ge \dot C_3\,e^{-\tilde c_3(\frac1l)^{\beta-1}}\int_{l}^{\infty}e^{\tilde c_3(\frac{1}{k-1})^{\beta-1}}dk \ge \tilde C_3\exp\Big(\tilde c_3\Big(\Big(\frac1l\Big)^{\beta-1}-\Big(\frac{1}{l-1}\Big)^{\beta-1}\Big)\Big) \]
and
\[ \sum_{k=l}^{\infty}\exp\Big(\tilde c_2\Big(\Big(\frac{1}{k-1}\Big)^{\beta-1}-\Big(\frac1l\Big)^{\beta-1}\Big)\Big) \le \tilde C_2\exp\Big(\tilde c_2\Big(\Big(\frac1l\Big)^{\beta-1}-\Big(\frac{1}{l-1}\Big)^{\beta-1}\Big)\Big), \]
where $\dot C_3,\dot C_2$ and $\tilde C_3,\tilde C_2$ are the integral-test comparison constants, defined by the corresponding ratios of sums to integrals, e.g.
\[ \dot C_3 \le \frac{\bar C_3\sum_{k=l}^{\infty}\exp\big(\tilde c_3((\frac{1}{k-1})^{\beta-1}-(\frac1l)^{\beta-1})\big)}{\int_{l}^{\infty}\exp\big(\tilde c_3((\frac{1}{k-1})^{\beta-1}-(\frac1l)^{\beta-1})\big)\,dk}, \qquad \tilde C_3 \le \frac{\dot C_3\,e^{-\tilde c_3(\frac1l)^{\beta-1}}\int_{l}^{\infty}e^{\tilde c_3(\frac{1}{k-1})^{\beta-1}}\,dk}{\exp\big(\tilde c_3((\frac1l)^{\beta-1}-(\frac{1}{l-1})^{\beta-1})\big)}, \]
and analogously for $\dot C_2$ and $\tilde C_2$. Finally, using the preceding estimates and taking the limit, (17) becomes
\[ \lim_{l\to\infty}\tilde C_3\exp\Big(\tilde c_3\Big(\Big(\frac1l\Big)^{\beta-1}-\Big(\frac{1}{l-1}\Big)^{\beta-1}\Big)\Big) \le \lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j \le \lim_{l\to\infty}\tilde C_2\exp\Big(\tilde c_2\Big(\Big(\frac1l\Big)^{\beta-1}-\Big(\frac{1}{l-1}\Big)^{\beta-1}\Big)\Big), \]
and since both exponentials tend to 1,
\[ \tilde C_3 \le \lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j \le \tilde C_2. \]
Therefore, since $\lambda(S_n)=\lim_{l\to\infty}\lambda_l(S_n)=1-\frac{1}{\lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j}$, for admissible values $\tilde c_3\le\tilde c_2$ and appropriate constants $\tilde C_3$ and $\tilde C_2$,
\[ 1-\frac{1}{\tilde C_3} \le \lambda(S_n) \le 1-\frac{1}{\tilde C_2}. \]
7.4.3 Case $0<\beta<1$

For this case, one can find upper and lower bounds for the product exactly as before. Upper bound (absorbing $\frac{1}{1-\beta}$ into $\tilde c_4$):
\[ \prod_{j=l}^{k-1}q_j < \exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_4}{j^{\beta}}\Big) \le \exp\Big(-\tilde c_4\int_{l}^{k-1}\frac{dj}{j^{\beta}}\Big) = \exp\big(-\tilde c_4\big((k-1)^{1-\beta}-l^{1-\beta}\big)\big), \]
and lower bound, with $\bar C_5\le\exp\big(-\sum_{j=l}^{k-1}O(\frac{c}{j^{\beta+1}})\big)$ for $j>N$:
\[ \prod_{j=l}^{k-1}q_j \ge \bar C_5\exp\Big(-\sum_{j=l}^{k-1}\frac{\tilde c_5}{j^{\beta}}\Big) \ge \bar C_5\exp\big(-\tilde c_5\big((k-1)^{1-\beta}-l^{1-\beta}\big)\big). \]
The final piece of the analysis for $0<\beta<1$ is the estimation of $\lambda(S_n)$:
\[ \bar C_5\sum_{k=l}^{\infty}\exp\big(-\tilde c_5\big((k-1)^{1-\beta}-l^{1-\beta}\big)\big) < \sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j < \sum_{k=l}^{\infty}\exp\big(-\tilde c_4\big((k-1)^{1-\beta}-l^{1-\beta}\big)\big). \]
One estimates both sides by the integral test; through a sly $u$-substitution (inserting the factor $\tilde c\,(1-\beta)(k-1)^{-\beta}$ so the integrand becomes exact),
\[ \bar C_5\sum_{k=l}^{\infty}\exp\big(-\tilde c_5((k-1)^{1-\beta}-l^{1-\beta})\big) \ge \dot C_5\,e^{\tilde c_5 l^{1-\beta}}\int_{l}^{\infty}e^{-\tilde c_5(k-1)^{1-\beta}}dk \ge \tilde C_5\exp\big(-\tilde c_5\big((l-1)^{1-\beta}-l^{1-\beta}\big)\big) \]
and similarly
\[ \sum_{k=l}^{\infty}\exp\big(-\tilde c_4((k-1)^{1-\beta}-l^{1-\beta})\big) \le \tilde C_4\exp\big(-\tilde c_4\big((l-1)^{1-\beta}-l^{1-\beta}\big)\big), \]
where $\dot C_5,\dot C_4,\tilde C_5,\tilde C_4$ are the integral-test comparison constants, the analogues of $\dot C_3,\dot C_2,\tilde C_3,\tilde C_2$ above. Taking the limit and using $(l-1)^{1-\beta}-l^{1-\beta}\to0$ as $l\to\infty$, both exponentials tend to 1, so
\[ \tilde C_5 \le \lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j \le \tilde C_4. \]
Therefore, since $\lambda(S_n)=\lim_{l\to\infty}\lambda_l(S_n)=1-\frac{1}{\lim_{l\to\infty}\sum_{k=l}^{\infty}\prod_{j=l}^{k-1}q_j}$, for values $\tilde c_5\le\tilde c_4$ and appropriate constants $\tilde C_5$ and $\tilde C_4$,
\[ 1-\frac{1}{\tilde C_5} \le \lambda(S_n) \le 1-\frac{1}{\tilde C_4}. \]

In summary: as long as $\beta=1$, $\lambda_l(S_n)$ converges to $\lambda(S_n)$, which equals exactly 1. Otherwise, for $\beta\ne1$, $\lambda_l(S_n)$ converges to some value $\lambda(S_n)$ with
\[ \min\Big\{1-\frac{1}{\tilde C_5},\,1-\frac{1}{\tilde C_3}\Big\} \le \lambda(S_n) \le \max\Big\{1-\frac{1}{\tilde C_4},\,1-\frac{1}{\tilde C_2}\Big\} \]
for the constants $\tilde C_2,\tilde C_3,\tilde C_4,\tilde C_5$ defined above.

7.5 A Specific Distribution

We continue our analysis of $\lambda_l(S_n)$ by calculating it for a specific distribution, which also gives an intuitive, concrete perspective on the example. Pick the $q_j$ so that
\[ \nu(S_n) = e^{-n+\gamma(n)}, \]
where $\gamma(n)$ is an arbitrary convergent sequence (converging to an arbitrary real number) chosen so that one has control over its growth, i.e. $|\gamma(l+1)-\gamma(l)|<1$ for all $l\in\mathbb{N}_0$. Then $\tau(S_n)=1$, and one can calculate $\lambda_l(S_n)$ for this specific example:
\[ \lambda_l(S_n) = \frac{\nu(S_{n+l})}{\nu(S_{n+l-1})} = \frac{\exp(-(n+l)+\gamma(n+l))}{\exp(-(n+l-1)+\gamma(n+l-1))} = e^{-1+\gamma(n+l)-\gamma(n+l-1)}. \]
Since $\gamma$ converges, its successive differences tend to 0; thus $\lim_{l\to\infty}\lambda_l(S_n)=e^{-1}$, which lies in $(0,1)$.
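A two-line numeric sketch of this example (the choice $\gamma(n)=\frac1n$ is an assumption for illustration; any convergent $\gamma$ works):

```python
# Sketch for the specific distribution: if nu(S_n) = exp(-n + gamma(n)) with
# gamma convergent (here gamma(n) = 1/n, an assumed example), then
# lambda_l(S_n) = nu(S_{n+l}) / nu(S_{n+l-1}) tends to exp(-1).
import math

def nu_S(m, gamma=lambda m: 1.0 / m):
    return math.exp(-m + gamma(m))

n = 3
for l in (1, 10, 100):
    lam = nu_S(n + l) / nu_S(n + l - 1)
    print(l, lam, math.exp(-1))   # lam -> 0.36787...
```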
8 Case Study of Return Times of n-cylinders

We now turn to an application of the main results, Theorem 5.9 and Lemma 5.10. As a refresher for the reader, here is the statement of Theorem 5.9 (restated as Theorem 8.1 below):

Theorem 8.1. Let $\{X_m\}_{m\in\mathbb{N}_0}$ be a $\phi$-mixing process and $f_A=\frac{1}{2P(A)}$. Then for all $A\in\mathcal{C}_n$, $n\in\mathbb{N}_0$, the following holds:
For $t<\tau(A)$: $\quad P_A(\tau_A>t)-1=0$.
For $\tau(A)\le t\le f_A$: $\quad |P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| \le \frac92\,\epsilon(A)P(A)\,t$.
For $t>f_A$: $\quad |P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| \le 58\,\epsilon(A)f(A,t)$.

Of course, there is a lemma that results from Theorem 5.9. As a reminder to the reader, Lemma 5.10 (called Lemma 8.2 below) states that, under certain conditions, the hitting and return time distributions of $n$-cylinders (and of sequences of $n$-cylinders) are approximately exponential, and that this is equivalent to $\zeta_A\to1$. The reader may have noticed that Lemma 5.10 is an application (or corollary) of Theorem 5.9. Theorem 5.9 and the second part of Lemma 5.10 are used in this section; for reference:

Lemma 8.2. Let the process $\{X_m\}_{m\in\mathbb{N}_0}$ be $\phi$-mixing. If $\{A_n\}_{n\in\mathbb{N}_0}$ is a sequence of strings such that $P(A_n)\to0$ as $n\to\infty$, the following conditions are equivalent:
(i) The return time law of $A_n$ converges to a parameter-one exponential law.
(ii) The return time law and the hitting time law of $A_n$ converge to the same law.
(iii) The hitting time law of $A_n$ converges to a parameter-one exponential law.
(iv) The sequence $(\zeta_{A_n})_{n\in\mathbb{N}_0}$ converges to one.

We apply the results of Theorem 5.9 and Lemma 5.10 to sequences of $n$-cylinders constructed below, considering two scenarios for the periods of a general nested sequence of $n$-cylinders. Now, let us consider a case study of the behavior of $\zeta_{A_n}$ for periodic and non-periodic $n$-cylinders.

Let $x\in\Omega$ be a string, $x=\{x_0x_1x_2\dots\}$, and set $A_n(x)=U(x_0\cdots x_{n-1})$, so each $A_n(x)$ is a cylinder set. One can observe the behavior of the period of the $n$-cylinders as $n$ increases: writing $\tau_n(x)$ for the period of $A_n(x)$,
\[ \tau_n(x) \le \tau_{n+1}(x) \le \tau_{n+2}(x) \le \cdots, \]
since $A_n(x)\supset A_{n+1}(x)\supset A_{n+2}(x)\supset\cdots$. Indeed, the word $x_0\cdots x_{n-1}$ is a prefix of $x_0\cdots x_{n-1}x_n$, so the inclusion $A_{n+1}(x)\subset A_n(x)$ follows.

The numerator of $\zeta_A$ is $P_A(\tau_A>\tau(A))$. Instead of an infinitely long string $A$, we consider a nested sequence of $n$-cylinders and analyze the limiting behavior of $\zeta_{A_n}$, using Abadi and Vergne [2] and the change of notation established earlier. There are now two cases for the period of $A_n(x)$ we must consider in order to find the limiting behavior of $\zeta_{A_n}$; these possibilities are determined by the behavior of the numerator of $\zeta_{A_n}$. The cases are $\sup_{n\in\mathbb{N}_0}\tau_n(x)<\infty$ and $\tau_n(x)\to\infty$.
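As a side illustration (not part of the argument), the period of a cylinder is easy to compute directly: $\tau(U(x_0\cdots x_{n-1}))$ is the smallest self-overlap shift of the word (and at most $n$, since for shifts $\ge n$ the overlap condition is empty). A minimal sketch showing the monotonicity just described:

```python
# A sketch: tau(A_n(x)) = smallest t >= 1 such that the word overlaps itself
# under a shift by t; for t >= n the intersection A ∩ T^{-t}A is automatic.
def period(word):
    n = len(word)
    for t in range(1, n):
        if all(word[i + t] == word[i] for i in range(n - t)):
            return t
    return n

x = "101001000100001"          # sample string with growing gaps of zeros
for n in range(2, len(x) + 1):
    print(n, period(x[:n]))    # the periods tau_n(x) are non-decreasing

print(period("10" * 8))        # an eventually periodic word: period stays 2
```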
8.1 Case 1

Let $\sup_{n\in\mathbb{N}_0}\tau_n(x)<\infty$ and put $\tau_\infty(x)=\lim_{n\to\infty}\tau_n(x)=\sup_{n\in\mathbb{N}_0}\tau_n(x)<\infty$. Then there exists $N$ such that $\tau(A_n(x))=\tau_\infty(x)$ for all $n\ge N$. By definition, this means $A_n(x)\cap T^{-\tau_\infty(x)}(A_n(x))\ne\emptyset$ for all $n\ge N$. Using the definition of $A_n$, for $m>n$ we have $A_m(x)\subset A_n(x)$, and therefore $T^{-\tau_\infty(x)}(A_m(x))\subset T^{-\tau_\infty(x)}(A_n(x))$: since $A_m(x)=U(x_0\cdots x_{n-1}x_n\cdots x_{m-1})$ and $A_n(x)=U(x_0\cdots x_{n-1})$, the word generating $A_n(x)$ is a prefix of the word generating $A_m(x)$, and preimages preserve inclusion; thus the claim follows. Notice that $P_{A_n(x)}(\tau_{A_n(x)}>\tau_\infty(x))<1$ as $n\to\infty$; thus, by definition, $\zeta_{A_n(x)}<1$. Using the results of Theorem 5.9, the distribution of return times for $n$-cylinders with finite periods is (approximately) exponential:
\[ P_{A_n(x)}(\tau_{A_n(x)}>t) \approx \zeta_{A_n(x)}\,e^{-\zeta_{A_n(x)}P(A_n(x))(t-\tau(A_n(x)))}. \]

8.2 Case 2

If $\tau_n(x)\to\infty$ as $n\to\infty$, then for all $N$ there exists $M(N)$ such that $A_n(x)\cap T^{-j}A_n(x)=\emptyset$ for all $j\le N$ and all $n\ge M(N)$.

Claim: $\zeta_n(x):=\zeta_{A_n(x)}\to1$ as $n\to\infty$.

Assume that the measure is $\phi$-mixing; note that since mixing implies ergodicity, the measure is also ergodic. Let
\[ \zeta_n = P_{A_n(x)}(\tau_{A_n(x)}>\tau_n(x)) = 1-P_{A_n(x)}(\tau_{A_n(x)}\le\tau_n(x)) = 1-P_{A_n(x)}(\tau_{A_n(x)}=\tau_n(x)), \]
the final equality being attributed to Lemma 4.8. By definition,
\[ P_{A_n(x)}(\tau_{A_n(x)}=\tau_n(x)) = \frac{P\big((\tau_{A_n(x)}=\tau_n(x))\cap A_n(x)\big)}{P(A_n(x))}. \]
Consider:
\[ P\big((\tau_{A_n(x)}=\tau_n(x))\cap A_n(x)\big) \le P\big(A_n(x)\cap T^{-\tau_n(x)}(A_n(x))\big) \le P\big(A_n(x)\cap T^{-\tau_n(x)}(A_{(\frac{\tau_n(x)}{2})\wedge n}(x))\big) \le P(A_n(x))\Big[P\big(A_{(\frac{\tau_n(x)}{2}\wedge n)}(x)\big)+\phi\big(\tfrac{\tau_n(x)}{2}\big)\Big]. \]
The preceding step inserts a gap of length $\frac{\tau_n(x)}{2}$ and uses the $\phi$-mixing property. This shows that
\[ 1-\zeta_n(x) \le P\big(A_{(\frac{\tau_n(x)}{2}\wedge n)}(x)\big)+\phi\big(\tfrac{\tau_n(x)}{2}\wedge n\big) \longrightarrow 0 \qquad (n\to\infty), \]
because as $n$ increases to $\infty$ the gap $\frac{\tau_n(x)}{2}$ increases, making both $P(A_{(\frac{\tau_n(x)}{2}\wedge n)}(x))$ and $\phi(\frac{\tau_n(x)}{2})$ converge to 0. Therefore, if $\tau_n(x)\to\infty$, then $\zeta_{A_n(x)}\to1$ as $n\to\infty$, and the limiting return time distribution of such $n$-cylinders is $e^{-t}$ by Lemma 5.10. Thus we have proven that $n$-cylinders always have an (approximately) exponential return time distribution of the form
\[ P_{A_n(x)}(\tau_{A_n(x)}>t) \approx e^{-t} \]
if their respective periods converge to infinity ($\tau_n(x)\to\infty$). Otherwise (when $\sup_n\tau_n(x)<\infty$ for the nested sequence of $n$-cylinders), like most other $n$-cylinders under the $\phi$-mixing condition, each $n$-cylinder has an exponential return time distribution of approximately the form
\[ P_{A_n(x)}(\tau_{A_n(x)}>t) \approx e^{-P(A)t}. \]

9 Appendix

For the detail-oriented reader, we include the full proof of Theorem 5.9, which we call Theorem 9.1 in this section.

Theorem 9.1. Let $\{X_m\}_{m\in\mathbb{N}_0}$ be a $\phi$-mixing process and $f_A=\frac{1}{2P(A)}$. Then for all $A\in\mathcal{C}_n$, $n\in\mathbb{N}_0$, the following holds:
For $t<\tau(A)$: $\quad P_A(\tau_A>t)-1=0$.
For $\tau(A)\le t\le f_A$: $\quad |P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| \le \frac92\,\epsilon(A)P(A)\,t$.
For $t>f_A$: $\quad |P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| \le 58\,\epsilon(A)f(A,t)$.

The theorem is split into three cases to better organize the corresponding proof.

Proof.

9.0.1 For $t<\tau(A)$

The equality follows from Lemma 4.8. Since $\tau_A$ is always at least $\tau(A)$ — in other terms, the return time is at least the period — we have $t<\tau(A)\le\tau_A$, so $P_A(\tau_A>t)=1$ and the equality follows.

9.0.2 For $\tau(A)\le t\le f_A$

The main step of this proof is to rewrite the hitting time $P(\tau_A>t)$ and the return time $P_A(\tau_A>t)$ as products and, most importantly, to apply the inequality
\[ \Big|\prod_i a_i-\prod_i b_i\Big| \le \max_i|a_i-b_i|\,(\#i)\,\max_i\{a_i,b_i\}^{\#i-1} \qquad \forall\,0\le a_i,b_i\le1, \]
where the $a_i$ and $b_i$ come from the product representations of our return time and of $e^{-\zeta_AP(A)(t-\tau(A))}$, formed below. We further bound the right-hand side to obtain the desired final result. Now, let us consider the product representations of our return and hitting times.

9.0.2.1 Product Representation of Hitting and Return Times

We now rewrite $P(\tau_A>t)$ and $P_A(\tau_A>t)$ as products. Let
\[ p_i = \frac{P_A(\tau_A>i-1)}{P(\tau_A>i-1)}. \]
Then write
\[ P_A(\tau_A>t) = \frac{P_A(\tau_A>t)}{P(\tau_A>t)}\,P(\tau_A>t) = p_{t+1}\,P(\tau_A>t) \]
and
\begin{align}
P(\tau_A>t) &= \prod_{i=\tau(A)+1}^{t} P(\tau_A>i\mid\tau_A>i-1) \tag{18}\\
&= \prod_{i=\tau(A)+1}^{t}\big(1-P(\tau_A\le i\mid\tau_A>i-1)\big) = \prod_{i=\tau(A)+1}^{t}\big(1-P(T^{-i}A\mid\tau_A>i-1)\big) \tag{19}\\
&= \prod_{i=\tau(A)+1}^{t}\big(1-p_iP(A)\big). \tag{20}
\end{align}
Note: (18) follows from stationarity (telescoping the conditional probabilities). For (19): if $\tau_A\le i$, then $T^k(x)\in A$ for some $k\le i$, i.e. $x\in T^{-k}(A)$ for some $k\le i$; since $\tau_A>i-1$, $x\notin T^{-k}(A)$ for $k\le i-1$. Combining these two facts, $T^i(x)\in A$, so $x\in T^{-i}(A)$, and the equality follows.

Before proving (20), let us introduce two facts. The inverse conditional probability (Bayes) lemma states that for two events $A$ and $B$,
\[ P(A\mid B) = \frac{P(B\mid A)\,P(A)}{P(B)}. \]
The following will be referred to as the reversibility lemma: under the same conditions,
\[ P\big(A\cap(\tau_A>i-1)\big) = P\big((\tau_A>i-1)\cap T^{-i}(A)\big). \]
Now, to prove (20), consider the term $P(T^{-i}(A)\mid\tau_A>i-1)$. Through the above lemmas and stationarity,
\begin{align*}
P(T^{-i}(A)\mid\tau_A>i-1) &= \frac{P(\tau_A>i-1\mid T^{-i}(A))\,P(T^{-i}(A))}{P(\tau_A>i-1)} = \frac{P\big((\tau_A>i-1)\cap T^{-i}(A)\big)}{P(T^{-i}(A))}\cdot\frac{P(T^{-i}(A))}{P(\tau_A>i-1)}\\
&= \frac{P\big(A\cap(\tau_A>i-1)\big)}{P(A)}\cdot\frac{P(A)}{P(\tau_A>i-1)} = \frac{P_A(\tau_A>i-1)}{P(\tau_A>i-1)}\,P(A) = p_i\,P(A),
\end{align*}
using $P(T^{-i}(A))=P(A)$ (stationarity) and the reversibility lemma. Therefore (20) follows:
\[ \prod_{i=\tau(A)+1}^{t}\big(1-P(T^{-i}(A)\mid\tau_A>i-1)\big) = \prod_{i=\tau(A)+1}^{t}\Big(1-\frac{P_A(\tau_A>i-1)}{P(\tau_A>i-1)}\,P(A)\Big) = \prod_{i=\tau(A)+1}^{t}\big(1-p_iP(A)\big). \]
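As a quick aside, the elementary product inequality invoked above is easy to sanity-check numerically; a small sketch (illustration only, random inputs):

```python
# Numeric check of |prod a_i - prod b_i| <= max|a_i - b_i| * N * m^(N-1),
# where m = max over all a_i, b_i in [0, 1].
import math
import random

rng = random.Random(1)
for _ in range(1000):
    N = rng.randint(1, 20)
    a = [rng.random() for _ in range(N)]
    b = [rng.random() for _ in range(N)]
    lhs = abs(math.prod(a) - math.prod(b))
    m = max(max(a), max(b))
    rhs = max(abs(u - v) for u, v in zip(a, b)) * N * m ** (N - 1)
    assert lhs <= rhs + 1e-12
print("inequality held on all random trials")
```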
9.0.2.2 Bounding $\max|a_i-b_i|$

As a reminder to the reader, the product inequality is
\[ \Big|\prod a_i-\prod b_i\Big| \le \max|a_i-b_i|\,(\#i)\,\max\{a_i,b_i\}^{\#i-1} \qquad \forall\,0\le a_i,b_i\le1, \]
where the $a_i$ and $b_i$ come from the product representations of our return time and of $e^{-\zeta_AP(A)(t-\tau(A))}$. Take $a_i=1-p_iP(A)$ and $b_i=e^{-\zeta_AP(A)}$; the reason for this choice will be clear later. We must now bound the term $\max|a_i-b_i|$ from the product inequality:
\begin{align*}
\max|a_i-b_i| &= |1-p_iP(A)-e^{-\zeta_AP(A)}| = |{-p_iP(A)}+\zeta_AP(A)+1-\zeta_AP(A)-e^{-\zeta_AP(A)}|\\
&\le |p_iP(A)-\zeta_AP(A)|+|1-\zeta_AP(A)-e^{-\zeta_AP(A)}| = |p_i-\zeta_A|\,P(A)+|1-\zeta_AP(A)-e^{-\zeta_AP(A)}|.
\end{align*}
Now that we have bounded $\max|a_i-b_i|$, we bound the right-hand side by considering each modulus individually.

9.0.2.2.1 Bounding $|p_i-\zeta_A|$

Since the term $\max|a_i-b_i|$ has $|p_i-\zeta_A|$ as part of its bound, we must bound the latter. Using Proposition 4.1(b) from [2] — for all $i\ge\tau(A)$,
\[ |P_A(\tau_A>i)-\zeta_A\,P(\tau_A>i)| \le 2\epsilon(A) \]
— and the fact that $P(\tau_A>i)\ge\frac12$ (since $i\le f_A=\frac{1}{2P(A)}$), we have
\[ |p_i-\zeta_A| = \Big|\frac{P_A(\tau_A>i-1)}{P(\tau_A>i-1)}-\zeta_A\Big| = \frac{|P_A(\tau_A>i-1)-\zeta_AP(\tau_A>i-1)|}{P(\tau_A>i-1)} \le \frac{2\epsilon(A)}{P(\tau_A>i-1)} \le \frac{2\epsilon(A)}{P(\tau_A>i)} \le 4\epsilon(A). \]
The last two inequalities follow from $P(\tau_A>i-1)\ge P(\tau_A>i)$ and $P(\tau_A>i)\ge\frac12$. Thus $|p_i-\zeta_A|\le4\epsilon(A)$.

9.0.2.2.2 Bounding $|1-\zeta_AP(A)-e^{-\zeta_AP(A)}|$

Since the term $\max|a_i-b_i|$ also has $|1-\zeta_AP(A)-e^{-\zeta_AP(A)}|$ as part of its bound, we must bound the latter. Note that
\[ |1-x-e^{-x}| \le \frac{x^2}{2} \tag{21} \]
for all $0\le x\le1$ (this range comes from the Taylor series of $e^{-x}$), as proven below: the Taylor expansion $e^{-x}=1-x+\frac{x^2}{2}-O(x^3)$ gives $e^{-x}\le1-x+\frac{x^2}{2}$ and $e^{-x}\ge1-x$ on $[0,1]$; hence $0\le e^{-x}-(1-x)\le\frac{x^2}{2}$, and applying absolute values gives (21). Since $0\le\zeta_AP(A)\le1$ (as $0\le\zeta_A\le1$ and $0\le P(A)\le1$), we can use (21) with the substitution $x=\zeta_AP(A)$. This substitution yields
\[ |1-\zeta_AP(A)-e^{-\zeta_AP(A)}| \le \frac{(\zeta_AP(A))^2}{2}. \tag{22} \]
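The Taylor bound (21) is equally easy to confirm numerically (a one-line sketch, illustration only):

```python
# Numeric confirmation of |1 - x - exp(-x)| <= x^2 / 2 on [0, 1].
import math
xs = [k / 1000 for k in range(1001)]
print(all(abs(1 - x - math.exp(-x)) <= x * x / 2 for x in xs))  # True
```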
9.0.2.3 A Collection of Bounds for $|1-p_iP(A)-e^{-\zeta_AP(A)}|$

The purpose of this part of the proof is to combine the calculations done earlier in order to bound $|1-p_iP(A)-e^{-\zeta_AP(A)}|$. Combining (21), (22), and the inequality obtained earlier,
\begin{align*}
|1-p_iP(A)-e^{-\zeta_AP(A)}| &\le |p_i-\zeta_A|\,P(A)+|1-\zeta_AP(A)-e^{-\zeta_AP(A)}| \le 4\epsilon(A)P(A)+\frac{(\zeta_AP(A))^2}{2}\\
&\le 4\epsilon(A)P(A)+\tfrac12\,\epsilon(A)P(A) = \tfrac92\,\epsilon(A)P(A),
\end{align*}
where we used $\zeta_A^2P(A)=(P_A(\tau_A>\tau(A)))^2P(A)\le\epsilon(A)$. Thus
\[ |1-p_iP(A)-e^{-\zeta_AP(A)}| \le \tfrac92\,\epsilon(A)P(A) \qquad \forall\, i=\tau(A)+1,\dots,f_A. \]

9.0.3 Final Combination of Results to Prove the Desired Exponential Return Time Distribution

In order to prove our desired inequality about exponentially distributed return times, let us consider the product inequality provided earlier, the product representations of return and hitting times, and finally the bounds for the product inequality. We first prove that hitting times are exponentially distributed and then show that, with a few adjustments, return times are as well. Consider
\[ \Big|\prod a_i-\prod b_i\Big| \le \max|a_i-b_i|\,(\#i)\,\max\{a_i,b_i\}^{\#i-1} \tag{23} \]
for $0\le a_i,b_i\le1$. Using the previous results, let $\prod a_i=\prod_{i=\tau(A)+1}^{t}(1-p_iP(A))=P(\tau_A>t)$ with $a_i=1-p_iP(A)$, and $\prod b_i=\prod_{i=\tau(A)+1}^{t}e^{-\zeta_AP(A)}=e^{-\zeta_AP(A)(t-\tau(A))}$ with $b_i=e^{-\zeta_AP(A)}$; note that this choice satisfies $0\le a_i,b_i\le1$. Using (23),
\[ \Big|\prod_{i=\tau(A)+1}^{t}(1-p_iP(A))-\prod_{i=\tau(A)+1}^{t}e^{-\zeta_AP(A)}\Big| \le \max_i\big|1-p_iP(A)-e^{-\zeta_AP(A)}\big|\,(t-\tau(A)) \le \tfrac92\,\epsilon(A)P(A)\,t. \tag{24} \]
Here (24) follows from $|1-p_iP(A)-e^{-\zeta_AP(A)}|\le\frac92\epsilon(A)P(A)$, from $t-\tau(A)\le t$, and from $\max\{1-p_iP(A),e^{-\zeta_AP(A)}\}^{t-\tau(A)-1}\le1$ (as $1-p_iP(A)\le1$ and $e^{-\zeta_AP(A)}\le1$). Since $\prod_{i=\tau(A)+1}^{t}(1-p_iP(A))=P(\tau_A>t)$ and $\prod_{i=\tau(A)+1}^{t}e^{-\zeta_AP(A)}=e^{-\zeta_AP(A)(t-\tau(A))}$,
\[ |P(\tau_A>t)-e^{-\zeta_AP(A)(t-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t. \]
Now, the above inequality looks very similar to the inequality we want to show. First, notice that $P_A(\tau_A>t)=p_{t+1}P(\tau_A>t)$ and $\zeta_A\le1$. Then
\begin{align*}
|P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| &= |p_{t+1}P(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}|\\
&\le \max\{p_{t+1},\zeta_A\}\,|P(\tau_A>t)-e^{-\zeta_AP(A)(t-\tau(A))}| \le |P(\tau_A>t)-e^{-\zeta_AP(A)(t-\tau(A))}|,
\end{align*}
as $\max\{p_{t+1},\zeta_A\}\le1$. Therefore, since $|P(\tau_A>t)-e^{-\zeta_AP(A)(t-\tau(A))}|\le\frac92\epsilon(A)P(A)t$,
\[ |P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t. \]

9.0.4 For $t>f_A$

This is the longest, most technical part of the proof: we prove that return times are exponentially distributed for $t>f_A$. To prove the desired inequality for $|P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}|$, we use the following triangle inequality to break our modulus into four pieces. The first and last moduli on the right side are minor adjustments to the tail of the modulus on the left side; the second modulus is the mixing term; the third modulus is the exponential term. The second and third moduli are the most important parts of this proof, because they involve the behavior of the process and of the return times. Let $t=kf_A+r$ with $k\in\mathbb{Z}^{+}$ and $0\le r\le f_A$, and consider the inequality
\begin{align}
|P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| &\le |P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)| \tag{25}\\
&\quad+|P_A(\tau_A>kf_A)-P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}|\,P(\tau_A>r) \tag{26}\\
&\quad+|P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}-\zeta_Ae^{-\frac{\zeta_Ak}{2}}|\,P(\tau_A>r) \tag{27}\\
&\quad+\zeta_Ae^{-\frac{\zeta_Ak}{2}}\,|P(\tau_A>r)-e^{-\zeta_AP(A)(r-\tau(A))}|. \tag{28}
\end{align}
Regarding (28), observe
\[ \zeta_Ae^{-\frac{\zeta_Ak}{2}}\,e^{-\zeta_AP(A)(r-\tau(A))} = \zeta_Ae^{-\zeta_AP(A)\left(k\frac{1}{2P(A)}+r-\tau(A)\right)} = \zeta_Ae^{-\zeta_AP(A)(kf_A+r-\tau(A))} = \zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}. \]
This shows why the last part of the triangle inequality takes this slightly unexpected form. Let us now bound the right side of the triangle inequality componentwise.

9.0.4.1 Bounding (25)

Since (25) sits on the right side of our return time triangle inequality, we must bound this term to get our desired bound:
\[ |P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)| \le 2\epsilon(A)\,P_A(\tau_A>kf_A-2n) \tag{29} \]
\[ \le 2\epsilon(A)\,\big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-1}. \tag{30} \]
Note that (29) is exactly Proposition 4.1(a) of [2], given the conditions of this theorem; (30) follows from Lemma 4.3 of [2], which states that, under the same conditions,
\[ P_A(\tau_A>kf_A-2n) \le \big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-1}. \]
Thus we have bounded (25) as
\[ |P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)| \le 2\epsilon(A)\big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-1}. \]

9.0.4.2 Bounding (26)

Since (26) also sits on the right side of our return time triangle inequality, we must bound this modulus to get our desired bound. This modulus is the mixing term, because Abadi and Vergne used the $\phi$-mixing property of the process to bound it. For the modulus of (26),
\[ |P_A(\tau_A>kf_A)-P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}| \le 2\epsilon(A)(k-1)\,P_A(\tau_A>f_A-2n)\,\big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-2}, \]
which, given the conditions of the theorem, is exactly Proposition 4.2 of Abadi and Vergne [2].

9.0.4.3 Bounding the Combination of (25) and (26)

To bound the sum of (25) and (26), we use the calculations found above and, most importantly, $\phi$-mixing.
Consider the sum of (25) and (26):
\begin{align*}
&|P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)| + |P_A(\tau_A>kf_A)-P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}|\,P(\tau_A>r)\\
&\quad\le 2\epsilon(A)\big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-1} + 2\epsilon(A)(k-1)\,P_A(\tau_A>f_A-2n)\big(P(\tau_A>f_A-2n)+\phi(n)\big)^{k-2}\\
&\quad= 2\epsilon(A)\big[P(\tau_A>f_A-2n)+\phi(n)\big]^{k-2}\big[P(\tau_A>f_A-2n)+\phi(n)+(k-1)P_A(\tau_A>f_A-2n)\big]\\
&\quad\le 2\epsilon(A)\big[P(\tau_A>f_A-2n)+\phi(n)\big]^{k-2}\big[|P(\tau_A>f_A-2n)-P_A(\tau_A>f_A-2n)|+\phi(n)+kP_A(\tau_A>f_A-2n)\big]\\
&\quad\le 2\epsilon(A)\big[P(\tau_A>f_A-2n)+\phi(n)\big]^{k-2}\,\big[k+\phi(n)\big].
\end{align*}
Note that since the process is $\phi$-mixing, $|P(\tau_A>f_A-2n)-P_A(\tau_A>f_A-2n)|$ is small; also, $P_A(\tau_A>f_A-2n)\le1$. Combining these two facts proves the last inequality.

9.0.4.3.1 Bounding $P(\tau_A>f_A-2n)+\phi(n)$

Since $P(\tau_A>f_A-2n)+\phi(n)$ is in the bound for (25) and (26), we bound this term; this procedure will give us the exponential factor we need for our final bound. Let us use
\[ |P(\tau_A>t')-e^{-\zeta_AP(A)(t'-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t' \]
with $t'=f_A-2n$:
\[ \Big|P(\tau_A>f_A-2n)-e^{-\zeta_AP(A)\left(\frac{1}{2P(A)}-2n-\tau(A)\right)}\Big| \le \tfrac92\,\epsilon(A)P(A)\Big[\frac{1}{2P(A)}-2n\Big] \le \tfrac94\,\epsilon(A). \]
A little manipulation yields, given the previous calculations,
\[ \Big|P(\tau_A>f_A-2n)-e^{-\frac{\zeta_A}{2}+\zeta_A(2n+\tau(A))P(A)}\Big| \le \tfrac94\,\epsilon(A). \]
Using the Mean Value Theorem (which gives $|f(b)-f(a)|\le(b-a)\sup|f'|$) with $f(x)=e^{-x}$, $a=\frac{\zeta_A}{2}-\zeta_A(2n+\tau(A))P(A)$, and $b=\frac{\zeta_A}{2}$,
\[ \Big|e^{-\frac{\zeta_A}{2}+\zeta_A(2n+\tau(A))P(A)}-e^{-\frac{\zeta_A}{2}}\Big| \le (2n+\tau(A))P(A)\,e^{-\frac{\zeta_A}{2}+(2n+\tau(A))P(A)} \le (2n+\tau(A))P(A)\,e^{(2n+\tau(A))P(A)}. \tag{31} \]
Thus
\begin{align*}
\big|P(\tau_A>f_A-2n)+\phi(n)-e^{-\frac{\zeta_A}{2}}\big| &\le \big|P(\tau_A>f_A-2n)-e^{-\frac{\zeta_A}{2}+\zeta_A(2n+\tau(A))P(A)}\big|+|\phi(n)|+\big|e^{-\frac{\zeta_A}{2}+\zeta_A(2n+\tau(A))P(A)}-e^{-\frac{\zeta_A}{2}}\big|\\
&\le \tfrac94\,\epsilon(A)+|\phi(n)|+(2n+\tau(A))P(A)\,e^{(2n+\tau(A))P(A)} \le \tfrac94\,\epsilon(A)+\tfrac34\,\epsilon(A)+\epsilon(A) \le 4\epsilon(A). \tag{32}
\end{align*}
Note that (32) follows from the facts that $|\phi(n)|\le\frac34\epsilon(A)$ (it is negligible) and that $(2n+\tau(A))P(A)\,e^{(2n+\tau(A))P(A)}\le\epsilon(A)$ (this term appears, essentially, in the exact expression for $\epsilon(A)$). Therefore
\[ P(\tau_A>f_A-2n)+\phi(n) \le e^{-\frac{\zeta_A}{2}}+4\epsilon(A). \]

9.0.4.4 Further Bounding of $2\epsilon(A)\big[P(\tau_A>f_A-2n)+\phi(n)\big]^{k-2}\big[k+\phi(n)\big]$

Now, using that inequality,
\[ 2\epsilon(A)\big[P(\tau_A>f_A-2n)+\phi(n)\big]^{k-2}\big[k+\phi(n)\big] \le 2\epsilon(A)\big[e^{-\frac{\zeta_A}{2}}+4\epsilon(A)\big]^{k-2}\,k, \]
as $\phi(n)$ is negligible.

9.0.4.4.1 Bound for $e^{-\frac{\zeta_A}{2}}+4\epsilon(A)$

Since this exponential expression appears in the bound above for the sum of (25) and (26), we bound it in order to get our desired inequality. To bound this term, we use a Taylor series and an equivalent form of $k$.
By Taylor expansion,
\[ e^{-\frac{\zeta_A}{2}} \le e^{-\frac{\zeta_A}{2}}+4\epsilon(A) \le e^{-\frac{\zeta_A}{2}}\Big(\sum_{m=0}^{\infty}\frac{(8\epsilon(A))^m}{m!}\Big) = e^{-\frac{\zeta_A}{2}}\,e^{8\epsilon(A)} = e^{-(\frac{\zeta_A}{2}-8\epsilon(A))}. \]
Now, since $t=kf_A+r$, we have $t=\frac{k}{2P(A)}+r$, so $k=(t-r)\,2P(A)$. Then
\[ e^{-\frac{\zeta_A}{2}(k-2)} = e^{-\frac{\zeta_A}{2}((t-r)2P(A)-2)} = e^{-\zeta_AP(A)t+\zeta_AP(A)r+\zeta_A} = e^{-\zeta_AP(A)t+\zeta_A(P(A)r+1)}. \]
Note that $\zeta_A(P(A)r+1)\le P(A)r+1\le P(A)f_A+1 = P(A)\frac{1}{2P(A)}+1 = \frac32$. Hence
\[ e^{-\frac{\zeta_A}{2}(k-2)} \le e^{-\zeta_AP(A)t+\frac32}. \]
Similarly, again taking $k=(t-r)2P(A)$,
\begin{align*}
e^{-(\frac{\zeta_A}{2}-8\epsilon(A))(k-2)} &= e^{-(\frac{\zeta_A}{2}-8\epsilon(A))((t-r)2P(A)-2)}\\
&= e^{-\zeta_AP(A)t+\zeta_AP(A)r+\zeta_A+16\epsilon(A)P(A)t-16\epsilon(A)P(A)r-16\epsilon(A)}\\
&= e^{-(\zeta_A-16\epsilon(A))P(A)t+(\zeta_A-16\epsilon(A))(P(A)r+1)},
\end{align*}
and
\[ (\zeta_A-16\epsilon(A))\big(P(A)r+1\big) \le \big(1-16\epsilon(A)\big)\Big(P(A)\frac{1}{2P(A)}+1\Big) = \tfrac32\big(1-16\epsilon(A)\big) \le \tfrac32 \]
for large $n$, as $\epsilon(A)$ becomes negligible for big $n$; the preceding inequalities are valid because $\zeta_A\le1$, $f_A=\frac{1}{2P(A)}$, and $P(A)\le1$. Thus
\[ e^{-(\frac{\zeta_A}{2}-8\epsilon(A))(k-2)} \le e^{-(\zeta_A-16\epsilon(A))P(A)t+\frac32} \]
for large enough $n$. Hence
\begin{align*}
4\epsilon(A)\,k\,\big[e^{-\frac{\zeta_A}{2}}+4\epsilon(A)\big]^{k-2} &\le 4\epsilon(A)\,k\,\big[e^{-(\frac{\zeta_A}{2}-8\epsilon(A))}\big]^{k-2} \le 4\epsilon(A)\,k\,e^{-(\zeta_A-16\epsilon(A))P(A)t+\frac32}\\
&\le 4\epsilon(A)\,(2tP(A))\,e^{\frac32}\,e^{-(\zeta_A-16\epsilon(A))P(A)t}\\
&= 8\,e^{\frac32}\,\epsilon(A)P(A)t\,e^{-(\zeta_A-16\epsilon(A))P(A)t} \le 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A-16\epsilon(A))P(A)t},
\end{align*}
using $k\le2tP(A)$ and $8e^{3/2}\le36$; any lower-order big-$O$ remainder from the Taylor step vanishes, as it is negligible. This thereby proves that the sum of (25) and (26) satisfies
\begin{multline*}
|P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)|\\
+|P_A(\tau_A>kf_A)-P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}|\,P(\tau_A>r) \le 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A-16\epsilon(A))P(A)t}.
\end{multline*}

9.0.5 Bound for (27)

This is the exponential term of our triangle inequality. We must first bound $|P(\tau_A>f_A)-e^{-\frac{\zeta_A}{2}}|$, because it arises in the product inequality for (27). Consider
\[ |P(\tau_A>t')-e^{-\zeta_AP(A)(t'-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t' \quad\text{and}\quad |P_A(\tau_A>t')-\zeta_Ae^{-\zeta_AP(A)(t'-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t' \]
with $t'=f_A$, so that
\begin{align*}
|P(\tau_A>f_A)-e^{-\frac{\zeta_A}{2}}| &\le |P(\tau_A>f_A)-e^{-\zeta_AP(A)(f_A-\tau(A))}|+|e^{-\zeta_AP(A)(f_A-\tau(A))}-e^{-\frac{\zeta_A}{2}}|\\
&= |P(\tau_A>f_A)-e^{-\zeta_AP(A)(f_A-\tau(A))}|+e^{-\frac{\zeta_A}{2}}\,|e^{\zeta_AP(A)\tau(A)}-1|\\
&\le \tfrac92\,\epsilon(A)P(A)f_A+e^{-\frac{\zeta_A}{2}}\,nP(A) \le \tfrac94\,\epsilon(A)+\tfrac34\,\epsilon(A) \le 3\epsilon(A).
\end{align*}
The preceding long inequality follows from $e^{-\frac{\zeta_A}{2}}\le1$, from $|e^{\zeta_AP(A)\tau(A)}-1|\le nP(A)$ for large $n$ (recall $\tau(A)\le n$), and from $nP(A)\le\frac34\epsilon(A)$ (since $nP(A)$ is part of the definition of $\epsilon(A)$). Similarly,
\[ |P_A(\tau_A>f_A)-\zeta_Ae^{-\frac{\zeta_A}{2}}| \le |P_A(\tau_A>f_A)-\zeta_Ae^{-\zeta_AP(A)(f_A-\tau(A))}|+e^{-\frac{\zeta_A}{2}}\,|e^{\zeta_AP(A)\tau(A)}-1| \le \tfrac94\,\epsilon(A)+nP(A) \le 3\epsilon(A). \]

9.0.5.1 Product Inequality for (27)

We now use the product inequality and our previous calculations to bound our exponential term, (27).
Now, applying $|P(\tau_A>f_A)-e^{-\frac{\zeta_A}{2}}|\le3\epsilon(A)$, $|P_A(\tau_A>f_A)-\zeta_Ae^{-\frac{\zeta_A}{2}}|\le3\epsilon(A)$, and the product inequality
\[ \Big|\prod a_i-\prod b_i\Big| \le \max|a_i-b_i|\,(\#i)\,\max\{a_i,b_i\}^{\#i-1}, \qquad 0\le a_i,b_i\le1, \]
one gets, for the modulus in (27),
\begin{align*}
\big|P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}-\zeta_A\big(e^{-\frac{\zeta_A}{2}}\big)^{k}\big| &= \big|p_{t+1}P(\tau_A>f_A)^{k}-\zeta_A\big(e^{-\frac{\zeta_A}{2}}\big)^{k}\big|\\
&\le \max\{p_{t+1},\zeta_A\}\,\big|P(\tau_A>f_A)^{k}-\big(e^{-\frac{\zeta_A}{2}}\big)^{k}\big| \le \big|P(\tau_A>f_A)^{k}-\big(e^{-\frac{\zeta_A}{2}}\big)^{k}\big|\\
&\le \big|P(\tau_A>f_A)-e^{-\frac{\zeta_A}{2}}\big|\,k\,\max\big\{P(\tau_A>f_A),e^{-\frac{\zeta_A}{2}}\big\}^{k-1}\\
&\le 3\epsilon(A)\,k\,\big(e^{-\frac{\zeta_A}{2}}+3\epsilon(A)\big)^{k-1}.
\end{align*}
Now, the Taylor series representation yields
\[ e^{-\frac{\zeta_A}{2}} \le e^{-\frac{\zeta_A}{2}}+3\epsilon(A) \le e^{-\frac{\zeta_A}{2}}\Big(\sum_{m=0}^{\infty}\frac{(6\epsilon(A))^m}{m!}\Big) = e^{-\frac{\zeta_A}{2}}\,e^{6\epsilon(A)} = e^{-(\frac{\zeta_A}{2}-6\epsilon(A))}. \]
Also $k=(t-r)2P(A)$, and a basic substitution for $k$, together with $\zeta_AP(A)r\le P(A)f_A=\frac12$, leads to
\[ 3\epsilon(A)\,k\,\big[e^{-(\frac{\zeta_A}{2}-6\epsilon(A))}\big]^{k-1} \le 3\epsilon(A)\,k\,e^{-(\zeta_A-12\epsilon(A))P(A)t+\frac12} \]
for large enough $n$ (as in the previous subsection, using $\zeta_A\le1$, $f_A=\frac{1}{2P(A)}$, $P(A)\le1$, and the negligibility of $\epsilon(A)$). Hence
\[ 3\epsilon(A)\,k\,\big[e^{-\frac{\zeta_A}{2}}+3\epsilon(A)\big]^{k-1} \le 3\epsilon(A)\,(2tP(A))\,e^{\frac12}\,e^{-(\zeta_A-12\epsilon(A))P(A)t} = 6\sqrt{e}\,\epsilon(A)P(A)t\,e^{-(\zeta_A-12\epsilon(A))P(A)t} \le 14\,\epsilon(A)P(A)t\,e^{-(\zeta_A-12\epsilon(A))P(A)t}, \]
the big-$O$ remainder from the Taylor step again being negligible. This thereby proves that (27) satisfies
\[ \big|P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}-\zeta_Ae^{-\frac{\zeta_Ak}{2}}\big|\,P(\tau_A>r) \le 14\,\epsilon(A)P(A)t\,e^{-(\zeta_A-12\epsilon(A))P(A)t}. \]

9.0.6 Bounding (28)

We now bound the last modulus, (28). Let us again consider
\[ |P(\tau_A>t')-e^{-\zeta_AP(A)(t'-\tau(A))}| \le \tfrac92\,\epsilon(A)P(A)\,t'. \]
Now,
\[ \zeta_Ae^{-\frac{\zeta_Ak}{2}}\,\big|P(\tau_A>r)-e^{-\zeta_AP(A)(r-\tau(A))}\big| \le \zeta_Ae^{-\frac{\zeta_Ak}{2}}\cdot\tfrac92\,\epsilon(A)P(A)r \le \zeta_Ae^{-\frac{\zeta_Ak}{2}}\cdot\tfrac94\,\epsilon(A) < \zeta_Ae^{-\frac{\zeta_Ak}{2}}\cdot\tfrac92\,\epsilon(A). \]
If $r<\tau(A)$, then (28) is instead controlled by
\[ \big|e^{-\zeta_AP(A)(r-\tau(A))}-1\big|+P(\tau_A\le r) \le 2nP(A), \]
which is justified by using $1-P(\tau_A\le r)=P(\tau_A>r)$, the fact that $nP(A)$ appears in the expression for $\epsilon(A)$, and the previous calculations. Note that $\zeta_AP(A)r\le\zeta_AP(A)f_A=\frac{\zeta_A}{2}\le\frac12$. Thus, using these calculations, (28) is bounded:
\begin{align*}
\zeta_Ae^{-\frac{\zeta_Ak}{2}}\,\big|P(\tau_A>r)-e^{-\zeta_AP(A)(r-\tau(A))}\big| &\le \tfrac92\,\epsilon(A)\,e^{-\frac{\zeta_Ak}{2}} = \tfrac92\,\epsilon(A)\,e^{-\zeta_AP(A)(t-r)} = \tfrac92\,\epsilon(A)\,e^{-\zeta_AP(A)t}\,e^{\zeta_AP(A)r}\\
&\le \tfrac92\,e^{\frac12}\,\epsilon(A)\,e^{-\zeta_AP(A)t} \le 8\,\epsilon(A)\,e^{-\zeta_AP(A)t} \le 8\,\epsilon(A)P(A)t\,e^{-\zeta_AP(A)t}.
\end{align*}
The last inequality follows from $t\ge f_A$, as shown below: for large enough $t$, $t\ge\frac{1}{P(A)}>f_A$, so $P(A)t\ge1$; thus $e^{-\zeta_AP(A)t}\le P(A)t\,e^{-\zeta_AP(A)t}$, and the inequality follows.

9.0.7 Final Calculations of the Inequality

Now we gather all of the inequalities we proved — via technical manipulation, mixing properties, and exponential estimates — in order to prove our final inequality.
Finally,
\begin{align*}
|P_A(\tau_A>t)-\zeta_Ae^{-\zeta_AP(A)(t-\tau(A))}| &\le |P_A(\tau_A>kf_A+r)-P_A(\tau_A>kf_A)P(\tau_A>r)|\\
&\quad+|P_A(\tau_A>kf_A)-P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}|\,P(\tau_A>r)\\
&\quad+|P_A(\tau_A>f_A)P(\tau_A>f_A)^{k-1}-\zeta_Ae^{-\frac{\zeta_Ak}{2}}|\,P(\tau_A>r)\\
&\quad+\zeta_Ae^{-\frac{\zeta_Ak}{2}}\,|P(\tau_A>r)-e^{-\zeta_AP(A)(r-\tau(A))}|\\
&\le 36\,\epsilon(A)P(A)t\,e^{-(\zeta_A-16\epsilon(A))P(A)t}+14\,\epsilon(A)P(A)t\,e^{-(\zeta_A-12\epsilon(A))P(A)t}+8\,\epsilon(A)P(A)t\,e^{-\zeta_AP(A)t}\\
&\le 58\,\epsilon(A)\,f(A,t).
\end{align*}
The final inequality is valid because $\epsilon(A)$ is an error term and each exponential factor is at most $e^{-(\zeta_A-16\epsilon(A))P(A)t}$; recalling $f(A,t)=P(A)t\,e^{-(\zeta_A-16\epsilon(A))P(A)t}$, the bound $58\,\epsilon(A)f(A,t)$ follows. Thus the return time distribution is exponential for $t\ge\tau(A)$. $\square$

10 Conclusion

To sum up the meaning of this thesis: return times have exponential distributions under certain conditions that control the behavior of the random variables. Namely, $\phi$-mixing, the setting of infinitely long strings, and the condition that we work over $n$-cylinders guarantee well-behaved exponential return time distributions — justifiably nice, as they are of parameter one.

The results of this thesis can be extended further. How could we extend them, and what other questions could we ask? This leads to an interesting question: what happens if we relax the conditions on these strings? For instance, what would happen to the distribution of return times if we worked with arbitrary strings that were not $n$-cylinders, and that did not form a sequence of strings whose measures converge to 0? According to Professor Nicolai Haydn, two ergodic theorists, Yves Lacroix and Michal Kupsa, have proven that relaxing the conditions on the processes and strings will yield return times of essentially any type: the return time distributions could be anything, such as Poisson, geometric, Bernoulli, or hypergeometric, or even one without any uniform behavior (defined in a piecewise way, with different distributions depending on the set). We could also ask what would happen to the return time distributions of $n$-cylinders under a different type of mixing, whose definition is similar to that of $\phi$-mixing; we leave it to the reader to read about other mixing notions in more depth, if desired. That question has already been answered in different ways by ergodic theorists, and much more remains, as this paper indicates. For instance, we can look at different probabilistic scenarios: instead of illustrating and applying Abadi and Vergne's results to a Markov chain, we could consider another scenario, such as a geometric process. There are also other results of Abadi and Vergne's paper that we did not consider and analyze. For instance, their paper [2] proves that the sojourn times have a geometric distribution expressed in terms of the limits $\lambda(A)$ of the probability sequence $\lambda_i(A)$; Abadi and Vergne also proved results about the moments of the sojourn time of $n$-cylinders and, more importantly, results (under the conditions of Theorem 4.5) about the moments of return times. This leads to a discussion of the expectation of return times under the $\phi$-mixing and $n$-cylinder conditions. An analogous result (to Poincaré's recurrence theorem) for the expectation of return times is Kac's theorem, which states that the expected value of the return time over a positive-measure set $A\in\Sigma$ is exactly $\frac{1}{P(A)}$; a proof can be found in many texts on ergodic theory. Furthermore, we can try to find the connection between Abadi and Vergne's constant $f_A$ and the result of Kac's theorem. In conclusion, we could spend hundreds and hundreds of pages discussing and proving various results and variations of results about return time distributions.
We could also consider other types of time, such as recurrence time, which we have not defined here. Furthermore, we could consider the subshift of finite type under the transfer operator, which would lie in the connections between ergodic theory and operator theory. In conclusion, the possibilities for analysis in ergodic theory may be endless, but research time is limited. That is the never-ending story of a mathematician.

Bibliography

[1] M. Abadi. Sharp Error Terms and Necessary Conditions for Exponential Hitting Times in Mixing Processes. Annals Probab. 32 (2004), 243-264.
[2] M. Abadi and N. Vergne. Sharp Error Terms for Return Time Statistics under Mixing Conditions. J. Theor. Probab. 22 (2009), 18-37.
[3] D. Gamarnik. Introduction to Probability and Statistics Class Notes, available at http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-436j-fundamentals-of-probability-fall-2008/lecture-notes/.
[4] N. Haydn. Entry and Return Times Distribution. Dynamical Systems 28 (2013), 333-353.
[5] N. Haydn. Math 625 Notes. (2011).
[6] B. P. Kitchens. Symbolic Dynamics: One-sided, Two-sided and Countable State Markov Shifts. Springer-Verlag, Berlin, 1998.
[7] S. Lalley. Stats 312: Markov Chains, Basic Theory, available at http://galton.uchicago.edu/lalley/Courses/312/MarkovChains.pdf.
[8] D. Nualart. Stochastic Processes, available at http://www.mat.ub.edu/nualart/StochProc.pdf.
[9] H. Poincaré. Sur le problème des trois corps et les équations de la dynamique. Acta Mathematica 13 (1890), 1-270.
[10] M. Pollicott and M. Yuri. Dynamical Systems and Ergodic Theory, available at http://homepages.warwick.ac.uk/masdbl/book.html.
[11] E. Weisstein. Bernoulli Distribution, available at http://mathworld.wolfram.com/BernoulliDistribution.html.