Limit Theorems for Three Random Discrete Structures via Stein's Method

by J. E. Paguyo

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS)

May 2023

Copyright 2023 J. E. Paguyo

For Daphne

Acknowledgements

First and foremost, I would like to express sincere gratitude to my advisor, Jason Fulman, whose kindness, encouragement, and guidance influenced the way I approach and view mathematics. He has made me a better mathematician. The support from committee members Richard Arratia, Larry Goldstein, Sergey Lototsky, and David Kempe has been invaluable, and their mentorship greatly contributed to my understanding of mathematics. Thank you to Richard Arratia for our fun reading course on the probabilistic method, for all the interesting math and life stories, and for always believing in me. Thank you to Larry Goldstein for helping solidify and expand my understanding of Stein's method. Thank you to David Kempe for a close reading of this thesis and for providing many helpful suggestions and comments. Special thank you to Sergey Lototsky for always leading the wonderful graduate probability seminar course and for always being a strong advocate for students.

I would also like to give special thanks to Erik Slivken for being like a second advisor to me. His generosity, patience, and mentorship have been essential to my growth, and he continues to inspire me to take a more holistic approach to mathematics.

My fellow graduate students Kayla Orlinsky, Henry Ehrhard, Jacob Der, Mark Ebert, Sam Armon, Sanat Mulay, Clemens Oszkinat, Jonathan Michala, Bixing Qiao, Peter Kagey, Gin Park, Apoorva Shah, Alex Tarter, Juan Valdés, Dan Douglas, Eilidh McKemmie, John Rahmani, Maria Allayioti, Inga Girshfeld, Wes Wise, and many others deserve a paragraph of their own. The best part about graduate school is the camaraderie and commiseration among the graduate students. I am glad we embarked on and experienced this brave journey together.

With a little help from my friends, I was able to stay grounded and maintain a healthy work-life balance. I am lucky to call Brandon Findling my life friend. Our road trips, travels, and hangouts will always remain highlights of my life, and I can't wait for all of our future adventures. Thanks for always being there for me, through the good times and the bad. I am also very fortunate for the years spent with Sonia Norton, and I thank her for all her love and support. She emphasized the importance of using my position as an educator to empower those who are in positions of disadvantage, taught me how to be a better ally for social justice, and encouraged me to come out of my shell and let people share in my joy and in my pain. Lastly, a huge shoutout to the Beer Friday crew Clemens Oszkinat, Sam Armon, Jonathan Michala, and Peter Kagey. Nothing beats watching the sunset together in KAP 500 on a good buzz.

My family has been a constant source of love and support throughout my life. I am thankful for my aunt Mimi for raising me as her own child and shaping me to become the person I am today. My grandparents Marty and Jodell taught me how to live in the moment, to spread love wherever I go, and that life is a journey, not a destination. I continue to learn so much about life's mysteries from them. Thank you Mark for all the love and happiness you gave me as we grew up together. I hope you're having all the fun at the Rainbow Bridge.
Finally, thank you to Los Angeles for making my time in graduate school so enjoyable. I fell in love with your diversity, delicious cuisines, vibrant art and music scenes, and natural beauty. Maybe if I had lived elsewhere this thesis would have turned out better, but I definitely would not have had as much fun.

Table of Contents

Dedication
Acknowledgements
List of Figures
Abstract

Chapter 1: Introduction
  1.1 Crossings and Simple Chords in Random Chord Diagrams
  1.2 Fixed Points, Descents, and Inversions in Parabolic Double Cosets of $S_n$
  1.3 Cycle Structure of Random Parking Functions

Chapter 2: Stein's Method
  2.1 Introduction
  2.2 Probability Metrics
  2.3 Normal Approximation
  2.4 Poisson Approximation

Chapter 3: Chord Diagrams
  3.1 Introduction
    3.1.1 Main Results
    3.1.2 Outline
  3.2 Crossings
    3.2.1 Mean and Variance
    3.2.2 Size-Bias Coupling and Stein's Method
    3.2.3 Construction
    3.2.4 Central Limit Theorem for the Number of Crossings
  3.3 Simple Chords
    3.3.1 Poisson Limit Theorem for Simple Chords
    3.3.2 Chord Diagrams with No Simple Chords
  3.4 Final Remarks
    3.4.1 Length $j$ Chords
    3.4.2 Descents in Matchings
    3.4.3 $k$-Crossings
    3.4.4 Pattern Occurrence in Multiset Permutations and Set Partitions
    3.4.5 Configuration Model

Chapter 4: Parabolic Double Cosets of the Symmetric Group
  4.1 Introduction
    4.1.1 Main Results
    4.1.2 Outline
  4.2 Preliminaries
    4.2.1 Notation and Definitions
    4.2.2 Size-Bias Coupling
    4.2.3 Stein's Method for Poisson Approximation
    4.2.4 Dependency Graphs
    4.2.5 Concentration Inequalities
  4.3 Fixed Points
    4.3.1 Mean and Variance of Fixed Points
    4.3.2 Two-Cycles
    4.3.3 Poisson Limit Theorem for Fixed Points
  4.4 Descents
    4.4.1 Mean and Variance of Descents
    4.4.2 Central Limit Theorem for Descents
  4.5 Generalized Descents for Fixed $d$
    4.5.1 Mean and Variance of $d$-Descents
    4.5.2 Central Limit Theorem for $d$-Descents
  4.6 Inversions
    4.6.1 Mean and Variance of Inversions
    4.6.2 Central Limit Theorem for Inversions
  4.7 Concentration of Measure
  4.8 Final Remarks
    4.8.1 Permutations from Fixed Conjugacy Classes of $S_n$
    4.8.2 Other Permutation Statistics
    4.8.3 Other Regimes of $d$
    4.8.4 Generalization to Finite Coxeter Groups

Chapter 5: Parking Functions
  5.1 Introduction
    5.1.1 Main Results
    5.1.2 Outline
  5.2 Preliminaries
    5.2.1 Definitions and Notation
    5.2.2 Abel's Multinomial Theorem
    5.2.3 Parking Completions
    5.2.4 Stein's Method and Exchangeable Pairs
  5.3 Expected Number of Cycles of a Fixed Length
    5.3.1 Fixed Points and Transpositions
    5.3.2 General $k$-Cycles
  5.4 Poisson Limit Theorem for Cycles
    5.4.1 Digraph Representation of Parking Functions
    5.4.2 Upper Bound on the Total Variation Distance
  5.5 Final Remarks
    5.5.1 Random Mappings
    5.5.2 Generating Function Approach
    5.5.3 $(m,n)$- and $u$-Parking Functions
    5.5.4 Total Number of Cycles

References

List of Figures

3.1 A chord diagram and its linearized version.
5.1 The digraph representation of the parking function $\pi = (6,1,2,4,1,9,1,6,8,4,2,10) \in PF_{12}$. It consists of two components, where each component consists of a tree component attached to a cycle component.

Abstract

In this thesis, we study the asymptotic distribution of statistics on three random discrete structures: chord diagrams, permutations from fixed parabolic double cosets of the symmetric group, and parking functions. Using Stein's method, we obtain Poisson and central limit theorems for various statistics on these structures.

The first problem concerns the asymptotic distributions of the number of crossings and the number of simple chords in a random chord diagram. This work is contained in [87, 7]. Using size-bias coupling and Stein's method, we obtain bounds on the Kolmogorov distance between the distribution of the number of crossings and a standard normal random variable, and on the total variation distance between the distribution of the number of simple chords and a Poisson random variable. As an application, we provide explicit error bounds on the number of chord diagrams containing no simple chords.

The second problem concerns fixed points and generalized descents, which include descents and inversions, on permutations chosen uniformly at random from fixed parabolic double cosets of the symmetric group. This work is contained in [89]. We show that the distribution of fixed points is asymptotically Poisson and establish a central limit theorem for the distribution of generalized descents for certain regimes. Our proofs use Stein's method with size-bias coupling and dependency graphs. As applications of our size-bias coupling and dependency graph constructions, we also obtain concentration of measure results.

The third problem concerns the cycle structure of uniformly random parking functions. This work is contained in [88]. Using the combinatorics of parking completions, we compute the asymptotic expected value of the number of cycles of any fixed length. We obtain an upper bound on the total variation distance between the joint distribution of cycle counts and independent Poisson random variables using a multivariate version of Stein's method via exchangeable pairs. Under a mild condition, we show that the process of cycle counts converges in distribution to a process of independent Poisson random variables.
Chapter 1: Introduction

In this thesis, we study the asymptotic distribution of various statistics on three random discrete structures: chord diagrams, permutations from fixed parabolic double cosets of the symmetric group, and parking functions. These structures have found many important applications in a wide range of fields such as statistics, computer science, physics, and biology, but they are also interesting mathematical objects in their own right and have been the subject of a large body of theoretical work. The limit theorems that we establish contribute to a better understanding of what a typical object from these discrete structures looks like. Moreover, they provide theoretical justification for approximating the distribution of these statistics in terms of simpler, universal distributions. Our results are also interesting in that they provide useful testing grounds for results on Stein's method, and the techniques used can be extended or adapted to yield limit theorems for more general structures. The common thread connecting these three topics is Stein's method, a powerful technique used to bound the distance between two probability distributions.

1.1 Crossings and Simple Chords in Random Chord Diagrams

A chord diagram of size $n$ is a matching of $2n$ points on a circle, labeled in clockwise order, with each matched pair corresponding to a chord. There are $(2n-1)!!$ chord diagrams of size $n$. A connected chord diagram is a diagram in which no set of chords can be separated from the remaining chords by a line. A component is a maximal connected subdiagram. A crossing is a quadruple $(i, j, k, \ell)$, with $i < j < k < \ell$, such that $(i,k)$ and $(j,\ell)$ are chords. A simple chord is a chord that connects two consecutive endpoints; that is, a simple chord is of the form $(i, i+1)$, for $i \in [2n]$, where addition is understood to be modulo $2n$.

Chord diagrams have found applications in a wide variety of fields such as topology, random graph theory, biology, quantum field theory, and free probability [5, 21, 22, 58, 76, 79, 81, 85].

The study of chord diagrams began with Touchard [102], who found a generating function for $T_{n,k}$, the number of chord diagrams of size $n$ with exactly $k$ crossings, and Riordan [94], who found an explicit formula for $T_{n,k}$ in the form of an alternating sum. Probabilistic questions on random chord diagrams were initiated by Stein and Everett in [101], where they used recurrence relations to show that the probability that a random chord diagram is connected approaches $1/e$ as $n \to \infty$. More than 20 years later, Flajolet and Noy [44] used analytic combinatorics to find the asymptotic distributions of the number of components, the size of the largest component, and the number of crossings in a random chord diagram. They also showed that with high probability, a random chord diagram consists of a single connected component along with some isolated chords. More recently, Acan [1] extended and generalized these results in several directions.

In Chapter 3, we extend Flajolet and Noy's central limit theorem for crossings by obtaining an upper bound on the Kolmogorov distance between the number of crossings and a normal random variable, and Acan's Poisson limit theorem for simple chords by obtaining an upper bound on the total variation distance between the number of simple chords and a Poisson random variable. Thus we provide convergence rates for their distributional approximations, and as corollaries we are able to recover their limit theorems. This work is contained in the preprint [87] and the forthcoming preprint [7].
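To make these statistics concrete, here is a minimal Python simulation sketch (ours, not part of [87] or [7]; all function names are ours). It samples a uniform chord diagram by randomly pairing the $2n$ points and counts crossings and simple chords by brute force; the empirical averages can be compared with the exact values derived in Chapter 3, $\mathbb{E}X_n = n(n-1)/6$ and $\mathbb{E}S_n = 2n/(2n-1)$.

```python
import random

def random_chord_diagram(n):
    """Sample a uniform matching of the points 1, ..., 2n.

    Pairing off a uniformly shuffled list two points at a time gives
    each of the (2n-1)!! chord diagrams equal probability.
    """
    points = list(range(1, 2 * n + 1))
    random.shuffle(points)
    return [tuple(sorted(points[2 * k: 2 * k + 2])) for k in range(n)]

def crossings(chords):
    """Count quadruples i < j < k < l such that (i, k) and (j, l) are chords."""
    return sum(1 for (i, k) in chords for (j, l) in chords if i < j < k < l)

def simple_chords(chords, n):
    """Count chords (i, i+1), with addition modulo 2n, so (1, 2n) also counts."""
    return sum(1 for (i, j) in chords if j - i == 1 or (i, j) == (1, 2 * n))

n, trials = 100, 200
diagrams = [random_chord_diagram(n) for _ in range(trials)]
print(sum(crossings(d) for d in diagrams) / trials)         # near n(n-1)/6 = 1650
print(sum(simple_chords(d, n) for d in diagrams) / trials)  # near 2n/(2n-1), about 1
```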
Theorem 1.1.1 ([87, 7], Theorem 1.1). Let $X_n$ be the number of crossings in a random chord diagram of size $n \ge 2$. Let $W_n = \frac{X_n - \mu_n}{\sigma_n}$, where $\mu_n = \mathbb{E}X_n$ and $\sigma_n^2 = \mathrm{Var}(X_n)$. Then

$$d_K(W_n, Z) = O\left(n^{-1/2}\right),$$

where $Z$ is a standard normal random variable.

Theorem 1.1.2 ([87], Theorem 1.2). Let $S_n$ be the number of simple chords in a random chord diagram of size $n$ and let $Y$ be a $\mathrm{Poisson}(1)$ random variable. Then

$$d_{TV}(S_n, Y) = O\left(\frac{1}{n}\right).$$

The proofs of our theorems use size-bias coupling and Stein's method for normal and Poisson approximations. As an application, we also provide explicit error bounds on the number of chord diagrams containing no simple chords.

1.2 Fixed Points, Descents, and Inversions in Parabolic Double Cosets of $S_n$

Let $H$ and $K$ be subgroups of a finite group $G$. Define an equivalence relation on $G$ such that $s \sim t$ if and only if $s = htk$ for $s, t \in G$, $h \in H$, $k \in K$. The equivalence classes are the double cosets of $G$. Let $HsK$ denote the double coset containing the element $s$ and let $H \backslash G / K$ denote the set of double cosets.

Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ be a partition of $n$. The parabolic subgroup $S_\lambda$ is the set of permutations in $S_n$ that permute $\{1, \ldots, \lambda_1\}$ among themselves, $\{\lambda_1 + 1, \ldots, \lambda_1 + \lambda_2\}$ among themselves, and so on. Let $\mu = (\mu_1, \ldots, \mu_J)$ be another partition of $n$. Then $S_\lambda \backslash S_n / S_\mu$ are the parabolic double cosets of $S_n$.

There is a bijection between $S_\lambda \backslash S_n / S_\mu$ and $I \times J$ contingency tables, which are matrices of non-negative integers whose row sums equal $\lambda_1, \ldots, \lambda_I$ and whose column sums equal $\mu_1, \ldots, \mu_J$. Hence we let $T = (T_{ij})_{1 \le i \le I,\, 1 \le j \le J} = T(\sigma)$ be the contingency table representing the double coset $S_\lambda \sigma S_\mu$, and we say that the double coset $S_\lambda \sigma S_\mu$ is indexed by $T$. The uniform distribution on $S_n$ induces the Fisher-Yates distribution on contingency tables.

Contingency tables are a mainstay in statistics and there has been extensive work on their statistical properties. Their importance can be seen from the fact that they arise when a population of size $n$ is classified according to two discrete categories. Fisher studied the distribution of $I \times J$ contingency tables conditioned on fixed row and column sums, $\lambda$ and $\mu$, respectively; the conditional probability $P(T \mid \lambda, \mu)$ of obtaining the contingency table $T$ is precisely the Fisher-Yates distribution.

More recently, Diaconis and Simper [39] studied the number of zeros in a Fisher-Yates distributed contingency table and, under some assumptions, proved that it is asymptotically Poisson. In Chapter 4, we continue this probabilistic study by considering statistics on permutations chosen uniformly at random from fixed parabolic double cosets of $S_n$. Our main results establish a Poisson limit theorem for the number of fixed points $\mathrm{fp}(\sigma)$ and central limit theorems for the number of descents $\mathrm{des}(\sigma)$, generalized descents $\mathrm{des}_d(\sigma)$ for fixed $d$, and inversions $\mathrm{inv}(\sigma)$. This work is contained in the preprint [89].

Theorem 1.2.1 ([89], Theorem 1.1). Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ and $\mu = (\mu_1, \ldots, \mu_J)$ be two partitions of $n$. Let $\sigma \in S_n$ be a permutation chosen uniformly at random from a fixed double coset in $S_\lambda \backslash S_n / S_\mu$ indexed by $T$. Suppose $T_{k\ell} \ge 1$ for all $1 \le k \le I$, $1 \le \ell \le J$ such that $A_{k\ell} \ne \emptyset$ (the sets $A_{k\ell}$ will be defined in Chapter 4). Let $Y_n$ be a Poisson random variable with rate $\nu_n := \mathbb{E}(\mathrm{fp}(\sigma))$. Then

$$d_{TV}(\mathrm{fp}(\sigma), Y_n) \le \frac{5(I + J - 1)\min\{1, \nu_n^{-1}\}}{\max\{\lambda_I, \mu_J\}}.$$

If the partitions $\lambda$ and $\mu$ also satisfy

1. $\lim_{n\to\infty} \lambda_I = \infty$ and $\lim_{n\to\infty} \mu_J = \infty$,

2. $\lim_{n\to\infty} (I + J - 1) = C$, for some constant $C \in \mathbb{N}$,
and if $\limsup_{n\to\infty} \nu_n = \nu < \infty$, then $d_{TV}(\mathrm{fp}(\sigma), Y) \to 0$ as $n \to \infty$, where $Y$ is a Poisson random variable with rate $\nu$, so that $\mathrm{fp}(\sigma) \xrightarrow{d} Y$.

Theorem 1.2.2 ([89], Theorem 1.2). Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ and $\mu = (\mu_1, \ldots, \mu_J)$ be two partitions of $n$ such that $I = o(n)$. Let $\sigma \in S_n$ be a permutation chosen uniformly at random from a fixed double coset in $S_\lambda \backslash S_n / S_\mu$. Let $W_n := \frac{\mathrm{des}(\sigma) - \mu_n}{\sigma_n}$, where $\mu_n := \mathbb{E}(\mathrm{des}(\sigma))$ and $\sigma_n^2 := \mathrm{Var}(\mathrm{des}(\sigma))$. Then

$$d_K(W_n, Z) \le O(n^{-1/2}),$$

where $Z$ is a standard normal random variable, so that $W_n \xrightarrow{d} Z$ as $n \to \infty$.

Theorem 1.2.3 ([89], Theorem 1.3). Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ and $\mu = (\mu_1, \ldots, \mu_J)$ be two partitions of $n$ such that $I = o(n)$. Let $\sigma \in S_n$ be a permutation chosen uniformly at random from a fixed double coset in $S_\lambda \backslash S_n / S_\mu$. Let $d \le \lambda_I - 2$ be a fixed positive integer. Let $W_n := \frac{\mathrm{des}_d(\sigma) - \mu_n}{\sigma_n}$, where $\mu_n := \mathbb{E}(\mathrm{des}_d(\sigma))$ and $\sigma_n^2 := \mathrm{Var}(\mathrm{des}_d(\sigma))$. Then

$$d_K(W_n, Z) \le O(n^{-1/2}),$$

where $Z$ is a standard normal random variable, so that $W_n \xrightarrow{d} Z$ as $n \to \infty$.

Theorem 1.2.4 ([89], Theorem 1.4). Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ and $\mu = (\mu_1, \ldots, \mu_J)$ be two partitions of $n$ such that $I = o(n)$. Let $\sigma \in S_n$ be a permutation chosen uniformly at random from a fixed double coset in $S_\lambda \backslash S_n / S_\mu$. Let $W_n := \frac{\mathrm{inv}(\sigma) - \mu_n}{\sigma_n}$, where $\mu_n := \mathbb{E}(\mathrm{inv}(\sigma))$ and $\sigma_n^2 := \mathrm{Var}(\mathrm{inv}(\sigma))$. Then

$$d_K(W_n, Z) \le O(n^{-1/2}),$$

where $Z$ is a standard normal random variable, so that $W_n \xrightarrow{d} Z$ as $n \to \infty$.

The proof of the Poisson limit theorem uses Stein's method and size-bias coupling, and the proofs of the central limit theorems use Stein's method along with the method of dependency graphs. As applications of our size-bias coupling and dependency graph constructions, we also obtain concentration of measure results on the number of fixed points, descents, generalized descents, and inversions.

1.3 Cycle Structure of Random Parking Functions

Consider $n$ parking spots placed sequentially on a one-way street. A line of $n$ cars enters the street one at a time, with each car having a preferred parking spot. The $i$th car drives to its preferred spot, $\pi_i$, and parks if the spot is available. Otherwise the car parks in the first available spot after $\pi_i$. If the car is unable to find a spot, it exits the street without parking. A sequence of preferences $\pi = (\pi_1, \ldots, \pi_n)$ is a parking function if all $n$ cars are able to park. Let $PF_n$ denote the set of parking functions of size $n$. Then $|PF_n| = (n+1)^{n-1}$.

Parking functions were introduced by Konheim and Weiss [75] in their study of the hash storage structure, and have since found applications in combinatorics, probability, and computer science. For example, Foata and Riordan [47] established a bijection between parking functions and trees, Stanley found connections to noncrossing set partitions [98] and hyperplane arrangements [97], and Pitman and Stanley [92] found a connection to volume polynomials of certain polytopes. In [24], Chassaing and Marckert discovered connections between parking functions, empirical processes, and the Brownian bridge. The asymptotic distribution of the area statistic was studied by Flajolet, Poblete, and Viola [46] and Janson [65], where it was shown to converge to normal, Poisson, and Airy distributions depending on the ratio between the number of cars and spots. More recently, Diaconis and Hicks [38] studied the distribution of coordinates, descents, area, and other statistics of random parking functions. In [71] and [107], Kenyon and Yin continued this probabilistic program by studying $(m,n)$-parking functions, where there are $m \le n$ cars and $n$ spots.
Subsequently, Yin extended these results to $\mathbf{u}$-parking functions [106], which generalize both the $(m,n)$-parking functions and the classical parking functions.

We continue the probabilistic study of parking functions in Chapter 5, where we investigate the cycle structure of random parking functions. A $k$-cycle in a parking function $\pi \in PF_n$, for $k \in [n]$, is a sequence of $k$ indices $i_1, \ldots, i_k \in [n]$ such that $\pi(i_1) = i_2$, $\pi(i_2) = i_3$, $\ldots$, $\pi(i_{k-1}) = i_k$, $\pi(i_k) = i_1$. Let $C_k(\pi)$ be the number of $k$-cycles in $\pi \in PF_n$. Our main results show that the expected number of $k$-cycles in a random parking function is asymptotically $\frac{1}{k}$, and we give an upper bound on the total variation distance between the joint distribution of cycle counts $(C_1, \ldots, C_d)$ of a random parking function and a Poisson process $(Z_1, \ldots, Z_d)$, where the $\{Z_k\}$ are independent Poisson random variables with rate $\lambda_k = \mathbb{E}(C_k)$. This work is contained in the published paper [88].

Theorem 1.3.1 ([88], Theorem 1.1). Let $\pi \in PF_n$ be a parking function chosen uniformly at random. Then $\mathbb{E}(C_k(\pi)) \sim \frac{1}{k}$ for all $k \in [n]$.

Theorem 1.3.2 ([88], Theorem 1.2). Let $\pi \in PF_n$ be a parking function chosen uniformly at random. Let $C_k = C_k(\pi)$ be the number of $k$-cycles in $\pi$ and let $W = (C_1, C_2, \ldots, C_d)$. Let $Z = (Z_1, Z_2, \ldots, Z_d)$, where the $\{Z_k\}$ are independent Poisson random variables with rate $\lambda_k = \mathbb{E}(C_k)$. Suppose $d = o(n^{1/4})$. Then

$$d_{TV}(W, Z) = O\left(\frac{d^4}{n-d}\right).$$

Moreover, the process of cycle counts converges to a process of independent Poisson random variables,

$$(C_1, C_2, \ldots) \xrightarrow{D} (Y_1, Y_2, \ldots)$$

as $n \to \infty$, where the $\{Y_k\}$ are independent Poisson random variables with rate $\frac{1}{k}$.

This parallels the result of Arratia and Tavaré [12] on the cycle structure of uniformly random permutations. Our proof uses a multivariate version of Stein's method with exchangeable pairs, due to Chatterjee, Diaconis, and Meckes [26].
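As a quick illustration (our sketch, not from [88]; all names are ours), the following Python code samples uniform parking functions by rejection, using the classical criterion that $\pi$ is a parking function if and only if its sorted preferences satisfy $\pi_{(i)} \le i$ for all $i$, and tallies the $k$-cycles of the map $i \mapsto \pi(i)$. Even for moderate $n$ the empirical averages are already close to the limiting values $1/k$.

```python
import random
from collections import Counter

def is_parking_function(p):
    """Classical criterion: the sorted preferences satisfy p_(i) <= i."""
    return all(v <= i for i, v in enumerate(sorted(p), start=1))

def random_parking_function(n):
    """Rejection sampling: draw uniformly from [n]^n until a parking function appears."""
    while True:
        p = [random.randint(1, n) for _ in range(n)]
        if is_parking_function(p):
            return p

def cycle_counts(p):
    """Count k-cycles of i -> p[i-1] in the functional digraph of p."""
    counts, seen = Counter(), [False] * (len(p) + 1)
    for start in range(1, len(p) + 1):
        path, i = [], start
        while not seen[i]:            # walk forward until something repeats
            seen[i] = True
            path.append(i)
            i = p[i - 1]
        if i in path:                 # this walk closed a new cycle
            counts[len(path) - path.index(i)] += 1
    return counts

n, trials = 10, 2000
total = Counter()
for _ in range(trials):
    total.update(cycle_counts(random_parking_function(n)))
print([round(total[k] / trials, 3) for k in (1, 2, 3)])  # roughly 1, 1/2, 1/3
```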
Chapter 2: Stein's Method

2.1 Introduction

In this chapter, we give a brief history and overview of Stein's method. We follow the exposition given by Ross in the accessible survey paper [95].

A central theme in probability and statistics is establishing distributional limit theorems. However, many of the techniques used to show distributional convergence, for example characteristic functions and the method of moments, do not provide a rate of convergence or error bound on the distributional approximation.

Stein's method is a powerful technique, introduced by Charles Stein in [99], which is used to bound the distance between two probability distributions. Stein initially introduced it as a method to provide an error bound on the normal approximation for sums of dependent random variables, but it has since been developed for many other target distributions, such as the Poisson [28], multinomial [80], exponential [27], and even the Dickman [19]. The main advantage of using Stein's method is that it provides an explicit error bound on the distributional approximation in addition to proving distributional convergence.

The heart of Stein's method lies in the following two components. The first component turns the problem of bounding the approximation error between the distribution of interest and a target distribution into the problem of bounding the expectation of a functional of the random variable of interest. The second component is the collection of techniques used to bound the expectation of this functional. Stein referred to this step as auxiliary randomization.

As previously mentioned, the first component has been achieved for a wide range of distributions. Moreover, there are now many different techniques which have been developed for the auxiliary randomization step of the second component, such as dependency graphs and coupling techniques. Furthermore, the coupling techniques which have been developed are also useful for concentration of measure inequalities and local limit theorems. More recently, Stein's method has found applications in stochastic analysis and data science. Thus Stein's method continues to be a fruitful and important field of mathematical and applied study.

2.2 Probability Metrics

Since Stein's method concerns bounding the distance between two probability distributions, we begin by defining the probability metrics that we use. Let $\mu$ and $\nu$ be two probability measures. We use probability metrics of the form

$$d_{\mathcal{H}}(\mu, \nu) = \sup_{h \in \mathcal{H}} \left| \int h(x)\,d\mu(x) - \int h(x)\,d\nu(x) \right|,$$

where $\mathcal{H}$ is some family of test functions. As an abuse of notation, if $X$ and $Y$ are random variables with distributions $\mu$ and $\nu$, respectively, we write $d_{\mathcal{H}}(X, Y)$ to denote $d_{\mathcal{H}}(\mu, \nu)$.

Setting $\mathcal{H} = \{h : \mathbb{R} \to \mathbb{R} : |h(x) - h(y)| \le |x - y|\}$ to be the set of 1-Lipschitz functions, denoted $\mathrm{Lip}_1$, gives the Wasserstein distance

$$d_W(\mu, \nu) := \sup_{h \in \mathrm{Lip}_1} \left| \int h(x)\,d\mu(x) - \int h(x)\,d\nu(x) \right|.$$

This metric is widely used for approximation by continuous distributions.

Next, setting $\mathcal{H} = \{\mathbf{1}_{(-\infty, x]} : x \in \mathbb{R}\}$ gives the Kolmogorov distance

$$d_K(\mu, \nu) := \sup_{x \in \mathbb{R}} \left| \mu(-\infty, x] - \nu(-\infty, x] \right|.$$

This is the metric that we use for our normal approximations.

Finally, setting $\mathcal{H} = \{\mathbf{1}_A : A \in \mathrm{Borel}(\mathbb{R})\}$ gives the total variation distance

$$d_{TV}(\mu, \nu) := \sup_{A \subseteq \Omega} |\mu(A) - \nu(A)|,$$

where $\Omega$ is a measurable space. If $W$ and $Z$ are discrete random variables on $\Omega$, another way to write the total variation distance is

$$d_{TV}(W, Z) = \frac{1}{2} \sum_{\omega \in \Omega} |P(W = \omega) - P(Z = \omega)|.$$

This metric is useful for approximation by discrete distributions and is the metric that we use for our Poisson approximations.

A useful fact is that if a random variable $Z$ has Lebesgue density bounded by $C$, then for any random variable $W$,

$$d_K(W, Z) \le \sqrt{2C\,d_W(W, Z)}.$$

This implies, for example, that a bound on the Wasserstein distance between a distribution of interest and a normal distribution gives a bound on their Kolmogorov distance.
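For discrete distributions these metrics can be evaluated directly. The sketch below (ours, not part of the survey [95]) computes $d_{TV}$ and $d_K$ for two finitely supported distributions represented as dictionaries mapping atoms to probabilities.

```python
def total_variation(p, q):
    """d_TV(p, q) = (1/2) * sum of |p(w) - q(w)| over the combined support."""
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))

def kolmogorov(p, q):
    """d_K(p, q) = sup_x |F_p(x) - F_q(x)|; for discrete laws the supremum
    is attained just after an atom, so checking the atoms suffices."""
    cp = cq = best = 0.0
    for x in sorted(set(p) | set(q)):
        cp += p.get(x, 0.0)
        cq += q.get(x, 0.0)
        best = max(best, abs(cp - cq))
    return best

from math import exp, factorial
poisson1 = {k: exp(-1) / factorial(k) for k in range(40)}  # Poisson(1), truncated
two_point = {0: 0.5, 1: 0.5}
print(total_variation(two_point, poisson1), kolmogorov(two_point, poisson1))
```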
2.3 Normal Approximation

The main idea of Stein's method is to replace the characteristic function typically used to show distributional convergence with a characterizing operator for the target distribution. We outline this for the normal distribution.

The starting point is Stein's Lemma, which is as follows. For the case of the standard normal distribution, define the functional operator $\mathcal{A}$ by

$$\mathcal{A}f(x) = f'(x) - x f(x).$$

If $Z$ has the standard normal distribution, then $\mathbb{E}\mathcal{A}f(Z) = 0$ for all absolutely continuous $f$ with $\mathbb{E}|f'(Z)| < \infty$. On the other hand, if for some random variable $W$ we have that $\mathbb{E}\mathcal{A}f(W) = 0$ for all absolutely continuous functions with $\|f'\| < \infty$, then $W$ has the standard normal distribution. The operator $\mathcal{A}$ is the characterizing operator of the standard normal distribution.

Let $X$ and $Y$ be random variables and let $\mathcal{H}$ be some family of functions. Recall the probability metric

$$d_{\mathcal{H}}(X, Y) = \sup_{h \in \mathcal{H}} |\mathbb{E}h(X) - \mathbb{E}h(Y)|.$$

Now let $\Phi$ be the cdf of the standard normal distribution and write $\Phi(h)$ for $\mathbb{E}h(Z)$, where $Z$ is standard normal. Then the unique bounded solution, $f_h$, to the Stein equation

$$f_h'(w) - w f_h(w) = h(w) - \Phi(h)$$

is given by

$$f_h(w) = e^{w^2/2} \int_w^\infty e^{-t^2/2} \left( \Phi(h) - h(t) \right) dt = -e^{w^2/2} \int_{-\infty}^w e^{-t^2/2} \left( \Phi(h) - h(t) \right) dt.$$

Stein's Lemma and the Stein equation lie at the heart of Stein's method. As a corollary, we get the following. If $W$ is a random variable and $Z$ has the standard normal distribution, then

$$d_{\mathcal{H}}(W, Z) = \sup_{h \in \mathcal{H}} \left| \mathbb{E}[f_h'(W) - W f_h(W)] \right|.$$

Thus if $W$ is approximately standard normal, then the right hand side of the above should be close to zero.

The main strategy for bounding the distance between the distributions of $W$ and $Z$ is now clear. We want to bound $\mathbb{E}[f_h'(W) - W f_h(W)]$ for the functions $f_h$ which solve the Stein equation. Although this may seem more complicated to handle, it is usually more tractable, since it involves the distribution of only one random variable. The strategy for bounding $\mathbb{E}[f_h'(W) - W f_h(W)]$ is to use the structure of $W$ to rewrite $\mathbb{E}(W f(W))$ so that it is shown to be close to $\mathbb{E}(f'(W))$. There are now numerous techniques which have been developed to bound this expectation in the case of the normal distribution. These include dependency graphs [14], exchangeable pairs [100], size-bias couplings [55], zero-bias couplings [53], and, more generally, Stein couplings [29].

2.4 Poisson Approximation

We now give an outline of Stein's method for the Poisson distribution. The approach is analogous to the case of normal approximation. Stein's method was extended to the Poisson distribution by Chen in [28], and so it is also referred to as the Chen-Stein method for Poisson approximation. Barbour, Holst, and Janson give an accessible treatise on Poisson approximation in their book [16].

The starting point is the characterizing operator of the Poisson distribution, which is given by

$$\mathcal{A}f(k) = \lambda f(k+1) - k f(k),$$

for $\lambda > 0$. Then Stein's Lemma for the Poisson distribution is as follows. If $Z$ has the Poisson distribution with mean $\lambda$, then $\mathbb{E}\mathcal{A}f(Z) = 0$ for all bounded $f$. On the other hand, if for some non-negative integer-valued random variable $W$, $\mathbb{E}\mathcal{A}f(W) = 0$ for all bounded functions $f$, then $W$ has the Poisson distribution with mean $\lambda$.

Let $\mathcal{P}_\lambda$ be the probability measure with respect to a Poisson distribution with mean $\lambda$. Let $A \subseteq \mathbb{N} \cup \{0\}$. Then the unique solution, $f_A$, to the Stein equation

$$\lambda f_A(k+1) - k f_A(k) = \mathbf{1}_{\{k \in A\}} - \mathcal{P}_\lambda(A)$$

with $f_A(0) = 0$ is given by

$$f_A(k) = \frac{e^\lambda (k-1)!}{\lambda^k} \left[ \mathcal{P}_\lambda(A \cap U_k) - \mathcal{P}_\lambda(A)\,\mathcal{P}_\lambda(U_k) \right],$$

where $U_k = \{0, 1, \ldots, k-1\}$. One can show that if $f_A$ solves the Stein equation, then

$$\|f_A\| \le \min\{1, \lambda^{-1/2}\} \quad \text{and} \quad \|\Delta f_A\| \le \frac{1 - e^{-\lambda}}{\lambda} \le \min\{1, \lambda^{-1}\},$$

where $\Delta f(k) := f(k+1) - f(k)$. Let $\mathcal{F}$ be the set of functions which satisfy the two conditions above.

As a corollary to the above, we have the following. If $W \ge 0$ is an integer-valued random variable with mean $\lambda$, then

$$|P(W \in A) - \mathcal{P}_\lambda(A)| = \left| \mathbb{E}[\lambda f_A(W+1) - W f_A(W)] \right|.$$

Therefore

$$d_{TV}(W, Z) \le \sup_{f \in \mathcal{F}} \left| \mathbb{E}[\lambda f(W+1) - W f(W)] \right|,$$

where $Z$ is a Poisson random variable with mean $\lambda$.

Similar to the normal approximation case, the main strategy in bounding the distance between the distributions of $W$ and $Z$ is to use the structure of $W$ to show that $\mathbb{E}(W f(W))$ is close to $\lambda\,\mathbb{E}(f(W+1))$. Several techniques which have been developed to bound this expectation in the Poisson case include dependency graphs [10, 9], size-bias couplings [16], and exchangeable pairs [26].
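As a numerical sanity check on the two characterizing operators (our sketch; not part of the thesis), one can verify by Monte Carlo that $\mathbb{E}[f'(Z) - Zf(Z)] \approx 0$ when $Z$ is standard normal, and $\mathbb{E}[\lambda f(W+1) - Wf(W)] \approx 0$ when $W$ is Poisson$(\lambda)$, for a smooth bounded test function $f$.

```python
import math
import random

random.seed(0)
f = math.tanh                                  # a smooth bounded test function
fprime = lambda x: 1.0 - math.tanh(x) ** 2

N, lam = 200_000, 2.0

# Normal characterization: E[f'(Z) - Z f(Z)] = 0 for Z ~ N(0, 1).
zs = (random.gauss(0.0, 1.0) for _ in range(N))
print(sum(fprime(z) - z * f(z) for z in zs) / N)       # ~ 0 up to Monte Carlo error

def poisson(lam):
    """Knuth's method: count uniforms until their product drops below e^{-lam}."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

# Poisson characterization: E[lam f(W+1) - W f(W)] = 0 for W ~ Poisson(lam).
ws = (poisson(lam) for _ in range(N))
print(sum(lam * f(w + 1) - w * f(w) for w in ws) / N)  # ~ 0 up to Monte Carlo error
```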
Chapter 3: Chord Diagrams

3.1 Introduction

A chord diagram of size $n$ is a matching of $2n$ points on a circle, labeled in clockwise order, with each matched pair corresponding to a chord. Alternatively, we can represent it as a linearized chord diagram by placing $2n$ points on a line in increasing order and connecting pairs by arcs. See Figure 3.1 for an example. There are $(2n-1)!!$ chord diagrams of size $n$.

A matching, or fixed-point-free involution, is a permutation $\pi \in S_{2n}$ such that $\pi^2 = 1$ and $\pi(i) \ne i$ for all $i \in \{1, \ldots, 2n\}$. Thus we may represent a chord diagram, $C_n$, by its corresponding matching, $\pi$, and write $C_n = C_n(\pi) = \pi$. For the remainder of this chapter, we go back and forth between $C_n$ and $\pi$ to denote a chord diagram, whichever representation is more useful in context. For example, the chord diagram in Figure 3.1 can be written as a matching in cycle notation as $\pi = (1\,8)(2\,9)(3\,4)(5\,7)(6\,10)(11\,12)$.

We now define terminology and fix some notation that will be used throughout the chapter. For a positive integer $n$, let $[n] = \{1, \ldots, n\}$. We let $(2n-1)!!$ denote the product of the odd integers from 1 to $2n-1$. If there exist positive constants $c$ and $n_0$ such that $a_n \le c b_n$ for all $n \ge n_0$, then we write $a_n = O(b_n)$. The chord connecting any two points $x$ and $y$ is denoted by $(x, y)$.

A connected chord diagram is a diagram in which no set of chords can be separated from the remaining chords by a line. A component is a maximal connected subdiagram. The root component is the component that contains the point labeled 1. A crossing is a quadruple $(i, j, k, \ell)$, with $i < j < k < \ell$, such that $(i, k)$ and $(j, \ell)$ are chords. On the other hand, a nesting is a quadruple $(i, j, k, \ell)$, with $i < j < k < \ell$, such that $(i, \ell)$ and $(j, k)$ are chords. A simple chord is a chord that connects two consecutive endpoints; that is, a simple chord is of the form $(i, i+1)$, for $i \in [2n]$, where addition is understood to be modulo $2n$. For example, the chord diagram in Figure 3.1 has 4 crossings, 4 nestings, 3 components, and 2 simple chords.

[Figure 3.1: A chord diagram and its linearized version.]

The enumeration of chord diagrams was initiated by Touchard in [102], where he found a generating function for $T_{n,k}$, the number of chord diagrams of size $n$ with exactly $k$ crossings, in the form of a continued fraction. In particular, this gives $T_{n,0} = \frac{1}{n+1}\binom{2n}{n}$, the classical result that the number of noncrossing chord diagrams of size $n$ is given by the $n$th Catalan number. Riordan [94] extended Touchard's results by finding an explicit formula for $T_{n,k}$ in the form of an alternating sum. This allows for the computation of $T_{n,k}$ for moderately sized $n$ and $k$; however, obtaining asymptotics for $T_{n,k}$ as $n, k \to \infty$ remains difficult.

Probabilistic questions have also been studied. Using recurrence relations, Stein and Everett [101] showed that the probability that a random chord diagram is connected approaches $1/e$ as $n \to \infty$. Flajolet and Noy [44] used generating functions and analytic combinatorics to find the asymptotic distributions of the number of components, the size of the largest component, and the number of crossings. Moreover, they showed that a random chord diagram is monolithic with high probability, meaning it consists of one large connected component and some isolated chords. In [30], Chen et al. showed that the crossing numbers and nesting numbers of matchings have a symmetric joint distribution. More recently, Acan [1] extended Flajolet and Noy's results about the components of a random chord diagram in several directions, and Acan and Pittel [2] discovered the emergence of a giant component in the intersection graph of a random chord diagram, under some conditions on the number of crossings.

Chord diagrams have found applications in a wide variety of fields such as topology, random graph theory, biology, quantum field theory, and free probability [5, 21, 22, 58, 76, 79, 81, 85].
3.1.1 Main Results

In [44], Flajolet and Noy used generating functions to prove a central limit theorem for the number of crossings in a random chord diagram. Another proof, using weighted dependency graphs, was given by Feray in [43]. The main difficulty with the crossing statistic is that occurrences of crossings in disjoint sets of indices are not independent events. Our main result gives a rate of convergence for the asymptotic normality of the number of crossings by providing an upper bound of order $n^{-1/2}$ on the Kolmogorov distance between the standardized number of crossings and a standard normal random variable.

Theorem 3.1.1 ([87, 7], Theorem 1.1). Let $X_n$ be the number of crossings in a random chord diagram of size $n \ge 2$. Let $W_n = \frac{X_n - \mu_n}{\sigma_n}$, where $\mu_n = \mathbb{E}X_n$ and $\sigma_n^2 = \mathrm{Var}(X_n)$. Then

$$d_K(W_n, Z) = O\left(n^{-1/2}\right),$$

where $Z$ is a standard normal random variable.

The proof uses size-bias coupling and Stein's method. Size-bias coupling and Stein's method have previously been used to prove central limit theorems, for example, by Conger and Viswanath [32] for the number of inversions and descents of permutations of multisets, Betken [18] for the number of isolated vertices in preferential attachment random graphs, and He [60] for the number of descents in a Mallows distributed permutation and its inverse.

We also use size-bias coupling and Stein's method to obtain an upper bound on the total variation distance between the number of simple chords and a Poisson random variable.

Theorem 3.1.2 ([87], Theorem 1.2). Let $S_n$ be the number of simple chords in a random chord diagram of size $n$ and let $Y$ be a $\mathrm{Poisson}(1)$ random variable. Then

$$d_{TV}(S_n, Y) = O\left(\frac{1}{n}\right).$$

As an application of Theorem 3.1.2, we obtain absolute error bounds on the number of chord diagrams of size $n$ which contain no simple chords.

Theorem 3.1.3 ([87], Theorem 1.3). Let $s(n)$ be the number of chord diagrams of size $n$ with no simple chords. Then

$$(2n-1)!! \left( \frac{e^{-\frac{1}{2n-1}}}{e} - \frac{10}{n} \right) \le s(n) \le (2n-1)!! \left( \frac{e^{-\frac{1}{2n-1}}}{e} + \frac{10}{n} \right).$$

We remark that Stein's method via size-bias coupling has been successfully used to prove Poisson limit theorems. Some examples are in Angel, van der Hofstad, and Holmgren [6] for the number of self-loops and multiple edges in the configuration model, Arratia and DeSalvo [8] for completely effective error bounds for Stirling numbers of the first and second kinds, Goldstein and Reinert [54] for the size of the intersection of random subsets of $[n]$, and Holmgren and Janson [64] for sums of functions of subtrees of random binary search trees and random recursive trees.

3.1.2 Outline

This chapter is organized as follows. In Section 3.2 we give a brief overview of size-bias coupling and Stein's method. We then construct a size-bias coupling for the number of crossings and combine this with a size-bias coupling version of Stein's method to prove Theorem 3.1.1. In Section 3.3 we use size-bias coupling and Stein's method for Poisson approximation to prove Theorem 3.1.2. We then use this to prove Theorem 3.1.3. We conclude the chapter with final remarks and some open problems.

3.2 Crossings

A crossing in a chord diagram, $C_n$, is a quadruple $(a, b, c, d)$ with $a < b < c < d$ such that $(a, c)$ and $(b, d)$ are chords in $C_n$. Let $X_n$ be the number of crossings in $C_n$. Let $N = \binom{2n}{4}$ and order the set of quadruples $(a, b, c, d)$, with $a < b < c < d$, in some arbitrary but fixed labeling. Let $\mathbf{1}_{(i,j)}$ be the indicator random variable that the chord $(i, j)$ is in $C_n$. For a quadruple $k = (a, b, c, d)$, let $Y_k = \mathbf{1}_{(a,c)}\mathbf{1}_{(b,d)}$, so that $Y_k$ is the indicator random variable that the quadruple $k$ forms a crossing in $C_n$. Then we can write $X_n$ as a sum of indicators as $X_n = \sum_{k=1}^N Y_k$.

The next lemma is useful for the remainder of the chapter. We omit the straightforward proof.

Lemma 3.2.1. Given distinct $i_1, \ldots, i_k, j_1, \ldots, j_k \in [2n]$, the probability that a random chord diagram, $C_n$, contains the set of chords $\{(i_m, j_m)\}_{1 \le m \le k}$ is

$$P\left( \{(i_m, j_m)\}_{1 \le m \le k} \subseteq C_n \right) = \frac{(2n-2k-1)!!}{(2n-1)!!} = \frac{1}{(2n-1)(2n-3)\cdots(2n-2k+1)}.$$

3.2.1 Mean and Variance

Let $\mu_n = \mathbb{E}(X_n)$ and $\sigma_n^2 = \mathrm{Var}(X_n)$ be the mean and variance of $X_n$. The exact values of $\mu_n$ and $\sigma_n$ were stated without proof in [94]. They were subsequently derived via generating functions in [44], where moments of any fixed order were determined. A more direct proof using the method of indicators, with some computer assistance for the variance computation, was given in [43].
For a quadruple k=(a;b;c;d), let Y k = 1 (a;c) 1 (b;d) , so that Y k is the indicator random variable that the quadruple k forms a crossing in C n . Then we can write X n as a sum of indicators as X n =å N k=1 Y k . The next lemma is useful for the remainder of the chapter. We omit the straightforward proof. Lemma 3.2.1. Given distinct i 1 ;:::;i k ; j 1 ;:::; j k 2[2n], the probability that a random chord dia- gram, C n , contains the set of chordsf(i m ; j m )g 1mk is P(f(i m ; j m )g 1mk 2 C n )= (2n 2k 1)!! (2n 1)!! = 1 (2n 1)(2n 2k+ 1) : 3.2.1 Mean and Variance Let m n = E(X n ) and s 2 n = Var(X n ) be the mean and variance of X n . The exact values for m and s were stated without proof in [94]. This was subsequently proven via generating functions in [44], where moments of any fixed order were determined. A more direct proof using the method of indicators, with some computer assistance for the variance computation, was given in [43]. 19 Lemma 3.2.2 ([44], Theorem 3). Let X n be the number of crossings in a random chord diagram of size n. Then m n = n(n 1) 6 and s 2 n = n(n 1)(n+ 3) 45 : 3.2.2 Size-Bias Coupling and Stein’s Method Let X be a non-negative integer-valued random variable with finite mean. Then X s has the size-bias distribution of X if P(X s = x)= xP(X = x) EX : A size-bias coupling is a pair (X;X s ) of random variables defined on the same probability space such that X s has the size-bias distribution of X. To prove our main theorem, we will use the following size-bias coupling version of Stein’s method. Theorem 3.2.3. Let X 0 be a random variable such thatm= EX<¥ and Var(X)=s 2 . Suppose (X;X s ) is a bounded size-bias coupling, so thatjX s Xj A for some A. If W = Xm s and Z is a standard normal random variable, then d K (W;Z) 2m s 2 p Var(E[X s Xj X])+ 8A 2 s 3 : Proof. This follows by applying Construction 3A to Corollary 2.6 in [29]. Observe that this implies a central limit theorem if both error terms on the right hand side go to zero. We now turn to the construction of a size-biased version of the number of crossings X n . We use the following recipe provided in [95] for coupling a random variable X with its size-bias version X s . Let X =å n i=1 X i where X i 0 and m i = EX i . 20 1. For each i2 [n], let X s i have the size-bias distribution of X i independent of (X k ) k6=i and (X s k ) k6=i . Given X s i = x, define the vector(X (i) k ) k6=i to have the distribution of(X k ) k6=i condi- tional on X i = x. 2. Choose a random summand X I , where the index I is chosen, independent of all else, with probability P(I= i)=m i =m, where m = EX. 3. Define X s =å k6=I X (I) k + X s I . If X s is constructed by Items (1)–(3) above, then X s has the size-bias distribution of X. As a special case, note that if X i is a zero-one random variable, then its size-bias distribution is X s i = 1. We summarize this as the following proposition. Proposition 3.2.4 ([95], Corollary 3.24). Let X 1 ;:::;X n be zero-one random variables and let p i := P(X i = 1). For each i2[n], let(X (i) k ) have the distribution of(X k ) k6=i conditional on X i = 1. If X =å n i=1 X i , m = EX, and I is chosen independent of all else with P(I= i)= p i =m, then X s = å k6=I X (I) k + 1 has the size-bias distribution of X. 3.2.3 Construction Following steps (1)-(3) above, we construct a random variable X s n having the size-bias distribution with respect to the number of crossings X n . Fix a chord diagram C n and let X n be the number of crossings. 
3.2.3 Construction

Following steps (1)-(3) above, we construct a random variable $X_n^s$ having the size-bias distribution with respect to the number of crossings $X_n$. Fix a chord diagram $C_n$ and let $X_n$ be its number of crossings. Pick an index $I = (a, b, c, d)$ uniformly at random from $[N]$, independent of all else. We form a new chord diagram $C_n^s$ as follows. If $Y_I = 1$, set $C_n^s = C_n$. Otherwise, there is no crossing at index $I$. Delete all edges containing $a, b, c, d$ as endpoints. Then form the edges $(a, c)$ and $(b, d)$, so that there is a crossing at index $I$. Finally, form the edges $(\pi(a), \pi(c))$ and $(\pi(b), \pi(d))$. Let $C_n^s$ be the resulting chord diagram. It is clear that the resulting chord diagram $C_n^s$ is a uniformly random chord diagram, conditioned on containing the edges $(a, c)$ and $(b, d)$.

Set $X_n^s = \sum_{k \ne I} Y_k^{(I)} + 1$, where $Y_k^{(I)}$ is the indicator random variable that the new chord diagram $C_n^s$ has a crossing at index $k$. By construction, $C_n^s$ is a uniformly random chord diagram conditioned on $Y_I = 1$. Thus $(Y_k^{(I)})$ has the marginal distribution of $(Y_k)_{k \ne I}$ conditional on $Y_I = 1$. By Proposition 3.2.4, $X_n^s$ has the size-bias distribution of $X_n$. We record this as the following proposition.

Proposition 3.2.5. Let $X_n$ be the number of crossings in a random chord diagram of size $n$. Let $X_n^s$ be constructed as above. Then $X_n^s$ has the size-bias distribution of $X_n$.

3.2.4 Central Limit Theorem for the Number of Crossings

In this section we use the size-bias coupling from the previous section, combined with the size-bias version of Stein's method, Theorem 3.2.3, to prove Theorem 3.1.1. We begin by bounding the variance term in Theorem 3.2.3, which is the main difficulty in applying this theorem.

Lemma 3.2.6. Let $X_n$ be the number of crossings in a random chord diagram, $\pi$, and let $X_n^s$ have the size-bias distribution with respect to $X_n$. Then

$$\mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid X_n] \right) = O(n).$$

Proof. First note that

$$\mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid X_n] \right) \le \mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid \pi] \right).$$

Let $X_n^{(i)}$ denote $X_n^s$ conditioned to have a crossing at index $I = i$. Then

$$\mathbb{E}(X_n^s - X_n \mid \pi) = \sum_{i=1}^N \mathbb{E}(X_n^s - X_n \mid \pi, I = i)\,P(I = i) = \frac{1}{N} \sum_{i=1}^N \left( X_n^{(i)} - X_n \right),$$

where $X_n^{(i)} - X_n$ is the change in the number of crossings. Thus we can write the variance as

$$\mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid \pi] \right) = \frac{1}{N^2} \sum_{1 \le i, j \le N} \mathrm{cov}_\pi\left( X_n^{(i)} - X_n,\, X_n^{(j)} - X_n \right),$$

where the covariance is taken with respect to $\pi$. By construction, $|X_n^{(i)} - X_n| \le 4n$, since there are at most four new chords in $\pi^s$ and each chord either creates or destroys at most $n$ crossings.

We split up the sum according to whether the two sets of indices $i, j$ satisfy $|i \cap j| \ne 0$ or $|i \cap j| = 0$. Observe that the number of variance terms is at most $n^4$ and the number of covariance terms with $|i \cap j| \ne 0$ is at most $16n^7$. For these terms, it is enough to use the bound

$$\mathrm{cov}_\pi\left( X_n^{(i)} - X_n,\, X_n^{(j)} - X_n \right) \le 16n^2.$$

Thus the contribution of these variance and covariance terms to the sum is upper bounded by $16n^2(n^4 + 16n^7) \le 272n^9$.

It remains to bound the covariance terms such that $|i \cap j| = 0$. There are at most $n^8$ such terms. We can write the covariance as

$$\mathrm{cov}_\pi\left( X_n^{(i)} - X_n,\, X_n^{(j)} - X_n \right) = \mathbb{E}\left[ (X_n^{(i)} - X_n)(X_n^{(j)} - X_n) \right] - \left[ \mathbb{E}(X_n^{(i)} - X_n) \right]^2.$$

For the second expectation, we begin with the previously derived expression

$$\mathbb{E}(X_n^s - X_n) = \mathbb{E}\left( \mathbb{E}(X_n^s - X_n \mid \pi) \right) = \frac{1}{N} \sum_{i=1}^N \mathbb{E}(X_n^{(i)} - X_n) = \mathbb{E}(X_n^{(i)} - X_n),$$

where in the last equality we used the fact that $\mathbb{E}(X_n^{(i)} - X_n)$ is the same for all $i$, by definition. By definition of the size-bias distribution,

$$\mathbb{E}(X_n^s) = \frac{\mathbb{E}(X_n^2)}{\mu_n} = \frac{\sigma_n^2 + \mu_n^2}{\mu_n}.$$

This gives

$$\mathbb{E}(X_n^{(i)} - X_n) = \frac{\sigma_n^2 + \mu_n^2}{\mu_n} - \mu_n = \frac{\sigma_n^2}{\mu_n} = \frac{2(n+3)}{15}.$$

The first expectation term is more difficult to compute.
However, a direct but tedious computation shows that

$$\mathbb{E}\left[ (X_n^{(i)} - X_n)(X_n^{(j)} - X_n) \right] = \frac{4n^2}{225} + \frac{404n}{4725} + \frac{164}{945}.$$

The idea is to rewrite the product as a sum of indicators, but we omit the tedious computations here and refer the reader to our forthcoming paper with Arenas-Velilla and Arizmendi [7] for the details. Thus we get that

$$\mathrm{cov}_\pi\left( X_n^{(i)} - X_n,\, X_n^{(j)} - X_n \right) = \frac{4n^2}{225} + \frac{404n}{4725} + \frac{164}{945} - \left( \frac{2(n+3)}{15} \right)^2 = \frac{64 - 100n}{4725},$$

so that each covariance term with $|i \cap j| = 0$ is bounded by

$$\left| \mathrm{cov}_\pi\left( X_n^{(i)} - X_n,\, X_n^{(j)} - X_n \right) \right| \le n.$$

Therefore the total contribution of these covariance terms is at most $n^9$.

Combining the cases above gives

$$\mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid \pi] \right) \le \frac{272n^9 + n^9}{N^2} \le 50^2\,n = O(n),$$

which gives the desired upper bound.

Proof of Theorem 3.1.1. Let $(X_n, X_n^s)$ be the size-bias coupling from Section 3.2.3, so that by Proposition 3.2.5, $X_n^s$ has the size-bias distribution of $X_n$. By construction, we have that $|X_n^s - X_n| \le 4n$, so we can set $A := 4n$. Moreover, for $n \ge 2$, we have that $\mu_n \le \frac{n^2}{6}$ and $\sigma_n^2 \ge \frac{n^3}{45}$. Therefore, by Lemmas 3.2.2 and 3.2.6 and Theorem 3.2.3,

$$d_K(W_n, Z) \le \frac{2\mu_n}{\sigma_n^2} \sqrt{\mathrm{Var}\left( \mathbb{E}[X_n^s - X_n \mid X_n] \right)} + \frac{8\mu_n A^2}{\sigma_n^3} \le \frac{2n^2}{6} \cdot \frac{45}{n^3} \cdot 50n^{1/2} + 8 \cdot \frac{n^2}{6} \cdot \frac{45^{3/2}}{n^{9/2}} \cdot 16n^2 \le 7190\,n^{-1/2}.$$

It follows that $d_K(W_n, Z) = O(n^{-1/2})$, as desired.

Theorem 3.1.1 shows that $d_K(W_n, Z) \to 0$ as $n \to \infty$. Therefore we recover the central limit theorem for the number of crossings, along with a rate of convergence.

3.3 Simple Chords

Let $S_n$ be the number of simple chords in a random chord diagram of size $n$. We can write $S_n$ as a sum of indicators as $S_n = \sum_{k=1}^{2n} X_k$, where $X_k$ is the indicator random variable that $(k, k+1)$ is a simple chord in $C_n$.

3.3.1 Poisson Limit Theorem for Simple Chords

Recall the relevant definitions and properties of size-bias distributions and size-bias couplings from Section 3.2. We will use the following size-bias coupling version of Stein's method for Poisson approximation to prove the Poisson limit theorem for the number of simple chords.

Theorem 3.3.1 ([95], Theorem 4.13). Let $W \ge 0$ be an integer-valued random variable such that $\mathbb{E}(W) = \lambda > 0$ and let $W^s$ be a size-bias coupling of $W$. If $Z \sim \mathrm{Poisson}(\lambda)$, then

$$d_{TV}(W, Z) \le \min\{1, \lambda\}\,\mathbb{E}|W + 1 - W^s|.$$

Next, we follow steps (1)-(3) from Section 3.2.2 to construct the size-bias distribution $S_n^s$ with respect to $S_n$. Fix a chord diagram $\pi$. Pick an index $I \in [2n]$ uniformly at random, independent of $\pi$. If $X_I = 1$, set $\pi^s = \pi$. Otherwise, let $\pi^s$ be the chord diagram such that $\pi^s(I) = I + 1$, $\pi^s(\pi(I)) = \pi(I+1)$, and $\pi^s(k) = \pi(k)$ for $k \notin \{I, I+1, \pi(I), \pi(I+1)\}$. That is, $\pi^s$ is the chord diagram that results from creating the simple chord $(I, I+1)$ and the chord $(\pi(I), \pi(I+1))$, and fixing all other chords. Finally, let $S_n^s = \sum_{k \ne I} X_k^{(I)} + 1$, where $X_k^{(I)}$ is the indicator that $\pi^s$ has the simple chord $(k, k+1)$. It is clear that $(X_k^{(i)})_{k \ne i}$ has the distribution of $(X_k)_{k \ne i}$ conditional on $X_i = 1$, so by Proposition 3.2.4 we have that $(S_n, S_n^s)$ is a size-bias coupling. We record this in the following proposition.

Proposition 3.3.2. Let $S_n$ be the number of simple chords in a random chord diagram of size $n$. Let $S_n^s$ be constructed as above. Then $S_n^s$ has the size-bias distribution of $S_n$.
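This coupling is straightforward to simulate (our sketch, not from [87]; the names are ours): force the simple chord $(I, I+1)$, re-pair the two displaced partners, and recount. One quick consistency check is the identity $\mathbb{E}S_n^s = \mathbb{E}S_n^2/\mathbb{E}S_n$, which any size-bias coupling must satisfy at the level of means.

```python
import random

def random_matching(n):
    """Uniform fixed-point-free involution of [2n], stored as a dict."""
    pts = list(range(1, 2 * n + 1))
    random.shuffle(pts)
    pi = {}
    for k in range(n):
        a, b = pts[2 * k], pts[2 * k + 1]
        pi[a], pi[b] = b, a
    return pi

def count_simple(pi, n):
    """Number of chords (i, i+1), cyclically, so (2n, 1) also counts."""
    return sum(1 for i in range(1, 2 * n + 1) if pi[i] == i % (2 * n) + 1)

def size_biased_simple(pi, n):
    """Force the chord (I, I+1) (cyclically) and re-pair the displaced partners."""
    pi = dict(pi)
    i = random.randint(1, 2 * n)
    j = i % (2 * n) + 1
    a, b = pi[i], pi[j]
    if a != j:                        # otherwise (i, j) is already a chord
        pi[i], pi[j] = j, i
        pi[a], pi[b] = b, a           # the chord (pi(I), pi(I+1)) of the construction
    return count_simple(pi, n)

n, trials = 6, 200_000
s = s2 = ss = 0.0
for _ in range(trials):
    pi = random_matching(n)
    x = count_simple(pi, n)
    s, s2, ss = s + x, s2 + x * x, ss + size_biased_simple(pi, n)
print(s2 / s, ss / trials)   # the two estimates should roughly agree
```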
With the size-bias coupling established, we now prove Theorem 3.1.2.

Proof of Theorem 3.1.2. Let $S_n = \sum_{k=1}^{2n} X_k$ be the number of simple chords in a random chord diagram $C_n$ and let $Y_n$ be a $\mathrm{Poisson}(\lambda_n)$ random variable, where $\lambda_n := \mathbb{E}S_n = \frac{2n}{2n-1}$. By Proposition 3.3.2 and Theorem 3.3.1,

$$d_{TV}(S_n, Y_n) \le \min\{1, \lambda_n\}\,\mathbb{E}|S_n + 1 - S_n^s| = \mathbb{E}\left| X_I + \sum_{k \ne I} \left( X_k - X_k^{(I)} \right) \right| \le \mathbb{E}X_I + \sum_{k \ne I} \mathbb{E}\left| X_k - X_k^{(I)} \right|.$$

The first term on the right hand side is

$$\mathbb{E}X_I = \sum_{i=1}^{2n} \mathbb{E}(X_i)\,P(I = i) = \frac{1}{2n-1} \sum_{i=1}^{2n} \frac{1}{2n-1} = \frac{2n}{(2n-1)^2}.$$

For the second term, we consider two cases. If $k = i \pm 1$, then $X_k^{(i)} = 0$, so that $X_k - X_k^{(i)}$ is a zero-one random variable that equals one if and only if $X_k = 1$. This gives $\mathbb{E}(X_k - X_k^{(i)}) = P(X_k = 1) = \frac{1}{2n-1}$.

Otherwise, suppose $k \in [2n] \setminus \{i, i \pm 1\}$. Observe that $X_k^{(i)} \ge X_k$ by the size-bias coupling construction. It follows that $X_k^{(i)} - X_k$ is a zero-one random variable that equals one if and only if $(i, k), (i+1, k+1) \in C_n$ or $(i, k+1), (i+1, k) \in C_n$. This event occurs with probability $\frac{2}{(2n-1)(2n-3)}$. Thus

$$\sum_{k \ne I} \mathbb{E}\left| X_k - X_k^{(I)} \right| = \sum_{i=1}^{2n} \frac{1}{2n-1} \left( \sum_{k = i \pm 1} \mathbb{E}(X_k - X_k^{(i)}) + \sum_{k \notin \{i, i \pm 1\}} \mathbb{E}(X_k^{(i)} - X_k) \right) = \sum_{i=1}^{2n} \frac{1}{2n-1} \left( \frac{2}{2n-1} + \frac{2(2n-3)}{(2n-1)(2n-3)} \right) = \frac{8n}{(2n-1)^2}.$$

Combining the above gives

$$d_{TV}(S_n, Y_n) \le \frac{2n}{(2n-1)^2} + \frac{8n}{(2n-1)^2} = \frac{10n}{(2n-1)^2} \le \frac{10}{n}.$$

Finally, since $\lambda_n \to 1$ as $n \to \infty$, it follows that $d_{TV}(Y_n, Y) \to 0$. Therefore

$$d_{TV}(S_n, Y) = O\left(\frac{1}{n}\right).$$

Theorem 3.1.2 shows that $S_n$ converges in distribution to a $\mathrm{Poisson}(1)$ random variable.

3.3.2 Chord Diagrams with No Simple Chords

As an application of Theorem 3.1.2, we obtain absolute error bounds on the number of chord diagrams of size $n$ that contain no simple chords, for all $n \ge 1$.

Proof of Theorem 3.1.3. Let $S_n$ be the number of simple chords in a random chord diagram of size $n$ and let $Y_n$ be a $\mathrm{Poisson}(\lambda_n)$ random variable, where $\lambda_n = \frac{2n}{2n-1} = 1 + \frac{1}{2n-1}$. By the definition of total variation distance,

$$d_{TV}(S_n, Y_n) \ge |P(S_n = 0) - P(Y_n = 0)| = \left| \frac{s(n)}{(2n-1)!!} - e^{-\lambda_n} \right|,$$

for all $n \ge 1$. Applying Theorem 3.1.2 gives

$$\left| \frac{s(n)}{(2n-1)!!} - e^{-\lambda_n} \right| \le \frac{10}{n},$$

and rearranging yields

$$(2n-1)!! \left( \frac{e^{-\frac{1}{2n-1}}}{e} - \frac{10}{n} \right) \le s(n) \le (2n-1)!! \left( \frac{e^{-\frac{1}{2n-1}}}{e} + \frac{10}{n} \right).$$

From these bounds, we get that

$$s(n) = \frac{(2n-1)!!}{e}\,(1 + o(1)),$$

which implies that $s(n)$ is asymptotically $\frac{(2n-1)!!}{e}$. Therefore the probability that a random chord diagram of size $n$ contains no simple chords is asymptotically $\frac{1}{e}$.

3.4 Final Remarks

3.4.1 Length $j$ Chords

A chord has length $j$ if it is of the form $(i, i+j+1)$, for $i \in [2n]$, where addition is understood to be modulo $2n$. Let $L_j$ be the number of length $j$ chords in a random chord diagram of size $n$. Note that simple chords are length 0 chords. Acan [1] proved that for $0 \le j \le n-2$, $L_j$ converges in distribution to a $\mathrm{Poisson}(1)$ random variable, but did not provide a convergence rate. Using essentially the same argument as in the proof of Theorem 3.1.2, one can obtain an $O(n^{-1})$ upper bound on the total variation distance between the distribution of $L_j$ and a $\mathrm{Poisson}(1)$ random variable.

3.4.2 Descents in Matchings

Kim [72] showed that the number of descents in a random matching is asymptotically normal. Subsequently, Özdemir [86] used martingales to obtain a rate of convergence of order $n^{-1/2}$. It should also be tractable to use size-bias coupling and Stein's method to obtain another proof of the rate of convergence.

3.4.3 $k$-Crossings

Feray [43] introduced the theory of weighted dependency graphs and gave a normality criterion in this context. As an application, he gave another proof of the asymptotic normality of the number of crossings in a random chord diagram. It would be interesting to see if there is a Stein's method version for weighted dependency graphs, analogous to the one for regular dependency graphs in [14].
In another direction, Feray [43] states that weighted dependency graphs can be used to prove a central limit theorem for the number of $k$-crossings. It may be possible to use Stein's method to obtain a rate of convergence.

3.4.4 Pattern Occurrence in Multiset Permutations and Set Partitions

Chern, Diaconis, Kane, and Rhoades [31] showed that the number of crossings in a uniformly chosen set partition of $[n]$ is asymptotically normal. Note that a chord diagram is a special case, corresponding to a partition of $[2n]$ into $n$ blocks, each of size 2. More recently, Feray [42] generalized this result and used weighted dependency graphs to prove central limit theorems for the number of occurrences of any fixed pattern in multiset permutations and set partitions. Is it possible to use size-bias coupling and Stein's method to obtain rates of convergence?

3.4.5 Configuration Model

Fix $n \ge 1$, which will denote the number of vertices in a random graph. Let $\mathbf{d} = (d_1, \ldots, d_n)$, with $d_i \ge 1$ for all $i \in [n]$, be a degree sequence; that is, vertex $i$ has degree $d_i$. Consider $2m$ half-edges (so that two half-edges can be connected to form a single edge), where $2m = \sum_{i=1}^n d_i$, and perform a random matching of these half-edges. The resulting random graph $CM_n(\mathbf{d})$ is called the configuration model with degree sequence $\mathbf{d}$ (see [63], Ch. 7). Note that random chord diagrams correspond to the special case where the number of vertices is $2n$ and the degree sequence consists of all 1's. What is the asymptotic distribution of the number of crossings in the configuration model?

Chapter 4: Parabolic Double Cosets of the Symmetric Group

4.1 Introduction

Let $H$ and $K$ be subgroups of a finite group $G$. Define an equivalence relation on $G$ by

$$s \sim t \iff s = htk, \quad \text{for } s, t \in G,\ h \in H,\ k \in K.$$

The equivalence classes are the double cosets of $G$. Let $HsK$ denote the double coset containing the element $s$ and let $H \backslash G / K$ denote the set of double cosets.

Let $\lambda = (\lambda_1, \ldots, \lambda_I)$ be a partition of $n$, so that $\lambda_1 \ge \cdots \ge \lambda_I > 0$ and $\lambda_1 + \cdots + \lambda_I = n$. Let $\ell(\lambda)$ be the length of $\lambda$. The parabolic subgroup, $S_\lambda$, is the set of permutations in $S_n$ that permute $\{1, \ldots, \lambda_1\}$ among themselves, $\{\lambda_1 + 1, \ldots, \lambda_1 + \lambda_2\}$ among themselves, and so on. Thus we can write $S_\lambda = S_{\lambda_1} \times \cdots \times S_{\lambda_I}$. Let $\mu = (\mu_1, \ldots, \mu_J)$ be another partition of $n$. Then $S_\lambda \backslash S_n / S_\mu$ are the parabolic double cosets of $S_n$.

Let $\mathcal{T}_{\lambda,\mu}$ be the set of $I \times J$ contingency tables with nonnegative integer entries whose row sums equal $\lambda$ and whose column sums equal $\mu$. The parabolic double cosets $S_\lambda \backslash S_n / S_\mu$ are in bijection with $\mathcal{T}_{\lambda,\mu}$. The bijection can be described as follows. For the partition $\lambda$, define the $\lambda_i$-block to be the set of elements $L_i := \{\lambda_1 + \cdots + \lambda_{i-1} + 1,\ \lambda_1 + \cdots + \lambda_{i-1} + 2,\ \ldots,\ \lambda_1 + \cdots + \lambda_i\}$, for $1 \le i \le \ell(\lambda)$, where $\lambda_0$ is defined to be zero. Define the $\mu_i$-block, $M_i$, analogously for the partition $\mu$. Fix $\sigma \in S_n$. Then its corresponding contingency table $T = T(\sigma) = (T_{ij})$ is defined to be the table such that entry $T_{ij}$ is the number of elements from the set $M_j$ which occur in the positions of $\sigma$ given by the set $L_i$. This gives a map from $S_n$ to $\mathcal{T}_{\lambda,\mu}$. Thus we may write $T = T(\sigma)$ as the contingency table representing the double coset $S_\lambda \sigma S_\mu$, and we say that the double coset $S_\lambda \sigma S_\mu$ is indexed by $T$.

For example, let $n = 5$, $\lambda = (3, 2)$, and $\mu = (2, 2, 1)$. Consider $\sigma = 13425 \in S_5$. Then $L_1 = \{1, 2, 3\}$, $L_2 = \{4, 5\}$, $M_1 = \{1, 2\}$, $M_2 = \{3, 4\}$, and $M_3 = \{5\}$. By the bijection,

$$\sigma = 13425 \longmapsto \begin{pmatrix} 1 & 2 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$
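The forward map of this bijection is mechanical; the sketch below (ours, not from [89]) computes $T(\sigma)$ from $\sigma$ in one-line notation and the two partitions, and reproduces the table in the example above.

```python
def blocks(partition):
    """[3, 2] -> [{1, 2, 3}, {4, 5}]: consecutive blocks of [n]."""
    out, start = [], 1
    for part in partition:
        out.append(set(range(start, start + part)))
        start += part
    return out

def table_of(sigma, lam, mu):
    """T[i][j] = number of values from M_j occupying positions in L_i."""
    L, M = blocks(lam), blocks(mu)
    T = [[0] * len(M) for _ in L]
    for pos, val in enumerate(sigma, start=1):
        i = next(k for k, Lk in enumerate(L) if pos in Lk)
        j = next(k for k, Mk in enumerate(M) if val in Mk)
        T[i][j] += 1
    return T

print(table_of([1, 3, 4, 2, 5], [3, 2], [2, 2, 1]))
# [[1, 2, 0], [1, 0, 1]] -- the table in the example
```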
0 B @ 1 2 0 1 0 1 1 C A : The uniform distribution on S n induces the Fisher-Yates distribution on contingency tables, given by P l;m (T)= 1 n! (Õ i l i !)(Õ j m j !) Õ i; j T i j ! : It follows that the size of the double coset indexed by T is given by jS l sS m j= (Õ i l i !)(Õ j m j !) Õ i; j T i j ! : We remark that enumerating parabolic double cosets remains a difficult problem; in fact, it is #P-complete [37]. Contingency tables are a mainstay in statistics and there has been extensive work on their statistical properties. The importance of contingency tables can be seen in the fact that they arise when a population of size n is classified according to two discrete categories. Ronald Fisher studied the distribution of I J contingency tables conditioned on fixed row and column sums, l and m, respectively. Then the conditional probability P(Tjl;m) of obtaining the contingency table T is precisely the Fisher-Yates distribution. 33 Under the Fisher-Yates distribution, Kang and Klotz [70] showed that the joint distribution of the entries of the table T =(T i j ) is asymptotically multivariate normal. As corollaries, they also showed that each entry T i j is asymptotically normal, and that the chi-squared statistic c 2 (T)= å i; j T i j l i m j n 2 l i m j n ; which measures how “close” a table is to a product measure on tables, is asymptotically chi-squared distributed with(I1)(J1) degrees of freedom [70]. This shows that a typical contingency table looks like the independence table, T =(T i j )= l i m j n . Moreover, it follows that the largest double cosets are those which are closest to the independence table. Barvinok [17] obtained analogous results for uniformly distributed contingency tables. Contingency tables can be partially ordered as follows. Let T and T 0 be contingency tables with the same row and column sums. Then T 0 majorizes T , denoted by T T 0 , if the largest element in T 0 is greater than the largest element in T , the sum of the two largest elements in T 0 is greater than the sum of the two largest elements in T , etc. Let T;T 0 be Fisher-Yates distributed contingency tables. It follows that if T T 0 , then P(T)> P(T 0 ); see [39] for a short proof. Joe [68] proved that the independence table T is the unique smallest table in majorization order among all contingency tables with the same row and column sums. A natural statistic on contingency tables is the number of zero entries. Recall that most tables are close to the independence table T , which contains no zero entries, and so we expect most contingency tables to have no zero entries. Diaconis and Simper [39] showed, under some mild hypotheses, that the number of zeros in a Fisher-Yates distributed contingency table is asymptot- ically Poisson. Their proof uses the observation that a Fisher-Yates distributed contingency table is equivalent to rows of independent multinomial vectors, conditioned on the column sums. As open problems, they asked for the distribution of the positions of the zeros, and for the size and distribution of the largest entry in a Fisher-Yates distributed contingency table. 34 Another line of work is to consider double cosets of other groups G and subgroups H;K and study the distribution that a uniform distribution of G induces on HnG=K. The following two examples are treated in detail in [39]. Let G= GL n (F q ), whereF q is a finite field of size q, and let H = K be the subgroup of lower triangular matrices. 
Then the double cosets HnG=K are indexed by permutations, and the uniform distribution on G induces the Mallows measure on S n , p q (s)= q inv(s) [n] q ! ; where inv(s) denotes the number of inversions of the permutation s and [n] q !=Õ n i=1 (1+ q+ + q i1 ). Let G= S 2n be the symmetric group on[2n], and let H = K be the hyperoctahedral subgroup of centrally symmetric permutations, which are isomorphic to C n 2 o S n , the group of symmetries of an n-dimensional hypercube. Then the double cosets are indexed by the partitions of n, and the uniform distribution on G induces the Ewens’s sampling formula, p q (l)= n!q `(l) z z l ; where`(l) is the length of the partition l, z l =Õ n i=1 i a i a i !, and z= q(q+ 1):::(q+ n 1), with a i the number of parts ofl of size i. As Diaconis and Simper remarked in [39], enumeration by double cosets can lead to interesting mathematics. 4.1.1 Main Results We initiate the study of statistics on permutations chosen uniformly at random from a fixed parabolic double coset in S l nS n =S m . This line of study was suggested by Diaconis and Simper in [39], as a parallel to recent work on permutations from fixed conjugacy classes of S n [49, 72, 74, 73, 50]. 35 Under some mild conditions, we obtain limit theorems, with rates of convergence, for the number of fixed points, descents, and inversions. Let s2 S n . Then s has a fixed point at i2 [n] if s(i) = i. Let fp(s) be the number of fixed points of s. Then we can decompose fp(s) into a sum of indicator random variables as fp(s)=å n i=1 q i , where q i = 1 fs(i)=ig is the indicator random variable that s has a fixed point at index i. Define A k` := L k \ M ` [n] for all 1 k I and 1` J. Note that A k` may possibly be empty and that the setsfA k` g form a disjoint partition of[n]. Our first result shows, under some hypotheses on the partitionsl;m and the contingency table T , that the number of fixed points of a permutation chosen uniformly at random from a fixed parabolic double coset indexed by T is asymptotically Poisson. Moreover, we obtain an explicit error bound on the distributional approximation. Theorem 4.1.1. Let l =(l 1 ;:::;l I ) and m =(m 1 ;:::;m J ) be two partitions of n. Let s2 S n be a permutation chosen uniformly at random from a fixed double coset in S l nS n =S m indexed by T . Suppose T k` 1 for all 1 k I;1` J such that A k` 6= / 0. Let Y n be a Poisson random variable with raten n := E(fp(s)). Then d TV (fp(s);Y n ) 5(I+ J 1)minf1;n 1 n g maxfl I ;m J g : If the partitionsl and m also satisfy 1. lim n!¥ l I =¥ and lim n!¥ m J =¥, 2. lim n!¥ (I+ J 1)= C, for some constant C2N. and if limsup n!¥ n n =n <¥, then d TV (fp(s);Y)! 0 as n!¥, where Y is a Poisson random variable with raten, so that fp(s) d ! Y . The proof uses size-bias coupling and Stein’s method. The main advantage of using Stein’s method is that we are able to obtain error bounds on our distributional approximations, in addition to proving limit theorems. 36 We remark that Stein’s method via size-bias coupling has been successfully used to prove Poisson limit theorems; some examples are: Angel, van der Hofstad, and Holmgren [6] for the number of self-loops and multiple edges in the configuration model, Arratia and DeSalvo [8] for completely effective error bounds on Stirling numbers of the first and second kinds, Goldstein and Reinert [54] for Poisson subset numbers, and Holmgren and Janson for sums of functions of fringe subtrees of random binary search trees and random recursive trees [64]. 
The study of fixed points of random permutations began with Montmort [83], who showed that the number of fixed points in a uniformly random permutation is asymptotically Poisson with rate 1. Since then, the topic has branched out in many different directions. Diaconis, Fulman, and Guralnick [36] sought generalizations and obtained limit theorems for the number of fixed points under primitive actions of S n . In [35], Diaconis, Evans, and Graham established a relationship be- tween the distributions of fixed points and unseparated pairs in a uniformly random permutation. Mukherjee [84] used the recent theory of permutation limits to derive the asymptotic distributions of fixed points and cycle structures for several classes of non-uniform distributions on S n , which include Mallows distributed permutations. More recently, Miner and Pak [82] and Hoffman, Riz- zolo, and Slivken [62] studied fixed points in random pattern-avoiding permutations and showed connections with Brownian excursion. Lets2 S n . A pair(i; j)[n] 2 is a d-descent, or generalized descent, ofs if i< j i+ d and s(i)>s( j). In particular, a descent corresponds to a 1-descent and an inversion corresponds to an(n 1)-descent. Let des d (s) be the number of d-descents, des(s) the number of descents, and inv(s) the number of inversions ins2 S n . Our next three results establish central limit theorems for the number of descents, the number of d-descents for fixed d, and the number of inversions in a permutation chosen uniformly at random from a fixed double coset indexed by T by providing upper bounds on the Kolmogorov distances to the standard normal of order n 1=2 . 37 Theorem 4.1.2. Letl =(l 1 ;:::;l I ) andm=(m 1 ;:::;m J ) be two partitions of n such that I= o(n). Let s2 S n be a permutation chosen uniformly at random from a fixed double coset in S l nS n =S m . Let W n := des(s)m n s n , where m n := E(des(s)) ands 2 n := Var(des(s)). Then d K (W n ;Z) O(n 1=2 ); where Z is a standard normal random variable, so that W n d ! Z, as n!¥. Theorem 4.1.3. Letl =(l 1 ;:::;l I ) andm=(m 1 ;:::;m J ) be two partitions of n such that I= o(n). Let s2 S n be a permutation chosen uniformly at random from a fixed double coset in S l nS n =S m . Let d l I 2 be a fixed positive integer. Let W n := des d (s)m n s n , where m n := E(des d (s)) and s 2 n := Var(des d (s)). Then d K (W n ;Z) O(n 1=2 ); where Z is a standard normal random variable, so that W n d ! Z, as n!¥. Theorem 4.1.4. Letl =(l 1 ;:::;l I ) andm=(m 1 ;:::;m J ) be two partitions of n such that I= o(n). Let s2 S n be a permutation chosen uniformly at random from a fixed double coset in S l nS n =S m . Let W n := inv(s)m n s n , where m n := E(inv(s)) ands 2 n := Var(inv(s)). Then d K (W n ;Z) O(n 1=2 ); where Z is a standard normal random variable, so that W n d ! Z, as n!¥. The proofs use dependency graphs and Stein’s method. Stein’s method via dependency graphs is a powerful technique which reduces proving central limit theorems to a construction of a de- pendency graph and a variance estimate. This method has been successfully used to prove central limit theorems; some examples include Avram and Bertsimas [13] for the dependence range of several geometric probability models where random points are generated according to a Poisson 38 point process, Barany and Vu [15] for the volume and number of faces of a Gaussian random poly- tope, and Hofer [61] for the number of occurrences of a fixed vincular permutation pattern in a random permutation. 
The probabilistic study of descents and inversions of random permutations is a rich and ex- pansive topic. We do not attempt to cover the entire history, but we mention some previous work. There are many proofs for the asymptotic normality of descents, but we highlight Fulman’s proof using Stein’s method via exchangeable pairs and a non-reversible Markov chain in [48]. Bona [23] used the method of dependency graphs to show the asymptotic normality of generalized descents, and subsequently Pike [90] used Stein’s method and exchangeable pairs to obtain rates of con- vergence. Chatterjee and Diaconis [25] proved a central limit theorem for the two sided descent statistic in uniformly random permutations, and He [60] extended this result to Mallows distributed permutations. In another direction, Conger and Viswanath [32] established the asymptotic normal- ity of the number of descents and inversions in permutations of mulitsets. 4.1.2 Outline This chapter is organized as follows. Section 5.2 introduces notation and definitions that we use throughout the chapter and gives necessary background and relevant results on size-bias coupling, dependency graphs, Stein’s method, and concentration inequalities. In Section 4.3 we use size-bias coupling and Stein’s method to prove Theorem 4.1.1, and in Sections 4.4, 4.5, and 4.6 we construct dependency graphs for descents and inversions, and apply Stein’s method to prove Theorems 4.1.2, 4.1.3, and 4.1.4. We then use our size-bias coupling and dependency graph constructions to prove concentration of measure results in Section 4.7. We conclude with some final remarks in Section 5.5. 39 4.2 Preliminaries 4.2.1 Notation and Definitions For a positive integer n, let[n] :=f1;:::;ng. Let S n be the symmetric group on[n]. As an abuse of notation, we use s to denote a permutation, while s 2 n denotes the variance of some random variable, which may depend on n. Let a n ;b n be two sequences. If lim n!¥ a n b n is a nonzero constant, then we write a n b n and we say that a n is of the same order as b n . If there exists positive constants c and n 0 such that a n cb n for all n n 0 , then we write a n = O(b n ). Letl =(l 1 ;:::;l N ) be a partition of n. Recall from the introduction that thel i -block is the set L i :=fl 1 ++l i1 + 1;:::;l 1 ++l i g. Define thel i -border be the value l b i :=l 1 ++l i . Then thel i -interior is defined to be the set L o i := L i nfl b i g, that is, thel i -block minus thel i -border. Also recall from the introduction the sets A k` := L k \ M ` [n] for all 1 k I and 1` J, and note that these setsfA k` g form a disjoint partition of[n]. 4.2.2 Size-Bias Coupling Let W be a non-negative integer-valued random variable with finite mean. Then W s has the size- bias distribution of W if P(W s = w)= wP(W = w) EW : A size-bias coupling is a pair (W;W s ) of random variables defined on the same probability space such that W s has the size-bias distribution of W. We use the following outline provided in [95] for coupling a random variable W with its size- bias version W s . Let W =å n i=1 X i where X i 0 and p i = EX i . 40 1. For each i2 [n], let X s i have the size-bias distribution of X i independent of (X k ) k6=i and (X s k ) k6=i . Given X s i = x, define the vector(X (i) k ) k6=i to have the distribution of(X k ) k6=i condi- tional on X i = x. 2. Choose a random summand X I , where the index I is chosen, independent of all else, with probability P(I= i)= p i =l, wherel = EW. 3. Define W s =å k6=I X (I) k + X s I . 
If W s is constructed by Items (1)-(3) above, then W s has the size-bias distribution of W. As a special case, note that if the X i is an indicator random variable, then its size-bias distribution is X s i = 1. We summarize this as the following proposition. Proposition 4.2.1 ([95], Corollary 4.14). Let X 1 ;:::;X n be indicator variables with P(X i = 1)= p i , W =å n i=1 X i , and l = EW =å n i=1 p i . If for each i2[n], (X (i) k ) k6=i has the distribution of(X k ) k6=i conditional on X i = 1 and I is a random variable independent of all else such that P(I= i)= p i =l, then W s =å k6=I X (I) k + 1 has the size-bias distribution of W. 4.2.3 Stein’s method for Poisson Approximation We use the following size-bias coupling version of Stein’s method for Poisson approximation. Theorem 4.2.2 ([95], Theorem 4.13). Let W 0 be an integer-valued random variable such that EW =l > 0 and let W s be a size-bias coupling of W. If ZPoisson(l), then d TV (W;Z) minf1;lgEjW+ 1W s j: 4.2.4 Dependency Graphs LetfY a :a2 Ag be a family of random variables. Then a graph G is a dependency graph for the familyfY a :a2 Ag if the following two conditions hold: 1. The vertex set of G is A. 41 2. If A 1 and A 2 are disconnected subsets in G, thenfY a : a2 A 1 g andfY a : a2 A 2 g are independent. Dependency graphs are useful for proving central limit theorems for sums of random variables without too many pairwise dependencies. Janson provided an asymptotic normality criterion in [67], and Baldi and Rinott [14] used Stein’s method to obtain an upper bound on the Kolmogorov distance to the standard normal in terms of the dependency graph structure. We use the following stronger version of Stein’s method via dependency graphs for normal approximation due to Hofer. Theorem 4.2.3 ([61], Theorem 3.5). Let Z N(0;1). Let G be a dependency graph forfY k g N k=1 and D 1 be the maximal degree of G. Let X =å N k=1 Y k . Assume there is a constant B> 0 such thatjY k E(Y k )j B for all k. Then for W = Xm s , where m = EX ands 2 = Var(X), it holds that d K (W;Z) 8B 2 D 3=2 N 1=2 s 2 + 8B 3 D 2 N s 3 : 4.2.5 Concentration Inequalities The techniques developed for distributional approximation via Stein’s method can also be used to prove concentration of measure inequalities (large deviations). These provide information on the tail behavior of distributions. The following theorem due to Ghosh and Goldstein gives concentration inequalities for the upper and lower tails, given a bounded size-bias coupling. Theorem 4.2.4 ([52], Theorem 1.1). Let Y be a nonnegative random variable with mean m and variances 2 , both finite and positive. Suppose there exists a coupling of Y to a variable Y s having the Y -size-bias distribution which satisfiesjY s Yj C for some C> 0 with probability one. 1. If Y s Y with probability one, then P Ym s t exp t 2 2A 42 for all t> 0, where A= Cm=s 2 . 2. If the moment generating function m(q)= E e qY is finite atq = 2=C, then P Ym s t exp t 2 2(A+ Bt) for all t> 0, where A= Cm=s 2 and B= C=2s. The next theorem established by Janson provides concentration inequalities for sums of depen- dent random variables with an appropriate dependency graph structure. Theorem 4.2.5 ([66], Theorem 2.1). Suppose X =å a2A Y a with a a Y a b a for alla2 A and some real numbers a a and b a . Let G be a dependency graph forfY a :a2 Ag and let D 1 be the maximal degree of G. Then for t> 0, P(X EX t) exp 2t 2 Då a2A (b a a a ) 2 : The same estimate holds for P(X EXt). 
4.3 Fixed Points 4.3.1 Mean and Variance of Fixed Points Let s2 S n be chosen uniformly at random from the double coset indexed by T . Let fp(s) be the number of fixed points ofs, so that fp(s)= n å i=1 q i ; whereq i = 1 fs(i)=ig is the indicator random variable thats has a fixed point at i. We start by computing the probability that a value from M ` occurs at an index in L k . 43 Lemma 4.3.1. Lets2 S n be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . If a2 L k and b2 M ` , then P(s(a)= b)= T k` l k m ` : Proof. Suppose a2 L k and b2 M ` . Then P(s(a)= b)= jfs :s(a)= b;a2 L k ;b2 M ` gj jS l ns=S m j = Õ i6=k; j6=` l i !m j ! T i j ! (l k 1)!(m ` 1)! (T k` 1)! Õ i; j l i !m j ! T i j ! = T k` l k m ` : Using the previous lemma, we can compute the expected number of fixed points. Theorem 4.3.2. Let s2 S n be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Then E(fp(s))= å k;` jA k` jT k` l k m ` : Proof. By linearity of expectation and Lemma 4.3.1, E(fp(s))= n å i=1 Eq i = I å k=1 J å `=1 å i2A k` T k` l k m ` = å k;` jA k` jT k` l k m ` : Next we compute the variance of the number of fixed points. Let q i j = 1 fs(i)=i;s( j)= jg be the indicator random variable that s has fixed points at i and j. We begin by computing the probability of having fixed points at two different indices. Note that it suffices to consider i; j which are contained in nonempty A k` . 44 Lemma 4.3.3. Lets2 S n be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Suppose i2 A k` and j2 A st . Then P(s(i)= i;s( j)= j)= 8 > > > > > > > > > > < > > > > > > > > > > : T k` (T k` 1) l k (l k 1)m ` (m ` 1) if k= s and`= t T k` T kt l k (l k 1)m ` m t if k= s and`6= t T k` T s` l k l s m ` (m ` 1) if k6= s and`= t T k` T st l k l s m ` m t if k6= s and`6= t Proof. This follows by similar computations as in the proof of Lemma 4.3.1. Suppose we are in the first case where k= s and`= t, so that i; j2 A k` . Then P(s(i)= i;s( j)= j)= jfs :s(i)= i;s( j)= j;i; j2 A k` j jS l ns=S m j = Õ i6=k; j6=` l i !m j ! T i j ! (l k 2)!(m ` 2)! (T k` 2)! Õ i; j l i !m j ! T i j ! = T k` (T k` 1) l k (l k 1)m ` (m ` 1) : The other three cases follow by similar computations and so we omit the details here. Combining this lemma with the expected value obtained in Theorem 4.3.2, we can compute the variance. Theorem 4.3.4. Let s2 S n be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Then Var(fp(s))= å k;` jA k` jT k` l k m ` jA k` j 2 T 2 k` l 2 k m 2 ` + jA k` j(jA k` j 1)T k` (T k` 1) l k (l k 1)m ` (m ` 1) + 2 å k `<t jA k` jjA kt jT k` T kt l k (l k 1)m ` m t + 2 å ` k > > > > > > > > > < > > > > > > > > > > : T `s T rt l ` l r m s m t if`6= r and s6= t T `s T `t l ` (l ` 1)m s m t if`= r and s6= t T `s T rs l ` l r m s (m s 1) if`6= r and s= t T `s (T `s 1) l ` (l ` 1)m s (m s 1) if`= r and s= t Proof. Suppose we are in the first case where`6= r and s6= t. Then P(s(a)= b;s(b)= a)= jfs :s(a)= b;s(b)= a;a2 M s ;a2 L ` ;b2 M t ;b2 L r gj jS l ns=S m j = Õ i6= j6=` i6= j6=r l i !m j ! T i j ! (l ` 1)!(m s 1)! (T `s 1)! (l r 1)!(m t 1)! (T rt 1)! Õ i; j l i !m j ! T i j ! = T `s T rt l ` l r m s m t : The other three cases follow by similar computations and so we omit the details here. 4.3.3 Poisson Limit Theorem for Fixed Points We now prove the Poisson limit theorem for the number of fixed points. Proof of Theorem 4.1.1. 
Let W := fp(s)=å n i=1 q i , where q i = 1 fs(i)=ig . By Proposition 4.3.2, the expected number of fixed points is EW = å k;` jA k` jT k` l k m ` =:n n : 47 We start by defining a size-bias distribution, W s , of W as follows. Pick I independent of all else with probability P(I= i)= Eq i EW = T k` n n l k m ` , for i2 L k and i2 M ` . Suppose thats(X)= I. If X = I, then I is already a fixed point ofs, so we simply sets s =s. So suppose X6= I so that s does not have a fixed point at I. If the indices I and X are in the same l-block or if the values I and s(I) are in the same m-block, then we simply swap the valuess(I) and I to get a new permutation with a fixed point at index I. Sets s to be the resulting permutation. Observe that this swap does not change the value of T i j for all i; j. Otherwise, indices I and X are in different l-blocks and values s(I) and I are in different m- blocks. Suppose I2 L k , s(I)2 M a , and X2 L r . Pick an element z uniformly at random from the setfs(y)2 M ` : y2 L k g, which is of size T k` , and then swaps(I) and z. Note that we can always find such a z since T k` 1 by hypothesis. Finally, swap the values z ands(X), so that the resulting permutation has a fixed point at I. Set s s to be the resulting permutation. Note that in this case, it is not enough to simply swap the valuess(I) and I since we must account for the change in the values of T i j when we are making swaps. Observe that this two step swap also does not change the value of T i j for all i; j. By construction, s s obtained by the coupling above has the same marginal distribution as a random permutation from the double coset indexed by T , conditioned to have a fixed point at position I. Indeed the distribution of a random permutation from the double coset indexed by T only depends on the counts T i j and not on the positions of values within blocks. Set W s = fp(s s )=å b6=I q (I) b +1, whereq (I) b is the indicator random variable thats s has a fixed point at positionb. By Proposition 4.2.1, W s has the size-bias distribution of W. Thus(W;W s ) is a size-bias coupling. 48 Let V i :=å b6=i q (i) b , so that W s = V I + 1. Let Y n Poisson(n n ). By Theorem 4.2.2, d TV (W;Y n ) minf1;n n gEjW+ 1W s j = minf1;n n g n å i=1 P(I= i)EjWV i j = minf1;n 1 n g å k;` å i2A k` T k` l k m ` EjWV i j = minf1;n 1 n g å k;` jA k` jT k` l k m ` EjWV i j: Next, we compute the random variablejWV i j= q i +å b6=i q b q (i) b . Suppose i2 L k and i2 M ` . There are several cases. Case 1. If i is a fixed point, thenjWV i j= 1. By Lemma 4.3.1, this occurs with probability T k` l k m ` . So we may assume that i is not a fixed point. Lets(x)= i. Supposes(i)2 M a and x2 L r . Case 2. If`= a so that values i;s(i) are in the same m-block, or if k= r so that indices i;x are in the same l-block, then the coupling swaps the values i and s(i). In this case,jWV i j= 1 fi;s(i) are in a two-cycleg = 1 fs(i)=x;s(x)=ig . Thus by Lemma 4.3.5, the probability that i and x are in a two-cycle is P(i and x are in a two-cycle)= T k` T r` l k l r m ` (m ` 1) 1 fx2M ` ;x2L r ;`=a;k6=rg + T k` T ka l k (l k 1)m ` m a 1 fx2M a ;x2L k ;`6=a;k=rg + T k` (T k` 1) l k (l k 1)m ` (m ` 1) 1 fx2M ` ;x2L k ;`=a;k=rg : Case 3. Otherwise`6= a and k6= r so that values i;s(i) are in different m-blocks and indices i;x are in different l-blocks. Then the coupling picks z2fs(y)2 M ` : y2 L k g uniformly at 49 random, swaps values z and s(i), and then swaps values z and i. By construction,jW V i j= 1 fz is a fixed pointg . 
Again by Lemma 4.3.1, this occurs with probability T k` l k m ` . Combining the cases above gives the expected value EjWV i j= P(i is a fixed point)+ P(i is in a two-cycle) + P(z is a fixed point) = T k` l k m ` + å x6=i P(i and x are in a two-cycle)+ T k` l k m ` 5T k` l k m ` : Thus the total variation distance is upper bounded by d TV (W;Y n ) minf1;n 1 n g å k;` jA k` jT k` l k m ` 5T k` l k m ` 5minf1;n 1 n g å k;` jA k` j l k m ` 5minf1;n 1 n g å k;` minfl k ;m ` g l k m ` 5minf1;n 1 n g å k;` 1 maxfl k ;m ` g 5(I+ J 1)minf1;n 1 n g maxfl I ;m J g where we used the facts thatjA k` j minfl k ;m ` g for all k;`, the size of the partitionfA k` g is at most I+ J 1, andl k l I for all k and m ` m J for all`. Let Y Poisson(n). By hypothesis, since limsup n!¥ n n =n, I+ J 1! C, l I !¥, and m J !¥, we have that d TV (W;Y n )! 0 and d TV (Y n ;Y)! 0 as n!¥. Therefore d TV (W;Y)! 0 as n!¥, and thus W d ! Y . As a corollary, we can recover Montmort’s classical result on the Poisson limit theorem for the number of fixed points in a uniformly random permutation, along with a rate of convergence. 50 Corollary 4.3.6 ([83]). Lets2 S n be a permutation chosen uniformly at random. Then d TV (fp(s);Y) 5 n ; where YPoisson(1). Thus fp(s) d !Poisson(1) as n!¥. Proof. Setting l = m =(n) in Theorem 4.1.1 gives d TV (fp(s);Y) 5 n , where YPoisson(1). Therefore, d TV (W;Y)! 0 as n!¥, and so fp(s) d !Poisson(1). In [33] Crane and DeSalvo used Stein’s method and size-bias coupling to obtain a total variation upper bound of 3 n for uniformly random permutations, which is slightly better than our bound. However, using properties of alternating series and the definition of total variation distance directly, Arratia and Tavare [12] showed a super-exponential upper bound of 2 n (n+1)! for uniformly random permutations. 4.4 Descents Lets2 S n be chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Let des(s) be the number of descents ofs, so that des(s)= n1 å i=1 I i ; where I i = 1 fs(i)>s(i+1)g is the indicator random variable thats has a descent at index i. 4.4.1 Mean and Variance of Descents In this section we compute the mean and asymptotic variance of des(s). We start by computing the probability that a descent occurs at some fixed position. In what follows, let l =(l 1 ;:::;l I ) and m =(m 1 ;:::;m J ) be two partitions of n. 51 Lemma 4.4.1. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Then P(I i = 1)= 8 > > < > > : 1 2 if i2 L o k , for k2[I]; 1 2 f(k) 2l k l k+1 if i= l b k , for k2[I 1]; where f(k) :=å 1a<bJ (T ka T k+1;b T k+1;a T kb ), for k2[I 1]. Proof. Suppose that i2 L k for k2[I]. There are two cases for the position of i: • Case 1 (Interior): i2 L o k , for k2[I], • Case 2 (Border): i= l b k , for k2[I 1]. First suppose i2 L o k . Then swapping the values s(i) and s(i+ 1) maps a uniformly random permutation in S l nS n =S m to a permutation with the same distribution, and transforms a permutation with a descent at i to a permutation without a descent at i. It follows that P(I i = 1;Case 1)= 1 2 . So suppose i= l b k . There are two subcases for the values ofs(i) ands(i+ 1): • Subcase 1 (Same block): s(i);s(i+ 1)2 M ` , for`2[J], • Subcase 2 (Different blocks): s(i)2 L b ands(i+ 1)2 L a , for 1 a< b J. 52 Then P(I i = 1;Case 2;Subcase 1)= ( m ` 2 )(m ` 2)!(l k 1)!(l k+1 1)! (T k` 1)!(T k+1;` 1)! Õ i= 2fk;k+1g; j6=` l i !m j ! Õ (i; j)= 2f(k;`);(k+1;`)g T i j ! Õ i; j l i !m j ! T i j ! 
= T k` T k+1;` 2l k l k+1 ; P(I i = 1;Case 2;Subcase 2)= m a (m a 1)!m b (m b 1)!(l k 1)!(l k+1 1)! (T kb 1)!(T k+1;a 1)! Õ i= 2fk;k+1g; j= 2fa;bg l i !m j ! Õ (i; j)= 2f(k+1;b);(k;a)g T i j ! Õ i; j l i !m j ! T i j ! = T k+1;a T kb l k l k+1 : Summing over the two subcases yields P(I i = 1;Case 2) = J å `=1 P(I i = 1;Case 2;Subcase 1)+ å 1a<bJ P(I i = 1;Case 2;Subcase 2) = 1 2l k l k+1 J å `=1 T k` T k+1;` + 2 å 1a<bJ T k+1;a T kb ! = 1 2l k l k+1 J å `=1 T k` ! J å `=1 T k+1;` ! å 1a<bJ (T ka T k+1;b T k+1;a T kb ) ! = l k l k+1 2l k l k+1 1 2l k l k+1 å 1a<bJ (T ka T k+1;b T k+1;a T kb ) = 1 2 f(k) 2l k l k+1 : Next we compute the probability that a descent occurs in two consecutive positions. Let I i j = I i I j denote the indicator random variable for the event that there are descents at both positions i and j. Define the set L oo k := L k nfl b k ;l b k 1g, for 1 k I. 53 Lemma 4.4.2. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Then P(I i;i+1 = 1)= 8 > > > > > > < > > > > > > : 1 6 i2 L oo k , for k2[I] g(k) 6l k (l k 1)l k+1 if i= l b k 1, for k2[I 1] h(k) 6l k l k+1 (l k+1 1) if i= l b k , for k2[I 1]; where g(k) := J å `=1 T k` (T k` 1)T k+1;` + 3 å 1a<bJ (T kb (T kb 1)T k+1;a + T kb T ka T k+1;a ) + 6 å 1a > > > > > > > > > > > > > < > > > > > > > > > > > > > > : 1 4 ; if i; j2 L oo k , for 1 k I, or if i2 L o ` and j2 L o r , for 1`< r I; 1 4 f(k) 4l k l k+1 ; if exactly one of i or j equals l b k , for k2[I 1]; 1 2 f(`) 2l ` l `+1 1 2 f(r) 2l r l r+1 ; if i= l b ` and j= l b r , for 1`< r I 1; andjr`j 2: If i= l b k and j= l b k+1 for k2[I 2], then P(I i j = 1) = l k 4(l k 1) f(k+ 1) 4(l k+1 1)l k+2 f(k) 4l k (l k+1 1) + f(k) f(k+ 1) e(k) 4l k l k+1 (l k+1 1)l k+2 ; where f(k) is defined as in Lemma 4.4.1 and e(k) := J å `=1 T k` T k+1;` T k+2;` + 2 å 1a<bJ T kb T k+1;a T k+2;a å 1c<dJ T kd T k+1;d T k+2;c ! + 4 å 1c > < > > : V =[n 1] E =f(i; j) : if j= i+ 1, or if j= i 1, or if i= l b k and j= l b k+1 for k2[I 2]g: The vertices represent each indicator random variable I i . By Lemmas 4.4.1, 4.4.2, and 4.4.3, the indicators I i and I j are dependent if the positions(i;i+ 1) and( j; j+ 1) are not disjoint or if i= l b k 61 and j= l b k+1 for k2[I 2], and independent otherwise. Thus the graph above is a dependency graph forfI i : 1 i n 1g. With this dependency graph construction, we can now prove Theorem 4.1.2. Proof of Theorem 4.1.2. Consider the family of indicatorsfI i : 1 i n 1g and the dependency graph, L, constructed above. Let D 1 be the maximal degree of L. Note thatjVj= n 1. It is clear by construction that D 1= 4, so that D= 5. We also have thatjI i E(I i )j 1 for all i2[n 1], and so we set B := 1. From Proposition 4.4.4, we have thats 2 n := Var(des(s)) n 12 . Therefore by Theorem 4.2.3, d K (W n ;Z)= O 8 5 3=2 p n 1 n 12 + 8 5 2 (n 1) n 12 3=2 ! = O(n 1=2 ); where W n = des(s)E(des(s)) s n and Z is a standard normal random variable. Thus d K (W n ;Z)! 0 as n!¥, so that W n d ! Z. 4.5 Generalized Descents for Fixed d In this section we prove Theorem 4.1.3, the central limit theorem for d-descents when d is a fixed positive integer. Let s2 S n be chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Let des d (s) be the number of d-descents ofs. Observe that for n 2 and 1 d< n, the number of ordered pairs(i; j)2[n] 2 such that i< j< i+ d is N n = d(n d)+ d 2 . 
We order these pairs in lexicographic order and index them byf(i k ; j k )g N n k=1 , so that des d (s)= N n å k=1 Y k where Y k = 1 fs(i k )>s( j k )g is the indicator random variable thats has a d-descent at(i k ; j k ). 62 4.5.1 Mean and Variance of d-Descents The computation for the mean and variance of d-descents use the probabilities computed in Section 4.4.1 for the case of descents. Using essentially the same proof, we find that the probability of a d-descent is 1 2 when both indices i; j are in the same block, and is 1 2 f(k) 2l k l k+1 when i is in L k and j is in L k+1 . Since the proof is similar to the descents case, we will not include all the technical details but we will describe the key ideas. Proposition 4.5.1. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T , where I= o(n). Let d l I 2 be a fixed constant. Then E(des d (s))= nd d+1 2 2 d+1 2 2 I1 å k=1 f(k) l k l k+1 Var(des(s)) nd; where f(k) :=å 1a<bJ (T ka T k+1;b T k+1;a T kb ), for k2[I 1] Proof. For the expected value, we only need to consider d-descents which are fully contained within a l-block and those which cross the border between two adjacent l-blocks. Observe that the number of d-descents fully contained in L k is d(l k d)+ d 2 for all k2[I], and the number of d-descents which cross the border between L k and L k+1 is d+1 2 for all k2[I 1]. Thus by linearity of expectation and Lemma 4.4.1, E(des d (s))= N n å k=1 E(Y k ) = I å k=1 d(l k d)+ d 2 2 + I1 å k=1 d+ 1 2 1 2 f(k) 2l k l k+1 = nd d+1 2 2 d+1 2 2 I1 å k=1 f(k) l k l k+1 63 Next, the variance can be written as Var(des d (s))= N n å k=1 Var(Y k )+ 2 å 1k<`N n cov(Y k ;Y ` ): The first sum is computed to be N n å k=1 Var(Y k )= N n å k=1 E(Y k )[E(Y k )] 2 = I å k=1 d(l k d)+ d 2 4 + I1 å k=1 d+ 1 2 1 4 f(k) 2 4l 2 k l 2 k+1 ! = nd d+1 2 4 d+1 2 4 I1 å k=1 f(k) 2 l 2 k l 2 k+1 nd 4 ; sincej f(k) l k l k+1 j 1, I= o(n), and d is fixed. Now we compute the sum of covariances. We split this into several cases. First consider indicators Y k ;Y ` such that i k ; j k ;i ` ; j ` are all in the same block. The covariance computation within each block proceeds as in the random permutations case, as computed by Pike [90], then we sum over all blocks. In this case we note that whenfi k ; j k g\fi ` ; j ` g= / 0, cov(Y k ;Y ` )= 0. Thus the only nonzero covariance terms correspond to cases where i k = i ` , j k = j ` , and j k = i ` . For the cases i= i k = i ` and j= j k = j ` , we have that E(Y k Y ` )= 1 3 , so that cov(Y k ;Y ` )= 1 3 1 4 = 1 12 . For the case i k < j k = m= i ` < j ` , we have that E(Y K Y ` )= 1 6 . Summing over all such terms, the contribution to the covariance is of order nd. Next, consider indicators Y k ;Y ` which occur in different blocks with indices fully contained in their respective blocks. That is, i k ; j k 2 L a and i ` ; j ` 2 L b where a6= b. Then the indicators are independent since we have that E(Y k Y ` )= E(Y k )E(Y ` ), and so cov(Y k ;Y ` )= 0 in this case. It remains to consider indicators Y k ;Y ` such that at least one of the pairs(i k ; j k ) or(i ` ; j ` ) cross between two consecutive blocks. Iffi k ; j k g\fi ` ; j ` g= / 0, we have that Y k ;Y ` are independent unless i k ;i ` 2 L a and j k ; j ` 2 L b or i k 2 L a , j k ;i ` 2 L a+1 , and j ` 2 L a+2 . The sum of covariances of these 64 terms is of order d 4 I = o(n). To see this, note that in this case E(Y k Y ` ) 1 4 , so that cov(Y k ;Y ` ) is a constant independent of n. 
There are at most d 4 such pairs which cross between two blocks L a and L a+1 , and at most d 4 pairs such that one d-descent crosses L a ;L a+1 and the other crosses L a+1 ;L a+2 . Otherwise iffi k ; j k g\fi ` ; j ` g6= / 0, then Y k ;Y ` are no longer independent, but by a similar argument as above, the contribution to the sum of covariances is still a constant times I. Thus the total contribution from these cases is o(n). Combining the above cases, we find that Var(des d (s)) nd. We remark that the condition d l I 2 ensures that every d-descent occurs at indices which lie at most between two consecutive blocks. The regime where d> l I 2 is combinatorially more difficult and so we do not pursue it here, except for the case of inversions, d= n 1, which is considered in the next section. 4.5.2 Central Limit Theorem for d-Descents We now use the method of dependency graphs to prove the central limit theorem for d-descents. Proof of Theorem 4.1.3. Consider the family of indicatorsfY k : 1 k N n g and the dependency graph, L, whose vertex set is[N n ] indexed by the indicatorsfY k g and edge set E such that(i; j) is an edge if Y i and Y j are dependent. Note thatjVj= N n d(n d)+ d 2 nd. Let D 1 be the maximal degree of L. The maximal degree is attained by indicators which correspond to d-descents which cross the border separating L k and L k+1 for some 2 k I 2. That is, indicators Y k such that i k 2 L a and j k 2 L a+1 . Observe that such indicators are dependent on all indicators which cross between L a1 and L a , all indicators which cross between L k+1 and L k+2 , and all indicators which contain either i k or j k as one of the indices which define their d-descent. Thus D 1 4d+ 2 d+1 2 , so that D d 2 . Therefore by Theorem 4.2.3, d K (W n ;Z)= O 8(d 2 ) 3=2 p nd nd + 8(d 2 ) 2 (nd) (nd) 3=2 ! = O(n 1=2 ); 65 where W n = des d (s)E(des d (s)) s n and Z is a standard normal random variable. Thus d K (W n ;Z)! 0 as n!¥, so that W n d ! Z. 4.6 Inversions Let s2 S n be chosen uniformly at random from the double coset in S l nS n =S m indexed by T and let inv(s) be the number of inversions ofs, so that inv(s)= å 1i< jn J i j ; where J i j = 1 fs(i)>s( j)g . 4.6.1 Mean and Variance of Inversions We begin by stating a lemma which gives the probability of an inversion. Lemma 4.6.1. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Then P(I i = 1)= 8 > > < > > : 1 2 if i; j2 L k , for k2[I]; 1 2 f(`;r) 2l ` l r if i2 L ` and j2 L r , for 1`< r I; where f(`;r) :=å 1a<bJ (T `a T rb T ra T `b ), for 1`< r I. The proof follows from essentially the same argument as in Lemma 4.4.1. The main idea is that the probability of an inversion depends only on the location of the indices i; j. It is 1 2 if both i; j are in the same block and is a function of T ` ;T r ;l ` ;l r if i2 L ` and j2 L r . We now state the mean and asymptotic variance of inv(s). 66 Proposition 4.6.2. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T , where I= o(n). Then E(inv(s))= n(n 1) 4 1 2 å 1`<rI f(`;r) n 2 ; Var(inv(s)) n 3 ; where f(`;r) :=å 1a<bJ (T `a T rb T ra T `b ). The proof follows by similar arguments and computations as the descents case, and so we omit it here to preserve readability. 4.6.2 Central Limit Theorem for Inversions The proof of the central limit theorem for the number of inversions follows by a similar argument as in the descents case. 
First we construct a dependency graph L on the family of indicators fJ i j : 1 i< j ng. Define the vertex set, V , to be the set of ordered pairsf(i; j) : 1 i< j ng, and the edge set, E, such that there is an edge between the vertices(i; j) and(i 0 ; j 0 ), with i6= i 0 and j6= j 0 , if either i= i 0 or j= j 0 . We now prove Theorem 4.1.4 using this dependency graph construction. Proof of Theorem 4.1.4. Consider the dependency graph, L, constructed above for the family of indicatorsfJ i j : 1 i< j ng. Let D 1 be the maximal degree of L. Note thatjVj= n 2 . Observe that D 1= 2(n 2)+ O(n). To see this, consider an indicator J i j such that i2 L ` and j2 L r for 1`< j I. Then J i j is dependent on indicators which contain i or j as an index, or indicators which has exactly one index in L ` or exactly one index in L r . This gives D= O(n). We set B := 1 sincejJ i j E(J i j )j 1 for all 1 i< j n. From Proposition 4.6.2, we have that s 2 n := Var(des(s)) n 3 . 67 Therefore by Theorem 4.2.3, d K (W n ;Z)= O 4n 3=2 p n(n 1) n 3 + 4n 2 n(n 1) n 9=2 ! = O(n 1=2 ); where W n = inv(s)E(inv(s)) s n and Z is a standard normal random variable. Therefore d K (W n ;Z)! 0 as n!¥, from which it follows that W n d ! Z. 4.7 Concentration of Measure As applications, we use our size-bias coupling and dependency graph constructions from above to obtain concentration of measure results for the number of fixed points, descents, and inversions. Our first proposition gives upper and lower tail bounds on the number of fixed points. Proposition 4.7.1. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T . Let W := fp(s) be the number of fixed points ofs. Let m n := E(W) and s 2 n := Var(W). Then P Wm n s n t exp s 2 n t 2 4m n ; P Wm n s n t exp s 2 n t 2 4m n + 2s n t ; for all t> 0. Proof. Let(W;W s ) be the size-bias coupling defined in the proof of Theorem 4.1.1 from Section 4.3.3. Observe thatjW s Wj 2 almost surely, so that we have a bounded size-bias coupling. Let m n := E(W) ands 2 n := Var(W) be the mean and variance of the number of fixed points computed in Theorems 4.3.2 and 4.3.4, which are both finite and positive. Next we have that W s W almost surely, so that our size-bias coupling is monotone. Moreover, since W is a bounded random variable, its moment generating function m(s) := E(exp(sW)) is 68 finite for all s2R. Therefore applying Theorem 4.2.4 with A := 2m n s 2 n and B := 1 s n finishes the proof. The next three propositions give matching upper and lower tail bounds on the number of de- scents, d-descents for fixed d, and inversions. Proposition 4.7.2. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T and let des(s) be the number of descents ofs. Let m n := E(des(s)) and s 2 n := Var(des(s)). Then P des(s)m n s n t exp 2t 2 n 5 ; P des(s)m n s n t exp 2t 2 n 5 ; for all t> 0. Proof. From the dependency graph for the family of indicatorsfI i : 1 i n 1g constructed in Section 4.4.2, we have that D= 5. Moreover 0 I i 1 for all i2[n1]. Applying Theorem 4.2.5 and usings 2 n n from Proposition 4.4.4 gives P des(s)m n s n t exp 2(ts n ) 2 5å n1 k=1 1 ! exp 2t 2 n 5 ; and the same estimate holds for P des(s)m n s n t . Proposition 4.7.3. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T and let des d (s) be the number of d-descents of s, where 1 d l I 2 is a fixed constant. Let m n := E(des d (s)) ands 2 n := Var(des d (s)). 
Then P des d (s)m n s n t exp 2t 2 n d ; P des d (s)m n s n t exp 2t 2 n d ; 69 for all t> 0. Proof. From the dependency graph for the family of indicatorsfY k : 1 k N n g constructed in Section 4.5.2, we have that D d 2 . Moreover 0 I i 1 for all i2[n 1]. Applying Theorem 4.2.5 and usings 2 n nd from Proposition 4.4.4 gives P des(s)m n s n t exp 2(tnd) 2 d 2 nd exp 2t 2 n d ; and the same estimate holds for P des(s)m n s n t . Proposition 4.7.4. Let s be a permutation chosen uniformly at random from the double coset in S l nS n =S m indexed by T and let inv(s) be the number of inversions ofs. Letm n := E(des(s)) and s 2 n := Var(des(s)). Then P inv(s)m n s n t exp 2t 2 n 2 ; P inv(s)m n s n t exp 2t 2 n 2 ; for all t> 0. Proof. From the dependency graph for the family of indicatorsfJ i j : 1 i< j ng constructed in Section 4.6.2, we have that D n 2 . Moreover 0 J i j 1 for all 1 i< j n. Applying Theorem 4.2.5 and usings 2 n n 3 from Proposition 4.6.2 gives P inv(s)m n s n t exp 2(tn 3 ) 2 n 2 n 2 exp 2t 2 n 2 ; and the same estimate holds for P inv(s)m n s n t . 70 4.8 Final Remarks 4.8.1 Permutations from Fixed Conjugacy Classes of S n A recent line of work in probabilistic group theory is in the study of asymptotic distributions of statistics on permutations chosen uniformly at random from a fixed conjugacy class of S n . Fulman initiated the study in [49], where he proved central limit theorems for descents and major indices in conjugacy classes with large cycles. More recently, Kim extended this result to prove a central limit theorem for descents in the conjugacy class of fixed-point free involutions [72], and subsequently, Kim and Lee [74] proved a central limit theorem for descents in all conjugacy classes. In the setting of all conjugacy classes, Kim and Lee showed that the joint distribution of descents and major indices is asymptotically bivariate normal [73], and Fulman, Kim, and Lee showed that the distribution of peaks is asymptotically normal [50]. 4.8.2 Other Permutation Statistics We initiate a parallel line of study on permutations chosen uniformly at random from a fixed parabolic double coset of S n . In a work in progress, we investigate the cycle structure in fixed parabolic double cosets. In particular, we study the asymptotic distributions of k-cycles, the total number of cycles, and the longest cycle. Another interesting statistic is the number of unseparated pairs. This statistic can be used as a test to determine how close to uniformly random a given permutation is. A permutation s has an unseparated pair at position i ifs(i+ 1)=s(i)+ 1. Let up(s) denote the number of unseparated pairs ofs. It should be tractable to use Stein’s method and size-bias coupling to show that up(s) is asymptotically Poisson. Some other statistics of interest are crossings, excedances, descents plus descents of its inverse, derangements, and pattern occurrences. 71 4.8.3 Other Regimes of d In this chapter we were able to obtain central limit theorems, with convergence rates, for descents, d-descents for fixed d l I 2 , and inversions. With a bit more work it should be tractable to extend our methods to the case where d l I 2 and d grows with l I . For the regime where d> l I 2 , the combinatorics of estimating the variance become much more difficult. Moreover, there are much more dependencies between the indicators such that Stein’s method no longer holds. However, we still expect a central limit theorem to hold under some mild conditions. 
It should be possible to use the method of weighted dependency graphs [43] to prove a central limit theorem, without a convergence rate, in this case. 4.8.4 Generalization to Finite Coxeter Groups Billey et al [20] studied parabolic double cosets of finite Coxeter groups. A probabilistic study of parabolic double cosets in this more general setting is fully warranted. 72 Chapter 5 Parking Functions 5.1 Introduction Consider n parking spots placed sequentially on a one-way street. A line of n cars enter the street one at a time, with each car having a preferred parking spot. The ith car drives to its preferred spot,p i , and parks if the spot is available. If the spot is already occupied, the car parks in the first available spot afterp i . If the car is unable to find any available spots, the car exits the street without parking. A sequence of preferences p =(p 1 ;:::;p n ) is a parking function if all n cars are able to park. More precisely, let[n] :=f1;:::;ng. A sequencep=(p 1 ;:::;p n )2[n] n is a parking function of size n if and only ifjfk :p k igj i for all i2[n]. This follows from the pigeonhole principle, since a parking function must have at least one coordinate with value equal to 1, at least two coordinates with value at most 2, and so on. Equivalently, p =(p 1 ;:::;p n )2[n] n is a parking function if and only if p (i) i for all i2[n], where (p (1) ;:::;p (n) ) is p sorted in a weakly increasing order p (1) p (n) . Let PF n denote the set of parking functions of size n. The total number of parking functions of size n is jPF n j=(n+ 1) n1 : 73 An elegant proof using a circle argument is due to Pollak and can be found in [47]. Parking functions were introduced by Konheim and Weiss [75] in their study of the hash storage structure. They have since found many applications to combinatorics, probability, and computer science. Observe that the expression forjPF n j is reminiscent of Cayley’s formula for labeled trees. Indeed, Foata and Riordan [47] established a bijection between parking functions and trees on n+1 labeled vertices. Connections to other combinatorial objects have since been established. For ex- ample, Stanley found connections to noncrossing set partitions [98] and hyperplane arrangements [97], and Pitman and Stanley [92] found a relation to volume polynomials of certain polytopes. The literature on the combinatorics of parking functions is vast and we refer the reader to Yan [104] for an accessible survey. There are several generalizations of the classical parking functions. In [103], Yan considered x x x-parking functions, where parking functions are associated with a vector x x x, and constructed a bijection with rooted forests. Kung and Yan then introduced[a;b]-parking functions in [77] and u u u- parking functions in [78], and for both generalizations, they gave explicit formulas for moments of sums of these parking functions. In [93], Postnikov and Shapiro introduced G-parking functions, where G is a digraph on[n]; the classical parking functions correspond to the case where G= K n+1 . Gorsky, Mazin, and Vazirani [56] studied rational parking functions and found connections to affine permutations and representation theory. Returning to the classical notion of parking on a one-way street, Ehrenborg and Happ [41] studied parking in which cars have different sizes, and in [40] they extended this to study parking cars behind a trailer of fixed size. Much of the recent combinatorial work on parking functions concerns parking completions. 
Suppose that` of the n spots are already occupied, denoted by v v v=(v 1 ;:::;v ` ), where the entries are in increasing order, and that we want to find parking preferences for the remaining n` cars so that they can all successfully park. The set of successful preference sequences are the parking completions for the sequence v v v=(v 1 ;:::;v ` ). Gessel and Seo [51] obtained a formula for parking completions where the occupied spots consist of a contiguous block starting from the first spot, v=(1;:::;`). Diaconis and Hicks [38] introduced the parking function shuffle to count parking 74 completions with one spot arbitrarily occupied. Extending this result, Aderinan et al. [3] used join and split operations to count parking completions where` n spots are arbitrarily occupied. Probabilistic questions about parking functions have also been considered, but probabilistic problems tend to be more complicated than enumeration results. In their study of the width of rooted labeled trees, Chassaing and Marckert [24] discovered connections between parking func- tions, empirical processes, and the Brownian bridge. The asymptotic distribution for the cost construction for hash tables with linear probing (which is equivalent to the area statistic of parking functions) was studied by Flajolet, Poblete, and Viola [46] and Janson [65], where it was shown to converge to normal, Poisson, and Airy distributions depending on the ratio between the number of cars and spots. More recently, Diaconis and Hicks [38] studied the distribution of coordinates, descent pat- tern, area, and other statistics of random parking functions. Yao and Zeilberger [105] used an experimental mathematics approach combined with some probability to study the area statistic. In [71], Kenyon and Yin explored links between combinatorial and probabilistic aspects of parking functions. Extending previous work, Yin developed the parking function multi-shuffle to obtain formulas for parking completions, moments of multiple coordinates, and all possible covariances between two coordinates for (m;n)-parking functions (where there are m n cars and n spots) [107] and u u u-parking functions [106]. We continue the probabilistic study of parking functions by establishing the asymptotic distri- bution of the cycle counts of parking functions, partially answering a question posed by Diaconis and Hicks in [38]. 5.1.1 Main Results A k-cycle in a parking functionp2 PF n , for k2[n], is a sequence of k indices i 1 ;:::;i k 2[n] such thatp(i 1 )= i 2 ,p(i 2 )= i 3 ,:::,p(i k1 )= i k ,p(i k )= i 1 . Let C k (p) be the number of k-cycles in the parking functionp2 PF n . Our first result shows that the expected number of k-cycles in a random parking function is asymptotically 1 k . 75 Theorem 5.1.1 ([88], Theorem 1.1). Let p2 PF n be a parking function chosen uniformly at ran- dom. Then E(C k (p)) 1 k : Our main result gives an upper bound on the total variation distance between the joint distribu- tion of cycle counts(C 1 ;:::;C d ) of a random parking function and a Poisson process(Z 1 ;:::;Z d ), wherefZ k g are independent Poisson random variables with ratel k = E(C k ). Theorem 5.1.2 ([88], Theorem 1.2). Let p2 PF n be a parking function chosen uniformly at random. Let C k = C k (p) be the number of k-cycles in p and let W = (C 1 ;C 2 ;:::;C d ). Let Z=(Z 1 ;Z 2 ;:::;Z d ), wherefZ k g are independent Poisson random variables with ratel k = E(C k ). 
Then d TV (W;Z)= O d 4 n d : Moreover if d= o(n 1=4 ), then the process of cycle counts converges in distribution to a process of independent Poisson random variables (C 1 ;C 2 ;:::) D !(Y 1 ;Y 2 ;:::) as n!¥, wherefY k g are independent Poisson random variables with rate 1 k . The proof uses a multivariate Stein’s method with exchangeable pairs. Stein’s method via exchangeable pairs has previously been used to prove limit theorems in a wide range of settings. We refer the reader to [26] for an accessible survey. In particular, Judkovich [69] used this approach to prove a Poisson limit theorem for the joint distribution of cycle counts in uniformly random permutations without long cycles. We remark that our limit theorem parallels the result of Arratia and Tavare [12] on the cycle structure of uniformly random permutations. There is a vast probabilistic literature on the cycle 76 structure of random permutations, including the works of Shepp and Lloyd [96] on ordered cycle lengths and DeLaurentis and Pittel [34] on a functional central limit theorem for cycle lengths with connections to Brownian motion. We initiate a parallel study of the cycle structure in random parking functions, but further study is fully warranted. 5.1.2 Outline This chapter is organized as follows. Section 5.2 introduces definitions and notation that we use throughout the chapter, and gives the necessary background and relevant results that we use to prove our main theorems. In Section 5.3, we compute the exact values for the expected number of fixed points and trans- positions in a random parking function, and obtain the asymptotic formula, Theorem 5.1.1, for the expected number of k-cycles, for general k. We then apply Stein’s method via exchangeable pairs in Section 5.4 to establish the Poisson limit theorem for joint cycle counts, Theorem 5.1.2. We conclude with some final remarks and open problems in Section 5.5. 5.2 Preliminaries In this section we introduce the definitions and notation that we use throughout the chapter, and give necessary background, tools, and techniques that we use to prove our main results. 5.2.1 Definitions and Notation Let X and Y be two random variables. We let X d = Y denote that X and Y are equal in distribution, and we let X D ! Y denote that X converges in distribution to Y . Let a n ;b n be two sequences. If lim n!¥ a n b n is a nonzero constant, then a n b n and we say that a n is of the same order as b n . If lim n!¥ a n b n = 1, then a n b n and we say that a n is asymptotic to 77 b n . If there exists positive constants c and n 0 such that a n cb n for all n n 0 , then a n = O(b n ). If lim n!¥ a n b n = 0, then a n = o(b n ). 5.2.2 Abel’s Multinomial Theorem The classical binomial and multinomial theorems are generalized by Abel’s multinomial theorem. We use the following version of Abel’s multinomial theorem due to Yin, which was derived from Pitman [91]. Lemma 5.2.1 ([107], Theorem 3.10, Abel’s Multinomial Theorem). Let A n (x 1 ;:::;x m ; p 1 ;:::; p m ) := å n s s s m Õ j=1 (s j + x j ) s j +p j ; where s s s=(s 1 ;:::;s m ) andå m i=1 s i = n. 
Then A n (x 1 ;:::;x i ;:::;x j ;:::;x m ; p 1 ;:::; p i ;:::; p j ;:::; p m ) = A n (x 1 ;:::;x j ;:::;x i ;:::;x m ; p 1 ;:::; p j ;:::; p i ;:::; p m ); A n (x 1 ;:::;x m ; p 1 ;:::; p m ) = m å i=1 A n1 (x 1 ;:::;x i1 ;x i+1 ;:::;x m ; p 1 ;:::; p i1 ; p i+1 ;:::; p m ); A n (x 1 ;:::;x m ; p 1 ;:::; p m ) = n å s=0 n s s!(x 1 + s)A ns (x 1 + s;x 2 ;:::;x m ; p 1 1; p 2 ;:::; p m ): and the following special cases hold via the above recurrences: A n (x 1 ;:::;x m ;1;:::;1)= (x 1 ++ x m )(x 1 ++ x m + n) n1 (x 1 x 2 x m ) ; A n (x 1 ;:::;x m ;1;:::;1;0)= x m (x 1 ++ x m + n) n (x 1 x 2 x m ) : 78 5.2.3 Parking Completions Suppose that spots v 1 ;:::;v ` are already occupied. Recall that the parking completions for the sequence v v v = (v 1 ;:::;v ` ) is the set of successful preference sequences for the remaining n` cars. The following result due to Adeniran et al. is crucial for counting parking completions and computing probabilities on parking functions. Lemma 5.2.2 ([3], Theorem 1.1). The number of parking completions of v v v=(v 1 ;:::;v ` ) in[n] is jPC n (v v v)j= å s s s2L n (v v v) n` s s s `+1 Õ i=1 (s i + 1) s i 1 ; where L n (v v v)= ( s s s=(s 1 ;:::;s `+1 )2N `+1 s 1 ++ s i v i i8i2[`]; `+1 å i=1 s i = n` ) As a corollary, we get a formula for parking completions when the occupied spots form a contiguous block. We use the following version due to Yin. Lemma 5.2.3 ([107], Proposition 2.7). Let 1` n and let 1 k n`+ 1. Then jPC n ((p 1 = k;:::;p ` = k+` 1))j= n` å s=0 n` s (s+`) s1 `(n s`+ 1) ns`1 : 5.2.4 Stein’s Method and Exchangeable Pairs There are many variants of Stein’s method, but we use the exchangeable pairs method. The ordered pair (W;W 0 ) of random variables is an exchangeable pair if (W;W 0 ) d =(W 0 ;W). We will use the following multivariate version of Stein’s method for Poisson approximation via exchangeable pairs due to Chatterjee, Diaconis, and Meckes. Theorem 5.2.4 ([26], Proposition 10). Let W =(W 1 ;:::;W d ) be a random vector with values inN d and E(W i )=l i <¥. Let Z=(Z 1 ;:::;Z d ) have independent coordinates with Z i Poisson(l i ). 79 Let W 0 =(W 0 1 ;:::;W 0 d ) be defined on the same probability space as W with(W;W 0 ) an exchange- able pair. Then d TV (W;Z) d å k=1 a k [Ejl k c k P(A k )j+ EjW k c k P(B k )j]; witha k = minf1;1:4l 1=2 k g, any choice of thefc k g, and A k =fW 0 k = W k + 1;W j = W 0 j for k+ 1 j dg; B k =fW 0 k = W k 1;W j = W 0 j for k+ 1 j dg: 5.3 Expected Number of Cycles of a Fixed Length In this section we compute the expected number of k-cycles in a random parking function. We consider the cases of fixed points and transpositions separately, as we are able to compute their expected values exactly. For general k, we compute the asymptotic expected number of k-cycles. 5.3.1 Fixed Points and Transpositions Let fp(p) and tc(p) be the number of fixed points and the number of transpositions, respectively, ofp2 PF n . We can decompose fp(p) and tc(p) into a sum of indicator random variables as fp(p)= n å i=1 1 fp i =ig ; tc(p)= å 1i< jn 1 fp i = j;p j =ig Proposition 5.3.1. Let p2 PF n be a parking function chosen uniformly at random. Then the expected number of fixed points ofp is E(fp(p))= 1; 80 and the expected number of transpositions is E(tc(p))= n 2(n+ 1) : Proof. For fixed points, linearity of expectation gives E(fp(p))= n å i=1 E(1 fp i =ig )= n å i=1 P(p i = i)= n å i=1 P(p 1 = i); where the last equality follows by the symmetry of coordinates. 
5.2.4 Stein's Method and Exchangeable Pairs

There are many variants of Stein's method; we use the method of exchangeable pairs. An ordered pair $(W, W')$ of random variables is an exchangeable pair if $(W, W') \stackrel{d}{=} (W', W)$. We use the following multivariate version of Stein's method for Poisson approximation via exchangeable pairs, due to Chatterjee, Diaconis, and Meckes.

Theorem 5.2.4 ([26], Proposition 10). Let $W = (W_1, \dots, W_d)$ be a random vector with values in $\mathbb{N}^d$ and $E(W_i) = \lambda_i < \infty$. Let $Z = (Z_1, \dots, Z_d)$ have independent coordinates with $Z_i \sim \mathrm{Poisson}(\lambda_i)$. Let $W' = (W'_1, \dots, W'_d)$ be defined on the same probability space as $W$, with $(W, W')$ an exchangeable pair. Then
\[
d_{TV}(W, Z) \le \sum_{k=1}^{d} \alpha_k \big[ E|\lambda_k - c_k P(A_k)| + E|W_k - c_k P(B_k)| \big],
\]
with $\alpha_k = \min\{1, 1.4 \lambda_k^{-1/2}\}$, any choice of the constants $c_k$, and
\begin{align*}
A_k &= \{W'_k = W_k + 1,\ W_j = W'_j \text{ for } k+1 \le j \le d\}, \\
B_k &= \{W'_k = W_k - 1,\ W_j = W'_j \text{ for } k+1 \le j \le d\}.
\end{align*}

5.3 Expected Number of Cycles of a Fixed Length

In this section we compute the expected number of $k$-cycles in a random parking function. We consider the cases of fixed points and transpositions separately, as we are able to compute their expected values exactly. For general $k$, we compute the asymptotic expected number of $k$-cycles.

5.3.1 Fixed Points and Transpositions

Let $\mathrm{fp}(\pi)$ and $\mathrm{tc}(\pi)$ be the number of fixed points and the number of transpositions, respectively, of $\pi \in PF_n$. We can decompose $\mathrm{fp}(\pi)$ and $\mathrm{tc}(\pi)$ into sums of indicator random variables as
\[
\mathrm{fp}(\pi) = \sum_{i=1}^{n} 1_{\{\pi_i = i\}}, \qquad \mathrm{tc}(\pi) = \sum_{1 \le i < j \le n} 1_{\{\pi_i = j,\, \pi_j = i\}}.
\]

Proposition 5.3.1. Let $\pi \in PF_n$ be a parking function chosen uniformly at random. Then the expected number of fixed points of $\pi$ is
\[
E(\mathrm{fp}(\pi)) = 1,
\]
and the expected number of transpositions is
\[
E(\mathrm{tc}(\pi)) = \frac{n}{2(n+1)}.
\]

Proof. For fixed points, linearity of expectation gives
\[
E(\mathrm{fp}(\pi)) = \sum_{i=1}^{n} E(1_{\{\pi_i = i\}}) = \sum_{i=1}^{n} P(\pi_i = i) = \sum_{i=1}^{n} P(\pi_1 = i),
\]
where the last equality follows by the symmetry of coordinates. By Lemma 5.2.2,
\begin{align*}
\sum_{i=1}^{n} |\{\pi \in PF_n : \pi_1 = i\}| &= \sum_{i=1}^{n} |PC_n((i))| \\
&= \sum_{i=1}^{n} \sum_{s=0}^{n-i} \binom{n-1}{s} (s+1)^{s-1} (n-s)^{n-s-2} \\
&= \sum_{s=0}^{n-1} \binom{n-1}{s} (s+1)^{s-1} (n-s)^{n-s-2} \sum_{i=1}^{n-s} 1 \\
&= \sum_{s=0}^{n-1} \binom{n-1}{s} (s+1)^{s-1} (n-s)^{n-s-1} \\
&= A_{n-1}(1, 1; -1, 0) \\
&= (n+1)^{n-1},
\end{align*}
where the last two equalities follow from Abel's multinomial theorem, Lemma 5.2.1. Combining the above and using $|PF_n| = (n+1)^{n-1}$ yields
\[
E(\mathrm{fp}(\pi)) = \sum_{i=1}^{n} P(\pi_1 = i) = \sum_{i=1}^{n} \frac{|PC_n((i))|}{|PF_n|} = 1.
\]

Next we consider transpositions. By linearity of expectation,
\[
E(\mathrm{tc}(\pi)) = \sum_{1 \le i < j \le n} E(1_{\{\pi_i = j, \pi_j = i\}}) = \sum_{1 \le i < j \le n} P(\pi_i = j, \pi_j = i) = \sum_{1 \le i < j \le n} P(\pi_1 = i, \pi_2 = j),
\]
where the last equality follows by symmetry of coordinates. Using Lemma 5.2.2 gives
\[
\sum_{1 \le i < j \le n} |\{\pi \in PF_n : \pi_1 = i, \pi_2 = j\}| = \sum_{1 \le i < j \le n} |PC_n((i,j))| = \sum_{1 \le i < j \le n} \sum_{\mathbf{s} \in L_n((i,j))} \binom{n-2}{\mathbf{s}} \prod_{r=1}^{3} (s_r + 1)^{s_r - 1},
\]
where
\[
L_n((i,j)) = \big\{ \mathbf{s} = (s_1, s_2, s_3) \in \mathbb{N}^3 : s_1 \ge i - 1,\ s_1 + s_2 \ge j - 2,\ s_1 + s_2 + s_3 = n - 2 \big\}.
\]
This gives us the summation indices for the sum over $\mathbf{s}$, so that
\begin{align*}
\sum_{1 \le i < j \le n} |PC_n((i,j))| &= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sum_{s_1 = i-1}^{n-2} \sum_{s_2 = j-2-s_1}^{n-2-s_1} \binom{n-2}{s_1, s_2, n-2-s_1-s_2} (s_1+1)^{s_1-1} (s_2+1)^{s_2-1} (n-1-s_1-s_2)^{n-3-s_1-s_2} \\
&= \sum_{s_1=0}^{n-2} \sum_{s_2=0}^{n-2-s_1} \binom{n-2}{s_1, s_2, n-2-s_1-s_2} (s_1+1)^{s_1-1} (s_2+1)^{s_2-1} (n-1-s_1-s_2)^{n-3-s_1-s_2} \sum_{i=1}^{s_1+1} \sum_{j=i+1}^{s_1+s_2+2} 1.
\end{align*}
By a change of variables with $s = s_2$ and $t = n - 2 - s_1 - s_2$, we get
\[
\sum_{1 \le i < j \le n} |PC_n((i,j))| = \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} \binom{n-2}{s, t, n-2-s-t} (s+1)^{s-1} (t+1)^{t-1} (n-1-s-t)^{n-3-s-t} \sum_{i=1}^{n-1-s-t} \sum_{j=i+1}^{n-t} 1.
\]
Computing the inner sum yields
\[
\sum_{i=1}^{n-1-s-t} \sum_{j=i+1}^{n-t} 1 = \sum_{i=1}^{n-1-s-t} (n - t - i) = \frac{(n-t+s)(n-t-s-1)}{2},
\]
and plugging this back in gives
\[
\sum_{1 \le i < j \le n} |PC_n((i,j))| = \frac{1}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} \binom{n-2}{s, t, n-2-s-t} (s+1)^{s-1} (t+1)^{t-1} (n-1-s-t)^{n-2-s-t} (n-t+s).
\]
For ease of notation, define
\[
f(n, s, t) := \binom{n-2}{s, t, n-2-s-t} (s+1)^{s-1} (t+1)^{t-1} (n-1-s-t)^{n-2-s-t}.
\]
Distributing the factor $(n - t + s)$ splits the sum into three components:
\begin{align*}
\sum_{1 \le i < j \le n} |PC_n((i,j))| &= \frac{1}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} n f(n,s,t) - \frac{1}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} t f(n,s,t) + \frac{1}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} s f(n,s,t) \\
&= \frac{n}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} f(n,s,t),
\end{align*}
where the last two sums cancel by the symmetry of $f(n,s,t)$ in $s$ and $t$. Finally, we use Abel's multinomial theorem, Lemma 5.2.1, to get
\[
\frac{n}{2} \sum_{s=0}^{n-2} \sum_{t=0}^{n-2-s} f(n,s,t) = \frac{n}{2} A_{n-2}(1,1,1; -1,-1,0) = \frac{n(n+1)^{n-2}}{2}.
\]
Putting everything together gives
\[
E(\mathrm{tc}(\pi)) = \sum_{1 \le i < j \le n} P(\pi_1 = i, \pi_2 = j) = \sum_{1 \le i < j \le n} \frac{|PC_n((i,j))|}{|PF_n|} = \frac{n(n+1)^{n-2}}{2(n+1)^{n-1}} = \frac{n}{2(n+1)}.
\]
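Proposition 5.3.1 can also be probed by simulation. The sketch below is ours, not from the thesis. It samples uniform parking functions by rejection from $[n]^n$, which is practical for small $n$ since the acceptance probability is $(n+1)^{n-1}/n^n$, and compares the empirical means of $\mathrm{fp}$ and $\mathrm{tc}$ with the exact values $1$ and $n/(2(n+1))$.

import random

def is_parking_function(p):
    # A sequence parks iff its sorted rearrangement b satisfies b_i <= i.
    return all(b <= i + 1 for i, b in enumerate(sorted(p)))

def random_parking_function(n, rng):
    # Rejection sampling: uniform over PF_n because PF_n is a subset of [n]^n.
    while True:
        p = [rng.randint(1, n) for _ in range(n)]
        if is_parking_function(p):
            return p

rng = random.Random(0)
n, trials = 10, 20000
fp_total = tc_total = 0
for _ in range(trials):
    p = random_parking_function(n, rng)
    fp_total += sum(p[i] == i + 1 for i in range(n))
    tc_total += sum(
        p[i] == j + 1 and p[j] == i + 1
        for i in range(n) for j in range(i + 1, n)
    )
print(fp_total / trials, "vs exact", 1)
print(tc_total / trials, "vs exact", n / (2 * (n + 1)))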
5.3.2 General k-Cycles

Let $C_k(\pi)$ be the number of $k$-cycles of $\pi \in PF_n$. We can decompose $C_k(\pi)$ into a sum of indicator random variables as
\[
C_k(\pi) = \sum_{a \in A_k} 1_{\{a \text{ is a } k\text{-cycle in } \pi\}},
\]
where $A_k = \{(i_1, \dots, i_k) : 1 \le i_1 < \cdots < i_k \le n\}$.

Proof of Theorem 5.1.1. The proof follows similar lines to that of Proposition 5.3.1. We do not include all the technical details, but we walk through the key ideas. By linearity of expectation,
\[
E(C_k(\pi)) = \sum_{a \in A_k} P(a \text{ is a } k\text{-cycle in } \pi) = \sum_{1 \le i_1 < \cdots}
\]

[Several pages are missing from this transcript at this point: the remainder of the proof of Theorem 5.1.1; the opening of Section 5.4, including the probability bounds of Lemma 5.4.1, the exchangeable-pair construction built from a transposition $\tau = (a\, b)$ (see the proof of Lemma 5.4.3 below), and the statement of Lemma 5.4.2; and the first case, and part of the second case, of the proof of Lemma 5.4.2. In the notation below, introduced in the missing pages, $C_a$ appears to denote the cycle in the connected component of $a$, $P_a$ the path from $a$ to that cycle, and $L(\cdot)$ length. The transcript resumes inside the second case of the proof of Lemma 5.4.2, which concerns transpositions acting within a single cycle $C_a$.]

... $> k$, then there are two such transpositions. There are three subcases where $A_k$ does not occur. If $L(C_a) = 2k$, then $C_a$ splits into two cycles of length $k$, so that $W'_k = W_k + 2$. Next, if $L(C_a) \in \{k+1, k+2, \dots, d\}$, then $W'_{L(C_a)} = W_{L(C_a)} - 1$. Finally, if $L(C_a) \in \{2k+1, 2k+2, \dots, d+k\}$, then $W'_{L(C_a)-k} = W_{L(C_a)-k} + 1$. Therefore we must have that $d < L(C_a) < 2k$ or $L(C_a) > d + k$.

3. Suppose $a$ and $b$ are in the same connected component, with $a$ in the tree component and $b$ in the cycle component $C_b$. If $L(P_a) + L(C_b) > k$, then there is exactly one transposition that breaks $C_b$ into a smaller cycle of length $k$ and a path of length $L(P_a) + L(C_b) - k$. There are two subcases where $A_k$ does not occur. First, if $L(C_b) = k$, then $W'_k = W_k$, since a $k$-cycle is created but the $k$-cycle $C_b$ is destroyed. Second, if $L(C_b) \in \{k+1, k+2, \dots, d\}$, then $W'_{L(C_b)} = W_{L(C_b)} - 1$. Therefore we must have that $L(C_b) < k$ or $L(C_b) > d$.

4. Finally, suppose $a$ and $b$ are both in the same tree component, with $b$ on the unique path from $a$ to the root. If $L(P_a) > k$, then there is exactly one transposition that breaks $P_a$ into a cycle of length $k$ and a path of length $L(P_a) - k$.

Combining the cases above gives
\begin{align*}
P(A_k) &= \frac{1}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)+L(C_b)=k\}} 1_{\{C_a \ne C_b\}} + \frac{2}{n(n-1)} \sum_{a=1}^{n} \big( 1_{\{d < L(C_a) < 2k\}} + 1_{\{L(C_a) > d+k\}} \big) \\
&\quad + \frac{2}{n(n-1)} \sum_{a=1}^{n} 1_{\{L(P_a)+L(C_b) > k\}} \big( 1_{\{L(C_b) < k\}} + 1_{\{L(C_b) > d\}} \big) + \frac{2}{n(n-1)} \sum_{a=1}^{n} 1_{\{L(P_a) > k\}}.
\end{align*}
Rewriting using $1_{\{L(C_a) > d+k\}} = 1 - 1_{\{L(C_a) \le d+k\}}$, $1_{\{L(P_a)+L(C_b) > k\}} = 1 - 1_{\{L(P_a)+L(C_b) \le k\}}$, $1_{\{L(C_b) > d\}} = 1 - 1_{\{L(C_b) \le d\}}$, and $1_{\{L(P_a) > k\}} = 1 - 1_{\{L(P_a) \le k\}}$ yields, upon simplification,
\begin{align*}
P(A_k) &= \frac{6}{n-1} + \frac{1}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)+L(C_b)=k\}} 1_{\{C_a \ne C_b\}} + \frac{2}{n(n-1)} \sum_{a=1}^{n} \big( 1_{\{d < L(C_a) < 2k\}} - 1_{\{L(C_a) \le d+k\}} \big) \\
&\quad + \frac{1}{n(n-1)} \sum_{a=1}^{n} \big( 1_{\{L(C_b) < k\}} - 1_{\{L(C_b) \le d\}} - 1_{\{L(P_a)+L(C_b) \le k\}} \big) - \frac{1}{n(n-1)} \sum_{a=1}^{n} 1_{\{L(P_a) \le k\}}.
\end{align*}
Using the fact that $\lambda_k = \frac{1}{k} + O(n^{-1})$ from the proof of Theorem 5.1.1, $c_k = \frac{n}{6k}$, and the triangle inequality, we get
\begin{align*}
E|\lambda_k - c_k P(A_k)| &\le \frac{1}{k(n-1)} + O(n^{-1}) + \frac{1}{6k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} E\big( 1_{\{L(C_a)+L(C_b)=k\}} 1_{\{C_a \ne C_b\}} \big) \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} E\big( 1_{\{d<L(C_a)<2k\}} + 1_{\{L(C_a) \le d+k\}} \big) \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} E\big( 1_{\{L(C_b)<k\}} + 1_{\{L(C_b) \le d\}} + 1_{\{L(P_a)+L(C_b) \le k\}} \big) \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} E\big( 1_{\{L(P_a) \le k\}} \big).
\end{align*}
We bound the expectations in the four summands above. Conditioning on the length of $C_a$ and applying Lemma 5.4.1, the first summand is
\begin{align*}
E\big( 1_{\{L(C_a)+L(C_b)=k,\, C_a \ne C_b\}} \big) &= \sum_{j=1}^{k-1} E\big( 1_{\{C_a \ne C_b\}} 1_{\{L(C_a)=j\}} 1_{\{L(C_b)=k-j\}} \big) \\
&\le \sum_{j=1}^{k-1} P(L(C_b) = k-j \mid C_a \ne C_b,\, L(C_a) = j)\, P(L(C_a) = j) \\
&\le \sum_{j=1}^{k-1} \frac{k-j+1}{n-k+1} \cdot \frac{j+1}{n+1} = \frac{k^3 + 6k^2 - k - 6}{6(n+1)(n-k+1)}.
\end{align*}
By the union bound and Lemma 5.4.1, the second summand is
\[
E\big( 1_{\{d<L(C_a)<2k\}} + 1_{\{L(C_a) \le d+k\}} \big) \le 2k\, P(L(C_a) = 2k) + (d+k)\, P(L(C_a) = d+k) \le \frac{2k(2k+1) + (d+k)(d+k+1)}{n+1}.
\]
Similarly, by Lemma 5.4.1, the third summand is
\[
E\big( 1_{\{L(C_b)<k\}} + 1_{\{L(C_b) \le d\}} + 1_{\{L(P_a)+L(C_b) \le k\}} \big) \le (k-1)\, P(L(C_b) = k-1) + d\, P(L(C_b) = d) + k\, P(L(P_a)+L(C_b) = k) \le \frac{k(k-1) + d(d+1) + k(k+1)}{n+1}.
\]
Finally, by Lemma 5.4.1, the fourth summand is
\[
E\big( 1_{\{L(P_a) \le k\}} \big) \le k\, P(L(P_a) = k) \le \frac{k(k+1)}{n+1}.
\]
Therefore, combining the above yields the upper bound
\begin{align*}
E|\lambda_k - c_k P(A_k)| &\le \frac{1}{k(n-1)} + O(n^{-1}) + \frac{1}{6k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} \frac{k^3+6k^2-k-6}{6(n+1)(n-k+1)} \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} \frac{2k(2k+1)+(d+k)(d+k+1)}{n+1} + \frac{1}{3k(n-1)} \sum_{a=1}^{n} \frac{k(k-1)+d(d+1)+k(k+1)}{n+1} + \frac{1}{3k(n-1)} \sum_{a=1}^{n} \frac{k(k+1)}{n+1} \\
&\le \frac{1}{k(n-1)} + O(n^{-1}) + \frac{k^2+6k}{n-k+1} + \frac{d^2+2dk+d+5k^2+3k}{k(n-1)} + \frac{d^2+d+2k^2}{k(n-1)} + \frac{k+1}{n-1} \\
&= \frac{2d^2+2dk+2d+8k^2+4k+1}{k(n-1)} + \frac{k^2+6k}{n-k+1} + O(n^{-1}).
\end{align*}
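The quantities $L(C_a)$ and $L(P_a)$ driving the case analysis above are cheap to compute: iterating $\pi$ from $a$ traces a rho-shaped path whose tail has length $L(P_a)$ and whose terminal cycle has length $L(C_a)$. The following sketch is ours, written for this transcript under that reading of the notation.

def rho_lengths(f, a):
    # Tail and cycle length of the walk a, f(a), f(f(a)), ..., where f is a
    # 0-indexed list representing a function [n] -> [n].
    seen = {}
    x, i = a, 0
    while x not in seen:
        seen[x] = i
        x = f[x]
        i += 1
    tail = seen[x]       # steps before the walk enters its cycle
    cycle = i - seen[x]  # length of the cycle reached from a
    return tail, cycle

# Example: f has the 3-cycle 0 -> 1 -> 2 -> 0 with 4 -> 3 -> 0 hanging off it.
f = [1, 2, 0, 0, 3]
print([rho_lengths(f, a) for a in range(5)])
# [(0, 3), (0, 3), (0, 3), (1, 3), (2, 3)]

When the tail length is zero, $a$ lies on its cycle and the second coordinate is $L(C_a)$.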
Lemma 5.4.3. Let $(W, W')$ be the exchangeable pair defined above. If $c_k = \frac{n}{6k}$, then
\[
E|W_k - c_k P(B_k)| \le \frac{k+1}{n-1} + \frac{dk^3 + 6k^3 - 2dk^2 + 3d^2k - 6k + 3d^2 + 3d}{k(n-k+1)}.
\]

Proof. Let $\tau = (a\, b)$ be the transposition in the exchangeable-pairs construction for $W'$. We count the transpositions such that $B_k$ occurs. Observe that if $a \in C_a$ where $C_a$ is a $k$-cycle, then any transposition $\tau$ with $b \ne a$ will break the cycle $C_a$. There are four cases.

1. If $a$ and $b$ are in different cycles $C_a$ and $C_b$, then $\tau$ breaks $C_a$ and $C_b$ and strings them together to form a cycle of length $L(C_b) + k$. Thus we must have either $L(C_b) > d$, or $L(C_b) < k$ and $L(C_b) + k > d$.

2. Suppose $a$ and $b$ are both in the same cycle component $C_a$. Then $\tau$ breaks $C_a$ into two smaller cycles. For fixed $a$, there are $k-1$ choices for $b$.

3. Suppose $a$ and $b$ are in different components, with $a \in C_a$ and $b$ in a tree component. Denote this event by $G_1$. Then $\tau$ breaks $C_a$ and $P_b$, and creates a path of length $L(P_b) + L(C_a)$.

4. Suppose $a$ and $b$ are in the same component, with $a \in C_a$ and $b$ in the tree component. Then $\tau$ creates a cycle whose length lies in the interval $[L(P_b), L(P_b) + L(C_a) - 1]$, together with a path. The created cycle must have length greater than $d$ or less than $k$. There are three cases. If $L(P_b) > d$, then $\tau$ creates a cycle of length greater than $d$. If $L(P_b) = d - j + 1$ for $1 \le j \le k-1$, then there are exactly $k-j$ transpositions $\tau$ which create a cycle of length greater than $d$. Finally, if $L(P_b) = j$ for $1 \le j \le k-1$, then there are exactly $k-j$ transpositions $\tau$ which create a cycle of length less than $k$.

Combining the cases above gives
\begin{align*}
P(B_k) &= \frac{2}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)=k\}} 1_{\{C_a \ne C_b\}} \big[ 1_{\{L(C_b)>d\}} + 1_{\{L(C_b)<k\}} 1_{\{L(C_b)>d-k\}} \big] + \frac{k-1}{n(n-1)} \sum_{a=1}^{n} 1_{\{L(C_a)=k\}} \\
&\quad + \frac{2}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)=k\}} 1_{G_1} + \frac{2}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)=k\}} \Big[ 1_{\{L(P_b)>d\}} + \sum_{j=1}^{k-1} (k-j) \big( 1_{\{L(P_b)=d-j+1\}} + 1_{\{L(P_b)=j\}} \big) \Big].
\end{align*}
Rewriting using $1_{\{L(C_b)>d\}} = 1 - 1_{\{L(C_b) \le d\}}$, $1_{\{L(C_b)>d-k\}} = 1 - 1_{\{L(C_b) \le d-k\}}$, $1_{\{L(P_b)>d\}} = 1 - 1_{\{L(P_b) \le d\}}$, and simplifying gives
\begin{align*}
P(B_k) &= \frac{6}{n} \sum_{a=1}^{n} 1_{\{L(C_a)=k\}} + \frac{2}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)=k\}} 1_{\{C_a \ne C_b\}} \big[ 1_{\{L(C_b)<k\}} - 1_{\{L(C_b)<k\}} 1_{\{L(C_b) \le d-k\}} - 1_{\{L(C_b) \le d\}} \big] \\
&\quad + \frac{k-1}{n(n-1)} \sum_{a=1}^{n} 1_{\{L(C_a)=k\}} + \frac{2}{n(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} 1_{\{L(C_a)=k\}} \Big[ -1_{\{L(P_b) \le d\}} + \sum_{j=1}^{k-1} (k-j) \big( 1_{\{L(P_b)=d-j+1\}} + 1_{\{L(P_b)=j\}} \big) \Big].
\end{align*}
Observe that we may write the number of $k$-cycles as a sum of indicators:
\[
W_k = \frac{1}{k} \sum_{a=1}^{n} 1_{\{L(C_a)=k\}}.
\]
To see this, note that every member of a cycle of length $k$ contributes an indicator of value $1$ to the sum, so that each $k$-cycle is counted $k$ times; dividing the sum by $k$ therefore yields the correct number of $k$-cycles. Using this representation of $W_k$, along with $c_k = \frac{n}{6k}$ and the triangle inequality, gives
\begin{align*}
E|W_k - c_k P(B_k)| &\le \frac{1}{3k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} E\Big( 1_{\{L(C_a)=k\}} 1_{\{C_a \ne C_b\}} \big[ 1_{\{L(C_b)<k\}} + 1_{\{L(C_b)<k\}} 1_{\{L(C_b) \le d-k\}} + 1_{\{L(C_b) \le d\}} \big] \Big) \\
&\quad + \frac{k-1}{6k(n-1)} \sum_{a=1}^{n} E\big( 1_{\{L(C_a)=k\}} \big) \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} E\Big( 1_{\{L(C_a)=k\}} \Big[ 1_{\{L(P_b) \le d\}} + \sum_{j=1}^{k-1} (k-j) \big( 1_{\{L(P_b)=d-j+1\}} + 1_{\{L(P_b)=j\}} \big) \Big] \Big).
\end{align*}
We bound the expected values in the summands. The computations are similar to those in the proof of Lemma 5.4.2, so we omit some technical details. By the union bound and Lemma 5.4.1, the first summand is
\begin{align*}
&E\Big( 1_{\{L(C_a)=k\}} 1_{\{C_a \ne C_b\}} \big[ 1_{\{L(C_b)<k\}} + 1_{\{L(C_b)<k\}} 1_{\{L(C_b) \le d-k\}} + 1_{\{L(C_b) \le d\}} \big] \Big) \\
&\quad \le P(L(C_a)=k) \big[ P(L(C_b)<k \mid L(C_a)=k,\, C_a \ne C_b) + P(L(C_b)<k,\, L(C_b) \le d-k \mid L(C_a)=k,\, C_a \ne C_b) + P(L(C_b) \le d \mid L(C_a)=k,\, C_a \ne C_b) \big] \\
&\quad \le \frac{k+1}{n+1} \cdot \frac{2k(k-1) + (d-k)(d-k+1) + d(d+1)}{n-k+1} \le \frac{2d^2k + 2d^2 - 2dk^2 + 2d + 3k^3 - 3k}{(n+1)(n-k+1)}.
\end{align*}
By Lemma 5.4.1, the second summand is
\[
E\big( 1_{\{L(C_a)=k\}} \big) = P(L(C_a)=k) \le \frac{k+1}{n+1}.
\]
Finally, the third summand is
\begin{align*}
E\Big( 1_{\{L(C_a)=k\}} \Big[ 1_{\{L(P_b) \le d\}} + \sum_{j=1}^{k-1} (k-j) \big( 1_{\{L(P_b)=d-j+1\}} + 1_{\{L(P_b)=j\}} \big) \Big] \Big) &\le \frac{k+1}{n+1} \Bigg( \frac{d(d+1)}{n-k+1} + \sum_{j=1}^{k-1} (k-j)\, \frac{d-j+2+j+1}{n-k+1} \Bigg) \\
&\le \frac{d^2k + d^2 + dk^3 + d + 3k^3 - 3k}{(n+1)(n-k+1)}.
\end{align*}
Combining the above gives us the upper bound
\begin{align*}
E|W_k - c_k P(B_k)| &\le \frac{1}{3k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} \frac{2d^2k + 2d^2 - 2dk^2 + 2d + 3k^3 - 3k}{(n+1)(n-k+1)} + \frac{k-1}{6k(n-1)} \sum_{a=1}^{n} \frac{k+1}{n+1} \\
&\quad + \frac{1}{3k(n-1)} \sum_{a=1}^{n} \sum_{b \ne a} \frac{d^2k + d^2 + dk^3 + d + 3k^3 - 3k}{(n+1)(n-k+1)} \\
&\le \frac{2d^2k + 2d^2 - 2dk^2 + 2d + 3k^3 - 3k}{k(n-k+1)} + \frac{k+1}{n-1} + \frac{d^2k + d^2 + dk^3 + d + 3k^3 - 3k}{k(n-k+1)} \\
&= \frac{k+1}{n-1} + \frac{dk^3 + 6k^3 - 2dk^2 + 3d^2k - 6k + 3d^2 + 3d}{k(n-k+1)}.
\end{align*}
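The representation $W_k = \frac{1}{k} \sum_a 1_{\{L(C_a)=k\}}$ used in this proof is also convenient computationally: combined with the rho-walk sketch above, it lets one simulate the joint cycle counts of a random parking function and compare the empirical means with the limiting rates $\frac{1}{k}$. A rough illustration, ours and not from the thesis, reusing random_parking_function and rho_lengths from the earlier sketches:

from collections import Counter

def cycle_counts(p, d):
    # W_1, ..., W_d of the map i -> p[i], via W_k = (1/k) #{a : L(C_a) = k}.
    f = [x - 1 for x in p]  # 0-indexed functional graph
    hits = Counter()
    for a in range(len(p)):
        tail, cycle = rho_lengths(f, a)
        if tail == 0:       # a lies on its cycle
            hits[cycle] += 1
    return [hits[k] // k for k in range(1, d + 1)]

rng = random.Random(1)
n, d, trials = 10, 3, 5000
totals = [0] * d
for _ in range(trials):
    W = cycle_counts(random_parking_function(n, rng), d)
    for k in range(d):
        totals[k] += W[k]
for k in range(d):
    print(k + 1, totals[k] / trials, "vs", 1 / (k + 1))
# lambda_k -> 1/k only as n grows; for n = 10 the agreement is rough
# (for k = 2 the exact mean is n/(2(n+1)) by Proposition 5.3.1).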
Using the two lemmas above with the multivariate version of Stein's method via exchangeable pairs, we can now prove our Poisson limit theorem for cycle counts.

Proof of Theorem 5.1.2. By Theorem 5.2.4 and Lemmas 5.4.2 and 5.4.3,
\begin{align*}
d_{TV}(W, Z) &\le \sum_{k=1}^{d} \big( E|\lambda_k - c_k P(A_k)| + E|W_k - c_k P(B_k)| \big) \\
&\le \sum_{k=1}^{d} \Big( \frac{2d^2+2dk+2d+8k^2+4k+1}{k(n-1)} + \frac{k^2+6k}{n-k+1} + O(n^{-1}) \Big) + \sum_{k=1}^{d} \Big( \frac{k+1}{n-1} + \frac{dk^3+6k^3-2dk^2+3d^2k-6k+3d^2+3d}{k(n-k+1)} \Big) \\
&\le \frac{1}{n-d} \sum_{k=1}^{d} \frac{dk^3+7k^3-dk^2+15k^2+3d^2k+2dk-k+5d^2+5d+1}{k} + O(d/n) \\
&= \frac{1}{n-d} \Big( (5d^2+5d+1) H_d + \frac{d^4+16d^3+38d^2+23d}{3} \Big) + O(d/n),
\end{align*}
where $H_d$ is the $d$th harmonic number. Using the fact that $H_d \le \log d + 1$ and simplifying, we obtain the upper bound
\[
d_{TV}(W, Z) \le \frac{d^4 + 6d^3 + 18d^2 + 13d + 1 + (5d^2+5d+1)\log d}{n-d} + O(d/n) = O\Big( \frac{d^4}{n-d} \Big).
\]
For $d = o(n^{1/4})$, we have $d_{TV}(W, Z) \to 0$ as $n \to \infty$. Let $Y = (Y_1, \dots, Y_d)$, where the $Y_k$ are independent Poisson random variables with rate $\frac{1}{k}$. Since $\lambda_k \to \frac{1}{k}$ as $n \to \infty$ by Theorem 5.1.1, we have $d_{TV}(Z, Y) \to 0$ as $n \to \infty$. By the triangle inequality, $d_{TV}(W, Y) \to 0$ as $n \to \infty$. It follows that for every fixed $d$, $(C_1, \dots, C_d) \xrightarrow{D} (Y_1, \dots, Y_d)$ as $n \to \infty$. Therefore
\[
(C_1, C_2, \dots) \xrightarrow{D} (Y_1, Y_2, \dots) \quad \text{as } n \to \infty.
\]

We remark that if $d$ is a fixed constant, the total variation distance upper bound is $O(n^{-1})$, which is the optimal convergence rate for Poisson approximation.

5.5 Final Remarks

5.5.1 Random Mappings

Parking functions are a subset of a more general class of functions, $\mathcal{F}_n$, called random mappings, which are functions $f: [n] \to [n]$ from the set $[n]$ to itself. Random mappings are used extensively in computer science and computational mathematics, for example in random number generation, cycle detection, and integer factorization. There is an extensive literature on the probabilistic properties of random mappings, of which we cite only a few works here. Harris [59] initiated the classical theory of random mappings and studied various probability distributions related to them. In [57], Hansen proved a functional central limit theorem for the component structure of the graph representation of random mappings. Subsequently, Flajolet and Odlyzko [45] studied various statistics on random mappings and used analytic combinatorics to prove limit theorems for these statistics. In [4], Aldous and Pitman found connections between features of random mappings and features of Brownian bridge. A parallel probabilistic study of parking functions is fully warranted. One direction is to study various statistics on random parking functions and find their limiting distributions.

5.5.2 Generating Function Approach

It would be interesting to see whether there is a generating function approach to proving the Poisson limit theorem for cycle counts. Generating functions, combined with analytic combinatorics and singularity analysis, have been widely used to compute moments of statistics on various random combinatorial structures, as well as to prove limit theorems. In particular, it would be interesting to place parking functions into the theoretical framework of logarithmic combinatorial assemblies, introduced by Arratia, Stark, and Tavaré in [11]. Examples of assemblies include permutations, mappings, and set partitions.
5.5.3 (m,n)- and u-Parking Functions

In [107] and [106], Yin initiated the probabilistic study of $(m,n)$-parking functions and $\mathbf{u}$-parking functions, respectively, and in particular obtained explicit formulas for their parking completions. It should be tractable to follow our approach and use Stein's method via exchangeable pairs to obtain limit theorems, with convergence rates, for the distribution of cycle counts in these more general models.

5.5.4 Total Number of Cycles

A generating function approach was used in [96] by Shepp and Lloyd to show that the total number of cycles in a uniformly random permutation is asymptotically normal with mean and variance $\log n$. Similarly, Flajolet and Odlyzko [45] used generating functions to show that the total number of cycles in a uniformly random mapping is asymptotically normal with mean and variance $\frac{1}{2} \log n$. Note that, asymptotically, random mappings have about half as many total cycles as random permutations.

Let $K_n(\pi) = \sum_{k=1}^{n} C_k(\pi)$ be the total number of cycles of a uniformly random parking function $\pi \in PF_n$. By Theorem 5.1.1,
\[
E(K_n(\pi)) = \sum_{k=1}^{n} E(C_k(\pi)) \approx \sum_{k=1}^{n} \frac{1}{k} = H_n,
\]
where $H_n$ is the $n$th harmonic number. Note that although $E(C_k(\pi)) \sim \frac{1}{k}$ for each fixed $k$, the implied error terms are not uniform in $k$, so we are only able to conclude that $E(K_n(\pi)) \asymp \log n$. A more careful analysis should give the correct constant in front of the $\log$ term. It should be tractable to use Stein's method for normal approximation to show that $K_n$ is asymptotically normal, along with a rate of convergence.
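The heuristic $E(K_n(\pi)) \approx H_n$ is easy to probe numerically with the same machinery. A rough simulation, ours and not from the thesis, again reusing random_parking_function and rho_lengths from the earlier sketches:

from collections import Counter
from math import log

def total_cycles(p):
    # K_n = total number of cycles of the functional graph of p.
    f = [x - 1 for x in p]
    hits = Counter()
    for a in range(len(p)):
        tail, cycle = rho_lengths(f, a)
        if tail == 0:
            hits[cycle] += 1
    return sum(count // k for k, count in hits.items())

rng = random.Random(2)
n, trials = 12, 4000
mean = sum(total_cycles(random_parking_function(n, rng)) for _ in range(trials)) / trials
H_n = sum(1 / k for k in range(1, n + 1))
print(mean, "vs H_n =", H_n, "vs log n =", log(n))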
References

[1] H. Acan. “On a uniformly random chord diagram and its intersection graph”. Discrete Mathematics 340 (2017), pp. 1967–1985.
[2] H. Acan and B. Pittel. “Formation of a giant component in the intersection graph of a random chord diagram”. Journal of Combinatorial Theory, Series B 125 (2017), pp. 33–79.
[3] A. Adeniran, S. Butler, G. Dorpalen-Barry, P. E. Harris, C. Hettle, Q. Liang, J. L. Martin, and H. Nam. “Enumerating parking completions using join and split”. Electronic Journal of Combinatorics 27.2 (2020), P2.44.
[4] D. J. Aldous and J. Pitman. “Brownian bridge asymptotics for random mappings”. Random Structures and Algorithms 5.4 (1994), pp. 487–512.
[5] J. E. Anderson, R. C. Penner, C. M. Reidys, and M. S. Waterman. “Topological classification and enumeration of RNA structures by genus”. Journal of Mathematical Biology 67.5 (2013), pp. 1261–1278.
[6] O. Angel, R. van der Hofstad, and C. Holmgren. “Limit laws for self-loops and multiple edges in the configuration model”. Annales de l'Institut Henri Poincaré Probabilités et Statistiques 55.3 (2019), pp. 1509–1530.
[7] S. Arenas-Velilla, O. Arizmendi, and J. E. Paguyo. “Convergence rates for crossings in random chord diagrams”. In preparation.
[8] R. Arratia and S. DeSalvo. “Completely effective error bounds for Stirling numbers of the first and second kinds via Poisson approximation”. Annals of Combinatorics 21 (2017), pp. 1–24.
[9] R. Arratia, L. Goldstein, and L. Gordon. “Poisson approximation and the Chen-Stein method”. Statistical Science 5.4 (1990), pp. 403–434.
[10] R. Arratia, L. Goldstein, and L. Gordon. “Two moments suffice for Poisson approximations: The Chen-Stein method”. Annals of Probability 17.1 (1989), pp. 9–25.
[11] R. Arratia, D. Stark, and S. Tavaré. “Total variation asymptotics for Poisson process approximations of logarithmic combinatorial assemblies”. Annals of Probability 23.3 (1995), pp. 1347–1388.
[12] R. Arratia and S. Tavaré. “The cycle structure of random permutations”. Annals of Probability 20.3 (1992), pp. 1567–1591.
[13] F. Avram and D. Bertsimas. “On central limit theorems in geometrical probability”. Annals of Applied Probability 3.4 (1993), pp. 1033–1046.
[14] P. Baldi and Y. Rinott. “On normal approximations of distributions in terms of dependency graphs”. Annals of Probability 17.4 (1989), pp. 1646–1650.
[15] I. Bárány and V. Vu. “Central limit theorems for Gaussian polytopes”. Annals of Probability 35.4 (2007), pp. 1593–1621.
[16] A. D. Barbour, L. Holst, and S. Janson. Poisson Approximation. New York: The Clarendon Press, Oxford University Press, 1992.
[17] A. Barvinok. “What does a random contingency table look like?” Combinatorics, Probability and Computing 19 (2010), pp. 517–539.
[18] C. Betken. “A central limit theorem for the number of isolated vertices in a preferential attachment random graph”. arXiv:1910.02668. 2019.
[19] C. Bhattacharjee and L. Goldstein. “Dickman approximation in simulation, summations and perpetuities”. Bernoulli 25.4A (2019), pp. 2758–2792.
[20] S. C. Billey, M. Konvalinka, T. K. Petersen, W. Slofstra, and B. E. Tenner. “Parabolic double cosets in Coxeter groups”. Electronic Journal of Combinatorics 25.1.23 (2018).
[21] B. Bollobás and O. Riordan. “Linearized chord diagrams and an upper bound for Vassiliev invariants”. Journal of Knot Theory and Its Ramifications 9 (2000), pp. 847–853.
[22] B. Bollobás and O. Riordan. “The diameter of a scale-free random graph”. Combinatorica 24.1 (2004), pp. 5–34.
[23] M. Bóna. “Generalized descents and normality”. Electronic Journal of Combinatorics 15.N21 (2008).
[24] P. Chassaing and J. F. Marckert. “Parking functions, empirical processes, and the width of rooted labeled trees”. Electronic Journal of Combinatorics 8.R14 (2001).
[25] S. Chatterjee and P. Diaconis. “A central limit theorem for a new statistic on permutations”. Indian Journal of Pure and Applied Mathematics 48 (2017), pp. 561–573.
[26] S. Chatterjee, P. Diaconis, and E. Meckes. “Exchangeable pairs and Poisson approximation”. Probability Surveys 2 (2005), pp. 64–106.
[27] S. Chatterjee, J. Fulman, and A. Röllin. “Exponential approximation by Stein's method and spectral graph theory”. ALEA Latin American Journal of Probability and Mathematical Statistics 8 (2011), pp. 197–223.
[28] L. H. Y. Chen. “Poisson approximation for dependent trials”. Annals of Probability 3.3 (1975), pp. 534–545.
[29] L. H. Y. Chen and A. Röllin. “Stein couplings for normal approximation”. arXiv:1003.6039. 2010.
[30] W. Y. C. Chen, E. Y. P. Deng, R. R. X. Du, R. P. Stanley, and C. H. Yan. “Crossings and nestings of matchings and partitions”. Transactions of the American Mathematical Society 359.4 (2007), pp. 1555–1575.
[31] B. Chern, P. Diaconis, D. M. Kane, and R. C. Rhoades. “Central limit theorems for some set partition statistics”. Advances in Applied Mathematics 70 (2015), pp. 92–105.
[32] M. Conger and D. Viswanath. “Normal approximations for descents and inversions of permutations of multisets”. Journal of Theoretical Probability 20.2 (2007), pp. 309–325.
[33] H. Crane and S. DeSalvo. “Pattern avoidance for random permutations”. Discrete Mathematics and Theoretical Computer Science 19.2 (2018), p. 13.
[34] J. M. DeLaurentis and B. Pittel. “Random permutations and Brownian motion”. Pacific Journal of Mathematics 119.2 (1985), pp. 287–301.
[35] P. Diaconis, S. Evans, and R. Graham. “Unseparated pairs and fixed points in random permutations”. Advances in Applied Mathematics 61 (2014), pp. 102–124.
[36] P. Diaconis, J. Fulman, and R. Guralnick. “On fixed points of permutations”. Journal of Algebraic Combinatorics 28 (2008), pp. 189–218.
[37] P. Diaconis and A. Gangolli. “Rectangular arrays with fixed margins”. Discrete Probability and Algorithms. Vol. 72. New York: Springer, 1993, pp. 15–41.
[38] P. Diaconis and A. Hicks. “Probabilizing parking functions”. Advances in Applied Mathematics 89 (2017), pp. 125–155.
[39] P. Diaconis and M. Simper. “Statistical enumeration of groups by double cosets”. Journal of Algebra 607 (2022), pp. 214–246.
[40] R. Ehrenborg and A. Happ. “Parking cars after a trailer”. Australasian Journal of Combinatorics 70.3 (2018), pp. 402–406.
[41] R. Ehrenborg and A. Happ. “Parking cars of different sizes”. American Mathematical Monthly 123.10 (2016), pp. 1045–1048.
[42] V. Féray. “Central limit theorems for patterns in multiset permutations and set partitions”. Annals of Applied Probability 30.1 (2020), pp. 287–323.
[43] V. Féray. “Weighted dependency graphs”. Electronic Journal of Probability 23.93 (2018), pp. 1–65.
[44] P. Flajolet and M. Noy. “Analytic combinatorics of chord diagrams”. Formal Power Series and Algebraic Combinatorics. Springer, 2000, pp. 191–201.
[45] P. Flajolet and A. Odlyzko. “Random mapping statistics”. Advances in Cryptology - EUROCRYPT '89. Vol. 434. Lecture Notes in Computer Science. 1989, pp. 329–354.
[46] P. Flajolet, P. Poblete, and A. Viola. “On the analysis of linear probing hashing”. Algorithmica 22 (1998), pp. 490–515.
[47] D. Foata and J. Riordan. “Mappings of acyclic and parking functions”. Aequationes Mathematicae 10 (1974), pp. 10–22.
[48] J. Fulman. “Stein's method and non-reversible Markov chains”. Stein's Method: Expository Lectures and Applications. Vol. 46. IMS Lecture Notes - Monograph Series. 2004, pp. 66–74.
[49] J. Fulman. “The distribution of descents in fixed conjugacy classes of the symmetric group”. Journal of Combinatorial Theory, Series A 84 (1998), pp. 171–180.
[50] J. Fulman, G. B. Kim, and S. Lee. “Central limit theorem for peaks of a random permutation in a fixed conjugacy class of S_n”. Annals of Combinatorics 26 (2022), pp. 97–123.
[51] I. Gessel and S. Seo. “A refinement of Cayley's formula for trees”. Electronic Journal of Combinatorics 11.2 (2006), R27.
[52] S. Ghosh and L. Goldstein. “Concentration of measures via size-biased couplings”. Probability Theory and Related Fields 149 (2011), pp. 271–278.
[53] L. Goldstein and G. Reinert. “Stein's method and the zero bias transformation with application to simple random sampling”. Annals of Applied Probability 7.4 (1997), pp. 935–952.
[54] L. Goldstein and G. Reinert. “Total variation distance for Poisson subset numbers”. Annals of Combinatorics 10 (2006), pp. 333–341.
[55] L. Goldstein and Y. Rinott. “Multivariate normal approximations by Stein's method and size bias couplings”. Journal of Applied Probability 33.1 (1996), pp. 1–17.
[56] E. Gorsky, M. Mazin, and M. Vazirani. “Affine permutations and rational slope parking functions”. Transactions of the American Mathematical Society 368 (2016), pp. 8403–8445.
[57] J. Hansen. “A functional central limit theorem for random mappings”. Annals of Probability 17.1 (1989), pp. 317–332.
[58] J. Harer and D. Zagier. “The Euler characteristic of the moduli spaces of curves”. Inventiones Mathematicae 85 (1986), pp. 457–486.
[59] B. Harris. “Probability distributions related to random mappings”. Annals of Mathematical Statistics 31.4 (1960), pp. 1045–1062.
[60] J. He. “A central limit theorem for descents of a Mallows permutation and its inverse”. Annales de l'Institut Henri Poincaré Probabilités et Statistiques 58.2 (2022), pp. 667–694.
[61] L. Hofer. “A central limit theorem for vincular permutation patterns”. Discrete Mathematics and Theoretical Computer Science 19.2 (2018), p. 9.
[62] C. Hoffman, D. Rizzolo, and E. Slivken. “Pattern-avoiding permutations and Brownian excursion part II: Fixed points”. Probability Theory and Related Fields 169.1 (2017), pp. 377–424.
[63] R. van der Hofstad. Random Graphs and Complex Networks, Volume 1. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press, 2017.
[64] C. Holmgren and S. Janson. “Limit laws for functions of fringe trees for binary search trees and random recursive trees”. Electronic Journal of Probability 20.4 (2015), pp. 1–51.
[65] S. Janson. “Asymptotic distribution for the cost of linear probing hashing”. Random Structures and Algorithms 19.3-4 (2001), pp. 438–471.
[66] S. Janson. “Large deviations for sums of partly dependent random variables”. Random Structures and Algorithms 24.3 (2004), pp. 234–248.
[67] S. Janson. “Normal convergence by higher semiinvariants with applications to sums of dependent random variables and random graphs”. Annals of Probability 16.1 (1988), pp. 305–312.
[68] H. Joe. “An ordering of dependence for contingency tables”. Linear Algebra and its Applications 70 (1985), pp. 89–103.
[69] D. Judkovich. “The cycle structure of permutations without long cycles”. arXiv:1905.04636. 2019.
[70] S. H. Kang and J. Klotz. “Limiting conditional distribution for tests of independence in the two way table”. Communications in Statistics - Theory and Methods 27 (1998), pp. 2075–2082.
[71] R. Kenyon and M. Yin. “Parking functions: From combinatorics to probability”. arXiv:2103.17180. 2021.
[72] G. B. Kim. “Distribution of descents in matchings”. Annals of Combinatorics 23.1 (2019), pp. 73–87.
[73] G. B. Kim and S. Lee. “A central limit theorem for descents and major indices in fixed conjugacy classes of S_n”. Advances in Applied Mathematics 124.102132 (2021).
[74] G. B. Kim and S. Lee. “Central limit theorem for descents in conjugacy classes of S_n”. Journal of Combinatorial Theory, Series A 169.105123 (2020).
[75] A. G. Konheim and B. Weiss. “An occupancy discipline and applications”. SIAM Journal on Applied Mathematics 14.6 (1966), pp. 1266–1274.
[76] M. Kontsevich. “Vassiliev's knot invariants”. Advances in Soviet Mathematics 16.2 (1993), pp. 137–150.
[77] J. P. S. Kung and C. H. Yan. “Exact formulas for moments of sums of classical parking functions”. Advances in Applied Mathematics 31 (2003), pp. 215–241.
[78] J. P. S. Kung and C. H. Yan. “Expected sums of general parking functions”. Annals of Combinatorics 7 (2003), pp. 481–493.
[79] N. Linial and T. Nowik. “The expected genus of a random chord diagram”. Discrete and Computational Geometry 45.1 (2011), pp. 161–180.
[80] W. L. Loh. “Stein's method and multinomial approximation”. Annals of Applied Probability 2.3 (1992), pp. 536–554.
[81] A. A. Mahmoud. “An asymptotic expansion for the number of 2-connected chord diagrams”. arXiv:2009.12688. 2020.
[82] S. Miner and I. Pak. “The shape of random pattern-avoiding permutations”. Advances in Applied Mathematics 55 (2014), pp. 86–130.
[83] P. R. de Montmort. Essay d'Analyse sur les Jeux de Hazard. Reprinted by Chelsea Publishing. Paris, 1708.
[84] S. Mukherjee. “Fixed points and cycle structure of random permutations”. Electronic Journal of Probability 21.40 (2016), pp. 1–18.
[85] A. Nica and R. Speicher. Lectures on the Combinatorics of Free Probability. Vol. 335. Lecture Notes Series. Cambridge: London Mathematical Society, 2006.
[86] A. Özdemir. “Martingales and descent statistics”. Advances in Applied Mathematics 140.102395 (2022).
[87] J. E. Paguyo. “Convergence rates of limit theorems in random chord diagrams”. arXiv:2104.01134. 2021.
[88] J. E. Paguyo. “Cycle structure of random parking functions”. Advances in Applied Mathematics 144.102458 (2023).
[89] J. E. Paguyo. “Fixed points, descents, and inversions in parabolic double cosets of the symmetric group”. arXiv:2112.07728. 2021.
[90] J. Pike. “Convergence rates for generalized descents”. Electronic Journal of Combinatorics 18.P236 (2011).
[91] J. Pitman. “Forest volume decompositions and Abel-Cayley-Hurwitz multinomial expansions”. Journal of Combinatorial Theory, Series A 98 (2002), pp. 175–191.
[92] J. Pitman and R. P. Stanley. “A polytope related to empirical distributions, plane trees, parking functions, and the associahedron”. Discrete and Computational Geometry 27 (2002), pp. 603–634.
[93] A. Postnikov and B. Shapiro. “Trees, parking functions, syzygies, and deformations of monomial ideals”. Transactions of the American Mathematical Society 356.8 (2004), pp. 3109–3142.
[94] J. Riordan. “The distribution of crossings of chords joining pairs of 2n points on a circle”. Mathematics of Computation 29.129 (1975), pp. 215–222.
[95] N. Ross. “Fundamentals of Stein's method”. Probability Surveys 8 (2011), pp. 210–293.
[96] L. A. Shepp and S. P. Lloyd. “Ordered cycle lengths in a random permutation”. Transactions of the American Mathematical Society 121 (1966), pp. 340–357.
[97] R. P. Stanley. “Hyperplane arrangements, parking functions and tree inversions”. Mathematical Essays in Honor of Gian-Carlo Rota. Vol. 161. Progress in Mathematics. Birkhäuser Boston, 1998, pp. 359–375.
[98] R. P. Stanley. “Parking functions and noncrossing partitions”. Electronic Journal of Combinatorics 4.2 (1997), R20.
[99] C. Stein. “A bound for the error in the normal approximation to the distribution of a sum of dependent random variables”. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability. Vol. II: Probability Theory. Univ. California Press, 1972, pp. 583–602.
[100] C. Stein. Approximate Computation of Expectations. Vol. 7. Institute of Mathematical Statistics Lecture Notes - Monograph Series. Institute of Mathematical Statistics, 1986.
[101] P. R. Stein and C. J. Everett. “On a class of linked diagrams, II. Asymptotics”. Discrete Mathematics 21 (1978), pp. 309–318.
[102] J. Touchard. “Sur un problème de configurations et sur les fractions continues”. Canadian Journal of Mathematics 4 (1952), pp. 2–25.
[103] C. H. Yan. “Generalized parking functions, tree inversions, and multicolored graphs”. Advances in Applied Mathematics 27 (2001), pp. 641–670.
[104] C. H. Yan. “Parking functions”. In: Handbook of Enumerative Combinatorics. Boca Raton: CRC Press, 2015, pp. 835–893.
[105] Y. Yao and D. Zeilberger. “An experimental mathematics approach to the area statistic of parking functions”. Mathematical Intelligencer 41.2 (2019), pp. 1–8.
[106] M. Yin. “Parking functions, multi-shuffle, and asymptotic phenomena”. 33rd International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms. 2022.
[107] M. Yin. “Parking functions: Interdisciplinary connections”. Advances in Applied Probability, to appear (2021).
Abstract
In this thesis, we study the asymptotic distribution of statistics on three random discrete structures: chord diagrams, permutations from fixed parabolic double cosets of the symmetric group, and parking functions. Using Stein's method, we obtain Poisson and central limit theorems for various statistics on these structures.
The first problem concerns the asymptotic distributions of the number of crossings and the number of simple chords in a random chord diagram. Using size-bias coupling and Stein's method, we obtain bounds on the Kolmogorov distance between the distribution of the number of crossings and a standard normal random variable, and on the total variation distance between the distribution of the number of simple chords and a Poisson random variable. As an application, we provide explicit error bounds on the number of chord diagrams containing no simple chords.
The second problem concerns fixed points and generalized descents, which include descents and inversions, on permutations chosen uniformly at random from fixed parabolic double cosets of the symmetric group. We show that the distribution of fixed points is asymptotically Poisson and establish a central limit theorem for the distribution of generalized descents for certain regimes. Our proofs use Stein's method with size-bias coupling and dependency graphs. As applications of our size-bias coupling and dependency graph constructions, we also obtain concentration of measure results.
The third problem concerns the cycle structure of uniformly random parking functions. Using the combinatorics of parking completions, we compute the asymptotic expected value of the number of cycles of any fixed length. We obtain an upper bound on the total variation distance between the joint distribution of cycle counts and independent Poisson random variables using a multivariate version of Stein's method via exchangeable pairs. Under a mild condition, we show that the process of cycle counts converges in distribution to a process of independent Poisson random variables.
Conceptually similar
Stein's method via approximate zero biasing and positive association with applications to combinatorial central limit theorem and statistical physics
Stein's method and its applications in strong embeddings and Dickman approximations
Cycle structures of permutations with restricted positions
Concentration inequalities with bounded couplings
Stein couplings for Berry-Esseen bounds and concentration inequalities
Finite sample bounds in group sequential analysis via Stein's method
CLT, LDP and incomplete gamma functions
Eigenfunctions for random walks on hyperplane arrangements