Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Applications of topological data analysis to operational research problems
(USC Thesis Other)
Applications of topological data analysis to operational research problems
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Applications of Topological Data Analysis to Operational Research Problems
by
Shannon Sweitzer-Siojo
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(INDUSTRIAL AND SYSTEMS ENGINEERING)
May 2023
Copyright 2023 Shannon Sweitzer-Siojo
Acknowledgements
I would first like to thank my advisor, Professor John Gunnar Carlsson, for his guidance and encouragement
during the last five years. I am so grateful that he believed in me and saw in me potential that I didn’t know
I had. His passion and excitement for research inspired me, and his constant support made it possible for
me to find a fulfilling work-life balance during my PhD years. I could not have done any of this without
his mentorship and kindness.
I also thank my qualifying exam and dissertation committee members Professor Satish Kumar Thitta-
maranahalli, Professor Meisam Razaviyayn, Professor Andres Gomez Escobar, and Professor Sheldon Ross
for their valuable insights and suggestions. I would like to thank committee member and collaborator
Professor Erik Carlsson for his mentorship and contributions.
Thank you also to all of my amazing labmates, especially Bo Jones and Javad Azizi. It was a privilege
to work alongside you. I am also so grateful to Anthony Nguyen, Nathan Decker, and Grace Owh for their
friendship in these last years, and to Shelly Lewis for her support in matters of academic guidance.
I would like to thank my parents and family for their unconditional love and support throughout my
graduate studies. I am also so grateful to my husband, Sebastian, for walking with me through this journey.
His unwavering love helped me to persevere. I couldn’t ask for a better partner.
Lastly, I thank the Lord for His faithfulness. To Him be the glory.
ii
TableofContents
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation and Structure of This Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Topological Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Homology and Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Data Visualization: Illustrating the Use of Filtered Simplicial
Complexes and Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Chapter 2: TDA and Local Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1 A pipeline for local search with TDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 The traveling salesman problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Computational experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4 A toy example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 K-means and higher order Betti numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Chapter 3: TDA and Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 A New Filtered Complex: Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Construction of the Filtered Complex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3 Random Walks on Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4 Performance on Benchmark Data Sets (TADASets) . . . . . . . . . . . . . . . . . . . . . . . 38
3.5 The Ising Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.1 Ising on a Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.5.2 Ising on a Circle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Social Dynamics and the Sznajd Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.7 U.S. Airline Traffic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.8 Molecular conformation spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Chapter 4: Future Research Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1 Ising on a Grid, Landmark Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Opinion and Social Dynamics Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3 U.S. Freight Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
iii
Chapter 5: Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
iv
ListofFigures
1.1 Essential features that make topology ideal for data applications . . . . . . . . . . . . . . . 4
1.2 A computational topology analysis of point cloud data from the human hand from [19] . . 5
1.3 A 2-dimensional "blob" hasβ 0
=1 because it consists of one component, andβ k
=0 for all
k≥ 1 (3a). The 2-dimensional blob with holes in (3b) still hasβ 0
=1, butβ 1
=3 because
there are three holes present. The (hollow) sphere in (3c) hasβ 0
=1,β 1
=0,β 2
=1, and
β k
= 0 fork≥ 3; theβ 2
= 1 is due to the interior of the sphere. Finally, the torus in (3d)
hasβ 0
= 1,β 1
= 2,β 2
= 1, andβ k
= 0 fork ≥ 3; theβ 1
= 2 is due to the two holes
indicated with thickened lines (one for the "donut" hole and one that wraps around the
thickness of the torus). Theβ 2
=1 is due to the interior of the torus. . . . . . . . . . . . . 6
1.4 Evolution ofVR(X) for increasing values oft. Intervals shown:0≤ t<
√
2,
√
2≤ t<2,
2≤ t<
√
8,
√
8≤ t. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5 Zeroth and first Betti barcodes for the house point cloud under Vietoris-Rips complex and
filtration values 0≤ t≤ 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6 From visual inspection, we can see that the dataset in Figure 1.6a has two clusters and
three “holes” (loops). In Figures 1.6b-1.6e, we build the Čech complex for this dataset by
surrounding the points with balls of increasing radiusr. We are interested in the Betti
numbers that result from taking the union of these balls, which vary withr. Figure 1.6b
shows 10 connected components (β 0
= 10) and no holes (β 1
= 0), Figure 1.6c shows2
connected components (β 0
= 2) and 3 holes (β 1
= 3), Figure 1.6d shows 2 connected
components and no holes (β 0
= 2, β 1
= 0), and Figure 1.6e shows one connected
component and no holes (β 0
= 1, β 1
= 0). The evolution of these values ofβ 0
andβ 1
is captured in the barcode diagram in Figure 1.6f; the upper plot shows the connected
componentsβ 0
asr varies, and the lower plot shows the distinct holesβ 1
asr varies. This
barcode plot is consistent with our visual inspection in the following sense: the majority
of the upper plot indicates that there are two connected components (i.e. two clusters)
that “persist” for a wide range ofr (in particular,0.50≤ r ≤ 1.41), and there are three
loops (i.e. three holes) that “persist” for a significant range of r as well (in particular,
0.54 ≤ r ≤ 0.87). There are also two short-lived cycles near r = 0.37 and r = 1.52,
which would likely be discarded as “noise”. . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7 Evolution of custom filtered complex on a set of time periods . . . . . . . . . . . . . . . . . 13
v
2.1 A 2-opt exchange on a tour of8 points. The edges(3,7) and(4,5) in (2.1a) are replaced
with(3,4) and(7,5) in (2.1b). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Theβ 0
persistence barcodes for a set of1000 local minimizing tours from a TSP instance
on100 points. One can see that the two longest barcodes are more evident in the witness
complex. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3 Local minimizing tours from a TSP instance on 100 points . . . . . . . . . . . . . . . . . . 19
2.4 Theβ 0
persistence barcode for a tetrahedral clustered dataset. The two longest barcodes
suggest two main clusters within the set of local solutions. . . . . . . . . . . . . . . . . . . 22
2.5 Local minimizing tours from a TSP instance on data clustered in a tetrahedral shape . . . . 23
2.6 A solution to the k-means problem for points in the unit square withk =2 . . . . . . . . . 23
2.7 A family of k-means solutions fork =2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 The longestβ 1
interval captures the cyclic nature of the set of local solutions. . . . . . . . 25
2.9 The longestβ 0
intervals capture the four clusters of centroids corresponding to the four
edges of the square sample space. Despite the very loosely cyclic shape, theβ 1
barcodes
still capture the cycle within the data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1 Lotka-Volterra equations phenomena [21] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Transition structure of the CTMC model for predator-prey interactions as defined by [21]. 28
3.3 A simulation of 500 steps from the Markov chain defined by Equations (3)-(17) . . . . . . . 30
3.4 Level sets for a functiong and a simplicial complex built on ag . . . . . . . . . . . . . . . 31
3.5 β 0
andβ 1
barcodes for a stochastic model of predator-prey interactions obtained from the
Markov chain complex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Three degree-4 graphs on 96 nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.7 Three arrangement variations on the same cuboid graph . . . . . . . . . . . . . . . . . . . 36
3.8 Persistence barcodes for three degree-4 graphs obtained from the new filtered complex. . . 37
3.9 The longestβ 1
intervals correspond to the largest faces, which are detected as cycles. . . . 38
3.10 Theβ 1
barcodes capture the double-loop structure of the infinity sign point cloud. . . . . . 39
3.11 An Ising spin system on a lattice graph. [28] . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.12 Ising model on a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
vi
3.13 For the Ising model on a line, the lowest energy (most probable) states form a cycle. For
example, starting with the state at the top of the circle where all spins are+1, either the
node at the right or left end of the line has a chance of changing spins. Once this happens,
the changed exerts influence on its neighbor to also change spin with a certain probability,
and the effect cascades until all nodes have changed their spin to − 1. . . . . . . . . . . . . 42
3.14 Ising model on a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.15 For the Ising model on a circle, the lowest energy (most probable) states form a spherical
space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.16 In the 1D Sznajd model, the agreement of two agents causes their immediate neighbors
to agree with them. This is referred to as social validation. On the other hand, the
disagreement of two agents causes discord to spread to their neighbors. [22] . . . . . . . . 44
3.17 Persistence barcodes for the Sznajd model on a line . . . . . . . . . . . . . . . . . . . . . . 46
3.18 β 0
andβ 1
persistence barcodes for airline data from 1987 . . . . . . . . . . . . . . . . . . . 48
3.19 A representative cycle for the longestβ 1
barcode . . . . . . . . . . . . . . . . . . . . . . . 48
3.20 β 0
andβ 1
persistence barcodes for airline data from 2008 . . . . . . . . . . . . . . . . . . . 49
3.21 Representative cycles for the three longestβ 1
barcodes: ABQ-AUS-DAL-DFW-PHX-ABQ
ABQ-AUS-DAL-DFW-DEN-SJT-LAX-PHX-ABQ ABQ-AUS-DAL-DFW-IAH-ATL-ORD-
MSP-DEN-SJT-LAX-PHX-ABQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.22 The subcomplexes composed of the nodes and valid simplices from the representative
cycles of each of the longest β 1
barcodes. Three different 3D embeddings are shown
(column-wise), and empty 2-simplices that may contribute to the underlying loop structure
are highlighted in red. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.23 Airports from the top 50 contributors to ABQ, DEN, and DFW were considered. . . . . . . 51
3.24 Airports from the top 50 contributors to ABQ, PHX, and DFW were considered. . . . . . . 52
3.25 A Google map showing the highlighted airports from Figure 3.23 and the three airports
that compose the empty 2-simplex (in yellow) . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.26 A Google map showing the highlighted airports from Figure 3.24 . . . . . . . . . . . . . . 53
3.27 Persistence barcodes for a Markov random walk on the nearest-neighbors graph of the
conformation space of a pentane molecule. The significant β 1
component suggests a
rotational degree of freedom. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1 Two different schematic representations for a 2D Sznajd model proposed by Stauffer and
Galam [22] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
vii
Abstract
Topological data analysis (TDA) is a powerful tool for understanding the underlying shape and structure
of data which may not be recoverable via other, more traditional methods. We present applications of
TDA to operational research problems, beginning with discrete optimization problems. Most notably, we
demonstrate improved performance of the 2-opt local search procedure by applying a standard Vietoris-
Rips construction which suggests a pipeline for TDA-guided local search. We then introduce a new filtered
simplicial complex construction associated to a stochastic matrix and steady state vector. This allows the
study of Markov chains derived from a variety of applications from a topological perspective. As a result,
we recover coarse features of the Markov chains that were not previously accessible.
viii
Chapter1
Introduction
1.1 MotivationandStructureofThisWork
The purpose of this dissertation is to apply techniques from topological data analysis (TDA) to problems
in operations research. TDA is a relatively nascent research area that allows one to describe geometric
properties of a data set, such as connectivity, existence of holes, or clustering, in a way that imposes mini-
mal assumptions on parametric structures like coordinate systems or forms of probability distributions. In
recent years, TDA has been successfully applied to many different scientific domains, such as time series
analysis, text mining, cancer biology, and materials science. To the best of our knowledge, this dissertation
is the first to use TDA in the area of operations research. The basic principle that we exploit is that TDA
excels at identifying coarse features in datasets using a technique called persistence, and is not sensitive
to more localized phenomena. This enables us to use its strengths in unique ways, such as improving the
performance of local search methods and identifying coarse features in Markov chains. As a side conse-
quence, we also show how our approach actually improves on existing practices in TDA, by effectively
reducing sensitivity to outliers in noisy datasets.
The structure of this dissertation is as follows: the remainder of this section introduces the founda-
tional techniques of TDA and persistent homology and illustrates its use via several examples. In Chapter
2, we apply TDA to a data set whose points are local minimizers of the travelling salesman problem (TSP).
1
We devise a scheme for incorporating the persistent zero-th Betti number (i.e. connected components) of
the Vietoris-Rips construction into a local search procedure, and show that it leads to improved perfor-
mance. We then outline a general basic pipeline for applying persistent homology to (discrete) optimization
problems.
Pivoting our focus in Chapter 3, we define a new simplicial complex construction associated to a
stochastic matrix and steady state vector. The persistence function is defined axiomatically in such a
way that leverages the available structure of a random process. We investigate the complex’s efficacy in
recovering coarse features of Markov chains on examples of varying complexity, beginning with simple
constructions with verifiable properties and progressing to some intuitive benchmarks, models of natural
phenomena, and ”real-world” data sets. Our first series of examples, several random walks on graphs of
varying shape, validates the new construction. We then consider applications to population dynamics,
spin systems, social models, chemistry, and transportation networks.
Finally, we present some potential avenues of research that continue in the same vein of this work in
Chapter 4.
1.2 RelatedWork
To the best of our knowledge, the fields of optimization and TDA have had very little cross-pollination.
The best example to our knowledge is [35], which notes that certain procedures in TDA have natural
interpretations as optimization problems:
Finding good representatives for qualitative features often turns out to be a case of search-
ing within such a class for an optimal member.... Another area where homology shows up as a
tool for optimization is inevaluating coverage for sensor agents – such as ensembles of robots,
or antenna configurations.
2
The paper [10] does not apply TDA per se, but uses sheave theory and Poincaré duality to generalize the
famous max flow-min cut theorem; their generalization lends itself to applications in multi-commodity
flows, among others. Very recently, the paper [3] gives an application of persistent homology to the image
correspondence problem, which is used to produce 3D reconstructions of scenery from two or more cam-
eras. They produce a complex whose nontrivial homology groups correspond to recognizable anomalies
in image pairs, such as repeated patterns, which contribute to nonconvexity of the objective function.
Regarding Markov chains and TDA, the two methodologies have been applied separately as in [34],
but to the best of our knowledge there are no filtered simplicial complex constructions tailored to leverage
the available structure of a simulated random process. The paper [16] applies an edge-value clique (EVC)
filtration to convection cycles for an irreversible Markov chain, but the filtered complex construction itself
is applicable to any undirected graph. The construction we introduce in Chapter 3 is a continuation of
the work from [4], where a similar construction is applied to a random walk in a Jeu-de-Taquin game and
random walks on a finite metric space, and is shown to have advantages over standard persistent homology
constructions.
1.3 TopologicalDataAnalysis
Topology is the branch of mathematics that studies shapes and spacial relations, ie qualitative geometric
information. Its application to the analysis of high-dimensional data sets is calledTopologicalDataAnalysis
(TDA). Topology has several features that make it ideal for applications to data. [19] First, topological
techniques are "coordinate free," meaning the geometric properties being studied are intrinsic and do not
depend on the choice of coordinates. Second, topology studies properties that are invariant under small
deformations, making it possible to pick out the "shape" of objects despite variation and deformation.
Thirdly, extensions of topological methods like homology allow us to construct summaries of the invariants
of a space over a range of parameter values. This is useful because the results of point cloud techniques
3
(a) (b)
(c) (d)
Figure 1.1: Essential features that make topology ideal for data applications
often depend on the choice of parameter, so a summary over a changing parameter value is oftentimes more
valuable. These properties, which are illustrated in Figure 1.1, make topology an effective lens through
which to view data.
In recent years, TDA has been successfully applied to many different scientific domains such as time
series analysis [11], image processing [6, 20, 24], text mining [37], cancer biology [17], and materials sci-
ence [13]. Figure 1.2 shows an example of a TDA pipeline applied to point cloud data from a human hand,
4
in which coarse features - the finger and thumbs - are recognized as distinct entities. In the following sec-
tions of this proposal, we will utilize TDA methodology in a novel application to optimization, exploiting
the ability of TDA to identify coarse features in datasets via a technique called persistence.
Figure 1.2: A computational topology analysis of point cloud data from the human hand from [19]
1.4 HomologyandPersistence
When considering a topological space, we often wish to characterize its intrinsic properties. Specifically,
we want information regarding its connected components, loops, and higher dimensional analogues. Al-
gebraic topology formalizes this notion of connectivity information via the homotopy group, the set of
equivalence classes of loops under an equivalence relation which encodes the "sameness" or essential dif-
ference of loops. Unfortunately, the homotopy group of a space is typically difficult to compute. However,
a more computable extension called the homology group exists. From the homology group of a space we
can derive a vector of integers called Betti numbers, where the k-th Betti number counts the number of
equivalence classes ofk-dimensional surfaces in the space under the extended equivalence relation. That
is, for a space X over an algebraic field F and for any nonnegative integer k, β k
is the (vector space)
dimension of the kth homology group H
k
(X,F). Alternatively, in more algebraic terms, β k
is the rank
of the abelian groupH
k
(X). Informally, the Betti numbersβ k
count the number ofk-dimensional holes,
thus providing a compact summary of spatial information. In particular, the lower dimensional Betti num-
bers have natural, visual interpretations. For example, the zeroth Betti number β 0
counts the number
of connected components, the first Betti number β 1
counts the number of 1-dimensional holes, or loops,
5
and the second Betti numberβ 2
counts the number of voids of a shape. Figure 1.3 shows a few examples
of some shapes and their corresponding Betti numbers. While homology groups apply to all topological
Figure 1.3: A 2-dimensional "blob" has β 0
= 1 because it consists of one component, and β k
= 0 for all
k ≥ 1 (3a). The 2-dimensional blob with holes in (3b) still has β 0
= 1, but β 1
= 3 because there are
three holes present. The (hollow) sphere in (3c) hasβ 0
= 1,β 1
= 0,β 2
= 1, andβ k
= 0 fork ≥ 3; the
β 2
= 1 is due to the interior of the sphere. Finally, the torus in (3d) has β 0
= 1,β 1
= 2,β 2
= 1, and
β k
= 0 fork ≥ 3; theβ 1
= 2 is due to the two holes indicated with thickened lines (one for the "donut"
hole and one that wraps around the thickness of the torus). Theβ 2
=1 is due to the interior of the torus.
spaces and are more computable than homotopy groups, direct computation is not feasible for most spaces.
[5] However, we can directly compute the homology group of an approximation of the underlying space
given the right structure. In particular, for simplicial complexes, homology is algorithmically computable.
A simplical complex expresses a space as a union of points, line segments, triangles, tetrahedrons, and
higher dimensional analogues. Furthermore, under certain conditions, namely the "nerve theorem" of [2],
we can guarantee that the simplicial complex approximation of the space is homotopy equivalent to the
underlying space. The nerve theorem is as follows:
First, for a topological space X with covering U = {U
α }
α ∈A
, we define the nerve of U to be the
abstract simplicial complex with vertex set A such that {α 0
,...,α k
} spans a k-simplex if and only if
U
α 0
∩···∩ U
α k
̸= ∅. Denote the nerve ofU by N(U). The nerve theorem states that if the covering
U consists of open sets and is numerable, and furthermore if for all∅ ̸= S ⊆ A we have that∩
s∈S
U
s
is
either contractible or empty, thenN(U) is homotopy equivalent toX. Thus, the task of finding abstract
6
simplicial complex approximations that are homotopy equivalent to the underlying topological space re-
duces to finding coverings U which satisfy the nerve theorem. We now introduce several natural covering
constructions.
First, consider the case where X is a metric space equipped with metric d, and define B
ϵ (x) for a
point x ∈ X to be the ball of radius ϵ centered at x. That is, B
ϵ (x) = {y ∈ X | d(x,y) ≤ ϵ }. Let
U = {B
ϵ (x)}
x∈X
be our covering of choice for some ϵ . The nerve ofU, which we denote by
˘ C(X,ϵ ),
satisfies the nerve theorem and is called the
˘
Cechcomplex ofX associated toϵ . For this complex, using the
above definition for the nerve, the vertex set is the entire space X and{x
0
,...,x
k
} spans ak-simplex if
and only if the balls of radiusϵ centered at each of the pointsx
0
,...,x
k
have nonempty intersection. We
will use a variant of the
˘
Cech complex called the Vietoris-Rips complex in the next section.
As an alternative to the
˘
Cech complex, consider the following construction. LetL⊆ X be a subset of
X. Forλ ∈L, define the Voronoi cell associated toλ asV
λ ={x∈X|d(x,λ )≤ d(x,λ ′
)} for allλ ′
∈L.
That is,V
λ is the set of points inX that are closer toλ than they are to any other point inL. The family of
Voronoi cells{V
λ }
λ ∈L
form a covering forX that satisfies the nerve theorem, and we refer to the nerve
of this covering as theDelauneycomplex associated toL. For most of our uses, however, we choose to use
a Vietoris-Rips complex over the Delauney complex because of its suitability to finite metric spaces.
There are many other choices of "nice" coverings that generate simplicial complex constructions that
are homotopy equivalent to the underlying space. One such construction, the Vietoris-Rips complex, will
be explained in detail via example in the next section. Furthermore, these simplicial complexes are attached
to a parameter, or filtration value ( ϵ in the above example), which can be varied to produce a summary of
spatial information over a range of values. This is especially useful in the presence of noise. Say that we
have a noisy sample of points from a probability distribution near a spaceX. While in general we may not
be able to infer the Betti number ofX from the the sample, we can obtain the desired spatial information
by varying the parameter value of the associated complex and looking for features (Betti intervals) that
7
persist for substantial filtration value ranges. This notion of persistence is illustrated in the following
example.
1.5 DataVisualization: IllustratingtheUseofFilteredSimplicial
ComplexesandPersistence
Rather than computing Betti numbers at a single filtration value of the complex, we compute a summary
of the Betti numbers over a range of filtration values to examine features in the point cloud which "persist."
An example from [1] is explained in detail below.
We begin with a point cloudX ={(− 1,0),(1,0),(− 1,2),(1,2),(0,3)} (Figure 1.4) These points form
a house-like shape inR
2
. We can construct aVietoris-Rips complexVR(X,t) with filtration value t in the
following way: take the setX as the vertex set for the complex, and include the k-simplex{x
0
,x
1
,...,x
k
}
if and only if d(x
i
,x
j
) ≤ t for all 0 ≤ i,j ≤ k. We then allow the filtration value t to vary from 0 to a
maximum valuet
max
= 4 and observe how the complex changes in terms of its Betti numbers. As long
as the maximum filtration value is larger than the diameter of X, all edges will eventually be included.
We will illustrate this process below, using balls of radiust centered at each vertex point. An edge[a,b] is
included whenb∈B
t
(a) anda∈B
t
(b), whereB
t
(x) is the ball of radiust centered at pointx.
First, analyzingβ 0
, we begin with five connected components at t = 0 (each point is its own compo-
nent). For filtration values 0≤ t <
√
2, none of the balls intersect, and we maintain five connected com-
ponents. At filtration value t =
√
2, edges[(− 1,2),(0,3)] and[(0,3),(1,2)] are included in the simplex,
reducing the number of connected components to three. This structure persists untilt = 2, where edges
[(− 1,2),(− 1,0)],[(− 1,0),(1,0)] and[(1,0),(1,2)] are included. This reduces the number of connected
components to one. Similarly forβ 1
, there are no1-dimensional holes until filtration value t=2, at which
point the hollow square appears. This hole is filled at t =
√
8, at which point the edges [(− 1,2),(1,0)]
8
and[(− 1,0),(1,2)] are added to the complex and the2-dimensional simplices that make up the square are
included.
Figure 1.4: Evolution ofVR(X) for increasing values oft. Intervals shown: 0 ≤ t <
√
2,
√
2 ≤ t < 2,
2≤ t<
√
8,
√
8≤ t.
The relationship between β 0
,β 1
and t can be summarized in Betti barcodes (Figure 1.5, which show
the intervals of t over which each connected component (first-dimensional hole) exists. These barcodes
provide a concise summary of the set’s key geometric features, including those that persist over large
intervals of filtration values and those that appear and "die" over comparatively shorter intervals. When
applying this method to large point clouds, these short-interval features often correspond to noise or
inadequate sampling. [5]
In the above example, we used a Vietoris-Rips complex, which takes the whole point cloud X as its
vertex set. For large point clouds, this is computationally expensive, and we can in fact approximate the
space using a smaller complex that takes a set of representative points, or landmark points, as its vertex
set. For every pointx∈X, letm
x
denote the distance from this point to the set of landmark pointsL. A
k-simplex{l
0
,...,l
k
} is included in the complex if and only if there is a witness pointx∈ X such that
d(x,l
i
)≤ m
x
+t for all0≤ i≤ k (t here is a filtration value). Such a construction is called a strongwitness
complex. The landmark points can be chosen either randomly or using a sequential maxmin procedure as
follows: choose the first landmark point l
1
randomly. If the first i− 1 landmark points l
1
,l
2
,...,l
i− 1
9
Figure 1.5: Zeroth and first Betti barcodes for the house point cloud under Vietoris-Rips complex and
filtration values 0≤ t≤ 4
have been chosen inductively, the ith landmark point l
i
is chosen such that l
i
maximizes the function
f(z)=min{d(l
i
,l
1
),d(l
i
,l
2
),...,d(l
i
,l
i− 1
)}. Generally, the maxmin approach gives a landmark set that
is more evenly distributed, but includes more outliers. [31]
Figures 1.6a-1.6e demonstrate the evolution of a
˘
Cech complex for a given point cloud, as well as the
accompanying persistence barcodes. Recall that for this complex construction, ak-simplex{l
0
,...,l
k
} is
included in the complex
˘ C(X,t) if and only ifB
t
(l
0
)∩B
t
(l
1
)∩···∩ B
t
(l
k
)̸=ϕ , whereB
t
(v) is the ball
of radiust centered atv. Note that there are many ways to assign a filtration value to a simplex. The
˘
Cech
complex uses the radius of the balls centered at the points of the point cloud, while for the Vietoris-Rips
complex the filtration value is the maximum pairwise distance between points. However, there are other
possibilities. One may define the filtration value in terms of a cost function in optimization, by filtering
a set of random variables by probability density, or by using other statistical quantities. For a practical,
non-pointcloud example, consider a set of historical time periods, each given as an interval [a
i
,b
i
]. We
can define a filtered complex on this set by saying that intervals [a
0
,b
0
],...,[a
k
,b
k
] form ak-simplex if
10
-2 0 2 4 6 8
-5
-4
-3
-2
-1
0
1
2
3
(a)
-2 0 2 4 6 8
-5
-4
-3
-2
-1
0
1
2
3
(b)r = 0.30
-2 0 2 4 6 8
-5
-4
-3
-2
-1
0
1
2
3
(c)r = 0.44
-2 0 2 4 6 8
-5
-4
-3
-2
-1
0
1
2
3
(d)r = 0.56
-2 0 2 4 6 8
-5
-4
-3
-2
-1
0
1
2
3
(e)r = 0.86
0 0.25 0.5 0.75 1 1.25 1.5 1.75
r
0
0 0.25 0.5 0.75 1 1.25 1.5 1.75
r
1
(f)
Figure 1.6: From visual inspection, we can see that the dataset in Figure 1.6a has two clusters and three “holes” (loops). In Figures 1.6b-1.6e, we
build the Čech complex for this dataset by surrounding the points with balls of increasing radius r. We are interested in the Betti numbers that
result from taking the union of these balls, which vary withr. Figure 1.6b shows 10 connected components (β 0
=10) and no holes (β 1
=0), Figure
1.6c shows 2 connected components (β 0
= 2) and 3 holes (β 1
= 3), Figure 1.6d shows 2 connected components and no holes (β 0
= 2, β 1
= 0),
and Figure 1.6e shows one connected component and no holes (β 0
= 1, β 1
= 0). The evolution of these values of β 0
and β 1
is captured in the
barcode diagram in Figure 1.6f; the upper plot shows the connected componentsβ 0
asr varies, and the lower plot shows the distinct holesβ 1
asr
varies. This barcode plot is consistent with our visual inspection in the following sense: the majority of the upper plot indicates that there are two
connected components (i.e. two clusters) that “persist” for a wide range ofr (in particular,0.50≤ r≤ 1.41), and there are three loops (i.e. three
holes) that “persist” for a significant range of r as well (in particular, 0.54≤ r ≤ 0.87). There are also two short-lived cycles nearr = 0.37 and
r =1.52, which would likely be discarded as “noise”.
11
they overlap for more than one year, and assign the simplex the persistence value of the first year that
all of the intervals have in common. We applied this complex construction to the set of intervals shown
in Figure 1.7. The time periods are the Bronze Age (B), Iron Age (I), Shang Dynasty (S), Zhou Dynasty
(Z), Egyptian New Kingdom (NK), Persian Empire (P), and Scandinavian Viking Age (V). Figure 1.7 also
shows the evolution of the complex from persistence value− 3300 to 793, where negative years simply
represent BC years. At year− 3300, only the Bronze Age 0-simplex exists. The next 0-simplex and the
first 1-simplex {B,S} appears at year− 1600, which is the first year shared by the Bronze Age and Shang
Dynasty. At− 1550, the Egyptian New Kingdom 0-simplex appears, as well as the simplices{S,NK} and
{B,S,NK}, reflecting the mutual overlap of these time periods. This evolution of the complex continues
until year793, where the Viking Age 0-simplex appears. However, since it shares no years with any of the
other time periods, it forms its own connected components. Theβ 0
persistence barcodes for this complex
would therefore show two longβ 0
barcodes which reflect these two connected components.
12
Figure 1.7: Evolution of custom filtered complex on a set of time periods
13
Chapter2
TDAandLocalSearch
A 2016 paper proposed and implemented a machine learning method for learning a branch-and-bound
variable selection strategy directly from data. [15] Researchers first constructed a training data set by
running the algorithm with Strong Branching as the variable selection strategy. At each node, a set of
features was collected, and variables were labelled with a binary labelling scheme that assigned a 1 to a
variable if its (SB) score was within a predetermined percentage of the maximum score. Alearning-to-rank
approach was then taken, in which a linear function of the features that minimized a loss function over
the training data was learned. Using the learned feature function as the branching strategy, researchers
observed fewer unsolved instances compared to other branching strategies.
Inspired by this methodology, we apply topological data analysis to local search, using persistence
barcodes to identify features that are likely to be present in good local minimizers and motivate future
choices during the search procedure. This also gives way to a general local search pipeline with TDA. To the
best of our knowledge, the fields of optimization and TDA have had very little cross-pollination. The best
example to our knowledge is [36], which notes that certain procedures in TDA have natural interpretations
as optimization problems. The paper [10] does not apply TDA per se, but uses sheave theory and Poincaré
duality to generalize the famousmaxflow-mincut theorem; their generalization lends itself to applications
in multi-commodity flows, among others. Very recently, the paper [3] gives an application of persistent
14
homology to the image correspondence problem, which is used to produce 3D reconstructions of scenery
from two or more cameras. They produce a complex whose nontrivial homology groups correspond to
recognizable anomalies in image pairs, such as repeated patterns, which contribute to nonconvexity of the
objective function.
In order to compare our approach with related work, we briefly describe other metaheuristics that arise
commonly in optimization literature:
Guidedlocalsearch augments the objective function with a penalty term that causes the local search
procedure to avoid local minimizers that it has already encountered; the idea is to make previous
local minimizers more costly than the surrounding search space. Given an objective functionf(x)
and a collection offeatures indexed by a termi, guided local search seeks to minimize the augmented
objective function
g(x):=f(x)+λ X
i
I
i
(x)p
i
;
hereλ is an intensity parameter,I
i
(x) is a binary function that indicates whether or not featurei is
present in candidate solutionx, andp
i
is a penalty parameter for that feature. A simple example of
a feature would be “I
i
(x)=1 ifx
i
=1”, for instance. The penalty parametersp
i
are initially all set
to zero, and increase as the metaheuristic finds more and more local minimizers.
Tabusearch avoids previous local minimizers by maintaining a tabu list of local search moves that are
prohibited. In general, a local search move is prohibited if it takes the search back to a state that it
has recently visited.
Simulatedannealing decides probabilistically whether to move from a state x to a new state x
′
, even
ifx
′
has a worse objective value. For example, iff(x
′
) > f(x), one might move fromx tox
′
with
probability ae
b(f(x)− f(x
′
))
, for suitable constants a and b. This moves the search away from poor
local minimizers by allowing the possibilities of climbing out of a basin of attraction.
15
Unlike with these metaheuristics, the goal of the proposed local search with TDA is to identify features
that are present in many disparate local minimizers, and to encourage their presence in subsequent runs,
rather than forbidding certain moves.
2.1 ApipelineforlocalsearchwithTDA
The following is an informal outline for using TDA to guide a local search procedure, for fixed positive
integersM andN and a given optimization problem:
1. PerformM local searches with randomly seeded inputs and record the local minimizers
ˆ x
(1)
,...,ˆ x
(M)
that are obtained.
2. Compute persistence barcodes for data points ˆ x
(1)
,...,ˆ x
(M)
, using (for instance) a Čech complex
or a Vietoris-Rips complex.
3. Determine a representative clustering by looking at the barcodes corresponding toβ 0
, the0-th Betti
number. LetC
1
,...,C
m
denote these clusters.
4. Identify features that are common to the clustersC
1
,...,C
m
.
5. PerformN local searches with randomly seeded inputs, but require that the common features from
step 4 be present.
6. Return the best solution obtained over allM +N iterations.
This description is purposefully vague and we show a concrete example applied to the travelling salesman
problem in the next section.
16
1
3 4
2
8
7 5
6
(a)
1
3 4
2
8
7 5
6
(b)
Figure 2.1: A 2-opt exchange on a tour of8 points. The edges(3,7) and(4,5) in (2.1a) are replaced with
(3,4) and(7,5) in (2.1b).
2.2 Thetravelingsalesmanproblem
Here we describe an implementation and show an example of the algorithm from the previous section.
that applies TDA to the well-known 2-opt heuristic [12] for the travelling salesman problem (TSP) with a
symmetric and complete distance matrixD =[d
ij
]. 2-opt is a simple strategy that works as follows: given
an existing tour of the input points, take2 edges(a,b) and(c,d) from the tour and remove them. Then,
form a new tour by inserting edges(a,d) and(c,b). Update the tour if the objective value decreases, which
occurs if and only ifd
ab
+d
cd
> d
ad
+d
cb
. This is illustrated in Figure 2.1. The search completes when
there are no pairs of edges that result in a cost reduction.
In our experiment, steps 1-6 from Section 2.1 now take the following form:
1. PerformM local search instances with randomly seeded inputs and record the local minimizing tours
ˆ x
(1)
,...,ˆ x
(M)
that are obtained. Each local search instance consists of taking a random permutation
of the input points, and selecting the 2-opt move that reduces the tour cost as much as possible (this
means that we enumerate through all
n
2
possible pairs of edges, wheren is the number of points).
For our example, we collected a set of1000 local minimizing tours on a TSP instance of100 points.
17
2. Compute persistence barcodes for data pointsˆ x
(1)
,...,ˆ x
(M)
using a Čech complex, where the metric
is the Hamming distance: the distance between two tours ˆ x
(i)
,ˆ x
(j)
is
d(ˆ x
(i)
,ˆ x
(j)
)=#(edges belonging to only one of ˆ x
(i)
and ˆ x
(j)
)
In our example, we construct a Vietoris-Rips complexVR(X,ϵ ) on the metric spaceX of the local
minimizing tours equipped with the Hamming distance as the metric. This simplicial complex is a
variant of the Čech complex where{ˆ x
(1)
,...,ˆ x
(k)
} spans ak-simplex if and only ifd(ˆ x
(i)
,ˆ x
(j)
)≤ ϵ for all 0 ≤ i,j ≤ k. The β 0
persistence barcodes resulting from this complex is shown in Figure
2.2a. The figure indicates that there are two persistent (long) barcodes. Note that the distance matrix
D = [d
ij
] was scaled by the dividing the matrix element-wise by the largest distance so that the
maximum persistence value will be less than 1.
3. Let C
1
,...,C
m
be the m longest barcodes corresponding to the 0-th Betti number. While m can
be determined from the β 0
persistence barcodes generated by the Čech/Vietoris-Rips complex, in
practice it can be difficult to identify m visually due to the large vertex set. In this case, a strong
witness complex may provide a clearer visual aid. Using a strong witness complex can also reduce
the computational expense of the Vietoris-Rips complex, which utilizes all of X as the vertex set.
Given a set of landmark points L ⊆ X, we define for every point ˆ x
(i)
∈ X a distance d
ˆ x
(i) =
min
l∈L
d(ˆ x
(i)
,l), i.e. its minimum distance to the setL. The strong witness complexW(X,L,ϵ ) is
then constructed with vertex set L with the following rule: {l
1
,...,l
k
} spans a k-complex if and
only if there exists an ˆ x
(i)
∈ X such that d(ˆ x
(i)
,l
j
) ≤ d
ˆ x
(i) +ϵ for all 1 ≤ j ≤ k. The landmark
pointsL can be chosen via a variety of methods, including random selection. Here we chose a set
of 50 landmark points using sequential maxmin, a greedy selection process. The β 0
persistence
barcodes for this example are shown in Figure 2.2b. This figure more clearly identifies the presence
18
(a)β 0 persistence barcodes,VR(X,ϵ ) (b)β 0 persistence barcodes,W(X,L,ϵ )
Figure 2.2: The β 0
persistence barcodes for a set of 1000 local minimizing tours from a TSP instance on
100 points. One can see that the two longest barcodes are more evident in the witness complex.
(a) Sample of size 20 from Component 1 (b) Sample of size 20 from Component 2
Figure 2.3: Local minimizing tours from a TSP instance on 100 points
of the two persistent intervals first identified in the Vietoris-Rips complex, and confirms our choice
of them=2 longest barcodes for this step. Furthermore, now that we have detected two connected
components within the setX of local minimizing tours, we can examine a sample of tours from each
of these components. Figure 2.3 shows a sample of 20 local minimizing tours from each of the two
connected components.
19
4. For each edgee, define probability weights
w
e
=
m
Y
i=1
Pr(e appears inX|X is uniformly sampled fromC
i
).
Whenw
e
is high, it means thate appears frequently in all of the clustersC
i
, and is therefore more
likely to be part of the optimal solution.
5. Perform N local searches with randomly seeded inputs, subject to the constraint that all edges e
such thatw
e
≥ α are present (hereα is specified beforehand).
6. Return the best solution obtained over allM +N iterations.
2.3 Computationalexperiments
We implemented the procedure from section 2.2 and applied it to 41 benchmark problem instances from
the TSPLIB library [29]: for efficiency’s sake, we restricted ourselves only to those instances with up to
300 nodes, and set M = N = 500, m = 3, and α = 10
− 3
. Our comparison procedure is as follows:
after computing the first M minimizers, we then compute one additional set ofN minimizers with TDA
information, and another set ofN minimizers without TDA information (that is, they are obtained using
the same procedure as the originalM). Table 2.1 shows the best objective values obtained with and without
TDA. Out of the 41 instances, there are 30 for which local search with TDA is superior, 7 where the two
are tied, and 4 where local search with TDA is inferior to vanilla local search.
2.4 Atoyexample
As a sort of sanity check on the effectiveness of this TDA pipeline for local search, we conducted a similar
experiment on the tetrahedral graph shown in Figure 2.4a. As seen in the image, the graph consists of four
20
Instance name TDA No TDA
a280 2729 2751
att48 33556 33601
berlin52 7545 7545
bier127 119282 119353
ch130 6133 6236
ch150 6705 6790
d198 16010 16010
eil101 659 660
eil51 434 431
eil76 555 558
gil262 2489 2501
gr137 710 711
gr202 498 498
gr229 1682 1683
gr96 517 520
kroA100 21311 21357
kroA150 26991 26991
kroA200 30165 30193
kroB100 22322 22441
kroB150 26716 26582
kroB200 30351 30419
kroC100 20770 20944
kroD100 21538 21559
kroE100 22226 22334
lin105 14383 14496
pr107 44463 44495
pr124 59031 59031
pr136 98280 98698
pr144 58535 58535
pr152 73822 74221
pr226 80883 80729
pr264 50984 50871
pr299 49417 49889
pr76 108305 108305
rat195 2473 2477
rat99 1234 1264
rd100 7953 7979
st70 678 684
ts225 126905 127316
tsp225 4034 4041
u159 42768 42781
Table 2.1: Best objective values obtained on TSPLIB benchmark problem instances, with and without TDA.
A bold and underlined number indicates the superior (i.e. smaller) objective value.
21
(a) A data set of 20 points clustered in a tetrahedral shape,
with 5 points at each "vertex". The intravertex edge
lengths are small compared to the the intervertex edges.
(b)β 0 persistence barcodesVR(X,ϵ )
Figure 2.4: Theβ 0
persistence barcode for a tetrahedral clustered dataset. The two longest barcodes suggest
two main clusters within the set of local solutions.
families of nodes grouped closely together and arranged in a tetrahedral shape. The distances between
the centers of the node families are equal. An obvious solution to TSP on this graph would be a tour that
visits all of the nodes within a family before travelling to the next one. However, the order in which the
families are visited could vary, only inflicting minimal change on the objective value of the tour. Thus,
when the TDA pipeline is applied to a set of local solutions obtained via 2-opt, we expect to see multiple
components in theβ 0
barcodes corresponding to groups opf tours that visit the node families in different
orders. The resulting Betti barcodes are shown in Figure 2.4b. As expected, there are two barcodes that
persist much longer than the rest. Taking a sample from the locally optimally tours from each of these
components shows that they do indeed visit the node families in different orders. The samples are shown
in Figure 2.5.
2.5 K-meansandhigherorderBettinumbers
In the above application of TDA to local search, only the zeroth Betti numbers are used to inform the
search procedure, serving as a method of detecting clusters within the local solution dataset. A natural
22
(a) Sample of size 10 from Component 1 (b) Sample of size 10 from Component 2
Figure 2.5: Local minimizing tours from a TSP instance on data clustered in a tetrahedral shape
Figure 2.6: A solution to the k-means problem for points in the unit square withk =2
question to ask is, "What information can higher order Betti numbers contribute to local search, and when
are they useful?" As a preliminary answer to this question, we offer the following application to the k-
means problem, which aims to identifyk optimal landmark or centroid points such that the in-cluster sum
of squared distances to the centroids are minimized. An example of a solution to the k-means problem is
shown in Figure 2.6.
Consider an instance of k-means on a set of 1000 points sampled uniformly from the unit circle with
k = 2. Following Steps 1-3 from the local search pipeline, we perform M = 1000 local searches with
randomly seeded inputs and record the local minimizers (in this case, a set of centroids for each of the
23
Figure 2.7: A family of k-means solutions fork =2
two clusters). We then compute the persistence barcodes for the dataset of these local minimizers. Theβ 0
andβ 1
barcodes are shown in Figure 2.8b. Inspecting theβ 0
barcodes, we see a single, highly persistent
interval, which indicates a single family of solutions. We can take a sample of these solutions, as shown in
Figure 2.7. While the individual solutions may not provide much insight into the structure of the solution
set, superimposing them as done in Figure 2.8a reveals a loop of solutions centered at the vertex of the unit
circle. Furthermore, this phenomenon is detected by theβ 1
barcodes, which show a single, long interval.
On an interesting side note, we recover a different phenomenon when the 1000 points are sampled from
the unit square instead of the unit circle. The centroids from the local solutions cluster near the edges of
the square, forming four clusters of centroids when the solutions are superimposed. This is seen in Figure
2.9a. This clustering is recovered when we compute the persistence barcodes via a Vietoris-Rips complex,
as shown in Figure 2.9b. Additionally, though the cycle is much more rough in shape, a significant β 1
interval still appears.
In this instance of kmeans, higher order Betti numbers reveal the intrinsic shape of the set of local
solutions. In other case, they may uncover "topological obstructions" to global optimization in the same
way that connected components do.
24
(a) A family of k-means solutions fork = 2, superimposed (b)β 0 andβ 1 persistence barcodesVR(X,t)
Figure 2.8: The longestβ 1
interval captures the cyclic nature of the set of local solutions.
(a) A family of k-means solutions fork = 2, superimposed (b)β 0 andβ 1 persistence barcodesVR(X,t)
Figure 2.9: The longestβ 0
intervals capture the four clusters of centroids corresponding to the four edges
of the square sample space. Despite the very loosely cyclic shape, theβ 1
barcodes still capture the cycle
within the data.
25
Chapter3
TDAandMarkovChains
3.1 ANewFilteredComplex: Motivation
We now introduce a new filtered complex construction associated to a stochastic matrix (that is, a Markov
chain) together with a steady state vector. We then consider several examples of varying complexity. To
the best of our knowledge, no other Markov chain-based filtered complex constructions exist. First, we
begin with some intuition behind the complex.
Markov chains are used to model many different things; one advantage they possess is that one can
use (for example) first transition analysis to derive coarse algebraic features of a model. The best example
of such a feature is the stationary distribution, which roughly speaking provides information about the
importance of the various states, but other features include transient versus recurring states, degree se-
quences, hitting times, and information from first transition analysis. These "features" can aid in making
observations and conclusions about the process. The purpose of this chapter is to use topological data
analysis to identify coarse features in Markov chains that are not accessible via traditional methods.
For example, consider the Lotka-Volterra model, which studies the interactions of a population ofX
prey particles (animals, agents) andY predator particles. We assume the following three possible reaction
events:
26
1. Prey reproduction,X →2X
2. Prey consumption generates a predator,X +Y →2Y
3. Predator death,Y →ϕ Furthermore, each prey reproduces at rateα , prey and predator encounters occur at rateβ , and predators
die off at rate γ . Letting X(t) and Y(t) be the predator and prey populations as functions of time, the
Lotka-Volterra equations are given by
dX(t)
(dt
=αX (t)− βX (t)Y(t) (3.1)
dY(t)
dt
=βX (t)Y(t)− γY (t) (3.2)
Solutions to this set of equations are characterized by boom and bust cycles within the predator and
prey populations, pictured in Figure 3.1a. Intuitively, these capture the phenomenon that when predator
populations are high, prey are consumed at a higher rate and experience population decline. Similarly,
when prey populations are low because of over-hunting, predators no longer have a robust food source and
begin experiencing population decline until prey populations recover. Examining the state space diagram
of the Lotka-Volterra equations reveals an orbital structure constrained by the initial state(X(0),Y(0)),
as shown in Figure 3.1b.
A continuous-time Markov chain (CTMC) version of this model is given in [21]. Let(X,Y), the num-
ber of prey and predator particles, be the states of the CTMC and define the following state-dependent
transition rates:
• (X,Y)→(X +1,Y) with ratec
1
X (prey reproduction)
• (X,Y)→(X− 1,Y +1) with ratec
2
XY (predator-prey encounter)
• (X,Y)→(X,Y − 1) with ratec
3
Y (predator death)
27
(a) Boom and bust cycles in the predator-prey model(b) State space diagram for the Lotka-Volterra equations showing
orbits for several initial conditions
Figure 3.1: Lotka-Volterra equations phenomena [21]
Figure 3.2: Transition structure of the CTMC model for predator-prey interactions as defined by [21].
These transitions are summarized in Figure 3.2. Of course there are other ways of capturing the
predator-prey relationship in a stochastic model. However, for the purpose of TDA applications, we choose
this particular stochastic model to study.
We consider the transition matrix P derived from the associated jump chain with modifications to
ensure that there are no absorbing states (that is, no population extinctions or growth beyond maximum
28
valuesM
prey
,M
pred
). The entries ofP are defined as follows, where Equations (7)-(17) describe transitions
to or from population boundaries.
P((i,j),(i+1,j))=
c
1
i
c
1
i+c
2
ij+c
3
j
, i̸=M
prey
(3.3)
P((i,j),(i− 1,j+1))=
c
2
ij
c
1
i+c
2
ij+c
3
j
, i̸=1,j̸=M
pred
(3.4)
P((i,j),(i,j− 1))=
c
3
j
c
1
i+c
2
ij+c
3
j
, j̸=1 (3.5)
(3.6)
P((i,M
pred
),(i+1,M
pred
))=
c
1
i
c
1
i+c
3
M
pred
, i̸=M
prey
(3.7)
P((i,M
pred
),(i,M
pred
− 1))=
c
3
M
pred
c
1
i+c
3
M
pred
(3.8)
P((M
prey
,j),(M
prey
− 1,j+1))=
c
2
M
prey
j
c
2
M
prey
j+c
3
j
, j̸=M
pred
(3.9)
P((M
prey
,j),(M
prey
,j− 1))=
c
3
j
c
2
M
prey
j+c
3
j
, j̸=1 (3.10)
P((M
prey
,M
pred
),(M
prey
,M
pred
− 1))=1 (3.11)
P((1,j),(1,j− 1))=
c
3
j
c
1
=c
3
j
, j̸=1 (3.12)
P((1,j),(2,j))=
c
1
c
1
+c
3
j
(3.13)
P((i,1),(i+1,1))=
c
1
i
c
1
i+c
2
i
, i̸=M
prey
(3.14)
P((i,1),(i− 1,2))=
c
2
i
c
1
i+c
2
i
, i̸=1 (3.15)
P((M
prey
,1),(M
prey
− 1,2)=1 (3.16)
P((1,1),(2,1))=1 (3.17)
29
Figure 3.3: A simulation of 500 steps from the Markov chain defined by Equations (3)-(17)
Figure 3.3 shows a simulation of 500 steps from the resulting Markov chain starting from a randomly
selected state. Parameter choices were M
prey
= M
pred
= 35,c
1
= 1,c
2
= 0.075,c
3
= 1 and M
prey
=
M
pred
= 20,c
1
= 1,c
2
= 0.2,c
3
= 1. [21] states that there is no single limit cycle, but rather a family of
perturbed cycles. However, common sense says that there’s really just one cycle, but we lack the machinery
to identify it. Our objective in the next section is to construct a complex that captures this cycle.
3.2 ConstructionoftheFilteredComplex
We now describe in detail the construction of the new filtered complex on Markov chains. Let P be a
stochastic matrix for a Markov chain with n states, with stationary distribution π such that πP = π .
Let [P
m
]
ij
denote the ij-th entry of that matrix P
m
. The discrete Carlsson-Sweitzer-Siojo (CSS) function
induces a matrixQ by
Q
ij
= sup
m≥ 1
([P
m
]
ij
− π j
). (3.18)
30
Figure 3.4: Level sets for a functiong and a simplicial complex built on ag
Another possibility is to consider a matrixQ induced by the discrete Green function [8]
Q
ij
=
X
m≥ 1
([P
m
]
ij
− π j
). (3.19)
For each of these functions, the entryQ
ij
quantifies how much node i contributes to nodej over the long
run. Next, since stationary distributions have the convention that high values ofπ i
correspond to signif-
icant statesi, we will reverse the usual convention for persistence of simplices. That is, if two simplices
satisfies the inclusion relation ∆ ′
⊆ ∆ , then the persistence functionf should satisfyf(∆
′
)≥ f(∆) . We
now construct the persistence function for the new complex in an axiomatic fashion.
First, the persistence value of a simplex{j
0
,...,j
k
} should be an aggregate of all the contributions to
it, from all the n nodes in the chain. Furthermore, nodes with low stationary distribution values should
make proportionally small contributions. Thus, our persistence function should be of the form
f(∆)=
n
X
i=1
π i
h(Q
ij
0
,Q
ij
1
,...,Q
ij
k
). (3.20)
31
Next, the functionh should only be large if all of its argumentsQ
ij
0
,Q
ij
1
,...,Q
ij
k
are large. This is
becausef(∆) should always be thought of as a property of the entire simplex, not merely its components.
For example, ifQ
ij
′ =0 for somej
′
∈∆ , then we should haveh(Q
ij
0
,Q
ij
1
,...,Q
ij
k
)=0 because node
i simply does not contribute toj
′
, and therefore not to all of∆ . There are many ways to accomplish this
(e.g. leth be the max of its inputs). To leverage smoothness properties, we choose to take their product:
h(Q
ij
0
,Q
ij
1
,...,Q
ij
k
)=
Y
j∈∆ Q
ij
. (3.21)
The function defined above already satisfies the necessary conditions for a simplicial complex. How-
ever, as a general rule of thumb in TDA, the persistence value of a simplex is often set to be the "worst"
value of the function that you are measuring within the simplex. We now turn to sublevel set persistence
as our inspiration for this next step [32]. Suppose we have a point cloud{x
1
,...,x
n
} and a functiong, as
in Figure 3.4. Furthermore, suppose we would like to build a persistence complex on the point cloud with
the homology similar to the homology of the sublevel sets ofg (for instance, if it has three distinct modes,
we would expect three longβ 0
barcodes). The most natural way is to define the persistence function f by
f(∆)=min
x∈∆ g(x). (3.22)
Returning to our new complex, a sensible analog to this is to define h to be a minimum over the probability
simplexP
k
. We found that the choice
h(Q
ij
0
,Q
ij
1
,...,Q
ij
k
)= min
p∈P
k
Y
j∈∆ Q
p
j
ij
(3.23)
32
performed well, and has the natural benefit that if we set p
j
= 0 for some j ∈ ∆ , then the resulting
function is identical to the value ofh(∆ \{j}). To summarize, the final form of the persistence function
for our complex construction is
f({j
0
,...,j
k
})= min
p∈∆ k
n
X
i=1
π i
Q
p
0
ij
0
··· Q
p
k
ij
k
. (3.24)
We now return to the CTMC Lotka Volterra model. The persistence barcodes obtained from applying
the Markov chain complex to the transitions matrix P is given in Figure 3.5. The longest β 1
barcode
indicates a cycle in the Markov chain, which is best interpreted as a cycle of increasing and decreasing
predator and prey populations. While the state space diagram shows individual orbits tied to specific
initial conditions, the TDA technique reveals the overall cyclic nature of the Markov chain. This coarse
feature is not uncovered by traditional methods. Furthermore, theβ 0
barcodes indicate that there are many
states(i,j) that are disconnected, ie. not typically visited by the chain.
This experiment not only demonstrates the use of the new filtered complex and TDA techniques applied
to a theoretical model, but also the use of higher order Betti numbers to analyze the structure of Markov
chains.
We also anticipate that applying this methodology to "real-world" data sets will reveal useful trends. For
instance, we propose using this new filtered complex on a data set of pickup and drop-off locations from the
freight network of the United States. Though the details of this application and the appropriate filtration
values are not yet refined, we predict that this Markov chain TDA approach might reveal repeating cycles in
the freight network that correspond to profitable routes. In a similar vein of research, we could examine the
air traffic network as in [25]. Applying this Markov chain based filtered complex to a strategically defined
random walk on the airport network has the potential to uncover patterns in the aviation network. These
potential avenues for research are explained in more detail in the Future Work section.
33
Figure 3.5: β 0
and β 1
barcodes for a stochastic model of predator-prey interactions obtained from the
Markov chain complex
3.3 RandomWalksonGraphs
In this first set of experiments, we validate the use of the new filtered complex by applying it to a set of
stochastic matrices derived from random walks on graphs of varying shape. The set of graphs includes a
cylinder, cuboid, and a torus, each constructed in a way such that 96 nodes are used and each node has
degree four. The three graphs are pictured in Figure 3.6.
Note that although the nodes are arranged in such a way that the underlying shape of the graph is
apparent, this need not be the case. For example, the cuboid graph in Figure 3.6a could also be visualized
as in Figure 3.7, in which the underlying structure, particularly the void, is not obvious.
For each of the graphs, we form a stochastic matrix, that is a matrix of transition probabilities for a
random walk along the graph, such that a given node transitions to any of its four neighbors with equal
probability. We then use the new filtered complex defined in the previous section to obtain the persistence
barcodes for each of three stochastic matrices. The barcodes are shown in Figure 3.8. The barcodes capture
34
(a) Cuboid (b) Cylinder
(c) Torus
Figure 3.6: Three degree-4 graphs on 96 nodes
35
(a) (b)
(c)
Figure 3.7: Three arrangement variations on the same cuboid graph
36
(a) Cylinder (b) Cuboid
(c) Torus
Figure 3.8: Persistence barcodes for three degree-4 graphs obtained from the new filtered complex.
the defining features of each of the graphs, particularly the cyclic shape of the cylinder, the inner void of
the cuboid, and the two loops and three-dimensional void of the torus. Additionally, while each of these
graphs has the same degree sequence, TDA reveals the topological distinctions in their shapes. To our
knowledge, there aren’t existing tools in the research of stochastic processes that would capture these
coarse features.
Another shape to which we applied a similar process is the icosidodecahedron pictured in Figure 3.9a.
Interestingly, due to its large 10-gon faces, the persistence barcodes pick up ten "loops" corresponding to
these faces. The barcodes are pictured in Figure 3.9b.
37
(a) A degree-4 icosidodecahedron (b)β 0 andβ 1 persistence barcodes
Figure 3.9: The longestβ 1
intervals correspond to the largest faces, which are detected as cycles.
3.4 PerformanceonBenchmarkDataSets(TADASets)
While the initial experiment demonstrates the effectiveness of the new complex construction on rather sim-
ple, manufactured graphs, the following experiment demonstrates its performance on benchmark topology
data sets with varying levels of noise. We use several data sets manufactured using the TADAsets library
[9], namely the infinity sign and d-sphere, and torus. The infinity sign point cloud was generated with
100 points and noise level0.1. A random walk was then constructed as follows. First, we ran ak-nearest
neighbors procedure to identify the5 nearest neighbors for each vertex in the point cloud. A vertex was
then connected to each of those neighbors via a directed edge to form a directed graph. A random walk
could then be taken on the graph, using1/5 for the transition probabilities. We applied the Markov chain
complex to the transition probability matrix and produced the persistence diagrams shown in Figure 3.10.
Thed-sphere and torus point clouds are left as future work.
3.5 TheIsingModel
In the following set of experiments, we consider a well-known spin system called the nearest neighbor
Ising model on a graph G = (V,E). The system can be thought of as a set of magnets, each placed on
38
-1 -0.5 0 0.5 1
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
(a) caption
-18 -16 -14 -12 -10 -8 -6 -4 -2 0 2
10
4
Dim 0
-18 -16 -14 -12 -10 -8 -6 -4 -2 0 2
10
4
Dim 1
(b) caption
Figure 3.10: Theβ 1
barcodes capture the double-loop structure of the infinity sign point cloud.
a single vertex of the graph G, and each with one of two possible orientations. An example is shown
in Figure 3.11. The model itself is often used to simulate the magnetic dipole moments of atomic spins
and study phase transitions. Mathematically, and using notation from [18], each vertexv∈ V has a spin
σ (v)∈{− 1,1}. The energy of a configuration σ ∈{− 1,1}
V
=X is given by
H(σ )=− X
v,w∈V,v∼ w
σ (v)σ (w) (3.25)
The energy of a configuration is higher when there are more pairs of neighbors whose spins disagree.
The Gibbs distribution corresponding toH is the probability distributionµ onX defined by
µ (σ )=
1
Z(β )
e
− βH (σ )
, (3.26)
whereZ(β ) =
P
σ ∈X
e
− βH (σ )
is the normalizing constant required to makeµ a probability distribution,
and β ≥ 0 is a parameter that determines the influence of the energy function. When β = 0, H plays
no role in the interactions of the particles in the spin system andµ is the uniform distribution onX . As
β > 0 increases, so does the bias towards low-energy configurations.
39
Figure 3.11: An Ising spin system on a lattice graph. [28]
The Glauber dynamics for µ (a reversible MC with state spaceX and stationary distribution µ ) have
the following transition probabilities:
P(σ,σ ′
)=
1
|V|
X
w∈V
e
βσ ′
(w)S(σ,w )
e
βσ ′
(w)S(σ,w )
+e
− βσ ′
(w)S(σ,w )
· 1
{σ (v)=σ ′
(v)forv̸=w}
, (3.27)
where S(σ,w ) =
P
u:u∼ w
σ (u). The transition probabilities are derived from the following method of
moving from stateσ to a new state: a vertexv is chosen uniformly at random fromV and a new state is
chosen according to the measureµ conditioned on the set of states equal toσ at all vertices different from
v.
Each of the following instances uses the above Glauber dynamic transition probabilitiesP .
40
(a) A graph on 5 nodes connected in the shape of a line
(b)β 0 andβ 1 persistence barcodes for the Ising model on a line
Figure 3.12: Ising model on a line
3.5.1 IsingonaLine
We first consider the Ising model on a graph G consisting of 5 vertices arranged in a line, as shown in
Figure 3.12a. We calculate the transition probabilities via Equation (20), and use the new filtered complex
to recover theβ 0
andβ 1
persistence barcodes. These barcodes are shown in Figure 3.12b.
Theβ 1
barcodes show a single, lengthy interval which corresponds to a cycle in the structure of the
Markov chain. Examining the a representative cycle for this interval reveals that the lowest energy states,
those with blocks of nodes of a single spin, make up theβ 1
interval. This suggests that the low energy occur
in a cycle in which a particular spin progressively occupies more and more of the vertices until all vertices
have the same spin. However, since the spin of a vertex can also change randomly, even if that implies
a transition to a higher energy state, either of the endpoints of the line can change spin, inciting a new
sequence of transitions that move towards the whole line having the new spin value. This phenomenon
41
Figure 3.13: For the Ising model on a line, the lowest energy (most probable) states form a cycle. For
example, starting with the state at the top of the circle where all spins are+1, either the node at the right
or left end of the line has a chance of changing spins. Once this happens, the changed exerts influence
on its neighbor to also change spin with a certain probability, and the effect cascades until all nodes have
changed their spin to− 1.
is illustrated in figure 3.13. Additionally, the two longest β 0
barcodes correspond to the two antipodes on
the circle,(− 1− 1− 1− 1− 1) and(11111), as these are the two most dominating, low-energy states.
3.5.2 IsingonaCircle
Now letG be a graph on five nodes, but connected in a circular formation as shown in Figure 3.14a. As
before, we construct the transition matrixP using Equation (20). Note that the new graph configuration
changes the values of S(σ,w ) if the set of nodes adjacent to w has changed. Applying the new filtered
complex to P yields the persistence barcodes shown in Figure 3.14b. As with the line, the two uniform
states(− 1− 1− 1− 1− 1) and(11111) correspond to the two longestβ 0
barcodes. However, unlike with
the line graph, there is now a significant β 2
barcode, which implies a void-like structure in the Markov
chain. This can best be explained by Figure 3.15, which shows the lowest energy states (those with blocks
of nodes of the same spin) arranged along a sphere.
42
(a) A graph on 5 nodes connected in the shape of
a circle
(b)β 0 andβ 1 persistence barcodes for the Ising model on a circle
Figure 3.14: Ising model on a line
Figure 3.15: For the Ising model on a circle, the lowest energy (most probable) states form a spherical space
43
Figure 3.16: In the 1D Sznajd model, the agreement of two agents causes their immediate neighbors to
agree with them. This is referred to as social validation. On the other hand, the disagreement of two
agents causes discord to spread to their neighbors. [22]
As seen in the image, a state is likely to transition to a state in a "level" above it or a level below given
that the transition is legal in terms of the Glauber dynamics. Each level consists of low-energy states with
the same number of vertices of a particular spin. For example, the top level consists only of the state
(11111), which has no− 1’s, and the level immediately below consists of states with a single− 1 spin.
Transitions then move from state to state along this sphere. In simulated random walks, the sequence of
visited states often lands on and remains at one of the antipodes for a large sequences of steps. However,
because there is a decent probability of leaving the antipode and moving to a state in the adjacent level,
movement around the sphere resumes.
3.6 SocialDynamicsandtheSznajdModel
The Sznajd model [33] is an extension of the Ising spin model which studies a phenomenon called social
validation in opinion dynamics. The basic principle is that discord spreads. This is reflected in the basic
update rules:
1. If two agents share an opinion, their neighbors will agree with them
2. If a block of adjacent agents disagree, their neighbors will start to argue with them.
These update rules are illustrated in Figure 3.16. More formally, each agenti has opinionS
i
∈{− 1,1}. In
44
a 1D formulation, agents are arranged on a line, each having two neighbors. Agent opinions are updated
according to the two rules
1. IfS
i
=S
i+1
, thenS
i− 1
=S
i
andS
i+1
=S
i
.
2. IfS
i
=− S
i+1
, thenS
i− 1
=S
i+1
andS
i+2
=S
i
.
To study this model, we define a Markov chain similar to the Glauber dynamics of the nearest-neighbor
Ising model. That is, at each update step from an origin stateσ , choose a pair of neighboring agentsi and
i+1 uniformly at random. Then choose a new state from the set of states equal toσ at all vertices except
possibly agentsi− 1 andi+2 according to the probabilitiesP
V
andP
D
defined below:
P
V
=P(S
i− 1
=S
i
=S
i+2
|S
i− 1
=S
i
) (3.28)
P
D
=P(S
i− 1
=S
i+1
,S
i+2
=S
i
|S
i− 1
=− S
i
) (3.29)
This process yields the following transition probabilities from a stateσ to a stateτ :
P(σ,τ )=
1
|V|− 1
|V|− 1
X
i=1
P(τ (i− 1),τ (i+2)|σ (i),σ (i+1)) (3.30)
Using this stochastic version of the 1D Sznajd model and its transition matrix P , we can apply the new
filtered complex as we did with the Ising model. The persistence barcodes for the Sznajd model on a
five-node line graph are shown in Figure 3.17.
The four longestβ 0
barcodes naturally correspond to the two complete agreement and gridlock states,
namely{+1+1+1+1+1},{− 1− 1− 1− 1− 1},{+1− 1+1− 1+1},{− 1+1− 1+1− 1}. There
are also five β 1
barcodes that stand out among the noise. A (non-unique) representative cycle for each of
these barcodes is given by the following states:
1. {− 1− 1− 1− 1− 1},{− 1− 1+1− 1+1},{+1− 1+1− 1+1},{+1− 1+1− 1− 1}
45
Figure 3.17: Persistence barcodes for the Sznajd model on a line
2. {− 1+1− 1+1− 1},{+1+1+1+1− 1},{+1+1+1+1+1},{− 1+1+1+1+1}
3. {− 1− 1− 1− 1− 1},{− 1− 1+1− 1+1},{+1− 1+1− 1+1}
4. {− 1+1− 1+1− 1},{− 1+1+1+1+1},{+1+1+1+1+1}
5. {+1+1− 1+1+1},{+1+1− 1+1+1},{+1− 1+1+1+1},{+1− 1+1− 1+1},{+1− 1− 1− 1+1},{+1− 1− 1− 1− 1},{− 1− 1+1− 1+1},{− 1− 1− 1− 1+1},{− 1− 1− 1− 1− 1},{− 1− 1− 1+1− 1},{− 1+1− 1+1− 1},{− 1+1− 1+1+1}
Though the physical interpretation and social implications of these cycles is not yet clear, some interesting
patterns are evident. For example, each representative cycle contains at least one complete agreement and
one gridlock state.
46
3.7 U.S.AirlineTraffic
One “real-world” application is to a set of commercial flight departure and arrival data available through
the ASA Section on Statistical Computing 2009 Data Expo [7]. The dataset includes details from commer-
cial flights between October 1987 and April 2008 with nearly 120 million records. We explore this data
using the Markov chain complex using the following methodology. Taking just the "Origin" and "Desti-
nation" columns from the data set for a single year, we build a Markov chain that uses the frequency of
origin-destination pairs to derive transition probabilities for each location pair. That is, the probability
of a transition from location x to location y is simply the number of flights from x to y divided by the
total number of flights departing from x. Additionally, in order to avoid absorbing states, we delete any
locations that are either a destination but never an origin, or an origin but never a destination. We can
now apply the filtered complex to the resulting transition probability matrix and observe the resulting
persistence barcodes. The barcodes obtained from performing these steps for the 1987 data are shown in
Figure 3.18. There is oneβ 1
barcode of significant length, and we can visualize the airports and edges that
make up a representative cycle for this barcode, as shown in Figure 3.19.
Following the same procedure with the 2008 data, we obtain the barcodes shown in Figure 3.20, which
show three significant β 1
barcodes. Javaplex can identify a (nonunique) representative cycle for each of
the three longest β 1
barcodes, which we display in Figure 3.21. Though they appear to be nested in the
map image, we turn to the constructed simplicial complex to better understand the structure of these cy-
cles. That is, we can viewR
3
embeddings of the subcomplexes of simplices composed of the nodes in
the representative cycles. We include any 0, 1 and 2-simplices that have persistence values within the
range on the barcodes. Since the embeddings are artificial (that is, the Euclidean coordinates for each of
the nodes are generated randomly), we examine several randomly generated embeddings in order to bet-
ter understand the structure of the subcomplexes, as shown in Figure 3.22. Viewing the subcomplexes
in this way reveals three empty 2-simplices composed of the following edges (1-simplices): ABQ-DFW,
47
Figure 3.18: β 0
andβ 1
persistence barcodes for airline data from 1987
Figure 3.19: A representative cycle for the longestβ 1
barcode
48
Figure 3.20: β 0
andβ 1
persistence barcodes for airline data from 2008
Figure 3.21: Representative cycles for the three longestβ 1
barcodes:
ABQ-AUS-DAL-DFW-PHX-ABQ
ABQ-AUS-DAL-DFW-DEN-SJT-LAX-PHX-ABQ
ABQ-AUS-DAL-DFW-IAH-ATL-ORD-MSP-DEN-SJT-LAX-PHX-ABQ
49
(a) (b) (c)
(d) (e) (f)
(g) (h) (i)
Figure 3.22: The subcomplexes composed of the nodes and valid simplices from the representative cycles
of each of the longestβ 1
barcodes. Three different 3D embeddings are shown (column-wise), and empty
2-simplices that may contribute to the underlying loop structure are highlighted in red.
50
Figure 3.23: Airports from the top 50 contributors to ABQ, DEN, and DFW were considered.
DFW-PHX, PHX-ABQ; ABQ-DFW, DFW-DEN, DEN-ABQ; DEN-DFW, DFW-PHX, PHX-DEN. Recall that,
intuitively, a simplex is added to this Markov chain complex if there are sufficiently many nodes with large
enough stationary distribution values that contribute to it over the long run. Furthermore, ifQ
ij
′ =0 for
some j
′
in the simplex, then the term for node i in the persistence value is zero. Thus, the appearance
of an empty 2-simplex implies that in the long run, there are nodes that contribute to both airports in
each edge of the simplex, but not to all three. To confirm this, we can examine the matrix Q. Specifically,
taking the empty2 simplex ABQ-DFW-DEN as an example, we consider the top10% of valuesQ(i,j) for
j ∈ {ABQ, DFW, DEN} and all possible nodes i and look for contributing nodes j that are in common
to two but not all three airports in the 2-simplex. The Venn diagrams in Figures 3.23 and 3.24 show the
results of this process for both the ABQ-DFW-DEN and ABQ-DFW-PHX empty2-simplices. Additionally,
plotting these contributing airports on a map of the United States shows them as belonging to three geo-
graphically separated regions (see Figures 3.25 and 3.26). The top contributors for an edge are also smaller
51
Figure 3.24: Airports from the top 50 contributors to ABQ, PHX, and DFW were considered.
local airports, possibly implying that the two edge airports but not the third in the simplex act as a "hub"
for the geographic region.
3.8 Molecularconformationspaces
Molecules can exist in various realizations, or conformations, within three-dimensional space. The study
of molecular dynamics aims to taxonomize all feasible conformations of a molecule. The collection of
Figure 3.25: A Google map showing the highlighted airports from Figure 3.23 and the three airports that
compose the empty 2-simplex (in yellow)
52
Figure 3.26: A Google map showing the highlighted airports from Figure 3.24
conformations of a molecule is called theconformationspace of that molecule. The Python packageRDKit
[27] is commonly used to simulate molecular dynamics, and has a function called EmbedMolecule that
can be used to generate random samples from the conformation space of a particular molecule. In this
experiment, motivated by a result in [30], we drew 200 samples of conformations of a pentane molecule
(embedded as high-dimensional vectors) and computed their pairwise distance matrix. Then, we took
a Markov random walk on the k = 8 nearest neighbors, and computed persistence barcodes using our
complex. Figure 3.27 shows a significant β 1
contribution, indicating a rotational degree of freedom in the
conformation space.
53
-8000 -7000 -6000 -5000 -4000 -3000 -2000 -1000 0 1000
Dim 0 -8000 -7000 -6000 -5000 -4000 -3000 -2000 -1000 0 1000
Dim 1
Figure 3.27: Persistence barcodes for a Markov random walk on the nearest-neighbors graph of the con-
formation space of a pentane molecule. The significant β 1
component suggests a rotational degree of
freedom.
54
Chapter4
FutureResearchDirections
There are many avenues to explore using the new filtered complex construction to study Markov chains.
We outline some of them in the following sections.
4.1 IsingonaGrid,LandmarkPoints
Perhaps an immediate extension to explore would be the Ising model on more complex graphs. For exam-
ple, we could construct the transition matrix for Ising on a grid in order to include a more complex network
of nearest-neighbor influences. The 2D Ising model with no external field, which was solved analytically
by Onsager in 1944 [23], exhibits a phase transition. Below a certain critical temperature, almost all spins
are aligned. However, above this critical temperature the order breaks down and there is no longer a pre-
ferred spin alignment. We predict that the persistence barcodes for the 2D Ising model at temperatures
above and below the critical temperature should differ and reflect the underlying behavior of the systems
at these temperatures.
One challenge that comes with exploring Ising and Ising-like models on larger systems is the increased
number of spin states. The number of states increases exponentially with the number of nodes, which
can prove to be computationally expensive when it comes to building filtered simplicial complexes with
simplices larger than 2-simplices. For this reason, it may be necessary to adopt a methodology for reducing
55
the number of states. One possible approach is to simply discard a percentage of states based on their
energy, although this makes it necessary to also calculate the energy for all of the states. Another option
is incorporate a witness-complex-inspired method into building the filtered complex so that only a smaller
number of "landmark" states are necessary.
4.2 OpinionandSocialDynamicsModels
The Sznajd model also offers some interesting directions for future work. First, there is more exploration
to be done in terms of interpreting the persistence barcodes for the simple Sznajd model on a line. Addi-
tionally, as with the Ising model, we could explore the Sznajd model on graphs of different shapes. There
are also many other opinion and social dynamics models to which we could apply our methodology. For
example, the voter model is an agent-based opinion model where each voter has an opinion in a given
opinion set, and each randomly selected voter assumes the opinion of one of its neighbors with a cer-
tain probability. Other variations include those in which only "undecided" agents can interact and form
opinions and those in which there are "zealots" who never change their opinions despite interaction with
oppositely-opinionated neighbors [14]. These models often seek to more closely resemble the behavior of
real-world interactions between people of differing opinions in various social situations. For example, in
[22] the effect of social media was modelled as an external field applied to a Sznajd-like model. Thus the
insights revealed by TDA techniques may provide real-world-applicable information regarding opinion
formation and influence. Additional models that are of particular interest are the q-state Potts model that
allows for q different spin values in an Ising-like system [26], and the two-dimensional Sznajd model in
which a two-by-two configuration of agents and their opinions influence the eight surrounding neigh-
bors (see Figure 4.1). The task for each these models is to "Markovianize" them in a natural and apply the
Markov chain complex to the transition probability matrix.
56
Figure 4.1: Two different schematic representations for a 2D Sznajd model proposed by Stauffer and Galam
[22]
4.3 U.S.FreightData
Another real-world application would be to domestic freight data, which consists of pick-up and drop-
off locations for trucking routes throughout the United States. While the details such as the appropriate
filtration values for a complex are not yet fully refined, we could define a Markov chain on the network
using the frequency of origin-destination pairs to derive the transition probabilities as in the airline traffic
application. Applying the Markov chain complex to this chain may give insights into the most profitable
routes in the freight network.
57
Chapter5
Conclusions
This dissertation has attempted to bridge the gap between topological data analysis and operations re-
search, and to demonstrate how the two fields can complement each other. We have developed a TDA-
guided local search procedure for optimization problems and demonstrated improved performance for the
2-opt heuristic applied to the traveling salesman problem. We have also introduced a novel simplicial com-
plex construction applicable to Markov chains. Its use in exploring the underlying structure and coarse
features of a random process is validated via a series of experiments of ranging complexity, and identifies a
new way to determine coarse features in Markov chains such as cycling or clustering. Our results broaden
the relevance and suitability of topological data analysis as an interdisciplinary tool for many fields of
study.
58
Bibliography
[1] H. Adams and A. Tausz. Javaplex Tutorial. 2019.url:
https://www.math.colostate.edu/~adams/research/javaplex%5C_tutorial.pdf.
[2] Paul Alexandroff. “Über den allgemeinen Dimensionsbegriff und seine Beziehungen zur
elementaren geometrischen Anschauung”. In: Mathematische Annalen 98 (), pp. 617–635.
[3] Erik Carlsson and John Gunnar Carlsson. Topology and local optima in computer vision. Tech. rep.
UC Davis, 2021.
[4] Erik Carlsson, John Gunnar Carlsson, and Shannon Sweitzer. Applying topological data analysis to
local search problems. 2022.doi: 10.3934/fods.2022006.
[5] Gunnar E. Carlsson. “Topology and data”. In: Bulletin of the American Mathematical Society 46
(2009), pp. 255–308.
[6] Gunnar E. Carlsson, Tigran Ishkhanov, Vin de Silva, and Afra Zomorodian. “On the Local Behavior
of Spaces of Natural Images”. In: International Journal of Computer Vision 76 (2007), pp. 1–12.
[7] Data Expo 2009: Airline on time data. Version V1. 2008.doi: 10.7910/DVN/HG7NV7.
[8] Robert B. Ellis. “Discrete Green’s functions for products of regular graphs”. In: arXiv:
Combinatorics (2003).
[9] Michael Catanzaro Filip Cornell Nathaniel Saul. tadasets. https://github.com/scikit-tda/tadasets.
[10] Robert Ghrist and Sanjeevi Krishnan. “A topological max-flow-min-cut theorem”. In: 2013 IEEE
Global Conference on Signal and Information Processing. IEEE. 2013, pp. 815–818.
[11] Marian Gidea and Yuri Katz. Topological Data Analysis of Financial Time Series: Landscapes of
Crashes. Papers 1703.04385. arXiv.org, Mar. 2017.url:
https://ideas.repec.org/p/arx/papers/1703.04385.html.
[12] Keld Helsgaun. “General k-opt submoves for the Lin–Kernighan TSP heuristic”. In: Mathematical
Programming Computation 1.2-3 (2009), pp. 119–163.
59
[13] Yasuaki Hiraoka, Takenobu Nakamura, Akihiko Hirata, Emerson G. Escolar, Kaname Matsue, and
Yasumasa Nishiura. “Hierarchical structures of amorphous solids characterized by persistent
homology”. In: Proceedings of the National Academy of Sciences 113 (2016), pp. 7035–7040.
[14] Ran Huo and Rick Durrett. “The zealot voter model”. In: The Annals of Applied Probability 29.5
(2019), pp. 3128–3154.doi: 10.1214/19-AAP1476.
[15] Elias Khalil, Pierre Le Bodic, Le Song, George Nemhauser, and Bistra Dilkina. “Learning to Branch
in Mixed Integer Programming”. In: Proceedings of the AAAI Conference on Artificial Intelligence
30.1 (Feb. 2016).url: https://ojs.aaai.org/index.php/AAAI/article/view/10080.
[16] Minh Quang Le and Dane Taylor. Persistent homology of convection cycles in network flows . 2022.
arXiv: 2109.08746[math.DS].
[17] Jin-Ku Lee, Zhaoqi Liu, Jason K. Sa, Sang Shin, Jiguang Wang, Mykola Bordyuh, Hee Jin Cho,
Oliver Elliott, Timothy Chu, Seung Won Choi, Daniel I S Rosenbloom, In-Hee Lee, Yong Jae Shin,
Hyun Ju Kang, Donggeon Kim, Sun Young Kim, Moon-Hee Sim, Ju-sun Kim, Taehyang Lee,
Yun Jee Seo, Hyemi Shin, Mijeong Lee, Sung Heon Kim, Yong-Jun Kwon, Jeong-Woo Oh,
Mi Young Song, Misuk Kim, Doo-Sik Kong, Jung Won Choi, Ho Jun Seol, Jung-Il Lee,
Seung Tae Kim, Joon-Oh Park, Kyoung-Mee Kim, Sang Yong Song, Jeong-Won Lee,
Hee Cheol Kim, Jeong Eon Lee, Min Gew Choi, Sung Wook Seo, Young Mog Shim, Jae Ill Zo,
Byong Chang Jeong, Yeup Yoon, Gyu Ha Ryu, Nayoung K. D. Kim, Joon Seol Bae,
Woong-Yang Park, Jeongwu Lee, Roel G. W. Verhaak, Antonio Iavarone, Jeeyun Lee, Raúl Rabadán,
and Do-Hyun Nam. “Pharmacogenomic landscape of patient-derived tumor cells informs precision
oncology therapy”. In: Nature Genetics 50 (2018), pp. 1399–1411.
[18] David A. Levin and Yuval Peres. “Markov Chains and Mixing Times: Second Edition”. In: 2017.
Chap. 3.3.
[19] P. Y. Lum, G. Singh, A. Lehman, T. Ishkanov, M. Alagappan, J. Carlsson, G. Carlsson, and Mikael
Vilhelm Vejdemo Johansson. “Extracting insights from the shape of complex data using topology”.
English. In: Scientific Reports 3 (Feb. 2013).issn: 2045-2322.doi: 10.1038/srep01236.
[20] Arian Maleki, Morteza Shahram, and Gunnar Carlsson. “A near optimal coder for image geometry
with adaptive partitioning”. In: 2008 15th IEEE International Conference on Image Processing. 2008,
pp. 1061–1064.doi: 10.1109/ICIP.2008.4711941.
[21] Gonzalo Mateos. Predator-Prey Population Dynamics. Slide presentation. 2018.
[22] Kamyar Nazeri. “Effect of Social Media on Opinion Formation.” In: arXiv: Physics and Society
(2018).
[23] Lars Onsager. “Crystal statistics. I. A two-dimensional model with an order-disorder transition”.
In: Physical Review 65 (1944), pp. 117–149.
[24] Jose A. Perea and Gunnar E. Carlsson. “A Klein-Bottle-Based Dictionary for Texture
Representation”. In: International Journal of Computer Vision 107 (2013), pp. 75–97.
60
[25] Luiz Manella Pereira, Luis Caicedo Torres, and M. Hadi Amini. “Topological Data Analysis for
Network Resilience Quantification”. In: SN Operations Research Forum 2.2 (June 2021), pp. 1–17.
doi: 10.1007/s43069-021-00070-.
[26] R. B. Potts. “Some generalized order-disorder transformations”. In: Mathematical Proceedings of the
Cambridge Philosophical Society 48.1 (1952), pp. 106–109.doi: 10.1017/S0305004100027419.
[27] RDKit: Open-source cheminformatics. http://www.rdkit.org. [Online; accessed 11-April-2013].
[28] Andrew Rechnitzer.url:
https://personal.math.ubc.ca/~andrewr/research/intro_html/node14.html.
[29] Gerhard Reinelt. “TSPLIB - A traveling salesman problem library”. In: ORSA journal on computing
3.4 (1991), pp. 376–384.
[30] Luis Scoccola and Jose A. Perea. FibeRed: Fiberwise Dimensionality Reduction of Topologically
Complex Data with Vector Bundles. 2023. arXiv: 2206.06513[cs.CG].
[31] Vin de Silva and Gunnar E. Carlsson. “Topological estimation using witness complexes”. In: PBG.
2004.
[32] Sublevel set persistent homology.url:
https://mtsch.github.io/Ripserer.jl/v0.10/generated/sublevelset/.
[33] Katarzyna Sznajd-Weron. “Sznajd model and its applications”. In: Acta Physica Polonica B 36 (2005),
p. 2537.
[34] Sarah Tymochko, Kritika Singhal, and Giseon Heo. Classifying sleep states using persistent
homology and Markov chain: a Pilot Study. 2020. arXiv: 2002.07810[q-bio.QM].
[35] Mikael Vejdemo-Johansson and Primoz Skraba. “Topology, big data and optimization”. In: Big data
optimization: recent developments and challenges. Springer, 2016, pp. 147–176.
[36] Mikael Vejdemo-Johansson and Primoz Skraba. “Topology, big data and optimization”. In: Big data
optimization: recent developments and challenges. Springer, 2016, pp. 147–176.
[37] Hubert Wagner, Pawel Dlotko, and Marian Mrozek. “Computational Topology in Text Mining”. In:
CTIC. 2012.
61
Abstract (if available)
Abstract
Topological data analysis (TDA) is a powerful tool for understanding the underlying shape and structure of data which may not be recoverable via other, more traditional methods. We present applications of TDA to operational research problems, beginning with discrete optimization problems. Most notably, we demonstrate improved performance of the 2-opt local search procedure by applying a standard Vietoris-Rips construction which suggests a pipeline for TDA-guided local search. We then introduce a new filtered simplicial complex construction associated to a stochastic matrix and steady state vector. This allows the study of Markov chains derived from a variety of applications from a topological perspective. As a result, we recover coarse features of the Markov chains that were not previously accessible.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Applications of explicit enumeration schemes in combinatorial optimization
PDF
A stochastic conjugate subgradient framework for large-scale stochastic optimization problems
PDF
Algorithms and landscape analysis for generative and adversarial learning
PDF
The warehouse traveling salesman problem and its application
PDF
Continuous approximation for selection routing problems
PDF
Modern nonconvex optimization: parametric extensions and stochastic programming
PDF
A continuous approximation model for the parallel drone scheduling traveling salesman problem
PDF
Applications of Wasserstein distance in distributionally robust optimization
PDF
Advancements in understanding the empirical hardness of the multi-agent pathfinding problem
PDF
Revisiting FastMap: new applications
PDF
Provable reinforcement learning for constrained and multi-agent control systems
PDF
Continuous approximation formulas for cumulative routing optimization problems
PDF
Congestion reduction via private cooperation of new mobility services
PDF
Improving decision-making in search algorithms for combinatorial optimization with machine learning
PDF
Some bandit problems
PDF
Stochastic games with expected-value constraints
PDF
Speeding up multi-objective search algorithms
PDF
Landscape analysis and algorithms for large scale non-convex optimization
PDF
Decentralized real-time trajectory planning for multi-robot navigation in cluttered environments
PDF
Robustness of gradient methods for data-driven decision making
Asset Metadata
Creator
Sweitzer-Siojo, Shannon Marie
(author)
Core Title
Applications of topological data analysis to operational research problems
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Industrial and Systems Engineering
Degree Conferral Date
2023-05
Publication Date
04/14/2023
Defense Date
04/04/2023
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
Markov chains,OAI-PMH Harvest,operations research,optimization,persistent homology,topological data analysis
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Carlsson, John Gunnar (
committee chair
), Carlsson, Erik (
committee member
), Kumar Thittamaranahalli, Satish (
committee member
), Razaviyayn, Meisam (
committee member
)
Creator Email
ladytrombone27@gmail.com,sweitzer@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113015430
Unique identifier
UC113015430
Identifier
etd-SweitzerSi-11625.pdf (filename)
Legacy Identifier
etd-SweitzerSi-11625
Document Type
Dissertation
Format
theses (aat)
Rights
Sweitzer-Siojo, Shannon Marie
Internet Media Type
application/pdf
Type
texts
Source
20230414-usctheses-batch-1022
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
Markov chains
operations research
optimization
persistent homology
topological data analysis