STRATEGIES FOR EFFECTIVE RAIL TRACK CAPACITY USE
by
Pavankumar Murali
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(INDUSTRIAL & SYSTEMS ENGINEERING)
December 2010
Copyright 2010 Pavankumar Murali
Acknowledgments
This work is the result of the guidance, support and encouragement afforded to me by the
faculty at USC, my family, relatives and friends. I thank each and every one of them for
their contribution towards the smooth and timely completion of my Ph.D.
I am greatly indebted to my advisor, Dr. Maged Dessouky, for his guidance
throughout the course of my Master's and Ph.D at USC. I thank him for his
encouragement, advice and patience at every step of my doctoral study and in teaching
me the ropes of conducting research. I have truly enjoyed and learnt a lot from our
conversations about academics, politics, sports, philosophy, and life, in general. It was a
true pleasure being his student for over 6 years. I thank Dr. Fernando Ordóñez for sharing
his passion for mathematical optimization with me. His assistance during my research
work with CREATE and METRANS proved invaluable. I thank him for allotting time to
carefully review my research work, reports and papers and help me improve myself in
these areas. I am grateful to my doctoral committee members Dr. Jim Moore, Dr. Sheldon
Ross and Dr. Genevieve Giuliano for their suggestions towards my qualifying proposal
and the ensuing research work. I thank Dr. Moore for being kind enough to write
reference letters during my job search. I thank Dr. Ajit Kolar at IIT Madras for
recognizing my hunger for research, and for encouraging me to pursue a Ph.D.
I have truly enjoyed my stay at USC, and a lot of it was made easy by the help I
received from the wonderful staff at the Department of Industrial & Systems
Engineering; Evelyn Felina, Georgia Lum and Mary Ordaz. I thank all of them for their
time and effort in helping me with the administrative nitty-gritties.
My special thanks to my aunt Shanta Shamrao, my cousin Venkatesh Rao and his
wife Ana Corina Bucur for their hospitality, and for providing me a homely atmosphere
in LA, far away from home. I am indebted to my friend and colleague Indrajeet Dixit and
his wife Manasee Bhandekar for being my wonderful friends. My special thanks to
Indrajeet for helping me develop the taste to appreciate a variety of cuisines. I have truly
cherished our conversations ranging from football to behavioral economics. I thank my
Ph.D colleagues Bill Jia, Iris Shen, Worawan Suteewong, Shi Mu, Luca Quadrifoglio and
Bo Zhang for their help during the various stages of my research. I thank my friends in
Los Angeles and my friends from IIT Madras for their constant help and encouragement,
and for providing me moments of joy that I will cherish for a long time.
Words cannot describe how grateful I am to my parents and sister for their help and
support in making me the person I am today. Their unconditional love, faith and
encouragement have allowed me to pursue my dreams and goals, and succeed in life. I
am especially grateful to my sister for her cooperation and understanding during times
when I had hectic schedules and deadlines. My stay in LA was comfortable largely due to
her kindness. I dedicate this thesis to my parents and sister.
Lastly, I am thankful to the Metropolitan Transportation Center (METRANS) at USC
for the financial assistance during my Ph.D, and to the John Randolph Haynes and Dora
Haynes Foundation for awarding me the Doctoral Dissertation Fellowship that aided the
uninterrupted completion of my research work.
Table of Contents
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Motivation
1.2 Existing rail planning methodology
1.3 Proposed methodology
1.4 Complexity of the IP model
1.5 Dissertation layout
Chapter 2: Literature Review
2.1 Rail routing and scheduling
2.2 Capacity management and delay estimation
2.2.1 Analytical models
2.2.2 Simulation models
Chapter 3: Research Accomplishments
3.1 Research gap
3.2 Research contributions
Chapter 4: A Capacity Management Model
4.1 Notation
4.2 IP model
4.3 Complexity
4.4 Aggregation
4.5 Preliminary results
Chapter 5: Delay Estimation Techniques
5.1 A delay updating procedure
5.2 Generating the delay function: An example
5.3 Delay function in IP model
5.4 A generic delay estimation procedure
5.5 Case study: Los Angeles area railway network
5.5.1 Delay estimation for a double-track sub-network
5.5.2 Delay estimation for a single-track sub-network
Chapter 6: Approximation-based Solution Procedures
6.1 A solution approach using LP-relaxation
6.2 A solution approach using routing constraints
6.3 A genetic algorithm (GA) solution procedure
6.3.1 Background
6.3.2 Genetic representation
6.3.3 Initialization
6.3.4 Selection operation
6.3.5 Crossover operation
6.3.6 Crossover repairing algorithm
6.3.7 Mutation operation
6.3.8 Mutation repairing algorithm
6.3.9 The genetic algorithm procedure
Chapter 7: Experimental Design and Results
7.1 Experimental design
7.2 Experimental results
7.3 Sensitivity analysis
7.3.1 Test network 5
7.3.2 Test network 6
7.3.3 Test network 7
Chapter 8: Conclusions and Future Research
8.1 Conclusions
8.2 Future work
Bibliography
Appendix A: Test Networks for Experiments
Appendix B: Test Networks for Sensitivity Analysis
List of Tables
Table 1: Route data specification
Table 2: Total travel times for 14 trains (in hours)
Table 3: Route data specification - with aggregation
Table 4: Comparison of results from IP model with aggregation and simulation model
Table 5: Settings for network topology parameters for double-track simulations
Table 6: 27 parameter combinations considered in the one-third fractional factorial design for double-track simulations
Table 7: Backward elimination for double-track sub-network elimination
Table 8: Validation of the double-track delay model. 1-5 are from the 54 unused topological sub-network configurations. 6 is a real rail network.
Table 9: Validation of the double-track delay model. Case 1: low traffic intensities. Case 2: medium traffic intensities.
Table 10: Settings for network topology for single-track simulations
Table 11: 27 parameter combinations considered in the one-third fractional factorial design for single-track simulations
Table 12: Backward elimination for single-track sub-network simulations
Table 13: Validation of the single-track delay model. 1-5 are from the 54 unused network configurations. 6 is an actual rail network.
Table 14: Validation of the single-track delay model. Case 1: low traffic levels. Case 2: medium traffic levels.
Table 15: Genetic algorithm: initial routings
Table 16: Experimental test network configurations
Table 17: Problem sizes
Table 18: Comparison of lower and upper bounds
Table 19: Comparison of objective values (and solution times in minutes) of the solution procedures
Table 20: Comparison of average travel times (minutes)
Table 21: Comparison of average delays (minutes)
Table 22: Parameters for test network 5
Table 23: Sensitivity analysis results for test network 5
Table 24: Parameters for test network 6
Table 25: Sensitivity analysis results for test network 6
Table 26: Parameters for test network 7
Table 27: Sensitivity analysis results for test network 7
List of Figures
Figure 1: Railway network between Downtown Los Angeles and Inland Empire trade corridor
Figure 2: Research contribution
Figure 3: Conversion from a physical network to a general network
Figure 4: An example of aggregation
Figure 5: A portion of the rail network near Downtown Los Angeles
Figure 6: Union Pacific-Alhambra rail network near Downtown Los Angeles
Figure 7: Illustration of a sub-network to be aggregated
Figure 8: Regression equation for delay
Figure 9: Regression equation for delay with quadratic effects
Figure 10: Regression equation for delay with quadratic and cross-product terms
Figure 11: Regression equation with quadratic, cross-product and cubic effects
Figure 12: A double-track railway segment with 7 moving trains at the time train is deciding to enter on either A or B
Figure 13: Residuals vs. predicted response for double-track sub-network simulation
Figure 14: Normal probability plot for double-track sub-network simulation
Figure 15: Detailed regression results for model 6 in the double-track simulation
Figure 16: Detailed regression results for model 6 in the single-track simulation
Figure 17: An illustrative network for genetic algorithm
Figure 18: Genetic representation of chromosomes
Figure 19: A depiction of the crossover operation
Figure 20: After applying the crossover repairing algorithm
Figure 21: A depiction of the mutation operation
Figure 22: After applying the mutation repairing algorithm
Figure 23: Experimental design: skeleton networks
Figure 24: Graphical comparison of delay values
Figure 25: Skeleton structure for test network 5
Figure 26: Variation in delay values: test network 5
Figure 27: Variation in solution times: test network 5
Figure 28: Skeleton structure for test network 6
Figure 29: Variation in delay values: test network 6
Figure 30: Variation in solution time: test network 6
Figure 31: Skeleton structure for test network 7
Figure 32: Variation in delay values: test network 7
Figure 33: Variation in solution time: test network 7
Figure 34: Test network 1
Figure 35: Test network 2
Figure 36: Test network 3
Figure 37: Test network 4
Figure 38: Test network 5
Figure 39: Test network 6
Figure 40: Test network 7
Abstract
In the United States, railways are the major means of moving goods transcontinentally
from ports to various inland destinations. Due to mergers and abandonment of rail
lines, there has been a reduction in the track capacity, concentrating rail traffic to fewer
lines. In addition to this, the growth in the number of containers has already introduced
congestion and threatened the capacity of the rail network system in many corridors.
There is a need among U.S. freight railroads for better analytical tools to manage their
capacity and scheduling. A challenging problem for railroad companies is to be able to
plan the traffic and operating conditions over a network so that deadlocks are avoided and
travel-times are below a threshold. This requires estimating travel-times and delays in a
network, and determining the most efficient method of scheduling a set of trains.
The focus of this research is the development of a decision tool that can aid train
planners in developing good quality routes and schedules, on a daily or weekly basis,
within a short amount of time, to better manage the limited track capacity available for
train movements. In daily operations, dispatchers use these plans to decide the movement
of trains through a network. Due to extraneous factors such as train breakdowns, track
maintenance etc., routes might need to be altered on a real-time basis. Dispatchers ensure
that any such deviations from the planned routes and schedules are kept to a minimum, so
as to minimize congestion and avoid deadlocks. In addition, dispatchers resolve conflicts
between train schedules in real-time and determine priorities when trains meet at
junctions, crossings and sidings. In this work, we concentrate on railroad
routing and scheduling alone.
Towards achieving this goal, we develop an integer programming-based railway
capacity management model that is capable of assigning trains to routes based on the
statistical expectation of running times in order to balance the railroad traffic. This model
is also capable of determining the best release times for trains to depart from origin
stations and enter a network. We also present a simulation-based delay estimation
methodology that can estimate the travel-time delay over any given single-track or
double-track rail network. These estimates can be used to route and schedule trains to
either ease or avoid congestion in a network or a section of it.
At freight railroad companies in the United States, planners typically route and
schedule trains, on a daily or weekly basis, for networks up to 50 miles long, depending
on the prevailing traffic conditions. For this reason, efficient routes and schedules that
improve capacity utilization need to be generated frequently and quickly. For this
purpose, we develop solution techniques using approximation procedures and
evolutionary methods to solve the aforementioned capacity management model in a
reasonable amount of time. Our experimental results show that our recommended
solution procedure is capable of lowering current real-world delays by up to 30%. As a
whole, this research represents an original effort in developing a quantitative model to
tactically plan the movement of trains through a complex network, with decisions based
on an accurate representation of the delays these trains cause on the railroad and the
possibility of rerouting trains to alternative tracks in real time.
Chapter 1: Introduction
1.1 Motivation
The freight transportation system in the United States relies heavily on railroads.
According to a study conducted by Hillestad et al. (2009), railroads transport
approximately 40% of the traffic, by freight ton-miles, within the U.S. Trains
primarily transport time-insensitive containers and bulk goods such as coal, oil,
machinery, automobiles etc. over distances of 400 miles and above. In the past few decades,
due to consolidations between rail companies and closure of certain tracks, the focus on
service quality has dropped and the focus on margin improvement has increased. While
rates of return have improved, at present U.S. railroads do not earn their cost of capital
(Winston, 2005).
Alongside the declining track capacity is the steady growth in rail movement of coal,
chemicals, grain and other bulk commodities. For example, according to the Association
of American Railroads (AAR), there was a 25% increase in rail traffic between 1997 and
2007. AAR also expects the freight rail volume to increase by 88% by 2035,
resulting in 55% of the rail network being congested. As global trade continues to
increase, cargo traffic at the nation's ports continues to increase at dramatic levels. The
Ports of Los Angeles and Long Beach (San Pedro Bay Ports) are among the busiest ports
in America. In 2007, the twin ports accounted for approximately 33% (11 million TEU)
of the total container traffic through the U.S. seaports. The total volume that these ports
handle is roughly evenly divided between transcontinental and local shipments.
Furthermore, a large portion of the local shipments are re-packaged and/or sorted at local
warehouse facilities for re-shipment across the continent. Railways form the major mode
to transport these goods across North America over distances of 200 miles or more. The
growth in the number of containers has already introduced congestion and threatened the
accessibility and capacity of the rail network system in the Los Angeles area [see
Leachman (2002)]. Major railroad companies like Union Pacific, Burlington Northern
Santa Fe, CSX and Norfolk Southern have to cope with traffic volumes close to
meltdown conditions. While some strategic actions have been taken of late by rail
managements (e.g., surcharges and embargoes, strategic re-routing of train movements
onto alternative lines or alternative times of week), capabilities of the industry in this
regard are still very weak.
1.2 Existing rail planning methodology
For a given railroad network, planning can be broadly classified into three levels, namely,
train routing, train scheduling (or timetabling) and train dispatching [see Cordeau et al.
(1998), D'Ariano (2008)]. Train routing, done at a strategic level, determines the track
segments each train needs to be routed on, so as to minimize the expected delay for each
train, while considering the capacity of these segments. The routing problem also deals
with assigning origin-destination pairs to cars, assembling and disassembling cars into
blocks, and grouping and ungrouping blocks into trains. The train scheduling problem
addresses the issue of developing operational timetables, usually for passenger trains,
taking into account train speeds, and headways and buffer time between trains. This is
performed at the tactical level. The train dispatching problem deals with real-time
operations involving precise synchronization of freight and passenger train movements
on the lines of the physical railway network. Given a train timetable, the train dispatching
problem determines a feasible plan of meets and overtakes that satisfies a system of
constraints.
Railroad companies have dispatchers who assist in real-time routing, in deciding
priorities during meets between opposing trains and overtakes between trains in the same
direction, and in scheduling of trains at stations and junctions. First, planners detail routes
for each train and release times from stations based on their earliest possible departure
times. This is done either daily or weekly, depending on the traffic mix. In real-time
dispatching, a dispatcher uses the previously developed plans as a basis for routing and
scheduling trains. However, due to unforeseen circumstances such as new trains in the
network, track outages, locomotive breakdowns etc., routes might need to be updated at
short notice. Planners and dispatchers perform their respective operations using fairly
simple procedures, in addition to using their own experiences. The capacity and delay
over a certain section of a network change with time, and are a function of the traffic
volume, train lengths, acceleration and deceleration rates, speed-limits etc. For these
reasons, simple greedy procedures are not capable of routing trains based on the expected
delay in a certain network section. There is immense opportunity and a need to design
efficient modeling procedures that can route trains based on expected wait times and
schedule trains to minimize interferences between trains at stations and junctions. In the
research community, the planning and dispatching models have been extensively studied
in attempts to model various parameters of complex train movements, and to solve these
models as quickly as possible. Some examples of planning models include Ahuja (2007),
Barnhart (2000), Carey and Lockwood (1995) and Crainic et al. (1984). Some examples
of dispatching models include Caprara et al. (2002), D'Ariano (2008), Dessouky et al.
(2006) and Suteewong (2006). However, most of the prior work in railroad operations
has been in the area of scheduling as part of the dispatching operation. That is, they deal
with resolving conflicts and setting priorities when trains meet and pass each other at
junctions, crossings and sidings.
This research falls in the broad area of scheduling trains as part of the railway
planning process. The focus of this research is the development of a decision tool that can
aid train planners in developing good quality routes and schedules within a short amount
of time, to better manage the limited track capacity available for train movements. In this
research work, we focus on train routing and train scheduling. By routing, we mean that,
given a set of possible routes from an origin to a destination or a set of destinations, a
train is made to travel on the route with the least expected delay. Delays occur when trains
traveling in either the same direction or opposite direction meet, thus requiring one of the
trains to be pulled over for the other to cross or overtake it. This operation would need at
least one of the trains involved to either slow down or come to a complete halt. The time
taken to decelerate to slow down or stop, the duration for which the train is stationary, and the
time taken to accelerate to the original speed together constitute delay. Routing is
typically done at a macro level, that is, it does not deal with which siding to travel on,
which crossing should be used, which track segment of a double- or triple-track the train
should travel on etc. We note that most of the prior work in the area of railway planning
deals with railway scheduling without routing. However, models that address both
routing and scheduling aspects make it easier to find a timetable for all the planned trains,
and hence improve service levels and reduce delays.
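As a rough illustration of the delay definition given above (deceleration time, plus stationary time, plus acceleration time), the following Python sketch adds up the three components for a single train that stops for a meet. The function name, the assumption of constant acceleration and deceleration rates, and the numeric values are hypothetical and chosen only for illustration; they are not taken from this dissertation.

```python
def meet_delay_minutes(cruise_speed_mph, decel_mph_per_min, accel_mph_per_min, dwell_min):
    """Delay for one train that stops for a meet or overtake.

    Follows the definition in the text: time to decelerate to a stop,
    plus time spent stationary, plus time to accelerate back to the
    original speed. Constant rates are an illustrative assumption.
    """
    if min(cruise_speed_mph, decel_mph_per_min, accel_mph_per_min) <= 0:
        raise ValueError("speed and rates must be positive")
    if dwell_min < 0:
        raise ValueError("dwell time cannot be negative")
    time_to_stop = cruise_speed_mph / decel_mph_per_min    # minutes spent braking
    time_to_resume = cruise_speed_mph / accel_mph_per_min  # minutes spent regaining speed
    return time_to_stop + dwell_min + time_to_resume


# Hypothetical values: 60 mph cruise, 20 mph/min braking, 10 mph/min acceleration, 5 min wait.
print(meet_delay_minutes(60, 20, 10, 5))
```

A train that only slows down rather than stopping would have a zero dwell time and a smaller speed change, so the same three-term sum applies with different inputs.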
1.3 Proposed methodology
In the past, researchers have addressed the problem of minimizing travel time delays by
developing techniques to streamline blocking and yard operations [Ahuja (2008)],
locomotive assignment [Vaidyanathan (2008)], railway routing [Carey and Lockwood
(1995)], timetabling [Brännlund (1998)] and rail crew scheduling [Caprara et al. (1997)]
among others. The work presented in this dissertation addresses strategies to minimize
network congestion through efficient utilization of the existing rail network capacity,
brought about by efficient railway routing and scheduling. As a whole, this research
represents an original effort in developing a quantitative model to tactically plan the
movement of trains through a complex network, with decisions based on an accurate
representation of the delays these trains cause on the railroad and the possibility of
rerouting trains to alternative tracks in real time. The former is important since in many urban
areas, like Southern California, there are several different rail routes. For example, there
are three distinct rail lines from the Colton crossing area to the Downtown area (two
served by UP and one by BNSF). The capability to balance the freight rail traffic along
the three routes has the potential to significantly reduce train delays in the area.
This research comprises two main parts – an integer programming (IP)-based
integrated routing and capacity management model and a regression-based procedure to
estimate delay. The IP model assigns trains to routes that minimize travel time delays, in
order to balance the traffic in a railway network. This model accounts for track capacities,
speed limits and train speeds, and includes deadlock avoidance constraints. Because this
integer program is NP-hard, solving it directly to plan the movement of trains over large
real-world rail networks is impractical. So, an approximation
technique known as “aggregation” is used to aggregate suitable sections of a network in
order to keep the model tractable. An estimate of the expected travel time delay with
traffic in the aggregated section is fed to the integrated routing and capacity management
model so that trains can estimate, well in advance, the delay they could experience along
each possible route to their destination. In this way, the IP model routes and schedules
trains and balances the traffic to minimize the travel time delay for all trains in a network.
To estimate delays over aggregated network sections, we develop a regression-based
delay estimation methodology that predicts delay as a function of traffic volume, number
of trains of different types (each type has its own length, speed-limit, acceleration and
deceleration rates), network topology and operating conditions such as headways. The
delay estimation equations are approximated to be linear regression models, and therefore
maintain the linearity of the integrated routing and capacity management IP model.
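As a rough illustration of why linear delay equations make route balancing attractive, the following sketch enumerates every way of splitting a small set of identical trains across three routes. It is not the IP model of this dissertation: the route coefficients are hypothetical, the trains are assumed identical, and brute-force enumeration stands in for an integer programming solver.

```python
from itertools import product

# Hypothetical linear delay equations, one per route:
# delay per train (minutes) = intercept + slope * (trains on that route).
# The coefficients are illustrative assumptions, not values from this work.
ROUTES = {
    "UP-Alhambra": (2.0, 1.5),
    "UP-San Gabriel": (3.0, 1.0),
    "BNSF": (1.0, 2.0),
}


def total_delay(counts):
    """Total delay over all trains when counts[r] trains share route r."""
    return sum(n * (ROUTES[r][0] + ROUTES[r][1] * n) for r, n in counts.items())


def best_split(num_trains):
    """Enumerate every split of identical trains across routes; keep the cheapest."""
    names = list(ROUTES)
    best = None
    for split in product(range(num_trains + 1), repeat=len(names)):
        if sum(split) != num_trains:
            continue
        counts = dict(zip(names, split))
        cost = total_delay(counts)
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best


cost, counts = best_split(6)
# Baseline: send all six trains down the single best route.
single = min(
    total_delay({r: (6 if r == name else 0) for r in ROUTES}) for name in ROUTES
)
print(counts, cost, single)
```

In this toy setting, spreading the trains over the routes yields a lower total delay than sending all of them down any single route, which is the balancing effect described above.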
The third contribution of this research is a set of solution procedures for the IP
model. We develop two procedures – one based on a linear relaxation formulation of the
IP, and the second based on routing constraints. Using these procedures, we determine only
the routes of the trains and their release times from their origin stations. As we explain
later on in this dissertation, these play a major role in determining the delays experienced
somewhere down the line in the network. We also develop a genetic algorithm procedure
to improve the quality of the routes and release times generated by the first two solution
procedures. Finally, using the construction heuristic of Lu et al. (2004), we obtain
real-world delay and travel-time estimates for the trains over a network. This heuristic
considers all the complexities and non-linearities of the train movement process.
1.4 Complexity of the IP model
As mentioned previously, our main goal is to develop a decision tool that can aid train
planners in developing efficient routing plans and schedules over medium-scale
networks, which are 30-50 miles long, with a relatively short computational time. Figure
1 below gives an example of a medium-scale network from the Los Angeles area rail
network.
Figure 1: Railway network between Downtown Los Angeles and Inland Empire trade
corridor
There are approximately 80 trains that travel on the above network from Alameda
Corridor (not shown in the figure) to the East of Colton Crossing, and back. There are 3
routes available – UP-Alhambra, UP-San Gabriel and BNSF. Dispatchers would need to
select a route for each train as it enters the above network from either end, and also
determine release times and priorities during meets and overtakes. Our research addresses
the first two decisions, whereas the decision of determining priorities is done in a
dispatching model and is beyond the scope of this research.
If we were to develop an integrated routing and capacity management model for the
above network with its daily traffic, we would end up with easily over 120 nodes, and
half a million binary variables and constraints. This is the reason why we aggregate
suitable sections of a network to reduce the number of nodes and arcs. A challenging
decision to make is the degree of required aggregation. The higher the degree of
aggregation, the smaller the network, but the less accurate the final
delay and travel-time estimates from the construction heuristic. This is a major trade-off
that we explore in the sensitivity analysis section of this dissertation.
Another issue related to complexity concerns the delay estimation equations
generated through regression models. As explained by Krueger (1999), delays over a rail
network section are not linear in nature. This is because of the highly non-linear nature of
the physics of train movement – how they accelerate and decelerate. Similar to that paper,
our delay estimation methodology determines an exponential relation between delays and
traffic, operating and network parameters. However, in the interest of preserving the
linearity of our IP model, we approximate this non-linear relationship with a linear one,
without sacrificing significantly on the adjusted R-squared value.
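The trade-off between an exponential and a linear delay equation can be sketched as follows. The data are synthetic and the coefficients hypothetical; the sketch uses the plain (not adjusted) R-squared and a single explanatory variable, so it only illustrates the idea and is not the estimation procedure of this dissertation.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against observations y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data: delay grows exponentially with traffic volume.
# The coefficients 2.0 and 0.05 are assumptions chosen for illustration.
trains = np.arange(5, 31, dtype=float)
delay = 2.0 * np.exp(0.05 * trains)

# Exponential model: fit a straight line to log(delay), then transform back.
k, log_a = np.polyfit(trains, np.log(delay), 1)
delay_exp = np.exp(log_a + k * trains)

# Linear approximation: fit a straight line to the delay itself.
slope, intercept = np.polyfit(trains, delay, 1)
delay_lin = intercept + slope * trains

r2_exp = r_squared(delay, delay_exp)
r2_lin = r_squared(delay, delay_lin)
print(f"exponential R^2 = {r2_exp:.4f}, linear R^2 = {r2_lin:.4f}")
```

Over a moderate range of traffic volumes the straight line loses only a little explanatory power, which is what makes the linear approximation acceptable inside a linear IP model.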
1.5 Dissertation layout
In Chapter 2, we provide a detailed survey of the existing literature in railroad planning.
We have categorized our survey into papers on railway routing and scheduling models
and solution approaches, and railway delay estimation methods.
In Chapter 3, we describe the similarity of our routing and scheduling methodology
to job-shop scheduling, and list our research contributions.
In Chapter 4, we present an integer programming (IP)-based integrated capacity
management model, give examples of problem sizes for medium-scale networks, and present
approximation methods to reduce the complexity of the IP model. We also present results
from experiments run on a small-scale network to show the importance of the quality of
the routes and release times from origin stations in keeping the overall delay in the
system at a minimum.
In Chapter 5, we discuss a methodology to develop generic delay estimation
equations that reflect how the delay in an aggregated single-track or double-track
section of the network varies with each additional train. Once a delay estimation equation
has been generated by this technique, it can be used to estimate the delay and capacity of
a network section with physical attributes within the range of those of the generic
networks used in the experiments. That is, individual simulations need not be run for
every aggregated network section. Using a fractional factorial experimental design, we
run simulations to develop an exponential relation between travel time delay and traffic,
operating and network topology parameters.
In Chapter 6, we present approximation-based solution procedures to solve the
integrated routing and capacity management model for large, complex rail networks. The
outputs from the IP model are the routes and release times (or schedules) from the origin
stations into the network for all trains. The IP model makes a few assumptions to keep
the problem size and computational time low. To test the quality of the solutions
generated by the IP model under real world conditions, the routes and release times are
fed into a simulation model developed by Lu et al. (2004), which has been validated by
the authors on actual rail networks in the Los Angeles area. The simulation model uses a
construction heuristic to schedule the trains for the intermediate track segments.
Additionally, a genetic algorithm procedure is developed to improve upon the
quality of the routes and schedules produced by the approximation solution methods. The
best solution procedure is obtained by running experiments on multiple real-world and
randomly generated test rail networks. The computational times of the recommended
solution approach and its sensitivity to the degree of aggregation and traffic volumes are
presented in Chapter 7. Finally, in Chapter 8, a few concluding remarks and opportunities
for subsequent research are provided.
Chapter 2: Literature Review
There is clearly a need among US freight railroads for better analytical tools to manage
their capacity and scheduling. In order to have the ability to minimize travel times and
delays in delivering freight goods, we should be able to have each train start from its
origin at a given time and travel on a route that minimizes travel time, meet/pass
interferences and expected delays. To accomplish this, it is imperative to have an
integrated routing and capacity management model, instead of a simple scheduling model
that assumes the routes to be fixed. Although there exists a slew of problems that have
been studied in rail transport research, there has been little work in developing an
integrated approach for the capacity planning and scheduling problems that considers all
the intricate aspects of rail operations. In particular, in the capacity planning area, there
are no models to guide planners in determining how to schedule a train based on its effect
on the travel-time delay in a network. Although there has been some work in the
scheduling area, it does not take into account alternative routes. In sections 2.1 and 2.2
we present an overview of the prior work done in the areas of rail routing and scheduling,
solution approaches to these models, and delay estimation methods.
2.1 Rail routing and scheduling
There are many problems that fall into the general category of the routing problem. These
include blocking models and makeup models. The railroad blocking model determines
which cars should be assembled into which blocks at which yards, and places emphasis
on the movement of cars as opposed to the movement of trains. Newton (1996) models
this as a network budget design problem with nodes as yards and arcs as potential blocks.
Constraints model capacities at the nodes and restrictions on the paths for each
commodity. The blocks are designed as virtual arcs which a commodity might use to
have an uninterrupted service between terminals that are not necessarily connected by a
physical link. The objective is to minimize the cost of delivering all the commodities.
This problem is solved using a column generation approach along with a
branch-and-bound algorithm. Barnhart et al. (2000) formulate the blocking problem as a network
design problem with maximum degree and flow constraints on the nodes. They solve this
problem using a Lagrangian relaxation approach and use subgradient optimization to
solve the Lagrangian dual. Ahuja et al. (2007) propose a very large-scale neighborhood
(VLSN) search algorithm to solve real-life instances of the blocking problem. This
algorithm has two main subroutines: constructing the initial feasible solution, and
re-optimizing the blocking arcs emanating from a node. The authors show that the algorithm
can solve the blocking problem for real-world scenarios in about 2 hours. The makeup
model assigns blocks to trains, and the routing model determines the routing and
frequency of trains. These papers on the blocking problem do not account for the routing
decisions that go hand-in-hand with the blocking problem in the real world. Hence, the
routes that the blocks travel on might be far from optimal, thereby resulting in large
delays. Crainic et al. (1984) propose a nonlinear mixed-integer multicommodity flow
problem that integrates the blocking, the makeup and the routing models all together.
Keaton (1992) uses Lagrangian relaxation for solving similar combined blocking and
routing problems.
There is no dearth of research on train scheduling. However, most
of the work is specific to passenger trains which, unlike freight trains, follow
fixed timetables. Nevertheless, we list some of the most recent and important literature in
this area, and note the various solution procedures that have been used to solve this
complex problem for real-world instances. Cai and Goh (1994) propose an algorithm for
the train scheduling problem, based on local optimality criteria in the event of a potential
crossing conflict. The authors obtain a suboptimal, yet feasible, solution in
polynomial time. The model can also be generalized to the possibility of overtaking when
trains have different speeds. Huntley et al. (1995) propose a routing and scheduling
model with an objective to minimize operational costs. It is solved using simulated
annealing techniques and the output is the sequence of train links to be followed by each
demand block from origin to destination. Higgins et al. (1995, 1996) address the problem
of scheduling trains on a single line track when the priority of each train in a conflict
depends on an estimate of the remaining crossing and overtaking delay. This priority is
used in a branch and bound procedure to allow the determination of optimal solutions.
Carey and Lockwood (1995) propose a model, algorithms and strategies for the
unidirectional train pathing (or routing) and timetabling problem for passenger trains. As the
pathing problem is computationally intensive, they develop heuristics and a branch and
bound method to solve sub-problems to route a single train while holding the order of all
the trains fixed. They propose dealing with more general rail networks by decomposing
the network into rail corridors, and repeatedly applying the single line pathing approach
of their paper. Hallowell and Harker (1996) develop an analytical line delay model for
analyzing rail line-haul operations, and they validate the model as a predictive tool.
Hallowell and Harker (1998) extend this earlier model as a prescriptive tool for
generation of train schedules. They consider the problem of scheduling trains on a
partially double track rail line by accounting for delays due to meet/pass interference. The
model incorporates dynamic meet/pass priorities in order to approximate an optimal
meet/pass planning process. Gorman (1998) considers an integrated routing and
scheduling problem, and uses a genetic search algorithm to find acceptable solutions. The
results from this algorithm are further improved using a tabu-search procedure. The
model has binary variables for potential train services and assignment of demand blocks
to a train. The objective is to minimize the sum of fixed costs of trains and marginal cost
per car. The tabu-enhanced search procedure is shown to have a potential for a 6%
reduction in late services. Caprara et al. (2002) propose a graph theoretic model for the
timetabling problem on a one-way, single track network linking two stations, with many
intermediate stations. This work is applicable to passenger trains, and addresses
constraints such as track capacities and operational restrictions. They develop an integer
program model in which nodes correspond to departure or arrivals at a certain station at a
given time instant. A Lagrangian relaxation of the formulation is embedded within a
heuristic algorithm that uses the dual information associated with the Lagrangian
multipliers.
Using a discrete event model of train movements along the lines of a network, Kraay
et al. (1991) and Kraay and Harker (1995) study the problem of train dispatching, and
present techniques to make train schedules robust, so that they require little modification
should there be a perturbation in the rail network. Şahin (1999) studies rescheduling of
trains by modifying meet/pass plans in conflicting situations in a single-track railway.
Dorfman and Medanic (2004) develop a local feedback-based travel advance strategy.
This approach is shown to quickly handle perturbations in the schedule and performs
well, while minimizing deviations. A capacity check algorithm keeps track of deadlocks
and works towards preventing them. Dessouky et al. (2006) propose a mixed integer
programming model for train dispatching in a complex network by considering speed
limits and headways. They use branch and bound techniques to solve the problem.
D'Ariano et al. (2007) study the passenger train scheduling problem during real-time
traffic control. When train operations differ from the planned timetable, a new
conflict-free timetable needs to be computed, so that the deviation from
the original is minimal. The authors model this as a large job-shop scheduling problem
with no-store constraints. A branch and bound algorithm is developed and tested on a
bottleneck area of the Dutch rail network, and is shown to find optimal or near-optimal solutions
within short computational times. In most of the research work done so far on railway
routing, the routing almost always refers to railway car routing and not train routing.
Also, most of the papers mentioned above present models that assume train routes to be
known, thus concentrating on train scheduling and dispatching alone.
2.2 Capacity management and delay estimation
For efficient and cost-effective train routing, a modeling strategy needs to account
for track capacity. Ideally, we would like to route the trains such that the various sections
of the track infrastructure are, more or less, uniformly utilized. This would reduce travel
time and the probability of experiencing knock-on delays. More importantly,
incorporating capacity constraints into routing and scheduling models enables us to
determine if the existing trackage can handle any new trains, and identify the sections
where additional tracks could be added in order to increase the capacity of the entire
network. A network (or sub-network) is saturated if the addition of a new train results in
a sudden increase in the delays for any or all trains. Thus, capacity analysis is vital in
identifying the traffic thresholds at which travel times would exceed service standards.
Some work has been done in developing capacity management models. Burdett and
Kozan (2006) define absolute capacity as the theoretical capacity when critical sections
are saturated, and actual capacity as the capacity of the network after incorporating the
interference delays. As can be expected, capacity is dependent on the train mix and
capacity analysis is, therefore, an iterative procedure.
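The notion of saturation described above can be made concrete with a simple check on a delay curve. The sketch below is an assumption-laden illustration: the delay values are invented, and the rule of flagging a jump when the marginal delay exceeds a multiple (`jump_factor`, a hypothetical parameter) of the previous marginal delay is just one possible way to operationalize a "sudden increase".

```python
def saturation_point(avg_delay, jump_factor=3.0):
    """Return the train count at which delay first jumps sharply, else None.

    avg_delay[i] is the average delay observed with i + 1 trains in the
    network. A jump is flagged when the delay added by one more train exceeds
    jump_factor times the delay added by the previous train.
    """
    for i in range(2, len(avg_delay)):
        prev_increase = avg_delay[i - 1] - avg_delay[i - 2]
        increase = avg_delay[i] - avg_delay[i - 1]
        if prev_increase > 0 and increase > jump_factor * prev_increase:
            return i + 1  # number of trains once the new train is added
    return None


# Hypothetical average delays (minutes) for 1, 2, ..., 7 trains.
observed = [1.0, 1.2, 1.5, 1.9, 2.4, 6.0, 14.0]
print(saturation_point(observed))
```

With these invented numbers the sixth train is the one whose addition triggers the jump; a real study would derive the curve from simulation or field data.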
2.2.1 Analytical models
One of the earliest analytical models on capacity and delay assessment was
developed by Frank (1966). He studies delay on a single track with unidirectional and
bidirectional traffic. By allowing only one train on each link between sidings and assuming
a single train speed and deterministic travel times, he estimated the number of trains that
could travel on the network. Petersen (1974) extends this work to accommodate two
different train speeds. He assumed independent and uniformly distributed departure
times, equally spaced sidings and a constant delay for each encounter between two trains.
Chen et al. (1990) extend Petersen's model to present a technique to calculate delay for
different types of trains over a specified single track section as a function of the schedules
of the trains and the dispatching policies. They assumed sidings to be equally distributed,
that faster trains can overtake slower trains, meets and overtakes occur only between 2
trains at a time, and there exists a fixed probability $P_{i,j}$ of a train i getting delayed by a
train j. This modeling technique was extended by Harker et al. (1990) to a partially
double-track rail network which consisted of a single-track section with sidings and
double-track sections. Similar to the previous work, trains depart according to their
scheduled departure times. The train to be delayed during a meet (or overtake) is
determined by a trade-off between the lateness of the train with respect to its schedule
and the overall priority of the train. Carey et al. (1994) study the effects of knock-on
delays between two trains on a single-track. They used non-linear regression to develop
stochastic approximations of the relation between scheduled headways and knock-on
delays, and tested these approximations by conducting detailed stochastic simulation of
the interactions between trains as they traverse sections of the network. Özekici et al.
(1994) use Markov chain techniques to study the effects of various dispatching patterns
and arrival patterns of passengers on knock-on delays and passenger waiting times. Given
a travel time probability density function for a train on a track link, a departure time
transition matrix was constructed for the calculation of the expected departure delay.
Higgins et al. (1998) present an analytical model to quantify the positive delay for
individual passenger trains, track links and schedule as a whole in an urban rail network.
The network they considered has multiple unidirectional and bidirectional tracks,
crossings and sidings. Carey (1999) uses heuristic measures to estimate knock-on delays
and build reliable schedules. Yuan (2006, 2008) proposes probability models that provide
a realistic estimate of knock-on delays and the use of track capacity. The proposed model
reflects speed fluctuation due to signals, dependencies of dwell times at stations and
stochastic interdependencies due to train movements. D'Ariano (2008) studies delay
propagation by decomposing a long time horizon into tractable intervals to be solved in
cascade, and using advanced Conflict Detection and Resolution with Fixed Routes
(CDRFR) algorithms. These algorithms are used to detect and globally solve train
conflicts on each time interval.
Queuing theory is another methodology that has been used for estimating delay in
railroads. Greenberg et al. (1988) present queuing models for predicting dispatching
delays on a low speed, single track rail network supplemented with sidings and/or
alternate routes. Train departures are modeled as a Poisson process, and the slow transit
speed and deterministic travel times enable them to travel with close headways. This
work assumes sidings to have infinite capacity. Huisman et al. (2001) investigate delays
to a fast train caught behind slower ones by capturing both scheduled and unscheduled
movements. This is modeled as an infinite server G/G/∞ re-sequencing queue, where the
running time distributions for each train service are obtained by solving a system of linear
differential equations. Wendler (2007) presents an approach for predicting waiting times
using an M/SM/1/∞ queuing system with a semi-Markovian kernel. The arrival process is
determined by the requested train paths. The description of the service process is based
on an application of the theory of blocking times and minimum headway times.
A bottleneck approach is one way to determine the absolute capacity of a network,
by identifying the maximum number of trains that can travel through the track segments
constituting a bottleneck in a given time period. De Kort et al. (2003) consider the
problem of determining the capacity of a planned railway infrastructure layout under
uncertainties for an unknown demand of service. The capacity assessment problem for
this generic model is translated into an optimization problem. Burdett and Kozan (2006)
develop capacity analysis techniques and methodologies for estimating the absolute
(theoretical) traffic carrying ability of facilities over a wide range of defined operational
conditions. Specifically, they address the factors on which the capacity of a network
depends, namely, the proportional mix of trains, direction of travel, length of trains,
planned dwell times of trains, the presence of crossing loops and intermediate signals in
corridors and networks. Gibson et al. (2002) also develop a regression model to define a
correlation between capacity utilization and reactionary delay. Landex et al. (2006) and
Kaas (1998) discuss techniques to calculate capacity utilization for railway lines with
single and multiple tracks, as per the UIC (International Union of Railways) 406 method.
Abril et al. (2008) present an automated tool that is able to perform several capacity
analyses. Their results show how the capacity varies according to factors such as train
speed, location of stations, train heterogeneity, distance between railway signals and
timetable robustness.
2.2.2 Simulation models
Simulation techniques can be used to study direct, knock-on and compound delays and
ripple effects from conflicts at complex junctions, terminals, railroad crossings, network
topology, train and traffic parameters. The compound interaction effects of these factors
cannot be effectively captured in an analytical delay estimation model. Petersen et al.
(1982) present a structured model for rail line simulation. They divide the rail line into
track segments representing the stretches of track between adjacent switches and develop
algebraic relationships to represent the model logic. Dessouky et al. (1995) use a
simulation modeling methodology to analyze the capacity of tracks and delay to trains in
a complex rail network. Their methodology considers both single and double-track lines
and is insensitive to the size of the rail network. Their model has a distinctive advantage
of accounting for track speed-limits, headways, and actual train lengths, speed-limits,
acceleration and deceleration rates in order to determine the track configuration that
minimizes congestion delay to trains. This work is extended by Lu et al. (2004).
Hallowell et al. (1998) improve upon the work by Parker et al. (1990) by incorporating
dynamic meet/pass priorities in order to approximate an optimal meet/pass planning
process. Extensive Monte Carlo simulations are conducted to examine the application of
an analytical line model for adjusting real-world schedules to improve on-time
performance and reduce delay. Krueger (1999) uses simulation to develop a regression
model to define the relationship between train delay and traffic volume. The parameters
involved are network parameters, traffic parameters and operating parameters.
Chapter 3: Research Accomplishments
The work presented in this dissertation addresses strategies to improve utilization of the
existing capacity of a rail network through efficient railway routing and scheduling. As a
whole, this research represents an original effort in developing a quantitative model to
tactically plan the movement of trains through a complex network, with decisions based
on an accurate representation of the delays these trains cause on the railroad and the
possibility of rerouting trains to alternative tracks in real time. The latter is important
since in many urban areas, like Southern California, there are several different rail routes.
For example, there are three distinct rail lines from the Colton crossing area to the
Downtown area (two served by UP and one by BNSF). The capability to balance the
freight rail traffic along the three routes has the potential to significantly reduce train
delays in the area.
3.1 Research gap
A vast majority of the literature in railway capacity management deals with methods to
estimate capacities and delays, and to ensure that real-time perturbations on a rail
network can be managed with as few changes as possible to the railway timetable. To the
best of our knowledge, there has not been any effort from the research community to
improve rail track capacity use through efficient routing and scheduling jointly. In
Chapter 2, we noted that most routing models concentrate on modeling blocking and
make-up operations. We believe the reason for this is that railroad companies in the U.S.
invest money and time to fine-tune these operations through sophisticated modeling
procedures. At the same time, greedy approaches are adopted by these companies when it
comes to improving track utilization and controlling traffic to avoid deadlocks. In this
research, we explore the possibility of developing decision tools that could handle rail
routing and scheduling in a more sophisticated and efficient manner, compared to simple
greedy heuristics.
Although Carey and Lockwood (1995) address the train pathing problem, their
technique only provides a local solution since each train is routed individually. Moreover,
similar to many papers, they test their routing model on a single unidirectional line,
thereby not accounting for the complexities and non-linearities of railway operations.
More importantly, their modeling and solution approaches cannot be applied to networks
that are of medium (30-50 miles) or large scale (75-200 miles).
Most research papers on delay estimation and capacity assessment for railway
networks do not explicitly consider the vital and complex interactions between traffic,
operating and network parameters. In the case of the analytical models, strong simplifying
assumptions are made in order to keep the problem tractable. Moreover, these models
may be incapable of recognizing the dynamic nature of
capacity and knock-on delays involving more than two trains. Delays typically occur at
junctions where two or more rail tracks merge into one, and at railway crossings. Delays
get propagated through a network resulting in multiple trains being slowed down or
coming to a halt. When a train slows down or stops, some additional time depending on
the signal time and acceleration rate is required before it can reach maximum speed once
again. As pointed out by Dessouky and Leachman (1995), modeling each track segment
separately as a G/D/1 queue would not capture interaction between trains on different
track segments. More often than not, delay or capacity estimation is unlikely to be the
final step in railway operations planning. Instead, a planner might use these estimated
values in railway routing and scheduling, that is, to route a set of trains over tracks with
the minimum expected delay so as to minimize the overall system delay. For such
purposes, it would be beneficial to design delay estimation models that could be easily
integrated with or incorporated into a routing or scheduling model. Analytical models
requiring algorithms to solve a system of equations might not be the best option for this
purpose. Simulation models, on the other hand, would enable us to develop simple, yet
accurate, algebraic relationships that better capture the stochastic nature of the
interactions between the traffic, operating and network parameters, and their impact on
travel time delays. Studies by Krueger (1999) and Abril et al. (2008) show how delays
and track capacities are dependent on these parameters.
3.2 Research contributions
Figure 2: Research contribution
Figure 2 summarizes the contribution of this research work to rail-track capacity planning
and management. Given the daily traffic volume, train mix or heterogeneity, and sets of
origins and destinations, the integrated routing and capacity management model can be
solved to obtain the routes the trains should travel on and their order of departure from
the origin stations. This problem is akin to a no-wait job-shop scheduling problem, where
the trains are similar to jobs that need to be processed and the rail-tracks are resources
such as machines, with each job $J_i$ having a processing time $p_{ik}$ on machine $M_k$. In a
typical job-shop scheduling problem, the order of release of the jobs impacts the overall
delay in processing all of them. Hence, determining the release times of our trains from
their respective origins is vital to minimize travel-times and delays. Similar to job-shop
scheduling, trains are scheduled to depart from origin and intermediate stations such that
the travel-time delays in the network are minimized. It could even be necessary to hold
back a train ready to depart and instead allow another train to depart ahead, because
doing so might reduce interferences between trains somewhere down in the network.
However, once a train has been released into a network, a simple greedy scheduling
algorithm that moves a train from one track segment to the next tends to work well in
practice since holding a train reduces track capacity. Of course, there may be a number of
valid reasons to hold a train at intermediate track segments (e.g., to allow a higher
priority passenger train to pass), but these types of decisions are typically made in real time
by dispatchers, and are outside the scope of this thesis. There have been quite a few
research papers in the recent past that treat railway scheduling as a job-shop scheduling
problem, and borrow ideas for models and algorithms from the literature existing in that
area. Some examples are D'Ariano et al. (2007), Mascis and Pacciarelli (2002), Mannino
and Mascis (2009), Oliveira and Smith (2000) and Oliveira (2001).
The tasks that were accomplished during the course of my doctoral research can be
broadly classified as follows.
• Develop an integer programming (IP) model which dispatches each train on an
optimal route between its origin and destination points. An optimal route is
defined as one that has the least expected delay in travel time associated with it.
• Using an approximation method called aggregation, reduce the complexity of the
IP model to be able to solve it optimally for small instances of network sizes and
train counts.
• Develop a regression-based delay estimation technique using a simulation model.
This can be used to estimate delay across a sub-network at various traffic levels.
These delay equations are then embedded into the IP formulation to improve
model accuracy.
• Using approximation-based solution procedures, and aggregating suitable sections
of a network, solve the IP model to obtain initial routes and release times that
produce significantly lower travel times and delays than current real-world values.
• Design a genetic algorithm procedure to further improve upon the quality of the
initial routes and release times generated by the aforementioned solution
procedures.
• Develop a technique to test the performance of our modeling and solution
methodologies for any rail network comprising double-track segments, single-track
segments and multiple routing options, for a given set of trains.
• Test the trade-off between the degree of aggregation and the quality of the solution
compared to current real-world delays and travel-times. This offers an idea of the
maximum size of a network for which the IP model can be solved without
compromising on the quality of the routes and schedules.
In addition to routing trains, the IP model also decides the order in which the trains
leave their origin points, the times at which they depart each node and the sequence in
which they pass each other at crossings. However, the schedules at the intermediate
stations cannot be determined by our solution procedure for medium-scale networks with
around 80 trains. From a practical point of view, since this research deals with railway
planning and not with railway dispatching, the intermediate scheduling decisions are not
needed at this stage and can be solved at a later time in the dispatching stage. Due to
disturbances that can occur in real time or choices made in the blocking model, a dispatcher
would need to solve a train dispatching model either optimally or using greedy
procedures to determine those priorities. In view of that, we do not compute those
parameters from our models. Because of this, we achieve significant time savings in
solving the integrated routing and capacity management model and generating
information vital to the planning stage, namely, train routes and release times from origin
stations.
Another aspect to note about our modeling strategy is that the IP model can be solved
for a network of any size by using a suitable degree of aggregation. However, we do not
recommend its use for large-scale networks for three reasons. First, freight trains seldom
travel as a single unit over large distances. Often, as a train passes through a major rail
yard, cars are removed from one train and attached to another. These operations are
decided by the blocking model. In other words, the cars on a train might have the same
destination for the time being, but their ultimate destinations might differ. Hence,
our IP model would be accurate only until the next station where the cars are re-sorted.
Second, planning over very large rail networks would result in routes and release times
that are not practical in the real-world due to real-time disturbances. Under such
circumstances, dispatchers would never be able to keep up with the planned schedules.
Third, we will show in Chapter 7 that as the degree of aggregation increases, the
accuracies of the predicted real-world delays and travel-times are significantly reduced.
A novel feature of our modeling strategy is the incorporation of track capacity into
the routing model. The use of capacity-delay correlations enables the adjustment of the
capacity of a track or sub-network to have admissible delays, and to reject or defer trains
that would overload the network. In the following sections, we present a detailed
description of the routing and scheduling IP model, the regression-based delay estimation
techniques and some preliminary results obtained by running this integrated routing and
capacity management model on the railway network in the Southern California region.
A large portion of the research on railway planning handles train scheduling and
dispatching at a micro-level, to determine every track segment, junction and siding each
train travels on. With the aid of aggregation techniques, which are explained further on,
our modeling procedure is carried out at a macro-level. The integrated routing and
capacity management model is capable of incorporating route flexibility as opposed to
having a train follow a fixed path from origin to destination, and of handling large size
railway networks with many more trains. These features constitute the primary
differences between our methodology and that proposed by Lu et al. (2004) and
Dessouky et al. (2006).
Chapter 4: A Capacity Management Model
In this section, we provide a formal description of the integrated train routing and
capacity management model. We use the same network definition of a rail system as
proposed by Lu et al. (2004). Given a railway network consisting of main tracks, sidings,
junctions and platforms, it can be converted to a general network G = (N, A), where N is
the node set and A is the arc set. Each node j ∈ N contains a set of segment resources,
whose lengths are equal to the length of the longest train to be routed. The time spent by
a train at a node j will be at least equal to the time needed to traverse the segment(s)
included in that node. The capacity of these nodes depends on the number and type of
track segments included in the node. The stations existing in the network are also
represented by nodes. The capacity of these station-nodes is greater than 1, in order to
accommodate crossing and overtaking. The nodes are connected to each other by directed
arcs a ∈ A, which do not have any length or speed limit associated with them. There are
two directed arcs, one for each direction, between any two nodes i and j. To replicate the
physical network, only one of these two arcs can be occupied at any given time. Similarly,
each node j can be occupied by an additional train as long as the capacity of the segments
constituting the node is not violated. Hence, the time taken by the trains to travel in the
transformed network will be exactly the same as in the physical network. In Figure 3, we
show an example of how to translate an actual track configuration into network G.
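The network definition above can be sketched as a small data structure. This is only an illustration under our own assumptions: the class and function names, the units (miles, miles per hour, minutes) and the three-node example are hypothetical, and rounding the traversal time up to whole time periods is one possible convention for a discretized model.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    length_miles: float      # l_i
    speed_limit_mph: float   # v_i
    capacity: int = 1        # c_i; station nodes would use a value above 1

def min_periods(node, minutes_per_period):
    """Fewest whole time periods a train needs to traverse the node."""
    minutes = 60.0 * node.length_miles / node.speed_limit_mph
    return math.ceil(minutes / minutes_per_period)

# Hypothetical three-node line: two track segments joined by a station.
nodes = {
    "n1": Node("n1", 1.5, 30.0),
    "s1": Node("s1", 1.5, 20.0, capacity=2),
    "n2": Node("n2", 1.5, 45.0),
}
# Zero-length directed arcs, one per direction between adjacent nodes.
arcs = [("n1", "s1"), ("s1", "n1"), ("s1", "n2"), ("n2", "s1")]
```

Because the arcs carry no length or speed limit, all travel time is accounted for inside the nodes, which is what keeps travel times in the transformed network equal to those in the physical one.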
Figure 3: Conversion from a physical network to a general network
For each train h ∈ H, the origin and destination stations are known, but the route can
be either known or unknown. If the route is unknown, the IP model would route and
schedule all the trains. In this case, each train is routed along the track segments (nodes)
with the least overall delay. On the other hand, if the route is known, then the IP model
would just be a train scheduling model. As in any train scheduling model, the time of
departure for each train from its origin, and the meet/pass order are optimally determined
by the model.
To represent a train h traveling from node i to node j at time t, we introduce a binary
variable indexed by i, j, h and t, which is equal to 1 if train h travels from node i to node j at
time t. The IP model is run over a time horizon T. We have trains traveling in both
directions. This modeling procedure generates an IP model with a large number of binary
variables that, even for small-scale applications, is not solvable in any reasonable time
using standard solvers such as CPLEX. For example, suppose we have 14 trains, with 7
in each direction, to be routed over a network with 100 directed arcs, and the model is
run until T = 300. This gives us 14 × 100 × 300 = 420,000 binary variables. To make the
problem more tractable, we define binary variables $Y_{i,j,h,t}$ to denote a train h traveling
from node i to node j by time t (instead of at time t as described above). In other words, a single Y
variable defined at time instant t can be treated as the sum of the individual variables
defined for time instants 1, ..., t. This reduces the number of variables in each constraint,
thereby making the model sparser. We also perform pruning by defining Y only for
possible directions of travel for each train. For instance, for a train h that could travel
only from node a to node b on its way from its origin to its destination, we define Y
indexed only in a,b,h,T, and not in b,a,h,T. This pruning step halves the number of binary
variables. In contrast to Dessouky et al. (2006), we ignore train lengths, thereby reducing
the number of constraints. We believe this assumption does not significantly skew the
results due to our procedure of dividing the network into segments equal to the train
lengths, and also due to node capacity constraints included in the IP model. We also
make an assumption that the trains have very high acceleration and deceleration rates to
enable them to instantly change their speeds to follow the speed limits of the track
segments they are traveling on.
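The variable count and the change from "at time t" to "by time t" variables can be checked with a short sketch. The function names are hypothetical; the figures (14 trains, 100 directed arcs, T = 300) are those of the example above, and the cumulative sum follows the description of Y as the sum of the individual variables over time instants 1, ..., t.

```python
from itertools import accumulate

def count_binary_variables(num_trains, num_arcs, horizon):
    """One binary variable per (arc, train, time period) combination."""
    return num_trains * num_arcs * horizon

def to_cumulative(at_time):
    """Turn 'traverses at time t' indicators into 'has traversed by time t'."""
    return list(accumulate(at_time))

full = count_binary_variables(num_trains=14, num_arcs=100, horizon=300)
pruned = full // 2  # each train keeps only the arcs in its own direction of travel

# A train that moves from node i to node j in period 4 of a 6-period horizon.
x = [0, 0, 0, 1, 0, 0]
y = to_cumulative(x)   # stays at 1 from period 4 onwards
```

Once a cumulative variable switches to 1 it stays at 1, so a statement such as "train h has left node i by time t" involves a single variable rather than a sum over all earlier periods.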
4.1 Notation
We consider a set of H trains traveling through a railway network. As outlined in Lu
et al. (2004), each node in the transformed network has two ports: port 0 and port 1. Port
0 indicates the starting point of travel for a train moving in the node from one direction.
Port 1 indicates the starting point of travel in the opposite direction of port 0. If a train h
enters a node from port 0 (respectively, port 1), then it must leave from port 1
(respectively, port 0). We use M as a large constant to express non-linear relationships
through linear constraints. We define the following parameters.
$H_1$: set of trains heading from '0' to '1'
$H_2$: set of trains heading from '1' to '0'
$N$: set of nodes representing track segments
$A$: set of directed arcs
$A_1$: set of arcs directed from '1' of the previous node to '0' of the next node
$A_2$: set of arcs directed from '0' of the previous node to '1' of the next node
$v_i$: velocity limit of node i
$l_i$: length of node i
$c_i$: capacity of node i
$r_h$: earliest departure (or release) time of train h
$o_h$: origin node of train h
$d_h$: destination node of train h
$T$: time horizon
[…]: set of $(A_1, H_1)$
[…]: set of $(A_2, H_2)$
[…]: […]
$M$: a very large number
The decision variables in our mathematical programming model are $Y_{i,j,h,t}$. These are
flow variables defined such that (i, j, h) ∈ […]:
$$Y_{i,j,h,t} = \begin{cases} 1 & \text{if train } h \text{ traverses arc } (i,j) \text{ at any time between } 1 \text{ and } t \\ 0 & \text{otherwise} \end{cases}$$
4.2 IP model
The integrated train routing and capacity management model is given below. The
objective function of this integer programming model is to minimize the sum of all the
arc traversal times for each train, thereby indirectly minimizing the total travel time (and
delay) for all trains. Weights can be associated with the travel time of each train to
represent any priority that may exist. Here, we give an objective function assuming all
trains are of equal importance. This can be written as follows:
(IP*) Minimize
subject to constraints (1) – (10)
where the constraints are explained below.
The following constraint (1) ensures that a train does not leave its origin station prior
to its earliest departure time.
(1)
The following constraint (2) ensures that a train must eventually leave its origin
station after the earliest departure time. Each train is allowed the flexibility to leave at an
appropriate time so as to minimize the total delay in the network.
(2)
The following constraint (3) enforces the condition that a train has to arrive at its
destination station.
(3)
Constraint (4) enforces the condition that a train must take at least (length of
segment/velocity limit over that segment) units of time to traverse the node representing
that segment. Constraint (5) takes care of flow conservation, thereby ensuring that every
train entering a node leaves the node after a certain amount of time.
(4)
(5)
Constraints (6)-(7) are capacity constraints for the nodes. In each time period τ the
number of trains traveling in node j must not exceed $c_j$.
(6)
(7)
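The capacity requirement can be illustrated with cumulative indicators of the kind used for Y. The following is a minimal sketch under our own assumptions and is not the constraint set (6)-(7) itself: `entered_by[h][t]` and `left_by[h][t]` are hypothetical 0/1 lists stating whether train h has entered, respectively left, a node by period t, and the limit is read as "at most $c_j$ trains in the node".

```python
def occupancy(entered_by, left_by):
    """Trains inside a node in each period, from cumulative indicators."""
    periods = len(next(iter(entered_by.values())))
    return [
        sum(entered_by[h][t] - left_by[h][t] for h in entered_by)
        for t in range(periods)
    ]

def respects_capacity(entered_by, left_by, capacity):
    """True if the node never holds more than `capacity` trains."""
    return all(n <= capacity for n in occupancy(entered_by, left_by))

# Hypothetical cumulative indicators for two trains over six periods.
entered_by = {"h1": [1, 1, 1, 1, 1, 1], "h2": [0, 0, 1, 1, 1, 1]}
left_by = {"h1": [0, 0, 0, 1, 1, 1], "h2": [0, 0, 0, 0, 0, 1]}
load = occupancy(entered_by, left_by)
```

In this example both trains are inside the node during the third period, so the node would need a capacity of at least 2.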
Constraints (8)-(9) ensure that trains traveling in opposite directions cannot enter the
same node at the same time. If a node is already occupied by a train, then a train traveling
in the opposite direction would either need to wait for this node to be freed or may
choose an alternative node to reach its destination. Trains traveling in the same direction
can occupy the same node at the same time, provided that the capacity constraint for that
node is not violated.
(8)
(9)
Constraint (10) ensures that an arc is not simultaneously occupied by trains traveling
in the opposite direction, since this could lead to a deadlock.
(10)
4.3 Complexity
In order to make significant contributions to real-world railway routing and scheduling,
we need to be able to run our (IP*) model for large railroad networks. That is, we need to
do train routing and scheduling at a macro-level, as opposed to a micro-level. In the
previous section, we discussed some modeling techniques that were used to make the
model sparse and to reduce the number of binary variables. In spite of these efforts,
solving the integer programming model presented above for medium to large-scale
real-world rail networks can be quite challenging.
Figure 1 is a section of a medium-scale network from the Los Angeles-area rail
network. The network is between downtown Los Angeles and Colton crossing. There are
three main routes in this section of the network – UP-Alhambra, UP-San Gabriel and
BNSF. The first two are operated by Union Pacific and the third by Burlington Northern
Santa Fe. Each of the three routes is approximately 60 miles long. Dividing them into
nodes by considering the length of the longest train traveling on these lines, we get
approximately 40 nodes per route. If we solve the formulation (IP*) for this network with
a daily traffic of 50 trains, then we end up with approximately 200,000 binary variables
and 350,000 constraints. Note that the number of binary variables and constraints
increases drastically with the addition of new nodes. Thus, this current formulation (IP*)
inhibits our ability to solve the capacity management problem for medium to large-scale
rail networks. In the interest of being able to solve IP models using either a standard
solver such as CPLEX or any other solution approach, we need to resort to approximation
procedures that help reduce the number of nodes and arcs in the general network. For this
purpose, in addition to the aforementioned pruning, we resort to aggregation.
4.4 Aggregation
Aggregation refers to combining a suitable and sizeable portion of the network under
consideration into a single node in the graph G. An example of aggregation is presented
in the figure below.
Figure 4: An example of aggregation
For the physical network shown in Figure 4 above, if we divide it into nodes for the
general network by considering the length of the longest train, then we end up having
many nodes and arcs. However, if we aggregate suitable sections of this network, we end
up with a condensed network with 5 nodes and 4 arcs, as shown above. Clearly,
aggregation helps to reduce the problem size by decreasing the number of nodes and arcs
in G for a given physical network. We follow some fundamental guidelines for
aggregation: each node can contain only track segments of the same type, so we cannot
combine single-, double- and triple-track segments into a single node. In Figure 4, AB, BC
and CD are double-track, single-track and double-track sub-networks respectively. So,
based on the aforementioned guidelines, we aggregate these into 3 nodes as shown in the
figure.
For networks without aggregation, we used a node capacity of 1 in the (IP*)
formulation. Also, the time taken to travel through a node was assumed to be linear, and
equal to the length of the node divided by the speed limit of that node. During
aggregation since many smaller nodes are combined into a bigger node, we need to
reevaluate the capacity of an aggregate node. An aggregated node could accommodate
trains traveling in opposite directions. In an aggregate node, the assumption of
instantaneous acceleration and deceleration is less valid because of a higher probability of
trains meeting and overtaking each other. Hence, the time taken to travel through a node
would no longer be linear. Moreover, trains might need to wait for each other before they
exit one aggregate node and enter another. Therefore, we need a better representation of
the time spent by a train in an aggregate node than denoting it as node length divided by
velocity. In view of these reasons, a more detailed and robust capacity and delay
estimation procedure needs to be developed. In this work, we have developed a
regression-based delay estimation technique that can be used to solve (IP*) for networks
with aggregate nodes. This will be discussed in detail in Chapter 5. In the following
section, we present some preliminary experimental results from solving (IP*) for small-
scale networks.
4.5 Preliminary results
In this section, we compare the results from our IP model with the results obtained from a
greedy construction heuristic implemented in the simulation model presented in Lu et al.
(2004), and the mixed integer program for train scheduling presented in Dessouky et al.
(2006). Recall the model by Dessouky et al. (2006) explicitly accounts for actual train
lengths but does not allow for flexible routing. The network used for running the three
models is shown below in Figure 5. For running the IP model, we transform this physical
network into a general network as described in the previous section.
Figure 5: A portion of the rail network near Downtown Los Angeles
The transformed network has 42 nodes and 52 directed arcs in each direction. Similar
to Dessouky et al. (2006), we have 14 trains traveling across the above rail network. All
trains are assumed to be ready at time zero. The origins and destinations of the trains are
given in Table 1 below.
| Route ID | Origin | Destination |
|---|---|---|
| 1, 13 | CP Dayton Taylor Yard | Alameda Corridor |
| 2, 14 | Alameda Corridor | CP Dayton Taylor Yard |
| 3 | LATC | Alameda Corridor |
| 4 | Alameda Corridor | LATC |
| 5 | CP Dayton Taylor Yard | LATC |
| 6 | LATC | CP Dayton Taylor Yard |
| 7 | Union Station | Metrolink San Bernardino Line |
| 8 | Metrolink San Bernardino Line | Union Station |
| 9 | CP Dayton Taylor Yard | East Yard |
| 10 | East Yard | CP Dayton Taylor Yard |
| 11 | East Yard | Alameda Corridor |
| 12 | Alameda Corridor | East Yard |
Table 1: Route data specification
The IP model for routing these 14 trains is run using CPLEX 9.0 solver on a Linux
server with a 3.06GHz Intel Xeon CPU for 2.0 CPU hours. The results of the greedy
heuristic were obtained using the simulation model described in Lu et al. (2004). It is run
using AweSim! 2.0 simulation software. The results are tabulated in Table 2 below.
| Model | Travel time (hr) | Features that decrease delay | Features that increase delay |
|---|---|---|---|
| New IP model (flexible routing) | LB: 13.33; UB: 15.13 | No train length, flexible path | Discretized time, t = 5 minutes |
| New IP model (fixed routing) | 15.53 | No train length | Discretized time, fixed path, t = 4 minutes |
| IP model, Dessouky et al. (2006) | LB: 13.6; UB: 14.5 | Continuous time | Train lengths, fixed paths |
| Greedy heuristic, Lu et al. (2004) | 14.103 | Continuous time, flexible path | Heuristic approximation, train lengths |
Table 2: Total travel times for 14 trains (in hours)
In the first row, we present the results from running the train routing IP model,
presented in Section 4.2, over the network shown in Figure 5. To make the problem size
tractable so that it can be solved by CPLEX in a reasonable amount of time, we modify t
to represent 5 minutes. After running the routing model for 2.0 CPU hours, we get the
lower and upper bounds on the optimal solution, which are presented in the table above.
The second row represents the results we derive from our new IP model by forcing the
trains to follow the same routes as used by Dessouky et al. (2006). For this, we resort to a
4 minute rounding to run our IP model in CPLEX. In the third row, we show the results
presented by Dessouky et al. (2006) for the same network and set of trains. Finally, the
fourth row represents the results obtained from the simulation model, presented in Lu et
al. (2004), by having the trains follow the route and release time determined by the new
IP model under flexible routing (row 1) and uses a greedy construction heuristic for
scheduling the trains after they have been released into the network.
In Table 2, the third column lists the features and assumptions of each modeling
strategy that cause the travel times of the 14 trains to decrease, and the fourth column
lists the reasons that cause the travel times to increase. The results shown in Table 2
depict the relative accuracy of the four modeling techniques. We draw the following
conclusions:
1. The simulation model presented by Lu et al. (2004) and the IP model by
Dessouky et al. (2006) have the advantage of using continuous time in their
modeling procedure. However, accounting for actual train lengths together with
continuous time modeling limits the size of the railway network to which these
two models can be applied. That is, the IP formulation in Dessouky et al. (2006) is
less scalable than the new IP model.
2. The integrated routing and capacity management IP model has an advantage of
being capable of performing train routing on large real-life railway networks with
many more trains. There are two reasons for this to be possible, namely, assuming
train lengths to be negligible and using aggregation techniques. In our model, we
discretize time to control the IP problem size so that it can be solved by CPLEX.
3. All four models above are capable of generating a complete schedule,
including order of departure from intermediate stations. However, only the new IP
model is capable of generating optimal routes from the origin stations.
4. The new IP model with flexible routing cannot be solved to optimality for the
network on hand. However, using the initial routes and release times information
as an input to the greedy heuristic, we obtain a total travel time of 14.1 hours.
This is lower than the values for “new IP model with fixed routing” and the IP
model by Dessouky et al. (2006).
In order to be able to perform routing of trains over a large network and estimate
travel time precisely, we test our IP model by applying aggregation to a portion of the
railway network in Southern California, shown in Figure 6. It stretches from the CP
Dayton Taylor Yard to the City of Industry, through Downtown Los Angeles, a total of
nearly 25 miles.
Figure 6: Union Pacific-Alhambra rail network near Downtown Los Angeles
The trains used for this network are given in Table 3. All trains are assumed to be
ready at time zero.
| Route ID | Origin | Destination |
|---|---|---|
| 1, 7 | City of Industry | CP Dayton Taylor Yard |
| 2, 8 | CP Dayton Taylor Yard | City of Industry |
| 3, 9 | City of Industry | LATC |
| 4, 10 | LATC | City of Industry |
| 5, 11 | LATC | CP Dayton Taylor Yard |
| 6, 12 | CP Dayton Taylor Yard | LATC |
Table 3: Route data specification - with aggregation
The transformed network G has 38 nodes and 35 arcs. In order to accurately compare
the results from our IP model with the results from the simulation model presented in Lu
et al. (2004), we resort to a one-minute rounding in our IP. Hence, even though the
network in Figure 6 is slightly smaller than the network in Figure 5, the one-minute
rounding makes the IP problem size larger for the network in Figure 6. In order to
solve the routing IP model for this network with a one-minute rounding, we can only have
trains 1 to 6 from Table 3. We use $N_a$ to denote the set of aggregated nodes.
As per the discussion held in Section 4.4, there are two sections in G that can each be
aggregated into a single node - section A-B and section C-D. For section A-B, there exist
six paths between A and B, namely, 1-3-6-8, 1-3-6-7-8, 1-3-5-7-8, 2-4-5-7-8, 2-4-3-6-8
and 2-4-3-6-7-8. Hence, this sub-network can be substituted with two nodes, one for each
direction and each with capacity equal to three. In this case, due to aggregation, we can
have trains 1 to 10 from Table 3 traveling through the network, still using a one-minute
rounding. Similarly, section C-D has four paths and can be substituted with two nodes,
one for each direction and each with capacity equal to two. This allows us to have trains 1
to 12 traveling through the network using a one-minute rounding. Please note that in
running these experiments we resort to a very simple way of calculating capacity. As
explained towards the end of Section 4.4, capacity of an aggregate node changes
dynamically with the prevailing traffic and operating conditions. A more robust
procedure to measure capacity of an aggregate node will be described in Chapter 5.
The IP model, presented in Section 4.2, and the simulation model, presented in Lu et
al. (2004), are each run on the network given in Figure 6 for three scenarios:
non-aggregated network, aggregation for section A-B, aggregation for sections A-B and
C-D. In the case of the IP model, each scenario is made to run for 4.0 CPU hours, using
the CPLEX 9.0 solver. The travel-times calculated by (IP*) are recorded in rows 1, 3 and
5 in Table 4 below. The optimal routes and departure times from origin stations
determined by (IP*) are then fed into the simulation model to address the effects of train
lengths and non-linear variation in train speeds due to acceleration and deceleration. Note
that these effects are not considered in (IP*) as per the assumptions made. The results
from the simulation model are presented in rows 2, 4 and 6 in Table 4.
| Row | Model | # of nodes | # of arcs | # of trains | Capacity of aggregate node | Travel time (hr) |
|---|---|---|---|---|---|---|
| 1 | IP model | 38 | 35 | 6 | n/a | 3.68 |
| 2 | Simulation | 38 | 35 | 6 | n/a | 3.82 |
| 3 | IP model, with $N_{A-B}$ | 30 | 27 | 10 | 3 (for A-B) | 8.38 |
| 4 | Simulation | 38 | 35 | 10 | n/a | 8.56 |
| 5 | IP model, with $N_{A-B}$ and $N_{C-D}$ | 33 | 31 | 12 | 3 (for A-B), 2 (for C-D) | LB: 10.41; UB: 10.84 |
| 6 | Simulation | 38 | 35 | 12 | n/a | 10.89 |
Table 4: Comparison of results from IP model with aggregation and simulation model
Although the IP model is solved to optimality using a one-minute rounding, the travel-times
noted in rows 1, 3 and 5 could be far away from the real-world values because the
IP model generates schedules assuming that trains are infinitesimally small. The results
in rows 2, 4, and 6 are from the simulation model which includes many of the features of
actual rail systems (e.g., it models train lengths and the acceleration/deceleration
process). In the simulation model, we use the routes and release times generated by the IP
model and as previously mentioned a construction heuristic is used to schedule the trains
for the intermediate tracks. Comparing rows 1, 3, and 5 with rows 2, 4, and 6, we infer
from the results that there is not a significant loss in accuracy of estimating travel-times
by using the IP model alone. From Tables 3 and 4 we may conclude that a viable option
to solve (IP*) for medium to large-scale networks would be to obtain information about
initial routes and release times from (IP*) and input these into the simulation model to
obtain schedules at intermediate stations, and travel-times and delays that reflect
real-world train movements. This will be discussed in detail in Chapters 6 and 7.
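The size of the difference between the IP and simulation travel times in Table 4 can be made explicit with a short calculation. The sketch below uses only the values reported in Table 4; the function name is hypothetical, and for the 12-train case, where the IP run reports bounds only, using the upper bound is our own choice.

```python
# Travel times in hours copied from Table 4 (IP rows 1, 3, 5; simulation rows 2, 4, 6).
# For row 5 the IP run reports only bounds, so the upper bound is used here.
pairs = {
    "6 trains": (3.68, 3.82),
    "10 trains": (8.38, 8.56),
    "12 trains": (10.84, 10.89),
}

def relative_gap_percent(ip_hours, sim_hours):
    """How far the IP estimate falls below the simulated travel time."""
    return 100.0 * (sim_hours - ip_hours) / sim_hours

gaps = {
    name: round(relative_gap_percent(ip, sim), 2)
    for name, (ip, sim) in pairs.items()
}
```

The gaps come out at roughly 3.7%, 2.1% and 0.5% of the simulated travel time. With the lower bound of 10.41 hours in place of the upper bound, the 12-train gap would be about 4.4%.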
Chapter 5: Delay Estimation Techniques
Various kinds of estimation techniques can be used to study how the delay varies with a
change in the traffic conditions and/or the railway network topology. The difference
between the actual running time and the free running time is termed as travel time delay
or, simply, delay. The free running time of a train over a network is defined as the time
the train takes to traverse the network, when traveling at its maximum allowable speed
and not experiencing delays due to other train(s). The actual running time is the time a
train travels to reach its destination when there are other trains in the network. For freight
trains, delays can be of two types, namely, direct and knock-on (or indirect) delays.
Direct delays to trains are a consequence of minor delays at a station. These are not as a
result of other trains traveling along the same lines. Knock-on delays are those which are
induced into the system due to a direct and/or knock-on delay to another train in the
network. It is transferred from one train to, possibly, all the other trains in the vicinity.
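The definition of delay given above can be written down directly. The following is a minimal sketch with hypothetical names and values; it assumes the train's maximum allowable speed on each segment equals the segment's speed limit, and it ignores acceleration and deceleration.

```python
def free_running_time(segments):
    """Hours to cover (length_miles, speed_limit_mph) segments at the speed limit."""
    return sum(length / speed for length, speed in segments)

def travel_time_delay(actual_hours, segments):
    """Delay = actual running time minus free running time."""
    return actual_hours - free_running_time(segments)

# Hypothetical route: 10 miles at 40 mph followed by 15 miles at 30 mph.
route = [(10.0, 40.0), (15.0, 30.0)]
free_hours = free_running_time(route)        # 0.25 + 0.5 hours
delay_hours = travel_time_delay(1.0, route)  # train actually took 1 hour
```

Here the free running time is 0.75 hours, so a train that needs one hour to reach its destination has experienced a delay of 0.25 hours.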
The capacity of a railway network and the delay across it are closely related. The
delays encountered by trains under different operating assumptions can be used to
evaluate the capacity of a section of a network, which is referred to as a subnetwork.
Capacity can be defined as the maximum number of trains that can traverse a network, or
a section of a network, without resulting in a deadlock. Burdett and Kozan (2006) define
absolute capacity as the theoretical capacity value that is realized only when critical
sections of a network are saturated. On the other hand, actual capacity is the number of
trains that can safely coexist in a network, or a portion of it, when interference delays are
taken into consideration. Both measures of capacity are measured over time. Absolute
capacity can be used as an upper bound for planning purposes.
The actual capacity of any section of a railway network cannot be a unique value,
and it is neither easily defined nor quantified. It depends on the average minimum
headway time between consecutive trains, the signaling system, train speeds, track
configuration, etc. For instance, single tracks with sidings can accommodate more trains
than those without, and they enable crossings and overtakes. Signals can increase capacity by
reducing the required headway between trains. Delay estimation and capacity analysis in
railway transportation is dependent on various operational aspects. The first aspect is the
track (or network) configuration. The network can consist of single, double, triple or even
more tracks. Single tracks are common in North America, while double and
triple tracks are common in Europe. Normally, the level of complexity in urban areas is
higher than in rural areas because they contain many junctions. A train can block the
movement of other trains when it tries to cross over at a junction from one line to another.
The second aspect is the variation in speed limits on different track segments and
junctions. Furthermore, passenger trains and freight trains can have different maximum
speeds even though their paths may use the same tracks, but not necessarily at the same
time. If a train passes a junction by changing lines, the speed limit at the junction will be
enforced. While single speed limits are common on networks in rural areas, multiple
speed limits are common in metropolitan areas. A lower speed-limit over a sub-network
tends to increase travel time delays. The third aspect is the characteristic of each train in
the rail network such as priority, train length, speed, acceleration rate and deceleration
rate. Generally, passenger trains have higher priority than freight trains. If two trains want
to seize the same track simultaneously, the train with the lower priority should wait and
stop until the train with higher priority passes. Sometimes trains cannot be dispatched at
their maximum speed because of the track speed limit. Acceleration and deceleration
rates need to be considered in order to increase or reduce speed without violating the
speed limit. This results in a nonlinear function to represent the movement of trains.
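To illustrate why acceleration and deceleration make travel time a nonlinear function of
distance, the following minimal sketch computes the time to cover a segment under a speed
cap. The function name, the constant-rate assumption and the sample numbers are
illustrative assumptions, not part of the simulation model described in this chapter.

```python
import math

def traversal_time(length, v_in, v_out, v_max, accel, decel):
    """Minimum time to cover `length`, entering at v_in and leaving at v_out,
    never exceeding v_max, with constant acceleration and deceleration rates.
    Units must be consistent (e.g. miles, mi/hr, mi/hr^2)."""
    if v_in > v_max or v_out > v_max:
        raise ValueError("entry or exit speed exceeds the speed cap")
    # Distance used to accelerate v_in -> v_max and to brake v_max -> v_out.
    d_acc = (v_max**2 - v_in**2) / (2 * accel)
    d_dec = (v_max**2 - v_out**2) / (2 * decel)
    if d_acc + d_dec <= length:
        # Trapezoidal profile: accelerate, cruise at v_max, then brake.
        cruise = (length - d_acc - d_dec) / v_max
        return (v_max - v_in) / accel + cruise + (v_max - v_out) / decel
    # Triangular profile: the cap is never reached, so solve for the peak speed.
    peak_sq = (2 * accel * decel * length
               + decel * v_in**2 + accel * v_out**2) / (accel + decel)
    if peak_sq < max(v_in, v_out) ** 2:
        raise ValueError("segment too short for the requested speed change")
    peak = math.sqrt(peak_sq)
    return (peak - v_in) / accel + (peak - v_out) / decel

# Hypothetical numbers: from standstill to standstill, cap 2, rates 1.
t_long = traversal_time(10, 0, 0, 2, 1, 1)   # cap reached, includes a cruise phase
t_short = traversal_time(1, 0, 0, 2, 1, 1)   # cap never reached
```

In this example a tenfold increase in length raises the travel time from 2 to 7 time
units rather than to 20, which is the nonlinearity referred to above.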
In Chapter 4, we presented the routing and scheduling IP model and used
aggregation techniques to solve this model for large railway networks. However, the only
property of the individual nodes that was considered while building an aggregated node
was the length of the track segments. Track configurations, interactions between trains
and knock-on delay effects were not explicitly considered. Hence, the actual travel time
of the trains, from their origins to their respective destinations, might be quite different
from the values obtained from the IP model. Some key aspects of real-life railway
operations that were not accounted for in the aforementioned strategy include the
following.
1. Nodes N in G that make up an aggregate node j can have different capacities c_j
and speed-limits v'. Our strategy presented above computes c_j for each j in N_a in a
simplified fashion, without considering the variations in the speed limits of the
track segments and the trains.
2. At a given point in time, the nodes constituting an aggregated node j can have
multiple trains traveling on the track segments they represent. In addition, these
trains could be traveling in opposite directions.
3. The delay across an aggregated node j is dependent upon the physical
characteristics of the track segments constituting j. The type of trackage (single-
track, double-track etc), the number of meet-pass points, spacing between the
meet-pass points, track speed-limit and the number of sidings need to be
considered.
4. The delay across an aggregated node j is dependent upon the current traffic in j.
That is, the number of trains of various types, their direction of travel and speed-
limits.
Due to these reasons, we had to develop a delay estimation technique that accounts
for the complexities and non-linearities that exist in real-world railway operations.
Instead of the simple aggregation step presented previously, we now develop a technique
that accurately estimates the delay for each aggregated node j as a function of the track
configuration represented by the nodes in N constituting j, as well as the prevailing traffic
in j. Once we obtain these delay functions, they could be inserted into the IP model in
order to route and schedule the trains so as to minimize their travel time. The capacity of
each j can be set such that the time taken by an incoming train to traverse j is within delay
thresholds.
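One way to read the last sentence is that, once a delay function for j is available, the
capacity is the largest train count whose predicted traversal delay stays within the
threshold. The sketch below assumes that delay is non-decreasing in the number of trains;
the function name, the linear delay function and its numbers are hypothetical.

```python
def capacity_from_delay(delay_fn, threshold, max_trains=100):
    """Largest train count n in 0..max_trains with delay_fn(n) <= threshold.
    Assumes delay_fn is non-decreasing in n."""
    capacity = 0
    for n in range(1, max_trains + 1):
        if delay_fn(n) > threshold:
            break
        capacity = n
    return capacity

def example_delay(n):
    # Hypothetical delay function: 2 minutes base plus 1.5 minutes per train.
    return 2.0 + 1.5 * n

example_capacity = capacity_from_delay(example_delay, threshold=10.0)
```

With these made-up numbers, five trains give a delay of 9.5 minutes and six give 11
minutes, so the capacity is set to five.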
As mentioned in Chapter 2, there are various ways to obtain delay estimates that
have been used in prior research on railway operations; these include queuing theory,
stochastic approximation methods, simulation methods, analytical models etc. We
decided to use simulation models to obtain data on delay values, and then use regression
to develop delay estimates as a function of the train traffic.
The simulation model used for this purpose was developed using AweSim! 2.0
software and presented in Lu et al. (2004). This model considers multiple track
configurations in the same rail network with multiple speed limits while accounting for
the acceleration and deceleration limits of the trains. Freight trains are assumed to arrive
at origin stations following a stochastic arrival process. Train movement is a continuous
process while the scheduling and dispatching of trains are triggered by discrete events.
The continuous motion of train movements is approximated by dividing the movement into
small discrete steps. A deadlock-free dispatching algorithm is embedded into the
simulation model to determine the optimal run times for a train under multiple speed
limits, and to decide the movement of each train in the network considering whether to
continue moving at the same speed, to accelerate or decelerate, or to stop. The algorithm
also determines the next track to be seized from among the multiple alternative tracks.
All the complexities of real-world railway operations such as track configurations, track
speed-limits, number of meet-pass points, number of sidings and train lengths and speed-
limits are built into the system. The authors prove this algorithm to be deadlock-free,
while attempting to keep the train delays to a minimum. The modeling methodology does
not depend on the size of the network and is insensitive to the track configuration. Thus,
changes to the track configuration require changes only to the input data files. The route
for each train needs to be given in order for the simulation model to schedule them. For a
given set of trains and routes, the simulation model has been shown to give efficient
schedules with minimum travel delay. We ran this simulation model over each sub-
network that we intended to aggregate, in order to get delay estimates across the
aggregated node representing the sub-network.
5.1 A delay updating procedure
Given that for different routes we can estimate delay variations using a simulation
model, we propose an iterative procedure to update delay as we determine optimal routes.
A step-by-step explanation of our methodology to route and schedule trains along the
routes with minimum delay is provided below.
1. Given a railway network N and a set of trains H that need to be routed along this
network, identify the various routes along which each train could be routed.
2. Identify suitable sub-networks along each route that can be aggregated according
to the procedure explained in Section 4.4.
3. Let
denote the number of trains that will be traveling through node j in
iteration n. Assume some initial value for the number of trains traveling on each
route in each direction. This gives us the initial (n = 0) total number of trains that
pass through each aggregated node,
for all j in N_a.
4. Set counter n ← 0,
M (some large number).
5. Repeat until
, for all j in N_a, is a very small number.
a. n ← n+1.
b. Run the simulation model on the sub-network constituting an aggregated
node j, with the number of trains in each direction being equal to
.
Record the time taken to travel through j by each train. Repeat this step for
all aggregated nodes.
c. Develop a regression model for each aggregated node to represent delay in
traversing it as a function of the track configuration and the existing
traffic.
d. The capacity c_j for each j is set to be equal to the maximum number of
trains that can co-exist in j, without the delay for each train exceeding the
threshold value.
e. In the network N, substitute the aggregated sub-networks with their
respective aggregate nodes. The length
of an aggregate node is equal to
the total track length of the aggregated sub-network it represents. The
speed-limit
of an aggregate node is equal to the weighted speed-limit of
the sub-network, with the weights being proportional to the track lengths.
f. Incorporate the delay functions [step (c)] and capacities [step (d)] for the
aggregated nodes into the IP model from Section 4.2. This will be
explained in detail in the following section.
g. Run the modified IP model and get new estimates
for the number of
trains traveling through each j.
h.
=
.
6. Input the final routes and schedules into the simulation model to get the final
travel times and delay values for all trains. The necessity for this step arises due to
the assumptions made in the IP model (Section 4.2) regarding the train lengths
and acceleration and deceleration rates.
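The control flow of step 5 can be sketched as a fixed-point loop. In the sketch below the
simulation model, the regression step, the capacity rule and the IP model are passed in as
callables; the toy stand-ins that follow (two aggregated nodes "A" and "B", linear delays,
a greedy assignment in place of the IP) are hypothetical and serve only to make the loop
runnable. They are not the AweSim! model or the IP formulation of Section 4.2.

```python
def update_delays(initial_counts, simulate, fit_delay, set_capacity, solve_ip,
                  tol=1e-6, max_iter=50):
    """Repeat steps 5(a)-(h) until the train counts per aggregated node settle."""
    counts = dict(initial_counts)                 # trains through each node j
    for iteration in range(1, max_iter + 1):      # step (a)
        samples = {j: simulate(j, n) for j, n in counts.items()}          # step (b)
        delay_fns = {j: fit_delay(s) for j, s in samples.items()}         # step (c)
        capacities = {j: set_capacity(f) for j, f in delay_fns.items()}   # step (d)
        new_counts = solve_ip(delay_fns, capacities)                      # steps (e)-(g)
        change = max(abs(new_counts[j] - counts[j]) for j in counts)
        counts = new_counts                                               # step (h)
        if change <= tol:
            break
    return counts, iteration

# ---- Hypothetical stand-ins, for illustration only ----
SLOPES = {"A": 1.0, "B": 2.0}          # assumed minutes of delay per train

def simulate(j, n):
    return n, SLOPES[j] * n            # (traffic level, observed delay)

def fit_delay(sample):
    n, delay = sample
    slope = delay / n
    return lambda trains: slope * trains

def set_capacity(delay_fn):
    return 10                          # fixed capacity in this toy example

def solve_ip(delay_fns, capacities, total_trains=10):
    # Greedy stand-in for the IP: send each train to the node with least delay.
    counts = {j: 0 for j in delay_fns}
    for _ in range(total_trains):
        open_nodes = [j for j in counts if counts[j] < capacities[j]]
        best = min(open_nodes, key=lambda j: delay_fns[j](counts[j] + 1))
        counts[best] += 1
    return counts

final_counts, iterations = update_delays({"A": 5, "B": 5}, simulate, fit_delay,
                                         set_capacity, solve_ip)
```

In this toy setting the counts move from an even split to 7 trains on "A" and 3 on "B" in
the first pass and are unchanged in the second, at which point the loop stops.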
To begin with this iterative process, we first identified the parameters to be recorded
in each simulation run, which were to be later used to develop the regression-based delay
functions. The parameters selected were those which impact the delay experienced by the
trains traveling through a sub-network. For this purpose, the parameters considered
included the track configuration, traffic and operating parameters. As explained in
Krueger (1999), by focusing on how changes in these parameters affect delay, we
account for the dynamic nature of delay and capacity. Hence, the IP model would be able
to efficiently route the trains along the routes with minimum delay. Since we ran the
simulation model for each aggregated sub-network individually, the track configuration
remained unchanged and, hence, was not considered as one of the factors for regression.
In the simulation model, the number of trains scheduled is equal to
. The ends of the
sub-network under consideration are treated as the origins and destinations of the trains
passing through it. Trains arrive at their respective origins according to a Poisson process.
The simulation was run over a 100-day time horizon, with statistics
being cleared after 10 days, that is, after steady state has been reached. The traffic and
operating parameters of the system were captured at the time of arrival of a randomly
selected train at either end of the sub-network. For a fixed
, the arrival rates of the
trains were varied and the simulation was run to get a good representation of the system
delay at varied levels of traffic. The traffic and operating parameters recorded were the
following.
1. X_i: the number of trains of type i in the sub-network.
2. D_1: the number of trains that enter the sub-network from the opposite direction
after the entry and before the exit of the aforementioned randomly selected train.
3. D_2: the number of trains already present in the sub-network when the randomly
selected train enters the sub-network and traveling in a direction opposite to it.
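The three recorded quantities can be made concrete with a small sketch that derives them
from a log of trips through the sub-network. The `Trip` record, its field names and the
sample trips are hypothetical; the simulation model's own data structures are not
described in the text.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Trip:
    train_type: str
    direction: int       # +1 or -1
    entry_time: float    # time the train enters the sub-network
    exit_time: float     # time the train leaves the sub-network

def snapshot(selected, others):
    """System state recorded when `selected` enters the sub-network."""
    t0, t1 = selected.entry_time, selected.exit_time
    present = [o for o in others if o.entry_time <= t0 < o.exit_time]
    # X_i: trains of each type inside the sub-network at time t0.
    x = Counter(o.train_type for o in present)
    # D_1: opposing trains that enter after t0 and before the selected train exits.
    d1 = sum(1 for o in others
             if o.direction != selected.direction and t0 < o.entry_time < t1)
    # D_2: opposing trains already inside the sub-network at time t0.
    d2 = sum(1 for o in present if o.direction != selected.direction)
    return x, d1, d2

selected = Trip("oil", +1, 10.0, 20.0)
others = [
    Trip("carload", -1, 5.0, 12.0),      # already inside, opposing
    Trip("oil", +1, 8.0, 15.0),          # already inside, same direction
    Trip("carload", -1, 14.0, 25.0),     # enters later, opposing
    Trip("intermodal", -1, 21.0, 30.0),  # enters after the selected train has left
]
x, d1, d2 = snapshot(selected, others)
```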
Once the simulation-generated data has been gathered, it is fed into the statistical
software package Minitab to generate a regression equation for delay as a function of the
traffic and operating parameters. For each aggregated node, we generated regression
equations with these parameters, and also with the interaction effects between the traffic
parameters. Finally, the regression equation with a reasonable value of adjusted R-
squared and number of significant parameters is selected to be incorporated into the
routing and scheduling IP model.
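The regressions in this chapter were run in Minitab. Purely as an illustration of the
two quantities used for model selection (the fitted coefficients and the adjusted
R-squared), the following plain-Python sketch fits a first-order model by least squares.
The data are synthetic and the helper names are assumptions made for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares fit of y on the predictors in `rows` plus an intercept.
    Returns (coefficients, adjusted R-squared)."""
    design = [[1.0] + list(r) for r in rows]
    n, p = len(y), len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear(xtx, xty)                      # normal equations
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    adj_r2 = 1 - (sse / (n - p)) / (sst / (n - 1))
    return beta, adj_r2

# Synthetic observations of (X_1, D_1, D_2) and a delay built from known coefficients.
rows = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (2, 1, 1)]
delays = [1 + 2 * x1 + 0.5 * d1 + 3 * d2 for x1, d1, d2 in rows]
beta, adj_r2 = fit_ols(rows, delays)
```

Because the synthetic delays are generated without noise, the fit recovers the
coefficients used to build them and the adjusted R-squared is essentially 1; real
simulation output would of course give a lower value.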
5.2 Generating the delay function: An example
In this section, we illustrate the procedure of developing the delay equation for a sub-
network. The network being considered here is a Union Pacific - Alhambra railway track
between the CP Dayton Taylor Yard and the City of Industry, shown below in Figure 7. It
is a double-track segment with crossings, and is 5.79 miles in length.
Figure 7: Illustration of a sub-network to be aggregated
As explained above, the simulation model was run for this sub-network over 100
days. We considered four types of trains for this sub-network - long double stack (8000
feet), other intermodal (6000 feet), carload (6500 feet) and oil (5000 feet). The traffic and
operating parameters were recorded when randomly selected trains enter the sub-
network. The arrival rates of the trains were altered in order to get a good representation
of the impact of the traffic and operating parameters on delay. Over 25,000 data points
were collected in this manner. The following regression models were developed.
1. X_i's, D_1 and D_2.
2. X_i's, quadratic interaction effects of X_i's, D_1 and D_2.
3. X_i's, quadratic and cross-product interaction effects of X_i's, D_1 and D_2.
4. X_i's, quadratic, cross-product and cubic interaction effects of X_i's, D_1 and D_2.
Figure 8: Regression equation for delay
Figure 9: Regression equation for delay with quadratic effects
Figure 10: Regression equation for delay with quadratic and cross-product terms
Figure 11: Regression equation with quadratic, cross-product and cubic effects
We assume a 5% significance level. In Figure 8, all the traffic and operating
parameters are significant, and the regression equation has an adjusted R-squared value
of 78.5%. As can be seen from Figures 9, 10 and 11, there is very little improvement in
the adjusted R-squared value as we add the interaction terms. Since these traffic and
operating parameters need to be mathematically expressed in order to be included in the
IP model, and since we preferred to keep the IP model free of any non-linear expressions,
we decided to go with the regression equation for delay with just the first-order traffic
and operating parameters. Because the variation in the adjusted R-squared value is
insignificant, this choice does not compromise the quality of the delay function.
5.3 Delay function in IP model
Once the delay equations have been derived, they need to be inserted into the routing
and scheduling IP model to get the new number of trains routed over each route (and
aggregate node). In the IP formulation presented in Section 4.2, constraint (4) enforces
the time a train should take to traverse a node by considering the speed limit over the
node's segment. However, in reality, the maximum speed a train can travel is the
minimum of its speed limit and the segment's speed limit, denoted by
. So, we
modify constraint (4) accordingly.
(4a)
In the case of an aggregated node, the speed limit of the sub-network is the weighted
speed-limit of all the track segments making up the sub-network, as explained previously.
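As a small worked example of the two rules just stated, the sketch below computes a
length-weighted speed limit for an aggregated node and the resulting free-running time
when the train is capped by the lower of its own limit and the node's limit. The function
names and the segment data are hypothetical.

```python
def weighted_speed_limit(segments):
    """Length-weighted average speed limit of a sub-network.
    `segments` is a list of (length_miles, speed_limit_mph) pairs."""
    total_length = sum(length for length, _ in segments)
    return sum(length * speed for length, speed in segments) / total_length

def min_traversal_hours(length_miles, train_limit_mph, node_limit_mph):
    """Free-running time when speed is capped by the lower of the two limits."""
    return length_miles / min(train_limit_mph, node_limit_mph)

# Hypothetical sub-network: 2 miles at 30 mi/hr followed by 6 miles at 50 mi/hr.
segments = [(2.0, 30.0), (6.0, 50.0)]
node_limit = weighted_speed_limit(segments)               # (2*30 + 6*50) / 8
slow_train = min_traversal_hours(8.0, 40.0, node_limit)   # train limit binds
fast_train = min_traversal_hours(8.0, 70.0, node_limit)   # node limit binds
```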
The regression-based delay equation in Figure 8 above is represented by the
following equations (11) and (12). We separate the delay expression for the two
directions, denoted by
and
. Note that
and
are indexed in because
delay is being expressed as a function of the traffic and the operating parameters, both of
which vary with time.
(11)
(12)
, 1
j
, 2
j
, 3
j
, 4
j
,
and
are the constant and the coefficients of
,
,
,
,
and
respectively from the regression-based delay function.
,
,
and
are subsets of for the long double stack, other intermodal, oil and carload types of
train respectively. For the IP model, it is not easy to mathematically express D_1, the
number of trains that can potentially enter the sub-network in the opposite direction
between the entry and exit times of the train under consideration, since this is something
that occurs in the future. So, we make an approximation by expressing it as the number of
trains traveling in the opposite direction that are in the adjacent nodes k, for all (j,k) ∈ A, at
the time of entry of a train h into an aggregated node j from, say, i. For a train entering an
aggregate node J at, say, ,
is expressed mathematically as the number of trains
traveling in the opposite direction that have entered j by , but have not yet departed from
j. Now, in addition to constraint (4a), we need to introduce two more delay constraints for
the aggregated nodes, accounting for D_1 and D_2. For this purpose, we introduce the
following equations.
According to constraint (13),
equals the time train h enters j from i, if
is
greater than zero, and it is equal to zero otherwise.
(13)
Constraints (14) and (15) enforce the minimum time a train entering an aggregate
node j takes to travel through it, as per the regression-based delay equations.
(14)
(15)
Constraints (4a), (14) and (15) determine the actual delay experienced by a train
traveling through an aggregate node.
5.4 A generic delay estimation procedure
In this section, we extend the regression-based delay estimation procedure, presented
above, to generic rail networks. A methodology, based on Design of Experiments
techniques, is used to predict delays in railway networks, while capturing interactions
between the network, traffic and operating parameters. As explained in detail below,
generic networks are first constructed to represent the range of the physical attributes of
the actual networks for which delays are to be estimated. Next, we run simulations
representing train movements through these generic networks, and record the relevant
system state data. Finally, a regression analysis is run on the collected data. The
resulting regression equation is shown to accurately estimate the travel time delay on an
actual network whose physical attributes lie within the extreme limits of the networks
used in the experiments.
Once again, we used the simulation model developed by Lu et al. (2004) to run
simulations on the generic networks. Considering the Downtown Los Angeles - Inland
Empire Trade Corridor shown in Figure 3 as an example, the authors show that the delays
experienced by the trains as per the simulation model are very close to the real-world
travel time delays. This simulation model is used to study the impact of the network
topology and traffic parameters on the delay experienced by trains in traversing a sub-
network. We assume trains can accelerate and decelerate instantaneously to obey track
speed limits, and the simulation model is modified accordingly. Hence, the maximum
speed of a train at each instant of time is simply set to the constraining speed-limit of the
track segment. Furthermore, a Poisson arrival process is assumed for each train. The
control parameters for each simulation are as follows:
1.
: the arrival rate of each type of train.
2. L: the length of the sub-network, that is, the distance in miles between the start
and the end of the sub-network.
3. V: the speed-limit of the sub-network. The free running time of the train over the
sub-network is inversely proportional to the minimum of V and the train speed-
limit.
4. C: the number of crossings or sidings for a double- or single-track respectively.
They are assumed to be uniformly distributed. These enable a smooth flow of
traffic within the sub-network. Typically, delay reduces with an increase in C.
5. S: the spacing over a sub-network. This is defined as the portion of the sub-
network over which crossings (or sidings) are uniformly distributed. If crossings
are uniformly distributed over the entire track length (i.e., S = 1), then trains can
more easily overtake and/or cross each other, than if all the crossings are
concentrated at one end of the network segments. Therefore, delay increases with
a decrease in S due to possible interactions between trains on two consecutive
crossings (or sidings).
Among the above control parameters, L, V, C and S are utilized to represent various
sub-network configurations in order to study the impact of these four on the travel time
delay. To be able to build a generic delay estimation model for a single-track or a double-
track, we assume that each of these four parameters can take three different values which
are labeled as LO, MID and HI. These three levels can be thought of as representing the
lowest, middle and highest sub-network length, speed-limits, crossings (or sidings) and
spacing that can be found in the complex railway network under consideration. There are
3^4, or 81, sub-network configurations that need to be simulated in order to build the delay
model. However, due to the need for efficiency, we invoke a response surface
methodology tool known as fractional factorial design. We develop a one-third fractional
factorial design, wherein we assume third-order and higher interactions between the four
control parameters to be negligible, and instead concentrate our efforts in studying the
main effects and the two-factor interactions. According to standard rules (Montgomery,
1984), we choose 27 of these 81 designs so as to get a good representation of the
interaction effects. In response surface terminology, this is called a 3^(4-1) design. In this
way, the generic delay model developed would be able to estimate delay, with high
precision, on a host of sub-networks within the extreme values of the four topological
parameters. An important thing to note here is that we do not mix double-track sub-
networks with single-track sub-networks. The generic delay model is developed
separately for each of them.
The simulation is run for each of the 27 sub-network configurations using AweSim!
2.0 (Pritsker and O'Reilly, 1999), by altering the data files. For each sub-network
configuration, simulations are run for a fixed ratio of the types of trains traveling through
the sub-network, and various values of
for each train type. Two stations are assumed to
be present at either end of a sub-network, and there are an equal number of trains
travelling in either direction. Trains are made to travel between their respective origin and
destination stations. Furthermore, we also assume that there are no stations in between
the origin and destination stations. During each run, the state of the system is recorded at
the arrival times of randomly selected trains at their respective origin station. At each
randomly sampled time instant we record
.
Figure 12: A double-track railway segment with 7 moving trains at the time train
is
deciding to enter on either A or B
Figure 12 above shows a double-track sub-network of length L = 10 mi with
crossings CD and GH and siding EF, i.e., C = 3. The speed limit V over this sub-network
is 35 mi/hr. The crossings and the sidings are uniformly distributed over
or 75% of
the length of the sub-network, i.e., S = 0.75.
is a train that is about to enter the
network. There are four train types, represented by rectangles, squares, circles and
ellipses. Of the 7 trains already existing in the sub-network, 3 are traveling in the same
direction as
would upon entering, and 4 are traveling in the opposite direction. Hence,
= 4. At time of entry of
,
(squares) = 2,
(rectangles) = 1,
(circles) = 2 and
(ellipses) = 2.
represents the trains that would enter through IJ from the adjacent
sub-network(s) after
enters through AB and before it exits through IJ.
,
and
are called covariate parameters because they can be altered only by changing the control
parameter(s), in this case,
.
and
represent traffic moving in the opposite
direction, relative to the randomly selected train, and therefore impact delay by providing
"resistance" to its smooth flow.
Then, we run a single regression analysis over the data collected from the 27 sub-
network configurations, using Minitab. The parameters used are the X_i's, D_1, D_2, L, V, C
and S, and the response variable is the travel time delay experienced by the train, Y. A
normal probability plot and a plot of the residuals
versus the predicted response
(fitted response value from the regression analysis) are plotted. This is done to examine
the fitted model to ensure that it provides an adequate approximation to the true system,
and to verify that none of the least squares regression assumptions are violated. We also
run regressions with quadratic and cross-product interaction effects of the X_i's and the
network topological parameters, in order to study their effects on the delay.
In the next section, we present an example wherein we build these generic delay
models for single and double-track sub-networks for the railway network in the Los
Angeles area. On a side note, we use the following nomenclature in the remaining
sections of this paper: actual delay refers to the delay experienced by the trains in real-
world rail operations, simulation delay refers to the travel time delay experienced by the
trains as per the simulation model by Lu et al. (2004), and predicted delay refers to the
delay estimated from the delay estimation equation (obtained from the regression
analysis), that is expected to be experienced by the trains in traveling through a network.
5.5 Case study: Los Angeles area railway network
The Ports of Los Angeles and Long Beach are the busiest ports on the West Coast.
Three railroad lines, Union Pacific - Alhambra, Union Pacific - San Gabriel and
Burlington Northern Santa Fe, operate service from downtown Los Angeles to the ports.
Travel time delays from the simulation model on this network have been shown by Lu et
al. (2004) to be close to real-world delay values. The trackage in this region is primarily a
combination of single and double-tracks. Crossings and sidings are provided for the
purpose of train meets and overtakes, thereby ensuring a smooth traffic flow. Four types
of trains primarily travel on these tracks – long double stack (8000 feet), intermodal
(6000 feet), carload (6500 feet) and oil (5000 feet). The speed-limits of these trains are
70, 55, 50 and 40 mi/hr respectively. In our experiments, we assume a fixed ratio of these
four train types. For each subnetwork, multiple simulation runs are performed, each with
a different combination of the
's. The primary purpose of this is to get a good
representation of the system space, that is, how the delay varies with different values of
, for a fixed setting of the network topology parameters.
5.5.1 Delay estimation for a double-track sub-network
For the purpose of designing networks to build a generic delay model for a double-
track sub-network, the three grades of values listed in Table 5 below were selected for the
network topology parameters.
                                 LO     MID    HI
Length (mi)                      5.0    12.5   20.0
Speed-limit (mi/hr)              15     35     55
Crossings                        1      3      5
Spacing (fraction of length)     0.50   0.75   1.00
Table 5: Settings for network topology parameters for double-track simulations
These values reflect the range of the four parameters within which a majority of the
double-track sub-networks in the Los Angeles area network lie. As described in the
previous section, we now develop a one-third fractional factorial design on which the
simulations are to be run. The 27 treatment combinations that are used to run the
simulations are shown in Table 6.
Run  L     V   C  S       Run  L     V   C  S
1    12.5  55  1  0.50    15   12.5  55  5  0.75
2    20.0  35  3  1.00    16   12.5  15  1  1.00
3    5.0   35  1  1.00    17   12.5  15  3  0.75
4    5.0   15  1  0.50    18   20.0  55  3  0.75
5    5.0   35  3  0.75    19   5.0   15  3  1.00
6    20.0  35  1  0.50    20   5.0   35  5  0.50
7    20.0  15  5  1.00    21   20.0  35  5  0.75
8    5.0   15  5  0.75    22   20.0  15  1  0.75
9    12.5  35  3  0.50    23   20.0  55  5  0.50
10   20.0  55  1  1.00    24   12.5  35  1  0.75
11   20.0  15  3  0.50    25   5.0   55  3  0.50
12   12.5  15  5  0.50    26   12.5  35  5  1.00
13   5.0   55  5  1.00    27   12.5  55  3  1.00
14   5.0   55  1  0.75
Table 6: 27 parameter combinations considered in the one-third fractional factorial design
for double-track simulations
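The text does not state which generator was used to select the 27 runs. One defining
relation that appears consistent with the rows of Table 6 whose values fall on the Table 5
levels is that the four level indices (0 for the lowest level, 1 for the middle, 2 for the
highest) sum to a multiple of 3. The sketch below enumerates that fraction; the relation
is an inference from the table, not the author's stated construction.

```python
from itertools import product

# The three levels of each topology parameter, taken from Table 5.
LEVELS = {
    "L": [5.0, 12.5, 20.0],
    "V": [15, 35, 55],
    "C": [1, 3, 5],
    "S": [0.50, 0.75, 1.00],
}

def one_third_fraction():
    """Enumerate the 27 runs of a 3^(4-1) design.
    Assumed defining relation: level indices l + v + c + s = 0 (mod 3)."""
    runs = []
    for l, v, c, s in product(range(3), repeat=4):
        if (l + v + c + s) % 3 == 0:
            runs.append((LEVELS["L"][l], LEVELS["V"][v],
                         LEVELS["C"][c], LEVELS["S"][s]))
    return runs

design = one_third_fraction()
```

Under this relation every pair of levels of any two factors occurs exactly three times,
which is what allows main effects and two-factor interactions to be estimated from 27
rather than 81 configurations.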
For each of the 27 treatment combinations, 25 simulation runs are made, each with a
different combination of the
values for the four train types. This is done to obtain
travel time estimates under various network operating conditions, described by the X and
D variables. The simulations are run at the real-world daily peak traffic conditions, thus
representing a stationary process. Therefore, the variance of the observed values is
constant. As explained previously, in each simulation run, the state of the system is
recorded at random intervals of time, each triggered by the arrival of a randomly selected
train at its respective origin station. In each simulation run, approximately 1000 data
points are recorded in this manner. Finally, all the data collected from these 27x25
simulations are combined to fit a regression model. The results are plotted in the graphs
below.
Figure 13: Residuals vs. predicted response for double-track sub-network simulation
Figure 14: Normal probability plot for double-track sub-network simulation
In the plot of the residuals versus the predicted response
, the general impression
should be that the residuals scatter randomly on the display, suggesting that the variance
of the predicted response is constant for all values of the mean of
[see Myers and
Montgomery (2002), Law and Kelton (1991)]. However, in Figure 13, our plot exhibits a
funnel-shaped pattern, which indicates that the variance of the predicted response
depends on its mean value. In Figure 14, it is apparent that the normality assumption is
being violated. A remedial procedure for these abnormalities is to transform the response
variable Y. A Box-Cox transformation procedure is carried out, and the transformation
parameter that minimizes the sum of squares of error is selected. For the experiment
presented above, a natural log transformation has the best effect in improving the fit of
the model to the data.
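A minimal sketch of this selection step is given below. It applies the normalized
Box-Cox transform for a grid of candidate parameters, fits a one-predictor regression to
each transformed response, and keeps the parameter with the smallest error sum of squares.
The grid, the single predictor and the synthetic data are assumptions made for the
example; the analysis in the text was carried out in Minitab on the full data set.

```python
import math

def box_cox(y, lam):
    """Normalized Box-Cox transform of positive values; lam = 0 is the log case.
    Scaling by the geometric mean makes the error sums of squares comparable."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if lam == 0:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1) / (lam * gm ** (lam - 1)) for v in y]

def sse_simple_regression(x, z):
    """Error sum of squares from regressing z on x with an intercept."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    slope = sxz / sxx
    return sum((b - mean_z - slope * (a - mean_x)) ** 2 for a, b in zip(x, z))

def best_lambda(x, y, grid=(-2, -1, -0.5, 0, 0.5, 1, 2)):
    """Candidate parameter with the smallest error sum of squares."""
    return min(grid, key=lambda lam: sse_simple_regression(x, box_cox(y, lam)))

# Synthetic example: delay grows exponentially with a hypothetical traffic count.
traffic = [1, 2, 3, 4, 5, 6, 7, 8]
delay = [math.exp(0.3 + 0.4 * t) for t in traffic]
chosen = best_lambda(traffic, delay)
```

For data generated this way the log transform (parameter 0) linearizes the relation
exactly, so it is the value selected.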
In addition to the single-order effects, we also fit regression models to a data set that
includes interactions in the X_i's and the network parameters. In Table 7, we retrace the
backward elimination procedure. As per this procedure, we start with the single and
higher-order effects of the X's, and the single-order and higher-order interaction effects of
the network topology parameters. The higher-order effects of the four network topology
parameters L, V, C and S comprise the quadratic effects represented by LL, VV, CC
and SS, and the interaction effects represented by LV, LC, LS, VC, VS and CS. After
running regression analysis, we delete those effects that have no or least impact on the
adjusted R-sq value. Next, we run regression analysis with just the effects that were not
deleted in the previous step. This is done iteratively. We stop when we are left with just
the statistically significant terms in the regression equation.
The size of the data sets obtained from the simulation runs creates a problem when
assessing statistical significance. Specifically, the large number of degrees of freedom
for error makes all of the candidate regression terms significant at typical alpha levels.
So, instead
of using P-values as criteria for model selection, we use the relative magnitudes of the
coefficients and the adjusted R-squared value. In effect, we are eliminating terms that
provide negligible contributions to the predicted values. Since the P-values of all the
terms are always significant, we eliminate the term with the smallest coefficient value.
An important observation regarding Table 7 is that the network topology
variables in the fractional factorial design have been chosen so as to keep them linearly
independent of each other. By virtue of this design, if any of the network topology
interaction terms has a low coefficient value and is chosen to be eliminated, then all the
topology interaction terms that have been so chosen can be simultaneously eliminated
from the regression equation.
     Regression parameters                       Adjusted  Eliminated terms
                                                 R-Sq (%)
1    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,   93.7      -
     X1X4,X2X2,X2X3,X2X4,X3X3,X3X4,X4X4,LL,
     LV,LC,LS,VV,VC,VS,CC,CS,SS
2    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,   93.6      LV,LC,LS,VC,VS,CS
     X1X4,X2X2,X2X3,X2X4,X3X3,X3X4,X4X4,LL,
     VV,CC,SS
3    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,   93.6      X3X3
     X1X4,X2X2,X2X3,X2X4,X3X4,X4X4,LL,VV,CC,SS
4    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,   93.6      X1X1
     X2X2,X2X3,X2X4,X3X4,X4X4,LL,VV,CC,SS
5    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,   93.6      X2X2
     X2X3,X2X4,X3X4,X4X4,LL,VV,CC,SS
6    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,   93.6      X4X4
     X2X3,X2X4,X3X4,LL,VV,CC,SS
7    X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,   89.9      LL,VV,CC,SS
     X2X3,X2X4,X3X4
8    X1,X2,X3,X4,D1,D2,L,V,C,S                   87.9      X1X2,X1X3,X1X4,X2X3,X2X4,X3X4
Table 7: Backward elimination for double-track sub-network simulation
By comparing the regression models shown in Table 7, we notice that the one in row
6 has the least number of significant terms without a drastic reduction in the adjusted R-
squared value. This regression model is shown in Figure 15. In a physical sense, this
regression model suggests that the delay experienced by a train is impacted by the
heterogenic mix of traffic flowing in the opposite direction. Furthermore, in addition to
their single-order effects, quadratic interactions of the network parameters influence
delay. In this manner, we derive a regression model that defines an exponential relation
between delay and traffic, operating and network parameters.
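The exponential form follows from the natural log transformation of the response. Writing
the terms retained in model 6 of Table 7 with generic coefficients (the symbols
\beta, \gamma and \delta are notation assumed here for illustration; the fitted values are
those reported in Figure 15), the model and its back-transformed form are:

```latex
\begin{align*}
\ln Y &= \beta_0 + \sum_{i=1}^{4} \beta_i X_i + \gamma_1 D_1 + \gamma_2 D_2
        + \delta_L L + \delta_V V + \delta_C C + \delta_S S \\
      &\quad + \sum_{1 \le i < k \le 4} \beta_{ik} X_i X_k
        + \delta_{LL} L^2 + \delta_{VV} V^2 + \delta_{CC} C^2 + \delta_{SS} S^2, \\
Y &= \exp\!\Big( \beta_0 + \sum_{i=1}^{4} \beta_i X_i + \gamma_1 D_1 + \gamma_2 D_2
        + \delta_L L + \delta_V V + \delta_C C + \delta_S S \\
      &\qquad\quad + \sum_{1 \le i < k \le 4} \beta_{ik} X_i X_k
        + \delta_{LL} L^2 + \delta_{VV} V^2 + \delta_{CC} C^2 + \delta_{SS} S^2 \Big).
\end{align*}
```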
The regression equation below can be used to estimate delay (Y) for an actual double-
track sub-network.
Figure 15: Detailed regression results for model 6 in the double-track simulation
The next logical step is to test our delay modeling methodology. As part of this step,
we adopt two validation strategies. First, we randomly choose five treatment combinations
of the 54 that were not used in the one-third fractional factorial design. The performance
of the generic delay estimation model for these five network configurations is shown in
rows 1-5 in Table 8 below. In the second validation strategy, we choose a section of the
Los Angeles area railway network, and test the performance of our delay estimation
model on this sub-network. This result is shown in row 6 in Table 8 below.
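| Config. | L | V | C | S | Relative error, mean (%) | Relative error, median (%) | Percent within 20% rel. error |
|---|---|---|---|---|---|---|---|
| 1 | 5.0 | 15 | 1 | 1.00 | 14.55 | 9.09 | 87.82 |
| 2 | 12.5 | 35 | 3 | 0.75 | 9.06 | 5.75 | 88.65 |
| 3 | 20.0 | 35 | 3 | 1.00 | 11.22 | 5.75 | 81.77 |
| 4 | 5.0 | 55 | 3 | 0.75 | 4.92 | 0.78 | 93.40 |
| 5 | 12.5 | 55 | 3 | 0.50 | 9.19 | 6.78 | 88.03 |
| 6 | 6.0 | 36.67 | 3 | 1.00 | 20.35 | 20.34 | 78.28 |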
Table 8: Validation of the double-track delay model. 1-5 are from the 54 unused
topological sub-network configurations. 6 is a real rail network.
For a given network, the simulation delay values for trains are derived from running
the simulation model. The relative error is defined as the absolute value of the difference
between the simulation delay and the predicted delay (from the delay estimation equation),
divided by the simulation delay. The mean and the median of the relative error are given
in columns 6 and 7. Our observation from these tests is that the delay estimation model
estimates delay with high accuracy under normal, expected levels of traffic. However, it
tends to overestimate delay under conditions of high traffic in a sub-network
that could potentially lead to a deadlock. These values are small in number and, therefore,
are not removed while collecting descriptive statistics. Instead, they are treated as
extreme values. Hence, in this case, the median of the relative error proves to be a more
robust measure than the mean, and comparing the median with the mean gives an
estimate of the effect of these extreme values on the mean of the relative error. The final
column lists the percentage of the data set with a relative error within 20%,
which gives an estimate of the number of these extreme values.
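The three summary statistics described above can be computed as in the following minimal Python sketch. The delay values are made up for illustration (the last pair mimics an overestimated, near-deadlock observation), and the function name `relative_error_summary` is hypothetical.

```python
from statistics import mean, median

def relative_error_summary(simulated, predicted, threshold=0.20):
    """Mean and median relative error (%) and share of data within threshold (%)."""
    rel = [abs(s - p) / s for s, p in zip(simulated, predicted)]
    within = sum(1 for r in rel if r <= threshold) / len(rel)
    return {
        "mean_pct": 100 * mean(rel),
        "median_pct": 100 * median(rel),
        "pct_within": 100 * within,
    }

# Made-up delays; the last prediction is a large overestimate.
sim = [10.0, 20.0, 40.0, 50.0]
pred = [11.0, 19.0, 42.0, 100.0]
summary = relative_error_summary(sim, pred)
```

In this toy data set a single extreme value lifts the mean relative error to about 30%, while the median stays at about 7.5%, which is the robustness argument made in the text.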
We next investigate the conditions under which our delay model provides small relative
errors, since the previous analysis showed that the relative error can be large under
extremely heavy traffic. We consider two cases in this analysis: light and medium traffic. In
other words, we present the performance of our delay estimation model for double-tracks
without including the extreme values of traffic indicative of a high degree of congestion.
In Table 9 below, we compare the predicted delay value with the delay value obtained
from the simulation model for the same six network configurations listed in Table 8. We
compute the portion of the data set, with relative error in the delay values within 10%, for
two different cases. In the row labeled `Case 1', we select the data where just the right
number of trains co-exist in the network so as to maintain the minimum safety distance,
that is, the number of trains ≤
2
1.5
, assuming trains are 1.5 miles long and the safety
distance is of a similar length. In other words, we compare the predicted and simulation
delay values for low traffic densities without any queues at either station. The values
listed in this row show that our delay estimation model performs fairly well under low
traffic conditions. In the row labeled `Case 2', we select data values where the quantity
number of trains
≤ 80%. This can be thought of as the utilization of the network being ≤
80%. The maximum number of trains in this case will be higher than the number of trains
in Case 1. Since all trains cannot be accommodated simultaneously, there might be some
queuing occurring at either or both stations. Under this case of medium traffic densities,
our delay estimation model continues to perform well, as more than 90% of the data is
within 10% relative error for all the six network configurations.
| Test config. | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Case 1 | 95.02 | 94.87 | 94.10 | 94.67 | 93.44 | 93.32 |
| Case 2 | 92.54 | 93.41 | 90.19 | 92.75 | 91.65 | 90.17 |
Table 9: Validation of the double-track delay model. Case 1: low traffic intensities. Case
2: medium traffic intensities
5.5.2 Delay estimation for a single-track sub-network
Delay estimation for a single-track sub-network is analogous to the delay estimation for a
double-track sub-network. In Table 10 below, we list the three grades of values for the
network topology parameters that were used for estimating delay for single-track sub-
networks.
| Parameter | LO | MED | HI |
|---|---|---|---|
| Length (mi) | 10.0 | 15.0 | 20.0 |
| Speed-limit (mi/hr) | 15 | 35 | 55 |
| Crossings | 2 | 3 | 4 |
| Spacing (%) | 0.70 | 0.85 | 1.00 |
Table 10: Settings for network topology for single-track simulations
As in the case of a double-track, we develop a one-third fractional factorial design to
run the simulations and develop the delay model. Table 11 below lists the 27 treatment
combinations selected from the 81 possible by assuming third-order and higher
interactions to be negligible.
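| No. | L | V | C | S | No. | L | V | C | S |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | 35 | 4 | 0.70 | 15 | 20 | 55 | 2 | 1.00 |
| 2 | 10 | 55 | 4 | 1.00 | 16 | 20 | 55 | 3 | 0.85 |
| 3 | 15 | 35 | 3 | 0.70 | 17 | 20 | 15 | 4 | 1.00 |
| 4 | 20 | 15 | 3 | 0.70 | 18 | 10 | 35 | 2 | 1.00 |
| 5 | 10 | 55 | 2 | 0.85 | 19 | 20 | 55 | 4 | 0.70 |
| 6 | 15 | 55 | 2 | 0.70 | 20 | 15 | 55 | 4 | 0.85 |
| 7 | 15 | 15 | 3 | 0.85 | 21 | 10 | 15 | 3 | 1.00 |
| 8 | 10 | 15 | 4 | 0.85 | 22 | 15 | 15 | 2 | 1.00 |
| 9 | 20 | 35 | 2 | 0.70 | 23 | 15 | 35 | 4 | 1.00 |
| 10 | 15 | 55 | 3 | 1.00 | 24 | 20 | 35 | 4 | 0.85 |
| 11 | 10 | 15 | 2 | 0.70 | 25 | 20 | 15 | 2 | 0.85 |
| 12 | 20 | 35 | 3 | 1.00 | 26 | 10 | 55 | 3 | 0.70 |
| 13 | 15 | 15 | 4 | 0.70 | 27 | 15 | 35 | 2 | 0.85 |
| 14 | 10 | 35 | 3 | 0.85 | | | | | |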
Table 11: 27 parameter combinations considered in the one-third fractional factorial
design for single-track simulations
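One common way to build a one-third fraction of a 3^4 design is to code the LO/MED/HI levels as 0/1/2 and keep only the runs whose coded levels sum to 0 modulo 3. The text does not state which defining relation was used, so the relation below is an assumption; it is offered because each row of Table 11 is consistent with it under this coding. A minimal Python sketch:

```python
from itertools import product

# Levels from Table 10, in LO/MED/HI order (coded 0/1/2).
LEVELS = {
    "L": [10.0, 15.0, 20.0],
    "V": [15, 35, 55],
    "C": [2, 3, 4],
    "S": [0.70, 0.85, 1.00],
}

def one_third_fraction(levels):
    """Keep the runs of the full 3^k factorial whose coded levels sum to 0 mod 3."""
    names = list(levels)
    design = []
    for idx in product(range(3), repeat=len(names)):
        if sum(idx) % 3 == 0:  # assumed defining relation
            design.append(tuple(levels[n][i] for n, i in zip(names, idx)))
    return design

design = one_third_fraction(LEVELS)
```

The full factorial has 3^4 = 81 runs and the fraction keeps 27 of them, leaving 54 unused combinations from which validation configurations can be drawn.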
All the data collected from these 27 × 25 simulations are combined to fit a regression
model. This model has an adjusted R-squared value of 79.4%. Similar to the simulation
data for double-track sub-networks, the normality plot and the plot of the residuals versus
the predicted response for single-track sub-networks depict a violation of the normality
and homoscedasticity assumptions. A remedial Box-Cox transformation is carried out.
Similar to the case of a double-track, the natural logarithm of the response variable, Y, is
used as the transformed response. We begin with a regression model containing all the
second-order interaction terms of the X_i's and the topology parameters, and by using
backward elimination we derive a regression model to estimate delay on a single-track
sub-network.
| Regression | Parameters | Adjusted R-Sq (%) | Eliminated terms |
|---|---|---|---|
| 1 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,X1X4,X2X2,X2X3,X2X4,X3X3,X3X4,X4X4,LL,LV,LC,LS,VV,VC,VS,CC,CS,SS | 91.1 | - |
| 2 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,X1X4,X2X2,X2X3,X2X4,X3X3,X3X4,X4X4,LL,VV,CC,SS | 91.0 | LV,LC,LS,VC,VS,CS |
| 3 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,X1X4,X2X3,X2X4,X3X3,X3X4,X4X4,LL,VV,CC,SS | 91.0 | X2X2 |
| 4 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,X1X4,X2X3,X2X4,X3X4,X4X4,LL,VV,CC,SS | 91.0 | X3X3 |
| 5 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X1,X1X2,X1X3,X1X4,X2X3,X2X4,X3X4,LL,VV,CC,SS | 91.0 | X4X4 |
| 6 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,X2X3,X2X4,X3X4,LL,VV,CC,SS | 90.9 | X1X1 |
| 7 | X1,X2,X3,X4,D1,D2,L,V,C,S,X1X2,X1X3,X1X4,X2X3,X2X4,X3X4 | 89.4 | LL,VV,CC,SS |
| 8 | X1,X2,X3,X4,D1,D2,L,V,C,S | 88.2 | X1X2,X1X3,X1X4,X2X3,X2X4,X3X4 |
Table 12: Backward elimination for single-track sub-network simulations
From Table 12, we note that the regression model in row 6 retains a high adjusted
R-squared value (90.9%) while including only the significant first-order and interaction
terms. The regression-based delay estimation equation for a single-track sub-network is
given below. The heterogeneity in the traffic flowing in the opposite direction and the
quadratic terms of the network parameters affect the delay experienced by a train
traveling on a single track.
Figure 16: Detailed regression results for model 6 in the single-track simulation
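The choice of row 6 can be read as picking the model with the fewest terms whose adjusted R-squared stays close to that of the full model. The sketch below encodes that reading in Python. The term counts are obtained by counting the parameters listed in each row of Table 12, and the tolerance of 0.5 percentage points is an assumption made for illustration, not a threshold stated in the text.

```python
# (row, number of terms, adjusted R-squared in %) taken from Table 12.
MODELS = [
    (1, 30, 91.1), (2, 24, 91.0), (3, 23, 91.0), (4, 22, 91.0),
    (5, 21, 91.0), (6, 20, 90.9), (7, 16, 89.4), (8, 10, 88.2),
]

def most_parsimonious(models, tolerance=0.5):
    """Fewest-term model within `tolerance` points of the best adjusted R-squared."""
    best_r2 = max(r2 for _, _, r2 in models)
    eligible = [m for m in models if best_r2 - m[2] <= tolerance]
    return min(eligible, key=lambda m: m[1])

chosen = most_parsimonious(MODELS)
```

Under this assumed tolerance the rule returns row 6, since rows 7 and 8 lose more than 1.5 percentage points of adjusted R-squared.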
The single-track delay model is validated in a similar fashion as the double-track
delay model. In Table 13 below, rows 1-5 show the performance of the delay model with
respect to the delay obtained by running simulations on 5 randomly chosen sub-network
configurations that were not used in the one-third fractional factorial design. Row 6
shows the performance of the delay model on an actual single-track sub-network existing
in the Downtown Los Angeles to the Ports railway network.
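| Config. | L | V | C | S | Relative error, mean (%) | Relative error, median (%) | Percent within 20% rel. error |
|---|---|---|---|---|---|---|---|
| 1 | 10 | 35 | 3 | 1.00 | 12.67 | 10.57 | 74.36 |
| 2 | 10 | 55 | 2 | 1.00 | 16.92 | 14.02 | 74.10 |
| 3 | 15 | 15 | 4 | 0.85 | 17.21 | 12.68 | 68.76 |
| 4 | 15 | 55 | 2 | 1.00 | 17.85 | 15.34 | 70.31 |
| 5 | 20 | 35 | 3 | 0.70 | 18.91 | 14.00 | 75.19 |
| 6 | 11.43 | 55 | 2 | 1.00 | 14.11 | 10.54 | 71.96 |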
Table 13: Validation of the single-track delay model. 1-5 are from the 54 unused network
configurations. 6 is an actual rail network.
The relative error is defined as the absolute value of the difference between the
simulation delay and the predicted delay (from the delay estimation equation), divided by
the simulation delay. The mean and the median of the relative error are given in columns 6
and 7. Our observation from these tests is that the delay estimation model estimates delay
with high accuracy under normal levels of traffic. However, it tends to
overestimate delay under conditions of high traffic in a sub-network that could potentially
lead to a deadlock. These values are small in number and, therefore, are not removed
while collecting descriptive statistics. Instead, they are treated as extreme values. In
the presence of these extreme values, the median of the relative error proves to be a more
robust measure than the mean, and comparing the median with the mean gives an
estimate of the effect of these extreme values on the mean of the relative error. The final
column lists the percentage of the data set with a relative error within 20%,
which gives an estimate of the number of these extreme values.
We next investigate the conditions under which our delay model provides small relative
errors, since the previous analysis showed that the relative error can be large under
extremely heavy traffic. We consider two cases in this analysis: light and medium traffic. In
Table 14 below, we compare the predicted delay value with the delay value obtained
from the simulation model for the same six network configurations listed in Table 13.
We compute the portion of the data set, with relative error in the delay values within
10%, for two different cases. In the row labeled `Case 1', we select the data where just the
right number of trains co-exist in the network so as to maintain the minimum safety
distance, that is, the number of trains ≤
1.5
1.5
, assuming trains and sidings are 1.5
miles long and the safety distance is of a similar length. In other words, we compare the
predicted and simulation delay values for low traffic densities without any queues at
either station. The values listed in this row show that our delay estimation model
performs fairly well under low traffic conditions. In the row labeled `Case 2', we select
data values where the quantity
number of trains
1.5
≤ 80%. This can be thought of as the
utilization of the network being less than or equal to 80%. The maximum number of
trains in this case will be higher than the number of trains in Case 1. Since all trains
cannot be accommodated simultaneously, there might be some queuing occurring at
either or both stations. Under this case of light to medium traffic densities, our delay
estimation model continues to perform well as is shown in the table below.
| Test config. | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Case 1 | 90.40 | 91.77 | 91.23 | 90.14 | 89.36 | 88.76 |
| Case 2 | 88.65 | 89.11 | 87.43 | 88.89 | 87.54 | 86.97 |
Table 14: Validation of the single-track delay model. Case 1: low traffic levels. Case 2:
medium traffic levels.
Chapter 6: Approximation-based Solution Procedures
Using the integer programming model (IP*) presented in Chapter 4 together with the
delay estimation techniques discussed in Chapter 5, we can now tackle the problem of
routing and scheduling trains on large and complex rail networks with combinations of
track segments of varying lengths, speed-limits, traffic conditions, and number of
crossings, junctions and sidings. The usual approaches used by researchers in the past to
solve large-scale integer programming problems include cutting planes, which tighten
the feasible region of the LP relaxation toward the convex hull of the integer-feasible
solutions of (IP*); exact solution methods such as branch-and-cut; and approximation
procedures. For the problem at hand, given the rail network topology and the movement
of trains in general, we did not have much success in developing effective cutting planes
that shorten the computational time. Moreover, due to the general nature of the problem
structure, we had exhaustively considered all possible constraints while formulating (IP*).
For these reasons, we decided to use approximation-based solution procedures to solve
(IP*) for real-world sized rail networks.
In this section, we present an approximation-based solution procedure to solve the
NP-hard train routing and scheduling problem (IP*). It is computationally impractical to
identify an optimal solution to problem (IP*) that lists the routes to be chosen by each
train while traveling through a network, the arrival and departure times at each node, and
the priorities during meets and overtakes between trains. At the planning level, the most
important decisions are the identification of the route for each train and the departure
times at the origin stations. The detailed explanation for this was presented in Chapter 3.
Hence, our solution procedures focus on identifying suitable integer solutions based on
the LP relaxation formulation of (IP*) for these two decisions. The decisions related to
the detailed sequence of trains at the subsequent stations in the network can be made later
at the operational level. Our proposed procedures determine only the route that each train
takes and the order of departure at the origin station, with the objective of reducing the
number of meet-pass interferences down the line.
The following sections describe the three solution procedures that were developed
to solve (IP*) for real-world railways of medium and large size within a reasonable
amount of computational time.
6.1 A solution approach using LP-relaxation
As previously noted, the complexity of (IP*) grows exponentially with the number of
nodes and arcs in the general network G. As a result, commercial solvers such as CPLEX
cannot be used to solve the capacity management problem for real-world networks of
medium scale (30-50 miles). One solution approach is to relax the integrality constraints
on the binary variables, obtain a solution to this relaxed problem using a solver
such as CPLEX, and then apply approximation techniques to convert the appropriate
linear variables to integer quantities. Since (IP*) is a minimization problem, its linear
relaxation affords a lower bound to the original problem. This lower bound solution
comprises fractional values of the Y variables that determine the initial routes and
schedules from the origin stations, which are the quantities of interest to us. In other
words, this solution allows each train h to be split across the various possible arcs
(o_h, j) from its origin o_h and across different time intervals t. Each arc (o_h, j) is part
of a unique route between the origin and destination of train h. As can be expected, this
solution is not feasible for (IP*). Hence, using information from the solution to the
relaxed problem, we need to determine the Y variables for the origin stations that can be
converted to integer values. We still allow the Y variables at intermediate stations to
remain fractional. This produces a partially feasible solution to (IP*) and is yet another
lower bound to the original capacity management problem. Next, we describe the
approximation techniques used to obtain this partially feasible solution.
The solution to the LP-relaxation problem produces fractional values of Y variables
that denote the route chosen by a train when it departs its origin. A simple rounding
algorithm that allocates a train h to a route corresponding to the largest
value, such
that
, is applied to the solution obtained to the LP-relaxation problem.
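The rounding step can be sketched as follows in Python: for each train, keep the successor node (and hence the route) that carries the largest fractional Y value in the LP-relaxation solution. The dictionary layout, the function name `round_routes` and the example values are assumptions made for illustration.

```python
def round_routes(y_frac):
    """Assign each train to the successor node with its largest fractional Y value.

    y_frac maps (train, successor_node, time) -> fractional Y value (assumed layout).
    Ties are resolved in favor of the entry encountered first.
    """
    best = {}
    for (train, node, _t), value in y_frac.items():
        if train not in best or value > best[train][1]:
            best[train] = (node, value)
    return {train: node for train, (node, _v) in best.items()}

# Made-up fractional solution for two trains leaving the same origin.
y_frac = {
    (0, "AB", 4): 0.6, (0, "EF", 2): 0.4,
    (2, "AB", 1): 0.3, (2, "EF", 0): 0.7,
}
routes = round_routes(y_frac)
```

In this toy example train 0 is rounded to the route through AB and train 2 to the route through EF.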
Another piece of information derived from the LP-relaxation output is the order of
departure from the origin nodes. To circumvent the problem posed by the fractional
decision variable values, we solve the following maximal matching problem [see
Papadimitriou and Steiglitz (1998)] to decide the departure times for each train from their
respective origins.
subject to
(16)
(17)
In the above formulation,
if train h departs its origin at time t, else it is 0.
such that (
) . Constraint (16) enforces the condition
that only one train can depart at any time instant t from the origin station, and constraint
(17) ensures that train h departs only once over the planning time horizon.
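To make constraints (16) and (17) concrete, the sketch below solves a tiny instance by brute force in Python: every train receives exactly one departure time, and no two trains from the same origin share a time instant. The objective used here (maximize the sum of weights, with the weights standing in for fractional LP values) is an assumption, as are the names and data; a practical implementation would use a dedicated matching or assignment algorithm rather than enumeration.

```python
from itertools import permutations

def match_departures(weights, trains, times):
    """Give each train one distinct departure time, maximizing the total weight.

    weights maps (train, time) -> weight; missing pairs count as 0 (assumption).
    Distinct times per train mirror constraint (16); one time per train
    mirrors constraint (17).
    """
    best_total, best_plan = float("-inf"), None
    for slots in permutations(times, len(trains)):
        total = sum(weights.get((h, t), 0.0) for h, t in zip(trains, slots))
        if total > best_total:
            best_total, best_plan = total, dict(zip(trains, slots))
    return best_plan

# Made-up weights for two trains sharing one origin station.
weights = {(0, 0): 0.5, (0, 1): 0.5, (2, 0): 0.9, (2, 1): 0.1}
plan = match_departures(weights, trains=[0, 2], times=[0, 1, 2])
```

Here train 2 takes t = 0, where its weight is largest, and train 0 is moved to t = 1.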
The solution obtained after applying the rounding algorithm and solving the
matching problem is a partial feasible solution to (IP*). We call this solution partial
feasible solution from LP-relaxation, abbreviated as “PFSLPR”.
6.2 A solution approach using routing constraints
In the previous section, we described the implications of solving the LP-relaxation
formulation of (IP*). In order to ensure that a train is assigned to a unique route as it
departs from an origin or intermediate node with multiple arcs, we used a rounding
algorithm. Another possibility to obtain the routes for a given set of trains is to include
routing constraints in addition to the constraints in the LP-relaxation formulation. For
this purpose, we introduce new binary variables
defined for all origin nodes
for
h .
is equal to 1 if train h travels to node j from its origin node
, and is
equal to 0 otherwise. The routing constraints are:
(18)
(19)
Constraint (19) enforces the condition that for a given train h, only one node j such
that
will be chosen as a successive node when h departs its origin.
Constraint (18) assigns an integral value to the Y variable corresponding to arc (
,j)
along which train h moves. The Y variables corresponding to the remaining forward arcs
from
are forced to be zero. The above constraints when added to the LP-relaxation
formulation force a train to choose a single route when departing from an origin station.
Similar routing constraints are also added for intermediate nodes from where trains could
depart along multiple routes. Adding constraints (18) and (19) to the LP-relaxation of
(IP*) results in a mixed integer linear program with discrete and continuous decision
variables,
and
respectively. This formulation, represented as (MILP) forms
another lower bound to (IP*).
The solution to the order of departure of trains from their origins can be determined
once again by solving the matching problem (IP_1). The solution obtained by solving
(MILP) and the matching problem is a partial feasible solution to (IP*). We call this
solution partial feasible solution from routing constraints, abbreviated as “PFSRC”.
6.3 A genetic algorithm (GA) solution procedure
The two previously presented approximation-based solution procedures entail using a
rounding algorithm, routing constraints and a matching problem to obtain a partial
feasible solution to (IP*). One possible way to improve upon the quality of the routes and
schedules corresponding to this partial feasible solution is to use a search algorithm that
explores the solution space using a suitable evaluation criterion. The genetic algorithm is
one such procedure that has been applied in the past to solve large-scale scheduling and
timetabling problems. Nachtigall and Voget (1996) use a combination of manual
timetable generation methods and genetic algorithms to compile timetables in periodically
served railway networks. Nachtigall and Voget (1997) use genetic algorithms to generate
sub-optimal solutions to a bi-criteria optimization problem that reforms track states to
minimize waiting times for trains. Other works on railway scheduling and dispatching
that use GA procedures include Salim and Cai (1996, 1997) and Suteewong (2006). In
the following sections, we provide a brief overview of the GA procedure and explain how
we apply it to obtain better quality solutions.
6.3.1 Background
A genetic algorithm (GA) is a search procedure that utilizes concepts from natural
evolution to produce solutions to an optimization problem [see Reeves and Rowe
(2003)]. Solutions to the optimization problem are represented by a string of binary
numbers, usually referred to as a chromosome. Similar to the natural process of evolution,
chromosomes are first evaluated based on their quality or fitness value, and those
satisfying the minimum fitness criteria are selected for the reproduction process.
Reproduction normally occurs through crossover and mutation operations. The selection
process ensures that the fittest of the available parent chromosomes are selected to
reproduce, and the reproduction process ensures that the search heuristic moves in the
direction of better quality solutions.
To begin with the GA procedure, initial solutions or population for the optimization
problem need to be generated. This population can consist of randomly generated
solutions as well as solutions that represent the entire expanse of the solution space. This
process is called initialization. Typically, an initialization algorithm based on the
structure of the problem at hand is used for this purpose. Good quality initial solutions
reduce the time taken by a GA to find the best possible solution.
To progress from the current best solution to the next best solution, a GA uses a
fitness function to evaluate the fitness of the current set of parent chromosomes. This is
called the selection process. Similar to natural evolution, the fitter a chromosome, the
greater is its chance of being selected for the reproduction process. Usual selection
procedures include roulette wheel selection, tournament selection and rank selection. In a
roulette wheel selection, a function based on the fitness of the parent chromosomes is
used to assign a probability of a certain chromosome being selected for the reproduction
process. In tournament selection, a certain set of chromosomes is randomly selected from
the pool and tournaments are held between every possible pair of this randomly selected
set. The winner of each tournament is selected for the reproduction process. In rank
selection, chromosomes are ranked according to their fitness values and the ones with the
best fitness are selected.
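Rank selection and tournament selection, as described above, can be sketched in a few lines of Python. The fitness values and function names are illustrative assumptions. In practice the tournament entrants would be drawn at random from the pool (for example with `random.sample`); they are passed in explicitly here so that the example is deterministic.

```python
from itertools import combinations

def rank_selection(fitness, k):
    """Return the indices of the k fittest chromosomes."""
    order = sorted(range(len(fitness)), key=lambda q: fitness[q], reverse=True)
    return order[:k]

def tournament_selection(fitness, entrants):
    """Hold a tournament between every pair of entrants and return the winners."""
    winners = []
    for a, b in combinations(entrants, 2):
        winners.append(a if fitness[a] >= fitness[b] else b)
    return winners

fitness = [0.2, 0.9, 0.5, 0.7]          # made-up fitness values
top_two = rank_selection(fitness, 2)
winners = tournament_selection(fitness, [0, 2, 3])
```

Because one winner is recorded per pairwise tournament, a fit chromosome can be selected more than once, as chromosome 3 is here.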
A GA uses crossover and mutation operations between parent chromosomes to
generate children chromosomes. Every possible pair of parent chromosomes that were
selected through the selection process is made to produce a child chromosome. The basic
underlying concept of the reproduction process is to retain the favorable qualities of the
parent chromosomes, while doing away with the unfavorable ones. The intent is that the
fitness of the chromosomes improves from one generation to the next, thereby improving
the quality of the solution to the optimization problem. Crossover is
used for local search, whereas mutation is used for exploring new regions of the search
space. Mutation is generally used when there is a possibility that the GA is producing
solutions close to a local minimum and we would like to expand the search to a new
region. To perform the crossover operation, two parent chromosomes are split at one or
more points and they swap their genes with each other to generate new children
chromosomes. To perform mutation, a mutation ratio is utilized to determine whether or
not each binary number constituting a chromosome will be modified.
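For a chromosome stored as a flat list of bits, the two operators can be sketched as below. This is a generic Python illustration with hypothetical names and values, not the matrix-based operators used later in this chapter.

```python
import random

def one_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length bit lists at index `cut`."""
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

rng = random.Random(0)  # seeded generator so runs are repeatable
a, b = [1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]
child_a, child_b = one_point_crossover(a, b, cut=2)
unchanged = mutate(child_a, rate=0.0, rng=rng)
flipped = mutate(child_a, rate=1.0, rng=rng)
```

A mutation ratio of 0 leaves the chromosome untouched and a ratio of 1 flips every bit; useful values lie in between and are usually small.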
A genetic algorithm is made to run until the termination criterion is met. The
termination criterion could be a certain number of iterations, or a certain factor of
improvement in the solution to the optimization problem, or a time-limit.
6.3.2 Genetic representation
Each chromosome in our GA represents the binary variable
. This is defined as
being equal to 1 if a train h travels from node i to node j at time t, and 0 otherwise. Note
that by definition,
for a certain train h and arc (i,j) such that
(i,j,h) We define matrices, shown in Figure 18, for every combination of an origin
station for a train h ∈ H, and a possible route of travel from that origin node. The rows of
the matrix represent the set of trains, and the columns represent the time of departure. If a
train h departs from its origin o_h along arc (o_h, j) at time t, then the corresponding block
in the matrix that represents arc (o_h, j) will be marked 1; otherwise it will be marked 0.
The arc chosen by a train when it departs from its origin defines a unique route to its
destination. In the ensuing discussion, we use the terms “routes” and “arcs”
interchangeably.
To illustrate this genetic representation, consider the general network given in Figure
17. There are a total of 10 nodes including 2 origin nodes represented as “ST1” and
“ST2”. As explained in Chapter 4, the remaining nodes in this general network represent
a certain section of the physical network.
[Figure 17 (network diagram) node labels: ST1, AB, BC, CD, EF, FG, GH, HI, CH, ST2]
Figure 17: An illustrative network for genetic algorithm
Let us assume there are 4 trains that travel on the above network – train 0 and train 2
originate from “ST1” and terminate at “ST2”, and train 1 and train 3 originate from
“ST2” and terminate at “ST1”. Trains originating from “ST1” can choose one of two
initial routes – (ST1, AB) or (ST1, EF). Similarly, trains originating from “ST2” can
depart either on (ST2, CD) or (ST2, HI). Hence, as per the rules described above, we
need 4 matrices representing each of these possible routings. This is shown in Figure 18
below. These matrices are constructed only for trains that can travel on a certain route,
thereby reducing the memory requirements.
| Node | Train | t=0 | 1 | 2 | 3 | 4 | 5 | ... | T |
|---|---|---|---|---|---|---|---|---|---|
| AB | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| AB | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| EF | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| EF | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CD | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| CD | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| HI | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| HI | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Figure 18: Genetic representation of chromosomes
The binary variable
is constructed as a two-dimensional matrix as shown in
Figure 18. The figure also shows that train 0 departs on arc (ST1, AB) at t=4, train 1
departs on (ST2, CD) at t=5, train 2 departs on (ST1,EF) at t=0 and train 3 departs on
(ST2, CD) at t=0.
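The representation in Figure 18 can be sketched in Python as a dictionary holding one small 0/1 matrix per initial arc, with rows only for the trains that can use that arc. The function name, the data layout and the horizon value T = 7 are assumptions made for illustration.

```python
def build_chromosome(routes, departures, horizon):
    """Build one 0/1 matrix per initial arc.

    routes: arc -> list of trains that may depart on it.
    departures: train -> (arc, departure time).
    """
    chromosome = {
        arc: {h: [0] * (horizon + 1) for h in trains}
        for arc, trains in routes.items()
    }
    for train, (arc, t) in departures.items():
        chromosome[arc][train][t] = 1
    return chromosome

routes = {
    ("ST1", "AB"): [0, 2], ("ST1", "EF"): [0, 2],
    ("ST2", "CD"): [1, 3], ("ST2", "HI"): [1, 3],
}
# Departures as read from Figure 18.
departures = {
    0: (("ST1", "AB"), 4), 1: (("ST2", "CD"), 5),
    2: (("ST1", "EF"), 0), 3: (("ST2", "CD"), 0),
}
chromosome = build_chromosome(routes, departures, horizon=7)
```

Storing rows only for the trains that can travel on an arc is what keeps the memory footprint small, as noted in the text.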
6.3.3 Initialization
To begin the GA procedure, we need to first generate the initial population comprising
feasible solutions to the optimization problem (IP*). To reiterate what was noted
previously, the main purpose of using the GA procedure is to improve upon the partial
feasible solutions PFSLPR and PFSRC. Depending on which of these two is being
improved, we include that particular partial feasible solution into the initial population.
The remaining feasible solutions constituting the initial population are generated using an
advance node algorithm. A similar procedure was adopted by Suteewong (2006). In our
case, for a fixed route, the algorithm determines the order of departure of trains from their
respective origins. The steps of this algorithm are as follows:
1. Obtain information about the trains, their origins and destinations. Group trains by
their origins into sets
, where
represents the origin
node of train h.
2. For each set
, determine the possible successor nodes the train could travel to
upon departing from
. Let
be the set of successor nodes. Depending on
whether or not a certain train h in set
travels on a particular route, there can be
number of possible ways to route trains in set
from
. Each of
these possibilities represents a feasible initial solution. We need to determine the
order of departure for each possibility.
As an example, let us consider Figure 17. Train 0 and train 2 depart from
ST1. Therefore,
{train 0, train 2}. There are two possible initial routes for
these two trains: r_1 = (ST1, AB) and r_2 = (ST1, EF). Hence,
= 2. The possible
routings (denoted by k) are:
| k | Route (ST1, AB) | Route (ST1, EF) |
|---|---|---|
| 1 | train 0, train 2 | - |
| 2 | train 0 | train 2 |
| 3 | train 2 | train 0 |
| 4 | - | train 0, train 2 |
Table 15: Genetic algorithm: initial routings
In this case, the number of possible routings =
= 2^2 = 4.
3. Populate the corresponding
matrices according to:
a. For each origin node
, do -
b. For each
, do -
i. Order the trains in
traveling along arc (o_h, j) in the increasing
order of their earliest departure times. If multiple trains have the
same earliest departure time, break ties arbitrarily.
ii. Pick the train at the top of the sorted list. Dispatch it along the
assigned route based on the advance node algorithm.
iii. Delete train from the list. Are there any more trains in the sorted
list for arc (o_h, j)? If yes, go to (ii).
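The routing enumeration of step 2 and the ordering of step 3(i) can be sketched in Python as follows. The function names and the earliest departure times are hypothetical; the trains and routes are those of the Figure 17 example.

```python
from itertools import product

def enumerate_routings(trains, routes):
    """Return every assignment of trains to initial routes.

    There are len(routes) ** len(trains) assignments in total.
    """
    return [dict(zip(trains, choice))
            for choice in product(routes, repeat=len(trains))]

def dispatch_order(routing, route, earliest):
    """Trains assigned to `route`, sorted by earliest departure time.

    Python's sort is stable, so ties keep their input order (arbitrary tie-break).
    """
    on_route = [h for h, r in routing.items() if r == route]
    return sorted(on_route, key=lambda h: earliest[h])

routings = enumerate_routings([0, 2], [("ST1", "AB"), ("ST1", "EF")])
# Made-up earliest departure times for trains 0 and 2.
order = dispatch_order(routings[0], ("ST1", "AB"), earliest={0: 5, 2: 1})
```

The four routings produced correspond to rows k = 1 to 4 of Table 15, and in the first routing train 2 is dispatched before train 0 because its earliest departure time is smaller.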
The steps for the advance node algorithm are as follows:
1. Obtain information on the successor node j to which a train h needs to travel, and
the capacity of that node.
2. Select a departure time t such that no other train has already been scheduled to
depart from the origin o_h to j at that time. This helps avoid conflicts between
trains.
3. Is the current traffic on j at time t less than its capacity c_j? If yes, allow train h to
proceed to j and stop. If no, go to step (4).
4. Compute the free running time on the track segment represented by the successor
node j as the length of the track segment divided by the track speed-limit. The
traffic on node j reduces by at least 1 after this time has elapsed. Update the
departure time of train h from o_h to t = t + (free running time). Go to step (3).
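A minimal Python sketch of these steps for a single train is given below. It assumes that times are measured in minutes, that the occupancy of the successor node is available as a list of (enter, leave) intervals, and that the capacity is at least 1; the function name and all parameter names are hypothetical. As in the steps above, the sketch returns to the capacity check after each shift and does not re-check the departure slot.

```python
def advance_node_departure(earliest, taken, occupancy, capacity,
                           length_mi, speed_mph, slot=1.0):
    """Return a departure time (minutes) for one train toward successor node j.

    taken: departure times already used from this origin toward j.
    occupancy: (enter, leave) intervals, in minutes, of trains already on j.
    """
    free_run = 60.0 * length_mi / speed_mph   # step 4: free running time
    t = float(earliest)
    while t in taken:                         # step 2: skip an occupied slot
        t += slot
    # step 3: wait while node j is at capacity at time t
    while sum(1 for a, b in occupancy if a <= t < b) >= capacity:
        t += free_run                         # step 4: shift the departure
    return t

t_dep = advance_node_departure(earliest=0, taken={0.0},
                               occupancy=[(0.0, 15.0)], capacity=1,
                               length_mi=10.0, speed_mph=30.0)
```

In this example the slot at t = 0 is taken, so the train first moves to t = 1; node j is still occupied then, so the departure is pushed back by the 20-minute free running time to t = 21.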
6.3.4 Selection operation
A selection operation is vital to choosing the appropriate parent chromosomes for the
reproduction process. A GA works on the idea that choosing parent chromosomes that
have favorable attributes increases the chances of reproducing good quality children
chromosomes. There are many selection approaches that can be used in a GA procedure.
In our case, we resort to a roulette wheel selection procedure.
1. For each chromosome in the current stage of the GA procedure, input the
information about the route and release time from origin into the simulation
model by Lu et al. (2004). Record the delay values
computed by the built-in
construction heuristic for that particular chromosome q. Repeat for all potential
parent chromosomes.
2. Calculate fitness value for each chromosome q as per the following equations.
Where
is the delay value for chromosome q computed in the previous step
is the fitness value of chromosome q
q is the chromosome index, q = 0,1,2,...,N.
3. Randomly generate a number, defined as , between 0 and 1.
4. Find the minimum value of q that can solve this equation:
5. Select chromosome q obtained from step (4) as a parent chromosome.
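A roulette wheel selection of this kind can be sketched as follows in Python. The fitness definition used here, the reciprocal of the delay normalized over all chromosomes, is an assumption standing in for the fitness equations of step 2; the random number is passed in as `r` so that the example is deterministic.

```python
from itertools import accumulate

def roulette_select(delays, r):
    """Return the smallest index q whose cumulative selection probability reaches r.

    Assumption: fitness is taken as 1 / delay, so a smaller delay is fitter.
    """
    fitness = [1.0 / d for d in delays]
    total = sum(fitness)
    cumulative = list(accumulate(f / total for f in fitness))
    for q, c in enumerate(cumulative):
        if r <= c:
            return q
    return len(delays) - 1   # guard against floating-point round-off

delays = [10.0, 20.0, 40.0]   # made-up delay values per chromosome
picked = roulette_select(delays, r=0.6)
```

With these delays the selection probabilities are roughly 0.57, 0.29 and 0.14, so r = 0.6 falls in the second slice of the wheel and chromosome 1 is selected.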
6.3.5 Crossover operation
A crossover operation in a GA procedure is done when we want to proceed from the
current generation of chromosomes to the next. Mathematically speaking, this amounts to
exploring the neighborhood of a certain region in the solution space. A crossover
operation consists of identifying the crossover ratio or rate. This determines whether a
certain pair of chromosomes should undergo the crossover operation. Next,
we identify the cut point(s), that is, the position(s) along a chromosome at which it is
sliced so that it can exchange part of its gene code with another cut chromosome. There are three
main techniques that can be used in this operation – one-point crossover, two-point
crossover and cut-and-splice. In our GA, we use a one-point crossover; that is, we
identify a single vertical cut-point and all data beyond that point is swapped with the
other parent chromosome involved in the crossover operation. We explain the crossover
operation in Figure 19 using the example network shown in Figure 17.
Parent 1, Node AB
train \ time   0 1 2 3 4 5
0              0 0 1 0 0 0
2              0 0 0 0 0 0

Parent 1, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 1 0 0 0

Parent 2, Node AB
train \ time   0 1 2 3 4 5
0              0 0 0 0 1 0
2              0 0 0 1 0 0

Parent 2, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 0 0 0 0

Child 1, Node AB
train \ time   0 1 2 3 4 5
0              0 0 1 0 1 0
2              0 0 0 1 0 0

Child 1, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 1 0 0 0

Child 2, Node AB
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 0 0 0 0

Child 2, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 0 0 0 0
Figure 19: A depiction of the crossover operation
Figure 19 shows how crossover is done between two parent chromosomes for train 0
and train 2 departing from station ST1. For the purpose of this example, we assume that
T = 5. In the first parent, train 0 departs at t = 2 along route (ST1, AB) and train 2 departs
at t = 2 along route (ST1, EF). In the second parent, both train 0 and train 2 depart along
route (ST1, AB), at t = 4 and t = 3 respectively. The crossover point is decided by scanning
through the two parent chromosomes and identifying the time instant by which at least
half of the trains have departed from their origin. In Figure 19, this happens at t = 2. So,
the crossover point is between the 2nd and 3rd time intervals. The children chromosomes are
generated as shown by the color codes in the matrices on the right-hand side of Figure 19.
The first half of “Child 1” chromosome comes from “Parent 1”, and the second half from
“Parent 2”. Similarly, the first half of “Child 2” chromosome comes from “Parent 2”, and
the second half from “Parent 1”.
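As an illustration, the one-point crossover of Figure 19 can be reproduced with a few lines of Python. This is a minimal sketch: the dictionary-of-rows representation and the function name are assumptions made for the example, not the data structures used in the original implementation.

```python
def one_point_crossover(parent_a, parent_b, t):
    """Swap all columns after time instant t between two parents.

    Each chromosome maps a successor node to a list of 0/1 rows,
    one row per train and one column per time instant.
    """
    child_1, child_2 = {}, {}
    for node in parent_a:
        pairs = list(zip(parent_a[node], parent_b[node]))
        child_1[node] = [a[:t + 1] + b[t + 1:] for a, b in pairs]
        child_2[node] = [b[:t + 1] + a[t + 1:] for a, b in pairs]
    return child_1, child_2

# Parents from Figure 19 (rows: train 0, train 2; columns: t = 0..5).
parent_1 = {"AB": [[0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
            "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]}
parent_2 = {"AB": [[0, 0, 0, 0, 1, 0], [0, 0, 0, 1, 0, 0]],
            "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]}

# Cut between t = 2 and t = 3, as in the example.
child_1, child_2 = one_point_crossover(parent_1, parent_2, t=2)
```

With these inputs, "Child 1" has train 0 departing on AB at both t = 2 and t = 4, and "Child 2" contains no departures at all, which are the infeasible children discussed next.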
A possible consequence of the crossover operation is that the children chromosomes
might not represent feasible initial routes and schedules to (IP*). One of the following
could occur:
1. The crossover might result in a child chromosome wherein a train might not
depart from an origin on any of the available routes. In Figure 19, in “Child 2”
train 0 and train 2 do not depart on either (ST1, AB) or (ST1, EF).
2. A train might depart along multiple routes in a child chromosome. An example of
this is “Child 1”, where train 2 departs on both (ST1, AB) and (ST1, EF).
3. A child chromosome might have a train departing at more than one time instant.
In Figure 19, train 0 departs at t = 2 and 4 in “Child 1” along route (ST1, AB).
To rectify these discrepancies in the routes and schedules generated in the children
chromosomes, we apply a repairing algorithm. This will be discussed in detail in Section
6.3.6. Essentially, the repairing algorithm removes infeasibilities in a matrix to
make routes and schedules feasible.
The crossover operation can be summarized as follows:
1. Obtain a pair of parent chromosomes selected in the selection process described in
Section 6.3.4.
2. Fix the crossover ratio as 0.7. Using a random number generator, generate a
number between 0 and 1. If this number is greater than the crossover ratio, then
go to step (3). Else, go to step (4).
3. Retain the pair as potential parent chromosomes for the next generation, and return
to step (1).
4. Select a vertical crossover point for the parent chromosomes such that a certain
percentage, say 50%, of the trains along that route departs at times to the right
hand side of the crossover point. Alternatively, the crossover point can also be
determined using a random number generator to identify t. If this time instant is t,
then the crossover point is between time instants t and t+1.
5. Generate children chromosomes by retaining the first half from a parent and
swapping the second half of the gene code with the other parent.
6. Are the children chromosomes generated in step (5) feasible routes and
schedules? If no, go to step (7). If yes, go to step (8).
7. Apply the repairing algorithm to the infeasible chromosome(s). Go to step (8).
8. Are there any more parent chromosome pairs that have not undergone a crossover
operation? If yes, return to step (1). If no, terminate the crossover algorithm.
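Step (4) can be illustrated with a short Python sketch. It reads "a certain percentage of the trains" as a share of all departures recorded in the two parents, which is one possible interpretation; the function name and the `share` parameter are hypothetical.

```python
def crossover_point(parent_a, parent_b, share=0.5):
    """Earliest time t by which at least `share` of all departures recorded
    in the two parents have occurred; the cut lies between t and t + 1."""
    rows = [row
            for parent in (parent_a, parent_b)
            for node_rows in parent.values()
            for row in node_rows]
    total = sum(sum(row) for row in rows)
    departed = 0
    for t in range(len(rows[0])):
        departed += sum(row[t] for row in rows)
        if departed >= share * total:
            return t
    return len(rows[0]) - 1

# Parents from Figure 19 (rows: train 0, train 2; columns: t = 0..5).
parent_1 = {"AB": [[0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
            "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]}
parent_2 = {"AB": [[0, 0, 0, 0, 1, 0], [0, 0, 0, 1, 0, 0]],
            "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]}
cut = crossover_point(parent_1, parent_2)
```

For the parents of Figure 19 the departures occur at t = 2, 2, 3 and 4, so half of them have occurred by t = 2, matching the cut used in the example.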
6.3.6 Crossover repairing algorithm
This algorithm is applied to the children chromosomes generated by the crossover
operation to remove any infeasibility with respect to routings and scheduling. The three
types of possible infeasibilities were described in the previous section. The steps of the
repairing algorithm are as follows.
For each child chromosome, do –
1. Rank the trains in increasing order of their ready time.
2. h = 1.
3. While h ≤ (number of trains), do –
a. (Case 1: train h does not depart from its origin o_h.) If case 1 is true,
then assign train h to the route with the least traffic and the earliest possible
departure time. Compute the earliest possible departure times for each route
out of o_h using the advance node algorithm. Increment h by 1.
b. (Case 2: train h departs exactly once, along only one successor node j. This
means there is no route or schedule conflict with train h.) If case 2 is true,
increment h by 1.
c. (Case 3: train h departs more than once, along only one successor node j.
This means that there is no route conflict, but train h has two departure
times from o_h along the same route.) If case 3 is true, simply pick the earlier
departure time as the time at which train h leaves its origin. Increment h by 1.
d. (Case 4: train h departs along more than one successor node j. This means
that there is a route conflict and train h departs along two routes
simultaneously.) If case 4 is true, sort the routes first by traffic volume and
then by earliest departure time. Pick the route with the least traffic volume
and the earliest departure time. Increment h by 1.
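The four cases can be told apart by counting, for each train, the routes it uses and the departures on each route. The following Python sketch shows that classification and the case 3 repair only; the advance node algorithm and the traffic-based route choice are not reproduced, and all names are illustrative.

```python
def classify_case(chromosome, h):
    """Return the repair case (1 to 4) for train row h of a chromosome.

    chromosome maps a successor node to a list of 0/1 rows (one per train).
    """
    departures = {node: sum(rows[h]) for node, rows in chromosome.items()}
    used_routes = [node for node, count in departures.items() if count > 0]
    if not used_routes:
        return 1          # train never leaves its origin
    if len(used_routes) > 1:
        return 4          # departs along more than one route
    if departures[used_routes[0]] == 1:
        return 2          # one route, one departure time: feasible
    return 3              # one route, several departure times

def keep_earliest(row):
    """Case 3 repair: keep only the earliest departure in a 0/1 row."""
    first = row.index(1)
    return [1 if t == first else 0 for t in range(len(row))]

# Children from Figure 19 (rows: train 0, train 2).
child_1 = {"AB": [[0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]],
           "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]}
child_2 = {"AB": [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
           "EF": [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]}
cases_child_1 = [classify_case(child_1, h) for h in (0, 1)]
cases_child_2 = [classify_case(child_2, h) for h in (0, 1)]
```

In "Child 1", train 0 falls under case 3 and train 2 under case 4; in "Child 2", both trains fall under case 1.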
Now, applying the above repairing algorithm to the children chromosomes shown in
Figure 19, we obtain the following feasible routes and schedules. We assume that train 2
has an earlier ready time than train 0.
Child 1, Node AB
train \ time   0 1 2 3 4 5
0              0 0 1 0 0 0
2              0 0 0 1 0 0

Child 1, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 0 0 0 0

Child 2, Node AB
train \ time   0 1 2 3 4 5
0              0 1 0 0 0 0
2              0 0 0 0 0 0

Child 2, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              1 0 0 0 0 0
Figure 20: After applying the crossover repairing algorithm
6.3.7 Mutation operation
The purpose of a mutation operation in a GA procedure is to search a new region in the
solution space of an optimization problem. Mutation radically changes the gene code of a
chromosome, thereby avoiding a local minimum and introducing diversity in the
chromosome population. Similar to the crossover operation, there is a mutation ratio
which determines whether a chromosome undergoes mutation. Typically, this ratio is
lower than the crossover ratio. Mutation happens at a certain cell in the matrix called the
mutation point. This point is determined either randomly or by studying the problem
structure.
In the case of our problem where chromosomes represent binary variables, mutation
alters the initial routing and scheduling of trains as they depart from their respective
origin stations. Returning to the example network in Figure 17, a possible mutation
operation for a chromosome involving train 0 and train 2 is shown in Figure 21. Again,
assume that T=6.
Original chromosome, Node AB
train \ time   0 1 2 3 4 5
0              0 1 0 0 0 0
2              0 0 0 1 0 0

Original chromosome, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              0 0 0 0 0 0

Mutated chromosome, Node AB
train \ time   0 1 2 3 4 5
0              0 0 1 0 0 0
2              0 0 0 0 0 0

Mutated chromosome, Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 1 0 0
2              1 0 0 0 0 0
Figure 21: A depiction of the mutation operation
In Figure 21, the original chromosome undergoes mutation. This operation is carried
out for every route in a chromosome and for every train in a chromosome. A random
number is generated between 0 and T-1. The number indicates the cell for a certain train
at a particular time where mutation will occur. The operation itself consists of flipping
the number in the chosen cell from either a 1 to a 0 or from a 0 to a 1. In the original
chromosome, train 0 departs on route (ST1, AB) at t = 1. However, since the mutation
point is at t = 2, train 0 will depart at t = 2 in the mutated chromosome. Similarly, train 2
changes its departure along route (ST1, AB) at t = 3 to along route (ST1, EF) at t = 1 after
mutation. An infeasibility occurs due to mutation when the entry in the cell
corresponding to train 0 along route (ST1, EF) and at t = 3 changes from 0 to a 1. The
mutated chromosome now indicates that train 0 departs along two separate routes from
ST1. To rectify this infeasibility, the mutation repairing algorithm is applied. This is
discussed in Section 6.3.8. To summarize, the mutation operation consists of the
following steps.
1. Consider a chromosome in the current population pool. Generate a random
number between 0 and 1. If this number is greater than 0.3 (mutation ratio), go to
step (2), else go to step (3).
2. Retain the chromosome as a potential parent chromosome for the next generation, and
return to step (1).
3. For each successor node j in a chromosome, do –
a. h = 1.
b. While h ≤ (number of trains), do –
i. Note the last departure time of train h along the route to node j. Call
this π. Generate a random number between 0 and π. This identifies the
cut point for the train.
ii. Is the cell value in the cut point equal to 1? If yes, change it to a 0.
Otherwise, swap value in cell representing cut point from a 0 to a
1. Make all other cell values for this train equal to 0.
iii. h = h + 1.
4. Are the mutated chromosomes feasible? If yes, go to step (5). If not, apply
mutation repairing algorithm.
5. Are there any more chromosomes in the population that can be mutated? If yes,
go to step (1). Else, terminate the algorithm.
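The mutation steps above can be sketched as follows. This is a hedged illustration rather than the original implementation: the row-based representation and function names are assumptions, and a row with no departure is assumed to use its final column as the upper limit for the cut point, a case the steps do not spell out.

```python
import random

def mutate_row(row, rng=random):
    """Mutate one train's 0/1 departure row on a single route.

    The cut point is drawn between 0 and the last departure time in the row;
    the cell at the cut point is flipped and every other cell is set to 0.
    Assumption: a row with no departure uses its final column as the limit.
    """
    last = max((t for t, v in enumerate(row) if v == 1), default=len(row) - 1)
    cut = rng.randint(0, last)          # randint is inclusive at both ends
    flipped = 1 - row[cut]
    return [flipped if t == cut else 0 for t in range(len(row))]

def mutate_chromosome(chromosome, mutation_ratio=0.3, rng=random):
    """Apply the test of step (1), then mutate every row on every route."""
    if rng.random() > mutation_ratio:
        return chromosome               # retained unchanged, as in step (2)
    return {node: [mutate_row(row, rng) for row in rows]
            for node, rows in chromosome.items()}
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable. The output still has to be checked for feasibility and repaired, as in step (4).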
6.3.8 Mutation repairing algorithm
Infeasible routes and schedules could result from applying a mutation operation on a
chromosome. Three possible infeasibilities are a train departing on two routes from the
origin station, a train not departing along either route, and two or more trains departing
along the same route at the same time. In the example shown in Figure 21, after mutation
train 0 departs on both the possible routes whereas train 2 does not depart from ST1
along either route.
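A simple check for the three infeasibility types could look like the sketch below. The chromosome used here is a hypothetical three-train example constructed to trigger all three conditions; it is not taken from the figures, and the function and label names are illustrative.

```python
def find_infeasibilities(chromosome):
    """List (kind, detail) pairs for the three infeasibility types."""
    issues = []
    nodes = list(chromosome)
    n_trains = len(chromosome[nodes[0]])
    n_times = len(chromosome[nodes[0]][0])
    for h in range(n_trains):
        routes = [n for n in nodes if any(chromosome[n][h])]
        if len(routes) > 1:
            issues.append(("multiple_routes", h))
        elif not routes:
            issues.append(("no_departure", h))
    for n in nodes:
        for t in range(n_times):
            # Two or more trains leaving on the same route at the same time.
            if sum(row[t] for row in chromosome[n]) > 1:
                issues.append(("time_conflict", (n, t)))
    return issues

# Hypothetical chromosome: rows are trains 0, 1, 2; columns are t = 0..3.
example = {"AB": [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0]],
           "EF": [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]}
issues = find_infeasibilities(example)
```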
If train h departs along multiple routes, this repairing algorithm scans through the
departure time of h along the various routes and assigns it to the route with the least
assigned traffic and the earliest departure time. Essentially, we maintain a list of routes
sorted first by assigned traffic and then by departure time of h.
If train h does not depart along any of the available routes, then the advance node
algorithm discussed in Section 6.3.2 is applied along every possible route from its origin.
The train is ultimately routed along the line with the earliest departure time. The mutated
chromosome in Figure 21 will appear as below after applying the repairing algorithm.
Node AB
train \ time   0 1 2 3 4 5
0              0 0 1 0 0 0
2              0 0 0 0 0 0

Node EF
train \ time   0 1 2 3 4 5
0              0 0 0 0 0 0
2              1 0 0 0 0 0
Figure 22: After applying the mutation repairing algorithm
Now, train 0 departs on (ST1, AB) at t = 2 and train 2 departs on (ST1, EF) at t = 0.
Finally, it is possible that in the mutated chromosome or while applying the above
two sub-routines of the repairing algorithm we find two or more trains departing along
the same route at the same time. In real-world operations, departure times of trains are
spaced out to maintain safety headways, and to minimize the effect of delay propagation
from one train to another. In view of this, when in a mutated chromosome a train departs
on multiple routes, in addition to the steps described above, we check to see which route
does not have a time conflict with another train and select that route. Although this is not
the best possible way to solve the time conflict situation, the consequent crossover
operations would explore possibilities of lowering the delay in departing from an origin
station.
6.3.9 The genetic algorithm procedure
With the various components of the GA procedure explained, we now present the
actual workings of the heuristic.
1. Obtain the initial departure time information from either PFSLPR or PFSRC,
depending on whichever solution needs to be improved using the GA. Translate
this solution into its genetic representation based on the description in Section
6.3.2.
2. Use the initialization step (Section 6.3.3) to generate the initial population of
chromosomes representing the binary variables for the origin stations.
3. Compute the delay associated with each chromosome using the construction
heuristic developed by Lu et al. (2004). It should be noted here that chromosomes
reflect only the routes and departure times from the origin station. These are fed
into the construction heuristic which then schedules trains at the intermediate
stations. The delay values obtained from the construction heuristic reflect the
quality of the routes and schedules at the origin station. Good quality solutions
minimize the meet-pass interferences between trains at intermediate stations and
junctions, thereby minimizing delay.
4. Use roulette wheel selection strategy, presented in Section 6.3.4, to select parent
chromosomes for the reproduction process.
5. Perform crossover and mutation operations and use their respective repairing
algorithms (Sections 6.3.5 – 6.3.8) to rectify infeasibilities in the children
chromosomes generated by the reproduction process.
6. Step (5) generates a new population for the binary variables. Compute the
delay values for the new population similar to step (3).
7. Check the termination criterion. If it is met, terminate the algorithm and output the
values of the variables corresponding to the current best chromosome. If
not, return to step (5). The termination criterion in our case is when there is no
significant improvement in the delay values between two consecutive iterations.
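The overall loop can be outlined in Python as below. This is only a skeleton under stated assumptions: `evaluate` stands in for the construction heuristic, `reproduce` stands in for selection, crossover, mutation and repair, and `tolerance` and `max_iter` are hypothetical parameters used to express "no significant improvement".

```python
def run_ga(population, evaluate, reproduce, tolerance=1e-3, max_iter=1000):
    """Iterate until the best delay stops improving by more than tolerance.

    evaluate(chromosome) -> delay, and reproduce(population, delays) -> new
    population, are caller-supplied placeholders.
    """
    delays = [evaluate(c) for c in population]
    best_delay = min(delays)
    best = population[delays.index(best_delay)]
    for _ in range(max_iter):
        population = reproduce(population, delays)
        delays = [evaluate(c) for c in population]
        current = min(delays)
        improvement = best_delay - current
        if current < best_delay:
            best_delay, best = current, population[delays.index(current)]
        if improvement < tolerance:
            break                      # no significant improvement
    return best, best_delay

# Toy usage: "chromosomes" are integers and the delay is the integer itself.
best, best_delay = run_ga(
    [10, 8],
    evaluate=float,
    reproduce=lambda pop, delays: [max(c - 1, 5) for c in pop],
)
```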
Chapter 7: Experimental Design and Results
In Chapters 4, 5 and 6 we have presented modeling approaches and solution
methodologies to route and schedule trains so that the travel-time delays in a network are
kept at a minimum. As we argued in Chapter 3, the routes chosen by the trains as they
depart their respective origin stations, and the time at which they depart play a very
important role in the number of interactions with other trains at intermediate stations.
Reducing the number of meet and pass interferences helps keep delays at a minimum. For
this purpose, our solution approaches presented in Chapter 6 aim to obtain the initial
routes and release times from the integrated capacity management IP model.
In this chapter, we run experiments to test our modeling and solution methodologies.
Section 7.1 describes our experimental design. In Section 7.2, we run experiments on test
networks to evaluate the following attributes:
1. The capability of (IP*) in generating conflict-free routes and release times from
the origin stations. These depend on the accuracy of the delay estimation
equations in (IP*) for the aggregate nodes.
2. The quality of the solutions generated by PFSLPR and PFSRC in terms of delay
values.
3. The improvement brought about when the solutions from these two approaches
are run through the genetic algorithm procedure.
4. The variation in solution time versus problem size.
Finally, in Section 7.3, we perform sensitivity analysis to measure the impact of the
degree of aggregation and traffic volume on solution quality and solution time.
7.1 Experimental design
In order to test the performance of our modeling and solution strategies on a variety of
track configurations and network characteristics, we generate test networks.
The first step in generating these test networks is to fix the values of the parameters
defining a section of a network. For our experiments, we design test networks that have
double-track and single-track sections conforming to the values of the length (L), speed-
limit (V), number of crossing or sidings (C), and spacing (S) listed in Tables 5 (for
double-track sections) and 10 (for single-track sections) respectively. Next, we fix a
skeleton network, which is a general network with a definite number of nodes and arcs.
The two skeleton networks that we use for our experiments are shown in Figure 23.
The skeleton networks were constructed to test the quality of the routes and release
schedules from the origin and intermediate nodes. In Skeleton Network 1, routes need to
be determined for origin nodes ST1 and ST2, and intermediate aggregate nodes BC and
HI, as well as release schedules for ST1 and ST2. The link BC-CH-HI induces interaction
between trains traveling in opposite directions. In Skeleton Network 2, routes need to be
determined for origin node ST2 and intermediate node BC, and release schedules for
origins ST1 and ST2. Since there is a single route between ST1 and BC, the quality of the
release schedules from ST1 is vital to minimize delays and interferences between trains.
The network characteristics (length, speed-limit, number of crossings/sidings and spacing
between crossings/sidings) for aggregate nodes CD, DE, CF and FG are set so that it is
suboptimal for all trains moving in the same direction to travel on the same route between
BC and ST2. This would induce interactions between trains traveling in opposite
directions. In addition to the test networks, we also ran experiments on a real-world rail
network chosen from the Los Angeles area.
[Figure 23 diagram. Skeleton Network 1 nodes: ST1, AB, BC, CD, CH, EF, FG, GH, HI, ST2.
Skeleton Network 2 nodes: ST1, AB, BC, CD, DE, CF, FG, ST2.]
Figure 23: Experimental design: skeleton networks
Skeleton Network 1 has 10 nodes including 2 stations, and Skeleton Network 2 has 8
nodes including 2 stations. Nodes, except ST1 and ST2, are aggregate nodes. The
configuration of an aggregate node is set as follows:
1. Generate a random number. If the number is greater than 0.5, make it a double-
track section. Else, make it a single-track section.
2. Depending on the outcome of step (1), randomly pick the length (L), track speed-
limit (V), number of crossings or sidings (C) and spacing between crossings and
sidings (S) parameters for the aggregate node. If the aggregate node is double-track,
choose the L, V, C and S values from Table 5, and if it is single-track, choose
the values from Table 10.
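The two configuration steps can be sketched as follows. The parameter menus below are placeholders standing in for Tables 5 and 10, which are not reproduced here; their values are hypothetical, as are the names used.

```python
import random

# Placeholder menus standing in for Tables 5 and 10 (hypothetical values).
OPTIONS = {
    "D": {"L": [5, 12.5, 20], "V": [15, 35, 55], "C": [1, 3, 5], "S": [0.5, 0.75, 1]},
    "S": {"L": [10, 15, 20], "V": [15, 35, 55], "C": [2, 3, 4], "S": [0.7, 0.85, 1]},
}

def configure_aggregate_node(rng=random):
    """Step 1: pick double (D) or single (S) track; step 2: pick L, V, C, S."""
    tracks = "D" if rng.random() > 0.5 else "S"
    config = {name: rng.choice(values) for name, values in OPTIONS[tracks].items()}
    config["tracks"] = tracks
    return config

node_config = configure_aggregate_node(random.Random(1))
```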
For our experiments, we generate four test networks. The configuration of the aggregate
nodes in each of these networks is listed below. The actual network diagrams are
included in Appendix A.
Network 1
Node   # of tracks   L      V    C   S
AB     D             12.5   35   3   1
BC     S             10     55   2   1
CD     S             10     55   2   1
CH     D             5      35   1   1
EF     D             5      35   1   1
FG     S             10     55   3   1
GH     D             12.5   35   1   1
HI     D             5      35   1   1

Network 2
Node   # of tracks   L      V    C   S
AB     D             12.5   35   3   1
BC     S             10     55   2   1
CD     S             10     55   2   1
CH     D             5      35   1   1
EF     D             12.5   15   3   0.5
FG     S             15     35   4   0.7
GH     D             20     35   1   0.5
HI     D             12.5   35   5   0.75

Network 3
Node   # of tracks   L      V    C   S
AB     S             10     35   3   0.85
BC     D             20     55   3   1
CD     S             15     15   4   1
CH     S             15     35   2   0.7
EF     D             5      35   1   1
FG     S             10     35   3   0.85
GH     S             15     15   4   1
HI     S             10     35   3   0.85

Network 4
Node   # of tracks   L      V    C   S
AB     D             5      35   3   1
BC     D             12.5   55   3   1
CD     D             12.5   55   3   1
DE     D             5      35   3   1
CF     S             15     35   3   1
FG     S             20     35   3   0.85

Table 16: Experimental test network configurations
The LA rail network used for our experiments is between CP Dayton Taylor Yard
and Pomona, as shown in Figure 1. There are two routes available for trains beginning at
either end – UP-Alhambra line and UP-San Gabriel line. Each route is approximately 30
miles long. Once the networks are constructed, we run experiments as per the following
steps:
1. Fix the daily arrival rate of trains so that the system is not overloaded to the point
of deadlock. At the same time, the number of trains has to be large enough to
result in interactions between trains in the same and opposite directions, to
represent a moderately to highly congested system.
2. Run simulations over every aggregate node individually by varying traffic
volumes, arrival rates and the mix of the various types of trains. Collect system
states and run regression over the collected data, similar to the description in
Chapter 5. This gives us the delay estimation equations. Note that for real-world
networks, we need to first identify the sections to be aggregated before running
simulations. The simulation model used is the one by Lu et al. (2004).
3. Formulate the delay estimation equations similar to equations (11) and (12), and
add them to (IP*) along with equations (13), (14) and (15).
4. Follow the steps detailed for solution procedures PFSLPR and PFSRC in Sections
6.1 and 6.2 respectively. Obtain the routes and release times from the origin
stations for all trains and provide this information as input parameters to the
construction heuristic. The construction heuristic develops a deadlock-free
schedule for the trains when they travel through the intermediate stations. Record
the delay and travel-time values from the output of the construction heuristic.
5. Treating the routes and release times generated by the PFSLPR and PFSRC
solution schemes as parent chromosomes, use the genetic algorithm procedure
to improve the quality of this partial feasible solution. Similar to step (4),
evaluate the quality of the improved routes and release times through the
construction heuristic, which also determines train schedules for the
intermediate stations. Record the new delay and travel-time values.
7.2 Experimental results
For our experiments, instances of (IP*), its LP-relaxation and its variants were run using
IBM ILOG's CPLEX Optimizer version 9.0 on a 3.06GHz Intel Xeon CPU. The integer
and linear programs were coded using AMPL. The simulation model by Lu et al. (2004)
and its underlying construction heuristic were run using the AweSim! Simulation
software [see Pritsker and O'Reilly (1999)] on a desktop using a Pentium IV processor
with 512MB RAM.
As mentioned previously, in order to test the capabilities of our modeling and solution
strategies in reducing congestion, we run experiments with a sizeable number of trains
such that there is a possibility of congestion being induced in network sections with
crossings and sidings. Because of the size of the problem (shown in Table 17), (IP*)
cannot be solved to optimality. However, since PFSLPR and PFSRC generate partial
feasible solutions to (IP*), we would like to compare their performance with respect to
the lower and upper bounds of (IP*) for a certain network and train count. This is shown
in Table 18.
Network      Train count   Binary var.   Constraints
LA network   80            94000         97000
Network 1    80            101000        86000
Network 2    56            55000         49000
Network 3    80            66000         63000
Network 4    56            38000         44000
Table 17: Problem sizes
Network      Train count   LP-relaxation   LP-relax + integer    Lower bound   Upper bound
                                           routing variables
LA network   80            31.01           31.65                 31.08         -
Network 1    80            41.01           41.35                 41.03         -
Network 2    56            29.01           29.52                 29.19         34.36
Network 3    80            36.96           38.03                 37.31         46.97
Network 4    56            29.01           29.42                 29.26         33.46
Table 18: Comparison of lower and upper bounds
We note that the values recorded in Tables 18 and 19 are the objective values from
the optimization model averaged over the number of trains, and represent the average
travel times. Table 18 also lists two likely lower bounds to (IP*): the LP-relaxation, and
a variant that holds only the variables associated with determining the routes as integers.
The former is obtained by solving the LP-relaxation model, in which the variables for all
routes and schedules may take fractional values. The latter is obtained by solving a
variant of (IP*) with linearized variables together with routing constraints, as discussed
in Section 6.2. Here, the variables for the routes from the origin and intermediate stations
take integer values but the variables for release times are fractional. As per this
procedure, the numbers of binary variables for the five networks in Tables 17 and 18 are
160, 320, 224, 320 and 224 respectively. Naturally, this variant has a higher objective
value for a given network than the LP-relaxation, mainly because trains are forced to
depart on a single route from their origins, which forces certain variables to be integers.
The fifth and sixth columns of Table 18 list the best lower and upper bounds
obtained for (IP*) after letting CPLEX run for 2.0 CPU hours. The first lower bound
CPLEX finds to an integer program is its LP-relaxation objective value. From there on,
the solver generates cuts to tighten the lower bound. The upper bound obtained from
CPLEX is the best integer solution found so far. The IP lower bound in the fifth column
is only slightly larger than the LP-relaxation objective value, but lower than the objective
value for the LP-relaxation with integer routing variables. CPLEX was unable to find an
upper bound for the ‘LA rail network’ and ‘Network 1’, mainly due to the large number of
binary variables for these networks, as noted in Table 17.
Table 19 shows the objective values for PFSLPR (obtained from the LP-relaxation
after applying the rounding algorithm and solving the matching problem described in
Section 6.1), PFSRC (obtained from the LP-relaxation with integer routing variables after
solving the matching problem described in Section 6.2), PFSLPR+GA (obtained after
using the GA procedure to improve the output from PFSLPR) and PFSRC+GA (obtained
after using the GA procedure to improve the output from PFSRC).
Network           Train count   PFSLPR          PFSRC           PFSLPR+GA       PFSRC+GA
LA rail network   80            32.24 (47.32)   31.85 (61.5)    32.02 (63.32)   31.79 (78.5)
Network 1         80            41.56 (58.10)   41.44 (95.14)   41.49 (82.10)   41.38 (116.14)
Network 2         56            30.60 (27.85)   29.79 (42.78)   30.42 (46.85)   29.71 (57.78)
Network 3         80            38.34 (29.26)   38.24 (49.03)   38.31 (48.26)   38.19 (67.03)
Network 4         56            30.21 (21.07)   29.56 (40.61)   29.79 (38.07)   29.45 (54.61)
Table 19: Comparison of objective values (and solution times in minutes) of the solution
procedures
In Table 19, we notice an increase in the objective values in the PFSLPR and PFSRC
columns relative to the corresponding columns in Table 18 (the LP-relaxation and the
LP-relaxation with integer routing variables, respectively). This is mainly
because for PFSLPR and PFSRC trains depart on a single route from their origins at a
single time instant. This is brought about by the rounding algorithm, binary routing
variables and constraints, and the matching problem. Another conclusion that can be
drawn is that the objective values in the PFSLPR column are greater than those in the
PFSRC column. This is due to the fact that the routes obtained from PFSRC are optimal
owing to the routing constraints. However, in the case of PFSLPR the routes are obtained
by the rounding algorithm wherein the initial routes could be far from being optimal,
thereby driving up the objective values. This feature is preserved when the GA procedure
is applied to the output from the two solution procedures. Depending on the network
structure and number of trains, the GA procedure can drastically improve the objective
values. This is evident from the last two columns in Table 19. An important observation
to make is that the solutions from PFSLPR+GA are greater than the solutions from
PFSRC. In other words, applying the GA procedure to PFSLPR does not beat the
objective values of PFSRC. A GA procedure is only as good as its parent chromosomes.
In the case of PFSLPR, due to the possible bad initial routing arising from the rounding
algorithm, the parent chromosomes are not good routes. This could be a reason why
PFSRC with its optimal routes performs better. Additionally, we observe that applying
the GA procedure to PFSRC does not dramatically improve the objective value. In terms
of the objective function we do not see a major improvement probably because of the
limited number of routes, usually 2 – 4, available for a train when going from point A to
point B. This property is reflected in the LA rail network and the constructed test
networks.
In terms of solution times, we infer from Table 19 that PFSLPR is
solved in a shorter time than PFSRC. This is mainly due to the latter having integer
routing variables. The GA procedure takes approximately the same amount of time to
improve upon the routes and release times determined by PFSLPR and PFSRC.
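The time taken by the GA step can be read off Table 19 by subtraction, assuming that the times reported in the +GA columns include the time of the underlying procedure. A short Python check under that assumption:

```python
# Solution times in minutes from Table 19:
# (PFSLPR, PFSRC, PFSLPR+GA, PFSRC+GA).
times = {
    "LA rail network": (47.32, 61.5, 63.32, 78.5),
    "Network 1": (58.10, 95.14, 82.10, 116.14),
    "Network 2": (27.85, 42.78, 46.85, 57.78),
    "Network 3": (29.26, 49.03, 48.26, 67.03),
    "Network 4": (21.07, 40.61, 38.07, 54.61),
}

# Minutes added by the GA on top of PFSLPR and PFSRC, respectively.
ga_minutes = {
    name: (round(lpr_ga - lpr, 2), round(rc_ga - rc, 2))
    for name, (lpr, rc, lpr_ga, rc_ga) in times.items()
}
```

Under this reading the GA adds between 14 and 24 minutes in every case, for example 16 and 17 minutes on the LA rail network.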
The performance of our solution approaches is compared to a simple greedy
approach for determining routes and initial release schedules. For the greedy approach,
we route trains by balancing traffic volume on the various possible routes. That is, a train
is assigned to a route with the least assigned traffic so far. Then, a greedy heuristic
determines the initial release schedules once the routes are provided to it. These results
are shown in column 2 of Tables 20 and 21.
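The traffic-balancing rule of the greedy approach can be illustrated with a minimal Python sketch. The function name, the train labels and the tie-breaking rule (first route listed) are assumptions made for the example; the release-time heuristic is not shown.

```python
def greedy_balance(trains, routes):
    """Assign each train, in order, to the route with the least traffic so far."""
    load = {route: 0 for route in routes}
    assignment = {}
    for train in trains:
        # min() returns the first route among ties, so ties go to the
        # route listed first.
        route = min(routes, key=lambda r: load[r])
        assignment[train] = route
        load[route] += 1
    return assignment

assignment = greedy_balance(["train 0", "train 1", "train 2"], ["AB", "EF"])
```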
We test the quality of the routes and release schedules obtained from the
optimization model by generating a complete solution that includes train schedules from
the intermediate nodes. The complete solution is generated for each solution procedure
(PFSLPR, PFSLPR+GA, PFSRC, PFSRC+GA) and the lower and upper bounds to (IP*)
by feeding the routes and initial release schedules obtained in each case into the
simulation model by Lu et al. (2004) which includes many of the actual characteristics of
a rail network (e.g.: capabilities to model train lengths and the acceleration and
deceleration process). For the intermediate nodes, all solutions use the construction
heuristic for scheduling the trains. We record the average travel-times and average delays
associated with the complete solution in Tables 20 and 21 respectively.
Network           Construction   PFSLPR   PFSLPR+GA   PFSRC    PFSRC+GA   Lower bound   Upper bound
LA rail network   292.85         246.32   244.64      239.74   235.10     241.74        -
Network 1         90.06          83.11    82.95       80.69    78.48      81.90         -
Network 2         144.90         141.17   140.91      138.14   137.97     140.34        304.19
Network 3         256.82         215.19   213.08      212.36   209.67     214.61        498.71
Network 4         142.72         134.20   133.19      131.05   130.72     132.29        284.08
Table 20: Comparison of average travel times (minutes)
Network           Construction   PFSLPR   PFSLPR+GA   PFSRC    PFSRC+GA   Lower bound   Upper bound
LA rail network   239.79         191.29   186.39      184.59   178.43     188.33        -
Network 1         30.14          23.50    23.17       22.37    20.61      22.64         -
Network 2         37.95          33.76    33.44       32.85    31.05      33.21         48.86
Network 3         62.24          47.05    45.67       44.98    42.88      45.49         76.18
Network 4         26.31          23.01    22.29       22.33    21.16      23.01         27.33
Table 21: Comparison of average delays (minutes)
Table 20 lists the average travel times for the trains to reach their respective
destinations, whereas Table 19 lists the cumulative travel time. We notice that the
resulting cumulative travel-times are larger than the objective values recorded in Table
19. The reasons for this difference are twofold. First, the (IP*) formulation assumes that
trains are infinitesimally small to reduce the problem size (see Chapter 4), while the
simulation models the actual train lengths. Second, the variables for intermediate
nodes are of a fractional value, thereby lowering the objective function value.
The delay values presented in Table 21 are plotted as a bar graph shown in Figure 24.
From Tables 20 and 21 we can infer that choosing the right routes plays a very important
role in reducing delays and travel-times. The construction heuristic uses a greedy routing
sub-routine, and as a result it has larger delays and travel-times. Our solution procedures
PFSLPR, PFSRC and the GA procedure all perform better than the construction heuristic
for all networks mainly because the initial routes and schedules are determined using
(IP*). Since the main difference between our solution procedures and the construction
heuristic is the initial routes and release times, the results shown in Tables 20 and 21
attest to the importance of determining these parameters. The next observation is that the GA
procedure reduces delays and travel-times for both the solution procedures. In the case of
PFSLPR+GA, the GA runs for a longer time mainly due to the non-optimality of the
routes from the rounding algorithm. For PFSRC+GA, with the routes being determined
optimally, nearly all the improvement in the delays and travel-times can be attributed to
finding a better initial release schedule. The delays and travel-times obtained by using the
routes and release times from the best lower bound to (IP*) were found to be worse
than those of PFSRC and PFSRC+GA for every network, and worse than those of
PFSLPR+GA for some networks. This is because most of the
variables were fractional in the solution to the best lower bound, and the routes and
schedules were determined by using a simple rounding algorithm. On the other hand, the
upper bound to (IP*) was an integer solution, but was not a tight upper bound. The traffic
was not balanced across the routes which resulted in opposing trains being routed along
the same lines. In the simulation model, this translates to higher congestion arising from a
large number of meet-pass interferences between the trains. This results in dramatically
high travel-times and delays in the column representing the IP upper bound. The final
observation that we can make from the above tables and graph is that the best solution
procedure is PFSRC+GA, and hence this is our recommended procedure to solve the
capacity management problem.
Figure 24: Graphical comparison of delay values
[Bar chart of delay (min) for the LA rail network and Networks 1 to 4. Series: Construction, PFSLPR, PFSLPR+GA, PFSRC, PFSRC+GA, Lower bound, Upper bound.]
From Figure 24 we notice that the percentage improvements in delay values from the
construction heuristic to PFSRC+GA are 25.6%, 31.6%, 18.2%, 31.1% and 19.6%,
respectively.
7.3 Sensitivity analysis
A key approximation procedure in our modeling approach is aggregation. Ordinarily,
given the NP-hard nature of the capacity management integer program, we cannot
attempt to perform routing and scheduling over a full-size network that is more than 10
miles in length with about 10 trains. Aggregation allows us to shrink the network size by
reducing the number of nodes and arcs. The experiments shown in Section 7.2 attest to
the fact that aggregation allows us to perform routing and scheduling over a medium-size
network between 30 and 50 miles in length with about 70 trains on average.
While aggregation provides savings in terms of problem complexity and
computational time, we next present results that provide insight on the impact of
aggregation on solution quality. As we increase the size of the network section
aggregated into a single node (hereafter referred to as the degree of aggregation), we lose
accuracy in terms of the optimization model, because the estimated delay equations
become less representative of the actual delay. This, in turn, affects the quality of the
initial routes and release times obtained from the optimization model using one of the
aforementioned solution procedures.
For this purpose, we conduct sensitivity analysis tests. First, using skeleton structures
we build test networks comprising single-track and double-track sections, similar to
what was done in Section 7.2. Next, for each network we vary the degree of aggregation
by varying the length of the network section represented by each aggregate node in the
general network. This translates to varying the number of nodes and arcs in G(N,A). For
each level of aggregation, we vary the train traffic volume over the network and obtain
delay values and solution times by using our recommended solution approach from the
previous section, PFSRC+GA. The test networks are included in Appendix B.
7.3.1 Test network 5
The skeleton structure for the first network is shown below:
ST1 - AB - BC - CD - DE - ST2
Figure 25: Skeleton structure for test network 5
Each of the aggregate nodes AB, BC, CD and DE is built by randomly assigning it to be
either double-track or single-track and then deciding the track length (L), track speed-limit
(V), number of crossings or sidings (C) and spacing between crossings or sidings
(S) based on Table 5 (if double-track) or Table 10 (if single-track). The resulting network
parameters are shown in Table 22 and the network is shown in Figure 38 in Appendix B.
Node   # of tracks   L      V    C   S
AB     Double        5.0    35   1   1
BC     Double        12.5   35   3   1
CD     Single        15.0   15   3   1
DE     Single        5.0    15   2   1
Table 22: Parameters for test network 5
AB and BC are double-track networks with 1 and 3 crossings, respectively,
distributed uniformly. CD and DE are single-track networks with 3 and 2 sidings,
respectively, distributed uniformly. The sidings are 1.5 miles long. As per the skeleton
structure in Figure 25, test network 5 has 6 nodes and 5 arcs. AB and BC, being
double-track networks, can be aggregated into a single aggregate node. Similarly, CD and DE
can be aggregated together into a single node. This results in a general network with 4 nodes
and 3 arcs. The degree of aggregation can also be reduced, giving networks with 18 nodes
and 21 arcs, and with 31 nodes and 27 arcs. The sensitivity analysis test results are presented below.
# of nodes   # of arcs   # of trains   CPU Time (sec)   Delay (min)   Travel time (min)
31           27          4             210.58           31.66         56.31
31           27          8             1175.28          33.02         74.17
31           27          10            2192.52          36.91         103.44
31           27          20            5085.36          43.27         245.05
31           27          30            9886.81          48.61         505.33
18           21          4             166.49           31.66         56.31
18           21          8             847.76           33.02         74.17
18           21          10            1814.14          37.13         105.10
18           21          20            4125.86          44.27         264.16
18           21          30            7624.09          51.08         557.88
6            5           4             128.40           31.94         56.31
6            5           8             653.40           33.42         78.40
6            5           10            1291.99          37.65         111.61
6            5           20            2467.47          45.00         277.89
6            5           30            4587.00          53.47         598.90
4            3           4             88.68            31.94         56.99
4            3           8             246.07           33.42         79.96
4            3           10            900.07           39.86         119.74
4            3           20            1545.74          48.46         294.40
4            3           30            2307.25          57.85         611.50
Table 23: Sensitivity analysis results for test network 5
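Figure 26: Variation in delay values: test network 5
[Plot of delay (min) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2, Agg. 3.]
Figure 27: Variation in solution times: test network 5
[Plot of CPU time (sec) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2, Agg. 3.]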
The above figures show how the delay values increase and solution times decrease as
the degree of aggregation increases. It takes less time to solve using our recommended
solution approach because of the fewer number of nodes and arcs. However, the solution
quality is sacrificed. Based on the results for test network 5, we can conclude that “Agg.
2” with 6 nodes and 5 arcs performs almost as well as “Agg. 1” with 18 nodes and 21
arcs. The increase in delay for “Agg. 2” relative to “Agg. 1” is under 5% when running
experiments with 30 trains. In other situations, both perform almost equally well. However,
with 30 trains the solution time for “Agg. 2” is about 40% lower than for “Agg. 1” and about
54% lower than with no aggregation (Table 23). Hence, we gain a large saving in solution
time by making a small sacrifice in terms of modeling the delay values.
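The trade-off between solution quality and solution time can be checked directly against the tabulated results. The sketch below recomputes the percentage change in delay and CPU time between two aggregation levels from the values in Table 23; the helper names `pct_change` and `compare` are hypothetical and introduced only for this illustration.

```python
# Delay (min) and CPU time (sec) from Table 23, keyed by (nodes, arcs)
# and listed for 4, 8, 10, 20 and 30 trains.
TRAINS = [4, 8, 10, 20, 30]
DELAY = {
    (31, 27): [31.66, 33.02, 36.91, 43.27, 48.61],
    (18, 21): [31.66, 33.02, 37.13, 44.27, 51.08],
    (6, 5): [31.94, 33.42, 37.65, 45.00, 53.47],
    (4, 3): [31.94, 33.42, 39.86, 48.46, 57.85],
}
CPU = {
    (31, 27): [210.58, 1175.28, 2192.52, 5085.36, 9886.81],
    (18, 21): [166.49, 847.76, 1814.14, 4125.86, 7624.09],
    (6, 5): [128.40, 653.40, 1291.99, 2467.47, 4587.00],
    (4, 3): [88.68, 246.07, 900.07, 1545.74, 2307.25],
}


def pct_change(new, base):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base


def compare(level, baseline):
    """Rows of (trains, delay change %, CPU time change %) for one level."""
    return [
        (n,
         pct_change(DELAY[level][i], DELAY[baseline][i]),
         pct_change(CPU[level][i], CPU[baseline][i]))
        for i, n in enumerate(TRAINS)
    ]


for n, d, c in compare((6, 5), (18, 21)):
    print(f"{n:>2} trains: delay {d:+.1f}%, CPU time {c:+.1f}%")
```

Passing `(31, 27)` as the baseline gives the same comparison against the network with no aggregation.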
7.3.2 Test network 6
The skeleton structure for the test network is shown in Figure 28, and the parameters for
the aggregate nodes are shown in Table 24.
[Skeleton diagram with nodes ST1, AD, BD, CE, DF, EG and ST2]
Figure 28: Skeleton structure for test network 6
Node   # of tracks   L      V    C   S
AD     Single        15.0   15   3   1.00
BD     Single        15.0   35   4   0.85
CE     Single        15.0   55   2   1.00
DF     Double        12.5   55   3   0.75
EG     Double        12.5   35   5   1.00
Table 24: Parameters for test network 6
The network obtained once the topological characteristics of the aggregate nodes are
determined is shown in Figure 39. As per its skeleton structure, test network 6 has 7 nodes
and 9 arcs. Due to the structure of the network, it cannot be aggregated any further. The
results of running sensitivity analysis tests on this network by varying its degree of
aggregation and traffic volume are shown in Table 25.
# of nodes   # of arcs   # of trains   CPU Time (sec)   Delay (min)   Travel time (min)
41           57          4             707.05           19.41         104.71
41           57          8             1444.80          20.39         117.16
41           57          10            2482.68          24.86         141.68
41           57          20            5710.70          32.11         172.52
41           57          30            13626.48         40.78         209.35
27           36          4             344.48           19.41         104.71
27           36          8             959.18           20.59         118.57
27           36          10            2101.38          25.16         144.37
27           36          20            4145.22          33.27         176.66
27           36          30            9487.80          43.90         215.21
7            9           4             131.08           19.70         117.28
7            9           8             627.47           20.94         139.42
7            9           10            1364.74          26.43         178.52
7            9           20            2826.53          35.35         227.73
7            9           30            4864.68          45.55         293.09
Table 25: Sensitivity analysis results for test network 6
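Figure 29: Variation in delay values: test network 6
[Plot of delay (min) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2.]
Figure 30: Variation in solution time: test network 6
[Plot of CPU time (sec) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2.]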
The plots for the variation of delay and solution time with the degree of aggregation
and traffic volume are shown in Figures 29 and 30 above. For this network, there is a
worst-case increase of 11% in the delay values in going from a network with no
aggregation to a network with 7 nodes and 9 arcs. In addition, there is a nearly 65% drop
in the solution time for “Agg. 2” relative to “No agg.” with 30 trains.
7.3.3 Test network 7
The skeleton structure for this network is shown below. The parameters for the test
network are shown in Table 26.
[Skeleton diagram with nodes ST1, AB, CD, BE, EF, GH and ST2]
Figure 31: Skeleton structure for test network 7
Node   # of tracks   L      V    C   S
AB     Double        5.0    15   1   1.00
CD     Double        5.0    35   1   0.50
BE     Double        20.0   35   5   1.00
EF     Single        5.0    15   2   0.70
GH     Single        5.0    35   2   0.85
Table 26: Parameters for test network 7
The above structure has 7 nodes and 8 arcs. Each of the 5 aggregate nodes is
populated by randomly deciding whether it is single-track or double-track,
and then suitably fixing its length, speed-limit, number of crossings or sidings and
spacing parameters. The resulting network is shown in Figure 40. The network can be
further aggregated into AB plus BE, CD plus BE, EF and GH. AB and CD can each be
aggregated with BE because they are all double-track sections. This reduces the number
of nodes to 6 and the number of arcs to 6. The results from the sensitivity analysis tests,
varying the degree of aggregation and traffic volume, are shown in Table 27.
# of nodes   # of arcs   # of trains   CPU Time (sec)   Delay (min)   Travel time (min)
32           40          4             779.70           26.10         211.08
32           40          8             1593.27          30.55         256.76
32           40          10            2737.80          38.67         308.43
32           40          20            6297.54          48.91         492.30
32           40          30            15026.75         54.84         664.21
19           22          4             373.09           26.10         211.50
19           22          8             1038.84          30.86         260.10
19           22          10            1821.62          39.72         315.22
19           22          20            4814.38          50.79         505.59
19           22          30            8976.03          57.39         689.45
7            8           4             135.01           26.80         232.65
7            8           8             646.29           32.32         292.71
7            8           10            1405.68          42.32         376.28
7            8           20            3529.33          54.81         635.07
7            8           30            5628.63          63.28         896.68
6            6           4             110.41           29.15         249.07
6            6           8             428.44           34.39         320.95
6            6           10            1177.80          44.98         400.96
6            6           20            2075.40          60.20         738.94
6            6           30            3260.04          69.30         1113.88
Table 27: Sensitivity analysis results for test network 7
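The variations in delay and solution time with degree of aggregation and traffic volume
are shown in Figures 32 and 33.
Figure 32: Variation in delay values: test network 7
[Plot of delay (min) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2, Agg. 3.]
Figure 33: Variation in solution time: test network 7
[Plot of CPU time (sec) against number of trains (4, 8, 10, 20, 30). Series: No agg., Agg. 1, Agg. 2, Agg. 3.]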
Based on the results from these sensitivity tests, we infer that “Agg. 1” has a very
slight increase in the delay values (4% in the worst-case) compared to no aggregation.
However, there is approximately 40% savings in terms of solution time. “Agg. 2” affords
a 60% savings in solution time compared to no aggregation. However, there is a
significant increase in delay values (15% in the worst case) w.r.t. no aggregation in the
network. As a result, “Agg. 1” is the best aggregation that can be applied to this test
network.
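The choice of “Agg. 1” for test network 7 can be reproduced with a simple selection rule applied to Table 27. The rule below is an assumption made for illustration, not a procedure stated in the text: accept only aggregation levels whose worst-case delay increase over “No agg.” stays within a tolerance (5% here, also an assumed value), and among those pick the level with the largest CPU time saving at 30 trains. The function names are hypothetical.

```python
BASELINE = "No agg."

# Delay (min) and CPU time (sec) from Table 27 for 4, 8, 10, 20, 30 trains.
DELAY = {
    "No agg.": [26.10, 30.55, 38.67, 48.91, 54.84],
    "Agg. 1": [26.10, 30.86, 39.72, 50.79, 57.39],
    "Agg. 2": [26.80, 32.32, 42.32, 54.81, 63.28],
    "Agg. 3": [29.15, 34.39, 44.98, 60.20, 69.30],
}
CPU = {
    "No agg.": [779.70, 1593.27, 2737.80, 6297.54, 15026.75],
    "Agg. 1": [373.09, 1038.84, 1821.62, 4814.38, 8976.03],
    "Agg. 2": [135.01, 646.29, 1405.68, 3529.33, 5628.63],
    "Agg. 3": [110.41, 428.44, 1177.80, 2075.40, 3260.04],
}


def worst_case_delay_increase(level):
    """Largest percentage delay increase over the baseline, across traffic levels."""
    return max(
        100.0 * (d - b) / b for d, b in zip(DELAY[level], DELAY[BASELINE])
    )


def cpu_saving(level):
    """Percentage CPU time saved relative to the baseline at 30 trains."""
    return 100.0 * (1.0 - CPU[level][-1] / CPU[BASELINE][-1])


def pick_level(tolerance_pct):
    """Fastest level whose worst-case delay increase is within the tolerance."""
    candidates = [
        lvl for lvl in DELAY
        if lvl != BASELINE and worst_case_delay_increase(lvl) <= tolerance_pct
    ]
    return max(candidates, key=cpu_saving) if candidates else BASELINE


for lvl in ("Agg. 1", "Agg. 2", "Agg. 3"):
    print(f"{lvl}: delay +{worst_case_delay_increase(lvl):.1f}%, "
          f"CPU time -{cpu_saving(lvl):.1f}%")
print(pick_level(5.0))  # Agg. 1
```

A looser tolerance shifts the choice towards coarser networks, which makes the trade-off explicit: the acceptable delay penalty is a planning decision, not something the data decides on its own.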
Chapter 8: Conclusions and Future Research
8.1 Conclusions
In this research, we develop a decision tool that can be used by railway planners to
develop good quality routes and schedules, on a daily or weekly basis, within a short
amount of time, to better manage the limited availability of track capacity. Railway
dispatchers can then use these plans to decide the real-time movement of trains through a
network on a daily basis. However, our research concentrates only on railway routing and
scheduling, without dispatching, for medium to large-scale rail networks.
A vast majority of the literature in railway capacity management deals with methods
to estimate capacities and delays, and to ensure that real-time perturbations on a rail
network can be managed with as few changes as possible to the railway timetable. To the
best of our knowledge, there has not been any effort from the research community to
improve rail track capacity use through efficient routing and scheduling jointly. Most
research papers on delay estimation and capacity assessment for railway networks do
not explicitly consider the vital and complex interactions between traffic, operating and
network parameters. In the case of the analytical models, heavy assumptions are made in
order to maintain the complexity of the problem within solvable bounds.
Given the daily traffic volume, train mix or heterogeneity, and sets of origins and
destinations, the integrated routing and capacity management model can be solved to
obtain the routes the trains should travel on and their order of departure from the origin
stations. This problem is akin to a no-wait job-shop scheduling problem, where the trains
are similar jobs that need to be processed and the rail-tracks are resources such as
machines. Another aspect to note about our modeling strategy is that the IP model can be
solved for a network of any size by using a suitable degree of aggregation. However, we
do not recommend its use for rail networks longer than 100 miles for three reasons that
were explained in Section 3.2. A large portion of the research on railway planning handles
train scheduling and dispatching at a micro-level, to determine every track segment,
junction and siding each train travels on. With the aid of aggregation techniques our
modeling procedure is carried out at a macro-level. The integrated routing and capacity
management model is capable of incorporating route flexibility as opposed to having a
train follow a fixed path from its origin to destination, and of handling large size railway
networks (50-100 miles in length) with many more trains. A novel feature of our
modeling strategy is the incorporation of track capacity into the routing model. The use
of capacity-delay correlations enables the adjustment of the capacity of a track or sub-
network to have admissible delays, and to reject or defer trains that would overload the
network.
Our planning methodology includes developing an integer programming-based
capacity management model that assigns trains to routes based on the statistical
expectation of running times in order to balance the railroad traffic. This model is also
capable of determining the best release times for trains to depart from origin stations and
enter a network. The planning model (IP*) can develop a complete solution, comprising
routes and schedules for every train from every node in the network. However, as we
explained in Chapter 3, the schedules from the intermediate nodes depend on the
operating conditions, congestion levels, train breakdowns etc. at the time of operation,
and are determined in real-time by dispatchers. Due to this reason, we focus only on
(IP*) to determine the initial routes and release schedules for the trains from their origin
stations. However, as the complexity of (IP*) grows with the number of nodes and arcs in
the network, it becomes harder to obtain the initial routes and release schedules in a
reasonable amount of time. For this reason, we resort to aggregation, wherein we
approximate a suitable section of a network as a single node. In Chapter 4, we present
results from an experiment showing that if the optimal initial routes and schedules from
(IP*) are fed into the simulation model, then the performance of the complete solution from
the simulation model and (IP*), if (IP*) is solved to optimality, are within a small gap
from each other. The gap increases as the degree of aggregation increases. In an
aggregate node, the assumption of instantaneous acceleration and deceleration is less
valid because of a higher probability of trains meeting and overtaking each other. Hence,
the time taken to travel through a node would no longer be linear. Towards this end, we
present a delay estimation methodology that computes delay as a function of the network
topology, traffic mix and operating parameters. The delay equations are fed into (IP*),
which then routes trains along aggregate nodes with the least expected delay based on the
traffic mix (X_i parameters) and the traffic in the opposite direction (D_1 and D_2
parameters). To solve the resulting (IP*) for medium and large-scale networks, we
develop a solution procedure (PFSLPR) based on the LP-relaxation formulation of (IP*),
use a rounding algorithm to decide the initial routes and solve a matching problem to
decide the release schedules. Another solution procedure we develop is based on the
LP-relaxation formulation with routing constraints (PFSRC) that force the routing variables
from origin stations for all trains to be integers. We again solve a matching problem to
decide the release schedules from the origin stations for all the trains. The number of
integer variables in PFSRC is significantly lower than in (IP*). Finally, we use a genetic
algorithm procedure to search the solution space and improve the solution obtained from
the previous two solution procedures. In Chapter 7, we use 4 test rail networks and an
actual rail network from the Los Angeles area to test the performance of the initial routes
and release schedules obtained by our solution procedures. We test the performance by
generating a complete solution, including departure schedules from intermediate nodes,
using the simulation model developed by Lu et al. (2004). We also compare the quality of
the routes obtained from (IP*) with the routes used by the greedy heuristic embedded in
the simulation model. Based on the travel-times and delay values, we conclude that
PFSRC+GA is the best solution procedure. We also conduct sensitivity analyses on 3 test
networks to evaluate the quality of the routes and release schedules obtained by
PFSRC+GA. We test the effect of degree of aggregation and traffic volumes on the
solution quality.
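The two post-processing steps of PFSLPR summarized above (rounding the fractional routing variables, then solving a matching problem for the release schedule) can be illustrated with a toy example. Everything in this sketch is an assumption made for illustration: the data are invented, the rounding rule (take the route with the largest fractional value) and the matching cost (total deviation from the fractional release times) are stand-ins, and the brute-force search over permutations would be replaced by a proper assignment algorithm on realistic instances. It does not reproduce the exact rounding or matching formulations of this work.

```python
from itertools import permutations

# Hypothetical LP-relaxation output: fractional share of each candidate
# route per train (shares sum to 1 for every train).
lp_routes = {
    "T1": {"R1": 0.7, "R2": 0.3},
    "T2": {"R1": 0.4, "R2": 0.6},
    "T3": {"R1": 0.5, "R2": 0.5},
}

# Hypothetical fractional release times (min) and available release slots.
lp_release = {"T1": 3.2, "T2": 0.8, "T3": 5.9}
slots = [0, 5, 10]


def round_routes(lp_routes):
    """Give each train its largest-valued route (ties: first route by name)."""
    return {
        train: max(sorted(shares), key=shares.get)
        for train, shares in lp_routes.items()
    }


def match_release_slots(lp_release, slots):
    """Assign one distinct slot per train, minimizing total deviation
    from the fractional release times (brute force, tiny instances only)."""
    trains = sorted(lp_release)
    best = min(
        permutations(slots, len(trains)),
        key=lambda p: sum(abs(lp_release[t] - s) for t, s in zip(trains, p)),
    )
    return dict(zip(trains, best))


print(round_routes(lp_routes))
print(match_release_slots(lp_release, slots))
```

The point of the sketch is the division of labor: the LP relaxation supplies fractional guidance cheaply, and the integer decisions are recovered afterwards by two small combinatorial steps instead of by solving (IP*) directly.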
We note at this point that we do not provide an approach to obtain a complete
solution to (IP*), comprising routes and schedules from all origin nodes as well as
intermediate nodes. As mentioned before, the schedules from intermediate stations are
best determined in real-time based on congestion levels. Therefore, we have developed an
approach that utilizes (IP*) to obtain good quality partial information (initial routes and
release times) about the movement of the trains, and uses this in conjunction with a
construction heuristic to estimate the actual delay by considering train lengths, speeds,
acceleration and deceleration rates. (IP*) relaxes some of these parameters to reduce its
computational complexity.
8.2 Future work
Exact solution procedures, based on column generation, or relaxation procedures such as
Lagrangian relaxation methods could be used to derive a complete solution to (IP*).
However, a complete solution to (IP*) might not be optimal in reality because (IP*)
determines the train schedules from the intermediate nodes while assuming trains are
infinitesimally small. Therefore, before developing an exact or relaxation-based solution
approach, we would need to verify whether an optimal solution to (IP*) always translates
to the best possible schedules for the intermediate nodes.
In our experimental analysis, we compare the performance of the routes and
schedules generated by our solution procedures with that of the lower and upper bounds
to (IP*). We note in Chapter 7 that no upper bounds were obtained for “LA rail network”
and “Network 1” due to the large number of binary variables. For the remaining test
networks, the upper bounds were of poor quality. We also note that the lower bounds
were very close to the LP-relaxation objective values. An interesting direction for future
work would be to introduce cuts to derive tighter lower and upper bounds.
In Chapter 5, we developed a generic delay estimation procedure to predict delays
for double-track and single-track network sections. An interesting extension would be to
develop generic models to predict delays across complex crossings, such as Colton
crossing located approximately 60 miles to the east of downtown Los Angeles. Such
crossings have trains entering and exiting in four or more directions, thereby having far
more interactions between trains compared to simple single or double-track sections.
Therefore, we might need to model parameters other than the traffic mix and the traffic in
the opposite direction.
A possible improvement to our current delay estimation methodology could entail
developing queuing models to identify the effects of topological, operating and traffic
parameters on the delay in a section of a network. While this could provide more accurate
delay estimates than our current methodology, it would introduce non-linear
constraints in (IP*). Hence, a more sophisticated solution procedure would be required to
solve the resulting routing and capacity management optimization model.
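As a hedged illustration of why queuing-based delay estimates would be non-linear (this is a textbook M/M/1 result, not a model developed in this work), consider a track section treated as a single server with Poisson train arrivals at rate \lambda and exponential service at rate \mu. Both distributional choices are assumptions made here only for illustration. The expected waiting time in queue is

```latex
W_q = \frac{\lambda}{\mu(\mu - \lambda)}, \qquad 0 \le \lambda < \mu
```

which is non-linear in the traffic volume \lambda and grows without bound as \lambda approaches \mu. A delay term of this form could not be used directly in a linear objective such as that of (IP*).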
Bibliography
M. Abril, F. Barber, L. Ingolotti, M. A. Salido, P. Tormos and A. Lova, “An assessment
of railway capacity”, Transportation Research Part E, 44(5), 774-806, 2008.
R. K. Ahuja, K. C. Jha, and J. Liu, “Solving real-life railroad blocking problems”,
Interfaces, No. 5, 404-419, 2007.
R. K. Ahuja, K. C. Jha, and G. Sahin, “New approaches for solving the block-to-train
assignment problems”, Networks, 51, 48-62, 2008.
R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, “Network Flows”, Prentice Hall, 1993.
C. Barnhart, H. Jin, P. H. Vance, “Railroad blocking: A network design application”,
Operations Research, 48, 603-614, 2000.
U. Brännlund, P. O. Lindberg, A. Nõu, and J.-E. Nilsson, “Railway timetabling using
Lagrangian relaxation”, Transportation Science, 32(4), 358-369, 1998.
R. L. Burdett and E. Kozan, “Techniques for absolute capacity determination in
railways”, Transportation Research Part B, 40, 616-632, 2006.
X. Cai and C. J. Goh, “A fast heuristic for the train scheduling problem”, Computers &
Operations Research, 21(5), 499-510, 1994.
A. Caprara, M. Fischetti, and P. Toth, “Modeling and solving the train timetabling
problem”, Operations Research, 50(5), 851-861, 2002.
A. Caprara, M. Fischetti, P. Toth, D. Vigo, and P. L. Guida, “Algorithms for railway
crew management”, Mathematical Programming, 79, 125-141, 1997.
M. Carey, “Ex ante heuristic measure of schedule reliability”, Transportation Research
Part B, 33, 473-494, 1999.
M. Carey and A. Kwiecinski, “Stochastic approximation to the effects of headways on
knock-on delays of trains”, Transportation Research Part B, 28(4), 251-267, 1994.
M. Carey and D. Lockwood, “A model, algorithms and strategy for train pathing”,
Journal of the Operational Research Society, 46, 988-1005, 1995.
B. Chen and P. T. Harker, “Two moment estimation of the delay on a single-track rail
line with scheduled traffic”, Transportation Science, 24, 261-275, 1990.
J. F. Cordeau, P. Toth and D. Vigo, “A survey of optimization models for train routing
and scheduling”, Transportation Science, 32(4), 380-404, 1998.
T. G. Crainic, J. A. Ferland, and J. M. Rosseau, “A tactical planning model for rail freight
transportation”, Transportation Science, 18, 165-184, 1984.
T. G. Crainic and M. Gendreau, “Approximate formulas for the computation of
connection delays under capacity restrictions in rail freight transportation”, Technical
Report 438, University of Montreal, Montreal, Canada, 1985.
A. D'Ariano, “Improving real-time train dispatching: models, algorithms and
applications”, TRAIL Thesis Series, t2008/6 ed., The Netherlands, 2008.
A. D'Ariano, D. Pacciarelli and M. Pranzo, “A branch and bound algorithm for
scheduling trains in a railway network”, European Journal of Operational Research,
183(2), 643-657, 2007.
A. F. De Kort, B. Heidergott, and H. Ayhan, “A probabilistic (max,+) approach for
determining railway infrastructure capacity”, European Journal of Operational Research,
148, 644-661, 2003.
P. J. Deitel and H. M. Deitel, “C++ - How to Program”, 6th Ed., Prentice Hall, 2006.
M. M. Dessouky and R. C. Leachman, “A simulation modeling methodology for
analyzing large complex rail networks”, Simulation, 65:2, 131-142, 1995.
M. M. Dessouky, Q. Lu and R. C. Leachman, “Using simulation modeling to assess rail
track infrastructure in densely trafficked metropolitan areas”, Proceedings of the 2002
Winter Simulation Conference, 725-231, 2002.
M. M. Dessouky, Q. Lu, J. Zhao and R. C. Leachman, “An exact solution procedure for
determining the optimal dispatching times for complex rail networks”, IIE Transactions,
pp. 141-152, 2006.
M. J. Dorfman and J. Medanic, “Scheduling trains on a railway network using a discrete
event model of railway traffic”, Transportation Research Part B, 38, 81-98, 2004.
M. Dyer and L. Wolsey, “Formulating the single machine sequencing problem with
release dates as a mixed integer program”, Discrete Applied Mathematics, 26(2-3), 255-
270, 1990.
O. Frank, “Two-way traffic in a single line of railway”, Operations Research, 14, 801-
811, 1966.
S. Gibson, G. Cooper, and B. Ball, “Developments in transport policy: The evolution of
capacity charges on the UK rail network”, Journal of Transport Economics and Policy,
36, 341-354, 2002.
M. F. Gorman, “An application of genetic and tabu searches to the freight railroad
operating plan problem”, Annals of Operations Research, 78, 51-69, 1998.
B. S. Greenberg, R. C. Leachman, and R. W. Wolff, “Predicting dispatching delays on a
low speed, single-track railroad”, Transportation Science, 22(1), 31-38, 1988.
S. F. Hallowell and P. T. Harker, “Predicting on-time line-haul performance in scheduled
railroad operations”, Transportation Science, 30, 364-378, 1996.
S. F. Hallowell and P. T. Harker, “Predicting on-time performance in scheduled railroad
operations: methodology and application to train scheduling”, Transportation Research
Part A, 32(4), 279-295, 1998.
P. T. Harker and S. Hong, “Two moments estimation of the delay on a partially double-
track rail line with scheduled traffic”, Transportation Research Forum, 30, 38-49, 1990.
A. Higgins and L. Ferreira, “Modeling single line train operations”, Transportation
Research Record, 1489, Transportation Research Board, 9-16, 1995.
A. Higgins and E. Kozan, “Modeling train delays in urban networks”, Transportation
Science, 32(4), 346-357, 1998.
A. Higgins, E. Kozan, and L. Ferreira, “Optimal scheduling of trains on a single-line
track”, Transportation Research Part B, 30B(2), 147-161, 1996.
R. Hillestad, B. D. Van Roo, and K. D. Yoho, “Key Issues in Modernizing the U.S.
Freight-Transportation System for Future Economic Growth”, Technical Report, RAND
Supply Chain Policy Center, 2009.
T. Huisman and R. J. Boucherie, “Running times on railway sections with heterogeneous
train traffic”, Transportation Research Part B, 35, 271-292, 2001.
C. L. Huntley, D. E. Brown, D. E. Sappington and B. P. Markowicz, “Freight routing and
scheduling at CSX Transportation”, Interfaces, 25(3), 58-71, 1995.
A. H. Kaas, “Methods to Calculate Capacity of Railways”, Ph.D thesis, Department of
Planning, Technical University of Denmark, 1998.
M. H. Keaton, “Designing railroad operating plans: A dual adjustment method for
implementing Lagrangian relaxation”, Transportation Science, 26, 263-279, 1992.
D. R. Kraay and P. T. Harker, “Real-time scheduling of freight railroads”, Transportation
Research Part B, 29B(3), 213-229, 1995.
D. R. Kraay, P. T. Harker, and B. Chen, “Optimal pacing of trains in freight railroads:
model formulation and solution”, Operations Research, 39(1), 82-99, 1991.
H. Krueger, “Parametric modeling in rail capacity planning”, Proceedings of the Winter
Simulation Conference, 1194-1200, 1999.
A. M. Law and W. D. Kelton, “Simulation Modeling and Analysis”, 2nd Ed., McGraw
Hill, 1991.
R. C. Leachman, “Inland Empire railroad main line advanced planning study”, Technical
Report, prepared for the Southern California Association of Governments, Contract
number 01-077, Work element number 014302, October 2002.
Q. Lu, M. M. Dessouky, and R. C. Leachman, “Modeling of train movements through
complex networks”, ACM Transactions on Modeling and Computer Simulation, 14, 48-
75, 2004.
C. Mannino and A. Mascis, “Optimal real-time traffic control in metro stations”,
Operations Research, 57(4), 1026-1039, 2009.
A. Mascis and D. Pacciarelli, “Job-shop scheduling with blocking and no-wait
constraints”, European Journal of Operational Research, 143, 498-517, 2002.
D. C. Montgomery, “Design and Analysis of Experiments”, 2nd Ed., John Wiley and Sons,
1984.
P. Murali, M. M. Dessouky, F. Ordóñez, and K. Palmer, “A delay estimation technique
for single and double-track railroads”, Transportation Research Part E, 46, 483-495,
2010.
R. H. Myers and D. C. Montgomery, “Response Surface Methodology”, 2nd Ed., Wiley
Series in Probability and Statistics, 2002.
K. Nachtigall and S. Voget, “A genetic algorithm approach to periodic railway
synchronization”, Computers & Operations Research, 23(5), 453-463, 1996.
K. Nachtigall and S. Voget, “Minimizing waiting times in integrated fixed interval
timetables by upgrading railway tracks”, European Journal of Operational Research,
103, 610-627, 1997.
H. N. Newton, “Network Design under Budget Constraints with Application to the
Railroad Blocking Problem”, Ph.D thesis, Auburn University, Auburn, AL, 1996.
E. S. Oliveira, “Solving single-track railway scheduling problem using constraint
programming”, Ph.D thesis, University of Leeds, Leeds, UK, 2001.
E. S. Oliveira and B. M. Smith, “A Job-Shop Scheduling Model for the Single-Track
Railway Scheduling Problem”, School of Computing Research Report 2000.21,
University of Leeds, England, 2000.
S. Özekici and S. Şengör, “On a rail transportation model with scheduled services”,
Transportation Science, 28(3), 246-255, 1994.
C. H. Papadimitriou and K. Steiglitz, “Combinatorial Optimization”, Dover, 1998.
E. R. Petersen, “Over the road transit time for a single-track railway”, Transportation
Science, 8, 65-74, 1974.
E. R. Petersen and A. J. Taylor, “A structured model for rail line simulation and
optimization”, Transportation Science, 16, 192-206, 1982.
A. A. B. Pritsker and J. J. O'Reilly, “Simulation with Visual SLAM and AweSim”, 2nd Ed.,
John Wiley and Sons, New York and Systems Publishing Corporation, West Lafayette,
Indiana, 1999.
C. R. Reeves and J. Rowe, “Genetic Algorithms Principles and Perspectives. A Guide to
GA Theory”, Kluwer Academic Publishers, 2003.
İ. Şahin, “Railway traffic control and train scheduling based on inter-train conflict
management”, Transportation Research Part B, 33, 511-534, 1999.
V. Salim and W. Cai, “Scheduling cargo trains using genetic algorithms”, IEEE
International Conference on Evolutionary Computation, 224-227, 1996.
V. Salim and W. Cai, “A genetic algorithm for railway scheduling with environmental
considerations”, Environmental Modelling & Software, 12(4), 301-309, 1997.
W. Suteewong, “Algorithms for Solving the Train Dispatching Problem for General
Networks”, Ph.D thesis, University of Southern California, 2006.
B. A. Weatherford, H. H. Willis, and D. S. Ortiz, “The State of U.S. Railroads: A Review
of Capacity and Performance Data”, Technical Report, RAND Supply Chain Policy
Center, 2008.
B. Vaidyanathan, R. K. Ahuja, J. Liu and L. A. Shughart, “Real-life locomotive planning:
new formulations, algorithms and computational results”, Transportation Research Part
B, 147-168, 2008.
E. Wendler, “The scheduled waiting time on railway lines”, Transportation Research
Part B, 41, 148-158, 2007.
C. Winston, “The Success of the Staggers Rail Act of 1980”, Technical Report, Brookings
Institution, September 2005.
J. Yuan, “Stochastic Modeling of Train Delays and Delay Propagation in Stations”, Ph.D
thesis, Delft University of Technology, The Netherlands, 2006.
J. Yuan and I. A. Hansen, “Optimizing capacity utilization of stations by estimating
knock-on train delays”, Transportation Research Part B, 41, 202-217, 2007.
Appendix A: Test Networks for Experiments
The test networks used for running the experiments are shown in Figures 34, 35, 36 and
37 below. They were obtained by randomly populating the aggregate nodes in skeleton
networks 1 and 2 shown in Figures 22 and 23. Note that a section representing an
aggregate node is either double-track or single-track, never a combination of both.
This makes it possible to estimate delay over these network sections using the
techniques described in Chapter 5.
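
The random population step can be illustrated with a minimal sketch. It is not the procedure used in this work: the node labels, the length range, the speed values and the uniform sampling are all assumptions made for illustration. It only demonstrates the constraint stated above, that each aggregate node receives exactly one track type.

```python
import random

# Hypothetical skeleton: ordered aggregate-node labels between two stations.
SKELETON = ["A", "B", "C", "D", "E"]


def populate_skeleton(aggregate_nodes, seed=None,
                      length_range=(1.0, 20.0), speed_limits=(35, 55)):
    """Assign each aggregate node one track type, a length and a speed limit.

    The ranges and the uniform sampling are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    sections = {}
    for node in aggregate_nodes:
        sections[node] = {
            # Exactly one type per section: never a mix of single and double.
            "track_type": rng.choice(["single", "double"]),
            "length": round(rng.uniform(*length_range), 3),
            "speed_limit": rng.choice(speed_limits),
        }
    return sections


network = populate_skeleton(SKELETON, seed=0)
for node, section in network.items():
    print(node, section)
```

Fixing the seed reproduces the same network, which is convenient when the same test instance must be reused across experiments.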
Figure 34: Test network 1 (diagram not reproduced; it shows a track network between two stations with nodes A, B, C, E, F, G and H and numeric labels)
Figure 35: Test network 2 (diagram not reproduced; it shows a track network between two stations with nodes A through I and numeric labels)
Figure 36: Test network 3 (diagram not reproduced; it shows a track network between two stations with nodes A through I and numeric labels)
Figure 37: Test network 4 (diagram not reproduced; it shows a track network between two stations with nodes A through G and numeric labels)
Appendix B: Test Networks for Sensitivity Analysis
Figure 38: Test network 5 (diagram not reproduced; it shows a track network between two stations with nodes A through E and numeric labels)
Figure 39: Test network 6 (diagram not reproduced; it shows a track network between two stations with nodes A through G and numeric labels)
Figure 40: Test network 7 (diagram not reproduced; it shows a track network between two stations with nodes A through H and numeric labels)
Abstract
In the United States, railways are the primary means of moving goods across the continent from ports to inland destinations. Mergers and the abandonment of rail lines have reduced track capacity and concentrated rail traffic onto fewer lines. In addition, growth in the number of containers has already introduced congestion and strained the capacity of the rail network in many corridors. U.S. freight railroads need better analytical tools to manage their capacity and scheduling. A challenging problem for railroad companies is planning the traffic and operating conditions over a network so that deadlocks are avoided and travel times stay below a threshold. This requires estimating travel times and delays in a network and determining the most efficient method of scheduling a set of trains.
Asset Metadata
Creator: Murali, Pavankumar (author)
Core Title: Strategies for effective rail track capacity use
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Industrial and Systems Engineering
Publication Date: 08/20/2010
Defense Date: 07/26/2010
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: delay estimation, OAI-PMH Harvest, railway routing, railway scheduling, transportation science
Place Name: USA (countries)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Dessouky, Maged M. (committee chair), Giuliano, Genevieve (committee member), Ordóñez, Fernando I. (committee member)
Creator Email: muralipavan@hotmail.com, pmurali@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m3408
Unique identifier: UC191353
Identifier: etd-Murali-4033 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-386987 (legacy record id), usctheses-m3408 (legacy record id)
Legacy Identifier: etd-Murali-4033.pdf
Dmrecord: 386987
Document Type: Dissertation
Rights: Murali, Pavankumar
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu