Coordinated Freeway and Arterial Traffic Flow Control
by
Tianchen Yuan
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
Doctor of Philosophy
(ELECTRICAL ENGINEERING)
May 2024
Acknowledgements
I would like to sincerely thank my advisor, Prof. Petros A. Ioannou, for his invaluable
supervision, support and guidance during the course of my PhD degree. His vast wisdom
and wealth of experience have inspired me throughout both my studies and my life. I am
very fortunate to be one of his students. I would also like to extend my gratitude to Prof.
Pierluigi Nuzzo, Prof. Ketan Savla, Prof. Maged Dessouky, and Prof. John Carlsson for serving
on my qualification committee or thesis committee.
In addition, I would like to thank my former lab mates and friends, Dr. Faisal Alasiri and
Dr. Yihang Zhang, for their assistance, insights and encouragement on multiple occasions,
even after their graduations.
Finally, my deepest appreciation and love go to my parents, Enzhu Yu and Xuqiang
Yuan, for their emotional and financial support over the past ten years of my life overseas.
Best wishes for their health and happiness.
Table of Contents
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Problem Description
1.2 Existing Work
1.3 Contribution
1.4 Outline
Chapter 2: Macroscopic Traffic Flow Models
2.1 Introduction
2.2 The Lighthill-Whitham-Richards (LWR) Model
2.3 The Cell Transmission Model (CTM)
2.4 Extended CTM for Freeway Bottlenecks
2.5 Extended CTM for Arterial Traffic
2.5.1 Frequent Lane Change
2.5.2 Queue Discharging
Chapter 3: Evaluation of Integrated Variable Speed Limit and Lane Change Control for Freeway Traffic Flow
3.1 Introduction
3.2 Multi-Section Cell Transmission Model
3.3 Control Design
3.3.1 Robust Variable Speed Limit Control
3.3.2 Lane Change Control
3.4 Numerical Simulations
3.4.1 Network Configuration
3.4.2 Parameter Selections
3.4.3 Performance Measurements and Evaluation
3.4.4 Uncertainties and Robustness Analysis
Chapter 4: Selection of the Speed Command Distance for Improved Performance of a Rule-Based VSL and Lane Change Control
4.1 Introduction
4.2 Control Design
4.2.1 Rule-Based Variable Speed Limit Control
4.2.2 Analysis of VSL Zone Distance
4.2.3 Lane Change Control
4.2.4 Stability Analysis
4.3 Numerical Simulations
4.3.1 Simulation Network and Performance Measurements
4.3.2 Evaluations with Various $L_0$
4.3.3 Proposed VSL vs. Feedback-Linearization VSL
Chapter 5: Integrated Freeway Traffic Control Using Q-Learning with Adjacent Arterial Traffic Considerations
5.1 Introduction
5.2 Methodology
5.2.1 Reinforcement Learning
5.2.2 Freeway Traffic Control Agent
5.2.3 Arterial Traffic Management
5.3 Numerical Simulations
5.3.1 Simulation Network and Parameters
5.3.2 Simulation Results
Chapter 6: Traffic Signal Control and Speed Offset Coordination Using Q-Learning for Arterial Road Networks
6.1 Introduction
6.2 Problem Statement
6.3 Methodology
6.3.1 TSC Agent
6.3.2 DSO Agent
6.3.3 Q-Learning
6.3.4 Training Process
6.4 Experimental Study
6.4.1 Simulation Network and Parameters
6.4.2 Evaluation Criteria
6.4.3 Evaluation Results
Chapter 7: Integration of Freeway and Arterial Traffic Control
7.1 Introduction
7.2 Problem Statement
7.3 Methodology
7.3.1 Freeway Traffic Control Agent
7.3.2 Traffic Signal Control Agent
7.3.3 Dynamic Speed Offset Agent
7.3.4 Training Process
7.4 Experimental Study
7.4.1 Simulation Network Configuration
7.4.2 Evaluation Criteria and Results
Chapter 8: Conclusion and Future Work
8.1 Conclusions
8.2 Future Work
References
List of Tables
3.1 Definition of Variables and Model Parameters
3.2 Evaluation Results for d = 7000 veh/h
3.3 Evaluation Results for d = 6000 veh/h
3.4 Uncertainties in Densities $\tilde{\rho}_i$
3.5 Uncertainties in Flows $\tilde{q}_i$
3.6 Uncertainties in Model Parameter $w$
4.1 Evaluations of Two VSL Schemes
5.1 Evaluations of a Moderate-Demand Scenario without Incident
5.2 Evaluations of a Moderate-Demand Scenario with Incident
5.3 Evaluations of a High-Demand Scenario without Incident
5.4 Evaluations of a High-Demand Scenario with Incident
6.1 Low-Demand without Incident
6.2 Moderate-Demand without Incident
6.3 High-Demand without Incident
6.4 Low-Demand with Incident
6.5 Moderate-Demand with Incident
6.6 High-Demand with Incident
7.1 Coordination Mechanisms Applied in Considered Control Strategies
7.2 Low Demands without Incident
7.3 Moderate Demands without Incident
7.4 High Demands without Incident
7.5 High Demands with Freeway Right-lane Incident
7.6 High Demands with Freeway Mid-lane Incident
7.7 High Demands with Freeway Left-lane Incident
List of Figures
2.1 Parabolic Fundamental Diagram
2.2 Triangular Fundamental Diagram
2.3 Division of Freeway Segment in the CTM Framework
2.4 Freeway Bottleneck in Arbitrary CTM Section
2.5 Multi-Lane CTM
2.6 Flow during Queue Discharging Process
3.1 Multi-Section CTM with Lane Drop
3.2 Multi-Section CTM with VSL Control
3.3 Fundamental Diagram with VSL Control
3.4 Lane Change Control
3.5 Relationship between ξ and Traffic Demands
3.6 I-710 Simulation Road Network
3.7 Fundamental Diagram from Open-Loop Simulations
3.8 Fundamental Diagram with Integrated Controller
4.1 Traffic Dynamics after the Activation of the Rule-Based VSL
4.2 Traffic Flows with the Activated Rule-Based VSL
4.3 I-710 Simulation Road Network with Adjustable $L_0$
4.4 RRMSE in Densities ($e_\rho$) vs. $L_0$.
4.5 Average Number of Stops ($\bar{s}$) vs. $L_0$.
4.6 Average Emission Rates of CO2 vs. $L_0$.
4.7 Average Travel Time (ATT) vs. $L_0$.
5.1 Road Network Consisting of a Freeway Segment and Adjacent Arterials
5.2 Road Network of a Single Freeway Section with Adjacent Arterials
5.3 Learning Process of FTC Agent
5.4 Calibration Road Network with an Isolated Intersection.
5.5 Traffic Signal Phasing Scheme with Five Phases
5.6 Linear Regression of Optimal Signal Cycle
5.7 I-710 Simulation Road Network with Incident Location
5.8 I-710 Simulation Road Network on Bing Map
5.9 Measured Vehicle Density Profiles in Freeway Section 4 (where the incident takes place). (a) Moderate demand without incident, (b) Moderate demand with incident, (c) High demand without incident, (d) High demand with incident.
6.1 Road Network under Proposed Arterial Traffic Control
6.2 Traffic Signal Control (TSC) Agent
6.3 Traffic Signal Phasing Scheme with Six Phases
6.4 The Square Area for TSC to Compute the Average Travel Time
6.5 Dynamic Speed Offset (DSO) Agent
6.6 Training Process of TSC and DSO Agent
6.7 I-710 Simulation Road Network
7.1 Connected Freeway and Arterial Road Network
7.2 Freeway Traffic Control (FTC) Agent
7.3 Estimation of $\tilde{d}_k^F$
7.4 Traffic Signal Control (TSC) Agent
7.5 Traffic Signal Phasing Scheme with Six Phases
7.6 Dynamic Speed Offset (DSO) Agent
7.7 Training Process of FTC, TSC and DSO Agents
7.8 I-710 Simulation Road Network with Possible Incident Locations
Abstract
Traffic congestion is a persistently growing problem in urban areas worldwide. To mitigate
travel delays, reduce fuel consumption, and address the additional costs produced by traffic
congestion, intelligent transportation systems (ITS) technologies, such as dynamic routing,
driver information systems, variable speed limits (VSL), lane change (LC) control, ramp
metering (RM), and traffic signal control (TSC), have been extensively explored and studied
over the past few decades. Although many ITS technologies have proven effective in either
freeway or arterial traffic management, the integrated control of the two systems has rarely
been investigated due to the difficulty of modeling two completely different traffic patterns
and the high complexity of the road network. Some studies have demonstrated the effectiveness of coordinating freeway ramp control with adjacent arterial signals in reducing travel
time and ramp queues, which is a preliminary step toward coordinating freeway and arterial
(CFA) operations and motivates further investigation in this dissertation.
The prerequisite of traditional traffic control design is to have a model capable of accurately reproducing traffic states with acceptable computational complexity. In this regard,
the cell transmission model (CTM) emerges as a promising candidate. To enhance consistency between macroscopic analysis and microscopic simulations, the original CTM undergoes modifications to incorporate the capacity drop effect and a disturbance term accounting
for potential uncertainties. Building upon the modified CTM, a combined feedback-based
Variable Speed Limit (VSL) and Lane Change (LC) control scheme is proposed to alleviate
freeway bottleneck congestion and mitigate uncertainties. Subsequently, the feedback-based
VSL is replaced by a rule-based VSL, where the distance of the upstream VSL zone, denoted
as $L_0$, is treated as a control variable. A lower bound of $L_0$ is derived analytically to prevent
additional shockwaves and is validated through microscopic simulations. The established
lower bound serves as a valuable design tool for fine-tuning and enhancing the performance
of VSL controllers.
To advance the study of CFA operations, the considered road network is expanded to
include both a freeway segment and adjacent arterial roads. The integrated control of the
two systems is developed in three steps: firstly, a freeway traffic control (FTC) strategy
that coordinates VSL, LC and RM actions is proposed; secondly, an arterial traffic control
strategy that coordinates TSC, offset and speed recommendations is proposed; finally, the
freeway and arterial traffic control strategies are integrated with necessary modifications to
each subcomponent. All control designs are based on a Q-learning (QL) framework for
a higher degree of coordination and fast implementation.
Chapter 1: Introduction
1.1 Problem Description
With the growing population density and transportation activities in metropolitan areas,
traffic congestion has become a recurrent event that notably arises at sensitive locations
such as freeway bottlenecks, ramps, and arterial intersections. In the United States, the
annual delay per auto commuter increased from 38 hours in 2000 to 54 hours in 2017, a rise
of 42.1%. Congestion also caused the annual wasted fuel per auto commuter to rise from
16 gallons in 2000 to 21 gallons in 2017, an increase of 31.3%. Over the same period, the
total cost of delays and wasted fuel per vehicle increased from 920 dollars to 1080 dollars
[1].
Expanding the transportation network to accommodate increased traffic demands is the most straightforward solution, but it is often impractical due to limited space, high
cost, and long construction periods. Instead, intelligent transportation systems (ITS)
technologies have been proposed and investigated as promising and cost-efficient methods to
improve the utilization of road capacity and alleviate congestion. In freeway networks, variable
speed limit (VSL), lane change (LC) control and ramp metering (RM) are the most popular
traffic regulation techniques [2, 3, 4, 5, 6]. In arterial networks, traffic signal control
(TSC) is the dominant traffic management strategy [7].
A single traffic regulation technique alone is not effective enough in high traffic demand
scenarios. Taking VSL as an example, although significant benefits of VSL control on traffic
mobility have been verified via macroscopic simulations [8, 9, 10], inconsistent improvements
have been reported especially under congested traffic conditions in microscopic simulations
and field tests [11, 12, 13]. According to the author’s observations, there are four factors that
potentially lead to the degradation of performance in microscopic results: the capacity drop
phenomenon due to forced lane changes in the vicinity of bottlenecks, the uncertainties not
captured by macroscopic traffic models, non-optimal VSL sign locations and the shockwaves
created by speed limit commands of VSL. Among these four factors, the capacity drop is the
primary cause of the inconsistency and can be effectively reduced by providing appropriate
LC recommendations to vehicles upstream of the bottleneck [4].
While combining multiple ITS technologies has been demonstrated to be more
beneficial than implementing them individually [9, 4, 14, 15, 16], many of these studies use an optimization
framework to coordinate different controllers, which requires substantial computational effort and may not be feasible for large-scale road networks. On the other hand, the feedback-based framework adopted in [17] requires accurate measurements and model parameters to
operate, which cannot be guaranteed in the real world.
Most existing integrated traffic control studies focused on the freeway network. The
joint control of freeway and arterial traffic has been rarely explored due to the difficulty
of modeling two completely different traffic patterns and the high complexity of the road
network. In practice, the coordinated operation of freeways and adjacent arterials is hindered
by the fact that the two facilities are typically managed by two separate authorities with
different objectives and limited communications [18]. In rush hours, this often leads to queue
overspills at on-ramps and off-ramps, which extend the traffic congestion from one area to
another and severely deteriorate traffic mobility. To address the issue, some studies
have coordinated freeway ramp control with adjacent arterial traffic signals [19, 20, 21, 22]
and demonstrated benefits in ramp queue and travel delay reduction within relatively small
networks. This work can be considered a preliminary step toward coordinating freeway and arterial
(CFA) operations and reveals the great potential of CFA for improving traffic operation
efficiency.
With the above research background, the following questions remain to be answered:
1. How do the uncertainties in measurements and model parameters affect the control
performance, and how can the robustness of the traffic control design against
these potential uncertainties be enhanced?
2. What is the optimal VSL sign location given a specific traffic demand, road configuration, and initial traffic conditions?
3. How can freeway traffic control techniques be coordinated with arterial traffic signals to
maximize the overall operation efficiency of both networks?
4. How much benefit does the coordinated traffic control provide compared with an uncoordinated control strategy in terms of travel time and queue reduction?
In this study, to answer the above questions, an integrated VSL and LC controller with
the ability to reject uncertainties is first proposed based on a modified cell transmission
model (CTM) that takes the capacity drop and uncertainty term into account. The effect of
uncertainties and the distance of the most upstream VSL zone are examined under various
traffic conditions in a freeway network via microscopic simulations. Then we develop a lower
bound that this distance needs to satisfy in order to guarantee homogeneous traffic density
across sections and reduce bottleneck congestion. After that, we expand the road network to
include both freeway and adjacent arterial streets, and coordinate all the control components
in the network under a Q-learning (QL) framework with the purpose of minimizing travel
time and preventing queue overspills at on-ramps and off-ramps. This involves three substeps:
designing a QL-based freeway traffic control strategy, designing a QL-based arterial traffic
control strategy, and integrating the freeway and arterial control. The performance of
each type of traffic control is evaluated using microscopic simulations over simplified real
road networks.
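As a point of reference for the QL framework mentioned above, the following is a minimal sketch of the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]. It is a generic illustration and not the agent design developed in later chapters: the state labels, the speed-limit actions, the reward value, and the parameters `alpha` and `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q(s, a)."""
    # Greedy value of the next state over all available actions.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: coarse congestion levels as states, speed limits (mi/h) as actions.
actions = (45, 55, 65)
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Hypothetical transition: choosing 45 mi/h in a congested state costs a reward of -10.
new_value = q_update(q_table, "congested", 45, reward=-10.0,
                     next_state="free", actions=actions)
```

With an all-zero table, this single step moves Q("congested", 45) one tenth of the way toward the target of -10, giving -1.0.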
1.2 Existing Work
Demand for freeway and arterial travel grows at a fast pace as the population increases in
metropolitan areas, leading to significant traffic congestion and delays at sensitive parts of
road networks such as ramps and intersections. Research efforts on freeway and arterial
traffic management as separate entities have both achieved certain levels of success in terms
of reducing travel time, collision risks and emissions.
Variable speed limit (VSL) control, lane change (LC) control and ramp metering (RM)
are considered the most effective freeway traffic flow control techniques to mitigate bottleneck
congestion [4, 5, 15, 23, 24, 25]. The VSL controller regulates the mainstream traffic flow
via speed limit commands in order to protect a freeway section from becoming congested by
maximizing its throughput. Some early VSL control studies focused on reducing the speed
variations and stabilizing the traffic flow using reactive rule-based logic [26, 27]. The improvement achieved by such rule-based VSL control approaches is often insignificant because
of the limited VSL actions and the time lag between these actions. In the past two decades,
the majority of the VSL control strategies were developed based on either local feedback [23,
28, 29] or optimal control techniques [24, 30, 31]. The main idea of the feedback-based VSL
controller is to compute the VSL commands using the current and past traffic states, which
usually requires less computation time than the optimal-control-based approach. However,
the performance of the feedback-based VSL relies heavily on the accurate measurements of
the traffic states, such as traffic flows and densities. Therefore, a small disturbance in measured densities, for example, may result in unsatisfactory performance of the closed-loop
system [32]. The optimal-control-based VSL strategies are typically implemented within the
Model Predictive Control (MPC) framework. At each time step, the VSL commands are
calculated by solving an optimization problem with an objective function involving performance measures, such as total travel time (TTT), safety measurements, emission, and fuel
consumption. This approach, however, does not guarantee the stability of the closed-loop
system and requires substantial computational effort when the road network is large [33].
Most of the VSL studies mentioned above assume a static environment with perfect
measurements and models, which is hardly true in real-world scenarios. Therefore, the
robustness of VSL control against various types of uncertainties has to be examined. The
existing approaches to enhance the robustness of VSL are mainly twofold: modifying
classic traffic models such as the Lighthill-Whitham-Richards (LWR) model [34, 35] and the
Cell Transmission Model (CTM) [36] to accommodate uncertainty terms, and implementing a
VSL controller that is less dependent on potential uncertainties. The first idea is adopted
in the following studies: Liu et al. proposed a two-stage stochastic model that considers
random traffic demands [37]; Alasiri et al. modeled the uncertainties as an additional term
in the traffic conservation law [38]. In accordance with the second idea, Frejo and Schutter
presented a rule-based VSL controller that activates or deactivates the speed limits when the
density of the corresponding bottleneck reaches a threshold that is determined offline [39].
This approach is less dependent on the accurate measurements of traffic states.
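The threshold logic described above can be sketched as follows. This is only an illustration of the general idea of switching a speed limit on and off based on a measured bottleneck density; it is not the controller of [39]. The separate activation and deactivation thresholds (`rho_on`, `rho_off`), the speed values, and the density sequence are hypothetical values chosen for the example.

```python
def rule_based_vsl(density, active, rho_on=40.0, rho_off=30.0,
                   v_free=65.0, v_vsl=45.0):
    """Return (active, speed_limit) for one control step.

    density  : measured bottleneck density (veh/mi/lane), hypothetical units
    active   : whether the reduced speed limit is currently displayed
    rho_on   : assumed activation threshold, determined offline
    rho_off  : assumed deactivation threshold (below rho_on to avoid chattering)
    """
    if not active and density >= rho_on:
        active = True
    elif active and density <= rho_off:
        active = False
    return active, (v_vsl if active else v_free)


def run(densities):
    """Apply the rule to a sequence of density measurements."""
    active = False
    limits = []
    for rho in densities:
        active, limit = rule_based_vsl(rho, active)
        limits.append(limit)
    return limits


# Hypothetical density trace: the limit drops at 42 and is released at 28.
limits = run([20.0, 35.0, 42.0, 36.0, 28.0])
```

Because the rule only compares the measurement against thresholds, a moderate measurement error changes the switching time slightly but does not enter the speed command itself, which is one way to read the reduced dependence on accurate measurements noted above.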
The deployment location of VSL signs is a crucial design parameter but has been neglected
by most of the above studies. Recent research has developed some standards for placing VSL
signs in order to achieve optimal control performance [40, 41, 42]. In [40], the authors
claimed that VSL signs should be placed so that there is enough space for vehicles to
accelerate and reach the bottleneck capacity. In [41], VSL signs were placed at locations in
an effort to minimize collision risks at freeway recurrent bottlenecks. In [42], Martinez and
Jin defined the "optimal" location as the minimum discharging distance to prevent capacity
drop, based on which they formulated an optimization problem using a variation of the LWR
model. However, the minimum discharging distance may not necessarily produce the best
performance in terms of travel time, safety and emission.
The underlying idea of LC control is to recommend that upstream vehicles
change lanes while still at a higher speed, before they reach the closed lane or lane drop.
In [4], LC was combined with VSL control, aiming to avoid forced lane changes at a bottleneck. The
work was further extended to include RM [17]. The authors limited their investigation to
the case in which the traffic demand is strictly higher than the road capacity. Considering
the use of Vehicle Automation and Communication Systems (VACS), an optimal control
problem formulation for a coordinated traffic flow control design involving ramp metering,
VSL, and LC control was proposed in [43]. The application of VACS allowed optimal
control to alleviate congestion and, thus, improve safety. Based on a simplified multi-lane
motorway traffic flow model, Markantonakis et al. investigated the use of two feedback
control strategies, namely VSL and LC, utilizing VACS [44]. The integrated control scheme
was evaluated using a microscopic simulation model. The results showed improvements
in maximizing the discharging flow rate at lane-drop bottleneck locations, assuming full
compliance rates and no communication delays. In [5], an integrated VSL and LC
scheme was proposed in order to alleviate congestion near bottlenecks. The combined controller has
an MPC structure, where a 2-lane CTM is adopted to predict the traffic states. The
considered model, however, did not account for the bounded acceleration behavior of vehicles,
and assumed that vehicles can change lanes instantaneously.
Since ramps connect freeways with arterial streets, a well-designed ramp metering strategy should be able to improve the traffic mobility of both regions. Some isolated RM algorithms were first proposed in the 1990s [45, 46], including the famous ALINEA [6], which
takes freeway occupancy as input and computes the metering rate in a local feedback control
manner. The classic ALINEA does not consider the potential spillback of on-ramp queues
under high traffic demands. Therefore, it was modified in [47] to avoid the overextension
of on-ramp queues by including both the mainstream occupancy and the queue length in
the feedback loop. Due to the fact that ramp flows are also affected by mainstream traffic, coordinated RM algorithms that take into account both local and system-wide traffic
conditions typically outperform isolated RM algorithms. In [48], Paesani et al. proposed
a system-wide adaptive RM algorithm to compute the metering rates based on estimated
future traffic states with linear regression. The lack of accurate real-time data makes such
methods deviate considerably from the theoretical best performance. In [49], another extension of ALINEA was made by connecting all the on-ramps via a central controller and
dynamically distributing the ramp demands. When one on-ramp queue reaches the threshold, the central controller increases the throughput of this particular on-ramp while decreasing
the throughput of other on-ramps. In [50], a similar two-level structure was embedded into
the RM algorithm. The upper-level controller determines the optimal total inflow using
an MPC framework, and the lower-level controller distributes the computed total inflow to each
on-ramp. Although improvement can be observed by coordinating each RM controller within
the network, the control performance is still limited when heavy traffic exists in the mainstream, as RM only affects the vehicle density closely downstream of the on-ramp. This
motivates the investigation of combining RM with mainstream traffic regulation techniques
such as VSL and LC control.
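To make the local feedback idea behind ALINEA concrete, the sketch below uses the commonly quoted integral form r(k) = r(k−1) + K_R (ô − o(k)), where o(k) is the measured mainstream occupancy downstream of the ramp and ô is its target. The gain, target occupancy, and rate bounds are illustrative assumptions, and the queue-aware modification of [47] is not reproduced here.

```python
def alinea_step(prev_rate, occupancy, target_occupancy=20.0, gain=70.0,
                r_min=200.0, r_max=1800.0):
    """Compute the next metering rate (veh/h) from the measured occupancy (%).

    prev_rate        : metering rate applied in the previous interval (veh/h)
    occupancy        : measured mainstream occupancy downstream of the ramp (%)
    target_occupancy : assumed set point, typically near the critical occupancy
    gain             : assumed regulator gain K_R (veh/h per % occupancy)
    r_min, r_max     : assumed physical bounds on the metering rate
    """
    rate = prev_rate + gain * (target_occupancy - occupancy)
    # Keep the command within the admissible range of the ramp signal.
    return min(max(rate, r_min), r_max)


# Occupancy above the target reduces the rate; occupancy below it restores the rate.
rate_1 = alinea_step(900.0, 25.0)    # 900 + 70 * (20 - 25) = 550
rate_2 = alinea_step(rate_1, 15.0)   # 550 + 70 * (20 - 15) = 900
rate_3 = alinea_step(300.0, 30.0)    # 300 - 700 < r_min, so clamped to 200
```

The law uses only the mainstream occupancy, so nothing in it prevents the on-ramp queue from growing when the rate is held low, which is the limitation that motivates the queue-aware and coordinated extensions discussed above.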
Due to limited on-ramp space, it is difficult to maintain efficient freeway operations and
avoid the spillback of ramp queues simultaneously with freeway traffic control only [18]. A
promising solution is to coordinate freeway and arterial (CFA) traffic, and utilize arterial road
capacities to mitigate on-ramp or freeway congestion. Traffic signal control (TSC) is the most
crucial and effective arterial-traffic-control approach and has been frequently studied by the
transportation community for over half a century. There are two main categories of existing
TSC strategies: fixed-time TSC and adaptive/traffic-responsive TSC. Fixed-time TSC operates by switching between predetermined signal programs based on the time of day, making
it well-suited for stable and unsaturated traffic conditions. In the 1960s, Webster and Miller laid
the groundwork for modern fixed-time TSC by developing a traffic signal timing model and
calculation method to minimize the average vehicle delay [51, 52, 53]. In [54], the original
Webster model was modified to consider vehicle delays, fuel consumption and emission
rates. Although the Webster model is able to optimize the performance of traffic flows for
an isolated intersection, repeatedly applying the model to closely spaced signals on an
arterial road may not yield the best performance. Instead, it is recommended to synchronize
the traffic signal timing with proper offsets to create a green wave and reduce the overall
vehicle stops for red lights [55]. This idea was incorporated into the MAXBAND
model [56]. The classic MAXBAND model lacks robustness with respect to unpredictable
arterial traffic characteristics, and thus has been modified in a number of later studies.
In [57], Gartner et al. divided the road into multiple segments and applied MAXBAND to
each segment separately. In [58], Arsava et al. incorporated origin-destination (OD) information and route guidance with MAXBAND to guarantee the bandwidth allocation for
major origin-to-destination flows. In [59], De Nunzio et al. combined speed advisory with
MAXBAND to reduce the travel time and the energy consumption. Another widely adopted
and extended fixed-time TSC scheme is TRANSYT [60], which utilizes historical traffic data
from the road network as input and computes the optimal signal timing via a heuristic "hill
climbing" algorithm.
The primary drawback of fixed-time TSC is the inability to manage highly saturated
traffic conditions or abnormal demands due to events or incidents. To address the issue,
some adaptive TSC methodologies have been investigated. In [61], Hunt et al. proposed a
traffic-responsive variation of TRANSYT called SCOOT, which modifies signal plans online
based on observed traffic flow rates and occupancy. In [7], a real-time hierarchical optimized
distributed effective system (RHODES) with two primary operational stages was introduced.
In the first stage, the system predicts future traffic flow rates across the road network using
sensor data. In the second stage, the optimal signal timing is computed with the predicted
traffic flow rates from the first stage. In [62], the authors presented an adaptive phase
allocation algorithm for a single signalized intersection that optimizes the phase sequence and
duration using vehicle location and speed data from connected vehicles. Even though many
adaptive TSC schemes have demonstrated satisfactory performance in various field tests, a common
limitation is their dependence on precise real-time traffic data and high-speed computing
resources. The complexity of the internal optimization problem of some adaptive TSC schemes is
significant, which imposes restrictions on the size of arterial networks that can be effectively
managed.
The recent developments in the study of eco-driving reveal the great potential of vehicle
speed control for improving arterial traffic mobility [63, 64, 65]. The objective of the
eco-driving agent is to find a velocity trajectory for the vehicle to travel across a series of
traffic signals with the fewest stops and the least energy consumption. Although eco-driving algorithms
are designed for vehicular-level traffic control, the concept can be implemented in flow-level
traffic control, i.e. coordinating speed advisory with TSC to improve the efficiency of arterial
travel.
Three types of frameworks have been considered to coordinate multiple control components mentioned previously in both freeway and arterial networks. The first one is the
optimization formulation proposed by Hegyi et al. with an objective function containing
the total time spent (TTS) for the mainstream traffic and the on-ramp queue [66]. This
framework has been adopted by many researchers since then to demonstrate the optimality
of their proposed integrated controllers [9, 67, 68, 69]. However, the optimization approach
is impractical for large-scale road networks because the computation time increases rapidly
with the network size [25]. To tackle this problem, several easy-to-implement integrated
controllers have been proposed [70, 71, 25] based on shock wave theory, feedback control
or logic rules. They are considered the second type of framework. A common drawback
of these algorithms is the lack of coordination between different types of controllers. The
third framework that has attracted great attention in the transportation research community recently is reinforcement learning (RL). Reinforcement learning achieves optimality
by having agents interact with the environment and learn the best action that maximizes a
long-term reward in each possible state [72]. Although the learning process of RL agents can
be very time-consuming, the field implementation is fast and is thus applicable to large-scale
road networks.
RL techniques have been adopted in the design of both VSL [73, 24] and RM [74] algorithms for freeway traffic control. In [75], Wang et al. proposed a coordinated VSL and RM
controller using deep RL methods to improve freeway mobility and alleviate the congestion
at recurring bottlenecks. On the other hand, RL is also a promising alternative to deal with
the scalability issue of adaptive TSC [76] for arterial traffic control. In [77], the authors
proposed a basic RL structure that assigns one agent per signal to lower delays and number
of stops for a two-way arterial road segment with five intersections. The RL agents deliver
a more balanced distribution of the travel delay than fixed-time control. However, each RL
agent optimizes the performance of a local area and the global optimum is not guaranteed.
Besides, the progression bandwidth problem is not considered in [77]. To achieve a global
optimum, many research efforts have focused on joint state-action modeling methods where the
agent learns to choose the most rewarding joint action with joint state observations [78, 79,
80]. The drawback of joint state-action methods is that the number of state-action pairs
grows exponentially with the number of intersections, leading to long training time and the
need for a huge amount of training data for large networks. In [81], the basic RL structure
was upgraded to minimize the sum of intersection queue lengths and the cumulative stop
delay of the entire network using a deep neural network (DNN). The authors utilized a proximal policy optimization (PPO) algorithm to strike a balance between speed and stability of
training, which solves the scalability issue with global optimization in some sense. The potential concern with PPO is the high sensitivity to the choice of hyperparameters and limited
exploration ability, both of which prevent the model from achieving the true optimum.
The fast development of RL-based traffic control algorithms motivates us to implement an
RL framework to coordinate both freeway and arterial control components in the considered
road network. The unknown transition probability of the traffic environment requires us to
adopt model-free RL algorithms such as Q-learning (QL) to search for the optimal policy [72].
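For reference, the tabular Q-learning update behind this choice can be sketched as follows. This is a minimal illustration only: the state and action encodings, the learning rate `alpha`, the discount factor `gamma` and the exploration rate `epsilon` are assumed placeholders, not the settings used in later chapters.

```python
import random
from collections import defaultdict


def make_q_table():
    """Q-table mapping (state, action) pairs to value estimates (default 0)."""
    return defaultdict(float)


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One model-free Q-learning step; no transition probabilities are required."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]
```

The update only uses observed transitions (state, action, reward, next state), which is why an unknown transition probability does not prevent learning.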
1.3 Contribution
Driven by the questions that have been raised in Section 1.1, this study is devoted to the design, analysis, and evaluation of coordinated traffic flow control over a complex road network
that consists of both freeway and adjacent arterial intersections. The considered traffic control components involve variable speed limit (VSL) control, lane change (LC) control, ramp
metering (RM), arterial traffic signal control (TSC) and arterial speed recommendations.
The main contributions of this study are summarized as follows:
• Discovering that for feedback-based VSL control schemes, uncertainties leading to lower
VSL commands are more detrimental to traffic mobility than uncertainties leading
to higher VSL commands, because the traffic is slowed down excessively and extra
shockwaves are generated with lower VSL commands.
• Proposing an easy-to-implement rule-based VSL control scheme that applies to low-complexity freeway networks with low computational cost. The performance of the
rule-based VSL is no worse than the feedback-based VSL, which demonstrates the
advantage of concentrating the control efforts upstream over splitting them into every
section when the traffic conditions are homogeneous in downstream sections.
• Developing a lower bound for the length of the most upstream VSL section that leads to
faster convergence to steady state densities and achieves much higher benefits compared
to ad hoc locations used in past research. It is then extended by considering ramp flows
and arbitrary bottleneck locations. The generated lower bound is an effective design
tool in tuning and improving the performance of VSL.
• Developing a freeway traffic control (FTC) strategy that coordinates VSL, LC and
RM using a Q-learning (QL) algorithm. The agent determines the VSL, LC and RM
actions based on observed traffic conditions of both the freeway section and adjacent
arterial intersections. The proposed FTC significantly improves the freeway travel time
compared to uncoordinated FTC. Moreover, it also reduces the average queue length
of arterial intersections by delivering smooth off-ramp flows that align with the arterial
TSC.
• Developing an adaptive arterial traffic control strategy that coordinates traffic signal
timing, offset and vehicle speed using a QL algorithm. The traffic signal control (TSC)
agent determines the signal cycle length and the split based on observed intersection
demands and freeway off-ramp queues. The dynamic speed offset (DSO) agent determines the relative offset and the recommended vehicle speeds between two adjacent
arterial signals based on the physical distance, the intersection queues and the signal
plans from TSC. The proposed control reduces the off-ramp queue significantly (over
70% compared with fixed-time TSC) with a slight trade-off in arterial travel time under
high traffic demands.
• Developing an integrated freeway and arterial traffic control where all the control components are coordinated. The FTC agent estimates the on-ramp demand using the
adjacent intersection demands and signal timing and takes proactive control actions.
The TSC agent is modified to consider both on-ramp and off-ramp queues and prevent potential overspills. The coordinated control is much more effective in travel time
reduction under low or moderate demands and ramp queue management under high
demands compared with multiple under-coordinated control strategies.
1.4 Outline
The rest of this dissertation is organized as follows: Chapter 2 reviews first-order freeway
traffic models and their arterial extensions. Chapter 3 proposes a combined VSL and LC
controller to alleviate freeway bottleneck congestion and investigates the effect of various
types and levels of uncertainties. In Chapter 4, a rule-based VSL control that treats the
distance of the upstream VSL zone as a control variable is proposed. A lower bound of
this distance is derived analytically and verified using microscopic simulations with PTV
VISSIM 10. In Chapter 5, the road network of interest is expanded to include a freeway
segment and its adjacent arterial intersections. Then a QL-based FTC strategy is developed
to minimize freeway travel time and stabilize the vehicle density of each section. Chapter
6 presents an adaptive arterial traffic control strategy that combines TSC and DSO using
QL algorithms. Mutual benefits are obtained by integrating the arterial control with the
FTC from Chapter 5. In Chapter 7, the integration of freeway and arterial traffic control
is achieved with slight modifications to each control component and a better coordination
mechanism. It is compared to multiple under-coordinated control strategies to demonstrate
the benefit of each coordination mechanism. Chapter 8 presents the conclusions and future
work.
Chapter 2: Macroscopic Traffic Flow Models
2.1 Introduction
Traffic models are the foundation for the design of traffic flow control algorithms in terms of
both theoretical analysis and simulations. According to the level of detail, traffic models can
be divided into microscopic, mesoscopic and macroscopic models [82]. Microscopic models
focus on the movement and interaction of individual vehicles, and thus, they are very accurate
but computationally expensive [83, 84]. Mesoscopic models still consider individual vehicles
in terms of their origins, destinations and routes, but describe traffic flows aggregately in
a macroscopic manner [85, 86]. Macroscopic models represent traffic dynamics using three
aggregate traffic states, namely vehicle density ρ, flow rate q and average speed v, in analogy
with the flow of fluids [87]. They are adopted in this study because of the low computational
cost compared to the other two counterparts and the compatibility with large-scale road
networks and real-time applications. Great research efforts have been made over the years
in order to develop macroscopic models that reproduce road traffic behaviors accurately,
primarily in freeway networks [34, 35, 36, 88, 89, 90, 91, 92]. However, the macroscopic
modeling for arterial networks has been much less investigated due to the complexity of
arterial traffic and road configurations.
In this chapter, we list a few representative macroscopic traffic models in both freeway
and arterial scenarios. We only discuss the high-level theory and the general formulation
of these models here. Extensions and modifications will be made to accommodate special
traffic conditions such as uncertainties, capacity drop and arbitrary bottleneck locations in
future chapters.
2.2 The Lighthill-Whitham-Richards (LWR) Model
The first macroscopic traffic model dates back to the 1950s when Lighthill, Whitham and
Richards proposed the famous LWR model, which has become a valuable tool for the study
of traffic behaviors and a solid basis for the design of modern traffic models. The LWR
model is a continuous first-order model described by the following equations:
$$\frac{\partial \rho(x,t)}{\partial t} + \frac{\partial q(x,t)}{\partial x} = 0 \tag{2.1}$$
$$q = \rho v \tag{2.2}$$
$$q = Q(\rho) \tag{2.3}$$
where x, t, ρ, q and v denote the location, time, vehicle density, flow rate and speed, respectively. Equation (2.1) is a first-order partial differential equation that represents the conservation law of vehicles. Equation (2.2) describes the relationship between vehicle density,
flow and speed using an analogy to fluid dynamics. Equation (2.3) reflects the flow-density
relationship under steady-state conditions, also known as the fundamental diagram (FD).
The FD can be found using real-world traffic data, which was first attempted by Greenshields in the 1930s [93], as depicted in Figure 2.1. In this FD, the flow rate increases with the
density until the density reaches the critical value ρc. After that, the flow rate decreases as
the density rises further because drivers slow down out of safety concerns. ρc corresponds to the maximum possible flow rate $q^{max}$ and
$\rho^j$ corresponds to the jam density where the traffic completely stops. Besides the parabolic
FD discovered by Greenshields, another widely adopted FD type is the triangular FD, as depicted in Figure 2.2. The triangular FD assumes that the traffic moves at the free-flow speed
vf before the density reaches ρc. When the density is higher than ρc, i.e. under congested
traffic conditions, it assumes the backpropagation speed of the shockwave is a constant w.
Figure 2.1: Parabolic Fundamental Diagram
Figure 2.2: Triangular Fundamental Diagram
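The two fundamental diagrams can be written as simple flow-density functions. The sketch below is illustrative: the parabolic form assumes Greenshields' linear speed-density relation, and the function names are hypothetical.

```python
def greenshields_flow(rho, vf, rho_jam):
    """Parabolic FD: speed falls linearly with density (Greenshields' assumption)."""
    rho = min(max(rho, 0.0), rho_jam)
    return vf * rho * (1.0 - rho / rho_jam)


def triangular_flow(rho, vf, w, rho_jam):
    """Triangular FD: free-flow branch vf*rho, congested branch w*(rho_jam - rho)."""
    rho = min(max(rho, 0.0), rho_jam)
    return min(vf * rho, w * (rho_jam - rho))


def triangular_critical_density(vf, w, rho_jam):
    """Density at which the two branches of the triangular FD intersect."""
    return w * rho_jam / (vf + w)
```

For example, with vf = 100 km/h, w = 30 km/h and a jam density of 312 veh/km (the values used later in Chapter 3), the triangular FD peaks at a critical density of 72 veh/km with a flow of 7200 veh/h.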
2.3 The Cell Transmission Model (CTM)
Although the LWR model reproduces the traffic flow dynamics under both uncongested
and congested conditions accurately, finding its analytical solution is very difficult. Many
discretization schemes based on the LWR model have been proposed to address the issue.
The CTM, developed by Daganzo in the 1990s [36, 88], is the most widely recognized of these
discrete macroscopic traffic models.
Figure 2.3: Division of Freeway Segment in the CTM Framework
In the CTM framework, a road segment is partitioned into N small homogeneous sections/cells, which are numbered consecutively from 1 to N in the traffic flow direction, as depicted
in Figure 2.3. Each section/cell is characterized by the vehicle density, inflow, outflow, and
length, denoted by $\rho_i$, $q_i$, $q_{i+1}$, and $L_i$ respectively, where $i = 1, 2, \ldots, N$. The triangular
fundamental diagram (FD) in Figure 2.2 is adopted to characterize the flow-density relationship. In place of the partial differential equation (2.1), the density is updated using
a first-order ordinary differential equation based on the traffic flow conservation as follows:
$$\frac{d\rho_i(t)}{dt} = \frac{q_i(t) - q_{i+1}(t)}{L_i} \tag{2.4}$$
where the inflow $q_i$ and outflow $q_{i+1}$ are determined as the minimum of the demand (sending)
and supply (receiving) functions:
$$\begin{aligned}
q_i(t) &= \min\{D_{i-1},\ S_i\} \\
D_i &= \min\{v_i\rho_i,\ C_i,\ \tilde{w}(\tilde{\rho}^j - \rho_i)\} \\
S_i &= \min\{w(\rho^j - \rho_i),\ C_i\}
\end{aligned} \tag{2.5}$$
where $D_{i-1}$ is the demand from the upstream section $i-1$, $C_i$ is the capacity of section $i$
and corresponds to the maximum possible flow in the FD, and $\tilde{w}(\tilde{\rho}^j - \rho_i)$ and $w(\rho^j - \rho_i)$ denote
the flow rates under congested traffic conditions. The combined term $\min\{w(\rho^j - \rho_i), C_i\}$
represents the maximum possible flow that section $i$ can receive, namely the supply of section $i$.
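As an illustration of how (2.4) and (2.5) are used in simulation, the sketch below advances the cell densities by one forward-Euler step. The time discretization, the identical parameters for all cells, and the boundary treatment (a given upstream demand `d` and an unrestricted downstream exit) are assumptions made for this example; the function name is hypothetical.

```python
def ctm_step(rho, d, dt, L, vf, w, w_t, rho_jam, rho_jam_t, C):
    """Advance the cell densities by one forward-Euler step of (2.4)-(2.5).

    rho: list of densities (veh/km), cell 1 first; d: upstream demand (veh/h);
    dt: time step (h); L: cell length (km). All cells share the same parameters.
    Returns the new densities and the list of flows [q_1, ..., q_{N+1}].
    """
    n = len(rho)
    # Demand (sending) and supply (receiving) functions of every cell.
    demand = [min(vf * r, C, w_t * (rho_jam_t - r)) for r in rho]
    supply = [min(w * (rho_jam - r), C) for r in rho]
    # q[i] is the inflow of cell i (0-based); q[n] is the outflow of the last cell.
    q = [min(d, supply[0])]
    q += [min(demand[i - 1], supply[i]) for i in range(1, n)]
    q.append(demand[-1])  # assumed unrestricted downstream boundary
    new_rho = [rho[i] + dt / L * (q[i] - q[i + 1]) for i in range(n)]
    return new_rho, q
```

The step size should be small enough that a vehicle cannot cross a whole cell within one step (vf·dt ≤ L); for example, dt = 10 s with L = 1.6 km and vf = 100 km/h satisfies this condition.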
2.4 Extended CTM for Freeway Bottlenecks
Although the original CTM can reproduce the traffic dynamics in normal circumstances, it
does not capture the capacity drop phenomenon and bounded acceleration effects produced
by forced lane change maneuvers at freeway bottlenecks or ramp merging areas [94, 91].
Without loss of generality, it is assumed that an incident occurs and creates a bottleneck at
section M (1 < M ≤ N), as shown in Figure 2.4.
Figure 2.4: Freeway Bottleneck in Arbitrary CTM Section
To improve the consistency with microscopic observations, a modified multi-section CTM
that accommodates the effect of both capacity drop and bounded acceleration is considered
[29]. Accordingly, the following equations describe the evolution of the vehicle density $\rho_i$ in
each section:
$$\dot{\rho}_i = \frac{1}{L_i}\,(q_i - q_{i+1} + r_i - s_i), \quad \text{for } i = 1, \ldots, N, \tag{2.6}$$
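where
$$\begin{aligned}
q_1 &= \min\{d,\ C,\ w(\rho^j - \rho_1)\}, \\
q_i &= \min\{v_f\rho_{i-1},\ \tilde{w}(\tilde{\rho}^j - \rho_{i-1}),\ C,\ w(\rho^j - \rho_i)\}, \quad \text{for } i = 2, \ldots, M-1,\ M+2, \ldots, N, \\
q_M &= \min\{v_f\rho_{M-1},\ \tilde{w}(\tilde{\rho}^j - \rho_{M-1}),\ (1 - \epsilon(\rho_M))C_b,\ w(\rho^j - \rho_M)\}, \\
q_{M+1} &= \min\{v_f\rho_M,\ \tilde{w}(\tilde{\rho}^j - \rho_M),\ (1 - \epsilon(\rho_M))C_b,\ w(\rho^j - \rho_{M+1})\}, \\
q_{N+1} &= \min\{v_f\rho_N,\ C,\ \tilde{w}(\tilde{\rho}^j - \rho_N)\},
\end{aligned} \tag{2.7}$$
and
$$\epsilon(\rho_M) = \begin{cases} \epsilon_0 & \text{if } C_b < C \text{ and } \rho_M > \dfrac{C_b}{v_f} \\ 0 & \text{otherwise.} \end{cases}$$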
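The flow computation in (2.7) can be sketched in code as follows. The function names are hypothetical, and the handling of the special case M = N (the exit flow is then limited by the bottleneck capacity, as in the lane-drop model of Chapter 3) is an assumption of this example, since (2.7) does not state it explicitly.

```python
def capacity_drop(rho_m, C, Cb, vf, eps0):
    """epsilon(rho_M): eps0 once the bottleneck cell exceeds Cb/vf, otherwise 0."""
    return eps0 if (Cb < C and rho_m > Cb / vf) else 0.0


def bottleneck_flows(rho, d, M, C, Cb, vf, w, w_t, rho_jam, rho_jam_t, eps0):
    """Flows q_1..q_{N+1} of (2.7) as a dict keyed by the 1-based flow index.

    rho[0] is the density of cell 1 and M is the 1-based bottleneck cell.
    """
    n = len(rho)
    if not 1 < M <= n:
        raise ValueError("M must satisfy 1 < M <= N")
    cb_eff = (1.0 - capacity_drop(rho[M - 1], C, Cb, vf, eps0)) * Cb
    q = {1: min(d, C, w * (rho_jam - rho[0]))}
    for i in range(2, n + 1):
        up, cur = rho[i - 2], rho[i - 1]
        cap = cb_eff if i in (M, M + 1) else C  # bottleneck limits q_M and q_{M+1}
        q[i] = min(vf * up, w_t * (rho_jam_t - up), cap, w * (rho_jam - cur))
    last_cap = cb_eff if M == n else C  # assumption for the case M = N
    q[n + 1] = min(vf * rho[-1], last_cap, w_t * (rho_jam_t - rho[-1]))
    return q
```

Once the density of cell M exceeds Cb/vf, both the inflow and the outflow of the bottleneck cell are capped at (1 − ϵ0)Cb, which is how the model reproduces the capacity drop.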
2.5 Extended CTM for Arterial Traffic
The CTM formulation introduced in the previous section is designed primarily for the freeway
environment where the traffic pattern is relatively simple and consistent. However, it cannot
accurately reproduce the dynamics of arterial traffic due to the complex vehicular interactions
and different road configurations in arterial networks. In this section, we list some common
arterial traffic behaviors that the traditional CTM fails to capture and discuss the possible
solutions to these issues.
2.5.1 Frequent Lane Change
Lane change (LC) occurs more frequently on arterial roads due to the following two reasons
[92]:
• Channelization: when vehicles approach an intersection, they merge onto specific lanes
(left/through/right) depending on their destinations.
• Lane usage balancing: when multiple lanes serve the same direction, drivers tend to
choose the one with lower occupancy in order to reduce the travel time.
LC creates empty spaces in the traffic stream and significantly impacts the flow rate if it
happens frequently. To incorporate the LC effect into the CTM, researchers have proposed
several multi-lane CTMs in which each CTM section consists of multiple subsections to model
the traffic flow of each lane separately [95, 96], as depicted in Figure 2.5.
Figure 2.5: Multi-Lane CTM
The evolution of the vehicle density still follows the conservation of traffic flows and
resembles the one presented in section 2.3:
$$\frac{d\rho_{i,j}(t)}{dt} = \frac{q_{i,j}(t) - q_{i+1,j}(t)}{L_i} \tag{2.8}$$
where ρi,j , qi,j , qi+1,j denote the density, inflow, outflow of section i lane j, and Li denotes
the length of section i. To compute qi,j , note that the demand function consists of the
longitudinal demand and the lateral demand from adjacent lanes, i.e.
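$$\begin{aligned}
q_{i,j}(t) &= \min\{D_{i-1,j},\ S_{i,j}\} \\
D_{i,j} &= D^{long}_{i,j} + D^{lat}_{i,j} \\
D^{long}_{i,j} &= \min\{v_i\rho_{i,j},\ C_{i,j},\ \tilde{w}(\tilde{\rho}^j - \rho_{i,j})\} \\
S_{i,j} &= \min\{w(\rho^j - \rho_{i,j}),\ C_{i,j}\}
\end{aligned} \tag{2.9}$$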
where the lateral demand $D^{lat}_{i,j}$ may contain one or two parts depending on the number of
adjacent lanes that lane $j$ has. For instance, in Figure 2.5, $D^{lat}_{i-1,2} = D^{lat}_{i-1,12} + D^{lat}_{i-1,32}$ since
lane 2 is adjacent to both lanes 1 and 3.
In general, the lateral demand between two adjacent lanes should drive both lanes toward their desired densities. According to the principle of lane usage balancing, each lane’s
desired density within one section should be the same. However, they may also be different
considering the channelization effect. To quantify the channelization effect, we define the
desired lane-density ratio of section $i$ lane $j$ as $\gamma_{i,j}$. Then the lateral demand from lane $j'$ to
$j$ can be computed as
$$D^{*}_{i,j'j} = \frac{\gamma_{i+1,j}\,\rho_{i+1,j'} - \gamma_{i+1,j'}\,\rho_{i+1,j}}{\gamma_{i+1,j} + \gamma_{i+1,j'}}\;\delta_{LC}\,v_{i+1} \tag{2.10}$$
where $\delta_{LC}$ is a tuning parameter reflecting the sensitivity to imbalanced lane usage ($0 \le \delta_{LC} \le 1$).
The lateral demand $D^{*}_{i,j'j}$ is positive if its actual direction is from lane $j'$ to $j$, and negative
otherwise. Considering the conservation of traffic demands, the actual lateral demand $D^{lat}_{i,j'j}$
must not exceed the longitudinal demand on lane $j'$ if it is positive, and its magnitude must not
exceed the longitudinal demand on lane $j$ if it is negative. Therefore,
$$D^{lat}_{i,j'j} = \max\{-D^{long}_{i,j},\ \min\{D^{long}_{i,j'},\ D^{*}_{i,j'j}\}\} \tag{2.11}$$
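A small sketch of (2.10) and (2.11) is given below. The function and argument names are hypothetical; "from" refers to lane j′ and "to" refers to lane j, and the densities and desired ratios are those of the downstream section i+1.

```python
def desired_lateral_demand(rho_from, rho_to, gamma_from, gamma_to, delta_lc, v_next):
    """D* of (2.10): signed lateral demand from lane j' (from) to lane j (to)."""
    imbalance = (gamma_to * rho_from - gamma_from * rho_to) / (gamma_to + gamma_from)
    return imbalance * delta_lc * v_next


def actual_lateral_demand(d_star, d_long_from, d_long_to):
    """D^lat of (2.11): clip D* by the longitudinal demands of the two lanes."""
    return max(-d_long_to, min(d_long_from, d_star))
```

With equal desired ratios, the lateral demand vanishes when both lanes have the same density, which matches the lane usage balancing principle; unequal ratios shift this balance point to reflect channelization.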
2.5.2 Queue Discharging
At a signalized intersection, when the signal turns from red to green, the accumulated queue
discharges and the flow rate gradually increases from zero to the saturation value, which leads
to an initial loss of the road capacity due to drivers’ reaction time and vehicles’ acceleration
process [92]. Figure 2.6 (a) presents what an actual flow vs. time diagram may look like
during the queue discharging. However, it is difficult to express the flow in a closed form
using the actual relationship. Therefore, two approximated models are proposed. The first
one introduces a step profile for the discharging process [92], as shown in Figure 2.6 (b). The
second one assumes the flow increases linearly during the discharging process, as shown in
Figure 2.6 (c).
Figure 2.6: Flow during Queue Discharging Process: (a) actual profile, (b) step model, (c) linear model
In the step model, the flow is considered as a constant value proportional to the saturation
flow $q_s$ during the initial loss time $T_a$, and jumps to the saturation value afterward, that is
$$q(t) = \begin{cases} \delta q_s & \text{if } t \le T_a \\ q_s & \text{otherwise} \end{cases} \tag{2.12}$$
where δ can be determined by counting the number of vehicles passing the stop line during
the initial loss time in simulations or field tests.
In the linear model, the flow dynamics are described as follows
$$q(t) = \begin{cases} t\,q_s/T_a & \text{if } t \le T_a \\ q_s & \text{otherwise} \end{cases} \tag{2.13}$$
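The two approximations (2.12) and (2.13) can be compared numerically with the sketch below. The units (time in seconds, flow in veh/h), the function names and the midpoint-rule integration are assumptions made for illustration.

```python
def step_discharge_flow(t, qs, Ta, delta):
    """Step model (2.12): reduced flow delta*qs during the initial loss time Ta."""
    return delta * qs if t <= Ta else qs


def linear_discharge_flow(t, qs, Ta):
    """Linear model (2.13): flow ramps from 0 to qs over the initial loss time Ta."""
    return t * qs / Ta if t <= Ta else qs


def vehicles_discharged(flow, green, dt=0.1, **params):
    """Midpoint-rule integral of a discharge profile over a green time (seconds).

    flow returns veh/h, so the sum is divided by 3600 to obtain vehicles.
    """
    steps = int(round(green / dt))
    return sum(flow((k + 0.5) * dt, **params) for k in range(steps)) * dt / 3600.0
```

For instance, with qs = 1800 veh/h, Ta = 4 s and a 30 s green, the linear model discharges 14 vehicles, and the step model gives the same count when δ = 0.5, since both profiles then enclose the same area during the initial loss time.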
Chapter 3: Evaluation of Integrated Variable Speed Limit and Lane Change Control for Freeway Traffic Flow
3.1 Introduction
A sudden lane drop on a freeway with a high volume of vehicles can cause severe traffic congestion and safety issues. Forced lane changes near the bottleneck reduce the traffic mobility on
all lanes and trigger the capacity drop phenomenon. Numerous studies have explored the effect of Variable Speed Limit (VSL) and Lane Change (LC) control on relieving bottleneck
congestion, but only a few have taken the measurement or model uncertainties into consideration. In this chapter, we propose an integrated VSL and LC controller based on a modified
multi-section Cell Transmission Model (CTM) to alleviate the freeway bottleneck congestion
and reject uncertainties. The proposed controller is evaluated without uncertainty first to
demonstrate its effectiveness in improving traffic flow. Then we incorporate the uncertainties in measurements and model parameters to examine the robustness of the control. The
evaluation process is performed using microscopic simulations with the commercial software
PTV VISSIM 10. The results indicate that the bounds of allowable uncertainties are more
sensitive when uncertainties lead to a reduction of VSL commands than an increase because
of the extra shockwaves produced in the former situation.
3.2 Multi-Section Cell Transmission Model
The Cell Transmission Model (CTM) is a discrete approximation of the Lighthill-Whitham-Richards (LWR) kinematic wave model of traffic flow [34, 35, 36, 88]. It has been widely used
for traffic flow modeling and control design thanks to its simplicity and its ability to
describe traffic dynamics accurately at a macroscopic level [29, 91]. In the CTM framework, a freeway
segment is partitioned into N small homogeneous sections/cells and consecutively numbered
from 1 to N in the traffic flow direction, as depicted in Figure 3.1. A lane drop is introduced
at the downstream exit of the freeway segment and reduces the road capacity from C to Cd.
The vehicle density of each CTM section is updated using a first-order ordinary differential
equation based on the traffic flow conservation, where the inflow and outflow are determined
by the supply (or receiving) and demand (or sending) functions, which define a flow-density
relationship known as the fundamental diagram [97].
Figure 3.1: Multi-Section CTM with Lane Drop
Though the original form of the CTM can reproduce traffic behaviors under both uncongested and congested conditions, it does not capture more complex traffic flow phenomena,
such as the capacity drop and bounded acceleration effects due to forced lane-changing
maneuvers at congested freeway bottlenecks [4, 98, 99], nor does it capture the potential uncertainties in measurements and model parameters. Therefore, the original form of the CTM,
proposed by Daganzo in 1994, has been modified over the years in order to be consistent
with the microscopic traffic flow observations [4, 23, 91, 100, 101].
In this chapter, the most updated multi-section CTM, which takes into account the effect
of both capacity drop and bounded acceleration, is considered [29]. Moreover, the measurement and model uncertainties are represented as an additional term µ in the conservation
law of traffic flow [38]. Without loss of generality, it is assumed that the geometry of all the
sections is identical. Accordingly, the evolution of the vehicle density $\rho_i$ in each section is
described by the following equations:
$$\dot{\rho}_i = \frac{1}{L}\,(q_i - q_{i+1} + \mu), \quad \text{for } i = 1, \ldots, N, \tag{3.1}$$
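where
$$\begin{aligned}
q_1 &= \min\{d,\ C,\ w(\rho^j - \rho_1)\}, \\
q_i &= \min\{v_f\rho_{i-1},\ \tilde{w}(\tilde{\rho}^j - \rho_{i-1}),\ C,\ w(\rho^j - \rho_i)\}, \quad \text{for } i = 2, \ldots, N, \\
q_{N+1} &= \min\{v_f\rho_N,\ \tilde{w}(\tilde{\rho}^j - \rho_N),\ (1 - \epsilon(\rho_N))C_d\},
\end{aligned} \tag{3.2}$$
and
$$\epsilon(\rho_N) = \begin{cases} \epsilon_0 & \text{if } C_d < C \text{ and } \rho_N > \dfrac{C_d}{v_f} \\ 0 & \text{otherwise,} \end{cases}$$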
where the parameters in (3.1) and (3.2) are defined in Table 3.1.
Table 3.1: Definition of Variables and Model Parameters
| Symbol | Definition | Unit |
|---|---|---|
| $d$ | the upstream demand | veh/h |
| $C$ | the capacity of each freeway section | veh/h |
| $C_d$ | the downstream capacity | veh/h |
| $q_i$ | the inflow of freeway section $i$ | veh/h |
| $q_{i+1}$ | the outflow of freeway section $i$ | veh/h |
| $v_f$ | the free flow speed | km/h |
| $w$ | the backpropagation speed | km/h |
| $\tilde{w}$ | the backpropagation speed associated with the outflow $q_{i+1}$ | km/h |
| $\rho_c$ | the critical density | veh/km |
| $\rho^j$ | the jam density | veh/km |
| $\tilde{\rho}^j$ | the jam density associated with the outflow $q_{i+1}$ | veh/km |
| $\rho_i$ | the density of freeway section $i$ | veh/km |
| $L$ | the length of each freeway section | km |
| $\epsilon_0$ | the capacity drop factor, where $\epsilon_0 \in (0, 1)$ | unitless |
3.3 Control Design
3.3.1 Robust Variable Speed Limit Control
This section proposes a feedback-based Variable Speed Limit (VSL) control design based on
the multi-section CTM presented in section 3.2. As depicted in Figure 3.2, the first VSL
command v0 takes effect at the most upstream location to regulate the incoming traffic demand before
it reaches the first CTM section. The remaining VSL commands are deployed at the entrance
of each CTM section.
Figure 3.2: Multi-Section CTM with VSL Control
The control objective is to reject uncertainties and make the densities of all sections
converge to a desired value, denoted as $\rho^*$. In perfect traffic conditions (without any uncertainty), a trivial choice is to make $\rho^* = C_d/v_f$, which corresponds to the highest discharging
flow rate in this situation. However, a small disturbance may move the density towards
the capacity-drop region, which introduces unwanted oscillatory behavior of the closed-loop
system and negatively impacts the convergence rate [38]. On one hand, the value of $\rho^*$
needs to be compromised for the sake of robustness, i.e., $\rho^* < C_d/v_f$. On the other hand, it
should be chosen so as not to lose excessive potential road capacity.
Figure 3.3: Fundamental Diagram with VSL Control
When the VSL command of section $i$ is activated, the maximum possible flow governed
by $v_i$ is $\frac{v_i w \rho^j}{v_i + w}$ according to the geometry of the fundamental diagram shown in Figure 3.3.
Therefore, the upstream VSL command $v_{i-1}$ alters the demand part of $q_i$ and the current-section VSL command $v_i$ alters the supply part of $q_i$. More specifically, the dynamics of
traffic flows when the VSL controller is activated are described as follows:
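$$\begin{aligned}
q_1 &= \min\left\{d,\ \frac{v_0 w \rho^j}{v_0 + w},\ \frac{v_1 w \rho^j}{v_1 + w},\ w(\rho^j - \rho_1)\right\}, \\
q_i &= \min\left\{v_{i-1}\rho_{i-1},\ \frac{v_{i-1} w \rho^j}{v_{i-1} + w},\ \frac{v_i w \rho^j}{v_i + w},\ w(\rho^j - \rho_i)\right\}, \quad \text{for } i = 2, \ldots, N, \\
q_{N+1} &= \min\{v_N\rho_N,\ \tilde{w}(\tilde{\rho}^j - \rho_N),\ (1 - \epsilon(\rho_N))C_d\}
\end{aligned} \tag{3.3}$$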
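To make the role of each speed limit in (3.3) concrete, the flows can be evaluated as in the sketch below. The function names and the list-based indexing are assumptions of this example, and the capacity-drop value ϵ(ρN) is passed in directly rather than computed.

```python
def vsl_capacity(v, w, rho_jam):
    """Maximum flow v*w*rho_jam/(v+w) that can be sustained under speed limit v."""
    return v * w * rho_jam / (v + w)


def vsl_flows(rho, v, d, w, w_t, rho_jam, rho_jam_t, Cd, eps):
    """Flows [q_1, ..., q_{N+1}] of (3.3).

    rho[0] is the density of section 1 and v = [v_0, ..., v_N];
    eps is the current capacity-drop value epsilon(rho_N).
    """
    n = len(rho)
    q = [min(d, vsl_capacity(v[0], w, rho_jam), vsl_capacity(v[1], w, rho_jam),
             w * (rho_jam - rho[0]))]
    for i in range(2, n + 1):
        q.append(min(v[i - 1] * rho[i - 2],                 # v_{i-1} * rho_{i-1}
                     vsl_capacity(v[i - 1], w, rho_jam),    # demand side
                     vsl_capacity(v[i], w, rho_jam),        # supply side
                     w * (rho_jam - rho[i - 1])))
    q.append(min(v[n] * rho[n - 1], w_t * (rho_jam_t - rho[n - 1]), (1.0 - eps) * Cd))
    return q
```

As a consistency check, setting v = vf = 100 km/h with w = 30 km/h and a jam density of 312 veh/km gives a VSL-governed capacity of 7200 veh/h, i.e. the uncontrolled capacity C is recovered; lowering the limit to 60 km/h reduces it to 6240 veh/h.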
The speed limit commands for each section can be computed as follows:
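$$\begin{aligned}
v_0 &= \frac{w\,q_{1v}}{w\rho^j - q_{1v}}, \\
v_{i-1} &= \frac{q_{iv}}{\rho_{i-1}}, \quad \text{for } i = 2, \ldots, N, \\
v_N &= v_f
\end{aligned} \tag{3.4}$$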
where qiv is the desired inflow of section i. Assume that the disturbance term µ in equation
(3.1) is bounded by a constant µm and satisfies |µ| ≤ µm ≪ Cd. In order to reject µ and
guarantee the convergence of the closed-loop system, the following proportional-integral (PI)
controller equation is adopted in the design of $q_{iv}$ [23, 38]:
$$q_{iv} = q_{i+1} - \lambda_1(\rho_i - \rho^*) - \lambda_2\left(\int_{t_0}^{t}(\rho_i - \rho^*)\,d\tau - \frac{\lambda_1(\rho_i(t_0) - \rho^*) - \mu_m}{\lambda_2}\right) \tag{3.5}$$
where qi+1 and ρi are measurements subject to uncertainties, λ1 > 0 and λ2 > 0 are the
proportional and integral gains, respectively. These parameters are initialized with the
empirical values from [23] and tuned based on simulation results. t0 denotes the time when
the controller is activated.
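Given the desired inflows from the PI law, the mapping to speed commands in (3.4) can be sketched as follows. The function name is hypothetical, and the sketch assumes positive densities and desired inflows below w·ρ^j so that the divisions are well defined; the practical constraints on the commands are not applied here.

```python
def vsl_commands(q_des, rho, w, rho_jam, vf):
    """Speed limits [v_0, ..., v_N] of (3.4).

    q_des = [q_1v, ..., q_Nv] are the desired inflows (veh/h);
    rho = [rho_1, ..., rho_N] are the measured densities (veh/km).
    """
    n = len(rho)
    v = [w * q_des[0] / (w * rho_jam - q_des[0])]        # v_0
    v += [q_des[i] / rho[i - 1] for i in range(1, n)]    # v_1 .. v_{N-1}
    v.append(vf)                                         # v_N
    return v
```

The expression for v0 is the inverse of the VSL-governed capacity v·w·ρ^j/(v + w): a desired inflow of 6240 veh/h with w = 30 km/h and a jam density of 312 veh/km yields v0 = 60 km/h.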
To ensure safety and feasibility in the real world, we also incorporate the following constraints
on the speed limit computations:
• vi is rounded to a multiple of 10 km/h.
• vi is bounded: 20 km/h ≤ vi ≤ 100 km/h.
• vi can be increased or decreased by at most 10 km/h in each control cycle.
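The three constraints above can be applied to a raw command as in the following sketch. The order of application (rounding, then bounding, then rate limiting) and the function name are assumptions, and the previous command `v_prev` is assumed to be a feasible multiple of 10 km/h already.

```python
def constrain_vsl(v_raw, v_prev, step=10.0, v_min=20.0, v_max=100.0):
    """Apply the three practical constraints to one raw VSL command (km/h)."""
    # Round to a multiple of 10 km/h (Python's round() sends exact ties to even).
    v = step * round(v_raw / step)
    # Enforce the bounds 20 km/h <= v <= 100 km/h.
    v = min(max(v, v_min), v_max)
    # Change by at most 10 km/h relative to the previous control cycle.
    v = min(max(v, v_prev - step), v_prev + step)
    return v
```

For example, a raw command of 63 km/h following a previous command of 80 km/h is rounded to 60 km/h and then rate-limited to 70 km/h.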
3.3.2 Lane Change Control
In order to improve the bottleneck throughput and relieve congestion, a Lane Change (LC)
control is implemented in the discharging section N [4] as shown in Figure 3.4. The mechanism of LC control involves two ingredients. The first one is to give appropriate lane-changing
recommendations to vehicles moving in the closed lane before approaching the bottleneck.
The second ingredient is determining at what distance from the bottleneck, referred to as
dLC, these recommendations are provided. dLC needs to be long enough so that the vehicles
can complete the lane change maneuvers safely, but an overextended dLC may lead to the
underutilization of the road. In [4], an empirical formula is proposed to determine the value
of dLC as follows:
dLC = ξ · nc (3.6)
where nc is the number of lanes closed at the bottleneck, ξ is a design parameter that
depends on the traffic demand d. The relationship between ξ and d is obtained from extensive
simulations and depicted in Figure 3.5.
Figure 3.4: Lane Change Control
Figure 3.5: Relationship between ξ and Traffic Demands
3.4 Numerical Simulations
3.4.1 Network Configuration
The commercial software PTV VISSIM 10 is used for microscopic simulations. The road
network in Figure 3.6 is built along a 14.4-km (9 mi) segment of the I-710 freeway from
I-105 to the Long Beach Port in California, United States. To ensure the incoming demand
is sufficiently regulated by v0, the length of the upstream VSL zone is set to be 4.8 km (3
mi), much longer than the length of the remaining sections, which is 1.6 km (1 mi). The
detailed effect of these lengths will be discussed in future chapters. The network has a fixed
lane number of 3, and no on-ramps or off-ramps are considered. An incident triggers middle
lane closure and creates a bottleneck at the exit of the freeway segment.
Figure 3.6: I-710 Simulation Road Network
3.4.2 Parameter Selections
We first run multiple simulations with the demand d gradually increasing in the open-loop
system (without any control). Based on the collected measurements of the flow and density
of the most downstream CTM section, we draw the fundamental diagram of the simulation
network as shown in Figure 3.7. The following parameters can be determined according
to the fundamental diagram: the road capacity C = 7200 veh/h, the bottleneck capacity
Cd = 4800 veh/h, the bottleneck capacity associated with the capacity drop (1−ϵ0)Cd = 4300
veh/h. Thus, the level of the capacity drop can be estimated as ϵ0 = 0.1. The free-flow speed
vf is set to be 100 km/h. The backpropagation speeds are selected with the empirical values
proposed in [38]: w = 30 km/h and w̃ = 15 km/h. Using the geometry in Figure 3.3, we
have ρ^j = C/vf + C/w = 312 veh/km and ρ̃^j = C/vf + C/w̃ = 552 veh/km.
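The arithmetic behind these parameters can be checked with a few lines. The sketch below only restates the values quoted in this section; the function name `jam_density` and the variable names are hypothetical.

```python
# Parameters quoted in the text (veh/h and km/h).
C = 7200.0        # road capacity
C_D = 4800.0      # bottleneck capacity
C_DROP = 4300.0   # bottleneck capacity under the capacity drop
V_F = 100.0       # free-flow speed
W = 30.0          # backpropagation speed w
W_TILDE = 15.0    # backpropagation speed w-tilde


def jam_density(capacity, v_free, wave_speed):
    """Jam density of a triangular fundamental diagram (veh/km)."""
    return capacity / v_free + capacity / wave_speed


rho_j = jam_density(C, V_F, W)
rho_j_tilde = jam_density(C, V_F, W_TILDE)
# Capacity drop level implied by the two measured capacities (about 0.1).
epsilon0_estimate = 1.0 - C_DROP / C_D

print(rho_j, rho_j_tilde, round(epsilon0_estimate, 3))
```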
Each simulation run lasts for 90 minutes. The incident happens at the 10th minute and is
removed at the 80th minute. We take the measurements and update the control commands
every 30 seconds during the simulation, which is consistent with the data collection rate of
the Highway Performance Measurement System (PeMS) [102].
Figure 3.7: Fundamental Diagram from Open-Loop Simulations
3.4.3 Performance Measurements and Evaluation
In this section, we evaluate the performance of the proposed controller in a one-lane-drop
scenario under different levels of demand. The following metrics are adopted for
the evaluation ([4]):
• Average Travel Time (ATT): the average time each vehicle spends traveling through
the whole network.
$$ATT = \frac{1}{N_v} \sum_{i=1}^{N_v} (t_{i,out} - t_{i,in}) \tag{3.7}$$
where Nv is the number of vehicles passing through the network, and ti,in and ti,out are the
times at which vehicle i enters and exits the network, respectively.
• Average number of stops: the average number of stops performed by each vehicle when
traveling in the network.
$$\bar{s} = \frac{1}{N_v} \sum_{i=1}^{N_v} s_i \tag{3.8}$$
where si is the number of stops performed by vehicle i.
• Average emission rates of CO2: the calculation of emission rates is based on the
MOVES model provided by the Environmental Protection Agency (EPA) [103].
$$\bar{R} = \sum_{i=1}^{N_v} E_i \Big/ \sum_{i=1}^{N_v} d_i \tag{3.9}$$
where Ei is the emission produced by vehicle i and di is the distance travelled by vehicle i.
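The three metrics in (3.7)-(3.9) can be computed from per-vehicle records as sketched below. The record layout (`VehicleRecord` and its field names) is a hypothetical structure chosen for illustration; it is not the output format of VISSIM or MOVES.

```python
from dataclasses import dataclass


@dataclass
class VehicleRecord:
    t_in: float      # time the vehicle enters the network (s)
    t_out: float     # time the vehicle exits the network (s)
    stops: int       # number of stops performed
    emission: float  # CO2 produced (g)
    distance: float  # distance travelled (km)


def average_travel_time(records):
    """Equation (3.7): mean of (t_out - t_in) over all vehicles."""
    return sum(r.t_out - r.t_in for r in records) / len(records)


def average_stops(records):
    """Equation (3.8): mean number of stops per vehicle."""
    return sum(r.stops for r in records) / len(records)


def average_emission_rate(records):
    """Equation (3.9): total emission divided by total distance (g/veh/km)."""
    return sum(r.emission for r in records) / sum(r.distance for r in records)


sample = [
    VehicleRecord(t_in=0.0, t_out=600.0, stops=2, emission=4000.0, distance=12.0),
    VehicleRecord(t_in=60.0, t_out=780.0, stops=0, emission=3200.0, distance=12.0),
]
print(average_travel_time(sample), average_stops(sample), average_emission_rate(sample))
```

Note that (3.9) is a ratio of sums rather than a mean of per-vehicle ratios, so vehicles that travel farther carry more weight.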
Table 3.2: Evaluation Results for d = 7000 veh/h
| Evaluations | No Control | VSL Only | VSL+LC (improvement vs. No Control) |
|---|---|---|---|
| ATT (min) | 22.2 | 18.7 | 18.6 (16.2%) |
| s̄ | 43.5 | 2.0 | 1.3 (97%) |
| CO2 (g/veh/km) | 399.4 | 319.3 | 313.8 (21.4%) |

Table 3.3: Evaluation Results for d = 6000 veh/h
| Evaluations | No Control | VSL Only | VSL+LC (improvement vs. No Control) |
|---|---|---|---|
| ATT (min) | 19.0 | 18.6 | 18.4 (3.2%) |
| s̄ | 31.7 | 2.2 | 1.2 (96.2%) |
| CO2 (g/veh/km) | 366.0 | 317.2 | 311.6 (14.9%) |
Each entry in Tables 3.2 and 3.3 is the average of 5 microscopic simulation runs, which
reduces the randomness and increases the reliability. The proposed controller
(VSL+LC) significantly improves the performance in terms of all evaluation metrics under
the demand of 7000 veh/h. When the demand drops to 6000 veh/h, the improvements
in s̄ and the emission rates of CO2 are still noticeable. The evaluation results demonstrate
the effectiveness of the proposed controller in resolving the congestion caused by a lane drop.
In addition, the higher the demand of the incoming traffic, the larger the improvement.
The VSL controller alone cannot keep up with the integrated controller in terms of the
average number of stops, because the LC controller prevents queue formation ahead of
the bottleneck. The convergence is achieved on average 20 min after the occurrence of the
incident.
Figure 3.8 is a fundamental diagram drawn from the closed-loop simulation data under
multiple demands with the incident. The proposed controller is only activated when d > Cd
and deactivated otherwise. We can observe a clear convergence for all high-demand scenarios
where the controller is applied. The open-loop convergence can also be achieved when
d < Cd. The data points for d = 4000 veh/h are more scattered because the actual vehicle
input generated in VISSIM is randomly distributed around 4000 veh/h. A high input may drive
the system to the capacity drop region for a short time.
Figure 3.8: Fundamental Diagram with Integrated Controller
3.4.4 Uncertainties and Robustness Analysis
Uncertainties may exist in measurements and model parameters. The corrupted density and
flow measurements are represented as follows:
$$\tilde{\rho}_i = \rho_i(1 + \sigma_\rho) \tag{3.10}$$
$$\tilde{q}_i = q_i(1 + \sigma_q) \tag{3.11}$$
where ρi, qi are the true measurements and σρ, σq are the levels of uncertainties. The fact
that many model parameters are determined based on simulation data or empirical values is
also likely to introduce uncertainties. Among those parameters, w is the most sensitive and
worth investigating because it is involved in the computation of control commands.
With the existence of uncertainties, our primary concern is whether the closed-loop system can still be stabilized around the desired equilibrium. In order to evaluate the robustness
of the proposed controller, we collect density and flow data for all sections and compute the
relative root mean square error (RRMSE) (denoted as e) with respect to the desired equilibrium as follows:
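$$e_\rho(i) = \frac{1}{\rho^*} \sqrt{\frac{1}{t_e - t_c} \int_{t_c}^{t_e} (\rho_i(t) - \rho^*)^2 \, dt} \tag{3.12}$$
$$e_q(i) = \frac{1}{v_f \rho^*} \sqrt{\frac{1}{t_e - t_c} \int_{t_c}^{t_e} (q_i(t) - v_f \rho^*)^2 \, dt} \tag{3.13}$$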
where tc is the time when convergence occurs and te is the time when the incident is cleared.
In general, we consider the convergence acceptable when eρ(i) ≤ 20% and eq(i) ≤ 20% for
i = 1, ..., N. Tables 3.4, 3.5 and 3.6 show the evaluation results when there exist uncertainties
in measured densities, measured flows and model parameter w, respectively.
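With measurements sampled at a fixed control interval, the integrals in (3.12)-(3.13) can be approximated by a mean over the samples collected between tc and te. The sketch below assumes uniformly spaced samples; the function names `rrmse` and `is_acceptable` are hypothetical.

```python
import math


def rrmse(samples, target):
    """Discrete approximation of (3.12)/(3.13) for uniformly spaced samples.

    `samples` holds the measurements taken between t_c and t_e, and `target`
    is the desired equilibrium (rho* for densities, v_f * rho* for flows).
    """
    mean_squared_error = sum((x - target) ** 2 for x in samples) / len(samples)
    return math.sqrt(mean_squared_error) / target


def is_acceptable(error, threshold=0.2):
    """Convergence criterion used in the text: RRMSE of at most 20%."""
    return error <= threshold


# Densities oscillating by 10% around an equilibrium of 48 veh/km.
e_rho = rrmse([43.2, 52.8, 43.2, 52.8], 48.0)
print(e_rho, is_acceptable(e_rho))
```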
Table 3.4: Uncertainties in Densities ρ̃i
| σρ | eρ(1) | eρ(3) | eρ(5) | eq(1) | eq(3) | eq(5) |
|---|---|---|---|---|---|---|
| 0 | 6.7% | 11.4% | 13.4% | 7.8% | 10.0% | 12.3% |
| -0.2 | 6.2% | 11.6% | 14.4% | 8.5% | 11.4% | 13.4% |
| -0.1 | 6.3% | 11.7% | 14.7% | 8.7% | 11.6% | 13.6% |
| 0.1 | 6.7% | 10.4% | 12.2% | 7.2% | 9.7% | 11.7% |
| 0.2 | 268.3% | 23.3% | 23.0% | 12.3% | 14.4% | 22.3% |
Table 3.5: Uncertainties in Flows q̃i
| σq | eρ(1) | eρ(3) | eρ(5) | eq(1) | eq(3) | eq(5) |
|---|---|---|---|---|---|---|
| 0 | 6.7% | 11.4% | 13.4% | 7.8% | 10.0% | 12.3% |
| -0.2 | 344.1% | 25.8% | 22.9% | 23.6% | 21.8% | 21.4% |
| -0.1 | 6.8% | 10.9% | 14.1% | 9.1% | 11.5% | 12.7% |
| 0.1 | 6.0% | 11.3% | 14.2% | 9.0% | 10.5% | 12.9% |
| 0.2 | 6.8% | 11.6% | 13.9% | 8.2% | 10.6% | 12.6% |

Table 3.6: Uncertainties in Model Parameter w
| w (km/h) | eρ(1) | eρ(3) | eρ(5) | eq(1) | eq(3) | eq(5) |
|---|---|---|---|---|---|---|
| 30 | 6.7% | 11.4% | 13.4% | 7.8% | 10.0% | 12.3% |
| 24 | 6.3% | 10.9% | 12.7% | 8.6% | 9.7% | 11.3% |
| 27 | 6.6% | 12.3% | 15.6% | 8.3% | 11.5% | 14.1% |
| 33 | 6.6% | 11.2% | 13.1% | 8.2% | 9.9% | 11.8% |
| 36 | 6.9% | 12.0% | 14.7% | 8.5% | 11.3% | 13.7% |

In Tables 3.4 and 3.5, the convergence deteriorates as we increase σρ from 0.1 to 0.2 or
decrease σq from -0.1 to -0.2. In both cases, the actual speed limit commands are lower than
the ones computed from true measurements. This implies that slowing down the traffic excessively produces worse performance than speeding up the traffic for the proposed controller.
The intuitive explanation is that we create more shockwaves as we slow down the traffic
unnecessarily, which prevents the convergence. The evaluation results indicate that the proposed
controller is able to tolerate 10% uncertainty in measurements and 20% variation in
the value of w, so the overall robustness of the closed-loop system is satisfactory.
Chapter 4: Selection of the Speed Command
Distance for Improved Performance of a Rule-Based VSL and Lane Change Control
4.1 Introduction
Variable Speed Limit (VSL) control has been one of the most popular techniques with the
potential of smoothing traffic flow, maximizing throughput at bottlenecks, and improving
mobility and safety. Despite the substantial research efforts in the application of VSL control,
few studies have looked into the effect of the VSL sign distance from the point of an accident
or a bottleneck. In this chapter, we show that this distance has a significant impact on
the effectiveness and performance of VSL control. We propose a rule-based VSL strategy
that matches the outflow of the upstream VSL zone with the bottleneck capacity based on
a multi-section Cell Transmission Model (CTM). The control strategy minimizes the speed
variations between all the downstream sections and produces fewer shockwaves compared with
the feedback-based VSL presented in the previous chapter.
Then, we consider the distance of the upstream VSL zone as a control variable and perform a comprehensive analysis of its impact on the performance of the closed-loop traffic
control system based on the multi-section CTM. We develop a lower bound that this distance needs to satisfy in order to guarantee homogeneous traffic density across sections and
reduce bottleneck congestion. The bound is verified analytically and demonstrated using
microscopic simulation of traffic on I-710 in Southern California. The simulations are used
to quantify the benefits on mobility, safety and emissions obtained by selecting the upstream
VSL zone distance to satisfy the analytical lower bound. The developed lower bound is a
design tool which can be used to tune and improve the performance of VSL controllers.
4.2 Control Design
We continue to use the multi-section Cell Transmission Model (CTM) presented in section
3.2 as a basis of the control design but remove the disturbance term µ in the density evolution equation since uncertainties are beyond the scope of this chapter. The goal is to develop
a traffic flow controller so that the traffic conditions of all the road sections operate within
the free-flow region of the fundamental diagram, despite the activation of the downstream
bottleneck. Since the mainstream traffic flow is to be regulated, variable speed limit (VSL)
control is a reasonable traffic flow control strategy. The underlying idea is to regulate the
inflow, q1, to a level that is within the capacity constraints of the downstream section at the
bottleneck. Furthermore, we minimize the speed variations between all the CTM sections to
diminish the stop-and-go traffic behavior and, thus, achieve smooth traffic flow conditions.
Driven by this idea, a rule-based VSL control is proposed in this section, considering the
distance of the most upstream VSL zone, denoted by L0, as a control variable. In addition, a lane change (LC) controller is combined with the VSL to manage the lane-changing
maneuvers in the vicinity of the bottleneck in order to prevent the VSL performance from
deteriorating [4].
4.2.1 Rule-Based Variable Speed Limit Control
We propose a rule-based VSL controller to alleviate freeway bottleneck congestion, based on
the multi-section CTM presented in section 3.2. The VSL control signs are installed
upstream of the first section as well as in all CTM sections, as shown in Figure 3.2. Each
VSL command takes effect at the beginning of the section. The downstream bottleneck is
created by a lane closure due to an incident. The control objective is to match the inflow of
the first section, q1, with the bottleneck capacity Cb during the incident.
The flow-density relationship when the VSL control is activated still follows the fundamental diagram presented in Figure 3.3. The traffic flow dynamics are described in equation
(3.3).
As discussed in section 3.2, when computing the bottleneck capacity, we take the capacity
drop phenomenon into consideration:
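$$C_b = (1 - \epsilon(\rho_N)) C_d, \qquad \epsilon(\rho_N) = \begin{cases} \epsilon_0 & \text{if } C_d < C \text{ and } \rho_N > \frac{C_d}{v_f}, \\ 0 & \text{otherwise} \end{cases} \tag{4.1}$$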
The key idea is to force the inflow q1 to be less than or equal to the bottleneck capacity
Cb under various traffic conditions by adjusting the most upstream speed limit v0. To be
more specific, we consider the following three scenarios based on different levels of traffic
demands d:
• scenario 1: d < (1 − ϵ0)Cd
• scenario 2: (1 − ϵ0)Cd ≤ d ≤ Cd
• scenario 3: d > Cd
In scenario 1, the demand d is less than the bottleneck capacity and no control effort is
needed. Therefore, we simply let v0 = vf . In scenario 2, d exceeds the dropped bottleneck
capacity (1 − ϵ0)Cd when congestion exists near the bottleneck. However, d does not
exceed the recovered bottleneck capacity Cd when the congestion is cleared. The VSL control
needs to be activated in order to regulate the inflow as follows:
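$$q_1 = \begin{cases} d & \text{if } 0 \le \rho_N \le \frac{C_d}{v_f}, \\ \frac{v_0 w \rho^j}{v_0 + w} = (1 - \epsilon_0) C_d & \text{otherwise} \end{cases} \tag{4.2}$$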
In scenario 3, since d > Cd, the VSL control needs to be activated all the time to regulate
the inflow q1 so that:
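$$q_1 = \frac{v_0 w \rho^j}{v_0 + w} = \begin{cases} C_d & \text{if } 0 \le \rho_N \le \frac{C_d}{v_f}, \\ (1 - \epsilon_0) C_d & \text{otherwise} \end{cases} \tag{4.3}$$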
Combining all three scenarios together, the VSL control command v0 can be computed as:
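$$v_0 = \begin{cases} \frac{w C_d}{w \rho^j - C_d} & \text{if } d > C_d \text{ and } 0 \le \rho_N \le \frac{C_d}{v_f}, \\ \frac{w (1 - \epsilon_0) C_d}{w \rho^j - (1 - \epsilon_0) C_d} & \text{if } d \ge (1 - \epsilon_0) C_d \text{ and } \rho_N > \frac{C_d}{v_f}, \\ v_f & \text{otherwise} \end{cases} \tag{4.4}$$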
Note that the condition ρN > Cd/vf represents the existence of bottleneck congestion, and its
counterpart 0 ≤ ρN ≤ Cd/vf means that the congestion has been removed. The bottleneck
capacity recovers from (1 − ϵ0)Cd to Cd during the transition from the former condition to
the latter. In both scenarios 2 and 3, we switch v0 from a lower value to a higher value
once the transition is completed as indicated by (4.4) in order to maximize the bottleneck
throughput.
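The switching rule (4.4) can be sketched as a single function of the demand and the measured density of the last section. The function and argument names are hypothetical; the default parameter values are the ones identified in section 3.4.2 (Cd = 4800 veh/h, ϵ0 = 0.1, w = 30 km/h, ρ^j = 312 veh/km, vf = 100 km/h).

```python
def upstream_speed_limit(d, rho_n, c_d=4800.0, eps0=0.1, w=30.0,
                         rho_jam=312.0, v_f=100.0):
    """Upstream VSL command v0 (km/h) following the three cases of (4.4)."""
    critical_density = c_d / v_f
    if d > c_d and 0.0 <= rho_n <= critical_density:
        # Congestion cleared: match the recovered bottleneck capacity C_d.
        return w * c_d / (w * rho_jam - c_d)
    if d >= (1.0 - eps0) * c_d and rho_n > critical_density:
        # Congestion present: match the dropped capacity (1 - eps0) * C_d.
        dropped = (1.0 - eps0) * c_d
        return w * dropped / (w * rho_jam - dropped)
    # No control effort needed.
    return v_f


print(round(upstream_speed_limit(7000.0, 60.0), 1))  # congested bottleneck
print(round(upstream_speed_limit(7000.0, 40.0), 1))  # congestion cleared
print(upstream_speed_limit(4000.0, 40.0))            # low demand
```

Both non-trivial branches come from solving v0 w ρ^j / (v0 + w) = Q for v0, which gives v0 = wQ / (w ρ^j − Q) with Q equal to Cd or (1 − ϵ0)Cd.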
By matching q1 upstream with Cb, the remaining downstream sections are set to maintain
a steady traffic flow, achieved by simply letting
$$v_i = v_f \quad \text{for } i = 1, ..., N \tag{4.5}$$
The simulation results presented in section 3.4 indicate that variations in the given speed
limit commands, especially from high to low speeds, create extra shockwaves and, thus,
deteriorate traffic mobility. This should be avoided as much as possible. In section 4.3, we
will show that during high-density traffic conditions, the proposed VSL controller performs
better than a classical feedback-based VSL controller, where the speed limit commands vary
based on measured densities and flows.
4.2.2 Analysis of VSL Zone Distance
In this section, we consider the length of the upstream VSL zone L0 where the speed limit
given by (4.4) will be applied as a control variable and analyze the impact of L0 on the
performance of the closed-loop system. Figure 4.1 depicts the road network and demonstrates
the traffic dynamics after the activation of the proposed rule-based VSL control.
Figure 4.1: Traffic Dynamics after the Activation of the Rule-Based VSL
According to the discussion in section 4.2.1, the VSL controller is activated immediately
once the incident takes place, which creates a speed difference between the upstream VSL
zone (v0) and all the downstream sections (v1, ..., vN ) when d ≥ (1−ϵ0)Cd. In Figure 4.1, the
black vehicles entered the road network before the activation of the VSL. They travel without
any restrictions from the speed limit signs and slow down as they approach the congested
area in front of the incident location. The yellow vehicles entered the road network after the
activation of the VSL. They travel at a reduced speed v0 within the upstream VSL zone.
As a result, a low-density area is created between the two groups of vehicles, allowing the
shockwaves from the bottleneck to be absorbed. The existence and propagation of the low-density area can be observed from the flow curves shown in Figure 4.2. Note that the vehicle
input flow never drops down to 0, and the inflow drops in all downstream sections because
of the VSL control.
Figure 4.2: Traffic Flows with the Activated Rule-Based VSL
The low-density area needs to be long enough so that the congestion associated with the
black vehicles is cleared before the yellow vehicles catch up. In other words, the first yellow
vehicle should never reach the last black vehicle within the road network as it may create
further shockwaves. This can be formulated as a chasing problem in which the time it takes
to clear congestion at the bottleneck, denoted as Tb, is strictly less than the time spent for
the first yellow vehicle to reach the bottleneck, denoted as Ty, i.e., Tb < Ty.
Theorem 4.2.1. Consider the freeway bottleneck control problem with a constant demand
d ≥ (1 − ϵ0)Cd and VSL commands given by (4.4) and (4.5). The propagation of the traffic
congestion at the bottleneck can be completely absorbed by the low-density area created by the
VSL control if the upstream VSL zone distance L0 satisfies
$$L_0 > \frac{\left(v_f \sum_{i=1}^{N} \rho_i(t_0) - (1 - \epsilon_0) C_d N\right) v_0 L}{\left((1 - \epsilon_0) C_d - v_0 \rho_0(t_0)\right) v_f} \tag{4.6}$$
where ρi is the measured density of section i, for i = 0, 1, 2, ..., N; Cd is the downstream
capacity; vf is the free flow speed; v0 is the upstream VSL command; ϵ0 is the capacity drop
factor; N is the number of downstream sections; L is the length of each downstream section
and t0 ≥ 0 is the time the incident takes place.
Proof. Let’s start by estimating the number of vehicles Nb that already entered the network
(colored in black in Figure 4.1) using the measured densities at the time t0:
$$N_b = L_0 \rho_0(t_0) + L \sum_{i=1}^{N} \rho_i(t_0) \tag{4.7}$$
Since the traffic flow at the bottleneck is qN+1, we have
$$\int_{t_0}^{t_0 + T_b} q_{N+1}(\tau)\,d\tau = N_b \tag{4.8}$$
When congestion is active near the bottleneck, qN+1 is equal to (1−ϵ0)Cd due to the capacity
drop phenomenon. Thus,
$$\int_{t_0}^{t_0 + T_b} q_{N+1}(\tau)\,d\tau = (1 - \epsilon_0) C_d T_b \tag{4.9}$$
From (4.7),(4.8) and (4.9), we obtain the time to clear congestion at the bottleneck as
$$T_b = \frac{L_0 \rho_0(t_0) + L \sum_{i=1}^{N} \rho_i(t_0)}{(1 - \epsilon_0) C_d} \tag{4.10}$$
On the other hand, the yellow vehicles are able to follow the speed limit commands when
traveling through the network. The time the first yellow vehicle reaches the bottleneck
denoted by Ty can be computed as
$$T_y = \frac{L_0}{v_0} + \frac{N L}{v_f} \tag{4.11}$$
Therefore, Tb < Ty yields
$$\frac{L_0 \rho_0(t_0) + L \sum_{i=1}^{N} \rho_i(t_0)}{(1 - \epsilon_0) C_d} < \frac{L_0}{v_0} + \frac{N L}{v_f} \tag{4.12}$$
which is equivalent to (4.6) given the following condition:
$$v_0 < \frac{(1 - \epsilon_0) C_d}{\rho_0(t_0)} \tag{4.13}$$
Note that (4.13) is automatically satisfied as we compute v0 with (4.4). Therefore, (4.6)
guarantees that Tb < Ty, which prevents shockwaves and improves the performance of the
closed-loop system.
According to (4.6), the lower bound of L0 is positively correlated with the speed limit of
the upstream VSL zone, v0, and the initial densities of all sections, ρi(t0), for i = 0, ..., N.
Although there is no theoretical upper bound on L0, overextending it leads to undesirable
travel time and underutilization of the road. We will demonstrate the impact of L0 and
verify the effectiveness of the computed lower bound under different traffic scenarios via
microscopic simulations in section 4.3.
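The chasing condition Tb < Ty and the resulting bound (4.6) can be evaluated numerically as sketched below. The function names are hypothetical, and the example inputs are assumptions made for illustration only: N = 6 downstream sections of L = 1.6 km, and initial densities equal to the free-flow value d/vf in every section. They are not measured densities from the simulations.

```python
def clearing_time(l0, rho0, rho_sections, section_length, q_drop):
    """T_b from (4.10), in hours: vehicles already in the network / outflow."""
    return (l0 * rho0 + section_length * sum(rho_sections)) / q_drop


def arrival_time(l0, v0, n_sections, section_length, v_f):
    """T_y from (4.11), in hours: travel time of the first held vehicle."""
    return l0 / v0 + n_sections * section_length / v_f


def l0_lower_bound(rho0, rho_sections, section_length, q_drop, v0, v_f):
    """Right-hand side of (4.6), in km. Requires condition (4.13)."""
    if v0 * rho0 >= q_drop:
        raise ValueError("condition (4.13) is violated: v0 * rho0 >= (1 - eps0) * Cd")
    n = len(rho_sections)
    numerator = (v_f * sum(rho_sections) - q_drop * n) * v0 * section_length
    return numerator / ((q_drop - v0 * rho0) * v_f)


Q_DROP = 0.9 * 4800.0   # (1 - eps0) * Cd in veh/h
V_F, V0, L, N = 100.0, 20.0, 1.6, 6

for demand in (7000.0, 5500.0):
    rho_init = demand / V_F  # assumed free-flow initial density (veh/km)
    bound = l0_lower_bound(rho_init, [rho_init] * N, L, Q_DROP, V0, V_F)
    t_b = clearing_time(bound + 0.1, rho_init, [rho_init] * N, L, Q_DROP)
    t_y = arrival_time(bound + 0.1, V0, N, L, V_F)
    print(demand, round(bound, 2), t_b < t_y)
```

Under these assumed inputs, the inequality Tb < Ty flips exactly at the computed bound: a zone slightly longer than the bound satisfies it, and a slightly shorter one does not.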
4.2.3 Lane Change Control
According to (4.10), the time spent for clearing the bottleneck, i.e., Tb, can be reduced
by increasing the bottleneck throughput. Therefore, we implement the Lane Change (LC)
control in the discharging section in order to reduce the capacity drop, increase the bottleneck
throughput, and accelerate the convergence. The mechanism of LC control is the same as
the one introduced in section 3.3.2.
4.2.4 Stability Analysis
In this section, we perform a rigorous stability analysis of the closed-loop system with the
proposed integrated VSL and LC controller. There are three control variables: the upstream
VSL command v0 given by (4.4), the upstream VSL zone distance L0 whose lower bound is
given by (4.6), and the lane change distance dLC determined from Figure 3.5. Accordingly,
the closed-loop system (3.1)-(3.3) with (4.4) can be expressed as follows:
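$$\dot{\rho}_i = \frac{1}{L}(q_i - q_{i+1}), \quad \text{for } i = 1, ..., N, \tag{4.14}$$
where
$$\begin{aligned} q_1 &= \min\left\{ d, \frac{v_0 w \rho^j}{v_0 + w}, C, w(\rho^j - \rho_1) \right\}, \\ q_i &= \min\left\{ v_f \rho_{i-1}, C, w(\rho^j - \rho_i) \right\}, \quad \text{for } i = 2, ..., N, \\ q_{N+1} &= \min\left\{ v_f \rho_N, (1 - \epsilon(\rho_N)) C_d, \tilde{w}(\tilde{\rho}^j - \rho_N) \right\} \end{aligned} \tag{4.15}$$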
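The flows in (4.15) and one forward-Euler step of (4.14) can be sketched as follows. The names `CTMParams`, `ctm_flows` and `ctm_step` are hypothetical, the default parameters are those identified in section 3.4.2, and the forward-Euler discretization with a 30-second step is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class CTMParams:
    C: float = 7200.0            # road capacity (veh/h)
    C_d: float = 4800.0          # bottleneck capacity (veh/h)
    eps0: float = 0.1            # capacity drop level
    v_f: float = 100.0           # free-flow speed (km/h)
    w: float = 30.0              # backpropagation speed (km/h)
    w_tilde: float = 15.0        # backpropagation speed at the bottleneck (km/h)
    rho_j: float = 312.0         # jam density (veh/km)
    rho_j_tilde: float = 552.0   # jam density at the bottleneck (veh/km)


def ctm_flows(rho, d, v0, p):
    """Return [q_1, ..., q_{N+1}] from (4.15) for densities rho = [rho_1..rho_N]."""
    n = len(rho)
    q = [0.0] * (n + 1)
    q[0] = min(d, v0 * p.w * p.rho_j / (v0 + p.w), p.C, p.w * (p.rho_j - rho[0]))
    for i in range(1, n):
        q[i] = min(p.v_f * rho[i - 1], p.C, p.w * (p.rho_j - rho[i]))
    # Capacity drop (4.1): active only while the last section is congested.
    eps = p.eps0 if (p.C_d < p.C and rho[-1] > p.C_d / p.v_f) else 0.0
    q[n] = min(p.v_f * rho[-1], (1.0 - eps) * p.C_d,
               p.w_tilde * (p.rho_j_tilde - rho[-1]))
    return q


def ctm_step(rho, q, section_length, dt):
    """One forward-Euler step of (4.14); dt in hours, section_length in km."""
    return [r + dt / section_length * (q[i] - q[i + 1]) for i, r in enumerate(rho)]


params = CTMParams()
v0_recovered = params.w * params.C_d / (params.w * params.rho_j - params.C_d)
flows = ctm_flows([48.0] * 6, 7000.0, v0_recovered, params)
next_rho = ctm_step([48.0] * 6, flows, 1.6, 30.0 / 3600.0)
print(flows, next_rho)
```

At the uniform density Cd/vf = 48 veh/km with v0 from the first case of (4.4), every flow is approximately Cd, so the densities stay essentially unchanged by the update, as expected at the equilibrium.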
Based on the analysis in section 4.2.2, vehicles that enter the network before the activation
of VSL control (colored in black in Figure 4.1) travel downstream freely and create congestion
at the bottleneck. Although the traveling speed is difficult to compute when the traffic is
congested, the bottleneck throughput is stabilized around (1 − ϵ0)Cd, meaning that the
congested density is at steady state for the given demand. According to Theorem 4.2.1, the
congestion can be removed without affecting the new group of vehicles (held by v0, colored
in yellow in Figure 4.1) by choosing L0 appropriately. Once these (yellow) vehicles exit
the upstream VSL zone, they travel through all downstream sections at free-flow speed.
Note that the speed is maintained even when they approach the bottleneck because we have
matched the inflow q1 with the bottleneck capacity and applied LC recommendations to
avoid sudden lane changes. As a result, system (4.14)-(4.15) can be further simplified when
the congestion is removed (t > t0 + Tb) as follows:
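$$\begin{aligned} \dot{\rho}_1 &= \frac{1}{L}\left( \min\left\{ d, \frac{v_0 w \rho^j}{v_0 + w} \right\} - v_f \rho_1 \right), \\ \dot{\rho}_i &= \frac{v_f}{L}(\rho_{i-1} - \rho_i), \quad \text{for } i = 2, ..., N \end{aligned} \tag{4.16}$$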
Theorem 4.2.2. Consider the closed-loop system (4.14)-(4.15) with the traffic demand d ≥
(1 − ϵ0)Cd. The proposed VSL controller given by (4.4)-(4.6) guarantees that the density
converges exponentially fast to the equilibrium point ρ^e_i = min{d, Cd}/vf (i = 1, ..., N) for t > t0 + Tb,
which corresponds to the maximum possible throughput, where t0 is the time the controller is
activated and Tb is given by (4.10).
Proof. Let us define ρ = [ρ1, ρ2, ..., ρN]^T, and then compute the equilibrium point of system
(4.16) by setting ρ̇ = 0, which yields
$$\rho_1^e = \rho_2^e = ... = \rho_N^e = \frac{\min\{d, C_d\}}{v_f}, \quad t > t_0 + T_b \tag{4.17}$$
Note that v0 has already been switched to the higher value after the congestion is cleared
according to (4.4), which corresponds to the maximum bottleneck throughput min{d, Cd}
without the capacity drop. To prove the exponential convergence, we rewrite system (4.16)
in matrix form:
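$$\dot{\rho} = \frac{1}{L}(A_1 \rho + b_1), \quad t > t_0 + T_b \tag{4.18}$$
where
$$A_1 = \begin{bmatrix} -v_f & 0 & \cdots & 0 & 0 \\ v_f & -v_f & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & v_f & -v_f & 0 \\ 0 & \cdots & 0 & v_f & -v_f \end{bmatrix}, \qquad b_1 = \begin{bmatrix} \min\{d, C_d\} \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}$$
A1 is a lower triangular matrix, so its eigenvalues are its diagonal entries, all equal to −vf < 0. Therefore, system (4.18) is exponentially stable when t > t0 + Tb.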
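The convergence claimed for (4.16) can be illustrated numerically. The sketch below integrates the simplified system with forward Euler, which is an assumption of this illustration and not part of the proof; the function name, the initial densities, the step size of 30 seconds and the choice N = 6 are hypothetical example values.

```python
def simulate_densities(rho_init, inflow, v_f, section_length, dt, steps):
    """Forward-Euler integration of the simplified system (4.16).

    `inflow` plays the role of min{d, v0*w*rho_j/(v0 + w)}; dt is in hours.
    """
    rho = list(rho_init)
    for _ in range(steps):
        prev = rho[:]
        # First section: driven by the regulated inflow.
        rho[0] = prev[0] + dt / section_length * (inflow - v_f * prev[0])
        # Remaining sections: a chain of first-order lags.
        for i in range(1, len(rho)):
            rho[i] = prev[i] + dt * v_f / section_length * (prev[i - 1] - prev[i])
    return rho


# Start from a congested profile and regulate the inflow to Cd = 4800 veh/h.
final_rho = simulate_densities([90.0] * 6, 4800.0, 100.0, 1.6, 30.0 / 3600.0, 240)
print([round(r, 3) for r in final_rho])
```

Every section settles at min{d, Cd}/vf = 48 veh/km, the equilibrium in (4.17). The Euler step is stable here because dt · vf / L ≈ 0.52 is below 2.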
4.3 Numerical Simulations
4.3.1 Simulation Network and Performance Measurements
The microscopic simulations are carried out with the same road network and parameter
setting as illustrated in section 3.4 except that the distance of the upstream VSL zone
becomes a control variable, as shown in Figure 4.3.
Figure 4.3: I-710 Simulation Road Network with Adjustable L0
The performance measurements and criteria introduced in section 3.4 are also adopted,
with some minor modifications to the relative root mean square error (RRMSE) as follows:
$$e_\rho = \frac{1}{\rho^*} \sqrt{\frac{1}{t_e - t_s} \int_{t_s}^{t_e} (\bar{\rho}(\tau) - \rho^*)^2 \, d\tau} \tag{4.19}$$
where ρ∗ = min{d, Cd}/vf is the desired equilibrium, te is the time when the incident ends, ts
is the time we switch v0 to a higher value, and ρ̄ is the average measured density of all CTM
sections.
4.3.2 Evaluations with Various L0
We are interested in the control performance under both high and moderate traffic demands,
where d = 7000 and 5500 veh/h respectively. For high demand, we evaluate 12 scenarios with
the upstream VSL zone distance L0 = [0, 0.8, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 3.2, 4.0, 4.8] km.
For moderate demand, we evaluate 9 scenarios with L0 = [0, 0.4, 0.6, 0.8, 1.0, 1.2, 1.6, 3.2, 4.8]
km. We take the average results of 10 independent Monte Carlo simulations for each scenario
to reduce the randomness and increase reliability.
Since both high and moderate demands exceed the bottleneck capacity Cd = 4800 veh/h,
the traffic conditions before the bottleneck congestion is cleared are d > Cd and ρN >
Cd/vf, which falls into the second case of (4.4). Plugging in the model parameters, we have
v0 = 25.7 km/h. After the bottleneck congestion is removed, the traffic conditions become
d > Cd and 0 ≤ ρN ≤ Cd/vf, which falls into the first case of (4.4). In this case, v0 = 31.6
km/h. Theoretically, these two values of v0 change q1 to match the dropped and recovered
bottleneck capacity respectively. However, these values may be too aggressive in practice and
should be considered as the upper bounds of v0 due to the uncertainties in model parameters
such as w and the randomness in microscopic simulations. Therefore, we select v0 to be
20 and 25 km/h before and after the removal of the bottleneck congestion. To determine
the proper time of switching v0 from 20 to 25 km/h, we use the time needed to clear the
bottleneck congestion given by (4.10), calculated to be 14 min for high demand and 11 min for
moderate demand. Taking the potential uncertainties into account, the actual time required
for dissipating the congestion may be longer. To enhance the robustness and ensure that
the switching happens after the removal of the congestion, we set the switching time to be
20 min after the occurrence of the incident for both demand levels, i.e. ts = 30 min. In
summary, the control command of v0 for both high and moderate demands is given as
$$v_0 = \begin{cases} 20 \text{ km/h} & \text{if } 10 \le t < 30, \\ 25 \text{ km/h} & \text{if } 30 \le t < 80, \\ 100 \text{ km/h} & \text{otherwise} \end{cases} \tag{4.20}$$
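The schedule (4.20) is a simple lookup on simulation time. The sketch below assumes that t is measured in minutes from the start of the simulation run (the incident lasts from minute 10 to minute 80); the function name `v0_schedule` is hypothetical.

```python
def v0_schedule(t_min):
    """Upstream VSL command (km/h) at simulation time t_min (minutes), per (4.20)."""
    if 10.0 <= t_min < 30.0:
        return 20.0   # bottleneck congestion assumed still present
    if 30.0 <= t_min < 80.0:
        return 25.0   # after the switching time t_s = 30 min
    return 100.0      # no incident: free-flow speed limit


# Commands at a few sample times across the 90-minute run.
print([v0_schedule(t) for t in (0.0, 10.0, 29.5, 30.0, 79.5, 80.0)])
```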
Plugging the lower value of v0 (20 km/h) and d into (4.6), we obtain the lower bounds
of L0 as 1.8 km for high demand and 0.7 km for moderate demand. In our simulation study,
we vary L0 over values below and above the lower bound and examine the impact on the
performance of the VSL controller.
Figures 4.4-4.7 show the evaluation results of all the above-mentioned scenarios under
both traffic demands in terms of the RRMSE in densities, the average number of stops, the
average emission rates of CO2 and the average travel time (ATT) respectively. Each fitting
curve is generated by the smoothing spline fitting algorithm in MATLAB.
Figure 4.4: RRMSE in Densities (eρ) vs. L0.
Figure 4.5: Average Number of Stops (¯s) vs. L0.
Figure 4.6: Average Emission Rates of CO2 vs. L0.
Figure 4.7: Average Travel Time (ATT) vs. L0.
In Figure 4.4, eρ drops down to 25% under both traffic demands as L0 reaches the lower
bound, which implies that the density of each CTM section reaches the steady state and
verifies the correctness of the computed lower bound. In Figures 4.5 and 4.6, we observe
significant benefits in the number of stops and the emission rates of CO2 under high demand
when L0 is close to the lower bound. Under moderate demand, both benefits are less obvious
because the performance deterioration caused by the congestion is less severe. As we further
extend L0 beyond the lower bound, these benefits increase consistently. It seems that L0
should be as long as possible in order to maximize the benefits in terms of number of stops
and emissions. However, overextending L0 leads to undesirable ATT, as shown in Figure 4.7.
In addition, extending L0 in the real world may be expensive or even impossible due to the
road geometry and conditions. Therefore, it is essential to determine L0 that achieves a good
balance between the closed-loop performance, the cost and the difficulty of implementation.
In this sense, the computed lower bound from (4.6) serves as a valuable design tool.
4.3.3 Proposed VSL vs. Feedback-Linearization VSL
In this section, we compare the performance of the proposed VSL controller with the
feedback-linearization (FBL) VSL controller presented in [29]. The LC control introduced in
section 3.3.2 is incorporated with both VSL controllers to enhance the bottleneck throughput and reduce the capacity drop. We consider two demand levels of 7000 veh/h and 5500
veh/h, and set L0 to be 4.8 km and 1.6 km respectively. The evaluation results are presented
in Table 4.1. The proposed VSL performs significantly better than FBL VSL in terms of
the RRMSE in densities in high-demand scenarios. The two VSL controllers deliver close
performance under moderate traffic demand.
Table 4.1: Evaluations of Two VSL Schemes
| Demand | Control | ATT (min) | s̄ | CO2 (g/veh/km) | eρ |
|---|---|---|---|---|---|
| 7000 veh/h | No Control | 22.4 | 43.8 | 395.1 | 240.9% |
| 7000 veh/h | Proposed VSL | 18.9 | 4.6 | 318.2 | 13.1% |
| 7000 veh/h | FBL VSL | 19.2 | 4.9 | 320.8 | 23.9% |
| 5500 veh/h | No Control | 16.6 | 23.6 | 332.9 | 235.1% |
| 5500 veh/h | Proposed VSL | 17.8 | 13.6 | 331.8 | 23.8% |
| 5500 veh/h | FBL VSL | 18.3 | 12.1 | 325.1 | 20.0% |

The key idea of the proposed VSL scheme is to concentrate all the control efforts into
the upstream VSL zone and minimize the downstream speed variations, while the FBL VSL
distributes part of the control efforts into the downstream sections and allows some speed
difference. The results in Table 4.1 indicate that concentrating the control upstream
is better than distributing it across all sections in high-demand scenarios, as the traffic moves more
consistently using the former method. Moreover, the proposed VSL is easier to implement
and requires less computation. The performance is not affected by the uncertainties in the
measurements. However, the FBL VSL may outperform the proposed VSL in the following
situations:
• The road configurations and initial traffic conditions vary among the downstream sections.
• There exist ramps and ramp flows in the CTM sections.
• The available space for L0 is less than desired.
Chapter 5: Integrated Freeway Traffic Control Using Q-Learning with Adjacent Arterial
Traffic Considerations
5.1 Introduction
Demand for freeway and arterial travel grows at a fast pace as the population increases in
metropolitan areas worldwide, leading to traffic congestion and delays at sensitive parts of
road networks such as ramps and intersections. Research efforts on freeway [9, 15, 25] and
arterial traffic management [59, 104, 105] as separate entities have both achieved certain
levels of success in terms of reducing travel time, collision risks and emission rates. However,
the integration of freeway and arterial traffic control has been rarely explored due to the
difficulty of modeling two completely different traffic patterns and high complexity of the road
network. Traditional integration frameworks based on optimization or feedback techniques
often face scalability issues or lack coordination for such tasks [9, 71, 15]. An alternative to
these frameworks is reinforcement learning (RL), which has been investigated in some recent
studies [106, 105]. The RL framework enhances the coordination by combining the states
and actions of sub controllers and offers fast implementation for large-scale road networks.
Therefore, in pursuit of integrating freeway and arterial traffic control, we take an initial
step to develop an RL-based freeway traffic control strategy that considers adjacent arterial
traffic conditions in this chapter.
5.2 Methodology
In this section, we propose a freeway traffic control (FTC) strategy that integrates variable
speed limit (VSL), lane change (LC), ramp metering (RM) and takes into account adjacent
arterial traffic conditions for a connected freeway and arterial road network depicted in Figure
5.1. The purpose of FTC is to reduce average freeway travel time and maintain on-ramp
queue lengths and vehicle densities to a reasonable level under all traffic conditions and input
demands. A cycle length model modified from the classic Webster model [51, 53] is adopted
for arterial traffic signal control (TSC). Note that all the freeway ramps are connected with
arterial roads. Some connections are omitted in Figure 5.1 due to limited drawing space.
More details of the road network configuration will be presented later in the section.
Figure 5.1: Road Network Consisting of a Freeway segment and Adjacent Arterials
5.2.1 Reinforcement Learning
The coordination between VSL, LC and RM actions is the key to improving the performance
of the proposed integrated freeway control strategy. In typical rule-based or feedback-based
control algorithms [71, 25, 32], each controller attempts to achieve its own goal without
coordination. In optimization-based control algorithms [107, 9, 108], the coordination is
addressed as multiple controllers attempting to optimize the same objective. However, the
formulation of the optimization problem is tedious for the considered road network due to
its high complexity, and the approach does not scale well as the network grows in size.
Reinforcement learning (RL) is a promising alternative approach that enhances the coordination between different controllers and requires less computational effort than optimization algorithms in terms of field implementation [106]. The RL agent interacts with the
environment and attempts to maximize the cumulative reward by trial and error in discrete
time steps, which can be formulated as a Markov decision process (MDP). The MDP contains
a set of environment states and a set of control actions. Let $P_a(x, x')$ denote the probability
of transition from state $x$ to another state $x'$ by taking action $a$, and let $R_a(x, x')$ denote the
reward received from the environment by making the transition from $x$ to $x'$ through action
$a$. For each state-action pair $x$ and $a$, the expected discounted reward received by taking
action $a$ in state $x$ is expressed as a Q-value function $Q(x, a)$. To solve the MDP problem,
we need to find a policy $\pi$ that defines the action that leads to the maximal Q-value in each
state, i.e.
$$\pi^*(x) = \arg\max_a Q^*(x, a) \tag{5.1}$$
where $Q^*(x, a)$ can be expressed as
$$Q^*(x, a) = R_a(x, x') + \gamma \sum_{x'} P_a(x, x') \max_{a'} Q(x', a') \tag{5.2}$$
where $\gamma \in [0, 1)$ is a discount factor for the maximum possible future rewards $\max_{a'} Q(x', a')$.
The solution of (5.1) can be calculated using dynamic programming if the transition
probability and the reward function are known. However, in the context of traffic flow
control, the transition probability cannot be expressed explicitly. Therefore, we apply model-free solution techniques such as the Q-learning (QL) algorithm [109] to learn the optimal
policy. During the Q-learning process, the Q-value of each state-action pair $(x, a)$ is updated
after the agent takes action $a$ at state $x$ and receives an immediate reward $R(x, a)$. Assuming
$x'$ is the future state for $(x, a)$, equation (5.2) indicates that the optimal Q-value converges
to $R_a(x, x') + \gamma \max_{a'} Q(x', a')$. Therefore, $Q(x, a)$ is updated as follows
$$Q(x, a) \leftarrow Q(x, a) + \eta \left[ R(x, a) + \gamma \max_{a'} Q(x', a') - Q(x, a) \right] \tag{5.3}$$
where $\eta \in (0, 1]$ is the learning rate that determines to what extent the newly acquired
information will override the old information. $R(x, a) + \gamma \max_{a'} Q(x', a')$ is the newly acquired
estimate of the Q-value, which consists of both the immediate reward $R(x, a)$ and the future
rewards discounted by a factor $\gamma \in [0, 1)$. The convergence of $Q(x, a)$ is achieved when the
difference between the updated $Q(x, a)$ and the previous $Q(x, a)$ is less than a predefined
threshold $\epsilon_q$. The selection of hyperparameters will be covered in section 5.2.2.
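As an illustration, the update rule (5.3) can be sketched as a tabular routine. This is a minimal sketch and not the implementation used in this study: the dictionary-based Q-table, the zero initialization, the function name `q_update` and the example states and actions are all assumptions made for illustration.

```python
from collections import defaultdict

GAMMA = 0.9  # discount factor; the value selected in section 5.2.2


def q_update(Q, x, a, reward, x_next, next_actions, eta, gamma=GAMMA):
    """Apply one update of (5.3) in place; return the absolute change in Q(x, a)."""
    # Best Q-value reachable from the next state x'.
    best_next = max(Q[(x_next, a_next)] for a_next in next_actions)
    # Newly acquired estimate: immediate reward plus discounted future reward.
    target = reward + gamma * best_next
    old = Q[(x, a)]
    Q[(x, a)] = old + eta * (target - old)
    return abs(Q[(x, a)] - old)


# Hypothetical states and actions; unvisited pairs start at 0 (an assumption).
Q = defaultdict(float)
change = q_update(Q, x="s0", a="keep", reward=0.5, x_next="s1",
                  next_actions=["keep", "up", "down"], eta=1.0)
```

The returned change can be compared with the threshold on the Q-value difference to decide whether a state-action pair has converged.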
5.2.2 Freeway Traffic Control Agent
Based on the QL algorithm presented above, this section aims to design a freeway traffic
control (FTC) agent that integrates variable speed limit (VSL), lane change (LC) and ramp
metering (RM) actions to smooth on-ramp merging, reduce freeway travel time, and alleviate
the congestion produced by a potentially existing lane-drop bottleneck. The FTC agent takes
actions based on observations of traffic states within a single-freeway-section environment
depicted in Figure 5.2. The adjacent arterial traffic conditions and signal plan are also taken
into account to enhance the coordination between the freeway and arterials. The learning
process of the FTC agent is presented in Figure 5.3.
As illustrated in Figure 5.3, we first perform offline learning over the small road network
depicted in Figure 5.2. During the training process, we explore as many state-action pairs
as possible by running multiple simulations for each traffic demand level. Each simulation
run lasts for 30 min (60 time steps) with a warm-up of 5 min. After the warm-up, there is
a 20% chance that an incident takes place and blocks the side lane. We consider the offline
learning as completed if the Q-value has converged for each state-action pair, which requires
the difference between the updated and previous Q-value to be less than 0.01. After that,
Figure 5.2: Road Network of a Single Freeway Section with Adjacent Arterials
we implement the trained FTC agent in a large simulation road network with real-world
traffic demands. We collect data from the online implementation to assist the continuous
learning of the FTC agent. The cycle of online implementation and continuous learning is
repeated until the performance improvement becomes negligible, which means
the reduction in travel time and queue lengths between 3 consecutive iterations is less than
1%. The detailed configuration of the large-scale simulation network is presented in
section 5.3.1.
The states, actions, reward function of the FTC agent and some other important variables
and parameters are introduced in the rest of the section.
State Description
The states of the FTC agent involve measured and estimated traffic data of the freeway
section i and the adjacent arterial intersection k, illustrated graphically in Figure 5.2 and
listed as follows
• Vehicle density: $\rho_i$.
• Net flow: $q_{i,net} = q_i - q_{i+1} + r_i - s_i$.
Figure 5.3: Learning Process of FTC Agent
• On-ramp queue length: $w_i^o$.
• The number of closed lane(s): $n_c \in \{0, 1\}$.
• The dominant signal phase of intersection $k$ in the next control cycle: $n_p \in \{1, 2, 3, 4, 5\}$.
• Incoming demand of each approach of intersection $k$ estimated by (5.15): $\tilde{d}_k^E$, $\tilde{d}_k^S$, $\tilde{d}_k^W$, $\tilde{d}_k^N$, where the superscript denotes the direction of the approach. The estimation
process will be introduced in section 5.2.3.
To reduce problem dimension, we discretize the state spaces for continuous state variables
as follows
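$$\begin{aligned}
&\rho_i \in \{20, 30, 40, \ldots, 150\} \text{ veh/km}, \\
&q_{i,net}, \tilde{d}_k^E, \tilde{d}_k^S, \tilde{d}_k^W, \tilde{d}_k^N \in \{0, 100, 200, \ldots, 4000\} \text{ veh/h}, \\
&w_i^o \in \{0, 50, 100, \ldots, 500\} \text{ m}
\end{aligned} \tag{5.4}$$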
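A possible way to map measurements onto the grids in (5.4) is sketched below. The source lists only the grid values; rounding to the nearest grid point and clamping out-of-range measurements to the end points are assumptions, and the helper name `snap` and the sample measurements are hypothetical.

```python
def snap(value, lo, hi, step):
    """Round value to the nearest point of {lo, lo + step, ..., hi}, clamping at the ends."""
    clamped = min(max(value, lo), hi)
    # clamped - lo is non-negative, so int() truncation acts as floor here.
    return lo + step * int((clamped - lo) / step + 0.5)


# Hypothetical raw measurements for one freeway section.
state = (
    snap(47.3, 20, 150, 10),     # density, veh/km
    snap(1234.0, 0, 4000, 100),  # net flow, veh/h
    snap(130.0, 0, 500, 50),     # on-ramp queue length, m
)
```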
Action Space
The action space of the variable speed limit control A(vi) contains a set of speed limit values
that can be applied to each freeway section. Considering the feasibility in the real world,
we set the speed limit to be a multiple of 10 km/h with a minimum value of 60 km/h and
a maximum value of 100 km/h. The speed limit value can be increased or decreased by at
most 10 km/h to ensure safety. Therefore,
$$A(v_i(t)) = \{\max\{60, v_i(t - T) - 10\},\; v_i(t - T),\; \min\{100, v_i(t - T) + 10\}\} \tag{5.5}$$
where T = 30 s is the control cycle.
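The action set (5.5) can be enumerated directly. The sketch below is illustrative only; the function name `vsl_actions` is hypothetical, and duplicate entries that arise at the 60 km/h and 100 km/h bounds are merged.

```python
def vsl_actions(v_prev, v_min=60, v_max=100, step=10):
    """Speed limits (km/h) admissible in the next cycle, given the previous limit v_prev."""
    candidates = {
        max(v_min, v_prev - step),  # decrease by at most one step
        v_prev,                     # keep the current limit
        min(v_max, v_prev + step),  # increase by at most one step
    }
    return sorted(candidates)
```

For a section currently at 80 km/h the admissible set is 70, 80 and 90 km/h, while at either bound only two distinct values remain.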
The ramp metering agent has a fixed green phase duration of 3 s and adjusts the red
phase duration according to the traffic states. The set of available red phase durations is
{0, 0.5, 1, 1.5, 2, 3, 4, 6} in seconds, which corresponds to the following set of on-ramp flow
rates {1800, 1029, 900, 800, 720, 600, 514, 400} in vehicles per hour.
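The listed flow rates are consistent with one vehicle being released per metering cycle, i.e. a rate of 3600 divided by the cycle length in seconds, with a red duration of 0 s treated as unmetered flow at 1800 veh/h. This reading is inferred from the numbers and is not stated in the source; the sketch below simply checks it, and the names used are hypothetical.

```python
GREEN_S = 3.0          # fixed green phase duration, s
UNMETERED_FLOW = 1800  # veh/h when the red phase is skipped (assumed interpretation)
RED_OPTIONS_S = [0, 0.5, 1, 1.5, 2, 3, 4, 6]


def metered_flow(red_s, green_s=GREEN_S):
    """On-ramp flow (veh/h), assuming one vehicle is released per green + red cycle."""
    if red_s == 0:
        return UNMETERED_FLOW
    return round(3600 / (green_s + red_s))


flows = [metered_flow(r) for r in RED_OPTIONS_S]
```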
To mitigate the capacity drop triggered by forced lane change maneuvers and increase
the throughput at the bottleneck, we provide lane change (LC) recommendations to vehicles
moving in the closed lane(s) before approaching the bottleneck, as depicted in Figure 5.2. We
also provide LC recommendations on the upstream area of on-ramp merging if it improves the
overall performance, for which the LC agent needs to make a decision. Therefore, the action
space of the LC control is binary, i.e. whether or not to activate the LC recommendations
on the upstream area of on-ramp merging.
Reward Function
The objective of the proposed freeway control strategy is to reduce the travel time while
maintaining on-ramp queues at reasonable levels and vehicle densities close to the desired
value. The average travel time can be computed as [4]
$$T_t = \frac{1}{N_v} \sum_{j=1}^{N_v} (t_{j,out} - t_{j,in}) \tag{5.6}$$
where $N_v$ is the number of vehicles passing through the freeway section during the current
control cycle, and $t_{j,in}$ and $t_{j,out}$ are the times at which vehicle $j$ enters and exits the section, respectively.
The above-mentioned control objective requires the reward function to be negatively
correlated with the average travel time $T_t$ and the on-ramp queue length $w_i^o$. To avoid on-ramp queue overspill, we want the reward $R = 0$ when $w_i^o$ exceeds the reference value $w_i^r$.
Moreover, we want the vehicle density $\rho_i$ to be close to the desired density $\rho^*$, and thus, $R$
should be negatively correlated with the distance between $\rho_i$ and $\rho^*$. Considering the above
requirements, the reward function $R$ is defined as
$$R = \max\left\{0,\; \left(1 - \frac{w_i^o}{w_i^r}\right) \frac{L_i}{T_t v_f} - \left(\frac{\rho_i}{\rho^*} - 1\right)^2\right\} \tag{5.7}$$
The reward $R$ defined in (5.7) ranges from 0 to 1. $R$ reaches 0 when the on-ramp queue
exceeds the reference value or the density deviates significantly from the desired value.
$R = 1$ represents the ideal condition where the freeway section is in free-flow status with
the density equal to the desired value and no on-ramp queue. In general, a higher reward
value reflects a shorter on-ramp queue, a shorter travel time and a smaller deviation between $\rho_i$ and $\rho^*$. Note
that the objective of Q-learning is not only to maximize the immediate reward defined in
(5.7), but to maximize a cumulative long-term reward where the reward of each time step is
computed by (5.7).
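A minimal sketch of the reward (5.7) follows. It assumes that $L_i$ is the section length and $v_f$ the free-flow speed, so that $L_i/(T_t v_f)$ is the ratio of free-flow to actual travel time; the function name, argument names, units and sample values are hypothetical.

```python
def reward(queue_m, queue_ref_m, section_len_km, travel_time_h,
           v_free_kmh, density, density_des):
    """Reward of (5.7): queue term times travel-time term, minus a density penalty."""
    queue_term = 1.0 - queue_m / queue_ref_m  # negative once the queue exceeds the reference
    speed_term = section_len_km / (travel_time_h * v_free_kmh)  # 1.0 in free flow
    density_penalty = (density / density_des - 1.0) ** 2
    return max(0.0, queue_term * speed_term - density_penalty)


# Free flow, empty ramp, density at the desired value: reward close to 1.
ideal = reward(0.0, 300.0, 1.6, 0.016, 100.0, 30.0, 30.0)
# Half-full ramp, 25% slower than free flow, density 10% above the desired value.
degraded = reward(150.0, 300.0, 1.6, 0.02, 100.0, 33.0, 30.0)
```

In the second case the queue term is 0.5, the travel-time term is 0.8 and the density penalty is 0.01, giving a reward of about 0.39.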
Other Variables and Parameters
As mentioned previously, the reward function (5.7) encourages the density of each freeway
section to converge to a predefined value, denoted as $\rho^*$. In the ideal case, a trivial choice
is to let $\rho^* = \min\{d, C_b\}/v_f$, which corresponds to the highest possible flow rate through
the bottleneck. However, a small disturbance may drive the density towards the capacity-drop region, which introduces unwanted oscillatory behavior of the closed-loop system and
negatively impacts convergence to desired equilibrium states [38]. To avoid the capacity drop
triggered by the disturbance, we multiply $C_b$ by a factor that is slightly less than 1, and
thus, $\rho^* = \min\{d, 0.95 C_b\}/v_f$.
The proposed QL algorithm does not cover the control of the upstream part of the
freeway segment, which involves two crucial variables - the value and the location of the
most upstream VSL sign, denoted as v0 and L0 in Figure 5.1. According to [110], the inflow
q1 is regulated by v0 so that q1 is less than or equal to the bottleneck throughput. The
fundamental diagram (FD) indicates that the maximum possible flow produced by $v_0$ is
$\frac{v_0 w \rho_j}{v_0 + w}$, which corresponds to three possible values of throughput:
• the original capacity $C$ when there is no bottleneck
• the bottleneck capacity without capacity drop $C_b$
• the bottleneck capacity with capacity drop $(1 - \epsilon_0) C_b$
Then we have
$$v_0 = \begin{cases}
v_f & \text{no bottleneck}, \\
\dfrac{w C_b}{w \rho_j - C_b} & \text{bottleneck without capacity drop}, \\
\dfrac{w (1 - \epsilon_0) C_b}{w \rho_j - (1 - \epsilon_0) C_b} & \text{bottleneck with capacity drop}
\end{cases} \tag{5.8}$$
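The two bottleneck cases of (5.8) follow from setting the maximum flow produced by the speed limit equal to the target throughput and solving for the speed limit. The sketch below checks this numerically; the function names and all numerical values (backward wave speed, jam density, capacity, capacity-drop ratio) are hypothetical and chosen only for illustration.

```python
def max_flow(v0, w, rho_jam):
    """Maximum flow (veh/h) sustained under speed limit v0, from the fundamental diagram."""
    return v0 * w * rho_jam / (v0 + w)


def upstream_speed_limit(target_flow, w, rho_jam):
    """Speed limit v0 (km/h) whose maximum flow equals target_flow, as in (5.8)."""
    return w * target_flow / (w * rho_jam - target_flow)


W_KMH, RHO_JAM, C_B, EPS0 = 20.0, 500.0, 6000.0, 0.1  # hypothetical values

v0_no_drop = upstream_speed_limit(C_B, W_KMH, RHO_JAM)
v0_with_drop = upstream_speed_limit((1.0 - EPS0) * C_B, W_KMH, RHO_JAM)
```

With these sample values the limit without capacity drop is 30 km/h, and the limit with capacity drop is lower because the throughput to be matched is smaller.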
On the other hand, we set L0 to be slightly greater than the lower bound proposed in [110],
i.e.
$$L_0 > \frac{(v_f \bar{\rho}(t_0) - (1 - \epsilon_0) C_b)\, v_0 L_b}{((1 - \epsilon_0) C_b - v_0 \rho_0(t_0))\, v_f} \tag{5.9}$$
where $\bar{\rho}$ and $L_b$ are the average vehicle density and the distance from the beginning of section
1 to the lane-drop bottleneck respectively; $t_0$ is the time the incident takes place.
In LC control, the distance of the LC area, denoted as dLC, is a crucial control variable
that needs to be determined properly. dLC must be longer than the minimum distance
required for vehicles to complete LC maneuvers safely, but overextending dLC may lead to
the underutilization of the road capacity. In this study, dLC takes an empirical value of 800
m for both on-ramp merging and lane-closure bottlenecks [4].
The learning rate η is one of the most important QL parameters. A common strategy
is to set a decreasing η over the training process to ensure reasonable efficiency and the
convergence of the Q value. Note that letting η decrease as a function of time does not
work well because different states and actions are visited at different stages of the learning
process. Instead, we assign a specific η to each state-action pair and reduce its value every
time the state-action pair has been visited [24]. Therefore,
$$\eta(x, a) = \left[\frac{1}{1 + n(x, a)(1 - \gamma)}\right]^{0.8} \tag{5.10}$$
where $n(x, a)$ is the number of times the state-action pair $(x, a)$ has been visited and $\gamma$ is
the discount factor. $\gamma$ determines the importance of the future reward. To make the agent
reasonably far-sighted, we select $\gamma = 0.9$.
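The visit-dependent learning rate can be sketched as follows. The sketch reads the exponent 0.8 in (5.10) as applying to the whole fraction, which is an assumption about the typeset equation; the function name is hypothetical.

```python
def learning_rate(n_visits, gamma=0.9, exponent=0.8):
    """Learning rate for a state-action pair visited n_visits times, following (5.10)."""
    return (1.0 / (1.0 + n_visits * (1.0 - gamma))) ** exponent


# The rate starts at 1 for an unvisited pair and decays as visits accumulate.
schedule = [learning_rate(n) for n in (0, 1, 10, 100)]
```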
Since the reward defined in (5.7) ranges from 0 to 1, we select the convergence threshold
ϵq = 0.01. The simulation results reflect insignificant improvement in travel time and queue
control by choosing a smaller ϵq such as 0.002, but the training time required increases
considerably.
To balance exploration and exploitation during the learning process, we implement an
adaptive greedy policy where a random action is selected with probability $\delta$ and the best-known action is selected with probability $1 - \delta$. Similar to the learning rate equation above,
$\delta$ is designed as a function of the number of prior visits to each state [106], i.e.
$$\delta(x) = \max\left\{0.05,\; \frac{1}{1 + \frac{1}{4 N_a(x)} \sum_{n=1}^{a} n(x)}\right\} \tag{5.11}$$
where $N_a(x)$ is the number of available actions at the state $x$, and $n(x)$ is the number of prior
visits of the state $x$. The adopted greedy policy encourages the agent to take random actions
(exploration) when a state has not been visited, and becomes more likely to take the best-known action (exploitation) as the number of visits increases. The probability of exploration
eventually converges to a minimum value of 0.05.
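The exploration rule can be sketched as below. The sketch interprets the sum in (5.11) as the total number of prior visits to the state, which is an assumption; the function names, the dictionary of Q-values and the injectable random source are hypothetical.

```python
import random


def exploration_prob(total_visits, n_actions, floor=0.05):
    """Exploration probability of (5.11), with the sum read as the total visits to the state."""
    return max(floor, 1.0 / (1.0 + total_visits / (4.0 * n_actions)))


def choose_action(q_values, total_visits, rng=random):
    """Pick a random action with probability delta, otherwise the best-known action.

    q_values maps each available action to its current Q-value.
    """
    actions = list(q_values)
    if rng.random() < exploration_prob(total_visits, len(actions)):
        return rng.choice(actions)
    return max(actions, key=q_values.get)
```

Under this reading an unvisited state is always explored, and the probability reaches the 0.05 floor once the visit count is 76 times the number of available actions.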
5.2.3 Arterial Traffic Management
The arterial road network under consideration contains K homogeneous signalized intersections indexed from 1 to K in the freeway traffic flow direction, as depicted in Figure 5.1. The
on-ramp entrances and off-ramp exits lie on the East side of each intersection. There are
2(K +1) entrances plus N off-ramps that generate traffic flow into the arterial road network.
At this stage, we assume traffic signal control (TSC) is the only method to regulate arterial
traffic flows. Fixed-time TSC strategies cannot fit various input levels and traffic conditions as mentioned in section 1.2. Therefore, we propose a traffic-responsive control scheme
to determine the signal plan that minimizes the travel time, the fuel consumption and the
emissions for each intersection based on the observation of input demands and turning ratios.
Cycle Length Model
The first step of the proposed TSC scheme is to find a cycle length model that fits the arterial
traffic conditions. The pioneering research on signal cycle optimization was conducted by F. V.
Webster [51, 53], who developed a formula to compute the signal cycle that minimizes travel
delays while considering the uncertainties of traffic models as follows:
$$T_c = \frac{1.5 T_l + 5}{1 - Y} \tag{5.12}$$
where $T_c$ is the signal cycle and $T_l$ is the lost time per cycle. The lost time is defined as the
time during which no vehicles are able to pass through an intersection due to the transition
between a green phase and a red phase. Y ∈ [0, 1) is the sum of flow ratios of each phase
group, which indicates the degree of saturation of an intersection. The flow ratio is defined
as the actual traffic flow divided by the saturation flow. The saturation flow is set to be 1800
veh/h/lane in the considered arterial network. Extensions based on the Webster model have
been made over the years to optimize different objective functions such as fuel consumption,
emissions and the number of vehicle stops [111, 112, 54]. In this study, we adopt the modified
Webster model proposed by Calle-Laguna et al. [54]:
$$T_c = \alpha_1 \ln\left(\frac{T_l}{1 - Y}\right) + \alpha_2 \tag{5.13}$$
where α1 and α2 are determined by solving a linear regression problem on the data collected
from microscopic simulations over an isolated intersection. The detailed configuration of the
isolated intersection is presented in Figure 5.4. Each intersection approach is four lanes wide
(left, right, and a double through) with a length of 100 m. The arterial road connected to the
intersection is two lanes wide and extends for 1 km in each direction in order to accommodate
the long queue under high demands. The default signal plan involves five phases as shown
in Figure 5.5. Since only medium and high traffic demands are considered, all signal plans
must have a separate left-turn phase to enhance the mobility and safety of the intersection
operation [113].
The commercial microscopic simulator PTV VISSIM 10 is used to calibrate the model
parameters α1 and α2 in (5.13). During the calibration, we set the demand space as
$d^S, d^E, d^N, d^W \in \{400, 500, 600, \ldots, 2000\}$ veh/h and the signal cycle space as
Tc ∈ {40, 50, 60, ..., 180} s. For each demand level, we evaluate all the cycle options using
the following performance index function [111]:
$$P = \gamma_1 \frac{T_t}{T_{t,0}} + \gamma_2 \frac{F}{F_0} + \gamma_3 \frac{E}{E_0} \tag{5.14}$$
where $T_t$ is the average travel time, $F$ is the average fuel consumption, $E$ is the average
Figure 5.4: Calibration Road Network with an Isolated Intersection.
Figure 5.5: Traffic Signal Phasing Scheme with Five Phases
emission rates of CO2, and γ1, γ2, γ3 are the corresponding weights. F and E are computed
using the EPA MOVES model [103]. Tt,0, F0, E0 are the base-case results obtained from the
scenario where the signal cycle Tc = 60 s. According to [111], we set γ1 = 0.4, γ2 = γ3 = 0.3.
After iterating the data collection process for all input levels twice, we plot the data
points with the best (lowest) performance index and perform the linear regression in Figure
5.6. As a result, α1 = 136.8, α2 = −357.7.
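For comparison, the classic Webster formula (5.12) and the calibrated model (5.13) can be evaluated side by side. The sketch below uses the fitted coefficients reported above and the lost time of 16 s adopted in this chapter; the function names and the sample flow-ratio value are hypothetical.

```python
import math

ALPHA1, ALPHA2 = 136.8, -357.7  # regression coefficients reported for (5.13)


def webster_cycle(lost_time_s, flow_ratio_sum):
    """Classic Webster cycle length (5.12), in seconds."""
    return (1.5 * lost_time_s + 5.0) / (1.0 - flow_ratio_sum)


def modified_cycle(lost_time_s, flow_ratio_sum, a1=ALPHA1, a2=ALPHA2):
    """Calibrated logarithmic cycle length model (5.13), in seconds."""
    return a1 * math.log(lost_time_s / (1.0 - flow_ratio_sum)) + a2


# Sample evaluation at Y = 0.5 with a lost time of 16 s.
classic = webster_cycle(16.0, 0.5)
calibrated = modified_cycle(16.0, 0.5)
```

At this sample point the classic formula gives 58 s and the calibrated model about 116.4 s, and both grow as the intersection approaches saturation.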
Figure 5.6: Linear Regression of Optimal Signal Cycle
Demand and Flow Ratio Estimation
To determine signal plans using the obtained cycle length model, we estimate the demands
of each intersection as follows:
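$$\begin{aligned}
\tilde{d}_k^W &= d_k^W + \bar{s}_i \\
\tilde{d}_k^E &= d_k^E \\
\tilde{d}_k^S &= \begin{cases}
d_1^S & \text{if } k = 1, \\
y_{k-1}^{W,l} \tilde{d}_{k-1}^W + y_{k-1}^{E,r} \tilde{d}_{k-1}^E + y_{k-1}^{S,t} \tilde{d}_{k-1}^S & \text{otherwise}
\end{cases} \\
\tilde{d}_k^N &= \begin{cases}
d_K^N & \text{if } k = K, \\
y_{k+1}^{W,r} \tilde{d}_{k+1}^W + y_{k+1}^{E,l} \tilde{d}_{k+1}^E + y_{k+1}^{N,t} \tilde{d}_{k+1}^N & \text{otherwise}
\end{cases}
\end{aligned} \tag{5.15}$$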
where $\tilde{d}_k^W$ is the estimated demand of the Westbound approach at intersection $k$, $d_k^W$ is the
actual Westbound vehicle input of intersection $k$, $\bar{s}_i$ is the historical average off-ramp flow
rate of freeway section $i$ ($\bar{s}_i = 0$ if the off-ramp does not exist), and $y_k^{W,l}$ is the left-turn ratio of the
Westbound approach at intersection $k$. The superscript denotes the traffic flow direction in
the following manner: E - Eastbound, W - Westbound, N - Northbound, S - Southbound,
l - left-turn, r - right-turn, t - through. We assume freeway section $i$ is associated with
intersection k, all vehicle inputs are known and all links within the arterial network are
unsaturated. The estimation process of turning ratios will be illustrated in section 5.3.1.
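The recursion in (5.15) can be sketched as two sweeps along the arterial: Southbound estimates propagate from the first intersection onward, and Northbound estimates propagate from the last intersection backward. The data layout below (zero-based lists indexed by intersection, one off-ramp flow per intersection, nested dictionaries of turning ratios) and the function name are assumptions made for illustration.

```python
def estimate_demands(d_W, d_E, d_S_first, d_N_last, s_bar, turn):
    """Estimate approach demands per (5.15).

    d_W, d_E: measured Westbound/Eastbound inputs per intersection (veh/h).
    s_bar: historical off-ramp flow per intersection (0 where no off-ramp exists).
    turn[k][approach][movement]: turning ratio, movement in {"l", "t", "r"}.
    """
    K = len(d_W)
    W = [d_W[k] + s_bar[k] for k in range(K)]
    E = list(d_E)
    S, N = [0.0] * K, [0.0] * K
    S[0] = d_S_first
    for k in range(1, K):  # forward sweep, fed by intersection k - 1
        y = turn[k - 1]
        S[k] = y["W"]["l"] * W[k - 1] + y["E"]["r"] * E[k - 1] + y["S"]["t"] * S[k - 1]
    N[K - 1] = d_N_last
    for k in range(K - 2, -1, -1):  # backward sweep, fed by intersection k + 1
        y = turn[k + 1]
        N[k] = y["W"]["r"] * W[k + 1] + y["E"]["l"] * E[k + 1] + y["N"]["t"] * N[k + 1]
    return W, E, S, N


# Hypothetical two-intersection example with identical turning ratios everywhere.
ratios = {d: {"l": 0.2, "t": 0.6, "r": 0.2} for d in "WESN"}
W, E, S, N = estimate_demands([1000.0, 800.0], [600.0, 500.0], 400.0, 300.0,
                              [200.0, 0.0], [ratios, ratios])
```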
Then we calculate the flow ratio of each phase group with respect to the phase scheme
presented in Figure 5.5 and sum them up to obtain Y :
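$$\begin{aligned}
Y_1 &= \frac{\tilde{d}^S}{q_s^S} \\
Y_2 &= \frac{(y^{S,r} + y^{S,t}) \tilde{d}^S}{q_s^S} + \frac{(y^{N,r} + y^{N,t}) \tilde{d}^N}{q_s^N} \\
Y_3 &= \frac{\tilde{d}^N}{q_s^N} \\
Y_4 &= \frac{(y^{W,r} + y^{W,t}) \tilde{d}^W}{q_s^W} + \frac{(y^{E,r} + y^{E,t}) \tilde{d}^E}{q_s^E} \\
Y_5 &= \frac{y^{W,l} \tilde{d}^W}{q_s^W} + \frac{y^{E,l} \tilde{d}^E}{q_s^E} \\
Y &= Y_1 + Y_2 + Y_3 + Y_4 + Y_5
\end{aligned} \tag{5.16}$$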
where qs is the saturation flow of the approach whose direction is specified by the superscript.
In the considered arterial network, qs = 7200 veh/h for all directions at each intersection.
Since (5.16) applies to all intersections in the road network, the index of the intersection is
omitted for the sake of simplicity.
Cycle Length and Split Computation
We compute the signal cycle Tc for each intersection using (5.13) and then select the closest
value as the actual cycle Tc from the cycle space {40, 50, 60, ..., 180} s. If the result of (5.13)
happens to be in the middle of two options, we select the larger one.
Once Tc is determined, we can allocate the green time for each phase according to the
flow ratios found in (5.16):
$$T_{g,j} = \frac{(T_c - T_l) Y_j}{Y}, \quad \text{for } j = 1, 2, 3, 4, 5, \tag{5.17}$$
where $T_{g,j}$ denotes the green time of phase $j$ per cycle, and the lost time $T_l = 16$ s.
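The cycle selection rule and the green split (5.17) can be sketched as follows. The function names and the sample phase flow ratios are hypothetical; the tie-breaking rule (prefer the larger option) follows the text.

```python
CYCLE_OPTIONS_S = list(range(40, 181, 10))  # {40, 50, ..., 180} s
LOST_TIME_S = 16.0


def select_cycle(raw_cycle_s):
    """Closest admissible cycle length; ties go to the larger option."""
    return min(CYCLE_OPTIONS_S, key=lambda c: (abs(c - raw_cycle_s), -c))


def green_splits(cycle_s, phase_ratios, lost_time_s=LOST_TIME_S):
    """Green time per phase (5.17): effective green shared in proportion to flow ratios."""
    total = sum(phase_ratios)
    return [(cycle_s - lost_time_s) * y / total for y in phase_ratios]


cycle = select_cycle(116.4)                                # -> 120 s
greens = green_splits(cycle, [0.1, 0.15, 0.1, 0.1, 0.05])  # hypothetical phase ratios
```

The green times always sum to the cycle length minus the lost time, here 104 s.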
To minimize travel delays and improve traffic mobility, it is recommended to unify
the cycle length for closely spaced traffic signals and use proper offsets to create a progression
band (green wave) for vehicle platoons on the main street [113]. However, the intersections
in our simulation network are relatively far apart, with a minimum spacing of 600 m and
an average spacing of 1500 m. Besides, the longitudinal traffic is not significantly larger than the lateral
traffic at each intersection. Therefore, offset optimization is not considered, and the offset
of each signal is simply set to 0 s.
5.3 Numerical Simulations
5.3.1 Simulation Network and Parameters
The proposed control methodologies are simulated using a microscopic traffic simulator based
on the commercial software PTV Vissim 10. The road network in Figure 5.7 contains a 16-
km segment of I-710 freeway and the adjacent arterial region in Los Angeles, California,
United States. The freeway segment is divided into 6 sections and one upstream VSL zone.
The length of each freeway section is slightly longer than the distance of the parallel arterial
link due to its curved shape, with an average of 1.6 km. The length of the upstream VSL
zone is 4 km and the deployment location of v0 is determined by (5.9). The freeway segment
has 5 lanes, 5 on-ramps and 6 off-ramps. All ramps are connected with the arterial road
network. There are 7 arterial intersections aligned in parallel with the freeway. The real-world location of each intersection is marked by a red circle in Figure 5.8. The selected
intersections are all major intersections whose traffic conditions are closely related to the
states of the adjacent freeway sections. We simplify the simulation network by directly linking
some of the intersections and ignoring vertical freeways.
Traffic demands are generated from one freeway entrance and 16 arterial entrances as
indicated by arrows in Figure 5.7. The freeway demand is based on hourly average traffic
volume data of April 2019 from the Caltrans Performance Measurement System (PeMS).
The arterial demands are based on hourly traffic counts data of April 2019 from the LADOT
Database. We consider two levels of traffic demands. The moderate level is based on the
average hourly traffic data of the whole month. The high-level demand is 40% more than
the moderate-level demand. Each simulation run lasts for 40 min. If the incident exists, it
takes place after a 10-min warm-up and will be cleared at 30 min.
Based on our observations, the saturation demand is approximately
double the moderate-level demand. Such demand often leads to long queues at
intersections, occasionally causing spillback onto the freeway via off-ramps. Our proposed
approach does not effectively address this situation, which highlights the need for further research.
Figure 5.7: I-710 Simulation Road Network with Incident Location
Figure 5.8: I-710 Simulation Road Network on Bing Map
The turning ratio is determined by the ratio of the traffic flow of each direction, where each
traffic flow follows a normal distribution $\mathcal{N}(\mu, \sigma^2)$. $\mu$ and $\sigma$ are the mean and the standard
deviation of the LADOT traffic counts data of the corresponding direction. Although the
actual turning ratio is unknown, we use the mean value of the flow distribution to estimate
the turning ratio of an intersection approach as follows:
$$y^l = \frac{\mu^l}{\mu^l + \mu^t + \mu^r}, \quad y^t = \frac{\mu^t}{\mu^l + \mu^t + \mu^r}, \quad y^r = \frac{\mu^r}{\mu^l + \mu^t + \mu^r} \tag{5.18}$$
where $l$, $t$, $r$ stand for left-turn, through, right-turn respectively. The turning ratios of on-ramps and off-ramps are estimated in the same manner as those of intersection approaches.
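The estimate (5.18) amounts to normalizing the mean flows of the three movements of an approach. A minimal sketch, with a hypothetical function name and hypothetical mean counts:

```python
def turning_ratios(mu_left, mu_through, mu_right):
    """Turning ratios of one approach from the mean flows of its movements, as in (5.18)."""
    total = mu_left + mu_through + mu_right
    return {"l": mu_left / total, "t": mu_through / total, "r": mu_right / total}


# Hypothetical mean counts (veh/h) for one approach.
ratios = turning_ratios(100.0, 300.0, 100.0)
```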
The above-mentioned road network and parameter setting are applied to the online implementation during the learning process of the freeway traffic control (FTC) agent. There
are 4 scenarios to be executed in one online implementation - moderate demands without
incident, moderate demands with incident, high demands without incident, high demands
with incident. The cycle of online implementation and continuous learning is iterated until
the performance improvement in travel time and queue lengths between 3 consecutive iterations is less than 1%. Then we apply the refined FTC agent for each scenario once more to
obtain the final simulation results presented in section 5.3.2. In addition, we are interested
in 3 other types of freeway control as comparisons. Note that the arterial traffic control is
fixed to the proposed one in section 5.2.3. Different freeway control strategies to be tested
are summarized as follows:
(i) No freeway control: inactive VSL, LC and RM control.
(ii) Decentralized feedback control: a decentralized feedback control strategy proposed in
[15].
(iii) QL without coordination: uncoordinated FTC trained by a similar QL algorithm. To
eliminate the coordination, the FTC agent is divided into three sub agents (VSL, LC,
RM) for separate training. Each agent has only one action variable (e.g. $v_i$ for VSL).
In addition, none of these agents considers the adjacent arterial traffic conditions, and
thus, $n_p$, $\tilde{d}_k^E$, $\tilde{d}_k^S$, $\tilde{d}_k^W$, $\tilde{d}_k^N$ are removed from the state variables.
(iv) QL with coordination: the proposed FTC.
5.3.2 Simulation Results
In this section, we evaluate the performance of the 4 types of control strategies listed above
in the road network depicted in Figure 5.7 for each scenario of interest. The
performance criteria are listed as follows [4]:
• Freeway average travel time ($T_t$): the average time spent by each vehicle to travel
through the freeway segment.
$$T_t = \frac{1}{N_v} \sum_{i=1}^{N_v} (t_{i,out} - t_{i,in}) \tag{5.19}$$
where $N_v$ is the number of vehicles passing through the freeway segment, and $t_{i,in}$ and $t_{i,out}$
are the times at which vehicle $i$ enters and exits the freeway segment, respectively. We require $t_{i,in} > 10$ min
so that the warm-up period is excluded. Vehicles that enter or exit from ramps are
also excluded.
• Freeway average number of stops ($\bar{s}$): the average number of stops performed by each
vehicle when traveling through the freeway segment.
$$\bar{s} = \frac{1}{N_v} \sum_{i=1}^{N_v} s_i \tag{5.20}$$
where $s_i$ is the number of stops performed by vehicle $i$. The warm-up period is excluded.
• Freeway average emission rates of CO2 ($E$): the calculation of emission rates is based
on the MOVES model provided by the Environmental Protection Agency [103].
$$E = \sum_{i=1}^{N_v} E_i \Big/ \sum_{i=1}^{N_v} l_i \tag{5.21}$$
where $E_i$ is the emission produced by vehicle $i$ and $l_i$ is the travelled distance of vehicle
$i$. The warm-up period is excluded.
• Average on-ramp queue length ($\bar{w}_o$):
$$\bar{w}_o = \sum_{i=1}^{N} \bar{w}_i / N_o \tag{5.22}$$
where $N$ is the number of freeway sections, $\bar{w}_i$ is the average queue length of on-ramp
$i$ during the simulation except the warm-up, and $N_o$ is the number of on-ramps.
• Average queue length of arterial intersections ($\bar{w}_a$):
$$\bar{w}_a = \sum_{k=1}^{K} (\bar{w}_k^N + \bar{w}_k^S + \bar{w}_k^E + \bar{w}_k^W) / 4K \tag{5.23}$$
where $K$ is the number of arterial intersections, and $\bar{w}_k^N$ is the average queue length of the
Northbound approach of intersection $k$ during the simulation except the warm-up.
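Two of the criteria above, the average travel time (5.19) and the emission rate (5.21), can be computed from per-vehicle records as sketched below. The record layout (entry and exit times in seconds, emissions in grams, distances in kilometres), the function names and the sample data are assumptions; the 10-min warm-up filter follows the text.

```python
WARMUP_S = 600.0  # 10-min warm-up excluded from the evaluation


def average_travel_time(trips, warmup_s=WARMUP_S):
    """Mean of (t_out - t_in) over vehicles that enter after the warm-up, as in (5.19)."""
    kept = [(t_in, t_out) for t_in, t_out in trips if t_in > warmup_s]
    return sum(t_out - t_in for t_in, t_out in kept) / len(kept)


def emission_rate(emissions_g, distances_km):
    """Total emissions divided by total travelled distance, as in (5.21)."""
    return sum(emissions_g) / sum(distances_km)


# Hypothetical records: the first vehicle enters during the warm-up and is dropped.
tt = average_travel_time([(100.0, 700.0), (700.0, 1300.0), (900.0, 1560.0)])
er = emission_rate([3200.0, 3300.0], [16.0, 16.0])
```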
Considering the stochastic nature of microscopic simulations, we take the average of 10
simulation runs for each pair of control type and scenario and record the final results in
Tables 5.1, 5.2, 5.3 and 5.4.
Table 5.1: Evaluations of a Moderate-Demand Scenario without Incident
| Control | $T_t$ (min) | $\bar{s}$ | $E$ (g/veh/km) | $\bar{w}_o$ (m) | $\bar{w}_a$ (m) |
|---|---|---|---|---|---|
| No freeway control | 10.2 | 0.3 | 200.5 | 0 | 11.2 |
| Decentralized feedback control | 10.2 | 0.2 | 199.8 | 0 | 11.9 |
| QL without coordination | 10.2 | 0.2 | 199.4 | 0 | 12.5 |
| QL with coordination | 10.2 | 0.2 | 200 | 0 | 11.6 |
Table 5.1 presents the results of a moderate-demand scenario without incident, where
there is no bottleneck on freeway and minimal control effort is needed. Thus, the results of
all types of control are close to each other. The average number of stops is supposed to be
0 in reality. However, the stop happens in simulations when two vehicles are very close to
Table 5.2: Evaluations of a Moderate-Demand Scenario with Incident
| Control | $T_t$ (min) | $\bar{s}$ | $E$ (g/veh/km) | $\bar{w}_o$ (m) | $\bar{w}_a$ (m) |
|---|---|---|---|---|---|
| No freeway control | 12.3 | 0.7 | 212.7 | 0 | 14.5 |
| Decentralized feedback control | 11.4 (7%) | 0.5 (29%) | 209 (2%) | 13.1 | 11.7 |
| QL without coordination | 11.3 (8%) | 0.5 (29%) | 208.1 (2%) | 15.9 | 11.6 |
| QL with coordination | 10.8 (12%) | 0.3 (57%) | 203.2 (4%) | 16.4 | 11.7 |
Table 5.3: Evaluations of a High-Demand Scenario without Incident
| Control | $T_t$ (min) | $\bar{s}$ | $E$ (g/veh/km) | $\bar{w}_o$ (m) | $\bar{w}_a$ (m) |
|---|---|---|---|---|---|
| No freeway control | 12 | 2.1 | 241.7 | 25.3 | 49.6 |
| Decentralized feedback control | 11.7 (2%) | 1.6 (24%) | 233.7 (3%) | 30.2 | 49.3 |
| QL without coordination | 11.6 (3%) | 1.7 (19%) | 232.5 (4%) | 44.1 | 47.8 |
| QL with coordination | 11.3 (6%) | 1.4 (33%) | 225.5 (7%) | 40.8 | 45.6 |
each other at ramp merging points. Table 5.2 presents the results of a moderate-demand
scenario with incident, where the incident introduces a lane-drop bottleneck that increases
the travel time (Tt) and the number of stops (¯s) significantly. The percentages in brackets
quantify the performance improvement by implementing the corresponding control scheme
versus no freeway control. The percentage improvement of the coordinated control is higher
than decentralized feedback control and the uncoordinated control, which verifies the benefit
of integrating VSL, LC and RM actions.
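The bracketed percentages can be reproduced from the table entries as the relative reduction with respect to the no-control baseline, rounded to the nearest integer. The helper name below is hypothetical; the numbers are taken from Table 5.2.

```python
def improvement_pct(baseline, controlled):
    """Relative reduction versus the no-control baseline, in whole percent."""
    return round(100.0 * (baseline - controlled) / baseline)


# Coordinated QL versus no freeway control, moderate demand with incident (Table 5.2).
travel_time_gain = improvement_pct(12.3, 10.8)  # travel time, min
stops_gain = improvement_pct(0.7, 0.3)          # number of stops
emission_gain = improvement_pct(212.7, 203.2)   # CO2, g/veh/km
```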
Table 5.3 presents the results of a high-demand scenario without incident, where the
increased demand introduces bottlenecks at on-ramp merging areas occasionally. The overall
performance improvement by freeway traffic control is less obvious compared with Table
5.2 because the ramp-merging bottleneck is less detrimental than the lane-drop bottleneck.
The coordinated control still outperforms uncoordinated control schemes, but by smaller margins.
Table 5.4 presents the results of a high-demand scenario with incident, where both the lane-drop bottleneck and ramp-merging bottlenecks exist. The performance margins between
the coordinated and uncoordinated control schemes increase as we introduce the incident.
The above observations indicate that the proposed coordination mechanism provides higher
benefits when the demand grows or when an incident takes place.
Table 5.4: Evaluations of a High-Demand Scenario with Incident
| Control | $T_t$ (min) | $\bar{s}$ | $E$ (g/veh/km) | $\bar{w}_o$ (m) | $\bar{w}_a$ (m) |
|---|---|---|---|---|---|
| No freeway control | 14.1 | 4.9 | 262.4 | 33 | 56.7 |
| Decentralized feedback control | 13 (8%) | 2.9 (41%) | 253 (4%) | 35.9 | 55.2 |
| QL without coordination | 13.3 (6%) | 3.2 (35%) | 254.2 (3%) | 48.7 | 51 |
| QL with coordination | 12.4 (12%) | 2.4 (51%) | 245.3 (7%) | 47 | 47.5 |
A trade-off between the freeway performance ($T_t$, $\bar{s}$, $E$) and on-ramp queue lengths ($\bar{w}_o$)
can be observed by using any type of freeway traffic control. Taking the coordinated control in
Table 5.4 as an example, the average on-ramp queue is 14 m longer while the average freeway
travel time is reduced by 12%. The increase in the queue length is trivial considering the
minimum on-ramp queue capacity, which is over 300 m. Therefore, the trade-off is acceptable.
By considering the adjacent arterial traffic conditions, the coordinated control demonstrates
slightly better performance compared to the uncoordinated QL algorithm in managing on-ramp queue lengths. This improvement may stem from an earlier activation of lane change
recommendations, which smooths the merging process for large on-ramp demands.
Despite the fact that the FTC agent does not control the arterial traffic signals, the
coordinated freeway control leads to a significant reduction in average queue length of arterial
intersections ($\bar{w}_a$), possibly due to a better processing of the off-ramp demands feeding into
the arterial network. This benefit motivates a fully integrated control approach where the
QL framework also involves the control of arterial traffic signals.
We draw density profiles for freeway section 4 in Figure 5.9 as a representative since
the density profiles of all sections share similar behaviors. The purple dashed line marks
the desired density in each scenario. We spend a 10-min warm-up loading traffic for the
entire network. Then the control algorithm kicks in and maintains the density of each
freeway section at the steady state. Figure 5.9 shows that the control schemes of interest
deliver similar performance in terms of stabilizing the density when there is no incident. The
coordinated QL performs significantly better than the uncoordinated QL and decentralized
feedback control when the incident occurs in high-demand scenarios. The above results
73
(a) (b)
(c) (d)
Figure 5.9: Measured Vehicle Density Profiles in Freeway Section 4 (where the incident
takes place). (a) Moderate demand without incident, (b) Moderate demand with incident,
(c) High demand without incident, (d) High demand with incident.
also demonstrate the necessity of coordinating different control components when the road
network is congested.
Although the proposed control strategy demonstrates satisfactory overall performance, we
have observed instances in high-demand scenarios where red light queues at certain arterial
intersections spill back onto the freeway right lane through off-ramps, significantly impeding
freeway traffic mobility. Interestingly, other directions of these intersections remain relatively
empty. This observation motivates us to develop an adaptive arterial signal controller which
is capable of adjusting the timing to alleviate off-ramp queues in future research.
Chapter 6: Traffic Signal Control and Speed
Offset Coordination Using Q-Learning for Arterial Road Networks
6.1 Introduction
The effective management of arterial traffic has become a pivotal aspect in addressing the
challenges posed by growing populations and increasing vehicular densities. Traffic signal
control (TSC) is the only available control option for most arterial roads nowadays. The
synchronization of traffic signals facilitates the creation of green waves, allowing for more
seamless progression of vehicles and reducing travel times on arterial roads. The conventional
fixed-time TSC systems have proved insufficient in coping with the unpredictable nature
of modern urban traffic [114]. To address the issue, many adaptive TSC algorithms have been
proposed and thoroughly studied by the research community [62, 115, 81]. The adaptive TSC
system typically uses sensors and real-time data to adjust signal timings and dynamically
respond to changes in traffic states. However, the benefit obtained by the adaptive signal
control can be limited when the arterial roads are highly saturated due to high demands or
unexpected incidents [116, 117].
Although less practiced in the real world, other traffic regulation techniques such
as variable speed limit (VSL) control may also improve the arterial traffic mobility when
coordinated with the traffic signal control [118, 59]. The core idea is to control the speed
of vehicles to allow as many vehicles to cross the signalized intersection during the green
phase as possible, and thus, maximize the progression bandwidth. The above concept has
been further extended to the study of eco-driving where the most energy-saving trajectory
is determined for connected vehicles (CVs) based on the information of signal phasing and
timing [63, 64, 65]. Most of these studies focus on a microscopic/vehicular level and assume
the signal timing is fixed.
In this chapter, we develop an arterial traffic control strategy that coordinates a series
of traffic signals with variable speed control to improve the progression bandwidth and
reduce the arterial travel time. To facilitate the coordination between different control
components, we adopt the reinforcement learning (RL) framework, which has been applied
in many adaptive signal control studies due to its ability to share the control policy and the
perception of traffic conditions between individual agents [77]. Besides, the RL framework
can be modified to include both the signal control and the speed control as two separate
agents, but maintain collaborative interactions among the agents to enhance the overall
system performance.
We have designed an RL-based freeway traffic control strategy and verified its effectiveness
in a mixed freeway and arterial road network in chapter 5. In this chapter, we focus on the
performance of the arterial part and implement a more sophisticated arterial control strategy
using a similar RL framework in the same road network. In addition, we are interested in
the interactions between the freeway control and the arterial control.
6.2 Problem Statement
Consider a road network that consists of a freeway segment and the adjacent arterial streets
as depicted in Figure 6.1. Although the freeway and arterial traffic interact frequently via
ramps, they are controlled separately with very little, if any, communication. This often
leads to severe traffic congestion problems such as off-ramp queue overspill where the red
light queue at an arterial intersection extends all the way back to the side lane of the freeway
via the off-ramp. To prevent the problem, the arterial signal needs to adjust its timing to
accommodate more off-ramp demands before the occurrence of overspill.
Figure 6.1: Road Network under Proposed Arterial Traffic Control
On the other hand, in spite of the evolution of traffic signal control (TSC) techniques in
past decades, TSC by itself has difficulty maintaining the expected performance under high
arterial traffic demands. Recent developments in eco-driving motivate us to incorporate
vehicle speed control with TSC in order to increase the progression bandwidth and improve
traffic mobility.
With the above considerations, we design a new arterial control strategy that coordinates
traffic signal timing, offset and dynamic speed limits with the objective to reduce arterial
travel time and size of queues at off-ramps and intersections. A Q-learning framework is
adopted due to its ability to coordinate different control components and its suitability for fast real-world
implementations. The off-ramp queue is taken into account as a state variable and included
in the reward function to avoid off-ramp overspill.
6.3 Methodology
In this section, we provide a detailed description of the proposed Q-learning-based arterial
traffic control strategy. The road network depicted in Figure 6.1 consists of a freeway segment
and an adjacent arterial corridor with multiple signalized intersections. The freeway segment
is divided into N sections. Each section contains at least one ramp and each ramp connects
to the East leg of an arterial intersection. Note that some ramp connections are omitted in
Figure 6.1 due to limited drawing space. The proposed arterial traffic control is operated
by two agents: a traffic signal control (TSC) agent that computes the signal plan (with zero
offset) for each intersection and a dynamic speed offset (DSO) agent that determines the
proper signal offset and vehicle speed between two intersections.
6.3.1 TSC Agent
The TSC agent computes the signal plan which includes the cycle length and phase split
for each arterial signal based on the observations of traffic states as shown in Figure 6.2.
We introduce a network-level controller to receive the outputs of all the TSC agents and
generate final signal plans. The network controller unifies the signal plan for all intersections
so that the progression bandwidth on the horizontal arterial in Figure 6.1 can be maximized
with the assistance of the DSO agent. This unification mechanism can be either activated or
deactivated. The control cycle is set to 5 min, which means the signal plan is updated every
5 min during the simulation. To apply QL algorithms, we first introduce the definition of
states, actions and reward function for the TSC agent.
States:
The states of the TSC agent are defined as
$$X_{ts} = [w_s, \tilde{d}^S, \tilde{d}^E, \tilde{d}^N, \tilde{d}^W] \tag{6.1}$$
Figure 6.2: Traffic Signal Control (TSC) Agent
where $w_s$ is the queue length of the off-ramp that is connected to the freeway side of the
corresponding intersection, and $\tilde{d}^S$, $\tilde{d}^E$, $\tilde{d}^N$, $\tilde{d}^W$ are the measured incoming demands
of the intersection from four directions in vehicles per hour (veh/h), with the superscript denoting
the direction (e.g. S for Southbound). These traffic states are measured every 30 seconds.
The demand is estimated as the average measured flow rate of the upstream link in the past
control cycle. $w_s$ is set to 0 at the beginning of each simulation, and then calculated as the
average observed off-ramp queue length in the past control cycle.
We use $S_w$ to denote the state space of $w_s$ and $S_d$ to denote the state space of
$\tilde{d}^S$, $\tilde{d}^E$, $\tilde{d}^N$, $\tilde{d}^W$. To reduce the problem dimension and improve the training efficiency, we
discretize the continuous state space so that $S_w = \{0, 50, 100, ..., 500\}$ in meters and
$S_d = \{0, 100, 200, ..., 4000\}$ in veh/h. Note that 500 m is slightly larger than the length of the longest
off-ramp and 4000 veh/h is larger than the maximal historical link flow. Therefore we choose
these two values as the upper limits of the discrete state spaces.
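As a minimal sketch of the discretization above, the snippet below snaps a continuous measurement onto the grids $S_w$ and $S_d$. The nearest-point rounding rule and the names `discretize` and `tsc_state` are assumptions made for illustration; the text specifies only the grids themselves.

```python
def discretize(value, step, upper):
    """Snap a measurement to the nearest point of {0, step, ..., upper}.

    Nearest-point rounding is an assumption; the text gives only the grids.
    """
    clipped = min(max(value, 0.0), upper)
    return int(clipped / step + 0.5) * step


def tsc_state(w_s, demands):
    """Build the discrete TSC state [w_s, d_S, d_E, d_N, d_W] of (6.1)."""
    queue = discretize(w_s, 50, 500)                      # S_w, in meters
    flows = [discretize(d, 100, 4000) for d in demands]   # S_d, in veh/h
    return (queue, *flows)
```

For example, `tsc_state(137, [1849, 250, 4200, 0])` clips the out-of-range demand to the upper limit of 4000 veh/h.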
Actions:
The actions executed by the TSC agent are expressed as
$$A_{ts} = [T_c, g_1, g_2] \tag{6.2}$$
where $T_c$ is the cycle length, which is the sum of the green times of all the phases plus the lost
time during phase transitions. $g_1$ and $g_2$ are phase-split ratios used to compute the green
time of each phase; the exact expressions used to calculate them are presented below.
The default phase scheme involves 6 phases as depicted in Figure 6.3. The green time of
each phase is denoted as $T_{g,j}$ where $j = 1, 2, ..., 6$ is the phase index. To simplify the phase
split computation, we assume that phases with the same color in Figure 6.3 have the same green
time, i.e. $T_{g,1} = T_{g,3}$, $T_{g,4} = T_{g,6}$. In addition, $T_{g,1}/T_{g,2} = T_{g,4}/T_{g,5}$.
Figure 6.3: Traffic Signal Phasing Scheme with Six Phases
With the above assumptions, we only need two ratios to compute the green time of each
phase: $g_1$ is the green time of the first 3 phases divided by the green time of all 6 phases, and $g_2$ is
the green time of the first phase divided by the green time of the first 3 phases, i.e.
$$g_1 = \frac{T_{g,1} + T_{g,2} + T_{g,3}}{T_{g,1} + T_{g,2} + T_{g,3} + T_{g,4} + T_{g,5} + T_{g,6}}, \qquad g_2 = \frac{T_{g,1}}{T_{g,1} + T_{g,2} + T_{g,3}} \tag{6.3}$$
Note that the green time of all 6 phases is not equal to the cycle length $T_c$; instead we have
$$T_{g,1} + T_{g,2} + T_{g,3} + T_{g,4} + T_{g,5} + T_{g,6} = T_c - T_l \tag{6.4}$$
where $T_l$ is the lost time, defined as the time period during which no vehicles pass through
the intersection due to phase transitions.
The action space of the cycle length $A_c$ is set to $\{40, 50, 60, ..., 180\}$ in seconds, which
follows the same setting as in chapter 5. The action space of $g_1$ is $A_{g_1} = \{0.2, 0.3, 0.4, ..., 0.8\}$.
The action space of $g_2$ is $A_{g_2} = \{0.1, 0.2, 0.3, 0.4\}$. After choosing $g_1$ and $g_2$, the phase split
can be computed as follows:
$$\begin{aligned}
T_{g,1} = T_{g,3} &= (T_c - T_l)\, g_1 g_2 \\
T_{g,2} &= (T_c - T_l)\, g_1 (1 - 2 g_2) \\
T_{g,4} = T_{g,6} &= (T_c - T_l)(1 - g_1)\, g_2 \\
T_{g,5} &= (T_c - T_l)(1 - g_1)(1 - 2 g_2)
\end{aligned} \tag{6.5}$$
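The phase split in (6.5) can be sketched in a few lines. The function name `phase_green_times` is hypothetical, and the lost time of 12 s used in the example is an assumed value, since the text does not state $T_l$.

```python
def phase_green_times(cycle, lost_time, g1, g2):
    """Return the green time of phases 1..6 following (6.5).

    cycle and lost_time are in seconds; g1 and g2 are the split ratios of (6.3).
    """
    usable = cycle - lost_time        # total green time, see (6.4)
    first = usable * g1               # green time shared by phases 1-3
    second = usable * (1.0 - g1)      # green time shared by phases 4-6
    return {
        1: first * g2,
        2: first * (1.0 - 2.0 * g2),
        3: first * g2,
        4: second * g2,
        5: second * (1.0 - 2.0 * g2),
        6: second * g2,
    }


# Example with an assumed lost time of 12 s (not given in the text).
green = phase_green_times(cycle=120, lost_time=12, g1=0.6, g2=0.3)
```

Because $g_2 \le 0.4$ in $A_{g_2}$, the factor $1 - 2 g_2$ stays positive, so phases 2 and 5 always receive some green time.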
Reward:
The first objective of the traffic signal control is to reduce the average travel time for a square
area centered at the intersection as depicted in Figure 6.4. The side length of the square
area $L_s$ is chosen to be long enough to measure and monitor the queues at the intersection,
and thus, $L_s = 400$ m.
Figure 6.4: The Square Area for TSC to Compute the Average Travel Time
The average travel time $T_t$ is computed as
$$T_t = \frac{1}{N_v} \sum_{i=1}^{N_v} (t_{i,out} - t_{i,in}) \tag{6.6}$$
where $N_v$ is the number of vehicles traveling through the square area during the last control
cycle, and $t_{i,in}$ and $t_{i,out}$ are the times at which vehicle $i$ enters and exits the square area.
The TSC agent also interacts with the off-ramp queue. An effective reward function
should exhibit a negative correlation with the average travel time $T_t$ and the average off-ramp
queue length $w_s$. Considering the above requirements, the reward function is defined
as
$$R_{ts} = \max\left\{0,\ 1 - \frac{w_s}{w_s^r}\right\} \cdot \frac{L_s}{T_t v_a} \tag{6.7}$$
where $w_s^r$ is the reference off-ramp queue length that we do not want $w_s$ to reach, and $v_a$ is the
default arterial speed limit. $w_s^r$ should be slightly less than the off-ramp queue capacity to
prevent the overspill. In the simulations, we set $w_s^r = 400$ m and $v_a = 60$ km/h.
$R_{ts}$ has a maximum value of 1, which is obtained when there is no queue at the off-ramp
and the average travel time $T_t = L_s/v_a$. This is a rare case which occurs when there is no
travel delay at the intersection. $R_{ts}$ reaches the minimum value of 0 when $w_s$ reaches or exceeds the
reference value $w_s^r$, which penalizes off-ramp queue overspill. In general, a higher
$R_{ts}$ reflects a shorter off-ramp queue and a lower average travel time. Note that the objective of
Q-learning is not only to maximize the immediate reward defined in (6.7), but to maximize
a cumulative long-term reward where the reward of each time step is computed by (6.7).
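A small sketch of the reward in (6.7), using the simulation values quoted above as defaults, may help make its range concrete. The function name `tsc_reward` and its argument names are hypothetical; the only step added here is the km/h to m/s conversion needed to keep the units consistent.

```python
def tsc_reward(w_s, travel_time, w_ref=400.0, side_length=400.0,
               speed_limit_kmh=60.0):
    """Immediate TSC reward of (6.7).

    w_s, w_ref and side_length are in meters, travel_time is in seconds.
    """
    if travel_time <= 0:
        raise ValueError("travel_time must be positive")
    v_a = speed_limit_kmh / 3.6                    # km/h -> m/s
    queue_term = max(0.0, 1.0 - w_s / w_ref)       # 0 once w_s reaches w_ref
    return queue_term * side_length / (travel_time * v_a)
```

With no off-ramp queue and a travel time of 24 s (400 m at 60 km/h), the reward is 1; with the queue at the 400 m reference it is 0 regardless of the travel time.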
Network Control:
The network controller receives the intended actions from all TSC agents within the arterial
network and selects a unified action for each action variable of all signals based on a majority
rule as follows:
• If the most common action is the intended action of more than half of the TSC agents, it
selects it as the unified action for all signals.
• If the most common action is supported by half of the TSC agents or fewer, it computes the
average of all intended actions and selects as the unified action the available action that is
closest to the average value. If the average value lies exactly midway between two actions,
it selects the larger one.
To demonstrate the majority rule proposed above, we provide an example of an arterial
corridor with four signalized intersections and discuss three cases as follows:
• Case 1: the intended signal cycles are {60, 60, 60, 70}s. Since more than half of the agents
favor 60s, the network controller selects 60s as the common cycle.
• Case 2: the intended signal cycles are {60, 60, 70, 80}s. No action is favored by more
than half of the agents. The average cycle length is 67.5s. The network controller selects the
cycle that is closest to 67.5s, i.e. 70s.
• Case 3: the intended signal cycles are {60, 60, 70, 70}s. No action is favored by more
than half of the agents. The average cycle length is 65s, which lies midway between two
available options, 60s and 70s. The network controller selects the larger one, i.e. 70s.
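The majority rule and the three cases above can be reproduced with the following sketch. The function name `unify` is hypothetical, and rounding the distance to nine decimals is an implementation choice made here so that ties are also detected for fractional actions such as $g_1$ and $g_2$.

```python
from collections import Counter


def unify(intended, action_space):
    """Pick one unified action from the agents' intended actions.

    A strict majority wins outright; otherwise the available action closest
    to the average is chosen, with ties resolved toward the larger action.
    """
    value, count = Counter(intended).most_common(1)[0]
    if count > len(intended) / 2:
        return value
    mean = sum(intended) / len(intended)
    # Sort key: smallest distance first, then the larger action on a tie.
    return min(action_space, key=lambda a: (round(abs(a - mean), 9), -a))


cycles = range(40, 181, 10)   # action space A_c, in seconds
```

Calling `unify` on the cycle lists of Cases 1 to 3 with `cycles` as the action space yields 60, 70 and 70 s respectively.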
Although the unification may compromise the travel time of some particular intersection
areas, it benefits the overall arterial travel time and traffic mobility in most circumstances
with the collaboration of the DSO agent. The unification mechanism is deactivated during
the training process so that more state-action pairs can be visited.
6.3.2 DSO Agent
The DSO agent searches for the optimal offset and speed recommendations between two
adjacent arterial signals based on the observations of signal plans, intersection queue length
and link distance as shown in Figure 6.5. The control cycle is set to 5 min to be consistent
with the TSC agent. The states, actions and reward function for the DSO agent are described
as follows:
Figure 6.5: Dynamic Speed Offset (DSO) Agent
States:
The states of the DSO agent are defined as
$$X_{so} = [T_c^U, T_c^D, w_a^U, w_a^D, L] \tag{6.8}$$
where $T_c^U$ is the cycle length of the upstream signal, $T_c^D$ is the cycle length of the downstream
signal, $w_a^U$ is the queue length at the Northbound approach of the upstream intersection, $w_a^D$
is the queue length at the Southbound approach of the downstream intersection, and $L$ is
the link distance.
$T_c^U$ and $T_c^D$ are determined by the TSC agent after each control cycle. $w_a^U$ is observed
every 30 seconds and we take the average of all observed values of $w_a^U$ during each control
cycle as its state value. The same applies to $w_a^D$. The link distance $L$ is assumed known for
each pair of adjacent intersections within the network.
The state space of each state variable is also discretized to accelerate the training process.
The state space of $T_c^U$ and $T_c^D$, denoted as $S_c$, is the same as the action space of the cycle
length $A_c$, i.e. $S_c = \{40, 50, 60, ..., 180\}$ in seconds. The state space of $w_a^U$ and $w_a^D$, denoted
as $S_{w_a}$, is set to $\{0, 50, 100, ..., 250\}$ in meters. The state space of $L$, denoted as $S_L$, is defined
as $\{1000, 1100, 1200, ..., 2500\}$ in meters.
Actions:
The actions executed by the DSO agent are expressed as
$$A_{so} = [T_o, v_r^D, v_r^U] \tag{6.9}$$
where $T_o$ is the offset of signal $j$ with respect to signal $j-1$ expressed in seconds, which
means that the cycle initiation of signal $j$ is later than that of signal $j-1$ by $T_o$ seconds. $v_r^D$ is the
recommended speed for downstream traffic in link $j-1$, and $v_r^U$ is the recommended speed
for upstream traffic in link $j-1$.
It is unnecessary for the offset $T_o$ to be larger than the cycle length of signal $j-1$ [119],
and thus, the action space of $T_o$, denoted as $A_o$, is defined as $\{0, 5, 10, ..., T'_c - 5\}$ in seconds,
where $T'_c$ is the cycle length of the upstream signal. The action space of $v_r^D$ and $v_r^U$, denoted
as $A_{v_r}$, is set to $\{40, 45, 50, ..., 80\}$ in km/h.
Reward:
The task for the DSO agent is to reduce the average travel time on link j − 1 and the queue
lengths at both intersections by setting a proper offset and speed recommendations. Note
that higher speed does not necessarily reduce the travel time as the vehicle may run into the
red phase at the signal. In that case, we prefer to let the vehicle approach the signal with
lower speed so that it passes directly under the green light and does not join the red light
queue.
Guided by the above ideas, we define the reward function for the DSO agent as follows:
$$R_{so} = \max\left\{0,\ 1 - \frac{w_a^U}{w_a^r}\right\} \cdot \max\left\{0,\ 1 - \frac{w_a^D}{w_a^r}\right\} \cdot \frac{3L}{4 T'_t v_a} \tag{6.10}$$
where $w_a^r$ is the reference queue length of an intersection approach and $T'_t$ is the average
travel time for both directions on link $j-1$. $T'_t$ is calculated using (6.6). In the simulations,
we set $w_a^r = 200$ m.
$R_{so}$ reaches the maximum value of 1 if all vehicles pass through the link at the maximum
possible speed (80 km/h) and the two approaches have no queue at all. The minimum value
of $R_{so}$ is 0, which is obtained when the queue at either of the two approaches reaches or exceeds the reference
value $w_a^r$. A larger value of $R_{so}$ reflects shorter intersection queues and a lower link travel time.
Note that the objective of Q-learning is to maximize a cumulative long-term reward where
the reward of each time step is computed by (6.10).
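The factor $3L/(4 T'_t v_a)$ in (6.10) equals 1 exactly when the link is traversed at 80 km/h with $v_a = 60$ km/h, which the sketch below checks numerically. The function name `dso_reward` and its argument names are hypothetical, and the 1200 m link used in the example is an arbitrary value from $S_L$.

```python
def dso_reward(w_up, w_down, link_length, travel_time, w_ref=200.0,
               speed_limit_kmh=60.0):
    """Immediate DSO reward of (6.10).

    Queue lengths and link_length are in meters, travel_time is in seconds.
    """
    if travel_time <= 0:
        raise ValueError("travel_time must be positive")
    v_a = speed_limit_kmh / 3.6                      # km/h -> m/s
    up_term = max(0.0, 1.0 - w_up / w_ref)           # upstream approach queue
    down_term = max(0.0, 1.0 - w_down / w_ref)       # downstream approach queue
    return up_term * down_term * 3.0 * link_length / (4.0 * travel_time * v_a)


# Travel time over a 1200 m link at the top recommended speed of 80 km/h.
fastest = 1200.0 / (80.0 / 3.6)
```

Here `dso_reward(0.0, 0.0, 1200.0, fastest)` evaluates to 1 (up to floating-point error), and the reward drops to 0 as soon as either queue reaches 200 m.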
6.3.3 Q-Learning
A crucial factor in enhancing the effectiveness of the proposed arterial traffic control method
is the coordination between the TSC agent and the DSO agent. Most rule-based and
simulation-based algorithms have each control component accomplish its own objective without any coordination [120, 121, 122]. An optimization framework may provide a
solution that facilitates the coordination, as multiple sub-controllers work toward a common objective function [123, 62, 59]. However, the optimization problem formulation for the
road network under consideration is overly complex considering the network size and the
mixed freeway and arterial environment, which leads to very time-consuming computations
that limit its practicality. A more computationally feasible approach that also takes the
coordination and optimality into account is reinforcement learning (RL) [77, 115, 81]. The
rest of this section covers the derivation of the Q-value update equation and the
reason for choosing Q-learning algorithms, in more detail than section 5.2.1 in
the previous chapter.
During the RL training process, the agent learns the best policy $\pi$ that maximizes the
total expected discounted reward $V_\pi(x)$ for each state $x$ via trial and error. Note that $V_\pi(x)$
is a long-term return and should not be confused with the immediate rewards of both agents
defined in (6.7) and (6.10). $V_\pi(x)$ is expressed as follows
$$V_\pi(x) = E_\pi\left[\sum_{k=0}^{\infty} \gamma^k R_k \,\middle|\, x\right] \tag{6.11}$$
where $x$ is the current state defined in (6.1) and (6.8), $\pi$ is a policy that suggests which action
should be taken for each possible state, $k = 0$ is the current time step, $R_k$ is the reward
defined in (6.7) and (6.10) at time step $k$, and $\gamma \in [0, 1)$ is the discount factor that determines
the importance of future rewards. A $\gamma$ close to 0 makes the agent short-sighted (only
considering current rewards), while a $\gamma$ close to 1 makes it far-sighted (considering future
rewards). A future reward is discounted more heavily the further it lies from the current
time.
Without loss of generality, we assume the states of both TSC and DSO agents are memoryless, which means their future states only depend on the current states and actions.
Therefore, searching for the optimal policy $\pi$ can be formulated as a Markov Decision Process
(MDP) problem and solved by the Bellman equation [124, 72]. To express $V_\pi(x)$ using the
Bellman equation, we first split it into the immediate reward and the discounted future
rewards
$$V_\pi(x) = E_\pi\left[R_0 + \gamma \sum_{k=1}^{\infty} \gamma^{k-1} R_k \,\middle|\, x\right] \tag{6.12}$$
The term $\sum_{k=1}^{\infty} \gamma^{k-1} R_k$ is the sum of discounted future rewards starting from the next time
step $k = 1$, which is essentially $V_\pi(x')$ where $x'$ is the state at $k = 1$. Therefore, we can
rewrite $V_\pi(x)$ as
$$V_\pi(x) = E_\pi\left[R_0 + \gamma V_\pi(x') \,\middle|\, x\right] \tag{6.13}$$
Then we expand the expectation by considering the policy $\pi$ and the transition probability to derive the Bellman equation [124] for $V_\pi(x)$ as
$$V_\pi(x) = \sum_{a} \pi(a|x) \sum_{x'} P_a(x, x') \left(R_a(x, x') + \gamma V_\pi(x')\right) \tag{6.14}$$
where $\pi(a|x)$ is the probability of taking action $a$ in the current state $x$ under policy $\pi$,
$P_a(x, x')$ is the transition probability from $x$ to the future state $x'$ via action $a$, and $R_a(x, x')$
is the reward obtained from the above transition. $x$, $a$, $R$ are the states, actions and reward specified by (6.1), (6.2), (6.7) for the TSC agent and (6.8), (6.9), (6.10) for the DSO agent,
respectively.
$V_\pi(x)$ in (6.14) is defined recursively and its optimal value can potentially be solved using
dynamic programming (DP). However, the transition probability $P_a(x, x')$ is unknown in the
traffic environment, which motivates us to adopt model-free techniques such as Q-learning
(QL) [72]. The Q-value $Q(x, a)$ is defined as the total expected discounted reward after
we take action $a$ in state $x$. $Q(x, a)$ directly tells us the expected return of each action without needing a
model to calculate it.
According to the definition of $Q(x, a)$ and the Markov property, the optimal $V_\pi(x)$ is
derived by selecting the best action at each state
$$V_\pi^*(x) = \max_{a} Q^*(x, a) \tag{6.15}$$
Therefore, searching for the optimal policy $\pi$ is equivalent to finding the action that maximizes
the Q-value for each state
$$\pi^*(x) = \arg\max_{a} Q^*(x, a) \tag{6.16}$$
Plugging the above optimal policy into (6.14), we have
$$Q^*(x, a) = \sum_{x'} P_a(x, x') \left(R_a(x, x') + \gamma \max_{a'} Q^*(x', a')\right) \tag{6.17}$$
Equivalently
$$Q^*(x, a) = R_a(x, x') + \gamma \sum_{x'} P_a(x, x') \max_{a'} Q^*(x', a') \tag{6.18}$$
where $\max_{a'} Q^*(x', a')$ represents the maximum possible future return after moving to the
future state $x'$ and taking the best action $a'$.
The Q-value iteration equation (6.18) cannot be solved directly since the transition probability is
unknown. However, it indicates that the optimal Q-value for each state-action pair should
converge to $R_a(x, x') + \gamma \max_{a'} Q^*(x', a')$, which motivates the following Q-value update
equation [72].
$$Q(x, a) \leftarrow Q(x, a) + \eta \left[R(x, a) + \gamma \max_{a'} Q(x', a') - Q(x, a)\right] \tag{6.19}$$
where $\eta$ is the learning rate that determines to what extent the newly acquired information
will override the old information. $R(x, a) + \gamma \max_{a'} Q(x', a')$ is the newly acquired estimate
of the Q-value, which consists of both the immediate reward and the discounted future rewards.
The discount factor $\gamma$ is selected as 0.9 [125]. We consider that the convergence of $Q(x, a)$ is
achieved when the difference between the updated $Q(x, a)$ and the previous $Q(x, a)$ is less
than 0.01, i.e. $\epsilon_q = 0.01$.
In the case of TSC agents, the state x refers to Xts specified by (6.1). The action a refers
to Ats specified by (6.2). The immediate reward R(x, a) is computed by (6.7). In the case of
DSO agents, the state x refers to Xso specified by (6.8). The action a refers to Aso specified
by (6.9). The immediate reward R(x, a) is computed by (6.10).
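The update rule (6.19) and the convergence test can be sketched with a dictionary-based Q-table. The names `q_update` and `has_converged` are hypothetical, and the learning rate `eta` is passed in as a plain argument because its schedule (5.10) lies outside this section; the value 0.1 in the example is only a placeholder.

```python
GAMMA = 0.9        # discount factor quoted in the text
EPSILON_Q = 0.01   # convergence threshold quoted in the text


def q_update(q_table, x, a, reward, x_next, next_actions, eta):
    """Apply the update (6.19) in place and return the absolute change."""
    best_next = max(q_table[(x_next, b)] for b in next_actions)
    target = reward + GAMMA * best_next            # newly acquired estimate
    delta = eta * (target - q_table[(x, a)])
    q_table[(x, a)] += delta
    return abs(delta)


def has_converged(change):
    """Convergence test for a single state-action pair."""
    return change < EPSILON_Q


# Toy Q-table with two states and two actions (illustrative values only).
q = {("x0", "a"): 0.5, ("x1", "a"): 0.2, ("x1", "b"): 0.8}
change = q_update(q, "x0", "a", reward=0.6, x_next="x1",
                  next_actions=["a", "b"], eta=0.1)
```

In this toy example the target is $0.6 + 0.9 \times 0.8 = 1.32$, so the entry moves from 0.5 to 0.582, a change of 0.082 that does not yet satisfy the convergence test.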
The design of the learning rate $\eta$ follows (5.10). Instead of using (5.11), we switch to an
action selection strategy based on a softmax function as follows [126]
$$p(a|x) = \frac{\exp(Q(x, a)/T)}{\sum_{b \in A} \exp(Q(x, b)/T)} \tag{6.20}$$
where p(a|x) is the probability of selecting action a at state x, and T is the temperature.
T is a hyperparameter that controls the level of exploration. Lower T makes the policy
more deterministic, favoring exploitation, while higher T increases exploration by making
the policy more stochastic. In our design, we want both agents to favor exploitation slightly
more than exploration and thus select T = 0.5 [125].
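A minimal sketch of the softmax selection in (6.20) with $T = 0.5$ follows. The function names are hypothetical, and subtracting the maximum Q-value before exponentiating is a standard numerical-stability step added here; it leaves the probabilities unchanged.

```python
import math
import random


def softmax_probabilities(q_values, temperature=0.5):
    """Selection probabilities of (6.20) for one state."""
    top = max(q_values)
    # Shifting by the maximum avoids overflow without changing the result.
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(actions, q_values, temperature=0.5, rng=random):
    """Sample one action according to the softmax probabilities."""
    probabilities = softmax_probabilities(q_values, temperature)
    return rng.choices(actions, weights=probabilities, k=1)[0]
```

For two actions with Q-values 0.2 and 0.7, the better action is chosen with probability $1/(1 + e^{-1}) \approx 0.73$, so the weaker action is still explored about a quarter of the time.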
6.3.4 Training Process
The training process depicted in Figure 6.6 is carried out using microscopic simulation software PTV VISSIM 10 in a simulation network that will be illustrated in section 6.4. The
freeway demands are generated using hourly traffic volume data from the Caltrans Performance Measurement System (PeMS) in April 2019. The arterial demands are generated
using hourly traffic counts from LADOT Database in April 2019. The Q-value for each
state-action pair is initialized randomly using a uniform distribution on [0, 1] to encourage
exploration [72]. This range is consistent with the range of the rewards of both agents so
that overshooting can be effectively avoided.
As shown in Figure 6.6, the simulator starts by loading corresponding traffic demand at
each entrance of the road network and assigning default signal plans and speed limits. The
training of both TSC and DSO agents begins after a 10-min warm-up. At the beginning of
each control cycle, the TSC agent first collects the state Xts for each signal, and selects an
action Ats using (6.20) based on current Q-values. The network control is deactivated during
the training process to encourage the exploration, and the TSC action of different signals
may vary. At the end of the control cycle, a reward value is computed using (6.7) for each
signal. The new state $X'_{ts}$ is collected. Then we update the Q-value for the state-action pair
(Xts, Ats) using (6.19). Since there are K signals within the arterial network, the Q-values
are updated K times in total after one control cycle. The training of the DSO agent follows
a similar procedure except its Q-values are updated (K −1) times due to the number of links
being (K − 1).
The training process completes when Q(x, a) converges for all state-action pairs of both
agents. Considering the restrictions of VISSIM, each training simulation run lasts for 12
hours and the whole training process may take hundreds of simulation runs to complete. To
fully utilize the traffic data, the 12-hour window of each simulation run is shifted forward by
1 hour from the previous run. For example, if the current run uses the traffic data from 6
a.m. to 6 p.m. of a specific day, the next run will use the traffic data from 7 a.m. to 7 p.m.
of the same day.
Figure 6.6: Training Process of TSC and DSO Agent
We introduce a freeway incident for two purposes: the first is to bring new traffic
conditions into the training to encourage the exploration of more state-action pairs by both
agents; the second is to activate the full version of freeway traffic control, which
consists of variable speed limit (VSL), lane change (LC) and ramp metering (RM), so that
the agents in the arterial network can be trained to work with the freeway control. During
the training, the incident on the freeway has a probability of 0.25 of occurring at the beginning of
each hour and is cleared in 20 minutes.
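The sliding data window and the random incident schedule described above can be sketched as follows. Both function names are hypothetical, the 6 a.m. start of the first run is taken from the example in the text, and how the window behaves once it passes midnight is not specified there, so hours are simply left unwrapped.

```python
import random


def training_window(run_index, first_start_hour=6, length_hours=12):
    """Start and end hour of the data window used by a training run."""
    start = first_start_hour + run_index      # shifted by 1 hour per run
    return start, start + length_hours


def incident_schedule(run_hours=12, probability=0.25, duration_min=20,
                      rng=random):
    """Sample incident intervals (start, end) in minutes for one run."""
    intervals = []
    for hour in range(run_hours):
        # An incident may occur only at the beginning of each hour.
        if rng.random() < probability:
            start = hour * 60
            intervals.append((start, start + duration_min))
    return intervals
```

With a probability of 0.25 per hour, a 12-hour run contains three incidents on average.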
6.4 Experimental Study
6.4.1 Simulation Network and Parameters
We implement the proposed arterial traffic control strategies together with the freeway traffic control
strategies designed in chapter 5 and perform microscopic simulations using
the commercial software PTV Vissim 10. The simulation road network is the same as
the one used in chapter 5, and is reproduced in Figure 6.7 for readability.
The network consists of a 16-km segment of the I-710 freeway and an adjacent arterial corridor with 7 signalized intersections in Los Angeles, California, United States. The freeway
segment is divided into 6 sections and one upstream section, where the freeway controller
deploys the first VSL sign dynamically [110]. There are 5 on-ramps and 6 off-ramps that
connect the freeway segment with the arterial region. The selected intersections are major
ones whose traffic states are strongly correlated with the traffic states of neighboring freeway sections. To simplify the simulation network, some minor arterial streets and vertical
freeways are ignored, and the selected intersections are directly linked.
Figure 6.7: I-710 Simulation Road Network
Traffic demands are produced at one freeway entrance and 16 arterial entrances as implied
by arrows in Figure 6.7. The freeway demands are generated using hourly traffic volume data
from the Caltrans Performance Measurement System (PeMS) in April 2019. The arterial
demands are generated using hourly traffic counts from LADOT Database in April 2019.
The turning ratio at an intersection approach or a ramp is determined by the ratio of the
historical average traffic counts of each direction based on the LADOT data. The incident
indicated in Figure 6.7 triggers a side-lane closure on the freeway and potentially impacts both
freeway and arterial traffic. To evaluate the trained agents, three levels of traffic demands are
considered: a low-demand level based on 1-3 a.m. weekday traffic, a moderate-demand level
based on 12-2 p.m. weekday traffic, and a high-demand level based on 5-7 p.m. weekday
traffic. Each evaluation simulation run lasts for 40 min. In the presence of an incident, it
occurs after a 10-minute warm-up and is cleared at 30 minutes.
With three demand levels and the incident occurrence option, we have a total of six
evaluation scenarios. The control strategies to be evaluated under these scenarios
are listed as follows:
• No freeway control (NFC): the available freeway control components (variable speed
limit, lane change, ramp metering) are all inactive.
• Integrated freeway control (IFC): a freeway control strategy that coordinates all the
control components, proposed in chapter 5.
• Fixed-time arterial traffic signal control (FAC): each signal has a fixed cycle of 120 s,
a fixed split and zero offset. The arterial speed limit is fixed to 60 km/h.
• MAXBAND: a classic arterial signal control algorithm that maximizes the progression
bandwidth [56].
• QL-based arterial traffic control without unification (QAC): the proposed arterial traffic
control strategy without unifying the cycle and split.
• QL-based arterial traffic control with unification (QACU): the proposed arterial traffic
control strategy with active unification logic.
6.4.2 Evaluation Criteria
We evaluate the performance of a few combinations of the above control strategies for each
scenario of interest. Note that the warm-up period is excluded from all evaluations.
The evaluation criteria are listed as follows [127]:
• Freeway average travel time ($T_t^f$): the average time spent by each vehicle to travel
through the freeway segment. The computation of $T_t^f$ follows (6.6). Vehicles that
enter or exit from ramps are excluded.
• Arterial average travel time ($T_t^a$): the average time spent by each vehicle to travel
through the entire arterial region. The computation of $T_t^a$ follows (6.6). Only vehicles
that have travelled from intersection 1 (I1) to intersection 7 (I7) through the arterial
road are counted.
• Arterial average number of stops ($\bar{s}$): the average number of stops performed by each
vehicle when traveling through the entire arterial region.
$$\bar{s} = \frac{1}{N_v} \sum_{i=1}^{N_v} s_i \tag{6.21}$$
where $s_i$ is the number of stops performed by vehicle $i$. Only vehicles that have travelled
from I1 to I7 through the arterial road are counted.
• Arterial average emission rate of CO2 ($E$): calculated using the MOVES model proposed by the Environmental Protection Agency [103].
$$E = \sum_{i=1}^{N_v} E_i \Big/ \sum_{i=1}^{N_v} l_i \tag{6.22}$$
where $E_i$ is the emission produced by vehicle $i$ and $l_i$ is the travelled distance of vehicle
$i$. Only vehicles that have travelled from I1 to I7 through the arterial road are counted.
• Average off-ramp queue length ($\bar{w}_s$):
$\bar{w}_s = \sum_{i=1}^{N} \bar{w}_{s,i} / N_s$ (6.23)
where $N$ is the number of freeway sections, $\bar{w}_{s,i}$ is the average queue length of off-ramp
$i$ during the simulation, and $N_s$ is the number of off-ramps.
• Average queue length of arterial intersections ($\bar{w}_a$):
$\bar{w}_a = \sum_{k=1}^{K} (\bar{w}_k^N + \bar{w}_k^S + \bar{w}_k^E + \bar{w}_k^W) / 4K$ (6.24)
where $K$ is the number of arterial intersections, and $\bar{w}_k^N$ is the average queue length of the
Northbound approach of intersection $k$ during the simulation.
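The criteria in (6.21)-(6.24) are simple averages, and a minimal Python sketch may make the bookkeeping concrete. The function names and the input layout (plain lists of per-vehicle or per-ramp values) are illustrative assumptions, not part of the simulation toolchain. For (6.23) we assume that every off-ramp contributes exactly one average queue value, so that dividing by $N_s$ is the same as dividing by the length of the list.

```python
def average_stops(stops):
    """Eq. (6.21): mean number of stops over the counted vehicles."""
    return sum(stops) / len(stops)


def emission_rate(emissions_g, distances_km):
    """Eq. (6.22): total CO2 emitted divided by total distance travelled."""
    return sum(emissions_g) / sum(distances_km)


def average_offramp_queue(offramp_queues_m):
    """Eq. (6.23), assuming one average queue value per off-ramp."""
    return sum(offramp_queues_m) / len(offramp_queues_m)


def average_intersection_queue(approach_queues_m):
    """Eq. (6.24): input is one (N, S, E, W) tuple of average queues per intersection."""
    k = len(approach_queues_m)
    return sum(sum(q) for q in approach_queues_m) / (4 * k)
```

Note that (6.22) is a ratio of sums rather than a mean of per-vehicle rates, so vehicles that travel longer distances weigh more heavily in $E$.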
6.4.3 Evaluation Results
We present the evaluation results of the six scenarios of interest in six tables, one per scenario. Each
value in these tables is the average of ten random simulation runs. There are two freeway
control strategies and four arterial control strategies to be evaluated, as described in section 6.4.1. However, some options are not necessary for some scenarios. For
instance, the integrated freeway control (IFC) is not needed when there is no incident on
the freeway, since the freeway demand is always within its normal capacity. Thus, we only focus on representative combinations of freeway and arterial control strategies for the sake of
simplicity.
Table 6.1 presents the evaluation results of a low-demand scenario without the freeway
incident. In table 6.1, the freeway travel time $T_t^f$ is not affected by the variation of arterial
control since the freeway traffic always follows the free-flow speed without any restriction. The
fixed-time policy (FAC) serves as the reference method and delivers the worst performance
under all arterial performance measurements. The percentages in brackets quantify the
performance improvement by implementing the corresponding control scheme versus FAC.
The QL-based control (QAC and QACU) performs better than the MAXBAND, especially
in terms of the travel time and the number of stops. The unification logic also slightly
improves the overall performance as we compare the results of QAC with QACU.
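As a small worked example of how the bracketed percentages can be read, the sketch below computes the relative reduction of a measurement with respect to the FAC baseline. The helper name and the rounding to the nearest integer are our assumptions; the inputs are the FAC and MAXBAND entries of table 6.1.

```python
def improvement_pct(baseline, value):
    """Relative reduction versus the baseline, in percent, rounded to an integer."""
    return round(100.0 * (baseline - value) / baseline)


# Arterial travel time, FAC vs. MAXBAND in table 6.1
print(improvement_pct(968, 853))  # 12
# Number of stops, FAC vs. MAXBAND in table 6.1
print(improvement_pct(4.5, 3.3))  # 27
```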
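Table 6.1: Low-Demand without Incident
| Freeway control | NFC | NFC | NFC | NFC |
|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU |
| $T_t^f$ (s) | 634 | 634 | 633 | 633 |
| $T_t^a$ (s) | 968 | 853 (12%) | 816 (16%) | 800 (17%) |
| $\bar{s}$ | 4.5 | 3.3 (27%) | 2.5 (44%) | 2.3 (49%) |
| $E$ (g/veh/km) | 246.9 | 231.8 (6%) | 226.7 (8%) | 221.3 (10%) |
| $\bar{w}_s$ (m) | 0 | 0 | 0 | 0 |
| $\bar{w}_a$ (m) | 15 | 7.8 (48%) | 7.1 (53%) | 6.7 (55%) |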
Table 6.2 presents the evaluation results of a moderate-demand scenario without the
freeway incident. The demand change significantly increases the queue measurements ($\bar{w}_s$
and $\bar{w}_a$) compared with table 6.1. The percentage improvements of each arterial traffic
control strategy are close to those in table 6.1 in terms of the travel time, the number of
stops and the emission rates. The QL-based control has a strong effect on off-ramp queue
($\bar{w}_s$) dissipation because the off-ramp queue is part of the TSC agent's reward function. The
QL-based control also outperforms the MAXBAND with regard to the intersection queue
($\bar{w}_a$), which is taken into account in the DSO agent's reward function.
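Table 6.2: Moderate-Demand without Incident
| Freeway control | NFC | NFC | NFC | NFC |
|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU |
| $T_t^f$ (s) | 637 | 639 | 638 | 638 |
| $T_t^a$ (s) | 981 | 861 (12%) | 841 (14%) | 834 (15%) |
| $\bar{s}$ | 4.8 | 3.5 (27%) | 2.9 (40%) | 2.7 (44%) |
| $E$ (g/veh/km) | 254 | 237.2 (7%) | 228.4 (10%) | 226.5 (11%) |
| $\bar{w}_s$ (m) | 52.6 | 46.7 (11%) | 8.8 (83%) | 8.6 (84%) |
| $\bar{w}_a$ (m) | 47.1 | 34.2 (23%) | 28.4 (40%) | 27.1 (42%) |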
Table 6.3 presents the evaluation results of a high-demand scenario without the freeway
incident. The demand increase creates longer queues at off-ramps and intersections that
potentially introduce congestion in these areas. Compared with tables 6.1 and 6.2, the
percentage improvements brought by the QL-based control diminish in terms of the arterial
travel time and the number of stops, and move closer to those of the MAXBAND. The reason is
that the QL-based algorithms prioritize queue dissipation at an off-ramp when its queue approaches
the capacity. They still outperform the MAXBAND in both queue measurements as the
MAXBAND does not consider real-time queues at all. Another interesting observation is
that the freeway travel time is improved by implementing the QL-based arterial control,
because the off-ramp queue does not accumulate and no bottleneck forms on the freeway.
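Table 6.3: High-Demand without Incident
| Freeway control | NFC | NFC | NFC | NFC |
|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU |
| $T_t^f$ (s) | 755 | 752 | 704 (7%) | 705 (7%) |
| $T_t^a$ (s) | 1037 | 948 (7%) | 947 (7%) | 940 (8%) |
| $\bar{s}$ | 6.4 | 5.5 (14%) | 5.6 (12%) | 5.2 (19%) |
| $E$ (g/veh/km) | 272.6 | 250.7 (8%) | 243.8 (11%) | 239.4 (12%) |
| $\bar{w}_s$ (m) | 457.3 | 356.6 (22%) | 82.2 (82%) | 80.1 (82%) |
| $\bar{w}_a$ (m) | 176.1 | 150.9 (14%) | 83.8 (40%) | 81.5 (42%) |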
Table 6.4 presents the evaluation results of a low-demand scenario with the freeway
incident. To alleviate the congestion brought by the incident, we implement the integrated
freeway traffic control method proposed in chapter 5. Meanwhile, we incorporate different
arterial traffic control strategies and examine their performance in the presence of the
freeway control. The percentage improvements of each type of arterial control over FAC
in table 6.4 are close to those in table 6.1, which implies that the freeway control has
no significant impact on the arterial performance under low demands. This is also verified by
comparing the NFC+QACU case with the IFC+QACU case, as they deliver similar arterial performance.
Table 6.4: Low-Demand with Incident
| Freeway control | IFC | IFC | IFC | IFC | NFC |
|---|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU | QACU |
| $T_t^f$ (s) | 653 | 653 | 651 | 651 | 730 |
| $T_t^a$ (s) | 979 | 852 (13%) | 835 (15%) | 817 (17%) | 807 (18%) |
| $\bar{s}$ | 4.4 | 3.4 (23%) | 3.2 (27%) | 2.8 (36%) | 2.4 (45%) |
| $E$ (g/veh/km) | 246.1 | 232.1 (6%) | 229.6 (7%) | 225.1 (9%) | 221.9 (10%) |
| $\bar{w}_s$ (m) | 0 | 0 | 0 | 0 | 0 |
| $\bar{w}_a$ (m) | 13.9 | 7.9 (43%) | 7.4 (47%) | 6.6 (53%) | 6.3 (55%) |
Table 6.5 presents the evaluation results of a moderate-demand scenario with the freeway
incident. The percentage improvements of each type of arterial control over FAC in this table
are close to those in table 6.2. The off-ramp and intersection queue dissipation by QL-based
control is also observed in the presence of the freeway incident and the freeway control.
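Table 6.5: Moderate-Demand with Incident
| Freeway control | IFC | IFC | IFC | IFC | NFC |
|---|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU | QACU |
| $T_t^f$ (s) | 669 | 670 | 660 | 659 | 745 |
| $T_t^a$ (s) | 991 | 887 (10%) | 874 (12%) | 863 (13%) | 872 (12%) |
| $\bar{s}$ | 5.3 | 3.8 (28%) | 3.5 (34%) | 3.3 (38%) | 3.5 (34%) |
| $E$ (g/veh/km) | 253.9 | 237 (7%) | 232.5 (8%) | 228.6 (10%) | 231.2 (9%) |
| $\bar{w}_s$ (m) | 46.7 | 40.8 (13%) | 5.6 (88%) | 5.3 (89%) | 5.8 (88%) |
| $\bar{w}_a$ (m) | 49.2 | 39.5 (20%) | 35.3 (28%) | 27.6 (44%) | 34 (31%) |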
Table 6.6 presents the evaluation results of a high-demand scenario with the freeway
incident. With the QL-based control, the trade-off between the queue lengths
and the other performance measurements discussed for table 6.3 is observed in table
6.6 as well. The trade-off also leads to a shorter travel time on the freeway as the off-ramp queue does
not accumulate. Under the high demand, the performance of NFC+QACU is slightly worse
than that of IFC+QACU, which is different from the moderate-demand or low-demand case. A
possible reason is that the incident produces unbalanced off-ramp traffic flows that lead to
an incompatibility between the unified signal plan and some intersection demands. The
integrated freeway control is effective in balancing off-ramp traffic flows and thus improves
the performance of QACU.
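Table 6.6: High-Demand with Incident
| Freeway control | IFC | IFC | IFC | IFC | NFC |
|---|---|---|---|---|---|
| Arterial control | FAC | MAXBAND | QAC | QACU | QACU |
| $T_t^f$ (s) | 766 | 763 | 726 (5%) | 728 (5%) | 847 |
| $T_t^a$ (s) | 1057 | 955 (10%) | 943 (11%) | 932 (12%) | 945 (11%) |
| $\bar{s}$ | 7.2 | 5.3 (26%) | 4.7 (35%) | 4.3 (40%) | 5.1 (29%) |
| $E$ (g/veh/km) | 273.8 | 247.1 (10%) | 244 (11%) | 239.7 (12%) | 240.6 (12%) |
| $\bar{w}_s$ (m) | 159.4 | 121.9 (24%) | 49.4 (69%) | 45.1 (72%) | 86.7 (46%) |
| $\bar{w}_a$ (m) | 132.8 | 105.8 (20%) | 66.7 (50%) | 57.8 (56%) | 73.8 (44%) |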
The evaluation results in the above six tables reveal a consistent performance ranking
of the listed arterial traffic control strategies, that is, QACU, QAC, MAXBAND, FAC from
best to worst. Under low traffic demands, the QL-based arterial control produces more
benefit than the classic MAXBAND in terms of the travel time, the number of stops and
the emission rates in the arterial region. Under high traffic demands, the above-mentioned
performance difference is less obvious because the QL agents are designed to prevent queue
spillbacks at off-ramps and intersections, while the MAXBAND does not consider them. As a
result, the QL-based control reduces the average queue lengths at off-ramps and intersections
significantly. Another interesting observation is that the freeway traffic control and the
arterial traffic control improve the performance of each other under high traffic demands.
The reason is that the arterial control dissipates the off-ramp queue and prevents freeway
bottlenecks, and the freeway control balances the off-ramp flow rates and the demands of
arterial intersections, which suits the unified TSC actions generated by the network control.
Chapter 7: Integration of Freeway and Arterial Traffic Control
7.1 Introduction
The lack of communication between freeway and arterial road networks leads to sub-optimal
traffic operation efficiency and frequent congestion at on-ramps and off-ramps in urban transportation systems. To address the issue, we propose an integrated freeway and arterial traffic
control strategy using a Q-learning (QL) framework, which consists of the freeway traffic control (FTC) agent proposed in chapter 5 and the arterial traffic signal control (TSC) agent
and dynamic speed offset (DSO) agent proposed in chapter 6. The FTC agent exploits adjacent arterial signal timing and intersection demands to estimate on-ramp demands and takes
proactive control actions. The TSC agent is able to adjust the signal timing to facilitate
queue dissipation in the nearest on-ramp and off-ramp. The DSO agent computes the relative offset and provides speed recommendations for consecutive arterial intersections. The
states of each agent include the control action from other agents or real-time measurements
from the neighboring environment, which enhances the communication and coordination between different control components within the network. We compare the proposed approach
with multiple semi-coordinated variations to quantify the benefit of each coordination mechanism using microscopic traffic simulations in different scenarios. The coordination proves
advantageous for arterial traffic under low or moderate demands. In high-demand scenarios,
the fully-coordinated approach significantly reduces the queues at on-ramps and off-ramps,
as well as travel time on the freeway, albeit with a slight increase in travel time on arterial
roads when compared with the scenarios under low and moderate demands. Furthermore,
it demonstrates consistent performance across different freeway incident locations.
7.2 Problem Statement
Consider a road network that includes a freeway segment and the adjacent arterial streets
as depicted in Figure 7.1. Despite the fact that the freeway and arterial traffic interact
frequently via ramps, they are controlled separately with very little communication in most
modern transportation systems. In rush hours, this often leads to queue overspills at on-ramps and off-ramps, which extends the traffic congestion from one area to another and
severely deteriorates the traffic mobility. To address the on-ramp queue issue, some ramp
metering (RM) algorithms that consider the balance of freeway occupancy and on-ramp
queue length have been proposed [47, 128]. However, the performance improvement by RM
alone is limited under high traffic demands as the significant on-ramp queue forces RM to
switch off. A promising alternative is to incorporate arterial traffic signal control (TSC)
and freeway lane change (LC) recommendations with RM. More specifically, TSC adjusts
the signal timing to reduce the on-ramp demand from arterial side, and LC encourages the
freeway traffic to leave more space for on-ramp merging.
On the other hand, the off-ramp queue overspill can be prevented by an adaptive design
of TSC as proposed in chapter 6. It should accommodate more off-ramp traffic into the
arterial network when a significant off-ramp queue has been observed.
In addition to RM, LC and TSC, we also incorporate variable speed limit (VSL) on
freeway and a dynamic speed offset (DSO) coordination on arterials to further improve
the traffic operation efficiency. The VSL control is effective in handling freeway bottleneck
congestion produced by incidents or on-ramp merging [110]. The DSO control allows more
vehicles to pass the intersections during the green phase and reduces the number of stops caused by
red lights. Note that we slightly modify the designs of the FTC, TSC and DSO agents
in this chapter to enhance the coordination.
Figure 7.1: Connected Freeway and Arterial Road Network
To coordinate all the control components mentioned above, we adopt a Q-learning framework due to its flexibility and fast real-world implementations [72, 105]. The objective of
the coordinated traffic control is to minimize the travel time of both freeway and arterial
networks and meanwhile maintain the queues at ramps and intersections at a reasonable
level. Moreover, we compare the proposed approach with multiple under-coordinated control designs to quantify the benefit brought by coordination and communication.
7.3 Methodology
The proposed integrated traffic control consists of three Q-learning (QL) agents - a freeway
traffic control (FTC) agent that coordinates variable speed limit (VSL), lane change (LC),
ramp metering (RM) actions, a traffic signal control (TSC) agent that determines signal
plans for arterial intersections, and a dynamic speed offset (DSO) agent that computes a
relative offset and speeds for each pair of adjacent intersections. They have been modified
from the previous versions proposed in chapters 5 and 6 to enhance the coordination and improve
the control performance. The details of each agent and the QL framework that coordinates
them are described in the rest of this section.
7.3.1 Freeway Traffic Control Agent
The freeway traffic control (FTC) agent determines proper VSL, LC and RM actions based
on observed traffic conditions in a road network depicted in Figure 7.2. To train the agent
with Q-learning (QL) algorithms, first we need to design the states, actions and reward for
the FTC agent.
Figure 7.2: Freeway Traffic Control (FTC) Agent
States
The states of the FTC agent are defined as follows
$X_f = [\rho_i, q_{i,net}, w_i^o, n_c, \tilde{d}_k^F]$ (7.1)
where $i = 1, 2, ..., N$ is the index of the freeway section, $k$ is the index of the adjacent arterial
intersection, $\rho_i$ is the freeway vehicle density, $q_{i,net} = q_i - q_{i+1} + r_i - s_i$ is the net incoming
flow, $w_i^o$ is the on-ramp queue length, $n_c$ is the number of closed lane(s) due to an incident,
and $\tilde{d}_k^F$ is the departure demand toward the freeway at intersection $k$. $\tilde{d}_k^F$ is estimated as the
sum of the Southbound left-turn demand $\tilde{d}_k^{S,l}$, the Eastbound through demand $\tilde{d}_k^{E,t}$, and the
Northbound right-turn demand $\tilde{d}_k^{N,r}$, as depicted in Figure 7.3. Therefore
$\tilde{d}_k^F = \tilde{d}_k^{S,l} + \tilde{d}_k^{E,t} + \tilde{d}_k^{N,r} = \tilde{d}_k^S y_k^{S,l} \frac{T_k^{S,l}}{T} + \tilde{d}_k^E y_k^{E,t} \frac{T_k^{E,t}}{T} + \tilde{d}_k^N y_k^{N,r}$ (7.2)
where $\tilde{d}_k^S$, $\tilde{d}_k^E$, $\tilde{d}_k^N$ are incoming traffic demands of intersection $k$, $y_k^{S,l}$, $y_k^{E,t}$, $y_k^{N,r}$ are the turning
ratios with their directions denoted by the superscripts (e.g. $S,l$ means Southbound vehicles
intending to turn left), $T_k^{S,l}$, $T_k^{E,t}$ are the green times of the corresponding signal phases during
the next control cycle, and $T = 30$ s is the control cycle.
Figure 7.3: Estimation of $\tilde{d}_k^F$
The estimation of $\tilde{d}_k^F$ depends on both the incoming arterial intersection demands and the
signal timing. $\tilde{d}_k^F$ is a critical state variable as the majority of this demand is the arterial
traffic desiring to enter the freeway through intersection $k$ and the on-ramp $i$. An accurate
estimation of $\tilde{d}_k^F$ enables the FTC agent to anticipate the correct amount of incoming
traffic and take proactive actions.
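The estimate in (7.2) can be sketched in a few lines of Python. The function and argument names are hypothetical, and the numbers in the usage example are made up for illustration only. Following (7.2), the left-turn and through terms are scaled by the green share of their phases within the 30 s control cycle, whereas the right-turn term is not scaled.

```python
def freeway_bound_demand(d_s, d_e, d_n, y_sl, y_et, y_nr, t_sl, t_et, cycle=30.0):
    """Eq. (7.2): departure demand toward the freeway at one intersection.

    d_s, d_e, d_n : incoming demands (veh/h) of the S, E and N approaches
    y_sl, y_et, y_nr : turning ratios (S left, E through, N right)
    t_sl, t_et : green times (s) of the two signal phases in the next control cycle
    cycle : control cycle T in seconds
    """
    left = d_s * y_sl * t_sl / cycle
    through = d_e * y_et * t_et / cycle
    right = d_n * y_nr  # no green-time scaling in (7.2)
    return left + through + right


# Illustrative values only: 60 + 150 + 100 veh/h
demand = freeway_bound_demand(600, 900, 400, 0.2, 0.5, 0.25, t_sl=15, t_et=10)
```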
To reduce the problem dimension, we discretize the state spaces for measured densities ($S_\rho$),
flows ($S_q$) and queue lengths ($S_w$) as follows
$S_\rho = \{20, 30, 40, ..., 150\}$ veh/km,
$S_q = \{0, 100, 200, ..., 4000\}$ veh/h,
$S_w = \{0, 50, 100, ..., 500\}$ m (7.3)
Then we have $\rho_i \in S_\rho$; $q_{i,net}, \tilde{d}_k^F \in S_q$; $w_i^o \in S_w$.
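A minimal sketch of the discretization in (7.3) is given below. The text does not state how a raw measurement is mapped to a grid value, so snapping to the nearest grid point (with out-of-range values clipped to the end points) is an assumption made here for illustration.

```python
# State grids from (7.3)
S_RHO = list(range(20, 151, 10))   # densities, veh/km
S_Q = list(range(0, 4001, 100))    # flows, veh/h
S_W = list(range(0, 501, 50))      # queue lengths, m


def discretize(value, grid):
    """Snap a measurement to the nearest grid value (assumed rule).

    Values outside the grid are mapped to the closest end point.
    On an exact tie the smaller grid value is returned.
    """
    return min(grid, key=lambda g: abs(g - value))
```

With these grids, a density of 37.2 veh/km maps to 40 veh/km and a queue of 124 m maps to 100 m.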
Actions
The actions executed by the FTC agent are expressed as
$A_f = [v_i, t_i^o, f_i^o]$ (7.4)
where $v_i \in \{60, 70, 80, 90, 100\}$ km/h is the speed command of VSL control,
$t_i^o \in \{0, 0.5, 1, 1.5, 2, 3, 4, 6\}$ s is the red phase duration of RM control, and $f_i^o \in \{0, 1\}$ is the flag
of LC control at the on-ramp merging area. The green phase of RM is fixed to 3 s. $f_i^o = 1$
means that the LC control is activated for on-ramp $i$. Note that the LC control is always
active for incidents on the freeway, so we do not need an action variable for it. The distance of
LC control is fixed to 800 m [110].
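To give a feel for the size of the action space in (7.4), the sketch below enumerates all combinations for a single freeway section, assuming the section has an on-ramp so that all three components apply. It also computes the share of the ramp-metering cycle that is green for a given red duration. The constant and function names are illustrative assumptions.

```python
from itertools import product

VSL_SPEEDS_KMH = (60, 70, 80, 90, 100)      # v_i
RM_RED_S = (0, 0.5, 1, 1.5, 2, 3, 4, 6)     # t_i^o
LC_FLAGS = (0, 1)                           # f_i^o
RM_GREEN_S = 3.0                            # fixed RM green phase

# All (v_i, t_i^o, f_i^o) combinations for one section: 5 * 8 * 2 = 80
FTC_ACTIONS = list(product(VSL_SPEEDS_KMH, RM_RED_S, LC_FLAGS))


def rm_green_share(red_s, green_s=RM_GREEN_S):
    """Fraction of the ramp-metering cycle that is green."""
    return green_s / (green_s + red_s)
```

A red duration of 0 s corresponds to an unmetered ramp (green share 1), while the longest red of 6 s leaves one third of the metering cycle green.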
Reward
The objective of the FTC is to reduce freeway travel time, prevent on-ramp queue overspill
and stabilize the vehicle density around the desired value. The reward function needs to
be negatively correlated with the average travel time $T_t$ and the on-ramp queue length $w_i^o$.
To avoid the overflow of the on-ramp queue, we want the reward $R_f = 0$ when $w_i^o$ exceeds the
reference value $w_i^r$. Moreover, we want the vehicle density $\rho_i$ to be close to the desired
density $\rho^*$. Assuming the bottleneck capacity is $C_b$, $\rho^*$ should correspond to a flow rate that
is slightly less than $C_b$ to avoid a potential capacity drop triggered by disturbance [110], i.e.
$\rho^* = \min\{d, 0.95 C_b\} / v_f$.
Considering the above requirements, the reward function $R_f \in [0, 1]$ is defined as
$R_f = \max\{0, (1 - \frac{w_i^o}{w_i^r})\} \cdot \frac{L_i}{T_t v_f} \cdot \min\{\frac{\rho_i}{\rho^*}, \frac{\rho^*}{\rho_i}\}$ (7.5)
The first part, $\max\{0, (1 - w_i^o / w_i^r)\}$, represents a reward for the on-ramp queue. It reaches
the maximum value of 1 when $w_i^o = 0$ and the minimum value of 0 when $w_i^o \ge w_i^r$. The
second part, $L_i / (T_t v_f)$, represents a reward for the travel time. It reaches the maximum value
of 1 when $T_t = L_i / v_f$, which means freeway section $i$ is in the free-flow condition. The third
part, $\min\{\rho_i / \rho^*, \rho^* / \rho_i\}$, represents a reward for the vehicle density control. It reaches a
maximum value of 1 when $\rho_i = \rho^*$.
In general, a higher reward value reflects a shorter on-ramp queue, a shorter freeway travel time
and a smaller deviation between $\rho_i$ and $\rho^*$. Note that the objective of Q-learning is not only to
maximize the immediate reward defined in (7.5), but to maximize a cumulative long-term
reward where the reward of each time step is computed by (7.5).
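The three factors of (7.5) can be checked numerically with the sketch below. The function name, the argument names and the choice of units (km, h, km/h, veh/km) are assumptions made for illustration; any consistent set of units gives the same dimensionless reward.

```python
def ftc_reward(w_on, w_ref, length_km, travel_time_h, v_free_kmh, rho, rho_star):
    """Eq. (7.5): product of queue, travel-time and density terms.

    Assumes rho > 0 and rho_star > 0.
    """
    queue_term = max(0.0, 1.0 - w_on / w_ref)            # 1 with no queue, 0 at or beyond w_ref
    time_term = length_km / (travel_time_h * v_free_kmh)  # 1 at free-flow travel time
    density_term = min(rho / rho_star, rho_star / rho)    # 1 when rho equals rho_star
    return queue_term * time_term * density_term
```

For instance, an empty on-ramp, free-flow travel time and $\rho_i = \rho^*$ give a reward of 1, whereas halving each factor (queue at half the reference, travel time doubled, density at half of $\rho^*$) gives $0.5^3 = 0.125$. The multiplicative form means that a queue at or beyond the reference length zeroes the reward regardless of the other two terms.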
Upstream VSL Control
The FTC proposed above does not cover the upstream part of the freeway segment, which
involves two crucial control variables: the value and the location of the most upstream VSL
sign, denoted as $v_0$ and $L_0$ in Figure 7.1. They are activated when a freeway bottleneck exists
due to an incident or congestion at on-ramp merging areas. According to [110], we want $q_1$
to be equal to the bottleneck throughput under the regulation of $v_0$, and thus
$v_0 = \begin{cases} \dfrac{w C_b}{w \rho_j - C_b}, & \text{bottleneck, no capacity drop,} \\ \dfrac{w (1 - \epsilon_0) C_b}{w \rho_j - (1 - \epsilon_0) C_b}, & \text{bottleneck with capacity drop,} \\ v_f, & \text{no bottleneck} \end{cases}$ (7.6)
where $w$ is the backpropagation speed, $\rho_j$ is the jam density, $\epsilon_0$ is the capacity drop factor,
$C_b$ is the bottleneck capacity, and $v_f$ is the free-flow speed.
On the other hand, we set $L_0$ to be slightly greater than the lower bound proposed in
[110], i.e.
$L_0 > \dfrac{(v_f \bar{\rho}(t_0) - (1 - \epsilon_0) C_b) v_0 L_b}{((1 - \epsilon_0) C_b - v_0 \rho_0(t_0)) v_f}$ (7.7)
where $\bar{\rho}$ and $L_b$ are the average density and the distance from section 1 to the incident
location, and $t_0$ is the time the incident takes place.
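The piecewise rule (7.6) for the most upstream speed limit can be sketched as follows. The function signature is a hypothetical one chosen for illustration, and the parameter values used in the example (backpropagation speed 20 km/h, jam density 150 veh/km, bottleneck capacity 2000 veh/h, capacity drop factor 0.1) are made-up numbers rather than calibrated values from the I-710 network.

```python
def upstream_vsl_speed(w, rho_jam, c_b, v_free, bottleneck=False,
                       capacity_drop=False, eps0=0.0):
    """Eq. (7.6): speed command v_0 of the most upstream VSL sign.

    w : backpropagation speed (km/h)
    rho_jam : jam density (veh/km)
    c_b : bottleneck capacity (veh/h)
    v_free : free-flow speed (km/h)
    eps0 : capacity drop factor, used only when capacity_drop is True
    """
    if not bottleneck:
        return v_free
    throughput = (1.0 - eps0) * c_b if capacity_drop else c_b
    return w * throughput / (w * rho_jam - throughput)


v_no_drop = upstream_vsl_speed(20, 150, 2000, 100, bottleneck=True)
v_drop = upstream_vsl_speed(20, 150, 2000, 100, bottleneck=True,
                            capacity_drop=True, eps0=0.1)
```

With these illustrative numbers the command is 40 km/h without capacity drop and 30 km/h with it, so a capacity drop calls for a lower upstream speed limit.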
7.3.2 Traffic Signal Control Agent
The traffic signal control (TSC) agent computes the cycle and splits for each arterial intersection based on observed traffic states in a road network depicted in Figure 7.4. It has been
extended to assist on-ramp queue dissipation from the previous version in chapter 6.
States
The states of the TSC agent are defined as
$X_{ts} = [w_o, w_s, \tilde{d}^S, \tilde{d}^E, \tilde{d}^N, \tilde{d}^W]$ (7.8)
where $w_o$ and $w_s$ are the queue lengths of the nearest on-ramp and off-ramp respectively, and
$\tilde{d}^S$, $\tilde{d}^E$, $\tilde{d}^N$, $\tilde{d}^W$ are the measured incoming demands of the intersection with the direction denoted by the superscript (e.g. S for Southbound). The state discretization follows (7.3), and
we have $w_o, w_s \in S_w$; $\tilde{d}^S, \tilde{d}^E, \tilde{d}^N, \tilde{d}^W \in S_q$.
Figure 7.4: Traffic Signal Control (TSC) Agent
Actions
The actions executed by the TSC agent are expressed as
$A_{ts} = [T_c, g_1, g_2, g_3, g_4, g_5, g_6]$ (7.9)
where $T_c \in \{40, 50, ..., 180\}$ s is the cycle length of the arterial signal, and
$g_1, g_2, ..., g_6 \in \{1, 2, 3, 4, 5\}$ are phase weights used to compute the green time of each
phase.
Figure 7.5: Traffic Signal Phasing Scheme with Six Phases
The default phase scheme involves 6 phases as depicted in Figure 7.5. The green time of
each phase is computed as
$T_{p,j} = \dfrac{(T_c - T_l) g_j}{\sum_{j=1}^{6} g_j}$ (7.10)
where $j = 1, 2, ..., 6$ is the phase index, and $T_l$ is the lost time due to phase transitions.
To reduce the problem dimension, we let $g_1 = g_3$. According to Figure 7.4, phases 1, 5 and 6
contribute to the on-ramp demand, and the off-ramp traffic is discharged in phases 4 and 5. We
expect the TSC agent to assist ramp queue dissipation by adjusting the phase weights properly, which relies on a reasonable design of the reward function described next.
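The split computation in (7.10) amounts to sharing the effective green time $T_c - T_l$ among the six phases in proportion to their weights. A minimal sketch follows; the function name is hypothetical, and the lost time of 12 s used in the example is an assumed value, since $T_l$ is not specified in the text.

```python
def green_times(cycle_s, weights, lost_time_s):
    """Eq. (7.10): green time of each phase from the cycle length and phase weights."""
    total = sum(weights)
    effective = cycle_s - lost_time_s
    return [effective * g / total for g in weights]


# Example with g1 = g3, a 120 s cycle and an assumed lost time of 12 s
splits = green_times(120, [1, 2, 1, 3, 2, 3], lost_time_s=12)
print(splits)  # [9.0, 18.0, 9.0, 27.0, 18.0, 27.0]
```

By construction the green times sum to $T_c - T_l$, so raising the weight of one phase (for example phase 4 to discharge an off-ramp queue) necessarily shortens the others.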
Reward
The reward function of the TSC agent is defined as
$R_{ts} = \max\{0, 1 - \frac{w_o}{w_o^r}\} \cdot \max\{0, 1 - \frac{w_s}{w_s^r}\} \cdot \frac{L_s}{T_t v_a}$ (7.11)
which contains three parts: the on-ramp queue $w_o$, the off-ramp queue $w_s$, and the average
travel time $T_t$ over a square area that includes the intersection at the center and has a side
length of $L_s = 400$ m. The default arterial travel speed is $v_a = 60$ km/h. $R_{ts}$ reaches a
minimum value of 0 when either ramp queue exceeds the reference length, and a maximum
value of 1 when no queue exists at both ramps and no travel delays are caused by the red
light at the intersection.
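A short sketch of (7.11) is given below, with queues and $L_s$ in metres, travel time in seconds and $v_a$ converted from km/h to m/s. The function name, the argument names and the reference queue length of 100 m in the example are illustrative assumptions.

```python
def tsc_reward(w_on, w_on_ref, w_off, w_off_ref, travel_time_s,
               side_m=400.0, v_a_kmh=60.0):
    """Eq. (7.11): on-ramp queue, off-ramp queue and travel-time terms multiplied."""
    v_a = v_a_kmh / 3.6  # km/h -> m/s
    on_term = max(0.0, 1.0 - w_on / w_on_ref)
    off_term = max(0.0, 1.0 - w_off / w_off_ref)
    time_term = side_m / (travel_time_s * v_a)
    return on_term * off_term * time_term
```

Crossing the 400 m area at 60 km/h takes 24 s, so `tsc_reward(0, 100, 0, 100, 24.0)` is 1 up to rounding, while an on-ramp queue at half the reference, an off-ramp queue at a quarter of the reference and a doubled travel time give $0.5 \cdot 0.75 \cdot 0.5 = 0.1875$.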
Network Control
The network control receives the intended signal cycles Tc from all TSC agents within the
arterial network and unifies them based on a majority rule as follows:
• If the most common cycle is favored by more than half of the TSC agents, it is selected as
the unified cycle.
• If the most common cycle is favored by half of the TSC agents or fewer, the network control computes the average
of all intended cycles and selects the closest option. If the average lies exactly midway
between two options, it selects the larger one.
Although unifying the signal cycle may compromise the travel time of some particular intersection areas, it benefits the overall arterial traffic progression with the collaboration of the
DSO agent.
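The majority rule above can be sketched as follows. The function name is hypothetical, and we assume that the "options" in the second rule are the admissible cycle lengths $T_c \in \{40, 50, ..., 180\}$ s from (7.9).

```python
from collections import Counter

CYCLE_OPTIONS = list(range(40, 181, 10))  # admissible cycle lengths in seconds


def unify_cycle(intended, options=CYCLE_OPTIONS):
    """Unify the intended cycles of all TSC agents with the majority rule."""
    cycle, votes = Counter(intended).most_common(1)[0]
    if votes > len(intended) / 2:
        return cycle  # strict majority
    average = sum(intended) / len(intended)
    # Closest option to the average; an exact tie goes to the larger option.
    return min(options, key=lambda c: (abs(c - average), -c))


print(unify_cycle([90, 90, 90, 90, 120, 60, 100]))  # 90 (4 of 7 agents agree)
print(unify_cycle([60, 60, 90, 90]))                # 80 (average 75, tie broken upward)
```

In the second call no cycle is favored by more than half of the agents, the average of 75 s lies exactly between the 70 s and 80 s options, and the larger one is chosen.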
7.3.3 Dynamic Speed Offset Agent
The dynamic speed offset (DSO) agent computes the offset and speed recommendations
between two adjacent intersections based on the observations of signal plans, intersection
queue lengths and link distance as shown in Figure 7.6.
Figure 7.6: Dynamic Speed Offset (DSO) Agent
States
The states of the DSO agent are defined as
$X_{so} = [\Delta t_{p,1}, w_a^U, w_a^D, L]$ (7.12)
where $\Delta t_{p,1} = T_{p,1}^D - T_{p,1}^U$ is the difference of phase 1 green time between the downstream and
upstream signal plans, $w_a^U$ and $w_a^D$ are the intersection queue lengths at each end, and $L$ is the
link distance.
The state space is discretized so that $\Delta t_{p,1} \in \{-6, -4, -2, 0, 2, 4, 6\}$ s;
$w_a^U, w_a^D \in \{0, 50, 100, ..., 250\}$ m; $L \in \{1000, 1100, 1200, ..., 2500\}$ m.
Actions
The actions executed by the DSO agent are expressed as
$A_{so} = [T_o, v_r^D, v_r^U]$ (7.13)
where $T_o \in \{0, 5, 10, ..., T_c - 5\}$ s is the offset of signal $j$ with respect to signal $j - 1$. Note
that the cycle length $T_c$ has been unified by the network control of TSC. $v_r^D$ is the speed for
downstream traffic in link $j - 1$, and $v_r^U$ is the speed for upstream traffic in link $j - 1$. The
action space of $v_r^D$ and $v_r^U$ is $\{40, 45, 50, ..., 80\}$ km/h.
Reward
The reward function of the DSO agent is defined as
$R_{so} = \max\{0, 1 - \frac{w_a^U}{w_a^r}\} \cdot \max\{0, 1 - \frac{w_a^D}{w_a^r}\} \cdot \frac{3L}{4 T_t v_a}$ (7.14)
which considers the upstream intersection queue $w_a^U$, the downstream intersection queue $w_a^D$,
and the link travel time $T_t$. $R_{so}$ reaches a minimum value of 0 when either queue exceeds
the reference length $w_a^r$, and a maximum value of 1 when no queue exists at both ends and
all vehicles pass through the link at the maximum possible speed (80 km/h).
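The factor $3/4$ in (7.14) normalizes the travel-time term so that it equals 1 at the maximum recommended speed of 80 km/h rather than at the default $v_a = 60$ km/h. The sketch below illustrates this; the function name, the units (metres and seconds) and the reference queue length in the examples are assumptions made for illustration.

```python
def dso_reward(w_up, w_down, w_ref, link_m, travel_time_s, v_a_kmh=60.0):
    """Eq. (7.14): upstream queue, downstream queue and link travel-time terms."""
    v_a = v_a_kmh / 3.6  # km/h -> m/s
    up_term = max(0.0, 1.0 - w_up / w_ref)
    down_term = max(0.0, 1.0 - w_down / w_ref)
    time_term = 3.0 * link_m / (4.0 * travel_time_s * v_a)
    return up_term * down_term * time_term
```

For a 1200 m link with no queues, a travel time of 54 s (80 km/h) gives a reward of 1, whereas 72 s (60 km/h) gives 0.75.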
7.3.4 Training Process
The training process is based on the same Q-learning framework as the one proposed in
chapter 6 and is carried out using the microscopic simulation software PTV VISSIM 10. The simulation network configuration will be introduced in section 7.4. We generate freeway traffic
demands based on hourly traffic volume data from the Caltrans Performance Measurement
System (PeMS) and arterial traffic demands based on hourly traffic counts from the LADOT
database. Both data sets cover the month of April 2019.
Figure 7.7: Training Process of FTC, TSC and DSO Agents
As shown in Figure 7.7, the simulator starts by loading traffic demands and assigning
default signal plans and speed limits. The training process begins after a 10-min warm-up.
The training cycle of the FTC agent is 30 s, much shorter than the 5-min cycle for the TSC
and DSO agents. At the beginning of each training cycle, the agent first collects the state $x$
from the corresponding environment, and selects an action $a$ using (6.20) based on current
Q-values. After running the simulation for one training cycle, we compute the immediate
reward $R(x, a)$ based on (7.5), (7.11) or (7.14), depending on the agent, and collect the new state $x'$. Then we update the
Q-value for the state-action pair $(x, a)$ using (6.19).
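One training cycle can be sketched as below. Equations (6.19) and (6.20) are defined in chapter 6 and are not reproduced here, so the sketch substitutes the standard tabular Q-learning update and an epsilon-greedy selection rule as stand-ins; the learning rate, discount factor and exploration rate are placeholder values, and the actual forms used in this work may differ.

```python
import random
from collections import defaultdict


def select_action(q, state, actions, epsilon, rng):
    """Epsilon-greedy selection (stand-in for (6.20))."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (stand-in for (6.19))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Q-table with zero-initialized entries; states and actions are hashable tuples or labels
q_table = defaultdict(float)
```

In the setup described above, `select_action` would be called at the start of each training cycle (30 s for the FTC agent, 5 min for the TSC and DSO agents) and `q_update` at its end, once the reward and the new state are available from the simulator.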
The training process completes when Q(x, a) converges for all state-action pairs. The
duration of each training simulation run is 12 hours. A 20-min freeway incident is introduced
to create a lane-drop bottleneck and activate the VSL control. During the training, the
incident on freeway has a probability of 0.25 to occur at the beginning of each hour and has
three possible locations - right/middle/left lane of freeway section 4, chosen randomly with
equal probability.
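The incident generation rule used during training can be sketched as follows. The function name and the representation of an incident as a (start minute, end minute, lane) tuple are illustrative assumptions; the probability, duration, horizon and candidate lanes follow the description above.

```python
import random

INCIDENT_LANES = ("right", "middle", "left")  # lanes of freeway section 4


def sample_incidents(hours=12, probability=0.25, duration_min=20, rng=None):
    """Draw an incident schedule for one training run.

    At the beginning of each hour an incident occurs with the given probability,
    lasts duration_min minutes and blocks one lane chosen uniformly at random.
    """
    rng = rng or random.Random()
    schedule = []
    for hour in range(hours):
        if rng.random() < probability:
            start = hour * 60
            schedule.append((start, start + duration_min, rng.choice(INCIDENT_LANES)))
    return schedule
```

On average this yields three incidents per 12-hour run, which exposes the agents to both incident and incident-free conditions within the same training simulation.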
7.4 Experimental Study
7.4.1 Simulation Network Configuration
The proposed integrated traffic control strategy is evaluated using the commercial microscopic simulation software PTV Vissim 10 over a mixed freeway and arterial road network
depicted in Figure 7.8. The network contains a 16-km segment of the I-710 freeway and
7 adjacent signalized intersections in Los Angeles, California, United States. The freeway
segment is divided into 6 regular sections and one upstream section, where the freeway
controller deploys the most upstream VSL sign dynamically [110]. There are 5 on-ramps
and 6 off-ramps that connect the freeway with the arterial streets. The arterial network
is simplified from the real-world configuration by omitting minor intersections and directly
connecting major intersections.
Figure 7.8: I-710 Simulation Road Network with Possible Incident Locations
As mentioned in section 7.3.4, freeway traffic demand from PeMS is generated at the
freeway entrance, and arterial traffic demands from LADOT database are generated at 16
arterial entrances, implied by arrows in Figure 7.8. The turning ratio at an intersection
approach or a ramp is determined by the ratio of the historical average traffic counts of each
direction. The freeway incident has three possible locations indicated by different colors. It
creates a lane-closure bottleneck and potentially impacts both freeway and arterial traffic.
We consider 6 scenarios for evaluation as follows
(1) Low demand (weekday 1-3 a.m.) without incident
(2) Moderate demand (weekday 12-2 p.m.) without incident
(3) High demand (weekday 5-7 p.m.) without incident
(4) High demand with right-lane incident (the gold rectangle in Figure 7.8)
(5) High demand with mid-lane incident (the red rectangle in Figure 7.8)
(6) High demand with left-lane incident (the grey rectangle in Figure 7.8)
For each scenario, we compare 6 types of control strategies with various degrees of coordination between control sub-components. The design details of each control strategy are
listed as follows
• No control: VSL, LC, RM are inactive on freeway; arterial signals have a fixed cycle
of 120 s, a fixed split and zero offset; the arterial speed limit is 60 km/h.
• Uncoordinated freeway and arterial traffic control (U-FATC):
– The FTC agent is divided into three sub agents (VSL, LC, RM) for separate
training. Each sub agent has only one action variable (e.g. $v_i$ for VSL).
– The FTC agent has no knowledge of $\tilde{d}_k^F$, the departure demand from the nearest
arterial intersection.
– The TSC agent has no knowledge of the nearest on-ramp queue $w_o$ and off-ramp
queue $w_s$. They are removed from $X_{ts}$ in (6.1) and $R_{ts}$ in (6.7).
– The DSO agent has no knowledge of the arterial signal timing. It assumes $T_c = 120$ s.
$\Delta t_{p,1}$ is removed from $X_{so}$ in (6.8).
• Semi-coordinated control - type I (S-FATC-I): no communication between the freeway
and the arterial network
– The FTC agent integrates VSL, LC and RM actions, but has no knowledge of $\tilde{d}_k^F$.
– The TSC agent has no knowledge of ramp queues. The DSO agent is as designed.
• Semi-coordinated control - type II (S-FATC-II):
– The FTC and DSO agents are as designed.
– The TSC agent has no knowledge of ramp queues.
• Semi-coordinated control - type III (S-FATC-III):
– The FTC agent integrates VSL, LC and RM actions, but has no knowledge of $\tilde{d}_k^F$.
– The TSC and DSO agents are as designed.
• Coordinated control (C-FATC) as proposed.
To improve readability, we also list the coordination mechanisms applied in each control
strategy (except no control) in Table 7.1.
Table 7.1: Coordination Mechanisms Applied in Considered Control Strategies
| Control method | FTC integrates VSL, LC and RM | FTC estimates $\tilde{d}_k^F$ | TSC considers ramp queues | DSO knows signal plans |
|---|---|---|---|---|
| U-FATC | No | No | No | No |
| S-FATC-I | Yes | No | No | Yes |
| S-FATC-II | Yes | Yes | No | Yes |
| S-FATC-III | Yes | No | Yes | Yes |
| C-FATC | Yes | Yes | Yes | Yes |
7.4.2 Evaluation Criteria and Results
To evaluate the performance of each control strategy mentioned above, we measure the
average travel time $T_t$ and the average number of stops $\bar{s}$ for both the freeway (denoted
by superscript $f$) and arterial (denoted by superscript $a$) networks. We also measure the
average queue length at on-ramps $\bar{w}_o$, off-ramps $\bar{w}_s$ and intersections $\bar{w}_a$. The computation
of each evaluation criterion follows chapter 6. We take the average results of
ten random simulations for each scenario and report them in Tables 7.2-7.7 respectively. The
percentages within brackets indicate the performance improvement by implementing the
corresponding control strategy.
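Table 7.2: Low Demands without Incident
| Method | No control | U-FATC | S-FATC-I | S-FATC-II | S-FATC-III | C-FATC |
|---|---|---|---|---|---|---|
| $T_t^f$ (s) | 634 | 631 | 629 | 630 | 630 | 630 |
| $\bar{s}^f$ | 0 | 0 | 0 | 0 | 0 | 0 |
| $T_t^a$ (s) | 968 | 825 (15%) | 788 (19%) | 786 (19%) | 790 (18%) | 788 (19%) |
| $\bar{s}^a$ | 4.5 | 2.3 (49%) | 1.9 (58%) | 1.8 (60%) | 2.0 (56%) | 1.9 (58%) |
| $\bar{w}_o$ (m) | 0 | 0 | 0 | 0 | 0 | 0 |
| $\bar{w}_s$ (m) | 0 | 0 | 0 | 0 | 0 | 0 |
| $\bar{w}_a$ (m) | 15 | 6.9 (54%) | 5.7 (62%) | 5.7 (62%) | 5.9 (61%) | 5.8 (61%) |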
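Table 7.3: Moderate Demands without Incident
| Method | No control | U-FATC | S-FATC-I | S-FATC-II | S-FATC-III | C-FATC |
|---|---|---|---|---|---|---|
| $T_t^f$ (s) | 637 | 635 | 634 | 633 | 634 | 633 |
| $\bar{s}^f$ | 0 | 0 | 0 | 0 | 0 | 0 |
| $T_t^a$ (s) | 981 | 837 (15%) | 797 (19%) | 795 (19%) | 802 (18%) | 798 (19%) |
| $\bar{s}^a$ | 4.8 | 2.8 (42%) | 2.2 (54%) | 2.1 (56%) | 2.4 (50%) | 2.2 (54%) |
| $\bar{w}_o$ (m) | 0 | 0 | 0 | 0 | 0 | 0 |
| $\bar{w}_s$ (m) | 53 | 41 (23%) | 42 (21%) | 42 (21%) | 10 (81%) | 8 (85%) |
| $\bar{w}_a$ (m) | 47 | 28 (40%) | 26 (45%) | 26 (45%) | 25 (47%) | 24 (49%) |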
Table 7.2 presents the evaluation results of the low-demand scenario without incident.
The freeway is in free-flow status and does not need any control, whereas the arterial travel
time is improved by 15% with the uncoordinated control and around 19% with partially
or fully coordinated control. The control strategies with coordination also perform better
in reducing the intersection queue (about 61% vs. 54%). Similar benefits in percentages can be
observed in Table 7.3 for the moderate-demand scenario without incident, except that S-FATC-III and C-FATC are much more effective in off-ramp queue dissipation (over 80% vs.
just over 20%) than the other types of control. The travel-time benefit is primarily achieved by
coordinating the TSC and the DSO agents so that the arterial traffic is less delayed by
red lights. The off-ramp queue is reduced by adjusting the signal timing through TSC to
accommodate heavy off-ramp traffic.
Table 7.4: High Demands without Incident
| Method | No control | U-FATC | S-FATC-I | S-FATC-II | S-FATC-III | C-FATC |
|---|---|---|---|---|---|---|
| $T_t^f$ (s) | 755 | 717 (5%) | 708 (6%) | 699 (7%) | 697 (8%) | 685 (9%) |
| $\bar{s}^f$ | 2.1 | 1.5 (29%) | 1.3 (38%) | 1.3 (38%) | 1.2 (43%) | 1.1 (48%) |
| $T_t^a$ (s) | 1037 | 912 (12%) | 885 (15%) | 880 (15%) | 903 (13%) | 891 (14%) |
| $\bar{s}^a$ | 6.4 | 4.4 (31%) | 4.0 (38%) | 3.9 (39%) | 4.3 (33%) | 4.1 (36%) |
| $\bar{w}_o$ (m) | 136 | 94 (31%) | 75 (49%) | 50 (63%) | 67 (51%) | 39 (71%) |
| $\bar{w}_s$ (m) | 246 | 165 (33%) | 161 (35%) | 158 (36%) | 92 (63%) | 80 (67%) |
| $\bar{w}_a$ (m) | 176 | 109 (38%) | 94 (47%) | 91 (48%) | 86 (51%) | 77 (56%) |
Under high demands, ramp-merging bottlenecks form on the freeway. The travel times
and queues increase significantly in both areas due to the congestion. The fully coordinated
control achieves the best performance in the freeway travel time and the number of stops.
Although the benefit in arterial travel time drops from 19% to 14%, the fully coordinated
control is very effective in reducing queues at ramps and intersections. We consider this
trade-off between the arterial travel time and queue control to be necessary and beneficial
for the overall traffic mobility. The ramp queues can also be alleviated by S-FATC-III, where
the TSC is able to provide some assistance, but it is less effective, especially for on-ramps, where the FTC underperforms without an estimate of the incoming demand. S-FATC-II
performs better in on-ramp queue control thanks to its fully functioning FTC. However, it
offers no additional benefit for off-ramps since its TSC does not consider ramp queues.
Table 7.5: High Demands with Freeway Right-lane Incident
Method           No control  U-FATC      S-FATC-I    S-FATC-II   S-FATC-III  C-FATC
$T_t^f$ (s)      877         816(7%)     799(9%)     780(11%)    789(10%)    762(13%)
$\bar{s}^f$      3.5         2.4(31%)    2.2(37%)    1.8(49%)    1.9(46%)    1.6(54%)
$T_t^a$ (s)      1073        955(11%)    918(14%)    914(15%)    935(13%)    923(14%)
$\bar{s}^a$      7.8         5.6(28%)    4.4(44%)    4.2(46%)    4.9(37%)    4.5(42%)
$\bar{w}_o$ (m)  170         131(23%)    115(32%)    72(58%)     101(41%)    56(67%)
$\bar{w}_s$ (m)  243         163(33%)    160(34%)    158(35%)    104(57%)    80(67%)
$\bar{w}_a$ (m)  178         121(32%)    106(40%)    101(43%)    92(48%)     79(56%)
Table 7.6: High Demands with Freeway Mid-lane Incident
Method           No control  U-FATC      S-FATC-I    S-FATC-II   S-FATC-III  C-FATC
$T_t^f$ (s)      879         791(10%)    782(11%)    773(12%)    772(12%)    763(13%)
$\bar{s}^f$      3.4         2.1(38%)    2.0(41%)    1.9(44%)    1.7(50%)    1.6(53%)
$T_t^a$ (s)      1051        935(11%)    912(13%)    908(14%)    930(12%)    919(13%)
$\bar{s}^a$      6.9         4.8(30%)    4.2(39%)    4.1(41%)    4.5(35%)    4.2(39%)
$\bar{w}_o$ (m)  144         104(28%)    92(36%)     60(58%)     81(44%)     48(67%)
$\bar{w}_s$ (m)  259         168(35%)    162(37%)    159(39%)    107(59%)    93(64%)
$\bar{w}_a$ (m)  182         118(35%)    107(41%)    103(43%)    93(49%)     82(55%)
Table 7.7: High Demands with Freeway Left-lane Incident
Method           No control  U-FATC      S-FATC-I    S-FATC-II   S-FATC-III  C-FATC
$T_t^f$ (s)      875         782(11%)    774(12%)    769(12%)    764(13%)    760(13%)
$\bar{s}^f$      3.2         1.9(41%)    1.8(44%)    1.8(44%)    1.6(50%)    1.5(53%)
$T_t^a$ (s)      1043        934(10%)    909(13%)    904(13%)    926(11%)    913(12%)
$\bar{s}^a$      6.6         4.5(32%)    4.0(39%)    4.0(39%)    4.4(33%)    4.2(36%)
$\bar{w}_o$ (m)  138         97(30%)     86(38%)     57(59%)     77(44%)     46(67%)
$\bar{w}_s$ (m)  256         159(38%)    156(39%)    153(40%)    106(59%)    95(63%)
$\bar{w}_a$ (m)  174         115(34%)    101(42%)    97(44%)     90(48%)     81(53%)
Tables 7.5-7.7 present the scenarios with a freeway incident in different lanes under high
demands. The right-lane incident exerts a strong negative impact on ramp merging, which
results in a longer arterial travel time and on-ramp queue compared with the mid-lane or left-lane
incident. In this specific scenario, presented in Table 7.5, coordination carries greater importance than in other scenarios because three coordination mechanisms each enhance the freeway-travel-time ($T_t^f$) benefit:
the integration of VSL, LC and RM actions improves the $T_t^f$ benefit
from 7% to 9% (U-FATC vs. S-FATC-I); the estimation of the departure demand $\tilde{d}_k^F$
improves it from 9% to 11% (S-FATC-I vs. S-FATC-II); and the dissipation of off-ramp queues by
TSC improves it from 9% to 10% (S-FATC-I vs. S-FATC-III). These benefits can also be
observed, on a smaller scale, in the other high-demand scenarios in Tables 7.4, 7.6 and 7.7.
The queue-control benefit provided by each coordination strategy is relatively consistent
across all the incident scenarios in Tables 7.5-7.7. The integration of VSL, LC and RM
actions improves the on-ramp-queue benefit by around 8 percentage points (U-FATC vs. S-FATC-I). The
estimation of $\tilde{d}_k^F$ improves the on-ramp-queue benefit by around 22 percentage points (S-FATC-I vs. S-FATC-II).
The incorporation of TSC improves the on-ramp-queue benefit by around 8 percentage points and the
off-ramp-queue benefit by around 22 percentage points (S-FATC-I vs. S-FATC-III), at the cost of a minor
increase in the arterial travel time. This trade-off is made by the TSC part of S-FATC-III
and C-FATC in all the high-demand scenarios according to Tables 7.4-7.7. In addition,
C-FATC achieves consistent performance in all three incident scenarios and is thus more
robust to different incident locations.
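The "around 8" and "around 22" figures quoted above can be checked directly against the tables. The following Python sketch is only an illustration: it copies the on-ramp and off-ramp queue benefits from Tables 7.5-7.7 and averages the gain of one strategy over another across the three incident locations. The helper `mean_gain` is a hypothetical name, and the differences are in percentage points.

```python
from statistics import mean

# Queue-length benefits in percent, copied from Tables 7.5-7.7
# (order: right-lane, mid-lane, left-lane incident).
on_ramp = {
    "U-FATC": [23, 28, 30],
    "S-FATC-I": [32, 36, 38],
    "S-FATC-II": [58, 58, 59],
    "S-FATC-III": [41, 44, 44],
}
off_ramp = {
    "S-FATC-I": [34, 37, 39],
    "S-FATC-III": [57, 59, 59],
}


def mean_gain(benefits, base, improved):
    """Average gain (percentage points) of `improved` over `base` across scenarios."""
    return mean(b - a for a, b in zip(benefits[base], benefits[improved]))


integration_gain = mean_gain(on_ramp, "U-FATC", "S-FATC-I")        # VSL + LC + RM integration
estimation_gain = mean_gain(on_ramp, "S-FATC-I", "S-FATC-II")      # departure-demand estimation
tsc_on_ramp_gain = mean_gain(on_ramp, "S-FATC-I", "S-FATC-III")    # TSC assistance, on-ramps
tsc_off_ramp_gain = mean_gain(off_ramp, "S-FATC-I", "S-FATC-III")  # TSC assistance, off-ramps
```

The per-scenario gains vary by a few points (for instance 26, 22 and 21 for the demand estimation), so the quoted values are best read as approximate.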
Chapter 8: Conclusion and Future Work
8.1 Conclusions
In this study, the basic formulation and the fundamental diagram of two well-known first-order traffic models, the Lighthill-Whitham-Richards (LWR) model and the Cell Transmission Model
(CTM), are reviewed. Some special arterial traffic behaviors not shared by freeway traffic,
such as frequent lane changes and queue discharging, are discussed along with the corresponding
extensions of the CTM that address them. The original CTM is modified to accommodate
potential uncertainties in measurements and model parameters as well as the capacity
drop phenomenon. A combined Variable Speed Limit (VSL) and Lane Change (LC) control
strategy is then proposed based on the modified CTM to alleviate freeway bottleneck congestion and reject uncertainties. Microscopic simulations along a freeway segment of I-710 in
Los Angeles, United States, are carried out with PTV VISSIM 10 to examine the effect of the
proposed controller and different types of uncertainties. The results show that the proposed controller
can tolerate 10% deviations in sensitive traffic states and 20% deviations in nonsensitive
traffic states. The simulation results also imply that slowing down the traffic excessively
delivers worse performance than speeding up the traffic because of the extra shockwaves
produced in the former situation.
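For readers who want a concrete picture of the first-order model referred to above, the sketch below implements one time step of the basic CTM with a triangular fundamental diagram: each cell sends `min(v_free * rho, q_max)`, receives `min(q_max, w * (rho_jam - rho))`, and the inter-cell flow is the minimum of the two. This is the textbook formulation only. It does not include the capacity drop, the uncertainty terms, or the VSL and LC inputs of the modified CTM used in this work, and all parameter names and values are illustrative assumptions.

```python
def ctm_step(rho, inflow, v_free, w, rho_jam, q_max, cell_len, dt):
    """Advance cell densities by one step of the basic CTM.

    rho      : list of cell densities (veh/km)
    inflow   : upstream demand entering the first cell (veh/h)
    v_free   : free-flow speed (km/h); w: congestion wave speed (km/h)
    rho_jam  : jam density (veh/km); q_max: capacity (veh/h)
    cell_len : cell length (km); dt: time step (h), with v_free * dt <= cell_len
    """
    n = len(rho)
    demand = [min(v_free * r, q_max) for r in rho]        # what each cell can send
    supply = [min(q_max, w * (rho_jam - r)) for r in rho]  # what each cell can receive
    # flows[i] enters cell i; flows[n] leaves the last cell (free outflow assumed).
    flows = [min(inflow, supply[0])]
    flows += [min(demand[i], supply[i + 1]) for i in range(n - 1)]
    flows.append(demand[-1])
    # Conservation of vehicles in each cell.
    return [rho[i] + dt / cell_len * (flows[i] - flows[i + 1]) for i in range(n)]


# Illustrative parameters (assumed, not calibrated to the I-710 segment).
params = dict(v_free=100.0, w=20.0, rho_jam=130.0, q_max=2200.0,
              cell_len=0.5, dt=10.0 / 3600.0)
steady = ctm_step([20.0, 20.0, 20.0], inflow=2000.0, **params)
queued = ctm_step([20.0, 120.0], inflow=2000.0, **params)
```

In the second call the congested downstream cell can only receive 200 veh/h, so vehicles accumulate in the upstream cell. This is the queue-propagation behavior that the control strategies are designed to counteract.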
To investigate the effect of VSL sign locations, a rule-based VSL controller is proposed,
in which the distance of the most upstream VSL zone ($L_0$) is treated as a control variable.
The speed difference between the first VSL command $v_0$ and the other downstream commands
creates a low-density area that allows the congestion at the bottleneck to dissipate. To ensure
the complete removal of the bottleneck congestion, a chasing problem is formulated, and the
solution of this problem leads to a lower bound that $L_0$ needs to satisfy for faster convergence
of the VSL control and better performance. The obtained lower bound is positively correlated
with $v_0$, the initial densities, and the downstream length. The microscopic simulations
indicate that the density of each CTM section reaches steady state when $L_0$ satisfies the
lower bound. In addition, significant benefits in terms of the number of stops and the
emission rates of CO2 are observed when the value of $L_0$ is close to or greater than the
lower bound. However, overextending $L_0$ produces undesirable travel time. The computed
lower bound serves as a valuable design parameter for selecting a proper $L_0$ in various traffic
scenarios. Moreover, the rule-based VSL controller is compared with a classic feedback-linearization VSL controller. The former outperforms the latter in high-demand traffic flow
scenarios, reflecting the benefit of concentrating the control efforts and minimizing the speed
variations.
The content above focuses on freeway traffic control under relatively simple
and consistent traffic conditions. The study then extends the road network of interest to
include both the freeway and adjacent arterial roads. A Q-learning (QL)-based freeway traffic
control (FTC) strategy that integrates the VSL control, LC recommendations and ramp
metering (RM) is proposed. Meanwhile, a traffic-responsive arterial signal control scheme
based on a modified Webster model is implemented to regulate the arterial traffic. The
numerical simulations indicate that the proposed approach achieves higher performance on
the freeway than uncoordinated or decentralized feedback control in high-demand or incident scenarios. It also reduces the average queue length at arterial
intersections through better processing of the off-ramp demand feeding into the arterial network.
However, the occasional overspill of off-ramp queues under high demands remains an issue,
which could potentially be resolved by modifying the arterial signal control to assist with
queue dissipation.
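As a reminder of the learning rule underlying a QL-based controller, the sketch below shows the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)], together with an epsilon-greedy action choice. The state labels, action labels and reward value are placeholders invented for illustration. The actual state, action and reward design of the FTC agent is defined in the earlier chapters and is not reproduced here.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical labels; the real agent uses traffic measurements as its state.
actions = ["lower_vsl", "hold_vsl"]
q_table = defaultdict(float)

# One illustrative transition with a negative reward (e.g. a delay penalty).
q_update(q_table, "congested", "lower_vsl", -5.0, "free_flow", actions)
greedy_action = epsilon_greedy(q_table, "congested", actions, 0.0, random.Random(0))
```

With all values initialized to zero, the single penalized transition lowers the value of that state-action pair to -0.5, so the greedy choice in the same state switches to the other action.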
To address the issue of off-ramp queue overspill and improve the arterial traffic operation,
an arterial traffic control strategy that combines traffic signal control (TSC) and dynamic
speed offset (DSO) coordination using a QL framework is proposed. The TSC agent determines the signal cycles and the splits based on intersection demands and the off-ramp queue.
Then a network controller unifies the cycles and splits of different intersections with a majority rule to facilitate the arterial traffic progression with the assistance of the DSO agent.
The DSO agent determines the relative offset and the recommended speeds between two
consecutive intersections based on their physical distance, queue lengths and signal cycles.
The effectiveness of the proposed approach is demonstrated using microscopic simulations
in a mixed freeway and arterial road network with real-world traffic demands. The proposed QL-based control delivers significantly higher performance than MAXBAND and
fixed-time control in terms of travel time and number of stops under low and moderate demands. In high-demand scenarios, the QL-based control trades part of the arterial travel-time benefit for
queue dissipation at off-ramps and intersections, which in turn reduces the freeway travel time.
The freeway traffic control also improves the performance of the proposed arterial control
by providing balanced off-ramp flows in accordance with the unified signal timing.
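The two coordination ideas in this paragraph, unifying cycles by majority and offsetting consecutive signals for progression, can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the tie-breaking rule (shortest cycle wins) is an assumption made here, and the offset is the textbook travel-time offset, which ignores the queue lengths that the DSO agent also takes into account. The function names are hypothetical.

```python
from collections import Counter


def majority_cycle(cycles):
    """Return the most common cycle length (s); ties go to the shortest cycle (assumed)."""
    counts = Counter(cycles)
    top = max(counts.values())
    return min(c for c, n in counts.items() if n == top)


def ideal_offset(distance_m, speed_mps, cycle_s):
    """Offset (s) such that a vehicle released at the upstream green start
    reaches the downstream stop line at its green start, for a common cycle."""
    return (distance_m / speed_mps) % cycle_s


# Cycle lengths proposed by three intersections (illustrative values).
common_cycle = majority_cycle([90, 90, 120])

# 400 m spacing at a recommended speed of 12.5 m/s (45 km/h).
offset = ideal_offset(400.0, 12.5, common_cycle)
```

Lowering the recommended speed lengthens the travel time and therefore the offset, which is how a speed recommendation and an offset can be tuned jointly between two consecutive intersections.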
The study is then concluded by integrating the freeway and arterial traffic control strategies with two major modifications to the control components: the FTC agent estimates
the departure demand of the adjacent arterial intersection in order to take proactive control actions, and the TSC agent is able to assist in the dissipation of both on-ramp and off-ramp
queues. Multiple variations of the proposed methodology with different levels of coordination
are evaluated via microscopic simulations to demonstrate the effectiveness of each coordination mechanism. The proposed approach improves the arterial travel time by 19% under
low and moderate demands, 4 percentage points more than the uncoordinated control. In high-demand scenarios, this benefit is slightly reduced in exchange for a significant reduction in ramp queues
and faster travel on the freeway. In addition, the coordinated control is more robust to
different freeway incident locations and delivers consistent performance, whereas the uncoordinated control performs relatively worse in the scenario where the ramp merging is severely
impacted.
8.2 Future Work
This study verified the effectiveness of coordinating freeway and arterial traffic
control using a simplified road network consisting of one freeway segment and one parallel
arterial corridor. This approach can potentially be extended by including more arterial
intersections vertically. Implementing a cooperative traffic signal control (TSC) strategy with
additional arterial intersections may provide greater benefits in alleviating ramp congestion
under high traffic demands.
Additionally, the lane change (LC) control can be enhanced by introducing variable LC
distances and involving more lanes. These extensions may improve the on-ramp merging
process, which benefits both freeway and arterial travel.
References
[1] David Schrank, Bill Eisele, Tim Lomax, et al. “Urban mobility report 2019”. In:
(2019).
[2] Martin Savelsbergh and Marc Sol. “Drive: Dynamic routing of independent vehicles”.
In: Operations Research 46.4 (1998), pp. 474–490.
[3] Moshe Ben-Akiva, Andre De Palma, and Kaysi Isam. “Dynamic network models and
driver information systems”. In: Transportation Research Part A: General 25.5 (1991),
pp. 251–266.
[4] Yihang Zhang and Petros A Ioannou. “Combined variable speed limit and lane change
control for highway traffic”. In: IEEE Transactions on Intelligent Transportation Systems 18.7 (2017), pp. 1812–1823.
[5] Yuqing Guo et al. “Integrated variable speed limits and lane-changing control for
freeway lane-drop bottlenecks”. In: IEEE Access 8 (2020), pp. 54710–54721.
[6] Markos Papageorgiou, Habib Hadj-Salem, Jean-Marc Blosseville, et al. “ALINEA: A
local feedback control law for on-ramp metering”. In: Transportation Research Record
1320.1 (1991), pp. 58–67.
[7] Pitu Mirchandani and Fei-Yue Wang. “RHODES to intelligent transportation systems”. In: IEEE Intelligent Systems 20.1 (2005), pp. 10–15.
[8] Andreas Hegyi et al. “SPECIALIST: A dynamic speed limit control algorithm based
on shock wave theory”. In: 2008 11th international ieee conference on intelligent
transportation systems. IEEE. 2008, pp. 827–832.
[9] Rodrigo C Carlson et al. “Optimal motorway traffic flow control involving variable
speed limits and ramp metering”. In: Transportation science 44.2 (2010), pp. 238–253.
[10] Jos´e Ram´on D Frejo et al. “Macroscopic modeling of variable speed limits on freeways”. In: Transportation research part C: emerging technologies 100 (2019), pp. 15–
33.
[11] Long Kejun et al. “Model predictive control for variable speed limit in freeway work
zone”. In: 2008 27th Chinese Control Conference. IEEE. 2008, pp. 488–493.
[12] Md Hadiuzzaman and Tony Z Qiu. “Cell transmission model based variable speed
limit control for freeways”. In: Canadian Journal of Civil Engineering 40.1 (2013),
pp. 46–56.
[13] Eil Kwon et al. “Development and field evaluation of variable advisory speed limit
system for work zones”. In: Transportation research record 2015.1 (2007), pp. 12–18.
124
[14] Claudio Roncoli, Ioannis Papamichail, and Markos Papageorgiou. “Hierarchical model
predictive control for multi-lane motorways in presence of vehicle automation and
communication systems”. In: Transportation Research Part C: Emerging Technologies
62 (2016), pp. 117–132.
[15] Yihang Zhang and Petros A Ioannou. “Integrated control of highway traffic flow”. In:
Journal of Control and Decision 5.1 (2018), pp. 19–41.
[16] Mehmet Ali Silgu et al. “Combined Control of Freeway Traffic Involving Cooperative Adaptive Cruise Controlled and Human Driven Vehicles Using Feedback Control Through SUMO”. In: IEEE Transactions on Intelligent Transportation Systems
(2021).
[17] Yihang Zhang and Petros A Ioannou. “Coordinated variable speed limit, ramp metering and lane change control of highway traffic”. In: IFAC-PapersOnLine 50.1 (2017),
pp. 5307–5312.
[18] Thomas Urbanik et al. Coordinated freeway and arterial operations handbook. Tech.
rep. United States. Federal Highway Administration, 2006.
[19] Dongyan Su et al. “Coordinated ramp metering and intersection signal control”. In:
International Journal of Transportation Science and Technology 3.2 (2014), pp. 179–
192.
[20] Xianfeng Yang, Yao Cheng, and Gang-Len Chang. “Integration of adaptive signal
control and freeway off-ramp priority control for commuting corridors”. In: Transportation research part C: emerging technologies 86 (2018), pp. 328–345.
[21] Yao Cheng and Gang-Len Chang. “Arterial-Friendly Local Ramp Metering Control
Strategy”. In: Transportation Research Record 2675.7 (2021), pp. 67–80.
[22] Abdullah Al Farabi et al. “Integrated corridor management by cooperative traffic
signal and ramp metering control”. In: Computer-Aided Civil and Infrastructure Engineering (2024).
[23] Hui-Yu Jin and Wen-Long Jin. “Control of a lane-drop bottleneck through variable
speed limits”. In: Transportation Research Part C: Emerging Technologies 58 (2015),
pp. 568–584.
[24] Zhibin Li et al. “Reinforcement learning-based variable speed limit control strategy
to reduce traffic congestion at freeway recurrent bottlenecks”. In: IEEE transactions
on intelligent transportation systems 18.11 (2017), pp. 3204–3217.
[25] Jos´e Ram´on D Frejo and Bart De Schutter. “Logic-Based Traffic Flow Control for
Ramp Metering and Variable Speed Limits—Part 1: Controller”. In: IEEE Transactions on Intelligent Transportation Systems 22.5 (2020), pp. 2647–2657.
[26] Stef Smulders. “Control of freeway traffic flow by variable speed signs”. In: Transportation Research Part B: Methodological 24.2 (1990), pp. 111–132.
[27] H Zackor. “Speed limitation on freeways: Traffic-responsive strategies”. In: Concise
Encyclopedia of Traffic & Transportation Systems. Elsevier, 1991, pp. 507–511.
125
[28] Rodrigo Castelan Carlson, Ioannis Papamichail, and Markos Papageorgiou. “Local
feedback-based mainstream traffic flow control on motorways using variable speed
limits”. In: IEEE Transactions on intelligent transportation systems 12.4 (2011),
pp. 1261–1276.
[29] Yihang Zhang and Petros A Ioannou. “Stability analysis and variable speed limit
control of a traffic flow model”. In: Transportation Research Part B: Methodological
118 (2018), pp. 31–65.
[30] A Hegyi, B De Schutter, and J Heelendoorn. “MPC-based optimal coordination of
variable speed limits to suppress shock waves in freeway traffic”. In: Proceedings of
the 2003 American Control Conference, 2003. Vol. 5. IEEE. 2003, pp. 4083–4088.
[31] Bidoura Khondaker and Lina Kattan. “Variable speed limit: A microscopic analysis
in a connected vehicle environment”. In: Transportation Research Part C: Emerging
Technologies 58 (2015), pp. 146–159.
[32] Tianchen Yuan et al. “Evaluation of Integrated Variable Speed Limit and Lane
Change Control for Highway Traffic Flow”. In: IFAC-PapersOnLine 54.2 (2021),
pp. 107–113.
[33] Yihang Zhang et al. “Comparison of Feedback Linearization and Model Predictive
Techniques for Variable Speed Limit Control”. In: 2018 21st International Conference
on Intelligent Transportation Systems (ITSC). IEEE. 2018, pp. 3000–3005.
[34] Michael James Lighthill and Gerald Beresford Whitham. “On kinematic waves II. A
theory of traffic flow on long crowded roads”. In: Proceedings of the Royal Society of
London. Series A. Mathematical and Physical Sciences 229.1178 (1955), pp. 317–345.
[35] Paul I Richards. “Shock waves on the highway”. In: Operations research 4.1 (1956),
pp. 42–51.
[36] Carlos F Daganzo. “The cell transmission model: A dynamic representation of highway traffic consistent with the hydrodynamic theory”. In: Transportation Research
Part B: Methodological 28.4 (1994), pp. 269–287.
[37] Hao Liu, Suyash Vishnoi, and Christian Claudel. “A Two-stage Stochastic Model
Considering Randomness of Demand in Variable Speed Limit and Boundary Flow
Control”. In: arXiv preprint arXiv:2110.14025 (2021).
[38] Faisal Alasiri, Yihang Zhang, and Petros A Ioannou. “Robust variable speed limit
control with respect to uncertainties”. In: European Journal of Control (2020).
[39] Jos´e Ram´on D Frejo and Bart De Schutter. “SPERT: A speed limit strategy for
recurrent traffic jams”. In: IEEE Transactions on Intelligent Transportation Systems
20.2 (2018), pp. 692–703.
[40] Mudasser Seraj et al. “Optimal location identification of VSL signs for recurrent
bottlenecks”. In: Transp. Res. Rec. J. Transp. Res. Board 82.4 (2016), pp. 1084–
1090.
[41] Chengcheng Xu et al. “Procedure for determining the deployment locations of variable
speed limit signs to reduce crash risks at freeway recurrent bottlenecks”. In: IEEE
Access 7 (2019), pp. 47856–47863.
126
[42] Irene Mart´ınez and Wen-Long Jin. “Optimal location problem for variable speed limit
application areas”. In: Transportation Research Part B: Methodological 138 (2020),
pp. 221–246.
[43] Claudio Roncoli, Markos Papageorgiou, and Ioannis Papamichail. “Traffic flow optimisation in presence of vehicle automation and communication systems–Part II: Optimal control for multi-lane motorways”. In: Transportation Research Part C: Emerging
Technologies 57 (2015), pp. 260–275.
[44] Vasileios Markantonakis et al. “Integrated traffic control for freeways using variable
speed limits and lane change control actions”. In: Transportation research record
2673.9 (2019), pp. 602–613.
[45] Yorgos J Stephanedes. “Implementation of on-line Zone Control Strategies for optimal
ramp metering in the Minneapolis Ring Road”. In: (1994).
[46] H Michael Zhang and Stephen G Ritchie. “Freeway ramp metering using artificial
neural networks”. In: Transportation Research Part C: Emerging Technologies 5.5
(1997), pp. 273–286.
[47] Emmanouil Smaragdis and Markos Papageorgiou. “Series of new local ramp metering strategies: Emmanouil smaragdis and markos papageorgiou”. In: Transportation
Research Record 1856.1 (2003), pp. 74–86.
[48] G Paesani et al. “System wide adaptive ramp metering (SWARM)”. In: Merging the
Transportation and Communications Revolutions. Abstracts for ITS America Seventh
Annual Meeting and ExpositionITS America. 1997.
[49] Ioannis Papamichail et al. “Heuristic ramp-metering coordination strategy implemented at monash freeway, australia”. In: Transportation Research Record 2178.1
(2010), pp. 10–20.
[50] Yu Han et al. “Hierarchical ramp metering in freeways: an aggregated modeling and
control approach”. In: Transportation research part C: emerging technologies 110
(2020), pp. 1–19.
[51] FV Webster. Traffic signal settings. Tech. rep. 1958.
[52] Alan J Miller. “Settings for fixed-cycle traffic signals”. In: Journal of the Operational
Research Society 14.4 (1963), pp. 373–386.
[53] FV Webster. “Traffic signals”. In: Road research technical paper 56 (1966).
[54] Alvaro J Calle-Laguna, Jianhe Du, and Hesham A Rakha. “Computing optimum
traffic signal cycle length considering vehicle delay and fuel consumption”. In: Transportation Research Interdisciplinary Perspectives 3 (2019), p. 100021.
[55] Fred L Orcutt Jr. The traffic signal book. 1993.
[56] John DC Little, Mark D Kelson, and Nathan H Gartner. “MAXBAND: A versatile
program for setting signals on arteries and triangular networks”. In: (1981).
[57] Nathan H Gartner et al. “A multi-band approach to arterial traffic signal optimization”. In: Transportation Research Part B: Methodological 25.1 (1991), pp. 55–74.
127
[58] Tugba Arsava, Yuanchang Xie, and Nathan H Gartner. “Arterial progression optimization using OD-BAND: case study and extensions”. In: Transportation Research
Record 2558.1 (2016), pp. 1–10.
[59] Giovanni De Nunzio et al. “Speed advisory and signal offsets control for arterial
bandwidth maximization and energy consumption reduction”. In: IEEE Transactions
on Control Systems Technology 25.3 (2016), pp. 875–887.
[60] Dennis I Robertson. “TRANSYT: a traffic network study tool”. In: (1969).
[61] PB Hunt et al. “The SCOOT on-line traffic signal optimisation technique”. In: Traffic
Engineering & Control 23.4 (1982).
[62] Yiheng Feng et al. “A real-time adaptive signal control in a connected vehicle environment”. In: Transportation Research Part C: Emerging Technologies 55 (2015),
pp. 460–473.
[63] Giovanni De Nunzio et al. “Eco-driving in urban traffic networks using traffic signals
information”. In: International Journal of Robust and Nonlinear Control 26.6 (2016),
pp. 1307–1324.
[64] Peng Hao et al. “Eco-approach and departure (EAD) application for actuated signals
in real-world traffic”. In: IEEE Transactions on Intelligent Transportation Systems
20.1 (2018), pp. 30–40.
[65] Hao Yang, Fawaz Almutairi, and Hesham Rakha. “Eco-driving at signalized intersections: A multiple signal optimization approach”. In: IEEE Transactions on Intelligent
Transportation Systems 22.5 (2020), pp. 2943–2955.
[66] Andreas Hegyi, Bart De Schutter, and Hans Hellendoorn. “Model predictive control
for optimal coordination of ramp metering and variable speed limits”. In: Transportation Research Part C: Emerging Technologies 13.3 (2005), pp. 185–209.
[67] Amir Hosein Ghods, Ashkan Rahimi Kian, and Masoud Tabibi. “Adaptive freeway
ramp metering and variable speed limit control: a genetic-fuzzy approach”. In: IEEE
Intelligent Transportation Systems Magazine 1.1 (2009), pp. 27–36.
[68] Jos´e Ram´on Dom´ınguez Frejo and Eduardo Fern´andez Camacho. “Global versus local
MPC algorithms in freeway traffic control with ramp metering and variable speed
limits”. In: IEEE Transactions on intelligent transportation systems 13.4 (2012),
pp. 1556–1565.
[69] Cecilia Pasquale et al. “A multi-class model-based control scheme for reducing congestion and emissions in freeway networks by combining ramp metering and route
guidance”. In: Transportation Research Part C: Emerging Technologies 80 (2017),
pp. 384–408.
[70] I Schelling, Andreas Hegyi, and Serge P Hoogendoorn. “SPECIALIST-RM—Integrated
variable speed limit control and ramp metering based on shock wave theory”. In: 2011
14th International IEEE conference on intelligent transportation systems (ITSC).
IEEE. 2011, pp. 2154–2159.
128
[71] Georgia-Roumpini Iordanidou et al. “Feedback-based integrated motorway traffic flow
control with delay balancing”. In: IEEE Transactions on Intelligent Transportation
Systems 18.9 (2017), pp. 2319–2329.
[72] Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction.
MIT press, 2018.
[73] Erwin Walraven, Matthijs TJ Spaan, and Bram Bakker. “Traffic flow optimization:
A reinforcement learning approach”. In: Engineering Applications of Artificial Intelligence 52 (2016), pp. 203–212.
[74] Francois Belletti et al. “Expert level control of ramp metering based on multi-task
deep reinforcement learning”. In: IEEE Transactions on Intelligent Transportation
Systems 19.4 (2017), pp. 1198–1207.
[75] Chong Wang et al. “Integrated traffic control for freeway recurrent bottleneck based
on deep reinforcement learning”. In: IEEE Transactions on Intelligent Transportation
Systems 23.9 (2022), pp. 15522–15535.
[76] Kok-Lim Alvin Yau et al. “A survey on reinforcement learning models and algorithms
for traffic signal control”. In: ACM Computing Surveys (CSUR) 50.3 (2017), pp. 1–38.
[77] Juan C Medina, Ali Hajbabaie, and Rahim F Benekohal. “Arterial traffic control
using reinforcement learning agents and information from adjacent intersections in
the state and reward structure”. In: 13th international IEEE conference on intelligent
transportation systems. IEEE. 2010, pp. 525–530.
[78] Lior Kuyer et al. “Multiagent reinforcement learning for urban traffic control using coordination graphs”. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2008, Antwerp, Belgium, September 15-19, 2008,
Proceedings, Part I 19. Springer. 2008, pp. 656–671.
[79] Elise Van der Pol and Frans A Oliehoek. “Coordinated deep reinforcement learners for
traffic light control”. In: Proceedings of learning, inference and control of multi-agent
systems (at NIPS 2016) 8 (2016), pp. 21–38.
[80] Tian Tan et al. “Cooperative deep reinforcement learning for large-scale traffic grid
signal control”. In: IEEE transactions on cybernetics 50.6 (2019), pp. 2687–2700.
[81] Weibin Zhang et al. “Distributed Signal Control of Arterial Corridors Using MultiAgent Deep Reinforcement Learning”. In: IEEE Transactions on Intelligent Transportation Systems 24.1 (2022), pp. 178–190.
[82] Silvia Siri et al. “Freeway traffic control: A survey”. In: Automatica 130 (2021),
p. 109655.
[83] Vincenzo Punzo and Fulvio Simonelli. “Analysis and comparison of microscopic traffic
flow models with real traffic microscopic data”. In: Transportation Research Record
1934.1 (2005), pp. 53–63.
[84] Elmar Brockfeld, Reinhart D K¨uhne, and Peter Wagner. “Calibration and validation
of microscopic traffic flow models”. In: Transportation Research Record 1876.1 (2004),
pp. 62–70.
129
[85] Felipe de Souza, Omer Verbas, and Joshua Auld. “Mesoscopic traffic flow model for
agent-based simulation”. In: Procedia Computer Science 151 (2019), pp. 858–863.
[86] Massimo Di Gangi et al. “Network traffic control based on a mesoscopic dynamic
flow model”. In: Transportation Research Part C: Emerging Technologies 66 (2016),
pp. 3–26.
[87] Markos Papageorgiou. “Some remarks on macroscopic traffic flow modelling”. In:
Transportation Research Part A: Policy and Practice 32.5 (1998), pp. 323–329.
[88] Carlos F Daganzo. “The cell transmission model, part II: network traffic”. In: Transportation Research Part B: Methodological 29.2 (1995), pp. 79–93.
[89] Carlos F Daganzo. “A behavioral theory of multi-lane traffic flow. Part I: Long homogeneous freeway sections”. In: Transportation Research Part B: Methodological 36.2
(2002), pp. 131–158.
[90] Apostolos Kotsialos et al. “Traffic flow modeling of large-scale motorway networks using the macroscopic modeling tool METANET”. In: IEEE Transactions on intelligent
transportation systems 3.4 (2002), pp. 282–292.
[91] Maria Kontorinaki et al. “First-order traffic flow models incorporating capacity drop:
Overview and real-data validation”. In: Transportation Research Part B: Methodological 106 (2017), pp. 52–75.
[92] Chaitrali Shirke, Ashish Bhaskar, and Edward Chung. “Macroscopic modelling of
arterial traffic: An extension to the cell transmission model”. In: Transportation Research Part C: Emerging Technologies 105 (2019), pp. 54–80.
[93] BD Greenshields et al. “A study of traffic capacity”. In: Highway research board
proceedings. Vol. 1935. National Research Council (USA), Highway Research Board.
1935.
[94] Fred L Hall and Kwaku Agyemang-Duah. “Freeway capacity drop and the definition
of capacity”. In: Transportation research record 1320 (1991).
[95] Malachy Carey, Chandra Balijepalli, and David Watling. “Extending the cell transmission model to multiple lanes and lane-changing”. In: Networks and Spatial Economics 15.3 (2015), pp. 507–535.
[96] Tianlu Pan et al. “Multiclass multilane model for freeway traffic mixed with connected automated vehicles and regular human-piloted vehicles”. In: Transportmetrica
A: transport science 17.1 (2021), pp. 5–33.
[97] Joseph S Drake. “A statistical analysis of speed density hypothesis”. In: HRR 154
(1967), pp. 53–87.
[98] James H Banks. “The two-capacity phenomenon: some theoretical issues”. In: Transportation Research Record 1320 (1991).
[99] Jorge A Laval and Carlos F Daganzo. “Lane-changing in traffic streams”. In: Transportation Research Part B: Methodological 40.3 (2006), pp. 251–264.
130
[100] Anupam Srivastava, Wen-Long Jin, and Jean-Patrick Lebacque. “A modified cell
transmission model with realistic queue discharge features at signalized intersections”.
In: Transportation Research Part B: Methodological 81 (2015), pp. 302–315.
[101] Yu Han et al. “Resolving freeway jam waves by discrete first-order model-based predictive control of variable speed limits”. In: Transportation Research Part C: Emerging
Technologies 77 (2017), pp. 405–420.
[102] Caroline J Rodier, Emily Issac, et al. Transit performance measures in California.
Tech. rep. Mineta Transportation Institute, 2016.
[103] U Epa. “Motor Vehicle Emission Simulator (MOVES) User Guide”. In: US Environmental Protection Agency (2010).
[104] Pangwei Wang et al. “A joint control model for connected vehicle platoon and arterial
signal coordination”. In: Journal of Intelligent Transportation Systems 24.1 (2020),
pp. 81–92.
[105] Hao Wang and Xianyue Peng. “Coordinated Control Model for Oversaturated Arterial
Intersections”. In: IEEE Transactions on Intelligent Transportation Systems (2022).
[106] Thorsten Schmidt-Dumont and Jan H van Vuuren. “Decentralised reinforcement
learning for ramp metering and variable speed limits on highways”. In: IEEE Transactions on Intelligent Transportation Systems 14.8 (2015), p. 1.
[107] Monique Van den Berg et al. “Integrated traffic control for mixed urban and freeway
networks: A model predictive control approach”. In: European journal of transport
and infrastructure research 7.3 (2007).
[108] Jack Haddad, Mohsen Ramezani, and Nikolas Geroliminis. “Cooperative traffic control of a mixed network with two urban regions and a freeway”. In: Transportation
Research Part B: Methodological 54 (2013), pp. 17–36.
[109] Christopher JCH Watkins and Peter Dayan. “Q-learning”. In: Machine learning 8
(1992), pp. 279–292.
[110] Tianchen Yuan, Faisal Alasiri, and Petros A Ioannou. “Selection of the Speed Command Distance for Improved Performance of a Rule-Based VSL and Lane Change
Control”. In: IEEE Transactions on Intelligent Transportation Systems (2022).
[111] Xiugang Li et al. “Signal timing of intersections using integrated optimization of
traffic quality, emissions and fuel consumption: a note”. In: Transportation Research
Part D: Transport and Environment 9.5 (2004), pp. 401–407.
[112] Ali Hajbabaie and Rahim F Benekohal. “Traffic signal timing optimization: Choosing
the objective function”. In: Transportation research record 2355.1 (2013), pp. 10–19.
[113] James A Bonneson and Michael D Fontaine. “Evaluating intersection improvements:
an engineering study guide”. In: (2001).
[114] Yizhe Wang et al. “A review of the self-adaptive traffic signal control system based
on future traffic environment”. In: Journal of Advanced Transportation 2018 (2018).
[115] Junchen Jin and Xiaoliang Ma. “A group-based traffic signal control with adaptive learning ability”. In: Engineering applications of artificial intelligence 65 (2017),
pp. 282–293.
[116] Ye Tian et al. “Interactive signal control for over-saturated arterial intersections using
fuzzy logic”. In: 2008 11th International IEEE Conference on Intelligent Transportation Systems. IEEE. 2008, pp. 1067–1072.
[117] Kancharla K Chandan, Alvaro JM Seco, and Ana Bastos Silva. “Real-Time Incident-Responsive Signal Control Strategy under Partially Connected Vehicle Environment”.
In: Journal of Advanced Transportation 2022 (2022).
[118] Hamdi Abdulkareem Mohammed Al-Nuaimi. “The application of variable speed limits to arterial roads for improved traffic flow”. PhD thesis. University of Southern
Queensland, 2014.
[119] Myungeun Eom and Byung-In Kim. “The traffic signal control problem for intersections: a review”. In: European transport research review 12 (2020), pp. 1–20.
[120] François Dion and Bruce Hellinga. “A rule-based real-time traffic responsive signal
control system with transit priority: application to an isolated intersection”. In: Transportation Research Part B: Methodological 36.4 (2002), pp. 325–343.
[121] Wael Ekeila, Tarek Sayed, and Mohamed El Esawey. “Development of dynamic transit
signal priority strategy”. In: Transportation research record 2111.1 (2009), pp. 1–9.
[122] K Chandan, Alvaro M Seco, and Ana Bastos Silva. “Real-time traffic signal control for
isolated intersection, using car-following logic under connected vehicle environment”.
In: Transportation research procedia 25 (2017), pp. 1610–1625.
[123] Wei-Hua Lin and Chenghong Wang. “An enhanced 0-1 mixed-integer LP formulation
for traffic signal control”. In: IEEE Transactions on Intelligent transportation systems
5.4 (2004), pp. 238–245.
[124] EN Barron and H Ishii. “The Bellman equation for minimizing the maximum cost”.
In: Nonlinear Analysis: Theory, Methods & Applications 13.9 (1989), pp. 1067–1090.
[125] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
[126] Dongbin Zhao et al. “DHP method for ramp metering of freeway traffic”. In: IEEE
Transactions on Intelligent Transportation Systems 12.4 (2011), pp. 990–999.
[127] Yihang Zhang and Petros A Ioannou. “Combined variable speed limit and lane change
control for highway traffic”. In: IEEE Transactions on Intelligent Transportation Systems 18.7 (2016), pp. 1812–1823.
[128] Yibing Wang and Markos Papageorgiou. “Local ramp metering in the case of distant
downstream bottlenecks”. In: 2006 IEEE Intelligent Transportation Systems Conference. IEEE. 2006, pp. 426–431.
Abstract
Traffic congestion is a persistently growing problem in urban areas worldwide. To mitigate travel delays, reduce fuel consumption, and address the additional costs produced by traffic congestion, intelligent transportation systems (ITS) technologies, such as dynamic routing, driver information systems, variable speed limits (VSL), lane change (LC) control, ramp metering (RM), and traffic signal control (TSC), have been extensively explored and studied over the past few decades. Although many ITS technologies have proven effective in either freeway or arterial traffic management, the integrated control of the two systems has rarely been investigated due to the difficulty of modeling two completely different traffic patterns and the high complexity of the road network. Some studies have demonstrated the effectiveness of coordinating freeway ramp control with adjacent arterial signals in reducing travel time and ramp queues, which is a preliminary step toward coordinating freeway and arterial (CFA) operations and motivates further investigation in this dissertation.
A prerequisite of traditional traffic control design is a model that reproduces traffic states accurately at an acceptable computational cost. In this regard, the cell transmission model (CTM) is a promising candidate. To improve consistency between macroscopic analysis and microscopic simulations, the original CTM is modified to incorporate the capacity drop effect and a disturbance term that accounts for potential uncertainties. Building upon the modified CTM, a combined feedback-based VSL and LC control scheme is proposed to alleviate freeway bottleneck congestion and reduce the impact of uncertainties. Subsequently, the feedback-based VSL is replaced by a rule-based VSL, where the distance of the upstream VSL zone is treated as a control variable. A lower bound on this distance is derived analytically to prevent additional shockwaves and is validated through microscopic simulations. The established lower bound serves as a design tool for tuning VSL controllers and improving their performance.
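To make the idea of a CTM with capacity drop concrete, the following is a minimal sketch of one density update step. It is not the dissertation's formulation: the function name `ctm_step`, the triangular fundamental diagram, the fixed fractional drop `drop`, and the free outflow at the last cell are all assumptions chosen for illustration, and the disturbance term is omitted.

```python
def ctm_step(rho, inflow, dt, dx, vf, w, rho_jam, q_max, drop=0.1):
    """Advance cell densities by one time step (illustrative CTM variant).

    rho      densities per cell [veh/mi]
    inflow   demand entering the first cell [veh/h]
    dt, dx   time step [h] and cell length [mi]; requires vf * dt <= dx
    vf, w    free-flow speed and backward wave speed [mi/h]
    drop     assumed fractional capacity drop once a cell is congested
    """
    rho_c = q_max / vf  # critical density of the assumed triangular diagram
    n = len(rho)
    # Sending flow (demand): reduced below q_max when the cell is congested.
    demand = [
        min(vf * r, q_max) if r <= rho_c else (1.0 - drop) * q_max
        for r in rho
    ]
    # Receiving flow (supply) of each cell.
    supply = [min(q_max, w * (rho_jam - r)) for r in rho]
    # q[i] is the flow entering cell i; q[n] leaves the last cell unrestricted.
    q = [min(inflow, supply[0])]
    q += [min(demand[i - 1], supply[i]) for i in range(1, n)]
    q.append(demand[n - 1])
    # Conservation of vehicles in each cell.
    return [rho[i] + dt / dx * (q[i] - q[i + 1]) for i in range(n)]
```

With a congested middle cell, the reduced sending flow of that cell is what limits the discharge into the downstream cell, which is the qualitative effect the capacity drop modification is meant to capture.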
To advance the study of CFA operations, the considered road network is expanded to include both a freeway segment and adjacent arterial roads. The integrated control of the two systems is developed in three steps: first, a freeway traffic control (FTC) strategy that coordinates VSL, LC and RM actions is proposed; second, an arterial traffic control strategy that coordinates TSC, offset and speed recommendations is proposed; finally, the freeway and arterial traffic control strategies are integrated with necessary modifications to each subcomponent. All control designs are based on a Q-learning (QL) framework for a higher degree of coordination and fast implementation.
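For readers unfamiliar with Q-learning, the sketch below shows the standard tabular update of Watkins and Dayan with an epsilon-greedy action choice. It only illustrates the general technique: the state and action labels, the reward, and the parameters `alpha`, `gamma` and `eps` are hypothetical and are not taken from the dissertation's controller design.

```python
import random
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s, a)."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def epsilon_greedy(Q, s, actions, eps=0.1, rng=random):
    """Explore with probability eps, otherwise pick the greedy action."""
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda b: Q[(s, b)])


# Hypothetical usage: two coarse traffic states and two speed-limit actions.
Q = defaultdict(float)
actions = ["hold", "lower"]
q_update(Q, "congested", "lower", r=-1.0, s_next="free", actions=actions)
```

In a traffic setting the reward would typically penalize delay or queue length, so that repeated updates favor actions that relieve congestion; the specific reward used in the dissertation is not reproduced here.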
Asset Metadata
Creator
Yuan, Tianchen
(author)
Core Title
Coordinated freeway and arterial traffic flow control
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Degree Conferral Date
2024-05
Publication Date
06/03/2024
Defense Date
05/29/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
integrated traffic control,intelligent transportation systems,OAI-PMH Harvest,Q-learning,ramp metering,traffic signal control,variable speed limit
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Ioannou, Petros (committee chair), Nuzzo, Pierluigi (committee member), Savla, Ketan (committee member)
Creator Email
imkiddingbb@gmail.com,tianchey@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113984881
Unique identifier
UC113984881
Identifier
etd-YuanTianch-13052.pdf (filename)
Legacy Identifier
etd-YuanTianch-13052
Document Type
Dissertation
Rights
Yuan, Tianchen
Internet Media Type
application/pdf
Type
texts
Source
20240603-usctheses-batch-1164 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu