Congestion Reduction via Private Cooperation of
New Mobility Services
by
Ali Ghafelebashi
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(INDUSTRIAL AND SYSTEMS ENGINEERING)
December 2024
Copyright 2024 Ali Ghafelebashi
Acknowledgments
First and foremost, I wish to express my deepest gratitude to my PhD advisor, Dr. Meisam
Razaviyayn, for his invaluable guidance, inspiration, support, and patience throughout my doctoral
journey. He has been an exemplary researcher, mentor, instructor, and role model, setting a standard
of excellence that I aspire to follow. I could not have asked for a better mentor.
I am profoundly thankful to Professor Maged M. Dessouky for his consistent feedback and
support that enriched and shaped my research during my PhD.
I would also like to extend my heartfelt appreciation to my defense and qualifying exam
committee members: Dr. Meisam Razaviyayn, Dr. Maged M. Dessouky, Dr. Petros A. Ioannou,
Dr. John Gunnar Carlsson, and Dr. Sze-Chuan Suen. Their generosity, insights, and thoughtful
feedback have greatly influenced my work.
My PhD journey was made all the more meaningful by the support and encouragement of
incredible friends and colleagues. Special thanks go to Tianjian, Daniel, Andy, Sina, Maher, Babak,
Xinwei, Yinbin, Zeman, and Devansh for their support, friendship, and the memorable moments we
shared throughout this journey.
The foundation of my accomplishments rests on the unwavering support of my family. I am
deeply grateful for their unconditional love and encouragement throughout my life.
Finally, I want to thank my beloved wife, Zahra. I was blessed to meet her at USC, and she has
been my greatest source of support ever since. Her love, encouragement, and belief in me have
given me the strength to achieve this milestone.
Table of Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Chapter 1:
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter 2:
Congestion Reduction via Personalized Incentives . . . . . . . . . . . . . . . . . . . . 15
2.1 Incentive Offering Mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Scenario I: Operating Below Network Capacity . . . . . . . . . . . . . . . 18
2.1.2 Scenario II: Operating Above Network Capacity . . . . . . . . . . . . . . . 19
2.1.3 Algorithm for Offering Incentives and A Distributed Implementation . . . . 23
2.2 Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.1 Simulation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.2 Experiment I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.3 Experiment II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.4 Experiment III . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Chapter 3:
Incentive Systems for New Mobility Services . . . . . . . . . . . . . . . . . . . . . . . 42
3.1 Why Offering Incentives to Organizations Rather Than Individuals? . . . . . . . . 44
3.2 Incentive Offering Mechanism and Problem Formulation . . . . . . . . . . . . . . 46
3.3 Incentivization Algorithm and A Distributed Implementation . . . . . . . . . . . . 53
3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Simulation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Chapter 4:
Deep Traffic Prediction via Private Cooperation of New Mobility Services . . . . . . 69
4.1 Background on Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.2 Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.1 Problem Definition & Notations . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.2 Traffic Forecasting Using Graph Neural Networks (GNNs) . . . . . . . . . 74
4.2.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.1 Dataset Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.2 Construction of the Transportation Network Graph . . . . . . . . . . . . . 80
4.3.3 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.3.4 Experiment Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.3.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Chapter 5:
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Chapter 6:
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.1 Appendix - Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.1.1 List of Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.1.2 Details of Alternating Direction Method of Multipliers (ADMM) . . . . . . 105
6.1.3 Distributed Computation of Algorithm 1 . . . . . . . . . . . . . . . . . . . 112
6.1.4 UE Algorithm - Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.1.5 An Example of the Model and Notations - Chapter 2 . . . . . . . . . . . . 114
6.1.6 Details of the Numerical Experiments . . . . . . . . . . . . . . . . . . . . 116
6.2 Appendix - Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.2.1 List of Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.2.2 Reformulated Optimization Model for the ADMM Algorithm . . . . . . . 123
6.2.3 Distributed Incentivization Algorithm . . . . . . . . . . . . . . . . . . . . 124
6.2.4 Limitations and Further Discussions . . . . . . . . . . . . . . . . . . . . . 124
6.2.5 Supplementary Figure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.3 Appendix - Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.3.1 Nonoverlapping Block Bootstrap (NBB) . . . . . . . . . . . . . . . . . . . 128
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
List of Tables
2.1 Experiment I: Linear model (2.3). . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Comparison of $1000 and $10000 budget in Experiment I. . . . . . . . . . . . . . 32
2.3 Experiment II: Linear model (2.3) for incentive set I1. . . . . . . . . . . . . . . . 34
2.4 Experiment II: Linear model (2.3) for incentive set I2. . . . . . . . . . . . . . . . 34
2.5 Comparison of $1000 and $10000 budget in Experiment II. . . . . . . . . . . . . . 34
2.6 Experiment III: Linear model (2.3) and model (2.6). . . . . . . . . . . . . . . . . . 36
2.7 Comparison of $1000 and $10000 budget in Experiment III. . . . . . . . . . . . . 36
3.1 Distribution of the number of drivers that were assigned to an alternative route using
Algorithm 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.1 Dataset details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.2 Performance analysis of the framework on Uber and Lyft inflow data with RMSE
metric. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.3 Performance analysis of the framework on Uber and Lyft outflow data with RMSE
metric. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4 Performance analysis of the framework in inflow prediction on different homogeneous collaboration settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5 Performance analysis of the framework in outflow prediction on different homogeneous collaboration settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1 Set of edges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.2 Set of routes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.3 Distribution of the offered incentives in Experiment I with different penetration rates. . . 117
6.4 Distribution of the offered incentives in Experiment II for incentive set I1 with
penetration rate of 100%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.5 Distribution of the offered incentives in Experiment II for incentive set I2 with
penetration rate of 100%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.6 Distribution of the offered incentives in Experiment III for model (2.3) with penetration rate of 100%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.7 Distribution of the offered incentives in Experiment III with different penetration
rates for model (2.6), Algorithm 1. . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.8 Distribution of the offered incentives in Experiment III with different penetration
rates for model (2.6), Gurobi. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.9 Distribution of the offered incentives in Experiment III with different penetration
rates for model (2.6), MOSEK. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.10 Effect of the penetration rate on travel time decrease (hour) in Experiment I. . . . . 119
6.11 Effect of the penetration rate on the percentage of travel time decrease in Experiment I. . . 119
6.12 Effect of the penetration rate on travel time decrease (hour) in Experiment III,
model (2.6), Algorithm 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.13 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Algorithm 1. . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.14 Effect of the penetration rate on travel time decrease (hour) in Experiment III,
model (2.6), Gurobi. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.15 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Gurobi. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.16 Effect of the penetration rate on travel time decrease (hour) in Experiment III,
model (2.6), Mosek. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.17 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Mosek. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
List of Figures
1.1 (a) Traditional platforms for offering incentives. (b) Presented platform for offering
incentives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2 Companies collaborate via sharing privatized data. . . . . . . . . . . . . . . . . . . 12
1.3 Companies collaborate via sharing privatized model. . . . . . . . . . . . . . . . . 13
2.1 Studied region in Experiment I. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Effect of the penetration rate on the percentage of travel time decrease in Experiment I. 32
2.3 Studied region in Experiment II. . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4 Studied region in Experiment III. . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.5 Total estimated number of drivers entering the system (in 15-minute intervals). (a)
Experiment I, (b) Experiment II, and (c) Experiment III. . . . . . . . . . . . . . . . 37
2.6 Effect of solving method on the percentage of travel time decrease in Experiment III. 38
2.7 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Algorithm 1. . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.8 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Gurobi solver. . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.9 Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), MOSEK solver. . . . . . . . . . . . . . . . . . . . . . . . . 40
2.10 Normalized gap error of Algorithm 1 after 50,000 iterations with different cases of
penetration rate and budget. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1 (a) Traditional platforms for offering incentives. (b) Presented platform for offering
incentives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Subnetwork G̃ (selected in blue dashed rectangle). . . . . . . . . . . . . . . . . . . 44
3.3 Payments and gains for self-driving vehicle organizations. . . . . . . . . . . . . . . 45
3.4 Payments and gains for delivery service organizations. . . . . . . . . . . . . . . . 46
3.5 Payments and gains for ride-hailing service organizations. . . . . . . . . . . . . . 46
3.6 Studied region and the highway sensors inside the region. This region encompasses
several areas notorious for high traffic congestion, particularly Downtown Los
Angeles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.7 Total estimated number of drivers entering the system over time (in 5-minute
intervals). The plot shows that traffic peak happens between 7 AM and 8 AM. . . . 58
3.8 Percentage of travel time decrease with different budgets at VOT=$157.8 per hour
using Algorithm 3. The amount of travel time reduction shows a positive correlation
with the amount of incentivization budget and the penetration rate. . . . . . . . . . 60
3.9 Percentage of travel time decrease with different budgets at VOT=$78.9 per hour
using Algorithm 3. The amount of travel time decrease is similar or larger compared
to Fig. 3.8 with VOT=$157.8 due to the smaller cost of incentivization. . . . . . . . 61
3.10 Total cost of incentivization of one organization with different budgets and different
penetration rates at VOT=$157.8 per hour using Algorithm 3. Incentivization cost
increases with the amount of budget as the model incentivizes more drivers to reduce
traffic. The cost is larger at larger penetration rates at $10000 budget because the
model incentivizes more drivers. At smaller budgets, the incentivization cost is
smaller at larger penetration rates because of more flexibility in selecting drivers at
limited budgets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.11 Cost of incentivization per deviated drivers of one organization with different
budgets and different penetration rates at VOT=$157.8 per hour using Algorithm 3.
At larger penetration rates, the platform can incentivize drivers more efficiently due
to access to a larger pool. The platform spends smaller incentivization amount per
deviated driver at larger penetration rates. . . . . . . . . . . . . . . . . . . . . . . 63
3.12 Travel time decrease vs. incentivization cost for different number of organizations
at a 5% penetration rate and VOT=$157.8 per hour using Algorithm 3. The incentivization cost for the same travel time reduction is smaller when the number of
organizations is smaller. This phenomenon is due to the cancel-out effect between
the gain and loss of drivers of the organizations. . . . . . . . . . . . . . . . . . . . 64
3.13 Comparison of travel time reduction percentage using different solvers with different
penetration rates and budgets at VOT=$157.8 per hour. Gurobi exhibits a slight
performance advantage over Algorithm 3 at higher penetration rates and budgets. . 66
3.14 Comparison of the relative execution time of Algorithm 3 vs. different solvers at
different penetration rates at VOT=$157.8 per hour. Algorithm 3 execution time
consistently outperforms Gurobi and MOSEK up to 12 and 120 times, respectively. 66
3.15 Comparison of the relative incentivization cost using Algorithm 3 vs. Gurobi at
different penetration rates, VOT=$157.8 per hour, and one organization. Both
methods utilize similar incentivization amount for smaller budgets but at $10000
budget, Gurobi spends up to $5000 more. . . . . . . . . . . . . . . . . . . . . . . 67
4.1 Traffic prediction using spatio-temporal data of last T timestamps {t−T+1, t−T+2, ..., t} on graph G to predict the traffic data of the next T timestamps {t+1, t+2, ..., t+T}. . . . 73
4.2 Predict inflow or outflow X^taxi_{t+1:t+T} on graph Gr using speed data X^speed_{t−T+1:t} on graph Gh. . . . 76
4.3 DP prediction model f_s(X^speed_{t−T+1:t}; Gh) trained using public data to predict private data where public and private data have different graphs. . . . 77
4.4 Collaboration framework in the context of company 1 helping company 2 by sharing DP model f_s^{Company 1}(X^speed_{t−T+1:t}; Gh). . . . 78
4.5 Queens taxi zones based on NYC TLC partitioning. . . . . . . . . . . . . . . . . . 80
4.6 Queens link map of real-time data. Markers represent sensors. Each link consists of
multiple sensors. Sensors of a link are depicted with the same marker color. . . . . 81
4.7 Average inflow/outflow rate of the complete taxi data vs. the average speed data at
different times of the day. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.8 Average inflow/outflow rate of the complete taxi data vs. the average speed data on
different days of the week. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.9 Highlighted zones are removed from the region graph of the taxi datasets due to the
sparsity of signals in them. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.10 Region graph of taxi data of Queens borough. . . . . . . . . . . . . . . . . . . . . 85
4.11 Highway graph of speed data of Queens borough. . . . . . . . . . . . . . . . . . . 86
4.12 Average inflow rate of Uber and Lyft data on different dates. . . . . . . . . . . . . 87
4.13 The model that a company trains solely without any collaboration. This figure is
illustrated using Uber as an example. . . . . . . . . . . . . . . . . . . . . . . . . 90
4.14 A model trained by a company through collaboration with other companies by receiving DP datasets. This figure illustrates Uber as the company training the model and Lyft as the company sharing the DP datasets X̃^{inflow, Lyft}_{t−T+1:t} and X̃^{outflow, Lyft}_{t−T+1:t}. . . . 91
4.15 A model trained by a company through collaboration with other companies by
receiving DP models. This figure illustrates Uber as the company training the model
and Lyft as the company sharing the DP models. . . . . . . . . . . . . . . . . . . . 92
4.16 Confidence intervals of test errors for Uber models in inflow prediction, derived
from 100 bootstrapped samples using the NBB method. . . . . . . . . . . . . . . . 95
4.17 Confidence intervals of test errors for Uber models in outflow prediction, derived
from 100 bootstrapped samples using the NBB method. . . . . . . . . . . . . . . . 95
4.18 Confidence intervals of test errors for Lyft models in inflow prediction, derived from
100 bootstrapped samples using the NBB method. . . . . . . . . . . . . . . . . . . 96
4.19 Confidence intervals of test errors for Lyft models in outflow prediction, derived
from 100 bootstrapped samples using the NBB method. . . . . . . . . . . . . . . . 96
6.1 Steps of distributed implementation of Algorithm 1. Step 0: Incentive offering platform shares the constant parameters and matrices with drivers. Step t-1: The incentive offering platform updates u^{t+1}. Step t-2: Incentive offering platform sends the required information to drivers. Step t-3: Incentive offering platform receives the updated vectors from drivers. Step t-4: Incentive offering platform updates γ^{t+1}, β^{t+1}, and dual variables. . . . 113
6.2 Network example G1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.3 Data preparation workflow. First, traffic data and sensors’ location data are received
from the ADMS Server. Next, sensors’ location data is processed to compute sensor
distances. Finally, sensor distances and traffic data are combined to create the graph
network data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.4 Nonoverlapping block bootstrap on time series data of size n with block length of 2. 128
Abstract
With rapid population growth and urban development, traffic congestion has become an inescapable
issue, especially in large cities. Many congestion reduction strategies have been proposed in the past,
ranging from roadway extension to transportation demand management. In particular, congestion
pricing schemes have been used as negative reinforcements for traffic control. In this dissertation, we
study an alternative approach of offering positive incentives to drivers and organizations to change
drivers’ routing behavior. More specifically, we propose algorithms to reduce traffic congestion
and improve routing efficiency by offering incentives to drivers and organizations. The incentives
are offered after solving large-scale optimization problems in order to minimize the total travel
time (or minimize any cost function of the network, such as total carbon emission). Due to the
massive size of the optimization problems, we develop distributed computational approaches.
The proposed distributed algorithms are guaranteed to converge under a mild set of assumptions
that are verified with real data. Utilizing real-time traffic data from Los Angeles, we demonstrate
congestion reduction in arterial roads and highways through our algorithms and an extensive set of
numerical experiments. Finally, we introduce a collaboration framework that enables organizations
to train private traffic prediction models to share with the central planner or other organizations.
Our framework leverages the concept of differential privacy (DP) to protect the privacy of users.
This approach empowers the central planner to predict organizations’ traffic patterns and make
informed incentivization decisions without access to their data or compromising their privacy. Additionally,
organizations can improve their traffic prediction models by utilizing our private collaboration
framework. We evaluate and validate the performance improvements in organizations’ traffic
prediction models achieved through collaboration within the presented framework. This evaluation
is based on real-world taxi data from Uber and Lyft and simulated organizations with varying market
shares under different privacy levels and collaboration scenarios.
Chapter 1
Introduction
1.1 Background and Motivation
Today, traffic congestion is one of the major issues in metropolitan areas across the globe. Traffic congestion diminishes the overall quality of life, leads to significant economic losses, degrades air quality, and escalates health vulnerabilities due to emissions. According to INRIX, a company that provides traffic management services through transportation data analysis, the United States economy suffered more than $81 billion in congestion-related losses [1]. Moreover, United States (U.S.) drivers lost 51 hours and $869 due to traffic congestion in 2022. For instance, in Los Angeles, one
of the most congested cities in the U.S., each driver lost $1601 and was delayed by 95 hours in 2022
due to traffic congestion. In addition to direct economic losses, traffic congestion can worsen air
quality and adversely affect health conditions. According to the Transportation Research Board,
one of the seven program units of the National Academies of Sciences, Engineering, and Medicine,
vehicle emissions are the main cause of air pollution [2]. The escalation of the intensity and duration of
traffic congestion can raise emissions levels [3], and as a result, air pollution (specifically NO2)
increases with traffic congestion [4]. Based on the study by Hennessy and Wiesenthal, when roads are congested, drivers show aggressive behaviors more often, and their stress levels rise [5]. This research shows that stress, measured on a Likert scale, a psychometric scale ranging from 0 = “low-stress level” to 4 = “high-stress level”, can more than double, from 0.8 to 1.73, during high congestion levels. Congestion can thus also increase the number of accidents [6].
As possible solutions to traffic congestion, Cambridge Systematics [7] –which works on planning and policy, movement of people and goods, software design and development, and effective partnerships and objective analysis– has proposed three categories of strategies:
1) Adding more capacity;
2) Transportation System Management and Operations (TSM&O);
3) Demand management.
The first strategy includes expanding infrastructure, such as increasing the number of available highway lanes and constructing new roads. While this strategy clearly reduces congestion, factors such as financial constraints and opposition from local and national groups have made it very challenging to pursue in recent years. In TSM&O, the aim is to improve the efficiency
of the current transportation infrastructure and manage the short-term demand for the existing
network. Reversible commuter lanes, dynamic re-timing of traffic signals, providing information
about travel conditions to travelers, and converting streets to one-way operations are examples of
TSM&O strategies. Compared to the “adding more capacity” category, the cost of implementing TSM&O strategies is lower. While TSM&O strategies have been shown to be highly cost-effective, we cannot rely only on them. The final category, Demand Management, includes Travel
Demand Management (TDM), non-automotive travel modes, and land use management. Travel
demand management is focused on managing travel demand instead of increasing transportation
infrastructure. TDM includes putting more people into fewer vehicles (e.g., ride-sharing), shifting
the time of travel, and removing the need for travel altogether (e.g., teleworking). The requirement
that travelers drastically change their lifestyle is one of the main obstacles to the TDM strategies.
Another limitation of TDM is the inflexible schedules of workers. Investing in non-automotive modes of travel, such as pedestrian infrastructure, bikeways, and bus and rail transit systems, is another strategy to decrease the rate of travel by personal vehicles.
The above strategies can be divided into two groups: (a) expanding the transportation infrastructure and (b) increasing the efficiency of the existing network. Strategies based on expanding the
roads need a long planning horizon before they can lessen congestion [8]. Based on the estimation
by the U.S. Census Bureau [9], the population of the United States is projected to increase by 23
million between 2020 and 2030. If low public support for tax increases (needed to expand the
transportation infrastructure) continues, innovative solutions to utilize the existing network more
efficiently become crucial. Thus, there is an urgent need to make the existing road network more
efficient.
In this dissertation, the focus is on changing drivers’ behavior by offering incentives in order
to reduce traffic congestion. More precisely, our solution lies in category 3 of the Cambridge
Systematics strategy framework: Demand Management. Perhaps the closest strategy to our solution
is pricing mechanisms in the literature. Road pricing policies, such as assigning a fee or tax for
driving on a highway/road, have been widely studied in theory and practice. The works of [10] and [11] study the change in drivers’ behavior by imposing monetary penalties on the drivers’ travel
(see the book [12], Ph.D. thesis [13], and the references therein). In this category, a decline in traffic congestion is expected because people are discouraged from using congested roads.
Pricing strategies may depend on different factors such as the time of travel [14], distance [15],
or vehicle characteristics [16]. While pricing seems a legitimate solution from a market point of
view, issues such as equity barriers complicate the implementation of congestion pricing/taxation
schemes [17]. In particular, congestion pricing was tested in Lyon, France; Mexico City, Mexico; and Genoa, Italy, but it failed to achieve permanent implementation due to low public acceptance [18]. In addition to equity concerns, complexity and
uncertainties in designing pricing mechanisms have prevented policymakers from implementing
advanced congestion pricing schemes [19]. To resolve such barriers, Gu et al. [20] proposed pricing models for a barrier area, an area in which entering drivers can be charged in different ways, so as to address efficiency and equity concerns. Their model is a joint distance and time toll. Under this model, a driver traveling on very long but uncongested routes does not contribute to traffic congestion, so only the time spent in the traffic jam is penalized.
Tradable credits (TCs) or tradable mobility permits (TMPs) are other token-based pricing
mechanisms [21]. These schemes allow certain tokens/credits to be traded among drivers through a
market mechanism. The total amount of credits is typically considered constant and limits the total
number of vehicles in use—see [22] for a review article on this topic. Various schemes, such as
receiving free travel cards [23], have been proposed in this category. In addition, many mathematical
programming approaches, such as the ones in [24] and the references therein, are proposed for the
modeling and algorithms for such token-based schemes. The theoretical advantages of such tradable
credits have been shown in [25]. While such cap-and-trade programs have been implemented in some economic sectors, such as airport slot allocation [26], they have not been implemented for individual-level personal travel and daily commutes [27] due to the design complexities of such token markets [28]. In addition, they do not consider the personalized preferences of different drivers.
Lately, researchers have paid more attention to positive incentive policies. Based on the psychological theory of reactance, rewarding desirable behavior could work better than penalizing
undesirable behavior [29]. Moreover, rewarding is a more popular policy than a taxation approach [30]. While the effectiveness of rewarding in changing the individual’s behavior has been
shown in [31] and [32], there are a limited number of studies on the effectiveness of rewarding
policies in the transportation area. Among these studies, the INSTANT project [33] has provided
positive incentives to motivate commuters to avoid peak times for their travel. Peak time or rush
hours are the time of the day with maximum congestion. The CAPRI project of Stanford [34] is
another example of peak avoidance studies using positive incentives. This study also encourages
other commute modes, such as walking and biking, and has shown the effectiveness of the rewarding policy in congestion reduction. Another series of studies has been conducted in the
Netherlands on the effectiveness of monetary rewards to avoid rush hour driving [35, 36, 37, 38].
They offered incentives to the drivers to avoid rush hours (before and after the peak of the traffic
in the morning), work from home, or choose alternative travel modes such as cycling, carpooling,
and public transport. Xianbao et al. [39] offered different levels of incentives to alter the drivers’
departure and route choices. They provided information about the travel time for each offered
departure time and alternative, so that the driver does not have to rely on her own experience in choosing
the alternatives. Another form of reward was recently studied in [28], where token form incentives
were offered for different travel choices such as route, travel modes, and ride-sharing. The proposed
model learns individuals’ decisions and adapts to their preferences based on their travel history.
For each request, a set of choices is generated for the user, and the more an alternative contributes to improving the network, the more valuable the incentive offered for it. While these
policies were successful in short-term experiments, the effectiveness of the rewarding policy in
changing the behavior of individuals does not necessarily result in permanent changes in travelers’
behavior. For example, although the effectiveness of the rewarding policy in changing the behavior
of individuals was shown in [40], most of the individuals returned to their previous behavior in the
absence of incentives. More recently, [41] and [42] provide incentives to (or charge) volunteer truck
drivers to improve the overall traffic condition in a budget-balanced mechanism. [43] considers
VOT (Value of Time) in the mechanism to make it personalized.
Different choices have been used as the incentive in transportation studies. Fujii et al. [44]
used free bus tickets as an incentive to study the changes in the frequency of traveling by bus. In
a different project in Germany, to study the increase in commuting by bus, a pre-paid bus ticket
was offered to college students [45]. Also, in Australia, an early bird ticket program was offered to
relieve rail overcrowding during peak hours [46]. Free WiFi and discounted ticket fares have been effective in motivating Beijing commuters to avoid the rush hours of the
morning [47]. In the Tripod project [28], they offered token form incentives. Earned tokens depend
on energy savings and can be redeemed for goods and services at local businesses and agencies that
have joined the project. In the CAPRI project [34], users of the app collected points, and they could
trade 100 points for $1 or use them to play a game in which they may lose points or gain money and
points. Knockaert et al. [17] provided smartphones to their users, and users could receive money for
their credits or keep the smartphone after the project if they reached a sufficient amount of credits.
The drivers’ preferences in selecting different routes can be considered in the incentive offering
platform. Mohan et al. studied different factors impacting drivers’ decisions for routing in [48].
They partitioned these factors into two categories: static factors and dynamic factors. The static
factors, which remain fixed for a given driver, include the number of available transportation options and the distance to nearby transit. The dynamic factors, such as weather and travel purpose, change from trip to trip and from one person to another. They identified these factors by performing
interviews and surveys and concluded that personalizing the incentives can be advantageous. Also,
the driver’s preferences can be learned through interaction with the individual [49, 28]. The goal
of offering personalized incentives is to closely tie the offer to the individual’s preferences, thus
maximizing the probability of changing the drivers’ behaviors [49].
In Chapter 2, we study the problem of offering personalized incentives to minimize a global cost
function in the network. Previous studies (e.g., [35]) consider static rewards for static options like
teleworking, biking, and walking. However, our model assigns different rewards for different alternative routes for different drivers based on traffic conditions and personalization factors. Consequently,
we have more freedom in offering incentives, and our methodology results in an optimization
problem with a larger number of decision variables. The implementation of our proposed model
could be through a smartphone app where the traffic data can be used to offer incentives to drivers.
In addition, smartphones will help the central planner distribute the computational load for finding
the optimal incentive offering strategy.
Almost all pricing and incentivizing studies focus on traditional mobility systems, and little
has been done for future mobility services. With recent technological advancements, the shape of
mobility services is drastically changing. Traditionally, the driver is the car owner and is the ultimate
decision-maker on the origin, destination, routing, and time of travel. In contrast, future mobility
systems consist of different organizations and companies that completely (or partially) influence the
behavior of individual human (or AI-based) drivers. Such organizations include car-sharing services
(e.g., Zipcar, Turo), ride-hailing services (e.g., Uber, Lyft), crowdsourcing delivery systems (e.g.,
Amazon Flex, Instacart, Doordash), navigation applications (e.g., Google Maps, Waze), and even
companies producing autonomous cars with built-in navigation systems (e.g., Tesla), to name just a
few.
In traditional congestion pricing and incentive offering mechanisms, incentives are offered
directly to individual drivers to influence their decisions, such as departure time and routing (Fig. 1.1
(a)). In modern and future mobility services, many of these decisions are indirectly (or directly)
made by organizations providing different transportation services. For example, navigation apps,
which are regularly used by almost 70% of smartphone users [50, 51], influence the routing decisions
of millions of drivers daily. Another example is crowdsourcing delivery platforms such as Amazon
Flex, Instacart, and DoorDash. According to a recent study [52], the revenue of DoorDash during
the fourth quarter of 2022 increased by 40% to $1.8 billion from the $1.3 billion in revenue that it
recorded during the same period in 2021. Another example of such organizations is ride-hailing
companies such as Uber and Lyft. According to the announced results by Uber for the fourth
quarter of 2022 [53], the growth rate of gross bookings increased from 12% in the fourth quarter of 2021 to 17.7% in the fourth quarter of 2022. In addition to these drastic changes in mobility
services, the rise of autonomous cars will make future mobility services the ultimate decision-maker
in routing, origin-destination selection, and travel time in many applications. Thus, instead of
incentivizing individual drivers, it is more advantageous to incentivize organizations to reduce
congestion. Intuitively, since organizations have more flexibility and more power to change the
traffic, incentivizing organizations is expected to be more efficient than incentivizing individual
drivers. Furthermore, such a mechanism has more options for balancing route selection across the
large pool of drivers employed by the organization. Motivated by this idea, Chapter 3 develops an
incentive offering mechanism to organizations to indirectly (or directly) influence the behavior of
individual drivers (Fig. 1.1 (b)).
In Chapter 3, we show that incentivizing organizations can be more effective than traditional
individual-level incentive offering mechanisms since each organization can control a large pool of
individual drivers, thus moving the traffic flow toward the optimal “system-level” performance. In
addition, such an “organization-level” incentive offering enjoys more flexibility than the individual “driver-level” incentive mechanisms. For example, as we will discuss in this chapter, the time of travel or the choice of routing can be influenced significantly by providing incentives to organizations rather than to individual drivers. Our presented approach relies on historical data as well as demand estimates provided by organizations to predict the traffic flow of the network, and provides incentives to organizations to reduce their travel time by changing the behavior of individual drivers in their organization.

Figure 1.1: (a) Traditional platforms for offering incentives. (b) Presented platform for offering incentives.
In incentivization frameworks, the central planner (e.g., government) must accurately forecast
upcoming traffic conditions and potential congestion to incentivize drivers to adjust their routes and
reduce traffic effectively. Additionally, collaborating companies need reliable demand and supply
forecasts to maintain balance and optimize their operations. Machine learning has become integral
to traffic forecasting by enabling organizations to process large datasets and derive actionable
insights. For instance, ride-hailing and delivery services leverage machine learning to predict
Estimated Time of Arrival (ETA) and balance supply-demand dynamics [54, 55, 56]. Similarly,
cities rely on accurate traffic forecasts to implement effective traffic management strategies [57, 58].
Traffic forecasting, a critical component of incentivization frameworks, relies heavily on accurate
predictions to optimize traffic flow and inform decision-making for both central planners and
collaborating companies. This data-driven approach to traffic forecasting is inherently a time
series forecasting problem due to the temporal nature of traffic data. Early studies focused on
historical data and statistical models for time series forecasting such as Historical Average (HA),
VAR [59], ARIMA [60, 61], and Seasonal ARIMA [62]. The limited practical performance of these models was surpassed by the accuracy of machine-learning-based approaches such as
KNN [63, 64], SVM [65], and SVR [66, 67]. With the advancement of neural network based
approaches with the capability of capturing more complex non-linearities, researchers developed
traffic prediction models using LSTM [68] and GRU [69]. While these methods excel at learning
temporal patterns, they fall short in capturing the spatial dependencies inherent in urban traffic data.
To address this limitation, some studies have developed deep learning models that simultaneously
capture spatial and temporal patterns in traffic data. [70] utilizes a CNN-based residual network
to capture both temporal and spatial dependencies in inflow and outflow prediction by defining a
2-D grid-based segmentation of the map. [71] combines LSTM for temporal patterns with CNN
for spatial patterns modeling. More recent advancements leverage graph-based representations of
traffic data to capture spatial dependencies. Graph Convolutional Networks (GCNs) have become
central to this effort. DCRNN [72] and T-GCN [73] integrate GCNs with RNNs to model temporal
dependencies, while STGCN [74] combines GCNs with CNNs for temporal correlations. Advanced
models like ASTGCN [75] and GMAN [76] further enhance performance by incorporating graph
attention mechanisms (GAT) to better capture spatial-temporal relationships. These advancements
have made traffic forecasting more robust, enabling the accurate predictions needed to support
incentivization frameworks’ dynamic and complex requirements.
The performance of these machine learning models is heavily influenced by the quality and
quantity of the data used for training. Access to diverse datasets enables models to capture a broader
range of patterns and generate more accurate predictions. Collaboration between companies offers a
potential solution to enhance data access by facilitating data sharing to improve model performance.
Since each company collects data from distinct users, regions, and time periods, collaboration
allows them to leverage a richer diversity of user behaviors and spatiotemporal patterns. However,
despite the evident advantages, two significant challenges impede data sharing: the competitive
value of proprietary data and concerns over user privacy.
For many companies, data is a key asset that provides a competitive edge, and sharing it could
diminish that advantage. For example, Domino’s Pizza avoided partnering with third-party delivery
apps until 2023 due to concerns about giving competitors access to their data. As Domino’s CEO
stated, “It’s just not clear to me why I would want to... give up the data in our business to some
third party who will ultimately use it against us” [77].
Privacy concerns further complicate data sharing, as breaches can severely damage a company’s reputation. The Facebook-Cambridge Analytica scandal in 2018 cost Meta a $725 million
settlement and resulted in significant reputational damage due to selling users’ data [78]. In another example, the Strava fitness app inadvertently revealed the locations and staffing details of
military bases by publishing heatmaps of aggregated user location data [79]. Governments have
designed different regulations to prevent such risks, including the US’s AI Bill of Rights [80], the European
Union’s General Data Protection Regulation (GDPR) [81], the California Consumer Privacy Act
(CCPA) [82], and China’s Personal Information Protection Law (PIPL) [83]. Violations of these
laws can result in substantial penalties. For example, the European Union fined Meta $1.3 billion
for violating GDPR by transferring European Union user data to the United States without adequate
safeguards in 2023 [84].
To address competitive concerns, companies can focus on achieving mutual benefits, such as
accessing competitors’ data through balanced collaboration. However, privacy concerns remain
a significant obstacle. Differential privacy (DP) offers a viable solution by enabling companies
to share data while protecting sensitive individual information. Differential privacy provides a
mechanism for protecting the privacy of individuals contributing data to a dataset used in data-driven
decision-making systems [85]. It has garnered significant attention for its mathematically rigorous
formulation of privacy protection [86, 87, 88]. Beyond theoretical advancements, differential
privacy has been widely adopted in real-world applications across various industries. For example,
Uber employs differential privacy to analyze traffic patterns (e.g., trip distances) without exposing
individual driver or rider data [89]. Apple has integrated differential privacy into several projects [90,
91, 92], including identifying frequently visited locations to improve features like Memories and
Places in iOS 17 [90]. In government, the U.S. Census Bureau applied differential privacy to
the publicly released 2020 census data to ensure respondent privacy [93]. Differential privacy
mechanisms prevent information leakage by introducing random noise into data, commonly using
methods such as the Laplace mechanism [85] or the Gaussian mechanism [94]. In machine learning,
differential privacy techniques can be applied to protect private training data. For instance, DP-SGD
adds noise to gradients during the training process to ensure privacy [95]. However, as differential
privacy mechanisms can impact model performance, recent studies have explored using public data
(data without privacy restrictions) alongside private data to mitigate this trade-off and improve model
accuracy [96, 97, 98]. This combination of theoretical rigor and practical applications highlights the
versatility and importance of differential privacy in modern data-driven systems. In the collaboration
of different organizations for improving their traffic prediction models, differential privacy allows
organizations to strike a balance between privacy protection and data utility by defining a privacy
budget. This DP-based approach enables companies to maintain trust, protect their competitive
edge, and collaborate effectively while improving predictive models and ensuring privacy.
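As a concrete illustration of the Laplace mechanism mentioned above, the sketch below privatizes a simple numeric query. The count query and parameter values are hypothetical examples, not from the dissertation.

```python
import random

# Minimal sketch of the Laplace mechanism for epsilon-differential privacy:
# release f(D) + Lap(0, sensitivity / epsilon). The count query and the
# parameter values below are illustrative, not taken from the dissertation.
def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Return true_value perturbed by Laplace noise of scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    # A Laplace(0, b) sample is the difference of two Exponential(1/b) samples.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise

# Example: privatize a count query (sensitivity 1) with a budget of epsilon = 0.5.
noisy_count = laplace_mechanism(1000, sensitivity=1.0, epsilon=0.5)
```

Smaller epsilon (a tighter privacy budget) yields a larger noise scale, which is exactly the privacy-utility trade-off discussed above.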
Companies can protect the privacy of their data by mutually sharing a differentially private
version of it (Fig. 1.2). Despite the simplicity of this approach, the composition theorem in
differential privacy states that privacy leakage accumulates every time these companies share
information about the same data (the same pool of users in this scenario) [99]. Hence, a bounded privacy budget limits the number of times that companies can request private data from each other. Additionally, the DP mechanism achieves privacy by introducing noise into the
data, which can potentially distort the original patterns and compromise the accuracy of the shared
information.
An alternative approach to sharing privatized data involves training a privatized model and
sharing it with other companies (Fig. 1.3). This can be achieved using algorithms designed for differential privacy, such as DP-SGD [95]. A key advantage of this approach is that the company receiving
the privatized model can make unlimited queries without increasing privacy loss, thanks to the
post-processing property of differential privacy [85].

Figure 1.2: Companies collaborate via sharing privatized data.

Post-processing ensures that any computations or transformations applied to the output of a privatized model do not affect the privacy guarantees
established during training. Consequently, the privacy budget is determined during training and
remains protected, regardless of subsequent operations on the model. While sharing a privatized
model allows unlimited queries from the model, the model’s input data remains the private data of
the source company. Without access to this private input data, collaborating companies are unable
to query the model effectively.
In Chapter 4, we present a framework that enables companies to collaborate by sharing a
privatized model, trained on public data to predict private data, such that companies can query this model without increasing privacy leakage. Incorporating this privatized model into
the training of the company’s core model can improve its performance compared to models trained
solely on the company’s data. This collaboration enhances traffic prediction accuracy by leveraging
a diverse range of traffic information, reflecting the varied features and patterns present in data
from multiple companies. Beyond improving prediction accuracy, this collaboration also supports the incentivization framework: these companies can provide the central planner with their
private traffic prediction models such that the central planner can query the upcoming traffic from
the models before incentivization planning.

Figure 1.3: Companies collaborate via sharing a privatized model.
1.2 Organization of the Dissertation
The remainder of this dissertation is organized as follows:
In Chapter 2, we study the problem of offering personalized incentives to minimize a global
cost function in the network. First, Section 2.1 presents our model. Then, in Subsection 2.1.3, we
propose a distributed algorithm to solve our optimization problem efficiently. Finally, the results of
our numerical experiments are presented in Section 2.2 using data from the Los Angeles area. The
details of our methodology and experiments are provided in the appendix. All the code for this chapter can be found in [100].
Chapter 3 studies the problem of incentivizing organizations to reduce traffic congestion. First,
Section 3.1 motivates the advantage of incentivizing organizations rather than individuals through
a simple example. Section 3.2 introduces the basic notations and describes our incentive offering
mechanism for congestion reduction. We formulate an optimization problem to find the “optimal”
incentive offering strategy. We then propose an algorithm for solving this optimization problem
efficiently. Results of numerical experiments for the model are presented in Section 3.4 using data
from the Los Angeles area. The code implemented for this chapter can be found in [101].
Chapter 4 discusses our presented framework for training private machine learning models for
traffic forecasting to provide traffic forecasts to the central planner and enhance the performance of
the traffic prediction model of collaborating companies. To guarantee the privacy of organizations’
data, our methodology uses the concept of differential privacy (DP). The private model trained by organizations can be utilized by the central incentive offering platform to collect users’ data
privately and predict the traffic condition in the future. Prediction of traffic can enable the central
platform to distinguish between roads with probable congestion and highways with free flow in
the near future. Whether a reward should be assigned to a road, and how large that reward should be, is determined by the remaining capacity of the road segments. This information comes
from a prediction algorithm without risking the privacy of participants. Finally, we conclude our
dissertation in Chapter 5.
Chapter 2
Congestion Reduction via Personalized Incentives
In this chapter, we study the problem of offering personalized incentives to minimize a global
cost function in the network. Although previous studies (e.g., [35]) consider static rewards for
static options like teleworking, biking, and walking, our model assigns different rewards for
different alternative routes for different drivers based on the traffic condition and personalization
factors. Consequently, we have more freedom in offering incentives, and our methodology results
in an optimization problem with a larger number of decision variables. The implementation of
our proposed model could be through a smartphone app where the traffic data can be used to
offer incentives to drivers. In addition, smartphones will help the central planner distribute the
computational load for finding the optimal incentive offering strategy.
2.1 Incentive Offering Mechanisms
Let us model the structure of the traffic network with a directed graph G = (V ,E ). Here V is the
collection of all major intersections and ramps, which form the set of nodes in the graph. We use
the set of edges E to capture the connectivity of the nodes in the graph. Two different nodes are
adjacent in the graph if it is possible to directly go from one to another without passing over any
other node. The direction of an edge between two nodes is based on the direction of the road from
which we can go from one point to another. We also use the notation $|E|$ to denote the total number of road segments/edges in our network (i.e., the cardinality of the set $E$). A route is a collection of adjacent edges that starts from one node and ends in another. We use the one-hot encoding scheme to denote the routes. In other words, a given route is represented by a vector $r \in \{0,1\}^{|E|}$. Here, the $k$-th entry of vector $r$ is one if the $k$-th edge is a part of route $r$, and it is zero otherwise.
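The one-hot route encoding described above can be sketched as follows; the edge indexing in the toy example is hypothetical, not from the dissertation.

```python
# Sketch: one-hot encoding of a route over |E| edges. The edge indices used
# in the example are made-up toy values, not from the dissertation.
def encode_route(edge_ids, num_edges):
    """Return a 0/1 vector r of length |E| with r[k] = 1 iff edge k is on the route."""
    r = [0] * num_edges
    for k in edge_ids:
        r[k] = 1
    return r

# A toy network with 5 edges; the route uses edges 0, 2, and 3.
r = encode_route([0, 2, 3], 5)
print(r)  # [1, 0, 1, 1, 0]
```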
Let $\mathcal{T} = \{1,\ldots,T\}$ denote the time horizon of interest, assuming the system is currently at time $t = 1$. For any $t \in \mathcal{T}$, we use the random vector $v_t \in \mathbb{R}^{|E|}$ to represent the traffic volume on the different road segments at time $t$. The $k$-th entry of $v_t$ shows the total number of vehicles on road segment $k$ at time $t$. Notice that the offered incentives can change the behavior of drivers who are using the platform in the future, thus affecting the vector $v_t$.
We use $\mathcal{N}$ to denote the set of drivers whose behavior we can influence through offering incentives. For any driver $n \in \mathcal{N}$, let $\mathcal{R}_n \subseteq \{0,1\}^{|E|}$ denote the set of possible route options for going from its origin to its destination. Let $\mathcal{I}_n$ be the set of possible incentives we can offer to driver $n \in \mathcal{N}$. We also use the binary variable $s^{r,n}_i \in \{0,1\}$ to represent the offered incentives. For any driver $n \in \mathcal{N}$ and incentive $i \in \mathcal{I}_n$, the variable $s^{r,n}_i = 1$ if incentive $i$ is offered to driver $n$ to take route $r$; and $s^{r,n}_i = 0$ otherwise. We assume that we incentivize each driver with only one offer, i.e., $\sum_{r \in \mathcal{R}_n} \sum_{i \in \mathcal{I}_n} s^{r,n}_i = 1$. Given any incentive offered to the drivers, we model the decision of the drivers stochastically. In particular, we assume that after offering incentives, each driver $n$ chooses route $r$ with a certain probability that depends on the amount of the incentive, the route, and the driver's preferences, as described below.
The route preferences of the drivers depend on different factors such as route travel time,
gender, age, and particularly the (monetary) incentive provided to the drivers in our context. Such
dependence can be learned using standard machine learning approaches in the presence of data [102].
In this project, we rely on the model developed in [102] for our preference modeling. We simplify
their model by not including the less predictive features and only considering two major features:
the value of incentive and travel time. In particular, we assume that, given incentive $i \in \mathcal{I}_n$ offered to driver $n$, the driver chooses route $r$ with probability

$$p^{r,n}_i = P(\widehat{T}_r, i), \qquad (2.1)$$
where $\widehat{T}_r$ is the estimate of the travel time for route $r$ provided by the incentive offering platform. Notice that when drivers make their routing decisions, they do not know the exact travel time $T_r$ for route $r$; instead, they rely on the estimate $\widehat{T}_r$ provided by the system. Here, we make an
implicit assumption that the drivers do not consider their own judgment about the travel time in their
decision. However, if such individual biases for drivers exist, the system can learn them over time
using standard preference learning techniques. Modeling the drivers’ behavior in a probabilistic
fashion has its own benefits. The decision of a driver for a given incentive amount depends on many
factors such as age, gender, and income as also studied in [102]. It is even likely that the driver’s
decision may depend on the driver’s “state of mind” at the time that the incentive is offered. Thus,
the features that influence the driver’s decision are not completely known to the central planner. In
such a setting, probabilistic models can be a better fit for modeling the system. For this reason, in the
general area of “recommendation systems” in machine learning and statistics, probabilistic models
have been widely used to model the behavior of individual users (drivers in our setting) [103]. In
addition, by modeling the probability of drivers’ acceptance, we do not assume any traffic control. Traffic control might be more effective, but it requires an authority with the power to change traffic, which is not needed in our framework. In our model, drivers can disregard the offers at any time, but the offers change the probability distribution over their routing choices.
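The dissertation's acceptance probabilities build on the utility model of [102]; as a generic illustration of turning utilities into choice probabilities, the sketch below applies a softmax to utilities that combine travel time and incentive value. The coefficients `a` and `b` are hypothetical, not the fitted values from [102].

```python
import math

# Illustrative logit-style route-choice sketch: the probability of choosing a
# route grows with its incentive and shrinks with its travel time. The utility
# coefficients a and b are hypothetical, not the fitted values from [102].
def choice_probabilities(travel_times, incentives, a=-0.1, b=0.5):
    """Softmax over per-route utilities u_r = a * T_r + b * incentive_r."""
    utilities = [a * t + b * m for t, m in zip(travel_times, incentives)]
    mx = max(utilities)                      # stabilize the exponentials
    exps = [math.exp(u - mx) for u in utilities]
    z = sum(exps)
    return [e / z for e in exps]

# Two routes: route 1 is slower (30 vs 20 minutes) but carries a $5 incentive,
# which shifts probability mass toward it.
probs = choice_probabilities([20.0, 30.0], [0.0, 5.0])
```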
In the next subsections, we present our model and formulation in more detail. For the convenience of the reader, the list of notations defined here and later in the manuscript is presented in
Appendix 6.1.1. We present our framework under two different scenarios: First, for simplicity of
presenting the ideas, we study the case where it is possible to bring traffic flow below the network
capacity. Then, we study the high demand scenario where there is no feasible strategy to bring the
demand below the network capacity.
2.1.1 Scenario I: Operating Below Network Capacity
For simplicity, let us first assume that there exists a solution in which all road segments operate below capacity. Hence, for that solution, we can assume that the travel time will be based on the free-flow
traffic. As this section shows, this assumption will result in a mixed integer linear programming
optimization which can be solved efficiently using standard solvers.
Given (2.1), the expected value of the volume vector $v_t$ can be computed as:

$$\mathbb{E}[v_t] = \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{I}_n} \sum_{r \in \mathcal{R}_n} s^{r,n}_i \, p^{r,n}_i \, \beta_{r,t} \qquad (2.2)$$

where the vector $\beta_{r,t} \in \mathbb{R}^{|E|}$ shows the probability of being at different links of the network at time $t \in \mathcal{T}$, conditioned on the fact that driver $n$ is on route $r$. For more details about the vector $\beta_{r,t}$, please refer to [104].
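As a toy illustration of how equation (2.2) aggregates per-driver contributions, the sketch below sums each accepted offer's probability-weighted link-presence vector for a single time period. All numbers are made up for the example.

```python
# Sketch of the expected-volume computation in Eq. (2.2): every (driver, route)
# pair with s^{r,n}_i = 1 contributes p * beta to E[v_t]. The numbers are toy
# values for a single time period; betas[r] plays the role of beta_{r,t}.
def expected_volume(offers, probs, betas, num_links):
    """offers: list of (driver, route) pairs with s^{r,n}_i = 1.
    probs[(n, r)]: acceptance probability p^{r,n}_i.
    betas[r]: length-|E| vector of link-presence probabilities for route r."""
    ev = [0.0] * num_links
    for n, r in offers:
        p = probs[(n, r)]
        for k in range(num_links):
            ev[k] += p * betas[r][k]
    return ev

# Toy example: 2 drivers, 3 links; route 0 covers links 0-1, route 1 links 1-2.
betas = {0: [1.0, 1.0, 0.0], 1: [0.0, 1.0, 1.0]}
probs = {(0, 0): 0.8, (1, 1): 0.6}
ev = expected_volume([(0, 0), (1, 1)], probs, betas, 3)
print([round(x, 6) for x in ev])  # [0.8, 1.4, 0.6]
```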
In order to minimize the drivers’ total travel time while keeping the volume below the road segment capacity vector $v_0$, we need to solve the following optimization problem:

$$\begin{aligned}
\min_{\{s^{r,n}_i\}} \quad & \sum_{t \in \mathcal{T}} \sum_{n \in \mathcal{N}} \sum_{r \in \mathcal{R}_n} \sum_{i \in \mathcal{I}_n} s^{r,n}_i \, p^{r,n}_i \, \beta_{r,t}^{\top} \omega \\
\text{s.t.} \quad & \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{I}_n} \sum_{r \in \mathcal{R}_n} s^{r,n}_i \, p^{r,n}_i \, \beta_{r,t} \le v_0, \quad \forall t \in \mathcal{T} \\
& \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{I}_n} \sum_{r \in \mathcal{R}_n} s^{r,n}_i \, \eta_i \le \Omega \\
& \sum_{r \in \mathcal{R}_n} \sum_{i \in \mathcal{I}_n} s^{r,n}_i = 1, \quad \forall n \in \mathcal{N} \\
& s^{r,n}_i \in \{0,1\}, \quad \forall n \in \mathcal{N}, \ \forall i \in \mathcal{I}_n, \ \forall r \in \mathcal{R}_n
\end{aligned} \qquad (2.3)$$

where $\omega \in \mathbb{R}^{|E|}$ is the vector of free-flow travel times of the links, $\Omega$ is the total available budget, and $\eta_i$ is the cost of offering incentive $i$. To keep this optimization problem tractable, we rely on the assumption of a large number of vehicles in each road segment and approximate the random quantity $v_t$ with its average $\mathbb{E}[v_t]$ given in equation (2.2). Notice that the objective function is equal to

$$\min_{\{s^{r,n}_i\}} \sum_{n \in \mathcal{N}} \sum_{r \in \mathcal{R}_n} \sum_{i \in \mathcal{I}_n} s^{r,n}_i \, p^{r,n}_i \sum_{t \in \mathcal{T}} \beta_{r,t}^{\top} \omega,$$

in which $\sum_{t \in \mathcal{T}} \beta_{r,t}^{\top} \omega$ is the expected travel time of driver $n$ driving on route $r$.
Problem (2.3) is a mixed-integer linear program that can be solved via standard solvers such as Gurobi and CPLEX, or through modeling languages such as AMPL and GAMS. We use Gurobi in our experiments because of its powerful LP solver.
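In practice this MILP would be handed to a solver such as Gurobi; to make the structure of (2.3) concrete without solver dependencies, the sketch below brute-forces a toy instance with made-up numbers and a hypothetical acceptance model (two routes, three drivers, one incentive level).

```python
import itertools

# Toy brute-force illustration of the structure of problem (2.3): choose one
# (route, incentive) offer per driver to minimize expected travel time, subject
# to a budget and link-capacity constraints. All numbers and the acceptance
# model are made up; a realistic instance would go to a MILP solver like Gurobi.
routes = [0, 1]                     # two disjoint routes
incentive_levels = [0.0, 5.0]       # incentive amounts; cost eta_i = amount
num_drivers = 3
budget = 5.0                        # total budget Omega
capacity = [2.05, 2.05]             # capacity v_0 of each route's bottleneck link
travel_time = [20.0, 30.0]          # expected route travel times (beta^T omega)

def accept_prob(route, incentive):
    # Hypothetical acceptance model: most drivers prefer the fast route 0;
    # an incentive raises the chance of taking the offered route.
    base = 0.9 if route == 0 else 0.3
    return min(1.0, base + 0.1 * incentive)

best = None  # (objective, plan)
offers = list(itertools.product(routes, incentive_levels))
for plan in itertools.product(offers, repeat=num_drivers):
    if sum(inc for _, inc in plan) > budget:
        continue  # budget constraint violated
    load, objective = [0.0, 0.0], 0.0
    for r, inc in plan:
        p = accept_prob(r, inc)
        other = 1 - r
        load[r] += p
        load[other] += 1.0 - p      # rejected offers fall back to the other route
        objective += p * travel_time[r] + (1.0 - p) * travel_time[other]
    if all(l <= c for l, c in zip(load, capacity)):
        if best is None or objective < best[0]:
            best = (objective, plan)
```

In this toy instance, leaving everyone on the fast route violates its capacity, so the best feasible plan spends the budget to incentivize exactly one driver onto the slower route.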
2.1.2 Scenario II: Operating Above Network Capacity
In this subsection, we assume that the demand is elevated; thus, there is no incentive offering strategy
that can bring the traffic flow below the network capacity. In such a scenario, we still can “improve”
the congestion by incentivizing individual drivers. Our goal is to optimize a disutility function of the system, which serves as a criterion for comparing traffic conditions after incentivization. To make the formulation more
specific, we use total travel time as the disutility function. It is worth noting that while we use this
disutility, following our steps, one can use any other disutility function such as carbon emissions or
energy consumption.
To compute the total travel time of the system, we sum the travel time of the drivers over all links and all time periods:

$$F_{tt}(\hat{v}) = \sum_{\ell=1}^{|E|} \sum_{t=1}^{|T|} \hat{v}_{\ell,t} \, \delta_{\ell,t}(\hat{v}_{\ell,t}) \qquad (2.4)$$

where $\delta_{\ell,t}$ is the travel time of link $\ell$ at time $t$ (which itself is a function of the volume). Here, $\hat{v}$ is the vector of link volumes, in which $\hat{v}_{\ell,t}$ is the $(|E| \times t + \ell)$-th element of $\hat{v}$, representing the volume of the $\ell$-th link at time $t$.
To understand the impact of our offered incentives, we estimate the drivers’ decision based on
the provided incentives, which in turn results in estimating the volume of the links in the horizon of
interest. Given these estimated volume values, we estimate the travel time in the links as described
below.
Travel time value δ: Different functions capture the relation between travel time and volume. For example, the link congestion function developed by the Bureau of Public Roads (BPR) [105] defines a nonlinear relation between the volume and the travel time of a road segment:

f_BPR(v) = t_0 (1 + 0.15 (v/w)^4)

where f_BPR(v) is the travel time of the drivers on the link given the assigned traffic volume v; the parameter t_0 is the free-flow travel time of the link; v is the assigned traffic volume of the link; and w is the practical capacity of the link. We learn t_0 and w from historical traffic data. Although we use the BPR function in our presented model, our methodology provides a modular framework in which the BPR function can be replaced with any other appropriate function. In order to estimate the total travel time of the system, we need to estimate the volume vector v̂, which we discuss next.
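To make the BPR relation concrete, the short sketch below evaluates it for a single link. The free-flow time t_0 and practical capacity w here are arbitrary illustrative numbers, not values learned from the ADMS data.

```python
# BPR link travel time; t0 and w here are arbitrary illustrative values,
# not parameters learned from the ADMS historical data.
def bpr_travel_time(v, t0, w):
    """Travel time on a link with volume v, free-flow time t0, capacity w."""
    return t0 * (1.0 + 0.15 * (v / w) ** 4)

print(bpr_travel_time(0.0, 5.0, 2000.0))     # 5.0   -> free-flow travel time
print(bpr_travel_time(2000.0, 5.0, 2000.0))  # 5.75  -> +15% at practical capacity
```

At v = 0 the function returns the free-flow time t_0, and at v = w the travel time grows by exactly 15%, matching the 0.15 coefficient.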
Volume vector vˆ: To compute the volume vector, we need to know the routing decision of the drivers
to be able to (approximately) estimate their location at different times. Clearly, the drivers’ decision
is a function of the offered incentives. In other words, the location of a driver is dependent on the
incentive that we assign to them because the likelihood of various decisions changes with different
incentives. Let us first explain our notation for the offered incentives: for each driver, we have a one-hot encoded vector describing which route has been incentivized and how much reward has been assigned to it. Thus, for each driver we have a binary vector s_n ∈ {0,1}^{|R|·|I|} in which only one element has a value of one, corresponding to the route and the incentive amount that we offer. As we need one vector for each driver, we can aggregate all our incentivization strategies in a matrix S ∈ {0,1}^{(|R|·|I|)×|N|}. Naturally, routes that are not relevant to a driver’s OD pair get a value of zero in the corresponding incentive vector (since we cannot offer those routes to the driver).
To understand the drivers’ responses to our offered incentives, we need to estimate the probability of acceptance of incentivized routes under different incentives, including zero incentive (i.e., no incentive). To model this probability, we use the utility function developed in [102] and compute the probability of acceptance of each offered incentive (by applying a softmax function on top of the utility). While the model in [102] takes many parameters (such as the gender, age, and education of the driver) as input, in our model and numerical experiments we only consider the static parameters of travel time and reward value to generate the probability of acceptance of a given incentive/reward. However, our framework is modular, and we can use any prediction model that estimates drivers’ behavior given an incentive amount, such as a personalized routing model based on a neural network that learns drivers’ behavior. Let P ∈ [0,1]^{|R|×(|R|·|I|)} be a matrix encoding the probability of picking different routes given the offered (route, incentive) pairs. Thus, the vector PS1 ∈ R^{|R|×1} shows the expected number of vehicles on each route.
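The shapes involved in the product PS1 can be sketched with toy matrices, as below. All dimensions are hypothetical, and the random column-stochastic entries of P only stand in for the softmax outputs of the utility model in [102].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 3 routes, 2 incentive levels, 4 drivers.
n_routes, n_inc, n_drivers = 3, 2, 4

# S: one-hot incentive assignment; exactly one (route, incentive) per driver.
S = np.zeros((n_routes * n_inc, n_drivers))
for n in range(n_drivers):
    S[rng.integers(n_routes * n_inc), n] = 1.0

# P: probability of choosing each route given each (route, incentive) offer.
# Random column-stochastic entries stand in for the softmax of the utility.
P = rng.random((n_routes, n_routes * n_inc))
P /= P.sum(axis=0, keepdims=True)

route_volumes = P @ S @ np.ones(n_drivers)  # the vector P S 1
print(route_volumes.shape)   # (3,)
print(route_volumes.sum())   # ~4.0: each driver is counted exactly once
```

Because each column of P sums to one and each driver selects exactly one offer, the expected route volumes always sum to the number of drivers.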
Given the number of vehicles on each route, the location of each driver over the next time horizon can be modeled in a probabilistic fashion. For this purpose, we rely on the model developed in [104], where a matrix R ∈ [0,1]^{(|E|·|T|)×|R|} is proposed to estimate the probability of the presence of a driver in a given road segment at a specific time in the future (assuming that the driver picks a specific route). We could compute matrix R by running a simulation model if we had enough computational power; in our experiments in subsection 2.2.1, we rely on historical data to compute matrix R. Similar to other performative prediction problems [106], an inconsistency may appear between the estimated value and the actual outcome (as the estimation impacts the outcome). To resolve this issue, we first use the historical travel times to compute matrix R in our experiments. Then, we use the estimated R to form the objective function. This approach can be viewed as an “approximation” of the actual utility function when we compute the incentives. This approximation is only used in computing incentives; our evaluation of the system’s performance is based on the actual travel times, because after the drivers make their decisions, computing the actual travel time is possible and such an inconsistency no longer exists. Thus, the vector

v̂ = RPS1 ∈ R^{(|E|·|T|)×1}

represents the expected number of vehicles on all the links at each time slot. Substituting this expression of v̂ into (2.4), we get
F_tt(v̂) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (AS1)_{ℓ,t} δ((AS1)_{ℓ,t}) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (a_{ℓ,t}S1) δ(a_{ℓ,t}S1)   (2.5)

where a_{ℓ,t} is the row of matrix A = RP which corresponds to link ℓ at time t. Thus, in order to minimize the total travel time in the system by providing incentives to drivers, we need to solve the following optimization problem:

min_S  ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (a_{ℓ,t}S1) δ(a_{ℓ,t}S1)
s.t.   S^⊺1 = 1,  c^⊺S1 ≤ Ω,
       DS1 = q,  S ∈ {0,1}^{(|R||I|)×|N|}   (2.6)

where c ∈ R_+^{|R|·|I|} is the vector of costs of the incentives assigned to each route, D ∈ {0,1}^{K×(|R|·|I|)} is the matrix of incentive assignment to the OD pairs, and q ∈ R^K is the vector of the number of drivers for each OD pair, with K the number of OD pairs. We explain the constraints in more detail below:
detail below:
Constraint 1 (S^⊺1 = 1): This constraint simply states that we assign exactly one incentive to each driver.
Constraint 2 (c^⊺S1 ≤ Ω): This is our budget constraint. The vector c ∈ R^{|R|·|I|} represents the cost of the different rewards assigned to each driver, and Ω is the total budget.
Constraint 3 (DS1 = q): This constraint makes sure that we offer the correct number of rewards for the routes between OD pairs. Recall that S1 represents the (expected) number of drivers that have been offered different routes given different rewards. We use matrix D to sum the number of drivers that received different reward offers for routes between the same OD pair. The vector q is the actual number of drivers traveling between the OD pairs, and DS1 must equal q.
Constraint 4 (S ∈ {0,1}^{(|R||I|)×|N|}): This constraint imposes a binary structure on our decision variables: an entry is 1 if the corresponding incentive is selected and 0 otherwise.
To illustrate our model and the above constraints, we provide an example in Appendix 6.1.5.
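As a quick numerical sanity check of Constraints 1 through 3, the toy instance below (hypothetical sizes, costs, and budget) builds a one-hot matrix S and verifies each constraint directly.

```python
import numpy as np

# Tiny hypothetical instance: 2 routes x 2 incentive levels = 4 offers,
# 3 drivers, a single OD pair (K = 1). All costs and the budget are toy values.
n_offers, n_drivers, K = 4, 3, 1

# Each column of S selects exactly one (route, incentive) offer for a driver.
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])

c = np.array([0.0, 2.0, 0.0, 2.0])   # cost of each offer: $0 or $2
D = np.ones((K, n_offers))           # every offer belongs to the single OD pair
q = np.array([3.0])                  # 3 drivers travel on that OD pair
omega = 10.0                         # total budget

ones_n = np.ones(n_drivers)
assert np.allclose(S.T @ np.ones(n_offers), 1.0)   # constraint 1: one offer per driver
assert c @ (S @ ones_n) <= omega                   # constraint 2: within budget
assert np.allclose(D @ (S @ ones_n), q)            # constraint 3: OD pair counts match
print("all constraints satisfied")
```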
2.1.3 Algorithm for Offering Incentives and A Distributed Implementation
The optimization problem (2.6) is of large size, yet it needs to be solved in almost real time (or hourly, if the drivers send their travel information to the central planner every hour before their trip). However, due to the binary variable S, solving this problem efficiently is difficult.¹ In order to develop an efficient “approximate” solver for (2.6), we first relax the binary constraint in (2.6) and replace it with the convex constraint S ∈ [0,1]^{(|R||I|)×|N|}, leading to the relaxed formulation

min_S  ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (a_{ℓ,t}S1) δ(a_{ℓ,t}S1)
s.t.   S^⊺1 = 1,  c^⊺S1 ≤ Ω,
       DS1 = q,  S ∈ [0,1]^{(|R||I|)×|N|}.   (2.7)
The constraints in the above optimization problem are convex. By substituting a_{ℓ,t}S1 with γ_{ℓ,t}, the objective function becomes a summation of monomial functions with positive coefficients. Moreover, γ_{ℓ,t} is an affine mapping of the optimization variable S. Since our domain is the nonnegative orthant and monomials are convex on this domain, the objective function is convex. This convexity allows us to explore the use of standard solvers such as CVX [107]. However, these solvers rely on methods such as interior point methods [108], which require O(n³) operations per iteration, where n is the number of variables. This heavy computational complexity prevents us from applying standard solvers to realistic-size problems. In our context, each driver is equipped with a smartphone; thus, we can distribute the computational burden of solving (2.7) among the drivers.

¹We conjecture that problem (2.6) is NP-hard to solve, since it is a special instance of polynomial optimization with discrete variables and there does not appear to be any special structure in the objective function to reduce its complexity.

In what follows, we propose a simple reformulation of the problem leading to a distributed algorithm for solving (2.7). To present our algorithm, let us start by reformulating (2.7) as
min_{γ,u,S,W,H,β}  ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} γ_{ℓ,t} δ(γ_{ℓ,t}) − (λ̃/2) ∑_{r=1}^{|R|} ∑_{i=1}^{|I|} ∑_{n=1}^{|N|} H_{r,i,n}(H_{r,i,n} − 1)
s.t.  S1 = u,  W^⊺1 = 1,
      Du = q,  Au = γ,
      H = S,  W = S,
      c^⊺u + β = Ω,  β ≥ 0,
      H ∈ [0,1]^{(|R|·|I|)×|N|}.   (2.8)
As we discuss in Appendix 6.1.2, this formulation is amenable to the ADMM method [109, 110, 111, 112, 113, 114], which has a natural distributed implementation. Our ADMM formulation (2.8) shows that this computational burden can be distributed among the drivers’ cell phones. This distributed optimization/federated learning framework can have the other standard advantages of federated learning/distributed systems [115, 116]. For example, when proper privacy-preserving mechanisms (such as differential privacy [117]) are utilized, we can guarantee the privacy of drivers, since they can participate in the optimization procedure without completely sharing their data and through a private communication mechanism (see, e.g., [116, 118, 119, 120]). The steps of this algorithm are summarized in Algorithm 1, and the details of the derivation of its steps are provided in Appendix 6.1.2.
It is worth mentioning that other standard approaches, such as projected gradient descent, are not easily applicable to problem (2.8) due to the complexity of the projection operator onto our constraint set. ADMM, however, decomposes this projection across multiple variables, with each projection being easy to compute. In addition to the projection, the computation of the linear minimization oracle is also expensive, which eliminates the possibility of utilizing other methods such as the conditional gradient (Frank–Wolfe) method. These are the reasons (in addition to the possibility of a distributed implementation) behind choosing ADMM.
Algorithm 1 Distributed Incentivization via ADMM
1: Input: initial values γ^0, S^0, H^0, W^0, u^0, β^0, λ_1^0 ∈ R^{(|R|·|I|)×1}, λ_2^0 ∈ R^{|N|×1}, λ_3^0 ∈ R^{K×1}, λ_4^0 ∈ R^{(|E|·|T|)×1}, Λ_5^0 ∈ R^{(|R|·|I|)×|N|}, λ_6^0 ∈ R, Λ_7^0 ∈ R^{(|R|·|I|)×|N|}; dual update step ρ; number of iterations T̃
for t = 0, 1, ..., T̃:
2:   u^{t+1} = (ρI + ρD^⊺D + ρA^⊺A + ρcc^⊺)^{-1} (λ_1^t + ρS^t1 − D^⊺λ_3^t + ρD^⊺q − A^⊺λ_4^t + ρA^⊺γ^t − c(λ_6^t + β^t − Ω))
3:   W^{t+1} = (ρ11^⊺ + ρI)^{-1} (ρ11^⊺ + ρS^t − Λ_7^t − 1λ_2^{t⊺})
4:   H^{t+1} = 1(ρ > λ̃) Π_{[0,1]}((ρS^t − Λ_5^t − λ̃/2)/(ρ − λ̃)) + 1(ρ < λ̃) Π_{{0,1}}((ρS^t − Λ_5^t − λ̃/2)/(ρ − λ̃))
5:   S^{t+1} = (ρu^{t+1}1^⊺ + Λ_5^t + ρH^{t+1} + Λ_7^t + ρW^{t+1} − λ_1^t1^⊺)(ρ11^⊺ + 2ρI)^{-1}
     for ℓ = 1, ..., |E| and t̂ = 1, ..., |T|:
6:     γ_{ℓ,t̂}^{t+1} = argmin_{γ_{ℓ,t̂}} γ_{ℓ,t̂} δ(γ_{ℓ,t̂}) + λ_{4,(ℓ,t̂)}^t (a_{ℓ,t̂}u^t − γ_{ℓ,t̂}) + (ρ/2)(a_{ℓ,t̂}u^t − γ_{ℓ,t̂})^2
7:   β^{t+1} = Π_{R_+}(Ω − c^⊺u^{t+1} − (1/ρ)λ_6^t)
8:   λ_1^{t+1} = λ_1^t + ρ(S^{t+1}1 − u^{t+1})
9:   λ_2^{t+1} = λ_2^t + ρ(W^{t+1⊺}1 − 1)
10:  λ_3^{t+1} = λ_3^t + ρ(Du^{t+1} − q)
11:  λ_4^{t+1} = λ_4^t + ρ(Au^{t+1} − γ^{t+1})
12:  Λ_5^{t+1} = Λ_5^t + ρ(H^{t+1} − S^{t+1})
13:  λ_6^{t+1} = λ_6^t + ρ(c^⊺u^{t+1} + β^{t+1} − Ω)
14:  Λ_7^{t+1} = Λ_7^t + ρ(W^{t+1} − S^{t+1})
15: Return: S^{T̃}
In Algorithm 1, Π(·)_{[0,1]} is the operator that projects each entry of the input matrix onto the interval [0,1], and Π(·)_{R_+} is the operator that projects each entry onto R_+. Steps 3, 4, and 5 of Algorithm 1 are computationally cumbersome due to the size of the matrices W, H, and S. However, each column of these matrices corresponds to a single driver, and since the column computations are not coupled, they can be performed in parallel on the drivers’ smartphones. Further details about a distributed computation of Algorithm 1 are provided in Appendix 6.1.3. Theorem 1 guarantees the convergence of our ADMM algorithm.
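The projections and the dual-ascent pattern of steps 8 through 14 are simple entrywise operations. The sketch below (with arbitrary toy numbers) shows the three primitives: projection onto [0,1], projection onto R_+, and a generic multiplier step along a constraint residual.

```python
import numpy as np

def proj_box(M):
    """Entrywise projection of the input onto [0, 1] (the operator Pi_[0,1])."""
    return np.clip(M, 0.0, 1.0)

def proj_nonneg(x):
    """Entrywise projection onto R_+ (the operator Pi_{R_+})."""
    return np.maximum(x, 0.0)

def dual_update(lam, residual, rho):
    """Generic multiplier step: lambda <- lambda + rho * (constraint residual)."""
    return lam + rho * residual

# Toy numbers: clipping, and a dual step on an already-satisfied constraint.
print(proj_box(np.array([-0.2, 0.4, 1.7])))                 # entries forced into [0, 1]
print(proj_nonneg(np.array([-3.0, 2.0])))                   # negatives zeroed out
print(dual_update(np.array([1.5]), np.array([0.0]), 10.0))  # unchanged: residual is zero
```

At a feasible point every residual vanishes, so all multipliers stop moving; this is the sense in which the dual updates track constraint satisfaction.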
Theorem 1. Algorithm 1 finds an ε-optimal solution of problem (2.8) in O(1/ε) iterations [121].
Theorem 1 guarantees the convergence of Algorithm 1 for optimization problem (2.8). Problem (2.8) is a (convex) reformulation of the relaxed problem (2.7) and is amenable to ADMM. However, as mentioned earlier, the original problem (2.6) is likely hard to solve, since it is a special instance of polynomial optimization with discrete variables and the objective function does not appear to have any special structure that would reduce its complexity.
In optimization problem (2.7) (and consequently (2.8)), all solutions S* with a fixed value of S*1 = u* lead to the same objective as long as S*^⊺1 = 1. Hence, this convex problem can have an infinite number of solutions (many of them not even close to binary). Therefore, in order to find (approximately) binary solutions, we add the following regularizer to the objective function in (2.8):

ℜ(H_{r,i,n}) = −(λ̃/2) H_{r,i,n}(H_{r,i,n} − 1)   (2.9)

where λ̃ ∈ R_+ is the regularization parameter and H_{r,i,n} ∈ [0,1]. This regularizer forces the elements of matrix H to be as close as possible to the binary domain {0,1}.
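The effect of the regularizer (2.9) on a single entry of H can be checked directly: the penalty is zero exactly at the binary points and largest at 1/2. The value of λ̃ below is an arbitrary illustrative choice.

```python
# The regularizer (2.9) as a function of a single entry of H; lambda_tilde
# is an arbitrary illustrative value. The penalty vanishes exactly at the
# binary points 0 and 1 and is maximal at 1/2.
def binary_penalty(h, lam_tilde=2.0):
    return -(lam_tilde / 2.0) * h * (h - 1.0)

print(binary_penalty(0.0))   # 0.0
print(binary_penalty(1.0))   # 0 (printed as -0.0)
print(binary_penalty(0.5))   # 0.25, i.e. lambda_tilde / 8
```

Since the penalty is nonnegative on [0,1] and added to a minimization objective, it pushes each entry of H toward 0 or 1.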
While Algorithm 1 returns a solution of the optimization problem (2.8), problem (2.8) is a relaxation of the original problem (2.6). Hence, the solution obtained by Algorithm 1 must be converted into a feasible point of (2.6). For this step, we solve the following mixed integer (linear) problem:

min_S  ||S1 − u*||_1
s.t.   S^⊺1 = 1,  c^⊺S1 ≤ Ω,
       DS1 = q,  S ∈ {0,1}^{(|R||I|)×|N|}   (2.10)

where u* is the optimal solution obtained by Algorithm 1. We can use off-the-shelf solvers such as Gurobi to solve (2.10).
The BPR function used with Algorithm 1 can capture both Scenario I in Section 2.1.1 and Scenario II in Section 2.1.2. However, the computational requirements of the free-flow case in Scenario I are lower than those of the congested case in Scenario II. Thus, model (2.3) of Scenario I is an alternative when computational resources are limited.
2.2 Numerical Experiments
We evaluate the performance of our algorithms using data from the Los Angeles area. The Los Angeles region is well suited as a validation area because there are multiple routes connecting most OD pairs. Additionally, researchers at the University of Southern California have developed the Archived Data Management System (ADMS), which collects, archives, and integrates a variety of transportation datasets from Los Angeles, Orange, San Bernardino, Riverside, and Ventura Counties [122]. ADMS provides access to real-time traffic data from 9500 highway and arterial loop detectors, with measurements every 30 seconds and 1 minute, respectively.
Due to the lack of access to the drivers’ routing information, we need to estimate the origin-destination (OD) matrix from the network flow information. Rows and columns of the OD matrix correspond to origin and destination points, respectively. For OD matrix A, the element A(i, j) is the number of drivers going from point i to point j. The OD matrix estimation problem is underdetermined [123, 124, 125]. There are two categories of OD matrices: static and dynamic [126]. Due to the high resolution of our data, most of the existing dynamic OD estimation (DODE) methods become computationally inefficient. In addition, we do not have prior data or observations of the OD matrix, which many studies assume as given [127, 128, 129, 130]. Given these barriers, we rely on the algorithm proposed in [104], which operates without any prior OD matrix information.
2.2.1 Simulation Model
In our numerical experiments, we integrate different datasets and models to evaluate the performance of our algorithms. First, we extract the speed data, volume data, and sensor information, including the locations of the sensors, from the Archived Data Management System (ADMS). Then, we use the distances between sensors, extracted from the sensor locations using the Google Maps API, to create the graph of the network. We created three graph networks corresponding to the regions depicted in Fig. 2.1, Fig. 2.3, and Fig. 2.4. In the next step, the speed data, volume data, and the network graph are used to estimate the OD pairs via the algorithm provided in [104]. The total number of estimated incoming drivers for all three experiments is presented in Fig. 2.5. For each OD pair, we find up to 4 different routing options. In particular, we start with the shortest path for each OD pair; then we remove the edges in this path and take the next shortest path, and we continue this process until we find 4 different routes between the origin and destination (or no other routes exist).
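The route-generation loop just described can be sketched as follows. The four-node network at the bottom is a hypothetical example, and the code is only an illustration of the edge-removal idea, not the exact implementation used in the experiments.

```python
import heapq

def shortest_path(edges, src, dst):
    """Dijkstra on a graph given as {node: {neighbor: weight}}; returns a
    node list from src to dst, or None if no path exists."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:                      # reconstruct the path backwards
            path = [u]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in edges.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return None

def candidate_routes(edges, src, dst, k=4):
    """Take the shortest path, delete its edges, and repeat up to k times."""
    edges = {u: dict(nbrs) for u, nbrs in edges.items()}  # work on a copy
    routes = []
    while len(routes) < k:
        path = shortest_path(edges, src, dst)
        if path is None:
            break
        routes.append(path)
        for u, v in zip(path, path[1:]):
            del edges[u][v]
    return routes

# Hypothetical four-node network with two edge-disjoint O-D paths.
net = {"o": {"a": 1.0, "b": 2.0}, "a": {"d": 1.0}, "b": {"d": 1.0}, "d": {}}
print(candidate_routes(net, "o", "d"))  # [['o', 'a', 'd'], ['o', 'b', 'd']]
```

Because each accepted path has its edges removed, the routes produced are edge-disjoint, and the loop stops early when fewer than k such routes exist.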
We use the model in [102] to compute the acceptance probability of the different offers on the different routes for each individual driver. The parameters used for computing the probability are static values provided by [102], and we only calibrate some of them because we only use historical data and do not have access to driver features such as age or gender. In this chapter, we do not learn the drivers’ route choice model, so the parameters of the probability model are fixed; however, it is possible to adapt our routing model to the drivers’ preferences by observing their behavior. We run three different experiments that model the road network at different scales. In Experiment I, we model an arterial region (Fig. 2.1) that includes surface streets. For Experiment II, we model a large network of highways (Fig. 2.3). For Experiment III, we model a moderate region (Fig. 2.4), which is a subset of the region in Experiment II. We use travel time savings as our metric for performance evaluation in all experiments. Besides travel time savings, we also report the monetary value of the traffic reduction based on the Value of Time (VOT) as an alternative metric. Our base VOT is derived from the estimate of [131], which is $2.63 per minute or $157.8 per hour.
To solve model (2.3), we use the Gurobi solver in all experiments. We also solve model (2.6) in Experiment III using Gurobi and MOSEK to compare their results with Algorithm 1 [100]. The comparison between ADMM, Gurobi, and MOSEK is shown in Experiment III. Gurobi and MOSEK are state-of-the-art off-the-shelf commercial solvers for linear and mixed integer optimization problems. To better balance accuracy against solution time, we set the relative mixed integer programming optimality gap to 0.01 for both Gurobi and MOSEK in the experiments. Given that ADMM is known to satisfy the constraints only “asymptotically”, we need to evaluate the solution quality after terminating our algorithm in a finite number of iterations. We measure the quality of our ADMM-based algorithm by computing the normalized gap error between the two sides of our constraints as
f_gap(S,u,W,H,β,γ) = ||S1 − u|| / (||S|| ||1|| + ||u||) + ||W^⊺1 − 1|| / (||W^⊺|| ||1|| + ||1||) + ||Du − q|| / (||D|| ||u|| + ||q||) + ||Au − γ|| / (||A|| ||u|| + ||γ||) + ||H − S|| / (||H|| + ||S||) + ||W − S|| / (||W|| + ||S||) + ||c^⊺u + β − Ω|| / (||c^⊺|| ||u|| + ||β|| + ||Ω||).
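A direct transcription of f_gap into code might look like the sketch below; the toy feasible point at the end (all shapes and numbers hypothetical) satisfies every constraint, so the gap evaluates to zero, as expected.

```python
import numpy as np

def norm(x):
    return np.linalg.norm(x)

# Sketch of the normalized gap f_gap above with toy shapes (2 offers,
# 2 drivers, 1 OD pair, 1 link-time cell); all numbers are hypothetical.
def f_gap(S, u, W, H, beta, gamma, c, D, q, A, omega):
    one_n = np.ones(S.shape[1])   # vector of ones over drivers
    one_m = np.ones(S.shape[0])   # vector of ones over (route, incentive) offers
    return (
        norm(S @ one_n - u) / (norm(S) * norm(one_n) + norm(u))
        + norm(W.T @ one_m - one_n) / (norm(W.T) * norm(one_m) + norm(one_n))
        + norm(D @ u - q) / (norm(D) * norm(u) + norm(q))
        + norm(A @ u - gamma) / (norm(A) * norm(u) + norm(gamma))
        + norm(H - S) / (norm(H) + norm(S))
        + norm(W - S) / (norm(W) + norm(S))
        + abs(c @ u + beta - omega) / (norm(c) * norm(u) + abs(beta) + abs(omega))
    )

# A feasible toy point: every constraint holds, so the gap is zero.
S = np.eye(2); u = S @ np.ones(2); W = S.copy(); H = S.copy()
D = np.ones((1, 2)); q = np.array([2.0])
A = np.array([[0.5, 0.5]]); gamma = A @ u
c = np.array([0.0, 1.0]); beta, omega = 2.0, 3.0
print(f_gap(S, u, W, H, beta, gamma, c, D, q, A, omega))  # 0.0
```

Each term is a constraint residual normalized by the magnitudes of its operands, so the gap is scale-invariant and vanishes only at feasibility.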
While we only provide incentives to the drivers that enter the system in the first time interval, our incentive offering mechanism considers estimates of the traffic flows in the subsequent time intervals. The selected drivers for incentivization are from the same cohort: we randomly select a group of drivers between 7 AM and 7:15 AM, and use the selected drivers to compare the performance of the model with different budget values on the total travel time from 7 AM to 8 AM. While our formulation is static, it can be applied in a dynamic environment if solved frequently enough to offer incentives to the drivers.

To evaluate the travel time of the network, we use the volume of the network at the User Equilibrium (UE) after the incentivization. Drivers who have accepted the incentive offer cannot change their incentivized route, as part of the assumed incentivization policy. However, the remaining drivers (user drivers who rejected the incentive offer, user drivers who did not receive an incentive offer, and nonuser drivers) can select their routes based on the new traffic volume at the UE resulting from the incentivization. In other words, our framework does not assume that drivers who are not incentivized remain on their previous routes; they may also change routes as traffic conditions change due to the incentives. To compute the total travel time at the UE, we provide Algorithm 2 in Appendix 6.1.3. Algorithm 2
returns the total travel time of the system at UE given the routing assignment of incentivized drivers
who accepted the incentive offer and the OD information of the remaining drivers. The decision of
the incentivized driver on accepting/rejecting the offer is randomly made based on the probability
of their acceptance given the incentive offer.
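Algorithm 2 itself is given in Appendix 6.1.3. As a rough illustration of what a fixed-step equilibrium computation with α_UE = 0.05 can involve, the sketch below performs one successive-averages update of link volumes toward a best-response assignment; all numbers are hypothetical, and this is not the actual sequence of steps in Algorithm 2.

```python
import numpy as np

# One successive-averages step: move current link volumes a small step
# toward a best-response (shortest travel time) assignment. The step size
# mirrors the alpha_UE = 0.05 used with Algorithm 2; everything else is a toy.
def average_volumes(v, v_best_response, alpha=0.05):
    return (1.0 - alpha) * v + alpha * v_best_response

v = np.array([900.0, 100.0])      # current volumes on two parallel links
v_br = np.array([0.0, 1000.0])    # all flow reassigned to the faster link
print(average_volumes(v, v_br))   # [855. 145.]
```

Iterating such a step with a small fixed α gradually shifts volume onto faster links while damping oscillations, which is the usual motivation for averaging schemes in equilibrium computations.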
2.2.2 Experiment I
In Experiment I, we check the performance of model (2.3) using the ADMS data for May 5, 2018, with the incentive set I = {$0, $2, $10}. The studied region, depicted in Fig. 2.1, includes the data of 301 sensors. Based on the ADMS data, we created a graph with 41 nodes, 139 links, and 105.5 miles of road. OD points are located at intersections and close to highway ramps. The number of OD pairs is 1681, and there are 4278 paths between them in total. We assume 7494 drivers enter the system between 7 AM and 8 AM, and we consider the 1805 drivers entering the system in the first 15 minutes for incentivization. To evaluate the travel time of the network, we run Algorithm 2 with step size α_UE = 0.05 and T̃ = 20 iterations and report the travel time at the UE after the incentivization.
Notice that model (2.3) may result in an infeasible optimization problem (particularly in a heavily congested network). Hence, we include a parameter α in our formulation as a multiplier of the allowed capacity; that is, we use α × v0 instead of v0 in model (2.3). We only consider this multiplier during the computation of incentives; during the computation of total travel time, we use the original capacity. We assume a zero-dollar incentive in our probabilistic model for drivers that do not receive an incentive.
Results of model (2.3) at 100% penetration rate (the percentage of drivers who are considered for incentivization) are presented in TABLE 2.1. In this table, “Total travel time” shows the travel time computed via the BPR function after offering incentives. When the budget is increased from $1,000 to $10,000, the decrease in total travel time improves from 4.03% to 6.97%. The row “Cost” shows the amount of the budget that was used; in all cases, almost all of the budget is used. The results show that the value of saved time is much larger than the amount spent on incentives, except for the budget of $10,000. Note that a budget of zero is the case of no incentive.

Figure 2.1: Studied region in Experiment I.

Budget ($)                  0    100   1000   10000
Cost ($)                    0    100   1000    9996
Value of saved time ($)     0    729   4320    7471
Total travel time (hour)  679    674    652     632

Table 2.1: Experiment I: Linear model (2.3).
TABLE 2.2 shows that increasing the budget results in a higher percentage of drivers receiving an incentive offer and a higher average offered incentive. In addition, we observe that even offering incentives to 6.67% of the drivers (with an average of $2.00 monetary incentive per driver) can reduce the total travel time by 4.03%. If approximately 13.64% of the drivers are incentivized with an average of $9.78 per driver, the total travel time can be reduced by almost 6.97%. For more details about the distribution of offered incentives to drivers in Experiment I, please see TABLE 6.3 in the Appendix.
Number of drivers     Budget ($)   % of rewarded   Average incentive   Reduction in total
entering the system                drivers         amount              travel time
7402                  1000         6.67%           $2.00               4.03%
                      10000        13.64%          $9.78               6.97%

Table 2.2: Comparison of $1000 and $10000 budget in Experiment I.
Figure 2.2: Effect of the penetration rate on the percentage of travel time decrease in Experiment I.
Fig. 2.2 shows the effect of the penetration rate on the travel time decrease. By reducing the penetration rate, we observe a smaller travel time decrease, because the flexibility of the model in selecting drivers decreases. Although reducing the penetration rate adversely affects the incentivization, the model focuses on the available drivers to reduce travel time. For more details on the numbers provided in Fig. 2.2, please see TABLE 6.11 and TABLE 6.10 in the Appendix.
2.2.3 Experiment II
In Experiment II, we evaluate the performance of our methods for the region depicted in Fig. 2.3
with 753 sensors under two different possible sets of incentives:
Figure 2.3: Studied region in Experiment II.
• I1 = {$0,$2,$10}
• I2 = {$0,$1,$2,$3,$5,$10}
This region only includes data from highway sensors, with 25 OD points and 32 links covering 707.6 miles of road. The number of OD pairs is 625, and there are 1331 paths between them in total. We assume 15093 drivers enter the system between 7 AM and 8 AM; our incentivization model considers the 4126 drivers entering the system in the first 15 minutes. To evaluate the travel time of the network, we run Algorithm 2 with the same settings as in Experiment I. The results of our experiment at 100% penetration rate are presented in TABLE 2.3 for incentive set I1 and in TABLE 2.4 for incentive set I2. The value of saved time is much larger than the cost of offering incentives for all budget values and both incentive sets, except for the budget of $10,000; the value of saved time can reach up to 15 times the cost.
Budget ($)                   0    100   1000   10000
Cost ($)                     0    100   1000    9998
Value of saved time ($)      0   1430   3955    8519
Total travel time (hour)  4253   4244   4228    4199

Table 2.3: Experiment II: Linear model (2.3) for incentive set I1.

Budget ($)                   0    100   1000   10000
Cost ($)                     0    100   1000   10000
Value of saved time ($)      0   1546   3931    9741
Total travel time (hour)  4253   4243   4228    4191

Table 2.4: Experiment II: Linear model (2.3) for incentive set I2.
            Number of drivers     Budget ($)   % of rewarded   Average incentive   Reduction in total
            entering the system                drivers         amount              travel time
Exp. II-1   15093                 1000         2.97%           $2.23               0.59%
                                  10000        16.42%          $4.03               1.27%
Exp. II-2   15093                 1000         3.87%           $1.71               0.59%
                                  10000        15.97%          $4.15               1.45%

Table 2.5: Comparison of $1000 and $10000 budget in Experiment II.
In addition to confirming our previous observations from Experiment I, Experiment II shows the diversity gain obtained by the richer incentive set I2 (see the “Reduction in total travel time” column in TABLE 2.5). In other words, more choices in the incentive set provide more flexibility for the algorithm, resulting in a larger total travel time reduction. For more details about the distribution of offered incentives to drivers in Experiment II, please see TABLE 6.4 and TABLE 6.5 in the Appendix. By examining Experiments I and II, we observe that more alternative routes lead to a larger gain in travel time reduction.
2.2.4 Experiment III
In Experiment III, we compare the performance of the linear model (2.3) and the ADMM-based model (2.6) using the incentive set I = {$0, $2, $10}. The region considered in our analysis is depicted in Fig. 2.4 and includes the data of 293 sensors. Based on the ADMS data, we created a graph with 12 nodes, 32 links, and 288.1 miles of road. The number of OD pairs is 144, and there are 270 paths between them in total. The total number of drivers entering the system between 5 AM and 9 AM, as estimated by the OD estimation algorithm, is depicted in Fig. 2.5(c). In our simulations, we assume 8220 drivers enter the system between 7 AM and 8 AM; our incentivization model considers the 2248 drivers entering the system in the first 15 minutes. To evaluate the travel time of the network, we run Algorithm 2 with the same settings as in Experiment I. Results of model (2.3) and model (2.6) at 100% penetration rate are presented in TABLE 2.6 and Fig. 2.6.
As we can observe in TABLE 2.7 and Fig. 2.6, model (2.6) outperforms model (2.3) for all budgets. At 100% penetration rate, model (2.6) decreased the travel time by up to twice as much as model (2.3). Although the objective function in model (2.3) reduces the total free-flow travel time, the actual travel time is not captured, since the free-flow travel time is a poor estimate of the actual travel time. Model (2.6) directly minimizes the travel time based on the BPR function, so it captures the nonlinear relation between travel time and volume. When the volume exceeds the capacity, model (2.6), which is a more accurate representation of the traffic network, produces better results. These phenomena can be observed in Fig. 2.6, Fig. 2.7, Fig. 2.8, and Fig. 2.9.
Figure 2.4: Studied region in Experiment III.
                       Budget ($)                   0    100   1000   10000
Model (2.3) (Linear)   Cost ($)                     0    100   1000   10000
                       Value of saved time ($)      0    604   6550    9760
                       Total travel time (hour)  2087   2082   2045    2024
Model (2.6) (ADMM)     Cost ($)                     0    100   1000    9578
                       Value of saved time ($)      0   1204   8999   15149
                       Total travel time (hour)  2087   2079   2029    1990

Table 2.6: Experiment III: Linear model (2.3) and model (2.6).
                        Number of drivers     Budget ($)   % of rewarded   Average incentive   Reduction in total
                        entering the system                drivers         amount              travel time
Exp. III, Model (2.3)   8220                  1000         6.08%           $2.00               1.99%
                                              10000        15.86%          $7.67               2.96%
Exp. III, Alg. 1        8220                  1000         6.08%           $2.00               2.73%
                                              10000        14.46%          $8.06               4.60%

Table 2.7: Comparison of $1000 and $10000 budget in Experiment III.
Figure 2.5: Total estimated number of drivers entering the system (in 15-minute intervals). (a)
Experiment I, (b) Experiment II, and (c) Experiment III.
Although both Gurobi and MOSEK find slightly better solutions for model (2.6), Gurobi can take up to 35 hours and MOSEK up to 80 hours to solve the problem, whereas Algorithm 1 takes at most 14 minutes by utilizing parallel computation. Moreover, our computational resources were limited; with proper use of GPUs and TPUs, the matrix computations could be made even more efficient. We set the termination rule of MOSEK and Gurobi to a 0.01 relative optimality gap. We also computed the normalized gap error of our ADMM-based Algorithm 1 to measure the quality of its solution. This error is illustrated in Figure 2.10 for Experiment III after 50,000 iterations, for different penetration rates and budgets. We observe that the error converges almost to 0 after 10,000 iterations in most cases and after 20,000 iterations in all cases.
By offering incentives to 14.46% of the vehicles at 100% penetration rate in Experiment III (with an average of $8.06 monetary incentive per driver), Algorithm 1 can reduce the total travel time by 4.60% using model (2.6). For a budget of $10,000, model (2.6) achieves a 1.64 percentage point larger reduction in travel time compared to model (2.3), although both offer almost the same average incentive to almost the same percentage of drivers. The computation time of model (2.3) is 2.6 minutes, but model (2.6) requires up to 1.04 hours with serial computation. Utilizing parallel computation as described in Section 2.1.3, we can reduce the computation time to at most 14 minutes. The value of saved time using Algorithm 1 is much larger than the amount spent on incentives for all budget values, and it can reach up to 12 times the cost. For more details about the distribution of the offered incentives to the drivers in Experiment III, please see TABLE 6.6, TABLE 6.7, and TABLE 6.8 in the Appendix. The effect of the penetration rate on the travel time decrease in Experiment III for model (2.6) is depicted in Fig. 2.7, Fig. 2.8, and Fig. 2.9. The behavior is similar to our observation in Fig. 2.2, which was for model (2.3). For more details on the numbers provided in Fig. 2.7, Fig. 2.8, and Fig. 2.9, please see TABLE 6.13, TABLE 6.12, TABLE 6.15, TABLE 6.14, TABLE 6.17, and TABLE 6.16 in the Appendix.
Figure 2.7: Effect of the penetration rate on the percentage of travel time decrease in Experiment III,
model (2.6), Algorithm 1.
Figure 2.8: Effect of the penetration rate on the percentage of travel time decrease in Experiment III,
model (2.6), Gurobi solver.
Figure 2.9: Effect of the penetration rate on the percentage of travel time decrease in Experiment III,
model (2.6), MOSEK solver.
Figure 2.10: Normalized gap error of Algorithm 1 after 50,000 iterations with different cases of
penetration rate and budget.
2.2.5 Summary
As we discussed in section 2.1.1, model (2.3) assumes that there exists a traffic flow solution
operating below the network capacity. When this assumption is not satisfied, our model results
in an “approximate” solution. To evaluate the validity of this approximation, we ran model (2.3)
for heavily congested networks (Experiments I and II) with many alternative routes so that the
congestion level could be reasonably reduced. As we saw in Experiments I and II, this model
can provide a reasonable approximation on both arterial roads (Experiment I) and
highways (Experiment II) and leads to congestion reduction even when the final result is above
the system capacity. In Experiment III, our numerical experiments demonstrate the superiority of
model (2.6) over model (2.3) in reducing congestion. This is because of the heavy congestion and
the lack of availability of enough alternative routes to reduce congestion (so that the final solution
of model (2.3) is far away from the free flow traffic and a linear approximation of travel time is no
longer accurate enough). We were not able to run model (2.6) for Experiments I and II due to the
large number of nodes in the network. However, relying on edge computation, this model could be
solved efficiently in practice as we discussed in subsection 2.1.3.
Chapter 3
Incentive Systems for New Mobility Services
In traditional congestion pricing and incentive offering mechanisms, incentives are offered directly
to individual drivers to influence their decisions, such as departure time and routing (Fig. 3.1 (a)).
In modern and future mobility services, many of these decisions are indirectly (or directly) made
by organizations providing different transportation services. For example, navigation apps, which
are regularly used by almost 70% of smartphone users [50, 51], influence the routing decisions of
millions of drivers daily. Other examples are crowdsourced delivery platforms such as Amazon
Flex, Instacart, and DoorDash. Intuitively, since organizations have more flexibility and more power
to change the traffic, incentivizing organizations is expected to be more efficient than incentivizing
individual drivers. Furthermore, such a mechanism has more options for balancing route selection
across the large pool of drivers employed by an organization. Motivated by this idea, this chapter
develops an incentive offering mechanism to organizations to indirectly (or directly) influence
the behavior of individual drivers (Fig. 3.1 (b)). Our framework will be based on the following
three-step procedure:
Step 1) The central planner receives organizations’ demand estimates for the next time interval (e.g.,
the next few hours).
Step 2) The central planner offers incentives to organizations to change their routes, travel times, and
demand if necessary.
Step 3) The central planner observes the organizations’ responses and goes back to Step 1 for the next time interval.
The central planner (which is referred to as “Incentive Offering Platform” in Fig. 3.1 (b)) will
continually repeat this three-step process in the network for every time interval.
Figure 3.1: (a) Traditional platforms for offering incentives. (b) Presented platform for offering
incentives.
Although the incentive system targets the reward to the organization, the proposed approach
can be used by a wide variety of transportation organizations, including those that have drivers as
employees, use autonomous vehicles, or use gig workers as drivers. Furthermore, the transport
organization can be a freight delivery company. In delivery organizations, there are no passengers
to be incentivized; in autonomous vehicle organizations, there are no drivers to be incentivized;
and in ride-hailing organizations, the incentive encompasses both drivers and passengers. In
the categories where gig workers or passengers participate, they are not obligated to accept a
ride with a longer travel time: they can accept the incentivized longer route or reject it, and
participating drivers/passengers who reject the incentivized longer route can take the shortest route.
As previously mentioned, the platform is designed to provide payments to organizations. In cases
where the organization employs gig workers or involves passengers, the organization will have
to pass on some of these incentives to the gig workers and passengers to incentivize them to take
longer routes. Overall, the incentivization platform aims to improve system efficiency by moving
toward the System Optimal (SO) state by changing organizations’ routes via incentives.
3.1 Why Offer Incentives to Organizations Rather Than
Individuals?
Our methodology incentivizes organizations (rather than individual drivers). Let us first motivate
the benefit of this strategy via a simple example. Consider the subnetwork G̃ in Fig. 3.2 as a subset
of a larger network.
Figure 3.2: Subnetwork G˜ (selected in blue dashed rectangle).
Links a and b are routes between Origin-Destination (OD) nodes v1 (origin) and v2 (destination).
The travel times of a and b are 25 and 30 minutes at User Equilibrium (UE), respectively. Assume
20 drivers start traveling from v1 to v2 at the same time. If travel time is the most important factor
in their utility, they will all select route a because it is the fastest route at UE. Assume we have found the
System Optimal (SO) strategy for the entire network, and we need 15 out of the 20 drivers to select
b instead of a to achieve SO. At SO, the travel time of route a decreases to 20 minutes (5-minute
decrease), and the travel time of route b increases to 35 minutes (5-minute increase). Drivers who
remain on route a save 5 minutes due to the decrease in its travel time. Drivers deviated to route b
expect to lose 5 minutes because they expect route b to have a travel time of 30 minutes. Hence, since
we want to deviate 15 drivers to a route with a longer travel time (route b in this example), we should
compensate for their increased travel time. Assume the VOT is $1/min. Let us compare two scenarios:
(I) All 20 drivers are individual drivers. Since we need to deviate/incentivize 15 drivers and
compensate each of them for 5 minutes of their time, we need to spend $75 = (5 min × 15) × $1/min.
(II) All 20 drivers work in the same organization. In this scenario, the organization needs to
spend $75 to alter the decision of the 15 drivers. However, after offering incentives, the
travel time of the 5 remaining drivers on route a decreases. Therefore, the organization gains
25 = 5 × 5 minutes of time from the drivers who stayed on route a. Overall, the increase
and decrease in the drivers’ travel times partially cancel each other out (canceling-out effect). This
change only costs the organization 50 minutes of total time. Hence, the compensation cost is
$50 = 50 min × $1/min for the organization.
Therefore, we spend 33% less in incentivizing the organization (i.e., scenario (II)) compared
to incentivizing the individuals (i.e., scenario (I)). This example illustrates that incentivizing
organizations can be more cost-effective than incentivizing individual drivers. Fig. 3.3, Fig. 3.4,
and Fig. 3.5 illustrate different scenarios of payments and gains for organizations in self-driving
vehicles, delivery services, and ride-hailing services, respectively. Note that this observation does
not necessarily hold in general games; grouping users in a game does not necessarily lead to a
lower-cost Nash equilibrium.
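The arithmetic of the two scenarios can be verified with a short script; the numbers below (20 drivers, a $1/min VOT, 15 deviated drivers, and 5-minute travel time changes) are taken directly from the example above.

```python
VOT = 1.0            # value of time, $/min
n_deviated = 15      # drivers moved from route a to route b
n_stayed = 5         # drivers remaining on route a
delta_b = 5          # minutes lost by each deviated driver
delta_a = 5          # minutes saved by each driver staying on route a (25 -> 20 min)

# Scenario I: individual drivers; compensate each deviated driver separately.
cost_individual = VOT * n_deviated * delta_b                  # $75

# Scenario II: one organization; savings of the staying drivers offset losses.
net_minutes_lost = n_deviated * delta_b - n_stayed * delta_a  # 75 - 25 = 50
cost_organization = VOT * max(0, net_minutes_lost)            # $50

savings = 1 - cost_organization / cost_individual             # 33% cheaper
```

The canceling-out effect appears as the subtraction in `net_minutes_lost`: the organization internalizes the gains of the drivers who stay on the faster route.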
Figure 3.3: Payments and gains for self-driving vehicle organizations.
Figure 3.4: Payments and gains for delivery service organizations.
Figure 3.5: Payments and gains for ride-hailing service organizations.
3.2 Incentive Offering Mechanism and Problem Formulation
Given the origin-destination information of drivers in various organizations, the goal is to find the
“optimal” strategy for offering organization-level incentives to them to reduce the traffic congestion
of the system. To mathematically state the problem, we begin by defining our notations. A complete
list of notations used in this chapter can be found in Appendix 6.2.1. For further details of the
notation, an example is provided in Appendix B in the complete version of the work [132].
The traffic network is represented by a directed graph G = (V ,E ). Vertices V of the graph
are major ramps and intersections in the network. Vertices are connected by a set of edges E . In
our directed graph, the edge direction is determined by the allowable direction of travel on the
corresponding road segment, indicating the permissible movement from one node to another for a
driver. The adjacency of two nodes is based on the possibility of driving directly from one node to
another without visiting any other node. The network comprises a total number of road segments,
denoted as |E|, which reflects the cardinality of the set E. A route in the network is a path in the
graph and is denoted by a one-hot encoding. In other words, a given route is represented by a vector
r ∈ {0,1}^{|E|} in which the k-th entry is one if route r includes the k-th edge, and zero otherwise.
Let T = {1,...,T} denote the time horizon such that t = 1 marks the starting time of the
system. The traffic volume of the road segments at time t is represented by the vector v_t ∈ R^{|E|} in which
the k-th entry is the total number of vehicles on road segment k at time t.
Let N = N_1 ∪ ··· ∪ N_n denote the set of all drivers and N_i denote the set of drivers of
organization i. If a driver works for multiple organizations, he or she is counted as a different
driver at each organization; hence, N_1 ∩ ··· ∩ N_n = ∅. For any driver j ∈ N, let R_j ⊆ {0,1}^{|E|}
denote the set of the driver’s possible route choices between his or her origin and destination. The binary
variable s^{r,j}_i ∈ {0,1} represents the assigned route to the j-th driver of organization i: for this driver
and a given route r ∈ R_j, the variable s^{r,j}_i = 1 if route r is assigned to the j-th driver of organization i,
and s^{r,j}_i = 0 otherwise. Each driver can only be assigned to one route, i.e., ∑_{r∈R_j} s^{r,j}_i = 1. Given
any routing strategy assigned to drivers, we model the drivers’ decisions deterministically due to the
power of the organizations in enforcing their drivers’ routes.
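As a toy illustration of this encoding (the 4-edge network and the two routes below are invented, not taken from the experiments), each candidate route is a 0/1 indicator vector over edges, and a driver’s assignment variables must sum to one:

```python
import numpy as np

# Hypothetical 4-edge network; each candidate route is a 0/1 vector over edges.
route_1 = np.array([1, 1, 0, 0])  # uses edges 0 and 1
route_2 = np.array([1, 0, 1, 1])  # uses edges 0, 2, and 3

# Assignment variables s^{r,j}_i for one driver: exactly one route is selected.
s = np.array([1, 0])              # this driver is assigned route_1
assert s.sum() == 1               # constraint: sum over r of s^{r,j}_i equals 1

# The driver's contribution to edge usage is the indicator of the selected route.
edge_usage = s[0] * route_1 + s[1] * route_2
```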
In this chapter, we change the routing decision of organizations’ drivers by incentivizing their
organizations. The incentivization budget can be provided through resources similar to previous
studies [wu2023managing, wu2023public, 33, 34, 35, 39] (e.g., government). We assume that
organizations will accept our route assignments if the incentive offer can compensate for the change
in their total travel time. Notice that when the organizations decide to accept the offer, they have no
access to the offered route assignments to the other organizations. Hence, they can only estimate the
travel time based on historical data, and they will be compensated based on their loss/gain compared
to the historical setting. This compensation is computed by utilizing the VOT at the organization level. For example, in the case of ride-hailing services like Uber and Lyft, the incentivization
platform can provide incentives to the organization based on the VOT of drivers and passengers
(combined). Next, the organization utilizes the received budget to incentivize passengers (e.g., by
reducing the price) and the drivers who accept longer routes (by paying them). Those who reject
the incentivized longer route can take the shortest route. Note that an organization’s VOT is not
necessarily dependent on passengers or drivers because the set of organizations extends beyond the
ride-hailing sector. For instance, the VOT for delivery services would be associated with the delivery
partners’ VOT. Moreover, the VOT of autonomous vehicle organizations like Waymo pertains only
to passengers due to the absence of drivers. Our platform includes the flexibility to differentiate
VOTs for each organization due to the varied nature of their operations.
In this work, we adopt total travel time as the utility function, while alternative metrics like
energy consumption or total carbon emissions can also be considered. We compute the system total
travel time by summing the drivers’ travel time over all road segments and all time periods in the
horizon of interest:

F(v̂) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} v̂_{ℓ,t} θ_{ℓ,t}(v̂_{ℓ,t})    (3.1)

where θ_{ℓ,t} is the travel time of link ℓ at time t (which itself is a function of the link’s traffic volume
at that specific time). Here, v̂ is the vector of link volumes in which v̂_{ℓ,t} is the (|E|×(t−1)+ℓ)-th
element of vector v̂, corresponding to the volume on the ℓ-th link at time t. Using the volume vector,
we can then calculate the travel time for the links at various times, as outlined below.
Multiple approaches have been proposed to capture the relationship between traffic volume and
travel time. For instance, the Bureau of Public Roads (BPR) [105] presents a congestion function
for road links. This function describes a non-linear relationship between the travel time on a road
and its traffic volume:

θ(v) = f_BPR(v) = θ_0 (1 + 0.15 (v/w)^4)    (3.2)

where f_BPR(v) denotes the travel time for drivers on a road segment with traffic volume v;
θ_0 represents the segment’s free-flow travel time; and w is the road segment’s practical capacity.
In our experiments, to learn the parameters w and θ_0 of the road segments in the Los Angeles
area at different times of the day, we utilize the historical traffic data of the road segments. Given
the function θ(·) in (3.2), to compute the total travel time of the system, one needs to compute
the volume on each link. Subsequently, we elucidate the process by which the volume vector is
computed within our model.
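A direct transcription of the BPR function (3.2); the free-flow time and capacity values below are made up for illustration:

```python
def bpr_travel_time(v, theta0, w):
    """BPR congestion function (3.2): travel time on a link with volume v,
    free-flow travel time theta0, and practical capacity w."""
    return theta0 * (1.0 + 0.15 * (v / w) ** 4)

# Hypothetical link: 10-minute free-flow time, practical capacity of 2000 vehicles.
t_free = bpr_travel_time(0, theta0=10.0, w=2000.0)           # free-flow time at v = 0
t_congested = bpr_travel_time(3000, theta0=10.0, w=2000.0)   # rises steeply above capacity
```

Because of the fourth power, the delay grows slowly below capacity and sharply above it, which is why pushing a link past its practical capacity is so costly in the objective.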
Volume vector v̂: The computation of the volume vector v̂ requires (approximately) estimating the
location of the drivers at different times based on their routes. By assigning a different route to a
driver, the driver’s impact on the values of the vector v̂ changes because the driver’s location
will change by following a different route. We begin by introducing our notation for route
assignment: each driver’s assigned route is represented by a one-hot encoded vector. Thus, for
each driver, we have a binary vector s^j_i ∈ {0,1}^{|R|} in which only one element has a value of one,
corresponding to the route assigned to the j-th driver of organization i. As we need one vector
per driver, we can aggregate all our assignments in a matrix S ∈ {0,1}^{|R|×|N|} = [S_1 S_2 ... S_n],
where S_i ∈ {0,1}^{|R|×|N_i|} is the assignment matrix of organization i, with n being the number
of organizations. Elements in a driver’s assignment vector that correspond to routes unrelated to
their specific origin-destination pair are set to zero since travel on these routes is not possible for
the driver. Thus, the matrix S can be rather sparse.
Given a driver’s route and entry time into the system, we need to model the location of the
individual at the upcoming times. To model the drivers’ locations in the system, we use the
model developed by [104], in which the drivers’ locations are computed in a probabilistic fashion. This
model can be represented by a matrix R ∈ [0,1]^{(|E|·|T|)×|R|} which estimates the probability of a driver
being on a certain link at a given future time, under the assumption that they choose a specific route.
Multiple ways to estimate matrix R are suggested in [104], including an approach based on the
use of historical data. In our experiments in subsection 3.4.1, matrix R is computed based on the
volume at the UE state of the system. Given matrix R, it is easy to see that the vector

v̂ = R S 1 ∈ R^{|E|·|T|}    (3.3)

contains the expected number of vehicles on all the links at each time. Plugging the expression of v̂
into (3.1), we get the total travel time of the system as

F(v̂) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (RS1)_{ℓ,t} θ((RS1)_{ℓ,t}) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} (r_{ℓ,t} S 1) θ(r_{ℓ,t} S 1)    (3.4)

where r_{ℓ,t} is the row of matrix R which corresponds to link ℓ at time t.
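Equations (3.3) and (3.4) can be traced on a tiny instance (2 link-time cells, 2 routes, 3 drivers); the entries of R and the BPR parameters below are invented for illustration:

```python
import numpy as np

# Toy dimensions: |E|*|T| = 2 link-time cells, |R| = 2 routes, 3 drivers.
R = np.array([[1.0, 0.0],    # P(on link 0 | route 0), P(on link 0 | route 1)
              [0.0, 1.0]])   # P(on link 1 | route 0), P(on link 1 | route 1)
S = np.array([[1, 1, 0],     # drivers 0 and 1 assigned route 0
              [0, 0, 1]])    # driver 2 assigned route 1

v_hat = R @ S @ np.ones(3)   # equation (3.3): expected volume per link-time cell

# Total travel time (3.4) with the BPR link delay (toy theta0 and w values).
theta0, w = np.array([10.0, 12.0]), np.array([2.0, 2.0])
theta = theta0 * (1 + 0.15 * (v_hat / w) ** 4)
total_travel_time = float(v_hat @ theta)
```

Note that v̂ is affine in S, so reassigning one driver to a different route shifts probability mass between link-time cells and changes the objective through the BPR delays.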
To reduce the total travel time of the system, some drivers can be deviated to alternative routes to
lower the traffic flow on the congested links. To change the routing assignment of drivers, we need
to offer incentives to their organizations such that the incentives compensate for the organizations’
financial loss caused by accepting our assignment. For simplicity, we use the increase in the
organization’s total travel time as a measure of financial loss. Although we have estimated the travel
time of the system from equation (3.4), we need to compute the “route travel times” to be able to compare
the amount of change in travel time of each driver after offering incentives. Given the route travel
times, we compute the incentives using a model that depends on the VOT and the amount of increase in
the travel time for each organization. In particular, we assume that, given the route assignment to
organization i, the incentive value is

c_i = α_i max(0, ∑_{j∈N_i} δ^⊤ s^j_i − γ_i),    (3.5)

where c_i is the incentive offered to organization i, α_i ∈ R_+ is the VOT for organization i, δ ∈ R_+^{|R|·|T|} is
the vector of route travel times for each driver, and γ_i is the sum of the minimum-travel-time routes of
the drivers of organization i in the absence of incentivization. The parameter α_i is designed based
on the VOT specific to organization i. This approach allows the model to adjust the VOT for each
organization, accommodating the diverse nature of their operations. δ and γ_i are computed based on
the absence of incentivization. When ∑_{j∈N_i} δ^⊤ s^j_i − γ_i > 0, the organization’s total travel time has
increased compared to the baseline of having no incentive, and hence the system will compensate
the organization’s loss. On the other hand, when ∑_{j∈N_i} δ^⊤ s^j_i − γ_i < 0, the organization’s travel
time is improved after incentivization, and hence no incentive is required for this particular
organization to participate. The details of our method for computing the route travel time vector δ are
described next.
Route travel time vector δ: Estimation of the vector δ requires the volume on each link, which is
derived from the route assignment of the drivers. Let S denote the routing decision of the drivers.
Given S, we can estimate the volume vector v̂ using (3.3). By utilizing the BPR function (3.2) and
the estimated volume vector v̂, we can compute the speed of the links. Given the speed of each
link, we can determine the vector δ that contains the travel times of the routes for different time slots
and the vector η ∈ R_+^{K·|T|} that contains the travel time of the fastest route for the different OD pairs
at different times (K represents the total number of OD pairs). To do so, we rely on the method
provided by [104] and the routing decision S of drivers at the UE state of the system. Given the
minimum travel times between OD pairs in vector η, we can compute the minimum travel time of
organization i as γ_i = (B_i η)^⊤ 1, where B_i ∈ {0,1}^{|N_i|×(K·|T|)} is the matrix of shortest-travel-time
assignment of the drivers of organization i. B_i η is the vector of the shortest travel time between the OD
pair for each driver, and by summing the elements of this vector, we get γ_i.
Proposed formulation: To minimize the total travel time of the system via providing incentives
to organizations, we need to solve the following optimization problem:

min_{{S_i, c_i}_{i=1}^n}   ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} v̂_{ℓ,t} θ_{ℓ,t}(v̂_{ℓ,t})
s.t.   v̂ = ∑_{i=1}^n R S_i 1
       D S_i 1 = q_i,   ∀i = 1,...,n
       S_i^⊤ 1 = 1,   ∀i = 1,...,n
       S_i ∈ {0,1}^{(|R|·|T|)×|N_i|},   ∀i = 1,...,n
       S_i^⊤ δ ≤ b_i ⊙ B_i η,   ∀i = 1,...,n
       c_i ≥ α_i (δ^⊤ S_i 1 − γ_i),   ∀i = 1,...,n
       c_i ≥ 0,   ∀i = 1,...,n
       c_1 + c_2 + ··· + c_n ≤ Ω    (3.6)

where v̂_{ℓ,t} is the element of vector v̂ that corresponds to the volume of link ℓ at time t, c_i ∈ R_+
is the cost of the incentive assigned to organization i, D ∈ {0,1}^{(K·|T|)×(|R|·|T|)} is the matrix of route
assignment of the OD pairs, b_i ∈ R_+^{|N_i|} denotes the factors by which the travel time of an assigned
route can be larger than the shortest travel time of the OD pair, B_i ∈ {0,1}^{|N_i|×(K·|T|)} is the matrix of
shortest-travel-time assignment of the drivers of organization i, and q_i ∈ R^{K·|T|} is the vector of the number
of drivers of organization i for each OD pair at different times. If there are drivers in the system that
do not work for any organization, we can consider them as a single organization whose decision
matrix is initialized with fixed values such that they are assigned to the fastest route (assuming
they always select the shortest route). The same idea can be employed for organizations that do not
join the incentivization platform. The following provides a detailed explanation of the
constraints:
Constraint 1 (v̂ = ∑_{i=1}^n R S_i 1): This constraint computes the volume on each link at
different times based on the routing assignments of the organizations.
Constraint 2 (D S_i 1 = q_i): This constraint ensures that the correct number of drivers is assigned to
the routes between OD pairs. S_i 1 represents the number of drivers assigned to the different routes.
The matrix D is utilized to aggregate the count of drivers assigned to the various routes within the same
OD pair. The vector q_i represents the actual number of drivers from organization i traveling between
these OD pairs, and the product D S_i 1 is required to equal q_i.
Constraint 3 (S_i^⊤ 1 = 1): This constraint simply states that we can only assign one route to each
driver of organization i.
Constraint 4 (S_i ∈ {0,1}^{(|R|·|T|)×|N_i|}): This constraint enforces a binary framework on our decision
variables, where 0 indicates not assigning a route and 1 signifies route assignment.
Constraint 5 (S_i^⊤ δ ≤ b_i ⊙ B_i η): This is our fairness and time-delivery constraint. For various
reasons, such as urgent deliveries by some of the organizations’ drivers, drivers may not accept
alternative routes that deviate significantly from the fastest route. Moreover, the platform should
consider fairness between different drivers in terms of the amount of deviation from the shortest
travel time. The fairness and time-delivery constraint bounds the deviation of the travel time of the
assigned routes from the minimum travel time. S_i^⊤ δ represents the travel times of the routes assigned
to the drivers of organization i. b_i ∈ R_+^{|N_i|} denotes the factors by which deviation is allowed for each
driver of organization i.
Constraints 6 and 7 (c_i ≥ α_i(δ^⊤ S_i 1 − γ_i) and c_i ≥ 0): These two constraints guarantee (3.5).
Constraint 8 (c_1 + c_2 + ··· + c_n ≤ Ω): This represents our budget constraint. The scalar c_i denotes
the incentive amount allocated to organization i, and Ω signifies the total budget available.
For further elaboration on model (3.6) and its constraints, an illustrative example is presented in
Appendix B in the complete version of the work [132].
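On a tiny instance, problem (3.6) can be solved by brute-force enumeration, which makes the interplay of the routing, incentive, and budget constraints concrete. All data below (a 2-link network where route k uses only link k, one organization with two drivers, toy BPR parameters, a $5 budget) are invented for illustration, and the fairness constraint (Constraint 5) is omitted for brevity:

```python
import itertools
import numpy as np

# Toy data: 2 links, 2 routes (route k uses link k), 2 drivers, 1 organization.
R = np.eye(2)                       # link-occupancy matrix of the routes
theta0, w = np.array([10.0, 12.0]), np.array([1.0, 1.0])
delta = np.array([10.0, 12.0])      # route travel times estimated at UE
gamma = 20.0                        # both drivers' fastest route: 10 + 10 minutes
alpha, budget = 1.0, 5.0            # VOT in $/min and incentive budget Omega

def total_time(v):
    """Objective of (3.6): volume-weighted BPR travel time over links."""
    return float(v @ (theta0 * (1 + 0.15 * (v / w) ** 4)))

best = None
for a0, a1 in itertools.product(range(2), repeat=2):   # route index per driver
    S = np.zeros((2, 2)); S[a0, 0] = S[a1, 1] = 1      # one route per driver
    v = R @ S @ np.ones(2)                             # Constraint 1: volumes
    cost = alpha * max(0.0, float(delta @ S @ np.ones(2)) - gamma)  # incentive (3.5)
    if cost <= budget:                                  # Constraint 8: budget
        obj = total_time(v)
        if best is None or obj < best[0]:
            best = (obj, (a0, a1), cost)

best_obj, best_routes, best_cost = best
```

Splitting the two drivers across the two links wins: keeping both on the fast link costs 68 minutes of system time, while the split costs 25.3 minutes and only $2 of incentive. Enumeration obviously does not scale, which is why the chapter turns to relaxation and ADMM next.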
3.3 Incentivization Algorithm and A Distributed Implementation
Optimization problem (3.6) is of large size and includes binary variables (S_i, ∀i = 1,...,n). Thus,
solving it efficiently is a challenging task. In this subsection, we propose an efficient algorithm for
solving it. First, we relax the binary constraint S_i ∈ {0,1}^{(|R|·|T|)×|N_i|} to the convex constraint
S_i ∈ [0,1]^{(|R|·|T|)×|N_i|}; we refer to this as the relaxed version of problem (3.6). The objective
function is a summation of monomial functions with positive coefficients. Furthermore, θ_{ℓ,t} is
an affine mapping of the optimization variable S_i. Since our domain is the nonnegative orthant
and monomials are convex in this domain, the objective function is convex. As the constraints
of this problem are convex, the relaxed version of problem (3.6) becomes a convex optimization
problem. Thus, standard solvers such as CVX [107] can be used to solve it. However,
these solvers have large computational complexity because they utilize methods such as interior-point
methods [108] with O(n^3) iteration complexity, where n is the number of variables. This
computational complexity is not practical for our problem. In what follows, we rely on first-order
methods with linear computational complexity in n, which is affordable in our problem. The
reformulation is provided in Appendix 6.2.2. This reformulation is amenable to the Alternating
Direction Method of Multipliers (ADMM) [109, 110, 111, 112, 114], which is a first-order and
scalable method. Appendix 6.1.2 provides an overview of ADMM, the fundamental
component of our framework. The steps of the resulting algorithm are provided in Algorithm 3 in
Appendix 6.2.3. The details of the derivation of this algorithm are provided in Appendix E in the
complete version of the work [132]. Due to the distributed setting of Algorithm 3 under the ADMM
method, it also provides the potential benefits associated with federated learning and distributed
systems [115, 87].
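Algorithm 3 itself (and its derivation) is in the appendix; as background, the generic ADMM splitting it builds on can be sketched on a small, unrelated consensus problem (a lasso regression), showing the characteristic x-update, z-update, and dual-update structure:

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    """Generic ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    M = np.linalg.inv(AtA + rho * np.eye(n))          # cached x-update system
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))                 # x-update: quadratic solve
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # z-update
        u = u + x - z                                 # dual update on x = z
    return z

# Toy data with a sparse ground truth; the setup is invented for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([1.0, 0.0, 0.0, -2.0, 0.0])
b = A @ x_true
x_hat = admm_lasso(A, b, lam=0.05)
```

Each iteration touches the variables only through cheap (here, closed-form) subproblem solves, which is the property that makes the method scale to the large S_i blocks of the relaxed problem and lets the blocks be updated in a distributed fashion.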
In the relaxed version of problem (3.6), different solutions S_i^* with a fixed S_i^* 1 = u_i^* yield the
same objective value as long as S_i^* satisfies all the constraints. Thus, potentially infinitely many solutions
to our convex problem exist, and many are not binary. To promote a binary solution for the final
decision, we introduce the following regularizer into the objective function of the relaxed version of
problem (3.6):

ℜ(S) = −(λ̃/2) ∑_{i=1}^n ∑_{j=1}^{|N_i|} ∑_{r=1}^{|R|} ∑_{t=1}^{|T|} (S_i)_{j,r,t} ((S_i)_{j,r,t} − 1)    (3.7)

where λ̃ ∈ R_+ is the regularization parameter and (S_i)_{j,r,t} ∈ [0,1]. This regularizer has the effect
of driving the elements of matrix S toward the binary domain {0,1}: it penalizes
any deviation from this domain in the objective function. While convexity is sacrificed due to the
regularization, ADMM can still converge in nonconvex problems [112].
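Numerically, the per-entry penalty in (3.7) is zero at the binary points and largest at 1/2, which is what pushes the relaxed entries of S_i toward {0,1} when the regularizer is added to the minimization objective (λ̃ = 1 below for illustration):

```python
def binary_penalty(s, lam=1.0):
    """Per-entry regularizer term from (3.7): -lam/2 * s * (s - 1) for s in [0, 1]."""
    return -lam / 2 * s * (s - 1)

assert binary_penalty(0.0) == 0.0 and binary_penalty(1.0) == 0.0  # free at binary points
# Fractional values pay a positive price, maximal at s = 0.5.
penalties = [binary_penalty(s) for s in (0.25, 0.5, 0.75)]
```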
Algorithm 3 solves the relaxed version of problem (3.6). Since the solution to the relaxed version
of problem (3.6) may not be binary (due to the relaxation), we need to project it back to the feasible
region. For computational purposes, we suggest using an ℓ1 projection obtained by solving the
following mixed-integer linear program (MILP):

min_{{S_i, c_i}_{i=1}^n}   ∑_{i=1}^n ∥S_i 1 − u_i^*∥_1
s.t.   D S_i 1 = q_i,   ∀i = 1,2,...,n
       S_i^⊤ 1 = 1,   ∀i = 1,2,...,n
       S_i ∈ {0,1}^{(|R|·|T|)×|N_i|},   ∀i = 1,2,...,n
       S_i^⊤ δ ≤ b_i ⊙ B_i η,   ∀i = 1,2,...,n
       c_i ≥ α_i(δ^⊤ S_i 1 − γ_i),   ∀i = 1,2,...,n
       c_i ≥ 0,   ∀i = 1,2,...,n
       c_1 + c_2 + ··· + c_n ≤ Ω    (3.8)

where u_i^*, ∀i = 1,2,...,n, is the optimal solution obtained by Algorithm 3. Clearly, this problem can
be reformulated as a MILP and solved using off-the-shelf solvers like Gurobi. Solving
problem (3.8) can be easier than solving problem (3.6): the two problems have the same variable
size and similar constraints, but the objective functions are different. While the objective function in
problem (3.8) can be restructured as a linear program, problem (3.6) has a polynomial objective
function that introduces more complexity.
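A brute-force miniature of the projection step (3.8), keeping only the one-route-per-driver constraint for brevity (the fractional vector u* below is invented):

```python
import itertools
import numpy as np

u_star = np.array([1.4, 0.6])   # fractional route counts from the relaxed solution

best_S, best_err = None, float("inf")
for a0, a1 in itertools.product(range(2), repeat=2):    # route choice per driver
    S = np.zeros((2, 2)); S[a0, 0] = S[a1, 1] = 1       # binary, one route per driver
    err = float(np.abs(S @ np.ones(2) - u_star).sum())  # ell-1 distance to u*
    if err < best_err:
        best_S, best_err = S, err
```

The binary assignment closest to u* in ℓ1 norm places one driver on each route, matching the fractional counts (1.4, 0.6) as well as integrality allows; a MILP solver does the same search implicitly under the full constraint set of (3.8).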
3.4 Experiments
We evaluate our incentive scheme’s effectiveness using Los Angeles area data. The presence of
multiple routes between most OD pairs makes the Los Angeles area particularly suitable for our
assessment. We use the data collected by the Archived Data Management System (ADMS), a comprehensive transportation dataset compiled by University of Southern California researchers [122].
This system aggregates data from Los Angeles, Orange, San Bernardino, Riverside, and Ventura
Counties, offering a robust data source for analysis.
For our evaluations, we need to estimate the OD matrix. The (i, j)-th entry of the OD matrix
represents the count of drivers traveling between origin i and destination j. Because drivers’ routing
data are unavailable, we estimate the OD matrix using the available network flow information.
The OD matrix estimation problem is challenging due to its under-determined nature [123, 124, 125].
OD matrices are categorized as static or dynamic [126]. However, many dynamic OD estimation
(DODE) methods are computationally impractical for our high-resolution data. Additionally, some
studies rely on existing OD matrix data [127, 128, 129, 130], which we lack. Given these constraints,
we adopt the OD estimation algorithm proposed by [104]. Note that OD estimations in our study
serve as an input to our incentivization model rather than being the focus of our analysis, as we do
not propose a new OD estimation algorithm.
The base VOT of our experiments is derived from the estimation by [131], which is $2.63 per
minute or $157.8 per hour. In our experiments, we apply a uniform VOT across all organizations. We
note that, in practice, we do not initially know the exact VOT of passengers and drivers. Moreover,
the perceived VOT by organizations can change over time because the incentive policy would
necessitate algorithmic adjustments within the organizations. Specifically, they would need to
modify their payment algorithms to allocate the received incentives between passengers and drivers
that accept longer routes. Such algorithmic updates would allow the organizations to optimize
their operations and services in response to the incentive policy. The incentivization platform
can learn the VOT of passengers and drivers by continuously observing their acceptance/rejection
behavior through techniques in online/reinforcement learning. Learning the VOT is beyond the
scope of our work, and we assume the VOT is known. All the codes are publicly available at:
https://github.com/ghafeleb/Incentive_Systems_for_New_Mobility_Services.
3.4.1 Simulation Model
First, we extract sensor details, including their locations, and then the speed and volume data of the
selected sensors. Nodes for the network graph are chosen from on-ramps and highway intersections.
Connecting-link data are derived from the in-between sensors. Node distances are determined via the Google
Maps API. The data preparation workflow is shown in Fig. 6.3 in Appendix 6.2.5. The network under
consideration includes highways surrounding Downtown Los Angeles, as depicted in Fig. 3.6, and
consists of 12 nodes, 32 links, and a total road length of 288.1 miles. We have 144 OD pairs, and
we employ the algorithm from [104] on the network’s speed and volume data for OD estimation.
Fig. 3.7 shows the total estimated incoming drivers per time interval. We explore 3 routing options
for each OD pair. Initially, the shortest path is determined. Subsequently, links in the first path are
removed to uncover the second shortest path if available. This process is repeated for the third route.
Based on this process, we find 270 paths between OD pairs.
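The route-generation procedure above (find the shortest path, remove its links, repeat) can be sketched in a few lines of plain Python with Dijkstra’s algorithm; the toy network below is invented:

```python
import heapq

def shortest_path(adj, src, dst):
    """Dijkstra over an adjacency dict {u: {v: travel_time}}; returns a node list or None."""
    pq, seen = [(0.0, src, [src])], set()
    while pq:
        d, u, path = heapq.heappop(pq)
        if u == dst:
            return path
        if u in seen:
            continue
        seen.add(u)
        for v, t in adj.get(u, {}).items():
            if v not in seen:
                heapq.heappush(pq, (d + t, v, path + [v]))
    return None

def alternative_routes(adj, src, dst, k=3):
    """Find up to k routes by repeatedly removing the links of the last path found."""
    adj = {u: dict(nbrs) for u, nbrs in adj.items()}   # work on a copy
    routes = []
    for _ in range(k):
        path = shortest_path(adj, src, dst)
        if path is None:
            break
        routes.append(path)
        for u, v in zip(path, path[1:]):               # remove the used links
            adj[u].pop(v, None)
    return routes

# Hypothetical toy network: three roughly parallel paths from A to D.
adj = {"A": {"B": 1, "C": 2, "D": 5}, "B": {"D": 1}, "C": {"D": 1}}
routes = alternative_routes(adj, "A", "D")
```

Each removal forces the next search onto disjoint links, so the returned routes are link-disjoint alternatives in increasing order of travel time, mirroring how the three route options per OD pair are generated.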
In practice, travel time prediction and OD estimation are handled by organizations using
their sophisticated software and data-collection tools. By incorporating these prediction tools, the
framework can account for external factors such as weather conditions in traffic predictions or road
closures when finding possible routes, since our approach focuses on short-term planning (only a few
hours ahead).
We focus on incentivizing the organizations to change their behavior for the 7 AM to 8 AM
interval (which is the rush hour based on the estimated number of incoming drivers in Fig. 3.7).
Although we have selected 7 AM to 8 AM as the incentivization time period, we also include 8 AM
to 8:30 AM in our experiments because some of the drivers entering between 7 AM and 8 AM may
not finish their route before 8 AM. To track the effect of these drivers on the total travel time of the
system, we include traffic flow from 8 AM to 8:30 AM in our analysis as well. The OD estimation
algorithm’s projected total count of drivers entering the system from 6 AM to 9 AM is illustrated in
Fig. 3.7. From 7 AM to 8:30 AM, a total of 11,985 drivers enter the system.
We consider the traffic volume of the network at UE in our baseline. To compute the volume of
Figure 3.6: Studied region and the highway sensors inside the region. This region encompasses
several areas notorious for high traffic congestion, particularly Downtown Los Angeles.
Figure 3.7: Total estimated number of drivers entering the system over time (in 5-minute intervals).
The plot shows that traffic peak happens between 7 AM and 8 AM.
the network at UE, we use the UE algorithm in [133]. The algorithm receives the volume (historical
data) and OD estimation as inputs and returns the matrix R_UE and the route travel time vector δ_UE at
UE. To compute the cost of organizations' incentivization, we need to know the route travel times
when drivers have made decisions based on the UE route travel time δ_UE. Hence, we compute the
new volume vector v_new = R_UE S_UE 1, where S_UE is the assignment of drivers to the fastest route
based on the UE route travel time vector δ_UE. Using the BPR function, the volume vector v_new, and
δ_UE, we compute δ, which denotes the travel time of the routes if drivers make decisions based on δ_UE,
and η, which denotes the minimum travel time between the different OD pairs.
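The BPR step above can be sketched as follows. We use the standard BPR form with its common default parameters (alpha = 0.15, beta = 4), which may differ from the calibration used in the thesis.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Standard BPR link travel time: t = t0 * (1 + alpha * (v / c)^beta)."""
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)
```

At zero volume a link runs at its free-flow time; at volume equal to capacity the travel time grows by the factor (1 + alpha).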
3.4.2 Results
In this subsection, using our model and algorithm, we study the impact of organization incentivization
for different budget values, numbers of organizations, VOTs, and percentages of drivers who are
employed by the organizations in the incentivization program. The remaining drivers are assumed to be
background drivers who follow δ_UE. We consider four scenarios for the percentage of drivers that enter
the system between 7 AM and 8 AM and belong to organizations that we can incentivize (penetration
rate): I. 5% (407 drivers), II. 10% (812 drivers), III. 15% (1221 drivers), IV. 20% (1626 drivers).
Selected drivers in each scenario are included in the scenarios with larger percentages to allow a
standard comparison between scenarios. Drivers in each organization are selected uniformly at random.
The percentage of travel time decrease with incentivization, compared to a system with no
incentivization scheme, at a VOT of $157.8 per hour is presented in Fig. 3.8 for different penetration
rates. In our plots, the budget of $0 corresponds to the no-incentivization case. We observe that
increasing the budget increases the decrease in travel time (as expected). This decrease is larger
for the same budgets at larger penetration rates because the model has access to more drivers
to select from and has more flexibility to recommend alternative routes. For the purpose of sensitivity
analysis, we also provide the travel time decrease for all penetration rates with a different VOT of
$157.8/2 = $78.9 per hour in Fig. 3.9. The comparison of results for different VOTs in Fig. 3.8 and
Figure 3.8: Percentage of travel time decrease with different budgets at VOT=$157.8 per hour using
Algorithm 3. The amount of travel time reduction shows a positive correlation with the amount of
incentivization budget and the penetration rate.
Fig. 3.9 shows that for a very large budget, the decrease in travel time is almost identical. This is
because none of the models utilizes the entire budget at the $10000 level. However, when budgets
are limited, the performance disparity can reach up to 1.41% due to the lower incentivization costs
associated with the smaller VOT.
For the subsequent analyses of our numerical results, we only report the results for our base VOT
($2.63 per minute, or $157.8 per hour) because the results follow similar patterns with a VOT of $78.9.
In Fig. 3.10, we present the total incentivization cost for different budgets and penetration rates
when there is one organization in the system. This cost increases as the available budget grows,
showing that the platform utilizes additional resources when it has access to more money.
We observe that more driver involvement at the $800 and $2000 budgets leads to a slightly smaller
cost at larger penetration rates because of more flexibility in selecting drivers. At a $10000 budget,
the platform does not exhaust the whole budget at any penetration rate. Hence, it spends more on
incentivization at larger penetration rates by incentivizing a greater number of participants. Fig. 3.11
Figure 3.9: Percentage of travel time decrease with different budgets at VOT=$78.9 per hour using
Algorithm 3. The amount of travel time decrease is similar or larger compared to Fig. 3.8 with
VOT=$157.8 due to the smaller cost of incentivization.
shows the cost per deviated driver. The cost per driver is significantly smaller at larger penetration
rates because the model has more flexibility in choosing drivers efficiently. Moreover, the cost
per driver increases with the budget, showing that our model utilizes the budget efficiently by
providing the more affordable incentives first when the budget is low. As Table 3.1 shows, the
number of incentivized drivers at larger penetration rates is larger because more drivers are available
for selection.
Penetration Rate | $200 | $800 | $2000 | $10000
       5%        |  20  |  34  |   48  |    48
      10%        |  31  |  51  |   74  |    94
      15%        |  42  |  72  |  101  |   151
      20%        |  49  |  90  |  123  |   195
Table 3.1: Number of drivers that were assigned to an alternative route using Algorithm 3, for each
penetration rate and budget.
The number of organizations in the system can alter the total travel time and cost. Fig. 3.12
Figure 3.10: Total cost of incentivization for one organization with different budgets and different
penetration rates at VOT=$157.8 per hour using Algorithm 3. The incentivization cost increases with
the budget as the model incentivizes more drivers to reduce traffic. The cost is larger at larger
penetration rates at the $10000 budget because the model incentivizes more drivers. At smaller
budgets, the incentivization cost is smaller at larger penetration rates because of more flexibility in
selecting drivers under limited budgets.
Figure 3.11: Cost of incentivization per deviated driver for one organization with different budgets
and different penetration rates at VOT=$157.8 per hour using Algorithm 3. At larger penetration
rates, the platform can incentivize drivers more efficiently due to access to a larger pool, spending
a smaller incentivization amount per deviated driver.
illustrates the percentage decrease in travel time and the total cost for different numbers of
organizations in the system at a 5% penetration rate. As an extreme case, we also include the setting
in which each organization contains one driver (i.e., we incentivize individuals rather than organizations).
In Fig. 3.12, we observe a larger cost for achieving the same travel time reduction when
there are more organizations in the system. The intuitive reason behind this observation is as follows.
Within each organization, after incentivization, some drivers lose travel time and some gain travel time. At
the organizational level, these time changes can cancel each other out, and hence we may
not need to compensate the organization significantly. When the number of drivers per organization
decreases, the canceling effect becomes weaker, and the incentivization costs more. This is in line
with our discussion in Section 3.1 and also explains why incentivizing organizations is much more
cost-efficient than incentivizing individual drivers. This observation remains consistent across other
penetration rates; therefore, corresponding plots for other rates are not provided.
Our experiments use Algorithm 3 to solve the relaxed version of problem (3.6) and utilize
Figure 3.12: Travel time decrease vs. incentivization cost for different numbers of organizations
at a 5% penetration rate and VOT=$157.8 per hour using Algorithm 3. The incentivization cost
for the same travel time reduction is smaller when the number of organizations is smaller. This
phenomenon is due to the cancel-out effect between drivers' travel time gains and losses within an
organization.
the Gurobi solver to solve the MILP problem (3.8). We compare our approach against solving the
MIP problem (3.6), utilizing Gurobi and MOSEK. These solvers are recognized as state-of-the-art,
off-the-shelf commercial tools for linear and mixed integer optimization problems. We configure
the relative mixed integer programming optimality gap at 0.01 for both solvers to strike a good
trade-off between computational speed and accuracy. Fig. 3.13 shows that the Gurobi solver achieves
a slightly better travel time reduction compared to our method. MOSEK is not included in this
plot because its performance closely mirrors that of Gurobi. Although the solvers show a slight
advantage in reducing travel time, our presented method significantly outpaces these solvers when
parallel computation techniques are applied. As Fig. 3.14 shows, our method achieves speeds up
to 12 times faster than Gurobi and 120 times faster than MOSEK, demonstrating a considerable
advantage in computational efficiency. This enhanced speed not only translates into quicker
solutions but also suggests potential for real-time application in dynamic traffic management
scenarios where rapid decision-making is critical. Moreover, Fig. 3.15 illustrates that, at the $10000
budget, the Gurobi solver spends significantly more (up to $5000) on incentivization compared
to Algorithm 3. This discrepancy highlights the cost-efficiency of our algorithm, particularly in
managing budget allocations effectively while achieving comparable traffic management outcomes.
The potential reason is that Gurobi employs branch-and-bound and linear programming solvers to
find the solution in a finite number of steps, relying on extreme points. In contrast, Algorithm 3 is
based on a first-order method and asymptotically converges to the solution, stopping upon finding
an ε-optimal solution. Due to the similar incentivization cost of MOSEK and Gurobi, a comparative
analysis for MOSEK is not included.
3.5 Conclusion
In this chapter, we studied the problem of incentivizing organizations to reduce traffic congestion. To
this end, we developed a mathematical model and provided an algorithm for offering organization-level
incentives. In our framework, a central planner collects the origin-destination and routing
Figure 3.13: Comparison of travel time reduction percentage using different solvers with different
penetration rates and budgets at VOT=$157.8 per hour. Gurobi exhibits a slight performance
advantage over Algorithm 3 at higher penetration rates and budgets.
Figure 3.14: Comparison of the relative execution time of Algorithm 3 vs. different solvers at
different penetration rates at VOT=$157.8 per hour. Algorithm 3 consistently runs up to 12 and 120
times faster than Gurobi and MOSEK, respectively.
Figure 3.15: Comparison of the relative incentivization cost using Algorithm 3 vs. Gurobi at
different penetration rates, VOT=$157.8 per hour, and one organization. Both methods spend
similar incentivization amounts for smaller budgets, but at the $10000 budget, Gurobi spends up to
$5000 more.
information of the organizations. Then, the central planner utilizes this information to offer incentive
packages to organizations to incentivize a system-level optimal routing strategy. In particular, we
focused on minimizing the total travel time of the network. However, other utilities can be used
in our framework. Finally, we employed data from the ADMS to evaluate the performance of our
model and algorithm in a representative traffic scenario in the Los Angeles area. Our framework
achieved a 6.90% reduction in the total travel time of the network in the experiments. More
importantly, we observed that incentivizing companies/organizations is more cost-efficient than
incentivizing individual drivers. As future work, it is important to study the effect of incentives
on changing trip start times. This is particularly relevant for future mobility services because
many of them, such as delivery services, are flexible in terms of trip time to a certain degree. In
addition, we can consider the stochastic nature of individual drivers' routing decisions.
Moreover, we can extend the incentivization framework to the case in which not all organizations accept
their offers. Our platform also has the limitation of assuming that the VOT is given and fixed.
Furthermore, as a potential legal, ethical, and practical constraint, we should consider the privacy
of individuals' data. We can adopt approaches similar to those used in previous incentivization
projects with real-world implementations [33, 34, 35, 39] to address this issue. Further discussions
on the limitations and scalability of the platform are provided in Appendix 6.2.4. All the code for this
chapter can be found in [101].
Chapter 4
Deep Traffic Prediction via Private Cooperation of New Mobility Services
In this chapter, we begin with an overview of Differential Privacy (DP) to provide the necessary
background. Next, we present a comprehensive discussion of our framework, including the
notation used, the integration of DP into our approach, the traffic prediction model employed, and
the utilization of DP and public data in Graph Neural Network (GNN) training within a collaborative
setting. Finally, we evaluate the performance of our framework in traffic prediction under various
scenarios, using real-world data from Uber and Lyft and simulated data representing companies
with different market shares.
4.1 Background on Differential Privacy
This section provides an overview of differential privacy, a foundational concept in privacy-preserving
machine learning. The key advantage of this mathematically rigorous method is its
ability to quantify privacy leakage through its formal definition, allowing data holders to balance
utility and privacy effectively. By leveraging DP, our approach ensures privacy protection while
enabling meaningful insights from sensitive user data. It is important to note that our framework
focuses exclusively on feature-level privacy because the structure of the transportation network
graphs is assumed to have no privacy restrictions.
An important property of DP is post-processing immunity. This property ensures that if data or
a model is generated using a differentially private mechanism, an adversary cannot compromise
privacy protection by applying any arbitrary computations or by using auxiliary information. In other
words, the differential privacy guarantee persists regardless of any subsequent analysis performed
on the released data.
Definition 1 (Differential Privacy). Let ε ≥ 0 and δ ∈ [0, 1). A randomized algorithm A : X^n → W
is (ε, δ)-differentially private (DP) if for all neighboring datasets X, X′ differing by one record and
all subsets S ⊆ W, we have

P(A(X) ∈ S) ≤ e^ε P(A(X′) ∈ S) + δ. (4.1)
Based on this definition, an adversary cannot distinguish between the outputs of the randomized
mechanism A when applied to two neighboring datasets X and X′ because the probabilities of A
producing an output in any set S under the two neighboring datasets are close. The level of privacy
is determined by the parameters ε and δ. Smaller values of these parameters correspond to stronger
privacy protection. It is important to note that there is a privacy-utility trade-off: higher levels of
privacy (smaller ε and δ) can decrease the utility or accuracy of the data or model, and vice versa.
In other words, enhancing privacy comes at the cost of reduced utility.
Another critical factor influencing the amount of noise required to achieve (ε, δ)-differential
privacy is the sensitivity of a query. Sensitivity measures the maximum change in a query's output
resulting from the addition or removal of a single individual's data. In this chapter, the taxi dataset
exhibits a sensitivity of 1 because it aligns with the structure of histogram queries.
Histogram Query: In certain datasets, data is organized into disjoint cells (bins), and a query
retrieves the count of points in each cell. For example, a city can be divided into smaller disjoint
regions to query the number of pickups or dropoffs in each region. This type of query, known as a
histogram query, has a sensitivity of 1 because adding or removing a single data point affects the
count in only one bin by at most 1 [99]. Histogram queries are particularly significant in applications
involving spatial data, such as pickup or dropoff locations, because the required noise to ensure
privacy remains constant regardless of the number of bins or queries. In Subsection 4.2.1, we
discuss how our taxi data is a special case of histogram queries in more detail.
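As a concrete illustration of the sensitivity argument above, the sketch below (our own illustrative code, not from the thesis) counts pickups per region and shows that adding or removing one record changes the histogram in exactly one bin by at most 1.

```python
def histogram_counts(user_regions, num_regions):
    """Histogram query: count pickups (or drop-offs) per region.
    user_regions[i] is the region index of the i-th recorded event."""
    counts = [0] * num_regions
    for r in user_regions:
        counts[r] += 1
    return counts
```

Removing one record changes exactly one bin by 1 regardless of the number of regions, which is why the noise needed for privacy does not grow with the number of bins or queries.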
Building on these foundational concepts, we now explore the specific Gaussian mechanism that
ensures differential privacy by carefully calibrating the noise added to query results.
Gaussian mechanism: A common method to impose differential privacy is the Gaussian
mechanism [99]. This mechanism adds random noise drawn from a Gaussian distribution to a
non-private input to make it private. The variance of the noise controls the level of privacy: higher
variance yields stronger privacy, and vice versa. DP-SGD provides a method based on the Gaussian
mechanism for learning tasks [95]. This method adds Gaussian noise to the parameter gradients
after clipping their norm at each iteration. We use this method in our framework to train DP GNN
models.
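The two ingredients above can be sketched in a few lines. The noise scale below follows the classic analytic bound for the Gaussian mechanism (valid for ε < 1), and the clipping helper mirrors the per-example gradient clipping step of DP-SGD; all function names are illustrative, not from the thesis.

```python
import math
import random

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism:
    sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon (for epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def gaussian_mechanism(counts, sensitivity, epsilon, delta, rng=None):
    """Release a list of counts with (epsilon, delta)-DP by adding
    i.i.d. Gaussian noise to every entry."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [c + rng.gauss(0.0, sigma) for c in counts]

def clip_gradient(grad, clip_norm):
    """The per-example clipping step of DP-SGD: rescale a gradient so its
    L2 norm is at most clip_norm before Gaussian noise is added."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]
```

Clipping bounds the sensitivity of the summed gradient by clip_norm, which is what lets the Gaussian noise scale be calibrated independently of any single example's influence.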
4.2 Proposed Method
4.2.1 Problem Definition & Notations
In recent years, researchers have focused on traffic forecasting using GNN models due to their high
performance in capturing spatio-temporal patterns in traffic data. In this chapter, we leverage GNN
models for our experiments. Alongside traffic features, GNN models require the structure of the
transportation network graph as an additional input. Since we utilize two types of traffic data from
the same urban area, we construct two separate transportation network graphs:
I Region graph Gr for taxi data.
II Highway graph Gh for real-time speed data.
Definition 2 (Region Graph). The region network is represented by an undirected graph Gr =
(Vr, Er, Ar), where Vr is the set of Nr = |Vr| nodes representing subregions of the network, Er is the
set of edges representing connections between neighboring subregions, and Ar ∈ {0,1}^{Nr×Nr} is the
adjacency matrix. The nodes corresponding to two subregions are connected if the subregions overlap
at their borders.
Definition 3 (Highway Graph). The highway network is represented by a directed graph Gh =
(Vh, Eh, Ah), where Vh is the set of Nh = |Vh| nodes representing highways in the network, Eh is the
set of edges representing connections between highways, and Ah ∈ {0,1}^{Nh×Nh} is the adjacency
matrix. Two nodes corresponding to highways are connected if it is possible to travel directly from
one highway to the other without passing through any intermediate highways.
In our binary adjacency matrices Ar and Ah, the entry A_{i,j} is 1 if nodes v_i and v_j ∈ V are
connected (i.e., (v_i, v_j) ∈ E), and 0 otherwise. Additionally, each graph is associated with a feature
matrix that aggregates the signals from all nodes across different timestamps:
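Both adjacency matrices can be built from an edge list. The helper below is an illustrative sketch (the names are ours): `directed=True` corresponds to the highway graph Gh, and `directed=False` to the undirected region graph Gr.

```python
def build_adjacency(num_nodes, edges, directed=False):
    """Binary adjacency matrix A with A[i][j] = 1 iff (v_i, v_j) is an edge."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:  # the region graph Gr is undirected
            A[j][i] = 1
    return A
```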
Definition 4 (Traffic Features). At each timestamp t, each graph G is associated with a feature
matrix X_t ∈ R^{N×D}, where N represents the number of nodes and D denotes the number of features
(e.g., speed, number of pickups, number of drop-offs).
We can employ the constructed traffic graph and the corresponding traffic features to forecast
the traffic:

Definition 5 (Traffic Forecasting). Given the historical traffic signals of the last T timestamps
X_{t−T+1:t} = (X_{t−T+1}, X_{t−T+2}, ..., X_t) ∈ R^{T×N×D} on graph G, find the mapping function f such that
Y_{t+1:t+T} = f(X_{t−T+1:t}; G). Here, Y_{t+1:t+T} = (Y_{t+1}, Y_{t+2}, ..., Y_{t+T}) ∈ R^{T×N} represents the predicted
traffic states (e.g., speed, inflow, or outflow) for the next T timestamps. The function f learns the
spatio-temporal relationships in X_{t−T+1:t} using the structural information in G to generate accurate
predictions for Y_{t+1:t+T}.
Figure 4.1 provides a high-level illustration of traffic forecasting using graph-based spatio-temporal data. In this chapter, our traffic forecasting framework incorporates DP to train private
Figure 4.1: Traffic prediction using the spatio-temporal data of the last T timestamps {t−T+1, t−T+2, ..., t} on graph G to predict the traffic data of the next T timestamps {t+1, t+2, ..., t+T}.
GNN models. Additionally, we compare the performance of this framework with an alternative
approach where data is privatized using DP techniques before being used to train a GNN model. To
establish a clear foundation for our discussion, we define spatio-temporal neighboring datasets in
the context of DP for both scenarios.
DP GNN on Spatio-Temporal Data: In our framework, we train differentially private Graph
Neural Network (DP GNN) models using public data as input and private data as output. We assume
no privacy constraints at the node level for the transportation graph structure and focus solely on
feature-level privacy of the data. In the context of training DP models on spatio-temporal data, two
neighboring datasets are defined as differing in a single timestamp of the graph data. The formal
definition is as follows:
Definition 6 (Neighboring Spatio-Temporal Datasets). Two spatio-temporal datasets D =
(X_1, X_2, ..., X_T) ∈ R^{T×N} and D′ = (X′_1, X′_2, ..., X′_T) ∈ R^{T×N}, with T timestamps and N nodes, are
neighboring if they differ in at most one timestamp, i.e., ∃ i ∈ {1, 2, ..., T} s.t. X_j = X′_j, ∀ j ≠ i.
DP Inflow/Outflow Data: To train GNN models with DP traffic graph data, it is essential to
define neighboring datasets in a way that aligns with the structure of inflow/outflow data. This
requires understanding how inflow and outflow rates are generated from the number of pickups and
drop-offs in the taxi data.
The pickup and drop-off locations of users are recorded at discrete timestamps. At each
timestamp t, the company collects a dataset D_t ∈ {0,1}^{N_{u,t}×N_r}, where N_{u,t} is the number of users at
time t and N_r is the number of regions. Each user's data is represented as a one-hot vector, with
the i-th element set to 1 if the user is in region i at timestamp t, and 0 otherwise. Summing D_t
column-wise over all users produces X_t ∈ N^{N_r}, which reflects the inflow and outflow rates in each
region at time t. The dataset used for training spans multiple timestamps and is represented by
X_{t+1:t+T} for T consecutive timestamps.
In the context of differential privacy, two neighboring datasets X_t and X′_t differ by the inclusion
or exclusion of a single individual's data, affecting the count in at most one region at a single
timestamp by 1. Specifically, adding or removing a user's pickup or drop-off record changes the
inflow or outflow rate in one region at time t by 1. Thus, taxi data is a special case of histogram
queries (refer to Section 4.1 for more details). In histogram queries, data is organized into disjoint
cells (bins). In this context, the bins correspond to regions at specific timestamps, and each data
point represents an individual pickup or drop-off event. Consequently, modifying the dataset by
adding or removing a single record affects only one entry in X_t by 1. Formally, two neighboring
taxi datasets satisfy: ∃ i ∈ {1, 2, ..., T} s.t. ||X_i − X′_i||_1 ≤ 1 and X_j = X′_j, ∀ j ≠ i, making
them a specific case of Definition 6.
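The neighboring condition above can be checked mechanically. The sketch below is illustrative (written for small lists of per-timestamp count vectors rather than the thesis's matrices):

```python
def are_neighboring_taxi_datasets(D, Dp):
    """True iff D and Dp agree on every timestamp except at most one,
    where the counts differ by at most 1 in L1 norm (the histogram
    special case of Definition 6)."""
    if len(D) != len(Dp):
        return False
    diffs = [t for t in range(len(D)) if D[t] != Dp[t]]
    if len(diffs) > 1:
        return False
    if not diffs:
        return True
    t = diffs[0]
    return sum(abs(x - y) for x, y in zip(D[t], Dp[t])) <= 1
```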
4.2.2 Traffic Forecasting Using Graph Neural Networks (GNNs)
Graph Neural Networks (GNNs) have emerged as the leading method for traffic prediction [134].
Numerous GNN architectures have been developed to improve the accuracy of traffic forecasting.
In this chapter, we employ the Attention-Based Spatial-Temporal Graph Convolutional Network
(ASTGCN) [75] as our base model to evaluate the performance of our proposed framework.
ASTGCN integrates attention mechanisms with graph convolution to effectively capture both
spatial and temporal information in traffic data. The attention mechanism includes:
• Spatial attention: Models the spatial correlations between different regions or nodes.
• Temporal attention: Extracts temporal relationships across multiple timestamps.
The convolution mechanism in ASTGCN consists of:
• Spatial convolution: Captures spatial signals from neighboring regions or nodes.
• Temporal convolution: Identifies dependencies among neighboring timestamps.
4.2.3 Methodology
In this chapter, we propose a framework that enables companies to enhance their machine learning
models through collaboration without sharing raw data or compromising privacy. In this framework,
each company independently uses its private data to train a DP model, which is then shared with
other companies. Upon receiving the DP model, the collaborating companies utilize its predictions
as input to their core models to improve their model performance while maintaining data privacy.
For the shared models to be effective, recipient companies require access to the training input
data or input data whose distribution matches that of the training data. While other companies
may collect similar input data (e.g., both Uber and Lyft gather the number of pickups and drop-offs),
there is no guarantee that their collected data will follow the same distribution and can be used
as the input during the inference phase. One approach is for companies to share a DP version of
the training data along with the DP model. However, our methodology explores an alternative solution:
utilizing public input data for both the training and inference phases.
Public data can be accessed through various sources. For example, speed data is public and can
be obtained from government sources (collected via sensors or cameras) or real-time APIs (e.g.,
Figure 4.2: Predicting the inflow or outflow X^taxi_{t+1:t+T} on graph Gr using speed data X^speed_{t−T+1:t} on graph Gh.
TomTom [135] and Mapbox [136]). Moreover, this data should be correlated with the private data
so the model can use the patterns in the public data to predict the private data. The correlation
of speed data with the taxi data is analyzed in Subsection 4.3.1 and illustrated in Figures 4.7
and 4.8.
By utilizing public input data, companies can train models to predict the number of pickups
and drop-offs. Since GNNs are state-of-the-art (SOTA) models for traffic prediction due to their
ability to capture complex spatio-temporal relationships, we leverage GNNs for our predictions. A
key challenge in our scenario, however, is that the input graph (the highway graph Gh) and the output
graph (the taxi graph Gr) are different, as visualized in Figure 4.2. Traditional GNN-based traffic
prediction models are typically designed to use the same graph for both input and output. To address
this discrepancy, our model f_s(X^speed_{t−T+1:t}; Gh) incorporates a linear layer after the GNN
component to map predictions from the highway graph Gh to the taxi graph Gr. This mapping
process is illustrated in Figure 4.3.
After sharing the DP version of f_s(X^speed_{t−T+1:t}; Gh) with other companies, it can be utilized in three
Figure 4.3: DP prediction model f_s(X^speed_{t−T+1:t}; Gh) trained using public data to predict private data,
where the public and private data have different graphs.
different ways:
I Directly use the prediction by the model for decision-making.
II Fine-tune the model based on the available data.
III Use the predictions from the model as additional features in the core model of the company.
In this chapter, we focus on the third scenario. Figure 4.4 provides a high-level illustration of our
framework in the context of the collaboration of two companies.
Figure 4.4: Collaboration framework in the context of company 1 helping company 2 by sharing the
DP model f_s^{Company 1}(X^speed_{t−T+1:t}; Gh).
4.3 Experiments
4.3.1 Dataset Description
To evaluate the effectiveness of our developed framework, we utilize traffic datasets from the Queens
Borough of New York City. New York City traffic is known for its high congestion levels [137],
making accurate traffic prediction particularly challenging due to significant daily variations in
congestion. By focusing on Queens, a highly congested area, we aim to assess our framework under
realistic and demanding conditions, ensuring its applicability to real-world scenarios.
To collect the traffic data for Queens, we rely on two primary sources: taxi trip data provided by
the New York City Taxi and Limousine Commission (TLC) and real-time speed data collected by
the City of New York Department of Transportation.
4.3.1.1 Taxi Data
The New York City Taxi and Limousine Commission (TLC) collects comprehensive trip records,
including detailed information about every taxi trip in the city (see Figure 4.5) [138]. In our
experiments, we focus on trip records from companies dispatching over 10,000 trips daily [139],
which include major providers such as Uber, Lyft, Via, and Juno. Given that Uber and Lyft account
for approximately 95% of all recorded trips, we exclude Juno and Via from our analysis and focus
solely on the trip records of Uber and Lyft. These records capture spatial (e.g., pickup and drop-off
zones, trip distance), temporal (e.g., pickup and drop-off times, trip duration), financial (e.g., fares
and tips), and company-specific (e.g., service provider codes) details for each trip.
The spatial information is based on pickup and drop-off zones, with Queens divided into 68
zones as defined by TLC (see Figure 4.5). To derive inflow and outflow rates, we aggregate
individual trip records at the zone level over specified time intervals. The inflow rate represents
the number of drop-offs within a zone during a time window, while the outflow rate represents the
number of pickups. These rates are computed by summing the counts of pickups and drop-offs in
each zone for a given time interval, such as 10 minutes. This aggregation leverages both the spatial
features (pickup/drop-off zones) and temporal features (pickup/drop-off times) of the data.
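The zone-level aggregation described above can be sketched as follows. The field names are illustrative rather than the actual TLC schema, and times are given in minutes since midnight.

```python
from collections import defaultdict

def inflow_outflow(trips, interval_minutes=10):
    """Aggregate trip records into (window, zone) -> count maps:
    outflow counts pickups and inflow counts drop-offs per time window."""
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for trip in trips:
        pu_win = trip["pickup_minute"] // interval_minutes
        do_win = trip["dropoff_minute"] // interval_minutes
        outflow[(pu_win, trip["pickup_zone"])] += 1
        inflow[(do_win, trip["dropoff_zone"])] += 1
    return inflow, outflow
```

Each trip contributes once to the outflow of its pickup zone and once to the inflow of its drop-off zone, in possibly different time windows.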
4.3.1.2 Real-Time Speed Data
The City of New York Department of Transportation provides real-time traffic data on major arterials
and highways [140]. The data is collected by speed detectors and is updated several times per
minute. Each data point contains two features, the average speed and the travel time of a link, during a
recorded time interval (see metadata [141]). The borough of Queens includes 32 links that serve as
major highways (see Figure 4.6). Each link consists of multiple sensors, and the recorded speed
is the average speed measured by these sensors. In our evaluations, we use only the speed feature
Figure 4.5: Queens taxi zones based on NYC TLC partitioning.
because speed and travel time essentially convey the same information.
Speed data is strongly correlated with fluctuations in the number of pickups and drop-offs in taxi
data, making it a suitable choice for our framework. Intuitively, an increase in pickup and drop-off
activity reflects higher commuting demand, leading to traffic congestion and reduced average speeds.
Conversely, lower pickup and drop-off rates indicate decreased traffic levels, with average speeds
approaching free-flow conditions. This relationship is evident in Figure 4.7, where higher inflow
and outflow rates during morning and afternoon peak hours correspond to a sharp decline in average
speed. Additionally, Figure 4.8 shows that earlier days of the week exhibit lower inflow and outflow
rates, accompanied by higher average speeds, further reinforcing the connection between speed and
taxi activity data.
4.3.2 Construction of the Transportation Network Graph
The adjacency graph is a crucial input to our GNN models. In our experiments, we construct two
separate graphs: one for the taxi data and another for the real-time speed data, each tailored to the
Figure 4.6: Queens link map of real-time data. Markers represent sensors. Each link consists of
multiple sensors. Sensors of a link are depicted with the same marker color.
Figure 4.7: Average inflow/outflow rate of the complete taxi data vs. the average speed data at
different times of the day.
Figure 4.8: Average inflow/outflow rate of the complete taxi data vs. the average speed data on
different days of the week.
unique characteristics of the respective datasets.
4.3.2.1 Taxi Data
We construct the region graph Gr for the taxi data by connecting zones that share a common border
to form the adjacency matrix. The zones are generated by NYC TLC and shared via a SHP file. We
processed this file using the GeoPandas package to extract spatial relationships between zones.
4.3.2.2 Real-Time Speed Data
For the highway graph Gh, representing the speed data, edges are manually defined based on
highway adjacency and driving direction. An edge is created between two highways (nodes) if they
are physically adjacent and share the same driving direction, allowing a driver to travel directly
between them without traversing other highways. This approach ensures that the graph accurately
reflects the physical connectivity and directional flow of traffic on major roads.
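As a minimal sketch, the manually defined edge list can be turned into the binary adjacency matrix consumed by the GNN as follows; the four-node layout is invented for illustration, and adjacency is treated as symmetric:

```python
def build_adjacency(num_nodes, edges):
    """Build a symmetric binary adjacency matrix from an undirected edge list."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

# Hypothetical mini-network: links 0-1-2 form a corridor and link 3 merges into link 1.
edges = [(0, 1), (1, 2), (1, 3)]
A = build_adjacency(4, edges)
print(A[1])  # [1, 0, 1, 1]: link 1 is adjacent to links 0, 2, and 3
```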
4.3.3 Data Preprocessing
We use the traffic data ranging from February 1st, 2019, to March 31st, 2019. We use two months of
data similar to the widely used real-world datasets that are used in previous studies [74, 75, 142]. We
selected these two months because they are part of the school calendar, which is expected to exhibit
consistent traffic patterns. We use only business days (Monday-Friday) to avoid the dissimilarity
between weekend and weekday traffic patterns. We use the whole day (00:00-23:59). We aggregate
the data into 10-minute bucket intervals, which results in 144 data points for each node per day.
We use the 60%/20%/20% split for train/validation/test such that the first 60% of timestamps are
training data, the next 20% of timestamps are validation data, and the rest is test data. In time
series forecasting problems, we use the later timestamps for validation and the latest timestamps for
testing to avoid the look-ahead bias. Following the recent traffic prediction studies [75, 76], we use
Z-score to zero-mean normalize the data across the features.
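The chronological split and normalization can be sketched as follows (a single-feature toy version; we assume, as is common practice, that the Z-score statistics are estimated on the training portion only):

```python
def chronological_split(series, train=0.6, val=0.2):
    """60/20/20 split preserving temporal order to avoid look-ahead bias."""
    n = len(series)
    n_tr, n_va = int(n * train), int(n * val)
    return series[:n_tr], series[n_tr:n_tr + n_va], series[n_tr + n_va:]

def zscore(train, x):
    """Normalize x with mean/std estimated from the training portion only."""
    mu = sum(train) / len(train)
    sd = (sum((v - mu) ** 2 for v in train) / len(train)) ** 0.5
    return [(v - mu) / sd for v in x]

series = list(range(10))              # 10 timestamps of a single node feature
tr, va, te = chronological_split(series)
tr_norm = zscore(tr, tr)              # training data is zero-mean after normalization
print(len(tr), len(va), len(te))      # 6 2 2
```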
We remove 7 days (out of 41 days) from both taxi and speed data due to the large percentage
of missing values in speed data (value 0 is considered missing) based on the threshold computed
using the interquartile range (IQR) method. We remove 4 nodes from the highway graph of the
speed data due to their large percentage of missing values (≥ 43%) compared to at most 5% missing
values in other nodes. Then, we connect the neighbors of each removed node to each other. The
final highway graph is depicted in Figure 4.11. We also removed 9 zones from the region graph
(highlighted in Figure 4.9) due to their high proportion of zero values. These zones had an average
of 91% and 99% zero values in Uber and Lyft data, respectively. Including such zones increases
computational and memory costs without meaningful contributions to model training due to their
lack of positive signals and sparsity. Figure 4.10 illustrates the final region graph of taxi datasets.
Two disconnected nodes in this figure are the Far Rockaway and Hammels/Arverne zones in the
southern part of Queens that lack overlapping borders with other zones.
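A sketch of the IQR-based screening used to flag days for removal; the per-day missing-value percentages below are hypothetical, and the fence constant k = 1.5 is the conventional Tukey choice rather than necessarily the one used here:

```python
def iqr_upper_fence(values, k=1.5):
    """Tukey upper fence Q3 + k*IQR, with linear-interpolation quantiles."""
    s = sorted(values)
    def quantile(q):
        pos = q * (len(s) - 1)
        lo, frac = int(pos), pos - int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + frac * (s[hi] - s[lo])
    q1, q3 = quantile(0.25), quantile(0.75)
    return q3 + k * (q3 - q1)

# Hypothetical per-day missing-value percentages in the speed data.
missing_pct = [2, 3, 3, 4, 4, 5, 5, 6, 40, 55, 60]
fence = iqr_upper_fence(missing_pct)
flagged = [d for d, p in enumerate(missing_pct) if p > fence]
print(fence, flagged)  # 52.25 [9, 10]
```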
Figure 4.9: Highlighted zones are removed from the region graph of the taxi datasets due to the
sparsity of signals in them.
Figure 4.10: Region graph of taxi data of Queens borough.
Data              Taxi                      Speed
# of Nodes        59                        27
# of Edges        138                       31
Time Intervals    10-minute
Time Span         02/01/2019 - 03/31/2019
Days              Business Days, i.e., Monday - Friday
# of Days         34
Daily Range       00:00 - 23:59

Table 4.1: Dataset details.
Figure 4.11: Highway graph of speed data of Queens borough.
4.3.4 Experiment Setting
4.3.4.1 Dataset Settings
To evaluate the performance of our framework under different collaboration scenarios, we consider
two types of settings: a heterogeneous setting using real-world data from Uber and Lyft and a
homogeneous setting where simulated companies are created by partitioning real-world taxi trip
records according to varying market shares. This dual approach allows us to assess how differences
in data distributions and company sizes impact the outcomes of collaboration.
Heterogeneous Setting: In our experiments, we use Uber and Lyft data collected from the same
region and time window; however, these datasets do not share the same distribution. This heterogeneity arises from differences in their user bases and behaviors. Figure 4.12 illustrates the average
inflow rate of these two companies. Due to the similarity of average inflow and average outflow, we
do not plot the average outflow. One difference between the two companies visible in Figure 4.12
is that Uber users show a lower density of rides in February and an increased rate in March,
whereas Lyft users show almost the opposite pattern. These underlying differences in user behavior
Figure 4.12: Average inflow rate of Uber and Lyft data on different dates.
and trip records contribute to the heterogeneity between the datasets.
Homogeneous Setting: We also conduct experiments in scenarios where two ride-hailing companies
with different proportions of ride requests collaborate. To generate this homogeneous data, we
randomly assign pickups and drop-offs to each company using a Bernoulli assignment. This method
provides a simple yet effective way to distribute rides randomly while controlling the proportion
assigned to each company. The assigned data points are then aggregated to create the inflow and
outflow data for each company. For example, we consider two companies where one has 80% of
the rides (the larger company) and the other has 20% (the smaller company). We randomly assign
rides accordingly and aggregate them over time intervals, performing the same process for both
companies. Since the rides are randomly assigned, the final inflow and outflow datasets for both
companies follow the same underlying distribution, differing only in dataset size. For this setting,
we run experiments for scenarios where the companies have ride ratios of 90%-10%, 80%-20%,
70%-30%, 60%-40%, and 50%-50%.
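The Bernoulli assignment can be sketched as follows; the trip list is a stand-in for individual trip records, and the 80%/20% ratio matches one of the scenarios above:

```python
import random

def bernoulli_split(trips, p_large=0.8, seed=0):
    """Randomly assign each trip to the larger company with probability p_large."""
    rng = random.Random(seed)
    large, small = [], []
    for trip in trips:
        (large if rng.random() < p_large else small).append(trip)
    return large, small

trips = list(range(10000))        # stand-ins for individual trip records
large, small = bernoulli_split(trips, p_large=0.8)
print(len(large) / len(trips))    # close to 0.8
```

Because each trip is assigned independently, both resulting datasets follow the same underlying distribution and differ only in size, which is exactly the property the homogeneous setting requires.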
By examining both settings, we aim to assess the robustness and adaptability of our framework
under varying real-world conditions. This comparison allows us to understand how differences in
data distributions and company sizes affect the effectiveness of collaborative frameworks in traffic
prediction.
4.3.4.2 Hyperparameters
We use the AdamW optimizer for training models [143]. We sweep over 5 initial learning rates,
{0.1, 0.01, ..., 1e-5}. Similar to previous studies [75, 76], we use the data of the last 1 hour to
predict the next 1 hour. Following the default parameters provided by the ASTGCN paper [75], all
the GNN models in our experiments use the following setting: the number of terms of the
Chebyshev polynomial K is set to 3, the number of convolution kernels of the graph convolution layers
to 64, the number of convolution kernels of the temporal convolution layers to 64, and the number of
spatiotemporal blocks to 2. The ASTGCN paper considers three independent components to capture
the last hour, daily-periodic, and weekly-periodic pattern. However, we only focus on the first
component because the number of days and weeks in our data is small, and removing the two other
components reduces the computation cost of the experiments. In our implementation, we use the
ASTGCN model developed by [144] and adapt the original implementation to work with the DP
optimizer. We minimize the Mean Squared Error (MSE) as our loss function during training and
report the Root Mean Squared Error (RMSE) for evaluation. The batch size for training non-DP
models is 256, and for other models, 1024. Non-DP models are trained for 10,000 epochs, with the
validation error checked every 100 epochs, early stopping applied if the validation MSE increases
for 5 consecutive checks, and the optimal model with the lowest validation MSE recorded.
For training DP models, we follow a different training setting because the privacy loss is affected
by the training setting. We use a batch size of 256 for training DP models because it is the largest
batch size that does not result in early divergence while still utilizing GPU power. The reason for the
divergence with larger batch sizes in DP-SGD is that a larger batch size is equivalent to larger noise
injected into the gradients. We train DP models for 100 epochs with validation steps of 5 without early
stopping. We train the DP models with the DP-SGD algorithm by using the Opacus package [145].
The optimizer provided by Opacus only accepts input data. However, the network graph is the
second input to our GNN model. As our network graphs are static and do not change with input
data or timestamp, we pass their adjacency matrix to our GNN models when initializing the GNN
network. We fix the privacy budget (the ε value) for each DP training with δ = 1e−5. We use a
fixed value of 0.1 for the clipping threshold.
4.3.4.3 Baselines
We compare the performance of our collaboration framework model with three baselines:
• Historical Average (HA): This baseline predicts the inflow/outflow rate using the average
rate from the last hour (6 consecutive timestamps). It serves as a simple yet commonly used
benchmark for time series forecasting.
• Model Without Collaboration (M1): This model is trained independently by a company
without any collaboration. Figure 4.13 illustrates an example in which Uber trains this model,
denoted as f_Uber^{No Coop.}(X_{t−T+1:t}^{speed}, X_{t−T+1:t}^{Uber}; G_h, G_r). The model
consists of three components: 1. A pretrained GNN model, f_s(X_{t−T+1:t}^{speed}; G_h), which is
trained to predict public data (e.g., speed) and remains frozen during the training of this baseline
(marked with a snowflake symbol). 2. A linear layer through which the prediction from the pretrained
model is passed to align the sizes of the graphs of the public and private data (e.g., G_h and G_r).
3. A second GNN model that takes the outputs of the linear layer along with the private data (e.g.,
Uber's inflow and outflow rates) as inputs to produce the final prediction.
Figure 4.13: The model that a company trains solely without any collaboration. This figure is
illustrated using Uber as an example.
• Model With Collaboration via Sharing DP Data (M2): This baseline explores collaboration
by sharing a DP version of the data. Figure 4.14 depicts an example where Uber trains this
model, denoted as f_Uber^{Coop. w. DP Data}(X_{t−T+1:t}^{speed}, X_{t−T+1:t}^{Uber}; G_h, G_r),
and Lyft shares its DP inflow data X̃_{t−T+1:t}^{inflow, Lyft} and DP outflow data
X̃_{t−T+1:t}^{outflow, Lyft}. We apply the Gaussian mechanism with a
sensitivity of 1 to make the data private, leveraging the alignment of the problem with the
histogram query structure (see Section 4.1 for details on histogram queries). The data is
privatized hourly, which reduces the noise added but also weakens the privacy protection
compared to our framework. This trade-off leads to better utility for this baseline at the cost
of weaker privacy guarantees.
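For intuition, the sensitivity-1 Gaussian mechanism on hourly counts can be sketched as below. The calibration σ = √(2 ln(1.25/δ))/ε is the classical choice (strictly valid for ε ≤ 1) and is an assumption here; the exact calibration used in the experiments may differ:

```python
import math
import random

def gaussian_mechanism(counts, eps, delta, sensitivity=1.0, seed=0):
    """Add Gaussian noise calibrated for (eps, delta)-DP to a histogram query.

    Classical calibration: sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps,
    valid for eps <= 1; tighter analytic calibrations exist for larger eps.
    """
    rng = random.Random(seed)
    sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / eps
    return [c + rng.gauss(0.0, sigma) for c in counts], sigma

hourly_inflow = [12, 7, 0, 3]          # hypothetical per-zone counts for one hour
noisy, sigma = gaussian_mechanism(hourly_inflow, eps=1.0, delta=1e-5)
print(round(sigma, 2))                  # 4.84, the noise scale for eps=1, delta=1e-5
```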
To ensure a fair comparison, we use an identical pretrained speed prediction model
f_s(X_{t−T+1:t}^{speed}; G_h) across all settings. The only variation in the pretraining process of this
pretrained model is the validation step, which is set to a more granular interval of 50. While this
adjustment is computationally negligible, since pretraining is performed only once, it benefits all
models equally by improving the accuracy of the pretrained model. The performance of these models
is analyzed using the Root Mean Squared Error (RMSE) metric, which measures the standard
deviation of prediction errors.
Figure 4.14: A model trained by a company through collaboration with other companies by receiving
DP datasets. This figure illustrates Uber as the company training the model and Lyft as the company
sharing the DP datasets X̃_{t−T+1:t}^{inflow, Lyft} and X̃_{t−T+1:t}^{outflow, Lyft}.
4.3.4.4 Framework
We assume that Uber and Lyft use recent speed data, inflow data, and outflow data to predict inflow
or outflow. In addition to the input data, they use both the DP inflow prediction model and the DP
outflow prediction model in the framework (see Figure 4.15). Hence, our experiments report the
sum of the ε values of the DP inflow and outflow models. Figure 4.15 depicts the example where Lyft
has shared the DP inflow model f_{s,Lyft}^1(X_{t−T+1:t}^{speed}; G_h) and the DP outflow model
f_{s,Lyft}^2(X_{t−T+1:t}^{speed}; G_h) with Uber. The parameters of these two models are frozen
during the training of the framework model (marked with a snowflake symbol), and no
backpropagation happens for them. In other words, their output is utilized as new features during
the training. Note that the ASTGCN model used as our GNN accepts multiple input features.
Figure 4.15: A model trained by a company through collaboration with other companies by receiving
DP models. This figure illustrates Uber as the company training the model and Lyft as the company
sharing the DP models.
4.3.5 Numerical Results
The results from both baselines, M1 and M2, as well as our framework, consistently outperform
the Historical Average baseline across all settings presented in Table 4.2, Table 4.3, Table 4.4,
and Table 4.5. The inclusion of the HA baseline serves to establish that the investigated training
scenarios are reasonably effective and suitable for evaluation.
In all settings, the underlying GNN model is ASTGCN. We do not perform a comparison analysis
with other SOTA traffic prediction architectures, as the modular design of our framework and
baselines allows any deep learning architecture to replace ASTGCN seamlessly without impacting
the generality of the findings.
The test results for the heterogeneous setting are detailed in Table 4.2, Table 4.3, Table 4.4, and
Table 4.5. Figure 4.16, Figure 4.17, Figure 4.18, and Figure 4.19 depict the Confidence Intervals (CIs)
for the framework compared to the no-collaboration baseline M1, generated using Non-overlapping
Block Bootstrap (NBB) with 100 bootstrapped samples. According to [146], the recommended
number of bootstrap samples ranges between 50 and 200. Details on the NBB method are provided
in Appendix 6.3.1. As shown in the tables and figures, our framework consistently outperforms the
no-collaboration baseline M1, achieving up to a 6% improvement in inflow and outflow predictions.
Notably, the CIs from our framework do not overlap with those of the baseline M1, except for a
slight overlap at the extreme case of the low privacy level (ε = 10) for Lyft in outflow prediction.
This demonstrates the statistical significance of the observed performance improvements. An
intriguing finding is that performance improves as the privacy level increases, but beyond a certain
point, this improvement plateaus and then declines. This behavior, as explored in [147, 148],
occurs because differential privacy mechanisms can enhance generalizability by introducing noise.
However, at higher levels of privacy (i.e., greater noise injection), the randomness begins to outweigh
the benefits, leading to a drop in model performance. These results highlight the inherent trade-off
between privacy protection and generalizability.
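The NBB procedure can be sketched as follows: the test-error series is cut into non-overlapping blocks, blocks are resampled with replacement, and percentiles of the resampled RMSEs give the confidence interval (the error series and the one-hour block length below are illustrative):

```python
import random

def nbb_ci(errors, block_len, n_boot=100, alpha=0.05, seed=0):
    """Non-overlapping Block Bootstrap CI for the RMSE of a prediction-error series."""
    rng = random.Random(seed)
    # Cut the series into consecutive, non-overlapping blocks.
    blocks = [errors[i:i + block_len]
              for i in range(0, len(errors) - block_len + 1, block_len)]
    rmses = []
    for _ in range(n_boot):
        # Resample whole blocks with replacement to preserve short-range dependence.
        sample = [e for _ in range(len(blocks)) for e in rng.choice(blocks)]
        rmses.append((sum(x * x for x in sample) / len(sample)) ** 0.5)
    rmses.sort()
    return rmses[int(n_boot * alpha / 2)], rmses[int(n_boot * (1 - alpha / 2)) - 1]

# Toy test-error series: 144 ten-minute steps (one day), block length 6 (one hour).
errors = [(-1) ** i * (1 + 0.1 * (i % 5)) for i in range(144)]
lo, hi = nbb_ci(errors, block_len=6)
print(round(lo, 3), round(hi, 3))
```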
We further evaluate the performance of the proposed framework in the heterogeneous setting
Method                      Inflow prediction
                            Uber     Lyft
Historical Average          4.19     1.78
No Collaboration            3.46     1.59
Share DP Data   ε = 10      3.48     1.58
                ε = 5       3.49     1.59
                ε = 1       3.46     1.59
Framework       ε = 10      3.29     1.58
                ε = 5       3.30     1.58
                ε = 2       3.31     1.56
                ε = 1       3.31     1.57

Table 4.2: Performance analysis of the framework on Uber and Lyft inflow data with RMSE metric.
by comparing its results with the baseline model M2. As shown in Table 4.2, Table 4.3, Table 4.4,
and Table 4.5, the framework consistently outperforms the M2 models across all scenarios. It is
important to note that while similar epsilon values are reported for both the framework and the M2
baseline, the privacy guarantees in M2 are significantly weaker. This is because the definition of
neighboring datasets in M2 is restricted to a 1-hour time window and is based on the histogram query
nature of the data, whereas the framework applies privacy protection across the entire dataset and
utilizes DP-SGD, which does not require imposing a sensitivity of 1 on the data.
Table 4.4 and Table 4.5 present the results for the homogeneous setting. Similar to the heterogeneous setting, our proposed framework consistently outperforms all baselines in nearly all
scenarios. In collaboration 1, the smaller company with a 10% market share shows a slightly greater
improvement in the M2 baseline results. However, this improvement is marginal and stems from the
significantly weaker privacy guarantees in M2. Moreover, in some scenarios under the M2 baseline,
the performance of larger companies is adversely affected by sharing DP data. This deterioration
occurs because the inflow and outflow rates of smaller companies, which are inherently low, are
subjected to the same level of noise as larger companies’ data. As a result, the patterns in smaller
companies’ data become heavily distorted. This uniform noise level across scenarios arises from
the fixed sensitivity of 1 applied to all cases.
Figure 4.16: Confidence intervals of test errors for Uber models in inflow prediction, derived from
100 bootstrapped samples using the NBB method.
Figure 4.17: Confidence intervals of test errors for Uber models in outflow prediction, derived from
100 bootstrapped samples using the NBB method.
Figure 4.18: Confidence intervals of test errors for Lyft models in inflow prediction, derived from
100 bootstrapped samples using the NBB method.
Figure 4.19: Confidence intervals of test errors for Lyft models in outflow prediction, derived from
100 bootstrapped samples using the NBB method.
Method                      Outflow prediction
                            Uber     Lyft
Historical Average          3.85     1.68
No Collaboration            3.28     1.55
Share DP Data   ε = 10      3.27     1.54
                ε = 5       3.36     1.54
                ε = 1       3.33     1.54
Framework       ε = 10      3.19     1.52
                ε = 5       3.18     1.51
                ε = 2       3.18     1.51
                ε = 1       3.20     1.52

Table 4.3: Performance analysis of the framework on Uber and Lyft outflow data with RMSE metric.
4.4 Conclusion
This chapter introduces a novel collaborative framework for traffic prediction that employs differential privacy techniques to enable secure and privacy-preserving collaboration between companies.
By evaluating the framework in heterogeneous settings with real-world data from Uber and Lyft,
as well as in homogeneous settings with simulated companies, we demonstrate its effectiveness
and adaptability across diverse scenarios. The framework consistently outperforms baseline models, including those with weaker privacy guarantees, achieving improvements of up to 6% over
no-collaboration baselines. This performance highlights the framework’s ability to balance the
trade-off between privacy and utility, a critical aspect of privacy-preserving systems. A key strength
of the proposed framework is its modular design, which ensures compatibility with various deep
learning architectures, making it highly flexible for future adaptations and extensions. By integrating
advanced privacy mechanisms with collaborative modeling, this work offers a scalable and secure
approach to traffic prediction challenges, addressing both competitive concerns and privacy risks
in data sharing. Looking ahead, future research could explore the incorporation of additional data
sources, such as weather or event information, to further enhance prediction accuracy. Additionally,
optimizing the framework for large-scale deployments and investigating its applicability in other
                     Collaboration 1   Collaboration 2   Collaboration 3   Collaboration 4   Collaboration 5
                     10%      90%      20%      80%      30%      70%      40%      60%      50%      50%
Historical Average   1.13     4.67     1.69     4.27     2.17     3.86     2.61     3.47     3.05     3.03
No Collaboration     1.04     3.75     1.52     3.47     1.92     3.22     2.27     2.93     2.62     2.58
Share Data (ε = 20)  1.02     3.79     1.49     3.48     1.89     3.20     2.23     2.91     2.58     2.55
Share Data (ε = 10)  1.03     3.78     1.50     3.49     1.90     3.21     2.24     2.95     2.59     2.57
Share Data (ε = 5)   1.03     3.75     1.51     3.49     1.91     3.21     2.25     2.93     2.60     2.57
Share Data (ε = 1)   1.03     3.75     1.52     3.51     1.91     3.24     2.26     2.93     2.62     2.57
Framework (ε = 10)   1.03     3.55     1.51     3.34     1.89     3.16     2.21     2.79     2.55     2.51
Framework (ε = 5)    1.03     3.57     1.51     3.31     1.87     3.09     2.16     2.83     2.56     2.48
Framework (ε = 2)    1.03     3.58     1.51     3.32     1.87     3.08     2.20     2.82     2.54     2.51
Framework (ε = 1)    1.03     3.61     1.51     3.33     1.88     3.10     2.20     2.83     2.53     2.51

Table 4.4: Performance analysis of the framework in inflow prediction on different homogeneous collaboration settings.
                     Collaboration 1   Collaboration 2   Collaboration 3   Collaboration 4   Collaboration 5
                     10%      90%      20%      80%      30%      70%      40%      60%      50%      50%
Historical Average   1.11     4.24     1.62     3.90     2.07     3.55     2.46     3.20     2.84     2.83
No Collaboration     1.03     3.56     1.48     3.30     1.86     3.05     2.20     2.80     2.50     2.49
Share Data (ε = 20)  1.01     3.66     1.46     3.29     1.84     3.02     2.16     2.76     2.47     2.46
Share Data (ε = 10)  1.02     3.55     1.46     3.30     1.85     3.03     2.17     2.77     2.48     2.47
Share Data (ε = 5)   1.02     3.56     1.47     3.30     1.85     3.04     2.18     2.78     2.49     2.48
Share Data (ε = 1)   1.02     3.56     1.48     3.36     1.86     3.04     2.19     2.78     2.50     2.48
Framework (ε = 10)   1.02     3.43     1.47     3.20     1.84     3.03     2.15     2.72     2.45     2.44
Framework (ε = 5)    1.02     3.43     1.47     3.21     1.84     2.95     2.14     2.73     2.46     2.43
Framework (ε = 2)    1.02     3.44     1.47     3.19     1.84     2.96     2.15     2.70     2.46     2.44
Framework (ε = 1)    1.02     3.41     1.47     3.17     1.85     2.96     2.17     2.69     2.45     2.44

Table 4.5: Performance analysis of the framework in outflow prediction on different homogeneous collaboration settings.
domains, such as healthcare or finance, could extend its impact. This chapter provides a foundation
for advancing privacy-preserving collaborative frameworks, paving the way for broader applications
in intelligent transportation systems and beyond.
Chapter 5
Conclusion
In this dissertation, we investigated cooperative congestion reduction frameworks. Our frameworks offer incentives to individual users and new mobility services to change their routing decisions.
Finally, we evaluated the performance of our models and algorithms using Archived Data Management System (ADMS) data [122].
In Chapter 2, we developed mathematical models and proposed algorithms for offering personalized incentives to drivers to reduce congestion in the network. In this framework, drivers share
their origin-destination and routing information with a central planner. Based on this information,
the central planner then offers incentives to drivers to incentivize/enforce a socially optimal routing
strategy. The incentives are only offered to alter the routing decision of the drivers and are offered
based on solving large-scale optimization problems in our framework. The framework brings
together prior works to model the behavior of drivers in response to the offered incentives as well
as the resulting congestion reduction in the network where no traffic control is required. We paid
special attention to minimizing the total travel time of the network. In addition, we showed that this
problem can be solved in a distributed fashion where some of the computations are performed on
individual drivers' smart devices. Our experiments showed that the proposed framework can lead
to up to a 5% decrease in the total travel time of the system during rush hours. All the code for
this chapter is publicly available at [100].
In Chapter 3, we studied the problem of incentivizing organizations to reduce traffic congestion.
In this chapter, we developed a mathematical model and provided an algorithm for offering
organization-level incentives. In our framework, a central planner collects the origin-destination
and routing information of the organizations. Then, the central planner utilizes this information
to offer incentive packages to organizations to incentivize a system-level optimal routing strategy.
The central planner only offers the incentive to alter the routing decision of the organization drivers.
A 6.90% reduction in the total travel time of the network was achieved by our framework in the
experiments. More importantly, we observed that incentivizing companies/organizations is more
cost-efficient than incentivizing individual drivers. All the code for this chapter can be found
in [101].
Finally, in Chapter 4, we discussed our private collaboration framework in which companies
share private traffic prediction models with each other to improve their performance. Moreover, this
model can be utilized by the central incentive offering platform to predict traffic conditions without
requiring access to companies’ user data. Prediction of traffic can enable the central platform to
distinguish between roads with probable congestion and highways with free flow in the near future.
Whether a reward should be assigned to a road and how much of a reward should be considered is
based on the remaining capacity in the road segments. This information comes from a prediction
algorithm without risking the privacy of participants.
Chapter 6
Appendices
6.1 Appendix - Chapter 2
6.1.1 List of Notations
The following symbols are used in Chapter 2:
• G: Directed graph of the traffic network
• V: Set of nodes of graph G, which correspond to major intersections and ramps
• E: Set of edges of graph G, which correspond to the set of road segments
• |E|: Total number of road segments/edges in the network G (i.e., the cardinality of the set E)
• r: Route vector
• T: Time horizon
• |T|: Number of time units (i.e., the cardinality of T)
• v_0: Capacity vector of road segments
• v_t: Volume vector of road segments at time t
• N: Set of drivers
• |N|: Number of drivers (i.e., the cardinality of the set N)
• R_n: Set of possible route options for driver n
• R: Total set of possible route options for all drivers
• |R|: Number of possible route options (i.e., the cardinality of the set R)
• I_n: Set of possible incentives to offer to driver n
• I: Total set of possible incentives to all drivers
• |I|: Number of possible incentives (i.e., the cardinality of the set I)
• s_i^{r,n}: Decision parameter indicating whether incentive i is offered to driver n for route r
• p_i^{r,n}: The probability of acceptance of route r by driver n given incentive i
• T̂_r: The estimate of the travel time for route r provided by the incentive offering platform
• T_r: The exact travel time for route r
• β_{r,t}: The vector of the location of a driver that is traveling route r at time t
• η_i: The cost of incentive i
• F_tt(·): Total travel time function
• δ_{ℓ,t}: Travel time of link ℓ at time t
• v̂: The vector of the volume of links at different times in the horizon
• v̂_{ℓ,t}: The (|E|·t + ℓ)-th element of vector v̂, representing the volume of the ℓ-th link at time t
• t_0: The free-flow travel time of the link
• v: The traffic volume of the link
• w: The practical capacity of the link
• s_n: The binary decision vector for one driver, in which only one element has the value of one, corresponding to the route and the incentive amount that we offer
• f_BPR(·): BPR function
• S: Decision matrix
• R: The matrix of the location of a driver
• P: Route choice probability matrix
• D: The matrix of incentive assignment to OD pairs
• q: The vector of the number of drivers for each OD pair
• c: The vector of the cost of incentives assigned to each route
• Ω: Budget
• ω: The vector of free-flow travel times of links
• a_{ℓ,t}: The row of matrix A = RP which corresponds to link ℓ at time t
• K: The number of OD pairs
• e: An edge of graph G, which corresponds to a road segment in the traffic network
6.1.2 Details of Alternating Direction Method of Multipliers (ADMM)
Before explaining the steps of our proposed algorithm, let us first explain the Alternating Direction
Method of Multipliers (ADMM), which is the main building block of our framework.
6.1.2.1 Review of ADMM
ADMM, developed in [110] and [111], aims at solving linearly constrained optimization problems of
the form

    min_{w,z}  h(w) + g(z)   s.t.   Aw + Bz = c,

where w ∈ R^{d1}, z ∈ R^{d2}, c ∈ R^k, A ∈ R^{k×d1}, and B ∈ R^{k×d2}. By forming the augmented
Lagrangian function

    L(w, z, λ) ≜ h(w) + g(z) + ⟨λ, Aw + Bz − c⟩ + (ρ/2) ‖Aw + Bz − c‖₂²,

each iteration of ADMM applies alternating minimization to the primal variables and gradient
ascent to the dual variables:

    Primal Update:  w^{r+1} = argmin_w L(w, z^r, λ^r),                      (6.1)
                    z^{r+1} = argmin_z L(w^{r+1}, z, λ^r)
    Dual Update:    λ^{r+1} = λ^r + ρ (Aw^{r+1} + Bz^{r+1} − c)

This algorithm is well studied in the optimization literature (see [109] for a monograph on the use
of this algorithm in convex distributed optimization and [112] for its use in non-convex continuous
optimization).
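As a concrete illustration of the scheme above, the following toy example (not the dissertation's actual solver) applies ADMM to min (w−a)² + (z−b)² subject to w − z = 0, whose solution is w = z = (a+b)/2; the closed-form primal updates follow from setting the gradients of the augmented Lagrangian to zero:

```python
def admm_consensus(a=3.0, b=1.0, rho=1.0, iters=50):
    """ADMM for min (w-a)^2 + (z-b)^2  s.t.  w - z = 0; optimum is w = z = (a+b)/2."""
    w = z = lam = 0.0
    for _ in range(iters):
        # w-step: minimize (w-a)^2 + lam*(w-z) + (rho/2)*(w-z)^2 over w
        w = (2 * a - lam + rho * z) / (2 + rho)
        # z-step: minimize (z-b)^2 - lam*z + (rho/2)*(w-z)^2 over z
        z = (2 * b + lam + rho * w) / (2 + rho)
        # dual ascent on the residual of the coupling constraint
        lam += rho * (w - z)
    return w, z

w, z = admm_consensus()
print(round(w, 4), round(z, 4))  # 2.0 2.0
```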
6.1.2.2 ADMM for Solving (2.7)
To follow the standard form provided in subsection 6.1.2.1, we substitute a_{ℓ,t} S 1 with γ_{ℓ,t} and
reformulate the optimization problem (2.7) as

    min_{S,γ,β}  F_tt(γ) = ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} γ_{ℓ,t} δ(γ_{ℓ,t})
    s.t.  S^⊺ 1 = 1,   c^⊺ S 1 + β = Ω,
          D S 1 = q,   A S 1 = γ,
          S ∈ [0,1]^{(|R|·|I|)×|N|},   β ≥ 0,                           (6.2)

where β is a slack variable. As we discussed in subsection 2.1.3, in order to find (approximately)
binary solutions, we add a regularizer ℜ(S) = −(λ̃/2) ∑_{r=1}^{|R|} ∑_{i=1}^{|I|} ∑_{n=1}^{|N|} S_{r,i,n}(S_{r,i,n} − 1) to the objective
function:

    min_{S,γ,β}  ∑_{ℓ=1}^{|E|} ∑_{t=1}^{|T|} γ_{ℓ,t} δ(γ_{ℓ,t}) − (λ̃/2) ∑_{r=1}^{|R|} ∑_{i=1}^{|I|} ∑_{n=1}^{|N|} S_{r,i,n}(S_{r,i,n} − 1)
    s.t.  S^⊺ 1 = 1,   c^⊺ S 1 + β = Ω,
          D S 1 = q,   A S 1 = γ,
          S ∈ [0,1]^{(|R|·|I|)×|N|},   β ≥ 0,                           (6.3)
where λ̃ ∈ R₊ is the regularization parameter. This regularizer forces the elements of matrix S to be as close as possible to the binary domain {0,1}. The augmented Lagrangian of the reformulated optimization problem (6.3) is
    L(S, γ, β) ≜ F_tt(γ) + I_{[0,1]^{(|R|·|I|)×|N|}}(S) + I_{R₊}(β)
        − (λ̃/2) Σ_{r=1}^{|R|} Σ_{i=1}^{|I|} Σ_{n=1}^{|N|} S_{r,i,n}(S_{r,i,n} − 1)
        + ⟨λ1, S^⊺1 − 1⟩ + λ2(c^⊺S1 + β − Ω)
        + ⟨λ3, DS1 − q⟩ + ⟨λ4, AS1 − γ⟩
        + (ρ/2)‖S^⊺1 − 1‖² + (ρ/2)(c^⊺S1 + β − Ω)²
        + (ρ/2)‖DS1 − q‖² + (ρ/2)‖AS1 − γ‖²        (6.4)
with the set of Lagrange multipliers {λ1, λ2, λ3, λ4} and ρ > 0 being the primal penalty parameter. Then, ADMM solves (6.3) by the following iterative scheme:
    S^{t+1} = argmin_S  I_{[0,1]^{(|R|·|I|)×|N|}}(S) − (λ̃/2) Σ_{r=1}^{|R|} Σ_{i=1}^{|I|} Σ_{n=1}^{|N|} S_{r,i,n}(S_{r,i,n} − 1)
                        + ⟨λ1, S^⊺1 − 1⟩ + λ2(c^⊺S1 + β − Ω) + ⟨λ3, DS1 − q⟩ + ⟨λ4, AS1 − γ⟩
                        + (ρ/2)‖S^⊺1 − 1‖² + (ρ/2)(c^⊺S1 + β − Ω)² + (ρ/2)‖DS1 − q‖² + (ρ/2)‖AS1 − γ‖²
    β^{t+1} = argmin_β  I_{R₊}(β) + λ2(c^⊺S1 + β − Ω) + (ρ/2)(c^⊺S1 + β − Ω)²
    γ^{t+1} = argmin_γ  F_tt(γ) + ⟨λ4, AS1 − γ⟩ + (ρ/2)‖AS1 − γ‖²
    λ1^{t+1} = λ1^t + ρ((S^{t+1})^⊺1 − 1)
    λ2^{t+1} = λ2^t + ρ(c^⊺S^{t+1}1 + β^{t+1} − Ω)
    λ3^{t+1} = λ3^t + ρ(DS^{t+1}1 − q)
    λ4^{t+1} = λ4^t + ρ(AS^{t+1}1 − γ^{t+1})
We can write the update of the primal variable S as a closed-form expression. To facilitate the
derivation of its updating rule, we substitute S1 in our problem with the new variable u and add
the constraint S1 = u to our formulation. Moreover, we substitute the matrix S by the new variable
W in the constraint S
⊺1 = 1 and replace the matrix S with the new variable H in the constraint
S ∈ [0,1]
(|R|·|I |)×|N |
and the regularizer ℜ(S). As matrices W and H are substitutions for S, we
include the constraints S = W and S = H in the reformulation. Therefore, optimization problem (6.3)
will be reformulated as
min
γ,u,S,W,H,z,β
|E |
∑
ℓ=1
|T|
∑
t=1
γℓ,tδ(γℓ,t)
−
˜λ
2
|R|
∑
r=1
|I |
∑
i=1
|N |
∑
n=1
Hr,i,n(Hr,i,n −1)
s.t. S1 = u, W⊺
1 = 1
Du = q, Au = γ
H = S, W = S
c
⊺u+β = Ω, β ≥ 0
H ∈ [0,1]
(|R|·|I |)×|N |
.
(6.5)
1
which is problem (2.8) introduced in Section 2.1.3. Let
    L(γ, S, H, W, u, β) ≜ F_tt(γ) + I_{[0,1]^{(|R|·|I|)×|N|}}(H) + I_{R₊}(β)
        + ⟨λ1, S1 − u⟩ + ⟨λ2, W^⊺1 − 1⟩ + ⟨λ3, Du − q⟩
        + ⟨λ4, Au − γ⟩ + ⟨Λ5, H − S⟩
        + λ6(c^⊺u + β − Ω) + ⟨Λ7, W − S⟩
        + (ρ/2)‖S1 − u‖² + (ρ/2)‖W^⊺1 − 1‖²
        + (ρ/2)‖Du − q‖² + (ρ/2)‖Au − γ‖²
        + (ρ/2)‖H − S‖² + (ρ/2)(c^⊺u + β − Ω)²
        + (ρ/2)‖W − S‖² − (λ̃/2) Σ_{r=1}^{|R|} Σ_{i=1}^{|I|} Σ_{n=1}^{|N|} H_{r,i,n}(H_{r,i,n} − 1)        (6.6)

be the augmented Lagrangian function of (2.8) with the set of Lagrange multipliers {λ1, λ2, ..., Λ7} and ρ > 0 being the primal penalty parameter. Then, ADMM solves (2.8) by the following iterative scheme:
    u^{t+1} = argmin_u  ⟨λ1^t, S^{t+1}1 − u⟩ + ⟨λ3^t, Du − q⟩ + ⟨λ4^t, Au − γ⟩ + λ6(c^⊺u + β − Ω)
                        + (ρ/2)‖S^{t+1}1 − u‖² + (ρ/2)‖Du − q‖² + (ρ/2)‖Au − γ‖² + (ρ/2)(c^⊺u + β − Ω)²
    W^{t+1} = argmin_W  ⟨λ2^t, W^⊺1 − 1⟩ + ⟨Λ7^t, W − S^{t+1}⟩ + (ρ/2)‖W^⊺1 − 1‖² + (ρ/2)‖W − S^{t+1}‖²
    H^{t+1} = argmin_H  1(ρ > λ̃) I_{[0,1]^{(|R|·|I|)×|N|}}(H) + 1(ρ < λ̃) I_{{0,1}^{(|R|·|I|)×|N|}}(H)
                        + ⟨Λ5^t, H − S^{t+1}⟩ + (ρ/2)‖H − S^{t+1}‖² − (λ̃/2) Σ_{r=1}^{|R|} Σ_{i=1}^{|I|} Σ_{n=1}^{|N|} H_{r,i,n}(H_{r,i,n} − 1)
    S^{t+1} = argmin_S  ⟨λ1^t, S1 − u^t⟩ + ⟨Λ5^t, H^t − S⟩ + ⟨Λ7^t, W^t − S⟩
                        + (ρ/2)‖S1 − u^t‖² + (ρ/2)‖H^t − S‖² + (ρ/2)‖W^t − S‖²
    β^{t+1} = argmin_β  I_{R₊}(β) + λ6^t(c^⊺u^{t+1} + β − Ω) + (ρ/2)(c^⊺u^{t+1} + β − Ω)²
    λ1^{t+1} = λ1^t + ρ(S^{t+1}1 − u^{t+1})
    λ2^{t+1} = λ2^t + ρ((W^{t+1})^⊺1 − 1)
    λ3^{t+1} = λ3^t + ρ(Du^{t+1} − q)
    λ4^{t+1} = λ4^t + ρ(Au^{t+1} − γ^{t+1})
    Λ5^{t+1} = Λ5^t + ρ(H^{t+1} − S^{t+1})
    λ6^{t+1} = λ6^t + ρ(c^⊺u^{t+1} + β^{t+1} − Ω)
    Λ7^{t+1} = Λ7^t + ρ(W^{t+1} − S^{t+1})
The primal update rules can be simplified as

    γ_{ℓ,t̂}^{t+1} = argmin_{γ_{ℓ,t̂}}  γ_{ℓ,t̂} δ(γ_{ℓ,t̂}) + λ_{4,(ℓ,t̂)}^t (a_{ℓ,t̂}u^t − γ_{ℓ,t̂}) + (ρ/2)(a_{ℓ,t̂}u^t − γ_{ℓ,t̂})²,   ∀ℓ, ∀t̂
    S^{t+1} = (1/ρ)(−λ1^t 1^⊺ + Λ5^t + Λ7^t + ρu^t 1^⊺ + ρH^t + ρW^t)(11^⊺ + 2I)^{−1}
    H^{t+1} = 1(ρ > λ̃) Π((1/(ρ − λ̃))(ρS^t − Λ5^t − λ̃/2))_{[0,1]} + 1(ρ < λ̃) Π((1/(ρ − λ̃))(ρS^t − Λ5^t − λ̃/2))_{{0,1}}
    W^{t+1} = (1/ρ)(I + 11^⊺)^{−1}(−1(λ2^t)^⊺ − Λ7^t + ρ11^⊺ + ρS^{t+1})
    u^{t+1} = (1/ρ)(I + D^⊺D + A^⊺A + cc^⊺)^{−1}(λ1^t − D^⊺λ3^t − A^⊺λ4^t + ρS^{t+1}1 + ρD^⊺q + ρA^⊺γ^{t+1} − λ6^t c − ρβ^t c + ρΩc)
    β^{t+1} = Π((1/ρ)(−λ6^t − ρc^⊺u^{t+1} + ρΩ))_{R₊}
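The closed forms above hinge on matrices of the shape c₁I + c₂11^⊺ being cheap to invert: by the Sherman-Morrison identity, (c₁I + c₂11^⊺)^{−1} = (1/c₁)(I − (c₂/(c₁ + c₂n))11^⊺), so terms like (11^⊺ + 2I)^{−1} never require a numerical matrix inversion. The sketch below checks the identity numerically; the dimension and constants are our own illustrative choices.

```python
import numpy as np

# Sherman-Morrison check for (c1*I + c2*1 1^T)^{-1}, the structure appearing in
# the S- and W-updates above.
n = 5
one = np.ones((n, 1))
for c1, c2 in [(2.0, 1.0), (1.0, 1.0)]:       # e.g. (2, 1) matches (2I + 1 1^T)
    M = c1 * np.eye(n) + c2 * (one @ one.T)
    M_inv = (np.eye(n) - (c2 / (c1 + c2 * n)) * (one @ one.T)) / c1
    assert np.allclose(M @ M_inv, np.eye(n))  # closed form equals the true inverse
```

This is why the per-iteration cost of the S and W updates is dominated by matrix multiplications rather than factorizations.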
6.1.3 Distributed Computation of Algorithm 1
To handle the expensive computation of the matrices W, H, and S in Algorithm 1, we can utilize the computational power of the drivers' smartphones. Each column of the matrices W, H, and S corresponds to a single driver, and hence the computation corresponding to each column can be performed in parallel on the drivers' smartphones. The details of this parallel computation are depicted in Figure 6.1. To update the i-th column of the matrices W, H, and S at iteration t of Algorithm 1, driver i's smartphone computes
    W_{(:,i)}^{t+1} = (ρ11^⊺ + ρI)^{−1}(ρ1 + ρS_{(:,i)}^t − Λ_{7,(:,i)}^t − λ_{2,(:,i)}^t)
    H_{(:,i)}^{t+1} = 1(ρ > λ̃) Π((1/(ρ − λ̃))(ρS_{(:,i)}^t − Λ_{5,(:,i)}^t − λ̃/2))_{[0,1]} + 1(ρ < λ̃) Π((1/(ρ − λ̃))(ρS_{(:,i)}^t − Λ_{5,(:,i)}^t − λ̃/2))_{{0,1}}
    S_{(:,i)}^{t+1} = (ρu^{t+1}1^⊺ + Λ_{5,(:,i)}^t + ρH_{(:,i)}^{t+1} + Λ_{7,(:,i)}^t + ρW_{(:,i)}^{t+1} − λ1^t 1^⊺)(ρ11^⊺ + 2ρI)^{−1}

where (:, i) denotes the i-th column of the matrix and corresponds to driver i.
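Because the column updates are independent, running them in parallel reproduces the full-matrix update exactly. The sketch below checks this for the W-update: the same linear rule is applied once to the whole matrix and once column-by-column in a thread pool. The sizes, ρ, and random inputs are our own, for illustration only.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)
m, n_drivers, rho = 4, 3, 1.5
S = rng.random((m, n_drivers))
Lam7 = rng.random((m, n_drivers))
lam2 = rng.random((m, n_drivers))          # per-driver copies of the lam_2 term
one = np.ones((m, 1))
M_inv = np.linalg.inv(rho * (one @ one.T + np.eye(m)))

def update_column(i):
    # One driver's share of the W-update, as in the column formula above.
    rhs = rho * one[:, 0] + rho * S[:, i] - Lam7[:, i] - lam2[:, i]
    return M_inv @ rhs

# Full-matrix update vs. per-column updates dispatched to worker threads.
rhs_full = rho * one @ np.ones((1, n_drivers)) + rho * S - Lam7 - lam2
W_full = M_inv @ rhs_full
with ThreadPoolExecutor() as pool:
    W_cols = np.stack(list(pool.map(update_column, range(n_drivers))), axis=1)

assert np.allclose(W_full, W_cols)         # column-parallel result matches
```

In deployment, each `update_column(i)` call would run on driver i's device, with the platform only aggregating the resulting columns.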
Figure 6.1: Steps of the distributed implementation of Algorithm 1. Step 0: The incentive offering platform shares the constant parameters and matrices with drivers. Step t-1: The incentive offering platform updates u^{t+1}. Step t-2: The incentive offering platform sends the required information to drivers. Step t-3: The incentive offering platform receives the updated vectors from drivers. Step t-4: The incentive offering platform updates γ^{t+1}, β^{t+1}, and the dual variables.
6.1.4 UE Algorithm - Chapter 2
In our numerical experiments, we use the volume at the UE state of the system after the incentivization to evaluate the travel time. To compute the volume at User Equilibrium, we present Algorithm 2. Before presenting the details of Algorithm 2, let us explain some notation used in this algorithm. The vector v ∈ R₊^{|E|·|T|} denotes the volume of links at different time slots. N₁ is the set of user drivers that accept the incentive offer, and S₁ ∈ {0,1}^{|R|×|N₁|} is the matrix of route assignments of these drivers. N₂ is the set of the remaining drivers (user drivers that rejected the incentive offer, user drivers that did not receive an incentive offer, and nonuser drivers), and S₂ ∈ {0,1}^{|E|×|N₂|} is the matrix of their OD assignment. P̃ ∈ [0,1]^{|R|×|E|} encodes the probability of picking different routes given the driver's OD. Thus, the vector P̃S₂1 ∈ R^{|R|×1} gives the expected number of non-incentivized vehicles on each route. δUE ∈ R₊ is the total travel time of the system based on the traffic volume at the last iteration. In this algorithm, we rely on the method presented by [104] to compute the matrices R and P̃ based on the volume vector v.
Algorithm 2 Computation of Travel Time at UE
1: Input: Step size α_UE, number of iterations T̃.
2: Compute R^0 and P̃^0 using volume vector v^0 (historical data)
3: for t = 1, 2, ..., T̃ do
4:     ṽ^t = R^{t−1}S₁1 + R^{t−1}P̃^{t−1}S₂1
5:     v^t = (1 − α_UE)v^{t−1} + α_UE ṽ^t
6:     Compute R^t and P̃^t based on volume vector v^t
7: end for
8: Compute UE travel time δUE utilizing v^{T̃} and the BPR function
9: Return: δUE
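A stripped-down sketch of the averaging loop in Algorithm 2 on a two-route toy network follows: a logit re-split on current travel times stands in for the route-choice matrices R and P̃ computed via [104], volumes are smoothed with the step size as in step 5, and the BPR function gives travel times. All constants (BPR parameters, demand, logit scale) are our own illustrative choices.

```python
import numpy as np

def bpr(v, t0=np.array([0.1, 0.2]), w=np.array([1000.0, 1000.0])):
    # Standard BPR form: free-flow time inflated as volume nears capacity.
    return t0 * (1.0 + 0.15 * (v / w) ** 4)

demand, alpha_ue = 2000.0, 0.5
v = np.array([demand, 0.0])                    # start with everyone on route 1
for _ in range(100):
    tt = bpr(v)
    p = np.exp(-5 * tt) / np.exp(-5 * tt).sum()    # re-split flow by travel time
    v_tilde = demand * p
    v = (1 - alpha_ue) * v + alpha_ue * v_tilde    # v^t = (1-a) v^{t-1} + a v~^t
total_tt = float(v @ bpr(v))                       # system travel time at the end

assert np.isclose(v.sum(), demand)                 # flow is conserved every step
```

The smoothing step is what damps oscillation between routes so that the volumes settle toward an equilibrium split.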
6.1.5 An Example of the Model and Notations - Chapter 2
In this section, we present a small example of a network to illustrate our model and notations.
Consider the network
Figure 6.2: Network example G1.
where V = {ν1, ν2, ν3} is the set of nodes and E = {e1, e2, e3} is the set of edges (roads). Details of the links and attributes are represented in Table 6.1. The (origin, destination) pair is (ν1, ν3). There are two routes going from the origin to the destination, as illustrated in Table 6.2. The time horizon set is T = {1, 2, 3} and each time step is 0.2 hours. To estimate the location of drivers at each time, we need the matrix R ∈ [0,1]^{9×6} given below.
         Length (mile)   Speed (mph)   Travel time (hour)
    e1   5               50            0.1
    e2   10              50            0.2
    e3   5               50            0.1

Table 6.1: Set of edges.
               Links       Route vector r
    Route 1    e1 → e3     r1 = (1, 0, 1)^⊺
    Route 2    e2 → e3     r2 = (0, 1, 1)^⊺

Table 6.2: Set of routes.
    R =
                 t1=1,r1   t1=1,r2   t1=2,r1   t1=2,r2   t1=3,r1   t1=3,r2
    t2=1, e1     1         1         0         0         0         0
    t2=1, e2     0         0         0         0         0         0
    t2=1, e3     0.5       0         0         0         0         0
    t2=2, e1     0         0         1         1         0         0
    t2=2, e2     0         1         0         0         0         0
    t2=2, e3     0.5       0         0.5       0         0         0
    t2=3, e1     0         0         0         0         1         1
    t2=3, e2     0         0         0         1         0         0
    t2=3, e3     0         0         0.5       0         0.5       0

where t1 is the entrance time of the driver and t2 is the driver's arrival time at the road. In model (2.3), the column vector β_{r,t} corresponds to the columns of matrix R.
Assume there are two drivers in the system, so N = {d1, d2}. We want to offer rewards from the set I = {$0, $5} to control the traffic. To estimate the probability of choosing each route given an incentive offered at a time, we use the matrix P ∈ [0,1]^{6×12}:

    P^{t_i} =
                 No incentive      $5 → r1   $5 → r2
    t = 1, r1    0.50    0.50      0.97      0.03
    t = 1, r2    0.50    0.50      0.03      0.97
    t = 2, r1    0.50    0.50      0.97      0.03
    t = 2, r2    0.50    0.50      0.03      0.97
    t = 3, r1    0.50    0.50      0.97      0.03
    t = 3, r2    0.50    0.50      0.03      0.97
    ∀i ∈ {1, 2, 3},   and   P = [P^{t1}  P^{t2}  P^{t3}]

The probability matrices for all three time steps are equal because the speed is the same at all three times.
We compute the probability of choosing route k given that $i′ is offered for route j′ by

    P(r = k, i = ($i′ → route j′)) = exp(−0.086 tt_k + 0.7 i′ 1_{k=j′}) / ( exp(−0.086 tt_{j′} + 0.7 i′) + Σ_{j≠j′} exp(−0.086 tt_j) )        (6.7)

where tt_j is the travel time of route j. We use [102] to extract these coefficients.
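Eq. (6.7) is a multinomial logit: an incentive of $i′ on route j′ adds 0.7·i′ to that route's utility, while every route keeps the travel-time term −0.086·tt_j. The sketch below implements the formula; routes are indexed from 0 and the example travel times are our own illustrative values (not the equal-speed scenario of the table above).

```python
import numpy as np

def route_probs(tt, incentive, offered_route):
    """P(r = k | $incentive offered for route offered_route), per Eq. (6.7)."""
    util = -0.086 * np.asarray(tt, dtype=float)   # travel-time disutility
    util[offered_route] += 0.7 * incentive        # incentive boosts one route
    expu = np.exp(util)
    return expu / expu.sum()

tt_minutes = [6.0, 12.0]                          # hypothetical route travel times
p_no_incentive = route_probs(tt_minutes, 0.0, 0)
p_with_5 = route_probs(tt_minutes, 5.0, 1)        # $5 offered for the slower route

assert p_with_5[1] > p_no_incentive[1]            # the reward shifts choice
```

The same function, vectorized over routes, times, and incentive levels, is what a matrix like P above tabulates.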
6.1.6 Details of the Numerical Experiments
    Penetration Rate   Budget    $0     $2    $10
    25%                $1000     7242   191   61
    25%                $10000    7198   14    282
    50%                $1000     7063   414   17
    50%                $10000    6975   0     519
    75%                $1000     6994   500   0
    75%                $10000    6717   0     777
    100%               $1000     6994   500   0
    100%               $10000    6472   28    994

Table 6.3: Distribution of the offered incentives in Experiment I with different penetration rates.
    Budget    $0      $2     $10
    $1000     14645   435    13
    $10000    12614   1849   630

Table 6.4: Distribution of the offered incentives in Experiment II for incentive set I1 with penetration rate of 100%.
    Budget    $0      $1    $2    $3    $5    $10
    $1000     14509   351   152   30    51    0
    $10000    12682   184   305   832   838   252

Table 6.5: Distribution of the offered incentives in Experiment II for incentive set I2 with penetration rate of 100%.
    Budget    $0     $2    $10
    $1000     7720   500   0
    $10000    6916   380   924

Table 6.6: Distribution of the offered incentives in Experiment III for model (2.3) with penetration rate of 100%.
    Penetration Rate   Budget    $0     $2    $10
    25%                $1000     8042   100   78
    25%                $10000    7879   104   237
    50%                $1000     7892   285   43
    50%                $10000    7144   109   967
    75%                $1000     7772   435   13
    75%                $10000    7057   241   922
    100%               $1000     7720   500   0
    100%               $10000    7031   289   900

Table 6.7: Distribution of the offered incentives in Experiment III with different penetration rates for model (2.6), Algorithm 1.
    Penetration Rate   Budget    $0     $2    $10
    25%                $1000     8032   110   78
    25%                $10000    7891   9     320
    50%                $1000     7980   175   65
    50%                $10000    7565   0     655
    75%                $1000     7972   185   63
    75%                $10000    7246   78    896
    100%               $1000     7896   280   44
    100%               $10000    7022   248   950

Table 6.8: Distribution of the offered incentives in Experiment III with different penetration rates for model (2.6), Gurobi.
    Penetration Rate   Budget    $0     $2    $10
    25%                $1000     8036   105   79
    25%                $10000    7658   0     562
    50%                $1000     8050   85    85
    50%                $10000    7220   0     1000
    75%                $1000     7972   185   63
    75%                $10000    7240   83    897
    100%               $1000     7900   286   34
    100%               $10000    7048   260   912

Table 6.9: Distribution of the offered incentives in Experiment III with different penetration rates for model (2.6), MOSEK.
    Budget    25%   50%   75%   100%
    $100      3     4     4     5
    $1000     11    21    27    27
    $10000    14    27    38    47

Table 6.10: Effect of the penetration rate on travel time decrease (hour) in Experiment I.
    Budget    25%     50%     75%     100%
    $100      0.49%   0.63%   0.62%   0.68%
    $1000     1.68%   3.07%   3.91%   4.03%
    $10000    2.03%   3.99%   5.52%   6.97%

Table 6.11: Effect of the penetration rate on the percentage of travel time decrease in Experiment I.
    Budget    25%   50%   75%   100%
    $100      8     8     4     8
    $1000     25    44    50    57
    $10000    27    50    72    96

Table 6.12: Effect of the penetration rate on travel time decrease (hour) in Experiment III, model (2.6), Algorithm 1.
    Budget    25%     50%     75%     100%
    $100      0.38%   0.41%   0.17%   0.37%
    $1000     1.21%   2.13%   2.41%   2.71%
    $10000    1.28%   2.38%   3.47%   4.60%

Table 6.13: Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Algorithm 1.
    Budget    25%   50%   75%   100%
    $100      8     9     7     6
    $1000     23    43    49    65
    $10000    28    55    74    98

Table 6.14: Effect of the penetration rate on travel time decrease (hour) in Experiment III, model (2.6), Gurobi.
    Budget    25%     50%     75%     100%
    $100      0.38%   0.45%   0.33%   0.30%
    $1000     1.10%   2.06%   2.33%   3.09%
    $10000    1.32%   2.64%   3.56%   4.69%

Table 6.15: Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), Gurobi.
    Budget    25%   50%   75%   100%
    $100      8     8     5     6
    $1000     23    29    48    67
    $10000    28    62    70    101

Table 6.16: Effect of the penetration rate on travel time decrease (hour) in Experiment III, model (2.6), MOSEK.
    Budget    25%     50%     75%     100%
    $100      0.38%   0.37%   0.23%   0.27%
    $1000     1.11%   1.39%   2.31%   3.19%
    $10000    1.33%   2.98%   3.35%   4.86%

Table 6.17: Effect of the penetration rate on the percentage of travel time decrease in Experiment III, model (2.6), MOSEK.
6.2 Appendix - Chapter 3
6.2.1 List of Notations
Traffic network spatiotemporal parameters:
• G : Directed graph of the traffic network
• V : Set of nodes of graph G which correspond to major intersections and ramps
• E : Set of edges of graph G which correspond to the set of road segments
• |E|: Total number of road segments/edges in the network G (i.e., the cardinality of the set E)
• ℓ: An edge of graph G which corresponds to a link/road segment in the traffic network
• R_j: Set of possible route options for driver j
• R: Total set of possible route options for all OD pairs
• |R|: Total number of possible route options (i.e., the cardinality of the set R)
• r: Route vector
• T: Set of time periods
• |T|: Number of time units (i.e., the cardinality of T)
• θ_{ℓ,t}: Travel time of link ℓ at time t
• F(·): Total travel time function
• T_r: The travel time for route r
BPR function and its parameters:
• f_BPR(·): BPR function
• v: The traffic volume of the link
• w: The practical capacity of the link
• θ₀: The free-flow travel time of the link
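The BPR notation above can be made concrete with the standard BPR (Bureau of Public Roads) formula and its usual coefficients 0.15 and 4: travel time grows from the free-flow time θ₀ as the volume v approaches the practical capacity w. The example link values below are our own, for illustration only.

```python
import numpy as np

def f_bpr(v, w, theta0, alpha=0.15, beta=4.0):
    # theta0: free-flow travel time; w: practical capacity; v: link volume.
    return theta0 * (1.0 + alpha * (v / w) ** beta)

theta0, w = 0.1, 1500.0   # 0.1 h free-flow time, capacity 1500 vehicles
assert np.isclose(f_bpr(0.0, w, theta0), theta0)        # empty link: free-flow time
assert np.isclose(f_bpr(w, w, theta0), theta0 * 1.15)   # at capacity: +15%
```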
Optimization model parameters:
• N_i: Set of drivers of organization i
• |N_i|: Total number of drivers of organization i (i.e., the cardinality of set N_i)
• N: Set of all drivers
• |N|: Total number of drivers (i.e., the cardinality of set N)
• v_t: Volume vector of road segments at time t
• v̂: The vector of the estimated volume of links at different times in the horizon
• v̂_{ℓ,t}: The (|E|×t + ℓ)-th element of vector v̂, representing the volume of the ℓ-th link at time t
• R: The matrix of the probability of a driver being at each link given their route
• r_{ℓ,t}: The row of matrix R that corresponds to link ℓ at time t
• D: The matrix of route assignments of the OD pairs
• q_i: The vector of the number of drivers of organization i for each OD pair
• δ: The vector of travel time of routes at different times
• η: The vector of shortest travel times between different OD pairs at different times
• b_i: This vector contains the factors by which the travel time of assigned routes can be larger than the shortest travel time of the drivers of organization i
• B_i: The matrix of shortest travel time assignment of drivers of organization i
• α_i: VOT for organization i
• α: The vector of VOT values for the different organizations
• γ_i: Total travel time of organization i in the absence of the incentivization platform
• Ω: Budget for incentivization
• K: The number of OD pairs
Decision variables:
• s_i^{r,j}: Decision parameter indicating whether route r is assigned to driver j from organization i
• s_i^j: The binary route assignment vector of driver j from organization i
• S_i: Decision matrix of drivers of organization i
• S: Decision matrix of all drivers
• c_i: The cost of the incentive offered to organization i
6.2.2 Reformulated Optimization Model for the ADMM Algorithm
To solve the relaxed version of problem (3.6) efficiently, we present a distributed algorithm based on the reformulation

    min_{S,H,W,Z,u,β,ω,μ,β̃,c}  Σ_{ℓ=1}^{|E|} Σ_{t=1}^{|T|} v̂_{ℓ,t} θ_{ℓ,t}(v̂_{ℓ,t}) − (λ̃/2) Σ_{r=1}^{|R|} Σ_{t=1}^{|T|} Σ_{i=1}^{n} (Z_i)_{r,t}((Z_i)_{r,t} − 1)
    s.t.  S_i1 = u_i,                        ∀i = 1, 2, ..., n
          Du_i = q_i,                        ∀i = 1, 2, ..., n
          W_i^⊤1 = 1,                       ∀i = 1, 2, ..., n
          S_i = W_i,                         ∀i = 1, 2, ..., n
          H_i^⊤δ + β_i = b_i ⊙ B_iη,         ∀i = 1, 2, ..., n
          S_i = H_i,                         ∀i = 1, 2, ..., n
          β_i ≥ 0,                           ∀i = 1, 2, ..., n
          Z_i ∈ [0,1]^{(|R|·|T|)×|N_i|},     ∀i = 1, 2, ..., n
          Ĩc̃ = α ⊙ (∆^⊤u − γ),  ω = R̃ũ
          c̃ ≥ 0,  c̃^⊤1̃ + β̃ = Ω,  β̃ ≥ 0
          S_i = Z_i,                         ∀i = 1, 2, ..., n        (6.8)

where S = {S_i}_{i=1}^n, H = {H_i}_{i=1}^n, W = {W_i}_{i=1}^n, Z = {Z_i}_{i=1}^n, u = {u_i}_{i=1}^n, and β = {β_i}_{i=1}^n.
6.2.3 Distributed Incentivization Algorithm
Algorithm 3 solves the relaxed version of problem (3.6). In this algorithm, we use the projection operator Π(·)_{[0,1]}, which projects the elements of a matrix onto the interval [0,1]; Π(·)_{R₊} is also a projection operator, but it projects the elements of a matrix onto R₊. Notice that in Algorithm 3, the computational load of steps 3, 7, 8, and 9 is extensive because the matrices S, W, H, and Z are large. However, each column in these matrices corresponds to one driver, and these steps are not coupled across columns, so the computation of each column can be performed in parallel. The notations used in Algorithm 3 are defined below.
Writing [x₁; …; xₙ] for vertical stacking:

    γ = [γ₁; …; γₙ],   q = [q₁; …; qₙ],   λ_i = [λ_{i,1}; …; λ_{i,n}] for i = 1, 3,
    R̃ = [R ⋯ R],   Ĩ = [I  −I],   α̃ = [α₁; …; αₙ],   α = [α₁; …; αₙ],
    ũ^t = [S₁^t1; …; Sₙ^t1],   D̃ = [D; …; D],   ∆ = [δ; …; δ],   1̃ = [1; 0],   c̃ = [c; μ]
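The projection operators appearing in Algorithm 3 are simple elementwise clipping operations, plus a rounding-to-{0,1} projection used when ρ < λ̃. A minimal sketch with our own example values:

```python
import numpy as np

proj_box = lambda M: np.clip(M, 0.0, 1.0)     # Pi(.)_{[0,1]}: clip entries to [0,1]
proj_pos = lambda M: np.maximum(M, 0.0)       # Pi(.)_{R+}: clip entries to R+
proj_bin = lambda M: (np.clip(M, 0.0, 1.0) >= 0.5).astype(float)  # nearest of {0,1}

M = np.array([[-0.3, 0.4], [1.7, 0.6]])
assert np.allclose(proj_box(M), [[0.0, 0.4], [1.0, 0.6]])
assert np.allclose(proj_pos(M), [[0.0, 0.4], [1.7, 0.6]])
assert np.allclose(proj_bin(M), [[0.0, 0.0], [1.0, 1.0]])
```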
6.2.4 Limitations and Further Discussions
While our platform demonstrates significant potential, several limitations and considerations warrant further discussion. First, our simulations assume that VOT is given and fixed. Although these values
Algorithm 3 Distributed Organization-Level Incentivization via ADMM
1: Input: Initial values ω^0, S_i^0, H_i^0, W_i^0, Z_i^0, u^0, β_i^0, β̃^0, c̃^0, λ_{1,i}^0 ∈ R^{|R|·|T|}, λ_2^0 ∈ R^{|E|·|T|}, λ_{3,i}^0 ∈ R^{K·|T|}, λ_{4,i}^0 ∈ R^{|R|·|T|}, Λ_{5,i}^0 ∈ R^{(|R|·|T|)×|N_i|}, λ_{6,i}^0 ∈ R^{|N_i|}, λ_7^0 ∈ R^n, Λ_{8,i}^0 ∈ R^{(|R|·|T|)×|N_i|}, λ_9^0 ∈ R, Λ_{10,i}^0 ∈ R^{(|R|·|T|)×|N_i|}; dual update step ρ; number of iterations T̃.
   for t = 0, 1, ..., T̃ do
     for ℓ = 0, 1, ..., |E| and t̂ = 1, ..., |T| do
2:     ω_{ℓ,t̂}^{t+1} = argmin_{ω_{ℓ,t̂}}  ω_{ℓ,t̂} θ_{ℓ,t}(ω_{ℓ,t̂}) + λ_{2,(ℓ,t̂)}^t (ω_{ℓ,t̂} − r_{ℓ,t̂}(Σ_{i=1}^n u_i^t)) + (ρ/2)(ω_{ℓ,t̂} − r_{ℓ,t̂}(Σ_{i=1}^n u_i^t))²
     end for
     for i = 1, ..., n do
3:     S_i^{t+1} = (−λ_{1,i}^t 1^⊤ − Λ_{5,i}^t − Λ_{8,i}^t − Λ_{10,i}^t + ρu_i^t 1^⊤ + ρW_i^t + ρH_i^t + ρZ_i^t)(ρ11^⊤ + 3ρI)^{−1}
4:     β_i^{t+1} = Π((1/ρ)(−λ_{6,i}^t − ρ(H_i^t)^⊤δ + ρ b_i ⊙ (B_iη)))_{R₊}
     end for
5:   c̃^{t+1} = Π((1/ρ)(Ĩ^⊤Ĩ + 1̃1̃^⊤)^{−1}(Ĩ^⊤λ_7^t − λ_9^t 1̃ − ρĨ^⊤(α ⊙ γ) + ρĨ^⊤(α ⊙ (∆^⊤u^t)) − ρβ̃^t 1̃ + ρΩ1̃))_{R₊}
6:   u^{t+1} = (1/ρ)(I + R̃^⊤R̃ + D̃^⊤D̃ + (∆α̃)(∆α̃)^⊤)^{−1}(λ_1^t + R̃^⊤λ_2^t − D̃^⊤λ_3^t − (∆α̃)λ_7^t + ρũ^{t+1} − ρR̃^⊤ω^{t+1} + ρD̃^⊤q + ρ(∆α̃)(α ⊙ γ) + ρ(∆α̃)(Ĩc̃^{t+1}))
     for i = 1, ..., n do
7:     W_i^{t+1} = (1/ρ)(11^⊤ + I)^{−1}(ρ11^⊤ + ρS_i^{t+1} − 1(λ_{4,i}^t)^⊤ + Λ_{5,i}^t)
8:     H_i^{t+1} = (1/ρ)(δδ^⊤ + I)^{−1}(−δ(λ_{6,i}^t)^⊤ + Λ_{8,i}^t − ρδ(β_i^{t+1})^⊤ + ρδ(b_i ⊙ B_iη)^⊤ + ρS_i^{t+1})
9:     Z_i^{t+1} = 1(ρ > λ̃)Π((1/(ρ − λ̃))(ρS_i^{t+1} + Λ_{10,i}^t − λ̃/2))_{[0,1]} + 1(ρ < λ̃)Π((1/(ρ − λ̃))(ρS_i^{t+1} + Λ_{10,i}^t − λ̃/2))_{{0,1}}
     end for
     for i = 1, ..., n do
10:    λ_{1,i}^{t+1} = λ_{1,i}^t + ρ(S_i^{t+1}1 − u_i^{t+1})
11:    λ_{3,i}^{t+1} = λ_{3,i}^t + ρ(Du_i^{t+1} − q_i)
12:    λ_{4,i}^{t+1} = λ_{4,i}^t + ρ((W_i^{t+1})^⊤1 − 1)
13:    Λ_{5,i}^{t+1} = Λ_{5,i}^t + ρ(S_i^{t+1} − W_i^{t+1})
14:    λ_{6,i}^{t+1} = λ_{6,i}^t + ρ((H_i^{t+1})^⊤δ + β_i^{t+1} − b_i ⊙ B_iη)
15:    Λ_{8,i}^{t+1} = Λ_{8,i}^t + ρ(S_i^{t+1} − H_i^{t+1})
16:    Λ_{10,i}^{t+1} = Λ_{10,i}^t + ρ(S_i^{t+1} − Z_i^{t+1})
     end for
17:  λ_2^{t+1} = λ_2^t + ρ(ω^{t+1} − R(Σ_{i=1}^n u_i^{t+1}))
18:  λ_7^{t+1} = λ_7^t + ρ(α ⊙ (∆^⊤u^{t+1} − γ) − Ĩc̃^{t+1})
19:  λ_9^{t+1} = λ_9^t + ρ((c̃^{t+1})^⊤1̃ + β̃^{t+1} − Ω)
   end for
20: Return: S_i^{T̃}, ∀i = 1, ..., n
can be learned by observing drivers' and passengers' behavior, learning VOT is beyond the scope of our work, and we assume it is known. Moreover, the BPR function used in our simulations to compute travel time can sometimes be inaccurate. However, our modular design allows for any non-linear travel time computation function, offering flexibility in practice. A potential practical limitation of the platform is that we assume the assigned routes will be followed. With autonomous vehicles, it will be easier to enforce the assigned routes. Moreover, delivery companies can require their drivers to follow specific routes, and ride-hailing companies can ensure compliance by incentivizing passengers/drivers who accept routes (by paying them). Another concern is protecting the privacy of individuals' data because of legal, ethical, and practical constraints. To address this concern, we can adopt approaches similar to those used in previous incentivization projects with real-world implementations [33, 34, 35, 39]. We can also examine the scalability of our incentivization platform from various angles. Our modular design allows for the use of various prediction models, such as traffic prediction and OD estimation, tailored to different scenarios. Moreover, organizations with access to scalable real-time traffic prediction software can provide ETA predictions. The platform also offers flexibility in utilizing different VOTs for organizations with diverse operational natures.
6.2.5 Supplementary Figure
Figure 6.3: Data preparation workflow. First, traffic data and sensors’ location data are received
from the ADMS Server. Next, sensors’ location data is processed to compute sensor distances.
Finally, sensor distances and traffic data are combined to create the graph network data.
[Figure: the original time-series data is broken into disjoint blocks, and the blocks are resampled randomly with replacement.]

Figure 6.4: Nonoverlapping block bootstrap on time series data of size n with block length of 2.
6.3 Appendix - Chapter 4
6.3.1 Nonoverlapping Block Bootstrap (NBB)
The single-data-value bootstrap scheme introduced by [149] is not applicable when the crucial assumption of sample independence is violated, as demonstrated by [150]. In such cases, the Nonoverlapping Block Bootstrap (NBB) [151] provides a suitable approach for bootstrapping time series data with dependent samples. Let (X_{t+1}, X_{t+2}, ..., X_{t+n}) represent the observed time series data. NBB partitions these observations into b = n/l disjoint blocks, each of length l:

    B₁ = (X_{t+1}, X_{t+2}, ..., X_{t+l})
    B₂ = (X_{t+l+1}, X_{t+l+2}, ..., X_{t+2l})
    ⋮
    B_b = (X_{t+(b−1)l+1}, X_{t+(b−1)l+2}, ..., X_{t+n})

and resamples randomly with replacement from these blocks instead of individual samples. Fig. 6.4 illustrates the NBB method for time series data.
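The NBB procedure above can be sketched in a few lines: split the series into b = n/l disjoint blocks of length l, then sample whole blocks (not individual points) with replacement, so that within-block dependence is preserved. The series and block length below are our own illustrative choices.

```python
import numpy as np

def nbb_resample(x, block_len, rng):
    # Partition into disjoint blocks B_1, ..., B_b of length block_len, then
    # draw b blocks uniformly with replacement and concatenate them.
    x = np.asarray(x)
    n_blocks = len(x) // block_len
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    picks = rng.integers(0, n_blocks, size=n_blocks)   # resample blocks
    return blocks[picks].ravel()

rng = np.random.default_rng(0)
series = np.arange(12)                 # n = 12 dependent "observations"
sample = nbb_resample(series, 3, rng)  # b = 4 blocks of length l = 3

assert sample.shape == (12,)
# Within every resampled block, consecutive values stay contiguous:
assert all(sample[i] + 1 == sample[i + 1] for i in range(11) if i % 3 != 2)
```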
Bibliography
[1] Bob Pishue. “2023 INRIX Global Traffic Scorecard”. In: INRIX (January 2023). 2023.
[2] Congestion Mitigation and Air Quality Improvement Program: Assessing 10 Years of
Experience. URL: http://www.trb.org/main/blurbs/160904.aspx.
[3] Health Effects Institute. Panel on the Health Effects of Traffic-Related Air Pollution. Traffic-related air pollution: a critical review of the literature on emissions, exposure, and health effects. 17. Health Effects Institute, 2010.
[4] Kai Zhang and Stuart Batterman. “Air pollution and health risks due to vehicle traffic”. In:
Science of the total Environment 450 (2013), pp. 307–316.
[5] Dwight A Hennessy and David L Wiesenthal. “Traffic congestion, driver stress, and driver
aggression”. In: Aggressive Behavior: Official Journal of the International Society for
Research on Aggression 25.6 (1999), pp. 409–423.
[6] Melvin L Selzer and Amiram Vinokur. “Life events, subjective stress, and traffic accidents”.
In: American Journal of Psychiatry 131.8 (1974), pp. 903–906.
[7] Cambridge Systematics. Traffic congestion and reliability: Trends and advanced strategies
for congestion mitigation. Tech. rep. United States. Federal Highway Administration, 2005.
[8] Robert Cervero. “Road expansion, urban growth, and induced travel: A path analysis”. In:
Journal of the American Planning Association 69.2 (2003), pp. 145–163.
[9] Jonathan Vespa, David M Armstrong, and Lauren Medina. Demographic turning points for
the United States: Population projections for 2020 to 2060. US Department of Commerce,
Economics and Statistics Administration, 2018.
[10] Arthur C Pigou. “The economics of welfare Macmillan and Co”. In: London, United
Kingdom (1920).
[11] Frank H Knight. “Some fallacies in the interpretation of social cost”. In: The Quarterly
Journal of Economics 38.4 (1924), pp. 582–606.
[12] Erik Verhoef et al., eds. Pricing in Road Transport. Books 4192. Edward Elgar Publishing, July 2008. URL: https://ideas.repec.org/b/elg/eebook/4192.html.
[13] Dirk Hendrik Van Amelsfort. “Behavioural responses and network effects of time-varying
road pricing”. In: (2009).
[14] Nan Zheng, Guillaume Rérat, and Nikolas Geroliminis. “Time-dependent area-based pricing
for multimodal systems with heterogeneous users in an agent-based environment”. In:
Transportation Research Part C: Emerging Technologies 62 (2016), pp. 133–148.
[15] Carlos F Daganzo and Lewis J Lehe. “Distance-dependent congestion pricing for downtown
zones”. In: Transportation Research Part B: Methodological 75 (2015), pp. 89–99.
[16] Shu Zhang, Ann M Campbell, and Jan F Ehmke. “Impact of congestion pricing schemes
on costs and emissions of commercial fleets in urban areas”. In: Networks 73.4 (2019),
pp. 466–489.
[17] Jasper Knockaert et al. “The Spitsmijden experiment: A reward to battle congestion”. In:
Transport Policy 24 (2012), pp. 260–272.
[18] Charles Raux and Stéphanie Souche. “The acceptability of urban road pricing: A theoretical
analysis applied to experience in Lyon”. In: Journal of Transport Economics and Policy
(JTEP) 38.2 (2004), pp. 191–215.
[19] Ziyuan Gu et al. “Congestion pricing practices and public acceptance: A review of evidence”.
In: Case Studies on Transport Policy 6.1 (2018), pp. 94–101.
[20] Ziyuan Gu et al. “Optimal distance-and time-dependent area-based pricing with the Network
Fundamental Diagram”. In: Transportation Research Part C: Emerging Technologies 95
(2018), pp. 1–28.
[21] Erik Verhoef, Peter Nijkamp, and Piet Rietveld. “Tradeable permits: their potential in the
regulation of road transport externalities”. In: Environment and Planning B: Planning and
Design 24.4 (1997), pp. 527–548.
[22] Wenbo Fan and Xinguo Jiang. “Tradable mobility permits in roadway capacity allocation:
Review and appraisal”. In: Transport Policy 30 (2013), pp. 132–142.
[23] John Thøgersen and Berit Møller. “Breaking car use habits: The effectiveness of a free
one-month travelcard”. In: Transportation 35.3 (2008), pp. 329–345.
[24] Guangmin Wang et al. “Models and a relaxation algorithm for continuous network design
problem with a tradable credit scheme and equity constraints”. In: Computers & Operations
Research 41 (2014), pp. 252–261.
[25] Theodore Tsekeris and Stefan Voß. “Design and evaluation of road pricing: state-of-the-art and methodological advances”. In: NETNOMICS: Economic Research and Electronic Networking 10.1 (2009), pp. 5–52.
[26] Hideki Fukui. “An empirical analysis of airport slot trading in the United States”. In:
Transportation Research Part B: Methodological 44.3 (2010), pp. 330–357.
[27] Nico Dogterom, Dick Ettema, and Martin Dijst. “Tradable credits for managing car travel:
a review of empirical research and relevant behavioural approaches”. In: Transport Reviews
37.3 (2017), pp. 322–343.
[28] Carlos Lima Azevedo et al. “Tripod: sustainable travel incentives with prediction, optimization, and personalization”. In: Transportation Research Board 97th Annual Meeting.
2018.
[29] Jack Williams Brehm. A Theory of Psychological Reactance. New York: Academic Press,
1966.
[30] J Knockaert et al. “Experimental design and modelling Spitsmijden”. In: Utrecht, Consortium Spitsmijden (2007).
[31] David M Kreps. “Intrinsic motivation and extrinsic incentives”. In: The American Economic
Review 87.2 (1997), pp. 359–364.
[32] Kent C Berridge. “Reward learning: reinforcement, incentives, and expectations.” In: (2001).
[33] Deepak Merugu, Balaji S Prabhakar, and N Rama. “An incentive mechanism for decongesting the roads: A pilot program in Bangalore”. In: Proc. of ACM NetEcon Workshop.
Citeseer. 2009.
[34] Jia Shuo Yue et al. “Reducing road congestion through incentives: a case study”. In: (2015).
[35] Michiel Bliemer, Matthijs Dicke-Ogenia, and Dick Ettema. “Rewarding for avoiding the
peak period: A synthesis of three studies in the Netherlands”. In: European Transport
Conference 2009. Citeseer. 2009.
[36] Eran Ben-Elia, Dick Ettema, and Hennie Boeije. “Behaviour change dynamics in response
to rewarding rush-hour avoidance: A qualitative research approach”. In: (2011).
[37] Eran Ben-Elia and Dick Ettema. “Changing commuters’ behavior using rewards: A study
of rush-hour avoidance”. In: Transportation Research Part F: Traffic Psychology and
Behaviour 14.5 (2011), pp. 354–368.
[38] Eran Ben-Elia and Dick Ettema. “Rewarding rush-hour avoidance: A study of commuters’
travel behavior”. In: Transportation Research Part A: Policy and Practice 45.7 (2011),
pp. 567–582.
[39] Xianbiao Hu, Yi-Chang Chiu, and Lei Zhu. “Behavior insights for an incentive-based active
demand management platform”. In: International Journal of Transportation Science and
Technology 4.2 (2015), pp. 119–133.
[40] Vivek Kumar et al. “Impacts of incentive-based intervention on peak period traffic: experience from the Netherlands”. In: Transportation Research Record 2543.1 (2016), pp. 166–175.
[41] Aristotelis-Angelos Papadopoulos et al. “Coordinated freight routing with individual incentives for participation”. In: IEEE Transactions on Intelligent Transportation Systems 20.9
(2018), pp. 3397–3408.
[42] Ioannis Kordonis, Maged M Dessouky, and Petros A Ioannou. “Mechanisms for cooperative
freight routing: incentivizing individual participation”. In: IEEE Transactions on Intelligent
Transportation Systems 21.5 (2019), pp. 2155–2166.
[43] Aristotelis-Angelos Papadopoulos et al. “Personalized Pareto-improving pricing-and-routing
schemes for near-optimum freight routing: An alternative approach to congestion pricing”.
In: Transportation Research Part C: Emerging Technologies 125 (2021), p. 103004.
[44] Satoshi Fuji and Ryuichi Kitamura. “What does a one-month free bus ticket do to habitual
drivers”. In: Transportation 30 (2003), pp. 81–95.
[45] Sebastian Bamberg, Icek Ajzen, and Peter Schmidt. “Choice of travel mode in the theory
of planned behavior: The roles of past behavior, habit, and reasoned action”. In: Basic and
Applied Social Psychology 25.3 (2003), pp. 175–187.
[46] Graham Currie. Free Fare Incentives to Shift Rail Demand Peaks–Medium-term Impacts.
Tech. rep. 2011.
[47] Zheng Zhang, Hidemichi Fujii, and Shunsuke Managi. “How does commuting behavior
change due to incentives? An empirical study of the Beijing Subway System”. In: Transportation Research Part F: Traffic Psychology and Behaviour 24 (2014), pp. 17–26.
[48] Shiwali Mohan, Matthew Klenk, and Victoria Bellotti. “Exploring How to Personalize
Travel Mode Recommendations For Urban Transportation.” In: IUI Workshops. 2019.
[49] Xi Zhu et al. “Personalized incentives for promoting sustainable travel behaviors”. In:
Transportation Research Part C: Emerging Technologies (2019).
[50] Amy He. People Continue to Rely on Maps and Navigational Apps. URL: https://www.emarketer.com/content/people-continue-to-rely-on-maps-and-navigational-apps-emarketer-forecasts-show. Accessed: 2021-12-12.
[51] Riley Panko. The Popularity of Google Maps: Trends in Navigation Apps in 2018. URL: https://themanifest.com/app-development/trends-navigation-apps. Accessed: 2021-12-12.
[52] Grocery continues to power growth for DoorDash. URL: https://www.grocerydive.com/news/grocery-continues-to-power-growth-for-doordash/643120/. Accessed: 2023-04-09.
[53] Uber Announces Results for Fourth Quarter and Full Year 2022. URL: https://investor.uber.com/news-events/news/press-release-details/2023/Uber-Announces-Results-for-Fourth-Quarter-and-Full-Year-2022/default.aspx. Accessed: 2023-04-09.
[54] Austin Derrow-Pinion et al. “Eta prediction with graph neural networks in google maps”.
In: Proceedings of the 30th ACM international conference on information & knowledge
management. 2021, pp. 3767–3776.
[55] DeepETA: How Uber Predicts Arrival Times Using Deep Learning. URL: https://w
ww . uber . com / blog / deepeta - how - uber - predicts - arrival - times/. Accessed:
2024-21-11.
[56] Managing Supply and Demand Balance Through Machine Learning. URL: https://care
ersatdoordash.com/blog/managing-supply-and-demand-balance-through-mac
hine-learning/?utm_source=chatgpt.com. Accessed: 2024-21-11.
[57] Suresh Chavhan and Pallapa Venkataram. “Prediction based traffic management in a
metropolitan area”. In: Journal of traffic and transportation engineering (English edition)
7.4 (2020), pp. 447–466.
[58] Xiaoning Dou et al. “Machine Learning for Smart Cities: A Comprehensive Review of
Applications and Opportunities”. In: International Journal of Advanced Computer Science
and Applications 14.9 (2023).
[59] Eric Zivot and Jiahui Wang. “Vector autoregressive models for multivariate time series”. In:
Modeling financial time series with S-PLUS® (2006), pp. 385–429.
[60] Mohammed S Ahmed and Allen R Cook. Analysis of freeway traffic time-series data by
using Box-Jenkins techniques. 722. 1979.
[61] Billy M Williams, Priya K Durvasula, and Donald E Brown. “Urban freeway traffic flow
prediction: application of seasonal autoregressive integrated moving average and exponential
smoothing models”. In: Transportation Research Record 1644.1 (1998), pp. 132–141.
[62] Billy M Williams and Lester A Hoel. “Modeling and forecasting vehicular traffic flow
as a seasonal ARIMA process: Theoretical basis and empirical results”. In: Journal of
transportation engineering 129.6 (2003), pp. 664–672.
[63] Lun Zhang et al. “An improved k-nearest neighbor model for short-term traffic flow prediction”. In: Procedia-Social and Behavioral Sciences 96 (2013), pp. 653–662.
133
[64] Zuduo Zheng and Dongcai Su. “Short-term traffic volume forecasting: A k-nearest neighbor
approach enhanced by constrained linearly sewing principle component algorithm”. In:
Transportation Research Part C: Emerging Technologies 43 (2014), pp. 143–157.
[65] Xinxin Feng et al. “Adaptive multi-kernel SVM with spatial–temporal correlation for shortterm traffic flow prediction”. In: IEEE Transactions on Intelligent Transportation Systems
20.6 (2018), pp. 2001–2013.
[66] Chun-Hsin Wu, Jan-Ming Ho, and Der-Tsai Lee. “Travel-time prediction with support
vector regression”. In: IEEE transactions on intelligent transportation systems 5.4 (2004),
pp. 276–281.
[67] Young-Seon Jeong et al. “Supervised weighting-online learning algorithm for short-term
traffic flow prediction”. In: IEEE Transactions on Intelligent Transportation Systems 14.4
(2013), pp. 1700–1707.
[68] Nikolay Laptev et al. “Time-series extreme event forecasting with neural networks at uber”.
In: International conference on machine learning. Vol. 34. sn. 2017, pp. 1–5.
[69] Rui Fu, Zuo Zhang, and Li Li. “Using LSTM and GRU neural network methods for traffic
flow prediction”. In: 2016 31st Youth academic annual conference of Chinese association
of automation (YAC). IEEE. 2016, pp. 324–328.
[70] Junbo Zhang, Yu Zheng, and Dekang Qi. “Deep spatio-temporal residual networks for
citywide crowd flows prediction”. In: Proceedings of the AAAI conference on artificial
intelligence. Vol. 31. 1. 2017.
[71] Huaxiu Yao et al. “Deep multi-view spatial-temporal network for taxi demand prediction”.
In: Proceedings of the AAAI conference on artificial intelligence. Vol. 32. 1. 2018.
[72] Yaguang Li et al. “Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic
Forecasting”. In: International Conference on Learning Representations. 2018. URL: https
://openreview.net/forum?id=SJiHXGWAZ.
[73] Ling Zhao et al. “T-GCN: A temporal graph convolutional network for traffic prediction”.
In: IEEE transactions on intelligent transportation systems 21.9 (2019), pp. 3848–3858.
[74] Bing Yu, Haoteng Yin, and Zhanxing Zhu. “Spatio-temporal graph convolutional networks:
A deep learning framework for traffic forecasting”. In: arXiv preprint arXiv:1709.04875
(2017).
[75] Shengnan Guo et al. “Attention based spatial-temporal graph convolutional networks for
traffic flow forecasting”. In: Proceedings of the AAAI conference on artificial intelligence.
Vol. 33. 01. 2019, pp. 922–929.
134
[76] Chuanpan Zheng et al. “Gman: A graph multi-attention network for traffic prediction”. In:
Proceedings of the AAAI conference on artificial intelligence. Vol. 34. 01. 2020, pp. 1234–
1241.
[77] Domino’s CEO - I’m not handing my data to third party digital delivery providers. URL: ht
tps://diginomica.com/dominos-ceo-im-not-handing-my-data-third-party-d
igital-delivery-providers. Accessed: 2024-09-09.
[78] Facebook-Cambridge Analytica: A timeline of the data hijacking scandal. URL: https://w
ww.cnbc.com/2018/04/10/facebook-cambridge-analytica-a-timeline-of-the
-data-hijacking-scandal.html. Accessed: 2024-09-09.
[79] Fitness tracking app Strava gives away location of secret US army bases. URL: https://w
ww.theguardian.com/world/2018/jan/28/fitness-tracking-app-gives-awaylocation-of-secret-us-army-bases. Accessed: 2024-21-11.
[80] White House. Blueprint for an ai bill of rights: Making automated systems work for the
american people. Nimble Books, 2022.
[81] General Data Protection Regulation GDPR. “General data protection regulation”. In: Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on
the protection of natural persons with regard to the processing of personal data and on the
free movement of such data, and repealing Directive 95/46/EC (2016).
[82] Erin Illman and Paul Temple. “California consumer privacy act”. In: The Business Lawyer
75.1 (2019), pp. 1637–1646.
[83] Jeff KUO. “China’s Personal Information Protection Law (PIPL)–Data Privacy in the Land
of Big Data”. In: Lexology, London 13 (2021).
[84] Meta Fined $1.3 Billion for Violating E.U. Data Privacy Rules. URL: https://www.nytim
es.com/2023/05/22/business/meta-facebook-eu-privacy-fine.html. Accessed:
2024-10-15.
[85] Cynthia Dwork et al. “Calibrating noise to sensitivity in private data analysis”. In: Theory
of cryptography conference. Springer. 2006, pp. 265–284.
[86] Ahmed El Ouadrhiri and Ahmed Abdelhadi. “Differential privacy for deep and federated
learning: A survey”. In: IEEE access 10 (2022), pp. 22359–22380.
[87] Andrew Lowy, Ali Ghafelebashi, and Meisam Razaviyayn. “Private non-convex federated
learning without a trusted server”. In: International Conference on Artificial Intelligence
and Statistics. PMLR. 2023, pp. 5749–5786.
135
[88] Mengmeng Yang et al. “Local differential privacy and its applications: A comprehensive
survey”. In: Computer Standards & Interfaces (2023), p. 103827.
[89] Uber. Uber Releases Open Source Project for Differential Privacy. URL: https://medium
.com/uber-security-privacy/differential-privacy-open-source-7892c82c4
2b6. 2017.
[90] Uber. Learning Iconic Scenes with Differential Privacy. URL: https://machinelearnin
g.apple.com/research/scenes-differential-privacy. 2017.
[91] Apple Differential Privacy Team. Learning with Privacy at Scale. URL: https://docs-as
sets.developer.apple.com/ml-research/papers/learning-with-privacy-atscale.pdf. 2017.
[92] Apple, Differential Privacy Technial Overview. URL: https://www.apple.com/privacy
/docs/Differential_Privacy_Overview.pdf. Accessed: 2024-21-11.
[93] U Bureau. “Disclosure avoidance for the 2020 census: An introduction”. In: (2020).
[94] Cynthia Dwork et al. “Our data, ourselves: Privacy via distributed noise generation”. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques.
Springer. 2006, pp. 486–503.
[95] Martin Abadi et al. “Deep learning with differential privacy”. In: Proceedings of the 2016
ACM SIGSAC conference on computer and communications security. 2016, pp. 308–318.
[96] Nicolas Papernot et al. “Semi-supervised knowledge transfer for deep learning from private
training data”. In: arXiv preprint arXiv:1610.05755 (2016).
[97] Vitaly Feldman et al. “Privacy amplification by iteration”. In: 2018 IEEE 59th Annual
Symposium on Foundations of Computer Science (FOCS). IEEE. 2018, pp. 521–532.
[98] Andrew Lowy et al. “Optimal Differentially Private Model Training with Public Data”. In:
Forty-first International Conference on Machine Learning.
[99] Cynthia Dwork, Aaron Roth, et al. “The algorithmic foundations of differential privacy”. In:
Foundations and Trends® in Theoretical Computer Science 9.3–4 (2014), pp. 211–407.
[100] GitHub codes. URL: https://github.com/ghafeleb/Congestion-Reduction-viaPersonalized-Incentives. Accessed: 2023-02-15.
[101] GitHub Codes. URL: https://github.com/ghafeleb/Incentive_Systems_for_New
_Mobility_Services. Accessed: 2021-12-30.
136
[102] Chenfeng Xiong et al. “An integrated and personalized traveler information and incentive scheme for energy efficient mobility systems”. In: Transportation Research Part C:
Emerging Technologies (2019).
[103] Yin Zheng et al. “A neural autoregressive approach to collaborative filtering”. In: International Conference on Machine Learning. PMLR. 2016, pp. 764–773.
[104] Wei Ma and Zhen Sean Qian. “Estimating multi-year 24/7 origin-destination demand using
high-granular multi-source traffic data”. In: Transportation Research Part C: Emerging
Technologies 96 (2018), pp. 96–121.
[105] United States. Bureau of Public Roads. Traffic assignment manual for application with a
large, high speed computer. Vol. 37. US Department of Commerce, Bureau of Public Roads,
Office of Planning, Urban Planning Division, 1964.
[106] Juan Perdomo et al. “Performative prediction”. In: International Conference on Machine
Learning. PMLR. 2020, pp. 7599–7609.
[107] Michael Grant and Stephen Boyd. CVX: Matlab Software for Disciplined Convex Programming, version 2.1. URL: http://cvxr.com/cvx. Mar. 2014.
[108] Michael Grant and Stephen Boyd. “Graph implementations for nonsmooth convex programs”. In: Recent Advances in Learning and Control. Ed. by V. Blondel, S. Boyd, and
H. Kimura. Lecture Notes in Control and Information Sciences. URL: http://stanford
.edu/~boyd/graph_dcp.html. Springer-Verlag Limited, 2008, pp. 95–110.
[109] Stephen Boyd, Neal Parikh, and Eric Chu. Distributed optimization and statistical learning
via the alternating direction method of multipliers. Now Publishers Inc, 2011.
[110] Daniel Gabay and Bertrand Mercier. “A dual algorithm for the solution of nonlinear
variational problems via finite element approximation”. In: Computers & Mathematics with
Applications 2.1 (1976), pp. 17–40.
[111] Roland Glowinski and A Marroco. “Sur l’approximation, par éléments finis d’ordre un, et
la résolution, par pénalisation-dualité d’une classe de problèmes de Dirichlet non linéaires”.
In: ESAIM: Mathematical Modelling and Numerical Analysis-Modélisation Mathématique
et Analyse Numérique 9.R2 (1975), pp. 41–76.
[112] Mingyi Hong, Zhi-Quan Luo, and Meisam Razaviyayn. “Convergence analysis of alternating
direction method of multipliers for a family of nonconvex problems”. In: SIAM Journal on
Optimization 26.1 (2016), pp. 337–364.
[113] Tianjian Huang et al. “Alternating direction method of multipliers for quantization”. In:
International Conference on Artificial Intelligence and Statistics. PMLR. 2021, pp. 208–
216.
137
[114] Babak Barazandeh et al. “Efficient Algorithms for Estimating the Parameters of Mixed
Linear Regression Models”. In: arXiv preprint arXiv:2105.05953 (2021).
[115] Tian Li et al. “Federated learning: Challenges, methods, and future directions”. In: IEEE
Signal Processing Magazine 37.3 (2020), pp. 50–60.
[116] Andrew Lowy, Ali Ghafelebashi, and Meisam Razaviyayn. “Private Non-Convex Federated
Learning Without a Trusted Server”. In: Proceedings of The 26th International Conference
on Artificial Intelligence and Statistics. Ed. by Francisco Ruiz, Jennifer Dy, and Jan-Willem
van de Meent. Vol. 206. Proceedings of Machine Learning Research. PMLR, 2023, pp. 5749–
5786. URL: https://proceedings.mlr.press/v206/lowy23a.html.
[117] Cynthia Dwork. “Differential privacy”. In: International Colloquium on Automata, Languages, and Programming. Springer. 2006, pp. 1–12.
[118] Brendan McMahan et al. “Communication-efficient learning of deep networks from decentralized data”. In: Artificial intelligence and statistics. PMLR. 2017, pp. 1273–1282.
[119] Bargav Jayaraman et al. “Distributed learning without distress: Privacy-preserving empirical
risk minimization”. In: Advances in Neural Information Processing Systems 31 (2018).
[120] Maxence Noble, Aurélien Bellet, and Aymeric Dieuleveut. “Differentially private federated
learning on heterogeneous data”. In: International Conference on Artificial Intelligence and
Statistics. PMLR. 2022, pp. 10110–10145.
[121] Bingsheng He and Xiaoming Yuan. “On the O(1/n) convergence rate of the Douglas–
Rachford alternating direction method”. In: SIAM Journal on Numerical Analysis 50.2
(2012), pp. 700–709.
[122] Chrysovalantis Anastasiou et al. “Admsv2: A modern architecture for transportation data
management and analysis”. In: Proceedings of the 2nd ACM SIGSPATIAL International
Workshop on Advances on Resilient and Intelligent Cities. 2019, pp. 25–28.
[123] Henk J Van Zuylen and Luis G Willumsen. “The most likely trip matrix estimated from
traffic counts”. In: Transportation Research Part B: Methodological 14.3 (1980), pp. 281–
293.
[124] Ennio Cascetta. “Estimation of trip matrices from traffic counts and survey data: a generalized least squares estimator”. In: Transportation Research Part B: Methodological 18.4-5
(1984), pp. 289–299.
[125] Michael GH Bell. “The real time estimation of origin-destination flows in the presence of
platoon dispersion”. In: Transportation Research Part B: Methodological 25.2-3 (1991),
pp. 115–125.
138
[126] Sharminda Bera and KV Rao. “Estimation of origin-destination matrix from traffic counts:
the state of the art”. In: European Transport\Trasporti Europei (2011).
[127] Panchamy Krishnakumari et al. “A data driven method for OD matrix estimation”. In:
Transportation Research Part C: Emerging Technologies (2019).
[128] Stefano Carrese et al. “Dynamic demand estimation and prediction for traffic urban networks
adopting new data sources”. In: Transportation Research Part C: Emerging Technologies 81
(2017), pp. 83–98.
[129] Marialisa Nigro, Ernesto Cipriani, and Andrea del Giudice. “Exploiting floating car data for
time-dependent origin–destination matrices estimation”. In: Journal of Intelligent Transportation Systems 22.2 (2018), pp. 159–174.
[130] JinYoung Kim et al. “Using electronic toll collection data to understand traffic demand”. In:
Journal of Intelligent Transportation Systems 18.2 (2014), pp. 190–203.
[131] Juan Camilo Castillo. “Who Benefits from Surge Pricing?” In: Available at SSRN 3245533
(2020).
[132] Ali Ghafelebashi, Meisam Razaviyayn, and Maged Dessouky. “Incentive Systems for Fleets
of New Mobility Services”. In: arXiv preprint arXiv:2312.02341 (2023).
[133] Ali Ghafelebashi, Meisam Razaviyayn, and Maged Dessouky. “Congestion reduction via
personalized incentives”. In: Transportation Research Part C: Emerging Technologies 152
(2023), p. 104153.
[134] Weiwei Jiang et al. “Graph neural network for traffic forecasting: The research progress”.
In: ISPRS International Journal of Geo-Information 12.3 (2023), p. 100.
[135] TomTom, Traffic APIs. URL: https://www.tomtom.com/products/traffic-apis/.
Accessed: 2024-09-09.
[136] Mapbox, Traffic Data. URL: https://www2.census.gov/library/publications/dec
ennial/2020/census-briefs/c2020br-03.pdf. Accessed: 2024-09-09.
[137] INRIX 2022 New York City Scorecard. URL: https://www.inrix.com/scorecard-cit
y-2022. Accessed: 2023-11-10.
[138] TLC Trip Record Data. URL: https://www.nyc.gov/site/tlc/about/tlc-trip-rec
ord-data.page. Accessed: 2023-11-10.
[139] Data Dictionary – High Volume FHV Trip Records. URL: https://www.nyc.gov/ass
ets/tlc/downloads/pdf/data_dictionary_trip_records_hvfhs.pdf. Accessed:
2023-11-10.
139
[140] City of New York Department of Transportation - Data of Speed Detectors. URL: https://d
ata.cityofnewyork.us/Transportation/DOT-Traffic-Speeds-NBE/i4gi-tjb9.
Accessed: 2023-11-10.
[141] Metadata - City of New York Department of Transportation - Data of Speed Detectors.
URL: https://data.cityofnewyork.us/api/views/i4gi-tjb9/files/cc7f3b15-
58b7-46e3-94e7-4c5753c3a8b8?download=true&filename=metadata_trafficsp
eeds.pdf. Accessed: 2023-11-10.
[142] Rongzhou Huang et al. “LSGCN: Long short-term traffic prediction with graph convolutional networks.” In: IJCAI. Vol. 7. 2020, pp. 2355–2361.
[143] I Loshchilov. “Decoupled weight decay regularization”. In: arXiv preprint arXiv:1711.05101
(2017).
[144] Benedek Rozemberczki et al. “PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models”. In: Proceedings of the 30th ACM International
Conference on Information and Knowledge Management. 2021, 4564–4573.
[145] Ashkan Yousefpour et al. “Opacus: User-Friendly Differential Privacy Library in PyTorch”.
In: arXiv preprint arXiv:2109.12298 (2021).
[146] Bradley Efron and Robert J Tibshirani. An introduction to the bootstrap. Chapman and
Hall/CRC, 1994.
[147] Cynthia Dwork et al. “Preserving statistical validity in adaptive data analysis”. In: Proceedings of the forty-seventh annual ACM symposium on Theory of computing. 2015, pp. 117–
126.
[148] Christopher Jung et al. “A new analysis of differential privacy’s generalization guarantees”.
In: arXiv preprint arXiv:1909.03577 (2019).
[149] Bradley Efron. “Bootstrap methods: another look at the jackknife”. In: Breakthroughs in
statistics: Methodology and distribution. Springer, 1992, pp. 569–593.
[150] Kesar Singh. “On the asymptotic accuracy of Efron’s bootstrap”. In: The annals of statistics
(1981), pp. 1187–1195.
[151] Edward Carlstein. “The use of subseries values for estimating the variance of a general
statistic from a stationary sequence”. In: The annals of statistics (1986), pp. 1171–1179.
140
Abstract
With rapid population growth and urban development, traffic congestion has become an inescapable issue, especially in large cities. Many congestion reduction strategies have been proposed in the past, ranging from roadway extension to transportation demand management. In particular, congestion pricing schemes have been used as negative reinforcements for traffic control. In this dissertation, we study an alternative approach of offering positive incentives to drivers and organizations to change drivers' routing behavior. More specifically, we propose algorithms to reduce traffic congestion and improve routing efficiency by offering incentives to drivers and organizations. The incentives are offered after solving large-scale optimization problems in order to minimize the total travel time (or any other cost function of the network, such as total carbon emissions). Due to the massive size of the optimization problems, we develop distributed computational approaches. The proposed distributed algorithms are guaranteed to converge under a mild set of assumptions that are verified with real data. Utilizing real-time traffic data from Los Angeles, we demonstrate congestion reduction in arterial roads and highways through our algorithms and an extensive set of numerical experiments. Finally, we introduce a collaboration framework that enables organizations to train private traffic prediction models to share with the central planner or other organizations. Our framework leverages the concept of differential privacy (DP) to protect the privacy of users. This approach empowers the central planner to predict organizations' traffic patterns and make informed incentivization decisions without access to their data or compromising their privacy. Additionally, organizations can improve their traffic prediction models by utilizing our private collaboration framework.
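The routing optimization described above can be illustrated with a toy instance that is not the dissertation's actual formulation: a central planner chooses how many drivers to incentivize onto each of two parallel routes so that total network travel time is minimized, with congestion modeled by the standard BPR volume-delay function. All numbers (free-flow times, capacities, demand) are hypothetical.

```python
import numpy as np

# Hypothetical two-route network: free-flow times (minutes) and capacities.
T0 = np.array([10.0, 14.0])   # free-flow travel times per route
CAP = np.array([40.0, 60.0])  # route capacities (vehicles)
N = 80                        # total drivers to route

def bpr(t0, v, c):
    """Standard BPR latency: travel time grows with the volume/capacity ratio."""
    return t0 * (1.0 + 0.15 * (v / c) ** 4)

def total_time(n1):
    """Total system travel time if n1 drivers take route 1 and the rest route 2."""
    v = np.array([n1, N - n1], dtype=float)
    return float(np.sum(v * bpr(T0, v, CAP)))

# System-optimal split: the assignment a central planner would incentivize.
best = min(range(N + 1), key=total_time)
print(best, N - best, round(total_time(best), 1))  # prints: 38 42 1035.6
```

The dissertation solves vastly larger instances of this kind of problem, which is why the brute-force search above must be replaced by distributed optimization methods at scale.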
We evaluate and validate the performance improvements in organizations’ traffic prediction models achieved through collaboration within the presented framework. This evaluation is based on real-world taxi data from Uber and Lyft and simulated organizations with varying market shares under different privacy levels and collaboration scenarios.
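The privacy guarantee behind such a collaboration can be illustrated with a standard Gaussian-mechanism sketch; this is a generic textbook construction, not the dissertation's exact mechanism, and all data and parameters below are hypothetical. An organization clips each record to bound any single trip's influence, then adds calibrated noise before sharing a statistic, so the released value is (ε, δ)-differentially private.

```python
import numpy as np

def private_mean(values, lo, hi, epsilon, delta, rng):
    """Release a differentially private mean of bounded values.

    Clipping to [lo, hi] bounds each record's influence; the noise scale
    follows the standard Gaussian-mechanism calibration for (eps, delta)-DP.
    """
    n = len(values)
    clipped = np.clip(values, lo, hi)
    sensitivity = (hi - lo) / n  # max change in the mean from altering one record
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return float(clipped.mean() + rng.normal(0.0, sigma))

rng = np.random.default_rng(0)
speeds = rng.uniform(5, 65, size=10_000)  # synthetic per-trip speeds (mph)
private_avg = private_mean(speeds, 0.0, 80.0, epsilon=1.0, delta=1e-5, rng=rng)
print(round(private_avg, 2))  # close to the true mean; exact value depends on the noise draw
```

With many records the noise is small relative to the signal, which is what lets a central planner learn aggregate traffic patterns without seeing any organization's raw trip data.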