Utilizing real-world traffic data to forecast the impact of traffic incidents
(USC Thesis Other)
Utilizing Real-World Traffic Data to Forecast the Impact of Traffic Incidents
by Bei (Penny) Pan

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Computer Science)

August 2014

Contents
List of Tables
List of Figures
Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions of the Research
1.3 Outline of the Dissertation
Chapter 2 Related Work
2.1 Traffic Prediction
2.1.1 Simulation Models
2.1.2 Data Mining Techniques
2.2 Impact Prediction for Traffic Incidents
2.3 Traffic Incident Detection and Analysis
2.3.1 Anomaly Detection using Traffic Data
2.3.2 Anomaly Detection using Social Media
2.4 Causality analysis in Transportation Data
Chapter 3 Forecast traffic in presence of incidents
3.1 Preliminaries
3.2 Methodology
3.2.1 Maximum Impact Backlog Modeling
3.2.2 Traffic Prediction with Incident Impact
3.3 Experiments
3.3.1 Experimental Setup
3.3.2 Predictions with Event Information
Chapter 4 Forecast impact propagation behavior of traffic incidents
4.1 Preliminaries
4.2 Impact Modeling
4.2.1 Modeling Propagation Behavior
4.2.2 Determining Minimum Impact Threshold ()
4.3 Prediction of Propagation Behavior
4.3.1 Baseline Approaches
4.3.2 Prediction with Traffic Density (PAD)
4.3.3 Prediction with Initial Behavior (PADI)
4.3.4 Discussion
4.3.5 Discussion
4.4 Prediction of Clearance Behavior
4.5 Experiments
4.5.1 Experimental Setup
4.5.2 Evaluation of Propagation Prediction
4.5.3 Evaluation of Clearance Prediction
4.5.4 Case Studies on Travel Time Calculation
Chapter 5 Analyze Traffic Events using Human Mobility and Social Media
5.1 Overview
5.1.1 Preliminaries
5.1.2 System Overview
5.2 Offline Mining
5.2.1 Modeling Taxi Trajectories
5.2.2 Modeling Routing Behavior
5.2.3 Index Building
5.3 Traffic Anomaly Detection
5.3.1 Anomalous Seed Selection
5.3.2 Anomalous Graph Expansion
5.4 Traffic Anomaly Analysis
5.4.1 Impact Analysis
5.4.2 Term Mining
5.4.3 Visualization
5.5 Experiments
5.5.1 Dataset
5.5.2 Evaluation Approach
5.5.3 Results
Chapter 6 Forecast impact on arterial streets and intersected freeways
6.1 Preliminaries
6.1.1 Granger Causality
6.1.2 Lasso-Granger
6.2 Time-Sensitive Causality Detection
6.2.1 Slowdown Causality
6.2.2 Intervention Causality
6.3 Impact Prediction
6.4 Experiments
6.4.1 Experimental Setup
6.4.2 Results on Running Examples
6.4.3 Result on Prediction Accuracy
6.4.4 Result on Travel-time Calculation
Chapter 7 Conclusion and Future work
Bibliography

List of Tables
3.1 Maximum impact backlog on incident meta-attributes
3.2 Dataset description
3.3 Relevant incident attributes
4.1 Dataset for evaluation of prediction accuracy
4.2 Evaluation parameters
5.1 Example of RP OD
5.2 Computational analysis of update procedure
5.3 Statistics of dataset
5.4 Average processing time in anomaly detection
5.5 Comparison based on # of tweets used
6.1 Dataset description
6.2 Effects of rush hour interval on X*

List of Figures
1.1 (a) route calculated based on current incident's impact (b) time-varying expansion of impacted region as driver approaches the incident location (c) route calculated based on accurate prediction of impact
1.2 Sample incident impact propagation
3.1 Impact of an accident on ARIMA and HAM
3.2 Case study on traffic collision incidents
3.3 Case study on a road construction incident
4.1 Sample traffic incident on I-5 South
4.2 Intersecting Figure 2(b) with = 60% in speed change
4.3 Sample propagation behavior
4.4 Propagation behaviors under different
4.5 Confidence interval for a sample sensor
4.6 Case studies on traffic environment
4.7 Sample prediction comparison on I-405 S.
4.8 Hierarchy structure for training
4.9 Sample prediction on I-405 S.
4.10 Behavior learning on two sample incidents
4.11 Case studies on two sample incident
4.12 Effect of impact threshold ()
4.13 Case study on impact threshold
4.14 Effect of forward lag (h)
4.15 Effects of incident occurrence time
4.16 Effects of distance metric on PADI approach
4.17 Overall results
4.18 Prediction of clearance start time and location
4.19 Scenario for sampled incident occurred on I-5 North
4.20 Scenario for sampled incident occurred on I-405 South
5.1 Concrete example
5.2 System architecture
5.3 Example of index
5.4 Sample insertion procedure
5.5 Update procedure for online index
5.6 Term mining overview
5.7 Example of analysis view
5.8 Traffic anomalies reported to authorities, discovered by the baseline PCA approach, and discovered by our method from 7AM to 9AM and from 4PM to 6PM on 5/12/2011
5.9 Effects of time intervals
5.10 Effects of index
5.11 Case study 1
5.12 Case study 2
6.1 Impact of a traffic incident
6.2 Running example 1: causality from freeway traffic to arterial traffic
6.3 Running example 1 cont.: effects of time interval in training data
6.4 Running example 2: causality from one freeway traffic to intersected freeway traffic
6.5 Running example 2 cont.: in the presence of traffic incidents
6.6 Flow chart for impact prediction
6.7 Effects of prediction horizon on running example 1
6.8 Sample traffic incident on I-10 W. at 4:28 PM
6.9 Sample incident on I-405 N. at 6:38 PM
6.10 Overall prediction accuracy on impact of freeway incidents
6.11 Case study in travel time calculation

Abstract

For the first time, real-time high-fidelity spatiotemporal data on the transportation networks of major cities have become available. This gold mine of data can be utilized to learn about the behavior of traffic congestion at different times and locations, potentially resulting in major savings in time and fuel, two important commodities of the 21st century. Therefore, how to mine valuable information from these data to enable next-generation technologies for unprecedented convenience has become a key topic in spatiotemporal data mining. By utilizing real-world transportation-related datasets, this thesis focuses on addressing the problems related to the impact of traffic incidents.
Traffic incidents refer to non-recurring issues that occur in the road network, such as traffic accidents, weather hazards, special events and construction zone closures, which contribute to approximately 50% of traffic congestion.

First, this thesis addresses the fundamental problem of traffic prediction in the presence of traffic incidents by utilizing traffic sensor data and incident reports collected from Los Angeles County road networks. The proposed prediction method overcomes the deficiency of traditional time-series prediction techniques by considering the unique characteristics of traffic speed time series. Then, using the same dataset, this thesis proposes a set of methods to predict the dynamic evolution of the impact of incidents. Using the traffic data surrounding traffic incidents, this thesis models the propagation behavior of congestion caused by archived incidents and develops a set of clustering-based techniques for predicting similar behavior in the future. Third, in addition to sensor data, this thesis mines social media and GPS trajectories to obtain a better understanding of the cause of traffic incidents. Specifically, by identifying unusual traveling behaviors and Twitter-like posts from data collected in Beijing, this work detects and analyzes the impact of traffic incidents. Finally, this thesis analyzes the causality relationship between freeway traffic and arterial traffic to provide a comprehensive prediction of the impact that incidents have on both freeways and arterial streets. As a result, next-generation navigation applications that are built on the approaches discussed in this thesis can help drivers effectively avoid the impacted area in real time and thereby save them a considerable amount of travel time.

Chapter 1 Introduction

1.1 Motivation

The two most important commodities of the 21st century are time and energy; traffic congestion wastes both.
The Texas annual Transportation report [2] estimates that 5.5 billion hours and 2.9 billion gallons of fuel were wasted due to traffic congestion in the United States in 2012. According to [45], approximately 50% of freeway congestion is caused by non-recurring incidents, such as traffic accidents, weather hazards, special events and construction zone closures. The congestion problem caused by traffic incidents has been widely studied in several disciplines, such as transportation science, civil engineering, policy planning, and operations research, through mathematical models, simulation studies and field surveys. In recent years, due to the sensor instrumentation of road networks in major cities as well as the vast availability of auxiliary commodity sensors from which traffic information can be derived (e.g., CCTV cameras, GPS devices), for the first time a large volume of real-time traffic data at very high spatial and temporal resolutions has become available. Hence, the goal of this dissertation is to quantify and predict the impact of traffic incidents on the surrounding traffic through real-world transportation data. This quantification can alleviate the significant financial and time losses caused by traffic incidents; for example, it can be used by city transportation agencies to provide evacuation plans that eliminate potential gridlock, to dispatch emergency vehicles effectively, or even to support long-term policy making.

The McKinsey report [1] predicts a worldwide consumer saving of more than $600 billion annually by 2020 for location-based services, where the biggest single consumer benefit will be time and fuel savings from navigation services tapping into real-time traffic data.
Therefore, for this dissertation, we focus on a next-generation consumer navigation system (in-car or on a smartphone), called ClearPath, as a motivating application, which can help drivers effectively plan their routes in real time by avoiding the incidents' impact areas. For example, suppose an accident is reported in real time (by crowdsourcing [59] or through agency reports or SIGALERTS [50]) in front of a driver, but the accident is 20 minutes away. If we can effectively quantify the impact of the accident, ClearPath would know that this accident will be cleared in the next 10 minutes. ClearPath would therefore guide the driver directly towards the accident, because it knows that by the time the driver arrives at the area, the accident will no longer be there.

To be more specific, consider another example illustrated in Figure 1.1. In this figure, the caution mark, the directed solid red lines, and the dashed blue lines represent the incident location, the congested region caused by the incident, and the route a driver plans to follow, respectively. Without prediction, but with knowledge of the incident, a typical navigation application, such as Waze [59], may suggest the route shown in Figure 1.1(a) to the driver. If the driver follows this route, he would be stuck in the traffic congestion caused by the incident, as illustrated in Figure 1.1(b), because the congested region has grown in the meantime. On the other hand, if we can predict how the impacted spatial span (i.e., congested region) evolves over time, ClearPath could calculate a route that effectively avoids the congestion from the beginning, as shown in Figure 1.1(c).

[Figure 1.1 panels: (a) Routing without Impact Prediction; (b) Routing without Impact Prediction (cont.)]
[Figure 1.1 panel: (c) Routing with Impact Prediction]

Figure 1.1: (a) route calculated based on current incident's impact (b) time-varying expansion of impacted region as driver approaches the incident location (c) route calculated based on accurate prediction of impact

In this dissertation, we discuss a framework that allows for the rapid prediction of the impact of newly reported traffic incidents by mining real-world transportation-related datasets. In our first attempt, we consider the impact of traffic incidents to be a static measurement. Specifically, we utilize a single value to measure how much the speed decreases at a particular location or to measure how large the backlog and congestion will be on the street where the incident occurs. Following this strategy, the first part of this dissertation discusses how to model the maximum speed decrease and the maximum spatial span of recently occurred incidents and how to utilize them in the prediction of traffic speed.

However, by investigating real-world datasets, we observe that the scales of speed change and spatial span develop gradually over time. Consider the real scenario for a sample incident that occurred on freeway I-10 West, as illustrated in Figure 1.2. Here, Δv indicates the scale of the speed decrease. According to the scales, the red bars cover the region where the traffic shows a significant slowdown, and the orange bars cover regions with a medium traffic slowdown. As shown in the figure, the slowdown regions grow over time right after the occurrence of the incident, and the growth follows a different pattern for each scale of speed change. Inspired by this observation, instead of modeling the incident impact as a single-value measurement, in the second part of this dissertation we model the impact of an incident as a time series of spatial span relative to the scale of speed changes, in order to capture the propagation process of the incident impact.
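To make the time-series view of impact concrete, here is a minimal sketch; it is not the model developed later in the dissertation. It assumes a row of equally spaced sensors ordered from the incident location going upstream, a known incident-free baseline speed per sensor, and a cumulative "at least this much slowdown" threshold rather than the banded Δv ranges of Figure 1.2. The function names, the sensor spacing and the speed values are all hypothetical.

```python
from typing import List, Sequence


def impacted_span(
    baseline: Sequence[float],
    observed: Sequence[float],
    spacing_miles: float,
    min_drop: float,
) -> float:
    """Length of the contiguous upstream stretch, starting at the incident,
    whose relative speed drop is at least `min_drop` (e.g. 0.6 for 60%).

    Assumes baseline speeds are positive and both sequences are ordered
    from the incident location going upstream.
    """
    span = 0.0
    for base, obs in zip(baseline, observed):
        drop = (base - obs) / base
        if drop < min_drop:
            break  # the congested stretch ends at the first unaffected sensor
        span += spacing_miles
    return span


def span_time_series(
    baseline: Sequence[float],
    snapshots: Sequence[Sequence[float]],
    spacing_miles: float,
    min_drop: float,
) -> List[float]:
    """One spatial-span value per snapshot: the impact as a time series."""
    return [
        impacted_span(baseline, snap, spacing_miles, min_drop)
        for snap in snapshots
    ]


# Hypothetical readings (mph) at 5, 15 and 30 minutes after an incident.
baseline = [65.0, 65.0, 65.0, 65.0]
snapshots = [
    [20.0, 60.0, 64.0, 65.0],
    [15.0, 25.0, 38.0, 64.0],
    [15.0, 20.0, 24.0, 36.0],
]
severe = span_time_series(baseline, snapshots, 0.5, 0.6)
medium = span_time_series(baseline, snapshots, 0.5, 0.4)
print(severe, medium)
```

With these made-up readings the span for the stricter threshold grows more slowly than the span for the looser one, which is the kind of threshold-dependent growth the paragraph above describes.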
Moreover, we propose a series of approaches to predict the propagation process.

[Figure 1.2 panels: (a) 5 minutes after occurrence of event; (b) 15 minutes after occurrence of event; (c) 30 minutes after occurrence of event. Legend: vehicle flow direction; event location; Δv ≥ 60%; Δv ∈ [40%, 60%]]

Figure 1.2: Sample incident impact propagation

In the third and fourth parts of this dissertation, we extend the previous two parts by utilizing different datasets and considering different impact regions, respectively. In the third part, instead of using traffic sensor data and incident reports, we utilize GPS trajectories from individual drivers and Twitter-like social media posts to detect and analyze the impact of traffic incidents. Unlike existing traffic incident-detection methods, we identify anomalies according to significant changes in the routing behavior that drivers adopt to avoid traffic incidents. Here, a detected incident is represented by a sub-graph of the road network where drivers' routing behaviors differ significantly from their original patterns; this sub-graph is also considered the impacted region of the incident. We then describe the detected anomaly by mining representative terms from social media posted when the anomaly happened. This methodology is based solely on analyzing human behavior and is capable of detecting the impact of traffic incidents that have not even been reported to the transportation agencies, such as some large shopping events. Moreover, this technique can be applied to small cities that lack loop detectors (i.e., sensors).

In the fourth part, we focus on predicting the impact of traffic incidents on arterial streets and other intersecting freeways. In the first two parts of our study, our prediction model is designed to forecast the impact only on the freeway on which the incident occurred; to simplify the problem, we made several assumptions about the impact region, such as the absence of traffic signals and stop signs.
In this part, we target more complex impact regions, including arterial streets and intersecting freeways, which introduce new problems and challenges. One major challenge is to identify the size of the vicinity that will be impacted. Previously, we assumed that the traffic upstream of the incident on the same freeway will be impacted by the incident. However, we cannot make a similar assumption for impacts on arterial streets. Thus, we propose a causality detection methodology to identify the causality relationship between freeway traffic and arterial traffic and further utilize this causality to predict the impact of a given traffic incident.

1.2 Contributions of the Research

The main results, contributions and innovations of this thesis are summarized as follows:

1. This thesis presents a prediction algorithm for forecasting the traffic speed time series on road networks. The existing prediction approaches overlook the unique characteristics of traffic speed time series, such as the impact of rush hour and traffic accidents, and therefore cannot precisely predict the congestion caused by rush hour and accidents, which is when prediction is most needed. The proposed H-ARIMA+ approach addresses these problems and improves the prediction accuracy in the presence of rush hour and traffic incidents by up to 78% and 91%, respectively.

2. This thesis quantifies the dynamic behavior of traffic incidents and defines the problem of predicting the quantitative time-varying spatial span of an incident's impact. First, the traffic surrounding each accident is modeled using the traffic data at the time and location of the incident. Then, by analyzing the archived incident data, the surrounding traffic is classified based on the incident's features (e.g., time, location, type of incident) and utilized to predict the impact of a new incident by matching these features.
This information, in turn, can help drivers effectively avoid the impacted area in real time.

3. This thesis also introduces a system linking mobility data (e.g., GPS trajectories) and social media for the purpose of detecting and describing traffic-related events (such as accidents or sport events). The traffic events are detected by identifying significant changes in driver routing behaviors from their original patterns. Then, the system analyzes the impact of the detected traffic event by mining representative terms from the social media content that people posted when the event occurred. Fusing social media with mobility data allows transportation agencies to identify and understand unusual human activities that cause significant traffic problems.

4. This thesis proposes two types of time-sensitive causality relationships that are unique to road network traffic time series. These causality relationships are detected and analyzed to predict the impact of freeway traffic accidents on arterial streets and other freeways. By discovering the temporal feature of the causality relationship, the prediction accuracy for traffic on arterial streets and other freeways in the presence of traffic incidents is improved by up to 64.7% and 87.2%, respectively. These findings can be further applied to navigation systems to identify a more comprehensive spatial span for traffic incidents occurring on freeways.

1.3 Outline of the Dissertation

The remainder of the dissertation is organized as follows. Chapter 2 reviews research work related to our topic. In Chapter 3, we describe our framework that enables the prediction of the maximum incident impact. Chapter 4 introduces how we model and predict the propagation behavior of a newly occurred incident. Chapter 5 introduces the methodologies we employed to detect and analyze the impact of traffic incidents using GPS trajectories and social media.
In Chapter 6, a causality detection approach is proposed for predicting the impact of freeway incidents on arterial streets and intersected freeways. Chapter 7 concludes the thesis and proposes some future research directions.

Some materials in this thesis appear in conference papers or submitted journal papers [40, 39, 41].

Chapter 2 Related Work

2.1 Traffic Prediction

2.1.1 Simulation Models

The traffic prediction techniques developed in the first category use surveys and/or simulation models. In [15], Clark proposes a non-parametric regression model to predict traffic based on observed traffic data. In [20] and [9], the authors apply microscopic models to the trajectories of individual vehicles to simulate overall traffic data and then conduct prediction. In [65], Yuan et al. estimate the traffic flow of a road segment by analyzing taxi trajectories. The major limitation of such studies is that they rely on sporadic observations and are often restricted to synthetic or simplified data for simulations. Also, none of these studies use a source of incident data with description variables and reporting techniques, and their spatial transferability is limited. In this thesis, we use a very detailed, high-resolution traffic dataset and incident dataset.

2.1.2 Data Mining Techniques

The increase in the availability of real-time traffic data allowed researchers to develop and apply data mining techniques to forecast traffic based on real-world datasets. Since the early 1980s, univariate time series models, mainly Box-Jenkins Auto-Regressive Integrated Moving Average (ARIMA) [10] and Holt-Winters Exponential Smoothing (ES) models [35, 62], have been widely used in traffic prediction. In the last decade, Neural Network (NNet) models have also been extensively used in forecasting various traffic parameters, including speed [64, 25], travel time [56], and traffic flow [52, 42].
Nowadays, ARIMA, ES and NNet models are used as benchmarking methods for short-term traffic prediction [42, 37]. However, these approaches treat traffic flow as simple time-series data and ignore phenomena that are particular to traffic data. For example, for generic time series, the observations made in the immediate past are usually a good indication of the short-term future. However, for traffic time series, this is not true at the beginning of traffic incidents, due to sudden speed changes. In the first part of this thesis, we propose a technique to predict the maximum incident impact and incorporate it into the prediction of traffic speed in the presence of traffic incidents.

2.2 Impact Prediction for Traffic Incidents

In recent decades, to support operational decisions, the impact of traffic incidents has usually been measured by incident duration (the elapsed time between the occurrence of the incident and the departure of the response vehicles from the scene). The problem of forecasting incident duration has been widely studied through a variety of statistical approaches: distribution fitting [22, 18, 54], analysis of variance [21], as well as regression models [26]. Other data mining techniques, such as decision trees [38], classification trees [27, 51], as well as Bayesian classifiers [11], are also broadly used to predict incident durations. In these studies, the duration of an incident is considered a single parameter describing the severity of its impact. However, this parameter can represent neither the spatial span of the incident impact region nor the travel-time delay caused by the incident in that region. In our work, we propose a comprehensive measurement of the incident impact: instead of the physical incident duration, we focus on predicting the consequences of traffic incidents, i.e., the scale and spatial range of the traffic congestion caused by the incident.
In recent decades, with the availability of real-world traffic sensor data, researchers have started to combine theoretical models (e.g., queuing models) with real-world data to predict the spatial impact of traffic incidents [8, 58]. However, in these theoretical models, certain assumptions and parameters are still difficult to validate through real-world data. Specifically, queuing models are designed for micro-level simulation and analysis (i.e., at the level of individual cars) and are based on underlying assumptions regarding vehicle arrival patterns, departure characteristics, and queue discipline [34]. The real-world traffic data to which we have access, however, are at the macro level, and hence it is generally difficult to derive the required parameters from them. For example, drivers' lane-change behavior is hard to predict or estimate from traffic data collected by sensors, yet such behavior largely affects the queuing discipline and, in turn, the accuracy of queuing models. Therefore, in this thesis, to avoid assumptions that are difficult to validate, we focus on an impact modeling and prediction strategy based purely on real-world sensor data.

Two of the most relevant works to our study are the models proposed by [28] and [36]. In [28], the authors utilize a nearest-neighbor technique to detect cumulative delays and impact regions caused by traffic incidents. They define impact regions with fixed thresholds. However, the impact of an incident on traffic congestion varies with space and time. For example, the impact of an accident occurring during rush hour is usually more severe. Similarly, an accident on an interstate has a different impact region than one on a surface street. In our study, we consider such spatiotemporal characteristics of traffic incidents while designing our models.
As an improvement on [28], [40] and [36] model the impact of traffic incidents as a congested region and as delay costs, respectively, without using a fixed threshold. However, in these studies, the incident impact is measured by a set of static values (e.g., maximum congested backlog and total travel-time delay). Such a measurement can hardly capture the development of the incident impact as time elapses. For example, in the first couple of minutes after the incident, the congested area may be small, and it is entirely possible that the congested area grows much larger in the next half hour. In Chapter 4, the impact of incidents will be modeled and predicted as a dynamic variable depending on the elapsed time after occurrence.

2.3 Traffic Incident Detection and Analysis

2.3.1 Anomaly Detection using Traffic Data

The previous work on detecting anomalies using GPS data can be divided into two categories: 1) studies on trajectory anomalies (e.g., [19, 67, 68]), and 2) studies on traffic anomalies (e.g., our work and [13]). The works in the first category seek to find a small percentage of drivers whose driving trajectories differ from those of the broader population, which could result from fraudulent taxi driving behavior or some other anomalous cause. Our work belongs to the second category and differs from the above methods in the following aspects. First, we aim to detect a large number of drivers whose behavior is anomalous. Second, in anomalous trajectory detection, the comparison always happens between a small set of trajectories and the remaining trajectories at the same time and location. In our work on traffic anomaly detection in Chapter 5, the comparison happens between the current behavior of drivers and their historical driving behavior.
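The idea of comparing current with historical behavior can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the detection method of Chapter 5: it assumes each observed trip between one origin-destination pair has already been reduced to a route label, and it measures the shift in route choice with the total variation distance. The function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def route_shares(routes: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Fraction of trips taking each route (expects at least one trip)."""
    counts = Counter(routes)
    total = sum(counts.values())
    return {route: n / total for route, n in counts.items()}


def routing_deviation(
    historical: Iterable[Hashable], current: Iterable[Hashable]
) -> float:
    """Total variation distance between historical and current route shares.

    0.0 means drivers choose routes exactly as they used to; 1.0 means the
    current trips use none of the historically used routes.
    """
    hist = route_shares(historical)
    curr = route_shares(current)
    all_routes = set(hist) | set(curr)
    return 0.5 * sum(
        abs(hist.get(r, 0.0) - curr.get(r, 0.0)) for r in all_routes
    )


# Hypothetical trips between one origin-destination pair.
historical_trips = ["freeway"] * 8 + ["arterial"] * 2
current_trips = ["freeway"] * 2 + ["arterial"] * 8
print(routing_deviation(historical_trips, current_trips))
```

A score near zero would indicate routine behavior, while a large score for many drivers at once would be a candidate anomaly; any decision threshold is a tuning choice that this sketch leaves open.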
The works most relevant to our study, in terms of both data types and the definition of an “anomaly”, are those focusing on traffic anomaly detection using GPS data (e.g., [32], [13]). This dissertation can be distinguished from these works in the following ways. First, our approach is the first to consider the change in routing behavior in addition to the change in traffic volume. As a result, in Chapter 5 we find that our approach has a higher detection rate than an approach that uses only traffic volume changes. Further, our technique can provide users with detour routes to avoid or escape the congestion caused by a traffic anomaly, while the volume-based approaches can only detect the locations of the anomalies, without revealing their whole extent throughout the road network. These two advantages are evaluated in the experiment section. Finally, the granularity of our detected traffic anomalies is at the level of road segments instead of spatial regions. For example, the anomalous scenarios studied in [13] are inter-regional, which limits its results to very large-scale events, such as a marathon, rather than road-segment-level traffic anomalies, such as traffic accidents.

2.3.2 Anomaly Detection using Social Media

Another line of related work is anomaly detection via mining social media content. Recently, microblogging services (e.g., Twitter) have received much research attention in the field of anomaly detection. Researchers treat Twitter posts (i.e., tweets) as real-time social streams and focus on analyzing the features of keywords in a specific context to detect events [48, 49, 31]. The key challenge in these works is to filter out the irrelevant content in the tweets, which requires computationally expensive filtering, such as the Kalman-filtering-based model proposed in [48] and the probabilistic model defined by a Gibbs random field in [31].
However, in our work discussed in Chapter 5, by using the data collected from anomaly detection in addition to the social texts, we can narrow the search down to a specific time and location, tremendously reducing the search space compared with the traditional methods. We therefore need only a simple filtering technique to separate out the irrelevant content, as discussed in the anomaly analysis section.

2.4 Causality Analysis in Transportation Data

The theory of Granger causality is also extensively applied in the field of transportation. For example, [3] examined the causality relationship among road traffic accidents, GDP, population, road miles, road vehicles and the number of driver licenses in Saudi Arabia. [6] built models based on Granger causality theory to understand the nature and extent of the causes of accidents in Yemen. These studies simply applied the Granger causality test to transportation data and other types of data sets and tried to discover causality relationships. In our study discussed in Chapter 6, in addition to the simple causality test, we investigate the temporal existence of the causality relationship among traffic speed time series, and further apply the detected causality relationship to predict the impact of traffic incidents that occurred recently.

The studies most relevant to our problem are the models proposed in [39, 43]. [39] is based on the assumption that a causality relationship exists among the traffic flows on the same freeway. Consequently, the scope of the problem solved in that work is limited to predicting an incident's impact on the freeway on which the incident occurred.
However, in Chapter 6, instead of depending on assumptions, we detect the causality relationship between traffic flows on the freeway where the incident occurs and on other streets, e.g., adjacent arterial streets and other freeways intersecting with the freeway of the incident, and further predict the incident's impact on these streets. Conversely, [43] predicts the impact of traffic incidents on freeways other than the one where the incident occurred, using a dynamic Bayesian network model. That work also predefines the causality relationships between traffic flows in the road network rather than detecting them. Therefore, because their model is built on a particular road network, its spatial transferability is limited. In contrast, our approach is designed to detect the causality among freeway traffic flows and arterial traffic flows, and can be applied to any road network using the available traffic data.

Chapter 3 Forecast traffic in presence of incidents

In this chapter, we study the effect of incidents on traffic congestion, especially in the upstream direction. In particular, we incorporate incident information into our traffic speed prediction techniques [40] to enhance the prediction accuracy. To this end, we exploit our historical incident reports and the associated nearby traffic speeds at the time of the incidents to model the correlation between incident attributes and traffic congestion. Note that even though our model is built offline using past data, we use it online for better traffic prediction. That is, in real time, using the current incident reports as input, we match the incident's attributes against similar incidents that happened in the past to predict the speed delays and backlogs caused by the current incident.

3.1 Preliminaries

We introduce two standard traffic speed prediction techniques, namely the Auto-Regressive Integrated Moving Average (ARIMA) model and the Historical Average Model (HAM), which serve as the preliminaries for this chapter.
Auto-Regressive Integrated Moving Average (ARIMA)

This model [10] is a generalization of the autoregressive moving average model, with an initial differencing step applied to remove the non-stationarity of the data. The model can be formulated as

$$Y_{t+1} = \sum_{i=1}^{p} \alpha_i Y_{t-i+1} + \sum_{i=1}^{q} \beta_i \varepsilon_{t-i+1} + \varepsilon_{t+1} \qquad (3.1)$$

where $\{Y_t\}$ refers to a time series (e.g., the sequence of speed readings). In the autoregressive component of this model ($\sum_{i=1}^{p} \alpha_i Y_{t-i+1}$), a linear weighted combination of previous data is calculated, where $p$ refers to the order of this component and $\alpha_i$ refers to the weight of the $(t-i+1)$-th reading. In the second part ($\sum_{i=1}^{q} \beta_i \varepsilon_{t-i+1}$), the sum of weighted noise from the moving average model is calculated, where $\varepsilon$ denotes the noise, $q$ refers to its order and $\beta_i$ represents the weight of the $(t-i+1)$-th noise term.

As shown in Equation (3.1), the predicted value relies mainly on a linear combination of the data observed up to time $t$. This model can be used directly to predict traffic speed when the prediction horizon is $h = 1$. When $h > 1$, we iterate the prediction process $h$ times, using each predicted value as input to predict the next value.

Historical Average Model

Our analysis of real-world traffic sensor data reveals a strong correlation (both temporal and spatial) among the measurements of single and multiple traffic sensors on road networks. For example, the traffic condition of a particular road segment on Monday at 8:30AM can be estimated from the average of the sensor readings for the same road segment at 8:30AM on the past four Mondays. Therefore, we introduce the Historical Average Model (HAM), which uses the average of previous readings for the same time and location to forecast future data. We formulate HAM as follows:

$$v(t_{d,w} + h) = \frac{1}{|V(d,w)|} \sum_{s \in V(d,w)} v(s) \qquad (3.2)$$

where $V(d,w)$ refers to the subset of past observations that happened at the same time $d$ on the same day $w$.
Specifically, $d$ captures the daily effects (i.e., traffic observations at the same time of day are correlated), while $w$ captures the weekly effects (i.e., traffic observations on the same day of the week are correlated). For example, if the traffic data to be predicted is for next Monday at 8:00AM, then $d$ refers to "8:00AM" and $w$ = Mon, so $V(d,w)$ refers to the set of traffic data observed on previous Mondays at 8:00AM. The selection of historical observations is also subject to seasonal effects. For example, the historical observations on Mondays during winter are probably different from those on Mondays during summer. Here, we eliminate the seasonal effects by using only data collected in one season. Also, as shown in the formula, the selection of past observations and the calculation of the average are independent of the value of the prediction horizon $h$.

3.2 Methodology

As discussed in the preliminaries, HAM can hardly react to unexpected traffic incidents because it eliminates the influence of incidents by averaging historical observations. ARIMA, due to its delayed reaction, is not an ideal method in the case of incidents, which cause sudden changes in the time-series data. To illustrate the prediction accuracy of ARIMA and HAM in the presence of an incident, consider Figure 3.1, which shows the speed predictions of both techniques for a traffic accident that happened on freeway CA-91 at 10:53AM on Dec. 5th, 2011, with prediction horizon $h = 6$. As shown, the predictions of both techniques deviate significantly from the actual speed. Hence, we next discuss our maximum impact backlog model, which improves traffic prediction in the presence of incidents.
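To make the two baselines concrete, the following is a minimal Python sketch, not the implementation used in this thesis. `ham_forecast` averages the past readings that share the same time of day and day of week, as in Equation (3.2), and `iterated_ar_forecast` applies only the autoregressive part of Equation (3.1) h times, feeding each prediction back as input. The function names, the dictionary layout of `history`, the AR weights, and all speed values are illustrative assumptions; the moving-average term is omitted for brevity.

```python
from statistics import mean


def ham_forecast(history, d, w):
    """Historical Average Model (Eq. 3.2).

    history maps (time_of_day, weekday) -> list of past speed readings;
    this layout is a hypothetical choice for the sketch.
    """
    return mean(history[(d, w)])


def iterated_ar_forecast(readings, weights, h):
    """Iterate a pure AR(p) model h steps ahead (AR part of Eq. 3.1 only).

    weights[0] multiplies the most recent reading, weights[1] the one
    before it, and so on. Each prediction is appended to the window and
    reused as input for the next step.
    """
    window = list(readings)
    for _ in range(h):
        recent = window[::-1][:len(weights)]  # most recent reading first
        window.append(sum(a * y for a, y in zip(weights, recent)))
    return window[len(readings):]


# Made-up readings for four past Mondays at 8:30AM; the 20 mph reading
# stands for a day on which an incident occurred.
history = {("08:30", "Mon"): [62.0, 60.0, 20.0, 58.0]}
print(ham_forecast(history, "08:30", "Mon"))
print(iterated_ar_forecast([60.0, 58.0], [0.5, 0.5], h=2))
```

In this toy input the single incident-affected reading is diluted by the average, which mirrors the point above that HAM smooths out incidents, while the iterated AR forecast can only extrapolate from the most recent readings.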
[Figure 3.1: Impact of an accident on ARIMA and HAM. Speed (mph) over time t for HAM, ARIMA, and the actual speed.]

3.2.1 Maximum Impact Backlog Modeling

In our approach, we assume that incident data is an input to our algorithm and includes, but is not limited to, the following metadata: 1) incident date, 2) incident start time, 3) incident location (i.e., latitude, longitude), 4) incident type (e.g., traffic collision, road construction), 5) type of vehicles involved if the incident is an accident, and 6) number of affected lanes. In addition, we use the maximum impact backlog as defined in the preliminary chapter to represent the spatial upstream span of an incident. We estimate the maximum impact backlog as the network distance between the incident location and the last sensor influenced by the incident. The influenced sensors are those whose speed readings show an anomalous decline compared with the historical average speed (footnote 1).

Based on our analysis of real-world data, we observe that the maximum impact backlog varies across incidents with different attributes. Consider the attribute "start time" as an example. The maximum impact backlog of incidents that happen during the daytime may be large compared with that of incidents happening at midnight, due to the higher traffic flow during the daytime. Therefore, the key to investigating the correlation between incident attributes and maximum impact backlog is to decide which attributes are correlated with it. It is likely that some incident attributes are irrelevant or redundant for inferring the maximum impact backlog. To identify the most correlated subset, we first process the incident attributes as normalized features and the maximum impact backlog as numerical classes, and then apply the Correlation-based Feature Selection (CFS) algorithm [24] to this normalized data to select the correlated features.
From the result of this procedure, we observe that the following incident attributes are the most relevant: {Start time, Location, Direction, Type, Affected Lanes}. We use the selected attributes to classify the maximum impact backlog values, and use the average value of each class to represent the impact of an incident with the corresponding attributes. Table 3.1 shows some selected classification results, where the maximum impact backlog under different start times is aggregated into 4-hour intervals denoted as S_{start-hour,end-hour}, and "N/A" means that no incident with the specified attributes occurs in our experimental dataset (footnote 2).

Footnote 1: We detect the anomalous decline using the traffic incident detection algorithm proposed in [29].

(a) Traffic collision incident, affected lanes = 0
Location  Direction  S_{0,4}  S_{4,8}  S_{8,12}  S_{12,16}  S_{16,20}  S_{20,24}
I-10      E          1.87     3.11     2.58      2.25       4.56       1.97
I-10      W          N/A      3.63     3.56      2.41       3.28       2.68
I-405     N          2.07     2.93     3.68      2.92       3.33       1.51
I-405     S          0.14     3.37     2.61      3.63       4.37       2.03
I-5       N          0.10     3.32     4.12      4.45       5.51       2.56
I-5       S          1.17     3.66     3.41      2.43       3.73       1.34
I-105     E          2.38     2.51     2.36      3.79       4.24       1.90
I-105     W          N/A      2.95     3.85      2.83       3.60       2.00

(b) Traffic collision incident, affected lanes = 1
Location  Direction  S_{0,4}  S_{4,8}  S_{8,12}  S_{12,16}  S_{16,20}  S_{20,24}
I-10      E          N/A      7.71     2.83      N/A        N/A        N/A
I-10      W          N/A      3.37     4.15      N/A        N/A        N/A
I-405     N          N/A      N/A      4.74      3.57       3.52       0.46
I-405     S          N/A      N/A      N/A       N/A        4.78       1.75
I-5       N          N/A      N/A      2.02      N/A        6.11       N/A
I-5       S          0.10     N/A      N/A       N/A        N/A        N/A
I-105     E          1.50     N/A      N/A       N/A        5.30       0.40
I-105     W          N/A      N/A      N/A       N/A        4.40       N/A

(c) Road construction incident, affected lanes = 1
Location  Direction  S_{0,4}  S_{4,8}  S_{8,12}  S_{12,16}  S_{16,20}  S_{20,24}
I-10      E          1.33     8.46     3.36      4.58       N/A        N/A
I-10      W          N/A      N/A      N/A       4.87       N/A        1.54
I-405     N          0.96     N/A      9.35      5.02       N/A        1.25
I-405     S          1.73     N/A      N/A       N/A        N/A        0.19
I-5       N          N/A      N/A      4.70      5.80       5.70       6.50
I-5       S          N/A      N/A      N/A       N/A        N/A        N/A
I-105     E          1.80     N/A      N/A       N/A        N/A        N/A
I-105     W          N/A      N/A      N/A       N/A        4.60       0.1

Table 3.1: Maximum impact backlog on incident meta-attributes
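The class-average lookup behind Table 3.1 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code used to produce the table: the dictionary field names, the helper names (`class_key`, `build_backlog_table`, `lookup_backlog`) and the sample incidents are all hypothetical, and the attribute set is taken as given rather than selected with CFS.

```python
from collections import defaultdict
from statistics import mean


def class_key(incident, width=4):
    """Attribute class of an incident.

    The start hour (0-23) is bucketed into 4-hour intervals, so that
    hour 17 falls into the interval (16, 20).
    """
    lo = (incident["start_hour"] // width) * width
    return (incident["type"], incident["lanes"], incident["location"],
            incident["direction"], (lo, lo + width))


def build_backlog_table(incidents):
    """Average the observed maximum impact backlog within each class."""
    groups = defaultdict(list)
    for incident in incidents:
        groups[class_key(incident)].append(incident["backlog"])
    return {key: mean(values) for key, values in groups.items()}


def lookup_backlog(table, incident):
    """Class average for a new incident, or None for an unseen class ("N/A")."""
    return table.get(class_key(incident))


# Hypothetical historical incidents; the backlog values are made up.
past = [
    {"type": "collision", "lanes": 0, "location": "I-10", "direction": "E",
     "start_hour": 17, "backlog": 4.0},
    {"type": "collision", "lanes": 0, "location": "I-10", "direction": "E",
     "start_hour": 18, "backlog": 5.0},
    {"type": "collision", "lanes": 0, "location": "I-10", "direction": "E",
     "start_hour": 2, "backlog": 2.0},
]
table = build_backlog_table(past)
new_incident = {"type": "collision", "lanes": 0, "location": "I-10",
                "direction": "E", "start_hour": 16}
print(lookup_backlog(table, new_incident))
```

Returning `None` for an unseen class corresponds to the "N/A" cells of the table; how such cases are handled at prediction time is not specified here.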
The dataset used to train this model includes the incidents that happened on weekdays, with rush hours taken as 6:00AM to 9:00AM and 4:00PM to 7:00PM. From the results shown in Table 3.1, we make the following observations.

Footnote 2: An affected-lane count of zero indicates that no lanes are blocked, as the involved vehicles moved to the shoulder of the road after the accident.

First, from Table 3.1(a), we observe that for incidents happening during rush hours, the maximum impact backlog is larger than for those during non-rush hours. This is expected: when an accident happens during rush hours on a high-occupancy road, its impact is more severe than on a road without traffic.

Second, comparing Tables 3.1(a) and 3.1(b), we infer that for incidents happening at a similar time and the same location, the maximum impact backlog is generally larger when more lanes are affected. Since the number of affected lanes reflects the number of lanes blocked by the incident, the more lanes are blocked, the slower the traffic flow. However, for accidents that occur at midnight, when traffic is free-flowing, a higher number of affected lanes does not necessarily indicate a longer maximum impact backlog.

Third, in Table 3.1(c), we observe that road construction incidents that happen during the daytime, especially during rush hours, have a severe impact on traffic, sometimes considerably larger than that of traffic collisions happening at the same time. On the other hand, if they happen at night, their impact is not that significant.

3.2.2 Traffic Prediction with Incident Impact

In addition to the maximum impact backlog, the speed change (speed-impact) caused by incidents is also very important for traffic prediction. To estimate the speed-impact, we introduce two factors: maximum speed decrease ($\Delta v$) and time shift ($\Delta t$). We estimate $\Delta v$ based on the correlated attributes (similarly to the maximum impact backlog).
Definition 3.2.1: For sensor $i$, its maximum speed decrease $\Delta v_i$ for incident $e$ is defined as the maximum speed change over all past incidents that share the same correlated attributes (i.e., Start-time, Location, Direction, Type and Affected Lanes) with $e$ and that affected sensor $i$.

Once we find the maximum speed decrease, the next step is to determine the exact time stamps at which to apply the change on the sensors. When an incident occurs, sensors at different locations might be influenced at different time stamps. Therefore, we propose the concept of time shift ($\Delta t$) to estimate how long after an incident a sensor becomes affected.

Definition 3.2.2: For sensor $i$, its time shift ($\Delta t_i$) for incident $e$ is defined as the distance between sensor $i$ and incident $e$ divided by the average traffic speed between them, which can be represented as follows:

$$\Delta t_i(e) = \frac{dist(i,e)}{avg(\{v_j\})} \quad \text{where } p(i) \le p(j) \le p(e) \qquad (3.3)$$

where $p(i)$ refers to the position of sensor $i$, and the set $\{v_j\}$ refers to all the speed readings at the sensors located between sensor $i$ and incident $e$.

Table 3.2: Dataset description
Duration                                        Nov. 1st - Dec. 7th, 2011
Sensor Data   # of sensors                      2028
              spatial span                      3420 miles
              sensor sampling rate              1 reading per 30 secs
              temporal aggregation interval     5 mins
              spatial resolution                1 sensor
Event Data    # of incidents                    3255
              # of incident attributes          43

Below we summarize our procedure to predict traffic in case of incidents:

1. When an incident $e$ occurs at time $t$, all the relevant incident features (i.e., {Start-time, Location, Direction, Type, Affected Lanes}) are utilized to determine the maximum impact backlog of $e$.
2. Using the maximum impact backlog and the location of $e$, the set of all influenced sensors is identified as the set $\{s_i\}$.
3. For each sensor $s_i$, during $[t + \Delta t_i(e),\ t + \Delta t_i(e) + h]$, the predicted value is calculated as $v_i(t) - \Delta v_i$, where $h$ is the prediction horizon.
4. After time $t + \Delta t_i(e) + h$, ARIMA is used to predict the rest until the incident $e$ is cleared.

3.3 Experiments

3.3.1 Experimental Setup

Dataset

In our research center, we maintain a very large-scale and high-resolution (both spatial and temporal) traffic loop detector dataset collected from highways and arterial streets across LA County. We also collect and store traffic incident data from the City of Los Angeles Department of Transportation and the California Highway Patrol. The detailed description of this dataset is shown in Table 3.2.

Comparison Approaches

NNet: We implement the Neural Network (NNet) model as a multilayer perceptron (MLP). The architecture of the MLP is as follows: 10 neurons in the input layer, a single hidden layer with 4 neurons, and $h$ output neurons, where $h$ refers to the prediction horizon. For example, in one-step forecasting there is 1 output neuron. The input neurons are $\{v(k),\ k = t-9, \ldots, t\}$, while the output neurons are $\{v(t+1), \ldots, v(t+h)\}$, where $t$ represents the current time. The tangent sigmoid function and the linear transfer function are used as the activation functions in the hidden layer and the output layer, respectively. This model is trained using the back-propagation algorithm over the training dataset.

H-ARIMA: This approach is based on the combination of HAM and ARIMA, with the detailed implementation described in [40].

H-ARIMA+: This approach is the H-ARIMA approach combined with our maximum impact model for the prediction of maximum impact backlog and speed decrease.

Fitness Measurements

We use the mean absolute percent error (MAPE) to quantify the accuracy of traffic prediction.

$$MAPE = \left( \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{y_i} \right) \times 100 \qquad (3.4)$$

where $y_i$ and $\hat{y}_i$ represent the actual and predicted traffic speed, respectively, and $N$ represents the number of predictions.

3.3.2 Predictions with Event Information

In this set of experiments, we evaluate the prediction accuracy of our proposed approach, dubbed H-ARIMA+, in the case of incidents.
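Two of the quantities defined above, the time shift of Equation (3.3) and the MAPE of Equation (3.4), reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names and the sample numbers are assumptions, and the unit handling (miles and mph, giving a shift in hours) is also an assumption, since the units of the time shift are not stated here.

```python
def time_shift(distance_miles, speeds_between_mph):
    """Eq. (3.3): distance from sensor to incident divided by the average
    speed reported by the sensors located between them.

    With miles and miles per hour (an assumption), the result is in hours.
    """
    avg_speed = sum(speeds_between_mph) / len(speeds_between_mph)
    return distance_miles / avg_speed


def mape(actual, predicted):
    """Eq. (3.4): mean absolute percent error, in percent.

    Actual speeds must be non-zero, since each error is divided by y_i.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs(y - y_hat) / y for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up example: a sensor 1.5 miles upstream, with two sensors in
# between reporting 30 mph and 60 mph.
shift_minutes = time_shift(1.5, [30.0, 60.0]) * 60
print(round(shift_minutes, 2))

# Made-up actual and predicted speeds for two predictions.
print(round(mape([50.0, 40.0], [45.0, 50.0]), 2))
```

In the procedure above, the shift would determine when the speed decrease starts to be applied at each influenced sensor.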
We set the prediction horizon of all approaches to 6, which means that our algorithm predicts speed information 30 minutes in advance. Figure 3.2 shows the result for a sample sensor located on eastbound CA-91 that was affected by three traffic collision incidents on Dec. 7, 2011. Figure 3.2(a) illustrates the actual speed on that day and the historical average (for that weekday) at the selected sensor. The historical average indicates that the rush-hour intervals for this sensor are [7:00AM-8:00AM] and [3:00PM-7:00PM]. Figure 3.2(b) plots the prediction error of H-ARIMA+, H-ARIMA and NNet. Table 3.3 shows the relevant attributes of each incident, where Dist(e,s) refers to the distance between the sensor and the corresponding incident location. An affected-lane count of zero indicates that no lanes are blocked, as the involved vehicles moved to the shoulder of the road after the accident. As shown in Figure 3.2(a), the first two incidents (i.e., Event 350 and Event 2116) happened at the beginning of the morning and afternoon rush hours, and the last incident (i.e., Event 2621) happened near the end of the afternoon rush hour. As illustrated in Figure 3.2(b), H-ARIMA+ improves the prediction accuracy of H-ARIMA and NNet by up to 45% and 67%, respectively. We observe that although H-ARIMA can capture the sudden speed changes at rush hours, it cannot predict traffic in the case of incidents. This is because the effect of traffic incidents is smoothed out in historical averages.

Table 3.3: Relevant incident attributes
Event ID  Start time  No. of Affected Lanes  Dist(e,s)
350       06:31       0                      0.58
2116      16:06       0                      0.10
2621      18:26       0                      0.11
[Figure 3.2: Case study on traffic collision incidents. (a) Actual speed and historical average: speed (mph) over time t. (b) MAPE of the sensor: MAPE (%) over time t for NNet, H-ARIMA, and H-ARIMA+.]

We also study the effect of road construction incidents on our prediction model. Figure 3.3 shows the effect of a 6-hour-long road construction incident on I-405 on a specific sensor. One lane is affected by this incident, and the distance between the incident and the selected sensor is 0.23 miles. As shown in Figure 3.3(a), the traffic speed deviates sharply, especially in the first hour of the incident. As with traffic collision incidents, since ARIMA cannot handle sudden speed changes and HAM cannot react to traffic dynamics such as incidents, the prediction accuracy of H-ARIMA (which selects between ARIMA and HAM) is very low during the first half hour. H-ARIMA+, however, utilizes the incident information and yields significantly better predictions at the beginning of this incident, improving on H-ARIMA and NNet by up to 91% (see Figure 3.3(b)).

[Figure 3.3: Case study on a road construction incident. (a) Actual speed and historical average: speed (mph) over time t. (b) MAPE of the sensor: MAPE (%) over time t for NNet, H-ARIMA, and H-ARIMA+.]

Chapter 4 Forecast impact propagation behavior of traffic incidents

The advances in sensor technologies enable real-time collection of high-fidelity spatiotemporal data on the transportation networks of major cities. In this chapter, using two real-world transportation datasets, 1) incident data and 2) traffic data, we address the problem of predicting and quantifying the impact of traffic incidents. Traffic incidents include any non-recurring events on road networks, such as accidents, weather hazards or road construction.
By analyzing archived incident data, we classify incidents based on their features (e.g., time, location, type of incident). Subsequently, we model the impact of each incident class on its surrounding traffic by analyzing the archived traffic data at the time and location of the incidents. Consequently, if we observe a similar incident in real time (from real-time incident data), we can predict and quantify its impact on the surrounding traffic using our developed models. This information, in turn, can help drivers to effectively avoid impacted areas in real time. To be useful for such a real-time navigation application, and unlike current approaches, we study the dynamic behavior of incidents and model the impact as a quantitative, time-varying spatial span.

For our motivating navigation application, ClearPath, to be effective, we need to predict specific values of speed changes and backlog lengths over the lifetime (i.e., temporal) and impact area (i.e., spatial) of an incident. This is in contrast to previous application scenarios, where forecasting abstract or aggregate values was sufficient. In particular, consider the following three aspects that we need to forecast.

First, we need to predict the exact values of speed changes and backlog lengths. There are two major approaches to measuring the impact of incidents: 1) qualitative approaches (i.e., classifying an incident's impact into conceptual categories such as “severe” or “non-severe”, and “significant delay” or “slight delay”); and 2) quantitative approaches (i.e., providing numeric measurements such as a 45% speed decrease and 3.2 miles of congested backlog). In the past, most studies focused on qualitative approaches for measuring impact, which make the impact easier to predict (e.g., [38]). Qualitative measurement may be sufficient for general decision-making or response analysis, but it is not precise enough for ClearPath.
In this chapter, we describe the impact from a quantitative perspective and provide numeric measurements of the impact on the surrounding areas.

Second, since the impact region of an incident evolves over time and space (as shown in Figure 1.1), we need to predict the spatiotemporal behavior of the impact. In previous studies, it was sufficient to predict an incident's impact as a single aggregate value or a set of aggregate values. For example, in [40], the impact is predicted as the average speed decrease or the average backlog length. In this chapter, the outcome of our prediction approach is the exact length of time-varying backlogs (i.e., the evolution of the congested spatial span) at different scales of speed change.

Third, we need to predict the sudden speed changes caused by incidents far into the future (e.g., the next 30 minutes). The occurrence of an incident always involves two phenomena: 1) abrupt speed changes; for example, it is very common for the traffic speed to drop by 60% when an incident occurs on a freeway in LA; and 2) long-lasting propagation of the speed changes; for example, a sensor close to the incident may report a speed decrease in the 3rd minute after its occurrence, whereas a sensor farther away may report a similar decrease in the 30th minute. Since traditional prediction approaches rely on the immediate past data to predict the future, they cannot effectively predict the abrupt speed changes and how they propagate over a long term, which is important for ClearPath to successfully navigate drivers around the incident impact area.

To this end, we analyze the correlations between archived incident data and traffic data. Specifically, we first classify incidents based on their features (e.g., time, location, type of incident), which are correlated with their impact on the surrounding traffic. Next, we improve the classification by incorporating traffic density and the initial behavior of the incident.
By utilizing such models, we can effectively predict the abrupt speed change and its propagation over a long term by identifying similar classes of incidents mined from the archived dataset.

The remainder of this chapter is organized as follows. We introduce preliminaries in Section 4.1. In Section 4.2, we explain our approach to quantifying an incident's impact. We detail our prediction approaches for impact propagation behavior and clearance behavior in Sections 4.3 and 4.4, respectively. In Section 4.5, we present our evaluation strategies and results.

4.1 Preliminaries

To explain the preliminaries, consider a sample incident that occurred on freeway I-5 South, as illustrated in Figure 4.1(a). Sensors S1-S4 are the four affected sensors located on I-5 South upstream of the incident location (footnote 1). In the rest of this chapter, we use this scenario as a running example to explain our approach.

Footnote 1: In this study, we focus on the impact in the upstream direction of the incident location for incidents that occurred on freeways.

[Figure 4.1: Sample traffic incident on I-5 South. (a) Scenario map showing sensors S1-S4. (b) Speed change ratio (%) over distance (mile) and time (min) for S1-S4.]

Definition 1: (Speed Change Ratio) The speed change ratio ($\Delta v$) at a specific location ($l$) and time ($t$) is defined as the ratio by which the current traffic speed ($v_c$) has decreased compared with the normal traffic speed at $l$ and $t$, as shown in Equation (4.1).

$$\Delta v(l,t) = \frac{v_r(l,t) - v_c(l,t)}{v_r(l,t)} \times 100\% \qquad (4.1)$$

Here, the normal speed ($v_r$) is calculated as the historical average value at location $l$ at the same time $t$ in the past. Figure 4.1(b) shows the corresponding time-varying speed change ratios for the four sensors depicted in Figure 4.1(a). Here, the axis labeled Time refers to the time elapsed since the occurrence of the incident, where negative values refer to time stamps before the incident occurs.
The axis labeled Distance refers to the road network distance between the sensor location and the incident location.

In our problem, the key to predicting the time-varying spatial span is to predict the speed changes of all sensors over time. One intuitive solution is to apply a traditional time series prediction approach to the speed time series of each sensor. However, this solution has a few limitations and drawbacks, which we briefly explain through two critical observations made from Figure 4.1(b).

Observation 1: For all sensors, the speed decreases abruptly after the occurrence of a traffic incident, as suggested by the sudden increase in the speed change ratio. For example, for sensor S1, the speed dropped from 67 MPH to 18 MPH within 2 minutes after the occurrence of the incident. Time series prediction approaches [40] (e.g., auto-regressive models) cannot effectively predict abrupt variation in a time series because most of them rely on the data in the immediate past. Therefore, according to Observation 1, traditional time series prediction techniques cannot effectively predict the traffic time series at the beginning of a traffic incident.

Observation 2: The abrupt speed change starts at a different time stamp after the incident's occurrence for each sensor. In our running example, sensor S3 reports the abrupt speed decrease at the 12th minute, while sensor S4 reports it at the 19th minute after the incident's occurrence. Hence, in this scenario, given that the incident has just occurred, we need to predict the speed changes 12 or 19 minutes ahead. This task requires a multi-step prediction strategy for time series prediction approaches. However, according to the study in [14], multi-step time series prediction suffers from error accumulation when the prediction period is long.
Thus, the time series approach cannot accurately predict the speed changes over a long term, for example, 30 minutes in advance in general cases.

To conclude, we argue that traditional time series prediction techniques cannot effectively predict the speed decrease for all sensors impacted by an incident. To address this issue, in the following we propose a modeling strategy for incidents' impact and corresponding prediction techniques.

4.2 Impact Modeling

First, we define whether a location is impacted according to the magnitude of speed changes as follows:

Definition 2: (Impacted Threshold) $\alpha$ is defined as a parameter that indicates the magnitude of the speed changes. Given a time stamp $t$ and a location $l$, if the speed change $\Delta v(l,t)$ satisfies the following inequality, we denote the location $l$ as impacted at time $t$.

$$\Delta v(l,t) \ge \alpha \qquad (4.2)$$

In the experiments, we will study the effects of the value of $\alpha$ on the prediction accuracy of propagation behavior. Consider $\alpha$ = 60%, and cut the 3D plot in Figure 4.1(b) horizontally at $\Delta v$ = 60%. We obtain a series of scatter points in a 2D space of distance and time, as depicted in Figure 4.2. Each point $(x, y)$ in Figure 4.2 represents a specific sensor located $y$ miles from the incident location with a 60% speed decrease $x$ minutes after the incident occurrence time. For the four points on the left side (with $x < 20$), the x-axis value indicates the time stamp at which a sensor starts to get impacted, which is referred to as the propagation phase. For the other four points, the x-axis value indicates when a sensor stops being impacted, which is referred to as the clearance phase. As a byproduct, the impact duration of a sensor can be derived as the time difference between its points in the propagation phase and the clearance phase. In this study, we focus on predicting the impact in the propagation phase.

As shown in Figure 4.2, we observe that the closer a sensor is to the incident location, the earlier it starts to get impacted.
Intuitively, if a sensor $s$ gets impacted at time $t$, all the sensors closer to the incident than $s$ should be impacted before $t$. Therefore, the impact backlog (i.e., spatial span) of a traffic incident is defined as follows:

Definition 3: (Impact Backlog) Given an incident location $l$ on a freeway, an occurrence time $t_0$, and an impact threshold $\lambda$, the impact backlog $b$ at time $t$ ($b_t$) is the road network distance between the occurrence location and the furthest impacted location (with $\Delta v(l,t) \ge \lambda$) along the upstream direction (i.e., the opposite direction of the vehicle flow).

[Figure 4.2: Intersecting Figure 4.1(b) with $\lambda$ = 60% in speed change. Axes: Time (min) vs. Distance (mile); points for sensors S1 to S4.]

In the following, we use the example in Figure 4.2 to explain how to calculate $b_t$ with $\lambda$ = 60%. In this example, sensor S2 (0.9 miles from the incident) starts to get impacted at the 8th minute after the incident. Therefore, the impact backlog at the 8th minute is 0.9 miles. If we take the granularity of the time stamp $t$ in the definition as 1 minute, we can derive $b_8$ = 0.9. Similarly, we can derive $b_1$, $b_{12}$ and $b_{19}$ from sensors S1, S3 and S4, according to the times they get impacted and their distances to the incident location.

4.2.1 Modeling Propagation Behavior

With the notion of impact backlog, the time-varying spatial span of incident impact in terms of propagation behavior is defined as follows:

Definition 4: (Propagation Behavior) Given an incident $e$ at location $l$ that occurred at time $t_0$, and $\lambda$, $e$'s propagation behavior is defined as the time series of impact backlogs after $t_0$ and before it reaches the maximum impact backlog. Assuming $e$ reaches the maximum impact backlog after $t$ minutes, its propagation behavior is represented as $\vec{b}$ or $\{b_0, b_1, \ldots, b_t\}$, where the subscript $i$ of $b_i$ represents the time units after $t_0$. Here, $b_i$ is the distance from the incident location of the point that starts to get impacted at time $t_i$.
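The calculation of the impact backlog in the running example can be illustrated with a minimal sketch. It assumes that the onset time of each sensor (the first minute at which it satisfies the impact threshold) is already known. Only the values for S2 (0.9 miles, 8th minute) come from the text; the distances used for S1, S3 and S4, and the names `ONSETS` and `impact_backlog`, are hypothetical and chosen only for illustration.

```python
# Onset observations for the running example, as (minute, miles) pairs.
# Only S2 (0.9 miles, 8th minute) is given with both values in the text;
# the distances for S1, S3 and S4 are hypothetical placeholders.
ONSETS = [
    (1, 0.4),   # S1 (distance assumed)
    (8, 0.9),   # S2 (from the text)
    (12, 1.3),  # S3 (distance assumed)
    (19, 1.8),  # S4 (distance assumed)
]


def impact_backlog(onsets, t):
    """Return b_t: the largest distance among sensors impacted by minute t.

    Returns 0.0 when no sensor has been impacted yet.
    """
    impacted = [miles for minute, miles in onsets if minute <= t]
    return max(impacted, default=0.0)


print(impact_backlog(ONSETS, 8))   # 0.9
print(impact_backlog(ONSETS, 10))  # 0.9
```

The sketch covers the propagation phase only: once a sensor is counted as impacted it stays impacted, so the backlog never shrinks.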
To calculate the propagation behavior for an incident, one naive way is to record the speed changes at all possible upstream locations. However, this method requires a fairly dense placement of sensors. In most sensor networks, the sensors reporting traffic speed are placed at a certain distance interval (e.g., 0.5 mile). Therefore, due to the limited availability of sensor data, we can only derive the impact backlog at the locations equipped with sensors. To create a continuous propagation behavior, we utilize a fitting strategy. The overall modeling strategy is summarized as follows:

1. We utilize the distance of a sensor from the incident location to represent the impact backlog at time $t$, which is the time stamp at which the sensor starts to get impacted.

2. Consequently, we plot the derived impact backlogs in a 2D space (e.g., the scatter points in Figure 4.3(a)), and train a polynomial function to fit the plotted discrete points.

3. Finally, we utilize the learned fitting function to interpolate the backlogs at the missing time stamps and generate a complete propagation behavior. Figure 4.3(b) shows the propagation behavior for our running example, where the impact backlog $\{b_0, b_1, \ldots, b_{19}\}$ is plotted at each minute.

[Figure 4.3: Sample propagation behavior. (a) Fitting result; (b) Interpolation result. Axes: Time Elapsed (min) vs. Impact Backlog (mile).]

There are alternative modeling approaches, such as representing the propagation behavior by the coefficients of the polynomial fitting function. The advantage of our modeling strategy over this approach is as follows: when we construct the propagation behavior, we use the fitting function only to interpolate the missing impact backlogs; for existing impact backlogs we still use the original data.
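The three modeling steps can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: the polynomial degree (2), the clamping of fitted values at zero, the sensor distances other than the 0.9 miles of S2, and the names `fit_polynomial`, `propagation_behavior` and `OBSERVED` are all hypothetical. The fit is an ordinary least-squares fit solved through the normal equations, written in pure Python so that the block is self-contained.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    n = degree + 1
    # Normal equations A c = r, with A[i][j] = sum(x^(i+j)), r[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    r = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            r[row] -= factor * r[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (r[i] - tail) / a[i][i]
    return coeffs


def propagation_behavior(observed, degree=2):
    """Build [b_0, ..., b_T] from {minute: backlog in miles} at sensor onsets.

    Observed backlogs are kept as they are; only the missing minutes are
    filled in from the fitted polynomial.
    """
    minutes = sorted(observed)
    coeffs = fit_polynomial(minutes, [observed[m] for m in minutes], degree)
    behavior = []
    for m in range(minutes[-1] + 1):
        if m in observed:
            behavior.append(observed[m])  # original data point, no fitting error
        else:
            fitted = sum(c * m ** i for i, c in enumerate(coeffs))
            behavior.append(max(fitted, 0.0))  # assumption: backlog is never negative
    return behavior


# Running example: only the S2 entry (minute 8, 0.9 miles) is from the text;
# the other distances are hypothetical.
OBSERVED = {1: 0.4, 8: 0.9, 12: 1.3, 19: 1.8}
behavior = propagation_behavior(OBSERVED)
print(len(behavior))  # 20 values, b_0 through b_19
print(behavior[8])    # 0.9, the observed value for S2 is preserved
```

Because the observed minutes bypass the fitted function, the stored series reproduces the sensor-derived backlogs exactly.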
However, if we rely on the coefficient vectors of the fitting function, we may introduce fitting error into the original data, which may result in an inaccurate representation of the propagation behavior.

Applications of Propagation Behavior: The prediction of propagation behavior can enable intelligent route planning, effective transportation policy making, and faster traffic emergency responses. In the following, we briefly describe how to use the propagation behavior within route planning applications. For each traffic incident, we can predict multiple propagation behaviors based on different values of $\lambda$. For example, Figure 4.4 illustrates the propagation behavior for the running example under different values of $\lambda$. The value of $\lambda$ can be tuned according to the preference of the end users of a route planning application. For a specific time, by combining the propagation behaviors, we can derive the set of affected road segments under different magnitudes of speed change.

[Figure 4.4: Propagation behaviors under different $\lambda$ ($\lambda$ = 60%, 40%, 20%). Axes: Time Elapsed (min) vs. Distance Towards the Incident (mile).]

For example, in Figure 4.4, at the 15th minute after the occurrence of the incident (i.e., $x$ = 15), the location 1.3 miles towards the incident has at least a 60% speed decrease, the location at 1.7 miles has at least a 40% speed decrease, and the location at 2.0 miles has at least a 20% speed decrease. Similarly, for a specific location, by fixing the $y$ value, we can derive the time at which it starts to experience a 20%, 40% or 60% speed decrease. Such predictive information is crucial for generating travel time weights for road segments near incident locations, which are then used in the fastest path calculation in route planning.

4.2.2 Determining Minimum Impact Threshold ($\lambda_{min}$)

As we discussed, the impact threshold ($\lambda$) is a parameter that represents the magnitude of the speed changes caused by traffic incidents.
In real-world applications, users can tune the value of $\lambda$ in order to obtain different sets of propagation behaviors. This is particularly important when a small difference between the recent traffic speed and the historical average traffic speed does not reflect the impact of traffic incidents. Such small fluctuations may be due to the regular fluctuation of traffic speed or to noisy data from sensors. Thereby, if $\lambda$ is too small, when we observe a speed change at a particular location with $\Delta v \ge \lambda$, we cannot distinguish whether this location is actually impacted by a traffic incident or not. As a consequence, we cannot compute the propagation behavior accurately. In this section, we briefly discuss how to determine the minimum impact threshold that accurately captures speed changes due to incidents.

To calculate the minimum impact threshold ($\lambda_{min}$), we first derive the historical average traffic speed and its 95% confidence interval for each time stamp reported by each sensor. Figure 4.5 illustrates the result for a sample sensor. In this study, we regard any traffic speed reading within the 95% confidence interval as regular fluctuation around the historical average, and the speed readings outside this interval as anomalous readings impacted by traffic incidents.

[Figure 4.5: Confidence interval for a sample sensor. Axes: time of day $t$ vs. Speed (mph); series: Historical Avg, 95% Confidence Interval.]

In the example shown in Figure 4.5, from 6AM to 10AM the confidence interval is small, indicating that the scale of regular speed fluctuation for this sensor in this time interval is fairly small. This phenomenon normally occurs in non-rush hours, when the historical average traffic speed is generally high. In this case, even if a smaller speed change (e.g., 20%) occurs at this sensor during this interval, the sensor can still be considered as impacted by traffic incidents.
On the other hand, during the time period from 2PM to 8PM, when congestion starts to happen, the confidence interval is quite large, indicating that the traffic speed fluctuates widely around the historical average during these hours. In other words, if we set $\lambda$ to a fairly small value (e.g., 20%), we may inaccurately identify the regular speed fluctuation during this interval as the impact of traffic incidents. Thereby, we define the minimum impact threshold ($\lambda_{min}$) at location $l$ and time $t$ as follows:

$$\lambda_{min}(l,t) = \frac{C \cdot \sigma(l,t)}{v_r(l,t)} \qquad (4.3)$$

where $v_r(l,t)$ and $\sigma(l,t)$ refer to the historical average speed at location $l$ at the same time $t$ in the past and its standard deviation, respectively. Here, to ensure a 95% confidence interval, the constant $C$ is set to 1.96 according to the theorems published in [53].

In real-world applications, users should set the impact threshold $\lambda$ to at least $\lambda_{min}$ to ensure accurate modeling of the propagation behavior of traffic incidents. For example, when a traffic incident occurs around the sample sensor illustrated in Figure 4.5, the impact threshold should be set as follows: for incidents that occur during [6AM, 10AM], the speed change should be at least 15% (i.e., $\lambda_{min} \approx 15\%$) to indicate that the sensor is impacted by the incident; during [10AM, 2PM] and [2PM, 8PM], $\lambda_{min}$ should be increased to around 25% and 65%, respectively, to judge whether this sensor is impacted. The effects of choosing different values of $\lambda$ will be further evaluated in the experiments (Section 4.5.2).

4.3 Prediction of Propagation Behavior

In this section, we explain our proposed techniques for predicting the impact of incidents on road networks in terms of propagation behavior (as defined in Section 4.2.1). First, we discuss a baseline approach that groups similar incidents based on their attributes to estimate the impact. However, in some particular cases, although two incidents have similar attributes, their impacts are still highly different from each other.
Therefore, we introduce a new prediction model that addresses the shortcomings of the baseline approach by incorporating traffic density measures such as volume and occupancy. Then, we explain a multi-step prediction approach that takes into account the initial behavior (i.e., a sub-pattern of the propagation behavior) of an incident, which further improves the prediction accuracy.

Our dataset includes three years of historical sensor readings (i.e., speed, volume and occupancy), referred to as traffic data ($D$). Specifically, volume represents the number of cars that pass a sensor within a sampling interval (e.g., 30 seconds), and occupancy represents the percentage of time a sensor is occupied. For example, if the occupancy of a sensor with a sampling interval of 30 seconds equals 10%, vehicles were present on top of this sensor for a total of 3 seconds in the last 30 seconds. In addition to traffic sensor data, we also include a dataset of incident reports with a set of 43 attributes, such as fatality, number of lanes affected, etc., referred to as incident data ($R$). Our impact prediction problem is defined as follows:

Problem Definition: Given an incident $e$ ($e \in R$) that occurred at time $t_0$, and the dataset $D$ collected before $t_0$ (i.e., during $[t_0 - T, t_0]$, where $T$ is the duration of the dataset), predict the propagation behavior of $e$ in the next $t$ time stamps, i.e., $\{b_1, b_2, \ldots, b_t\}$.

4.3.1 Baseline Approaches

In this section we introduce two baseline approaches for impact prediction: a) one based on theoretical traffic flow simulations by [63], and b) one based on a clustering idea using real-world traffic and incident data, proposed in [40].

Theoretical Baseline (TB)

With this baseline approach, we adopt the shock-wave model, which is widely used for estimating the queuing delay caused by traffic incidents [63, 4].
The shock-wave model is developed based on the theory of kinematic waves, which treats traffic flow as a continuous fluid with a flow-density relationship proposed by [30]. This approach assumes that when a lane is blocked by a traffic incident, the congestion propagates linearly at the shock-wave speed. Specifically, a traffic stream can be described by its average flow ($q$) and average density ($k$). A shock wave is created when the state of a traffic stream changes from the regular state ($q_i$, $k_i$) to the state in the presence of a traffic accident ($q_j$, $k_j$). According to [30], the shock-wave speed ($c_{ij}$) is calculated as follows:

$$c_{ij} = \frac{q_i - q_j}{k_i - k_j} \qquad (4.4)$$

Using this model, we calculate the propagation behavior as a linear function between backlog and time, with the gradient equal to the shock-wave speed ($c$). To calculate the shock-wave speed, we utilize the speed and volume readings collected from the closest upstream sensors near the incident location. Specifically, we can directly derive the traffic flow ($q$) from the traffic volume, and estimate the traffic density (vehicle/distance) by dividing the volume (vehicle/time) by the speed (distance/time). Below are the steps we take to predict the impact of a given incident:

1. When a new incident occurs at $t_0$, we derive the regular state ($q_1$, $k_1$) of the traffic stream based on the sensor data collected during $[t_0 - 5, t_0]$, and the incident state ($q_2$, $k_2$) based on the data collected during $[t_0, t_0 + 5]$.

2. We derive the shock-wave speed ($c$) based on Equation (4.4).

3. We calculate the predicted propagation behavior at the $i$-th time stamp after the occurrence of the incident as $b_i = |c| \cdot (t_i - t_0)$.

Data-driven Baseline (DB)

The Data-driven Baseline approach (DB) utilizes the propagation modeling approach discussed in Section 4.2. In addition, DB utilizes incident report data to classify incidents solely based on their attributes for prediction [40].
In particular, we use historical incident information to create classes, which we use to model the impact based on the attributes of incidents. The main intuition here is that the incidents within the same class should be strongly correlated, and hence a given incident $e$ with certain attributes will have a similar impact. The detailed steps of DB are as follows: 1) given the historical incidents and all their attributes, apply a feature subset selection algorithm to identify the set of features that are maximally correlated with their propagation behavior; 2) classify all historical incidents into different groups according to their values of the selected features. For example, if the incident location (e.g., I-5 South) is one of the selected features, all incidents that occurred on freeway I-5 South are put into one group, and within each group, we use the average propagation behavior as the representative for prediction. In this way, when a new incident occurs, we extract its correlated feature values, use them to identify the group it belongs to, and use the representative of that group as the predicted propagation behavior.

With our dataset, we observe that the feature subset selection algorithm selects the following attributes: street name (e.g., I-5 South), start time (i.e., occurrence time), affected number of lanes (i.e., number of lanes blocked by the incident), and incident type (such as traffic collision, road construction, etc.). Therefore, in the rest of this chapter we use these attributes to classify the incidents.

Note that the lengths of the propagation behaviors might differ from each other; therefore, during the calculation of the average propagation behavior (i.e., Step 2 in the prediction phase), its length equals that of the longest propagation behavior in one cluster.
Assuming the length of a propagation behavior is denoted as $t$, each time stamp in the average propagation behavior ($\bar{b}_i$) is calculated as follows:

$$\bar{b}_i = \frac{1}{N} \sum_{p=1}^{N} b_{pi}, \qquad N = \#\text{ of incidents with } t \ge i \qquad (4.5)$$

4.3.2 Prediction with Traffic Density (PAD)

In the baseline approach, we assumed that incidents with similar attributes have similar impacts, and hence classified the incidents based on the values of the selected attributes. However, our observations from the real-world datasets show that in some cases, even if two incidents have similar attributes, their impact propagation behaviors can be significantly different from each other. This is particularly notable when two incidents occur on the same street but on different road segments. For example, consider two incidents (with the same attributes) that occur at rush hour on two different segments, one passing through a downtown area and one through a rural area (significantly less crowded). Obviously, the impacts of these accidents will be different. Therefore, we argue that the traffic "density" around the incident is correlated with its propagation behavior, and hence can improve the prediction accuracy. In the rest of this section, we present two selected case studies to verify our hypothesis and propose an approach that utilizes traffic density.

We quantify the traffic density using two traffic measures: volume (the number of cars passing a sensor location) and occupancy (the percentage of time the sensor is occupied), taken from the sensors on the same street close to the incident location. Note that we cannot use just a single parameter (i.e., either volume or occupancy) to describe traffic density. According to the LWR model published in [30], the traffic volume or occupancy alone cannot precisely describe the congestion state of traffic. For example, the same volume (e.g., 5 vehicles/30 seconds) may occur in either a free-flow situation or a congestion situation.
Only if we use both parameters together can we clearly measure the congestion state of the traffic flow. As we discussed, these measures are available in our sensor dataset. Below we explain the effect of each measure in turn.

Effect of Volume: To illustrate the correlation between volume and propagation behavior, we present two real-world incidents ($e_A$ and $e_B$) that occurred on I-405 S with similar incident attributes but different volume values (i.e., low volume for $e_A$, high volume for $e_B$). Their propagation behaviors are depicted in Figure 4.6(a). As shown, for $e_B$, as the vehicles accumulate quickly (due to the large traffic volume), the impact propagates very fast after a few minutes. On the other hand, the propagation of $e_A$ (with lower volume) is not as fast as that of $e_B$. Hence, it is likely that different volume values result in different propagation behaviors.

Effect of Occupancy: Similar to the volume case study, we show the impact of occupancy using an example. In this case, we choose two incidents that occurred on I-5 S with different occupancy values. Figure 4.6(b) shows the propagation behavior for $e_A$ (with the higher occupancy value) and $e_B$ (with the lower occupancy value). Clearly, the average propagation speed (average curve gradient) for $e_A$ is higher than that of $e_B$. This means that the incident impact propagates faster at more occupied locations, and hence occupancy is also correlated with propagation behavior.

[Figure 4.6: Case studies on traffic environment. (a) Effect of Volume; (b) Effect of Occupancy. Axes: Time Elapsed (min) vs. Impact Backlog (mile); series: Incident A, Incident B.]

As illustrated in the above two case studies, the traffic density measures (volume and occupancy) are very important parameters for predicting the propagation behavior of an incident. Therefore, we incorporate traffic density into our prediction model.
In particular, for each incident, we create a two-dimensional feature vector composed of volume and occupancy values and cluster incidents based on this vector. Our Prediction approach that combines incident Attributes and traffic Density (PAD) is summarized as follows:

Training Phase: 1) classify the historical incidents into groups according to their correlated attributes, as selected in the baseline approach; 2) within each group, cluster all incidents in the feature space composed of the volume and occupancy values, i.e., $\langle v, o \rangle$.

Prediction Phase: for a newly occurred incident $e$, 1) we identify its group based on its correlated attributes, and use its volume and occupancy values to find the cluster ($C$) it belongs to; 2) we select all the archived incidents inside $C$ and use the average of their propagation behaviors as the impact prediction for $e$.

To ensure the cluster quality, we maximize the number of clusters ($k$) while guaranteeing the quality of each cluster, which is measured by the average silhouette coefficient ($s$) defined in [47] (see Footnote 2).

Footnote 2: Specifically, we choose the maximum number of clusters while constraining $s$ to stay in the range (0.5, 0.7], which indicates reasonable evidence for the clustering result.

[Figure 4.7: Sample prediction comparison on I-405 S. (a) PAD approach; (b) PADI approach.]

4.3.3 Prediction with Initial Behavior (PADI)

In the previous section we discussed the PAD model, which improves the accuracy of the baseline approach by using traffic density information. However, there are still other impact-correlated features that PAD does not take into consideration, such as weather conditions or other information that is not available in our dataset. Therefore, in some cases, the accuracy of PAD can still be improved. Figure 4.7(a) shows one such case for a sample cluster learned by PAD.
In this figure, the prediction candidate refers to the average propagation behavior of incidents with similar attributes and traffic density, and the prediction range (i.e., the gray area) is calculated based on the maximum deviation of each instance from the candidate. If we use this candidate to predict the propagation behavior of incidents with the same attributes and density, the prediction error would be non-trivial. To shrink the range of the prediction candidate, we cluster all the propagation behaviors within a group of incidents (with the same attributes and traffic density), and generate multiple prediction candidates. This eliminates the need to rely on a single candidate, the average propagation behavior, for the prediction. Figure 4.7(b) shows a sample candidate and its range after the clustering on propagation behavior.

We organize the training procedure for this method as a hierarchical structure, illustrated in Figure 4.8. Levels I, II and III indicate the successive grouping of incidents based on attributes, density and propagation behavior. One may think of merging all three levels into one level containing all three types of information (i.e., attributes, environment, propagation), and running the clustering algorithm only once. However, it is difficult to balance the weights of the features of the three types of information during clustering. Therefore, the hierarchical structure helps us avoid potential problems in the weight tuning step. During the prediction step, for a given incident, we use its attributes and traffic density to search in the first two levels.
To identify a suitable cluster in Level III, we relax the prediction problem, and use the initial behavior of the accident to match the cluster centroid. The initial behavior is defined as follows:

[Figure 4.8: Hierarchy structure for training. All training incidents are grouped at Level I into classes on incident attributes (e.g., collision on I-405 S, at peak hour, 1 blocked lane), at Level II into clusters on traffic density (centroids $\langle v_i, o_i \rangle$, $\langle v_j, o_j \rangle$), and at Level III into clusters on propagation behavior (centroids $\{b_i\}$, $\{b_j\}$).]

Definition 5: (Initial Behavior) Given an incident $e$ and its propagation behavior $\vec{b}$, i.e., $\{b_1, \ldots, b_t\}$, its initial behavior is defined as the first $h$ time stamps of $\vec{b}$ (i.e., $\{b_1, \ldots, b_h\}$), where $h$ is defined as the forward lag, and $h < t$.

In particular, with the help of the initial behavior, when a new incident $e$ occurs, we match its initial behavior with the first $h$ time stamps (i.e., $\vec{b}_{1 \ldots h}$) of the corresponding propagation behavior centroids in Level III, and identify the closest centroid as the candidate for predicting $\vec{b}_{h+1 \ldots t}$. Note that the initial behavior can be learned from traffic data. Therefore, by considering the initial behavior as input, we relax our prediction problem by assuming knowledge of the traffic in the first few minutes after the occurrence of the incident.

To illustrate the use of the initial behavior, consider the example in Figure 4.9. The prediction candidates (i.e., cluster centroids on propagation behavior) for incidents that occurred on freeway I-405 South with similar attributes and traffic density are illustrated as solid lines in Figure 4.9(a). The black dashed line in Figure 4.9(b) represents the initial behavior in the first 5 minutes of a newly occurred incident. By matching $\{b_0, \ldots, b_5\}$ between its initial behavior and the five prediction candidates, we select the closest cluster centroid to predict the propagation behavior after $b_5$, as depicted in Figure 4.9(b).
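The matching step can be sketched as follows. This is a minimal illustration, not the study's implementation: closeness is taken to be the Euclidean distance (one of several possible metrics), the centroid values are made up, and the names `euclidean` and `predict_from_initial_behavior` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_from_initial_behavior(initial, centroids):
    """Pick the centroid whose first h stamps are closest to `initial`.

    `initial` holds the observed backlogs of the first h time stamps;
    `centroids` is a list of full propagation behaviors (Level III centroids).
    Returns the remaining stamps of the closest centroid as the prediction.
    """
    h = len(initial)
    if any(len(c) <= h for c in centroids):
        raise ValueError("every centroid must be longer than the initial behavior")
    best = min(centroids, key=lambda c: euclidean(initial, c[:h]))
    return best[h:]


# Hypothetical centroids (impact backlog in miles, one value per minute).
centroids = [
    [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    [0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
]
initial = [0.0, 0.45, 0.9]  # observed backlog in the first three minutes
print(predict_from_initial_behavior(initial, centroids))  # [1.5, 2.0, 2.5]
```

The length of `initial` plays the role of the forward lag $h$: a longer initial behavior gives the matching more evidence but delays the prediction by the same number of minutes.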
It is important to note that there exist various metrics for evaluating the "closeness" between the initial behavior and the first $h$ time stamps of the cluster centroids. In our approach, we use both the Euclidean distance and the Mahalanobis distance [33] to measure the closeness. The Mahalanobis distance differs from the Euclidean distance in that it takes into account the correlations in the dataset and is scale-invariant. To measure the difference between propagation behaviors $\vec{b}_1$ and $\vec{b}_2$, the Mahalanobis distance is calculated as follows:

$$d_M(\vec{b}_1, \vec{b}_2) = \sqrt{(\vec{b}_1 - \vec{b}_2)^T S^{-1} (\vec{b}_1 - \vec{b}_2)} \qquad (4.6)$$

where $S$ is the covariance matrix between $\vec{b}_1$ and $\vec{b}_2$. We evaluate the prediction accuracy for both the Euclidean distance and the Mahalanobis distance in Section 4.5.2.

[Figure 4.9: Sample prediction on I-405 S. (a) Prediction candidates; (b) Selected candidate, using the initial behavior up to t = 5 to predict the rest. Axes: Time Elapsed (min) vs. Impact Backlog (mile).]

4.3.4 Discussion

So far, we have discussed the strategy of using traffic density and initial behavior to predict the propagation. Last but not least, we complete the discussion by providing a solution for navigation systems in which the measurement of traffic density is either unavailable or inaccurate. In particular, although navigation systems based on crowdsourcing still have access to incident reports and speed changes, it is generally challenging for them to obtain accurate traffic density measurements such as volume and occupancy around the incidents. Therefore, for these systems, we provide a similar Prediction strategy that uses only incident Attributes and Initial behavior (PAI). To implement this strategy, instead of completing the three levels shown in Figure 4.8, we only utilize the first and third levels.
That is, during the training phase, we first group the historical incidents by their attributes and then run the clustering algorithm on their propagation behaviors; during the prediction phase, we utilize the attributes and the initial propagation behavior of the incoming incident to predict its propagation behavior.

4.4 Prediction of Clearance Behavior

As discussed in Section 4.2, the impact behavior of incidents in the clearance phase can be modeled in a similar way to the propagation behavior. According to the explanation regarding Figure 4.2, the behavior in the propagation phase can be modeled by the time stamps when a sensor starts to get impacted. Similarly, the behavior in the clearance phase can be derived from the time stamps when a sensor stops being impacted (i.e., the first time stamp with $\Delta v < \lambda$). In this way, we can define the clearance behavior analogously to the propagation behavior in Definition 4. However, the prediction of clearance behavior is more challenging than the prediction of propagation behavior. In this section, we explain these challenges, and briefly discuss our strategy for solving them.

[Figure 4.10: Behavior learning on two sample incidents. (a) Incident-A on I-405 N; (b) Incident-B on I-5 S. Axes: Time Elapsed (min) vs. Impact Backlog (mile); series: Propagation, Clearance, Fit (Propagation), Fit (Clearance).]

Consider the impact behavior of the two incidents illustrated in Figure 4.10. In this figure, each pair of rectangle and triangle markers (with the same impact backlog) indicates the start and end time stamps of a particular sensor being impacted, respectively. As shown in Section 4.2, the propagation and clearance behavior of each incident can be learned by fitting the markers with polynomial functions, which are illustrated as solid curves in Figure 4.10.
In the following, we use these two incidents as examples to explain two challenges in predicting clearance behavior.

The first new challenge in predicting the clearance behavior lies in the location where the clearance behavior starts. In the prediction of propagation behavior, we assume the propagation always starts at the location of the traffic incident. However, this is not the case for the clearance behavior. In the case illustrated in Figure 4.10(a), the sensor closest to the incident location is cleared first, followed by the second closest sensor and so on, indicating that the clearance behavior starts at the incident location. On the other hand, in the case illustrated in Figure 4.10(b), the sensor furthest from the incident location is cleared first, followed by the second furthest sensor, which suggests that the clearance behavior starts at the location with the maximum backlog. Based on these two cases, in the prediction of clearance behavior, it is important to identify the location where the clearance begins. If we cannot identify the correct starting location, the pattern of the clearance behavior will be predicted in a completely wrong way. As suggested by the two case studies illustrated in Figure 4.10, the fitting function of the clearance behavior for incident-A has a positive gradient, while for incident-B it has a negative gradient. Moreover, predicting the starting location of the clearance behavior is important for navigation systems to identify the size and the location of the impact area. As shown in Figure 4.10(a), the location and the size of the impact area for incident-A change dynamically over time. For example, at the 20th minute, its impact area starts at 0 miles towards the incident and reaches a 2.5-mile backlog. But at the 30th minute, the impact area starts at the 3rd mile towards the incident and reaches the 4th mile, indicating that its size is only around 1 mile.
On the other hand, for incident-B illustrated in Figure 4.10(b), only the size of the impact area changes, and the starting location of the impact area is always the incident location. In sum, if we cannot accurately predict the starting location of the clearance behavior, navigation systems cannot accurately capture the changes of the impact area, and consequently cannot provide an efficient routing plan.

The second new challenge comes from the beginning time stamp of the clearance behavior. During the prediction of propagation behavior, we assume the propagation starts right after the occurrence of the incident. Thereby, the beginning time stamp of the propagation behavior is the incident's occurrence time, which can be acquired from the incident's report. Similarly, we could assume that the beginning time stamp of the clearance behavior is the incident clearance time (e.g., when the involved vehicle leaves the scene or when the blocked lane is cleared). However, for recently occurred incidents, the clearance time stamps are always unknown at the time when they are reported. Moreover, even if the clearance time is already known, we cannot directly use it as the beginning point of the clearance behavior. In fact, for some traffic incidents, even after the incident is cleared, drivers may still slow down to look at the scene out of curiosity, in particular for severe incidents such as a car on fire. In this case, the traffic congestion caused by the traffic incident is still not resolved, so the clearance time of the scene cannot be considered as the starting point of the clearance behavior. Thereby, for predicting the clearance behavior, we also need to predict the time stamp when the clearance behavior begins.

One way to address the two challenges mentioned above is to directly calculate the starting location and time of the clearance behavior based on the sensor data.
The location can be derived by tracking the speed readings from the closest sensor and the furthest sensor in the impact region and checking whose speed reading recovers to the regular speed first. The starting time is then the first speed recovery time among the two sensors. If the closest sensor recovers first, the clearance behavior should be similar to that of incident A shown in Figure 4.10(a); otherwise, it should be similar to that of incident B shown in Figure 4.10(b). However, this solution only works at the time stamp when the incident starts to clear. In other words, it cannot predict the clearance behavior immediately after the incident occurs; instead, it needs to wait until an impacted sensor reports speed recovery. As a result, the application of this solution is very limited due to its short prediction interval.

Therefore, as an alternative, we try to predict the starting time and location of the clearance behavior at the occurrence time of the incident by identifying correlations between clearance behaviors and incident attributes. For example, we observe that small-scale traffic collision incidents that occur during non-rush hours are always cleared very fast, with the clearance behavior always starting at the incident location, as depicted in Figure 4.10(a). On the other hand, severe incidents such as car-on-fire take longer to clear (they attract more of the drivers' attention), and their clearance behaviors always start at the furthest location, as depicted in Figure 4.10(b). Driven by such observations, to predict the clearance behavior, we first identify the set of incident attributes that are correlated with the start time and location of the clearance behavior, and then use the values of these attributes for a newly occurred incident for the prediction (i.e., similar to the DB approach).
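The sensor-based calculation described in this section (comparing the recovery of the closest and the furthest sensor) can be sketched roughly as follows. This is only an illustration under stated assumptions: the text does not define when a speed reading counts as "recovered", so the sketch assumes a sensor has recovered once its speed reaches 90% of its regular speed, and it assumes both series begin while the sensors are already impacted. The function names, the tolerance, the tie-breaking rule, and the sample readings are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_recovery_minute(speeds: Sequence[float], regular_speed: float,
                          tolerance: float = 0.9) -> Optional[int]:
    """Return the first minute at which the speed is back near regular speed.

    The test speed >= tolerance * regular_speed is an assumption of this
    sketch, not a criterion taken from the dissertation.
    """
    for minute, speed in enumerate(speeds):
        if speed >= tolerance * regular_speed:
            return minute
    return None  # the sensor has not recovered within the observed window


def clearance_start(closest: Sequence[float], furthest: Sequence[float],
                    regular_closest: float, regular_furthest: float
                    ) -> Optional[Tuple[str, int]]:
    """Decide where and when clearance starts from two per-minute speed series."""
    t_close = first_recovery_minute(closest, regular_closest)
    t_far = first_recovery_minute(furthest, regular_furthest)
    if t_close is None and t_far is None:
        return None  # no recovery observed yet, so nothing can be derived
    # Ties are resolved in favor of the incident location (arbitrary choice).
    if t_far is None or (t_close is not None and t_close <= t_far):
        return ("incident_location", t_close)
    return ("furthest_location", t_far)


# Hypothetical readings (mph): the closest sensor recovers first.
print(clearance_start([20, 25, 30, 55, 60], [30, 28, 27, 35, 40], 60.0, 60.0))
```

Because the function returns nothing until one of the two sensors has recovered, the sketch also makes the limitation discussed above visible: the answer only becomes available once clearance is already under way.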
After the starting time and location are predicted, we follow the same strategy proposed in Section 4.3 to predict the entire time series for the clearance behavior.

4.5 Experiments

4.5.1 Experimental Setup

In the experiments, we evaluated our approaches with real-world traffic and incident datasets. First, we evaluate our proposed impact prediction techniques on propagation behavior under various parameters. Second, we briefly show our results in predicting the clearance behavior. Finally, we conduct two case studies on real-world traffic incidents to demonstrate the advantage of using our approach to estimate travel time in navigation systems.

Data Set

At our research center, we maintain a very large-scale and high-resolution (both spatial and temporal) dataset collected from the highways and arterial streets of the entire LA County [46]. We have been continuously collecting and archiving the data for the past three years. We use this real-world dataset to create and evaluate our techniques. This dataset includes:

1. Traffic data: collected from traffic sensors covering approximately 5000 miles. The sensors report occupancy, volume and speed values.
2. Incident data: collected from various agencies including California Highway Patrol (CHP), LA Department of Transportation (LADOT), and California Transportation Agencies (CalTrans).

The statistics of this dataset are given in Table 4.1.

Table 4.1: Dataset for evaluation of prediction accuracy

Category        Attribute             Value
                data duration         Jun. 1st - Jul. 7th
Traffic data    # of sensors          4,230
                sampling rate         1 reading/30 secs
                aggr. interval        1 min
                spatial range         OC & LA County
Incident data   # of incidents        6,811
                # of attributes       43
                updating rate         1 min
                spatial range         OC & LA County

4.5.2 Evaluation of Propagation Prediction

Evaluation Method

For evaluating propagation behavior, we first use two case studies to reveal the effectiveness of traffic density and initial behavior in the prediction of impact.
Then, we evaluate the overall prediction accuracy under various system parameters, listed in Table 4.2. For each set of experiments, we vary only one parameter and fix the remaining ones to their default values. Specifically, the occurrence time refers to the first report time of the incident. The peak hours and non-peak hours refer to the time intervals [6AM, 10AM], [3PM, 7PM] and [10AM, 3PM], [7PM, 9PM], respectively. In addition, the prediction interval $t$ is set to 30 (indicating 30 minutes) as the default value to evaluate the results. This means that we evaluate our approach by forecasting the time series $\{b_1, b_2, \ldots, b_{30}\}$, where $b_i$ refers to the backlog at the $i$-th minute after $t_0$.

The prediction accuracy is measured by the root mean square error between the predicted propagation behavior (i.e., $\{\hat{b}_i\}$) and the actual propagation behavior $\vec{b}$ (i.e., $\{b_i\}$):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(b_i - \hat{b}_i\right)^2} \qquad (4.7)$$

In the experiments, we compare the following techniques: the theoretic baseline (TB), the data-driven baseline based solely on attributes of incidents (DB), Prediction with Attributes and traffic Density (PAD), Prediction with Attributes, Density and Initial behavior (PADI), and Prediction based on Attributes and Initial behavior only (PAI) for transportation systems without density information.

Case Studies

In this section, we select two traffic incidents (both collision accidents) and compare the prediction accuracy of the two baseline approaches with that of PAD and PAI, to illustrate the effectiveness of traffic density and initial behavior independently. The results are shown in Figure 4.11, where the solid black line indicates the actual propagation behavior interpolated from the actual sensor readings.

Table 4.2: Evaluation parameters

Parameter               Default       Range
Impact threshold (λ)    20            {20, 40, 60} (%)
Forward lag (h)         5             {0, 2, 5, 10} (min)
Occurrence time (t)     Peak hours    {Peak hours, Off-peak hours}
Distance metric         Euclidean     {Euclidean, Mahalanobis}
Prediction interval     30            [0, 30] (min)
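As a small worked illustration of the error measure in Equation (4.7), the following sketch computes the RMSE between an actual and a predicted backlog series. The function name and the sample backlog values are hypothetical and are not taken from the experiments.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long backlog series (miles)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(b - b_hat) ** 2 for b, b_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical backlog values (miles) for the first four minutes after t0.
actual_backlog = [0.0, 0.5, 1.0, 1.5]
predicted_backlog = [0.0, 1.0, 1.0, 2.0]
print(rmse(actual_backlog, predicted_backlog))
```

With the default prediction interval, N would be 30 and each series would hold one backlog value per minute.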
Figure 4.11(a) and Figure 4.11(b) depict the traffic collision incidents that occurred on the I-405 North freeway and on the I-5 South freeway, respectively. In the first case study, the propagation behavior predicted by the theoretical baseline (TB) is close to the ground truth in the initial 15 minutes, but deviates strongly from the ground truth from 15 minutes to 30 minutes. The data-driven baseline (DB, i.e., the baseline approach based solely on the incident's attributes) yields the worst accuracy (i.e., its predicted pattern is furthest from the actual one). In this case, the theoretical model outperforms the baseline based solely on incident attributes, but is still not as good as the other two approaches using real-world traffic data (i.e., PAI and PAD). In the second case, although DB outperforms TB from the 18th minute after the incident's occurrence, the prediction accuracy of both is lower than that of PAI and PAD. Such observations indicate that theoretical models may provide accurate predictions at the beginning of an incident. But as time elapses, it is likely that more complex situations in the traffic stream (e.g., lane switching) cannot be accurately modeled by the shockwave theories. Such problems do not exist in the approaches utilizing real-world data, such as PAI and PAD.

From this set of case studies, we can also observe that sometimes PAI yields the best accuracy (i.e., case (a)) and sometimes PAD yields the best accuracy (i.e., case (b)), and both of them outperform the two baseline approaches in both cases. This observation implies that 1) the use of traffic density and initial behavior can improve the prediction accuracy compared with the baseline approach; and 2) both of them are necessary for the improvement of prediction accuracy, since the results show that they contribute to the improvement in different ways in different cases.
Because both traffic density and initial behavior are necessary in the prediction, there is no clear ranking between PAI and PAD.

Figure 4.11: Case studies on two sample incidents. (a) Sample incident on I-405 N. (b) Sample incident on I-5 S. Each panel plots impact backlog (mile) against time elapsed (min) for Truth, TB, DB, PAD, and PAI.

Effects of Impact Threshold (λ)

In this set of experiments, we compare the prediction accuracy under different values of λ. Figure 4.12(a) depicts the average prediction error for 905 incidents in the test data for the three approaches with the available traffic sensor dataset. Note that since the theoretical baseline approach does not utilize the proposed modeling strategy of propagation behavior, in this experiment we only compare the proposed approaches with the data-driven baseline approach (DB). As shown, both PAD and PADI outperform the data-driven baseline approach, and the percentage of their improvement over the baseline is listed in Table 4.12(b). In addition, as illustrated in Figure 4.12(a), as λ increases, the prediction error decreases regardless of which approach is used.

Figure 4.12: Effect of impact threshold (λ). (a) Average prediction error: RMSE of DB, PAD, and PADI for λ = 20%, 40%, and 60%. (b) Improvement over baseline:

λ       PAD      PADI
20%     4.2%     11.3%
40%     7.4%     39.1%
60%     23.9%    45.8%

To investigate the reason for this phenomenon, we conduct a case study based on an incident that occurred on I-405 South during off-peak hours (see Figure 4.13). In fact, when we increase λ, the number of impacted sensors decreases as well. Figure 4.13(a) shows the interpolation result when we create the propagation behavior with respect to different λ values. Each scatter point (x, y) represents a sensor located at y that starts to get impacted at time x. The dashed lines represent the fitted curves for the corresponding sets of scatter points. Table 4.13(b) shows the average fitting error for each fitted curve. As illustrated, the larger λ is, the smaller the error in the fitting result.
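The way λ shapes the scatter points in the case study can be sketched as follows. The sketch assumes, purely for illustration, that a sensor counts as impacted at the first minute its speed falls at least λ percent below its regular speed; the impact definition actually used is given earlier in the dissertation and may differ. All function names, the single shared regular speed, and the speed readings are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def impact_onset(speeds: Sequence[float], regular_speed: float,
                 lam_percent: float) -> Optional[int]:
    """First minute at which speed is at least lam_percent below regular speed."""
    limit = regular_speed * (1.0 - lam_percent / 100.0)
    for minute, speed in enumerate(speeds):
        if speed <= limit:
            return minute
    return None  # the sensor is never impacted under this threshold


def scatter_points(sensors: Dict[float, Sequence[float]], regular_speed: float,
                   lam_percent: float) -> List[Tuple[int, float]]:
    """Return (onset minute x, sensor location y in miles) for impacted sensors."""
    points = []
    for location, speeds in sorted(sensors.items()):
        onset = impact_onset(speeds, regular_speed, lam_percent)
        if onset is not None:
            points.append((onset, location))
    return points


# Hypothetical per-minute speeds (mph), keyed by distance from the incident (miles).
sensors = {
    0.5: [60, 30, 20, 20, 25],
    1.5: [62, 58, 40, 30, 28],
    2.5: [61, 60, 55, 47, 45],
}
for lam in (20, 40, 60):
    print(lam, scatter_points(sensors, regular_speed=60.0, lam_percent=lam))
```

In this toy data, raising λ drops the sensors whose speed only dips mildly, so fewer scatter points remain for the curve fitting.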
The fitting error shrinks because a large λ is more likely to exceed the minimum impact threshold (λ_min), so we can accurately identify that a speed change is caused by the traffic incident rather than by regular speed fluctuation. For a smaller λ value, since we cannot distinguish whether the speed change comes from regular fluctuation or from the impact of the traffic incident, we may introduce more fitting errors in the propagation behavior.

Figure 4.13(a), fitting graphs: impact backlog (mile) against time elapsed (min) for λ = 20%, 40%, and 60%, each with its fitted curve.

Table 4.13(b), average fitting error avg(ε_f):

λ       avg(ε_f)
20%     0.21
40%     0.03
60%     0.01

The result also indicates
In general, as h increases, the prediction accuracy from both approaches increase. This is because the longer time we observe on the impact backlog time series, the better estimation we can conclude for the rest of the behavior. However, for some cases, there is an slight increase of prediction error when h increases from 0 to 2. One explanation of such phenomenon is that the propagation behavior for the first 2 minutes is noisy, which may due to the difference in people’s immediate reactions to the incidents. For example, in the very beginning of the incidents, whether to move incident scene from the middle of the road to the shoulder may greatly affects the incident propagation behavior. Thereby, instead of enhancing the prediction accuracy, the initial propagation behavior may introduce more prediction error. 0.8 1 1.2 1.4 1.6 1.8 2 025 10 RMSE Forward lag (h) I-10 W I-5 S I-405 S (a) AP approach 0.8 1 1.2 1.4 1.6 1.8 2 025 10 RMSE Forward lag (h) I-10 W I-5 S I-405 S (b) AEP approach Figure 18. Effect of forward lag (h) 5) Effects of Distance Metric: In this set of experiments, we compare the prediction accuracy by tuning the distance metric when matching the initial propagation behavior in the AEP. Figure 15 illustrates the prediction accuracy for all selected freeways under the Euclidean distance metric and Mahalanobis distance metric. As shown, for prediction of impact on some freeways (such as I-405 S and I-405 N), the use of Mahalanobis distance improves the accuracy. On the other hand, for prediction on freeways such as I-10 E and I-5 N, the use of Euclidean distance has a better result. (b) Fitting error Figure 4.13: Case study on impact threshold that if the propagation behavior is quantified in a more accurate way, the prediction accuracy is also higher. 
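The role of the impact threshold can be illustrated with a small sketch. It assumes, as an illustration rather than the exact procedure of this work, that a sensor counts as impacted once its relative speed drop against an expected speed reaches λ, and that the impact backlog at minute t is the distance of the furthest sensor impacted so far. The function names, the data layout and the toy speed readings are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumed layout: distance upstream of the incident (mile) -> speed reading per minute.
Readings = Dict[float, List[float]]


def first_impact_times(expected: Dict[float, float], observed: Readings,
                       lam: float) -> Dict[float, Optional[int]]:
    """For each sensor, return the first minute whose relative speed drop is >= lam."""
    result: Dict[float, Optional[int]] = {}
    for location, speeds in observed.items():
        result[location] = None  # None means the sensor is never flagged as impacted
        for minute, speed in enumerate(speeds):
            drop = (expected[location] - speed) / expected[location]
            if drop >= lam:
                result[location] = minute
                break
    return result


def impact_backlog(first_times: Dict[float, Optional[int]], horizon: int) -> List[float]:
    """Backlog at minute t: furthest sensor impacted by minute t (0.0 if none)."""
    series = []
    for t in range(horizon):
        reached = [loc for loc, start in first_times.items()
                   if start is not None and start <= t]
        series.append(max(reached, default=0.0))
    return series


# Toy data (hypothetical): the sensor at 1.5 miles only shows a noisy 25% dip.
expected = {0.5: 60.0, 1.0: 60.0, 1.5: 60.0}
observed = {
    0.5: [60.0, 30.0, 20.0, 20.0],
    1.0: [60.0, 57.0, 33.0, 21.0],
    1.5: [45.0, 60.0, 60.0, 45.0],
}

for lam in (0.2, 0.4, 0.6):
    times = first_impact_times(expected, observed, lam)
    print(lam, impact_backlog(times, horizon=4))
```

In this toy case, λ = 0.2 counts the noisy dip at the 1.5-mile sensor as impact from minute 0, whereas λ = 0.4 and λ = 0.6 ignore it, which mirrors the argument that a larger threshold is less sensitive to regular speed fluctuation.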
When λ is small, we may not accurately identify the speed changes, especially when λ is less than the minimum impact threshold ($\lambda_{\min}$), which yields lower prediction accuracy as a consequence. According to our studies within the dataset, the scale of speed changes caused by a freeway traffic incident can reach 85% to 90%. Thereby, λ can be set as large as 85% or 90%. When transportation agencies would like to know the locations most severely impacted by a traffic incident, they need to increase the impact threshold (λ) to a larger value to eliminate the locations with less severe impacts. Furthermore, we also observe that larger λ values cause shorter propagation behavior. This is because, given an incident, a significant speed decrease normally propagates a shorter distance than trivial speed changes do. Thereby, the propagation behavior under a large λ value has a shorter duration and is easier to predict. In sum, with larger λ, the propagation behavior is modeled more accurately (i.e., with less fitting error), and is hence easier to predict.

Effects of Forward Lag (h)

In our impact prediction framework, system users can also tune the forward lag in the impact prediction. It depends on how far in advance the users try to predict the impact caused by the incidents. For example, for navigation applications, it is sometimes important to predict traffic conditions 30 minutes in advance to calculate the fastest path. Thereby, in this set of experiments, we study the effect of the forward lag (h) on the prediction accuracy (see Figure 4.14). We only evaluate the prediction accuracy of PAI and PADI, as there is no initial behavior pattern matching step in the two baseline approaches and PAD. It is important to note that when h = 0, PAI and PADI reduce to DB and PAD, respectively. Figure 4.14(a) depicts the average prediction accuracy of PAI and PADI as the forward lag varies from 0 to 10. Here, the unit of h is minutes.
Table 4.14(b) shows the improvement of PADI over PAI for different values of h.

[Figure 4.14: Effect of forward lag (h). (a) Average prediction error: RMSE versus forward lag (min) for PAI and PADI. (b) PADI over PAI:]

h (min)   Improv.
0         4.2%
2         6.0%
5         7.2%
10        19.2%

In general, as h increases, the prediction accuracy of both PAI and PADI increases. This is because using a longer period of initial behavior as an indicator yields a better estimate. However, in some cases there is a slight increase in prediction error (e.g., when h increases from 0 to 2 minutes). One explanation is that the propagation behavior in the first 2 minutes is noisy, which may be due to differences in the drivers' immediate reactions to the incidents.
For example, at the very beginning of an incident, whether drivers stay on the road or move to the shoulder to take an exit may greatly affect the incident propagation behavior.

Effects of Occurrence Time

To evaluate the prediction accuracy of all approaches on real-world traffic data, we compare the prediction accuracy under different occurrence times of traffic incidents, as illustrated in Figure 4.15. The x-axis of this figure combines the incident location and the category of occurrence time, where “peak” means the peak hours, i.e., the rush hours in the morning and in the afternoon, and “off” means the off-peak hours, i.e., the time intervals when the road network is less crowded, as explained in Section 7.2.1. As shown in Figure 4.15, the DB approach yields the worst prediction accuracy, followed by the PAD and PAI approaches, and the PADI approach is the best of all. The figure also shows that sometimes PAD yields better accuracy than PAI (e.g., I-5 S Peak) and sometimes PAI yields better accuracy (e.g., I-405 S Peak), which suggests that both the initial propagation behavior and the environment information in terms of volume/occupancy are essential for enhancing the prediction accuracy. Moreover, we observe that for all approaches, the prediction accuracy in the off-peak hours is higher than that in the peak hours, which implies that the propagation behavior of incidents during off-peak hours is easier to predict.

Effect of Distance Metric

In this set of experiments, we compare the prediction accuracy under different choices of the distance metric used to match the initial behavior in PADI.
[Figure 4.15: Effects of incident occurrence time. RMSE of DB, PAD, PAI and PADI for I-10 W, I-5 S and I-405 S, each during peak and off-peak hours.]

[Figure 4.16: Effects of distance metric on PADI approach. (a) Prediction accuracy on freeways: RMSE under the Euclidean and Mahalanobis metrics for I-405 N, I-10 W, I-5 S, I-405 S, I-5 N and I-10 E. (b) Sub-predictor: impact backlog (mile) over the first 5 minutes for I-10 E and I-405 S.]

Figure 4.16(a) illustrates the prediction accuracy for the top six freeways with the most incident occurrences using the Euclidean and Mahalanobis distance metrics. As shown, the relative performance of the two metrics varies by freeway. For example, while Mahalanobis distance yields better results on I-405 South and I-405 North, Euclidean distance is better for I-10 East and I-5 North. To investigate the reason, we plot the first 5 minutes of the training results for two selected freeways (see Figure 4.16(b)). Specifically, we choose the clusters for I-405 S and I-10 E to represent the cases with better prediction under the Mahalanobis and Euclidean distance metrics, respectively. As shown in Figure 4.16(b), the first five minutes of the cluster centroids on I-405 S present distinct patterns from each other.
Thereby, the Mahalanobis distance metric is more helpful in selecting the centroids for prediction, because it measures the correlative distance between two variables. However, the first five minutes of the cluster centroids on I-10 E follow a similar pattern (i.e., curves with similar gradients), which means they are already highly correlated with each other. In this case, correlation is no longer a good discriminator, and we need to use scale information to distinguish the centroids from each other. Therefore, the Euclidean distance metric introduces lower prediction error in this case. To effectively select the distance metric in our techniques, we evaluate the degree of pattern correlation in the first h minutes of the cluster centroids trained by the PADI approach, and set specific thresholds to decide the better metric accordingly.
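The difference between the two metrics when matching an observed initial behavior against cluster centroids can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the covariance matrix is assumed to be given (for instance, estimated from historical initial behaviors), the linear solver is a generic Gauss-Jordan elimination, and all names and the two-minute toy vectors are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def euclidean(x: Vector, c: Vector) -> float:
    """Plain Euclidean distance: sensitive to scale, ignores correlation."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))


def solve(a: Matrix, b: Vector) -> List[float]:
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for k in range(col, n + 1):
                    m[r][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis(x: Vector, c: Vector, cov: Matrix) -> float:
    """sqrt((x - c)^T cov^{-1} (x - c)), computed without forming the inverse."""
    diff = [a - b for a, b in zip(x, c)]
    z = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, z)))


def nearest_centroid(x: Vector, centroids: Dict[str, Vector],
                     distance: Callable[[Vector, Vector], float]) -> str:
    """Name of the centroid closest to x under the given distance function."""
    return min(centroids, key=lambda name: distance(x, centroids[name]))


# Toy example (hypothetical): impact backlog in the first two minutes.
cov = [[2.0, 1.0], [1.0, 2.0]]  # assumed covariance, positively correlated minutes
centroids = {"A": [2.2, 2.2], "B": [2.0, 0.0]}
observed = [1.0, 1.0]

by_euclid = nearest_centroid(observed, centroids, euclidean)
by_mahal = nearest_centroid(observed, centroids,
                            lambda x, c: mahalanobis(x, c, cov))
print(by_euclid, by_mahal)
```

Here the two metrics select different centroids: Euclidean distance favors the centroid that is closest in scale, while Mahalanobis distance discounts deviations along the correlated direction and favors the centroid whose pattern agrees with the observed one.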
Effects of Prediction Interval

[Figure 4.17: Overall results. (a) Effects of prediction interval: Avg(ε) versus prediction interval (min) for DB, PAD and PADI. (b) Accuracy summary:]

T (min)   PADI
15        91.7%
20        84.2%
30        72.4%

We conclude the evaluation of predicting propagation behavior using Figure 4.17, which illustrates the overall performance of the three prediction approaches (i.e., the data-driven baseline (DB), PAD and PADI) over the prediction time interval (t). In this experiment, λ and h are set to 60% and 5 minutes, respectively. To compute the prediction result, we directly calculate the difference between the actual propagation behavior and the predicted one at each time stamp. That is, for each incident at time t, $\varepsilon_t$ is defined as $|b_t - \hat{b}_t|$, where $b_t$ refers to the impact backlog of the actual propagation behavior at time t, and $\hat{b}_t$ refers to the impact backlog of the predicted behavior at t. As shown in Figure 4.17(a), the prediction error increases as t increases. For example, the prediction of the impact backlog at the 10th minute is more accurate than the same prediction at the 30th minute.
In addition, at any time stamp, PADI outperforms both PAD and the baseline. To calculate the percentage of incidents that are accurately predicted before time T, for each incident that occurred at $t_0$, we consider its impact accurately predicted if the following inequality is satisfied:

$$\operatorname{avg}\left(\varepsilon_{[t_0, T]}\right) \le \gamma \qquad (4.8)$$

where γ is set to 0.5 mile according to the sensor placement configuration on Los Angeles freeways (i.e., the average sensor placement interval is 0.5 mile). Since our approach is based on the interpolation of traffic between sensors, the average estimation error introduced by the availability of sensor data is also 0.5 mile. Under these circumstances, if the average error for an incident i before T is no more than the internal estimation error, we define the impact of incident i as accurately predicted. Table 4.17(b) summarizes the percentage of incidents that are accurately predicted under different time intervals T by our best approach, PADI. As shown, for predicting the spatial span with 60% travel time delay at the 15th, 20th and 30th minutes after the occurrence of incidents, our best solution reaches a prediction accuracy of 91.7%, 84.2% and 72.4%, respectively.
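The accuracy criterion in Equation (4.8) can be sketched in a few lines. This is an illustrative reading, assuming that the impact backlog is sampled once per minute, that the interval [t_0, T] is inclusive, and that series are indexed by minute; the function names and the toy series are hypothetical.

```python
from typing import List, Sequence, Tuple

Series = Sequence[float]  # impact backlog (mile), one value per minute (assumed)


def average_error(actual: Series, predicted: Series, t0: int, T: int) -> float:
    """Mean of |b_t - b_hat_t| over the minutes t0..T (inclusive)."""
    errors = [abs(actual[t] - predicted[t]) for t in range(t0, T + 1)]
    return sum(errors) / len(errors)


def share_accurately_predicted(incidents: List[Tuple[Series, Series]],
                               t0: int, T: int, gamma: float = 0.5) -> float:
    """Fraction of incidents whose average error over [t0, T] is at most gamma."""
    hits = sum(1 for actual, predicted in incidents
               if average_error(actual, predicted, t0, T) <= gamma)
    return hits / len(incidents)


# Toy data (hypothetical): (actual backlog, predicted backlog) per incident.
incidents = [
    ([0.0, 0.5, 1.0, 1.5], [0.0, 0.75, 1.25, 1.0]),  # average error 0.25 mile
    ([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 1.0]),    # average error 1.0 mile
]

share = share_accurately_predicted(incidents, t0=0, T=3)
print(share)
```

With γ = 0.5 mile, only the first toy incident satisfies the inequality, so half of the incidents count as accurately predicted.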
In general, as h increases, the prediction accuracy from both approaches increase. This is because the longer time we observe on the impact backlog time series, the better estimation we can conclude for the rest of the behavior. However, for some cases, there is an slight increase of prediction error when h increases from 0 to 2. One explanation of such phenomenon is that the propagation behavior for the first 2 minutes is noisy, which may due to the difference in people’s immediate reactions to the incidents. For example, in the very beginning of the incidents, whether to move incident scene from the middle of the road to the shoulder may greatly affects the incident propagation behavior. Thereby, instead of enhancing the prediction accuracy, the initial propagation behavior may introduce more prediction error. 0.8 1 1.2 1.4 1.6 1.8 2 025 10 RMSE Forward lag (h) I-10 W I-5 S I-405 S (a) AP approach 0.8 1 1.2 1.4 1.6 1.8 2 025 10 RMSE Forward lag (h) I-10 W I-5 S I-405 S (b) AEP approach Figure 18. Effect of forward lag (h) 5) Effects of Distance Metric: In this set of experiments, we compare the prediction accuracy by tuning the distance metric when matching the initial propagation behavior in the AEP. Figure 15 illustrates the prediction accuracy for all selected freeways under the Euclidean distance metric and Mahalanobis distance metric. As shown, for prediction of impact on some freeways (such as I-405 S and I-405 N), the use of Mahalanobis distance improves the accuracy. On the other hand, for prediction on freeways such as I-10 E and I-5 N, the use of Euclidean distance has a better result. (b) Fitting error Figure 12. Case study on impact threshold 3) Effects of Forward Lag (h): In this set of experiments, we study the effect of forward lag (h) length over the prediction accuracy (see Figure 13). We only evaluate the closest furthest Distr. 73.5% 26.5% Pred. 
Accur 82.3% prediction accuracy based on PAI and PADI as there is no initial behavior pattern matching step in the Baseline and PAD approaches. It is important to note that whenh=0, PAI, PADI are reduced to Baseline and PAD, respectively. Figure 13(a) depicts the average prediction accuracy of PAI and PADI by varying the forward lag from 0 to 10. Here, the unit of h is minute. Table 13(b) shows the improvement of PADI over PAI regarding different values of h. In general, as h increases, the prediction accuracy of both PAI and PADI increases. This is because the longer time using initial behavior as indicator yields better estimation. However, for some cases, there is an slight increase in prediction error (e.g., when h increases from 0 to 2 minutes). One explanation for this case is that the propagation behavior for the first 2 minutes is noisy, which may due to the difference in immediate reactions of the drivers to the incidents. For example, at the very beginning of the incidents, whether to stay of the road or move to the shoulder to take an exit may greatly affects the incident propagation behavior. 0.8 1 1.2 1.4 025 10 RMSE Forward lag (min) PAI PADI (a) Average prediction error Improv. h=0 4.2% h=2 6.0% h=5 7.2% h=10 19.2% Table VII AEP→AP VI. CONCLUSIONS In this paper, we model the incident spatiotemporal impact as a time series of impact backlog in terms of propagation behavior on urban road network and predict the propagation behavior under certain speed changes for newly occurred incidents. By evaluating based on a real traffic sensor datasets and incident reports, we show that our proposed prediction algorithm utilizing environment information and initial propagation behavior significantly improves the pre- diction accuracy of existing approaches based on incident attributes up to 45.8%. 
In particular, for predicting the set of road segments with 60% travel time delay in 15 th , 20 th and 30 th minutes after the occurrence of incidents, our best solution reaches the prediction accuracy of 91.7%, 84.2% and 72.4% under the configuration of freeways in LA county and Orange county. As a result, the propagation behavior predicted by our method can serve as an crucial input for predictive routes calculation in intelligent routing applications. REFERENCES [1] S. Boyles, D. Fajardo, and S. T. Waller. Naive bayesian classifier for incident duration prediction. [2] Y . Chung and W. W. Recker. A methodological approach for estimating temporal and spatial extent of delays caused by freeway accidents. [3] C. F. Daganzo. The cell transmission model: A dynamic representation of highway traffic consistent with the hydro- dynamic theory. Transportation Research Part B: Method- ological, 28:269–287, 1994. [4] U. Demiryurek, F. Banaei-Kashani, C. Shahabi, , and A. Ran- ganathan. Online computation of fastest path in time- dependent spatial networks. In SSTD’11. [5] A. L. Erera, T. W. Lawson, and C. F. Daganzo. A simple, generalized method for analysis of a traffic queue upstream of a bottleneck. 1998. [6] A. Garib, A. E. Radwan, and H. Al-Deek. Estimating magnitude and duration of incident delays. Journal of Transportation Engineering, 123(6):459–466, Nov. 1997. [7] GeoLife. http://research.microsoft.com/en- us/projects/geolife/. Last visited Feb 25, 2013. [8] G. Giuliano. Incident characteristics, frequency, and duration on a high volume urban freeway. Transportation Research Part A: General, 23(5):387–396, Sept. 1989. [9] T. F. Golob, W. W. Recker, and J. D. Leonard. An analysis of the severity and incident duration of truck-involved freeway accidents. Accident Analysis and Prevention, 19(5):375–395, Oct. 1987. [10] A. J. Khattak, J. L. Schofer, and M.-h. Wang. A simple time sequential procedure for predicting freeway incident duration. IVHS Journal, 2(2), Jan. 
1994. [11] W. Kim, S. Natarajan, and G.-L. Chang. Empirical anal- ysis and modeling of freeway incident duration. In 11th International IEEE Conference on Intelligent Transportation Systems, 2008. ITSC 2008, pages 453–457, 2008. [12] J. Kwon, M. Mauch, and P. P. Varaiya. Components of con- gestion : delay from incidents, special events, lane closures, weather, potential ramp metering gain, and excess demand. In TRR’06, pages 84–91. [13] T. W. Lawson, D. J. Lovell, and C. F. Daganzo. Using the input-output diagram to determine the spatial and temporal extents of a queue upstream of a bottleneck. Trans. Res. Rec, 1572:140–147, 1997. [14] P. C. Mahalanobis. On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India 2, 1:49–55, 1936. [15] M. Miller and C. Gupta. Mining traffic incidents to forecast impact. In UrbComp ’ 12. [16] K. Ozbay and P. Kachroo. Incident management in intelligent transportation systems. Artech House, 1999. [17] R. Pal and K. C. Sinha. Simulation model for evaluating and improving effectiveness of freeway service patrol programs. Journal of Transportation Engineering, 128:355–365, 2002. [18] B. Pan, U. Demiryurek, and C. Shahabi. Utilizing real- world transportation data for accurate traffic prediction. In ICDM’12. [19] F. M. Report. http://www.metro.net/board/Items/2012/03March/ 20120322RBMItem57.pdf. Last visited Feb 14, 2013. [20] RIITS. http://www.riits.net/. Last visited December 25, 2011. [21] P. J. Rousseeuw. Silhouettes: A graphical aid to the in- terpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65, 1987. [22] K. W. Smith. and B. L. Smith. Forecasting the clearance time of freeway accidents. In Center Transp. Studies, Univ. Virginia, Charlottesville, VA, Rep. STL-2001-01, 2001. [23] E. C. Sullivan. New model for predicting freeway incidents and incident delays. Journal of Transportation Engineering, 123(4):267–275, July 1997. [24] Y . 
Wang and N. Nihan. Freeway traffic speed estimation with single-loop outputs. TransportationResearchRecord:Journal of the Transportation Research Board, 1727(-1):120–126, 01 2000. [25] Z. Wang and P. M. Murray-Tuite. A cellular automata approach to estimate incident-related travel time on interstate 66 in near real time. Virginia Transportation Research Council, 2010. [26] WAZE. http://www.waze.com/. Last visited Feb 25, 2013. [27] S. C. Wirasinghe. Determination of traffic delays from shock- wave analysis. Transportation Research, pages 343–348, 1978. (b) PADI over PAI Figure 13. Effect of forward lag (h) 4) Effect of Distance Metric: In this set of experiments, we compare the prediction accuracy by choosing the distance metric by matching the initial behavior in PADI. Figure 14(a) illustrates the prediction accuracy for top six freeways with most incident occurrences using Euclidean distance metric and Mahalanobis distance metric. As shown, the performance of Euclidean and Mahalanobis distance metrics are variant, i.e., changes based on highways. For example, while Mahalanobis distance yields better results on I-405 South and I-405 North, Euclidean distance is better for I-10 East and I-5 North. 
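As a rough illustration of the matching step compared above, the following sketch assigns the first h minutes of an impact backlog series to the nearest cluster centroid under either metric. The function names, the toy centroids and the covariance matrix are illustrative assumptions and are not taken from the experiments; in particular, the text does not say how the covariance used by the Mahalanobis distance [14] is estimated.

```python
import numpy as np


def euclidean(x, c):
    """Straight-line distance between an initial behavior x and a centroid c."""
    return float(np.linalg.norm(np.asarray(x, float) - np.asarray(c, float)))


def mahalanobis(x, c, cov):
    """Mahalanobis distance of x from centroid c under covariance matrix cov."""
    d = np.asarray(x, float) - np.asarray(c, float)
    # Pseudo-inverse keeps the sketch usable when cov is singular (assumption).
    cov_inv = np.linalg.pinv(np.asarray(cov, float))
    return float(np.sqrt(max(d @ cov_inv @ d, 0.0)))


def nearest_centroid(x, centroids, metric="euclidean", cov=None):
    """Index of the centroid closest to x under the chosen metric."""
    if metric == "euclidean":
        dists = [euclidean(x, c) for c in centroids]
    elif metric == "mahalanobis":
        if cov is None:
            raise ValueError("mahalanobis matching needs a covariance matrix")
        dists = [mahalanobis(x, c, cov) for c in centroids]
    else:
        raise ValueError(f"unknown metric: {metric}")
    return int(np.argmin(dists))


# Toy data (hypothetical): backlog in miles after minute 1 and minute 2.
centroids = [[0.0, 0.0], [3.0, 1.0]]
cov = [[4.0, 0.0], [0.0, 0.04]]   # large variance at minute 1, small at minute 2
x = [2.0, 0.0]
by_euclidean = nearest_centroid(x, centroids)
by_mahalanobis = nearest_centroid(x, centroids, "mahalanobis", cov)
```

In this toy case the two metrics pick different centroids, because the Mahalanobis distance down-weights the high-variance first minute. This is one way in which the choice of metric can change the matched pattern and hence the prediction.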
Figure 14. Effects of distance metric on PADI approach: (a) Prediction accuracy on freeways (RMSE under Euclidean and Mahalanobis distance for I-405N, I-10W, I-5S, I-405S, I-5N and I-10E); (b) Sub-predictor (panels for I-10 E and I-405 S).

To investigate the reason, we plot the first 5 minutes of the training results for two selected freeways (see Figure 14(b)). Specifically, we choose the two clusters for I-405 S and I-10 E to represent the cases with better prediction under the Mahalanobis and Euclidean distance metrics, respectively. As shown in Figure 14(b), the first five minutes of the cluster centroids for I-405 S present patterns that are distinct from each other.

Figure 11. Effect of impact threshold (λ): (a) Average prediction error (RMSE of Baseline, PAD and PADI for λ=20%, λ=40% and λ=60%); (b) Improvement over baseline.
λ       PAD      PADI
20%     4.2%     11.3%
40%     7.4%     39.1%
60%     23.9%    45.8%

As the propagation behavior is quantified more accurately, the prediction accuracy is also higher. When λ is small, the impact is less significant, and hence the result is more easily affected by noise in the sensor speed readings, which yields lower prediction accuracy. Furthermore, we also observe that larger λ values lead to shorter propagation behavior. This is because, given an incident, a significant speed decrease normally propagates over a shorter distance than trivial speed changes. Thereby, it is easier to predict the propagation behavior, which has a shorter duration, under a large λ value. In sum, with larger λ, the propagation behavior is modeled more accurately (i.e., with less fitting error) and is hence easier to predict.

Intuitively, when we increase λ, the standard for deciding whether a location is impacted is raised. Thereby, with large λ, only sensors that show significant speed changes are considered impacted sensors. To investigate why the prediction accuracy increases substantially as λ increases, we conduct a case study on one incident that occurred on I-405 S during off-peak hours. Figure 16 shows the interpolation process of the propagation behavior under different λ values. In this figure, each scatter point <x, y> indicates that the sensor located at y starts to be impacted at time x, and the dashed lines represent the functions fitted to the corresponding sets of scatter points. From this figure, we derive the following two observations: (1) the larger λ is, the less noise there is in the fitting process that generates the propagation behavior; (2) the larger λ is, the shorter the propagation behavior is.

From the first observation, we may infer that, when λ is small, the time when a sensor starts to be impacted is easily influenced by noise in the sensor speed readings. Since the noise can only cause speed changes within a small range, it can hardly affect the generation of the propagation behavior once λ increases to a larger value; the fitting performance is therefore better, and the prediction accuracy increases as well. For the second observation, it is intuitive that a significant speed decrease normally lasts for a shorter time than a trivial speed change. The propagation behavior is therefore shorter and easier to predict.
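The thresholding and fitting steps described in the case study can be sketched as follows. This is a minimal illustration under stated assumptions: the relative speed decrease is measured against a per-sensor baseline speed, the fitted function is a low-degree polynomial, and the fitting error is the RMSE of the fit. None of these choices is specified in the text, and the function names are hypothetical.

```python
import numpy as np


def first_impact_times(speeds, baseline, lam):
    """Map sensor index -> first minute at which its relative speed drop >= lam.

    speeds:   array of shape (minutes, sensors) with observed speeds
    baseline: array of shape (sensors,) with expected speeds (assumption)
    lam:      impact threshold as a fraction, e.g. 0.4 for 40%
    """
    speeds = np.asarray(speeds, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    drop = (baseline - speeds) / baseline
    impacted = drop >= lam
    times = {}
    for s in range(speeds.shape[1]):
        hits = np.flatnonzero(impacted[:, s])
        if hits.size > 0:
            times[s] = int(hits[0])
    return times


def fit_backlog(times, positions, degree=2):
    """Fit backlog distance (miles) against impact time (min); return (coeffs, rmse)."""
    sensors = sorted(times)
    if not sensors:
        raise ValueError("no impacted sensors to fit")
    x = np.array([times[s] for s in sensors], dtype=float)
    y = np.array([positions[s] for s in sensors], dtype=float)
    degree = min(degree, len(np.unique(x)) - 1)  # avoid an under-determined fit
    coeffs = np.polyfit(x, y, degree)
    residuals = np.polyval(coeffs, x) - y
    return coeffs, float(np.sqrt(np.mean(residuals ** 2)))


# Synthetic readings (hypothetical): three sensors hit by the queue in turn,
# plus one distant sensor with a small, noise-like 25% speed dip.
baseline = np.full(4, 60.0)
positions = [0.5, 1.0, 1.5, 3.0]          # miles upstream of the incident
speeds = np.full((20, 4), 60.0)
speeds[2:, 0] = 30.0
speeds[6:, 1] = 30.0
speeds[10:, 2] = 30.0
speeds[:, 3] = 45.0

_, rmse_low = fit_backlog(first_impact_times(speeds, baseline, 0.2), positions)
_, rmse_high = fit_backlog(first_impact_times(speeds, baseline, 0.4), positions)
```

With the low threshold the noisy sensor enters the scatter points and inflates the fitting error, whereas the higher threshold filters it out. This mirrors the first observation above, but only on made-up data.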
(a) Prediction of start location
                closest    furthest
Distr.          73.5%      26.5%
Pred. accur.    82.3%

(b) Prediction of start time
                closest    furthest
± 5 min         43.1%      53.2%
± 10 min        48.6%      70.1%
± 15 min        64.1%      71.1%

Figure 4.18: Prediction of clearance start time and location

4.5.3 Evaluation of Clearance Prediction

Based on our discussion in Section 4.4, given the start time and location of the clearance behavior, we can predict the clearance behavior with the same prediction strategy as for the propagation behavior. Thereby, in this sub-section, we mainly evaluate the accuracy of predicting the start time and location of the clearance behavior. Figure 4.18(a) shows the distribution and prediction accuracy of the start locations.
Here, “closest” and “furthest” refer to the start location of the clearance behavior relative to the incident location, indicating that the corresponding clearance behavior is similar to the one illustrated in Figure 4.10(a) and Figure 4.10(b), respectively. As shown, around three-fourths of the traffic incidents follow the clearance behavior illustrated in Figure 4.10(a), and the other one-fourth follow the behavior in Figure 4.10(b), which implies that most incidents in Los Angeles are small-scale incidents that are cleared quickly. The prediction accuracy of the start locations based on incident attributes is 82.3%, indicating that correlations do exist between the clearance behaviors and the incident attributes.

Figure 4.18(b) shows the prediction accuracy of the start time of the clearance behavior, where the categories ±5, ±10 and ±15 min refer to the percentage of incidents whose absolute error in the predicted start time is within 5, 10 and 15 minutes, respectively. As shown, for incidents whose clearance starts at the closest location (i.e., the occurrence location), the clearance start time always involves human factors, which makes it harder to predict. For example, the start time relates to the time when drivers move the incident scene to the road shoulder, or even to the time when the police arrive at the scene. For incidents whose clearance starts at the furthest location, the clearance start time is not necessarily correlated with the time when the scene is cleared, and it is easier to predict according to the results depicted in Figure 4.18(b).

4.5.4 Case Studies on Travel Time Calculation

We further evaluated our impact prediction approaches using two scenarios based on real-world incidents. For each scenario, we consider a routing plan that starts after the incident is reported and passes through the incident location.
In order to estimate the travel time on the routing plan, we utilize two strategies: one is based on the existing impacted area of the incident, and the other is based on our predicted impact behavior of the incident. The actual travel time for this routing plan, calculated from sensor speeds, is considered the ground truth. We evaluate the accuracy of the two strategies in estimating travel times.

Figure 4.19 shows the first scenario, a real-world traffic collision reported at 11:58 AM on September 6, 2013. The incident occurred on I-5 North, and our selected route is from A to B as illustrated in Figure 4.19(a). In this scenario, we set the routing start time to 12:05 PM, which is 7 minutes after the occurrence of the incident. Figure 4.19(b) shows the actual propagation behavior and the propagation behavior predicted after 12:05 PM by our prediction strategy with the threshold set to 40% (see footnote 3). As shown by the actual propagation behavior, the existing impact area covers only 1.5 miles leading up to the incident location, and the impact area grows rapidly as time elapses. Traditional navigation systems, which rely on the existing traffic situation, consider only this 1.5-mile congestion when estimating the travel time. In reality, by the time drivers approach the impact area along the route from A to B, the congestion area has already expanded, so the actual travel time is much larger than the one estimated by traditional navigation systems. Figure 4.19(c) shows the travel time estimates based on the existing impact and on the predicted impact of the incident. As shown, the traditional navigation system based on existing impact largely under-estimates the travel time. The next-generation navigation system based on our predicted impact improves the estimation accuracy over the traditional one by 67%.
Moreover, in this scenario, if there exists another route from A to B that takes 12 minutes, the traditional navigation system will not select it, since it assumes that the current route from A to B takes only 10 minutes based on the incident's existing impact, which is less than 12 minutes. However, the next-generation navigation system that utilizes our impact prediction will select the 12-minute route instead of the current route, since its 14-minute estimate for the current route exceeds 12 minutes. As a result, the routing choice made by the next-generation navigation system saves 1 minute of travel time compared with the ground truth.

Figure 4.19: Scenario for sampled incident occurred on I-5 North. (a) Planned route and incident location; (b) Incident's propagation behavior (backlog in miles over time t, actual and predicted); (c) Travel time estimation:

    Strategy               Travel time
    Ground truth           13 min
    Use existing impact    10 min
    Use predicted impact   14 min

3 In most cases, 40% is large enough to distinguish whether a speed change is due to noisy sensor data or to a traffic incident. Therefore, it should be a reasonable choice in the calculation of travel time.

The second scenario, illustrated in Figure 4.20, is based on another traffic collision reported at 7:44 PM on September 16, 2013. This incident occurred on I-405 South, and our selected route is from A to B as illustrated in Figure 4.20(a). Figure 4.20(b) shows the actual propagation behavior and the clearance behavior predicted by our prediction strategy with the threshold set to 40%. In this scenario, we set the routing start time to 8:55 PM, which is after the beginning of the clearance behavior of the incident (the clearance behavior begins at 8:53 PM as shown in Figure 4.20(b)). As shown, at 8:55 PM the existing impact area still covers around 2 miles leading up to the incident location, and the impact area will be completely cleared in 5 minutes. In this scenario, traditional navigation systems based on existing impact over-estimate the travel time from A to B: at 8:55 PM, the navigation system still considers the congestion area caused by the incident to be up to 2 miles long, although it will be cleared soon as drivers approach the area. Figure 4.20(c) shows the result of the travel time estimation. As shown, the actual travel time is only 9 minutes, which is 2 minutes less than the estimate based on the existing impact. With our prediction of the clearance behavior, the travel time estimate is 50% more accurate than the traditional estimate. Similarly, in this scenario, if there exists another route from A to B that takes 10 minutes, the traditional navigation system will select that route instead of the current one, since it assumes that the current route takes 1 additional minute. The next-generation navigation system will stay on the current route, since the other route takes 10 minutes, which is more than the estimated 8 minutes on the current route. In the end, it saves 1 minute of travel time (9 minutes actual on the current route versus 10 minutes on the alternative).

To conclude, a travel time estimate based on the existing impact of an incident can either under-estimate or over-estimate the actual travel time, because it neglects the dynamic development of the impact over time. By utilizing the predicted impact behavior, the next-generation navigation system can estimate the travel time more accurately, enabling more effective routing strategies.
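The 67% and 50% figures quoted for the two scenarios are consistent with reading "improvement" as the relative reduction in absolute estimation error. That reading is an assumption on our part, since the text does not define the measure; the function name below is purely illustrative. A minimal sketch using the travel times reported in Figures 4.19(c) and 4.20(c):

```python
def error_reduction(truth, baseline, improved):
    """Relative reduction in absolute error of `improved` over `baseline`.

    Assumed reading of "improves the estimation accuracy by X%".
    """
    base_err = abs(baseline - truth)
    new_err = abs(improved - truth)
    return (base_err - new_err) / base_err


# Travel times in minutes, taken from Figures 4.19(c) and 4.20(c).
scenario_1 = error_reduction(truth=13, baseline=10, improved=14)  # (3 - 1) / 3
scenario_2 = error_reduction(truth=9, baseline=11, improved=8)    # (2 - 1) / 2

print(f"{scenario_1:.0%}, {scenario_2:.0%}")
```

Under this reading, the existing-impact estimate is off by 3 and 2 minutes, while the predicted-impact estimate is off by 1 minute in both scenarios.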
Figure 4.20: Scenario for sampled incident occurred on I-405 South. (a) Planned route and incident location; (b) Incident's propagation behavior (backlog in miles over time t, actual propagation behavior and predicted clearance behavior); (c) Travel time estimation:

    Strategy               Travel time
    Ground truth           9 min
    Use existing impact    11 min
    Use predicted impact   8 min

Chapter 5
Analyze Traffic Events using Human Mobility and Social Media

In this chapter, instead of using traffic incident reports and traffic sensor datasets, we detect and analyze the impact of traffic incidents through human mobility data and social media. First, we detect traffic anomalies (i.e., incidents), which could be caused by accidents, traffic controls, celebrations, protests, disasters, etc., by mining human mobility data (e.g., GPS trajectories). Here, a detected anomaly is represented by a sub-graph of a road network where people's routing behaviors significantly differ from their original patterns. We then describe a detected anomaly by mining representative terms from the social media that people posted when the anomaly happened. A system for detecting such traffic anomalies can benefit both drivers and transportation authorities, e.g., by notifying drivers approaching an anomaly and suggesting alternative routes, as well as by supporting traffic jam diagnosis and dispersal.

In the system developed in this chapter, we use a novel methodology to detect anomalies according to drivers' routing behavior, i.e., the topological variation in traffic flow between points. This differs from related work on traffic anomaly detection, which focuses on traffic volume and velocity on roads. Figure 5.1 gives a concrete example, where 200 drivers travel from an origin O to a destination D during a given period of the day. As demonstrated in Figure 5.1(a), normally 80% of drivers go to D via route $rt_1$, while 10% travel along route $rt_2$ and 10% via route $rt_3$.
Figure 5.1(b) shows one kind of anomaly, in which the traffic volume decreases on each route. Figure 5.1(c) illustrates another kind of traffic anomaly, where the total traffic flow is the same as before but the routing behavior of drivers along these routes has changed. Specifically, the percentage of drivers choosing $rt_1$ decreases from 80% to 25%, while the traffic on $rt_2$ and on $rt_3$ each increases from 10% to 30%. At the same time, a new route $rt_4$ has emerged, attracting 15% of drivers.

Figure 5.1: Concrete example. (a) Regular scenario: total flow (O-D) 200; $rt_1$: 160 (80%), $rt_2$: 20 (10%), $rt_3$: 20 (10%). (b) Traffic flow change: total flow (O-D) 100; $rt_1$: 160 → 80, $rt_2$: 20 → 10, $rt_3$: 20 → 10. (c) Routing behavior change: total flow (O-D) 200; $rt_1$: 80% → 25%, $rt_2$: 10% → 30%, $rt_3$: 10% → 30%, $rt_4$: 0% → 15%.

Our approach has the following advantages over the existing methods. First, it provides a comprehensive view of an anomaly, showing the affected road segments as well as the relationships between these road segments. This is useful for diagnosing an anomaly or planning for traffic dispersal. For instance, traffic volume-based methods would only detect the road segment on which an accident has occurred, while other routes such as $rt_2$ and $rt_3$ would be overlooked. In fact, a traffic volume-based method may not even be able to detect some extreme cases, where the traffic volume does not change significantly on any road segment. Second, by detecting a subgraph, we enable the retrieval of relevant social media to describe the event. Without this geographic constraint and its time span, determining which social media posts are relevant to an anomaly would be far more costly, if not impossible. We use the historical tweets associated with the spatial region to represent the historical norm, and we report the terms that occur more frequently during the time span of the anomaly than in their historical occurrences for this region.
The rest of the chapter is organized as follows. In Section 5.1, we overview the system and introduce preliminaries. In Section 5.2, we explain our approach to offline mining. In Section 5.3, we detail our anomaly detection approach. In Section 5.4, we present our system's capability to analyze the detected anomalies. In Section 5.5, we present the experimental setup and results.

5.1 Overview

5.1.1 Preliminaries

Definition 1 (Road Segment): A road segment $r$ is a directed edge in the road network graph, with two terminal points $r.s$ and $r.e$. The vehicle flow on this edge is from $r.s$ to $r.e$.

Definition 2 (Road Network): A road network $G$ is a directed graph, $G = (V, E)$, where $V$ is a set of nodes representing the terminal points of road segments, and $E$ is a set of edges denoting road segments.

Definition 3 (Path): A path $p$ is a sequence of connected road segments, i.e., $p: r_1 \to r_2 \to \dots \to r_n$, where $r_{k+1}.s = r_k.e$ $(1 \le k < n)$.

Definition 4 (Trajectory): A trajectory $tr$ is a sequence of GPS points created by a moving object. Each point consists of a longitude, a latitude and a time stamp ($t$). In this work, we map-match these GPS points onto a path in the road network, so that each trajectory is converted to a time-ordered sequence of road segments, i.e., $\langle t_1, r_1 \rangle \to \langle t_2, r_2 \rangle \to \dots \to \langle t_n, r_n \rangle$, where $r_{k+1}.s = r_k.e$ and $t_k$ indicates the arrival time on road segment $r_k$ $(1 \le k < n)$.

5.1.2 System Overview

Figure 5.2 shows the architecture of our system, which consists of three parts: offline mining, anomaly detection, and anomaly analysis.

Figure 5.2: System architecture. Columns: Offline Mining, Anomaly Detection, Anomaly Analysis. Component labels: Map Matching, Routing Behavior Analysis, Trajectory DB, Offline Index Building, Offline Index, Anomalous Seed Selection, Anomalous Graph Expansion, Anomalous Graphs, Impact Analysis, Term Mining, Visualization. Other labels: GPS Trajectories, Tweets, Physical World, Cyber World, End Users, Transportation Authority.

Offline mining: As illustrated in the left column of Figure 5.2, this step identifies the normal routing behavior of drivers, i.e., the behavior observed in general cases (detailed in Section 5.2.2). This step accumulates historical mobility data (e.g., GPS trajectories from vehicles) into a trajectory database and builds an index between road segments and the trajectories traversing them in order to enable online anomaly detection (refer to Section 5.2.3). This step also calculates the number and travel times of vehicles traversing each road segment over the course of a day.

Online anomaly detection: As shown in the middle column of Figure 5.2, anomaly detection is an online inference step based on the recently received GPS trajectories of vehicles and the behavioral knowledge obtained from offline mining. First, our system maps the received GPS trajectories onto a road network using the map-matching algorithm presented in [66]. One copy of these processed trajectories is sent to the trajectory database for offline mining. Another copy is used for real-time routing behavior analysis. Similar to offline mining, we analyze the current vehicle flow and travel time for each road segment. By comparing the real-time information with our historical routing behavior knowledge, our system selects road segments that deviate sufficiently from their normal pattern (we call such road segments seed segments). Then, our system expands each seed segment to a complete anomaly subgraph, over which drivers' routing behavior has changed significantly (refer to Section 5.3 for details). Based on the offline index, an online indexing structure between paths and trajectories is built for efficient anomaly graph expansion.
Traffic anomaly analysis: As shown in the right column of Figure 5.2, the anomaly analysis step aims to analyze and explain the anomaly. One class of analytic information is the anomaly's impact in terms of the travel time delay on each path of the detected anomaly graph. We extract this information from the recent GPS trajectories of vehicles and the ordinary routing behaviors learned offline. Another class of analytic information is the representative terms (such as ‘bridge out’, ‘accident’, ‘sports’, etc.) that could describe or diagnose the anomaly. Specifically, we retrieve the relevant social media, e.g., tweets, using the time span when the anomaly occurred and the names of the roads covered by the anomaly. We then mine the representative terms that occurred frequently in the time span of the anomaly but rarely appeared otherwise. Finally, our system creates visualizations for individual drivers showing the extent of the anomaly, as well as visualizations for more in-depth visual analysis.

5.2 Offline Mining

5.2.1 Modeling Taxi Trajectories

We first partition the GPS logs from each taxi into independent trajectories representing individual trips, which is done using the taxi's transaction records. Next, we employ the IVMM algorithm [66] to map each GPS point onto a road segment. Because taxis normally report their GPS location every 1 to 2 minutes, these mapped road segments may not be connected with each other. Therefore, we connect each consecutive pair of GPS points with a path calculated using the method described in [60]. As the result of this step, each trajectory is converted to a directed path composed of connected road segments. For each trajectory, we also estimate the travel time on each road segment in its mapped path. We assume the travel time between two GPS points is uniformly distributed over the connecting path.
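The travel time assumption at the end of Section 5.2.1 can be sketched as follows. The sketch reads "uniformly distributed over the connecting path" as constant speed along the path, so each segment receives a share of the elapsed time proportional to its length. This reading, the function name, and the use of segment lengths as input are assumptions for illustration, not details confirmed by the text.

```python
def split_travel_time(t_start, t_end, segment_lengths):
    """Split the time between two GPS points over the connecting path.

    Assumed reading: constant speed along the path, so each segment gets
    a share of the elapsed time proportional to its length.
    """
    elapsed = t_end - t_start
    total_length = sum(segment_lengths)
    return [elapsed * length / total_length for length in segment_lengths]


# Two GPS points 90 seconds apart, connected by segments of 100 m and 200 m.
per_segment = split_travel_time(0, 90, [100, 200])
print(per_segment)  # [30.0, 60.0]
```

If the intended meaning is instead an equal share per segment regardless of length, the list comprehension would simply divide `elapsed` by the number of segments.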
5.2.2 Modeling Routing Behavior

We model the routing behavior between two points as the distribution of traffic flow across the different connecting paths. The preliminaries for this model are as follows.

Definition 5 (Origin Edge and Destination Edge): For an edge $r$ in a graph $G$, if there are no incoming edges connected with $r.s$, $r$ is denoted as an origin edge ($r_O$). Similarly, if there are no outgoing edges connected with $r.e$, $r$ is denoted as a destination edge ($r_D$).

Definition 6 (Routing Pattern): For each pair $\langle r_O, r_D \rangle$ in the road network graph $G$, at time $t$, its Routing Pattern ($RP$) is defined as $\langle f_1, p_1, f_2, p_2, \dots, f_m, p_m \rangle$, where $f_i$ is the traffic volume (i.e., number of vehicles) on the $i$-th path from $r_O$ to $r_D$, and $p_i$ is the percentage of the total flow (i.e., the sum of the $f_i$) between $r_O$ and $r_D$ that uses the $i$-th path.

Consider the graph in Figure 5.1 as an example. Suppose the time stamps for the three figures are $t_1$, $t_2$ and $t_3$. The traversing flows and the routing behavior (i.e., routing pattern) for these three cases between O and D are shown in Table 5.1.

    Time    Routing Pattern (RP)
    t_1     <160, 0.8, 20, 0.1, 20, 0.1>
    t_2     <80, 0.8, 10, 0.1, 10, 0.1>
    t_3     <50, 0.25, 60, 0.3, 60, 0.3, 30, 0.15>

Table 5.1: Example of $RP_{OD}$

To measure the difference between the routing behavior at time $t_1$ ($RP_{t_1}$) and another routing behavior ($RP_{t_2}$), we use the Mahalanobis distance [16], defined as follows:

$$d_M(RP_{t_1}, RP_{t_2}) = \sqrt{(RP_{t_1} - RP_{t_2})^{T} S^{-1} (RP_{t_1} - RP_{t_2})} \tag{5.1}$$

where $RP_{t_1}$ and $RP_{t_2}$ share the same distribution, with $S$ as the covariance matrix. This measurement is based on correlation analysis, through which different patterns in routing behavior can be identified. It differs from the Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant.
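To make Definition 6 and Equation (5.1) concrete, the following is a minimal pure-Python sketch. It builds a routing pattern from per-path flows, pads patterns of different length with zeros, and evaluates the Mahalanobis distance by solving $S x = \delta$ instead of inverting $S$. The function names are illustrative, and the covariance matrix is assumed to be supplied by the caller and to be invertible; a covariance estimated from real routing patterns may be singular and would need regularization or a pseudo-inverse, which this sketch does not attempt.

```python
import math


def routing_pattern(flows):
    """Interleave each path's flow f_i with its share p_i of the total flow."""
    total = sum(flows)
    pattern = []
    for f in flows:
        pattern.extend([f, f / total])
    return pattern


def pad(a, b):
    """Append zeros to the shorter pattern so both have equal length."""
    n = max(len(a), len(b))
    return a + [0.0] * (n - len(a)), b + [0.0] * (n - len(b))


def solve(S, v):
    """Solve S x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(S)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


def mahalanobis(rp_a, rp_b, S):
    """Equation (5.1); S is an assumed, invertible covariance matrix."""
    a, b = pad(rp_a, rp_b)
    delta = [x - y for x, y in zip(a, b)]
    return math.sqrt(sum(d * xi for d, xi in zip(delta, solve(S, delta))))


rp_t2 = routing_pattern([80, 10, 10])      # row t_2 of Table 5.1
rp_t3 = routing_pattern([50, 60, 60, 30])  # row t_3 of Table 5.1
identity = [[float(i == j) for j in range(8)] for i in range(8)]
print(mahalanobis(rp_t2, rp_t3, identity))  # equals the Euclidean distance
```

With the identity matrix as $S$, the distance reduces to the Euclidean distance, which is a convenient sanity check before plugging in an estimated covariance.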
Note that if the lengths of two routing patterns are different (e.g., the routing patterns for $t_2$ and $t_3$ in Table 5.1), zeros are appended to the shorter vector to match the size of the longer one.

5.2.3 Index Building

We create two index structures, an offline index and an online index, to speed up the anomaly detection process.

The offline index is a bi-directional index structure between the trajectories and the road segments. As stated in Section 5.2.1, each trajectory is converted into a path of connected road segments. In the forward direction, each distinct trajectory is indexed to all the road segments contained in its derived path. In the reverse direction, each road segment is indexed to every trajectory that traversed it. Consider the example in Figure 5.3(a), where solid directed lines represent road segments and dashed lines represent trajectories. The corresponding offline index is depicted in Figure 5.3(b). This index structure is built offline, but it is updated online as new trajectories are received.

Figure 5.3: Example of index. (a) Example with road segments $r_1$ to $r_4$ and trajectories $tr_1$ to $tr_3$; (b) Offline index example; (c) Online index: for each road segment $r$, the paths $\{p\}$ ended at $r$ and the trajectories $\{tr\}$ along each $p$.

Our system also includes an online index. The online index is an index structure created for each road segment $r$ to index all the ended paths and the trajectories along them. Here, ended paths are the paths that include $r$ as the last edge. The structure of the online index is depicted in Figure 5.3(c). To detect an anomaly, we must efficiently find all the trajectories on a set of road segments and examine their routing behaviors. However, if we indexed the trajectories on all possible combinations of road segments within the road network, the number of index entries would be exponential in the size of the road network.
Therefore, during the anomaly detection process, we use the offline index to build an additional index structure to expedite the search. The construction and maintenance strategy is discussed in the following section.

5.3 Traffic Anomaly Detection

Problem Definition: Given a road network graph $G$ composed of a set of road segments $R = \{r_1, r_2, \dots, r_n\}$, a set of drivers $D = \{d_1, d_2, \dots, d_m\}$, and a set of trajectories $TR = \{tr_1, tr_2, \dots, tr_h\}$ during $[t_0, t_1]$, find a set of subgraphs, called traffic anomaly graphs, $\{g_1, g_2, \dots, g_k\}$, where each graph $g_i$ satisfies the following criteria:

(a) $g_i$ is a connected graph.

(b) In $g_i$, for each $r_O$, there is at least one $r_D$ such that the routing pattern ($RP$) at $t_1$ between $r_O$ and $r_D$ satisfies the following:

$$d_M\big(RP_{t_1}, \mu_{[t_0,t_1)}\big) \ge 3 \sqrt{\frac{1}{N} \sum_{t \in [t_0,t_1)} \big(RP_t - \mu_{[t_0,t_1)}\big)^2} \tag{5.2}$$

where $\mu_{[t_0,t_1)}$ denotes the median of all routing patterns calculated during $[t_0, t_1)$. Similarly, for each $r_D$, there is at least one $r_O$ satisfying the above constraint.

In (5.2), the left side is the Mahalanobis distance between the routing pattern at $t_1$ and the regular routing pattern (see footnote 1). The right side is three times the standard deviation of all routing patterns before $t_1$. Conceptually, we therefore aim at detecting traffic anomalies whose routing pattern deviates strongly from the regular routing pattern. Towards this end, we select seed segments and expand them into subgraphs that satisfy the conditions described above. In the following two sub-sections, we discuss the two major steps, anomalous seed selection and graph expansion.
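The thresholding idea behind (5.2), flagging an observation whose distance from the historical median is at least three times the root-mean-square deviation around that median, can be sketched for a scalar series as follows. This is a simplified illustration rather than the routing-pattern test itself: for routing patterns, the absolute difference on the left-hand side would be replaced by the Mahalanobis distance of Equation (5.1). The function name and the parameter `k` are illustrative.

```python
import math
from statistics import median


def deviates(current, history, k=3.0):
    """True if `current` lies at least k deviations from the median of `history`.

    The deviation is the root-mean-square distance of the historical values
    from their median, mirroring the right-hand side of (5.2).
    """
    center = median(history)
    spread = math.sqrt(sum((x - center) ** 2 for x in history) / len(history))
    return abs(current - center) >= k * spread


history = [100, 102, 98, 101, 99]  # median 100, RMS deviation sqrt(2)
print(deviates(105, history))  # True:  |105 - 100| = 5 >= 3 * 1.414...
print(deviates(103, history))  # False: |103 - 100| = 3 <  3 * 1.414...
```

Using the median rather than the mean as the center makes the reference value less sensitive to past anomalies that remain in the history.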
5.3.1 Anomalous Seed Selection

Definition 7 (Anomalous Seed Segment): At time $T$, an anomalous seed segment is a single road segment that is not in any existing anomaly subgraph and has a current flow ($f_T$) that satisfies the following inequality:

$$|f_T - \mu| \ge 3 \sqrt{\frac{1}{N} \sum_{i=1}^{N} \big(f_{t_i} - \mu\big)^2} \tag{5.3}$$

where $\mu$ is the historical median flow on the road segment, and $t_1$ to $t_N$ refer to all the time stamps before $T$.

At time $T$, all the road segments that satisfy the above definition are put into a pool of anomalous seed candidates. During the selection procedure, we sort the road segments in the pool according to the scale of their flow variation, with the segment with the greatest variation first in the queue. To ensure that the selected seeds are not in any existing anomaly graph, we select one seed at a time, expand the corresponding anomaly graph, and then iterate the process. Intuitively, when several road segments with similar flow changes are located near each other, it is entirely possible that they are part of the same anomaly. Therefore, instead of identifying multiple seeds at once, we select one seed and expand from it. After we finish expanding the anomaly subgraph, all of its road segments that exist in the candidate pool are removed from the pool. The seed selection process terminates when the pool is empty. Note that, to measure the flow variation, we use relative variation rather than absolute variation, and in practice we discard road segments with extremely low historical flow from the seed candidate pool, as too little is known about them.

1 To calculate the regular routing pattern, we first derive the median traffic volume on each path, and then calculate the percentages based on the sum of the median volumes.

5.3.2 Anomalous Graph Expansion

With the selected anomalous seed segment, the next step is to expand from the seed to find the complete anomaly graph.
The details of the expansion procedure are described in Algorithm 1. In general, it is a breadth-first graph expansion algorithm, with a verification step that prunes the expansion based on criterion (b) in the problem definition.

Algorithm 1 MSGDetector(seed, R, Tr, t, , ')
Output: Mobility Shift Graph: g
    Let Graph g = InitGraph(seed)
    Let Queue q = InitQueue()
    q.enqueue(seed.neighbors(R))
    while q ≠ ∅ do
        RoadSegment r = q.dequeue()
        g.addEdge(r)
        bool p = Verification(g, Tr, t, , ')
        if p == true then
            // criterion (b) in the problem definition is satisfied
            q.enqueue(r.neighbors(R))
        else
            g.deleteEdge(r)
        end if
    end while
    Return g

The verification step of Algorithm 1 is computationally costly, because it first needs to retrieve all paths between all O-D pairs in the anomaly graph (i.e., all combinations of $r_O$ and $r_D$), and then record the number of traversing vehicles, both historically and at present, for each path in order to calculate the routing pattern. Specifically, the most computationally costly step is to find all the historical trajectories traversing a path. Since a path is normally composed of several road segments, the offline index on individual road segments cannot help directly. Therefore, we use an online index structure (depicted in Figure 5.3(c)) to reduce the complexity of retrieving the trajectories, as described in the following paragraphs.

For each insertion of a new road segment, we need to update the relevant paths and trajectories in the online index structure. Consider the sample insertion sequence depicted in steps 1 to 4 of Figure 5.4; the corresponding procedure for building the index is shown in Figure 5.5. The complexity of each insertion is shown in Table 5.2, where $Tr_{xyz}$ indicates the trajectory group indexed by the path $r_x \to r_y \to r_z$. The function $F(A, r)$ returns all the trajectories in set $A$ that passed edge $r$, and $TR$ refers to all historical trajectories.
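The breadth-first expansion of Algorithm 1 can be sketched in Python as follows. The `neighbors` and `verify` callables are hypothetical stand-ins for `r.neighbors(R)` and the Verification step, whose internals are not reproduced here. The `seen` set is an addition of this sketch, not part of Algorithm 1; it ensures that no segment is enqueued twice, so the loop terminates on cyclic road networks.

```python
from collections import deque


def msg_detector(seed, neighbors, verify):
    """Breadth-first expansion of an anomaly graph from a seed segment.

    neighbors(r) -> iterable of segments adjacent to r (hypothetical helper)
    verify(graph) -> bool, stands in for the Verification step of Algorithm 1
    """
    graph = {seed}
    queue = deque(neighbors(seed))
    seen = {seed, *queue}  # guard added in this sketch; not in Algorithm 1
    while queue:
        r = queue.popleft()
        graph.add(r)
        if verify(graph):
            # Criterion (b) holds: keep r and continue expanding from it.
            for nxt in neighbors(r):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        else:
            graph.discard(r)
    return graph


# Toy road network; segment "c" is assumed to fail verification.
adjacency = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
result = msg_detector("a", adjacency.__getitem__, lambda g: "c" not in g)
print(sorted(result))  # ['a', 'b', 'd']
```

Passing the verification predicate as a callable keeps the traversal separate from the costly routing-pattern computation, which is where the online index described in the text comes in.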
There are three possible positions for the newly added edge in a graph: as a destination edge ($r_D$), as an origin edge ($r_O$), or as an ordinary edge. The complexity of inserting a new $r_D$ and a new $r_O$ is discussed in the following, while the complexity of inserting an ordinary edge can be derived as a combination of the other two.

Figure 5.4: Sample insertion procedure. (a) Step 1; (b) Step 2; (c) Step 3; (d) Step 4; (e) Pruning on Step 4. Legend: edge added, edge removed, edge verified.

Figure 5.5: Update procedure for online index (Steps 1 to 4).

Destination Edge Insertion: For the insertion of a new $r_D$, we need to create its own index without updating any other edge in the graph. For example, in step 2 (the insertion of $r_2$), we first need to find its incoming edges (i.e., $\{r_1\}$) in the existing graph $G$. Then, we append the new edge to the end of the paths from $r_1$ and retrieve all trajectories that pass through the new edge. In this example, the complexity is $O(|Tr_1|)$.

Origin Edge Insertion: For the insertion of a new $r_O$, we need to create the index of $r_O$ as well as update the indexes of other relevant edges. Here, the relevant edges are all edges reachable from $r_O$. There are two types of update operations, depending on whether or not the new edge replaces the previous origin edge. Steps 3 and 4 in the above example illustrate the two situations, respectively.

    Step  Updated index                                                Complexity
          Edge   Paths                   Trajectories
    (1)   r_1    +r_1                    Tr_1 = F(TR, r_1)             O(1)
    (2)   r_2    +r_1→r_2                Tr_12 = F(Tr_1, r_2)          O(|Tr_1|)
    (3)   r_3    +r_3                    Tr_3 = F(TR, r_3)             O(1)
          r_1    -r_1, +r_3→r_1          Tr_31 = F(Tr_1, r_3)          O(|Tr_1|)
          r_2    -r_1→r_2, +r_3→r_1→r_2  Tr_312 = F(Tr_12, r_3)        O(|Tr_12|)
    (4)   r_4    +r_4                    Tr_4 = F(TR, r_4)             O(1)
          r_1    +r_4→r_1                Tr_41 = F(Tr_4, r_1)          O(|Tr_4|)
          r_2    +r_4→r_1→r_2            Tr_412 = F(Tr_41, r_2)        O(|Tr_41|)

Table 5.2: Computational analysis of update procedure

For the insertion of $r_3$ in step 3, as shown in Figure 5.4, the new edge replaces $r_1$ as the origin edge. In this case, we create an index for $r_3$ and update the existing indexes for $r_1$ and $r_2$. For $r_3$, we use the same strategy as in step 1. For $r_1$ and $r_2$, we need to insert $r_3$ before the existing paths in each of their indexes, and search within their indexes to find the trajectories that traverse $r_3$. Since the operations for the three edges $r_1$, $r_2$ and $r_3$ are independent of each other, the updates can be performed in parallel. Hence, in general, the complexity of insertion in such cases is the maximum number of trajectories stored in the index of any reachable road segment. In this example, the overall complexity is $O(\max(|Tr_1|, |Tr_{12}|))$.

For the insertion of $r_4$ in step 4, since it does not replace any existing origin edge, we do not need to update existing index entries. Instead, we only need to add more entries to the index structure for the reachable edges, $r_1$ and $r_2$. As shown in Table 5.2, this step involves the same three kinds of operations as step 3. However, the three operations here cannot be executed in parallel, because each operation depends on the result of the previous one. For example, generating the index entry of $r_1$ in the second operation relies on $Tr_4$ as input, which is the result of the first operation. The updates therefore need to be executed sequentially, and the complexity equals the sum of the update costs from the inserted edge to all of its reachable edges.
In this example, the overall complexity is $O(|Tr_4| + |Tr_{41}|)$.

Origin Edge Insertion With Pruning: To reduce the number of sequential operations, we propose a pruning strategy based on the following intuition: if the routing behavior on a sub-path $p$ does not present much variation, the routing behavior on a complete path containing $p$ will not present much variation either. Therefore, instead of completing all the update operations, we can first test whether the sub-path satisfies the verification criterion; if not, we prune it from the update sequence. Consider the example depicted in Figure 5.4: if the path $r_4 \to r_1$ does not pass the verification step, the last operation in Table 5.2 can be pruned. Meanwhile, the update of the index of $r_2$ can also be pruned, as indicated by the red edge in Figure 5.4(e). As a result, the overall complexity for the insertion of $r_4$ can be reduced to $O(|Tr_4|)$.

5.4 Traffic Anomaly Analysis

5.4.1 Impact Analysis

We evaluate the impact of traffic anomalies in terms of the total travel time delay on the detected anomalous graph. The travel times for individual cars may have high variance, rather than staying around a static value, due to estimation error, different durations of traffic lights, driver preferences, etc. To address this, we define the mean travel time ($M$) for a road segment over the time interval $T$ as follows:

$$M(T) = \frac{\sum_{i \in T} f_i \, t_i}{\sum_{i \in T} f_i} \tag{5.4}$$

where $f_i$ denotes the traffic flow along the road segment at time interval $i$ in $T$, and $t_i$ represents the corresponding travel time for flow $f_i$. Using this, the travel time delay at time period $T_1$ compared with period $T_2$ for a road segment $r$ can be calculated as follows:

$$D_r(T_1, T_2) = \max\{0, \; M_r(T_1) - M_r(T_2)\} \tag{5.5}$$

To evaluate the total travel time delay for the traffic anomaly graph, we set $T_1$ in the above definition to the time period in which the traffic anomaly occurs and $T_2$ to the corresponding time period in the past.
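Equations (5.4) and (5.5) can be sketched as follows. The function names and the sample numbers are illustrative; each period is represented as a pair of parallel lists holding the flow and the travel time for every time interval in that period.

```python
def mean_travel_time(flows, times):
    """Flow-weighted mean travel time M(T), Equation (5.4)."""
    return sum(f * t for f, t in zip(flows, times)) / sum(flows)


def delay(current, historical):
    """Travel time delay D_r(T1, T2), Equation (5.5); never negative.

    Each argument is a (flows, times) pair for one time period.
    """
    return max(0.0, mean_travel_time(*current) - mean_travel_time(*historical))


# Illustrative numbers: flows in vehicles, travel times in seconds.
current = ([10, 30], [120, 100])    # M(T1) = (1200 + 3000) / 40 = 105
historical = ([20, 20], [60, 70])   # M(T2) = (1200 + 1400) / 40 = 65

print(delay(current, historical))   # 40.0
print(delay(historical, current))   # 0.0
```

Clamping at zero means that a segment which is faster than usual during the anomaly does not offset delays on other segments when the per-segment delays are summed.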
We then add up the travel time delays ($D_r$) of all road segments in the traffic anomaly graph. For the example in Figure 5.6, the sum of the travel time delays on $R_1$, $R_2$ and $R_3$ is used as the impact parameter of the anomaly.

By using this measure, we can further evaluate the severity of the detected anomalies. For severe anomalies, not only does the routing behavior change, but drivers also encounter large travel time delays in the impact region. On the other hand, some anomalies exist only as routing behavior changes without severe delays. In general, we focus on severe anomalies, as these incur a high cost to both the drivers and the city. Therefore, we use the travel time in a post-selection step to filter out the non-severe anomalies. Specifically, for a detected anomaly graph $g$, if inequality (5.6) is not satisfied, we consider the anomaly not severe.

$$D_g(T_1) \ge 3 \, std\big(\{ M_g(t) \mid t \in T_2 \}\big) \tag{5.6}$$

where $std$ refers to the standard deviation function, and the set passed to this function consists of the $M$ values at different time intervals during the historical period $T_2$. After this filtering step, the selected anomalies not only present anomalous routing behavior, but can also be considered anomalies in terms of travel time. Note that, in the implementation of the standard deviation function, we use the median of the $M$ values rather than the mean as the center.

5.4.2 Term Mining

Online social media (e.g., microblogging services) allow people to post information (tweets) reflecting what they are seeing, hearing and feeling. In other words, the people using such social media services can be regarded as human sensors of the physical world. This motivated us to retrieve relevant information from these human sensors to describe a traffic anomaly. Towards this goal, we utilize the location and time information obtained from the anomalous graph to eliminate irrelevant posts, in order to enhance the search efficiency.
However, the remaining posts are still not necessarily relevant to the detected anomaly, because they may include posts referring to other phenomena that are commonly discussed all the time. For example, if an anomaly happens near a famous restaurant, then during the anomaly not only will tweets discussing the traffic anomaly be posted, but also tweets about the famous dishes of the restaurant. Therefore, to filter out the commonly discussed terms, we propose a strategy based on comparing the term frequencies of current tweets with those of historical tweets, to ensure the effectiveness of the retrieved information. We detail our strategy using the flow chart shown in Figure 5.6.

Figure 5.6: Term mining overview. Flow chart: retrieve tweets by road names ($R_1$.name, $R_2$.name, $R_3$.name) and time slot $[t_1, t_2]$ to obtain the recent tweets (a recent document); retrieve historic tweets (historical documents); mine representative terms.

Figure 5.6 shows the flow chart of the online tweet selection strategy for a sample traffic anomaly graph. As shown, the graph contains three edges, $R_1$, $R_2$ and $R_3$, with the corresponding time duration $[t_1, t_2]$. As illustrated in this figure, we first get the location information from the street names of each road segment in the anomalous graph. We use these names to collect all the tweets over a historical time interval (i.e., either the tweet was published at this location, or the content of the tweet refers to the street names at this location). We then use the time interval during which the anomaly was detected (i.e., $[t_1, t_2]$) to separate the tweets that might relate to the anomaly from the historical context. Here, we consider all the tweets posted in one day during $[t_1, t_2]$ as a document.
For example, the set of all the current tweets is considered one document, denoted as $T_C$. The historical tweets ($T_H$) then comprise one document for each day in the past, as illustrated in Figure 5.6, with each document containing the tweets posted during $[t_1, t_2]$ on that day. Once we have both $T_H$ and $T_C$, we analyze the relevance of each term using a strategy similar to tf-idf [44]. Specifically, for each term, we calculate its relevance weight ($w_{term}$) as in Equation (5.7).

\[ w_{term} = \mathrm{tf}(term, T_C) \cdot \mathrm{idf}(term, T_H) \quad \text{s.t.} \quad \begin{cases} \mathrm{tf}(t, d) = \dfrac{f(t, d)}{\max\{ f(w, d) : \forall w \in d \}} \\[2ex] \mathrm{idf}(t, D) = \log \dfrac{|D|}{|\{ d \in D : t \in d \}|} \end{cases} \tag{5.7} \]

where tf computes the frequency of the term in the current tweet document ($T_C$), and idf computes the inverse document frequency over all the historical tweet documents ($T_H$). A high weight in Equation (5.7) is reached by a high term frequency in the current tweets and a low document frequency of the term in the whole collection of historical tweets. Using these weights, we can filter out the terms that appear frequently in the historical tweets. Finally, we rank all the terms by weight to describe the anomalies. The term cloud in Figure 5.6 is a sample visualization based on the weights; the size of each term is proportional to its weight.

To conclude, by applying the geographic constraint and the time span, and by guaranteeing the uniqueness of the terms, our approach is able to retrieve relevant social media posts (e.g., tweets) that describe an anomaly. The efficiency and effectiveness of our approach are shown in the experiment section.

5.4.3 Visualization

Our system presents a visual representation of the discovered anomalies for users.
We present a navigation view for use by drivers and an analysis view for planners. Our design is informed by work on stacked graphs [12], flow maps [57], and road network visualization [61].

Our visualization shows a depiction of the road network overlaid on a satellite image. This shows the context of the anomaly, both in terms of the roads that are involved and the surrounding city geography. For the navigation view, seen in Figures 5.11(b) and 5.12(b), we display an anomaly as a colored subgraph. The road segments of the anomaly are colored green, yellow, or red if the travel time is less than 2x, between 2x and 3x, or greater than 3x the historical travel time, respectively. If the travel time is not available, the segments are colored red if there is a decrease in flow and yellow otherwise. At each downstream boundary of the anomaly, an arrow represents the direction in which the traffic is flowing.

For the analysis view, each road segment is additionally drawn with one width corresponding to the current flow and another width corresponding to the historical flow. The geometry representing the current flow is colored red, yellow, or green, while the historical flow is colored black.

[Figure 5.7: Example of analysis view.]

To demonstrate the analysis view, consider Figure 5.7. Here, we observe that the flow near the accident is less than the historical flow, and the speed is more than 3x slower. We can also see that the flow on the offramp has increased and is moving at no less than half of the historical speed. The flow has increased along some detour routes, where the speed has remained at least half of the historical speed. The speed on the onramp has dropped to less than a third of the historical speed, raising the possibility that the traffic jam could extend down the ramp and affect the crossing highway.

5.5 Experiments

5.5.1 Dataset

Mobility Data: We use GPS trajectories as mobility data, with statistics shown in Table 5.3.
As about 20% of the traffic on road surfaces in Beijing is generated by taxicabs, the taxi trajectories represent a significant portion of the traffic flow on the road network. While we use taxi trajectories for validation, we believe our system and method are general enough to accept trajectory data generated by other sources, such as public transit or location-based check-in data, as long as they reflect mobility on the road network.

Road Network: We use the road network of Beijing, with statistics shown in Table 5.3.

Traffic Anomaly Reports: We use the traffic anomaly reports published by transportation agencies as the ground truth to evaluate the effectiveness of our approach. The statistics are shown in Table 5.3.

Table 5.3: Statistics of dataset

  Category       Statistic                    Value
  Data           duration                     Mar-May, 2011
  Trajectories   # of taxis                   13,597
  Trajectories   # of effective days          51
  Trajectories   # of trips                   19,455,948
  Trajectories   avg. sampling interval (s)   70.45
  Roads          # of road segments           162,246
  Roads          # of road nodes              121,771
  Reports        avg. # of reports per day    23

5.5.2 Evaluation Approach

In this study, we explore the effectiveness and efficiency of our approach to traffic anomaly detection, as well as the efficiency of our approach to term mining, which helps analyze and describe the detected anomalies. We use the traffic anomalies reported in the last three weeks of the 3-month period as test data to evaluate the overall accuracy of our approach. We study the performance of our method using a time discretization of 30 minutes. In other words, we run our anomaly detection method every 30 minutes, treating the taxi trajectories collected during this time interval as current data and all the trajectories collected before it as historical data from which the regular routing behavior is calculated. According to the study in [13], the length of the time interval is a trade-off between the computational load and the timeliness of an application.
Measurement: To evaluate the effectiveness of our approach, we treat the reported traffic incidents as a subset of the ground truth, because the reported incidents are not necessarily a complete set. We use recall to measure the accuracy of the detected anomalies. In our experiments, recall is the number of reported anomalies that are detected divided by the number of all reported anomalies. Note that we do not use precision in this evaluation, precisely because the reported incidents are not a complete set of ground truth. It is entirely possible that a traffic anomaly that changes routing behavior and causes travel time delay is detected by our approach but not reported by the transportation authorities, as in the second case study presented in the results section.

Baselines: To evaluate the accuracy of our approach, we use a modified version of Principal Component Analysis (PCA), as applied in [13], as the baseline anomaly detection approach. Unlike our work, this method focuses solely on traffic flow. The details of the implementation are as follows: we first apply PCA to a matrix of all road segments to find the anomalous road segments during a specific time period; we then aggregate nearby road segments into a connected graph, which serves as the anomalous graph.

For the anomaly analysis, our baseline is an anomaly detection algorithm that relies purely on social media, similar to [48], which was originally proposed to detect the location and the description of earthquakes in real time. That approach uses keywords such as "earthquake" to filter out irrelevant tweets. In our case, however, there is no indication of which terms might be relevant to the anomaly, so we cannot use pre-defined keywords for filtering. As a result, we use this approach without the keyword-filtering step as our baseline.
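The recall measure described above can be sketched in a few lines. This is a minimal illustration, not the evaluation code used in the experiments: the matching rule (a report counts as detected when its road segment belongs to a detected anomalous graph in the same 30-minute slot), the tuple layouts, and the segment identifiers are all assumptions made for the example.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical record layouts, chosen only for this sketch.
Report = Tuple[str, int]                # (road segment id, 30-minute slot index)
Detection = Tuple[FrozenSet[str], int]  # (segments of anomalous graph, slot index)


def is_detected(report: Report, detections: List[Detection]) -> bool:
    """Assumed matching rule: same slot, and the reported segment lies in the graph."""
    segment, slot = report
    return any(segment in segments and slot == d_slot
               for segments, d_slot in detections)


def recall(reports: List[Report], detections: List[Detection]) -> float:
    """Fraction of reported anomalies that are covered by some detection."""
    if not reports:
        raise ValueError("recall is undefined without reported anomalies")
    hits = sum(is_detected(r, detections) for r in reports)
    return hits / len(reports)


# Toy data: one of the three reports is covered by a detected graph.
reports = [("seg_12", 17), ("seg_40", 17), ("seg_77", 18)]
detections = [
    (frozenset({"seg_11", "seg_12"}), 17),
    (frozenset({"seg_90"}), 18),  # detected but unreported: does not affect recall
]
print(recall(reports, detections))
```

Because detections without a matching report never enter the ratio, the measure stays meaningful even when the reports are an incomplete ground truth, which is the reason precision is not used here.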
5.5.3 Results

Effectiveness

To evaluate our approach, we show the results for two rush-hour time intervals (i.e., 7-9AM and 4-6PM) on 5/12/2011 in Figure 5.8, where the caution labels indicate the locations of the anomalies. Figure 5.8 (a) and (b) show all the reported anomalies during the two time intervals; (c) and (d) show the anomalies detected by the baseline approach; and (e) and (f) show the anomalies detected by our approach.

As shown in Figure 5.8, in both time intervals our approach detects more anomalies than the baseline approach. In particular, for the 7-9AM interval, our approach detects all the reported anomalies, whereas the baseline detects only two of them. For 4-6PM, our approach detects 8 reported anomalies, whereas the baseline detects only 7 of them. Specifically, in this particular experiment, the recall of our approach improves on the baseline by 85.6%. Over all the test data (i.e., all the anomalies that occurred in the last three weeks of the dataset), the average recall of our approach is 86.7%, while that of the baseline is around 46.7%. We therefore claim that our approach significantly outperforms the traffic-volume approach. We believe this is because our approach detects anomalies that are reflected not only in traffic volume changes but also in changes of routing behavior.

To further illustrate the advantage of our approach, we choose a particular anomaly that was detected by our approach during 8:30AM to 9:00AM but not by the baseline approach in this time interval. In this case, based on our detected graph, there is a significant shift in routing behavior from the main road (denoted as M-routing) to the auxiliary road (denoted as A-routing). In Figure 5.9, we visualize the change of the overall flow (the sum of the flow on the two routes) as well as the change of the routing behavior between the main road and the auxiliary road over time.
According to Figure 5.9(a), during the interval of 8:30AM to 9:00AM, the overall flow on the two routes did not differ much from the regular flow. However, during 9:00AM to 9:30AM, as people started to avoid the anomaly region, the overall flow decreased. The routing percentage, on the other hand, changes in a different manner from the overall flow. According to Figure 5.9(b), people start to change their routes immediately after the anomaly happens (i.e., in the interval between 8:30AM and 9:00AM). During 9:00AM to 9:30AM, the routing behavior starts to return to normal, while the overall flow in this region starts to behave abnormally. Therefore, in this case, when the sliding window reaches the time interval 8:30AM to 9:00AM, the baseline approach cannot detect the anomaly because the overall traffic volume has not changed significantly. Because our approach considers the change of routing behavior, it can identify the anomaly in a more timely fashion than the baseline approach.

Efficiency

In this set of experiments, we compare our anomaly detection approach with the same approach without the online index structure. Both approaches are implemented on a 64-bit server running Windows Server 2008 using a single thread of a 2.66 GHz CPU with 16 GB of memory. Figure 5.10 shows the comparison result. As the size of the detected mobility graph grows, our approach performs increasingly better than the approach without an index, because the no-index approach spends a great amount of time on verifications for all the O-D pairs during the expansion. Our approach, in contrast, uses the additional data structure to avoid scanning every path between all the O-D pairs. Table 5.4 shows the average processing times for the major steps in anomaly detection.

Table 5.4: Average processing time in anomaly detection

  Procedure Name               Time (s)
  Routing Behavior Analysis    1.2
  Anomalous Seed Selection     <1
  Anomalous Graph Expansion    5.2
The map matching procedure always runs in the background as a pre-processor that converts each GPS trajectory we collect online. The average processing time for map-matching one trajectory is 0.085 seconds. Assuming that the number of anomalies is less than 10 per 30-minute period, our system can detect these anomalies within 1 to 2 minutes. The efficiency of the anomaly analysis based on social media is evaluated through the case studies.

Case Studies

We further evaluate our approach using two case studies: one anomaly is both reported and detected by our system, and the other is detected by our system but not reported. The reported anomaly was caused by a traffic accident, and the unreported anomaly was probably caused by a wedding expo, according to our analysis. The results for these case studies are depicted in Figures 5.11 and 5.12, respectively. In these figures, (a) presents the anomalous graph detected by the baseline approach, and (b) presents the anomaly detected by our approach. On the anomalous graph, the red, yellow, and green lines indicate the travel time metric, as described in Section 5.3, and the caution mark represents the location of the anomaly reported by the transportation authorities. In addition to the detected anomalous graphs, we present the results of the term analysis in sub-figure (c). In sub-table (d), we compare some relevant results during ordinary times with those during the anomaly, such as the number of tweets and the tf-idf value of important terms.

For the first case, according to the anomaly report, during 4PM-4:30PM on 5/19/2011, a two-car accident occurred on the Lianhua bridge in the north-bound direction. In the anomalous graph detected by the baseline approach (i.e., Figure 5.11 (a)), only a small part of the highway around the Lianhua bridge is included.
On the other hand, our detected graph (i.e., Figure 5.11 (b)) provides a more comprehensive view in the following two aspects: 1) we detect a larger and more complete region impacted by this anomaly; 2) by showing the yellow and green road segments, we can offer end users routes (i.e., auxiliary lanes) to detour around or avoid the regions covered by red lines. These routing suggestions are implied in our detected graph because many people change their routing behavior to avoid or escape the anomaly.

In addition, we show the corresponding top 50 terms mined for this case in Figure 5.11(c). According to this figure, the most highlighted words are "traffic" and "accident", which is consistent with the anomaly reports from the Beijing Transportation Bureau. The result also reveals other information relevant to the anomaly. For example, the terms "two", "vehicles", and "car", which may indicate that the accident involved two cars, are also consistent with the anomaly reports. The term "north" indicates the direction of the lane where the accident happened, and "rainy" reveals the weather at the time of the anomaly. Figure 5.11(d) shows that during the anomaly there is no significant increase in the number of tweets referring to the anomaly location compared with ordinary times. However, the tf-idf values of some terms, such as "accident" and "traffic", change significantly. By using the idea of tf-idf, our method can successfully identify the relevant terms.

In the second case, our approach detected an anomaly from 8:30AM to 9AM on 5/27/2011 near the Beijing Exhibition Center. There are no anomaly reports from transportation agencies for this time and location; however, based on our analysis of online social media, the 18th Beijing Wedding Expo opened at 9AM at the Beijing Exhibition Center.
According to the local news, each year the wedding expo attracts many wedding-related companies that exhibit and sell their products, as well as thousands of young people as customers, so it can be considered a significant shopping event. As shown in Figure 5.12, our detected anomalous graph is again more comprehensive and informative than the graph detected by the baseline approach.

Unlike the previous case, there are no official transportation reports for this detected anomaly. To understand the anomaly, we further conduct the term analysis on the online social media, with the result shown in Figure 5.12(c). In this figure, the most frequent mined terms are "wedding" and "expo", which suggests the cause of the detected anomaly. The detected terms "promotions" and "shopping" may also suggest that this event offers great deals that attract many people to shop there. From these mined terms, we could infer that the traffic anomaly (i.e., significant travel time delay) is caused by too many people attending the wedding expo at its opening time at the Beijing Exhibition Center. To conclude, our system can detect not only the traffic anomalies reported by the transportation agencies but also, more importantly, the anomalies that are not reported.

Table 5.5: Comparison based on # of tweets used

            Baseline                                  Our Approach
            $|T_H|$             $|T_C|$               $|T_H|$             $|T_C|$
  Case 1    $9.1 \times 10^7$   $1.7 \times 10^6$     $9.7 \times 10^3$   $1.9 \times 10^2$
  Case 2    $8.5 \times 10^7$   $1.9 \times 10^6$     $3.4 \times 10^4$   $5.6 \times 10^2$

Table 5.5 compares our approach with the baseline in terms of the number of tweets used in the two cases studied above. Here, $|T_H|$ denotes the number of tweets published historically at the time of the anomaly. For example, for the first case, $T_H$ represents all the tweets posted during 4:30PM to 5PM on each day before 05/19/2011 in the historical dataset. $|T_C|$ denotes the number of tweets published at the time of the anomaly.
As presented, in both cases the number of tweets analyzed by our approach is significantly smaller than that of the baseline (e.g., reduced from the level of $10^6$ to as low as $10^2$). Because our approach focuses on the tweets that are relevant (i.e., both spatially and temporally) to the detected anomaly graph, the search space of tweets is greatly reduced compared with the baseline approach.

[Figure 5.8: Traffic anomalies reported to authorities, discovered by the baseline PCA approach, and discovered by our method from 7AM to 9AM and from 4PM to 6PM on 5/12/2011. Panels: (a) 7-9AM: Reported incidents; (b) 4-6PM: Reported incidents; (c) 7-9AM: Baseline results; (d) 4-6PM: Baseline results; (e) 7-9AM: Our results; (f) 4-6PM: Our results.]

[Figure 5.9: Effects of time intervals. Panels: (a) Routing flow comparison (traffic flow over time t; flow in anomaly vs. flow in regular); (b) Routing behavior comparison (percentage over time t; M-routing and A-routing, in anomaly vs. in regular).]

[Figure 5.10: Effects of index. Processing time (milliseconds) vs. size of graph, without and with the online index.]

[Figure 5.11: Case study 1. Panels: (a) Detected by baseline; (b) Detected by our method; (c) Terms discovered; (d) Relevant results, ordinary vs. anomaly: # of tweets $1.6 \times 10^2$ vs. $1.9 \times 10^2$; $w_{traffic}$ 0.361 vs. 0.903; $w_{accident}$ 0.202 vs. 0.978.]

[Figure 5.12: Case study 2. Panels: (a) Detected by baseline; (b) Detected by our method; (c) Terms discovered; (d) Relevant results, ordinary vs. anomaly: # of tweets $2.0 \times 10^2$ vs. $5.6 \times 10^2$; $w_{wedding}$ 0.121 vs. 0.447; $w_{expo}$ 0.060 vs. 0.298.]

Chapter 6

Forecast impact on arterial streets and intersected freeways

In our previous research, discussed in Chapters 3 and 4, we focused on predicting traffic speed in the upstream direction, i.e., predicting the backlog on the upstream road segments impacted by an incident.
However, accidents and special events such as road construction, major sporting games, and concerts cause surges in traffic demand that overwhelm their vicinity with flow that differs radically from typical patterns. For example, as illustrated in Figure 6.1, a traffic accident on a freeway may impact the traffic flow in the following three types of locations:

1. the upstream stretch of the highway on which the incident occurs
2. the adjacent arterial streets
3. other surrounding freeways

[Figure 6.1: Impact of a traffic incident. Map showing traffic sensors $s_0$ to $s_4$, the traffic incident, and the potential impact directions toward location types 1, 2, and 3.]

In this chapter, we propose algorithms that address scenarios (2) and (3) to forecast the impact of an incident in its vicinity. Consequently, the major research questions and challenges that we address in this chapter are a) to identify the set of road segments that will be impacted given a new incident "e", and b) for each impacted road segment, to predict the spatiotemporal performance decrease, i.e., to determine when and how the impact will occur in time and space.

We use Granger causality to describe the causal interactions among traffic at different road segments and thereby address these challenges. Consider the road segments with sensors depicted in Figure 6.1. To identify the causality relationships among them, we use three years of historical traffic sensor datasets to train regressive models that determine whether one time series (e.g., collected from $s_0$) is useful for predicting another time series (e.g., collected from $s_1$). If a change in traffic performance (e.g., a decrease or increase in traffic speed) at $s_0$ leads to a change in traffic performance at another location $s_1$, then when a traffic incident occurs near $s_0$, we identify $s_1$ as part of the impacted area and use the traffic performance at $s_0$ to predict the traffic speed at $s_1$.
Consequently, given a traffic incident and its attributes, we use the detected causality to predict the traffic performance in the near vicinity of the traffic incident.

Given the solution above, the only remaining issue is how to detect causality in traffic speed time series. One straightforward idea is to use the traditional Granger causality test [23]. However, when we tried the Granger causality test on real-world traffic data collected over three months, we found that hardly any Granger causality existed between any pair of traffic speed time series. To address this challenge, in this chapter we investigate the characteristics of traffic speed time series and propose two types of time-sensitive causality that are unique to them. Specifically, for two traffic speed time series with correlated historical patterns, we observe that the causality exists only at the beginning of rush hours, when the traffic starts to become congested. Such causality exists only between two road segments that have strong connectivity in the road network. Conversely, in other connectivity scenarios, especially when the two time series are not correlated, we observe another type of causality that exists only when an intervention (i.e., a traffic incident) happens on the road network during non-rush hours. Because both time-sensitive causalities involve changes in traffic performance that resemble the changes caused by a traffic incident, we use the detected causality for the impact prediction of traffic incidents.

The remainder of this chapter is organized as follows. Section 6.1 introduces the preliminaries of Granger causality. In Section 6.2, the intuition behind and the definition of the time-sensitive causality are introduced. Section 6.3 discusses how we use the causality relationship to predict the impact caused by traffic incidents.
We present the evaluation results of our approach in Section 6.4.

6.1 Preliminaries

6.1.1 Granger Causality

Granger causality [23] is one of the earliest methods developed to quantify causal effects from time series observations. It was originally proposed for economic time series and is based on the commonly accepted observation that a cause occurs prior to its effect. Conceptually, $X$ Granger causes $Y$ if the past values of $X$ help predict the future value of $Y$ beyond what can be achieved using the past values of $Y$ alone. In this chapter, we focus on the linear regression formulation of Granger causality. Given two time series $X_t = \{x_1, x_2, \ldots, x_t\}$ and $Y_t = \{y_1, y_2, \ldots, y_t\}$, consider the following two regressions:

\[ Y_t = \sum_{l=1}^{L} \alpha_l Y_{t-l} + \varepsilon_1 \tag{6.1} \]

\[ Y_t = \sum_{l=1}^{L} \beta_l X_{t-l} + \sum_{l=1}^{L} \alpha_l Y_{t-l} + \varepsilon_2 \tag{6.2} \]

where $L$ is the maximum lag, $\alpha_l$ and $\beta_l$ are the regression coefficients, and $\varepsilon$ is a white noise series. If Eq. (6.2) is a significantly better model than Eq. (6.1), we determine that the time series $X_t$ Granger causes $Y_t$. Here, "significantly better" means that the prediction error of $Y_t$ is reduced to a statistically significant degree by adding the variable $X$. To quantify the reduction of the prediction error, we define the improvement of prediction accuracy as follows:

Definition 1: Improvement of Prediction Accuracy (IPA). The IPA is defined as the relative difference in the prediction error of $Y_t$ obtained using the models defined in Eq. (6.1) and Eq. (6.2), as shown in Equation (6.3).

\[ \mathrm{IPA} = \frac{\mathrm{err}(\text{Eq.}(6.1)) - \mathrm{err}(\text{Eq.}(6.2))}{\mathrm{err}(\text{Eq.}(6.1))} \times 100\% \tag{6.3} \]

Here, the function err calculates the root mean square error between the ground truth and the predicted values during the test of the corresponding model. In studies on financial time series [55], 5 percent is most commonly used as the significance level for determining a causality relationship.
In this study, we follow the same criterion using the following definition: if the IPA of the model defined in Eq. (6.2), compared with the model defined in Eq. (6.1), is no less than 5 percent, then $X_t$ Granger causes $Y_t$.

During the causality test, we also use a correlation measurement, which is defined as follows:

Definition 2: Correlation Ratio (COR). Given two time series $X$ and $Y$, the correlation ratio cor($X$, $Y$) is defined by Equation (6.4):

\[ \mathrm{cor}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)} \sqrt{\mathrm{var}(Y)}} \tag{6.4} \]

Here, the function cov calculates the covariance of two time series, and the function var calculates the variance of one time series. The closer the absolute COR value is to 1, the higher the correlation between the two time series. According to the discussion in [5], causality implies correlation, but correlation does not imply causality.

6.1.2 Lasso-Granger

The Lasso-Granger method was proposed to provide a graphical causality model and to reduce the computational complexity of the Granger method [7]. Lasso-Granger employs the Lasso method to avoid the exhaustive one-to-one regressions and statistical tests used in the Granger method [23]. The Lasso method selects variables by assigning zero coefficients to the variables to be eliminated and non-zero coefficients to the variables to be selected [17]. The Lasso-Granger method identifies the causal relationships among a set of variables by estimating a coefficient vector $\beta$ that minimizes the following objective function:

\[ \min_{\beta} \; \frac{1}{n} \sum_{i=1}^{n} \Big( y_i - \sum_{j} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j} |\beta_j| \tag{6.5} \]

where $n$ is the number of examples in the input data and $\lambda$ is a constant to be determined.

6.2 Time-Sensitive Causality Detection

To predict the impact of traffic incidents on freeways, the key is to understand the traffic causality relationship from an incident's location to other adjacent locations.
Due to the randomness of traffic incidents and the sparsity of sensor placement, it is difficult to collect exact traffic information from sensors (i.e., loop detectors installed on roads) at incident locations or at all possible adjacent locations. To solve this problem, we use the traffic data collected from the nearest upstream sensor(s) $s$ to represent the traffic situation for any incident ($e$) that occurs on a freeway. The traffic data collected from other sensors ($\{s_i\}$) represent the adjacent traffic. Thus, by detecting the traffic causality relationship based on the sensor data collected from $s$ and $\{s_i\}$, we can infer the impact of an incident occurring near $s$ on the other locations represented by $\{s_i\}$.

In this chapter, we focus on the causality relationship for traffic from one freeway to adjacent arterial streets or to other freeways. Based on our studies on real-world datasets, this causality does not always exist. Therefore, based on the temporal characteristics of their existence, we categorize the causalities into two groups: slowdown causality and intervention causality. In the following sections, we introduce these two types of time-sensitive causality through case studies based on real-world traffic incidents, followed by the detection approaches for these causalities.
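Both causality types are detected with the IPA test of Section 6.1.1, so a compact sketch of that test may be helpful before the case studies. The code below fits the two regressions by ordinary least squares with NumPy and reports the IPA on a held-out tail of the series. It is an illustration under stated assumptions rather than the implementation used in this thesis: the function names, the 70/30 chronological train/test split, and the synthetic data are all hypothetical, and the models carry no intercept so that they mirror Eq. (6.1) and (6.2) literally.

```python
import numpy as np


def lag_matrix(series: np.ndarray, max_lag: int) -> np.ndarray:
    """Column l-1 holds series[t - l] for every target index t = max_lag .. n-1."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - l:n - l] for l in range(1, max_lag + 1)]
    )


def ipa(x, y, max_lag: int = 6, train_frac: float = 0.7) -> float:
    """IPA (in percent) of predicting y with lags of x and y versus lags of y only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    own_lags = lag_matrix(y, max_lag)                          # regressors of Eq. (6.1)
    all_lags = np.hstack([lag_matrix(x, max_lag), own_lags])   # regressors of Eq. (6.2)
    split = int(train_frac * len(target))                      # assumed chronological split

    def test_rmse(design: np.ndarray) -> float:
        coef, *_ = np.linalg.lstsq(design[:split], target[:split], rcond=None)
        residual = target[split:] - design[split:] @ coef
        return float(np.sqrt(np.mean(residual ** 2)))

    err_restricted = test_rmse(own_lags)
    err_full = test_rmse(all_lags)
    return 100.0 * (err_restricted - err_full) / err_restricted


# Synthetic check: y follows x with a one-step delay, so x should Granger cause y.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=2000)
y_demo = np.zeros(2000)
y_demo[1:] = 0.8 * x_demo[:-1] + 0.1 * rng.normal(size=1999)
print("x -> y causal:", ipa(x_demo, y_demo) >= 5.0)
print("y -> x causal:", ipa(y_demo, x_demo) >= 5.0)
```

The default `max_lag=6` matches the default lag value mentioned for the running example below; restricting the rows passed to the fit to a particular time-of-day window would be one way to reproduce the time-interval experiments, although that step is not shown here.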
6.2.1 Slowdown Causality

[Figure 6.2: Running example 1: causality from freeway traffic to arterial traffic. Map with the freeway sensor X (West) and the arterial sensors Y1 (North), Y2 (South), and Y3 (West) at location Y; plots of the regular speed (mph) over time t for X, Y1, Y2, and Y3.]

[Figure 6.3: Running example 1 cont.: effects of time interval in training data. IPA (%) of Y1, Y2, and Y3 for the training time intervals 6AM-9PM, 6-10AM + 2-7PM, 2-7PM, and 2-4PM.]

Consider the example illustrated in Figure 6.2: sensor X is located on the I-10 West freeway, and the three arterial sensors are located at Y (i.e., the closest on-ramp road segment toward X). To identify whether the traffic at Y is impacted when a traffic incident occurs at X, we try to identify the causality relationship between the freeway sensor (X) and the three arterial sensors (i.e., Y1, Y2, and Y3).

To examine the causality from X to the three sensors at Y, for each pair we learn the two models in Eq. (6.1) and (6.2) using the 3-month traffic speed time series dataset described in Section 4.5 (see Footnote 1) and compute the IPA value according to Eq. (6.3). We also extract the regular weekday traffic speed pattern for sensor X and the three arterial sensors at Y, as illustrated in Figure 6.2, and calculate the correlation ratio between X and the three Y sensors based on the regular speed pattern according to Eq. (6.4). As a result, we find that the IPA values are far below the 5 percent significance level for all three pairs; however, a strong correlation exists between the regular speed patterns of X and Y3, with cor(X, Y3) reaching 0.82, and a weak correlation exists between X and either Y1 or Y2, with correlation ratios of 0.16 and -0.45, respectively.

Footnote 1: In the implementation, we use the default maximum lag value (L = 6), as described in the experiment section, for the regressions in Eq. (6.1) and (6.2).
Based on the findings so far, it is easy to derive the hypothesis that although a strong correlation exists between X and Y3, there is no causality between them. However, this hypothesis only implies that there is no causality in general between X and Y3; it cannot rule out the existence of causality in special cases. According to the studies in [40], the traditional auto-regressive prediction approach is less efficient when there are significant drops in the traffic speed time series. Because our Granger causality test relies heavily on the auto-regressive approach, the result of the test may be entirely different under these scenarios. Thus, we conduct another causality test by varying the time intervals in the training dataset and depict the result in Figure 6.3. In this figure, the x-axis represents the time intervals that we chose for training the auto-regressive models of Eq. (6.1) and (6.2). As shown, the IPA values of Y1 and Y2 always stay below the 5 percent significance level. However, the IPA value of Y3 increases to 7.34% as we narrow the training time interval down to 2PM-4PM. Combining this result with the regular traffic speed pattern illustrated in Figure 6.2, we see that the time interval of 2PM-4PM covers the beginning of the afternoon rush hours, when there is a significant speed drop for both X and Y3. This result indicates that using X helps significantly (at the 5 percent significance level or better) in predicting Y3 during significant speed drops, while it does not help much in predicting Y3 during other time intervals. Driven by these observations, we define a new type of causality relationship as follows:

Definition 3: Slowdown Causality. Given two highly correlated traffic time series X and Y, we define the causality relationship (from X to Y) that is observed only when the traffic on X and Y has a significant speed drop (i.e., becomes congested) as a slowdown causality.
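One way to read Definition 3 as a decision rule is sketched below. The correlation threshold $\tau$ and the window subscript $W$ are shorthand introduced only for this illustration and are not symbols used elsewhere in this chapter; the 5 percent benchmark is the one from Section 6.1.1.

```latex
\[
X \xrightarrow{\;\text{slowdown}\;} Y
\quad\Longleftrightarrow\quad
|\mathrm{cor}(X, Y)| \ge \tau
\;\;\wedge\;\;
\mathrm{IPA}_{W}(X \to Y) \ge 5\%
\]
% W:   a training window that covers a significant speed drop on both X and Y.
% tau: a hypothetical cutoff for "highly correlated"; the text does not fix a value.
%
% Running example 1 for the pair (X, Y3):
%   cor(X, Y3) = 0.82,  IPA over W = 2PM-4PM is 7.34% >= 5%,
%   so the rule is satisfied for any tau <= 0.82,
%   even though the IPA over the whole day (6AM-9PM) stays below 5%.
```

Under this reading, Y1 and Y2 fail both conditions in the running example: their correlation ratios with X are 0.16 and -0.45, and their IPA values stay below 5 percent in every training window tested.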
The slowdown causality is detectable not only at the beginning of rush hours but also at the beginning of traffic incidents, as long as there is a significant speed drop. The slowdown causality is also in line with a common phenomenon in transportation: when the traffic on freeways starts to become congested, the high occupancy of the road and the interference of on-ramp signals make it difficult for automobiles to enter the freeways; therefore, the closest on-ramp streets also become congested.

In sum, by utilizing the 5 percent significance level as the benchmark, we can examine the slowdown causality and further utilize it to predict the impact of traffic incidents. Note that the choice of the benchmark (i.e., the 5 percent significance level) is closely related to the value of the prediction horizon. In the results depicted in this section, the prediction horizon is set to 1. The effects of the prediction horizon are discussed in the experiment section.

[Figure 6.4: Running example 2: causality from one freeway traffic to intersected freeway traffic. Sensors Y: West, X: South, X': West, X*: North. Panels plot speed (mph) against time t (6:00 to 18:00) for X' and X*, and for Y and X.]

6.2.2 Intervention Causality

Detecting the slowdown causality requires the existence of a high correlation between two time series. However, in many cases, this condition cannot be satisfied, especially with regard to causality detection from one freeway to another freeway. If we rely only on the slowdown causality to detect the impact region of incident e, we may omit locations whose traffic will also be impacted but which do not hold the slowdown causality with the traffic at e's location. For example, consider the scenario illustrated in Figure 6.4: sensor Y is located on freeway US101 West, sensor X is located on freeway I-405 South, and we try to detect the causality relationship between X and Y.
As a reference, we also include sensors X' (on US101 West) and X* (on I-405 North) in this scenario to cover all possible locations to which traffic at Y can diverge. Similar to the strategy employed in the previous case, we calculate the COR value between Y and the three other locations based on the regular traffic speed pattern illustrated in Figure 6.4. The COR values between Y and X, X' and X* are 0.32, 0.87 and 0.38, respectively. Based on this result, we can derive that the speed time series at Y is highly correlated with the speed time series on the same freeway (collected from X') but weakly correlated with those on the other freeway (collected from X or X*).

However, such findings cannot guarantee that there is no causality relationship between traffic on freeway US101 and traffic on freeway I-405. For example, consider the traffic collision accident depicted in Figure 6.5(a). This figure describes a real-world traffic collision accident that occurred on I-405 South at 6:59PM on Jan 24th, 2013, within the scenario depicted in Figure 6.4. On the right-hand side of this figure, the traffic speeds from sensors X and Y are plotted, with the black dashed line representing the regular speed pattern on Thursdays and the red solid line representing the speed time series collected on the accident day (i.e., Jan 24th, 2013). As shown in Figure 6.5(a), during the period after the incident occurred (i.e., 6:59PM), the traffic speed at both X and Y experienced an unusually significant speed drop compared with the corresponding usual speed pattern.

[Figure 6.5: Running example 2 cont.: in the presence of traffic incidents. (a) Sample traffic collision: collision at I-405 S. on 01/24/2013 (Thu) 06:59 PM; speed (mph) over time t at Y and X, regular Thursday vs. 1/24/2013 (Thu). (b) Effects of occurrence time:

Incident occurrence time    IPA of Y using X (%)
6:00-10:00                  2.9
10:00-14:00                 6.4
14:00-21:00                 7.5
6:00-21:00                  6.8
]
If we only consider the speed patterns of X and Y during the presence of this traffic accident and compute their correlation ratio, the result reaches 0.83. Such observations indicate that sensors on two different freeways, whose speed patterns are not correlated at regular times, can have highly correlated speed patterns in the presence of traffic incidents. Considering such scenarios, we work on two datasets: one is collected during regular times with few traffic incidents, and the other is collected during the first thirty minutes following an incident occurring at X. Note that to avoid the influence of rush hour traffic, only the data collected during the non-rush hour time intervals are selected. We examine the IPA of Y using X based on these two datasets. The result shows that for the former dataset, the IPA value is as low as 1.9%; however, the IPA reaches 6.8% for the latter dataset. Considering that 5% is the threshold used for causality detection, we can derive that in this case, the causality relationship between X and Y only exists after a traffic incident has occurred at X during its non-rush hour interval. To generalize this type of causality, we provide the formal definition as follows:

Definition 4: Intervention Causality: Given two traffic time series X and Y, we define a causality relationship (from X to Y) that is only observed when a traffic incident occurs at X during its non-rush hour as an intervention causality.

According to the definition, the intervention causality can be detected by using the data collected at the beginning of traffic incidents. Therefore, to further study the influence of rush hour on this causality, we conduct a set of evaluations by tuning the beginning time intervals of traffic incidents at X, and the results are shown in Figure 6.5(b).
As shown, if we use the data collected from 6AM-10AM with an incident just recently occurring to train/test the auto-regressive models for causality detection, the IPA of Y stays below the 5 percent significance level. Conversely, for the same type of data collected from 2PM-9PM, the IPA of Y reaches 7.5%. Combining this result with the regular speed pattern of X illustrated in Figure 6.4, for an incident occurring during 6AM to 10AM (i.e., the rush hours of X), the traffic at location X is already heavily congested; thus, the incident will probably not cause much difference in the speed time series of X or of Y. However, from 2PM to 9PM, the traffic on X is close to free flow and can be easily affected by anomalous incidents. Thus, incidents occurring during this time interval have a higher chance of causing significant changes in the speed time series of X and of Y. Based on this evaluation, we can derive that the intervention causality from X to Y can only be detected when the traffic at X is less congested (during the non-rush hour of X). This observation will be further reinforced in the experiment section.

6.3 Impact Prediction

In this section, we explain the proposed technique of utilizing time-sensitive causality to predict the impact of traffic incidents. Given the traffic dataset (D), which includes three years of historical sensor readings (e.g., speed), our impact prediction problem is defined as follows:

Problem Definition: For an incident e occurring at time t_0, the following three sets of parameters can be predicted:
(a) The set of sensors located on arterial streets and other freeways that are impacted: {s_i}.
(b) For each impacted sensor s_i, the significance of the impact (i.e., scale of speed decrease): v_i.
(c) For each impacted sensor s_i, the time stamp when it starts to become impacted: t_i.

In this definition, item (a) represents the spatial impact factor capturing the impacted range of the traffic incident.
Items (b) and (c) are closely related to the travel time delay caused by the incident and thus represent the temporal impact factor. By solving this problem, we can predict both the spatial and temporal impacts of traffic incidents.

To solve (a), we can first identify the closest upstream sensor (s_0) to incident e, followed by detecting the slowdown causality and intervention causality from s_0 to all of the nearby sensors on arterial streets and other freeways, as discussed above. As a result, we can identify the set of sensors {s_i} that hold a causality relationship with s_0, which serves as the answer to item (a). For (b), we can utilize the causality relationship learned from (a). Specifically, considering s_0 to be X and the impacted sensor s_i to be Y, according to the causality relationship learned from Equ. (6.2), any variation in X times the coefficient will result in a variation of Y. Thus, by knowing the speed changes reported by s_0 after the occurrence of the incident, we can predict the speed changes of s_i using the causality relationship. So far, the only unsolved item is (c). In the rest of this section, we detail our solution to (c).

During the causality detection phase, the default maximum lag value L is set to 6 in Eqs. (6.1) and (6.2). Because the aggregation level is 5 min for traffic speed readings, we utilize the traffic speed during the last 30 minutes (i.e., {t-i; i = 1,...,6}) to predict the average traffic speed in the coming 5 minutes. In this way, even if the causality is detected, we can only conclude that X_{t-1} to X_{t-6} are jointly helpful in predicting Y_t; we cannot clearly distinguish the contribution of each individual time stamp of X. To identify the set of X values that are most important for the prediction of Y_t, we consider {X_{t-i} | i = 1,...,6} to be independent variables and utilize the Lasso-Granger methodology to eliminate non-significant variables by assigning them zero coefficients.
After the elimination, we consider the lag difference between t and the position of the first non-zero coefficient as the delay, i.e., the time that must pass before a change in X results in an impact.

We present a small example to detail the specific solution of item (c) through Lasso-Granger. Consider the sample scenario depicted in Figure 6.2. When studying the slowdown causality between X and Y3, the coefficients of {X_{t-i} | i = 1,...,6}, as learned using Equ. (6.2), are

[-0.066, 0.298, -0.030, 0.002, 0.004, -0.002]

After considering X_{t-1} to X_{t-6} to be independent variables and applying the Lasso-Granger methodology, as defined in Equ. (6.5), the coefficients of {X_{t-i} | i = 1,...,6} are

[0.000, 0.469, 0.000, 0.016, 0.000, 0.000]

According to these results, the Lasso-Granger approach eliminates the variables with lag i ∈ {1, 3, 5, 6} by assigning them zero coefficients. The remaining variables are X_{t-2} and X_{t-4}. In this result, the first non-zero coefficient preceding time stamp t occurs at t-2, indicating that a variation in X will affect the value of Y3 after two time stamps. In this scenario, for any traffic incident that occurs at X and results in a significant speed decrease at X, its impact will affect the traffic speed at Y3 in 6 to 10 minutes. In this way, we address item (c) defined in the problem.

Note that the traffic on the arterial streets (e.g., Y3) is always controlled by traffic signals; thus, the arterial traffic speed fluctuates greatly according to the state of the traffic signal. Therefore, it is entirely possible that the traffic speed at the immediately preceding time stamp (e.g., X_{t-1}) is not a good indicator of the current traffic speed (e.g., X_t). Similarly, due to such fluctuations, if X_{t-2} is identified as an important indicator, it is entirely possible that X_{t-3} will not be chosen as an important indicator.
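The mapping from Lasso-Granger coefficients to an impact delay can be illustrated with the coefficient vector reported above. This is a minimal sketch: the function names and the tolerance are hypothetical, and the conversion of lag k into the window from (k-1)*5+1 to k*5 minutes is an interpretation of the "6 to 10 minutes" statement under the 5-minute aggregation level, not a formula given in the text.

```python
def first_nonzero_lag(coefs, tol=1e-9):
    """Return the 1-based lag of the first non-zero coefficient, or None if all are zero."""
    for lag, coef in enumerate(coefs, start=1):
        if abs(coef) > tol:
            return lag
    return None

def delay_window_minutes(coefs, interval_min=5):
    """Translate the first non-zero lag into a (start, end) delay window in minutes."""
    lag = first_nonzero_lag(coefs)
    if lag is None:
        return None
    return ((lag - 1) * interval_min + 1, lag * interval_min)

# Coefficients of X_{t-1} .. X_{t-6} after Lasso-Granger, as reported in the text.
lasso_coefs = [0.000, 0.469, 0.000, 0.016, 0.000, 0.000]
print(first_nonzero_lag(lasso_coefs))     # 2
print(delay_window_minutes(lasso_coefs))  # (6, 10)
```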
Combining the causality detection discussed in the last section with the important lag selection discussed in this section, the complete flow of solving our impact prediction problem is illustrated in Figure 6.6.

[Figure 6.6: Flow chart for impact prediction. Offline causality detection: for each sensor pair <s_0, s_i> from the real-time and archived traffic dataset, check whether the pair has a correlated pattern, whether slowdown causality exists and whether intervention causality exists; then select important lag(s) based on Lasso-Granger and re-train the regressive model. Online impact prediction: when incident e occurs, locate e's nearest sensor s_0 and e's adjacent sensors {s_i}; using the real-time traffic speed for s_0 and the regressive model, identify sensor s_i as to be impacted and predict its traffic speed.]

Given that a new incident e has just occurred, its closest upstream sensor s_0 is sent to the archived database to retrieve the relevant time intervals for causality detection. It is also utilized to retrieve a potential candidate sensor to be impacted. Then, the sensor pair <s_0, s_i>, together with the corresponding dataset, is passed to the causality detection model, which we use to identify whether the slowdown causality and intervention causality exist from s_0 to s_i. If the slowdown causality exists, we do not need to examine the intervention causality because the impact of significant speed drops from traffic incidents is already covered in the definition of the slowdown causality. At the end of the causality detection, the sensor pairs <s_0, s_i> holding a causality relationship proceed to the next step, and the sensor pairs <s_0, s_i> holding neither slowdown nor intervention causality are disregarded. In the former case, s_i is considered one of the impacted sensors that contribute to the spatial impact range for item (a) in the problem definition. In the latter case, s_i is excluded from the spatial impact range caused by incident e.
For sensor pairs <s_0, s_i> holding the causality relationship, we select the important lag variable from s_0 to identify the time stamp when s_i starts to become impacted, which addresses item (c) in the problem definition. We then re-train the regressive model for predicting s_i based on the past values of s_i and the selected lag in s_0. Finally, we utilize the real-time traffic speed data collected from s_0 and the learned regressive model to predict the speed of s_i as the solution to item (b) in the problem definition.

To enable real-time impact prediction, the causality detection and important variable selection steps in this flow chart need to be implemented offline. Because the training step in the regressive model and the lasso approach require access to large amounts of archived traffic time series data, performing causality detection and important variable selection online would significantly delay the prediction process due to the large training time. To expedite the process, the causality detection and variable selection steps need to be completed offline for every sensor pair on the road network. In this way, when a new incident occurs, the system will search within the offline training results to identify whether causality exists between the corresponding sensor pairs and will further retrieve the learned regressive model for the online traffic speed prediction for the sensor to be impacted.

Table 6.1: Dataset description

Traffic sensor data
  duration                   Jan. 1st - Mar. 31st, 2013
  # of sensors               4,230
  sampling rate              1 reading/30 secs
  temporal aggr. interval    1 min
  spatial range              OC & LA County
Incident data
  # of incidents             6,025
  updating rate              1 min
  spatial range              OC & LA County

6.4 Experiments

6.4.1 Experimental Setup

Data Set

At our research center, we maintain a very large-scale and high-resolution (both spatial and temporal) dataset collected from the entire network of LA County highways and arterial streets [46].
We have been continuously collecting and archiving the data for the past three years. We use this real-world dataset to create and evaluate our techniques. This dataset includes the following:
1. Traffic data: collected from traffic sensors covering approximately 5000 miles. The sensors report occupancy, volume and speed values.
2. Incident data: collected from various agencies, including California Highway Patrol (CHP), LA Department of Transportation (LADOT), and California Transportation Agencies (CalTrans).
The statistics about this dataset are given in Table 6.1.

Evaluation Method

With our experiments, we first reinforce our findings in causality detection by tuning the system parameters (i.e., prediction horizon and rush-hour time interval). Second, we reveal the effectiveness of detecting the slowdown causality and intervention causality for prediction in the presence of traffic incidents. Third, we evaluate the overall prediction accuracy of the approaches using both causality detection and the important variable selection strategy. Finally, we utilize a sample scenario describing a real-world navigation problem to illustrate the superiority of our approach in calculating travel time. The prediction accuracy is measured by the absolute error or the Root Mean Square Error (RMSE) between the predicted traffic speed (i.e., \hat{v}_i) and the actual traffic speed (i.e., v_i). The definition of RMSE is as follows:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (v_i - \hat{v}_i)^2} \tag{6.6} \]

In these experiments, our model is built based on the detected time-sensitive causality and utilizes data collected from both X and Y to predict Y. To ensure a fair comparison based on the same amount of data, we consider the regular regressive model shown in Equ. (6.2) to be the baseline. According to the definition of Equ. (6.2), the regressive baseline model is also trained on both X and Y to predict Y.
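For reference, the RMSE of Equ. (6.6) can be computed as in the following sketch. The function name and the example speed values are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between actual speeds v_i and predicted speeds v-hat_i."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(v - v_hat) ** 2 for v, v_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only (mph): two readings with errors of 3 and 4 mph.
print(rmse([60.0, 40.0], [57.0, 44.0]))  # sqrt((9 + 16) / 2), about 3.54
```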
In the evaluation, we compare the following techniques: the baseline, Prediction with the detection of Slowdown causality and Intervention causality using Auto-Regression (SIAR), and Prediction with detected causality and important variable selection (SIAR+lasso). In our default setting, the prediction horizon is set to 1 (i.e., 5 min), and the significance level for the causality detection is set to 5 percent.

6.4.2 Results on Running Examples

In Section 6.2, we utilize two real-world examples to illustrate the detection of the slowdown causality and intervention causality. To validate the observations made in that section, we revisit these examples and conduct more experiments.

Effects of the prediction horizon

First, we conduct a set of experiments for the first running example shown in Figure 6.2 under different prediction horizons. By analyzing the results shown in Figure 6.3, we observe that for the time interval 2PM-4PM, causality exists between X and Y3 but does not exist between X and Y1 or Y2. This observation is based on the experiment in which the prediction horizon is set to 5 min and the significance level is set to 5 percent. It is entirely possible that under other settings, the existence of the causality relationship between X and Y1, Y2 or Y3 may be different. Thus, we tune the value of the prediction horizon and show the result of predicting the traffic speed from 2PM to 4PM in Figure 6.7. To implement the prediction model under a different horizon, instead of always using Y_{t+1} in Eqs. (6.1) and (6.2), we train the model directly with Y_{t+h} (h ∈ {1,...,6}). Here, one time stamp represents a 5-minute time interval. In this way, the prediction interval is set from 5 minutes to 30 minutes.

[Figure 6.7: Effects of prediction horizon on running example 1. (a) Sample incident on I-405 N.: IPA (%) for Y1, Y2 and Y3 over prediction horizons of 5 to 30 min. (b) RMSE for Y3 prediction: Baseline vs. SIAR over prediction horizons of 5 to 30 min.]

In this set of results, Figure 6.7(a) shows the effects of the prediction horizon on the IPA values for utilizing X in the prediction. From this figure, we can observe that for Y3, the improvement of the prediction accuracy (IPA) by using X increases significantly as the prediction horizon increases, which indicates that the usage of X significantly improves the prediction of Y3. In this way, the existence of causality between X and Y3 can be verified. Moreover, the IPAs of Y1 and Y2 do not increase significantly as the prediction horizon increases; thus, their causality relationship with X cannot be identified.

This set of results also gives us some hints on how to choose the significance level. According to the study in [40], increasing the prediction horizon may decrease the prediction accuracy of traditional regressive approaches due to their reliance on data from the immediate past. Such decreases leave more room for improvement by using X. Therefore, when we increase the prediction horizon, the significance level should also be increased. For example, the 5 percent significance level is not a proper benchmark for identifying the causality relationship when the prediction horizon is 30 minutes: according to the 30-min result shown in Figure 6.7(a), it would mis-identify the existence of causality between X and Y1.

Figure 6.7(b) compares the prediction error for Y3 under different prediction horizons using the baseline approach and our approach. According to this figure, our SIAR approach provides a more accurate prediction result at all prediction horizons compared with the baseline. This result indicates that detecting the slowdown causality helps predict Y3 from 2PM to 4PM compared with the baseline approach, which is based on the direct usage of X at all time intervals.
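The direct multi-horizon training described above (fitting against Y_{t+h} rather than Y_{t+1}) amounts to pairing the same lagged inputs with a target shifted h time stamps ahead. The sketch below shows one way to build such training samples; the function name and the layout of the feature vector (lags of Y followed by lags of X) are assumptions made for illustration, not the layout of Eqs. (6.1) and (6.2).

```python
def horizon_samples(x, y, max_lag=6, horizon=1):
    """Pair the last max_lag readings of y and x (up to time t) with y at t + horizon."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    samples = []
    for t in range(max_lag - 1, len(y) - horizon):
        y_lags = [y[t - k] for k in range(max_lag)]  # y[t], y[t-1], ..., y[t-max_lag+1]
        x_lags = [x[t - k] for k in range(max_lag)]  # x[t], x[t-1], ..., x[t-max_lag+1]
        samples.append((y_lags + x_lags, y[t + horizon]))
    return samples

# With 5-minute time stamps, horizon=6 corresponds to a 30-minute-ahead target.
toy_y = list(range(10))
toy_x = list(range(100, 110))
print(horizon_samples(toy_x, toy_y, max_lag=3, horizon=2)[0])
# ([2, 1, 0, 102, 101, 100], 4)
```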
Effects of rush-hour time interval

Second, we revisit the second running example illustrated in Figure 6.4 to examine the effects of rush hour. In that example, the IPA value of Y using X is higher when X is less congested in the afternoon. We thus claimed that the intervention causality from X to Y can only be observed when the traffic at X is less congested. To validate this hypothesis, in this experiment, we examine the intervention causality between X* and Y in the same scenario. As shown in Figure 6.4(a), X and X* are located in different directions of freeway I-405, and the time periods of their rush hours are distinct. Specifically, for X, the rush hour congestion only exists in the morning period, while for X*, it only exists in the afternoon. Thus, if this hypothesis stands, the IPA value of the intervention causality between X* and Y should be higher in the morning, when X* is less congested. Driven by this hypothesis, we tune the incident occurrence time interval to examine the intervention causality between X* and Y, and the result is shown in Table 6.2.

Table 6.2: Effects of rush hour interval on X*

Time interval            6AM-10AM    10AM-4PM    4PM-8PM
IPA of Y by using X*     8.85%       7.94%       2.28%

From this result, we can observe that the IPA value from 6AM to 10AM is significantly higher than that from 4PM to 8PM. Combining the result with the traffic speed pattern of X* shown in Figure 6.4(a), we can derive that the intervention causality between X* and Y is again only detected when X* is less congested (e.g., from 6AM to 4PM). Therefore, we verify our hypothesis regarding the temporal characteristics of the intervention causality.

6.4.3 Result on Prediction Accuracy

In this sub-section, we first choose two cases to compare the prediction accuracy between our SIAR approach and the baseline. The first case that we choose is a traffic collision accident that occurred on I-10 West on January 27th (Sunday), 2013, and we try to predict its impact on the arterial streets.
Figure 6.8 illustrates the traffic speed and the prediction accuracy for an on-ramp arterial sensor to I-10 West. As shown in Figure 6.8(a), on regular Sundays, the traffic speed at this arterial sensor stays at approximately 35 MPH from 3PM to 8PM. However, on January 27th, a sudden speed change was reported during this time interval due to the accident that occurred on I-10 W. When predicting the traffic speed of this arterial sensor on this day, the traditional regressive approach with little help from freeway traffic (i.e., the baseline) exhibits an absolute prediction error of at least 10 MPH at both the beginning and the end of the traffic incident, according to the plot shown in Figure 6.8(b). Our SIAR approach presents a prediction accuracy similar to the baseline both before and after the incident, but it reduces the absolute error by up to 93.8% compared with the baseline approach at the boundaries of the traffic incident.

[Figure 6.8: Sample traffic incident on I-10 W. at 4:28PM. (a) On-ramp arterial traffic speed: speed (mph) over time t, regular Sunday vs. incident day. (b) Prediction accuracy comparison: absolute error over time t, Baseline vs. SIAR.]

The second case that we choose is a traffic hazard incident that occurred on I-405 North on January 31st (Thursday), 2013. It was located 0.6 miles from the intersection with I-10. Our goal is to predict the impact of the incident on freeway I-10 West. Similar to the previous case, Figure 6.9 illustrates the traffic speed and prediction accuracy for a freeway sensor located on I-10 West heading toward the intersection. According to Figure 6.9(a), the traffic speed at this sensor drops significantly near 6:50PM compared with its regular traffic speed on Thursdays at the same time interval.
From Figure 6.9(b), we can clearly observe that the baseline approach cannot effectively predict such significant speed changes, while our SIAR approach can identify such changes by learning the intervention causality from the traffic on freeway I-405. As a result, the SIAR approach improves the prediction accuracy by up to 48.5% compared with the baseline approach. The SIAR model may present less accurate predictions during time intervals with no traffic incidents involved, such as 5:30PM to 6:30PM, as illustrated in Figure 6.9(b). In real-world navigation systems, we can use traditional regressive approaches to predict the traffic when no incidents are involved and switch to our approach to predict the traffic immediately after an incident occurs.

[Figure 6.9: Sample incident on I-405 N. at 6:38PM. (a) Traffic speed on I-10 W. towards the intersection: speed (mph) over time t, regular Thursday vs. incident day. (b) Prediction accuracy comparison: absolute error over time t, Baseline vs. SIAR.]

In addition to the case studies, we conduct an overall evaluation of the traffic speed prediction based on a total of 603 freeway traffic incidents occurring on weekdays. In these experiments, 333 incidents collected on freeway I-10 are used to evaluate the impact on arterial streets, and 270 incidents collected on I-405 are used to evaluate the impact on other intersecting freeways. In this set of experiments, only the prediction accuracy for the traffic speed in the first half-hour after each incident's occurrence time is evaluated. Figure 6.10 shows the result of the average prediction accuracy. As shown on the x-axis, we group the incidents' occurrence times into four time intervals, with 6AM-10AM and 2PM-7PM reflecting morning and afternoon rush hours, respectively.
[Figure 6.10: Overall prediction accuracy on impact of freeway incidents. (a) Arterial traffic. (b) Intersecting freeway traffic. Both panels plot RMSE for Baseline, SIAR and SIAR+Lasso over the incident occurrence intervals 6AM-10AM, 10AM-2PM, 2PM-7PM and 7PM-9PM.]

As shown in Figure 6.10(a), when predicting the impact on arterial streets, our SIAR approach slightly outperforms the baseline approach, and the SIAR+lasso approach yields the best prediction accuracy (i.e., the lowest RMSE) in all time intervals. Regarding the impact on intersecting freeways, as shown in Figure 6.10(b), the improvement of our approach over the baseline is more obvious. This is because, for the impact of freeway incidents, the scale of the speed decrease on freeways is always more significant than that on the arterial streets. Therefore, by considering the slowdown causality and intervention causality, our approaches can better predict the speed changes than the baseline approach. Moreover, by utilizing the lasso approach for important lag selection, the prediction accuracy is slightly improved compared with the approach based solely on SIAR. Another observation is that the prediction accuracy during rush hour is generally lower than that obtained during non-rush hour. Because the traffic on the road is always congested during rush hours, it is difficult to distinguish the impact caused by incidents from the regular rush-hour congestion.

6.4.4 Result on Travel-time Calculation

In this section, we further evaluate our prediction approaches using a real-world incident. As illustrated in Figure 6.11(a), a traffic collision incident was reported at 6:47PM on September 16th, 2013. Regarding this incident, we consider a routing plan that starts at A at 7PM and passes through the incident location to B, as illustrated in Figure 6.11(a).
Figures 6.11(b) and (c) show the predicted traffic speed for sensors located at location A and the estimated travel time, respectively. To estimate the travel time for the routing plan, we utilize two strategies: one is based on the impact prediction approach proposed in [39], which only focuses on the impact on the freeway on which the incident occurred, and the other is based on both [39] and our prediction of traffic speed on an intersecting freeway. The actual travel time calculated from sensor speeds for this routing plan is considered the ground truth.

[Figure 6.11: Case study in travel time calculation. (a) Planned route and incident location. (b) Prediction of traffic speed at A: speed (mph) over time t for the regular Monday pattern, the truth on 09/16/2013 and the SIAR+lasso prediction. (c) Travel time estimation: ground truth 9 min; Use [1] 7 min; Use [1] + SIAR 10 min.]

According to the result shown in Figure 6.11(c), by combining our approach and the approach proposed in [39], we can provide a more comprehensive prediction of the incident's impact, both on the freeway of incident occurrence and on the intersecting freeway. This combination therefore yields the most accurate travel time estimation. Such travel time estimations can further assist next-generation navigation systems in more efficient routing.

Chapter 7 Conclusion and Future work

By utilizing real-world transportation-related datasets, this thesis addressed several fundamental problems related to traffic incidents, which contribute to approximately 50% of traffic congestion. These problems include predicting the traffic in the presence of traffic accidents, predicting the dynamic evolution of the impact of incidents, mining social media and GPS trajectories to better understand the cause of traffic incidents, and analyzing the causality relationship for predicting the impact of traffic incidents on road networks.
The main contribution of each chapter is summarized as follows:

Chapter 3 discusses a prediction approach for the impact of traffic incidents based on their attributes. Built on this approach, we further proposed H-ARIMA+, which predicts traffic speeds based on the historical spatial span and speed changes during the presence of traffic incidents. In experiments using real-world traffic sensor data collected in the Los Angeles area, our approach improves on traditional time-series prediction approaches (such as ARIMA and Neural Net) by up to 78% and 91% during the presence of rush hour and traffic incidents, respectively.

In Chapter 4, we improve the impact modeling strategy of traffic incidents by considering the dynamic evolution of congested areas, in contrast to the static impact modeling strategy discussed in Chapter 3. As a result, we propose the concepts of propagation behavior and clearance behavior to characterize the changes of the congested spatial span of a traffic incident over time. Moreover, to predict such behavior, we propose a prediction approach based on the attributes of traffic incidents, such as traffic density and initial propagation behavior (PADI). According to the evaluation results based on real-world traffic sensor data and incident reports, our PADI approach improves on the baseline approach (proposed in Chapter 3) by up to 45%.

In addition to the traffic sensor data, we investigate the impact of traffic incidents through human mobility data (e.g., GPS trajectories) and social media in Chapter 5. In this study, we first identified the traffic incidents by mining, from GPS trajectories, large amounts of detour behaviors taken instinctively by drivers to avoid the traffic incidents. Then, we analyzed the impact of the traffic incidents through the travel-time delay collected from GPS data.
Subsequently, we proposed an approach to mine social media for terms that are both geographically and temporally constrained to the sub-graph and are correlated with the detected traffic incidents. We evaluated our system using a GPS trajectory dataset generated by over 30,000 taxis over 3 months in Beijing. We examined the effectiveness and efficiency of our system and compared our approach with a baseline method using traffic volumes.
Chapter 6 discusses two types of causality relationships among traffic time series collected from different locations on the road network during specific time intervals. Based on these relationships, we propose an impact prediction approach (SIAR) that utilizes the causalities observed at the beginning of a traffic incident to predict the range of the impact and the scale of the speed changes. Moreover, we employ the lasso approach to help SIAR identify the congestion starting time for each impacted location. As a result, we predict a more comprehensive impact region of traffic incidents (i.e., on the entire road network) compared with the impact region predicted in Chapters 3 and 4 (i.e., the upstream stretch of the freeway on which the incident occurred).
These achievements have important academic value and extensive applications. The studies discussed in this thesis have been published in, or are under submission to, top-tier academic conferences such as the International Conference on Data Mining (ICDM) and ACM SIGSPATIAL. This work allows transportation and data mining researchers to gain more insight into traffic and incident datasets. It can also enable transportation agencies and the government to obtain a detailed understanding of the impact of traffic incidents and to make effective plans or policies to reduce the congestion caused by traffic incidents. The most important application is next-generation navigation systems for individual drivers.
As shown in the experiment results, next-generation navigation applications built based on the approaches proposed in this thesis can save individual drivers a significant amount of travel time in the presence of traffic incidents.
In the future, we plan to extend this study by investigating the effects of different incident types on our impact prediction model and examining the effectiveness of our models in different datasets depending on the season. In addition, we plan to investigate effective methods for updating our models online using streaming real-time data. Finally, we will focus on predicting the impact of more complex traffic relevant events, such as large-scale parades or sporting events.
Bibliography
[1] Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute, 2011.
[2] Texas transportation institute (tti), annual urban mobility report and appendices. 2012.
[3] M. M. Ageli and A. M. Zaidan. Road traffic accidents in saudi arabia: An ardl approach and multivariate granger causality. International Journal of Economics and Finance, 5:26–31, 2013.
[4] H. Al-Deek, A. Garib, and A. E. Radwan. New method for estimating freeway incident congestion. Transportation Research Board 1494, pages 30–39, 1995.
[5] J. Aldrich. Correlations genuine and spurious in pearson and yule. Statistical Science, 10, 1989.
[6] J. R. Ameena and J. A. Najib. Causal models for road accident fatalities in yemen. Accident Analysis and Prevention, 33:547–561, 2001.
[7] A. Arnold, Y. Liu, and N. Abe. Temporal causal modeling with graphical granger methods. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’07, pages 66–75, New York, NY, USA, 2007. ACM.
[8] M. Baykal-gursoy, W. Xiao, and K. Ozbay. Modeling traffic flow interrupted by incidents. In European Journal of Operational Research, 2009.
[9] M. Ben-akiva, M. Bierlaire, H. Koutsopoulos, and R. Mishalani. DynaMIT: a simulation-based system for traffic prediction. In DACCORD’98, Delft, The Netherlands.
[10] G. Box and G. Jenkins. Time series analysis: Forecasting and control. San Francisco: Holden-Day, 1970.
[11] S. Boyles, D. Fajardo, and S. T. Waller. Naive bayesian classifier for incident duration prediction.
[12] L. Byron and M. Wattenberg. Stacked graphs–geometry & aesthetics. IEEE Transactions on Visualization and Computer Graphics, 2008.
[13] S. Chawla, Y. Zheng, and J. Hu. Inferring the root cause in road traffic anomalies. In ICDM ’12.
[14] H. Cheng, P.-N. Tan, J. Gao, and J. Scripps. Multistep-ahead time series prediction. In PAKDD’06.
[15] S. Clark. Traffic prediction using multivariate nonparametric regression. In JTE’03, volume 129.
[16] R. De Maesschalck, D. Jouan-Rimbaud, and Massart. The mahalanobis distance. In Chemometrics and Intelligent Laboratory Systems 50, pages 1–18, 2000.
[17] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Statistics, 32:407–499.
[18] A. Garib, A. E. Radwan, and H. Al-Deek. Estimating magnitude and duration of incident delays. Journal of Transportation Engineering, 123(6):459–466, Nov. 1997.
[19] Y. Ge, H. Xiong, C. Liu, and Z.-H. Zhou. A taxi driving fraud detection system. In ICDM ’11.
[20] J. D. Gehrke and J. Wojtusiak. A natural induction approach to traffic prediction for autonomous agent-based vehicle route planning. MLI 08-1, George Mason University, 2008.
[21] G. Giuliano. Incident characteristics, frequency, and duration on a high volume urban freeway. Transportation Research Part A: General, 23(5):387–396, Sept. 1989.
[22] T. F. Golob, W. W. Recker, and J. D. Leonard. An analysis of the severity and incident duration of truck-involved freeway accidents. Accident Analysis and Prevention, 19(5):375–395, Oct. 1987.
[23] C. W. J. Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37:424–438, Aug. 1969.
[24] M. A. Hall and L. A. Smith. Practical feature subset selection for machine learning. In ACSC98, Perth, Berlin: Springer., pages 181–191, 1998.
[25] S. Ishak and C. Alecsandru. Optimizing traffic prediction performance of neural networks under various topological, input, and traffic condition settings. In JTE’04, volume 130.
[26] A. J. Khattak, J. L. Schofer, and M.-h. Wang. A simple time sequential procedure for predicting freeway incident duration. IVHS Journal, 2(2), Jan. 1994.
[27] W. Kim, S. Natarajan, and G.-L. Chang. Empirical analysis and modeling of freeway incident duration. In 11th International IEEE Conference on Intelligent Transportation Systems, 2008. ITSC 2008, pages 453–457, 2008.
[28] J. Kwon, M. Mauch, and P. P. Varaiya. Components of congestion: delay from incidents, special events, lane closures, weather, potential ramp metering gain, and excess demand. In TRR’06, pages 84–91.
[29] X. Li, Z. Li, J. Han, and J. Lee. Temporal outlier detection in vehicle traffic data. In ICDE ’09.
[30] M. J. Lighthill and G. B. Whitham. On kinematic waves: II. a theory of traffic flow on long crowded roads. In Proc. Roy. Soc. A 229, pages 317–345, 1955.
[31] C. X. Lin, B. Zhao, Q. Mei, and J. Han. PET: a statistical model for popular events tracking in social communities. In KDD ’10.
[32] W. Liu, Y. Zheng, S. Chawla, J. Yuan, and X. Xing. Discovering spatio-temporal causal interactions in traffic data streams. In KDD ’11.
[33] P. C. Mahalanobis. On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India 2, 1:49–55, 1936.
[34] F. L. Mannering and S. S. Washburn. Fundamentals of traffic flow and queuing theory. Principles of Highway Engineering and Traffic Analysis, Chapter 5, 2012.
[35] R. S. Marshment, R. C. Dauffenbach, and D. A. Penn. Short-range intercity traffic forecasting using econometric techniques. In ITE Journal, volume 66, 1996.
[36] M. Miller and C. Gupta. Mining traffic incidents to forecast impact. In UrbComp ’12.
[37] N. L. Nihan and J. N. Zhu. Short-term forecasts of freeway traffic volumes and lane occupancies, phase 1. In TNW’93, volume 5.
[38] K. Ozbay and P. Kachroo. Incident management in intelligent transportation systems. Artech House, 1999.
[39] B. Pan, U. Demiryurek, C. Gupta, and C. Shahabi. Forecasting spatiotemporal impact of traffic incidents on road networks. In ICDM’13.
[40] B. Pan, U. Demiryurek, and C. Shahabi. Utilizing real-world transportation data for accurate traffic prediction. In ICDM’12.
[41] B. Pan, Y. Zheng, D. Wilkie, and C. Shahabi. Crowd sensing of traffic anomalies based on human mobility and social media. In ACM SIGSPATIAL ’13.
[42] B. Park, C. J. Messer, and T. I. Urbanik. Short-term freeway traffic volume forecasting using radial basis function neural network. Number 1651.
[43] C. M. Queen and C. J. Albers. Intervention and causality: Forecasting traffic flows using a dynamic bayesian network. Journal of the American Statistical Association, 104, 2009.
[44] J. Ramos. Using TF-IDF to Determine Word Relevance in Document Queries. Technical report, Department of Computer Science, Rutgers University, 2003.
[45] F. M. Report. http://www.metro.net/board/Items/2012/03March/ 20120322RB- MItem57.pdf. Last visited Feb 14, 2013.
[46] RIITS. http://www.riits.net/. Last visited December 25, 2011.
[47] P. J. Rousseeuw. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65, 1987.
[48] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In WWW ’10.
[49] H. Sayyadi, M. Hurst, and A. Maykov. Event detection and tracking in social streams. In ICWSM ’09. AAAI, 2009.
[50] SIGALERT. http://www.sigalert.com. Last visited May, 2013.
[51] K. W. Smith and B. L. Smith. Forecasting the clearance time of freeway accidents. In Center Transp. Studies, Univ. Virginia, Charlottesville, VA, Rep. STL-2001-01, 2001.
[52] V. J. Stephanedes, P. G. Michalopoulos, and R. A. Plum. Improved estimation of traffic flow for real-time control discussion and closure. In TRR’81, number 795.
[53] A. Stuart, K. Ord, and S. Arnold. The advanced theory of statistics, vol 2: Inference and relationship. 1973.
[54] E. C. Sullivan. New model for predicting freeway incidents and incident delays. Journal of Transportation Engineering, 123(4):267–275, July 1997.
[55] D. L. Thornton and D. S. Batten. Lag-length selection and tests of granger causality between money and income. Journal of Money, Credit and Banking, 17, 1985.
[56] J. van Lint, S. Hoogendoorn, and H. van Zuylen. Freeway travel time prediction with State-Space neural networks. In TRR’02, volume 1811.
[57] K. Verbeek, K. Buchin, and B. Speckmann. Flow map layout via spiral trees. IEEE Transactions on Visualization and Computer Graphics, 2011.
[58] Y. Wang, R. Yu, Y. Lao, and T. Thomson. Quantifying incident-induced travel delays on freeways using traffic sensor data: Phase ii. In Research Report in Transportation Northwest Regional Center X (TransNow) and Washington State Transportation Center, 2011.
[59] WAZE. http://www.waze.com/. Last visited Feb 25, 2013.
[60] L.-Y. Wei, Y. Zheng, and W.-C. Peng. Constructing popular routes from uncertain trajectories. In KDD ’12.
[61] D. Wilkie, J. Sewall, and M. Lin. Transforming gis data into functional road models for large-scale traffic simulation. IEEE Transactions on Visualization and Computer Graphics, 2012.
[62] B. Williams, P. Durvasula, and D. Brown. Urban freeway traffic flow prediction: Application of seasonal autoregressive integrated moving average and exponential smoothing models. In TRR’98, volume 1644.
[63] S. C. Wirasinghe. Determination of traffic delays from shock-wave analysis. Transportation Research, pages 343–348, 1978.
[64] H. Xiao, H. Sun, B. Ran, and Y. Oh. Fuzzy-Neural network traffic prediction framework with wavelet decomposition. In TRR’03, volume 1836.
[65] J. Yuan, Y. Zheng, X. Xie, and G. Sun. Driving with knowledge from the physical world. In SIGKDD’11.
[66] J. Yuan, Y. Zheng, C. Zhang, X. Xie, and G.-Z. Sun. An interactive-voting based map matching algorithm. In MDM ’10.
[67] D. Zhang, N. Li, Z.-H. Zhou, C. Chen, L. Sun, and S. Li. iBAT: detecting anomalous taxi trajectories from GPS traces. In UbiComp ’11.
[68] J. Zhang. Smarter outlier detection and deeper understanding of large-scale taxi trip records: a case study of NYC. In SIGKDD ’12 Workshop on Urban Computing.
Abstract
For the first time, real-time high-fidelity spatiotemporal data on the transportation networks of major cities have become available. This gold mine of data can be utilized to learn about the behavior of traffic congestion at different times and locations, potentially resulting in major savings in time and fuel, the two important commodities of the 21st century. Therefore, how to mine valuable information from these data to enable next-generation technologies for unprecedented convenience has become a key topic in spatiotemporal data mining. By utilizing real-world transportation-related datasets, this thesis focuses on addressing the problems related to the impact of traffic incidents. Traffic incidents refer to non-recurring issues that occur in the road network, such as traffic accidents, weather hazards, special events, and construction zone closures, which contribute to approximately 50% of traffic congestion.
First, this thesis addresses the fundamental problem of traffic prediction in the presence of traffic incidents by utilizing traffic sensor data and incident reports collected from Los Angeles County road networks. The proposed prediction method overcomes the deficiency of traditional time-series prediction techniques by considering the unique characteristics of traffic speed time series. Second, using the same dataset, this thesis proposes a set of methods to predict the dynamic evolution of the impact of incidents. Using the traffic data surrounding traffic incidents, this thesis models the propagation behavior of congestion caused by archived incidents and develops a set of clustering-based techniques for predicting similar behavior in the future. Third, in addition to sensor data, this thesis mines social media and GPS trajectories to obtain a better understanding of the cause of traffic incidents. Specifically, by identifying unusual traveling behaviors and Twitter-like posts in data collected in Beijing, this work detects traffic incidents and analyzes their impact. Finally, this thesis analyzes the causality relationship between freeway traffic and arterial traffic to provide a comprehensive prediction of the impact that incidents have on both freeways and arterial streets. As a result, next-generation navigation applications that are built based on the approaches discussed in this thesis can help drivers effectively avoid the impacted area in real time and thereby save them a considerable amount of travel time.
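To make the travel-time claim concrete, the following is a minimal sketch, not taken from the thesis, of how predicted segment speeds could feed a route choice. It assumes that a route is a list of (length in miles, predicted speed in mph) segments and that travel time is the sum of length divided by speed; the function names, route names, and all numbers are hypothetical.

```python
def travel_time_minutes(segments):
    """Sum per-segment travel time; each segment is (length_miles, speed_mph)."""
    total_hours = 0.0
    for length_miles, speed_mph in segments:
        if speed_mph <= 0:
            raise ValueError("speed must be positive")
        total_hours += length_miles / speed_mph
    return total_hours * 60.0


def faster_route(routes):
    """Return the name of the route with the smallest predicted travel time."""
    return min(routes, key=lambda name: travel_time_minutes(routes[name]))


# Hypothetical predicted speeds shortly after an incident on the freeway.
routes = {
    # Congested stretch upstream of the incident, then free flow.
    "freeway": [(3.0, 20.0), (2.0, 60.0)],
    # Longer detour on streets assumed to be unaffected by the incident.
    "arterial": [(2.5, 30.0), (3.0, 36.0)],
}
best = faster_route(routes)
```

In this sketch the quality of the route choice depends entirely on the predicted speeds, which is why a prediction that also covers intersecting or arterial roads can change which route is selected.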
Asset Metadata
Creator
Pan, Bei (Penny) (author)
Core Title
Utilizing real-world traffic data to forecast the impact of traffic incidents
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Publication Date
02/18/2016
Defense Date
05/07/2014
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
impact analysis, OAI-PMH Harvest, prediction, traffic incident, traffic sensor data
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Shahabi, Cyrus (committee chair), Giuliano, Genevieve (committee member), Knoblock, Craig (committee member)
Creator Email
beipan@usc.edu,penny.maple@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-460900
Unique identifier
UC11287175
Identifier
etd-PanBeiPenn-2828.pdf (filename),usctheses-c3-460900 (legacy record id)
Legacy Identifier
etd-PanBeiPenn-2828.pdf
Dmrecord
460900
Document Type
Dissertation
Rights
Pan, Bei (Penny)
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA