 Original Paper
 Open Access
A comparative study of models for the incident duration prediction
 Gaetano Valenti^{1},
 Maria Lelli^{1} and
 Domenico Cucina^{2}
https://doi.org/10.1007/s12544-010-0031-4
© The Author(s) 2010
 Received: 3 September 2008
 Accepted: 12 April 2010
 Published: 7 May 2010
Abstract
Purpose
This study investigates the reliability of different incident duration prediction models for real-time application, with a view to contributing to the development of a decision aid tool within the incident management process, where rough incident duration estimates are currently provided by traffic operators or police on the basis of their skill and past experience.
Methods
Five predictive models, ranging from parametric models to nonparametric and neural network models, have been considered and compared in terms of their capacity to predict incident duration. The data set used in this study for developing and testing the prediction models includes 237 incident events and contains information about the incident characteristics, the personnel and equipment involved in clearing the incident and the related response times, including the beginning and ending time of the incident.
Results
Testing results have demonstrated that the proposed models are able to achieve good performance in terms of prediction accuracy especially for incidents with duration less than 90 min. This finding is partly due to the fact that the dataset has a relatively small number of severe incidents. Furthermore a linear combination of predictions from models was applied with negligible gain in accuracy.
Conclusions
A deeper investigation is suggested for future work to evaluate potential improvements from the application of other combination methods. Moreover, each proposed model reaches its best performance for incidents within a particular duration range. Thus a preliminary incident classification scheme could be useful for selecting the most appropriate prediction model.
Keywords
 Incident duration
 Duration prediction model
 Statistical models
 Regression models
 Discriminant analysis
1 Introduction
When an incident occurs, the timely estimate of its duration assumes a key role in the overall incident management process. Specifically reliable incident duration predictions can help traffic managers in providing correct and essential information to road users, applying appropriate traffic control measures at or near the incident location and evaluating the effectiveness of the incident management strategies implemented.
In current practices, rough incident duration estimates are provided by traffic operators or police on the basis of experience and the known characteristics of the incidents such as the nature of the incident, the occurrence of injuries and fatalities, as well as the type and number of vehicles involved. The reliability of these practices is still unknown and largely depends upon the skill and experience of the operator.
Building on the existing scientific literature, this study develops and compares the effectiveness of different prediction models suitable for estimating incident duration in a real-time environment. The proposed prediction models incorporate variables that have the greatest influence on incident duration and that can be practically obtained in real time as soon as the incident is detected and verified.
The incident data used in this study for developing and testing the prediction models have been supplied by the “Fiano” Trunk Management Centre of Autostrade per l’Italia Spa which is the biggest Italian motorway company.
These data, usually obtained from the incident scene and manually logged by the TMC operators in a database, contain information about the incident characteristics, the personnel and equipment involved to clear the incident and the related response times, including the beginning and ending time of the incident.
First, a statistical analysis of these incident data was conducted to investigate the factors that influence incident duration and to identify which variables are important for the prediction process. Both ANOVA and Kruskal-Wallis analyses were performed to measure and test the statistical significance of differences in incident duration for each of the explanatory variables.
Then, different predictive models, ranging from parametric (polynomial-type) models to nonparametric and neural network models, were considered and compared in terms of their capacity to predict the testing data.
This paper is organized as follows. A review of previous studies on incident duration prediction, aimed at gaining insight into the strengths and weaknesses of the many methods developed so far, is presented in the next section. This is followed by the exploratory analysis of incident data collected by “Autostrade per l’Italia Spa” to identify critical variables associated with incident duration. Next, the construction and testing of five incident duration prediction models are reported, namely: Multiple Linear Regression (MLR), Prediction/Decision Tree (DT), Artificial Neural Network (ANN), Support/Relevance Vector Machine (RVM) and K-Nearest-Neighbour (KNN). Finally, some practical conclusions are drawn from the comparison of their prediction performance in the various incident situations.
2 Previous studies on incident duration prediction
Over the past two decades a number of studies have been undertaken to investigate the feasibility of estimating incident duration. Various approaches, ranging from statistical modeling methods, to machine learning methods like neural networks, have been applied. However, a direct comparison of the results of these studies is quite difficult since datasets, used to build and validate the various models, exhibit different characteristics, reflecting local variations in data collection and reporting practices.
The purpose of developing incident duration models is to determine the relationships between incident duration and influencing variables. Previous studies reported similar sets of variables affecting incident duration, such as the incident type and severity, the number and type of vehicles involved, the geometric characteristics, the time of day and the emergency equipment (ambulances, tow trucks, etc.) dispatched.
Golob et al. (1987) [1] analyzed over 9,000 truck-involved accidents that occurred during a 2-year period on freeways in the greater Los Angeles area. Statistical models that relate incident duration to collision type, accident severity and lane closures were developed. The durations of incidents were found to be lognormally distributed for homogeneous groups of truck accidents, categorized according to the type of collision and, in some instances, the severity.
Also Giuliano (1989) [2] aggregated incidents into broad categories and estimated models as a function of incident characteristics for each category.
Jones et al. (1991) [3] introduced the important concept of conditional probability; that is, given that the incident has lasted X minutes, the probability that it will end in the Y-th minute. The authors analyzed 2,156 incidents in the metropolitan Seattle area and found that the duration of incidents is approximated by a log-logistic instead of a lognormal distribution.
Ozbay and Kachroo (1999) [4] focused on incidents having a major impact on traffic and proposed the use of decision trees with the first split at the root node on the “incident type” variable. In this study a normal distribution of duration for homogeneous subsets of incidents (in terms of incident type and severity) was found.
Nam and Mannering (2000) [5] applied hazard-based duration models to statistically evaluate the time it takes to detect/report, respond to, and clear incidents. The model estimation results showed that a wide variety of factors significantly affect incident times, and that different distributional assumptions for the hazard function are appropriate for the different incident times being considered.
Smith and Smith (2001) [6] proposed and applied nonparametric regression and classification trees as models to predict incident clearance time.
Lin et al. (2004) [7] presented a system that integrates the discrete choice model with a rule-based supplemental module for estimating the duration of a detected incident. The primary function of the embedded discrete model is to estimate those incidents having durations less than 60 min. For severe incidents that may last more than 1 h, the system uses a rule-based supplemental module.
Wang et al. (2005) [8] developed two models to predict the vehicle breakdown duration: one based on fuzzy logic (FL) and the other on artificial neural networks (ANN). The study demonstrated that FL and ANN can provide reasonable estimates for the breakdown duration with few variables. However, both models had difficulties in predicting the outliers.
Ozbay and Noyan (2006) [9] used Bayesian Networks (BNs) as a knowledge discovery process to predict incident duration accurately. The research showed that BNs offer an effective way to represent the stochastic nature of incidents.
On the basis of these previous studies (see also [10, 11]), it can be concluded that each method seems to have its own strengths and weaknesses, thus no single method is expected to be the best method under all circumstances. If the full incident duration prediction horizon is to be covered, a combination of methods seems to be the best option. This view motivates the focus of this study on comparing different incident duration prediction methods.
3 Data description
The data used in this study are from the Incidents Database of “Autostrade per l’Italia Spa”, for two motorway sections with two and three lanes per direction, respectively. They cover 3 months of 2005 (January, April and August), for a total of 237 incident events.
These data are normally used for monitoring incident management operations and are related to every event disrupting the regular traffic flow on the infrastructure by obstructing part of the road.
All the records of the database contain at least: 1) the starting and the ending time/date of the incident, 2) the type of the incident (crash, disabled vehicle, vehicle fire, obstacles on the road), 3) the location and the detection source.
 1. incident attributes (number of personal injuries/fatalities, number/type of vehicles involved, weather conditions, occurrence of events connected to the incident, such as cargo spill)
 2. operational details (presence/number of emergency medical services, presence/number of special rescue vehicles...)
 3. variables describing the state of the infrastructure and of the traffic (number/type of lanes closed, queues...)
The Chi-Square statistical test did not reject the hypothesis that incident duration follows a lognormal distribution (p-value = 0.053), in line with the finding of Giuliano [2].
Analysis of variance (ANOVA) was applied to determine which variables are statistically relevant for estimating incident duration. Moreover, the nonparametric Kruskal-Wallis test was performed when the two assumptions required by ANOVA, normality and homogeneity of variances, were not met.
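As an illustration of the nonparametric test mentioned above, the following minimal sketch computes the Kruskal-Wallis H statistic for the durations of two groups of incidents (for example, with and without a given yes/no attribute). It is not the procedure or software used in the study: the data are hypothetical, the tie correction is omitted, and the p-value uses the large-sample chi-square approximation with one degree of freedom, which applies to two groups only.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1), without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

def p_value_two_groups(h):
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(h / 2.0))

# Hypothetical durations (min) without / with heavy duty vehicles involved.
without_hdv = [10.0, 20.0, 30.0]
with_hdv = [40.0, 50.0, 60.0]
h = kruskal_wallis_h([without_hdv, with_hdv])
p = p_value_two_groups(h)
```

With samples this small, exact tables would be preferable to the chi-square approximation; the sketch is only meant to show how the rank sums enter the statistic.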
Table 1 Possible independent variables found significant by ANOVA

Variable group                        Variable                                                     Value
------------------------------------  -----------------------------------------------------------  ------------------------
Incident type                         Injuries/fatalities                                          1 = Yes; 0 = No
Incident type                         Property damage (involving damage to the vehicle)            1 = Yes; 0 = No
Incident type                         Disabled vehicle                                             1 = Yes; 0 = No
Incident type                         Vehicle fire                                                 1 = Yes; 0 = No
Incident details                      Heavy duty vehicles involved                                 1 = Yes; 0 = No
Incident details                      Peak hour                                                    1 = Yes; 0 = No
Incident details                      Infrastructure damage                                        1 = Yes; 0 = No
Operational details                   Response service requested                                   1 = Yes; 0 = No
Operational details                   Emergency Medical Services at the scene                      1 = Yes; 0 = No
Operational details                   Special agencies (heavy tow truck, HAZMAT clearance agency)  1 = Yes; 0 = No
Infrastructure and traffic variables  Number of lanes                                              1 = 3 lanes; 0 = 2 lanes
Infrastructure and traffic variables  Shoulder lane occupied                                       1 = Yes; 0 = No
Infrastructure and traffic variables  Slow lane occupied                                           1 = Yes; 0 = No
4 Models to predict incident duration
 1) Multiple linear regression (MLR);
 2) Prediction/Decision tree (DT);
 3) Artificial Neural Network (ANN);
 4) Support/Relevance Vector Machine (RVM);
 5) K-Nearest-Neighbour (KNN).
For assessing the predictive ability of these models, the incident data set was split into training and testing partitions with statistical properties similar to those represented in the original dataset. Specifically, 187 incident cases were included in the training partition for the model construction process, whereas 50 incident cases were used to evaluate the accuracy of the proposed models.
Moreover, four incident duration classes were used to estimate and compare the models’ performance at the different duration horizons according to the incident severity: short (<30 min), medium (31–60 min), medium-long (61–90 min) and long (>90 min).
For investigating the accuracy of the proposed models the Mean Absolute Error (MAE), the Root Mean Squared Error (RMSE) and the Mean Absolute Percentage Error (MAPE) were adopted. The MAE quantifies the average magnitude of the errors, the RMSE diagnoses their variation and the MAPE weights them in relation to the actual value amount.
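The three accuracy measures can be written down in a few lines. The sketch below uses their usual textbook definitions and hypothetical durations in minutes; none of the values come from the study's data set.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: large errors weigh more than small ones."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical incident durations and predictions (min).
actual = [20.0, 40.0, 80.0, 100.0]
predicted = [25.0, 30.0, 85.0, 60.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

In this toy example a single 40-min miss pushes the RMSE well above the MAE, while the MAPE is driven mostly by errors on the shorter incidents, which is why the three measures can rank models differently.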
4.1 Multiple linear regression
Multiple linear regression attempts to model the relationship between two or more independent or explanatory variables (X_{1}, X_{2}, ..., X_{p}) and a dependent variable (Y) by fitting a linear equation to observed data ([12, 13]).
In this study linear regression with the log_{10} of the incident duration as the dependent variable was used in order to meet the normal distribution assumption required by the MLR method. The skewness coefficient for the log_{10} distribution was equal to −0.24.
Table 2 Coefficients of the MLR model

Symbol  Variable                                 Coefficient (b_{i})  Std. error  t statistic  p-value
------  ---------------------------------------  -------------------  ----------  -----------  -------
a       Constant term                            1.657                0.092       17.924       0.000
X_{1}   Emergency Medical Services at the scene  0.222                0.040       5.617        0.000
X_{2}   Heavy duty vehicles involved             0.147                0.039       3.820        0.000
X_{3}   Peak hour                                0.156                0.035       4.488        0.000
X_{4}   Infrastructure damage                    0.204                0.078       2.619        0.010
X_{5}   Number of lanes                          −0.092               0.035       −2.657       0.009
X_{6}   Vehicle fire incident                    −0.276               0.088       −3.148       0.002
All coefficients are statistically significant at the 95% level; however, the explanatory power of the model is rather poor, as indicated by R^{2} = 0.32. Furthermore, the F ratio is equal to 7.545, with a p-value that is zero to the reported precision. The addition of other variables does not significantly improve the accuracy of the predictions. The MAE value is 17 min.
The MAE value achieved by the MLR model is comparable to the MAE obtained by Ozbay and Kachroo [4] with DT models, and better than the one obtained by Smith and Smith [6] using KNN models.
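To make the fitted equation concrete, the sketch below evaluates log_{10}(duration) = a + Σ b_{i} X_{i} with the coefficients reported in the table above and converts the result back to minutes. The dictionary keys are illustrative names chosen here, not identifiers from the study, and the simple back-transformation 10^ŷ is an assumption: it ignores any retransformation bias correction.

```python
# Coefficients as reported for the MLR model (dependent variable: log10 of duration in min).
INTERCEPT = 1.657
COEFFICIENTS = {
    "ems_at_scene": 0.222,           # X1: Emergency Medical Services at the scene
    "heavy_vehicles": 0.147,         # X2: Heavy duty vehicles involved
    "peak_hour": 0.156,              # X3: Peak hour
    "infrastructure_damage": 0.204,  # X4: Infrastructure damage
    "three_lanes": -0.092,           # X5: Number of lanes (1 = 3 lanes; 0 = 2 lanes)
    "vehicle_fire": -0.276,          # X6: Vehicle fire incident
}

def predict_duration_minutes(incident):
    """Return 10 ** (a + sum(b_i * x_i)); indicators missing from `incident` count as 0."""
    log10_duration = INTERCEPT + sum(
        coef * incident.get(name, 0) for name, coef in COEFFICIENTS.items()
    )
    return 10 ** log10_duration

baseline = predict_duration_minutes({})
with_ems_peak = predict_duration_minutes({"ems_at_scene": 1, "peak_hour": 1})
fire_only = predict_duration_minutes({"vehicle_fire": 1})
print(round(baseline, 1), round(with_ems_peak, 1), round(fire_only, 1))
```

Because the model is linear in the logarithm, each indicator multiplies the predicted duration by a constant factor (10^{b_i}) rather than adding a fixed number of minutes.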
4.2 Prediction/Decision tree
Prediction/Decision trees can perform classification for predicting what group a case belongs to, as well as regression for predicting a specific value. DTs are nonparametric models as they make no assumption on the data distribution and, as a result, they may be applied in situations where little is known about the application in question.
As with all regression techniques, we assume the existence of a single output (response) variable and one or more input (predictor) variables. The model is called a decision tree because it is presented in the form of a tree structure or a set of logical “if-then” conditions (tree nodes). The visual presentation makes the decision tree model very easy to understand and assimilate.
A decision tree is built through an iterative process of splitting the data into partitions, and then splitting them further on each of the branches. The process continues until each node reaches a user-specified minimum node size and becomes a terminal node. The terminal nodes of the tree contain the predicted output variable values. The theoretical and computational details of decision tree models are provided in [14–18].
Validation test results showed that the developed DT model has satisfactory precision in predicting the duration of most incident cases. In particular, 37 of the 50 incident cases are predicted with an error of less than 20 min. The DT model performs best for incident cases with medium-long durations, where the MAPE and MAE values are equal to 18% and 12 min, respectively.
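The recursive splitting described above can be sketched for binary (0/1) predictors as follows. This is a generic regression tree that chooses the split minimising the squared error, which is an assumption made for illustration; it is not the CHAID procedure reported in the results tables, and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, min_node_size=2):
    """Recursively split on the 0/1 feature that most reduces the squared error."""
    best = None
    for feature in range(len(rows[0])):
        left = [t for r, t in zip(rows, targets) if r[feature] == 0]
        right = [t for r, t in zip(rows, targets) if r[feature] == 1]
        if len(left) < min_node_size or len(right) < min_node_size:
            continue  # a child would fall below the minimum node size
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, feature)
    if best is None or best[0] >= sse(targets):
        return {"prediction": mean(targets)}  # terminal node
    feature = best[1]
    split = {0: ([], []), 1: ([], [])}
    for r, t in zip(rows, targets):
        split[r[feature]][0].append(r)
        split[r[feature]][1].append(t)
    return {
        "feature": feature,
        "no": build_tree(split[0][0], split[0][1], min_node_size),
        "yes": build_tree(split[1][0], split[1][1], min_node_size),
    }

def predict(tree, row):
    """Follow the if-then conditions down to a terminal node."""
    while "prediction" not in tree:
        tree = tree["yes"] if row[tree["feature"]] == 1 else tree["no"]
    return tree["prediction"]

# Hypothetical toy data: columns are (EMS at scene, heavy vehicle involved).
rows = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
durations = [20, 24, 35, 39, 50, 54, 80, 84]
tree = build_tree(rows, durations)
```

Each terminal node simply stores the mean duration of the training incidents that reach it, so a prediction can be read off as a chain of yes/no questions.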
4.3 Artificial neural network
An Artificial Neural Network (ANN) model is a flexible mathematical structure capable of describing complex nonlinear relations between input and output datasets. ANNs have been successfully applied to prediction and pattern classification problems [19]. The architecture of ANN models is loosely based on the biological neural system. Although there are numerous types of ANNs, the most commonly used type is the Multi-Layer Perceptron (MLP). This is a feedforward, fully-connected hierarchical network typically comprising three types of neuron layers, each including one or several neurons: an input layer, one or more hidden layers and an output layer. The behaviour of a neural network is determined by the transfer functions of its neurons, by the learning rule and by the architecture itself ([20, 21]).
In this study, the number of neurons in the input layer is determined by the 13 most significant variables affecting incident duration, while the single neuron in the output layer represents the incident duration value being predicted. Moreover, various ANN architectures, with one or two hidden layers and different numbers of neurons in the hidden layers, were trained using the Levenberg-Marquardt backpropagation algorithm.
The best performing ANN architecture was obtained with a single hidden layer of 15 neurons employing tangent-sigmoid transfer functions.
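The forward pass of such a network (13 inputs, 15 hidden neurons, one output) can be sketched as below. The weights are random placeholders, since the trained values are not reported and Levenberg-Marquardt training is omitted; the tangent-sigmoid is taken to be the hyperbolic tangent, and a linear output neuron is assumed because the output transfer function is not stated.

```python
import math
import random

N_INPUTS, N_HIDDEN = 13, 15

def init_network(seed=0):
    """Random placeholder weights; a real model would obtain them by training."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                     for _ in range(N_HIDDEN)],
        "b_hidden": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
        "b_out": rng.uniform(-0.5, 0.5),
    }

def forward(network, inputs):
    """One forward pass: tanh hidden layer followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(network["w_hidden"], network["b_hidden"])
    ]
    return sum(w * h for w, h in zip(network["w_out"], hidden)) + network["b_out"]

network = init_network()
incident = [1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1]  # 13 hypothetical 0/1 indicators
output = forward(network, incident)
```

With untrained weights the output has no meaning as a duration; the sketch only shows how the 13 indicators flow through the 15 hidden neurons to a single predicted value.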
4.4 Support/Relevance vector machine
Support Vector Machines (SVMs) are supervised learning machines introduced in the 1990s in the framework of statistical learning theory, based on the Structural Risk Minimization (SRM) theory developed by Vapnik and Chervonenkis [22] to characterise the generalization properties of learning machines. SVMs are powerful tools for solving problems of classification, regression, pattern recognition and density estimation [23]; their output is a linear combination of kernel functions centred on a subset of the training data, the so-called support vectors.
In recent years many different SVM models have been developed, based on a variety of error functions, kernels or optimization techniques. In 2001 Tipping [24] elaborated a new support vector machine, called the Relevance Vector Machine (RVM), merging the Vapnik theory with Bayesian statistics. The RVM model is based on a hierarchical prior on the parameters of the kernel functions’ weights, which leads to model sparseness. As a consequence, the RVM can generalize well and can provide inferences at low computational cost, bypassing some SVM constraints.
In this study, different SVMs and RVMs were trained, varying kernel and error functions, with different sets of independent variables. Using the Cauchy kernel function and the training dataset composed of the 13 significant explanatory variables from ANOVA, the best performing RVM was obtained with 45 support vectors. This model gave the smallest MAE, of about 15 min.
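The prediction step shared by these kernel machines, a weighted sum of kernel functions centred on the retained training points, can be sketched as follows. The kernel shown is one common form of the Cauchy kernel, and the width `sigma`, the retained vectors, the weights and the bias are all hypothetical; in an RVM they would come from sparse Bayesian training, which is not shown.

```python
def cauchy_kernel(x, z, sigma=1.0):
    """One common form of the Cauchy kernel: 1 / (1 + ||x - z||^2 / sigma^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return 1.0 / (1.0 + squared_distance / sigma ** 2)

def kernel_predict(x, retained_vectors, weights, bias=0.0, sigma=1.0):
    """Weighted sum of kernels centred on the retained training points."""
    return bias + sum(
        w * cauchy_kernel(x, v, sigma) for v, w in zip(retained_vectors, weights)
    )

# Hypothetical retained vectors (0/1 incident indicators) and weights.
retained_vectors = [(0, 0, 1), (1, 1, 0)]
weights = [30.0, 60.0]
estimate = kernel_predict((1, 1, 0), retained_vectors, weights, bias=10.0)
```

Sparseness matters here because the cost of one prediction grows with the number of retained vectors, which is the practical meaning of inference at low computational cost.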
4.5 K-nearest neighbour
The nonparametric K-Nearest Neighbour (KNN) method offers an alternative to the traditional parametric regression models. With this method, the estimate/prediction for a current observation is based on weighting the contributions of the k nearest neighbours, so that nearer neighbours contribute more than farther ones.
The neighbourhood size is defined using independent variables which are known in both the past and current observations. In order to define the relative closeness of a given point, the form of the similarity (or distance) measure must be specified. Similarity measures based on absolute differences or Euclidean distance functions are typically applied.
In building the KNN model, the choice of k can strongly influence the quality of predictions: a small value of k leads to a large variance in predictions, whereas a large value may lead to a large model bias, since the k nearest neighbours then include more distant cases that are less representative of the case under examination. Thus, k should be large enough to minimize the estimation error and small enough (with respect to the number of cases) that the k nearest points are close to the query point.
In this study a distance metric based on the number of matching independent variables between past and current incidents was applied, since all the independent variables are binary (0/1) in form [6]. Furthermore, weight factors for each independent variable, given by the absolute difference between the average durations of the two related yes/no samples, were used to compute the KNN distance. K values up to 30 were tested and compared, using the MAE as the measure of effectiveness. The minimum MAE was obtained at K = 10.
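A minimal sketch of this weighted matching distance is given below. The per-variable weight follows the description above, whereas the inverse-distance weighting of the neighbours, 1 / (1 + d), is an assumption made here, since the exact weighting function is not specified; the past incidents are hypothetical.

```python
def feature_weights(rows, durations):
    """Weight of each 0/1 variable: |mean duration when 1 - mean duration when 0|."""
    weights = []
    for j in range(len(rows[0])):
        yes = [d for r, d in zip(rows, durations) if r[j] == 1]
        no = [d for r, d in zip(rows, durations) if r[j] == 0]
        if yes and no:
            weights.append(abs(sum(yes) / len(yes) - sum(no) / len(no)))
        else:
            weights.append(0.0)  # variable never varies, so it cannot discriminate
    return weights

def distance(a, b, weights):
    """Sum of the weights of the variables on which two incidents disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)

def knn_predict(query, rows, durations, k):
    """Inverse-distance-weighted mean duration of the k closest past incidents."""
    weights = feature_weights(rows, durations)
    nearest = sorted(
        (distance(query, r, weights), d) for r, d in zip(rows, durations)
    )[:k]
    contributions = [(1.0 / (1.0 + dist), d) for dist, d in nearest]
    total = sum(c for c, _ in contributions)
    return sum(c * d for c, d in contributions) / total

# Hypothetical past incidents: (EMS at scene, heavy vehicle involved, peak hour).
rows = [(0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 0)]
durations = [20.0, 30.0, 50.0, 80.0, 95.0, 40.0]
estimate = knn_predict((1, 1, 0), rows, durations, k=3)
```

Selecting k would then amount to repeating the prediction for each candidate value and keeping the one with the lowest MAE on held-out incidents.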
5 Comparisons and conclusions
This paper presents the findings of a study that appraises and compares five predictive modelling methods, ranging from parametric (polynomial-type) to nonparametric and neural network models, in order to provide a useful and reliable decision aid tool within the incident management process, where rough incident duration estimates are currently provided by traffic operators or police on the basis of their skill and past experience.
These models have been developed and tested using a common incident data set, including 237 incident events, for allowing a direct comparison of the models’ prediction ability in the various incident situations. The Mean Absolute Errors (MAE), the Root Mean Squared Errors (RMSE) and the Mean Absolute Percentage Errors (MAPE) were adopted to estimate the models’ accuracy.
The testing results, based on 50 incident events, have demonstrated that the proposed models are able to achieve good performance in terms of prediction accuracy for incidents with duration less than 90 min, matching what was obtained in past studies using similar prediction methods.
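Table 3 Worst prediction errors (prediction minus incident duration)

Incident type          Incident duration (min)  MLR error (min)  CHAID error (min)  ANN error (min)  RVM error (min)  KNN error (min)
---------------------  -----------------------  ---------------  -----------------  ---------------  ---------------  ---------------
HDV involved           13                       20.73            46.67              54.36            25.99            39.48
Obstacles on the road  111                      −86.96           −81.33             −100.38          −94.61           −85.88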
Table 4 Number of variables, MAE, RMSE and MAPE of the models

Measure       MLR    CHAID  ANN    RVM    KNN
------------  -----  -----  -----  -----  -----
N° variables  6      4      13     13     10
MAE (min)     15.17  16.66  16.02  13.65  15.41
RMSE (min)    20.04  23.07  19.80  17.29  20.29
MAPE          34%    43%    44%    36%    36%
Table 5 Distribution of predictions’ absolute errors for all models

Absolute error (min)  MLR    CHAID  ANN    RVM    KNN
--------------------  -----  -----  -----  -----  -----
<5                    19%    27%    13%    17%    19%
5–10                  29%    19%    29%    31%    23%
10–20                 21%    33%    25%    31%    35%
20–30                 23%    8%     21%    15%    15%
>30                   8%     15%    13%    6%     8%
According to the MAE and RMSE values in Table 4, the least reliable model is the decision tree model (CHAID), characterised by the greatest variance in the errors. The RVM is the most reliable model, achieving the smallest MAE and RMSE values, while the smallest MAPE value is achieved by the MLR, working with only six explanatory variables. The highest MAPE is produced by the ANN, with large errors for short duration cases, as shown in the previous section.
As listed in Table 5, 79% of RVM prediction errors are less than 20 min, while the CHAID model has the largest share of errors above 30 min. However, the CHAID model also exhibits the largest share of errors below 5 min, and this result is achieved with only four explanatory variables. The advantage of providing a ready-to-use, easy tool with a small number of variables makes the Decision Tree, together with the MLR, the methods most used for the incident duration prediction problem. Moreover, unlike “black box” methods such as ANN and RVM, a further advantage of these methods is their statistical approach, which allows a transparent and easy-to-understand explanation of their results.
A further step to enhance the prediction accuracy using all the proposed models is to combine their predictions, in order to exploit the fact that the models have strengths and weaknesses in different situations. To this end, the linear combinations proposed by Granger and Ramanathan [25] were applied. This yielded a negligible gain in prediction accuracy in terms of MAE (from 13.65 min with the RVM to 12.62 min with the combined predictions). However, a deeper investigation is suggested for future work to evaluate potential improvements from the application of other combination methods.
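A regression-based linear combination of this kind can be sketched by regressing the observed durations on the individual model predictions. The version below uses ordinary least squares with an intercept and unconstrained weights, which is one scheme in that family; whether this exact variant was the one applied in the study is an assumption, and the data are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

def fit_combination(actual, model_predictions):
    """Least-squares weights (intercept first) for combining several models."""
    design = [[1.0] + list(p) for p in zip(*model_predictions)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, actual)) for i in range(k)]
    return solve(xtx, xty)

def combine(weights, predictions):
    """Combined prediction for one incident from the individual model predictions."""
    return weights[0] + sum(w * p for w, p in zip(weights[1:], predictions))

# Hypothetical durations and predictions from two models (min).
actual = [20.0, 35.0, 50.0, 65.0, 90.0]
model_a = [25.0, 30.0, 55.0, 60.0, 80.0]
model_b = [15.0, 40.0, 45.0, 75.0, 85.0]
weights = fit_combination(actual, [model_a, model_b])
combined = [combine(weights, p) for p in zip(model_a, model_b)]
```

On the data used for fitting, the combined squared error can never exceed that of any single model, so a fair assessment of the gain requires incidents that were not used to estimate the weights.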

- the MLR is the best performing model for short duration incidents;
- the best predictions are achieved by the RVM model in the incident cases with medium/medium-long duration;
- the ANN is the only model that can predict an incident longer than 90 min. Moreover, the ANN model gives the best results for long duration incident cases, with the lowest MAPE, even if greater than 30%;
- all proposed models tend to have a relatively low accuracy for incidents with long duration, partly because the dataset has a relatively small number of severe incidents.
In conclusion, each proposed model reaches its best performance for incidents within a particular duration range, as if each had specialised skills in predicting incidents of a specific duration class. For this reason, a preliminary incident classification scheme would be useful for selecting the most appropriate prediction model. For example, a preliminary classification between two classes of duration (less or more than 30 min) can help to choose between the MLR for short duration incidents and the RVM for the others.
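The two-stage idea can be sketched as a simple dispatch: a preliminary classifier decides whether the incident is likely to be short, and the prediction is then delegated to the model suited to that class. Everything below is a hypothetical stand-in, since the study does not define such a classifier; the rule and the two stub predictors only illustrate the control flow.

```python
def predict_with_selection(incident, looks_short, short_model, other_model):
    """Route an incident to the model suited to its preliminary duration class."""
    if looks_short(incident):
        return short_model(incident)
    return other_model(incident)

# Hypothetical stand-ins: a real system would plug in a trained classifier
# (short vs. longer than 30 min) and the trained MLR and RVM models.
def looks_short(incident):
    return not incident.get("ems_at_scene", 0) and not incident.get("heavy_vehicles", 0)

def mlr_stub(incident):
    return 20.0  # placeholder prediction in minutes

def rvm_stub(incident):
    return 55.0  # placeholder prediction in minutes

minor = predict_with_selection({}, looks_short, mlr_stub, rvm_stub)
severe = predict_with_selection({"ems_at_scene": 1}, looks_short, mlr_stub, rvm_stub)
```

The overall accuracy of such a scheme depends on the classifier as much as on the two predictors, because a misclassified incident is handed to the model least suited to it.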
Finally the findings reached in this study have certainly demonstrated the validity of the RVM as prediction model also in the context of incident duration prediction.
However, it is likely that the proposed models would have limited accuracy when used in other geographical contexts where different incident management and emergency response actions take place. To ensure that the proposed models are able to deal with different conditions, a wider-scale data collection effort needs to be undertaken.
Declarations
Acknowledgment
The authors would like to thank Autostrade per l’Italia Spa for kindly supplying the incident data used in this study.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
 1. Golob TF, Recker WW, Leonard JD (1986) An analysis of the severity and incident duration of truck-involved freeway accidents. Institute of Transportation Studies, Univ. of California, Irvine, UCI-ITS-WP-86-9
 2. Giuliano G (1989) Incident characteristics, frequency, and duration on a high volume urban freeway. Transp Res 23A(5):387–396
 3. Jones B, Janssen L, Mannering F (1991) Analysis of the frequency and duration of freeway accidents in Seattle. Accident Anal Prev 23(4):239–255
 4. Ozbay K, Kachroo P (1999) Incident management in intelligent transportation systems. Artech House, Boston
 5. Nam D, Mannering F (2000) An exploratory hazard-based analysis of highway incident duration. Transp Res, Part A Policy Pract 34(2):85–102
 6. Smith BL, Smith K (2001) Forecasting the clearance time of freeway accidents. Research Report No. UVACTS15035. USDOT University Transportation Center
 7. Lin PW, Zou N, Chang GL (2004) Integration of a discrete choice model and a rule-based system for estimation of incident duration: a case study in Maryland. The 83rd Annual Meeting of the Transportation Research Board
 8. Wang W, Chen H, Bell MC (2005) Vehicle breakdown duration modelling. J Transp Stat 8(1):75–84
 9. Ozbay K, Noyan N (2006) Estimation of incident clearance times using Bayesian Networks approach. Accident Anal Prev 38(3):542–555
 10. Wei CH, Lee Y (2007) Sequential forecast of incident duration using Artificial Neural Network models. Accident Anal Prev 39(5):944–954
 11. Garib A, Radwan AE, Al-Deek H (1997) Estimating magnitude and duration of incident delays. J Transp Eng 123(6):459–466
 12. Johnson R, Wichern D (2003) Applied multivariate statistical analysis. Prentice Hall
 13. Fox J (1997) Applied regression analysis, linear models, and related methods. Sage
 14. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth International Group, Belmont
 15. Kass GV (1980) An exploratory technique for investigating large quantities of categorical data. Appl Stat 29:119–131
 16. Loh WY, Shih YS (1997) Split selection methods for classification trees. Stat Sinica 7:825–840
 17. Biggs D, DeVille B, Suen E (1991) A method of choosing multiway partitions for classification and decision trees. J Appl Stat 18(1):49–62
 18. Quinlan JR (1993) C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo
 19. Duda RO, Hart PE, Stork DG (2000) Pattern classification (2nd ed). John Wiley & Sons
 20. Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall
 21. Bishop C (1995) Neural networks for pattern recognition. Clarendon, Oxford
 22. Vapnik V (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999
 23. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, and references therein
 24. Tipping M (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244 (http://www.miketipping.com/index.php?page=rvm)
 25. Granger CWJ, Ramanathan R (1984) Improved methods of combining forecasts. J Forecast 3:197–204