- Original Paper
- Open Access
Autoregressive nonlinear time-series modeling of traffic fatalities in Europe
- George Yannis^{1},
- Constantinos Antoniou^{2} and
- Eleonora Papadimitriou^{1}
https://doi.org/10.1007/s12544-011-0055-4
© The Author(s) 2011
- Received: 5 February 2011
- Accepted: 30 August 2011
- Published: 27 September 2011
Abstract
Purpose
The objective of this paper is to provide a parsimonious model for linking motorization level with the decreasing fatality rates observed across EU countries during the last three decades.
Methods
A macroscopic analysis of road-safety in Europe at the country level is proposed through the application of non-linear models correlating fatalities and vehicles for the period between 1970 and 2002. Given the time series nature of road safety data, these models result in auto-correlated residuals, thus violating at least one of the assumptions of non-linear regression. Autoregressive forms of the considered models that overcome these limitations and provide superior predictive capabilities are also considered.
Results
An autoregressive log-transformed model seems to outperform the base autoregressive non-linear model in this respect. The use of these models allowed for the identification of the best and worst performing countries.
Conclusions
The proposed models can prove useful for assessing the road safety performance of the examined countries, as well as for obtaining some insight into the current and future trends of less developed countries.
Keywords
- Traffic safety
- Non-linear regression
- Time series analysis
- Autoregressive models
1 Introduction
Road traffic injuries represent a major global public health crisis, requiring concerted efforts for effective and sustainable prevention. Worldwide, the number of people killed in road traffic accidents every year is estimated at 1.2 million, while the number of those injured could be as high as 50 million, the combined population of five of the world’s largest cities [37]. Furthermore, while the number of accidents in developed countries is falling, the total number of road traffic deaths and injuries is forecast to rise by some 65% between 2000 and 2020 unless decisive action is taken globally [36], with deaths in low-income and middle-income countries expected to increase by as much as 80% [37] owing to their anticipated growth and the associated increase in traffic.
Macroscopic modeling can provide insight into this problem and help policy-makers in both under-developed and developing countries adjust their policies in reaction to the changing conditions. Older studies focused primarily on developed countries. Within the current research, data from countries from various parts of Europe are analyzed thus highlighting differences between countries that can be used to anticipate traffic safety trends in less developed countries. The interest of such an analysis may become more pronounced when considering that the EU includes different groups of countries with different socioeconomic characteristics presenting different road safety cultures and performances (i.e. western European countries, southern Mediterranean countries, eastern new member states) and requiring potentially different road safety measures, programmes and strategies.
Several researchers [9, 14, 24, 26], using road accident statistics, have presumed that the explanatory variables have a multiplicative effect on accidents (as opposed to e.g. additive). Henning-Hager [17] presented a non-linear regression model to express the relationship between traffic fatalities, traffic volumes and the quality of transportation supply and demand in urban areas. Qin et al. [31] showed that the relationship between crashes and the daily volume (AADT) is non-linear and varies by crash type, and is significantly different from the relationship between crashes and segment length for all crash types. A macroscopic road-safety model commonly used in the late 60s was proposed by Smeed [33] linking the number of fatalities with the number of vehicles and the population. Jacobs [18] repeated this analysis for a number of developed and developing countries using data between 1968 and 1975 while Gharaybeh [13] applied the same formula to assess the development of road safety in Jordan, relative to that of other middle-eastern and developing countries.
It should be noted, however, that many studies have criticised Smeed’s model because it concentrates only on the motorisation level of a country and ignores the impact of other variables (cf. [3, 8]). An implication of this is that effectiveness assessment of road safety measures would have little meaning, because road fatalities could simply be predicted from population and vehicle numbers in any country and any year, at least at the macroscopic level. Andreassen [3] criticised the model’s accuracy because there would always be a decline in traffic risk for any increase in the number of vehicles, but generally in a non-linear way, and proposed using country-specific parameters to distinguish between countries with a similar degree of motorisation. The main criticism of Andreassen, however, seems to be targeted at the way that the Smeed formula was manipulated algebraically (instead of a new regression being fit to the resulting transformation). Smeed’s formula anticipated the downward trend in the fatality rate but not the decline in the absolute number of fatalities that occurred in the highly motorized countries in the 1970s [8].
A critical review of a number of approaches for modeling road safety trends can be found in [14, 27]. Al-Haji [2] provides a review of these concerns, as well as several alternative approaches for the development of road safety models. Another useful review [10] provides a detailed analysis of the debate surrounding Smeed’s formulas and analysis. One of the conclusions is that “there is general agreement now among researchers, that models describing traffic safety developments should have time-dependent parameters.” In this paper, we contribute to this discussion by exploring the development of models that explicitly treat the temporal correlation of the road safety data. Within this alternative approach, time is not treated as an explanatory variable, but instead its negative impact (temporal serial correlation) is factored out by the use of appropriate statistical procedures in order to focus on road safety related predictors.
The comparison of time series of road safety among different countries has been an interesting research topic. Lassarre [22] applies the local linear trend model to ten European countries and uses the estimated trend and elasticities to make inferences about the relationship between traffic flow and number of fatalities. Page [28] presents a statistical model to compare road mortality in OECD (Organisation for Economic Co-operation and Development) countries, combining cross-sectional and panel data. Models with several exogenous variables are developed and countries are ranked based on their road mortality level. Beenstock and Gafni [5] show that there is a relationship between the downward trend in the rate of road accidents in Israel and other countries and suggest that this reflects the international propagation of road safety technology as it is embodied in motor vehicles and road design, rather than parochial road safety policy. Van Beeck et al. [35] examine the association between prosperity and traffic accident mortality in industrialized countries in a long-term perspective (1962–1990) and find that in the long term the relation between prosperity and traffic accident mortality appears to be non-linear. Kopits and Cropper [21] use linear and log-linear forms to model region-specific trends of traffic fatality risk and per capita income growth using panel data from 1963 to 1999 for 88 countries. Abbas [1] compares the road safety of Egypt with that of other Arab nations and G-7 countries, and develops predictive models for road safety. Yannis et al. [38] fit piece-wise linear regression models to identify changes in macroscopic road accident trends.
Lessons from the analysis of the past road safety patterns of developed countries provide some insight into the underlying process that relates motorization levels with personal risk and can prove to be beneficial for predicting the road safety evolution of developing countries that may have not yet reached the same breakpoints.
Taking into account the road safety macroscopic modeling background presented above, the objective of this paper is to provide a parsimonious model for linking motorization level with the decreasing fatality rates across EU countries observed during the last three decades. Models used in the late 1960s to describe the (at the time) increasing relationship between motorization and traffic fatalities were adjusted in order to describe the decreasing relationship observed in the last three decades. Time-series methods are applied to remove the temporal trends (and autocorrelation) from the modeling of traffic fatality risk, thus allowing the impact of macroscopic road safety related model parameters on traffic risk to be captured.
To that end, a macroscopic analysis of road safety in Europe at the country level (16 EU countries) is proposed through the application of non-linear models correlating fatalities and vehicles for the period between 1970 and 2002. Road safety trends can be attributed to various parameters, some of which can be modeled explicitly, while others may be handled indirectly. Within this analysis, the motorization level has been chosen as the single explanatory variable, as elaborate models that would include some of the other prevailing parameters (e.g. vehicle quality, traffic safety measures and regulations, intensity of police enforcement) are less macroscopic and thus fall outside the scope of this research.
2 Methodology
While the linear regression model is simple (to run and interpret), elegant and efficient, many interesting processes are more adequately modeled by non-linear models in practice. Linear regression models might have been a practical necessity in the past, but theoretical and computational developments have made more elaborate (and more appropriate and accurate) methods practical. This can also be seen in road safety research: early work used multiple linear regression modeling (assuming normally distributed errors and homoscedasticity), but over the past two decades there has been a departure from this model. Generalized linear models (GLM) allow some nonlinear relationships to be modeled and relax some of the distributional assumptions of linear regression [12, 25]. Many scientific and engineering processes can be described well by linear models, or other relatively simple types of models, but others are inherently nonlinear and call for non-linear models. The biggest advantage of nonlinear regression over many other techniques is the broad range of functions that can be fit.
The Gauss-Markov assumptions from ordinary least squares (OLS) procedures (normal, i.i.d. disturbances, etc.) still apply in non-linear regression. Therefore, whenever time or distance is involved as a factor in a regression analysis, it is important to check the assumption of independent residuals. When the residuals are not independent, the model for the observations must be altered to account for dependence (e.g. moving average or autoregressive models of variable order).
Road safety data are often correlated in space or time, raising the suspicion of correlated data (and hence residuals), which violates one of the underlying assumptions (that of independent disturbances). In order to provide a clear distinction with the previously defined data m = 1, …, M, potentially correlated data are denoted by n = 1, …, N. Serial correlation of the disturbances can be detected from an ordered time series plot of the residuals versus time or from a lag plot of the residuals on the (n)th case versus the residuals on the (n-1)th case. If a violation of independent disturbances is detected, then the model needs to be altered to account for this. Common forms for dependence, or autocorrelation, of disturbances are moving average or autoregressive models of variable order [7].
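The lag plot described above has a simple numerical counterpart: the correlation between each residual and its predecessor. The following is a minimal sketch in plain Python; the function name `lag1_correlation` and both residual series are hypothetical illustrations, not data from this study.

```python
def lag1_correlation(residuals):
    """Pearson correlation between e[n] and e[n-1] for n = 2..N."""
    current = residuals[1:]
    previous = residuals[:-1]
    n = len(current)
    mean_c = sum(current) / n
    mean_p = sum(previous) / n
    cov = sum((c - mean_c) * (p - mean_p) for c, p in zip(current, previous))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_p = sum((p - mean_p) ** 2 for p in previous)
    return cov / (var_c * var_p) ** 0.5

# Hypothetical residuals: a slow drift (positively correlated in time)
# and a strictly alternating pattern (negatively correlated in time).
drifting = [0.5, 0.4, 0.3, 0.1, -0.1, -0.3, -0.4, -0.5]
alternating = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]

r_drift = lag1_correlation(drifting)
r_alt = lag1_correlation(alternating)
print(f"drifting: {r_drift:.3f}, alternating: {r_alt:.3f}")
```

A value close to zero would be compatible with independent disturbances, whereas values near +1 or −1, as in these two artificial series, would point to serial correlation.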
In this paper, Eq. 4 forms the base model from which all the others are developed and against which they are benchmarked. Within this research, V/P was chosen as a macroscopic predictor of traffic fatalities, as it can be reliably calculated from data (vehicles and population) that are available and comparable across several EU countries. Traffic, road expenditure, driver behaviour and other road safety related parameters may also affect traffic fatality trends but cannot easily be calculated in a uniform way across the EU.
2.1 An autoregressive non-linear model
2.2 A log-transformed model
i.e. there is a multiplicative error term (as opposed to an additive error term in Eq. 4). The log transformations lead to some more transformations of model parameters, e.g. α′ = exp(α) in Eq. 11.
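As a small illustration of the parameter transformation α′ = exp(α), the sketch below back-transforms three intercepts of the base log-transformed model, taken from the estimation results reported in this paper. It assumes that the reported α of the log-transformed model is on the natural-log scale; the variable names are illustrative.

```python
import math

# Intercepts (alpha) of the base log-transformed model for three countries,
# as reported in the estimation results of this paper.
log_alpha = {"AT": -2.395, "EL": -1.210, "NL": -4.187}

# Back-transform to the multiplicative scale: alpha' = exp(alpha).
# Assumption: the logarithm used in the model is the natural logarithm.
alpha_prime = {country: math.exp(a) for country, a in log_alpha.items()}

for country, value in alpha_prime.items():
    print(f"{country}: alpha' = {value:.4f}")
```

The back-transformed values (roughly 0.091, 0.298 and 0.015) are close to, but not identical with, the α estimates of the base non-linear model for the same countries (0.099, 0.288 and 0.017), which is what one would expect when the two models differ in their error structure.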
2.3 An autoregressive log-transformed model
Note that the above model (Eq. 12) is not linear in the parameters, due to the second and fourth right-hand terms (in particular (1−φ)·α and φ·β). Furthermore, unlike the model in Eq. 4 (which is also not linear in the parameters, but can easily be transformed into a linear model by taking the logarithm, as shown in Eq. 10), this model cannot be easily transformed into a linear model.
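One way to see why the model is non-linear only through φ is that, for a fixed φ, it becomes an ordinary linear regression on quasi-differenced data. The sketch below uses this in a grid search over φ (a Hildreth-Lu type procedure). It is not the estimation method used in this paper, and it assumes that Eq. 12 has the form y_n = φ·y_{n−1} + (1−φ)·α + β·x_n − φ·β·x_{n−1} + ε_n, which is consistent with the terms named above. All data are simulated and all names are hypothetical.

```python
import random

def ols_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def fit_ar1_log_model(x, y, grid_step=0.01):
    """Grid search over phi in [0, 0.99]; alpha and beta by OLS given phi."""
    best = None
    steps = int(round(0.99 / grid_step))
    for k in range(steps + 1):
        phi = k * grid_step
        # Quasi-difference both series: z*_n = z_n - phi * z_{n-1}.
        ys = [y[n] - phi * y[n - 1] for n in range(1, len(y))]
        xs = [x[n] - phi * x[n - 1] for n in range(1, len(x))]
        a, beta = ols_line(xs, ys)  # intercept a equals (1 - phi) * alpha
        sse = sum((yi - a - beta * xi) ** 2 for xi, yi in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, a / (1.0 - phi), beta, phi)
    _, alpha, beta, phi = best
    return alpha, beta, phi

# Simulated data with known parameters and AR(1) disturbances.
random.seed(1)
alpha_true, beta_true, phi_true = -2.4, -2.0, 0.5
x = [random.uniform(-2.0, -0.5) for _ in range(200)]
u, y = 0.0, []
for xi in x:
    u = phi_true * u + random.gauss(0.0, 0.05)
    y.append(alpha_true + beta_true * xi + u)

alpha_hat, beta_hat, phi_hat = fit_ar1_log_model(x, y)
print(f"alpha={alpha_hat:.3f}, beta={beta_hat:.3f}, phi={phi_hat:.2f}")
```

The grid search trades precision in φ for simplicity; a general non-linear least-squares routine would estimate all three parameters jointly.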
In the remainder of this research, the four models represented by Eqs. 4, 8, 10, and 12 are estimated and assessed through a variety of tests, including lack-of-fit tests and portmanteau tests. Furthermore, the predictive ability of the models has been assessed using the root mean square percent error (RMSPE) statistic [30]. In order to validate the predictive ability of the estimated models, the data set was split into an estimation part and a validation part.
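For reference, the RMSPE can be computed as in the sketch below. It assumes the commonly used definition (the square root of the mean squared relative error, expressed as a fraction), which is taken here to match the statistic cited from [30]; the observed and predicted values are hypothetical.

```python
import math

def rmspe(observed, predicted):
    """Root mean square percent error, returned as a fraction (0.1 = 10%)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical validation data: relative errors of +10%, -10% and 0%.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
error = rmspe(observed, predicted)
print(f"RMSPE = {error:.4f}")
```

Because each error is scaled by the observed value, the statistic is comparable across countries with very different fatality counts.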
All models in this research have been estimated using the R Software for Statistical Computing v. 2.11.0 [32].
3 Data overview
Aggregate fatality, population and vehicle data from European countries between 1970 and 2002 have been used. Data for years 1970–1994 have been used for the model estimation and years 1995–2002 have been used for validation. Choosing different splits for the data set (e.g. setting aside fewer or more data for the validation) might lead to different results. The particular choice is based on the principle that as many data points as possible should be allocated for estimation, while still keeping more than a few data points for validation. The data have been obtained primarily from IRTAD (International Road Traffic and Accident Database). Official representatives of the countries with missing data were contacted directly, and several responses with additional data were incorporated into the database. In the end, out of the 25 countries of the enlarged EU, sufficiently complete data have been available for 16 of them, for which this model has been applied. Fatalities data refer to the 30-day definition of fatality for all countries, i.e. include all persons who died within 30 days of being involved in a traffic accident. The timeframe used in this research was decided during the Safety-Net project in 2006, when this work was initiated [34]. The presented models are general and could be applied to newer data.
Both linear and nonlinear regression assume normally distributed disturbances and minimize the sum of squared residuals (least-squares regression). Outliers can have a dominant effect in this process and therefore can be of particular interest in this analysis. On the other hand, one needs to be very cautious about removing data points that are suspected outliers, as this process can also artificially affect the model properties.
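The split described above can be sketched as follows. The helper `split_by_year` is a hypothetical illustration, and the series holds placeholder values only; changing the cutoff year is a simple way to probe the sensitivity to the choice of split mentioned in the text.

```python
def split_by_year(series, last_estimation_year):
    """Split a {year: value} mapping into estimation and validation parts."""
    estimation = {y: v for y, v in series.items() if y <= last_estimation_year}
    validation = {y: v for y, v in series.items() if y > last_estimation_year}
    return estimation, validation

# Placeholder series: one dummy value per year, 1970 to 2002 inclusive.
series = {year: None for year in range(1970, 2003)}
estimation, validation = split_by_year(series, 1994)
print(len(estimation), "estimation years,", len(validation), "validation years")
```

With the 1994 cutoff, 25 annual observations are used for estimation and 8 are held out for validation.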
4 Results and main diagnostics
Table 1 Non-linear model estimation results (top: base, bottom: after correcting for correlation)
Base non-linear model
Country | α estimate | α standard error | α t-test | β estimate | β standard error | β t-test
---|---|---|---|---|---|---
AT | 0.099 | 0.007 | 14.962 | −1.962 | 0.054 | −36.252
BE | 0.080 | 0.006 | 13.215 | −2.068 | 0.069 | −30.091
CY | 0.219 | 0.018 | 12.262 | −0.770 | 0.108 | −7.158
DK | 0.012 | 0.004 | 3.204 | −3.477 | 0.291 | −11.958
FI | 0.026 | 0.006 | 4.597 | −2.475 | 0.162 | −15.263
FR | 0.083 | 0.006 | 13.151 | −2.153 | 0.073 | −29.698
DE | 0.070 | 0.006 | 12.469 | −2.012 | 0.070 | −28.597
EL | 0.288 | 0.016 | 18.252 | −0.711 | 0.023 | −31.058
HU | 0.172 | 0.028 | 6.260 | −0.984 | 0.082 | −11.987
IE | 0.035 | 0.008 | 4.540 | −2.075 | 0.151 | −13.762
IT | 0.081 | 0.006 | 14.078 | −1.677 | 0.060 | −27.834
LU | 0.156 | 0.018 | 8.626 | −1.542 | 0.104 | −14.815
NL | 0.017 | 0.002 | 8.384 | −2.844 | 0.091 | −31.123
PT | 0.290 | 0.039 | 7.398 | −0.956 | 0.075 | −12.753
ES | 0.212 | 0.017 | 12.716 | −0.876 | 0.049 | −17.784
UK | 0.030 | 0.003 | 11.403 | −2.210 | 0.076 | −28.933
Autoregressive non-linear model (after correcting for correlation)
Country | α estimate | α standard error | α t-test | β estimate | β standard error | β t-test | φ estimate | φ standard error | φ t-test
---|---|---|---|---|---|---|---|---|---
AT | 0.090 | 0.010 | 9.303 | −2.051 | 0.096 | −21.484 | 0.3387 | 0.1255 | 2.699
BE | 0.077 | 0.012 | 6.215 | −2.111 | 0.158 | −13.396 | 0.4487 | 0.197 | 2.277
CY | 0.214 | 0.027 | 7.994 | −0.815 | 0.180 | −4.524 | 0.2047 | 0.2905 | 0.705
DK | 0.015 | 0.010 | 1.550 | −3.227 | 0.617 | −5.228 | 0.5686 | 0.1695 | 3.355
FI | 0.021 | 0.009 | 2.209 | −2.687 | 0.370 | −7.273 | 0.4647 | 0.1429 | 3.252
FR | 0.068 | 0.016 | 4.329 | −2.382 | 0.251 | −9.494 | 0.5339 | 0.1798 | 2.970
DE | 0.069 | 0.011 | 6.215 | −2.034 | 0.153 | −13.329 | 0.5282 | 0.1752 | 3.015
EL | 0.294 | 0.025 | 11.740 | −0.701 | 0.037 | −19.013 | 0.3005 | 0.2131 | 1.410
HU | 0.155 | 0.062 | 2.516 | −1.045 | 0.218 | −4.794 | 0.5825 | 0.1784 | 3.265
IE | 0.034 | 0.015 | 2.295 | −2.119 | 0.309 | −6.865 | 0.6081 | 0.152 | 4.001
IT | 0.071 | 0.009 | 8.243 | −1.818 | 0.116 | −15.687 | 0.3571 | 0.1819 | 1.964
LU | 0.131 | 0.015 | 8.521 | −1.757 | 0.114 | −15.438 | 0.2458 | 0.1454 | 1.691
NL | 0.015 | 0.003 | 4.986 | −2.969 | 0.163 | −18.200 | 0.3247 | 0.1364 | 2.380
PT | 0.219 | 0.068 | 3.230 | −1.154 | 0.196 | −5.893 | 0.5303 | 0.1314 | 4.037
ES | 0.135 | 0.054 | 2.473 | −1.306 | 0.418 | −3.122 | 0.7992 | 0.0688 | 11.619
UK | 0.025 | 0.006 | 4.244 | −2.374 | 0.230 | −10.333 | 0.5916 | 0.2184 | 2.709
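The t-test columns can be read as the ratio of each estimate to its standard error. The sketch below checks this for the AR coefficient φ of three countries, using values from the autoregressive non-linear results above; small gaps are expected because the published estimates and standard errors are rounded.

```python
# (estimate, standard error, reported t-test) for the AR coefficient phi
# of the autoregressive non-linear model, as reported in this paper.
reported_phi = {
    "AT": (0.3387, 0.1255, 2.699),
    "CY": (0.2047, 0.2905, 0.705),
    "ES": (0.7992, 0.0688, 11.619),
}

# Recompute the t-ratio as estimate / standard error.
t_ratios = {c: est / se for c, (est, se, _) in reported_phi.items()}

# Largest absolute gap between recomputed and reported t-statistics.
max_gap = max(abs(t_ratios[c] - reported_phi[c][2]) for c in reported_phi)

for country, t in t_ratios.items():
    print(f"{country}: t = {t:.3f} (reported {reported_phi[country][2]})")
```

As a rough rule of thumb, absolute t-ratios well below about 2, such as the one for Cyprus, suggest that the AR coefficient is not clearly different from zero for that country.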
An analysis of the correlograms indicates that serial correlation exists and that, if left untreated, it violates the independence assumption of the regression. Both the apparent exponential decay of the autocorrelations and the presence of a significant partial autocorrelation of order 1 suggest that a first order autoregressive process may be able to capture the serial correlation of the residuals. This is confirmed, as the autocorrelation is mostly dealt with in the residuals of the autoregressive models (as per Eq. 8), diagnostics for which are provided in Subfigures 2B and 2D (for France) and 2F and 2H (for Germany).
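The signature of a first order autoregressive process mentioned above (autocorrelations decaying roughly as φ^k, and a partial autocorrelation that vanishes beyond lag 1) can be reproduced on simulated data. The sketch below is illustrative only: the series is generated with an assumed φ = 0.6 and does not use the residuals of this study.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# Simulate an AR(1) series e_n = phi * e_{n-1} + noise (assumed phi = 0.6).
random.seed(42)
phi = 0.6
residuals, previous = [], 0.0
for _ in range(2000):
    previous = phi * previous + random.gauss(0.0, 1.0)
    residuals.append(previous)

acf = sample_acf(residuals, 3)
theoretical = [phi ** k for k in range(1, 4)]  # 0.6, 0.36, 0.216

# Partial autocorrelation at lag 2, from the first two autocorrelations.
pacf_lag2 = (acf[1] - acf[0] ** 2) / (1.0 - acf[0] ** 2)

print("sample ACF:", [round(r, 3) for r in acf])
print("lag-2 PACF:", round(pacf_lag2, 3))
```

With annual series of about 25 observations, as in this study, the sample estimates are much noisier than in this long simulated series, so the pattern is suggestive rather than conclusive.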
Table 2 Log-transformed model estimation results (top: base, bottom: autoregressive model)
Base log-transformed model
Country | α estimate | α standard error | α t-test | β estimate | β standard error | β t-test
---|---|---|---|---|---|---
AT | −2.395 | 0.057 | −42.122 | −2.031 | 0.057 | −35.857
BE | −2.521 | 0.074 | −33.904 | −2.056 | 0.077 | −26.896
CY | −1.555 | 0.083 | −18.642 | −0.818 | 0.120 | −6.808
DK | −4.004 | 0.278 | −14.423 | −3.047 | 0.273 | −11.160
FI | −2.985 | 0.188 | −15.884 | −1.916 | 0.166 | −11.528
FR | −2.565 | 0.091 | −28.050 | −2.236 | 0.100 | −22.443
DE | −2.715 | 0.068 | −40.134 | −2.056 | 0.073 | −28.224
EL | −1.210 | 0.048 | −25.150 | −0.694 | 0.024 | −28.465
HU | −1.647 | 0.172 | −9.569 | −0.919 | 0.096 | −9.567
IE | −3.353 | 0.224 | −14.995 | −2.073 | 0.164 | −12.682
IT | −2.395 | 0.051 | −46.635 | −1.558 | 0.054 | −29.118
LU | −1.994 | 0.071 | −27.923 | −1.671 | 0.082 | −20.433
NL | −4.187 | 0.088 | −47.684 | −2.928 | 0.078 | −37.472
PT | −1.340 | 0.078 | −17.150 | −1.012 | 0.053 | −19.184
ES | −1.558 | 0.094 | −16.500 | −0.878 | 0.070 | −12.540
UK | −3.566 | 0.117 | −30.529 | −2.248 | 0.112 | −20.050
Autoregressive log-transformed model
Country | α estimate | α standard error | α t-test | β estimate | β standard error | β t-test | φ estimate | φ standard error | φ t-test
---|---|---|---|---|---|---|---|---|---
AT | −2.452 | 0.097 | −25.263 | −2.100 | 0.103 | −20.420 | 0.451 | 0.155 | 2.910
BE | −2.509 | 0.159 | −15.772 | −2.045 | 0.173 | −11.850 | 0.539 | 0.188 | 2.862
CY | −1.581 | 0.109 | −14.456 | −0.865 | 0.167 | −5.191 | 0.079 | 0.300 | 0.263
DK | −3.526 | 0.662 | −5.324 | −2.540 | 0.675 | −3.765 | 0.688 | 0.150 | 4.594
FI | −1.871 | 1.036 | −1.805 | 0.149 | 0.713 | 0.209 | 0.940 | 0.040 | 23.539
FR | −2.954 | 0.571 | −5.177 | −2.697 | 0.711 | −3.795 | 0.748 | 0.185 | 4.035
DE | −2.567 | 1.269 | −2.022 | 0.738 | 1.270 | 0.581 | 0.961 | 0.020 | 49.141
EL | −1.184 | 0.087 | −13.568 | −0.680 | 0.045 | −14.979 | 0.431 | 0.205 | 2.097
HU | −1.977 | 0.508 | −3.896 | −1.110 | 0.304 | −3.653 | 0.709 | 0.165 | 4.290
IE | −2.817 | 5.391 | −0.522 | −0.523 | 0.582 | −0.898 | 0.980 | 0.066 | 14.977
IT | −2.360 | 0.126 | −18.768 | −1.521 | 0.149 | −10.193 | 0.686 | 0.165 | 4.165
LU | −2.053 | 0.063 | −32.483 | −1.760 | 0.075 | −23.359 | −0.033 | 0.184 | −0.178
NL | −4.219 | 0.134 | −31.500 | −2.963 | 0.122 | −24.254 | 0.358 | 0.177 | 2.030
PT | −1.437 | 0.145 | −9.879 | −1.094 | 0.109 | −10.010 | 0.548 | 0.158 | 3.472
ES | −1.948 | 0.687 | −2.837 | −1.216 | 0.760 | −1.601 | 0.849 | 0.191 | 4.451
UK | 0.123 | 2.283 | 0.054 | 0.044 | 0.766 | 0.057 | 1.040 | 0.040 | 26.168
One of the observations that can be made from Tables 1 and 2 is that the base non-linear regressions provide lower standard errors (and correspondingly higher t-test statistics) than their counterparts that have been corrected for serial correlation. Since the autoregressive models provide superior fit (as indicated by the summary goodness-of-fit statistics) and also satisfy the assumption of independent residuals (as indicated by the graphical diagnostics), it may be concluded that the “ordinary” non-linear models underestimate the standard errors. An exhaustive discussion of this issue in the context of OLS is provided in Petersen [29]. This is a serious potential issue with models that ignore violations of the independence assumption, as it could lead to the acceptance of non-valid models as true.
The significance of the coefficient β associated with the motorization level reinforces the indications about the validity of this model. Even when correcting for autocorrelation, the obtained t-statistics suggest that this coefficient is very significant. Therefore, it is inferred that the negative relationship between the motorization level and the fatality risk is not circumstantial. In the two following sections, further statistical tests will be performed to provide additional insight into the properties of the developed models.
5 Model assessment
5.1 “Portmanteau” tests
A more detailed presentation of these tests is available in several texts, including Box et al. [7], on which this section is based. In the following application, Eq. 14 is used.
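Since the test statistic itself is not reproduced in this text, the sketch below assumes the widely used Ljung-Box form of the portmanteau statistic, Q = n(n+2) Σ r_k² / (n−k) summed over lags k = 1..K, which may or may not coincide with Eq. 14. The input series is an artificial, strongly autocorrelated example.

```python
def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

# Artificial residuals that alternate in sign (lag-1 autocorrelation -0.95).
alternating = [1.0, -1.0] * 10
q_stat = ljung_box_q(alternating, 1)

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_1DF_5PCT = 3.841
reject_independence = q_stat > CHI2_CRIT_1DF_5PCT
print(f"Q = {q_stat:.2f}, reject independence: {reject_independence}")
```

When the statistic is applied to residuals of a fitted autoregressive model, the degrees of freedom are usually reduced by the number of estimated AR parameters, so more than one lag would be needed in practice.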
5.2 Comparison of predictive results
The impact of the autoregressive process in the prediction results is clear, with both autoregressive models almost consistently outperforming the base models. The non-linear AR model performs on average 39% better than the nonlinear model (i.e. the average reduction in the RMSPE of the models for the 16 countries that have been considered is 39%), while the autoregressive log-transformed model performs on average 49% better than the log-transformed model. This is a substantial improvement at the cost of just one extra parameter (the AR coefficient φ). Also, the AR log-transformed model performs on average more than 13% better than the AR non-linear model.
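The average improvement quoted above can be understood as the mean, across countries, of the relative reduction in RMSPE. The sketch below assumes this reading; the country labels and RMSPE values are hypothetical and are not the results of this study.

```python
def mean_relative_reduction(base, improved):
    """Average of (base - improved) / base over matching keys."""
    reductions = [(base[k] - improved[k]) / base[k] for k in base]
    return sum(reductions) / len(reductions)

# Hypothetical RMSPE values for three countries (not the paper's results).
base_rmspe = {"A": 0.20, "B": 0.10, "C": 0.40}
ar_rmspe = {"A": 0.10, "B": 0.08, "C": 0.30}

improvement = mean_relative_reduction(base_rmspe, ar_rmspe)
print(f"average RMSPE reduction: {improvement:.1%}")
```

Averaging relative rather than absolute reductions gives each country equal weight regardless of how large its base error is.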
In absolute numbers, the non-linear and log-transformed models sometimes provide inaccurate predictions, with RMSPE values ranging between 0.1 and 0.4. The performance of the autoregressive models, on the other hand, is a lot more consistent, with most models providing predictions well below 0.1. Only two models (Cyprus and Luxembourg) have a higher RMSPE (i.e. lower predictive ability). An explanation may be found in the fact that these are by far the smallest of the considered countries (in terms of population) and hence the sample (not in terms of annual observations, but in terms of fatalities per year) is smaller for them.
6 Model interpretation
The interpretation of parameter α is fairly straightforward: it is a positive multiplicative parameter, and as such it can be considered an indicator of the level of traffic risk in the country. Naturally, these parameters are not always directly comparable, as the value of the second parameter β also affects the overall fatality rate. As the base of the exponent term is the car ownership rate, which is usually less than one, a more negative value of β implies a higher overall term. One can deduce that parameter α is the dominant parameter, and as such a simplified categorization of the countries in terms of their traffic fatalities status could be based on that parameter (i.e. their position along the x-axis). Consequently, better performing countries are those presenting a lower fatality rate combined with an increasing effect of the motorization rate. Several topics can be further investigated. For example, an interesting question is the influence of the general level of motorization on the models and the values of their parameters.
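The interplay between the two parameters can be illustrated numerically. The sketch below assumes a power-law term of the form α·(V/P)^β and evaluates it at a single hypothetical motorization level of 0.4 vehicles per person; the α and β pairs for Greece and the United Kingdom are the base non-linear estimates reported in this paper, while the function name and the chosen motorization level are illustrative.

```python
def risk_term(alpha, beta, motorization):
    """Evaluate alpha * (V/P) ** beta for a given motorization level V/P."""
    return alpha * motorization ** beta

V_PER_P = 0.4  # hypothetical motorization level, below one vehicle per person

# With a base below one, a more negative exponent gives a larger term.
term_beta_1 = risk_term(1.0, -1.0, V_PER_P)  # about 2.5
term_beta_2 = risk_term(1.0, -2.0, V_PER_P)  # about 6.25

# Base non-linear estimates reported for Greece (EL) and the United Kingdom.
greece = risk_term(0.288, -0.711, V_PER_P)
uk = risk_term(0.030, -2.210, V_PER_P)

print(f"beta=-1: {term_beta_1:.2f}, beta=-2: {term_beta_2:.2f}")
print(f"EL: {greece:.3f}, UK: {uk:.3f}")
```

At this common motorization level the term for Greece remains larger than that for the United Kingdom even though the latter has the more negative β, which is in line with the observation that α is the dominant parameter.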
Combining these observations, safer countries should be to the left and top of Fig. 6 and less safe countries should be to the right and bottom. No countries are located in the lower right triangle of the plot, which is a reflection of the fact that, despite their differences, the considered countries are developed and have a decent level of road safety. It is expected that developing countries may be located closer to the lower right corner of the plot. Their objective should be to move towards the top left corner of the plot. This trend might, to a degree, occur due to the increased motorization level resulting in lower speeds, but also in a better overall road safety culture. However, it would be possible for road safety experts and policy makers in these countries to also study the successful policies and measures from the more advanced (from a road safety point of view) countries and try to adapt them and incorporate them into their road safety strategies.
Among the countries considered, the least safe countries in Europe today are Greece, Portugal, and Cyprus, and indeed the respective points are located closer to the right and top of the plot. Similarly, the United Kingdom, Finland, the Netherlands and Denmark (some of the safest countries in Europe) are closer to the left and bottom, without necessarily providing the exact ranking between them. These findings provide further validation for the ability of this model to capture existing road safety trends.
7 Conclusion
Modeling road safety is a complex task, which needs to consider both the quantifiable impact of specific parameters, and the underlying trends that cannot always be measured or observed. The sensitivity of users to road safety campaigns, the improved quality of the vehicle fleet, the improvement of the driving skills of the general population, and the overall improvement of the condition of the road network are only some of the aspects that cannot be easily modeled directly. Therefore, modeling should consider both measurable parameters and the dimension of time, which embodies all remaining parameters.
In the present research, the development of macroscopic models using both time and vehicle fleet as explanatory variables would have also been a meaningful approach. However, a different approach was chosen, for several reasons. First of all, time has some limitations as an explanatory variable, as it does not really explain road safety trends but instead indirectly reflects the changes in other parameters. Furthermore, a parameter representing time is linear (and uniform across countries) and thus limited in the amount of information that it can add to the model.
On the other hand, vehicle fleet may affect the number of fatalities, given that an increase in the number of vehicles leads to higher average traffic volumes, which in turn may translate to a reduction in average speeds. Moreover, an increase of the vehicle fleet and total mileage in a country increases the need for a larger and safer road environment, in which drivers’ behaviour also tends to be better [19, 20]. Besides, vehicle fleet is acknowledged as a useful alternative measurement of exposure when traffic data are not available. Therefore, there is a causal macroscopic relationship between the number of fatalities (or fatality rates) and vehicles (or vehicle ownership). In this research, this relation has been investigated and modeled in the context of European countries.
Time-series methods have been used to account for and correct temporal correlation of the data. It is recognised, however, that traffic fatality risk also depends on other parameters, such as vehicle quality, traffic safety initiatives and regulations, and intensity of police enforcement. A number of reasons make the collection of these data across countries very difficult and, even when such data exist, they are often not directly comparable. Another important consideration is that some of these variables may be endogenous and thus might require special treatment in order not to impair the model.
The value of a simple model that could be used for cross-country comparisons can be easily motivated, without however claiming to fully explain the road safety phenomenon. Therefore, this paper provides a parsimonious model for linking motorization level with fatality rates across EU countries and possibly some insight into the existing or future trends in other, especially less developed countries, which still have not reached the motorization level of EU countries. Examining the road safety patterns of countries at this motorization level, policy makers and road safety experts in developing countries could foresee these developments and incorporate them into their strategies and policies.
Using fatality rate and vehicle ownership data from 16 EU countries for a period of 33 years (1970–2002), several models were developed, fitted, validated and compared, including simple non-linear models, their log-transformations and the related autoregressive models. The autoregressive versions of the models were shown to overcome the correlation of the residuals and also exhibit superior predictive properties. For a couple of countries (Italy and the Netherlands), however, the autoregressive model performed worse than the base non-linear model. Log-transformed versions of the model also suffer from correlated residuals and, with the exception of a few cases (especially Finland, Greece, and Hungary), have predictive capabilities better than or similar to those of the non-linear models. The autoregressive log-transformed models also overcome the issues with the correlated residuals and provide superior predictive performance.
However, the estimated coefficients of the AR log-transformed model are questionable (in terms of magnitudes and signs) for five of the 16 countries, suggesting that this model should be applied with caution, taking into account the particularities of the case examined. The autoregressive non-linear models therefore seem to be a more robust choice for prediction of macroscopic road safety trends, as they provide desirable predictive properties, satisfy the assumptions of the model (e.g. uncorrelated residuals) and provide intuitive model parameters (in terms of magnitude and sign).
The models presented in this research are regression-based and therefore have modest data requirements. Since annual road safety time series are often short, such models are well suited to this analysis. The time intervals should be long enough to provide adequate data for model estimation while still leaving a reasonable validation data set. The choice of interval boundaries can be important if the time series exhibits sudden changes that could shift the regression line. If such changes are observed in the data, modelers are advised to try alternative definitions of the time intervals in order to determine the sensitivity and robustness of the models to the inclusion of one or more additional data points.
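The recommendation to try alternative interval definitions can be sketched as a simple boundary scan. This is an illustrative sketch only: the linear model, the synthetic series with a level shift and the function names are assumptions, not the procedure used in this paper. The model is refitted for each candidate size of the estimation set, and the root mean squared error on the remaining validation points is recorded.

```python
import numpy as np


def holdout_rmse(x, y, n_fit):
    """Fit y = a + b * x on the first n_fit points; return RMSE on the rest."""
    X = np.column_stack([np.ones(n_fit), x[:n_fit]])
    (a, b), *_ = np.linalg.lstsq(X, y[:n_fit], rcond=None)
    pred = a + b * x[n_fit:]
    return float(np.sqrt(np.mean((y[n_fit:] - pred) ** 2)))


def boundary_sensitivity(x, y, candidate_sizes):
    """Validation RMSE for each candidate estimation-set size."""
    return {n_fit: holdout_rmse(x, y, n_fit) for n_fit in candidate_sizes}


# Synthetic series (NOT the paper's data): a linear trend over 33 points
# with a hypothetical sudden level shift starting at index 25.
x = np.linspace(0.0, 1.0, 33)
y = 1.0 - 2.0 * x
y[25:] -= 0.3

# Move the estimation/validation boundary across the shift.
sensitivity = boundary_sensitivity(x, y, range(22, 29))
for n_fit, rmse in sensitivity.items():
    print(f"estimation points: {n_fit:2d}  validation RMSE: {rmse:.4f}")
```

If the validation error changes markedly when one or two points are moved from the validation set to the estimation set, the fitted line is sensitive to the boundary, and the conclusions drawn from a single split should be treated with caution.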
The results of the presented models can be used to evaluate the road safety performance of various countries, identifying poor performers as well as traffic safety leaders. As shown in the previous section, the model correctly identifies the poor performers among the considered countries (Greece, Portugal, Cyprus), as well as the countries that lead in road safety performance (United Kingdom, Finland, Netherlands, Denmark). At the individual country level, given estimates of a country’s expected performance, the actual road safety performance of that country over the past few years may be assessed. Moreover, applying the models describes the expected road safety situation of a country in a “do-nothing” scenario, so that the potential impact of adopted road safety strategies may be assessed at the macroscopic level (e.g. for target setting). Furthermore, the study of more advanced countries (in terms of road safety and in general) may be used to predict the future evolution of countries that are less developed or less successful in terms of traffic safety. However, it is stressed that the use of the developed models for prediction should be limited to the domain over which they were estimated, as their applicability in ranges for which data are not available cannot be verified.
Further research directions include the enrichment of the model with additional macroscopic parameters, as well as the investigation of other functional forms and model specifications. Additional parameters (such as the Gross Domestic Product, GDP) may help separate exogenous effects and isolate road safety trends. Other functional forms may also provide valuable insight into the road safety problem. One relevant question is whether road safety trends are similar for the best and worst performing countries and, subsequently, where the inflection points lie that define the thresholds between the changing trends. An alternative modeling approach would be the use of state-space models and structural time-series models, such as those proposed by Harvey and Shephard [16] and Harvey [15], which belong to the family of unobserved component models. One advantage of this type of model is that it can explicitly account for interventions or external road safety measures and campaigns.
References
- Abbas KA (2004) Traffic safety assessment and development of predictive models for accidents on rural roads in Egypt. Accid Anal Prev 36(2):149–163
- Al-Haji G (2007) Road Safety Development Index (RSDI). Theory, philosophy and practice. Linköping Studies in Science and Technology, Dissertation No. 1100, Norrköping, Sweden
- Andreassen D (1991) Population and registered vehicle data vs. road deaths. Accid Anal Prev 23(5):343–351
- Bates DM, Watts DG (1988) Nonlinear regression analysis and its applications. Wiley, New York
- Beenstock M, Gafni D (2000) Globalization in road safety: explaining the downward trend in road accident rates in a single country (Israel). Accid Anal Prev 32:71–84
- Box GEP, Pierce DA (1970) Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. J Am Stat Assoc 65:1509–1526
- Box GEP, Jenkins GM, Reinsel GC (1994) Time series analysis. Forecasting and control. Prentice Hall International, Inc., New Jersey
- Broughton J (1991) Forecasting road accident casualties in Great Britain. Accid Anal Prev 23(5):353–362
- Cameron MH, Haworth N, Oxley J, Newstead S, Le T (1993) Evaluation of Transport Accident Commission road safety television advertising. Report No. 52, Monash University Accident Research Centre
- COST329 (2004) Models for traffic and safety development and interventions. Final Report of the Action. European Commission, Luxembourg
- Davies N, Triggs CM, Newbold P (1977) Significance levels of the Box-Pierce portmanteau statistic in finite samples. Biometrika 64:517–522
- Dobson AJ (1990) An introduction to generalized linear models, 2nd edn. Chapman and Hall, London
- Gharaybeh FA (1994) Application of Smeed’s formula to assess development of traffic safety in Jordan. Accid Anal Prev 26(1):113–120
- Hakim S, Shefer D, Hakkert AS, Hocherman I (1991) A critical review of macro models for road accidents. Accid Anal Prev 23(5):379–400
- Harvey AC (1994) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge
- Harvey AC, Shephard N (1993) Structural time series models. In: Maddala GS, Rao CR, Vinod HD (eds) Handbook of Statistics, vol 11. Elsevier Science Publishers B.V., Amsterdam, pp 261–302
- Henning-Hager U (1986) Urban development and road safety. Accid Anal Prev 18(2):135–145
- Jacobs GD (1986) Road accident fatality rates in developing countries-a reappraisal. In: PTRC Summer Annual Meeting, University of Sussex, 14–17 July 1986, Proc of Seminar H. London: PTRC Education and Research Services, pp 107–119
- Koornstra MJ (1992) The evolution of road safety and mobility. IATSS Research 16:129–148
- Koornstra MJ (1997) Trends and forecasts in motor vehicle kilometrage, road safety, and environmental quality, pp 21–32 in Roller D (ed) The motor vehicle and the environment – Entering a new century. Proceedings of the 30th International Symposium on Automotive Technology & Automation, Automotive Automation Limited, Croydon
- Kopits E, Cropper M (2005) Traffic fatalities and economic growth. Accid Anal Prev 37:169–178
- Lassarre S (2001) Analysis of progress in road safety in ten European countries. Accid Anal Prev 33:743–751
- Ljung GM, Box GEP (1978) On a measure of lack of fit in time series models. Biometrika 65:297–303
- Lord D (2002) Application of accident prediction models for computation of accident risk on transportation networks. Transport Res Rec: J Transport Res Board 1784:17–26
- McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman and Hall, New York
- Newstead S, Cameron MH, Gantzer S, Vulcan P (1995) Modeling of some major factors influencing road trauma trends in Victoria 1989–93. Report No. 74, Monash University Accident Research Centre
- Oppe S (1989) Macroscopic models for traffic and traffic safety. Accid Anal Prev 21(3):225–232
- Page Y (2001) A statistical model to compare road mortality in OECD countries. Accid Anal Prev 33:371–385
- Petersen MA (2009) Estimating standard errors in finance panel data sets: comparing approaches. Rev Financ Stud 22:435–480
- Pindyck RS, Rubinfeld DL (1997) Econometric models and economic forecasts, 4th edn. Irwin McGraw-Hill, Boston
- Qin X, Ivan JN, Ravishanker N (2004) Selecting exposure measures in crash rate prediction for two-lane highway segments. Accid Anal Prev 36(2):183–191
- R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org (accessed August 13, 2011)
- Smeed RJ (1968) Variations in the pattern of accident rates in different countries and their causes. Traffic Eng Contr 10(7):364–371
- Stipdonk HL (ed) (2008) Time series applications on road safety developments in Europe. Deliverable D7.10 of the EU FP6 project SafetyNet
- van Beeck EF, Borsboom GJJ, Mackenbach JP (2000) Economic development and traffic accident mortality in the industrialized world, 1962–1990. Int J Epidemiol 29:503–509
- WHO (2002) WHO mortality statistics. World Health Organization, Geneva
- WHO (2004) World report on road traffic injury prevention. World Health Organization, Geneva
- Yannis G, Antoniou C, Papadimitriou E, Katsochis D (2011) When may road fatalities start to decrease? J Saf Res 42(1):17–25
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.