Common before-after accident study on a road site: a low-informative Bayesian method
European Transport Research Review, volume 1, pages 125–134 (2009)
This note aims at providing a Bayesian methodological basis for routine before-after accident studies, often applied to a single road site, and in conditions of limited resources in terms of time and expertise.
A low-informative Bayesian method is proposed for before-after accident studies using a comparison site or group of sites. As compared to conventional statistics, the Bayesian approach is less subject to misuse and misinterpretation by practitioners. The low-informative framework seems appropriate in situations of limited expertise. The proposed approach gives the possibility of correcting for regression to the mean. Examples illustrate the application of this method.
Results and conclusions
It is shown that a relatively simple method, based on the Jeffreys’s rule prior considered as a “reasonable standard”, can be implemented without major difficulty. The posterior distributions are proper, and the posterior probabilities can be calculated numerically without Monte Carlo simulation or specialised software tools.
It is common that road sites are modified in order to achieve improvements from various points of view (traffic conditions, better integration of various uses and users of the road and public space, reduction of noise and air pollution, traffic safety, etc.). A few years after a site has been modified, local engineers generally have to study the effects of this road change, regarding various aspects including road safety. Thus, a retrospective before-after accident study is often needed.
In such routine situations, resources are limited in terms of time and expertise, and the risk of misuse of conventional statistical methods is increased. Even among people who are more experienced in statistics, like researchers, erroneous uses of conventional methods are common: misuse of tests of significance, erroneous understanding of p-values, misinterpretation of confidence intervals (as pointed out by many authors [15, 18, 19, 24, 27, 32]; see also [5, 11, 28]). For example, the p-value is often erroneously regarded as the probability that the null hypothesis is true, and the 95% confidence interval obtained is wrongly assumed to contain the true parameter with a 95% chance. The Bayesian approach to statistics is more in accordance with the expectations and intuitions of non-specialists. In particular, the posterior distribution can be legitimately used to give the probabilities that the parameter of interest is contained in various regions of the parameter space (a 95% credible interval, for example), or exceeds a particular value, given the data observed and prior knowledge. Some authors consider that teaching Bayesian statistics is easier than teaching frequentist statistics [10, 31]. Nevertheless, aids to practitioners are necessary to implement Bayesian methods, since the calculations in these approaches are sometimes complex.
In this paper, we will not deal with studies based on large samples of sites and using multivariate modelling, for which Bayesian approaches were proposed in the recent period [4, 30, 35, 37]. Bayesian methods adapted to meta-analyses or to overviews of several studies (see, for example, ) will not be considered here. We will focus on methods applicable to a single site and transferable to engineers for common practice.
In the case we deal with here (routine evaluation, single site), the methods currently used and recommended are conventional statistical methods (see, for example, ), even though they sometimes make use of empirical Bayes estimates of the expected accident number on the treated site in order to cope with ‘regression to the mean’ bias. The principle of a ‘full’ Bayesian approach was described by Hauer [21, 22] for studying the index of effectiveness θ of a road measure: the prior probability density function of θ, reflecting the prior knowledge concerning this parameter, is combined with the likelihood function (probability of the data given the parameter) to obtain the posterior probability density function. The posterior probabilities reflect the revised knowledge about the parameter, given previous knowledge and the data analysed. The method proposed by Hauer, however, is an informative (subjective) Bayes method and presupposes expertise or previously formalised knowledge: the prior probabilities are based on the “elicitation of prevailing opinion about the effectiveness of a treatment” (, p. 289), or possibly on the results of previous studies or meta-analyses. In the routine situations we consider here, however, road safety expertise is limited, since the study is often carried out by a local road engineer and not by a road safety specialist. Moreover, the site modification is often singular rather than generic (it may combine several treatments, for example the redesign of islands, resurfacing and marking at a junction), so it may be difficult to make use of results from previous meta-analyses. A method coping with this problem was described by Al-Masaied et al.: prior probabilities were estimated using a part of the accident data, for both the before and the after periods. In the case of a single site, however, this may lead to very small accident counts in each data subset.
Another way is to use the ‘objective’ or ‘low-informative’ Bayesian framework [6, 7, 17, 25, 26] where the prior probabilities are chosen in order to be neutral in some way as regards the possible parameter values, reflecting the lack of previous knowledge. Besides, it can be argued that results based on low-informative approaches are generally easier to communicate to a diverse or uninitiated audience, since, as mentioned by Box and Tiao , they represent “what someone who a priori knew very little about an unknown parameter should believe in light of the data” (p. 22).
In before-after accident studies, it is important to be able to control for regression to the mean bias, which can be done by incorporating some limited information into the prior distribution concerning one component of the vector of parameters (see Section 4). Besides, although such studies are retrospective and not experimental, one should seek to control for the confusing influence of factors other than the road change. To this end, it can be useful to take into consideration a comparison group of similar sites, for example. The method described by Hauer  uses a comparison sample, but the calculations are based on approximations which presuppose that the accident counts in the comparison sample are large. The method proposed by Al-Masaied et al.  is a simple before-after method without comparison sites.
In this methodological note, we describe a low-informative Bayesian method adapted to the current practice of before-after accident studies concerning a single treated site (or a group of sites considered as a whole). A comparison site (or group of sites) is used in order to control for factors other than the road modification. Practical means of calculation, for a commonly available spreadsheet software package, will also be provided on the author’s webpage (http://www.inrets.fr/ur/ma/Brenac.html).
Data structure and parameters for the before-after study with comparison sites
When a comparison site or group of sites is used, the basic data take the form of a 2×2 contingency table (Table 1) containing the observed accident counts xi (i = 1 to 4). These counts are considered as observations of independent Poisson variables Xi with unknown expected values μi.

Table 1
                      Period I (before)    Period II (after)
Treated site          x1                   x2
Comparison site(s)    x3                   x4

Under the assumption of a strong similarity between the treated site and the comparison site, and if the evolution of traffic does not differ between them, the effect of the measure can be represented by the odds ratio
$$\theta = \frac{\mu_2/\mu_1}{\mu_4/\mu_3} \tag{1}$$
θ expresses the ratio of the ‘accidentality’ on the treatment site during period II (after modification) to what this ‘accidentality’ would have been during the same period II, had the site not been modified. Here we use the term ‘accidentality’ in the somewhat unusual sense of the expected value of the accident count. From a practical point of view, an odds ratio of 0.8, for example, would mean that the effect of the treatment is a 20% reduction in accidentality. The ratio η = μ4/μ3 reflects the effect of other factors on the evolution from period I to period II, assumed to be common to both the treated and comparison sites (η can be considered as a trend parameter). In other terms, μ2 and μ4 can be expressed as follows: μ2 = μ1θη and μ4 = μ3η.
Thus, we are in the presence of a problem with four observations x1, x2, x3, x4 from four independent random Poisson variables X1, X2, X3 and X4, and four unknown parameters θ, η, μ1 and μ3 with the following relationships:
$$X_1 \sim \mathrm{Poisson}(\mu_1), \quad X_2 \sim \mathrm{Poisson}(\mu_1\theta\eta), \quad X_3 \sim \mathrm{Poisson}(\mu_3), \quad X_4 \sim \mathrm{Poisson}(\mu_3\eta) \tag{2}$$
The Bayesian framework
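According to the Bayesian approach to statistics, the unknown parameters θ, η, μ1 and μ3 are considered as instantiations of variables Θ, Η, M1 and M3, treated as random variables, but which in fact reflect our uncertainty about the values of these parameters. Given the observed data x = (x1, x2, x3, x4), the likelihood function L(x | θ, η, μ1, μ3) (see footnote 1) and the joint prior probability density function of the parameters π(θ, η, μ1, μ3), the application of Bayes’ theorem leads to the joint posterior distribution
$$\pi(\theta,\eta,\mu_1,\mu_3 \mid x) = \frac{L(x \mid \theta,\eta,\mu_1,\mu_3)\,\pi(\theta,\eta,\mu_1,\mu_3)}{\int_0^\infty\!\int_0^\infty\!\int_0^\infty\!\int_0^\infty L(x \mid \theta,\eta,\mu_1,\mu_3)\,\pi(\theta,\eta,\mu_1,\mu_3)\,d\theta\,d\eta\,d\mu_1\,d\mu_3} \tag{3}$$
The joint prior distribution π(θ, η, μ1, μ3) represents our previous assumptions or knowledge (or lack of knowledge) regarding the parameters (see Section 4). The joint posterior distribution represents our revised knowledge about the parameters, after the observations are taken into account. The likelihood function can be easily derived from the problem formulation given in Section 2:
$$L(x \mid \theta,\eta,\mu_1,\mu_3) = \frac{e^{-\mu_1}\mu_1^{x_1}}{x_1!}\cdot\frac{e^{-\mu_1\theta\eta}(\mu_1\theta\eta)^{x_2}}{x_2!}\cdot\frac{e^{-\mu_3}\mu_3^{x_3}}{x_3!}\cdot\frac{e^{-\mu_3\eta}(\mu_3\eta)^{x_4}}{x_4!} \tag{4}$$
The parameter of interest is θ. Its posterior probability density function can be obtained by integrating the joint posterior distribution with respect to the three other parameters:
$$\pi(\theta \mid x) = \int_0^\infty\!\int_0^\infty\!\int_0^\infty \pi(\theta,\eta,\mu_1,\mu_3 \mid x)\,d\eta\,d\mu_1\,d\mu_3 \tag{5}$$
From a practical point of view, however, the most useful result is the posterior cumulative distribution function of Θ,
$$F_\Theta(t \mid x) = P(\Theta \le t \mid x) = \int_0^t \pi(\theta \mid x)\,d\theta \tag{6}$$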
This cumulative distribution function makes it possible to calculate credible intervals and the probability that the effect studied is lower or higher than a particular value, given the data and prior probabilities.
Low-informative prior distributions
In this paper we assume a lack of previous knowledge or sufficient expertise regarding the parameters. Thus, the prior distributions should be low-informative or neutral as regards these parameters. This choice also tends to “let the data speak for themselves”, giving a higher importance to the likelihood function in the calculation of posterior probabilities. Two situations should be distinguished, however, according to whether regression to the mean bias is likely or not. Regression to the mean (see, for example, ) occurs when the site was chosen for treatment in consideration of a high accident record. In this case, the count x1 gives only biased information on the expected value μ1, and a low-informative prior distribution for μ1 would lead to biased results, overestimating the treatment effect. In this situation, other data or information are needed and should be taken into account in the prior distribution of μ1 (see point 4.2).
4.1 Case where regression to the mean bias is unlikely
In many circumstances, regression to the mean bias is unlikely: for example, when the site modification was not decided for safety reasons, but for other purposes (really independent from accident counts). In this case, a low-informative joint prior distribution can be chosen for the four parameters θ, η, μ1 and μ3. The way of selecting low-informative priors (also called non-informative, objective, default or reference priors) is widely discussed in Bayesian statistics (see the review by Kass and Wasserman; see also [2, 8, 9, 17, 20, 25, 26, 39]). We will not enter this debate here since, as mentioned by Ghosh et al., “even though there is no unique objective prior, the posteriors will usually be very similar even with a modest amount of data” (p. 147). In the present paper, for the sake of simplicity, we will only consider the prior obtained by the Jeffreys’s general rule (see footnote 2), which is widely accepted as a “reasonable standard”. For a vector of parameters ξ, the Jeffreys’s rule prior is proportional to the square root of the determinant of the Fisher information matrix:
$$\pi(\xi) \propto \left[\det I(\xi)\right]^{1/2} \tag{7}$$
where ∝ denotes proportionality. In this expression, I(ξ) is the Fisher information matrix defined by
$$I(\xi)_{ij} = -\,E\!\left[\frac{\partial^2 l}{\partial \xi_i\,\partial \xi_j}\right]$$
where l is the log-likelihood. Applied to our problem, using Eq. 4, this rule leads to the joint prior
$$\pi(\theta,\eta,\mu_1,\mu_3) \propto \left(\frac{1}{\theta}\right)^{1/2} \tag{8}$$
Like many non-informative priors, this prior is improper since it does not integrate to a finite value over the parameter space. In Bayesian statistics, however, this is not regarded as a problem, provided that the posterior distribution is proper (i.e., the integral in the denominator of Eq. 3 converges to a finite value).
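The step from the general rule to a joint prior for this model can be checked with a short derivation, given here as a sketch. The symbols m_i (the four Poisson means) and J (the Jacobian of the means with respect to the parameters) are introduced only for this illustration. The result, a prior proportional to θ^(−1/2), agrees with the (1/θ)^(1/2) factor quoted for the case where μ1 is held fixed.

```latex
\begin{align*}
% Poisson means as functions of \xi = (\theta, \eta, \mu_1, \mu_3)
m_1 &= \mu_1, \quad m_2 = \mu_1\theta\eta, \quad m_3 = \mu_3, \quad m_4 = \mu_3\eta \\
% Fisher information of four independent Poisson counts, written in the means
I(m) &= \operatorname{diag}\!\left(\tfrac{1}{m_1}, \tfrac{1}{m_2}, \tfrac{1}{m_3}, \tfrac{1}{m_4}\right) \\
% Change of parameters: I(\xi) = J^{\top} I(m)\, J, with J = \partial m / \partial \xi
|\det J| &= \mu_1\,\mu_3\,\eta \\
\det I(\xi) &= \frac{(\det J)^2}{m_1 m_2 m_3 m_4}
  = \frac{\mu_1^2\,\mu_3^2\,\eta^2}{\mu_1^2\,\mu_3^2\,\theta\,\eta^2}
  = \frac{1}{\theta} \\
\pi(\theta, \eta, \mu_1, \mu_3) &\propto \sqrt{\det I(\xi)} = \theta^{-1/2}
\end{align*}
```

Because the determinant no longer depends on η, μ1 or μ3, the prior is flat in those three parameters, which is why it is improper.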
4.2 Case where regression to the mean bias is likely
In this situation, conventional methods correct for regression to the mean by considering that the site is taken from a population of comparable sites and extracting complementary information from a sample of such sites (see footnote 3). Each of the accident counts x1j at these sites, during period I, is considered as an observation from a Poisson variable with mean μ1j. The μ1j are assumed to follow a Gamma distribution with shape parameter α and rate (inverse scale) parameter λ, so that their mean is α/λ (some empirical justifications can be found in the literature [1, 34]). This Poisson-Gamma structure leads to a negative binomial distribution of the counts x1j among this sample of sites. Based on the mean m and variance s² of this distribution, estimated from the x1j, it is possible to estimate α and λ (see footnote 4): α = m²/(s² − m) and λ = m/(s² − m). Conventional evaluation methods then replace x1, the usual estimate of μ1, by the empirical Bayes estimate μ1* = m²/s² + x1(s² − m)/s² = (α + x1)/(1 + λ) for the calculation of the odds ratio [16, 23, 36]. This technique has proved to be effective for correcting for regression to the mean bias.
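The moment estimates and the empirical Bayes estimate above can be sketched in a few lines of Python. The function names are hypothetical, chosen for this illustration; the input values m = 3.55, s² = 15.90 and x1 = 14 are those reported for the rural crossroads in Example 2.

```python
def gamma_moments(m, s2):
    """Moment estimates (alpha, lam) of the Gamma distribution of the site means.

    m and s2 are the mean and variance of the period I counts in the
    reference population; the formulas require over-dispersion (s2 > m).
    """
    if s2 <= m:
        raise ValueError("s2 must exceed m (over-dispersed counts)")
    alpha = m * m / (s2 - m)
    lam = m / (s2 - m)
    return alpha, lam


def empirical_bayes_mean(x1, m, s2):
    """Empirical Bayes estimate of mu1: (alpha + x1) / (1 + lam)."""
    alpha, lam = gamma_moments(m, s2)
    return (alpha + x1) / (1.0 + lam)


alpha, lam = gamma_moments(3.55, 15.90)
mu1_star = empirical_bayes_mean(14, 3.55, 15.90)
print(round(alpha, 2), round(lam, 2), round(mu1_star, 2))  # 1.02 0.29 11.67
```

The estimate 11.67 lies between the observed count (14) and the population mean (3.55), which is the shrinkage that corrects for regression to the mean.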
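The equivalent in the ‘full’ Bayes approach consists in taking the Gamma(α, λ) prior distribution for the parameter μ1:
$$\pi(\mu_1) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,\mu_1^{\alpha-1}\,e^{-\lambda\mu_1} \tag{9}$$
In this situation, a joint prior distribution can be obtained by calculating π(θ, η, μ3) with the Jeffreys’s rule applied while holding μ1 fixed (see ), which gives π(θ, η, μ3) ∝ (1/θ)^(1/2) and leads to
$$\pi(\theta,\eta,\mu_1,\mu_3) \propto \left(\frac{1}{\theta}\right)^{1/2}\mu_1^{\alpha-1}\,e^{-\lambda\mu_1} \tag{10}$$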
(Constants are not taken into account since they would be cancelled anyway as common factors in the denominator and numerator of Eq. 3). This prior is also improper. The estimates of α and λ are drawn from accident data at a sample of similar sites (independent from the group of comparison sites), or from an accident model, as described above for conventional methods using empirical Bayes estimates. Although this joint distribution (Eq. 10) uses some prior information concerning μ1 (through α and λ), it remains, however, low-informative in a relative sense, since no prior knowledge is used concerning the parameter of interest θ and the two other parameters η and μ3.
5.1 Case where regression to the mean bias is unlikely
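Applying the likelihood function (Eq. 4) and the joint prior distribution given in Eq. 8 to the calculation of the joint posterior distribution (Eq. 3) gives the following expression, after simplification (cancelling of factors present in both the numerator and the denominator):
$$\pi(\theta,\eta,\mu_1,\mu_3 \mid x) = \frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}\,\mu_1^{x_1+x_2}\,e^{-\mu_1(1+\theta\eta)}\,\mu_3^{x_3+x_4}\,e^{-\mu_3(1+\eta)}}{\int_0^\infty\!\int_0^\infty\!\int_0^\infty\!\int_0^\infty \theta^{x_2-1/2}\,\eta^{x_2+x_4}\,\mu_1^{x_1+x_2}\,e^{-\mu_1(1+\theta\eta)}\,\mu_3^{x_3+x_4}\,e^{-\mu_3(1+\eta)}\,d\theta\,d\eta\,d\mu_1\,d\mu_3} \tag{11}$$
This latter integral converges to a finite value even when some (or all) of the xi equal zero. Therefore, a proper posterior distribution can always be obtained. The terms in μ1 and μ3 are proportional to Gamma density functions, which makes it possible to integrate the expression given in Eq. 11 with respect to μ1 and μ3, leading to the joint posterior of θ and η
$$\pi(\theta,\eta \mid x) = K\,\frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}}{(1+\theta\eta)^{x_1+x_2+1}\,(1+\eta)^{x_3+x_4+1}} \tag{12}$$
with
$$K = \left[\int_0^\infty\!\int_0^\infty \frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}}{(1+\theta\eta)^{x_1+x_2+1}\,(1+\eta)^{x_3+x_4+1}}\,d\theta\,d\eta\right]^{-1} = \frac{1}{B\!\left(x_2+\tfrac12,\,x_1+\tfrac12\right)B\!\left(x_4+\tfrac12,\,x_3+\tfrac12\right)}$$
where B denotes the Beta function (see footnote 5). The posterior cumulative distribution function of Θ is then given by
$$F_\Theta(t \mid x) = K\int_0^t\!\int_0^\infty \frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}}{(1+\theta\eta)^{x_1+x_2+1}\,(1+\eta)^{x_3+x_4+1}}\,d\eta\,d\theta \tag{13}$$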
The calculation of this integral is generally not possible by analytical means. We describe in the appendix of this paper a way of calculating it numerically.
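As an illustration only, and not the spreadsheet procedure of the appendix, the posterior cumulative distribution function for the prior proportional to (1/θ)^(1/2) can be evaluated with a simple midpoint rule in plain Python. The sketch relies on a change of variables that is an assumption of this example: under this prior, U = θη/(1 + θη) and V = η/(1 + η) have independent Beta(x2 + ½, x1 + ½) and Beta(x4 + ½, x3 + ½) posteriors. The function names and the grid size n are hypothetical. The counts are those of Example 1.

```python
import math


def beta_pdf(u, a, b):
    """Density of a Beta(a, b) variable at 0 < u < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))


def make_posterior_cdf(x1, x2, x3, x4, n=2000):
    """Return the function t -> F(t | x) for the prior proportional to theta**-0.5."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]  # midpoints never touch 0 or 1
    # Running integral of the Beta density of U; cum_u[i] is its value at u = i*h.
    cum_u = [0.0]
    for u in mids:
        cum_u.append(cum_u[-1] + beta_pdf(u, x2 + 0.5, x1 + 0.5) * h)
    # Quadrature weights for V.
    w_v = [beta_pdf(v, x4 + 0.5, x3 + 0.5) * h for v in mids]
    norm = cum_u[-1] * sum(w_v)  # close to 1; absorbs the quadrature error

    def cdf_u(p):
        # Linear interpolation of the running integral at 0 <= p <= 1.
        pos = min(max(p, 0.0), 1.0) * n
        i = min(int(pos), n - 1)
        return cum_u[i] + (pos - i) * (cum_u[i + 1] - cum_u[i])

    def cdf(t):
        # theta <= t  is equivalent to  U <= t*V / (1 - V + t*V)
        total = sum(w * cdf_u(t * v / (1 - v + t * v)) for v, w in zip(mids, w_v))
        return total / norm

    return cdf


def quantile(cdf, p, lo=1e-6, hi=1e3):
    """Solve cdf(t) = p by bisection on a logarithmic scale."""
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


cdf = make_posterior_cdf(16, 3, 61, 46)  # counts of Example 1
p_beneficial = cdf(1.0)                  # posterior probability that theta < 1
lower, upper = quantile(cdf, 0.025), quantile(cdf, 0.975)
print(f"P(theta < 1) = {p_beneficial:.3f}, 95% interval: {lower:.3f} to {upper:.3f}")
```

For these counts the paper reports a posterior probability of 0.990 and a credible interval of 0.062 to 0.815, which gives a reference point for the sketch. Accuracy degrades when x1 or x2 is zero, because the Beta density is then unbounded at one end of the grid.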
5.2 Case where regression to the mean bias is likely
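For the prior given in Eq. 10, the same kind of calculations as those described in point 5.1 lead to the following expression for the posterior cumulative distribution of Θ:
$$F_\Theta(t \mid x) = K'\int_0^t\!\int_0^\infty \frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}}{(1+\lambda+\theta\eta)^{x_1+x_2+\alpha}\,(1+\eta)^{x_3+x_4+1}}\,d\eta\,d\theta \tag{14}$$
where
$$K' = \frac{(1+\lambda)^{x_1+\alpha-1/2}}{B\!\left(x_2+\tfrac12,\,x_1+\alpha-\tfrac12\right)B\!\left(x_4+\tfrac12,\,x_3+\tfrac12\right)}$$
The calculation leading to this result is not valid if x1 + α ≤ ½ (which is unlikely: α is a positive parameter and we are in a situation where the treated site was chosen in consideration of a high accident count x1). For the numerical calculation of this integral, see the appendix.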
Practical uses of the posterior cumulative distribution function of Θ
From a practical point of view, various useful results can be obtained using the function F_Θ(t | x). For example, the lower limit θ_LL and upper limit θ_UL of a 95% symmetrical credible interval are defined by F_Θ(θ_LL | x) = 0.025 and F_Θ(θ_UL | x) = 0.975; the probability, given the data, that θ is contained in this interval is 95%. The median θ_med defined by F_Θ(θ_med | x) = 0.5 gives a point estimate of the odds ratio for which the posterior risks of overestimation and underestimation are equal. The value F_Θ(1 | x) represents the posterior probability that θ is lower than 1, i.e. the probability that the treatment is beneficial to safety, given the data and initial assumptions (see Section 2).
Group of comparison sites instead of a single comparison site
In this situation, the group of q comparison sites is considered as a whole, with x3 = Σ x3k and x4 = Σ x4k (where x3k and x4k are the accident counts during periods I and II on each comparison site k, with k = 1 to q). The aggregated counts x3 and x4 are observations from random variables X3 and X4 which are Poisson variables (since they are obtained by summing the independent Poisson variables X3k and X4k) with means μ3 = Σ μ3k and μ4 = Σ μ4k. The calculations described in Sections 3 to 5 are then applied by simply using the aggregated counts x3 and x4 and the aggregated means μ3 and μ4 . The low informative joint prior is given by Eq. 8 or 10. The posterior cumulative distribution function of Θ is then given by Eq. 13 or 14 (with x3 = Σ x3k and x4 = Σ x4k ).
Multiple treated sites
The general case of several treated sites, considered independently, with possibly different odds ratios θ i due to heterogeneity in the treatment effect is beyond the purpose of this paper and will be the subject of further publications. Nevertheless, in the simpler situation where a group of treated sites is considered as a whole (with a focus on the overall effect of treatment), the methods described above can be easily adapted.
Let us consider n treated sites with accident counts x1i and x2i (i = 1 to n) during periods I and II, with corresponding means μ1i and μ2i, and q comparison sites with accident counts x3k and x4k (k = 1 to q) during periods I and II, with corresponding means μ3k and μ4k.
When regression to the mean bias is unlikely, and if we consider the treated sites as a whole (and the comparison sites as a whole), the calculations and results described in Sections 3 to 5 can be applied by simply using the aggregated counts x1 = Σ x1i, x2 = Σ x2i, x3 = Σ x3k, x4 = Σ x4k and the corresponding aggregated means, with the prior given in Eq. 8. In this case, the parameter θ represents the overall effect of the programme of treatment. The posterior probabilities are given by Eq. 13.
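When regression to the mean bias appears likely, if the same prior Gamma(α, λ) distribution can be assumed for the mean μ1i of each treated site i, the prior distribution of the overall mean μ1 = Σμ1i is a Gamma(nα, λ) distribution (using the classical property of the sum of independent Gamma variables with the same rate parameter λ). Considering the treated sites as a whole (and the comparison sites as a whole), and considering θ as the overall effect, the joint prior distribution becomes
$$\pi(\theta,\eta,\mu_1,\mu_3) \propto \left(\frac{1}{\theta}\right)^{1/2}\mu_1^{n\alpha-1}\,e^{-\lambda\mu_1} \tag{15}$$
where μ3 = Σμ3k. The posterior cumulative distribution function of Θ is then
$$F_\Theta(t \mid x) = K''\int_0^t\!\int_0^\infty \frac{\theta^{x_2-1/2}\,\eta^{x_2+x_4}}{(1+\lambda+\theta\eta)^{x_1+x_2+n\alpha}\,(1+\eta)^{x_3+x_4+1}}\,d\eta\,d\theta \tag{16}$$
where
$$K'' = \frac{(1+\lambda)^{x_1+n\alpha-1/2}}{B\!\left(x_2+\tfrac12,\,x_1+n\alpha-\tfrac12\right)B\!\left(x_4+\tfrac12,\,x_3+\tfrac12\right)}$$
and where n is the number of treated sites, x1 = Σ x1i, x2 = Σ x2i, x3 = Σ x3k and x4 = Σ x4k.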
Examples of application
Example 1: Safety effect of redesigning an urban road section
We describe here the case of an urban section of road where the infrastructure was largely modified in order to enhance the quality of local urban life. Raised median islands, small roundabouts, speed humps and raised tables were implemented in 2000 on this section of a main urban road in a town of 40,000 inhabitants (length of the treated section: 700 m). All the unmodified sections of the main roads in this town were taken as a comparison group of sites. The comparability between the treated site and the comparison group of sites was verified by comparing the yearly injury accident counts for the 1989–1999 period. The ‘before’ period is the five-year period from 1995 to 1999. The ‘after’ period is the five-year period from 2001 to 2005. The presence of regression to the mean bias was considered to be unlikely for the following reasons: this project was not decided for safety reasons, and the proportion of accidents during the 1995–1999 period relative to 1989–1999 was not higher on the treated site as compared to all the unmodified sections of main roads in this town. For the ‘before’ period, 16 injury accidents occurred on the treated site and 61 injury accidents occurred on the comparison group of sites. For the ‘after’ period, 3 injury accidents occurred on the treated site, and 46 injury accidents occurred on the comparison group of sites.
The calculations applied to these data (x1 = 16, x2 = 3, x3 = 61, x4 = 46) with the low-informative prior based on the Jeffreys’s rule (Eq. 8) give the following results based on the posterior cumulative distribution function of Θ:
95% symmetrical credible interval: 0.062 to 0.815
Posterior probability that θ < 1: 0.990
These results suggest a beneficial effect on safety. They can be compared to the results that would be obtained by conventional statistical methods. Nevertheless, as mentioned in the introduction, Bayesian and non-Bayesian concepts (like credible interval and confidence interval) cannot be interpreted in the same way (see footnote 6). In this example, the usual unconditional maximum likelihood estimator of the odds ratio, with the related approximate 95% confidence interval (Woolf interval), would lead to the following results:
θ_ML* = 0.249
Woolf 95% confidence interval: 0.068 to 0.904
In this example, a practitioner would probably conclude in favour of a positive effect on safety, from both these Bayesian and non-Bayesian results.
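For comparison, the conventional figures quoted above can be reproduced with the standard Woolf (logit) formulas. This is a minimal sketch: the function name is hypothetical, z = 1.96 is the usual normal quantile for a 95% interval, and the formula breaks down if any count is zero.

```python
import math


def woolf_interval(x1, x2, x3, x4, z=1.96):
    """Maximum likelihood odds ratio with its approximate Woolf interval.

    x1, x2: before/after counts on the treated site;
    x3, x4: before/after counts on the comparison site(s).
    All four counts must be non-zero.
    """
    theta = (x2 / x1) / (x4 / x3)
    # Standard error of log(theta) for four independent Poisson counts.
    se = math.sqrt(1 / x1 + 1 / x2 + 1 / x3 + 1 / x4)
    return theta, theta * math.exp(-z * se), theta * math.exp(z * se)


theta_ml, lower, upper = woolf_interval(16, 3, 61, 46)
print(f"{theta_ml:.3f} {lower:.3f} {upper:.3f}")  # 0.249 0.068 0.904
```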
Example 2: Safety effect of a rural crossroads modification
This example deals with a priority intersection on a main rural two-lane road. This crossroads was modified in 1986 (installation of median raised islands, marking) for safety reasons. Therefore, regression to the mean is likely to occur. At this junction, 14 injury accidents occurred during the three-year period before the treatment. During the three-year period following the treatment, 4 injury accidents occurred.
This evolution was compared to the evolution observed at a set of 11 similar intersections on main rural two-lane roads in the same region, used as a comparison group of sites. At these sites, considered as a whole, 33 injury accidents occurred during the before period and 22 injury accidents occurred during the after period.
The calculations applied to these data (x1 = 14, x2 = 4, x3 = 33, x4 = 22), using the low-informative prior distribution given by Eq. 8 (Jeffreys’s rule prior), would lead to the following results based on the posterior cumulative distribution function of Θ:
95% symmetrical credible interval: 0.117 to 1.389
Posterior probability that θ < 1: 0.917
Due to the high risk of regression to the mean in this case, however, these results are probably biased. In order to correct for this regression to the mean bias, it is necessary to use a more ‘informed’ prior, concerning the parameter μ1 (see Section 4). To this end, the parameters α and λ of a prior Gamma distribution for μ1 have to be estimated. By applying an accident model (which was established at a national level) to the characteristics of this junction (traffic volumes, number of arms, number of traffic lanes), as mentioned in Section 4.2, it is possible to obtain the mean m = 3.55 and variance s² = 15.90 of the accident counts for a virtual population of similar sites during the same period I. On this basis, we can calculate the estimates α = 1.02 and λ = 0.29. The joint prior given by Eq. 10 is then precisely defined and leads to the following results, in terms of posterior probabilities:
95% symmetrical credible interval: 0.151 to 1.789
Posterior probability that θ < 1: 0.828
These results show that, in this case, the safety effect is in reality smaller than indicated by the biased results obtained with the low-informative prior given by Eq. 8. The median of the posterior distribution (0.566) can be used as a point estimate of the odds ratio (where the posterior probabilities of overestimation and underestimation are equal). This value corresponds to an accident reduction of approximately 43%. The 95% credible interval, however, is large and the beneficial effect of the treatment remains uncertain.
Using the same data, a more conventional approach would lead, for example, to the maximum likelihood estimate θ_ML* = 0.429 (without controlling for regression to the mean), or to a corrected estimate of 0.515 based on the empirical Bayes estimate of μ1 [16, 36].
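The corrected posterior results of this example can be cross-checked by simulation. Note that this swaps in a Monte Carlo technique, which the paper deliberately avoids, purely as an independent check. The sketch assumes a representation derived for this illustration and not stated in the paper: under the prior with a Gamma(α, λ) component for μ1, θ has the same posterior distribution as (g2/g1)/(g4/g3), where g1 is Gamma with shape x1 + α − ½ and rate 1 + λ, and g2, g3, g4 are Gamma with shapes x2 + ½, x3 + ½, x4 + ½ and unit rate. The function name, number of draws and seed are arbitrary.

```python
import random


def sample_theta(x1, x2, x3, x4, alpha, lam, n_draws=100_000, seed=0):
    """Sorted posterior draws of theta (assumed Gamma-ratio representation)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        g1 = rng.gammavariate(x1 + alpha - 0.5, 1.0 / (1.0 + lam))
        g2 = rng.gammavariate(x2 + 0.5, 1.0)
        g3 = rng.gammavariate(x3 + 0.5, 1.0)
        g4 = rng.gammavariate(x4 + 0.5, 1.0)
        draws.append((g2 / g1) / (g4 / g3))
    draws.sort()
    return draws


draws = sample_theta(14, 4, 33, 22, alpha=1.02, lam=0.29)
n = len(draws)
p_beneficial = sum(d < 1.0 for d in draws) / n  # posterior P(theta < 1)
median = draws[n // 2]
lower, upper = draws[int(0.025 * n)], draws[int(0.975 * n)]
print(f"P(theta < 1) = {p_beneficial:.3f}, median = {median:.3f}, "
      f"95% interval: {lower:.3f} to {upper:.3f}")
```

The values to compare against are those reported above: 0.828 for the posterior probability, 0.566 for the median and 0.151 to 1.789 for the credible interval. Simulation noise of a few thousandths is to be expected.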
Example 3: Safety effect of resurfacing on main roads
This example is based on some of the data published in an article by Leden et al. , dealing with the effect of resurfacing on friction, speeds and safety on main roads in Finland. The treated sites are all sections on main roads (in the south of Finland) which were resurfaced in 1991. The comparison sites are all the untreated main roads in the same region. Due to the particular road conditions in winter in Finland, only the effects on the non-winter period (from April 1 to September 30) are studied. Regression to the mean bias is considered to be unlikely, since “sections were selected for treatment on a routine maintenance base”  (p. 82) and not for safety reasons. We consider the treated sites as a whole, and the comparison sites as a whole. The parameter θ thus represents the overall effect of the treatment on the set of sites. The following data concern the ‘before’ period from April to September 1990 and the ‘after’ period from April to September 1992. Before and after injury accident counts are x1 = 80 and x2 = 74 on the treated sites, and x3 = 931 and x4 = 779 on the comparison sites. Based on the Jeffreys’s rule prior, the results are as follows, in terms of posterior probabilities:
95% symmetrical credible interval: 0.794 to 1.537
Posterior probability that θ < 1: 0.275
One can note the proximity of these results with the following results which would be obtained with a conventional frequentist approach:
θ_ML* = 1.105
Woolf 95% confidence interval: 0.794 to 1.537
This proximity is not surprising: posterior credible intervals based on the Jeffreys’s rule prior are frequently close to frequentist confidence intervals in large sample conditions [17, 40] although they do not have the same meaning.
Based on these results, the posterior median estimate of θ would suggest a slight detrimental effect on safety (increase of accidentality of approximately 11%), but no certain conclusion can be drawn since the 95% credible interval is large. Based on the posterior probability that θ < 1 (approximately 28%), however, one could say that the probability that the treatment increases the accidentality, given the data and assumptions, is 72%. No equivalent result from a conventional statistical analysis could lead to this kind of interpretation, except if one wrongly interprets a p-value as a posterior probability. A possible increase of accidentality could be explained by the fact that resurfacing tends to increase the average speeds, at least when the road is dry, as shown by Leden et al. .
Discussion and conclusion
In this note, we described a low-informative Bayesian method for before-after accident studies, using a comparison site or group of sites, and giving the possibility of correcting for regression to the mean bias. The aim was to provide a methodological basis for routine evaluation studies, often applied to a single treated site, and in conditions of limited resources in terms of time and expertise. As compared to conventional statistics, the Bayesian approach is less subject to misuse and misinterpretation by practitioners with limited statistical experience. The low-informative or objective Bayesian methods seem appropriate in routine evaluation studies, where expertise or previous knowledge is often limited or hard to formalise. As shown in Sections 2 to 6, a relatively simple method, based on the Jeffreys’s rule prior considered as a “reasonable standard”, can be implemented without major difficulty. Posterior distributions are proper. The numerical calculation of posterior probabilities can be done without Monte Carlo methods or specialised software tools. The examples given in Section 7 show that the results can be analysed in a direct way, without the high risk of misinterpretation involved in the analysis of frequentist results.
Further developments, however, are still needed. Although this method seems to be transferable to engineers for common practice, further work is necessary in order to provide a simple, didactic description of the Bayesian line of reasoning, with minimal use of mathematical formalisms, appropriate for communicating this approach to practitioners. Concerning the practical means for calculating the posterior probabilities, the spreadsheet mentioned in the appendix (for a common spreadsheet software package) will be made available on our website.
The proposed method has limitations, of course. Retrospective before-after studies are not randomised experiments and the validity of their results is based on the assumption that the treated and comparison sites are similar. Before-after studies based on multivariate generalised linear models make it possible to better control for the influence of differences between treated and comparison sites. But such methodologies would generally involve a thorough data collection and analysis on a large sample of sites, which seems hard to implement by practitioners in the routine situations we considered in this paper. The comparability of treated and comparison sites, however, can be checked by examining their accident history, when accident data are available for a long period before the treatment (see ). A Bayesian approach to this subject could be studied. Besides, other developments could contribute to extending the field of application of the proposed method: in this paper, we only dealt with the case of a single treated site (or a group of sites treated as a whole, with a focus on the overall effect of the programme of treatment), with a comparison site or group of sites. The case of multiple treated sites considered independently and with possibly different odds ratios remains to be dealt with. However, this would involve an increased complexity and more difficulties for practitioners.
We hope this methodological note will contribute to an increased use of the Bayesian approach, which is more in accordance with the expectations and intuitions of non-statisticians, in the current practice of before-after accident studies.
Notes
1. This notation means: probability of the data x = (x1, x2, x3, x4) given the parameter values θ, η, μ1, μ3.
2. This rule can be justified from several points of view, in particular: invariance by re-parameterisation, uniformity, in the sense of equiprobability of regions of same size in the parameter space with a Riemannian metric, and minimisation of information (the Jeffreys's rule prior can be considered as a special case of the Bernardo-Berger prior). For developments of these arguments, see for example Ghosh et al. and Kass and Wasserman.
5. In the expression of K, the term in θ is proportional to a three-parameter Beta-prime distribution, which makes it possible to integrate with respect to θ over [0,+∞). The integration with respect to η is then possible, over [0,+∞).
6. A correct interpretation of a classical (non-Bayesian) 95% confidence interval is: if we could indefinitely repeat the same “experiment” with the same parameter value, 95% of the confidence intervals thus obtained would contain this value.
Abbess C, Jarrett D, Wright CC (1981) Accidents at blackspots: estimating the effect of remedial treatment, with special reference to the ‘regression-to-mean’ effect. Traffic Eng Control 22:535–542
Agresti A, Hitchcock DB (2005) Bayesian inference for categorical data analysis. Stat Methods Appl 14:297–330
Al-Masaied HR, Sinha KC, Kuczek T (1993) Evaluation of safety impact of highway projects. Transp Res Rec 1401:9–16
Aul N, Davis G (2006) Use of propensity score matching method and hybrid Bayesian method to estimate crash modification factors of signal installation. Transp Res Rec 1950:17–23
Belia S, Fidler F, Williams J, Cumming G (2005) Researchers misunderstand confidence intervals and standard errors. Psychol Methods 10:389–396
Berger J (1985) Statistical decision theory and Bayesian analysis. Springer, New York
Berger J (2006) The case for objective Bayesian analysis. Bayesian Anal 1:385–402
Berger JO, Bernardo JM (1992) On the development of the reference prior method. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds) Bayesian Statistics 4: Proceedings of the Fourth Valencia International Meeting. Clarendon Press, Oxford, pp 35–60
Bernardo JM (1979) Reference posterior distributions for Bayesian inference. J R Stat Soc Series B Stat Methodol 41:113–147
Berry DA (1995) Decision analysis and Bayesian methods in clinical trials. Cancer Treat Res 75:125–154
Berry DA (1997) Teaching elementary Bayesian statistics with real applications in science. Am Stat 51:241–246
Bin Ibrahim K, Metcalfe AV (1993) Bayesian overview for evaluation of mini-roundabouts as a road safety measure. Statistician 42:525–540
Box GEP, Tiao GC (1973) Bayesian inference in statistical analysis. Addison-Wesley, Reading
Brenac T (1994) Accidents en carrefour sur routes nationales, modélisation du nombre d’accidents prédictible sur un carrefour et exemples d’applications. INRETS report 185. INRETS, Arcueil (France)
Cohen J (1994) The earth is round (p < 0.05). Am Psychol 49:997–1003
De Brabander B, Nuyts E, Vereeck L (2005) Road safety effects of roundabouts in Flanders. J Saf Res 36:289–296
Ghosh JK, Delampady M, Samanta T (2006) An introduction to Bayesian analysis, theory and methods. Springer, New York
Goodman SN (2005) Introduction to Bayesian methods, I: measuring the strength of evidence. Clin Trials 2:282–290
Haller H, Krauss S (2002) Misinterpretations of significance: A problem students share with their teachers? Methods Psychol Res 7:1–20
Hasofer AM (1970) On the representation of ignorance in Poisson processes. J R Stat Soc Series B Stat Methodol 32:268–271
Hauer E (1983) Reflections on methods of statistical inference in research on the effect of safety countermeasures. Accid Anal Prev 15:275–285
Hauer E (1983) An application of the likelihood/Bayes approach to the estimation of safety countermeasure effectiveness. Accid Anal Prev 15:287–298
Hauer E (1997) Observational before-after studies in road safety. Pergamon, Oxford
Hauer E (2004) The harm done by tests of significance. Accid Anal Prev 36:495–500
Jaynes ET (1968) Prior probabilities. IEEE Trans Syst Sci Cybern 4:227–241
Jeffreys H (1961) Theory of probability, 3rd edn. Oxford University Press, London
Johnson DH (1999) The insignificance of statistical significance testing. J Wildl Manag 63:763–772
Kadane JB (1995) Prime time for Bayes. Control Clin Trials 16:313–318
Kass RE, Wasserman L (1996) The selection of prior distribution by formal rules. J Am Stat Assoc 91:1343–1370
Lan B, Persaud B, Lyon C, Bhim R (2008) Validation of a full Bayes methodology for observational before-after road safety studies and application to evaluation of rural signal conversions. Transportation Research Board annual meeting, Washington
Lecoutre B (1999) Beyond the significance test controversy: Prime time for Bayes? Bull Int Stat Inst LVIII(2):205–208
Lecoutre B, Lecoutre MP, Poitevineau J (2001) Uses, abuses and misuses of significance tests in the scientific community: Won’t the Bayesian choice be unavoidable? Int Stat Rev 69:399–418
Leden L, Hämäläinen O, Manninen E (1998) The effect of resurfacing on friction, speeds and safety on main roads in Finland. Accid Anal Prev 30:75–85
Maher MJ (1987) Fitting probability distributions to accident frequency data. Traffic Eng Control 28:356–357
Miranda-Moreno LF, Fu L (2007) Traffic safety study: empirical Bayes or full Bayes? Transportation Research Board annual meeting, Washington
Mountain L, Fawaz B, Sineng L (1992) The assessment of changes in accident frequencies on link segments: a comparison of four methods. Traffic Eng Control 33:429–431
Pawlovich MD, Li W, Carriquiry A, Welch T (2006) Iowa’s experience with road diet measures, use of Bayesian approach to assess impacts on crash frequencies and crash rates. Transp Res Rec 1953:163–171
Persaud B, Lyon C (2007) Empirical-Bayes before-after safety studies: Lessons learned from two decades of experience and future directions. Accid Anal Prev 39:546–555
Robert CP (2007) The Bayesian choice, from decision-theoretic foundations to computational implementation, 2nd edn. Springer, New York
Welch BL, Peers HW (1963) On formulae for confidence points based on intervals of weighted likelihoods. J R Stat Soc Series B Stat Methodol 25:318–329
The author would like to thank Sylvie Després (Université Paris-Nord) and three anonymous referees for their helpful comments.
Appendix: Numerical calculation of the posterior cumulative distribution function of Θ
In the case where regression to the mean bias is unlikely, this function is given by Eq. 13. It can be written in a more generic form:
where the parameters a, b, c and d take the values a = x2 + ½, b = x1 + ½, c = x4 + ½, d = x3 + ½ for the Jeffreys’s rule prior (Eq. 8), and the values a = Σx2i + ½, b = Σx1i + ½, c = Σx4k + ½, d = Σx3k + ½ in the case of multiple sites considered as a whole (with the Jeffreys’s rule prior; see Section 6), for example. If we successively use the three changes of variable z(η) = η / (1 + η), ω(θ) = θ z / ((θ − 1) z + 1) and lastly u(z) = Betacdf_{c,d}(z), where Betacdf_{c,d} denotes the cumulative distribution function of the Beta distribution with parameters c and d, we obtain from Eq. 17:
where Betacdf_{a,b} denotes the cumulative distribution function of the Beta distribution with parameters a and b, and Betacdf_{c,d}^{−1} denotes the inverse function of Betacdf_{c,d}. The Beta cumulative distribution function and its inverse are commonly available in spreadsheet software, and this integral can be calculated without using specialised tools (see below). In the case where regression to the mean bias is suspected, the posterior cumulative distribution function of Θ is preferably calculated from Eq. 14. This equation can be written in the following form:
where the parameters a, b, c and d take the values a = x2 + ½, b = x1 + α − ½, c = x4 + ½, d = x3 + ½ for the prior given by Eq. 10, and the values a = Σx2i + ½, b = Σx1i + nα − ½, c = Σx4k + ½, d = Σx3k + ½ in the case of multiple sites considered as a whole, with the prior given in Eq. 15. After the three successive changes of variable z(η) = η / (1 + η), ω(θ) = θ z / ((θ − 1 − λ) z + 1 + λ) and lastly u(z) = Betacdf_{c,d}(z), this integral becomes:
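The two generic integrals and the single integrals obtained after the changes of variable can be written out as in the following LaTeX sketch. It is a reconstruction from the parameter definitions and substitutions given above, assuming Poisson accident counts with priors proportional to (1/θ)^{½} and (1/θ)^{½} μ1^{α−1} exp(−λμ1). B(·,·) denotes the Beta function, z(u) is shorthand introduced here for Betacdf_{c,d}^{−1}(u), and the tags follow the equation numbers cited in the text.

```latex
\begin{align}
% Generic form, no regression-to-the-mean correction (cited as Eq. 17)
F_\Theta(t \mid x) &= \frac{1}{B(a,b)\,B(c,d)}
  \int_0^{\infty}\!\!\int_0^{t}
  \frac{\theta^{a-1}\,\eta^{a+c-1}}{(1+\theta\eta)^{a+b}\,(1+\eta)^{c+d}}
  \,d\theta\,d\eta \tag{17} \\
% After the substitutions z(\eta), \omega(\theta), u(z) (cited as Eq. 18)
F_\Theta(t \mid x) &= \int_0^1 \mathrm{Betacdf}_{a,b}\!\left(
  \frac{t\,z(u)}{(t-1)\,z(u)+1}\right) du,
  \qquad z(u) = \mathrm{Betacdf}_{c,d}^{-1}(u) \tag{18} \\
% Generic form with the regression-to-the-mean prior (cited as Eq. 19)
F_\Theta(t \mid x) &= \frac{(1+\lambda)^{b}}{B(a,b)\,B(c,d)}
  \int_0^{\infty}\!\!\int_0^{t}
  \frac{\theta^{a-1}\,\eta^{a+c-1}}{(1+\lambda+\theta\eta)^{a+b}\,(1+\eta)^{c+d}}
  \,d\theta\,d\eta \tag{19} \\
% After the corresponding substitutions (cited as Eq. 20)
F_\Theta(t \mid x) &= \int_0^1 \mathrm{Betacdf}_{a,b}\!\left(
  \frac{t\,z(u)}{(t-1-\lambda)\,z(u)+1+\lambda}\right) du \tag{20}
\end{align}
```

In both cases the inner integral over θ is a Beta-prime cumulative distribution function evaluated at tη (rescaled by 1 + λ in the second case), which is why the result collapses to a one-dimensional integral of a Beta cumulative distribution function over u in [0, 1].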
We will propose (on our website) a spreadsheet calculating FΘ(t | x) for any given value of t, given the accident counts and the prior choice (among the priors mentioned in this paper), for a common spreadsheet software package. The calculation of the integrals of Eqs. 18 and 20 is based on a classical trapezoidal quadrature method, with an increased number of calculation points in the vicinity of 0 and 1. The reliability of the results was carefully checked by comparing them to the results obtained with a more powerful software tool using other quadrature methods (adaptive Simpson quadrature and Lobatto quadrature), for a large range of possible values for accident counts x1, x2, x3, and x4.
NB: Simplified calculations can be used instead in the special case where the counts x3 and x4 are very large, since the trend parameter η can then be considered as non-random and approximately equal to x4 / x3. The problem then reduces to a two-parameter problem (θ and μ1), with two random observations x1 and x2. In this situation, the Jeffreys’s rule prior is again π(θ, μ1) ∝ (1/θ)^{½}. In the case where regression to the mean should be taken into account, the method described in Section 4 leads to the following prior: π(θ, μ1) ∝ (1/θ)^{½} μ1^{α−1} exp(−λμ1). With the Jeffreys’s rule prior, the same kind of calculations as those described in Sections 3 to 6 lead to the posterior cumulative distribution function
This is a Beta-prime distribution function with three parameters x2 + ½, x1 + ½ and 1/η. After a change of variable z(θ) = θ /(θ + 1/η), this integral can be written in the form of a Beta cumulative distribution function:
where a = x2 + ½ and b = x1 + ½. In the case where the prior is π(θ, μ1) ∝ (1/θ)^{½} μ1^{α−1} exp(−λμ1), the calculation leads to the following result:
where a = x2 + ½ and b = x1+α−½.
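The numerical procedure described in this appendix can be sketched in Python as follows. This is an illustrative sketch, not the spreadsheet mentioned above: it assumes the transformed integral is the integral over u in [0, 1] of Betacdf_{a,b}(ω), with z(u) = Betacdf_{c,d}^{−1}(u), as the changes of variable suggest. The function names, the cosine-spaced grid used to place more points near 0 and 1, and the accident counts are all hypothetical choices. The (1 + λ) factor in the large-count shortcut is our own derivation under the prior (1/θ)^{½} μ1^{α−1} exp(−λμ1). NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy.stats import beta


def refined_grid(n=2001):
    """Points on [0, 1], denser near 0 and 1 (cosine spacing, an assumed choice)."""
    k = np.arange(n)
    return 0.5 * (1.0 - np.cos(np.pi * k / (n - 1)))


def cdf_theta(t, x1, x2, x3, x4, alpha=1.0, lam=0.0, n=2001):
    """Approximate P(Theta <= t | x), t > 0, by trapezoidal quadrature over u.

    alpha=1 and lam=0 give the parameter values stated for the Jeffreys's
    rule prior; other values correspond to the regression-to-the-mean prior.
    """
    a, b = x2 + 0.5, x1 + alpha - 0.5
    c, d = x4 + 0.5, x3 + 0.5
    u = refined_grid(n)
    z = beta.ppf(u, c, d)  # z(u): inverse Beta(c, d) cumulative distribution
    omega = t * z / ((t - 1.0 - lam) * z + 1.0 + lam)
    f = beta.cdf(omega, a, b)  # assumed integrand: Betacdf_{a,b}(omega)
    # Trapezoidal rule on the non-uniform grid
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u)))


def cdf_theta_fixed_trend(t, x1, x2, eta, alpha=1.0, lam=0.0):
    """Shortcut when x3 and x4 are very large and eta is treated as known."""
    a, b = x2 + 0.5, x1 + alpha - 0.5
    # lam=0 gives z = t / (t + 1/eta); the (1 + lam) factor is our assumption
    return float(beta.cdf(t / (t + (1.0 + lam) / eta), a, b))


if __name__ == "__main__":
    # Hypothetical counts: treated site before/after, comparison before/after
    x1, x2, x3, x4 = 12, 7, 40, 38
    for t in (0.5, 1.0, 1.5):
        print(t, cdf_theta(t, x1, x2, x3, x4))
```

A quick plausibility check is to compare `cdf_theta` with `cdf_theta_fixed_trend` for very large comparison counts and eta = x4 / x3: the two values should then nearly coincide, in line with the NB above.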