Derivation of level of service by artificial neural networks at horizontal curves: a case study in Egypt
European Transport Research Review volume 7, Article number: 4 (2015)
Multi-lane highways represent the majority of the total length of the highway network in many countries. The geometric design of such facilities and the traffic volume, including the percentage of heavy vehicles (HV), are considered the most important factors affecting the level of service (LOS), especially on horizontal curves.
This paper explores the relationship between road geometric characteristics (including horizontal curve properties) and traffic volume (including average annual daily traffic (AADT) and HV) on the one hand and LOS on the other, using two methods: the generalized linear modeling (GLM) procedure and the artificial neural networks (ANNs) procedure. Traffic and road geometric data were collected on 78 horizontal curves distributed over four multi-lane highways in Egypt. Two of the highways are located in desert areas and the other two in agricultural areas.
The analyses showed that the ANNs procedure gives the best models for estimating LOS. The most influential variables on LOS are AADT, HV, and the radius of the horizontal curve (R). The derived models have statistics within acceptable ranges and are conceptually reasonable.
These findings are important for road authorities in Egypt, who can treat them as a first step toward an Egyptian highway capacity manual.
The transportation system in Egypt suffers from limited roadway infrastructure and a lack of operation and management experience. Among the most critical issues in highway planning and management is the effect of road geometric characteristics and the percentage of heavy vehicles (HV) in the traffic composition on LOS at multi-lane rural highways. Rural multi-lane highways are an important type of uninterrupted-flow facility, in which there are no obstructions to the movement of vehicles along the road. Such facilities represent the majority of the highway system in Egypt. The Highway Capacity Manual (HCM) [1] uses traffic density (Density), in passenger cars per kilometer per lane, as the primary level of service (LOS) measure for multi-lane highways. Abdul-Mawjoud and Sofia [2] indicate that horizontal curves have long been recognized as having a significant effect on vehicle speeds. Horizontal curves therefore need to be taken into consideration in the present analysis. Hence, this paper aims to evaluate LOS on horizontal curves at multi-lane highways using two modeling techniques: the generalized linear modeling (GLM) technique and the artificial neural networks (ANNs) technique.
Field data on multi-lane highways in Egypt are used in this investigation. The analysis considers 78 horizontal curves distributed over four multi-lane highways. Two of them are located in desert areas (the Cairo-Alexandria and Cairo-Ismailia desert highways), and the other two are located in agricultural areas (the Cairo-Alexandria and Tanta-Damietta agricultural roads). The paper includes two separate analyses. The first uses the GLM procedure to investigate the relationships between Density as the dependent variable and horizontal curve properties, roadway factors, traffic volume, and HV as independent variables. The horizontal curve properties include radius of curve, deflection angle, and superelevation. The road factors for each curve are lane width, pavement width, lateral clearance, and number of lanes in each direction. The traffic volume is expressed by average annual daily traffic (AADT). The second analysis uses the ANNs procedure to explore the same relationships and compares the results. With the results of this research, road authorities in Egypt can determine LOS for different horizontal curves on multi-lane highways and improve their traffic performance in the future.
Several studies have analyzed the effect of road geometry and traffic composition on LOS for multi-lane highways. Kerner [3] confirmed that the determination of LOS for any highway is one of the most important applications of any traffic theory. Some previous theoretical and empirical studies focused on the interrelationships among LOS, traffic features, and geometric elements on uninterrupted multi-lane highways [4–7]. The Indonesian highway capacity manual [8] uses travel speed as the main measure of LOS of road segments.
Reinfurt et al. [9] examined the speed changes that lead to LOS variation over horizontal curves. They found that, as curves become sharper, the speed reduction on the curve becomes proportionally greater. The study findings support the argument that drivers cut the curve short, which can result in run-off-road crashes on the inside of the curve as well as head-on and opposite-direction sideswipe crashes with oncoming vehicles.
Arasan and Arkatkar [10] studied the effect of traffic composition, road width, and the magnitude and length of upgrade on the capacity and LOS of Indian highways. They concluded that highway performance changes significantly with changes in traffic composition, roadway width, and the magnitude and length of upgrade.
Sakai et al. [11] used an empirical approach to produce an LOS measure for basic expressway segments in Japan that incorporates customer satisfaction (CS). They confirmed that LOS and CS have a nonlinear relation.
Shawky and Hashim [12] studied the effect of horizontal alignment on traffic performance at two-lane rural highways. Follower density was used as a promising measure of traffic performance. The results show that horizontal alignment characteristics, especially the curve radius, have a significant effect on follower density: as the radius decreases, follower density increases (i.e., traffic performance decreases).
In Egypt, few studies have addressed LOS because of the lack of road geometric, horizontal curve, traffic flow, and speed data. The most important research in this direction was published by Semeida [13]. That analysis used 45 tangent sites from four main multi-lane rural highways and two modeling techniques: multiple linear regression and ANNs. The results showed that ANN modeling gives the best models for estimating LOS and that the most influential variables on LOS are lane width, HV, and the existence of side access. That analysis was limited to tangent sections. Hence, the present paper extends the analysis to include LOS on horizontal curves.
Study sites and field data
This research is concerned with horizontal curves at rural multi-lane highways in Egypt. The analysis uses 78 horizontal curves (sections) from four main multi-lane highways in Egypt: the Cairo-Alexandria Agricultural Highway, the Tanta-Damietta Agricultural Highway, the Cairo-Alexandria Desert Highway, and the Cairo-Ismailia Desert Highway. The collected data are divided into three types: road geometric characteristics (including horizontal curves), vehicle speed, and traffic volume data (including AADT and HV).
Road geometry data
These data are the key independent variables in the analysis. Some of them are collected directly by site investigation: lane width, pavement width, lateral clearance, and number of lanes in each direction. The horizontal curve data are taken from Abdalla [14], who obtained them while working with the survey team of the General Authority of Roads, Bridges and Land Transport in Egypt (GARBLT) [15]. The horizontal curve properties include radius of curve, deflection angle, and superelevation. All of these variables, their symbols, and their statistical analysis are provided in Table 1.
Vehicles speed data
The main type of speed data collected is the average travel speed of passenger cars (ATSpc). Passenger cars include taxis, vans, and jeeps. ATSpc is measured in the field during the peak traffic flow, which is taken as the worst case. A constant distance of between 50 and 100 m is marked along each horizontal curve, and the time that each vehicle takes to travel this distance is recorded. The speed is then calculated by dividing the constant distance by the recorded time. The sample at any curve is not smaller than 100 vehicles. The speed data are recorded in Table 2.
Traffic volume data
The purpose of collecting traffic volume data is to obtain AADT and the design hourly volume (V) and to determine HV along each section (HV includes semi-trucks, trucks, and truck trailers that have at least one axle with dual wheels). AADT is estimated at each site following Semeida. Because there are many sites (78 curves), a manual traffic count is carried out for one hour at each site. The results of a manual count must be expanded and corrected to obtain AADT, as described by Heikal and Abdel-latif [17]. This requires three factors: the hourly factor (HF) (100 / % volume in counting hours), the daily factor (DF), and the seasonal factor (SF). These factors are constant for each road and are obtained from GARBLT [15]. The values of DF and SF for all roads under study are listed in Table 3. The calculated AADT is given by Eq. (1). The values of AADT are listed in Table 4.
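The expansion of a one-hour count to AADT can be sketched as follows. This is a minimal illustration that assumes the three factors are applied multiplicatively to the hourly count; the function name and the input values are hypothetical and are not taken from Tables 3 or 4.

```python
def expand_to_aadt(hourly_count, pct_of_daily_volume, daily_factor, seasonal_factor):
    """Expand a one-hour manual count (veh/h) to an AADT estimate (veh/day).

    Assumption: AADT = count * HF * DF * SF, with HF = 100 / (% of the
    daily volume observed in the counting hour), as defined in the text.
    """
    if not 0 < pct_of_daily_volume <= 100:
        raise ValueError("pct_of_daily_volume must be in (0, 100]")
    hourly_factor = 100.0 / pct_of_daily_volume
    return hourly_count * hourly_factor * daily_factor * seasonal_factor


# Hypothetical inputs chosen only to show the arithmetic
aadt = expand_to_aadt(hourly_count=1200, pct_of_daily_volume=6.0,
                      daily_factor=1.0, seasonal_factor=1.0)
print(round(aadt))
```

With DF and SF both equal to 1, the estimate reduces to the hourly count scaled by HF alone, which makes the role of each correction factor easy to check by hand.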
Consequently, the design hourly volume (V) can be calculated by Eq. (2) [1]:
$$V = \text{AADT} \times K \tag{2}$$
where:
- V :
design hourly volume (typically, the 30th highest annual hourly volume),
- AADT :
average annual daily traffic in vehicles per day, and
- K :
factor used to convert annual average daily traffic to a specified annual hourly volume (K = 0.1 for rural roads).
During any particular hour, traffic volume will likely be greater in one direction than in the other. The directional distribution (D) is therefore an important factor in quality of service analysis. To convert the hourly volume to the hourly directional volume (Dir.V), the hourly volume is multiplied by the D-factor as shown in Eq. (3):
$$\text{Dir.V} = V \times D \tag{3}$$
Note that D = 0.6 for rural roads, as stated by HCM 2010 [1]. The Dir.V values for all sites are recorded in Table 4. Finally, HV is extracted from the manual counts, in which vehicles are classified into passenger cars and heavy vehicles. These values are given in Table 4.
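The two conversions described above (AADT to design hourly volume, then to directional volume) can be sketched in a few lines. The K and D values are the ones stated in the text; the function names and the AADT input are hypothetical.

```python
K_RURAL = 0.1  # K-factor for rural roads, as stated in the text
D_RURAL = 0.6  # directional distribution for rural roads, as stated in the text


def design_hourly_volume(aadt, k=K_RURAL):
    """Design hourly volume V (veh/h) from AADT (veh/day)."""
    return aadt * k


def directional_volume(v, d=D_RURAL):
    """Hourly volume in the heavier direction, Dir.V (veh/h)."""
    return v * d


v = design_hourly_volume(20000)  # hypothetical AADT of 20,000 veh/day
dir_v = directional_volume(v)
print(round(v), round(dir_v))
```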
Determination of LOS for horizontal curves under study
- Dir. V :
hourly volume in one direction (veh/h)
- PHF :
peak-hour factor (PHF = 0.88 for rural roads)
- N :
number of lanes in the selected direction
- fHV :
heavy-vehicle adjustment factor (0.97–0.9, depending on the HV value), and
- fp :
driver population factor (taken as 1, as the drivers are familiar with the highway).
Consequently, LOS is determined for each curve from the Density value, which reflects the degree of vehicle congestion on the curve. The LOS and Density for all curves are listed in Table 4.
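A minimal sketch of how Density and LOS could be derived from the quantities listed above is given here. It assumes the usual HCM multilane relations (flow rate = Dir.V / (PHF × N × fHV × fp), Density = flow rate / average passenger-car speed). The LOS bounds are approximate metric thresholds that are consistent with the grades quoted later in the text (for example 5.6 pc/km/lane as A and 15.7 pc/km/lane as C); the upper bound for E depends on free-flow speed, so the value used here is an assumption. Function names and inputs are hypothetical.

```python
# Upper density bounds (pc/km/lane) per LOS grade; assumed approximate values
LOS_BOUNDS = [("A", 7.0), ("B", 11.0), ("C", 16.0), ("D", 22.0), ("E", 25.0)]


def flow_rate(dir_v, phf, n_lanes, f_hv, f_p=1.0):
    """Equivalent passenger-car flow rate (pc/h/lane)."""
    return dir_v / (phf * n_lanes * f_hv * f_p)


def density(v_p, speed_kmh):
    """Density (pc/km/lane) as flow rate divided by average car speed."""
    return v_p / speed_kmh


def level_of_service(d):
    """Map a density value to a LOS grade using LOS_BOUNDS."""
    for grade, upper in LOS_BOUNDS:
        if d <= upper:
            return grade
    return "F"


# Hypothetical curve: 1200 veh/h in one direction, 2 lanes, cars at 80 km/h
v_p = flow_rate(dir_v=1200, phf=0.88, n_lanes=2, f_hv=0.95)
d = density(v_p, speed_kmh=80.0)
print(round(d, 2), level_of_service(d))
```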
Thus, the research uses a total of 10 variables, divided into dependent and independent variables. The single dependent variable is Density, which is the measure of LOS. The nine independent variables are lane width, pavement width, lateral clearance (LC), number of lanes in each direction, radius of curve (R), deflection angle (DA), superelevation (e), AADT, and HV.
The methodology of LOS prediction in the present research is divided into two main procedures: the GLM procedure and the ANNs procedure. Previous research indicates that ANNs provide a better model, as shown by better predictions for lower values, the normality of the residuals, and their independence from the predicted variable. Several authors have reported better performance of ANNs compared with GLM. The advantage of ANNs over GLM is that ANNs can directly account for any non-linear relationship between the dependent variable and each independent variable. In addition, the ANN modeling approach is fast and flexible, and ANN models can be easily used by engineers.
GLM has been used successfully in accident prediction all over the world. Semeida used this technique and reached important results in this context. In the present research, the model is used in a different form to fit the density data. The mathematical form includes all independent variables separately. This is done by taking the natural logarithm of all these variables to produce a power relationship between Density and each of them, which allows the effect of each independent variable on Density to be studied separately. A normal distribution with a log link function is chosen to model these data. The form is given in Eq. (6), following Varagouli et al. [18]:
$$\text{Density} = \beta_o \, X_1^{\beta_1} X_2^{\beta_2} \cdots X_n^{\beta_n} \tag{6}$$
where:
- Xn :
explanatory variables from 1 to 9,
- βo :
regression constant, and
- βn :
regression coefficients of the explanatory variables.
This form satisfies two main conditions:
The model must yield logical (non-negative) results. Also, at Xi = 0, Density must be zero.
A logarithmic link function exists that can linearize this form for the purpose of coefficient estimation.
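The two conditions above can be illustrated with a small sketch of a power-form model and its log-linearization. Note the swap in technique: this example fits a single-variable power relation by ordinary least squares on the logarithms, which is a simplification and is not the normal-distribution, log-link GLM estimated by maximum likelihood in the paper. All names and data are hypothetical and synthetic.

```python
import math


def power_model(beta0, betas, xs):
    """Evaluate beta0 * prod(x_i ** beta_i) for one observation."""
    result = beta0
    for b, x in zip(betas, xs):
        result *= x ** b
    return result


def fit_single_power(xs, ys):
    """Least-squares fit of ln(y) = ln(beta0) + beta1 * ln(x).

    Simplified stand-in for the GLM fit; requires strictly positive data.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    beta1 = sxy / sxx
    beta0 = math.exp(my - beta1 * mx)
    return beta0, beta1


# Synthetic data generated from a known power law (not field data)
radii = [150.0, 250.0, 400.0, 600.0]
densities = [power_model(200.0, [-0.5], [r]) for r in radii]
b0, b1 = fit_single_power(radii, densities)
print(round(b0, 3), round(b1, 3))
```

Because the synthetic data contain no noise, the fit recovers the generating constant and exponent; with real data the least-squares-on-logs estimate and the log-link GLM estimate would generally differ.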
This model form is fitted using the generalized linear model procedure PROC GENMOD in the SAS statistical software [19]. The procedure [20] uses maximum likelihood to estimate the regression coefficients, standard errors, Wald chi-squared statistics, and p-values. In addition, the R2 (coefficient of determination) and ║δ║ (infinity norm of the error vector) values are calculated for each model, following Varagouli et al. [18]. Finally, the model with the minimum ║δ║ and the highest R2 is selected.
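The two selection statistics can be computed as in the sketch below. The infinity norm is the largest absolute entry of the error vector; whether the paper's error vector holds absolute or relative errors is not stated here, so the `relative` switch is an assumption, as are the function names and sample values.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def inf_norm_error(observed, predicted, relative=True):
    """Infinity norm (largest absolute entry) of the error vector.

    relative=True divides each error by the observed value (an assumption).
    """
    errors = []
    for o, p in zip(observed, predicted):
        e = o - p
        errors.append(abs(e / o) if relative else abs(e))
    return max(errors)


# Hypothetical observed and predicted densities (pc/km/lane)
obs = [10.0, 12.0, 8.0]
pred = [9.0, 12.0, 10.0]
print(r_squared(obs, pred), inf_norm_error(obs, pred))
```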
In general, ANNs consist of three layers: the input, the hidden, and the output layers. In statistical terms, the input layer contains the independent variables and the output layer contains the dependent variables. ANNs typically start out with randomized weights for all their neurons, and these weights are adjusted during training. When a satisfactory level of performance is reached, the training ends and the network uses these weights to make a decision (Singh et al., 2011).
The experience in this field is drawn from Semeida [13, 22, 23]. In those studies, the multi-layer perceptron (MLP) neural network models gave the best performance of all models. In addition, this network is usually preferred in engineering applications because many learning algorithms can be used with MLP. One of the commonly used learning algorithms in ANN applications is the back propagation (BP) algorithm, which is also used in this work.
The overall dataset of 78 curve sections is divided into a training dataset and a testing dataset. As in the literature, the training data set varies from 70 % to 90 % and the testing data set varies from 10 % to 30 %. Model performances are and R2 for testing and training data set in one hand and for all data set in the other hand (Voudris, 2006) .
Many trials are carried out to find the split between training and testing data that gives the best model performance. In addition, overfitting can be avoided by randomizing the order of the 78 curves before training the network, so that the best performance is reached for both the training and the testing data. The performance on the testing data must be as good as on the training data (R2 must not be smaller than 0.7) (Tarefder et al., 2005).
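A randomized split of the 78 curves can be sketched as below. The shuffle-then-slice approach, the seed, and the function name are assumptions for illustration; the source does not describe how the software performed the split.

```python
import random


def split_curves(curve_ids, test_fraction=0.2, seed=0):
    """Shuffle curve identifiers and split them into training and testing sets."""
    ids = list(curve_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for a repeatable split
    n_test = int(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


train_ids, test_ids = split_curves(range(1, 79))
print(len(train_ids), len(test_ids))
```

Repeating the call with different `test_fraction` values between 0.1 and 0.3 would mimic the trial-and-error search over split ratios described above.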
Data analysis and results
The correlations between Density on curve sections and the nine independent variables are analyzed. As shown in Table 5, the Pearson correlation coefficient and the significance value (Sig.) are calculated using SPSS. The table shows significant correlations at the 0.01 level between Density and six independent variables: LC, R, DA, e, AADT, and HV. These variables are then introduced into the GLM and ANN models. There should be no multicollinearity among the independent variables selected in the final models.
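For readers without SPSS, the Pearson coefficient and its t statistic can be sketched in plain Python. The formulas are the standard ones; the function names and sample values are hypothetical, and the significance threshold itself is left to a t table or statistics package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical radii (m) and densities (pc/km/lane)
radius = [150.0, 250.0, 400.0, 600.0]
dens = [15.0, 12.0, 9.0, 6.0]
r = pearson_r(radius, dens)
print(round(r, 3), round(t_statistic(0.5, 78), 3))
```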
Analysis and results of GLM procedure
Three models are statistically significant with Density after the application of PROC GENMOD in the SAS software [19]; they are given in Eqs. (7), (8), and (9). The graphical relations between predicted and observed data are shown in Fig. 1a, b, and c, respectively. In addition, Table 6 presents the parameter estimates of the three models.
Eq. (7): R2 = 0.877 and ║δ║ = 0.258
Eq. (8): R2 = 0.96 and ║δ║ = 0.142
Eq. (9): R2 = 0.962 and ║δ║ = 0.123
Investigation of the previous results shows that:
The third model in Eq. (9) is better than the other two, as it has a higher R2 and a lower ║δ║.
In the best model, the negative sign of the coefficient for R means that Density decreases as R increases. Curves with larger radii allow drivers to travel faster than curves with smaller radii, and they are safer for drivers. Thus, a larger R improves LOS on horizontal curves.
In the same model, the positive sign of the coefficient for AADT means that Density increases with traffic volume. Higher traffic volume makes the road section more crowded, so the number of vehicles per km clearly increases. This implies a decrease in LOS, which is a rational result.
In the same model, the positive sign of the coefficient for HV means that Density increases with HV. Horizontal curves with higher HV are more congested, and heavy vehicles force the drivers of passenger cars to decrease their speed. This implies a decrease in LOS, which is also a rational result.
In the best model, the negative sign of the coefficient for LC means that Density decreases moderately as LC increases. In other words, a wider lateral clearance encourages drivers to increase their speed on the horizontal curve, because maneuvering is easier on a wider road cross section, and Density therefore decreases. Consequently, a wider LC slightly improves LOS on horizontal curves. This result is consistent with logic.
Analysis and results of ANNs procedure
As a result of the correlation analysis, six independent variables are highly correlated with Density. These variables form the input layer. One hidden layer is used, and one output variable (Density) forms the output layer; 78 observations are used. The architecture of the ANN model is shown in Fig. 2. The number of neurons in the hidden layer is about half of the total number of neurons in the input and output layers (three neurons), which follows generally accepted practice in this field. The momentum learning rule is used, and the number of epochs (iterations) is set to 5000. These settings are suitable for quick convergence, as applied by Semeida [13, 22, 23]. Many trials are carried out to find the split between training and testing data that gives the best model performance for the present case. The performances of the best three trials for the training and testing datasets are presented in Table 7, which shows that trial two is the best one. In this trial, the curves are divided into a training dataset of 63 curves (about 80 % of all observations) and a testing dataset of 15 curves (about 20 % of all sites).
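The 6-3-1 architecture and the momentum learning rule can be illustrated with a small pure-Python sketch. Several details are assumptions because the source does not state them: tanh hidden units, a linear output unit, per-sample updates, a learning rate of 0.02, and a momentum of 0.7. The data are synthetic and the run uses 200 epochs rather than 5000 to keep the example quick; it is not a reproduction of the paper's model.

```python
import math
import random


def init_mlp(n_in=6, n_hidden=3, seed=1):
    """Random initial weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(w1, w2, x):
    """Return hidden activations and the network output for one input vector."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w1]
    y = sum(w * v for w, v in zip(w2, h)) + w2[-1]
    return h, y


def mse(w1, w2, data):
    return sum((forward(w1, w2, x)[1] - t) ** 2 for x, t in data) / len(data)


def train(w1, w2, data, epochs=200, lr=0.02, momentum=0.7):
    """Back propagation with a momentum term, updating weights in place."""
    v1 = [[0.0] * len(row) for row in w1]
    v2 = [0.0] * len(w2)
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(w1, w2, x)
            err = y - target  # gradient of 0.5 * err**2 w.r.t. the output
            # Hidden deltas use the output weights before they are updated
            deltas = [err * w2[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
            grad2 = [err * hj for hj in h] + [err]
            for j in range(len(w2)):
                v2[j] = momentum * v2[j] - lr * grad2[j]
                w2[j] += v2[j]
            for j, row in enumerate(w1):
                grad_row = [deltas[j] * xi for xi in x] + [deltas[j]]
                for i in range(len(row)):
                    v1[j][i] = momentum * v1[j][i] - lr * grad_row[i]
                    row[i] += v1[j][i]


# Synthetic, normalized data: 40 samples with six inputs each (not field data)
rng = random.Random(0)
data = []
for _ in range(40):
    x = [rng.random() for _ in range(6)]
    data.append((x, 0.5 - 0.4 * x[0] + 0.3 * x[1] + 0.2 * x[5]))

w1, w2 = init_mlp()
mse_before = mse(w1, w2, data)
train(w1, w2, data)
mse_after = mse(w1, w2, data)
print(mse_after < mse_before)
```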
As a result of the training and testing process, the observed versus predicted values are shown in Fig. 3. The ANN model clearly gives better and more reliable results than the GLM models. To measure the importance of each explanatory variable, the general influence (sensitivity about the mean or standard deviation) is computed from the trained weights of the ANN. If this value is higher for a given independent variable than for the others, the effect of that variable on the dependent variable (Density) is higher than that of the others. Fig. 3 also shows the sensitivity of each explanatory variable in the selected model. The most influential variable on Density is R, followed by HV and AADT. The relationships between each effective input variable and Density are shown in Fig. 4. The following results are concluded:
Density decreases as R increases. An increase of R from 134 m to 617 m leads to a decrease of Density from 15.7 pc/km/lane to 5.6 pc/km/lane. In other words, the LOS improves from C to A. The relation between the two variables is nearly linear and clearly inverse.
Density increases as HV increases. An increase of HV from 16 % to 22 % leads to an increase of Density from 9.8 pc/km/lane to 18 pc/km/lane. This implies that LOS deteriorates from B to D. Moreover, Density is nearly constant for HV below 16 %. This implies that the effect of HV on traffic movement starts above 16 %; below this value, the effect is weak.
Density increases as AADT increases. An increase of AADT from 8000 veh/day to 38,000 veh/day leads to an increase of Density from 7.7 pc/km/lane to 12.8 pc/km/lane. This indicates that LOS declines from B to C. The relation between the two variables in this range is linear. Density is nearly constant for AADT below 8000 veh/day, which indicates that AADT has little effect on traffic flow below this value.
Although the effect of the other variables (LC, DA, and e) on Density is limited, it is discussed for completeness. Density decreases as LC increases. An increase of LC from 1.4 m to 2.3 m leads to a decrease of Density from 10.5 pc/km/lane to 9 pc/km/lane, so LOS remains at B. The same conclusion holds for DA and e, as LOS remains at B over the range of each variable. Therefore, the effect of these variables on Density is weak and can be neglected.
In summary, Density increases with both HV and AADT. These results are rational and more accurate than those of the regression models.
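The idea behind sensitivity about the mean can be sketched generically: each input is swept around its mean while the other inputs are held at their means, and the spread of the model output is recorded. This is an illustrative reading of the procedure, not the exact computation performed by the ANN software; the sweep width, the number of steps, and the toy model are assumptions.

```python
import statistics


def sensitivity_about_mean(model, means, stdevs, steps=11, span=1.0):
    """Output standard deviation when each input sweeps mean +/- span * stdev.

    All other inputs are held at their mean values during each sweep.
    """
    result = []
    for i in range(len(means)):
        outputs = []
        for k in range(steps):
            x = list(means)
            x[i] = means[i] + span * stdevs[i] * (2.0 * k / (steps - 1) - 1.0)
            outputs.append(model(x))
        result.append(statistics.pstdev(outputs))
    return result


def toy_model(x):
    """Hypothetical stand-in for a trained network with three inputs."""
    return 3.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]


sens = sensitivity_about_mean(toy_model, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print([round(s, 3) for s in sens])
```

In this toy case the ranking of the sensitivities follows the magnitude of the coefficients, which mirrors how the variables are ranked by influence in the text.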
The current paper explores the impact of highway geometry, horizontal alignment, and traffic characteristics on density at multi-lane highways in Egypt. The most important findings of this paper are:
The ANNs procedure gives better, more reliable, and more logically consistent results than the GLM procedure in predicting Density.
The best ANN model gives R2 and ║δ║ equal to 0.993 and 0.102, respectively, whereas the best GLM model for the same horizontal curves gives R2 and ║δ║ equal to 0.962 and 0.123, respectively.
For the ANN model, the most influential variable on Density is R, followed by HV and AADT.
An increase of R from 134 m to 617 m leads to a decrease of Density from 15.7 pc/km/lane to 5.6 pc/km/lane. In other words, the LOS improves from C to A.
An increase of HV from 16 % to 22 % leads to an increase of Density from 9.8 pc/km/lane to 18 pc/km/lane. This implies that LOS deteriorates from B to D.
An increase of AADT from 8000 veh/day to 38,000 veh/day leads to an increase of Density from 7.7 pc/km/lane to 12.8 pc/km/lane. This indicates that LOS declines from B to C.
These results are important for road authorities in Egypt, who can use them to determine LOS for different horizontal curves and to improve their traffic performance in the future. They can also treat these results as a first step toward an Egyptian highway capacity manual. Future research should extend all aspects of this work using comprehensive field data that include more multi-lane rural roads and vertical alignment. In addition, the capacity of multi-lane highways is the main target of future research in Egypt.
[1] Transportation Research Board (TRB) (2010) Highway capacity manual (HCM), 1st edn. Transportation Research Board, National Research Council, Washington
[2] Abdul-Mawjoud A, Sofia G (2008) Development of models for predicting speed on horizontal curves for two-lane rural highways. Arab J Sci Eng 33(2):365–377
[3] Kerner B (2004) Three phase traffic theory and highway level of service. J Phys A 333:379–450
[4] Hoban C (1987) Evaluating traffic capacity and improvements to road geometry. World Bank, Washington
[5] Iwasaki M (1991) Empirical analysis of congested traffic flow characteristics and free speed affected by geometric factors on an intercity expressway. Transp Res Rec 1320:242–250
[6] Ibrahim A, Hall F (1994) Effect of adverse weather conditions on speed-flow-occupancy relationships. Transp Res Rec 1457:184–191
[7] Shankar V, Mannering F (1998) Modeling the endogeneity of lane-mean speeds and lane-speed deviations: a structural equations approach. J Transp Res A 32:311–322
[8] Manual IHC (1997) Department of public works. Directorate General Highways, Jakarta
[9] Reinfurt W, Zegeer C, Shelton B, Neuman T (1991) Analysis of vehicle operations on horizontal curves. Transp Res Rec 1318:43–50
[10] Arasan V, Arkatkar S (2011) Derivation of capacity standards for intercity roads carrying heterogeneous traffic using computer simulation. J Procedia Sci 16:218–229
[11] Sakai T, Yamada-Kawai K, Matsumoto H, Uchida T (2011) New measure of the level of service for basic expressway segments incorporating customer satisfaction. J Procedia Sci 16:57–68
[12] Shawky M, Hashim I (2010) Impact of horizontal alignment on traffic performance at rural two-lane highways. In: Proceedings of 4th International Symposium on Highway Geometric Design, Valencia, Spain, June 2–5
[13] Semeida AM (2013) New models to evaluate the level of service and capacity for rural multi-lane highways in Egypt. A Eng J 52(3):455–466. doi:10.1016/j.aej.2013.04.003
[14] Abdalla N (2010) Relationship between traffic flow characteristics and pavement conditions on roads in Egypt. M.Sc. Dissertation, Department of Civil Engineering, Faculty of Engineering, Al-Azhar University
[15] General Authority of Roads, Bridges and Land Transport (GARBLT) (2009) System of traffic counting data, El-Nasr St., Cairo, Egypt. http://www.garblt.gov.eg/. Accessed 12 Feb 2009
[16] Semeida A (2011) Analysis and evaluation of road safety in Egypt using conventional and non-conventional modeling techniques. Ph.D. Dissertation, Department of Civil Engineering, Faculty of Engineering, Port Said University
[17] Heikal A, Abdel-latif H (2009) Principals of traffic engineering. Alhakim press, Inc., (ISBN-977-19-7061-5), Cairo
[18] Varagouli A, Simos T, Xeidakis G (2005) Fitting a multiple regression line to travel demand forecasting: the case of the prefecture of Xanthi, Northern Greece. J Math Comp Model 42:817–836. doi:10.1016/j.mcm.2005.09.010
[19] SAS Institute Inc (2003) Version 9 of the SAS system for Windows. SAS, Cary
[20] SAS User's Manual (2003) (Chapter 29): the GENMOD procedure. SAS, Cary, pp 1365–1462
[21] Singh D, Zaman M, White L (2011) Modeling of 85th percentile speed for rural highways for enhanced traffic safety. No. FHWA 2211, Oklahoma Department of Transportation
[22] Semeida AM (2013) Impact of highway geometry and posted speed on operating speed at multilane highways in Egypt. J Adv Res 4(6):515–523. doi:10.1016/j.jare.2012.08.014
[23] Semeida AM (2014) Application of artificial neural networks for operating speed prediction at horizontal curves: a case study in Egypt. J Mod Transp 22(1):20–29. doi:10.1007/s40534-014-0033-3
[24] User's Manual "NeuroSolutions 7" (2010) NeuroDimension, Inc., Gainesville
[25] Voudris AV (2006) Analysis and forecast of capsize bulk carriers market using artificial neural networks. M.Sc. Dissertation, Massachusetts Institute of Technology, USA
[26] Tarefder RA, White L, Zaman M (2005) Neural network model for asphalt concrete permeability. M Civ Eng J 17:19–27
The author acknowledges the support of the General Authority of Roads, Bridges, and Land Transport, (GARBLT) for their assistance with the acquisition of traffic count data. Also, the author acknowledges Eng. Nasser Abdalla, Department of Civil Engineering, Faculty of Engineering, Al-Azhar University for his assistance with the acquisition of horizontal curve properties at sites under research.