 Original Paper
 Open Access
Effectiveness of link and path information on simultaneous adjustment of dynamic OD demand matrix
European Transport Research Review volume 6, pages 139–148 (2014)
Abstract
Introduction
The paper deals with the adjustment of the time-dependent origin–destination (OD) demand matrix, which is the fundamental input of ITS applications for traffic prediction. The usual problem is to search for temporal OD matrices that are “near” an a priori estimate (seed matrices) and that best fit traffic counts. However, information on link flows is not fully effective in describing the state of the network; recent vehicle-tracking technologies provide a new kind of information on route travel times that can complement the usual information on traffic flows at count sections.
Objective
The objective of the paper is to analyse the effectiveness of different types of information in the offline simultaneous adjustment of dynamic OD demand, starting from seed matrices with different degrees of reliability.
Introduction
Dynamic estimation of the origin–destination (OD) matrix provides a fundamental input for ITS systems, which need to identify the current traffic state and predict future traffic conditions in real time. Demand patterns vary from day to day, and congested networks are heavily affected by even small changes in OD demand flows. A high level of accuracy in demand estimation can therefore support successful ITS systems [1] as well as effective strategies for implementing route guidance, congestion pricing and network-based traffic signal control [2]. Moreover, knowledge of the spatio-temporal structure of demand is the necessary input for a dynamic traffic assignment model that simulates the evolution of congestion. Without correcting errors in OD demand estimation, the inconsistency in OD flows would accumulate and propagate in the traffic simulation process, making the network state estimation and prediction highly unreliable [3].
Usual methods for OD estimation combine some a priori information, such as historical OD matrices, with real-time traffic measurements. Since dynamic traffic assignment models for ITS applications require a very detailed representation of the OD matrix in time and space, the OD estimation problem is highly underdetermined. Any available information on demand structure can therefore help to reduce the complexity of the problem.
Information on prior OD matrices (the so-called “seed matrix”) is usually included in any formulation, both static and dynamic; however, unlike other measures, it is not directly observable [4], and solution procedures for demand adjustment usually do not take its quality into account [5].
Current technologies can provide a great amount of traffic data collected on links and nodes of the transportation network: pavement-embedded sensors, roadside radars and cameras provide measures of flows and speeds at nodes and along links; Automatic Vehicle Identification (AVI), ground-based radio navigation, cellular geolocation and GPS provide a new kind of information about travel times and route choices that complements the usual information on traffic flows at count sections. Moreover, it is well known that traffic counts are not fully effective in discerning between the congested and uncongested traffic states of a link, because the flow–density relationship is non-monotone: the same flow can be observed at a low density (free flow) or at a high density (congestion). Thus, it is important to formulate effective methods for OD estimation that combine several heterogeneous sources of information and to assess the relative importance of each of them. In addition, optimization methods can be applied to identify the best locations of measurement sections (see, for example, [6]).
Many authors have dealt with the problem of increasing the amount of information used by the dynamic OD estimation problem and have included, for example, speed and link occupancy [7–9], probe data from vehicles equipped with AVI tags [10–16], aggregate demand data such as traffic emissions and attractions by zone [8, 9, 17], total demand for subnetworks, or the temporal distribution of trips in some areas of the network.
In this paper we investigate the contribution of different kinds of information to improving the accuracy of time-dependent OD matrix estimation. Specifically, with respect to previous studies, we introduce information on travel times, which are assumed to be provided by a fleet of floating cars. In order to focus on the basic issues of the problem, we tackle the offline simultaneous estimation of time-dependent OD demand, which is the basis for a suitable development of ITS applications in an online context.
The paper is organized into five sections including this introduction: Section 2 reviews different methodologies developed in recent years for dynamic OD estimation and then defines the one adopted in the study; Section 3 presents the case study, while the results of the application are reported in Section 4; finally, Section 5 summarizes the main conclusions.
Problem formulation
Different approaches and solution algorithms have been developed in recent years for both offline and online dynamic OD estimation; the most recent contributions are reported in the following.
Zhou et al. [18] formulated the dynamic OD estimation problem as a single-level nonlinear optimization model, solved with a relaxation algorithm applied to the Lagrangian extension of the original problem, taking route choice into account in order to work in the path flow dimension. Frederix et al. [23] adopted a linear approximation of the relationship between OD flows and link flows that accounts for the non-separability of link flows. This approximation is obtained with the marginal computation (MaC) method, which performs a perturbation analysis in a computationally efficient way, using the principles of kinematic wave theory for traffic simulation. Toledo and Kolechkina [19] proposed a method based on a linear approximation of the assignment matrix; they applied different iterative algorithms, using a mesoscopic traffic simulation to perform network loadings. Djukic et al. [20] proposed the reduction and approximation of OD demand variables based on principal component analysis (PCA). The new transformed set of variables (demand principal components) is then updated online from traffic counts in a novel reduced state space model for real-time estimation of OD demand.
The problem of offline simultaneous estimation of temporal OD matrices is tackled in this paper by adopting a simulation approach, which avoids introducing assignment matrices [9]. The OD estimation problem is formulated as an optimization problem that minimizes a linear combination of the distance between estimated and a priori OD demand flows and the errors between detected and estimated traffic measurements in a dynamic (i.e., time-dependent) offline context. The objective function includes different kinds of data collected with different techniques: simple traffic counts and speed measurements detected at fixed road sections, and travel times measured on routes travelled, for example, by floating cars equipped with a GPS receiver and a cellular mobile transmitter. Adding speed measurements provides further information on the traffic regime, which makes it possible to distinguish between congested and uncongested conditions. The extent of such congested conditions can be captured further by adding travel time information.
Given a network $B = [N, A]$, where:
$N$: nodes
$A$: directed links
$n_{od}$: number of origin–destination pairs
$R$: routes connecting each OD pair
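the estimation problem can be written as:

$$\left(\mathbf{x}_{1}^{*},\ldots,\mathbf{x}_{n_h}^{*}\right)=\arg\min_{\mathbf{x}_{1},\ldots,\mathbf{x}_{n_h}}\left[f^{d}\left(\mathbf{x}_{1},\ldots,\mathbf{x}_{n_h};\mathbf{d}_{1},\ldots,\mathbf{d}_{n_h}\right)+f^{l}\left(\mathbf{y}_{1},\ldots,\mathbf{y}_{n_h};\widehat{\mathbf{y}}_{1},\ldots,\widehat{\mathbf{y}}_{n_h}\right)+f^{n}\left(\mathbf{z}_{1},\ldots,\mathbf{z}_{n_h};\widehat{\mathbf{z}}_{1},\ldots,\widehat{\mathbf{z}}_{n_h}\right)+f^{p}\left(\mathbf{w}_{1},\ldots,\mathbf{w}_{n_h};\widehat{\mathbf{w}}_{1},\ldots,\widehat{\mathbf{w}}_{n_h}\right)\right]\qquad(1)$$

where:
$f^{d}$: term of the objective function relative to the distance from the seed matrix
$\mathbf{x}_{i}$: estimated matrix for departure time interval $i$, $i = 1 \ldots n_h$
$\mathbf{d}_{i}$: seed matrix for departure time interval $i$, $i = 1 \ldots n_h$
$\mathbf{y}_{i}$: simulated information on link set $S$ for departure time interval $i$, $i = 1 \ldots n_h$
$\widehat{\mathbf{y}}_{i}$: collected measures on link set $S$ for departure time interval $i$, $i = 1 \ldots n_h$
$f^{l}$: term of the objective function relative to measures collected on links
$\mathbf{z}_{i}$: simulated information on node set $P$ for departure time interval $i$, $i = 1 \ldots n_h$
$\widehat{\mathbf{z}}_{i}$: collected measures on node set $P$ for departure time interval $i$, $i = 1 \ldots n_h$
$f^{n}$: term of the objective function relative to measures collected on nodes
$\mathbf{w}_{i}$: simulated information on route set $r$ for departure time interval $i$, $i = 1 \ldots n_h$
$\widehat{\mathbf{w}}_{i}$: collected measures on route set $r$ for departure time interval $i$, $i = 1 \ldots n_h$
$f^{p}$: term of the objective function relative to measures collected on routes.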
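The dependence between the simulated information in Eq. (1) and the estimated matrices is obtained directly by simulation, performing a dynamic traffic assignment (DTA), so that:

$$\left(\mathbf{y}_{1},\ldots,\mathbf{y}_{n_h},\mathbf{z}_{1},\ldots,\mathbf{z}_{n_h},\mathbf{w}_{1},\ldots,\mathbf{w}_{n_h}\right)=F\left(\mathbf{x}_{1},\ldots,\mathbf{x}_{n_h}\right)\qquad(2)$$

with $F$ = DTA.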
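As an illustration of how the objective function of Eq. (1) relies on the simulation mapping F, the following is a minimal Python sketch. The function `toy_dta`, the `INCIDENCE` matrix and all numerical values are hypothetical stand-ins (a real DTA is a nonlinear traffic simulation, not a linear map), and the plain squared-error distance is an assumed form, since the form of the functions f depends on the estimation framework.

```python
import numpy as np

# Hypothetical OD-to-link incidence: 3 OD pairs loaded onto 2 counted links.
INCIDENCE = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [1.0, 1.0]])

def toy_dta(x):
    """Stand-in for F = DTA (assumption): demand (n_h x n_od) -> simulated measures."""
    x = np.asarray(x, dtype=float)
    y = x @ INCIDENCE                  # link measures, shape (n_h, 2)
    z = x.sum(axis=1, keepdims=True)   # node measures, shape (n_h, 1)
    w = 5.0 + 0.01 * y[:, :1]          # route travel times, shape (n_h, 1)
    return y, z, w

def sq_dist(a, b):
    """Plain squared-error distance (an assumed form for each f term)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def objective(x, seed, y_hat, z_hat, w_hat, simulate=toy_dta):
    """Sum of the four terms f^d + f^l + f^n + f^p for a candidate demand x."""
    y, z, w = simulate(x)
    return (sq_dist(x, seed) + sq_dist(y, y_hat)
            + sq_dist(z, z_hat) + sq_dist(w, w_hat))

# Synthetic "true" demand for 2 departure intervals and 3 OD pairs.
true_x = np.array([[100.0, 50.0, 30.0],
                   [120.0, 60.0, 40.0]])
y_hat, z_hat, w_hat = toy_dta(true_x)   # measures generated from the true demand
seed = 1.2 * true_x                     # a biased a priori matrix

print(objective(true_x, seed, y_hat, z_hat, w_hat))  # only the seed term is non-zero
print(objective(seed, seed, y_hat, z_hat, w_hat))    # only the measurement terms are non-zero
```

Every evaluation of `objective` requires one run of the simulator, which is why the cost of the DTA dominates the computational effort of the adjustment.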
Lower bound and upper bound constraints can be introduced on demand to avoid infeasible solutions and to restrict the search space:
Other aggregate demand data can be introduced in the objective function or as a constraint. For example, Cipriani et al. [9] introduce constraints on traffic emissions by zones in order to prevent demand overestimation:
with:
$G_{o}^{*}$: a priori emission value for origin zone $o$
$G_{o}^{i}$: emission value for origin zone $o$ of the demand matrix $\mathbf{x}_{i}$.
In fact, overestimation of demand can produce a prolonged loading period in simulation (i.e., the network takes longer to clear) without significant changes in observed link flows or in the corresponding terms of the objective function. The constraint on generation compensates for the insensitivity of the objective function to these conditions.
Functions f depend on the particular estimation framework, on the type of estimator and on the available information [21].
The Generalized Least Squares (GLS) framework exploits additional information about the reliability of measurements; this information can be incorporated as a set of internal weights given by the variance–covariance matrix. In such a case, the $f^{d}$ function, for instance, assumes the following form:

$$f^{d}\left(\mathbf{x},\mathbf{d}\right)=\left(\mathbf{x}-\mathbf{d}\right)^{T}\mathbf{V}^{-1}\left(\mathbf{x}-\mathbf{d}\right)\qquad(5)$$

where $\mathbf{V}$ is the variance–covariance matrix of the vector of sampling errors affecting the estimate $\mathbf{d}$.
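A GLS-type distance term of this kind can be evaluated numerically as in the following minimal sketch. The demand values and the diagonal variance–covariance matrix are hypothetical; a diagonal V simply means that each squared deviation is divided by the variance of the corresponding seed entry.

```python
import numpy as np

def gls_distance(x, d, V):
    """Quadratic form (x - d)^T V^{-1} (x - d) for stacked demand vectors."""
    r = np.asarray(x, dtype=float).ravel() - np.asarray(d, dtype=float).ravel()
    # Solve V u = r rather than forming the explicit inverse of V.
    return float(r @ np.linalg.solve(np.asarray(V, dtype=float), r))

x = np.array([110.0, 45.0, 30.0])    # estimated demand (hypothetical)
d = np.array([100.0, 50.0, 30.0])    # seed demand (hypothetical)
V = np.diag([100.0, 25.0, 4.0])      # assumed variances of the seed entries

print(gls_distance(x, d, V))          # reliability-weighted distance
print(gls_distance(x, d, np.eye(3)))  # unweighted squared distance, for comparison
```

Entries of the seed matrix with a small variance (high reliability) are penalized more strongly when the estimate moves away from them.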
If this information is not available, the different terms of the objective function can be controlled using exogenous scalar weights representing the relative confidence of the analyst in the measurements (that is, speeds, flows and travel times) or in the a priori direct observation (that is, the seed matrix).
Information such as flows and speeds measured on links, as well as travel times from probe vehicles, has been included in the objective function (1) in this study. Data from links and routes are considered with different types of grouping to assess the impact of different network elements on the adjustment process. Generated trips have been included as an inequality constraint as in Eq. (4). Seed matrices with different degrees of reliability have been considered as inputs of the procedure in order to analyze different levels of uncertainty in the a priori demand estimation.
The procedure adopted to solve problem (1) is SPSA ADPI (Simultaneous Perturbation Stochastic Approximation, Asymmetric Design, Polynomial Interpolation), proposed by Cipriani et al. [8]. SPSA ADPI is a modification of the gradient-based path search optimization method that reduces the computational effort with respect to the usual gradient-based methods, which is essential for the simultaneous estimation of demand in real applications.
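At the generic iteration $k$, the algorithm computes the dynamic matrix for the next iteration $k + 1$ as:

$$\mathbf{x}_{k+1}=\mathbf{x}_{k}-a_{k}\,\overline{\widehat{\mathbf{g}}}\left(\mathbf{x}_{k}\right)\qquad(6)$$

where:
$a_{k}$: gain sequence at iteration $k$ of the OD estimation algorithm
$\overline{\widehat{\mathbf{g}}}\left(\mathbf{x}_{k}\right)$: the average approximated gradient at iteration $k$, calculated as the average of $M$ gradient approximations:

$$\overline{\widehat{\mathbf{g}}}\left(\mathbf{x}_{k}\right)=\frac{1}{M}\sum_{m=1}^{M}\widehat{\mathbf{g}}_{m}\left(\mathbf{x}_{k}\right)\qquad(7)$$

Each gradient approximation $\widehat{\mathbf{g}}_{m}\left(\mathbf{x}_{k}\right)$ is based on a simultaneous perturbation of all components; in the case of a one-sided simultaneous perturbation (Asymmetric Design, AD), this is:

$$\widehat{\mathbf{g}}_{m}\left(\mathbf{x}_{k}\right)=\frac{f\left(\mathbf{x}_{k}+c_{k}\boldsymbol{\Delta}_{m}\right)-f\left(\mathbf{x}_{k}\right)}{c_{k}}\left[\frac{1}{\Delta_{m}^{1}},\ldots,\frac{1}{\Delta_{m}^{n_v}}\right]^{T}\qquad(8)$$

where $f$ is the objective function of problem (1), $c_{k}$ is the perturbation step size at iteration $k$, and the distribution of the $n_v$-dimensional random perturbation vector $\boldsymbol{\Delta}_{m}$ (with $n_v = n_h \times n_{od}$) is subject to the condition that its components $\{\Delta_{m}^{j}\}$ are independent and symmetrically distributed around 0, with finite inverse moments $E\left(\left|\Delta_{m}^{j}\right|^{-1}\right)$ for all $m$, $j$.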
The gain sequence a_{ k } is computed using a Polynomial Interpolation (PI) of the objective function along the descent direction: at each iteration, the minimum point of the interpolating polynomial is taken as the suboptimal solution of the problem, as shown in Fig. 1.
Description of the experiments
Experiments have been conducted on the test network shown in Fig. 2, consisting of 8 links and 8 nodes, with three traffic signals (same cycle time and green time equally split between the incoming approaches) introduced to increase congestion.
In detail, the following information has been considered as input of the estimation process:

1. information on links: counts and measured speeds collected at 5 count sections on the network (Fig. 2);
2. information on routes: path travel times for different departure times of probe vehicles along one path connecting origin 2 to destination 4 (Fig. 2);
3. information on demand: previous demand matrices (seed matrices) with different degrees of reliability and aggregate demand data (generated trips).
Counts, measured speeds and path travel times have been generated by performing a dynamic user equilibrium assignment with DYNAMEQ [22], given a supposed “true” demand matrix, which is assumed to be unknown to the analyst. The total time horizon of the assignment is 50 min. The demand is characterized by three OD components (between centroids 2 and 4, 6 and 4, 3 and 4, Fig. 2) for a total of about 4,400 veh/h, and it has been divided into 5 time slices of 10 min each with a variable profile (Fig. 3). Only one OD component (from 2 to 4, Fig. 2) has a possible route choice. Information on links (traffic counts and measured speeds) is collected in every time slice, while path travel times have been collected only for the first three time slices.
Different seed matrices, representing possible a priori knowledge of demand, have been obtained by random perturbations of the “true” matrix.
The distance between the “true” matrix and the seed matrix, which represents the reliability of the latter, has been computed using the Relative Mean Error (RME) statistic:
where:
$d$: “seed” demand values
$x^{r}$: “true” demand values
$i$: time interval
$j$: OD pairs
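The formula of the RME statistic is not reproduced here, so the following Python sketch shows only one plausible reading of it: the sum of absolute deviations between seed and “true” demand over all time intervals i and OD pairs j, divided by the total “true” demand. Both this form and the numerical values are assumptions made for illustration.

```python
import numpy as np

def relative_mean_error(seed, true):
    """Assumed RME: sum_ij |d_ij - x^r_ij| / sum_ij x^r_ij."""
    seed = np.asarray(seed, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.abs(seed - true).sum() / true.sum())

# Hypothetical demand: rows are time intervals i, columns are OD pairs j.
true = np.array([[100.0, 50.0, 30.0],
                 [120.0, 60.0, 40.0]])
seed = 1.2 * true    # uniform 20 % overestimation of every entry

print(relative_mean_error(seed, true))
```

Under this reading, a seed matrix that overestimates every entry by 20 % has an RME of about 0.2, and an RME of 0 means that the seed coincides with the “true” matrix.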
In particular, 8 seed matrices have been generated with an RME value ranging from 0.16 (high reliability) to 0.68 (low reliability), as reported in Table 1.
In Fig. 3 the variable demand profiles of the “true” matrix and of some of the adopted seed matrices are reported: the differences between the profiles suggest the need to work not only on the value of the total demand, but also on its distribution between the time slices.
Four different objective functions (OF) have been defined, grouping the collected information in the following way:

OF1: distance between simulated flows and link counts plus distance between estimated demand and seed matrix;

OF2: distance between simulated flows and link counts, plus distance between simulated speeds and measured link speeds, plus distance between estimated demand and seed matrix;

OF3: distance between simulated flows and link counts, plus distance between simulated path travel times and measured path travel times from probe vehicles, plus distance between estimated demand and seed matrix;

OF4: distance between simulated flows and link counts, plus distance between simulated speeds and measured link speeds, plus distance between simulated path travel times and measured path travel times from probe vehicles, plus distance between estimated demand and seed matrix.
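The four groupings above can be summarized as a small configuration table. The sketch below is illustrative only: the term names, the unit default weights and the term values are hypothetical, and each term value is assumed to have been computed beforehand from a simulation run.

```python
# Terms included in each objective function, following the list above.
OBJECTIVE_TERMS = {
    "OF1": ("flows", "seed"),
    "OF2": ("flows", "speeds", "seed"),
    "OF3": ("flows", "travel_times", "seed"),
    "OF4": ("flows", "speeds", "travel_times", "seed"),
}

def evaluate(of_name, term_values, weights=None):
    """Weighted sum of the terms that belong to the selected objective function."""
    weights = weights or {}
    return sum(weights.get(term, 1.0) * term_values[term]
               for term in OBJECTIVE_TERMS[of_name])

# Hypothetical term values for one candidate demand matrix.
term_values = {"flows": 4.0, "speeds": 2.0, "travel_times": 1.0, "seed": 3.0}

for name in OBJECTIVE_TERMS:
    print(name, evaluate(name, term_values))
```

All four objective functions share the link-count and seed-matrix terms, so differences in the results can be attributed to the speed and travel-time terms.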
Results
Results of the SPSA ADPI, for different degrees of reliability of the seed matrix and using different types of information inside the objective function, demonstrate the effectiveness of the procedure, with improvements of the objective function up to 50 %. Smaller improvements are experienced if the seed matrices have very low reliability (RME≥0.6, Seed 7 and Seed 8), because of the large distance from the real demand.
The following figures highlight the effects of different kinds of real-time information on the accuracy of OD demand estimation, for different degrees of reliability of a priori information on OD demand. The best improvements are obtained using OF4; that is, when path travel times are considered together with measures of speeds and flows on link sections. In particular, objective function improvements higher than 40 % are obtained for seed matrix reliability ranging from RME = 0.2 to RME = 0.5 (from Seed 2 to Seed 6 in Fig. 4).
Only when RME is lower than 0.2 (Seed 1) does OF4 not give the best improvement: this is due to the very high reliability of the seed matrix, which leaves little room for additional information to improve the solution.
Intermediate reliability levels (0.2 ≤ RME ≤ 0.4, Seed 2 to 4) show similar behaviour in terms of sensitivity to information.
Analysing each term of the OFs, it can be underlined that OF2 and OF4 usually imply the highest improvement for the term relative to the distance between link flows and traffic counts (Fig. 5).
This means that link speed measures and path travel time data are very useful to obtain a better correspondence with traffic counts. OF2 and OF4 show very similar improvements of the term relative to the link speed (Fig. 6), which demonstrates the capability of the two types of information (link speeds and path travel times) to represent the different levels of congestion of the network.
Finally, regarding the path travel time term, OF3 and OF4 show similar improvements for low and medium degrees of reliability (RME ≤ 0.4, Seeds 2–4, Fig. 7). However, if RME is greater than 0.4, information on link speeds is no longer sufficient to reflect the experienced path travel times.
It is possible to deduce from the previous considerations that, for a given seed matrix reliability, the more information is added to the adjustment procedure, the more accurate the result. Of course, the accuracy of the estimation procedure can only be evaluated in laboratory experiments, where the true demand is known; this is not possible in the real world, where only traffic measures are known.
So, the accuracy of the resulting demand is evaluated in the following figures in terms of the reduction of the distance between estimated and real demand, for the different OFs adopted and for the different degrees of reliability of the seed matrix (Figs. 8, 9, 10 and 11).
When real-time information includes only link flows, as in the case of OF1 (Fig. 8), the improvement of the solution with respect to a priori information is lower than 10 %; if measured link speeds are also added to the objective function, as in the case of OF2 (Fig. 9), the improvement of the initial estimation exceeds 50 %, except at the boundary values of seed matrix reliability. This result highlights the importance of speed data for dynamic demand adjustment, as they make it possible to discriminate between congested and uncongested traffic conditions.
If data on path travel times are added to link counts instead of link speeds (OF3, Fig. 10), strong improvements are still obtained compared to using only link counts, although they are lower than those obtained using measured speeds (Fig. 9) and are concentrated in a small range of seed matrix reliability (RME from 0.2 to 0.35).
Finally, if information on links (flows and speeds) and information on path travel times are put together (OF4, Fig. 11), the improvement increases up to 66 % for RME equal to 0.35 and then decreases to about 2 % for RME equal to 0.68. So, speeds and path travel times add information to the adjustment process that brings the dynamic demand matrix closer to the real one; however, their effects do not seem to be additive.
In order to better understand this result, it is necessary to explore the characteristics of the measures adopted inside the adjustment procedure in detail:

regarding link measurements, we assume measures of flows and speeds on 5 count sections collected for 5 time intervals, for a total of 50 data (5 sections × 5 intervals × 2 measures), which cover information related to all the origin–destination components of the network (Fig. 2);

regarding path travel times, we assume only one path covered by probe vehicles, which covers information on only one origin–destination pair, measured from the origin for the first 3 time slices (that is, we assume that only a sample of vehicles is equipped with GPS devices and can be exploited as probes).
Measures are then normalized inside the objective function; i.e., the information is not weighted for its cardinality; however, link measures provide information on all the origin–destination components, while path travel times only on one origin–destination pair.
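The effect of normalization can be illustrated with a short Python sketch. The normalized form used here (sum of squared errors divided by the sum of squared observations) is an assumption, since the exact normalization is not stated, and the observed values are randomly generated placeholders.

```python
import numpy as np

def normalized_term(simulated, observed):
    """Assumed normalization: sum of squared errors over sum of squared observations."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sum((simulated - observed) ** 2) / np.sum(observed ** 2))

rng = np.random.default_rng(0)
link_obs = rng.uniform(200.0, 800.0, size=50)  # 50 link data (placeholder values)
path_obs = rng.uniform(5.0, 15.0, size=3)      # 3 path travel times (placeholder values)

# The same 10 % relative error on every measure gives the same term value,
# regardless of how many measures each group contains.
link_term = normalized_term(1.1 * link_obs, link_obs)
path_term = normalized_term(1.1 * path_obs, path_obs)
print(link_term, path_term)
```

With such a normalization, the 3 path travel times weigh as much in the objective function as the 50 link data, even though they carry information on a single origin–destination pair.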
Table 2 shows how the estimation error for the origin–destination pair followed by the probe vehicles (2–4 in Fig. 2) changes for different levels of seed matrix reliability and for different objective functions. It is interesting to note that the information on path travel times (OF3) leads to the largest improvements in the estimation of the flow on OD pair 2–4 when the seed matrix has RME values up to 0.35 (Table 2). For RME greater than 0.35, information on path travel times adds no improvement with respect to information on link flows only (OF1): this is also confirmed by Fig. 10 for the same range of RME. This last result can be explained by considering that, when a priori information has very low reliability, adding measurements on only one origin–destination pair and three departure time intervals is not sufficient to gain further information on the whole time-dependent OD demand matrix. In fact, path travel times for the last departure times are missing, and information on all the origin–destination pairs can be obtained only if speed measurements for the whole time period are added (OF4). Moreover, OF4 yields the best proximity (except for Seed 2) between real and estimated demand values for OD pair 2–4 (Table 2).
Finally, some remarks are reported about the convergence of the algorithm: SPSA ADPI shows good stability of the objective function after 200–300 iterations, as reported in Figs. 12 and 13, relative to OF2 and OF4, respectively.
Each iteration takes about 1 min on a Dual Core, 2.2 GHz machine; this means about 4 h are needed to solve the optimization problem for the test network reported in this study. If the dimension of the network increases, also computational times increase because of the time needed by the DTA simulator to generate simulated values of measures at each iteration. As a result, the procedure can be used only in offline context. However, the solution found can be exploited as first input for online applications in order to start with good initial demand values and good traffic flow patterns on the network.
Conclusion
The paper has presented a preliminary analysis of the contribution provided by different kinds of information to the estimation of the time-dependent OD demand matrix. Numerical experiments carried out on a test network demonstrated the importance of the type, quality and quantity of information in demand estimation.
The best improvements in demand adjustment are usually obtained when a sample of path travel time measurements is considered together with measures of speeds and flows on link sections. In fact, link speeds and path travel times make it possible to take into account traffic congestion, which affects the propagation of flow on the network and thus influences the time-dependent relationship between link counts and the OD demand matrix. Numerical experiments also highlighted the influence of the reliability of a priori information on the accuracy of the resulting OD estimation in combination with different information sets.
Further research will investigate the influence of the penetration rate of probe vehicles that provide information on path travel times, also considering larger networks; moreover, the effect of other kinds of measurements, such as density and occupancy, as well as point-to-point travel time data, which introduce additional information on network congestion, will be analysed.
References
1. Di Gangi M, Croce A (2005) Combining simulative and statistical approach for short time flow forecasting. Association for European Transport and contributors
2. Etemadnia H, Abdelghany K (2009) Distributed approach for estimation of dynamic origin–destination demand. Transp Res Rec: J Transp Res Board 2105:127–134
3. Zhou X, Mahmassani HS (2004) Recursive approaches for online consistency checking and OD demand updating for real-time dynamic traffic assignment operation. 84th Annual Meeting of the Transportation Research Board
4. Barceló J, Montero L, Marqués L, Carmona C (2010) A Kalman-filter approach for dynamic OD estimation in corridors based on Bluetooth and WiFi data collection. 12th WCTR
5. Bierlaire M, Crittin F (2004) An efficient algorithm for real-time estimation and prediction of dynamic OD tables. Oper Res 52(1):116–127
6. Cipriani E, Fusco G, Gori S, Petrelli M (2006) Heuristic methods for the optimal location of road traffic monitoring stations. Proc. of IEEE Intelligent Transportation Systems Conference, 2006. ITSC’06, pp. 1072–1077
7. Balakrishna R (2006) Offline calibration of dynamic traffic assignment models. PhD thesis, Massachusetts Institute of Technology
8. Cipriani E, Florian M, Mahut M, Nigro M (2010) Investigating the efficiency of a gradient approximation approach for solution of dynamic demand estimation problem. New Developments in Transport Planning: Advances in Dynamic Traffic Assignment, edited by Tampère, Viti and Immers
9. Cipriani E, Florian M, Mahut M, Nigro M (2011) A gradient approximation approach for adjusting temporal origin–destination matrices. Transp Res Part C 19(2011):270–282
10. Dixon M, Rilett LR (2002) Real-time OD estimation using automatic vehicle identification and traffic count data. Comput-Aided Civil Infrastruct Eng 17(2002):7–21
11. Eisenman SM, List GF (2004) Using probe data to estimate OD matrices. Intelligent transportation systems conference. Washington DC 3–6:291–296
12. Antoniou C, Ben-Akiva M, Koutsopoulos HN (2004) Incorporating automated vehicle identification data into origin–destination estimation. Transp Res Rec: J Transp Res Board 1882:37–44
13. Zhou X, Mahmassani HS (2006) Dynamic origin–destination demand estimation using automatic vehicle identification data. IEEE Trans Intell Transp Syst 7(1):105–114
14. Caceres N, Wideberg JP, Benitez FG (2007) Deriving origin–destination data from a mobile phone network. IET Intell Transp Syst 1(1):15–26
15. Barceló J, Montero L, Bullejos M, Serch O, Carmona C (2012) Dynamic OD matrix estimation exploiting Bluetooth data in urban networks. Recent researches in automatic control and electronics, ISBN: 9781618040800
16. Mitsakis E, Salanova JM, Chrysohoou E, Aifadopoulou G (2013) A robust method for real-time estimation of travel times for dense urban road networks using point-to-point detectors. Proceedings of the 92nd Annual Meeting of the Transportation Research Board, TRB 2013
17. Iannò D, Postorino MN (2002) A generation constrained approach for the estimation of O/D trip matrices from traffic counts
18. Zhou X, Lu C, Zhang K (2012) Dynamic origin–destination demand flow estimation utilizing heterogeneous data sources under congested traffic conditions. Available online at: http://onlinepubs.trb.org/onlinepubs/conferences/2012/4thITM/PapersA/0117000097.pdf. Accessed January 2013
19. Toledo T, Kolechkina T (2012) Estimation of dynamic origin–destination matrices using linear assignment matrix approximations. IEEE Trans Intell Transp Syst. doi:10.1109/TITS.2012.2226211
20. Djukic T, Flötteröd G, Van Lint HW, Hoogendoorn SP (2012) Efficient real time OD matrix estimation based on principal component analysis. Intelligent Transportation Systems (ITSC), 2012 15th International IEEE Conference, pp. 115–121
21. Cascetta E, Inaudi D, Marquis G (1993) Dynamic estimators of origin–destination matrices using traffic counts. Transp Sci 27(4):363–373
22. Florian M, Mahut M, Tremblay N (2006) A simulation-based dynamic traffic assignment model: Dynameq. In: Proceedings of the First International Symposium on Dynamic Traffic Assignment DTA2006. Institute for Transport Studies, University of Leeds
23. Frederix R, Viti F, Corthout R, Tampère CMJ (2011) New gradient approximation method for dynamic origin–destination matrix estimation on congested networks. Transp Res Rec 2263:19–25
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Keywords
 Demand adjustment
 Dynamic assignment
 Probe data
 SPSA algorithm