# An off-line map-matching algorithm for incomplete map databases

- Francisco Câmara Pereira^{1} (corresponding author)
- Hugo Costa^{1}
- Nuno Martinho Pereira^{1}

**1**:13

https://doi.org/10.1007/s12544-009-0013-6

© European Conference of Transport Research Institutes (ECTRI) 2009

**Received: **19 September 2008

**Accepted: **17 July 2009

**Published: **11 September 2009

## Abstract

The task of map-matching consists of finding a correspondence between a geographical point or sequence of points (e.g. obtained from GPS) and a given map. Due to many reasons, namely the noisy input data and incomplete or inaccurate maps, such a task is not trivial and can affect the validity of applications that depend on it. This includes any Transport Research projects that rely on post-hoc analysis of traces (e.g. via Floating Car Data). In this article, we describe an off-line map-matching algorithm that allows us to handle incomplete map databases. We test and compare this with other approaches and ultimately provide guidelines for use within other applications. This project is provided as open source.

## Keywords

## 1 Introduction

Map Matching algorithms are needed in any geographical system to associate information with specific geo-referenced locations. While we may obtain exact maps that represent any portion of the planet, dynamic information obtained from common Global Navigation Satellite Systems (GNSS) devices (e.g. GPS) almost always carries errors that may negatively affect its usefulness. In car navigation, for example, extreme care must be taken to continually locate the real position of the driver as opposed to what the GPS receiver estimates. Another example is Floating Car Data probes (i.e. vehicles that periodically report their GPS position), from which it is possible to obtain information on traffic situations [1]. These can be used to generate real-time information as well as to provide traffic analysis and forecasting (e.g. [2, 3]), or simply to analyse mobility behaviour within an area. A stronger example is dynamic toll charging, in which each vehicle is charged based on its profile, the roads used and/or its daily mileage. In any of these situations, accurate Map Matching algorithms become fundamental for the success of the applications. Furthermore, the analysis involved can be carried out in an off-line, post-processing manner. While on-line algorithms have recently evolved to their limits, essentially due to commercial car navigation applications, off-line approaches are still underexplored. At first sight, the former should be both a more challenging and a more generic task (solving the “real-time” problem often makes the post-processing solution simple), but a more careful examination shows that these are two different approaches to two entirely different problems. Real-time applications demand solutions that provide an instant response and can only rely on “past” points. This implies favouring performance over accuracy. Off-line applications, on the other hand, can take advantage of “future” points and allow for slower performance in favour of accuracy. As a result, on-line solutions applied on an off-line basis show extremely poor results, so specific research is needed to solve the latter problem.

The task of off-line map matching is to determine a correspondence between sequences of geo-referenced points previously obtained (e.g. from GPS) and a given map. The difficulty of the challenge is inversely proportional to the accuracy of the localization technology. Thus, it could be said that with Differential GPS or with Real Time Kinematics (RTK), which allow centimetre-level accuracy, the task becomes considerably simpler. However, these technologies still demand expensive receivers as well as a dedicated ground infrastructure, which reinforces the importance of the common off-the-shelf GPS solutions that are presently widespread and available. With noticeably less accuracy, other low-cost localization approaches are becoming common, such as cell-phone based localization (e.g. [4, 5]). For these, accurate Map Matching becomes a complex and decisive task.

Another aspect is that, for both on-line and off-line applications, the available maps are often incomplete due to the dynamics of road networks almost everywhere in the world. Direction changes, areas under construction, new roads, off-road tracks and road closures are just some examples of phenomena that happen on a daily basis. On roads that are absent from the map, Map Matching algorithms typically take some time to become aware of the gap. They stay “glued” to existing road links until they become too distant, and then typically enter an “initialization mode” that starts promoting a new match when sufficiently close to a recognized map link. However, the “new road” segment becomes blurred in this process. For applications that demand some accuracy this may affect results. There is at least one application on the market that covers some of these issues, TomTom Map Share. However, we should point out that the approaches are very different: this application focuses on *correction* of the map provided by TomTom (direction changes, areas under construction, etc.), as opposed to the *aggregation* of new roads or geometry *updates*. This is done through human intervention (as happens in OpenStreetMap.org [14]), and not fully automatically as in our project, YouTrace.

In this article, we propose M-GEMMA, an off-line Map Matching algorithm for incomplete maps. It is based on two other algorithms: an improved version of Marchal’s algorithm [6] that allows incomplete maps; and the GEnetic Map Matching Algorithm (GEMMA), an algorithm based on the evolutionary computation paradigm of Genetic Algorithms (GAs) that intends to overcome the main problems raised by Marchal’s approach. M-GEMMA was designed to combine the strengths of these two approaches and is to become a versatile Map Matching tool.

We implemented and tested a total of four algorithms (Marchal’s original and improved versions; GEMMA and M-GEMMA) and made a thorough comparison, which is reported in this article.

M-GEMMA’s source code is available with a “creative commons license” and its use is free. We hope to provide information in this article that can help on its application and comprehension. M-GEMMA, Improved Marchal and GEMMA were developed within the context of the YouTrace platform (which will also be made available as open source), a project that allows for the collaborative incremental construction of trajectory maps^{1}. The following section will provide an overview of the YouTrace project in order to provide some context to M-GEMMA.

The state-of-the-art of Map Matching is presented in Section 3, while Marchal’s algorithms (original and improved version) are described in Section 4. We then describe GEMMA in Section 5, with M-GEMMA finally presented in Section 6.

The experiments and a comparative analysis are presented in Section 7, and we conclude with a summary of our views on the strengths and weaknesses of the algorithm.

## 2 Giving some context: the YouTrace project

The YouTrace project intends to be a social platform that allows users to collaborate in the construction of a map-of-the-world [7] (Fig. 1). A key element is the Map Generation Engine, which is responsible for aggregating the users’ traces into a single map. A YouTrace user can upload their traces and thereby contribute to the construction of a joint map of the world. Users can then receive an updated map that will allow, for example, a more efficient car navigation application. An innovative characteristic of collaborative mapping is the strength of its dynamics, as opposed to current static maps: roads are constantly being updated and aggregated as new traces are introduced. The collected traces can then provide information for more efficient route planning, as they are a useful and realistic source of information about road/trajectory usage, average speeds and user preferences among road alternatives. Besides providing a dynamic map of the world, YouTrace can also be a useful source of information about users’ mobility and city dynamics. This information can be extremely valuable to urban planners, as they can base their planning decisions on more realistic information (as opposed to surveys or probabilistic reasoning). YouTrace users can access the system through a web portal that is responsible for feeding the Map Generation Engine with traces, which, in turn, are added to the map.

The first step of trace processing is filtering. The filtered trace is then passed to the Map Matching stage, where GPS points are matched to the map in order to find the segments that already exist on it. The matched points of the trace are used to update the existing segments on the map, which improves road precision. The non-matched points of the trace are aggregated onto the map, creating new roads (or trajectories). Two databases are generated from this process: the map database and the statistics database. These databases provide data for external services such as route planning and traffic analysis. For more information on the YouTrace project, please refer to [7]. As can be understood, Map Matching is a key element of YouTrace, as it is entirely responsible for finding the parts of the trace that already exist on the map. This allows for the distinction between the parts that should be aggregated and those that should be updated; the quality of the final map therefore depends on the quality of the match.

## 3 Current trends on map-matching

Map-Matching algorithms are used to fix location data onto a spatial road network. They are used in a wide variety of applications. The most common are notably GPS car navigation devices, which constantly indicate the road segment where the user is located based on information retrieved from GPS satellites. The purpose of a Map-Matching algorithm can be divided into two parts. Firstly, the algorithm determines which road segment, from a given network, corresponds to each given position. Afterwards, it determines the exact location of that position within the previously selected segment [8, 9].

### 3.1 Geometric algorithms

### 3.2 Topological algorithms

Since maps are usually represented as graphs, topological algorithms tend to preserve continuity in the matching, avoiding frequent errors. However, they generally ignore additional data available from GPS readings, such as speed or heading, and might be sensitive to outliers as well. One example of a topological algorithm is Marchal’s algorithm [6], which will be explained below in detail. These algorithms can generally be divided into two stages. The first is the initial matching process, where the algorithm selects the most suitable link from those closest to the initial points. At the second stage, the algorithm continues matching the points while keeping the network topology in consideration. In [8], the author also adds that these kinds of algorithms have some problems at certain junctions where the direction of links is not similar. This can only be solved via a sub-routine that selects the appropriate subsequent road segment. Since this routine runs in a post-processing mode, these algorithms tend to be useless in real-time applications.

### 3.3 Probabilistic algorithms

### 3.4 Advanced algorithms

The advanced algorithms use a wide variety of techniques and approaches, either on their own, combined with the simpler algorithms described above, or as a combination of several Map-Matching algorithms. The major goal is always to improve the accuracy of the matching. Aside from GPS coordinates, these algorithms are often aided by extra information such as speed, heading, connectivity of the road map, quality of the input data or even error corrections from third-party systems (e.g. Differential GPS). The approaches most used here are fuzzy logic models, Dempster–Shafer’s mathematical theory of evidence, the Multiple Hypothesis Technique (MHT) and Bayesian inference. Kalman Filters and Extended Kalman Filters are widely used as well, especially to integrate the data from GPS and from dead reckoning (DR) systems or, in other cases, to smooth the GPS data before proceeding to the matching.

In every algorithm, the accuracy of the matching depends heavily on map resolution and completeness: the higher the resolution, the more accurate the matching. Some comparisons of map resolutions have been made in [6, 11]. By default, the majority of the algorithms assume that the map network is complete and that it is always possible to complete the matching. However, this assumption is often incorrect, and algorithms might show unexpected behaviour where there are no roads nearby to match.

## 4 The Marchal algorithm—an improved version

### 4.1 Overview of the Marchal algorithm for offline map matching

The algorithm presented in [6] is an off-line topological algorithm inspired by the MHT used in previous algorithms [10]. The authors state that, unlike the remaining algorithms, theirs focuses on computational speed rather than on accuracy. It uses only GPS coordinates to perform the matching in a road network represented by a directed graph.

The algorithm works as follows: firstly, map links near the first trace point are picked, and each one constitutes a scored path candidate (a possible match sequence). The score of each candidate is the sum of the shortest Euclidean distances between each trace point and its matched link (also called the matching distance). Hence, the best candidates have the lowest scores.

After the initial matching process, for each point of the trace, it is assumed that the current point matches the last link of each candidate. The score is then updated and the candidate is put into a new set. Afterwards, if the trace has reached the end of the link, new path candidates are created. To see if an intersection has been reached, a comparison is made between the length travelled through the trace points and the length travelled through the links of the path candidate. If the first length is longer than a given percentage of the second, it is assumed that the next junction has possibly been reached^{2}. The authors fixed this percentage at 50%, which in their opinion tends to give fair results. New candidates are then created as copies of the current one, with one new candidate for each road segment starting at that junction, whose link is appended to the copy. Their scores are updated and they are inserted into the new set. When no more candidates are available to match the current point, the algorithm keeps only the best N candidates of the new set, moves to the following trace point and repeats the whole procedure. The authors tested some values for N and, based on these experiments, they say that with values above 30, improvements in matching accuracy are insignificant. The best candidate obtained gives the final match. This algorithm has some additional mechanisms that permit breaking the match and restarting it from scratch when the distance between two consecutive points, or the difference between their timestamps, is above given thresholds. The authors use, as an example, 300 m for the distance and 30 s for the time difference.
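The three decision rules described above (the junction test, the pruning to the best N candidates and the restart condition) can be sketched as follows. This is only an illustrative sketch, not the code from [6]: the `PathCandidate` class, the function names and the way lengths are passed in are all assumptions made for the example, and only the numeric defaults (50%, N = 30, 300 m, 30 s) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathCandidate:
    """Hypothetical container for one candidate match sequence."""
    links: List[str]            # link identifiers in travel order (illustrative)
    score: float = 0.0          # sum of matching distances; lower is better
    path_length: float = 0.0    # length travelled along the candidate's links (m)


def junction_possibly_reached(trace_length: float, path_length: float,
                              ratio: float = 0.5) -> bool:
    """True when the trace length exceeds `ratio` of the candidate's link length."""
    return trace_length > ratio * path_length


def keep_best(candidates: List[PathCandidate], n: int = 30) -> List[PathCandidate]:
    """Keep only the n lowest-scoring candidates before the next trace point."""
    return sorted(candidates, key=lambda c: c.score)[:n]


def should_restart(gap_m: float, gap_s: float,
                   max_gap_m: float = 300.0, max_gap_s: float = 30.0) -> bool:
    """Break the match when consecutive points are too far apart in space or time."""
    return gap_m > max_gap_m or gap_s > max_gap_s
```

Keeping the rules as small pure functions makes each threshold easy to vary when comparing settings, which is how the values of N are said to have been explored.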

### 4.2 Our implementation and modifications

*tolerance link* (see Fig. 7). The idea is to provide the possibility of accurately matching a point to a link without the need of the previous link being matched with any point. This allows a reduced number of non-matched links to be included in the matched output without affecting the accuracy of the overall match (since these links can only exist between two matched ones, there is a high probability that the user has passed over this area). This also helps to avoid the need of the matching process to stop and restart unnecessarily. At this moment, the number of *tolerance links* is fixed to 2.

Due to all these new situations, we decided not to include the test of whether a jump to the following link has occurred. Instead, at each trace point we create a set of new candidates for each current candidate. Each new path candidate has a distinct link on which to perform the matching. The links are as follows: the link that matched the previous point of the candidate it derives from, and all links reachable up to a maximum depth of the number of *tolerance links* plus one. After all the new candidates are scored, they are filtered before passing to the following point. Two restrictions are applied: the first avoids similar path candidates, which would only increase the number of candidates exponentially without bringing any improvement; the second invalidates matches that, although topologically correct, are far away from the trace points, in which case we assume a new road segment instead. For the first restriction, only the best candidate passes per most recently matched super-link. This way, the number of candidates is drastically reduced while the best possible candidates remain available. For the second restriction, the distance between the last trace point and its matched link must be lower than a given threshold (fixed at 45 m). All candidates whose last match is above this threshold are simply removed, so that, in unexpected and rare situations, they cannot be considered better than the “real” accurate ones on the following points.
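A minimal sketch of the two filtering restrictions is given below. The `Candidate` tuple and the function name are hypothetical, and the order in which the restrictions are applied is an assumption: the distance test runs first here so that a super-link is not lost merely because its best-scoring candidate happens to be too far from the trace.

```python
from typing import Dict, List, NamedTuple

MAX_MATCH_DISTANCE_M = 45.0  # threshold quoted in the text


class Candidate(NamedTuple):
    """Hypothetical summary of a path candidate after scoring."""
    super_link: str        # id of the most recently matched super-link (illustrative)
    score: float           # accumulated matching distance; lower is better
    last_distance: float   # distance (m) from the last trace point to its matched link


def filter_candidates(candidates: List[Candidate],
                      max_distance: float = MAX_MATCH_DISTANCE_M) -> List[Candidate]:
    """Drop far-away matches, then keep the best candidate per super-link."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        # Restriction 2: the last match must be closer than the threshold.
        if cand.last_distance >= max_distance:
            continue
        # Restriction 1: only the lowest score survives for each super-link.
        current = best.get(cand.super_link)
        if current is None or cand.score < current.score:
            best[cand.super_link] = cand
    return sorted(best.values(), key=lambda c: c.score)
```

Because at most one candidate survives per super-link, the number of candidates carried to the next point is bounded by the number of super-links reachable from the current position.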

## 5 GEMMA—GEnetic Map Matching Algorithm

### 5.1 Overview

We created a new genetic algorithm because we did not find any reference in the Map-Matching literature to algorithms that use evolutionary approaches and that would meet our expectations. We knew that in terms of computational performance it would be less efficient than other types of algorithm, especially Marchal’s. The main concern was improving the quality of the matches. The goal was ultimately to design an algorithm that would not have the problems commonly seen in other algorithms, as described in the state-of-the-art section (e.g. matching errors due to topological situations or outliers), and that could also perform smooth transitions from a matched segment to an unmatched one and vice versa, as well as between two matched segments that are not yet interconnected. In our genetic algorithm^{3}, each individual consists of a matching sequence (from the beginning to the end of the trace). Each gene corresponds to a trace point. The possible alleles for each gene are the links that are close to the respective trace point. A special value is also included to allow no match to be performed for the given point. Starting from a randomly created initial population, the algorithm runs for a given number of generations, and the best individual of the last population is considered to be the correct match. In each generation, individuals may be recombined and mutated. Afterwards, they are evaluated using a fitness function that considers many factors (described below). Since the search space (and thus the program complexity) increases exponentially with trace length, we decided to break the trace into small segments, inspired by [12]. In doing so, better individuals are obtained in less time. The break points are selected based on a score function that prefers less crowded areas, notably away from junctions.

### 5.2 Segmentation

As previously mentioned, the segmentation was inspired by [12] in order to speed up the algorithm and to obtain better results. Before starting the matching process, the algorithm segments the trace in the following manner: firstly, it scores every trace point, with the lowest scores representing the areas that are least ambiguous for the matching process; then, it looks for sets of four consecutive points whose scores are under a given threshold (fixed at 0.9); finally, using these sets as segment borders, it tries to form the widest segments available, restricted to a maximum number of points per segment (currently fixed at 50). The score is the sum of four distinct variables, which are normalized according to their units. The variables are: the difference in heading between the point and the closest map link; the distance between the point and that same link; the number of map links near the trace point (within a defined distance of 20 m); and the heading variation in the neighbourhood of the trace point: the tighter the curves of the trace, the larger the heading variation.
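The border search and the segment-size limit can be illustrated with the sketch below, which takes the per-point scores as given. The greedy strategy (cut at the farthest border within the limit) and the hard cut used when no border is available are assumptions; the text does not say how the widest segments are chosen or what happens when no border exists within 50 points. Function names are hypothetical.

```python
from typing import List, Tuple


def border_indices(scores: List[float], threshold: float = 0.9,
                   run: int = 4) -> List[int]:
    """Start index of every window of `run` consecutive scores below `threshold`."""
    return [i for i in range(len(scores) - run + 1)
            if all(s < threshold for s in scores[i:i + run])]


def segment_trace(scores: List[float], threshold: float = 0.9, run: int = 4,
                  max_points: int = 50) -> List[Tuple[int, int]]:
    """Greedily split a trace into half-open (start, end) index ranges."""
    n = len(scores)
    borders = border_indices(scores, threshold, run)
    segments: List[Tuple[int, int]] = []
    start = 0
    while start < n:
        limit = min(start + max_points, n)
        if limit == n:
            cut = n  # the rest of the trace fits in one segment
        else:
            # Farthest low-ambiguity border that still respects the size limit.
            reachable = [b for b in borders if start < b <= limit]
            # Assumption: fall back to a hard cut when no border is reachable.
            cut = max(reachable) if reachable else limit
        segments.append((start, cut))
        start = cut
    return segments
```

For instance, with ten ambiguous points, four unambiguous ones and ten more ambiguous points, and a limit of 20 points per segment, the sketch cuts once, at the start of the unambiguous run.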

### 5.3 Link candidates

For each trace point, a set of link candidates is available as alleles. A special allele is also included to allow the point to remain unmatched. At this time, the link candidates can be collected according to two distinct methods. The first takes them directly from the map: links in the area of each point are picked up, and then the closest one per super-link is selected. The maximum distance allowed is set at 20 m. Links whose heading is opposite to that of the trace point are discarded in order to avoid matches with roads running in the opposite direction. In the second method, Marchal’s algorithm is first run to search for candidates. The candidates of each point are all the links to which that point was matched at any time during the run of that algorithm. Afterwards, the candidates are filtered using the same rules applied in the first method (maximum distance of 20 m, opposite heading and one link per super-link). This second approach improved results, especially close to junctions, with the first versions of the fitness function, but at the moment the difference between the two methods is minimal.

### 5.4 Individual representation

Individuals of the population are match candidates for a given trace. Each gene of an individual corresponds to a trace point, and each gene has its own fixed set of alleles, which is the same for every individual. Candidates are ranked according to a fitness function.

### 5.5 Fitness function

*tolerance link* also exists here (see Fig. 7). In fact, it was first created for this algorithm and was later adapted to our implementation of Marchal’s algorithm. Another parameter consists of the sum of distances between two links when it is not possible to reach the second from the first. The number of *tolerance links* used in the genetic algorithm is currently 5, so if it is not possible to go from one link to the other within a maximum depth of 6, the Euclidean distance between the end point of the first link and the start point of the second link is added to this parameter. In order to keep matching continuity as much as possible, another parameter stores the sum of the squares of the matched lengths of each segment. Since we want to minimize the score, the inverse of the obtained value is used. Finally, the last two parameters are used to smooth the transitions between matched and unmatched areas and vice versa. One of these parameters stores the average distance between an unmatched trace point and the following matched link, or between the last matched link and the following unmatched trace point. The other stores the maximum of these distances. Each parameter is weighted, thus allowing for different orders of importance. For example, we prefer continuity to geometric proximity, so the three parameters that measure the unmatched lengths, the distances between unreachable links and the squares of the continuously matched lengths have the highest weights (Fig. 8).

### 5.6 Running the algorithm

For each segment of the trace, the algorithm generates a random population. Each gene has a roulette wheel with its respective alleles. Every allele has the same probability, except for the special one that represents an unmatched situation, which has a fixed probability of 15%. In each generation, the population may be recombined and mutated. Pairs of individuals are selected using the tournament selection method and are recombined with a probability fixed at 75%. The recombination uses one-point crossover, with a randomly selected point. Afterwards, each gene of each individual in the population has a slight probability of being mutated (0.5%). The new link is picked randomly from the corresponding roulette wheel built at the beginning. From one generation to the next, we decided to carry over the best individuals without recombination or mutation: 3% of the population passes directly, which corresponds to six individuals (since the population size is two hundred). The algorithm has a single stop condition: it stops when the best individual has not changed for a given number of generations. Currently, this number is three hundred.
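The loop described above can be sketched as a small generic genetic algorithm. This is an illustrative sketch rather than the GEMMA source: the fitness function is supplied by the caller (lower is better), the tournament size `k=2` is an assumption because the text does not state it, and all function and parameter names are hypothetical. The numeric defaults are the ones quoted in the text.

```python
import random

UNMATCHED = None  # special allele: the trace point is left unmatched


def pick_allele(links, rng, p_unmatched=0.15):
    """Roulette wheel: UNMATCHED with fixed probability, else a uniform link."""
    if not links or rng.random() < p_unmatched:
        return UNMATCHED
    return rng.choice(links)


def tournament(population, scores, rng, k=2):
    """Return the best of k randomly drawn individuals (k is an assumption)."""
    contenders = rng.sample(range(len(population)), k)
    return population[min(contenders, key=lambda i: scores[i])]


def evolve(alleles, fitness, rng, pop_size=200, p_unmatched=0.15,
           p_cross=0.75, p_mut=0.005, elite_frac=0.03, patience=300):
    """Evolve one trace segment; alleles[g] lists the candidate links of gene g."""
    n_genes = len(alleles)
    population = [[pick_allele(a, rng, p_unmatched) for a in alleles]
                  for _ in range(pop_size)]
    n_elite = max(1, round(elite_frac * pop_size))
    best, best_score, stale = None, float("inf"), 0
    while stale < patience:  # stop once the best individual stops improving
        scores = [fitness(ind) for ind in population]
        order = sorted(range(pop_size), key=lambda i: scores[i])
        if scores[order[0]] < best_score:
            best, best_score, stale = list(population[order[0]]), scores[order[0]], 0
        else:
            stale += 1
        # Elitism: the best individuals pass unchanged to the next generation.
        next_pop = [list(population[i]) for i in order[:n_elite]]
        while len(next_pop) < pop_size:
            a = list(tournament(population, scores, rng))
            b = list(tournament(population, scores, rng))
            if n_genes > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, n_genes)  # one-point crossover
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                for g in range(n_genes):
                    if rng.random() < p_mut:  # per-gene mutation
                        child[g] = pick_allele(alleles[g], rng, p_unmatched)
                next_pop.append(child)
        population = next_pop[:pop_size]
    return best, best_score
```

A call such as `evolve(alleles, fitness, random.Random(42))` returns the best individual found and its score; passing an explicit `random.Random` instance keeps runs reproducible while tuning parameters.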

### 5.7 Observations

The output of the algorithm has the same form as the one presented above: a set with the matched paths and another with the unmatched trace points. These are built from the best individuals of each trace segment. As is in its nature, the genetic algorithm itself has undergone some evolution over time, especially in regard to the fitness function. Consequently, all the thresholds discussed here were defined based on observations in our set of traces, after a relatively large number of experiments (three months of daily tests, in which we refined both the algorithm and the parameters), meaning that new situations can always come up and new improvements to the fitness function or some threshold adjustments might be needed. The same happens with the remaining parameters of the algorithm, including crossover and mutation rates, and population size. These values have not been changed as frequently as the fitness function, and it is less likely that they will need new modifications. Adjustments were made only after several tests showed that they could improve the quality of the matching process. We currently have a set of traces with a total of 526,728 points, which corresponds to an approximate length of 11,486 km throughout Portugal (essentially the central area).

## 6 M-GEMMA—joining the best from two worlds

For areas where Marchal’s forwards and backwards runs produce different matches (and sets of candidate links), the whole set of candidate links for every participating point is added to a list. After testing all points, segments are created from consecutive points that are on the list. For each segment, two unambiguous trace points are added at the borders (to force the start and end of the segment to “fit” into the remaining matches). This way, GEMMA can guarantee the continuity of the match and avoid topological errors. Candidates running on GEMMA are thus taken from the output of both of Marchal’s runs. Since continuity is guaranteed by GEMMA, the output for the ambiguous areas fits automatically into the remaining map. With this integration, some parameters of GEMMA had to be adapted, namely the population size, the number of generations without change in the best individual that leads the algorithm to stop, and some weights in the fitness function. Since the new segments are commonly very small, the population size is fixed at fifty individuals and the number of stabilized generations necessary for stopping is set at seventy-five. This helped the algorithm find the best individual in the first few generations. Processing time became longer than running Marchal’s algorithm alone, since we have to run it twice per trace and run GEMMA on some segments. Despite that, results show a gain in quality that justifies the loss of performance, and as the map becomes more complete, fewer ambiguous segments have to be run on the genetic algorithm, thus further reducing the time.
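The detection of ambiguous segments can be sketched as a comparison of the two runs point by point. The sketch below assumes each run is available as a list with one matched link identifier per trace point (`None` for unmatched), and it reads "two unambiguous trace points are added at the borders" as one point on each side of the segment; both the data layout and that reading are assumptions, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def ambiguous_segments(forward: List[Optional[str]],
                       backward: List[Optional[str]]) -> List[Tuple[int, int]]:
    """Inclusive (start, end) index ranges where the two runs disagree.

    Each range is widened by one agreeing point on each side, when available,
    so that the segment handed to the genetic algorithm is anchored.
    """
    if len(forward) != len(backward):
        raise ValueError("both runs must cover the same trace points")
    n = len(forward)
    segments: List[Tuple[int, int]] = []
    i = 0
    while i < n:
        if forward[i] == backward[i]:
            i += 1
            continue
        j = i
        # Extend over consecutive points on which the runs disagree.
        while j + 1 < n and forward[j + 1] != backward[j + 1]:
            j += 1
        segments.append((max(i - 1, 0), min(j + 1, n - 1)))
        i = j + 1
    return segments
```

Only the returned ranges would need to be passed to the genetic algorithm; everywhere else the agreed match from the two runs can be used directly.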

## 7 Experiments and comparative analysis

Having implemented the four algorithms described, it is necessary to find which one is best suited to the objectives and produces the best results. Each algorithm has benefits and drawbacks, related either to computational speed or to matching accuracy.

The base map to work with was extracted from OpenStreetMap.org [14]. This choice was due to several reasons: it is open source and freely available; it is partially complete; in covered areas, it is comparable to TeleAtlas or NavTeq commercial solutions in terms of accuracy and completeness. For the sake of the experiments, we are confident that this choice is as valid as any other map database available (commercial or not). At most, it could be said that OpenStreetMap is globally less complete and more imprecise than those other professional databases, which becomes more of a challenge for our purposes.

For the first experiment, we initially built a base map with YouTrace from 225 km of traces in the area of Coimbra, resulting in 730 intersections and 15,901 links (1,264 super-links). We applied both algorithms to match 11 traces with a total of 16 km. On an Intel Core™ 2 Duo processor running at 2.2 GHz with 2 GB of RAM, the Improved Marchal algorithm took 0.171 s to determine the entire match. M-GEMMA took 0.874 s to do the same task, of which 0.468 s were necessary for the GEMMA part to process 152 ambiguous points over a total of 1.363 km. On average, M-GEMMA needed roughly five times the processing effort of Marchal’s approach (0.874 s against 0.171 s) in areas with a high density of intersections.

More experiments would be necessary (in other cities, rural areas, areas with plenty of multi-path effect, etc.) to achieve more conclusive results. However, these results are coherent with the experience we had during the development of the algorithms and with other experiments. We are also aware that a thorough algorithmic complexity analysis is needed in order to present a more explicit view of the efficiency involved. On a first analysis, the running time of Marchal’s algorithm grows linearly with the size of the trace, with a quadratic component for the local search of Euclidean distance. GEMMA runs in *O*(*n* · *p* · *m* · *g*) time, with *n* being the trace size, *p* the population size, *m* the average number of alleles and *g* the number of generations. M-GEMMA corresponds to a combination of these two measures. This, however, is a naive analysis, since no attention is given to aspects such as the distribution of segmentation, sensitive areas or other parameters of any of the algorithms.

In terms of map-matching accuracy, M-GEMMA still has its limitations. For example, the problem of parallel roads is only partially solved. The Marchal part of the algorithm does prevent bouncing between two roads; however, the proximity between the roads leads M-GEMMA to choose between two options: to make the entire match (incorrect option) or not to make any match (correct option). The fine-tuning of this system is thus complicated: if we make it too restrictive (small distance tolerance), it becomes resistant to “parallel roads”, but then it will often wrongly report unmatched segments that could easily be properly processed. Making it too loose gives the opposite result. The GPS NMEA protocol provides Dilution of Precision estimates (HDOP, VDOP) and SNR (Signal-to-Noise Ratio) values, but curiously these values are not consistent among different receivers with respect to the quality of the trace. For the same DOP or SNR values, we have observed very different trace qualities across the 4 GPS receivers tested. In the case of YouTrace, we rely on statistics to distinguish between an error and a parallel road (with many traces, two centrelines should gradually emerge from the “statistical evidence”). The problem with parallel roads increases drastically when several road platforms lie on top of each other, as happens at the entrances and exits of highways (although, normally, geometry helps to distinguish the correct solution).

Regarding time performance and complexity of the algorithms, we knew from the beginning that, in terms of speed, GEMMA would perform poorly due to its nature. Some modifications were made, such as the creation of the upper layer of the map in order to speed up some map searches, which benefitted Marchal’s algorithm as well. Despite these and other minor modifications, GEMMA alone remains slow. Marchal’s algorithm, in turn, was sped up by some further modifications. The difference in running time between the two algorithms on the same data set is noticeable.

## 8 Applying M-GEMMA to YouTrace

Regarding the inclusion of these algorithms in YouTrace, the large base map from above (225 km, 730 intersections) needed 186 s to be generated, whereas 125 s were necessary with Improved Marchal. The results differed, however: Marchal’s algorithm failed in some areas. In a different test, on a set of 87,005 points corresponding to approximately 1,392 km of trace, the matching process took 20 s with Marchal’s algorithm (either version), approximately one hour with GEMMA and 79 s with M-GEMMA.
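
To put the second test in perspective, the reported figures can be converted into approximate throughputs. These are derived only from the numbers above, taking “approximately one hour” as 3600 s:

$$
\frac{87\,005}{20\ \text{s}} \approx 4350\ \text{points/s (Marchal)}, \qquad
\frac{87\,005}{79\ \text{s}} \approx 1101\ \text{points/s (M-GEMMA)}, \qquad
\frac{87\,005}{3600\ \text{s}} \approx 24\ \text{points/s (GEMMA)}
$$

On this data set, M-GEMMA is therefore roughly 4 times slower than Marchal’s algorithm ($79/20 \approx 3.95$) and roughly 45 times faster than GEMMA alone ($3600/79 \approx 45.6$). The average spacing between consecutive points is about $1\,392\,000\ \text{m} / 87\,005 \approx 16\ \text{m}$.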

## 9 Conclusions

In this article, we presented an off-line Map Matching algorithm that proved reliable and robust with regard to the potential incompleteness of the base map at hand. This algorithm is the result of an iterative process in which the authors implemented and tested previous work and added their own new implementations. The result is the integration of two algorithms: Marchal’s algorithm [6] and GEMMA. Marchal’s algorithm is used primarily to infer matches from topological continuity. When ambiguities arise, the ambiguous portion of the trace is isolated and GEMMA is used on it.
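
The isolation step can be sketched as follows. This is an illustration under stated assumptions rather than the actual M-GEMMA code: the function name `isolateAmbiguousRuns` and the per-point flag vector are hypothetical, and the criterion by which the first pass marks a point as ambiguous is not modelled.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Half-open index range [first, second) into the trace.
using Range = std::pair<std::size_t, std::size_t>;

// Collects the maximal runs of consecutive points flagged as ambiguous.
// The flags are assumed to come from a first, topology-based pass;
// each returned range would then be handed to the second algorithm.
std::vector<Range> isolateAmbiguousRuns(const std::vector<bool>& ambiguous) {
    std::vector<Range> runs;
    std::size_t i = 0;
    while (i < ambiguous.size()) {
        if (!ambiguous[i]) {
            ++i;  // Unambiguous point: keep the match from the first pass.
            continue;
        }
        const std::size_t start = i;
        while (i < ambiguous.size() && ambiguous[i]) {
            ++i;  // Extend the run to the last consecutive ambiguous point.
        }
        runs.emplace_back(start, i);
    }
    return runs;
}
```

For a trace whose flags are `{false, false, true, true, false, true, false}`, the function returns the ranges `[2, 4)` and `[5, 6)`, so only three of the seven points would need the slower search.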

M-GEMMA is visibly slower than Marchal’s original algorithm, but its performance is more than acceptable when running for a single user. The scalability to multiple simultaneous users (as is expected in YouTrace) remains to be tested and may demand improvements. Despite this issue, the integration of both algorithms is preferable because it produces smoother transitions, which have a direct impact on the quality of the map.

In terms of the integration into YouTrace, M-GEMMA has produced satisfactory results in the preliminary experiments. Testing the whole system thoroughly demands considerably larger sets of traces and is clearly beyond the scope of this paper. Future publications will focus on this task.

The code of M-GEMMA is written in C++ and is available as open source at http://eden.dei.uc.pt/~camara/files/mgemma.zip. The reader is invited to download and use this at will.

We refer to trajectory maps since the geometry obtained corresponds to the driving trajectory as opposed to the road infrastructure geometry. Particularly on curves, the visual result can become very distinct.

## References

1. NCHRP Project 70–01 (2005) Private-sector provision of congestion data. Probe-based traffic monitoring. State-of-the-practice report. University of Virginia Center for Transportation Studies. Virginia Transportation Research Council, November 21
2. Ben-Akiva M, Bierlaire M, Koutsopoulos HN, Mishalani R (1998) DynaMIT: a simulation-based system for traffic prediction. Proceedings of the DACCORD Short-term forecasting workshop (DACCORD), February 1998
3. Logi F, Ullrich M, Keller H (2001) Traffic estimation in Munich: practical problems and pragmatic solutions. 2001 IEEE Intelligent Transportation Systems Conference Proceedings, Oakland (CA), USA, August 25–29
4. Travel Time Estimation Using Cell Phones (TTECP) for Highways and Roadways. Department of Electrical and Computer Engineering, Florida International University
5. Performance and Limitations of Cellular-Based Traffic Monitoring Systems. CellINT Traffic Solutions (2007) Ohio Transport Engineering Conference
6. Marchal F, Hackney J, Axhausen KW (2005) Efficient map-matching of large global positioning system data sets: tests on speed monitoring experiment in Zürich. Transp Res Rec 1935:93–100
7. Edelkamp S, Pereira FC, Sulewski D, Costa H (2008) Collaborative map generation - survey and architecture proposal. In: Hoeven F, Shaick J, Speck SC, Smith MJ (eds) Urbanism on track - application of tracking technologies in urbanism. IOS Press (Research in Urbanism Series, Vol. 1), Amsterdam
8. Quddus MA, Ochieng WY, Noland RB (2007) Current map-matching algorithms for transport applications: state-of-the art and future research directions. Transp Res, Part C Emerg Technol 15(5):312–328, ISSN 0968-090X
9. Greenfeld JS (2002) Matching GPS observations to locations on a digital map. In: Proceedings of the 81st Annual Meeting of the Transportation Research Board, January, Washington D.C.
10. Quddus MA (2006) High integrity map-matching algorithms for advanced transport telematics applications, PhD Thesis. Centre for Transport Studies, Imperial College London, UK
11. Quddus MA, Noland RB, Ochieng WY (2006) The effects of navigation sensors and digital map quality on the performance of map-matching algorithms. Presented at the Annual Meeting of the Transportation Research Board (TRB), Washington D.C., January 2006
12. Chawathe S (2007) Segment-based map matching. Proceedings of the 2007 IEEE Intelligent Vehicles Symposium, Istanbul, Turkey
13. Goldberg D (1989) Genetic algorithms in search, optimization and machine learning. Kluwer Academic, Boston
14. Open Street Map platform. URL: http://www.openstreetmap.org