Mobility surveys beyond stated preference: introducing MyTrips, an SP-off-RP survey tool, and results of two case studies

When introducing new mobility offers or measures to influence traffic, stated preference (SP) surveys are often used to assess their impact. In SP surveys, respondents do not answer questions about their actual behaviour, but about hypothetical settings. Therefore, answers are often biased. To minimise this hypothetical bias, so-called stated preference-off-revealed preference (SP-off-RP) surveys were developed. They base SP questions on respondents’ revealed behaviour and place unknown scenarios in a familiar context. Until now, this method was applied mostly to scenarios investigating the willingness to pay. The application to more complex mode or route choice problems, which require the calculation of routes, has not yet been done. In this paper, the MyTrips survey tool for the collection of SP-off-RP data based on respondents’ actual mobility behaviour is presented. SP questions are based on alternatives to typical routes of respondents, which are calculated on the fly with an intermodal router. MyTrips includes a larger survey and collects mobility diaries for one day representing respondents’ daily routine, calculates alternative routes and creates SP questions based on a Bayesian optimal design. Results from two case studies investigating behaviour changes are presented. The first case study investigated the extension of a subway line in Vienna, Austria. The second case study focused on the introduction of micro transit vehicles in a rural setting, replacing infrequent bus services. Results of the two case studies show a difference in response behaviour between SP and RP settings and suggest a reduction of hypothetical bias. For the latter study, a Latent Class SP-off-RP model was estimated. It shows that availability and accessibility of public transport are the main influences on the willingness to use it, independent of other household characteristics.


Introduction
Introducing new mobility options is costly and acceptance of new measures by travellers is hard to estimate in advance. Surveys are the prime source of data for impact assessment. In particular, stated choice experiments are used to collect data about hypothetical scenarios and choice models are applied to forecast adoption of new travel choices.
In general, there are two kinds of data sources for the estimation of mode and route choice models. The first are revealed preference (RP) data, where actual behaviour of travellers is recorded. The second are stated preference (SP) data, where respondents are asked to choose between different scenarios in a survey setting. Both have advantages and disadvantages. The main advantages of SP data are that hypothetical scenarios can be included and that one has control over the survey design. RP data on the other hand is real behaviour and does not suffer from hypothetical biases, which is often a problem in SP data where people must answer questions about hypothetical behaviour in an unknown setting. The problem of hypothetical biases is well studied, in particular in the area of willingness to pay. Several papers show that hypothetical biases exist, e.g. [1][2][3] show that there is a discrepancy between stated choice experiments and revealed preferences. In [4] a similar comparison between an SP experiment and GPS data is shown, but the paper also shows that hypothetical bias is reduced by mitigation techniques like showing a cheap talk script to respondents that informs them about the problems that arise from hypothetical bias before they answer the SP questions.
Other techniques combine different data sources, mainly SP and RP data. Pivoting techniques minimise the hypothetical bias by basing the design of the SP alternatives on a recent route chosen by the respondents in real life (see e.g. Rose et al. [5]). Pivoting techniques were applied in a series of papers [6][7][8] by Hensher and Rose, where methods for designing stated choice (SC) experiments from pivot alternatives collected in a computer assisted personal survey were developed. The pivot alternatives were subsequently adjusted according to a D-efficient design, but no actual routing step was applied in the survey design. In [9], the number of transfers shown in the SP questions, which were used to determine passengers' transfer penalties, was based on the real number of transfers in a collected RP route.
In a series of papers, Train and Wilson [10,11] develop the Stated Preference-off-Revealed Preference (SP-off-RP) approach, where an actual revealed choice situation is offered within the SC experiment by changing one or more characteristics but leaving the main part of the choices as in the real situation of the respondent. In addition, they introduce a choice modelling framework to handle the resulting SP-off-RP data.
In this paper, the novel online survey tool MyTrips is presented. MyTrips allows the collection of SP-off-RP data based on routing. This SP-off-RP data offers the possibility to reduce hypothetical distortion for transport and route choice problems. In addition, we present results of two case studies where MyTrips was applied. The main contributions of this paper are:

C1
Methodologies applied in the MyTrips survey tool for an SP-off-RP survey based on routing: (a) novel online survey collection methodology; (b) calculation of route alternatives based on collected trip diaries; (c) design of SP choice sets from route alternatives

C2
SP-off-RP evaluation methodology using Latent Class Models

C3
Two case studies applying the MyTrips survey tool and corresponding evaluation methodologies

While the SP-off-RP data collection methodology has been applied in a number of cases, most of these had in common that the RP setting was relatively simple, making the creation of the SP alternatives straightforward. In the original paper, Train and Wilson [11] use the example of shippers and alternative routes for their goods. The SP alternatives are pivoted on the chosen alternative by changing the tariff or travel time. Other examples within the transportation sector include Arellana et al. [12], where the departure time choice of workers in Santiago was collected and SP questions were designed with different attribute levels of departure choices and changes in mode choice behaviour. In Cranenburgh et al. [13], holiday choices and possible alternatives are collected in the RP stage, and in the SP stage random pairs with changed costs and travel times are offered as SP alternatives. Yu et al. [14] study rebound effects caused by energy efficiency in cars by applying an SP-off-RP survey in which they ask about the current vehicle in a household and automatically replace the annual operating costs with those of a more efficient replacement.
The approach closest to the one in this paper was presented in [15,16], where smartphone-based tracking was used as RP data and SP alternatives were pivoted on the collected alternatives. While Google Maps was applied to learn about car alternatives and the availability of transit, unlike in this paper, no routing step was applied to generate all the intermodal route alternatives. The inclusion of our own router enables us to present very detailed routes within the context of the respondents, even for scenarios like the introduction of hypothetical mode alternatives not available in public routers like Google Maps. In addition, rather than using a random design as in [15], we apply a different adaptive design process for the selection of SP alternatives that matches the routed SP alternatives to a Db-optimal design that iteratively includes already collected survey results (see Sect. 2.3).
This connection of router and online surveying tool is technically more challenging. Furthermore, the survey design becomes more difficult because the added routing step prevents direct manipulation of variable levels in the SP design. Some design strategies for pivoted SP questions are already present in the literature. In Rose et al. [5], a D-efficient design is tested where the experimental design was constructed based on assumed population-level averages for each of the design attributes. Arellana et al. [12] suggest a five-step method based on the Bayesian efficiency criterion, where a generic SP design is found first and is then customised to the revealed choices of each respondent.
In addition to the collection tool, this paper also presents methodologies to analyse the collected SP-off-RP data and shows results for two case studies. Besides general survey analyses, latent class models (LCM) are applied for a more detailed examination of respondents' mode choice behaviour. For a good introduction to LCM see e.g. [17]. For the incorporation of SP-off-RP data into choice model estimation see e.g. [11]. While other approaches for personalising models exist, LCM might be more suitable when personalising choice models for single respondents (see e.g. [18] or [19]), since other methods need data for each respondent for personalisation. LCM offers the possibility to personalise the mode choice behaviour, for example within an agent-based simulation model, based on socio-economic variables of the agents.
The remainder of this paper is organized as follows. In Sect. 2 we present the MyTrips Methodology for data collection and data analysis. Section 3 describes two case studies conducted with MyTrips. Section 4 presents the results of the two case studies. Finally, we present our conclusions and necessary future work.

MyTrips methodology
The goal of MyTrips is to improve online (web-based) mobility surveys. The main cases of application are surveys for assessing the impact of (future) mobility scenarios on mode and route choice, i.e. which modes of transport and routes people prefer to use for their everyday mobility.
In the following sections we describe the MyTrips concept, the main methodological and technical contributions of the MyTrips survey tool, as well as first approaches for analysing collected SP-off-RP data.

The MyTrips data collection methodology
The main novelty of MyTrips is the real-time personalisation of questions, i.e. routes, in the SP-off-RP survey to improve the quality and accuracy of the hypothetical SP questionnaire. For RP, respondents' preferences are inferred through the calculation of alternative routes that would have been available to them at the time. For SP, the personalisation of routes helps respondents to better imagine the survey scenario, as the setting of the alternatives is familiar. A MyTrips survey consists of the following parts, as shown in Fig. 1. The parts P1 and P4 together form the underlying conventional online mobility survey. In P1, introductory questions are first posed to filter out respondents not meeting the survey requirements, e.g. those living outside the study region, and to collect preferences required for the calculation of alternative routes, such as the ability to cycle or the availability of personal vehicles. P4 represents the major part of the conventional survey, consisting e.g. of questions about mobility preferences or demographic data as well as an introduction to the hypothetical scenario and first pure SP questions about expected behaviour.
MyTrips builds upon these parts and adds P2 as RP data and P3 for corresponding SP-off-RP questions, and finally P5 for a stated choice experiment.
In Fig. 2 the user interface for the collection of trip diaries in P2 is shown.
For the collection of the mobility diaries in P2, the activity locations are geocoded using an address search, and the positions can still be adjusted on a map. For each trip, respondents must fill in start and end location, departure and arrival times, and mode of transport. Intermodal trips are split into single-mode stages. Depending on the survey goal and scenario, the degree of detail required for the modes of transport may vary. For one survey, it may be sufficient to provide a single trip for a complex public transport journey with several line changes. For another survey, line and changeover information or a differentiation between car driver and passenger may be required. In addition, the activity between trips, e.g. being at work, shopping, or spending leisure time, is recorded.
Choice sets for P5 are calculated based on P2, where each choice set consists of two or more routes with the same start and end location. The RP route, as well as SP and RP alternatives, are generated using a routing service. For the two case studies in this paper, our proprietary intermodal routing framework Ariadne [20] served as the intermodal router required for the P3 choice set calculation. Attributes such as the travel times per mode were then calculated for all alternatives from the resulting routes. See Table 1 for the attributes applied in the current case studies. The final alternatives are then chosen such that they fit best with the optimal design, as described in Sect. 2.3. The detailed calculation methodology for P3 is presented in Sects. 2.2 and 2.3. In P5, respondents select their preferred route from each choice set. If the trip diary contains too few trips, or no suitable alternative routes can be generated for a trip, predefined fallback choice sets are used for the remaining questions. The detailed survey procedure and interaction between these components is shown in Fig. 1. The survey procedure is sequential with one exception. After respondents submit the diary, two things happen in parallel: respondents continue with the main questionnaire while the choice sets with alternative routes are calculated. The questionnaire therefore needs to be long enough to allow time for the calculation to finish.

Creation of alternative routes for choice sets
To calculate SP-off-RP route alternatives, the trip diary is split into trips at all activities except bringing or fetching people and changes between modes of transport. This means that trips are not limited to single modes, but can be intermodal. For each trip, a set of alternative routes for the SP as well as the RP trips is calculated, using combinations of the modes of transport available to the respondent. The exact method and routing process are survey-dependent. For the RP and the SP alternatives, different modes of transport or road and public transport networks can be used. As an example, for the second case study, the mode combinations used for the calculation of route alternatives were car, park and ride, bike, bike and ride, public transport and combinations with micro transit. Once all possible intermodal route alternatives are calculated in Ariadne, the attributes of the different alternatives are calculated for those routes. Using the calculated attributes, the choice sets for the SP questions are created as described in Sect. 2.3. For the two case studies, SP-off-RP choice sets in the MyTrips questionnaire contain only two routes. Pairwise choices were used because the presentation of choice sets was implemented with a map view of the routes and text describing the choices. The restriction to two choices provides enough space for the presentation of both alternatives (see Fig. 3).
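The splitting rule described above can be illustrated with a minimal Python sketch. The stage representation and the activity labels `mode_change` and `bring_or_fetch` are assumptions made for illustration; they are not taken from the MyTrips implementation.

```python
# Activities that do NOT end a trip (hypothetical labels).
NON_SPLITTING = {"mode_change", "bring_or_fetch"}

def split_into_trips(stages):
    """Group diary stages into trips.

    stages: list of (mode, activity_after_stage) tuples in diary order.
    A trip ends at every activity except those in NON_SPLITTING,
    so a trip may contain several modes (intermodal trip).
    """
    trips, current = [], []
    for mode, activity in stages:
        current.append(mode)
        if activity not in NON_SPLITTING:
            trips.append(tuple(current))
            current = []
    if current:  # diary ended without a splitting activity
        trips.append(tuple(current))
    return trips

diary = [
    ("walk", "mode_change"),
    ("pt", "work"),
    ("car", "bring_or_fetch"),
    ("car", "shopping"),
    ("car", "home"),
]
trips = split_into_trips(diary)
```

In this example the walk and public transport stages form one intermodal trip, and the stop for fetching a person does not split the subsequent car trip.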
To avoid SP-off-RP choice sets with unrealistic route alternatives, a subset of feasible routes is selected for each trip. Routes that are not viable for most users are discarded following the rules presented in Table 1, since they add no information to the model estimation process. Examples of discarded routes are those with very long walking or cycling stretches, or routes that take more than twice as long as the RP route.
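A feasibility filter of this kind could look like the following sketch. Only the rule "more than twice as long as the RP route" is taken from the text; the walking and cycling limits and the `Route` structure are hypothetical placeholders for the rules in Table 1.

```python
from dataclasses import dataclass

@dataclass
class Route:
    modes: tuple
    total_min: float        # total travel time in minutes
    walk_min: float = 0.0   # minutes spent walking
    bike_min: float = 0.0   # minutes spent cycling

# Hypothetical thresholds; the actual values are those of Table 1.
MAX_WALK_MIN = 30.0
MAX_BIKE_MIN = 60.0
MAX_DURATION_FACTOR = 2.0   # at most twice as long as the RP route

def feasible_routes(rp_route, candidates):
    """Keep only candidates that pass all feasibility rules."""
    limit = MAX_DURATION_FACTOR * rp_route.total_min
    return [
        r for r in candidates
        if r.total_min <= limit
        and r.walk_min <= MAX_WALK_MIN
        and r.bike_min <= MAX_BIKE_MIN
    ]

rp = Route(("car",), total_min=20.0)
candidates = [
    Route(("walk", "pt"), total_min=35.0, walk_min=10.0),
    Route(("walk",), total_min=90.0, walk_min=90.0),
    Route(("bike", "pt"), total_min=45.0, bike_min=15.0),
]
kept = feasible_routes(rp, candidates)
```

With a 20-minute RP route, only the 35-minute public transport route stays within the 40-minute limit.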
The design process for the SP-off-RP alternatives follows the methodology from Arellana et al. [12]. Due to the intermediate routing step, the design process needs to be adjusted slightly since the choice sets cannot be directly taken from the calculated design. The adjusted design procedure consists of the following steps.
Step 1: Preliminary design feature definition. For the two case studies, mode choice models were based on travel times per mode, waiting time, travel costs and number of transfers. Because the number of transfers is difficult to control in the routing process, the design of the SP alternatives was based only on time variables. The number of transfers and travel costs were only applied in the subsequent model estimation.
Step 2: Optimisation of generic SP design. A 20-row design was chosen as the generic design. The generic design was selected using a Bayesian efficient design. The Bayesian D-criterion $D_b$ is calculated as

$$D_b = \int_{\beta} \det\left(I^{-1}(X, \beta)\right)^{1/k} \pi(\beta)\, d\beta \tag{1}$$

where $I^{-1}(X, \beta)$ is the asymptotic covariance matrix (ACV) of the parameter vector $\beta$ of length $k$ of the choice model with prior parameter distribution $\pi(\beta)$. Before data is collected, no prior information is available and a normal distribution $N(-2, 1)$ is assumed as the prior for all parameters. Once data is available, a mode choice model with time variables is estimated and the ACV is calculated. To optimise $D_b$, a genetic algorithm (GA) was applied in the statistical computing environment R. As input to the GA, a set $S$ of 500 parameter vectors $\beta_s$, $s = 1, \ldots, 500$, was drawn from the prior distribution. For each design in the population of the GA, the ACV was calculated for all $\beta_s$ to simulate $D_b$. Once collected data becomes available, it is added to the design to ensure that the SP choice sets add information to existing data.
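The simulation of the Bayesian D-criterion from prior draws can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it is written in Python rather than R, it assumes a multinomial logit model, it only evaluates the criterion for one fixed design (the genetic algorithm that searches over designs is omitted), and the example design values are made up.

```python
import math
import random

def mnl_probs(utils):
    """Multinomial logit choice probabilities (numerically stable)."""
    top = max(utils)
    weights = [math.exp(u - top) for u in utils]
    total = sum(weights)
    return [w / total for w in weights]

def fisher_information(design, beta):
    """Information matrix of an MNL model; its inverse is the ACV.

    design: list of choice situations, each a list of attribute vectors.
    """
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for alts in design:
        utils = [sum(b * x for b, x in zip(beta, alt)) for alt in alts]
        p = mnl_probs(utils)
        mean = [sum(p[j] * alts[j][a] for j in range(len(alts)))
                for a in range(k)]
        for j, alt in enumerate(alts):
            d = [alt[a] - mean[a] for a in range(k)]
            for a in range(k):
                for b in range(k):
                    info[a][b] += p[j] * d[a] * d[b]
    return info

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for cc in range(c, n):
                m[r][cc] -= f * m[c][cc]
    return det

def bayesian_d_error(design, k, draws=500, mean=-2.0, sd=1.0, seed=1):
    """Average of det(ACV)^(1/k) over draws from the N(mean, sd) prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        beta = [rng.gauss(mean, sd) for _ in range(k)]
        det_info = determinant(fisher_information(design, beta))
        if det_info <= 0.0:
            return math.inf  # singular design: parameters not identified
        # det(I^-1)^(1/k) equals det(I)^(-1/k)
        total += det_info ** (-1.0 / k)
    return total / draws

# Made-up design: 3 choice situations, 2 alternatives, 2 time attributes.
design = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(0.5, 0.5), (0.0, 0.8)],
    [(0.2, 0.0), (0.0, 0.6)],
]
d_b = bayesian_d_error(design, k=2)
```

A search procedure such as a genetic algorithm would call `bayesian_d_error` for each candidate design and keep the designs with the lowest value.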
Step 3: Selection of alternatives from the generic design. Since the GA is not fast enough to run in real time, SP routes are based on the generic design. Design routes are calculated from the generic design as the percentage of time spent in each mode of transport or for waiting, and come in complementary pairs defining a desired choice situation. Ideally, the generated choice sets exactly match one of the choice situations. A reduced generic design example can be seen in Table 2. Route 1A is a unimodal car route and the desired complementary route 1B consists of public transport with walking and waiting.
First, a route from the generic design is matched to the RP route. This is done by finding the design routes with the least differences in the used modes of transport. For a unimodal car route, the design route 1A from Table 2 would be the perfect match. For a public transport route including a walking part, both 1B and 2A would be perfect matches. In the case of several matching design routes, the match is determined by minimising the Euclidean distance between the attributes of the route and the design route. If the score is still the same for several design routes, one of them is chosen randomly. After matching a design route to the RP route, the complementary design route is used to find the best alternative route for this choice set using a similar method as before: first find alternative routes with the best match regarding the used modes of transport and then select the route with minimal Euclidean distance in the attributes. The process described cannot assign attributes from a fixed optimal design to the SC scenarios, since these attributes cannot be guaranteed in the alternatives generated by the routing tool. Instead, it matches the alternatives together with the calculated attributes as well as possible to the optimal design.
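The two-stage matching described above (mode differences first, Euclidean distance second, random tie-break last) can be sketched as follows. The attribute order, the design values and the route names are hypothetical and do not reproduce Table 2; treating waiting as an attribute but not as a mode is also an assumption of this sketch.

```python
import math
import random

MODES = ("car", "pt", "walk", "wait")  # assumed order of the time shares

def travel_modes(shares):
    """Modes with a positive time share; waiting is not counted as a mode."""
    return frozenset(m for m, s in zip(MODES, shares)
                     if s > 0 and m != "wait")

def best_match(target, candidates, rng=random):
    """Key of the candidate closest to target.

    Candidates are ranked by the number of differing modes, then by
    Euclidean distance of the attributes; remaining ties are broken randomly.
    """
    def score(shares):
        mode_diff = len(travel_modes(target) ^ travel_modes(shares))
        return (mode_diff, math.dist(target, shares))
    best = min(score(s) for s in candidates.values())
    ties = [key for key, s in candidates.items() if score(s) == best]
    return rng.choice(ties)

# Hypothetical generic design with complementary pairs (1A, 1B), (2A, 2B).
design = {
    "1A": (1.0, 0.0, 0.0, 0.0),
    "1B": (0.0, 0.6, 0.3, 0.1),
    "2A": (0.0, 0.8, 0.2, 0.0),
    "2B": (0.7, 0.0, 0.3, 0.0),
}
complement = {"1A": "1B", "1B": "1A", "2A": "2B", "2B": "2A"}

# Step 1: match the RP route (a unimodal car trip) to a design route.
rp_route = (1.0, 0.0, 0.0, 0.0)
matched = best_match(rp_route, design)

# Step 2: pick the routed alternative closest to the complementary route.
alternatives = {
    "pt": (0.0, 0.65, 0.25, 0.10),
    "park_and_ride": (0.4, 0.4, 0.1, 0.1),
}
chosen = best_match(design[complement[matched]], alternatives)
```

Because the routed alternatives rarely reproduce the design shares exactly, the result is the closest available route rather than the design route itself.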

MyTrips SP-off-RP data analysis and mode-choice modelling
In a first step, the SP-off-RP choices are compared to pure SP questions to see if there is a difference between purely hypothetical scenarios ("Would you use the new mode for trips to work?") and SP-off-RP scenarios. This analysis allows a first assessment of whether the hypothetical bias changes between the two settings. For a more detailed analysis of the choice behaviour, different variants of SP-off-RP choice models can be estimated. In this paper, Latent Class Models (LCM) were chosen. LCM can be applied to analyse individual heterogeneity (see e.g. [17]). The parametrisation of the class membership model sets them apart from other approaches like Mixed Logit Models. This makes it possible to study how different parts of the population (based on different socio-demographic data) behave in certain mode-choice situations. The MyTrips survey tool collects SP-off-RP data as well as user-specific data that can be used in the class membership models. The SP-off-RP data collected for this paper consisted of a mobility diary and, for the first three trips, one corresponding SP choice each. A no-choice option was also offered to the respondents.
The LCM approach assumes that the behaviour of individual $n$ depends on observable utilities and that there is a latent heterogeneity that is unobserved. The latent class approach follows [17] and is adapted to the SP-off-RP data available here. Assuming respondent $n$ belongs to class $q$, the utility for respondent $n$ of choosing alternative $j$ in choice situation $t$ for the RP data is given as

$$U^{RP}_{jn|q} = \beta_q x^{RP}_{jn} + \epsilon_{jn} \tag{2}$$

where $\epsilon_{jn}$ is independent and identically distributed (iid) extreme value with unit scale, $x^{RP}_{jn}$ are the decision variables of alternative $j$ of the RP data for respondent $n$ and $\beta_q$ are the parameters of the choice model for class $q$. Assuming the independence of the random unobserved utilities is a strong assumption.
To shape the situation in such a way that the iid assumption is not too unrealistic, the choice set for each trip is chosen as realistically as possible. For this, modes in the choice set are restricted by mode availability: e.g. if a person does not have a driving licence, a car alternative is not offered, and if a trip does not start at home and bike was chosen for the previous trip, a car trip is not offered as an alternative, assuming that the car is at home. This allows us to estimate the RP choice as a standard logit. The utility for the SP choice is given as

$$U^{SP}_{jn|q} = \alpha \left( \beta_q x^{i}_{jn} + \epsilon_{jn} \right) + \eta_{jn} \tag{3}$$

where $x^{i}_{jn}$ are the attributes of the SP alternative constructed from the chosen RP alternative $i$, $\epsilon_{jn}$ is the random utility of the RP alternative $j$, $\eta_{jn}$ is iid extreme value with unit scale and $\alpha$ is the scaling factor that is needed because the random variables may have different scales. The probability of alternative $k$ being chosen by respondent $n$ in the SP experiment conditional on the alternative $i$ being chosen in the RP choice is

$$P_{kn|i,q} = \int \frac{\exp\left(\alpha \left(\beta_q x^{i}_{kn} + \epsilon_{kn}\right)\right)}{\sum_{j} \exp\left(\alpha \left(\beta_q x^{i}_{jn} + \epsilon_{jn}\right)\right)} f(\epsilon_n \mid i)\, d\epsilon_n \tag{4}$$

where $f(\epsilon_n \mid i)$ is the density of the RP error terms conditional on alternative $i$ being chosen in the RP choice. Draws for conditional densities can be constructed from draws from a uniform distribution between zero and one according to the methodology given in [11]. The probability of respondent $n$ choosing $k$ in the SP choice situation and $i$ in the RP choice situation is given by

$$P_{kin|q} = P_{kn|i,q} \cdot \frac{\exp\left(\beta_q x^{RP}_{in}\right)}{\sum_{j} \exp\left(\beta_q x^{RP}_{jn}\right)} \tag{5}$$

In case there are fewer than three trips in the RP set of a respondent, the probability of the SP choice of a pure SP question is then just a normal logit probability. Equally, if the SP problem produces a no choice, the probability reduces to the logit RP choice.
The class membership probability for class $q \in Q$ of respondent $n$ is given by

$$H_{nq} = \frac{\exp\left(\theta_q z_n\right)}{\sum_{q' \in Q} \exp\left(\theta_{q'} z_n\right)} \tag{6}$$

where $z_n$ are the respondent's socio-demographic variables (age, household size, PT pass, PT reachability and motorbike licence) and $\theta_q$ are the parameters of the utility model of class $q$. The overall SP-off-RP probability for choice situation $t$ of respondent $n$ is then given by

$$P_{tn} = \sum_{q \in Q} H_{nq} P_{kin|q} \tag{7}$$

Using the draws from the conditional densities above, it is then straightforward to estimate the model by maximising the simulated log likelihood $SLL$, given as

$$SLL = \sum_{n \in N} \sum_{t \in T_n} \ln \left( \sum_{q \in Q} H_{nq} \check{P}_{k_t i_t n|q} \right) \tag{8}$$

where $N$ is the set of respondents, $T_n$ is the set of SP-off-RP choice situations of respondent $n$ and $\check{P}_{k_t i_t n|q}$ is the simulated probability of respondent $n$ choosing $k_t$ and $i_t$ in the SP and RP choice situations respectively.
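The class membership model and the mixing over classes can be illustrated numerically with the sketch below. It assumes a standard logit form for class membership; all parameter values, the two-element vector of socio-demographic variables and the class-conditional probabilities are made up, and the simulation of the class-conditional probabilities themselves is not shown.

```python
import math

def class_membership(theta, z):
    """Logit class membership probabilities for one respondent.

    theta: one parameter vector per class; z: socio-demographic variables.
    """
    scores = [sum(t * v for t, v in zip(theta_q, z)) for theta_q in theta]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def mixed_probability(h, p_by_class):
    """Class-weighted probability of one observed SP-off-RP choice."""
    return sum(h_q * p_q for h_q, p_q in zip(h, p_by_class))

def simulated_log_likelihood(respondents, theta):
    """Sum of log mixed probabilities over respondents and situations.

    respondents: list of (z, situations); each situation holds the
    class-conditional probabilities of the observed choice, one per class.
    """
    sll = 0.0
    for z, situations in respondents:
        h = class_membership(theta, z)
        for p_by_class in situations:
            sll += math.log(mixed_probability(h, p_by_class))
    return sll

# Made-up example with two classes; class 1 is the reference class.
theta = [[0.0, 0.0], [0.5, -1.0]]
z_n = [1.0, 1.0]                      # constant and a PT-pass dummy
h_n = class_membership(theta, z_n)
p_tn = mixed_probability(h_n, [0.2, 0.7])
sll = simulated_log_likelihood([(z_n, [[0.2, 0.7], [0.5, 0.4]])], theta)
```

Fixing the parameters of one class to zero, as done for the reference class here, is a common normalisation, since only differences between class scores are identified.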

Scenarios of the case studies
The MyTrips survey tool has already been applied to two real-world case studies. The first studied the extension of an urban subway line (Sect. 3.1), the second the overhaul of the public transport system in a rural setting together with the introduction of micro transit (Sect. 3.2). The following subsections present the settings of these case studies, as well as the necessary adaptations to the routing tool and the data collection methodology.

Case study 1: urban subway extension
In February 2018 a survey with adults living in the region south and south-east of Vienna was conducted to explore the impact of the extension of the Viennese subway line U1 by 5 stations together with the accompanying changes to the surrounding bus and tram lines. 550 respondents completed the survey. A requirement for the respondents was that they regularly travel through a specified area of interest on work days. The study area was adjacent to the subway extension where the public transport system was adjusted. Since the survey took place after the subway extension and for RP the previous mobility behaviour is required, participants were instructed to fill in the trip diary for a typical work day before the extension where they travelled through the study area. OpenStreetMap data from before and after the subway extension were used for routing both individual and public transport. Since OpenStreetMap does not contain timetables, they had to be estimated by assuming a standard interval for each line extracted from public transport timetables available in PDF format.
The creation of SP-off-RP questions was adapted as follows. For each user, at least three choice sets were shown. If fewer than three choice sets could be calculated, the SP-off-RP questions were complemented by a set of precalculated choice sets where the new subway extension was used in one of the routes of the choice set. Available modes of transport for the alternative routes were walking, cycling, bike-sharing, driving, car-sharing, public transport and intermodal combinations thereof. In [21] it was shown that weather influences the demand for cycling in mode choice behaviour; hence, for each choice set a randomly determined weather setting was defined and presented to the respondent: temperature (cold, average, hot) and rain (no, light, heavy). In addition, respondents were not forced to choose one of the two routes: an additional "no choice" option was offered in case none of the offered routes was acceptable. In that case, a free text answer explaining why none of the routes was acceptable was also requested.

Case study 2: rural micro transit
The first case study did not need a lot of imagination from the respondents, since the subway is a well known mode of transport for most people travelling in Vienna. Therefore, respondents can imagine the hypothetical scenario quite well. However, the MyTrips survey tool can also be applied in more complex cases where respondents are less familiar with the hypothetical changes to the transport system.
In July 2018 a second case study with 300 respondents aged 16 and above living in the Austrian part of the Vienna Basin was conducted by face-to-face surveys in the study area. The exact survey area was confined by the Northern Railway in the north, the Leitha Mountains in the east, the Raaber Railway in the south and the Southern Railway in the west.
The public transport network as of 2018, which consists of railways and busses, can hardly compete with motorised individual traffic, because aside from the railway lines the busses operate infrequently and bus routes often involve long detours to connect most of the small villages in the countryside.
The second case study had a futuristic setting which involved the replacement of all current bus lines with new bus lines connecting to the railway lines with an interval timetable in such a way that the area is roughly covered with a grid of fast, high-level public transport lines. In addition, a micro transit system is envisioned as follows: former and current bus stations as well as railway stations also serve as micro transit stations. The micro transit works like station-based car-sharing but uses small, novel electric vehicles with a maximum passenger capacity of three adults. They can be driven by adults with a driving licence in a fast mode up to 45 kph and by people older than 16 years without a driving licence in a restricted mode up to 25 kph. The latter user group is also only allowed to drive on minor roads. A major novelty of the vehicles is their ability to be coupled so that one driver can operate up to five vehicles at once, which opens up possibilities for efficient redistribution logistics, such as redistribution done by users themselves during regular trips. The survey area and the envisioned public transport system can be seen in Fig. 4 together with the concept of the engageable micro transit vehicles. Data used for calculating routes was OpenStreetMap for individual transport and a combination of GTFS Open Government Data for the Viennese public transport and a GTFS export for trains and busses in the rural areas provided for the project "SynArea II" by local transport operators (Verkehrsverbund Ostregion). For the future scenario, lines and timetables for public transport were created manually.
The creation of choice sets was adapted as follows. For each user, at least three and at most five choice sets were shown. Both routes of a choice set used the transport system of the future scenario. The first route was a reconstruction of the route from the trip diary, i.e. in case a bicycle, motorcycle, or car was used for parts of a trip, these parts of the route were retained. For public transport trips the routes were recalculated with the new public transport system, i.e. using the new bus lines and micro transit. Depending on the respondents' ability to cycle and drive as well as the availability of a bicycle and a car, alternative routes were calculated for the reconstructed route for walking, cycling, driving, public transport, bike and ride, and park and ride. Public transport was assumed to be available for every respondent. The alternative route was determined using the design routes as described above.
In order to construct a plausible set of alternative routes, the following restrictions applied to the route calculation: for park & ride, only routes using a train were considered because combining a private car with a bus or micro transit was not deemed a viable alternative. In addition, the speed of bicycles and cars was reduced for bike and ride and park and ride routes after a certain distance, so that switching to public transport is forced to happen at a station nearby. The same was done for the micro transit vehicles to avoid routes using micro transit vehicles running parallel to high-level bus lines.
To address the issue of incomplete diaries, mistakes in the diaries and aborted questionnaires, in this case study, on-site interviewers performed the task of filling in the surveys while questioning the respondents. Due to the changes in collection, people who started the interview process also finished it and the data contained no incomplete diaries. This improved RP data quality considerably.
Since the transit provider wanted to look at the micro transit as an extension to their transit portfolio without added costs to PT travellers, monetary costs were not included in the stated choice experiment. This created an opportunity to look into the latent perceived costs of trips by car and micro transit. To do that, different cost variables were added to the trip data to get an idea of what travel costs actually factor into daily transport choices. For car journeys, these were based on two distance-based pricing schemes. The first is 42 cents/km, corresponding to the official car kilometre allowance [22], which is meant to represent the real average costs of a car kilometre including, amongst others, depreciation, fuel costs, parking costs and taxes and fees. The second is just 8 cents/km for fuel costs, corresponding to an average fuel consumption of 6.5 l/100 km at an average fuel price of 125 cents/litre. For micro transit, two time-dependent pricing schemes are used, corresponding to the pricing schemes of two major car sharing providers in Austria, Share Now (0.26 cent/min [23]) and Rail & Drive (0.11 cent/min [23]). For PT, when respondents do not have a PT pass, the costs are taken as the real costs available from the electronic timetable information system AnachB [24]. In case of a yearly transit pass, the costs are calculated as the daily costs of the transit pass for the region (1689 Euro) divided by the average yearly number of trips per person calculated from the mobility survey Österreich Unterwegs [25]. All combinations of these cost schemes were tested within the LCM framework to see which would improve the modelling quality most.
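The car cost schemes reduce to simple arithmetic, which the following sketch reproduces. The figures for fuel consumption, fuel price, kilometre allowance and pass price are those quoted above; the trip distance and the yearly number of trips are hypothetical placeholders, since the value derived from Österreich Unterwegs is not given in the text.

```python
ALLOWANCE_CENTS_PER_KM = 42.0       # official kilometre allowance
FUEL_L_PER_100KM = 6.5              # average fuel consumption
FUEL_PRICE_CENTS_PER_L = 125.0      # average fuel price
TRANSIT_PASS_EURO = 1689.0          # pass price for the region

def fuel_cost_cents_per_km(l_per_100km, price_cents_per_l):
    """Fuel-only cost per kilometre in cents."""
    return l_per_100km / 100.0 * price_cents_per_l

def car_trip_cost_cents(distance_km, rate_cents_per_km):
    """Distance-based cost of a car trip in cents."""
    return distance_km * rate_cents_per_km

def pass_cost_per_trip_euro(pass_price_euro, trips_per_year):
    """Pass price spread evenly over the yearly number of trips."""
    return pass_price_euro / trips_per_year

fuel_rate = fuel_cost_cents_per_km(FUEL_L_PER_100KM, FUEL_PRICE_CENTS_PER_L)

# Hypothetical 10 km car trip under both schemes.
trip_full_cost = car_trip_cost_cents(10.0, ALLOWANCE_CENTS_PER_KM)
trip_fuel_cost = car_trip_cost_cents(10.0, fuel_rate)

# Hypothetical yearly number of trips (placeholder value).
pt_trip_cost = pass_cost_per_trip_euro(TRANSIT_PASS_EURO, trips_per_year=1000)
```

The fuel-only rate evaluates to 8.125 cents/km, which matches the rounded 8 cents/km stated above.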

Results of the two case studies
In this section, the results of the data analysis are presented for the two case studies. The first subsection looks at first indications of changes of the hypothetical bias and at general results of the SP-off-RP survey. Section 4.2 presents the results for a first SP-off-RP LCM for the second case study.

General survey results
In the first case study, only around 40% of the participants who started the MyTrips online survey completed it. The rest stopped during the trip diary phase. This dropout rate of 60% is unusually high for online surveys. While the reasons for dropping out are not known, the feedback of 150 participants on their experience, given as grades and free text, highlights challenges of the survey procedure. The average rating of 2.3 for the trip diary is good (1 = excellent, 5 = bad). However, a frequent comment was that filling in the diary takes much longer than typical surveys and is quite complex. As described in the following paragraph, the survey data showed an effect of MyTrips on the hypothetical bias, but due to the data quality of the travel diaries, the LCM analysis was not performed for this case study.
While SP surveys in general show a trend to overstate usage of, or willingness to pay for, new goods, only 4% of the respondents in the subway extension survey chose the new subway option over their previous RP mode selection. However, in the answers for pre-calculated routes, a much higher preference for the subway extension could be found. This indicates that putting the SP choice set in a known context has an effect on the choice of the respondent.
Analysis of the second case study showed that, when respondents were asked in a pure SP setting how likely they would be to use the micro transit vehicles for trips with different purposes, the share of "likely" and "very likely" answers ranged from 28% for using micro transit as a collective call taxi for work trips to 53% for trips late at night (see Fig. 5). Across all activities, 39.9% chose very likely or likely, with a confidence band of 2.63%. The stated likelihood in these SP questions is much higher than the share of micro transit choices in the SC experiment. In the SP-off-RP questions, the micro transit alternative was chosen in only 13.81% of the 804 choice situations (with a 2.4% confidence interval). With 95% certainty, the probability of choosing the micro transit alternative does not fall into the confidence band of the Likert scale SP question. While this does not guarantee that the hypothetical bias towards micro transit is lower in the SP-off-RP choice experiment, it shows that choice behaviour differs when the micro transit alternative is put into a familiar setting.
Figure 6 shows the differences in travel times between the SP choice sets in which the micro transit option was chosen and those in which it was not. When the micro transit route is chosen, the travel time saved is largest for the car mode, i.e. chosen micro transit routes mostly replace car routes (on average, six minutes less travel time in the car per route). Micro transit alternatives that were not chosen are on average about eight minutes longer than the chosen alternative, suggesting that time-sensitive respondents chose car and bike instead of public transport routes to save time and avoid walking.
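The comparison of the two shares can be retraced with a short calculation. The sketch below assumes a normal-approximation (Wald) interval at the 95% level, which the text does not state explicitly. The function names are hypothetical, and the band of the Likert question is taken as reported because its sample size is not given here.

```python
import math


def wald_half_width(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a share."""
    return z * math.sqrt(p * (1.0 - p) / n)


def intervals_overlap(p1, hw1, p2, hw2):
    """True if the intervals [p1 - hw1, p1 + hw1] and [p2 - hw2, p2 + hw2] overlap."""
    return (p1 - hw1) <= (p2 + hw2) and (p2 - hw2) <= (p1 + hw1)


# SP-off-RP experiment: share of micro transit choices in 804 choice situations.
p_sp_off_rp = 0.1381
hw_sp_off_rp = wald_half_width(p_sp_off_rp, 804)

# Pure SP Likert question: share and band as reported in the text.
p_likert, hw_likert = 0.399, 0.0263

print(round(100 * hw_sp_off_rp, 1))  # 2.4
print(intervals_overlap(p_sp_off_rp, hw_sp_off_rp, p_likert, hw_likert))  # False
```

Under this assumption the half-width for the SP-off-RP share reproduces the reported 2.4%, and the two intervals are clearly separated.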

Results of the latent class analysis for rural micro transit
LCMs can be applied to study mode choice behaviour for different parts of the population in more detail for the second case study. In this paper, we look at the influence of travel times, the number of changes and travel costs on the mode choice of classes, where class membership is influenced by personal and household variables. The results of the model estimation can be seen in Table 3. The model was validated by randomly splitting the data set ten times into a training data set (2/3 of the data) and a test data set. For each split, the model was re-estimated on the training data. The modal splits as well as the average probability of the chosen alternative were calculated on both the training and the test data sets. The result of the validation can be seen in Table 4. For the SP questions, the modelled modal splits deviate from the modal splits in the data by similar amounts on the training and test sets. For PT and micro PT, the deviations are a bit further apart; however, compared to the modal splits themselves (around 10% for PT and around 18% for micro PT), the deviations of the training and test modal splits from the modal split in the data are still comparatively small. The standard deviation of these differences is slightly higher for the test data. Likewise, the mean probabilities of the chosen alternatives are close for the test and training data sets at about 60%, with slightly higher standard deviations for the test data sets. While the average probability of the chosen alternative is not very large, the model shows a clear preference for the chosen alternatives for both training and test data. Figure 7 shows that for the majority of choices, the probability of the chosen alternative lies around 75%. One reason that prevents a larger preference is that many factors not easily explainable with a rational choice model feature heavily in modal choice (as shown in the non-trading behaviour, for example for PT usage).
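The validation procedure can be illustrated with a minimal Python sketch. It makes several assumptions: observations are split individually (the text does not say whether the split was made by respondent or by choice situation), model re-estimation is left out, and all function names are hypothetical.

```python
import random
from collections import Counter


def repeated_splits(n_obs, n_repeats=10, train_frac=2 / 3, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_train = round(n_obs * train_frac)
    for _ in range(n_repeats):
        indices = list(range(n_obs))
        rng.shuffle(indices)
        yield sorted(indices[:n_train]), sorted(indices[n_train:])


def modal_split(choices):
    """Share of each mode in a list of chosen modes."""
    counts = Counter(choices)
    return {mode: count / len(choices) for mode, count in counts.items()}


def split_deviation(observed, modelled):
    """Modelled minus observed share for every mode in either split."""
    modes = set(observed) | set(modelled)
    return {m: modelled.get(m, 0.0) - observed.get(m, 0.0) for m in modes}


def mean_chosen_probability(chosen_probabilities):
    """Average model probability assigned to the alternatives actually chosen."""
    return sum(chosen_probabilities) / len(chosen_probabilities)


# Ten random splits of 804 observations into 2/3 training and 1/3 test data.
splits = list(repeated_splits(804))
train, test = splits[0]
print(len(splits), len(train), len(test))
```

For each split, the model would be re-estimated on the training indices, after which `modal_split`, `split_deviation` and `mean_chosen_probability` can be evaluated on both parts and averaged over the ten repetitions.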
The estimation results show that the class membership model depends on the availability of public transport options (for accessibility of public transport, respondents were asked to state how easy it is for them to reach public transport, with answers on a five-level Likert scale) as well as on household size and motorbike licences. The parameters of other socio-demographic variables, like employment status and age, were not significantly different from 0 at the 10% level. Respondents who have a larger membership probability in class 1 tend to live in smaller households, further away from PT, with the motorbike licence offering them further individual mobility options, while respondents who belong more to class 2 tend to live in larger family homes closer to PT. This indicates that socio-demographic variables are mostly not the best variables for finding behaviourally homogeneous groups regarding traffic behaviour, which confirms the results of e.g. Markvica et al. [26].
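How class membership and class-specific mode choice combine in a latent class logit can be sketched as follows. This is a generic two-class formulation, not the estimated model from Table 3: all coefficient values, attribute names and the example trip are hypothetical and chosen purely for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit probabilities from a {mode: utility} mapping."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exps = {mode: math.exp(u - top) for mode, u in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}


def class_membership(covariates, gamma):
    """Binary logit for two classes; class 1 is the reference class."""
    v = gamma["const"] + sum(gamma[k] * x for k, x in covariates.items())
    p2 = 1.0 / (1.0 + math.exp(-v))
    return {1: 1.0 - p2, 2: p2}


def utility(attributes, params):
    """Linear utility: alternative-specific constant plus weighted attributes."""
    return params["asc"] + sum(params[k] * x for k, x in attributes.items())


def latent_class_probabilities(alternatives, class_params, membership):
    """Mix the class-specific logit probabilities with the membership weights."""
    mixed = {mode: 0.0 for mode in alternatives}
    for cls, weight in membership.items():
        utils = {m: utility(alternatives[m], class_params[cls][m]) for m in alternatives}
        for mode, p in mnl_probabilities(utils).items():
            mixed[mode] += weight * p
    return mixed


# Hypothetical respondent and coefficients (illustration only).
covariates = {"pt_access": 4, "household_size": 3, "motorbike_licence": 0}
gamma = {"const": -1.0, "pt_access": 0.4, "household_size": 0.3, "motorbike_licence": -0.5}
alternatives = {
    "car": {"time_min": 15, "cost": 1.2},
    "pt": {"time_min": 30, "cost": 1.0},
    "micro_pt": {"time_min": 20, "cost": 1.0},
}
class_params = {
    1: {"car": {"asc": 0.5, "time_min": -0.03, "cost": -0.2},
        "pt": {"asc": -1.0, "time_min": -0.05, "cost": -0.2},
        "micro_pt": {"asc": 0.2, "time_min": -0.03, "cost": -0.2}},
    2: {"car": {"asc": 0.1, "time_min": -0.06, "cost": -0.5},
        "pt": {"asc": -0.2, "time_min": -0.02, "cost": -0.5},
        "micro_pt": {"asc": -0.5, "time_min": -0.06, "cost": -0.5}},
}

membership = class_membership(covariates, gamma)
probs = latent_class_probabilities(alternatives, class_params, membership)
print(membership)
print(probs)
```

Because the membership weights depend on the respondent's covariates, two respondents facing the same choice set can receive different predicted mode shares.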
Overall, people who belong more to class 2 have a greater affinity for PT, with a higher travel time parameter for PT and a smaller one for car and micro transit.
Travel time variables show that respondents from class 1 are more sensitive to travel times on foot and prefer individual mobility to public transport, whereas class 2 respondents are averse to long travel times in cars as well as in micro transit. Some travel time parameters are positive, which makes the current model unsuitable for application in mobility simulations. However, respondents are much more likely to belong to class 2, with a mean class membership probability of 81% and a minimum probability of 53%. This means that while the car travel time parameters are positive, together with the cost parameter, longer car journeys overall tend to have a negative effect on the probability of the alternative. For public transport alternatives, this is not the case. Here, many respondents are non-traders who stick with the public transport alternative even though a faster alternative in a different mode is available. As an example, the travel time of the public transport alternative is on average 11.65 min longer than that of the car alternative that was not chosen. Overall, the positive travel time parameters suggest that people like to stick to their current public transit options even when seemingly better alternatives are on offer. However, the presence of positive travel time parameters also shows that certain characteristics of the respondents, such as the preference for the usual means of transport, are not represented by the model on the existing data basis.
Examination of respondents' latent price expectations revealed that the maximum log-likelihood in model estimation was clearly better when only petrol costs were included in the cost of car trips, both in and out of sample. For the model including just petrol costs for car journeys and low costs per minute for micro PT, log-likelihoods averaged 662.7 over the ten randomly chosen training sets and 222.7 over the corresponding test sets. This compares to an average log-likelihood of 687.5 in sample and 224.0 out of sample for a model with high costs per car kilometre and low costs for micro PT. When using the higher costs per minute for micro transit, the performance of the model was almost the same as with low costs for micro transit (log-likelihood of 663.6 in sample and 220.2 out of sample). Looking at the out-of-sample values, the models with a lower price for car journeys have better predictive power than the one with a high price for car journeys. Overall, this shows that the respondents of the survey underestimate the costs of car journeys and are more likely to use just petrol costs for their planning.
The alternative-specific constants (ASCs) are mostly negative, except for car in class 2, micro transit in class 1 and the reference alternative, walking. As a result, for very short trips as well as short access and egress trips to different modes, people prefer to walk. Due to the negative travel time parameter, this changes once the trips get longer, in particular since the travel times for walking trips are clearly longer than for trips of equal distance with other modes.
As a step towards policy implications, the model suggests that marketing the new micro transit as a viable alternative to cars might be a successful strategy, since it appeals mostly to people who also have a strong tendency towards car usage. These respondents, who belong more to class 1, also have a smaller price sensitivity, which again makes them easier to reach for a micro transit service that carries extra costs compared to a current PT route. However, to reach people who belong more to class 1, it is necessary to avoid long walking stretches to reach the micro transit, since these respondents rate walking times very negatively. This suggests that a tighter network of micro transit stations than the current bus stations might be needed to reach the majority of people. Respondents who belong mostly to class 2 rate travel time in micro public transport very negatively and are therefore easier to reach if the current public transport network is improved.

Conclusions and future work
In this paper, we presented the MyTrips survey tool for the collection of routing-based SP-off-RP data. The choice sets for the stated choice experiment are constructed based on mobility diaries filled in by respondents or interviewers and a connected routing step. This tool allows the collection of complex SP-off-RP data that is suitable for use in mode and route choice decision modelling and helps to reduce the hypothetical bias. In addition, we presented two case studies that applied the MyTrips survey tool, together with their results, including an SP-off-RP LCM.
The first MyTrips case study for a subway extension in Vienna showed promising results. While the completion of the online diary was deemed too time-consuming by some respondents and, as a result, many diaries were incomplete, the resulting SP-off-RP questions seemed to produce a less pronounced bias towards the new alternative than pure SP cases from the literature. In the second case study, a micro transit system in rural areas was studied. Again, comparing pure SP questions about the willingness to use micro transit with the SP-off-RP questions, the willingness to use micro transit was significantly higher in the pure SP setting. For the second case study, an LCM was estimated. The model shows that it is not socio-demographic variables that determine the likelihood of public transport usage versus individual modes. It also shows that car and micro transit have very similar travel time parameter values, suggesting that micro transit in the presented format is viewed similarly to motorised individual transport. As a result, it is mostly car-oriented people who are likely to use the micro transit options, making them a main target for marketing the new mode of transportation. However, due to the non-traders, the model is not yet fit for direct use in traffic simulations. Methods like the ones suggested in [27] will be applied in survey design and data pre-processing to deal with the problem of non-traders in future MyTrips surveys.
As an extension to the MyTrips survey tool, the integration of automated data collection (see e.g. [28]) is planned, such that SP questions can be asked right after an actual trip in the area of interest, similar to the tool suggested in [3,15]. This would, on the one hand, minimise the effort of entering a travel diary and, on the other hand, allow for asking hypothetical questions about a route while it is still fresh in the mind of the traveller. To allow route choice studies, MyTrips will be adjusted to collect the trip diary in a more detailed way, such as by recording a GPS trajectory.