Lifespans of passenger cars in Europe: empirical modelling of fleet turnover dynamics

Cars have a high share of global transport-related CO2 emissions. To model the market diffusion of new energy carriers and powertrains like electric vehicles, fleet turnover models are commonly used. A decisive influence factor for the substitution dynamics of such transformations is the survival rate of the national car fleet of a country. It represents the likelihood of a car reaching a certain lifespan. Due to a lack of data, current methods to estimate such survival probabilities neglect the imports and exports of used cars. Existing studies are limited to countries with a predominant market of new cars, compared to low numbers of imported and exported used cars. In this study, we resolve this marked simplification and propose a new method to estimate survival probabilities for countries with a high number of imported and exported used cars. Empirical data on the car stock, on inflows of new and used cars, and on outflows of exported and scrapped cars are gathered from 71 national statistics offices. Survival rates of the car fleets of 31 European countries are derived, for which we find a pronounced regional variability. Average lifespans of cars vary from 8.0 to 35.1 years, with a mean of 18.1 years in Western and 28.4 years in Eastern European countries, revealing the high impact of cross-border flows of cars. The study also shows that survival rate estimates can be improved significantly even in the absence of reliable data if a combination of a Weibull and a Gaussian distribution is used. It is likely that the predictive power of existing models (regarding the future environmental impact of car fleets) could be improved significantly if these findings were considered accordingly. The findings of this study can directly be included in fleet turnover and policy assessment models. They also enable the analysis of economic and environmental spillover effects from the imports and exports of used cars between countries. 
Supplementary Information The online version contains supplementary material available at (10.1186/s12544-020-00464-0).


scrappage schemes are "best viewed as a transitional strategy. Once the relatively dirty vehicles are removed from the fleet, the gains from scrappage are significantly diminished" [17]. In the light of EV-favouring policies, scrappage schemes could be used to 'jump ahead in time' and introduce a new electric era which would otherwise have been phased in slowly. Therefore, Transport & Environment [20] suggests that "scrappage schemes and other support measures should be focused on electric cars; [...] and company's [sic] could enjoy temporary incentives if they go electric in the coming year." Besides this transformation in car propulsion technology (from combustion to battery-electric), early retirement programs may also influence the share of car categories (e.g. a trend to smaller cars [12]) or even trigger systemic changes in mode split, and therefore lead to further emission reductions [21]. Scrappage schemes are a well-known policy to stimulate the economy after recessions. After the last wave of such programs following the financial crisis in 2009, we are currently amidst the next economic shock, triggered by COVID-19. Ideally, the green economic stimulus of well-designed scrappage schemes could contribute both to an economic recovery and to achieving a 1.5 or 2.0°C target [22,23].
Appendix B: Background information on the choice of a Gaussian distribution to model the imports and exports of used cars. Figure S1 illustrates why a Gaussian distribution is suited to model the impact of used cars on CSPs, here for the case of car imports. The age distribution of used cars that have been imported to a country until the year of observation usually follows the left curve in Figure S1. For example, the initial registration year 2014 (cars registered abroad in 2014) contains the (0-)1-yr. old cars imported in 2014, the (1-)2-yr. old cars imported in 2015, and the (2-)3-yr. old cars imported in 2016. These three numbers are summed up for 2014 and likewise for all other years. Due to the high number of imports of 5-15 year-old cars, the curve has a steep slope going back from 2016. This curve shape was found for most of the countries in our analysis of used car imports data, see Appendix E. Applying the Weibull-shaped scrappage rate (middle curve in Figure S1) to the imported used cars (left curve in Figure S1) results in a Gaussian curve (right curve in Figure S1).
Appendix C: Vehicle stock data. To compute the survival probability of cars, age-resolved stock data is required. Three different data sources have been considered in this study: the European Automobile Manufacturers' Association (ACEA), the United Nations Economic Commission for Europe (UNECE), and national statistics (NatStat). The overall data availability is summarized in Table S1. Orange circles indicate the data source that has been chosen to compute the CSPs due to its highest data resolution. Individually gathered data from national statistics (NatStat) clearly provides the highest data resolution. For 17 of 32 countries, NatStat even provides a full resolution of the national stock for cars up to 40 years and older. This is important since the availability of high-resolution stock age distributions was found to be a critical bottleneck for an accurate estimation of the CSPs of the observed countries.
UNECE and Eurostat provide the same age-resolved stock data for 27 countries (all countries except BGR, GRC, ISL, ROU, and SVK) [25,26]. The data is reported in coarser age bins than the data from ACEA, but covers a wider range of car ages for many countries (age bins: 1-2, 3-5, 6-10, 11-20, 21+). Note that "GBR" refers to Great Britain (Northern Ireland not included), while for the rest of this study GBR refers to the United Kingdom. The data for Finland does not include vehicles in Åland.
National statistics have been gathered individually for 28 countries (all countries except BGR, CYP, FRA, and SVN) from national statistics offices, ministries of interior or transport, national automotive associations, national automotive companies, and others. If no age-resolved stock data was available for 2016, we chose the closest year with available data. To account for different total stock sizes, we corrected the total stock numbers by the ratio of total stock in the respective year and 2016, while keeping the shares of cars per age bin constant. Where the respective NatStat source did not provide the total stock size in 2016 (BEL, CZE, LUX), we applied an analogous correction to the 2016 totals of UNECE. We assume that the age distribution usually does not change drastically within one or two years (maximum deviation is two years for 2014 and 2018 data). This has been confirmed by a case study of Latvia in Appendix I and the study of Oguchi & Fuse [54]. The following list states other notes on the data and assumptions for processing it:
• EST: Data for Estonia contains vehicles with suspended registration.
• FIN: Data for Finland does not include vehicles from Åland's separate register.
• GRC: The data contains the "year of first circulation internationally" and the "year of first circulation in Greece". Where the "year of first circulation internationally" is "without indication" in the original data set, we assume that it is the same as the "year of first circulation in Greece". Hence, we assume that these cars are new. Where the "year of first circulation internationally" is before a certain year in the original data set, we set it to this year. We do this for simplicity, since these cases represent considerably less than 1% of the total cars registered in Greece in the respective year.
In the following, we document the pre-processing procedure of vehicle stock data according to the available data resolution for NatStat data:
• Stock data with continuous age distribution (single years) and no overflow bin (full resolution)
• Stock data with continuous age distribution and overflow bin
• Stock data with discontinuous age distribution (multiple years lumped together in one age bracket) and overflow bin
The depicted graphs in Figures S2-S6 refer to the chosen data sources with the highest resolution (orange circles in Table S1).
Stock data with continuous age distribution and no overflow bin (full resolution). Figures S2 and S3 show the age distribution for all 13 countries with a full resolution (no overflow bin). This raw data already shows the inhomogeneity between countries. While a first group (Figure S2) covers countries from Northern and Central Europe, a second group (Figure S3) consists of Eastern and Southern European countries, with the exception of Iceland, which has high fluctuations due to its relatively small car market. For the first group, on average only 39.4% of all cars are 10 years old or older. For the second group, this share is 66.1% on average.
Figure S2 Raw stock data for selected countries with full data availability (continuous resolution) and a peak for younger cars. NatStat data sources are documented in Table S1.
Figure S3 Raw stock data for selected countries with full data availability (continuous resolution) and a peak for older cars. NatStat data sources are documented in Table S1.

Stock data with continuous age distribution and overflow bin. For all data sets with a continuous resolution up to a certain age and an overflow bin above (e.g. "older than 10 years"), we distribute the number of cars in the overflow age bin according to the value in the last fully resolved age bin. We sustain this value until the overflow bin is "depleted". Example: Assuming there are 50'000 ten-year-old cars in a country and the raw data reports "175'000 cars older than 10 years", we distribute the 175'000 cars as follows: 50'000 eleven-year-old cars, 50'000 twelve-year-old cars, 50'000 13-year-old cars and 25'000 14-year-old cars.
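The overflow-bin rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the function name `distribute_overflow` and its arguments are hypothetical and chosen here for readability.

```python
def distribute_overflow(last_age, last_value, overflow_total):
    """Spread an overflow bin over single ages.

    The value of the last fully resolved age bin (last_value) is sustained
    for each following age until the overflow bin is depleted; the final
    age receives whatever remains.
    """
    if last_value <= 0:
        raise ValueError("last_value must be positive to deplete the overflow bin")
    cars_by_age = {}
    age = last_age + 1
    remaining = overflow_total
    while remaining > 0:
        cars = min(last_value, remaining)
        cars_by_age[age] = cars
        remaining -= cars
        age += 1
    return cars_by_age


# Example from the text: 50'000 ten-year-old cars, 175'000 cars older than 10 years.
example = distribute_overflow(last_age=10, last_value=50_000, overflow_total=175_000)
print(example)  # {11: 50000, 12: 50000, 13: 50000, 14: 25000}
```

The guard against a non-positive `last_value` is an addition of this sketch; it only prevents an endless loop when the last resolved bin is empty.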
The effect of sustaining the overflow bins can be seen best for Poland in Figure S5. In reality, the distributions for higher car ages will rather follow the tails of the curves shown in Figures S2 and S3, with a steady decrease instead of a sustained constant value followed by an abrupt drop. However, since we do not have any knowledge about the nature of this distribution tail for the countries with an overflow bin, we stay as close to the raw data as possible and avoid any misleading interpretation of the missing data by sustaining the overflow bin at a constant value as described.
There is one anomaly: the overflow bin of the Czech Republic ("older than 70 years") is so large that the age bins 71 to 121 are not sufficient to distribute the number of cars in the overflow bin when sustaining the same number of cars as in the 70-yr.-old age bin. This is due to the low number of 70-year-old cars. To solve this issue, we sustain the average number of cars in the last 5 age bins (ages 66-70) until the overflow bin is depleted. Overall, this has a negligible influence on the CSP due to the very low number of such old cars compared to the total stock. Figures S4 and S5 show the age distribution for all countries in this data resolution category, separated into countries with a high number of young cars and countries with a high number of old cars. We see the same divide as for the first resolution category: Northern and Central European countries feature a rather steady decrease with car age, while Eastern and Southern European countries have a pronounced peak at older ages.
Figure S4 Raw stock data for selected countries with high data availability (continuous resolution and overflow bin) and a peak for younger cars. Data sources are documented in Table S1.
Figure S5 Raw stock data for selected countries with high data availability (continuous resolution and overflow bin) and a peak for older cars. Data sources are documented in Table S1.

Stock data with discontinuous age distribution and overflow bin. For all data sets with a discontinuous resolution up to a certain age and an overflow bin above (e.g. "1-5, 6-9, 10-15, older than 15 years"), we distribute the number of cars in each age bin uniformly among all ages covered by the bin. The overflow bins are sustained as described earlier. Figure S6 shows the age distribution for all countries in this data resolution category. They are located in Eastern or Southern Europe. We observe a peak for older cars, as for the other categories.
Figure S6 Raw stock data for selected countries with low data availability (discontinuous resolution and overflow bin) and a peak for older cars. Data sources are documented in Table S1.
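The uniform distribution of lumped age bins described for this resolution category could look as follows. This is a hedged sketch: the function name `spread_bins`, the tuple layout `(first_age, last_age, count)`, and the example numbers are assumptions made for illustration, and the overflow bin is assumed to be handled separately as described earlier.

```python
def spread_bins(bins):
    """Distribute cars of multi-year age bins uniformly over single ages.

    bins: list of (first_age, last_age, count) tuples with inclusive bounds.
    Returns a dict mapping each single age to its (possibly fractional)
    number of cars.
    """
    cars_by_age = {}
    for first_age, last_age, count in bins:
        n_ages = last_age - first_age + 1
        for age in range(first_age, last_age + 1):
            cars_by_age[age] = count / n_ages
    return cars_by_age


# Illustrative bins "1-5" and "6-9" with made-up counts.
resolved = spread_bins([(1, 5, 100_000), (6, 9, 40_000)])
print(resolved[3], resolved[8])  # 20000.0 10000.0
```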
Overview of stock data from all available data sources. Figures S7-S9 show all available car stock age distributions for all three data sources documented in Table S1. While ACEA and UNECE already provide a rich data set of many countries with a decent data resolution, the benefit of the compilation of national statistics lies in a higher data resolution for almost all countries. High-resolution stock data sets are the basis for an accurate determination of the survival probabilities of cars.
Figure S7 Raw stock data per country for all available data sources (1/3): ACEA [24], UNECE [25], and national statistics (NatStat). Data sources are documented in Table S1.
Figure S8 Raw stock data per country for all available data sources (2/3): ACEA [24], UNECE [25], and national statistics (NatStat). Data sources are documented in Table S1.
Figure S9 Raw stock data per country for all available data sources (3/3): ACEA [24], UNECE [25], and national statistics (NatStat). Data sources are documented in Table S1.

Appendix D: New registration data.
In addition to vehicle stock data, the computation of survival probabilities also requires knowledge on the number of new vehicle registrations per year. Four different data sources have been considered in this study: the European Automobile Manufacturers' Association, ACEA, the United Nations Economic Commission for Europe, UNECE, national statistics (NatStat), and the European Statistical Office (Eurostat). The overall data availability is summarized in Table S2. As is the case for the stock data, the NatStat data for new registrations provides a benefit in terms of longer historical time series.
While UNECE overall provides longer time series than ACEA, its data suffers from a considerable number of missing entries, which is not the case for ACEA. Eurostat and NatStat offer the longest time series for many countries. In addition, NatStat data rarely has any missing entries and therefore was most often chosen for further processing within this study, see orange circles in Table S2.
For many countries, the registration data reported by different data sources are not congruent. This is mainly because some data sets sometimes contain both the new registrations of new cars and the new registrations of used imported cars (mainly UNECE and Eurostat), while others always provide data only for new registrations of new cars (mainly ACEA and NatStat). Furthermore, different original data sources [1] lead to additional (but usually small) deviations. In the following, we list all available information on the different data sets:
• ACEA:
- "Data on new registrations only refers to newly registered vehicles, therefore it doesn't take into account used or second-hand vehicles." (private correspondence with Ms. Francesca Piazza, ACEA)
- New registration data for Romania refers to car sales [55].
• UNECE:
- "The data should refer to registration of new vehicles, so including the second hand imports. Unfortunately there is little harmonisation on this, both in the UNECE region and even within EU countries. [...] This lack of harmonisation (and lack of data on second hand imports in general) is a recognised problem and has been discussed. [...] So for now, unfortunately it's difficult to make comparisons across countries." (private correspondence with Mr. Alex Blackburn, UNECE)
- GRC, HUN: Data "includes both new vehicles and used vehicles from abroad" [56].
- LVA: Data "includes vehicles that have been manufactured in the indicated or previous year only" [56].
- LTU: Data "includes new and re-registered vehicles" [56].
• NatStat:
- Unless stated otherwise, these numbers are very consistent with ACEA data, which means that they are likely to represent registrations of new cars only (no imported cars). That is also stated explicitly for a number of countries, including HUN, LVA, POL, NOR, and DEU. For Germany, for example, the methodological documentation of their statistics says: "Eine Neuzulassung ist die erstmalige Registrierung eines fabrikneuen Fahrzeugs mit Kennzeichen in Deutschland. Fahrzeuge, die bereits im In- oder Ausland zugelassen waren, fallen nicht darunter" [57]. [Translation from Google Translate: "A new registration is the first registration of a brand new vehicle with license plate in Germany. Vehicles that have already been registered in Germany or abroad are not included."]
- Exceptions: Data for ISL and ESP includes imported used cars.
• Eurostat:
- Data is similar to UNECE, but with an overall higher data availability.
- Private correspondence with the Eurostat User Support revealed: "In general you cannot compare Eurostat data to other sources, as mostly different methodologies are used. Eurostat has either a directive or regulation setting up for the current 27 member states on how to deliver data. You can find this directive here: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32012R0070&from=EN."
We will discuss the impact of these different data sets later on.
[1] Note that UNECE, ACEA and Eurostat only collect data from national statistics.
[55], UNECE [56], national statistics (NatStat) and Eurostat [58]. Data from national statistics (NatStat) stem from national statistics offices, ministries of interior or transport, national automotive associations, national automotive companies, etc. For each country, orange circles indicate the data source that has been chosen for further analyses due to its highest resolution. Abbreviations: ACEA = European Automobile Manufacturers' Association, UNECE = United Nations Economic Commission for Europe, NatStat = National Statistics. Footnotes and further comments are documented in Table S3.

As discussed earlier, different data sources show considerable deviations in the provided figures. In the following we analyse the potential origins of the data, in particular for which data sources used vehicles are included and for which they are not:
Based on the analysis above, we chose one data set for each country for further processing in this study. The chosen data sets are highlighted in orange in Table S2. Still, only short time series are available for several of the studied countries. For those, we compute the number of new registrations per capita using population data from end of 2016 [92]. Since this indicator is rather constant for most countries with saturated markets (with fluctuations around a mean due to fluctuating economic activity), we use the average new registrations per capita over the 'oldest' 10 years of the available time series to backcast missing data via population statistics (for which long historical time series are widely available). Figures S13-S15 show the number of new registrations per capita and the average over the whole time series, including the mean deviation from this constant average.
[2] The stock numbers are available via http://andmebaas.stat.ee/?lang=en, Table "TS322: First registrations of vehicles (months)". This erroneous data has been deleted.
Figure S13 New car registrations per capita for all 32 countries for the data source with the highest resolution (1/3).

Figure S14 New car registrations per capita for all 32 countries for the data source with the highest resolution (2/3).

Figure S15 New car registrations per capita for all 32 countries for the data source with the highest resolution (3/3).

From the registration-per-capita curves, we see that the registrations per capita did not change drastically over time but rather fluctuate around a time-invariant mean (exception: LUX). Therefore, our hypothesis that missing data on new car registrations can be augmented by backcasting via historic population numbers appears justified. Example: If a country has full data availability from 1995 until 2016, we compute the average number of registrations per capita between 1995 and 2005. We multiply that number by historic population numbers from e.g. 1970 until 1994 to obtain the missing registration data for that time period.
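The backcasting step can be illustrated with the following sketch. It is not the authors' code: the function name `backcast_registrations`, the dictionary-based inputs, and the choice to average the yearly per-capita ratios (rather than dividing summed registrations by summed population) are assumptions. The length of the reference window is left as a parameter, and the start year 1970 follows the limit stated in this appendix.

```python
def backcast_registrations(registrations, population, n_ref_years=10, first_year=1970):
    """Fill missing early years of a new-registration time series.

    registrations: {year: new registrations} for the years with data.
    population:    {year: population}, covering first_year onwards.
    The mean registrations per capita over the earliest n_ref_years with
    data is multiplied by the historic population of each missing year.
    """
    years = sorted(registrations)
    ref_years = years[:n_ref_years]
    per_capita = sum(registrations[y] / population[y] for y in ref_years) / len(ref_years)
    filled = dict(registrations)  # reported values are kept unchanged
    for year in range(first_year, years[0]):
        filled[year] = per_capita * population[year]
    return filled


# Synthetic example: data from 1995 to 2016, 0.05 registrations per capita.
population = {y: 1_000_000 + 10_000 * (y - 1970) for y in range(1970, 2017)}
registrations = {y: 0.05 * population[y] for y in range(1995, 2017)}
filled = backcast_registrations(registrations, population)
print(min(filled), round(filled[1980]))
```

As noted in the text, such a backcast cannot reproduce economic fluctuations or a rising motorisation rate; it only scales a constant per-capita rate with population.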
Evidently, this only approximates reality rather coarsely, since the registrations per capita have been rising historically not only due to increasing population but also due to a higher motorisation rate. However, there are two reasons that suggest this approach is valid for our purposes:
1. For the countries where we have data back to 1970, there is no clear trend towards a rising number of registrations per capita (exception: LUX). To operate within reliable boundaries, we limit our backcasting to 1970 and exclude previous years.
2. The goal of this project is to compare the age distribution of the car stock with the new car registrations in the corresponding years (e.g. 20-yr.-old cars with the new registrations in 1996). From this comparison, we can derive the cumulative survival probability of cars. The most important data (that shape the survival probability curves) lie in the region of car ages between 0 and 20 to 30 years. For the countries for which we lack data back to 1970, we still have data back to 1995/6 (on average), corresponding to car ages of 20/21. Hence, the augmentation will on average affect car ages greater than 20, where fewer cars are still in service in 2016 and the influence on the survival probability curves becomes less important.
To check the validity of our approach, we also apply backcasting to countries where we have data and compare the generated data with the actual data, see Figure S16. The observations from the validation of our backcasting approach are:
• Economic fluctuations cannot be captured with this approach.
• For some countries, like ITA, LUX, or GBR, the backcasted data does not represent the increase in new registrations very well. This indicates that an increasing motorization rate is a second driver for the number of car registrations besides increase in population.
• However, for the majority of the observed countries (foremost DNK, FIN, DEU, IRL, NLD, and CHE) augmented data is well aligned with the real data.
We conclude that backcasting is a valid approach to augment the new car registrations time series. This is in particular justified by the diminishing importance of the "older data" for the determination of survival probabilities.
Appendix E: Imports and exports data. Imports and exports data has been gathered from national statistics offices and other reports, see Table S5 for an overview of data availability, its resolution (whether it contains only totals, or a more detailed age resolution), and the corresponding sources. Figures S17 and S18 show the age distributions of the imports and exports of used cars for the countries where such data could be sourced from national statistics. For some countries, the age distributions of car imports are relatively stable (e.g. NOR, NLD, SWE), while for others, the average age of imported used cars has changed considerably over time (e.g. FIN, LVA). Note that the Netherlands have experienced an increased number of exports of relatively new cars (3-7 years) in the 2010s as a second peak in addition to the one at car ages around 10-15 years. More exports data would be needed to analyse whether exports of used cars might rather show a bimodal distribution than a Gaussian with one peak. This would have implications for using a Gaussian in the data-fitting approach to estimate CSPs (see Appendix B).
Figure S17 Age distribution of imports time series for all nine countries where such data could be sourced, see Table S5 for data sources.
Figure S18 Age distribution of imports and exports time series for the Netherlands and Sweden for which such data could be sourced, see Table S5 for data sources.

We further augment the import totals data sets from national statistics by making use of our finding from Figures S10-S12 that some data sources for new car registrations contain imported used cars, while others do not. For some countries, some data sources definitely include imports of used cars; for others it is unknown so far whether higher registration numbers contain imports and, if so, whether they contain all car ages (or e.g. only the imports of one- or two-yr.-old cars). In the following, we check whether the difference between new-cars-only and new-and-used-cars data sources corresponds to the total number of used car imports. The difference between these two data sources is depicted in Figure S19. Taking the difference in new registration totals between two data sources (one with new cars only and one with imports included) as a proxy for the imports of used cars is a stopgap to augment the imports data and is subject to quite high uncertainty. However, the comparison in Figure S19 shows that for recent years the two different sources match quite well, and over the whole time series they differ little on average.
In further considerations, we use the imports data from national statistics where available and fill gaps with the new-registrations-derived imports data where no national statistics are available, see Table S5. Hence, for LTU, SVK, SWE, and NOR, import totals could be augmented with the new-registrations-derived imports data, see Figure S19.
Figure S19 Left: Difference between total number of imports as reported by national statistics offices (and others) and the new-registrations-derived imports data, i.e. the difference in new registrations data between different data sources (mainly Eurostat vs. ACEA/NatStat), see Figures S10-S12. Right: Augmentation of national statistics data with new-registrations-derived imports data.

Appendix F: Cumulative survival probabilities. We compute cumulative survival probabilities (CSPs) via Eq. 1 using age-resolved stock data and historical time series of new car registrations. For the latter, we use the raw data that has been augmented by the backcasting approach (see Appendix D). Figures S20-S22 show the CSPs for all 31 countries where stock and new registration data is available.
Figure S20 Cumulative survival probabilities (CSPs) for all 31 countries with available data (1/3). Underlying car stock data and data of new registrations (nReg) are shown normalized to the maximum of the time series (maximum = 1). For new registrations, both pure data and augmented data (via backcasting) are shown. Accordingly, the CSP curves also partially rely on augmented data.
Figure S21 Cumulative survival probabilities (CSPs) for all 31 countries with available data (2/3). Underlying car stock data and data of new registrations (nReg) are shown normalized to the maximum of the time series (maximum = 1). For new registrations, both pure data and augmented data (via backcasting) are shown. Accordingly, the CSP curves also partially rely on augmented data.
Figure S22 Cumulative survival probabilities (CSPs) for all 31 countries with available data (3/3). Underlying car stock data and data of new registrations (nReg) are shown normalized to the maximum of the time series (maximum = 1). For new registrations, both pure data and augmented data (via backcasting) are shown. Accordingly, the CSP curves also partially rely on augmented data.
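Eq. 1 is not reproduced in this supplement. The sketch below therefore assumes the form suggested by the surrounding text, namely that the raw CSP at a given car age is the stock of that age in the observation year divided by the new registrations in the corresponding registration year (e.g. 20-yr.-old cars in 2016 versus new registrations in 1996). The function name `cumulative_survival` and the example numbers are illustrative only.

```python
def cumulative_survival(stock_by_age, new_regs_by_year, obs_year=2016):
    """Raw cumulative survival probability per car age (assumed form of Eq. 1).

    stock_by_age:     {age in years: cars in stock in obs_year}
    new_regs_by_year: {year: new registrations (raw or backcast)}
    Ages without registration data for the matching year are skipped.
    """
    csp = {}
    for age, stock in sorted(stock_by_age.items()):
        regs = new_regs_by_year.get(obs_year - age)
        if regs:  # skip missing or zero registrations
            csp[age] = stock / regs
    return csp


# Made-up numbers: the value above 1 at age 10 mimics used-car imports that
# appear in the stock but not in the new registrations.
stock = {0: 100, 10: 140, 20: 30}
new_regs = {2016: 100, 2006: 100, 1996: 120}
print(cumulative_survival(stock, new_regs))  # {0: 1.0, 10: 1.4, 20: 0.25}
```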

The standard fitting approach in existing literature uses a Weibull curve, e.g. see Oguchi & Fuse [54]. However, some countries do not show a continuous decline in the CSP with increasing car age. In contrast, the CSP rises from a starting value of 1 up to values of 7 for higher car ages. This is due to imports of used cars that are present in the car stock data, but are not considered in the new registrations data. Hence, e.g. more cars of age 10 are in the stock than what we would have expected from only the corresponding new registrations ten years ago.
Therefore, the sum of a Weibull and a Gaussian curve is fitted to the raw CSP data (see Appendix B for an explanation why a Gaussian curve is suited to account for the imports of used cars). The fitting procedure is sequential. First, a Weibull curve starting at a value of 1 is fitted to the raw data, maximizing the R². Then, a Gaussian curve is fitted "on top", defining the parameters of a Gaussian curve that is added to the fixed Weibull curve from the previous step, again maximizing the R².
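The sequential procedure can be sketched as follows. Several points are assumptions of this sketch rather than statements from the study: the Weibull curve is parameterized so that its mean equals the average lifetime γ, the Gaussian is written with area k (peak height δ = k / (√(2π)·σ)), a plain grid search replaces whatever optimizer the authors used, and the parameter ranges are those listed in the constraints of this appendix. All function names are hypothetical.

```python
import math


def weibull_csp(age, gamma_avg, beta):
    """Weibull survival curve starting at 1 with mean lifetime gamma_avg (assumed form)."""
    scale = gamma_avg / math.gamma(1.0 + 1.0 / beta)
    return math.exp(-((age / scale) ** beta))


def gaussian_term(age, k, mu, sigma):
    """Gaussian with area k, i.e. peak height k / (sqrt(2*pi) * sigma) (assumed form)."""
    delta = k / (math.sqrt(2.0 * math.pi) * sigma)
    return delta * math.exp(-((age - mu) ** 2) / (2.0 * sigma ** 2))


def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


def grid(start, stop, step):
    return [start + i * step for i in range(int(round((stop - start) / step)) + 1)]


def fit_weibull(ages, csp):
    """Step 1: grid search for the Weibull parameters maximizing R^2."""
    best = None
    for g in grid(5.0, 40.0, 0.5):
        for b in grid(2.0, 6.0, 0.25):
            r2 = r_squared(csp, [weibull_csp(a, g, b) for a in ages])
            if best is None or r2 > best[0]:
                best = (r2, g, b)
    return best  # (R^2, gamma_avg, beta)


def fit_gaussian_on_top(ages, csp, gamma_avg, beta):
    """Step 2: Weibull stays fixed; search the Gaussian parameters maximizing R^2."""
    base = [weibull_csp(a, gamma_avg, beta) for a in ages]
    best = None
    for k in grid(2.0, 10.0, 1.0):
        for mu in grid(5.0, 30.0, 1.0):
            for sigma in grid(5.0, 30.0, 1.0):
                pred = [w + gaussian_term(a, k, mu, sigma) for a, w in zip(ages, base)]
                r2 = r_squared(csp, pred)
                if best is None or r2 > best[0]:
                    best = (r2, k, mu, sigma)
    return best  # (R^2, k, mu, sigma)


# Synthetic CSP data with an import "bump"; parameter values are made up.
ages = list(range(0, 41))
data = [weibull_csp(a, 20.0, 4.0) + gaussian_term(a, 6.0, 12.0, 6.0) for a in ages]
r2_w, g, b = fit_weibull(ages, data)
r2_wg, k, mu, sigma = fit_gaussian_on_top(ages, data, g, b)
print(r2_w < r2_wg)  # True
```

In this sketch a Gaussian would only be kept if it raises the R² over the pure Weibull fit; how the study decides between a pure Weibull and a Weibull-plus-Gaussian fit is not specified here, so that rule is an assumption as well.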
In order to avoid over-fitting, the following constraints have been imposed on the fitting parameters:
• Weibull average lifetime γ ∈ [5,40]. This is done to enable faster fitting. The boundaries are not hit.
• Weibull shape parameter β ∈ [2,6]. This parameter describes the steepness of the Weibull curve around the inflection point. While the lower limit is not hit, the upper one is. This is due to the fact that overflow bins of the stock data are pre-processed as sustained constant values with an abrupt decrease to 0 vehicles after the overflow bin is depleted (see Appendix C). Since this is an artefact, we limit the steepness to a maximum shape parameter of 6. The upper limit of 6 has been found empirically by analysing countries where high age resolution was available from national statistics, see e.g. cumulative survival probability curves for AUT, BEL, CHE, DEU, ESP, FIN, GBR, ITA, LUX, NLD, and NOR.
• Gaussian stretch k = δ·√(2π)·σ ∈ [2,10]. While the upper limit is not hit by any country, the lower boundary is. It is set to avoid over-fitting. Otherwise, very small Gaussian contributions are detected that may not stem from imports of used cars but from scattered data. This limit is found empirically from all 31 data sets.
• Gaussian mean µ ∈ [5,30]. Here, only the lower limit is hit by two countries. It is set in order to avoid peaks at very low car ages which are assumed to stem from data outliers or statistical errors. The limit has been tuned manually to reflect the situation of these two countries.
• Gaussian standard deviation σ ∈ [5,30]. Here, only the lower limit is hit by a few countries. Low standard deviations only replicate data outliers. The lower limit is found empirically from all 31 data sets.
The resulting R² values of using only a Weibull curve vs. using the sum of a Weibull and a Gaussian are illustrated in Figure S23. Figure S24 shows the CSP fits for countries with a Gaussian curve fitted to the raw data. These are predominantly Eastern European countries with a high share of imported used cars compared to the market of new cars. They show a pronounced Gaussian. ISL, IRL, NOR, and MLT are exceptions that only show a small Gaussian contribution to the CSP fit, with low Gaussian means. Figure S25 shows the CSP fits for countries without a Gaussian curve but a pure Weibull fit. These are predominantly Central and South European countries. Only the Eastern European countries GRC and SVN are also in this second group of purely Weibull-fitted countries. However, both have rather long average lifetimes of cars, an attribute that is characteristic of Eastern European countries.
Figure S23 Boxplots of the R² values from pure Weibull fits and from the sum of a Weibull and a Gaussian curve. The box depicts the median and the interquartile range. The whiskers cover the 95% confidence interval.

Figure S24 Cumulative survival probability (CSP) fits for all countries with a non-zero Gaussian fitted to the raw data in addition to the Weibull curve.
Figure S25 Cumulative survival probability (CSP) fits for all countries that feature only a Weibull curve fitted to the raw data, with no Gaussian contribution.

Finally, we compare our fitting parameters with values from Oguchi & Fuse [54], see Figure S26. Their analysis is based on data from 2008, whereas our study is based on data from 2016.
The average lifespan parameter is higher in the 2016 data than in the 2008 data for all observed countries. The largest jump in average lifespan is observed for Italy. Interestingly, higher GDP per capita did not result in lower average lifespans, as the inter-country comparison of the simplified CSP estimation method would suggest (see Figure 6 in the main manuscript). In particular due to the economic crisis of 2009, GDP alone does not seem to be an accurate predictor for the average lifespan of a country's car fleet, see also Appendix I, where the CSP of one country is calculated for 20 consecutive years. This is the reason why other correlations are prioritized for the simplified CSP estimation method in Figure 6. A normalization to the EU's or even the global GDP might, however, provide a more reliable result.
Appendix G: Proof of concept of new CSP estimation method.
Based on Eq. 6, the number of imported used cars is added on top of the new registrations of new cars. The exported cars are subtracted. Since historical time series of age-resolved imports and exports data are scarce, we impute missing data, mainly by extending time series back to 1990 while sustaining the average age distribution of the three nearest years for which data is available. If age-resolved imports data is available e.g. for the years 1995-2016, we assign the average values per age bin over the years 1995, 1996, and 1997 to the years 1990-1994. Since even the best data sets do not reach very far into the past, we limit this backcasting of data to the manufacturing years 1990-2016. If recent data is missing, we do analogous forecasting.
Figure S27 exemplifies the imports and exports correction for the case of the Netherlands. Imports and exports data is usually provided as the number of cars by age per calendar year. For our purposes, we transform this data from a calendar year basis to a manufacturing year basis (more precisely, the year of first registration, which might differ from a vehicle's manufacturing year). Given e.g.  Figure S27. The second panel shows the correction of the new registration data by imports and/or exports data. The third panel of Figure S27 shows the resulting CSP curves for each correction step. Correcting for imports subtracts a Gaussian, while correcting for exports adds a Gaussian. Note that both imports and exports numbers are not very pronounced for the case of the Netherlands. Therefore, the deviation from the uncorrected CSP (baseline) is small.
Figure S28 shows the imports and exports correction of the countries that have not been shown in the main manuscript. The concept of the imports-/exports-Gaussian can also be validated here. Even the pronounced Gaussian contribution for Latvia (LVA) can be fully corrected such that only a Weibull curve remains.
However, data scarcity and low data resolution often impede a fine analysis of the actual CSPs (see notes in Figure S28). Note that Sweden only has a very low number of both imports and exports. Therefore, all CSP curves look very similar.
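The data preparation described in this appendix (backcasting of missing years and the change from a calendar year basis to a first-registration year basis) can be sketched as follows. This is a minimal illustration under stated assumptions: ages are whole years so that the cohort equals calendar year minus age, flows of a cohort are simply summed over all calendar years, and all function names and numbers are hypothetical. It does not reproduce Eq. 6 itself.

```python
def backcast(flows, first_year, n_ref=3):
    """Fill calendar years before the first observed one with the per-age-bin
    average of the n_ref nearest observed years.

    flows: {calendar_year: {age: number_of_cars}}
    """
    observed = sorted(flows)
    reference = observed[:n_ref]
    ages = sorted({age for year in reference for age in flows[year]})
    average = {
        age: sum(flows[year].get(age, 0) for year in reference) / len(reference)
        for age in ages
    }
    filled = {year: dict(bins) for year, bins in flows.items()}
    for year in range(first_year, observed[0]):
        filled[year] = dict(average)
    return filled


def by_first_registration_year(flows):
    """Re-index calendar-year data to cohorts: cohort = calendar year - age."""
    cohorts = {}
    for year, bins in flows.items():
        for age, count in bins.items():
            cohorts[year - age] = cohorts.get(year - age, 0) + count
    return cohorts


def corrected_inflow(new_registrations, imports, exports):
    """New registrations plus imported minus exported used cars, per cohort."""
    cohorts = set(new_registrations) | set(imports) | set(exports)
    return {
        c: new_registrations.get(c, 0) + imports.get(c, 0) - exports.get(c, 0)
        for c in sorted(cohorts)
    }


# Hypothetical import counts by age for three calendar years.
imports = {1995: {5: 30, 6: 12}, 1996: {5: 60, 6: 18}, 1997: {5: 90, 6: 30}}
filled = backcast(imports, first_year=1990)
import_cohorts = by_first_registration_year(imports)
```

Here the years 1990-1994 receive the average age distribution of 1995-1997, mirroring the example in the text; forecasting of missing recent years would work analogously with the latest observed years as reference.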
There are a few caveats regarding the application of this new CSP estimation method. The following limitations should be addressed in future studies:
• Low data availability & resolution for imports and exports data (see Table S5): Long time series of used car imports and exports with a high age resolution would be beneficial for an accurate estimation of the real CSP of a country.
• Unharmonized data sources: Data sets are potentially subject to errors and statistical inconsistencies. Combining data from different sources can add additional noise or cancel out existing noise in stock and new registration data. Therefore, correcting for imports and/or exports might have advantageous or adverse effects. There is a high demand for comprehensive, harmonized data sets of all required inputs for a CSP curve.
• Gauss-curve only an approximation (see Appendix B): As already indicated, further analyses should validate whether a Gaussian curve is able to fit the effect of imports on the CSP of other countries as well as it does for the countries in this study. This is even more important for the exports correction, for which less data was available at the time of writing. Often, the exact reason for data outliers cannot be named. This can e.g. be seen for the case of LVA, where the goodness of fit decreases with the imports correction, which can be due to noise (statistical errors) in all combined data sets, due to different data sources of stock, registration, and imports data, due to the fact that the exports correction is missing, or due to other reasons.
Figure S26 Left: Comparison of Weibull average lifespans of Oguchi & Fuse [54] and this study (see Table 1). Right: Weibull average lifespans vs. GDP per capita.
Figure S27 Imports and exports correction of raw CSP data: proof of concept for new CSP estimation method, exemplified for the case of the Netherlands (NLD). Curves that are based on augmented imports/exports data are marked as "sustained history".
Figure S28 Proof of concept for new CSP estimation method. The uncorrected CSP curve (left column) is corrected by imported (second column) and exported cars (third column), and by both (right column) using Eq. 6. Selection of six from nine countries with available age-resolved imports data, and one from two countries with both age-resolved imports and exports data, to show the proof of concept. The analysis for the other countries can be found in the main manuscript.

Appendix H: The simplified CSP estimation method.
This subsection is divided into two parts. The first part verifies the simplified CSP estimation method of Oguchi & Fuse [54]. The second part discusses a new simplified CSP estimation method that also covers countries with high imports.
Applying the simplified estimation method of Oguchi & Fuse [54] to all 31 observed countries of this study reveals that it only works well for countries without a considerable number of imports (mainly Western European countries). Taking the country sets of Western and Eastern European countries as defined in Figure 4, we see that the median average lifespan over all Western European countries calculated using the simplified estimation method of Oguchi & Fuse [54] (with only the total stock number as an input) deviates by only 0.5 years from the value calculated using age-resolved stock data, see Table 1. The median deviation for Eastern European countries is 11.2 years.
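To illustrate why an estimate based only on the total stock can fail when used cars are imported, the following sketch shows one plausible reading of such a stock-only estimate; it is not the procedure of Oguchi & Fuse [54] itself. It assumes a Weibull curve parameterised by its mean lifespan γ, a fixed shape parameter of 3.6, and a hypothetical fleet with constant new registrations. The average lifespan is chosen such that the stock implied by past new registrations matches the total stock.

```python
import math


def weibull_csp(age, gamma, beta):
    """Weibull survival curve whose mean equals gamma (assumed parameterisation)."""
    return math.exp(-((age / gamma) * math.gamma(1.0 + 1.0 / beta)) ** beta)


def modelled_stock(new_registrations, gamma, beta):
    """Stock implied by past new registrations; index 0 is the most recent year."""
    return sum(n * weibull_csp(age, gamma, beta) for age, n in enumerate(new_registrations))


def estimate_lifespan(total_stock, new_registrations, beta=3.6, lo=1.0, hi=80.0):
    """Bisection on gamma: the modelled stock grows monotonically with gamma."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if modelled_stock(new_registrations, mid, beta) < total_stock:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical fleet: 100 000 new cars per year for 60 years, true gamma = 15 years.
registrations = [100_000] * 60
stock = modelled_stock(registrations, 15.0, 3.6)
gamma_closed = estimate_lifespan(stock, registrations)

# Used imports enlarge the stock without appearing in the new registrations.
gamma_with_imports = estimate_lifespan(stock * 1.3, registrations)
```

In this toy setting the estimate recovers the true lifespan for the closed fleet, while a stock inflated by unrecorded inflows is attributed entirely to longer lifespans.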
The newly proposed simplified CSP estimation method is based on correlations among fitting parameters and between fitting parameters and other aggregate indicators. While the correlations for γ, δ, and µ have already been illustrated in the main manuscript, an analysis of the impact of setting β and σ to constant values follows here.
Weibull shape parameter β. Based on an analysis of 17 countries, Oguchi & Fuse [54] suggest that the Weibull shape parameter can be set to a constant value of 3.6 without major impacts on the average lifespan and the R² of the CSP fits. We choose the 20 countries with highest stock data resolution to verify this finding. For AUT, BEL, CZE, DNK, EST, FIN, DEU, GRC, HUN, LVA, LTU, LUX, NLD, PRT, ESP, GBR, ISL, LIE, NOR, and CHE, we have stock data resolved by age for at least the first 25 age bins and for most cases for 1-100 year-old cars. The average Weibull shape parameter for this high-resolution data set is 3.5 (std dev: 1.1, min: 2.0, max: 6.0), very close to the value of Oguchi & Fuse [54].
We validate the sensitivity of a variation of the Weibull shape parameter β in two ways: first, parametrically by varying β from 2.0 to 6.0; second, by using a constant β of 3.5 versus the optimal parameters for each country (see Table 1). We measure the impact of these two sensitivity analyses on the Weibull average lifespan γ and the R² of the Weibull fit for the full data set of 31 countries.
1. Parametric variation: γ is 1.3 years higher (std dev: 1.9) for β = 2.0 compared to β = 3.5, and 0.8 years lower (std dev: 1.5) for β = 6.0 compared to β = 3.5. The effects on the R² of the Weibull fit are below 0.05 in both directions (β = 2.0 and β = 6.0).
2. Comparison with the optimal β: Compared to the optimal β values for each country, setting β to 3.5 yields Weibull average lifespans that are on average 0.8 years higher (std dev: 1.5), and R² values that are on average 0.02 lower (std dev: 0.03).
All variables observed are only slightly affected by a change in the Weibull shape parameter. Note also that outlier β's are mostly caused by low stock data resolution. Hence, we can confirm the finding from Oguchi & Fuse [54] that β can be set constant, in our case to a value of 3.5.
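The parametric variation can be sketched on synthetic data as follows. This is a minimal illustration and not the fitting code of the study: it assumes a Weibull curve parameterised by its mean lifespan γ, uses a simple grid search instead of a dedicated optimizer, and generates hypothetical CSP data with γ = 18 years and β = 3.5. The resulting numbers are therefore not comparable to those reported above.

```python
import math


def weibull_csp(age, gamma, beta):
    """Weibull survival curve whose mean equals gamma (assumed parameterisation)."""
    return math.exp(-((age / gamma) * math.gamma(1.0 + 1.0 / beta)) ** beta)


def r_squared(observed, fitted):
    """Standard coefficient of determination of a fit."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_gamma(ages, observed, beta):
    """Grid search over gamma (1 to 60 years, 0.05-year steps) for a fixed beta."""
    def sse(gamma):
        return sum((o - weibull_csp(a, gamma, beta)) ** 2 for a, o in zip(ages, observed))
    return min((i / 20.0 for i in range(20, 1201)), key=sse)


# Synthetic CSP data (hypothetical): gamma = 18 years, beta = 3.5.
ages = list(range(0, 41))
observed = [weibull_csp(a, 18.0, 3.5) for a in ages]

# Refit gamma with beta held at the lower limit, the constant value, and the upper limit.
results = {}
for beta in (2.0, 3.5, 6.0):
    gamma = fit_gamma(ages, observed, beta)
    fitted = [weibull_csp(a, gamma, beta) for a in ages]
    results[beta] = (gamma, r_squared(observed, fitted))
```

Comparing the entries of `results` for β = 2.0 and β = 6.0 with the entry for β = 3.5 gives the shift in γ and the loss in R² caused by fixing the shape parameter at a non-optimal value.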
Gaussian standard deviation σ. The mean Gaussian standard deviation σ of the set of 7 high-resolution countries is 5.2 (std dev: 0.6). [3] Taking all 31 countries into account (except for HRV, which is an outlier due to a low stock data resolution) yields an average σ value of 5.6 (std dev: 0.7). Therefore, we suggest that the Gaussian standard deviation can be approximated with a most likely range of 5.0-6.0. As for β, we proceed with the value from the countries with highest data availability, i.e. 5.2. Due to the low standard deviations, no sensitivity analysis is needed.
[3] Note that from the 20 high-resolution countries, only 7 have a non-zero Gaussian contribution.
Appendix I: Time variability of CSP fitting parameters: Case study for Latvia.
CSP curves change over time. To exemplify the need for further research on how CSP curves evolve over time, we analyse the CSPs (Figure S29) and corresponding Weibull + Gauss fitting parameters (Figure S30) for Latvia for 20 consecutive years (2000-2019). Note that the availability of age-resolved stock data for such a long period of time is very rare.
We see that the Weibull average lifespan shows a spread of almost 10 years in the given time period. The drop in the average lifespan from 2009 to 2010 is highlighted with a dashed line in both panels of Figure S29.
During the financial crisis of 2009, a lot of old cars in the fleet were scrapped or deregistered, possibly to avoid operational costs such as insurance, which might be an important consideration for corporate car fleets. This causes a drop in the average lifespan, even though there was no scrappage premium in Latvia at that time. The Gaussian mean is also affected considerably, which can be seen from the shifting peak of the CSP data in Figure S29.
Figure S31 Fits to cumulative survival probability (CSP) data for selected countries and different stock data sources: ACEA [24], UNECE [25], and national statistics (NatStat, sources see Table S1). Fitting parameters are the Weibull average lifetime, γ, the Weibull shape parameter, β, the Gaussian maximum, δ, the Gaussian mean, µ, and the Gaussian standard deviation, σ.
Figure S32 Fitting parameters for cumulative survival probability (CSP) data for countries with a high stock data resolution from national statistics (NatStat, see Table S1) in comparison to fitting parameters for CSP data based on stock data from ACEA [24] and UNECE [25]. The best data source (highest data resolution) is marked with an 'x'. Fitting parameters are the Weibull average lifetime, γ, the Weibull shape parameter, β, the Gaussian maximum, δ, the Gaussian mean, µ, and the Gaussian standard deviation, σ.
The accompanying Excel file contains the following data for 31 European countries:
• Final survival rates, computed using Eq. 6.
• Age-resolved stock data.
• New registration data.
• Data on total imports and exports of used cars.
• Age-resolved data on imports and exports of used cars.