Assessing Robustness in Multimodal Transportation Systems: A case study in Lisbon

are exposed to disruptions caused by malfunctions, accidents, maintenance, reduced fleet, and disasters, compromising mobility. In this context, the multimodal planning and management of transport networks can be explored to increase its robustness against these events. In this context, this research paper proposes and empirically compares methods to assess the robustness of a multimodal transport network, looking at aspects regarding the single-mode and multimodal network topology. We hypothesise that the appropriate multilayered and traffic-sensitive modelling of a multimodal transport network can help characterise robustness and further unravel vulnerabilities related with the integration of different transport modes. We we show that the strategies that depend on recalculating metrics are generally more effective, with the exception of a particular case of edge removal using betweenness centrality to maximise IC’s, which counter-intuitively had better results for the IB (initial betweenness) strategy. Even though we were able to postulate on this phenomena, further research needs to be done to understand it. We also showed that the resilience tests needed to remove about half the nodes of the network to leave all the remaining nodes wholly disconnected. This is a phenomenon that happens in all layers and the multilayer network as well, suggesting that betweenness targeting is he best way to measure robustness across the different strategies. We also verified higher assortativity phenomena in multilayered networks, in contrast to single layers, highlighting the importance of intermodal hub redundancy.


Introduction
Around 55% of the world population lives in urban areas (UN 2018). However, such areas represent only a minor percentage of the total surface area. Therefore, daily commuting patterns and mobility services' consumption impact factors such as the usage of natural resources and impact time in transportation, pollution, and quality of life, among others, yield extreme importance in sustainable development goals (Aparicio et al. 2021a;Bernardo et al. 2019;Aparicio et al. 2021b). In this context, numerous studies attempt to optimise transportation systems planning and usage in urban scenarios (Campbell and Woensel 2019;Lu et al. 2004;Migliore and Catalano 2007).
Among the studied solutions, the importance of a multimodal transportation system arises due to the high demand for some traffic corridors. According to Heinen and Mattioli (2019), an improved articulation of multimodality aspects within a transport system is significantly associated with lower carbon emissions. This assertion leads us to believe that successfully coordinating the interfaces in this kind of network is a viable solution for promoting carbon neutrality in the long run. However, according to Clifton and Muhs (2012), the study of multimodal transportation systems "has historically been underrepresented in travel surveying efforts. This lack of consideration has implications for widely accepted statistics for nonmotorised travel behaviour (walking, bicycling, among others). It affects researchers and professionals in travel modelling, urban planning, public health, and urban design". An integrated view of how specific failures affect the overall multimodal system is also relevant since it is fairly common to have planned or unplanned disruptions (i.e. construction works, technical failures, attacks and natural disasters),causing delays and possibly inhibiting the usage of a pathway or station (Cats et al. 2018). This observation raises the need to assess how robust these transportation systems are. Therefore, our main research objective is to assess robustness in multimodal transportation systems in an urban scenario. In the context of this study, this assessment is applied at a topological level.
Making stops along our path from origin to our destination is often required, and the more stops we make, the more time we spend on transportation. So naturally, how nodes and edges failures affect the Average Path Length (APL) is an important research question that we are interested in answering. Since the effectiveness of a path between two stations can be assessed by the number of stops in a trip (the APL). Besides the effects on the APL, understanding how many nodes (or stations) that have to fail to fragment the network into isolated components is a requirement to further understand the impact of a disaster on a station or stop (in creating isolated components). In addition, we also aim to infer the effect of removing edges (pathways) versus nodes (stations) since this helps us to infer if incidents have a higher impact on pathways or stations. To empirically test the simulation-based robustness testing, we aim to understand if metrics should be recalculated after each extraction, as well as the measurement of the impact of cascading failures such as what happens if a station The current study contributes to increase the knowledge on robustness assessment of multimodal transportation systems. Hence, we do a comprehensive review on how to model multimodal transportation systems and quantitatively assess their robustness. Another contribution of this study is the possibility to support public transportation policies focusing on actionable priorities for a more resilient system design. For instance, where should the focus of actionability should be in case of an accident on different stations or pathways. By understanding which elements are more important for network integration.
For this research, the city of Lisbon is used as an example to improve the integration of the current multimodal transport network. This analysis is fundamental in the context of the Lisbon Metropolitan Area (LMA), where the average occupancy rate for individual private transport (cars) is 1.60 passengers per vehicle, and the daily traffic inflow in the county of Lisbon is also the cumulative result of commuting traffic inflows between Lisbon and the eighteen municipalities that integrate the LMA (INE 2018). Several road traffic corridors are flooded daily with single occupancy vehicles that could possibly use other public transport alternative. Despite the stable establishments and the integration of the operators' systems with a single card, challenges to the integrated operation and multimodal planning of the public transport network still persist within LMA.
Additionally, as noted previously in a general sense, it is essential to note the impact of natural hazards to disrupt a working transportation system as well. Previous literature (Ramos, Zêzere, and Reis 2010) denotes the natural hazard risk in different zones of the LMA (see Figure 1). This map shows us the spatial distribution of risk for earthquakes, flooding and landslides for the county of Lisbon. This stresses the importance to understand various risk impacts on the structure of the network. Along with the general risk from natural incidents, other incidents through the city have a higher rate of incidence, these usually have a more localised effect (Elvas et al. 2021). Such incidents may be caused by faults in energy and water suppliers, traffic accidents and fires that disable stations and pathways. Elvas et al. (2021) rendered a map of incidents from such a diverse standpoint (see Figure 2). This shows us that a study on the impact of network robustness at a local level is relevant.
Since the Lisbon public transportation network is susceptible to these failures that can affect the daily commute of its users, a way to guarantee and improve the robustness of this critical component of our public services is crucial. To tackle this problem, we aim to understand how robust the Lisbon public transport network is. This assessment starts by quantitatively measuring robustness (Zhang et al. 2015). Since the Lisbon public transportation network is also complex, it is essential to look at how different stations and stops may have similar properties, bridging different lines (routes as linear road/rail/other mode infrastructures) and transportation modes. Topological attributes and concepts of network resilience, besides simulation, will play a role in this research study. We ask ourselves, which nodes (e.g. stations) are central and what the role of degree-degree correlations is. This is, if central stations actually connect to other central stations in this particular topology, since this may be a cause for a higher fragility (Zhou et al. 2012). Also, looking at how different extraction strategies (or simulations) for quantifying robustness affect the connectedness of the network will shed some light on which method has the highest impact on breaking the key pathways that keep the transportation system usable throughout the city.
The above concepts will be applied to the Lisbon's multimodal transport network in the context of the ILU (Integrative Learning from Urban data and the situational context for city mobility optimisation) project. This project joins research institutes, the Lisbon municipality and public transportation operators to promote a more efficient and sustainable mobility.

Literature Review
Robustness can be understood as the quality pertaining the capability of a system to work under disruptions (Immers et al. 2002). Robustness has been alternatively defined as an optimization problem (Kouvelis and Yu 2013). A robust solution can be seen as the one that has the best performance under high stress conditions. To further understand this property of the system perturbation, the literature review shows that experiments are a dominant approach, which can be done in silico (using computer simulations) (Masel and Siegal 2009). In the context of a multimodal transportation system, we can inherently see the redundancy since the transportation modes usually provide alternative routes for the same places.
Previous studies indicate that the first step to measure the robustness of a transportation system, is to model the transportation network. Accordingly, the modelling of multimodal transportation systems recurring to multilayer networks has been previously proposed to analyse urban transportation systems (Aleta, Meloni, and Moreno 2017). This allowed the analysis of superlayer interdependence. Multilayer networks are also used to model air transportation networks. This can, for instance, be used to study how the degree-degree correlations of a network affect the spreading of an epidemic (de Arruda et al. 2016). The modelling of different types of multimodal transport networks have been done in the context of different empirical studies as shown in table 1 (Zhang et al. 2020). Note that air travel can be modelled as a heterogeneous mode of transport, as different air travel companies constitute different layers of a network (Fernandes et al. 2020).  Derudder et al. (2014) studied connectedness in India using community membership and betweenness centrality. Varga (2016) applied a weighted multiplex to air traffic to then make a network analysis with metrics such as average degree and assortativity. Du et al. (2016) modelled an air traffic network in layers based on flight frequency and concluded that is less redundant in the Chinese network than the worldwide network.

© AET 2021 and contributors
De Arruda et al. (2016) modelled an air traffic network and assessed assortativity and rich club effect (nodes with high degree connect to nodes with high degree) in international flights. Aleta et al. (2017) modelled the Madrid urban transport network, measured overlap between different modes and transfers and time from one station to another. Ding et al. (2018) studied how the topology of the network affected the formation of communities and how betweenness centrality and closeness centrality affect growth. Liu et al. (2018) create a methodology to assess the spatial accessibility of public transportation network using a case study in Shanghai, China.
To answer the need for measuring resilience of networks, Klau and Weiskircher (2005) noted in their robustness and resilience review that a network should be considered resilient if it sustains a high number of node failures before it becomes disconnected.
In that sense, performing node percolation tests is suggested as a way to measure resilience. Measuring this property in transport networks has been the focus of several studies. Sulivan et al. (2010) has evaluated system-wide robustness by finding critical isolating links in road networks. They reduced the link capacity and measured the travel time as a means to measure robustness, this allowed for a dynamic perspective as well. Later on Zhou et al. (2017) have conducted a similar study, but induced the vulnerabilities by blocking lanes instead, same as deleting an edge, which ranked the links by how critical they were building upon the much earlier work who did just that (Scott et al. 2006).
These studies are related with the earlier studies on reliability by Chen et al. (2002) , where they measured the reliability of networks given the demand change based on models of route choice. This study extended capacity reliability to network equilibrium models as well as accounts for route-choice actions of drivers. Later Al-Deek and Eman (2006) measured the reliability using network capacity and travel time by changing demand and inducing link degradation and non-persistent congestion. These studies show how different demand levels affect the capacity and reliability in an analytical way. It is important to note that reliability is fundamentally different to resilience. Reliability is usually a measure of susceptibility to the interruption of service. Resilience, on the other hand measures the ability to recover from such interruption in service (Clark-Ginsberg 2016). Robustness on the other hand can be seen as a dimension of resilience and as a cause for reliability.
Studies of network resilience have also been done in the context of multimodal transport. Montes-Orozco (2020) recently showed that the same idea of percolation could be used in multiplex networks. This notion opens this test as a way to accurately measure robustness of multimodal transport network.
Another view on resilience measurement on multimodal transport networks is given by Bocewicz (2014) who postulates that a whirlpool network structure on a multimodal transport network leads to a cyclic steady state unlike a tree structure, under the flow simulations proposed. A whirlpool state is one that points towards another, and the following states recursively point to the following until it completes a circle. Then the measurement of robustness is given by the ration of states in a whirlpool structure and the total number of states in the multimodal transport network.
Besides a topological resilience test, a dynamic measurement of such property has also been studied in multimodal urban transport. Stamos et al. (2015) studied resilience to extreme weather events (EWE) in several European cities by measuring the percentage of change of working segments during the EWE. Then claimed this could be used for prediction tasks by using past rail, air and road usage data. However, this method lacked actionable information, because prevention is only attainable if we know where to act. Later on, Cats et al. (2017) defined a framework to assess robustness. This helped the description of robustness assessment as an economic problem. This kind of assessment is very relevant for this study because it provides a baseline for understanding the impact of different disruption scenarios in specific links, quantifying their criticality. This is a type of assessment that provides specific enough data to be actionable in the real world, showing that number of critical links distribution follows a left skewed behaviour ( Figure 3). This is relevant since we can see that most links are not that critical, and others are very important to keep the network connected. By identifying such links and providing redundancy, priority policies and other measures, the robustness of the network can actually increase.

Methods
To answer the questions stated in the introduction regarding the topological robustness we use different strategies. As we have seen there are several approaches to assess robustness, being one of them removing nodes or edges from a network to measure that the impact as a simulated station failure (Montes-Orozco et al. 2020).
The simplest way to do this is by injecting random failures in the network, i.e., randomly removing nodes from the network. It is expected that this simple strategy may not maximise the damage on the network, but it can be a reasonably realistic approach since the Lisbon public transport network may have station failures due to unpredictable and unexpected events.
Other more precise strategies were defined by Holme et al. (2002), and we followed their strategy. We also simulated failure of nodes and edges to be able to compare, which yields the highest impact, as the optimization stress handling can be part of the definition of robustness as seen previously. These more precise node extractions (simulated station failures) utilise attributes from the network, such as degree and centrality. To test a network with more precision, an effective strategy should be to extract the most important node. This notion of importance can vary depending on the attribute to which we give attention.
We could select nodes from descending order of degree. This extraction is called Initial Degree removal -ID removal -because we base the extraction on only the initial calculation of the degrees, and we do not update it after removing nodes. Another critical attribute is the centrality of a node. There are a few types of centralities, but the betweenness centrality is crucial since it represents the nodes that have edges that unify communities/groups inside the network. These nodes are the ones with highest criticality for network connectivity since they connect the communities of the network. This extraction is called Initial Betweenness removal -IB removal -like ID removal, it only calculates the initial betweenness centrality of each node at the beginning and does not update it.
Both extractions only calculate the attributes, degree distribution and betweenness centrality, at the beginning of the algorithm. When we remove nodes from the network, it changes the degree of the neighbours of the removed node, and it modifies the betweenness centrality of all the nodes. Two new strategies appear called Recalculate Degree removal -RD removal -and Recalculate Betweenness removal -RB removal. These strategies are identical to the ID and IB ones except they update/recalculate the degree and the betweenness centrality after each node/edge removal. The RD and RB algorithms are considerably more time and computation expensive for each interaction, but we wanted to study if they are more efficient at disconnecting the network. As efficiency in this case is a way to simulate the highest stress levels.
ID and RD removal are local strategies because when a node is removed, it only changes the degree of the neighbours of that node. Thus, only these local nodes need to be updated. On the other hand, IB and RB removal are global strategies since by removing a node, we need to recalculate the betweenness centrality globally, i.e., of all nodes.
Every calculation of the betweenness centrality of the network costs around 5 minutes, and since we must apply thousands of recalculations (one per node removal) this would be very time consuming and less sustainable. Because of that, we decided only to recalculate the betweenness centrality when batches of 5% of the nodes were removed. Being N the number of Nodes and E the number of Edges, the ID algorithm is O(N + E), the RD is O(N(N + E)), the IB is O(NE) (using a faster algorithm for betweenness centrality created by Brandes (2001)) and the RB is O( ! E).

Modelling
Most worldwide public transportation networks are multimodal. Therefore, these are modelled as a multilayer. Moreover, we also wanted to test if removing nodes that connect layers of the graph is critical, i.e., if it disconnects the network faster. We call these nodes multimodal hubs and this extraction strategy is Multimodal Hubs removal. For example, two multimodal hubs are Alameda, that connects Metro and Carris, and Gare do Oriente, that connects Carris, Metro, and CP. It is worth noting that the Multimodal Hubs removal strategy only makes sense when evaluating the multilayer network. When evaluating each modality, this strategy is invalid because there are no multimodal hubs.
Removing nodes from this network may correspond to a stop being deactivated or being in constructions, while removing edges may correspond to cutting roads or accidents preventing any vehicle from passing through. Edges can also fail. All our node extraction strategies can be applied to edges with a few modifications. For example, how can we get the degree of an edge? The edge degree depends on the nodes that are connected by it. We used the following four different methods to calculate edge degrees: where ke is the edge degree and kv and kw are the degree of the nodes that are connect by the edge e. Since our network is directed, we also assume that v is the source and w is the destination node of the edge. However, there are other ways to calculate the degree of and edge such as ke = kv or ke = kw.
In accordance with the study of Holme et al. (2002), where they concluded that method A. ke = kv * kw was the best fit to the majority of the network types that they studied. Method A showed the best correlation of the four methods between the edge degree and the edge betweenness centrality. It is expected that the node with a higher degree should also have a higher betweenness centrality. To calculate the betweenness centrality of an edge, we also used the Brandes (2001) O(NE) algorithm.
We can expect that the extractions based on centrality measures may be more harmful than the random failures. Concerning the nodes vs edges, we believe that there could be a small difference between these extractions, but this is hard to predict. Degree based strategies probably will split the network into many subgraphs of vertices with low degrees. Betweenness centrality strategies can create clusters that are highly connected since they tend to destroy the bridges that connect communities first.
We understand that not only single nodes fail but also more catastrophic events can happen in the network, bringing the notion of cascading effect. For the cascading effects, we described and implemented two extraction strategies.
The cascading effect simulating a Line Failure removes an entire transportation line in the network. This could happen with an underground implosion that made the whole line stop. We simulate this by removing all the nodes that have the attribute of that line. Some nodes may belong to more than one line, even from different modalities. For example, Gare do Oriente stop is a multimodal hub that belongs to Metro, Carris and CP (railway company).
The cascading effect simulating a Neighbours Failure removes neighbours from several layers of a failed node. One example of this in the Lisbon public transportation network could be a recursive failing of transports in the same spot. When a station fails, the neighbour stations could experience overflows because of relocated traffic and this could potentially cause them to fail as well. To simulate this type of cascading effects, we collect several layers of the node's neighbours, and then, remove all these nodes at the same time. An example of this could be a flood that affected all transport modes in a particular spot of the city to a point of not being usable. In conclusion, we used the six following extraction strategies for nodes and edges: 1) Random removal 2) Initial Degree removal (ID) 3) Initial Betweenness removal (IB) 4) Recalculate Degree removal (RD) 5) Recalculate Betweenness removal (RB) 6) Multimodal Hubs removal We also used the 2 following extraction strategies for cascading effects: 1) Line Failure 2) Neighbours Failure

Validation
To assess the impact of each strategy, we explore the impact of each removal based on different metrics thought the simulations. To understand how connectedness is affected, we measure the average path length (also known as average geodesic distance), average degree, number of isolated components, and the size of the strongly connected component (de Arruda et al. 2016;Derudder et al. 2014;Varga 2016). To understand how the average path length is impacted by each removal, we only average out the shortest paths that still exist in the network. This approach allows us not to need to calculate the inverse path length, %& . As the study of this network is applied to a multimodal transportation network, we also measure the impact of the described extractions on the use of multimodality.

Data extraction and pre-processing
Firstly, we identified criteria used for the inference of the multiplex network, followed by an analysis of the robustness of the topology. To transform route planning from GTFS data files of each mode into a network, we joined the routes and the stops of each one of the transportation modalities: Carris, CP, Fertagus, Metro, Rodlisboa, Sulfertagus, Transtejo and TST (bus, railway, riverway, subway, tram companies'). This merge contains the information regarding the id, geographical position, sequence within the line, and name of the station. This allows us to verify which stops are connected to one another. An extract of this file can be found bellow:
Given the structure of the data, we created a directed graph (digraph) for each transportation modality (bus, railway, riverway, subway, tram), forming a layer for each. We resorted to digraphs for each layer because transport does not always flow in both directions within the same path. This allows us to create each layer of our multilayer network. We apply a multilayer representation hence the edges of different layers have different types that represented different realities. Modelling such characteristics is not possible with a single layer (or monolayer) network.

Figure 4. Multilayer Lisbon transport network topology on a 3D representation with all the layers
After the representation of each layer, we must account for possible multimodal interactions, i.e., the possibility to shift to different transportation modes within a trip. These links are of extreme importance because they allow us to assess the connectivity of the transportation system as a whole. To accurately understand where these edges could be located, we created a script that extracted the Lisbon city map and calculated the walking distance between every two stations and combined it with a standard coordinate distance calculation to get faster calculations. After getting this result, we selected the pairs of stations from different modes and linked them based on distance. The intermodal interfaces are about 0,0040% of the distances calculated.
We connected the eight modes of transportation in the city of Lisbon using the described method. By programming a 3-dimensional method, for visualising such network, using the geographical information available, we were able to see its structure ( Figure 4). This network is composed of 8 layers, 7972 nodes and 11892 edges. The full network required the computation of 46 399 492 distances to join all the layers with the multimodal connections.

Network Analysis
The number of nodes and edges per layer is disclosed in Table 2. It is clear that the distribution of stations is not equitable across layers. Additionally to the edges in each layer, 1502 edges represent multimodal changes. The average in and out-degree are approximately the same at 1.4917, which means that the majority of the edges are reciprocally directed. As we are studying a directed network, its essential to assess the strongly connected components (SCC's) since it measures the numbers that connect with each other regarding the direction of each pathway. These are graph partitions, where the nodes are connected through a path. It is quite normal to have many single-node SCC's in unidirectional lines. This is precisely the case in this network. We observe 417 SCC's in the whole network. As some layers such as METRO and CP have only bidirectional relationships between nodes. An example that would cause a higher number of SCC's would be for instance an set of isolating line from the rest of the network by eliminating a critical station. Lines that are naturaly isolated within the present topology happen in

How many stations are connected with each station?
In this multilayered network, we can examine the kind of degree distribution we have, shown in Figure 6. We can see a high tendency of having both the in-degree and outdegree equal to one (part of a line) and two to four (stations that have intersections of lines of one or more modes). We can also see some nodes with an out-degree of 0. These are start and end stations respectively that have no reciprocal edges on the opposite direction. Stations with in-degree and out-degree equal to one can be caused by a significant amount of stations that flow only in one direction, or career end stops have different identifiers depending on orientation. This fact is also explaining the high number of low distances in the figure above. The in-degree is always very close to the out-degree. Since we are looking at a transportation network, this tells us that the majority of the connections between stations flow both ways, forming what is known as a chain (or a line), even though there are apparent exceptions (stations that connect with many stations) as we can see on the tail of the plot with a log scale.

How many stops do we have on average between each station?
To answer this question we look at the average path length in number of stops between two stations (not the length of the path itself . However, each layer covers a smaller area than the composition of all the layers. In the case of TST , the APL is higher than in the composition of all the layers. This means that multimodality can be useful to avoid many stops.

Which stations connect travellers from different parts of the city?
In the case of transportation networks, it is interesting to measure the betweenness centrality to understand which nodes connect different communities of stations, i.e. sets of interconnected stations within a region. In the multilayer network, we identify some stations that have a very high centrality (See Fig. 7), these are mostly from TST. This may be a sign that TST is kind of bridging layer in some zones. Some examples of such bridging stations are Lisboa Gare do Oriente (0,2529), Setúbal Ciprestes (0,1639), Lisboa Alcântra (0,1592) at TST. METRO also has three stations with exceptionally high betweenness centrality, including Campo Grande (0,1813), Oriente (0,1504) and Cidade Unversitária (0,1480). Rod Lisboa also has a station with a very high betweenness centrality, Lisboa Campo Grande (0,1879). With this simple analysis, we can see that both Lisboa Campo Grande and Lisboa Oriente are multimodality hubs that bridge across different layers.
The left skew on the betweenness centrality may indicate the same type of distribution on node criticality since the betweenness centrality on the node measures the number of shortest paths that include that node. So these results are similar to the ones found in the literature (Cats et al. 2017).

Are the central stations directly connected with one another?
To understand the role of degree-degree correlations, we look at degree assortativity to understand if central stations are connected to one another. This measures the similarity of connected nodes concerning their degree. Arruda et al. (2016) noted that in multilayer networks, degree-degree correlations should be measured system-wide.
In that same study, they generalise this concept for these types of networks and applies it to an airport transportation multilayer network. There they also note that the rich-club effect is, in fact, present in such networks, masked due to the high number of peripheral nodes that connect the hubs. However, intralayer, the networks tend to be disassortative as they focus on one specific region.
We observed a similar behaviour from Arruda et al. regarding the assortativity, even with with the differences of reality being represented (air travel versus urban transportation). We calculated the assortativity for the multilayer network and for each layer, observing the same results. In Figure 8, we see the same pattern described by Arruda et al. (2016). We can observe a much higher assortativity in the multilayer network than in any of the single layers. This is reasonably simple to understand since the assortativity is influenced by the high number of multimodal hubs that connect to one another. Analogous to the properties found in past case studies, we may find a kind of a rich-club effect, that may not be there because the many peripheral nodes. This means that between the central stations within the layer there are less central stations connected to more central stations. Since this is not the focus of this work, it will be left for future studies.

Impact of different extraction strategies on network connectedness
In this section, we attempt to answer this question by analysing the behaviour of the size of the largest strongly connected component -SCC -over time for the duration of the simulation. By implementing the different extraction strategies discussed previously. For each iteration, we compute the largest SCC and its size. This size decreases with the removal and if a critical station is removed the size decreases even faster. Ergo the strategy which has a faster decrease has a more significant impact on the network. This means that networks that exhibit a steeper decrease sooner, as the percentage of iteration steps, are less robust. In the same fashion, a high-performance extraction strategy is one which the SCC descends faster.
To compare different strategies, we use the discrete Area Under Curve (AUC) and a normalised AUC to compare the resilience of the different layers and the multilayer network. The normalised AUC ⊂ [0,100] allows us to compare the values of networks with different sizes, and calculated using the following formula: where χ is the number of steps of the simulation, τ ' is the size of the largest strongly connected component at timestep , and is the number of nodes of the network. This measurement allows us to compare the different resilience side by side. It is important to note that this metric might have a higher variability (for both inflation and deflation) in smaller networks given the granularity. Observing table 3, the most resilient layer is FERTAGUS, and the least resilient layer is RODLISBOA.  Table 3 shows that the RD and RB strategies usually yield the best results across all layers. In other words, this is more often than not the best strategy to measure robustness. Nevertheless, the RB strategy tends to have a faster descent among the different layers as we can see in Figure 9. Hence, the failure on intermodal hubs that have a high degree has devastating effects in the multimodal topology. Moreover, the IB and Random strategies seem to have the least impact on the size of the largest SCC. This is expected and it is a good baseline to understand that the targeting techniques work to measure robustness. There seems to be no strong reason why ID strategy has a better result than the RD. However, this is the case in SULFERTAGUS, and it should be investigated further. However, the degree is clearly not the targeting metric that yields the most effective stress strategy so this may be due to random removal of stations with same degree and lower betweenness. We postulate that this happens because there may be fairly large components that have nodes with a high degree; however, removing nodes from these components does not affect the size of the largest component. So, this phenomenon probably has more to do with the metric we are using than the strategy itself. So, this is means that single mode networks are more robust to targeting in high degree stations. In Figure 9 we observe the RD strategy yields a faster decrease, this checks out with the Normalised AUC with a lowest value. In the 3rd graph we observe that we still need many more edge removals to get the same result of size of largest SCC, as in node strategies.

Impact of node and edge targeting on average path length
To understand the evolution of path length when targeting stations and pathways, we calculated the APL only for existing paths along with the network. So, if there was no path using the transportation system between two points, this was not accounted for.
It is important to note that this strategy may not be the best to measure robustness on road transportation since alternative paths may be available on roads that are not on the normal route. So, we expect the APL to reduce along each time step quickly. We ran the result for each layer and the whole network as well.
Regarding the results for station and pathway targeting of the CARRIS network on Figure 11, the RD is the best strategy. This means that removing the stops with the highest degree has the highest impact on the length of stations one can reach. Note that may change with alternative routes and redundancy techniques. The remaining strategies have about the same efficiency, with RB being slightly ahead towards the end of the simulation. This result is not unanimous, as we can see in Figure 10. As we see, RB can either be the best or the worst strategy, depending on the network. This can be due to nodes with high betweenness centrality being part of paths that do not have alternatives. RB providing an efficient strategy indicates a lower network robustness. This is the case because it is a topological fragility to have most of the shortest paths within a network going through a single station. In the multimodal network with all the layers (Figure 11) we see that RB is a strategy that promotes a fast decrease the APL. This indicates a lower network robustness. In the multimodal network we see that many shortest paths go through the intermodal hubs, allowing parts of the city that have a single mode to be connected with the rest of the city. We can see that the APL decreases much faster in the RB than in the RD for the multilayer, unlike in the CARRIS network. We can also see a slight increase at the beginning of the simulation (only for other extraction strategies), this is due to the removal of less critical stations. The number of removals needed to get the same APL is much higher for edge strategies, meaning node removal is more effective for decreasing APL.

How many nodes or edges do we have to delete to fragment the network into isolated components?
An isolated component is when a node loses all its edges. Since our graph is directed, when we talk about all the edges, we are mentioning both the in and out edges. To be able to answer this question, we used the extraction strategies proposed on the methods and evaluate the evolution of the network by showing the distribution of the

Carris Multilayer
Iterations Node removal Edge removal APL isolated components. To accurately understand the extraction strategies, we ran them across every layer in isolation and then in the complete network, except for multimodal hub removal on isolated layers because there are no multimodal hubs in isolated layers. Figure 12 shows the evolution of isolated components for each strategy in the CARRIS layer. In these graphs we clearly see that RD had the best results, this is the only one that stopped before the end of the simulation, because there are only isolated components when it stops. We obtained about the same results in every layer and for multilayer network is also very similar.
We also find that all extraction strategies, except RD, increase and then decrease the number of isolated components ( Figure 12). This means that after we reach the maximum number of IC's, we are only removing isolated nodes. This means that the lower the maximum and the later its reached, the less effective at measuring the robustness to isolation of different parts of the city the strategy is. Halfway through RB, we start to see rapid growth. This is because after removing the main bridges from communities, there still are redundancy paths. Once these paths start to be removed, it is simpler to disconnect each community individually. It should be noted that it is likely that a node with a high degree also has a high betweenness centrality since multimodal hubs are often points that connect communities. However, this is not the case the other way around since less connected stations may be very important at connecting different parts of the city. The IB extraction is the worst because we are dividing the problem infinitely. This is virtually dividing the path from A to B into 2 recursively. In summation, the only effective strategy to measure the vulnerability of disconnecting the network from different parts of the city is RD.
So, to answer the highlighted question in this section, we look at the number of iterations when RD ends the simulation. This is the number of nodes that have been removed when all the remaining nodes are isolated. The higher the percentage of nodes that need to be removed from the total of nodes in this network, the more robust the network is, see Figure 13. We can see that the robustness across all networks using this metric is about the same. To understand the multimodal hub removal, we recurred to another graph due to scale and purpose. As the graph reaches a pique and then decreases, we can see that we do not need to remove all the multimodal nodes to get isolated layers. As we can see, there is a slight variation when some nodes are removed. This means that removing specific nodes has more impact on the interlayer connections ( Figure 14). We can see that node removal is much better than edge removal, that makes sense since in edges we only separate in at most 8 IC's (single layers). This is not directly comparable with other strategies, as it only encompasses the removal of about 10% of the nodes.
However, from what we can assess, in the beginning, the growth in the number of isolated components follows approximately a linear function, which is slower than RD. However, we can safely conclude that redundancy of intermodal hubs is the best way to ensure connectedness in this multimodal network, since not all layers are redundant in all areas of the city. Redundancy on all areas of the city would be much more costly and may even cause other issues like consuming space of the already established routes.
On the other hand, there are edge extractions. In this case, it is expected that we have to remove all the edges to have all nodes isolated, Figure 15 shows exactly that. The behaviour of extraction per layer and in the multilayer network is identical as can be seen in both graphs of Figure 16. Contrary to node targeting, the worst strategy is RD, and the best strategy is IB. The RD is always the worst strategy, this is the case because we are removing edges from nodes with high degree. We can conclude that the betweenness centrality is a more adequate metric for edge targeting and degree is better for node targeting. This means that removing single pathways that are in many shortest paths has a higher impact than removing single pathways that lead to high degree stations. However, deactivation or failure of high degree stations have the highest impact. High degree stations are the ones that are usually more central geographically and have a more critical role in connecting different modes of transport.

Impact of removing nodes vs edges
We found that nodes extractions are more efficient than extracting the edges, as we need more removals to get the same increase of IC's and decrease of APL and decrease of SCC. This means that removing the high degree stations has a much higher impact than removing every single pathway that leads to each important station individually. As previous Figures indicate, it takes much more iterations to disconnect a network by extracting the edges. This happens for various reasons. Expectedly, there are more edges than nodes, so more iterations are needed to create the same impact. Another important reason is that when we remove a node, we also are removing all the edges of that specific node, and the contrary is not the case. This happens because an incident in a pathway to a station does may not impact the station and an incident on a station may more probably impact the pathways through it. This makes the node removal more damaging to the network.
On the SCC and the APL evaluation, we also see that edges' extractions are very similar to each other and also less efficient at disconnecting the network than nodes removal. We can conclude that the network has various redundant paths. Hence, removing edges, we can go to the same destination by going through more stations.
It is also important to note that in the IC evaluation, the ID and RD removal extractions on the edges are not as efficient as in the nodes and have a slow impact. With a slow impact, we mean that we need to remove more edges to create the same damage as the node removal. This happens because, when we base the extraction on the edge degree, we are extracting edges of hub nodes (nodes with high degree). This means that we are removing edges from nodes with lots of edges and so it does not show the effect at first.

Should we recalculate the degree and betweenness after each extraction?
Often, the random strategy is better than the strategies which do not recalculate the new metrics of the new graph (IB and ID). On the other hand, RB and RD removal strategies can be very effective (Table 3), yet not so computationally efficient. They take much more computational power to compute metrics, especially the RB. As can be seen in previous sections, recalculating degree after each node removal has proven to be beneficial, but recalculating betweenness centrality has yielded relatively stable result for IC's and wildly variable results in APL evolution. Intuitively this could probably be explained by the lack of redundant paths in some networks, which is a fair way to assess its robustness. Both RB and RD strategies have the same similar results across layers and being the best ones to reduce the SSC.
For the edge removing strategies, in the case of the goal being generating more isolated components, recalculating the betweenness centrality yields worst results. These results are curious as they are quite different from the ones proposed by Holme et al. (2002:1) , "suggesting that the network structure changes as important vertices or edges are removed".
Despite the RD and RB strategy overall having better results for nodes extractions, in massive graphs seems unfeasible to calculate the betweenness centrality per each iteration. Another possible strategy is, instead of calculating per each iteration, i.e., each node removal, to calculate this metric per each percentage of removals. In this way, we are avoiding the complexity, with the trade-off of slightly disrupting the result. This is a suboptimal solution however it is shown to be viable in the extent of the empirical results.

What is the impact of cascading failures?
The cascading failures are a complex topic, and we have only scratched the surface. On cascading failures, we used two removal strategies: entirely deleting a transportation line (e.g. route) and removing the neighbours of a given node (e.g. station). For the first removal, we simulated the crash of each line separately, that is, one line at a time, and measured the number of ICs, then repeat the process for each line after resetting the network to its initial condition. We were able to get the maximum number of 6 separated components from a single line by doing this assessment. This line/route connects Pontinha and Campo Grande, operated by RodLisboa (bus road operator). It's mainly a commuting route that connects passengers living in Pontinha (county within LMA) with the Lisbon city, mainly for home-work purposes.
In our second simulation, we applied the failure removal approach of a neighbour. Line Failure was used as a model for the simulation. We estimated the neighbours of each node, eliminated them, and gathered the metric of the isolated component. We chose three layers of a node's neighbours since the average number was closer to the average number of each line when comparing the two. Odivelas was discovered to be the key point in this method, with a maximum number of isolated components of ten.
We can see that the nodes that connect the Lisbon city with other counties within LMA are critical to the organisation of the city's public transportation system. Nonetheless, this Neighbour Failure technique has much more devastating effects than the Line Failure strategy, with nearly double the network damage (6 vs. 10), but this is based on a small sample. These results tell us that the impact of a localised failure that deactivates that spot in several layers has a higher impact than a line failure in this topology. However, this result may vary depending on the region, since localized failures in remote areas have a lower relative impact.

Conclusions and future works
In this work, we modelled and analysed the structure of the Lisbon's public transport network to measure its robustness according to a multimodal stance. Results allow us to understand the geographic information inherent to the multiplex network by programming a 3D plotting method. After calculating and interpreting some of the network metrics, results indicate that multimodal coordination can help diminish the number of stops in a pathway. After identifying stations that work as bridges between different layers and communities, we then analysed different extraction strategies based on previous work. This is an important step to guide the assessment of classic robustness metrics in the context of multimodal transport networks. The target metrics include the size of the largest SCC (strongly connected components), APL (average path length) and IC's (isolated components), as well as a proposed normalised AUC (area under curve). Hence, we observed that the selection of metrics should be constrained to the goal and type of removal (edge vs node).
Results show that the strategies that depend on recalculating metrics are generally more effective, with the exception of a particular case of edge removal using betweenness centrality to maximise IC's, which counter-intuitively had better results for the IB (initial betweenness) strategy. Even though we were able to postulate on this phenomena, further research needs to be done to understand it. We also showed that the resilience tests needed to remove about half the nodes of the network to leave all the remaining nodes wholly disconnected. This is a phenomenon that happens in all layers and the multilayer network as well, suggesting that betweenness targeting is he best way to measure robustness across the different strategies. We also verified higher assortativity phenomena in multilayered networks, in contrast to single layers, highlighting the importance of intermodal hub redundancy.
Based on the robustness tests we were able to conclude that the most effective method for targeting nodes is RD (recalculate degree). However, in some cases, RB (recalculate betweenness) yielded better results for multilayer APL decadence (both for nodes and edges strategies), although it showed higher variability. This means that the number of pathways to a station is less important than how many shortest paths go through that station in a multimodal scenario when completely disconnecting the network. For decreasing the size of the largest SCC, RD yielded better results for the multilayer network, however, for most of its individual layers, the best strategy was actually RB. This means that to divide a multimodal network into disconnected regions, high degree station failures have a higher impact than high betweenness station failures. However, to yield the same result in a single mode network, betweenness is a more relevant metric, highlighting the impact of the network topology as the vulnerabilities linked to a multimodal network considerably differ from a single layer network.
With the normalised AUC, we were able to compare, side by side, the robustness of each transport operator-specific network, regardless their size. Carris, RodLisboa, SulFertagus and TST show inferior robustness than the remaining networks. The two most robust networks were Fertagus and Transtejo. Nevertheless, this result may be due to inflation on smaller sizes, pinpointing the need for non-uniform scaling factors for the AUC normalization. The multilayer network was more resilient than one-half of the layers but less than the other. We also performed cascading failures in the network to understand some of the expected impacts. Still, since it is a vast and exciting topic, we focused on two major strategies. We showed that neighbour failure is more effective than line failure in this particular network, even though they are moderately similar in the structure of the present network. The impact yield from different cascading events can be the subject of a future study. For example, instead of simulating random line crashes, using simulated failures on more than one line at a time and using a specific measure would be more effective to determine which lines are more critical, like what we've done with degree and betweenness centrality targeting. The second elimination could be investigated further by experimenting with © AET 2021 and contributors a greater number of layers of neighbours. The archetype of these might be a combination of these two strategies, attempting to fail lines that are close to other lines utilising information from each node's neighbours. Last but not least, we identified that further research is needed in the effect of rich club on peripheral nodes on robustness and cascading effects in the context of this network.
The gathered results in this study suggest that robustness can be objectively measured using network metrics and percolation simulations. The impact of such simulations can be compared regardless of the network size or structure in any multimodal transportation scenario. Moreover, research findings seem to point out that we can use the targeting techniques to understand network recoverability (resilience stance) by focusing on stations with hub characteristics (higher centrality) or high betweenness. Results of this study allow practitioners and urban transportation policy makers to tackle the impact of negative disruption in multimodal transportation networks.