3.1 The framework
This study develops a workflow to examine the evolution of a micro-mobility system. At a higher level, we investigate how daily trips vary periodically within a window of several years. Next, we examine the cumulative connections of each station or service point over space and time, which is an indicator of the system’s evolution. A critical assumption is that every micro-mobility system can be modelled as a station- or service-point-based system. The network of a docked system is easy to represent, as the origins and destinations of trips are fixed. For dockless systems, a common procedure is to impose grid cells onto a study area and treat the centroid of each cell as a station or service point. There are also other approaches to represent a dockless system, such as a Voronoi diagram built on a bus stop network [50]. These generated centroids can be thought of as hypothetical stations for a dockless system.
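As an illustration, the grid-cell approach can be sketched as follows; the study-area extent and cell size below are hypothetical, not taken from the data set:

```python
import numpy as np

def grid_centroids(min_x, min_y, max_x, max_y, cell_size):
    """Impose square grid cells on a bounding box and return the cell
    centroids, which act as hypothetical stations for a dockless system."""
    xs = np.arange(min_x + cell_size / 2, max_x, cell_size)
    ys = np.arange(min_y + cell_size / 2, max_y, cell_size)
    return [(x, y) for x in xs for y in ys]

# A 1 km x 1 km study area with 250 m cells yields a 4 x 4 grid of centroids
centroids = grid_centroids(0.0, 0.0, 1000.0, 1000.0, 250.0)
```

Each centroid then plays the role of a station when trips are snapped to their nearest cell.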
The logical diagram of the proposed framework is displayed in Fig. 1. It contains three sequential components: data collection and preprocessing, the modeling of the periodicity and evolution of the system, and geo-visualization of the results.
Data collection and preprocessing. The raw cycling trip data contain noise, so the first step is to eliminate abnormal trips. First, trips made from or to any “testing” stations were excluded from the analysis. Second, trips with excessively short durations (e.g., less than 1 min) were removed; these are largely due to false starts or users attempting to re-dock a bike. Finally, trips with average speeds exceeding the local biking speed limit were removed as well, as they were probably made by restocking trucks carrying bikes between stations. After preprocessing, the outputs are (1) the aggregated number of daily trips and (2) daily traffic networks.
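The three filters can be sketched as below; the station IDs, speed limit, and record format are illustrative assumptions rather than details of the actual data set:

```python
SPEED_LIMIT_KMH = 25.0          # assumed local biking speed limit
TESTING_STATIONS = {"TEST-01"}  # hypothetical "testing" station IDs

def is_valid_trip(start, end, duration_s, distance_km):
    """Apply the three preprocessing filters to one trip record."""
    if start in TESTING_STATIONS or end in TESTING_STATIONS:
        return False    # filter 1: trips touching testing stations
    if duration_s < 60:
        return False    # filter 2: false starts / re-docking attempts
    if distance_km / (duration_s / 3600.0) > SPEED_LIMIT_KMH:
        return False    # filter 3: likely restocking trucks
    return True

trips = [
    ("A", "B", 900, 3.0),        # 12 km/h: kept
    ("A", "TEST-01", 900, 3.0),  # testing station: dropped
    ("A", "B", 30, 0.1),         # under 1 min: dropped
    ("A", "B", 120, 5.0),        # 150 km/h: dropped
]
clean = [t for t in trips if is_valid_trip(*t)]
```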
The periodicity and evolution of the system. Two tasks are performed. First, a Gaussian Mixture Model is applied to the daily trip data to identify yearly patterns of trip demand. Second, an eigendecomposition approach is used to analyze the hidden structures of each station’s growth trajectory, which is compared with curve-fitting results based on exponential and logistic models.
Geo-visualization of the results. The growth patterns of stations are spatially diversified. This is illustrated by geo-visualizing the top principal components (PCs) extracted from the growth trajectory data, which can uncover the distinct growth landscapes of different stations over space.
3.2 Traffic network
We can represent the system as a network where nodes are stations and edges denote the connections among the nodes. For a dockless system, hypothetical stations can be generated using the approaches discussed in Sect. 3.1. Thus, a micro-mobility system can be denoted as an undirected network. A network on date index i is defined as
$$\begin{aligned} BS_i = \{V,E\} \end{aligned}$$
(1)
where V is a set of the docking stations, and E denotes a set of edges. Specifically, the edges can be defined by
$$\begin{aligned} E = V \times V = \{e_{ij}\} \end{aligned}$$
(2)
where \(e_{ij}\) is the link connecting stations i and j. As we only consider the connections among different stations, traffic flows are ignored. Thus, \(e_{ij}\) is 1 if at least one trip appears on this edge, and 0 otherwise. In the network of a specific date, each station has a set of connected nodes. Therefore, for a certain time window, the growth trajectory of a station s can be described by
$$\begin{aligned} G_s = \{g_i | g_i, i = 1, 2, 3,\ldots ,n\} \end{aligned}$$
(3)
where \(g_i\) is the number of cumulative connections of station s on date index i, and n is the number of dates during the growth period.
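The trajectory of Eq. 3 can be accumulated from daily edge sets as sketched below; the toy three-day network is hypothetical:

```python
def growth_trajectory(station, daily_edges):
    """daily_edges: one edge set per date, each edge an unordered station pair.
    Returns g_i, the cumulative number of distinct stations connected to
    `station` up to each date index."""
    seen = set()
    trajectory = []
    for edges in daily_edges:
        for u, v in edges:
            if u == station:
                seen.add(v)
            elif v == station:
                seen.add(u)
        trajectory.append(len(seen))
    return trajectory

days = [
    {("A", "B"), ("A", "C")},  # day 1: A connects to B and C
    {("B", "C")},              # day 2: no new connections for A
    {("A", "D")},              # day 3: A connects to D
]
g = growth_trajectory("A", days)  # A's trajectory: [2, 2, 3]
```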
3.3 Prediction of the periodicity of the system
The trips on weekdays and weekends show an upward trend with multiple local peaks (Fig. 2). The characteristic of multiple local peaks can be identified by a Gaussian Mixture Model.
A Gaussian Mixture Model is an approach to reveal the underlying degrees of freedom of an unlabeled data set. Its primary application is to identify a clustering structure in the data. It fits a probability density function composed of several components, each represented by a Gaussian distribution. Each distribution is parameterized by a mean and a covariance matrix, so the entire data set can be represented by the specified Gaussian parameters. Each component can be thought of as a cluster centered at a peak point with the highest probability.
The simplest application of Gaussian Mixture Models is to identify the clusters of a one-dimensional data set, which is also how this work employs them. Before fitting the probabilistic model, we need to convert the temporal trip data into an appropriate format that the model can interpret. Take the weekday trips as an example (Fig. 2a). The original data set can be expressed by
$$\begin{aligned} TP = \{TP_i|TP_i, i=1,2,\ldots ,n\} \end{aligned}$$
(4)
where \(TP_i\) is the total number of weekday trips on date index i, and n is the total number of weekdays in the data set.
To model the trip distributions, we use a set of date indices as the data representation, defined by
$$\begin{aligned} W = \{W_i|W_i, i=1,2,\ldots ,n\} \end{aligned}$$
(5)
where \(W_i\) is a vector \(\mathbf {W_i}\) for weekday index i, in \({\mathbb {R}}^{d_i}\). The dimension \(d_i\) is the number of trips on weekday i, and every element of the vector equals i. For example, if there are 5000 trips on the first weekday, \(\mathbf {W_1}\) is in \({\mathbb {R}}^{5000}\) and represented by \((\underbrace{1,1,\ldots ,1}_\text {5000})\). All the vectors are subsequently concatenated into a row vector that equivalently denotes the weekday set W as
$$\begin{aligned} W = (\underbrace{1,\ldots , 1,}_{d_1} \ \underbrace{2,\ldots , 2,}_{d_2} \ ,\ldots , \ \underbrace{n,\ldots , n,}_{d_n}) \end{aligned}$$
(6)
where the dimension of W is the total number of trips. \(W^T\), a column vector, can be viewed as a one-dimensional data set in which each data point corresponds to a date index. We assume that the data follow a mixture of Gaussian distributions, as may be indicated by Fig. 2. Thus, given a one-dimensional data set with data points \(x_1,\ldots ,x_n \in \mathbb {R}^1\), we can fit it with a Gaussian Mixture Model M, which is parameterized by a set as
$$\begin{aligned} M = \{(\pi _i, P_i)|\pi _i, i=1,2,\ldots ,k; P_i = N(\mu _i, \rho _i^2),i=1,2,\ldots ,k\} \end{aligned}$$
(7)
where k is the number of Gaussian components, \(\pi _i\) is the weight of component i, and each component has mean \(\mu _i\) and variance \(\rho _i^2\).
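The conversion from the daily counts of Eq. 4 to the concatenated date-index vector of Eq. 6 amounts to repeating each date index by its trip count; a sketch with made-up counts:

```python
import numpy as np

# Hypothetical daily weekday trip counts TP_i for n = 4 weekdays
daily_trips = [3, 1, 0, 2]

# Repeat each 1-based date index d_i times, yielding W of Eq. 6;
# transposing gives the one-dimensional data set fed to the mixture model
W = np.repeat(np.arange(1, len(daily_trips) + 1), daily_trips)
# W is [1, 1, 1, 2, 4, 4]; its length equals the total number of trips
```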
The probability of a data point i is defined as
$$\begin{aligned} Pr_i = \sum _{j=1}^{k}Pr(i, P_j) = \sum _{j=1}^{k}\pi _jPr(i|P_j) \end{aligned}$$
(8)
where \(Pr(i|P_j)\) is the probability of i under the jth Gaussian component.
Furthermore, the probability of the data set is defined as
$$\begin{aligned}{}&Pr(data\vert \pi _1P_1+\cdots +\pi _kP_k) \\ &\quad = \prod _{i=1}^{n}(\pi _1P_1(x_i)+\cdots +\pi _kP_k(x_i)) \\ &\quad = \prod _{i=1}^{n}\left( \sum _{j=1}^{k}\frac{\pi _j}{(2\pi \rho ^2_j)^{1/2}} \hbox{exp} \left( -\frac{(x_i-\mu _j)^2}{2\rho ^2_j}\right) \right) \end{aligned}$$
(9)
The goal is to find a model M (Eq. 7) that maximizes the likelihood function (Eq. 9). However, there is no closed-form solution to this problem, so the expectation–maximization (EM) algorithm is employed to identify a locally optimal solution. Given a data set with n one-dimensional data points, the algorithm first initializes the Gaussian components randomly, by k-means (another clustering algorithm; please refer to Arthur and Vassilvitskii [4]), or by other methods. Next, it repeats the following two steps until convergence. The first step assigns each point \(x_i\) fractionally to the k components, so the weight of \(x_i\) associated with a component \(P_j\) is
$$\begin{aligned} w_{ij} = Pr(P_j|x_i) = \frac{\pi _jP_j(x_i)}{\sum _{m=1}^{k}\pi _mP_m(x_i)} \end{aligned}$$
(10)
The second step is to update the model’s parameters:
$$\begin{aligned} \pi _j&= \frac{1}{n}\sum _{i=1}^{n}w_{ij} \end{aligned}$$
(11)
$$\begin{aligned} \mu _j&= \frac{1}{n\pi _j}\sum _{i=1}^{n}w_{ij}x_i \end{aligned}$$
(12)
$$\begin{aligned} \rho ^2_j&= \frac{1}{n\pi _j}\sum _{i=1}^{n}w_{ij}(x_i-\mu _j)^2 \end{aligned}$$
(13)
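The E- and M-steps of Eqs. 10–13 can be sketched for one-dimensional data as follows; the quantile-based initialization is a simplification standing in for k-means, and the two-cluster data are synthetic:

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100):
    """Fit a 1-D Gaussian mixture with the EM updates of Eqs. 10-13."""
    n = len(x)
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # crude initialization
    var = np.full(k, np.var(x))
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step (Eq. 10): fractional assignment of each x_i to each component
        dens = pi / np.sqrt(2 * np.pi * var) * \
               np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        w = dens / dens.sum(axis=1, keepdims=True)
        # M-step (Eqs. 11-13): update weights, means, and variances
        nk = w.sum(axis=0)
        pi = nk / n
        mu = (w * x[:, None]).sum(axis=0) / nk
        var = (w * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(10, 1, 300)])
pi, mu, var = em_gmm_1d(x, k=2)  # means converge near 0 and 10
```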
The model involves an unsupervised learning task, and thus we do not know the true number of Gaussian components in advance. Hence, the elbow method is applied to identify an optimal number of components. Specifically, given a candidate set of component numbers, Akaike information criterion (AIC) scores are generated. According to the elbow method, the number associated with the steepest decrease of the AIC is believed to be a locally optimal solution.
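Assuming scikit-learn is available, the AIC-based selection can be sketched on synthetic data with three peaks; the data and the candidate range are illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 1-D date-index data with three peaks, standing in for W^T
x = np.concatenate([rng.normal(m, 1.0, 200) for m in (0, 8, 16)]).reshape(-1, 1)

# AIC scores over a candidate set of component numbers
aics = {k: GaussianMixture(n_components=k, random_state=0).fit(x).aic(x)
        for k in range(1, 7)}
# The elbow sits where the AIC stops decreasing steeply (here, at k = 3)
```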
3.4 The modeling of the initial growth of the system
Based on network theories, we can regard the evolution of a micro-mobility system as the expansion of each station’s connections over time. Equation 3 tracks the growth statistics for each station, which serves as a basis for modeling the system’s evolution. The system is periodic in terms of trip demand, and its shared facilities keep being upgraded and expanded. Such processes are complicated, so it is hard to investigate the system’s evolution over a full life cycle. Therefore, this work adopts a simplified approach by only examining the system’s initial period, i.e., the six months after the system is implemented.
3.4.1 Growth curves of connected stations
Many natural phenomena, such as population increase and bacterial reproduction, can be modeled by exponential or logistic growth. We hypothesize that the growth of a bike station may follow similar patterns. Therefore, the growth of each station is fitted against an exponential growth model defined by
$$\begin{aligned} g_i = C - ae^{-bi} \end{aligned}$$
(14)
where \(g_i\) is the number of connections on date i, and C, a, and b are model parameters.
The growth is also fitted against a logistic growth model, which is defined by
$$\begin{aligned} g_i = \frac{C}{1 + ae^{-bi}} \end{aligned}$$
(15)
For each station, the model with the higher R-squared (\(R^2\)) is selected.
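Assuming SciPy is available, fitting both models (Eqs. 14 and 15) and comparing their \(R^2\) might look like the sketch below; the six-month trajectory is synthetic:

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential(i, C, a, b):  # Eq. 14
    return C - a * np.exp(-b * i)

def logistic(i, C, a, b):     # Eq. 15
    return C / (1 + a * np.exp(-b * i))

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Synthetic six-month trajectory that saturates at about 80 connections
days = np.arange(1, 181)
g = logistic(days, 80, 50, 0.08) + np.random.default_rng(0).normal(0, 1, 180)

scores = {}
for name, model in [("exponential", exponential), ("logistic", logistic)]:
    params, _ = curve_fit(model, days, g, p0=(g.max(), 10.0, 0.05), maxfev=10000)
    scores[name] = r_squared(g, model(days, *params))
best = max(scores, key=scores.get)  # the logistic model wins on this trajectory
```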
3.4.2 Analysis of the spatio-temporal growth patterns by eigendecomposition
The hidden structures of the data may dominate the growth trajectories of stations, and an eigendecomposition method can reveal such hidden patterns. Thus, it has seen widespread application in human mobility studies. It is used to extract the top PCs from a data set, which explain most of the inherent variance of the data, so the resultant PC coefficients can represent the original data well. The eigendecomposition is a dimension-reduction approach and is particularly appropriate for high-dimensional data sets. Additionally, the growth patterns of different stations can be reconstructed from the top few PCs and the associated coefficients. The reconstruction may indicate how the patterns deviate from the average growth trajectory.
Specifically, each station’s growth can be represented by a vector \(\mathbf {A_i} = \{a_{i1},a_{i2},\ldots ,a_{im}\}\) where \(a_{ij}\) refers to the number of connections of station i on date j, and m is the number of dates. Thus, the average vector \(\mathbf {\mu }\) can be obtained by
$$\begin{aligned} \mathbf {\mu } = \frac{1}{N}\sum _{i=1}^{N}\mathbf {A_i} \end{aligned}$$
(16)
where N is the total number of stations during the growth period. A station’s temporal deviation from the average vector is \(\mathbf {A^{'}_i} = \mathbf {A_i} - \mathbf {\mu }\). A matrix K can be then defined by
$$\begin{aligned} K_{N \times M} = \begin{bmatrix} a^{'}_{11} & a^{'}_{12} & \dots & a^{'}_{1m}\\ a^{'}_{21} & \ddots & & a^{'}_{2m} \\ \vdots & & & \vdots \\ a^{'}_{n1} & a^{'}_{n2} & \dots & a^{'}_{nm} \\ \end{bmatrix} \end{aligned}$$
(17)
where M is the number of dates. A covariance matrix C of size \(M \times M\) is an averaged outer product of K, which is computed by
$$\begin{aligned} C = \frac{1}{N}K^TK \end{aligned}$$
(18)
The covariance matrix is used to calculate all the eigenvectors (set \(\{\mathbf {e_i}|\mathbf {e_i}, i=1,2,\ldots ,m\}\)) and the associated eigenvalues ranked in descending order (set \(\{\lambda _i|\lambda _i, i=1,2,\ldots ,m\}\)). The eigenvectors represent the PCs of matrix K. A coefficient matrix B can be computed by
$$\begin{aligned} B = KE^T \end{aligned}$$
(19)
where E is a matrix whose rows denote the eigenvectors, and the resultant matrix B is denoted as
$$\begin{aligned} B_{N \times M} = \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1m} \\ b_{21} & \ddots & & b_{2m} \\ \vdots & & & \vdots \\ b_{n1} & b_{n2} & \dots & b_{nm} \\ \end{bmatrix} \end{aligned}$$
(20)
where \(b_{ij}\) denotes the coefficient of the jth PC for station i. The top few coefficients are the most important. They are used to reconstruct the original temporal signature of a station’s growth, which, with only the retained PCs, is approximated by
$$\begin{aligned} \mathbf {A_i} \approx {\boldsymbol{\mu}} + B_iE \end{aligned}$$
(21)
Empirical rules are normally adopted to determine the number of top PCs used in the reconstruction. For this study, we only retain the first few PCs that account for at least 90% of the total variance.