- Original Paper
- Open Access

# Analyzing urban traffic demand distribution and the correlation between traffic flow and the built environment based on detector data and POIs

*European Transport Research Review*
**volume 10**, Article number: 50 (2018)

## Abstract

### Purpose

This paper aims to determine the spatiotemporal characteristics of urban traffic flow and their correlation with the built environment, using SCATS (Sydney Coordinated Adaptive Traffic System) and POI (point of interest) data from Shenyang, China.

### Methods

A standard analysis framework based on these data is proposed. The study analyzes the spatiotemporal distributions of traffic volume and the built environment influence factors identified by the geographical detector. An improved gravity model using simple structural parameters (number of lanes and road length) is proposed to estimate traffic flows at the day and peak hour scales within specific flow ranges.

### Results

The results show that the peak hours of different intersections and roads are heterogeneous, which reveals flexibility in trip time. The correlation between peak hour flows and day flows is significant in the multidimensional analysis. The investigation of lanes yields further findings: in this case, when intersections have more than 14 lanes or roads have more than 4 lanes, lane resources are wasted to a great extent. A certain correlation is also found among the built environment factors. The proposed gravity model establishes the connection between the structure and function of urban roads.

### Conclusions

Flexible working hours and workplaces would be effective ways to reduce traffic congestion. Day flows could be estimated from a traffic survey of peak hour flows, especially in developing cities. Traffic flow is mainly concentrated on a relatively small share of city roads. Because the maximum service traffic volumes exhibit segmentation, the optimal maximum number of lanes for intersections and roads should be reconsidered with a view to better network performance and utilization. The effect of the number of lanes on service traffic volumes is found to be more significant than that of the other factors. These conclusions will be helpful for policy-makers and for sustainable urban planning.

## Introduction

Transportation is an important aspect of a sustainable city and society. Urban road transportation networks, as the carrier of human activities in the city, have been studied for decades in terms of their structural characteristics and dynamics. Most academic studies, however, remain fragmented. In geography and urban planning [1,2,3], physics [4,5,6,7] and related domains, scholars have paid more attention to the structure of urban street networks, whereas transport dynamics has traditionally been studied by transportation scholars [8, 9]. Determining the network characteristics of structure and function is the key challenge of current research, and the ultimate goal of network research is to better understand the behaviors of transport systems [10]. Hence, research that fuses the two aspects is a more comprehensive approach [11,12,13,14].

However, acquiring actual data at the network level is always difficult, and most results have been obtained via traffic simulation. Big data now provides opportunities to gain insight into the relationship between transport dynamics and the intrinsic properties of the network, although the source and quality of the data remain the main constraints for most scholars. Many studies are based on trajectory data or smart card data from different location devices (GPS, phone, etc.). Traffic detector data, however, is managed by the government and traffic police, and related work is seldom reported. Even worse, data sources from different modes or samples are likely to show diverse urban mobility patterns [15]. More attention should therefore be paid to comparative data analytics in urban and transportation research. The three fundamental problems of transport systems (i.e., traffic accidents, environmental pollution and traffic congestion) will, we believe, take a very long time to solve or mitigate. In fact, many transport characteristics are not well studied, primarily because of the lack of data. Research on this topic should include the characteristics of road network structure, traffic flow, traffic demand, and traffic organization and control. In particular, understanding traffic demand is indispensable. Moreover, with the rapid development of autonomous vehicle technology, the mode of traffic flow will change as automated vehicles become more popular in the next few years. As a result, the existing research on traffic flow may not conform to future conditions. Therefore, we focus here on the correlation between traffic flow and the built environment, because this relative relationship is generally fixed. In this context, transport dynamics refers to traffic flow, and the built environment is a typical part of the road network structure.

Some studies have analyzed the relationship between the built environment and traffic systems, such as traffic behaviors [16,17,18,19,20], the association between network structure and road safety [13, 21, 22], and the correlation between traffic congestion and different attributes of urban land use [23,24,25]. Although they found strong empirical evidence of correlation, limited research has investigated the impact of the built environment on traffic flow, or the complicated relationships between them, at the level of the network. One reason is that research data in this domain is a scarce resource and often consists of location data and other data [24, 25]. Trajectory, traffic state or other data in these studies are sample data in a sense and do not cover the traffic flow of different traffic modes, which restricts their ability to reflect overall traffic flow conditions. Many existing studies therefore remain limited by the lack of a whole-population investigation. The deficiency is also common in European transport studies. For example, conclusions about the spatial distribution of traffic flow in existing studies are derived from travel time and taxi data [26, 27], not from the traffic volume of all transport modes. Moya-Gómez and García-Palomares studied changes in automobile accessibility over the course of the day, as caused by congestion of the road network in eight European cities [28]; however, it is not a direct investigation of trip time. Meanwhile, the correlation between traffic flow and the built environment is seldom analyzed in European cases. Therefore, this correlation needs to be examined with more empirical evidence, especially from traffic detector data. The authors have also conducted a preliminary exploration of the road network structure characteristics [29,30,31,32,33] and came to appreciate the importance of the number of lanes for traffic, an emphasis that differs significantly from existing studies.
The lane is the carrier of urban traffic flow and plays a significant role in transportation systems. However, to our knowledge, empirical research involving lanes at the level of the network is still very limited. Moreover, research on the network structure is just a small step; a more important task is to establish the connection between the structure and function of the network, i.e., to determine how to predict or infer the operation law of the whole system from structural measures after quantitative observation of the network structure. Unfortunately, compared with the description of the network structure, research on this topic has developed slowly [10]. Our research is based on SCATS and POI data, which give a detailed description of the whole traffic state. The detector data from SCATS is complete and can show the real traffic burden or demand. The beauty of real data lies in this capability. Therefore, the first and most important contribution of the paper is to help researchers understand these characteristics and correlations as derived from the whole population rather than from samples, before modeling analysis and engineering application. Overall, we will answer the following two questions in the paper.

(1) Based on the empirical population evidence, what are the temporal and spatial characteristics of traffic demand in the road network, and how can they be illustrated systematically?

(2) In our case, does a correlation between traffic flow and the built environment exist? If so, is it the same as that reported in existing studies?

Based on this simple idea, we analyze the SCATS data and the built environment data. Our empirical analysis offers a more comprehensive understanding of the temporal and spatial distribution characteristics of urban transportation demand and further reveals the relationships between structure and function information. The results of this study provide an empirical and theoretical reference for the network analysis and management of urban road traffic, and a basis for exploring how universal these findings are through similar analyses of European and other cities.

## Materials and methods

### Traffic flow data description

In this paper, the functional information of the urban road traffic network is extracted from SCATS (Sydney Coordinated Adaptive Traffic System) [34] in Shenyang, China. In 2014, Shenyang, one of the biggest cities in northeast China, had a population of 8.29 million and 1.46 million cars. As a form of the spatial distribution of travel demand, traffic flow is selected as the basic parameter to reflect the functional information of the urban road traffic network. SCATS covers a total of 525 intersections in the main urban area of the city, as shown in Fig. 1. The intersections have 3–24 inlet lanes and include T-intersections, cross intersections and five-way intersections.

Considering the regularity of residents' travel patterns, we randomly selected a typical intersection and its western entry road and examined their data for one week (July 29, 2014 – Aug. 4, 2014). The two data sets each exhibited time similarity in traffic flow, as shown in Fig. 2, which makes it reasonable to select the day with the maximum traffic demand as the research object.

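The Pearson correlation coefficient is calculated as

\[ R=\frac{\sum \limits_{i=1}^n\left({x}_i-\overline{x}\right)\left({y}_i-\overline{y}\right)}{\sqrt{\sum \limits_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}\sqrt{\sum \limits_{i=1}^n{\left({y}_i-\overline{y}\right)}^2}} \qquad (1) \]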

where *n* is the total number of samples and *i* refers to a specific sample. *x* and *y* are the variables, and \( \overline{x} \) and \( \overline{y} \) are the means of the corresponding variables, respectively. If the two variables are positively linearly correlated, then 0 < *R* ≤ 1. If the two variables are negatively linearly correlated, then −1 ≤ *R* < 0. If there is no linear correlation between the two variables, then *R* = 0. Generally, if |*R*| > 0.8, the two variables are considered to have a strong linear correlation.
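As a minimal sketch of how the coefficient can be computed for two flow time series, the following Python function implements the definition above directly. The function name `pearson_r` and the sample counts are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient R between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator terms: sums of squared deviations
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("R is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 15-min counts on two different days (illustrative values only)
day_a = [120, 180, 260, 240, 200, 230]
day_b = [110, 175, 250, 245, 190, 235]
r = pearson_r(day_a, day_b)
print(f"R = {r:.4f}; strong linear correlation: {abs(r) > 0.8}")
```

In practice the two inputs would be the 15-min flow series of the same intersection on two different days.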

At this typical intersection, the trends and values of the 15-min flow time series over the continuous week are consistent, as Fig. 2(a) shows directly. The corresponding west entrance road has similar characteristics, as shown in Fig. 2(b). Table 1 shows the total traffic volume and the correlation coefficients of the 15-min flow time series of the intersection on different days. In terms of total volume, the traffic demand of this intersection is relatively stable except on Sunday (Aug. 3, 2014), and the maximum value occurred on August 1, 2014. The correlation coefficients between different dates were all *R* > 0.95, indicating that the flow time series of different dates have obvious time similarity. The average correlation coefficient of the flow time series of the corresponding western inlet road is 0.9692, which also indicates obvious time similarity. In view of the maximum traffic demand on August 1, 2014, the traffic of the road network on that day is selected for the following analysis.

Data quality and detector condition were checked. According to the statistics, 318 intersections have output data on that day, 63 of which have all detectors working normally. There are 521 segments with complete data, 64 of which have normally working detectors in both directions.

### Built environmental influence factors

The primary built environmental factors affecting the traffic state or travel behaviors can be divided into traffic-related and land-use-related factors [11, 35, 36]. The geographical detector [37] was introduced to assess the built environmental parameters that may be responsible for the road traffic state [24]. Zhang et al. defined the power of determinant (PD) to determine whether a spatial factor may be responsible for the clustering results of the traffic state [24]. PD is calculated as

\[ PD=1-\frac{1}{n{\sigma}^2}\sum \limits_{i=1}^k{n}_{D,i}{\sigma}_{D,i}^2 \qquad (2) \]

where *n*_{D, i} is the number of samples in sub-region *i* of the determinant *D*, and *n* is the total number of samples. \( n=\sum \limits_{i=1}^k{n}_{D,i} \), where *k* is the number of sub-regions. *σ*^{2} is the global variance of an influence factor in the study region, and \( {\sigma}_{D,i}^2 \) is the variance within sub-region *i*. The value range of *PD* is [0, 1]; a larger value indicates that the factor's determinant power is stronger. Zhang et al. investigated the relationship between traffic congestion and the built environment based on taxi GPS data of Shanghai, China; the built environment factors and *PD* are shown in Table 2.
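The PD calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions: variances are taken as population variances (so that PD stays within [0, 1]), and the names `power_of_determinant`, `values` and `strata` are hypothetical rather than taken from the geographical detector software.

```python
from collections import defaultdict


def _population_variance(values):
    """Variance with divisor n (assumed here so that PD lies in [0, 1])."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def power_of_determinant(values, strata):
    """PD = 1 - (1 / (n * sigma^2)) * sum_i n_{D,i} * sigma_{D,i}^2.

    values -- observed quantity for each sample
    strata -- sub-region label of each sample under determinant D
    """
    if len(values) != len(strata) or not values:
        raise ValueError("values and strata must be non-empty and equally long")
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    n = len(values)
    global_variance = _population_variance(values)
    if global_variance == 0:
        raise ValueError("PD is undefined when the global variance is zero")
    weighted_within = sum(len(g) * _population_variance(g) for g in groups.values())
    return 1.0 - weighted_within / (n * global_variance)


# Hypothetical example: the factor separates low and high values perfectly
print(power_of_determinant([1, 1, 5, 5], ["a", "a", "b", "b"]))
```

A factor whose sub-regions explain none of the variation gives a PD of 0, and one whose sub-regions are internally homogeneous gives a PD of 1.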

Table 2 presents the explanatory power of the factors. *Num_bus* (0.130) has the highest PD, i.e., more bus stations per 100 m along a road segment are associated with a higher probability of congestion, because a higher density of bus stations reflects greater commuting volume along the road segment. Considering the feasibility of data collection and the PD values, the following factors were chosen for further analysis in this study: *Num_bus* (0.130), *Rd_type* (0.105), *Dist_hosp* (0.091), and *Num_scho* (0.084). Instead of the simple road type code used in Zhang's work (1 for primary roads and 2 for secondary roads), this paper uses the number of lanes, whose importance is supported by our previous studies [29,30,31,32,33]. The complete traffic demand data from SCATS is also more convincing than the taxi data used in Zhang's work.

Correspondingly, this study extracted 8643 POIs (points of interest), including bus stations, hospitals and schools, using web-crawler software, as shown in Fig. 3.

### Research methods

Based on the analysis above, the analytical flow of this study is shown in Fig. 4. Section 2 describes the data and methods used in the study. In section 3, the spatiotemporal characteristics of traffic demand are analyzed in two dimensions (time and space) using the actual road network data of Shenyang, China. Because traffic demand varies over a day, peak periods are investigated first. In addition to the peak time distributions at the scales of the morning, the evening and the whole day, the correlations between peak hour flows and day flows for specific peak periods are also considered. From the spatial point of view, we show the traffic flow distribution of roads with different lane numbers in the urban street network. Subsequently, the correlation between the traffic flow and the built environment (*Num_bus, lane number, Dist_hosp, Num_scho*) is investigated. Finally, section 4 presents the conclusions and a discussion of future work.

#### Traffic demand analysis

The analysis object is the city's traffic data on August 1, 2014. Because some detectors were missing or faulty, it was difficult to obtain the traffic flow of all intersections; thus the results represent a relative relationship. First, the temporal distribution of traffic demand was analyzed at three scales: the whole day, the morning peak and the evening peak.

As the bottlenecks of the urban traffic system, intersections play a significant role in transportation operation. Analyzing the peak hour and flow distributions of an intersection helps to gain a deeper understanding of the temporal operation characteristics of urban traffic flow. As shown in Fig. 3, the daily traffic flow profile of urban roads usually has a saddle shape. *q*_{hi} denotes the traffic volume in the *i*th hour. There are peaks in the morning and afternoon, and each corresponding hour is called a peak hour. The traffic volume within a peak hour is called the peak hour flow *q*_{hm}. The peak hour flow ratio is defined as

\[ \text{peak hour flow ratio}=\frac{q_{hm}}{Q} \qquad (3) \]

In Eq. (3), *Q* is the full-day flow, that is, \( Q=\sum \limits_{i=1}^{24}{q}_{hi} \).
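The peak hour and its flow ratio can be extracted from one day of detector counts as sketched below. The sketch assumes 96 consecutive 15-min counts per day and takes the peak hour as the 60-min window (four consecutive intervals) with the largest volume; both the windowing rule and the function name `peak_hour_flow_ratio` are assumptions made for illustration.

```python
def peak_hour_flow_ratio(counts_15min):
    """Return (start_index, peak_hour_flow, ratio) for one day of 15-min counts.

    start_index is the index of the first 15-min interval of the peak hour
    (index 0 corresponds to 00:00-00:15).
    """
    if len(counts_15min) != 96:
        raise ValueError("expected 96 fifteen-minute counts for one day")
    day_flow = sum(counts_15min)  # Q, the full-day flow
    if day_flow == 0:
        raise ValueError("the ratio is undefined when the day flow is zero")
    best_start, best_flow = 0, -1
    for start in range(96 - 3):
        # Volume of the 60-min window starting at this interval
        flow = sum(counts_15min[start:start + 4])
        if flow > best_flow:
            best_start, best_flow = start, flow
    return best_start, best_flow, best_flow / day_flow


# Hypothetical day: flat background with a one-hour surge starting at 07:00
counts = [10] * 96
counts[28:32] = [50] * 4
start, q_hm, ratio = peak_hour_flow_ratio(counts)
print(start, q_hm, round(ratio, 4))
```

Restricting the search to a morning or evening range of start indices would give the morning and evening peak hours separately.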

Next, we analyzed the spatial distribution of traffic flow across different types of roads and the distribution of traffic between the two directions of a road. Detectors were screened, and only segments fully covered by normally working detectors were selected. According to the statistics, 521 one-way segments meet the research requirement, with 1–7 lanes each. There are 64 two-way segments with normal data output.

#### Correlation analysis between traffic flow and built environment

The correlation analysis in this part includes two objects, i.e. the intersections and roads.

To study the relationship between intersection traffic flow and the built environment, the aforementioned factors were first analyzed in the context of the intersections. Given that intersection traffic flow comes from the adjacent road segments, the number of lanes becomes the only analysis factor. In this section, we investigated the correlation between the actual maximum capacity and the number of lanes. The number of approach detectors, the peak hour time throughout the day and the corresponding flow data of each intersection were counted. The traffic analysis report on China's major cities for the third quarter of 2014 (http://report.amap.com/download_city.do) ranks Shenyang at the top of the list of key cities for both peak time congestion and all-day congestion. Shenyang is therefore a typical case for analysis in China. The traffic function of the intersections at the network level is measured by the peak hour flow of the full day. Moreover, considering the lower traffic demand of some roads, the 30th percentile of the average lane hour flows (227 veh/(h·ln)) is selected as the threshold value to remove intersections with lower traffic pressure.

The detector integrity rate is defined as

\[ {\xi}_j=\frac{n_j}{N_j} \qquad (4) \]

where *n*_{j} is the number of actual detectors at intersection *j*, and *N*_{j} is the actual number of approach lanes. *U* denotes the sample size of a set *S* of intersections. Other influence factors will be considered in the part on roads.
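A short sketch of how intersections might be screened by detector integrity rate follows. It assumes that a sample set contains every intersection whose rate is at least a chosen threshold; the record layout and the function names are hypothetical.

```python
def detector_integrity_rate(n_detectors, n_approach_lanes):
    """xi_j = n_j / N_j for a single intersection."""
    if n_approach_lanes <= 0:
        raise ValueError("an intersection must have at least one approach lane")
    return n_detectors / n_approach_lanes


def select_intersections(records, min_rate):
    """Return the sorted ids of intersections with xi_j >= min_rate.

    records maps an intersection id to (n_detectors, n_approach_lanes).
    """
    return sorted(
        key
        for key, (n_j, big_n_j) in records.items()
        if detector_integrity_rate(n_j, big_n_j) >= min_rate
    )


# Hypothetical intersections (ids and counts are illustrative only)
records = {"A": (8, 8), "B": (9, 10), "C": (4, 8)}
print(select_intersections(records, 1.0))  # fully covered intersections only
print(select_intersections(records, 0.8))
```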

To analyze the relationship between road traffic and POIs, we combined the roads with the same names and obtained 42 roads. The flow of each combined road is the maximum value among its segments. In the seventeenth century, Newton proposed that the attractive force between any two objects is proportional to the product of their masses and inversely proportional to the square of the distance between them. The gravity model has since become widely used in the study of spatial interaction. The improved gravity model formula is given below,

where *K* is a constant; *M* is the fitness, which refers to the intrinsic properties of the nodes and indicates the ability to get an edge; *D* is generally defined as the Euclidean distance, but it can also represent other physical quantities, such as time; and the values of the two exponents *α* and *γ* depend on the network's dependence on node fitness and geography [38]. A classic application of the gravity model in transportation planning is the trip distribution step of the four-stage method, in which the number of trips between two traffic zones is directly proportional to the numbers of trip productions and attractions and inversely proportional to the traffic impedance between the origin and destination. Previous studies have validated the applicability of the gravity model in network flow analysis, including highway systems [39], airport systems [40,41,42] and rail systems [43]. Related research on urban traffic flow focuses on human mobility among towns or cities [44,45,46,47]; however, to our knowledge, no study has used the gravity model to estimate traffic flow between two adjacent intersections from the perspective of spatial interaction.
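For orientation, a commonly used form of the gravity model that is consistent with the symbols defined above can be written as follows. The subscripts *i* and *j* for the two nodes and the exact placement of the exponents are assumptions and may differ from the formulation used in this study.

```latex
Q_{ij} = K \, \frac{\left( M_i M_j \right)^{\alpha}}{D_{ij}^{\gamma}}
```

With *α* = 1 and *γ* = 2 this form mirrors Newton's law, with fitness playing the role of mass.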

Here, we defined *M* as degree [10], improved degree [29] and lane number of the connecting road (i.e., estimated road). The distance function is described in the forms of the power function, the exponential function and the combination function [48]. The interactions *Q* between the adjacent intersections will be investigated in the fitting experiments. *Q* includes total day flows of two-way roads and peak hour flow of larger traffic demand direction. When fitness (*M*) is defined as the two-way lane number and the form of combination function is selected for *D*, eq. (6) is true within a specific flow range.

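In the formula, *α*, *β*, *γ*, *η*, *k*, *K* are parameters that require calibration in the fitting experiments. The values of root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and correlation coefficient (R) were calculated to determine the accuracy and agreement between the observed and estimated values.

\[ RMSE=\sqrt{\frac{1}{n}\sum {\left({Q}_{ij}-{Q}_{ij}^{\prime}\right)}^2} \]

\[ MAE=\frac{1}{n}\sum \left|{Q}_{ij}-{Q}_{ij}^{\prime}\right| \]

\[ MAPE=\frac{1}{n}\sum \left|\frac{Q_{ij}-{Q}_{ij}^{\prime }}{Q_{ij}}\right|\times 100\% \]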

where *n* is the total number of observed (and forecasted) values; *Q*_{ij} refers to the observed values of daily flows (or peak hour flows) of intersection *i* and *j*; \( {Q}_{ij}^{\prime } \) refers to the corresponding traffic estimated values.
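The three error measures can be computed together, as in the following sketch. The function name `fit_errors` and the sample flows are illustrative assumptions; the observed values must be non-zero for MAPE to be defined.

```python
import math


def fit_errors(observed, estimated):
    """Return RMSE, MAE and MAPE (in percent) for paired flow values."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("observed and estimated must be non-empty and equally long")
    n = len(observed)
    diffs = [o - e for o, e in zip(observed, estimated)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n
    # MAPE divides each absolute error by the observed value
    mape = 100.0 * sum(abs(d) / abs(o) for d, o in zip(diffs, observed)) / n
    return {"rmse": rmse, "mae": mae, "mape": mape}


# Hypothetical observed and estimated flows (veh/h), illustrative only
errors = fit_errors([100, 200], [110, 180])
print(errors)
```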

## Results

### Traffic demand spatiotemporal distribution characteristics

#### Temporal distribution characteristics analysis

Across the 318 intersections of Shenyang, China, the peak hours of the day were mainly concentrated in the morning peak (07:00–09:15) and the evening peak (16:15–18:15), as shown in Fig. 5(a). The trip peak period is similar to that of European cases [28]. Together, the two peaks accounted for 82.22% of the peak hours. The maximum flow of the day mainly occurred in the morning peak: the ratio of the frequency of the morning peak to that of the evening peak was 3.18:1, so the morning peak accounted for 62.54% of all daily peak hours. The average peak hour flow ratio of the whole day was 0.0844. The average peak hour flow ratio ranged from 0.0654 to 0.1087 according to the time interval statistics.

As shown in Fig. 5(b), the average flow distributions and trends of the peak hours were consistent with those of the day flows of the corresponding sample sets, especially during the morning peak and the evening peak, and the correlation coefficient *R* was as high as 0.9821. A linear correlation was found between the day flow and the peak hour flow of individual intersections, as shown in Fig. 5(c). The model is as follows,

where Q is the day flow of the intersection, q is the peak hour flow of the intersection, and the number of samples is 318.

In addition to the time distribution of all-day traffic, the morning and evening peaks were inspected separately, as shown in Figs. 6 and 7. The average morning peak hour flow ratio was 0.0831, slightly less than that of the full day. The average peak hour flow ratios of the time interval segments ranged from 0.0646 to 0.1510 (the second largest ratio was 0.1087). The correlation coefficient between the average daily flows and the average flows of the morning peak hours in the corresponding samples was 0.8704. The average evening peak hour flow ratio was 0.0733, which was significantly less than those of the full day and the morning peak. The average peak hour flow ratios of the time interval segments ranged from 0.0605 to 0.2131 (the second largest ratio was 0.1075). The correlation coefficient between the average daily flows and the average flows of the evening peak hours was 0.9073. The flows of the morning peak and the evening peak were thus in line with the trend of the daily flows. Because the bottleneck of urban traffic flow is the intersection, the time distribution of the roads is not discussed in detail here. The average peak hour flow ratio of the roads was 0.0801, which was the 63rd percentile; the 93rd percentile was 0.1003.

This section examined the consistency of trends between peak hour flows and full-day flows in three dimensions: the whole day, the morning and the evening. For example, the full-day service traffic volume could be estimated from the peak hour flow ratio, because the peak hour volume is a typical item of a traffic survey.
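As a rough illustration of such an estimate, the sketch below divides a surveyed peak hour flow by the average full-day peak hour flow ratio reported above (0.0844) and uses the reported range of interval averages (0.0654 to 0.1087) as a crude bracket. The surveyed flow is hypothetical, and applying a network-wide average ratio to a single intersection is an assumption, not a procedure validated in this study.

```python
AVERAGE_PEAK_HOUR_RATIO = 0.0844  # average full-day ratio reported in this section


def estimate_day_flow(peak_hour_flow, ratio=AVERAGE_PEAK_HOUR_RATIO):
    """Estimate the day flow Q from a peak hour flow q as Q = q / ratio."""
    if not 0 < ratio <= 1:
        raise ValueError("the peak hour flow ratio must lie in (0, 1]")
    return peak_hour_flow / ratio


def day_flow_range(peak_hour_flow, low_ratio=0.0654, high_ratio=0.1087):
    """Bracket the estimate with the smallest and largest reported ratios."""
    return peak_hour_flow / high_ratio, peak_hour_flow / low_ratio


# Hypothetical surveyed peak hour flow of 844 veh/h
q = 844
estimate = estimate_day_flow(q)
low, high = day_flow_range(q)
print(round(low), round(estimate), round(high))
```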

#### Spatial distribution characteristics analysis

Figure 8(a) shows the total day flows of the same lane number in ascending order throughout the day. Although the average daily flows of the roads are linearly increasing with the number of lanes (*R*^{2} = 0.9684), as shown in Fig. 8(b), the ranges (max-min) of the total flows of the same number of lanes first gradually expanded and then decreased. These findings show that the actual traffic function of different roads is quite different, despite having the same road structure. It provides the possibility for the refined design of traffic control strategy and the further optimization of transportation resources. Note that there is only one normal one-way road whose number of lanes is 7; this road is ignored in the range analysis. The distribution of peak hour flows is similar to that of the daily flows.

Similar to the intersections, the correlation between the daily flows and peak hour flows of roads was analyzed. A significant linear relationship was found, as shown in Fig. 8(c). The linear model is as follows,

where Q' is the day flow of the road, q' is the peak hour flow of the road, and the number of samples is 521.

In addition, the total traffic volume of each road type was obtained by multiplying the number of segments with a given number of lanes by the average daily total flow. The subgraph of Fig. 8(a) shows the cumulative probability of daily flow with increasing number of lanes. We find that urban traffic flows are mainly concentrated on a small number of roads. The small and medium-sized roads, approximately 66% of all segments, carried about 38% of the traffic flow, whereas the medium and large roads, 34% of the segments, served approximately 62% of the traffic. One-lane roads, which made up 42.54% of the segments, accounted for 14.64% of the traffic; their traffic function was roughly equal to that of arterial roads (five, six and seven lanes in one direction), which made up 5.19% of all roads and served 13.40% of the traffic demand. The service traffic volumes of one-way two-, three- and four-lane roads were 23.02%, 25.11% and 23.82%, respectively, all exceeding 20%, while these three kinds of roads made up 23.76%, 17.69% and 10.81% of the segments, respectively. This finding from China's real traffic data is a powerful supplement on street hierarchies to Lammer's work [26] on German cities, which used travel time and betweenness centrality to reflect real flows, to Jiang's European taxi case [27] and to Huang's Wuhan case [49].
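The shares quoted above can be accumulated to check the 66%/38% split. The sketch below uses only the percentages reported in this paragraph; the variable and function names are illustrative.

```python
# (road class, share of segments in %, share of traffic in %), as reported above
SHARES = [
    ("1 lane", 42.54, 14.64),
    ("2 lanes", 23.76, 23.02),
    ("3 lanes", 17.69, 25.11),
    ("4 lanes", 10.81, 23.82),
    ("5-7 lanes", 5.19, 13.40),
]


def cumulative_shares(shares):
    """Accumulate segment and traffic shares in order of increasing lane number."""
    rows = []
    segments_total = 0.0
    traffic_total = 0.0
    for name, segment_share, traffic_share in shares:
        segments_total += segment_share
        traffic_total += traffic_share
        rows.append((name, round(segments_total, 2), round(traffic_total, 2)))
    return rows


rows = cumulative_shares(SHARES)
for name, segments_cum, traffic_cum in rows:
    print(f"up to {name}: {segments_cum}% of segments, {traffic_cum}% of traffic")
```

One- and two-lane roads together give the roughly 66% of segments and 38% of traffic cited in the text; the remaining classes give the complementary 34% and 62%.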

Fig. 9 shows that the traffic flows in the two directions of Qingnian Street, an important arterial road connecting the north and south of the city, were relatively close, at 54,639 veh/d and 53,714 veh/d. The flow ratio was 0.8–1.2, with a mean of 1.01; thus, the flows in both directions were balanced. Lanes 1–4 of the upper carriageway are numbered from the inside to the outside, and left turns are forbidden at the inlet. Lanes 1–3 are straight lanes, and lane 4 is the right-turn lane. According to the distribution map of the lane flows, there were more vehicles in the second and third lanes, each of which had a flow of approximately 15,000 veh/d. The flows of the 1st and 4th lanes on both sides were relatively low, at 13,740 veh/d and 10,685 veh/d, respectively. For the 64 two-way samples, the correlation between the up-flow and down-flow was also significant: *R* = 0.7938 for peak hour flows and *R* = 0.7845 for day flows.

### Correlation between the traffic flow and the built environment

#### Correlation between intersection traffic flow and lane number

Statistical results of the set *S*_{1} (*U*_{1} = 318) of all intersections, intersection set *S*_{2} (*U*_{2} = 63) of *ξ*_{j} = 1, intersection set *S*_{3} (*U*_{3} = 97) of *ξ*_{j} = 0.9 and intersection set *S*_{4} (*U*_{4} = 138) of *ξ*_{j} = 0.8 are shown in Fig. 10. After deleting the intersections with lower traffic demand, *U*_{1}^{′} = 222. The difference between samples *U*_{1} and *U*_{1}^{′} is that the average flows of the latter are slightly greater than those of the former, while the opposite holds for the frequency. Moreover, they have the same correlation with the entry lane number. *U*_{2}, *U*_{3}, and *U*_{4} are the cases obtained by filtering the intersections by detector integrity rate.

According to the relationship shown in Fig. 10 between the number of lanes and the mean traffic flow, the number of inlet lanes of an intersection was linearly and positively correlated with the peak hour flow. In general, the peak hour load flow (actual capacity) at an intersection increased with the number of inlet lanes. The different sample sets, however, all revealed that the maximum carrying capacity was reached when the entry lane number was 14; that is, a marginal effect of the number of entry lanes exists in the urban road network. The marginal effect means that the gain in traffic capacity from each additional lane gradually decreases when the other inputs are fixed. Since the flow is still relatively large when the number of entry lanes is 15, that value can also be considered.

In SCATS, Degree of Saturation (DS), which refers to the ratio of effectively used green time to the total available green time, is utilized to evaluate the saturated state of the traffic control system [50]. Similar to the previous study, we acquired the DS data of the intersections with *ξ*_{j} ≥ 80% and removed outliers whose phase number is significantly less than the illustrated number in the system. The distribution of the average degree of saturation in ascending order for each intersection is shown in Fig. 11. The figure indicates the average DSs of different intersections and phases are larger in peak hours, with the average values of 77.08% and 76.09% respectively.
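The definition of DS given above reduces to a simple ratio, sketched below for individual phases and their average. The green times are hypothetical, and the helper names are not part of SCATS; the sketch only illustrates the definition and does not reproduce how SCATS measures effectively used green time.

```python
def degree_of_saturation(used_green_s, available_green_s):
    """DS = effectively used green time / total available green time."""
    if available_green_s <= 0:
        raise ValueError("available green time must be positive")
    return used_green_s / available_green_s


def mean_degree_of_saturation(phases):
    """Average DS over a list of (used_green_s, available_green_s) pairs."""
    if not phases:
        raise ValueError("at least one phase is required")
    values = [degree_of_saturation(used, available) for used, available in phases]
    return sum(values) / len(values)


# Hypothetical phases of one intersection during a peak hour (seconds of green)
phases = [(30, 40), (20, 40)]
print(f"average DS = {mean_degree_of_saturation(phases):.2%}")
```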

In addition to signal control, another important factor affecting the capacity of intersections is lane function division. Traffic signals are among the most common traffic facilities on city roads, and the commonly used signal indications are red, green and yellow. During the green period, vehicles arriving at the intersection can enter it to go straight or turn right or left (unless other traffic signs forbid a movement). When the yellow light starts, vehicles are prohibited from entering the intersection and wait in line until the next green light. Because the traffic flow in a given direction is periodically released and interrupted, vehicles in a given lane pass through the intersection during part of the time and wait for the green signal or for the vehicles ahead to be released at other times. Under signal control, traffic flows at a signalized intersection that conflict in space are thus separated in time. The lane group is an important analysis object when calculating the capacity of a single intersection. From the perspective of network traffic flow analysis, however, most intersections are saturated or even oversaturated during peak time. Moreover, each approach usually has lanes for three directions (straight, left and right). Therefore, when the total number of lanes is fixed, the specific lane combinations are not analyzed further in the comparison among multiple intersections in the urban road network.

#### Correlation between the road traffic flow and the built environment

From the perspective of traffic flow, however, the significant correlation among traffic flow, bus stations, hospitals and schools that was found in a speed analysis [24] did not appear. *Dist_hosp* also showed no correlation with the other factors, so it was replaced by the number of hospitals within 500 m. This discrepancy may result from differences in the research cases and analysis indicators. Despite this discrepancy, we found new correlations among the built environment factors. Figure 12(a) shows a positive linear correlation between road length and the number of bus stations along the road. Figure 12(b) shows the correlation among the number of hospitals within 500 m, the number of schools within 500 m and the degree. Figure 12(c) shows the correlation between the degree and the number of schools within 500 m. Here, the degree of a road is the number of roads connected to it, which is a basic indicator in network science [10]. Although a simple and clear correlation has not been found in this case, we think that such a correlation should exist when the city system reaches a mature stage of development.
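To make the degree indicator concrete, the sketch below counts, for each road, how many distinct roads it connects to. It assumes that two roads are connected when they meet at a common intersection; this rule, the function name `road_degrees` and the toy road labels are illustrative assumptions, not the procedure or data used in the study.

```python
from collections import defaultdict
from itertools import combinations


def road_degrees(intersections: list[tuple[str, ...]]) -> dict[str, int]:
    """Return the degree of each road, i.e. the number of distinct roads it meets.

    Each element of `intersections` lists the roads meeting at one intersection.
    """
    neighbors: dict[str, set[str]] = defaultdict(set)
    for roads in intersections:
        unique_roads = set(roads)
        for road in unique_roads:
            neighbors[road]  # register the road even if it meets no other road
        for a, b in combinations(unique_roads, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {road: len(adjacent) for road, adjacent in neighbors.items()}


# Hypothetical network: three intersections joining four roads.
example_intersections = [("A", "B"), ("A", "C"), ("B", "C", "D")]
example_degrees = road_degrees(example_intersections)
print(sorted(example_degrees.items()))
```

Using sets means that two roads meeting at more than one intersection are still counted as a single connection.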

When we examined the prediction results of the gravity model, we found that eq. (5) holds within a specific flow range when fitness is defined as the two-way lane number and the combination function is selected as the form of *D*. The traffic data are filtered, and the model is calibrated after excluding small traffic values and several outsized values. When the daily total flow range is [36000, 75000], *K* = 106.31, *α* = 5.75, *β* = −1.49, *γ* = −2.55, *η* = 4.60, and *k* = 39491.35. When the peak hour flow range is [1700, 3500], *K* = 0.13, *α* = 1.48, *β* = 1.93, *γ* = 0.31, *η* = −0.48, and *k* = 1729.14. The number of estimated segments is 31 for day flow and 66 for peak hour flow. The former is the sum of the vehicles in the two directions, and the latter is the larger of the two directional flows, including one-way data for some roads. The fitting results and tests are shown in Tables 3 and 4. The two tests indicate that there is no significant difference between the observed and estimated values. The fitting functions were obtained with 1stOpt 7.0 (http://www.7d-soft.com/en), and the sample tests were run in SPSS 20.

## Conclusions and discussions

In this paper, real data from Shenyang, China, were used to study the spatiotemporal characteristics of urban traffic flow and their relationship with the built environment, and some interesting findings were obtained. The conclusions were derived from empirical data analysis from the perspectives of time and space. The temporal analysis focuses on the trip time flexibility and the trip quantity variability of city traffic demand. The spatial analysis focuses on the difference in road utility at the network level, i.e., the road utilization rate. The potentially important findings were elaborated in figures and models.

In terms of the temporal distribution of traffic demand, the peak hours of different intersections and roads were found to be heterogeneous, revealing trip time flexibility. The primary trip peaks were the morning and evening peaks (07:00–09:15 and 16:15–18:15). Citizens’ commuting behavior determines this phenomenon; however, we found that the trip quantity of the morning peak is larger than that of the evening peak under fixed traffic demand (average peak hour flows are 40,956 and 33,989 vehicles for the morning and evening peaks, respectively). The peak period of the day mainly occurs in the morning, accounting for approximately three quarters of peak hours. This indicates that, after work, people’s destinations and the variability of their routes impose less traffic burden on the roads. Therefore, flexible work times and places are effective methods to reduce the number of vehicles and improve traffic conditions. Considering the influence of routes and trip times on the traffic state and the imbalance of the network flow distribution, the study of traffic signal control strategies should emphasize the time difference and the signal optimization of routes with a heavy traffic burden. In addition to traffic control, another link that must be strengthened is the traffic information service based on GIS-T (Geography Information System-Transportation), which will become more important in the coming era of autonomous vehicles. Studying the traffic flow of the intersections and roads revealed a characteristic range and value of the average peak hour flow ratio. The range is 0.06–0.10, and 88% of the intersections and 93% of the roads fall within this interval. The average peak hour flow ratio is 0.08 (0.0844 for the intersections and 0.0801 for the roads). Since the correlation between peak hour flows and day flows is significant, day flows could be estimated from a traffic survey of peak hour flows. This estimation is more important for developing cities because of the lack of data collection equipment. Moreover, even if roads have a similar structure with the same number of lanes, their actual traffic functions are quite different. The traffic flow is found to be concentrated in a relatively small part of the city roads. The small and some of the medium-sized road segments account for 66% of all roads but carry only approximately 38% of the day service traffic, whereas the large and some of the medium-sized road segments (34% of all roads) carry 62% of the traffic.
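The estimation of day flows from peak hour flows mentioned above can be sketched as a simple division. The sketch assumes that the peak hour flow ratio is defined as peak hour flow divided by day flow; the function names and the surveyed value of 2,400 vehicles are hypothetical, while the default ratio of 0.08 and the 0.06–0.10 interval are the values reported in this study.

```python
def estimate_day_flow(peak_hour_flow: float, peak_ratio: float = 0.08) -> float:
    """Estimate day flow, assuming peak_ratio = peak hour flow / day flow."""
    if not 0 < peak_ratio <= 1:
        raise ValueError("peak_ratio must be in (0, 1]")
    return peak_hour_flow / peak_ratio


def day_flow_range(
    peak_hour_flow: float, low_ratio: float = 0.06, high_ratio: float = 0.10
) -> tuple[float, float]:
    """Return (lower, upper) day flow estimates for a range of peak hour flow ratios."""
    # A larger ratio implies a smaller day flow, so the bounds swap.
    return (
        estimate_day_flow(peak_hour_flow, high_ratio),
        estimate_day_flow(peak_hour_flow, low_ratio),
    )


# Hypothetical survey result: 2,400 vehicles counted in the peak hour.
surveyed_peak = 2400.0
point_estimate = estimate_day_flow(surveyed_peak)
lower, upper = day_flow_range(surveyed_peak)
print(f"day flow is about {point_estimate:.0f} (range {lower:.0f} to {upper:.0f})")
```

Because the interval 0.06–0.10 covers most but not all intersections and roads, the resulting range is best read as a plausible band rather than a strict bound.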

Built environment influence factors (*Num_bus*, *Rd_type*, *Num_hosp*, *Num_scho*) were considered in the correlation analysis with traffic flow. We found that the effect of lane number on the service traffic volumes of intersections and roads is more significant than that of the other factors. The lane number has a significant positive linear correlation with the average service traffic flow: for both roads and intersections, the more lanes there are, the more vehicles are served. However, the maximum values of traffic flow revealed that the service capacity differs and exhibits a segmentation feature. Namely, in both cases, optimal network function is achieved at a certain number of lanes. The case results indicate that the maximum number of lanes of intersections and roads should be 14 and 4, respectively. The latter, 4 lanes, is merely a reference because the utility of roads is also determined by the green time or split. In this context, we should reconsider the road diet [51] from the point of view of better performance and utilization rate of the road network. The discovery of the optimal lane number provides new insight and a reference for urban planning and traffic design. Other factors were not found to be strongly correlated with traffic flow. However, correlations among these factors were revealed, such as road length with the number of bus stations, the numbers of hospitals and schools with degree, and degree with the number of schools. Finally, we proposed an improved gravity model to estimate the traffic flow at the day and peak hour scales for specific flow ranges. This model represents a new approach to investigating traffic flow using simple structural parameters (the number of lanes and the length of a road).

The results of this study provide quantitative support for understanding the spatiotemporal characteristics of urban traffic flow and their relationship with the built environment. As empirical evidence, they could serve as a reference for current traffic management and help determine how to reduce the waste of road resources. However, the results are perhaps only valid in this case; thus, more data from other cities are required to explore whether a universal rule exists. It would be interesting to explore how universal our findings are by conducting a similar analysis for European and other cities so that we can better understand urban transport systems. The proposed analysis method and the subsequent results will be important references for European transport studies of trip demand distributions and of the correlation between traffic flow and the built environment. The spatiotemporal characteristics of urban traffic flow and their relationship with the built environment must be investigated further in future studies. Beyond vehicle-level traffic demand, drivers’ route selection characteristics based on trajectory data will also be the subject of our next research.

## References

Boeing G (2017) OSMnx: new methods for acquiring, constructing, analyzing, and visualizing complex street networks. Comput Environ Urban Syst 65:126–139 https://doi.org/10.1016/j.compenvurbsys.2017.05.004.

Jiang B (2007) A topological pattern of urban street networks: universality and peculiarity. Physica A 384(2):647–655 https://doi.org/10.1016/j.physa.2007.05.064.

Masucci AP, Stanilov K, Batty M (2014) Exploring the evolution of London’s street network in the information space: a dual approach. Phys Rev E 89(1):012805 https://doi.org/10.1103/PhysRevE.89.012805.

Buhl J, Gautrais J, Reeves N, Solé RV, Valverde S, Kuntz P, Theraulaz G (2006) Topological patterns in street networks of self-organized urban settlements. Eur Phys J B 49(4):513–522 https://doi.org/10.1140/epjb/e2006-00085-1.

Crucitti P, Latora V, Porta S (2006) Centrality measures in spatial networks of urban streets. Phys Rev E 73(3):036125 https://doi.org/10.1103/PhysRevE.73.036125.

Kalapala V, Sanwalani V, Clauset A, Moore C (2006) Scale invariance in road networks. Phys Rev E 73(2):026130 https://doi.org/10.1103/PhysRevE.73.026130.

Porta S, Crucitti P, Latora V (2006) The network analysis of urban streets: a dual approach. Physica A 369(2):853–866 https://doi.org/10.1016/j.physa.2005.12.063.

Geroliminis N, Daganzo CF (2008) Existence of urban-scale macroscopic fundamental diagrams: some experimental findings. Transp Res B Methodol 42(9):759–770 https://doi.org/10.1016/j.trb.2008.02.002.

Daganzo CF (2007) Urban gridlock: macroscopic modeling and mitigation approaches. Transp Res B Methodol 41(1):49–62 https://doi.org/10.1016/j.trb.2006.03.001.

Newman MEJ (2010) Networks: an introduction. Oxford University Press, New York.

Wen T, Chin W, Lai P (2017) Understanding the topological characteristics and flow complexity of urban traffic congestion. Physica A 473:166–177 https://doi.org/10.1016/j.physa.2017.01.035.

Jiang B, Ren Z (2018) Geographic space as a living structure for predicting human activities using big data. Int J Geogr Inf Sci:1–16 https://doi.org/10.1080/13658816.2018.1427754.

Osama A, Sayed T (2017) Evaluating the impact of connectivity, continuity, and topography of sidewalk network on pedestrian safety. Accid Anal Prev 107:117–125 https://doi.org/10.1016/j.aap.2017.08.001.

Zhao S, Zhao P, Cui Y (2017) A network centrality measure framework for analyzing urban traffic flow: a case study of Wuhan, China. Physica A 478:143–157 https://doi.org/10.1016/j.physa.2017.02.069.

Zhang X, Xu Y, Tu W, Ratti C (2018) Do different datasets tell the same story about urban mobility — a comparative study of public transit and taxi usage. J Transp Geogr 70:78–90 https://doi.org/10.1016/j.jtrangeo.2018.05.002.

Gehrke SR, Welch TF (2017) The built environment determinants of activity participation and walking near the workplace. Transportation 44(5):941–956 https://doi.org/10.1007/s11116-016-9687-5.

Cervero R (2002) Built environments and mode choice: toward a normative framework. Transp Res Part D: Transp Environ 7(4):265–284 https://doi.org/10.1016/S1361-9209(01)00024-4.

Ding C, Wang D, Liu C, Zhang Y, Yang J (2017) Exploring the influence of built environment on travel mode choice considering the mediating effects of car ownership and travel distance. Transp Res A Policy Pract 100:65–80 https://doi.org/10.1016/j.tra.2017.04.008.

Munshi T (2016) Built environment and mode choice relationship for commute travel in the city of Rajkot, India. Transp Res Part D: Transp Environ 44:239–253 https://doi.org/10.1016/j.trd.2015.12.005.

Sun B, Ermagun A, Dan B (2017) Built environmental impacts on commuting mode choice and distance: evidence from Shanghai. Transp Res Part D: Transp Environ 52:441–453 https://doi.org/10.1016/j.trd.2016.06.001.

Marshall WE, Garrick NW (2011) Does street network design affect traffic safety? Accid Anal Prev 43(3):769–781 https://doi.org/10.1016/j.aap.2010.10.024.

Zhang Y, Bigham J, Ragland D, Chen X (2015) Investigating the associations between road network structure and non-motorist accidents. J Transp Geogr 42:34–47 https://doi.org/10.1016/j.jtrangeo.2014.10.010.

Badoe DA, Miller EJ (2000) Transportation-land-use interaction: empirical findings in North America, and their implications for modeling. Transp Res Part D: Transp Environ 5(4):235–263 https://doi.org/10.1016/S1361-9209(99)00036-X.

Zhang K, Sun DJ, Shen S, Zhu Y (2017) Analyzing spatiotemporal congestion pattern on urban roads based on taxi GPS data. J Transp Land Use 10(1):675–694 https://doi.org/10.5198/jtlu.2017.954.

Zhang T, Sun L, Yao L, Rong J (2017) Impact analysis of land use on traffic congestion using real-time traffic and POI. J Adv Transp 2017:7164790 https://doi.org/10.1155/2017/7164790.

Lammer S, Gehlsen BR, Helbing D (2006) Scaling laws in the spatial structure of urban road networks. Physica A 363(1):89–95 https://doi.org/10.1016/j.physa.2006.01.051.

Jiang B (2009) Street hierarchies: a minority of streets account for a majority of traffic flow. Int J Geogr Inf Sci 23(8):1033–1048 https://doi.org/10.1080/13658810802004648.

Moya-Gómez B, García-Palomares JC (2017) The impacts of congestion on automobile accessibility. What happens in large European cities? J Transp Geogr 62:148–159 https://doi.org/10.1016/j.jtrangeo.2017.05.014.

Wang S, Zheng L, Yu D (2017) The improved degree of urban road traffic network: a case study of Xiamen, China. Physica A 469:256–264 https://doi.org/10.1016/j.physa.2016.11.090.

Wang S, De Y, Lin C, Shang Q, Lin Y (2018) How to connect with each other between roads? An empirical study of urban road connection properties. Physica A 512:775–787 https://doi.org/10.1016/j.physa.2018.08.115.

Zhang W, Wang S, Tian X, Yu D, Yang Z (2017) The backbone of urban street networks: degree distribution and connectivity characteristics. Adv Mech Eng 9(11):1–11 https://doi.org/10.1177/1687814017742570.

Wang S, Yu D, Wang S, Xing R, Li Z (2018) Connectivity characteristics of urban road network elements based on improved degree. J Traff Transp Eng 18(02):101–110.

Xing X, Yu D, Tian X, Wang S (2017) Analysis of multi-state traffic flow time series properties using visibility graph. Acta Phys Sin 66(23):230501 https://doi.org/10.7498/aps.66.230501.

Sims AG, Dobinson KW (1980) The Sydney coordinated adaptive traffic (SCAT) system philosophy and benefits. IEEE Trans Veh Technol 29(2):130–137 https://doi.org/10.1109/T-VT.1980.23833.

Moeckel R (2017) Constraints in household relocation: modeling land-use/transport interactions that respect time and monetary budgets. J Transp Land Use 10(1):211–228 https://doi.org/10.5198/jtlu.2015.810.

Handy S, Cao XY, Mokhtarian P (2005) Correlation or causality between the built environment and travel behavior? Evidence from northern California. Transp Res Part D: Transp Environ 10(6):427–444 https://doi.org/10.1016/j.trd.2005.05.002.

Wang JF, Li XH, Christakos G, Liao YL, Zhang T, Gu X, Zheng XY (2010) Geographical detectors-based health risk assessment and its application in the neural tube defects study of the Heshun region, China. Int J Geogr Inf Sci 24(1):107–127 https://doi.org/10.1080/13658810802443457.

Qian J, Han D (2009) A spatial weighted network model based on optimal expected traffic. Physica A 388(19):4248–4258 https://doi.org/10.1016/j.physa.2009.05.047.

Jung W, Wang F, Stanley HE (2008) Gravity model in the Korean highway. Europhys Lett 81(4):48005 https://doi.org/10.1209/0295-5075/81/48005.

Liu X, Xia H (2008) Estimating methods of passenger throughput for hub airport based on reverse gravity model. J Traff Transp Eng 8(2):85–89.

Sato A, Sawai H (2015) Relationship between socioeconomic flows and social stocks: case study on Japanese air transportation. Evol Inst Econ Rev 12(2):243–263 https://doi.org/10.1007/s40844-015-0016-z.

Zhang Y, Peng T, Hao S (2016) Gravity model for forecasting airline passenger flow considering network structure. J Wuhan Univ Technol (Transportation Science & Engineering) 40(1) https://doi.org/10.3963/j.issn.2095-3844.2016.01.003.

Dai T, Jin F (2008) Spatial interaction and network structure evolvement of cities in terms of China’s rail passenger flows. Chin Geogr Sci 18(3):206–213 https://doi.org/10.1007/s11769-008-0206-2.

Ren Y, Ercsey-Ravasz M, Wang P, Gonzalez MC, Toroczkai Z (2014) Predicting commuter flows in spatial networks using a radiation model based on temporal ranges. Nat Commun 5:5347 https://doi.org/10.1038/ncomms6347.

De Montis A, Barthelemy M, Chessa A, Vespignani A (2007) The structure of interurban traffic: a weighted network analysis. Environ Plann B Plann Des 34(5):905–924 https://doi.org/10.1068/b32128.

Simini F, Gonzalez MC, Maritan A, Barabasi A (2012) A universal model for mobility and migration patterns. Nature 484(7392):96–100 https://doi.org/10.1038/nature10856.

Song C, Koren T, Wang P, Barabasi A (2010) Modelling the scaling properties of human mobility. Nat Phys 6(10):818–823 https://doi.org/10.1038/nphys1760.

Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. J Transp Geogr 51:158–169 https://doi.org/10.1016/j.jtrangeo.2015.12.008.

Huang L, Zhu X, Ye X, Guo W, Wang J (2016) Characterizing street hierarchies through network analysis and large-scale taxi traffic flow: a case study of Wuhan, China. Environ Plann B Plann Des 43:276–296 https://doi.org/10.1177/0265813515614456.

Sang SL, Young TO, Seung HL, Kee CC (2002) Development of degree of saturation estimation models for adaptive signal systems. KSCE J Civ Eng 6(3):337–345 https://doi.org/10.1007/BF02829156.

Knapp K, Chandler B, Atkinson J, Welch T, Rigdon H, Retting R, Meekins S, Widstrand E, Porter RJ (2014) Road diet informational guide. Federal Highway Administration, Washington, D.C. https://safety.fhwa.dot.gov/road_diets/guidance/info_guide/rdig.pdf.

## Acknowledgments

We thank Shi Qiu for discussing the questions and analyzing data, the editor and reviewers for their valuable comments, and Professor Mei-Po Kwan at UIUC for helpful revision suggestions.

### Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 51408257, 51308249, 51308248), the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (Grant No. 2014BAG03B03) and the China Scholarship Council (Grant No. 201806170188).

### Availability of data and materials

The data that support the findings of this study are available from Jilin University but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Jilin University.

## Author information

### Contributions

SW and DY conceived and designed the experiments; XM and XX performed the experiments; SW and XM analyzed the data; DY contributed materials/analysis tools; SW and XX wrote the paper. All authors read and approved the final manuscript.

## Ethics declarations

### Competing interests

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Wang, S., Yu, D., Ma, X. *et al.* Analyzing urban traffic demand distribution and the correlation between traffic flow and the built environment based on detector data and POIs.
*Eur. Transp. Res. Rev.* **10**, 50 (2018). https://doi.org/10.1186/s12544-018-0325-5

DOI: https://doi.org/10.1186/s12544-018-0325-5

### Keywords

- Sustainable transportation
- Urban traffic flow
- Travel patterns
- Spatiotemporal characteristics
- Built environment
- Lane marginal utility
- Gravity model