Improving inbound logistic planning for large-scale real-world routing problems: a novel ant-colony simulation-based optimization

This paper presents the first results of an agent-based model aimed at solving a Capacitated Vehicle Routing Problem (CVRP) for inbound logistics using a novel Ant Colony Optimization (ACO) algorithm, developed and implemented in the NetLogo multi-agent modelling environment. The proposed methodology has been applied to the case study of a freight transport and logistics company in southern Italy in order to find an optimal set of routes for transporting palletized fruit and vegetables from different farms to the main depot while minimizing the total distance travelled by trucks. Different scenarios have been analysed and compared with real data provided by the company, using a set of key performance indicators that includes the load factor and the number of vehicles used. First results highlight the validity of the method for reducing costs and scheduling effort, and provide useful suggestions for the large-scale operations of a freight transport service.


Introduction
Logistics is the set of services and activities that allow goods to be carried from the place of origin in which they are available to the destinations where they are required. Transport helps to connect the sources of raw materials, production centres and markets, generating an increase in the value of goods sufficient to justify the transport cost incurred. The first component of the logistics system is inbound logistics, which deals with the management of incoming materials, that is, with the purchase and supply of raw materials, components or semi-finished products arriving from upstream suppliers of the logistics network. Among the activities of order management, collection, storage, internal handling and transport of goods, the latter often represents the main cost item. Therefore, a transport company that is able to provide an efficient and timely service achieves an advantage in increasingly competitive national and international markets. By improving route assignments to the vehicles of the fleet, it is possible to obtain significant time and cost savings.
However, even big companies often plan loading and distribution operations based on their empirical experience, without optimization methods able to minimize driving distance and avoid wasted space inside the transport vehicles or, at worst, infeasible loading [1]. Since the relatively recent development of computer tools, a huge amount of scientific literature has been produced with the aim of optimizing delivery and/or pickup operations for a fleet of vehicles serving a set of customers and subject to side constraints. This gave rise to a whole class of problems sharing the generic name of Vehicle Routing Problem (VRP). The original version of the VRP was proposed by Dantzig and Ramser [8] under the name of Truck Dispatching Problem, which dealt with the calculation of optimal routes for a fleet of trucks for petrol deliveries. This problem, in turn, may be considered a generalization of the Travelling Salesman Problem (TSP), which consists in finding the shortest route (or, in general terms, the lowest cost path) connecting all vertices of a graph, starting and finishing at a specified vertex after having visited every other vertex exactly once. Thanks to its numerous practical implications (especially in logistics but also in passenger transport), several variants of the basic problem have been put forward in recent years. One of the most studied members of the VRP family is the Capacitated Vehicle Routing Problem (CVRP), in which a fleet of identical vehicles has to be optimally routed from a central depot to supply a set of geographically dispersed customers with known demands [2]. Although CVRPs are not as "hard" to deal with as problems with pickups and deliveries and/or time windows, for large-scale instances it is fundamental to reduce the computational demand by acting both on the optimization algorithm and on the network topology, which is precisely the point on which this paper is focused.
The work is organized as follows: Section 1 introduced the research topic, highlighting the applicability of the Ant Colony Optimization metaheuristic to solve freight transport problems; Section 2 presents a brief literature review on VRP instances and their resolution approaches, with reference to the research contribution; Section 3 describes in detail the methodology adopted and the algorithm implemented in the multi-agent simulation environment; Section 4 presents the application of the model to a real case study; Section 5 shows and discusses the experimental results; finally, Section 6 concludes the work, providing some considerations for further research.

The use of Ant Colony Optimization to solve the VRP
It is well known that the VRP, in its various specifications, is a non-deterministic polynomial-time hard (NP-hard) problem, which is not easily addressed with exact algorithms, since the computational time grows exponentially with problem size (as the scale of logistics and distribution operations increases, this time becomes extremely high). Therefore, a feasible option consists in formulating heuristic and metaheuristic algorithms, conceived so as to generate solutions that are as close as possible to the optimal one. Ant Colony Optimization (ACO) algorithms are derived from an analogy with ants, which lay a volatile substance called "pheromone" on their trail when foraging for food. In this family of metaheuristics, by extension, a certain number of simple artificial agents cooperate to build good solutions to hard combinatorial optimization problems via low-level communication [9]. Iteration after iteration, more pheromone is deposited on the more frequented trails, and this brings out a learning mechanism: when constructing a VRP solution, the probability of selecting a certain move is higher if this move has led to a better solution in previous iterations. Therefore, the "autocatalytic" nature of the process leads to convergence towards good near-optimal solutions. A detailed explanation of the algorithm proposed in the present work will be provided in section 3.2.
In general, ACO is conceived to find the minimum cost paths within a network, so it has several applications to routing and scheduling problems and is of particular interest in transport problems [4,17]. Besides, thanks to its easy applicability to dynamic problems, where the topology or the characteristics of the network change during the simulation, ACO algorithms can perform better than other metaheuristics in such settings. The good performance of ACO in solving such optimization problems is highlighted by the works of Catay [6] and Carabetti et al. [5], who applied the ACO approach to a series of benchmark problems, finding results that were comparable to, and in some cases better than, those available in the literature.

Literature review and research contribution
Extensive reviews of VRP instances exist [7,21], and numerous variations of the basic problem in real-world applications have been addressed, including supply chain and freight transport issues [14], public transport [23], street cleaning, urban solid waste collection [3], school bus routing [12] and other instances.
Zhang et al. [26] investigated the reverse logistics vehicle routing problem with a single depot and simultaneous distribution and collection of goods by a homogeneous fleet of vehicles under restrictions of maximum capacity and maximum distance. They proposed an Ant Colony System (ACS) approach in which the vehicle residual loading capacity is introduced into the heuristic function to account for the dynamic fluctuation of the vehicle load. Xiao et al. [22] extended the classical CVRP by introducing the objective of minimizing fuel consumption, assumed to be a load-dependent function, and used a simulated annealing algorithm to solve the problem. Lin et al. [15] addressed the recent trend towards environmental sensitivity in supply chain management through a survey of green vehicle routing problems. Schneider et al. [18] introduced the electric VRP with time windows, in which vehicles can recharge at any of the available stations, exploiting a hybrid heuristic that combines neighbourhood search and tabu search. Wang et al. [24] proposed a modified ACO algorithm integrated with savings algorithms to solve the CVRP, allowing ants to go in and out of the depot more than once until they have visited all customers, with the aim of simplifying the construction of feasible solutions. Martin et al. [16] developed a multi-agent framework for scheduling and routing problems in which agents use different metaheuristics and cooperate by sharing partial solutions during the search, giving rise to a reinforcement learning and pattern matching process. Hannan et al. [11] addressed the routing and scheduling optimization problem in waste collection by using a modified particle swarm optimization algorithm in a CVRP model, with the objective of minimizing travel distance, collected waste and tightness. Song et al. [19] proposed a multi-objective approach to solve a CVRP with time windows and two-dimensional loading constraints, making use of mixed integer linear programming and a generalised variable neighbourhood search algorithm.
This paper contributes to the current literature by proposing a new agent-based modelling framework for the optimized planning of truck routes in large-scale inbound operations. The contribution is twofold. It is methodological, since the ACO has been applied to a specific real transport network, making it possible to vary some parameters related to truck freight (e.g. maximum working time, average speed, truck capacity, etc.) in each simulation and to assess their influence. It is also operational, since the proposed model stands as a practical optimization tool able to support logistics operators in the route planning phase of their service.

Methodology
The problem addressed in the present work is to plan and design an optimal set of routes for the collection of goods through a new methodological approach. Its main components are the following: a single depot (D) where collection, groupage and distribution activities are centralized; a set of clients (farms) clustered in client nodes (N) spread over the study area, each with a pick-up demand (P_j); a fleet of vehicles (V) with the same capacity (Q), which carry goods from the farms to the depot; the road network and the set of possible links ((i, j) ∈ L) between different client nodes.
Starting from the assumption of no congestion on the road network (considering the suburban and rural context), and thus assuming a constant average speed for all vehicles, the objective of the optimization process is to minimize the Total Distance Travelled (TDT) by the vehicles of the fleet (minimization of the operational costs), while taking into account a work hours constraint. Decision variables, objective function and constraints will be explained in section 3.1. The optimal assignment of vehicles to routes is achieved by using an ACO algorithm that will be described in detail in section 3.2. Simulations have been carried out in NetLogo [25], a multi-agent programming and modelling environment that makes it possible to model and simulate complex systems and to visualize their parameters in real time.

Model conceptualization
The agent-based model is structured on a double-layer network. The base layer spatially reconstructs the real road network, while the upper layer reproduces the directed graph of the possible connections (links) between the different client nodes and between the depot and all the nodes.
The first step is the reconstruction of the road network in the NetLogo workspace and the localization of client nodes. Each client node is associated with an array of the indivisible loading units from each farm belonging to that client node. This procedure has been designed to reduce the number of nodes and links of the graph by creating clusters of neighbouring farms, since a problem dealing with hundreds of customers (large-scale VRP) is computationally demanding and difficult to tackle (Fig. 1).
Starting from the base layer of the road network, the upper layer is created by connecting each client node with the depot and with a certain number of other client nodes. This number varies from node to node and depends on two selected parameters: a maximum road distance (d-max) to other nodes and a minimum number of links (n-links) to the nearest nodes. The road distance (d_ij) is an attribute of the link connecting two client nodes; it is calculated for each link through a shortest path algorithm before the simulation starts.
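The link-generation step can be illustrated with a minimal Python sketch. It is not the NetLogo code of the model: the function name `build_upper_layer`, the dictionary `road_dist` of precomputed shortest-path distances and the exact selection rule (all nodes within d-max, but never fewer than the n-links nearest ones) are assumptions made here to give one possible reading of the two parameters.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable
Link = Tuple[Node, Node]


def build_upper_layer(
    nodes: List[Node],
    depot: Node,
    road_dist: Dict[Link, float],
    d_max: float,
    n_links: int,
) -> Set[Link]:
    """Return the directed links of the upper layer.

    road_dist[(i, j)] is the precomputed shortest-path road distance d_ij.
    """
    links: Set[Link] = set()
    for i in nodes:
        # Every client node is connected with the depot in both directions.
        links.add((depot, i))
        links.add((i, depot))
        # Rank the other client nodes by road distance from i.
        others = sorted(
            (j for j in nodes if j != i), key=lambda j: road_dist[(i, j)]
        )
        within_radius = [j for j in others if road_dist[(i, j)] <= d_max]
        # Assumed rule: keep every node within d_max, but never fewer than
        # the n_links nearest ones.
        if len(within_radius) >= n_links:
            chosen = within_radius
        else:
            chosen = others[:n_links]
        links.update((i, j) for j in chosen)
    return links
```

Under this reading, an isolated node still receives n-links outgoing connections, while a node in a dense area may receive many more, which is why the number of links varies from node to node.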
The CVRP model is formulated as follows. Eq. (1) is the objective function, which minimizes the TDT; x_ij,v is a binary variable equal to 1 if vehicle v travels along the link (i, j), and 0 otherwise. Eq. (2) ensures that the pick-up load of each vehicle never exceeds its capacity. Eq. (3) imposes a maximum travel time TT_max (work hours constraint, generally equal to 8 h) for each vehicle, where t_s is the service time that the vehicle spends at each farm for loading operations (about 15 min), t_ij is the travel time along the link (i, j) and n_f is the number of farms of the client node j served by vehicle v.
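A compact formulation consistent with these definitions might read as follows. It is a sketch rather than the authors' exact equations: the binary variable y_{j,h,v} (equal to 1 if vehicle v collects the demand p_{j,h} of farm h in client node j) is an assumption introduced here to express the vehicle load, and the way the service time enters Eq. (3) is likewise one possible reading.

```latex
\begin{align}
\min \; TDT &= \sum_{v \in V} \sum_{(i,j) \in L} d_{ij} \, x_{ij,v} \tag{1} \\
\sum_{j \in N} \sum_{h} p_{j,h} \, y_{j,h,v} &\le Q
  \qquad \forall v \in V \tag{2} \\
\sum_{(i,j) \in L} \left( t_{ij} + n_f \, t_s \right) x_{ij,v} &\le TT_{max}
  \qquad \forall v \in V \tag{3}
\end{align}
```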
It should be specified that each farm must be visited only once (indivisibility of pick-up loads), with a constant service time, and that the capacity of each vehicle cannot be exceeded by its pick-up operations. However, since a client node could have an overall demand exceeding the vehicle capacity, it may be served by more than one vehicle of the fleet, as opposed to classical VRP instances.
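The capacity and work hours constraints can be checked for a single route with a few lines of Python. This is a minimal sketch under the constant-speed assumption stated in the methodology; the function names, the argument names and the default values (15 min of service time per farm, 8 h of maximum travel time) are chosen here for illustration, the defaults being the indicative figures quoted in the text.

```python
from typing import Sequence


def route_time_hours(
    leg_distances_km: Sequence[float],
    farms_served: int,
    avg_speed_kmh: float,
    service_time_h: float = 0.25,
) -> float:
    """Total time of one route: travel at constant speed plus loading stops."""
    travel_time = sum(leg_distances_km) / avg_speed_kmh
    return travel_time + farms_served * service_time_h


def route_is_feasible(
    load_units: int,
    capacity_units: int,
    time_h: float,
    tt_max_h: float = 8.0,
) -> bool:
    """A route is feasible if it respects both capacity and working time."""
    return load_units <= capacity_units and time_h <= tt_max_h
```

For instance, a route of 90 km driven at 60 km/h with 10 farm stops takes 1.5 h of travel plus 2.5 h of loading, i.e. 4 h, well within the 8 h limit.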

The Ant Colony Optimization algorithm
It is well known that CVRPs are NP-hard problems in the field of operations research. Moreover, the proposed model aims at addressing large-scale instances in which the residual orders of client nodes change dynamically during the simulation, so they are practically impossible to solve using exact methods. In recent decades, a huge number of heuristic procedures have been developed in order to find good suboptimal solutions with acceptable computational effort. Among them, metaheuristics take inspiration from natural optimization mechanisms, translating them into specific algorithms. In particular, ACO algorithms [10] derive from the social behaviour of some ant species which are capable of finding the shortest paths between their nest and a food source. This ability arises because ants can exploit a form of communication based only on pheromone trails, made of a volatile chemical substance that ants deposit on the ground. Artificial ants, despite being very simple agents, can form colonies able to perform highly complex tasks and jointly solve optimization problems by dynamically interacting with each other.
The algorithm implemented in the present model derives from the MAX-MIN Ant System [20], which improved on the first member of the ACO family, named Ant System, originally applied to the Travelling Salesman Problem. Simulations are based on an iterative optimization process that ends after a given number of generations (g) of a chosen number of colonies (m), each made of a specified number of ants. The quality improvement of the final solution comes from the comprehensive exploitation, iteration after iteration, of three different information components: simulated artificial ants build routes by considering a) the pheromone trail, b) the "visibility" and c) the residual capacity. The first component is updated for each link when a new generation g of colonies is launched. The last two components are included in the heuristic function, whose structure is shown in Eq. (4). The visibility is given by the reciprocal of the distance associated with a link and represents the fixed information available a priori. As concerns the residual capacity, when ants explore their neighbourhood (the client nodes linked with the current node), the feasible combination of orders from the farms belonging to the next client node has to be recalculated every time. So, if a client node consists of n farms, each one with a given pick-up demand (number of loading units), ant k at iteration t investigates all the combinations of pick-up demands p_j,h that can be served without exceeding the residual loading capacity (the orders combination list).
In Eq. (4), d_ij is the road distance between node i and node j, and p_j,h(t) equals p_j,h if it belongs to the orders combination list and zero otherwise. However, there are situations in which, although the remaining capacity makes it possible to satisfy another pick-up demand, it may be preferable to return to the depot so as not to further lengthen the distance covered. Consequently, we took into account a heuristic function for the links from client node i to the depot D, given by the reciprocal of the product of the distance d_iD and the residual loading capacity at iteration t. Since the total number of orders, expressed in standard loading units, is usually much higher than the capacity of a single vehicle, multiple routes must be found, each one served by one vehicle. So, every solution must be built by an ant colony, able to jointly minimize the total distance travelled without violating the constraints of maximum capacity and maximum working time.
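The orders combination list and the depot heuristic can be sketched in Python as follows. This is an illustrative sketch, not the model's NetLogo code: the function names are hypothetical, farms are identified by their position in the demand array of the client node, and the handling of a zero residual capacity is an assumption added to keep the depot heuristic well defined. The exhaustive enumeration grows exponentially with the number of farms, so it is only meant for the small arrays of a single client node.

```python
from itertools import combinations
from typing import List, Tuple


def feasible_order_combinations(
    demands: List[int], residual_capacity: int
) -> List[Tuple[int, ...]]:
    """Enumerate the subsets of farms (by index) that fit the residual capacity."""
    feasible = []
    for size in range(1, len(demands) + 1):
        for combo in combinations(range(len(demands)), size):
            # A pick-up demand is indivisible: either taken whole or skipped.
            if sum(demands[h] for h in combo) <= residual_capacity:
                feasible.append(combo)
    return feasible


def depot_heuristic(d_iD: float, residual_capacity: int) -> float:
    """Reciprocal of (distance to the depot x residual loading capacity)."""
    if residual_capacity <= 0:
        # Assumption: a full vehicle has no alternative to the depot.
        return float("inf")
    return 1.0 / (d_iD * residual_capacity)
```

With farm demands of 4, 6 and 3 loading units and a residual capacity of 9, the feasible combinations are the three single farms plus the pairs (4, 3) and (6, 3); the pair (4, 6) and the full triple exceed the capacity. The depot heuristic grows as the vehicle fills up, which makes returning to the depot more attractive for nearly full vehicles.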
Step by step, each ant of the colony applies a random proportional rule to decide the next farm to visit. The probability with which ant k, currently at farm i, chooses to go to farm j is given by Eq. (5), where α and β are calibration parameters that control the relative importance of the pheromone trail τ_ij versus the heuristic information η_ij, and N_i^k is the feasible neighbourhood of ant k at farm i.
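In its standard form in the ACO literature, the random proportional rule assigns to each feasible node j a probability proportional to τ_ij^α · η_ij^β, normalized over the feasible neighbourhood N_i^k. The Python sketch below assumes this standard form; the function names and the dictionary-based data layout (pheromone and heuristic values keyed by candidate node) are hypothetical choices made for illustration.

```python
import random
from typing import Dict, Hashable, Iterable, Optional

Node = Hashable


def transition_probabilities(
    tau: Dict[Node, float],
    eta: Dict[Node, float],
    feasible: Iterable[Node],
    alpha: float,
    beta: float,
) -> Dict[Node, float]:
    """Random proportional rule: weight tau^alpha * eta^beta, then normalize."""
    weights = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in feasible}
    total = sum(weights.values())
    if total == 0:
        # Empty neighbourhood or all-zero weights: the ant has no move left.
        raise ValueError("no feasible move from the current node")
    return {j: w / total for j, w in weights.items()}


def choose_next(
    probabilities: Dict[Node, float], rng: Optional[random.Random] = None
) -> Node:
    """Draw the next node according to the transition probabilities."""
    rng = rng or random.Random()
    nodes = list(probabilities)
    return rng.choices(nodes, weights=[probabilities[j] for j in nodes], k=1)[0]
```

With equal pheromone on two candidate links and heuristic values of 1 and 3, setting α = β = 1 gives probabilities of 0.25 and 0.75; raising β sharpens the preference for the link with the higher heuristic value, while raising α gives more weight to the learned pheromone trail.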
When an ant of the colony reaches the loading capacity (or its feasible neighbourhood is empty) it comes back to the depot and the next ant of the same colony starts its tour. When pick-up demands are all satisfied, the following colony of ants is allowed to explore other possible solutions, until the given number m of colonies is reached.
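Each generation closes with the pheromone reinforcement by the best colony, referred to in the text as Eqs. (6) and (7). A minimal Python sketch of that step is given below. It assumes the usual evaporation-plus-deposit form, τ_ij ← (1 − ρ)·τ_ij + Δτ_ij^best, and takes the deposit as the product of the diffusion rate and the reciprocal of the best total distance; both forms are assumptions consistent with the definitions of ρ, E and the diffusion rate, and the function and argument names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Tuple

Link = Tuple[Hashable, Hashable]


def update_pheromone(
    tau: Dict[Link, float],
    best_links: Iterable[Link],
    best_tdt: float,
    rho: float,
    q: float,
) -> Dict[Link, float]:
    """Evaporate pheromone on every link, then reinforce the best colony's links.

    rho is the evaporation rate (0 to 1), q the diffusion rate (> 0) and
    best_tdt the total distance travelled by the best colony of the generation.
    """
    # Assumed deposit: diffusion rate times the reciprocal of the best TDT.
    deposit = q * (1.0 / best_tdt)
    reinforced = set(best_links)
    return {
        link: (1.0 - rho) * level + (deposit if link in reinforced else 0.0)
        for link, level in tau.items()
    }
```

Because only the links of the best colony receive a deposit while all links evaporate, the trails of discarded solutions fade generation after generation, which concentrates the search around the best routes found so far.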
Once all the m colonies have found their solutions, only the "best" colony (i.e. the one that finds the solution minimizing the total distance travelled) is allowed to reinforce the pheromone trail, so as to better exploit the best of the m solutions found, using the updating rule of Eq. (6), where ρ is the evaporation rate, ranging from 0 to 1, and Δτ_ij^best is the amount of pheromone deposited on each link (i, j) used by the best colony at generation g. This amount is given by Eq. (7), where E represents the value of the objective function (i.e. the reciprocal of the total distance travelled by vehicles) and Q is the diffusion rate, which is greater than zero. Finally, when the maximum number of generations is reached, the simulation stops and outputs the results. The whole process described so far is shown as a flow chart in Fig. 2.

Case study

Territorial framework
The described methodology is applied to the case study of Gali Group, a freight transport and logistics company located in Ispica (Sicily), at the eastern end of the province of Ragusa, bordering the Siracusa district. Ispica is 33 km from Ragusa (Fig. 3); it has an area of about 110 km² with a population density of 143.54 inhabitants/km² [13]. Its economy is primarily agricultural, with major outputs of early fruit, tomatoes, vegetables and carob, for which Ispica is Italy's biggest producer and exporter. Industry has developed in recent decades, particularly agriculture-related businesses. Thus, the main industrial activities are those involved in processing and marketing agricultural products.
In this context, the Gali Group company is recognized as a landmark for the pick-up and delivery of horticultural products from Sicily to central and northern Italy and, in some cases, abroad. Its activity is based on road transport. The company allows clients to submit pick-up orders until 5.00 p.m., and the pick-up activity is carried out only after this time. This way of working arises from the idea that logistics operators can decide the routes once they have an almost complete knowledge of the orders. This clearly affects the subsequent phases of the logistics process. Hence, this work analyses the upstream part of the process by addressing the pick-up procedure, in order to provide an optimized route planning in terms of times and costs. The study area is the catchment area of Gali Group, as shown in Fig. 3 (on the right side); the data analysis and the algorithm implementation are described below.

Data analysis and algorithm implementation
The data analysis refers to the period between May 2018 and March 2019. The initial data basis for this study essentially consists of 3 days with a maximum flow of goods. For all these days, incoming orders have been collected, with the following information: number, code and time of arrival of the order; name and pick-up zone of the customer/provider company corresponding to each order; number and type of loading units.
On average, about 90 orders have been registered for each day with more than 1400 loading units. In addition, the operating program of the logistics operators concerning the procedure of pick-up of goods has been recorded for each day, i.e. total distance travelled (TDT), number of vehicles (NV) and load factor (LF).
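The three indicators can be computed from per-vehicle records as in the Python sketch below. The text does not define the load factor explicitly, so the sketch assumes the common definition (collected loading units divided by the available capacity of the vehicles used); it also reports the average distance per vehicle, which is used later in the comparison. The function name and the input layout are hypothetical.

```python
from typing import Dict, Sequence


def fleet_kpis(
    route_distances_km: Sequence[float],
    route_loads_units: Sequence[int],
    capacity_units: int,
) -> Dict[str, float]:
    """Aggregate daily indicators from one record per vehicle used."""
    if len(route_distances_km) != len(route_loads_units):
        raise ValueError("one distance and one load are expected per vehicle")
    nv = len(route_distances_km)   # number of vehicles (NV)
    tdt = sum(route_distances_km)  # total distance travelled (TDT)
    # Assumed definition of LF: collected units over available capacity.
    lf = sum(route_loads_units) / (nv * capacity_units)
    return {"TDT": tdt, "NV": nv, "LF": lf, "ADT": tdt / nv}
```

Applying the same function to the company's scheduled routes and to the simulated ones gives directly comparable figures for each analysed day.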
Once the study area had been identified as the catchment area of Gali Group, where the various provider companies are located, the first step towards the NetLogo simulations was the definition and construction of the road network graph. It has been drawn using OpenStreetMap as a basis and is structured as a double-layer network (as stated in section 3.1). Figure 4 shows the reconstruction of the real road network (on the left side) and the directed graph of the possible connections between the different client nodes and between the depot and all the nodes (on the right side), reproduced by links.
In the proposed model, clients are represented by clusters of neighbouring farms grouped into client nodes, with the corresponding number of loading units organized in an array for each node. Therefore, 60 client nodes or farms have been positioned. Several simulations have been performed to test the model and results are shown in Table 1. Since the ants apply a probability-based choice criterion at each node, an initial level of pheromone concentration is assigned to each link of the network. Moreover, the maximum number of generations g_max, the number of colonies for each generation m, the vehicle capacity C (i.e. maximum number of units), the diffusion rate Q, the evaporation rate ρ and the exponents α and β have been fixed as input parameters. Their values have been chosen after several tests as those giving the best computational times and model outcomes. The only two parameters that have been varied are the maximum road distance d-max to other client nodes and the minimum number of links n-links to the nearest nodes. Three combinations of d-max and n-links have been considered. The first one, 10-5, is characterized by a low number of connections, limited to the nearest nodes because of the reduced radius. In the second combination, 50-10, both the radius and the minimum number of connections have been increased. The third one, 50-30, keeps the same radius while further increasing the minimum number of connections. The results of these three sets of simulations show that the shortest distances have been obtained for the second combination, 50-10. This suggests that an increase in exploration possibilities does not necessarily correspond to a better solution found by the algorithm (in this case, a decrease of TDT).
To demonstrate the effectiveness of the model, Fig. 5 shows the convergence curve of the objective function obtained in one of the simulations for day 1. The x-coordinate denotes the number of generations and the y-coordinate denotes the corresponding TDT.

For each analysed day, several simulations have been performed with the second combination of d-max and n-links (50-10). Scheduled data (provided by the company) on the total number of travelled kilometres (calculated considering the optimal minimum path for the trip of each vehicle), the number of vehicles and the load factors have been compared with the data derived from simulations. Table 2 shows the aggregated simulation results for the three analysed days. The total number of travelled kilometres (TDT) resulting from the simulation is significantly lower than the scheduled one provided by the company. This outcome is more pronounced for the second and third days, during which the scheduled number of travelled km and vehicles is greater. Moreover, the load factors of simulated vehicles are higher than the scheduled ones, so that fewer vehicles are needed to carry out the pick-up of goods. Only the Average Distance Travelled (ADT) by vehicles is higher than the scheduled one for "Day 1", but this outcome is due to the smaller number of vehicles resulting from simulations.
These findings are even more evident graphically in Fig. 6, which shows the comparison between the daily programme of the logistics company and the data obtained through some simulations, in terms of the number of travelled kilometres and the load factor for each vehicle and for all days.
For most vehicles, the number of travelled kilometres resulting from simulations is consistent with the scheduled one. For the last vehicles, however, this number appears higher in the simulation (Fig. 6 on the left side). This is largely explained by the fact that the load factor of these vehicles is higher (Fig. 6 on the right side). Figure 7 shows the aggregated results for all 3 days. Comparing the average results obtained from the simulations with the scheduled data shows how the model can optimize the routes for the collection of goods. This optimization consists not only in a reduction of the travelled kilometres and a higher load factor of vehicles (as stated before), but also in a lower simulated number of vehicles (e.g. 47 vehicles for the second day) compared to the scheduled one (i.e. 60 vehicles for the second day) (see Fig. 7 on the right).

As seen, the implementation of the ACO in NetLogo, besides representing an optimization tool, allows the logistics company to optimize resources through the routing results of the model, in terms of travelled km, load factor and number of vehicles. Moreover, NetLogo makes it possible to represent networks graphically and to display the best routes, which makes the model a practical operational tool for the company. Furthermore, an original element of the work consists in having applied the multi-agent modelling environment NetLogo to solve optimization problems in the planning of transport operations. The simulations based on the parameters of Table 1 could, after an initial calibration, be replicated in other contexts, allowing benchmark analyses to be performed between different contexts.

Currently, the model does not consider the congestion level of each link of the transport network, and the computations are based on the travelled distance only. Furthermore, the time losses due to the pick-up of goods at each farm are treated in an equivalent manner in the probabilistic choice made by the model in each iteration (i.e. when the number of array components of each pick-up zone varies). This is because several farms are clustered in a single pick-up zone to simplify the model and to decrease the computational demand. These aspects could be further investigated in future research.

Conclusions
This study proposes a new methodological approach to the CVRP to optimize inbound logistics in a large-scale problem. The model has been tested using input data provided by the logistics company Gali Group, located in Sicily, in southern Italy. Data have been acquired for a significant period, and 3 days with a maximum flow of goods have been considered for comparisons. From the analysis of the data, the catchment area of Gali Group has been identified and the road network graph has been constructed in NetLogo to start simulations. The CVRP has been solved with an ACO algorithm to identify an optimal set of routes for the collection of goods, using the travel distance as the objective function and a maximum working time of 8 h/day as a constraint. In this way, the proposed model is able to support logistics operators during the route planning phase, optimizing the operations related to their service: comparing the results obtained from simulations with the scheduled ones shows a significant reduction in the distances travelled by vehicles, as well as in the number of vehicles itself, compared to those planned by the company, with a correspondingly higher load factor. Therefore, this study lays the basis for a deeper analysis of the logistics process as a whole. Further analysis will be carried out in future research in order to obtain more information on the operation of the logistics company through interviews with planners and drivers (e.g. departure and arrival times of vehicles from/to the depot; groupage and delivery processes), to identify algorithm improvements (although the model is in its initial phase, it is already providing very interesting results) and to pave the way for a well-thought-out decision support service for optimized freight transport logistics.