The present work addresses the planning and design of an optimal set of routes for the collection of goods by means of a new methodological approach. The main components of the problem are the following:

- a single depot (D) where collection, groupage and distribution activities are centralized;
- a set of clients (farms) clustered in client nodes (N) spread over the study area, each with a pick-up demand (P_{j});
- a fleet of vehicles (V) with the same capacity (Q), which collect goods from the farms and carry them to the depot;
- the road network and the set of possible links ((i, j) ∈ L) between different client nodes.

Assuming no congestion on the road network (given the suburban and rural context) and therefore a constant average speed for all vehicles, the objective of the optimization process is to minimize the Total Distance Travelled (TDT) by the vehicles of the fleet, and hence the operational costs, subject to a work-hours constraint. Decision variables, objective function and constraints are explained in Section 3.1. The optimal assignment of vehicles to routes is sought using an Ant Colony Optimization (ACO) algorithm, described in detail in Section 3.2. Simulations have been carried out in NetLogo [25], a multi-agent programming and modelling environment that makes it possible to model and simulate complex systems and to visualize their parameters in real time.

### 3.1 Model conceptualization

The agent-based model is structured on a double-layer network. The base layer spatially reconstructs the real road network, while the upper layer reproduces the directed graph of the possible connections (links) between the different client nodes and between the depot and all the nodes.

The first step is the reconstruction of the road network in the NetLogo workspace and the localization of the client nodes. Each client node is associated with an array that collects the indivisible loading units of each farm belonging to that node. This procedure has been designed to reduce the number of nodes and links of the graph by clustering neighbouring farms, since a problem involving hundreds of customers (large-scale VRP) is computationally demanding and difficult to tackle (Fig. 1).

Starting from the base layer of the road network, the upper layer is created by connecting each client node with the depot and with a certain number of other client nodes. This number varies from node to node and depends on two parameters: a maximum road distance (*d-max*) to other nodes and a minimum number of links (*n-links*) to the nearest nodes. The road distance (d_{ij}) is an attribute of the link connecting two client nodes; it is calculated for each link with a shortest-path algorithm before the simulation starts.
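The construction of the upper layer can be sketched in a few lines of Python (used here for illustration only; the model itself runs in NetLogo). The sketch assumes that the shortest-path road distances are already available as a matrix `dist`, and it adopts one possible reading of the two parameters: a client node is linked to every client node within `d_max` and, in any case, to its `n_links` nearest client nodes. The function name `build_links` and its arguments are hypothetical.

```python
from typing import Dict, List, Tuple


def build_links(
    dist: List[List[float]],
    depot: int,
    d_max: float,
    n_links: int,
) -> Dict[Tuple[int, int], float]:
    """Return the directed links (i, j) with their road distance d_ij.

    Assumed rule (one reading of the text, not the authors' code): each
    client node is linked to the depot, to every client node within d_max,
    and in any case to its n_links nearest client nodes.
    """
    clients = [i for i in range(len(dist)) if i != depot]
    links: Dict[Tuple[int, int], float] = {}
    for i in clients:
        # Depot links in both directions.
        links[(depot, i)] = dist[depot][i]
        links[(i, depot)] = dist[i][depot]
        # Other client nodes, nearest first.
        others = sorted((j for j in clients if j != i), key=lambda j: dist[i][j])
        for rank, j in enumerate(others):
            if rank < n_links or dist[i][j] <= d_max:
                links[(i, j)] = dist[i][j]
    return links
```

Because the rule is applied from the point of view of each node, the resulting graph is directed: a link *(i, j)* may exist without its reverse *(j, i)* when *j* is among the nearest nodes of *i* but not the other way round.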

The CVRP model is formulated as follows:

$$ \operatorname{Minimize}\ TDT={\sum}_{i\in N}{\sum}_{j\in N}{\sum}_{v\in V}\ {d}_{ij}\bullet {x}_{ij,v} $$

(1)

$$ \mathrm{Subject}\ \mathrm{to}\ 0\le {\sum}_{i\in N}{\sum}_{j\in N}\ {P}_j\bullet {x}_{ij,v}\le Q\kern3.25em v\in V $$

(2)

$$ {\sum}_{i\in N}{\sum}_{j\in N}\ \left({t}_{ij}+{n}_f\bullet {t}_s\right)\bullet {x}_{ij,v}\le T{T}_{max}\kern2.75em v\in V $$

(3)

Eq. (1) is the objective function, which minimizes the TDT; *x*_{ij,v} is a binary variable equal to 1 if vehicle *v* travels along the link *(i, j)* and 0 otherwise. Eq. (2) ensures that the pick-up load of each vehicle never exceeds its capacity. Eq. (3) imposes a maximum travel time *TT*_{max} on each vehicle (the work-hours constraint, generally equal to 8 h), where *t*_{ij} is the travel time along the link *(i, j)*, *t*_{s} is the service time that the vehicle spends at each farm for loading operations (about 15 min) and *n*_{f} is the number of farms of client node *j* served by vehicle *v*.

Each farm must be visited only once (pick-up loads are indivisible), with a constant service time, and the capacity of each vehicle cannot be exceeded during any pick-up operation. However, since the overall demand of a client node may exceed the vehicle capacity, a client node may be served by more than one vehicle of the fleet, unlike in classical VRP instances.
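As an illustration of how constraints (2) and (3) act on a single vehicle, the following Python sketch checks one route. It is only a sketch under stated assumptions: a route is represented as a list of stops, each holding the client node and the loading units collected from each farm served there; times are expressed in hours; and the return leg to the depot is counted in the travel time. The name `route_is_feasible` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

# (client node, loading units collected from each farm served at that node)
Stop = Tuple[int, List[int]]


def route_is_feasible(
    stops: List[Stop],
    travel_time: Dict[Tuple[int, int], float],
    depot: int,
    capacity: int,
    service_time: float = 0.25,  # t_s in hours (about 15 min per farm)
    max_time: float = 8.0,       # TT_max in hours
) -> bool:
    """Check Eq. (2) (capacity) and Eq. (3) (work hours) for one vehicle."""
    load = sum(sum(units) for _, units in stops)
    if load > capacity:  # Eq. (2)
        return False
    total_time = 0.0
    current = depot
    for node, units in stops:
        # Travel time t_ij plus n_f service times at the node reached.
        total_time += travel_time[(current, node)] + len(units) * service_time
        current = node
    # Assumption: the return leg to the depot counts, with no service time.
    total_time += travel_time[(current, depot)]
    return total_time <= max_time  # Eq. (3)
```

For example, a vehicle that serves two farms at node 1 and one farm at node 2 accumulates three service times of 15 min on top of the travel time along the three links of its tour.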

### 3.2 The Ant Colony Optimization algorithm

It is well known that CVRPs are NP-hard problems in the field of operations research. Moreover, the proposed model addresses large-scale instances in which the residual orders of client nodes change dynamically during the simulation, so that they are practically impossible to solve using exact methods. In recent decades, a large number of heuristic procedures have been developed in order to find good suboptimal solutions with acceptable computational effort. Among them, many metaheuristics take inspiration from natural optimization mechanisms and translate them into specific algorithms. In particular, ACO algorithms [10] derive from the social behaviour of some ant species that are capable of finding the shortest path between their nest and a food source. This ability arises because ants communicate indirectly through pheromone trails, made of a volatile chemical substance that they deposit on the ground. Although each artificial ant is a very simple agent, a colony can perform highly complex tasks and jointly solve optimization problems through the dynamic interaction of its members.

The algorithm implemented in the present model derives from the MAX-MIN Ant System [20], an improvement on the first member of the ACO family, the Ant System, which was originally applied to the Travelling Salesman Problem. Simulations are based on an iterative optimization process that ends after a given number of generations (*g*) of a chosen number of colonies (*m*), each made of a specified number of ants. The quality of the final solution improves, iteration after iteration, through the combined exploitation of three information components: simulated artificial ants build routes by considering a) the pheromone trail, b) the “visibility” and c) the residual capacity. The first component is updated for each link when a new generation *g* of colonies is launched. The last two components are included in the heuristic function, whose structure is shown in Eq. (4). The visibility is the reciprocal of the distance associated with a link and represents the fixed information available a priori. As concerns the residual capacity, whenever ants explore their neighbourhood (the client nodes linked with the current node), the feasible combination of orders from the farms belonging to the next client node has to be recalculated. So, if a client node consists of *n* farms, each with a given pick-up demand (number of loading units), ant *k* at iteration *t* looks for the combination of pick-up demands *p*_{j,h} that can be served without exceeding its residual loading capacity (the orders combination list).

$$ {\eta}_{ij}(t)=\frac{1}{d_{ij}}\cdot \sum \limits_{h=1}^n{p}_{j,h}(t) $$

(4)

where *d*_{ij} is the road distance between node *i* and node *j*, and *p*_{j,h}*(t)* equals *p*_{j,h} if the order belongs to the orders combination list and zero otherwise. However, there are situations in which, although the remaining capacity makes it possible to satisfy another pick-up demand, it may be preferable to return to the depot so as not to lengthen the distance covered any further. Consequently, a heuristic function is also defined for the links from client node *i* to the depot *D*, given by the reciprocal of the product of the distance *d*_{iD} and the residual loading capacity at iteration *t*.

Since the total number of orders, expressed in standard loading units, is usually much higher than the capacity of a single vehicle, multiple routes must be found, each served by one vehicle. Every solution is therefore built by an ant colony, which jointly minimizes the total distance travelled without violating the constraints on maximum capacity and maximum working time. Step by step, each ant of the colony applies a random proportional rule to decide which client node to visit next. The probability with which ant *k*, currently at client node *i*, chooses to go to client node *j* is given by Eq. (5):

$$ {p}_{ij}^k(t)=\frac{{\left[{\tau}_{ij}(g)\right]}^{\alpha}\cdot {\left[{\eta}_{ij}(t)\right]}^{\beta }}{\sum_{l\in {N}_i^k}{\left[{\tau}_{il}(g)\right]}^{\alpha}\cdot {\left[{\eta}_{il}(t)\right]}^{\beta }}\kern1.25em \mathrm{if}\ j\in {N}_i^k $$

(5)

where *α* and *β* are calibration parameters that control the relative importance of the pheromone trail *τ*_{ij} versus the heuristic information *η*_{ij}, and *N*_{i}^{k} is the feasible neighbourhood of ant *k* at node *i*.
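The heuristic function of Eq. (4) and the random proportional rule of Eq. (5) can be illustrated with a short Python sketch (Python is used for illustration only; the model itself runs in NetLogo). Two points are assumptions rather than statements from the text: the orders combination list is taken to be the subset of farm demands that best fills the residual capacity, found by brute force, and the feasible neighbourhood is taken to be the set of linked nodes with a strictly positive heuristic value. All function names are hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, Sequence, Tuple


def orders_combination(demands: Sequence[int], residual: int) -> Tuple[int, ...]:
    """Indices of the farm orders to load at a node.

    Assumed rule: the subset with the largest total load that still fits the
    residual capacity (brute force, adequate for a handful of farms).
    """
    best: Tuple[int, ...] = ()
    best_load = 0
    for r in range(1, len(demands) + 1):
        for combo in combinations(range(len(demands)), r):
            load = sum(demands[h] for h in combo)
            if best_load < load <= residual:
                best, best_load = combo, load
    return best


def eta_client(d_ij: float, demands: Sequence[int], residual: int) -> float:
    """Eq. (4): loadable units at node j divided by the road distance d_ij."""
    combo = orders_combination(demands, residual)
    return sum(demands[h] for h in combo) / d_ij


def eta_depot(d_iD: float, residual: int) -> float:
    """Depot heuristic: reciprocal of distance times residual capacity.

    Only meaningful while residual > 0; a full vehicle returns anyway.
    """
    return 1.0 / (d_iD * residual)


def transition_probabilities(
    tau: Dict[int, float], eta: Dict[int, float], alpha: float, beta: float
) -> Dict[int, float]:
    """Eq. (5) restricted to nodes with a positive heuristic value."""
    weights = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in eta if eta[j] > 0}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()} if total > 0 else {}


def choose_next(probs: Dict[int, float], rng: random.Random) -> int:
    """Draw the next node according to the probabilities of Eq. (5)."""
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]
```

To let the return to the depot compete with the client nodes, the value of `eta_depot` can be stored in the same dictionary under the key of the depot before calling `transition_probabilities`. With equal pheromone on two links and heuristic values of 4 and 1, setting *β* = 2 gives the first link a probability of 16/17, which shows how strongly *β* sharpens the preference for near, well-filled nodes.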

When an ant of the colony reaches its loading capacity (or its feasible neighbourhood is empty), it returns to the depot and the next ant of the same colony starts its tour. When all pick-up demands are satisfied, the following colony of ants is allowed to explore other possible solutions, until the given number *m* of colonies is reached.

Once all the *m* colonies have found their solution, only the “best” colony (i.e. the one that finds the solution minimizing the total distance travelled) is allowed to reinforce the pheromone trail, so as to better exploit the best of the *m* solutions found, using the following updating rule (Eq. (6)):

$$ {\tau}_{ij}\left(g+1\right)=\left(1-\rho \right)\cdot {\tau}_{ij}(g)+\varDelta {\tau}_{ij}^{best}(g) $$

(6)

where *ρ* is the evaporation rate, ranging from 0 to 1, and *Δτ*_{ij}^{best} is the amount of pheromone deposited on each link *(i, j)* used by the best colony at generation *g*, which is given by Eq. (7):

$$ \varDelta {\tau}_{ij}^{best}(g)=Q\cdot {\left(\frac{E^{best}(g)}{E^{global- best}}\right)}^2 $$

(7)

where *E* represents the value of the objective function (expressed as the reciprocal of the total distance travelled by the vehicles) and *Q* is the diffusion rate, which is greater than zero and should not be confused with the vehicle capacity *Q* of Eq. (2). Finally, when the maximum number of generations is reached, the simulation stops and outputs the results. The whole process described so far is shown as a flow chart in Fig. 2.
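The pheromone update of Eqs. (6) and (7) can be sketched in Python as follows. The sketch assumes that evaporation applies to every link while the deposit is added only to the links used by the best colony of the generation, and that the global best is the shortest total distance found so far, so that the ratio in Eq. (7) never exceeds 1. The trail limits typical of the MAX-MIN Ant System are not described in the text and are therefore left out. The name `update_pheromone` and its arguments are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Link = Tuple[int, int]


def update_pheromone(
    tau: Dict[Link, float],
    best_links: Iterable[Link],
    tdt_best: float,         # total distance of the best colony at generation g
    tdt_global_best: float,  # shortest total distance found so far (assumed)
    rho: float,              # evaporation rate, between 0 and 1
    q: float,                # diffusion rate, greater than zero
) -> Dict[Link, float]:
    """Return the pheromone trails for generation g + 1."""
    e_best = 1.0 / tdt_best            # E is the reciprocal of the TDT
    e_global = 1.0 / tdt_global_best
    deposit = q * (e_best / e_global) ** 2  # Eq. (7)
    used = set(best_links)
    # Eq. (6): evaporation on every link, deposit only on the best links.
    return {
        link: (1.0 - rho) * value + (deposit if link in used else 0.0)
        for link, value in tau.items()
    }
```

With a generation best of 100 km against a global best of 80 km, the ratio is 0.8 and the deposit is 0.64 times the diffusion rate, so a generation that falls short of the global best reinforces its links less than one that matches it.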