Capacitated Vehicle Routing Problem, Simulated Annealing and how our Holiday Puzzle turned into a perfect learning opportunity

April 12, 2019

Read time 5 min

We love programming riddles! At the end of 2018, we launched a double-NP-hard problem that combines the well-known traveling salesman and bin packing problems into what’s often called a Capacitated Vehicle Routing Problem (CVRP). In this article I’ll explore how playful puzzles can result in some serious learning as we explore what ended being the algorithm of choice for top players: simulated annealing (SA).

The problem this time was relatively simple: Santa has a list of presents to deliver around the globe, but can only fit so many presents in their sleigh at once. The task is to find the shortest route around the globe that would deliver all of the presents. What makes the problem difficult is the large number of potential routes to consider, as there are two separate problems to be optimized. One must balance between choosing the most optimal routes, or optionally consider what and how many items to pack in order to reduce the total number of routes. As is typical for combinatorial optimization problems, the amount of different combinations makes brute-force methods impractical.

We received a couple of hundred submissions to the challenge, and the participants really iterated on improving their solutions. It was exciting to see a race to improve solutions and many competitors were really neck-and-neck. At the end of the competition I congratulated all the participants and inquired the top performers how they approached the problem. Most, if not all of the top performers used a simulated annealing algorithm. Some had used rather extensive cloud computing resources, but the top three had only ran the algorithm on their own laptops. So I decided to dig deeper into the solution in the spirit of learning what makes it tick.

I had come across the term “simulated annealing” before, but never had the time nor opportunity to study it in detail. The use case of the algorithm is to approximate a global minimum of a cost function, and what makes it different from greedy algorithms is that instead of trying to always find a strictly better solution, simulated annealing takes occasional “uphill moves”. It is these occasional uphill moves that allow the algorithm to discover globally more effective alternative solutions while still focusing most of its efforts on improving its current solution. Routing problems specifically are a tricky domain, as the solution space is huge and riddled with local optima, which may be an order of magnitude worse that the global optimum.

The simulated annealing algorithm starts from a given (often random) state, and on each iteration, generates a new neighbor state. If the new state is a less optimal solution than the previous one, the algorithm uses a probability function to decide whether or not to adopt that state. These occasional “uphill moves” are what allows the algorithm to avoid getting stuck in a local minimum. The probability of selecting a less optimal solution is controlled by a temperature variable, which starts large but gradually decreases over time – not unlike how metals cool, lending to the term “annealing”.  There are no hard rules for what the temperature parameter actually controls and how much of an effect it will have, as optimal settings will depend on each specific problem. Generally, however, the suggested solutions should be less radical as temperature is lowered throughout the optimization process. Setting the temperature to zero would cause the algorithm to behave exactly like a greedy search – only options that improve the solution would ever get accepted and the algorithm would eventually arrive at a local minimum.

Because implementing an algorithm yourself is the best way to learn it, I decided to code one in R.

This is definitely not the most optimal implementation, and both the neighbor generation function and the probability function are very simple – e.g. the winning competition entry used C with highly tuned neighbor and probability functions. Nevertheless, you can see the algorithm converging towards a good solution while taking some occasional side steps (note, the generated video does not use the full competition dataset):

The key functions of the algorithm are implemented in the source code as refine_route, which generates a new neighbor state, and select_better_route, which controls whether or not the generated state is accepted. Neighbor generation is quite simplistic: the function simply swaps the order of 1..N cities orders in the route, where N is gradually decreased according to the temperature parameter. Some potential optimization steps in this function are adjusting which cities get swapped (e.g. neighbors or randomly from the entire route) and the temperature gradient.

refine_route <- function(route, temp) {
    new_route <- as_tibble(route)
    row_order <- 1:nrow(new_route)
    num_to_swap <- (runif(1, 0, temp) * 5) + 1
    for (idx in 1:num_to_swap) {
      entries <- as.integer(runif(c(1, 1), 0, nrow(new_route)))
      row_order[entries] = row_order[rev(entries)]
    new_route[row_order, ]

Likewise, the selection function is very simplistic. It will always accept the new route if it has a lower overall cost. Based on a probability that decreases with temperature, it may adopt the new route even if it has a higher cost. There are optimization possibilities in this function as well: one might e.g. consider using the cost difference as a factor in the selection process.

select_better_route <- function(current_route,
                                temp = NULL) {
  if (is.null(temp)) {
    stop("Temperature cannot be null")

  new_cost <- calculate_route_total_cost(suggested_route)
  should_choose_worse <- (runif(1) < prob_choose_worse * temp)
  if (is.null(current_cost) || new_cost < current_cost || should_choose_worse) {
      cost = new_cost,
      route = suggested_route
  } else {
      cost = current_cost,
      route = current_route

In summary, simulated annealing is a solid approach for optimizing a function where several local minima exist. Other algorithms will converge towards a minimum more quickly, but the robustness of SA lends itself to problems where finding a global optimum is required. The algorithm is rather simple and provides multiple optimization possibilities so that it can be tuned to fit specific problems better.

Sign up for our newsletter

Get the latest from us in tech, business, design – and why not life.