Traveling
Salesman Problem
Outline
1. Traveling Salesman Problem
2. Genetic Algorithm (GA)
3. Simulated Annealing (SA)
4. Experiment Results
5. Conclusion and Future Work
6. Reference
Authors
The Traveling Salesman Problem (TSP) was first introduced by the RAND Corporation in 1948. RAND's reputation helped make the TSP a well-known and popular problem. The TSP also became popular at that time due to the new subject of linear programming and attempts to solve combinatorial problems. The TSP is conceptually very simple: the traveling salesman must visit every city in his territory exactly once and then return to the starting point. Given the cost of travel between all cities, how should he plan his itinerary to minimize the total cost of the entire tour?
The TSP has many variations; only the simplest, the symmetric TSP, is considered in this project. Although the problem definition is quite simple, the search space of the TSP is quite large: a tour over n cities is one of n! permutations, and the TSP is NP-complete. Take a case of 30 cities: if the computer can check one million permutations per second, it would require about 4 x 10^18 years to check all possible permutations. Therefore, we have no choice but to resort to heuristics. The formal definition of this problem can be found in A Compendium of NP Optimization Problems, and the proof of NP-completeness can be found in the book "Computers and Intractability" by M. Garey and D. Johnson (W. H. Freeman, San Francisco, 1979).
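To make the factorial blow-up concrete, here is a minimal brute-force sketch (in Python, which is not the project's implementation language; the unit-square instance is a made-up illustration): it enumerates all (n-1)! tours that start at city 0, which is feasible only for tiny n.

```python
from itertools import permutations
from math import dist, factorial

def brute_force_tsp(cities):
    """Exhaustively check every tour that starts at city 0.

    Feasible only for tiny n: the number of tours grows as (n-1)!.
    """
    n = len(cities)
    best_tour, best_len = None, float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(dist(cities[tour[i]], cities[tour[(i + 1) % n]])
                     for i in range(n))
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len

# Four corners of a unit square: the optimal tour is the perimeter, length 4.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour, length = brute_force_tsp(square)

# For 30 cities the search space is 30! ~ 2.65e32 permutations -- hopeless.
thirty_city_space = factorial(30)
```

Already at a few dozen cities this approach is out of reach, which is why the rest of the project turns to GA and SA.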
Attempts to solve the TSP can be classified into a few broad categories, ranging from exact algorithms to approximation and heuristic methods.
2-1. What is GA?
Genetic Algorithms are a family of computational models inspired by evolution. These algorithms encode a potential solution to a specific problem on a simple chromosome-like data structure and apply recombination (crossover) operators to these structures so as to preserve critical information. Genetic algorithms are often viewed as function optimizers, although the range of problems to which genetic algorithms have been applied is quite broad.
The basic components of a GA are illustrated in the left figure: gene, chromosome, and population. Usually the chromosome is represented as a binary string. The real trick of a GA lies in the encoding of the problem domain and the selection of the next generation.
2-2. How does GA work?
1. Initialize a population of chromosomes.
2. Evaluate each chromosome in the population.
3. Create new chromosomes by mating current chromosomes. Apply mutation and recombination as the parent chromosomes mate.
4. Delete members of the population to make room for the new chromosomes (the new generation).
5. Evaluate the new chromosomes and insert them into the population.
6. If time is up, stop and return the best chromosome; if not, go to step 3.
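The six steps above can be sketched as a generic GA loop. This is a minimal Python illustration, not the project's C++ code: the binary encoding, one-point crossover, truncation selection, and the OneMax toy fitness (count of 1-bits) are all illustrative assumptions.

```python
import random

def genetic_algorithm(fitness, length=20, pop_size=30, generations=100,
                      mutation_rate=0.02, seed=0):
    """Generic GA loop: initialize, evaluate, mate (crossover + mutation),
    replace the worse half, and repeat until time is up."""
    rng = random.Random(seed)
    # 1. Initialize a population of binary chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Evaluate each chromosome; keep the better half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        # 3. Mate parents: one-point crossover plus bit-flip mutation.
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < mutation_rate else b
                     for b in child]
            children.append(child)
        # 4./5. Delete the worse half and insert the new chromosomes.
        pop = parents + children
    # 6. Time is up: return the best chromosome found.
    return max(pop, key=fitness)

best = genetic_algorithm(fitness=sum)   # OneMax: maximize the number of 1s
```

For the TSP the chromosome is a city permutation rather than a bit string, so the crossover step must use permutation-preserving operators such as those described in section 2-4.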
2-3. Why does GA work?
The question that most people ask about GAs is why such a select-recombine-mutate process should do anything useful. The most widely given answer comes from the book "Adaptation in Natural and Artificial Systems" by John Holland, the father of the Genetic Algorithm. Consider a function over a single variable, whose search space is one-dimensional, with function maximization as the goal.
The following figure demonstrates how the search space is partitioned (sampled). The chromosome form (hyperplane) 0***...* spans the first half of the search space and 1***...* spans the second half. Since the strings in the 0***...* partition are on average better than those in the 1***...* partition, we would like the search to be proportionally biased toward this partition. In the second figure the portion of the space corresponding to **1***...* is shaded, which also highlights the intersection of 0***...* and **1***...*, namely 0*1***...*. Finally, in the third figure, 0*10***...* is highlighted. This progressive refinement of promising partitions makes the GA less likely to be trapped by local maxima. This is an informal demonstration of the fundamental theorem behind GAs, the Schema Theorem.
2-4. Encoding Scheme of the Traveling Salesman Problem
Representations:
1. Adjacency Representation
2. Ordinal Representation
3. Matrix Representation
4. Path Representation (used in this project)
Crossover Operators:
1. Partially-Mapped Crossover (PMX) - proposed by Goldberg and Lingle - builds an offspring by choosing a subsequence of a tour from one parent and preserving the order and position of as many cities as possible from the other parent.
Example:
    p1 = (1 2 3 | 4 5 6 7 | 8 9)
    p2 = (4 5 2 | 1 8 7 6 | 9 3)
=> swap the selected segments:
    o1 = (x x x | 1 8 7 6 | x x)
    o2 = (x x x | 4 5 6 7 | x x)
=> this defines the mapping (1-4, 8-5, 7-6, 6-7).
Copy the remaining genes of p1 into o1, resolving conflicts with the mapping just defined:
    o1 = (1->4 2 3 | 1 8 7 6 | 8->5 9) = (4 2 3 | 1 8 7 6 | 5 9)
Construct o2 in the same way:
    o2 = (4->1 5->8 2 | 4 5 6 7 | 9 3) = (1 8 2 | 4 5 6 7 | 9 3)
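The PMX construction can be sketched as follows (an illustrative Python sketch, not the project's C++ implementation; cut points a and b are a 0-based half-open slice):

```python
def pmx(p1, p2, a, b):
    """PMX: the child takes p2's segment [a:b] in place and fills the other
    positions from p1, resolving duplicates through the segment mapping."""
    mapping = dict(zip(p2[a:b], p1[a:b]))   # p2 gene -> p1 gene, same position
    segment = set(p2[a:b])
    child = list(p1)
    child[a:b] = p2[a:b]
    for i in list(range(a)) + list(range(b, len(p1))):
        gene = p1[i]
        while gene in segment:              # follow the mapping out of the segment
            gene = mapping[gene]
        child[i] = gene
    return child

# The example above: cut points after positions 3 and 7.
p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
o1 = pmx(p1, p2, 3, 7)   # -> [4, 2, 3, 1, 8, 7, 6, 5, 9]
o2 = pmx(p2, p1, 3, 7)   # -> [1, 8, 2, 4, 5, 6, 7, 9, 3]
```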
2. Order Crossover (OX) - proposed by Davis - builds offspring by choosing a subsequence of a tour from one parent and preserving the relative order of cities from the other parent.
Example:
    p1 = (1 2 3 | 4 5 6 7 | 8 9)
    p2 = (4 5 2 | 1 8 7 6 | 9 3)
=> copy the selected segments:
    o1 = (x x x | 4 5 6 7 | x x)
    o2 = (x x x | 1 8 7 6 | x x)
=> the sequence of p2, read from after the second cut point: 9-3-4-5-2-1-8-7-6
Remove 4, 5, 6, 7 from the sequence, leaving 9-3-2-1-8, and fill o1 starting after the second cut point:
    o1 = (2 1 8 | 4 5 6 7 | 9 3)
Construct o2 in the same way:
    o2 = (3 4 5 | 1 8 7 6 | 9 2)
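OX can likewise be sketched in a few lines (an illustrative Python sketch, not the project's C++ code; the same 0-based half-open cut points as above):

```python
def ox(p1, p2, a, b):
    """OX: keep p1's segment [a:b]; fill the remaining positions, starting
    after the second cut point, with p2's cities in p2's order (also read
    starting after the second cut point), skipping cities already present."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]
    segment = set(p1[a:b])
    seq = [p2[(b + i) % n] for i in range(n)
           if p2[(b + i) % n] not in segment]
    for i, gene in enumerate(seq):
        child[(b + i) % n] = gene
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
o1 = ox(p1, p2, 3, 7)   # -> [2, 1, 8, 4, 5, 6, 7, 9, 3]
o2 = ox(p2, p1, 3, 7)   # -> [3, 4, 5, 1, 8, 7, 6, 9, 2]
```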
3. Cycle Operator (CX) - proposed by Oliver - builds offspring in such a way that each city (and its position) comes from one of the parents.
Example:
    p1 = (1 2 3 4 5 6 7 8 9)
    p2 = (4 1 2 8 7 6 9 3 5)
=> follow the (p1-p2) sequence:
    1(p1)-4(p2), 4(p1)-8(p2), 8(p1)-3(p2), 3(p1)-2(p2), 2(p1)-1(p2) -> cycle closed
    o1 = (1 2 3 4 x x x 8 x)
=> fill the x positions from p2:
    o1 = (1 2 3 4 7 6 9 8 5)
Construct o2 in the same way:
    o2 = (4 1 2 8 5 6 7 3 9)
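The cycle-following step can be sketched as (an illustrative Python sketch, not the project's C++ code; only the cycle through position 0 is handled, as in the example):

```python
def cx(p1, p2):
    """CX: positions on the cycle through position 0 inherit from p1;
    every other position inherits from p2."""
    pos = {city: i for i, city in enumerate(p1)}   # city -> position in p1
    child = list(p2)
    i = 0
    while True:
        child[i] = p1[i]
        i = pos[p2[i]]     # the city p2 places here continues the cycle
        if i == 0:
            break
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
o1 = cx(p1, p2)   # -> [1, 2, 3, 4, 7, 6, 9, 8, 5]
o2 = cx(p2, p1)   # -> [4, 1, 2, 8, 5, 6, 7, 3, 9]
```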
4. Edge Recombination Operator (ER) - proposed by Whitley, Starkweather, and Fuquay - builds an offspring exclusively from the edges present in the two parent tours. This is done with the help of an edge list created from both parents.
Example:
    p1 = (1 2 3 4 5 6 7 8 9)
    p2 = (4 1 2 8 7 6 9 3 5)
=> build the edge list (a minus sign marks an edge that appears in both parents):
    city 1: 9 -2 4
    city 2: -1 3 8
    city 3: 2 4 9 5
    city 4: 3 -5 1
    city 5: -4 6 3
    city 6: 5 -7 9
    city 7: -6 -8
    city 8: -7 9 2
    city 9: 8 1 6 3
The construction of the offspring starts with the selection of an initial city from one of the parents. The next city is the neighbour of the current city whose own edge list is shortest; priority should be given to the edges shared by both parents (marked negative). In a series of experiments, edge failure (the current city having no remaining edges) occurred at a very low rate (1%-1.5%).
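The edge-list construction and the greedy tour building can be sketched as follows. This is an illustrative Python sketch, not the project's C++ code, and it simplifies the operator: the priority for shared (negative) edges is omitted, and an edge failure falls back to a random unvisited city.

```python
import random

def edge_table(p1, p2):
    """Union of each city's neighbours in the two parent tours."""
    table = {c: set() for c in p1}
    n = len(p1)
    for tour in (p1, p2):
        for i, c in enumerate(tour):
            table[c].update((tour[i - 1], tour[(i + 1) % n]))
    return table

def er(p1, p2, seed=0):
    """ER sketch: always move to the neighbour whose own edge list is
    shortest; on edge failure, jump to a random unvisited city."""
    rng = random.Random(seed)
    table = edge_table(p1, p2)
    current = p1[0]
    child = [current]
    while len(child) < len(p1):
        for edges in table.values():
            edges.discard(current)          # visited cities leave all lists
        candidates = table[current]
        if candidates:
            current = min(candidates, key=lambda c: len(table[c]))
        else:                               # edge failure (rare: ~1%-1.5%)
            current = rng.choice([c for c in p1 if c not in child])
        child.append(current)
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
offspring = er(p1, p2)   # a valid tour built from parental edges
```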
Mutation Operator:
Inversion Operator (INV) - The mutation is done by inverting the order of the selected segment.
Example: p1 = (1 2 3 | 4 5 6 7 | 8 9) => (1 2 3 | 7 6 5 4 | 8 9)
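The inversion operator is a one-liner in practice (an illustrative Python sketch, not the project's C++ code):

```python
def inversion(tour, a, b):
    """Inversion mutation: reverse the selected segment tour[a:b]."""
    return tour[:a] + tour[a:b][::-1] + tour[b:]

# Reproducing the example: invert the segment between the cut points.
mutated = inversion([1, 2, 3, 4, 5, 6, 7, 8, 9], 3, 7)
# -> [1, 2, 3, 7, 6, 5, 4, 8, 9]
```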
3-1. Introduction
Solutions to numerical problems often involve finding (or fitting) a set of parameters that minimizes a function. In the design of computer chips, for example, it is often the designer's goal to find a chip layout that is optimal with respect to one or more cost measures (e.g., chip area and maximum wire length) and that meets certain layout requirements. These practical problems are of high complexity and are often difficult to solve analytically. Thus, direct numerical methods are often employed to find an optimal set of parameters. The optimal parameter set is the one with the minimal (or maximal) function value; the optimization task is to find the parameters yielding the lowest function value in the solution space.
Simulated Annealing is a stochastic algorithm used for finding global minimum-cost configurations of NP-complete problems whose cost functions have many local minima. The method is adapted from the physical annealing procedure, where a liquid is cooled down slowly in order to reach a minimum-energy configuration. It allows the system to probabilistically sample different locations (points) of the function landscape, both in and out of different local minima. As the temperature is reduced, the likelihood that the system samples lower local minima rather than higher ones increases. Finally, when the temperature is adequately low, the system settles into the lowest minimum, namely the global minimum. This is mathematically guaranteed given enough time and a proper temperature annealing schedule. The major problem with simulated annealing methods is that they are slow to converge to the lowest minima.
3-2. Simulated Annealing optimization algorithms
Annealing-based optimization algorithms are a relatively new optimization method that employs noise to choose new parameter values. General simulated annealing methods choose new points at various distances from the current point x. Each new point x' is generated probabilistically according to a given generating distribution g(). The algorithm calculates the function value E = f(x'), and then probabilistically decides to accept or reject it. If accepted, the new point becomes the current point. The new point may be accepted even if it is worse, i.e., has a larger function value than the current point. The criterion for acceptance is determined by an acceptance function h(), the temperature parameter T, and the difference in the function values of the two points. Initially, the temperature T is large, and a new point is accepted roughly half the time. As the algorithm progresses, T is reduced, lowering the probability that the acceptance function will accept a new point whose function value is greater than that of the current point.
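The general annealing procedure can be sketched as the following loop (a minimal Python illustration, not the project's C++ code; the toy one-dimensional objective and the uniform neighbour step are assumptions made for the example):

```python
import math
import random

def simulated_annealing(f, x0, neighbour, t0=1.0, t_final=1e-3, c=0.99,
                        seed=0):
    """Generic SA: propose x' = neighbour(x); always accept downhill moves,
    accept uphill moves with probability exp(-dE / T); cool T geometrically."""
    rng = random.Random(seed)
    x, t = x0, t0
    best, best_e = x0, f(x0)
    while t > t_final:
        x_new = neighbour(x, rng)
        de = f(x_new) - f(x)
        if de <= 0 or rng.random() < math.exp(-de / t):
            x = x_new                       # move, possibly uphill
            if f(x) < best_e:
                best, best_e = x, f(x)      # remember the best point seen
        t *= c                              # cooling schedule
    return best

# Toy landscape with several local minima; global minimum near x = -0.3.
f = lambda x: x * x + 2 * math.sin(5 * x)
best = simulated_annealing(
    f, x0=3.0, neighbour=lambda x, rng: x + rng.uniform(-0.5, 0.5))
```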
Because the algorithm occasionally chooses points uphill from its current point, it can escape from local minima and more effectively search the function space to find the global minimum. Thus, simulated annealing algorithms are often well suited to solving constrained nonlinear optimization problems.
3-3. Solving the Traveling Salesman Problem by Simulated Annealing
In our implementation, the system is initialized with a random configuration. A new configuration is constructed by imposing a random displacement; for the traveling salesman problem this means switching the order in which two cities are visited. If the cost of this new state (the length of the tour) is lower than that of the previous one, the change is accepted unconditionally and the system is updated. If the cost is greater, the new configuration is accepted with probability e^(-dE/T), where dE is the difference between the cost of the new state and that of the previous one, and T is the current temperature according to the cooling schedule. The cooling schedule is given by T(k+1) = c * T(k), c < 1, where c is the ratio of the next temperature to the current one; in our implementation c = 0.999 and the initial temperature is 200. As the temperature is decreased, this procedure allows the system to move consistently towards lower-cost states, yet still 'jump' out of local minima through the occasional acceptance of an upward move. The annealing loop terminates when the temperature reaches the final temperature, which is set to 0.000001.
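The procedure just described can be sketched as follows, using the stated parameters (T0 = 200, c = 0.999, final temperature 1e-6, two-city swap). This is an illustrative Python sketch, not the project's C++ implementation, and the eight-city circular instance is a made-up example, not RAT99:

```python
import math
import random

def tour_length(tour, d):
    """Total length of a closed tour under distance matrix d."""
    n = len(tour)
    return sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))

def sa_tsp(d, t0=200.0, t_final=1e-6, c=0.999, seed=0):
    """SA for the TSP: a move swaps the order of two cities; the cooling
    schedule is T(k+1) = c * T(k) from t0 down to t_final."""
    rng = random.Random(seed)
    n = len(d)
    tour = list(range(n))
    rng.shuffle(tour)                        # random initial configuration
    cost = tour_length(tour, d)
    best, best_cost = tour[:], cost
    t = t0
    while t > t_final:
        i, j = rng.sample(range(n), 2)
        tour[i], tour[j] = tour[j], tour[i]  # random displacement
        new_cost = tour_length(tour, d)
        de = new_cost - cost
        if de <= 0 or rng.random() < math.exp(-de / t):
            cost = new_cost                  # accept (possibly uphill)
            if cost < best_cost:
                best, best_cost = tour[:], cost
        else:
            tour[i], tour[j] = tour[j], tour[i]  # undo the rejected swap
        t *= c                               # cooling schedule
    return best, best_cost

# Eight cities on a circle; the optimal tour follows the circle.
pts = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
       for k in range(8)]
d = [[math.dist(p, q) for q in pts] for p in pts]
best, best_cost = sa_tsp(d)
```

With c = 0.999 the loop runs roughly 19,000 iterations from T = 200 down to 1e-6, matching the schedule described above.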
4-1. Dataset
A case of 99 cities is used in this project: RAT99, whose best known solution is 1211, taken from TSPLIB, a library of sample instances for the TSP (and related problems) from various sources and of various types.
4-2. Parameters of Genetic Algorithm
Population size: 1,000 and 2,000 chromosomes are used in this experiment.
Probability of crossover: 13%
Probability of mutation: 5%
Termination condition: stop when the best solution does not improve for 500 generations.
This figure shows how the different GA operators converge to the best solution, with 2,000 chromosomes. ER and MIX converge much faster than the other operators, while CX is the slowest. But the best solution found by CX is quite good; it keeps improving slowly for close to 5,000 generations. PMX converges fast in the beginning but slows down later. ER is fast in terms of generations, but its coding complexity makes it slow at producing new generations. The figure suggests that mixing ER with other operators can yield the same result. Note that it is not very meaningful to compare the execution times of different GAs, since a GA is highly parallelizable.
The best solutions found by the respective methods are illustrated in the following figure; it is clear that all of them are only near-optimal. The best known solution of RAT99 is 1211.
Another set of experimental results, using 1,000 chromosomes, is shown in the next two figures. Clearly, fewer chromosomes yield generations of worse quality and slower convergence.
The best solution found with 1,000 chromosomes is shown in the next figure. It is interesting to note that the PMX operator performs worst in both cases, while several papers (e.g., Homaifar and Guan) report OX > PMX > CX.
The experimental results of SA are shown in the next figure. It is interesting to observe that the solution found by SA improves rapidly in the initial phase and levels off as the temperature goes down. The execution of SA is much faster than GA, because the computational cost of each iteration is only slightly higher than that of generating a single chromosome. Although SA converges faster than GA, it does not provide a better solution; moreover, its convergence slows down sharply after a certain period of time.
The best solution found by SA is illustrated in the next figure. The ER and MIX operators of GA outperform SA even when only 1,000 chromosomes are generated per generation.
In this project, we have explored two derivative-free optimization methods, GA and SA, on the traditional Traveling Salesman Problem, and obtained fair results. At the start of execution, SA converges much faster than GA, but its convergence slows down as the temperature gets lower. The best solutions found by GA are clearly better than those found by SA. This could be due to the difficulty of tuning the system-dependent parameters of SA, which in our view is SA's main drawback.
GA and other optimization arts have proved workable for small instances of the TSP (around 100 cities), but much larger instances, say 3,000 cities, usually demand heuristic approaches. There have been many attempts to combine different methods to solve the TSP, for example combining GA with local search heuristics, or combining GA and SA. Such a union of strengths is believed to be stronger than any single method.
Combinatorial optimization is of great importance in operations research and in the real world, for example in routing problems in circuit design. Little research has been done on other variants of the TSP, which suggests a potential direction for future work. More sophisticated encoding schemes and crossover and mutation operators are required for these variants, which would surely help extend the applications of GA. Comparing GA, SA, heuristics, and other global optimization arts would also help us understand the capabilities of these methods.
Books
Comment: The entire book is dedicated to the TSP, with lots of history, analysis, theory, heuristics, and related topics like vehicle routing problems. Warning: mathematics-major students only!
Comment: A good introductory book covering a rich range of topics. Note that Goldberg is one of the masters of the GA field.
Comment: Two parts are included in this book. Part I: A Genetic Algorithms Tutorial (about 100 pages), and Part II: Application Case Studies (about 250 pages). Quite easy for beginners and non-CS majors. Software included.
Comment: One of the most popular books in GA community. Suitable for CS majors.
Comment: A brief book (about 200 pages) for introductory courses.
Links
Sources
We have written some pretty dirty C++ code, which disobeys every programming principle on Earth. Get it if you know what you are doing!