Genetic algorithm

A genetic algorithm (GA) is an algorithm used to find approximate solutions to difficult-to-solve problems, inspired by and named after biological processes of inheritance, mutation, natural selection, and the genetic crossover that occurs when parents mate to produce offspring. Genetic algorithms are a particular class of evolutionary algorithms.

Genetic algorithms are typically implemented as a computer simulation in which a population of abstract representations of candidate solutions to an optimization problem are stochastically selected, recombined, mutated, and then either eliminated or retained, based on their relative fitnesses.

John Holland was the pioneering founder of much of today's work in genetic algorithms, which has moved on from a purely theoretical subject (though based on computer modelling), to provide methods which can be used to solve some difficult problems today. Problems which appear to be particularly appropriate for solution by genetic algorithms include timetabling and scheduling problems, and many scheduling software packages are based on GAs.

The problem to be solved is represented by a list of parameters which can be used to drive an evaluation procedure. Possible solutions to the problem (referred to as chromosomes or genomes) are also represented in this parameter list form. These chromosomes are evaluated, and a value of goodness or fitness is returned.

Initially several such parameter lists or chromosomes are generated randomly, to form an initial pool of possible solutions. This is called the first generation pool.
All of the chromosomes are evaluated by a fitness function, and effectively the pool is sorted with those having better fitness at the top, representing better solutions to the problem. Notice that better in this context is relative, as initial solutions are likely to be rather poor.

The next step of the algorithm is to generate a second generation pool of parameter lists, which is done using the genetic operators selection, crossover (or recombination), and mutation.

The first step in the construction of the next generation is to select a pair of chromosomes for crossover. Selection is biased towards elements of the initial generation which have better fitness, though it is usually not so biased that poorer elements have no chance to participate. There are several well-defined selection methods; roulette wheel selection and tournament selection are popular methods.
Following selection, the crossover (or recombination) operation is performed upon the selected chromosomes. In general, two parent chromosomes are chosen to reproduce and their crossover results in two new child chromosomes, which are added to the second generation pool. This is repeated until there are an appropriate number of candidate solutions in the second generation pool.
The next step is to mutate the newly developed pool, again by a process of selection, this time of individual chromosomes, followed by application of the mutation genetic operator.

These processes result in a second generation pool of chromosomes that is different from the initial generation, which is then evaluated and the fitness values for each list is obtained. Generally the average degree of fitness will have increased by this procedure for the second generation pool.

A slight variant of this method of pool generation is to allow some of the better chromosomes from the first generation to carry over to the second. This form of genetic algorithm is known as an elite selection strategy.

The process continues by generating third, fourth, fifth,... (and so on) generations, until one of the generations contains solutions which are good enough.

There are several general observations to make about the generation of solutions.

GAs may have a tendency to converge towards local solutions rather than global solutions to the problem to be solved
as time goes on each generation will tend to have multiple copies of successful parameter lists, which require evaluation, and this can slow down the processing
the most important genetic operators are selection and crossover. Mutation is only necessary to ensure that potential solutions are not lost.
GAs are not good at finding optimal solutions but can rapidly locate good solutions, even for difficult search spaces.

It is also important to note that there are several different variants of the basic GA algorithm. The simplest algorithm represents each chromosome as a bit string. Typically numeric parameters can be represented by integers, though it is possible to use floating point representations. The basic algorithm performs crossover and mutation at the bit level. Other variants treat the parameter list as lists of numbers, and crossover and mutation are performed so as to respect number boundaries. For most data types specific variation operators can be designed.

Genetic algorithms are known to produce good results for some problems. Their major disadvantage is that they are relatively slow, compared to other methods, such as random optimisation. Recent speed improvements have focused on speciation, wherein cross-over can only occur if individuals are closely-enough related. Genetic algorithms have been successfully applied to the study of neurological evolution (see NeuroEvolution by Augmented Topologies).

Genetic programming is a related technique developed by John Koza, in which computer programs, rather than function parameters, are optimised. Genetic programming often uses tree-based internal data structures to represent the computer programs for adaptation instead of the list, or array, structures typical of genetic algorithms. Genetic programming algorithms typically require running time that is orders of magnitude greater than that for genetic algorithms, but they may be suitable for problems that are intractable with genetic algorithms.

References

Goldberg, David E (1989), Genetic Algorithms in Search, Optimization and Machine Learning
Mitchell, Melanie, (1996), An Introduction to Genetic Algorithms, MIT Press, Cambridge, MA Addison-Wesley

External links

Golem Project - Automatic Design and Manufacture of Robotic Lifeforms
Introduction to Genetic Algorithms Using RPL2