This Page gives further detail on the breeding procedure
incorporated in the genetic algorithm-based optimization subroutine
PIKAIA. The discussed is framed in the context of a generic
maximization problem that consists in searching for the (x,y) pair
that maximizes the output of a (given) function f(x,y),
in a domain bounded between 0 and 1 in the x
and y directions.
This defines a two-parameter optimization problem.
1. Encoding and Decoding
Suppose that two parents (i.e., two (x,y) pairs) have been
selected for breeding. The first step is to construct an encoding
of each parent, in the form of a string-like structure or
chromosome. PIKAIA proceeds by directly fragmenting each
parameter into simple decimal integers, and splicing the resulting
integers strings one behind the other.
The number of digits (nd) to be retained in the encoding
must be specified externally. For example, with nd=8 one would
obtain
(x,y)=(0.123456789,0.987654321) -> 1234567898765432
The length of the chromosome is then nd times the number of
parameters (n) defining the solution. Here n=2 so the chromosome
has length 16. Each digit can be thought of as a gene
occupying a chromosomal site for which there
exists 10 possible allelles (possible gene values).
Decoding is simply the reverse process:
1234567898765432 -> (0.12345678,0.98765432)=(x,y)
Not that in terms of the original parameter
values, a truncation loss has occurred in running through the
encoding/decoding processes.
2. Crossover
The first step of the breeding process proper is the application of
the crossover operation to the pair of parent chromosomes.
Consider now two parent chromosomes produced by the encoding process
described in the preceding section:
1234567898765432 Parent #1
7654321023456789 Parent #2
Randomly select one of the 16 sites, cut both chromosomes at that site,
interchance the fragment right of and including the cutting site,
and splice the fragments back: for a cut at site #10, this would
look like:
CUT: 123456789 8765432
765432102 3456789
SWAP: 123456789 3456789
765432102 8765432
SPLICE: 1234567893456789 Offspring #1
7654321028765432 Offspring #2
There results of this two new offspring chromosomes, each
containing intact chunks of chromosomal material from each parent.
In the GA literature this is called one-point uniform crossover
; ``one point'' because there is only one cutting site
per chromosome pair, ``uniform'' because each site is equally likely
to be selected for the cut/swap/splice operation. In general one
introduces a probability Pc that the crossover operation
is to occur; unlike the mutation probability (see below), Pc
should not be very much smaller than one.
3. Mutation
The second step of the breeding process is the application of
the mutation operator to each offspring chromosome. For each gene
of each offspring chromosome, a random number between 0 and 1
is generated, and if this number is smaller than a preset
mutation probability Pm then the gene value is randomly changed
to any other legal value. For example, the following is a mutation
at site 2:
7654321028765432
7 54321028765432
7154321028765432
In the GA literature this is called one-point uniform mutation
; ``one point'' because there is only one gene affected
at a time, ``uniform'' because each gene is equally likely
to be subjected to mutation, independently of the site it occupies
along the chromosome.
The effects of crossover and mutation on the decoded version
of a parameter set can be drastic (if one of the leading digits
is affected) or quite imperceptible (if one of the least significant
digit is affected). The result is that the breeding process can
cause large jumps in parameter space, as well as slight displacements;
the resulting search algorithm can explore efficiently, as well as perform
fine tuning. In the case of the two parameter example used here,
we would have gone from two parents:
(x,y)=(0.12345678,0.98765432)
(x,y)=(0.76543210,0.23456789)
to two offsprings:
(x,y)=(0.12345678,0.93456789)
(x,y)=(0.71543210,0.28765432)
Clearly the offsprings have taken sizable jumps through
parameter space, as compared to their parents.
4. Why this crossover business?
The crossover operation is what distinguishes genetic algorithms
from other adaptive stochastic techniques, and to a large degree
is responsible for the efficient exploration capabilities
of genetic algorithms. Consider again one of the 16-gene chromosome
defined previously; replace now each gene by one of three possible
symbols (``-'', ``+'' or ``X'') according to the following coding:
if the gene value is ``good'', in the sense of giving its bearer (on average)
higher-than-average fitness, replace the gene by the symbol +;
if its bearer has below-than-average fitness, by -; if the gene
value does not seem to matter, by X (this would usually be the case for
gene encoding least significant digits early in an evolutionary
run). Here's two possible result of this process for a 16-gene
chromosome:
++--XXXX---+XXXX Parent #1
---++XXX+++XXXXX Parent #2
Crossover occurring at site #9 (without any subsequent mutation)
would produce
++--XXXX+++XXXXX Offspring #1
---++XXX---+XXXX Offspring #2
Clearly, Offspring #1 is doing a lot better than any of its parents
(having 1's in its sites decoding to the first two significant digits),
while Offspring #2 will be mostly below average fitness, and so will
have a low probability of being selected for breeding at the next
generational iteration. Observe that crossover can combine advantageous
``chunks'' of ``good'' chromosomal material in a single offspring.
With natural selection operating,
these chunks will end up (on average) being copied into the
next generation rather frequently, the more so the more the lucky
offspring finds itself above average. This is the essence of the
schema theorem, which forms the basis
of genetic algorithm theory.
Technical details:
Consider the set of ``good'' (i.e., coded ``+'') genes of
Offspring #1 of the preceding discussion:
++......+++.....
where the period symbol is used to fill sites that are not
part of the set of genes.
This defines a schema, which we will denote
by s. The order of the schema (denoted
as N(s)) is the number of participating sites (N(s)=5 here),
and its defining length (denoted D(s)) is the linear
distance (measured in sites) between the first and last of
its defining components (D(s)=11 here;
the full chromosome length L is 16). Denote now by
m(s,t) the fraction of population members containing
the schema s in their chromosomes at generation t, let
f(s) be their average fitness, and and F be
the average fitness of the population as a whole.
With natural selection operating, it can be shown
that m will vary from one generation to the next as
This (discrete) equation expresses the fact that the frequency
of occurrence of favorable
schemata (meaning schemata for which f(s)/F is larger
than one) will increase from one generation
to the next. If the average fitness were to remain constant
(which it won't), that growth would be geometrical. In practice
the growth will not be geometrical, but it will be fast; crossover
leads to the grouping of advantageous genetic chunks, originally
distributed throughout the gene pool. The quantity within the
square parenthesis refers to the probability of a given schema
being disrupted/destroyed by the crossover and mutation
operation, a topic explored at greater depth
in the question section below. Note that it differs a bit
from the similar expression given in chapter 2 of
Goldberg's book, due to the slightly different definition
of the crossover operator adopted above.
Questions and Answers:
- For those having already worked their way through
the selection subPage; what is the most important way in which
the breeding procedure defined above differs from that used
in the WEASEL example ?
[answer]
- Under the assumption of zero crossover and mutation
probability, and with a fixed average fitness F
for the population as a whole, can you obtain a (discrete)
equation expressing the growth in the frequency m(s,t)
of a schema whose carriers' average fitness f(s) exceeds the
average by a fraction c, i.e., f(s)=F+cF ?
[answer]
- Can you construct general mathematical expressions for the
probability of a schema of a given order
(N) and defining length (D) not being disrupted
by (a) the crossover operator, (b) the crossover operator,
(c) either the crossover or mutation operator. Assume a
full chromosome length equal to L.
(d) How does your result in (c) compare to the quantity
within square parentheses in the expression cited above ?
Can you explain the difference ?
[answer]
- Based on your answers to question 3, what are the optimal
defining characteristics of a schema that will minimize
disruption upon breeding ?
[answer]
|