# [Beowulf] help on island model

Robert G. Brown rgb at phy.duke.edu
Fri Mar 10 05:14:42 PST 2006

On Thu, 9 Mar 2006, Jeremy Baker wrote:

> How CS has used the island model to extrapolate the design of a model for
> software/hardware dynamics is beyond my knowledge base. Biology has been my
> learn'n.

The point of Island models in GAs is that how well a GA works is a
rather delicate balance between population diversity, population size,
mutation rate, and the underlying projective dimensionality of the
problem being solved (which in many cases is not known except as an
upper bound).

The solution being sought is typically a parametric vector that is
mapped by some process (the one being optimized) into a scalar landscape
function -- a fitness function, a cost function, a benefit function, a
free energy function (in the case of physics).  This vector is the
analog of a "chromosome", and each entry can be thought of as a "gene".
The problem is to find the best possible gene for each slot in the
chromosome by means of a genetic process whereby each member of a trial
population exchanges some part (say, half) of its parametric vector
with another member of the population to produce an "offspring".  Both
parents and children then go through a "natural selection" process that
compares their "fitness" -- the scalar landscape function.  Losers
"die" and are eliminated from the population; winners go on to
reproduce again.  Most GAs eliminate a parent after (say) three
generations anyway no matter how fit they are because otherwise
everybody starts to look like grampa and the population prematurely
converges.
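In code, the exchange-and-select step just described might look like the
following toy sketch (plain Python, no GA library; the ten-digit target
and the hit-counting fitness are the hypothetical example worked through
later in this post):

```python
import random

TARGET = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # hypothetical optimum (digits of pi)

def fitness(chrom):
    # The scalar landscape function: one point per gene in the right slot.
    return sum(1 for g, t in zip(chrom, TARGET) if g == t)

def crossover(a, b):
    # Each parent contributes half of its parametric vector.
    cut = len(a) // 2
    return a[:cut] + b[cut:]

def step(population):
    # Breed two randomly chosen members; "natural selection" lets the
    # offspring displace the least fit member if it outscores it.
    a, b = random.sample(population, 2)
    child = crossover(a, b)
    worst = min(population, key=fitness)
    if fitness(child) > fitness(worst):
        population.remove(worst)
        population.append(child)
```

A real GA would also breed preferentially by fitness and retire old
parents as described above; this sketch shows only the core
crossover-and-selection step.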

The information-theoretic part of a GA is very interesting.  Let us
imagine that each number in the vector is an integer in the range 0-9,
although of course in many cases it will be an essentially unbounded
real number.  Just to make it clear that a GA can solve problems that
other algorithms have a very hard time with, we'll also assume that the
landscape function has no local (parametric) gradient, although it does
have a genetic gradient.  That is, if a digit in some position is
correct, the landscape function increases by a small amount relative to
any incorrect value, but all incorrect values yield the same landscape
function.  Gradient search on the parameters is thus impossible because
there ain't one short of presuming that you've already found the right
value, but you CAN identify the right value one gene at a time.

If the problem dimensionality is 1, and the optimum answer is (say) "3",
then a randomly drawn population with more than ten members is likely to
contain the optimum solution already and using a GA at all is foolish --
an exhaustive search is cheap.  If it is 2, with answer "3 1", you need
>100 members of the population to have a good chance of actually hitting
the answer directly.  That is still within reach of an exhaustive search
or (if you know the optimum you are searching for) of a Monte Carlo
search where you just generate random trials until you hit it.  If it is
3, with answer "3 1
4", things start to get a bit nasty, and by the time you hit 10 and the
answer is "3 1 4 1 5 9 2 6 5 3" you no longer have the physical memory
to HOLD an exhaustive search and if the fitness function is "expensive"
to generate you start to have to wait long times to run through this
many numbers.
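The scaling here is simple arithmetic; a sketch (the ten bytes per
stored chromosome is an assumed minimal encoding, purely for scale):

```python
def exhaustive_trials(d):
    # d chromosome slots, ten possible values each.
    return 10 ** d

def bytes_to_enumerate(d, bytes_per_chrom=10):
    # Rough memory needed just to hold every candidate vector at once.
    return exhaustive_trials(d) * bytes_per_chrom
```

For d = 10 that is 10**10 candidates and on the order of 100 GB just to
store them, before a single (possibly expensive) fitness evaluation.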

HOWEVER, a much, much smaller population is likely to contain the answer
in the following sense.  A population of only a few hundred is almost
certain to contain a "3" in position one many times over.  It will also
contain a "1" in position 2, a "4" in position 3, and so on, but for the
MOST part they won't be assembled in anything like the digit string for
\pi.  A very few -- maybe only one or two -- will have "3 x 4 x x x..."
or "x x 4 x x 9 2 x..." -- two or more lined up correctly.  These will
be the "fittest" members of the population.  Similarly a few will have
NO genes correct -- they will be the least fit members.
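These claims are easy to quantify under the toy assumptions above
(digits drawn uniformly from 0-9, slots independent):

```python
def p_gene_present(n):
    # Probability that at least one of n random members carries the
    # correct value in one given slot.
    return 1 - 0.9 ** n

def expected_with_k_correct(n, k):
    # Expected number of members with k specific genes lined up at once.
    return n * 0.1 ** k
```

With n = 200, any given correct gene is present with near certainty,
about two members have two specific genes lined up, and fewer than one
(in expectation) has three -- matching the "maybe only one or two" above.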

At each step the GA has a chance of taking two strings from the
population and generating an offspring that has e.g. "3 x 4 x x 9 2
x..." (four in right position).  If it does, it will likely be
significantly more "fit" than either parent, and MUCH more fit than the
population members with 0 or 1 genes correct.  It therefore will replace
one of the 0's or 1's and be "favored" somewhat in breeding.  Somewhere
else one HOPES one is making members like "x 1 x 1 5 x ..." and that
eventually random chance will get these two scions together to produce
solutions with still more of the solution correct.

However, it doesn't work THAT efficiently.  It could well be that only
10 or so members of the population have a "1" in position two, and that
by chance none of them have any more genes right.  The "1" is then in a
precarious position.  In spite of it being in "unfit" individuals only,
if you lose those individuals you lose any possibility of ever getting
that gene right!

Here is where population size and >>mutation<< start to be important.
If you have a large population (compared to some measure of problem
dimensionality) there are a lot of copies of the "1" around and at least
SOME of them are likely to be in halfway decent individuals.  It will
take a long time for that gene to be eliminated, and there is a decent
chance that it will end up eventually being incorporated by a suitable
differential gene crossover step into the "most successful" individuals
before being lost forever.  If your data dimensionality is >>fifty<<
though, this is not likely to be true for >>any<< population you can
likely afford to run.

Critical genes in any single slot, gene pairs in slots taken two at a time, gene
triplets etc WILL disappear before they are "discovered" and
incorporated into the "best" individuals.  If you make the best
individuals too good, they paradoxically are TOO efficient at
eliminating the "unfit" individuals that >>nevertheless act as a
reservoir for critical genetic information<< and yeah, everybody looks
like grampa even though grampa isn't really all that good looking.  If
you don't make the best individuals "good enough" (successful enough in
reproduction) then the best schema (genetic patterns) you've found so
far similarly have a rather high chance of disappearing from the
population before they are "fixed" by appearing in enough individuals
that they spread out and eventually appear in all individuals.

Mutation is one way of recovering lost genes.  Even if all the "1"s in
slot 2 have been eliminated as unfit, in every generation you roll dice
and subject random genes in random individuals to a random reselection
of their values.  In a population of a few hundred, you might set the
probabilities such that every hundred generations or so at least one
individual with a "1" in slot 2 is likely to be generated (recalling
that MOST mutations are likely to be BAD, and that mutation can
actually destabilize the process and prevent the conservation of good
schema if it occurs too rapidly).  As the population starts to shuffle
out the good genes in the good slots, you have a decent chance of
eventually taking "3 x 4 1 5 9 x x" -> "3 1 4 1 5 9 x x" with a small
differential increase of fitness, and then conserving the new schema.
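A mutation operator along these lines, as a sketch (the population of
300 ten-gene individuals and the rate are illustrative assumptions, not
universal settings):

```python
import random

def mutate(population, rate):
    # Roll the dice for every gene of every individual; a hit re-draws
    # that gene uniformly at random.  Most re-draws are bad, but a few
    # restore genes that selection has driven out of the pool entirely.
    for chrom in population:
        for i in range(len(chrom)):
            if random.random() < rate:
                chrom[i] = random.randint(0, 9)
```

With 300 ten-gene individuals and rate = 1/3000, about one gene mutates
per generation; a tenth of those land in slot 2 and a tenth of those
re-draw a "1", so a lost slot-2 "1" reappears roughly once per hundred
generations, consistent with the figure above.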

However, if you work out the numbers, mutation can only really help with
distribution tails and usually only with single genes.  In many problems
the fitness doesn't increase linearly (as I'm presuming here) but in a
correlated way -- you only get an increase in fitness if you find BOTH a
"1" in slot 2 AND a "4" in slot 3 at the same time.  This is
geometrically less likely, and geometric unlikelihood turns into
"infinite" unlikelihood (for an expensive-to-evaluate fitness function)
all too fast as far as random processes are concerned.

ALSO if you work out the numbers, if you have a large population it
takes a long, long time for any superior gene set to take it over.
There are simply too many competing schema, and in REAL problems these
schema may be incompatible -- that is, there may be patterns that lead
to good comparative fitness in (say) four out of ten slots that are NOT
in the actual optimal answer.  Those "false" patterns can quickly become
fixed and dominant in the population and constantly try to "mix" with
the members of the population that have the true solution encoded.  This
can actually PERMANENTLY block the correct solution from emerging all
too easily -- the population either fails to converge in any time you
can afford to wait or converges to a false solution where there is
simply no differential or crossover-linked path to the "right" one (or
even a "better" one).

This is the problem that people are constantly looking to fix.  How can
you maintain the diversity of the population (so that good genes are
conserved, and good gene schema are conserved) AND eventually converge
(so that populations cannot be too large) AND not get locked into
locally stable but false solution patterns (at least ones involving
fewer than N genes -- it simply isn't possible to prevent this for a
really complex problem, but you can work to ensure that your answer is
at least "halfway decent")?

The Island model seeks to solve the problem by splitting a large
population up into several distinct subgroups -- Islands -- that do the
GA separately.  Then it seeks to selectively mix the results.  I'm sure
there is all sorts of black magic in the selective mixing -- if you let
the islands develop completely independently you'll get distinct
"species" that will NOT mix for many kinds of problems, so you have to
somehow ensure that the schema that emerge are interbreedable.
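One common concrete form of the selective mixing is occasional ring
migration between otherwise independent islands -- sketched here under
the same toy fitness as before (the ring topology and replace-worst
policy are illustrative choices, not the only ones in use):

```python
import random

TARGET = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # same hypothetical optimum

def fitness(chrom):
    return sum(1 for g, t in zip(chrom, TARGET) if g == t)

def migrate(islands, n_migrants=1):
    # Each island copies its fittest members to the next island in a
    # ring, displacing that island's least fit members.  Batches are
    # collected first so all islands migrate "simultaneously".
    movers = [sorted(isl, key=fitness, reverse=True)[:n_migrants]
              for isl in islands]
    for i, batch in enumerate(movers):
        dst = islands[(i + 1) % len(islands)]
        dst.sort(key=fitness)
        dst[:n_migrants] = [list(m) for m in batch]
```

Between migrations each island runs its GA independently, preserving
diversity; the occasional injection of foreign schemata is what keeps
the islands interbreedable rather than speciating.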

HTH,

rgb

> The following is an intuitive guess and may be redundant:
> In ecology, an oversimplified  island model  is a theory that details why
> diversity varies between islands. Island size is a direct relationship to
> genotype diversity. Island's distance from the main land and/or other
> islands is inversely related to the island's diversity. Other factors could
> be included such as island's global location, climate's seasonal
> temp/wind/current patterns that may also alter specie migration patterns,
> human impact, etc. This aforementioned model may include predator prey
> relationships, inter/intra specie competition for resources, and other food
> web facts and specie relationships.
> At best, a software program will only be as good as the data derived from
> good field work. Traps that are not properly set will yield poor results, as
> well as animal behavior not accounted for or known/observed. Field methods
> may enhance or impair how the target animal behaves. Data may be precise but
> not accurate. Building a model on limited research might be a waste
> of computer resources when a less interesting but statistically accurate
> environment might better merit data mining, to dwell on how to mitigate
> hardware needs and software design.  Where field data produces copious
> results that fine tune statistical curves, I would think a population could
> be grouped such that individual calculations could be avoided for each
> organism when a subroutine wants to find out how many birds with small beaks
> fail to eat enough food when the weather impairs their food source.
> However, individual computations may create enhanced results, that is if the
> average wave height at 60 fathoms forty feet from a shoal on the north side
> of an island, during high tide in a fixed hector^2 area is found at 2PM when
> the moon is new and water temp is 27C, the frequency of the rouge wave
> height can be modeled using RND and an S curve, but it doesn't tell you
> where this wave occurred in the measured zone. If the birds that die off are
> individually computed, then other field factors may be altered such as how
> that bird's death impacts its immediate territory. The bird population may
> encompass territory shared by species that are niche or mutually exclusive
> to other species or in defined zones depending on many relationship factors.
> The death of a bird(s) where it competes for certain resources (including
> nesting sites) with a niche community may just help (or hurt) that niche
> community's population, approaching a new population limit which would alter
> your diversity model and CPU/memory needs. Also, learned behavior in certain
> animals might not be accounted for in a program, such as a cat that doesn't
> forage past a boundary of a competitive neighbor (although that neighbor has
> gone/died) before another competitor takes advantage of the resource.
> I would think a good program would divide the model into several programs,
> some suggestions: [most to least predictable] i) weather, ii) geography,
> iii) botany, iv) animal profiles excluding prokaryotes unless the animal
> needs it, v) tags for profile attributes that use similar calculations, vi)
> uncommon computations, and vii) monitor for CPU/memory needs between
> programs at n cycle. I imagine the weather modal would easily merit
> predictable CPU/memory power, as well as the geographic model (physical
> attributes as well as effects caused by animals/plants/time such as how much
> nitrate is in soil, hard/soft water, pH, water/soil cohesion tension, etc),
> and to a less extent the botany model (current plant surveys and
> known/recent trends). The aforementioned models will be using data that is
> by far easier to obtain than when compared to animals, so I would find
> good/meaningful curves to represent these items for the program (so knowing
> what kind of animal would alter the number of variables needed for a
> profile).  The variable CPU/memory power would most likely need to be fluid
> when approaching animal relationships. Perhaps generalized curves could be
> used for populations or regions at n cycle unless an event merits itemizing
> each organism's profile (ie, approaching a limit where a virus bloom may
> occur as augmented by an organism's profile, weather, geography).
> Some animals are easy to study while others are obscure, so profile
> variables would need to address the info known about the genotype. Without a
> keystone organism, the continued computing of profiles for other species in
> the shared domain would be pointless, so to speak. A library of curves could
> be built for certain species, so individual profiles may not be necessary
> when field data is solid. How and by what standards field research is
> measured would need to be a collective discussion to establish a benchmark
> as to when a specie population can be statistically infused in the code vs a
> population with individually computed profiles, or what kind of event would
> merit itemizing a statistical population into a collection of individual
> profiles.
> Perhaps R species might warrant greater use of statistics vs K species with
> individual profiles?
> If RND generation takes more time than to access when compared to seeking an
> address holding a preset RND, then I would harvest idle CPU time to stack
> RND numbers such that when an event requires use of curves or heavy profile
> calculations, the call uses the stack. [I don't know if this is good
> application of CS.]
>
>
> On 3/9/06, Eric Thibodeau <kyron at neuralbs.com> wrote:
>>
>> I happen to be working on just this subject ;)
>>
>>        Island model can be more than one thing, it depends on your
>> implementation.
>> From your description, you seem to be implementing a "multiple-deme"
>> approach
>> while retaining the same GA parameters across these islands. David A. Van
>> Veldhuizen addresses the parallelisation issues in his document entitled
>> "Considerations in Engineering Parallel Multiobjective Evolutionary
>> Algorithms"
>> http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1197689
>>
>> Even though his article is oriented towards MOGAs, most if not all issues
>> brought up in his article also apply to GAs. To give you an idea, here are
>> four points brought up in his article specifically about island model:
>>
>> Four basic island pMOEA variants are seen to exist, each with
>> appropriate migration operators.
>> 1) All islands execute identical MOEAs/parameters (homo-
>>     geneous).
>> 2) All islands execute different MOEAs/parameters (hetero-
>>     geneous).
>> 3) Each island evaluates different objective function subsets.
>> 4) Each island represents a different region of the genotype
>>     or phenotype domains.
>>
>>        My masters is actually geared at trying to "answer" or "reply" to
>> some of the
>> unanswered questions brought up in this same article (providing
>> quantitative
>> results with a clearly defined/detailed environment specification as well
>> as
>> some sort of recipe for parallelising a specifically given MOGA).
>>
>> I hope this reference helps!
>>
>> Eric
>>
>> On Sunday, 5 March 2006 at 03:30, purnima wrote:
>>> hi,
>>>       I've to implement island model genetic algorithm using MPI. the
>> problem is I've already implemented a parallel Genetic Algorithm using MPI
>> where i divide population into subpopulations and distribute it 2 all the
>> processors, and each processors performs sequential GA on its
>> subpopulation
>> after some iteration these processors migrates its best chromosome to each
>> other and again proceed.
>>>
>>>  My problem is that i dnt understand what exactly is island model? what
>> island represents??  how shud i implement it..
>>>  thanx
>>>
>>>
>>
>> --
>> Eric Thibodeau
>> Neural Bucket Solutions Inc.
>> T. (514) 736-1436
>> C. (514) 710-0517
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
>
>
> --
> Jeremy Baker
> SBN 634
> 337 College Hill
> Johnson, VT
> 05656
>

--
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu