here is some code that might help (which I just wrote): GA to order 10 values with an interval of 1.0 . It starts with a collection of 100 completely random alleles, and that is how your code begins.
The goal I gave the GA solution was to order the values in ascending order with a split of 1.0. He does this in the Eval_OrderedDistance fitness function by calculating the standard deviation of each pair of samples from 1.0. As fitness tends to 0.0, alleles must appear in sequential order.
Generation 0 is best The chromosome was completely random, like the rest of the chromosome. You can see that the value of fitness is very large (i.e., bad):
GEN: fitness (allele, ...) 0: 375.47460 (583.640, -4.215, -78.418, 164.228, -243.982, -250.237, 354.559, 374.306, 709.859, 115.323)
As generations continue, fitness (standard deviation from 1.0) decreases until it becomes almost ideal in the 100,000 gene:
100: 68.11683 (-154.818, -173.378, -170.846, -193.750, -198.722, -396.502, -464.710, -450.014, -422.194, -407.162) ... 10000: 6.01724 (-269.681, -267.947, -273.282, -281.582, -287.407, -293.622, -302.050, -307.582, -308.198, -308.648) ... 99999: 0.67262 (-294.746, -293.906, -293.114, -292.632, -292.596, -292.911, -292.808, -292.039, -291.112, -290.928)
The interesting parts of the code are the fitness function:
// try to pack the aleles together spaced apart by 1.0 // returns the standard deviation of the samples from 1.0 static float Eval_OrderedDistance(Chromosome c) { float sum = 0; int n = c.alele.Length; for(int i=1; i<n; i++) { float diff = (c.alele[i] - c.alele[i-1]) - 1.0f; sum += diff*diff; // variance from 1.0 } return (float)Math.Sqrt(sum/n); }
And mutations. I used a simple crossover and "completely mutated one allele":
Chromosome ChangeOne(Chromosome c) { Chromosome d = c.Clone(); int i = rand.Next() % d.alele.Length; d.alele[i] = (float)(rand.NextDouble()*2000-1000); return d; }
I used elitism to always keep one exact copy of the best chromosome. Then 100 new chromosomes were generated using mutation and crossover.
It really sounds like you are calculating the difference in fitness, which of course tells you that fitness in your population is about the same. I found it very important how you define your fitness function. The more granular the fitness function, the more you can distinguish your chromosomes. Obviously, your fitness function returns similar values for completely different chromosomes , since your gene 0 returns the fitness variance of 68e-19.
Can you talk about your fitness account? Or what problem do you ask the GA to solve? I think this will help us.
[Edit: Add explicit sharing / Niching]
I changed my mind a bit and updated my code . If you are trying to preserve unique chromosomes, you need to compare their contents (as others have pointed out). One way to do this is to calculate the standard deviation between them. If it is less than a certain threshold, you can consider them the same. From the Chromosome class:
// compute the population standard deviation public float StdDev(Chromosome other) { float sum = 0.0f; for(int i=0; i<alele.Length; i++) { float diff = other.alele[i] - alele[i]; sum += diff*diff; } return (float)Math.Sqrt(sum); }
I think Nichin will give you what you want. He compares all chromosomes in a population to determine their similarity and assigns each a niche value. Then the chromosomes are “punished” for belonging to a niche using the “Explicit Fitness Access” method. Suitability values are divided by the number of chromosomes in each niche. Therefore, if you have three in the niche group A (A, A, A), instead of making this niche 3 times more likely, it should be considered as a whole.
I compared my example with an explicit separation of skills on / off. With a maximum STDDEV of 500 and Niching turned off, there were about 18-20 niches (so basically 5 duplicates of each item in 100 people). When Niching turned on, there were about 85 niches. This is 85% of the unique chromosomes in the population. In the output of my test you can see the diversity after 17,000 generations .
Here's the niching code:
// returns: total number of niches in this population // max_stddev -- any two chromosomes with population stddev less than this max // will be grouped together int ComputeNiches(float max_stddev) { List<int> niches = new List<int>(); // clear niches foreach(var c in population) { c.niche = -1; } // calculate niches for(int i=0; i<population.Count; i++) { var c = population[i]; if( c.niche != -1) continue; // niche already set // compute the niche by finding the stddev between the two chromosomes c.niche = niches.Count; int count_in_niche = 1; // includes the curent Chromosome for(int j=i+1; j<population.Count; j++) { var d = population[j]; float stddev = c.StdDev(d); if(stddev < max_stddev) { d.niche = c.niche; // same niche ++count_in_niche; } } niches.Add(count_in_niche); } // penalize Chromosomes by their niche size foreach(var c in population) { c.niche_scaled_fitness = c.scaled_fitness / niches[c.niche]; } return niches.Count; }
[Edit: post-analysis and updating Anton code]
I know that this is probably not the right forum for solving home problems, but since I did it before knowing it, and I was very interested in doing it, I believe that this can be useful only for Anton.
Genotip.cs , Kromosom.cs , KromoMain.cs
This code supports a good variety, and in one pass I was able to get a "raw food diet" of up to 47, which in your case is a standard error. It was pretty close!
As noted in my comment, I would like to try to help you with programming, and not just help you with your homework. Read this analysis of your work.
As expected, from the very beginning it was not necessary to create a “more diverse” population. Just generate completely random cromosomes.
Your mutations and crossovers were very destructive, and you only had a few of them. I have added some new operators that seem to work better for this problem.
You have chosen the best solution. When I get your code that works only with the choice of the tournament, there will be one Kromo, which is 99% better than everyone else. With a tournament choice, this best value is likely to be forgotten. I added a bit of “elitism” that keeps a copy of this meaning for the next generation.
Consider object oriented methods. Compare the rewriting I sent you with my source code.
Do not duplicate the code. You had sampling options in two different classes.
Keep your code clean. There were several unused pieces of code. Especially when submitting questions to SO, try narrowing it down, removing unused code, and doing some cleanup.
Comment your code! I commented on the re-work significantly. I know this is Serbian, but even a few comments will help someone understand what you are doing and what you intended to do.
All in all, a good job implementing some of the more complex things like Tournament Selection
Prefer double [] arrays over list. There is less overhead. In addition, some of the Temp List variables are not needed. Your structure
List temp = new List (); for (...) {temp.add (value); } for (each value in temp) {sum + = value} average = sum / temp.Count
easy to write as:
sum = 0 for(...) { sum += value; } average = sum / count;
In several places, you forgot to initialize a loop variable that could easily add to your problem. Something like this would cause serious problems, and that was in the fitness code along with one or two other places
double fit = 0; for (each chromosome) {// YOU SHOULD INITIALIZE HERE inside the LOOP for (each allele) {fit + = ...; } fit / = count; }
Successful programming!