System Programming: Genetic Algorithms. Lame Example

Thursday, January 10, 2013

Genetic Algorithms. Lame Example - Solving Quadratic Equation

Source code to this article may be found here.

There are numerous resources on the Internet that describe the theory of Genetic Algorithms (GA). I have, however, found very few that give a real example (though I may not have searched that well). Therefore, I decided to turn the theory into a live example. While there are thousands of areas where GA may be applied, I chose the trivial task of solving a quadratic equation for the sake of simplicity of the implementation. The equation is y = 13x^2 - 5x - 12. It has two roots (points where its graph crosses the X axis), at x = -0.7875184 and at x = 1.17213378. Although this particular GA implementation has no application in real life (you can solve that equation on paper in several seconds), it shows the basic GA mechanics in action.

Data Encoding

The classic implementation of GA encodes data as bit strings (chromosomes where each gene is represented by one or more bits). I decided to change that a bit (which is not prohibited) and use real numbers instead of bit strings. Each chromosome includes two genes, that is, two candidate solutions, one for each root of the equation.

typedef struct _CHROMO_
{
    double value1, value2;  /* genes: candidate values for the two roots */
    double fitness;         /* fitness of this candidate pair */
} chromo_t;

The third member of the chromosome, fitness, is used only to store the fitness of the solution represented by this chromosome. The genes (values) are initialized to random values when the initial population is created:

#include <stdlib.h>  /* rand */

#define POPULATION 2000

chromo_t population[POPULATION];

/* value1 starts in [5, 14], value2 starts in [-14, -5] */
for(int i = 0; i < POPULATION; i++)
{
    population[i].value1 = (double)(rand() % 10) + 5.0;
    population[i].value2 = -(double)(rand() % 10) - 5.0;
}

Fitness Function

The fitness function is used to estimate the suitability of the current solution. In this particular case, it is the sum of the absolute values produced by the equation when fed the two values from the chromosome. This means that the lower the value returned by the fitness function, the closer we get to the proper solution.

#include <math.h>  /* fabs, pow */

double fitness(chromo_t* ch)
{
    return fabs(13.0 * pow(ch->value1, 2.0) - 5.0 * ch->value1 - 12.0)
         + fabs(13.0 * pow(ch->value2, 2.0) - 5.0 * ch->value2 - 12.0);
}

While you iterate through the population calculating fitness for each phenotype, you should save the indices of the two best solutions (the number may vary from case to case), as those will be used to produce the next generation.
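The bookkeeping described above might look roughly like the sketch below. It repeats the chromosome type and fitness function so that it compiles on its own; the names poly and evaluate are assumptions made for this illustration and do not come from the original source code.

```c
#include <math.h>
#include <stddef.h>

typedef struct {
    double value1, value2;
    double fitness;
} chromo_t;

/* The equation from the article: y = 13x^2 - 5x - 12. */
static double poly(double x)
{
    return 13.0 * x * x - 5.0 * x - 12.0;
}

/* Same idea as the article's fitness(): lower is better. */
double fitness(const chromo_t *ch)
{
    return fabs(poly(ch->value1)) + fabs(poly(ch->value2));
}

/* Hypothetical helper: scores every chromosome and reports the indices
 * of the two fittest (lowest fitness) ones. Requires n >= 2. */
void evaluate(chromo_t *pop, size_t n, size_t *best, size_t *second)
{
    for (size_t i = 0; i < n; i++)
        pop[i].fitness = fitness(&pop[i]);

    /* Start with the first two chromosomes, ordered by fitness. */
    *best   = pop[0].fitness <= pop[1].fitness ? 0 : 1;
    *second = 1 - *best;

    for (size_t i = 2; i < n; i++) {
        if (pop[i].fitness < pop[*best].fitness) {
            *second = *best;   /* old champion becomes runner-up */
            *best   = i;
        } else if (pop[i].fitness < pop[*second].fitness) {
            *second = i;
        }
    }
}
```

Keeping only two indices avoids sorting the whole population on every generation, which matters little for 2000 chromosomes but keeps the loop simple.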

Crossover

The crossover operator is the crucial part of any GA implementation. It is a function that takes the two (or more) best phenotypes and uses them to produce a new generation. My suggestion is to cross that pair to produce one child (or more, depending on the implementation) and then cross the best phenotype with the rest of the population, thus entirely replacing the old generation with the new one. In this case, the crossover function is rather trivial:

#define MUTATION_RATE 0.0001

void cross(chromo_t* chMom, chromo_t* chDad, chromo_t* chChild)
{
    static int mutation = 0; // Which operation to use for crossover

    // Cases are numbered consecutively so that every call writes the child
    switch(mutation)
    {
        case 0:
            chChild->value1 = chMom->value1 + chDad->value1 * MUTATION_RATE;
            chChild->value2 = chMom->value2 + chDad->value2 * MUTATION_RATE;
            break;
        case 1:
            chChild->value1 = chMom->value1 + chDad->value1 * MUTATION_RATE;
            chChild->value2 = chMom->value2 - chDad->value2 * MUTATION_RATE;
            break;
        case 2:
            chChild->value1 = chMom->value1 - chDad->value1 * MUTATION_RATE;
            chChild->value2 = chMom->value2 + chDad->value2 * MUTATION_RATE;
            break;
        case 3:
            chChild->value1 = chMom->value1 - chDad->value1 * MUTATION_RATE;
            chChild->value2 = chMom->value2 - chDad->value2 * MUTATION_RATE;
            break;
        case 4:
            chChild->value1 = chDad->value1 - chMom->value1 * MUTATION_RATE;
            chChild->value2 = chDad->value2 - chMom->value2 * MUTATION_RATE;
            break;
        case 5:
            chChild->value1 = chDad->value1 + chMom->value1 * MUTATION_RATE;
            chChild->value2 = chDad->value2 - chMom->value2 * MUTATION_RATE;
            break;
        case 6:
            chChild->value1 = chDad->value1 - chMom->value1 * MUTATION_RATE;
            chChild->value2 = chDad->value2 + chMom->value2 * MUTATION_RATE;
            break;
    }

    // Cycle through the seven operations above
    mutation++;
    if(mutation > 6)
        mutation = 0;
}

where MUTATION_RATE is the speed at which solutions evolve (set to 0.0001 in this example). The higher the MUTATION_RATE, the faster you may get near the proper solution, but the less accurate the result may be. The best approach would be to reduce it while approaching the solution.
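One possible way to shrink the rate as the solution gets closer is to tie it to the current best fitness. The sketch below is only an assumption about how that could be done; the function adaptive_rate and its parameters do not exist in the original code, and the scaling formula is just one simple choice.

```c
/* Hypothetical helper (not from the article's source code).
 * Returns a step size that shrinks as best_fitness approaches 0.
 * best_fitness / (1 + best_fitness) lies in [0, 1) for non-negative
 * fitness, so the result stays below max_rate; it is clamped from
 * below so the search never stops moving completely. */
double adaptive_rate(double best_fitness, double min_rate, double max_rate)
{
    double rate = max_rate * best_fitness / (1.0 + best_fitness);

    if (rate < min_rate)
        rate = min_rate;

    return rate;
}
```

The returned value would then be used in place of the fixed MUTATION_RATE constant when producing children.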

As you may have noticed, crossover and mutation are combined in a single function here. Most of the time you would want to separate them and use mutation only in case of stagnation (the inability of the phenotypes to evolve into a proper solution).
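A separated version could look something like the sketch below. Note that it swaps in a different operator from the one used in this post: the crossover here is a plain arithmetic (midpoint) crossover, and the mutation is a small random nudge. The names crossover, mutate and rand_unit are made up for this illustration.

```c
#include <stdlib.h>

typedef struct {
    double value1, value2;
    double fitness;
} chromo_t;

/* Arithmetic crossover: each child gene is the midpoint of the
 * corresponding parent genes. No randomness involved. */
void crossover(const chromo_t *mom, const chromo_t *dad, chromo_t *child)
{
    child->value1 = 0.5 * (mom->value1 + dad->value1);
    child->value2 = 0.5 * (mom->value2 + dad->value2);
}

/* Uniform random number in [-1, 1]. */
static double rand_unit(void)
{
    return 2.0 * ((double)rand() / RAND_MAX) - 1.0;
}

/* Mutation: nudges each gene by a random amount of at most `rate`
 * in either direction. Intended to be called only when the
 * population stagnates. */
void mutate(chromo_t *ch, double rate)
{
    ch->value1 += rate * rand_unit();
    ch->value2 += rate * rand_unit();
}
```

With the two steps split like this, the main loop can call crossover() on every generation and reach for mutate() only after the best fitness has stopped improving for a while.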

Stop Condition

Not much can be said here. You should stop the execution once you reach the desired precision, or once you realize that there is no solution: it is entirely possible that a given quadratic equation has no real roots, and its graph only has a minimum or maximum that never touches the X axis. It is important to mention that the precision and speed of a given GA implementation depend on the mutation rate and the number of phenotypes in the population. In fact, you may implement another GA in order to find optimal values for the current one :).
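The checks described above could be gathered in one place, roughly as follows. All three limits and the names stop_t and check_stop are hypothetical values chosen for this sketch; the original code may use entirely different ones.

```c
/* Hypothetical limits, chosen only for illustration. */
#define TOLERANCE       1e-9    /* fitness at which we call it solved */
#define MAX_GENERATIONS 100000  /* hard cap on the number of generations */
#define MAX_STAGNANT    5000    /* generations allowed without improvement */

typedef enum {
    KEEP_GOING,
    CONVERGED,           /* desired precision reached */
    STAGNATED,           /* no progress: possibly no real roots at all */
    OUT_OF_GENERATIONS   /* gave up after MAX_GENERATIONS */
} stop_t;

/* Decides whether the main loop should stop.
 * best_fitness: fitness of the best chromosome so far (lower is better)
 * generation:   number of generations produced so far
 * stagnant:     generations since best_fitness last improved */
stop_t check_stop(double best_fitness, int generation, int stagnant)
{
    if (best_fitness <= TOLERANCE)
        return CONVERGED;
    if (stagnant >= MAX_STAGNANT)
        return STAGNATED;
    if (generation >= MAX_GENERATIONS)
        return OUT_OF_GENERATIONS;

    return KEEP_GOING;
}
```

Returning a reason rather than a plain yes/no makes it possible to report whether the run ended with a usable answer or simply gave up.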

Testing the Implemented Algorithm

This specific implementation approaches the roots of the aforementioned quadratic equation in a bit less than 23 seconds. The generated solutions are precise enough for most needs.

Let us take a look at the evolution of the solutions generated by the described implementation of GA.

This graph shows how solutions change over the execution time.

The fitness curve looks quite similar:

Fitness curve

As you may see, GA is capable of finding the right direction very quickly.

Conclusion

While this specific implementation is trivial and, to be honest, quite lame, I hope it shows that GAs are a very powerful tool. It is safe to say that solving a quadratic equation is one of the simplest tasks where GA may be successfully applied. Genetic Algorithms of different kinds may be used for selecting an Artificial Neural Network topology, optimizing other algorithms, etc. The success of a given GA depends mostly on the correctness of the chromosome and crossover implementations. Let me reiterate: it is always possible to implement another GA in order to tune the current one properly.

I hope this post was helpful. See you in the next one.

30 comments:

Miky Gonzalez, January 10, 2013 at 5:32 PM
In Linux --> segmentation fault!
My name's MikyGonzalez, regards :)

Miky Gonzalez, January 10, 2013 at 7:39 PM
In fact, I'm on 64-bit linux. Just in case I added -Os but still gives me the error: Segmentation fault (core dumped)

Miky Gonzalez, January 10, 2013 at 10:47 PM
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400804 in main () at main.c:51
51 printf("%d, Best: %4d = %0.40lf, %0.40lf fitness = %0.50lf\n", i, indexBest, population[indexBest].value1, population[indexBest].value2, population[indexBest].fitness);

Сr4sh, January 11, 2013 at 2:22 AM
Nice post!
As an illustrative example, though, this kind of thing seems better suited to machine learning or, failing that, to whitebox fuzzing with polymorphic viruses.

Unknown, February 24, 2013 at 2:50 AM
Hello , I have the source of a rootkit , made in C++ you want to see ? , But it is detected by most anti-virus with a little modification you can make it work again : undetectable ...

Unknown, February 26, 2013 at 9:16 PM
It was not bad at all , bro only i wanted to share so that you can, check out a look or serve in a future ..it is detected by the antivirus ..i can give you the blog from the original user who created it programd in C++ with a little modification you can again make undetectable ,you share because you know the C++ language

Unknown, October 23, 2013 at 7:58 PM
Hey, This one thing I cannot understand.

double fitness(chromo_t* ch)
{
return fabs(13.0 * pow(ch->value1, 2.0) - 5.0 * ch->value1 - 12.0) + fabs(13.0 * pow(ch->value2, 2.0) - 5.0 * ch->value2 - 12.0);
}

value2 is basically negative value1.
So why have you substituted value1 and value2 in the equation and then added them?
As in why are you computing their sum?

Unknown, October 23, 2013 at 8:11 PM
I mean, how do you know that both value1 and value2 will not turn out to be the same.

Unknown, November 1, 2013 at 7:12 PM
i cant execute this code, any equation tested is displaying the same result in all generations. For example, in the generate 2000 the two values are 12 and -7 to any equation tested. Can you help me??

Abhijeet charlie, May 10, 2014 at 10:45 PM
That link of source code at the top of the page..right under title is not working..

khadijah salleh, June 10, 2014 at 3:12 PM
Hi, may i know why did you set the value1 = (double)(rand() % 10) + 5.0; and -(double)(rand() % 10) - 5.0;
I mean why you use double and mudolous to 10 the plus or minus with 5?
tq
