Open Source Libraries
PyGEP – Gene Expression Programming for Python.
PyGEP is maintained by Ryan O'Neil, a graduate student at George
Mason University. In his
words, "PyGEP is a simple library suitable for academic study of
Gene Expression Programming in Python 2.5, aiming for ease of use
and rapid implementation. It provides standard multigenic
chromosomes; a population class using elitism and fitness scaling
for selection; mutation, crossover and transposition operators; and
some standard GEP functions and linkers." PyGEP is hosted at
http://code.google.com/p/pygep/.
Software Applications
GeneXproTools 4.0 – Data mining software based on Gene Expression
Programming developed by Gepsoft. So far it supports Function Finding,
Classification/Logistic Regression, Time Series Prediction, and Logic
Synthesis. For a quick introduction to these problem categories you
can watch the videos released by Gepsoft. The free Demo is available
for download.
The Demo lets you experiment with your own data and, for a wide set of
sample runs, it is fully functional. This means that, for all the
sample runs (mostly interesting real-world problems), you can see the
generated code in 16 different programming languages (including Ada,
C, C++, C#, Fortran, Java, JavaScript, Matlab, Pascal, Perl, PHP,
Python, Visual Basic, VB.Net, and VHDL), draw the parse trees, and
change the population size, the number of generations, the chromosome
architecture, the learning algorithm, the fitness function, the
function set, the genetic operators and their rates, and so on.
Sample Runs for Function Finding:
Sample Runs for Classification:
Sample Runs for Time Series Prediction:
Sample Runs for Logic Synthesis:
Executables
All the executables from the Suite of Problems are listed below. The files are not compressed and can be run from the command prompt without parameters.
Symbolic regression with x^4+x^3+x^2+x: x4x3x2x-01.exe
Sequence induction with 5j^4+4j^3+3j^2+2j+1: SeqInd-01.exe
Pythagorean theorem: Pyth-01.exe
Block stacking: Stacking-01.exe
Boolean 6-multiplexer: Multiplexer6-01.exe
Boolean 11-multiplexer: Multiplexer11-01.exe
GP rule: GP_rule-01.exe
Symbolic regression with complete evolutionary history: SymbRegHistory.exe
Sequence induction with complete evolutionary history: SeqIndHistory.exe
The versions with the complete evolutionary history write the chromosomes of all the individuals of all populations of a run. The remaining programs write the expression tree of the first discovered solution to a file and ask if you want to evaluate the success rate over 100 identical runs.
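The success rate these programs report can be read as the fraction of independent runs that find a solution. The sketch below illustrates that bookkeeping only; it is not taken from the executables. The names `success_rate` and `toy_run` are hypothetical, and `toy_run` is a random stand-in for a real evolutionary run.

```python
import random


def success_rate(run_once, runs=100):
    """Return the fraction of `runs` calls to run_once() that report success."""
    successes = sum(1 for _ in range(runs) if run_once())
    return successes / runs


def toy_run(rng, p_success=0.99):
    """Hypothetical stand-in for one run: succeeds with probability p_success."""
    return rng.random() < p_success


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated invocations agree
    print("Success rate:", success_rate(lambda: toy_run(rng), runs=100))
```

In a real experiment, `run_once` would launch one full run with a fresh random seed and return whether a solution with maximum fitness was found.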
An example of the output these programs produce for the symbolic regression problem:
Number of runs: 100
Number of generations: 50
Population size: 30
Absolute error: 0.01
Maximum fitness: 1000
Fitness cases:
terminal a f(a)
2.81 95.2425
6 1554
7.043 2866.55
8 4680
10 11110
11.38 18386
12 22620
14 41370
15 54240
20 168420
Function and terminal set:
* + - / a
Your solution with maximum fitness 1000:
Expression Tree:
+
++
//*-
*+-*+a*-
*+aa//aa*/**aa
aaaaaaaaaaaaaaaa
Success rate: 0.99
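The expression tree in this output can be checked by hand or with a few lines of code. The following is a minimal sketch, assuming the tree is printed level by level and that each function node takes its two arguments from the next unused symbols of the row below, left to right. The names `TREE_ROWS`, `FITNESS_CASES`, `evaluate_rows`, and `target` are illustrative and do not come from the executables. Under that assumption, the tree reduces to a^4+a^3+a^2+a, and the listed f(a) values agree with that polynomial to the six significant digits shown.

```python
import math

# Rows of the expression tree, copied from the sample output.
TREE_ROWS = [
    "+",
    "++",
    "//*-",
    "*+-*+a*-",
    "*+aa//aa*/**aa",
    "aaaaaaaaaaaaaaaa",
]

# Fitness cases (a, f(a)), copied from the sample output.
FITNESS_CASES = [
    (2.81, 95.2425), (6, 1554), (7.043, 2866.55), (8, 4680), (10, 11110),
    (11.38, 18386), (12, 22620), (14, 41370), (15, 54240), (20, 168420),
]

# The four binary functions of the function set; "a" is the only terminal.
OPS = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
    "/": lambda x, y: x / y,
}


def evaluate_rows(rows, a):
    """Evaluate a level-by-level tree bottom-up for the terminal value a."""
    values = []  # values of the row directly below the one being processed
    for row in reversed(rows):
        children = iter(values)
        current = []
        for symbol in row:
            if symbol == "a":
                current.append(float(a))
            else:
                left = next(children)   # arguments are consumed left to right
                right = next(children)
                current.append(OPS[symbol](left, right))
        values = current
    return values[0]


def target(a):
    """Target function of the symbolic regression problem."""
    return a**4 + a**3 + a**2 + a


if __name__ == "__main__":
    for a, listed in FITNESS_CASES:
        print(a, evaluate_rows(TREE_ROWS, a), listed)
```

Evaluating bottom-up avoids building explicit node objects: each row only needs the values of the row beneath it.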