© C. FERREIRA, 2002 (Terms of Use) ISBN: 9729589054

Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence

Mining meaningful information from noisy data
 
Tools for mining knowledge from data are crucial in a world where the amount of data keeps growing. The quantity is now so large that finding the meaningful factors in a sea of data is a Herculean task, and new technologies have been developed to extract relevant knowledge from it. Gene expression programming is one of these emerging technologies and is ideal for separating the wheat from the chaff. In this section we illustrate this with a function finding problem in which nine out of 10 variables are meaningless.

The test function is the already familiar function of section 4.1.1, with the difference that the meaningful parameter must be discovered among a total of 10 variables. Table 4.4 summarizes the parameters used per run in this experiment. As the high success rate (77%) shows, GEP was not overwhelmed by the quantity of irrelevant data and found its way very efficiently. The first perfect solution was found in generation 61 of run 0. Its chromosome is shown below (the sub-ETs are linked by addition):

01234567890120123456789012012345678901201234567890120123456789012

*a*aa-hgadadc-ah*d-gcfjcbd/--gcgciijeegh+eeehbeddbfd*aadaabcecfgb

(4.7)

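where a represents the meaningful variable and b-j represent the nine meaningless variables. When expressed, this chromosome encodes a function equivalent to the target function (4.1). Although h, g and c appear in the expressed sub-ETs, they cancel out: h is subtracted in one sub-ET and added back in another, and g and c occur only in the quotient (g-c)/(g-c).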
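
The decoding of chromosome (4.7) can be checked mechanically. The following is a minimal Python sketch, not the author's implementation. It assumes the usual breadth-first reading of each gene and a gene length of 13 (head 6 plus tail 7, consistent with the chromosome length of 65 in Table 4.4). The names split_genes, build_tree, to_infix, evaluate and express are hypothetical helpers introduced here for illustration.

```python
import operator

# Binary arithmetic functions from Table 4.4; every other symbol is a terminal.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# Chromosome (4.7), written gene by gene for readability.
CHROMOSOME = ("*a*aa-hgadadc" "-ah*d-gcfjcbd" "/--gcgciijeeg"
              "h+eeehbeddbfd" "*aadaabcecfgb")
GENE_LENGTH = 13  # assumption: head 6 + tail 7


def split_genes(chromosome, gene_length=GENE_LENGTH):
    """Cut the chromosome into equal-length genes."""
    return [chromosome[i:i + gene_length]
            for i in range(0, len(chromosome), gene_length)]


def build_tree(gene):
    """Decode one gene breadth-first into [symbol, children] nodes."""
    nodes = [[symbol, []] for symbol in gene]
    next_free = 1  # index of the next unused symbol in the gene
    current = 0
    while current < next_free:
        symbol, children = nodes[current]
        if symbol in OPS:  # all functions here take two arguments
            children.extend(nodes[next_free:next_free + 2])
            next_free += 2
        current += 1
    return nodes[0]


def to_infix(node):
    """Render a decoded sub-ET as a fully parenthesized expression."""
    symbol, children = node
    if not children:
        return symbol
    return "(" + to_infix(children[0]) + symbol + to_infix(children[1]) + ")"


def evaluate(node, values):
    """Evaluate a sub-ET for a dict mapping terminal names to numbers."""
    symbol, children = node
    if children:
        left, right = (evaluate(child, values) for child in children)
        return OPS[symbol](left, right)
    return values[symbol]


def express(chromosome, values):
    """Sum the sub-ETs, since the linking function is addition."""
    return sum(evaluate(build_tree(gene), values)
               for gene in split_genes(chromosome))


if __name__ == "__main__":
    for gene in split_genes(CHROMOSOME):
        print(gene, "->", to_infix(build_tree(gene)))
```

Under these assumptions the five sub-ETs read a*(a*a), a-h, (g-c)/(g-c), h and a*a, so their sum reduces to a^3 + a^2 + a + 1 wherever g differs from c (the quotient is undefined when g = c).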


Table 4.4
Settings used in the 10-dimensional data mining problem.

Number of runs                  100
Number of generations           1000
Population size                 50
Number of fitness cases         100
Function set                    + - * /
Terminal set                    a b c d e f g h i j
Head length                     6
Number of genes                 5
Linking function                +
Chromosome length               65
Mutation rate                   0.044
One-point recombination rate    0.3
Two-point recombination rate    0.3
Gene recombination rate         0.1
IS transposition rate           0.1
IS elements length              1,2,3
RIS transposition rate          0.1
RIS elements length             1,2,3
Gene transposition rate         0.1
Selection range                 100%
Precision                       0.01%
Success rate                    77%
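
The chromosome length in Table 4.4 follows from the head length, the maximum arity of the function set, and the number of genes. Below is a small sketch of that arithmetic, using the standard GEP tail-length relation t = h(n - 1) + 1. The function names are hypothetical, and reading the mutation rate as a per-position probability is an assumption made here, not something this section states.

```python
def tail_length(head_length, max_arity):
    """Tail length t = h * (n - 1) + 1 for head length h and max arity n."""
    return head_length * (max_arity - 1) + 1


def chromosome_length(head_length, max_arity, n_genes):
    """Total length of a chromosome made of equal-length genes."""
    gene_length = head_length + tail_length(head_length, max_arity)
    return n_genes * gene_length


def expected_point_mutations(mutation_rate, length):
    """Expected mutations per chromosome, assuming a per-position rate."""
    return mutation_rate * length


if __name__ == "__main__":
    # Values from Table 4.4; +, -, * and / all take two arguments.
    length = chromosome_length(head_length=6, max_arity=2, n_genes=5)
    print(tail_length(6, 2))  # 7
    print(length)             # 65
    print(round(expected_point_mutations(0.044, length), 2))  # 2.86
```

With a head of 6 and binary functions only, each gene has a tail of 7 and a length of 13, which gives the 65 positions listed for the five-gene chromosome.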

