
C. FERREIRA In J. M. Benitez, O. Cordon, F. Hoffmann, and R. Roy, eds., Advances in Soft Computing: Engineering Design and Manufacturing, pages 257-266, Springer-Verlag, 2003.

Function Finding and the Creation of Numerical Constants in Gene Expression Programming

First Approach: Direct Manipulation of Numerical Constants
 
To solve the sequence induction problem using random numerical constants, the function set was F = {+, -, *} and the terminal set was T = {a, ?}; the set of integer random constants was R = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and “?” ranged over the integers 0, 1, 2, and 3. The parameters used per run are shown in the first column of Table 2. In this experiment, the first perfect solution was found in generation 45 of run 9 (the contribution of each sub-ET is indicated in square brackets):

y = [a^2] + [a] + [2a^4 + 4a^3] + [0] + [2a^2] + [1 + a] + [3a^4]

(3.3)

which corresponds to the target sequence (3.1).
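As a quick sanity check, the bracketed sub-ET contributions in (3.3) can be summed and compared with the polynomial obtained by collecting like terms by hand, 5a^4 + 4a^3 + 3a^2 + 2a + 1. The sketch below is only illustrative: the function names are hypothetical, and the integer inputs are an arbitrary choice rather than the fitness cases used in the experiment.

```python
def sub_et_sum(a):
    """Sum the seven sub-ET contributions exactly as listed in (3.3)."""
    contributions = [
        a**2,
        a,
        2 * a**4 + 4 * a**3,
        0,
        2 * a**2,
        1 + a,
        3 * a**4,
    ]
    return sum(contributions)


def collected_polynomial(a):
    """The same expression after collecting like terms by hand."""
    return 5 * a**4 + 4 * a**3 + 3 * a**2 + 2 * a + 1


# Arbitrary integer inputs, chosen for illustration only.
mismatches = [a for a in range(1, 11) if sub_et_sum(a) != collected_polynomial(a)]
print(mismatches)  # prints []
```

Note that one sub-ET contributes nothing ([0]) and the linear term is split across two sub-ETs, so the evolved expression is correct but not minimal.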

As shown in the first column of Table 2, the probability of success for this problem is 16%, considerably lower than the 81% of the second approach (see Table 2, column 2). It is worth emphasizing that only the prior knowledge of the solution enabled us, in this case, to choose correctly the type and the range of the random constants.

To find the “V” shaped function using random constants, the function set was F = {+, -, *, /, L, E, K, ~, S, C} (“L” represents the natural logarithm, “E” represents e^x, “K” represents the logarithm of base 10, “~” represents 10^x, “S” represents the sine function, and “C” represents the cosine) and the terminal set was T = {a, ?}. The set of rational random constants was R = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and “?” ranged over the interval [-1, 1]. The parameters used per run are shown in the third column of Table 2. The best solution, found in run 50 after 4584 generations, is shown below (the contribution of each sub-ET is indicated in square brackets):

(3.4)

It has a fitness of 1989.566 and an R-square of 0.9997001 evaluated over the set of 20 fitness cases. Evaluated against a testing set of 100 random points, also chosen from the interval [-1, 1], its R-square is 0.9997185.
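The text does not say how R-square is computed. One common convention, assumed here, is the square of the Pearson correlation coefficient between the target values and the model output. The sketch below follows that assumption; the function name `r_square` is hypothetical and the data are made up, so it does not reproduce the figures reported above.

```python
def r_square(targets, predictions):
    """Squared Pearson correlation between targets and predictions.

    Assumption: the source does not state which R-square definition it uses.
    """
    n = len(targets)
    if n != len(predictions) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    if var_t == 0 or var_p == 0:
        raise ValueError("R-square is undefined for a constant sample")
    return cov * cov / (var_t * var_p)


# Made-up values for illustration only.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
print(r_square(targets, predictions))
```

Under this definition a model whose output is any non-constant linear function of the target scores 1, so a high R-square by itself does not imply that the evolved constants match the expected ones.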

It is worth noting that the algorithm does in fact integrate constants into the evolved solutions, but the constants are very different from the expected ones. Indeed, GEP (and, I believe, all genetic algorithms with tree representations) can find the expected constants to a precision of three or four decimal places when the target functions are simple polynomial functions with rational coefficients and/or when the function set can be guessed fairly accurately; otherwise, a very creative solution is found.

To predict sunspots using random numerical constants, the set of functions was F = {4+, 4-, 4*, 4/} and the set of terminals was T = {a, b, c, d, e, f, g, h, i, j, ?}. The set of rational random constants was R = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and “?” ranged over the interval [-1, 1]. The parameters used per run are shown in the fifth column of Table 2. The best solution, found in run 92 after 4759 generations, is shown below:

(3.5)

It has a fitness of 86603.2 and an R-square of 0.833714 evaluated over the set of 90 fitness cases.
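The ten terminals a through j and the 90 fitness cases are consistent with a sliding-window setup in which each case uses ten consecutive observations to predict the next one. The sketch below illustrates that arithmetic under two assumptions not stated in this section: a series of 100 observations and a delay time of one. The function name `embed` and the placeholder series are hypothetical.

```python
def embed(series, dimension=10):
    """Turn a 1-D series into (inputs, target) pairs with a delay time of one."""
    cases = []
    for start in range(len(series) - dimension):
        inputs = series[start:start + dimension]  # would map to terminals a..j
        target = series[start + dimension]        # the value to be predicted
        cases.append((inputs, target))
    return cases


# Placeholder series of 100 values; NOT the actual sunspot data.
series = list(range(100))
cases = embed(series)
print(len(cases))  # prints 90
```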
