GEP Book

  Home
  News
  Author
  Q&A
  Tutorials
  Downloads
  GEP Biblio
  Contacts

  Visit Gepsoft

 

C. FERREIRA In E. Lutton, J. A. Foster, J. Miller, C. Ryan, and A. G. B. Tettamanzi, eds., Proceedings of the 4th European Conference on Genetic Programming, Lecture Notes in Computer Science, Vol. 2278, pages 51-60, Springer-Verlag, Berlin, Germany, 2002.

Discovery of the Boolean Functions to the Best Density-Classification Rules Using Gene Expression Programming

Structural Organization of Genes
 

GEP genes are composed of a head and a tail. The head contains symbols that represent both functions and terminals, whereas the tail contains only terminals. For each problem, the length of the head h is chosen, whereas the length of the tail t is a function of h and maximum arity n, and is evaluated by the equation:

t = h (n-1) + 1

(4)

Consider a gene for which the set of functions consists of F = {Q, *, /, -, +} and the set of terminals T = {a, b}. In this case, n = 2; and if an h = 15 is chosen, then t = 16. Thus, the length of the gene g is 15 + 16 = 31. One such gene is shown below (the tail is shown in blue):

0123456789012345678901234567890

/aQ/b*ab/Qa*b*-ababaababbabbbba

(5)

It codes for an ET with eight nodes. Note that, in this case, the ORF ends at position 7 whereas the gene ends at position 30.

Suppose now a mutation occurred at position 2, changing the “Q” into “+”. Then the following gene is obtained:

0123456789012345678901234567890

/a+/b*ab/Qa*b*-ababaababbabbbba

(6)

In this case, its expression gives a new ET with 18 nodes. Note that the termination point shifts 10 positions to the right (position 17).

Obviously the opposite might also happen, and the ORF can be shortened. For example, consider again gene (5) above, and suppose a mutation occurred at position 5, changing the “*” into “b”, obtaining:

0123456789012345678901234567890

/aQ/bbab/Qa*b*-ababaababbabbbba

(7)

Its expression results in a new ET with six nodes. Note that the ORF ends at position 5, shortening the parental ET in two nodes.

So, despite their fixed length, GEP genes have the potential to code for ETs of different sizes and shapes, being the simplest composed of only one node (when the first element of a gene is a terminal) and the biggest composed of as many nodes as the length of the gene (when all head elements are functions with maximum arity).

It is evident from the examples above, that any modification made in the genome, no matter how profound, results always in a structurally correct ET as long as the structural organization of genes is maintained. Indeed, the implementation of high-performing genetic operators in GEP is a child’s play, and Ferreira [2] describes seven: point mutation, RIS and IS transposition, two- and one-point recombination, gene transposition and gene recombination.

Home | Contents | Previous | Next