THE ANNALS OF "DUNAREA DE JOS" UNIVERSITY OF GALATI, FASCICLE III, 2003, ISSN 1221-454X
ELECTROTECHNICS, ELECTRONICS, AUTOMATIC CONTROL, INFORMATICS

A NEW METHOD OF GENE CODING FOR A GENETIC ALGORITHM DESIGNED FOR PARAMETRIC OPTIMIZATION

Radu Belea and Liviu Beldiman

Department of Control Systems and Industrial Informatics, University "Dunărea de Jos" of Galaţi, Faculty of Electrical Engineering and Computer Science, Domneasca Street 47, 6200, Galaţi, Romania. Phone: (+40) 36-4487, Phone+Fax: (+40)36-4608, E-Mail: Radu.Belea@ugal.ro, Liviu.Beldiman@ugal.ro.

Abstract: In a parametric optimization problem the genes code the real parameters of the fitness function. Two coding techniques are known, under the names of binary coded genes and real coded genes. The comparison between the two has been a controversial subject ever since the first papers about parametric optimization appeared. An objective analysis of the advantages and disadvantages of the two coding techniques is difficult as long as information stored in different formats is compared. The present paper suggests a gene coding technique that uses the same format for both binary coded genes and real coded genes. Once the representation of the real parameters is unified, the following criterion is applied: the differences between the two techniques are measured statistically, through the effect of the genetic operators on randomly generated individuals.

Keywords: binary coded genes, exploration, exploitation, randomly initialized genes, uniform crossover, HUX crossover, arithmetic crossover.

1. INTRODUCTION

Everything that can be found in a digital computer (numerical information, graphical information, texts, programs, operating systems, algorithms etc.) is coded with binary digits. That is why the expressions binary coded genes and real coded genes, even if they are very often used, are pleonasms. It would be proper to speak about fixed-point real numbers and floating-point real numbers, but these expressions are not established in the genetic algorithms literature.

The literature often uses a play on words: exploration-exploitation.
The verb to explore means to search over large areas, while the noun exploitation also has the meaning of mining exploitation, or mine. In this play on words the verb to exploit means to search with small steps. A genetic algorithm gives good results if it combines searching over large areas with searching in small steps.

Crawford [1997] states eight guidelines for the design of a new crossover operator. Two of them are important for the present paper:
- Guideline 5: the crossover operator should explore, not exploit.
- Guideline 8: in general, small (large) changes in genotype should produce small (large) changes in phenotype.

In a genetic algorithm the exploration is good if the population covers, in time, the search space uniformly, and if the number of visited points is as large as possible. According to guideline 5 (Crawford [1997]), the success of the exploration depends firstly on the crossover operator used, and then on the shape of the space and on the selection method used.

The paper is organized as follows: section 2 briefly presents the arithmetic crossover. Section 3 presents the binary representation techniques of real numbers. Section 4 explains a new gene coding method, which unifies binary coded genes with real coded genes. Section 5 uses the bit histogram to test whether the genetic operators were programmed correctly. Section 7 uses the histogram of the Hamming distance between parents and children to assess the quality of the crossover operators. A summary of the results is given in the last section of the paper.

This paper was recommended for publication by Rustem POPA.

2. CROSSOVER BETWEEN REAL NUMBERS

In genetic algorithms with real coded genes, the parents' chromosomes $p_1[k]$, $p_2[k]$ and the children's chromosomes $o_1[k]$, $o_2[k]$, $k = 1, \ldots, K$, are gene arrays. The most popular crossover variant between real numbers is the arithmetic crossover. The genes situated in position $k$ of the children's chromosomes are calculated as follows:

(1)  $o_1[k] = \alpha\, p_1[k] + (1 - \alpha)\, p_2[k], \qquad o_2[k] = (1 - \alpha)\, p_1[k] + \alpha\, p_2[k]$

where $\alpha \in [0, 1)$ is a uniformly distributed random real number. The arithmetic crossover has a major disadvantage: it does not explore the whole space.

Fig. 1. The effective space explored by the arithmetic crossover

Figure 1 presents a two-dimensional search space $(x_1, x_2)$. The arithmetic crossover produces two children situated in the rectangle delimited by the points $(x_{1,A}, x_{2,A})$ and $(x_{1,B}, x_{2,B})$ that correspond to the parents' chromosomes, so it does not explore the whole space. To avoid this, the intermediate crossover may be used:

(2)  $o_1[k] = p_1[k] + \alpha_1 \left( p_2[k] - p_1[k] \right), \qquad o_2[k] = p_2[k] + \alpha_2 \left( p_1[k] - p_2[k] \right)$

where $\alpha_1, \alpha_2 \in [-\delta, 1 + \delta)$ are two random numbers and $\delta = 0.5$ is a number chosen by the programmer. In the case $\delta = 0.5$ it is necessary to verify, for each gene, that the result obtained after the crossover does not overflow the search space.

3. THE BINARY REPRESENTATION OF REAL NUMBERS

To record the exact value of a real number, an infinite number of digits is needed. For example, some thousands of decimals of the number $\pi$ are known, but its exact value is not. This means that the value of a real number stored in a digital computer is truncated, as it is represented with a finite sequence of bits.

A finite sequence of bits $b_p b_{p-1} \ldots b_1 b_0 . b_{-1} \ldots b_{-q+1} b_{-q}$ is a fixed-point unsigned real number. The value of this number is:

(3)  $V = \sum_{i=-q}^{p} b_i\, 2^i$.

Consequently, with $p + q + 1$ binary digits, real numbers can be represented in the domain $0 \le V \le 2^{p+1} - 2^{-q}$ with a resolution of $2^{-q}$. The resolution is the difference between two consecutive real numbers in the respective representation, and the maximum representation error is half of the resolution, that is $2^{-q-1}$.

A floating-point real number can be represented as follows:

(4)  $V = (-1)^S \cdot M \cdot 2^E$,

where $S \in \{0, 1\}$ is the sign bit, $E$ is a signed integer called the exponent, and $M$ is a fixed-point real number called the mantissa. At the end of a floating-point operation the numbers $E$ and $M$ are arranged so that the mantissa has only one binary digit, equal to 1, in front of the point. This operation is called mantissa normalization. When working with a normalized mantissa, it is not necessary to store the digit in front of the point. The value of the real number is:

(5)  $V = (-1)^S \cdot 2^{N - n_{off}} \cdot (b_{msb}.F)$,

where $F$ is the fractional part of the mantissa and $b_{msb}$ is the most significant bit of the mantissa. For floating-point real numbers with normalized mantissa, $b_{msb} = 1$. The exponent $E = N - n_{off}$ is stored with an offset $n_{off}$, so that the number $N$ never has all its bits null. The number $n_{off}$ is chosen so that $-n_{off} \le E \le n_{off}$. It can be observed that the real number 0 cannot be represented as a floating-point real number with normalized mantissa. For this reason, the real number 0 is coded by storing all the bits null.

Table 1

Type     K (bytes)   S     N              b_msb   F               n_off
Single   4           b1    b2 ... b9      -       b10 ... b32     127
Real     6           (entries not legible in the source)
Double   8           b1    b2 ... b12     -       b13 ... b64     1023
Comp     10          (entries not legible in the source)

The meaning of the bits in the four representations of real numbers on IBM-PC computers is given in Table 1. The real number is stored as an array of binary digits of the form $b_1 b_2 b_3 b_4 \ldots b_{8K}$, where $K$ is the number of bytes of the representation, $S$ is the sign bit, $N$ is the bit array that stores the exponent, $b_{msb}$ is the most significant bit of the mantissa, $F$ is the bit array that stores the fractional part of the mantissa, and $n_{off}$ is the offset used to compute the exponent.

4. A NEW METHOD OF GENE CODING

The major disadvantage of the floating-point representation is the dependency between the value of the number and the weight of the bits. That is why the mutation and crossover operators cannot be used the way they were defined for binary coded genes. However, in the representation of floating-point real numbers with normalized mantissa there are intervals $x \in [2^k, 2^{k+1})$, $k \in \mathbb{Z}$, in which all the real numbers have the same exponent and the mantissa bits in the same position have the same weight. For example, the real parameter $x \in [x_{min}, x_{max})$ can be normalized with the function:

(6)  $[x_{min}, x_{max}) \to [1, 2), \qquad \tilde{x} = 1 + \dfrac{x - x_{min}}{x_{max} - x_{min}}$

In the following examples the single type is used, stored on 32 bits. The numbers in the interval $[1, 2)$ have $S = 0$ and $N = 127$, so the exponent is $E = N - 127 = 0$, and the mantissa takes values from the combination $1.000\ldots0$ to the combination $1.111\ldots1$. The numbers are normalized, so only the 23 bits of the fractional part of the mantissa are stored.
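The normalization of equation (6) and the resulting bit layout of the single type can be checked with a short script. What follows is a minimal Python sketch under stated assumptions, not the authors' implementation: the function names `normalize`, `denormalize`, `single_bits` and `split_fields`, as well as the parameter range [-5, 5), are hypothetical and chosen only for illustration. Python floats are doubles, so the gene is rounded to the single format when it is packed.

```python
import struct


def normalize(x, x_min, x_max):
    """Equation (6): map a parameter x in [x_min, x_max) onto a gene in [1, 2)."""
    return 1.0 + (x - x_min) / (x_max - x_min)


def denormalize(g, x_min, x_max):
    """Inverse of equation (6): recover the parameter from the gene."""
    return x_min + (g - 1.0) * (x_max - x_min)


def single_bits(g):
    """Return the 32 bits b1 ... b32 of g stored as an IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", g))
    return format(word, "032b")


def split_fields(bits):
    """Split b1 ... b32 into S (b1), N (b2 ... b9) and F (b10 ... b32)."""
    return bits[0], bits[1:9], bits[9:]


if __name__ == "__main__":
    x_min, x_max = -5.0, 5.0  # hypothetical parameter range
    for x in (-5.0, 0.0, 2.5):
        g = normalize(x, x_min, x_max)
        s, n, f = split_fields(single_bits(g))
        print(f"x = {x:5.2f}  gene = {g:.4f}  S = {s}  N = {n} ({int(n, 2)})  F = {f}")
```

For every gene in [1, 2) the fields S and N are the same, so only the 23 bits of F carry genetic information, and decoding reduces to a call to `denormalize`.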
This representation has the following advantages:
- the chromosome is an array of real numbers;
- genetic operators defined for binary coded genes can be used, as well as genetic operators defined for real coded genes;
- there is no need for a function that decodes the genetic information; it is sufficient to denormalize the gene;
- all the arithmetic operations are carried out by the co-processor, which speeds up the genetic algorithm.

This representation imposes some restrictions when programming genetic operators for binary coded genes. The mutation and the crossover must not modify the bits that carry the sign and the exponent. For all the numbers normalized to the interval $\tilde{x} \in [1, 2)$ on four bytes, the bits that should not be modified are $b_1 \ldots b_9 = 001111111$.

5. TESTING GENETIC OPERATORS WITH MONTE CARLO METHODS

To verify that the genetic operators are implemented correctly, it is necessary to run statistical tests for the random initialization, the mutation and the crossover. The random initialization of a gene was implemented with the instruction:

(7)  g := Random + 1;

where the function Random generates a random number in the interval $[0, 1)$. Figure 2 presents a bit histogram for […]000 randomly initialized genes.

Fig. 2. Statistical test for random initialization

In the figure, the bits are placed on the abscissa in the order from Table 1, and the ordinate shows the probability that a bit has the value 1. The random initialization is correct if $b_1 = b_2 = 0$, $b_3 = \ldots = b_9 = 1$, and any bit from $b_{10} \ldots b_{32}$ takes the value 1 with a probability of 0.5. The mean value was calculated only for the bits $b_{10} \ldots b_{32}$, which can modify their content.
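The statistical test for random initialization, together with the restriction that the sign and exponent bits stay untouched, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the names `random_gene`, `flip_fraction_bit` and `bit_histogram` are hypothetical, the gene is built by truncating the random number to 23 bits (a stand-in for instruction (7) that keeps the result exactly representable as a single), and the sample size of 2000 and the seed are arbitrary.

```python
import random
import struct

FRACTION_BITS = 23       # bits b10 ... b32 of the single type
ONE_WORD = 0x3F800000    # bit pattern of 1.0: S = 0, N = 127, F = 0


def gene_to_word(g):
    """Bit pattern of g stored as an IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", g))
    return word


def word_to_gene(word):
    """Gene whose single-precision bit pattern is `word`."""
    (g,) = struct.unpack(">f", struct.pack(">I", word))
    return g


def random_gene(rng):
    """Counterpart of instruction (7), with Random truncated to 23 bits (assumption)."""
    fraction = int(rng.random() * (1 << FRACTION_BITS))
    return word_to_gene(ONE_WORD | fraction)


def flip_fraction_bit(g, k):
    """Flip bit b_k; only b10 ... b32 may change, b1 ... b9 are protected."""
    if not 10 <= k <= 32:
        raise ValueError("only b10 ... b32 may be modified")
    return word_to_gene(gene_to_word(g) ^ (1 << (32 - k)))


def bit_histogram(genes):
    """Relative frequency of the value 1 for each bit b1 ... b32."""
    counts = [0] * 32
    for g in genes:
        word = gene_to_word(g)
        for i in range(32):
            counts[i] += (word >> (31 - i)) & 1
    return [c / len(genes) for c in counts]


if __name__ == "__main__":
    rng = random.Random(2003)  # arbitrary seed
    hist = bit_histogram([random_gene(rng) for _ in range(2000)])
    print("b1 ... b9  :", hist[:9])
    print("mean b10 ... b32:", sum(hist[9:]) / FRACTION_BITS)
```

The histogram returned by `bit_histogram` plays the role of Figure 2: the first nine entries should be fixed and the remaining 23 should scatter around 0.5.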

Goldberg [99] states the following recommendation: the mutation probability must not be affected by the number of bits that code the gene. To fulfil this recommendation and to avoid excessive use of the random generator, the number n of bits that will undergo a mutation is calculated first, with the instruction:

(8)  n := Trunc(23 * pm + Random);

where Trunc is the function that rounds by truncation to the lower value, 23 is the number of bits of the fractional part of the mantissa, pm is the mutation probability, and Random is the function that generates random numbers uniformly distributed in the interval $[0, 1)$. Then the random number generator is used n more times to establish the bits that are going to be modified.

Fig. 3. Statistical test for mutation

Figure 3 presents a histogram of the bits that code the gene for […]000 applications of the mutation operator with $p_m = 0.05$. The gene was initialized with the value $g = 1.0$. The mutation operator is programmed correctly if the histogram shows that every bit from $b_{10} \ldots b_{32}$ takes the value 1 with the probability $p_m$.

Fig. 4. Statistical test for crossover

Figure 4 presents a histogram of the bits that code the $g_1$ genes (gray bars) and the $g_2$ genes (black bars). The histograms were obtained after the crossover was applied […]000 times to the genes $g_1 = 1.99\ldots$ and $g_2 = 1.0$. The one-point crossover was studied, as described in Goldberg [99]. The crossover operator is programmed correctly if the bits $b_{10} \ldots b_{32}$ of the $g_1$ histogram take the value 1 with a probability increasing from 0 to 1, and the bits $b_{10} \ldots b_{32}$ of the $g_2$ histogram take the value 1 with a probability decreasing from 1 to 0.

6. THE CROSSOVER OPERATOR QUALITY MEASURING METHOD

The Hamming distance between two chromosomes has been used to analyze the uniform crossover ever since the first papers of Booker and Eshelman, in which the uniform crossover and the HUX (highly disruptive form of crossover) crossover were suggested. Let $g_1$ and $g_2$ be two genes, each coded as a string of bits of the same length.
The Hamming distance between $g_1$ and $g_2$ is:

(9)  $d_H(g_1, g_2) = bc(g_1 \ \mathrm{xor}\ g_2)$

where the logical operator "xor" is applied to the bits with the same position, and $bc(\cdot)$ is a function that counts the non-zero bits in a string of bits.

A first result obtained by using the Hamming distance in the analysis of the crossover operator is Booker's observation:

(10)  $d_H(p_1, p_2) = 0 \ \text{or}\ d_H(p_1, p_2) = 1 \;\Rightarrow\; (o_1 = p_1,\ o_2 = p_2) \ \text{or}\ (o_1 = p_2,\ o_2 = p_1)$

where $o_1$ and $o_2$ are called substitute offspring. Correlating guidelines 5 and 8 presented by Crawford with Booker's observation, the following criterion results:

Criterion: the larger, statistically, the Hamming distances between the parents' and the offspring's chromosomes, the better the crossover operator explores the parameter space.

Consider a species of solutions whose gene is coded with the method suggested in section 4 of the present paper. Let $p_1$, $p_2$ denote the parents' chromosomes and $o_1$, $o_2$ the offspring's chromosomes. Rana [1999] notices that:

(11)  $d_H(p_1, o_1) = d_H(p_2, o_2), \qquad d_H(p_1, o_2) = d_H(p_2, o_1)$.

Rana [1999] covers the set of independent chromosomes and builds the histogram of the parents-offspring Hamming distances. The Hamming distance between parents and offspring was calculated with the formula:

(12)  $d_H(p, o) = \min\left( d_H(p_1, o_1),\ d_H(p_1, o_2) \right)$

The quality of the crossover operator is assessed considering the probability that substitute offspring appear and the mean value of the distances calculated with formula (12).

The present paper considers a set of randomly initialized chromosomes, and the Hamming distance between parents and offspring is calculated with the formula:

(13)  $d_H(p, o) = \dfrac{d_H(p_1, o_1) + d_H(p_2, o_2)}{2}$.

The quality of the crossover operator is assessed considering the mean value and the dispersion of the distances calculated with formula (13). Moreover, this method is extended to the comparison of the binary crossover with the arithmetic crossover, and to the comparison of the exploration achieved by the arithmetic crossover with the exploration achieved by the random initialization of the population.

The analysis method consists of randomly initializing the parents' genes and then applying two different crossover operators, which yields two pairs of children. Then, for each crossover operator, the Hamming distance between parents and offspring is calculated. The last operations are repeated for a large number of pairs of randomly initialized genes, and then the Hamming distance histograms are plotted.

7. EXPERIMENTAL RESULTS

One-point crossover is presented at length in Goldberg [99]. In the case of uniform crossover, the probability that a bit is transferred from one parent to an offspring, while the bit in the same position of the other parent goes to the other offspring, is 0.5.

Fig. 5. The comparison between one-point crossover and uniform crossover

Figure 5 presents side by side the Hamming distance histograms between parents and offspring for one-point crossover and uniform crossover. Summing the two samples from figure 5, 3.3% of the offspring obtained through one-point crossover are substitute offspring, while only […].3% of the offspring obtained through uniform crossover are substitute offspring. The quality of the exploration achieved by the uniform crossover is therefore better, because the probability that substitute offspring appear is smaller.

Fig. 6. The comparison between uniform crossover and the HUX crossover

Figure 6 presents side by side the Hamming distance histograms between parents and offspring for uniform crossover and the HUX crossover. The HUX crossover is a variant of uniform crossover that reduces the dispersion of the distance between the parents $p_1$, $p_2$ and the offspring $o_1$, $o_2$ by imposing:

(14)  $d_H(p_1, o_1) = d_H(p_1, p_2)\ \mathrm{div}\ 2, \qquad d_H(p_1, o_2) = d_H(p_1, p_2) - d_H(p_1, o_1)$

where the "div" operator represents integer division without remainder. The mask $m_c$ of the bits moved by the HUX crossover is obtained from the Hamming mask $m = p_1 \ \mathrm{xor}\ p_2$, from which half of the bits with the value 1 are removed. The positions of these bits are chosen randomly. For 32-bit genes, the probability of substitute offspring decreases from […].3% in the case of uniform crossover to 0.0[…]% in the case of HUX crossover.

If the gene coding method suggested in the present paper is used, the parents' chromosomes $p_1[k]$, $p_2[k]$ and the offspring's chromosomes $o_1[k]$, $o_2[k]$, $k = 1, \ldots, K$, are gene arrays, and if the arithmetic crossover is performed with formula (1), where $\alpha \in [0, 1)$ is a random real number, it follows that:

(15)  $p_1[k] \in [1, 2),\ p_2[k] \in [1, 2) \;\Rightarrow\; o_1[k] \in [1, 2),\ o_2[k] \in [1, 2)$

which means that the bits of a gene do not change their weight in the gene value (see section 4). So the arithmetic crossover can be compared with any of the binary crossover variants.

Fig. 7. The comparison between the HUX crossover and the arithmetic crossover
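The comparison procedure of sections 6 and 7 can be sketched in a few lines. The following Python sketch is an illustration under stated assumptions, not the authors' program: all function names are hypothetical, genes are IEEE 754 singles in [1, 2) whose 23 fraction bits are drawn at random, the binary operators touch only those 23 bits, and the parents-offspring distance is taken as the average of $d_H(p_1, o_1)$ and $d_H(p_2, o_2)$, as in formula (13).

```python
import random
import struct

FRACTION_BITS = 23
FRACTION_MASK = (1 << FRACTION_BITS) - 1
ONE_WORD = 0x3F800000  # bit pattern of 1.0: S = 0, N = 127, F = 0


def gene_to_word(g):
    (word,) = struct.unpack(">I", struct.pack(">f", g))
    return word


def word_to_gene(word):
    (g,) = struct.unpack(">f", struct.pack(">I", word))
    return g


def random_gene(rng):
    """Random gene in [1, 2): fixed sign and exponent, random fraction bits."""
    return word_to_gene(ONE_WORD | rng.getrandbits(FRACTION_BITS))


def hamming(g1, g2):
    """Equation (9): count the bits in which the two genes differ."""
    return bin(gene_to_word(g1) ^ gene_to_word(g2)).count("1")


def swap_bits(p1, p2, mask):
    """Exchange between the parents the fraction bits selected by mask."""
    w1, w2 = gene_to_word(p1), gene_to_word(p2)
    moved = (w1 ^ w2) & mask & FRACTION_MASK
    return word_to_gene(w1 ^ moved), word_to_gene(w2 ^ moved)


def uniform_crossover(p1, p2, rng):
    """Each fraction bit is exchanged with probability 0.5."""
    return swap_bits(p1, p2, rng.getrandbits(FRACTION_BITS))


def hux_crossover(p1, p2, rng):
    """Exchange exactly half (integer division) of the differing bits."""
    diff = (gene_to_word(p1) ^ gene_to_word(p2)) & FRACTION_MASK
    ones = [i for i in range(FRACTION_BITS) if (diff >> i) & 1]
    mask = sum(1 << i for i in rng.sample(ones, len(ones) // 2))
    return swap_bits(p1, p2, mask)


def arithmetic_crossover(p1, p2, rng):
    """Formula (1), with the children rounded back to the single format."""
    a = rng.random()
    o1 = a * p1 + (1.0 - a) * p2
    o2 = (1.0 - a) * p1 + a * p2
    return word_to_gene(gene_to_word(o1)), word_to_gene(gene_to_word(o2))


def mean_distance(crossover, trials, rng):
    """Mean of the parents-offspring distance of formula (13)."""
    total = 0.0
    for _ in range(trials):
        p1, p2 = random_gene(rng), random_gene(rng)
        o1, o2 = crossover(p1, p2, rng)
        total += (hamming(p1, o1) + hamming(p2, o2)) / 2
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    for op in (uniform_crossover, hux_crossover, arithmetic_crossover):
        print(op.__name__, mean_distance(op, 10000, rng))
```

Collecting the individual distances instead of their mean would give histograms of the kind shown in figures 5 to 7. Because the children of the arithmetic crossover stay in [1, 2), the same `hamming` function applies to all three operators.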

Figure 7 presents side by side the Hamming distance histograms between parents and offspring for the HUX crossover and the arithmetic crossover. It can be noticed that the Hamming distances between parents and offspring are larger in the case of the arithmetic crossover.

In the introductory part it was stated that the exploration is better if the number of points visited by the individuals is larger. This visiting of the search space starts at the moment of the random initialization of the genetic algorithm. The coding method suggested in section 4, together with the analysis method used above, allows the exploration achieved by the random initialization to be compared with the exploration achieved by the crossover genetic operator.

Fig. 8. The comparison between the exploration achieved by the arithmetic crossover and the exploration achieved by the random initialization of the genes

Figure 8 presents two histograms: the histogram of the Hamming distances between parents and offspring in the case of arithmetic crossover, and the histogram of the Hamming distances between two randomly initialized genes. It can be noticed that the arithmetic crossover is almost as efficient as the random initialization of the genes.

8. CONCLUSIONS

From figures 5, 6 and 7 it results that, according to the criterion stated in section 6, the one-point crossover has the weakest performance, and the performance increases from the uniform crossover, through the HUX crossover, to the real-number crossover.

The arithmetic crossover (see formula (1)) performs a very good search when the population is concentrated in an ecological niche, improving the exploring process (see figure 7, section 7), but it does not explore the whole search space (see figure 1, section 2). So the arithmetic crossover does not fulfil Crawford's guideline 5. The unwanted result from figure 1 suggests a combined use of the uniform crossover or the HUX crossover with the arithmetic crossover.
In this way, the arithmetic crossover reinitializes the bits of a part of the population, preserving the good solutions near the optimum point, while the binary crossover explores the search space very well.

9. REFERENCES

Bäck T. [1997]. Principles of Evolutionary Computation. EUFIT 97, September 1997, Aachen, Germany, p. 759-763.

Crawford K. D. [1997]. How one go developing a new crossover operator with an a priori expectation of its merit? Report at University of Tulsa, 1997.

Eshelman L. J. [1991]. The CHC Adaptive Search Algorithm: How to Have Safe Search When Engaging in Nontraditional Genetic Recombination. "Foundations of Genetic Algorithms". Edited by Gregory Rawlins, Morgan Kaufmann Publishers, Inc., 1991.

Goldberg D. E. [1990]. Real-coded Genetic Algorithms, Virtual Alphabets. University of Illinois at Urbana-Champaign, Technical Report no. 90001, September 1990.

Goldberg D. E. [99]. Genetic Algorithms. Addison-Wesley USA, (15767), 99. French translation, Copyright June 1994, Editions Addison-Wesley France, S. A.

Michalewicz Z. [1994]. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, Berlin Heidelberg, 1994.

Rana S. [1999]. Examining the Role of Optima and Schema Processing in Genetic Search. Dissertation for the Degree of Doctor of Philosophy, Colorado State University, Fort Collins, 1999.

Wright A. H. [1991]. Genetic Algorithms for Real Parameter Optimisation. "Foundations of Genetic Algorithms". Edited by Gregory Rawlins, Morgan Kaufmann Publishers, Inc., 1991.

*** [1985]. ANSI/IEEE 754-1985 Standard.