Representations for genetic and evolutionary algorithms. With a foreword by David E. Goldberg. (Q1874029)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Representations for genetic and evolutionary algorithms. With a foreword by David E. Goldberg.
scientific article

    Statements

    Representations for genetic and evolutionary algorithms. With a foreword by David E. Goldberg. (English)
    0 references
    22 May 2003
    0 references
    Imitating Darwin's basic life principles, genetic and evolutionary algorithms are often used to solve difficult optimization problems. Starting with a population of candidates, often represented as binary strings, a genetic algorithm selects some of those candidates with a fitness function, then generates new candidates by applying genetic operators, either crossover, where it swaps parts of the bits between a pair of strings; or mutation, where it randomly changes certain bits of a string. This process will be repeated until a preset criterion is met, when the algorithm converges. The ``best string'' produced at the end of this process is declared as the result. The evolutionary algorithm works in basically the same way, except that it allows more flexible coding mechanisms for the candidate representation, more sophisticated selection algorithms to select candidates for expansion, and more genetic operators, which are essentially variants of the two aforementioned operators, to generate new candidates. A key component of this process is a genotype-phenotype mapping, which uses the genetic information (genotype) to construct various properties (phonotype) of the aforementioned candidates. Much work in the GEA area has focused on the part of identifying powerful operators and the associated test problems showing their effectiveness, with the other important area of problem representation being often ignored. On the other hand, we all know the importance of a correct, and proper, problem representation on its final solution, both in theory and practice. This work, an extention of a Ph.D. dissertation, starts a process of filling this cavity. After a thorough investigation of several key aspects of problem representation, it is concluded that a satisfactory problem representation, measured by a high success rate of finding an optimal solution within a reasonable convergence time, should be at least non-, or uniformly, redundant in terms of the average number of genotypic information used to represent phenotypic information; uniformly scaled in terms of characterizing candidates' fitness with building blocks; and zero distortion in terms of the genotype-phenotype mapping distance. A formal model based on this finding, the time-quality framework, is suggested and then applied to the study of the popular binary and tree representation, and the design of an ideal tree representation, as backed up with both theoretical analysis and empirical data. The introductory Chapter 1 presents the purpose of the contained work and briefly overviews each of the eight following chapters. Chapter 2 presents a very approachable introduction to the general ideas of the GEA area, and some of the basic concepts such as the main principles of life, basic GA algorithms, together with the main operators of crossover, and mutation, as well as some of the performance related results such as the schemata theorem and the building block hypothesis. It also discusses the concept of problem difficulty, its causes and its measurements. Chapters 3 identifies three key elements of problem representation: redundancy, when some phenotype information is represented by multiple genotype information; building block scaling, which describes how differently the building blocks contributes to the fitness of those candidates; and distance distortion, specifying how great the distance between genotypic information compared with that between the corresponding phenotypic information. It then investigates their respective impacts on the solution process as well as solutions themselves, backed up with empirical results. Based on this investigation, Chapter 4 presents the formal model, the time-quality model, for the analysis and design of representation, measured with the quality of solutions, the time to convergence, and the difficulty degree of the problem as represented. Chapters 5 and 6 are devoted to the analysis of the often used binary and tree representation, respectively, in the context of the aforementioned model. Chapter 7 is then devoted to the design of the direct tree representation, guided by the model as developed in Chapter 4. Chapter 8 discusses how to predict the performance of a representation, thus helping the practioners to choose a representation among a few options. Finally, Chapter 9 summarizes the whole work. This book also contains one appendix, collecting some unpublished testing instances for the optimal communication spanning tree, a list of over 220 references, a list of symbols used in the book, and a two-layer index. By and large, this thoroughly investigated work is well written and nicely presented. It is self-contained in the sense that it does not require a reader have prior knowledge on GEA, but as pointed out, a reader should possess basic mathematic knowledge on counting, probability and algebra. I found the index table is not as detailed as it should be. For example, I could not find an index for the word ``performance'', which is really one of the central subjects of this work. I also found much of the discussion of the purpose of this work, and the accomplished results, often duplicated in the preface, the section where it is studied, the conclusion subsections, the summary and conclusion section within the chapters, and finally, the summary chapter. This certainly reminds us of the dissertation nature of this work. I belive this book does provide some valuable insight as the impact of various factors within the problem representation process on the GEA performance, and starts an important process of coming up with a general theory, whatever it is going to be, that will guide our analysis and design of representations fitting some of the problems we encounter in, the GEA area.
    0 references
    genotype-phenotype mapping
    0 references
    time-quality framework
    0 references
    0 references
    0 references

    Identifiers

    0 references
    0 references
    0 references
    0 references
    0 references
    0 references