This document provides an overview of genetic algorithms. It discusses that genetic algorithms are a type of evolutionary algorithm inspired by biological evolution that is used to find optimal or near-optimal solutions to problems by mimicking natural selection. The document outlines the basic concepts of genetic algorithms including encoding, representation, search space, fitness functions, and the main operators of selection, crossover and mutation. It also provides examples of applications in bioinformatics and highlights advantages like being easy to understand while also noting potential disadvantages like requiring more computational time.
Presenting Genetic Algorithm (GA) and its development in the USA during the 1970s, highlighting its application for discrete optimization and combining parent information.
Defining GA as a global search heuristic inspired by evolutionary biology, focusing on objective functions, genetic representation, and randomized operators.
Discussing different encoding techniques for chromosomes in GA, including binary, permutation, value, and tree encoding formats.
The significance of search space and fitness function in GA, evaluating solutions and guiding the selection of the best population.
Outlined steps for executing GA, including encoding, generating populations, fitness evaluation, and selection processes.
Describing the reproduction process in GA, different selection methods like Roulette Wheel, and the principle of genetic crossover.
Details on chromosomal crossover methods, their probabilities, and various crossover types such as single-point and uniform crossover.
Explaining the role of mutation in maintaining diversity in GA and discussing different mutation techniques and their probabilities.
Listing various applications of GA in bioinformatics, including sequence alignment, RNA prediction, and clustering.
Summarizing the advantages of GA, like modularity and flexibility, alongside its computational drawbacks.
Providing book references and resources for further study, concluding the presentation with gratitude.
GA Quick Overview Developed:USA in the 1970’s By J. Holland, K. DeJong, D. Goldberg • Discrete optimizationTypically applied to: • Not too fast • Good heuristic for combinatorial problems Attributed features: • Traditionally emphasizes combining information from good parents (crossover) • many variants, e.g., reproduction models, operators Special Features:
3.
What is GA?? Agenetic algorithm (or GA) is a search technique used in computing to find true or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
4.
BASIC CONCEPTS GA convertsdesign space into genetic space • Works with a coding variables • Traditional optimization techniques are deterministic in nature, but GA uses randomized operators • Three important aspects: 1. Definition of objective function 2. Definition and implementation of genetic representation 3. Definition and implementation of genetic operators
5.
ENCODING • Encoding ofchromosomes is one of the problems when one is starting to solve problem with genetic algorithm • There are various types of encoding used for solving the problems • Four types of encoding are:- 1) Binary encoding 2) Permutation encoding 3) Value encoding 4) Tree encoding
6.
REPRESENTATION (ENCODING) EXAMPLES Possibleindividual’s encoding • Binary (Bit strings) (0101 ... 1100) • Real numbers (43.2 -33.1 ... 0.0 89.2) • Permutations of element (E11 E3 E7 ... E1 E15) • Lists of rules (R1 R2 R3 ... R22 R23) • Program elements (genetic programming) • Octal • Hexadecimal • Floating-point • Tree Representation • Random Keys • ... any data structure ...
7.
SEARCH SPACE • Thespace for all possible feasible solutions • Each feasible solution can be "marked" by its value of the fitness of the problem • GA is started with a set of solutions called populations • Used to form a new population • More suitable, more chances to select • This is repeated until best population is obtained
8.
FITNESS FUNCTION Basically,a fitness function is used to evaluate phenotypes to identify the fittest population members. It is a function which takes the solution as input and produces the suitability of the solution as output In some cases, the fitness function and the objective function may be the same, while in others it may be different based on the problem. It measures the quality of the represented solution It is always problem dependent The fitness of a solution is measured, how best it gives the result For instance, Knapsack problem
9.
STEPS OF GENETICALGORITHM • Step 1: Encoding of the problem in a binary string • Step 2: Random generation of a population • Step 3: Calculate fitness of each solution • Step 4: Select pairs of parent strings based on fitness • Step 5: Generate new string with crossover and mutation until a new population has been produced Repeat steps 2 to 5 until satisfying solution is obtained
10.
OPERATORS OF GENETICALGORITHM Three Basic operators are: 1. Reproduction 2. Crossover 3. Mutation The new population is further evaluated and tested for termination If the termination criteria are not met, the population is iteratively operated One cycle of operations and the subsequent evaluation– Generation in GA
11.
REPRODUCTION Chromosomes areselected from the population to be parents to crossover and produce offspring Also known as Selection Operator Parents are selected according to their fitness There are many methods to select the best chromosomes A) Roulette Wheel Selection B) Rank Selection C) Tournament Selection D) Steady State Selection • The better the chromosomes are, the more chances to be selected they have
12.
CROSSOVER Chromosomal crossover (or crossingover) is the exchange of genetic material between homologous chromosomes Crossover usually occurs when matching regions on matching chromosomes break and then reconnect to the other chromosome.
13.
GA OPERATORS: CROSSOVER •Choose a random point on the two parents • Split parents at this crossover point • Create children by exchanging tails • Generally setting crossover probability with a value 0.5 produces good results and its typical range id 0.3 to 1. • E.g. Probability of crossover is 25% so 5 out of 20 chromosome will undergo crossover.
MUTATION IN GENETIC ALGORITHM Mutationis a genetic operator used to maintain genetic diversity from one generation of a population of genetic algorithm chromosomes to the next. It is analogous to biological mutation. The mutation probability is kept quite low between 0.001 and 0.01. Mutation alters one or more gene values in a chromosome from initial state
APPLICATIONS There arevarious applications of genetic algorithm in bioinformatics. Some of them are:- 1. They are used for multiple sequence alignment 2. Used for RNA structure prediction 3. Used for motif discovery 4. Used for building phylogenetic trees 5. Used for clustering to optimize a wide range of fit – funcions 6. Used for gene expression profiling analysis 7. Used for protein folding and protein/ligand docking 8. Used for operon prediction
19.
ADVANTAGES Easy tounderstand Modular, separate from application Supports multi-objective optimization Easy to exploit previous or alternate solutions Flexible in forming building blocks for hybrid applications Substantial history and range of use DISADVANTAGES GA requires more computational time It is slower than some other methods
20.
REFERENCES Genetic algorithmsin search, optimization, and machine learning (Book by David E. Goldberg) ocw.mit.edu(MIT OPEN COURSE) nptel.ac.in www.google.com Neural Networks, Fuzzy Logic, Algorithms - S. Rajasekaran - G. A. Vijayalakshmi Pai
#14 Mutation probability (or ratio) is basically a measure of the likeness that random elements of your chromosome will be flipped into something else. For example if your chromosome is encoded as a binary string of length 100 if you have 1% mutation probability it means that 1 out of your 100 bits (on average) picked at random will be flipped. Crossover basically simulates genetic recombination (as in human reproduction) and there are a number of ways it is usually implemented in GAs. Sometimes crossover is applied with moderation in GAs (as it breaks symmetry, which is not always good, and you could also go blind) so we talk about crossover probability to indicate a ratio of how many couples will be picked for mating.