Today I will write about optimization methods based on biological concepts. But first I will introduce the concept of optimization and explain why it is so important in science.

Science is based upon experiments and theories that explain the data obtained in those experiments. Those data are the observations we have of a particular phenomenon: measures of some physical quantity we can extract from the experiment. In general we try to correlate some physical quantities with others in the form of an equation, or physical law. These equations are composed of the observations and some parameters. For example, think about a straight line. The equation (physical law) would be y = ax + b. That means that we have a series of measures of x and y, and we want to calculate the optimum parameters a and b that define that line.

It is important to note that we assume a physical law for the relationship between x and y: our physical law says that they are linearly correlated. The a and b parameters will describe other quantities related to the studied phenomenon. If our physical law is a straight line (more generally, if we have a linear problem), there is a direct procedure that calculates the parameters that best fit our data: the linear least squares fit. However, in Physics, and particularly in Geophysics, the equations are usually nonlinear. This poses a problem, because we want to find the models (parameters) that minimize the error between the data and the equation, and in the nonlinear case there is no direct formula like linear least squares.
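
For the linear case, the least squares fit can be computed directly. Here is a minimal sketch with NumPy; the "true" line y = 2x + 1 and the noise level are made-up values for illustration:

```python
import numpy as np

# Synthetic data: points scattered around the line y = 2x + 1
# (the true a and b are assumptions for this illustration).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Linear least squares: solve for the coefficients of the
# design matrix [x, 1] that minimize the squared residuals.
A = np.vstack([x, np.ones_like(x)]).T
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)  # close to the true values 2 and 1
```

For a nonlinear physical law there is no such one-step solve, which is where the search algorithms below come in.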

If the number of possible solutions is high (as is usual in real problems), we cannot perform a systematic search. We need an algorithm that searches only a small portion of the parameter space. There exist a number of algorithms that can do that. Here we will talk about some of the ones based on biological concepts: Genetic Algorithms, Ant Colonies, and Particle Swarms.

Genetic Algorithms are based on the idea that a solution (a set of parameters) is represented by a chromosome (each gene representing one parameter), and that a population of solutions evolves toward the optimum (the best fit) following Darwinian rules. The operators that make the population evolve are the selection of the parents to be crossed, the crossover of the two parent solutions, mutation, and the replacement of individuals in the population. Each solution (chromosome) has a fitness value that represents how well it fits the data, and individuals with better solutions have more opportunities to mate and therefore to pass their information from generation to generation.
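
The operators above (selection, crossover, mutation, replacement) can be sketched in a few lines. This is a toy real-coded GA fitting the a and b of our straight line; the data, population size, and mutation rate are all made-up values, not a recipe:

```python
import random

random.seed(1)

# Toy data on the line y = 2x + 1 (assumed values for this sketch).
data = [(x, 2.0 * x + 1.0) for x in range(10)]

def misfit(chrom):
    # Sum of squared residuals: lower is a better fit.
    a, b = chrom
    return sum((y - (a * x + b)) ** 2 for x, y in data)

def evolve(pop_size=60, generations=200):
    # Each chromosome is [a, b]; genes start at random values.
    pop = [[random.uniform(-5, 5), random.uniform(-5, 5)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: each parent is the better of two random individuals,
        # so fitter solutions get more chances to mate.
        def tournament():
            return min(random.sample(pop, 2), key=misfit)
        new_pop = [min(pop, key=misfit)]          # keep the current best
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            w = random.random()                   # crossover: blend the parents
            child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            if random.random() < 0.2:             # mutation: perturb one gene
                i = random.randrange(2)
                child[i] += random.gauss(0, 0.3)
            new_pop.append(child)
        pop = new_pop                             # replacement
    return min(pop, key=misfit)

best = evolve()
print(best)  # should approach a ≈ 2, b ≈ 1
```

The random crossover weight and the Gaussian mutation are just one common choice; real GAs use many variants of each operator.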

Another example of a fitting algorithm is the Ant Colony Algorithm. In it, each agent (ant) builds a path that represents a solution. At first each ant wanders randomly, as in the natural world, and marks its path with a pheromone trail, which makes the path attractive to other ants. The more ants follow a trail, the more it is reinforced. However, the pheromone evaporates with time, so the search does not always end up on the paths chosen by the first ants. The trail with the highest pheromone content will be the solution.
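
The reinforcement-plus-evaporation loop can be illustrated with the classic two-route "double bridge" toy problem; the route lengths, ant count, and evaporation rate here are made-up values for the sketch:

```python
import random

random.seed(2)

# Toy setup: ants travel from nest to food via a short route
# (length 1.0) or a long route (length 2.0).
lengths = {"short": 1.0, "long": 2.0}
pheromone = {"short": 1.0, "long": 1.0}   # start with equal pheromone
evaporation = 0.1
n_ants = 20

for _ in range(100):                       # iterations
    deposits = {"short": 0.0, "long": 0.0}
    for _ in range(n_ants):
        # Each ant picks a route with probability proportional to pheromone.
        total = pheromone["short"] + pheromone["long"]
        route = ("short" if random.random() < pheromone["short"] / total
                 else "long")
        # Shorter routes receive more pheromone (deposit ~ 1 / length).
        deposits[route] += 1.0 / lengths[route]
    for r in pheromone:
        # Evaporation keeps early random choices from dominating forever.
        pheromone[r] = (1 - evaporation) * pheromone[r] + deposits[r]

best = max(pheromone, key=pheromone.get)
print(best)  # the short route accumulates the most pheromone
```

The positive feedback (more pheromone attracts more ants, which deposit more pheromone) is what concentrates the colony on the better route.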

Finally, we will explain the Particle Swarm Algorithm. In it, each solution wanders through the parameter space with a certain velocity. This algorithm mimics the way a flock or a swarm behaves in the real world. In each iteration, each individual adjusts its movement through the parameter space according to the leader of that iteration, which is of course the best fitted solution of all. The swarm as a whole also changes its movement according to the best solution found previously. And there is a random component (in the velocity) that prevents the search from ending up in a local minimum.
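
The velocity update described above can be sketched as follows. The misfit surface, swarm size, and coefficient values are all assumptions for this illustration, not a reference implementation:

```python
import random

random.seed(3)

# Toy misfit surface (assumed): its minimum is at (3, -1).
def f(p):
    x, y = p
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

n, dims, iters = 30, 2, 200
w, c1, c2 = 0.7, 1.5, 1.5     # inertia, personal and social weights

pos = [[random.uniform(-10, 10) for _ in range(dims)] for _ in range(n)]
vel = [[0.0] * dims for _ in range(n)]
pbest = [p[:] for p in pos]                       # each particle's best so far
gbest = min(pbest, key=f)                         # the swarm's leader

for _ in range(iters):
    for i in range(n):
        for d in range(dims):
            r1, r2 = random.random(), random.random()
            # Velocity update: inertia plus pulls toward the personal and
            # global bests, with random weights that help avoid local minima.
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if f(pos[i]) < f(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=f)

print(gbest)  # converges near (3, -1)
```

Note how the leader (gbest) and each particle's own memory (pbest) both pull on the velocity, which is the flock behavior the paragraph describes.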

It is very interesting to see how Biology can help in a purely mathematical problem like the optimization of functions. Of course, there are more Biology-based algorithms out there; I only mentioned the ones I know best. See Swarm Intelligence for other methods.
