Exploring and Improving the GT-SUITE Genetic Algorithm

Written by Ryan Dudgeon

December 9, 2020

Mathematical optimization is the search for the model input parameters (factors) that minimize or maximize one or more of its predicted outputs (responses).  Optimization serves the modeler in two main tasks, both of which can deepen the modeler’s understanding of the relationship between factors and responses, and therefore of the model in general.  The first is model calibration: tuning model inputs whose values carry a reasonable degree of uncertainty so that one or more model outputs align better with measurement data, yielding a more accurate model.  The second is engineering design: finding optimal values for the design parameters the engineer is responsible for, in order to improve a real component, product, or system.

To avoid the rote task of simulating every possible combination of factor values within their specified bounds and with an acceptable degree of resolution, an optimizer employs a search algorithm to more efficiently find the optimal solution in as few model evaluations (referred to as design iterations) as possible.
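The scale of that savings is easy to see with a little arithmetic. The factor counts and budget below are hypothetical, chosen only to illustrate the gap between a full-factorial sweep and a budgeted search:

```python
# Illustrative arithmetic only: cost of exhaustive search versus a budgeted
# optimizer, for a hypothetical model with 6 factors sampled at 10 levels each.
factors = 6
levels = 10
exhaustive_runs = levels ** factors   # every combination: 10^6 model evaluations

# A genetic algorithm with population size 40 run for 25 generations
# evaluates only population_size * generations designs.
ga_runs = 40 * 25                     # 1,000 design iterations
```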

Genetic algorithms are commonly used for optimization problems because they include stochastic methods that more thoroughly search the design space, which is the multi-dimensional region between the lower and upper bounds of the factors.  Stochastic methods balance two processes, exploitation and exploration: improving upon solutions that are already known while continuing to find new solutions in unexplored areas of the design space.  The genetic algorithm included in GT-SUITE, NSGA-III, was developed by Kalyanmoy Deb and Himanshu Jain [1] and is well regarded in the optimization community.  A genetic algorithm mimics biological evolution by applying mathematical equivalents of selection, cross-over, and mutation to a population of agents over multiple generations.  An agent is a specific combination of factor values and therefore represents one design iteration.  A key genetic algorithm setting is the population size; the population, after being initialized with a set of agents, evolves from generation to generation.  The total number of model evaluations, or design iterations, is the population size multiplied by the number of generations.
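The selection, cross-over, and mutation steps can be sketched in a few lines. The following is a deliberately minimal single-objective sketch, not NSGA-III itself (which adds non-dominated sorting and reference points for many-objective problems); the toy objective and all parameter values are assumptions for illustration:

```python
import random

def response(agent):
    # Toy objective to minimize: sphere function, minimum at the origin.
    return sum(x * x for x in agent)

def evolve(pop_size=20, n_factors=3, generations=50,
           bounds=(-5.0, 5.0), mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize the population with random agents inside the factor bounds.
    pop = [[rng.uniform(lo, hi) for _ in range(n_factors)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=response)                 # selection: rank by fitness
        parents = pop[: pop_size // 2]         # keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_factors)  # one-point cross-over
            child = a[:cut] + b[cut:]
            for i in range(n_factors):         # mutation: random replacement
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        pop = parents + children
    return min(pop, key=response)

best = evolve()
```

Note that the total number of `response` evaluations here is exactly the population size multiplied by the number of generations, matching the design-iteration count described above.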

The performance of an optimization run is generally assessed by plotting the change in the response variable against the design iteration number.  To evaluate optimization performance, we introduce the term progress, defined as the best response value found up to any given design iteration.  The performance of an optimization run can then be summarized by a single plotted line representing the progress.  Because of the stochastic nature of a genetic algorithm, multiple optimizations (referred to as trials) must be run on the same problem using different random seeds to evaluate the progress fairly.  For these studies, 20 trials were run on each model, and the median and maximum progress were calculated and plotted to compare the two algorithms.  The maximum progress is a measure of the worst performance that can be expected.  The plot below shows an example of 20 trials run for a particular optimization model, along with the median and maximum progress.
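The progress metric is straightforward to compute: a running best over each trial's response history, then statistics across trials at each iteration. The sketch below uses random numbers as stand-ins for real minimization histories; the trial count and iteration count mirror the 20-trial studies described above but are otherwise arbitrary:

```python
import random

def running_best(history):
    # Progress at iteration k: the best (here, lowest) response seen so far.
    best, progress = float("inf"), []
    for value in history:
        best = min(best, value)
        progress.append(best)
    return progress

n_trials, n_iters = 20, 100
rng = random.Random(1)
trials = [running_best([rng.random() for _ in range(n_iters)])
          for _ in range(n_trials)]

def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])

# Median and maximum progress across trials at each design iteration.
median_progress = [median([t[k] for t in trials]) for k in range(n_iters)]
max_progress = [max(t[k] for t in trials) for k in range(n_iters)]
```

For a minimization problem, each trial's progress line is non-increasing by construction, and the maximum progress line bounds the worst best-so-far value across all trials.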

Again, the genetic algorithm’s population size is an important setting that influences the performance that can be achieved.  Smaller population sizes generally allow the optimizer to “turn over” and learn from itself faster, given an equal number of total design iterations.  In other words, with a smaller population size, more generations fit within a prescribed number of total design iterations.  As a result, smaller population sizes generally allow faster progress in any given optimization run.  However, if the population size is too small, it might lack sufficient diversity of agents and converge to a local optimum.  This concept is illustrated in the following plot, where the median progress is shown for four different population sizes on a given model.  The smaller population sizes achieve faster early progress within the first 400 design iterations, but population sizes 10 and 20 converge to local optima and fail to continue making progress.  The larger population sizes of 30 and 40 are slower to converge but continue to make progress.

Gamma Technologies recently improved the genetic algorithm in V2021 by incorporating metamodeling during optimization runs.  Also referred to as a response surface or surrogate model, a metamodel is a mathematical representation of response data that uses the factors as input variables.  The known data points are used to train the metamodel, which can then predict response values between the known points.  The improved search algorithm, referred to as the Accelerated GA, seamlessly uses metamodels behind the scenes to intelligently estimate where the optimal solution might be located.   The following graphic is a basic representation of this concept, where a metamodel (represented by the surface contour) is fit through 9 data points.  If the optimizer is trying to find the maximum response, it can use the metamodel to quickly determine that the starred location is the estimated maximum.
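The idea can be illustrated in one dimension with a simple least-squares surrogate. This is only a sketch of the general metamodeling concept, not GT-SUITE's internal surrogate; the toy response function and sample layout are assumptions for illustration:

```python
import numpy as np

# Fit a quadratic metamodel through 9 sampled points of one factor, then use
# the fitted model to estimate where the maximum response lies.
x = np.linspace(0.0, 8.0, 9)                    # 9 sampled factor values
y = -(x - 5.2) ** 2 + 10.0 + 0.1 * np.sin(x)    # toy "measured" responses

a, b, c = np.polyfit(x, y, 2)                   # least-squares quadratic fit
x_star = -b / (2.0 * a)                         # vertex of the fitted parabola

# Rather than searching blindly, the optimizer would evaluate the real model
# near x_star next - the metamodel-guided exploitation step described above.
```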

More details about this approach are given in a white paper recently published to the Gamma Technologies website.  The paper also provides a comprehensive performance comparison between the standard genetic algorithm and the Accelerated GA using 25 complex optimization test models that were used to develop the modified algorithm.  Finally, the paper provides some helpful background on the genetic algorithm, including an exploration of the relationship between its two main parameters, the population size and number of generations.  The paper can be accessed using this link:

https://www.gtisoft.com/wp-content/uploads/2020/12/Exploring-and-Improving-the-GT-SUITE-Genetic-Algorithm.pdf

A few results of the final comparison are included here.  The plots below show median progress (solid lines) and maximum progress (dashed lines) for the standard genetic algorithm and the Accelerated GA on four physics-based models.  For all four of these models, the Accelerated GA not only finds better final solutions, but does so faster and with less variance (as indicated by the maximum progress lines).  Taking the DIPulse-B model as an example, the Accelerated GA often finds better solutions hundreds of design iterations sooner than the standard genetic algorithm, as measured by the horizontal distance between the red and blue lines.

In summary, the GT-SUITE genetic algorithm’s exploitation process has been improved by incorporating metamodeling of the optimization data, with the metamodels providing an estimate of where the true optimum might lie.  The resulting Accelerated GA outperforms the standard genetic algorithm on nearly all of the 25 models tested.  As a result, the modeler can use larger population sizes and fewer generations to increase diversity without substantially increasing the total number of design iterations.  The Accelerated GA should give the modeler more confidence in finding better optimization solutions in fewer total design iterations.

Gamma Technologies is committed to continued research, testing, and improvement of its optimization and data analysis capabilities to provide its users with increased confidence in using these tools for their own projects.  For any questions, support, or further discussion, please reach out to us at [email protected].


References

[1]  K. Deb and H. Jain, “An Evolutionary Many-Objective Optimization Algorithm Using Reference-Point-Based Nondominated Sorting Approach, Part I: Solving Problems With Box Constraints,” in IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 577-601, Aug. 2014.