Published August 7, 2018

Machine Learning for Automotive Engine Design

Julian Toumey

Senior Research Engineer

In the last five years, “machine learning” has become a veritable buzzword. In applications as diverse as traffic forecasting, smartphone virtual assistants, and genome sequencing, researchers employ machine learning to improve predictions based on large datasets.

Beyond adding convenience to everyday life, machine learning can contribute to technology development as well. In a recent collaboration between Argonne National Laboratory, Aramco, and Convergent Science, Moiz et al. applied machine learning techniques to automotive engine research, enhancing computational fluid dynamics (CFD) studies performed in CONVERGE CFD [1]. Machine learning leverages existing datasets to optimize and predict new designs that have improved performance, higher efficiency, and reduced emissions. In light of market competition and increasingly strict emissions requirements, the union of machine learning and engine CFD is a promising development.

Machine Learning Overview

At a very basic level, machine learning means leveraging data to make accurate predictions. An example of this that we encounter every day is targeted advertising. Marketers use machine learning to take information about our demographics and interests and provide relevant product recommendations. More often than not, these recommendations are startlingly accurate.

The first step in developing a machine learning model is to collect a large dataset. Next, the model applies computational statistics to the data, detecting relationships between inputs and outputs. This process is known as training the model. To evaluate the model, scientists often supply a test dataset that was held out of the training dataset and compare the model's predictions against the known results. The lower the error on this unseen data, the more confidence scientists can place in the model's predictions for new inputs. Many machine learning algorithms exist (decision trees, support vector machines, neural networks, etc.), some of which have been in development for decades.
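As a rough illustration of the train-then-test workflow described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic dataset, the 80/20 split, and the straight-line model are stand-ins, not the data or algorithm used in the work discussed in this article.

```python
import random

def synthetic_dataset(n, seed=0):
    """Hypothetical stand-in data: y = 3x + 2 plus a little noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 3.0 * x + 2.0 + rng.gauss(0.0, 0.5)
        data.append((x, y))
    return data

def train_test_split(data, test_fraction=0.2, seed=1):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(train):
    """Training step: ordinary least squares for y = slope * x + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(model, data):
    """Root-mean-square prediction error of the model on a dataset."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in data]
    return (sum(squared) / len(squared)) ** 0.5

data = synthetic_dataset(200)
train, test = train_test_split(data)
model = fit_line(train)          # learn from the training data only
test_rmse = rmse(model, test)    # judge accuracy on data the model never saw
print(f"slope={model[0]:.3f}, intercept={model[1]:.3f}, test RMSE={test_rmse:.3f}")
```

The key point is that the error is measured on the held-out samples, so it reflects how the model might behave on inputs it has not seen.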

CFD Applications

A popular optimization technique for engine designers is the genetic algorithm (GA). CONVERGE includes such a tool, CONGO, which takes a “survival of the fittest” approach to optimizing a design. That is, the method pits individuals (designs) against each other in a population, varying a set of user-defined parameters. Each individual carries its own values of the parameters to optimize, such as combustion phasing, combustion shape design, etc. The goal of a GA study is to optimize a result such as indicated specific fuel consumption while staying within constraints such as emissions or peak cylinder pressure.
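To make the idea concrete, here is a minimal genetic algorithm sketch in Python. It is not CONGO and does not reflect how CONGO is implemented. The two-parameter objective, the constraint, the penalty weight, and the population settings are all hypothetical stand-ins chosen so the example runs in a fraction of a second.

```python
import random

def objective(design):
    # Hypothetical stand-in for a result to minimize (e.g., fuel consumption).
    x0, x1 = design
    return 1.0 + (x0 - 0.3) ** 2 + (x1 - 0.7) ** 2

def constraint_violation(design):
    # Hypothetical stand-in for a limit (e.g., peak pressure): x0 + x1 <= 0.9.
    return max(0.0, design[0] + design[1] - 0.9)

def fitness(design):
    # Lower is better; designs that break the constraint are heavily penalized.
    return objective(design) + 100.0 * constraint_violation(design)

def tournament(population, rng, k=3):
    # Selection: the fittest of k randomly chosen individuals becomes a parent.
    return min(rng.sample(population, k), key=fitness)

def make_child(parent_a, parent_b, rng, sigma=0.05):
    # Crossover (take each gene from either parent) followed by mutation.
    child = []
    for a, b in zip(parent_a, parent_b):
        gene = rng.choice((a, b)) + rng.gauss(0.0, sigma)
        child.append(min(1.0, max(0.0, gene)))  # keep parameters in [0, 1]
    return tuple(child)

def run_ga(pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        elite = min(population, key=fitness)
        best_history.append(fitness(elite))
        children = [
            make_child(tournament(population, rng), tournament(population, rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + children  # elitism: the best design always survives
    return min(population, key=fitness), best_history

best, history = run_ga()
print("best design:", best, "fitness:", fitness(best))
```

Note that each generation is built from the previous one, so the generations cannot be evaluated at the same time. When every fitness evaluation is a CFD simulation, that dependency is what makes a GA study slow.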

By definition, a genetic algorithm study needs to run for many successive generations. One of the primary drawbacks of this technique is that a full study can take a long time, sometimes months. This is because each generation must wait for the results of the previous one, and most engine CFD simulations require between a day and a week for individual results. Engine researchers often need a solution faster than a GA optimization of CFD can provide. To address this, Moiz et al. combined machine learning with genetic algorithm optimization to quickly develop gasoline compression ignition (GCI) engine designs. The engine analyzed in the work uses a low-octane gasoline fuel in partially premixed compression ignition.

First, scientists ran a large space-filling design of experiments (DoE), consisting of 2048 individual CONVERGE simulations, to create a training dataset. Since the DoE can be defined all at once, the simulations ran concurrently. Thanks to large HPC clusters like the Mira supercomputer at Argonne National Laboratory, the entire DoE of CFD simulations ran in a few days. The authors also investigated using smaller subsets of the training dataset to see if a less expensive DoE would be sufficient. They found that the learning curves were promising down to a DoE with a sample size of 300.
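A space-filling design can be generated entirely up front, which is why all of its simulations can run at once. The sketch below uses Latin hypercube sampling, one common space-filling method. This article does not say which method the authors used, so treat that choice as an assumption. The three parameter ranges are hypothetical placeholders as well.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points; each parameter range is split into n_samples
    equal strata, and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each stratum, expressed on [0, 1).
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)  # decouple the strata order between parameters
        columns.append([low + p * (high - low) for p in points])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

# Hypothetical design parameters and ranges, for illustration only.
bounds = [
    (-30.0, -10.0),    # placeholder: an injection timing
    (1000.0, 2000.0),  # placeholder: an injection pressure
    (0.0, 0.5),        # placeholder: a recirculated exhaust gas fraction
]

design = latin_hypercube(2048, bounds)
print(len(design), "designs; first:", design[0])
```

Because no design depends on the result of another, the whole list can be handed to a cluster as independent jobs.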

An emerging combustion technology like GCI has ample room for optimization to maximize efficiency and minimize emissions, and computational studies are ideal for this task. In the current work, the authors employed a machine learning genetic algorithm approach to shorten the design cycle for optimizing a GCI engine and to overcome the above-mentioned obstacles. Their general procedure was as follows:

  1. Ran over two thousand high-fidelity CFD simulations in CONVERGE to create a large dataset on which to train the machine learning model,
  2. Trained and tested the machine learning model on the CFD data,
  3. Used the trained machine learning model as an emulator of the design space to optimize the engine designs.
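The three steps above can be sketched end to end in a few lines of Python. This is a toy under stated assumptions, not the authors' implementation: `expensive_simulation` is a hypothetical analytic stand-in for a CFD run, the emulator is a simple k-nearest-neighbors average swapped in for whatever model the authors trained, and the sample counts are arbitrary.

```python
import random

def expensive_simulation(design):
    # Hypothetical stand-in for one CFD run; lower output is better.
    x0, x1 = design
    return (x0 - 0.3) ** 2 + (x1 - 0.7) ** 2

def knn_predict(train, design, k=5):
    # Emulator: average the outputs of the k nearest training designs.
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], design))
    nearest = sorted(train, key=squared_distance)[:k]
    return sum(y for _, y in nearest) / len(nearest)

rng = random.Random(0)

# Step 1: sample the design space and "run" every simulation.
# These runs are independent, so in practice they could execute in parallel.
designs = [(rng.random(), rng.random()) for _ in range(500)]
dataset = [(d, expensive_simulation(d)) for d in designs]

# Step 2: train on most of the data and test on the held-out remainder.
train, test = dataset[:400], dataset[400:]
test_mae = sum(abs(knn_predict(train, d) - y) for d, y in test) / len(test)

def emulator(design):
    return knn_predict(train, design)

# Step 3: run a small genetic algorithm against the emulator, not the simulation.
population = [(rng.random(), rng.random()) for _ in range(30)]
for _ in range(30):
    ranked = sorted(population, key=emulator)
    parents = ranked[:10]
    children = []
    while len(children) < 29:
        a, b = rng.sample(parents, 2)
        child = tuple(
            min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, 0.05)))
            for pair in zip(a, b)
        )
        children.append(child)
    population = [ranked[0]] + children  # keep the best design found so far

best_design = min(population, key=emulator)
confirmed = expensive_simulation(best_design)  # one final check with the "real" model
print("emulator test MAE:", test_mae)
print("best design:", best_design, "confirmed output:", confirmed)
```

In this sketch the genetic algorithm never calls the expensive function during the search. Only the initial dataset and the final confirmation do, which is the source of the speedup.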

The machine learning genetic algorithm (ML GA) procedure offers a speed advantage over a traditional GA optimization. First, engineers can run the initial CFD simulations in parallel, generating seed data very quickly. Second, the ML GA emulator can evaluate an individual design in a few seconds, whereas a high-fidelity CFD simulation can take around 12 hours on 128 processors.

In a GA optimization with CFD results as the objective function, sequentially running the CFD simulations is a bottleneck in the process. The ML GA approach, however, reduces the time significantly, allowing a full optimization in approximately a day. An additional benefit of this technique is that engineers can use the initial space-filling DoE datasets for future design space interrogation or uncertainty analyses.

Machine learning is a powerful tool that is becoming ubiquitous in software applications. It is only natural that, when combined with CFD, ML GA methods help designers optimize engine efficiency and performance more rapidly.


[1] Moiz, A., Pal, P., Probst, D., Pei, Y., Zhang, Y., Som, S., and Kodavasal, J., “A Machine Learning-Genetic Algorithm (ML-GA) Approach for Rapid Optimization Using High-Performance Computing,” SAE Paper 2018-01-0190, 2018. DOI:10.4271/2018-01-0190