ChatGPT and other deep generative models have proven to be uncanny mimics. These AI models can compose poetry, complete symphonies, and create new videos and images by automatically learning from millions of examples of previous work. These enormously powerful and versatile tools excel at generating new content that resembles anything they have seen before.
But as MIT engineers say in a new study, if you want to be truly innovative in engineering tasks, similarity isn’t enough.
“Deep generative models (DGMs) are very promising, but they also have inherent flaws,” said study author Lyle Regenwetter, a graduate student in mechanical engineering at MIT. “The goal of these models is to mimic the data set. But as engineers and designers, we often don’t want to create designs that already exist.”
He and his colleagues believe that if mechanical engineers need AI’s help to generate novel ideas and designs, they must first refocus these models beyond “statistical similarity.”
“The performance of many models is explicitly related to how statistically similar the generated samples are to samples the model has already seen,” said co-author Faez Ahmed, an assistant professor of mechanical engineering at MIT. “But in design, if you want to innovate, it can be important to be different.”
In their research, Ahmed and Regenwetter reveal the pitfalls of deep generative models when they are applied to engineering design problems. In a case study of bicycle frame design, the team showed that these models ultimately produced new frames that mimicked previous designs but fell short on engineering performance and requirements.
When the researchers presented the same bicycle frame problem to DGMs that they designed specifically with engineering-focused objectives in mind, rather than statistical similarity alone, the models produced more innovative, higher-performing frames.
The team’s results show that similarity-focused AI models don’t fully translate when applied to engineering problems. But, as the researchers emphasize in their study, with careful planning of task-appropriate metrics, AI models could become effective design “co-pilots.”
“It’s about how artificial intelligence can help engineers create innovative products better and faster,” Ahmed said. “To do that, we have to first understand the requirements. This is a step in that direction.”
The team’s new research was recently published online and will appear in the December print edition of the journal Computer-Aided Design. The research was a collaboration between computer scientists at the MIT-IBM Watson AI Lab and mechanical engineers in MIT’s DeCoDE Lab. Co-authors of the study include Akash Srivastava and Dan Gutfreund of the MIT-IBM Watson AI Lab.
Framing the problem
As Ahmed and Regenwetter write, DGMs are “powerful learners with unrivaled capabilities” to process large amounts of data. DGM is a broad term for any machine learning model that is trained to learn the distribution of a data set and then use it to generate new, statistically similar content. The popular ChatGPT is one type of deep generative model, known as a large language model (LLM), which incorporates natural language processing capabilities to generate realistic text in response to conversational queries. Other popular models, for image generation, include DALL-E and Stable Diffusion.
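As a rough illustration of that “learn the distribution, then sample from it” idea, the minimal sketch below fits a simple mixture model to an invented table of design parameters and draws new, statistically similar samples. The data set, parameter names, and choice of model are purely hypothetical stand-ins for a far more expressive deep generative model, not anything used in the study.

```python
# Illustrative sketch only: the common thread behind deep generative models is
# "fit the data distribution, then sample from it." A simple Gaussian mixture
# stands in here for a much more expressive deep model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical dataset: each row is an existing design described by a few
# numeric parameters (e.g., tube lengths and a diameter of a bicycle frame).
existing_designs = rng.normal(loc=[50.0, 30.0, 2.0],
                              scale=[5.0, 3.0, 0.2],
                              size=(1000, 3))

# "Training" = estimating the distribution of the designs the model has seen.
model = GaussianMixture(n_components=4, random_state=0).fit(existing_designs)

# "Generation" = drawing new samples that are statistically similar to the data.
new_designs, _ = model.sample(20)
print(new_designs[:3])
```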
DGMs are increasingly used in several engineering fields for their ability to learn from data and generate realistic samples. Designers have used deep generative models to draft new aircraft frames, metamaterial designs, and optimal geometries for bridges and cars. But in most cases, these models imitate existing designs without improving on their performance.
“Designers working with DGMs kind of miss the point, which is to adjust the model’s training objectives to focus on the design requirements,” Regenwetter said. “So one ends up generating designs that are very similar to the data set.”
In the new research, he outlines the main pitfalls of applying DGMs to engineering tasks and shows that the fundamental objectives of standard DGMs do not account for specific design requirements. To illustrate this point, the team turned to a simple bicycle frame design case and demonstrated that problems can arise as early as the initial learning phase. When a model learns from thousands of existing bicycle frames of varying sizes and shapes, it may consider two similarly sized frames to have similar performance, when in fact a small disconnect in one frame, too small to register as a significant difference in statistical similarity, makes that frame much weaker than the other, visually similar frame.
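A toy numerical example, not drawn from the study, makes the same point: a performance function with a sharp physical threshold can separate two frames that any statistical-similarity measure would treat as nearly identical. The parameter vectors, the threshold, and the strength values below are invented purely for illustration.

```python
# Toy illustration (not the study's simulation): two frames can be nearly
# indistinguishable in parameter space yet differ sharply in performance once
# a physical threshold is crossed.
import numpy as np

def toy_frame_strength(tube_thickness_mm: float) -> float:
    """Hypothetical performance model: strength collapses below a wall-thickness
    threshold, a nonlinearity that distance in parameter space cannot see."""
    return 100.0 if tube_thickness_mm >= 1.0 else 20.0

frame_a = np.array([50.0, 30.0, 1.01])  # tube lengths (cm) and wall thickness (mm)
frame_b = np.array([50.0, 30.0, 0.99])  # statistically near-identical frame

print(np.linalg.norm(frame_a - frame_b))   # tiny parameter difference (~0.02)
print(toy_frame_strength(frame_a[2]))      # 100.0
print(toy_frame_strength(frame_b[2]))      # 20.0, a far weaker frame
```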
Beyond “vanilla”
The researchers continued with the bicycle example to see what designs a DGM would actually generate after learning from existing designs. They first tested a conventional “vanilla” generative adversarial network (GAN), a type of model that has been widely used for image and text synthesis and is tuned simply to generate statistically similar content. They trained the model on a dataset of thousands of bicycle frames, including commercially manufactured designs and less conventional one-off frames designed by hobbyists.
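For readers curious what a “vanilla” GAN objective looks like in practice, here is a minimal, hypothetical PyTorch sketch. It trains on random stand-in data rather than the study’s frame dataset, and the network sizes are arbitrary; the point is that the generator’s loss rewards only fooling the discriminator, that is, statistical similarity, with no notion of frame performance.

```python
# Minimal "vanilla" GAN sketch over design-parameter vectors (illustrative only;
# the study's dataset and architectures are more involved).
import torch
import torch.nn as nn

torch.manual_seed(0)

n_params, latent_dim = 8, 4           # assumed sizes for a toy frame parameterization
frames = torch.randn(512, n_params)   # stand-in for thousands of real frame designs

generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, n_params))
discriminator = nn.Sequential(nn.Linear(n_params, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = frames[torch.randint(len(frames), (64,))]
    fake = generator(torch.randn(64, latent_dim))

    # Discriminator learns to tell real frames from generated ones.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator is rewarded purely for fooling the discriminator, i.e. for
    # producing frames that look like the dataset. No performance term at all.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```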
Once the model had learned from the data, the researchers asked it to generate hundreds of new bicycle frames. The model produced realistic designs that resembled existing frames, but none of them showed significant improvement in performance, and some were even slightly worse, with heavier frames and weaker construction.
The team then conducted the same test with two other DGMs that were designed for engineering tasks. The first was a model Ahmed had previously developed to generate high-performing airfoil designs, built to prioritize both statistical similarity and functional performance. When applied to the bicycle frame task, this model produced realistic designs that were lighter and stronger than existing designs. But it also produced physically “invalid” frames, with components that did not fit together or overlapped in physically impossible ways.
“We saw designs that clearly outperformed the dataset, but we also saw designs that were geometrically incompatible because the model was not focused on meeting the design constraints,” Regenwetter said.
The last model the team tested was one built by Regenwetter to generate new geometries. This model shared the priorities of the previous one but added design constraints, prioritizing physically feasible frames, for example frames with no disconnected or overlapping bars. This last model produced the highest-performing designs that were also physically feasible.
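The general idea behind such engineering-focused objectives, sketched below in illustrative form rather than as the authors’ actual formulation, is to augment the similarity term with penalties for poor predicted performance and for violating validity constraints. The loss weights, the surrogate performance score, and the constraint-violation measure are all assumptions for the sake of the example.

```python
# Hedged sketch of an engineering-focused generator objective (not the authors'
# exact formulation): similarity alone is no longer the whole loss.
import torch

def generator_loss(adversarial_term: torch.Tensor,
                   predicted_weight: torch.Tensor,
                   constraint_violation: torch.Tensor,
                   w_perf: float = 1.0,
                   w_valid: float = 10.0) -> torch.Tensor:
    """All inputs are assumed to be differentiable tensors produced elsewhere:
    - adversarial_term: the standard statistical-similarity (GAN) loss
    - predicted_weight: output of a surrogate model scoring, e.g., frame weight
    - constraint_violation: >= 0, measuring geometric infeasibility such as
      disconnected or overlapping bars
    """
    return adversarial_term + w_perf * predicted_weight + w_valid * constraint_violation

# Toy usage with placeholder values:
loss = generator_loss(torch.tensor(0.7), torch.tensor(2.3), torch.tensor(0.0))
print(loss)  # prints the combined generator loss
```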
“We found that when the model goes beyond statistical similarity, it can come up with designs that are better than existing designs,” Ahmed said. “It’s proof of what AI can do if it is explicitly trained on a design task.”
If DGMs could take other priorities into account, such as performance, design constraints, and novelty, Ahmed predicts that “many engineering fields, such as molecular design and civil infrastructure, would benefit greatly. By revealing the pitfalls of relying solely on statistical similarity, we hope to inspire new pathways and strategies in generative AI applications beyond multimedia.”