Google recently admitted that the image generation feature in its conversational AI app Gemini produced some inaccurate and potentially offensive results. The company has paused the feature while it studies what steps need to be taken to correct it.
It’s easy to laugh off these errors or get outraged by their absurdity, and some even think there’s some kind of racial conspiracy involved.
Android & Chill
Android & Chill is one of the longest-running tech columns on the web, covering Android, Google, and all things tech every Saturday.
It’s possible, but extremely unlikely. Google is in the business of telling you what you want to know, and the company didn’t go into business to make the world a better place. It exists to make money, and controversy doesn’t help with that.
So what went wrong, and why does Gemini falter when asked to create accurate images of people?
Too much of a good thing?
Well, I thought people exaggerated this stuff, but this was the first image request I tried with Gemini. pic.twitter.com/Oipcn96wMh (February 21, 2024)
One of the main issues is over-correction for inclusion and diversity. Google wants to eliminate potential bias in its image generation model, but that adjustment had unintended side effects. Rather than simply avoiding unfair stereotypes, Gemini sometimes seems to insert diversity where it fits neither historical fact nor the specific prompt. A request for “doctors in the 1940s” might produce images featuring doctors of various races, even though that would not be an accurate representation of the time.
Google needs to do this, and it has nothing to do with being “woke.” The people who program and train AI models do not represent everyone. For example, Joe from Indiana doesn’t have much in common with Fadila from Tanzania. Both can use Google Gemini, and both expect inclusive results. Google just went too far in one direction.
To ensure inclusivity and avoid bias, Gemini’s image generation is tuned to prioritize diverse representation in its output. In some cases, however, that tuning misfires.
When a user requests an image of a person from a specific background, the model does not always generate an accurate image, instead prioritizing people from a variety of backgrounds regardless of whether they fit the prompt. That’s why we see things like an African-American George Washington or a female pope. AI is only as good as the software that powers it, because it isn’t actually smart.
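To make the failure mode concrete, here is a minimal sketch of how a naive prompt-rewriting layer could produce results like these. It is purely illustrative: the attribute list and the diversify_prompt function are invented for this example and say nothing about how Gemini is actually built.

```python
# Hypothetical illustration of an over-aggressive prompt-diversification layer.
# This is NOT Gemini's code; it only shows how blindly injecting diversity
# attributes can override a prompt that names a specific person or era.

DIVERSITY_ATTRIBUTES = ["South Asian", "Black", "East Asian", "Hispanic", "white"]

def diversify_prompt(prompt: str) -> list[str]:
    """Expand one user prompt into several, each forcing a different attribute."""
    # The bug: attributes are appended unconditionally, with no check for
    # whether the prompt is historically or contextually specific.
    return [f"{prompt}, depicted as a {attr} person" for attr in DIVERSITY_ATTRIBUTES]

if __name__ == "__main__":
    for variant in diversify_prompt("a portrait of George Washington"):
        print(variant)  # every variant overrides the historically accurate depiction
```

A smarter layer would first ask whether the prompt already pins down who or when is being depicted, and only diversify when it doesn’t.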
To its credit, Google recognized the mistake and didn’t try to dodge the issue. Jack Krawczyk, Google’s senior director of product management for Gemini Experiences, told the New York Post:
“We’re working to improve these types of depictions immediately. Gemini’s AI image generation does generate a variety of people. That’s generally a good thing because people around the world are using it. But it missed the mark here.”
In addition to accounting for diversity and inclusion, the model is also tuned to avoid producing harmful content or replicating harmful stereotypes. This caution, while well-intentioned, became a problem of its own: in some cases, Gemini refuses to generate certain images altogether, even when there is no apparent malicious intent behind the prompt.
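The refusal problem can be sketched the same way. Below is a deliberately crude keyword filter, assuming a hypothetical blocklist and should_refuse function; it only illustrates how context-free caution ends up blocking harmless requests, not how Gemini’s safety system actually works.

```python
# Hypothetical illustration of an over-broad safety filter.
# The blocklist and logic are invented for this sketch.

BLOCKED_TERMS = {"war", "weapon", "soldier"}  # overly broad keyword matching

def should_refuse(prompt: str) -> bool:
    """Refuse whenever any blocked term appears, ignoring context entirely."""
    words = prompt.lower().split()
    return any(term in words for term in BLOCKED_TERMS)

if __name__ == "__main__":
    # A harmless history-themed request gets refused because "soldier" matches.
    print(should_refuse("a painting of a roman soldier guarding a gate"))  # True
```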
These two issues combined mean Gemini sometimes produces strange or inaccurate images, especially when depicting people. Generative AI is very different from the AI that powers many of the other Google products installed on your phone, and it requires far more attention.
The way forward
Google has recognized these issues and the need to balance inclusivity with historical and contextual accuracy. This is a difficult challenge for generative AI models. While preventing the reinforcement of harmful stereotypes is a noble goal, it shouldn’t come at the expense of models simply doing what they’re told to do.
Finding this balance is critical to the future success of image-generating AI. Google and other companies working in this space will need to carefully refine their models so they deliver inclusive results while accurately handling a wider range of user prompts.
It’s important to remember that this kind of technology is still in its early stages. While disappointing, setbacks like this are an important part of the learning process that will ultimately lead to more capable and reliable generative AI.
Generative AI models need to be fine-tuned to strike a balance between inclusivity and accuracy. When trying to account for potential biases, models can become overly cautious and produce incomplete or misleading results. Developing better image-generating AI is an ongoing challenge.
Google’s mistake was not explaining what was going on in a way the average person could understand. Most people have no interest in how an AI model is trained, but in this case it matters why the model was trained a certain way. Google could have explained all of this on one of its many blogs and avoided much of the controversy over Gemini being bad at something.