Why Google’s AI tool was slammed for showing images of people of colour
America’s founding fathers depicted as Black women and Ancient Greek warriors as Asian women and men – this was the world reimagined by Google’s generative AI tool, Gemini, in late February.
The launch of the new image generation feature sent social media platforms into a flurry of intrigue and confusion. When users entered prompts to create AI-generated images of people, Gemini largely showed them results featuring people of colour – whether or not that was appropriate to the prompt.
What is Google Gemini?
Google’s first contribution to the AI race was a chatbot named Bard.
Bard, a conversational AI programme or “chatbot” that can simulate conversation with users, was announced by Google CEO Sundar Pichai on February 6, 2023, and released for use on March 21, 2023.
It could churn out essays or even code when given written prompts by the user, hence the label “generative AI”.
Google later said that Gemini would replace Bard, and both free and paid versions of Gemini were made available to the public through its website and smartphone application. Google announced that Gemini would work with different types of input and output, including text, images and videos.
It was Gemini’s image generation feature, however, that gained the most attention, due to the controversy surrounding it.
What sort of images did Gemini generate?
Images depicting women and people of colour during historical events or in positions historically held by white men were the most controversial. For example, one render displayed a pope who was seemingly a Black woman.
In the history of the Catholic Church, there have potentially been three Black popes, with the last Black pope’s service ending in 496 AD. There is no recorded evidence of a female pope in the Vatican’s official history, but a medieval legend suggests that a young woman, Pope Joan, disguised herself as a man and served as pope in the ninth century.
How does Gemini work?
Gemini is a generative AI system that combines the models behind Bard – such as LaMDA, which makes the AI conversational and intuitive, and Imagen, a text-to-image technology – explained Margaret Mitchell, chief ethics scientist at the AI startup Hugging Face.
Generative AI tools are loaded with “training data” from which they draw information to answer questions and prompts input by users.
The tool works with “text, images, audio and more at the same time”, according to a blog post written by Pichai and Demis Hassabis, CEO and co-founder of the British-American AI lab Google DeepMind.
“It can take text prompts as inputs to produce likely responses as output, where ‘likely’ here means roughly ‘statistically probable’ given what it’s seen in the training data,” Mitchell explained.
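To make “statistically probable” concrete, here is a deliberately simplified, hypothetical sketch in Python: a toy bigram model with invented example phrases, nothing like Gemini’s actual system, showing how a text generator tends to produce whichever continuation it has seen most often in its training data.

```python
import random
from collections import Counter, defaultdict

# Toy "training data": the only text this miniature model ever sees.
training_text = (
    "the pope wore white robes . the pope wore white robes . "
    "the pope wore red robes ."
).split()

# Count how often each word follows each preceding word (a bigram model).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    follow_counts[prev][nxt] += 1

def sample_next(word: str) -> str:
    """Pick a next word with probability proportional to its training frequency."""
    counts = follow_counts[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# "white" follows "wore" twice as often as "red" in the toy data, so it is
# sampled roughly twice as often: the "statistically probable" output.
print(Counter(sample_next("wore") for _ in range(1_000)))
```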
Does generative AI have a bias problem?
Generative AI models have been criticised for what is seen as bias in their algorithms, particularly when they overlook people of colour or perpetuate stereotypes in the results they generate.
AI, like other technology, runs the risk of amplifying pre-existing societal prejudices, according to Ayo Tometi, co-creator of the US-based anti-racist movement Black Lives Matter.
Artist Stephanie Dinkins has been experimenting with AI’s ability to realistically depict Black women for the past seven years. Dinkins found AI tended to distort facial features and hair texture when given prompts to generate images. Other artists who have tried to generate images of Black women using different platforms such as Stability AI, Midjourney or DALL-E have reported similar issues.
Critics also say that generative AI models tend to over-sexualise the images of Black and Asian women they generate. Some Black and Asian women have also reported that AI generators lighten their skin colour when they have used AI to generate images of themselves.
Instances like these happen when those uploading the training data do not include people of colour or people who are not part of “the mainstream culture”, said data reporter Lam Thuy Vo in an episode of Al Jazeera’s Digital Dilemma. A lack of diversity among those inputting the training data for image-generation AI can result in the AI “learning” biased patterns and similarities within the images, and using that knowledge to generate new images.
Furthermore, training data is collected from the internet, where a huge range of content and images can be found, including material that is racist and misogynistic. Learning from that training data, the AI may replicate it.
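As a rough, hypothetical illustration of how that skew carries through (the numbers below are invented and do not come from any real model or dataset), a generator that simply reproduces the frequencies in its training set will underrepresent whatever the training set underrepresents.

```python
import random
from collections import Counter

# Invented training-set composition, purely for illustration:
# 95 examples labelled "group_a" and only 5 labelled "group_b".
training_labels = ["group_a"] * 95 + ["group_b"] * 5

def generate(n: int) -> Counter:
    """Sample n outputs in proportion to each label's frequency in the training data."""
    return Counter(random.choices(training_labels, k=n))

# Roughly 95% of generated outputs depict group_a, so the imbalance
# in the data becomes an imbalance in what the model produces.
print(generate(10_000))
```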
The people who are least prioritised in data sets, therefore, are more likely to encounter technology that does not account for them – or depict them correctly – which can lead to and perpetuate discrimination.