When Microsoft released an artificially intelligent chatbot named Tay on Twitter in March 2016, things took a predictably disastrous turn. Within 24 hours, the bot was spewing racist, neo-Nazi rants, many of which it picked up by incorporating the language of Twitter users who interacted with it.
Unfortunately, new research finds that Twitter trolls aren’t the only way that AI devices can learn racist language. In fact, any artificial intelligence that learns from human language is likely to come away biased in the same ways that humans are, according to the scientists.
The researchers experimented with a widely used machine-learning system called the Global Vectors for Word Representation (GloVe) and found that every sort of human bias they tested showed up in the artificial system.
“It was astonishing to see all the results that were embedded in these models,” said Aylin Caliskan, a postdoctoral researcher in computer science at Princeton University. Even AI devices that are “trained” on supposedly neutral texts like Wikipedia or news articles came to reflect common human biases, she told Live Science.
GloVe is a tool used to extract associations from texts — in this case, a standard corpus of language pulled from the World Wide Web.
Psychologists have long known that the human brain makes associations between words based on their underlying meanings. A tool called the Implicit Association Test uses reaction times to demonstrate these associations: People see a word like “daffodil” alongside pleasant or unpleasant concepts like “beauty” or “pain” and have to quickly associate the terms using a key press. Unsurprisingly, flowers are more quickly associated with positive concepts, while weapons, for example, are more quickly associated with negative concepts.
The IAT can be used to reveal unconscious associations people make about social or demographic groups, as well. For example, some IATs that are available on the Project Implicit website find that people are more likely to automatically associate weapons with black Americans and harmless objects with white Americans.
There are debates about what these results mean, researchers have said. Do people make these associations because they hold personal, deep-seated social biases they aren’t aware of, or do they absorb them from language that is statistically more likely to put negative words in close conjunction with ethnic minorities, the elderly and other marginalized groups?
Caliskan and her colleagues developed an IAT for computers, which they dubbed the WEAT, for Word-Embedding Association Test. This test measured the strength of associations between words as represented by GloVe, much as the IAT measures the strength of word associations in the human brain.
For every association and stereotype tested, the WEAT returned the same results as the IAT. The machine-learning tool reproduced human associations between flowers and pleasant words; insects and unpleasant words; musical instruments and pleasant words; and weapons and unpleasant words. In a more troubling finding, it saw European-American names as more pleasant than African-American names. It also associated male names more readily with career words, and female names more readily with family words. Men were more closely associated with math and science, and women with the arts. Names associated with old people were more unpleasant than names associated with young people.
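To make the idea concrete, here is a minimal sketch of how an association test over word vectors might be scored, assuming that closeness between words is measured by cosine similarity. The tiny two-dimensional vectors, the word lists and the function names are all invented for illustration; they are not real GloVe embeddings and this is not the researchers' code.

```python
import math
from statistics import mean, stdev

# Toy 2-D "embeddings", invented for illustration only (not real GloVe vectors).
VECTORS = {
    "daffodil": [0.9, 0.1], "rose": [0.8, 0.3],
    "wasp": [0.1, 0.9], "maggot": [0.2, 0.8],
    "beauty": [1.0, 0.1], "love": [0.9, 0.2],
    "pain": [0.1, 1.0], "filth": [0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word, pleasant, unpleasant):
    """Mean similarity to pleasant words minus mean similarity to unpleasant words."""
    vec = VECTORS[word]
    to_pleasant = mean([cosine(vec, VECTORS[p]) for p in pleasant])
    to_unpleasant = mean([cosine(vec, VECTORS[u]) for u in unpleasant])
    return to_pleasant - to_unpleasant

def effect_size(targets_x, targets_y, pleasant, unpleasant):
    """Difference in mean association between two target groups,
    scaled by the spread of associations across all target words."""
    sx = [association(w, pleasant, unpleasant) for w in targets_x]
    sy = [association(w, pleasant, unpleasant) for w in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)

flowers = ["daffodil", "rose"]
insects = ["wasp", "maggot"]
pleasant = ["beauty", "love"]
unpleasant = ["pain", "filth"]

score = effect_size(flowers, insects, pleasant, unpleasant)
print(f"flowers vs. insects effect size: {score:.2f}")
```

A positive score here means the first group of words (flowers) sits closer to the pleasant words than the second group (insects) does. Swapping in lists of names or occupation words is how the same kind of measurement could surface social stereotypes.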
“We were quite surprised that we were able to replicate every single IAT that was performed in the past by millions,” Caliskan said.
Using a second, similar method, the researchers also found that the machine-learning tool was able to accurately represent facts about the world from its semantic associations. Comparing the GloVe word-embedding results with real U.S. Bureau of Labor Statistics data on the percentage of women in occupations, Caliskan found a 90 percent correlation between professions that GloVe saw as “female” and the actual percentage of women in those professions.
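A comparison like this comes down to a correlation between two lists of numbers: one score per occupation from the word embeddings, and one percentage per occupation from labor statistics. The sketch below assumes a plain Pearson correlation and uses placeholder numbers made up for illustration; they are not Bureau of Labor Statistics figures or GloVe results.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Placeholder values for five hypothetical occupations (invented, not real data):
# how strongly each occupation word leans "female" in the embedding...
female_association = [-0.8, -0.3, 0.1, 0.5, 0.9]
# ...and the share of women working in that occupation, in percent.
percent_women = [10.0, 30.0, 45.0, 70.0, 90.0]

r = pearson(female_association, percent_women)
print(f"correlation: {r:.2f}")
```

A value near 1 would mean the embedding's sense of which jobs are "female" closely tracks the real-world share of women in those jobs.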
In other words, programs that learn from human language do get “a very accurate representation of the world and culture,” Caliskan said, even if that culture — like stereotypes and prejudice — is problematic. The AI is also bad at understanding context that humans grasp easily. For example, an article about Martin Luther King Jr. being jailed for civil rights protests in Birmingham, Alabama, in 1963 would likely associate a lot of negative words with African-Americans. A human would reasonably interpret the story as one of righteous protest by an American hero; a computer would add another tally to its “black=jail” category.
Retaining accuracy while getting AI tools to understand fairness is a big challenge, Caliskan said.
“We don’t think that removing bias would necessarily solve these problems, because it’s probably going to break the accurate representation of the world,” she said.
The new study, published online today (April 12) in the journal Science, is not surprising, said Sorelle Friedler, a computer scientist at Haverford College who was not involved in the research. It is, however, important, she said.
“This is using a standard underlying method that many systems are then built off of,” Friedler told Live Science. In other words, biases are likely to infiltrate any AI that uses GloVe, or that learns from human language in general.
Friedler is involved in an emerging field of research called Fairness, Accountability and Transparency in Machine Learning. There are no easy ways to solve these problems, she said. In some cases, programmers might be able to explicitly tell the system to automatically disregard specific stereotypes, she said. In any case involving nuance, humans may need to be looped in to make sure the machine doesn’t run amok. The solutions will likely vary, depending on what the AI is designed to do, Caliskan said: Is it for search applications, for decision making or for something else?
In humans, implicit attitudes actually don’t correlate very strongly with explicit attitudes about social groups. Psychologists have argued about why this is: Are people just keeping mum about their prejudices to avoid stigma? Does the IAT not actually measure prejudice that well? But it appears that people, even with their biased associations, at least have the ability to reason about right and wrong, Caliskan said. She and her colleagues think humans will need to be involved, and programming code will need to be transparent, so that people can make value judgments about the fairness of machines.
“In a biased situation, we know how to make the right decision,” Caliskan said, “but unfortunately, machines are not self-aware.”