Ever since Microsoft’s chatbot Tay started spouting racist commentary after 24 hours of interacting with humans on Twitter, it’s been obvious that our AI creations can fall prey to human prejudice. Now a group of researchers has figured out one reason why that happens. Their findings shed light on more than our future robot overlords, however. They’ve also worked out an algorithm that can actually predict human prejudices based on an intensive analysis of how people use English online.
The implicit bias test
Many AIs are trained to understand human language by learning from a massive corpus known as the Common Crawl. The Common Crawl is the result of a large-scale crawl of the Internet in 2014 that contains 840 billion tokens, or words. Princeton Center for Information Technology Policy researcher Aylin Caliskan and her colleagues wondered whether that corpus, created by millions of people typing away online, might contain biases that could be discovered by algorithm. To figure it out, they turned to an unusual source: the Implicit Association Test (IAT), which is used to measure often unconscious social attitudes.
People taking the IAT are asked to put words into two categories. The longer it takes for the person to place a word in a category, the less they associate the word with the category. (If you’d like to take an IAT, there are several online at Harvard University.) The IAT is used to measure bias by asking people to associate random words with categories like gender, race, disability, age, and more. Outcomes are often unsurprising: for example, most people associate women with family, and men with work. But that obviousness is actually evidence for the IAT’s usefulness in discovering people’s latent stereotypes about each other.
Using the IAT as a model, Caliskan and her colleagues created the Word-Embedding Association Test (WEAT), which analyzes chunks of text to see which concepts are more closely associated than others. The “word-embedding” part of the test comes from a project at Stanford called GloVe, which turns each word into a “vector representation,” basically a list of numbers derived from the words that tend to appear around it. Words used in similar contexts end up with similar vectors, so the vector for “dog” would sit close to the vectors for words like puppy, doggie, hound, canine, and all the various dog breeds. The idea is to get at the concept of dog, not the specific word. This is especially important if you are working with social stereotypes, where somebody might be expressing ideas about women by using words like “girl” or “mother.” To keep things manageable, the researchers used vectors with 300 dimensions, meaning each word is represented by 300 numbers.
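To make the idea of “closeness” between word vectors concrete, here is a minimal sketch using cosine similarity, a standard way to compare embedding vectors. The three-number vectors are invented for illustration and are not real GloVe values; real embeddings are much longer lists of numbers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, made up for this example (assumption: not taken from GloVe).
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_dog_puppy = cosine_similarity(toy_vectors["dog"], toy_vectors["puppy"])
sim_dog_car = cosine_similarity(toy_vectors["dog"], toy_vectors["car"])

# Words used in similar contexts should score higher than unrelated ones.
print(f"dog vs. puppy: {sim_dog_puppy:.3f}")
print(f"dog vs. car:   {sim_dog_car:.3f}")
```

In this toy setup, “dog” scores much closer to “puppy” than to “car,” which is the kind of relationship a real embedding is meant to capture.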
To see how concepts get associated with each other online, the WEAT looks at a variety of factors to measure their “closeness” in text. At a basic level, Caliskan told Ars, this means how many words apart the two concepts are, but it also accounts for other factors like word frequency. After going through an algorithmic transform, closeness in the WEAT is equivalent to the time it takes for a person to categorize a concept in the IAT. The further apart the two concepts, the more distantly they are associated in people’s minds.
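For readers who want to see roughly what such a test computes, here is a sketch of a WEAT-style effect size. It follows the general shape of the statistic described in the researchers’ Science paper (compare how strongly two sets of target words lean toward two sets of attribute words), but it is an illustration, not their code. The function names, the two-number toy vectors, and the use of the sample standard deviation are all assumptions made for this example.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, attrs_a, attrs_b):
    """How much closer word vector w is to attribute set A than to attribute set B."""
    mean_a = statistics.mean(cosine(w, a) for a in attrs_a)
    mean_b = statistics.mean(cosine(w, b) for b in attrs_b)
    return mean_a - mean_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference in mean association between the two target sets, scaled by the
    spread of associations across all target words (sample std dev: an assumption)."""
    scores_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    scores_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    spread = statistics.stdev(scores_x + scores_y)
    return (statistics.mean(scores_x) - statistics.mean(scores_y)) / spread


# Hypothetical 2-D vectors, invented so the pattern is easy to see.
male_terms = [[1.0, 0.1], [0.9, 0.2]]      # target set X
female_terms = [[0.1, 1.0], [0.2, 0.9]]    # target set Y
work_words = [[1.0, 0.0], [0.95, 0.05]]    # attribute set A
family_words = [[0.0, 1.0], [0.05, 0.95]]  # attribute set B

effect = weat_effect_size(male_terms, female_terms, work_words, family_words)
print(f"effect size: {effect:.2f}")
```

A positive value means the first target set sits closer to the first attribute set than the second target set does; swapping the two target sets flips the sign. With real embeddings, the inputs would be the 300-number vectors for actual words rather than these made-up pairs.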
The WEAT worked beautifully to discover biases that the IAT had found before. “We adapted the IAT to machines,” Caliskan said. And what that tool revealed was that “if you feed AI with human data, that’s what it will learn. [The data] contains biased information from language.” That bias will affect how the AI behaves in the future, too. As an example, Caliskan made a video (see above) where she shows how the Google Translate AI actually mistranslates words based on stereotypes it has learned about gender from the English language.
Imagine an army of bots unleashed on the Internet, replicating all the biases that they learned from humanity. That’s the future we’re looking at, if we don’t build some kind of corrective for the prejudices in these systems.
A problem that AI can’t solve
Though Caliskan and her colleagues found language was full of biases based on prejudice and stereotypes, it was also full of latent truths. In one test, they found strong associations between the concept of woman and the concept of nursing. This reflects a truth about reality, which is that nursing is a majority-female profession.
“Language reflects facts about the world,” Caliskan told Ars. She continued:
Removing bias or statistical facts about the world will make the machine model less accurate. But you can’t easily remove bias, so you have to learn how to work with it. We are self-aware, we can decide to do the right thing instead of the prejudiced option. But machines don’t have self awareness. An expert human might be able to aid in [the AIs’] decision-making process so outcome isn’t stereotyped or prejudiced for a given task.
The solution to the problem of human language is…humans. “I can’t think of many cases where you wouldn’t need a human to make sure that the right decisions are being made,” concluded Caliskan. “A human would know the edge cases for whatever the application is. Once they test the edge cases they can make sure it’s not biased.”
So much for the idea that bots will be taking over human jobs. Once we have AIs doing work for us, we’ll need to invent new jobs for humans who are testing the AIs’ results for accuracy and prejudice. Even when chatbots get incredibly sophisticated, they are still going to be trained on human language. And since bias is built into language, humans will still be necessary as decision-makers.
In a recent paper for Science about their work, the researchers say the implications are far-reaching. “Our findings are also sure to contribute to the debate concerning the Sapir Whorf hypothesis,” they write. “Our work suggests that behavior can be driven by cultural history embedded in a term’s historic use. Such histories can evidently vary between languages.” If you watched the movie Arrival, you’ve probably heard of Sapir-Whorf: it’s the hypothesis that language shapes consciousness. Now we have an algorithm that suggests this may be true, at least when it comes to stereotypes.
Caliskan said her team wants to branch out and try to find as-yet-unknown biases in human language. Perhaps they could look for patterns created by fake news or look into biases that exist in specific subcultures or geographical locations. They would also like to look at other languages, where bias is encoded very differently than it is in English.
“Let’s say in the future, someone suspects there’s a bias or stereotype in a certain culture or location,” Caliskan mused. “Instead of testing with human subjects first, which takes time, money, and effort, they can get text from that group of people and test to see if they have this bias. It would save so much time.”
Science, 2017. DOI: 10.1126/science.aal4230
from Ars Technica http://ift.tt/2o00CdZ