He's Brilliant, She's Lovely: Teaching Computers To Be Less Sexist
Computer programs often reflect the biases of their very human creators. That's been well established.
The question now is: How can we fix that problem?
Adam Kalai thinks we should start with the bits of code that teach computers how to process language. He's a researcher for Microsoft and his latest project — a joint effort with Boston University — has focused on something called a word embedding.
"It's kind of like a dictionary for a computer," he explains.
Essentially, word embeddings are algorithms that translate the relationships between words into numbers so that a computer can work with them. You can grab a word embedding "dictionary" that someone else has built and plug it into some bigger program that you are writing.
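To make the "dictionary" idea concrete, here is a minimal sketch in Python. The three-number vectors are invented for illustration (real embeddings are learned from text and have hundreds of dimensions), and the names EMBEDDING and similarity are hypothetical. It shows how relatedness between words becomes a number a program can compare.

```python
import math

# Toy vectors, invented for illustration only. A real embedding is
# learned from large amounts of text and is far larger.
EMBEDDING = {
    "programmer": [0.9, 0.8, 0.1],
    "software":   [0.8, 0.9, 0.0],
    "banana":     [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity(word_a, word_b):
    """Look both words up in the toy dictionary and compare their vectors."""
    return cosine_similarity(EMBEDDING[word_a], EMBEDDING[word_b])

# With these toy numbers, "programmer" sits closer to "software" than to "banana".
print(similarity("programmer", "software") > similarity("programmer", "banana"))
```

Cosine similarity is one common way to compare word vectors; other distance measures could be swapped in.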
For example, let's say you work for a big tech company and you're trying to hire a new employee. You're getting thousands and thousands of resumes a day. So, being tech savvy, you write a little program that searches through those resumes for the term "computer programmer."
A search program that has a word embedding algorithm plugged into it can also recognize words that are related to "computer programmer." It can bring resumes that contain more of those related words up to the top of the search pile, hopefully helping you to find the most qualified candidates without having to read through every single document.
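A rough sketch of that kind of search might look like the following. Everything here is an assumption made for illustration: the toy vectors, the sample resumes, and the scoring rule (average similarity between the query and each known word in a resume). A production system would be considerably more involved.

```python
import math

# Hypothetical toy vectors; a real program would load a pretrained embedding.
EMBEDDING = {
    "programmer": [0.9, 0.8, 0.1],
    "software":   [0.8, 0.9, 0.0],
    "developer":  [0.9, 0.7, 0.2],
    "python":     [0.7, 0.9, 0.1],
    "chef":       [0.1, 0.0, 0.9],
    "kitchen":    [0.0, 0.1, 0.8],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def score(resume_text, query):
    """Average similarity between the query word and each known resume word."""
    words = [w for w in resume_text.lower().split() if w in EMBEDDING]
    if not words:
        return 0.0
    return sum(cosine(EMBEDDING[w], EMBEDDING[query]) for w in words) / len(words)

def rank(resumes, query):
    """Sort resumes so the ones most related to the query come first."""
    return sorted(resumes, key=lambda text: score(text, query), reverse=True)

# Neither sample resume contains the literal word "programmer".
RESUMES = ["chef kitchen", "software developer python"]
ranked = rank(RESUMES, "programmer")
print(ranked[0])  # the resume made of related words rises to the top
```

The point of the sketch is that the ranking depends entirely on the numbers inside the embedding, which is why whatever associations those numbers carry end up shaping the results.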
Word embeddings do similar work in computer programs that we interact with every day — programs that target ads at us, decide what we see on social media, or work to improve Internet search results. They've even been used to classify NPR interviews.
But here's the problem: These word embeddings learn the relationships between words by studying human writing — like the hundreds of thousands of articles on Wikipedia or Google News.
"We try, especially in news articles, to avoid saying sexist things," Kalai says. Nevertheless, he says, "you find, within these word embeddings, some pretty blatantly sexist properties."
Kalai and his colleagues discovered the problem by using word embeddings to solve analogies.
They gave the word embedding one pair of words, like "he" is to "she," and asked it to provide a pair that it sees as having a similar relationship, completing the analogy.
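One simple way to sketch this analogy test is to treat each word as a vector and look for the pair of words whose difference points in the same direction as the difference between "he" and "she." The code below does that with invented two-number vectors. It is an illustration of the general idea, not the researchers' actual code, and the function name complete_analogy is hypothetical.

```python
import math
from itertools import permutations

# Toy two-dimensional vectors, invented for illustration.
EMBEDDING = {
    "he":    [1.0, 0.0],
    "she":   [-1.0, 0.0],
    "king":  [0.9, 0.8],
    "queen": [-0.9, 0.8],
    "apple": [0.0, -0.5],
    "pear":  [0.1, -0.6],
}

def subtract(a, b):
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def complete_analogy(word_a, word_b):
    """Find the ordered pair whose difference best matches word_a - word_b."""
    seed = subtract(EMBEDDING[word_a], EMBEDDING[word_b])
    candidates = [w for w in EMBEDDING if w not in (word_a, word_b)]
    return max(
        permutations(candidates, 2),
        key=lambda pair: cosine(seed, subtract(EMBEDDING[pair[0]], EMBEDDING[pair[1]])),
    )

best_pair = complete_analogy("he", "she")
print(best_pair)  # ('king', 'queen') with these toy numbers
```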
Some pairings were relatively inoffensive cultural stereotypes: "She" is to "he" as "pink" is to "blue." Others were mildly amusing, or even confusing. "She" is to "he" as "OMG" is to "WTF," or as "pregnancy" is to "kidney stone."
And then, there were more problematic pairings, like when the word embedding algorithm suggested that "he" is to "she" as "brilliant" is to "lovely."
Or as "computer programmer" is to "homemaker."
That last one is especially problematic. Let's go back to our example of how a big tech company might filter through prospective employees.
If the algorithm reads maleness as more connected to the profession of computer programming, it will bump male resumes to the top of the pile when it searches for "computer programmer."
The resumes of female candidates will wind up at the bottom of the pile. So would applications from black candidates, because the algorithm also picks up racial biases from articles.
That's a problem just on its face, but it's even more of an issue when you consider that another big reason computer programs wind up being inadvertently biased is the limited diversity among computer programmers themselves.
But here's the good news: Kalai and his colleagues have found a way to weed these biases out of word embedding algorithms. In a recent paper, they've shown that if you tell the algorithms to ignore certain relationships, they can extrapolate outwards.
To help make it clear, Kalai made a little graphic explanation. (Word embeddings actually work in a 300-dimensional vector space ... but we'll stick to two.)
Words are arranged according to their gender associations: words seen as masculine on the right, words seen as feminine on the left.
The words below the dividing line are words like mother, sister, father, uncle — words that a dictionary might classify as masculine or feminine.
The words above the line — nurse or sassy or computer programmer — have gender associations that are more problematic.
So Kalai and his colleagues want to bring the words above the line to a central, non-gendered point. Similar work can be done with other stereotypical markers of race, class, or ability.
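In two dimensions, bringing a word to a central, non-gendered point can be sketched as removing the part of its vector that lies along the "he" minus "she" direction. The code below is a simplified illustration under stated assumptions: the vectors are invented, the GENDER_DEFINED set is a hypothetical stand-in for the words below the line, and the researchers' actual technique involves more steps than this.

```python
import math

# Toy two-dimensional vectors, invented for illustration.
# The first number plays the role of the left-right gender axis.
EMBEDDING = {
    "he":         [1.0, 0.2],
    "she":        [-1.0, 0.2],
    "father":     [0.9, 0.3],
    "mother":     [-0.9, 0.3],
    "nurse":      [-0.6, 0.7],
    "programmer": [0.5, 0.8],
}

# Hypothetical list of words whose gender is part of their definition.
GENDER_DEFINED = {"he", "she", "father", "mother"}

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to length 1."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def neutralize(vector, direction):
    """Remove the component of vector that lies along the unit direction."""
    amount = dot(vector, direction)
    return [x - amount * d for x, d in zip(vector, direction)]

# The gender direction is estimated here from a single word pair.
gender_direction = normalize(
    [h - s for h, s in zip(EMBEDDING["he"], EMBEDDING["she"])]
)

# Leave gender-defined words alone; move every other word to the center.
debiased = {
    word: vec if word in GENDER_DEFINED else neutralize(vec, gender_direction)
    for word, vec in EMBEDDING.items()
}

print(debiased["nurse"], debiased["programmer"])
```

After this step, "nurse" and "programmer" have no component along the gender direction, while "mother" and "father" keep theirs. Which words belong in the protected set is exactly the kind of decision the article says should be made deliberately.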
Kalai wants to be clear, though: Their project is only offering a technique to de-bias word embeddings. They're very intentionally not offering the de-biased word embeddings themselves.
"We're computer scientists," he says. "We're not going to choose what is a good bias and what is a bad bias."
He says there may be some instances where having certain biases in your program is useful. You might want to be able to distinguish based on gender when targeting certain health-related ads, for example, or keep the biases in the algorithm so that you can study them.
For Kalai, the problem is not that people sometimes use word embedding algorithms that differentiate between gender or race, or even algorithms that reflect human bias.
The problem is that people are using the algorithms as a black-box piece of code, plugging them into larger programs without considering the biases they contain, and without making careful decisions about whether or not they should be there.
Copyright 2020 NPR. To see more, visit https://www.npr.org.