Artificial intelligence is often biased because the datasets used to train machine learning algorithms are biased. Researchers at MIT have developed a technique that ‘injects fairness’ into these algorithms, reducing or removing bias.
The use of artificial intelligence in business is on the rise, and it’s not slowing down. AI systems have repeatedly been shown to produce biased outputs, and the FTC has warned businesses that deploying such systems can result in penalties. Beyond penalties, biased output gives business leaders skewed information, and decisions based on skewed information won’t deliver the desired results. To combat bias in AI, researchers at MIT have developed a new technique that ‘injects fairness’ into machine learning algorithms.
Businesses use AI for facial recognition, predictive text, chatbots, data analytics, and more. But if the dataset used to train an algorithm is unbalanced, the results it produces will be biased. For example, if a facial recognition program is trained on data that contains more images of one skin tone than another, its predictions will be unfair when it’s deployed. And balanced datasets aren’t always available.
“In machine learning, it is common to blame the data for bias in models. But we don’t always have balanced data. So, we need to come up with methods that actually fix the problem with imbalanced data,” says lead author Natalie Dullerud, a graduate student in the Healthy ML Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.
Dullerud’s team studied deep metric learning, which uses a neural network to learn similarities between objects by mapping similar objects closer together and different objects farther apart. The network maps these objects, usually photos, into an embedding space where the similarity between two photos corresponds to the distance between them. Once trained, the algorithm can effectively identify its target, so long as the dataset is balanced. But, as noted above, balanced datasets are not always available, so the researchers came up with a technique that “introduces fairness directly into the model’s internal representation itself.” In other words, fairness is introduced in the embedding space during training, so bias is removed from the start.
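To make the idea concrete, here is a minimal, hypothetical sketch of deep metric learning in Python using PyTorch. The network architecture, the triplet loss, and the random toy data are illustrative assumptions, not the researchers’ code; the point is simply that an embedding network is trained so that the distance between embeddings acts as the learned similarity metric.

```python
# Minimal deep metric learning sketch (illustrative assumptions, not the MIT code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps an input (e.g. a face image flattened to a vector) into an embedding space."""
    def __init__(self, in_dim=1024, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x):
        # L2-normalize so distances in the embedding space are comparable.
        return F.normalize(self.net(x), dim=-1)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull similar items together and push dissimilar items apart:
    the distance between embeddings becomes the learned similarity metric."""
    d_pos = (anchor - positive).pow(2).sum(dim=-1)
    d_neg = (anchor - negative).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage: random "images" standing in for anchor / similar / dissimilar examples.
model = EmbeddingNet()
a, p, n = (torch.randn(8, 1024) for _ in range(3))
loss = triplet_loss(model(a), model(p), model(n))
loss.backward()
```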
The solution they developed is called Partial Attribute Decorrelation, or PARADE. The model is trained to learn a separate similarity metric for a sensitive attribute, such as skin tone, and then to decorrelate that metric from the targeted similarity metric. For example, the model learns to map the similarities of human faces by grouping similar features together without considering skin tone. Other attributes can be handled in the same way: the similarity metric for the sensitive attribute is learned in a separate embedding space, which is discarded after training so that only the targeted similarity metric remains.
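Below is a rough, hypothetical sketch of the decorrelation idea, not the authors’ PARADE implementation: a second embedding space is learned for the sensitive attribute, and a penalty discourages correlation between the two pairwise-distance structures. The correlation-based penalty and the `lam` weight are illustrative assumptions; `lam` stands in for the knob that controls how much decorrelation is applied.

```python
# Illustrative decorrelation sketch (assumed formulation, not the published PARADE loss).
import torch

def pairwise_dists(z):
    # Pairwise distances within a batch define the similarity metric.
    return torch.cdist(z, z)

def decorrelation_penalty(z_target, z_attr):
    # Penalize correlation between the target similarity metric and the
    # sensitive-attribute similarity metric.
    d_t = pairwise_dists(z_target).flatten()
    d_a = pairwise_dists(z_attr).flatten()
    d_t = d_t - d_t.mean()
    d_a = d_a - d_a.mean()
    corr = (d_t * d_a).sum() / (d_t.norm() * d_a.norm() + 1e-8)
    return corr.pow(2)

# Toy usage: embeddings from a target (identity) head and a hypothetical
# sensitive-attribute head; the attribute embedding space is only used
# during training and is discarded afterward.
z_target = torch.randn(16, 64)
z_attr = torch.randn(16, 16)
lam = 1.0  # decorrelation strength; tunable per application
penalty = lam * decorrelation_penalty(z_target, z_attr)
# total_loss = metric_loss + penalty  # combined with the usual metric-learning loss
```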
The method is applicable to many situations because the amount of decorrelation between the two similarity metrics can be controlled. Tuning that amount lets models make more balanced predictions while improving performance on downstream tasks.
The MIT article describes the research and methods in much more detail, but the fact that methods are being developed to prevent or remove bias in AI is a big step in the right direction. Businesses need to be aware of how the products they use perform and how that performance affects their bottom line. Bias in AI can draw FTC penalties, and it can also harm the decision-making process, both of which can lead to poor business performance. So it’s important for businesses to follow this research, to know that options are in the works to make AI more reliable and accurate, and to know that while AI isn’t perfect, it’s improving every day.