MIT Apologizes, Pulls Huge Dataset That Teaches AI Systems To Use Racist Slurs

Written by Ann Brown

Jul 02, 2020

MIT has apologized and taken offline a huge dataset that taught artificial intelligence systems to use racist and misogynistic slurs. Photo by Morning Brew on Unsplash

MIT has pulled the giant dataset it used to teach AI how to assign racist and misogynistic labels to people in images.

The database, known as “80 Million Tiny Images,” is a massive collection of photos with descriptive labels used to teach machine-learning models to identify images.

It was created in 2008 to help produce advanced object-detection techniques.

“It is, essentially, a huge collection of photos with labels describing what’s in the pics, all of which can be fed into neural networks to teach them to associate patterns in photos with the descriptive labels,” The Register reported. “So when a trained neural network is shown a bike, it can accurately predict a bike is present in the snap.”

The database was called “Tiny Images” because the pictures in its library are small enough for the computer-vision algorithms of the late 2000s and early 2010s to digest.

Here’s the problem. The dataset labeled women as “whores” and “bitches” and applied other derogatory terms to Black people and people of color. Pictures of Black people and monkeys, for example, were labeled with the N-word.

“It also contained close-up pictures of female genitalia labeled with the C-word and other images with the labels ‘rape suspect’ and ‘molester,’” The Daily Mail reported.

MIT has since apologized and removed the dataset, but only after media outlet The Register was alerted to the problem through concerns raised by two academics. The dataset’s “serious ethical shortcomings” were discovered by Vinay Prabhu, chief scientist at privacy startup UnifyID, and Abeba Birhane, a Ph.D. candidate at University College Dublin, The Next Web reported. They revealed their findings in the paper “Large image datasets: A pyrrhic win for computer vision?”, which is currently under peer review for the 2021 Winter Conference on Applications of Computer Vision.

“All AI is racist,” The Next Web’s Tristan Greene wrote. “Most people just don’t notice it unless it’s blatant and obvious. But AI isn’t a racist being like a person. It doesn’t deserve the benefit of the doubt, it deserves rigorous and constant investigation. When it recommends higher prison sentences for Black males than whites, or when it can’t tell the difference between two completely different Black men, it demonstrates that AI systems are racist. And, yet, we still use these systems.”

MIT isn’t the only institution of higher learning that used “Tiny Images.” New York University was also using the dataset and has taken it offline as well, VentureBeat reported.

Oh my. MIT pulled the Tiny Images AI training dataset after researchers found it contained misogynist, racist labels/imgs. Ideology -> Culture -> Data -> AI https://t.co/Deu1XOVbsJ
— Matt Bailey (@MattBailey0) July 2, 2020