Artificially Correct Hackathon October 2021
The event
Since 2021 the Goethe-Institut has brought together AI researchers and translators in the Artificially Correct project. The project focuses on AI-based translation and writing tools and the biases they can generate. In October 2021 the project hosted a hackathon with 12 teams from around the world.
I was asked to provide a challenge for the hackathon. The guidance text and starter resources I provided are reproduced below. Excitingly, one of the two winning teams addressed this challenge.
Challenge: Identifying sentences susceptible to machine translation bias
Some translation mistakes matter more to us than others, especially if we're worried about bias.
For example, if a sentence is not about people at all, translating it is less likely to reinforce harmful stereotypes about people. But if a sentence uses many words relating to an individual, we might have to be especially careful with the machine translation.
Current ways to identify bias-susceptible sentences typically involve fixed vocabulary, like lists of jobs, and typically focus on English. This challenge is to instead automatically identify such sentences, ideally in a way that generalises to languages other than English.
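To make the fixed-vocabulary approach concrete, here is a minimal sketch of what such a baseline might look like. It is not the method of any particular dataset or tool: the word list, the function names and the threshold are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny English word list (an assumption for
# illustration); real fixed-vocabulary approaches use much longer lists.
PERSON_TERMS = {
    "he", "she", "him", "her", "his", "hers",
    "doctor", "nurse", "engineer", "teacher", "secretary",
}


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def person_term_count(sentence):
    """Count how many tokens appear in the fixed person-term list."""
    return sum(1 for token in tokenize(sentence) if token in PERSON_TERMS)


def is_bias_susceptible(sentence, threshold=1):
    """Flag a sentence if it contains at least `threshold` listed terms."""
    return person_term_count(sentence) >= threshold


if __name__ == "__main__":
    examples = [
        "The doctor asked the nurse to help her.",
        "The river flooded the valley last spring.",
        "Die Ärztin bat den Pfleger um Hilfe.",  # German, about people
    ]
    for sentence in examples:
        print(is_bias_susceptible(sentence), sentence)
```

The German example shows the limitation: the sentence is about two people, but an English-only list does not flag it. Covering each new language this way means building and maintaining another list, which is the gap this challenge aims to close.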
Below is a link to a toy test set for this challenge: a mix of sentences from existing bias datasets and sentences from other sources that are not about people, with English, German and Spanish translations in each case. However, ideally participants would also look at other datasets, potentially in other languages, and explore more fine-grained ways to determine whether a sentence could cause bias problems for translation.