A Facebook data leak has revealed that Finnish-language moderation rests with a handful of moderators, eleven people working in Berlin. While the social media giant praises the efficiency of its automated moderation tools, documents obtained by Yle show that these are of little use when it comes to small languages like Finnish.
The company's ability to detect hate speech is not as good as Facebook has led users to believe. Moderation for languages like Finnish is often half-baked, a fact apparent in thousands of pages of leaked internal Facebook documents obtained by Yle.
These documents show that Facebook has not developed Finnish-language automated moderation for things like hate speech, violence and nudity.
Facebook has always wanted to keep the inner workings of its moderation secret. It has, for example, not wanted to say how many workers it has moderating content in different languages. It has also refused to reveal in which languages it employs automatic moderation.
The company turned down two separate requests from Yle to discuss how it carries out Finnish-language moderation.
Globally, Facebook employs some 15,000 people whose job it is to trawl through published posts. These moderators, who mainly work through subcontractors, scrutinise content in 70 different languages.
When it comes to Finnish moderators, we now know there are about ten of them working in Berlin.
Courts weigh in
Facebook's community standards outline what's acceptable on the platform and what isn't—the latter including violence and hate speech. In theory, this type of content requires moderators to intervene, but in reality that doesn't always happen.
Cases winding through the Finnish court system are testament to the company's poor moderation of Finnish content. Facebook posts have triggered dozens of verdicts in Finland over the past few years.
That said, a court ruling doesn't necessarily imply that efforts at moderation have failed. But likewise the decision by a prosecutor not to pursue charges regarding questionable content is not a sign of adequate moderation either.
The volume of legal cases suggests that hate speech breaking Finnish law circulates on Facebook. Cases reaching the courts are becoming increasingly egregious.
"It's plain to see that there's writing on Facebook inciting violence and ethnic hatred toward minorities and nothing is done about it," said Aleksi Knuutila, a political culture and communications researcher.
Finland not a priority
Documents leaked by Facebook whistleblower Frances Haugen suggest that the company's moderation doesn't really work in any language. This is why the company focuses nearly all of its attention on large language groups.
Facebook creates automated systems for languages producing the most content. The company prioritises moderation in countries with recurring violence. However, a country can also fall off the priority list if Facebook considers it too difficult to develop automated moderation for a particular language.
Facebook trains algorithms in a few dozen languages. Facebook's AI monitors Covid-related content in 17 different languages, and societal topics in 31.
Documents obtained by Yle do not indicate that Facebook has developed its own algorithm for Finnish. But its AI does trawl through content published in Finland in other languages.
Facebook's own research suggests that its societal algorithm recognises around a fifth of discussions on Finnish Facebook pages. In addition to English, this algorithm is trained to recognise at least Swedish, Russian and Arabic.
This would entail human moderators poring over the remaining four fifths.
The amount of hate speech posted on Facebook has skyrocketed during the pandemic. The company went from deleting 9.6 million instances of hate speech per quarter to 22.5 million. The social media giant said adopting automated moderation in Spanish, Arabic and Indonesian contributed to this growth.
Facebook's leaked internal documents, however, suggest that the company only manages to remove a few percent of the hate speech circulating on its network.
Yle is part of a consortium of news organisations that has reviewed the disclosures made to the U.S. Securities and Exchange Commission and provided to the U.S. Congress in redacted form by Frances Haugen’s legal counsel.