Yle adopts the National Library of Finland’s Annif tool for the automated tagging of articles
Yle has started using the Annif tool for the computer-assisted tagging of article content in Finnish and Swedish.
Annif is open-source software developed by the National Library of Finland. It is based on a combination of natural language processing and machine learning methods. Annif powers the tagging tool integrated into Yle's text publishing systems. The tagging tool lets content providers select tags for their articles from automatically produced suggestions.
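As a rough illustration of how a tagging tool might turn automatic suggestions into a short list for a content provider to choose from, here is a minimal Python sketch. It is not Yle's implementation. The response shape (a "results" list whose items carry "label", "uri" and "score") is modelled on the suggestion output of Annif's REST API and should be checked against the Annif documentation; the function name, threshold, limit, URIs and scores are all hypothetical.

```python
def pick_candidate_tags(response, threshold=0.2, limit=5):
    """Return up to `limit` (label, uri) pairs scoring at least `threshold`.

    `response` is assumed to look like {"results": [{"label", "uri", "score"}, ...]}.
    """
    results = response.get("results", [])
    # Drop weak suggestions so that only plausible tags are shown.
    kept = [r for r in results if r["score"] >= threshold]
    # Show the strongest suggestions first.
    kept.sort(key=lambda r: r["score"], reverse=True)
    return [(r["label"], r["uri"]) for r in kept[:limit]]


# Made-up example data; the URIs are placeholders, not real vocabulary identifiers.
sample_response = {
    "results": [
        {"label": "politics", "uri": "http://example.org/subject/politics", "score": 0.35},
        {"label": "elections", "uri": "http://example.org/subject/elections", "score": 0.71},
        {"label": "weather", "uri": "http://example.org/subject/weather", "score": 0.05},
    ]
}

candidates = pick_candidate_tags(sample_response)
for label, uri in candidates:
    print(label, uri)
```

In a setup like this the final choice stays with the person writing the article: the code only narrows the suggestions down, and the content provider accepts or rejects each one.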
Annif can combine several machine learning algorithms to produce better suggestions than any single one would give alone. Yle has adopted a setup which uses the lexical MLLM algorithm, the associative Omikuji algorithm, and the neural network-based nn_ensemble, which combines the results of the other two.
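The idea of combining the results of two algorithms can be sketched in a few lines of Python. Note that this sketch swaps in a plain weighted average as a stand-in: nn_ensemble uses a neural network for the combination, which is not reproduced here. The function name, weights, subjects and scores are all hypothetical.

```python
from collections import defaultdict


def combine_suggestions(sources, weights, limit=5):
    """Merge per-algorithm {subject: score} dicts into one ranked list.

    A subject missing from one algorithm's output counts as score 0 there.
    """
    total_weight = sum(weights[name] for name in sources)
    combined = defaultdict(float)
    for name, scores in sources.items():
        for subject, score in scores.items():
            # Each algorithm contributes in proportion to its weight.
            combined[subject] += weights[name] * score / total_weight
    # Highest combined score first; ties are broken alphabetically.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Made-up scores standing in for the output of a lexical and an associative algorithm.
lexical_scores = {"elections": 0.8, "parliament": 0.4}
associative_scores = {"elections": 0.6, "politics": 0.5}

suggestions = combine_suggestions(
    {"lexical": lexical_scores, "associative": associative_scores},
    {"lexical": 1.0, "associative": 1.0},
)
for subject, score in suggestions:
    print(f"{subject}: {score:.2f}")
```

A subject that both algorithms agree on ends up ranked above subjects that only one of them proposes, which is the basic motivation for using an ensemble.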
The National Library of Finland offers a pre-trained Annif model and a pre-made vocabulary, which are currently being used, for example, at the University of Jyväskylä for the subject indexing of theses.
Yle's version of Annif, on the other hand, has been trained with Yle's own articles in Finnish and Swedish and the tags assigned to them. The main sources of the Yle vocabulary are the KOKO ontology maintained by the Finto.fi service and the Wikidata database. Annif will be retrained once a week, and each weekly training run will also update the vocabulary used.