Fine-grained and Dense Annotation of Czech Propaganda Using Large Language Models

Sabol, Radoslav; Horák, Aleš

Warning

This publication does not belong to the Faculty of Social Studies but to the Faculty of Informatics. The official publication page is on the muni.cz website.

Authors	SABOL Radoslav, HORÁK Aleš
Year of publication	2025
Type	Article in proceedings
Conference	Recent Advances in Slavonic Natural Language Processing, RASLAN 2025
MU Faculty or unit	Faculty of Informatics
Citation
www	Proceedings of the Nineteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2025.
Keywords	manipulative techniques; propaganda; detection; annotation; large language models; LLM; Czech
Attached files	Fine-grained_and_Dense_Annotation_of_Czech_Propaganda.pdf
Description	Within the previous project of Czech Propaganda Detection that aimed to recognize manipulative techniques in Czech news articles, the annotation was mostly limited to indicating the presence of a technique in a document. In about 35% of the documents, span-level evidence was also annotated, but only as support for the document-level labels, resulting in sparse coverage of the techniques. Thus, the resulting dataset has limitations for training and evaluating more fine-grained propaganda detection models. In this study, we examine the potential of large language models (LLMs) to generate dense span-level annotations of manipulative techniques in Czech news articles. We designed generation prompts tailored to each technique and experimented with several LLMs to produce annotations for a subset of the Czech Propaganda dataset. We present the details of the generation process, including the design of the prompts and the selection of models. We also evaluate the generated annotations both quantitatively and qualitatively, including a manual validation and comparison with human annotations.
Related projects:	Na všechno sami: příležitosti a rizika individualizace společnosti (PRINS)