ConfliBERT-Spanish: A Pre-trained Spanish Language Model for Political Conflict and Violence

W Yang, S Alsarra, L Abdeljaber… - 2023 7th IEEE …, 2023 - ieeexplore.ieee.org
2023 7th IEEE Congress on Information Science and Technology (CiSt), 2023. ieeexplore.ieee.org
This article introduces ConfliBERT-Spanish, a pre-trained language model specialized in political conflict and violence for text written in Spanish. Our methodology relies on a large corpus specialized in politics and violence to extend the capacity of pre-trained models capable of processing Spanish text. We assess the performance of ConfliBERT-Spanish against Multilingual BERT and BETO baselines on binary classification, multi-label classification, and named entity recognition. ConfliBERT-Spanish consistently outperforms the baseline models across all tasks. These results indicate that our domain-specific, language-specific cyberinfrastructure can greatly enhance the performance of NLP models for Latin American conflict analysis. This methodological advance can help researchers and practitioners in the security sector analyze large amounts of information with a high degree of accuracy, better equipping them to meet the dynamic and complex security challenges affecting the region.