Named entity recognition in the Romanian legal domain

V Păiș, M Mitrofan, CL Gasan… - Proceedings of the …, 2021 - aclanthology.org
Recognition of named entities present in text is an important step towards information
extraction and natural language understanding. This work presents a named entity …

Improving Romanian BioNER using a biologically inspired system

M Mitrofan, V Păiș - Proceedings of the 21st Workshop on …, 2022 - aclanthology.org
Recognition of named entities present in text is an important step towards information
extraction and natural language understanding. This work presents a named entity …

Introducing the CURLICAT corpora: seven-language domain specific annotated corpora from curated sources

T Váradi, B Nyéki, S Koeva, M Tadić… - Proceedings of the …, 2022 - aclanthology.org
This article presents the current outcomes of the CURLICAT CEF Telecom project, which
aims to collect and deeply annotate a set of large corpora from selected domains. The …

PyEuroVoc: a tool for multilingual legal document classification with EuroVoc descriptors

AM Avram, V Pais, D Tufis - arXiv preprint arXiv:2108.01139, 2021 - arxiv.org
EuroVoc is a multilingual thesaurus that was built for organizing the legislative documentary
of the European Union institutions. It contains thousands of categories at different levels of …

Designing the LECOR Learner Corpus for Romanian

AM Barbu, E Irimia, CM Vasile… - Proceedings of the 14th …, 2023 - aclanthology.org
This article presents a work-in-progress project, which aims to build and utilize a corpus of
Romanian texts written or spoken by non-native students of different nationalities, who learn …

HistNERo: Historical Named Entity Recognition for the Romanian Language

AM Avram, A Iuga, GV Manolache, VC Matei… - … on Document Analysis …, 2024 - Springer
This work introduces HistNERo, the first Romanian corpus for Named Entity Recognition
(NER) in historical newspapers. The dataset contains 323k tokens of text, covering more …

Challenges in creating a representative corpus of Romanian micro-blogging text

V Păiș, M Mitrofan, VB Mititelu, E Irimia… - Proceedings of the …, 2022 - aclanthology.org
Following the successful creation of a national representative corpus of contemporary
Romanian language, we turned our attention to the social media text, as present in micro …

In-depth evaluation of Romanian natural language processing pipelines

V Pais, R Ion, AM Avram, M Mitrofan, D Tufis - Romanian Journal of …, 2021 - romjist.ro
With the increased size of Universal Dependencies tree banks, several basic language
processing kits (BLARK) for multiple languages appeared in recent years, indicating …

System for the anonymization of Romanian jurisprudence

V Păiș, R Ion, E Irimia, VB Mititelu, V Badea… - Artificial Intelligence and …, 2024 - Springer
The transparency of the judicial process and the consistency of judicial decisions can be
improved through their publication. Access to jurisprudence is of paramount importance both …

Romanian Language Technology—a view from an academic perspective

D Tufiș - International Journal of Computers Communications & …, 2022 - univagora.ro
The article reports on research and developments pursued by the Research Institute for
Artificial Intelligence "Mihai Draganescu" of the Romanian Academy in order to narrow the …