Named entity recognition in the Romanian legal domain

V Păiș, M Mitrofan, CL Gasan… - Proceedings of the …, 2021 - aclanthology.org
Recognition of named entities present in text is an important step towards information
extraction and natural language understanding. This work presents a named entity …

Improving Romanian BioNER using a biologically inspired system

M Mitrofan, V Păiș - Proceedings of the 21st Workshop on …, 2022 - aclanthology.org
Recognition of named entities present in text is an important step towards information
extraction and natural language understanding. This work presents a named entity …

Introducing the CURLICAT corpora: seven-language domain specific annotated corpora from curated sources

T Váradi, B Nyéki, S Koeva, M Tadić… - Proceedings of the …, 2022 - aclanthology.org
This article presents the current outcomes of the CURLICAT CEF Telecom project, which
aims to collect and deeply annotate a set of large corpora from selected domains. The …

Building paths to corpus data. A multi-level least effort and maximum return approach

M Kupietz, N Diewald, E Margaretha - CLARIN. The Infrastructure for …, 2022 - degruyter.com
Enabling appropriate access to linguistic research data, both for many researchers and for
innovative research applications, is a challenging task. In this chapter, we describe how we …

Towards a Romanian end-to-end automatic speech recognition based on DeepSpeech2

AM Avram, P Vasile, D Tufis - Proc. Rom. Acad. Ser. A, 2020 - academia.edu
This paper presents an implementation of an ASR system for the Romanian language that
uses a multi-layer neural network architecture to transcribe the input speech, augmented …

VeLeRo: an inflected verbal lexicon of standard Romanian and a quantitative analysis of morphological predictability

B Herce, B Pricop - Language Resources and Evaluation, 2024 - Springer
This paper presents VeLeRo, an inflected lexicon of Standard Romanian which contains the
full paradigm of 7297 verbs in phonological form. We explain the process by which the …

How to find a shining needle in the haystack. Querying CoRoLa: solutions and perspectives

D Cristea, N Diewald, G Haja, C Mărănduc… - 2019 - dspace.bcu-iasi.ro
The present paper examines a variety of ways in which the Corpus of Contemporary
Romanian Language (CoRoLa) can be used. A multitude of examples intends to highlight a …

Challenges in creating a representative corpus of Romanian micro-blogging text

V Păiș, M Mitrofan, VB Mititelu, E Irimia… - Proceedings of the …, 2022 - aclanthology.org
Following the successful creation of a national representative corpus of contemporary
Romanian language, we turned our attention to the social media text, as present in micro …

In-depth evaluation of Romanian natural language processing pipelines

V Pais, R Ion, AM Avram, M Mitrofan, D Tufis - Romanian Journal of …, 2021 - romjist.ro
With the increased size of Universal Dependencies tree banks, several basic language
processing kits (BLARK) for multiple languages appeared in recent years, indicating …

System for the anonymization of Romanian jurisprudence

V Păiș, R Ion, E Irimia, VB Mititelu, V Badea… - Artificial Intelligence and …, 2024 - Springer
The transparency of the judicial process and the consistency of judicial decisions can be
improved through their publication. Access to jurisprudence is of paramount importance both …