BERTimbau: pretrained BERT models for Brazilian Portuguese

F Souza, R Nogueira, R Lotufo - … 2020, Rio Grande, Brazil, October 20–23 …, 2020 - Springer
Recent advances in language representation using neural networks have made it viable to
transfer the learned internal states of large pretrained language models (LMs) to …

Portuguese named entity recognition using BERT-CRF

F Souza, R Nogueira, R Lotufo - arXiv preprint arXiv:1909.10649, 2019 - arxiv.org
Recent advances in language representation using neural networks have made it viable to
transfer the learned internal states of a trained model to downstream natural language …

Probing for idiomaticity in vector space models

M Garcia, TK Vieira, C Scarton… - Proceedings of the …, 2021 - eprints.whiterose.ac.uk
Contextualised word representation models have been successfully used for capturing
different word usages and they may be an attractive alternative for representing idiomaticity …

BERT models for Brazilian Portuguese: Pretraining, evaluation and tokenization analysis

FC Souza, RF Nogueira, RA Lotufo - Applied Soft Computing, 2023 - Elsevier
Recent advances in language representation using neural networks have made it viable to
transfer the learned internal states of large pretrained language models (LMs) to …

Named entity recognition for sensitive data discovery in Portuguese

M Dias, J Boné, JC Ferreira, R Ribeiro, R Maia - Applied Sciences, 2020 - mdpi.com
The process of protecting sensitive data is continually growing and becoming increasingly
important, especially as a result of the directives and laws imposed by the European Union …

A deep neural network-based model for named entity recognition for Hindi language

R Sharma, S Morwal, B Agarwal, R Chandra… - Neural Computing and …, 2020 - Springer
The aim of this work is to develop efficient named entity recognition from the given text that in
turn improves the performance of the systems that use natural language processing (NLP) …

Assessing idiomaticity representations in vector models with a noun compound dataset labeled at type and token levels

M Garcia, T Kramer Vieira, C Scarton… - Proceedings of ACL …, 2021 - eprints.whiterose.ac.uk
Accurate assessment of the ability of embedding models to capture idiomaticity may require
evaluation at token rather than type level, to account for degrees of idiomaticity and possible …

Assessing the impact of contextual embeddings for Portuguese named entity recognition

J Santos, B Consoli, C dos Santos… - 2019 8th Brazilian …, 2019 - ieeexplore.ieee.org
Modern approaches to Named Entity Recognition (NER) use neural networks (NN) to
automatically extract features from text and seamlessly integrate them with sequence …

Comparing different methods for named entity recognition in Portuguese neurology text

F Lopes, C Teixeira, H Gonçalo Oliveira - Journal of Medical Systems, 2020 - Springer
Electronic Medical Records (EMRs) are written in an unstructured way, often using
natural language. Information Extraction (IE) may be used for acquiring knowledge from …

Contributions to clinical named entity recognition in Portuguese

F Lopes, C Teixeira, HG Oliveira - Proceedings of the 18th BioNLP …, 2019 - aclanthology.org
Having in mind that different languages might present different challenges, this paper
presents the following contributions to the area of Information Extraction from clinical text …