A Privacy-Preserving Corpus for Occupational Health in Spanish: Evaluation for NER and Classification Tasks

C Aracena, L Miranda, T Vakili, F Villena… - Proceedings of the …, 2024 - aclanthology.org
Annotated corpora are essential to reliable natural language processing. While they are
expensive to create, they are essential for building and evaluating systems. This study …

End-to-end pseudonymization of fine-tuned clinical BERT models: Privacy preservation with maintained data utility

T Vakili, A Henriksson, H Dalianis - BMC Medical Informatics and Decision …, 2024 - Springer
Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained
language models (PLMs). These models consist of large amounts of parameters that are …

Attacking and Defending the Privacy of Clinical Language Models

T Vakili - 2023 - diva-portal.org
The state-of-the-art methods in natural language processing (NLP) increasingly rely on large
pre-trained transformer models. The strength of the models stems from their large number of …

Mapping the Past: Geographically Linking an Early 20th Century Swedish Encyclopedia with Wikidata

A Ahlin, A Myrne, P Nugues - arXiv preprint arXiv:2406.17903, 2024 - arxiv.org
In this paper, we describe the extraction of all the location entries from a prominent Swedish
encyclopedia from the early 20th century, the Nordisk Familjebok (Nordic Family …