Compiling a suitable level of sense granularity in a lexicon for AI purposes: The open source COR lexicon

BS Pedersen, NCH Sørensen, S Nimb… - Proceedings of the …, 2022 - aclanthology.org
We present The Central Word Register for Danish (COR), which is an open source
lexicon project for general AI purposes funded and initiated by the Danish Agency for …

FT speech: Danish parliament speech corpus

A Kirkedal, M Stepanović, B Plank - arXiv preprint arXiv:2005.12368, 2020 - arxiv.org
This paper introduces FT Speech, a new speech corpus created from the recorded meetings
of the Danish Parliament, otherwise known as the Folketing (FT). The corpus contains over …

Exploring lexical factors in semantic annotation: insights from the classification of nouns in French

L Barque, R Huyghe, M Foegel - Language Resources and Evaluation, 2025 - Springer
This paper investigates how different lexical factors influence inter-annotator agreement in a
semantic annotation task, with the level of agreement serving as an indicator of task …

Annotating the French Wiktionary with supersenses for large scale lexical analysis: a use case to assess form-meaning relationships within the nominal lexicon

N Angleraud, L Barque, M Candito - Proceedings of the 31st …, 2025 - aclanthology.org
Many languages lack broad-coverage, semantically annotated lexical resources, which
limits empirical research on lexical semantics for these languages. In this paper, we report …

Supersense tagging with inter-annotator disagreement

HM Alonso, A Johannsen, B Plank - … of LAW X: The 10th Linguistic …, 2016 - research.rug.nl
Linguistic annotation underlies many successful approaches in Natural Language
Processing (NLP), where the annotated corpora are used for training and evaluating …

From thesaurus to framenet

S Nimb, A Braasch, S Olsen, BS Pedersen… - Proceedings of eLex …, 2017 - academia.edu
High-quality semantic data from a Danish thesaurus linked with valency information from a
Danish dictionary allows us to compile a frame lexicon (Berkeley FrameNet style) for Danish …

COR-S – den semantiske del af Det Centrale OrdRegister (COR)

S Nimb, BS Pedersen, NCH Sørensen, I Flörke… - nordisk lexikografi–nu …, 2022 - tidsskrift.dk
We present the formal lexicon COR-S, which constitutes the semantic part of a Danish
computational lexicon project called COR. COR-S is based on linked data, but apart from …

The DA-ELEXIS corpus - a sense-annotated corpus for Danish with parallel annotations for nine European languages

BS Pedersen, S Nimb, S Olsen… - Proceedings of the …, 2023 - aclanthology.org
In this paper, we present the newly compiled DA-ELEXIS Corpus, which is one of the largest
sense-annotated corpora available for Danish, and the first one to be annotated with the …

Dansk betydningsinventar i et datalingvistisk perspektiv

BS Pedersen, S Nimb, S Olsen - Danske Studier, 2021 - tidsskrift.dk
In this paper we investigate the Danish sense inventory from a paradigmatic and a
syntagmatic perspective, respectively, and we present a collection of related lexical …

Combining dictionaries, wordnets and other lexical resources - advantages and challenges

BS Pedersen, S Nimb, S Olsen… - Globalex …, 2018 - academia.edu
In this paper we account for the advantages, challenges and pitfalls that we have
encountered when compiling language technology (LT) resources based on dictionary …