Gigafida 2.0: the reference corpus of written standard Slovene

M Ulčar, M Robnik-Šikonja - Frontiers in Artificial Intelligence, 2023 - frontiersin.org

Introduction Large pretrained language models have recently conquered the area of natural
language processing. As an alternative to predominant masked language modeling …

Cited by 14 · Related articles · All 7 versions

A Survey of Large Language Models for European Languages

W Ali, S Pyysalo - arXiv preprint arXiv:2408.15040, 2024 - arxiv.org

Large Language Models (LLMs) have gained significant attention due to their high
performance on a wide range of natural language tasks since the release of ChatGPT. The …

Cited by 1 · Related articles · All 3 versions

Semantic change detection for Slovene language: a novel dataset and an approach based on optimal transport

M Pranjić, K Dobrovoljc, S Pollak, M Martinc - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we focus on the detection of semantic changes in Slovene, a less resourced
Slavic language with two million speakers. Detecting and tracking semantic changes …

Cited by 4 · Related articles · All 2 versions

Cross-lingual transfer of abstractive summarizer to less-resource language

A Žagar, M Robnik-Šikonja - Journal of Intelligent Information Systems, 2022 - Springer

Automatic text summarization extracts important information from texts and presents the
information in the form of a summary. Abstractive summarization approaches progressed …

Cited by 10 · Related articles · All 6 versions

The anatomy of specialized knowledge: Comparing experts and non-experts through associations, frames and language models

Š Vintar, A Saksida - Lexicographica, 2023 - degruyter.com

We explore specialized knowledge and aim to show that expert conceptual spaces differ
from those of non-experts. This rather broad research question is addressed from different …

Cited by 3 · Related articles · All 4 versions

A Method for Selection of Phonetically Balanced Sentences in Read Speech Corpus Design

JZ Gros, B Vesnicer, S Dobrisek - Proceedings of the 30th European …, 2022 - eurasip.org

Sentence selection for speech prompts plays an important role in the process of designing a
speech corpus of read speech, both for speech recognition and speech synthesis. The …

Cited by 3 · Related articles · All 2 versions

Extending the SSJ Universal Dependencies Treebank for Slovenian: Was it Worth it?

K Dobrovoljc, N Ljubešić - Proceedings of the 16th Linguistic …, 2022 - aclanthology.org

This paper presents the creation and evaluation of a new version of the reference SSJ
Universal Dependencies Treebank for Slovenian, which has been substantially improved …

Cited by 3 · Related articles · All 3 versions

Collocation ranking: frequency vs semantics

N Ljubešić, N Logar, I Kosem - Slovenščina 2.0: empirične …, 2021 - journals.uni-lj.si

Collocations play a very important role in language description, especially in identifying
meanings of words. Modern lexicography's inevitable part of meaning deduction are lists of …

Cited by 3 · Related articles · All 4 versions

Corpus-Linguistic Analysis of Speech Communities on Anti-Gender Discourse in Slovene

D Popič, V Gorjanc - Gender a výzkum/Gender and Research, 2023 - ceeol.com

This paper deals with a corpus-linguistic analysis of different text/media types in Slovene with
the aim of finding out whether or not any of the communication channels covered by the …

Cited by 1 · Related articles · All 5 versions

Data preparation in crowdsourcing for pedagogical purposes: the case of the CrowLL game

TZ Kuhn, ŠA Holdt, I Kosem, C Tiberius… - Slovenščina 2.0 …, 2022 - journals.uni-lj.si

One way to stimulate the use of corpora in language education is by making pedagogically
appropriate corpora, labeled with different types of problems (sensitive content, offensive …

Cited by 1 · Related articles · All 7 versions
