Compilation, transcription and usage of a reference speech corpus: The case of the Slovene...

O Kuparinen, AM Haddad… - Conference on Empirical …, 2023 - researchportal.helsinki.fi

Text normalization methods have been commonly applied to historical language or user-
generated content, but less often to dialectal transcriptions. In this paper, we introduce …

Cited by 11 · Related articles · All 9 versions

Accessing spoken language corpora: an overview of current approaches

J Batinić, E Frick, T Schmidt - Corpora, 2021 - euppublishing.com

In this paper, we present an overview of freely available web applications providing online
access to spoken language corpora. We explore and discuss various solutions with which …

Cited by 7 · Related articles · All 5 versions

The Janes project: language resources and tools for Slovene user generated content

D Fišer, N Ljubešić, T Erjavec - Language resources and evaluation, 2020 - Springer

The paper presents the results of the Janes project, which aimed to develop language
resources and tools for Slovene user generated content. The paper first describes the 200 …

Cited by 46 · Related articles · All 4 versions

Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations

M Sanguinetti, C Bosco, L Cassidy, Ö Çetinoğlu… - Language Resources …, 2023 - Springer

This article presents a discussion on the main linguistic phenomena which cause difficulties
in the analysis of user-generated texts found on the web and in social media, and proposes …

Cited by 27 · Related articles · All 18 versions

Gesprächskorpora

T Schmidt, M Kupietz, T Schmidt - Korpuslinguistik, 2018 - degruyter.com

This chapter examines conversation corpora (Gesprächskorpora) as a special type of
spoken-language corpus. First, essential properties …

Cited by 37 · Related articles · All 2 versions

Construction and Dissemination of a Corpus of Spoken Interaction–Tools and Workflows in the FOLK project

T Schmidt - Journal for language technology and computational …, 2016 - jlcl.org

This paper is about the workflow for construction and dissemination of FOLK
(Forschungs- und Lehrkorpus Gesprochenes Deutsch–Research and Teaching Corpus of …

Cited by 42 · Related articles · All 8 versions

Zur Stratifikation des FOLK-Korpus: Konzeption und Strategien

J Kaiser - Gesprächsforschung, 2018 - ids-pub.bsz-bw.de

The Research and Teaching Corpus of Spoken German (Forschungs- und Lehrkorpus Gesprochenes
Deutsch, FOLK), accessible via the Database for Spoken German (Datenbank für Gesprochenes
Deutsch, DGD), aims for the status of a reference corpus for …

Cited by 32 · Related articles · All 4 versions

The universal dependencies treebank of spoken Slovenian

K Dobrovoljc, J Nivre - … of the Tenth International Conference on …, 2016 - aclanthology.org

This paper presents the construction of an open-source dependency treebank of spoken
Slovenian, the first syntactically annotated collection of spontaneous speech in Slovenian …

Cited by 36 · Related articles · All 5 versions

Spoken corpora of Slavic languages

N Dobrushina, E Sokur - Russian Linguistics, 2022 - Springer

Spoken corpora are collections of transcribed and annotated audio and/or video recordings
of languages or language varieties. The aim of this paper is to present an overview of 51 …

Cited by 11 · Related articles · All 4 versions

Annotating dialogue acts in speech data: Problematic issues and basic dialogue act categories

D Verdonik - International Journal of Corpus Linguistics, 2023 - jbe-platform.com

The aims of this paper are to detect the most problematic issues related to dialogue act
annotation in speech corpora and to define basic categories of dialogue acts. I critically …

Cited by 8 · Related articles · All 2 versions