Corpus-based typology: Applications, challenges and some solutions

N Levshina - Linguistic Typology, 2022 - degruyter.com
Over the last few years, the number of corpora that can be used for language comparison
has dramatically increased. The corpora are so diverse in their structure, size and …

Computational sociophonetics using automatic speech recognition

R Coto‐Solano - Language and Linguistics Compass, 2022 - Wiley Online Library
Recent years have seen numerous advances in natural language processing that can help
accelerate sociophonetic work. These include software to align speech recordings with their …

Final lengthening and vowel length in 25 languages

L Paschen, S Fuchs, F Seifart - Journal of Phonetics, 2022 - Elsevier
Lengthening of segments at the end of prosodic domains is commonly considered a
universal phenomenon, but language-specific variation has also been reported, specifically …

Consonant lengthening marks the beginning of words across a diverse sample of languages

F Blum, L Paschen, R Forkel, S Fuchs… - Nature Human …, 2024 - nature.com
Speech consists of a continuous stream of acoustic signals, yet humans can segment words
and other constituents from each other with astonishing precision. The acoustic properties …

Bottom-up discovery of structure and variation in response tokens ('backchannels') across diverse languages

A Liesenfeld, M Dingemanse - Interspeech 2022, 2022 - pure.mpg.de
Response tokens (also known as backchannels, continuers, or feedback) are a frequent
feature of human interaction, where they serve to display understanding and streamline turn …

Concepts and methods for integrating language typology and sociolinguistics

F Di Garbo, E Kashima… - … Verso un approccio …, 2021 - researchportal.helsinki.fi
This paper presents the building blocks of a comprehensive framework for the typological
study of linguistic adaptation, i.e. how languages change in relation to the socio-historical and …

Optimization of morpheme length: a cross-linguistic assessment of Zipf's and Menzerath's laws

M Stave, L Paschen, F Pellegrino, F Seifart - Linguistics Vanguard, 2021 - degruyter.com
Zipf's Law of Abbreviation and Menzerath's Law both make predictions about the
length of linguistic units, based on corpus frequency and the length of the carrier unit. Each …

Comparing language-specific and cross-language acoustic models for low-resource phonetic forced alignment

E Chodroff, E Ahn, H Dolatian - Language Documentation & …, 2024 - eleanorchodroff.com
Phonetic forced alignment can greatly expedite spoken language analysis by providing
automatic time alignments at the word- and phone-levels. In the case of low-resource languages, it …

Speech recognition for endangered and extinct Samoyedic languages

N Partanen, M Hämäläinen, T Klooster - arXiv preprint arXiv:2012.05331, 2020 - arxiv.org
Our study presents a series of experiments on speech recognition with endangered and
extinct Samoyedic languages, spoken in Northern and Southern Siberia. To best of our …

Building and curating conversational corpora for diversity-aware language science and technology

A Liesenfeld, M Dingemanse - arXiv preprint arXiv:2203.03399, 2022 - arxiv.org
We present an analysis pipeline and best practice guidelines for building and curating
corpora of everyday conversation in diverse languages. Surveying language documentation …