Acoustic classification and segmentation using modified spectral roll-off and variance-based features

M Kos, Z Kačič, D Vlaj - Digital Signal Processing, 2013 - Elsevier
This paper presents novel features and an architecture for an automatic on-line acoustic
classification and segmentation system. The system includes speech/non-speech …

Towards building an automatic transcription system for language documentation: Experiences from Muyu

A Zahrer, A Žgank, B Schuppler - Proceedings of the Twelfth …, 2020 - aclanthology.org
Since at least half of the world's 6000 plus languages will vanish during the 21st century,
language documentation has become a rapidly growing field in linguistics. A fundamental …

Influence of Highly Inflected Word Forms and Acoustic Background on the Robustness of Automatic Speech Recognition for Human–Computer Interaction

A Zgank - Mathematics, 2022 - mdpi.com
Automatic speech recognition is essential for establishing natural communication with a
human–computer interface. Speech recognition accuracy strongly depends on the …

Online speech/music segmentation based on the variance mean of filter bank energy

M Kos, M Grašič, Z Kačič - EURASIP Journal on Advances in Signal …, 2009 - Springer
This paper presents a novel feature for online speech/music segmentation based on the
variance mean of filter bank energy (VMFBE). The idea that encouraged the feature's …

The Slovene BNSI Broadcast News database and reference speech corpus GOS: Towards the uniform guidelines for future work

A Zgank, AZ Vitez, D Verdonik - LREC, 2014 - academia.edu
The aim of the paper is to search for common guidelines for the future development of
speech databases for less resourced languages in order to make them the most useful for …

Speaker's gender classification and segmentation using spectral and cepstral feature averaging

M Kos, D Vlaj, Z Kačič - 2011 18th International Conference on …, 2011 - ieeexplore.ieee.org
This paper presents speaker gender classification and segmentation. Such classification is
frequently used in broadcast news domain. Because pitch is a feature that is difficult to …

Development of the RWTH transcription system for Slovenian

P Golik, Z Tüske, R Schlüter, H Ney - Interspeech, 2013 - isca-archive.org
In this paper we describe the RWTH automatic speech recognition system for Slovenian
developed within the transLectures project. The project aims at supporting the transcription …

Slovenian spontaneous speech recognition and acoustic modeling of filled pauses and onomatopoeas

A Žgank, T Rotovnik… - WSEAS Transactions on …, 2008 - researchgate.net
This paper is focused on acoustic modeling for spontaneous speech recognition. This topic
is still a very challenging task for speech technology research community. The attributes of …

Speech recognition for interaction with a robot in noisy environment

MS Maučec, Z Kačič, A Žgank - Przegląd Elektrotechniczny, 2013 - pe.org.pl
One of the main problems with speech recognition for robots is noise. In this paper we
propose two methods to enhance the robustness of continuous speech recognition in noisy …

The SI TEDx-UM speech database: A new Slovenian spoken language resource

A Žgank, MS Maučec, D Verdonik - Proceedings of the Tenth …, 2016 - aclanthology.org
This paper presents a new Slovenian spoken language resource built from TEDx Talks. The
speech database contains 242 talks in total duration of 54 hours. The annotation and …