A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization

D Kitamura, N Ono, H Sawada… - … on Audio, Speech …, 2016 - ieeexplore.ieee.org
This paper addresses the determined blind source separation problem and proposes a new
effective method unifying independent vector analysis (IVA) and nonnegative matrix …

Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening

T Yoshioka, T Nakatani - IEEE Transactions on Audio, Speech …, 2012 - ieeexplore.ieee.org
The performance of many microphone array processing techniques deteriorates in the
presence of reverberation. To provide a widely applicable solution to this longstanding …

Multichannel signal enhancement algorithms for assisted listening devices: Exploiting spatial diversity using multiple microphones

S Doclo, W Kellermann, S Makino… - IEEE Signal …, 2015 - ieeexplore.ieee.org
In everyday environments, we are frequently immersed by unwanted acoustic noise and
interference while we want to listen to acoustic signals, most often speech. Technology for …

Applications and trends in wireless acoustic sensor networks: A signal processing perspective

A Bertrand - 2011 18th IEEE symposium on communications …, 2011 - ieeexplore.ieee.org
Wireless microphone networks or so-called wireless acoustic sensor networks (WASNs) are
a next-generation technology for audio acquisition and processing. As opposed to traditional …

Model-based expectation-maximization source separation and localization

MI Mandel, RJ Weiss, DPW Ellis - IEEE Transactions on Audio …, 2009 - ieeexplore.ieee.org
This paper describes a system, referred to as model-based expectation-maximization source
separation and localization (MESSL), for separating and localizing multiple sound sources …

Convolutive blind source separation methods

MS Pedersen, J Larsen, U Kjems, LC Parra - Springer handbook of …, 2008 - Springer
In this chapter, we provide an overview of existing algorithms for blind source separation of
convolutive audio mixtures. We provide a taxonomy in which many of the existing algorithms …

[BOOK][B] Acoustic MIMO signal processing

Y Huang, J Benesty, J Chen - 2006 - books.google.com
Telecommunication systems and human-machine interfaces start employing multiple
microphones and loudspeakers in order to make conversations and interactions more …

Acoustic beamforming for hearing aid applications

S Doclo, S Gannot, M Moonen… - Handbook on array …, 2010 - Wiley Online Library
Noise reduction algorithms in hearing aids are crucial for hearing-impaired persons to
improve speech intelligibility in background noise (e.g., traffic, cocktail party situation). Many …

Advances in online audio-visual meeting transcription

T Yoshioka, I Abramovski, C Aksoylar… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
This paper describes a system that generates speaker-annotated transcripts of meetings by
using a microphone array and a 360-degree camera. The hallmark of the system is its ability …