Small footprint multi-channel network for keyword spotting with centroid based awareness

D Ng, Y Xiao, JQ Yip, Z Yang, B Tian, Q Fu… - Proc …, 2023 - isca-archive.org
Abstract Spoken Keyword Spotting (KWS) in noisy far-field environments is challenging for
small-footprint models, given the restrictions on computational resources (e.g., model size …

The DKU post-challenge audio-visual wake word spotting system for the 2021 MISP challenge: Deep analysis

H Wang, M Cheng, Q Fu, M Li - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
This paper further explores our previous wake word spotting system ranked 2nd in Track 1
of the MISP Challenge 2021. First, we investigate a robust unimodal approach based on 3D …

Audio-visual wake word spotting in MISP2021 challenge: Dataset release and deep analysis

H Zhou, J Du, G Zou, Z Nian, CH Lee… - Proceedings of the …, 2022 - research.tudelft.nl
In this paper, we describe and release publicly the audio-visual wake word spotting (WWS)
database in the MISP2021 Challenge, which covers a range of scenarios of audio and video …

VE-KWS: Visual modality enhanced end-to-end keyword spotting

A Zhang, H Wang, P Guo, Y Fu, L Xie… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
The performance of the keyword spotting (KWS) system based on audio modality, commonly
measured in false alarms and false rejects, degrades significantly under the far field and …

Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer

H Wang, M Cheng, Q Fu, M Li - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In recent years, neural network-based Wake Word Spotting achieves good performance on
clean audio samples but struggles in noisy environments. Audio-Visual Wake Word Spotting …

Combining Weight Approximation, Sharing and Retraining for Neural Network Model Compression

P Kashikar, O Sentieys, S Sinha - ACM Transactions on Embedded …, 2024 - dl.acm.org
Neural network model compression is very important to achieve model deployment based
on the memory and storage available in different computing systems. Generally, the …

The WHU Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge

H Wang, C Li, F Su, J Liu, H Suo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
The paper describes the Wake Word Lipreading system developed by the WHU team for the
ChatCLR Challenge 2024. Although Lipreading and Wake Word Spotting have seen …

Audio-Visual Wake-up Word Spotting Under Noisy and Multi-person Scenarios

C Li, F Su, J Liu - International Conference on Pattern Recognition, 2024 - Springer
The existing audio-visual wake-up word spotting (AVWWS) methods assume that the audio
signal has been aligned with the lip movement video signal of a specific speaker in noisy …

On‐device audio‐visual multi‐person wake word spotting

Y Li, G Wang, Z Chen, H Tang… - CAAI Transactions on …, 2023 - Wiley Online Library
Audio‐visual wake word spotting is a challenging multi‐modal task that exploits visual
information of lip motion patterns to supplement acoustic speech to improve overall …

TinyML-based audio-visual keyword spotting

M Tosun, H Erdem - Niğde Ömer Halisdemir Üniversitesi …, 2024 - dergipark.org.tr
Keyword spotting (KWS) is one of the areas in which machine learning is used. Its aim is the automatic
detection of a specific word or object from audio or image data. Portable …