Semi-supervised sound event detection with pre-trained model

L Xu, L Wang, S Bi, H Liu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Sound event detection (SED) is an interesting but challenging task due to the scarcity of data
and diverse sound events in real life. In this paper, we focus on the semi-supervised SED …

Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4

JW Kim, SW Son, Y Song, HK Kim, IH Song… - arXiv preprint arXiv …, 2023 - arxiv.org
This report proposes a frequency dynamic convolution (FDY) with a large kernel attention
(LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional …

AST-SED: An effective sound event detection method based on audio spectrogram transformer

K Li, Y Song, LR Dai, I McLoughlin… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this paper, we propose an effective sound event detection (SED) method based on the
audio spectrogram transformer (AST) model, pretrained on the large-scale AudioSet for …

HPE-Li: WiFi-Enabled Lightweight Dual Selective Kernel Convolution for Human Pose Estimation

TD Gian, T Dac Lai, T Van Luong, KS Wong… - … on Computer Vision, 2025 - Springer
WiFi-based human pose estimation (HPE) has emerged as a promising alternative to
conventional vision-based techniques, yet faces the high computational cost hindering its …

FMSG submission for DCASE 2023 challenge task 4 on sound event detection with weak labels and synthetic soundscapes

Y Xiao, T Khandelwal, RK Das - Proc. DCASE Challenge, 2023 - dcase.community
This report presents the systems developed and submitted by Fortemedia Singapore
(FMSG) for DCASE 2023 Task 4A, which focuses on sound event detection with weak labels …

FMSG-JLESS submission for DCASE 2024 task4 on sound event detection with heterogeneous training dataset and potentially missing labels

Y Xiao, H Yin, J Bai, RK Das - arXiv preprint arXiv:2407.00291, 2024 - arxiv.org
This report presents the systems developed and submitted by Fortemedia Singapore
(FMSG) and Joint Laboratory of Environmental Sound Sensing (JLESS) for DCASE 2024 …

Multi-dimensional frequency dynamic convolution with confident mean teacher for sound event detection

S Xiao, X Zhang, P Zhang - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Recently, convolutional neural networks (CNNs) have been widely used in sound event
detection (SED). However, traditional convolution is deficient in learning time-frequency …

Leveraging audio-tagging assisted sound event detection using weakified strong labels and frequency dynamic convolutions

T Khandelwal, RK Das, A Koh… - 2023 IEEE Statistical …, 2023 - ieeexplore.ieee.org
Jointly learning from a small labeled set and a larger unlabeled set is an active research
topic under semi-supervised learning (SSL). In this paper, we propose a novel SSL method …

Li USTC team's submission for DCASE 2023 challenge task4a

K Li, P Cai, Y Song - Tech. Rep., DCASE2023 Challenge, 2023 - dcase.community
In this technical report, we present our submissions for DCASE 2023 challenge task4a. We
mainly study how to fine-tune patchout fast spectrogram transformer (PaSST) for sound …

Fine-tune the pretrained ATST model for sound event detection

N Shao, X Li, X Li - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org
Sound event detection (SED) often suffers from the data deficiency problem. Recent SED
systems leverage the large pretrained self-supervised learning (SelfSL) models to mitigate …