DCASE 2017 challenge setup: Tasks, datasets and baseline system

F Alías, RM Alsina-Pagès - Journal of sensors, 2019 - Wiley Online Library

Nowadays, more than half of the world's population lives in urban areas. Since this
proportion is expected to keep rising, the sustainable development of cities is of paramount …

Cited by 97 - Related articles - All 10 versions

A survey on privacy issues and solutions for Voice-controlled Digital Assistants

LH Acosta, D Reinhardt - Pervasive and Mobile Computing, 2022 - Elsevier

With the development and increasing deployment of smart home devices, voice control
supports comfortable end user interactions. However, potential end users may refuse to use …

Cited by 34 - Related articles - All 4 versions

FSD50K: An open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2 M tracks from YouTube videos and …

Cited by 397 - Related articles - All 5 versions

PANNs: Large-scale pretrained audio neural networks for audio pattern recognition

Q Kong, Y Cao, T Iqbal, Y Wang… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org

Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …

Cited by 1004 - Related articles - All 8 versions

Listen, think, and understand

Y Gong, H Luo, AH Liu, L Karlinsky, J Glass - arXiv preprint arXiv …, 2023 - arxiv.org

The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is
crucial for many applications. Although significant progress has been made in this area …

Cited by 61 - Related articles - All 6 versions

Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org

Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

Cited by 471 - Related articles - All 14 versions

Sound event localization and detection of overlapping sources using convolutional recurrent neural networks

S Adavanne, A Politis, J Nikunen… - IEEE Journal of …, 2018 - ieeexplore.ieee.org

In this paper, we propose a convolutional recurrent neural network for joint sound event
localization and detection (SELD) of multiple overlapping sound events in three-dimensional …

Cited by 522 - Related articles - All 10 versions

PSLA: Improving audio tagging with pretraining, sampling, labeling, and aggregation

Y Gong, YA Chung, J Glass - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org

Audio tagging is an active research area and has a wide range of applications. Since the
release of AudioSet, great progress has been made in advancing model performance, which …

Cited by 152 - Related articles - All 6 versions

The sound of pixels

H Zhao, C Gan, A Rouditchenko… - Proceedings of the …, 2018 - openaccess.thecvf.com

We introduce PixelPlayer, a system that, by leveraging large amounts of unlabeled videos,
learns to locate image regions which produce sounds and separate the input sounds into a …

Cited by 566 - Related articles - All 10 versions

AudioCaps: Generating captions for audios in the wild

CD Kim, B Kim, H Lee, G Kim - … of the 2019 Conference of the …, 2019 - aclanthology.org

We explore the problem of Audio Captioning: generating natural language description for
any kind of audio in the wild, which has been surprisingly unexplored in previous research …

Cited by 373 - Related articles
