CLAR: Contrastive learning of auditory representations

S Liu, A Mallol-Ragolta, E Parada-Cabaleiro, K Qian… - Patterns, 2022 - cell.com

Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …

Cited by 126 · Related articles · All 12 versions

Contrastive self-supervised learning: review, progress, challenges and future research directions

P Kumar, P Rawat, S Chauhan - International Journal of Multimedia …, 2022 - Springer

In the last decade, deep supervised learning has had tremendous success. However, its
flaws, such as its dependency on manual and costly annotations on large datasets and …

Cited by 55 · Related articles · All 2 versions

BEATs: Audio pre-training with acoustic tokenizers

S Chen, Y Wu, C Wang, S Liu, D Tompkins… - arXiv preprint arXiv …, 2022 - arxiv.org

The massive growth of self-supervised learning (SSL) has been witnessed in language,
vision, speech, and audio domains over the past few years. While discrete label prediction is …

Cited by 277 · Related articles · All 8 versions

Contrastive learning based self-supervised time-series analysis

J Pöppelbaum, GS Chadha, A Schwung - Applied Soft Computing, 2022 - Elsevier

Deep learning architectures usually require large scale labeled datasets for achieving good
performance on general classification tasks including computer vision and natural language …

Cited by 96 · Related articles · All 2 versions

Contrastive learning of musical representations

J Spijkervet, JA Burgoyne - arXiv preprint arXiv:2103.09410, 2021 - arxiv.org

While deep learning has enabled great advances in many areas of music, labeled music
datasets remain especially hard, expensive, and time-consuming to create. In this work, we …

Cited by 143 · Related articles · All 5 versions

Domain‐specific neural networks improve automated bird sound recognition already with small amount of local data

P Lauha, P Somervuo, P Lehikoinen… - Methods in Ecology …, 2022 - Wiley Online Library

An automatic bird sound recognition system is a useful tool for collecting data of different
bird species for ecological analysis. Together with autonomous recording units (ARUs), such …

Cited by 34 · Related articles · All 13 versions

Contrastive audio-language learning for music

I Manco, E Benetos, E Quinton, G Fazekas - arXiv preprint arXiv …, 2022 - arxiv.org

As one of the most intuitive interfaces known to humans, natural language has the potential
to mediate many tasks that involve human-computer interaction, especially in application …

Cited by 55 · Related articles · All 8 versions

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Cited by 12 · Related articles · All 4 versions

Deep learning methods for abstract visual reasoning: A survey on Raven's progressive matrices

M Małkiński, J Mańdziuk - ACM Computing Surveys, 2022 - dl.acm.org

Abstract visual reasoning (AVR) domain encompasses problems solving which requires the
ability to reason about relations among entities present in a given scene. While humans …

Cited by 38 · Related articles · All 3 versions

ASiT: Local-global audio spectrogram vision transformer for event classification

S Atito, M Awais, W Wang… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org

Transformers, which were originally developed for natural language processing, have
recently generated significant interest in the computer vision and audio communities due to …

Cited by 6 · Related articles · All 5 versions
