- Academic resource search

Pengi: An audio language model for audio tasks

S Deshmukh, B Elizalde, R Singh… - Advances in Neural …, 2023 - proceedings.neurips.cc

In the domain of audio processing, Transfer Learning has facilitated the rise of Self-
Supervised Learning and Zero-Shot Learning techniques. These approaches have led to …

Cited by: 127 · Related articles · All 5 versions

[PDF] arxiv.org

Audio retrieval with natural language queries: A benchmark study

AS Koepke, AM Oncescu, JF Henriques… - IEEE Transactions …, 2022 - ieeexplore.ieee.org

The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the
goal is to retrieve the audio content from a pool of candidates that best matches a given …

Cited by: 114 · Related articles · All 10 versions

[PDF] arxiv.org

Hawkes processes for events in social media

MA Rizoiu, Y Lee, S Mishra, L Xie - Frontiers of multimedia research, 2017 - dl.acm.org

This chapter provides an accessible introduction for point processes, and especially Hawkes
processes, for modeling discrete, inter-dependent events over continuous time. We start by …

Cited by: 195 · Related articles · All 5 versions

[PDF] arxiv.org

Audio retrieval with natural language queries

AM Oncescu, A Koepke, JF Henriques, Z Akata… - arXiv preprint arXiv …, 2021 - arxiv.org

We consider the task of retrieving audio using free-form natural language queries. To study
this problem, which has received limited attention in the existing literature, we introduce …

Cited by: 90 · Related articles · All 13 versions

Ubicoustics: Plug-and-play acoustic activity recognition

G Laput, K Ahuja, M Goel, C Harrison - Proceedings of the 31st Annual …, 2018 - dl.acm.org

Despite sound being a rich source of information, computing devices with microphones do
not leverage audio to glean useful insights about their physical and social context. For …

Cited by: 147 · Related articles

[PDF] arxiv.org

Deep learning for video classification and captioning

Z Wu, T Yao, Y Fu, YG Jiang - Frontiers of multimedia research, 2017 - dl.acm.org

Part I: Multimedia Content Analysis. Chapter 1: Deep Learning for Video Classification and …

Cited by: 166 · Related articles · All 6 versions

Privacymic: Utilizing inaudible frequencies for privacy preserving daily activity recognition

Y Iravantchi, K Ahuja, M Goel, C Harrison… - Proceedings of the 2021 …, 2021 - dl.acm.org

Sound presents an invaluable signal source that enables computing systems to perform
daily activity recognition. However, microphones are optimized for human speech and …

Cited by: 37 · Related articles

[PDF] archive.org

Cross modal audio search and retrieval with joint embeddings based on text and audio

B Elizalde, S Zarar, B Raj - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

Existing audio search engines use one of two approaches: matching text-text or audio-audio
pairs. In the former, text queries are matched to semantically similar words in an index of …

Cited by: 66 · Related articles · All 2 versions

[PDF] acm.org

Automated class discovery and one-shot interactions for acoustic activity recognition

J Wu, C Harrison, JP Bigham, G Laput - … of the 2020 CHI Conference on …, 2020 - dl.acm.org

Acoustic activity recognition has emerged as a foundational element for imbuing devices
with context-driven capabilities, enabling richer, more assistive, and more accommodating …

Cited by: 44 · Related articles · All 8 versions

[PDF] arxiv.org

Audio retrieval with wavtext5k and clap training

S Deshmukh, B Elizalde, H Wang - arXiv preprint arXiv:2209.14275, 2022 - arxiv.org

Audio-Text retrieval takes a natural language query to retrieve relevant audio files in a
database. Conversely, Text-Audio retrieval takes an audio file as a query to retrieve relevant …

Cited by: 49 · Related articles · All 4 versions

