Pengi: An audio language model for audio tasks

S Deshmukh, B Elizalde, R Singh… - Advances in Neural …, 2023 - proceedings.neurips.cc
In the domain of audio processing, Transfer Learning has facilitated the rise of Self-
Supervised Learning and Zero-Shot Learning techniques. These approaches have led to …

Audio retrieval with natural language queries: A benchmark study

AS Koepke, AM Oncescu, JF Henriques… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the
goal is to retrieve the audio content from a pool of candidates that best matches a given …

Hawkes processes for events in social media

MA Rizoiu, Y Lee, S Mishra, L Xie - Frontiers of multimedia research, 2017 - dl.acm.org
This chapter provides an accessible introduction for point processes, and especially Hawkes
processes, for modeling discrete, inter-dependent events over continuous time. We start by …

Audio retrieval with natural language queries

AM Oncescu, A Koepke, JF Henriques, Z Akata… - arXiv preprint arXiv …, 2021 - arxiv.org
We consider the task of retrieving audio using free-form natural language queries. To study
this problem, which has received limited attention in the existing literature, we introduce …

Ubicoustics: Plug-and-play acoustic activity recognition

G Laput, K Ahuja, M Goel, C Harrison - Proceedings of the 31st Annual …, 2018 - dl.acm.org
Despite sound being a rich source of information, computing devices with microphones do
not leverage audio to glean useful insights about their physical and social context. For …

Deep learning for video classification and captioning

Z Wu, T Yao, Y Fu, YG Jiang - Frontiers of multimedia research, 2017 - dl.acm.org
Part I: Multimedia Content Analysis. Chapter 1: Deep Learning for Video Classification and …

Privacymic: Utilizing inaudible frequencies for privacy preserving daily activity recognition

Y Iravantchi, K Ahuja, M Goel, C Harrison… - Proceedings of the 2021 …, 2021 - dl.acm.org
Sound presents an invaluable signal source that enables computing systems to perform
daily activity recognition. However, microphones are optimized for human speech and …

Cross modal audio search and retrieval with joint embeddings based on text and audio

B Elizalde, S Zarar, B Raj - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Existing audio search engines use one of two approaches: matching text-text or audio-audio
pairs. In the former, text queries are matched to semantically similar words in an index of …

Automated class discovery and one-shot interactions for acoustic activity recognition

J Wu, C Harrison, JP Bigham, G Laput - … of the 2020 CHI Conference on …, 2020 - dl.acm.org
Acoustic activity recognition has emerged as a foundational element for imbuing devices
with context-driven capabilities, enabling richer, more assistive, and more accommodating …

Audio retrieval with wavtext5k and clap training

S Deshmukh, B Elizalde, H Wang - arXiv preprint arXiv:2209.14275, 2022 - arxiv.org
Audio-Text retrieval takes a natural language query to retrieve relevant audio files in a
database. Conversely, Text-Audio retrieval takes an audio file as a query to retrieve relevant …