Textually pretrained speech language models

M Hassid, T Remez, TA Nguyen, I Gat… - Advances in …, 2024 - proceedings.neurips.cc
Speech language models (SpeechLMs) process and generate acoustic data only, without
textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using …

Audiobox: Unified audio generation with natural language prompts

A Vyas, B Shi, M Le, A Tjandra, YC Wu, B Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio is an essential part of our life, but creating it often requires expertise and is time-consuming.
Research communities have made great progress over the past year advancing …

Recent advances in speech language models: A survey

W Cui, D Yu, X Jiao, Z Meng, G Zhang, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have recently garnered significant attention, primarily for
their capabilities in text-based interactions. However, natural human interaction often relies …

SUMMEDITS: measuring LLM ability at factual reasoning through the lens of summarization

P Laban, W Kryściński, D Agarwal… - Proceedings of the …, 2023 - aclanthology.org
With the recent appearance of LLMs in practical settings, having methods that can effectively
detect factual inconsistencies is crucial to reduce the propagation of misinformation and …

LLMs as factual reasoners: Insights from existing benchmarks and beyond

P Laban, W Kryściński, D Agarwal, AR Fabbri… - arXiv preprint arXiv …, 2023 - arxiv.org
With the recent appearance of LLMs in practical settings, having methods that can effectively
detect factual inconsistencies is crucial to reduce the propagation of misinformation and …

Computational sociophonetics using automatic speech recognition

R Coto‐Solano - Language and Linguistics Compass, 2022 - Wiley Online Library
Recent years have seen numerous advances in natural language processing that can help
accelerate sociophonetic work. These include software to align speech recordings with their …

Audio summarization for podcasts

A Vartakavi, A Garg, Z Rafii - 2021 29th European signal …, 2021 - ieeexplore.ieee.org
We propose a novel system to automatically generate audio summaries for podcasts,
allowing listeners to quickly preview podcast episodes. The proposed system first …

Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models

N Shaul, U Singer, RTQ Chen, M Le, A Thabet… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces Bespoke Non-Stationary (BNS) Solvers, a solver distillation approach
to improve sample efficiency of Diffusion and Flow models. BNS solvers are based on a …

A two-phase approach for abstractive podcast summarization

C Zheng, K Zhang, HJ Wang, L Fan - arXiv preprint arXiv:2011.08291, 2020 - arxiv.org
Podcast summarization is different from summarization of other data formats, such as news,
patents, and scientific papers in that podcasts are often longer, conversational, colloquial …

Towards proactive information retrieval in noisy text with Wikipedia concepts

T Ahmed, S Bulathwela - arXiv preprint arXiv:2210.09877, 2022 - arxiv.org
Extracting useful information from the user history to clearly understand informational needs
is a crucial feature of a proactive information retrieval system. Regarding understanding …