The Spotify podcast dataset

Textually pretrained speech language models

M Hassid, T Remez, TA Nguyen, I Gat… - Advances in …, 2024 - proceedings.neurips.cc

Speech language models (SpeechLMs) process and generate acoustic data only, without
textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using …

Cited by 55 Related articles All 5 versions

[PDF] arxiv.org

Audiobox: Unified audio generation with natural language prompts

A Vyas, B Shi, M Le, A Tjandra, YC Wu, B Guo… - arXiv preprint arXiv …, 2023 - arxiv.org

Audio is an essential part of our life, but creating it often requires expertise and is time-
consuming. Research communities have made great progress over the past year advancing …

Cited by 83 Related articles All 2 versions

[PDF] arxiv.org

Recent advances in speech language models: A survey

W Cui, D Yu, X Jiao, Z Meng, G Zhang, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) have recently garnered significant attention, primarily for
their capabilities in text-based interactions. However, natural human interaction often relies …

Cited by 3 Related articles All 2 versions

[PDF] aclanthology.org

SUMMEDITS: measuring LLM ability at factual reasoning through the lens of summarization

P Laban, W Kryściński, D Agarwal… - Proceedings of the …, 2023 - aclanthology.org

With the recent appearance of LLMs in practical settings, having methods that can effectively
detect factual inconsistencies is crucial to reduce the propagation of misinformation and …

Cited by 41 Related articles All 3 versions

[PDF] arxiv.org

LLMs as factual reasoners: Insights from existing benchmarks and beyond

P Laban, W Kryściński, D Agarwal, AR Fabbri… - arXiv preprint arXiv …, 2023 - arxiv.org

With the recent appearance of LLMs in practical settings, having methods that can effectively
detect factual inconsistencies is crucial to reduce the propagation of misinformation and …

Cited by 36 Related articles All 2 versions

Computational sociophonetics using automatic speech recognition

R Coto‐Solano - Language and Linguistics Compass, 2022 - Wiley Online Library

Recent years have seen numerous advances in natural language processing that can help
accelerate sociophonetic work. These include software to align speech recordings with their …

Cited by 6 Related articles All 2 versions

[PDF] eurasip.org

Audio summarization for podcasts

A Vartakavi, A Garg, Z Rafii - 2021 29th European signal …, 2021 - ieeexplore.ieee.org

We propose a novel system to automatically generate audio summaries for podcasts,
allowing listeners to quickly preview podcast episodes. The proposed system first …

Cited by 23 Related articles All 6 versions

[PDF] arxiv.org

Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models

N Shaul, U Singer, RTQ Chen, M Le, A Thabet… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper introduces Bespoke Non-Stationary (BNS) Solvers, a solver distillation approach
to improve sample efficiency of Diffusion and Flow models. BNS solvers are based on a …

Cited by 2 Related articles All 3 versions

[PDF] arxiv.org

A two-phase approach for abstractive podcast summarization

C Zheng, K Zhang, HJ Wang, L Fan - arXiv preprint arXiv:2011.08291, 2020 - arxiv.org

Podcast summarization is different from summarization of other data formats, such as news,
patents, and scientific papers in that podcasts are often longer, conversational, colloquial …

Cited by 13 Related articles All 4 versions

[PDF] arxiv.org

Towards proactive information retrieval in noisy text with Wikipedia concepts

T Ahmed, S Bulathwela - arXiv preprint arXiv:2210.09877, 2022 - arxiv.org

Extracting useful information from the user history to clearly understand informational needs
is a crucial feature of a proactive information retrieval system. Regarding understanding …

Cited by 3 Related articles All 5 versions
