Can Large Language Models Understand Spatial Audio?

3 results (0.01 seconds)

Citing articles:

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

S Wang, W Yu, Y Yang, C Tang, Y Li, J Zhuang… - arXiv preprint arXiv …, 2024 - arxiv.org

Speech quality assessment typically requires evaluating audio from multiple aspects, such
as mean opinion score (MOS) and speaker similarity (SIM) etc., which can be challenging to …

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

C Tang, Y Li, Y Yang, J Zhuang, G Sun, W Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Videos contain a wealth of information, and generating detailed and accurate descriptions in
natural language is a key aspect of video understanding. In this paper, we present video …

DigiMindReady: Enhancing Military Readiness through Edge AI-Driven Wellness, Education, and Digital Discipline via Privacy-First mHealth Innovation

MM Hasan - 2024 - digitalcommons.kennesaw.edu

Military personnel often find themselves in intense situations that require high focus.
Successfully engaging in these dangerous missions means they must efficiently control …
