Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

S Wang, W Yu, Y Yang, C Tang, Y Li, J Zhuang… - arXiv preprint arXiv …, 2024 - arxiv.org
Speech quality assessment typically requires evaluating audio from multiple aspects, such
as mean opinion score (MOS) and speaker similarity (SIM), which can be challenging to …

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

C Tang, Y Li, Y Yang, J Zhuang, G Sun, W Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Videos contain a wealth of information, and generating detailed and accurate descriptions in
natural language is a key aspect of video understanding. In this paper, we present video …

DigiMindReady: Enhancing Military Readiness through Edge AI-Driven Wellness, Education, and Digital Discipline via Privacy-First mHealth Innovation

MM Hasan - 2024 - digitalcommons.kennesaw.edu
Military personnel often find themselves in intense situations that require high focus.
Successfully engaging in these dangerous missions means they must efficiently control …