J Zhang, H Bu, H Wen, Y Chen, L Li, H Zhu - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancements in large language models (LLMs) have opened new avenues across various fields, including cybersecurity, which faces an ever-evolving threat landscape …
Large Language Model (LLM) agents, capable of performing a broad range of actions, such as invoking tools and controlling robots, show great potential in tackling real-world …
We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision. UFO employs a …
In this paper, we introduce "InfiAgent-DABench", the first benchmark specifically designed to evaluate LLM-based agents in data analysis tasks. This benchmark contains DAEval, a …
C Zhang, Z Ma, Y Wu, S He, S Qin, M Ma, X Qin… - arXiv preprint arXiv …, 2024 - arxiv.org
Verbatim feedback constitutes a valuable repository of user experiences, opinions, and requirements essential for software development. Effectively and efficiently extracting …
B Weng - arXiv preprint arXiv:2404.09022, 2024 - arxiv.org
With the surge of ChatGPT, the use of large models has significantly increased, rapidly rising to prominence across the industry and sweeping across the internet. This article is a …
S Cheng, Z Zhuang, Y Xu, F Yang, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graph and table. Such tasks typically require multi-hop …
Z Song, Y Li, M Fang, Z Chen, Z Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous virtual agents are often limited by their singular mode of interaction with real-world environments, restricting their versatility. To address this, we propose the Multi-Modal …
H Yang, B Zhang, N Wang, C Guo, X Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
As financial institutions and professionals increasingly incorporate Large Language Models (LLMs) into their workflows, substantial barriers, including proprietary data and specialized …