AssistGUI: Task-Oriented PC Graphical User Interface Automation

D Gao, L Ji, Z Bai, M Ouyang, P Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Graphical User Interface (GUI) automation holds significant promise for assisting
users with complex tasks, thereby boosting human productivity. Existing works leveraging …

AssistGUI: Task-oriented desktop graphical user interface automation

D Gao, L Ji, Z Bai, M Ouyang, P Li, D Mao, Q Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Graphical User Interface (GUI) automation holds significant promise for assisting users with
complex tasks, thereby boosting human productivity. Existing works leveraging Large …

DroidBot-GPT: GPT-powered UI automation for Android

H Wen, H Wang, J Liu, Y Li - arXiv preprint arXiv:2304.07061, 2023 - arxiv.org
This paper introduces DroidBot-GPT, a tool that utilizes GPT-like large language models
(LLMs) to automate the interactions with Android mobile applications. Given a natural …

Android in the Zoo: Chain-of-Action-Thought for GUI agents

J Zhang, J Wu, Y Teng, M Liao, N Xu, X Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language model (LLM) leads to a surge of autonomous GUI agents for smartphone,
which completes a task triggered by natural language through predicting a sequence of …

SeeClick: Harnessing GUI grounding for advanced visual GUI agents

K Cheng, Q Sun, Y Chu, F Xu, Y Li, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Graphical User Interface (GUI) agents are designed to automate complex tasks on digital
devices, such as smartphones and desktops. Most existing GUI agents interact with the …

GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices

Q Lu, W Shao, Z Liu, F Meng, B Li, B Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Smartphone users often navigate across multiple applications (apps) to complete tasks such
as sharing content between social media platforms. Autonomous Graphical User Interface …

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Automation Task Evaluation

L Zhang, S Wang, X Jia, Z Zheng, Y Yan, L Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergent large language/multimodal models facilitate the evolution of mobile agents,
especially in the task of mobile UI automation. However, existing evaluation approaches …

AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

C Rawles, S Clinckemaillie, Y Chang, J Waltz… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous agents that execute human tasks by controlling computers can enhance
human productivity and application accessibility. Yet, progress in this field will be driven by …

UINav: A maker of UI automation agents

W Li, FL Hsu, W Bishop, F Campbell-Ajala… - arXiv preprint arXiv …, 2023 - arxiv.org
An automation system that can execute natural language instructions by driving the user
interface (UI) of an application can benefit users, especially when situationally or …

VASTA: a vision and language-assisted smartphone task automation system

AR Sereshkeh, G Leung, K Perumal, C Phillips… - Proceedings of the 25th …, 2020 - dl.acm.org
We present VASTA, a novel vision and language-assisted Programming By Demonstration
(PBD) system for smartphone task automation. Development of a robust PBD automation …