BLIVA: A simple multimodal LLM for better handling of text-rich visual questions

W Hu, Y Xu, Y Li, W Li, Z Chen, Z Tu - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Vision Language Models (VLMs), which extend Large Language Models (LLMs) by
incorporating visual understanding capability, have demonstrated significant advancements …

Effectiveness assessment of recent large vision-language models

Y Jiang, X Yan, GP Ji, K Fu, M Sun, H Xiong, DP Fan… - Visual Intelligence, 2024 - Springer
The advent of large vision-language models (LVLMs) represents a remarkable advance in
the quest for artificial general intelligence. However, the models' effectiveness in both …

LayoutNUWA: Revealing the hidden layout expertise of large language models

Z Tang, C Wu, J Li, N Duan - arXiv preprint arXiv:2309.09506, 2023 - arxiv.org
Graphic layout generation, a growing research field, plays a significant role in user
engagement and information perception. Existing methods primarily treat layout generation …

Towards understanding multi-task learning (generalization) of LLMs via detecting and exploring task-specific neurons

Y Leng, D Xiong - arXiv preprint arXiv:2407.06488, 2024 - arxiv.org
While large language models (LLMs) have demonstrated superior multi-task capabilities,
understanding the learning mechanisms behind this is still a challenging problem. In this …

Benchmarking LLM-based machine translation on cultural awareness

B Yao, M Jiang, D Yang, J Hu - arXiv preprint arXiv:2305.14328, 2023 - arxiv.org
Translating culture-specific content is crucial for effective cross-cultural communication.
However, many MT systems still struggle to translate sentences containing culture-specific …

Explicit Memory Learning with Expectation Maximization

Z Yin, Q Sun, Q Guo, Z Zeng, Q Cheng… - Proceedings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) have revolutionized the landscape of natural
language processing, demonstrating remarkable abilities across various complex tasks …