A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations

MTR Laskar, S Alqahtani, MS Bari… - Proceedings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …

A systematic study and comprehensive evaluation of ChatGPT on benchmark datasets

MTR Laskar, MS Bari, M Rahman… - arXiv preprint arXiv …, 2023 - arxiv.org
The development of large language models (LLMs) such as ChatGPT has brought a lot of
attention recently. However, their evaluation in the benchmark academic datasets remains …

LeanContext: Cost-efficient domain-specific question answering using LLMs

MA Arefeen, B Debnath, S Chakradhar - Natural Language Processing …, 2024 - Elsevier
Question-answering (QA) is a significant application of Large Language Models (LLMs),
shaping chatbot capabilities across healthcare, education, and customer service. However …

Can large language models fix data annotation errors? An empirical study using Debatepedia for query-focused text summarization

MTR Laskar, M Rahman, I Jahan… - Findings of the …, 2023 - aclanthology.org
Debatepedia is a publicly available dataset consisting of arguments and counter-arguments
on controversial topics that has been widely used for the single-document query-focused …

ChatGPT Label: Comparing the Quality of Human-Generated and LLM-Generated Annotations in Low-resource Language NLP Tasks

AH Nasution, A Onan - IEEE Access, 2024 - ieeexplore.ieee.org
This research paper presents a comprehensive comparative study assessing the quality of
annotations in Turkish, Indonesian, and Minangkabau Natural Language Processing (NLP) …

MLDT: Multi-level decomposition for complex long-horizon robotic task planning with open-source large language model

Y Wu, J Zhang, N Hu, L Tang, G Qi, J Shao… - … on Database Systems …, 2024 - Springer
In the realm of data-driven AI technology, the application of open-source large language
models (LLMs) in robotic task planning represents a significant milestone. Recent robotic …

Dilated convolution for enhanced extractive summarization: A GAN-based approach with BERT word embedding

H Wu - Journal of Intelligent & Fuzzy Systems, 2024 - content.iospress.com
Text summarization (TS) plays a crucial role in natural language processing (NLP) by
automatically condensing and capturing key information from text documents. Its …

BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence

J Jin, Y Zhu, Y Zhou, Z Dou - arXiv preprint arXiv:2402.12174, 2024 - arxiv.org
Retrieval-augmented large language models (LLMs) have demonstrated efficacy in
knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in …

Evaluating the Use of Generative LLMs for Intralingual Diachronic Translation of Middle-Polish Texts into Contemporary Polish

C Klamra, K Kryńska, M Ogrodniczuk - International Conference on Asian …, 2023 - Springer
This paper presents efforts towards creating a tool for translating texts from Middle Polish
into modern Polish. Archaic texts sourced from the CBDU digital library were translated into …

QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs

W Zhang, V Pal, JH Huang, E Kanoulas… - arXiv preprint arXiv …, 2024 - arxiv.org
Table summarization is a crucial task aimed at condensing information from tabular data into
concise and comprehensible textual summaries. However, existing approaches often fall …