

Retrieval-based Evaluation for LLMs: A Case Study in Korean Legal QA

Authors

Cheol Ryu, Seolhwa Lee, Subeen Pang, Chanyeol Choi, Hojun Choi, Myeonggee Min, Jy-Yong Sohn

Publication date

2023/12

Conference

Proceedings of the Natural Legal Language Processing Workshop 2023

Pages

132-137

Description

While large language models (LLMs) have demonstrated significant capabilities in text generation, their utilization in areas requiring domain-specific expertise, such as law, must be approached cautiously. This caution is warranted due to the inherent challenges associated with LLM-generated texts, including the potential presence of factual errors. Motivated by this issue, we propose Eval-RAG, a new evaluation method for LLM-generated texts. Unlike existing methods, Eval-RAG evaluates the validity of generated texts based on the related documents that are collected by the retriever. In other words, Eval-RAG adopts the idea of retrieval augmented generation (RAG) for the purpose of evaluation. Our experimental results on Korean Legal Question-Answering (QA) tasks show that conventional LLM-based evaluation methods can be better aligned with lawyers’ evaluations when combined with Eval-RAG. In addition, our qualitative analysis shows that Eval-RAG successfully finds the factual errors in LLM-generated texts, while existing evaluation methods cannot.
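The core idea in the abstract, grounding the evaluation of a generated answer in retrieved documents, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the bag-of-words retriever, the function names (`retrieve`, `build_eval_prompt`), the prompt wording, and the toy corpus are hypothetical and are not taken from the paper, whose actual retriever and evaluator are not described in this record.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase word tokenizer (a stand-in for a real retriever's analyzer)."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the k corpus documents most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_eval_prompt(question, answer, corpus, k=2):
    """Assemble an evaluation prompt that includes the retrieved documents.

    The prompt text is hypothetical; a real evaluator LLM would receive it
    and return a judgment, which is outside the scope of this sketch.
    """
    docs = retrieve(question + " " + answer, corpus, k)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return (
        "Reference documents:\n" + context + "\n\n"
        "Question: " + question + "\n"
        "Answer to evaluate: " + answer + "\n"
        "Judge whether the answer is consistent with the reference documents."
    )


# Invented toy sentences, not actual statutes or case law.
corpus = [
    "The lease deposit is returned when the lease ends.",
    "A will must be signed by the testator.",
    "Traffic fines are paid within the stated period.",
]
question = "When is the lease deposit returned?"
answer = "The deposit is returned at the end of the lease."
prompt = build_eval_prompt(question, answer, corpus, k=1)
```

In this sketch the evaluator sees the retrieved text alongside the answer, so a factual error can in principle be checked against a source rather than against the evaluator's own recall.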

Total citations

Cited by 2

2024: 2
