An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large...

H Yu, L Fan, L Li, J Zhou, Z Ma, L Xian, W Hua… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) have rapidly become important tools in Biomedical and
Health Informatics (BHI), enabling new ways to analyze data, treat patients, and conduct …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Towards Automatic Evaluation for LLMs' Clinical Capabilities: Metric, Data, and Algorithm

L Liu, X Yang, F Li, C Chi, Y Shen, S Lyu… - Proceedings of the 30th …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing interests to improve clinical
efficiency, owing to their unprecedented performance in modelling natural language …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Can You Trust LLM Judgments? Reliability of LLM-as-a-Judge

K Schroeder, Z Wood-Doughty - arXiv preprint arXiv:2412.12509, 2024 - arxiv.org

Large Language Models (LLMs) have become increasingly powerful and ubiquitous, but
their stochastic nature poses challenges to the reliability of their outputs. While deterministic …

A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

L Liu, X Yang, J Lei, X Liu, Y Shen, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs), such as GPT series models, have received substantial
attention due to their impressive capabilities for generating and understanding human-level …

Cited by 19 · Related articles · All 2 versions

[PDF] aclanthology.org

Interactive Evaluation for Medical LLMs via Task-oriented Dialogue System

R Liu, K Xue, X Zhang, S Zhang - Proceedings of the 31st …, 2025 - aclanthology.org

This study focuses on evaluating proactive communication and diagnostic capabilities of
medical Large Language Models (LLMs), which directly impact their effectiveness in patient …

[PDF] arxiv.org

Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm

H Liu, Y Liao, S Ou, Y Wang, H Liu, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

The application of the Multi-modal Large Language Models (MLLMs) in medical clinical
scenarios remains underexplored. Previous benchmarks only focus on the capacity of the …

Implementing prompt engineering and retrieval augmented generation in PentestGPT with local and open-source large language models

AL Espenes, A Trøan - 2024 - uia.brage.unit.no

Recently various machine learning and Large Language Model tools have been
popularized for their ability to solve simple tasks such as grammar correction, text …
