CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

3 results (0.02 seconds)

Progressive multimodal reasoning via active retrieval

G Dong, C Zhang, M Deng, Y Zhu, Z Dou… - arXiv preprint arXiv …, 2024 - arxiv.org

Multi-step multimodal reasoning tasks pose significant challenges for multimodal large
language models (MLLMs), and finding effective ways to enhance their performance in such …

Cited by 2; all 2 versions

Simrag: Self-improving retrieval-augmented generation for adapting large language models to specialized domains

R Xu, H Liu, S Nag, Z Dai, Y Xie, X Tang, C Luo… - arXiv preprint arXiv …, 2024 - arxiv.org

Retrieval-augmented generation (RAG) enhances the question-answering (QA) abilities of
large language models (LLMs) by integrating external knowledge. However, adapting …

Cited by 2; all 2 versions

SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models

M Diao, R Li, S Liu, G Liao, J Wang, X Cai… - arXiv preprint arXiv …, 2024 - arxiv.org

As large language models (LLMs) continue to advance in capability and influence, ensuring
their security and preventing harmful outputs has become crucial. A promising approach to …
