Chain-of-note: Enhancing robustness in retrieval-augmented language models

W Yu, H Zhang, X Pan, K Ma, H Wang, D Yu - arXiv preprint arXiv …, 2023 - arxiv.org
Retrieval-augmented language models (RALMs) represent a substantial advancement in
the capabilities of large language models, notably in reducing factual hallucination by …
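For illustration, a minimal sketch of the chain-of-note pattern: the model drafts a short reading note per retrieved passage before answering, so noisy retrievals can be discounted. The `call_llm` client and the prompt wording are assumptions, not the paper's exact prompts.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM client."""
    raise NotImplementedError("plug in an LLM client here")

def chain_of_note_answer(question: str, passages: list[str]) -> str:
    # Step 1: write one reading note per retrieved passage.
    notes = []
    for i, passage in enumerate(passages, 1):
        notes.append(call_llm(
            f"Question: {question}\nPassage {i}: {passage}\n"
            "Write a brief note: what, if anything, does this passage say about "
            "the question? If it is irrelevant or unreliable, say so."
        ))
    # Step 2: answer from the notes, so irrelevant passages can be discounted.
    joined = "\n".join(f"Note {i}: {n}" for i, n in enumerate(notes, 1))
    return call_llm(
        f"Question: {question}\n{joined}\n"
        "Using only the relevant notes above, answer the question."
    )
```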

Dense X Retrieval: What retrieval granularity should we use?

T Chen, H Wang, S Chen, W Yu, K Ma, X Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Dense retrieval has become a prominent method to obtain relevant context or world
knowledge in open-domain NLP tasks. When we use a learned dense retriever on a …
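For illustration, a minimal sketch of comparing retrieval granularities: the corpus is indexed either as whole passages or as finer units (a naive sentence split standing in for the paper's propositions), and fine-grained hits are mapped back to their source passages. The `embed` encoder is a hypothetical placeholder.

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    """Hypothetical dense encoder returning L2-normalized rows."""
    raise NotImplementedError("plug in a dense retriever encoder here")

def retrieve(query: str, passages: list[str],
             granularity: str = "proposition", k: int = 5) -> list[int]:
    if granularity == "passage":
        units, owners = list(passages), list(range(len(passages)))
    else:
        # Naive sentence split as a cheap proxy for proposition extraction.
        units, owners = [], []
        for pid, passage in enumerate(passages):
            for sentence in passage.split(". "):
                units.append(sentence)
                owners.append(pid)
    scores = embed(units) @ embed([query])[0]  # cosine similarity via dot product
    top = np.argsort(-scores)[:k]
    # Map the best-matching units back to their source passages, deduplicated.
    return list(dict.fromkeys(owners[i] for i in top))
```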

Sub-sentence encoder: Contrastive learning of propositional semantic representations

S Chen, H Zhang, T Chen, B Zhou, W Yu, D Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce the sub-sentence encoder, a contrastively learned contextual embedding model
for fine-grained semantic representation of text. In contrast to the standard practice with …
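For illustration, a plain-NumPy sketch of an in-batch contrastive (InfoNCE) objective of the kind used to train such encoders; the sub-sentence spans, the encoder itself, and the temperature value are all assumptions, not the paper's exact setup.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray,
                  temperature: float = 0.05) -> float:
    """In-batch contrastive loss: row i of `anchors` should match row i of
    `positives` (both L2-normalized); other rows act as negatives."""
    logits = anchors @ positives.T / temperature   # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())       # true pairs on the diagonal
```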

Think and retrieval: A hypothesis knowledge graph enhanced medical large language models

X Jiang, R Zhang, Y Xu, R Qiu, Y Fang, Z Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
We explore how the rise of Large Language Models (LLMs) significantly impacts task
performance in the field of Natural Language Processing. We focus on two strategies …

Pre-training multi-task contrastive learning models for scientific literature understanding

Y Zhang, H Cheng, Z Shen, X Liu, YY Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Scientific literature understanding tasks have gained significant attention due to their
potential to accelerate scientific discovery. Pre-trained language models (LMs) have shown …

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

D Yang, J Rao, K Chen, X Guo, Y Zhang… - Proceedings of the 47th …, 2024 - dl.acm.org
Although Retrieval-Augmented Generation (RAG) paradigms can use external
knowledge to enhance and ground the outputs of Large Language Models (LLMs) to …
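For illustration, a minimal sketch of a multi-round retrieve-or-answer loop in the spirit of an inner monologue; `call_llm` and `search` are hypothetical placeholders, and the paper's reinforcement-learning training of this loop is omitted.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def search(query: str) -> str:
    raise NotImplementedError("plug in a retriever here")

def im_rag_answer(question: str, max_rounds: int = 4) -> str:
    evidence: list[str] = []
    for _ in range(max_rounds):
        # The model decides each round whether to keep searching or to answer.
        step = call_llm(
            f"Question: {question}\nEvidence so far: {evidence}\n"
            "Reply 'SEARCH: <query>' to gather more evidence, "
            "or 'ANSWER: <answer>' if the evidence suffices."
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        if step.startswith("SEARCH:"):
            evidence.append(search(step.removeprefix("SEARCH:").strip()))
    # Round budget exhausted: answer from whatever was gathered.
    return call_llm(f"Question: {question}\nEvidence: {evidence}\nGive your best answer.")
```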

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Z Jiang, X Ma, W Chen - arXiv preprint arXiv:2406.15319, 2024 - arxiv.org
In the traditional RAG framework, the basic retrieval units are normally short. Common
retrievers like DPR typically work with 100-word Wikipedia paragraphs. Such a design …
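For illustration, a minimal sketch of the long-retrieval-unit idea: adjacent short passages are grouped into much larger units so a long-context reader sees fewer, more complete chunks. The word-count budget here is a simplification of the paper's token-based grouping.

```python
def build_long_units(passages: list[str], max_words: int = 3000) -> list[str]:
    """Greedily pack consecutive passages into long retrieval units."""
    units, current, size = [], [], 0
    for passage in passages:
        n = len(passage.split())
        if current and size + n > max_words:
            units.append(" ".join(current))   # flush the full unit
            current, size = [], 0
        current.append(passage)
        size += n
    if current:
        units.append(" ".join(current))
    return units
```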

SPAGHETTI: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing

HC Zhang, SJ Semnani, F Ghassemi, J Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce SPAGHETTI: Semantic Parsing Augmented Generation for Hybrid English
information from Text, Tables, and Infoboxes, a hybrid question-answering (QA) pipeline that …
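For illustration, a minimal sketch of a hybrid pipeline that combines a semantic parser over structured sources with a retriever over free text and lets an LLM adjudicate; all three components are hypothetical placeholders, not the paper's implementation.

```python
def semantic_parse_answer(question: str) -> str | None:
    """Hypothetical parser over structured sources, e.g. text-to-SQL on tables."""
    raise NotImplementedError

def retrieval_answer(question: str) -> str | None:
    """Hypothetical retrieve-and-read pipeline over free text."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def hybrid_answer(question: str) -> str:
    # Gather candidates from both routes; either may come back empty.
    candidates = [a for a in (semantic_parse_answer(question),
                              retrieval_answer(question)) if a]
    if len(candidates) == 1:
        return candidates[0]
    return call_llm(
        f"Question: {question}\nCandidate answers: {candidates}\n"
        "Choose or synthesize the best-supported answer."
    )
```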

A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion

Y Shen, Y Zhang, Y Zhang, J Han - arXiv preprint arXiv:2402.13405, 2024 - arxiv.org
Entity Set Expansion, Taxonomy Expansion, and Seed-Guided Taxonomy Construction are
three representative tasks that can be used to automatically populate an existing taxonomy …

Question Answering with Texts and Tables through Deep Reinforcement Learning

MM José, FN Cação, MF Ribeiro, RM Cheang… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper proposes a novel architecture for generating multi-hop answers to open-domain
questions that require information from both texts and tables, using the Open Table-and-Text …