A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

ERNIE 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation

Y Sun, S Wang, S Feng, S Ding, C Pang… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained models have achieved state-of-the-art results in various Natural Language
Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up …

Promptagator: Few-shot dense retrieval from 8 examples

Z Dai, VY Zhao, J Ma, Y Luan, J Ni, J Lu… - arXiv preprint arXiv …, 2022 - arxiv.org
Much recent research on information retrieval has focused on how to transfer from one task
(typically with abundant supervised data) to various other tasks where supervision is limited …

RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering

Y Qu, Y Ding, J Liu, K Liu, R Ren, WX Zhao… - arXiv preprint arXiv …, 2020 - arxiv.org
In open-domain question answering, dense passage retrieval has become a new paradigm
to retrieve relevant passages for finding answers. Typically, the dual-encoder architecture is …

Approximate nearest neighbor negative contrastive learning for dense text retrieval

L Xiong, C Xiong, Y Li, KF Tang, J Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Conducting text retrieval in a dense learned representation space has many intriguing
advantages over sparse retrieval. Yet the effectiveness of dense retrieval (DR) often requires …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to users' queries in natural language. From …

[Book][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

ColBERT: Efficient and effective passage search via contextualized late interaction over BERT

O Khattab, M Zaharia - Proceedings of the 43rd International ACM SIGIR …, 2020 - dl.acm.org
Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances
in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for …

XLNet: Generalized autoregressive pretraining for language understanding

Z Yang, Z Dai, Y Yang, J Carbonell… - Advances in neural …, 2019 - proceedings.neurips.cc
With the capability of modeling bidirectional contexts, denoising autoencoding based
pretraining like BERT achieves better performance than pretraining approaches based on …