Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Large language models for information retrieval: A survey

Y Zhu, H Yuan, S Wang, J Liu, W Liu, C Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Document expansion by query prediction

R Nogueira, W Yang, J Lin, K Cho - arXiv preprint arXiv:1904.08375, 2019 - arxiv.org
One technique to improve the retrieval effectiveness of a search engine is to expand
documents with terms that are related or representative of the documents' content. From the …

An introduction to neural information retrieval

B Mitra, N Craswell - Foundations and Trends® in Information …, 2018 - nowpublishers.com
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to
rank search results in response to a query. Traditional learning to rank models employ …

Pre-training methods in information retrieval

Y Fan, X Xie, Y Cai, J Chen, X Ma, X Li… - … and Trends® in …, 2022 - nowpublishers.com
The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to user's information need. In recent years …

RankVicuna: Zero-shot listwise document reranking with open-source large language models

R Pradeep, S Sharifymoghaddam, J Lin - arXiv preprint arXiv:2309.15088, 2023 - arxiv.org
Researchers have successfully applied large language models (LLMs) such as ChatGPT to
reranking in an information retrieval context, but to date, such work has mostly been built on …

Generation-augmented retrieval for open-domain question answering

Y Mao, P He, X Liu, Y Shen, J Gao, J Han… - arXiv preprint arXiv …, 2020 - arxiv.org
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions,
which augments a query through text generation of heuristically discovered relevant …

Anserini: Reproducible ranking baselines using Lucene

P Yang, H Fang, J Lin - Journal of Data and Information Quality (JDIQ), 2018 - dl.acm.org
This work tackles the perennial problem of reproducible baselines in information retrieval
research, focusing on bag-of-words ranking models. Although academic information …

Simple applications of BERT for ad hoc document retrieval

W Yang, H Zhang, J Lin - arXiv preprint arXiv:1903.10972, 2019 - arxiv.org
Following recent successes in applying BERT to question answering, we explore simple
applications to ad hoc document retrieval. This required confronting the challenge posed by …