Autonomous chemical research with large language models

DA Boiko, R MacKnight, B Kline, G Gomes - Nature, 2023 - nature.com
Transformer-based large language models are making significant strides in various fields,
such as natural language processing, biology, chemistry and computer programming …

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

SPLADE: Sparse lexical and expansion model for first stage ranking

T Formal, B Piwowarski, S Clinchant - Proceedings of the 44th …, 2021 - dl.acm.org
In neural Information Retrieval, ongoing research is directed towards improving the first
retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using …

SPLADE v2: Sparse lexical and expansion model for information retrieval

T Formal, C Lassance, B Piwowarski… - arXiv preprint arXiv …, 2021 - arxiv.org
In neural Information Retrieval (IR), ongoing research is directed towards improving the first
retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using …

An efficiency study for SPLADE models

C Lassance, S Clinchant - Proceedings of the 45th International ACM …, 2022 - dl.acm.org
Latency and efficiency issues are often overlooked when evaluating IR models based on
Pretrained Language Models (PLMs) because of multiple hardware and software testing …

A proposed conceptual framework for a representational approach to information retrieval

J Lin - ACM SIGIR Forum, 2022 - dl.acm.org
This paper outlines a conceptual framework for understanding recent developments in
information retrieval and natural language processing that attempts to integrate dense and …

Dynamic self-consistency: Leveraging reasoning paths for efficient LLM sampling

G Wan, Y Wu, J Chen, S Li - arXiv preprint arXiv:2408.17017, 2024 - arxiv.org
Self-Consistency (SC) is a widely used method to mitigate hallucinations in Large Language
Models (LLMs) by sampling the LLM multiple times and outputting the most frequent …

Improving biomedical ReQA with consistent NLI-transfer and post-whitening

J Bai, C Yin, Z Wu, J Zhang, Y Wang… - IEEE/ACM …, 2022 - ieeexplore.ieee.org
Retrieval Question Answering (ReQA) is an essential mechanism of information sharing
which aims to find the answer to a posed question from large-scale candidates. Currently …

Composite code sparse autoencoders for first stage retrieval

C Lassance, T Formal, S Clinchant - … of the 44th International ACM SIGIR …, 2021 - dl.acm.org
We present a Composite Code Sparse Autoencoder (CCSA) approach for Approximate
Nearest Neighbor (ANN) search of document representations based on Siamese-BERT …

[PDF] On the Separation of Logical and Physical Ranking Models for Text Retrieval Applications

J Lin, X Ma, J Mackenzie, A Mallia - DESIRES, 2021 - jmmackenzie.io
Text retrieval using bags of words is typically formulated as inner products between vector
representations of queries and documents, realized in query evaluation algorithms that …