This paper presents the first few-shot LLM-based chatbot that almost never hallucinates and has high conversationality and low latency. WikiChat is grounded on the English Wikipedia …
MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual dataset we have built for the WSDM 2023 Cup challenge that focuses on ad hoc …
Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and …
Z Huang, P Yu, J Allan - Proceedings of the Sixteenth ACM International …, 2023 - dl.acm.org
Benefiting from transformer-based pre-trained language models, neural ranking models have made significant progress. More recently, the advent of multilingual pre-trained …
Pretrained language models have improved effectiveness on numerous tasks, including ad hoc retrieval. Recent work has shown that continuing to pretrain a language model with …
HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in …
J Zhu, J Wu, X Luo, J Liu - Artificial Intelligence and Law, 2024 - Springer
The pandemic caused by COVID-19 has recently been severe across the entire world. The prevention and control of crimes associated with COVID-19 are critical for controlling the pandemic …
Learning sparse representations using pretrained language models enhances the monolingual ranking effectiveness. Such representations are sparse vectors in the …
Finetuning Pretrained Language Models (PLM) for IR has been the de facto standard practice since their breakthrough effectiveness a few years ago. But is this approach …