Evaluating generative ad hoc information retrieval

L Gienapp, H Scells, N Deckers, J Bevendorff… - Proceedings of the 47th …, 2024 - dl.acm.org
Recent advances in large language models have enabled the development of viable
generative retrieval systems. Instead of a traditional document ranking, generative retrieval …

Summaries, ranked retrieval and sessions: A unified framework for information access evaluation

T Sakai, Z Dou - Proceedings of the 36th international ACM SIGIR …, 2013 - dl.acm.org
We introduce a general information access evaluation framework that can potentially handle
summaries, ranked document lists and even multi-query sessions seamlessly. Our …

Metrics, statistics, tests

T Sakai - PROMISE winter school, 2013 - Springer
This lecture is intended to serve as an introduction to Information Retrieval (IR) effectiveness
metrics and their usage in IR experiments using test collections. Evaluation metrics are …

User behavior modeling for web search evaluation

F Zhang, Y Liu, J Mao, M Zhang, S Ma - AI Open, 2020 - Elsevier
Search engines are widely used in our daily life. Batch evaluation of the performance of
search systems to their users has always been an essential issue in the field of information …

A Workbench for Autograding Retrieve/Generate Systems

L Dietz - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org
This resource paper addresses the challenge of evaluating Information Retrieval (IR)
systems in the era of autoregressive Large Language Models (LLMs). Traditional methods …

SWAN: A Generic Framework for Auditing Textual Conversational Systems

T Sakai - arXiv preprint arXiv:2305.08290, 2023 - arxiv.org
We present a simple and generic framework for auditing a given textual conversational
system, given some samples of its conversation sessions as its input. The framework …

Context-aware web search abandonment prediction

Y Song, X Shi, R White, AH Awadallah - Proceedings of the 37th …, 2014 - dl.acm.org
Web search queries without hyperlink clicks are often referred to as abandoned queries.
Understanding the reasons for abandonment is crucial for search engines in evaluating their …

Fairness-based evaluation of conversational search: A pilot study

T Sakai - Proceedings of EVIA, 2023 - repository.nii.ac.jp
NTCIR-17 introduced the FairWeb-1 task, which evaluated web page rankings
in terms of both relevance and group fairness. The present study shows how their evaluation …

Query snowball: a co-occurrence-based approach to multi-document summarization for question answering

H Morita, T Sakai, M Okumura - Information and Media Technologies, 2012 - jstage.jst.go.jp
We propose a new method for query-oriented extractive multi-document summarization. To
enrich the information need representation of a given query, we build a co-occurrence graph …

Overview of the NTCIR-12 MobileClick-2 Task

MP Kato, T Sakai, T Yamamoto, V Pavlu, H Morita… - NTCIR, 2016 - research.nii.ac.jp
This is an overview of the NTCIR-12 MobileClick-2 task (a sequel to 1CLICK in NTCIR-9 and
NTCIR-10). In the MobileClick task, systems are expected to output a concise summary of …