Index compression using 64‐bit words

VN Anh, A Moffat - Software: Practice and Experience, 2010 - Wiley Online Library
Modern computers typically make use of 64‐bit words as the fundamental unit of data
access. However the decade‐long migration from 32‐bit architectures has not been …

Indexing methods for approximate dictionary searching: Comparative analysis

L Boytsov - Journal of Experimental Algorithmics (JEA), 2011 - dl.acm.org
The primary goal of this article is to survey state-of-the-art indexing methods for approximate
dictionary searching. To improve understanding of the field, we introduce a taxonomy that …

Efficient query processing for scalable web search

N Tonellotto, C Macdonald, I Ounis - Foundations and Trends® …, 2018 - nowpublishers.com
Search engines are exceptionally important tools for accessing information in today's world.
In satisfying the information needs of millions of users, the effectiveness (the quality of the …

New algorithms on wavelet trees and applications to information retrieval

T Gagie, G Navarro, SJ Puglisi - Theoretical Computer Science, 2012 - Elsevier
Wavelet trees are widely used in the representation of sequences, permutations, text
collections, binary relations, discrete points, and other succinct data structures. We show …

Fast set intersection in memory

B Ding, AC König - arXiv preprint arXiv:1103.2409, 2011 - arxiv.org
Set intersection is a fundamental operation in information retrieval and database systems.
This paper introduces linear space data structures to represent sets such that their …

Efficient set intersection for inverted indexing

JS Culpepper, A Moffat - ACM Transactions on Information Systems …, 2010 - dl.acm.org
Conjunctive Boolean queries are a key component of modern information retrieval systems,
especially when Web-scale repositories are being searched. A conjunctive query q is …

PAM: parallel augmented maps

Y Sun, D Ferizovic, GE Blelloch - Proceedings of the 23rd ACM SIGPLAN …, 2018 - dl.acm.org
Ordered (key-value) maps are an important and widely-used data type for large-scale data
processing frameworks. Beyond simple search, insertion and deletion, more advanced …

Efficient parallel lists intersection and index compression algorithms using graphics processing units

N Ao, F Zhang, D Wu, DS Stones, G Wang… - Proceedings of the …, 2011 - dl.acm.org
Major web search engines answer thousands of queries per second requesting information
about billions of web pages. The data sizes and query loads are growing at an exponential …

Fast Sorted-Set Intersection using SIMD Instructions.

B Schlegel, T Willhalm, W Lehner - ADMS@VLDB, 2011 - adms-conf.org
In this paper, we focus on sorted-set intersection which is an important part in many
algorithms, e.g., RID-list intersection, inverted indexes, and others. In contrast to traditional …

Fast set intersection and two-patterns matching

H Cohen, E Porat - Theoretical Computer Science, 2010 - Elsevier
In this paper we present a new problem, the fast set intersection problem, which is to
preprocess a collection of sets in order to efficiently report the intersection of any two sets in …