Techniques for inverted index compression

GE Pibiri, R Venturini - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
The data structure at the core of large-scale search engines is the inverted index, which is
essentially a collection of sorted integer sequences called inverted lists. Because of the …

Compact representation of web graphs with extended functionality

NR Brisaboa, S Ladra, G Navarro - Information Systems, 2014 - Elsevier
The representation of large subsets of the World Wide Web in the form of a directed graph
has been extensively used to analyze structure, behavior, and evolution of those so-called …

Lossless indexing with counting de Bruijn graphs

M Karasikov, H Mustafa, G Rätsch, A Kahles - Genome Research, 2022 - genome.cshlp.org
Sequencing data are rapidly accumulating in public repositories. Making this resource
accessible for interactive analysis at scale requires efficient approaches for its storage and …

Practical compressed string dictionaries

MA Martínez-Prieto, N Brisaboa, R Cánovas… - Information Systems, 2016 - Elsevier
The need to store and query a set of strings–a string dictionary–arises in many kinds of
applications. While classically these string dictionaries have accounted for a small share of …

Compressed vertical partitioning for efficient RDF management

S Álvarez-García, N Brisaboa, JD Fernández… - … and Information Systems, 2015 - Springer
The Web of Data has been gaining momentum in recent years. This leads to
increasingly publish more and more semi-structured datasets following, in many cases, the …

GraCT: A grammar-based compressed index for trajectory data

NR Brisaboa, A Gómez-Brandón, G Navarro… - Information …, 2019 - Elsevier
We introduce a compressed data structure for the storage of free trajectories of moving
objects that efficiently supports various spatio-temporal queries. Our structure, dubbed …

Lossless compression of industrial time series with direct access

A Gómez-Brandón, JR Paramá, K Villalobos… - Computers in …, 2021 - Elsevier
The new opportunities generated by the data-driven economy in the manufacturing industry
have caused many companies opt for it. However, the size of time series data that need to …

Efficient processing of raster and vector data

F Silva-Coira, JR Paramá, S Ladra, JR López… - PLOS ONE, 2020 - journals.plos.org
In this work, we propose a framework to store and manage spatial data, which includes new
efficient algorithms to perform operations accepting as input a raster dataset and a vector …

Scalable and queryable compressed storage structure for raster data

S Ladra, JR Paramá, F Silva-Coira - Information Systems, 2017 - Elsevier
Compact data structures are storage structures that combine a compressed representation
of the data and the access mechanisms for retrieving individual data without the need of …

Universal indexes for highly repetitive document collections

F Claude, A Fariña, MA Martínez-Prieto, G Navarro - Information Systems, 2016 - Elsevier
Indexing highly repetitive collections has become a relevant problem with the emergence of
large repositories of versioned documents, among other applications. These collections may …