[BOOK][B] Modern information retrieval

R Baeza-Yates, B Ribeiro-Neto - 1999 - people.ischool.berkeley.edu
Information retrieval (IR) has changed considerably in recent years with the expansion of the
World Wide Web and the advent of modern and inexpensive graphical user interfaces and …

[BOOK][B] Data-intensive text processing with MapReduce

J Lin, C Dyer - 2022 - books.google.com
Our world is being revolutionized by data-driven methods: access to large amounts of data
has generated new insights and opened exciting new opportunities in commerce, science …

Mining query logs: Turning search usage data into knowledge

F Silvestri - Foundations and Trends® in Information …, 2009 - nowpublishers.com
Web search engines have stored in their logs information about users since they started to
operate. This information often serves many purposes. The primary focus of this survey is on …

Learning to distribute vocabulary indexing for scalable visual search

R Ji, LY Duan, J Chen, L Xie, H Yao… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org
In recent years, there is an ever-increasing research focus on Bag-of-Words based near
duplicate visual search paradigm with inverted indexing. One fundamental yet unexploited …

A pipelined architecture for distributed text query evaluation

A Moffat, W Webber, J Zobel, R Baeza-Yates - Information Retrieval, 2007 - Springer
Two principal query-evaluation methodologies have been described for cluster-based
implementation of distributed information retrieval systems: document partitioning and term …

Challenges on distributed web retrieval

R Baeza-Yates, C Castillo, F Junqueira… - 2007 IEEE 23rd …, 2006 - ieeexplore.ieee.org
In the ocean of Web data, Web search engines are the primary way to access content. As the
data is on the order of petabytes, current search engines are very large centralized systems …

Parallelism-optimizing data placement for faster data-parallel computations

N Baruah, P Kraft, F Kazhamiaka, P Bailis… - Proceedings of the …, 2022 - dl.acm.org
Systems performing large data-parallel computations, including online analytical processing
(OLAP) systems like Druid and search engines like Elasticsearch, are increasingly being …

Searching Encrypted Data with Size-Locked Indexes

M Xu, A Namavari, D Cash, T Ristenpart - 30th USENIX Security …, 2021 - usenix.org
We investigate a simple but overlooked folklore approach for searching encrypted
documents held at an untrusted service: Just stash an index (with unstructured encryption) at …

A term-based inverted index partitioning model for efficient distributed query processing

BB Cambazoglu, E Kayaaslan, S Jonassen… - ACM Transactions on …, 2013 - dl.acm.org
In a shared-nothing, distributed text retrieval system, queries are processed over an inverted
index that is partitioned among a number of index servers. In practice, the index is either …

On the feasibility of multi-site web search engines

R Baeza-Yates, A Gionis, F Junqueira… - Proceedings of the 18th …, 2009 - dl.acm.org
Web search engines are often implemented as centralized systems. Designing and
implementing a Web search engine in a distributed environment is a challenging …