Compiled plans for in-memory path-counting queries

B Myers, J Hyrkas, D Halperin, B Howe - International Workshop on In …, 2013 - Springer
International Workshop on In-Memory Data Management and Analytics, 2013Springer
Dissatisfaction with relational databases for large-scale graph processing has motivated a
new class of graph databases that offer fast graph processing but sacrifice the ability to
express basic relational idioms. However, we hypothesize that the performance benefits
amount to implementation details, not a fundamental limitation of the relational model. To
evaluate this hypothesis, we are exploring code-generation to produce fast in-memory
algorithms and data structures for graph patterns that are inaccessible to conventional …
Abstract
Dissatisfaction with relational databases for large-scale graph processing has motivated a new class of graph databases that offer fast graph processing but sacrifice the ability to express basic relational idioms. However, we hypothesize that the performance benefits amount to implementation details, not a fundamental limitation of the relational model. To evaluate this hypothesis, we are exploring code-generation to produce fast in-memory algorithms and data structures for graph patterns that are inaccessible to conventional relational optimizers.
In this paper, we present preliminary results for this approach on path-counting queries, which includes triangle counting as a special case. We compile Datalog queries into main-memory pipelined hash-join plans in C, and show that the resulting programs easily outperform PostgreSQL on real graphs with different degrees of skew. We then produce analogous parallel programs for Grappa, a runtime system for distributed memory architectures. Grappa is a good target for building a parallel query system as its shared memory programming model and communication mechanisms provide productivity and performance when building communication-intensive applications. Our experiments suggest that Grappa programs using hash joins have competitive performance with queries executed on a commercial parallel database. We find preliminary evidence that a code generation approach simplifies the design of a query engine for graph analysis and improves performance over conventional relational databases.
Springer
以上显示的是最相近的搜索结果。 查看全部搜索结果