Finalized:Sunday, June 26, 2016
Author(s):Aberger, C. R., S. Tu, K. Olukotun, and C. Ré
There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded's design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3x worse performance on SSSP.
Aberger, C. R., S. Tu, K. Olukotun, and C. Ré, 2016: EmptyHeaded: A relational engine for graph processing. Proc ACM SIGMOD Int Conf Manag Data, 2016, 431–446, doi:10.1145/2882903.2915213.This material is based upon work supported by the National Science Foundation under Grant No. 1343760. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not reflect the views of the NSF.