Abstract: Our studies show that the storage engines of existing graph processing systems are inefficient when running concurrent jobs due to redundant data storage and access overhead. We developed a storage system GraphM. It can be integrated into the existing graph processing systems to efficiently support concurrent iterative graph processing jobs for higher throughput by fully exploiting the similarities of the data accesses between these jobs. GraphM regularizes the traversing order of the graph partitions for concurrent graph processing jobs by streaming the partitions into the cache in a common order, and then processes the related jobs concurrently in a novel fine-grained synchronization. Then, the concurrent jobs share the same graph structure data in the cache/memory and also the data accesses to the graph, amortizing the storage consumption and the data access overhead. We plug GraphM into state-of-the-art graph processing systems and show that it improves the throughput by 1.73-13 times.
Back to Technical Papers Archive Listing