Efficient Distributed All-Pairs Algorithms: Management Using Optimal Cyclic Quorums

CJ Kleinheksel and AK Somani, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 29, 391-404 (2018).

DOI: 10.1109/TPDS.2017.2707417

All Pairs problems occur in many research fields. The all-pairs problem requires all data elements to be paired with all other data elements. With the advent of new data intensive big data applications and increase in data size, methods to reduce memory foot print and distribute to work equally across compute nodes are needed. In this paper, we propose cyclic quorum sets for all-pairs algorithm computations to reduce memory foot print. We show that the cyclic quorum sets have a unique all-pairs property that allows for minimal data replication. The cyclic quorums set based computing requires only N/root P size memory, up to 50 percent smaller than the dual N/root P N/root P array implementations proposed earlier, and significantly smaller than solutions requiring all data in each node. Computation can be distributed efficiently and more importantly and are communication-less after initial data distribution, which is a huge advantage in minimizing computation time. Scaling from 16 to 512 cores ( 1 to 32 compute nodes), our application experiments on a real dataset demonstrated scalability with greater than 150x (super- linear) speedup with less than 1/4th the memory usage per node in our experiments.

Return to Publications page