iPUG: Accelerating Breadth-First Graph Traversals Using Manycore Graphcore IPUs

Published in ISC High Performance 2021, 2021

The Graphcore Intelligence Processing Unit (IPU) is a newly developed processor type whose architecture does not rely on the traditional caching hierarchies. Developed to meet the need for more and more data-centric applications, such as machine learning, IPUs combine a dedicated portion of SRAM with each of its numerous cores, resulting in high memory bandwidth at the price of capacity. The proximity of processor cores and memory makes the IPU a promising field of experimentation for graph algorithms since it is the unpredictable, irregular memory accesses that lead to performance losses in traditional processors with pre-caching.

This paper aims to test the IPU’s suitability for algorithms with hard-to-predict memory accesses by implementing a breadth-first search (BFS) that complies with the Graph500 specifications. Precisely because of its apparent simplicity, BFS is an established benchmark that is not only subroutine for a variety of more complex graph algorithms, but also allows comparability across a wide range of architectures.

We benchmark our IPU code on a wide range of instances and compare its performance to state-of-the-art CPU and GPU codes. The results indicate that the IPU delivers speedups of up to 4x over the fastest competing result on an NVIDIA V100 GPU, with typical speedups of about 1.5x on most test instances.

Download paper here

Bibtex

@inproceedings{burchard2021ipug,
  title={ipug: Accelerating breadth-first graph traversals using manycore graphcore ipus},
  author={Burchard, Luk and Moe, Johannes and Schroeder, Daniel Thilo and Pogorelov, Konstantin and Langguth, Johannes},
  booktitle={High Performance Computing: 36th International Conference, ISC High Performance 2021, Virtual Event, June 24--July 2, 2021, Proceedings},
  pages={291--309},
  year={2021},
  organization={Springer}
}