Authors: Sagnak Tasilar (Two Sigma), Timothy G. Mattson, Romain Cledat, Vincent Cavé, Vivek Sarkar, Zoran Budimlić, Sanjay Chatterjee, Josh Fryman, Ivan Ganev, Robin Knauerhase, Min Lee, Benoît Meister, Brian Nickerson, Nick Pepperling, Bala Seshasayee, Justin Teller, Nick Vrvilo
Published in: Proceedings of the 2016 IEEE High Performance Extreme Computing Conference (HPEC)
Abstract: The Open Community Runtime (OCR) is a new runtime system designed to meet the needs of extreme-scale computing. While there is growing support for the idea that future execution models will be based on dynamic tasks, there is little agreement on what else should be included. OCR minimally adds events for synchronization and relocatable data-blocks for data management to form a complete system that supports a wide range of higher-level programming models. This paper lays out the fundamental concepts behind OCR and compares OCR performance with that of MPI for two simple benchmarks. OCR has been developed within an open community model with features supporting flexible algorithm expression weighed against the expected realities of extreme-scale computing: power-constrained execution, aggressive growth in the number of computing resources, deepening memory hierarchies, and a low mean time between failures.