ARTÍCULO
TITULO

Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT

Julian Hornich    
Julian Hammer    
Georg Hager    
Thomas Gruber    
Gerhard Wellein    

Resumen

Stencil algorithms have been receiving considerable interest in HPC research for decades. The techniques used to approach multi-core stencil performance modeling and engineering span basic runtime measurements, elaborate performance models, detailed hardware counter analysis, and thorough scaling behavior evaluation. Due to the plurality of approaches and stencil patterns, we set out to develop a generalizable methodology for reproducible measurements accompanied by state-of-the-art performance models. Our open-source toolchain and collected results are publicly available in the "Intranode Stencil Performance Evaluation Collection" (INSPECT). We present the underlying methods, models and tools involved in gathering and documenting the performance behavior of a collection of typical stencil patterns across multiple architectures and hardware configuration options. Our aim is to endow performance-aware application developers with reproducible baseline performance data and validated models to initiate a well-defined process of performance assessment and optimization. All data is available for inspection: source code, produced assembly, performance measurements, hardware performance counter data, single-core and multicore Roofline and ECM (execution-cache-memory) performance models, and machine properties. Deviations between measured performance and performance models become immediately evident and can be investigated. We also give hints as to how INSPECT can be used in practice for custom code analysis.

 Artículos similares