
Performance Limits Study of Stencil Codes on Modern GPGPUs

Ilya S. Pershin    
Vadim D. Levchenko    
Anastasia Y. Perepelkina    

Abstract

We study the performance limits of different algorithmic approaches to implementing a sample problem: solving the wave equation with a cross stencil scheme. Our aim is to find the upper limit of achievable performance efficiency for stencil computing. To estimate the limits, we use a quantitative Roofline model to analyze the performance bottlenecks thoroughly, and we extend the model to account for the latency of different levels of GPU memory. These estimates motivate the use of spatial and temporal blocking algorithms. Accordingly, we study stepwise, domain decomposition, and domain decomposition with halo algorithms, in that order. Knowing the limit motivates further optimization of the implementation, which led us to analyze the block synchronization methods in CUDA; that analysis is also provided in the text. After all optimizations, we achieved 90% of the peak performance, which amounts to more than 1 trillion cell updates per second on a single consumer-level GPU device.
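As a rough illustration of the basic Roofline estimate mentioned in the abstract, the sketch below computes attainable performance as the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity. It is only a minimal sketch: the function names and the device numbers are hypothetical and chosen for illustration, they are not taken from the paper, and the latency extension described by the authors is not modeled here.

```python
def roofline(peak_flops, bandwidth, intensity):
    """Attainable FLOP/s under the basic Roofline model.

    peak_flops : compute peak of the device, in FLOP/s
    bandwidth  : memory bandwidth, in bytes/s
    intensity  : arithmetic intensity of the kernel, in FLOP/byte
    """
    return min(peak_flops, bandwidth * intensity)


def ridge_point(peak_flops, bandwidth):
    """Arithmetic intensity (FLOP/byte) at which a kernel stops being memory bound."""
    return peak_flops / bandwidth


# Hypothetical device parameters, for illustration only (not from the paper).
PEAK = 10e12   # 10 TFLOP/s
BW = 500e9     # 500 GB/s

# A low-intensity kernel is limited by memory bandwidth.
memory_bound = roofline(PEAK, BW, 0.5)

# A high-intensity kernel is limited by the compute peak.
compute_bound = roofline(PEAK, BW, 50.0)

print(memory_bound, compute_bound, ridge_point(PEAK, BW))
```

In this picture, spatial and temporal blocking aim to raise the effective arithmetic intensity of a stencil kernel by reusing data held in faster memory, which moves the kernel toward the compute-bound side of the ridge point.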
