Resumen
This paper introduces a fundamentally new computer architecture for supercomputers. The core module is application compatible with an existing superscalar microprocessor, with minimized energy use, and is optimized for local sparse matrix operations. Optimized sparse matrix manip- ulation is discussed by analyzing the High Performance Conjugate Gradient (HPCG) benchmark speci...cation. This analysis shows how the DRAM memory wall is removed for this benchmark, and for sparse matrix models of partial di¤erential equations (PDEs) for a wide cross section of applications. By giving the programmer improved control over the con...guration of the super- computer, the potential for communication problems is minimized. Application compatibility is achieved while removing the superscalar instruction interpreter and multi-thread controller from the existing microprocessor?s hardware. These are transformed into compile-time utilities. The instruction cache is removed through an innovation in VLIW instruction processing. The data caches are unnecessary and are turned o¤ in order to optimally implement sparse matrix models.