Resumen
HPC system design and operation are challenged by the critical requirements for signicant advances in eciency, scalability, user productivity, and performance portability, even at the end of Moore's Law with approaching nano-scale semiconductor technology. Conventional practices employ distributed memory message passing programming interfaces, sometimes combining second level thread-based intra shared memory node interfaces such as OpenMP or with means of controlling heterogeneous components such as OpenCL for GPUs. While these methods include some modest runtime control, they are principally course grained and statically scheduled. Yet, performance for many real-world applications yield eciencies of less than 10% although some benchmarks may achieve 80% eciency or better (e.g., HPL). To address these challenges, strategies employing runtime software systems are being pursued to exploit information about the status of the application and the system hardware operation throughout the execution for purposes of introspection to guide the task scheduling and resource management in support of dynamic adaptive control. Runtime systems provide adaptive means to reduce the eects of starvation, latency, overhead, and contention. While each is unique in its details, many share common properties such as multi-tasking either preemptive or non-preemptive, message-driven computation such as active messages, sophisticated ne-grain synchronization such as dataow and futures contructs, global name or address spaces, and control policies for optimizing task scheduling in part to address the uncertainty of asynchrony. This survey will identify key parameters and properties of modern and sometimes experimental runtime systems actively employed today and provide a detailed description, summary, and comparison within a shared space of dimensions. It is not the intent of this paper to determine which is better or worse but rather to provide sucient detail to permit the reader to select among them according to individual need.