ARTICLE
TITLE

A cost model for analytical query optimization

Petr Kurapov    
Daniil Kulikov    
Areg Melik-Adamyan    

Abstract

The performance of analytical queries can be improved by distributing work efficiently among the devices of a heterogeneous system. The resulting gain depends heavily on the optimizer's ability to compare execution plans. One way to manage the complexity of a heterogeneous system is to develop cost models that support multiple devices. Reusing existing CPU models is complicated, if not impossible. This paper introduces a methodology for estimating analytical query execution time in a heterogeneous system by matching the query plan to a set of computational patterns with known performance characteristics. We identify the key and most common patterns and show how a query plan maps to them. We provide a general algorithm for cost calculation and evaluate the model's effectiveness by building a library of portable implementations and comparing their performance to a real in-memory DBMS.