Resumen
The digital transition that drives the new industrial revolution is largely driven by the application of intelligence and data. This boost leads to an increase in energy consumption, much of it associated with computing in data centers. This fact clashes with the growing need to save and improve energy efficiency and requires a more optimized use of resources. The deployment of new services in edge and cloud computing, virtualization, and software-defined networks requires a better understanding of consumption patterns aimed at more efficient and sustainable models and a reduction in carbon footprints. These patterns are suitable to be exploited by machine, deep, and reinforced learning techniques in pursuit of energy consumption optimization, which can ideally improve the energy efficiency of data centers and big computing servers providing these kinds of services. For the application of these techniques, it is essential to investigate data collection processes to create initial information points. Datasets also need to be created to analyze how to diagnose systems and sort out new ways of optimization. This work describes a data collection methodology used to create datasets that collect consumption data from a real-world work environment dedicated to data centers, server farms, or similar architectures. Specifically, it covers the entire process of energy stimuli generation, data extraction, and data preprocessing. The evaluation and reproduction of this method is offered to the scientific community through an online repository created for this work, which hosts all the code available for its download.