Computers, Vol. 11, No. 8 (2022)

Combining Log Files and Monitoring Data to Detect Anomaly Patterns in a Data Center

Laura Viola, Elisabetta Ronchieri and Claudia Cavallaro

Abstract

Context: Anomaly detection in a data center is a challenging task, since it must account for different services running on various resources. Current literature applies artificial intelligence and machine learning techniques to either log files or monitoring data: the former are created by services at run time, while the latter are produced by specific sensors directly on the physical or virtual machine.

Objectives: We propose a model that exploits information in both log files and monitoring data to identify patterns and detect anomalies over time, at both the service level and the machine level.

Methods: The key idea is to construct a specific dictionary for each log file, which helps to extract anomalous n-grams into the feature matrix. Several Natural Language Processing techniques, such as word clouds and topic modeling, were used to enrich this dictionary. A clustering algorithm was then applied to the feature matrix to identify and group the various types of anomalies. In parallel, a time series anomaly detection technique was applied to the sensor data, so that problems found in the log files could be combined with problems recorded in the monitoring data. Several services (i.e., log files) running on the same machine were grouped together with the monitoring metrics.

Results: We tested our approach on a real data center equipped with log files and monitoring data that characterize the behaviour of physical and virtual resources in production. The data were provided by the National Institute for Nuclear Physics in Italy. We observed a correspondence between anomalies in log files and anomalies in monitoring data, e.g., a decrease in memory usage or an increase in machine load. The results are promising.

Conclusions: Important outcomes emerged from the integration of these two types of data. Our model requires the integration of site administrators' expertise in order to consider all critical scenarios in the data center and to interpret the results properly.
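
The idea of combining the two data sources can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary contents, the sample log messages, the load values, and all function names are hypothetical, the clustering step is omitted, and a simple z-score test stands in for the time series anomaly detection technique, which the abstract does not name.

```python
from statistics import mean, stdev

# Hypothetical dictionary of anomalous terms; the abstract describes
# building a specific dictionary for each log file.
ANOMALY_TERMS = {"error", "failed", "timeout"}

def ngrams(tokens, n=2):
    """Return all n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def anomalous_ngrams(message, n=2):
    """Keep only the n-grams that contain at least one dictionary term."""
    tokens = message.lower().split()
    return [g for g in ngrams(tokens, n) if ANOMALY_TERMS & set(g)]

def zscore_anomalies(series, threshold=2.0):
    """Return indices whose distance from the mean exceeds
    `threshold` sample standard deviations (an assumed stand-in detector)."""
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]

def co_occurring(log_times, metric_times, window=60):
    """Pair log and metric anomaly timestamps (in seconds) that fall
    within `window` seconds of each other."""
    return [(l, m) for l in log_times for m in metric_times
            if abs(l - m) <= window]

# Made-up data: (timestamp in seconds, log message) for one service,
# and machine load sampled every 15 seconds on the same machine.
logs = [
    (0, "service started"),
    (120, "connection timeout while writing"),
    (300, "request completed"),
]
load = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 1.1, 0.9, 9.0, 1.0]
SAMPLE_PERIOD = 15

log_times = [t for t, msg in logs if anomalous_ngrams(msg)]
metric_times = [i * SAMPLE_PERIOD for i in zscore_anomalies(load)]
matches = co_occurring(log_times, metric_times)
print(matches)
```

In this toy data the only log message containing a dictionary term and the only load spike both occur at second 120, so they are reported as a matching pair. In practice the window size and the threshold would need tuning with the help of site administrators.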
