ARTICLE
TITLE

System for collecting and analyzing information from various sources in Big Data conditions

D. V. Smirnov    
A. A. Grusho    
M. I. Zabezhailo    
E. E. Timonina    

Abstract

This paper investigates the problem of constructing an architecture and methods for detecting signs of insider activity under real-time processing conditions. The problem is solved under the following conditions. The source of "raw" data is Big Data, from which data relevant to signs of insider activity is selected in accordance with the current list of threats. The search covers a large number of users. Under these conditions, an algorithm is constructed that overcomes the complexity barrier in finding relevant data. An important complication is the "openness" of the data: the data is constantly updated, and the signs of hostile insider activity themselves change, so the search conditions can also change dynamically. The resulting architecture has two levels. The first level contains data collected from various "raw" databases and relevant to the current list of threats. The second level makes the data organized at the first level as accessible as possible for analysis with the participation of experts (operational staff). A scientific justification is given for the correctness and efficiency of the mathematical models and Big Data mining algorithms used in the implementation of this software system. The proposed solutions have proven workable in an industrial deployment.
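
The two-level organization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Event`, `FirstLevel`, `update_signs`, `ingest` and `second_level_view` are hypothetical, and a threat sign is assumed here to be a simple predicate over a single event. The first level keeps only events that match the current list of signs, the list can be replaced at any time to reflect "openness", and the second level regroups the selected events by user for expert review.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


# Hypothetical record type; the field names are illustrative assumptions.
@dataclass(frozen=True)
class Event:
    user: str
    source: str
    action: str


# Assumption: a "sign" is modelled as a named predicate over one event.
Sign = Callable[[Event], bool]


@dataclass
class FirstLevel:
    """Keeps only events that match at least one sign in the current threat list."""
    signs: Dict[str, Sign] = field(default_factory=dict)
    selected: List[tuple] = field(default_factory=list)

    def update_signs(self, signs: Dict[str, Sign]) -> None:
        # "Openness": the threat list may change; later ingestion uses the new signs.
        self.signs = dict(signs)

    def ingest(self, events: Iterable[Event]) -> None:
        # Store a (sign name, event) pair for every sign the event matches.
        for event in events:
            for name, predicate in self.signs.items():
                if predicate(event):
                    self.selected.append((name, event))


def second_level_view(first: FirstLevel) -> Dict[str, List[tuple]]:
    """Groups the selected events by user so that an analyst can review them."""
    by_user = defaultdict(list)
    for name, event in first.selected:
        by_user[event.user].append((name, event))
    return dict(by_user)


# Usage with made-up data: the sign list changes between two ingestion batches.
level1 = FirstLevel()
level1.update_signs({"bulk_export": lambda e: e.action == "export"})
level1.ingest([Event("alice", "db_a", "export"), Event("bob", "db_b", "login")])
level1.update_signs({"off_hours_login": lambda e: e.action == "login"})
level1.ingest([Event("bob", "db_b", "login")])
view = second_level_view(level1)
```

In this sketch the first level scans every incoming event against every sign, which would not by itself overcome the complexity barrier mentioned in the abstract. It only shows how selection by threat list and presentation to experts can be separated.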