Abstract
This article analyzes whether attacks on web applications can be detected using machine learning algorithms. Only supervised learning is considered. A sample from HTTP DATASET CSIC 2010 serves as the data set. The dataset was generated automatically and contains 36,000 normal requests and more than 25,000 anomalous requests, each labeled as normal or anomalous. The anomalous requests include attacks such as SQL injection, buffer overflow, information gathering, file disclosure, CRLF injection, cross-site scripting (XSS), server-side include, and parameter tampering. Training and test samples are selected to evaluate the effectiveness of several machine learning algorithms for traffic classification. All text-valued attributes are converted to numerical values. Quality metrics are computed for five machine learning algorithms, and the algorithm that best classifies the traffic as normal or anomalous is selected.
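
Two of the steps summarized above, converting text attributes to numbers and computing quality metrics for a classifier, can be illustrated with a minimal sketch. It is not the article's implementation: the helper names `encode_column` and `binary_metrics`, the toy attribute values, and the toy labels are all hypothetical, and the integer-code encoding is only one possible conversion scheme.

```python
def encode_column(values):
    """Map each distinct text value to an integer code (order of first appearance).

    Assumption: a simple label encoding; the article does not state which
    conversion scheme it uses.
    """
    codes = {}
    encoded = [codes.setdefault(v, len(codes)) for v in values]
    return encoded, codes


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: an HTTP method attribute and anomaly labels
# (1 = anomalous, 0 = normal). These values are illustrative only.
methods = ["GET", "POST", "GET", "PUT"]
encoded_methods, method_codes = encode_column(methods)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)

print(encoded_methods, method_codes)
print(metrics)
```

Computing the same metrics for each candidate algorithm on a shared test sample gives a common basis for comparing them. Which metric should decide the choice depends on the relative cost of missed attacks and false alarms.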