Abstract
Incomplete data is a common problem in sociological, economic, and statistical studies that rely on online data. Possible causes include errors and changes on the source websites and failures or errors in the data collection instruments. Since missing data is generally undesirable in labor market forecasting, the preferred solution is to fill the gaps with an appropriate method that does not bias the results. In this paper we briefly review methods for handling incomplete data and describe the application of the k-means method to fill the gaps in labor market online data that we previously collected with a dedicated software system. We evaluate the effectiveness of the method by comparing the produced results (average wages and number of ads posted by the companies) with data additionally collected by the system through an enhanced API-based mechanism. Further, we use an autoregressive integrated moving average (ARIMA) model to forecast labor market demand for IT specialists. Validation against data subsequently collected for the last months of 2018 suggests reasonable accuracy of the model, which can be useful in labor market monitoring and management.