ARTICLE
TITLE

Missing Data Imputation for Geolocation-based Price Prediction Using KNN-MCF Method

Karshiev Sanjar, Olimov Bekhzod, Jaesoo Kim, Anand Paul and Jeonghong Kim

Abstract

Accurate house price forecasts are important for formulating national economic policies. In this paper, we propose an effective method for predicting house sale prices. Our algorithm uses one-hot encoding to convert text data into numeric data, feature correlation to select only the most correlated variables, and a technique for handling missing data. This technique, the K-nearest neighbor algorithm based on the most correlated features (KNN-MCF), is an effective way to handle missing data in large datasets. To the best of our knowledge, no previous research has focused on the most important features when dealing with missing observations. Combined with the random forest algorithm, the proposed method achieves a prediction accuracy of 92.01%, outperforming the typical machine learning prediction algorithms used for comparison.
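
The abstract does not spell out the imputation procedure, so the following is only a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes that the most correlated features are chosen by absolute Pearson correlation with the column being imputed, that neighbors are found by unscaled Euclidean distance on those features, and that a missing value is filled with the mean of its k nearest neighbors. The function names (`pearson`, `knn_mcf_impute`), the parameters `n_features` and `k`, and the toy `houses` data are all hypothetical. Categorical columns are assumed to be one-hot encoded already.

```python
import math


def pearson(xs, ys):
    """Pearson correlation over positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def knn_mcf_impute(rows, target, n_features=2, k=2):
    """Fill missing (None) values of `target` from the k nearest rows.

    Assumption: distance is measured only on the `n_features` columns
    most correlated with `target`. Returns (new_rows, selected_columns)
    and leaves the input rows unmodified.
    """
    candidates = [c for c in rows[0] if c != target]
    target_values = [r[target] for r in rows]

    # Rank the other columns by absolute correlation with the target.
    ranked = sorted(
        candidates,
        key=lambda c: abs(pearson([r[c] for r in rows], target_values)),
        reverse=True,
    )
    selected = ranked[:n_features]

    # Donors are rows with the target and all selected features present.
    donors = [
        r for r in rows
        if r[target] is not None and all(r[c] is not None for c in selected)
    ]

    filled = []
    for r in rows:
        new_row = dict(r)
        can_impute = donors and all(r[c] is not None for c in selected)
        if r[target] is None and can_impute:
            point = [r[c] for c in selected]
            nearest = sorted(
                donors,
                key=lambda d: math.dist(point, [d[c] for c in selected]),
            )[:k]
            new_row[target] = sum(d[target] for d in nearest) / len(nearest)
        filled.append(new_row)
    return filled, selected


# Hypothetical toy data: "area" is missing for the fifth house.
houses = [
    {"rooms": 1.0, "baths": 1.0, "noise": 5.0, "area": 40.0},
    {"rooms": 2.0, "baths": 1.0, "noise": 1.0, "area": 60.0},
    {"rooms": 3.0, "baths": 2.0, "noise": 4.0, "area": 90.0},
    {"rooms": 4.0, "baths": 2.0, "noise": 2.0, "area": 120.0},
    {"rooms": 3.0, "baths": 2.0, "noise": 5.0, "area": None},
    {"rooms": 5.0, "baths": 3.0, "noise": 3.0, "area": 150.0},
]

filled, selected = knn_mcf_impute(houses, target="area", n_features=2, k=2)
print(selected, filled[4]["area"])
```

In this toy example the weakly correlated `noise` column is excluded from the neighbor search, which is the point of restricting KNN to the most correlated features. On real data the selected features would normally be scaled before computing distances, a step this sketch omits.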

Similar articles

       
 
Christos Tzimopoulos, Kyriakos Papadopoulos, Nikiforos Samarinas, Basil Papadopoulos and Christos Evangelides    
In this work, a novel fuzzy FEM (Finite Elements Method) numerical solution describing the recession flow in unconfined aquifers is proposed. In general, recession flow and drainage problems can be described by the nonlinear Boussinesq equation, while th...
Journal: Hydrology

 
Menna Ibrahim Gabr, Yehia Mostafa Helmy and Doaa Saad Elzanfaly    
Data completeness is one of the most common challenges that hinder the performance of data analytics platforms. Different studies have assessed the effect of missing values on different classification models based on a single evaluation metric, namely, a...

 
Xing Su, Wenjie Sun, Chenting Song, Zhi Cai and Limin Guo    
With the rapid development of the economy, car ownership has grown rapidly, which causes many traffic problems. In recent years, intelligent transportation systems have been used to solve various traffic problems. To achieve effective and efficient traff...

 
Li Cai, Cong Sha, Jing He and Shaowen Yao    
Traffic flows (e.g., the traffic of vehicles, passengers, and bikes) aim to reveal traffic flow phenomena generated by traffic participants in traffic activities. Various studies of traffic flows rely heavily on high-quality traffic data. The taxi GPS tr...

 
Hatef Dastour and Quazi K. Hassan    
Having a complete hydrological time series is crucial for water-resources management and modeling. However, this can pose a challenge in data-scarce environments where data gaps are widespread. In such situations, recurring data gaps can lead to unfavora...
Journal: Hydrology