Algorithms, Vol. 14, No. 6 (2021)
Title

A Similarity Measurement with Entropy-Based Weighting for Clustering Mixed Numerical and Categorical Datasets

Xia Que, Siyuan Jiang, Jiaoyun Yang and Ning An

Abstract

Mixed datasets with both numerical and categorical attributes have been collected in many fields, including medicine and biology. Designing an appropriate similarity measurement plays an important role in clustering these datasets. Many traditional measurements treat all attributes equally when measuring similarity. However, attributes may contribute differently, because the amount of information they contain can vary considerably. In this paper, we propose a similarity measurement with entropy-based weighting for clustering mixed datasets. The numerical data are first transformed into categorical data by an automatic categorization technique. Then, an entropy-based weighting strategy is applied to reflect the differing importance of the attributes. We incorporate the proposed measurement into an iterative clustering algorithm, and extensive experiments show that this algorithm outperforms the OCIL and K-Prototype methods, with accuracy improvements of 2.13% and 4.28%, respectively, on six mixed datasets from UCI.
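
The two steps named in the abstract, categorizing numerical attributes and then weighting attributes by entropy, can be illustrated with a minimal Python sketch. This is not the authors' formulation: the abstract does not give the categorization technique or the weight formula, so the sketch assumes equal-width binning, Shannon entropy normalized to sum to 1 (higher entropy gives a higher weight), and a simple weighted matching similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, k=3):
    """Assumed stand-in for the paper's automatic categorization:
    map each numerical value to one of k equal-width bin indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value would land in bin k, so clamp it to k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def shannon_entropy(column):
    """Shannon entropy (in bits) of a categorical column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def entropy_weights(columns):
    """Assumed weighting: each attribute's entropy divided by the total,
    so the weights sum to 1. Falls back to uniform weights if all are 0."""
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        return [1 / len(columns)] * len(columns)
    return [e / total for e in entropies]


def weighted_similarity(x, y, weights):
    """Sum of the weights of the attributes on which x and y agree."""
    return sum(w for a, b, w in zip(x, y, weights) if a == b)


# Toy data (made up for illustration): one numerical, two categorical columns.
age = equal_width_bins([23, 35, 47, 59], k=2)
gender = ["F", "M", "F", "M"]
smoker = ["no", "no", "no", "no"]  # constant column carries no information

weights = entropy_weights([age, gender, smoker])
records = list(zip(age, gender, smoker))
print(weights)
print(weighted_similarity(records[0], records[1], weights))
```

In this toy data the constant `smoker` column has zero entropy and therefore zero weight, which is the intuition behind weighting attributes by how much information they carry; the paper's actual measurement may differ in both the binning and the weighting.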
