Algorithms, Vol. 14, No. 1 (2021)
ARTICLE
TITLE

Improving Scalable K-Means++

Joonas Hämäläinen, Tommi Kärkkäinen and Tuomo Rossi

Abstract

Two new initialization methods for K-means clustering are proposed. Both proposals apply a divide-and-conquer approach to the K-means|| type of initialization strategy. The second proposal also uses multiple lower-dimensional subspaces, produced by the random projection method, for the initialization. The proposed methods are scalable and can be run in parallel, which makes them suitable for initializing large-scale problems. In the experiments, the proposed methods are compared to the K-means++ and K-means|| methods using an extensive set of reference and synthetic large-scale datasets. For the synthetic datasets, a novel high-dimensional clustering data generation algorithm is given. The experiments show that the proposed methods compare favorably to the state of the art by improving clustering accuracy and the speed of convergence. We also observe that the currently most popular initialization, K-means++, behaves like random initialization in very high-dimensional cases.
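
The abstract refers to two building blocks without showing them: the K-means++ style of seeding and the random projection method. The sketch below illustrates the textbook versions of both, not the divide-and-conquer methods proposed in the article. The function names `kmeans_pp_init` and `random_projection`, the Gaussian projection matrix, and the toy data sizes are assumptions made for this illustration only.

```python
import numpy as np


def kmeans_pp_init(X, k, rng):
    """Choose k initial centers from the rows of X by D^2 sampling.

    This is the standard K-means++ seeding rule: each new center is drawn
    with probability proportional to its squared distance from the nearest
    center chosen so far.
    """
    n = X.shape[0]
    centers = np.empty((k, X.shape[1]), dtype=float)
    centers[0] = X[rng.integers(n)]  # first center: uniform at random
    d2 = np.sum((X - centers[0]) ** 2, axis=1)
    for i in range(1, k):
        total = d2.sum()
        if total == 0.0:
            # All points coincide with chosen centers; fall back to uniform.
            idx = rng.integers(n)
        else:
            idx = rng.choice(n, p=d2 / total)
        centers[i] = X[idx]
        # Keep, for every point, the squared distance to its nearest center.
        d2 = np.minimum(d2, np.sum((X - centers[i]) ** 2, axis=1))
    return centers


def random_projection(X, target_dim, rng):
    """Project X to target_dim dimensions with a Gaussian random matrix.

    Assumption: a dense Gaussian matrix scaled by 1/sqrt(target_dim); the
    article may use a different projection variant.
    """
    R = rng.standard_normal((X.shape[1], target_dim)) / np.sqrt(target_dim)
    return X @ R


# Toy usage on synthetic data (sizes are arbitrary).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))
centers = kmeans_pp_init(X, 5, rng)   # 5 seeds in the original space
Z = random_projection(X, 10, rng)     # 500 points in a 10-dimensional subspace
seeds_low = kmeans_pp_init(Z, 5, rng)  # seeding inside the projected subspace
```

Seeding in a projected subspace, as in the last line, only hints at why lower-dimensional subspaces can be useful for initialization; how the article combines several subspaces with a divide-and-conquer scheme is not reproduced here.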
