Algorithms, Vol. 13, Issue 12 (2020)
ARTICLE
TITLE

k-Means+++: Outliers-Resistant Clustering

Adiel Statman, Liat Rozenberg and Dan Feldman

Abstract

The k-means problem is to compute a set of k centers (points) that minimizes the sum of squared distances to a given set of n points in a metric space. Arguably, the most common algorithm for solving it is k-means++, which is easy to implement and provides a provably small approximation error in time that is linear in n. We generalize k-means++ to support outliers in two senses (simultaneously): (i) nonmetric spaces, e.g., M-estimators, where the distance dist(p, x) between a point p and a center x is replaced by min{dist(p, x), c} for an appropriate constant c that may depend on the scale of the input; (ii) k-means clustering with m ≥ 1 outliers, i.e., where the m farthest points from any given k centers are excluded from the total sum of distances. This is achieved by a simple reduction to (k + m)-means clustering (with no outliers).
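
The two notions of outlier resistance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes Euclidean distance, and the function names (robust_dist, seed_centers, cost_with_outliers) and the uniform fallback for all-zero weights are hypothetical choices made for illustration. The seeding routine is ordinary k-means++ style sampling (probability proportional to squared distance) with dist(p, x) replaced by min{dist(p, x), c}; the cost routine drops the m farthest points before summing. The reduction to (k + m)-means is not spelled out in the abstract and is not reproduced here.

```python
import math
import random


def robust_dist(p, x, c):
    """Clipped distance min{dist(p, x), c}; dist is assumed Euclidean here."""
    return min(math.dist(p, x), c)


def seed_centers(points, k, c, rng):
    """k-means++ style seeding, using the clipped distance (illustrative only).

    The first center is uniform at random; each later center is sampled with
    probability proportional to its squared clipped distance to the nearest
    center chosen so far.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(robust_dist(p, x, c) for x in centers) ** 2
                   for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center: fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def cost_with_outliers(points, centers, m):
    """Sum of squared distances to the nearest center, excluding the m
    farthest points (the outliers)."""
    d2 = sorted(min(math.dist(p, x) for x in centers) ** 2 for p in points)
    keep = max(len(d2) - m, 0)
    return sum(d2[:keep])


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (1000, 1000)]
    rng = random.Random(0)
    print(seed_centers(pts, k=2, c=5.0, rng=rng))
    # With centers at (0, 0) and (10, 10), excluding the single farthest
    # point (1000, 1000) leaves only the four inliers in the cost.
    print(cost_with_outliers(pts, [(0, 0), (10, 10)], m=1))
```

In this toy input, clipping at c = 5.0 caps the sampling weight of the far point (1000, 1000) at 25, the same weight as any other point that is at least 5 away from the current centers, so it no longer dominates the sampling distribution as it would under unclipped squared distances.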
