Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures

Joao Fonseca

Georgios Douzas and Fernando Bacao

Resumen

Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. With significant upsides, the automatic production of Land Use/Land Cover maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution impacts negatively the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem, by using K-means and the Synthetic Minority Oversampling Technique (SMOTE) as an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing both the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. The performance of K-means SMOTE is compared to three popular oversampling methods (Random Oversampling, SMOTE and Borderline-SMOTE) using seven remote sensing benchmark datasets, three classifiers (Logistic Regression, K-Nearest Neighbors and Random Forest Classifier) and three evaluation metrics using a five-fold cross-validation approach with three different initialization seeds. The statistical analysis of the results show that the proposed method consistently outperforms the remaining oversamplers producing higher quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers as spectral signatures for the same class can vary according to geographical distribution.

Palabras claves

LULC classification - imbalanced learning - oversampling - data augmentation - clustering

Acceso

PÁGINAS

pp. 0 - 0

NÚMERO

Volumen: 12 Parte: 7 (2021)

MATERIAS

INGENIERÍA Y CONSTRUCCIÓN CIVIL
TECNOLOGÍA

REVISTAS SIMILARES

Applied Sciences
Information
Algorithms

DOI

https://doi.org/10.3390/info12070266

Artículos similares

Classification of Skin Lesions Using Weighted Majority Voting Ensemble Deep Learning

Acceso

Damilola A. Okuboyejo and Oludayo O. Olugbara

The conventional dermatology practice of performing noninvasive screening tests to detect skin diseases is a source of escapable diagnostic inaccuracies. Literature suggests that automated diagnosis is essential for improving diagnostic accuracies in med... ver más

Revista: Algorithms

The Impact of Ensemble Techniques on Software Maintenance Change Prediction: An Empirical Study

Acceso

Hadeel Alsolai and Marc Roper

Various prediction models have been proposed by researchers to predict the change-proneness of classes based on source code metrics. However, some of these models suffer from low prediction accuracy because datasets exhibit high dimensionality or imbalan... ver más

Revista: Applied Sciences

Sieve: An Ensemble Algorithm Using Global Consensus for Binary Classification

Acceso

Chongya Song, Alexander Pons and Kang Yen

In the field of machine learning, an ensemble approach is often utilized as an effective means of improving on the accuracy of multiple weak base classifiers. A concern associated with these ensemble algorithms is that they can suffer from the Curse of C... ver más

Revista: AI

Revistas destacadas

Acceso directo a los números publicados en la revista Infrastructures

Infrastructures

Acceso directo a los números publicados en la revista Informed Infraestructure

Informed Infraestructure

Acceso directo a los números publicados en la revista BiT

Acceso directo a los números publicados en la revista Revista de la Construcción

Revista de la Construcción

Ver todas las revistas