Algorithms, Vol. 13, No. 9 (2020)

A Two-Phase Approach for Semi-Supervised Feature Selection

Amit Saxena    
Shreya Pare    
Mahendra Singh Meena    
Deepak Gupta    
Akshansh Gupta    
Imran Razzak    
Chin-Teng Lin and Mukesh Prasad    

Abstract

This paper proposes a novel approach for selecting a subset of features in semi-supervised datasets, where only some of the patterns are labeled. The process is completed in two phases. In the first phase, Phase-I, the dataset is divided into two parts: the first part contains the labeled patterns, and the second part contains the unlabeled patterns. A small number of features is then identified using the well-known maximum-relevance (computed on the first part) and minimum-redundancy (computed on the whole dataset) feature selection approaches, both based on the correlation coefficient. From the identified features, the subset that produces a high classification accuracy on the labeled patterns with any supervised classifier is selected for later processing. In the second phase, Phase-II, the patterns of the first and second parts are clustered separately, with the number of clusters equal to the number of classes in the dataset. For each cluster of the first part, the class of the majority of its patterns, which is already known, is taken as the class of that cluster. The cluster centroids of the first and second parts are then paired: the second-part centroid nearest to a first-part centroid is paired with it. Because the class of the first-part centroid is known, the same class is assigned to the paired second-part centroid, whose class is unknown. If the actual classes of the patterns in the second part are available, they can be used to test the classification accuracy on the second part. The proposed two-phase approach performs well in terms of classification accuracy and number of features selected on the given benchmark datasets.
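The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact scoring rule, so the greedy score used here (absolute correlation with the label minus mean absolute correlation with the already selected features) is an assumption, as are the function names, the numeric class labels, and the choice to give every second-part centroid the class of its nearest first-part centroid. The clustering step itself is left out; cluster assignments and centroids are assumed to come from any clustering algorithm.

```python
from collections import Counter
from math import dist, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def column(rows, j):
    """Extract feature j from a list of patterns."""
    return [row[j] for row in rows]


def select_features(labeled_X, labels, all_X, k):
    """Phase-I sketch: greedy max-relevance / min-redundancy selection.

    Relevance uses only the labeled part; redundancy uses the whole dataset.
    The difference-based score is an assumed, illustrative choice.
    """
    n_features = len(all_X[0])
    k = min(k, n_features)
    relevance = [abs(pearson(column(labeled_X, j), labels))
                 for j in range(n_features)]
    selected = []
    while len(selected) < k:
        best, best_score = None, float("-inf")
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    abs(pearson(column(all_X, j), column(all_X, s)))
                    for s in selected
                ) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


def majority_class(cluster_ids, labels):
    """Phase-II sketch: label each first-part cluster by majority vote."""
    votes = {}
    for c, y in zip(cluster_ids, labels):
        votes.setdefault(c, Counter())[y] += 1
    return {c: count.most_common(1)[0][0] for c, count in votes.items()}


def pair_centroids(labeled_centroids, centroid_class, unlabeled_centroids):
    """Assign each second-part centroid the class of the nearest
    first-part centroid (Euclidean distance, an assumed metric)."""
    classes = []
    for u in unlabeled_centroids:
        nearest = min(range(len(labeled_centroids)),
                      key=lambda i: dist(u, labeled_centroids[i]))
        classes.append(centroid_class[nearest])
    return classes
```

In this sketch, `centroid_class` maps a cluster index in `labeled_centroids` to the class returned by `majority_class`. Nearest-centroid assignment does not force a one-to-one pairing; if two second-part centroids fall closest to the same first-part centroid, both receive its class.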
