Applied Sciences, Vol. 9, No. 5 (2019)
ARTICLE
TITLE

On the Optimal Size of Candidate Feature Set in Random forest

Sunwoo Han and Hyunjoong Kim    

Abstract

Random forest is an ensemble method that combines many decision trees. Each level of a tree is determined by the optimal rule among a candidate feature set. The candidate feature set is a random subset of all features and is different at each level of a tree. In this article, we investigated whether the accuracy of Random forest is affected by the size of the candidate feature set. We found that the optimal size differs from dataset to dataset without any specific pattern. To estimate the optimal size of the feature set, we proposed a novel algorithm that uses the out-of-bag error and the "SearchSize" exploration. The proposed method is significantly faster than the standard grid search method while giving almost the same accuracy. Finally, we demonstrated that the accuracy of Random forest using the proposed algorithm is significantly higher than that obtained with a typical size of feature set.
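To make the role of the out-of-bag error concrete, the sketch below scores every candidate feature set size by its out-of-bag error and keeps the size with the lowest error. It is not the authors' "SearchSize" procedure, which the abstract does not describe; it is closer to the exhaustive grid search baseline that the proposed method is compared against. The use of scikit-learn, the synthetic data, the helper name `oob_error_by_size`, and all parameter values are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def oob_error_by_size(X, y, sizes, n_trees=100, seed=0):
    """Return {candidate feature set size: out-of-bag error} (hypothetical helper)."""
    errors = {}
    for m in sizes:
        forest = RandomForestClassifier(
            n_estimators=n_trees,
            max_features=m,    # number of features sampled as candidates at each split
            bootstrap=True,    # out-of-bag samples exist only with bootstrap sampling
            oob_score=True,    # score each sample using trees that did not see it
            random_state=seed,
        )
        forest.fit(X, y)
        errors[m] = 1.0 - forest.oob_score_
    return errors


# Synthetic data, purely illustrative (not from the article).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)

# Exhaustive search over every possible size, from 1 to the number of features.
sizes = range(1, X.shape[1] + 1)
errors = oob_error_by_size(X, y, sizes)
best_size = min(errors, key=errors.get)
print(best_size, errors[best_size])
```

Because the out-of-bag error is computed from the samples each tree did not see during training, no separate validation split or cross-validation loop is needed. The cost of the exhaustive loop still grows with the number of features, which is the expense a faster search strategy would aim to reduce.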

Similar articles

       
 
Weihao Cao, Guangli Cheng, Bao Liu and Yangfan Cai    
The current time-domain solution methods for the wavefield equations of a single medium do not apply to the wavefield equations of shallow water seismic with a fluid-elastomer coupling. To solve this problem, based on the explicit central difference meth...
Journal: Applied Sciences

 
Hasan Mhd Nazha, Mhd Ayham Darwich, Basem Ammar, Hala Dakkak and Daniel Juhre    
An investigation was conducted to examine the photothermal and thermomechanical effects of short-pulse laser irradiation on normal tissues. This study analyzed the impact of short-pulse laser radiation on the heat-affected region within tissues, taking i...
Journal: Applied Sciences

 
Myungsung Koo and Inyeong Kwon    
Gizzard shads are facing a continual decline in population, necessitating the implementation of selective gear design for effective resource management. This study aims to prevent the bycatch of young gizzard shads, a non-target fish species, and to deri...

 
Le Duc Quyen, Young-Gyu Park, In-cheol Lee and Jun Myoung Choi    
Microplastics, ubiquitous in our environment, are significantly impacted by the hydrodynamic conditions around them. This study utilizes CFD to explore how various breaker types influence the dispersion and accumulation of microplastics in nearshore area...

 
Nattakan Supajaidee, Nawinda Chutsagulprom and Sompop Moonchai    
Ordinary kriging (OK) is a popular interpolation method for its ability to simultaneously minimize error variance and deliver statistically optimal and unbiased predictions. In this work, the adaptive moving window kriging with K-means clustering (AMWKK)...
Journal: Algorithms