Algorithms, Vol. 17, No. 3 (2024)
ARTICLE
TITLE

Automatic Optimization of Deep Learning Training through Feature-Aware-Based Dataset Splitting

Somayeh Shahrabadi    
Telmo Adão    
Emanuel Peres    
Raul Morais    
Luís G. Magalhães and Victor Alves    

Abstract

The proliferation of classification-capable artificial intelligence (AI) across a wide range of domains (e.g., agriculture and construction) has made it possible to optimize and complement several tasks typically carried out by humans. The training that enables such support is frequently hindered by dataset-related challenges, including scarce examples and imbalanced class distributions, which are detrimental to producing accurate models. Addressing these challenges properly requires strategies smarter than the traditional brute-force K-fold cross-validation or the naive hold-out method, with two main goals in mind: (1) producing one-shot, close-to-optimal data arrangements that accelerate conventional training optimization; and (2) maximizing the capacity of inference models while relieving the computational burden. To that end, this paper proposes two image-based, feature-aware dataset splitting approaches, hypothesized to contribute to classification models that are closer to their full inference potential. Both rely on strategic image harvesting: one hinges on weighted random selection from a set of feature-based clusters, while the other involves a balanced picking process from a sorted list that stores the distances of data features to the centroid of the whole feature space. Comparative tests on datasets related to grapevine leaf phenotyping and bridge defects show promising results, highlighting a viable alternative to K-fold cross-validation and hold-out methods.
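
The two splitting ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function names (`cluster_weighted_split`, `centroid_distance_split`), the `val_fraction` parameter, the use of cluster size as the selection weight, and the even-stride picking rule are all assumptions made here for illustration. Feature vectors and cluster labels are assumed to have been computed beforehand (for example, by a pretrained image encoder and a clustering step).

```python
import math
import random


def cluster_weighted_split(cluster_ids, val_fraction=0.2, seed=0):
    """Draw validation samples at random from each cluster.

    Assumption: the "weight" of a cluster is its size, so each cluster
    contributes roughly val_fraction of its members (at least one).
    """
    rng = random.Random(seed)
    clusters = {}
    for idx, cid in enumerate(cluster_ids):
        clusters.setdefault(cid, []).append(idx)

    val = []
    for members in clusters.values():
        k = max(1, round(len(members) * val_fraction))
        val.extend(rng.sample(members, k))

    val_set = set(val)
    train = [i for i in range(len(cluster_ids)) if i not in val_set]
    return train, sorted(val)


def centroid_distance_split(features, val_fraction=0.2):
    """Pick validation samples at an even stride from a distance-sorted list.

    Assumption: "balanced picking" means spreading the picks evenly over
    the list of samples sorted by distance to the global centroid.
    """
    n = len(features)
    dim = len(features[0])
    centroid = [sum(f[d] for f in features) / n for d in range(dim)]

    # Sample indices ordered from closest to farthest from the centroid.
    order = sorted(range(n), key=lambda i: math.dist(features[i], centroid))

    n_val = max(1, round(n * val_fraction))
    step = n / n_val
    # One pick from the middle of each of n_val equal-width slices.
    val_positions = {int(k * step + step / 2) for k in range(n_val)}

    val = [order[p] for p in sorted(val_positions)]
    train = [order[p] for p in range(n) if p not in val_positions]
    return train, val
```

Under these assumptions, both functions return a one-shot pair of index lists (training, validation), in contrast to K-fold cross-validation, which trains once per fold.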
