Informatics / Vol: 6 Par: 1 (2019)

Improvement in the Efficiency of a Distributed Multi-Label Text Classification Algorithm Using Infrastructure and Task-Related Data

Martin Sarnovsky and Marek Olejnik    

Abstract

Distributed computing technologies make it possible to solve a wide variety of tasks that involve large amounts of data. Various paradigms and technologies are already widely used, but many of them fall short when it comes to optimizing resource usage. This paper presents optimization methods that increase the efficiency of distributed implementations of a text-mining model. The methods use information about the text-mining task, extracted from the data, together with information about the current state of the distributed environment, obtained from a computational node, to improve the distribution of the task across the distributed infrastructure. Two optimization solutions are developed and implemented, both based on predicting the expected task duration on the existing infrastructure. The solutions are evaluated experimentally in a scenario in which a distributed tree-based multi-label classifier is built on two standard text data collections.
