Information, Vol. 12, Issue 12 (2021)

CSFF-Net: Scene Text Detection Based on Cross-Scale Feature Fusion

Yuan Li    
Mayire Ibrayim and Askar Hamdulla    

Abstract

In recent years, methods for detecting text in natural scenes have made significant progress with the rise of neural networks. However, because of the limited receptive field of convolutional neural networks (CNNs) and the simple representation of text by rectangular bounding boxes, previous methods may be insufficient for more challenging text instances. To solve this problem, this paper proposes a scene text detection network based on cross-scale feature fusion (CSFF-Net). The framework is built on the lightweight backbone network ResNet, and feature learning is enhanced by embedding a depth weighted convolution module (DWCM) while retaining the original feature information extracted by the CNN. A 3D-Attention module is also introduced to merge the context information of adjacent areas and thereby refine the features at each spatial size. In addition, because the Feature Pyramid Network (FPN) processes cross-layer information flow by simple element-wise addition and so cannot completely solve the interdependence problem, this paper introduces a Cross-Level Feature Fusion Module (CLFFM) into FPN; the resulting network is called the Cross-Level Feature Pyramid Network (Cross-Level FPN). The proposed CLFFM handles cross-layer information flow better and outputs detailed feature information, thus improving the accuracy of text region detection. Compared with the original network framework, the proposed framework achieves better performance in detecting text in images of complex scenes, and extensive experiments on three challenging datasets validate the feasibility of the approach.
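
For readers unfamiliar with the baseline that the abstract criticizes, the following is a minimal sketch of the standard FPN top-down merge, in which a coarser feature map is upsampled and combined with the finer one by element-wise addition. It is not the authors' CLFFM, whose internals the abstract does not specify. The function names (`upsample_nearest_2x`, `add_elementwise`, `fpn_top_down`) are hypothetical, feature maps are reduced to a single channel stored as nested lists, and the lateral 1x1 convolutions of a real FPN are omitted for brevity.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, rows x cols (illustrative simplification)


def upsample_nearest_2x(fm: FeatureMap) -> FeatureMap:
    """Double height and width by repeating each value (nearest neighbour)."""
    out: FeatureMap = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_elementwise(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two maps with identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fpn_top_down(levels: List[FeatureMap]) -> List[FeatureMap]:
    """Merge levels ordered fine -> coarse, each half the size of the previous.

    The coarsest map passes through unchanged; every finer map receives the
    upsampled merged map from the level above it by plain addition.
    """
    merged = [levels[-1]]
    for fine in reversed(levels[:-1]):
        merged.insert(0, add_elementwise(fine, upsample_nearest_2x(merged[0])))
    return merged


if __name__ == "__main__":
    c2 = [[1.0] * 4 for _ in range(4)]   # 4x4 fine level
    c3 = [[10.0, 20.0], [30.0, 40.0]]    # 2x2 coarse level
    p2, p3 = fpn_top_down([c2, c3])
    print(p2[0])
    print(p3)
```

Because the sum gives both inputs equal weight at every position, it cannot adapt to how the two levels depend on each other, which is the limitation the abstract says the CLFFM is meant to address.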

Similar articles

Xing Wu, Yangyang Qi, Jun Song, Junfeng Yao, Yanzhong Wang, Yang Liu, Yuexing Han and Quan Qian    
Scene Text Detection (STD) is critical for obtaining textual information from natural scenes, serving for automated driving and security surveillance. However, existing text detection methods fall short when dealing with the variation in text curvatures,...
Journal: Information

 
Minjun Jeon and Young-Seob Jeong    
Scene text detection is the task of detecting word boxes in given images. The accuracy of text detection has been greatly elevated using deep learning models, especially convolutional neural networks. Previous studies commonly aimed at developing more ac...
Journal: Applied Sciences

 
Shiwei Chen, Dayue Yao, Huiliang Cao and Chong Shen    
Action and identification problems are the challenges that visually impaired people often encounter in their lives. The high price of existing commercial intelligent auxiliary equipment has placed enormous economic pressure on most visually impaired peop...
Journal: Applied Sciences

 
Kobie Van Krieken    
News stories aim to create an immersive reading experience by virtually transporting the audience to the described scenes. In print journalism, this experience is facilitated by text-linguistic narrative techniques, such as detailed scene reconstructions...
Journal: Information