Information, Vol. 13, No. 1 (2022)
ARTICLE
TITLE

Object Detection of Road Assets Using Transformer-Based YOLOX with Feature Pyramid Decoder on Thai Highway Panorama

Teerapong Panboonyuen    
Sittinun Thongbai    
Weerachai Wongweeranimit    
Phisan Santitamnont    
Kittiwan Suphan and Chaiyut Charoenphon    

Abstract

Detecting road assets such as kilometer stones remains challenging because the objects vary widely in size, and this directly affects the accuracy of object counts. Transformers have demonstrated impressive results in various natural language processing (NLP) and image processing tasks owing to their ability to model long-range dependencies. This paper proposes a method that exceeds the You Only Look Once (YOLO) series, with two contributions: (i) We employ a pre-training objective to obtain the original visual tokens based on the image patches of road asset images. Using a pre-trained Vision Transformer (ViT) as the backbone, we directly fine-tune the model weights on downstream tasks by appending task layers to the pre-trained encoder. (ii) We apply Feature Pyramid Network (FPN) decoder designs to our deep learning network so that it learns the importance of different input features instead of simply summing or concatenating them, which may cause feature mismatch and performance degradation. In conclusion, our proposed method (Transformer-Based YOLOX with FPN) learns very general representations of objects. It significantly outperforms other state-of-the-art (SOTA) detectors, including YOLOv5S, YOLOv5M, and YOLOv5L. It reaches 61.5% AP on the Thailand highway corpus, surpassing the current best practice (YOLOv5L) by 2.56% AP on the test-dev data set.
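
The two ideas in the abstract, splitting an image into patches for a ViT-style backbone and learning how much each input feature map should contribute to a fused output, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `image_to_patches` and `weighted_fusion`, the patch size, and the specific normalization (non-negative weights divided by their sum, a scheme known from BiFPN-style fusion) are assumptions chosen only for illustration. In a real network the raw weights would be trainable parameters.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Hypothetical helper: returns an array of shape
    (num_patches, patch_size * patch_size * C), one row per patch.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Separate the patch grid axes from the within-patch axes.
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    # Reorder to (grid_row, grid_col, patch_row, patch_col, channel).
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def weighted_fusion(features, raw_weights, eps=1e-4):
    """Fuse same-shape feature maps with normalized, non-negative weights.

    Assumed scheme (not confirmed by the source): clip each raw weight at
    zero, then divide by the sum so the weights express relative importance.
    """
    w = np.maximum(np.asarray(raw_weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))


# Usage: a 4x4 RGB image becomes four 2x2 patches of 12 values each.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
patches = image_to_patches(image, patch_size=2)

# Two toy feature maps fused with equal raw weights.
low_level = np.ones((2, 2))
high_level = 3.0 * np.ones((2, 2))
fused = weighted_fusion([low_level, high_level], raw_weights=[1.0, 1.0])
```

Compared with plain summation, the weighted form lets training suppress an input level that does not help (its weight is driven toward zero) without changing the architecture.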
