Applied Sciences, Vol. 13, Issue 13 (2023)

Novel Object Captioning with Semantic Match from External Knowledge

Sen Du    
Hong Zhu    
Guangfeng Lin    
Dong Wang and Jing Shi    

Abstract

Automatically describing the content of an image is a challenging task at the intersection of natural language processing and computer vision. Current image captioning models describe objects that are frequently seen in the training set very well, but they fail to describe novel objects that are rarely or never seen in the training set. Although describing novel objects is important for practical applications, only a few works investigate this issue. Furthermore, those works investigate only rarely seen objects and ignore never-seen objects, even though never-seen objects outnumber frequently seen and rarely seen objects combined. In this paper, we propose two blocks that incorporate external knowledge into the captioning model to address this issue. First, in the encoding phase, the Semi-Fixed Word Embedding block improves the word embedding layer so that the captioning model can understand the meaning of arbitrary visual words rather than a fixed number of words. Second, the Candidate Sentences Selection block chooses candidate sentences by semantic matching rather than by probability, avoiding the influence of never-seen words. In experiments, we qualitatively analyze the proposed blocks and quantitatively evaluate several captioning models equipped with the proposed blocks on the Nocaps dataset. The experimental results show the effectiveness of the proposed blocks for novel objects. In particular, when describing never-seen objects, CIDEr and SPICE improve by 13.1% and 12.0%, respectively.
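
To make the idea of choosing candidate sentences by semantic matching rather than by probability more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the scoring rule (average best cosine similarity between detected visual words and sentence words), the function names `cosine`, `semantic_match_score`, and `select_caption`, and the toy vectors are all assumptions made for illustration. In practice the vectors would come from some external source of word knowledge, such as pretrained word embeddings.

```python
import math

# Toy word vectors standing in for external knowledge.
# The values are invented for illustration only (assumption).
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.85, 0.2, 0.0],
    "grass": [0.0, 0.9, 0.2],
    "car": [0.0, 0.1, 0.95],
    "street": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_match_score(sentence, visual_words, vectors):
    """Hypothetical score: for each visual word, take its best cosine
    similarity to any sentence word that has a vector, then average."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    known = [w for w in visual_words if w in vectors]
    if not tokens or not known:
        return 0.0
    best = [max(cosine(vectors[w], vectors[t]) for t in tokens) for w in known]
    return sum(best) / len(best)


def select_caption(candidates, visual_words, vectors):
    """Pick the candidate with the highest semantic match score,
    ignoring any probability the decoder assigned to it."""
    return max(
        candidates,
        key=lambda s: semantic_match_score(s, visual_words, vectors),
    )


candidates = ["a car parked on the street", "a dog sitting on the grass"]
visual_words = ["puppy", "grass"]  # e.g. labels from an object detector
chosen = select_caption(candidates, visual_words, TOY_VECTORS)
print(chosen)
```

In this toy setup the word "puppy" never appears in either candidate, yet the second candidate is preferred because "dog" lies close to "puppy" in the vector space. This mirrors, in a simplified way, how matching on meaning can favor a caption even when the exact visual word was never seen in training.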

Journal: Applied Sciences