Applied Sciences, Vol. 9, No. 14 (2019)
ARTICLE
TITLE

Structure Preserving Convolutional Attention for Image Captioning

Shichen Lu    
Ruimin Hu    
Jing Liu    
Longteng Guo and Fei Zheng    

Abstract

In image captioning, learning which image regions to attend to is necessary for focusing adaptively and precisely on the object semantics relevant to each decoded word. In this paper, we propose a convolutional attention module that preserves the spatial structure of the image by performing the convolution operation directly on the 2D feature maps. The proposed attention mechanism contains two components, convolutional spatial attention and cross-channel attention, which determine the regions to describe along the spatial and channel dimensions, respectively. Both attentions are calculated at each decoding step. To preserve the spatial structure, both components are computed directly on the entire feature maps with convolution operations, rather than on the vector representation of each image grid cell. Experiments on two large-scale datasets (MSCOCO and Flickr30K) demonstrate the effectiveness of the proposed method.
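The abstract does not give the exact formulation, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes feature maps of shape (C, H, W), a decoder hidden state that is projected and broadcast over the grid, a k x k convolution that produces the spatial attention map, and a 1 x 1 convolution followed by spatial averaging that produces the cross-channel attention. All names (`spatial_attention`, `channel_attention`, `w_h`, `w_cc`, `kernel`), shapes, and the use of tanh and softmax are illustrative assumptions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def conv2d_same(x, kernel):
    """Cross-correlate x (C, H, W) with kernel (C, k, k), odd k, zero padding.

    Returns a single (H, W) map, so spatial neighbours influence each score.
    """
    _, h, w = x.shape
    k = kernel.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[:, i:i + k, j:j + k] * kernel)
    return out


def fuse(feats, hidden, w_h):
    """Add the projected decoder state to every grid location (assumed fusion)."""
    return np.tanh(feats + (w_h @ hidden)[:, None, None])


def spatial_attention(feats, hidden, w_h, kernel):
    """Attention over grid locations, computed on the 2D maps (assumed form)."""
    scores = conv2d_same(fuse(feats, hidden, w_h), kernel)
    return softmax(scores.ravel()).reshape(scores.shape)


def channel_attention(feats, hidden, w_h, w_cc):
    """Attention over channels via a 1x1 convolution (assumed form)."""
    mixed = np.einsum("oc,chw->ohw", w_cc, fuse(feats, hidden, w_h))
    return softmax(mixed.mean(axis=(1, 2)))


# Toy shapes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, D, K = 8, 7, 7, 16, 3
feats = rng.standard_normal((C, H, W))      # CNN feature maps
hidden = rng.standard_normal(D)             # decoder state at one step
w_h = rng.standard_normal((C, D)) * 0.1
w_cc = rng.standard_normal((C, C)) * 0.1
kernel = rng.standard_normal((C, K, K)) * 0.1

alpha = spatial_attention(feats, hidden, w_h, kernel)   # (H, W)
beta = channel_attention(feats, hidden, w_h, w_cc)      # (C,)

# Attended context vector passed to the decoder at this step.
context = (feats * beta[:, None, None] * alpha[None]).sum(axis=(1, 2))
```

In this sketch both attention maps would be recomputed at every decoding step from the current hidden state. Because the scores come from convolutions over the full (C, H, W) tensor rather than from independent per-cell vectors, each location's weight depends on its spatial neighbours.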
