JOURNAL
Information

ARTICLE

TITLE

Dense Model for Automatic Image Description Generation with Game Theoretic Optimization

Sreela S R and Sumam Mary Idicula

Abstract

With the rapid growth of deep learning technologies, automatic image description generation has become an interesting problem in computer vision and natural language generation. It helps to improve access to photo collections on social media and provides guidance for visually impaired people. Deep neural networks currently play a vital role in computer vision and natural language processing tasks. The main objective of this work is to generate a grammatically correct description of an image using the semantics of the training captions. The image description generation task is implemented with an encoder-decoder framework built on deep neural networks. The encoder is an image parsing module, and the decoder is a surface realization module. The framework uses densely connected convolutional neural networks (Densenet) for image encoding and Bidirectional Long Short Term Memory (BLSTM) for language modeling. Their outputs are passed to a bidirectional LSTM in the caption generator, which is trained to optimize the log-likelihood of the target description of the image. Most existing image captioning work uses RNNs and LSTMs for language modeling. RNNs are computationally expensive and have limited memory, and an LSTM reads its input in only one direction. This work uses a BLSTM, which avoids these problems of RNNs and LSTMs. The best combination of words during caption generation is selected using beam search and game theoretic search, and the results show that game theoretic search outperforms beam search. The model was evaluated on the standard benchmark dataset Flickr8k, with the Bilingual Evaluation Understudy (BLEU) score as the evaluation measure. A new evaluation measure called GCorrect was used to check the grammatical correctness of the descriptions. The proposed model improves on previous methods on the Flickr8k dataset and produces grammatically correct sentences for images, with a GCorrect of 0.040625 and a BLEU score of 69.96%.
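The beam search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_next_word_logprobs` is a hypothetical stand-in for the trained decoder, and its vocabulary and probabilities are invented for illustration only. The game theoretic search is not sketched, because the abstract does not describe how it works.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]


def toy_next_word_logprobs(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical decoder stand-in: log-probabilities of the next word.

    A real system would query the trained caption generator here.
    The numbers below are made up for illustration.
    """
    table = {
        (): {"a": 0.6, "the": 0.4},
        ("a",): {"dog": 0.5, "cat": 0.5},
        ("the",): {"dog": 0.9, "cat": 0.1},
    }
    # Any prefix not in the table is assumed to end the caption.
    probs = table.get(prefix, {"<end>": 1.0})
    return {word: math.log(p) for word, p in probs.items()}


def beam_search(
    step_fn: Callable[[Prefix], Dict[str, float]],
    beam_width: int = 2,
    max_len: int = 5,
    end_token: str = "<end>",
) -> Tuple[List[str], float]:
    """Return the highest-scoring caption found and its total log-probability."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    finished: List[Tuple[Prefix, float]] = []
    for _ in range(max_len):
        # Extend every live partial caption by every possible next word.
        candidates = [
            (words + (word,), score + logp)
            for words, score in beams
            for word, logp in step_fn(words).items()
        ]
        if not candidates:
            break
        # Keep only the beam_width best extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for words, score in candidates[:beam_width]:
            if words[-1] == end_token:
                finished.append((words, score))
            else:
                beams.append((words, score))
        if not beams:
            break
    # Captions cut off at max_len still compete with finished ones.
    finished.extend(beams)
    best_words, best_score = max(finished, key=lambda c: c[1])
    return list(best_words), best_score


if __name__ == "__main__":
    print(beam_search(toy_next_word_logprobs, beam_width=1))  # greedy decoding
    print(beam_search(toy_next_word_logprobs, beam_width=2))
```

With this toy table, a beam width of 1 (greedy decoding) commits to "a" as the first word, while a beam width of 2 keeps "the" alive long enough to find the more probable caption "the dog".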

Keywords

image captioning - image description generation - deep learning - Densenet - bidirectional LSTM

ISSUE

Volume: 10, Issue: 11 (2019)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/info10110354
