Applied Sciences, Vol. 13, Issue 7 (2023)

A Survey of Full-Cycle Cross-Modal Retrieval: From a Representation Learning Perspective

Suping Wang, Ligu Zhu, Lei Shi, Hao Mo and Songfu Tan

Abstract

Cross-modal retrieval aims to elucidate information fusion, imitate human learning, and advance the field. Although previous reviews have primarily focused on binary and real-valued coding methods, few have covered techniques grounded in deep representation learning. In this paper, we concentrate on harmonizing cross-modal representation learning with full-cycle modeling of the high-level semantic associations between vision and language, diverging from traditional statistical methods. We systematically categorize and summarize the challenges and open issues in applying current technologies, and we investigate the cross-modal retrieval pipeline, including pre-processing, feature engineering, pre-training tasks, encoding, cross-modal interaction, decoding, model optimization, and a unified architecture. Furthermore, we summarize benchmark datasets and evaluation metrics to help researchers keep pace with advances in cross-modal retrieval. By incorporating recent innovative works, we offer a perspective on potential future directions in cross-modal retrieval.
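To make the retrieval and evaluation setting referenced in the abstract concrete, the following minimal sketch illustrates one common formulation (not necessarily the one used in the surveyed works): a dual-encoder scheme where pre-computed image and text embeddings are compared by cosine similarity, scored with the standard Recall@K metric. The function names and toy data are assumptions for illustration only.

```python
# Illustrative sketch (assumption, not the paper's method): dual-encoder style
# cross-modal retrieval scored with Recall@K, as used in image-text benchmarks.
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so that dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def recall_at_k(image_emb: np.ndarray, text_emb: np.ndarray, k: int) -> float:
    """Text-to-image Recall@K: fraction of texts whose paired image
    (same row index) appears among the k most similar images."""
    sim = l2_normalize(text_emb) @ l2_normalize(image_emb).T   # (n_texts, n_images)
    topk = np.argsort(-sim, axis=1)[:, :k]                     # indices of the k best images per text
    hits = (topk == np.arange(sim.shape[0])[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-ins for encoder outputs: 100 paired image/text vectors of dimension 64.
    images = rng.normal(size=(100, 64))
    texts = images + 0.5 * rng.normal(size=(100, 64))  # noisy "paired" text embeddings
    for k in (1, 5, 10):
        print(f"R@{k} = {recall_at_k(images, texts, k):.2f}")
```

In practice, the image and text embeddings would come from the vision and language encoders discussed in the survey, and the same similarity matrix supports both text-to-image and image-to-text retrieval by transposition.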
