ARTICLE
TITLE

Non-Local Spatial and Temporal Attention Network for Video-Based Person Re-Identification

Zheng Liu, Feixiang Du, Wang Li, Xu Liu and Qiang Zou

Abstract

Given a video containing a person, the video-based person re-identification (Re-ID) task aims to identify the same person across videos captured by different cameras. How to embed the spatial-temporal information of a video into its feature representation is a crucial challenge. Most existing methods fail to make full use of the relationships between frames during feature extraction. In this work, we propose a plug-and-play non-local attention module (NLAM) for frame-level feature extraction. NLAM, based on global spatial attention and channel attention, helps the network locate the person in each frame. In addition, we propose a non-local temporal pooling (NLTP) method for temporal feature aggregation, which can effectively capture long-range and global dependencies among the frames of a video. Our model obtained impressive results on different datasets compared to state-of-the-art methods. In particular, it achieved a rank-1 accuracy of 86.3% on the MARS (Motion Analysis and Re-identification Set) dataset without re-ranking, which is 1.4% higher than the previous state-of-the-art method. On the DukeMTMC-VideoReID (Duke Multi-Target Multi-Camera Video Re-identification) dataset, our method also performed well, reaching 95% rank-1 accuracy and 94.5% mAP (mean Average Precision).
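The abstract names two non-local components: a spatial/channel attention module (NLAM) applied frame by frame, and a non-local temporal pooling step (NLTP) that aggregates frame features into one clip feature. The PyTorch sketch below only illustrates the generic non-local (pairwise-affinity) operation in both settings; the class names, layer choices, and the reduction factor are illustrative assumptions in the style of Wang et al.'s non-local networks, not the authors' exact NLAM or NLTP.

import torch
import torch.nn as nn

class NonLocalBlock2d(nn.Module):
    # Non-local (self-attention) block over the spatial positions of a
    # frame feature map. Hypothetical sketch, not the paper's NLAM.
    def __init__(self, in_channels, reduction=2):
        super().__init__()
        inter = in_channels // reduction
        self.theta = nn.Conv2d(in_channels, inter, 1)   # query projection
        self.phi = nn.Conv2d(in_channels, inter, 1)     # key projection
        self.g = nn.Conv2d(in_channels, inter, 1)       # value projection
        self.out = nn.Conv2d(inter, in_channels, 1)     # restore channel count

    def forward(self, x):                                # x: (b, c, h, w)
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)     # (b, h*w, c')
        k = self.phi(x).flatten(2)                       # (b, c', h*w)
        v = self.g(x).flatten(2).transpose(1, 2)         # (b, h*w, c')
        attn = torch.softmax(q @ k, dim=-1)              # affinity of each position to all others
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                           # residual connection

class NonLocalTemporalPool(nn.Module):
    # Hypothetical sketch of non-local temporal aggregation: frame-level
    # descriptors attend to all other frames, then are averaged into a
    # single clip-level feature.
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)

    def forward(self, f):                                # f: (b, t, dim)
        attn = torch.softmax(
            self.q(f) @ self.k(f).transpose(1, 2) / f.shape[-1] ** 0.5, dim=-1)
        return (attn @ f).mean(dim=1)                    # (b, dim)

Under these assumptions, one would insert NonLocalBlock2d after a backbone stage to refine each frame's feature map, then apply NonLocalTemporalPool to the per-frame descriptors to obtain one fixed-length feature per video clip.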

Similar articles