Algorithms, Vol. 13, Issue 7 (2020)

TBRNet: Two-Stream BiLSTM Residual Network for Video Action Recognition

Xiao Wu and Qingge Ji    

Abstract

Modeling spatiotemporal representations is one of the most essential yet challenging problems in video action recognition. Existing methods lack the capacity to accurately model either the correlations between spatial and temporal features or the global temporal dependencies. Inspired by the two-stream network for video action recognition, we propose an encoder-decoder framework named Two-Stream Bidirectional Long Short-Term Memory (LSTM) Residual Network (TBRNet), which exploits the interaction between spatiotemporal representations and global temporal dependencies. In the encoding phase, a two-stream architecture based on the proposed Residual Convolutional 3D (Res-C3D) network extracts features, with residual connections inserted between the two pathways; the features are then fused to form the short-term spatiotemporal features of the encoder. In the decoding phase, those short-term spatiotemporal features are first fed into a temporal attention-based bidirectional LSTM (BiLSTM) network to obtain long-term bidirectional attention-pooling dependencies. Subsequently, those temporal dependencies are integrated with the short-term spatiotemporal features to obtain global spatiotemporal relationships. A series of experiments on two benchmark datasets, UCF101 and HMDB51, verifies the effectiveness of the proposed TBRNet, which achieves results competitive with or better than existing state-of-the-art approaches.
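The decoding step described above (attention pooling over time, followed by integration with the short-term features) can be illustrated with a minimal sketch. It is not the authors' implementation: the per-step vectors stand in for BiLSTM outputs, and the dot-product scoring with a `query` vector, the concatenation-based `fuse` step, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool per-time-step vectors into one vector with attention weights.

    hidden_states: list of T vectors (stand-ins for BiLSTM outputs).
    query: hypothetical scoring vector; the score of step t is h_t . query.
    Returns (pooled_vector, weights).
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


def fuse(long_term, short_term_features):
    """Assumed fusion: concatenate the pooled long-term vector with the
    time-averaged short-term features."""
    steps = len(short_term_features)
    dim = len(short_term_features[0])
    mean_short = [sum(f[d] for f in short_term_features) / steps for d in range(dim)]
    return long_term + mean_short


# Three time steps of 2-dimensional features (toy values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(features, query=[10.0, 0.0])
global_repr = fuse(pooled, features)
```

With this toy query, the steps whose first component is large receive most of the attention weight, and `global_repr` has twice the feature dimension because of the assumed concatenation. In a real model the query and the recurrent states would be learned rather than fixed.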
