Resumen
Discriminative correlation filters (DCFs) have been shown to perform superiorly in visual object tracking. However, visual tracking is still challenging when the target objects undergo complex scenarios such as occlusion, deformation, scale changes and illumination changes. In this paper, we utilize the hierarchical features of convolutional neural networks (CNNs) and learn a spatial-temporal context correlation filter on convolutional layers. Then, the translation is estimated by fusing the response score of the filters on the three convolutional layers. In terms of scale estimation, we learn a discriminative correlation filter to estimate scale from the best confidence results. Furthermore, we proposed a re-detection activation discrimination method to improve the robustness of visual tracking in the case of tracking failure and an adaptive model update method to reduce tracking drift caused by noisy updates. We evaluate the proposed tracker with DCFs and deep features on OTB benchmark datasets. The tracking results demonstrated that the proposed algorithm is superior to several state-of-the-art DCF methods in terms of accuracy and robustness.