Resumen
Recent results in person detection using deep learning methods applied to aerial images gathered by Unmanned Aerial Vehicles (UAVs) have demonstrated the applicability of this approach in scenarios such as Search and Rescue (SAR) operations. In this paper, the continuation of our previous research is presented. The main goal is to further improve detection results, especially in terms of reducing the number of false positive detections and consequently increasing the precision value. We present a new approach that, as input to the multimodel neural network architecture, uses sequences of consecutive images instead of only one static image. Since successive images overlap, the same object of interest needs to be detected in more than one image. The correlation between successive images was calculated, and detected regions in one image were translated to other images based on the displacement vector. The assumption is that an object detected in more than one image has a higher probability of being a true positive detection because it is unlikely that the detection model will find the same false positive detections in multiple images. Based on this information, three different algorithms for rejecting detections and adding detections from one image to other images in the sequence are proposed. All of them achieved precision value about 80% which is increased by almost 20% compared to the current state-of-the-art methods.