Resumen
Despite the success of learning-based systems, recent studies have highlighted video adversarial examples as a ubiquitous threat to state-of-the-art video classification systems. Video adversarial attacks add subtle noise to the original example, resulting in a false classification result. Thorough studies on how to generate video adversarial examples are essential to prevent potential attacks. Despite much research on this, existing research works on the robustness of video adversarial examples are still limited. To generate highly robust video adversarial examples, we propose a video-augmentation-based adversarial attack (v3a), focusing on the video transformations to reinforce the attack. Further, we investigate different transformations as parts of the loss function to make the video adversarial examples more robust. The experiment results show that our proposed method outperforms other adversarial attacks in terms of robustness. We hope that our study encourages a deeper understanding of adversarial robustness in video classification systems with video augmentation.