Abstract
With the rapid development of emerging technologies such as self-media, the Internet of Things, and cloud computing, massive data applications are entering an era of real-time analysis and value realization, which makes data streams ubiquitous across industries. Detecting anomalies in such data streams is therefore both important and challenging. For example, in industries such as electricity and finance, data stream anomalies often contain information that can help avoid risks and support decision making. However, most traditional anomaly detection algorithms rely on acquiring global information about the data, which makes them hard to apply in streaming scenarios. Existing reviews of anomaly detection, both domestic and international, tend to focus on algorithms for static data environments and offer little synthesis or analysis of algorithms for streaming data. Unlike those reviews, this review surveys the current mainstream anomaly detection algorithms for data streams and categorizes them into three types on the basis of their fundamental principles: (1) based on offline learning; (2) based on semi-online learning; (3) based on online learning. It summarizes the current state of research on data stream anomaly detection, examines the key issues in the various algorithms, and compares their advantages and disadvantages in detail. Finally, it analyzes the open challenges in the field and proposes future research directions.