Resumen
Anomaly detection in data streams (and particularly time series) is today a vitally important task. Machine learning algorithms are a common design for achieving this goal. In particular, deep learning has, in the last decade, proven to be substantially more accurate than shallow learning in a wide variety of machine learning problems, and deep anomaly detection is very effective for point anomalies. However, deep semi-supervised contextual anomaly detection (in which anomalies within a time series are rare and none at all occur in the algorithm?s training data) is a more difficult problem. Hybrid anomaly detectors (a ?normal model? followed by a comparator) are one approach to these problems, but the separate loss functions for the two components can lead to inferior performance. We investigate a novel synthetic-example oversampling technique to harmonize the two components of a hybrid system, thus improving the anomaly detector?s performance. We evaluate our algorithm on two distinct problems: identifying pipeline leaks and patient-ventilator asynchrony.