Autoencoder Forest for Anomaly Detection from IoT Time Series
The document discusses the use of autoencoder forests for anomaly detection in IoT time series data, highlighting the inefficiencies of manual monitoring and traditional rule-based methods. It explains the structure and training process of autoencoding neural networks, particularly in handling time series analysis and anomaly scoring. Experiment results are presented, demonstrating its application in various monitoring scenarios, such as cooling tower temperatures and energy consumption.
Time-series anomaly detection
• Manual monitoring
  – Huge human effort
  – Tedious task that yields low-quality results
• Rule-based method
  – Cannot differentiate between different environments
  – Cannot adapt to different conditions of the equipment
• Data-driven method
  – Models the common behavior of the equipment
Autoencoder
• What is an autoencoder?
  – An encoder-decoder type of neural network architecture that is used for self-learning from unlabeled data
• The idea of the autoencoder
  – Learn how to compress data into a concise representation that allows reconstruction with minimum error
• Different variants of autoencoder
  – Variational Autoencoder
  – LSTM Autoencoder
  – Etc.
[Figure: Autoencoder Neural Network]
Autoencoder for anomaly detection
[Figure labels: Offline Training, Reconstruction errors, Online Detection, Anomaly score]
The idea of autoencoder forest
[Figure: scatter of points in three groups, marked x, o, and +]
Clustering subsequences is meaningless [1]
[1] Eamonn Keogh, Jessica Lin, "Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research"
Training autoencoder forest
Layer             Shape
Input layer       (window_size, 1)
Encoder layer 1   (window_size/2, 1)
Encoder layer 2   (window_size/4, 1)
Decoder layer 1   (window_size/2, 1)
Decoder layer 2   (window_size, 1)
• The structure is fixed for every autoencoder (kept as generic as possible).
• Each autoencoder within the forest is independent, so training is naturally parallelizable.
• With an early stopping mechanism, the training of each individual autoencoder can be stopped at a similar accuracy.
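The layer shapes listed above can be written out for a concrete window size. This is only a small illustrative helper, not the training code: the function name and layer names are hypothetical, and the requirement that window_size be a multiple of 4 is an assumption made so that every shape is an integer.

```python
def autoencoder_layer_shapes(window_size):
    """Return (layer name, shape) pairs for one autoencoder in the forest."""
    # Assumption: window_size must be a positive multiple of 4 so that
    # window_size/2 and window_size/4 are whole numbers.
    if window_size <= 0 or window_size % 4 != 0:
        raise ValueError("window_size must be a positive multiple of 4")
    return [
        ("input", (window_size, 1)),
        ("encoder_1", (window_size // 2, 1)),
        ("encoder_2", (window_size // 4, 1)),
        ("decoder_1", (window_size // 2, 1)),
        ("decoder_2", (window_size, 1)),
    ]


if __name__ == "__main__":
    for name, shape in autoencoder_layer_shapes(16):
        print(name, shape)
```

Because the structure is the same for every autoencoder, one such shape list describes the whole forest.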
Automatic end-to-end workflow
• Time series analysis
• Training: Train Data Preprocessing → Train Window Extraction → Autoencoder Forest Training
• Anomaly detection: Test Data Preprocessing → Test Window Extraction → Anomaly scoring
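The window extraction step might look like the sketch below. The slides do not spell out how windows are assigned to autoencoders, so grouping windows by the position of their end point within the period (end index modulo the period) is an assumption here, and the function name extract_windows is hypothetical.

```python
from collections import defaultdict


def extract_windows(series, window_size, period):
    """Slide a fixed-size window over the series and group windows by phase.

    Assumption: a window belongs to the group given by the position of its
    last point within the period (end index modulo period).
    """
    groups = defaultdict(list)
    for end in range(window_size - 1, len(series)):
        window = series[end - window_size + 1 : end + 1]
        groups[end % period].append(window)
    return dict(groups)


if __name__ == "__main__":
    groups = extract_windows(list(range(10)), window_size=3, period=4)
    for phase in sorted(groups):
        print(phase, groups[phase])
```

Under this assumption, each group would supply the training windows for one autoencoder, and the same function could be reused on test data.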
Periodic pattern analysis
• Automatically determine the repeating period in the time series
  – Calculate autocorrelations at different lags
  – Find the strong local maxima of the autocorrelation
  – Calculate the interval between any two local maxima
  – Find the mode of the intervals
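The four steps above can be sketched in plain Python. Several details are assumptions rather than something the slides state: "strong" is taken to mean an autocorrelation of at least 0.5, intervals are measured between consecutive maxima, and lag 0 is treated as an implicit first maximum so that a single peak still produces an interval. The function names are hypothetical.

```python
from statistics import mean, mode


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    mu = mean(series)
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mu) * (series[i + lag] - mu)
              for i in range(len(series) - lag))
    return cov / var


def estimate_period(series, max_lag=None, threshold=0.5):
    """Estimate the repeating period, or return None if no strong peak exists."""
    max_lag = max_lag or len(series) // 2
    acf = [autocorrelation(series, lag) for lag in range(max_lag + 1)]
    # Strong local maxima: higher than the left neighbour, not lower than
    # the right neighbour, and at least `threshold` (an assumed cut-off).
    peaks = [lag for lag in range(1, max_lag)
             if acf[lag] > acf[lag - 1]
             and acf[lag] >= acf[lag + 1]
             and acf[lag] >= threshold]
    if not peaks:
        return None
    # Intervals between consecutive maxima, with lag 0 as an implicit peak.
    intervals = [b - a for a, b in zip([0] + peaks, peaks)]
    return mode(intervals)
```

Taking the mode of the intervals, rather than the first peak alone, makes the estimate less sensitive to a spurious or missed peak.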
Missing data handling
Misalignment: 3:05, 3:10, 3:15, 3:20, …, 16:15, 16:21, 16:24, 16:30, …
Missing: 3:05, 3:10, 3:15, 3:20, …, 16:15, (16:20 – 16:40), 16:45, …
• Misalignment: no need to impute
• Missing:
  – If the missing gap is small, impute with neighbouring points
  – If the missing gap is large, impute with the same time of other periods
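A minimal sketch of the two imputation rules, assuming the series has already been placed on a regular grid with None marking missing slots. The threshold separating small from large gaps (max_small_gap), the use of linear interpolation for "neighbouring points", and averaging the same-phase values from other periods are all assumptions for illustration, as is the function name.

```python
def impute(values, period, max_small_gap=2):
    """Fill None entries in a regularly sampled series.

    Small gaps are linearly interpolated from the neighbouring points;
    larger gaps take the mean of the values observed at the same position
    in other periods. Both choices are illustrative assumptions.
    """
    filled = list(values)
    n = len(values)
    i = 0
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        j = i
        while j < n and values[j] is None:
            j += 1  # j ends at the first observed index after the gap
        gap = j - i
        left = values[i - 1] if i > 0 else None
        right = values[j] if j < n else None
        for k in range(i, j):
            if gap <= max_small_gap and left is not None and right is not None:
                frac = (k - i + 1) / (gap + 1)
                filled[k] = left + frac * (right - left)
            else:
                donors = [values[m] for m in range(k % period, n, period)
                          if values[m] is not None]
                # Stays None when no other period has a value at this phase.
                filled[k] = sum(donors) / len(donors) if donors else None
        i = j
    return filled
```

A small gap at the very start or end of the series has only one neighbour, so this sketch falls back to the same-phase rule there.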
Anomaly scoring
• Extract the sequence window ending at time t
• The corresponding autoencoder from the learned autoencoder forest reconstructs the sequence window at time t
• Compute the reconstruction error as the anomaly score
[Figure labels: Median profile, Learned autoencoder forest]
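The scoring steps can be sketched as follows. This is not the authors' implementation: the forest is represented as a dict of reconstruction callables, the "corresponding" autoencoder is assumed to be selected by t modulo the period, mean squared error is an assumed choice of reconstruction error, and template_forest is a hypothetical stand-in that replaces trained networks with fixed templates so the example runs without a deep learning library.

```python
def anomaly_score(series, t, window_size, period, forest):
    """Reconstruction error of the window ending at time t."""
    start = t - window_size + 1
    if start < 0 or t >= len(series):
        raise ValueError("window does not fit inside the series")
    window = series[start : t + 1]
    # Assumption: the corresponding autoencoder is chosen by phase.
    reconstruct = forest[t % period]
    rebuilt = reconstruct(window)
    # Assumption: mean squared error as the reconstruction error.
    return sum((a - b) ** 2 for a, b in zip(window, rebuilt)) / window_size


def template_forest(pattern, window_size):
    """Hypothetical stand-in for a trained forest: each 'autoencoder'
    ignores its input and returns the expected window for its phase."""
    period = len(pattern)
    forest = {}
    for phase in range(period):
        template = [pattern[(phase - window_size + 1 + i) % period]
                    for i in range(window_size)]
        forest[phase] = lambda window, template=template: list(template)
    return forest


if __name__ == "__main__":
    pattern = [0.0, 1.0, 2.0, 3.0]
    series = pattern * 3
    series[10] = 12.0  # injected anomaly
    forest = template_forest(pattern, window_size=4)
    for t in range(3, len(series)):
        print(t, anomaly_score(series, t, 4, 4, forest))
```

In this toy run, windows that contain the injected point score above zero while all other windows score zero, which is the behaviour the reconstruction-error idea relies on.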