Autoencoder Forest for Anomaly Detection from IoT Time Series
Yiqun Hu, SP Group
Data Council Singapore 2019
Agenda
• Condition monitoring & anomaly detection
• Autoencoder for anomaly detection
• Autoencoder Forest
• End-to-end workflow
• Experiment results
Condition Monitoring & Anomaly Detection
Condition monitoring
• Manual monitoring
  – Huge human effort
  – Boring task with low quality
• Rule-based method
  – Cannot differentiate between environments
  – Cannot adapt to different conditions of the equipment
• Data-driven method
  – Model the common behavior of the equipment
Time-series anomaly detection
Autoencoder for Anomaly Detection
Autoencoder
• What is an autoencoder
  – An encoder-decoder type of neural network architecture that is used for self-learning from unlabeled data
• The idea of an autoencoder
  – Learn how to compress data into a concise representation that allows reconstruction with minimum error
• Different variants of autoencoder
  – Variational Autoencoder
  – LSTM Autoencoder
  – Etc.
[Figure: Autoencoder Neural Network]
Autoencoder for anomaly detection
[Figure: Offline Training (reconstruction errors) and Online Detection (anomaly score)]
Autoencoder Forest
A key challenge of autoencoder
[Figure: Single Autoencoder]
The idea of autoencoder forest
[Figure: data points in three groups, marked x, o and +]
Clustering subsequences is meaningless
[1] Eamonn Keogh, Jessica Lin, "Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research"
Autoencoder forest based on time
[Figure: time-of-day axis with labels 0:00, 1:00, 1:30, 22:00, 23:30]
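One way to read this slide is that each window is routed to an autoencoder according to the time of day at which it ends. The sketch below illustrates that routing under two assumptions that the slide does not state: the repeating period is one day, and slots are 30 minutes wide. The names `slot_index` and `group_by_slot` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def slot_index(ts, interval_minutes=30):
    """Return the time-of-day slot of a timestamp (0 = slot starting at midnight)."""
    return (ts.hour * 60 + ts.minute) // interval_minutes


def group_by_slot(samples, interval_minutes=30):
    """Group (end_timestamp, window) pairs by the slot of the end timestamp.

    Each group would become the training set of one autoencoder in the forest.
    """
    groups = defaultdict(list)
    for ts, window in samples:
        groups[slot_index(ts, interval_minutes)].append(window)
    return dict(groups)


# Made-up windows, used only to show the grouping.
samples = [
    (datetime(2018, 12, 3, 22, 0), [0.4, 0.5, 0.6]),
    (datetime(2018, 12, 4, 22, 0), [0.5, 0.5, 0.7]),
    (datetime(2018, 12, 4, 23, 30), [0.2, 0.1, 0.1]),
]
groups = group_by_slot(samples)
```

With 30-minute slots a daily period yields 48 groups, so windows ending at 22:00 on different days land in the same group and are modelled together.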
Training autoencoder forest
Input Layer: (window_size, 1)
Encoder layer 1: (window_size/2, 1)
Encoder layer 2: (window_size/4, 1)
Decoder layer 1: (window_size/2, 1)
Decoder layer 2: (window_size, 1)
• The structure is fixed for every autoencoder (kept as generic as possible).
• Each autoencoder within the forest is independent, so training is naturally parallelizable.
• With an early stopping mechanism, the training of each individual autoencoder can be stopped at a similar accuracy.
Autoencoder Forest
[Figure: Single Autoencoder vs. Autoencoder Forest]
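The training properties listed on this slide (fixed layer shapes, independent models, early stopping at a similar accuracy) can be sketched without a deep learning library. In the sketch below, `ToyModel` is a deliberately trivial stand-in for an autoencoder that learns a single constant, and `layer_shapes`, `train_until`, `train_forest` and the `target_loss` value are hypothetical names and settings, not taken from the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def layer_shapes(window_size):
    """Layer output shapes as listed on the slide, for a given window size."""
    return [
        (window_size, 1),       # input layer
        (window_size // 2, 1),  # encoder layer 1
        (window_size // 4, 1),  # encoder layer 2
        (window_size // 2, 1),  # decoder layer 1
        (window_size, 1),       # decoder layer 2
    ]


class ToyModel:
    """Stand-in for one autoencoder: fits one constant by gradient descent."""

    def __init__(self, data, lr=0.1):
        self.data, self.lr, self.c = data, lr, 0.0

    def step(self):
        """Run one epoch and return the mean squared error after the update."""
        grad = sum(2 * (self.c - v) for v in self.data) / len(self.data)
        self.c -= self.lr * grad
        return sum((self.c - v) ** 2 for v in self.data) / len(self.data)


def train_until(step, target_loss, max_epochs=1000):
    """Early stopping: stop as soon as the loss reaches target_loss."""
    epoch, loss = 0, float("inf")
    while epoch < max_epochs and loss > target_loss:
        loss = step()
        epoch += 1
    return epoch, loss


def train_forest(data_by_slot, target_loss=1e-4):
    """Train one independent model per slot, in parallel."""
    def train_one(item):
        slot, data = item
        return slot, train_until(ToyModel(data).step, target_loss)

    with ThreadPoolExecutor() as pool:
        return dict(pool.map(train_one, data_by_slot.items()))


# Made-up per-slot data; each slot needs a different number of epochs.
results = train_forest({0: [1.0, 1.0], 1: [5.0, 5.0]})
```

Because every model stops at the same loss target rather than after a fixed number of epochs, the slots end up at comparable accuracy even though they train for different lengths of time.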
End-to-end Workflow
Automatic end-to-end workflow
• Training: Time series analysis → Train Data Preprocessing → Train Window Extraction → Autoencoder Forest Training
• Anomaly detection: Test Data Preprocessing → Test Window Extraction → Anomaly scoring
Periodic pattern analysis
• Automatically determine the repeating period in the time series
  – Calculate the autocorrelations at different lags
  – Find the strong local maxima of the autocorrelation
  – Calculate the interval between any two local maxima
  – Find the mode of the intervals
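The four steps above can be sketched in plain Python. This is a minimal illustration, not the speaker's implementation: the `min_strength` threshold that defines a "strong" maximum is an assumed value, and intervals are taken between consecutive maxima (with lag 0 counted as the trivial first maximum) rather than between every pair.

```python
import math
from statistics import mean, mode


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    m = mean(x)
    var = sum((v - m) ** 2 for v in x)
    cov = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return cov / var


def detect_period(x, max_lag, min_strength=0.3):
    """Estimate the repeating period of x, or return None if none is found."""
    # Step 1: autocorrelation at lags 0..max_lag.
    acf = [autocorrelation(x, lag) for lag in range(max_lag + 1)]
    # Step 2: strong local maxima (min_strength is an assumed cut-off).
    peaks = [
        lag for lag in range(1, max_lag)
        if acf[lag] > acf[lag - 1]
        and acf[lag] >= acf[lag + 1]
        and acf[lag] >= min_strength
    ]
    if not peaks:
        return None
    # Step 3: intervals between consecutive maxima, starting from lag 0.
    positions = [0] + peaks
    intervals = [b - a for a, b in zip(positions, positions[1:])]
    # Step 4: the most frequent interval is taken as the period.
    return mode(intervals)


# Synthetic series with a period of 24 samples.
series = [math.sin(2 * math.pi * t / 24) for t in range(240)]
period = detect_period(series, max_lag=100)
```

Taking the mode of the intervals, rather than the first peak alone, makes the estimate less sensitive to a single spurious maximum.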
Missing data handling
Misalignment: 3:05 3:10 3:15 3:20 … … 16:15 16:21 16:24 16:30 … …
Missing: 3:05 3:10 3:15 3:20 … … 16:15 (16:20 – 16:40) 16:45 … …
• No need to impute
• If the missing gap is small, impute with neighbouring points
• If the missing gap is large, impute with the same time of other periods
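The two imputation rules can be sketched as follows for a series that has already been aligned to a regular grid, with `None` marking missing points. Several details are assumptions made for illustration: the `small_gap` threshold, linear interpolation as the way of using neighbouring points, and the mean over other periods for large gaps. The function name `impute` is hypothetical.

```python
from statistics import mean


def impute(values, period, small_gap=2):
    """Fill None entries in a regularly sampled series.

    Gaps of at most small_gap points with a neighbour on both sides are
    linearly interpolated; longer gaps take the mean of the observed
    values at the same position in other periods.
    """
    filled = list(values)
    n = len(values)
    i = 0
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        # Find the run of consecutive missing points [i, j).
        j = i
        while j < n and values[j] is None:
            j += 1
        if j - i <= small_gap and i > 0 and j < n:
            left, right = values[i - 1], values[j]
            for k in range(i, j):
                frac = (k - i + 1) / (j - i + 1)
                filled[k] = left + frac * (right - left)
        else:
            for k in range(i, j):
                same_time = [
                    values[m] for m in range(k % period, n, period)
                    if values[m] is not None
                ]
                if same_time:  # otherwise leave the point missing
                    filled[k] = mean(same_time)
        i = j
    return filled


# Made-up series with period 4: one short gap and one long gap.
raw = [1, 2, 3, 4, 1, None, 3, 4, None, None, None, 4]
clean = impute(raw, period=4, small_gap=1)
```

Only originally observed values are used for filling, so an interpolated point never feeds into another imputation.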
Anomaly scoring
• Extract the sequence window ending at time t
• The corresponding autoencoder from the learned autoencoder forest reconstructs the sequence window at time t
• Compute the reconstruction error as the anomaly score
[Figure labels: Median profile, Learned autoencoder forest]
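The scoring steps can be sketched as below. The slide does not say which error measure is used, so mean squared error is an assumption, as is selecting the autoencoder by `t % period`. The `stub_forest` helper is purely hypothetical: it replaces each trained autoencoder with a function that returns a stored normal window, so that the example runs without a neural network.

```python
def extract_window(series, t, window_size):
    """Return the window of length window_size that ends at index t (inclusive)."""
    start = t - window_size + 1
    if start < 0 or t >= len(series):
        raise ValueError("window does not fit inside the series")
    return series[start:t + 1]


def reconstruction_error(window, reconstruction):
    """Mean squared error between a window and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(window, reconstruction)) / len(window)


def anomaly_score(series, t, window_size, period, forest):
    """Score time t with the autoencoder responsible for its slot."""
    window = extract_window(series, t, window_size)
    reconstruct = forest[t % period]
    return reconstruction_error(window, reconstruct(window))


def stub_forest(normal_series, window_size, period):
    """Hypothetical stand-in forest: each slot returns a stored normal window."""
    forest = {}
    for slot in range(period):
        t = slot
        while t < window_size - 1:  # first index of this slot with a full window
            t += period
        profile = extract_window(normal_series, t, window_size)
        forest[slot] = lambda window, profile=profile: list(profile)
    return forest


# Made-up data: a clean periodic series and a copy with one spike.
normal = [0, 1, 2, 3] * 5
forest = stub_forest(normal, window_size=3, period=4)
observed = list(normal)
observed[10] = 9

score_normal = anomaly_score(observed, 6, 3, 4, forest)
score_spike = anomaly_score(observed, 10, 3, 4, forest)
```

A window that matches the learned behaviour of its slot scores zero here, while the window containing the spike gets a large score.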
Experiment Results
Cooling tower – return water temperature
Chiller – chilled water return temperature
Smart meter – half-hour consumption
Normal data: 2018-12-03 22:00:00
Top 3 detected anomalies: 2018-09-27 14:30:00, 2018-10-06 22:30:00, 2018-09-07 15:30:00
A common platform for time series data, with built-in AI capabilities
powering the nation
