Abstract
Concept drift in time series data poses a problem for many machine learning algorithms. Underlying shifts in the statistical properties of the data lead to a decline in the performance of batch-trained models. Anomaly detection algorithms that rely on forecasts of a system's future behavior suffer from these effects. Adaptation to concept drift is therefore a fundamental challenge for such anomaly detection systems, especially in quickly evolving environments. Instead of retraining models from scratch at regular intervals, models can continuously learn and update themselves as new data arrives, a strategy known as online learning. This thesis investigates the efficacy of online machine learning for prediction-based anomaly detection in time series data under concept drift, focusing on accuracy and computational efficiency compared to batch-trained methods. The work presents a proof of concept for prediction-based anomaly detection using online learning. Furthermore, the research compares the performance of the presented approach with the well-known batch-trained models SARIMA and Prophet on real-world data from Deutsche Telekom’s IP Backbone, focusing on accuracy and efficiency. The resulting measurements indicate that the online learning approach detects anomalies more accurately when concept drift is present in the data. It exhibits superior adaptability to concept drift, whereas the batch-trained models fail to produce adequate forecasts after a changepoint. However, the batch-trained models perform better in static data environments. Lower CPU and memory usage and faster runtimes indicate the superior computational efficiency of the online learning method. Finally, this study confirms the superiority of online learning for prediction-based anomaly detection under concept drift and suggests potential applications in real-time systems and dynamic data environments. Some limitations of the approach motivate future work: subsequent research should explore a more diverse set of online learning algorithms for different use cases and address the challenges of online MLOps, in particular hyperparameter tuning. Additionally, distinguishing between anomalies and concept drift remains a critical challenge, suggesting avenues for further exploration in adaptive learning strategies.
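To give a concrete flavor of the prediction-based approach summarized above, the following minimal sketch illustrates the general idea only; it is not the model evaluated in this thesis. The class name and parameters (`alpha`, `error_decay`, `k`) are hypothetical. An exponentially weighted moving average serves as a simple online one-step forecaster, and an observation is flagged as anomalous when its prediction error exceeds a multiple of the running error standard deviation; both the forecast and the error statistics are updated with every point, so the detector keeps adapting as the data drifts.

```python
# Minimal sketch of prediction-based anomaly detection with an online model.
# An EWMA stands in for the online forecaster; error statistics are updated
# incrementally so the detector adapts to concept drift. Parameters are
# hypothetical defaults, not the configuration used in the thesis.

class OnlinePredictionAnomalyDetector:
    def __init__(self, alpha=0.1, error_decay=0.05, k=3.0):
        self.alpha = alpha              # learning rate of the EWMA forecaster
        self.error_decay = error_decay  # decay for the running error variance
        self.k = k                      # anomaly threshold in standard deviations
        self.forecast = None            # current one-step-ahead forecast
        self.error_var = 0.0            # exponentially weighted error variance

    def update(self, y):
        """Process one observation; return (is_anomaly, forecast)."""
        if self.forecast is None:
            self.forecast = y
            return False, y
        error = y - self.forecast
        std = self.error_var ** 0.5
        is_anomaly = std > 0 and abs(error) > self.k * std
        # Online updates: forecaster and error statistics learn from every
        # point, so they track drifting behavior without retraining.
        self.error_var = (1 - self.error_decay) * self.error_var \
            + self.error_decay * error ** 2
        self.forecast = (1 - self.alpha) * self.forecast + self.alpha * y
        return is_anomaly, self.forecast


if __name__ == "__main__":
    detector = OnlinePredictionAnomalyDetector()
    stream = [10, 11, 10, 12, 11, 30, 12, 11]  # 30 is an obvious outlier
    for t, y in enumerate(stream):
        flagged, forecast = detector.update(y)
        print(f"t={t} y={y} forecast={forecast:.1f} anomaly={flagged}")
```

Because every statistic is updated per observation, memory use stays constant and no periodic retraining is needed, which is the property the abstract contrasts with the batch-trained SARIMA and Prophet models.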
Publication Data
Endorsements
# | Name | Details | Endorsement |
---|---|---|---|
1 | Dr. Florian Heinrichs (Examiner) | Professor in Data Science and Statistics | 09/15/24 01:00:00 AM |