Date of Award
1-1-2025
Document Type
Dissertation
Degree Name
Ph.D. in Engineering Science
First Advisor
Yili Jiang
Second Advisor
Charles Walter
Third Advisor
Joshua Brown
School
University of Mississippi
Relational Format
dissertation/thesis
Abstract
Machine learning (ML) algorithms play a critical role in automated decision-making systems across domains such as healthcare, finance, and autonomous systems. However, these models are increasingly vulnerable to adversarial threats, particularly poisoning attacks that manipulate training data without the developers' knowledge. Because ML models are often trained on publicly available data, data poisoning is inexpensive for attackers to carry out, and the current ML training pipeline offers no way to determine whether training data is legitimate, poisoned during data collection, or poisoned during training.
This dissertation investigates data poisoning attacks, with a focus on label flipping and gradient manipulation techniques, two attacks capable of compromising the integrity and performance of ML systems. The dissertation also explores the impact of these attacks on key performance metrics, including detection accuracy across multiple ML algorithms. To establish a foundation for evaluation, I benchmark Decision Trees, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Support Vector Machines (SVM) on manipulated datasets, generating strong baseline performance metrics that enable direct comparison of poisoning attacks.
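As a minimal illustration of the label-flipping attack studied here (not the dissertation's actual attack code), the sketch below randomly reassigns a chosen fraction of training labels to a different class; the function name and parameters are hypothetical:

```python
import numpy as np

def flip_labels(y, flip_rate, num_classes, rng=None):
    """Label-flipping poisoning: reassign a fraction of labels to a wrong class.

    y           -- integer label array
    flip_rate   -- fraction of samples to poison (e.g. 0.1 for 10%)
    num_classes -- total number of classes
    """
    rng = np.random.default_rng(rng)
    y_poisoned = y.copy()
    n_flip = int(len(y) * flip_rate)
    # Pick distinct victim samples, then flip each to a different class.
    idx = rng.choice(len(y), size=n_flip, replace=False)
    for i in idx:
        wrong_classes = [c for c in range(num_classes) if c != y[i]]
        y_poisoned[i] = rng.choice(wrong_classes)
    return y_poisoned
```

Training each benchmarked classifier on `y_poisoned` instead of `y`, while evaluating on clean test labels, gives the before/after accuracy comparison described above.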
Building on this foundation, I introduce DynaDetect [44], a novel KNN-based algorithm designed to detect data poisoning attacks in real time. I further develop DynaDetect2.0, an improved version that integrates Convolutional Neural Networks (CNNs) for feature extraction and the Mahalanobis distance for greater detection accuracy on high-dimensional data. I demonstrate the viability of DynaDetect2.0 on the CIFAR-10, ImageNet, and GTSRB datasets, where it outperforms both the original DynaDetect and traditional KNN at detecting label-flipping and gradient poisoning attacks.
To better understand the impact of DynaDetect2.0, I assess the vulnerability of multiple ML algorithms to poisoning attacks and examine new potential detection methods. This work emphasizes computational overhead, efficiency, and latency as key constraints, ensuring these algorithms can rapidly and accurately detect data poisoning in real-world scenarios. The results provide insights into the conditions under which ML algorithms are most vulnerable to poisoning attacks and offer effective strategies for identifying these threats.
This dissertation’s anticipated contributions include advancements in detection mechanisms, such as DynaDetect2.0, and the application of these techniques to other traditional ML algorithms. By improving ML systems’ resilience, this work aims to strengthen their security and reliability, ensuring they can withstand sophisticated malicious attacks across diverse application environments.
Recommended Citation
Perry, Sabrina, "Improving Detection Capabilities of Traditional Machine Learning (ML) Algorithms Against Data Poisoning Attacks on Image Data" (2025). Electronic Theses and Dissertations. 3354.
https://egrove.olemiss.edu/etd/3354