Analyzing an Attack and Implementing Compensating Controls for an AI System

This lab supports preparation for the CompTIA SecAI+ (CY0-001) certification exam. The following table maps the major concepts and tasks covered in this lab to the corresponding exam objectives.

Lab Task/Concept | CompTIA SecAI+ (CY0-001) Objective | Description
Task 2: Data Poisoning Analysis | 2.5: Given a scenario, implement monitoring and auditing for an AI system | Analyzing training logs and model metrics to detect anomalies (loss spikes) indicative of a poisoning attack
Task 3: Evasion Attack Analysis | 2.6: Given a scenario, analyze an attack and implement compensating controls | Investigating adversarial examples and calculating perturbation magnitude to understand the evasion attack vector
Task 4: Implementing Compensating Control | 2.2: Given a scenario, implement security controls for AI systems | Implementing a compensating control (input filter) to mitigate the immediate threat of an evasion attack
Task 5: Final Reporting & Recommendation | 2.6: Given a scenario, analyze an attack and implement compensating controls | Verifying the control's effectiveness and recommending a corrective control (model retraining) to address the root cause
General Lab Context | 4.2: Explain risks associated with AI | Understanding the mechanics and impact of adversarial machine learning (AML) attacks (poisoning and evasion)
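
The loss-spike check named for Task 2 can be illustrated with a short sketch. This is not the lab's own script: the function name find_loss_spikes, the window and threshold values, and the per-epoch loss figures are all assumptions made for illustration. The idea is to flag any epoch whose loss sits far above the average of the preceding epochs.

```python
from statistics import mean, stdev


def find_loss_spikes(losses, window=5, threshold=3.0):
    """Return indices whose loss is more than `threshold` standard
    deviations above the mean of the previous `window` values."""
    spikes = []
    for i in range(window, len(losses)):
        history = losses[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: avoid dividing by zero.
            sigma = 1e-9
        if (losses[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Hypothetical per-epoch training loss; the jump at index 7 is invented
# to stand in for the effect of a poisoned batch.
epoch_losses = [0.92, 0.81, 0.74, 0.69, 0.65, 0.62, 0.60, 1.35, 0.71, 0.58]
print(find_loss_spikes(epoch_losses))  # -> [7]
```

A trailing window is used rather than a global mean so that the normal downward trend of training loss does not mask a later spike. A flagged epoch is only a lead for investigation, since learning-rate changes or noisy batches can produce similar jumps.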

Overview

Artificial intelligence (AI) systems, particularly those based on machine learning (ML), are increasingly deployed in critical infrastructure, including security and defense applications. This reliance introduces a new class of security risks, primarily from adversarial machine learning (AML) attacks. These attacks aim to manipulate the behavior of the AI model, either during training (poisoning) or during inference (evasion), to cause a malfunction or a security breach.

This lab presents a simulated scenario in which a security analyst investigates evidence of an AML attack on a critical AI system: a computer vision model used for object detection in a surveillance system. The analysis focuses on identifying the attack vector and its impact from the simulated output of Python scripts. After the analysis, the student suggests and implements compensating controls to mitigate the identified risks, in line with objective 2.6: Given a scenario, analyze the evidence of an attack and suggest compensating controls for AI systems.
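
When analyzing an evasion attack, one common way to quantify the attack vector is to measure how far the adversarial input lies from the clean one. The sketch below assumes both inputs are available as flat lists of pixel values; the function name perturbation_norms and the sample values are hypothetical and are not taken from the lab scripts.

```python
import math


def perturbation_norms(original, adversarial):
    """Compare two equal-length inputs and return common perturbation norms."""
    if len(original) != len(adversarial):
        raise ValueError("inputs must have the same length")
    delta = [a - o for o, a in zip(original, adversarial)]
    return {
        # Number of values that were changed at all.
        "l0": sum(1 for d in delta if d != 0),
        # Euclidean size of the overall change.
        "l2": math.sqrt(sum(d * d for d in delta)),
        # Largest change made to any single value.
        "linf": max(abs(d) for d in delta),
    }


# Hypothetical 8-bit pixel values for a clean and a perturbed input.
clean = [120, 64, 200, 33, 90, 180]
adv = [123, 64, 197, 35, 90, 184]
print(perturbation_norms(clean, adv))
```

A small L-infinity value relative to the 0-255 pixel range, combined with a changed prediction, is consistent with a subtle evasion attempt rather than ordinary input noise, although the norms alone do not prove intent.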

VM Credentials

Username: student

Password: student

Key Terms and Descriptions

Adversarial Machine Learning (AML)
A field of study focused on the vulnerabilities of machine learning models to malicious inputs and the development of defensive techniques
Data Poisoning Attack
An attack where an adversary injects malicious data into the training dataset to corrupt the model's integrity and performance
Evasion Attack
An attack where an adversary crafts a subtly modified input (an adversarial example) to cause a trained model to misclassify it during inference
Adversarial Example
An input to a machine learning model that has been intentionally perturbed, often in ways imperceptible to humans, to cause the model to make an incorrect prediction
Compensating Control
An alternative security control used to mitigate a risk when a primary control is not feasible or cannot fully address the threat
Preventative Control
A security control designed to stop an attack or security violation from occurring (e.g., input validation)
Detective Control
A security control designed to identify and alert on an attack or security violation that has occurred (e.g., anomaly detection)
Corrective Control
A security control designed to fix the effects of an attack or security violation (e.g., model retraining)
Model Robustness
The ability of a machine learning model to maintain its performance and accuracy when faced with noisy, corrupted, or adversarial inputs
Threat Model
A structured approach to identifying, analyzing, and prioritizing potential threats to a system, including the assets, vulnerabilities, and adversaries
Attack Surface
The sum of all points where an unauthorized user can try to enter data to or extract data from an environment
Feature Space
The multidimensional space where the data points used to train a machine learning model reside, with each dimension representing a feature
Perturbation
A small, often imperceptible, change applied to an input data point to create an adversarial example
Transferability
The phenomenon where an adversarial example crafted for one model can successfully cause misclassification in a different model
White-Box Attack
An attack where the adversary has full knowledge of the target model's architecture, parameters, and training data
Black-Box Attack
An attack where the adversary has no knowledge of the target model's internal workings, only access to its input/output interface
Data Integrity
The assurance that data is accurate, consistent, and trustworthy throughout its lifecycle
Model Inversion Attack
An attack that attempts to reconstruct sensitive training data from the model's outputs
Model Extraction Attack
An attack that attempts to steal the intellectual property of a model by querying it and replicating its functionality
NIST AI RMF
The National Institute of Standards and Technology's Artificial Intelligence Risk Management Framework, a voluntary framework for managing risks associated with AI
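
To make the Compensating Control and Preventative Control entries more concrete, the sketch below shows one possible input filter placed in front of a model. It is an illustration under stated assumptions, not the filter used in the lab: it applies bit-depth reduction, a technique often described as feature squeezing, to 8-bit pixel values, and the function name squeeze_bit_depth and the sample values are hypothetical.

```python
def squeeze_bit_depth(pixels, bits=3):
    """Validate 8-bit pixel values and quantize them to `bits` bits each."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    levels = (1 << bits) - 1  # highest quantization level, e.g. 7 for 3 bits
    squeezed = []
    for p in pixels:
        if not 0 <= p <= 255:
            # Basic input validation: reject values outside the valid range.
            raise ValueError(f"pixel value out of range: {p}")
        q = round(p / 255 * levels)               # snap to the nearest level
        squeezed.append(round(q / levels * 255))  # map back to the 0-255 scale
    return squeezed


# Hypothetical clean input and a slightly perturbed copy of it.
clean_pixels = [36, 109, 182, 255]
adv_pixels = [39, 106, 185, 252]
print(squeeze_bit_depth(clean_pixels))
print(squeeze_bit_depth(adv_pixels))
```

In this example both inputs map to the same squeezed values, so the small perturbation never reaches the model. The filter only removes changes that stay within one quantization step, and it also discards legitimate detail, which is why it is treated as a stopgap while a corrective control such as retraining addresses the root cause.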