| 🏠 Back to Exam Syllabus | 📺 RooCloud on YouTube | 🌐 RooCloud Practice Exams |
AI Threats: Data Poisoning, Prompt Injection, and Model Theft
This episode of the ISACA Advanced in AI Audit (AAIA) exam prep series walks through the major threats facing AI systems across the entire life cycle, from build through to live use. You’ll see how attackers target different stages, the distinct categories of malicious activity that exist, and the unique risks that come with adopting vendor-built AI. The discussion equips auditors to ask sharp security questions of any internal team or external supplier proposing an AI tool.
What this episode covers
- The three AI attack surfaces — development-time, runtime security, and threats through use.
- Training data leakage and exfiltration — how weak supply chain controls let sensitive data walk out of the workspace.
- Data poisoning and RAG risks — corrupting training datasets and the documents AI systems read at run time.
- Model poisoning — tampering with parameters, architecture, libraries, or pre-trained vendor models.
- Model theft — direct file theft and reverse engineering by querying the model thousands of times.
- Prompt injection, model evasion, and model inversion — the main threats that arise during normal user interaction.
- Vendor risk and AI solution disruption — inherited supplier vulnerabilities and denial-of-service attacks on AI services.
Watch the full episode above for the worked examples and detailed explanations of each concept.
Frequently Asked Questions
What are the three attack surfaces of an AI system?
An attack surface is any digital location where an unauthorized person might try to break in or cause damage. The three surfaces are development-time threats that occur while the system is being built, runtime security threats that target the conventional servers and infrastructure hosting the model, and threats through use that happen during routine daily operations involving what users type in and what the system answers back.
What is the difference between data poisoning and model poisoning?
Data poisoning means deliberately injecting malicious or corrupted information into the training dataset to manipulate how the system behaves, which can also affect retrieval augmented generation when an attacker sneaks false information into documents the AI reads. Model poisoning is different because the attacker directly tampers with the mathematical parameters, core architecture, or software libraries of the model itself, which can also happen if you buy a pre-trained system that was tampered with before delivery.
How can an AI model be stolen without breaking into the server?
Beyond directly downloading the model files, an attacker can send thousands of carefully designed questions to the system and analyze the exact answers it gives to reverse engineer the mathematical logic and build an exact digital clone. It is like a rival chef tasting your signature dish every day until they figure out your secret recipe without ever stepping into your kitchen.
What is a prompt injection attack?
A prompt injection targets generative systems with specifically crafted text commands that manipulate the system into ignoring its original safety rules, like hypnotizing a loyal guard into opening a vault. An indirect prompt injection happens when the system reads a seemingly normal file containing hidden malicious instructions. Developers defend with structural templates that sanitize inputs and filters that monitor and block inappropriate outputs.
📚 Master the ISACA AAIA Exam!
Ready to test your knowledge? Access chapter-specific Multiple Choice Questions (MCQs) and full-length practice exams for the ISACA AAIA certification at RooCloud.com. Solve the chapter-wise questions to reinforce this lesson before moving to the next episode.
Reference: This article is based on concepts discussed in AI Threats: Data Poisoning, Prompt Injection & Model Theft.