A curated reference for securing generative AI systems.

| OWASP ID | Vulnerability | Plain English |
|---|---|---|
| LLM01 | Prompt Injection | Attacker hijacks model via crafted input |
| LLM02 | Insecure Output Handling | Unvalidated model output passed downstream enables XSS, SQLi |
| LLM03 | Training Data Poisoning | Malicious training data alters behavior |
| LLM04 | Model Denial of Service | Expensive inputs exhaust resources |
| LLM05 | Supply Chain | Compromised base models or plugins |
| LLM06 | Sensitive Info Disclosure | Model leaks PII or training data |
| LLM07 | Insecure Plugin Design | Over-permissioned plugins exploited |
| LLM08 | Excessive Agency | Agent has more permissions, tools, or autonomy than its task needs |
| LLM09 | Overreliance | Users trust outputs without verification |
| LLM10 | Model Theft | Model extracted via repeated queries |
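
As one illustration of the table above, LLM02 comes down to treating model output like any other untrusted input. The sketch below is a minimal example using only the Python standard library; the function names, the `orders` table, and its columns are hypothetical and chosen purely for illustration.

```python
import html
import sqlite3


def render_reply(model_output: str) -> str:
    """Escape model output before placing it in an HTML page."""
    return f"<p>{html.escape(model_output)}</p>"


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list:
    """Look up orders with a parameterized query.

    The customer name may come from model output, so it is passed as a
    bound parameter and never spliced into the SQL string.
    """
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer = ?", (customer_name,)
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Hypothetical in-memory table used only for this demo.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, customer TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 'widget', 'alice')")

    print(render_reply("<script>alert(1)</script>"))
    print(find_orders(conn, "alice' OR '1'='1"))
```

With escaping, the script tag is rendered as text rather than executed, and the injection-style name simply matches no rows because it is compared as a literal value.
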

| Guardrail Tool | What It Does |
|---|---|
| LLM Guard | Scans prompts/responses for threats |
| NeMo Guardrails | Defines allowed conversation flows |
| Guardrails AI | Validates LLM output schema |
| Rebuff | Detects prompt injection |
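
To make "validates LLM output schema" concrete, here is a minimal standard-library sketch of the idea. It is not the Guardrails AI API or that of any tool in the table; `EXPECTED_FIELDS` and `validate_output` are hypothetical names, and the two-field schema is an assumption made for illustration.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
EXPECTED_FIELDS = {"intent": str, "confidence": float}


def validate_output(raw: str) -> dict:
    """Parse model output as JSON and check it against EXPECTED_FIELDS.

    Raises ValueError when the output is not a JSON object, has missing or
    unexpected keys, or has a field of the wrong type.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(EXPECTED_FIELDS):
        raise ValueError(f"unexpected keys: {sorted(data)}")
    for name, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(data[name], expected_type):
            raise ValueError(f"{name} must be {expected_type.__name__}")
    return data
```

Rejecting anything outside the expected shape, rather than trying to repair it, keeps malformed or manipulated output from reaching downstream code.
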

| Testing Tool | Purpose |
|---|---|
| Garak | Automated LLM vulnerability scanner |
| PyRIT | Microsoft AI red-teaming framework |
| ART | Adversarial Robustness Toolbox (originated at IBM): attacks and defenses for ML models |
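
Samsung (2023): Engineers pasted proprietary code into ChatGPT. The code left Samsung's control, and under the consumer terms at the time it could be retained and used for training. Lesson: Never paste confidential data into public AI services.

Chevrolet Chatbot (2023): A user manipulated a dealership's bot into agreeing to sell a 2024 Chevy Tahoe for $1. Lesson: Enforce business constraints in code outside the model; prompt instructions alone are not a control.

Air Canada (2024): The airline's chatbot gave a customer incorrect information about bereavement fare refunds. A British Columbia tribunal held the airline responsible for the chatbot's statements and ordered compensation. Lesson: Add human escalation for policy questions.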
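
The Chevrolet and Air Canada lessons can be sketched as a small check that runs on every drafted reply before it reaches the user. This is a minimal illustration, not a production design: the price floor, the keyword list, and the names `Decision` and `review_reply` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values for illustration only.
MIN_PRICE_USD = 30_000.0
POLICY_KEYWORDS = ("refund", "bereavement", "warranty", "cancel")


@dataclass
class Decision:
    action: str  # "send", "block", or "escalate"
    reason: str


def review_reply(user_message: str, proposed_price: Optional[float]) -> Decision:
    """Decide what to do with a drafted bot reply before it reaches the user."""
    text = user_message.lower()
    # Policy questions go to a human instead of being answered by the model.
    if any(word in text for word in POLICY_KEYWORDS):
        return Decision("escalate", "policy question needs a human answer")
    # The price floor lives in code, so no prompt can talk the bot past it.
    if proposed_price is not None and proposed_price < MIN_PRICE_USD:
        return Decision("block", "price below configured floor")
    return Decision("send", "no rule triggered")
```

Keyword matching is a deliberately crude stand-in for intent detection. What matters is that both rules run outside the model, so they hold regardless of what the conversation contains.
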