- Foundation Model Test Catalog - Tests for foundation models and LLM-based applications
- Agent Test Catalog - Tests for AI agents with tool use and autonomous capabilities
Attack Methods by Model Type
1. Large Language Models (LLMs) & AI Agents
Input: Text | Output: Text
1.1 Direct Prompt Injection
basic - Raw Prompts (Baseline)
Description: Direct injection of adversarial prompts without any obfuscation or encoding. This is the baseline attack that every test should include.
When to Use:
- Always include as your first attack method
- Establishes baseline vulnerability assessment
- Quick testing during development
iterative - Iterative Attacks (Dynamic)
Description: Progressive prompt refinement based on model responses. The attack adapts iteratively, learning from each response to craft increasingly effective prompts.
When to Use:
- Testing models with strong initial defenses
- Comprehensive security assessments
- Understanding the model’s resistance to adaptive attacks
Parameters:
- width (integer): Number of parallel attack paths to explore (default: 5, range: 1-10)
- branching_factor (integer): Number of variations per iteration (default: 9, range: 1-15)
- depth (integer): Maximum iteration depth (default: 3, range: 1-5)
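The documented defaults and ranges for width, branching_factor, and depth can be checked before a payload is submitted. The sketch below is only an illustration: the `SearchParams` class is a hypothetical helper, not part of any documented SDK, and only the field names, defaults, and ranges come from the list above.

```python
from dataclasses import dataclass

# Ranges copied from the parameter list above; the class itself is hypothetical.
_RANGES = {"width": (1, 10), "branching_factor": (1, 15), "depth": (1, 5)}


@dataclass
class SearchParams:
    width: int = 5
    branching_factor: int = 9
    depth: int = 3

    def __post_init__(self):
        # Reject any value outside its documented range.
        for name, (low, high) in _RANGES.items():
            value = getattr(self, name)
            if not isinstance(value, int) or not low <= value <= high:
                raise ValueError(
                    f"{name}={value!r} is outside the documented range {low}-{high}"
                )
```

Calling `SearchParams()` yields the documented defaults, while `SearchParams(depth=6)` raises `ValueError`.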
multi_turn - Multi-Turn Attacks (Dynamic)
Description: Distributes malicious intent across multiple conversation turns, exploiting the model’s conversation memory to build up to harmful outputs.
When to Use:
- Testing conversational AI and chatbots
- Assessing context window vulnerabilities
- Evaluating memory-based attack resistance
rt_agent - Red Team Agent (Dynamic)
Description: Advanced depth-first search (DFS) based red teaming agent that uses adaptive prompt generation with mutation levels. The agent progressively refines prompts through multiple levels of mutation, learning from blocked attempts to craft increasingly effective adversarial prompts.
When to Use:
- Comprehensive security assessments
- Testing models with strong initial defenses
- Understanding adaptive attack resistance
- Advanced red teaming scenarios
Parameters:
- width (integer): Number of parallel attack paths to explore (default: 5, range: 1-10)
- branching_factor (integer): Number of variations per iteration (default: 9, range: 1-15)
- depth (integer): Maximum iteration depth (default: 3, range: 1-5)
eai_attack - EAI Attack (Static)
Description: Exploit-Amplify-Iterate methodology for systematic jailbreaking using graph-based encoding techniques.
When to Use:
- Advanced security testing
- Research-grade assessments
- Testing sophisticated defense mechanisms
1.2 Encoding & Obfuscation Techniques
ascii_encoding - ASCII Encoding
Description: Converts characters to ASCII decimal values to evade content filters that pattern-match text.
Example: “hello” → “104 101 108 108 111”
When to Use:
- Bypassing simple text-based filters
- Testing encoding awareness
- Combined with other techniques
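The transformation itself is a per-character code-point lookup. A minimal sketch, assuming space-separated decimal values as in the example above (the function names are illustrative):

```python
def ascii_encode(text: str) -> str:
    """Return the decimal code of each character, separated by spaces."""
    return " ".join(str(ord(ch)) for ch in text)


def ascii_decode(encoded: str) -> str:
    """Invert ascii_encode."""
    return "".join(chr(int(token)) for token in encoded.split())
```

`ascii_encode("hello")` reproduces the documented example, and `ascii_decode` restores the original text.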
base64_encoding - Base64 Encoding
Description: Base64-encoded prompts to bypass pattern matching and content filters.
Example: “attack” → “YXR0YWNr”
When to Use:
- Testing encoding robustness
- Multi-iteration encoding for advanced evasion
- Common bypass technique
Parameters:
- encoding_type (string): Type of encoding (default: “base64”)
- iterations (integer): Number of encoding iterations (default: 1, range: 1-3)
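One plausible reading of the iterations parameter is that the encoder is applied repeatedly to its own output. The sketch below assumes that interpretation; the function name is illustrative, and only the 1-3 range comes from the parameter list above.

```python
import base64


def base64_encode(text: str, iterations: int = 1) -> str:
    """Base64-encode text, re-encoding the result `iterations` times."""
    if not 1 <= iterations <= 3:
        raise ValueError("iterations must be in the documented range 1-3")
    data = text.encode("utf-8")
    for _ in range(iterations):
        data = base64.b64encode(data)
    return data.decode("ascii")
```

With the default single iteration, `base64_encode("attack")` matches the documented example. Each extra iteration must be undone by one matching decode pass.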
binary_encoding - Binary Encoding
Description: Represents text as binary (base-2) to obscure malicious instructions.
Example: “hi” → “01101000 01101001”
When to Use:
- Testing technical encoding understanding
- Advanced obfuscation scenarios
- Combined attack vectors
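A minimal sketch of the binary representation, assuming UTF-8 bytes rendered as space-separated 8-bit groups as in the example above (the function names are illustrative):

```python
def binary_encode(text: str) -> str:
    """Render each UTF-8 byte as an 8-bit group, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def binary_decode(encoded: str) -> str:
    """Invert binary_encode."""
    return bytes(int(token, 2) for token in encoded.split()).decode("utf-8")
```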
hex_encoding - Hexadecimal Encoding
Description: Hex-encoded prompts for filter evasion using base-16 representation.
Example: “test” → “74657374”
When to Use:
- Technical content filters
- Programming-focused models
- Combined with other encodings
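Python's built-in `bytes.hex` and `bytes.fromhex` cover this transformation directly. A minimal sketch, assuming UTF-8 input and lowercase output with no separators, as in the example above:

```python
def hex_encode(text: str) -> str:
    """Return the lowercase hex digits of the UTF-8 bytes of text."""
    return text.encode("utf-8").hex()


def hex_decode(encoded: str) -> str:
    """Invert hex_encode."""
    return bytes.fromhex(encoded).decode("utf-8")
```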
url_encoding - URL Encoding
Description: Percent-encoded characters to obscure intent using URL encoding standards.
Example: “hack me” → “hack%20me”
When to Use:
- Web-based applications
- URL/link processing models
- Combined obfuscation
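The standard library's `urllib.parse.quote` produces the percent-encoding shown in the example. This is a sketch of the transformation only; which characters the actual attack method escapes is not stated in this page.

```python
from urllib.parse import quote, unquote


def url_encode(text: str) -> str:
    """Percent-encode reserved characters (a space becomes %20)."""
    return quote(text)


def url_decode(encoded: str) -> str:
    """Invert url_encode."""
    return unquote(encoded)
```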
obfuscation - General Obfuscation
Description: General obfuscation techniques including character substitution, spacing manipulation, and other text transformations.
When to Use:
- First-line static attack testing
- Complement to basic attacks
- Standard security assessment
1.3 Cipher & Character Substitution
leet_encoding - Leet Speak
Description: Alphanumeric character substitution popular in internet culture.
Example: “hack” → “h4ck”, “elite” → “31337”
When to Use:
- Testing character-level pattern matching
- Social engineering contexts
- Combined with other methods
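A per-character substitution table is enough to reproduce the “hack” → “h4ck” example. The mapping below is an assumed, commonly used subset, not the catalog's actual table. Note that “31337” is a conventional spelling of “eleet” and is not what a per-character map produces for “elite”.

```python
# Assumed substitution table; real leet dialects vary widely.
_LEET_TABLE = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "t": "7"})


def leet_encode(text: str) -> str:
    """Replace selected lowercase letters with look-alike digits."""
    return text.translate(_LEET_TABLE)
```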
rot13_encoding - ROT13 Cipher
Description: Caesar cipher with 13-position character rotation (A↔N, B↔O, etc.).
Example: “hello” → “uryyb”
When to Use:
- Testing cipher understanding
- Classic obfuscation technique
- Educational/demonstration purposes
rot21_encoding - ROT21 Cipher
Description: Caesar cipher with 21-position character rotation.
Example: “test” → “ozno”
When to Use:
- Alternative to ROT13
- Testing cipher detection range
- Comprehensive cipher testing
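ROT13 and ROT21 are the same Caesar rotation with different shifts, so one helper can cover both. A minimal sketch (the function name is illustrative); non-letters pass through unchanged:

```python
import string


def rotate(text: str, shift: int) -> str:
    """Rotate ASCII letters by `shift` positions, preserving case."""
    shift %= 26
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)
```

ROT13 is its own inverse because 13 + 13 = 26. ROT21 is undone by a further rotation of 5.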
morse_encoding - Morse Code
Description: Represents text using Morse code dots and dashes.
Example: “SOS” → “... --- ...”
When to Use:
- Unique encoding tests
- Historical/educational contexts
- Comprehensive encoding coverage
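A minimal sketch of the encoding using the International Morse letter table. The separators are an assumption: letters are joined with a space and words with " / ". Digits and punctuation are skipped for brevity.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def morse_encode(text: str) -> str:
    """Encode letters as Morse; characters outside A-Z are dropped."""
    words = text.upper().split()
    return " / ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )
```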
1.4 Multilingual Attacks
lang_fr - French
Description: Prompt translation to French for filter bypass. Many content filters are optimized for English.
When to Use:
- International models
- Testing language-specific defenses
- Comprehensive multilingual testing
lang_it - Italian
Description: Italian-language prompt injection to bypass English-focused filters.
lang_hi - Hindi
Description: Hindi-language adversarial prompts, useful for testing non-Latin script handling.
lang_es - Spanish
Description: Spanish-language attack vectors for Romance language testing.
lang_ja - Japanese
Description: Japanese-language jailbreak attempts, testing Asian language defenses.
1.5 Advanced Techniques
deep_inception - Deep Inception
Description: Nested multi-layer prompt injection using recursive context framing. Creates layered scenarios that progressively lead the model toward prohibited outputs.
When to Use:
- Advanced security research
- Testing sophisticated safety measures
- Comprehensive vulnerability assessment
2. Vision-Language Models (VLMs)
Input: Text + Image | Output: Text
2.1 Visual Manipulation Attacks
basic - Raw Prompts for VLM
Description: Direct adversarial prompts with unmodified images as baseline for VLM testing.
When to Use:
- Always include as baseline for VLM testing
- Quick visual content assessment
- Establishing VLM vulnerability baseline
masking - Image Masking
Description: Strategic occlusion or masking of image regions to manipulate context and bypass visual content filters.
When to Use:
- Testing visual content moderation
- Occlusion-based attacks
- Combined visual-text attacks
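At its simplest, occluding a region means overwriting a rectangle of pixels with a fill value. The sketch below is an assumption about the general idea, not the catalog's implementation: it models a grayscale image as a list of rows, and `mask_region` is an illustrative name.

```python
def mask_region(image, top, left, height, width, fill=0):
    """Return a copy of image with the given rectangle set to `fill`.

    The rectangle is clipped to the image bounds; the input is not modified.
    """
    masked = [row[:] for row in image]
    for r in range(max(top, 0), min(top + height, len(masked))):
        for c in range(max(left, 0), min(left + width, len(masked[r]))):
            masked[r][c] = fill
    return masked
```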
figstep - FigStep Attack
Description: Figure-based step-wise adversarial attack technique that converts harmful text into typographic images, bypassing text-based safety filters. The attack uses a three-step pipeline: paraphrase (convert prohibited questions into list-format statements), typography (transform text into typographic images), and incitement (add neutral prompts to encourage response).
When to Use:
- Advanced VLM security testing
- Multi-step visual attacks
- Testing typographic image bypass techniques
camo - CAMO (Cross-Modal Obfuscation)
Description: Cross-modal obfuscation attack that decouples harmful text across text and image modalities. Both components appear benign independently but combine to form malicious instructions when processed together by a VLM. Supports multiple variants including basic, steganographic, multi-image, and contextual obfuscation.
When to Use:
- Advanced VLM security testing
- Cross-modal attack assessment
- Testing modality fusion vulnerabilities
fc - FC-Attack (Flowchart Attack)
Description: Jailbreak attack using auto-generated flowcharts. The attack uses a two-stage process: first, an LLM generates harmful steps from the query, then a flowchart image is created from these steps. The VLM analyzes the flowchart and provides detailed harmful guidance, bypassing safety filters through visual representation.
When to Use:
- Advanced VLM security testing
- Flowchart-based attack assessment
- Multi-step visual attack scenarios
2.2 Coming Soon
hades - HADES (Coming Soon)
Description: Advanced visual jailbreak methodology using sophisticated image perturbation techniques. This attack leverages image-based prompts to bypass text-based safety filters.
When to Use:
- Advanced VLM security research
- Image-based jailbreak testing
- Requires special access
jood - JOOD (Coming Soon)
Description: Joint Out-of-Distribution attack leveraging distributional shifts in both visual and textual modalities to bypass safety mechanisms.
When to Use:
- Research and academic testing
- OOD robustness evaluation
- Requires special access
3. Audio-Language Models (ALMs)
Input: Text + Audio | Output: Text
3.1 Audio-Based Attacks
basic - Raw Prompts for ALM
Description: Direct adversarial prompts with audio input as baseline for ALM testing.
When to Use:
- Always include as baseline for ALM testing
- Quick audio content assessment
- Establishing ALM vulnerability baseline
waveform - Waveform Manipulation
Description: Audio waveform modification to bypass safety guardrails through signal processing techniques.
When to Use:
- Testing audio content moderation
- Signal-level attack testing
- Comprehensive ALM security
echo - Echo Effect
Description: Echo-based audio manipulation to obscure or modify harmful audio content.
When to Use:
- Audio obfuscation testing
- Environmental effect bypass
- Combined audio attacks
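A single echo can be modelled as the signal plus a delayed, attenuated copy of itself: y[n] = x[n] + decay * x[n - delay]. The sketch below assumes that simple feed-forward form and works on a plain list of samples. The function name and parameters are illustrative, not documented options.

```python
def add_echo(samples, delay, decay=0.5):
    """Mix in one delayed copy of the signal.

    delay is measured in samples; decay scales the echoed copy.
    """
    if delay <= 0:
        raise ValueError("delay must be a positive number of samples")
    out = list(samples)
    for n in range(delay, len(samples)):
        out[n] += decay * samples[n - delay]
    return out
```

Feeding in a unit impulse makes the effect visible: the output contains the impulse followed by a half-amplitude copy `delay` samples later.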
speed - Speed Alteration
Description: Audio speed modification techniques (faster/slower playback) to bypass detection.
When to Use:
- Temporal manipulation testing
- Rate-based evasion
- Audio processing robustness
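The simplest form of speed change is resampling: reading the original samples at a different step size. The sketch below assumes linear interpolation over a plain list of samples. Naive resampling also shifts pitch, and the catalog's actual method may differ.

```python
def change_speed(samples, factor):
    """Resample so playback is `factor` times faster (factor < 1 slows down)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional read position
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)    # clamp at the final sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```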
pitch - Pitch Shifting
Description: Pitch modification for audio obfuscation while maintaining intelligibility.
When to Use:
- Frequency-based evasion
- Voice transformation testing
- Audio filter bypass
reverb - Reverb Effect
Description: Reverb-based audio manipulation to obscure content through spatial effects.
When to Use:
- Environmental audio testing
- Spatial effect evasion
- Combined audio techniques
noise - Noise Injection
Description: Background noise injection techniques to obscure harmful content while maintaining comprehension.
When to Use:
- Noise robustness testing
- SNR-based evasion
- Real-world audio scenarios
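Noise level is usually specified as a signal-to-noise ratio (SNR) in decibels, where noise power = signal power / 10^(SNR/10). The sketch below assumes additive white Gaussian noise; the function name, the seed argument, and the choice of noise type are illustrative, not documented behaviour.

```python
import math
import random


def add_noise(samples, snr_db, seed=0):
    """Add white Gaussian noise scaled to the requested SNR in dB."""
    if not samples:
        return []
    rng = random.Random(seed)  # seeded so runs are reproducible
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]
```

Lower SNR values bury the signal in more noise. An SNR of 0 dB means the noise carries as much power as the signal.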
Attack Method Combinations
Recommended Combinations
Starter Combination (Quick Testing)
Standard Combination (Balanced)
Advanced Combination (Comprehensive)
Best Practices
Start Simple
Start with basic attacks to establish a baseline before adding complexity.
Progressive Testing
Match Your Model
Consider Cost
Related Pages
- Payload Guide - Overview and quick reference
- Test Catalogs - Comprehensive test catalogs for foundation models and agents
- Configuration Reference - Detailed configuration options
- Examples - Complete payload examples

