dataset_configuration, redteam_test_configurations, and endpoint_configuration.
Attack Methods
Comprehensive guide to all attack methods
Configuration
Detailed configuration reference
Examples
Ready-to-use payload examples
What’s New in V3
V3 introduces significant enhancements to attack methods configuration:
- Granular Parameter Control: Each attack method now supports specific parameters for fine-tuned testing
- Structured Attack Hierarchy: Clear organization of basic, static, and dynamic attack methods
- Enhanced Attack Methods: Expanded suite of encoding, obfuscation, and multi-modal attack techniques
Migrating from V2? Click here for migration guide
V2 to V3 Migration
Key changes from the V2 (legacy) format:
- Attack methods are now objects with params instead of arrays
- Each attack method requires a params object (can be empty: {})
- Use specific encoding keywords (e.g., base64_encoding instead of the generic encoding)
- Configure parameters for iterative attacks: width, branching_factor, depth
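The migration rules can be sketched as a small conversion helper. This is a minimal illustration, not the official migration tool. It assumes a V2 configuration stored each category as a plain list of keywords, and that the generic encoding keyword should become base64_encoding. Both points are assumptions, and the function name is hypothetical.

```python
from typing import Any

# Assumption: the generic V2 keyword "encoding" maps to one specific V3 keyword.
# base64_encoding is chosen only because the docs cite it as an example.
V2_TO_V3_KEYWORDS = {"encoding": "base64_encoding"}


def migrate_attack_methods(v2_methods: dict[str, list[str]]) -> dict[str, dict[str, Any]]:
    """Turn assumed V2 keyword arrays into V3 objects that each carry params."""
    v3_methods: dict[str, dict[str, Any]] = {}
    for category, keywords in v2_methods.items():
        v3_methods[category] = {
            # Every V3 method needs a params object, even if it is empty.
            V2_TO_V3_KEYWORDS.get(keyword, keyword): {"params": {}}
            for keyword in keywords
        }
    return v3_methods


# Hypothetical V2-style input (the real V2 shape may differ).
v2 = {"basic": ["basic"], "static": ["encoding", "obfuscation"]}
v3 = migrate_attack_methods(v2)
```

Parameters for iterative attacks (width, branching_factor, depth) still need to be filled in by hand after conversion, since an array-based format would have had no place to store them.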
Quick Reference: Attack Methods
Choose attack methods based on your model type and security testing needs. This table provides an overview of all available attack methods.
By Model Type
- LLM (Text)
- VLM (Vision)
- ALM (Audio)
| Attack Method | Keyword | Category | Use When | Parameters | Complexity |
|---|---|---|---|---|---|
| Raw Prompts | basic | Basic | Always (baseline) | None | ⭐ Low |
| Obfuscation | obfuscation | Static | Standard testing | None | ⭐⭐ Medium |
| Base64 | base64_encoding | Static | Encoding evasion | iterations (1-3) | ⭐⭐ Medium |
| Hexadecimal | hex_encoding | Static | Technical filters | None | ⭐⭐ Medium |
| ASCII | ascii_encoding | Static | Text filters | None | ⭐⭐ Medium |
| Binary | binary_encoding | Static | Advanced encoding | None | ⭐⭐ Medium |
| URL Encoding | url_encoding | Static | Web applications | None | ⭐ Low |
| Leet Speak | leet_encoding | Static | Character matching | None | ⭐ Low |
| ROT13 | rot13_encoding | Static | Cipher testing | None | ⭐ Low |
| ROT21 | rot21_encoding | Static | Cipher testing | None | ⭐ Low |
| Morse Code | morse_encoding | Static | Unique encoding | None | ⭐⭐ Medium |
| French | lang_fr | Static | Multilingual bypass | None | ⭐⭐ Medium |
| Italian | lang_it | Static | Multilingual bypass | None | ⭐⭐ Medium |
| Hindi | lang_hi | Static | Non-Latin scripts | None | ⭐⭐ Medium |
| Spanish | lang_es | Static | Multilingual bypass | None | ⭐⭐ Medium |
| Japanese | lang_ja | Static | Asian languages | None | ⭐⭐ Medium |
| EAI Attack | eai_attack | Static | Advanced jailbreak | None | ⭐⭐⭐ High |
| Deep Inception | deep_inception | Static | Nested injection | None | ⭐⭐⭐ High |
| Iterative | iterative | Dynamic | Adaptive attacks | width, branching_factor, depth | ⭐⭐⭐ High |
| Multi-Turn | multi_turn | Dynamic | Conversation exploit | None | ⭐⭐⭐ High |
Quick Lookup: All Keywords
Basic & Obfuscation (3 methods)
- basic - LLM, VLM, ALM
- obfuscation - LLM, VLM
- eai_attack - LLM, VLM
Encoding (9 methods)
- ascii_encoding - LLM
- base64_encoding - LLM
- binary_encoding - LLM
- hex_encoding - LLM
- url_encoding - LLM
- leet_encoding - LLM
- morse_encoding - LLM
- rot13_encoding - LLM
- rot21_encoding - LLM
Multilingual (5 methods)
- lang_fr - French (LLM)
- lang_it - Italian (LLM)
- lang_hi - Hindi (LLM)
- lang_es - Spanish (LLM)
- lang_ja - Japanese (LLM)
Dynamic (2 methods)
- iterative - LLM
- multi_turn - LLM
Visual (4 methods)
- masking - VLM
- figstep - VLM
- hades 🔒 - VLM
- jood 🔒 - VLM
Audio (6 methods)
- waveform - ALM
- echo - ALM
- speed - ALM
- pitch - ALM
- reverb - ALM
- noise - ALM
Advanced (1 method)
deep_inception- LLM
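The keyword lists above can be turned into a lookup table for checking a configuration before it is submitted. The sketch below copies the keyword-to-model-type pairs from this page. The helper names are hypothetical and not part of any SDK.

```python
# Keyword -> supported model types, transcribed from the Quick Lookup lists.
SUPPORTED_MODELS: dict[str, set[str]] = {
    "basic": {"LLM", "VLM", "ALM"},
    "obfuscation": {"LLM", "VLM"},
    "eai_attack": {"LLM", "VLM"},
}

LLM_ONLY = [
    "ascii_encoding", "base64_encoding", "binary_encoding", "hex_encoding",
    "url_encoding", "leet_encoding", "morse_encoding", "rot13_encoding",
    "rot21_encoding",
    "lang_fr", "lang_it", "lang_hi", "lang_es", "lang_ja",
    "iterative", "multi_turn", "deep_inception",
]
VLM_ONLY = ["masking", "figstep", "hades", "jood"]
ALM_ONLY = ["waveform", "echo", "speed", "pitch", "reverb", "noise"]

for model_type, keywords in (("LLM", LLM_ONLY), ("VLM", VLM_ONLY), ("ALM", ALM_ONLY)):
    for keyword in keywords:
        SUPPORTED_MODELS[keyword] = {model_type}


def methods_for(model_type: str) -> list[str]:
    """Return every attack keyword listed for the given model type, sorted."""
    return sorted(k for k, models in SUPPORTED_MODELS.items() if model_type in models)


def unsupported(keywords: list[str], model_type: str) -> list[str]:
    """Return the keywords that are unknown or not listed for the model type."""
    return [k for k in keywords if model_type not in SUPPORTED_MODELS.get(k, set())]
```

For example, `unsupported(["basic", "figstep"], "LLM")` flags figstep, which is listed for VLM only.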
Payload Structure
High-Level Overview
For complete field descriptions, see Configuration Reference.
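As a rough orientation, the payload can be pictured as a dictionary with the three top-level sections named on this page. The sketch below is an assumption-heavy illustration, not the authoritative schema. In particular, the nesting inside redteam_test_configurations (test keyword, then sample_percentage and attack_methods grouped by basic, static, and dynamic) and all parameter values are guesses for illustration. The other two sections are left empty because their fields are not shown here.

```python
import json

payload = {
    # Fields for these two sections are not described in this section,
    # so they are intentionally left empty in this sketch.
    "dataset_configuration": {},
    "endpoint_configuration": {},
    # Assumed layout: one entry per test keyword.
    "redteam_test_configurations": {
        "harmful_test": {
            "sample_percentage": 2,  # low value, in line with the dev guidance
            "attack_methods": {
                # Every method carries a params object, possibly empty.
                "basic": {"basic": {"params": {}}},
                "static": {"base64_encoding": {"params": {"iterations": 2}}},
                "dynamic": {
                    "iterative": {
                        "params": {"width": 5, "branching_factor": 9, "depth": 3}
                    }
                },
            },
        }
    },
}

# Serialize to JSON, as one would before sending it in a request body.
print(json.dumps(payload, indent=2))
```

Check the Configuration Reference for the real field names before relying on this shape.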
Quick Start Examples
Starter: Basic LLM Test
Standard: Multi-Test Assessment
For more examples including VLM, ALM, agents, and industry-specific use cases, see the Examples page.
Available Test Types
Standard Tests (12 tests)
Tests available for all model types:
| Test | Keyword | Purpose |
|---|---|---|
| Bias | bias_test | Identifies biased outputs |
| CBRN | cbrn_test | Chemical, biological, radiological, nuclear |
| CSEM | csem_test | Child abuse exploitation |
| Harmful | harmful_test | Harm or danger promotion |
| Insecure Code | insecure_code_test | Vulnerable code generation |
| Toxicity | toxicity_test | Offensive content |
| PII | pii_test | Personal information leakage |
| Copyright | copyright_test | Copyrighted material |
| Misinformation | misinformation_test | False information |
| System Prompt | system_prompt_extractions_test | Prompt extraction |
| Sponge | sponge_test | Resource exhaustion |
| Competitor | competitor_test | Competitor information |
Custom & Agentic Tests
Custom Test
custom_test - Test with your own generated dataset
Agentic Tests (8 tests)
For AI agents with tool use and autonomous capabilities:
- governance_test - Alignment, goal misalignment, and policy drift
- agent_output_quality_test - Output quality, hallucinations, bias, and toxicity
- tool_misuse_test - API integration, supply chain, and resource consumption
- privacy_test - Sensitive data exposure and exfiltration channels
- reliability_and_observability_test - Data poisoning, concept drift, and opaque reasoning
- agent_behaviour_test - Human manipulation and unsafe actuation
- access_control_and_permissions_test - Credential theft, privilege escalation, confused deputy
- tool_extraction_test - Tool information extraction
Specialized Tests (6 tests)
For generated adversarial datasets:
- adv_bias_test - Adversarial bias detection
- adv_info_test - Sensitive information extraction
- adv_persona_test - Persona manipulation
- adv_command_test - Command injection
- adv_pii_test - Advanced PII extraction
- adv_competitor_test - Competitor information
Best Practices
Start Simple
Begin with basic attacks at a low sample percentage (2-5%) to establish a baseline.
Progressive Testing
Add static methods, then dynamic attacks as you identify vulnerabilities.
Match Your Model
Use appropriate modalities: ["text"] for LLM, ["text", "image"] for VLM, ["text", "audio"] for ALM.
Multiple Tests
Run multiple test types (harmful, bias, PII) for comprehensive coverage.
Sample Wisely
Dev: 2-5% | Staging: 10-20% | Production: 50-100%
Consider Cost
Dynamic attacks are thorough but resource-intensive. Start with static methods.
Configuration Guidelines
Sample Percentage by Stage
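The per-stage ranges recommended under Best Practices (Dev 2-5%, Staging 10-20%, Production 50-100%) can be encoded as a simple check. This is a hypothetical helper for local validation. The stage names and function name are illustrative assumptions, not fields of the API.

```python
# Recommended sample_percentage ranges per stage, taken from Best Practices.
SAMPLE_PERCENTAGE_RANGES: dict[str, tuple[int, int]] = {
    "dev": (2, 5),
    "staging": (10, 20),
    "production": (50, 100),
}


def is_recommended_sample(stage: str, sample_percentage: float) -> bool:
    """Return True if the value falls inside the recommended range for the stage."""
    low, high = SAMPLE_PERCENTAGE_RANGES[stage]
    return low <= sample_percentage <= high
```

A value outside the range is not necessarily invalid. The ranges are guidance for balancing coverage against cost.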
Attack Method Combinations
- Minimal (Dev)
- Standard (Staging)
- Comprehensive (Prod)
Common Parameters
- width: Number of parallel attack paths (1-10)
- branching_factor: Variations per iteration (1-15)
- depth: Maximum iteration depth (1-5)
- iterations: Encoding iterations (1-3). Higher values add more obfuscation but reduce the model's comprehension of the prompt
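A range check along these lines can catch out-of-bounds values before a test run. The sketch assumes the four ranges above correspond, in order, to width, branching_factor, depth (the iterative attack parameters), and iterations (the Base64 parameter). The validator itself is hypothetical and not part of the API.

```python
# Allowed ranges, assuming the order width, branching_factor, depth, iterations.
PARAM_RANGES: dict[str, tuple[int, int]] = {
    "width": (1, 10),
    "branching_factor": (1, 15),
    "depth": (1, 5),
    "iterations": (1, 3),
}


def validate_params(params: dict[str, int]) -> list[str]:
    """Return a list of human-readable problems; an empty list means all values fit."""
    errors: list[str] = []
    for name, value in params.items():
        if name not in PARAM_RANGES:
            errors.append(f"unknown parameter: {name}")
            continue
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside {low}-{high}")
    return errors
```

For example, `validate_params({"iterations": 4})` reports that the value is outside 1-3, while an empty params object passes without errors.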
Security & Usage Notes
Security:
- Never commit API keys to version control
- Use environment variables for credentials
- Rotate keys regularly
- Separate test and production keys
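One common way to follow the first two points is to read the key from the environment at runtime instead of hard-coding it. The variable name REDTEAM_API_KEY below is a placeholder assumption. Use whatever name your deployment defines.

```python
import os


def load_api_key(var_name: str = "REDTEAM_API_KEY") -> str:
    """Read the API key from an environment variable and fail loudly if it is missing."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running tests")
    return key


# Usage (hypothetical variable names): keep separate variables for test and
# production keys so the two are never mixed up.
# api_key = load_api_key("REDTEAM_API_KEY_TEST")
```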
Usage:
- Verify endpoints and keys are correct
- Adjust sample_percentage based on dataset size
- Choose attack methods appropriate for your model type
- Use in accordance with provider terms of service
Next Steps
Attack Methods
Explore detailed attack method documentation with parameters and use cases
Configuration
Complete reference for all configuration fields and options
Examples
Browse ready-to-use examples for different scenarios and providers
API Reference
View the complete API specification and endpoints
Additional Resources
- Quickstart Guide - Get started with your first test
- GET Defaults API - Supported providers and models
- Red Team Introduction - Overview of red teaming concepts
This payload structure facilitates in-depth testing across various model types, allowing for comprehensive assessments of behavior and security with fine-grained control over attack parameters.

