Enkrypt AI Red Teaming

What is AI Red Teaming?

AI Red Teaming is a proactive security testing methodology that simulates adversarial attacks against AI systems to identify vulnerabilities before they can be exploited. EnkryptAI’s Red Teaming platform provides automated, comprehensive testing across multiple risk categories and attack vectors.

Key Features

Multi-Modal Testing

Test LLMs, Vision-Language Models (VLMs), and Audio-Language Models (ALMs) with specialized attack methods for each modality.

30+ Attack Methods

From basic prompt injection to advanced techniques like iterative attacks, encoding, obfuscation, and multi-turn conversations.

31 Risk Categories

Comprehensive testing across bias, toxicity, PII, harmful content, CBRN, misinformation, and more.

Agent-Specific Tests

Specialized tests for AI agents including tool misuse, access control, and behavior analysis.

Getting Started

Review the Quickstart

Get up and running quickly with your first red team test.Go to Quickstart →

Explore Test Catalogs

Understand what tests are available for your model type (foundation models or AI agents).

Build Your Payload

Learn the payload structure and how to configure tests and attack methods.Payload Guide →

Choose Attack Methods

Explore the 30+ attack methods available for different model types.Attack Methods Reference →

Reference Configuration

Get detailed information about all configuration options.Configuration Reference →

Use Examples

Copy and adapt ready-to-use examples for your specific use case.Browse Examples →

Available Test Categories

For comprehensive catalogs of all tests with detailed descriptions, categories, sample prompts, and configuration examples:

Foundation Model Test Catalog - Tests for foundation models and LLM-based applications
Agent Test Catalog - Tests for AI agents with tool use and autonomous capabilities

Standard Tests (12 tests)

Tests available for all AI systems to identify common security vulnerabilities:

Bias Test

Keyword: bias_testIdentifies and exposes biased outputs based on protected attributes like race, gender, age, religion, and nationality.Example Use Cases:

HR chatbots
Loan approval systems
Content recommendation engines

Harmful Content Test

Keyword: harmful_testElicits responses that promote harm or danger to individuals or groups.Example Use Cases:

Customer-facing chatbots
Educational AI tutors
Mental health applications

Toxicity Test

Keyword: toxicity_testGenerates harmful or offensive content including hate speech, profanity, and discriminatory language.Example Use Cases:

Social media moderation
Community forums
Gaming chat systems

PII Exposure Test

Keyword: pii_testExposes personally identifiable information that should remain private.Example Use Cases:

Healthcare applications (HIPAA)
Financial services (PCI-DSS)
Customer support systems

CBRN Test

Keyword: cbrn_testAddresses vulnerabilities related to chemical, biological, radiological, and nuclear domains.Example Use Cases:

Scientific research assistants
Educational platforms
General-purpose chatbots

CSEM Test

Keyword: csem_testTests for child abuse exploitation vulnerabilities (Child Sexual Exploitation Material).Example Use Cases:

Platforms serving minors
Educational applications
Family-oriented services

Insecure Code Test

Keyword: insecure_code_testProduces insecure or harmful code snippets with security vulnerabilities.Example Use Cases:

Code generation tools
AI pair programmers
Educational coding assistants

Keyword: copyright_testTests whether the model reproduces copyrighted material.Example Use Cases:

Content generation platforms
Writing assistants
Creative AI tools

Misinformation Test

Keyword: misinformation_testTests the model’s tendency to generate false or misleading information.Example Use Cases:

News summarization
Educational content
Medical information systems

System Prompt Extraction Test

Keyword: system_prompt_extractions_testAttempts to extract the system prompt or internal instructions.Example Use Cases:

Proprietary AI systems
Custom-trained models
Enterprise chatbots

Sponge Attack Test

Keyword: sponge_testTests for infinite loops or resource exhaustion attacks.Example Use Cases:

Production API endpoints
Rate-limited services
Cloud-based AI systems

Competitor Information Test

Keyword: competitor_testAttempts to extract information about competitors.Example Use Cases:

Business chatbots
Sales assistants
Competitive intelligence systems

Custom Tests

Custom Dataset Test

Keyword: custom_testTest with your own custom-generated dataset tailored to your specific domain and use case.When to Use:

Domain-specific applications (healthcare, legal, finance)
Unique business requirements
Specialized security concerns

Learn how to create custom datasets →

Specialized Adversarial Tests (6 tests)

Advanced tests using generated adversarial datasets:

Adversarial Bias
Adversarial Info
Adversarial Persona
Adversarial Command
Adversarial PII
Adversarial Competitor

Keyword: adv_bias_testUncover biased outputs through sophisticated adversarial methods.

Agentic Tests (8 tests)

Specialized tests for AI agents with tool use and autonomous capabilities:

Governance Test

Keyword: governance_testTesting the alignment and governance of the model, including goal misalignment and policy drift detection.When to Use: AI agents with autonomous decision-making capabilities

Agent Output Quality Test

Keyword: agent_output_quality_testTesting the quality of the agent’s output including hallucinations, bias, and toxicity in multi-turn workflows.When to Use: Multi-agent systems, agents with extended conversations

Tool Misuse Test

Keyword: tool_misuse_testTesting the misuse of the model’s tools and capabilities including API integration issues, supply chain vulnerabilities, and resource consumption.When to Use: Agents with external tool access, API integrations

Privacy Test

Keyword: privacy_testTesting privacy protections and data handling including sensitive data exposure and exfiltration channels.When to Use: Agents handling sensitive or personal data

Reliability and Observability Test

Keyword: reliability_and_observability_testTesting the reliability and observability of agent operations including data poisoning, concept drift, and opaque reasoning.When to Use: Production agents requiring auditability and traceability

Agent Behaviour Test

Keyword: agent_behaviour_testTesting the behaviour patterns and decision-making of the agent including human manipulation and unsafe actuation.When to Use: Agents with human interaction or physical/digital actuation capabilities

Access Control and Permissions Test

Keyword: access_control_and_permissions_testTesting access control and permissions enforcement including credential theft, privilege escalation, and confused deputy attacks.When to Use: Agents with privileged access to systems or data

Tool Extraction Test

Keyword: tool_extraction_testTest if the agent tool information can be extracted in outputs, revealing internal capabilities and configurations.When to Use: Agents with proprietary or sensitive tool configurations

Agent Test Catalog

Comprehensive catalog of all agent tests with descriptions, categories, sample prompts, and configuration requirements

Attack Methods Overview

Basic & Static Attacks

Basic Prompts

Direct adversarial prompts without obfuscation (baseline testing)

Encoding

9 encoding methods: Base64, Hex, ASCII, Binary, URL, Leet, Morse, ROT13, ROT21

Obfuscation

Character substitution, EAI attacks, and general obfuscation techniques

Multilingual

5 languages: French, Italian, Hindi, Spanish, Japanese

Visual

Image masking, FigStep, HADES, JOOD for vision models

Audio

Waveform, echo, speed, pitch, reverb, noise for audio models

Model-Specific Attacks

Diffusion Model Attacks

Coming Soon: Prompt injection for harmful image generation, bias amplification, copyright violations, NSFW content bypass

Audio Output Attacks

Coming Soon: Voice cloning attacks, harmful audio generation, bias in speech synthesis, deepfake audio creation

Dynamic Attacks

Iterative Attacks: Progressive prompt refinement based on model responses
Multi-Turn Attacks: Distributing malicious intent across conversation history

Advanced Techniques

Deep Inception: Nested multi-layer prompt injection using recursive context framing

Explore all attack methods in detail →

Model Support

LLM (Text-to-Text)
VLM (Image+Text)
ALM (Audio+Text)
AI Agents
Coming Soon

Supported Models:

OpenAI (GPT-4o, GPT-4o-mini, GPT-3.5-turbo)
Anthropic (Claude 3.5 Sonnet, Claude 3 Opus)
Google (Gemini Pro, Gemini Flash)
Meta (Llama models)
Mistral, Groq, and more

Attack Methods: 20+ methods including encoding, obfuscation, multilingual, dynamic attacks

Use Cases by Industry

Healthcare

Critical Tests:

PII Test (HIPAA compliance)
Misinformation Test (medical accuracy)
Harmful Content Test (patient safety)

Example: Testing a medical chatbot to ensure it doesn’t leak patient data or provide dangerous medical advice.View Healthcare Example →

Financial Services

Critical Tests:

PII Test (PCI-DSS, financial data)
Competitor Test (proprietary information)
Bias Test (fair lending)

Example: Testing a banking assistant to prevent unauthorized access to account information.View Banking Example →

Education

Critical Tests:

CSEM Test (child safety)
Bias Test (fair education)
Harmful Content Test (age-appropriate content)

Example: Testing an AI tutor to ensure safe, unbiased educational support for students.View Education Example →

Enterprise & Business

Critical Tests:

System Prompt Extraction (IP protection)
Competitor Test (confidential information)
Tool Misuse Test (for agents)

Example: Testing a business agent to prevent unauthorized tool use or information leakage.

Documentation Guide

Quickstart

Get your first red team test running in minutes

Test Catalogs

Foundation model and agent test catalogs with detailed descriptions

Payload Guide

Overview and quick reference for building payloads

Attack Methods

Comprehensive guide to all 30+ attack methods

Configuration

Complete reference for all configuration options

Examples

14+ ready-to-use payload examples

API Reference

Complete API specification and endpoints

Why Red Team Your AI?

Identify Vulnerabilities Early

Discover security issues before they’re exploited in production. Proactive testing saves time, money, and reputation.

Meet Compliance Requirements

Demonstrate due diligence for regulations like GDPR, HIPAA, PCI-DSS, and emerging AI governance frameworks.

Build User Trust

Show your commitment to AI safety and security. Users trust systems that have been rigorously tested.

Continuous Improvement

Regular red teaming helps you track security improvements over time as you update your models and prompts.

Getting Help

Schedule a Demo

Get personalized guidance on red teaming your AI systems

Join Community

Connect with other AI security professionals

API Documentation

Explore our complete API documentation

Contact Support

Get help with your red teaming implementation

Ready to start testing? Head to the Quickstart →

Get Started for Guardrails

Get Started for Red Team

Get Started with AI Proxy

On-Prem Installation

​What is AI Red Teaming?

​Key Features

Multi-Modal Testing

30+ Attack Methods

31 Risk Categories

Agent-Specific Tests

​Getting Started

​Available Test Categories

​Standard Tests (12 tests)

​Custom Tests

​Specialized Adversarial Tests (6 tests)

​Agentic Tests (8 tests)

Agent Test Catalog

​Attack Methods Overview

​Basic & Static Attacks

Basic Prompts

Encoding

Obfuscation

Multilingual

Visual

Audio

​Model-Specific Attacks

Diffusion Model Attacks

Audio Output Attacks

​Dynamic Attacks

​Advanced Techniques

​Model Support

​Use Cases by Industry

​Documentation Guide

Quickstart

Test Catalogs

Payload Guide

Attack Methods

Configuration

Examples

API Reference

​Why Red Team Your AI?

​Getting Help

Schedule a Demo

Join Community

API Documentation

Contact Support

What is AI Red Teaming?

Key Features

Getting Started

Available Test Categories

Standard Tests (12 tests)

Custom Tests

Specialized Adversarial Tests (6 tests)

Agentic Tests (8 tests)

Attack Methods Overview

Basic & Static Attacks

Model-Specific Attacks

Dynamic Attacks

Advanced Techniques

Model Support

Use Cases by Industry

Documentation Guide

Why Red Team Your AI?

Getting Help