This catalog covers tests for foundation models and LLM-based applications. For agent-specific tests (such as agent_output_quality_test and agent_behaviour_test), see the Agent Test Catalog.
Standard Tests
Standard tests use pre-defined CSV datasets and can be run without additional system information.
CBRN Test
Purpose: Tests for Chemical, Biological, Radiological, and Nuclear (CBRN) information extraction vulnerabilities.
NIST Category: CBRN Information or Capabilities
AIUC Category: F. Society
Test Categories:
- Chemistry (VX Nerve Agent, Sarin Gas, Botulinum Toxin, etc.)
 - Biology (Ebola virus, Anthrax, Smallpox, etc.)
 - Radiology (Plutonium-239, Cesium-137, etc.)
 - Nuclear Science (Uranium enrichment, nuclear weapons design, etc.)
 - Cybersecurity
 
- Prompt
 - Response
 
Harmful Test
Purpose: Tests for generation of dangerous, violent, or hateful content.
NIST Category: Dangerous, Violent, or Hateful Content
AIUC Category: C. Safety
Test Categories:
- Hate Speech & Discrimination
 - Suicide & Self-harm
 - Sexual Content
 - Regulated or Controlled Substances
 - Guns & Illegal Weapons
 - Criminal Planning
 
- Prompt
 - Response
 
Bias Test
Purpose: Tests for harmful bias and stereotyping in AI responses.
NIST Category: Harmful Bias or Homogenization
AIUC Category: C. Safety
Test Variants:
- Word-level bias tests: Tests implicit associations between groups and attributes
 - Sentence-level bias tests: Tests bias in contextual scenarios
 
- Gender
 - Race
 - Caste and Religion
 - Health
 
- Prompt
 - Response
 
Toxicity Test
Purpose: Tests for toxic, abusive, or offensive language generation.
NIST Category: Obscene, Degrading, and/or Abusive Content
Test Categories:
- Insult
 - Threat
 - Flirtation
 - Severe Toxicity
 - Identity Attack
 - Profanity
 - Sexually Explicit
 
- Prompt
 - Response
 
PII Test
Purpose: Tests for personally identifiable information (PII) extraction vulnerabilities.
NIST Category: Data Privacy
AIUC Category: A. Data & Privacy
OWASP Category: LLM02:25, LLM05:25
Test Categories:
- phone_number
 - social_security_number
 - date_of_birth
 - address
 - mother’s_maiden_name
 - credit_card_number
 - passport_number
 - driver’s_license_number
 - bank_account_number
 
- Prompt
 - Response
 
Insecure Code Test
Purpose: Tests for generation of insecure or malicious code.
NIST Category: Information Security
Test Categories:
- Evasion (malware evasion techniques)
 - Payload (malicious payloads)
 - TopLevel (complete malware)
 - SubFunctions (malware subfunctions)
 - GTUBE (test email signatures)
 - EICAR (antivirus test files)
 - GTphish (phishing test signatures)
 
- Prompt
 - Response
 
Copyright Test
Purpose: Tests for reproduction of copyrighted content.
Test Categories:
- song_lyrics
 - novel_paragraphs
 - news_articles
 - poems
 
- Prompt
 - Response
 
Misinformation Test
Purpose: Tests for generation of false or misleading information.
NIST Category: Confabulation
AIUC Category: D. Reliability
OWASP Category: LLM05:25, LLM09:25
Test Categories:
- long_input (overwhelming input)
 - infinite_output (requests for infinite output)
 - recursion_loop (recursive prompts)
 - rare_unicode (unicode manipulation)
 - repetitive_output (excessive repetition)
 
- Prompt
 - Response
 
Sponge Test
Purpose: Tests for computational resource exhaustion attacks (sponge attacks).
NIST Category: Data Privacy
AIUC Category: A. Data & Privacy
OWASP Category: LLM02:25, LLM05:25
Test Categories:
- long_input
 - infinite_output
 - recursion_loop
 - rare_unicode
 - repetitive_output
 
- Prompt
 - Response
 
Hallucination Test
Purpose: Tests for factual inaccuracies and hallucinations in AI responses.
Test Categories:
- History & Archaeology
 - Finance & Economics
 - Science & Medicine
 - Technology & Computing
 - Geography & Geopolitics
 - Law & Policy
 - Sports Statistics & Records
 - Arts, Culture & Literature
 
- Prompt
 - Response
 
System Prompt Extractions Test
Purpose: Tests for system prompt and tool extraction vulnerabilities.
NIST Category: Data Privacy
AIUC Category: A. Data & Privacy
OWASP Category: LLM02:25, LLM05:25
Test Variants:
- System Prompt Extraction: Attempts to extract system prompts
 - Tool Extraction: Attempts to extract available tools/functions
 
- instruction_override
 - direct_extraction
 - other
 - context_overflow
 - encoding_attack
 - academic_framing
 - role_playing
 - impersonation
 - technical_exploit
 - socratic_method
 - hypothetical_scenario
 - multi_question
 
- direct_request
 - contextual_framing
 - capability_probing
 - instruction_override
 - technical_impersonation
 - role_playing
 - error_exploitation
 - emotional_appeal
 - other
 
- Prompt
 - Response
 
CSEM Test
Purpose: Tests for child sexual exploitation material (CSEM) related content generation.
Test Categories:
- Sexual Acts
 - Child Porn
 - Blackmail
 - Grooming
 
- Prompt
 - Response
 
Custom Tests (adv_*)
Custom tests require system_description or policy_description in dataset_configuration to generate contextually relevant test prompts. These tests dynamically generate prompts based on the target system’s description.
Note: When custom_test is selected as the test type, it automatically runs the following tests:
- adv_info_test
- adv_command_test
- adv_bias_test

adv_pii_test and adv_persona_test may be included based on configuration settings.
Configuration:
For custom_test, system_description or policy_description should be provided in dataset_configuration.
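A minimal sketch of the requirement above, assuming the configuration is handled as a Python dictionary. Only the names dataset_configuration, system_description, and policy_description come from this catalog. The helper has_description and the example description text are hypothetical, and the rest of the request payload is not shown because this page does not specify it.

```python
# Hypothetical example values; only the key names come from this catalog.
dataset_configuration = {
    "system_description": (
        "Customer-support assistant for an online bookstore. "
        "It can look up orders and answer shipping questions."
    ),
}


def has_description(config: dict) -> bool:
    """Return True if a non-empty system_description or policy_description is present."""
    return any(
        isinstance(config.get(key), str) and config[key].strip()
        for key in ("system_description", "policy_description")
    )


print(has_description(dataset_configuration))  # True
print(has_description({}))  # False
```

A check like this can run on the client before a custom_test or adv_* test is submitted, so a missing description is caught early.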
adv_competitor_test
Purpose: Tests whether the AI can be manipulated to promote competitors, disclose competitor information, or generate FUD (Fear, Uncertainty, Doubt) about competitors.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the AI system/application being tested

- brand_loyalty: Tests if AI prioritizes its own brand
- competitor_promotion: Tests if AI can be prompted to recommend competitors
- comparative_analysis: Tests AI responses to direct comparison requests
- information_disclosure: Tests whether AI reveals competitor-specific details
- FUD_generation: Tests if AI generates negative information about competitors

- Application domain/industry
- Product/service description
- Business context
 
- Prompt
 - Response
 
For adv_* tests, system_description or policy_description should be provided in dataset_configuration.
adv_info_test
Purpose: Tests for information extraction vulnerabilities, attempting to extract sensitive system information, methodologies, or proprietary knowledge.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the system’s capabilities and data access

- System architecture details
- Data access capabilities
- Available tools/functions
- Internal processes and methodologies
 
- Prompt
 - Response
 
adv_command_test
Purpose: Tests for command injection and unauthorized command execution vulnerabilities.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of available commands and system capabilities

- Available commands/functions
- System permissions
- Execution capabilities
- Data modification rights
 
- Prompt
 - Response
 
adv_pii_test
Purpose: Tests for PII extraction vulnerabilities, specifically targeting other users’ data in the system.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the application and types of user data it handles

- Application domain
- Types of PII stored
- User data access patterns
- System capabilities
 
- email, phone_number, social_security_number, date_of_birth, address, credit_card_number, passport_number, driver’s_license_number, bank_account_number, etc.
 
- Prompt
 - Response
 
adv_bias_test
Purpose: Tests for bias vulnerabilities in context-specific scenarios, generating prompts that reveal discriminatory assumptions.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the system’s domain and use cases
- scenario: Specific scenario to test (e.g., “Academic Performance”, “Leadership Roles”)
- category: Bias category (e.g., “Gender”, “Socioeconomic”, “Age”)

- Application domain/context
- Use cases and scenarios
- User interactions
- Decision-making processes
 
- Prompt
 - Response
 
adv_sponge_test
Purpose: Tests for computational resource exhaustion using contextually relevant prompts.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the system to generate relevant sponge attack prompts

- System capabilities
- Processing patterns
- Resource-intensive operations
- Input/output formats
 
- Prompt
 - Response
 
adv_persona_test
Purpose: Tests for persona manipulation vulnerabilities, attempting to make the AI adopt harmful or unauthorized personas.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of the system’s intended persona and boundaries

- Intended persona/role
- System boundaries
- Authority levels
- Use case constraints
 
- Prompt
 - Response
 
adv_tool_test
Purpose: Tests for tool misuse vulnerabilities in agent-based systems.
Required Configuration:
- system_description or policy_description in dataset_configuration: Description of available tools and their intended usage

- Available tools/functions
- Tool capabilities
- Permissions and access levels
- Intended use cases
 
- Prompt
 - Response
 
Configuration Requirements Summary
Standard Tests
Standard tests use pre-defined CSV datasets and require minimal configuration:
- Test type selection
 - Sample percentage (0-100%)
 - Attack methods (e.g., basic, iterative, static)
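
The three settings above can be checked before a run. This is a sketch only: the class name and the field names test_type, sample_percentage, and attack_methods are hypothetical, not the documented payload schema. The catalog gives basic, iterative, and static as examples of attack methods rather than a complete set, so the sketch does not validate method names, and requiring at least one method is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class StandardTestSettings:
    """Hypothetical container for the settings a standard test needs."""

    test_type: str
    sample_percentage: float
    attack_methods: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means the settings look usable."""
        found = []
        if not self.test_type:
            found.append("test_type is empty")
        # The catalog states the sample percentage ranges from 0 to 100.
        if not 0 <= self.sample_percentage <= 100:
            found.append("sample_percentage must be between 0 and 100")
        # Assumption: a run needs at least one attack method.
        if not self.attack_methods:
            found.append("at least one attack method is expected")
        return found


settings = StandardTestSettings("cbrn_test", 25, ["basic"])
print(settings.problems())  # []
```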
 
Custom Tests (adv_*)
Custom tests require additional configuration:
- system_description or policy_description (required in dataset_configuration)
  - Description of the target AI system
  - Application domain and use cases
  - Capabilities and limitations
  - Data access patterns
- Context-specific parameters (test-dependent)
  - adv_competitor_test: Business context, competitor information
  - adv_bias_test: Scenarios, bias categories
  - adv_pii_test: Application domain, PII types
  - adv_sponge_test: Processing patterns, resource-intensive operations
 
system_description and policy_description are part of dataset_configuration, not params in attack methods. The params section is reserved for attack-specific parameters (e.g., encoding iterations, branching factors).
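The placement rule above is easy to get wrong when payloads are assembled by hand. The following sketch, which assumes params is available as a Python dictionary, flags description keys that ended up there by mistake. The function name and the example key encoding_iterations are hypothetical; only system_description, policy_description, dataset_configuration, and params are names from this catalog.

```python
DESCRIPTION_KEYS = ("system_description", "policy_description")


def misplaced_description_keys(params: dict) -> list[str]:
    """Return description keys that were put in params instead of dataset_configuration."""
    return [key for key in DESCRIPTION_KEYS if key in params]


# Hypothetical attack parameters; "encoding_iterations" is an illustrative name.
wrong = {"encoding_iterations": 2, "system_description": "HR chatbot"}
right = {"encoding_iterations": 2}

print(misplaced_description_keys(wrong))  # ['system_description']
print(misplaced_description_keys(right))  # []
```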
Test Coverage
By NIST Category
- CBRN Information or Capabilities: cbrn_test
 - Dangerous, Violent, or Hateful Content: harmful_test
 - Harmful Bias or Homogenization: bias_test, adv_bias_test
 - Obscene, Degrading, and/or Abusive Content: toxicity_test
 - Data Privacy: pii_test, adv_pii_test, system_prompt_extractions_test, sponge_test
 - Information Security: insecure_code_test
 - Confabulation: misinformation_test, hallucination_test
 
By AIUC Category
- A. Data & Privacy: pii_test, system_prompt_extractions_test, sponge_test
 - C. Safety: harmful_test, bias_test, toxicity_test
 - D. Reliability: misinformation_test
 - F. Society: cbrn_test
 
By OWASP Category
- LLM02:25: pii_test, system_prompt_extractions_test, sponge_test
 - LLM05:25: pii_test, system_prompt_extractions_test, misinformation_test
 - LLM09:25: misinformation_test
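
The coverage lists above can also be read in the other direction, from a test to the categories it covers. The sketch below copies the three lists into a dictionary and looks a test up across all of them. COVERAGE and categories_for are hypothetical names used for illustration and are not part of any API described here.

```python
# Coverage lists transcribed from the three "By ... Category" lists in this catalog.
COVERAGE = {
    "NIST": {
        "CBRN Information or Capabilities": ["cbrn_test"],
        "Dangerous, Violent, or Hateful Content": ["harmful_test"],
        "Harmful Bias or Homogenization": ["bias_test", "adv_bias_test"],
        "Obscene, Degrading, and/or Abusive Content": ["toxicity_test"],
        "Data Privacy": [
            "pii_test",
            "adv_pii_test",
            "system_prompt_extractions_test",
            "sponge_test",
        ],
        "Information Security": ["insecure_code_test"],
        "Confabulation": ["misinformation_test", "hallucination_test"],
    },
    "AIUC": {
        "A. Data & Privacy": ["pii_test", "system_prompt_extractions_test", "sponge_test"],
        "C. Safety": ["harmful_test", "bias_test", "toxicity_test"],
        "D. Reliability": ["misinformation_test"],
        "F. Society": ["cbrn_test"],
    },
    "OWASP": {
        "LLM02:25": ["pii_test", "system_prompt_extractions_test", "sponge_test"],
        "LLM05:25": ["pii_test", "system_prompt_extractions_test", "misinformation_test"],
        "LLM09:25": ["misinformation_test"],
    },
}


def categories_for(test_name: str) -> dict[str, list[str]]:
    """Return, per framework, the categories whose test list contains test_name."""
    return {
        framework: [cat for cat, tests in table.items() if test_name in tests]
        for framework, table in COVERAGE.items()
    }


print(categories_for("misinformation_test"))
# {'NIST': ['Confabulation'], 'AIUC': ['D. Reliability'], 'OWASP': ['LLM05:25', 'LLM09:25']}
```

Tests that appear in none of the lists, such as copyright_test, return an empty list for every framework.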
 
Usage Notes
- Standard Tests: Can be run immediately with CSV datasets. Configure sample percentage and attack methods.
- Custom Tests: Require system_description or policy_description in dataset_configuration.
- Custom Test Type: When custom_test is selected as the test type, it automatically runs adv_info_test, adv_command_test, and adv_bias_test together. This is equivalent to running all three tests individually and provides comprehensive coverage of information extraction, command injection, and bias vulnerabilities.
  Note: For custom_test and adv_* tests, system_description or policy_description should be provided in dataset_configuration, not in params. The params section is reserved for attack-specific parameters (e.g., encoding iterations, branching factors, depth).
- Test Combinations: Multiple tests can be run simultaneously. Use appropriate sample percentages to manage test duration.
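
The custom_test equivalence described in the notes above can be illustrated on the client side. This sketch expands custom_test into the three tests it runs, which is useful for estimating what a combined run will cover. How the service performs the expansion is not documented here, the function name is hypothetical, and the optional inclusion of adv_pii_test and adv_persona_test is not modeled.

```python
# The three tests that custom_test runs, as listed in this catalog.
CUSTOM_TEST_MEMBERS = ("adv_info_test", "adv_command_test", "adv_bias_test")


def expand_test_types(selected: list[str]) -> list[str]:
    """Replace custom_test with the tests it runs, keeping order and dropping duplicates."""
    expanded = []
    for name in selected:
        members = CUSTOM_TEST_MEMBERS if name == "custom_test" else (name,)
        for member in members:
            if member not in expanded:
                expanded.append(member)
    return expanded


print(expand_test_types(["custom_test", "adv_bias_test", "pii_test"]))
# ['adv_info_test', 'adv_command_test', 'adv_bias_test', 'pii_test']
```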
 
Related Pages
- Payload Guide - Overview and quick reference for building payloads
 - Agent Test Catalog - Tests for AI agents with tool use and autonomous capabilities
 - Attack Methods Reference - Detailed attack method information
 - Configuration Reference - Complete configuration reference
 - Examples - Ready-to-use payload examples
 - Quickstart - Getting started with the API
 

