The payload for EnkryptAI’s Red Teaming API is a JSON object with two top-level identifiers (test_name and dataset_name) and two primary sections: redteam_test_configurations and target_model_configuration.

Payload Structure

{
    "test_name": "Test 1",
    "dataset_name": "standard", // Or your custom dataset name
    "redteam_test_configurations": {
        // Test settings and parameters
    },
    "target_model_configuration": {
        // Details of the target model
    }
}

Red Team Test Configurations

This section specifies the test categories and parameters for each.

{
    "test_name": "Test 1",
    "dataset_name": "standard", // Or your custom dataset name
    "redteam_test_configurations": {
        "bias_test": {
            "sample_percentage": 1,
            "attack_methods": {
                "basic": [
                    "basic"
                ]
            }
        },
        "cbrn_test": {
            "sample_percentage": 1,
            "attack_methods": {
                "basic": [
                    "basic"
                ]
            }
        },
        "harmful_test": {
            "sample_percentage": 1,
            "attack_methods": {
                "basic": [
                    "basic"
                ]
            }
        },
        "insecure_code_test": {
            "sample_percentage": 1,
            "attack_methods": {
                "basic": [
                    "basic"
                ]
            }
        },
        "toxicity_test": {
            "sample_percentage": 1,
            "attack_methods": {
                "basic": [
                    "basic"
                ]
            }
        }
    }
}

Field Descriptions

  • Categories: Each key (bias_test, cbrn_test, harmful_test, insecure_code_test, toxicity_test) represents a specific test category.
  • sample_percentage: Percentage of the dataset to sample for that category (1-100).
  • attack_methods: Attack methods to apply for that category (see Attack Methods below).
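Because each category in the example uses identical settings, the block can be generated programmatically. A minimal sketch; the helper name and category list are illustrative and not part of any EnkryptAI SDK:

```python
# Build a redteam_test_configurations block that applies the same
# settings to every standard test category. Helper names here are
# illustrative, not part of the EnkryptAI API.

STANDARD_TESTS = [
    "bias_test", "cbrn_test", "harmful_test",
    "insecure_code_test", "toxicity_test",
]

def make_test_config(sample_percentage=1, attack_methods=None, tests=None):
    """Return a config dict with identical settings per test category."""
    if not 1 <= sample_percentage <= 100:
        raise ValueError("sample_percentage must be between 1 and 100")
    attack_methods = attack_methods or {"basic": ["basic"]}
    return {
        test: {
            "sample_percentage": sample_percentage,
            "attack_methods": attack_methods,
        }
        for test in (tests or STANDARD_TESTS)
    }

config = make_test_config(sample_percentage=1)
```

Passing a different `tests` list lets the same helper cover the specialized adv_* categories described below.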

Test Categories

Standard Tests for Everyone

  1. bias_test: Identifies and exposes biased outputs.

  2. cbrn_test: Addresses vulnerabilities related to chemical, biological, radiological, and nuclear domains.

  3. harmful_test: Elicits responses that promote harm or danger.

  4. insecure_code_test: Produces insecure or harmful code snippets.

  5. toxicity_test: Generates harmful or offensive content.

Specialized Tests for Generated Datasets

  1. adv_bias_test: Uncovers biased outputs through adversarial methods.

  2. adv_info_test: Extracts sensitive or unintended information from a generated dataset.

  3. adv_tool_test: Misuses integrated tools or features.

  4. adv_command_test: Manipulates the model into executing unintended commands.

  5. adv_pii_test: Exposes personally identifiable information.

  6. adv_competitor_test: Exploits vulnerabilities to gain an advantage over competitors.

Attack Methods

  • basic: Basic attack method for each test category.
    • basic
  • advanced: Advanced attack methods, split into static and dynamic variants.
    • static
      • encoding, single_shot
    • dynamic
      • iterative
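Combining the options above, a single test category can request both basic and advanced methods. An illustrative fragment; the exact nesting of the advanced methods is an assumption here, so confirm it against the API reference:

```json
"harmful_test": {
    "sample_percentage": 10,
    "attack_methods": {
        "basic": ["basic"],
        "advanced": {
            "static": ["encoding", "single_shot"],
            "dynamic": ["iterative"]
        }
    }
}
```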

Target Model Configuration

This section provides specifics about the model to be tested.

"target_model_configuration": {
    "testing_for": "LLM", // Options: ["LLM", "Chatbot", "Copilot"]
    "model_name": "google/gemma-7b-it",
    "model_type": "text_2_text",
    "system_prompt": "",
    "conversation_template": "",
    "model_source": "https://docs.anyscale.com",
    "model_provider": "Google",
    "model_endpoint_url": "https://api.endpoints.anyscale.com/v1/chat/completions",
    "model_api_key": "ANYSCALE_API_KEY"
}

Field Descriptions

  • testing_for: Identifies the AI system type. Options: "LLM", "Chatbot", "Copilot".
  • model_name: Required; the name or identifier of the model (e.g., "google/gemma-7b-it").
  • model_type: Model type, typically "text_2_text" for LLMs.
  • model_version: Optional; a custom version label such as "v1" (not shown in the example above).
  • system_prompt: Sets the model’s initial behavior; may be left empty.
  • conversation_template: Template for the conversation structure; may be left empty.
  • model_source: URL of the model’s documentation or source.
  • model_provider: Entity providing the model (e.g., "Google").
  • model_endpoint_url: Required; API endpoint used to query the model.
  • model_api_key: Key for API access; replace the placeholder with your actual credential and keep it secret.
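Since model_api_key should never be hard-coded, one way to populate the block is from an environment variable. A sketch; the helper name is illustrative, the variable name ANYSCALE_API_KEY simply mirrors the example above, and the other field values should be adjusted for your model:

```python
import os

# Populate target_model_configuration without hard-coding credentials.
# Field values mirror the example above; adjust them for your model.
def make_target_config(model_name, endpoint_url, api_key_env="ANYSCALE_API_KEY"):
    api_key = os.environ.get(api_key_env, "")
    if not api_key:
        raise RuntimeError(f"set {api_key_env} before running the test")
    return {
        "testing_for": "LLM",
        "model_name": model_name,
        "model_type": "text_2_text",
        "system_prompt": "",
        "conversation_template": "",
        "model_source": "https://docs.anyscale.com",
        "model_provider": "Google",
        "model_endpoint_url": endpoint_url,
        "model_api_key": api_key,
    }
```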

Usage Notes

  1. Verify that API keys and endpoints are correctly configured and secure.
  2. Adjust sample_percentage based on dataset size and testing requirements.
  3. Choose suitable attack_methods based on specific evaluation needs.
  4. Leave system_prompt and conversation_template empty if not needed.
  5. Use the tool in accordance with model provider terms of service and ethical guidelines.
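Putting the pieces together, the full payload can be serialized and POSTed to the Red Teaming API. A sketch under stated assumptions: REDTEAM_URL is a placeholder, and the "apikey" header name is an assumption, so take both from the EnkryptAI API reference for your account:

```python
import json
import urllib.request

# REDTEAM_URL is a placeholder; use the documented endpoint from the
# EnkryptAI API reference. The "apikey" header name is also an assumption.
REDTEAM_URL = "https://api.example.com/redteam"

def build_payload(test_config, target_config, test_name="Test 1",
                  dataset_name="standard"):
    """Assemble the four top-level payload fields described above."""
    return {
        "test_name": test_name,
        "dataset_name": dataset_name,
        "redteam_test_configurations": test_config,
        "target_model_configuration": target_config,
    }

def submit(payload, api_key):
    """POST the payload as JSON and return the parsed response."""
    req = urllib.request.Request(
        REDTEAM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "apikey": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```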

This payload structure facilitates in-depth testing across various model types, allowing for comprehensive assessments of behavior and security.