Analyze an image and an accompanying text prompt using individually configured multimodal guardrails detectors.
The text prompt to accompany the image for analysis.
Base64-encoded image file content.
Configuration for multimodal detectors. Each key is a detector name with an object specifying its settings. Supported detectors: toxicity, nsfw, injection_attack, pii, policy_violation.
{
"toxicity": { "enabled": true },
"nsfw": { "enabled": true },
"injection_attack": { "enabled": true },
"pii": {
"enabled": true,
"entities": ["person", "phone", "email"]
},
"policy_violation": {
"enabled": true,
"policy_text": "No violent or illegal content allowed.",
"need_explanation": true
}
}

Multimodal image detection results.

Per-detector integer flags (1 = detected, 0 = not detected).
{
"toxicity": 0,
"nsfw": 0,
"pii": 1,
"injection_attack": 0,
"policy_violation": 0
}

Per-detector detail objects containing human-readable results and any extra fields (e.g. entities for pii, explanation for policy_violation).
{
"toxicity": { "toxicity": "Toxicity Not Detected" },
"nsfw": { "nsfw": "NSFW Not Detected" },
"pii": {
"entities": { "person": { "John Doe": "<person_0>" } }
},
"injection_attack": {
"injection_attack": "Injection Attack Not Detected"
},
"policy_violation": {
"policy_violation": "Policy Violation Not Detected"
}
}
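A minimal client-side sketch of how the request and response shapes above might be handled. This is an illustration, not the provider's SDK: the endpoint URL and the top-level field names (text, image, detectors, flags) are assumptions based on the descriptions above; check your provider's API reference for the actual names, URL, and authentication scheme.

```python
import base64

# Hypothetical endpoint; the real URL and auth headers come from your
# guardrails provider's documentation.
GUARDRAILS_URL = "https://api.example.com/v1/guard/image"


def build_request(image_bytes, text, detectors):
    """Assemble the request body: base64-encoded image content, the text
    prompt, and the per-detector configuration object (field names assumed)."""
    return {
        "text": text,
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "detectors": detectors,
    }


def flagged_detectors(response):
    """Return the names of detectors whose integer flag is 1 (detected)."""
    return [name for name, flag in response["flags"].items() if flag == 1]


def redact(text, pii_entities):
    """Replace detected PII spans with their placeholders, using the
    entities mapping shown in the pii detail object above."""
    for mapping in pii_entities.values():
        for original, placeholder in mapping.items():
            text = text.replace(original, placeholder)
    return text


# Example usage with the detector configuration and sample response above.
detectors = {
    "toxicity": {"enabled": True},
    "pii": {"enabled": True, "entities": ["person", "phone", "email"]},
}
body = build_request(b"\x89PNG...", "Describe this image.", detectors)

sample_response = {
    "flags": {
        "toxicity": 0,
        "nsfw": 0,
        "pii": 1,
        "injection_attack": 0,
        "policy_violation": 0,
    },
}
print(flagged_detectors(sample_response))  # ['pii']
print(redact("Call John Doe", {"person": {"John Doe": "<person_0>"}}))
# Call <person_0>
```

The body would then be sent as JSON to the endpoint; the flags object gives a quick pass/fail signal per detector, while the detail objects carry the extra fields (entity mappings, policy explanations) needed for follow-up handling such as redaction.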