feat(guardrails): add Veto guardrail plugin #1670

Open
OdysseusU wants to merge 1 commit into Portkey-AI:main from OdysseusU:veto-guardrail
@OdysseusU
Description: (required)

Adds Veto as a native guardrail provider. Veto is an
EU-hosted LLM guardrail layer (PII/secret redaction, prompt-injection detection,
content moderation) behind one endpoint. Independent benchmark:
https://bench.vetocheck.com

  • New plugins/veto/manifest.json, check.ts (handler), check.test.ts.
  • Registered veto.check in plugins/index.ts.
  • Thin HTTP client: calls Veto's POST /v1/check and maps the verdict onto the
    PluginHandler contract: block → verdict:false, redact →
    verdict:true + masked transformedData, allow → verdict:true. No
    detection logic in the gateway.
  • Reads/writes content via the shared getCurrentContentPart /
    setCurrentContentPart utils (handles multimodal text parts); posts via the
    shared post util with a configurable timeout (default 30s).
  • Fail-closed on a non-2xx (HttpError), timeout, thrown error, or missing
    apiKey (verdict:false) — consistent with screening guardrails; an
    unreachable backend must not pass unscanned text through.
  • Logged data.findings carries only category/rule/severity/score; the
    matched substring and offsets are stripped so PII/secrets never reach request
    logs.
  • Credentials: apiKey (encrypted) + optional apiBase. Params: categories,
    redact, timeout.
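
For reviewers, the verdict mapping and the findings stripping described above can be sketched roughly as follows. This is a minimal illustration, not the code in check.ts: the `VetoResponse` shape, its field names (`action`, `redactedText`, `match`, `start`, `end`), and the `HandlerResult` type are assumptions made for the example, and the real Veto response and Portkey handler types may differ.

```typescript
// Hypothetical shapes for illustration only; not the actual Veto or Portkey types.
type VetoAction = 'allow' | 'redact' | 'block';

interface VetoFinding {
  category: string;
  rule: string;
  severity: string;
  score: number;
  match?: string; // matched substring (sensitive, must not be logged)
  start?: number; // offsets into the scanned text (also stripped)
  end?: number;
}

interface VetoResponse {
  action: VetoAction;
  redactedText?: string;
  findings?: VetoFinding[];
}

type SafeFinding = Pick<VetoFinding, 'category' | 'rule' | 'severity' | 'score'>;

interface HandlerResult {
  verdict: boolean;
  transformed: boolean;
  transformedText?: string;
  findings: SafeFinding[];
}

// Keep only the fields that are safe to write to request logs.
function sanitizeFindings(findings: VetoFinding[] = []): SafeFinding[] {
  return findings.map(({ category, rule, severity, score }) => ({
    category,
    rule,
    severity,
    score,
  }));
}

// Map a backend response onto a handler-style result.
// Anything unexpected (unknown action, redact without masked text) fails closed.
function mapVerdict(res: VetoResponse): HandlerResult {
  const findings = sanitizeFindings(res.findings);
  const failClosed: HandlerResult = { verdict: false, transformed: false, findings };

  switch (res.action) {
    case 'allow':
      return { verdict: true, transformed: false, findings };
    case 'redact':
      if (typeof res.redactedText !== 'string') return failClosed;
      return {
        verdict: true,
        transformed: true,
        transformedText: res.redactedText,
        findings,
      };
    case 'block':
    default:
      return failClosed;
  }
}
```

Keeping the mapping a pure function of the response makes the three verdicts and the log-safety property easy to unit test without a network mock.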

Tests Run/Test cases added: (required)

npx jest plugins/veto/check.test.ts — 10 passing:

  • allow → verdict true, no transform
  • block → verdict false
  • redact (single part) → verdict true + masked transformedData
  • redact verdict never logs the matched substring
  • multimodal (text + image) → scans text part, masks it
  • multiple text parts + redact → blocked (single redacted blob can't be re-split)
  • gateway non-2xx → fail-closed (verdict false)
  • network throw → fail-closed (verdict false)
  • missing apiKey → fail-closed, no network call
  • empty text → skipped, verdict true, no network call
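
The multimodal and multi-part cases in the list above come down to one rule: a single redacted string can only be written back when there is exactly one text part to receive it. A minimal sketch of that rule, assuming a simplified `ContentPart` union and a hypothetical `applyRedaction` helper (the PR itself goes through the shared getCurrentContentPart / setCurrentContentPart utils, whose signatures are not reproduced here):

```typescript
// Simplified, hypothetical content-part model for illustration.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface RedactOutcome {
  verdict: boolean;
  parts: ContentPart[];
}

// Write the masked text back only when exactly one text part exists.
// With several text parts the single redacted blob cannot be split back
// across them, so the request is blocked and the parts are left untouched.
function applyRedaction(parts: ContentPart[], redactedText: string): RedactOutcome {
  const textCount = parts.filter((p) => p.type === 'text').length;
  if (textCount !== 1) {
    return { verdict: false, parts };
  }
  const masked = parts.map((p): ContentPart =>
    p.type === 'text' ? { ...p, text: redactedText } : p
  );
  return { verdict: true, parts: masked };
}
```

Non-text parts such as images pass through unchanged, which matches the "scans text part, masks it" case.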

Type of Change:

  • New feature (non-breaking change which adds functionality)
