feat: add ai-lakera-guard plugin#13570
Conversation
Add the ai-lakera-guard plugin (PR-1, input guard MVP) integrating APISIX with the Lakera Guard v2 /guard API to scan LLM request prompts for prompt injection, PII, content-policy violations, and malicious/unknown links at the gateway. The plugin runs in the access phase at priority 1028, below ai-proxy / ai-proxy-multi, which it requires. It extracts the whole request conversation via apisix.plugins.ai-protocols and calls Lakera POST /v2/guard. On a flagged verdict it either blocks with a provider-compatible deny response (a valid chat-completion or SSE carrying request_failure_message, returned with deny_code, default 200) or alerts (log-only shadow mode). Lakera errors and timeouts are governed by fail_open (fail-closed by default). The api_key is secret-managed via encrypt_fields and the native $secret:// / $env:// resolution. Signed-off-by: janiussyafiq <izzraff.js@gmail.com>
- Makefile: install apisix/plugins/ai-lakera-guard/*.lua so the luarocks 'diff -rq' check no longer reports the dir as uninstalled - t/admin/plugins.t: add ai-lakera-guard to the priority-ordered expected plugin list (priority 1028, between ai-aliyun-content- moderation 1029 and proxy-mirror 1010)
Handle requests this plugin cannot inspect (no picked ai instance, or an
unsupported protocol) via the shared ai-protocols.binding helper and a
configurable fail_mode (skip/warn/error, default skip) instead of a hard
500, matching ai-aliyun-content-moderation. This lets non-AI traffic pass
through unchecked when the plugin is bound at the Consumer/Service level.
fail_mode is distinct from fail_open, which governs Lakera API failures.
Also collapse the test routes onto a single route id (overwrite-in-place,
grouping default-config tests first) to match the convention used by the
sibling AI plugins.
- schema: add fail_mode = binding.schema_property("skip")
- access: route no-instance / unsupported-protocol through on_unsupported
- docs: document fail_mode; clarify non-ai-proxy traffic behavior
- t: fail_mode=error (500) and default skip (pass-through) coverage
There was a problem hiding this comment.
Pull request overview
This PR introduces a new APISIX AI security plugin, ai-lakera-guard, which calls Lakera Guard v2 (/v2/guard) during the access phase to scan LLM request content for unsafe/promotional injection/PII/content-policy issues and either block (default) or alert (shadow mode) based on the verdict.
Changes:
- Added the
ai-lakera-guardplugin implementation (schema + HTTP client + access-phase enforcement and provider-compatible deny responses). - Registered the plugin in default configs/build install rules and documentation navigation.
- Added end-to-end tests (including
$secret://and$env://api_key resolution) and Lakera response fixtures.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
apisix/plugins/ai-lakera-guard.lua |
Main plugin logic: extract request content via ai-protocols, call Lakera, block/alert, build provider-compatible deny responses. |
apisix/plugins/ai-lakera-guard/client.lua |
Lakera /v2/guard HTTP client (request building, timeout/ssl_verify handling, response decoding). |
apisix/plugins/ai-lakera-guard/schema.lua |
Plugin schema, defaults, and secret encryption (encrypt_fields). |
apisix/cli/config.lua |
Adds ai-lakera-guard to the default CLI plugin list. |
conf/config.yaml.example |
Documents plugin ordering/priority in the example configuration. |
Makefile |
Installs the new plugin directory and Lua files during make install. |
docs/en/latest/plugins/ai-lakera-guard.md |
New English plugin documentation page (usage, attributes, examples). |
docs/en/latest/config.json |
Adds ai-lakera-guard to the English docs sidebar under AI plugins. |
t/admin/plugins.t |
Adds plugin name to the admin plugin list test coverage. |
t/plugin/ai-lakera-guard.t |
Core behavioral tests: clean/flagged, fail-open/closed, reveal categories, fail_mode behavior, etc. |
t/plugin/ai-lakera-guard-secrets.t |
Tests secret reference and env var resolution for api_key. |
t/fixtures/lakera/scan-clean.json |
Fixture for non-flagged Lakera response. |
t/fixtures/lakera/scan-flagged.json |
Fixture for flagged Lakera response with per-detector breakdown. |
Comments suppressed due to low confidence (1)
docs/en/latest/config.json:85
- This English sidebar adds
plugins/ai-lakera-guard, but the Chinese sidebar (docs/zh/latest/config.json) still lists the AI plugins and currently does not include this new entry. If the plugin docs are intended to be discoverable in the zh docs as well, add the correspondingplugins/ai-lakera-guardentry there (and ideally a zh doc page).
"plugins/ai-proxy",
"plugins/ai-proxy-multi",
"plugins/ai-rate-limiting",
"plugins/ai-prompt-guard",
"plugins/ai-aws-content-moderation",
"plugins/ai-aliyun-content-moderation",
"plugins/ai-lakera-guard",
"plugins/ai-prompt-decorator",
"plugins/ai-prompt-template",
"plugins/ai-rag",
"plugins/ai-request-rewrite"
]
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
api_key is required but the string had no length constraint, so an empty value passed validation and would have sent an empty Authorization header. Add minLength = 1, matching the credential fields in ai-aliyun-content-moderation and ai-proxy.
Translate the ai-lakera-guard plugin page into Chinese and add it to the zh sidebar, mirroring the English version. Code samples are kept identical.
P1: Preserve Lakera message roles instead of flattening the conversation into one user messageThe plugin currently calls messages = { { role = "user", content = content } }This loses the original role and turn boundaries. For OpenAI Chat, this can turn system, assistant, historical user, and current user content into one current Why this blocks merge: Lakera Guard's Suggested fix:
|
Address review feedback on the input-guard MVP: - Forward the role-tagged conversation to Lakera via proto.get_messages instead of flattening it into one user message. Normalize each message's content to text and drop non-text parts so multimodal requests stay within Lakera /v2/guard's text-only contract; fall back to a single user message only when a protocol has no role-preserving representation. - Guard the nil return from get_json_request_body_table() and route it through binding.on_unsupported so fail_mode is honored. - Clarify in the schema and the en/zh docs that action=alert governs flagged verdicts only; Lakera API errors stay controlled by fail_open. - Update the conversation test to assert roles reach Lakera unflattened.
- Decode the Lakera response with null_as_nil and guard the result by type, so a JSON null (e.g. "metadata": null) cannot surface the truthy cjson.null sentinel and error when indexed. - Stop logging the Authorization header in the test mocks so the api key / resolved secret is never written to CI logs. - Strengthen the role-preservation test to assert each role is paired with its own content, not just that the role labels are present.
|
Reply to @membphis :
This also matches how other gateways integrate Lakera. |
Description
This PR adds a new plugin,
ai-lakera-guard, that integrates APISIX with the Lakera Guard v2/guardAPI to perform ML-based security scanning of LLM requests at the gateway — prompt injection / jailbreak, PII leakage, content-policy violations, and malicious / unknown links — so each backend LLM service no longer has to implement its own guardrails.This is PR-1 (input guard MVP) of a planned, independently shippable series (input → output → streaming → observability), modeled closely on
ai-aliyun-content-moderation.How it works
accessphase at priority 1028, just belowai-proxy(1040) andai-proxy-multi(1041), so the AI context is already populated. The plugin is meant to run behind one of those proxies; requests that did not pass throughai-proxy/ai-proxy-multiare handled per the configurablefail_mode(defaultskip— passed through unchecked; setfail_mode: errorto reject them with500).apisix.plugins.ai-protocols(no role distinction) and sends it to LakeraPOST /v2/guard.action:block(default) — returns a provider-compatible deny response (a valid chat-completion, or SSE for streaming requests) carryingrequest_failure_message, built viaproto.build_deny_response, so client SDKs render the refusal as a normal completion. The status isdeny_code(default200; set a 4xx to surface blocks as HTTP errors).alert— log-only shadow mode; traffic passes through.fail_open(fail-closed by default).api_keyis secret-managed viaencrypt_fields+ native$secret:///$env://resolution.reveal_failure_categoriesoptionally appends the matched detectors to the deny message; every flagged verdict logs Lakera's full per-detector breakdown andrequest_uuid.Configuration
api_keyis the only required field. Others:lakera_endpoint,project_id,direction(inputonly in this PR),action,fail_open,timeout,ssl_verify,reveal_failure_categories,deny_code,request_failure_message.Files
apisix/plugins/ai-lakera-guard.lua,apisix/plugins/ai-lakera-guard/schema.lua,apisix/plugins/ai-lakera-guard/client.luaapisix/cli/config.lua,conf/config.yaml.exampledocs/en/latest/plugins/ai-lakera-guard.md,docs/en/latest/config.jsont/plugin/ai-lakera-guard.t,t/plugin/ai-lakera-guard-secrets.t, fixtures undert/fixtures/lakera/Which issue(s) this PR fixes:
Part of #13291
Checklist