feat: add ai-lakera-guard plugin by janiussyafiq · Pull Request #13570 · apache/apisix

janiussyafiq · 2026-06-18T09:07:34Z

Description

This PR adds a new plugin, ai-lakera-guard, that integrates APISIX with the Lakera Guard v2 /guard API to perform ML-based security scanning of LLM requests at the gateway — prompt injection / jailbreak, PII leakage, content-policy violations, and malicious / unknown links — so each backend LLM service no longer has to implement its own guardrails.

This is PR-1 (input guard MVP) of a planned, independently shippable series (input → output → streaming → observability), modeled closely on ai-aliyun-content-moderation.

How it works

Runs in the access phase at priority 1028, just below ai-proxy (1040) and ai-proxy-multi (1041), so the AI context is already populated. The plugin is meant to run behind one of those proxies; requests that did not pass through ai-proxy/ai-proxy-multi are handled per the configurable fail_mode (default skip — passed through unchecked; set fail_mode: error to reject them with 500).
Extracts the whole request conversation via apisix.plugins.ai-protocols (no role distinction) and sends it to Lakera POST /v2/guard.
On a flagged verdict it applies the configured action:
- block (default) — returns a provider-compatible deny response (a valid chat-completion, or SSE for streaming requests) carrying request_failure_message, built via proto.build_deny_response, so client SDKs render the refusal as a normal completion. The status is deny_code (default 200; set a 4xx to surface blocks as HTTP errors).
- alert — log-only shadow mode; traffic passes through.
Lakera errors / timeouts are governed by fail_open (fail-closed by default).
api_key is secret-managed via encrypt_fields + native $secret:// / $env:// resolution.
reveal_failure_categories optionally appends the matched detectors to the deny message; every flagged verdict logs Lakera's full per-detector breakdown and request_uuid.

Configuration

api_key is the only required field. Others: lakera_endpoint, project_id, direction (input only in this PR), action, fail_open, timeout, ssl_verify, reveal_failure_categories, deny_code, request_failure_message.

Files

Plugin: apisix/plugins/ai-lakera-guard.lua, apisix/plugins/ai-lakera-guard/schema.lua, apisix/plugins/ai-lakera-guard/client.lua
Registration: apisix/cli/config.lua, conf/config.yaml.example
Docs: docs/en/latest/plugins/ai-lakera-guard.md, docs/en/latest/config.json
Tests: t/plugin/ai-lakera-guard.t, t/plugin/ai-lakera-guard-secrets.t, fixtures under t/fixtures/lakera/

Which issue(s) this PR fixes:

Part of #13291

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible (new, opt-in plugin disabled by default; additive registration only)

Add the ai-lakera-guard plugin (PR-1, input guard MVP) integrating APISIX with the Lakera Guard v2 /guard API to scan LLM request prompts for prompt injection, PII, content-policy violations, and malicious/unknown links at the gateway. The plugin runs in the access phase at priority 1028, below ai-proxy / ai-proxy-multi, which it requires. It extracts the whole request conversation via apisix.plugins.ai-protocols and calls Lakera POST /v2/guard. On a flagged verdict it either blocks with a provider-compatible deny response (a valid chat-completion or SSE carrying request_failure_message, returned with deny_code, default 200) or alerts (log-only shadow mode). Lakera errors and timeouts are governed by fail_open (fail-closed by default). The api_key is secret-managed via encrypt_fields and the native $secret:// / $env:// resolution. Signed-off-by: janiussyafiq <izzraff.js@gmail.com>

- Makefile: install apisix/plugins/ai-lakera-guard/*.lua so the luarocks 'diff -rq' check no longer reports the dir as uninstalled - t/admin/plugins.t: add ai-lakera-guard to the priority-ordered expected plugin list (priority 1028, between ai-aliyun-content- moderation 1029 and proxy-mirror 1010)

Handle requests this plugin cannot inspect (no picked ai instance, or an unsupported protocol) via the shared ai-protocols.binding helper and a configurable fail_mode (skip/warn/error, default skip) instead of a hard 500, matching ai-aliyun-content-moderation. This lets non-AI traffic pass through unchecked when the plugin is bound at the Consumer/Service level. fail_mode is distinct from fail_open, which governs Lakera API failures. Also collapse the test routes onto a single route id (overwrite-in-place, grouping default-config tests first) to match the convention used by the sibling AI plugins. - schema: add fail_mode = binding.schema_property("skip") - access: route no-instance / unsupported-protocol through on_unsupported - docs: document fail_mode; clarify non-ai-proxy traffic behavior - t: fail_mode=error (500) and default skip (pass-through) coverage

Copilot

Pull request overview

This PR introduces a new APISIX AI security plugin, ai-lakera-guard, which calls Lakera Guard v2 (/v2/guard) during the access phase to scan LLM request content for unsafe/promotional injection/PII/content-policy issues and either block (default) or alert (shadow mode) based on the verdict.

Changes:

Added the ai-lakera-guard plugin implementation (schema + HTTP client + access-phase enforcement and provider-compatible deny responses).
Registered the plugin in default configs/build install rules and documentation navigation.
Added end-to-end tests (including $secret:// and $env:// api_key resolution) and Lakera response fixtures.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`apisix/plugins/ai-lakera-guard.lua`	Main plugin logic: extract request content via ai-protocols, call Lakera, block/alert, build provider-compatible deny responses.
`apisix/plugins/ai-lakera-guard/client.lua`	Lakera `/v2/guard` HTTP client (request building, timeout/ssl_verify handling, response decoding).
`apisix/plugins/ai-lakera-guard/schema.lua`	Plugin schema, defaults, and secret encryption (`encrypt_fields`).
`apisix/cli/config.lua`	Adds `ai-lakera-guard` to the default CLI plugin list.
`conf/config.yaml.example`	Documents plugin ordering/priority in the example configuration.
`Makefile`	Installs the new plugin directory and Lua files during `make install`.
`docs/en/latest/plugins/ai-lakera-guard.md`	New English plugin documentation page (usage, attributes, examples).
`docs/en/latest/config.json`	Adds `ai-lakera-guard` to the English docs sidebar under AI plugins.
`t/admin/plugins.t`	Adds plugin name to the admin plugin list test coverage.
`t/plugin/ai-lakera-guard.t`	Core behavioral tests: clean/flagged, fail-open/closed, reveal categories, fail_mode behavior, etc.
`t/plugin/ai-lakera-guard-secrets.t`	Tests secret reference and env var resolution for `api_key`.
`t/fixtures/lakera/scan-clean.json`	Fixture for non-flagged Lakera response.
`t/fixtures/lakera/scan-flagged.json`	Fixture for flagged Lakera response with per-detector breakdown.

Comments suppressed due to low confidence (1)

docs/en/latest/config.json:85

This English sidebar adds plugins/ai-lakera-guard, but the Chinese sidebar (docs/zh/latest/config.json) still lists the AI plugins and currently does not include this new entry. If the plugin docs are intended to be discoverable in the zh docs as well, add the corresponding plugins/ai-lakera-guard entry there (and ideally a zh doc page).

            "plugins/ai-proxy",
            "plugins/ai-proxy-multi",
            "plugins/ai-rate-limiting",
            "plugins/ai-prompt-guard",
            "plugins/ai-aws-content-moderation",
            "plugins/ai-aliyun-content-moderation",
            "plugins/ai-lakera-guard",
            "plugins/ai-prompt-decorator",
            "plugins/ai-prompt-template",
            "plugins/ai-rag",
            "plugins/ai-request-rewrite"
          ]

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

api_key is required but the string had no length constraint, so an empty value passed validation and would have sent an empty Authorization header. Add minLength = 1, matching the credential fields in ai-aliyun-content-moderation and ai-proxy.

Translate the ai-lakera-guard plugin page into Chinese and add it to the zh sidebar, mirroring the English version. Code samples are kept identical.

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

membphis · 2026-06-22T03:37:12Z

P1: Preserve Lakera message roles instead of flattening the conversation into one user message

The plugin currently calls proto.extract_request_content(request_tab), concatenates all extracted text, and client.scan sends the result as:

messages = { { role = "user", content = content } }

This loses the original role and turn boundaries. For OpenAI Chat, this can turn system, assistant, historical user, and current user content into one current user message. For Anthropic and Responses requests, the protocol adapters already have role-preserving canonical message helpers, so flattening here bypasses information the codebase can keep.

Why this blocks merge: Lakera Guard's /v2/guard API is message-based, and role/context semantics matter for policy behavior. Sending the system prompt, assistant output, or older history as a new user message can block valid follow-up requests because old or non-user content is rescanned as the current user input. It can also make the gateway's enforcement differ from the API contract this plugin is integrating with.

Suggested fix:

Pass a messages array to client.scan, not a flattened string.
Build it from the protocol-normalized message helper, preserving system, user, and assistant roles where available.
Only fall back to one user message when the protocol has no role-preserving representation.
Update the "whole conversation is scanned" test to verify the full message array is sent without converting history/system/assistant messages into the latest user input.

Address review feedback on the input-guard MVP: - Forward the role-tagged conversation to Lakera via proto.get_messages instead of flattening it into one user message. Normalize each message's content to text and drop non-text parts so multimodal requests stay within Lakera /v2/guard's text-only contract; fall back to a single user message only when a protocol has no role-preserving representation. - Guard the nil return from get_json_request_body_table() and route it through binding.on_unsupported so fail_mode is honored. - Clarify in the schema and the en/zh docs that action=alert governs flagged verdicts only; Lakera API errors stay controlled by fail_open. - Update the conversation test to assert roles reach Lakera unflattened.

- Decode the Lakera response with null_as_nil and guard the result by type, so a JSON null (e.g. "metadata": null) cannot surface the truthy cjson.null sentinel and error when indexed. - Stop logging the Authorization header in the test mocks so the api key / resolved secret is never written to CI logs. - Strengthen the role-preservation test to assert each role is paired with its own content, not just that the role labels are present.

janiussyafiq · 2026-06-22T07:36:56Z

Reply to @membphis :

client.scan now receives and forwards a role-tagged messages array instead of a concatenated string.
The array is built from proto.get_messages() — the protocol's canonical {role, content} helper — so system/user/assistant turns are preserved for openai-chat, Anthropic and Responses.
Each message's content is coerced to text and non-text parts (e.g. multimodal image_url) are dropped. Lakera /v2/guard rejects non-text content with HTTP 400, which under the default fail-closed mode would otherwise block legitimate multimodal requests.
Falls back to a single user message only when a protocol exposes no role-preserving representation.
The "whole conversation" test now asserts the full role-tagged array reaches Lakera, with each role paired to its own content.

This also matches how other gateways integrate Lakera.

dosubot Bot added enhancement New feature or request plugin size:XXL This PR changes 1000+ lines, ignoring generated files. labels Jun 18, 2026

nic-6443 reviewed Jun 18, 2026

View reviewed changes

Comment thread apisix/plugins/ai-lakera-guard.lua

janiussyafiq added 2 commits June 19, 2026 07:19

nic-6443 previously approved these changes Jun 20, 2026

View reviewed changes

shreemaan-abhishek requested a review from Copilot June 22, 2026 01:19

Copilot started reviewing on behalf of shreemaan-abhishek June 22, 2026 01:19 View session

Copilot AI reviewed Jun 22, 2026

View reviewed changes

Comment thread apisix/plugins/ai-lakera-guard/schema.lua

Comment thread apisix/plugins/ai-lakera-guard/schema.lua

janiussyafiq added 2 commits June 22, 2026 10:28

docs(ai-lakera-guard): add Chinese translation

e164ebb

Translate the ai-lakera-guard plugin page into Chinese and add it to the zh sidebar, mirroring the English version. Code samples are kept identical.

janiussyafiq dismissed nic-6443’s stale review via e164ebb June 22, 2026 02:28

nic-6443 previously approved these changes Jun 22, 2026

View reviewed changes

janiussyafiq requested a review from Copilot June 22, 2026 03:29

Copilot started reviewing on behalf of janiussyafiq June 22, 2026 03:32 View session

Copilot AI reviewed Jun 22, 2026

View reviewed changes

Comment thread docs/en/latest/plugins/ai-lakera-guard.md

Comment thread docs/zh/latest/plugins/ai-lakera-guard.md

Comment thread apisix/plugins/ai-lakera-guard/schema.lua

Comment thread apisix/plugins/ai-lakera-guard.lua Outdated

janiussyafiq added 2 commits June 22, 2026 14:53

janiussyafiq dismissed nic-6443’s stale review via 84d950f June 22, 2026 07:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add ai-lakera-guard plugin#13570

feat: add ai-lakera-guard plugin#13570
janiussyafiq wants to merge 7 commits into
apache:masterfrom
janiussyafiq:feat/ai-lakera-guard-pr1

janiussyafiq commented Jun 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

membphis commented Jun 22, 2026

Uh oh!

janiussyafiq commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

janiussyafiq commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How it works

Configuration

Files

Which issue(s) this PR fixes:

Checklist

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

membphis commented Jun 22, 2026

P1: Preserve Lakera message roles instead of flattening the conversation into one user message

Uh oh!

janiussyafiq commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

janiussyafiq commented Jun 18, 2026 •

edited

Loading