
Improve prompt injection for Python #21641

Open
josefs wants to merge 5 commits into main from josefs/promptInjectionImprovements

Conversation


@josefs josefs commented Apr 2, 2026

I have a few repos where I'd like the prompt injection query to trigger, and I've verified that it at least finds new sources for these.

For more info on these repos, see:
https://github.com/dsp-testing/xpi-000

@github-actions github-actions Bot added the Python label Apr 2, 2026
@josefs josefs requested review from mbaluda and yoff April 2, 2026 16:47
Comment thread on python/ql/src/experimental/semmle/python/frameworks/OpenAI.qll (marked Fixed)
@josefs josefs added the no-change-note-required This PR does not need a change note label Apr 2, 2026
mbaluda (Contributor) left a comment

Please add test cases for the Anthropic models

@@ -20,7 +20,7 @@ async def get_input_openai():

     response2 = client.responses.create(
         instructions="Talks like a " + persona, # $ Alert[py/prompt-injection]
-        input=[
+        input=[ # $ Alert[py/prompt-injection]
Contributor reply:

Originally, the idea was to avoid duplicate alerts like this one (the flow is already reported for content); that is why we have that logic in getContentNode().
Can you add a test if that is not sufficient?
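For readers outside the thread, the duplicate-alert scenario being discussed can be sketched roughly as follows (hypothetical helper, not code from the PR): tainted data reaches both the "content" value inside a message dict and the enclosing list passed as input=, so flagging both would report the same flow twice.

```python
# Hypothetical sketch (not from the PR) of the duplicate-alert scenario:
# the tainted string is a sink candidate on its own, and the list that
# wraps it is a sink candidate too. Reporting both would flag one tainted
# flow twice, which is why the sink logic singles out the content node.

def build_input(persona):
    content = "Talks like a " + persona  # tainted string (inner candidate)
    return [{"role": "user", "content": content}]  # same taint via the outer list
```

A query that reports both the list argument and the content value inside it would produce two alerts for the single tainted `persona` flow.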

yoff previously approved these changes Apr 7, 2026
yoff (Contributor) left a comment

LGTM so far. I assume you will take it out of draft when you want a final review.

@josefs josefs force-pushed the josefs/promptInjectionImprovements branch from 0208d67 to 25a8aa9 on April 28, 2026 17:25
josefs (Author) commented Apr 28, 2026

Apologies for letting this PR linger.
I've removed the code that changed the prompt injection query. I deemed it too complicated for too little benefit.
I've also added tests for the Anthropic models.

@josefs josefs marked this pull request as ready for review April 28, 2026 21:31
@josefs josefs requested a review from a team as a code owner April 28, 2026 21:31
Copilot AI review requested due to automatic review settings April 28, 2026 21:31
Copilot AI left a comment

Pull request overview

This PR extends Python prompt-injection modeling and tests to cover additional LLM SDK call patterns (OpenAI responses + chat.completions, and Anthropic messages APIs), ensuring the query flags user-controlled data flowing into these prompt construction sinks.

Changes:

  • Added new OpenAI prompt-injection sinks for chat.completions.create(messages[].content) and responses.create(input/instructions).
  • Introduced Anthropic prompt-injection sink modeling (system prompts + message content) plus corresponding type modeling.
  • Expanded the CWE-1427 PromptInjection query test suite and updated expected results accordingly.
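The Anthropic test cases themselves are not shown in this thread. A rough sketch of what one might look like follows (hypothetical function and model names; in the CodeQL test suite the SDK is stubbed, and the `$ Alert` comments are inline-test annotations marking expected results, not executable code):

```python
# Hypothetical sketch of an Anthropic prompt-injection test case; the
# actual tests live in anthropic_test.py. The "$ Alert" comments mark
# where the py/prompt-injection query is expected to report.

def anthropic_sinks(client, untrusted):
    # untrusted: user-controlled data (e.g. from input() or a web request);
    # client would be anthropic.Anthropic() in the real test file.
    return client.messages.create(
        model="claude-3-5-sonnet-latest",  # hypothetical model name
        max_tokens=100,
        system="Talks like a " + untrusted,  # $ Alert[py/prompt-injection]
        messages=[
            {"role": "user", "content": untrusted},  # $ Alert[py/prompt-injection]
        ],
    )
```

Both the system prompt and the per-message content are modeled as sinks, matching the two sink kinds listed in the change summary.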
Summary per file:

  • python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py: Adds an additional alert annotation to validate responses.create(input=[...]) modeling.
  • python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/anthropic_test.py: New test coverage for Anthropic SDK prompt sinks (system, messages[].content) across sync/async/beta APIs.
  • python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/PromptInjection.expected: Updates expected results to include the new Anthropic/OpenAI sink findings and paths.
  • python/ql/lib/semmle/python/frameworks/openai.model.yml: Adds OpenAI sink models for chat completions message content and responses API inputs/instructions.
  • python/ql/lib/semmle/python/frameworks/anthropic.model.yml: New Anthropic sink and type models to support prompt-injection detection.
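The model files are not quoted in the thread. As a purely hypothetical sketch of what a Models-as-Data sink entry in anthropic.model.yml might look like (the exact type and access-path syntax should be checked against the actual file in the PR):

```yaml
# Hypothetical sketch of a Models-as-Data sink entry; consult the real
# anthropic.model.yml in this PR for the exact type and path syntax.
extensions:
  - addsTo:
      pack: codeql/python-all
      extensible: sinkModel
    data:
      # system prompt and message content flowing into messages.create
      - ["anthropic", "Member[Anthropic].Call.Member[messages].Member[create].Argument[system:]", "prompt-injection"]
      - ["anthropic", "Member[Anthropic].Call.Member[messages].Member[create].Argument[messages:].ListElement.DictionaryElement[content]", "prompt-injection"]
```

Modeling sinks as data extensions rather than QL classes keeps the framework coverage declarative and lets the query pick up the new sinks without code changes.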

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 0


Labels

no-change-note-required (This PR does not need a change note), Python


5 participants