fix(HuggingFaceLocalGenerator): remove stop_words cross-product in reply post-processing #11502
Open
alvinttang wants to merge 1 commit into
…ply post-processing With N replies and M stop_words, the previous nested-comprehension produced N*M replies instead of N. Half of the extra replies still contained the stop word because each iteration only stripped one. Switching to a sequential loop (already what the chat sibling at chat/hugging_face_local.py:660 does) keeps the count at N and removes every stop word from every reply. Refs deepset-ai#11409
Someone is attempting to deploy a commit to the deepset Team on Vercel. A member of the Team first needs to authorize it.
alvinttang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
anakin87 (Member) requested changes on Jun 4, 2026 and left a comment:
Thank you for this PR.
Please sign the CLA, then ping me and I'll proceed with the actual review.
Refs #11409.
HuggingFaceLocalGenerator.run post-processes replies with a nested list comprehension:
That's a cross-product. With N replies and M stop words it emits N*M replies, and only every M-th one has every stop word removed. Half the output silently still contains a stop word.
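For illustration, the difference between the two shapes can be reproduced in isolation. This is a minimal sketch, not the Haystack source: the helper names strip_nested and strip_sequential are hypothetical, and the use of str.replace plus rstrip is an assumption about how stop words are stripped.

```python
from typing import List


def strip_nested(replies: List[str], stop_words: List[str]) -> List[str]:
    # Hypothetical reconstruction of the buggy shape: one output per
    # (reply, stop_word) pair, each with only a single stop word removed.
    return [reply.replace(stop_word, "").rstrip() for reply in replies for stop_word in stop_words]


def strip_sequential(replies: List[str], stop_words: List[str]) -> List[str]:
    # Sequential shape: every stop word is removed from the same reply,
    # so the output has exactly one entry per input reply.
    cleaned = []
    for reply in replies:
        for stop_word in stop_words:
            reply = reply.replace(stop_word, "")
        cleaned.append(reply.rstrip())
    return cleaned


replies = ["Paris is the capital. STOP", "Berlin END"]
stop_words = ["STOP", "END"]

nested = strip_nested(replies, stop_words)
sequential = strip_sequential(replies, stop_words)

print(len(nested), nested)
print(len(sequential), sequential)
```

With two replies and two stop words, the nested version returns four strings, two of which still end in a stop word, while the sequential version returns two clean strings.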
The chat sibling at chat/hugging_face_local.py:660 already does this correctly with a sequential loop, so this PR aligns the non-chat path with the same pattern.
RED
Two new regression tests on main:
The original test_run_stop_words_removal (single stop word) keeps passing.
GREEN
Full file:
28 passed, 1 deselected (integration, model download), 6 warnings in 140.60s. No regression elsewhere.