fix(mcp): surface validation errors in generate_chart instead of empty response #39522
`ValidationPipeline.validate_request_with_warnings` catches inner exceptions from `parse_chart_config` and routes them through `ChartErrorBuilder.build_error(template_key="validation_error", ...)`, but that template was never registered in `ChartErrorBuilder.TEMPLATES`. The builder fell back to its hardcoded `"An error occurred"` message with empty details and no suggestions, which made failures look silent to LLM callers.

The most visible trigger was a `mixed_timeseries` config using the XY field names `kind` / `kind_secondary` instead of `primary_kind` / `secondary_kind`: the Pydantic `ValueError` was caught, then rendered as a blank error.

- Add the missing `validation_error` template so the sanitized reason lands in the response message, details, and suggestions.
- Strip Pydantic's `tagged-union[...]` header and `errors.pydantic.dev` footer in `_sanitize_validation_error` so the 200-char truncation doesn't swallow the actionable `Value error, Unknown field ...` body.
- Add regression tests covering the template presence, the `mixed_timeseries` wrong-field-name case, and a positive control.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
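To make the failure mode concrete, here is a minimal sketch of a template registry with a hardcoded fallback. It is not the actual `ChartErrorBuilder` code: `ErrorTemplate`, the module-level `TEMPLATES` dict, the `dataset_not_found` entry, and the free function `build_error` are all hypothetical stand-ins. Only the `"An error occurred"` fallback text and the `Chart configuration is invalid: ...` message prefix are taken from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorTemplate:
    """Hypothetical template shape; the real builder may store this differently."""

    message: str
    details: str = ""
    suggestions: list[str] = field(default_factory=list)


# Hypothetical registry standing in for ChartErrorBuilder.TEMPLATES.
# Note that "validation_error" is not registered yet.
TEMPLATES: dict[str, ErrorTemplate] = {
    "dataset_not_found": ErrorTemplate(
        message="Dataset {dataset_id} was not found",
        details="No dataset with id {dataset_id}",
        suggestions=["List the available datasets first"],
    ),
}


def build_error(template_key: str, **template_vars: str) -> dict:
    template = TEMPLATES.get(template_key)
    if template is None:
        # Unregistered key: the caller's template_vars are silently dropped.
        return {"message": "An error occurred", "details": "", "suggestions": []}
    return {
        "message": template.message.format(**template_vars),
        "details": template.details.format(**template_vars),
        "suggestions": list(template.suggestions),
    }


reason = "Value error, Unknown field 'kind'"

# Before the fix: the key is missing, so the reason never reaches the caller.
before = build_error("validation_error", reason=reason)

# The fix, in spirit: register the template so the reason is rendered.
TEMPLATES["validation_error"] = ErrorTemplate(
    message="Chart configuration is invalid: {reason}",
    details="{reason}",
    suggestions=["Review the field names and types in your config"],
)
after = build_error("validation_error", reason=reason)
```

In this sketch `before` carries only the generic message, while `after` carries the sanitized reason in both `message` and `details`, which mirrors the before/after behavior the commit describes.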
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #39522 +/- ##
==========================================
+ Coverage 64.44% 64.52% +0.07%
==========================================
Files 2560 2562 +2
Lines 133574 133413 -161
Branches 31017 30986 -31
==========================================
- Hits 86082 86081 -1
+ Misses 45999 45840 -159
+ Partials 1493 1492 -1
Addresses review feedback on #39522:

1. ``_sanitize_validation_error`` only stripped the Pydantic ``tagged-union[...]`` header when the body began with ``Value error,``. Other common Pydantic failure bodies (``Input should be ...`` from literal enums, ``String should match pattern ...``, etc.) still kept the long header and got truncated before the actionable part. Match the first two-space-indented body line instead, which is Pydantic's universal separator between the tagged-union header and the per-field message. Works for every body style without a keyword allowlist.
2. The ``validation_error`` template fires for every chart type, so the ``primary_kind`` / ``secondary_kind`` suggestion was misleading for non-mixed_timeseries failures. Replace it with chart-type-agnostic guidance that points callers at the valid-fields list already rendered in the error details.

Add regression tests for the non-``Value error`` body case (invalid aggregate enum on a pie chart) and for the suggestion's chart-type agnosticism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
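The header-stripping idea from item 1 can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the `SAMPLE` string is hand-written to resemble a Pydantic tagged-union error (the class names, location line, and elided URL are made up), `sanitize` is a hypothetical stand-in for `_sanitize_validation_error`, and keeping only the first body line is a simplification. The 200-character limit and the two-space / four-space indentation rule come from the commit message and review diff.

```python
import re

# Assumed shape of a Pydantic error string for a tagged-union model;
# hand-written for illustration, not captured from a real run.
SAMPLE = (
    "1 validation error for tagged-union[XYChartConfig,PieChartConfig]\n"
    "pie.metric.aggregate\n"
    "  Input should be 'SUM', 'COUNT' or 'AVG' [type=literal_error]\n"
    "    For further information visit https://errors.pydantic.dev/..."
)

TAGGED_UNION_MARKER = "tagged-union["
# A line indented by exactly two spaces is the per-field message.
# The negative lookahead rejects the four-space-indented footer.
BODY_LINE = re.compile(r"^ {2}(?! )(.+)$", re.MULTILINE)
MAX_LEN = 200


def sanitize(error_str: str) -> str:
    """Keep the actionable body line, then truncate to MAX_LEN characters."""
    if TAGGED_UNION_MARKER in error_str:
        match = BODY_LINE.search(error_str)
        if match:
            error_str = match.group(1)
    return error_str[:MAX_LEN]


cleaned = sanitize(SAMPLE)
```

Because the match keys on indentation rather than on a prefix such as `Value error,`, the same code handles `Input should be ...` and `String should match pattern ...` bodies without a keyword allowlist.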
def _sanitize_validation_error(error: Exception) -> str:
    """SECURITY FIX: Sanitize validation errors to prevent disclosure."""
    import re
Should this be a module-level import? Function-level imports do seem to be the pattern in this file, though, so I'm not sure.
Good point, I'm going to move it to the top of the module, following standard Python convention.
    # below doesn't swallow the actionable part. The pydantic footer
    # ``\n For further information ...`` uses four-space indent and
    # is dropped here.
    if "tagged-union[" in error_str:
Is this string a constant? If it is, maybe we should assign it to a named variable.
Good catch, replaced!
ebb3dd4 to a850bf7
SUMMARY
`generate_chart` with a `mixed_timeseries` config that used the XY field names `kind` / `kind_secondary` (instead of `primary_kind` / `secondary_kind`) returned a response with `success: false` but `message: "An error occurred"`, empty `details`, and no `suggestions`: a silent-looking failure for LLM callers.

Root cause: `ValidationPipeline.validate_request_with_warnings` catches the `ValueError` raised by `parse_chart_config` and routes it through `ChartErrorBuilder.build_error(template_key="validation_error", ...)`, but `"validation_error"` was never registered in `ChartErrorBuilder.TEMPLATES`. The builder fell back to its hardcoded defaults (`"An error occurred"`, empty details, no suggestions) and the sanitized reason was discarded.

The same code path affects every chart type when `parse_chart_config` raises inside the pipeline; `mixed_timeseries` just made it easy to hit because the field-name mismatch is a common LLM miscue.

BEFORE/AFTER

Before

    {
      "success": false,
      "error": {
        "error_type": "validation_system_error",
        "message": "An error occurred",
        "details": "",
        "suggestions": [],
        "error_code": "VALIDATION_PIPELINE_ERROR"
      }
    }

After

    {
      "success": false,
      "error": {
        "error_type": "validation_system_error",
        "message": "Chart configuration is invalid: Value error, Unknown field 'kind'. Valid fields: ... primary_kind ... secondary_kind ...",
        "details": "Value error, Unknown field 'kind'. Valid fields: ... | Unknown field 'kind_secondary' — did you mean 'y_secondary'?",
        "suggestions": [
          "Review the field names and types in your config against the chart_type's schema",
          "Call get_chart_type_schema or read the chart://configs resource for valid fields and examples",
          "For mixed_timeseries charts, use 'primary_kind' and 'secondary_kind' (not 'kind' / 'kind_secondary')"
        ],
        "error_code": "VALIDATION_PIPELINE_ERROR"
      }
    }

TESTING INSTRUCTIONS
Run `pytest tests/unit_tests/mcp_service/chart/validation/test_pipeline_error_surface.py -v`
Run this prompt:
Call the generate_chart MCP tool with these exact arguments:
    {
      "dataset_id": 3,
      "config": {
        "chart_type": "mixed_timeseries",
        "x": {"name": "order_date"},
        "y": [{"name": "revenue", "aggregate": "SUM"}],
        "y_secondary": [{"name": "order_id", "aggregate": "COUNT"}],
        "kind": "line",
        "kind_secondary": "bar"
      }
    }

Paste the raw tool response verbatim: do not summarize, do not interpret, do not drop fields.
Before: generic empty error. After: actionable error naming the invalid fields and pointing to primary_kind / secondary_kind.
After: "message":"Chart configuration is invalid: Value error, Unknown field 'kind'. Valid fields: chart_type, dimension, filters, group_by, group_by_secondary, groupby, groupby_b, groupby_secondary, metrics, metrics_b, primary_kind, row_limit,",
"details":"Value error, Unknown field 'kind'. Valid fields: chart_type, dimension, filters, group_by, group_by_secondary, groupby, groupby_b, groupby_secondary, metrics, metrics_b, primary_kind, row_limit"
Known cosmetic follow-up
Template variables are HTML-escaped by `_sanitize_user_input`, so literal apostrophes render as their HTML entity in the JSON. This is pre-existing behavior across all error templates and out of scope here.
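As an illustration of the escaping effect described above, the standard library's `html.escape` turns a literal apostrophe into an entity. Whether `_sanitize_user_input` actually calls `html.escape` is an assumption here; the sketch only shows what HTML-escaping does to the kind of message this PR surfaces, and how a client could reverse it.

```python
import html

reason = "Unknown field 'kind'"

# quote=True (the default) escapes both ' and " in addition to &, <, >.
escaped = html.escape(reason)

# A caller that wants the readable form back can unescape on its side.
restored = html.unescape(escaped)
```

Here `escaped` is `Unknown field &#x27;kind&#x27;` and `restored` equals the original string, so the escaping is cosmetic rather than lossy.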
ADDITIONAL INFORMATION