fix: unquote SQL identifiers in VISUALIZE column references #450
cpsievert wants to merge 1 commit into
Quoted identifiers like "variable.dotted" were stored with their surrounding double-quotes intact, causing validation to fail when matching against Arrow schema column names (which are unquoted). Promotes the private unquote helper from stat_aggregate to a public naming::unquote_ident and applies it at parse time in builder.rs. Fixes posit-dev/ggsql-python#3
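For readers who want a concrete picture of the helper pair, here is a minimal sketch. It is not the code from this PR: the function bodies are assumptions, in particular the SQL-style rule that an embedded double-quote is escaped by doubling it. Only the names quote_ident and unquote_ident come from the description above.

```rust
/// Hypothetical sketch of `quote_ident`: wrap a name in double quotes,
/// doubling any embedded double-quote (assumed SQL-style escaping).
pub fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

/// Hypothetical sketch of `unquote_ident`: strip one pair of surrounding
/// double quotes and collapse doubled quotes. Input that is not wrapped
/// in quotes is returned unchanged.
pub fn unquote_ident(ident: &str) -> String {
    match ident
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
    {
        Some(inner) => inner.replace("\"\"", "\""),
        None => ident.to_string(),
    }
}
```

Under these assumptions, unquote_ident(&quote_ident(name)) returns name for any input, which is the roundtrip property the test plan mentions. The stored (unquoted) name would still need to pass through quote_ident wherever it is written into emitted SQL.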
I haven’t had a chance to look at this properly, but I wanted to drop a quick drive-by comment: we must be careful to continue to quote identifiers in emitted SQL queries in order to support the Snowflake engine, where unquoted identifiers are automatically converted to uppercase (I know…). This PR might not touch that, but I just want to be sure that affected columns remain quoted in the SQL queries we emit. I think dots in identifiers are always going to be flaky here, since SQL uses dots to separate database, schema and table names. So, for example,
Summary
Columns with dots (or other special characters) in their names fail when used in VISUALIZE mappings with quoted identifiers, e.g. "variable.dotted" AS y. The parser was storing the raw source text including the surrounding double-quotes, so validation couldn't match them against the actual (unquoted) Arrow schema column names.
- Promote the private unquote helper from stat_aggregate.rs to naming::unquote_ident (inverse of quote_ident)
- Apply it at parse time in builder.rs for both explicit and implicit column mappings
Fixes posit-dev/ggsql-python#3
Repro
Before:
Validation error: Layer 1: aesthetic 'pos2' references non-existent column '"variable.dotted"'
After: renders successfully
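To illustrate why the lookup fails before the fix, here is a small sketch of the mismatch. It is not the project's validator: check_column and normalize are hypothetical names, and a plain slice of strings stands in for the Arrow schema's column names.

```rust
/// Hypothetical stand-in for the schema check. `columns` models the
/// (unquoted) Arrow column names; the real validator is not shown here.
pub fn check_column(aesthetic: &str, column: &str, columns: &[&str]) -> Result<(), String> {
    if columns.contains(&column) {
        Ok(())
    } else {
        Err(format!(
            "aesthetic '{}' references non-existent column '{}'",
            aesthetic, column
        ))
    }
}

/// Hypothetical parse-time normalization: drop one pair of surrounding
/// double quotes if present, otherwise return the text unchanged.
/// (Doubled embedded quotes are ignored in this simplified version.)
pub fn normalize(raw: &str) -> &str {
    raw.strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
        .unwrap_or(raw)
}
```

With a column list containing variable.dotted, passing the raw source text "variable.dotted" (quotes included) produces the error, while passing normalize(raw) finds the column.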
Test plan
- unquote_ident (basic cases + quote/unquote roundtrip)
- cargo test --lib naming (32 passed)