fix: convert Arrow JSON to Lance JSON in single-fragment create path #7469

Open
xloya wants to merge 1 commit into lance-format:main from xloya:upstream-pr/json-fragment-create

xloya commented Jun 25, 2026

Contributor

Problem

Creating a single fragment via LanceFragment.create / FragmentCreateBuilder with an Arrow JSON column (arrow.json, stored as Utf8) writes raw UTF-8 bytes into a column whose schema declares Lance JSON (JSONB / LargeBinary), corrupting subsequent reads.

Root cause

The multi-fragment and dataset write paths run the Arrow JSON -> Lance JSON conversion via do_write_fragments. The single-fragment create path skipped it.

Fix

Run the same conversion in the create path via SchemaAdapter::to_physical_stream.
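
The root cause and fix can be pictured with a toy model. This is only a sketch under stated assumptions, not the Lance implementation: the one-byte tag is a hypothetical stand-in for a binary JSON encoding (it is not Lance's real JSONB layout), and every function here, including the Python to_physical_stream, is illustrative and merely borrows its name from the Rust method mentioned above.

```python
import json
from typing import Iterable, Iterator, List

# Hypothetical marker byte standing in for a binary JSON encoding.
# This is NOT Lance's real JSONB layout; it only makes "physical" bytes
# distinguishable from raw UTF-8 text in this toy model.
PHYSICAL_TAG = b"\x01"


def to_physical(value: str) -> bytes:
    """Convert one logical JSON string into the toy physical encoding."""
    parsed = json.loads(value)  # rejects invalid JSON up front
    compact = json.dumps(parsed, separators=(",", ":"))
    return PHYSICAL_TAG + compact.encode("utf-8")


def to_physical_stream(batches: Iterable[List[str]]) -> Iterator[List[bytes]]:
    """Toy analogue of the conversion step: convert each batch lazily."""
    for batch in batches:
        yield [to_physical(v) for v in batch]


def write_without_conversion(batches: Iterable[List[str]]) -> List[List[bytes]]:
    """Buggy path: stores raw UTF-8 under a schema that promises physical JSON."""
    return [[v.encode("utf-8") for v in batch] for batch in batches]


def write_with_conversion(batches: Iterable[List[str]]) -> List[List[bytes]]:
    """Fixed path: runs the shared conversion before storing."""
    return list(to_physical_stream(batches))


def read_json(stored: List[List[bytes]]) -> list:
    """Reader that trusts the schema and expects the physical encoding."""
    out = []
    for batch in stored:
        for raw in batch:
            if not raw.startswith(PHYSICAL_TAG):
                raise ValueError("column declared as physical JSON holds raw text")
            out.append(json.loads(raw[len(PHYSICAL_TAG):].decode("utf-8")))
    return out


batches = [['{"a": 1}', "[1, 2]"], ['{"b": null}']]

# The converted data round-trips through the reader.
fixed = read_json(write_with_conversion(batches))

# The unconverted data fails on read, mirroring the reported corruption.
try:
    read_json(write_without_conversion(batches))
    buggy_read_failed = False
except ValueError:
    buggy_read_failed = True
```

The point of the sketch is the design choice rather than the encoding: once both write paths route through one conversion function, the stored bytes always match what the schema declares.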

Test

test_fragment_create_with_json_column (Python).

github-actions bot added the A-python (Python bindings) and bug (Something isn't working) labels Jun 25, 2026
xloya force-pushed the upstream-pr/json-fragment-create branch from b53c9e4 to e015f64 June 25, 2026 08:06
xloya force-pushed the upstream-pr/json-fragment-create branch from e015f64 to ad85b9e June 25, 2026 08:22
codecov Bot commented Jun 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Labels

A-python (Python bindings), bug (Something isn't working)
