Skip to content

fix markdown text leaf line breaks#5026

Open
zbeyens wants to merge 1 commit into
mainfrom
codex/4835-markdown-linebreak-serialization
Open

fix markdown text leaf line breaks#5026
zbeyens wants to merge 1 commit into
mainfrom
codex/4835-markdown-linebreak-serialization

Conversation

@zbeyens

@zbeyens zbeyens commented Jun 15, 2026

Copy link
Copy Markdown
Member
  • Auto release

🐛 Completes #4835, crediting @dschoorl

🟢 95% confidence

Phase 🧪 Tests 🌐 Browser
Reproduced 🔴 Source repro + red markdown package test ➖ N/A
Verified 🟢 Focused tests, package typecheck, lint, autoreview, pnpm check ➖ N/A

✅ Outcome

Markdown serialization preserves hard line breaks embedded inside a single paragraph text leaf, so the reported value no longer round-trips into separate paragraphs.

This also keeps AI streaming's exact chunk contract intact, instead of adopting the draft PR's escaped markdown/html output changes.

Credit: based on @dschoorl's original report and PR, with commit co-author credit included.

⚠️ Caveat

This is not a broad rewrite of all trailing hard-break semantics. It keeps existing split-break-child parity and fixes the reproduced embedded-text bug plus required AI streaming compatibility.

🏗️ Design

The durable fix is in @platejs/markdown paragraph serialization: embedded text-leaf \n values are normalized through the existing break-node serialization path before mdast conversion.

@platejs/ai gets a targeted streaming shim so embedded text-leaf newlines remain raw stream text, while explicit markdown hard-break children still serialize as explicit hard-break syntax.

🧪 Verified

  • bun test packages/markdown/src/lib/commonmarkSurface.spec.ts
  • bun test ./apps/www/src/__tests__/package-integration/ai-chat-streaming/streamSerializeMd.slow.tsx ./apps/www/src/__tests__/package-integration/ai-chat-streaming/streamDeserializeMd.slow.tsx
  • pnpm turbo typecheck --filter=./packages/markdown --filter=./packages/ai
  • pnpm --filter @platejs/markdown test
  • pnpm lint:fix
  • .agents/skills/autoreview/scripts/autoreview --mode local
  • pnpm check

@zbeyens zbeyens requested a review from a team June 15, 2026 17:22
@codesandbox

codesandbox Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review or Edit in CodeSandbox

Open the branch in Web EditorVS CodeInsiders

Open Preview

@changeset-bot

changeset-bot Bot commented Jun 15, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: b99cebe

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@platejs/ai Patch
@platejs/markdown Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. patch Bugfix & documentation PR plugin:ai plugin:markdown Markdown deserializer labels Jun 15, 2026
Co-authored-by: Dave Schoorl <1429321+dschoorl@users.noreply.github.com>
@zbeyens zbeyens force-pushed the codex/4835-markdown-linebreak-serialization branch from d44a8f3 to b99cebe Compare June 15, 2026 17:24

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d44a8f31e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +160 to +161
if (index < parts.length - 1) {
nodes.push({ type: 'break' } as Descendant);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle trailing embedded blank lines

When a paragraph text leaf ends with multiple newlines, e.g. { text: 'a\n\n' }, this unconditional conversion creates consecutive break nodes; the trailing-break path then serializes them as a\\ \n<br />\n, which deserializeMd reads back as a\\ plus a separate paragraph instead of the original hard breaks. This corrupts content for trailing embedded blank lines, so the normalizer needs a case that round-trips consecutive/trailing newlines without producing the break+HTML sequence.

Useful? React with 👍 / 👎.

# 4835 markdown linebreak serialization

Objective:
Complete PR #4835 markdown linebreak serialization fix; done when reproduced, package fix verified, PR opened and tracker credited.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prefix issue-backed plan with ticket id

AGENTS.md under “Goal plans” says issue-backed goal work should start the filename with the ticket number, but this added plan records itself under the date-prefixed path docs/plans/2026-06-15-4835-markdown-linebreak-serialization.md for PR/discussion #4835. This bypasses the repo’s ticket-prefixed plan convention, so rename the file and this internal Goal plan reference to start with 4835- or the actual tracker id.

Useful? React with 👍 / 👎.

Comment on lines +148 to +149
if (trimmedResult.endsWith('\n<br />')) {
result = trimmedResult.slice(0, -'<br />'.length);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve trailing streaming hard breaks

When the streamed value ends with an explicit hard-break child, markdown.serialize emits the existing \n<br /> representation, but this new cleanup strips <br /> whenever the chunk has trailing newline whitespace; the later append then returns only a plain newline, e.g. { text: 'chunk1' }, { text: '\n' } with chunk chunk1\\\n serializes as chunk1\n. That changes a hard break into a soft line break at the end of streamed content, so only remove this suffix when it was generated from an embedded text newline rather than an explicit break node.

Useful? React with 👍 / 👎.

Comment on lines +150 to +151
if (typeof text !== 'string' || !text.includes('\n')) {
return [child];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Normalize line breaks inside inline children

This early return leaves inline element children unchanged, so a paragraph link like { type: 'a', children: [{ text: 'before\n\nafter' }] } still serializes as [before\n\nafter](url), which deserializeMd reads back as separate paragraphs and literal bracket text. Since links and other inline elements are valid paragraph children, the line-break normalization needs to recurse into inline children before mdast conversion instead of only handling direct text nodes.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch Bugfix & documentation PR plugin:ai plugin:markdown Markdown deserializer size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant