fix markdown text leaf line breaks#5026
Conversation
Review or Edit in CodeSandboxOpen the branch in Web Editor • VS Code • Insiders |
🦋 Changeset detectedLatest commit: b99cebe The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Co-authored-by: Dave Schoorl <1429321+dschoorl@users.noreply.github.com>
d44a8f3 to
b99cebe
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d44a8f31e6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (index < parts.length - 1) { | ||
| nodes.push({ type: 'break' } as Descendant); |
There was a problem hiding this comment.
Handle trailing embedded blank lines
When a paragraph text leaf ends with multiple newlines, e.g. { text: 'a\n\n' }, this unconditional conversion creates consecutive break nodes; the trailing-break path then serializes them as a\\ \n<br />\n, which deserializeMd reads back as a\\ plus a separate paragraph instead of the original hard breaks. This corrupts content for trailing embedded blank lines, so the normalizer needs a case that round-trips consecutive/trailing newlines without producing the break+HTML sequence.
Useful? React with 👍 / 👎.
| # 4835 markdown linebreak serialization | ||
|
|
||
| Objective: | ||
| Complete PR #4835 markdown linebreak serialization fix; done when reproduced, package fix verified, PR opened and tracker credited. |
There was a problem hiding this comment.
Prefix issue-backed plan with ticket id
AGENTS.md under “Goal plans” says issue-backed goal work should start the filename with the ticket number, but this added plan records itself under the date-prefixed path docs/plans/2026-06-15-4835-markdown-linebreak-serialization.md for PR/discussion #4835. This bypasses the repo’s ticket-prefixed plan convention, so rename the file and this internal Goal plan reference to start with 4835- or the actual tracker id.
Useful? React with 👍 / 👎.
| if (trimmedResult.endsWith('\n<br />')) { | ||
| result = trimmedResult.slice(0, -'<br />'.length); |
There was a problem hiding this comment.
Preserve trailing streaming hard breaks
When the streamed value ends with an explicit hard-break child, markdown.serialize emits the existing \n<br /> representation, but this new cleanup strips <br /> whenever the chunk has trailing newline whitespace; the later append then returns only a plain newline, e.g. { text: 'chunk1' }, { text: '\n' } with chunk chunk1\\\n serializes as chunk1\n. That changes a hard break into a soft line break at the end of streamed content, so only remove this suffix when it was generated from an embedded text newline rather than an explicit break node.
Useful? React with 👍 / 👎.
| if (typeof text !== 'string' || !text.includes('\n')) { | ||
| return [child]; |
There was a problem hiding this comment.
Normalize line breaks inside inline children
This early return leaves inline element children unchanged, so a paragraph link like { type: 'a', children: [{ text: 'before\n\nafter' }] } still serializes as [before\n\nafter](url), which deserializeMd reads back as separate paragraphs and literal bracket text. Since links and other inline elements are valid paragraph children, the line-break normalization needs to recurse into inline children before mdast conversion instead of only handling direct text nodes.
Useful? React with 👍 / 👎.
🐛 Completes #4835, crediting @dschoorl
🟢 95% confidence
pnpm check✅ Outcome
Markdown serialization preserves hard line breaks embedded inside a single paragraph text leaf, so the reported value no longer round-trips into separate paragraphs.
This also keeps AI streaming's exact chunk contract intact, instead of adopting the draft PR's escaped markdown/html output changes.
Credit: based on @dschoorl's original report and PR, with commit co-author credit included.
This is not a broad rewrite of all trailing hard-break semantics. It keeps existing split-break-child parity and fixes the reproduced embedded-text bug plus required AI streaming compatibility.
🏗️ Design
The durable fix is in
@platejs/markdownparagraph serialization: embedded text-leaf\nvalues are normalized through the existing break-node serialization path before mdast conversion.@platejs/aigets a targeted streaming shim so embedded text-leaf newlines remain raw stream text, while explicit markdown hard-break children still serialize as explicit hard-break syntax.🧪 Verified
bun test packages/markdown/src/lib/commonmarkSurface.spec.tsbun test ./apps/www/src/__tests__/package-integration/ai-chat-streaming/streamSerializeMd.slow.tsx ./apps/www/src/__tests__/package-integration/ai-chat-streaming/streamDeserializeMd.slow.tsxpnpm turbo typecheck --filter=./packages/markdown --filter=./packages/aipnpm --filter @platejs/markdown testpnpm lint:fix.agents/skills/autoreview/scripts/autoreview --mode localpnpm check