perf(web): replace softbuffer with direct put_image_data canvas present by irvingoujAtDevolution · Pull Request #1374 · Devolutions/IronRDP

irvingouj@Devolutions (irvingoujAtDevolution) · 2026-06-12T19:05:02Z

Summary

The web client presented frames through softbuffer, whose web backend repacks
the whole surface (RGBA → u32 → RGBA into a fresh buffer) on every present.
This replaces it with a direct put_image_data that uploads only the dirty
region, and drops the softbuffer dependency.

Same idea as the IronVNC change.

What changed

- Remove the softbuffer dependency; present each dirty region with
  put_image_data at its origin.
- No full-surface buffer and no per-region scratch. extract_partial_image fills
  a single WriteBuf reused across frames, so steady-state draws don't allocate.
- Force opaque alpha before upload (kept; see Correctness).
- Add WriteBuf::filled_mut to ironrdp-core (mutable counterpart of filled).
- web-sys: add CanvasRenderingContext2d + ImageData, drop the softbuffer-only
  features.

Performance

Draw-stage time on a 1080p replay (595 frames / 110 dirty regions), headless
Chromium, 8 measured passes × 3 runs, median. Both rows are reproducible branches
off the replay-bench harness; the only difference is the render path.

| Render path | draw (ms) | vs softbuffer | branch |
|---|---|---|---|
| softbuffer `present_with_damage` | ~1031 | - | `bench/draw-softbuffer` |
| this PR (direct upload, reused `WriteBuf`) | ~97 | ~10.6× | `bench/draw-zerocopy` |

- The win is structural: upload the dirty region instead of repacking the whole
  surface every present.
- Reusing one WriteBuf (vs a per-frame allocation) keeps the steady-state draw
  allocation-free; the remaining cost is the unavoidable ImageData JS copy.
- Output is byte-identical: framebuffer CRC32 2d8e1b79 matches the recorded
  ground truth and the rendered-canvas FNV-1a is unchanged.
- Absolute ms carry ~±15% noise from machine load (decode drifted 1.5–1.9 s); the
  ratio held across runs.
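
For readers unfamiliar with the canvas check mentioned above, here is a minimal sketch of an FNV-1a hash over a byte buffer. It assumes the 32-bit variant; the PR does not say which width the bench harness uses, and the function name `fnv1a_32` is hypothetical.

```rust
/// 32-bit FNV-1a over a byte slice.
/// ASSUMPTION: the bench harness may use a different width (e.g. 64-bit);
/// this only illustrates the algorithm, not the harness code.
pub fn fnv1a_32(bytes: &[u8]) -> u32 {
    const OFFSET_BASIS: u32 = 0x811c_9dc5;
    const PRIME: u32 = 0x0100_0193;

    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        // FNV-1a order: XOR the byte in first, then multiply by the prime.
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}
```

Hashing the RGBA bytes read back from the canvas before and after a render-path change, and comparing the two values, is one cheap way to confirm that the output did not change.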

Reproduce:

git checkout bench/draw-softbuffer   # or bench/draw-zerocopy
cd crates/ironrdp-web && wasm-pack build --target web --release -- --features bench
cd bench-harness && node run.mjs --capture /bench-corpus/<your>.irdprec --passes 8

Correctness

put_image_data stores alpha verbatim, and the decoded framebuffer isn't
guaranteed opaque — it's zero-initialised, a widened whole-rows region can cover
not-yet-painted columns (alpha 0), and the QOI-RGBA path copies source alpha. So
we force alpha opaque before upload. A scan-then-conditionally-force was tried and
is slower than just forcing (the check touches the same bytes), so the
unconditional force stays.
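
A minimal sketch of the two variants discussed above. `force_opaque` follows the loop shown in the review diff; `force_opaque_if_needed` is a hypothetical reconstruction of the rejected scan-then-force idea, included only to show why it reads the same bytes a second time.

```rust
/// Sets the alpha byte of every RGBA pixel to 0xFF, in place.
/// Trailing bytes that do not form a whole 4-byte pixel are left untouched.
pub fn force_opaque(rgba: &mut [u8]) {
    for pixel in rgba.chunks_exact_mut(4) {
        pixel[3] = 0xFF;
    }
}

/// Hypothetical scan-then-force variant: first look for any non-opaque pixel,
/// then force only if one was found. The scan touches the same bytes as the
/// write pass, which is why the PR reports it as slower than always forcing.
/// Returns whether a fix-up was applied.
pub fn force_opaque_if_needed(rgba: &mut [u8]) -> bool {
    let needs_fix = rgba.chunks_exact(4).any(|pixel| pixel[3] != 0xFF);
    if needs_fix {
        force_opaque(rgba);
    }
    needs_fix
}
```

Only the alpha channel is written; the colour bytes of each pixel are never modified.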

Follow-up (separate PR)

Guarantee framebuffer opacity upstream in ironrdp-session (init alpha to 0xff +
clamp apply_rgba32); after that the web side can drop the alpha force entirely.

The render path converted each dirty region RGBA -> u32 `0RGB`, then let softbuffer repack u32 -> RGBA into a freshly allocated buffer every frame: two pixel passes over the whole surface plus a per-frame allocation.

Replace it with the canvas's own 2D context: one copy of the region into a reused RGBA scratch (alpha forced opaque) followed by put_image_data at the region origin. softbuffer is dropped from ironrdp-web (still used by ironrdp-viewer). Mirrors the same fix in IronVNC.

Measured with a record/replay draw bench (dev wasm, headless Chromium), draw-stage median: 4K 1706 ms -> 83 ms (~20x), 1080p 705 ms -> 14 ms (~50x), with byte-identical canvas output and unchanged framebuffer checksums.
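
To make the cost of the old path concrete, the sketch below models the RGBA -> `0RGB` -> RGBA round trip in plain Rust. It is not softbuffer's code: the function names are hypothetical, and the only things taken from the description above are the two pixel passes and the fresh allocation per call.

```rust
/// Packs one RGBA pixel into a u32 laid out as 0RGB (alpha is dropped).
pub fn rgba_to_0rgb(pixel: [u8; 4]) -> u32 {
    (u32::from(pixel[0]) << 16) | (u32::from(pixel[1]) << 8) | u32::from(pixel[2])
}

/// Unpacks a 0RGB u32 back into an RGBA pixel with opaque alpha.
pub fn zero_rgb_to_rgba(value: u32) -> [u8; 4] {
    [(value >> 16) as u8, (value >> 8) as u8, value as u8, 0xFF]
}

/// Illustrative model of the old present path: pass 1 packs every pixel,
/// pass 2 unpacks every pixel into a newly allocated RGBA buffer.
pub fn round_trip(rgba: &[u8]) -> Vec<u8> {
    // Pass 1: RGBA -> 0RGB.
    let packed: Vec<u32> = rgba
        .chunks_exact(4)
        .map(|p| rgba_to_0rgb([p[0], p[1], p[2], p[3]]))
        .collect();

    // Pass 2: 0RGB -> RGBA, into a fresh allocation.
    let mut out = Vec::with_capacity(packed.len() * 4);
    for value in packed {
        out.extend_from_slice(&zero_rgb_to_rgba(value));
    }
    out
}
```

The result equals the input with alpha set to 0xFF, which is what copying the region once and forcing alpha produces without the intermediate u32 buffer.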

irvingouj@Devolutions (irvingoujAtDevolution) · 2026-06-12T21:18:10Z

Same as VNC, remove softbuffer

Copilot

Pull request overview

This PR updates the ironrdp-web rendering path to remove the softbuffer dependency and present updated regions by uploading RGBA buffers directly to an HTML canvas via CanvasRenderingContext2d::put_image_data, aiming to reduce per-frame work and allocations.

Changes:

- Replaces the softbuffer-based canvas present path with direct ImageData + put_image_data blits for dirty regions.
- Removes the softbuffer dependency from ironrdp-web.
- Enables additional web-sys features needed for 2D canvas rendering (CanvasRenderingContext2d, ImageData).

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| crates/ironrdp-web/src/canvas.rs | Implements the new 2D-context `put_image_data` rendering path and removes the softbuffer surface logic. |
| crates/ironrdp-web/Cargo.toml | Drops `softbuffer` dependency; enables required `web-sys` features for 2D canvas + `ImageData`. |
| Cargo.lock | Updates lockfile dependency edges to reflect removal of `softbuffer` from `ironrdp-web`. |

Benoît Cortier (CBenoit)

Nice!

Address review on the canvas present path: the buffer-grows branch initialized the scratch via spare_capacity_mut + set_len and then unconditionally re-copied the whole buffer, writing every byte twice on the grow path behind an unsafe block that bought nothing. Replace it with clear() + extend_from_slice: a single copy that writes straight into spare capacity (no zero-fill), reuses the allocation in steady state, and removes the unsafe entirely. Also fix doc comments that referenced a non-existent in-crate benchmark and reword the resize and context_2d docs to be accurate.
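
A small sketch of the clear() + extend_from_slice pattern this commit describes, using a plain `Vec<u8>` as the scratch. The helper name `refill` is hypothetical; both `Vec` methods are standard library calls.

```rust
/// Replaces the contents of `scratch` with `src` using safe code only.
pub fn refill(scratch: &mut Vec<u8>, src: &[u8]) {
    // `clear` resets the length to zero but keeps the allocation.
    scratch.clear();
    // `extend_from_slice` copies `src` once into the spare capacity and
    // reallocates only when the existing capacity is too small.
    scratch.extend_from_slice(src);
}
```

When the scratch already has enough capacity, a refill neither reallocates nor zero-fills: the only write is the copy itself.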

irvingouj@Devolutions (irvingoujAtDevolution) · 2026-06-25T01:52:22Z

Alex Yusiuk (@RRRadicalEdward) Benoît Cortier (@CBenoit) after a closer look, I realized that this rgba buffer might not be needed to begin with, so removed it, no allocation/initialization issue anymore. It's even 20% faster than what we have in the first draft!

extract_partial_image already returns an owned, region-sized RGBA copy, so the reusable rgba scratch in Canvas was a redundant second buffer. Force opaque alpha directly on that buffer and hand it straight to put_image_data; Canvas no longer keeps any per-frame state. This removes the scratch allocation, its copy, and the whole zero-fill / set_len question along with it. draw takes &mut [u8] now (the buffer is the caller's throwaway copy, safe to mutate). Output is byte-identical (replay bench: framebuffer CRC and canvas FNV-1a unchanged) and the present path is ~20-30% faster than the buffered version.
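
For context, this is roughly what producing an owned, region-sized RGBA copy involves. The sketch is not the real extract_partial_image: `InclusiveRect` and `copy_region` are hypothetical stand-ins, and it assumes a tightly described framebuffer with 4 bytes per pixel and a row stride given in bytes.

```rust
/// Hypothetical stand-in for an inclusive rectangle, in pixel coordinates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InclusiveRect {
    pub left: usize,
    pub top: usize,
    pub right: usize,
    pub bottom: usize,
}

impl InclusiveRect {
    pub fn width(&self) -> usize {
        self.right - self.left + 1
    }

    pub fn height(&self) -> usize {
        self.bottom - self.top + 1
    }
}

/// Copies the pixels of `region` out of a strided RGBA framebuffer into a
/// new, tightly packed buffer of `width * height * 4` bytes.
/// Panics if the region does not fit inside `framebuffer`.
pub fn copy_region(framebuffer: &[u8], stride: usize, region: InclusiveRect) -> Vec<u8> {
    const BYTES_PER_PIXEL: usize = 4;

    let row_len = region.width() * BYTES_PER_PIXEL;
    let mut out = Vec::with_capacity(row_len * region.height());
    for y in region.top..=region.bottom {
        let start = y * stride + region.left * BYTES_PER_PIXEL;
        out.extend_from_slice(&framebuffer[start..start + row_len]);
    }
    out
}
```

Because the returned buffer is a throwaway copy, the caller can mutate it in place (for example to force alpha) before handing it to the canvas, without touching the framebuffer.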

Benoît Cortier (CBenoit)

We’re getting to a good place!

Benoît Cortier (CBenoit) · 2026-06-25T06:16:45Z

+/// This replaced a softbuffer-backed path that converted RGBA -> u32 `0RGB` (our pass) and then let
+/// softbuffer repack u32 -> RGBA per frame into a freshly allocated buffer — two pixel passes over
+/// the whole surface plus a per-frame allocation. The direct path drops the u32 round-trip and the
+/// per-frame allocation, measuring an order of magnitude faster present at 4K with byte-identical
+/// canvas output. Mirrors the same fix in IronVNC.


Keeping some historical documentation on how we ended up converging may be okay in order to justify the current design, but it currently sounds like a "prompt artifact": "Mirrors the same fix in IronVNC" is completely irrelevant outside of this PR.

Benoît Cortier (CBenoit) · 2026-06-25T06:24:47Z

+        for pixel in buffer.chunks_exact_mut(4) {
+            pixel[3] = 0xFF;
        }


question: Just challenging the established pattern here: do we really need to force opaque alpha? I assume that if we don’t do that we end up with visual artifacts? My understanding is that we are just extracting a sub-image from an otherwise fully rendered image, and that we don’t really need any extra cleaning step.

Benoît Cortier (@CBenoit) yeah, works for most cases but it's not guaranteed. Framebuffer starts zero-filled (here) and the QOI-RGBA path keeps source alpha (here) — plus a tall update gets widened to full width, so early / post-resize frames can upload not-yet-painted columns as transparent.

Best case we'd guarantee opaque upstream (init the framebuffer to 0xff + clamp apply_rgba32), but for the scope of this PR I think it's better to keep the force and fix it in a follow-up. Sound good?

Sounds good to me! My only suggestion then is to track this with an issue for visibility

Alex Yusiuk (RRRadicalEdward)

Good job 👍.
Benoit already pointed out good things. I have nothing to add

Alex Yusiuk (RRRadicalEdward) · 2026-06-25T07:36:09Z

Alex Yusiuk (@RRRadicalEdward) Benoît Cortier (@CBenoit) after a closer look, I realized that this rgba buffer might not be needed to begin with, so removed it, no allocation/initialization issue anymore. It's even 20% faster than what we have in the first draft!

Could you apply this method to IronVNC as well?

Copilot

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated no new comments.

extract_partial_image now fills a caller-owned WriteBuf (unfilled_to + advance) instead of allocating a fresh Vec on every call. session.rs keeps one buffer across frames and clears it per region, so steady-state draws don't allocate. Adds WriteBuf::filled_mut for the in-place alpha fixup. Also inline the single-call-site blit helper into draw, and reword the canvas/extract docs to describe the types and their contracts rather than the calling code.
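
The reuse contract described in this commit can be sketched with a small stand-in type. `ScratchBuf` below is hypothetical: it borrows the method names mentioned above (unfilled_to, advance, filled, filled_mut, clear) but is not ironrdp-core's `WriteBuf`, whose exact signatures are not shown in this PR.

```rust
/// Hypothetical stand-in for a reusable write buffer.
#[derive(Default)]
pub struct ScratchBuf {
    inner: Vec<u8>,
    filled: usize,
}

impl ScratchBuf {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns exactly `len` writable bytes after the filled part,
    /// growing the backing storage only if it is too small.
    pub fn unfilled_to(&mut self, len: usize) -> &mut [u8] {
        let end = self.filled + len;
        if self.inner.len() < end {
            self.inner.resize(end, 0);
        }
        &mut self.inner[self.filled..end]
    }

    /// Marks `len` more bytes as filled.
    pub fn advance(&mut self, len: usize) {
        self.filled += len;
        debug_assert!(self.filled <= self.inner.len());
    }

    pub fn filled(&self) -> &[u8] {
        &self.inner[..self.filled]
    }

    /// Mutable view of the filled bytes, e.g. for an in-place alpha fix-up.
    pub fn filled_mut(&mut self) -> &mut [u8] {
        &mut self.inner[..self.filled]
    }

    /// Forgets the filled bytes but keeps the storage for the next region.
    pub fn clear(&mut self) {
        self.filled = 0;
    }

    pub fn capacity(&self) -> usize {
        self.inner.capacity()
    }
}

/// Per-region use: clear, write the region bytes, advance.
pub fn fill_region(buf: &mut ScratchBuf, src: &[u8]) {
    buf.clear();
    buf.unfilled_to(src.len()).copy_from_slice(src);
    buf.advance(src.len());
}
```

Once the buffer has grown to the largest region seen so far, later regions reuse the same storage, which is the sense in which steady-state draws do not allocate.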

Cut the verbose doc blocks down to the non-obvious rationale (why force alpha, the whole-rows widening, the WriteBuf clear-between-regions contract), 2 lines max each.

irvingouj@Devolutions (irvingoujAtDevolution) · 2026-06-25T16:33:24Z

Updated, we now use WriteBuf to achieve effectively O(1) allocation per draw on the Rust-side rendering path.

Benoît Cortier (CBenoit) · 2026-06-25T17:33:38Z

+    region: InclusiveRectangle,
+    buffer: &mut WriteBuf,
+) -> InclusiveRectangle {
    // PERF: needs actual benchmark to find a better heuristic


Thought: Since you have the tools for benchmarking, I think it may be worth looking into this comment as well. As a follow-up 🙂

Benoît Cortier (CBenoit)

LGTM! Good job! Before we merge, let's measure the final gains and clean up the final commit body so it matches the result we ended up having 🙂

irvingouj@Devolutions (irvingoujAtDevolution) marked this pull request as draft June 12, 2026 19:05

irvingouj@Devolutions (irvingoujAtDevolution) requested review from Alex Yusiuk (RRRadicalEdward) and Copilot June 12, 2026 21:17

irvingouj@Devolutions (irvingoujAtDevolution) marked this pull request as ready for review June 12, 2026 21:17

Copilot started reviewing on behalf of irvingouj@Devolutions (irvingoujAtDevolution) June 12, 2026 21:18

Copilot AI reviewed Jun 12, 2026

Comment thread crates/ironrdp-web/src/canvas.rs Outdated

Comment thread crates/ironrdp-web/src/canvas.rs

Alex Yusiuk (RRRadicalEdward) reviewed Jun 15, 2026

Comment thread crates/ironrdp-web/src/canvas.rs Outdated

fix(web): avoid duplicate canvas resize reset

45fb1a6

irvingouj@Devolutions (irvingoujAtDevolution) requested a review from Alex Yusiuk (RRRadicalEdward) June 23, 2026 19:25

Benoît Cortier (CBenoit) reviewed Jun 24, 2026

Comment thread crates/ironrdp-web/src/canvas.rs Outdated

Benoît Cortier (CBenoit) reviewed Jun 24, 2026

irvingouj@Devolutions (irvingoujAtDevolution) added 2 commits June 24, 2026 15:26

perf(web): avoid zero-filling canvas scratch

67d73ff

Benoît Cortier (CBenoit) reviewed Jun 25, 2026

Alex Yusiuk (RRRadicalEdward) approved these changes Jun 25, 2026

Benoît Cortier (CBenoit) requested a review from Copilot June 25, 2026 08:42

Copilot started reviewing on behalf of Benoît Cortier (CBenoit) June 25, 2026 08:43

Copilot AI reviewed Jun 25, 2026

irvingouj@Devolutions (irvingoujAtDevolution) added 2 commits June 25, 2026 12:08

docs(web): trim canvas/extract comments to WHY-only

ad0dc5d

Cut the verbose doc blocks down to the non-obvious rationale (why force alpha, the whole-rows widening, the WriteBuf clear-between-regions contract), 2 lines max each.

irvingouj@Devolutions (irvingoujAtDevolution) requested a review from Benoît Cortier (CBenoit) June 25, 2026 16:33

Benoît Cortier (CBenoit) reviewed Jun 25, 2026

Benoît Cortier (CBenoit) approved these changes Jun 25, 2026

irvingouj@Devolutions (irvingoujAtDevolution) changed the title ~~perf(web): replace softbuffer canvas present with direct put_image_data~~ perf(web): replace softbuffer with direct put_image_data canvas present Jun 25, 2026

irvingouj@Devolutions (irvingoujAtDevolution) merged commit d3705af into master Jun 25, 2026
21 checks passed

irvingouj@Devolutions (irvingoujAtDevolution) deleted the perf/web-remove-softbuffer branch June 25, 2026 18:53

devolutionsbot mentioned this pull request Jun 24, 2026

chore(release): prepare for publishing #1364

Open
