
ralph: shrink #66 memory with measurement-driven cycles#740

Draft
srid wants to merge 5 commits into master from ralph/memory-issue-66
srid commented May 17, 2026

Three behaviour-preserving cycles cut peak RSS on a 4.5k-file / 72 MB
synthetic notebook by 29 %. The full methodology, dead ends, and
root-cause analysis live in docs/dev/ralph/memory-66/README.md; this PR
description is a thin pointer to it.

What lands

| Cycle | Change | AFTER_HWM (MiB) | Δ vs baseline |
|---|---|---|---|
| 0 | Baseline (origin/master @ 6950760, 5-run median) | 5181 | |
| 1 | deepseq parsed Pandoc + Aeson.Value in parseAndInsert | 4244 | −18.1 % |
| 2 | Bake -with-rtsopts=-N -F1.5 into the executable | 3936 | −24.0 % |
| 3 | Drop _relCtx :: [Block] from Rel; recompute on demand | 3672 | −29.1 % |

+RTS -s shows the real live-data win is structural, not RTS-amplified:
maximum residency drops from 1.85 GiB → 1.27 GiB (−31 % live data)
between baseline and cycle 1+3, with GC productivity rising from 70.9 %
to 74.9 %.
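
The Δ column and the residency figure above can be re-derived from the raw numbers. The following is a small arithmetic check only; the helper names (`pctDelta`, `roundTo1`, `deltas`) are made up for this illustration and are not part of emanote.

```haskell
-- Percentage change of a measurement relative to a baseline
-- (negative means a reduction).
pctDelta :: Double -> Double -> Double
pctDelta baseline x = (x - baseline) / baseline * 100

-- Round to one decimal place, as the table does.
roundTo1 :: Double -> Double
roundTo1 v = fromIntegral (round (v * 10) :: Integer) / 10

-- AFTER_HWM medians in MiB, copied from the table above.
baselineMiB :: Double
baselineMiB = 5181

cycles :: [(Int, Double)]
cycles = [(1, 4244), (2, 3936), (3, 3672)]

-- Cumulative delta of each cycle against the baseline.
deltas :: [(Int, Double)]
deltas = [(c, roundTo1 (pctDelta baselineMiB m)) | (c, m) <- cycles]

-- Maximum residency in GiB, baseline vs. after the structural cycles.
residencyDelta :: Double
residencyDelta = roundTo1 (pctDelta 1.85 1.27)
```

Loaded in GHCi, `deltas` gives −18.1, −24.0 and −29.1 for cycles 1 to 3, matching the table, and `residencyDelta` gives −31.4, which the text rounds to −31 %.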

What does not land

Per-Pandoc GHC.Compact regions were measured and rejected — region
overhead dominates on ~10 KiB Pandocs (+15 % regression on the 4.5k
corpus). Documented in the report under "Dead end".
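
For readers unfamiliar with the rejected approach, here is a minimal sketch of what "one compact region per document" looks like with the `ghc-compact` package. The document type is a hypothetical stand-in (the PR compacts Pandoc values), and the sketch only shows how to observe a region's size; it does not reproduce the measurements in the report.

```haskell
import GHC.Compact (compact, compactSize, getCompact)

-- Hypothetical stand-in for a small parsed document.
smallDoc :: [(Int, String)]
smallDoc = [(i, "paragraph " ++ show i) | i <- [1 .. 20]]

-- Copy a value into its own compact region and report how many bytes
-- the region occupies. The region is allocated in whole blocks, so a
-- small value can occupy noticeably more than its own size.
regionBytes :: a -> IO Word
regionBytes doc = do
  region <- compact doc
  compactSize region

-- Read the value back out of its region; the result is fully evaluated.
roundTrip :: a -> IO a
roundTrip doc = getCompact <$> compact doc
```

Comparing `regionBytes smallDoc` against the expected size of the value is one way to see the per-region overhead the paragraph above refers to.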

Where the next 30 % lives

Beyond cycle 3, each Note still retains its full Pandoc AST in
_modelNotes. 4500 × ~250 KiB ≈ 1.1 GiB unavoidable live data. To
go below ~3.5 GiB AFTER_HWM, the assumption itself has to change —
options laid out in the report under "Root cause and the ceiling at
~−29 %". This PR intentionally stops at the behaviour-preserving local
fixes.

Test plan

  • cabal test emanote passes (one expected-context assertion in
    RelSpec was tightened to assert source-position ordering directly,
    since _relCtx is no longer carried on each Rel)
  • live-server smoke on the actual docs/ notebook renders /,
    /guide, /yaml-config, /wikilinks, /uses with non-empty
    backlinks panels (29-247 occurrences of "backlink" in HTML)
  • same 5-run methodology measures --allow-broken-internal-links
    paths cleanly
  • e2e suite (live + static + morph) — not yet run locally; relying
    on CI

Closes #66 for the local-fix tier. The architectural cycles 4+ are
follow-up work.

srid added 4 commits May 17, 2026 15:05
Synthetic 4501-file / 72 MB corpus reproduces issue #66 — `emanote run`
peaks at ~5.0 GiB RSS (vs ~4.7 GiB reported). `+RTS -s` shows live
residency of ~1.85 GiB and 2.5x RSS/live ratio from default -F2.
Closure-type heap profile points at Pandoc AST retention (list cons +
Text + ARR_WORDS = ~68% of heap).
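
The live-residency and heap-to-live figures quoted above come from `+RTS -s`. As a rough sketch, similar counters can also be read from inside a program through `GHC.Stats` in `base`, assuming the program is started with `+RTS -T`. The helper names below are hypothetical, and `max_mem_in_use_bytes` is the RTS's own accounting, which is close to but not the same as the RSS the OS reports.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Convert a byte count to MiB for display.
toMiB :: Word64 -> Double
toMiB bytes = fromIntegral bytes / (1024 * 1024)

-- Ratio of peak RTS-held memory to peak live data.
-- Nothing when no live data has been recorded yet.
heapToLiveRatio :: Word64 -> Word64 -> Maybe Double
heapToLiveRatio _ 0 = Nothing
heapToLiveRatio memInUse live =
  Just (fromIntegral memInUse / fromIntegral live)

-- Print the counters, or a hint if RTS stats collection is off.
reportResidency :: IO ()
reportResidency = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats disabled; run with +RTS -T to enable them"
    else do
      stats <- getRTSStats
      let live = max_live_bytes stats
          inUse = max_mem_in_use_bytes stats
      putStrLn ("max live data:   " ++ show (toMiB live) ++ " MiB")
      putStrLn ("max mem in use:  " ++ show (toMiB inUse) ++ " MiB")
      putStrLn ("heap/live ratio: " ++ maybe "n/a" show (heapToLiveRatio inUse live))
```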

Also reproduces the (separate, worse) `emanote gen` blow-up past 108 GiB
on the same corpus.

In parseAndInsert, force the parsed Pandoc and Aeson Value so the
per-file parser closure (held by UnionMount.unionMountStreaming) can
be released as we stream files into the model. Median peak RSS on the
4.5k synthetic corpus drops from 5185 MiB → 4244 MiB (-18%).
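
The forcing step described above can be sketched as follows. This is not the code in `parseAndInsert`: `ParsedNote`, `parseNote` and `parseStrict` are hypothetical stand-ins so the example runs without Pandoc or Aeson, and only `force` and `evaluate` are real library functions (from `deepseq` and `base`).

```haskell
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}

import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import GHC.Generics (Generic)

-- Hypothetical stand-in for a parsed note; the real change forces a
-- Pandoc document and an Aeson Value instead.
data ParsedNote = ParsedNote
  { noteMeta :: [(String, String)]
  , noteBody :: [String]
  }
  deriving (Show, Eq, Generic, NFData)

-- Hypothetical lazy "parser": the fields are built as thunks that
-- still refer to the raw input until something demands them.
parseNote :: String -> ParsedNote
parseNote src =
  ParsedNote
    { noteMeta = [("length", show (length src))]
    , noteBody = lines src
    }

-- Evaluate the parse result to normal form before handing it on, so
-- no unevaluated thunks remain inside the value that gets stored.
parseStrict :: String -> IO ParsedNote
parseStrict src = evaluate (force (parseNote src))
```

The design point is that the cost of evaluation is paid at insert time, once per file, in exchange for the stored value no longer keeping parser-side data reachable.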

Add `deepseq` to emanote's build-depends explicitly so the import is
honest about its package (it is otherwise pulled in transitively by
text/pandoc).

GHC's default old-generation retention factor (-F2) sizes the
post-major-GC heap at 2x live data. With cycle 1 in place, live data
is 1.27 GiB but AFTER_HWM is still 4.24 GiB. -F1.5 trades a few extra
major GCs for a smaller heap high-water — and as a bonus, max gen-1
pause shrinks from 0.63s to 0.49s because each major GC has less to
scan.

5-run median AFTER_HWM on the 4.5k synthetic corpus, default -N:
4199 -> 3936 MiB (-6.3%), cumulative -24.1% vs baseline.

Users can still override with `emanote run +RTS -F2 -RTS`.

`Rel._relCtx :: [B.Block]` carried the Pandoc-block context for every
outgoing link in every note. With ~4500 notes x ~20 links each, that's
~90k duplicated context chunks in `_modelRels` — derivable from each
source note's already-retained `_noteDoc`.

Strip ctx at `noteRels` insert time and recompute it on demand in
`modelLookupBacklinks` via a new `noteRelCtxToTarget` helper that
re-walks the source note's Pandoc once per backlinking source. Bounded
by source-note AST size; paid only when the backlinks-page is rendered.

5-run median AFTER_HWM on the 4.5k synthetic corpus: 3936 -> 3729 MiB
(-5.3%), cumulative -28.1% vs baseline (5185 MiB).
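
The shape of this change can be sketched with simplified types: store only source and target on each relation, and rebuild the context by walking the source note when backlinks are requested. All types and names below (`Block`, `Inline`, `Note`, `Rel`, `ctxToTarget`, `lookupBacklinks`) are hypothetical stand-ins, not the definitions behind `noteRelCtxToTarget` or `modelLookupBacklinks`.

```haskell
type NoteId = String

-- Hypothetical, much-simplified document model.
data Inline = Str String | Link NoteId
  deriving (Eq, Show)

newtype Block = Para [Inline]
  deriving (Eq, Show)

data Note = Note
  { noteId :: NoteId
  , noteDoc :: [Block] -- already retained by the model
  }

-- Slim relation: no per-link copy of the surrounding blocks.
data Rel = Rel {relFrom :: NoteId, relTo :: NoteId}
  deriving (Eq, Show)

-- Collect one relation per outgoing link, dropping the context.
noteRels :: Note -> [Rel]
noteRels n = [Rel (noteId n) t | Para inls <- noteDoc n, Link t <- inls]

-- Recompute the context on demand: every block of the source note
-- that links to the target, in source order.
ctxToTarget :: NoteId -> Note -> [Block]
ctxToTarget target n =
  [b | b@(Para inls) <- noteDoc n, Link target `elem` inls]

-- Backlinks for a target: one walk per source note, done only when
-- the backlinks are actually rendered.
lookupBacklinks :: [Note] -> NoteId -> [(NoteId, [Block])]
lookupBacklinks notes target =
  [ (noteId n, ctx)
  | n <- notes
  , let ctx = ctxToTarget target n
  , not (null ctx)
  ]
```

In this sketch the recomputation cost is bounded by the size of each source note's block list, which mirrors the trade-off the commit message describes.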
srid force-pushed the ralph/memory-issue-66 branch from 11e2e6b to 9baadc5 on May 17, 2026 21:39
Development

Successfully merging this pull request may close these issues.

High memory usage on large notebooks
