fix: pin Lemonade back to 10.2.0 (embedding regression on >= b6524) #1872
kovtcharov wants to merge 1 commit into
Lemonade 10.7.0/10.8.x bundle a llama.cpp build >= b6524, which crashes loading the embedding model nomic-embed-text-v2-moe on the Vulkan backend (AMD) — "llama-server failed to start". This breaks RAG embeddings on the NPU/GPU path and has been red in CI since the 10.x bumps. Upstream is unfixed: llama.cpp #16301 (b6524 Vulkan regression, open, deprioritized) and lemonade #612 / #941 (open). The maintainer's GGML_VK_DISABLE_COOPMAT=1 workaround is already applied in the embeddings CI job and still fails on our Strix/Windows runners, so there is no effective workaround — revert to the last known-good version. 10.2.0 is GAIA's documented min_lemonade_version floor, so this is a pin to the floor, not below it. version.py is the single source of truth; the cpp setup docs are updated in lock-step.
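The claim that 10.2.0 sits at the floor rather than below it can be checked mechanically. The following is a minimal Python sketch, assuming plain dotted numeric version strings; the function and constant names are hypothetical and are not taken from GAIA's version.py or init_command.py.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '10.2.0' into (10, 2, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_or_above_floor(pinned: str, floor: str) -> bool:
    """Return True when the pinned version is not below the supported floor."""
    # Tuples compare element by element, so 10.10.0 correctly sorts above
    # 10.2.0, which a plain string comparison would get wrong.
    return parse_version(pinned) >= parse_version(floor)


# Hypothetical values mirroring the PR description, not read from the repo.
PINNED_LEMONADE_VERSION = "10.2.0"
MIN_LEMONADE_VERSION = "10.2.0"

if __name__ == "__main__":
    ok = is_at_or_above_floor(PINNED_LEMONADE_VERSION, MIN_LEMONADE_VERSION)
    print("pin respects floor" if ok else "pin is below floor")
```

Comparing integer tuples rather than raw strings is the main design choice here; pre-release suffixes such as "10.2.0rc1" are out of scope for this sketch and would raise a ValueError.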
Verdict: Approve, pending the one CI check that actually proves the fix.

This pins Lemonade back from 10.8.1 to 10.2.0 to fix RAG embeddings, which crash on the NPU/GPU Vulkan path with the llama.cpp build (≥ b6524) that 10.7.0/10.8.x bundle. It's a clean, well-scoped revert:

The bottom line: the code change is correct, but the decisive proof —

🔍 Technical details
Verification performed (claims hold up):

🟢 Process note (not blocking): the test plan's decisive item (

Strengths:
Closing: the premise is invalidated. The embedding-load failure is not a Lemonade version regression: 10.2.0 fails today with the identical
Why this matters
RAG embeddings are broken on the AMD NPU/GPU path. Lemonade 10.7.0/10.8.x bundle a llama.cpp build ≥ b6524, which crashes loading the embedding model nomic-embed-text-v2-moe on the Vulkan backend ("llama-server failed to start"). Test Lemonade Embeddings has been red since the 10.x bumps; 10.2.0 is the last version where it passes.

Upstream is unfixed and there's no working workaround:
llama.cpp #16301 (b6524 Vulkan regression) is open and deprioritized.
lemonade #612 / #941 are open.
The maintainer's GGML_VK_DISABLE_COOPMAT=1 workaround is already applied in our embeddings CI job (test_embeddings.yml) and still fails on our Strix/Windows runners, so a downgrade is the only fix available to us.

10.2.0 is GAIA's documented min_lemonade_version floor (src/gaia/installer/init_command.py), so this pins to the floor, not below it. Port 13305 (introduced in 10.1.0) still applies, and no per-version checksums are pinned. version.py is the single source of truth; the C++ setup docs are updated in lock-step.

Test plan
version.py + all hardcoded doc references moved 10.8.1 → 10.2.0 (no 10.8.1 remains outside lockfiles)
Test Lemonade Embeddings passes on this branch. This is the decisive check; it is manually dispatched against this branch to confirm 10.2.0 loads nomic-embed-text-v2-moe

Related: #1871 (makes a LEMONADE_VERSION bump trigger the full Lemonade test surface, so this class of regression is caught on the bump PR).
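The first test-plan item (no 10.8.1 left outside lockfiles) lends itself to a scripted check. Below is a rough Python sketch of such a scan; the lockfile names, the file suffixes, and the function name are assumptions chosen for illustration and do not describe how the GAIA repository actually performs this check.

```python
from __future__ import annotations

from pathlib import Path

# Assumed lockfile names; adjust to whatever the repository really uses.
LOCKFILE_NAMES = {"package-lock.json", "yarn.lock", "uv.lock", "poetry.lock"}

# Assumed set of text file types worth scanning.
TEXT_SUFFIXES = {".py", ".md", ".mdx", ".yml", ".yaml", ".json", ".txt"}


def find_stale_references(root: str | Path, old_version: str) -> list[tuple[str, int]]:
    """Return (relative path, line number) pairs that still mention old_version."""
    root = Path(root)
    hits: list[tuple[str, int]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.name in LOCKFILE_NAMES:
            continue  # directories and lockfiles are skipped
        if path.suffix not in TEXT_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or binary content is ignored
        for lineno, line in enumerate(text.splitlines(), start=1):
            # Plain substring match: "10.8.1" would also match "10.8.10".
            if old_version in line:
                hits.append((str(path.relative_to(root)), lineno))
    return hits


if __name__ == "__main__":
    for rel_path, lineno in find_stale_references(".", "10.8.1"):
        print(f"{rel_path}:{lineno}")
```

An empty result corresponds to the test-plan condition being met. The substring match is deliberately simple; a stricter version could use a word-boundary regular expression.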