i18n: terminology glossary + AI quality review (#1124) #1126

Merged: comfyui-wiki merged 5 commits into main from feat/i18n-glossary on Jun 10, 2026

comfyui-wiki (Member) commented Jun 10, 2026

Summary

Improves the docs translation pipeline with terminology consistency and AI quality review, addressing the term drift described in #1124 (e.g. "custom node" rendered as both 맞춤형 노드 and 커스텀 노드 in Korean).

Two features land here:

  1. Terminology glossary — keep the same English term rendered consistently across pages.
  2. AI quality review (LLM-as-a-judge) — an independent, cheaper model scores existing translations and flags issues.

Plus cleanup of obsolete ZH→JA leak-fixing tooling.


1. Terminology glossary

Three complementary mechanisms feed the translator, each for a different category of term:

| Mechanism | Effect | Example | Maintained |
| --- | --- | --- | --- |
| preserve_terms (existing) | keep the term in English | checkpoint, LoRA, scheduler | by hand |
| glossary/frontend/{lang}.json | use the frontend's translation | workflow → 워크플로 | machine-synced |
| glossary/overrides/{lang}.json | correct / extend the frontend | custom node → 커스텀 노드 | by hand, wins |

  • glossary/frontend/{lang}.json — a machine mirror of the ComfyUI frontend locales (the authoritative source of term translations). Rebuilt wholesale by npm run glossary:sync; never hand-edited.
  • glossary/overrides/{lang}.json — hand-maintained, wins over the mirror. The place to record a term decision (see #1124) or drop a noisy frontend term:
    { "terms": { "custom node": "커스텀 노드" }, "ignore": ["title", "additional", "work"] }
  • At translation time, only terms that actually appear in a document are selected (whole-word, longest-first, capped) and injected as preferred (not mandatory) hints, so the model keeps natural phrasing when a literal substitution would read awkwardly.
  • New languages extend automatically: once a language is in translation-config.json, npm run glossary:sync generates its frontend mirror (provided the frontend ships that locale). The override file is optional.
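
The layering and selection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `mergeGlossary`, `selectTerms`, `formatHints`, the default cap of 40, and the masking step that stops "node" from matching inside an already selected "custom node" are all assumptions made for the example.

```typescript
// Hypothetical shapes: the PR only shows the override file's "terms" and "ignore" keys.
type Glossary = Record<string, string>;

interface Overrides {
  terms?: Glossary;
  ignore?: string[];
}

// Drop ignored frontend terms, then let hand-written override terms win.
function mergeGlossary(frontend: Glossary, overrides: Overrides): Glossary {
  const ignored = new Set(overrides.ignore ?? []);
  const merged: Glossary = {};
  for (const [term, translation] of Object.entries(frontend)) {
    if (!ignored.has(term)) merged[term] = translation;
  }
  return { ...merged, ...(overrides.terms ?? {}) };
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Select only terms that occur in the document: whole-word, longest-first, capped.
// Assumption: matched spans are masked so shorter terms inside them are not re-selected.
function selectTerms(doc: string, glossary: Glossary, cap = 40): Array<[string, string]> {
  const byLength = Object.keys(glossary).sort((a, b) => b.length - a.length);
  const selected: Array<[string, string]> = [];
  let remaining = doc;
  for (const term of byLength) {
    if (selected.length >= cap) break;
    const wholeWord = new RegExp(`\\b${escapeRegExp(term)}\\b`, "gi");
    const masked = remaining.replace(wholeWord, "\u0000");
    if (masked !== remaining) {
      selected.push([term, glossary[term]]);
      remaining = masked;
    }
  }
  return selected;
}

// Render the selection as preferred (not mandatory) hints for the translator prompt.
function formatHints(selected: Array<[string, string]>): string {
  return selected
    .map(([term, translation]) => `- "${term}" -> "${translation}" (preferred)`)
    .join("\n");
}
```

Whether the real pipeline masks matched spans or simply lists overlapping terms is not stated in the PR; the sketch picks masking because it keeps the hint list short.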

Design notes

  • The frontend UI locale is low signal as a glossary — full of button/toast text and function words whose UI rendering is wrong in prose (of → 중, work → 업무용). A curated common-word blocklist (not a length filter — node/model and work/mode are the same length) drops these at sync time; the long tail goes in override ignore.
  • ComfyUI proper nouns with no settled translation belong in preserve_terms (expanded with embedding, scheduler, sampler, latent, etc.), kept in English — the opposite of the glossary.

    npm run glossary:sync                 # rebuild the frontend mirror, all languages
    npm run glossary:sync -- --lang ko    # one language
    npm run glossary:sync:dry-run         # report counts without writing
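
As a rough illustration of the sync-time filter from the design notes, the sketch below drops blocklisted common words by exact match rather than by length. The blocklist contents and the name `filterFrontendTerms` are assumptions; the PR does not show the curated list, only that words such as "of", "work", and "mode" are the kind it removes while "node" and "model" survive.

```typescript
// Illustrative blocklist only: the actual curated list is not shown in the PR.
const COMMON_WORD_BLOCKLIST = new Set<string>(["of", "work", "mode"]);

// Keep frontend locale entries whose English source is not a blocklisted common word.
// A length filter could not do this: "node" and "work" are both four letters.
function filterFrontendTerms(entries: Record<string, string>): Record<string, string> {
  const kept: Record<string, string> = {};
  for (const [source, target] of Object.entries(entries)) {
    const key = source.trim().toLowerCase();
    if (key === "" || COMMON_WORD_BLOCKLIST.has(key)) continue;
    kept[key] = target;
  }
  return kept;
}
```

Terms that slip past the blocklist would then be handled per language through the override file's `ignore` list.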

2. AI quality review (LLM-as-a-judge)

npm run translate:review scores existing translations with an independent (typically cheaper) model on four axes — accuracy, completeness, terminology (checked against the glossary), and fluency — and lists concrete issues.

  • Advisory, not blocking. Detailed scores and issues go to .github/i18n-logs/review/ (gitignored). Never blocks a PR; independent of the translation model's own === MISMATCHES === self-notes.
  • Review state is committed. The reviewed English-source hash is stored as reviewSourceHash in the translated file's frontmatter (snippets: an MDX comment), mirroring translationSourceHash. This makes review state shared across the team and visible per file. Only the hash goes in frontmatter — scores/issues stay in the report.
  • Incremental. By default only reviews translations that are up to date with English and not yet reviewed at that hash. Re-translation naturally invalidates the review stamp, so a changed file gets re-reviewed.
  • Configurable judge via REVIEW_API_KEY / REVIEW_API_BASE_URL / REVIEW_API_MODEL (falls back to TRANSLATE_*). Retries with backoff on transient network / 5xx / 429 errors.

    npm run translate:review                     # pending reviews, all languages
    npm run translate:review -- --lang ko        # one language
    npm run translate:review -- --all            # re-review everything
    npm run translate:review -- --sample 20      # N pending files per language
    npm run translate:review -- --min-score 4    # report files scoring below 4/5
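
The incremental behaviour can be pictured as a comparison of three hashes. The sketch below is an assumption-laden illustration: `decideReview` and the decision labels are invented names, and it assumes `--all` only bypasses the "already reviewed" check while stale translations are still skipped, which the PR does not spell out.

```typescript
// Frontmatter fields named in the PR; everything else here is hypothetical.
interface TranslatedMeta {
  translationSourceHash?: string; // hash of the English source the translation was made from
  reviewSourceHash?: string;      // hash of the English source the last review covered
}

type ReviewDecision = "review" | "skip-stale" | "skip-reviewed";

function decideReview(
  currentSourceHash: string,
  meta: TranslatedMeta,
  reviewAll = false // assumed meaning of --all
): ReviewDecision {
  // A translation that lags behind English should be re-translated before review.
  if (meta.translationSourceHash !== currentSourceHash) return "skip-stale";
  // Already reviewed at this exact source hash: nothing new to judge.
  if (!reviewAll && meta.reviewSourceHash === currentSourceHash) return "skip-reviewed";
  return "review";
}
```

Under this model a re-translation updates `translationSourceHash` but leaves the older `reviewSourceHash` behind, so the file naturally becomes pending again.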

Example of a real flagged finding from a trial run:

[ko] account/login.mdx — overall 4/5 (accuracy 3): "Sign in" mistranslated as 가입 (sign up) instead of 로그인; "Forgot password?" not using preferred terminology.
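
A finding like the one above could be modelled and filtered as in the following sketch. The `ReviewResult` shape, the function names, and the choice to compare `--min-score` against the overall score (rather than the weakest axis) are assumptions; the PR only names the four axes and the 5-point scale.

```typescript
type Axis = "accuracy" | "completeness" | "terminology" | "fluency";

// Hypothetical report entry: field names are not taken from the PR.
interface ReviewResult {
  lang: string;
  file: string;
  scores: Record<Axis, number>; // each axis scored out of 5
  overall: number;              // overall score out of 5
  issues: string[];
}

// Return the lowest-scoring axis; ties resolve to the earlier axis in the list.
function weakestAxis(scores: Record<Axis, number>): Axis {
  const axes: Axis[] = ["accuracy", "completeness", "terminology", "fluency"];
  return axes.reduce((worst, axis) => (scores[axis] < scores[worst] ? axis : worst));
}

// Keep results scoring below the threshold, worst first.
function selectForReport(results: ReviewResult[], minScore: number): ReviewResult[] {
  return results
    .filter((result) => result.overall < minScore)
    .sort((a, b) => a.overall - b.overall);
}
```

If the threshold is applied to the overall score as assumed here, a file with overall 4/5 would not be listed under `--min-score 4`, even when one axis scored 3.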

3. Cleanup

  • Removed obsolete ZH→JA leak-fixing tooling (fix-zh-leaks.ts, zh-ja-dict.ts, check-ja.ts): the pipeline is English-primary now, so Chinese can't leak into Japanese, and these had no main-pipeline or CI references.

Docs & config

  • New .github/scripts/i18n/README.md documenting the full i18n flow, glossary design, and quality review; root README and localized contributing guides updated.
  • .env.local.example: added OpenRouter example, FRONTEND_LOCALES_PATH, and REVIEW_API_*.
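
The judge configuration fallback might look something like the sketch below. The `REVIEW_API_*` names come from the PR; the `TRANSLATE_API_*` names are only a guess at what `TRANSLATE_*` expands to, and `resolveJudgeConfig` plus the throw-on-missing behaviour are assumptions made for the example.

```typescript
type Env = Record<string, string | undefined>;

interface JudgeConfig {
  apiKey: string;
  baseUrl: string;
  model: string;
}

// Prefer REVIEW_API_<suffix>; otherwise fall back to the translator's setting.
// Assumption: the fallback variables are named TRANSLATE_API_<suffix>.
function resolveJudgeConfig(env: Env): JudgeConfig {
  const pick = (suffix: string): string => {
    const value = env[`REVIEW_API_${suffix}`] || env[`TRANSLATE_API_${suffix}`];
    if (!value) {
      throw new Error(`Missing REVIEW_API_${suffix} (and no TRANSLATE_API_${suffix} fallback)`);
    }
    return value;
  };
  return { apiKey: pick("KEY"), baseUrl: pick("BASE_URL"), model: pick("MODEL") };
}
```

In a Node script this would be called as `resolveJudgeConfig(process.env)`, so setting only `REVIEW_API_MODEL` would reuse the translator's key and endpoint with a different, cheaper model.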

Closes #1124

🤖 Generated with Claude Code

- Added new glossary synchronization and management scripts: `sync-glossary.mjs` and `glossary.mjs` for handling term translations.
- Introduced new commands in `package.json` for glossary synchronization and dry-run checks.
- Removed outdated scripts: `check-ja.ts`, `fix-zh-leaks.ts`, and `zh-ja-dict.ts`.
- Updated `translate-i18n.ts` to incorporate glossary prompts during translation.
- Added new glossary files for Japanese and Korean translations.

This update improves the handling of translation terms and enhances the overall i18n workflow.

- Revised the .env.local.example file to include updated API keys and URLs for OpenRouter and DeepSeek, enhancing clarity on usage.
- Added a new command `npm run glossary:sync` to the README files for rebuilding the terminology glossary from the ComfyUI frontend, ensuring consistency across translations.
- Removed the outdated glossary README file to streamline documentation.

These changes improve the setup process for developers and enhance the overall translation workflow.
mintlify Bot (Contributor) commented Jun 10, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| comfy | 🟢 Ready | View Preview | Jun 10, 2026, 9:34 AM |

OneVth (Contributor) commented Jun 10, 2026

Thanks for picking this up so quickly! The override layer for recording term decisions is a cleaner approach than I had in mind. Looks great.

New `npm run translate:review` scores existing translations with an
independent, typically cheaper model on four axes (accuracy, completeness,
terminology checked against the glossary, fluency) and lists concrete issues.

- Advisory only: reports to .github/i18n-logs/review/ (gitignored), never
  into MDX, never blocking. Independent of the translation model's own
  MISMATCH self-notes.
- Incremental: reviews only up-to-date, not-yet-reviewed translations; review
  state in a side reviewed.json, not in frontmatter.
- Configurable judge via REVIEW_API_* (falls back to TRANSLATE_*); retry with
  backoff on transient network / 5xx / 429 errors.
- Flags: --lang, --all, --sample N, --min-score N, --snippets, file args.

Docs (root + i18n README) and .env.local.example updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The reviewed English-source hash now lives as `reviewSourceHash` in the
translated file's frontmatter (snippets: an MDX comment), committed to git —
mirroring `translationSourceHash`. This makes review state shared across the
team and visible per file, rather than a local-only `.github/i18n-logs/`
side file that wasn't tracked.

- Incremental skip now reads reviewSourceHash from the translated file.
- Only the hash goes in frontmatter; scores and the issue list stay in the
  gitignored quality-report.
- Removed the reviewed.json side file and REVIEW_STATE_JSON constant.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@comfyui-wiki changed the title from "Add terminology glossary for consistent translations (#1124)" to "i18n: terminology glossary + AI quality review (#1124)" on Jun 10, 2026
@comfyui-wiki merged commit 27c6281 into main on Jun 10, 2026
7 checks passed
@github-actions Bot deleted the feat/i18n-glossary branch on June 10, 2026 12:24

Development

Successfully merging this pull request may close these issues.

Inconsistent terminology in translated docs (no term-mapping mechanism)

2 participants