Use HTML entities in translation files #1787

Merged
idrassi merged 2 commits into veracrypt:master from sandakersmann:master on Jun 21, 2026
sandakersmann commented Jun 19, 2026

Contributor Author

I used this script:

#!/usr/bin/env python3
"""
escape_translations.py

Replaces, ONLY inside the translated text of <entry>...</entry> elements
of VeraCrypt language XML files:

    "->"  ->  "&gt;"
    "<"   ->  "&lt;"
    ">"   ->  "&gt;"

Usage:
    python escape_translations.py /path/to/VeraCrypt/Translations
"""

import sys
import re
from pathlib import Path

# Matches:  <entry ...attributes...>  CONTENT  </entry>
# group 1 = opening tag, group 2 = inner text, group 3 = closing tag
ENTRY_RE = re.compile(r'(<entry\b[^>]*>)(.*?)(</entry>)', re.DOTALL)

def escape_content(text: str) -> str:
    """Escape < and > inside a translated string, collapsing -> to &gt;."""
    # 1) Handle the arrow first so its '>' isn't touched again below.
    text = text.replace('->', '&gt;')
    # 2) Escape any remaining angle brackets.
    text = text.replace('<', '&lt;')
    text = text.replace('>', '&gt;')
    return text

def process_file(path: Path) -> bool:
    """Process a single file. Returns True if it was modified."""
    original = path.read_text(encoding='utf-8')

    def repl(match: re.Match) -> str:
        opening, content, closing = match.group(1), match.group(2), match.group(3)
        return opening + escape_content(content) + closing

    updated = ENTRY_RE.sub(repl, original)

    if updated != original:
        path.write_text(updated, encoding='utf-8')
        return True
    return False

def main():
    if len(sys.argv) != 2:
        print(f"Usage: {sys.argv[0]} /path/to/Translations")
        sys.exit(1)

    folder = Path(sys.argv[1])
    if not folder.is_dir():
        print(f"Error: '{folder}' is not a directory.")
        sys.exit(1)

    files = sorted(folder.rglob('*'))
    changed = 0
    for f in files:
        if f.is_file() and f.suffix.lower() == '.xml':
            if process_file(f):
                changed += 1
                print(f"Updated: {f}")

    print(f"\nDone. {changed} file(s) modified.")

if __name__ == '__main__':
    main()
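
Since the script rewrites the files with a regular expression rather than an XML parser, a well-formedness check after running it may be worthwhile. The following is only a sketch: `find_malformed` is a hypothetical helper that is not part of this PR, and unlike the script above it matches the `.xml` suffix case-sensitively.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def find_malformed(folder: Path) -> list[tuple[Path, str]]:
    """Return (path, error message) for every .xml file that fails to parse."""
    problems = []
    for path in sorted(folder.rglob('*.xml')):
        try:
            # ET.parse raises ParseError on input that is not well-formed,
            # e.g. a raw '<' left inside an entry's text.
            ET.parse(str(path))
        except ET.ParseError as exc:
            problems.append((path, str(exc)))
    return problems

# Example call (the path is a placeholder):
#     for path, error in find_malformed(Path('/path/to/Translations')):
#         print(f"{path}: {error}")
```

An empty result only shows that the files still parse; it says nothing about whether the displayed text changed.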

idrassi commented Jun 21, 2026

Member

Thanks.

Most of the change is display-neutral, but there is one issue: the script converts -> to &gt;, which changes the user-visible text from -> to > in a few Czech, Norwegian Bokmål, and Thai strings. Since the goal is only XML entity escaping, these should be preserved as -&gt;.

Can you please address this?
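
To illustrate the difference being described, here is a minimal sketch comparing the posted escaping with an arrow-preserving variant. Both function names are hypothetical, and `html.unescape` stands in for whatever decoding the application's XML reader performs.

```python
import html

def escape_collapsing_arrow(text: str) -> str:
    """Behaviour of the posted script: '->' becomes '&gt;'."""
    text = text.replace('->', '&gt;')
    return text.replace('<', '&lt;').replace('>', '&gt;')

def escape_preserving_arrow(text: str) -> str:
    """Pure entity escaping: '->' becomes '-&gt;', so the hyphen survives."""
    return text.replace('<', '&lt;').replace('>', '&gt;')

sample = 'Settings -> Preferences'

# Decoding the entities shows what a user would end up seeing.
print(html.unescape(escape_collapsing_arrow(sample)))  # Settings > Preferences
print(html.unescape(escape_preserving_arrow(sample)))  # Settings -> Preferences
```

The second variant simply drops the arrow-specific replacement; everything else in the script would stay the same.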

sandakersmann commented Jun 21, 2026

Contributor Author

It looks like > was preferred over -> earlier, so I think the correct thing to do is to phase out ->.

Edit: This way we can have a uniform standard in all the translation files.
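
One way to check whether a file still deviates from that standard is to list entries that contain an arrow in either raw or escaped form. This is a sketch under assumptions: `entries_with_arrows` is a hypothetical helper, and the regular expression mirrors the one in the script above rather than using an XML parser.

```python
import re

# Same entry pattern as the script, capturing only the inner text.
ENTRY_RE = re.compile(r'<entry\b[^>]*>(.*?)</entry>', re.DOTALL)
# An arrow may appear raw ('->') or already escaped ('-&gt;').
ARROW_RE = re.compile(r'->|-&gt;')

def entries_with_arrows(xml_text: str) -> list[str]:
    """Return the inner text of every <entry> that still contains an arrow."""
    return [
        match.group(1)
        for match in ENTRY_RE.finditer(xml_text)
        if ARROW_RE.search(match.group(1))
    ]

sample = (
    '<entry lang="cs" key="X">A -> B</entry>\n'
    '<entry lang="cs" key="Y">A -&gt; B</entry>\n'
    '<entry lang="cs" key="Z">A &gt; B</entry>\n'
)
print(entries_with_arrows(sample))  # ['A -> B', 'A -&gt; B']
```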

idrassi commented Jun 21, 2026

Member

Indeed, I missed this point.
All is OK then. Thanks.

idrassi merged commit 0d375e9 into veracrypt:master on Jun 21, 2026
1 check passed