feat: benchmark /translate skill — 200 keys regenerated across 9 languages#12028
Draft
premiumjibles wants to merge 9 commits into develop
Refactor translation pipeline so each per-language sub-agent owns its full lifecycle (translate → validate → retry → review → refine → merge → verify) instead of the orchestrator managing all steps across 9 languages. Reduces orchestrator to a lightweight coordinator that spawns agents and reads status files.
- Extract shared script-detection utilities into script-utils.js
- Refactor validate.js to import from script-utils.js (no behavior change)
- Add validate-file.js for post-merge full-file validation (JSON validity, key completeness, aggregate script ratio, regression detection)
- Simplify merge.js: remove duplicate script-validation, add pre-merge backup for rollback support
- Rewrite SKILL.md Steps 5-8 for self-contained language agent architecture
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
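To make the post-merge checks listed for validate-file.js more concrete, here is a minimal sketch of three of them (JSON validity, key completeness, aggregate script ratio). It is not the actual validate-file.js: the function names, the 0.5 threshold, and the choice to strip %{...} placeholders before counting letters are all assumptions, and regression detection is left out.

```javascript
// Hypothetical sketch; names and thresholds are illustrative, not from the PR.

// Flatten nested locale JSON into [dottedKey, stringValue] pairs.
function flattenKeys(obj, prefix = '') {
  return Object.entries(obj).flatMap(([k, v]) =>
    v !== null && typeof v === 'object'
      ? flattenKeys(v, `${prefix}${k}.`)
      : [[`${prefix}${k}`, String(v)]]
  );
}

// Share of letters in `text` that match `scriptPattern`.
// `scriptPattern` must not carry the `g` flag, because .test() would then be stateful.
function scriptRatio(text, scriptPattern) {
  const letters = text.match(/\p{L}/gu) ?? [];
  if (letters.length === 0) return 1; // nothing to judge
  const inScript = letters.filter((ch) => scriptPattern.test(ch)).length;
  return inScript / letters.length;
}

function validateLocaleText(englishText, localeText, scriptPattern, minRatio = 0.5) {
  let english;
  let locale;
  try {
    english = JSON.parse(englishText);
    locale = JSON.parse(localeText);
  } catch (err) {
    return { ok: false, errors: [`invalid JSON: ${err.message}`] };
  }

  // Key completeness: every English key must exist in the locale file.
  const localeEntries = new Map(flattenKeys(locale));
  const errors = flattenKeys(english)
    .map(([key]) => key)
    .filter((key) => !localeEntries.has(key))
    .map((key) => `missing key: ${key}`);

  // Aggregate script ratio over the whole file, ignoring %{placeholders}.
  const allText = [...localeEntries.values()].join(' ').replace(/%\{[^}]+\}/g, '');
  const ratio = scriptRatio(allText, scriptPattern);
  if (ratio < minRatio) {
    errors.push(`script ratio ${ratio.toFixed(2)} below ${minRatio}`);
  }
  return { ok: errors.length === 0, errors };
}
```

For a Russian file, a caller might pass /\p{Script=Cyrillic}/u as the pattern. Latin-script locales such as de or fr would need a different check, since a script ratio cannot tell them apart from untranslated English.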
Translate 11 missing English strings into de, es, fr, ja, pt, ru, tr, uk, zh using the new /translate Claude Code skill. Covers RFOX FAQ entries, action center failure messages, and yield cooldown notices.
Also fixes merge.js to only add new keys by default, never overwriting existing translations. A --force flag is available for intentional re-translation of changed English strings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
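The add-only merge behavior described in this commit can be sketched as follows. This is an illustration under assumptions, not the code in merge.js: it treats locale data as a flat key/value object, and the function name and return shape are hypothetical.

```javascript
// Hypothetical sketch of an add-only merge with an opt-in overwrite.
function mergeTranslations(existing, incoming, { force = false } = {}) {
  const merged = { ...existing };
  const added = [];
  const overwritten = [];
  const skipped = [];

  for (const [key, value] of Object.entries(incoming)) {
    const alreadyTranslated = Object.prototype.hasOwnProperty.call(existing, key);
    if (!alreadyTranslated) {
      merged[key] = value; // new key: always added
      added.push(key);
    } else if (force) {
      merged[key] = value; // existing key: replaced only when forced
      overwritten.push(key);
    } else {
      skipped.push(key); // default: existing translation is kept
    }
  }
  return { merged, added, overwritten, skipped };
}
```

Returning the skipped keys alongside the merged result lets a caller report which incoming translations were ignored, which may help when deciding whether a forced re-run is warranted.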
- Fix glossary key mismatch in compile-report.js (disambiguated keys
didn't match actual glossary.json keys, silently skipping 4 checks)
- Fix mixed Latin/Cyrillic in ru.md locale guide (vы → вы)
- Fix fragile file-path detection in merge.js (use fs.existsSync instead
of includes('/'), add missing-arg guard and JSON.parse try/catch)
- Add try/catch in missing-keys.js for corrupt/missing locale files
- Add French elision rule to fr.md: use "de" when numeric %{amount}
buffers the symbol, use "en" when symbol placeholder is directly
after the preposition (avoids runtime elision ambiguity)
- Retranslate French yield/unstake strings applying the new rule:
"déstaking de %{symbol}" → "déstake en %{symbol}"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
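The merge.js hardening in this commit (existence check instead of looking for a slash, a missing-argument guard, and a guarded JSON.parse) might look roughly like the sketch below. It assumes the argument can be either a file path or inline JSON, which the commit message does not state; the function name and error messages are hypothetical.

```javascript
const fs = require('fs');

// Hypothetical sketch: accept either a path to a JSON file or inline JSON.
function loadJsonArg(arg) {
  if (arg === undefined || arg === '') {
    throw new Error('missing argument: expected a file path or inline JSON');
  }
  // includes('/') misclassifies relative paths such as "fr.json";
  // asking the filesystem avoids guessing from the string's shape.
  const raw = fs.existsSync(arg) ? fs.readFileSync(arg, 'utf8') : arg;
  try {
    return JSON.parse(raw);
  } catch (err) {
    throw new Error(`could not parse JSON input: ${err.message}`);
  }
}
```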
Ukrainian: в/у and з/із/зі preposition alternation rules for dynamic placeholders where runtime values are unknown at translation time.
Turkish: vowel harmony rules for dynamic placeholders; prefer postpositions over direct suffixes on placeholders, since crypto symbols span all vowel classes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add register examples to all 9 locale files (de, es, fr, ja, pt, ru, tr, uk, zh) with correct/incorrect pairs for non-pronoun register markers
- Add register consistency as 6th reviewer focus in SKILL.md
- Add "Multichain Snap" and "Snap" to glossary never-translate list
- Fix 2 broken community translations across all 9 locales (stale multiChain.body, missing %{symbol} in getAssets.about)
- Update compile-report.js to use stemMatch instead of raw .includes() for glossary metrics
- Improve stemMatch with language-aware morphological matching (suffix stripping, Levenshtein distance, CJK character overlap)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
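To show why edit distance helps where a raw .includes() check fails on inflected forms, here is a small sketch. It is not the stemMatch in compile-report.js: it covers only the Levenshtein part for single-word terms, skips suffix stripping and CJK character overlap, and the function names and the distance threshold of 2 are assumptions.

```javascript
// Classic single-row Levenshtein distance (counts UTF-16 code units).
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // D[i-1][j]
      row[j] = Math.min(
        above + 1, // deletion
        row[j - 1] + 1, // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution or match
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Hypothetical helper: does `text` contain `term` or a close inflection of it?
function fuzzyTermMatch(term, text, maxDistance = 2) {
  const needle = term.toLowerCase();
  const haystack = text.toLowerCase();
  if (haystack.includes(needle)) return true;
  if (needle.length < 5) return false; // short terms: exact match only
  const words = haystack.split(/[^\p{L}\p{N}]+/u).filter(Boolean);
  return words.some((word) => levenshtein(word, needle) <= maxDistance);
}
```

For example, the Russian glossary form "комиссия" does not appear verbatim in "Оплатите комиссию сети", but the accusative "комиссию" is one substitution away, so the helper reports a match.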
Remove 200 existing translated keys from all 9 non-English locales and regenerate them using the /translate skill pipeline. This creates a diff where language experts can compare original vs. AI-generated translations for quality benchmarking.
Key selection: 40 short, 50 multi-placeholder, 7 tagged, 30 long, 30 crypto-domain, 43 general, distributed across 31 namespaces.
All 1800 translations (200 × 9 locales) passed validation with 0 rejections and 0 manual review items.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Description
Benchmark of the /translate skill pipeline quality. Removes 200 existing translated keys from all 9 non-English locales and regenerates them using the automated translate-review-refine pipeline. The diff shows original (human) translations vs. AI-generated translations for expert comparison.
Key Selection (200 keys, stratified sampling)
- Short: 40
- Multi-placeholder: 50 (keys with %{variable} placeholders)
- Tagged: 7 (<span>, <strong>, <link> etc.; only 7 exist)
- Long: 30
- Crypto-domain: 30
- General: 43
Keys distributed across 31 top-level namespaces for realistic coverage.
Pipeline Results
All 1800 translations (200 × 9 locales) passed validation, with 0 rejections and 0 manual review items.
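What the pipeline's validation actually checks is defined in the skill's scripts. As one illustration of the kind of check involved, the sketch below tests whether %{variable} placeholders and HTML-like tags survive translation. The function names and regular expressions are assumptions, not taken from validate.js.

```javascript
// Hypothetical sketch: collect %{placeholders} and <tags> as a sorted list.
function extractTokens(str) {
  const placeholders = str.match(/%\{[^}]+\}/g) ?? [];
  const tags = str.match(/<\/?[A-Za-z][^>]*>/g) ?? [];
  return [...placeholders, ...tags].sort();
}

// True when the translation carries exactly the same tokens as the source,
// in any order (word order legitimately differs between languages).
function tokensPreserved(source, translation) {
  const expected = extractTokens(source);
  const actual = extractTokens(translation);
  return (
    expected.length === actual.length &&
    expected.every((token, i) => token === actual[i])
  );
}
```

Comparing sorted lists rather than sets means a placeholder used twice in the source must also appear twice in the translation.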
How to Review
Each locale's diff shows the removed original translations (red) vs. regenerated translations (green). Language experts should evaluate:
- %{variables} and HTML tags preserved?
Ground truth (original translations) saved at /tmp/benchmark-ground-truth.json for automated comparison.
Issue (if applicable)
N/A — internal benchmark
Risk
Zero risk. This is a draft PR for evaluation only, not intended for merge. Translation-only changes, no code modifications.
Testing
Engineering
node .claude/skills/translate/scripts/validate-file.js {locale} for each of de, es, fr, ja, pt, ru, tr, uk, zh
Operations
This is a translation-only benchmark PR. No functional changes to test.
Screenshots (if applicable)
N/A