Commit b03bf22

l10n: docs: add review instructions in AGENTS.md
Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides
comprehensive guidance for AI agents to review translation files.

Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming.

To address this, git-po-helper implements the compare subcommand,
which extracts new or modified translations with full context
(complete msgid/msgstr pairs), significantly improving review
efficiency. A limitation is that the extracted content lacks other
already-translated content for reference, which may affect terminology
consistency. This is mitigated by including a glossary in the PO file
header. git-po-helper-generated review files include the header entry
and glossary (if present) by default.

The review workflow leverages git-po-helper subcommands:

- git-po-helper compare: Extract new or changed entries between two
  versions of a PO file into a valid PO file for review. Supports
  multiple modes:

  * Compare HEAD with the working tree (local changes)
  * Compare a commit's parent with the commit (--commit)
  * Compare a commit with the working tree (--since)
  * Compare two arbitrary revisions (-r)

- git-po-helper msg-select: Split large review files into smaller
  batches by entry index range for manageable review sessions.
  Supports range formats like "-50" (first 50), "51-100", "101-"
  (to end).

Evaluation with the Qwen model:

    git-po-helper agent-run review --commit 2000abe --agent qwen

Benchmark results:

| Metric           | Value                            |
|------------------|----------------------------------|
| Turns            | 22                               |
| Input tokens     | 537263                           |
| Output tokens    | 4397                             |
| API duration     | 167.84 s                         |
| Review score     | 96/100                           |
| Total entries    | 63                               |
| With issues      | 4 (1 critical, 2 major, 1 minor) |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
1 parent e592143 commit b03bf22

1 file changed

Lines changed: 197 additions & 0 deletions

po/AGENTS.md

@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
1. Generating or updating po/git.pot
2. Updating po/XX.po
3. Translating po/XX.po
4. Reviewing translation quality


## Background knowledge for localization workflows
@@ -669,6 +670,202 @@ step 8 after step 6.
```


### Task 4: Review translation quality

Review may target the full `po/XX.po`, a specific commit, or changes since a
commit. When asked to review, follow the steps below.

**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`,
`git format-patch`, or similar to get changes, because they break PO context;
use **only** `git-po-helper compare` for extraction. Without `git-po-helper`,
refuse the task.
Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3).
The **only** ways to reach step 8 are when step 4 finds `po/review-todo.json`
missing or empty (no batch left to review), or when step 1 finds
`po/review-result.json` already present.

1. **Check for existing review (resume support)**: Evaluate the following in order:

   - If `po/review-input.po` does **not** exist, proceed to step 2 (Extract
     entries) for a fresh start.
   - Else if `po/review-result.json` exists, go to step 8 (only after loop exits).
   - Else if `po/review-done.json` exists, go to step 6 (Rename result).
   - Else if `po/review-todo.json` exists, go to step 5 (Review the current
     batch).
   - Else go to step 3 (Prepare one batch).
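
The resume checks in step 1 amount to a fixed-priority test on which intermediate files exist. A minimal sketch, assuming a hypothetical helper named `review_next_step` (not part of `git-po-helper`) that takes the directory holding the review files and prints the number of the step to run next:

```shell
# Sketch of the step 1 decision order. The function name and the optional
# directory argument are hypothetical; the file names come from the workflow.
review_next_step () {
    dir=${1:-po}
    if test ! -f "$dir/review-input.po"
    then
        echo 2    # fresh start: extract entries
    elif test -f "$dir/review-result.json"
    then
        echo 8    # a final result already exists
    elif test -f "$dir/review-done.json"
    then
        echo 6    # batch reviewed but not yet renamed
    elif test -f "$dir/review-todo.json"
    then
        echo 5    # batch prepared but not yet reviewed
    else
        echo 3    # prepare the next batch
    fi
}
```

For example, `review_next_step po` prints `5` when `po/review-input.po` and `po/review-todo.json` exist but neither a done file nor a result file does.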

2. **Extract entries**: Run `git-po-helper compare` with the desired range and
   redirect the output to `po/review-input.po`. See "Comparing PO files for
   translation and review" under git-po-helper for options.

3. **Prepare one batch**: Batching keeps each run small so the model can
   complete the review within its limited context. **Directly execute** the
   script below; it is authoritative, so do not reimplement it.

   ```shell
   review_one_batch () {
       min_batch_size=${1:-100}
       INPUT_PO="po/review-input.po"
       PENDING="po/review-pending.po"
       TODO="po/review-todo.json"
       DONE="po/review-done.json"
       BATCH_FILE="po/review-batch.txt"

       if test ! -f "$INPUT_PO"
       then
           rm -f "$TODO"
           echo >&2 "cannot find $INPUT_PO, nothing for review"
           return 1
       fi
       # Start over when there is no pending file or the input is newer.
       if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING"
       then
           rm -f "$BATCH_FILE" "$TODO" "$DONE"
           rm -f po/review-result*.json
           cp "$INPUT_PO" "$PENDING"
       fi

       # grep -c already prints 0 (and exits non-zero) when nothing matches,
       # so default the count only when grep printed nothing at all.
       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null)
       ENTRY_COUNT=${ENTRY_COUNT:-0}
       # Discount one msgid line, presumably the header entry.
       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
       if test "$ENTRY_COUNT" -eq 0
       then
           rm -f "$TODO"
           echo >&2 "No entries left for review"
           return 1
       fi

       # Use larger batches when many entries remain.
       if test "$ENTRY_COUNT" -gt "$min_batch_size"
       then
           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
           then
               NUM=$((min_batch_size * 2))
           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
           then
               NUM=$((min_batch_size + min_batch_size / 2))
           else
               NUM=$min_batch_size
           fi
       else
           NUM=$ENTRY_COUNT
       fi

       BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0)
       BATCH=$((BATCH + 1))
       echo "$BATCH" >"$BATCH_FILE"

       git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING"
       git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING"
       mv "${PENDING}.tmp" "$PENDING"
       echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)"
   }
   # The parameter controls batch size; reduce if the batch file is too large.
   review_one_batch 100
   ```

4. **Check todo file**: If `po/review-todo.json` does not exist or is empty,
   review is complete; go to step 8 (only after loop exits). Otherwise proceed to
   step 5.
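
The step 4 check maps naturally onto `test -s`, which succeeds only for a file that exists and has a size greater than zero. A small sketch, assuming "empty" means a zero-byte file and using a hypothetical helper name, `review_has_todo`:

```shell
# Hypothetical helper: succeeds only when the todo file exists and is
# non-empty, that is, when a batch is waiting for review in step 5.
review_has_todo () {
    test -s "${1:-po/review-todo.json}"
}
```

A caller could write `review_has_todo || echo "go to step 8"`. A file that contains only whitespace still counts as non-empty under this check.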

5. **Review the current batch**: Review translations in `po/review-todo.json`
   and write findings to `po/review-done.json` as follows:
   - Use "Background knowledge for localization workflows" for PO/JSON structure,
     placeholders, and terminology.
   - If `header_comment` includes a glossary, follow it for consistency.
   - Do **not** review the header (`header_comment`, `header_meta`).
   - For every other entry, check the entry's `msgstr` **array** (translation
     forms) against `msgid` / `msgid_plural` using the "Quality checklist" above.
   - Write JSON per "Review result JSON format" below; use `{"issues": []}` when
     there are no issues. **Always** write `po/review-done.json`, because it
     marks the batch complete.

6. **Rename result**: Rename `po/review-done.json` to `po/review-result-<N>.json`,
   where N is the value in `po/review-batch.txt` (the batch just completed).
   Run the script below:

   ```shell
   review_rename_result () {
       TODO="po/review-todo.json"
       DONE="po/review-done.json"
       BATCH_FILE="po/review-batch.txt"
       if test -f "$DONE"
       then
           N=$(cat "$BATCH_FILE" 2>/dev/null) || {
               echo "ERROR: $BATCH_FILE not found." >&2
               return 1
           }
           mv "$DONE" "po/review-result-$N.json"
           echo "Renamed to po/review-result-$N.json"
       fi
       rm -f "$TODO"
   }
   review_rename_result
   ```

7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle.
   Do **not** skip this step or go to step 8. Within the loop, step 8 is
   reached **only** when step 4 finds `po/review-todo.json` missing or empty.

8. **Only after loop exits**: **Directly execute** the command below. It merges
   results, applies suggestions, and displays the report. The process ends here.

   ```shell
   git-po-helper agent-run review --report po
   ```

   **Do not** run cleanup or delete intermediate files. Keep them for inspection
   or resumption.

**Review result JSON format**:

The **Review result JSON** format defines the structure for translation
review reports. For each entry with translation issues, create an issue
object as follows:

- Copy the original entry's `msgid`, optional `msgid_plural`, and optional
  `msgstr` array (original translation forms) into the issue object. Use the
  same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present
  (one element for singular, multiple for plural).
- Write a summary of all issues found for this entry in `description`.
- Set `score` according to the severity of issues found for this entry,
  from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
  **Lower score means more severe issues.**
- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**:
  one string for singular, multiple strings for plural forms in order. This is
  required for `git-po-helper` to apply suggestions.
- Include only entries with issues (score less than 3). When no issues are
  found in the batch, write `{"issues": []}`.

Example review result (with issues):

```json
{
  "issues": [
    {
      "msgid": "commit",
      "msgstr": ["委托"],
      "score": 0,
      "description": "Terminology error: 'commit' should be translated as '提交'",
      "suggest_msgstr": ["提交"]
    },
    {
      "msgid": "repository",
      "msgid_plural": "repositories",
      "msgstr": ["版本库", "版本库"],
      "score": 2,
      "description": "Consistency issue: suggest using '仓库' consistently",
      "suggest_msgstr": ["仓库", "仓库"]
    }
  ]
}
```

Field descriptions for each issue object (element of the `issues` array):

- `msgid` (and optional `msgid_plural` for plural entries): Original source text.
- `msgstr` (optional): JSON array of original translation forms (same meaning as
  in GETTEXT JSON entries).
- `suggest_msgstr`: JSON array of suggested translation forms; **must be an
  array** (e.g. `["提交"]` for singular). Plural entries use multiple elements
  in order.
- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
- `description`: Brief summary of the issue.
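
Because a lower `score` means a more severe problem, it is easy to invert the scale by accident. The mapping from the field descriptions above can be sketched as a small lookup; `review_severity` is a hypothetical name used only for illustration and is not part of `git-po-helper`:

```shell
# Hypothetical lookup from a review score to its severity label.
# Lower scores are more severe; 3 means the entry has no issues.
review_severity () {
    case "$1" in
    0) echo critical ;;
    1) echo major ;;
    2) echo minor ;;
    3) echo perfect ;;
    *) echo >&2 "invalid score: $1"; return 1 ;;
    esac
}
```

`review_severity 0` prints `critical`; any value outside 0 to 3 prints an error to standard error and returns a non-zero status.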


## Human translators remain in control

Git translation is human-driven; language team leaders and contributors are
