RapidTrans exposes a local REST API for translation, document translation, health checks, metrics, and administration. The default base URL is:
http://127.0.0.1:8080
The OpenAPI specification is available at openapi.yaml and at runtime from GET /openapi.yaml.

| Feature | Method and path | Description |
|---|---|---|
| Text translation | POST /v1/translate | Translate a single text segment. |
| Document translation | POST /v1/translate/document | Translate Markdown-like documents while preserving structure and protected spans. |
| Liveness | GET /healthz | Returns 200 when the process is alive. |
| Readiness | GET /readyz | Returns 200 when the model backend is ready, 503 while loading or unavailable. |
| Metrics | GET /metrics | Prometheus text metrics. |
| OpenAPI | GET /openapi.yaml | Returns the API specification. |
| Admin config | GET /admin/config, POST /admin/config | Read or update runtime configuration. |
| Reload model | POST /admin/reload | Reload the backend using current settings. |
| Cache info | GET /admin/cache | Read persistent translation cache state. |
| Clear cache | POST /admin/cache/clear | Clear persistent translation cache. |
| Accelerators | GET /admin/accelerators | Read CPU/Vulkan detection and selected device. |

- Request bodies use Content-Type: application/json.
- Responses are JSON unless otherwise stated.
- Text is UTF-8.
- Unknown JSON fields are rejected.
- Request body size is limited by MACLAW_REQUEST_LIMIT_BYTES.
- If the service is at capacity, translation endpoints return 429.
- Cached text and fully cached document requests can bypass the concurrency slot.

Error response shape:
{
"error": "translator is busy"
}

source_lang and target_lang accept codes and common names:

- zh, en, ja, ko, fr, de, es, ru, it, pt, ar, hi, vi, th, id, ms, tr, nl, pl
- Chinese, English, Japanese, Korean, French, German, Spanish
- 中文, 英文, 日文, 韩文

Default behavior:

- If target_lang is omitted and the input contains Chinese, the target defaults to English.
- If target_lang is omitted and the input does not contain Chinese, the target defaults to Chinese.
- Integrations should usually pass target_lang explicitly.

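The fallback rule above can be mirrored on the client so that every request carries an explicit target_lang. The following is a minimal sketch, not the service's implementation: the helper names are hypothetical, and "contains Chinese" is approximated here by the CJK Unified Ideographs block, which also matches Japanese kanji.

```python
import re

# Assumption: "contains Chinese" is approximated by the CJK Unified Ideographs
# block (U+4E00 to U+9FFF). The service's actual detection logic may differ.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def default_target_lang(text):
    """Mirror the documented fallback: Chinese input -> "en", anything else -> "zh"."""
    return "en" if _CJK.search(text) else "zh"


def with_explicit_target(payload):
    """Return a copy of payload that always carries target_lang."""
    out = dict(payload)
    # setdefault leaves a caller-supplied target_lang untouched.
    out.setdefault("target_lang", default_target_lang(out["text"]))
    return out
```

Calling with_explicit_target on a payload that already has target_lang returns it unchanged, so the helper is safe to apply to every outgoing request.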
POST /v1/translate translates one text segment. This endpoint is best for chat tools, agent tool calls, and short independent strings.
Request:
{
"text": "The contract shall be governed by New York law.",
"source_lang": "en",
"target_lang": "zh",
"domain": "legal",
"glossary": "governed by=受...管辖"
}

Request fields:

| Field | Type | Required | Description |
|---|---|---|---|
| text | string | yes | Text to translate. It is trimmed before translation. |
| source_lang | string | no | Source language code or name. If omitted, the service can infer English/Chinese defaults. |
| target_lang | string | no | Target language code or name. Defaults to English/Chinese mutual translation when omitted. |
| domain | string | no | Optional domain hint, for example legal, finance, or medical. |
| glossary | string | no | Optional terminology hint. |

Response:
{
"translation": "该合同受纽约法律约束。",
"model": "Hy-MT1.5-1.8B-1.25bit-GGUF",
"backend_ms": 1234,
"elapsed_ms": 1236,
"cached": false
}

Response fields:

| Field | Type | Description |
|---|---|---|
| translation | string | Translated text. |
| model | string | Model name returned by the backend. |
| backend_ms | integer | Backend translation time in milliseconds. Cached responses report 0. |
| elapsed_ms | integer | End-to-end HTTP handler time in milliseconds. |
| cached | boolean | Present and true when the response came from cache. |

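For callers that prefer a scripted client, the request and response shapes above can be exercised with the Python standard library. This is a sketch under assumptions: the function names are hypothetical, the base URL is the default from this document, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # default base URL from this document


def build_translate_request(text, target_lang, source_lang=None, domain=None, glossary=None):
    """Build a POST /v1/translate request without sending it."""
    body = {"text": text, "target_lang": target_lang}
    # Include only the optional fields that were actually supplied.
    for key, value in (("source_lang", source_lang), ("domain", domain), ("glossary", glossary)):
        if value is not None:
            body[key] = value
    return urllib.request.Request(
        BASE_URL + "/v1/translate",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, target_lang, timeout=30.0, **optional):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_translate_request(text, target_lang, **optional)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a running service, translate("Hello", "zh")["translation"] would hold the translated text. Non-2xx responses raise urllib.error.HTTPError, which the caller is expected to handle.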
cURL:
curl -s http://127.0.0.1:8080/v1/translate \
-H "Content-Type: application/json" \
-d '{"text":"Hello","target_lang":"zh"}'

POST /v1/translate/document translates paragraph-like segments in Markdown-like documents. The service preserves blank lines, headings, list/table markers, URLs, email addresses, Markdown link destinations, HTML tags, inline code, and fenced code blocks.
Request:
{
"text": "# Contract\n\nThe contract shall be governed by New York law.\n\n```go\nfmt.Println(\"hello\")\n```\n",
"source_lang": "en",
"target_lang": "zh",
"format": "markdown",
"domain": "legal",
"concurrency": 2
}

Request fields:

| Field | Type | Required | Description |
|---|---|---|---|
| text | string | yes | Document text to translate. |
| source_lang | string | no | Source language code or name. |
| target_lang | string | no | Target language code or name. |
| format | string | no | Optional format hint. Use markdown for Markdown-like documents. |
| domain | string | no | Optional domain hint. |
| glossary | string | no | Optional terminology hint. |
| concurrency | integer | no | Segment-level parallelism, range 1-64. Omit or use 1 for sequential translation. |

Response:
{
"translation": "# 合同\n\n该合同受纽约法律约束。\n\n```go\nfmt.Println(\"hello\")\n```\n",
"model": "Hy-MT1.5-1.8B-1.25bit-GGUF",
"backend_ms": 2800,
"elapsed_ms": 2805,
"cached_hits": 1,
"segments": [
{
"index": 0,
"source": "Contract",
"translation": "合同",
"skipped": false
},
{
"index": 1,
"source": "\n",
"translation": "\n",
"skipped": true
}
]
}

segments[].skipped=true means the segment was preserved without model inference. segments[].cached=true means the segment was served from persistent cache. Duplicate and protected-equivalent segments are translated once and reused with each segment's own protected spans restored.
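A client can enforce the documented concurrency range before sending a document and then tally the per-segment flags in the response. The sketch below uses hypothetical helper names and assumes that a segment without a skipped or cached flag should be counted as neither.

```python
def build_document_body(text, target_lang, concurrency=1, fmt="markdown"):
    """Build a POST /v1/translate/document body, enforcing the documented 1-64 range."""
    if not 1 <= concurrency <= 64:
        raise ValueError("concurrency must be between 1 and 64")
    return {
        "text": text,
        "target_lang": target_lang,
        "format": fmt,
        "concurrency": concurrency,
    }


def summarize_segments(response):
    """Count segments by how they were produced, based on the skipped/cached flags."""
    segments = response.get("segments", [])
    skipped = sum(1 for s in segments if s.get("skipped"))
    cached = sum(1 for s in segments if s.get("cached"))
    return {
        "total": len(segments),
        "skipped": skipped,
        "cached": cached,
        # Assumption: skipped and cached are mutually exclusive for one segment.
        "other": len(segments) - skipped - cached,
    }
```

Applied to the example response above, the summary reports two segments, one of them skipped and none served from cache.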
GET /admin/config returns current settings, runtime status, cache state, and counters.

curl -s http://127.0.0.1:8080/admin/config

Important response fields:

- settings: active persisted settings.
- config_path: config file path.
- inflight: current active translation requests.
- active_capacity: active concurrency limit.
- backend_ready: whether the backend is ready.
- cache: cache path, namespace, entry count, and size.
- stats: request counters.
- status: accelerator detection result.
- reload_required: whether changed settings need model reload.
- restart_required: whether changed settings need process restart.

POST /admin/config saves runtime configuration. Concurrency changes apply immediately. Model, cache, and generation settings take effect only after POST /admin/reload. Listen address and log path changes require a process restart.
{
"addr": ":8080",
"max_concurrent": 2,
"model_dir": "C:\\ProgramData\\RapidTrans\\models",
"model_path": "C:\\ProgramData\\RapidTrans\\models\\Hy-MT1.5-1.8B-1.25bit-GGUF\\Hy-MT1.5-1.8B-1.25bit.gguf",
"cache_dir": "C:\\ProgramData\\RapidTrans\\cache",
"log_path": "C:\\ProgramData\\RapidTrans\\logs\\rapidtrans.log",
"max_tokens": 128,
"temperature": 0,
"top_p": 0,
"session_pool": 2,
"model_url": "https://hf-mirror.com/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf",
"model_urls": [
"https://hf-mirror.com/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf",
"https://huggingface.co/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf"
],
"auto_download": true,
"device": "auto",
"kv_dtype": "fp32"
}

device accepts auto, cpu, or vulkan. Vulkan is detected and surfaced for future kernels; CPU remains the stable inference path. kv_dtype accepts fp32, fp16, q4, or q3; keep fp32 for production quality unless explicitly testing quantized KV cache behavior.
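Because invalid configuration is rejected with 400, it can help to check the two enumerated settings on the client before posting. This is a minimal sketch with a hypothetical check_settings helper. It covers only device and kv_dtype, whose accepted values are listed above, and makes no claim about how the service validates other fields.

```python
ALLOWED_DEVICES = {"auto", "cpu", "vulkan"}
ALLOWED_KV_DTYPES = {"fp32", "fp16", "q4", "q3"}


def check_settings(settings):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    device = settings.get("device")
    if device is not None and device not in ALLOWED_DEVICES:
        problems.append("device must be one of: " + ", ".join(sorted(ALLOWED_DEVICES)))
    kv_dtype = settings.get("kv_dtype")
    if kv_dtype is not None and kv_dtype not in ALLOWED_KV_DTYPES:
        problems.append("kv_dtype must be one of: " + ", ".join(sorted(ALLOWED_KV_DTYPES)))
    return problems
```

Fields that are absent are not reported, so the helper can be applied to partial settings as well as full ones.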

GET /admin/accelerators returns CPU/Vulkan detection and the selected device.

curl -s http://127.0.0.1:8080/admin/accelerators

Example:

{
"status": {
"requested": "auto",
"selected": "cpu",
"vulkan_available": true,
"vulkan_version": "1.3.261",
"message": "vulkan runtime detected; using cpu until vulkan kernels are enabled"
}
}

POST /admin/reload reloads the model backend using current settings.

curl -X POST http://127.0.0.1:8080/admin/reload

GET /admin/cache returns persistent cache information.

curl -s http://127.0.0.1:8080/admin/cache

POST /admin/cache/clear clears the persistent translation cache.

curl -X POST http://127.0.0.1:8080/admin/cache/clear

| Code | Meaning |
|---|---|
| 200 | Success. |
| 400 | Invalid JSON, unknown field, missing text, or invalid configuration. |
| 408 | Client canceled or request timeout. |
| 429 | Translator is busy and the request could not be served from cache. |
| 502 | Backend or cache operation failed. |
| 503 | Backend is unavailable or not ready. |

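One way a client might act on these status codes is to retry only the transient ones with capped exponential backoff. The sketch below is an assumption about client policy, not documented service behavior: treating 429 and 503 as retryable, the 0.5-second base, and the 8-second cap are illustrative choices, and the helper names are hypothetical.

```python
import json

# Assumption: busy (429) and not-ready (503) are worth retrying; 400 is not.
RETRYABLE_STATUSES = {429, 503}


def should_retry(status):
    """Return True for statuses that this sketch treats as transient."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts, base=0.5, cap=8.0):
    """Return capped exponential delays in seconds, one per retry attempt."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def error_message(body):
    """Extract the "error" string from an error response body, or "" if absent."""
    try:
        data = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return ""
    if not isinstance(data, dict):
        return ""
    return data.get("error", "")
```

A caller would sleep for each value from backoff_delays between attempts and stop as soon as should_retry returns False for the latest status.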
A minimal tool schema for AI agent integration:
{
"name": "rapidtrans_translate",
"description": "Translate text using the local RapidTrans service.",
"parameters": {
"type": "object",
"properties": {
"text": { "type": "string", "description": "Text to translate." },
"target_lang": { "type": "string", "description": "Target language code or name, for example zh, en, ja, fr." },
"source_lang": { "type": "string", "description": "Optional source language code or name." },
"domain": { "type": "string", "description": "Optional domain hint, for example legal." },
"glossary": { "type": "string", "description": "Optional glossary or terminology preference." }
},
"required": ["text", "target_lang"]
}
}

Call target:

POST http://127.0.0.1:8080/v1/translate
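Since the service rejects unknown JSON fields, an agent runtime may want to check tool-call arguments against the schema before forwarding them. The following sketch assumes the arguments arrive as a plain dict. The function name tool_call_to_body is hypothetical.

```python
ALLOWED_FIELDS = {"text", "target_lang", "source_lang", "domain", "glossary"}
REQUIRED_FIELDS = ("text", "target_lang")


def tool_call_to_body(arguments):
    """Convert tool-call arguments into a /v1/translate request body.

    Raises ValueError for fields outside the schema or missing required fields.
    """
    unknown = sorted(set(arguments) - ALLOWED_FIELDS)
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(unknown))
    missing = [name for name in REQUIRED_FIELDS if not arguments.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Drop optional fields the model left as null instead of sending them.
    return {key: value for key, value in arguments.items() if value is not None}
```

Failing early on the client keeps a malformed tool call from turning into a 400 response that the agent then has to interpret.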

- Call GET /readyz before sending production translation traffic.
- Use /v1/translate/document for Markdown-like documents instead of splitting fenced code blocks client-side.
- Pass target_lang explicitly in AI tool integrations.
- Reuse service-side persistent cache. Clients do not need to deduplicate repeated segments manually.
- Configure additional domestic mirrors through model_urls or MACLAW_MODEL_URLS; hf-mirror.com is tried before huggingface.co by default.
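
The readiness recommendation above can be sketched as a small polling loop that runs before traffic starts. The names probe_ready and wait_until_ready are hypothetical, and the attempt count and delay are arbitrary defaults. The probe and sleep functions are injectable so the loop can be exercised without a running service.

```python
import time
import urllib.error
import urllib.request


def probe_ready(base_url="http://127.0.0.1:8080", timeout=2.0):
    """Return True when GET /readyz answers 200, False on 503 or connection errors."""
    try:
        with urllib.request.urlopen(base_url + "/readyz", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (for example 503) is a URLError subclass; timeouts are OSError.
        return False


def wait_until_ready(probe=probe_ready, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll the probe until it succeeds or the attempts run out."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

A startup script could call wait_until_ready() once and refuse to send translation requests if it returns False.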