RapidTrans API Documentation

RapidTrans exposes a local REST API for translation, document translation, health checks, metrics, and administration. The default base URL is:

http://127.0.0.1:8080

The OpenAPI specification is available at openapi.yaml and at runtime from GET /openapi.yaml.

Endpoints

| Feature | Method and path | Description |
| --- | --- | --- |
| Text translation | POST /v1/translate | Translate a single text segment. |
| Document translation | POST /v1/translate/document | Translate Markdown-like documents while preserving structure and protected spans. |
| Liveness | GET /healthz | Returns 200 when the process is alive. |
| Readiness | GET /readyz | Returns 200 when the model backend is ready, 503 while loading or unavailable. |
| Metrics | GET /metrics | Prometheus text metrics. |
| OpenAPI | GET /openapi.yaml | Returns the API specification. |
| Admin config | GET /admin/config / POST /admin/config | Read or update runtime configuration. |
| Reload model | POST /admin/reload | Reload the backend using current settings. |
| Cache info | GET /admin/cache | Read persistent translation cache state. |
| Clear cache | POST /admin/cache/clear | Clear persistent translation cache. |
| Accelerators | GET /admin/accelerators | Read CPU/Vulkan detection and selected device. |

Common Rules

  • Request bodies use Content-Type: application/json.
  • Responses are JSON unless otherwise stated.
  • Text is UTF-8.
  • Unknown JSON fields are rejected.
  • Request body size is limited by MACLAW_REQUEST_LIMIT_BYTES.
  • If the service is at capacity, translation endpoints return 429.
  • Text requests served from cache and fully cached document requests can bypass the concurrency slot, so they can succeed even when the service is at capacity.

Error response shape:

{
  "error": "translator is busy"
}
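
A client can surface the error field whenever a non-2xx response arrives. The following is a minimal sketch in Python using only the standard library; the helper name error_message and its fallback text are illustrative assumptions, not part of RapidTrans.

```python
import json


def error_message(body: bytes, default: str = "unknown error") -> str:
    """Extract the "error" string from a body shaped like {"error": "..."}.

    Returns `default` when the body is not valid UTF-8 JSON or does not
    match the documented shape (hypothetical fallback behavior).
    """
    try:
        parsed = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return default
    if isinstance(parsed, dict) and isinstance(parsed.get("error"), str):
        return parsed["error"]
    return default
```

Calling error_message(b'{"error": "translator is busy"}') yields the message string, which can then be logged or shown to the user.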

Language Parameters

source_lang and target_lang accept codes and common names:

zh, en, ja, ko, fr, de, es, ru, it, pt, ar, hi, vi, th, id, ms, tr, nl, pl
Chinese, English, Japanese, Korean, French, German, Spanish
中文, 英文, 日文, 韩文

Default behavior:

  • If target_lang is omitted and the input contains Chinese, the target defaults to English.
  • If target_lang is omitted and the input does not contain Chinese, the target defaults to Chinese.
  • Integrations should usually pass target_lang explicitly.
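
The default rule above can be approximated client-side, for example to predict which direction the service will pick. This sketch assumes that "contains Chinese" means "contains at least one CJK Unified Ideograph (U+4E00 to U+9FFF)"; the service's actual detection logic is not documented here, and both function names are hypothetical.

```python
def contains_chinese(text: str) -> bool:
    """Assumption: any character in U+4E00..U+9FFF counts as Chinese."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def default_target_lang(text: str) -> str:
    """Mirror the documented default: Chinese input -> en, otherwise -> zh."""
    return "en" if contains_chinese(text) else "zh"
```

Japanese kanji fall in the same Unicode range, so a check like this cannot tell Japanese from Chinese. That ambiguity is one reason to pass target_lang explicitly.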

POST /v1/translate

Translates one text segment. This endpoint is best suited to chat tools, agent tool calls, and short independent strings.

Request:

{
  "text": "The contract shall be governed by New York law.",
  "source_lang": "en",
  "target_lang": "zh",
  "domain": "legal",
  "glossary": "governed by=受...管辖"
}

Request fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | yes | Text to translate. It is trimmed before translation. |
| source_lang | string | no | Source language code or name. |
| target_lang | string | no | Target language code or name. If omitted, defaults to English when the input contains Chinese, and to Chinese otherwise. |
| domain | string | no | Optional domain hint, for example legal, finance, or medical. |
| glossary | string | no | Optional terminology hint. |

Response:

{
  "translation": "该合同受纽约法律约束。",
  "model": "Hy-MT1.5-1.8B-1.25bit-GGUF",
  "backend_ms": 1234,
  "elapsed_ms": 1236,
  "cached": false
}

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| translation | string | Translated text. |
| model | string | Model name returned by the backend. |
| backend_ms | integer | Backend translation time in milliseconds. Cached responses report 0. |
| elapsed_ms | integer | End-to-end HTTP handler time in milliseconds. |
| cached | boolean | true when the response came from cache. |

cURL:

curl -s http://127.0.0.1:8080/v1/translate \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello","target_lang":"zh"}'
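
The same call can be made from Python with only the standard library. This is a sketch, not an official client: the function names build_translate_request and translate are illustrative, and the 30-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # default base URL from this document


def build_translate_request(text, target_lang, source_lang=None,
                            domain=None, glossary=None, base_url=BASE_URL):
    """Build a POST /v1/translate request; optional fields are sent only if set."""
    payload = {"text": text, "target_lang": target_lang}
    optional = (("source_lang", source_lang), ("domain", domain), ("glossary", glossary))
    for key, value in optional:
        if value is not None:
            payload[key] = value
    return urllib.request.Request(
        base_url + "/v1/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, target_lang, timeout=30.0, **options):
    """Send the request and return the decoded JSON response as a dict.

    urllib raises urllib.error.HTTPError for non-2xx statuses such as 429.
    """
    request = build_translate_request(text, target_lang, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, translate("Hello", "zh")["translation"] would return the translated string. Only documented fields are sent, because the service rejects unknown JSON fields.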

POST /v1/translate/document

Translates paragraph-like segments in Markdown-like documents. The service preserves blank lines, headings, list/table markers, URLs, email addresses, Markdown link destinations, HTML tags, inline code, and fenced code blocks.

Request:

{
  "text": "# Contract\n\nThe contract shall be governed by New York law.\n\n```go\nfmt.Println(\"hello\")\n```\n",
  "source_lang": "en",
  "target_lang": "zh",
  "format": "markdown",
  "domain": "legal",
  "concurrency": 2
}

Request fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | yes | Document text to translate. |
| source_lang | string | no | Source language code or name. |
| target_lang | string | no | Target language code or name. |
| format | string | no | Optional format hint. Use markdown for Markdown-like documents. |
| domain | string | no | Optional domain hint. |
| glossary | string | no | Optional terminology hint. |
| concurrency | integer | no | Segment-level parallelism, range 1-64. Omit or use 1 for sequential translation. |

Response:

{
  "translation": "# 合同\n\n该合同受纽约法律约束。\n\n```go\nfmt.Println(\"hello\")\n```\n",
  "model": "Hy-MT1.5-1.8B-1.25bit-GGUF",
  "backend_ms": 2800,
  "elapsed_ms": 2805,
  "cached_hits": 1,
  "segments": [
    {
      "index": 0,
      "source": "Contract",
      "translation": "合同",
      "skipped": false
    },
    {
      "index": 1,
      "source": "\n",
      "translation": "\n",
      "skipped": true
    }
  ]
}

segments[].skipped=true means the segment was preserved as-is without model inference. segments[].cached=true means the segment was served from the persistent cache. Duplicate and protected-equivalent segments are translated once, and the result is reused with each segment's own protected spans restored.
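
A client might validate concurrency before sending a document and then tally the segment flags in the response. The sketch below assumes Python with no dependencies; build_document_payload and summarize_segments are hypothetical helper names, and the choice to always send format "markdown" is an assumption made for brevity.

```python
def build_document_payload(text, target_lang, source_lang=None, concurrency=1):
    """Build a request body for POST /v1/translate/document.

    Enforces the documented concurrency range (1-64) before any network call.
    """
    if not 1 <= concurrency <= 64:
        raise ValueError("concurrency must be in the range 1-64")
    payload = {"text": text, "target_lang": target_lang, "format": "markdown"}
    if source_lang is not None:
        payload["source_lang"] = source_lang
    if concurrency > 1:  # 1 is the sequential default, so it can be omitted
        payload["concurrency"] = concurrency
    return payload


def summarize_segments(response):
    """Count total, skipped, and cached segments in a document response."""
    segments = response.get("segments", [])
    return {
        "total": len(segments),
        "skipped": sum(1 for seg in segments if seg.get("skipped")),
        "cached": sum(1 for seg in segments if seg.get("cached")),
    }
```

A high skipped count relative to total suggests a document dominated by code blocks or other protected content, which should translate quickly.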

Admin API

GET /admin/config

Returns current settings, runtime status, cache state, and counters.

curl -s http://127.0.0.1:8080/admin/config

Important response fields:

  • settings: active persisted settings.
  • config_path: config file path.
  • inflight: current active translation requests.
  • active_capacity: active concurrency limit.
  • backend_ready: whether the backend is ready.
  • cache: cache path, namespace, entry count, and size.
  • stats: request counters.
  • status: accelerator detection result.
  • reload_required: whether changed settings need model reload.
  • restart_required: whether changed settings need process restart.
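
An operator script could use these fields to decide what to do after a configuration change. The following sketch is illustrative only: the function names are hypothetical, and giving restart precedence over reload rests on the assumption that restarting the process also reloads the model.

```python
def next_admin_action(config: dict) -> str:
    """Return "restart", "reload", or "none" from a GET /admin/config response."""
    if config.get("restart_required"):
        return "restart"  # assumption: a restart subsumes a reload
    if config.get("reload_required"):
        return "reload"
    return "none"


def has_free_slot(config: dict) -> bool:
    """True when fewer requests are in flight than the active concurrency limit."""
    return config.get("inflight", 0) < config.get("active_capacity", 0)
```

When next_admin_action returns "reload", the script would call POST /admin/reload; "restart" has to be handled outside the API.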

POST /admin/config

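Saves runtime configuration. Concurrency changes apply immediately. Model, cache, and generation settings take effect only after POST /admin/reload. Listen address and log path changes require a process restart. GET /admin/config reports pending changes through reload_required and restart_required.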

{
  "addr": ":8080",
  "max_concurrent": 2,
  "model_dir": "C:\\ProgramData\\RapidTrans\\models",
  "model_path": "C:\\ProgramData\\RapidTrans\\models\\Hy-MT1.5-1.8B-1.25bit-GGUF\\Hy-MT1.5-1.8B-1.25bit.gguf",
  "cache_dir": "C:\\ProgramData\\RapidTrans\\cache",
  "log_path": "C:\\ProgramData\\RapidTrans\\logs\\rapidtrans.log",
  "max_tokens": 128,
  "temperature": 0,
  "top_p": 0,
  "session_pool": 2,
  "model_url": "https://hf-mirror.com/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf",
  "model_urls": [
    "https://hf-mirror.com/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf",
    "https://huggingface.co/tencent/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT1.5-1.8B-1.25bit.gguf"
  ],
  "auto_download": true,
  "device": "auto",
  "kv_dtype": "fp32"
}

device accepts auto, cpu, or vulkan. Vulkan is detected and surfaced for future kernels; CPU remains the stable inference path. kv_dtype accepts fp32, fp16, q4, or q3; keep fp32 for production quality unless explicitly testing quantized KV cache behavior.
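
Checking device and kv_dtype before posting avoids a round trip that would end in a 400. This is a small client-side sketch; validate_settings is a hypothetical helper, and it checks only the two fields whose accepted values are listed above.

```python
ALLOWED_DEVICES = ("auto", "cpu", "vulkan")
ALLOWED_KV_DTYPES = ("fp32", "fp16", "q4", "q3")


def validate_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    device = settings.get("device")
    if device is not None and device not in ALLOWED_DEVICES:
        problems.append(f"device must be one of {ALLOWED_DEVICES}, got {device!r}")
    kv_dtype = settings.get("kv_dtype")
    if kv_dtype is not None and kv_dtype not in ALLOWED_KV_DTYPES:
        problems.append(f"kv_dtype must be one of {ALLOWED_KV_DTYPES}, got {kv_dtype!r}")
    return problems
```

Fields that are absent are left alone, so the helper also works for partial configuration objects.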

GET /admin/accelerators

curl -s http://127.0.0.1:8080/admin/accelerators

Example:

{
  "status": {
    "requested": "auto",
    "selected": "cpu",
    "vulkan_available": true,
    "vulkan_version": "1.3.261",
    "message": "vulkan runtime detected; using cpu until vulkan kernels are enabled"
  }
}

POST /admin/reload

Reloads the model backend using current settings.

curl -X POST http://127.0.0.1:8080/admin/reload

GET /admin/cache

Returns persistent cache information.

curl -s http://127.0.0.1:8080/admin/cache

POST /admin/cache/clear

Clears the persistent translation cache.

curl -X POST http://127.0.0.1:8080/admin/cache/clear

Status Codes

| Code | Meaning |
| --- | --- |
| 200 | Success. |
| 400 | Invalid JSON, unknown field, missing text, or invalid configuration. |
| 408 | The client canceled the request, or the request timed out. |
| 429 | Translator is busy and the request could not be served from cache. |
| 502 | Backend or cache operation failed. |
| 503 | Backend is unavailable or not ready. |
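
Clients often retry transient failures with exponential backoff. The sketch below treats 429 and 503 as retryable; that choice, the function names, and the default delays are assumptions for illustration, not behavior prescribed by RapidTrans.

```python
RETRYABLE_STATUSES = frozenset({429, 503})  # assumption: busy / not ready are transient


def is_retryable(status: int) -> bool:
    """True for statuses that are likely to succeed on a later attempt."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list:
    """Delays in seconds: base, 2*base, 4*base, ... each capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

For example, backoff_delays(5) gives 0.5, 1, 2, 4, and 8 seconds. A 400 response indicates a problem with the request itself, so retrying it unchanged will not help.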

Agent Tool Schema

A minimal tool schema for AI agent integration:

{
  "name": "rapidtrans_translate",
  "description": "Translate text using the local RapidTrans service.",
  "parameters": {
    "type": "object",
    "properties": {
      "text": { "type": "string", "description": "Text to translate." },
      "target_lang": { "type": "string", "description": "Target language code or name, for example zh, en, ja, fr." },
      "source_lang": { "type": "string", "description": "Optional source language code or name." },
      "domain": { "type": "string", "description": "Optional domain hint, for example legal." },
      "glossary": { "type": "string", "description": "Optional glossary or terminology preference." }
    },
    "required": ["text", "target_lang"]
  }
}

Call target:

POST http://127.0.0.1:8080/v1/translate
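
On the agent side, the tool-call arguments need to be turned into a request body for that call target. Because the service rejects unknown JSON fields, it helps to filter arguments before sending. The following is a sketch under that reading; tool_call_to_payload is a hypothetical name and the error messages are illustrative.

```python
ALLOWED_FIELDS = ("text", "target_lang", "source_lang", "domain", "glossary")
REQUIRED_FIELDS = ("text", "target_lang")


def tool_call_to_payload(arguments: dict) -> dict:
    """Validate agent tool-call arguments and build a /v1/translate body."""
    unknown = sorted(set(arguments) - set(ALLOWED_FIELDS))
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(arguments.get(name), str) or not arguments[name].strip()
    ]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Drop optional fields that the model left empty or null.
    return {
        name: arguments[name]
        for name in ALLOWED_FIELDS
        if arguments.get(name) not in (None, "")
    }
```

The returned dict can be JSON-encoded and posted as is. Raising ValueError early gives the agent a clear message instead of a 400 from the service.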

Integration Notes

  • Call GET /readyz before sending production translation traffic.
  • Use /v1/translate/document for Markdown-like documents instead of splitting fenced code blocks client-side.
  • Pass target_lang explicitly in AI tool integrations.
  • Rely on the service-side persistent cache. Clients do not need to deduplicate repeated segments themselves.
  • Configure additional domestic mirrors through model_urls or MACLAW_MODEL_URLS; hf-mirror.com is tried before huggingface.co by default.
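
The readiness check from the first note can be wrapped in a small polling loop. This sketch uses only the Python standard library; the function names, the attempt count, and the delay are assumptions, and the probe and sleep parameters are injectable so the loop can be exercised without a running service.

```python
import time
import urllib.error
import urllib.request


def probe_readyz(base_url: str = "http://127.0.0.1:8080", timeout: float = 2.0) -> bool:
    """Return True only when GET /readyz answers 200."""
    try:
        with urllib.request.urlopen(base_url + "/readyz", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers 503 (raised as HTTPError), connection refused, and timeouts.
        return False


def wait_until_ready(probe=probe_readyz, attempts: int = 30,
                     delay: float = 1.0, sleep=time.sleep) -> bool:
    """Poll `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A startup script would call wait_until_ready() once and begin sending translation traffic only if it returns True.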