batchline

Build, split, and reconcile LLM batch jobs — for the OpenAI Batch API and Anthropic Message Batches. Zero dependencies, no API keys.

Batch APIs are ~50% cheaper than synchronous calls, but they push all the boring plumbing onto you: every request needs a unique custom_id, the input has to be provider-shaped JSONL, files have size and count limits, and the results come back unordered — keyed only by custom_id — so you have to join them back to your prompts yourself and figure out what failed.

batchline does exactly that plumbing and nothing else. It never calls an API and never needs a key — it's a pure, offline transform over your files, so it's safe to run in CI and easy to test.

prompts.jsonl ──build──▶ requests.jsonl ──split──▶ chunk-000.jsonl ...
                                                         │
                                              (you upload, wait, download)
                                                         ▼
prompts.jsonl + results.jsonl ──merge──▶ merged.jsonl  +  retry.jsonl

Why

  • One input format, two providers. Write plain prompts once; emit OpenAI or Anthropic batch files.
  • custom_id done right. IDs are validated for presence and uniqueness before you ever upload.
  • Respects provider limits. Split a job into chunks that stay under the request-count and file-size caps.
  • Reconciliation that tells the truth. merge reports ok, failed, missing (never came back), and unexpected (came back but wasn't in your input) — and writes a ready-to-resubmit retry file.
  • Zero dependencies. Pure ESM, runs on Node 18+, node --test only.
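The four reconciliation buckets can be illustrated with a small standalone sketch. This is not batchline's implementation: the function name `reconcile` and the simplified `{ custom_id, ok }` output shape are assumptions made for illustration only.

```javascript
// Hypothetical sketch: classify batch outputs against the ids that were sent.
// `outputs` uses a simplified { custom_id, ok } shape (an assumption).
function reconcile(inputIds, outputs) {
  const expected = new Set(inputIds);
  const seen = new Set();
  const report = { ok: [], failed: [], missing: [], unexpected: [] };

  for (const out of outputs) {
    if (!expected.has(out.custom_id)) {
      // Came back, but was never part of the input.
      report.unexpected.push(out.custom_id);
      continue;
    }
    seen.add(out.custom_id);
    (out.ok ? report.ok : report.failed).push(out.custom_id);
  }

  // Anything expected but never seen did not come back at all.
  for (const id of expected) {
    if (!seen.has(id)) report.missing.push(id);
  }
  return report;
}

// Outputs arrive in arbitrary order; only custom_id ties them to the inputs.
const report = reconcile(
  ['q1', 'q2', 'q3'],
  [
    { custom_id: 'q2', ok: false },
    { custom_id: 'q1', ok: true },
    { custom_id: 'zz', ok: true },
  ],
);
console.log(report);
```

Keeping `missing` separate from `failed` matters in practice: a failed row carries an error worth reading, while a missing row usually means a chunk was never uploaded or its output was never downloaded.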

Install

npm install -g batchline      # CLI
# or use it as a library
npm install batchline

Or run straight from a clone (no install, no build):

node bin/batchline.js --help

Quick start

Start with a plain prompts file — one item per line. Each item needs an id and either a prompt (string) or a messages array. A system string and any extra keys (temperature, max_tokens, …) are optional per-request overrides.

{"id":"q1","prompt":"Capital of France?","system":"Answer in one word."}
{"id":"q2","prompt":"What is 2+2?"}
{"id":"q3","messages":[{"role":"user","content":"Say hi"}],"temperature":0.7}

1. Build a provider request file:

batchline build --provider openai --model gpt-4o-mini --max-tokens 64 \
  -i examples/prompts.jsonl -o requests.jsonl
{"custom_id":"q1","method":"POST","url":"/v1/chat/completions","body":{"model":"gpt-4o-mini","max_tokens":64,"messages":[{"role":"system","content":"Answer in one word."},{"role":"user","content":"Capital of France?"}]}}

For Anthropic, the system prompt is lifted to a top-level param automatically:

batchline build --provider anthropic --model claude-3-5-haiku-latest -i examples/prompts.jsonl
# {"custom_id":"q1","params":{"model":"claude-3-5-haiku-latest","max_tokens":1024,"messages":[...],"system":"Answer in one word."}}
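
The system-prompt lift can be sketched as follows. This is an illustration under assumptions, not batchline's code: `toAnthropicRequest` is a hypothetical name, and the `max_tokens` default of 1024 is taken from the sample output above.

```javascript
// Hypothetical sketch: turn one prompt item into an Anthropic batch request object.
function toAnthropicRequest(item, { model, maxTokens = 1024 }) {
  const { id, prompt, system, messages, ...overrides } = item;

  const params = {
    model,
    max_tokens: maxTokens,
    ...overrides,
    messages: messages ?? [{ role: 'user', content: prompt }],
  };
  // Unlike the OpenAI shape, the system prompt is a top-level param,
  // not an entry in the messages array.
  if (system) params.system = system;

  return { custom_id: id, params };
}

const anthropicRequest = toAnthropicRequest(
  { id: 'q1', prompt: 'Capital of France?', system: 'Answer in one word.' },
  { model: 'claude-3-5-haiku-latest' },
);
console.log(JSON.stringify(anthropicRequest));
```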

2. Split (optional) into chunks that fit provider limits:

batchline split --provider openai -i requests.jsonl -o out/chunk
# wrote 1 chunk(s):  out/chunk-000.jsonl
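
Splitting by both a request-count cap and a byte cap can be done greedily in one pass. The sketch below is a standalone illustration, not batchline's implementation. `splitByCaps` and `chunkName` are hypothetical names, and counting one newline byte per line is an assumption about how file size is measured.

```javascript
// Hypothetical sketch: greedy split of request objects into chunks that
// respect both a request-count cap and a serialized-size cap.
function splitByCaps(requests, { maxRequests, maxBytes }) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;

  for (const request of requests) {
    // +1 accounts for the newline that terminates each JSONL line.
    const size = Buffer.byteLength(JSON.stringify(request), 'utf8') + 1;
    if (size > maxBytes) {
      throw new Error(`request ${request.custom_id} alone exceeds maxBytes`);
    }

    const full = current.length >= maxRequests || currentBytes + size > maxBytes;
    if (full && current.length > 0) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(request);
    currentBytes += size;
  }

  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Mirrors the PREFIX-000.jsonl naming used by `split -o PREFIX`.
const chunkName = (prefix, index) => `${prefix}-${String(index).padStart(3, '0')}.jsonl`;

const demo = ['a', 'b', 'c', 'd', 'e'].map((id) => ({ custom_id: id }));
const chunks = splitByCaps(demo, { maxRequests: 2, maxBytes: 1024 });
chunks.forEach((chunk, i) => console.log(chunkName('out/chunk', i), chunk.length));
```

A greedy pass keeps requests in their original order, which makes it easy to tell which chunk a given custom_id ended up in.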

Now upload each chunk to your provider's batch endpoint, wait for it to finish, and download the output JSONL.

3. Merge the provider output back onto your prompts:

batchline merge --provider openai -i examples/prompts.jsonl -r results.jsonl \
  -o merged.jsonl --failures retry.jsonl
# merged 3 row(s): 2 ok, 1 failed, 0 missing
{"id":"q1","prompt":"Capital of France?","ok":true,"text":"Paris","error":null}
{"id":"q2","prompt":"What is 2+2?","ok":false,"text":null,"error":"server error"}

retry.jsonl contains only the rows that failed or never came back — feed it back into build to resubmit.
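
One way to picture how a retry file relates to the merged rows is the sketch below. It assumes that rows which never came back appear in the merged output with `ok: false`, and that a retry row is simply the original prompt item with the result fields removed. Both are assumptions, and `retryRows` is a hypothetical name.

```javascript
// Hypothetical sketch: derive resubmittable prompt items from merged rows.
function retryRows(mergedRows) {
  return mergedRows
    .filter((row) => !row.ok)
    // Drop the result fields so what remains is a plain prompt item again.
    .map(({ ok, text, error, ...input }) => input);
}

// Serialize rows as JSONL: one JSON object per line.
const toJsonlText = (rows) => rows.map((row) => JSON.stringify(row)).join('\n') + '\n';

const merged = [
  { id: 'q1', prompt: 'Capital of France?', ok: true, text: 'Paris', error: null },
  { id: 'q2', prompt: 'What is 2+2?', ok: false, text: null, error: 'server error' },
];
const retryText = toJsonlText(retryRows(merged));
console.log(retryText);
```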

CLI reference

| Command | What it does |
| --- | --- |
| build | Turn plain prompt items into a provider batch-request JSONL file. |
| split | Split a request JSONL into chunks under provider size/count limits. With -o PREFIX writes PREFIX-000.jsonl, PREFIX-001.jsonl, … |
| merge | Join provider output back to inputs by custom_id; report failures/missing/unexpected and write a retry file. |

Common options: --provider openai|anthropic, --model, --max-tokens, --temperature, -i/--input (default stdin), -o/--output (default stdout), -r/--results, --failures, --max-requests, --max-bytes. Run batchline --help for the full list.

Library API

import fs from 'node:fs';
import { buildBatch, splitByLimits, mergeResults, parseJsonl, toJsonl } from 'batchline';

const items = parseJsonl(fs.readFileSync('prompts.jsonl', 'utf8'));

const { requests, errors } = buildBatch(items, { provider: 'openai', model: 'gpt-4o-mini' });
const chunks = splitByLimits(requests, { provider: 'openai' });

// after the batch returns:
const outputs = parseJsonl(fs.readFileSync('results.jsonl', 'utf8'));
const { merged, failures, missing, unexpected } = mergeResults(items, outputs, { provider: 'openai' });

| Export | Returns or description |
| --- | --- |
| buildBatch(items, {provider, model, defaults}) | { requests, errors } |
| buildRequest(item, {provider, model, defaults}) | one provider request object |
| splitByLimits(requests, {provider?, maxRequests?, maxBytes?}) | Array<request[]> |
| parseResult(line, provider) | { custom_id, ok, text, error, raw } |
| mergeResults(inputs, outputs, {provider}) | { merged, failures, missing, unexpected } |
| parseJsonl(text) / toJsonl(rows) | JSONL helpers |
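
For readers unfamiliar with JSONL, the two helpers amount to a line-wise JSON round trip. The sketch below shows the general idea with hypothetical stand-ins (`parseJsonlSketch`, `toJsonlSketch`). The library's own handling of blank lines, trailing newlines, and parse errors may differ.

```javascript
// Hypothetical stand-in for a JSONL parser: one JSON value per non-blank line.
function parseJsonlSketch(text) {
  const rows = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === '') return; // tolerate blank lines and a trailing newline
    try {
      rows.push(JSON.parse(line));
    } catch (err) {
      // Report the 1-based line number so a bad row is easy to find.
      throw new Error(`line ${index + 1}: ${err.message}`);
    }
  });
  return rows;
}

// Hypothetical stand-in for a JSONL serializer.
function toJsonlSketch(rows) {
  return rows.map((row) => JSON.stringify(row)).join('\n') + (rows.length ? '\n' : '');
}

const text = '{"id":"q1","prompt":"Capital of France?"}\n\n{"id":"q2","prompt":"What is 2+2?"}\n';
const rows = parseJsonlSketch(text);
console.log(rows.length, toJsonlSketch(rows));
```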

Provider limits

Defaults used by split --provider (kept just under the documented hard caps for headroom):

| Provider | Max requests | Max file size |
| --- | --- | --- |
| OpenAI | 50,000 | 200 MB |
| Anthropic | 100,000 | 256 MB |

Override either with --max-requests / --max-bytes.

Development

node --test        # run the test suite (zero dependencies)

Support

If batchline saves you time, an optional tip is always welcome (never required):

Please send only on the Ethereum (ERC-20) network.

License

MIT © 2026 Ayubjon
