batchline

Build, split, and reconcile LLM batch jobs — for the OpenAI Batch API and Anthropic Message Batches. Zero dependencies, no API keys.

Batch APIs are ~50% cheaper than synchronous calls, but they push all the boring plumbing onto you: every request needs a unique custom_id, the input has to be provider-shaped JSONL, files have size and count limits, and the results come back unordered — keyed only by custom_id — so you have to join them back to your prompts yourself and figure out what failed.

batchline does exactly that plumbing and nothing else. It never calls an API and never needs a key — it's a pure, offline transform over your files, so it's safe to run in CI and easy to test.

prompts.jsonl ──build──▶ requests.jsonl ──split──▶ chunk-000.jsonl ...
                                                         │
                                              (you upload, wait, download)
                                                         ▼
prompts.jsonl + results.jsonl ──merge──▶ merged.jsonl  +  retry.jsonl

Why

One input format, two providers. Write plain prompts once; emit OpenAI or Anthropic batch files.
custom_id done right. IDs are validated for presence and uniqueness before you ever upload.
Respects provider limits. Split a job into chunks that stay under the request-count and file-size caps.
Reconciliation that tells the truth. merge reports ok, failed, missing (never came back), and unexpected (came back but wasn't in your input) — and writes a ready-to-resubmit retry file.
Zero dependencies. Pure ESM, runs on Node 18+, and the test suite uses only the built-in node --test runner.
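
To make the presence and uniqueness check on ids concrete, here is a minimal sketch. `validateIds` is a hypothetical helper written for illustration; it is not batchline's API, and the real validation may differ.

```javascript
// Hypothetical helper (not part of batchline): reports items whose id is
// missing or repeats an earlier id.
function validateIds(items) {
  const errors = [];
  const seen = new Map(); // id -> index of first occurrence
  items.forEach((item, index) => {
    const id = item?.id;
    if (typeof id !== 'string' || id.length === 0) {
      errors.push({ index, reason: 'missing id' });
    } else if (seen.has(id)) {
      errors.push({ index, reason: `duplicate id "${id}" (first seen at index ${seen.get(id)})` });
    } else {
      seen.set(id, index);
    }
  });
  return errors;
}

const problems = validateIds([{ id: 'q1' }, { prompt: 'no id' }, { id: 'q1' }]);
console.log(problems);
```

Running a check like this before upload matters because the provider output is keyed only by custom_id, so a duplicate would make two prompts indistinguishable at merge time.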

Install

npm install -g batchline      # CLI
# or use it as a library
npm install batchline

Or run straight from a clone (no install, no build):

node bin/batchline.js --help

Quick start

Start with a plain prompts file — one item per line. Each item needs an id and either a prompt (string) or a messages array. A system string and any extra keys (temperature, max_tokens, …) are optional per-request overrides.

{"id":"q1","prompt":"Capital of France?","system":"Answer in one word."}
{"id":"q2","prompt":"What is 2+2?"}
{"id":"q3","messages":[{"role":"user","content":"Say hi"}],"temperature":0.7}
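
The item rules above (an id plus either a prompt string or a messages array) can be checked line by line. The sketch below is illustrative only: `parsePromptLines` and its error messages are assumptions, not batchline's own parser.

```javascript
// Hypothetical helper (not part of batchline): parses JSONL text and keeps
// only items that have an id and either a prompt string or a messages array.
function parsePromptLines(text) {
  const items = [];
  const errors = [];
  text.split(/\r?\n/).forEach((line, i) => {
    if (line.trim() === '') return; // skip blank lines
    const lineNo = i + 1;
    let item;
    try {
      item = JSON.parse(line);
    } catch {
      errors.push(`line ${lineNo}: invalid JSON`);
      return;
    }
    if (item === null || typeof item !== 'object') {
      errors.push(`line ${lineNo}: expected a JSON object`);
    } else if (typeof item.id !== 'string' || item.id === '') {
      errors.push(`line ${lineNo}: missing id`);
    } else if (typeof item.prompt !== 'string' && !Array.isArray(item.messages)) {
      errors.push(`line ${lineNo}: needs a prompt string or a messages array`);
    } else {
      items.push(item);
    }
  });
  return { items, errors };
}

const sample = [
  '{"id":"q1","prompt":"Capital of France?"}',
  '{"id":"q2"}',
  'not json',
].join('\n');
const { items, errors } = parsePromptLines(sample);
console.log(items.length, errors);
```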

1. Build a provider request file:

batchline build --provider openai --model gpt-4o-mini --max-tokens 64 \
  -i examples/prompts.jsonl -o requests.jsonl

{"custom_id":"q1","method":"POST","url":"/v1/chat/completions","body":{"model":"gpt-4o-mini","max_tokens":64,"messages":[{"role":"system","content":"Answer in one word."},{"role":"user","content":"Capital of France?"}]}}
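
The mapping from a plain item to the OpenAI line shown above can be sketched roughly as follows. `toOpenAIRequest` is a hypothetical function for illustration, and treating every extra key as a body override is an assumption drawn from the item description, not batchline's confirmed behaviour.

```javascript
// Hypothetical sketch (not batchline's implementation): wrap one prompt item
// in the OpenAI batch request envelope.
function toOpenAIRequest(item, { model, defaults = {} }) {
  const { id, prompt, messages, system, ...overrides } = item;
  // A bare prompt becomes a single user message.
  const chat = messages ?? [{ role: 'user', content: prompt }];
  return {
    custom_id: id,
    method: 'POST',
    url: '/v1/chat/completions',
    body: {
      model,
      ...defaults,  // job-wide settings such as max_tokens
      ...overrides, // per-item keys win over the defaults
      // For OpenAI the system string is sent as the first message.
      messages: system ? [{ role: 'system', content: system }, ...chat] : chat,
    },
  };
}

const request = toOpenAIRequest(
  { id: 'q1', prompt: 'Capital of France?', system: 'Answer in one word.' },
  { model: 'gpt-4o-mini', defaults: { max_tokens: 64 } },
);
console.log(JSON.stringify(request));
```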

For Anthropic, the system prompt is lifted to a top-level param automatically:

batchline build --provider anthropic --model claude-3-5-haiku-latest -i examples/prompts.jsonl
# {"custom_id":"q1","params":{"model":"claude-3-5-haiku-latest","max_tokens":1024,"messages":[...],"system":"Answer in one word."}}
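
For comparison, here is a sketch of the Anthropic shape, where the system string moves out of messages and into params. `toAnthropicRequest` is hypothetical, and the 1024 fallback for max_tokens is inferred from the sample output above rather than confirmed from the source.

```javascript
// Hypothetical sketch (not batchline's implementation): wrap one prompt item
// in the Anthropic Message Batches envelope ({ custom_id, params }).
function toAnthropicRequest(item, { model, defaults = {} }) {
  const { id, prompt, messages, system, ...overrides } = item;
  const params = {
    model,
    max_tokens: 1024, // assumed fallback, inferred from the sample output
    ...defaults,
    ...overrides,
    messages: messages ?? [{ role: 'user', content: prompt }],
  };
  // The system prompt is a top-level param, not a message.
  if (system) params.system = system;
  return { custom_id: id, params };
}

const request = toAnthropicRequest(
  { id: 'q1', prompt: 'Capital of France?', system: 'Answer in one word.' },
  { model: 'claude-3-5-haiku-latest' },
);
console.log(JSON.stringify(request));
```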

2. Split (optional) into chunks that fit provider limits:

batchline split --provider openai -i requests.jsonl -o out/chunk
# wrote 1 chunk(s):  out/chunk-000.jsonl
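
One way to picture the split step is a greedy pass that starts a new chunk whenever the next request would exceed either cap. This is a sketch under assumptions: `splitIntoChunks` is a hypothetical function, and counting one newline byte per line is a guess at how file size is measured.

```javascript
// Hypothetical sketch (not batchline's implementation): greedy chunking by
// request count and serialized byte size.
function splitIntoChunks(requests, { maxRequests, maxBytes }) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;
  for (const request of requests) {
    // Size of the JSONL line, including its trailing newline.
    const size = Buffer.byteLength(JSON.stringify(request), 'utf8') + 1;
    if (size > maxBytes) {
      throw new Error(`request ${request.custom_id} alone exceeds maxBytes`);
    }
    const full = current.length >= maxRequests || currentBytes + size > maxBytes;
    if (current.length > 0 && full) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(request);
    currentBytes += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

const requests = Array.from({ length: 5 }, (_, i) => ({ custom_id: `r${i}` }));
const chunks = splitIntoChunks(requests, { maxRequests: 2, maxBytes: 1024 });
// Zero-padded names in the PREFIX-000.jsonl style used by the CLI.
const names = chunks.map((_, i) => `out/chunk-${String(i).padStart(3, '0')}.jsonl`);
console.log(names, chunks.map((c) => c.length));
```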

Now upload each chunk to your provider's batch endpoint, wait for it to finish, and download the output JSONL.

3. Merge the provider output back onto your prompts:

batchline merge --provider openai -i examples/prompts.jsonl -r results.jsonl \
  -o merged.jsonl --failures retry.jsonl
# merged 3 row(s): 2 ok, 1 failed, 0 missing

{"id":"q1","prompt":"Capital of France?","ok":true,"text":"Paris","error":null}
{"id":"q2","prompt":"What is 2+2?","ok":false,"text":null,"error":"server error"}

retry.jsonl contains only the rows that failed or never came back — feed it back into build to resubmit.
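
The join that merge performs can be sketched as a lookup by custom_id followed by a four-way classification. `reconcile` is a hypothetical function, the results are assumed to be already parsed into { custom_id, ok, text, error }, and leaving missing rows out of the merged output is an assumption rather than documented behaviour.

```javascript
// Hypothetical sketch (not batchline's implementation): join parsed results
// back to inputs and classify each id.
function reconcile(inputs, results) {
  const byId = new Map(results.map((r) => [r.custom_id, r]));
  const inputIds = new Set(inputs.map((i) => i.id));
  const merged = [];
  const retry = [];   // rows to resubmit: failed or never came back
  const missing = []; // ids in the input with no result
  for (const input of inputs) {
    const result = byId.get(input.id);
    if (!result) {
      missing.push(input.id);
      retry.push(input);
      continue;
    }
    merged.push({ ...input, ok: result.ok, text: result.text, error: result.error });
    if (!result.ok) retry.push(input);
  }
  // Results whose custom_id was never in the input.
  const unexpected = results
    .filter((r) => !inputIds.has(r.custom_id))
    .map((r) => r.custom_id);
  return { merged, retry, missing, unexpected };
}

const inputs = [
  { id: 'q1', prompt: 'Capital of France?' },
  { id: 'q2', prompt: 'What is 2+2?' },
  { id: 'q3', prompt: 'Say hi' },
];
const results = [
  { custom_id: 'q1', ok: true, text: 'Paris', error: null },
  { custom_id: 'q2', ok: false, text: null, error: 'server error' },
  { custom_id: 'zz', ok: true, text: '?', error: null },
];
const report = reconcile(inputs, results);
console.log(report.missing, report.unexpected, report.retry.map((r) => r.id));
```

Because the retry rows are the original input items, they can go straight back through build without any reshaping.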

CLI reference

Command	What it does
`build`	Turn plain prompt items into a provider batch-request JSONL file.
`split`	Split a request JSONL into chunks under provider size/count limits. With `-o PREFIX` writes `PREFIX-000.jsonl`, `PREFIX-001.jsonl`, …
`merge`	Join provider output back to inputs by `custom_id`; report failures/missing/unexpected and write a retry file.

Common options: --provider openai|anthropic, --model, --max-tokens, --temperature, -i/--input (default stdin), -o/--output (default stdout), -r/--results, --failures, --max-requests, --max-bytes. Run batchline --help for the full list.

Library API

import fs from 'node:fs';
import { buildBatch, splitByLimits, mergeResults, parseJsonl, toJsonl } from 'batchline';

const items = parseJsonl(fs.readFileSync('prompts.jsonl', 'utf8'));

const { requests, errors } = buildBatch(items, { provider: 'openai', model: 'gpt-4o-mini' });
const chunks = splitByLimits(requests, { provider: 'openai' });

// after the batch returns:
const outputs = parseJsonl(fs.readFileSync('results.jsonl', 'utf8'));
const { merged, failures, missing, unexpected } = mergeResults(items, outputs, { provider: 'openai' });

Export	Signature
`buildBatch(items, {provider, model, defaults})`	`{ requests, errors }`
`buildRequest(item, {provider, model, defaults})`	one provider request object
`splitByLimits(requests, {provider?, maxRequests?, maxBytes?})`	`Array<request[]>`
`parseResult(line, provider)`	`{ custom_id, ok, text, error, raw }`
`mergeResults(inputs, outputs, {provider})`	`{ merged, failures, missing, unexpected }`
`parseJsonl(text)` / `toJsonl(rows)`	JSONL helpers
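
To illustrate the kind of normalisation parseResult implies, the sketch below flattens one OpenAI batch output line into { custom_id, ok, text, error, raw }. `parseOpenAILine` is hypothetical, and the field paths (response.status_code, response.body.choices, error.message) reflect my understanding of the OpenAI batch output format; verify them against the provider documentation before relying on them.

```javascript
// Hypothetical sketch (not batchline's implementation): normalise one line
// of OpenAI batch output. Field paths are assumptions about the format.
function parseOpenAILine(line) {
  const raw = JSON.parse(line);
  const status = raw.response?.status_code;
  const text = raw.response?.body?.choices?.[0]?.message?.content ?? null;
  const ok = !raw.error && status === 200 && text !== null;
  // Prefer the most specific error message available.
  const error = ok
    ? null
    : (raw.error?.message ?? raw.response?.body?.error?.message ?? `HTTP ${status}`);
  return { custom_id: raw.custom_id, ok, text: ok ? text : null, error, raw };
}

const okLine = JSON.stringify({
  custom_id: 'q1',
  response: { status_code: 200, body: { choices: [{ message: { role: 'assistant', content: 'Paris' } }] } },
  error: null,
});
const failLine = JSON.stringify({
  custom_id: 'q2',
  response: null,
  error: { code: 'server_error', message: 'server error' },
});
const parsedOk = parseOpenAILine(okLine);
const parsedFail = parseOpenAILine(failLine);
console.log(parsedOk.text, parsedFail.error);
```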

Provider limits

Defaults used by split --provider (kept just under the documented hard caps for headroom):

Provider	Max requests	Max file size
OpenAI	50,000	200 MB
Anthropic	100,000	256 MB

Override either with --max-requests / --max-bytes.
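
How a provider default and an explicit override might combine is sketched below. `PROVIDER_LIMITS` and `resolveLimits` are hypothetical names, the figures are copied from the table above, and reading 1 MB as 1024 * 1024 bytes is an assumption.

```javascript
// Hypothetical sketch (not batchline's implementation). Figures come from
// the table above; the MB-to-bytes conversion is an assumption.
const MB = 1024 * 1024;
const PROVIDER_LIMITS = {
  openai: { maxRequests: 50_000, maxBytes: 200 * MB },
  anthropic: { maxRequests: 100_000, maxBytes: 256 * MB },
};

// Explicit overrides win; anything left unset falls back to the provider row.
function resolveLimits(provider, overrides = {}) {
  const base = PROVIDER_LIMITS[provider];
  if (!base) throw new Error(`unknown provider: ${provider}`);
  return {
    maxRequests: overrides.maxRequests ?? base.maxRequests,
    maxBytes: overrides.maxBytes ?? base.maxBytes,
  };
}

const limits = resolveLimits('openai', { maxRequests: 1000 });
console.log(limits);
```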

Development

node --test        # run the test suite (zero dependencies)

Support

If batchline saves you time, an optional tip is always welcome (never required):

USDT — Ethereum (ERC-20): 0xad39bdf2df0b8dd6991150fcea0a156150ed19b8
Verify: https://etherscan.io/address/0xad39bdf2df0b8dd6991150fcea0a156150ed19b8

Please send only on the Ethereum (ERC-20) network.
