Jetbot is a Filing-to-Model Copilot and Financial Fact Platform for evidence-backed financial report extraction. It turns PDF filings into canonical financial facts, structured statements, key notes, risk signals, event-study outputs, and analyst-ready summaries.
It is designed for teams that need a single workflow to ingest reports, inspect source evidence, review and correct extracted facts, and ship the results through an API, a CLI, exports, or a browser UI.
- End-to-end PDF pipeline for raw text, tables, statements, notes, facts, and report generation.
- Canonical financial fact layer with page/table/cell evidence metadata for review and downstream exports.
- Evaluation runner with machine-readable reports and configurable quality thresholds.
- Works in mock mode out of the box, with optional OpenAI and Anthropic model routing.
- Vue 3 dashboard for reviewing original PDFs alongside extraction and analysis outputs.
- Docker-first local stack with API, worker, Redis, PostgreSQL, and MinIO.
- Pluggable storage, retrieval, tracing, and market-data integrations.
- Production-friendly defaults for auth, rate limiting, metrics, and tracing.
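
To make the evidence-backed fact layer more concrete, here is a minimal sketch of how a canonical fact with page/table/cell evidence could be modeled. The class and field names (`Evidence`, `CanonicalFact`, `concept`, `table_index`, and so on) and the sample values are illustrative assumptions, not Jetbot's actual schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """Hypothetical pointer back to where a value was read in the PDF."""
    page: int                          # page number in the source filing
    table_index: Optional[int] = None  # table on that page, if any
    row: Optional[int] = None          # cell row within the table
    col: Optional[int] = None          # cell column within the table


@dataclass(frozen=True)
class CanonicalFact:
    """Hypothetical normalized fact; all field names are assumptions."""
    concept: str      # e.g. "revenue"
    value: float
    unit: str         # e.g. "USD"
    period_end: str   # ISO date, e.g. "2025-12-31"
    evidence: Evidence


def has_cell_evidence(fact: CanonicalFact) -> bool:
    """True when the fact can be traced to a specific table cell."""
    ev = fact.evidence
    return None not in (ev.table_index, ev.row, ev.col)


# Made-up sample values, for illustration only.
fact = CanonicalFact(
    concept="revenue",
    value=1250.0,
    unit="USD",
    period_end="2025-12-31",
    evidence=Evidence(page=42, table_index=0, row=3, col=2),
)
```

Keeping the evidence pointer on the fact itself means a reviewer or an export can always trace a number back to its page and cell; `asdict(fact)` gives a plain dictionary that is easy to serialize.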
flowchart LR
A[Financial Filing PDF] --> B[PDF extraction and OCR]
B --> C[Statements and canonical facts]
C --> D[Evidence and validation]
D --> E[Review, API, and exports]
D --> F[Risk signals and analyst reports]
E --> G[Vue dashboard at /ui]
F --> G
Use this path when you want the fastest edit-run loop.
python -m venv .venv
# activate the virtual environment for your shell
pip install -e .
make dev

The API starts at http://127.0.0.1:8000.
If you want the Vue frontend in dev mode as well:
make web-install
make web-dev

The Vite app runs at http://127.0.0.1:5173 and proxies API requests to the local backend.
Use this path when you want the full local system with background worker and infrastructure services.
copy .env.example .env
make docker-up

make docker-up now does four things in one flow:
- builds the backend image with the Vue production bundle included
- starts the API, worker, Redis, PostgreSQL, and MinIO services
- waits until the API health endpoint is ready
- opens the frontend automatically at http://127.0.0.1:18000/ui/
Set JETBOT_OPEN_BROWSER=0 if you want to skip the automatic browser launch.
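
The "wait until the health endpoint is ready" step can be reproduced in your own scripts. The sketch below polls a URL until it answers with HTTP 200 or a deadline passes. The function names, timeout, and interval are assumptions for illustration; this is not the logic that `make docker-up` itself runs.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_for_health(
    url: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe(url)` until it succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For the Docker stack described here, a call would look like `wait_for_health("http://127.0.0.1:18000/health")`. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.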
Stop the stack with:
make docker-down

After startup, the main entry points are:
| Surface | URL / Command | Notes |
|---|---|---|
| Web UI | http://127.0.0.1:18000/ui/ | Review uploaded PDFs, tables, statements, signals, and generated reports |
| API | http://127.0.0.1:18000/v1 | Programmatic ingestion and retrieval, including canonical facts |
| OpenAPI docs | http://127.0.0.1:18000/docs | Interactive API explorer |
| Health | http://127.0.0.1:18000/health | Liveness probe |
| Metrics | http://127.0.0.1:18000/metrics | Prometheus endpoint |
| CLI | python -m src.cli --help | Local automation and scripting |
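
The API can also be driven from Python. The sketch below builds the analyze request for an already uploaded document, using the `/v1/documents/<doc_id>/analyze` path and the `X-API-Key` header that the `curl` commands in this README use. The helper name `build_analyze_request` is hypothetical, the response format is not assumed, and the multipart upload step is left out to keep the example dependency-free.

```python
import urllib.parse
import urllib.request

# Base URL of the Docker stack's API, as listed in the entry-point table.
BASE_URL = "http://127.0.0.1:18000/v1"


def build_analyze_request(
    doc_id: str, api_key: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST that triggers analysis of a document."""
    # Quote the id so unusual characters cannot alter the URL path.
    safe_id = urllib.parse.quote(doc_id, safe="")
    url = f"{base_url}/documents/{safe_id}/analyze"
    return urllib.request.Request(
        url, method="POST", headers={"X-API-Key": api_key}
    )


def send(request: urllib.request.Request) -> bytes:
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.read()
```

Separating request construction from sending keeps the URL and header logic easy to check without a running server; `send(build_analyze_request(doc_id, "your-key"))` performs the actual call.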
python -m src.cli analyze --pdf path/to/report.pdf --out data --company "Example Co" --period-end 2025-12-31

python examples/real_pdf_analysis/run_example.py

curl -F "file=@path/to/report.pdf" \
  -H "X-API-Key: your-key" \
  http://127.0.0.1:18000/v1/documents

curl -X POST \
  -H "X-API-Key: your-key" \
  http://127.0.0.1:18000/v1/documents/<doc_id>/analyze

Jetbot starts in mock mode if no provider key is configured. Most teams only need a small set of environment variables to get productive:
| Variable | Purpose | Default |
|---|---|---|
| OPENAI_API_KEY | Enable OpenAI-backed extraction and reporting | empty |
| ANTHROPIC_API_KEY | Enable Anthropic-backed models | empty |
| LLM_DEFAULT_MODEL | Default router target in provider:model format | empty |
| LLM_EXTRACTION_MODEL | Override the extraction model | empty |
| LLM_REPORT_MODEL | Override the reporting model | empty |
| RAG_MODE | Retrieval mode: token_overlap, embedding, hybrid | token_overlap |
| TASK_BACKEND | background or celery | background |
| STORAGE_BACKEND | local or postgres | local |
| API_KEYS | Comma-separated API keys; blank disables auth | empty |
Docker host ports are fixed to 18000 (API/UI), 16379 (Redis), 15432 (PostgreSQL), 19000 (MinIO), and 19001 (MinIO Console), so app-level .env settings cannot remap them accidentally.
See .env.example for the full configuration surface, including tracing, storage, rate limiting, and market-data settings.
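
As a rough illustration of how these settings could be interpreted, the sketch below parses a `provider:model` target, splits the comma-separated `API_KEYS` value, and checks whether any provider key is present. The helper names are hypothetical, the model name in the usage note is only an example, and Jetbot's own configuration loader may behave differently.

```python
import os
from typing import List, Mapping, Optional, Tuple


def parse_model_target(value: str) -> Optional[Tuple[str, str]]:
    """Split 'provider:model' into (provider, model); None if empty."""
    value = value.strip()
    if not value:
        return None
    # Split on the first colon only, so model names may contain colons.
    provider, sep, model = value.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider:model, got {value!r}")
    return provider, model


def parse_api_keys(value: str) -> List[str]:
    """Split a comma-separated key list, dropping blanks and whitespace."""
    return [key.strip() for key in value.split(",") if key.strip()]


def auth_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """A blank API_KEYS disables auth, per the table above."""
    return bool(parse_api_keys(env.get("API_KEYS", "")))


def mock_mode(env: Mapping[str, str] = os.environ) -> bool:
    """Assumed rule: mock mode applies when no provider key is set."""
    return not (env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY"))
```

For example, `parse_model_target("openai:gpt-4o-mini")` yields `("openai", "gpt-4o-mini")`, and an environment with no provider keys makes `mock_mode` return `True`.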
Install only the packages you need:
pip install -e ".[embeddings]"
pip install -e ".[anthropic]"
pip install -e ".[celery]"
pip install -e ".[postgres]"
pip install -e ".[s3]"
pip install -e ".[market]"
pip install -e ".[monitoring]"
pip install -e ".[all]"

make test
make eval
python scripts/eval.py --thresholds benchmarks/thresholds/golden_minimum.json
make fmt
make lint
make typecheck
make web-lint
make web-build

The repository is organized around a small number of clear surfaces:
- src/api/ for HTTP entry points and application wiring
- src/pdf/ for extraction, rendering, tables, and OCR
- src/finance/ for facts, normalization, validation, and signal logic
- src/agent/ for pipeline orchestration and state handling
- src/market/ for event-study analysis and market providers
- web/ for the Vue 3 dashboard
- tests/ for API, storage, pipeline, frontend-adjacent, and integration coverage
- benchmarks/ for benchmark manifest schemas, threshold configs, and non-sensitive sample manifests
- docs/ for architecture, branch protection, roadmap, and project notes
Benchmark manifests, anonymized labels, synthetic fixtures, schemas, and threshold configs can be committed. Raw third-party or proprietary PDFs, private labels, customer files, and generated benchmark artifacts must stay out of git.
Use benchmarks/raw/ or benchmarks/private/ for local-only datasets. Those paths are ignored by git. Store only stable metadata, expected facts, expected evidence pointers, and licensing notes in committed manifests.
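
If you want an extra guard on top of the git ignore rules, a small check like the one below can flag paths that fall under the local-only directories before they are staged. This is a hypothetical helper, not a script that ships with the repository, and the file paths in the usage note are made up.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Directories this README describes as local-only.
LOCAL_ONLY_DIRS = (
    PurePosixPath("benchmarks/raw"),
    PurePosixPath("benchmarks/private"),
)


def is_local_only(path: str) -> bool:
    """True if `path` is one of the local-only dirs or lives inside one."""
    p = PurePosixPath(path)
    return any(d == p or d in p.parents for d in LOCAL_ONLY_DIRS)


def offending(paths: Iterable[str]) -> List[str]:
    """Return the subset of repo-relative paths that must stay out of git."""
    return [p for p in paths if is_local_only(p)]
```

Called with a list of staged, repo-relative paths, `offending([...])` returns only those under `benchmarks/raw/` or `benchmarks/private/`; a path such as `benchmarks/thresholds/golden_minimum.json` passes through untouched.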
All changes land through pull requests.
git checkout -b feat/<short-description>
bash scripts/local_ci.sh
git push -u origin HEAD
gh pr create --base main --fill

Before opening a PR, make sure local CI passes. The script covers Python linting, typing, tests, and the web checks that mirror CI.
See CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, and docs/BRANCH_PROTECTION.md for project policy and contribution details.
MIT. See LICENSE.
Jetbot produces structured extraction and analytical signals. It does not provide investment advice or recommend trades.