A Discord bot that fetches job postings from Greenhouse and Rippling boards, generates AI-powered summaries via Ollama (using Pydantic AI), and posts new listings to a Discord channel. Valkey (Redis-compatible) tracks already-posted jobs to prevent duplicates.
```mermaid
flowchart LR
    subgraph "External Services"
        GH[Greenhouse API]
        RP[Rippling API]
        DC[Discord Channel]
    end
    subgraph "Docker Compose"
        subgraph "Bot Container"
            BOT[Bot / Polling Loop]
            PF[Preflight Checks]
            CFG[Config]
            EMB[Embed Builder]
            SUM[Summarizer — Pydantic AI]
            CTL[Controller]
            GHC[Greenhouse Client]
            RPC[Rippling Client]
        end
        VK[(Valkey)]
        OL[Ollama LLM]
    end
    BOT --> PF
    PF -->|health check| VK
    PF -->|health check| OL
    BOT --> CFG
    BOT --> CTL
    CTL --> GHC
    CTL --> RPC
    GHC -->|fetch jobs| GH
    RPC -->|fetch jobs| RP
    BOT -->|filter new| VK
    BOT --> SUM
    SUM -->|summarize| OL
    BOT --> EMB
    EMB -->|post embed| DC
    BOT -->|mark posted| VK
```
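The flow in the diagram can be sketched as a single polling cycle. This is illustrative only — every function name below is a hypothetical stand-in, not the project's actual API:

```python
# Illustrative sketch of one polling cycle; all names are hypothetical.
def run_cycle(fetch_jobs, is_posted, summarize, post_embed, mark_posted, limit=None):
    """Fetch jobs, drop already-seen ones, summarize, post, and record."""
    jobs = fetch_jobs()                                      # controller -> board clients
    new_jobs = [j for j in jobs if not is_posted(j["id"])]   # dedupe against Valkey
    if limit is not None:
        new_jobs = new_jobs[:limit]                          # honor a --limit cap
    for job in new_jobs:
        summary = summarize(job)                             # Pydantic AI + Ollama
        post_embed(job, summary)                             # Discord embed
        mark_posted(job["id"])                               # remember with TTL
    return len(new_jobs)
```

Injecting the fetch, dedupe, and post steps as callables keeps the cycle testable without a live Discord or Valkey connection.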
- Python 3.14+
- A Discord bot token and target channel ID
- Valkey (or Redis) instance
- Ollama instance with your preferred model
```bash
cp .env.example .env
# Edit .env with your values
docker compose up --build
```

This starts three services: the bot, Valkey, and Ollama.
```bash
uv sync
cp .env.example .env
# Edit .env with your values
job-crawler            # Run the bot (polls once, posts to Discord, then exits)
job-crawler --dry-run  # Preview jobs locally without Discord
job-crawler --limit 5  # Cap the number of jobs posted per cycle
job-preflight          # Check that Valkey and Ollama are reachable
```
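A preflight check of this kind boils down to running a named probe per dependency and reporting which ones fail. A minimal sketch (the function names and probe shapes are assumptions, not the project's actual `preflight.py` API):

```python
# Hypothetical preflight sketch: run each named probe, collect failures.
def _safe(probe):
    """Return True only if the probe runs without raising and is truthy."""
    try:
        return bool(probe())
    except Exception:
        return False

def preflight(checks):
    """checks: mapping of service name -> zero-arg probe callable.

    Returns (ok, failed_names)."""
    failed = [name for name, probe in checks.items() if not _safe(probe)]
    return (not failed, failed)
```

In the real bot the probes would be something like a Valkey `PING` and a request to the Ollama API; here they are injectable so the logic is testable offline.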
All configuration is via environment variables. See .env.example.
| Variable | Required | Default | Description |
|---|---|---|---|
| `DISCORD_TOKEN` | Yes | — | Discord bot token (not required for `--dry-run`) |
| `DISCORD_CHANNEL_ID` | Yes | — | Channel to post job listings (not required for `--dry-run`) |
| `VALKEY_URL` | No | `valkey://localhost:6379/0` | Valkey/Redis connection URL |
| `JOB_TTL_SECONDS` | No | `7776000` (90 days) | How long to remember posted jobs |
| `BOARD_URLS` | No | Value of `GREENHOUSE_BOARD_URL` | Comma-separated list of board URLs (Greenhouse and/or Rippling) |
| `GREENHOUSE_BOARD_URL` | No | Temporal Technologies board | Greenhouse board API endpoint (used as fallback when `BOARD_URLS` is not set) |
| `OLLAMA_BASE_URL` | No | `http://localhost:11434/v1` | Ollama API URL (read by Pydantic AI) |
| `OLLAMA_MODEL` | No | `ministral-3` | LLM model for summarization |
```bash
uv run pytest
```

```
job_crawler/
├── bot.py         # Discord bot, polling loop, and CLI entrypoint
├── config.py      # Environment variable configuration
├── greenhouse.py  # Greenhouse API client and Job dataclass
├── rippling.py    # Rippling API client
├── controller.py  # Multi-source job fetcher router
├── state.py       # Valkey-backed job deduplication
├── summarize.py   # LLM summarization via Pydantic AI + Ollama
├── embeds.py      # Discord embed builder
└── preflight.py   # Service health checks (Valkey, Ollama)
```
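The controller's job is to route each configured board URL to the matching client. One plausible routing rule, keyed on the hostname (the URL patterns here are assumptions based on typical Greenhouse and Rippling board endpoints, not the project's actual `controller.py` logic):

```python
# Hypothetical sketch of source detection for the multi-board controller.
def detect_source(url):
    """Classify a board URL as 'greenhouse' or 'rippling'."""
    if "greenhouse.io" in url:
        return "greenhouse"
    if "rippling.com" in url:
        return "rippling"
    raise ValueError(f"Unrecognized board URL: {url}")
```

With a rule like this, `BOARD_URLS` can freely mix sources and the controller dispatches each one to the right client.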