In-tron/PerturbationCatalogue_LLM

Perturbation Catalogue LLM Agent

An LLM agent that queries the Perturbation Catalogue API using natural language.

Supports three backends:

  • Anthropic Claude (recommended) — uses native tool_use feature
  • OpenAI GPT-4o — uses function calling
  • Ollama (local) — no API key required, runs locally

Ask questions like:

  • "What happens when SPI1 is knocked out?"
  • "Which MAVE variants of UBE2I are loss-of-function?"
  • "Compare CRISPR screen effects of TP53 vs MYC across cell types"
  • "What pathways are enriched after BRCA1 knockout in dataset X?"

Architecture

User prompt
    │
    ▼
LLM Agent (Claude / GPT-4o / Ollama)
    │  decides which tools to call
    ▼
┌─────────────────────────────────────┐
│           API Tools                 │
│  search_catalogue()                 │
│  get_perturb_seq()                  │
│  get_mave_data()                    │
│  get_crispr_screen()                │
│  get_gsea_results()                 │
│  get_dataset_info()                 │
│  get_catalogue_summary()            │
└─────────────────────────────────────┘
    │  live HTTP calls
    ▼
Perturbation Catalogue API
https://perturbation-catalogue-be-328296435987.europe-west2.run.app
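
The dispatch step in the diagram ("decides which tools to call") can be sketched as a simple tool registry. This is an illustrative sketch only; the tool names come from the diagram above, but the registry mechanics and the stubbed return value are assumptions, not the real `agent/tools.py` implementation, which issues live HTTP calls.

```python
from typing import Any, Callable

# Registry mapping tool names (as exposed to the LLM) to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {}

def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_catalogue_summary() -> dict:
    # The real tool makes a live HTTP call to the Perturbation Catalogue
    # API; stubbed here so the sketch runs offline.
    return {"datasets": 0}

def dispatch(name: str, args: dict) -> Any:
    """Execute the tool the LLM chose and return its result."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**args)

print(dispatch("get_catalogue_summary", {}))
```

The LLM backend returns a tool name and JSON arguments; `dispatch` runs the matching function and the result is fed back into the conversation.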

Quickstart

# 1. Install
pip install -e .

# 2. Choose your LLM backend:

# Option A: Anthropic Claude (recommended, requires API key)
export ANTHROPIC_API_KEY=<your-anthropic-api-key>

# Option B: OpenAI GPT-4o (requires API key)
export OPENAI_API_KEY=<your-openai-api-key>

# Option C: Ollama (local, no API key)
# First, download & run Ollama locally:
ollama pull llama3.1:8b
ollama serve

# 3. Run interactive agent
python agent/agent.py

# With Ollama backend:
python agent/agent.py --backend ollama --model llama3.1:8b

# 4. Or single query
python agent/agent.py --query "What genes are downregulated when SPI1 is knocked out?"
python agent/agent.py --backend ollama --model llama3.1:8b --query "What happens when BRCA1 is knocked out?"

# 5. Or start API server
python agent/agent.py --serve --port 8000
python agent/agent.py --backend ollama --model llama3.1:8b --serve --port 8000
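
Once the server is running, it can be queried over HTTP. The endpoint path (`/query`) and request schema below are assumptions for illustration; check the server code for the actual route and payload.

```python
import json
import urllib.request

def build_query(question: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the agent server (assumed /query endpoint)."""
    payload = json.dumps({"query": question}).encode()
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query("What happens when SPI1 is knocked out?")
print(req.full_url)
# Send with: urllib.request.urlopen(req) while the server is up.
```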

Project Structure

├── data/
│   └── api_client.py       # Typed Python client for the Perturbation Catalogue API
├── agent/
│   ├── tools.py            # API-wrapped tools the LLM can call
│   └── agent.py            # Main agent loop (Anthropic + OpenAI + Ollama backends)
├── indexing/               # Optional: pre-build a vector index for faster retrieval
│   ├── chunker.py
│   ├── embedder.py
│   └── build_index.py
├── retrieval/
│   └── retriever.py        # Vector search (used only if index is built)
├── evaluation/
│   └── evaluate.py
└── scripts/
    └── query_cli.py        # Rich interactive CLI

Backend Comparison

Backend    Model                 API Key     Speed                Quality
Anthropic  Claude 3.5 Sonnet     Required    Fast                 Excellent
OpenAI     GPT-4o                Required    Fast                 Excellent
Ollama     llama3.1:8b (local)   Not needed  Depends on hardware  Good

Choose Ollama for local development, offline usage, or privacy-sensitive applications.
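
A backend choice like the `--backend` flag above might map to a default model as follows; the backend names are from this README, but the mapping itself is illustrative, not the agent's actual defaults.

```python
def resolve_model(backend: str) -> str:
    """Return an assumed default model for a given backend name."""
    defaults = {
        "anthropic": "claude-3-5-sonnet",  # recommended backend
        "openai": "gpt-4o",
        "ollama": "llama3.1:8b",           # local, no API key
    }
    if backend not in defaults:
        raise ValueError(f"Unsupported backend: {backend}")
    return defaults[backend]

print(resolve_model("ollama"))
```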

About

LLM for PerturbationCatalogue (https://www.ebi.ac.uk/perturbation-catalogue/)
