In-tron/PerturbationCatalogue_LLM

Perturbation Catalogue LLM Agent

An LLM agent that queries the Perturbation Catalogue API using natural language.

Supports three backends:

  • Anthropic Claude (recommended) — uses native tool_use feature
  • OpenAI GPT-4o — uses function calling
  • Ollama (local) — no API key required, runs locally

Ask questions like:

  • "What happens when SPI1 is knocked out?"
  • "Which MAVE variants of UBE2I are loss-of-function?"
  • "Compare CRISPR screen effects of TP53 vs MYC across cell types"
  • "What pathways are enriched after BRCA1 knockout in dataset X?"

Architecture

User prompt
    │
    ▼
LLM Agent (Claude / GPT-4o / Ollama)
    │  decides which tools to call
    ▼
┌─────────────────────────────────────┐
│           API Tools                 │
│  search_catalogue()                 │
│  get_perturb_seq()                  │
│  get_mave_data()                    │
│  get_crispr_screen()                │
│  get_gsea_results()                 │
│  get_dataset_info()                 │
│  get_catalogue_summary()            │
└─────────────────────────────────────┘
    │  live HTTP calls
    ▼
Perturbation Catalogue API
https://perturbation-catalogue-be-328296435987.europe-west2.run.app
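
The dispatch step in the diagram ("decides which tools to call") can be sketched as a simple tool registry. This is an illustrative sketch only; the tool names come from the diagram above, but the registry mechanics and the stubbed return value are assumptions, not the real `agent/tools.py` implementation, which issues live HTTP calls.

```python
from typing import Any, Callable

# Registry mapping tool names (as exposed to the LLM) to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {}

def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_catalogue_summary() -> dict:
    # The real tool makes a live HTTP call to the Perturbation Catalogue
    # API; stubbed here so the sketch runs offline.
    return {"datasets": 0}

def dispatch(name: str, args: dict) -> Any:
    """Execute the tool the LLM chose and return its result."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**args)

print(dispatch("get_catalogue_summary", {}))
```

The LLM backend returns a tool name and JSON arguments; `dispatch` runs the matching function and the result is fed back into the conversation.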

Quickstart

# 1. Install
pip install -e .

# 2. Choose your LLM backend:

# Option A: Anthropic Claude (recommended, requires API key)
export ANTHROPIC_API_KEY=<your-anthropic-api-key>

# Option B: OpenAI GPT-4o (requires API key)
export OPENAI_API_KEY=<your-openai-api-key>

# Option C: Ollama (local, no API key)
# First, download & run Ollama locally:
ollama pull llama3.1:8b
ollama serve

# 3. Run interactive agent
python agent/agent.py

# With Ollama backend:
python agent/agent.py --backend ollama --model llama3.1:8b

# 4. Or single query
python agent/agent.py --query "What genes are downregulated when SPI1 is knocked out?"
python agent/agent.py --backend ollama --model llama3.1:8b --query "What happens when BRCA1 is knocked out?"

# 5. Or start API server
python agent/agent.py --serve --port 8000
python agent/agent.py --backend ollama --model llama3.1:8b --serve --port 8000
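
Once the server is running, it can be queried over HTTP. The endpoint path (`/query`) and request schema below are assumptions for illustration; check the server code for the actual route and payload.

```python
import json
import urllib.request

def build_query(question: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the agent server (assumed /query endpoint)."""
    payload = json.dumps({"query": question}).encode()
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query("What happens when SPI1 is knocked out?")
print(req.full_url)
# Send with: urllib.request.urlopen(req) while the server is up.
```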

Project Structure

├── data/
│   └── api_client.py       # Typed Python client for the Perturbation Catalogue API
├── agent/
│   ├── tools.py            # API-wrapped tools the LLM can call
│   └── agent.py            # Main agent loop (Anthropic + OpenAI + Ollama backends)
├── indexing/               # Optional: pre-build a vector index for faster retrieval
│   ├── chunker.py
│   ├── embedder.py
│   └── build_index.py
├── retrieval/
│   └── retriever.py        # Vector search (used only if index is built)
├── evaluation/
│   └── evaluate.py
└── scripts/
    └── query_cli.py        # Rich interactive CLI

Backend Comparison

Backend    Model                 API Key     Speed                Quality
Anthropic  Claude 3.5 Sonnet     Required    Fast                 Excellent
OpenAI     GPT-4o                Required    Fast                 Excellent
Ollama     llama3.1:8b (local)   Not needed  Depends on hardware  Good

Choose Ollama for local development, offline usage, or privacy-sensitive applications.
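
A backend choice like the `--backend` flag above might map to a default model as follows; the backend names are from this README, but the mapping itself is illustrative, not the agent's actual defaults.

```python
def resolve_model(backend: str) -> str:
    """Return an assumed default model for a given backend name."""
    defaults = {
        "anthropic": "claude-3-5-sonnet",  # recommended backend
        "openai": "gpt-4o",
        "ollama": "llama3.1:8b",           # local, no API key
    }
    if backend not in defaults:
        raise ValueError(f"Unsupported backend: {backend}")
    return defaults[backend]

print(resolve_model("ollama"))
```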

About

LLM for PerturbationCatalogue (https://www.ebi.ac.uk/perturbation-catalogue/)
