This guide covers deployment options for Cyber-AutoAgent in various environments.
Cyber-AutoAgent supports 4 invocation methods, each with different use cases:
Best for: Automation, scripting, CI/CD pipelines
# Configure via environment variables
export AZURE_API_KEY="your_key"
export AZURE_API_BASE="https://your-endpoint.openai.azure.com/"
export AZURE_API_VERSION="2024-12-01-preview"
export CYBER_AGENT_LLM_MODEL="azure/gpt-5"
export CYBER_AGENT_EMBEDDING_MODEL="azure/text-embedding-3-large"
export REASONING_EFFORT="medium"
export KMP_DUPLICATE_LIB_OK="TRUE"
# Run with uv (recommended)
uv run python src/cyberautoagent.py \
--target "https://example.com" \
--objective "Bug bounty assessment" \
--iterations 150 \
--provider litellm

Best for: Repeated testing with saved config, development
# Uses ~/.cyber-autoagent/config.json for settings
cd src/modules/interfaces/react
npm start -- --auto-run \
--target "https://example.com" \
--objective "Security assessment" \
--iterations 50

Configure via ~/.cyber-autoagent/config.json:
{
"modelProvider": "litellm",
"modelId": "azure/gpt-5",
"embeddingModel": "azure/text-embedding-3-large",
"azureApiKey": "your_key",
"azureApiBase": "https://your-endpoint.openai.azure.com/",
"azureApiVersion": "2024-12-01-preview",
"reasoningEffort": "medium"
}

Best for: Isolated environments, clean tooling, reproducibility
With Interactive React Terminal:
docker run -it --rm \
-e AZURE_API_KEY=your_key \
-e AZURE_API_BASE=https://your-endpoint.openai.azure.com/ \
-e CYBER_AGENT_LLM_MODEL=azure/gpt-5 \
-v $(pwd)/outputs:/app/outputs \
cyber-autoagent:latest

Direct Python Execution (Override Entrypoint):
docker run --rm --entrypoint python \
-e AZURE_API_KEY=your_key \
-e AZURE_API_BASE=https://your-endpoint.openai.azure.com/ \
-e AZURE_API_VERSION=2024-12-01-preview \
-e CYBER_AGENT_LLM_MODEL=azure/gpt-5 \
-e CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large \
-e REASONING_EFFORT=medium \
-v $(pwd)/outputs:/app/outputs \
cyber-autoagent:latest \
src/cyberautoagent.py \
--target https://example.com \
--objective "Security assessment" \
--iterations 50 \
--provider litellm

Best for: Observability, team deployments, production monitoring
# Uses docker/.env for configuration
docker compose -f docker/docker-compose.yml up -d

Cyber-AutoAgent supports 300+ LLM providers via LiteLLM. Examples:
Azure OpenAI:
-e AZURE_API_KEY=your_key
-e AZURE_API_BASE=https://your-endpoint.openai.azure.com/
-e AZURE_API_VERSION=2024-12-01-preview
-e CYBER_AGENT_LLM_MODEL=azure/gpt-5
-e CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large

AWS Bedrock:
-e AWS_ACCESS_KEY_ID=your_key
-e AWS_SECRET_ACCESS_KEY=your_secret
-e CYBER_AGENT_LLM_MODEL=us.anthropic.claude-sonnet-4-5-20250929-v1:0
-e CYBER_AGENT_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0

OpenRouter:
-e OPENROUTER_API_KEY=your_key
-e CYBER_AGENT_LLM_MODEL=openrouter/openrouter/polaris-alpha
-e CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large

Moonshot AI:
-e MOONSHOT_API_KEY=your_key
-e OPENAI_API_KEY=your_key # Required for Mem0 OpenAI-compatible providers
-e CYBER_AGENT_LLM_MODEL=moonshot/kimi-k2-thinking
-e CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large
-e MEM0_LLM_MODEL=azure/gpt-4o # Memory system LLM (use Azure/Anthropic/Bedrock for Mem0)
-e AZURE_API_KEY=azure_key # Required for embeddings and Mem0
-e AZURE_API_BASE=https://your-endpoint.openai.azure.com/
-e AZURE_API_VERSION=2024-12-01-preview

Note: When using OpenAI-compatible providers (Moonshot, OpenRouter, etc.) with Mem0, you must:
- Set OPENAI_API_KEY to the provider's API key for Mem0 compatibility
- Use a supported Mem0 provider (Azure, OpenAI, Anthropic, Bedrock) for MEM0_LLM_MODEL
Mixed Providers: You can combine any LLM with any embedding model!
# Clone the repository
git clone https://github.com/double16/Cyber-AutoAgent-ng.git
cd Cyber-AutoAgent-ng
# Build and run with Docker Compose (includes observability)
cd docker
docker-compose up -d
# Run a penetration test
docker run --rm \
--network cyber-autoagent_default \
-e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
-e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
-e LANGFUSE_HOST=http://langfuse-web:3000 \
-e LANGFUSE_PUBLIC_KEY=cyber-public \
-e LANGFUSE_SECRET_KEY=cyber-secret \
-v $(pwd)/outputs:/app/outputs \
cyber-autoagent \
--target "example.com" \
--objective "Web application security assessment"

For just the agent without observability:
# Build the image
# (optional) docker build --pull -f docker/Dockerfile.tools -t public.ecr.aws/bramblethorn/cyber-autoagent-ng/tools:latest .
docker build -t cyber-autoagent -f docker/Dockerfile .
# Run with AWS Bedrock
docker run --rm \
-e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
-e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
-e AWS_REGION=${AWS_REGION:-us-east-1} \
-v $(pwd)/outputs:/app/outputs \
cyber-autoagent \
--target "192.168.1.100" \
--objective "Network security assessment" \
--provider bedrock
# Run with Ollama (local)
docker run --rm \
-e OLLAMA_HOST=http://host.docker.internal:11434 \
-e OLLAMA_CONTEXT_LENGTH=32768 \
-v $(pwd)/outputs:/app/outputs \
cyber-autoagent \
--target "testsite.local" \
--objective "Basic security scan" \
--provider ollama \
--model qwen3-coder:30b-a3b-q4_K_M

- Network Isolation: Deploy in an isolated network segment
- Resource Limits: Set memory and CPU limits in docker-compose.yml
- Secure Keys: Generate proper encryption keys for Langfuse:
# Generate secure keys
openssl rand -hex 32     # For ENCRYPTION_KEY
openssl rand -base64 32  # For SALT
openssl rand -base64 32  # For NEXTAUTH_SECRET
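
Where openssl is not available, equivalent values can be produced with the Python standard library. The sketch below mirrors the three openssl commands (32 random bytes each, hex-encoded for ENCRYPTION_KEY and base64-encoded for SALT and NEXTAUTH_SECRET). The function name generate_langfuse_secrets is hypothetical and not part of Cyber-AutoAgent.

```python
import base64
import secrets


def generate_langfuse_secrets() -> dict[str, str]:
    """Return fresh random values shaped like the openssl output (hypothetical helper)."""
    return {
        # openssl rand -hex 32: 32 random bytes as 64 hex characters
        "ENCRYPTION_KEY": secrets.token_hex(32),
        # openssl rand -base64 32: 32 random bytes, base64-encoded
        "SALT": base64.b64encode(secrets.token_bytes(32)).decode("ascii"),
        "NEXTAUTH_SECRET": base64.b64encode(secrets.token_bytes(32)).decode("ascii"),
    }


if __name__ == "__main__":
    # Print KEY=value lines that can be pasted into an env file.
    for name, value in generate_langfuse_secrets().items():
        print(f"{name}={value}")
```

The secrets module draws from the operating system's cryptographic random source, so it is a reasonable stand-in for openssl rand here.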
Cyber-AutoAgent uses a modular, three-tier configuration system with automatic model detection and safe token limit allocation.
Configuration Modules:
config/
├── manager.py # Core orchestration (ConfigManager)
├── types.py # Type definitions and dataclasses
├── models/ # Model creation and capabilities
├── system/ # Environment, logging, validation
└── providers/ # Provider-specific helpers
See src/modules/config/README.md for complete module documentation.
Settings are applied in this priority order:
1. CLI/API Arguments (Highest)
└─ Flags: --provider, --model, --iterations
└─ Direct parameters to create_agent()
2. Environment Variables (Override)
└─ CYBER_AGENT_LLM_MODEL
└─ CYBER_AGENT_EMBEDDING_MODEL
└─ REASONING_EFFORT
└─ Provider-specific: AZURE_API_KEY, AWS_REGION, etc.
3. Provider Defaults (Fallback)
└─ Safe defaults for all providers
└─ Automatically selected based on provider
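
The priority order above can be sketched as a small resolver. This is an illustration of the precedence rule only, not the ConfigManager implementation; resolve_setting and the "default-model" placeholder are hypothetical names.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    provider_default: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a value using CLI > environment > provider default (hypothetical helper)."""
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value           # 1. CLI/API argument has the highest priority
    if env.get(env_name):
        return env[env_name]       # 2. environment variable overrides the default
    return provider_default        # 3. provider default is the fallback


if __name__ == "__main__":
    env = {"CYBER_AGENT_LLM_MODEL": "azure/gpt-5"}
    # No CLI value: the environment variable is used.
    print(resolve_setting(None, "CYBER_AGENT_LLM_MODEL", "default-model", env))
    # CLI value present: it wins over the environment.
    print(resolve_setting("azure/gpt-4o", "CYBER_AGENT_LLM_MODEL", "default-model", env))
```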
Example:
# Default: temperature=0.5 (from provider defaults)
# Override via environment: CYBER_LLM_TEMPERATURE=0.8
# Override via CLI: create_agent(..., temperature=0.7)
# Result: Uses 0.7 (CLI has highest priority)

Token limits are automatically detected using the models.dev API with resilient fallback:
Three-Tier Fallback:
1. Disk cache (~/.cache/cyber-autoagent/models.json, 24h TTL)
2. Live API (https://models.dev/api.json)
3. Embedded snapshot (models_snapshot.json, 432KB bundled)
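
A minimal sketch of this cache, live API, snapshot chain follows. It is an assumption about how such a loader could look, not the project's actual code: load_model_registry and fetch_models_dev are hypothetical names, and the live fetch is passed in as a callable so the fallback logic can be exercised offline.

```python
import json
import time
import urllib.request
from pathlib import Path
from typing import Callable, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in the list above


def fetch_models_dev() -> dict:
    """Download the registry from the live API (tier 2)."""
    with urllib.request.urlopen("https://models.dev/api.json", timeout=10) as resp:
        return json.load(resp)


def load_model_registry(
    cache_path: Path,
    snapshot_path: Path,
    fetch_live: Callable[[], dict] = fetch_models_dev,
    now: Optional[float] = None,
) -> tuple[dict, str]:
    """Return (registry, source); source is 'cache', 'live' or 'snapshot'."""
    now = time.time() if now is None else now

    # Tier 1: disk cache, only if it exists and is younger than the TTL.
    try:
        if now - cache_path.stat().st_mtime < CACHE_TTL_SECONDS:
            return json.loads(cache_path.read_text()), "cache"
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall through to the live API

    # Tier 2: live API; refresh the cache on success.
    try:
        data = fetch_live()
        try:
            cache_path.parent.mkdir(parents=True, exist_ok=True)
            cache_path.write_text(json.dumps(data))
        except OSError:
            pass  # a failed cache write should not discard fresh data
        return data, "live"
    except Exception:
        pass  # offline or API error: fall through to the snapshot

    # Tier 3: embedded snapshot bundled with the package.
    return json.loads(snapshot_path.read_text()), "snapshot"
```

Injecting fetch_live keeps the function testable without network access; in normal use the default would hit the models.dev endpoint named above.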
Benefits:
- Accurate limits for 1,100+ models across 58 providers
- Works offline (embedded snapshot)
- Safe token allocation (50% of actual limit by default)
- Automatic capability detection (reasoning, tools, attachments)
Safe Token Limits:
# Specialist tools use 50% of model's output limit for reliability
safe_max = model_output_limit * 0.5
# Example: Bedrock Claude 3.5 Sonnet v2
# Actual limit: 8,192 tokens
# Safe allocation: 4,096 tokens

Token limits use five-tier precedence:
1. Explicit override - CYBER_PROMPT_LIMIT_FORCE environment variable
2. Context window maximum - CYBER_CONTEXT_LIMIT, if defined, limits the following
3. Models.dev API - Authoritative registry (preferred)
4. Fallback mappings - CYBER_CONTEXT_WINDOW_FALLBACKS (JSON)
5. Provider defaults - Safe defaults per provider
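
One way to read this precedence is sketched below. Several details are assumptions made for illustration: the registry is modelled as a plain mapping from model ID to context window, fallback entries are treated as alternative model IDs to look up in that same registry, and resolve_prompt_limit is a hypothetical name. The real resolution logic may differ.

```python
import json
import os
from typing import Mapping, Optional


def resolve_prompt_limit(
    model_id: str,
    registry: Mapping[str, int],
    provider_default: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Resolve a prompt token limit following the five tiers (illustrative only)."""
    env = os.environ if env is None else env

    # Tier 1: an explicit override always wins and is not capped.
    forced = env.get("CYBER_PROMPT_LIMIT_FORCE")
    if forced:
        return int(forced)

    # Tier 3: registry lookup (stands in for the models.dev data).
    limit = registry.get(model_id)

    # Tier 4: fallback mappings, tried in order (assumed semantics).
    if limit is None:
        fallbacks = json.loads(env.get("CYBER_CONTEXT_WINDOW_FALLBACKS", "[]"))
        for mapping in fallbacks:
            for candidate in mapping.get(model_id, []):
                if candidate in registry:
                    limit = registry[candidate]
                    break
            if limit is not None:
                break

    # Tier 5: provider default as the last resort.
    if limit is None:
        limit = provider_default

    # Tier 2: CYBER_CONTEXT_LIMIT, if defined, caps whatever tiers 3-5 produced.
    cap = env.get("CYBER_CONTEXT_LIMIT")
    return min(limit, int(cap)) if cap else limit
```

The cap is applied last in the code even though it ranks second, because it limits the result of the lower tiers rather than supplying a value of its own.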
Example fallback configuration:
export CYBER_CONTEXT_WINDOW_FALLBACKS='[
{"azure/gpt-5": ["azure/gpt-4o", "azure/gpt-4"]},
{"anthropic/claude-opus": ["anthropic/claude-sonnet-4-5"]}
]'

| Variable | Description | Required |
|---|---|---|
| CYBER_AGENT_PROVIDER | Provider choice (bedrock/ollama/litellm) | No (auto-detected) |
| CYBER_AGENT_LLM_MODEL | Main LLM model ID | Yes |
| CYBER_AGENT_EMBEDDING_MODEL | Embedding model ID | No (provider default) |
| REASONING_EFFORT | Reasoning effort (low/medium/high) | No (default: medium) |
| MAX_TOKENS | Override LLM max (output) tokens | No (models.dev default) |
| CYBER_AGENT_SWARM_MODEL | Swarm LLM model ID | No |
| CYBER_AGENT_SWARM_MAX_TOKENS | Override specialist max tokens | No (models.dev default) |
| MAX_TOKENS_LIMIT | Override LLM output token upper bound | No (12,000 default) |
| MAX_TOKENS_REASONING_LIMIT | Override LLM output token upper bound (reasoning) | No (32,000 default) |
| CYBER_CONTEXT_LIMIT | Limit detected prompt tokens | No (auto-detected) |
| CYBER_PROMPT_LIMIT_FORCE | Force prompt token limit | No (auto-detected) |
| AWS_ACCESS_KEY_ID | AWS credentials for Bedrock | For Bedrock provider |
| AWS_SECRET_ACCESS_KEY | AWS credentials for Bedrock | For Bedrock provider |
| AWS_REGION | AWS region (default: us-east-1) | For Bedrock provider |
| OLLAMA_HOST | Ollama API endpoint | For Ollama provider |
| OLLAMA_CONTEXT_LENGTH | Ollama model context length | No, Ollama default |
| OLLAMA_TIMEOUT | Ollama API timeout in seconds | No (default: 120) |
| AZURE_API_KEY | Azure OpenAI API key | For Azure/LiteLLM |
| AZURE_API_BASE | Azure endpoint URL | For Azure/LiteLLM |
| AZURE_API_VERSION | Azure API version | For Azure/LiteLLM |
| MEM0_API_KEY | Mem0 Platform API key | For cloud memory backend |
| MEM0_LLM_MODEL | Memory system LLM | No (auto-aligned) |
| OPENSEARCH_HOST | OpenSearch endpoint | For OpenSearch memory backend |
| LANGFUSE_HOST | Langfuse observability endpoint | For observability |
| LANGFUSE_PUBLIC_KEY | Langfuse API public key | For observability |
| LANGFUSE_SECRET_KEY | Langfuse API secret key | For observability |
| ENABLE_AUTO_EVALUATION | Enable automatic Ragas evaluation | For evaluation |
| CYBER_RATE_LIMIT_REQ_PER_MIN | Limit model requests per minute | No (no limit) |
| CYBER_RATE_LIMIT_TOKENS_PER_MIN | Limit model tokens per minute | No (no limit) |
| CYBER_RATE_LIMIT_MAX_CONCURRENT | Limit model concurrent requests | No (Ollama defaults to 1) |
| CYBER_AGENT_PRICING_INPUT | Model price per 1M input tokens | No (defaults to models.dev) |
| CYBER_AGENT_PRICING_OUTPUT | Model price per 1M output tokens | No (defaults to models.dev) |
| CYBER_AGENT_PRICING_CACHE_READ | Model price per 1M cache read tokens | No (defaults to models.dev) |
| CYBER_AGENT_PRICING_CACHE_WRITE | Model price per 1M cache write tokens | No (defaults to models.dev) |
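
Because the pricing variables are expressed per 1M tokens, a run's cost follows from simple arithmetic. The sketch below shows that calculation for input and output tokens only. The function name estimate_cost is hypothetical, and treating an unset variable as 0 is a simplification made here; per the table, the agent itself falls back to models.dev prices.

```python
import os
from typing import Mapping, Optional


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    env: Optional[Mapping[str, str]] = None,
) -> float:
    """Estimate cost from per-1M-token prices (hypothetical helper)."""
    env = os.environ if env is None else env
    # Simplification: unset prices count as 0 instead of a models.dev lookup.
    price_in = float(env.get("CYBER_AGENT_PRICING_INPUT", 0))
    price_out = float(env.get("CYBER_AGENT_PRICING_OUTPUT", 0))
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Illustrative prices only, not real rates for any model.
    prices = {"CYBER_AGENT_PRICING_INPUT": "3", "CYBER_AGENT_PRICING_OUTPUT": "15"}
    print(estimate_cost(2_000_000, 100_000, prices))
```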
Example deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cyber-autoagent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cyber-autoagent
  template:
    metadata:
      labels:
        app: cyber-autoagent
    spec:
      containers:
        - name: cyber-autoagent
          image: cyber-autoagent:latest
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: secret-access-key
          volumeMounts:
            - name: outputs
              mountPath: /app/outputs
      volumes:
        - name: outputs
          persistentVolumeClaim:
            claimName: outputs-pvc

- Access Langfuse UI at http://localhost:3000
- Default credentials: admin@cyber-autoagent.com / changeme
- View real-time traces of agent operations
- Export results for reporting
The React terminal interface provides interactive configuration and real-time monitoring:
# Install and build
cd src/modules/interfaces/react
npm install
npm run build
# Start the interface
npm start
# The interface will guide you through:
# 1. Docker environment setup
# 2. Deployment mode selection (local-cli, single-container, full-stack)
# 3. Model provider configuration (Bedrock, Ollama, LiteLLM)
# 4. First assessment execution

Access the interface at http://localhost:3000 when using full-stack deployment with observability.
Cyber-AutoAgent supports three memory backends with automatic selection:
| Backend | Priority | Environment Variable | Use Case |
|---|---|---|---|
| Mem0 Platform | 1 | MEM0_API_KEY | Cloud-hosted, managed service |
| OpenSearch | 2 | OPENSEARCH_HOST | AWS managed search, production scale |
| FAISS | 3 | None (default) | Local vector storage, development |
Memory persists in outputs/<target>/memory/ for cross-operation learning.
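
The automatic selection in the table amounts to checking the two environment variables in priority order and defaulting to FAISS. A minimal sketch, assuming selection depends only on whether each variable is set; select_memory_backend and the returned labels are hypothetical.

```python
import os
from typing import Mapping, Optional


def select_memory_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a backend by the priority in the table (hypothetical helper)."""
    env = os.environ if env is None else env
    if env.get("MEM0_API_KEY"):
        return "mem0_platform"   # priority 1: cloud-hosted Mem0 Platform
    if env.get("OPENSEARCH_HOST"):
        return "opensearch"      # priority 2: OpenSearch
    return "faiss"               # priority 3: local FAISS (default)


if __name__ == "__main__":
    print(select_memory_backend({}))
    print(select_memory_backend({"OPENSEARCH_HOST": "search.example.internal"}))
```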
export AZURE_API_KEY=your_key
export AZURE_API_BASE=https://your-endpoint.openai.azure.com/
export AZURE_API_VERSION=2024-12-01-preview
export CYBER_AGENT_LLM_MODEL=azure/gpt-5
export CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large
export REASONING_EFFORT=high
export MAX_TOKENS=8000 # Optional: Override default

export AWS_REGION=us-east-1
export CYBER_AGENT_LLM_MODEL=us.anthropic.claude-sonnet-4-5-20250929-v1:0
export CYBER_AGENT_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export MEM0_API_KEY=your_mem0_key # Cloud memory backend
export REASONING_EFFORT=medium

export MOONSHOT_API_KEY=your_key
export CYBER_AGENT_LLM_MODEL=moonshot/kimi-k2-thinking
export CYBER_AGENT_EMBEDDING_MODEL=azure/text-embedding-3-large
export AZURE_API_KEY=your_azure_key # For embeddings
export AZURE_API_BASE=https://your-endpoint.openai.azure.com/
export AZURE_API_VERSION=2024-12-01-preview
export MEM0_LLM_MODEL=azure/gpt-4o # Memory system uses Azure
export OPENAI_API_KEY=your_moonshot_key # Mem0 compatibility

export OLLAMA_HOST=http://localhost:11434
export OLLAMA_CONTEXT_LENGTH=32768
export CYBER_AGENT_LLM_MODEL=qwen3-coder:30b-a3b-q4_K_M
export CYBER_AGENT_EMBEDDING_MODEL=nomic-embed-text:latest
export CYBER_CONTEXT_WINDOW_FALLBACKS='[
{"qwen3-coder:30b": ["qwen3-coder:14b", "llama3.2:3b"]}
]'

Common deployment issues:
- Container fails to start: Check Docker logs with docker logs cyber-autoagent
- AWS credentials error: Ensure IAM role has Bedrock access and correct region
- Ollama connection failed: Verify Ollama is running and accessible at specified host
- Out of memory: Increase Docker memory limits or reduce --iterations parameter
- React interface issues: Run npm run build after any code changes
- Memory backend errors: Verify environment variables and network connectivity
- Model not found: Check model ID format (use provider/model for LiteLLM)
- Token limit errors: Verify models.dev snapshot exists at src/modules/config/models/models_snapshot.json
- Specialist failures: Check swarm max_tokens configuration (should be >100 tokens)