Watson X Integration with Granite Models

This branch adds support for IBM Watson X AI with Granite models as an alternative to Ollama for running LocalGPT.

Overview

LocalGPT now supports two LLM backends:

Ollama (default): Run models locally using Ollama
Watson X: Use IBM's Granite models hosted on Watson X AI

What Changed

Added WatsonXClient class in rag_system/utils/watsonx_client.py that provides an Ollama-compatible interface for Watson X
Updated factory.py and main.py to support backend switching via environment variable
Added ibm-watsonx-ai SDK dependency to requirements.txt
Configuration now supports both backends through environment variables
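
The backend switch described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in factory.py or main.py: the helper name resolve_backend and the SUPPORTED_BACKENDS tuple are hypothetical, and only the LLM_BACKEND variable and its two values come from this document.

```python
import os

# Backend names taken from this README; the tuple itself is illustrative.
SUPPORTED_BACKENDS = ("ollama", "watsonx")


def resolve_backend(env=None):
    """Return the backend selected by LLM_BACKEND, defaulting to 'ollama'.

    Hypothetical helper: the real factory may perform this check differently.
    """
    env = os.environ if env is None else env
    # Treat an unset or empty variable as "use the default backend".
    backend = (env.get("LLM_BACKEND") or "ollama").strip().lower()
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported LLM_BACKEND {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return backend


if __name__ == "__main__":
    print(resolve_backend({}))
    print(resolve_backend({"LLM_BACKEND": "watsonx"}))
```

Passing a dictionary instead of reading os.environ directly keeps the helper easy to test without touching the real environment.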

Prerequisites

To use Watson X with Granite models, you need:

IBM Cloud account with Watson X access
Watson X API key
Watson X project ID

Getting Your Credentials

Go to IBM Cloud
Navigate to Watson X AI service
Create or select a project
Get your API key from IBM Cloud IAM
Copy your project ID from the Watson X project settings

Configuration

Environment Variables

Create a .env file or set these environment variables:

# Choose LLM backend (default: ollama)
LLM_BACKEND=watsonx

# Watson X Configuration
WATSONX_API_KEY=your_api_key_here
WATSONX_PROJECT_ID=your_project_id_here
WATSONX_URL=https://us-south.ml.cloud.ibm.com

# Model Configuration
WATSONX_GENERATION_MODEL=ibm/granite-13b-chat-v2
WATSONX_ENRICHMENT_MODEL=ibm/granite-8b-japanese
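
As a sketch of how these variables might be read and validated at startup, the snippet below collects them into one settings object and fails early when a credential is missing. The WatsonXSettings class and load_watsonx_settings function are hypothetical names, and the fallback values simply reuse the example values shown above; the project's real defaults may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WatsonXSettings:
    """Hypothetical container for the variables listed above."""
    api_key: str
    project_id: str
    url: str
    generation_model: str
    enrichment_model: str


def load_watsonx_settings(env=None):
    """Build settings from environment variables, failing early if credentials are missing."""
    env = os.environ if env is None else env
    required = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return WatsonXSettings(
        api_key=env["WATSONX_API_KEY"],
        project_id=env["WATSONX_PROJECT_ID"],
        # Fallbacks below are assumptions copied from the example values in this README.
        url=env.get("WATSONX_URL", "https://us-south.ml.cloud.ibm.com"),
        generation_model=env.get("WATSONX_GENERATION_MODEL", "ibm/granite-13b-chat-v2"),
        enrichment_model=env.get("WATSONX_ENRICHMENT_MODEL", "ibm/granite-8b-japanese"),
    )


if __name__ == "__main__":
    demo = {"WATSONX_API_KEY": "your_api_key_here", "WATSONX_PROJECT_ID": "your_project_id_here"}
    print(load_watsonx_settings(demo).generation_model)
```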

Available Granite Models

Watson X offers several Granite models:

ibm/granite-13b-chat-v2 - General purpose chat model
ibm/granite-13b-instruct-v2 - Instruction-following model
ibm/granite-20b-multilingual - Multilingual support
ibm/granite-8b-japanese - Lightweight Japanese model
ibm/granite-3b-code-instruct - Code generation model

For a full list of available models, visit the Watson X documentation.

Installation

Install the Watson X SDK:

pip install "ibm-watsonx-ai>=1.3.39"

Or install all dependencies:

pip install -r rag_system/requirements.txt

Usage

Running with Watson X

Once the variables above are configured, select the Watson X backend and start LocalGPT as usual:

export LLM_BACKEND=watsonx
python -m rag_system.main api

Or in Python:

import os
os.environ['LLM_BACKEND'] = 'watsonx'

from rag_system.factory import get_agent

# Get agent with Watson X backend
agent = get_agent(mode="default")

# Use as normal
result = agent.run("What is artificial intelligence?")
print(result)

Switching Between Backends

To switch between Ollama and Watson X, change LLM_BACKEND before starting the server:

# Use Ollama (local)
export LLM_BACKEND=ollama
python -m rag_system.main api

# Use Watson X (cloud)
export LLM_BACKEND=watsonx
python -m rag_system.main api

Features

The Watson X client supports all the key features used by LocalGPT:

✅ Text generation / completion
✅ Async generation
✅ Streaming responses
✅ Embeddings (if using Watson X embedding models)
✅ Custom generation parameters (temperature, max_tokens, top_p, top_k)
⚠️ Image/multimodal support (limited, depends on model availability)
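
To illustrate the custom generation parameters, the sketch below translates the names listed above into a parameter dictionary of the kind Watson X text generation accepts. It assumes that max_tokens corresponds to the Watson X key max_new_tokens and that temperature, top_p, and top_k only take effect when decoding_method is "sample". The function to_watsonx_params is hypothetical, and the mapping inside WatsonXClient may differ.

```python
def to_watsonx_params(temperature=None, max_tokens=None, top_p=None, top_k=None):
    """Translate generic generation options into a Watson X style parameter dict.

    Hypothetical helper; key names are assumptions about the Watson X API.
    """
    params = {}
    if max_tokens is not None:
        params["max_new_tokens"] = int(max_tokens)

    # Keep only the sampling options the caller actually supplied.
    sampling = {"temperature": temperature, "top_p": top_p, "top_k": top_k}
    sampling = {key: value for key, value in sampling.items() if value is not None}
    if sampling:
        # Assumption: sampling options require the "sample" decoding method.
        params["decoding_method"] = "sample"
        params.update(sampling)
    return params


if __name__ == "__main__":
    print(to_watsonx_params(temperature=0.2, max_tokens=256))
```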

API Compatibility

The WatsonXClient provides the same interface as OllamaClient:

from rag_system.utils.watsonx_client import WatsonXClient

client = WatsonXClient(
    api_key="your_api_key",
    project_id="your_project_id"
)

# Generate completion
response = client.generate_completion(
    model="ibm/granite-13b-chat-v2",
    prompt="Explain quantum computing"
)

print(response['response'])

# Stream completion
for chunk in client.stream_completion(
    model="ibm/granite-13b-chat-v2",
    prompt="Write a story about AI"
):
    print(chunk, end='', flush=True)
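
One way to picture the shared interface is a structural type that both clients satisfy. The sketch below is illustrative only: the LLMClient protocol, the EchoClient stand-in, and the ask helper are hypothetical names, and the method signatures are inferred from the example above rather than taken from the source code.

```python
from typing import Any, Dict, Iterator, Protocol


class LLMClient(Protocol):
    """Methods assumed to be shared by OllamaClient and WatsonXClient."""

    def generate_completion(self, model: str, prompt: str, **kwargs: Any) -> Dict[str, Any]: ...

    def stream_completion(self, model: str, prompt: str, **kwargs: Any) -> Iterator[str]: ...


class EchoClient:
    """Offline stand-in used here so the example runs without any backend."""

    def generate_completion(self, model, prompt, **kwargs):
        return {"response": f"[{model}] {prompt}"}

    def stream_completion(self, model, prompt, **kwargs):
        # Emit the prompt back one word at a time.
        for word in prompt.split():
            yield word + " "


def ask(client: LLMClient, model: str, prompt: str) -> str:
    """Backend-agnostic call: works with any object that matches LLMClient."""
    return client.generate_completion(model=model, prompt=prompt)["response"]


if __name__ == "__main__":
    print(ask(EchoClient(), "ibm/granite-13b-chat-v2", "Explain quantum computing"))
```

Code written against such an interface does not need to know which backend is active, which is what makes switching through LLM_BACKEND possible.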

Limitations

Embedding Models: Watson X uses different embedding models than Ollama. If you generate embeddings through Watson X, set the embedding model in main.py to one that Watson X hosts.
Multimodal Support: Image support varies by model availability in Watson X. Not all Granite models support multimodal inputs.
Streaming: Streaming support depends on the Watson X SDK version and may fall back to returning the full response at once.
Rate Limits: Watson X has API rate limits that may differ from local Ollama usage. Monitor your usage accordingly.
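
The streaming fallback mentioned above could look roughly like this. It is a sketch under assumptions: the wrapper stream_with_fallback is hypothetical, and it assumes an unsupported stream surfaces as NotImplementedError before any chunk is produced. The real client may signal this differently.

```python
def stream_with_fallback(client, model, prompt, **kwargs):
    """Yield text chunks, or the whole response as one chunk if streaming is unavailable."""
    stream = getattr(client, "stream_completion", None)
    if callable(stream):
        emitted = False
        try:
            for chunk in stream(model=model, prompt=prompt, **kwargs):
                emitted = True
                yield chunk
            return
        except NotImplementedError:
            # Falling back after partial output would duplicate text, so re-raise.
            if emitted:
                raise
    # Fallback path: a single, non-streamed completion.
    yield client.generate_completion(model=model, prompt=prompt, **kwargs)["response"]


class NoStreamClient:
    """Stand-in client whose streaming is not supported."""

    def generate_completion(self, model, prompt, **kwargs):
        return {"response": "full answer"}

    def stream_completion(self, model, prompt, **kwargs):
        raise NotImplementedError("streaming not supported")


class StreamingClient(NoStreamClient):
    """Stand-in client that streams two chunks."""

    def stream_completion(self, model, prompt, **kwargs):
        yield "full "
        yield "answer"


if __name__ == "__main__":
    for demo_client in (NoStreamClient(), StreamingClient()):
        print(list(stream_with_fallback(demo_client, "ibm/granite-13b-chat-v2", "hi")))
```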

Troubleshooting

Authentication Errors

If you see authentication errors:

Verify your API key is correct
Check that your project ID matches an existing Watson X project
Ensure your IBM Cloud account has Watson X access

Model Not Found

If you get model not found errors:

Verify the model ID is correct (e.g., ibm/granite-13b-chat-v2)
Check that the model is available in your Watson X instance
Some models may require additional permissions

Connection Errors

If you experience connection issues:

Check your internet connection
Verify the Watson X URL is correct for your region
Check IBM Cloud status page for service outages
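
A quick sanity check of WATSONX_URL can catch common mistakes before any request is sent. The sketch below is only a heuristic: check_watsonx_url is a hypothetical helper, and the expected host suffix is an assumption inferred from the example URL in the Configuration section, so other deployments may legitimately use a different host.

```python
from urllib.parse import urlparse

# Assumption: hosted endpoints follow the pattern of the example URL in this README.
EXPECTED_HOST_SUFFIX = ".ml.cloud.ibm.com"


def check_watsonx_url(url):
    """Return a list of likely problems with the URL (empty if none were found)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL should start with https://")
    if not parsed.hostname:
        problems.append("URL has no hostname")
    elif not parsed.hostname.endswith(EXPECTED_HOST_SUFFIX):
        problems.append(f"hostname does not end with {EXPECTED_HOST_SUFFIX}")
    return problems


if __name__ == "__main__":
    for candidate in ("https://us-south.ml.cloud.ibm.com", "us-south.ml.cloud.ibm.com"):
        print(candidate, check_watsonx_url(candidate))
```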

Cost Considerations

Unlike local Ollama, Watson X is a cloud service with usage-based pricing:

Token-based pricing for generation
Consider your query volume
Monitor usage through IBM Cloud dashboard
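
A rough monthly estimate follows from query volume and average token counts. The sketch below assumes input and output tokens are billed at the same rate. The function estimate_monthly_cost is hypothetical, and the rate used in the example is a placeholder, not an IBM price; check the current Watson X pricing for real figures.

```python
def estimate_monthly_cost(queries_per_day, avg_input_tokens, avg_output_tokens,
                          usd_per_1k_tokens, days=30):
    """Estimate spend as total tokens (input plus output) times a per-1k-token rate."""
    tokens_per_query = avg_input_tokens + avg_output_tokens
    total_tokens = queries_per_day * tokens_per_query * days
    return total_tokens / 1000 * usd_per_1k_tokens


if __name__ == "__main__":
    # Placeholder rate for illustration only.
    cost = estimate_monthly_cost(
        queries_per_day=500,
        avg_input_tokens=1500,
        avg_output_tokens=300,
        usd_per_1k_tokens=0.0006,
    )
    # 500 queries x 1,800 tokens x 30 days = 27,000,000 tokens
    print(f"Estimated monthly cost: ${cost:.2f}")
```

Retrieval-augmented prompts tend to carry long contexts, so the input token count usually dominates the estimate.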

Reverting to Ollama

To switch back to local Ollama:

unset LLM_BACKEND  # or: export LLM_BACKEND=ollama
python -m rag_system.main api

Support

For Watson X specific issues, consult the Watson X documentation.

For LocalGPT issues:

LocalGPT GitHub Issues

Contributing

If you find issues with the Watson X integration or want to add features:

Create an issue describing the problem/feature
Submit a pull request with your changes
Ensure all tests pass

License

This integration follows the same license as LocalGPT (MIT License).
