Feature Request: Add PaddleOCR GPU support to Docker image #533

# Feature Request: Add PaddleOCR GPU support to unstructured-api Docker image

## Summary

The official `unstructured-api` Docker image includes Tesseract but not PaddleOCR. Users who want to use PaddleOCR with GPU acceleration must build a custom image. It would be valuable to have an official GPU-enabled image with PaddleOCR pre-installed.

## Current Behavior

- The `unstructured-api:latest` image only includes Tesseract OCR
- PaddleOCR must be manually installed via `pip install paddlepaddle unstructured-paddleocr`
- There is no GPU-enabled variant of the image
- The `OCR_AGENT` environment variable is ignored (see related issue: OCR_AGENT_BUG_ISSUE.md)

## Proposed Solution

### Option 1: Provide GPU-enabled image tags

Publish additional Docker image variants:

```
unstructured-api:latest-gpu-cu118  # CUDA 11.8
unstructured-api:latest-gpu-cu126  # CUDA 12.6
```

These images would include:
- `paddlepaddle-gpu` from the appropriate CUDA index
- `unstructured-paddleocr`
- NVIDIA CUDA runtime

### Option 2: Add build args to existing Dockerfile

Add build arguments to allow users to build GPU-enabled images:

```dockerfile
ARG USE_GPU=false
ARG CUDA_VERSION=cu118

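# -i replaces the package index for the whole pip command, so paddlepaddle-gpu
# gets its own invocation and unstructured-paddleocr is still pulled from PyPI.
RUN if [ "$USE_GPU" = "true" ]; then \
        pip install --no-cache-dir paddlepaddle-gpu \
            -i https://www.paddlepaddle.org.cn/packages/stable/${CUDA_VERSION}/ && \
        pip install --no-cache-dir unstructured-paddleocr; \
    fi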
```
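As a rough illustration of the install logic such a build argument would drive, the sketch below assembles the pip invocations for a given `USE_GPU` / `CUDA_VERSION` pair. The helper name `pip_commands` is hypothetical and not part of any existing tooling. The index URL pattern is the one used in this issue. The index flag is kept on the Paddle command only, because `-i` applies to every package named in the same pip call.

```python
from typing import List

# URL pattern taken from this issue; the {cuda} segment is e.g. "cu118" or "cu126".
PADDLE_INDEX = "https://www.paddlepaddle.org.cn/packages/stable/{cuda}/"


def pip_commands(use_gpu: bool, cuda_version: str = "cu118") -> List[List[str]]:
    """Return the pip invocations (hypothetical helper) for a CPU or GPU build."""
    base = ["pip", "install", "--no-cache-dir"]
    if use_gpu:
        # The custom index is passed only to the command that installs Paddle.
        paddle = base + [
            "paddlepaddle-gpu",
            "-i",
            PADDLE_INDEX.format(cuda=cuda_version),
        ]
    else:
        paddle = base + ["paddlepaddle"]
    # unstructured-paddleocr is installed separately, from the default index.
    return [paddle, base + ["unstructured-paddleocr"]]


if __name__ == "__main__":
    for cmd in pip_commands(use_gpu=True, cuda_version="cu126"):
        print(" ".join(cmd))
```

Keeping the two installs separate means the same second command works for both the CPU and GPU variants.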

## Implementation Details

The `unstructured` library already supports GPU acceleration for PaddleOCR. In `unstructured/partition/utils/ocr_models/paddle_ocr.py`:

```python
gpu_available = paddle.device.cuda.device_count() > 0
if gpu_available:
    logger.info(f"Loading paddle with GPU on language={language}...")

paddle_ocr = PaddleOCR(
    use_angle_cls=True,
    use_gpu=gpu_available,  # Auto-detects GPU
    lang=language,
    enable_mkldnn=True,
    show_log=False,
)
```

This means the library uses the GPU automatically whenever Paddle can see one. On the packaging side, the only change needed is installing `paddlepaddle-gpu` instead of `paddlepaddle`.
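To confirm which path the library would take inside a given container, a small check along these lines can help. It is a sketch, not part of `unstructured`: the function name `paddle_gpu_status` is made up for illustration, and it assumes `paddle.device.is_compiled_with_cuda()` is available alongside the `paddle.device.cuda.device_count()` call quoted above.

```python
def paddle_gpu_status() -> dict:
    """Report whether Paddle is installed, CUDA-enabled, and sees any GPU."""
    status = {"paddle_installed": False, "compiled_with_cuda": False, "gpu_count": 0}
    try:
        import paddle  # imported lazily so the check also runs without Paddle
    except ImportError:
        return status

    status["paddle_installed"] = True
    # False for the CPU-only `paddlepaddle` wheel.
    status["compiled_with_cuda"] = bool(paddle.device.is_compiled_with_cuda())
    if status["compiled_with_cuda"]:
        # Same call the library uses to decide on use_gpu.
        status["gpu_count"] = int(paddle.device.cuda.device_count())
    return status


if __name__ == "__main__":
    print(paddle_gpu_status())
```

A `gpu_count` of zero with `compiled_with_cuda` set to true usually points at the container not being given access to the GPU rather than at the Python packages.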

## Workaround

Users can extend the official image:

```dockerfile
FROM downloads.unstructured.io/unstructured-io/unstructured-api:latest

USER root

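ARG USE_GPU=false
# Install Paddle first: -i applies to the whole pip command, so the GPU wheel
# gets its own invocation and unstructured-paddleocr is pulled from PyPI.
RUN if [ "$USE_GPU" = "true" ]; then \
        pip install --no-cache-dir paddlepaddle-gpu \
            -i https://www.paddlepaddle.org.cn/packages/stable/cu118/; \
    else \
        pip install --no-cache-dir paddlepaddle; \
    fi && \
    pip install --no-cache-dir unstructured-paddleocr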

USER notebook-user

ENV OCR_AGENT=unstructured.partition.utils.ocr_models.paddle_ocr.OCRAgentPaddle
```

**Note:** This also requires patching `general.py` so that it passes `ocr_agent` to `partition()`; see OCR_AGENT_BUG_ISSUE.md.
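The shape of that patch might look roughly like the sketch below, which reads `OCR_AGENT` from the environment and turns it into a keyword argument. This is an assumption about the fix, not the actual change: `ocr_agent_kwargs` is a hypothetical helper, and the `ocr_agent` keyword is taken from the note above rather than verified against `general.py`.

```python
import os
from typing import Dict, Mapping, Optional


def ocr_agent_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the extra keyword arguments for partition() from OCR_AGENT.

    Returns an empty dict when the variable is unset or blank, so the
    library's own default OCR agent stays in effect.
    """
    env = os.environ if env is None else env
    agent = env.get("OCR_AGENT", "").strip()
    return {"ocr_agent": agent} if agent else {}


if __name__ == "__main__":
    # Hypothetical call site inside the API handler:
    #     elements = partition(file=file, **ocr_agent_kwargs())
    print(ocr_agent_kwargs())
```

Passing the value only when it is set keeps the default behavior unchanged for users who never configure `OCR_AGENT`.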

## Benefits

1. **Performance**: PaddleOCR with GPU is significantly faster than Tesseract for batch processing
2. **Accuracy**: PaddleOCR (especially PP-OCRv4) provides better accuracy on many document types
3. **Ease of use**: Official GPU images eliminate the need for custom Dockerfiles

## Environment

- unstructured-api: latest
- unstructured: 0.18.18+
- PaddlePaddle: 3.2.2
- CUDA: 11.8 / 12.6

## Related Issues

- OCR_AGENT environment variable is ignored (see OCR_AGENT_BUG_ISSUE.md)

## References

- [PaddleOCR GPU Installation](https://www.paddleocr.ai/main/en/quick_start.html)
- [PaddlePaddle CUDA Packages](https://www.paddlepaddle.org.cn/packages/stable/)
- [unstructured PaddleOCR requirements](https://github.com/Unstructured-IO/unstructured/blob/main/requirements/extra-paddleocr.txt)

