This script performs image super-resolution (2x upscaling) locally using the caidas/swin2SR-classical-sr-x2-64 model via the Hugging Face transformers library's image-to-image pipeline.
It takes an image file as input and generates an output image with twice the width and height, aiming for enhanced detail and sharpness compared to simple resizing. The upscaled image is saved as a PNG file.
It includes flexibility for the image input:
- It prioritizes using a local image file path specified within the script.
- If the specified file isn't found, it downloads a sample image (a city street scene) for demonstration.
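The input selection described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `resolve_input_image`, the `sample_url` parameter, and the injectable `download` callable are all hypothetical, and the standard library's `urlretrieve` stands in for the `requests` call the script uses. The sample image URL is not given in this README, so it must be supplied by the caller.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlretrieve


def resolve_input_image(
    user_image_path: str,
    sample_url: str,
    sample_path: str = "sr_sample_image.jpg",
    download: Callable[[str, str], object] = urlretrieve,
) -> Path:
    """Return the user's image if it exists, otherwise fetch the sample image.

    All names here are illustrative assumptions, not the script's own API.
    """
    local = Path(user_image_path)
    if local.is_file():
        # Preferred case: the configured local file is present.
        return local

    # Fallback: download the sample image to sample_path.
    download(sample_url, sample_path)
    sample = Path(sample_path)
    if not sample.is_file():
        raise FileNotFoundError(
            f"No image at {user_image_path!r} and the sample download "
            f"did not produce {sample_path!r}"
        )
    return sample
```

Passing the downloader as a parameter keeps the sketch easy to exercise without network access; a real script could simply call its HTTP client directly.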
- Performs image super-resolution (2x upscale) locally.
- Uses the caidas/swin2SR-classical-sr-x2-64 model (Swin Transformer v2 based).
- Generates a higher-resolution version of the input image.
- Handles user-specified local image files with a fallback to a sample image.
- Saves the resulting upscaled image as a PNG file (super_resolution_output.png).
- Leverages the Hugging Face transformers library.
- Optionally utilizes GPU for faster processing.
- Super-Resolution Model: caidas/swin2SR-classical-sr-x2-64
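For orientation, loading this model through the transformers image-to-image pipeline might look roughly like the sketch below. It is an assumption about how such a script could be structured, not a copy of run_super_resolution.py: the helper names `pick_device` and `upscale` are hypothetical. The heavy imports sit inside the functions so the device check works even where PyTorch is not installed.

```python
from pathlib import Path

MODEL_ID = "caidas/swin2SR-classical-sr-x2-64"


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch the pipeline cannot run, but report CPU
        # so callers get a sensible default.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def upscale(input_path: str,
            output_path: str = "super_resolution_output.png") -> Path:
    """Upscale one image 2x and save it as a PNG (hypothetical helper)."""
    # Imported lazily: these packages are large and only needed here.
    from PIL import Image
    from transformers import pipeline

    upscaler = pipeline("image-to-image", model=MODEL_ID, device=pick_device())
    image = Image.open(input_path).convert("RGB")
    result = upscaler(image)  # a single input yields a single PIL image
    result.save(output_path)
    return Path(output_path)
```

Calling `upscale("my_low_res_image.jpg")` would download the model weights on first use, so the first run needs an internet connection and some disk space.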
Before running the script, ensure you have the following installed:
- Python: Python 3.8 or later recommended.
- System Dependencies: None specific beyond standard build tools.
- Python Libraries: Install using pip in a virtual environment. Standard vision libraries are needed.
pip install transformers torch Pillow torchvision timm requests
  - transformers: The core Hugging Face library.
  - torch: The deep learning framework backend (PyTorch).
  - Pillow: For loading, handling, and saving images.
  - torchvision, timm: Often required/beneficial for vision models like Swin2SR.
  - requests: Used to download the sample image if needed.
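To confirm the installation before running the script, a quick check along these lines can help. It is a sketch only: the function name `missing_packages` and the mapping are illustrative and not part of the script. Note that Pillow is installed under the name Pillow but imported as `PIL`.

```python
import importlib.util

# pip package name -> module name to import (Pillow imports as PIL).
REQUIRED = {
    "transformers": "transformers",
    "torch": "torch",
    "Pillow": "PIL",
    "torchvision": "torchvision",
    "timm": "timm",
    "requests": "requests",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```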
- Clone or Download: Get the run_super_resolution.py script onto your local machine.
- Create Virtual Environment (Recommended):
    python3 -m venv .venv
    source .venv/bin/activate
  (Use .\.venv\Scripts\activate on Windows)
- Install Python Libraries: Run the pip command from the Prerequisites section within your activated virtual environment.
- Configure Image Input:
  - Open the run_super_resolution.py script in a text editor.
  - Locate the line: user_image_path = "my_low_res_image.jpg"
  - Option A (Recommended): Change the path to the exact path of the image file you want to upscale. For best results, use a modest-resolution image (e.g., 640x480 or 1024x768) rather than one that is already very high resolution.
  - Option B: Place your image file in the same directory as the script and name it my_low_res_image.jpg.
  - Fallback: If no file is found at user_image_path, the script downloads and uses the sample image (sr_sample_image.jpg).
- Run the Script:
  - Open your terminal or command prompt.
  - Make sure your virtual environment is activated.
  - Navigate to the directory containing the script.
  - Execute the script using Python:
    python run_super_resolution.py
The script will print status messages, including the dimensions of the input image. The primary output is a saved image file:
- An upscaled image will be saved as super_resolution_output.png in the same directory.
- The script will print the dimensions of this output image, which should be 2x the width and 2x the height of the input image.
- Visually compare super_resolution_output.png to the input image; it should appear larger and potentially sharper or more detailed (results vary depending on the input).
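The 2x size relationship can also be checked programmatically. The helpers below are a hypothetical sketch (the names `expected_size` and `is_expected_upscale` do not come from the script). One caveat, stated as an assumption rather than a guarantee: depending on the library version, preprocessing may pad the input to a multiple of the model's window size, so an output slightly larger than exactly 2x would not necessarily indicate a bug.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height), the order Pillow's Image.size uses


def expected_size(input_size: Size, scale: int = 2) -> Size:
    """Return the output size for an exact upscale by `scale`."""
    width, height = input_size
    return (width * scale, height * scale)


def is_expected_upscale(input_size: Size, output_size: Size,
                        scale: int = 2) -> bool:
    """True when the output is exactly `scale` times the input in both axes."""
    return output_size == expected_size(input_size, scale)


if __name__ == "__main__":
    # A 640x480 input should produce a 1280x960 output at 2x.
    print(expected_size((640, 480)))
    print(is_expected_upscale((640, 480), (1280, 960)))
```

With Pillow installed, the two sizes can be read via `Image.open(path).size` for the input and output files.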
- File Not Found errors: Double-check the user_image_path. Check internet connection if relying on the fallback. Ensure the image file is readable.
- Library Import Errors: Ensure all required libraries (transformers, torch, Pillow, torchvision, timm, requests) are installed.
- Errors during Upscaling: Ensure the input image file is valid (not corrupted). Very large input images might exceed available RAM or VRAM. Check console for specific errors (e.g., memory errors).
- Output Quality: Super-resolution quality depends heavily on the input image content, the model's capabilities, and the upscaling factor (fixed at 2x here). Artifacts can sometimes occur, especially on heavily compressed or noisy input images.
- CPU: Possible but likely very slow for image-to-image models like Swin2SR.
- GPU: NVIDIA GPU is strongly recommended for this task due to the computational cost.
- RAM/VRAM: Processing images, especially for upscaling, can be memory-intensive. Ensure sufficient system RAM and particularly GPU VRAM.
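As a rough guide to why memory grows quickly, the arithmetic below estimates the size of just the raw input and output tensors, assuming 3 channels stored as 32-bit floats. This is a lower bound only: the model's weights and intermediate activations typically need far more, and the function names are illustrative.

```python
def raw_tensor_bytes(width: int, height: int,
                     channels: int = 3, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold one image as a dense float32 tensor."""
    return width * height * channels * bytes_per_value


def io_tensor_mib(width: int, height: int, scale: int = 2):
    """Return (input_MiB, output_MiB) for the raw tensors alone."""
    mib = 1024 * 1024
    input_bytes = raw_tensor_bytes(width, height)
    # Upscaling by `scale` multiplies the pixel count by scale squared.
    output_bytes = raw_tensor_bytes(width * scale, height * scale)
    return (input_bytes / mib, output_bytes / mib)


if __name__ == "__main__":
    # 1024x768 input: 9 MiB in, 36 MiB out, before any model activations.
    print(io_tensor_mib(1024, 768))
```

Because the output has four times the pixels of the input at 2x, doubling the input's width and height multiplies both figures by four.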
- The run_super_resolution.py script is provided as an example (consider MIT License).
- Hugging Face libraries are typically Apache 2.0 licensed.
- The caidas/swin2SR-classical-sr-x2-64 model license should be checked on its model card (often Apache 2.0 or similar permissive licenses).