Object Detection

한국어 | English

The MLOps platform to Let your AI run

Introduction

We use Link, which is included in Runway, to train an object detection model and save it. We then build and save a pipeline so that the training code can be reused for retraining.

📘 For quick execution, you can use the Jupyter Notebook provided below. If you download and run the Jupyter Notebook, a model named "my-detection-model" will be created and saved in Runway.

object detection notebook

Package Preparation

Install the required packages for the tutorial.

!pip install torch torchvision Pillow seaborn torchmetrics pycocotools matplotlib pandas

Data

This tutorial uses a subset of the publicly available COCO dataset.

📘 The COCO sample dataset used in this tutorial is located in the ./dataset directory, and can also be downloaded from the link below if needed. coco-sample-dataset.zip

Load Data

Check the path of the dataset file in the file explorer.

Assign the dataset file path to the RUNWAY_DATA_PATH parameter.

import os
from pycocotools.coco import COCO

RUNWAY_DATA_PATH = "/home/jovyan/workspace/examples/tutorial/object_detection/dataset"
config_file = None
for dirname, _, filenames in os.walk(RUNWAY_DATA_PATH):
    for filename in filenames:
        if filename.endswith(".json"):
            config_file = os.path.join(dirname, filename)

if config_file is None:
    raise ValueError("Can't find config file in given dataset")

coco = COCO(config_file)
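
For orientation, the sketch below builds a tiny COCO-style annotation file in a temporary directory and locates it with the same os.walk search used above. The helper name find_annotation_file, the file name annotations.json, and the toy contents are assumptions for illustration only; the real dataset's layout may differ.

```python
import json
import os
import tempfile


def find_annotation_file(root):
    """Return the path of the last .json file found under root, or None."""
    found = None
    for dirname, _, filenames in os.walk(root):
        for filename in sorted(filenames):
            if filename.endswith(".json"):
                found = os.path.join(dirname, filename)
    return found


# Toy annotation file with the three top-level keys a COCO file carries.
toy = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "bbox": [100, 50, 200, 120], "area": 24000, "iscrowd": 0}
    ],
    "categories": [{"id": 1, "name": "person"}],
}

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "annotations.json"), "w") as f:
        json.dump(toy, f)
    path = find_annotation_file(tmp)
    with open(path) as f:
        loaded = json.load(f)

print(sorted(loaded.keys()))
```

Each entry in "annotations" points back to an image through image_id, which is what getAnnIds and loadAnns rely on in the dataset class.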

Extract a sample image

Pick one sample from the dataset and display the image.

from pathlib import Path

from matplotlib.pyplot import imshow
from PIL import Image


sample_image_path = next(Path(RUNWAY_DATA_PATH).glob("*.jpg"))
image_filename_list = [sample_image_path]

img = Image.open(sample_image_path)
imshow(img)

COCO Dataset

To train the model, define a dataset class that subclasses the Dataset class provided by PyTorch.

from PIL import Image
from pathlib import Path
from pycocotools.coco import COCO
import torch
from torch.utils.data import Dataset
from torchvision import transforms as T


def get_transforms():
    transforms = []
    transforms.append(T.ToTensor())
    return T.Compose(transforms)


def collate_fn(batch):
    return tuple(zip(*batch))


class COCODataset(Dataset):
    def __init__(self, data_root, coco, transforms=None):
        self.data_root = Path(data_root)
        self.transforms = transforms
        # pre-loaded variables
        self.coco = coco
        self.ids = list(sorted(self.coco.imgs.keys()))

    def __getitem__(self, index):
        ## refer to https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
        img_id = self.ids[index]
        ann_ids = self.coco.getAnnIds(imgIds=img_id)
        ann = self.coco.loadAnns(ann_ids)
        img_path = self.data_root / self.coco.loadImgs(img_id)[0]["file_name"]
        img = Image.open(img_path).convert("RGB")  # force 3 channels in case an image is grayscale
        num_objs = len(ann)

        boxes = []
        for i in range(num_objs):
            boxes.append([
                ann[i]["bbox"][0],
                ann[i]["bbox"][1],
                ann[i]["bbox"][2] + ann[i]["bbox"][0],
                ann[i]["bbox"][3] + ann[i]["bbox"][1],
            ])

        areas = []
        for i in range(num_objs):
            areas.append(ann[i]["area"])

        target = {
            "boxes": torch.as_tensor(boxes, dtype=torch.float32),
            "labels": torch.ones((num_objs,), dtype=torch.int64),
            "image_id": torch.tensor([img_id]),
            "area": torch.as_tensor(areas, dtype=torch.float32),
            "iscrowd": torch.zeros((num_objs,), dtype=torch.int64),
        }

        ## transform image
        if self.transforms is not None:
            img = self.transforms(img)

        return img, target

    def __len__(self):
        return len(self.ids)
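
The box arithmetic in __getitem__ converts COCO boxes, stored as [x, y, width, height], into the [x_min, y_min, x_max, y_max] corner format that torchvision detection models expect. A minimal sketch of that conversion on its own, where xywh_to_xyxy is a hypothetical helper name and the numbers are toy values:

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO box [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


coco_box = [100.0, 50.0, 200.0, 120.0]  # toy values, not taken from the dataset
print(xywh_to_xyxy(coco_box))  # [100.0, 50.0, 300.0, 170.0]
```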

Create a data loader from the dataset class defined above.

from torch.utils.data import DataLoader

## Define Train dataset
data_root = Path(RUNWAY_DATA_PATH).parent
dataset = COCODataset(data_root, coco, get_transforms())

data_loader = DataLoader(
    dataset,
    batch_size=2,
    shuffle=True,
    num_workers=4,
    collate_fn=collate_fn
)
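
The custom collate_fn is needed because the default collation tries to stack samples into one tensor, which does not work when images differ in size or contain different numbers of boxes. The sketch below shows what tuple(zip(*batch)) does, using strings and plain dicts as stand-ins for image tensors and target dicts:

```python
def collate_fn(batch):
    # Transpose [(img, target), ...] into ((img, ...), (target, ...)).
    return tuple(zip(*batch))


# Strings and dicts stand in for image tensors and target dicts.
batch = [("img_a", {"boxes": 2}), ("img_b", {"boxes": 5})]
imgs, targets = collate_fn(batch)
print(imgs)     # ('img_a', 'img_b')
print(targets)  # ({'boxes': 2}, {'boxes': 5})
```

This is why the training loop can unpack each batch directly as imgs, annotations.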

Model

Model Declaration

Declare the model to be used for training. In this tutorial, we use the fasterrcnn_resnet50_fpn model from PyTorch.

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn


# Define local variables
print(torch.cuda.is_available())
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

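try:
    # Pretrained weights need network access; listing the hub entry points
    # raises an error when the network is unreachable.
    entrypoints = torch.hub.list('pytorch/vision', force_reload=True)
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT").to(device)
except Exception:
    # Fall back to a randomly initialized model without downloading weights.
    model = fasterrcnn_resnet50_fpn(weights=None, weights_backbone=None).to(device)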

Model Training

📘 For guidance on registering Link parameters, refer to the Set Pipeline Parameter guide.

Register the Link parameter N_EPOCHS with a value of 1 to set the number of training epochs.

Train the declared model using the data loader created above and evaluate the trained model.

import torch.optim as optim
from torchmetrics.detection.mean_ap import MeanAveragePrecision


params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(params, lr=1e-5)

model.train()
for epoch in range(N_EPOCHS):
    for imgs, annotations in data_loader:
        imgs = list(img.to(device) for img in imgs)
        annotations = [{k: v.to(device) for k, v in t.items()} for t in annotations]
        loss_dict = model(imgs, annotations)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

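# Switch to evaluation mode and release cached GPU memory before evaluating
model.eval()
torch.cuda.empty_cache()

# Evaluate the trained model with mean average precision (mAP)
map_metric = MeanAveragePrecision().to(device)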
with torch.no_grad():
    preds = []
    annos = []
    for imgs, annotations in data_loader:
        pred = model(list(img.to(device) for img in imgs))
        anno = [{k: v.to(device) for k, v in t.items()} for t in annotations]
        preds.extend(pred)
        annos.extend(anno)

map_metric.update(preds, annos)
map_score = map_metric.compute()

torch.cuda.empty_cache()
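
Mean average precision scores detections by comparing predicted boxes with ground-truth boxes through their intersection over union (IoU). The sketch below computes IoU for two boxes in [x_min, y_min, x_max, y_max] format. iou_xyxy is a hypothetical helper written for illustration, not part of torchmetrics, and the boxes are toy values.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as [x_min, y_min, x_max, y_max]."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred_box = [0.0, 0.0, 10.0, 10.0]
true_box = [5.0, 0.0, 15.0, 10.0]
print(iou_xyxy(pred_box, true_box))  # overlap 50 / union 150
```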

Model Inference

Model Wrapping Class Declaration

Create a ModelWrapper to serve the trained model.

import io
import base64

import torch
import pandas as pd
import numpy as np
from torchvision import transforms
from PIL import Image, ImageDraw, ImageFont


class ModelWrapper:
    def __init__(self, model, device):
        self.model = model
        self.device = device

    def bytesarray_to_tensor(self, bytes_array: str):
        # input: base64-encoded image bytes, given as a UTF-8 string
        encoded_bytes_array = bytes_array.encode("utf-8")
        # base64-decode back to the raw image file bytes
        img_64_decode = base64.b64decode(encoded_bytes_array)
        # open the image, force 3 channels, and convert it to a tensor
        image_from_bytes = Image.open(io.BytesIO(img_64_decode)).convert("RGB")
        return transforms.ToTensor()(image_from_bytes).to(self.device)

    def numpy_to_bytesarray(self, numpy_array):
        numpy_array_bytes_array = numpy_array.tobytes()
        numpy_array_64_encode = base64.b64encode(numpy_array_bytes_array)
        bytes_array = numpy_array_64_encode.decode("utf-8")
        return bytes_array

    def draw_detection(self, img_tensor, bboxes, labels, scores, out_img_file):
        """Draw detection result."""
        # move to CPU first so this also works when the tensor lives on a GPU
        img_array = img_tensor.permute(1, 2, 0).cpu().numpy() * 255
        img = Image.fromarray(img_array.astype(np.uint8))

        draw = ImageDraw.Draw(img)
        font = ImageFont.load_default()
        bboxes = bboxes.cpu().numpy().astype(np.int32)
        labels = labels.cpu().numpy()
        scores = scores.cpu().numpy()
        for box, label, score in zip(bboxes, labels, scores):
            draw.rectangle([(box[0], box[1]), (box[2], box[3])], outline="red", width=1)
            text = f"{label}: {score:.2f}"
            draw.text((box[0], box[1]), text, fill="red", font=font)
        img.save(out_img_file)
        return img

    @torch.no_grad()
    def predict(self, df):
        self.model.eval()
        # df is a single-column DataFrame of base64-encoded image strings
        tensor_list = [self.bytesarray_to_tensor(b) for b in df.squeeze(axis=1).to_list()]

        pred_images = []
        pred_image_shape_c = []
        pred_image_shape_h = []
        pred_image_shape_w = []
        pred_image_dtypes = []

        boxes = []
        labels = []
        scores = []

        boxes_dtypes = []
        labels_dtypes = []
        scores_dtypes = []

        for img in tensor_list:
            output = self.model(img.unsqueeze(0))
            detect_img = self.draw_detection(
                img_tensor=img,
                bboxes=output[0]["boxes"],
                labels=output[0]["labels"],
                scores=output[0]["scores"],
                out_img_file="test.png",
            )
            detect_img = np.array(detect_img)
            h, w, c = detect_img.shape
            box = output[0]["boxes"].cpu().numpy()
            label = output[0]["labels"].cpu().numpy()
            score = output[0]["scores"].cpu().numpy()

            pred_images += [detect_img]
            boxes += [box]
            labels += [label]
            scores += [score]

            pred_image_shape_c += [c]
            pred_image_shape_h += [h]
            pred_image_shape_w += [w]

            pred_image_dtypes += [str(detect_img.dtype)]
            boxes_dtypes += [str(box.dtype)]
            labels_dtypes += [str(label.dtype)]
            scores_dtypes += [str(score.dtype)]

            torch.cuda.empty_cache()

        meta = pd.DataFrame({
            "pred_image_shape_c": pred_image_shape_c,
            "pred_image_shape_h": pred_image_shape_h,
            "pred_image_shape_w": pred_image_shape_w,
            "output_dtype": pred_image_dtypes,
            "boxes_dtypes": boxes_dtypes,
            "labels_dtypes": labels_dtypes,
            "scores_dtypes": scores_dtypes,
        })
        img_byte = pd.DataFrame({
            "output": pred_images,
            "boxes": boxes,
            "labels": labels,
            "scores": scores,
            # "true": tensor_list,
        }).applymap(lambda x: self.numpy_to_bytesarray(x))
        return pd.concat([meta, img_byte], axis="columns")
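
The metadata columns exist because tobytes() keeps only the raw buffer: shape and dtype are lost and must travel alongside the encoded string. A minimal round-trip sketch follows. numpy_to_b64 and b64_to_numpy are hypothetical helper names, the boxes are toy values, and the (-1, 4) reshape assumes each box has four coordinates, as the detection model's output does.

```python
import base64

import numpy as np


def numpy_to_b64(array):
    """Serialize an array's raw buffer as a base64 string (shape and dtype are lost)."""
    return base64.b64encode(array.tobytes()).decode("utf-8")


def b64_to_numpy(text, dtype, shape):
    """Rebuild an array from the base64 string plus separately stored dtype and shape."""
    return np.frombuffer(base64.b64decode(text), dtype=dtype).reshape(shape)


boxes = np.array([[10, 20, 110, 220], [5, 5, 50, 60]], dtype=np.float32)  # toy boxes
encoded = numpy_to_b64(boxes)
restored = b64_to_numpy(encoded, dtype="float32", shape=(-1, 4))
print(restored.shape)  # (2, 4)
```

A client decoding the "boxes", "labels", or "scores" columns would pass the matching value from the corresponding dtype column.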

Sample Image Inference

Runway currently supports only the DataFrame format for input and output in API serving. To satisfy this, write code that converts the input images to base64-encoded strings (referred to as bytearrays in this tutorial).

import base64
import pandas as pd


def convert_image_to_bytearray(img_binary):
    image_64_encode = base64.b64encode(img_binary)
    bytes_array = image_64_encode.decode("utf-8")
    return bytes_array


def images_to_bytearray_df(image_filename_list: list):
    df_list = []
    for img_filename in image_filename_list:
        # read the image file as raw bytes; the with-block closes the file
        with open(img_filename, "rb") as image:
            image_read = image.read()
        df_list.append(convert_image_to_bytearray(image_read))
    return pd.DataFrame(df_list, columns=["image_data"])
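
Base64 turns arbitrary binary data into plain ASCII text, which is why the encoded image can sit in a DataFrame cell and survive text-based serialization. The sketch below illustrates this with a few toy bytes and a JSON round trip. The payload shape is purely hypothetical and is not Runway's actual request schema.

```python
import base64
import json

# Toy bytes standing in for the contents of an image file.
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])
encoded = base64.b64encode(raw).decode("utf-8")  # text-safe string

# Hypothetical payload, used only to show that the string survives JSON.
payload = json.dumps({"image_data": [encoded]})
decoded = base64.b64decode(json.loads(payload)["image_data"][0])

print(encoded)
print(decoded == raw)  # True
```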

Create an input_sample using the data and conversion code above, and perform inference using the wrapped model.

model = model.cpu()
device = "cpu"
serve_model = ModelWrapper(model=model, device=device)

# make input sample
input_sample = images_to_bytearray_df(image_filename_list)

# For inference
pred = serve_model.predict(input_sample)

output = pred.loc[0]
data, dtype = output["output"], output["output_dtype"]
c, h, w = output["pred_image_shape_c"], output["pred_image_shape_h"], output["pred_image_shape_w"]

type_dict = {"uint8": np.uint8, "float32": np.float32, "int64": np.int64}
pred_decode = base64.b64decode(data)
pred_array = np.frombuffer(pred_decode, dtype=type_dict[dtype])

img = Image.fromarray(pred_array.reshape(h, w, c))

imshow(img)

Check the inference results.

Model Registration

Use the model registration code snippet in the Runway platform to register (log_model) the trained model and record related information.

import mlflow
import runway

del map_score["classes"]
with mlflow.start_run():
    mlflow.log_metrics(map_score)

    runway.log_model(
        model=serve_model,
        input_samples={"predict": input_sample},
        model_name="my-detection-model",
    )

Pipeline Configuration and Saving

📘 For specific guidance on creating a pipeline, refer to the Create a pipeline guide.

1. Write and verify the pipeline in Link to ensure it runs smoothly.
2. After verifying successful execution, click the Upload pipeline button in the Link pipeline panel.
3. Click the New Pipeline button.
4. In the Pipeline field, enter the name under which the pipeline will be saved in Runway.
5. The Pipeline version field automatically selects version 1.
6. Click the Upload button.
7. Once the upload is complete, the uploaded pipeline appears on the Pipeline page within the project.

Model Deployment

📘 For specific guidance on model deployment, refer to the Model Deployment guide.

Demo Service

To test the deployed model, you can use the demo website.
When you open the demo site, you will see its main screen.
Fill in the API Endpoint and API Token, then upload the image for prediction.
You will then receive the prediction result.
