Waymo Open Dataset License Notice
The fine-tuned EfficientDet model weights described in this document were developed using the Waymo Open Dataset and are released under the Waymo Dataset License Agreement for Non-Commercial Use. By downloading or using these model weights, you agree that:
- These models are for non-commercial use only. Any use, modification, or redistribution is subject to the terms of the Waymo Dataset License Agreement for Non-Commercial Use, including the non-commercial restrictions therein.
- Any further downstream use or modification of these models is subject to the same agreement.
- A statement of the applicable Waymo Dataset License terms is included in this repository at WAYMO_LICENSE. The full agreement is available at waymo.com/open/terms.
These models were made using the Waymo Open Dataset, provided by Waymo LLC.
This document covers the fine-tuned EfficientDet models used by TURBO for object detection, including download instructions, configuration, and inference pipeline details.
TURBO uses five variants of EfficientDet fine-tuned on the Waymo Open Dataset for 5-class 2D object detection. The models are implemented as PyTorch Lightning modules using the effdet library, and are loaded from checkpoint files (.ckpt) at server startup.
All models are trained to detect the following 5 classes from the Waymo Open Dataset:
| Class ID | Label |
|---|---|
| 0 | Vehicle |
| 1 | Pedestrian |
| 2 | Cyclist |
| 3 | Sign |
| 4 | Unknown |
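For reference, the class mapping above can be expressed as a small lookup table. This is a minimal sketch; the names `WAYMO_CLASSES` and `label_for` are illustrative and not taken from the TURBO codebase:

```python
# Class-ID-to-label mapping for the 5 Waymo Open Dataset classes
# described in the table above. Names here are illustrative only.
WAYMO_CLASSES = {
    0: "Vehicle",
    1: "Pedestrian",
    2: "Cyclist",
    3: "Sign",
    4: "Unknown",
}

def label_for(class_id: int) -> str:
    """Return the human-readable label for a predicted class ID."""
    return WAYMO_CLASSES.get(class_id, "Invalid")

print(label_for(1))  # Pedestrian
```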
| Variant | Base Model | Input Resolution | Role | Notes |
|---|---|---|---|---|
| EfficientDet-D1 | tf_efficientdet_d1 | 640 x 640 | Client backup (on-vehicle) | Runs locally on the AV when cloud offloading is infeasible |
| EfficientDet-D2 | tf_efficientdet_d2 | 768 x 768 | Server model | Smallest server-side model |
| EfficientDet-D4 | tf_efficientdet_d4 | 1024 x 1024 | Server model | Mid-range accuracy/speed |
| EfficientDet-D6 | tf_efficientdet_d6 | 1280 x 1280 | Server model | High accuracy, slower inference |
| EfficientDet-D7x | tf_efficientdet_d7x | 1536 x 1536 | Server model | Highest accuracy, longest inference |
Larger models produce higher detection accuracy (mAP) but require more GPU compute time, leaving less of the latency SLO budget for network transfer. See ARCHITECTURE.md for the full accuracy/latency/bandwidth trade-off table.
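The budget arithmetic behind this trade-off can be sketched simply: whatever the GPU consumes for inference is unavailable for network transfer within the end-to-end latency SLO. All numbers below are illustrative, not measured TURBO values:

```python
def transfer_budget_ms(slo_ms: float, inference_ms: float) -> float:
    """Time remaining for network transfer after GPU inference.

    Illustrative arithmetic only; real SLO and inference times come
    from the profiling described in ARCHITECTURE.md.
    """
    return max(slo_ms - inference_ms, 0.0)

# A larger model spends more of the SLO on inference,
# leaving less headroom for the network round trip.
print(transfer_budget_ms(100.0, 35.0))  # 65.0
print(transfer_budget_ms(100.0, 80.0))  # 20.0
```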
The fine-tuned model checkpoints are hosted on Google Cloud Storage. Download and extract them before running the server:
```bash
# Download the model archive (large file; ensure sufficient disk space)
wget https://storage.googleapis.com/turbo-nines-2026/av-models.zip

# Extract to a directory of your choice
unzip av-models.zip -d ~/av-models
```

Alternatively, using curl:

```bash
curl -o av-models.zip https://storage.googleapis.com/turbo-nines-2026/av-models.zip
unzip av-models.zip -d ~/av-models
```

After extraction, the checkpoint files should be organized as follows. Note that the version_N/ directory varies per model (reflecting the training run used to produce the best checkpoint):
```
av-models/
├── tf_efficientdet_d1-waymo-open-dataset/
│   └── version_1/
│       └── checkpoints/
│           └── epoch=9-step=209850.ckpt
├── tf_efficientdet_d2-waymo-open-dataset/
│   └── version_2/
│       └── checkpoints/
│           └── epoch=9-step=419700.ckpt
├── tf_efficientdet_d4-waymo-open-dataset/
│   └── version_0/
│       └── checkpoints/
│           └── epoch=9-step=839400.ckpt
├── tf_efficientdet_d6-waymo-open-dataset/
│   └── version_2/
│       └── checkpoints/
│           └── epoch=9-step=3357600.ckpt
└── tf_efficientdet_d7x-waymo-open-dataset/
    └── version_1/
        └── checkpoints/
            └── epoch=8-step=1477071.ckpt
```
After downloading, update the checkpoint paths in two configuration files to point to your extracted model directory.
The server_model_list section defines the models available to each ModelServer. Update each checkpoint_path to match your extraction directory:
```yaml
server_model_list:
  - checkpoint_path: /home/user/av-models/tf_efficientdet_d2-waymo-open-dataset/version_2/checkpoints/epoch=9-step=419700.ckpt
    num_classes: 5
    image_size: [768, 768]
    base_model: "tf_efficientdet_d2"
  - checkpoint_path: /home/user/av-models/tf_efficientdet_d4-waymo-open-dataset/version_0/checkpoints/epoch=9-step=839400.ckpt
    num_classes: 5
    image_size: [1024, 1024]
    base_model: "tf_efficientdet_d4"
  # ... etc.
```

This file defines the model metadata used for standalone model server testing. Update the checkpoint_path fields under both server_models and client_backup_model:
```yaml
server_models:
  "tf_efficientdet_d2":
    base_model: "tf_efficientdet_d2"
    checkpoint_path: /home/user/av-models/tf_efficientdet_d2-waymo-open-dataset/version_2/checkpoints/epoch=9-step=419700.ckpt
    device: "cuda:0"
    num_classes: 5
    image_size: [768, 768]
  # ... update all entries similarly

client_backup_model:
  name: "edd1-imgcompNone-inpcompNone"
  checkpoint_path: /home/user/av-models/tf_efficientdet_d1-waymo-open-dataset/version_1/checkpoints/epoch=9-step=209850.ckpt
  device: "cuda:1"
  num_classes: 5
```

Important: The device field (e.g., cuda:0, cuda:2) must match the GPUs available on your server. Adjust these based on your hardware. See CONFIGURATION.md for the full configuration reference.
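A configured device string can be sanity-checked against the number of GPUs on the host before model loading fails at startup. A minimal pure-string sketch (no torch import); the function name is illustrative, and it assumes device strings of the form "cuda:&lt;index&gt;" or "cpu" as in the config above:

```python
def validate_device(device: str, num_gpus: int) -> None:
    """Raise ValueError if a device string does not map to an available GPU.

    Illustrative helper: checks only the string format and index range;
    it does not query the CUDA runtime.
    """
    if device == "cpu":
        return
    prefix, _, index = device.partition(":")
    if prefix != "cuda" or not index.isdigit():
        raise ValueError(f"Unrecognized device string: {device!r}")
    if int(index) >= num_gpus:
        raise ValueError(f"{device} not available: only {num_gpus} GPU(s) present")

validate_device("cuda:0", num_gpus=2)   # OK on a 2-GPU host
# validate_device("cuda:2", num_gpus=2) # would raise ValueError
```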
At server startup, each ModelServer process loads its assigned model checkpoints using EfficientDetProfiler:
- Load checkpoint — EfficientDetModel.load_from_checkpoint() restores the PyTorch Lightning module from the .ckpt file, using the base_model name and num_classes to reconstruct the architecture.
- Extract model — The inner EfficientDet model is unwrapped from the Lightning module and wrapped in effdet.DetBenchPredict for inference mode.
- Move to GPU — The model is transferred to the specified CUDA device.
- Precompute normalization — ImageNet mean/std tensors are precomputed on the target device for efficient inference-time normalization.
Input images are preprocessed using effdet.data.transforms.transforms_coco_eval, which applies:
- Resize to the model's native input resolution (e.g., 768x768 for D2, 1024x1024 for D4)
- Normalization using ImageNet mean and standard deviation values
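The per-channel normalization math can be illustrated in pure Python using the standard ImageNet statistics (as used by effdet/timm). The real pipeline applies this to whole GPU tensors, not individual pixels:

```python
# Standard ImageNet per-channel statistics (RGB), as used by effdet/timm.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one RGB pixel given as 0-255 ints.

    Pure-Python illustration of (pixel/255 - mean) / std; the actual
    preprocessing operates on full image tensors on the GPU.
    """
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )

# A pixel near the ImageNet mean normalizes to values close to zero.
print(normalize_pixel((124, 116, 104)))
```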
Preprocessing can happen either client-side (input compression mode) or server-side (image compression mode), depending on the active model configuration. See ARCHITECTURE.md for details on these two compression strategies.
The EfficientDetProfiler.predict() method:
- Normalizes the input tensor (subtract ImageNet mean, divide by std)
- Runs the DetBenchPredict forward pass on GPU
- Returns detection outputs as [x_min, y_min, x_max, y_max, score, label] tensors
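Given that output format, downstream consumers typically filter by confidence score. A minimal sketch using plain lists in place of the real GPU tensors; the threshold value and function name are illustrative, not taken from TURBO:

```python
def filter_detections(detections, score_threshold=0.4):
    """Keep detections whose confidence score meets the threshold.

    Each detection is [x_min, y_min, x_max, y_max, score, label];
    plain lists stand in for the real GPU tensors, and the default
    threshold is an illustrative assumption.
    """
    return [det for det in detections if det[4] >= score_threshold]

dets = [
    [10.0, 20.0, 110.0, 220.0, 0.92, 0],  # high-confidence Vehicle
    [30.0, 40.0, 60.0, 90.0, 0.15, 1],    # low-confidence Pedestrian
]
print(filter_detections(dets))  # keeps only the first detection
```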
For the full system architecture and end-to-end data flow, see ARCHITECTURE.md. For configuration file details, see CONFIGURATION.md.