- For the example code, the tritonserver image must include `opencv-python`.
- Run `scripts/create_image.sh` to create the Docker image we need:

```bash
cd scripts
chmod 775 create_image.sh
./create_image.sh
```

- This repository needs the MobileNetV2 and ResNet50 ONNX files to run the example code.
- You can download model files from Google Drive
- Or you can create model files manually with PyTorch
- Notice that a model must have a dynamic axis on the batch dimension if you want to use dynamic batching, as in the export script below:
```python
import torch
from torchvision.models import (
    resnet50,
    ResNet50_Weights,
    mobilenet_v2,
    MobileNet_V2_Weights,
)

targets = (
    (resnet50, ResNet50_Weights, "resnet50.onnx"),
    (mobilenet_v2, MobileNet_V2_Weights, "mobilenetv2.onnx"),
)
for module, weights, fname in targets:
    model = module(weights=weights.IMAGENET1K_V2)
    model.eval()
    torch.onnx.export(
        model,
        torch.randn((1, 3, 224, 224), dtype=torch.float32),
        fname,
        opset_version=17,
        input_names=["input"],
        output_names=["output"],
        # Mark the batch dimension as dynamic so dynamic batching works.
        dynamic_axes={
            "input": {0: "batch_size"},
            "output": {0: "batch_size"},
        },
    )
```
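If you want to confirm the export worked, a quick sanity check with the `onnx` package (assuming it is installed) is sketched below; a dynamic axis is stored as a named `dim_param` instead of a fixed `dim_value`:

```python
import onnx

for fname in ("resnet50.onnx", "mobilenetv2.onnx"):
    model = onnx.load(fname)
    onnx.checker.check_model(model)
    for tensor in model.graph.input:
        dims = tensor.type.tensor_type.shape.dim
        # A dynamic axis prints as its name, a fixed one as its size.
        print(fname, tensor.name, [d.dim_param or d.dim_value for d in dims])
        # Expected: ... input ['batch_size', 3, 224, 224]
```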
- The ONNX files should be located in `model_repository` as shown below.
- The model file must be named `model.onnx`.
```bash
tree ./model_repository
# model_repository/
# ├── classification
# │   ├── mobilenetv2
# │   │   ├── mobilenetv2
# │   │   │   ├── 1
# │   │   │   └── config.pbtxt
# │   │   ├── mobilenetv2_config
# │   │   │   ├── 1
# │   │   │   │   └── model.py
# │   │   │   └── config.pbtxt
# │   │   └── mobilenetv2_onnx
# │   │       ├── 1
# │   │       │   └── model.onnx
# │   │       └── config.pbtxt
# │   └── resnet50
# │       ├── resnet50
# │       │   ├── 1
# │       │   └── config.pbtxt
# │       ├── resnet50_config
# │       │   ├── 1
# │       │   │   └── model.py
# │       │   └── config.pbtxt
# │       └── resnet50_onnx
# │           ├── 1
# │           │   └── model.onnx
# │           └── config.pbtxt
# └── common_utils
#     ├── classifier_postprocessor
#     │   ├── 1
#     │   │   └── model.py
#     │   └── config.pbtxt
#     └── image_preprocessor
#         ├── 1
#         │   └── model.py
#         └── config.pbtxt
```

- If you turn `dynamic_batching` on, tritonserver can handle a large number of requests by grouping them into batches. Otherwise, every request is processed as if the batch size were 1.
- See more about it here.
- You can add `dynamic_batching` to the config files, except for the ensemble model.
- When `dynamic_batching` is on, `max_batch_size` should be set larger than `0`.
- Notice that `dynamic_batching` is not compatible with ensemble models; an ensemble is handled by `ensemble_scheduling` instead. Just add `dynamic_batching` to each model used inside the ensemble, as in the config below:
```protobuf
name: "mobilenetv2_config"
backend: "python"

max_batch_size: 8
instance_group [
  {
    kind: KIND_CPU
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```
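To see dynamic batching in action, you can fire several requests concurrently so they arrive within the queue delay window. Below is a minimal sketch using `tritonclient` (installable via `pip install tritonclient[http]`), assuming the server from the next section is running on `localhost:8000`; the tensor name `"INPUT"` and its shape are placeholders, so use the names and dims from your actual `config.pbtxt`:

```python
import numpy as np
import tritonclient.http as httpclient

# Send several requests at once so the server can group them into a batch.
client = httpclient.InferenceServerClient(url="localhost:8000", concurrency=8)

pending = []
for _ in range(8):
    data = np.random.rand(1, 3, 224, 224).astype(np.float32)
    infer_input = httpclient.InferInput("INPUT", list(data.shape), "FP32")
    infer_input.set_data_from_numpy(data)
    pending.append(client.async_infer("mobilenetv2_config", inputs=[infer_input]))

# Requests arriving within max_queue_delay_microseconds of each other
# can be merged into one execution of up to max_batch_size.
results = [request.get_result() for request in pending]
```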
- First, you should deploy tritonserver on your machine.
- Run `scripts/launch_tritonserver.sh` to create the tritonserver Docker container:

```bash
./scripts/launch_tritonserver.sh
```

- `inference.py` uses the ensemble model, which includes pre-processing, inference, and post-processing.
- For MobileNetV2, the ensemble structure is written in `./model_repository/classification/mobilenetv2/mobilenetv2/config.pbtxt`.
- The model name is `mobilenetv2` and it uses the `ensemble` platform. Its input name is `RAW_INPUT` and its output name is `SCORES`.
- You can change these names to whatever you want.
```protobuf
name: "mobilenetv2"
platform: "ensemble"

max_batch_size: 8

input [
  {
    name: "RAW_INPUT"
    data_type: TYPE_UINT8
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "SCORES"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```
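For reference, a minimal client call against this ensemble might look like the sketch below; `inference.py` in this repository is the full version. The server address and image path are assumptions, and since the ensemble does its own pre-processing, the client just sends the raw image bytes:

```python
import cv2
import numpy as np
import tritonclient.http as httpclient

# Minimal sketch of querying the ensemble directly.
client = httpclient.InferenceServerClient(url="localhost:8000")

image = cv2.imread("./resources/images/red_fox.jpg")  # HWC, uint8
batched = image[np.newaxis, ...]                      # add the batch dim
infer_input = httpclient.InferInput("RAW_INPUT", list(batched.shape), "UINT8")
infer_input.set_data_from_numpy(batched)

response = client.infer("mobilenetv2", inputs=[infer_input])
scores = response.as_numpy("SCORES")                  # shape: (1, num_classes)
print(int(scores[0].argmax()), float(scores[0].max()))
```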
```bash
python inference.py --model mobilenetv2
# ==================================================
# ./resources/images/golden_retrieval.jpg
# [ 207] Golden Retriever : 47.27%
# [ 208] Labrador Retriever : 7.38%
# [ 222] Kuvasz : 4.41%
# [ 257] Pyrenean Mountain Dog : 3.78%
# [ 176] Saluki : 1.10%
# --------------------------------------------------
# ./resources/images/red_fox.jpg
# [ 277] red fox : 12.15%
# [ 272] coyote : 7.00%
# [ 278] kit fox : 5.79%
# [ 274] dhole : 4.90%
# [ 269] grey wolf : 1.49%
# ==================================================
```
