Merged
85 commits
2ec7ddb
remove onnx-tensorrt submodule
stevenlix Sep 6, 2019
d4eea2f
add new onnx-tensorrt submodule (experiment) for trt6
stevenlix Sep 6, 2019
d50c974
update engine build for trt6
stevenlix Sep 7, 2019
3b89475
update compile and compute for tensorrt6.0
stevenlix Sep 9, 2019
ef51e18
Update tensorrt_execution_provider.cc
stevenlix Sep 16, 2019
ae07c80
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
2f043de
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
14d31cd
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
7a47cac
switch to onnx-tensorrt master for TensorRT6
stevenlix Sep 20, 2019
2abe72b
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Sep 20, 2019
902181e
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
694fb9a
Handle dynamic batch size and add memcpy in TensorRT EP
stevenlix Sep 24, 2019
ef23066
update test cases
stevenlix Sep 26, 2019
84dad1a
Update tensorrt_execution_provider.cc
stevenlix Sep 26, 2019
ed2693e
update onnx-tensorrt submodule
stevenlix Sep 26, 2019
1febce8
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Sep 26, 2019
c243f53
Update Dockerfile.ubuntu_tensorrt
stevenlix Sep 27, 2019
2091b68
merge master
stevenlix Sep 27, 2019
9ca8d38
Update Dockerfile.ubuntu_tensorrt
stevenlix Sep 27, 2019
70e905d
Update run_dockerbuild.sh
stevenlix Sep 27, 2019
29ade9c
Update run_dockerbuild.sh
stevenlix Sep 27, 2019
2050cb8
Update install_ubuntu.sh
stevenlix Sep 27, 2019
6ef7cf2
Update concat_op_test.cc
stevenlix Sep 27, 2019
37cdc7d
Update tensorrt_execution_provider.cc
stevenlix Oct 1, 2019
cc21556
Upgrade TensorRT to version 6.0.1.5
stevenlix Oct 1, 2019
88a4b7a
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Oct 1, 2019
76ad980
Update onnxruntime_providers.cmake
stevenlix Oct 1, 2019
cfba487
Update CMakeLists.txt
stevenlix Oct 1, 2019
c4316d3
Update reduction_ops_test.cc
stevenlix Oct 1, 2019
1a43bb3
Merge branch 'master' into stevenlix/trt6
stevenlix Oct 1, 2019
27487d4
Update install_ubuntu.sh
stevenlix Oct 1, 2019
f7ccd5a
Update Dockerfile.ubuntu_tensorrt
stevenlix Oct 1, 2019
0f3dc0a
Update Dockerfile.tensorrt
stevenlix Oct 1, 2019
c427ede
Update BUILD.md
stevenlix Oct 1, 2019
45781ff
Update run_dockerbuild.sh
stevenlix Oct 1, 2019
ddde6f6
Update install_ubuntu.sh
stevenlix Oct 2, 2019
3b5f551
Update onnxruntime_providers.cmake
stevenlix Oct 2, 2019
909765a
Update install_ubuntu.sh
stevenlix Oct 2, 2019
8f2c7b3
Update install_ubuntu.sh
stevenlix Oct 2, 2019
94c4887
Update gemm_test.cc
stevenlix Oct 2, 2019
b54d9e9
Update gather_op_test.cc
stevenlix Oct 2, 2019
18dfe23
Update CMakeLists.txt
stevenlix Oct 2, 2019
ab37d25
Removed submodule
stevenlix Oct 2, 2019
c315927
update onnx-tensorrt submodule
stevenlix Oct 2, 2019
08ab724
Merge remote-tracking branch 'origin/stevenlix/trt6' into stevenlix/d…
stevenlix Oct 2, 2019
63eff0c
update header file
stevenlix Oct 3, 2019
9f455a8
resolve conflict
stevenlix Oct 8, 2019
681cc57
Removed submodule
stevenlix Oct 8, 2019
7ebf729
add submodule onnx-tensorrt kevin's branch shape-test
stevenlix Oct 8, 2019
7b09e3f
add debugging code
stevenlix Oct 9, 2019
a9e3e96
Update tensorrt_execution_provider.cc
stevenlix Oct 10, 2019
b165e63
Update tensorrt_execution_provider.cc
stevenlix Oct 10, 2019
b185773
merge master
stevenlix Oct 27, 2019
580940d
merge master
stevenlix Oct 27, 2019
aac11b4
Removed submodule
stevenlix Oct 27, 2019
51b07f9
update onnx-tensorrt submodule
stevenlix Oct 27, 2019
9a24ea0
add more changes for dynamic shapes
stevenlix Oct 28, 2019
837c8f4
Update tensorrt_execution_provider.cc
stevenlix Oct 30, 2019
7777308
update for dynamic shape
Nov 9, 2019
35460f7
Merge branch 'master' into stevenlix/dynamicshape
Nov 9, 2019
a6eb1d1
update dynamic shape processing
Nov 12, 2019
d046f57
Merge branch 'master' into stevenlix/dynamicshape
Nov 15, 2019
6d2b372
fix logger issue
Nov 16, 2019
153b043
Merge remote-tracking branch 'origin/master' into stevenlix/dynamicshape
Nov 19, 2019
c098bea
remove submodule onnx-tensorrt
stevenlix Nov 20, 2019
02c4ab4
add submodule onnx-tensorrt
stevenlix Nov 20, 2019
e2e40fc
add env variable min_subgraph_size
stevenlix Nov 20, 2019
bf2a6e8
remove redundancy
stevenlix Nov 21, 2019
f1c9374
update document
Nov 22, 2019
a571e37
use onnxruntime::make_unique
stevenlix Nov 22, 2019
62ffa6e
solve conflict
Nov 22, 2019
001c0f4
fix multi-run issue
Nov 22, 2019
8740a2e
remove some tests to save CI build time
Nov 23, 2019
ef4ecb7
Add dynamic shape test
stevenlix Nov 27, 2019
1673803
Update TensorRT-ExecutionProvider.md
stevenlix Nov 27, 2019
5494497
Add example of running Faster R-CNN model on TensorRT EP
stevenlix Nov 27, 2019
5c7e8f4
Add more details on env variables
stevenlix Nov 27, 2019
1759868
update environment variables
stevenlix Nov 27, 2019
2b6a974
Update tensorrt_basic_test.cc
stevenlix Dec 3, 2019
ac70ec8
Merge branch 'master' into stevenlix/dynamicshape
stevenlix Dec 3, 2019
e63214f
Update model tests
stevenlix Dec 3, 2019
e41ff98
Update tensor_op_test.cc
stevenlix Dec 3, 2019
30240e8
remove --use_full_protobuf
stevenlix Dec 3, 2019
9564b9f
Update build.py
stevenlix Dec 4, 2019
e7fa3f7
Merge branch 'master' into stevenlix/dynamicshape
jywu-msft Dec 4, 2019
6 changes: 3 additions & 3 deletions .gitmodules
@@ -40,12 +40,12 @@
[submodule "cmake/external/cub"]
path = cmake/external/cub
url = https://github.com/NVlabs/cub.git
[submodule "cmake/external/onnx-tensorrt"]
path = cmake/external/onnx-tensorrt
url = https://github.com/onnx/onnx-tensorrt.git
[submodule "cmake/external/wil"]
path = cmake/external/wil
url = https://github.com/microsoft/wil
[submodule "cmake/external/onnx-tensorrt"]
path = cmake/external/onnx-tensorrt
url = https://github.com/onnx/onnx-tensorrt.git
[submodule "cmake/external/json"]
path = cmake/external/json
url = https://github.com/nlohmann/json
2 changes: 1 addition & 1 deletion BUILD.md
@@ -189,7 +189,7 @@ See more information on the TensorRT Execution Provider [here](./docs/execution_
* The path to the CUDA `bin` directory must be added to the PATH environment variable so that `nvcc` is found.
* The path to the cuDNN installation (path to folder that contains libcudnn.so) must be provided via the cuDNN_PATH environment variable, or `--cudnn_home parameter`.
* Install [TensorRT](https://developer.nvidia.com/nvidia-tensorrt-download)
* The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5 but validated with the feature set equivalent to TensorRT 5. Some TensorRT 6 new features such as dynamic shape is not available at this time.
* The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5.
* The path to TensorRT installation must be provided via the `--tensorrt_home parameter`.

#### Build Instructions
44 changes: 35 additions & 9 deletions docs/execution_providers/TensorRT-ExecutionProvider.md
@@ -7,20 +7,35 @@ With the TensorRT execution provider, the ONNX Runtime delivers better inferencing
## Build
For build instructions, please see the [BUILD page](../../BUILD.md#tensorrt).

The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5 but validated with the feature set equivalent to TensorRT 5. Some TensorRT 6 new features such as dynamic shape is not available as this time.
The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5.

## Using the TensorRT execution provider
### C/C++
The TensortRT execution provider needs to be registered with ONNX Runtime to enable in the inference session.
The TensorRT execution provider needs to be registered with ONNX Runtime to enable in the inference session.
```
InferenceSession session_object{so};
session_object.RegisterExecutionProvider(std::make_unique<::onnxruntime::TensorrtExecutionProvider>());
status = session_object.Load(model_file_name);
```
The C API details are [here](../C_API.md#c-api).

#### Sample
To run the Faster R-CNN model on the TensorRT execution provider:
Member commented:
it's better to do a notebook tutorial to be consistent with other EPs.
let's do it as a separate future PR.

Contributor Author replied:
Will do


First, download the Faster R-CNN ONNX model from the ONNX Model Zoo [here](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/faster-rcnn).

Second, infer shapes in the model by running the shape inference script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py):
```
python symbolic_shape_infer.py --input /path/to/onnx/model/model.onnx --output /path/to/onnx/model/new_model.onnx --auto_merge
```
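The `--auto_merge` flag tells the script to merge symbolic shape dimensions when they conflict during inference. The idea can be sketched in pure Python (a hypothetical illustration of the merging rule, not the actual script's code; `merge_dim`/`merge_shape` are invented names):

```python
def merge_dim(a, b):
    """Merge two dimension values from matching tensor shapes.

    Concrete dims must agree; a symbolic dim (a string such as
    'batch') is unified with whatever the other side provides.
    Hypothetical sketch of the idea behind --auto_merge.
    """
    if a == b:
        return a
    if isinstance(a, str):   # 'a' is symbolic: adopt 'b'
        return b
    if isinstance(b, str):   # 'b' is symbolic: adopt 'a'
        return a
    raise ValueError(f"incompatible dims: {a} vs {b}")

def merge_shape(s1, s2):
    """Merge two shapes element-wise."""
    return [merge_dim(a, b) for a, b in zip(s1, s2)]

print(merge_shape(["batch", 3, 224, 224], [1, 3, "h", 224]))
# -> [1, 3, 224, 224]
```

Symbolic dimensions that survive merging are what allow the TensorRT EP to handle dynamic input shapes at run time.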

Third, replace the original model with the new model and run the onnx_test_runner tool under the ONNX Runtime build directory:
```
./onnx_test_runner -e tensorrt /path/to/onnx/model/
```

### Python
When using the Python wheel from the ONNX Runtime build with TensorRT execution provider, it will be automatically prioritized over the default GPU or CPU execution providers. There is no need to separately register the execution provider. Python APIs details are [here](https://microsoft.github.io/onnxruntime/api_summary.html).

#### Sample
Please see [this Notebook](../python/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb) for an example of running a model on GPU using ONNX Runtime through Azure Machine Learning Services.
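The provider prioritization described above can be pictured as a preference-ordered choice (a toy sketch only — the real session assigns individual graph nodes rather than picking one provider for the whole model; provider names follow onnxruntime's conventions):

```python
# Conceptual preference order when multiple execution providers are
# compiled into the wheel: TensorRT first, then CUDA, then CPU.
PRIORITY = ["TensorrtExecutionProvider",
            "CUDAExecutionProvider",
            "CPUExecutionProvider"]

def pick_provider(available):
    """Return the highest-priority provider present in `available`.

    Toy illustration of automatic prioritization; no explicit
    registration step is needed from the user.
    """
    for p in PRIORITY:
        if p in available:
            return p
    raise RuntimeError("no execution provider available")

print(pick_provider({"CPUExecutionProvider", "TensorrtExecutionProvider"}))
# -> TensorrtExecutionProvider
```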
@@ -30,14 +45,25 @@ For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tun

When/if using [onnxruntime_perf_test](../../onnxruntime/test/perftest#onnxruntime-performance-test), use the flag `-e tensorrt`

## Configuring Engine Max Batch Size and Workspace Size
By default TensorRT execution provider builds an ICudaEngine with max batch size = 1 and max workspace size = 1 GB
One can override these defaults by setting environment variables ORT_TENSORRT_MAX_BATCH_SIZE and ORT_TENSORRT_MAX_WORKSPACE_SIZE.
e.g. on Linux
## Configuring environment variables
There are three environment variables for the TensorRT execution provider.

ORT_TENSORRT_MAX_WORKSPACE_SIZE: maximum workspace size for TensorRT engine.

ORT_TENSORRT_MAX_PARTITION_ITERATIONS: maximum number of iterations allowed in model partitioning for TensorRT. If the target model cannot be successfully partitioned when the maximum number of iterations is reached, the whole model will fall back to other execution providers such as CUDA or CPU.

### override default batch size to 10
export ORT_TENSORRT_MAX_BATCH_SIZE=10
ORT_TENSORRT_MIN_SUBGRAPH_SIZE: minimum number of nodes in a subgraph after partitioning. Subgraphs smaller than this will fall back to other execution providers.

By default the TensorRT execution provider builds an ICudaEngine with max workspace size = 1 GB, max partition iterations = 1000, and min subgraph size = 1.

One can override these defaults by setting environment variables ORT_TENSORRT_MAX_WORKSPACE_SIZE, ORT_TENSORRT_MAX_PARTITION_ITERATIONS and ORT_TENSORRT_MIN_SUBGRAPH_SIZE.
e.g. on Linux

### override default max workspace size to 2GB
export ORT_TENSORRT_MAX_WORKSPACE_SIZE=2147483648

### override default maximum number of iterations to 10
export ORT_TENSORRT_MAX_PARTITION_ITERATIONS=10

### override default minimum subgraph node size to 5
export ORT_TENSORRT_MIN_SUBGRAPH_SIZE=5
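Taken together, the three variables and their documented defaults behave as in this sketch (illustrative pure Python, not the provider's actual C++ implementation; `trt_ep_config` is an invented helper name):

```python
import os

def trt_ep_config():
    """Read the TensorRT EP tuning knobs with the documented defaults."""
    return {
        # 1 GB default workspace for the ICudaEngine
        "max_workspace_size": int(
            os.environ.get("ORT_TENSORRT_MAX_WORKSPACE_SIZE", 1 << 30)),
        # up to 1000 partitioning iterations before falling back entirely
        "max_partition_iterations": int(
            os.environ.get("ORT_TENSORRT_MAX_PARTITION_ITERATIONS", 1000)),
        # subgraphs smaller than this fall back to CUDA/CPU
        "min_subgraph_size": int(
            os.environ.get("ORT_TENSORRT_MIN_SUBGRAPH_SIZE", 1)),
    }

# Overriding a variable, as in the export examples above:
os.environ["ORT_TENSORRT_MAX_WORKSPACE_SIZE"] = "2147483648"  # 2 GB
print(trt_ep_config())  # workspace now reflects the override
```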