Merged
85 commits
2ec7ddb
remove onnx-tensorrt submodule
stevenlix Sep 6, 2019
d4eea2f
add new onnx-tensorrt submodule (experiment) for trt6
stevenlix Sep 6, 2019
d50c974
update engine build for trt6
stevenlix Sep 7, 2019
3b89475
update compile and compute for tensorrt6.0
stevenlix Sep 9, 2019
ef51e18
Update tensorrt_execution_provider.cc
stevenlix Sep 16, 2019
ae07c80
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
2f043de
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
14d31cd
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
7a47cac
switch to onnx-tensorrt master for TensorRT6
stevenlix Sep 20, 2019
2abe72b
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Sep 20, 2019
902181e
Update tensorrt_execution_provider.cc
stevenlix Sep 20, 2019
694fb9a
Handle dynamic batch size and add memcpy in TensorRT EP
stevenlix Sep 24, 2019
ef23066
update test cases
stevenlix Sep 26, 2019
84dad1a
Update tensorrt_execution_provider.cc
stevenlix Sep 26, 2019
ed2693e
update onnx-tensorrt submodule
stevenlix Sep 26, 2019
1febce8
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Sep 26, 2019
c243f53
Update Dockerfile.ubuntu_tensorrt
stevenlix Sep 27, 2019
2091b68
merge master
stevenlix Sep 27, 2019
9ca8d38
Update Dockerfile.ubuntu_tensorrt
stevenlix Sep 27, 2019
70e905d
Update run_dockerbuild.sh
stevenlix Sep 27, 2019
29ade9c
Update run_dockerbuild.sh
stevenlix Sep 27, 2019
2050cb8
Update install_ubuntu.sh
stevenlix Sep 27, 2019
6ef7cf2
Update concat_op_test.cc
stevenlix Sep 27, 2019
37cdc7d
Update tensorrt_execution_provider.cc
stevenlix Oct 1, 2019
cc21556
Upgrade TensorRT to version 6.0.1.5
stevenlix Oct 1, 2019
88a4b7a
Merge branch 'stevenlix/trt6' of https://github.com/Microsoft/onnxrun…
stevenlix Oct 1, 2019
76ad980
Update onnxruntime_providers.cmake
stevenlix Oct 1, 2019
cfba487
Update CMakeLists.txt
stevenlix Oct 1, 2019
c4316d3
Update reduction_ops_test.cc
stevenlix Oct 1, 2019
1a43bb3
Merge branch 'master' into stevenlix/trt6
stevenlix Oct 1, 2019
27487d4
Update install_ubuntu.sh
stevenlix Oct 1, 2019
f7ccd5a
Update Dockerfile.ubuntu_tensorrt
stevenlix Oct 1, 2019
0f3dc0a
Update Dockerfile.tensorrt
stevenlix Oct 1, 2019
c427ede
Update BUILD.md
stevenlix Oct 1, 2019
45781ff
Update run_dockerbuild.sh
stevenlix Oct 1, 2019
ddde6f6
Update install_ubuntu.sh
stevenlix Oct 2, 2019
3b5f551
Update onnxruntime_providers.cmake
stevenlix Oct 2, 2019
909765a
Update install_ubuntu.sh
stevenlix Oct 2, 2019
8f2c7b3
Update install_ubuntu.sh
stevenlix Oct 2, 2019
94c4887
Update gemm_test.cc
stevenlix Oct 2, 2019
b54d9e9
Update gather_op_test.cc
stevenlix Oct 2, 2019
18dfe23
Update CMakeLists.txt
stevenlix Oct 2, 2019
ab37d25
Removed submodule
stevenlix Oct 2, 2019
c315927
update onnx-tensorrt submodule
stevenlix Oct 2, 2019
08ab724
Merge remote-tracking branch 'origin/stevenlix/trt6' into stevenlix/d…
stevenlix Oct 2, 2019
63eff0c
update header file
stevenlix Oct 3, 2019
9f455a8
resolve conflict
stevenlix Oct 8, 2019
681cc57
Removed submodule
stevenlix Oct 8, 2019
7ebf729
add submodule onnx-tensorrt kevin's branch shape-test
stevenlix Oct 8, 2019
7b09e3f
add debugging code
stevenlix Oct 9, 2019
a9e3e96
Update tensorrt_execution_provider.cc
stevenlix Oct 10, 2019
b165e63
Update tensorrt_execution_provider.cc
stevenlix Oct 10, 2019
b185773
merge master
stevenlix Oct 27, 2019
580940d
merge master
stevenlix Oct 27, 2019
aac11b4
Removed submodule
stevenlix Oct 27, 2019
51b07f9
update onnx-tensorrt submodule
stevenlix Oct 27, 2019
9a24ea0
add more changes for dynamic shapes
stevenlix Oct 28, 2019
837c8f4
Update tensorrt_execution_provider.cc
stevenlix Oct 30, 2019
7777308
update for dynamic shape
Nov 9, 2019
35460f7
Merge branch 'master' into stevenlix/dynamicshape
Nov 9, 2019
a6eb1d1
update dynamic shape processing
Nov 12, 2019
d046f57
Merge branch 'master' into stevenlix/dynamicshape
Nov 15, 2019
6d2b372
fix logger issue
Nov 16, 2019
153b043
Merge remote-tracking branch 'origin/master' into stevenlix/dynamicshape
Nov 19, 2019
c098bea
remove submodule onnx-tensorrt
stevenlix Nov 20, 2019
02c4ab4
add submodule onnx-tensorrt
stevenlix Nov 20, 2019
e2e40fc
add env variable min_subgraph_size
stevenlix Nov 20, 2019
bf2a6e8
remove redundancy
stevenlix Nov 21, 2019
f1c9374
update document
Nov 22, 2019
a571e37
use onnxruntime::make_unique
stevenlix Nov 22, 2019
62ffa6e
solve conflict
Nov 22, 2019
001c0f4
fix multi-run issue
Nov 22, 2019
8740a2e
remove some tests to save CI build time
Nov 23, 2019
ef4ecb7
Add dynamic shape test
stevenlix Nov 27, 2019
1673803
Update TensorRT-ExecutionProvider.md
stevenlix Nov 27, 2019
5494497
Add example of running Faster R-CNN model on TensorRT EP
stevenlix Nov 27, 2019
5c7e8f4
Add more details on env variables
stevenlix Nov 27, 2019
1759868
update environment variables
stevenlix Nov 27, 2019
2b6a974
Update tensorrt_basic_test.cc
stevenlix Dec 3, 2019
ac70ec8
Merge branch 'master' into stevenlix/dynamicshape
stevenlix Dec 3, 2019
e63214f
Update model tests
stevenlix Dec 3, 2019
e41ff98
Update tensor_op_test.cc
stevenlix Dec 3, 2019
30240e8
remove --use_full_protobuf
stevenlix Dec 3, 2019
9564b9f
Update build.py
stevenlix Dec 4, 2019
e7fa3f7
Merge branch 'master' into stevenlix/dynamicshape
jywu-msft Dec 4, 2019
6 changes: 3 additions & 3 deletions .gitmodules
@@ -40,12 +40,12 @@
[submodule "cmake/external/cub"]
path = cmake/external/cub
url = https://github.com/NVlabs/cub.git
[submodule "cmake/external/onnx-tensorrt"]
path = cmake/external/onnx-tensorrt
url = https://github.com/onnx/onnx-tensorrt.git
[submodule "cmake/external/wil"]
path = cmake/external/wil
url = https://github.com/microsoft/wil
[submodule "cmake/external/onnx-tensorrt"]
path = cmake/external/onnx-tensorrt
url = https://github.com/onnx/onnx-tensorrt.git
[submodule "cmake/external/json"]
path = cmake/external/json
url = https://github.com/nlohmann/json
2 changes: 1 addition & 1 deletion BUILD.md
@@ -189,7 +189,7 @@ See more information on the TensorRT Execution Provider [here](./docs/execution_
* The path to the CUDA `bin` directory must be added to the PATH environment variable so that `nvcc` is found.
* The path to the cuDNN installation (path to folder that contains libcudnn.so) must be provided via the cuDNN_PATH environment variable, or `--cudnn_home parameter`.
* Install [TensorRT](https://developer.nvidia.com/nvidia-tensorrt-download)
* The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5 but validated with the feature set equivalent to TensorRT 5. Some TensorRT 6 new features such as dynamic shape is not available at this time.
* The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5.
* The path to TensorRT installation must be provided via the `--tensorrt_home parameter`.

#### Build Instructions
44 changes: 35 additions & 9 deletions docs/execution_providers/TensorRT-ExecutionProvider.md
@@ -7,20 +7,35 @@ With the TensorRT execution provider, the ONNX Runtime delivers better inferencing
## Build
For build instructions, please see the [BUILD page](../../BUILD.md#tensorrt).

The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5 but validated with the feature set equivalent to TensorRT 5. Some TensorRT 6 new features such as dynamic shape is not available as this time.
The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 6.0.1.5.

## Using the TensorRT execution provider
### C/C++
The TensortRT execution provider needs to be registered with ONNX Runtime to enable in the inference session.
The TensorRT execution provider needs to be registered with ONNX Runtime to enable in the inference session.
```
InferenceSession session_object{so};
session_object.RegisterExecutionProvider(std::make_unique<::onnxruntime::TensorrtExecutionProvider>());
status = session_object.Load(model_file_name);
```
The C API details are [here](../C_API.md#c-api).

#### Sample
To run the Faster R-CNN model on the TensorRT execution provider:
Member commented:
it's better to do a notebook tutorial to be consistent with other EPs.
let's do it as a separate future PR.

Contributor Author replied:
Will do


First, download the Faster R-CNN ONNX model from the ONNX Model Zoo [here](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/faster-rcnn).

Second, infer shapes in the model by running the shape inference script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py):
```
python symbolic_shape_infer.py --input /path/to/onnx/model/model.onnx --output /path/to/onnx/model/new_model.onnx --auto_merge
```
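The `--auto_merge` flag tells the script to merge symbolic shape dimensions when they conflict during inference. The idea can be sketched in pure Python (a hypothetical illustration of the merging rule, not the actual script's code; `merge_dim`/`merge_shape` are invented names):

```python
def merge_dim(a, b):
    """Merge two dimension values from matching tensor shapes.

    Concrete dims must agree; a symbolic dim (a string such as
    'batch') is unified with whatever the other side provides.
    Hypothetical sketch of the idea behind --auto_merge.
    """
    if a == b:
        return a
    if isinstance(a, str):   # 'a' is symbolic: adopt 'b'
        return b
    if isinstance(b, str):   # 'b' is symbolic: adopt 'a'
        return a
    raise ValueError(f"incompatible dims: {a} vs {b}")

def merge_shape(s1, s2):
    """Merge two shapes element-wise."""
    return [merge_dim(a, b) for a, b in zip(s1, s2)]

print(merge_shape(["batch", 3, 224, 224], [1, 3, "h", 224]))
# -> [1, 3, 224, 224]
```

Symbolic dimensions that survive merging are what allow the TensorRT EP to handle dynamic input shapes at run time.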

Third, replace the original model with the new model and run the onnx_test_runner tool under the ONNX Runtime build directory:
```
./onnx_test_runner -e tensorrt /path/to/onnx/model/
```

### Python
When using the Python wheel from the ONNX Runtime build with TensorRT execution provider, it will be automatically prioritized over the default GPU or CPU execution providers. There is no need to separately register the execution provider. Python APIs details are [here](https://microsoft.github.io/onnxruntime/api_summary.html).

#### Sample
Please see [this Notebook](../python/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb) for an example of running a model on GPU using ONNX Runtime through Azure Machine Learning Services.
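The provider prioritization described above can be pictured as a preference-ordered choice (a toy sketch only — the real session assigns individual graph nodes rather than picking one provider for the whole model; provider names follow onnxruntime's conventions):

```python
# Conceptual preference order when multiple execution providers are
# compiled into the wheel: TensorRT first, then CUDA, then CPU.
PRIORITY = ["TensorrtExecutionProvider",
            "CUDAExecutionProvider",
            "CPUExecutionProvider"]

def pick_provider(available):
    """Return the highest-priority provider present in `available`.

    Toy illustration of automatic prioritization; no explicit
    registration step is needed from the user.
    """
    for p in PRIORITY:
        if p in available:
            return p
    raise RuntimeError("no execution provider available")

print(pick_provider({"CPUExecutionProvider", "TensorrtExecutionProvider"}))
# -> TensorrtExecutionProvider
```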
@@ -30,14 +45,25 @@ For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tun

When/if using [onnxruntime_perf_test](../../onnxruntime/test/perftest#onnxruntime-performance-test), use the flag `-e tensorrt`

## Configuring Engine Max Batch Size and Workspace Size
By default TensorRT execution provider builds an ICudaEngine with max batch size = 1 and max workspace size = 1 GB
One can override these defaults by setting environment variables ORT_TENSORRT_MAX_BATCH_SIZE and ORT_TENSORRT_MAX_WORKSPACE_SIZE.
e.g. on Linux
## Configuring environment variables
There are three environment variables for the TensorRT execution provider.

ORT_TENSORRT_MAX_WORKSPACE_SIZE: maximum workspace size for TensorRT engine.

ORT_TENSORRT_MAX_PARTITION_ITERATIONS: maximum number of iterations allowed in model partitioning for TensorRT. If the target model cannot be successfully partitioned when the maximum number of iterations is reached, the whole model will fall back to other execution providers such as CUDA or CPU.

### override default batch size to 10
export ORT_TENSORRT_MAX_BATCH_SIZE=10
ORT_TENSORRT_MIN_SUBGRAPH_SIZE: minimum number of nodes in a subgraph after partitioning. Subgraphs smaller than this will fall back to other execution providers.

By default the TensorRT execution provider builds an ICudaEngine with max workspace size = 1 GB, max partition iterations = 1000, and min subgraph size = 1.

One can override these defaults by setting environment variables ORT_TENSORRT_MAX_WORKSPACE_SIZE, ORT_TENSORRT_MAX_PARTITION_ITERATIONS and ORT_TENSORRT_MIN_SUBGRAPH_SIZE.
e.g. on Linux

### override default max workspace size to 2GB
export ORT_TENSORRT_MAX_WORKSPACE_SIZE=2147483648

### override default maximum number of iterations to 10
export ORT_TENSORRT_MAX_PARTITION_ITERATIONS=10

### override default minimum subgraph node size to 5
export ORT_TENSORRT_MIN_SUBGRAPH_SIZE=5
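Taken together, the three variables and their documented defaults behave as in this sketch (illustrative pure Python, not the provider's actual C++ implementation; `trt_ep_config` is an invented helper name):

```python
import os

def trt_ep_config():
    """Read the TensorRT EP tuning knobs with the documented defaults."""
    return {
        # 1 GB default workspace for the ICudaEngine
        "max_workspace_size": int(
            os.environ.get("ORT_TENSORRT_MAX_WORKSPACE_SIZE", 1 << 30)),
        # up to 1000 partitioning iterations before falling back entirely
        "max_partition_iterations": int(
            os.environ.get("ORT_TENSORRT_MAX_PARTITION_ITERATIONS", 1000)),
        # subgraphs smaller than this fall back to CUDA/CPU
        "min_subgraph_size": int(
            os.environ.get("ORT_TENSORRT_MIN_SUBGRAPH_SIZE", 1)),
    }

# Overriding a variable, as in the export examples above:
os.environ["ORT_TENSORRT_MAX_WORKSPACE_SIZE"] = "2147483648"  # 2 GB
print(trt_ep_config())  # workspace now reflects the override
```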