open-mmlab · AllentDan · Jan 26, 2022 · Jan 21, 2022 · Jan 24, 2022
diff --git a/docs/en/faq.md b/docs/en/faq.md
@@ -1 +1,30 @@
 ## Frequently Asked Questions
+
+### TensorRT
+
+- "WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected."
+
+  FP16 mode requires a device with full-rate FP16 support. On other hardware the engine still runs, but slower, as the warning states.
+
+- "error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]"
+
+  When building an `ICudaEngine` from an `INetworkDefinition` that has dynamically resizable inputs, users need to specify at least one optimization profile, which can be set in the deploy config:
+
+  ```python
+  backend_config = dict(
+    common_config=dict(max_workspace_size=1 << 30),
+    model_inputs=[
+        dict(
+            input_shapes=dict(
+                input=dict(
+                    min_shape=[1, 3, 320, 320],
+                    opt_shape=[1, 3, 800, 1344],
+                    max_shape=[1, 3, 1344, 1344])))
+    ])
+  ```
+
+  The input tensor shape must lie between `min_shape` and `max_shape` (inclusive) in every dimension.
+
+- "error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS"
+
+  TRT 7.2.1 switched from cuBLAS to cuBLASLt, which is the default choice for SM version >= 7.0. You may need CUDA-10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. If you don't want to upgrade, another option is to use the new TacticSource API and disable cuBLASLt tactics.
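
The TacticSource workaround in the last FAQ entry above could look roughly like the sketch below. It assumes that TensorRT's Python `IBuilderConfig` provides `get_tactic_sources()` and `set_tactic_sources()` and that the `TacticSource` enum has a `CUBLAS_LT` member (TensorRT 7.2 or later). The helper name `disable_cublas_lt` is hypothetical and not part of mmdeploy; the `tensorrt` module is passed in as `trt` so the function itself imports nothing.

```python
def disable_cublas_lt(config, trt):
    """Clear the cuBLASLt bit in a builder config's tactic sources.

    `config` is assumed to behave like tensorrt.IBuilderConfig and `trt`
    like the imported tensorrt module (both are assumptions of this sketch).
    Returns the new tactic-source bitmask.
    """
    # Tactic sources are stored as a bitmask indexed by the enum value.
    cublas_lt_bit = 1 << int(trt.TacticSource.CUBLAS_LT)
    sources = config.get_tactic_sources() & ~cublas_lt_bit
    config.set_tactic_sources(sources)
    return sources
```

With a real builder this would presumably be called as `disable_cublas_lt(builder.create_builder_config(), trt)` after `import tensorrt as trt`, before the engine is built.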
diff --git a/mmdeploy/backend/tensorrt/wrapper.py b/mmdeploy/backend/tensorrt/wrapper.py
@@ -88,7 +88,18 @@ def forward(self, inputs: Dict[str,
         assert self._output_names is not None
         bindings = [None] * (len(self._input_names) + len(self._output_names))
 
+        profile_id = 0
         for input_name, input_tensor in inputs.items():
+            # check if input shape is valid
+            profile = self.engine.get_profile_shape(profile_id, input_name)
+            assert input_tensor.dim() == len(
+                profile[0]), 'Input dim is different from engine profile.'
+            for s_min, s_input, s_max in zip(profile[0], input_tensor.shape,
+                                             profile[2]):
+                assert s_min <= s_input <= s_max, \
+                    'Input shape should be between ' \
+                    + f'{profile[0]} and {profile[2]}' \
+                    + f' but got {tuple(input_tensor.shape)}.'
             idx = self.engine.get_binding_index(input_name)
 
             # All input tensors must be gpu variables
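
The range check added in this hunk can be tried without a TensorRT engine. The sketch below is a standalone approximation, not the code in `wrapper.py`: the function name `check_shape_in_profile` is hypothetical, shapes are plain sequences instead of tensors, and it raises `ValueError` where the PR uses `assert` (asserts are skipped under `python -O`).

```python
from typing import Sequence


def check_shape_in_profile(shape: Sequence[int],
                           min_shape: Sequence[int],
                           max_shape: Sequence[int]) -> None:
    """Raise ValueError unless min_shape <= shape <= max_shape elementwise."""
    if len(shape) != len(min_shape):
        raise ValueError('Input dim is different from engine profile.')
    for s_min, s_input, s_max in zip(min_shape, shape, max_shape):
        if not s_min <= s_input <= s_max:
            raise ValueError(
                f'Input shape should be between {tuple(min_shape)} and '
                f'{tuple(max_shape)} but got {tuple(shape)}.')


# Bounds taken from the example deploy config in the FAQ; this call passes.
check_shape_in_profile([1, 3, 800, 1344], [1, 3, 320, 320], [1, 3, 1344, 1344])
```

A shape such as `[1, 3, 2000, 320]` exceeds `max_shape` in the third dimension and would raise here, which corresponds to the case the `setBindingDimensions` error in the FAQ describes.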