Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ All notable changes to this project will be documented in this file.
- Add a new feature to configure input size(<https://github.com/openvinotoolkit/training_extensions/pull/2420>)
- Introduce the OTXSampler and AdaptiveRepeatDataHook to achieve faster training at the small data regime (<https://github.com/openvinotoolkit/training_extensions/pull/2428>)
- Add a new object detector Lite-DINO(<https://github.com/openvinotoolkit/training_extensions/pull/2457>)
- Official supports for YOLOX-X, YOLOX-L, YOLOX-S, ResNeXt101-ATSS (<https://github.com/openvinotoolkit/training_extensions/pull/2485>)

### Enhancements

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,12 +74,20 @@ We support the following ready-to-use model templates:
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+===========================================================================================================================================================================================+=====================+=====================+=================+
| `Custom_Object_Detection_YOLOX <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox/template.yaml>`_ | YOLOX | 6.5 | 20.4 |
| `Custom_Object_Detection_YOLOX <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_tiny/template.yaml>`_ | YOLOX-TINY | 6.5 | 20.4 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_S <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_s/template.yaml>`_ | YOLOX_S | 33.51 | 46.0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_L <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_l/template.yaml>`_ | YOLOX_L | 194.57 | 207.0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_X <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_x/template.yaml>`_ | YOLOX_X | 352.42 | 378.0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Custom_Object_Detection_Gen3_SSD <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/mobilenetv2_ssd/template.yaml>`_ | SSD | 9.4 | 7.6 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Custom_Object_Detection_Gen3_ATSS <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/mobilenetv2_atss/template.yaml>`_ | MobileNetV2-ATSS | 20.6 | 9.1 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_ResNeXt101_ATSS <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnext101_atss/template.yaml>`_ | ResNeXt101-ATSS | 434.75 | 344.0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+

Above table can be found using the following command

Expand All @@ -90,32 +98,24 @@ Above table can be found using the following command
`SSD <https://arxiv.org/abs/1512.02325>`_ and `YOLOX <https://arxiv.org/abs/2107.08430>`_ are light models, that a perfect for the fastest inference on low-power hardware.
YOLOX achieved the same accuracy as SSD, and even outperforms its inference on CPU 1.5 times, but requires 3 times more time for training due to `Mosaic augmentation <https://arxiv.org/pdf/2004.10934.pdf>`_, which is even more than for ATSS.
So if you have resources for a long training, you can pick the YOLOX model.
ATSS still shows good performance among `RetinaNet <https://arxiv.org/abs/1708.02002>`_ based models. Therfore, We added ATSS with large scale backbone, ResNeXt101-ATSS. We integrated large ResNeXt101 backbone to our Custom ATSS head, and it shows good transfer learning performance.
In addition, we added a YOLOX variants to support users' diverse situations.

In addition to these models, we supports experimental models for object detection. These experimental models will be changed to official models within a few releases.

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+===========================================================================================================================================================================================================================+=====================+=====================+=================+
| `Custom_Object_Detection_Gen3_Deformable_DETR <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_deformable_detr/template_experimental.yaml>`_ | Deformable_DETR | 165 | 157.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Custom_Object_Detection_Gen3_DINO <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_dino/template_experimental.yaml>`_ | DINO | 235 | 182.0 |
| `Object_Detection_Deformable_DETR <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_deformable_detr/template_experimental.yaml>`_ | Deformable_DETR | 165 | 157.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Custom_Object_Detection_Gen3_Lite_DINO <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_litedino/template_experimental.yaml>`_ | Lite-DINO | 140 | 190.0 |
| `Object_Detection_DINO <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_dino/template_experimental.yaml>`_ | DINO | 235 | 182.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Custom_Object_Detection_Gen3_ResNeXt101_ATSS <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnext101_atss/template_experimental.yaml>`_ | ResNeXt101-ATSS | 434.75 | 344.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_S <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_s/template_experimental.yaml>`_ | YOLOX_S | 33.51 | 46.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_L <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_l/template_experimental.yaml>`_ | YOLOX_L | 194.57 | 207.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
| `Object_Detection_YOLOX_X <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_x/template_experimental.yaml>`_ | YOLOX_X | 352.42 | 378.0 |
| `Object_Detection_Lite_DINO <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/resnet50_litedino/template_experimental.yaml>`_ | Lite-DINO | 140 | 190.0 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+


`Deformable_DETR <https://arxiv.org/abs/2010.04159>`_ is `DETR <https://arxiv.org/abs/2005.12872>`_ based model, and it solves slow convergence problem of DETR. `DINO <https://arxiv.org/abs/2203.03605>`_ improves Deformable DETR based methods via denoising anchor boxes. Current SOTA models for object detection are based on DINO.
`Lite-DINO <https://arxiv.org/abs/2303.07335>`_ is efficient structure for DINO. It reduces FLOPS of transformer's encoder which takes the highest computational costs.
Although transformer based models show notable performance on various object detection benchmark, CNN based model still show good performance with proper latency.
Therefore, we added a new experimental CNN based method, ResNeXt101-ATSS. ATSS still shows good performance among `RetinaNet <https://arxiv.org/abs/1708.02002>`_ based models. We integrated large ResNeXt101 backbone to our Custom ATSS head, and it shows good transfer learning performance.
In addition, we added a YOLOX variants to support users' diverse situations.

.. note::

Expand Down Expand Up @@ -145,7 +145,7 @@ We trained each model with a single Nvidia GeForce RTX3090.
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| Model name | COCO(AP50) | BDD100K | Brackish | Plantdoc | BCCD | Chess pieces |
+============================+==================+===========+===========+===========+===========+==============+
| YOLOX | 31.0 (48.2) | 24.8 | 96.3 | 51.5 | 88.5 | 99.2 |
| YOLOX-TINY | 31.0 (48.2) | 24.8 | 96.3 | 51.5 | 88.5 | 99.2 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| SSD | 13.5 | 28.2 | 96.5 | 52.9 | 91.1 | 99.1 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
Expand All @@ -159,11 +159,11 @@ We trained each model with a single Nvidia GeForce RTX3090.
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| ResNet50-Lite-DINO | 48.1 (64.4) | 47.0 | 99.0 | 62.5 | 93.6 | 99.4 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| YOLOX_S | 40.3 (59.1) | 37.1 | 93.6 | 54.8 | 92.7 | 98.8 |
| YOLOX-S | 40.3 (59.1) | 37.1 | 93.6 | 54.8 | 92.7 | 98.8 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| YOLOX_L | 49.4 (67.1) | 44.5 | 94.6 | 55.8 | 91.8 | 99.0 |
| YOLOX-L | 49.4 (67.1) | 44.5 | 94.6 | 55.8 | 91.8 | 99.0 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
| YOLOX_X | 50.9 (68.4) | 44.2 | 96.3 | 56.2 | 91.5 | 98.9 |
| YOLOX-X | 50.9 (68.4) | 44.2 | 96.3 | 56.2 | 91.5 | 98.9 |
+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+

************************
Expand Down
19 changes: 11 additions & 8 deletions docs/source/guide/get_started/cli_commands.rst
Original file line number Diff line number Diff line change
Expand Up @@ -32,14 +32,17 @@ Example to find ready-to-use templates for the detection task:
.. code-block::

(otx) ...$ otx find --task detection
+-----------+-----------------------------------------------+------------------+-------------------------------------------------------------------------------+
| TASK | ID | NAME | BASE PATH |
+-----------+-----------------------------------------------+------------------+-------------------------------------------------------------------------------+
| DETECTION | Custom_Object_Detection_Gen3_SSD | SSD | src/otx/algorithms/detection/configs/detection/mobilenetv2_ssd/template.yaml |
| DETECTION | Custom_Object_Detection_YOLOX | YOLOX | src/otx/algorithms/detection/configs/detection/cspdarknet_yolox/template.yaml |
| DETECTION | Custom_Object_Detection_Gen3_ATSS | MobileNetV2-ATSS | src/otx/algorithms/detection/configs/detection/mobilenetv2_atss/template.yaml |
+-----------+-----------------------------------------------+------------------+-------------------------------------------------------------------------------+

+-----------+-----------------------------------+------------------+------------------------------------------------------------------------------------+
| TASK | ID | NAME | BASE PATH |
+-----------+-----------------------------------+------------------+------------------------------------------------------------------------------------+
| DETECTION | Custom_Object_Detection_Gen3_ATSS | MobileNetV2-ATSS | src/otx/algorithms/detection/configs/detection/mobilenetv2_atss/template.yaml |
| DETECTION | Object_Detection_ResNeXt101_ATSS | ResNeXt101-ATSS | src/otx/algorithms/detection/configs/detection/resnext101_atss/template.yaml |
| DETECTION | Custom_Object_Detection_Gen3_SSD | SSD | src/otx/algorithms/detection/configs/detection/mobilenetv2_ssd/template.yaml |
| DETECTION | Object_Detection_YOLOX_L | YOLOX-L | src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_l/template.yaml |
| DETECTION | Object_Detection_YOLOX_S | YOLOX-S | src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_s/template.yaml |
| DETECTION | Custom_Object_Detection_YOLOX | YOLOX-TINY | src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_tiny/template.yaml |
| DETECTION | Object_Detection_YOLOX_X | YOLOX-X | src/otx/algorithms/detection/configs/detection/cspdarknet_yolox_x/template.yaml |
+-----------+-----------------------------------+------------------+------------------------------------------------------------------------------------+

Example to find supported torchvision backbones for the detection task:

Expand Down
Loading