[Feature] Add multi-label semantic segmentation support by lzm-build · Pull Request #3479 · PaddlePaddle/PaddleSeg

lzm-build · 2023-08-30T00:57:42Z

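Currently PaddleSeg supports only single-label semantic segmentation, where each pixel location in an image maps to exactly one class (analogous to single-label classification). In multi-label semantic segmentation, masks of different instances may overlap spatially, so a pixel location must be able to map to several classes at once (analogous to multi-label classification). Because far less work exists on multi-label semantic segmentation than on other vision tasks, datasets and models tend to be custom. To make custom datasets and models easier to use, I added the following modules: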

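PaddleSeg currently supports only the PNG annotation format, which is clearly unsuitable for multi-label semantic segmentation. After some investigation I found two main annotation formats that fit multi-label semantic segmentation: a. COCO instance, b. CSV list. I chose the COCO instance format as the dataset reading interface for multi-label semantic segmentation, made it reuse the other general-purpose image augmentation operators, and also provided a script for the common CSV-to-COCO-instance conversion.
The output head of a semantic segmentation model produces a tensor of shape batch_size x num_classes x o_h x o_w, which is the same for single-label and multi-label tasks. The main difference is how the num_classes dimension is handled: single-label usually applies softmax (analogous to single-label classification), while multi-label currently supports two loss functions: a. BCELoss, b. MultiLabelCategoricalCELoss.
For evaluation, the single-label metrics are reused with small modifications; there is no evaluation designed specifically for multi-label tasks yet.
Inference and prediction on images still need work: because regions overlap, it is unclear how best to visualize them.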
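To make the difference in handling the num_classes dimension concrete, here is a minimal NumPy sketch (not the PaddleSeg implementation). The logits and the 0.5 threshold are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits with shape (batch_size, num_classes, o_h, o_w) = (1, 3, 1, 2).
logits = np.array([[[[2.0, -1.0]],
                    [[1.5, -2.0]],
                    [[-3.0, 0.5]]]])

# Single-label: classes compete, so each pixel gets exactly one class index.
single_label = softmax(logits, axis=1).argmax(axis=1)    # shape (1, 1, 2)

# Multi-label: each class is decided independently, so a pixel may get several.
multi_label = (sigmoid(logits) > 0.5).astype(np.int64)   # shape (1, 3, 1, 2)
```

In this toy input the first pixel is assigned only class 0 in the single-label view, while the multi-label view marks both class 0 and class 1 for it.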

Increasing the image size and using auxiliary samples can speed up convergence.

Accuracy on the UWMGI dataset, trained with ppmobileseg at size [512, 512]; the first class is the added background class, and the mean mIoU is 82.18:

paddle-bot · 2023-08-30T00:57:45Z

Thanks for your contribution!

Asthestarsfalll · 2023-08-30T14:19:39Z

+        pred = logit
+        pred = (1 - 2 * label) * pred
+        pred_neg = pred - label * 1e12
+        pred_pos = pred - (1 - label) * 1e12


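Which is faster here, where or the current code? It would be worth testing at commonly used sizes.

In my tests of the loss, the arithmetic approach is faster for small batches and where wins for large batches; label sparsity also affects the runtime.

What image size did you test, and how much does sparsity matter?

Size 512 x 512. When the fraction of 1s in the label is far below 1/num_classes, the arithmetic approach wins.

In non-extreme cases the two differ little in runtime.

Then let's use where; it looks more readable.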
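The two variants discussed in this thread can be compared for correctness with a small NumPy sketch. It is written under the assumption that both masked tensors are then passed through a log(1 + sum(exp(.))) reduction over the class axis; the helper names are hypothetical, and no timing claim is made here.

```python
import numpy as np

def logsumexp_with_zero(x):
    # log(1 + sum(exp(x))) over the class axis (axis=1), computed stably.
    zero = np.zeros_like(x[:, :1])
    x = np.concatenate([zero, x], axis=1)
    m = x.max(axis=1, keepdims=True)   # finite, because the zero column is present
    return (m + np.log(np.exp(x - m).sum(axis=1, keepdims=True))).squeeze(1)

def loss_arithmetic(logit, label):
    # Mirrors the diff above: flip the sign of positive logits, then push the
    # entries that do not belong to each group to a huge negative value.
    pred = (1 - 2 * label) * logit
    pred_neg = pred - label * 1e12
    pred_pos = pred - (1 - label) * 1e12
    return logsumexp_with_zero(pred_neg) + logsumexp_with_zero(pred_pos)

def loss_where(logit, label):
    # The same selection expressed with where and -inf.
    is_pos = label == 1
    pred_neg = np.where(is_pos, -np.inf, logit)
    pred_pos = np.where(is_pos, -logit, -np.inf)
    return logsumexp_with_zero(pred_neg) + logsumexp_with_zero(pred_pos)

rng = np.random.default_rng(0)
logit = rng.normal(size=(2, 3, 4, 4))                            # (N, C, H, W)
label = (rng.random(size=(2, 3, 4, 4)) > 0.7).astype(np.float64)

assert np.allclose(loss_arithmetic(logit, label), loss_where(logit, label))
```

Since both forms give the same per-pixel loss on this input, the choice between them comes down to speed and readability, as the thread concludes.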

Asthestarsfalll · 2023-08-30T14:21:07Z

+            label (Tensor): Label tensor, the data type is int64. Shape is (N, C), where each
+                value is 0 or 1, and if shape is more than 2D, this is
+                (N, C, D1, D2,..., Dk), k >= 1.
+        """


The description does not seem accurate.

Asthestarsfalll · 2023-08-30T14:23:29Z

+        data['gt_fields'] = []
+
+        if self.mode.lower() == 'train':
+            assert 'instances' in data, ValueError


Please write a detailed error message.

The structure of the COCOInstance API has been reworked.
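For context, `assert cond, ValueError` raises an AssertionError whose message is the ValueError class itself, and assertions are skipped entirely when Python runs with `-O`. A sketch of an explicit check follows; the helper name `check_train_sample` is hypothetical and the message wording is only an illustration.

```python
def check_train_sample(data, mode="train"):
    # Hypothetical helper; `data` and `mode` follow the names in the diff above.
    if mode.lower() == "train" and "instances" not in data:
        raise ValueError(
            "Training samples must contain the key 'instances', "
            f"but got keys: {sorted(data)}")
    return data
```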

Asthestarsfalll · 2023-08-30T14:25:30Z

+            for idx, one_class_label in enumerate(label):
+                data[f'label_{idx}'] = one_class_label
+                if self.mode.lower() == 'train':
+                    data['gt_fields'].append(f'label_{idx}')


Why split them? It seems most label transforms can be handled by changing indexing such as [a:b, c:d, :] in functional to [a:b, c:d, ...].

The structure of the COCOInstance API has been reworked.
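The indexing suggestion can be illustrated with NumPy. The `crop` helper and the (H, W, C) layout of the multi-label map are assumptions made for this sketch.

```python
import numpy as np

def crop(arr, top, bottom, left, right):
    # `...` stands for "all remaining axes", so this works whether or not
    # the array has a trailing channel/class axis.
    return arr[top:bottom, left:right, ...]

single = np.zeros((8, 8), dtype=np.uint8)      # (H, W) single-label map
multi = np.zeros((8, 8, 3), dtype=np.uint8)    # (H, W, C) multi-label map (assumed layout)

assert crop(single, 2, 6, 1, 5).shape == (4, 4)
assert crop(multi, 2, 6, 1, 5).shape == (4, 4, 3)
```

By contrast, `single[2:6, 1:5, :]` raises an IndexError because a 2-D array has no third axis to slice.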

Asthestarsfalll · 2023-08-30T14:26:32Z

+                    data['gt_fields'].append(f'label_{idx}')
+
+        try:
+            del data['instances']


Use data.pop('instances', None) instead.

The structure of the COCOInstance API has been reworked.
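A short sketch of the two forms side by side. The `except KeyError` clause is an assumption, because the diff above is cut off right after the `del`.

```python
data = {"img": "a.jpg", "instances": [1, 2]}

# try/del form, as in the diff above (the except clause is assumed).
try:
    del data["instances"]
except KeyError:
    pass

# Equivalent one-liner: the default argument suppresses KeyError when the
# key is absent, so calling it on an already-cleaned dict is harmless.
data.pop("instances", None)
data.pop("instances", None)
```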

Asthestarsfalll · 2023-08-30T14:44:03Z

+        for i in range(num_classes):
+            pred_i = pred[:, i]
+            label_i = label[:, i]
+            intersect_i = paddle.logical_and(pred_i, label_i.astype(paddle.int32))


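It looks like the loop is no longer needed. The code above loops because all classes are flattened into a single h x w map; in the multi-label case the labels should be 0/1, so this can be computed directly.

Yes, it now takes the sum directly.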
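The equivalence the reviewer points to can be checked with a small NumPy sketch. The function names are hypothetical, and pred and label are assumed to be 0/1 arrays of shape (N, C, H, W).

```python
import numpy as np

def intersect_loop(pred, label):
    # Per-class loop, in the spirit of the diff above.
    out = []
    for i in range(pred.shape[1]):
        out.append(np.logical_and(pred[:, i], label[:, i]).sum())
    return np.array(out)

def intersect_direct(pred, label):
    # Every class already lives in its own channel, so one logical_and plus
    # one reduction that keeps the class axis covers all classes at once.
    return np.logical_and(pred, label).sum(axis=(0, 2, 3))

rng = np.random.default_rng(0)
pred = rng.integers(0, 2, size=(2, 3, 4, 4))
label = rng.integers(0, 2, size=(2, 3, 4, 4))
assert (intersect_loop(pred, label) == intersect_direct(pred, label)).all()
```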

Asthestarsfalll · 2023-08-30T14:45:24Z

+
+        label = np.concatenate(label, axis=0)
+
+        label[label == self.ignore_index] = 0


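Why is ignore_index set to 0 here? Setting it to 0 marks it as the negative class, whereas ignore should mean the pixel does not take part in the loss computation.

Corrected.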

Asthestarsfalll · 2023-08-30T14:48:35Z

+                value is 0 or 1, and if shape is more than 2D, this is
+                (N, C, D1, D2,..., Dk), k >= 1.
+        """
+        label = label.astype(paddle.float32)


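Should the positions equal to ignore_index be located here so that they take part in neither the positive nor the negative computation?

Corrected.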
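One way to keep ignored entries out of both groups is to derive a validity mask before the label is used, sketched here in NumPy. The variable names are illustrative and this is not the code that was merged.

```python
import numpy as np

IGNORE_INDEX = 255  # default value shown in the diffs above

label = np.array([[1, 0, 255],
                  [0, 255, 1]])

valid = label != IGNORE_INDEX     # entries that may contribute to the loss
is_pos = (label == 1) & valid     # positive-class entries
is_neg = (label == 0) & valid     # negative-class entries

# Ignored entries fall in neither group, unlike `label[label == 255] = 0`,
# which would silently turn them into negatives.
assert not (is_pos | is_neg)[~valid].any()
```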

Asthestarsfalll · 2023-08-30T14:52:08Z

+    """
+    def __init__(self, mode="train", use_multilabel=False, ignore_index=255):
+        self.mode = mode
+        self.use_multilabel = use_multilabel


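I think it makes more sense to add use_multilabel to the dataset and pass it through the config or builder. config_checker also needs a check on the loss (for loss functions that multi-label mode does not support).

I reworked the COCOInstance logic; allow_overlap (bool) now controls whether multi-label mode is enabled.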
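A loss check of the kind requested might look like the sketch below. The function name, the plain-dict config layout, and the exact set of allowed losses are assumptions; this is not PaddleSeg's config_checker API.

```python
# Loss names taken from this thread; treating exactly these as the
# multi-label-capable set is an assumption.
MULTILABEL_LOSSES = {
    "BCELoss",
    "MultiLabelCategoricalCrossEntropyLoss",
    "MultiLabelAsymmetricLoss",
}

def check_multilabel_losses(dataset_cfg, loss_types):
    """Raise if a multi-label dataset is paired with an unsupported loss."""
    if not dataset_cfg.get("allow_overlap", False):
        return  # single-label mode: nothing to check here
    unsupported = [t for t in loss_types if t not in MULTILABEL_LOSSES]
    if unsupported:
        raise ValueError(
            f"Multi-label mode does not support these losses: {unsupported}. "
            f"Supported losses: {sorted(MULTILABEL_LOSSES)}")
```

Failing at config-check time gives the user a readable message before any training step runs.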

Asthestarsfalll · 2023-08-30T14:58:54Z

@@ -0,0 +1,51 @@
+batch_size: 4


Perhaps this could inherit from the single-label config.

Changed to inherit.

lzm-build · 2023-08-31T08:07:39Z

@Asthestarsfalll I have made the changes as suggested and updated the PR branch.

Asthestarsfalll · 2023-08-31T12:44:27Z

+        self.num_classes = self.NUM_CLASSES
+        self.ignore_index = self.IGNORE_INDEX
+        self.allow_overlap = allow_overlap
+        self.add_background = add_background


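What does this parameter do? Is it related to multi-label?

The allow_overlap parameter states whether annotations are allowed to overlap: if so, the dataset is in multi-label mode; if not, single-label mode.
The add_background parameter states whether background should be added as a class.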
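The interplay of the two parameters can be sketched in NumPy as follows. The function `build_label`, the rule used to derive the background channel, and the tie-breaking in single-label mode are all assumptions made for illustration; the merged dataset code may differ.

```python
import numpy as np

def build_label(class_masks, allow_overlap=True, add_background=True):
    """class_masks: (C, H, W) array of 0/1, one channel per foreground class."""
    masks = class_masks.astype(np.uint8)
    if add_background:
        # Assumed rule: background is every pixel not covered by any class.
        background = (masks.sum(axis=0, keepdims=True) == 0).astype(np.uint8)
        masks = np.concatenate([background, masks], axis=0)
    if allow_overlap:
        return masks                       # multi-label: (C', H, W) multi-hot
    # Single-label: one class index per pixel. On overlap the lowest index
    # wins, which is an arbitrary choice made for this sketch.
    index_map = masks.argmax(axis=0)
    index_map[masks.sum(axis=0) == 0] = 255   # uncovered pixels become ignore
    return index_map

masks = np.array([[[1, 1, 0]],
                  [[0, 1, 0]]])            # two classes overlapping at pixel 1
multi = build_label(masks)                              # shape (3, 1, 3)
single = build_label(masks, allow_overlap=False)        # shape (1, 3)
```

With the background channel prepended, the first class is the background, which matches the description of the UWMGI results at the top of this PR.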

Asthestarsfalll · 2023-08-31T12:49:19Z

+                 weight=None,
+                 ignore_index=255,
+                 top_k_percent_pixels=1.0,
+                 avg_non_ignore=True,


Some of these parameters appear to be unused.

Asthestarsfalll · 2023-08-31T12:55:11Z

+        logit_pos = paddle.where(paddle.logical_and(label, mask),
+                                 logit, paddle.to_tensor(float("-inf")))
+        logit_neg = paddle.where(paddle.logical_or(label, ~mask),
+                                 paddle.to_tensor(float("-inf")), logit)


This part is hard to read. Building on the previous version of the code, just applying where to logit_pos and logit_neg should be enough.

    assert len(label.shape) == len(logit.shape)
    logit = logit.transpose([0, 2, 3, 1])
    logexp_one = paddle.zeros_like(logit[..., :1])
    logit_pos = paddle.where((label == 1), logit, paddle.to_tensor(float('inf')))
    logit_pos = paddle.concat([logexp_one, logit_pos], axis=-1)
    loss_pos = paddle.logsumexp(-logit_pos, axis=-1)
    logit_neg = paddle.where((label == 0), logit, paddle.to_tensor(float('-inf')))
    logit_neg = paddle.concat([logexp_one, logit_neg], axis=-1)
    loss_neg = paddle.logsumexp(logit_neg, axis=-1)
    loss = loss_pos + loss_neg
    mask = (label != self.ignore_index).astype('float32')
    loss = paddle.mean(loss) / (paddle.mean(mask) + self.EPS)
    label.stop_gradient = True
    mask.stop_gradient = True
    return loss

I rearranged it to follow the logical order of the formula, but since the ignore_index part has to be masked out, there is not much more to simplify.

Modified to improve readability.

Asthestarsfalll · 2023-08-31T12:56:05Z

+
+
+@manager.LOSSES.add_component
+class MultiLabelAsymmetricLoss(nn.Layer):


Is there a reference for this loss?

Asthestarsfalll · 2023-08-31T12:57:23Z

+        label_area = label.sum(0).sum(-1).sum(-1).astype("int64")
+        intersect = paddle.logical_and(
+            pred.astype("bool"), label.astype("bool")).astype("int64")
+        intersect_area = intersect.sum(0).sum(-1).sum(-1)


A single sum should be enough in these places.

Because label here has shape (bs, num_classes, h, w) and the num_classes dimension has to be kept.

sum supports multiple axes.
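In NumPy terms, the chained reductions and a single multi-axis reduction give the same per-class areas, as the sketch below shows. The IoU lines at the end are an illustrative extension rather than the PR's metric code, and the guard for empty classes is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
pred = rng.integers(0, 2, size=(2, 3, 4, 4))    # (bs, num_classes, h, w), 0/1
label = rng.integers(0, 2, size=(2, 3, 4, 4))

# Chained reductions, as in the diff above.
chained = label.sum(0).sum(-1).sum(-1)
# One reduction over batch, height and width; the class axis (1) is kept.
label_area = label.sum(axis=(0, 2, 3))
assert (chained == label_area).all() and label_area.shape == (3,)

intersect_area = np.logical_and(pred.astype(bool), label.astype(bool)).sum(axis=(0, 2, 3))
union_area = pred.sum(axis=(0, 2, 3)) + label_area - intersect_area
iou = intersect_area / np.maximum(union_area, 1)   # guard against empty classes
```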

Asthestarsfalll · 2023-08-31T13:00:01Z

I also suggest adding logic to config_checker that checks the loss for multi-label segmentation.

Asthestarsfalll · 2023-09-01T03:35:57Z

+        super(MultiLabelCategoricalCrossEntropyLoss, self).__init__()
+        self.ignore_index = ignore_index
+        self.EPS = 1e-8
+        self.data_format = data_format


Why was the data_format logic below removed again?

Asthestarsfalll · 2023-09-01T03:37:23Z

+
+        logexp_one = paddle.zeros_like(logit[..., :1])
+
+        logit_pos = paddle.where((label == 1), logit, paddle.to_tensor(float('inf')))


paddle.where should support non-boolean tensors, right? Just use label directly. Same below.
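One thing to keep in mind when passing the raw label as a condition: in NumPy, any nonzero value counts as true, so an ignore value such as 255 would be selected together with the real positives. The sketch below shows only this NumPy behavior; whether paddle.where accepts a non-boolean condition is not checked here.

```python
import numpy as np

label = np.array([1, 0, 255])       # 255 marks an ignored entry
logit = np.array([0.3, -1.2, 2.0])

# The raw label as condition treats every nonzero value as True, so the
# ignored entry is selected along with the real positive.
raw = np.where(label, logit, np.inf)
# An explicit comparison selects only the entries that are exactly 1.
explicit = np.where(label == 1, logit, np.inf)
```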

Asthestarsfalll · 2023-09-01T03:45:57Z

+        assert len(label.shape) == len(logit.shape)
+        logit = logit.transpose([0, 2, 3, 1])
+
+        logexp_one = paddle.zeros_like(logit[..., :1])


The name is a bit confusing; use zero instead.

Asthestarsfalll · 2023-09-01T03:50:01Z

+
+        loss = loss_pos + loss_neg
+        mask = (label != self.ignore_index).astype('float32')
+        loss = paddle.mean(loss) / (paddle.mean(mask) + self.EPS)


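Dividing by the mean of mask is meant to rescale the loss and remove the effect of the ignored values on its magnitude, but the preceding steps never actually ignore those values; a loss = loss * mask step is needed.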
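The effect of the missing multiplication can be seen in a small NumPy sketch. It assumes loss and mask have the same per-pixel shape; in the PR the mask is built from label, so it may first need to be reduced over the class axis. The numbers are made up.

```python
import numpy as np

EPS = 1e-8

def masked_mean(loss, mask):
    # Zero out ignored positions first, then rescale by the valid fraction so
    # the result equals the mean over valid positions only.
    loss = loss * mask
    return loss.mean() / (mask.mean() + EPS)

loss = np.array([0.2, 0.4, 9.0, 0.6])   # third entry sits on an ignored pixel
mask = np.array([1.0, 1.0, 0.0, 1.0])

without_mask = loss.mean() / (mask.mean() + EPS)   # still includes the 9.0
with_mask = masked_mean(loss, mask)                # mean of 0.2, 0.4, 0.6
```

Without the multiplication the rescaling inflates the result, because the ignored value stays in the numerator while the denominator shrinks.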

Asthestarsfalll · 2023-09-01T03:57:44Z

Please format the code according to the PR submission guidelines.

shiyutang · 2023-09-12T07:06:23Z

            'No model specified in the configuration file.'

        if self.config.train_dataset_cfg[
-                'type'] not in ['Dataset', 'SegDataset']:


Perhaps the two places in this file do not need to be changed? Why not perform the num_classes consistency check?

It now uses the Dataset API directly.

shiyutang · 2023-09-12T07:07:40Z

+import numpy as np
+import pycocotools.coco as cocoAPI
+import pycocotools.mask as maskUtils
+from paddle.io import Dataset


The paddle import should go in the next block, separated from third-party libraries.

It now uses the Dataset API directly.

shiyutang · 2023-09-12T07:11:12Z

+                added_image = utils.visualize.multi_label_visualize(
+                    im_path, pred, color_map, weight=0.6)
+                added_image_path = os.path.join(added_saved_dir, im_file)
+                mkdir(added_image_path)
+                cv2.imwrite(added_image_path, added_image)


This section is quite repetitive; maybe wrap it in a function?

Visualization for multi-label mode has been merged with the single-label path.

shiyutang · 2023-09-12T07:18:22Z

+train_dataset:
+  type: COCODataset
+  image_root: data/UWMGI/images/
+  json_file: data/UWMGI/annotations/train.json
+  add_background: True
+  use_multilabel: True
+  transforms:
+    - type: ResizeStepScaling
+      min_scale_factor: 0.5
+      max_scale_factor: 2.0
+      scale_step_size: 0.25
+    - type: RandomPaddingCrop
+      crop_size: [512, 512]
+    - type: RandomHorizontalFlip
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  mode: train
+
+val_dataset:
+  type: COCODataset
+  image_root: data/UWMGI/images/
+  json_file: data/UWMGI/annotations/val.json
+  add_background: True
+  use_multilabel: True
+  transforms:
+    - type: Resize
+      target_size: [2048, 512]
+      keep_ratio: True
+      size_divisor: 32
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  mode: val


The dataset section can be extracted and placed in _base_, as in configs/base/cityscapes.yml.

uwmgi.yml has been moved into _base_.

shiyutang · 2023-09-12T07:18:55Z

+* Install PaddlePaddle and relative environments based on the [installation guide](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html).
+* Install PaddleSeg based on the [reference](../../../docs/install.md).
+* Download the UWMGI dataset and link to PaddleSeg/data. 
+


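Please add instructions on how to prepare multi-label data; if necessary, add a conversion script.

A conversion script has been added, and usage instructions are listed in the documentation.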

shiyutang

The PR is already very complete overall; I only left a few small comments.

shiyutang · 2023-09-21T03:36:06Z

+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case15_day0_slice_0065.jpg">
+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case122_day18_slice_0092.jpg">
+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case130_day20_slice_0072.jpg">
+</p>


I can't seem to see the images on my side. You could try dragging the images into an issue comment box to get links:

shiyutang · 2023-09-21T03:56:32Z

+<p align="center">
+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case15_day0_slice_0065.jpg">
+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case122_day18_slice_0092.jpg">
+<img src="https://github.com/MINGtoMING/cache_ppseg_multilabelseg_readme_imgs/tree/main/assets/case130_day20_slice_0072.jpg">
+</p>
+


shiyutang · 2023-09-21T04:03:32Z

+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and


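Is this script reusable, i.e. can it convert all multi-label data in a given format?

The script has been updated to support converting UWMGI and mainstream COCO-style annotations to the format supported by the ppseg dataset API.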

shiyutang

LGTM

paddle-bot Bot added the contributor Contribution from developers label Aug 30, 2023

Asthestarsfalll suggested changes Aug 30, 2023

lzm-build force-pushed the add-multi-label-v1.0 branch from 60b6a69 to 8fe2fb9 Compare August 31, 2023 08:03

shiyutang changed the title ~~Add multi-label semantic segmentation support~~ [Feature] Add multi-label semantic segmentation support Aug 31, 2023

Asthestarsfalll reviewed Aug 31, 2023

Asthestarsfalll suggested changes Sep 1, 2023

shiyutang assigned juncaipeng Sep 1, 2023

lzm-build force-pushed the add-multi-label-v1.0 branch from 9729310 to 25ba1f7 Compare September 11, 2023 02:01

shiyutang requested changes Sep 12, 2023

lzm-build added 6 commits September 19, 2023 07:09

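Add conversion script for the UWMGI dataset

98fdf1d

Modify Dataset and the Compose op to support reading multi-label data

420047a

Add visualization support for inference results in multi-label mode

64d8548

Add support for semantic segmentation evaluation metrics in multi-label mode

a045e4c

Add support for passing the --use_multilabel argument in multi-label mode

7dc44dd

Add example config files and documentation for multi-label semantic segmentation on the UWMGI dataset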

0a5adbb

lzm-build force-pushed the add-multi-label-v1.0 branch from 9e0498a to 0a5adbb Compare September 18, 2023 23:16

lzm-build added 5 commits September 19, 2023 07:35

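Add example config files and documentation for multi-label semantic segmentation on the UWMGI dataset

ac747f8

Add auxiliary-class transform op for multi-label semantic segmentation

367a7df

Update data augmentation strategy to speed up convergence

eee7783

Add config file that uses the auxiliary-class transform op

2cfa00d

Update config files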

5c2eebc

shiyutang requested changes Sep 21, 2023

lzm-build added 4 commits September 22, 2023 05:44

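Update images

a5196cf

Update images

00f9f37

Update the script to support converting UWMGI and mainstream COCO-style annotations to the format supported by the ppseg dataset API

f50c33f

Update images and the commands for the conversion script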

8cf3d51

shiyutang approved these changes Sep 22, 2023

shiyutang merged commit 63f95e6 into PaddlePaddle:develop Sep 22, 2023

shiyutang added the Contributor PR is Merged label Sep 22, 2023

lzm-build mentioned this pull request Sep 22, 2023

🏅️ PaddlePaddle Suite Happy Open Source regular competition PaddlePaddle/PaddleOCR#10223

Closed

shiyutang mentioned this pull request Sep 27, 2023

Add multi-label semantic segmentation to PaddleSeg #3456

Closed

