-
Notifications
You must be signed in to change notification settings - Fork 227
Description
Dear authors, thanks for the exciting work and I'd like to apologize in advance if I misunderstood.
As you may already know, CrowdPose dataset itself is constituted by cherry-picked crowd samples selected from MSCOCO, MPII and AIC, but CrowdPose did not specify if they treated train/val/test images from MSCOCO/MPII/AIC differently. They also re-annotated (presumably more accurately) these samples.
What we have noticed is that many of the test images in "MS COCO val set" also present in "CrowdPose train" and "CrowdPose train/val" splits. Although CrowdPose has renamed all their images, we have identified at least 181 images in "CrowdPose train/val" having the same md5 info as in "MS COCO val set".
For example, "108951.jpg" in "CrowdPose train" and "000000147740.jpg" in "MS COCO val set" are the same image with md5: f9fc120dc085166b30c08da3de333b69
We did not identify any image overlap between CrowdPose and MPII/AIC on md5 level for both train and test images, possibly because CrowdPose did some preprocessing for selected MPII/AIC images, but based on the finding on COCO, the possibility for such train-test overlap with MPII/AIC is notable. We have not checked if "CrowdPose test" images also present in "COCO train set" yet.
So if I did not miss anything, the model jointly trained on COCO+AIC+MPII+CrowdPose would have seen many of the test images (with labels, at least for COCO) during the training process, making the results untrustworthy.