While training Group DETR, the training AP stays at 0. From a few articles I've read, it seems this issue might be related to the batch size: I am using a single A6000 GPU with a batch size of 4, whereas the original setup used 8 GPUs for a total batch size of 16. Could the smaller batch size affect training like this, and if so, should I also rescale the learning rate (see the sketch below)?
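
In case it helps clarify the question, this is the learning-rate adjustment I was considering, following the common linear scaling rule. It is only a sketch: `base_lr` and the batch-size numbers are placeholders from my own launch notes, not values taken from the Group DETR repo's config.

```python
# Sketch of linear learning-rate scaling for a smaller effective batch size.
# base_lr is a placeholder; substitute whatever lr the original config uses
# for the reference batch size of 16.

reference_batch_size = 16   # original setup: 8 GPUs x 2 images per GPU
my_batch_size = 4           # my setup: 1 x A6000, batch size 4

base_lr = 1e-4              # placeholder, not the repo's actual default
scaled_lr = base_lr * my_batch_size / reference_batch_size

print(f"lr scaled for batch size {my_batch_size}: {scaled_lr:.2e}")  # 2.50e-05
```

Is this kind of rescaling expected to be necessary here, or is AP staying at 0 more likely caused by something else?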