
PaddlePaddle 2.0rc high-level API: accuracy (acc) is displayed incorrectly when using model.fit #28673

@wk-mike

Description

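  • Version and environment information:
       1) PaddlePaddle version: 2.0rc
       2) CPU: i5 6600k
       3) GPU: GTX 1660 super, CUDA 10.2
       4) OS: Windows 10, Ubuntu 18.04

  • Training information:
       1) Single machine
       2) GPU memory: 6 GB

  • Reproduction information: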

import paddle
print(paddle.__version__)

train_dataset = paddle.vision.datasets.Cifar10(mode='train')
val_dataset =  paddle.vision.datasets.Cifar10(mode='test')

mnist = paddle.vision.models.resnet18(num_classes=10)

# Wrap the network in a Model instance for subsequent configuration, training, and evaluation
model = paddle.Model(mnist)

# Training configuration: set up the optimizer, the loss function, and the accuracy metric
model.prepare(paddle.optimizer.Adam(parameters=mnist.parameters()),
              paddle.nn.CrossEntropyLoss(),
              paddle.metric.Accuracy())

# Start training
model.fit(train_dataset,
          epochs=5,
          batch_size=128,
          verbose=1
          )

print(model.evaluate(val_dataset,batch_size=1, verbose=1))

Output

2.0.0-rc0
W1117 13:59:02.444118  9196 device_context.cc:338] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 10.2, Runtime API Version: 10.2
W1117 13:59:02.722856  9196 device_context.cc:346] device: 0, cuDNN Version: 7.6.
Epoch 1/5
D:\Anaconda3\anaconda\lib\site-packages\paddle\fluid\layers\utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  return (isinstance(seq, collections.Sequence) and
D:\Anaconda3\anaconda\lib\site-packages\paddle\nn\layer\norm.py:637: UserWarning: When training, we now always track global mean and variance.
  "When training, we now always track global mean and variance.")
step 391/391 [==============================] - loss: 1.2710 - acc: 0.1015 - 52ms/step           
Epoch 2/5
step 391/391 [==============================] - loss: 0.9901 - acc: 0.1022 - 53ms/step           
Epoch 3/5
step 391/391 [==============================] - loss: 1.0232 - acc: 0.1033 - 51ms/step           
Epoch 4/5
step 391/391 [==============================] - loss: 0.7586 - acc: 0.1037 - 53ms/step           
Epoch 5/5
step 391/391 [==============================] - loss: 0.5976 - acc: 0.1037 - 51ms/step           
Eval begin...
step 10000/10000 [==============================] - loss: 0.0107 - acc: 0.6796 - 11ms/step                
Eval samples: 10000
{'loss': [0.010651758], 'acc': 0.6796}
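  • Problem description: When training a CIFAR-10 classifier with resnet18, the acc reported during fit stays at around 10% (abnormal), while the final evaluation reports an acc of 67% (normal). The problem is tied to batch_size: when batch_size is not 1, the displayed acc is wrong; when batch_size=1, the displayed acc is correct.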

The code above reproduces the problem, and other users have run into it as well.
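One possible explanation, which this report does not confirm, is a shape mismatch between the predicted class indices and the labels inside the metric. The NumPy-only sketch below is an illustration under that assumption: it does not call Paddle, and `top1_accuracy`, the synthetic labels, and the roughly 70% correct predictions are all hypothetical. It shows that comparing an `(N, 1)` index array with an `(N,)` label array broadcasts to `(N, N)` and yields about 1/num_classes, while a batch of size 1 is unaffected.

```python
import numpy as np


def top1_accuracy(pred_idx, label):
    """Fraction of True entries in the elementwise comparison (hypothetical helper)."""
    return float((pred_idx == label).mean())


rng = np.random.default_rng(0)
n, num_classes = 128, 10

# Synthetic labels with shape (n,), as a dataset returning scalar labels would batch them.
label = rng.integers(0, num_classes, size=n)

# Hypothetical predictions: start from the labels, then corrupt roughly 30% of them
# with a non-zero offset so that those samples are guaranteed to be wrong.
pred = label.copy()
wrong = rng.random(n) > 0.7
pred[wrong] = (pred[wrong] + rng.integers(1, num_classes, size=wrong.sum())) % num_classes
pred_idx = pred.reshape(n, 1)  # shape (n, 1), like a top-1 index tensor

# (n, 1) == (n,) broadcasts to (n, n): every prediction is compared with every label.
broadcast_shape = (pred_idx == label).shape
broadcast_acc = top1_accuracy(pred_idx, label)

# (n, 1) == (n, 1) compares each prediction with its own label.
aligned_acc = top1_accuracy(pred_idx, label.reshape(n, 1))

# With a single sample, (1, 1) == (1,) still compares exactly one pair.
single_broadcast = top1_accuracy(pred_idx[:1], label[:1])
single_aligned = top1_accuracy(pred_idx[:1], label[:1].reshape(1, 1))

print("comparison shape with (n,) labels:", broadcast_shape)
print("accuracy with (n,) labels:   %.4f" % broadcast_acc)
print("accuracy with (n, 1) labels: %.4f" % aligned_acc)
```

If this is what happens in the reproduction script, reshaping the labels to `(N, 1)` before they reach the metric would be the thing to try. Whether `paddle.metric.Accuracy` in 2.0rc behaves this way needs to be checked against its source.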
