
test_dist_ctr failed in ci task #16312

@wanghaoshuang


CI task: http://ci.paddlepaddle.org/viewLog.html?buildId=71978&tab=buildLog&buildTypeId=PaddleMac_MacPrCi&logTab=tail

Error log:

[12:18:59]197/493 Test #192: test_dist_ctr ...................................***Failed    5.42 sec
[12:18:59]local_stderr: 
[12:18:59]W0319 20:20:25.035552 2543620992 graph.h:204] WARN: After a series of passes, the current graph can be quite different from OriginProgram. So, please avoid using the `OriginProgram()` method!
[12:18:59]
[12:18:59]
[12:18:59]local_stderr: 
[12:18:59]W0319 20:20:26.976441 2543620992 graph.h:204] WARN: After a series of passes, the current graph can be quite different from OriginProgram. So, please avoid using the `OriginProgram()` method!
[12:18:59]
[12:18:59]
[12:18:59]test_dist_ctr failed
[12:18:59] EE
[12:18:59]======================================================================
[12:18:59]ERROR: test_dist_ctr (test_dist_ctr.TestDistCTR2x2)
[12:18:59]----------------------------------------------------------------------
[12:18:59]Traceback (most recent call last):
[12:18:59]  File "/home/teamcity/work/e84e6e698a3f913d/build/python/paddle/fluid/tests/unittests/test_dist_ctr.py", line 27, in test_dist_ctr
[12:18:59]    self.check_with_place("dist_ctr.py", delta=1e-7, check_error_log=False)
[12:18:59]  File "/home/teamcity/work/e84e6e698a3f913d/build/python/paddle/fluid/tests/unittests/test_dist_base.py", line 531, in check_with_place
[12:18:59]    check_error_log)
[12:18:59]  File "/home/teamcity/work/e84e6e698a3f913d/build/python/paddle/fluid/tests/unittests/test_dist_base.py", line 340, in _run_local
[12:18:59]    sys.stderr.write('local_stdout: %s\n' % pickle.loads(local_out))
[12:18:59]  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1388, in loads
[12:18:59]    return Unpickler(file).load()
[12:18:59]  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 864, in load
[12:18:59]    dispatch[key](self)
[12:18:59]  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1153, in load_dup
[12:18:59]    self.append(self.stack[-1])
[12:18:59]IndexError: list index out of range
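
The traceback shows `pickle.loads(local_out)` failing inside `load_dup`. In the pickle opcode set the byte `2` is the DUP opcode, so one possible reading (an assumption, not confirmed by the log) is that the captured stdout did not start with pickled data, for example because a text line beginning with `2` was printed before the payload. The sketch below is not the code in `test_dist_base.py`; `safe_unpickle` and the polluted sample are hypothetical names and data used only to show how a wrapper could report the offending bytes instead of an opaque `IndexError`.

```python
import pickle


def safe_unpickle(data):
    """Return (obj, None) on success, or (None, message) if data is not a valid pickle."""
    try:
        return pickle.loads(data), None
    except Exception as exc:  # corrupt input can raise several exception types
        head = repr(data[:40])
        return None, "%s: %s; first bytes: %s" % (type(exc).__name__, exc, head)


# A clean payload round-trips normally.
good = pickle.dumps([0.1, 0.2])
obj, err = safe_unpickle(good)

# Hypothetical polluted stdout: a text line printed in front of the payload.
# Its first byte is "2", which pickle interprets as the DUP opcode.
polluted = b"2019-03-19 some log line\n" + good
obj2, err2 = safe_unpickle(polluted)

if err2 is not None:
    print("could not unpickle local_out -> " + err2)
```

Printing the first bytes of the output on failure would make it clear whether the subprocess wrote log text to stdout, or whether the output was truncated or empty.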

Related code:
test_dist_base: https://github.com/PaddlePaddle/Paddle/blame/develop/python/paddle/fluid/tests/unittests/test_dist_base.py#L365

PR: #16226

Possible causes:

  • Contention for compute resources between unit tests?
  • Unit tests using the same local path for checkpoints?
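
To rule out the shared-path hypothesis, each test process could write to its own scratch directory. This is a minimal sketch under that assumption; `make_isolated_dir`, `cleanup_dir`, and the `checkpoints` subdirectory name are hypothetical and do not come from the Paddle test code.

```python
import os
import shutil
import tempfile


def make_isolated_dir(prefix="dist_test_"):
    """Create a fresh directory that no other test process shares."""
    return tempfile.mkdtemp(prefix=prefix)


def cleanup_dir(path):
    """Remove the scratch directory; ignore errors if it is already gone."""
    shutil.rmtree(path, ignore_errors=True)


# Two tests running concurrently get distinct directories.
dir_a = make_isolated_dir()
dir_b = make_isolated_dir()

# Hypothetical checkpoint location inside the isolated directory.
checkpoint_path = os.path.join(dir_a, "checkpoints")
os.mkdir(checkpoint_path)
```

If the failure still reproduces with isolated directories, the shared-checkpoint explanation becomes less likely and resource contention is the remaining candidate from the list above.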
