update rocprofiler-sdk with stuff in 0.8.0 #513

Merged
i-chaochen merged 5 commits into rocm-jaxlib-v0.7.1 from ci_cj_rocprofv3_v1_rocm-jaxlib-v0.7.1
Jan 12, 2026

@cj401-amd cj401-amd commented Jan 9, 2026

Motivation

  • Backport the rocprofiler-sdk changes from 0.8.0 (see "update rocprofiler-sdk (v3) and roctracer (v1)", #473).

  • Kernel details are still missing from the trace file when building from rocm-jax.

  • Ran python3 profiler_test.py from jax ci_cj_profiler_test_rocm-jaxlib-v0.8.0; the test requires xprof (which may be needed for CI later).

[ RUN      ] ProfilerTest.test_rocm_gpu_events_present_for_many_matmul_shapes
W0109 15:19:23.492738 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1767971963.496396 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971963.498741 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0000 00:00:1767971963.498752 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0109 15:19:27.411699 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971967.413357 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971967.415660 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0000 00:00:1767971967.415669 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0109 15:19:32.673717 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971972.678285 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971972.680584 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0000 00:00:1767971972.680594 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0109 15:19:36.813895 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971976.815375 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971976.818270 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0000 00:00:1767971976.818290 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0109 15:19:42.465533 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971982.467098 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971982.470008 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0000 00:00:1767971982.470021 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0109 15:19:46.431867 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971986.433316 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971986.435200 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
W0000 00:00:1767971986.435211 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0109 15:19:50.702192 140051182680960 raw_to_tool_data.py:103] Received old tool format: trace_viewer@^; mapped to new format: trace_viewer@
W0000 00:00:1767971990.704211 1297529 derived_timeline.cc:590] Found derived XLine with clashing display ID: 1. This will cause rendering issues in Trace Viewer.
W0000 00:00:1767971990.707054 1297529 trace_events.cc:433] Remapping device id 701for host 1 to 1701
W0000 00:00:1767971990.707074 1297529 trace_events.cc:433] Remapping device id 1for host 1 to 1001
[       OK ] ProfilerTest.test_rocm_gpu_events_present_for_many_matmul_shapes
----------------------------------------------------------------------
Ran 21 tests in 63.537s

build:ci_multi_gpu --experimental_guard_against_concurrent_changes
build:ci_multi_gpu --test_env=HIP_VISIBLE_DEVICES=0,1,2,3
build:ci_multi_gpu --strategy=TestRunner=local

Collaborator

Why does your rocprof-sdk PR include this change? I don't think it is relevant; please don't put unrelated changes in this profiling backport PR.

Author

Those were mainly for the CI tests. Without them, CI fails straight away because ci_single/multi_gpu require those definitions. It seems we still get failures that are unrelated to the backported changes.

Collaborator

@i-chaochen i-chaochen Jan 11, 2026

If this is only for the CI tests because of #453 and f6785ef, please create a separate PR.

cc @alekstheod

Author

@cj401-amd cj401-amd Jan 12, 2026

  • six failed ci_multi_gpu tests on a local MI300
bazel --bazelrc=/work/xla/build_tools/rocm/rocm_xla.bazelrc test \
  --config=rocm_ci \
  --config=ci_multi_gpu \
  --test_output=errors \
  --spawn_strategy=local \
  --strategy=TestRunner=local \
  --repo_env=TF_ROCM_AMDGPU_TARGETS=gfx942,gfx90a \
  //xla/tests:array_elementwise_ops_test_amdgpu_any \
  //xla/tests:convert_test_amdgpu_any \
  //xla/tests:dot_operation_test_autotune_disabled_amdgpu_any \
  //xla/tests:iota_test_amdgpu_any \
  //xla/service/gpu/transforms:command_buffer_scheduling_test_amdgpu_any \
  //xla/tests:local_client_execute_test_amdgpu_any
  1. //xla/tests:array_elementwise_ops_test_amdgpu_any
INFO: Found 1 test target...
Target //xla/tests:array_elementwise_ops_test_amdgpu_any up-to-date:
  bazel-bin/xla/tests/array_elementwise_ops_test_amdgpu_any
INFO: Elapsed time: 240.038s, Critical Path: 237.85s
INFO: 12 processes: 7 internal, 5 local.
INFO: Build completed successfully, 12 total actions
//xla/tests:array_elementwise_ops_test_amdgpu_any                        PASSED in 212.8s

Executed 1 out of 1 test: 1 test passes.
  2. //xla/tests:convert_test_amdgpu_any
-- Test timed out at 2026-01-12 18:13:37 UTC --
================================================================================
INFO: Found 1 test target...
Target //xla/tests:convert_test_amdgpu_any up-to-date:
  bazel-bin/xla/tests/convert_test_amdgpu_any
INFO: Elapsed time: 928.377s, Critical Path: 681.32s
INFO: 8616 processes: 75 internal, 8541 local.
INFO: Build completed successfully, 8616 total actions
//xla/tests:convert_test_amdgpu_any                                       FLAKY, failed in 1 out of 2 in 300.7s
  Stats over 2 runs: max = 300.7s, min = 222.3s, avg = 261.5s, dev = 39.2s
  /root/.cache/bazel/_bazel_root/ea1efa0977f8828bf242d5b6a382af7f/execroot/xla/bazel-out/k8-opt/testlogs/xla/tests/convert_test_amdgpu_any/test_attempts/attempt_1.log

Executed 1 out of 1 test: 1 test passes.
  3. //xla/tests:dot_operation_test_autotune_disabled_amdgpu_any
INFO: Found 1 test target...
Target //xla/tests:dot_operation_test_autotune_disabled_amdgpu_any up-to-date:
  bazel-bin/xla/tests/dot_operation_test_autotune_disabled_amdgpu_any
INFO: Elapsed time: 88.008s, Critical Path: 86.59s
INFO: 16 processes: 9 internal, 7 local.
INFO: Build completed successfully, 16 total actions
//xla/tests:dot_operation_test_autotune_disabled_amdgpu_any              PASSED in 63.4s

Executed 1 out of 1 test: 1 test passes.
  4. //xla/tests:iota_test_amdgpu_any
[ RUN      ] IotaR2TestInstantiation/IotaR2Test.DoIt/1171
I0000 00:00:1768218999.961985 3016214 se_gpu_pjrt_client.cc:1381] Using BFC allocator.
I0000 00:00:1768218999.962074 3016214 gpu_helpers.cc:136] XLA backend allocating 16491332239 bytes on device 0 for BFCAllocator.
I0000 00:00:1768218999.962097 3016214 gpu_helpers.cc:136] XLA backend allocating 16491332239 bytes on device 1 for BFCAllocator.
I0000 00:00:1768218999.962110 3016214 gpu_helpers.cc:136] XLA backend allocating 16491332239 bytes on device 2 for BFCAllocator.
I0000 00:00:1768218999.962119 3016214 gpu_helpers.cc:136] XLA backend allocating 16491332239 bytes on device 3 for BFCAllocator.
I0000 00:00:1768218999.962128 3016214 gpu_helpers.cc:177] XLA backend will use up to 189650320752 bytes on device 0 for CollectiveBFCAllocator.
I0000 00:00:1768218999.962136 3016214 gpu_helpers.cc:177] XLA backend will use up to 189650320752 bytes on device 1 for CollectiveBFCAllocator.
I0000 00:00:1768218999.962144 3016214 gpu_helpers.cc:177] XLA backend will use up to 189650320752 bytes on device 2 for CollectiveBFCAllocator.
I0000 00:00:1768218999.962151 3016214 gpu_helpers.cc:177] XLA backend will use up to 189650320752 bytes on device 3 for CollectiveBFCAllocator.
-- Test timed out at 2026-01-12 11:56:40 UTC --

WARNING: Build options --action_env, --run_under, and --test_env have changed, discarding analysis cache (this can be expensive, see https://bazel.build/advanced/performance/iteration-speed).
INFO: Analyzed target //xla/tests:iota_test_amdgpu_any (376 packages loaded, 53666 targets configured).
INFO: Found 1 test target...
Target //xla/tests:iota_test_amdgpu_any up-to-date:
  bazel-bin/xla/tests/iota_test_amdgpu_any
INFO: Elapsed time: 6654.189s, Critical Path: 301.90s
INFO: 8663 processes: 77 internal, 8586 local.
INFO: Build completed successfully, 8663 total actions
//xla/tests:iota_test_amdgpu_any                                         PASSED in 126.7s
  Stats over 50 runs: max = 126.7s, min = 123.1s, avg = 124.5s, dev = 0.7s

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
  5. //xla/service/gpu/transforms:command_buffer_scheduling_test_amdgpu_any
[ RUN      ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand
2026-01-12 12:34:57.785081: W ./xla/service/compiler.h:234] Ignoring the buffer assignment proto provided.
xla/service/gpu/transforms/command_buffer_scheduling_test.cc:1477: Failure
Value of: RunAndCompareTwoModulesReplicated(std::move(m_ref), std::move(m), true, true, std::nullopt)
  Actual: false (UNIMPLEMENTED: Empty nodes are not supported on ROCM.)
Expected: true

[  FAILED  ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand (338 ms)
[ RUN      ] CommandBufferSchedulingTest.AllGatherStartFollowedByDone
[       OK ] CommandBufferSchedulingTest.AllGatherStartFollowedByDone (3 ms)
[ RUN      ] CommandBufferSchedulingTest.MoveGTEs
[       OK ] CommandBufferSchedulingTest.MoveGTEs (3 ms)
[ RUN      ] CommandBufferSchedulingTest.SingleCommandBuffer
[       OK ] CommandBufferSchedulingTest.SingleCommandBuffer (1 ms)
[----------] 29 tests from CommandBufferSchedulingTest (3838 ms total)

[----------] Global test environment tear-down
[==========] 30 tests from 2 test suites ran. (4605 ms total)
[  PASSED  ] 27 tests.
[  SKIPPED ] 2 tests, listed below:
[  SKIPPED ] CommandBufferSchedulingTest.Conditional
[  SKIPPED ] CommandBufferSchedulingTest.While
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand

 1 FAILED TEST

  6. //xla/tests:local_client_execute_test_amdgpu_any
[ RUN      ] LocalClientExecuteTest.CompilePartitionedExecutable
2026-01-12 12:38:10.563499: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
xla/tests/local_client_execute_test.cc:767: Failure
Expected equality of these values:
  2
  executables.size()
    Which is: 1

[  FAILED  ] LocalClientExecuteTest.CompilePartitionedExecutable (34 ms)
[ RUN      ] LocalClientExecuteTest.AddArraysWithDifferentInputLayouts
2026-01-12 12:38:10.597713: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.AddArraysWithDifferentInputLayouts (57 ms)
[ RUN      ] LocalClientExecuteTest.Constant
2026-01-12 12:38:10.655456: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.Constant (8 ms)
[ RUN      ] LocalClientExecuteTest.SizeOfGeneratedCodeInBytes
2026-01-12 12:38:10.664086: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.SizeOfGeneratedCodeInBytes (34 ms)
[ RUN      ] LocalClientExecuteTest.InfeedOutfeedTest
2026-01-12 12:38:10.698750: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.InfeedOutfeedTest (29 ms)
[ RUN      ] LocalClientExecuteTest.ValidateMemoryFittingLevel
2026-01-12 12:38:10.728554: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.ValidateMemoryFittingLevel (30 ms)
[ RUN      ] LocalClientExecuteTest.ShapeBufferToLiteralConversion
2026-01-12 12:38:10.759389: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.ShapeBufferToLiteralConversion (2 ms)
[ RUN      ] LocalClientExecuteTest.AddScalars
2026-01-12 12:38:10.762242: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.AddScalars (29 ms)
[ RUN      ] LocalClientExecuteTest.ValidateOptimizationLevel
2026-01-12 12:38:10.792232: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.ValidateOptimizationLevel (23 ms)
[ RUN      ] LocalClientExecuteTest.TupleArguments
2026-01-12 12:38:10.815470: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.TupleArguments (31 ms)
[ RUN      ] LocalClientExecuteTest.LargeNestedTuple
2026-01-12 12:38:10.846715: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.LargeNestedTuple (5281 ms)
[ RUN      ] LocalClientExecuteTest.ValidateExecTimeOptimizationEffort
2026-01-12 12:38:16.128208: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.ValidateExecTimeOptimizationEffort (28 ms)
[ RUN      ] LocalClientExecuteTest.RunOnStreamForWrongPlatform
2026-01-12 12:38:16.156942: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.RunOnStreamForWrongPlatform (5 ms)
[ RUN      ] LocalClientExecuteTest.DeepTuple
2026-01-12 12:38:16.162605: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.DeepTuple (131 ms)
[ RUN      ] LocalClientExecuteTest.ValidateDeviceMemorySize
2026-01-12 12:38:16.294098: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.ValidateDeviceMemorySize (24 ms)
[ RUN      ] LocalClientExecuteTest.ValidateFDOProfile
2026-01-12 12:38:16.318581: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
2026-01-12 12:38:16.324076: I xla/service/gpu/gpu_hlo_schedule.cc:342] Attempting to parse as a binary proto.
2026-01-12 12:38:16.324101: I xla/service/gpu/gpu_hlo_schedule.cc:347] Not a binary proto, attempt to parse it as a text proto.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1768221496.324192 3356912 text_format.cc:378] Error parsing text-format tensorflow.profiler.ProfiledInstructionsProto: 1:8: Message type "tensorflow.profiler.ProfiledInstructionsProto" has no field named "Testing".
2026-01-12 12:38:16.324219: E xla/service/gpu/gpu_hlo_schedule.cc:356] Unable to parse fdo_profile: not a valid text or binary ProfiledInstructionsProto
[       OK ] LocalClientExecuteTest.ValidateFDOProfile (26 ms)
[ RUN      ] LocalClientExecuteTest.AddArraysWithDifferentOutputLayouts
2026-01-12 12:38:16.344869: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.AddArraysWithDifferentOutputLayouts (58 ms)
[ RUN      ] LocalClientExecuteTest.RunOnAllDeviceOrdinals
2026-01-12 12:38:16.403325: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
[       OK ] LocalClientExecuteTest.RunOnAllDeviceOrdinals (34 ms)
[----------] 37 tests from LocalClientExecuteTest (11475 ms total)

[----------] Global test environment tear-down
[==========] 37 tests from 1 test suite ran. (11475 ms total)
[  PASSED  ] 36 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] LocalClientExecuteTest.CompilePartitionedExecutable

 1 FAILED TEST
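
The per-target Bazel result lines in the logs above (for example "PASSED in 212.8s" or "FLAKY, failed in 1 out of 2 in 300.7s") can be tallied with a small script when comparing runs. The following is only a minimal Python sketch, assuming the Bazel output has been saved as text; parse_bazel_results is a hypothetical helper, not part of Bazel or XLA, and the set of status words it recognises is an assumption based on the lines shown here.

```python
import re

# Matches Bazel per-target result lines such as
#   //xla/tests:foo    PASSED in 212.8s
#   //xla/tests:bar    FLAKY, failed in 1 out of 2 in 300.7s
# The status list is an assumption; extend it if other statuses appear.
RESULT_RE = re.compile(
    r"^(?P<target>//\S+)\s+(?P<status>PASSED|FAILED|FLAKY|TIMEOUT)\b"
    r".*?in (?P<secs>\d+(?:\.\d+)?)s\s*$"
)


def parse_bazel_results(log_text):
    """Return {target: (status, seconds)} for each result line in log_text."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            results[match.group("target")] = (
                match.group("status"),
                float(match.group("secs")),
            )
    return results


# Two result lines copied from the logs above.
sample = """\
//xla/tests:array_elementwise_ops_test_amdgpu_any                        PASSED in 212.8s
//xla/tests:convert_test_amdgpu_any                                       FLAKY, failed in 1 out of 2 in 300.7s
"""

results = parse_bazel_results(sample)
for target, (status, secs) in results.items():
    print(f"{status:7s} {secs:7.1f}s  {target}")
```

Lines that are not result lines (INFO messages, gtest output) are simply ignored, so the whole captured log can be passed in unchanged.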

Author

  • cherry-pick from 926af0e to 0.7.1

The following two need further investigation

//xla/service/gpu/transforms:command_buffer_scheduling_test_amdgpu_any \ 
//xla/tests:local_client_execute_test_amdgpu_any
  1. //xla/service/gpu/transforms:command_buffer_scheduling_test_amdgpu_any
[  SKIPPED ] CommandBufferSchedulingTest.Conditional (0 ms)
[ RUN      ] CommandBufferSchedulingTest.CollectCommandBufferSequence
[       OK ] CommandBufferSchedulingTest.CollectCommandBufferSequence (0 ms)
[ RUN      ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand
2026-01-12 17:55:42.615535: W ./xla/service/compiler.h:234] Ignoring the buffer assignment proto provided.
xla/service/gpu/transforms/command_buffer_scheduling_test.cc:1477: Failure
Value of: RunAndCompareTwoModulesReplicated(std::move(m_ref), std::move(m), true, true, std::nullopt)
  Actual: false (UNIMPLEMENTED: Empty nodes are not supported on ROCM.)
Expected: true

[  FAILED  ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand (386 ms)
[ RUN      ] CommandBufferSchedulingTest.While
xla/service/gpu/transforms/command_buffer_scheduling_test.cc:962: Skipped
Not supported for ROCm!

[  SKIPPED ] CommandBufferSchedulingTest.While (1 ms)
[ RUN      ] CommandBufferSchedulingTest.CollectivePermuteStartFollowedByAnotherStart
[       OK ] CommandBufferSchedulingTest.CollectivePermuteStartFollowedByAnotherStart (3 ms)
[ RUN      ] CommandBufferSchedulingTest.ReduceScatterStartFollowedByDone
[       OK ] CommandBufferSchedulingTest.ReduceScatterStartFollowedByDone (1 ms)
[----------] 29 tests from CommandBufferSchedulingTest (5484 ms total)

[----------] Global test environment tear-down
[==========] 30 tests from 2 test suites ran. (7164 ms total)
[  PASSED  ] 27 tests.
[  SKIPPED ] 2 tests, listed below:
[  SKIPPED ] CommandBufferSchedulingTest.Conditional
[  SKIPPED ] CommandBufferSchedulingTest.While
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand

 1 FAILED TEST

  2. //xla/tests:local_client_execute_test_amdgpu_any
[ RUN      ] LocalClientExecuteTest.CompilePartitionedExecutable
2026-01-12 17:59:44.042897: I xla/service/platform_util.cc:84] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
2026-01-12 17:59:45.376245: I xla/service/service.cc:163] XLA service 0x55bfd2d516d0 initialized for platform ROCM (this does not guarantee that XLA will be used). Devices:
2026-01-12 17:59:45.376623: I xla/service/service.cc:171]   StreamExecutor device (0): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376629: I xla/service/service.cc:171]   StreamExecutor device (1): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376634: I xla/service/service.cc:171]   StreamExecutor device (2): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376638: I xla/service/service.cc:171]   StreamExecutor device (3): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376642: I xla/service/service.cc:171]   StreamExecutor device (4): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376645: I xla/service/service.cc:171]   StreamExecutor device (5): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376649: I xla/service/service.cc:171]   StreamExecutor device (6): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
2026-01-12 17:59:45.376653: I xla/service/service.cc:171]   StreamExecutor device (7): gfx942:sramecc+:xnack-, AMDGPU ISA version: gfx942:sramecc+:xnack-
xla/tests/local_client_execute_test.cc:767: Failure
Expected equality of these values:
  2
  executables.size()
    Which is: 1

[  FAILED  ] LocalClientExecuteTest.CompilePartitionedExecutable (1427 ms)
[----------] 1 test from LocalClientExecuteTest (1427 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1427 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] LocalClientExecuteTest.CompilePartitionedExecutable

 1 FAILED TEST
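
For follow-up investigation, the gtest output above can be reduced to the names of the failed and skipped tests. This is a minimal Python sketch under the assumption that the log uses the "[  FAILED  ] Suite.Test" and "[  SKIPPED ] Suite.Test" line format seen in this thread; summarize_gtest is a hypothetical helper, not part of GoogleTest or XLA.

```python
import re

# Matches gtest lines that name a single test, with or without a trailing
# "(N ms)" timing, e.g.
#   [  FAILED  ] Suite.Test
#   [  SKIPPED ] Suite.Test (0 ms)
# Count lines such as "[  FAILED  ] 1 test, listed below:" do not match,
# because they contain no "Suite.Test" token.
TEST_LINE_RE = re.compile(
    r"^\[\s*(FAILED|SKIPPED)\s*\]\s+([\w/]+\.[\w/]+)(?:\s+\(\d+ ms\))?$"
)


def summarize_gtest(log_text):
    """Return {"FAILED": [...], "SKIPPED": [...]} with sorted, unique names."""
    found = {"FAILED": set(), "SKIPPED": set()}
    for line in log_text.splitlines():
        match = TEST_LINE_RE.match(line.strip())
        if match:
            found[match.group(1)].add(match.group(2))
    return {status: sorted(names) for status, names in found.items()}


# Summary block copied from the command_buffer_scheduling_test log above.
sample = """\
[  PASSED  ] 27 tests.
[  SKIPPED ] 2 tests, listed below:
[  SKIPPED ] CommandBufferSchedulingTest.Conditional
[  SKIPPED ] CommandBufferSchedulingTest.While
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandBufferSchedulingTest.DynamicSliceFusionWithDynamicAddressesNotACommand
"""

summary = summarize_gtest(sample)
print(summary)
```

Because names are collected into sets, a test that appears both in the running log and in the final summary is reported once.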

Collaborator

@i-chaochen i-chaochen left a comment

If these test failures are CI quirks and irrelevant to your PR, then you can merge it to unblock the newer 0.7.1 release.

@i-chaochen i-chaochen merged commit 00c7948 into rocm-jaxlib-v0.7.1 Jan 12, 2026
6 of 9 checks passed