PR #35132: [XLA:GPU] Update HLO cublas workspace size after autotuner select the algorithm by copybara-service[bot] · Pull Request #35342 · openxla/xla

copybara-service · 2025-12-16T09:35:46Z

PR #35132: [XLA:GPU] Update HLO cublas workspace size after autotuner select the algorithm

Imported from GitHub PR #35132

📝 Summary of Changes

This PR introduces a pass that updates the workspace size for cuBLAS/cuBLASLt GEMM operations after autotuning has selected a specific algorithm. The GemmRewriter pass allocates workspace conservatively, before autotuning runs. After autotuning, we know the exact algorithm selected and can query its actual workspace requirement, potentially reducing memory usage.
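The idea can be sketched as follows. This is a minimal illustration, not the code from the PR: `GemmCall`, `WorkspaceQuery`, and `UpdateWorkspaceSizes` are hypothetical names invented for this sketch and are not XLA types or APIs. It also assumes the pass only ever shrinks the workspace, never grows it.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for a GEMM custom call in an HLO module (not an XLA type).
struct GemmCall {
  int64_t workspace_bytes;                // conservative size chosen before autotuning
  std::optional<int> selected_algorithm;  // set by the autotuner; empty if not autotuned
};

// Hypothetical callback returning the workspace a given algorithm actually needs.
using WorkspaceQuery = std::function<int64_t(int algorithm)>;

// Shrinks each autotuned call's workspace to what its selected algorithm requires.
// Calls without a selected algorithm keep their conservative size.
// Returns true if any call was changed.
bool UpdateWorkspaceSizes(std::vector<GemmCall>& calls,
                          const WorkspaceQuery& query) {
  bool changed = false;
  for (GemmCall& call : calls) {
    if (!call.selected_algorithm.has_value()) continue;
    const int64_t required = query(*call.selected_algorithm);
    if (required < call.workspace_bytes) {
      call.workspace_bytes = required;
      changed = true;
    }
  }
  return changed;
}
```

Returning a `changed` flag mirrors the common compiler-pass convention of reporting whether the module was modified. Whether the actual pass does the same is not stated here.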

🎯 Justification
Sizing the workspace to the selected algorithm's actual requirement can reduce memory usage.

🚀 Kind of Contribution
⚡️ Performance Improvement

🧪 Unit Tests:
Existing gemm tests should cover the workspace size config.

Copybara import of the project:

--
a6ed265 by Shawn Wang [email protected]:

Update cublas workspace size with the exact size extracted from algorithm

--
d67a48a by Shawn Wang [email protected]:

fix comments

--
613e090 by Shawn Wang [email protected]:

add unittest

Merging this change closes #35132

FUTURE_COPYBARA_INTEGRATE_REVIEW=#35132 from shawnwang18:shawnw/cublas_workspace 613e090

COPYBARA_INTEGRATE_REVIEW=#35132 from shawnwang18:shawnw/cublas_workspace 613e090
PiperOrigin-RevId: 845601031

copybara-service Bot force-pushed the test_845138789 branch 6 times, most recently from 93b0f83 to 96d9ada Compare December 17, 2025 06:38

copybara-service Bot force-pushed the test_845138789 branch from 96d9ada to 907b576 Compare December 17, 2025 07:04

copybara-service Bot merged commit 907b576 into main Dec 17, 2025

copybara-service Bot deleted the test_845138789 branch December 17, 2025 07:04

copybara-service[bot] merged 1 commit into main from test_845138789
