PR #35132: [XLA:GPU] Update HLO cublas workspace size after autotuner select the algorithm#35342

Merged
copybara-service[bot] merged 1 commit into main from test_845138789 on Dec 17, 2025
@copybara-service

PR #35132: [XLA:GPU] Update HLO cublas workspace size after autotuner select the algorithm

Imported from GitHub PR #35132

📝 Summary of Changes

This PR introduces a pass that updates the workspace size for cuBLAS/cuBLASLt GEMM operations after autotuning has selected a specific algorithm. The GemmRewriter pass runs before autotuning, so it allocates workspace conservatively. After autotuning, the exact algorithm is known and its actual workspace requirement can be queried, potentially reducing memory usage.
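
The idea described above can be illustrated with a small, self-contained model. This is only a sketch, not the code from the PR: `GemmCall`, `update_workspace_sizes`, the 32 MiB conservative size, and the per-algorithm workspace table are all hypothetical names and values chosen for illustration, and the choice to only ever shrink a reservation is an assumption of this sketch rather than confirmed behavior of the pass.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional

MIB = 1024 * 1024

# Hypothetical conservative reservation made before autotuning.
CONSERVATIVE_WORKSPACE_BYTES = 32 * MIB

# Hypothetical workspace requirement per algorithm id (illustrative values).
ALGORITHM_WORKSPACE_BYTES = {0: 0, 1: 4 * MIB, 2: 32 * MIB}


@dataclass(frozen=True)
class GemmCall:
    """Toy stand-in for a GEMM custom call in the HLO module."""
    name: str
    workspace_bytes: int
    selected_algorithm: Optional[int] = None  # set once autotuning has run


def update_workspace_sizes(
    calls: List[GemmCall],
    workspace_for_algorithm: Callable[[int], int],
) -> List[GemmCall]:
    """Return calls whose workspace matches the selected algorithm's need."""
    updated = []
    for call in calls:
        if call.selected_algorithm is None:
            # Not autotuned: keep the conservative reservation.
            updated.append(call)
            continue
        required = workspace_for_algorithm(call.selected_algorithm)
        if required < call.workspace_bytes:
            # Assumption of this sketch: only shrink, never grow.
            call = replace(call, workspace_bytes=required)
        updated.append(call)
    return updated


calls = [
    GemmCall("gemm.1", CONSERVATIVE_WORKSPACE_BYTES, selected_algorithm=1),
    GemmCall("gemm.2", CONSERVATIVE_WORKSPACE_BYTES, selected_algorithm=0),
    GemmCall("gemm.3", CONSERVATIVE_WORKSPACE_BYTES),  # never autotuned
]
tuned = update_workspace_sizes(calls, ALGORITHM_WORKSPACE_BYTES.__getitem__)
saved_bytes = sum(c.workspace_bytes for c in calls) - sum(
    c.workspace_bytes for c in tuned
)
```

In this toy run, `gemm.1` shrinks to 4 MiB, `gemm.2` to 0 bytes, and `gemm.3` keeps its conservative size because no algorithm was selected for it.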

🎯 Justification
Sizing the workspace to the selected algorithm's actual requirement can potentially reduce memory usage.

🚀 Kind of Contribution
⚡️ Performance Improvement

🧪 Unit Tests:
Existing gemm tests should cover the workspace size config.

Copybara import of the project:

--
a6ed265 by Shawn Wang [email protected]:

Update cublas workspace size with the exact size extracted from algorithm

--
d67a48a by Shawn Wang [email protected]:

fix comments

--
613e090 by Shawn Wang [email protected]:

add unittest

Merging this change closes #35132

FUTURE_COPYBARA_INTEGRATE_REVIEW=#35132 from shawnwang18:shawnw/cublas_workspace 613e090

@copybara-service copybara-service Bot force-pushed the test_845138789 branch 6 times, most recently from 93b0f83 to 96d9ada Compare December 17, 2025 06:38
COPYBARA_INTEGRATE_REVIEW=#35132 from shawnwang18:shawnw/cublas_workspace 613e090
PiperOrigin-RevId: 845601031
@copybara-service copybara-service Bot merged commit 907b576 into main Dec 17, 2025
@copybara-service copybara-service Bot deleted the test_845138789 branch December 17, 2025 07:04