[BUG] Achieve ROCKET GPU kernel and feature parity using CPU kernel generation#3227
Adityakushwaha2006 wants to merge 3 commits into aeon-toolkit:main from
Conversation
…l parity along with feature divergence <1e-4
MatthewMiddlehurst left a comment
Good in principle that it is now equal but not sure about some bits.
This seems a bit hacky with the `_convert_cpu_kernels_to_gpu_format` function rather than just incorporating the output directly. Removal of parameters would most likely require deprecation. It would be good if @hadifawaz1999 could review.
```diff
@@ -152,4 +155,5 @@ def test_rocket_cpu_gpu(n_channels):
     X_transform_cpu = rocket_cpu.transform(X)
     X_transform_gpu = rocket_gpu.transform(X)
-    assert_array_almost_equal(X_transform_cpu, X_transform_gpu, decimal=8)
+    # Set decimal threshold here
+    assert_array_almost_equal(X_transform_cpu, X_transform_gpu, decimal=4)
```
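For context on what the relaxed tolerance allows: NumPy's `assert_array_almost_equal` passes when `abs(desired - actual) < 1.5 * 10**(-decimal)`. A minimal sketch (with made-up values, assuming the float32 CPU/GPU divergence below 1e-4 cited in the PR) of why `decimal=4` passes where `decimal=8` would not:

```python
import numpy as np
from numpy.testing import assert_array_almost_equal

cpu = np.array([1.0, 2.0], dtype=np.float32)
gpu = cpu + 1e-5  # divergence below the 1e-4 scale cited in the PR

# Passes: abs(cpu - gpu) ~ 1e-5 < 1.5 * 10**-4
assert_array_almost_equal(cpu, gpu, decimal=4)

# decimal=8 demands abs diff < 1.5e-8, far tighter than float32
# GPU arithmetic typically achieves, so this raises
try:
    assert_array_almost_equal(cpu, gpu, decimal=8)
    raised = False
except AssertionError:
    raised = True
assert raised
```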
Why the changes here? Not against the decimal changes but interested in hearing why. Docs changes seem unnecessary.
```diff
+    Notes
+    -----
+    This GPU implementation uses the CPU's kernel generation logic
+    (from `_rocket._generate_kernels`) to ensure exact kernel parity
+    when using the same random seed.
```
Would a user need to know this? I can see noting a difference in results but this seems a bit unnecessary.
```diff
+    # Transpose and convert to float32 for TensorFlow compatibility
+    X = X.transpose(0, 2, 1).astype(np.float32)
```
Reference Issues/PRs
Related to #1248
What does this implement/fix? Explain your changes.
This PR implements kernel and feature parity between the CPU and GPU ROCKET implementations by reusing the CPU's kernel generation function, while keeping the transform operations GPU-accelerated.
Changes:
Results:
Key insight:
The sparse-to-dense conversion places the CPU kernel's selected-channel weights at the correct positions in a dense kernel, with zeros for the non-selected channels. Since zero weights contribute nothing to the convolution, this achieves mathematical equivalence while using standard TensorFlow operations.
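The zero-padding argument above can be sketched in NumPy. This is an illustrative example, not aeon's actual code: names like `channel_indices` and the `convolve` helper are hypothetical, and a plain loop stands in for the TensorFlow convolution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, length, n_selected, k_len = 5, 50, 2, 9

X = rng.normal(size=(n_channels, length)).astype(np.float32)
channel_indices = np.array([1, 3])  # channels the sparse CPU kernel uses
sparse_weights = rng.normal(size=(n_selected, k_len)).astype(np.float32)

# Dense kernel: selected-channel weights in place, zeros elsewhere
dense_weights = np.zeros((n_channels, k_len), dtype=np.float32)
dense_weights[channel_indices] = sparse_weights

def convolve(X, W):
    # Valid multi-channel cross-correlation, summed over channels
    out_len = X.shape[1] - W.shape[1] + 1
    return np.array(
        [np.sum(X[:, i:i + W.shape[1]] * W) for i in range(out_len)]
    )

sparse_out = convolve(X[channel_indices], sparse_weights)
dense_out = convolve(X, dense_weights)

# Zero rows multiply to exact zeros, so both outputs match
assert np.allclose(sparse_out, dense_out)
```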
Does your contribution introduce a new dependency? If yes, which one?
No.
Any other comments?
None.
PR checklist
For all contributions
For new estimators and functions
- I have added myself as `__maintainer__` at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.

For developers with write access