[ENH] Optimize ROCKET GPU PPV calculation for 1.45× speedup #3232
Could you post a snippet showing that it produces the same output as before? Alternatively, are there tests that check the output and still pass? Experiment results are generally harder to trust if you do not show that the code ran.
Hi, thanks @Adityakushwaha2006, and sorry it took so long. I have run this on CPU only for now, just as a correctness sanity check:

```python
import tensorflow as tf

def _get_ppv1(x):
    # New: mean of a float32 mask marking the positive values
    positive_mask = tf.cast(x > 0, tf.float32)
    return tf.reduce_mean(positive_mask, axis=1)

def _get_ppv2(x):
    # Old: count the non-zero entries after ReLU, then divide by the length of axis 1
    x_pos = tf.math.count_nonzero(tf.nn.relu(x), axis=1)
    return tf.math.divide(x_pos, x.shape[1])
```

Even on CPU it is significantly faster, so it looks good, but there are tiny numeric differences of around 1e-08. Given how central this function is, it should speed up the ROCKET classifiers. I'll test overall performance next on UCR. The only issue ChatGPT raises is the different handling of NaN: "Two small caveats if you ever switch _get_ppv1 to the positive_mask version:
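
The small numeric differences mentioned above can be explored with a NumPy stand-in for the two variants. This is only a sketch, not the TensorFlow code from the PR: `ppv_mask` and `ppv_count` are hypothetical names, and the float32 versus float64 result types are an assumption about where differences of this size might come from.

```python
import numpy as np

def ppv_mask(x):
    # New-style variant: mean of a 0/1 mask, accumulated and returned in float32.
    return np.mean(x > 0, axis=1, dtype=np.float32)

def ppv_count(x):
    # Old-style variant: count non-zero entries after ReLU, then divide by the
    # row length (integer count / int gives a float64 result in NumPy).
    return np.count_nonzero(np.maximum(x, 0), axis=1) / x.shape[1]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 1000)).astype(np.float32)

new, old = ppv_mask(x), ppv_count(x)
max_abs_diff = float(np.max(np.abs(new - old)))
print("dtypes:", new.dtype, old.dtype)
print("max abs difference:", max_abs_diff)
```

In this sketch both variants count the same positive entries, so any difference comes only from rounding the final ratio to float32. A tolerance-based comparison such as `np.allclose(new, old, atol=1e-6)` is therefore a more suitable check than exact equality.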
I ran the ROCKET classifier before and after the change on 21 datasets on the train: no significant difference in accuracy (7/8/6 W/L/D, 0.0018 difference in average accuracy), and it took 66% of the time after the change. I'm calling this a good change.
TonyBagnall
left a comment
thank you for the contribution


Reference Issues/PRs
None
What does this implement/fix? Explain your changes.
This PR optimizes the `_get_ppv()` method in `base.py` for ROCKET GPU transformers by replacing integer operations with GPU-optimised float32 operations.

Performance Impact:
The PPV calculation takes around 15-20% of the total time in ROCKET as well, so extrapolating from this speedup, it should give roughly a 5% speedup in the GPU variant (both figures are approximate).
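
The extrapolation above can be sanity-checked with Amdahl's law. This is a rough sketch that takes the 1.45× figure from the PR title as the local speedup of the PPV step and the 15-20% share quoted above as the affected fraction of runtime. `overall_speedup` is a hypothetical helper, not part of the code base.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when only `fraction` of the
    runtime is accelerated by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

for fraction in (0.15, 0.20):
    s = overall_speedup(fraction, 1.45)
    print(f"PPV share {fraction:.0%}: overall speedup {s:.3f}x")
# PPV share 15%: overall speedup 1.049x
# PPV share 20%: overall speedup 1.066x
```

Under these assumptions the end-to-end gain lands at roughly 5-7%, in line with the estimate above.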
Benchmark across dataset sizes:
Verified Correctness:
Does your contribution introduce a new dependency? If yes, which one?
None
Any other comments?
GPUs handle floating point math much faster than integer counting, so switching to float operations gives a good speed boost.
For all contributions
For new estimators and functions
`__maintainer__` at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access