Commit 32230d8
Accelerate linear model predict on C-ordered inputs (#7329)
This started out as a cleanup PR, but turned into a performance improvement after some benchmarking.
`LinearRegression`, `ElasticNet`, `Lasso`, and `Ridge` all share the same `predict` method. This calculates `X.dot(coef.T) + intercept`.
Previously we used a function from `libcuml` to compute the single-target case, and `cupy` to handle the multi-target case.
After some benchmarking, I no longer think using `libcuml` at all here is worth it. It's simpler to always take the `cupy` path, and `cupy` already handles dispatching to cublas appropriately to handle disparate layouts (C vs F).
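The single code path described above can be sketched as follows. This is a minimal illustration, not the code from this commit: NumPy is used as a CPU stand-in for `cupy`, the helper name `linear_predict` is hypothetical, and the coefficient shapes (`(n_features,)` for one target, `(n_targets, n_features)` for several) are assumed.

```python
import numpy as np  # CPU stand-in for cupy in this sketch (assumption)


def linear_predict(X, coef, intercept):
    """Hypothetical helper: one code path for single- and multi-target predict."""
    # For a 1-D coef, `.T` is a no-op; for a 2-D coef it yields (n_features, n_targets).
    return X.dot(coef.T) + intercept


rng = np.random.default_rng(0)
X_c = rng.standard_normal((100, 5))   # C-ordered (row-major) input
X_f = np.asfortranarray(X_c)          # same values, F-ordered (column-major)

coef_single = rng.standard_normal(5)        # assumed shape: (n_features,)
coef_multi = rng.standard_normal((3, 5))    # assumed shape: (n_targets, n_features)
intercept_multi = rng.standard_normal(3)

y_single = linear_predict(X_c, coef_single, 0.5)            # shape (100,)
y_multi = linear_predict(X_c, coef_multi, intercept_multi)  # shape (100, 3)

# The result does not depend on the memory layout of X.
y_multi_f = linear_predict(X_f, coef_multi, intercept_multi)
```

The same expression serves both cases, so no layout-specific or target-count-specific branch is needed in this sketch.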
For F-ordered inputs we see roughly the same performance as before.
For C-ordered inputs, speedups range from mild on small data (150 us now vs. 200 us before) to roughly 10x on larger data (0.75 ms now vs. 8.4 ms before). Presumably this comes from avoiding the copies we previously made to force a uniform F order.
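The copy that the message presumes was the cost can be illustrated with a small sketch. It again uses NumPy as a CPU stand-in for `cupy` (an assumption) and shows only the layout behavior, not the timings reported above: forcing F order on a C-ordered array allocates new memory, while an array that is already F-ordered is passed through untouched.

```python
import numpy as np  # CPU stand-in for cupy in this sketch (assumption)

X_c = np.ones((1000, 20))  # C-ordered by default

# Forcing F order on a C-ordered array has to allocate and fill a new buffer.
X_forced = np.asfortranarray(X_c)
copied_from_c = not np.shares_memory(X_c, X_forced)

# An array that is already F-ordered is returned as-is, with no copy.
X_again = np.asfortranarray(X_forced)
copied_from_f = not np.shares_memory(X_forced, X_again)
```

Here `copied_from_c` is `True` and `copied_from_f` is `False`, which is consistent with F-ordered inputs seeing roughly the same performance as before while C-ordered inputs benefit from skipping the conversion.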
Authors:
- Jim Crist-Harif (https://github.com/jcrist)
Approvers:
- Victor Lafargue (https://github.com/viclafargue)
- Simon Adorf (https://github.com/csadorf)
URL: #7329
1 parent df3b839 commit 32230d8
3 files changed
Lines changed: 60 additions & 153 deletions
File 1 of 3: 1 deletion (original line 17); diff content not shown.
File 2 of 3: 60 additions (new lines 1-60); diff content not shown.
File 3 of 3: This file was deleted.