Commit fbac55b
[AArch64] Optimize vector fmul(sitofp/uitofp, 1/2^N) -> scvtf/ucvtf (llvm#141480)
When a vector integer-to-float conversion is followed by a multiply with a
reciprocal power-of-two constant, we can fold both operations into a single
SCVTF or UCVTF instruction with a fixed-point shift operand.
For example, `fmul(sitofp(v2i32 x), <0.5, 0.5>)` becomes `scvtf.2s v0, v0, #1`.
This is a reworked version with several improvements over the original
submission:
- Rewrite the C++ operand matcher to share implementation with the existing
`SelectCVTFixedPointVec` (MOVIshift, FMOV, and DUP handling with correct
truncation for f16)
- Add `uitofp`/`ucvtf` patterns via a `CVTFRecipPat` multiclass
- Add full GlobalISel support (`GIComplexOperandMatcher` + renderer)
Supported vector types: `v2f32`, `v4f32`, `v2f64`, `v4f16`, `v8f16`.
Fixes llvm#94909
Parent: 7b43dcd
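As an illustration of the operand check described in the commit message, here is a minimal standalone sketch of how a multiplier can be recognised as `1/2^N` and turned into a shift amount. It is not the code from this commit: `recipPow2Shift`, `convertThenScale`, `convertFixedPoint`, and the `maxBits` parameter are hypothetical names chosen for the example, and the scalar `float` arithmetic only models what one lane of the vector instruction would compute.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): if c is exactly 1/2^N with
// 1 <= N <= maxBits, return N, otherwise return nullopt.
// maxBits stands in for the element width of the destination lanes.
std::optional<unsigned> recipPow2Shift(double c, unsigned maxBits) {
  assert(maxBits >= 1 && "shift range must be non-empty");
  if (!std::isfinite(c) || c <= 0.0)
    return std::nullopt;
  int exp = 0;
  // frexp splits c into mant * 2^exp with mant in [0.5, 1).
  double mant = std::frexp(c, &exp);
  if (mant != 0.5) // any other mantissa means c is not a power of two
    return std::nullopt;
  int n = 1 - exp; // c == 0.5 * 2^exp == 2^(exp - 1) == 1 / 2^n
  if (n < 1 || n > static_cast<int>(maxBits))
    return std::nullopt;
  return static_cast<unsigned>(n);
}

// Unfused form: integer-to-float conversion followed by a multiply.
float convertThenScale(int32_t x, float c) {
  return static_cast<float>(x) * c;
}

// Scalar model of the fused form: treat x as a fixed-point value with
// n fractional bits, i.e. compute x / 2^n.
float convertFixedPoint(int32_t x, unsigned n) {
  return std::ldexp(static_cast<float>(x), -static_cast<int>(n));
}
```

For the example in the commit message, `recipPow2Shift(0.5, 32)` yields 1, which corresponds to the `#1` operand in `scvtf.2s v0, v0, #1`. Constants such as `1.0` or `0.75` are rejected, so in this sketch the separate multiply would be kept for them.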
4 files changed
Lines changed: 591 additions & 16 deletions
File tree
- llvm
  - lib/Target/AArch64
    - GISel
  - test/CodeGen/AArch64
Lines changed: 26 additions & 6 deletions