[AArch64] Combine store (trunc X to <3 x i8>) to sequence of ST1.b #8052

fhahn · 2024-01-25T19:06:37Z

Improve codegen for store (trunc X to <3 x i8>) by lowering it to a sequence
of three ST1.b stores: first convert the truncate operand to either v8i8 or
v16i8, then extract the lanes holding the truncated results and store each one.
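
For readers unfamiliar with the pattern, the observable effect of the store can be sketched as a scalar model. This is only an illustration, not the DAG combine from the patch: the helper name `storeTrunc3xI8` is hypothetical, and the source lanes are assumed to be 32 bits wide (the patch handles whatever element type X has).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of `store (trunc <3 x i32> X to <3 x i8>), dst`.
// Assumption: 32-bit source lanes. Each lane is narrowed to its low 8 bits
// and written with its own one-byte store, mirroring the three ST1.b lane
// stores described above. Exactly three bytes are written; dst[3] is left
// untouched.
inline void storeTrunc3xI8(const std::array<std::uint32_t, 3> &x,
                           std::uint8_t *dst) {
  for (std::size_t lane = 0; lane < 3; ++lane)
    dst[lane] = static_cast<std::uint8_t>(x[lane] & 0xFFu);
}
```

Calling the helper on a 4-byte buffer changes only the first three bytes, which is the property a 3-element store has to preserve: a wider 4-byte store would clobber the neighbouring byte.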

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: llvm#77790

PR: llvm#78637

(cherry-picked from eb678d8)

(cherry-picked from 8336515)

Extra tests for llvm#78637 llvm#78632 (cherry-picked from ff1cde5)

Extra tests for llvm#78637 llvm#78632 (cherry-picked from e7b4ff8)

Add extra tests with different load/store alignments for llvm#78637. (cherry-picked from 98509c7)

fhahn · 2024-01-25T19:06:47Z

@swift-ci please test

fhahn · 2024-01-25T19:06:53Z

@swift-ci please test llvm

fhahn · 2024-01-26T09:55:52Z

@swift-ci please test macos

fhahn added 5 commits January 25, 2024 18:58

[AArch64] Add tests for operations on vectors with 3 elements.

7431631

(cherry-picked from 8336515)

[AArch64] Add vec3 load/store tests with GEPs with const offsets.

8af7a14

Extra tests for llvm#78637 llvm#78632 (cherry-picked from ff1cde5)

[AArch64] Add vec3 tests with add between load and store.

44f385a

Extra tests for llvm#78637 llvm#78632 (cherry-picked from e7b4ff8)

[AArch64] Add vec3 tests with different load/store alignments.

3d6f586

Add extra tests with different load/store alignments for llvm#78637. (cherry-picked from 98509c7)

fhahn merged commit b0f6f4f into swiftlang:stable/20230725 Jan 26, 2024

fhahn deleted the vec3-trunc-store branch January 26, 2024 20:12
