@fhahn fhahn commented Jan 25, 2024

Improve codegen for stores of (trunc X to <3 x i8>): the truncate operand
is first converted to either v8i8 or v16i8, then the lanes holding the
truncate results are extracted and stored with a sequence of 3 ST1.b.
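
As a rough illustration of the lane selection described above, here is a scalar C model of the v16i8 case. It is only a sketch, not the backend code: the helper name is hypothetical, and the <3 x i32> source type and the little-endian lane layout (truncate results at byte lanes 0, 4 and 8) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of storing (trunc <3 x i32> X to <3 x i8>).
 * Hypothetical helper for illustration only; it is not part of LLVM. */
void store_trunc_v3i32_to_v3i8(const uint32_t x[3], uint8_t *dst) {
    assert(x && dst);

    /* Widen to four 32-bit lanes, i.e. one 128-bit register.
     * The fourth lane is padding and is never stored. */
    uint32_t wide[4] = { x[0], x[1], x[2], 0 };

    /* View the same 128 bits as sixteen byte lanes (v16i8).
     * Assumption: little-endian lane order, built explicitly here
     * so the model does not depend on the host byte order. */
    uint8_t bytes[16];
    for (int lane = 0; lane < 4; ++lane)
        for (int b = 0; b < 4; ++b)
            bytes[4 * lane + b] = (uint8_t)(wide[lane] >> (8 * b));

    /* The low byte of each 32-bit lane is the truncate result.
     * Each of these three single-byte stores stands in for one ST1.b. */
    dst[0] = bytes[0];
    dst[1] = bytes[4];
    dst[2] = bytes[8];
}
```

Exactly three bytes are written, so memory after dst[2] is left untouched.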

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: llvm#77790

PR: llvm#78637

(cherry-picked from eb678d8)

fhahn added 5 commits January 25, 2024 18:58
Add extra tests with different load/store alignments for
llvm#78637.

(cherry-picked from 98509c7)
…lvm#78637)


fhahn commented Jan 25, 2024

@swift-ci please test


fhahn commented Jan 25, 2024

@swift-ci please test llvm


fhahn commented Jan 26, 2024

@swift-ci please test macos

@fhahn fhahn merged commit b0f6f4f into swiftlang:stable/20230725 Jan 26, 2024
@fhahn fhahn deleted the vec3-trunc-store branch January 26, 2024 20:12