@JackAKirk JackAKirk commented Mar 2, 2022

CUDA backend implementation of the tf32 matrix multiply-add (MAD) using the underlying 32-bit storage type (uint32_t), fully consistent with the existing matrix extension.

Integration test added here: intel/llvm-test-suite#881

buffer<uint32_t, 1> bufB(B, range<1>(K * N));
buffer<float, 1> bufC(C, range<1>(M * N));
buffer<float, 1> bufD(D, range<1>(M * N));

Contributor


Can you add a complete example in test/matrix where you show the necessary "manual" conversion function from float to fp19(uint32) during initialization and then from fp19 to float during accumulation and verification?

Contributor Author


Yes, it's here: intel/llvm-test-suite#881

For the float to fp19:

uint32_t make_tf32(float const &x);

For the fp19 to float:

float tf32_to_fp32(uint32_t x);

(I'll rename both to e.g. make_fp19)
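
As a rough illustration of what such a pair of conversion helpers might look like, here is a minimal sketch. It is not the code from intel/llvm-test-suite#881: the names `float_to_tf32_bits` and `tf32_bits_to_float` are hypothetical, and the choice of round-to-nearest-even is an assumption (a plain truncation of the low bits would also fit the declared signatures). It relies only on the tf32 layout of 1 sign bit, 8 exponent bits and 10 mantissa bits held in the top 19 bits of a 32-bit word.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (the PR declares its own as make_tf32).
// Rounds a float to tf32 precision and returns the result in a 32-bit
// storage word whose low 13 mantissa bits are zero.
inline uint32_t float_to_tf32_bits(float x) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));

  const uint32_t exponent_mask = 0x7F800000u;
  if ((bits & exponent_mask) != exponent_mask) {
    // Finite value: round to nearest, ties to even, on the 13 dropped bits.
    const uint32_t lsb = (bits >> 13) & 1u;
    bits += 0x0FFFu + lsb;
  } else if ((bits & 0x007FFFFFu) != 0) {
    // NaN: set the quiet bit so clearing the low bits cannot yield infinity.
    bits |= 0x00400000u;
  }
  return bits & 0xFFFFE000u;
}

// Hypothetical helper (the PR declares its own as tf32_to_fp32).
// A tf32 value in 32-bit storage is already a valid float bit pattern,
// so the conversion back is a plain bit copy.
inline float tf32_bits_to_float(uint32_t bits) {
  assert((bits & 0x1FFFu) == 0 && "low 13 bits must be clear");
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}
```

Under these assumptions, values that already fit in 10 mantissa bits (for example 0.5f or 1.0f) round-trip exactly, while finer detail in the float mantissa is rounded away.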

// number of rows of a.
constexpr int K = 8; // number of cols of a/number of rows of b.

uint32_t A[M * K];
Contributor


Add a comment that uint32_t is used here as storage for fp19.

Contributor Author


OK
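
To make the storage convention concrete, the following is a small host-side sketch of how operands held as uint32_t might be converted back to float when computing a reference result for verification. Everything here is illustrative rather than taken from the PR's tests: `to_tf32_storage`, `from_tf32_storage` and `reference_mad` are hypothetical names, row-major layout is assumed, and the conversion simply truncates the low 13 mantissa bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep the top 19 bits of the float (sign, 8 exponent
// bits, 10 mantissa bits) by truncation. NaN edge cases are ignored here.
inline uint32_t to_tf32_storage(float x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  return bits & 0xFFFFE000u;
}

// Hypothetical helper: reinterpret the 32-bit storage word as a float.
inline float from_tf32_storage(uint32_t bits) {
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}

// Host reference for D = A * B + C.
// A (M x K) and B (K x N) use uint32_t purely as storage for tf32 values;
// C and D (M x N) are ordinary floats. All matrices are row-major (assumed).
inline void reference_mad(const uint32_t *A, const uint32_t *B,
                          const float *C, float *D,
                          std::size_t M, std::size_t N, std::size_t K) {
  assert(A && B && C && D);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = C[m * N + n];
      for (std::size_t k = 0; k < K; ++k) {
        // Convert from storage to float before accumulating.
        acc += from_tf32_storage(A[m * K + k]) *
               from_tf32_storage(B[k * N + n]);
      }
      D[m * N + n] = acc;
    }
  }
}
```

A test would fill A and B through `to_tf32_storage` during initialization and compare the device result against the output of `reference_mad`.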

Contributor Author


Thanks for the comments, I've updated both tests now.

@JackAKirk JackAKirk requested a review from dkhaldi March 7, 2022 13:43

dkhaldi commented Mar 10, 2022

LGTM but we need to start adopting the name tf32 instead of fp19.
As soon as you make the change, I will approve.

@JackAKirk JackAKirk changed the title from "[SYCL][CUDA] fp19 matrix MAD impl using uint32_t" to "[SYCL][CUDA] tf32 matrix MAD impl using uint32_t" Mar 11, 2022
@JackAKirk
Contributor Author

This PR is no longer necessary: the complete tf32 implementation is now ready in #5870 and can replace this PR.

@JackAKirk JackAKirk closed this Mar 31, 2022