Skip to content

Conversation

@dkarrasch
Copy link
Contributor

In the context of JuliaGPU/CUDA.jl#1904 I realized that between LinearAlgebra and CUDA there is yet another layer, which is this GPUArrays.jl. I believe that if we hook into the method hierarchy one level lower, many things will greatly simplify. I don't know if this should be annotated with @inline so that potentially constant propagation could eliminate the character fiddling. Comments welcome!

@maleadt
Copy link
Member

maleadt commented May 16, 2023

Seeing the Metal.jl CI failure, I guess this is a breaking change?

@dkarrasch
Copy link
Contributor Author

Sorry for seemingly abandoning this PR. I was working on JuliaLang/julia#49806, and now I'm working on including HermOrSym wrappers into that mechanism. Once that is done, I'll continue with SparseArrays.jl, so that these changes are included in Julia v1.10. Afterwards, I'll return to this one.

Regarding your question, this PR by itself is breaking, that is correct. But we can preempt breakage by introducing a few methods in Metal.jl first. Eventually, starting with Julia v1.10, there will be only one method overload that will handle multiplication by once-wrapped GPUArrays, where the wrappers include Adjoint, Transpose, Hermitian and Symmetric.

@maleadt
Copy link
Member

maleadt commented May 17, 2023

Sorry for seemingly abandoning this PR. I was working on JuliaLang/julia#49806, and now I'm working on including HermOrSym wrappers into that mechanism. Once that is done, I'll continue with SparseArrays.jl, so that these changes are included in Julia v1.10. Afterwards, I'll return to this one.

Of course, no problem; thanks for doing this!

transA = tA == 'N' ? identity : tA == 'T' ? transpose : adjoint
generic_matmatmul!(C, transA(A), B, a, b)
end
function LinearAlgebra.generic_matvecmul!(C::AbstractGPUVector, tA::AbstractChar, A::AbstractGPUMatrix, B::AbstractGPUVector, _add::MulAddMul = MulAddMul())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really related to this PR but I think it's a shame the the generic matmul in LinearAlgebra isn't like the one defined in this file, i.e. just three nested loops without trying to be smart about memory. The fact that it's necessary to define a "generic" version here is evidence that the version in LinearAlgebra isn't generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is fixable by rewriting the generic version as you say, and have a more specific signature for the smart-about-memory version?

@maleadt
Copy link
Member

maleadt commented May 31, 2023

Ah, a new issue with Metal.jl; a bad convert method is being called:

  MethodError: convert(::Type{Union{}}, ::MtlMatrix{ComplexF32}) is ambiguous. Candidates:
    convert(T::Type{<:SparseArrays.AbstractSparseMatrixCSC}, m::AbstractMatrix) in SparseArrays at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/julia_installs/bin/mac/aarch64/1.8/julia-1.8-latest-macaarch64/share/julia/stdlib/v1.8/SparseArrays/src/sparsematrix.jl:745
    convert(T::Type{<:LinearAlgebra.Bidiagonal}, m::AbstractMatrix) in LinearAlgebra at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/julia_installs/bin/mac/aarch64/1.8/julia-1.8-latest-macaarch64/share/julia/stdlib/v1.8/LinearAlgebra/src/bidiag.jl:203
    convert(::Type{Union{}}, a::AbstractArray) in Base at array.jl:618
    convert(::Type{T}, a::AbstractArray) where T<:GPUArraysCore.AbstractGPUArray in GPUArrays at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/gpuarrays-dot-jl/src/host/construction.jl:4
    convert(T::Type{<:BitArray}, a::AbstractArray) in Base at bitarray.jl:580
    convert(::Type{T}, a::AbstractArray) where T<:Array in Base at array.jl:617
    convert(::Type{SA}, a::AbstractArray) where SA<:StaticArraysCore.StaticArray in StaticArrays at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/depots/c9f52312-b528-44e4-9501-6d408762012b/packages/StaticArrays/J9itA/src/convert.jl:194
    convert(::Type{Union{}}, x) in Base at essentials.jl:213
    convert(::Type{T}, arg) where T<:VecElement in Base at baseext.jl:19
  Possible fix, define
    convert(::Type{Union{}}, ::AbstractMatrix)
  Stacktrace:
   [1] to_power_type(x::MtlMatrix{ComplexF32})
     @ Base ./intfuncs.jl:250

@maleadt
Copy link
Member

maleadt commented May 31, 2023

/AppleInternal/Library/BuildRoots/9941690d-bcf7-11ed-a645-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSMatrix/LinearAlgebra/ARM64/MPSMatrixMultiplication.mm:3274: failed assertion `Number of requested rows in left input matrix exceeds left input matrix size.'

Interesting; I guess Metal.jl should protect against that to avoid an abort.

@maleadt
Copy link
Member

maleadt commented Jun 1, 2023

Finally, all green. Let's tag things to get this out there!

@maleadt maleadt merged commit 443dab4 into JuliaGPU:master Jun 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants