-
Notifications
You must be signed in to change notification settings - Fork 7
summary simplification, test cases, and eager checks #28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also allow incomplete `alongs::TypedBool...` input, with tailing missing values default to `False()`
add JuliennedArrays to docs/Project.toml, ven though we must do `pkg> dev .` to use local development version of it.
| export Slices, Align | ||
| export True, False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think these names are too common to be exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who uses them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slice exists in Julia Base with a completely different meaning, I feel that's quite ambiguous to have Slices meaning the others. SliceView and AlignView are more accurate IMO.
True/False are too common names here and I think they should be either implementation details or preserved by Base. But apparently, people just invent their own without much effort:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be fine with SlicesView and AlignView. I added the s because it's many slices. I think post JuliaLang/julia#40561 we might be able to just use true and false instead with no extra performance cost as long as they are constant, but someone would need to verify that.
| else | ||
| 1 | ||
| end | ||
| @inline axis_or_1(switch, axis) = untyped(switch) ? axis : 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend not to like to use the ternary operator. It's inconvenient if you want to add more lines to the if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is to only use short-form function definitions when they can be fit into one line.
https://github.com/invenia/BlueStyle#method-definitions
I have to admit that the current codebase requires a little bit of extra effort to read even if I've been used to Julia for years. I didn't make style changes to every line of codes that I feel not so comfortable to read, because they're mostly unrelated to this PR. If you insist we can revert this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I don't really care that much. Just the way I do things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to you. IMHO the codebase should be clean and simplified so that others could jump in and make contributions without much mental efforts. If that codebase or code style is too opinioned then it's setting a high wall and blocks people.
|
Hi! Thanks for the contribution! I think most of the changes look good. Could you separate out the changes to the constructors to a separate pull request? That's the part I'm hesitant about |
src/JuliennedArrays.jl
Outdated
| alongs::Alongs | ||
| function Align{T,N,S,A}(slices::S, alongs::A) where {T,N,S,A} | ||
| sz = size(first(slices)) | ||
| all(x->sz==size(x), slices) || throw(ArgumentError("All sizes of slices should be the same.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it could have a significant performance cost? Maybe annotate with @boundscheck?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think @boundscheck would help here as it's internally managed by all.
Some of the difficulty comes from the fact that slices is a vector and not a tuple. If it's a tuple the check would be quicker to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that the main overhead here is map(size, slices):
julia> Xs = Slices(rand(2, 3, 4, 5), 1);
julia> @btime map(size, $Xs);
77.824 ns (1 allocation: 576 bytes)I've commented this check so that no significant runtime overhead is introduced. But I do believe we need this check to prevent undefined behaviors, so I mark it as "TODO".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I don't understand how @boundscheck works then. My understanding was that if Align is called with @inbounds, it will skip evaulating the entire statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, if I understand it correctly, you need to add Base.@propagate_inbounds annotation to pass through @inbounds check, so for example, it should be
- @inline getindex(slices::Slices{T,N}, indices::Vararg{Int,N}) where {T,N} =
+ Base.@propagate_inbounds getindex(slices::Slices{T,N}, indices::Vararg{Int,N}) where {T,N} =
view(slices.whole, slice_index(slices, indices)...)so that @inbounds X[i] will become @inbounds view(...). Currently, it will only be view(...).
I'm not sure about this. |
| @test Xs[1, 3] == X[:, 1, :, 3] # test cartesian indexing | ||
|
|
||
| # type is not inferrable for integer alongs | ||
| RT = Base.return_types(Slices, (typeof(X), Int, Int))[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I wrote it I took special care to make sure that this would be inferable if constant propagation kicks in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless wrapped with Val, I don't think it's inferrable if the input is an integer whose value is not known at compile time.
This is the behavior on master branch:
julia> @inferred Slices(rand(2, 3), 2)
ERROR: return type Slices{SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.OneTo{Int64}}, true}, 1, Matrix{Float64}, Tuple{False, True}} does not match inferred return type Slices{Item, Dimensions, Matrix{Float64}, Alongs} where {Item, Dimensions, Alongs}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but @inferred (x -> Slices(x, 2))(rand(2, 3)) or something like that should hopefully work
src/JuliennedArrays.jl
Outdated
| sz = size(first(slices)) | ||
| all(x->sz==size(x), slices) || throw(ArgumentError("All sizes of slices should be the same.")) | ||
| length(alongs) == N || throw(DimensionMismatch("The total dimension $(N) is expected to be the sum of inner dimension $(length(sz)) and outer dimension $(length(alongs))")) | ||
| inner_dimensions = mapreduce(isequal(True()), +, alongs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to verify that these checks can be done by constant-propagation at compile-time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For slices whose size isn't known prior, it's unable to optimize it away at compile time.
For alongs whose value is stored as types, it's mostly optimized, check the benchmark for Align #28 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used True instead of true on purpose to make sure everything stays in the type-domain. So this is the kind of thing that the compiler should be able to figure out.
johnnychen94
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove the incomplete construction and check the construction overhead later.
My guess is, unless we choose to store slices in NTuple, it's not so easy to make the constructor fast.
| @test Xs[1, 3] == X[:, 1, :, 3] # test cartesian indexing | ||
|
|
||
| # type is not inferrable for integer alongs | ||
| RT = Base.return_types(Slices, (typeof(X), Int, Int))[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless wrapped with Val, I don't think it's inferrable if the input is an integer whose value is not known at compile time.
This is the behavior on master branch:
julia> @inferred Slices(rand(2, 3), 2)
ERROR: return type Slices{SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.OneTo{Int64}}, true}, 1, Matrix{Float64}, Tuple{False, True}} does not match inferred return type Slices{Item, Dimensions, Matrix{Float64}, Alongs} where {Item, Dimensions, Alongs}
src/JuliennedArrays.jl
Outdated
| alongs::Alongs | ||
| function Align{T,N,S,A}(slices::S, alongs::A) where {T,N,S,A} | ||
| sz = size(first(slices)) | ||
| all(x->sz==size(x), slices) || throw(ArgumentError("All sizes of slices should be the same.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think @boundscheck would help here as it's internally managed by all.
Some of the difficulty comes from the fact that slices is a vector and not a tuple. If it's a tuple the check would be quicker to run.
| export Slices, Align | ||
| export True, False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slice exists in Julia Base with a completely different meaning, I feel that's quite ambiguous to have Slices meaning the others. SliceView and AlignView are more accurate IMO.
True/False are too common names here and I think they should be either implementation details or preserved by Base. But apparently, people just invent their own without much effort:
|
There are several other things that can be improved, but for simplicity, I'd like to leave them as future PRs:
For runtime overhead, I wouldn't say it's zero overhead, but I think they're necessary checks. using BenchmarkTools
using JuliennedArrays
# Align
Xs = [rand(3, 4) for _ in 1:8];
Xs = reshape(Xs, 2, 4);
@btime Align($Xs, 1, 3);
# PR: 998.615 ns (2 allocations: 48 bytes)
# master: 1.005 μs (2 allocations: 48 bytes)
@btime Align($Xs, $(True()), $(False()), $(True()), $(False()));
# PR: 1.475 ns (0 allocations: 0 bytes)
# master: 1.431 ns (0 allocations: 0 bytes)
X = rand(2, 3, 4, 5);
Xs = Slices(X, True(), False(), True(), False());
@btime Align($Xs, 1, 3);
# PR: 1.008 μs (3 allocations: 64 bytes)
# master: 995.364 ns (3 allocations: 64 bytes)
@btime Align($Xs, $(True()), $(False()), $(True()), $(False()));
# PR: 6.798 ns (0 allocations: 0 bytes)
# master: 1.432 ns (0 allocations: 0 bytes)
# Slices
X = rand(2, 3, 4, 5);
@btime Slices($X, 1, 3);
# PR: 2.731 μs (19 allocations: 816 bytes)
# master: 2.745 μs (18 allocations: 784 bytes)
@btime Slices($X, $(True()), $(False()), $(True()), $(False()));
# PR: 1.620 ns (0 allocations: 0 bytes)
# master: 1.432 ns (0 allocations: 0 bytes)
Xs = [rand(3, 4) for _ in 1:8];
Xs = reshape(Xs, 2, 4);
X = Align(Xs, True(), False(), True(), False());
@btime Slices($X, 1, 3);
# PR: 2.923 μs (20 allocations: 832 bytes)
# master: 2.871 μs (19 allocations: 800 bytes)
@btime Slices($X, $(True()), $(False()), $(True()), $(False()));
# PR: 1.831 μs (18 allocations: 768 bytes)
# master: 1.742 μs (18 allocations: 768 bytes)
|
|
I'm not sure if I get it right. Is there any change request on the current codes or are those comments just questions? |
|
Mostly just comments. If you separate out the changes to the constructor/things that might affect performance, I can merge that, and then we can work on reducing the performance cost of the rest? I know it's relatively minor but I think we can get rid of most of the cost. |
johnnychen94
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bramtayl I'm running out of time before the semester begins (Sep 1) and still have other things to take care of, so I'll leave everything else to you.
The unresolved comments are still quite unclear to me, if you have concerns about the runtime check overhead and if you have ideas about how to give better hints to the compilers I'd be happy to see how that's applied.
For the moment I just commented out all the construction-time checks, if you want a minimal PR differences and a clean codebase you can also choose to delete them.
| else | ||
| 1 | ||
| end | ||
| @inline axis_or_1(switch, axis) = untyped(switch) ? axis : 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to you. IMHO the codebase should be clean and simplified so that others could jump in and make contributions without much mental efforts. If that codebase or code style is too opinioned then it's setting a high wall and blocks people.
|
Ok, well thanks for the contribution! I'll take a stab at increasing the performance of the commented code and then tag you in the PR when I get to it. |
Because the current test fails on the master version, I submit all commits in one single PR.
Changes:
AlignandSlicesAlignandSlices(closes Slices doesn't support linear indices #27 )