Change split point calculation in KD-tree construction #64
Merged
andreasnoack merged 2 commits into master on Apr 11, 2023
This tries to mimic the splitting of the original Loess implementation, which is the one used by R. The implementation is based on reverse engineering of the behavior, as the rules are only loosely described in the original paper. With the rules described in the comment, we are able to match the splits of R.

When adding tests, I realized that the weight calculation in the local regression was off by a square root. The weights were computed as the diagonal elements of W in `inv(X'*W*X)*X'*W*y`, but we applied them to X and y before computing the OLS estimates, so the weights were effectively squared.

The signatures have also been loosened to allow more element types. This made it easier to test with the cars dataset from R.

I've added a lot of `@debug` statements to make it easier to follow the KD-tree construction.
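To illustrate the square-root issue with the weights, here is a minimal sketch on made-up data. It is not the package's code: the data and all variable names (`β_wls`, `β_sqrt`, `β_squared`, `β_w2`) are hypothetical and chosen only for this example. It shows that scaling the rows of X and y by `sqrt.(w)` before an ordinary least-squares solve reproduces `inv(X'*W*X)*X'*W*y`, whereas scaling by `w` itself solves the problem with weights `w.^2`.

```julia
using LinearAlgebra

# Hypothetical toy data, not taken from the PR or from R's cars dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.3]
w = [0.1, 0.5, 1.0, 0.5, 0.1]    # diagonal elements of W

X = [ones(length(x)) x]           # design matrix with an intercept column
W = Diagonal(w)

# Reference: weighted least squares via the normal equations.
β_wls = (X' * W * X) \ (X' * W * y)

# Scaling each row by sqrt(w) and solving OLS gives the same estimate.
s = sqrt.(w)
β_sqrt = (s .* X) \ (s .* y)

# Scaling each row by w itself corresponds to using the weights w.^2.
β_squared = (w .* X) \ (w .* y)
β_w2 = (X' * W^2 * X) \ (X' * W^2 * y)

@assert β_sqrt ≈ β_wls
@assert β_squared ≈ β_w2
```

The reason is that OLS on the scaled data minimizes the sum of squared scaled residuals, so any row scaling factor enters the objective squared.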
Codecov Report
Coverage diff, master vs. #64:

| | master | #64 | +/- |
|---|---|---|---|
| Coverage | 92.59% | 92.11% | -0.48% |
| Files | 2 | 2 | |
| Lines | 189 | 203 | +14 |
| Hits | 175 | 187 | +12 |
| Misses | 14 | 16 | +2 |
Member, Author
I forgot to mention that, with these changes, I was able to change one of the broken tests to a working test. The other broken test was changed to a |
devmotion reviewed Apr 5, 2023
7a824f9 to 7363754
Member, Author
Thanks for the comments. I believe that I've now addressed all of them, so please have another look.
Update: The changes to the signatures fix #48.