Description
While diving into the source here, I was confused by the use of the KDTree. I'm not experienced in working with trees, so I'd appreciate some feedback from experienced tree traversers.
What I find confusing with the current implementation is that the KDTree doesn't contain the elements of the set used to construct the tree, i.e. the rows of the model matrix x. I would have thought that the purpose of the tree was to be able to look up each of the rows in x efficiently. However, the tree consists only of the splitting medians for each dimension. Hence, it seems that the purpose of the tree is more to decide which neighborhoods to use for computing the local regression than a structure for organizing the rows of x.
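To make the question concrete, here is a minimal Python sketch of the kind of tree I mean: the rows of x steer the median splits, but the nodes keep only the split dimension, the split value, and the bounding box of the cell. This is not the package's code or the Fortran code. The names `Node`, `build`, `find_cell`, and `bounding_box` are made up for illustration, and splitting along the widest side of the box with a `leaf_size` stopping rule is my assumption.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]
Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


@dataclass
class Node:
    box: Box                       # bounding box of this cell
    dim: Optional[int] = None      # split dimension (None for a leaf)
    split: Optional[float] = None  # split value: the median along `dim`
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bounding_box(rows: List[Point]) -> Box:
    """Smallest axis-aligned box containing all rows."""
    d = len(rows[0])
    return ([min(r[j] for r in rows) for j in range(d)],
            [max(r[j] for r in rows) for j in range(d)])


def build(rows: List[Point], box: Box, leaf_size: int) -> Node:
    """Recursively split `box`; the rows guide the splits but are not stored."""
    if len(rows) <= leaf_size:
        return Node(box)
    lo, hi = box
    # Assumption: split along the widest side of the current box.
    dim = max(range(len(lo)), key=lambda j: hi[j] - lo[j])
    split = median(r[dim] for r in rows)
    left_rows = [r for r in rows if r[dim] <= split]
    right_rows = [r for r in rows if r[dim] > split]
    if not left_rows or not right_rows:  # all values tied, cannot split
        return Node(box)
    left_hi = list(hi)
    left_hi[dim] = split
    right_lo = list(lo)
    right_lo[dim] = split
    return Node(box, dim, split,
                build(left_rows, (list(lo), left_hi), leaf_size),
                build(right_rows, (right_lo, list(hi)), leaf_size))


def find_cell(node: Node, x: Point) -> Box:
    """Descend by comparing x against the stored medians; return the leaf box."""
    while node.dim is not None:
        node = node.left if x[node.dim] <= node.split else node.right
    return node.box


rows = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (4.0, 0.5), (5.0, 2.5)]
tree = build(rows, bounding_box(rows), leaf_size=2)
print(find_cell(tree, (4.5, 2.9)))  # works for a point that is not a row of x
```

With a tree like this you can find the cell containing any query point, including points that are not rows of x, but you cannot ask the tree for the rows themselves. That is the behavior I think I'm seeing in the current implementation.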
I've browsed Cleveland's two papers, but I was unable to extract any details on the tree structure. I also tried to read a bit of the Fortran source code, but the code isn't easy to follow, to put it mildly. @dcjones, in case you are still reading GitHub notifications, do you recall what your implementation is based on? I feel like I'm missing something here, so if there is a different reference, I'd like to take a look.