Tree confusion #44

@andreasnoack

While digging into the source here, I was confused by the use of the KDTree. I'm not experienced in working with trees, so I'd appreciate some feedback from experienced tree traversers.

What I find confusing about the current implementation is that the KDTree doesn't contain the elements of the set used to construct the tree, i.e. the rows of the model matrix x. I would have thought that the purpose of the tree was to look up each of the rows in x efficiently. However, the tree consists only of the splitting medians for each dimension. Hence, it seems that the purpose of the tree is to decide which neighborhoods to use for computing the local regression rather than to organize the rows of x.
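To make the distinction concrete, here is a minimal Python sketch of a tree of the kind described above: it is built from the rows of x but keeps only a split dimension, a split median, and the cell bounds at each node. It is an illustration under stated assumptions, not the code in this package or Cleveland's Fortran. The names `Cell`, `build`, `locate`, the `leaf_size` parameter, and the rule of splitting the widest dimension are all hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]

@dataclass
class Cell:
    lo: Point                        # lower corner of the cell
    hi: Point                        # upper corner of the cell
    dim: Optional[int] = None        # split dimension (None for a leaf)
    split: Optional[float] = None    # split median (None for a leaf)
    left: Optional["Cell"] = None
    right: Optional["Cell"] = None

def build(rows: Sequence[Point], lo: Point, hi: Point, leaf_size: int = 2) -> Cell:
    """Partition the box [lo, hi] using medians of the rows.
    The rows guide the splits but are NOT stored in the tree."""
    cell = Cell(lo, hi)
    if len(rows) <= leaf_size:
        return cell
    # Assumption: split the widest dimension of the cell at the median.
    dim = max(range(len(lo)), key=lambda d: hi[d] - lo[d])
    split = median(r[dim] for r in rows)
    left_rows = [r for r in rows if r[dim] <= split]
    right_rows = [r for r in rows if r[dim] > split]
    if not left_rows or not right_rows:
        return cell  # degenerate split (e.g. many ties): stop here
    cell.dim, cell.split = dim, split
    # Children share the cutting plane: left gets a new upper bound,
    # right gets a new lower bound, along the split dimension only.
    cell.left = build(left_rows, lo, hi[:dim] + (split,) + hi[dim + 1:], leaf_size)
    cell.right = build(right_rows, lo[:dim] + (split,) + lo[dim + 1:], hi, leaf_size)
    return cell

def locate(cell: Cell, point: Point) -> Cell:
    """Descend to the leaf cell containing `point` using only the medians."""
    while cell.left is not None:
        cell = cell.left if point[cell.dim] <= cell.split else cell.right
    return cell

rows = [(float(i),) for i in range(8)]       # eight 1-d rows: 0.0 .. 7.0
tree = build(rows, lo=(0.0,), hi=(7.0,))
leaf = locate(tree, (2.2,))
print(leaf.lo, leaf.hi)                      # (1.5,) (3.5,)
```

A tree like this can say which cell a query point falls in and where the cell boundaries are, but it cannot return the rows of x that live in that cell, which is the behaviour I found surprising.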

I've browsed the two papers by Cleveland but was unable to extract any details on the tree structure. I also tried to read a bit of the Fortran source code, but the code isn't easy to follow, to put it mildly. @dcjones, in case you are still reading GitHub notifications, do you recall what your implementation is based on? I feel like I'm missing something here, so if there is a different reference then I'd like to take a look.
