Merged
17 changes: 17 additions & 0 deletions python/cuml/cuml/manifold/umap/umap.pyx
@@ -715,6 +715,23 @@ class UMAP(Base, InteropMixin, CMajorInputTagMixin, SparseInputTagMixin):
More specific parameters controlling the embedding. If None these
values are set automatically as determined by ``min_dist`` and
``spread``.
+ target_n_neighbors: int (optional, default=-1)
+ The number of nearest neighbors to use to construct the target
+ simplicial set. If set to -1 use the ``n_neighbors`` value.
+ target_metric: string (optional, default='categorical')
+ The metric used to measure distance for a target array when using
+ supervised dimension reduction. By default this is 'categorical'
+ which will measure distance in terms of whether categories match
+ or are different. Furthermore, if semi-supervised is required
+ target values of -1 will be treated as unlabelled under the
+ 'categorical' metric. If the target array takes continuous values
+ (e.g. for a regression problem) then metric of 'euclidean' or 'l2'
+ is probably more appropriate.
+ target_weight: float (optional, default=0.5)
+ Weighting factor between data topology and target topology. A
+ value of 0.0 weights predominantly on data, a value of 1.0 places
+ a strong emphasis on target. The default of 0.5 balances the
+ weighting equally between data and target.
hash_input: bool, optional (default = False)
UMAP can hash the training input so that exact embeddings
are returned when transform is called on the same data upon
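The three target parameters documented in this hunk can be illustrated with a small sketch. The helper names (`resolve_target_n_neighbors`, `mask_labels`, `fit_semi_supervised`) are hypothetical and not part of cuML; they only mirror what the docstring states: -1 for `target_n_neighbors` falls back to `n_neighbors`, and target values of -1 are treated as unlabelled under the 'categorical' metric. The cuML call is kept inside a function that is not executed here, because it assumes cuML and a GPU are available.

```python
import random

UNLABELLED = -1  # sentinel the docstring describes for the 'categorical' target metric


def resolve_target_n_neighbors(n_neighbors, target_n_neighbors=-1):
    """Mirror the documented default: -1 means 'reuse n_neighbors'."""
    if target_n_neighbors == -1:
        return n_neighbors
    return target_n_neighbors


def mask_labels(labels, keep_fraction, seed=0):
    """Return a copy of labels where all but a random subset are set to -1."""
    rng = random.Random(seed)
    n_keep = round(len(labels) * keep_fraction)
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [lab if i in keep else UNLABELLED for i, lab in enumerate(labels)]


def fit_semi_supervised(X, y_partial, n_neighbors=15):
    """Hypothetical wrapper; assumes cuML and a GPU, so it is not called here."""
    from cuml.manifold import UMAP  # imported lazily so the sketch loads without cuML

    reducer = UMAP(
        n_neighbors=n_neighbors,
        target_metric="categorical",  # -1 entries in y_partial count as unlabelled
        target_weight=0.5,            # documented default: balance data and target
    )
    return reducer.fit_transform(X, y=y_partial)
```

For a continuous target (a regression problem), the docstring suggests passing `target_metric="euclidean"` or `"l2"` instead of the categorical default.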
15 changes: 12 additions & 3 deletions python/cuml/cuml/neighbors/nearest_neighbors.pyx
@@ -470,9 +470,18 @@ class NearestNeighbors(Base,
partial information allowing faster distances calculations

metric : string (default='euclidean').
- Distance metric to use. Supported distances are ['l1, 'cityblock',
- 'taxicab', 'manhattan', 'euclidean', 'l2', 'braycurtis', 'canberra',
- 'minkowski', 'chebyshev', 'jensenshannon', 'cosine', 'correlation']
+ Distance metric to use. Supported metrics include: 'l1', 'cityblock',
+ 'taxicab', 'manhattan', 'euclidean', 'l2', 'sqeuclidean', 'canberra',
+ 'minkowski', 'lp', 'chebyshev', 'linf', 'jensenshannon', 'cosine',
+ 'braycurtis', 'jaccard', 'hellinger', 'correlation', 'inner_product'.
+ The ``'ivfflat'`` and ``'ivfpq'``
+ algorithms only support: 'euclidean', 'l2', 'sqeuclidean', 'cosine',
+ 'correlation', 'inner_product', whereas the ``'rbc'`` algorithm only
+ supports 'euclidean', 'l2', and 'haversine' (≤3 dimensions only).
+ For sparse inputs, only the ``'brute'`` algorithm is supported, with
+ metrics: 'l1', 'cityblock', 'taxicab', 'manhattan', 'euclidean', 'l2',
+ 'canberra', 'minkowski', 'lp', 'chebyshev', 'linf', 'cosine',
+ 'inner_product', 'jaccard', 'hellinger'.
p : float (default=2)
Parameter for the Minkowski metric. When p = 1, this is equivalent to
manhattan distance (l1), and euclidean distance (l2) for p = 2. For
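The metric/algorithm compatibility rules spelled out in the new `metric` docstring can be written as a lookup. This is a minimal sketch that only transcribes the lists from the docstring; `metric_is_documented` is a hypothetical helper, not a cuML function, and it does not model the "≤3 dimensions" note attached to the ``'rbc'`` entry. Since the docstring says the general list "include[s]" these metrics, a `False` result means "not listed in the docstring" and not necessarily "rejected by cuML".

```python
# Metric lists transcribed from the updated NearestNeighbors docstring.
DENSE_BRUTE_METRICS = frozenset({
    "l1", "cityblock", "taxicab", "manhattan", "euclidean", "l2",
    "sqeuclidean", "canberra", "minkowski", "lp", "chebyshev", "linf",
    "jensenshannon", "cosine", "braycurtis", "jaccard", "hellinger",
    "correlation", "inner_product",
})
IVF_METRICS = frozenset({
    "euclidean", "l2", "sqeuclidean", "cosine", "correlation", "inner_product",
})
RBC_METRICS = frozenset({"euclidean", "l2", "haversine"})
SPARSE_BRUTE_METRICS = frozenset({
    "l1", "cityblock", "taxicab", "manhattan", "euclidean", "l2",
    "canberra", "minkowski", "lp", "chebyshev", "linf", "cosine",
    "inner_product", "jaccard", "hellinger",
})


def metric_is_documented(metric, algorithm="brute", sparse=False):
    """Return True if the docstring lists `metric` for this algorithm/input kind."""
    if sparse:
        # Sparse inputs: only the 'brute' algorithm is documented as supported.
        return algorithm == "brute" and metric in SPARSE_BRUTE_METRICS
    if algorithm in ("ivfflat", "ivfpq"):
        return metric in IVF_METRICS
    if algorithm == "rbc":
        return metric in RBC_METRICS  # dimension limit not checked here
    if algorithm == "brute":
        return metric in DENSE_BRUTE_METRICS
    raise ValueError(f"algorithm {algorithm!r} is not covered by this sketch")
```

A check like `metric_is_documented("jaccard", algorithm="ivfflat")` returns `False`, which matches the docstring's statement that the IVF algorithms support only a subset of the dense metrics.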