topo.tpgraph.cknn

Module Contents

Functions

cknn_graph(X[, n_neighbors, delta, metric, weighted, ...])

Function-oriented implementation of Continuous k-Nearest-Neighbors (CkNN).

topo.tpgraph.cknn.cknn_graph(X, n_neighbors=10, delta=1.0, metric='euclidean', weighted=False, include_self=False, return_densities=False, backend='nmslib', n_jobs=1, verbose=False, **kwargs)

Function-oriented implementation of Continuous k-Nearest-Neighbors (CkNN). An efficient implementation of [CkNN](https://arxiv.org/pdf/1606.02353.pdf). According to the CkNN paper, it is the only unweighted graph construction for which the unnormalized graph Laplacian approximates the Laplace-Beltrami operator. It can also be used to yield a weighted affinity matrix with locality-sensitive weights.
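For reference, the connection rule from the CkNN paper linked above can be written as below. The symbols are notation chosen here for illustration, not identifiers from TopOMetry: d is the chosen metric, d_k(x) is the distance from x to its k-th nearest neighbor, and delta is the scale parameter.

```latex
% Unweighted CkNN: x_i and x_j are joined by an edge if and only if
\[
  d(x_i, x_j) \;<\; \delta \,\sqrt{d_k(x_i)\, d_k(x_j)}
\]
% d_k(x): distance from x to its k-th nearest neighbor (a local density proxy)
% \delta: global scale parameter; a larger \delta yields a denser graph
```

Because the threshold is scaled by the local k-th neighbor distances, points in sparse regions get a larger connection radius than points in dense regions.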

Parameters:
  • n_neighbors (int (optional, default=10).) – Number of neighbors to compute. Only half of this number is used as the k nearest neighbors in the CkNN normalization.

  • delta (float (optional, default=1.0).) – Scale parameter that sets the connection radius for each point. The combined radius grows in proportion to this parameter. This value should be tuned.

  • metric (str (optional, default='euclidean').) – The distance metric used between points. Accepted values follow the metric parameter of scipy.spatial.distance.pdist.

  • weighted (bool or None (optional, default=False).) – If True, the CkNN graph is weighted (i.e. an affinity matrix). If False, the CkNN graph is unweighted (i.e. the proper adjacency matrix). If None, returns a tuple of the adjacency matrix (unweighted) and the affinity matrix (weighted).

  • return_densities (bool (optional, default=False).) – If True, also returns the distance from each point to its k-th nearest neighbor.

  • include_self (bool (optional, default=False).) – If True, all diagonal elements are 1.0.

  • backend (str ‘hnswlib’, ‘nmslib’ or ‘sklearn’ (optional, default ‘nmslib’).) –

    Which backend to use to compute nearest neighbors. Options for fast, approximate nearest neighbors are ‘hnswlib’ and ‘nmslib’ (default). For exact nearest neighbors, use ‘sklearn’.

    • If using ‘nmslib’, a sparse csr_matrix input is expected. If using ‘hnswlib’ or ‘sklearn’, a dense array is expected.

    • I strongly recommend ‘hnswlib’ when handling somewhat dense, array-shaped data. If the data is relatively sparse, use ‘nmslib’, which operates on sparse matrices by default in TopOMetry and will automatically convert the input array to csr_matrix for performance.

  • n_jobs (int (optional, default 1).) – The number of jobs to use in the k-nearest-neighbors computation. Defaults to 1 (I highly recommend using all available cores).

  • verbose (bool (optional, default False).) – If True, print progress messages.

  • kwargs (dict (optional, default {}).) – Additional parameters to pass to the k-nearest-neighbors backend.
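
To make the roles of n_neighbors, delta, include_self and return_densities concrete, here is a minimal brute-force sketch in pure Python. It is not TopOMetry's implementation: the function name cknn_sketch is hypothetical, it uses exact Euclidean distances instead of an approximate backend, it covers only the unweighted case, and it assumes that a point is not counted among its own neighbors.

```python
import math


def cknn_sketch(points, n_neighbors=10, delta=1.0,
                include_self=False, return_densities=False):
    """Brute-force sketch of an unweighted CkNN adjacency matrix.

    Hypothetical helper for illustration only; it does not call TopOMetry.
    """
    n = len(points)
    # Per the parameter description above, half of n_neighbors is used as k.
    # Clamp k so it stays valid for very small inputs.
    k = min(max(1, n_neighbors // 2), n - 1)

    # Exact pairwise Euclidean distances (O(n^2), fine for a small example).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Distance from each point to its k-th nearest neighbor.
    # Assumption: the point itself is excluded from its own neighbor list.
    rho = [sorted(row[:i] + row[i + 1:])[k - 1] for i, row in enumerate(dist)]

    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # include_self controls the diagonal.
                adj[i][j] = 1.0 if include_self else 0.0
            elif dist[i][j] < delta * math.sqrt(rho[i] * rho[j]):
                # CkNN rule: compare the distance with the scaled geometric
                # mean of the two local k-th neighbor distances.
                adj[i][j] = 1.0

    return (adj, rho) if return_densities else adj


# Three evenly spaced points plus one distant outlier on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
adj, rho = cknn_sketch(points, n_neighbors=2, delta=1.5, return_densities=True)
for row in adj:
    print(row)
print(rho)
```

In this toy input the three close points form a chain while the outlier at x = 10 receives no edges. Raising delta widens every connection radius, so the graph becomes denser.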