topo.utils

Functions

get_landmark_indices(data[, n_landmarks, method, ...])
    Select landmark indices from data.

get_sparse_matrix_from_indices_distances(knn_indices, ...)

get_indices_distances_from_sparse_matrix(X, n_neighbors)
    Get the knn indices and distances for each point in a sparse k-nearest-neighbors matrix.

Package Contents

topo.utils.get_landmark_indices(data, n_landmarks=1000, method='random', random_state=None, **kwargs)

Select landmark indices from data.

Parameters:
  • data (array-like of shape (n_samples, n_features) or sparse matrix) – Input data. For method='kmeans', must be a feature matrix (not a precomputed graph).

  • n_landmarks (int, default 1000) – Number of landmarks to select.

  • method ({'random', 'kmeans'}, default 'random') – Landmark selection strategy.

      * 'random': uniform random sample of row indices.
      * 'kmeans': MiniBatchKMeans clustering; for each centroid, the nearest actual data point is returned (so the result is always a valid index array into data).

  • random_state (int or numpy.random.RandomState, optional) – Seed or RandomState instance for the random number generator.

  • **kwargs – Extra keyword arguments forwarded to MiniBatchKMeans.

Returns:

indices (ndarray of int, shape (n_landmarks,)) – Row indices of the selected landmarks.
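
The two strategies described above can be illustrated with a minimal NumPy sketch. It does not call topo.utils; the helper names random_landmarks and snap_centroids_to_points are hypothetical, sampling without replacement is an assumption, and the centroids are simple stand-ins rather than the output of MiniBatchKMeans.

```python
import numpy as np


def random_landmarks(n_samples, n_landmarks, seed=None):
    """Uniformly sample distinct row indices (the 'random' strategy).

    Sampling without replacement is an assumption of this sketch.
    """
    rng = np.random.RandomState(seed)
    return rng.choice(n_samples, size=n_landmarks, replace=False)


def snap_centroids_to_points(data, centroids):
    """Return, for each centroid, the index of the nearest row of `data`."""
    # Pairwise squared Euclidean distances, shape (n_centroids, n_samples).
    diffs = centroids[:, None, :] - data[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    return sq_dists.argmin(axis=1)


rng = np.random.RandomState(0)
data = rng.normal(size=(200, 5))

# 'random' strategy: 20 distinct row indices into `data`.
random_idx = random_landmarks(data.shape[0], n_landmarks=20, seed=0)

# Stand-in centroids: means of four chunks of rows. In the 'kmeans'
# strategy these would come from MiniBatchKMeans instead.
centroids = np.stack([chunk.mean(axis=0) for chunk in np.array_split(data, 4)])

# Snapping each centroid to its nearest data point yields valid row indices.
kmeans_like_idx = snap_centroids_to_points(data, centroids)
```

Snapping centroids to actual rows is what keeps the result usable as an index array: a raw centroid is a point in feature space and generally does not coincide with any row of data.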

topo.utils.get_sparse_matrix_from_indices_distances(knn_indices, knn_dists, n_obs, n_neighbors)

topo.utils.get_indices_distances_from_sparse_matrix(X, n_neighbors)

Get the knn indices and distances for each point in a sparse k-nearest-neighbors matrix.

Parameters:
  • X (sparse matrix) – Sparse k-nearest-neighbors matrix from which to extract indices and distances.

  • n_neighbors (int) – Number of neighbors to retrieve for each point.

Returns:

  • knn_indices (ndarray of shape (n_obs, n_neighbors)) – The indices of the nearest neighbors for each point.

  • knn_dists (ndarray of shape (n_obs, n_neighbors)) – The distances to the nearest neighbors for each point.
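
As an illustration of the shapes involved, the sketch below extracts neighbor indices and distances from the raw CSR arrays of a small k-NN graph. It is not the library's implementation: knn_from_csr is a hypothetical helper, and both the nearest-first ordering and the requirement that every row stores at least n_neighbors entries are assumptions. A scipy.sparse.csr_matrix exposes the same three arrays as its indptr, indices, and data attributes.

```python
import numpy as np


def knn_from_csr(indptr, indices, data, n_neighbors):
    """Split the CSR arrays of a k-NN graph into (n_obs, n_neighbors) arrays.

    Assumes every row stores at least `n_neighbors` entries.
    """
    n_obs = len(indptr) - 1
    knn_indices = np.zeros((n_obs, n_neighbors), dtype=int)
    knn_dists = np.zeros((n_obs, n_neighbors), dtype=float)
    for row in range(n_obs):
        start, stop = indptr[row], indptr[row + 1]
        cols, dists = indices[start:stop], data[start:stop]
        # Keep the n_neighbors closest stored entries, nearest first.
        order = np.argsort(dists, kind="stable")[:n_neighbors]
        knn_indices[row] = cols[order]
        knn_dists[row] = dists[order]
    return knn_indices, knn_dists


# Toy graph: 3 points, each storing distances to the other two.
indptr = np.array([0, 2, 4, 6])
indices = np.array([1, 2, 0, 2, 0, 1])
data = np.array([0.5, 0.2, 0.5, 0.9, 0.2, 0.9])

knn_indices, knn_dists = knn_from_csr(indptr, indices, data, n_neighbors=2)
# Row i of knn_indices holds the neighbors of point i; knn_dists holds
# the matching distances in the same order.
```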