topo.utils
Submodules
Functions
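get_landmark_indices(data, n_landmarks=1000, method='random', random_state=None, **kwargs) | Select landmark indices from data. |
get_sparse_matrix_from_indices_distances(knn_indices, knn_dists, n_obs, n_neighbors) | |
get_indices_distances_from_sparse_matrix(X, n_neighbors) | Get the knn indices and distances for each point in a sparse k-nearest-neighbors matrix. |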
Package Contents
- topo.utils.get_landmark_indices(data, n_landmarks=1000, method='random', random_state=None, **kwargs)
Select landmark indices from data.
- Parameters:
data (array-like of shape (n_samples, n_features) or sparse matrix) – Input data. For method='kmeans', must be a feature matrix (not a precomputed graph).
n_landmarks (int, default 1000) – Number of landmarks to select.
method ({'random', 'kmeans'}, default 'random') – Landmark selection strategy.
* 'random': uniform random sample of row indices.
* 'kmeans': MiniBatchKMeans clustering; for each centroid, the nearest actual data point is returned (so the result is always a valid index array into data).
random_state (int or numpy.random.RandomState, optional) – RNG seed / state.
**kwargs – Extra keyword arguments forwarded to MiniBatchKMeans.
- Returns:
indices (ndarray of int, shape (n_landmarks,)) – Row indices of the selected landmarks.
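The two strategies described above can be illustrated with a minimal NumPy sketch. This is not the topo.utils implementation: the helper names random_landmarks and nearest_rows_to_centroids are hypothetical, the centroids are hard-coded rather than produced by MiniBatchKMeans, and Euclidean distance is assumed for the nearest-point step.

```python
import numpy as np


def random_landmarks(n_samples, n_landmarks, seed=None):
    """Uniformly sample distinct row indices (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumption: never ask for more landmarks than there are rows.
    n_landmarks = min(n_landmarks, n_samples)
    return rng.choice(n_samples, size=n_landmarks, replace=False)


def nearest_rows_to_centroids(data, centroids):
    """Map each centroid to the index of its closest row in `data`."""
    # Squared Euclidean distances, shape (n_centroids, n_samples).
    d2 = ((centroids[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


data = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 4.9]])
# Stand-in for cluster centres, e.g. taken from a fitted k-means model.
centroids = np.array([[0.04, 0.0], [5.1, 5.0]])

kmeans_idx = nearest_rows_to_centroids(data, centroids)   # rows 0 and 2
random_idx = random_landmarks(len(data), 2, seed=0)
```

Because each centroid is snapped to an existing row, the result indexes directly into data, which is the property the 'kmeans' option documents. In this sketch two centroids could snap to the same row; how the library treats that case is not stated in this page.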
- topo.utils.get_sparse_matrix_from_indices_distances(knn_indices, knn_dists, n_obs, n_neighbors)
- topo.utils.get_indices_distances_from_sparse_matrix(X, n_neighbors)
Get the knn indices and distances for each point in a sparse k-nearest-neighbors matrix.
- Parameters:
X (sparse matrix) – Input knn matrix to get indices and distances from.
n_neighbors (int) – Number of neighbors to get.
- Returns:
knn_indices (ndarray of shape (n_obs, n_neighbors)) – The indices of the nearest neighbors for each point.
knn_dists (ndarray of shape (n_obs, n_neighbors)) – The distances to the nearest neighbors for each point.
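As a rough illustration of what this extraction involves, the sketch below reads the per-row neighbors out of a matrix stored in CSR form and returns them sorted by distance. It is an assumption-laden sketch, not the library's code: knn_from_csr is a hypothetical name, the matrix is passed as its raw CSR arrays, and rows with fewer than n_neighbors stored entries are padded with -1 and inf, a convention chosen here for illustration only.

```python
import numpy as np


def knn_from_csr(indptr, indices, data, n_neighbors):
    """Return (knn_indices, knn_dists) from raw CSR arrays (hypothetical helper)."""
    n_obs = len(indptr) - 1
    # Assumed padding values for rows with too few stored neighbors.
    knn_indices = np.full((n_obs, n_neighbors), -1, dtype=int)
    knn_dists = np.full((n_obs, n_neighbors), np.inf)
    for i in range(n_obs):
        start, stop = indptr[i], indptr[i + 1]
        cols, dists = indices[start:stop], data[start:stop]
        # Keep the n_neighbors smallest stored distances, nearest first.
        order = np.argsort(dists, kind="stable")[:n_neighbors]
        knn_indices[i, : len(order)] = cols[order]
        knn_dists[i, : len(order)] = dists[order]
    return knn_indices, knn_dists


# Three points, each with two stored neighbors.
indptr = np.array([0, 2, 4, 6])
indices = np.array([1, 2, 0, 2, 0, 1])
data = np.array([0.5, 0.2, 0.5, 0.9, 0.2, 0.9])

knn_indices, knn_dists = knn_from_csr(indptr, indices, data, n_neighbors=2)
```

With a SciPy CSR matrix X, the three input arrays correspond to X.indptr, X.indices, and X.data.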