topo.eval

Submodules

Classes

RiemannMetric

Functions

global_score_pca(X, Y[, Y_pca])

Compute the global score comparing an embedding to PCA.

global_score_laplacian(X, Y[, k, data_is_graph, ...])

Compute the global score comparing an embedding to a Laplacian Eigenmap baseline.

knn_spearman_r(data_graph, embedding_graph[, ...])

knn_kendall_tau(data_graph, embedding_graph[, ...])

geodesic_distance(A[, method, unweighted, directed, ...])

Compute the geodesic distance matrix from an adjacency (or an affinity) matrix.

geodesic_correlation(data, emb[, landmarks, ...])

get_eccentricity(emb, laplacian[, G_emb])

Package Contents

topo.eval.global_score_pca(X, Y, Y_pca=None)

Compute the global score comparing an embedding to PCA.

The score is defined as exp(-(L_emb - L_pca) / L_pca), where L denotes the mean reconstruction error (global loss) of a linear projection. A score of 1 means the embedding preserves as much global structure as PCA; scores below 1 indicate worse global preservation. The result is clipped to [0, 1]; because the exponential is strictly positive, the returned score lies in (0, 1].

Parameters:
  • X (array-like of shape (n_samples, n_features) or sparse matrix) – Input feature matrix.

  • Y (array-like of shape (n_samples, n_components)) – Low-dimensional embedding to evaluate.

  • Y_pca (array-like of shape (n_samples, n_components), optional) – Pre-computed PCA embedding. If None, computed from X.

Returns:

score (float in (0, 1]) – Global structure preservation score relative to PCA.
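
A minimal NumPy sketch of the score described above, written independently of the library's source. It assumes the global loss is the mean squared error of the best least-squares linear map from the centered embedding back to the centered data, which is one common convention and may differ in detail from what topo.eval does. The helper names global_loss, pca_embedding and score_vs_pca are hypothetical.

```python
import numpy as np


def global_loss(X, Y):
    """Mean squared error of the best linear map from Y back to X (assumed definition)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Least-squares fit of a linear map A such that Yc @ A approximates Xc.
    A, *_ = np.linalg.lstsq(Yc, Xc, rcond=None)
    return float(np.mean((Xc - Yc @ A) ** 2))


def pca_embedding(X, n_components):
    """PCA scores obtained from an SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def score_vs_pca(X, Y):
    """exp(-(L_emb - L_pca) / L_pca), clipped to [0, 1]."""
    loss_pca = global_loss(X, pca_embedding(X, Y.shape[1]))
    loss_emb = global_loss(X, Y)
    return float(np.clip(np.exp(-(loss_emb - loss_pca) / loss_pca), 0.0, 1.0))


rng = np.random.default_rng(0)
# Synthetic data with most of its variance in the first two features.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Y_good = pca_embedding(X, 2)         # the PCA embedding itself
Y_noise = rng.normal(size=(200, 2))  # an embedding unrelated to X

score_good = score_vs_pca(X, Y_good)
score_noise = score_vs_pca(X, Y_noise)
```

Under this assumed loss, the PCA embedding scores 1 against itself, while the unrelated embedding scores close to 0.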

topo.eval.global_score_laplacian(X, Y, k=10, data_is_graph=False, n_jobs=12, random_state=None)

Compute the global score comparing an embedding to a Laplacian Eigenmap baseline.

The score is defined as exp(-(L_emb - L_lap) / L_lap), where L denotes the mean reconstruction error (global loss) of a linear projection. A score of 1 means the embedding preserves as much global structure as a Laplacian Eigenmap of the same dimension; scores below 1 indicate worse global preservation. The result is clipped to [0, 1]; because the exponential is strictly positive, the returned score lies in (0, 1].

Parameters:
  • X (array-like of shape (n_samples, n_features) or sparse (n_samples, n_samples)) – Input feature matrix, or precomputed affinity graph if data_is_graph=True.

  • Y (array-like of shape (n_samples, n_components)) – Low-dimensional embedding to evaluate.

  • k (int, default 10) – Number of neighbors used by SpectralEmbedding when data_is_graph=False.

  • data_is_graph (bool, default False) – If True, X is treated as a precomputed affinity graph.

  • n_jobs (int, default 12) – Number of parallel jobs for SpectralEmbedding.

  • random_state (numpy.random.RandomState or int, optional) – Random state for SpectralEmbedding.

Returns:

score (float in (0, 1]) – Global structure preservation score relative to Laplacian Eigenmaps.
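
The baseline embedding can be illustrated with a small NumPy stand-in for a Laplacian Eigenmap. This is a sketch under stated assumptions, not the library's code: the documentation above says the baseline comes from SpectralEmbedding, whose graph construction and eigenvector scaling may differ from what is shown here. The functions knn_affinity, laplacian_eigenmap and baseline_embedding are hypothetical names; the data_is_graph switch mirrors the parameter of the same name.

```python
import numpy as np


def knn_affinity(X, k):
    """Symmetric binary k-nearest-neighbor affinity matrix (dense)."""
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(D2, np.inf)                    # exclude self-neighbors
    nn = np.argsort(D2, axis=1)[:, :k]
    W = np.zeros_like(D2)
    rows = np.repeat(np.arange(X.shape[0]), k)
    W[rows, nn.ravel()] = 1.0
    return np.maximum(W, W.T)                       # symmetrize


def laplacian_eigenmap(W, n_components):
    """Eigenvectors of the symmetric normalized Laplacian, skipping the first."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                     # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]


def baseline_embedding(X, n_components, k=10, data_is_graph=False):
    """Build the graph from features, or use X directly as an affinity graph."""
    W = X if data_is_graph else knn_affinity(X, k)
    return laplacian_eigenmap(W, n_components)


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))

# From raw features: the k-NN graph is built internally.
Y_lap = baseline_embedding(X, n_components=2, k=10)

# From a precomputed affinity graph: k is ignored.
W = knn_affinity(X, 10)
Y_lap_from_graph = baseline_embedding(W, n_components=2, data_is_graph=True)
```

Both calls give the same baseline here because they end up with the same graph; in the scoring formula, the global loss of this baseline would play the role of L_lap.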

topo.eval.knn_spearman_r(data_graph, embedding_graph, path_method='D', subsample_idx=None, unweighted=False, n_jobs=1)

topo.eval.knn_kendall_tau(data_graph, embedding_graph, path_method='D', subsample_idx=None, unweighted=False, n_jobs=1)

topo.eval.geodesic_distance(A, method='D', unweighted=False, directed=False, indices=None, n_jobs=-1, random_state=None)

Compute the geodesic distance matrix from an adjacency (or affinity) matrix. By default, the geodesic distance matrix is restricted to distances up to the k-th nearest-neighbor distance of each point, so that the evaluation focuses on how well the embedding preserves the local structure of the data.

Parameters:
  • A (array-like, shape (n_vertices, n_vertices)) – Adjacency or affinity matrix of a graph.

  • method (string, optional, default: 'D') – Method to compute the shortest path:
      - 'D': Dijkstra's algorithm.
      - 'FW': Floyd-Warshall algorithm.
      - 'B': Bellman-Ford algorithm.
      - 'J': Johnson algorithm.
      - 'F': Floyd algorithm.

  • unweighted (bool, optional, default: False) – If True, the adjacency matrix is treated as unweighted.

  • directed (bool, optional, default: False) – If True, the adjacency matrix is treated as directed.

  • indices (array-like, shape (n_indices, ), optional, default: None) – Indices of the vertices for which to compute geodesic distances.

  • n_jobs (int, optional, default: -1) – The number of parallel jobs to use during search.

Returns:

geodesic_distance (array-like, shape (n_vertices, n_vertices))
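
To make the unweighted, directed and indices parameters concrete, here is a small pure-Python sketch of shortest-path (geodesic) distances using Dijkstra's algorithm on a dense matrix. It is an illustration only and does not reproduce the library's implementation: the function geodesic_distances and its helper are hypothetical, positive entries are assumed to be edge lengths, zeros are assumed to mean "no edge", and the k-nearest-neighbor restriction mentioned above is not applied.

```python
import heapq
import math


def _edge_length(A, u, v, directed):
    """Length of the edge u -> v, or None if absent (0 is read as 'no edge')."""
    candidates = [A[u][v]] if directed else [A[u][v], A[v][u]]
    positive = [w for w in candidates if w > 0]
    return min(positive) if positive else None


def geodesic_distances(A, unweighted=False, directed=False, indices=None):
    """Shortest-path lengths from each source vertex via Dijkstra's algorithm."""
    n = len(A)
    sources = range(n) if indices is None else indices
    result = []
    for s in sources:
        dist = [math.inf] * n
        dist[s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v in range(n):
                w = _edge_length(A, u, v, directed)
                if w is None:
                    continue
                if unweighted:
                    w = 1.0  # count hops instead of summing lengths
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        result.append(dist)
    return result


# Path graph 0 - 1 - 2 - 3, with edges stored only in the upper triangle.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],
]

D_undirected = geodesic_distances(A)                 # edges usable both ways
D_directed = geodesic_distances(A, directed=True)    # edges usable one way only
D_hops = geodesic_distances(A, unweighted=True)      # path length in hops
D_subset = geodesic_distances(A, indices=[0])        # distances from vertex 0 only
```

In this toy graph the undirected distance from vertex 0 to vertex 3 is 3.5, the hop count is 3, and with directed=True vertex 0 becomes unreachable from vertex 3 (infinite distance).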

topo.eval.geodesic_correlation(data, emb, landmarks=None, landmark_method='random', metric='euclidean', n_neighbors=3, n_jobs=-1, cor_method='spearman', random_state=None, return_graphs=False, verbose=False, **kwargs)

class topo.eval.RiemannMetric(Y, L)

Y
L
mdimG
get_dual_rmetric(invert_h=False)
get_rmetric(return_svd=False)
get_mdimG()
get_detG(use_log=True)
fit(Y, L=None)
transform(Y, L=None)

topo.eval.get_eccentricity(emb, laplacian, G_emb=None)