topo.eval

Submodules

Classes

RiemannMetric

Functions

`global_score_pca`(X, Y[, Y_pca])	Compute the global score comparing an embedding to PCA.
`global_score_laplacian`(X, Y[, k, data_is_graph, ...])	Compute the global score comparing an embedding to a Laplacian Eigenmap baseline.
`knn_spearman_r`(data_graph, embedding_graph[, ...])
`knn_kendall_tau`(data_graph, embedding_graph[, ...])
`geodesic_distance`(A[, method, unweighted, directed, ...])	Compute the geodesic distance matrix from an adjacency (or an affinity) matrix.
`geodesic_correlation`(data, emb[, landmarks, ...])
`get_eccentricity`(emb, laplacian[, G_emb])

Package Contents

topo.eval.global_score_pca(X, Y, Y_pca=None)

Compute the global score comparing an embedding to PCA.

The score is defined as exp(-(L_emb - L_pca) / L_pca), where each L denotes the mean reconstruction error (global loss) of a linear projection, computed for the embedding (L_emb) and for PCA (L_pca). A score of 1 means the embedding preserves as much global structure as PCA; scores below 1 indicate worse global preservation. The result is clipped to [0, 1]; because the exponential is always positive, the effective range is (0, 1].

Parameters:

X (array-like of shape (n_samples, n_features) or sparse matrix) – Input feature matrix.
Y (array-like of shape (n_samples, n_components)) – Low-dimensional embedding to evaluate.
Y_pca (array-like of shape (n_samples, n_components), optional) – Pre-computed PCA embedding. If None, computed from X.

Returns:

score (float in (0, 1]) – Global structure preservation score relative to PCA.
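
The formula above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `relative_global_score` and its two loss arguments are hypothetical, and how the reconstruction errors themselves are computed is not shown here. The cap at 1 reflects one reading of the clipping described above.

```python
import math


def relative_global_score(loss_emb: float, loss_baseline: float) -> float:
    """Hypothetical helper: exp(-(L_emb - L_base) / L_base), capped at 1."""
    if loss_baseline <= 0:
        raise ValueError("baseline loss must be positive")
    score = math.exp(-(loss_emb - loss_baseline) / loss_baseline)
    # The exponential is always positive, so only the upper bound needs a cap.
    return min(score, 1.0)


# Equal losses: the embedding matches the baseline.
same = relative_global_score(2.0, 2.0)
# Twice the baseline loss: the score drops to exp(-1).
worse = relative_global_score(4.0, 2.0)
# Lower loss than the baseline: the raw value exceeds 1 and is capped.
better = relative_global_score(1.0, 2.0)
```

The same expression applies to the Laplacian Eigenmap variant, with the baseline loss taken from the Laplacian Eigenmap instead of PCA.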

topo.eval.global_score_laplacian(X, Y, k=10, data_is_graph=False, n_jobs=12, random_state=None)

Compute the global score comparing an embedding to a Laplacian Eigenmap baseline.

The score is defined as exp(-(L_emb - L_lap) / L_lap), where each L denotes the mean reconstruction error (global loss) of a linear projection, computed for the embedding (L_emb) and for the Laplacian Eigenmap (L_lap). A score of 1 means the embedding preserves as much global structure as a Laplacian Eigenmap of the same dimension; scores below 1 indicate worse global preservation. The result is clipped to [0, 1]; because the exponential is always positive, the effective range is (0, 1].

Parameters:

X (array-like of shape (n_samples, n_features) or sparse (n_samples, n_samples)) – Input feature matrix, or precomputed affinity graph if data_is_graph=True.
Y (array-like of shape (n_samples, n_components)) – Low-dimensional embedding to evaluate.
k (int, default 10) – Number of neighbors used by SpectralEmbedding when data_is_graph=False.
data_is_graph (bool, default False) – If True, X is treated as a precomputed affinity graph.
n_jobs (int, default 12) – Number of parallel jobs for SpectralEmbedding.
random_state (numpy.random.RandomState or int, optional) – Random state for SpectralEmbedding.

Returns:

score (float in (0, 1]) – Global structure preservation score relative to Laplacian Eigenmaps.

topo.eval.knn_spearman_r(data_graph, embedding_graph, path_method='D', subsample_idx=None, unweighted=False, n_jobs=1)

topo.eval.knn_kendall_tau(data_graph, embedding_graph, path_method='D', subsample_idx=None, unweighted=False, n_jobs=1)

topo.eval.geodesic_distance(A, method='D', unweighted=False, directed=False, indices=None, n_jobs=-1, random_state=None)

Compute the geodesic distance matrix from an adjacency (or an affinity) matrix. By default, the geodesic distance matrix is subset so that, for each point, it only includes distances up to the k-th nearest neighbor distance. This restricts the assessment to how well the embedding preserves the local structure of the data.

Parameters:

A (array-like, shape (n_vertices, n_vertices)) – Adjacency or affinity matrix of a graph.
method (string, optional, default: 'D') – Method used to compute the shortest paths. One of 'D' (Dijkstra's algorithm), 'FW' (Floyd-Warshall algorithm), 'B' (Bellman-Ford algorithm), 'J' (Johnson algorithm), or 'F' (Floyd algorithm).
unweighted (bool, optional, default: False) – If True, the adjacency matrix is treated as unweighted.
directed (bool, optional, default: False) – If True, the adjacency matrix is treated as directed.
indices (array-like, shape (n_indices, ), optional, default: None) – Indices of the vertices for which to compute geodesic distances.
n_jobs (int, optional, default: -1) – The number of parallel jobs to use during search.

Returns:

geodesic_distance (array-like, shape (n_vertices, n_vertices)) – Geodesic distance matrix.
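
To make the idea of a geodesic (shortest-path) distance on a graph concrete, here is a small pure-Python sketch of Dijkstra's algorithm run from every vertex. It is an illustration only, not what `geodesic_distance` does internally. It assumes that matrix entries are edge lengths and that a zero entry means "no edge"; how the library handles affinities, neighbor subsetting, or parallelism is not covered. The function name `all_pairs_geodesic` is hypothetical.

```python
import heapq
import math


def all_pairs_geodesic(adj, unweighted=False):
    """Shortest-path distances between all vertices of a dense adjacency matrix.

    Assumptions (illustrative only): entries are edge lengths, and 0 means no edge.
    """
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[source][u]:
                continue  # stale queue entry, a shorter path was already found
            for v, w in enumerate(adj[u]):
                if w == 0:
                    continue  # no edge between u and v
                step = 1.0 if unweighted else float(w)
                candidate = d + step
                if candidate < dist[source][v]:
                    dist[source][v] = candidate
                    heapq.heappush(heap, (candidate, v))
    return dist


# A triangle where the direct edge 0-2 is longer than the detour through 1.
adj = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]
weighted = all_pairs_geodesic(adj)                # uses edge lengths
hops = all_pairs_geodesic(adj, unweighted=True)   # counts edges instead
```

In the weighted case the path 0 -> 1 -> 2 (length 3) beats the direct edge (length 4); with `unweighted=True` every edge counts as one step, so the direct edge wins.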

topo.eval.geodesic_correlation(data, emb, landmarks=None, landmark_method='random', metric='euclidean', n_neighbors=3, n_jobs=-1, cor_method='spearman', random_state=None, return_graphs=False, verbose=False, **kwargs)

class topo.eval.RiemannMetric(Y, L)

Y

L

mdimG

get_dual_rmetric(invert_h=False)

get_rmetric(return_svd=False)

get_mdimG()

get_detG(use_log=True)

fit(Y, L=None)

transform(Y, L=None)

topo.eval.get_eccentricity(emb, laplacian, G_emb=None)