topo.spectral._spectral

Module Contents

Functions

_dense_degree(W)

_dense_unnormalized_laplacian(W[, return_D])

_dense_symmetric_normalized_laplacian(W[, return_D])

_dense_normalized_random_walk_laplacian(W[, return_D])

_dense_diffusion(W[, alpha, semi_aniso])

_dense_diffusion_symmetric(W[, alpha, semi_aniso, ...])

_sparse_degree(W)

degree(W)

_sparse_unnormalized_laplacian(W[, return_D])

_sparse_symmetrized_normalized_laplacian(W[, return_D])

_sparse_normalized_random_walk_laplacian(W[, return_D])

_sparse_diffusion(W[, alpha, semi_aniso])

_sparse_diffusion_symmetric(W[, alpha, semi_aniso, ...])

graph_laplacian(W[, laplacian_type, return_D])

Compute the graph Laplacian, given a adjacency or affinity graph W. For a friendly reference,

LE(W[, n_eigs, laplacian_type, drop_first, ...])

Performs [Laplacian Eigenmaps](https://www2.imm.dtu.dk/projects/manifold/Papers/Laplacian.pdf), given a adjacency or affinity graph W.

diffusion_operator(W[, alpha, symmetric, semi_aniso, ...])

Computes the [diffusion operator](https://doi.org/10.1016/j.acha.2006.04.006).

DM(W[, n_eigs, alpha, return_evals, symmetric, ...])

Performs [Diffusion Maps](https://doi.org/10.1016/j.acha.2006.04.006), given a adjacency or affinity graph W.

_set_diag(laplacian, value, norm_laplacian)

spectral_clustering(init[, max_svd_restarts, ...])

Search for a partition matrix (clustering) which is closest to the eigenvector embedding.

topo.spectral._spectral._dense_degree(W)
topo.spectral._spectral._dense_unnormalized_laplacian(W, return_D=False)
topo.spectral._spectral._dense_symmetric_normalized_laplacian(W, return_D=False)
topo.spectral._spectral._dense_normalized_random_walk_laplacian(W, return_D=False)
topo.spectral._spectral._dense_diffusion(W, alpha=0, semi_aniso=False)
topo.spectral._spectral._dense_diffusion_symmetric(W, alpha=0, semi_aniso=False, return_D_inv_sqrt=False)
topo.spectral._spectral._sparse_degree(W)
topo.spectral._spectral.degree(W)
topo.spectral._spectral._sparse_unnormalized_laplacian(W, return_D=False)
topo.spectral._spectral._sparse_symmetrized_normalized_laplacian(W, return_D=False)
topo.spectral._spectral._sparse_normalized_random_walk_laplacian(W, return_D=False)
topo.spectral._spectral._sparse_diffusion(W, alpha=0, semi_aniso=False)
topo.spectral._spectral._sparse_diffusion_symmetric(W, alpha=0, semi_aniso=False, return_D_inv_sqrt=False)
topo.spectral._spectral.graph_laplacian(W, laplacian_type='normalized', return_D=False)

Compute the graph Laplacian, given a adjacency or affinity graph W. For a friendly reference, see this material from James Melville: https://jlmelville.github.io/smallvis/spectral.html

Parameters:
  • W (scipy.sparse.csr_matrix or np.ndarray) – The graph adjacency or affinity matrix. Assumed to be symmetric and with zero diagonal. No further symmetrization is performed, so make sure to symmetrize W if necessary (usually done additively with W = (W + W.T)/2 ).

  • laplacian (str (optional, default 'random_walk').) – The type of laplacian to use. Can be ‘unnormalized’, ‘normalized’ or ‘random_walk’.

  • return_D (bool (optional, default False).) – Whether to also return a degree matrix with the Laplacian in a tuple

Returns:

L (scipy.sparse.csr_matrix) – The graph Laplacian.

topo.spectral._spectral.LE(W, n_eigs=10, laplacian_type='random_walk', drop_first=True, return_evals=False, eigen_tol=0, random_state=None)

Performs [Laplacian Eigenmaps](https://www2.imm.dtu.dk/projects/manifold/Papers/Laplacian.pdf), given a adjacency or affinity graph W. The graph W can be a sparse matrix or a dense matrix. It is assumed to be symmetric (no further symmetrization is performed, be sure it is), and with zero diagonal (all diagonal elements are 0). The eigenvectors associated with the smallest eigenvalues form a new orthonormal basis which represents the graph in the feature space and are useful for denoising and clustering.

Parameters:
  • W (scipy.sparse.csr_matrix or np.ndarray) – The graph adjacency or affinity matrix. Assumed to be symmetric and with zero diagonal.

  • n_eigs (int (optional, default 10).) – The number of eigenvectors to compute.

  • laplacian (str (optional, default 'random_walk').) – The type of laplacian to use. Can be ‘unnormalized’, ‘symmetric’, or ‘random_walk’.

  • drop_first (bool (optional, default True).) – Whether to drop the first eigenvector.

  • return_evals (bool (optional, default False).) – Whether to return the eigenvalues. If True, returns a tuple of (eigenvectors, eigenvalues).

  • eigen_tol (float (optional, default 0).) – The tolerance for the eigendecomposition.

  • random_state (int (optional, default None).) – The random state for the eigendecomposition in scipy.sparse.linalg.lobpcg() if the data has more than a million samples.

Returns:

  • evecs (np.ndarray of shape (W.shape[0], n_eigs)) – The eigenvectors of the graph Laplacian, sorted by ascending eigenvalues.

  • If return_evals – evecs, evals : tuple of ndarrays The eigenvectors and associated eigenvalues, sorted by ascending eigenvalues.

topo.spectral._spectral.diffusion_operator(W, alpha=1.0, symmetric=False, semi_aniso=False, return_D_inv_sqrt=False)

Computes the [diffusion operator](https://doi.org/10.1016/j.acha.2006.04.006).

Parameters:
  • W (scipy.sparse.csr_matrix or np.ndarray) – The graph adjacency or affinity matrix. Assumed to be symmetric and with zero diagonal. No further symmetrization is performed, so make sure to symmetrize W if necessary (usually done additively with W = (W + W.T)/2 ).

  • alpha (float (optional, default 1.0).) – Anisotropy to apply. ‘Alpha’ in the diffusion maps literature.

  • symmetric (bool (optional, default True).) – Whether to use a symmetric version of the diffusion operator. This is particularly useful to yield a symmetric operator when using anisotropy (alpha > 0), as the diffusion operator P would be assymetric otherwise, which can be problematic during matrix decomposition. Eigenvalues are the same of the assymetric version, and the eigenvectors of the original assymetric operator can be obtained by left multiplying by D_inv_sqrt (returned if return_D_inv_sqrt set to True).

  • semi_aniso (bool (optional, default False).) – Whether to use semi-anisotropic diffusion. This reweights the original kernel (not the renormalized kernel) by the renormalized degree.

  • return_D_inv_sqrt (bool (optional, default False).) – Whether to return a tuple of diffusion operator P and inverse square root of the degree matrix.

Returns:

P (scipy.sparse.csr_matrix) – The graph diffusion operator.

topo.spectral._spectral.DM(W, n_eigs=10, alpha=0, return_evals=False, symmetric=False, eigen_tol=0.001, t=None)

Performs [Diffusion Maps](https://doi.org/10.1016/j.acha.2006.04.006), given a adjacency or affinity graph W. The graph W can be a sparse matrix or a dense matrix. It is assumed to be symmetric (no further symmetrization is performed, be sure it is), and with zero diagonal (all diagonal elements are 0). The eigenvectors associated with the largest eigenvalues form a new orthonormal basis which represents the graph in the feature space and are useful for denoising and clustering.

Parameters:
  • W (scipy.sparse.csr_matrix or np.ndarray) – The graph adjacency or affinity matrix. Assumed to be symmetric and with zero diagonal.

  • n_eigs (int (optional, default 10).) – The number of eigenvectors to return.

  • alpha (float (optional, default 0).) – Anisotropy to be applied to the diffusion map. Refered to as alpha in the diffusion maps literature.

  • return_evals (bool (optional, default False).) – Whether to return the eigenvalues. If True, returns a tuple of (eigenvectors, eigenvalues).

  • symmetric (bool (optional, default False).) – Whether to use a symmetric version of the diffusion operator. This is particularly useful to yield a symmetric operator, but that can also be obtained by a simplistic mean with its the transpose with near identical results.

  • eigen_tol (float (optional, default 0).) – The tolerance for the eigendecomposition in scipy.sparse.linalg.eigsh().

  • t (int (optional, default 1).) – The number of steps to take in the diffusion map.

Returns:

  • * If return_evals is True – A tuple of scaled eigenvectors (the diffusion maps) and eigenvalues.

  • * If return_evals is False – An array of scaled eigenvectors (the diffusion maps).

topo.spectral._spectral._set_diag(laplacian, value, norm_laplacian)
topo.spectral._spectral.spectral_clustering(init, max_svd_restarts=50, n_iter_max=50, random_state=None, copy=True)

Search for a partition matrix (clustering) which is closest to the eigenvector embedding.

Parameters:
  • init (array-like of shape (n_samples, n_clusters)) – The embedding space of the samples.

  • max_svd_restarts (int, default=30) – Maximum number of attempts to restart SVD if convergence fails

  • n_iter_max (int, default=30) – Maximum number of iterations to attempt in rotation and partition matrix search if machine precision convergence is not reached

  • random_state (int, RandomState instance, default=None) – Determines random number generation for rotation matrix initialization. Use an int to make the randomness deterministic. See Glossary.

  • copy (bool, default=True) – Whether to copy vectors, or perform in-place normalization.

Returns:

labels (array of integers, shape: n_samples) – The labels of the clusters.

References

Notes

The eigenvector embedding is used to iteratively search for the closest discrete partition. First, the eigenvector embedding is normalized to the space of partition matrices. An optimal discrete partition matrix closest to this normalized embedding multiplied by an initial rotation is calculated. Fixing this discrete partition matrix, an optimal rotation matrix is calculated. These two calculations are performed until convergence. The discrete partition matrix is returned as the clustering solution. Used in spectral clustering, this method tends to be faster and more robust to random initialization than k-means.