topo.spectral.umap_layouts

Module Contents

Functions

clip(val)

Standard clamping of a value into a fixed range (in this case -4.0 to 4.0).

rdist(x, y)

Reduced Euclidean distance.

_optimize_layout_euclidean_single_epoch(...)

_optimize_layout_euclidean_densmap_epoch_init(...)

optimize_layout_euclidean(head_embedding, ...[, ...])

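Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets.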

optimize_layout_generic(head_embedding, ...[, gamma, ...])

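Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets.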

optimize_layout_inverse(head_embedding, ...[, gamma, ...])

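Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets.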

_optimize_layout_aligned_euclidean_single_epoch(...)

optimize_layout_aligned_euclidean(head_embeddings, ...)

topo.spectral.umap_layouts.clip(val)

Standard clamping of a value into a fixed range (in this case -4.0 to 4.0).

Parameters:
  • val (float) – The value to be clamped.

Returns:

The clamped value, now fixed to be in the range -4.0 to 4.0.
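As an illustration of the clamping described above, here is a minimal pure-Python sketch. The name `clip_sketch` and the `bound` argument are hypothetical and are not part of this module, which takes only `val` and uses the fixed range -4.0 to 4.0.

```python
def clip_sketch(val: float, bound: float = 4.0) -> float:
    """Clamp val into the closed interval [-bound, bound]."""
    if val > bound:
        return bound
    if val < -bound:
        return -bound
    return val


print(clip_sketch(7.5))    # 4.0
print(clip_sketch(-0.25))  # -0.25
```

In the layout optimizers, a clamp like this is typically applied to each gradient component so that a single update cannot move a point arbitrarily far.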

topo.spectral.umap_layouts.rdist(x, y)

Reduced Euclidean distance, i.e. the squared Euclidean distance (no square root is taken).

Parameters:
  • x (array of shape (embedding_dim,))

  • y (array of shape (embedding_dim,))

Returns:

The squared Euclidean distance between x and y.
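The computation can be sketched in plain Python as follows. `rdist_sketch` is a hypothetical stand-in for illustration; the module's own function operates on arrays, and its implementation is not reproduced here.

```python
def rdist_sketch(x, y):
    """Sum of squared coordinate differences between x and y."""
    total = 0.0
    for xi, yi in zip(x, y):
        diff = xi - yi
        total += diff * diff
    return total


print(rdist_sketch([0.0, 3.0], [4.0, 0.0]))  # 25.0
```

Skipping the square root is cheaper and is sufficient when the result is only compared against, or fed into a formula written in terms of, squared distances.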

topo.spectral.umap_layouts._optimize_layout_euclidean_single_epoch(head_embedding, tail_embedding, head, tail, n_vertices, epochs_per_sample, a, b, rng_state, gamma, dim, move_other, alpha, epochs_per_negative_sample, epoch_of_next_negative_sample, epoch_of_next_sample, n, densmap_flag, dens_phi_sum, dens_re_sum, dens_re_cov, dens_re_std, dens_re_mean, dens_lambda, dens_R, dens_mu, dens_mu_tot)

topo.spectral.umap_layouts._optimize_layout_euclidean_densmap_epoch_init(head_embedding, tail_embedding, head, tail, a, b, re_sum, phi_sum)

topo.spectral.umap_layouts.optimize_layout_euclidean(head_embedding, tail_embedding, head, tail, n_epochs, n_vertices, epochs_per_sample, a, b, rng_state, gamma=1.0, initial_alpha=1.0, negative_sample_rate=5.0, parallel=False, verbose=False, densmap=False, densmap_kwds={})

Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets. In practice this is done by sampling edges based on their membership strength (with the (1-p) terms coming from negative sampling similar to word2vec).

Parameters:
  • head_embedding (array of shape (n_samples, n_components)) – The initial embedding to be improved by SGD.

  • tail_embedding – The reference embedding of embedded points. If not embedding new, previously unseen points with respect to an existing embedding, this is simply the head_embedding (again); otherwise it provides the existing embedding to embed with respect to.

  • head (array of shape (n_1_simplices)) – The indices of the heads of 1-simplices with non-zero membership.

  • tail (array of shape (n_1_simplices)) – The indices of the tails of 1-simplices with non-zero membership.

  • n_epochs (int) – The number of training epochs to use in optimization.

  • n_vertices (int) – The number of vertices (0-simplices) in the dataset.

  • epochs_per_sample (array of shape (n_1_simplices)) – A float value of the number of epochs per 1-simplex. 1-simplices with weaker membership strength will have more epochs between being sampled.

  • a (float) – Parameter of differentiable approximation of right adjoint functor

  • b (float) – Parameter of differentiable approximation of right adjoint functor

  • rng_state (array of int64, shape (3,)) – The internal state of the rng

  • gamma (float (optional, default 1.0)) – Weight to apply to negative samples.

  • initial_alpha (float (optional, default 1.0)) – Initial learning rate for the SGD.

  • negative_sample_rate (int (optional, default 5)) – Number of negative samples to use per positive sample.

  • parallel (bool (optional, default False)) – Whether to run the computation using numba parallel. Running in parallel is non-deterministic, and is not used if a random seed has been set, to ensure reproducibility.

  • verbose (bool (optional, default False)) – Whether to report information on the current progress of the algorithm.

  • densmap (bool (optional, default False)) – Whether to use the density-augmented densMAP objective

  • densmap_kwds (dict (optional, default {})) – Auxiliary data for densMAP

Returns:

embedding (array of shape (n_samples, n_components)) – The optimized embedding.
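To make the roles of `a`, `b`, `gamma`, and the learning rate concrete, here is a simplified sketch of one attractive update (for a sampled edge) and one repulsive update (for a negative sample). It assumes the gradient coefficients of the reference UMAP formulation; this page does not confirm that the module uses exactly these expressions. The helper names are hypothetical, and the densMAP terms and the zero-distance special cases are omitted.

```python
A = 1.576943460405378   # default `a` in optimize_layout_aligned_euclidean
B = 0.8950608781227859  # default `b` in the same signature


def clip_sketch(val, bound=4.0):
    """Clamp a gradient component into [-bound, bound]."""
    return max(-bound, min(bound, val))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def attract(current, other, alpha, a=A, b=B):
    """Pull `current` toward `other` in place (positive sample)."""
    d2 = sq_dist(current, other)
    if d2 <= 0.0:
        return  # coincident points: nothing to do in this sketch
    coeff = -2.0 * a * b * d2 ** (b - 1.0) / (a * d2 ** b + 1.0)
    for d in range(len(current)):
        current[d] += alpha * clip_sketch(coeff * (current[d] - other[d]))


def repel(current, other, alpha, gamma=1.0, a=A, b=B):
    """Push `current` away from `other` in place (negative sample)."""
    d2 = sq_dist(current, other)
    if d2 <= 0.0:
        return  # simplification: the zero-distance case is skipped here
    coeff = 2.0 * gamma * b / ((0.001 + d2) * (a * d2 ** b + 1.0))
    for d in range(len(current)):
        current[d] += alpha * clip_sketch(coeff * (current[d] - other[d]))


point, neighbour = [0.0, 0.0], [1.0, 0.0]
attract(point, neighbour, alpha=0.5)
print(sq_dist(point, neighbour) < 1.0)  # True: the pair moved closer
```

The attractive coefficient is negative, so the update moves the point along the direction toward its neighbour; the repulsive coefficient is positive and scaled by `gamma`, which is why `gamma` acts as the weight on negative samples.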

topo.spectral.umap_layouts.optimize_layout_generic(head_embedding, tail_embedding, head, tail, n_epochs, n_vertices, epochs_per_sample, a, b, rng_state, gamma=1.0, initial_alpha=1.0, negative_sample_rate=5.0, output_metric=dist.euclidean, output_metric_kwds=(), verbose=False)

Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets. In practice this is done by sampling edges based on their membership strength (with the (1-p) terms coming from negative sampling similar to word2vec).

Parameters:
  • head_embedding (array of shape (n_samples, n_components)) – The initial embedding to be improved by SGD.

  • tail_embedding – The reference embedding of embedded points. If not embedding new, previously unseen points with respect to an existing embedding, this is simply the head_embedding (again); otherwise it provides the existing embedding to embed with respect to.

  • head (array of shape (n_1_simplices)) – The indices of the heads of 1-simplices with non-zero membership.

  • tail (array of shape (n_1_simplices)) – The indices of the tails of 1-simplices with non-zero membership.

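  • weight (array of shape (n_1_simplices)) – The membership weights of the 1-simplices. Note that this parameter is listed in the docstring but does not appear in the signature of optimize_layout_generic.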

  • n_epochs (int) – The number of training epochs to use in optimization.

  • n_vertices (int) – The number of vertices (0-simplices) in the dataset.

  • epochs_per_sample (array of shape (n_1_simplices)) – A float value of the number of epochs per 1-simplex. 1-simplices with weaker membership strength will have more epochs between being sampled.

  • a (float) – Parameter of differentiable approximation of right adjoint functor

  • b (float) – Parameter of differentiable approximation of right adjoint functor

  • rng_state (array of int64, shape (3,)) – The internal state of the rng

  • gamma (float (optional, default 1.0)) – Weight to apply to negative samples.

  • initial_alpha (float (optional, default 1.0)) – Initial learning rate for the SGD.

  • negative_sample_rate (int (optional, default 5)) – Number of negative samples to use per positive sample.

  • verbose (bool (optional, default False)) – Whether to report information on the current progress of the algorithm.

Returns:

embedding (array of shape (n_samples, n_components)) – The optimized embedding.
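The `epochs_per_sample` parameter controls how often each 1-simplex is visited. The sketch below shows one way such an array could be derived from membership weights and how it translates into update counts. It follows the scheme used by the reference UMAP implementation, which is an assumption here, and both function names are hypothetical.

```python
def epochs_per_sample_sketch(weights, n_epochs):
    """Epochs between samples for each edge; -1.0 means never sampled."""
    max_w = max(weights)
    result = []
    for w in weights:
        n_samples = n_epochs * (w / max_w)
        result.append(n_epochs / n_samples if n_samples > 0 else -1.0)
    return result


def count_updates(epochs_per_sample, n_epochs):
    """Count how often each edge is sampled over n_epochs epochs."""
    next_epoch = list(epochs_per_sample)  # epoch of the next sample per edge
    counts = [0] * len(epochs_per_sample)
    for n in range(n_epochs):
        for i, step in enumerate(epochs_per_sample):
            if step > 0 and next_epoch[i] <= n:
                counts[i] += 1
                next_epoch[i] += step
    return counts


eps = epochs_per_sample_sketch([1.0, 0.5, 0.25], n_epochs=200)
print(eps)  # [1.0, 2.0, 4.0]
```

The strongest edge is visited every epoch, while an edge with a quarter of its weight is visited every fourth epoch, matching the statement that weaker 1-simplices have more epochs between being sampled.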

topo.spectral.umap_layouts.optimize_layout_inverse(head_embedding, tail_embedding, head, tail, weight, sigmas, rhos, n_epochs, n_vertices, epochs_per_sample, a, b, rng_state, gamma=1.0, initial_alpha=1.0, negative_sample_rate=5.0, output_metric=dist.euclidean, output_metric_kwds=(), verbose=False)

Improve an embedding using stochastic gradient descent to minimize the fuzzy set cross entropy between the 1-skeletons of the high dimensional and low dimensional fuzzy simplicial sets. In practice this is done by sampling edges based on their membership strength (with the (1-p) terms coming from negative sampling similar to word2vec).

Parameters:
  • head_embedding (array of shape (n_samples, n_components)) – The initial embedding to be improved by SGD.

  • tail_embedding – The reference embedding of embedded points. If not embedding new, previously unseen points with respect to an existing embedding, this is simply the head_embedding (again); otherwise it provides the existing embedding to embed with respect to.

  • head (array of shape (n_1_simplices)) – The indices of the heads of 1-simplices with non-zero membership.

  • tail (array of shape (n_1_simplices)) – The indices of the tails of 1-simplices with non-zero membership.

  • weight (array of shape (n_1_simplices)) – The membership weights of the 1-simplices.

  • n_epochs (int) – The number of training epochs to use in optimization.

  • n_vertices (int) – The number of vertices (0-simplices) in the dataset.

  • epochs_per_sample (array of shape (n_1_simplices)) – A float value of the number of epochs per 1-simplex. 1-simplices with weaker membership strength will have more epochs between being sampled.

  • a (float) – Parameter of differentiable approximation of right adjoint functor

  • b (float) – Parameter of differentiable approximation of right adjoint functor

  • rng_state (array of int64, shape (3,)) – The internal state of the rng

  • gamma (float (optional, default 1.0)) – Weight to apply to negative samples.

  • initial_alpha (float (optional, default 1.0)) – Initial learning rate for the SGD.

  • negative_sample_rate (int (optional, default 5)) – Number of negative samples to use per positive sample.

  • verbose (bool (optional, default False)) – Whether to report information on the current progress of the algorithm.

Returns:

embedding (array of shape (n_samples, n_components)) – The optimized embedding.
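The `initial_alpha` and `negative_sample_rate` parameters shared by these optimizers can be illustrated with a short sketch. It assumes a linearly decaying learning rate and a negative-sampling interval obtained by dividing `epochs_per_sample` by the rate, as in the reference UMAP implementation; neither detail is stated on this page, and the function names are hypothetical.

```python
def alpha_schedule(initial_alpha, n_epochs):
    """Learning rate per epoch, decaying linearly from initial_alpha."""
    return [initial_alpha * (1.0 - n / n_epochs) for n in range(n_epochs)]


def epochs_per_negative_sample_sketch(epochs_per_sample, negative_sample_rate):
    """Epochs between negative samples for each edge."""
    return [e / negative_sample_rate for e in epochs_per_sample]


print(alpha_schedule(1.0, 4))  # [1.0, 0.75, 0.5, 0.25]
print(epochs_per_negative_sample_sketch([1.0, 2.0, 4.0], 5.0))
```

Under these assumptions, a rate of 5.0 means each edge accumulates roughly five negative samples for every positive sample, and later epochs make progressively smaller moves.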

topo.spectral.umap_layouts._optimize_layout_aligned_euclidean_single_epoch(head_embeddings, tail_embeddings, heads, tails, epochs_per_sample, a, b, regularisation_weights, relations, rng_state, gamma, lambda_, dim, move_other, alpha, epochs_per_negative_sample, epoch_of_next_negative_sample, epoch_of_next_sample, n)

topo.spectral.umap_layouts.optimize_layout_aligned_euclidean(head_embeddings, tail_embeddings, heads, tails, n_epochs, epochs_per_sample, regularisation_weights, relations, rng_state, a=1.576943460405378, b=0.8950608781227859, gamma=1.0, lambda_=0.005, initial_alpha=1.0, negative_sample_rate=5.0, parallel=True, verbose=False)