topo.layouts.graph_utils
Attributes
Functions
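- make_epochs_per_sample – Given a set of weights and number of epochs, generate the number of epochs per sample for each weight.
- simplicial_set_embedding – Perform a fuzzy simplicial set embedding (UMAP/MAP), optionally saving intermediate embeddings every few epochs.
- find_ab_params – Fit a, b params for the differentiable curve used in lower dimensional fuzzy simplicial complex construction.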
Module Contents
- topo.layouts.graph_utils.INT32_MIN = 1
- topo.layouts.graph_utils.INT32_MAX
- topo.layouts.graph_utils.make_epochs_per_sample(weights, n_epochs)
Given a set of weights and a number of epochs, generate the number of epochs per sample for each weight.
- Parameters:
weights (array of shape (n_1_simplices)) – The weights of how much we wish to sample each 1-simplex.
n_epochs (int) – The total number of epochs we want to train for.
- Returns:
An array of number of epochs per sample, one for each 1-simplex.
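The scheduling rule described above can be sketched in a few lines of NumPy. This is a minimal sketch modelled on the reference UMAP behaviour, which this page does not confirm is identical in topo; the name epochs_per_sample_sketch is hypothetical. Each weight is scaled relative to the largest weight, so the strongest 1-simplex is sampled every epoch, and zero-weight 1-simplices are marked with -1 (assumed here to mean "never sampled").

```python
import numpy as np

def epochs_per_sample_sketch(weights, n_epochs):
    """Hypothetical stand-in: epochs to wait between samples of each 1-simplex."""
    weights = np.asarray(weights, dtype=np.float64)
    # Start with -1 everywhere; entries that are never sampled keep this marker.
    result = -1.0 * np.ones(weights.shape[0], dtype=np.float64)
    # Number of times each 1-simplex should be sampled over the whole run.
    n_samples = n_epochs * (weights / weights.max())
    positive = n_samples > 0
    # Fewer samples means more epochs between consecutive samples.
    result[positive] = float(n_epochs) / n_samples[positive]
    return result

schedule = epochs_per_sample_sketch([1.0, 0.5, 0.25, 0.0], n_epochs=100)
print(schedule)
```

With these inputs the heaviest edge is sampled every epoch, the edge with half its weight every second epoch, and the zero-weight edge keeps the -1 marker.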
- topo.layouts.graph_utils.simplicial_set_embedding(graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, densmap, densmap_kwds, output_dens, output_metric=dist.named_distances_with_gradients['euclidean'], output_metric_kwds={}, euclidean_output=True, parallel=True, verbose=False, save_every=None, save_limit=None, save_callback=None, include_init_snapshot=True)
Perform a fuzzy simplicial set embedding (UMAP/MAP), optionally saving intermediate embeddings every few epochs.
- Parameters:
graph (sparse matrix (CSR/COO)) – Weighted adjacency of the high-dimensional fuzzy 1-skeleton.
n_components (int) – Target embedding dimensionality.
initial_alpha (float) – Initial learning rate for the SGD.
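a (float or int) – Standard UMAP/MAP curve parameter; together with b it shapes the low-dimensional membership curve 1 / (1 + a * d^(2b)). See find_ab_params.
b (float or int) – Standard UMAP/MAP curve parameter; see a.
gamma (float or int) – Standard UMAP/MAP parameter: weight applied to negative (repulsive) samples.
negative_sample_rate (float or int) – Standard UMAP/MAP parameter: number of negative samples drawn per positive sample.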
n_epochs (int) – Total optimization epochs. If <=0, a heuristic is used.
init ({"spectral","random"} or ndarray) – Initialization strategy or explicit initial coordinates.
random_state (numpy RandomState) – RNG.
metric – Used for densMAP internals.
metric_kwds – Used for densMAP internals.
densmap (bool) – Use density-augmented objective (densMAP).
densmap_kwds (dict) – densMAP internals (expects "graph_dists" etc. if densMAP/output_dens).
output_dens (bool) – If True, also compute embedding densities in aux_data.
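output_metric (callable) – Function returning the distance between two points in the embedding space together with its gradient; defaults to the Euclidean entry of dist.named_distances_with_gradients, as in the original implementation.
output_metric_kwds (dict) – Keyword arguments passed to output_metric, as in the original implementation.
euclidean_output (bool) – Whether the output metric is Euclidean, as in the original implementation.
parallel (bool) – Whether to run the optimization in parallel, as in the original implementation.
verbose (bool) – Whether to report progress, as in the original implementation.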
save_every (int or None, optional) –
If provided and > 0, store the embedding every save_every epochs in aux_data["checkpoints"] as a list of dicts:
[{"epoch": e, "embedding": Y_e}, ...]
WARNING: storing many snapshots can be memory intensive. Consider passing save_callback to stream snapshots to disk.
save_limit (int or None, optional) – Maximum number of snapshots to keep in memory in aux_data. If exceeded, the earliest snapshots are discarded (FIFO).
save_callback (callable or None, optional) – If provided, called as save_callback(epoch: int, Y: np.ndarray) for each snapshot. Use this to persist snapshots to disk and avoid RAM growth.
include_init_snapshot (bool, default True) – If True, also store a snapshot at epoch=0 (after initialization, before SGD).
- Returns:
embedding ((n_samples, n_components) array) – Final optimized embedding.
aux_data (dict) – Auxiliary outputs. New keys:
"checkpoints": list of {"epoch": int, "embedding": np.ndarray} (only if save_every is set or include_init_snapshot is True)
Existing keys are unchanged; when densMAP/output_dens are enabled, it includes "rad_orig"/"rad_emb" radii etc.
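To illustrate the save_callback contract described above, here is a minimal sketch of a callback that streams each snapshot to a .npy file. The helper make_disk_callback, the file naming scheme, and the fake snapshots are all hypothetical; only the call shape save_callback(epoch, Y) comes from this page. The demonstration calls the callback directly rather than running the optimizer.

```python
import os
import tempfile

import numpy as np

def make_disk_callback(out_dir):
    """Hypothetical helper: build a callable usable as save_callback(epoch, Y)."""
    os.makedirs(out_dir, exist_ok=True)
    written = []

    def save_callback(epoch, Y):
        # One file per snapshot, zero-padded so files sort by epoch.
        path = os.path.join(out_dir, f"embedding_epoch_{epoch:05d}.npy")
        np.save(path, np.asarray(Y))
        written.append(path)

    save_callback.written = written  # expose the paths for later inspection
    return save_callback

# Stand-alone demonstration with fake snapshots (no optimizer involved).
out_dir = tempfile.mkdtemp()
callback = make_disk_callback(out_dir)
for epoch in (0, 50, 100):
    callback(epoch, np.full((4, 2), float(epoch)))

reloaded = np.load(callback.written[-1])
print(len(callback.written), reloaded.shape)
```

In real use the callable would presumably be passed as save_callback=callback together with save_every. This page does not state whether snapshots are also kept in aux_data when a callback is given, so setting save_limit as well may be a reasonable precaution for bounding memory.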
- topo.layouts.graph_utils.find_ab_params(spread, min_dist)
Fit a, b params for the differentiable curve used in lower dimensional fuzzy simplicial complex construction. We want the smooth curve (from a pre-defined family with simple gradient) that best matches an offset exponential decay.
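The fit described above can be sketched as a least-squares problem. This is a minimal sketch modelled on the reference UMAP approach and is an assumption about how topo implements it: the curve family is taken to be 1 / (1 + a * x^(2b)), the target is 1 up to min_dist and then decays exponentially with scale spread, and the sampling grid (300 points on [0, 3 * spread]) is likewise assumed. The names low_dim_curve and fit_ab_sketch are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def low_dim_curve(x, a, b):
    """Smooth family with a simple gradient (assumed form)."""
    return 1.0 / (1.0 + a * x ** (2 * b))

def fit_ab_sketch(spread, min_dist):
    """Hypothetical stand-in: fit a, b to an offset exponential decay."""
    xv = np.linspace(0, spread * 3, 300)
    # Target: flat at 1 below min_dist, exponential decay beyond it.
    yv = np.ones_like(xv)
    beyond = xv >= min_dist
    yv[beyond] = np.exp(-(xv[beyond] - min_dist) / spread)
    # Non-linear least squares over the two curve parameters.
    params, _ = curve_fit(low_dim_curve, xv, yv)
    return params[0], params[1]

a, b = fit_ab_sketch(spread=1.0, min_dist=0.1)
print(a, b)
```

The returned pair is what would be passed as the a and b arguments of simplicial_set_embedding.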