Quick-start cheat-sheet

Fitting a TopOGraph

Now, let’s go through a quick start!

TopOMetry functions around the TopOGraph class. It contains dictionaries, attributes and functions to analyse your data. From a data matrix data (np.ndarray, pd.DataFrame or sp.csr_matrix), you can set up a TopoGraph with default parameters:

import topo as tp
   
# Learn topological metrics and basis from data. The default is to use diffusion harmonics.
tg = tp.TopOGraph()
tg.fit(data)

After learning a topological basis, we can access topological metrics and basis in the TopOGraph object, and build different topological graphs.

# Learn a topological graph. Again, the default is to use diffusion harmonics.
tgraph = tg.transform(data) 

Then, it is possible to optimize the topological graph layout. TopOMetry has 5 different layout options: tSNE, MAP, TriMAP, PaCMAP and MDE.

# Graph layout optimization
map_emb = tg.MAP()
mde_emb = tg.MDE()
pacmap_emb = tg.PaCMAP()
trimap_emb = tg.TriMAP()
tsne_emb = tg.tSNE()

We can also plot the embeddings:

tp.plot.scatter(map_emb)

Computing several models at once

The run_layouts() attribute of the TopOGraph object runs all possible combinations of algorithms to perform DR in the TopOMetry framework.

# These settings run all models and layouts
tg.run_layouts(X, n_components=2,
                    bases=['diffusion', 'fuzzy', 'continuous'],
                    graphs=['diff', 'cknn', 'fuzzy'],
                    layouts=['tSNE', 'MAP', 'MDE', 'PaCMAP', 'TriMAP', 'NCVis'])

If no parameters are passed to the run_layouts() function, by default it will perform the following steps:

Similarity learning and building a topological orthogonal basis with:

  • Multiscale diffusion maps ('diffusion')

  • Fuzzy simplicial sets Laplacian Eigenmaps (‘fuzzy’);

Learn the topological graphs with:

  • Diffusion harmonics ('diff')

  • Fuzzy simplicial sets

Next, it will use all layout optimization methods:

  • MAP - a lighter UMAP with looser assumptions;

    • MAP and MDE use information both from the orthogonal basis and the topological graph

  • MDE - a general framework for graph layout optimization, with the pyMDE implementation.

  • t-SNE - using MulticoreTSNE

  • PaCMAP

  • TriMAP

  • NCVis

MAP, MDE, PaCMAP and NCVis use an spectral initialization from the learned topological graph. TriMAP uses PCA internally as a initialization. NCVis uses a custom initialization procedure.

So if you want to compute the diffusion basis, its diffusion and fuzzy topological graphs, and the associated MAP and PaCMAP layouts, you can simply run:

tg.run_layouts(X, n_components=2,
                    bases=['diffusion'],
                    graphs=['diff', 'fuzzy'],
                    layouts=['MAP','PaCMAP'])

This diversity of options is useful for comparisons and scoring, instead of selecting a single layout algorithm a priori.