topo.single_cell

Module Contents

Functions

preprocess(AnnData[, normalize, log, target_sum, ...])

A wrapper around Scanpy's preprocessing functions. Normalizes RNA library by size, logarithmizes it and selects highly variable genes for subsetting the AnnData object.

Attributes

_HAVE_SCANPY

topo.single_cell._HAVE_SCANPY = True
topo.single_cell.preprocess(AnnData, normalize=True, log=True, target_sum=10000.0, min_mean=0.0125, max_mean=8, min_disp=0.3, max_value=10, save_to_raw=True, plot_hvg=False, scale=True, **kwargs)

A wrapper around Scanpy’s preprocessing functions. Normalizes each cell's RNA library by size, log-transforms it, and selects highly variable genes. The AnnData object is then subset to those genes, and the full expression matrix is saved to AnnData.raw (see save_to_raw).

Parameters:
  • AnnData (AnnData.) – The target AnnData object.

  • normalize (bool (optional, default True).) – Whether to size-normalize each cell.

  • log (bool (optional, default True).) – Whether to log-transform for variance stabilization.

  • target_sum (float (optional, default 1e4).) – Constant for library size normalization.

  • min_mean (float (optional, default 0.0125).) – Minimum mean expression level for inclusion as a highly variable gene.

  • max_mean (float (optional, default 8.0).) – Maximum mean expression level for inclusion as a highly variable gene.

  • min_disp (float (optional, default 0.3).) – Minimum expression dispersion for inclusion as a highly variable gene.

  • save_to_raw (bool (optional, default True).) – Whether to save the full expression matrix to AnnData.raw.

  • plot_hvg (bool (optional, default False).) – Whether to plot the highly variable genes.

  • scale (bool (optional, default True).) – Whether to zero-center and scale the data to unit variance.

  • max_value (float (optional, default 10.0).) – Maximum value for clipping the data after scaling.

  • **kwargs (dict (optional, default {})) – Additional keyword arguments for sc.pp.highly_variable_genes().

Returns:

Updated AnnData object.
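
Example:

To make the roles of target_sum, log, scale and max_value concrete, the following is a minimal pure-Python sketch of the arithmetic those options describe. It is not the code of preprocess() and does not call Scanpy. The helper name normalize_log_scale is hypothetical, and the details are assumptions made for illustration: the natural-log log1p transform, the sample standard deviation, and clipping only values above max_value. Highly variable gene selection is omitted.

```python
import math
import statistics


def normalize_log_scale(counts, target_sum=1e4, max_value=10.0):
    """Illustrative sketch only (hypothetical helper, not part of topo).

    counts: list of cells, each a list of raw per-gene counts.
    Returns (normalized, logged, scaled) as lists of lists.
    """
    # 1. Size normalization: rescale each cell so its counts sum to target_sum.
    normalized = []
    for cell in counts:
        total = sum(cell)
        factor = target_sum / total if total > 0 else 0.0
        normalized.append([x * factor for x in cell])

    # 2. Log transform for variance stabilization (assumed: natural log of 1 + x).
    logged = [[math.log1p(x) for x in cell] for cell in normalized]

    # 3. Per-gene scaling: zero-center, divide by the standard deviation,
    #    then clip values above max_value (assumed: upper clipping only).
    n_genes = len(counts[0])
    scaled = [[0.0] * n_genes for _ in counts]
    for g in range(n_genes):
        column = [cell[g] for cell in logged]
        mean = statistics.fmean(column)
        std = statistics.stdev(column) if len(column) > 1 else 0.0
        for i, value in enumerate(column):
            z = (value - mean) / std if std > 0 else 0.0
            scaled[i][g] = min(z, max_value)
    return normalized, logged, scaled


# Three cells, two genes, with a small target_sum to keep numbers readable.
toy_counts = [[1, 3], [10, 10], [0, 5]]
normalized, logged, scaled = normalize_log_scale(toy_counts, target_sum=100)
```

After normalization every cell sums to target_sum, so cells with different sequencing depth become comparable before the log transform and scaling are applied.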