topo.tpgraph.procrustes

Module Contents

Classes

ProcrustesResult

This is from the Procrustes [package](https://github.com/theochem/procrustes).

GeneralizedProcrustes

Generalized Procrustes Analysis in a scikit-learn flavor.

Functions

_zero_padding(array_a, array_b[, pad_mode])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

_translate_array(array_a[, array_b, weight])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

_scale_array(array_a[, array_b])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

_hide_zero_padding(array_a[, remove_zero_col, ...])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

compute_error(a, b, t[, s])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

setup_input_arrays(array_a, array_b, remove_zero_col, ...)

This is from the Procrustes [package](https://github.com/theochem/procrustes).

setup_input_arrays_multi(array_list, array_ref, ...[, ...])

This is from the Procrustes [package](https://github.com/theochem/procrustes).

_setup_input_array_lower(array_a, array_ref, ...[, weight])

Pre-process the matrices with translation and scaling.

_check_arraytypes(*args)

Check array input types to Procrustes transformation routines.

orthogonal(a, b[, pad, translate, scale, unpad_col, ...])

Perform orthogonal Procrustes.

generalized(array_list[, ref, tol, n_iter, check_finite])

Generalized Procrustes Analysis.

_orthogonal(arr_a, arr_b)

Perform the orthogonal Procrustes transformation and return the transformed array.

fit_transform_procrustes(x, fit_transform_call[, ...])

Fit model and transform data for larger datasets. This is from GRAE (https://github.com/KevinMoonLab/GRAE).

procrustes(X, Y[, scaling, reflection])

This is from GRAE (https://github.com/KevinMoonLab/GRAE).

topo.tpgraph.procrustes._zero_padding(array_a, array_b, pad_mode='row-col')

This is from the Procrustes [package](https://github.com/theochem/procrustes). Return arrays padded with rows and/or columns of zeros.

Parameters:
  • array_a (ndarray) – The 2D-array \(\mathbf{A}_{n_a \times m_a}\).

  • array_b (ndarray) – The 2D-array \(\mathbf{B}_{n_b \times m_b}\).

  • pad_mode (str, optional) – Specifies how to pad the arrays. Should be one of

      • “row”

        The array with fewer rows is padded with zero rows so that both have the same number of rows.

      • “col”

        The array with fewer columns is padded with zero columns so that both have the same number of columns.

      • “row-col”

        The array with fewer rows is padded with zero rows, and the array with fewer columns is padded with zero columns, so that both have the same dimensions. This does not necessarily result in square arrays.

      • “square”

        The arrays are padded with zero rows and zero columns so that they are both square arrays. The dimension of the square array is the highest dimension, i.e. \(\text{max}(n_a, m_a, n_b, m_b)\).

Returns:

  • padded_a (ndarray) – Padded array_a.

  • padded_b (ndarray) – Padded array_b.
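The four padding modes can be illustrated with a minimal NumPy sketch. It does not call `_zero_padding`; the helper name `zero_pad_sketch` is hypothetical, and the sketch assumes zeros are appended at the bottom and on the right, as the descriptions above suggest.

```python
import numpy as np


def zero_pad_sketch(array_a, array_b, pad_mode="row-col"):
    """Illustrative zero padding of two 2D arrays (hypothetical helper)."""
    n_a, m_a = array_a.shape
    n_b, m_b = array_b.shape
    if pad_mode == "row":
        rows_a = rows_b = max(n_a, n_b)
        cols_a, cols_b = m_a, m_b
    elif pad_mode == "col":
        rows_a, rows_b = n_a, n_b
        cols_a = cols_b = max(m_a, m_b)
    elif pad_mode == "row-col":
        rows_a = rows_b = max(n_a, n_b)
        cols_a = cols_b = max(m_a, m_b)
    elif pad_mode == "square":
        rows_a = rows_b = cols_a = cols_b = max(n_a, m_a, n_b, m_b)
    else:
        raise ValueError(f"Unknown pad_mode: {pad_mode!r}")
    # Append zero rows at the bottom and zero columns on the right.
    padded_a = np.pad(array_a, ((0, rows_a - n_a), (0, cols_a - m_a)), mode="constant")
    padded_b = np.pad(array_b, ((0, rows_b - n_b), (0, cols_b - m_b)), mode="constant")
    return padded_a, padded_b


a = np.ones((3, 2))
b = np.ones((2, 4))
for mode in ("row", "col", "row-col", "square"):
    padded_a, padded_b = zero_pad_sketch(a, b, mode)
    print(mode, padded_a.shape, padded_b.shape)
```

Only “square” is guaranteed to produce square arrays; “row-col” produces two arrays of equal but possibly rectangular shape.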

topo.tpgraph.procrustes._translate_array(array_a, array_b=None, weight=None)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Return translated array_a and translation vector. Columns of both arrays will have mean zero.

Parameters:
  • array_a (ndarray) – The 2D-array to translate.

  • array_b (ndarray, optional) – The 2D-array to translate array_a based on.

  • weight (ndarray, optional) – The weight vector.

Returns:

  • array_a (ndarray) – If array_b is None, array_a is translated to the origin using its centroid. If array_b is given, array_a is translated to the centroid of array_b (the centroid of the translated array_a will coincide with the centroid of array_b).

  • centroid (float) – If array_b is given, the centroid is returned.
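A minimal sketch of the translation step, assuming unweighted centroids (the `weight` argument is ignored here). The helper name `translate_sketch` is hypothetical and the code does not call `_translate_array`.

```python
import numpy as np


def translate_sketch(array_a, array_b=None):
    """Shift array_a to the origin, or onto the centroid of array_b."""
    # Unweighted centroid: the mean of each column.
    shift = -array_a.mean(axis=0)
    if array_b is not None:
        # Move to the centroid of array_b instead of the origin.
        shift = shift + array_b.mean(axis=0)
    return array_a + shift, shift


a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
centered, shift_to_origin = translate_sketch(a)
moved, shift_to_b = translate_sketch(a, a + 10.0)
print(centered.mean(axis=0), shift_to_origin, shift_to_b)
```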

topo.tpgraph.procrustes._scale_array(array_a, array_b=None)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Return scaled/normalized array_a and scaling vector.

Parameters:
  • array_a (ndarray) – The 2D-array to scale.

  • array_b (ndarray, default=None) – The 2D-array to scale array_a based on.

Returns:

  • scaled_a (ndarray) – If array_b is None, array_a is normalized using the Frobenius norm. If array_b is given, array_a is scaled to match the norm of array_b (the norm of array_a will be equal to the norm of array_b).

  • scale (float) – The scaling factor to match array_b norm.
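The scaling rule can be sketched in a few lines of NumPy. This is an illustration under the description above, not a call into `_scale_array`; the helper name `scale_sketch` is hypothetical.

```python
import numpy as np


def scale_sketch(array_a, array_b=None):
    """Normalize array_a to unit Frobenius norm, or to the norm of array_b."""
    # np.linalg.norm on a 2D array defaults to the Frobenius norm.
    scale = 1.0 / np.linalg.norm(array_a)
    if array_b is not None:
        scale *= np.linalg.norm(array_b)
    return array_a * scale, scale


a = np.array([[3.0, 0.0], [0.0, 4.0]])  # Frobenius norm is 5
b = 2.0 * np.eye(2)
unit_a, unit_scale = scale_sketch(a)
matched_a, matched_scale = scale_sketch(a, b)
print(np.linalg.norm(unit_a), np.linalg.norm(matched_a), np.linalg.norm(b))
```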

topo.tpgraph.procrustes._hide_zero_padding(array_a, remove_zero_col=True, remove_zero_row=True, tol=1e-08)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Return array with zero-padded rows (bottom) and columns (right) removed.

Parameters:
  • array_a (ndarray) – The initial array.

  • remove_zero_col (bool, optional) – If True, zero columns (values less than 1e-8) on the right side will be removed.

  • remove_zero_row (bool, optional) – If True, zero rows (values less than 1e-8) on the bottom will be removed.

  • tol (float, optional) – Tolerance value.

Returns:

new_A (ndarray) – Array with near-zero columns and/or near-zero rows removed.
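The following sketch shows one way to strip trailing near-zero rows and columns, assuming that only rows at the bottom and columns on the right are candidates for removal. The helper name `hide_zero_padding_sketch` is hypothetical and the code does not call `_hide_zero_padding`.

```python
import numpy as np


def hide_zero_padding_sketch(array_a, remove_zero_col=True, remove_zero_row=True, tol=1e-8):
    """Drop trailing rows/columns whose entries are all within tol of zero."""
    n_rows, n_cols = array_a.shape
    if remove_zero_row:
        # Walk upwards from the last row until a non-zero row is found.
        while n_rows > 0 and np.all(np.abs(array_a[n_rows - 1]) <= tol):
            n_rows -= 1
    if remove_zero_col:
        # Walk leftwards from the last column until a non-zero column is found.
        while n_cols > 0 and np.all(np.abs(array_a[:n_rows, n_cols - 1]) <= tol):
            n_cols -= 1
    return array_a[:n_rows, :n_cols]


padded = np.pad(np.ones((2, 3)), ((0, 2), (0, 1)), mode="constant")
trimmed = hide_zero_padding_sketch(padded)
print(padded.shape, trimmed.shape)
```

Zero rows or columns in the interior of the array are kept, because only the trailing block is inspected.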

topo.tpgraph.procrustes.compute_error(a, b, t, s=None)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Return the one- or two-sided Procrustes (squared Frobenius norm) error. The double-sided Procrustes error is defined as

\[
\|\mathbf{S}\mathbf{A}\mathbf{T} - \mathbf{B}\|_{F}^2 = \text{Tr}\left[ \left(\mathbf{S}\mathbf{A}\mathbf{T} - \mathbf{B}\right)^\dagger \left(\mathbf{S}\mathbf{A}\mathbf{T} - \mathbf{B}\right) \right]
\]

When \(\mathbf{S}\) is the identity matrix \(\mathbf{I}\), this is called the one-sided Procrustes error.

Parameters:
  • a (ndarray) – The 2D-array \(\mathbf{A}_{m \times n}\) which is going to be transformed.

  • b (ndarray) – The 2D-array \(\mathbf{B}_{m \times n}\) representing the reference matrix.

  • t (ndarray) – The 2D-array \(\mathbf{T}_{n \times n}\) representing the right-hand-side transformation matrix.

  • s (ndarray, optional) – The 2D-array \(\mathbf{S}_{m \times m}\) representing the left-hand-side transformation matrix. If set to None, the one-sided Procrustes error is computed.

Returns:

error (float) – The squared Frobenius norm of the difference between the transformed array, \(\mathbf{S} \mathbf{A}\mathbf{T}\), and the reference array, \(\mathbf{B}\).
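The error definition translates directly into NumPy. The sketch below evaluates the trace form for real-valued arrays and checks it against the squared Frobenius norm; the helper name `compute_error_sketch` is hypothetical and the code does not call `compute_error`.

```python
import numpy as np


def compute_error_sketch(a, b, t, s=None):
    """Squared Frobenius norm of (S A T - B); S defaults to the identity."""
    transformed = a @ t if s is None else s @ a @ t
    diff = transformed - b
    # For real arrays the dagger reduces to a plain transpose.
    return np.trace(diff.T @ diff)


a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.zeros((3, 2))
t = np.eye(2)
s = 2.0 * np.eye(3)
one_sided = compute_error_sketch(a, b, t)
two_sided = compute_error_sketch(a, b, t, s)
print(one_sided, two_sided)
```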

topo.tpgraph.procrustes.setup_input_arrays(array_a, array_b, remove_zero_col, remove_zero_row, pad, translate, scale, check_finite, weight=None)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Check and process array inputs for the Procrustes transformation routines. Usually, this is the precursor step before all Procrustes methods.

Parameters:
  • array_a (ndarray) – The 2D array \(A\) being transformed.

  • array_b (ndarray) – The 2D reference array \(B\).

  • remove_zero_col (bool) – If True, zero columns (values less than 1e-8) on the right side will be removed.

  • remove_zero_row (bool) – If True, zero rows (values less than 1e-8) on the bottom will be removed.

  • pad (bool) – Add zero rows (at the bottom) and/or columns (to the right-hand side) of matrices \(\mathbf{A}\) and \(\mathbf{B}\) so that they have the same shape.

  • translate (bool) – If True, translate both arrays \(A, B\) to the origin, i.e., columns of the arrays will have mean zero.

  • scale (bool) – If True, both arrays are normalized to one with respect to the Frobenius norm, i.e., \(Tr(A^T A) = 1\).

  • check_finite (bool) – If True, check that both arrays \(A, B\) are numpy arrays and two-dimensional.

  • weight (list of ndarray or ndarray, optional) – A list of the weight arrays or one numpy array. When only one numpy array is provided, it is assumed that the two arrays \(A\) and \(B\) share the same weight matrix.

Returns:

(ndarray, ndarray) – The processed arrays, padded so that they have the same matrix dimensions.

topo.tpgraph.procrustes.setup_input_arrays_multi(array_list, array_ref, remove_zero_col, remove_zero_row, pad_mode, translate, scale, check_finite, weight=None)

This is from the Procrustes [package](https://github.com/theochem/procrustes). Check and process array inputs for the Procrustes transformation routines.

Parameters:
  • array_list (List) – A list of 2D arrays to be transformed.

  • array_ref (ndarray) – The 2D reference array \(B\).

  • remove_zero_col (bool) – If True, zero columns (values less than 1e-8) on the right side will be removed.

  • remove_zero_row (bool) – If True, zero rows (values less than 1e-8) on the bottom will be removed.

  • pad_mode – Specifies how to pad the arrays. Should be one of

      • “row”

        The array with fewer rows is padded with zero rows so that both have the same number of rows.

      • “col”

        The array with fewer columns is padded with zero columns so that both have the same number of columns.

      • “row-col”

        The array with fewer rows is padded with zero rows, and the array with fewer columns is padded with zero columns, so that both have the same dimensions. This does not necessarily result in square arrays.

      • “square”

        The arrays are padded with zero rows and zero columns so that they are both square arrays. The dimension of the square array is the highest dimension, i.e. \(\text{max}(n_a, m_a, n_b, m_b)\).

  • translate (bool) – If True, translate both arrays \(A, B\) to the origin, i.e., columns of the arrays will have mean zero.

  • scale (bool) – If True, both arrays are normalized to one with respect to the Frobenius norm, i.e., \(Tr(A^T A) = 1\).

  • check_finite (bool) – If True, check that both arrays \(A, B\) are numpy arrays and two-dimensional.

  • weight (list of ndarray or ndarray, optional) – A list of the weight arrays or one numpy array. When only one numpy array is provided, it is assumed that the two arrays \(A\) and \(B\) share the same weight matrix.

Returns:

List of arrays – The processed arrays, padded so that they have the same matrix dimensions.

topo.tpgraph.procrustes._setup_input_array_lower(array_a, array_ref, remove_zero_col, remove_zero_row, translate, scale, check_finite, weight=None)

Pre-process the matrices with translation and scaling.

topo.tpgraph.procrustes._check_arraytypes(*args)

Check array input types to Procrustes transformation routines.

class topo.tpgraph.procrustes.ProcrustesResult

Bases: dict

This is from the Procrustes [package](https://github.com/theochem/procrustes). Represents the Procrustes analysis result.

Variables:
  • error (float) – The Procrustes (squared Frobenius norm) error.

  • new_a (ndarray) – The translated/scaled numpy ndarray \(\mathbf{A}\).

  • new_b (ndarray) – The translated/scaled numpy ndarray \(\mathbf{B}\).

  • t (ndarray) – The 2D-array \(\mathbf{T}\) representing the right-hand-side transformation matrix.

  • s (ndarray) – The 2D-array \(\mathbf{S}\) representing the left-hand-side transformation matrix. If set to None, the one-sided Procrustes was performed.

__setattr__
__delattr__
__getattr__(name)

Deal with attributes which it doesn’t explicitly manage.

__repr__()

Return a human friendly representation.

__dir__()

Provide basic customization of module attribute access with a list.
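Because the class derives from dict and overrides the attribute hooks listed above, its entries can presumably be read either as keys or as attributes. The sketch below shows that general pattern with a hypothetical `ResultSketch` class; it is not the library's implementation.

```python
class ResultSketch(dict):
    """A dict whose keys are also reachable as attributes (illustrative)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

    # Route attribute assignment and deletion to the dict itself.
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

    def __dir__(self):
        return list(self.keys())


res = ResultSketch(error=0.0, t=[[1.0, 0.0], [0.0, 1.0]])
res.s = None  # stored as a regular dict entry
print(res.error, res["t"], sorted(res.keys()))
```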

topo.tpgraph.procrustes.orthogonal(a, b, pad=True, translate=False, scale=False, unpad_col=False, unpad_row=False, check_finite=True, weight=None, lapack_driver='gesvd')

Perform orthogonal Procrustes. This is from the Procrustes [package](https://github.com/theochem/procrustes).

Parameters:
  • a (ndarray) – The 2D-array which is going to be transformed.

  • b (ndarray) – The 2D-array representing the reference matrix.

  • pad (bool, optional) – Add zero rows (at the bottom) and/or columns (to the right-hand side) of the matrices so that they have the same shape.

  • translate (bool, optional) – If True, both arrays are centered at origin (columns of the arrays will have mean zero).

  • scale (bool, optional) – If True, both arrays are normalized with respect to the Frobenius norm, i.e., \(Tr(A^T A) = 1\) and \(Tr(B^T B) = 1\).

  • unpad_col (bool, optional) – If True, zero columns (with values less than 1.0e-8) on the right-hand side of the initial \(\mathbf{A}\) and \(\mathbf{B}\) matrices are removed.

  • unpad_row (bool, optional) – If True, zero rows (with values less than 1.0e-8) at the bottom of the initial \(\mathbf{A}\) and \(\mathbf{B}\) matrices are removed.

  • check_finite (bool, optional) – If True, convert the input to an array, checking for NaNs or Infs.

  • weight (ndarray, optional) – The 1D-array representing the weights of each row of \(\mathbf{A}\). This defines the elements of the diagonal matrix that is multiplied by the \(\mathbf{A}\) matrix.

  • lapack_driver ({'gesvd', 'gesdd'}, optional) – Whether to use the more efficient divide-and-conquer approach (‘gesdd’) or the more robust general rectangular approach (‘gesvd’) to compute the singular-value decomposition with scipy.linalg.svd.

Returns:

res (ProcrustesResult) – The Procrustes result represented as a ProcrustesResult object.
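For orientation, the standard SVD solution of the orthogonal Procrustes problem (minimize \(\|\mathbf{A}\mathbf{T} - \mathbf{B}\|_F^2\) over orthogonal \(\mathbf{T}\)) is sketched below. The sketch uses `numpy.linalg.svd` and skips padding, translation, scaling and weighting; the helper name `orthogonal_sketch` is hypothetical and the code does not call `orthogonal`.

```python
import numpy as np


def orthogonal_sketch(a, b):
    """Return the orthogonal T minimizing ||A T - B||_F and the squared error."""
    # With A^T B = U S V^T, the minimizer is T = U V^T.
    u, _, vt = np.linalg.svd(a.T @ b)
    t = u @ vt
    error = np.linalg.norm(a @ t - b) ** 2
    return t, error


rng = np.random.default_rng(0)
a = rng.normal(size=(5, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
b = a @ q
t, error = orthogonal_sketch(a, b)
print(np.allclose(t, q), error)
```

Since b is an exactly rotated copy of a, the recovered transformation matches q up to floating-point error and the error is close to zero.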

topo.tpgraph.procrustes.generalized(array_list, ref=None, tol=1e-07, n_iter=200, check_finite=True)

Generalized Procrustes Analysis. This is from the Procrustes [package](https://github.com/theochem/procrustes).

Parameters:
  • array_list (List) – The list of 2D arrays to be transformed.

  • ref (ndarray, optional) – The reference array to initialize the first iteration. If None, the first array in array_list will be used.

  • tol (float, optional) – Tolerance value to stop the iterations.

  • n_iter (int, optional) – Number of total iterations.

  • check_finite (bool, optional) – If true, convert the input to an array, checking for NaNs or Infs.

Returns:

  • array_aligned (List) – A list of transformed arrays with generalized Procrustes analysis.

  • new_distance_gpa (float) – The distance for matching all the transformed arrays with generalized Procrustes analysis.

Given a set of matrices, \(\mathbf{A}_1, \mathbf{A}_2, \cdots, \mathbf{A}_k\) with \(k > 2\), the objective is to minimize the following sum in order to superimpose pairs of matrices:

\[
\min \quad \sum_{i<j}^{k} {\left\| \mathbf{A}_i \mathbf{T}_i - \mathbf{A}_j \mathbf{T}_j \right\|}^2
\]

This function implements Equation (20) and the corresponding algorithm in Gower’s paper.
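The iteration can be pictured with a simplified sketch: rotate every array onto the current reference, replace the reference by the mean of the rotated arrays, and stop once the total distance changes by less than tol. This is a common textbook variant written under the assumption that all arrays share one shape; it is not claimed to reproduce Gower's Equation (20) or the library code, and the names `rotate_onto` and `generalized_sketch` are hypothetical.

```python
import numpy as np


def rotate_onto(arr, ref):
    """Rotate arr onto ref with the orthogonal Procrustes solution."""
    u, _, vt = np.linalg.svd(arr.T @ ref)
    return arr @ (u @ vt)


def generalized_sketch(array_list, ref=None, tol=1e-7, n_iter=200):
    """Simplified generalized Procrustes loop for equally shaped arrays."""
    arrays = [np.asarray(arr, dtype=float) for arr in array_list]
    ref = arrays[0] if ref is None else np.asarray(ref, dtype=float)
    distance = np.inf
    for _ in range(n_iter):
        aligned = [rotate_onto(arr, ref) for arr in arrays]
        new_ref = np.mean(aligned, axis=0)
        new_distance = sum(np.linalg.norm(arr - new_ref) ** 2 for arr in aligned)
        converged = abs(distance - new_distance) < tol
        ref, distance = new_ref, new_distance
        if converged:
            break
    return aligned, distance


rng = np.random.default_rng(1)
base = rng.normal(size=(6, 3))
rotations = [np.linalg.qr(rng.normal(size=(3, 3)))[0] for _ in range(2)]
aligned, distance = generalized_sketch([base] + [base @ q for q in rotations])
print(distance)
```

In this toy input every array is a rotated copy of the first one, so the aligned arrays coincide and the distance is close to zero.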

class topo.tpgraph.procrustes.GeneralizedProcrustes(ref=None, tol=1e-07, n_iter=200, check_finite=True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Generalized Procrustes Analysis in a scikit-learn flavor. Automatically tries to align the provided matrices by finding transformations that make them as similar as possible to each other. This is from the Procrustes [package](https://github.com/theochem/procrustes), available under the GPL v3.

Parameters:
  • ref (ndarray, optional) – The reference array to initialize the first iteration. If None, the first array in array_list will be used.

  • tol (float, optional) – Tolerance value to stop the iterations.

  • n_iter (int, optional) – Number of total iterations.

  • check_finite (bool, optional) – If true, convert the input to an array, checking for NaNs or Infs.

fit(array_list)

Fit the model with the given array_list.

Parameters:

array_list (list) – The list of 2D arrays to be transformed.

Returns:

self (object) – Returns the instance itself.

transform(array_list=None)

Returns a tuple of the aligned concatenated array and the error (distance) for the matching. Here only for scikit-learn consistency.

Parameters:

array_list (List) – The list of 2D arrays to be transformed.

Returns:

  • array_aligned (List) – A list of transformed arrays with generalized Procrustes analysis.

  • new_distance_gpa (float) – The distance for matching all the transformed arrays with generalized Procrustes analysis.

fit_transform(array_list)

Fit the model with the given array_list and returns a tuple of the aligned concatenated array and the error (distance) for the matching.

Returns:

  • array_aligned (List) – A list of transformed arrays with generalized Procrustes analysis.

  • new_distance_gpa (float) – The distance for matching all the transformed arrays with generalized Procrustes analysis.

topo.tpgraph.procrustes._orthogonal(arr_a, arr_b)

Perform the orthogonal Procrustes transformation and return the transformed array.

topo.tpgraph.procrustes.fit_transform_procrustes(x, fit_transform_call, procrustes_batch_size=5000, procrustes_lm=1000)

Fit model and transform data for larger datasets. This is from GRAE (https://github.com/KevinMoonLab/GRAE).

If the dataset has more than self.proc_threshold samples, the eigendecomposition or projection is computed over mini-batches. Each batch also includes the same self.procrustes_lm samples, which serve as anchors for computing a procrustes transform that roughly aligns all batches in a coherent manner.

Parameters:
  • x (np.array) – Data to be transformed

  • fit_transform_call (function) – Function to be called to fit and transform the data (scikit-learn style estimator).

  • procrustes_batch_size (int) – Number of samples in each batch of procrustes

  • procrustes_lm (int) – Number of anchor points present in all batches. Used as a reference for the procrustes transform.

Returns:

x_transformed (np.array) – Embedding of x, which is the union of all batches aligned with procrustes.
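The batching idea can be sketched as follows. The sketch assumes the first rows of x serve as the shared anchor points and aligns each batch with a rotation plus translation only; both choices, along with the names `align_to_reference`, `batched_embedding_sketch` and `pca_2d`, are illustrative assumptions rather than the behavior of `fit_transform_procrustes`.

```python
import numpy as np


def align_to_reference(source, target):
    """Rotation and translation that map source anchors onto target anchors."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    u, _, vt = np.linalg.svd((source - mu_s).T @ (target - mu_t))
    rotation = u @ vt
    return rotation, mu_t - mu_s @ rotation


def batched_embedding_sketch(x, fit_transform_call, batch_size=5000, n_landmarks=1000):
    """Embed x batch by batch, aligning every batch through shared anchors."""
    landmarks = x[:n_landmarks]  # assumption: the first rows act as anchors
    reference = fit_transform_call(landmarks)
    pieces = [reference]
    for start in range(n_landmarks, len(x), batch_size):
        batch = x[start:start + batch_size]
        embedding = fit_transform_call(np.vstack([landmarks, batch]))
        rotation, translation = align_to_reference(embedding[:n_landmarks], reference)
        # Apply the anchor-derived transform to the new points only.
        pieces.append(embedding[n_landmarks:] @ rotation + translation)
    return np.vstack(pieces)


def pca_2d(data):
    """Toy embedding: project centered data onto its top two principal axes."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
x = rng.normal(size=(60, 5))
embedded = batched_embedding_sketch(x, pca_2d, batch_size=20, n_landmarks=10)
print(embedded.shape)
```

The anchors matter because embeddings computed on different batches are only defined up to rotation, reflection and shift; the shared points give each batch a common frame.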

topo.tpgraph.procrustes.procrustes(X, Y, scaling=True, reflection='best')

This is from GRAE (https://github.com/KevinMoonLab/GRAE), taken from https://stackoverrun.com/es/q/5162566. A port of MATLAB’s procrustes function to NumPy.

Procrustes analysis determines a linear transformation (translation, reflection, orthogonal rotation and scaling) of the points in Y to best conform them to the points in matrix X, using the sum of squared errors as the goodness of fit criterion.

d, Z, [tform] = procrustes(X, Y)

Parameters:
  • X, Y – Matrices of target and input coordinates. They must have equal numbers of points (rows), but Y may have fewer dimensions (columns) than X.

  • scaling – If False, the scaling component of the transformation is forced to 1.

  • reflection – If ‘best’ (default), the transformation solution may or may not include a reflection component, depending on which fits the data best. Setting reflection to True or False forces a solution with reflection or no reflection respectively.

Returns:

  • d – The residual sum of squared errors, normalized according to a measure of the scale of X, ((X - X.mean(0))**2).sum().

  • Z – The matrix of transformed Y-values.

  • tform – A dict specifying the rotation, translation and scaling that maps X --> Y.
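A compact NumPy sketch in the spirit of the widely circulated MATLAB port is given below. It is written from the description above rather than copied from the module, so the helper name `procrustes_sketch`, the dict keys "rotation", "scale" and "translation", and the convention Z = scale * Y @ rotation + translation are assumptions.

```python
import numpy as np


def procrustes_sketch(X, Y, scaling=True, reflection="best"):
    """Fit Y to X with translation, rotation/reflection and optional scaling."""
    n, m = X.shape
    my = Y.shape[1]
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - mu_x, Y - mu_y
    ss_x, ss_y = (X0 ** 2).sum(), (Y0 ** 2).sum()
    norm_x, norm_y = np.sqrt(ss_x), np.sqrt(ss_y)
    X0, Y0 = X0 / norm_x, Y0 / norm_y
    if my < m:
        # Y may have fewer columns than X: pad with zero columns.
        Y0 = np.concatenate((Y0, np.zeros((n, m - my))), axis=1)
    U, s, Vt = np.linalg.svd(X0.T @ Y0, full_matrices=False)
    V = Vt.T
    T = V @ U.T
    if reflection != "best":
        # Flip the last singular direction if the solution has the wrong handedness.
        if bool(reflection) != (np.linalg.det(T) < 0):
            V[:, -1] *= -1
            s[-1] *= -1
            T = V @ U.T
    trace_ta = s.sum()
    if scaling:
        b = trace_ta * norm_x / norm_y
        d = 1 - trace_ta ** 2
        Z = norm_x * trace_ta * (Y0 @ T) + mu_x
    else:
        b = 1.0
        d = 1 + ss_y / ss_x - 2 * trace_ta * norm_y / norm_x
        Z = norm_y * (Y0 @ T) + mu_x
    if my < m:
        T = T[:my, :]
    c = mu_x - b * (mu_y @ T)
    return d, Z, {"rotation": T, "scale": b, "translation": c}


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
Y = 2.0 * (X @ R) + np.array([3.0, -1.0])
d, Z, tform = procrustes_sketch(X, Y)
print(d, np.allclose(Z, X))
```

Here Y is a rotated, scaled and shifted copy of X, so the fitted Z reproduces X and d is close to zero.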