Module dstats.pca

This module contains a basic implementation of principal component analysis (PCA) based on the NIPALS algorithm. NIPALS is fast when only the first few components are needed, which is the usual case because PCA is mainly used for visualization and dimensionality reduction. However, convergence slows drastically once the first few components have been removed and most of what remains in the matrix is noise.
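
As a rough illustration of the iteration NIPALS performs, here is a minimal pure-Python sketch for a single component. It is not the dstats implementation: the name `first_component` and the parameters `max_iter` and `tol` are assumptions made for this example, and the data is assumed to be a non-empty list of equal-length rows.

```python
import math


def first_component(rows, max_iter=1000, tol=1e-12):
    """Return (loading, scores) for the first principal component of rows.

    Illustrative sketch only; names and defaults are assumptions.
    """
    n, m = len(rows), len(rows[0])

    # Center each column, since PCA is defined on mean-centered data.
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]

    # Start the scores from the column with the largest sum of squares.
    start = max(range(m), key=lambda j: sum(r[j] ** 2 for r in x))
    t = [r[start] for r in x]
    p = [0.0] * m

    for _ in range(max_iter):
        # Loading step: p = X^T t, scaled to unit length.
        p = [sum(x[i][j] * t[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        if norm == 0.0:
            break  # constant data: no direction to find
        p = [v / norm for v in p]

        # Score step: t = X p.
        t_new = [sum(x[i][j] * p[j] for j in range(m)) for i in range(n)]

        # Stop once the scores no longer change appreciably.
        diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if diff <= tol * sum(a * a for a in t):
            break

    return p, t


# Points on the line y = 2x: the loading is (1, 2) / sqrt(5), up to sign.
loading, scores = first_component([[1, 2], [2, 4], [3, 6], [4, 8]])
```

Each pass costs only two matrix-vector products, which is why extracting a handful of components this way is cheap. When the remaining matrix is mostly noise, the leading directions have nearly equal variance and the alternation takes many more passes to settle.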

References

en.wikipedia.org/wiki/Principal_component_analysis#Computing_principal_components_iteratively

Author

David Simcha

Functions

| Name | Description |
| --- | --- |
| doubleTempdup(range, alloc) | |
| firstComponent(data, opts, buf) | Uses expectation-maximization to compute the first principal component of data. Because there are many options, they are controlled by a PrinCompOptions struct (listed under Structs; PrinCompOptions.init contains the default values). To have the results returned in pre-allocated space, pass an explicit value for buf. |
| firstNComponents(data, n, opts, buf) | Computes the first n principal components of the matrix. More efficient than calling firstComponent and removeComponent repeatedly because copying and transposing, if enabled, happen only once. |
| removeComponent(data, rotation, transposed) | Removes the principal component specified by the given rotation vector from data. data must have assignable elements. transposed controls whether rotation is treated as a loading for the transposed matrix or for the matrix as-is. |
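
To make the idea of removing a component concrete, here is a hedged pure-Python sketch of the usual deflation step. It is an interpretation, not the dstats code: the function name `remove_component`, the boolean `transposed` flag, and the requirement that `rotation` has unit length are all assumptions of this example. With `transposed` false, the rotation has one entry per column and the update is X ← X − (Xp)pᵀ; with it true, the rotation has one entry per row and the update is X ← X − p(pᵀX).

```python
import math


def remove_component(data, rotation, transposed=False):
    """Subtract the rank-one part of data along rotation, in place.

    Illustrative sketch only; assumes rotation has unit length.
    """
    n, m = len(data), len(data[0])
    if not transposed:
        # rotation has one entry per column: X <- X - (X p) p^T
        for row in data:
            score = sum(row[j] * rotation[j] for j in range(m))
            for j in range(m):
                row[j] -= score * rotation[j]
    else:
        # rotation has one entry per row: X <- X - p (p^T X)
        for j in range(m):
            score = sum(data[i][j] * rotation[i] for i in range(n))
            for i in range(n):
                data[i][j] -= score * rotation[i]


# Centered points on the line y = 2x lie entirely along (1, 2) / sqrt(5),
# so removing that direction leaves (numerically) nothing behind.
p = [1 / math.sqrt(5), 2 / math.sqrt(5)]
x = [[-1.5, -3.0], [-0.5, -1.0], [0.5, 1.0], [1.5, 3.0]]
remove_component(x, p)
```

Because the update writes back into `data`, the matrix elements must be assignable, which matches the requirement stated for removeComponent in the table.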

Structs

| Name | Description |
| --- | --- |
| PrincipalComponent | Result holder |
| PrinCompOptions | Sets options for principal component analysis. The default options are also the values in PrinCompOptions.init. |

Enums

| Name | Description |
| --- | --- |
| Transposed | Used for removeComponent(). |