Module dstats.cor

Pearson, Spearman and Kendall correlations, and covariance.

Author

David Simcha

Functions

Name | Description
covariance(input1, input2) |
covarianceMatrix(mat, ans, pool) | These overloads compute correlation and covariance matrices and store the results in a pre-allocated variable, ans. ans must be either a SciD matrix or a random-access range of ranges with assignable elements of a floating-point type. It must have as many rows as there are vectors in mat, and each row must have at least enough columns to store the lower triangle. If ans is a full rectangular matrix or range of ranges, only the lower triangle is written.
kendallCor(input1, input2) | Kendall's Tau-b, O(N log N) version. This is a non-parametric measure of monotonic association. It can be defined in terms of the bubble sort distance, that is, the number of swaps a bubble sort would need to sort input2 into the same order as input1.
kendallCorDestructive(input1, input2) | Kendall's Tau-b in O(N log N) time. Overwrites the input arrays with undefined data, but uses only O(log N) stack space for sorting rather than O(N) space to duplicate the input. R1 and R2 must be either SortedRange structs with the default predicate or arrays.
kendallCorDestructiveLowLevel(input1, input2, needTies) |
kendallMatrix(mat, ans, pool) | These overloads compute correlation and covariance matrices and store the results in a pre-allocated variable, ans. ans must be either a SciD matrix or a random-access range of ranges with assignable elements of a floating-point type. It must have as many rows as there are vectors in mat, and each row must have at least enough columns to store the lower triangle. If ans is a full rectangular matrix or range of ranges, only the lower triangle is written.
partial(vec1, vec2, conditionsIn) | Computes the partial correlation between vec1 and vec2 given the conditions in conditionsIn. conditionsIn can be a tuple of ranges, a range of ranges, or (for a single condition) a single range.
pearsonCor(input1, input2) | Convenience function for calculating the Pearson correlation. When the term "correlation" is used unqualified, it usually refers to this quantity. This is a parametric correlation metric and should not be used with extremely ill-behaved data. This function works with any pair of input ranges.
pearsonMatrix(mat, ans, pool) | These overloads compute correlation and covariance matrices and store the results in a pre-allocated variable, ans. ans must be either a SciD matrix or a random-access range of ranges with assignable elements of a floating-point type. It must have as many rows as there are vectors in mat, and each row must have at least enough columns to store the lower triangle. If ans is a full rectangular matrix or range of ranges, only the lower triangle is written.
spearmanCor(input1, input2) | Spearman's rank correlation. Non-parametric. This is essentially the Pearson correlation of the ranks of the data, with ties handled by averaging the tied ranks.
spearmanMatrix(mat, ans, pool) | These overloads compute correlation and covariance matrices and store the results in a pre-allocated variable, ans. ans must be either a SciD matrix or a random-access range of ranges with assignable elements of a floating-point type. It must have as many rows as there are vectors in mat, and each row must have at least enough columns to store the lower triangle. If ans is a full rectangular matrix or range of ranges, only the lower triangle is written.
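
The three coefficients described above can be illustrated without the library. The following is a minimal sketch using only Phobos, not the dstats implementation. The names pearsonSketch, averageRanks, spearmanSketch and kendallTauBSketch are hypothetical, and the Kendall function is the naive O(N^2) pair count rather than the O(N log N) algorithm that kendallCor is documented to use.

```d
import std.algorithm.sorting : sort;
import std.array : array;
import std.math : abs, sqrt;
import std.range : iota;

/// Pearson correlation via the textbook two-pass formula (illustrative only).
double pearsonSketch(const double[] x, const double[] y)
{
    assert(x.length == y.length && x.length > 1);
    double meanX = 0, meanY = 0;
    foreach (i; 0 .. x.length)
    {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= x.length;
    meanY /= y.length;

    double sxy = 0, sxx = 0, syy = 0;
    foreach (i; 0 .. x.length)
    {
        immutable dx = x[i] - meanX;
        immutable dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    return sxy / sqrt(sxx * syy);
}

/// 1-based ranks of v; tied values share the average of their ranks.
double[] averageRanks(const double[] v)
{
    auto order = iota(v.length).array;          // indices 0 .. N
    order.sort!((a, b) => v[a] < v[b])();       // indices ordered by value
    auto ranks = new double[v.length];
    size_t i = 0;
    while (i < order.length)
    {
        size_t j = i;
        while (j + 1 < order.length && v[order[j + 1]] == v[order[i]])
            ++j;
        // Sorted positions i .. j hold equal values: average ranks i+1 .. j+1.
        immutable avg = (i + j) / 2.0 + 1.0;
        foreach (k; i .. j + 1)
            ranks[order[k]] = avg;
        i = j + 1;
    }
    return ranks;
}

/// Spearman correlation: Pearson correlation of the averaged ranks.
double spearmanSketch(const double[] x, const double[] y)
{
    return pearsonSketch(averageRanks(x), averageRanks(y));
}

/// Kendall's Tau-b by comparing every pair, O(N^2) (illustrative only).
double kendallTauBSketch(const double[] x, const double[] y)
{
    assert(x.length == y.length);
    static int sign(double d) { return d > 0 ? 1 : (d < 0 ? -1 : 0); }

    long score = 0;                 // concordant minus discordant pairs
    long untiedX = 0, untiedY = 0;  // pairs not tied in x, not tied in y
    foreach (i; 0 .. x.length)
    {
        foreach (j; i + 1 .. x.length)
        {
            immutable sx = sign(x[i] - x[j]);
            immutable sy = sign(y[i] - y[j]);
            score += sx * sy;
            if (sx != 0) ++untiedX;
            if (sy != 0) ++untiedY;
        }
    }
    return score / sqrt(cast(double) untiedX * untiedY);
}

unittest
{
    // A perfectly linear relationship gives a Pearson correlation of 1.
    assert(abs(pearsonSketch([1.0, 2, 3, 4], [2.0, 4, 6, 8]) - 1.0) < 1e-12);
    // Monotonic but non-linear data still gives a Spearman correlation of 1.
    assert(abs(spearmanSketch([1.0, 2, 3, 4], [1.0, 8, 27, 64]) - 1.0) < 1e-12);
    // The two tied values share the average of ranks 2 and 3.
    assert(averageRanks([10.0, 20, 20, 30]) == [1.0, 2.5, 2.5, 4.0]);
    // One swap sorts [1, 3, 2], so with 3 pairs tau-b = 1 - 2 * 1 / 3.
    assert(abs(kendallTauBSketch([1.0, 2, 3], [1.0, 3, 2]) - 1.0 / 3) < 1e-12);
}
```

The unittest block doubles as a usage example and can be run with a compiler's unittest mode (for instance `dmd -unittest -main`). The last assertion ties back to the bubble sort distance: with no ties, Tau-b equals 1 minus twice the number of swaps divided by the number of pairs.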

Structs

Name | Description
KendallLowLevel |
PearsonCor | Computes mean, standard deviation, variance, covariance and Pearson correlation online. The getters for stdev, var, cov and cor each cost floating-point division operations. The getters for the means cost a single branch to check for N == 0. This struct uses O(1) space.
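
To show how an online, O(1)-space accumulator of this kind can work, here is a minimal sketch based on Welford-style running updates. It is not the PearsonCor struct itself: the type OnlinePearsonSketch, its field names and its getter names are hypothetical, and dividing by N - 1 (sample covariance) is an assumption rather than something the documentation above states.

```d
import std.math : abs, sqrt;

/// Hypothetical O(1)-space accumulator; not the dstats PearsonCor struct.
struct OnlinePearsonSketch
{
    // D default-initializes floating-point fields to NaN, so start from 0.
    private double n = 0, meanX = 0, meanY = 0;
    private double m2X = 0, m2Y = 0, coMoment = 0;

    /// Folds one (x, y) observation into the running statistics.
    void put(double x, double y)
    {
        n += 1;
        immutable dx = x - meanX;       // deviation from the old mean of x
        immutable dy = y - meanY;       // deviation from the old mean of y
        meanX += dx / n;
        meanY += dy / n;
        m2X += dx * (x - meanX);        // old deviation times new deviation
        m2Y += dy * (y - meanY);
        coMoment += dx * (y - meanY);
    }

    /// Mean getters pay one branch to guard against an empty accumulator.
    double mean1() const { return n == 0 ? double.nan : meanX; }
    double mean2() const { return n == 0 ? double.nan : meanY; }

    /// The remaining getters each pay for a floating-point division.
    double var1() const { return m2X / (n - 1); }
    double var2() const { return m2Y / (n - 1); }
    double cov() const { return coMoment / (n - 1); }  // assumes N - 1 divisor
    double cor() const { return coMoment / sqrt(m2X * m2Y); }
}

unittest
{
    OnlinePearsonSketch acc;
    foreach (i; 1 .. 5)
        acc.put(i, 2.0 * i);            // points (1,2), (2,4), (3,6), (4,8)
    assert(acc.mean1() == 2.5);
    assert(abs(acc.cov() - 10.0 / 3) < 1e-12);
    assert(abs(acc.cor() - 1.0) < 1e-12);
}
```

Only six running values are stored regardless of how many observations are added, which is what makes the space cost constant.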

Aliases

Name | Type | Description
kcor | |
kcorDestructive | |
Pcor | PearsonCor |
pcor | |
scor | |