Function kendallCor

Computes Kendall's Tau-b in O(N log N) time. Tau-b is a non-parametric measure of monotonic association. It can be defined in terms of the bubble sort distance: the number of adjacent swaps a bubble sort would need to put input2 into the same order as input1.
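
For reference, the usual textbook form of Tau-b is written out below. The symbols are introduced here for illustration and do not appear in the library: n_c and n_d are the numbers of concordant and discordant pairs, N is the common length of the inputs, and t_i and u_j are the sizes of the groups of tied values in input1 and input2 respectively.

```latex
\tau_b = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}},
\qquad n_0 = \frac{N(N-1)}{2},
\quad n_1 = \sum_i \frac{t_i (t_i - 1)}{2},
\quad n_2 = \sum_j \frac{u_j (u_j - 1)}{2}
```

When neither input contains ties, n_1 = n_2 = 0 and n_c + n_d = n_0, so n_d equals the bubble sort distance and the expression reduces to 1 - 2 n_d / n_0.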

double kendallCor(T, U) (
  T input1,
  U input2
)
if (isInputRange!T && isInputRange!U);

Because the inputs must be sorted, the function copies them in any case, so it works with any input range. The two ranges must have the same length.
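
To make the quantity being computed concrete, here is a minimal sketch of a direct O(N^2) pair-counting version. It is not the library's algorithm, and the name naiveKendallTauB is hypothetical. It only illustrates the definition and accepts the same kinds of arguments (any two input ranges of equal length).

```d
import std.array : array;
import std.math : sqrt;
import std.range : isInputRange;

// Hypothetical helper, not part of the library: Tau-b by brute-force
// comparison of every pair of positions, O(N^2).
double naiveKendallTauB(T, U)(T input1, U input2)
if (isInputRange!T && isInputRange!U)
{
    // Copy into arrays so that any input range can be indexed.
    auto x = input1.array;
    auto y = input2.array;
    assert(x.length == y.length, "ranges must have the same length");

    long concordant, discordant, tiedX, tiedY;
    foreach (i; 0 .. x.length)
    {
        foreach (j; i + 1 .. x.length)
        {
            // Sign of the difference in each input for this pair.
            immutable int dx = x[i] < x[j] ? -1 : (x[i] > x[j] ? 1 : 0);
            immutable int dy = y[i] < y[j] ? -1 : (y[i] > y[j] ? 1 : 0);

            if (dx == 0) ++tiedX;   // pair tied in input1
            if (dy == 0) ++tiedY;   // pair tied in input2

            if (dx * dy > 0) ++concordant;
            else if (dx * dy < 0) ++discordant;
        }
    }

    // Total number of pairs, N(N-1)/2.
    immutable double n0 = x.length * (x.length - 1) / 2.0;

    // The result is NaN when either input is constant (zero denominator).
    return (concordant - discordant)
        / sqrt((n0 - tiedX) * (n0 - tiedY));
}
```

A helper like this is only practical for small inputs, but it can serve as an independent cross-check when testing code that calls kendallCor.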

Note

As an optimization, when a range is a SortedRange with predicate "a < b", it is assumed to be already sorted and is not sorted again by this function. This is useful when calling the function repeatedly with one argument held fixed:

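import std.algorithm : makeIndex;
import std.range : assumeSorted, indexed;
import dstats.cor : kendallCor;            // assumed module path
import dstats.random : randArray, rNormal; // assumed module path

auto lhs = randArray!rNormal(1_000, 0, 1);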
auto indices = new size_t[1_000];
makeIndex(lhs, indices);

foreach(i; 0..1_000) {
    auto rhs = randArray!rNormal(1_000, 0, 1);
    auto lhsSorted = assumeSorted(
        indexed(lhs, indices)
    );

    // Rearrange rhs according to the sorting permutation of lhs.
    // kendallCor(lhsSorted, rhsRearranged) will be much faster than
    // kendallCor(lhs, rhs).
    auto rhsRearranged = indexed(rhs, indices);
    assert(kendallCor(lhsSorted, rhsRearranged) == kendallCor(lhs, rhs));
}

References

William R. Knight. A Computer Method for Calculating Kendall's Tau with Ungrouped Data. Journal of the American Statistical Association, Vol. 61, No. 314, Part 1 (Jun. 1966), pp. 436-439.

M. G. Kendall. The Variance of Tau When Both Rankings Contain Ties. Biometrika, Vol. 34, No. 3/4 (Dec. 1947), pp. 297-298.