Name | Description |
binomialTest(k, n, p)
|
Two-sided binomial test for whether P(success) == p. The one-sided
alternatives are covered by dstats.distrib.binomialCDF and binomialCDFR.
k is the number of successes observed, n is the number of trials, p
is the probability of success under the null.
|
chiSquareContingency(inputData)
|
Performs a Pearson's chi-square test on a contingency table of arbitrary
dimensions. When the chi-square test is mentioned, this is usually the one
being referred to. Takes a set of finite forward ranges, one for each column
in the contingency table. These can be expressed either as a tuple of ranges
or a range of ranges. Returns a P-value for the alternative hypothesis that
frequencies in each row of the contingency table depend on the column against
the null that they don't.
|
chiSquareFit(observed, expected, countProp)
|
Performs a one-way Pearson's chi-square goodness of fit test between a range
of observed and a range of expected values. This test is useful for checking
whether a set of observations fits a discrete distribution.
|
chiSquareObs(x, y)
|
Given two vectors of observations of jointly distributed variables x, y, tests
the null hypothesis that values in x are independent of the corresponding
values in y. This is done using Pearson's Chi-Square Test. For a similar test
that assumes the data has already been tabulated into a contingency table, see
chiSquareContingency.
|
correlatedAnova(dataIn)
|
Performs a correlated sample (within-subjects) ANOVA. This is a
generalization of the paired T-test to 3 or more treatments. This
function accepts data as either a tuple of ranges (1 for each treatment,
such that a given index represents the same subject in each range) or
similarly as a range of ranges.
|
dAgostinoK(range)
|
A test for normality of the distribution of a range of values. Based on
the assumption that normally distributed values will have a sample skewness
and sample excess kurtosis very close to zero.
|
falseDiscoveryRate(pVals, dep)
|
Computes the false discovery rate statistic given a list of
p-values, according to Benjamini and Hochberg (1995) (independent) or
Benjamini and Yekutieli (2001) (dependent). The Dependency parameter
controls whether hypotheses are assumed to be independent, or whether
the more conservative assumption that they are correlated must be made.
|
fisherExact(contingencyTable, alt)
|
Fisher's Exact test for difference in odds between rows/columns
in a 2x2 contingency table. Specifically, this function tests the odds
ratio, which is defined, for a contingency table c, as (c[0][0] * c[1][1])
/ (c[1][0] * c[0][1]). Alternatives are Alt.less, meaning true odds ratio
< 1, Alt.greater, meaning true odds ratio > 1, and Alt.twoSided, meaning
true odds ratio != 1.
|
fisherExact(contingencyTable, alt)
|
Convenience function. Converts a dynamic array to a static one, then
calls the overload.
|
fishersMethod(pVals)
|
Fisher's method of meta-analyzing a set of P-values to determine whether
there are more significant results than would be expected by chance.
Based on a chi-square statistic for the sum of the logs of the P-values.
|
friedmanTest(dataIn)
|
The Friedman test is a non-parametric within-subject ANOVA. It's useful
when parametric assumptions cannot be made. Usage is identical to
correlatedAnova().
|
fTest(data)
|
The F-test is a one-way ANOVA extension of the T-test to >2 groups.
It's useful when you have 3 or more groups with equal variance and want
to test whether their means are equal. Data can be input as either a
tuple or a range. This may contain any combination of ranges of numeric
types, MeanSD structs and Summary structs.
|
gTestContingency(inputData)
|
The G or likelihood ratio chi-square test for contingency tables. Roughly
the same as Pearson's chi-square test (chiSquareContingency), but may be more
accurate in certain situations and less accurate in others.
|
gTestFit(observed, expected, countProp)
|
The G or likelihood ratio chi-square test for goodness of fit. Roughly
the same as Pearson's chi-square test (chiSquareFit), but may be more
accurate in certain situations and less accurate in others. However, it is
still based on asymptotic distributions, and is not exact. Usage is
identical to chiSquareFit.
|
gTestObs(x, y)
|
Given two ranges of observations of jointly distributed variables x, y, tests
the null hypothesis that values in x are independent of the corresponding
values in y. This is done using the Likelihood Ratio G test. Usage is similar
to chiSquareObs. For an otherwise identical test that assumes the data has
already been tabulated into a contingency table, see gTestContingency.
|
hochberg(pVals)
|
Uses the Hochberg procedure to control the familywise error rate assuming
that hypothesis tests are independent. This is more powerful than
Holm-Bonferroni correction, but requires the independence assumption.
|
holmBonferroni(pVals)
|
Uses the Holm-Bonferroni method to adjust a set of P-values in a way that
controls the familywise error rate (the probability of making at least one
Type I error). This is a less conservative version of Bonferroni correction
that remains valid under arbitrary dependence between the tests. Therefore,
there are few good reasons to use regular Bonferroni correction instead.
|
kendallCorTest(range1, range2, alt, exactThresh)
|
Tests the hypothesis that the Kendall Tau-b between two ranges is
different from 0. Alternatives are Alt.less (kendallCor(range1, range2) < 0),
Alt.greater (kendallCor(range1, range2) > 0) and Alt.twoSided
(kendallCor(range1, range2) != 0).
|
kruskalWallis(dataIn)
|
The Kruskal-Wallis rank sum test. Tests the null hypothesis that data in
each group is not stochastically ordered with respect to data in each of the
other groups. This is a one-way non-parametric ANOVA and can be thought of
as either a generalization of the Wilcoxon rank sum test to >2 groups or
a non-parametric equivalent to the F-test. Data can be input as either a
tuple of ranges (one range for each group) or a range of ranges
(one element for each group).
|
ksTest(F, Fprime)
|
Performs a Kolmogorov-Smirnov (K-S) 2-sample test. The K-S test is a
non-parametric test for a difference between two empirical distributions or
between an empirical distribution and a reference distribution.
|
ksTest(Femp, F)
|
One-sample Kolmogorov-Smirnov test against a reference distribution.
Takes a callable object for the CDF of the reference distribution.
|
ksTestDestructive(F, Fprime)
|
Same as ksTest, except sorts in place, avoiding memory allocations.
|
ksTestDestructive(Femp, F)
|
Ditto.
|
levenesTest(data)
|
Tests the null hypothesis that the variances of all groups are equal against
the alternative that heteroscedasticity exists. data must be either a
tuple of ranges or a range of ranges. central is an alias for the measure
of central tendency to be used. This can be any function that maps a
forward range of numeric types to a numeric type. The commonly used ones
are median (default) and mean (less robust). Trimmed mean is sometimes
useful, but is currently not implemented in dstats.summary.
|
multinomialTest(countsIn, proportions)
|
The exact multinomial goodness of fit test for whether a set of counts
fits a hypothetical distribution. countsIn is an input range of counts.
proportions is an input range of expected proportions. These are normalized
automatically, so they can sum to any value.
|
pairedTTest(before, after, testMean, alt, confLevel)
|
Paired T test. Tests the hypothesis that the mean difference between
corresponding elements of before and after is testMean. Alternatives are
Alt.less, meaning that the true mean difference (before[i] - after[i])
is less than testMean, Alt.greater, meaning the true mean difference is
greater than testMean, and Alt.twoSided, meaning the true mean difference is not
equal to testMean.
|
pairedTTest(diffSummary, testMean, alt, confLevel)
|
Compute a paired T test directly from summary statistics of the differences
between corresponding samples.
|
pearsonCorTest(range1, range2, alt, confLevel)
|
Tests the hypothesis that the Pearson correlation between two ranges is
different from 0. Alternatives are Alt.less
(pearsonCor(range1, range2) < 0), Alt.greater (pearsonCor(range1, range2)
> 0) and Alt.twoSided (pearsonCor(range1, range2) != 0).
|
pearsonCorTest(cor, N, alt, confLevel)
|
Same as the overload, but uses a pre-computed correlation coefficient and
sample size instead of computing them.
|
runsTest(obs, alt)
|
Wald-Wolfowitz or runs test for randomness of the distribution of
elements for which positive() evaluates to true. For example, given
a sequence of coin flips [H,H,H,H,H,T,T,T,T,T] and a positive() function of
"a == 'H'", this test would determine that the heads are non-randomly
distributed, since they are all at the beginning of obs. This is done
by counting the number of runs of consecutive elements for which
positive() evaluates to true, and the number of runs of consecutive elements
for which it evaluates to false. In the example above, we have 2 runs. These
are the block of 5 consecutive heads at the beginning and the 5 consecutive
tails at the end.
|
signTest(before, after, alt)
|
Sign test for differences between paired values. This is a very robust
but very low power test. Alternatives are Alt.less, meaning elements
of before are typically less than corresponding elements of after,
Alt.greater, meaning elements of before are typically greater than
elements of after, and Alt.twoSided, meaning that there is a significant
difference in either direction.
|
signTest(data, mu, alt)
|
Similar to the overload, but allows testing for a difference between a
range and a fixed value mu.
|
spearmanCorTest(range1, range2, alt)
|
Tests the hypothesis that the Spearman correlation between two ranges is
different from 0. Alternatives are
Alt.less (spearmanCor(range1, range2) < 0), Alt.greater (spearmanCor(range1, range2)
> 0) and Alt.twoSided (spearmanCor(range1, range2) != 0).
|
studentsTTest(data, testMean, alt, confLevel)
|
One-sample Student's T-test for difference between mean of data and
a fixed value. Alternatives are Alt.less, meaning mean(data) < testMean,
Alt.greater, meaning mean(data) > testMean, and Alt.twoSided, meaning
mean(data) != testMean.
|
studentsTTest(sample1, sample2, testMean, alt, confLevel)
|
Two-sample T test for a difference in means. Assumes that the variances of
the samples are equal. Alternatives are Alt.less, meaning
mean(sample1) - mean(sample2) < testMean, Alt.greater, meaning
mean(sample1) - mean(sample2) > testMean, and Alt.twoSided, meaning
mean(sample1) - mean(sample2) != testMean.
|
welchAnova(data)
|
Same as fTest, except that this test does not require the assumption of
equal variances. In exchange it's slightly less powerful.
|
welchTTest(sample1, sample2, testMean, alt, confLevel)
|
Two-sample T-test for difference in means. Does not assume variances are equal.
Alternatives are Alt.less, meaning mean(sample1) - mean(sample2) < testMean,
Alt.greater, meaning mean(sample1) - mean(sample2) > testMean, and
Alt.twoSided, meaning mean(sample1) - mean(sample2) != testMean.
|
wilcoxonRankSum(sample1, sample2, alt, exactThresh)
|
Computes Wilcoxon rank sum test statistic and P-value for
a set of observations against another set, using the given alternative.
Alt.less means that sample1 is stochastically less than sample2.
Alt.greater means sample1 is stochastically greater than sample2.
Alt.twoSided means sample1 is stochastically less than or greater than
sample2.
|
wilcoxonSignedRank(before, after, alt, exactThresh)
|
Computes a test statistic and P-value for a Wilcoxon signed rank test against
the given alternative. Alt.less means that elements of before are stochastically
less than corresponding elements of after. Alt.greater means elements of
before are stochastically greater than corresponding elements of after.
Alt.twoSided means there is a significant difference in either direction.
|
wilcoxonSignedRank(data, mu, alt, exactThresh)
|
Same as the overload, but allows testing whether a range is stochastically
less than or greater than a fixed value mu rather than paired elements of
a second range.
|
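
The run counting described in the runsTest entry can be illustrated with a short sketch. This is not the library's code: it is a standalone Python illustration, and the names count_runs and flips are hypothetical. It covers only the counting step, not the P-value computation.

```python
from itertools import groupby


def count_runs(obs, positive):
    """Count maximal runs of consecutive elements that share the same
    positive() result (hypothetical helper, not part of the library)."""
    # groupby starts a new group each time the key value changes,
    # so the number of groups equals the number of runs.
    return sum(1 for _key, _group in groupby(obs, key=positive))


# The coin-flip sequence from the runsTest description.
flips = list("HHHHHTTTTT")
print(count_runs(flips, lambda a: a == "H"))
```

For the sequence in the description this counts one run of heads followed by one run of tails. A sequence that alternates on every flip would give as many runs as it has elements.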
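
As a rough illustration of the adjustment named in the holmBonferroni entry, the following Python sketch computes Holm-adjusted P-values in the textbook step-down form. It is an assumption that the library returns adjusted P-values in this form; the function name holm_bonferroni and the sample values are hypothetical.

```python
def holm_bonferroni(p_vals):
    """Return Holm-adjusted P-values in the original order
    (textbook step-down procedure, illustrative only)."""
    m = len(p_vals)
    # Indices of the P-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_vals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P-value is multiplied by m, the next by m - 1, etc.
        scaled = min(1.0, (m - rank) * p_vals[idx])
        # Enforce monotonicity: an adjusted value never drops below
        # the adjusted value of a smaller P-value.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


print(holm_bonferroni([0.01, 0.04, 0.03]))
```

Plain Bonferroni correction would multiply every P-value by m. Here only the smallest is multiplied by m, which is why the method is less conservative while still controlling the familywise error rate.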
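
The binomialTest entry does not say how the two-sided P-value is defined. One common convention sums the probabilities of all outcomes that are no more likely than the observed count under the null. The Python sketch below uses that convention as an assumption; the library may define the two-sided test differently, and the names binomial_pmf and binomial_test_two_sided are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_test_two_sided(k, n, p):
    """Two-sided P-value: total probability of outcomes that are no more
    likely than the observed k (assumed convention, illustrative only)."""
    observed = binomial_pmf(k, n, p)
    # Small relative tolerance so that ties survive floating-point rounding.
    threshold = observed * (1.0 + 1e-7)
    total = 0.0
    for i in range(n + 1):
        prob = binomial_pmf(i, n, p)
        if prob <= threshold:
            total += prob
    return min(1.0, total)


# k successes observed, n trials, p is the success probability under the null.
print(binomial_test_two_sided(0, 10, 0.5))
```

With 0 successes in 10 trials and p = 0.5, only the outcomes 0 and 10 are as unlikely as the observation, so the result is 2 / 1024.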