Module dstats.infotheory

Basic information theory: joint entropy, mutual information, and conditional mutual information. This module uses the base-2 definitions of these quantities, i.e., entropy, mutual information, etc. are reported in bits.
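To make the "in bits" convention concrete, here is a minimal sketch of base-2 Shannon entropy. It is written in Python purely for illustration and is not the module's D implementation; the name entropy_bits is hypothetical.

```python
import math
from collections import Counter

def entropy_bits(observations):
    """Shannon entropy, in bits, of a sequence of hashable observations.

    Illustrative only: this is the textbook definition
    H = -sum(p * log2(p)), not the dstats implementation.
    """
    counts = Counter(observations)   # frequency of each distinct value
    n = sum(counts.values())         # total number of observations
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A fair coin carries one bit; four equally likely symbols carry two.
coin = entropy_bits([0, 1])
four = entropy_bits(["a", "b", "c", "d"])
```

Using log2 rather than the natural logarithm is what makes the unit bits instead of nats.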


David Simcha


condEntropy(data, cond): Calculates the conditional entropy H(data | cond).
condMutualInfo(x, y, z): Calculates the conditional mutual information I(x, y | z) from a set of observations.
entropy(data): Calculates the joint entropy of a set of observations. Each input range represents a vector of observations. If only one range is given, this reduces to the ordinary entropy. Each input range must have a length.
entropyCounts(data): Calculates the Shannon entropy of a forward range that is treated as frequency counts of a set of discrete observations.
entropyCounts(data, n)
entropySorted(data): Calculates the entropy of any input range of observations more quickly than entropy(), provided that all equal values are adjacent. If the input is sorted by more than one key, for example a range of structs, the result is the joint entropy of all of the keys. The compFun alias is used to compare adjacent elements and determine how many instances of each value exist.
joint(args): Binds a set of ranges together to represent a joint probability distribution.
mutualInfo(x, y): Calculates the mutual information of two vectors of discrete observations.
mutualInfoTable(table): Calculates the mutual information of a contingency table representing a joint discrete probability distribution. Takes a set of finite forward ranges, one for each column in the contingency table. These can be expressed either as a tuple of ranges or as a range of ranges.
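The quantities listed above are related through standard entropy identities: I(X;Y) = H(X) + H(Y) - H(X,Y), H(X|Y) = H(X,Y) - H(Y), and I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The following Python sketch illustrates those identities and the contingency-table form of mutual information. All function names are hypothetical, and nothing here is claimed to mirror how the D module computes its results.

```python
import math
from collections import Counter

def joint_entropy(*columns):
    """Joint entropy in bits; each argument is one vector of observations."""
    rows = list(zip(*columns))       # one tuple per joint observation
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_info(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return joint_entropy(x) + joint_entropy(y) - joint_entropy(x, y)

def cond_entropy(data, cond):
    # H(data | cond) = H(data, cond) - H(cond)
    return joint_entropy(data, cond) - joint_entropy(cond)

def cond_mutual_info(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return (joint_entropy(x, z) + joint_entropy(y, z)
            - joint_entropy(x, y, z) - joint_entropy(z))

def mutual_info_table(columns):
    """Mutual information in bits of a contingency table of counts.

    `columns` is a list of columns, each a list of cell counts
    (an assumed layout chosen for this sketch).
    """
    total = sum(sum(col) for col in columns)
    col_sums = [sum(col) for col in columns]
    row_sums = [sum(col[i] for col in columns) for i in range(len(columns[0]))]
    result = 0.0
    for j, col in enumerate(columns):
        for i, cell in enumerate(col):
            if cell > 0:             # empty cells contribute nothing
                p = cell / total
                result += p * math.log2(p * total * total / (row_sums[i] * col_sums[j]))
    return result

x = [0, 0, 1, 1]
y = [0, 1, 0, 1]                     # independent of x in this sample
mi_same = mutual_info(x, x)          # x fully determines itself
mi_indep = mutual_info(x, y)
```

Expressing everything in terms of joint entropy keeps the sketch short; a production implementation would typically avoid recounting the same columns several times.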


DenseInfoTheory: Much faster implementations of information theory functions for the special but common case where all observations are integers in the range [0, nBin). This is the case, for example, when the observations have previously been binned using dstats.base.frqBin().
Joint: Iterates over a set of ranges by value in lockstep and, on each iteration, returns an ObsEnt, which is used internally by the entropy functions.
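One plausible reason the dense case can be faster is that integer observations in [0, nBin) can be tallied in a flat array instead of a hash table. The Python sketch below shows that idea only; it is an assumption about the general technique, not a description of DenseInfoTheory's internals, and dense_entropy_bits is a hypothetical name.

```python
import math

def dense_entropy_bits(observations, n_bin):
    """Entropy in bits of integer observations known to lie in [0, n_bin).

    Counts are kept in a fixed-size list indexed by the observation itself,
    so no hashing is needed.
    """
    counts = [0] * n_bin
    for obs in observations:
        if not 0 <= obs < n_bin:     # guard against out-of-range input
            raise ValueError("observation outside [0, n_bin)")
        counts[obs] += 1
    n = len(observations)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

binned = [0, 1, 2, 3, 0, 1, 2, 3]    # e.g. output of a prior binning step
h = dense_entropy_bits(binned, 4)
```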