Module dstats.regress

A module for performing linear regression. Its interface is unusual in that it is range-based rather than matrix-based: values for the independent variables are provided as either a tuple or a range of ranges. This means that one can use, for example, map to fit high-order models and to evaluate certain values lazily. (For details, see examples below.)
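To make the range-based idea concrete, here is a minimal sketch that uses only the D standard library. It shows how predictor columns can be lazy ranges rather than stored matrix columns. The helper name `squared` is hypothetical, and the call shape mentioned in the final comment is an assumption inferred from the descriptions on this page, not verified against the library.

```d
import std.algorithm : equal, map;
import std.range : repeat, take;

/// Hypothetical helper: a lazily squared view of x. No new array is allocated.
auto squared(const(double)[] x)
{
    return x.map!(a => a * a);
}

void main()
{
    double[] x = [1.0, 2.0, 3.0, 4.0];

    // Intercept column: an infinite lazy range of ones.
    auto ones = repeat(1.0);

    // Quadratic term: each element is computed only when it is read.
    auto x2 = squared(x);

    assert(ones.take(x.length).equal([1.0, 1.0, 1.0, 1.0]));
    assert(x2.equal([1.0, 4.0, 9.0, 16.0]));

    // Assumption: with dstats imported, columns like these would be passed
    // as predictors, e.g. linearRegressBeta(y, ones, x, x2).
}
```

Because the squared column is a view over `x`, a higher-order term costs no extra memory until the regression reads it.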

Author

David Simcha

Functions

Name Description
linearRegress(Y, input) Perform a linear regression as in linearRegressBeta, but return a RegressRes containing additional results for statistical inference. If the last element of input is a real, it specifies the confidence level of the intervals to be calculated. Otherwise, the default of 0.95 is used. The rest of input should be the elements of X.
linearRegressBeta(Y, XIn) Perform a linear regression and return just the beta values. The advantages to just returning the beta values are that it's faster and that each range needs to be iterated over only once, and thus can be just an input range. The beta values are returned such that the smallest index corresponds to the leftmost element of X. X can be either a tuple or a range of input ranges. Y must be an input range.
linearRegressBetaBuf(buf, Y, XRidge) Same as linearRegressBeta, but allows the user to specify a buffer for the beta terms. If the buffer is too short, a new one is allocated. Otherwise, the results are returned in the user-provided buffer.
linearRegressPenalized(yIn, xIn, lasso, ridge) Performs lasso (L1) and/or ridge (L2) penalized linear regression. Due to the way the data is standardized, no intercept term should be included in x (unlike linearRegress and linearRegressBeta). The intercept coefficient is implicitly included and returned in the first element of the returned array. Usage is otherwise identical.
loess1D(y, x, span, degree) This function performs loess regression. Loess regression is a local regression procedure, where a prediction of the dependent (y) variable is made from an observation of the independent (x) variable by weighted least squares over x values in the neighborhood of the value being evaluated.
logistic(xb) The logistic function used in logistic regression.
logisticRegress(yIn, input) Similar to logisticRegressBeta, but returns a LogisticRes containing additional results for statistical inference. If the last element of input is a floating point number instead of a range, it specifies the confidence level of the intervals to be calculated. Otherwise, the default of 0.95 is used.
logisticRegressBeta(yIn, xRidge) Computes a logistic regression using a maximum likelihood estimator and returns the beta coefficients. This is a generalized linear model with the inverse link function f(XB) = 1 / (1 + exp(-XB)). This is generally used to model the probability that a binary Y variable is 1 given a set of X variables.
logisticRegressPenalized(yIn, xIn, lasso, ridge) Performs lasso (L1) and/or ridge (L2) penalized logistic regression. Due to the way the data is standardized, no intercept term should be included in x (unlike logisticRegress and logisticRegressBeta). The intercept coefficient is implicitly included and returned in the first element of the returned array. Usage is otherwise identical.
polyFit(Y, X, N, confInt) Convenience function that takes a forward range X and a forward range Y, creates an array of PowMap structs for integer powers 0 through N, and calls linearRegress.
polyFitBeta(Y, X, N, ridge) Convenience function that takes a forward range X and a forward range Y, creates an array of PowMap structs for integer powers from 0 through N, and calls linearRegressBeta.
polyFitBetaBuf(buf, Y, X, N, ridge) Same as polyFitBeta, but allows the caller to provide an explicit buffer to return the coefficients in. If it's too short, a new one will be allocated. Otherwise, results will be returned in the user-provided buffer.
powMap(range, exponent) Maps a forward range to a power determined at runtime. ExpType is the type of the exponent. Using an int is faster than using a double, but obviously less flexible.
residuals(betas, Y, X) Given the beta coefficients from a linear regression, and X and Y values, returns a range that lazily computes the residuals.
_arrayExpSliceMulSliceAddass_d(p2, p1, c0)
_arrayExpSliceMulSliceAssign_d(p2, p1, c0)
_arrayExpSliceMulSliceMinass_d(p2, p1, c0)
_arraySliceExpMulSliceAddass_d(p2, c1, p0)
_arraySliceSliceMinSliceAssign_d(p2, p1, p0)
_arraySliceSliceMulSliceAssign_d(p2, p1, p0)
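
As an illustration of the logistic function listed above, the following is a minimal sketch of the standard form, 1 / (1 + exp(-xb)), written with the D standard library only. The name `logisticSketch` is hypothetical; this is not the module's source.

```d
import std.math : exp, isClose;

/// Standard logistic function: maps any real xb into the interval (0, 1).
double logisticSketch(double xb)
{
    return 1.0 / (1.0 + exp(-xb));
}

void main()
{
    // A linear predictor of zero corresponds to a probability of one half.
    assert(isClose(logisticSketch(0.0), 0.5));

    // Symmetry: f(-t) = 1 - f(t).
    assert(isClose(logisticSketch(-2.0), 1.0 - logisticSketch(2.0)));

    // A larger linear predictor gives a higher modeled probability.
    assert(logisticSketch(3.0) > logisticSketch(1.0));
}
```

In a logistic regression, `xb` would be the dot product of one observation's X values with the fitted beta coefficients, and the result is the modeled probability that Y is 1.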

Classes

Name Description
Loess1D This class is returned from the loess1D function and holds the state of a loess regression with one predictor variable.
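
The idea behind loess, a prediction by weighted least squares over nearby x values, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses tricube weights (the conventional choice for loess, though this page does not say which weights the module uses), a local straight line, and a neighborhood that covers every point. The names `tricube` and `localLinear` are hypothetical.

```d
import std.algorithm : max;
import std.math : abs, isClose;

/// Tricube weight for a scaled distance u = |xi - x0| / h; zero for u >= 1.
double tricube(double u)
{
    if (u >= 1.0) return 0.0;
    double t = 1.0 - u * u * u;
    return t * t * t;
}

/// Predict y at x0 by weighted least squares on a straight line.
/// Simplification: all points are in the neighborhood (a span of 1).
double localLinear(const(double)[] x, const(double)[] y, double x0)
{
    // Bandwidth slightly beyond the farthest point so its weight is nonzero.
    double h = 0;
    foreach (xi; x) h = max(h, abs(xi - x0));
    h *= 1.0001;

    // Weighted sums needed for the two-parameter normal equations.
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    foreach (i; 0 .. x.length)
    {
        double w = tricube(abs(x[i] - x0) / h);
        sw += w;
        swx += w * x[i];
        swy += w * y[i];
        swxx += w * x[i] * x[i];
        swxy += w * x[i] * y[i];
    }
    double slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx);
    double intercept = (swy - slope * swx) / sw;
    return intercept + slope * x0;
}

void main()
{
    double[] x = [0.0, 1.0, 2.0, 3.0, 4.0];
    double[] y = [1.0, 3.0, 5.0, 7.0, 9.0]; // exactly y = 1 + 2x
    // On exactly linear data the local fit reproduces the line.
    assert(isClose(localLinear(x, y, 2.5), 6.0));
}
```

A full loess implementation would restrict each fit to the fraction of points given by the span and would allow a higher local degree; both are omitted here to keep the weighting step visible.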

Structs

Name Description
LogisticRes Plain old data struct to hold the results of a logistic regression.
PolyFitRes Struct returned by polyFit.
PowMap
RegressRes Struct that holds the results of a linear regression. It's a plain old data struct.
Residuals Forward Range for holding the residuals from a regression analysis.
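
To show what lazily computed residuals look like, here is a small sketch using only the D standard library. It fits a line with one predictor in closed form and then exposes the residuals as a lazy range. The names `fitLine` and `lazyResiduals` are hypothetical and stand in for the library's own routines; this is not how the module is implemented.

```d
import std.algorithm : map;
import std.math : abs, isClose;
import std.range : zip;

/// Closed-form least squares for y = b0 + b1 * x; returns [b0, b1].
double[2] fitLine(const(double)[] x, const(double)[] y)
{
    double n = x.length;
    double mx = 0, my = 0;
    foreach (i; 0 .. x.length) { mx += x[i]; my += y[i]; }
    mx /= n;
    my /= n;

    double sxy = 0, sxx = 0;
    foreach (i; 0 .. x.length)
    {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
    }
    double b1 = sxy / sxx;
    double b0 = my - b1 * mx;
    return [b0, b1];
}

/// Residuals y - (b0 + b1 * x), each computed only when it is read.
auto lazyResiduals(double[2] beta, const(double)[] x, const(double)[] y)
{
    return zip(x, y).map!(p => p[1] - (beta[0] + beta[1] * p[0]));
}

void main()
{
    double[] x = [1.0, 2.0, 3.0, 4.0];
    double[] y = [2.0, 4.0, 6.0, 9.0];

    auto beta = fitLine(x, y);
    assert(isClose(beta[0], -0.5) && isClose(beta[1], 2.3));

    auto res = lazyResiduals(beta, x, y); // nothing is computed yet

    // With an intercept in the model, least-squares residuals sum to zero.
    double total = 0;
    foreach (r; res) total += r;
    assert(abs(total) < 1e-12);
}
```

Keeping the residuals lazy means no residual array is allocated unless the caller chooses to store one.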

Manifest constants

Name Type Description
haveSvd

Aliases

Name Type Description
repeat