stats

import "github.com/montanaflynn/stats"

Package stats is a well-tested and comprehensive statistics library with no dependencies.

Example Usage:

// start with some source data to use
data := []float64{1.0, 2.1, 3.2, 4.823, 4.1, 5.8}

// you could also use different types like this
// data := stats.LoadRawData([]int{1, 2, 3, 4, 5})
// data := stats.LoadRawData([]interface{}{1.1, "2", 3})
// etc...

median, _ := stats.Median(data)
fmt.Println(median) // 3.65

roundedMedian, _ := stats.Round(median, 0)
fmt.Println(roundedMedian) // 4

MIT License Copyright (c) 2014-2020 Montana Flynn (https://montanaflynn.com)

correlation.go cumulative_sum.go data.go deviation.go distances.go doc.go entropy.go errors.go geometric_distribution.go legacy.go load.go max.go mean.go median.go min.go mode.go norm.go outlier.go percentile.go quartile.go ranksum.go regression.go round.go sample.go sigmoid.go softmax.go sum.go util.go variance.go

var (
    // ErrEmptyInput Input must not be empty
    ErrEmptyInput = statsError{"Input must not be empty."}
    // ErrNaN Not a number
    ErrNaN = statsError{"Not a number."}
    // ErrNegative Must not contain negative values
    ErrNegative = statsError{"Must not contain negative values."}
    // ErrZero Must not contain zero values
    ErrZero = statsError{"Must not contain zero values."}
    // ErrBounds Input is outside of range
    ErrBounds = statsError{"Input is outside of range."}
    // ErrSize Must be the same length
    ErrSize = statsError{"Must be the same length."}
    // ErrInfValue Value is infinite
    ErrInfValue = statsError{"Value is infinite."}
    // ErrYCoord Y Value must be greater than zero
    ErrYCoord = statsError{"Y Value must be greater than zero."}
)

These are the package-wide error values. All error identification should use these values. https://github.com/golang/go/wiki/Errors#naming
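Identifying an error by value might look like the stand-alone sketch below. It mirrors the declaration above with a local `statsError` type; the field name `msg` and the helpers `meanOf` and `isEmptyInput` are assumptions made for this example, not the package's internals.

```go
package main

import "errors"

// statsError is a local stand-in for the package's error type
// (the field name is an assumption made for this sketch).
type statsError struct {
	msg string
}

// Error makes statsError satisfy the built-in error interface.
func (e statsError) Error() string { return e.msg }

// ErrEmptyInput mirrors the sentinel value documented above.
var ErrEmptyInput = statsError{"Input must not be empty."}

// meanOf is a hypothetical helper that returns the sentinel on empty input.
func meanOf(data []float64) (float64, error) {
	if len(data) == 0 {
		return 0, ErrEmptyInput
	}
	sum := 0.0
	for _, v := range data {
		sum += v
	}
	return sum / float64(len(data)), nil
}

// isEmptyInput reports whether err is the ErrEmptyInput sentinel.
func isEmptyInput(err error) bool {
	return errors.Is(err, ErrEmptyInput)
}
```

With the real package, the comparison target would presumably be `stats.ErrEmptyInput` instead of the local value.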

var (
    EmptyInputErr = ErrEmptyInput
    NaNErr        = ErrNaN
    NegativeErr   = ErrNegative
    ZeroErr       = ErrZero
    BoundsErr     = ErrBounds
    SizeErr       = ErrSize
    InfValue      = ErrInfValue
    YCoordErr     = ErrYCoord
    EmptyInput    = ErrEmptyInput
)

Legacy error names that didn't start with Err

func AutoCorrelation(data Float64Data, lags int) (float64, error)

AutoCorrelation is the correlation of a signal with a delayed copy of itself as a function of delay

func ChebyshevDistance(dataPointX, dataPointY Float64Data) (distance float64, err error)

ChebyshevDistance computes the Chebyshev distance between two data sets

func Correlation(data1, data2 Float64Data) (float64, error)

Correlation describes the degree of relationship between two sets of data

func Covariance(data1, data2 Float64Data) (float64, error)

Covariance is a measure of how much two sets of data change together

func CovariancePopulation(data1, data2 Float64Data) (float64, error)

CovariancePopulation computes covariance for entire population between two variables.

func CumulativeSum(input Float64Data) ([]float64, error)

CumulativeSum calculates the cumulative sum of the input slice
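A minimal stand-alone sketch of a cumulative sum, for illustration only; the name `cumulativeSum` is hypothetical and the package's error handling for empty input is not reproduced.

```go
package main

// cumulativeSum returns the running totals of input:
// out[i] = input[0] + ... + input[i].
func cumulativeSum(input []float64) []float64 {
	out := make([]float64, len(input))
	running := 0.0
	for i, v := range input {
		running += v
		out[i] = running
	}
	return out
}
```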

func Entropy(input Float64Data) (float64, error)

Entropy provides calculation of the entropy

func EuclideanDistance(dataPointX, dataPointY Float64Data) (distance float64, err error)

EuclideanDistance computes the Euclidean distance between two data sets

func ExpGeom(p float64) (exp float64, err error)

ExpGeom generates the expectation or average number of trials for a geometric random variable with parameter p

func GeometricMean(input Float64Data) (float64, error)

GeometricMean gets the geometric mean for a slice of numbers

func HarmonicMean(input Float64Data) (float64, error)

HarmonicMean gets the harmonic mean for a slice of numbers
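The two means above follow textbook definitions, sketched here as stand-alone functions. The names are hypothetical, and the sketch assumes non-empty input with strictly positive values rather than returning the package's errors.

```go
package main

import "math"

// geometricMean returns the n-th root of the product of n positive values,
// computed through logarithms to avoid overflow.
func geometricMean(xs []float64) float64 {
	sumLogs := 0.0
	for _, x := range xs {
		sumLogs += math.Log(x)
	}
	return math.Exp(sumLogs / float64(len(xs)))
}

// harmonicMean returns n divided by the sum of the reciprocals.
func harmonicMean(xs []float64) float64 {
	sumRecip := 0.0
	for _, x := range xs {
		sumRecip += 1 / x
	}
	return float64(len(xs)) / sumRecip
}
```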

func InterQuartileRange(input Float64Data) (float64, error)

InterQuartileRange finds the range between Q1 and Q3

func ManhattanDistance(dataPointX, dataPointY Float64Data) (distance float64, err error)

ManhattanDistance computes the Manhattan distance between two data sets

func Max(input Float64Data) (max float64, err error)

Max finds the highest number in a slice

func Mean(input Float64Data) (float64, error)

Mean gets the average of a slice of numbers

func Median(input Float64Data) (median float64, err error)

Median gets the median number in a slice of numbers

func MedianAbsoluteDeviation(input Float64Data) (mad float64, err error)

MedianAbsoluteDeviation finds the median of the absolute deviations from the dataset median
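As a hedged illustration of the definition, the stand-alone sketch below computes the median of the absolute deviations from the median. The helper names are hypothetical and non-empty input is assumed.

```go
package main

import (
	"math"
	"sort"
)

// median returns the middle value of xs (the mean of the two middle
// values when len(xs) is even). It sorts a copy, leaving xs untouched.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// medianAbsoluteDeviation returns median(|x_i - median(xs)|).
func medianAbsoluteDeviation(xs []float64) float64 {
	m := median(xs)
	devs := make([]float64, len(xs))
	for i, x := range xs {
		devs[i] = math.Abs(x - m)
	}
	return median(devs)
}
```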

func MedianAbsoluteDeviationPopulation(input Float64Data) (mad float64, err error)

MedianAbsoluteDeviationPopulation finds the median of the absolute deviations from the population median

func Midhinge(input Float64Data) (float64, error)

Midhinge finds the average of the first and third quartiles

func Min(input Float64Data) (min float64, err error)

Min finds the lowest number in a set of data

func MinkowskiDistance(dataPointX, dataPointY Float64Data, lambda float64) (distance float64, err error)

MinkowskiDistance computes the Minkowski distance between two data sets

Arguments:

dataPointX: First set of data points
dataPointY: Second set of data points. Length of both data
            sets must be equal.
lambda:     aka p or city blocks; with lambda = 1 the
            returned distance is the Manhattan distance, and
            with lambda = 2 it is the Euclidean distance. As
            lambda approaches infinity, the distance approaches
            the Chebyshev distance.

Return:

Distance or error
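
The relationship between lambda and the named distances can be sketched with the textbook formula. This is a stand-alone illustration with a hypothetical name; it assumes equal-length inputs and lambda >= 1 and does not reproduce the package's error checks.

```go
package main

import "math"

// minkowski returns (sum_i |x_i - y_i|^lambda)^(1/lambda).
// It assumes len(x) == len(y) and lambda >= 1.
func minkowski(x, y []float64, lambda float64) float64 {
	sum := 0.0
	for i := range x {
		sum += math.Pow(math.Abs(x[i]-y[i]), lambda)
	}
	return math.Pow(sum, 1/lambda)
}
```

For the points (0, 0) and (3, 4), lambda = 1 gives the Manhattan distance 7, lambda = 2 gives the Euclidean distance 5, and a large lambda approaches the Chebyshev distance 4.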

func Mode(input Float64Data) (mode []float64, err error)

Mode gets the mode [most frequent value(s)] of a slice of float64s

func Ncr(n, r int) int

Ncr is an N choose R algorithm. Aaron Cannon's algorithm.
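For orientation, a binomial coefficient can be computed with the common multiplicative formula sketched below. This is not necessarily the algorithm the package uses; the name `nChooseR` and the out-of-range behavior are assumptions of this sketch.

```go
package main

// nChooseR returns the binomial coefficient C(n, r) using the
// multiplicative formula; it returns 0 when r is out of range.
func nChooseR(n, r int) int {
	if r < 0 || r > n {
		return 0
	}
	if r > n-r {
		r = n - r // C(n, r) == C(n, n-r); use the smaller side
	}
	result := 1
	for i := 1; i <= r; i++ {
		// After step i, result equals C(n-r+i, i), so the division is exact.
		result = result * (n - r + i) / i
	}
	return result
}
```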

func NormBoxMullerRvs(loc float64, scale float64, size int) []float64

NormBoxMullerRvs generates random variates using the Box–Muller transform. For more information please visit: http://mathworld.wolfram.com/Box-MullerTransformation.html
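A minimal sketch of the basic Box-Muller transform follows. It is a stand-alone illustration, not the package's code; the explicit `seed` parameter and the name `boxMuller` are assumptions added so the output is reproducible.

```go
package main

import (
	"math"
	"math/rand"
)

// boxMuller draws size variates from a normal distribution with mean loc
// and standard deviation scale, using pairs of uniform samples.
func boxMuller(seed int64, loc, scale float64, size int) []float64 {
	rng := rand.New(rand.NewSource(seed))
	out := make([]float64, 0, size)
	for len(out) < size {
		u1 := 1 - rng.Float64() // in (0, 1], keeps Log finite
		u2 := rng.Float64()
		r := math.Sqrt(-2 * math.Log(u1))
		z0 := r * math.Cos(2*math.Pi*u2)
		z1 := r * math.Sin(2*math.Pi*u2)
		out = append(out, loc+scale*z0)
		if len(out) < size {
			out = append(out, loc+scale*z1)
		}
	}
	return out
}
```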

func NormCdf(x float64, loc float64, scale float64) float64

NormCdf is the cumulative distribution function.
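The normal CDF has a standard closed form in terms of the error function. The sketch below uses `math.Erfc` from the standard library; the name `normCDF` is hypothetical and the package's own implementation may differ.

```go
package main

import "math"

// normCDF returns P(X <= x) for a normal variable X with mean loc and
// standard deviation scale, via the complementary error function.
func normCDF(x, loc, scale float64) float64 {
	return 0.5 * math.Erfc(-(x-loc)/(scale*math.Sqrt2))
}
```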

func NormEntropy(loc float64, scale float64) float64

NormEntropy is the differential entropy of the RV.

func NormFit(data []float64) [2]float64

NormFit returns the maximum likelihood estimators for the Normal Distribution. Takes array of float64 values. Returns array of Mean followed by Standard Deviation.

func NormInterval(alpha float64, loc float64, scale float64) [2]float64

NormInterval finds endpoints of the range that contains alpha percent of the distribution.

func NormIsf(p float64, loc float64, scale float64) (x float64)

NormIsf is the inverse survival function (inverse of sf).

func NormLogCdf(x float64, loc float64, scale float64) float64

NormLogCdf is the log of the cumulative distribution function.

func NormLogPdf(x float64, loc float64, scale float64) float64

NormLogPdf is the log of the probability density function.

func NormLogSf(x float64, loc float64, scale float64) float64

NormLogSf is the log of the survival function.

func NormMean(loc float64, scale float64) float64

NormMean is the mean/expected value of the distribution.

func NormMedian(loc float64, scale float64) float64

NormMedian is the median of the distribution.

func NormMoment(n int, loc float64, scale float64) float64

NormMoment approximates the non-central (raw) moment of order n. For more information please visit: https://math.stackexchange.com/questions/1945448/methods-for-finding-raw-moments-of-the-normal-distribution

func NormPdf(x float64, loc float64, scale float64) float64

NormPdf is the probability density function.

func NormPpf(p float64, loc float64, scale float64) (x float64)

NormPpf is the point percentile function. This is based on Peter John Acklam's inverse normal CDF algorithm: http://home.online.no/~pjacklam/notes/invnorm/ (no longer available). For more information please visit: https://stackedboxes.org/2017/05/01/acklams-normal-quantile-function/

func NormPpfRvs(loc float64, scale float64, size int) []float64

NormPpfRvs generates random variates using the Point Percentile Function. For more information please visit: https://demonstrations.wolfram.com/TheMethodOfInverseTransforms/

func NormSf(x float64, loc float64, scale float64) float64

NormSf is the survival function (also defined as 1 - cdf, but sf is sometimes more accurate).
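The accuracy remark can be illustrated with a stand-alone sketch. Far in the upper tail, `1 - cdf` loses every significant digit to rounding, while evaluating the tail directly with `math.Erfc` keeps a small positive value. The names `normCDF` and `normSF` are hypothetical and this is not the package's code.

```go
package main

import "math"

// normCDF returns P(X <= x) for a normal variable with mean loc and
// standard deviation scale.
func normCDF(x, loc, scale float64) float64 {
	return 0.5 * math.Erfc(-(x-loc)/(scale*math.Sqrt2))
}

// normSF returns P(X > x) directly from the complementary error function,
// without subtracting a value close to 1 from 1.
func normSF(x, loc, scale float64) float64 {
	return 0.5 * math.Erfc((x-loc)/(scale*math.Sqrt2))
}
```

At x = 10 on the standard normal, `normSF` returns a tiny positive probability, whereas `1 - normCDF(10, 0, 1)` rounds to zero in float64.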

func NormStats(loc float64, scale float64, moments string) []float64

NormStats returns the mean, variance, skew, and/or kurtosis. It takes a string containing any of 'mvsk' (mean 'm', variance 'v', skew 's', kurtosis 'k') and returns a slice of the requested values in m, v, s, k order.

func NormStd(loc float64, scale float64) float64

NormStd is the standard deviation of the distribution.

func NormVar(loc float64, scale float64) float64

NormVar is the variance of the distribution.

func Pearson(data1, data2 Float64Data) (float64, error)

Pearson calculates the Pearson product-moment correlation coefficient between two variables
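A minimal sketch of the product-moment formula, r = S_xy / sqrt(S_xx * S_yy), is shown below. The name `pearson` is hypothetical; the sketch assumes equal-length, non-empty, non-constant inputs and omits the package's error returns.

```go
package main

import "math"

// pearson returns the product-moment correlation coefficient of x and y.
// It assumes equal, non-zero lengths and non-constant inputs.
func pearson(x, y []float64) float64 {
	n := float64(len(x))
	var sumX, sumY float64
	for i := range x {
		sumX += x[i]
		sumY += y[i]
	}
	meanX, meanY := sumX/n, sumY/n

	// Accumulate the centered cross product and sums of squares.
	var sxy, sxx, syy float64
	for i := range x {
		dx, dy := x[i]-meanX, y[i]-meanY
		sxy += dx * dy
		sxx += dx * dx
		syy += dy * dy
	}
	return sxy / math.Sqrt(sxx*syy)
}
```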

func Percentile(input Float64Data, percent float64) (percentile float64, err error)

Percentile finds the relative standing in a slice of floats

func PercentileNearestRank(input Float64Data, percent float64) (percentile float64, err error)

PercentileNearestRank finds the relative standing in a slice of floats using the Nearest Rank method
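The nearest-rank method picks the value at ordinal rank ceil(P/100 * N) in the sorted data. The stand-alone sketch below shows that definition; the name `nearestRank` is hypothetical, and it assumes non-empty input and 0 < percent <= 100 instead of returning the package's errors.

```go
package main

import (
	"math"
	"sort"
)

// nearestRank returns the smallest value in xs such that at least
// percent% of the data is less than or equal to it.
// It assumes len(xs) > 0 and 0 < percent <= 100.
func nearestRank(xs []float64, percent float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	// 1-based ordinal rank; multiply before dividing to limit rounding.
	rank := int(math.Ceil(percent * float64(len(s)) / 100))
	return s[rank-1]
}
```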

func PopulationVariance(input Float64Data) (pvar float64, err error)

PopulationVariance finds the amount of variance within a population

func ProbGeom(a int, b int, p float64) (prob float64, err error)

ProbGeom generates the probability for a geometric random variable with parameter p to achieve success in the interval of [a, b] trials. See https://en.wikipedia.org/wiki/Geometric_distribution for more information.
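The textbook quantities for a geometric random variable counting trials until the first success are sketched below: P(X = k) = (1-p)^(k-1) * p, expectation 1/p, and variance (1-p)/p^2. The function names are hypothetical, the sketch assumes 0 < p <= 1 and 1 <= a <= b, and the package may treat the interval endpoints differently.

```go
package main

import "math"

// probGeomInterval returns P(a <= X <= b), where X is the trial on which
// the first success occurs: the sum over k of (1-p)^(k-1) * p.
func probGeomInterval(a, b int, p float64) float64 {
	prob := 0.0
	for k := a; k <= b; k++ {
		prob += math.Pow(1-p, float64(k-1)) * p
	}
	return prob
}

// expGeom returns the expected number of trials, 1/p.
func expGeom(p float64) float64 { return 1 / p }

// varGeom returns the variance of the number of trials, (1-p)/p^2.
func varGeom(p float64) float64 { return (1 - p) / (p * p) }
```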

func Round(input float64, places int) (rounded float64, err error)

Round a float to a specific decimal place or precision

func Sample(input Float64Data, takenum int, replacement bool) ([]float64, error)

Sample returns a sample from the input, with or without replacement

func SampleVariance(input Float64Data) (svar float64, err error)

SampleVariance finds the amount of variance within a sample
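The difference between population and sample variance is the divisor: n for a population, n-1 for a sample (Bessel's correction). A stand-alone sketch with hypothetical names, assuming at least two values:

```go
package main

// sumSquaredDeviations returns the sum of (x_i - mean)^2 over xs.
func sumSquaredDeviations(xs []float64) float64 {
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))

	ss := 0.0
	for _, x := range xs {
		d := x - mean
		ss += d * d
	}
	return ss
}

// populationVariance divides the squared deviations by n.
func populationVariance(xs []float64) float64 {
	return sumSquaredDeviations(xs) / float64(len(xs))
}

// sampleVariance divides the squared deviations by n-1.
func sampleVariance(xs []float64) float64 {
	return sumSquaredDeviations(xs) / float64(len(xs)-1)
}
```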

func Sigmoid(input Float64Data) ([]float64, error)

Sigmoid returns the input values in the range of 0 to 1 along the sigmoid or s-shaped curve, commonly used in machine learning while training neural networks as an activation function.

func SoftMax(input Float64Data) ([]float64, error)

SoftMax returns the input values in the range of 0 to 1 with sum of all the probabilities being equal to one. It is commonly used in machine learning neural networks.
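A minimal sketch of the softmax definition, exp(x_i) / sum_j exp(x_j), is given below. Subtracting the largest input before exponentiating is a common overflow guard and is an assumption of this sketch, not a statement about the package; the name `softmax` is hypothetical.

```go
package main

import "math"

// softmax returns exp(x_i) / sum_j exp(x_j) for each element of xs.
// The largest input is subtracted first so that Exp cannot overflow.
func softmax(xs []float64) []float64 {
	if len(xs) == 0 {
		return nil
	}
	largest := xs[0]
	for _, x := range xs[1:] {
		if x > largest {
			largest = x
		}
	}
	out := make([]float64, len(xs))
	sum := 0.0
	for i, x := range xs {
		out[i] = math.Exp(x - largest)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}
```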

func StableSample(input Float64Data, takenum int) ([]float64, error)

StableSample is like a stable sort: it returns samples from the input while keeping the order of the original data.

func StandardDeviation(input Float64Data) (sdev float64, err error)

StandardDeviation finds the amount of variation in the dataset

func StandardDeviationPopulation(input Float64Data) (sdev float64, err error)

StandardDeviationPopulation finds the amount of variation from the population

func StandardDeviationSample(input Float64Data) (sdev float64, err error)

StandardDeviationSample finds the amount of variation from a sample

func StdDevP(input Float64Data) (sdev float64, err error)

StdDevP is a shortcut to StandardDeviationPopulation

func StdDevS(input Float64Data) (sdev float64, err error)

StdDevS is a shortcut to StandardDeviationSample

func Sum(input Float64Data) (sum float64, err error)

Sum adds all the numbers of a slice together

func Trimean(input Float64Data) (float64, error)

Trimean finds the average of the median and the midhinge
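Given the three quartile points, the midhinge and trimean reduce to simple averages. The sketch below takes the quartiles as inputs, so it makes no assumption about how the package computes them; the function names are hypothetical.

```go
package main

// midhinge returns the average of the first and third quartiles.
func midhinge(q1, q3 float64) float64 {
	return (q1 + q3) / 2
}

// trimean returns the average of the median (q2) and the midhinge,
// which equals (q1 + 2*q2 + q3) / 4.
func trimean(q1, q2, q3 float64) float64 {
	return (q2 + midhinge(q1, q3)) / 2
}
```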

func VarGeom(p float64) (exp float64, err error)

VarGeom generates the variance for a geometric random variable with parameter p

func VarP(input Float64Data) (sdev float64, err error)

VarP is a shortcut to PopulationVariance

func VarS(input Float64Data) (sdev float64, err error)

VarS is a shortcut to SampleVariance

func Variance(input Float64Data) (sdev float64, err error)

Variance finds the amount of variation in the dataset

type Coordinate struct {
    X, Y float64
}

Coordinate holds the data in a series

func ExpReg(s []Coordinate) (regressions []Coordinate, err error)

ExpReg is a shortcut to ExponentialRegression

func LinReg(s []Coordinate) (regressions []Coordinate, err error)

LinReg is a shortcut to LinearRegression

func LogReg(s []Coordinate) (regressions []Coordinate, err error)

LogReg is a shortcut to LogarithmicRegression

type Float64Data []float64

Float64Data is a named type for []float64 with helper methods

func LoadRawData(raw interface{}) (f Float64Data)

LoadRawData parses and converts a slice of mixed data types to floats

func (Float64Data) AutoCorrelation

func (f Float64Data) AutoCorrelation(lags int) (float64, error)

AutoCorrelation is the correlation of a signal with a delayed copy of itself as a function of delay

func (Float64Data) Correlation

func (f Float64Data) Correlation(d Float64Data) (float64, error)

Correlation describes the degree of relationship between two sets of data

func (Float64Data) Covariance

func (f Float64Data) Covariance(d Float64Data) (float64, error)

Covariance is a measure of how much two sets of data change together

func (f Float64Data) CovariancePopulation(d Float64Data) (float64, error)

CovariancePopulation computes covariance for entire population between two variables

func (Float64Data) CumulativeSum

func (f Float64Data) CumulativeSum() ([]float64, error)

CumulativeSum returns the cumulative sum of the data

func (Float64Data) Entropy

func (f Float64Data) Entropy() (float64, error)

Entropy provides calculation of the entropy

func (Float64Data) GeometricMean

func (f Float64Data) GeometricMean() (float64, error)

GeometricMean returns the geometric mean of the data

func (Float64Data) Get

func (f Float64Data) Get(i int) float64

Get returns the item at index i in the slice

func (Float64Data) HarmonicMean

func (f Float64Data) HarmonicMean() (float64, error)

HarmonicMean returns the harmonic mean of the data

func (Float64Data) InterQuartileRange

func (f Float64Data) InterQuartileRange() (float64, error)

InterQuartileRange finds the range between Q1 and Q3

func (Float64Data) Len

func (f Float64Data) Len() int

Len returns length of slice

func (Float64Data) Less

func (f Float64Data) Less(i, j int) bool

Less reports whether the number at index i is less than the number at index j

func (Float64Data) Max

func (f Float64Data) Max() (float64, error)

Max returns the maximum number in the data

func (Float64Data) Mean

func (f Float64Data) Mean() (float64, error)

Mean returns the mean of the data

func (Float64Data) Median

func (f Float64Data) Median() (float64, error)

Median returns the median of the data

func (f Float64Data) MedianAbsoluteDeviation() (float64, error)

MedianAbsoluteDeviation finds the median of the absolute deviations from the dataset median

func (f Float64Data) MedianAbsoluteDeviationPopulation() (float64, error)

MedianAbsoluteDeviationPopulation finds the median of the absolute deviations from the population median

func (Float64Data) Midhinge

func (f Float64Data) Midhinge(d Float64Data) (float64, error)

Midhinge finds the average of the first and third quartiles

func (Float64Data) Min

func (f Float64Data) Min() (float64, error)

Min returns the minimum number in the data

func (Float64Data) Mode

func (f Float64Data) Mode() ([]float64, error)

Mode returns the mode of the data

func (Float64Data) Pearson

func (f Float64Data) Pearson(d Float64Data) (float64, error)

Pearson calculates the Pearson product-moment correlation coefficient between two variables.

func (Float64Data) Percentile

func (f Float64Data) Percentile(p float64) (float64, error)

Percentile finds the relative standing in a slice of floats

func (f Float64Data) PercentileNearestRank(p float64) (float64, error)

PercentileNearestRank finds the relative standing using the Nearest Rank method

func (Float64Data) PopulationVariance

func (f Float64Data) PopulationVariance() (float64, error)

PopulationVariance finds the amount of variance within a population

func (Float64Data) Quartile

func (f Float64Data) Quartile(d Float64Data) (Quartiles, error)

Quartile returns the three quartile points from a slice of data

func (Float64Data) QuartileOutliers

func (f Float64Data) QuartileOutliers() (Outliers, error)

QuartileOutliers finds the mild and extreme outliers

func (Float64Data) Quartiles

func (f Float64Data) Quartiles() (Quartiles, error)

Quartiles returns the three quartile points from instance of Float64Data

func (Float64Data) Sample

func (f Float64Data) Sample(n int, r bool) ([]float64, error)

Sample returns a sample from the input, with or without replacement

func (Float64Data) SampleVariance

func (f Float64Data) SampleVariance() (float64, error)

SampleVariance finds the amount of variance within a sample

func (Float64Data) Sigmoid

func (f Float64Data) Sigmoid() ([]float64, error)

Sigmoid returns the input values along the sigmoid or s-shaped curve

func (Float64Data) SoftMax

func (f Float64Data) SoftMax() ([]float64, error)

SoftMax returns the input values in the range of 0 to 1 with sum of all the probabilities being equal to one.

func (Float64Data) StandardDeviation

func (f Float64Data) StandardDeviation() (float64, error)

StandardDeviation finds the amount of variation in the dataset

func (f Float64Data) StandardDeviationPopulation() (float64, error)

StandardDeviationPopulation finds the amount of variation from the population

func (f Float64Data) StandardDeviationSample() (float64, error)

StandardDeviationSample finds the amount of variation from a sample

func (Float64Data) Sum

func (f Float64Data) Sum() (float64, error)

Sum returns the total of all the numbers in the data

func (Float64Data) Swap

func (f Float64Data) Swap(i, j int)

Swap exchanges the two numbers at indexes i and j in the slice
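Len, Less, and Swap are the three methods that the standard library's `sort.Interface` requires, so a type providing them can be passed to `sort.Sort`. The sketch below uses a local stand-in type `floats`; its method bodies and the helper `sortedCopy` are assumptions for illustration, not the package's code.

```go
package main

import "sort"

// floats is a local stand-in for Float64Data.
type floats []float64

// Len, Less, and Swap together satisfy sort.Interface.
func (f floats) Len() int           { return len(f) }
func (f floats) Less(i, j int) bool { return f[i] < f[j] }
func (f floats) Swap(i, j int)      { f[i], f[j] = f[j], f[i] }

// sortedCopy returns an ascending copy of data, leaving data unchanged.
func sortedCopy(data floats) floats {
	out := append(floats(nil), data...)
	sort.Sort(out)
	return out
}
```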

func (Float64Data) Trimean

func (f Float64Data) Trimean(d Float64Data) (float64, error)

Trimean finds the average of the median and the midhinge

func (Float64Data) Variance

func (f Float64Data) Variance() (float64, error)

Variance finds the amount of variation in the dataset

type Outliers struct {
    Mild    Float64Data
    Extreme Float64Data
}

Outliers holds mild and extreme outliers found in data

func QuartileOutliers(input Float64Data) (Outliers, error)

QuartileOutliers finds the mild and extreme outliers
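One common convention for mild and extreme outliers uses Tukey-style fences at 1.5 and 3 times the interquartile range. The text above does not state the package's thresholds, so the multipliers, the name `classifyOutliers`, and the choice to pass Q1 and Q3 in directly are all assumptions of this sketch.

```go
package main

// classifyOutliers splits xs into mild and extreme outliers using
// fences at 1.5 and 3 times the interquartile range (q3 - q1).
func classifyOutliers(xs []float64, q1, q3 float64) (mild, extreme []float64) {
	iqr := q3 - q1
	innerLow, innerHigh := q1-1.5*iqr, q3+1.5*iqr
	outerLow, outerHigh := q1-3*iqr, q3+3*iqr
	for _, x := range xs {
		switch {
		case x < outerLow || x > outerHigh:
			extreme = append(extreme, x) // beyond the outer fences
		case x < innerLow || x > innerHigh:
			mild = append(mild, x) // between inner and outer fences
		}
	}
	return mild, extreme
}
```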

type Quartiles struct {
    Q1 float64
    Q2 float64
    Q3 float64
}

Quartiles holds the three quartile points

func Quartile(input Float64Data) (Quartiles, error)

Quartile returns the three quartile points from a slice of data

type Series []Coordinate

Series is a container for a series of data

func ExponentialRegression(s Series) (regressions Series, err error)

ExponentialRegression returns an exponential regression on data series

func LinearRegression(s Series) (regressions Series, err error)

LinearRegression finds the least squares linear regression on data series
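A least-squares line can be fitted with the closed-form slope and intercept shown in the stand-alone sketch below. The local `point` type stands in for Coordinate, and returning the fitted points at the input X values is an assumption about how a regression series might be shaped, not a description of the package's code.

```go
package main

// point is a local stand-in for the Coordinate type.
type point struct{ X, Y float64 }

// fitLine returns the least-squares slope and intercept for pts.
// It assumes at least two points with distinct X values.
func fitLine(pts []point) (slope, intercept float64) {
	n := float64(len(pts))
	var sumX, sumY, sumXY, sumXX float64
	for _, p := range pts {
		sumX += p.X
		sumY += p.Y
		sumXY += p.X * p.Y
		sumXX += p.X * p.X
	}
	slope = (n*sumXY - sumX*sumY) / (n*sumXX - sumX*sumX)
	intercept = (sumY - slope*sumX) / n
	return slope, intercept
}

// fitted returns the points on the fitted line at the input X values.
func fitted(pts []point) []point {
	slope, intercept := fitLine(pts)
	out := make([]point, len(pts))
	for i, p := range pts {
		out[i] = point{X: p.X, Y: slope*p.X + intercept}
	}
	return out
}
```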

func LogarithmicRegression(s Series) (regressions Series, err error)

LogarithmicRegression returns a logarithmic regression on data series


Generated by godoc2md