Class StatUtils


  • public class StatUtils
    extends Object
    A statistical helper class.
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.commons.math3.distribution.NormalDistribution m_NormalDist
      The normal distribution used for computation.
    • Constructor Summary

      Constructors 
      Constructor Description
      StatUtils()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static double correlationCoefficient​(double[] y1, double[] y2)
      Computes the correlation coefficient between the two data vectors and returns it.
      static double correlationCoefficient​(Number[] y1, Number[] y2)
      Computes the correlation coefficient between the two data vectors and returns it.
      static double covariance​(double[] x, double[] y)
      Computes the covariance between the two data vectors and returns it.
      static double covariance​(Number[] x, Number[] y)
      Computes the covariance between the two data vectors and returns it.
      static int findClosest​(double[] array, double toFind)
      Returns the index of the double closest to the sought value in the given array.
      static int findClosest​(int[] array, int toFind)
      Returns the index of the integer closest to the sought value in the given array.
      static int findClosest​(Number[] array, Number toFind)
      Returns the index of the number closest to the sought value in the given array.
      static int findFirst​(double[] array, double toFind)
      Returns the (first) index of the sought double in the given array.
      static int findFirst​(int[] array, int toFind)
      Returns the (first) index of the sought integer in the given array.
      static int findFirst​(Number[] array, Number toFind)
      Returns the (first) index of the sought number in the given array.
      static byte[] flatten​(byte[][] matrix)
      Converts the matrix into a flat array, row after row.
      static double[] flatten​(double[][] matrix)
      Converts the matrix into a flat array, row after row.
      static float[] flatten​(float[][] matrix)
      Converts the matrix into a flat array, row after row.
      static int[] flatten​(int[][] matrix)
      Converts the matrix into a flat array, row after row.
      static long[] flatten​(long[][] matrix)
      Converts the matrix into a flat array, row after row.
      static short[] flatten​(short[][] matrix)
      Converts the matrix into a flat array, row after row.
      static Number[] flatten​(Number[][] matrix)
      Converts the matrix into a flat array, row after row.
      static double iqr​(double[] array)
      Returns the interquartile range (IQR) of the given array.
      static double iqr​(int[] array)
      Returns the interquartile range (IQR) of the given array.
      static double iqr​(Number[] array)
      Returns the interquartile range (IQR) of the given array.
      static double[] kendallTheil​(double[] x, double[] y)
      Computes the Kendall-Theil robust regression of the given data points.
      static double[] kendallTheil​(Number[] x, Number[] y)
      Computes the Kendall-Theil robust regression of the given data points.
      static double[] linearRegression​(double[] x, double[] y)
      Calculates the slope and intercept between the two arrays.
      static double[] linearRegression​(Number[] x, Number[] y)
      Calculates the slope and intercept between the two arrays.
      static double mae​(double[] actual, double[] predicted)
      Computes the mean absolute error between the two data vectors and returns it.
      static double mae​(Number[] actual, Number[] predicted)
      Computes the mean absolute error between the two data vectors and returns it.
      static void main​(String[] args)
      Just for testing.
      static double max​(double[] array)
      Returns the (first occurrence of the) biggest value in the given array.
      static int max​(int[] array)
      Returns the (first occurrence of the) biggest value in the given array.
      static Number max​(Number[] array)
      Returns the (first occurrence of the) biggest value in the given array.
      static int maxIndex​(double[] array)
      Returns the (first occurrence of the) index of the cell with the biggest double.
      static int maxIndex​(int[] array)
      Returns the (first occurrence of the) index of the cell with the biggest int.
      static int maxIndex​(Number[] array)
      Returns the (first occurrence of the) index of the cell with the biggest number.
      static double mean​(double[] array)
      Returns the mean of the given array.
      static double mean​(int[] array)
      Returns the mean of the given array.
      static double mean​(Number[] array)
      Returns the mean of the given array.
      static double median​(double[] array)
      Returns the median of the given array.
      static double median​(int[] array)
      Returns the median of the given array.
      static double median​(Number[] array)
      Returns the median of the given array.
      static double min​(double[] array)
      Returns the (first occurrence of the) smallest value in the given array.
      static int min​(int[] array)
      Returns the (first occurrence of the) smallest value in the given array.
      static Number min​(Number[] array)
      Returns the (first occurrence of the) smallest value in the given array.
      static int minIndex​(double[] array)
      Returns the (first occurrence of the) index of the cell with the smallest double.
      static int minIndex​(int[] array)
      Returns the (first occurrence of the) index of the cell with the smallest int.
      static int minIndex​(Number[] array)
      Returns the (first occurrence of the) index of the cell with the smallest number.
      static double normalInverse​(double y0)
      Returns the value, x, for which the area under the Normal (Gaussian) probability density function (integrated from minus infinity to x) is equal to the argument y0 (assumes mean is zero, variance is one).
      static double[] normalize​(double[] array)
      Normalizes the given array (returns a copy).
      static double[] normalize​(int[] array)
      Normalizes the given array (returns a copy).
      static Double[] normalize​(Number[] array)
      Normalizes the given array (returns a copy), i.e., the array will sum up to 1.
      static double[] normalizeRange​(double[] array, double lower, double upper)
      Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
      static double[] normalizeRange​(int[] array, double lower, double upper)
      Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
      static Double[] normalizeRange​(Number[] array, double lower, double upper)
      Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
      static double normalProbability​(double a)
      Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to the argument a (assumes mean is zero, variance is one).
      static double quartile​(double[] array, double quartile)
      Returns the quartile of the given array.
      static double quartile​(int[] array, double quartile)
      Returns the quartile of the given array.
      static double quartile​(Number[] array, double quartile)
      Returns the quartile of the given array.
      static double rae​(double[] actual, double[] predicted)
      Computes the relative absolute error between the two data vectors and returns it.
      static double rae​(Number[] actual, Number[] predicted)
      Computes the relative absolute error between the two data vectors and returns it.
      static double rmse​(double[] actual, double[] predicted)
      Computes the root mean squared error between the two data vectors and returns it.
      static double rmse​(Number[] actual, Number[] predicted)
      Computes the root mean squared error between the two data vectors and returns it.
      static double[] rowNorm​(double[] x)
      Applies row-wise normalization to the data.
      static double[] rowNorm​(Number[] x)
      Applies row-wise normalization to the data.
      static double rrse​(double[] actual, double[] predicted)
      Computes the root relative squared error between the two data vectors and returns it.
      static double rrse​(Number[] actual, Number[] predicted)
      Computes the root relative squared error between the two data vectors and returns it.
      static double rSquared​(double[] actual, double[] predicted)
      Computes the R^2 between the two data vectors and returns it.
      static double rSquared​(Number[] actual, Number[] predicted)
      Computes the R^2 between the two data vectors and returns it.
      static double signalToNoiseRatio​(double[] x)
      Calculates the signal/noise ratio.
      static double signalToNoiseRatio​(Number[] x)
      Calculates the signal/noise ratio.
      static double[] sort​(double[] array)
      Returns a sorted copy of the array (ascending).
      static double[] sort​(double[] array, boolean asc)
      Returns a sorted copy of the array (ascending or descending).
      static int[] sort​(int[] array)
      Returns a sorted copy of the array (ascending).
      static int[] sort​(int[] array, boolean asc)
      Returns a sorted copy of the array (ascending or descending).
      static Number[] sort​(Number[] array)
      Returns a sorted copy of the array (ascending).
      static Number[] sort​(Number[] array, boolean asc)
      Returns a sorted copy of the array (ascending or descending).
      static double[] standardize​(double[] array, boolean isSample)
      Standardizes the given array (returns a copy).
      static double[] standardize​(int[] array, boolean isSample)
      Standardizes the given array (returns a copy).
      static Double[] standardize​(Number[] array, boolean isSample)
      Standardizes the given array (returns a copy).
      static double[] standardScores​(double[] x, boolean isSample)
      Computes the standard scores for the array.
      static double[] standardScores​(double[] actual, double[] predicted, boolean isSample)
      Computes the standard scores.
      static double[] standardScores​(Number[] x, boolean isSample)
      Computes the standard scores for the array.
      static double[] standardScores​(Number[] actual, Number[] predicted, boolean isSample)
      Computes the standard scores.
      static double stddev​(double[] array, boolean isSample)
      Returns the std deviation of the given array.
      static double stddev​(int[] array, boolean isSample)
      Returns the std deviation of the given array.
      static double stddev​(Number[] array, boolean isSample)
      Returns the std deviation of the given array.
      static gnu.trove.list.array.TIntArrayList subsample​(int num, double perc, long seed)
      Creates a random sub-sample of indices of a certain percentage using the specified number of entries.
      static gnu.trove.list.array.TIntArrayList subsample​(int num, double perc, RandomIntegerRangeGenerator generator)
      Creates a random sub-sample of indices of a certain percentage using the specified number of entries.
      static double sum​(double[] array)
      Returns sum of all the elements in the array.
      static double sum​(int[] array)
      Returns sum of all the elements in the array.
      static double sum​(Number[] array)
      Returns sum of all the elements in the array.
      static double sumOfSquares​(double[] array)
      Returns sum of all the squared elements in the array.
      static double sumOfSquares​(int[] array)
      Returns sum of all the squared elements in the array.
      static double sumOfSquares​(Number[] array)
      Returns sum of all the squared elements in the array.
      static byte[] toByteArray​(Number[] array)
      Turns the Number array into one consisting of primitive bytes.
      static byte[][] toByteMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive bytes.
      static double[] toDoubleArray​(Number[] array)
      Turns the Number array into one consisting of primitive doubles.
      static double[][] toDoubleMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive doubles.
      static float[] toFloatArray​(Number[] array)
      Turns the Number array into one consisting of primitive floats.
      static float[][] toFloatMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive floats.
      static int[] toIntArray​(Number[] array)
      Turns the Number array into one consisting of primitive ints.
      static int[][] toIntMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive ints.
      static long[] toLongArray​(Number[] array)
      Turns the Number array into one consisting of primitive longs.
      static long[][] toLongMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive longs.
      static Number[] toNumberArray​(byte[] array)
      Turns the byte array into a Byte array.
      static Number[] toNumberArray​(double[] array)
      Turns the double array into a Double array.
      static Number[] toNumberArray​(float[] array)
      Turns the float array into a Float array.
      static Number[] toNumberArray​(int[] array)
      Turns the int array into an Integer array.
      static Number[] toNumberArray​(long[] array)
      Turns the long array into a Long array.
      static Number[] toNumberArray​(short[] array)
      Turns the short array into a Short array.
      static Number[][] toNumberMatrix​(byte[][] matrix)
      Turns the primitive byte matrix into one consisting of Bytes.
      static Number[][] toNumberMatrix​(double[][] matrix)
      Turns the primitive double matrix into one consisting of Doubles.
      static Number[][] toNumberMatrix​(float[][] matrix)
      Turns the primitive float matrix into one consisting of Floats.
      static Number[][] toNumberMatrix​(int[][] matrix)
      Turns the primitive int matrix into one consisting of Integers.
      static Number[][] toNumberMatrix​(long[][] matrix)
      Turns the primitive long matrix into one consisting of Longs.
      static Number[][] toNumberMatrix​(short[][] matrix)
      Turns the primitive short matrix into one consisting of Shorts.
      static short[] toShortArray​(Number[] array)
      Turns the Number array into one consisting of primitive shorts.
      static short[][] toShortMatrix​(Number[][] matrix)
      Turns the Number matrix into one consisting of primitive shorts.
      static gnu.trove.map.TByteIntMap uniqueCounts​(byte[] numbers)
      Returns all the counts for the unique numbers in the array.
      static gnu.trove.map.TDoubleIntMap uniqueCounts​(double[] numbers)
      Returns all the counts for the unique numbers in the array.
      static gnu.trove.map.TFloatIntMap uniqueCounts​(float[] numbers)
      Returns all the counts for the unique numbers in the array.
      static gnu.trove.map.TIntIntMap uniqueCounts​(int[] numbers)
      Returns all the counts for the unique numbers in the array.
      static gnu.trove.map.TLongIntMap uniqueCounts​(long[] numbers)
      Returns all the counts for the unique numbers in the array.
      static gnu.trove.map.TShortIntMap uniqueCounts​(short[] numbers)
      Returns all the counts for the unique numbers in the array.
      static byte[] uniqueValues​(byte[] numbers)
      Returns all the unique numbers in the array.
      static double[] uniqueValues​(double[] numbers)
      Returns all the unique numbers in the array.
      static float[] uniqueValues​(float[] numbers)
      Returns all the unique numbers in the array.
      static int[] uniqueValues​(int[] numbers)
      Returns all the unique numbers in the array.
      static long[] uniqueValues​(long[] numbers)
      Returns all the unique numbers in the array.
      static short[] uniqueValues​(short[] numbers)
      Returns all the unique numbers in the array.
    • Field Detail

      • m_NormalDist

        protected static org.apache.commons.math3.distribution.NormalDistribution m_NormalDist
        The normal distribution used for computation.
    • Constructor Detail

      • StatUtils

        public StatUtils()
    • Method Detail

      • toNumberArray

        public static Number[] toNumberArray​(byte[] array)
        Turns the byte array into a Byte array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toNumberArray

        public static Number[] toNumberArray​(short[] array)
        Turns the short array into a Short array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toNumberArray

        public static Number[] toNumberArray​(int[] array)
        Turns the int array into an Integer array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toNumberArray

        public static Number[] toNumberArray​(long[] array)
        Turns the long array into a Long array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toNumberArray

        public static Number[] toNumberArray​(float[] array)
        Turns the float array into a Float array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toNumberArray

        public static Number[] toNumberArray​(double[] array)
        Turns the double array into a Double array.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toByteArray

        public static byte[] toByteArray​(Number[] array)
        Turns the Number array into one consisting of primitive bytes.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toShortArray

        public static short[] toShortArray​(Number[] array)
        Turns the Number array into one consisting of primitive shorts.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toIntArray

        public static int[] toIntArray​(Number[] array)
        Turns the Number array into one consisting of primitive ints.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toLongArray

        public static long[] toLongArray​(Number[] array)
        Turns the Number array into one consisting of primitive longs.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toFloatArray

        public static float[] toFloatArray​(Number[] array)
        Turns the Number array into one consisting of primitive floats.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
      • toDoubleArray

        public static double[] toDoubleArray​(Number[] array)
        Turns the Number array into one consisting of primitive doubles.
        Parameters:
        array - the array to convert
        Returns:
        the converted array
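        The conversions above can be pictured with a small standalone sketch. It is not the library's code: the class name NumberArraySketch is hypothetical, and the use of Integer.valueOf and Number.doubleValue() is an assumption about how such a conversion would typically be written.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of StatUtils.
final class NumberArraySketch {

  /** Boxes each int into an Integer held in a Number[]. */
  static Number[] toNumberArray(int[] array) {
    Number[] result = new Number[array.length];
    for (int i = 0; i < array.length; i++)
      result[i] = Integer.valueOf(array[i]);
    return result;
  }

  /** Unboxes each Number via doubleValue() (assumed conversion). */
  static double[] toDoubleArray(Number[] array) {
    double[] result = new double[array.length];
    for (int i = 0; i < array.length; i++)
      result[i] = array[i].doubleValue();
    return result;
  }

  public static void main(String[] args) {
    Number[] boxed = toNumberArray(new int[]{1, 2, 3});
    double[] doubles = toDoubleArray(boxed);
    System.out.println(Arrays.toString(doubles));  // prints [1.0, 2.0, 3.0]
  }
}
```

        Both directions allocate a new array, so the input is left untouched in this sketch.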
      • minIndex

        public static int minIndex​(Number[] array)
        Returns the (first occurrence of the) index of the cell with the smallest number. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
      • minIndex

        public static int minIndex​(int[] array)
        Returns the (first occurrence of the) index of the cell with the smallest int. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
      • minIndex

        public static int minIndex​(double[] array)
        Returns the (first occurrence of the) index of the cell with the smallest double. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
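        A minimal sketch of the documented contract (first occurrence wins, -1 for an empty array) might look as follows. The class name MinIndexSketch is hypothetical and the loop is an assumed implementation, not the library's source.

```java
// Hypothetical illustration class, not part of StatUtils.
final class MinIndexSketch {

  /** Index of the first smallest element, or -1 for an empty array. */
  static int minIndex(double[] array) {
    int result = -1;
    for (int i = 0; i < array.length; i++) {
      // Strict comparison keeps the first occurrence on ties.
      if (result == -1 || array[i] < array[result])
        result = i;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(minIndex(new double[]{3.0, 1.0, 2.0, 1.0}));  // prints 1
    System.out.println(minIndex(new double[0]));                     // prints -1
  }
}
```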
      • min

        public static Number min​(Number[] array)
        Returns the (first occurrence of the) smallest value in the given array. Null in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the smallest value
      • min

        public static int min​(int[] array)
        Returns the (first occurrence of the) smallest value in the given array. Integer.MIN_VALUE in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the smallest value
      • min

        public static double min​(double[] array)
        Returns the (first occurrence of the) smallest value in the given array. -Double.MAX_VALUE in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the smallest value
      • maxIndex

        public static int maxIndex​(Number[] array)
        Returns the (first occurrence of the) index of the cell with the biggest number. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
      • maxIndex

        public static int maxIndex​(int[] array)
        Returns the (first occurrence of the) index of the cell with the biggest int. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
      • maxIndex

        public static int maxIndex​(double[] array)
        Returns the (first occurrence of the) index of the cell with the biggest double. -1 in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the index
      • max

        public static Number max​(Number[] array)
        Returns the (first occurrence of the) biggest value in the given array. Null in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the biggest value
      • max

        public static int max​(int[] array)
        Returns the (first occurrence of the) biggest value in the given array. Integer.MAX_VALUE in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the biggest value
      • max

        public static double max​(double[] array)
        Returns the (first occurrence of the) biggest value in the given array. Double.MAX_VALUE in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the biggest value
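        The Number[] overload is documented to return null for an empty array, while the primitive overloads return a sentinel value. The sketch below illustrates only the Number[] behavior. The class name MaxSketch is hypothetical, and comparing elements by doubleValue() is an assumption, not something this page states.

```java
// Hypothetical illustration class, not part of StatUtils.
final class MaxSketch {

  /** First largest element, or null for an empty array. */
  static Number max(Number[] array) {
    Number result = null;
    for (Number n : array) {
      // Assumption: elements are compared by their double value.
      if (result == null || n.doubleValue() > result.doubleValue())
        result = n;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(max(new Number[]{2, 7.5, 3L}));  // prints 7.5
    System.out.println(max(new Number[0]));             // prints null
  }
}
```

        Because the primitive overloads cannot return null, checking array.length before calling them is one simple way to tell an empty input apart from a genuine result.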
      • mean

        public static double mean​(Number[] array)
        Returns the mean of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the mean
      • mean

        public static double mean​(int[] array)
        Returns the mean of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the mean
      • mean

        public static double mean​(double[] array)
        Returns the mean of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the mean
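        As a rough illustration of the NaN convention for empty input, here is a standalone arithmetic mean. The class name MeanSketch is hypothetical and the code is a sketch of the documented behavior, not the library's implementation.

```java
// Hypothetical illustration class, not part of StatUtils.
final class MeanSketch {

  /** Arithmetic mean, or NaN for an empty array. */
  static double mean(double[] array) {
    if (array.length == 0)
      return Double.NaN;
    double sum = 0.0;
    for (double v : array)
      sum += v;
    return sum / array.length;
  }

  public static void main(String[] args) {
    System.out.println(mean(new double[]{1, 2, 3, 4}));  // prints 2.5
    System.out.println(mean(new double[0]));             // prints NaN
  }
}
```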
      • iqr

        public static double iqr​(double[] array)
        Returns the interquartile range (IQR) of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the iqr
      • iqr

        public static double iqr​(int[] array)
        Returns the interquartile range (IQR) of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the iqr
      • iqr

        public static double iqr​(Number[] array)
        Returns the interquartile range (IQR) of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the iqr
      • quartile

        public static double quartile​(double[] array,
                                      double quartile)
        Returns the quartile of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        quartile - the quartile to return (0-1)
        Returns:
        the quartile
      • quartile

        public static double quartile​(int[] array,
                                      double quartile)
        Returns the quartile of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        quartile - the quartile to return (0-1)
        Returns:
        the quartile
      • quartile

        public static double quartile​(Number[] array,
                                      double quartile)
        Returns the quartile of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        quartile - the quartile to return (0-1)
        Returns:
        the quartile
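        This page does not say which quantile definition the library uses. The sketch below assumes linear interpolation between neighboring order statistics, which is one common convention, and derives the interquartile range from it as the 0.75 quartile minus the 0.25 quartile. The class name QuartileSketch is hypothetical, and results from the real methods may differ if they use another convention.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of StatUtils.
final class QuartileSketch {

  /** Quantile for q in [0, 1] by linear interpolation (assumed convention). */
  static double quartile(double[] array, double q) {
    if (array.length == 0)
      return Double.NaN;
    double[] sorted = array.clone();  // leave the caller's array untouched
    Arrays.sort(sorted);
    double pos = q * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = (int) Math.ceil(pos);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  /** Interquartile range: third quartile minus first quartile. */
  static double iqr(double[] array) {
    return quartile(array, 0.75) - quartile(array, 0.25);
  }

  public static void main(String[] args) {
    double[] data = {5, 1, 4, 2, 3};
    System.out.println(quartile(data, 0.25));  // prints 2.0
    System.out.println(iqr(data));             // prints 2.0
  }
}
```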
      • median

        public static double median​(Number[] array)
        Returns the median of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the median
      • median

        public static double median​(int[] array)
        Returns the median of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the median
      • median

        public static double median​(double[] array)
        Returns the median of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        Returns:
        the median
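        For reference, the textbook median can be sketched as below: the middle element of a sorted copy, or the average of the two middle elements for an even length. The class name MedianSketch is hypothetical, and whether the library averages the two middle values in the even case is an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of StatUtils.
final class MedianSketch {

  /** Median of the array, or NaN for an empty array. */
  static double median(double[] array) {
    if (array.length == 0)
      return Double.NaN;
    double[] sorted = array.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    if (sorted.length % 2 == 1)
      return sorted[mid];
    // Even length: average the two middle values (assumed convention).
    return (sorted[mid - 1] + sorted[mid]) / 2.0;
  }

  public static void main(String[] args) {
    System.out.println(median(new double[]{5, 1, 3}));     // prints 3.0
    System.out.println(median(new double[]{4, 1, 3, 2}));  // prints 2.5
  }
}
```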
      • stddev

        public static double stddev​(Number[] array,
                                    boolean isSample)
        Returns the std deviation of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the std deviation
      • stddev

        public static double stddev​(int[] array,
                                    boolean isSample)
        Returns the std deviation of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the std deviation
      • stddev

        public static double stddev​(double[] array,
                                    boolean isSample)
        Returns the std deviation of the given array. NaN is returned in case of zero-length arrays.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the std deviation
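        The effect of the isSample flag (dividing by n-1 instead of n) can be seen in a small standalone sketch. The class name StdDevSketch is hypothetical and the two-pass computation is an assumed implementation, not the library's source.

```java
// Hypothetical illustration class, not part of StatUtils.
final class StdDevSketch {

  /** Standard deviation; divides by n-1 if isSample, otherwise by n. */
  static double stddev(double[] array, boolean isSample) {
    int n = array.length;
    if (n == 0)
      return Double.NaN;
    double mean = 0.0;
    for (double v : array)
      mean += v;
    mean /= n;
    double squares = 0.0;
    for (double v : array)
      squares += (v - mean) * (v - mean);
    // For n == 1 with isSample this divides 0.0 by 0 and yields NaN.
    return Math.sqrt(squares / (isSample ? n - 1 : n));
  }

  public static void main(String[] args) {
    double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
    System.out.println(stddev(data, false));  // prints 2.0
    System.out.println(stddev(data, true));   // slightly larger, sqrt(32/7)
  }
}
```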
      • normalize

        public static Double[] normalize​(Number[] array)
        Normalizes the given array (returns a copy), i.e., the array will sum up to 1. In case of a sum of 0, it returns null.
        Parameters:
        array - the array to work on
        Returns:
        the normalized array
      • normalize

        public static double[] normalize​(int[] array)
        Normalizes the given array (returns a copy).
        Parameters:
        array - the array to work on
        Returns:
        the normalized array
      • normalize

        public static double[] normalize​(double[] array)
        Normalizes the given array (returns a copy).
        Parameters:
        array - the array to work on
        Returns:
        the normalized array
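        Normalization in the sense documented for the Number[] overload (values sum to 1, null when the sum is 0) can be sketched as follows. The class name NormalizeSketch is hypothetical, and applying the same null rule to a double[] input is an assumption, since this page only states it for Number[].

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of StatUtils.
final class NormalizeSketch {

  /** Returns a copy scaled to sum to 1, or null if the sum is 0. */
  static double[] normalize(double[] array) {
    double sum = 0.0;
    for (double v : array)
      sum += v;
    if (sum == 0.0)
      return null;  // assumed to mirror the documented Number[] behavior
    double[] result = new double[array.length];
    for (int i = 0; i < array.length; i++)
      result[i] = array[i] / sum;
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(normalize(new double[]{1, 1, 2})));
    // prints [0.25, 0.25, 0.5]
  }
}
```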
      • normalizeRange

        public static Double[] normalizeRange​(Number[] array,
                                              double lower,
                                              double upper)
        Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
        Parameters:
        array - the array to work on
        lower - the lower bound
        upper - the upper bound
        Returns:
        the normalized array, null if failed to determine min/max or range is zero
      • normalizeRange

        public static double[] normalizeRange​(int[] array,
                                              double lower,
                                              double upper)
        Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
        Parameters:
        array - the array to work on
        lower - the lower bound
        upper - the upper bound
        Returns:
        the normalized array, null if failed to determine min/max or range is zero
      • normalizeRange

        public static double[] normalizeRange​(double[] array,
                                              double lower,
                                              double upper)
        Normalizes the given array (returns a copy), to have its values range from lower to upper bound.
        Parameters:
        array - the array to work on
        lower - the lower bound
        upper - the upper bound
        Returns:
        the normalized array, null if failed to determine min/max or range is zero
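        Range normalization is commonly implemented as min-max scaling. The sketch below assumes that formula and reads "range is zero" as the data range (max minus min) being zero. The class name NormalizeRangeSketch is hypothetical, and the library may treat edge cases differently.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of StatUtils.
final class NormalizeRangeSketch {

  /** Min-max scales a copy into [lower, upper]; null if that is impossible. */
  static double[] normalizeRange(double[] array, double lower, double upper) {
    if (array.length == 0)
      return null;  // no min/max can be determined
    double min = array[0];
    double max = array[0];
    for (double v : array) {
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    double range = max - min;
    if (range == 0.0)
      return null;  // all values equal, scaling is undefined
    double[] result = new double[array.length];
    for (int i = 0; i < array.length; i++)
      result[i] = lower + (array[i] - min) / range * (upper - lower);
    return result;
  }

  public static void main(String[] args) {
    double[] data = {10, 20, 30};
    System.out.println(Arrays.toString(normalizeRange(data, -1, 1)));
    // prints [-1.0, 0.0, 1.0]
  }
}
```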
      • standardize

        public static Double[] standardize​(Number[] array,
                                           boolean isSample)
        Standardizes the given array (returns a copy). Returns null if the standard deviation is zero and data cannot be standardized.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standardized array
      • standardize

        public static double[] standardize​(double[] array,
                                           boolean isSample)
        Standardizes the given array (returns a copy). Returns null if the standard deviation is zero and data cannot be standardized.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standardized array
      • standardize

        public static double[] standardize​(int[] array,
                                           boolean isSample)
        Standardizes the given array (returns a copy). Returns null if the standard deviation is zero and data cannot be standardized.
        Parameters:
        array - the array to work on
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standardized array
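        Standardization is usually understood as subtracting the mean and dividing by the standard deviation. The sketch below assumes that definition. The class name StandardizeSketch is hypothetical, and returning null for empty input (where the standard deviation is NaN) is this sketch's own choice, not something the page documents.

```java
// Hypothetical illustration class, not part of StatUtils.
final class StandardizeSketch {

  /** Returns (x - mean) / stddev for each element, or null if not possible. */
  static double[] standardize(double[] array, boolean isSample) {
    int n = array.length;
    double mean = 0.0;
    for (double v : array)
      mean += v;
    mean /= n;
    double squares = 0.0;
    for (double v : array)
      squares += (v - mean) * (v - mean);
    double sd = Math.sqrt(squares / (isSample ? n - 1 : n));
    // Covers a zero deviation and also NaN (e.g. empty input, sketch's choice).
    if (!(sd > 0.0))
      return null;
    double[] result = new double[n];
    for (int i = 0; i < n; i++)
      result[i] = (array[i] - mean) / sd;
    return result;
  }

  public static void main(String[] args) {
    double[] z = standardize(new double[]{2, 4, 4, 4, 5, 5, 7, 9}, false);
    System.out.println(z[0]);  // prints -1.5
    System.out.println(standardize(new double[]{3, 3, 3}, false));  // prints null
  }
}
```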
      • sum

        public static double sum​(Number[] array)
        Returns sum of all the elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum
      • sum

        public static double sum​(int[] array)
        Returns sum of all the elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum
      • sum

        public static double sum​(double[] array)
        Returns sum of all the elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum
      • sumOfSquares

        public static double sumOfSquares​(Number[] array)
        Returns sum of all the squared elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum of squares
      • sumOfSquares

        public static double sumOfSquares​(int[] array)
        Returns sum of all the squared elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum of squares
      • sumOfSquares

        public static double sumOfSquares​(double[] array)
        Returns sum of all the squared elements in the array.
        Parameters:
        array - the array to work on
        Returns:
        the sum of squares
      • sort

        public static Number[] sort​(Number[] array)
        Returns a sorted copy of the array (ascending).
        Parameters:
        array - the array to sort
        Returns:
        the sorted array
      • sort

        public static int[] sort​(int[] array)
        Returns a sorted copy of the array (ascending).
        Parameters:
        array - the array to sort
        Returns:
        the sorted array
      • sort

        public static double[] sort​(double[] array)
        Returns a sorted copy of the array (ascending).
        Parameters:
        array - the array to sort
        Returns:
        the sorted array
      • sort

        public static Number[] sort​(Number[] array,
                                    boolean asc)
        Returns a sorted copy of the array (ascending or descending).
        Parameters:
        array - the array to sort
        asc - if true then the data gets sorted in ascending manner, otherwise in descending manner
        Returns:
        the sorted array
      • sort

        public static int[] sort​(int[] array,
                                 boolean asc)
        Returns a sorted copy of the array (ascending or descending).
        Parameters:
        array - the array to sort
        asc - if true then the data gets sorted in ascending manner, otherwise in descending manner
        Returns:
        the sorted array
      • sort

        public static double[] sort​(double[] array,
                                    boolean asc)
        Returns a sorted copy of the array (ascending or descending).
        Parameters:
        array - the array to sort
        asc - if true then the data gets sorted in ascending manner, otherwise in descending manner
        Returns:
        the sorted array
      • findFirst

        public static int findFirst​(Number[] array,
                                    Number toFind)
        Returns the index of the first occurrence of the given number in the array, or -1 if it is not found.
        Parameters:
        array - the array to search
        toFind - the number to find
        Returns:
        the index
      • findFirst

        public static int findFirst​(int[] array,
                                    int toFind)
        Returns the index of the first occurrence of the given integer in the array, or -1 if it is not found.
        Parameters:
        array - the array to search
        toFind - the integer to find
        Returns:
        the index
      • findFirst

        public static int findFirst​(double[] array,
                                    double toFind)
        Returns the index of the first occurrence of the given double in the array, or -1 if it is not found.
        Parameters:
        array - the array to search
        toFind - the double to find
        Returns:
        the index
      • findClosest

        public static int findClosest​(Number[] array,
                                      Number toFind)
        Returns the index of the array element that is closest to the given number.
        Parameters:
        array - the array to search
        toFind - the number to find
        Returns:
        the index
      • findClosest

        public static int findClosest​(int[] array,
                                      int toFind)
        Returns the index of the array element that is closest to the given integer.
        Parameters:
        array - the array to search
        toFind - the integer to find
        Returns:
        the index
      • findClosest

        public static int findClosest​(double[] array,
                                      double toFind)
        Returns the index of the array element that is closest to the given double.
        Parameters:
        array - the array to search
        toFind - the double to find
        Returns:
        the index
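        To make the difference between findFirst and findClosest concrete, here is a minimal sketch of both lookups as a linear scan. It is not the StatUtils source; the class name FindSketch is hypothetical, and the tie-breaking rule (first closest element wins) and the -1 result of findClosest for an empty array are assumptions.

```java
final class FindSketch {

  /** Index of the first element equal to toFind, or -1 if there is none. */
  static int findFirst(double[] array, double toFind) {
    for (int i = 0; i < array.length; i++) {
      if (array[i] == toFind)
        return i;
    }
    return -1;
  }

  /** Index of the element with the smallest absolute distance to toFind. */
  static int findClosest(double[] array, double toFind) {
    int result = -1;
    double best = Double.POSITIVE_INFINITY;
    for (int i = 0; i < array.length; i++) {
      double dist = Math.abs(array[i] - toFind);
      if (dist < best) {  // strict comparison: the first closest element wins ties
        best = dist;
        result = i;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    double[] data = {1.0, 4.0, 9.0, 4.0};
    System.out.println(findFirst(data, 4.0));
    System.out.println(findFirst(data, 5.0));
    System.out.println(findClosest(data, 5.0));
  }
}
```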
      • correlationCoefficient

        public static double correlationCoefficient​(double[] y1,
                                                    double[] y2)
        Computes the correlation coefficient between the two data vectors and returns it.
        Parameters:
        y1 - the first data array
        y2 - the second data array
        Returns:
        the computed correlation
      • correlationCoefficient

        public static double correlationCoefficient​(Number[] y1,
                                                    Number[] y2)
        Computes the correlation coefficient between the two data vectors and returns it.
        Parameters:
        y1 - the first data array
        y2 - the second data array
        Returns:
        the computed correlation
      • rSquared

        public static double rSquared​(double[] actual,
                                      double[] predicted)
        Computes the R^2 between the two data vectors and returns it.
        Parameters:
        actual - the first data array
        predicted - the second data array
        Returns:
        the computed R^2
      • rSquared

        public static double rSquared​(Number[] actual,
                                      Number[] predicted)
        Computes the R^2 between the two data vectors and returns it. https://en.wikipedia.org/wiki/Coefficient_of_determination
        Parameters:
        actual - the first data array
        predicted - the second data array
        Returns:
        the computed R^2
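        The linked Wikipedia article defines the coefficient of determination as 1 - SS_res / SS_tot. The sketch below follows that definition; whether StatUtils uses this form or another one (for example the squared correlation coefficient) is not stated here, so treat it as an assumption. The class name RSquaredSketch is hypothetical.

```java
final class RSquaredSketch {

  /** R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of the actual values. */
  static double rSquared(double[] actual, double[] predicted) {
    if (actual.length != predicted.length)
      throw new IllegalArgumentException("arrays must have the same length");

    double mean = 0.0;
    for (double a : actual)
      mean += a;
    mean /= actual.length;

    double ssRes = 0.0;  // residual sum of squares
    double ssTot = 0.0;  // total sum of squares
    for (int i = 0; i < actual.length; i++) {
      ssRes += (actual[i] - predicted[i]) * (actual[i] - predicted[i]);
      ssTot += (actual[i] - mean) * (actual[i] - mean);
    }
    // not a finite number if the actual values are all identical (ssTot == 0)
    return 1.0 - ssRes / ssTot;
  }

  public static void main(String[] args) {
    System.out.println(rSquared(new double[]{1, 2, 3}, new double[]{1, 2, 3}));
    System.out.println(rSquared(new double[]{1, 2, 3}, new double[]{2, 2, 2}));
  }
}
```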
      • covariance

        public static double covariance​(double[] x,
                                        double[] y)
        Computes the covariance between the two data vectors and returns it. Cov(X,Y) = Sum((Xi-Xbar)*(Yi-Ybar)) / n with n = length of vectors, Xi the ith element of X, Yi the ith element of Y, Xbar the mean of X, Ybar the mean of Y.
        Parameters:
        x - the first data array
        y - the second data array
        Returns:
        the computed covariance
      • covariance

        public static double covariance​(Number[] x,
                                        Number[] y)
        Computes the covariance between the two data vectors and returns it. Cov(X,Y) = Sum((Xi-Xbar)*(Yi-Ybar)) / n with n = length of vectors, Xi the ith element of X, Yi the ith element of Y, Xbar the mean of X, Ybar the mean of Y.
        Parameters:
        x - the first data array
        y - the second data array
        Returns:
        the computed covariance
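        The formula given for covariance, Cov(X,Y) = Sum((Xi-Xbar)*(Yi-Ybar)) / n, translates directly into code. This is a minimal sketch of that formula (population form, dividing by n); the class name CovarianceSketch and the length check are illustrative assumptions.

```java
final class CovarianceSketch {

  /** Cov(X,Y) = Sum((Xi - Xbar) * (Yi - Ybar)) / n */
  static double covariance(double[] x, double[] y) {
    if (x.length != y.length)
      throw new IllegalArgumentException("arrays must have the same length");

    int n = x.length;
    double meanX = 0.0;
    double meanY = 0.0;
    for (int i = 0; i < n; i++) {
      meanX += x[i];
      meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    double sum = 0.0;
    for (int i = 0; i < n; i++)
      sum += (x[i] - meanX) * (y[i] - meanY);
    return sum / n;
  }

  public static void main(String[] args) {
    System.out.println(covariance(new double[]{1, 2, 3}, new double[]{2, 4, 6}));
  }
}
```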
      • rmse

        public static double rmse​(double[] actual,
                                  double[] predicted)
        Computes the root mean squared error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rmse
      • rmse

        public static double rmse​(Number[] actual,
                                  Number[] predicted)
        Computes the root mean squared error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rmse
      • mae

        public static double mae​(double[] actual,
                                 double[] predicted)
        Computes the mean absolute error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the mae
      • mae

        public static double mae​(Number[] actual,
                                 Number[] predicted)
        Computes the mean absolute error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the mae
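        For the rmse and mae entries above, the following sketch uses the textbook definitions: RMSE = sqrt(mean((actual - predicted)^2)) and MAE = mean(|actual - predicted|). It is not taken from the StatUtils source; the class name ErrorMetricsSketch and the length check are assumptions.

```java
final class ErrorMetricsSketch {

  /** Root mean squared error: sqrt(mean((actual - predicted)^2)). */
  static double rmse(double[] actual, double[] predicted) {
    checkLengths(actual, predicted);
    double sum = 0.0;
    for (int i = 0; i < actual.length; i++) {
      double diff = actual[i] - predicted[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum / actual.length);
  }

  /** Mean absolute error: mean(|actual - predicted|). */
  static double mae(double[] actual, double[] predicted) {
    checkLengths(actual, predicted);
    double sum = 0.0;
    for (int i = 0; i < actual.length; i++)
      sum += Math.abs(actual[i] - predicted[i]);
    return sum / actual.length;
  }

  private static void checkLengths(double[] a, double[] b) {
    if (a.length != b.length)
      throw new IllegalArgumentException("arrays must have the same length");
  }

  public static void main(String[] args) {
    double[] actual = {1, 2, 3};
    double[] predicted = {2, 2, 5};
    System.out.println(rmse(actual, predicted));
    System.out.println(mae(actual, predicted));
  }
}
```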
      • rae

        public static double rae​(double[] actual,
                                 double[] predicted)
        Computes the relative absolute error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rae
      • rae

        public static double rae​(Number[] actual,
                                 Number[] predicted)
        Computes the relative absolute error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rae
      • rrse

        public static double rrse​(double[] actual,
                                  double[] predicted)
        Computes the root relative squared error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rrse
      • rrse

        public static double rrse​(Number[] actual,
                                  Number[] predicted)
        Computes the root relative squared error between the two data vectors and returns it.
        Parameters:
        actual - the first data array (the actual values)
        predicted - the second data array (the predicted values)
        Returns:
        the rrse
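        The rae and rrse entries do not spell out their formulas. A common definition compares the model's error with that of a baseline that always predicts the mean of the actual values; the sketch below assumes that definition and returns plain ratios. Whether StatUtils uses the same baseline, or scales the result to a percentage, is not stated here. The class name RelativeErrorSketch is hypothetical.

```java
final class RelativeErrorSketch {

  private static double mean(double[] values) {
    double sum = 0.0;
    for (double v : values)
      sum += v;
    return sum / values.length;
  }

  /** Relative absolute error: sum(|actual - predicted|) / sum(|actual - mean(actual)|). */
  static double rae(double[] actual, double[] predicted) {
    double mean = mean(actual);
    double num = 0.0;
    double denom = 0.0;
    for (int i = 0; i < actual.length; i++) {
      num += Math.abs(actual[i] - predicted[i]);
      denom += Math.abs(actual[i] - mean);
    }
    return num / denom;
  }

  /** Root relative squared error: sqrt(sum((actual - predicted)^2) / sum((actual - mean(actual))^2)). */
  static double rrse(double[] actual, double[] predicted) {
    double mean = mean(actual);
    double num = 0.0;
    double denom = 0.0;
    for (int i = 0; i < actual.length; i++) {
      num += (actual[i] - predicted[i]) * (actual[i] - predicted[i]);
      denom += (actual[i] - mean) * (actual[i] - mean);
    }
    return Math.sqrt(num / denom);
  }

  public static void main(String[] args) {
    double[] actual = {1, 2, 3};
    double[] predicted = {2, 2, 5};
    System.out.println(rae(actual, predicted));
    System.out.println(rrse(actual, predicted));
  }
}
```

        Under this definition a value below 1 means the predictions beat the mean baseline, and a value above 1 means they are worse.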
      • standardScores

        public static double[] standardScores​(double[] x,
                                              boolean isSample)
        Computes the standard scores for the array.
        Parameters:
        x - the data array
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standard scores
      • standardScores

        public static double[] standardScores​(Number[] x,
                                              boolean isSample)
        Computes the standard scores for the array.
        Parameters:
        x - the data array
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standard scores
      • standardScores

        public static double[] standardScores​(double[] actual,
                                              double[] predicted,
                                              boolean isSample)
        Computes the standard scores. The mean/stdev are determined from the first array (actual) and the z-scores are produced for the second one (predicted).
        Parameters:
        actual - the first data array (used to determine mean and standard deviation)
        predicted - the second data array (the values to compute the z-scores for)
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standard scores
      • standardScores

        public static double[] standardScores​(Number[] actual,
                                              Number[] predicted,
                                              boolean isSample)
        Computes the standard scores. The mean/stdev are determined from the first array (actual) and the z-scores are produced for the second one (predicted).
        Parameters:
        actual - the first data array (used to determine mean and standard deviation)
        predicted - the second data array (the values to compute the z-scores for)
        isSample - if true, then the sample standard deviation instead of the population standard deviation is calculated (using n-1 instead of n).
        Returns:
        the standard scores
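        The two-array variant of standardScores takes mean and standard deviation from one array and applies them to another. A minimal sketch of that idea follows; it is not the StatUtils source, the class name StandardScoresSketch is hypothetical, and no special handling of a zero standard deviation is included.

```java
import java.util.Arrays;

final class StandardScoresSketch {

  /** z-scores of predicted, using mean and standard deviation of actual. */
  static double[] standardScores(double[] actual, double[] predicted, boolean isSample) {
    int n = actual.length;
    double mean = 0.0;
    for (double a : actual)
      mean += a;
    mean /= n;

    double sumSq = 0.0;
    for (double a : actual)
      sumSq += (a - mean) * (a - mean);
    // n-1 for the sample standard deviation, n for the population one
    double stdev = Math.sqrt(sumSq / (isSample ? n - 1 : n));

    double[] result = new double[predicted.length];
    for (int i = 0; i < predicted.length; i++)
      result[i] = (predicted[i] - mean) / stdev;
    return result;
  }

  public static void main(String[] args) {
    double[] scores = standardScores(new double[]{1, 2, 3}, new double[]{2, 4}, true);
    System.out.println(Arrays.toString(scores));
  }
}
```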
      • signalToNoiseRatio

        public static double signalToNoiseRatio​(Number[] x)
        Calculates the signal/noise ratio.

        For more details, see Signal-to-noise ratio.
        Parameters:
        x - the input data to calculate the ratio for
        Returns:
        the ratio
      • signalToNoiseRatio

        public static double signalToNoiseRatio​(double[] x)
        Calculates the signal/noise ratio.

        For more details, see Signal-to-noise ratio.
        Parameters:
        x - the input data to calculate the ratio for
        Returns:
        the ratio
      • rowNorm

        public static double[] rowNorm​(Number[] x)
        Applies row-wise normalization to the data.
        Parameters:
        x - the input data
        Returns:
        the normalized data
      • rowNorm

        public static double[] rowNorm​(double[] x)
        Applies row-wise normalization to the data.
        Parameters:
        x - the input data
        Returns:
        the normalized data
      • normalProbability

        public static double normalProbability​(double a)
        Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to x (assumes mean is zero, variance is one). Here x denotes the argument a.

        \[
          \mathrm{normal}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{t^2}{2}\right) dt
                             = \frac{1 + \mathrm{erf}(z)}{2}
                             = \frac{\mathrm{erfc}(-z)}{2}
        \]

        where z = x/sqrt(2). Computation is via the functions errorFunction and errorFunctionComplement.
        Parameters:
        a - the z-value
        Returns:
        the cumulative probability of the z-value under the standard normal distribution
      • normalInverse

        public static double normalInverse​(double y0)
        Returns the value, x, for which the area under the Normal (Gaussian) probability density function (integrated from minus infinity to x) is equal to the argument y (assumes mean is zero, variance is one).

        For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 R(w**2)/S(w**2).

        Parameters:
        y0 - the area under the normal pdf
        Returns:
        the z-value
      • linearRegression

        public static double[] linearRegression​(Number[] x,
                                                Number[] y)
        Calculates the slope and intercept between the two arrays.
        Parameters:
        x - the first array, representing the X values
        y - the second array, representing the Y values
        Returns:
        intercept/slope
      • linearRegression

        public static double[] linearRegression​(double[] x,
                                                double[] y)
        Calculates the slope and intercept between the two arrays.
        Parameters:
        x - the first array, representing the X values
        y - the second array, representing the Y values
        Returns:
        intercept/slope
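        As an illustration of computing slope and intercept from two arrays, here is a minimal ordinary least squares sketch. That linearRegression uses ordinary least squares, and that the result is ordered as {intercept, slope}, are assumptions based on the "intercept/slope" return description; the class name LinearRegressionSketch is hypothetical.

```java
import java.util.Arrays;

final class LinearRegressionSketch {

  /** Ordinary least squares fit of y = intercept + slope * x; returns {intercept, slope}. */
  static double[] linearRegression(double[] x, double[] y) {
    if (x.length != y.length)
      throw new IllegalArgumentException("arrays must have the same length");

    int n = x.length;
    double meanX = 0.0;
    double meanY = 0.0;
    for (int i = 0; i < n; i++) {
      meanX += x[i];
      meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxy = 0.0;  // sum of (x - meanX) * (y - meanY)
    double sxx = 0.0;  // sum of (x - meanX)^2
    for (int i = 0; i < n; i++) {
      double dx = x[i] - meanX;
      sxy += dx * (y[i] - meanY);
      sxx += dx * dx;
    }
    double slope = sxy / sxx;  // not finite if all x values are identical
    double intercept = meanY - slope * meanX;
    return new double[]{intercept, slope};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(linearRegression(new double[]{1, 2, 3}, new double[]{3, 5, 7})));
  }
}
```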
      • kendallTheil

        public static double[] kendallTheil​(double[] x,
                                            double[] y)
        Computes the Kendall-Theil robust regression of the given data points, also called the Theil-Sen estimator.
        Parameters:
        x - the x coordinates
        y - the y coordinates
        Returns:
        intercept/slope
      • kendallTheil

        public static double[] kendallTheil​(Number[] x,
                                            Number[] y)
        Computes the Kendall-Theil robust regression of the given data points, also called the Theil-Sen estimator.
        Parameters:
        x - the x coordinates
        y - the y coordinates
        Returns:
        intercept/slope
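        The Theil-Sen estimator takes the slope as the median of the slopes between all pairs of points, which makes it robust against outliers. The sketch below shows one common variant, with the intercept chosen as the median of y - slope * x; the exact variant used by kendallTheil is not documented here, so this is an assumption, as are the class name TheilSenSketch and the {intercept, slope} ordering.

```java
import java.util.Arrays;

final class TheilSenSketch {

  /** Median of the first count entries of values (the input is not modified). */
  static double median(double[] values, int count) {
    double[] sorted = Arrays.copyOf(values, count);
    Arrays.sort(sorted);
    int mid = count / 2;
    return (count % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }

  /** Returns {intercept, slope}. */
  static double[] theilSen(double[] x, double[] y) {
    if (x.length != y.length)
      throw new IllegalArgumentException("arrays must have the same length");

    int n = x.length;
    double[] slopes = new double[n * (n - 1) / 2];
    int count = 0;
    for (int i = 0; i < n; i++) {
      for (int j = i + 1; j < n; j++) {
        if (x[i] != x[j])  // pairs with identical x have no defined slope
          slopes[count++] = (y[j] - y[i]) / (x[j] - x[i]);
      }
    }
    if (count == 0)
      throw new IllegalArgumentException("need at least two distinct x values");
    double slope = median(slopes, count);

    double[] intercepts = new double[n];
    for (int i = 0; i < n; i++)
      intercepts[i] = y[i] - slope * x[i];
    return new double[]{median(intercepts, n), slope};
  }

  public static void main(String[] args) {
    // the last point is an outlier; the other four lie on y = 1 + 2x
    double[] x = {1, 2, 3, 4, 5};
    double[] y = {3, 5, 7, 9, 100};
    System.out.println(Arrays.toString(theilSen(x, y)));
  }
}
```

        In the usage example the single outlier barely influences the fit, whereas an ordinary least squares fit on the same data would be pulled strongly towards it.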
      • flatten

        public static byte[] flatten​(byte[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static short[] flatten​(short[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static int[] flatten​(int[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static long[] flatten​(long[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static float[] flatten​(float[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static double[] flatten​(double[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
      • flatten

        public static Number[] flatten​(Number[][] matrix)
        Converts the matrix into a flat array, row after row.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the matrix as array (row after row)
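        "Row after row" means row-major order: all elements of the first row, then all elements of the second row, and so on. A minimal sketch for double matrices follows; the class name FlattenSketch is hypothetical, and support for rows of different lengths is an assumption, not something the flatten entries state.

```java
import java.util.Arrays;

final class FlattenSketch {

  /** Copies the rows of the matrix one after another into a single array. */
  static double[] flatten(double[][] matrix) {
    int total = 0;
    for (double[] row : matrix)
      total += row.length;

    double[] result = new double[total];
    int pos = 0;
    for (double[] row : matrix) {
      System.arraycopy(row, 0, result, pos, row.length);
      pos += row.length;
    }
    return result;
  }

  public static void main(String[] args) {
    double[][] matrix = {{1, 2, 3}, {4, 5, 6}};
    System.out.println(Arrays.toString(flatten(matrix)));
  }
}
```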
      • toByteMatrix

        public static byte[][] toByteMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive bytes.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toShortMatrix

        public static short[][] toShortMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive shorts.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toIntMatrix

        public static int[][] toIntMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive ints.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toLongMatrix

        public static long[][] toLongMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive longs.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toFloatMatrix

        public static float[][] toFloatMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive floats.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toDoubleMatrix

        public static double[][] toDoubleMatrix​(Number[][] matrix)
        Turns the Number matrix into one consisting of primitive doubles.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(byte[][] matrix)
        Turns the primitive byte matrix into one consisting of Bytes.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(short[][] matrix)
        Turns the primitive short matrix into one consisting of Shorts.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(int[][] matrix)
        Turns the primitive int matrix into one consisting of Integers.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(long[][] matrix)
        Turns the primitive long matrix into one consisting of Longs.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(float[][] matrix)
        Turns the primitive float matrix into one consisting of Floats.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • toNumberMatrix

        public static Number[][] toNumberMatrix​(double[][] matrix)
        Turns the primitive double matrix into one consisting of Doubles.
        Parameters:
        matrix - the matrix to convert
        Returns:
        the converted matrix
      • subsample

        public static gnu.trove.list.array.TIntArrayList subsample​(int num,
                                                                   double perc,
                                                                   long seed)
        Creates a random subsample of indices, sized as a percentage of the specified number of entries. Uses JavaRandomInt.
        Parameters:
        num - the maximum number of indices
        perc - the size of the subsample (0-1)
        seed - the seed value for JavaRandomInt
        Returns:
        the subsample of indices (chosen from 0 to num-1)
      • subsample

        public static gnu.trove.list.array.TIntArrayList subsample​(int num,
                                                                   double perc,
                                                                   RandomIntegerRangeGenerator generator)
        Creates a random subsample of indices, sized as a percentage of the specified number of entries.
        Parameters:
        num - the maximum number of indices
        perc - the size of the subsample (0-1)
        generator - the random int generator to use
        Returns:
        the subsample of indices (chosen from 0 to num-1)
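        To illustrate drawing a seeded subsample of indices from 0 to num-1, here is a sketch using only the JDK. It deliberately swaps in java.util.Random and a plain int[] for JavaRandomInt and TIntArrayList, and it uses a partial Fisher-Yates shuffle; how subsample actually selects indices, rounds the size, or orders the result is not documented here, so all of that is an assumption. The class name SubsampleSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class SubsampleSketch {

  /** Draws round(num * perc) distinct indices from 0..num-1, reproducibly for a given seed. */
  static int[] subsample(int num, double perc, long seed) {
    if (perc < 0.0 || perc > 1.0)
      throw new IllegalArgumentException("perc must be between 0 and 1");

    int size = (int) Math.round(num * perc);
    int[] indices = new int[num];
    for (int i = 0; i < num; i++)
      indices[i] = i;

    // partial Fisher-Yates shuffle: only the first size positions are needed
    Random rand = new Random(seed);
    for (int i = 0; i < size; i++) {
      int j = i + rand.nextInt(num - i);
      int tmp = indices[i];
      indices[i] = indices[j];
      indices[j] = tmp;
    }
    return Arrays.copyOf(indices, size);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(subsample(10, 0.5, 42L)));
  }
}
```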
      • uniqueValues

        public static byte[] uniqueValues​(byte[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueValues

        public static short[] uniqueValues​(short[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueValues

        public static int[] uniqueValues​(int[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueValues

        public static long[] uniqueValues​(long[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueValues

        public static float[] uniqueValues​(float[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueValues

        public static double[] uniqueValues​(double[] numbers)
        Returns all the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique numbers
      • uniqueCounts

        public static gnu.trove.map.TByteIntMap uniqueCounts​(byte[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
      • uniqueCounts

        public static gnu.trove.map.TShortIntMap uniqueCounts​(short[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
      • uniqueCounts

        public static gnu.trove.map.TIntIntMap uniqueCounts​(int[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
      • uniqueCounts

        public static gnu.trove.map.TLongIntMap uniqueCounts​(long[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
      • uniqueCounts

        public static gnu.trove.map.TFloatIntMap uniqueCounts​(float[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
      • uniqueCounts

        public static gnu.trove.map.TDoubleIntMap uniqueCounts​(double[] numbers)
        Returns all the counts for the unique numbers in the array.
        Parameters:
        numbers - the numbers to use
        Returns:
        the unique number counts (number -> count)
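        The "number -> count" mapping returned by uniqueCounts can be pictured with a plain JDK map. The sketch below uses java.util.TreeMap in place of the Trove maps (such as TDoubleIntMap) named in the signatures, purely so that it runs without extra dependencies; the class name UniqueCountsSketch is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class UniqueCountsSketch {

  /** Maps each distinct value to the number of times it occurs in the array. */
  static Map<Double, Integer> uniqueCounts(double[] numbers) {
    Map<Double, Integer> counts = new TreeMap<>();
    for (double v : numbers)
      counts.merge(v, 1, Integer::sum);  // insert 1 or add 1 to the existing count
    return counts;
  }

  public static void main(String[] args) {
    System.out.println(uniqueCounts(new double[]{1, 2, 2, 3, 3, 3}));
  }
}
```

        The key set of such a map corresponds to what the uniqueValues methods return as an array.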
      • main

        public static void main​(String[] args)
        Just for testing.
        Parameters:
        args - ignored