Class LinearRegressionJ

  • All Implemented Interfaces:
    Stoppable, StoppableWithFeedback, Serializable, Cloneable, weka.classifiers.Classifier, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.core.WeightedInstancesHandler

    public class LinearRegressionJ
    extends StoppableClassifier
    implements weka.core.OptionHandler, weka.core.WeightedInstancesHandler
    Class for using linear regression for prediction. Uses the Akaike criterion for model selection, and is able to deal with weighted instances.

    Valid options are:

     -S <number of selection method>
      Set the attribute selection method to use. 1 = None, 2 = Greedy.
      (default 0 = M5' method)
     
     -C
      Do not try to eliminate colinear attributes.
     
     -R <double>
      Set ridge parameter (default 1.0e-8).
     
     -minimal
      Conserve memory, don't keep dataset header and means/stdevs.
      Model cannot be printed out if this option is enabled. (default: keep data)
     
     -additional-stats
      Output additional statistics.
     
     -output-debug-info
      If set, classifier is run in debug mode and
      may output additional info to the console
     
     -do-not-check-capabilities
      If set, classifier capabilities are not checked before classifier is built
      (use with caution).
     
    LinearRegression version based on r12246 before switch to MTJ.
    Author:
    Eibe Frank, Len Trigg
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int m_AttributeSelection
      The current attribute selection method
      protected boolean m_checksTurnedOff
      Turn off all checks and conversions?
      protected int m_ClassIndex
      The index of the class attribute
      protected double m_ClassMean
      The mean of the class attribute
      protected double m_ClassStdDev
      The standard deviation of the class attribute
      protected double[] m_Coefficients
      Array for storing coefficients of linear regression.
      protected boolean m_EliminateColinearAttributes
      Try to eliminate correlated attributes?
      protected double[] m_Means
      The attribute means
      protected boolean m_Minimal
      Conserve memory?
      protected weka.filters.unsupervised.attribute.ReplaceMissingValues m_MissingFilter
      The filter for replacing missing values.
      protected boolean m_ModelBuilt
      Model already built?
      protected boolean m_outputAdditionalStats
      Whether to output additional statistics, such as std. dev. of coefficients and t-stats.
      protected double m_Ridge
      The ridge parameter
      protected boolean[] m_SelectedAttributes
      Which attributes are relevant?
      protected double[] m_StdDevs
      The attribute standard deviations
      protected weka.core.Instances m_TransformedData
      Variable for storing transformed training data.
      protected weka.filters.supervised.attribute.NominalToBinary m_TransformFilter
      The filter storing the transformation from nominal to binary attributes.
      static int SELECTION_GREEDY
      Attribute selection method: Greedy method
      static int SELECTION_M5
      Attribute selection method: M5 method
      static int SELECTION_NONE
      Attribute selection method: No attribute selection
      static weka.core.Tag[] TAGS_SELECTION
      Attribute selection methods
      • Fields inherited from class weka.classifiers.AbstractClassifier

        BATCH_SIZE_DEFAULT, m_BatchSize, m_Debug, m_DoNotCheckCapabilities, m_numDecimalPlaces, NUM_DECIMAL_PLACES_DEFAULT
    • Method Summary

      Modifier and Type Method Description
      String attributeSelectionMethodTipText()
      Returns the tip text for this property
      void buildClassifier​(weka.core.Instances data)
      Builds a regression model for the given data.
      protected double calculateSE​(boolean[] selectedAttributes, double[] coefficients)
      Calculate the squared error of a regression model on the training data
      double classifyInstance​(weka.core.Instance instance)
      Classifies the given instance using the linear regression function.
      double[] coefficients()
      Returns the coefficients for this linear model.
      protected boolean deselectColinearAttributes​(boolean[] selectedAttributes, double[] coefficients)
      Removes the attribute with the highest standardised coefficient greater than 1.5 from the selected attributes.
      protected double[] doRegression​(boolean[] selectedAttributes)
      Calculate a linear regression using the selected attributes
      String eliminateColinearAttributesTipText()
      Returns the tip text for this property
      protected void findBestModel()
      Performs a greedy search for the best regression model using Akaike's criterion.
      weka.core.SelectedTag getAttributeSelectionMethod()
      Gets the method used to select attributes for use in the linear regression.
      weka.core.Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      boolean getEliminateColinearAttributes()
      Get the value of EliminateColinearAttributes.
      boolean getMinimal()
      Returns whether conserving memory is preferred over being able to output the model as a string.
      String[] getOptions()
      Gets the current settings of the classifier.
      boolean getOutputAdditionalStats()
      Get whether to output additional statistics (such as std. deviation of coefficients and t-statistics).
      String getRevision()
      Returns the revision string.
      double getRidge()
      Get the value of Ridge.
      String globalInfo()
      Returns a string describing this classifier
      Enumeration<weka.core.Option> listOptions()
      Returns an enumeration describing the available options.
      static void main​(String[] argv)
      Generates a linear regression function predictor.
      String minimalTipText()
      Returns the tip text for this property.
      int numParameters()
      Get the number of coefficients used in the model
      String outputAdditionalStatsTipText()
      Returns the tip text for this property.
      protected double regressionPrediction​(weka.core.Instance transformedInstance, boolean[] selectedAttributes, double[] coefficients)
      Calculate the dependent value for a given instance for a given regression model.
      String ridgeTipText()
      Returns the tip text for this property
      void setAttributeSelectionMethod​(weka.core.SelectedTag method)
      Sets the method used to select attributes for use in the linear regression.
      void setEliminateColinearAttributes​(boolean newEliminateColinearAttributes)
      Set the value of EliminateColinearAttributes.
      void setMinimal​(boolean value)
      Sets whether conserving memory is preferred over being able to output the model as a string.
      void setOptions​(String[] options)
      Parses a given list of options.
      void setOutputAdditionalStats​(boolean additional)
      Set whether to output additional statistics (such as std. deviation of coefficients and t-statistics).
      void setRidge​(double newRidge)
      Set the value of Ridge.
      String toString()
      Outputs the linear regression model as a string.
      void turnChecksOff()
      Turns off checks for missing values, etc.
      void turnChecksOn()
      Turns on checks for missing values, etc.
      • Methods inherited from class weka.classifiers.AbstractClassifier

        batchSizeTipText, debugTipText, distributionForInstance, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, postExecution, preExecution, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
    • Field Detail

      • SELECTION_M5

        public static final int SELECTION_M5
        Attribute selection method: M5 method
        See Also:
        Constant Field Values
      • SELECTION_NONE

        public static final int SELECTION_NONE
        Attribute selection method: No attribute selection
        See Also:
        Constant Field Values
      • SELECTION_GREEDY

        public static final int SELECTION_GREEDY
        Attribute selection method: Greedy method
        See Also:
        Constant Field Values
      • TAGS_SELECTION

        public static final weka.core.Tag[] TAGS_SELECTION
        Attribute selection methods
      • m_Coefficients

        protected double[] m_Coefficients
        Array for storing coefficients of linear regression.
      • m_SelectedAttributes

        protected boolean[] m_SelectedAttributes
        Which attributes are relevant?
      • m_TransformedData

        protected weka.core.Instances m_TransformedData
        Variable for storing transformed training data.
      • m_MissingFilter

        protected weka.filters.unsupervised.attribute.ReplaceMissingValues m_MissingFilter
        The filter for replacing missing values.
      • m_TransformFilter

        protected weka.filters.supervised.attribute.NominalToBinary m_TransformFilter
        The filter storing the transformation from nominal to binary attributes.
      • m_ClassStdDev

        protected double m_ClassStdDev
        The standard deviation of the class attribute
      • m_ClassMean

        protected double m_ClassMean
        The mean of the class attribute
      • m_ClassIndex

        protected int m_ClassIndex
        The index of the class attribute
      • m_Means

        protected double[] m_Means
        The attribute means
      • m_StdDevs

        protected double[] m_StdDevs
        The attribute standard deviations
      • m_outputAdditionalStats

        protected boolean m_outputAdditionalStats
        Whether to output additional statistics, such as std. dev. of coefficients and t-stats.
      • m_AttributeSelection

        protected int m_AttributeSelection
        The current attribute selection method
      • m_EliminateColinearAttributes

        protected boolean m_EliminateColinearAttributes
        Try to eliminate correlated attributes?
      • m_checksTurnedOff

        protected boolean m_checksTurnedOff
        Turn off all checks and conversions?
      • m_Ridge

        protected double m_Ridge
        The ridge parameter
      • m_Minimal

        protected boolean m_Minimal
        Conserve memory?
      • m_ModelBuilt

        protected boolean m_ModelBuilt
        Model already built?
    • Constructor Detail

      • LinearRegressionJ

        public LinearRegressionJ()
    • Method Detail

      • main

        public static void main​(String[] argv)
        Generates a linear regression function predictor.
        Parameters:
        argv - the options
      • globalInfo

        public String globalInfo()
        Returns a string describing this classifier
        Returns:
        a description of the classifier suitable for displaying in the explorer/experimenter gui
      • getCapabilities

        public weka.core.Capabilities getCapabilities()
        Returns default capabilities of the classifier.
        Specified by:
        getCapabilities in interface weka.core.CapabilitiesHandler
        Specified by:
        getCapabilities in interface weka.classifiers.Classifier
        Overrides:
        getCapabilities in class weka.classifiers.AbstractClassifier
        Returns:
        the capabilities of this classifier
      • buildClassifier

        public void buildClassifier​(weka.core.Instances data)
                             throws Exception
        Builds a regression model for the given data.
        Specified by:
        buildClassifier in interface weka.classifiers.Classifier
        Parameters:
        data - the training data to be used for generating the linear regression function
        Throws:
        Exception - if the classifier could not be built successfully
      • classifyInstance

        public double classifyInstance​(weka.core.Instance instance)
                                throws Exception
        Classifies the given instance using the linear regression function.
        Specified by:
        classifyInstance in interface weka.classifiers.Classifier
        Overrides:
        classifyInstance in class weka.classifiers.AbstractClassifier
        Parameters:
        instance - the test instance
        Returns:
        the classification
        Throws:
        Exception - if classification can't be done successfully
      • toString

        public String toString()
        Outputs the linear regression model as a string.
        Overrides:
        toString in class Object
        Returns:
        the model as string
      • listOptions

        public Enumeration<weka.core.Option> listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.classifiers.AbstractClassifier
        Returns:
        an enumeration of all the available options.
      • coefficients

        public double[] coefficients()
        Returns the coefficients for this linear model.
        Returns:
        the coefficients for this linear model
      • getOptions

        public String[] getOptions()
        Gets the current settings of the classifier.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.classifiers.AbstractClassifier
        Returns:
        an array of strings suitable for passing to setOptions
      • setOptions

        public void setOptions​(String[] options)
                        throws Exception
        Parses a given list of options.

        Valid options are:

         -S <number of selection method>
          Set the attribute selection method to use. 1 = None, 2 = Greedy.
          (default 0 = M5' method)
         
         -C
          Do not try to eliminate colinear attributes.
         
         -R <double>
          Set ridge parameter (default 1.0e-8).
         
         -minimal
          Conserve memory, don't keep dataset header and means/stdevs.
          Model cannot be printed out if this option is enabled. (default: keep data)
         
         -additional-stats
          Output additional statistics.
         
         -output-debug-info
          If set, classifier is run in debug mode and
          may output additional info to the console
         
         -do-not-check-capabilities
          If set, classifier capabilities are not checked before classifier is built
          (use with caution).
         
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.classifiers.AbstractClassifier
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
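The option list above can be illustrated with a small stand-alone parser. This is only a sketch: `OptionSketch` and its fields are hypothetical names, the defaults are copied from the option descriptions, and the code does not reproduce how this class or Weka actually parses options.

```java
/** Hypothetical holder for the documented options; not the class's real parser. */
final class OptionSketch {
    int selection = 0;                 // -S: 0 = M5' (default), 1 = None, 2 = Greedy
    boolean eliminateColinear = true;  // -C switches elimination off
    double ridge = 1.0e-8;             // -R: ridge parameter
    boolean minimal = false;           // -minimal: conserve memory
    boolean additionalStats = false;   // -additional-stats
    boolean debug = false;             // -output-debug-info
    boolean skipCapabilityCheck = false; // -do-not-check-capabilities

    /** Reads the flags in order; flags with a value consume the next array element. */
    static OptionSketch parse(String[] options) {
        OptionSketch o = new OptionSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-S": o.selection = Integer.parseInt(value(options, ++i)); break;
                case "-C": o.eliminateColinear = false; break;
                case "-R": o.ridge = Double.parseDouble(value(options, ++i)); break;
                case "-minimal": o.minimal = true; break;
                case "-additional-stats": o.additionalStats = true; break;
                case "-output-debug-info": o.debug = true; break;
                case "-do-not-check-capabilities": o.skipCapabilityCheck = true; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return o;
    }

    /** Returns the value following a flag, or fails if the array ends first. */
    private static String value(String[] options, int i) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value for " + options[i - 1]);
        }
        return options[i];
    }
}
```

For example, `OptionSketch.parse(new String[] {"-S", "1", "-C"})` selects no attribute selection and disables colinear attribute elimination, leaving the ridge at its documented default.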
      • ridgeTipText

        public String ridgeTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getRidge

        public double getRidge()
        Get the value of Ridge.
        Returns:
        Value of Ridge.
      • setRidge

        public void setRidge​(double newRidge)
        Set the value of Ridge.
        Parameters:
        newRidge - Value to assign to Ridge.
      • eliminateColinearAttributesTipText

        public String eliminateColinearAttributesTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getEliminateColinearAttributes

        public boolean getEliminateColinearAttributes()
        Get the value of EliminateColinearAttributes.
        Returns:
        Value of EliminateColinearAttributes.
      • setEliminateColinearAttributes

        public void setEliminateColinearAttributes​(boolean newEliminateColinearAttributes)
        Set the value of EliminateColinearAttributes.
        Parameters:
        newEliminateColinearAttributes - Value to assign to EliminateColinearAttributes.
      • numParameters

        public int numParameters()
        Get the number of coefficients used in the model
        Returns:
        the number of coefficients
      • attributeSelectionMethodTipText

        public String attributeSelectionMethodTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getAttributeSelectionMethod

        public weka.core.SelectedTag getAttributeSelectionMethod()
        Gets the method used to select attributes for use in the linear regression.
        Returns:
        the method to use.
      • setAttributeSelectionMethod

        public void setAttributeSelectionMethod​(weka.core.SelectedTag method)
        Sets the method used to select attributes for use in the linear regression.
        Parameters:
        method - the attribute selection method to use.
      • minimalTipText

        public String minimalTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getMinimal

        public boolean getMinimal()
        Returns whether conserving memory is preferred over being able to output the model as a string.
        Returns:
        true if memory conservation is preferred over outputting model description
      • setMinimal

        public void setMinimal​(boolean value)
        Sets whether conserving memory is preferred over being able to output the model as a string.
        Parameters:
        value - if true memory will be conserved
      • outputAdditionalStatsTipText

        public String outputAdditionalStatsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getOutputAdditionalStats

        public boolean getOutputAdditionalStats()
        Get whether to output additional statistics (such as std. deviation of coefficients and t-statistics).
        Returns:
        true if additional stats are to be output
      • setOutputAdditionalStats

        public void setOutputAdditionalStats​(boolean additional)
        Set whether to output additional statistics (such as std. deviation of coefficients and t-statistics).
        Parameters:
        additional - true if additional stats are to be output
      • turnChecksOff

        public void turnChecksOff()
        Turns off checks for missing values, etc. Use with caution. Also turns off scaling.
      • turnChecksOn

        public void turnChecksOn()
        Turns on checks for missing values, etc. Also turns on scaling.
      • deselectColinearAttributes

        protected boolean deselectColinearAttributes​(boolean[] selectedAttributes,
                                                     double[] coefficients)
        Removes the attribute with the highest standardised coefficient greater than 1.5 from the selected attributes.
        Parameters:
        selectedAttributes - an array of flags indicating which attributes are included in the regression model
        coefficients - an array of coefficients for the regression model
        Returns:
        true if an attribute was removed
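The rule described for deselectColinearAttributes can be sketched as follows. The sketch assumes the usual definition of a standardised coefficient, |coefficient × attribute std. dev. / class std. dev.|, and assumes the coefficient array holds one entry per selected attribute in attribute order. `ColinearSketch` and its parameter names are hypothetical; the class's own computation may differ.

```java
final class ColinearSketch {
    /**
     * Deselects the attribute with the largest standardised coefficient,
     * provided that value exceeds 1.5. Returns true if one was removed.
     * Assumed layout: coefficients[k] belongs to the k-th selected attribute.
     */
    static boolean deselectLargest(boolean[] selected, double[] coefficients,
                                   double[] attrStdDevs, double classStdDev) {
        double maxSC = 1.5;   // only values above this threshold qualify
        int maxAttr = -1;
        int coeff = 0;        // position in the compact coefficient array
        for (int i = 0; i < selected.length; i++) {
            if (!selected[i]) {
                continue;
            }
            double sc = Math.abs(coefficients[coeff] * attrStdDevs[i] / classStdDev);
            if (sc > maxSC) {
                maxSC = sc;
                maxAttr = i;
            }
            coeff++;
        }
        if (maxAttr < 0) {
            return false;     // nothing exceeded the threshold
        }
        selected[maxAttr] = false;
        return true;
    }
}
```

A caller would typically refit the model after each removal and repeat until the method returns false.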
      • findBestModel

        protected void findBestModel()
                              throws Exception
        Performs a greedy search for the best regression model using Akaike's criterion.
        Throws:
        Exception - if regression can't be done
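A greedy search guided by an Akaike-style score might look like the sketch below. The page does not state the exact formula, so the score used here (error relative to the full model, scaled by the residual degrees of freedom, plus 2 per parameter) is an assumption, as are the names `AkaikeGreedySketch`, `score`, and `search`. The error of a candidate subset is supplied by the caller, so no regression solver is needed to read the example.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

final class AkaikeGreedySketch {
    /** Assumed Akaike-style score: scaled error term plus a penalty of 2 per parameter. */
    static double score(double se, double fullSE, int n, int fullParams, int params) {
        return se / fullSE * (n - fullParams) + 2.0 * params;
    }

    /**
     * Greedy backward elimination: starting from all attributes, repeatedly
     * drop the single attribute whose removal lowers the score the most,
     * and stop when no removal improves it.
     */
    static boolean[] search(int numAttributes, int n, ToDoubleFunction<boolean[]> squaredError) {
        boolean[] selected = new boolean[numAttributes];
        Arrays.fill(selected, true);
        int fullParams = numAttributes + 1;   // attributes plus intercept
        double fullSE = squaredError.applyAsDouble(selected);
        double best = score(fullSE, fullSE, n, fullParams, fullParams);
        int params = fullParams;
        boolean improved = true;
        while (improved) {
            improved = false;
            int drop = -1;
            for (int i = 0; i < numAttributes; i++) {
                if (!selected[i]) {
                    continue;
                }
                selected[i] = false;          // try the model without attribute i
                double s = score(squaredError.applyAsDouble(selected),
                                 fullSE, n, fullParams, params - 1);
                selected[i] = true;
                if (s < best) {
                    best = s;
                    drop = i;
                }
            }
            if (drop >= 0) {
                selected[drop] = false;
                params--;
                improved = true;
            }
        }
        return selected;
    }
}
```

If the full model has zero error the score is not a number and no attribute is dropped, which is an edge case a real implementation would need to handle explicitly.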
      • calculateSE

        protected double calculateSE​(boolean[] selectedAttributes,
                                     double[] coefficients)
                              throws Exception
        Calculate the squared error of a regression model on the training data
        Parameters:
        selectedAttributes - an array of flags indicating which attributes are included in the regression model
        coefficients - an array of coefficients for the regression model
        Returns:
        the mean squared error on the training data
        Throws:
        Exception - if there is a missing class value in the training data
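The description above speaks of "the squared error" while the return value is described as "the mean squared error". Which of the two the method returns is not settled by this page, so the sketch below simply shows both quantities. `SquaredErrorSketch` is a hypothetical name, and treating `NaN` as a missing class value is an assumption made for the example.

```java
final class SquaredErrorSketch {
    /** Sum of squared differences between predictions and actual class values. */
    static double sumSquaredError(double[] predicted, double[] actual) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double se = 0.0;
        for (int i = 0; i < actual.length; i++) {
            if (Double.isNaN(actual[i])) {
                // assumption: NaN marks a missing class value
                throw new IllegalArgumentException("missing class value at row " + i);
            }
            double error = predicted[i] - actual[i];
            se += error * error;
        }
        return se;
    }

    /** The same sum divided by the number of rows. */
    static double meanSquaredError(double[] predicted, double[] actual) {
        return sumSquaredError(predicted, actual) / actual.length;
    }
}
```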
      • regressionPrediction

        protected double regressionPrediction​(weka.core.Instance transformedInstance,
                                              boolean[] selectedAttributes,
                                              double[] coefficients)
                                       throws Exception
        Calculate the dependent value for a given instance for a given regression model.
        Parameters:
        transformedInstance - the input instance
        selectedAttributes - an array of flags indicating which attributes are included in the regression model
        coefficients - an array of coefficients for the regression model
        Returns:
        the regression value for the instance.
        Throws:
        Exception - if the class attribute of the input instance is not assigned
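How a prediction follows from the selection flags and the coefficient array can be sketched as below. The layout is an assumption: one coefficient per selected attribute, in attribute order, with the intercept stored last. The `values` array is assumed to hold only the input attributes, with the class attribute already excluded. `PredictionSketch` is a hypothetical name.

```java
final class PredictionSketch {
    /**
     * Computes intercept + sum of coefficient * value over the selected attributes.
     * Assumed layout: coefficients = [c_0, ..., c_{k-1}, intercept].
     */
    static double predict(double[] values, boolean[] selected, double[] coefficients) {
        double result = 0.0;
        int column = 0;  // position in the compact coefficient array
        for (int j = 0; j < values.length; j++) {
            if (selected[j]) {
                result += coefficients[column] * values[j];
                column++;
            }
        }
        return result + coefficients[column];  // last entry is the intercept
    }
}
```

With values {2, 5, 3}, flags {true, false, true}, and coefficients {1.5, -1.0, 4.0}, the second value is ignored and the result is 1.5·2 − 1.0·3 + 4.0.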
      • doRegression

        protected double[] doRegression​(boolean[] selectedAttributes)
                                 throws Exception
        Calculate a linear regression using the selected attributes
        Parameters:
        selectedAttributes - an array of booleans where each element is true if the corresponding attribute should be included in the regression.
        Returns:
        an array of coefficients for the linear regression model.
        Throws:
        Exception - if an error occurred during the regression.
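For orientation, a weighted ridge regression over the selected attributes can be written with plain arrays as below. This is a generic textbook formulation (centre the data with weighted means, solve (XᵀWX + ridge·I)b = XᵀWy by Gaussian elimination, then recover the intercept) and is not taken from this class's source. `RidgeSketch`, its parameters, and the returned layout (coefficients followed by the intercept) are assumptions.

```java
import java.util.stream.IntStream;

final class RidgeSketch {
    /** Returns one coefficient per selected column, followed by the intercept. */
    static double[] fit(double[][] x, double[] y, double[] w, boolean[] selected, double ridge) {
        int n = x.length;
        int[] cols = IntStream.range(0, selected.length).filter(j -> selected[j]).toArray();
        int k = cols.length;
        // Weighted means of the class and of each selected column.
        double sumW = 0.0, yMean = 0.0;
        double[] xMean = new double[k];
        for (int i = 0; i < n; i++) {
            sumW += w[i];
            yMean += w[i] * y[i];
            for (int a = 0; a < k; a++) xMean[a] += w[i] * x[i][cols[a]];
        }
        yMean /= sumW;
        for (int a = 0; a < k; a++) xMean[a] /= sumW;
        // Augmented system [X'WX + ridge*I | X'Wy] on centred data.
        double[][] m = new double[k][k + 1];
        for (int i = 0; i < n; i++) {
            for (int a = 0; a < k; a++) {
                double xa = x[i][cols[a]] - xMean[a];
                for (int b = 0; b < k; b++) m[a][b] += w[i] * xa * (x[i][cols[b]] - xMean[b]);
                m[a][k] += w[i] * xa * (y[i] - yMean);
            }
        }
        for (int a = 0; a < k; a++) m[a][a] += ridge;
        // Gaussian elimination with partial pivoting.
        for (int p = 0; p < k; p++) {
            int max = p;
            for (int r = p + 1; r < k; r++) {
                if (Math.abs(m[r][p]) > Math.abs(m[max][p])) max = r;
            }
            double[] t = m[p]; m[p] = m[max]; m[max] = t;
            if (m[p][p] == 0.0) throw new ArithmeticException("singular system");
            for (int r = p + 1; r < k; r++) {
                double f = m[r][p] / m[p][p];
                for (int c = p; c <= k; c++) m[r][c] -= f * m[p][c];
            }
        }
        // Back substitution, then the intercept from the means.
        double[] coef = new double[k + 1];
        for (int a = k - 1; a >= 0; a--) {
            double s = m[a][k];
            for (int b = a + 1; b < k; b++) s -= m[a][b] * coef[b];
            coef[a] = s / m[a][a];
        }
        double intercept = yMean;
        for (int a = 0; a < k; a++) intercept -= coef[a] * xMean[a];
        coef[k] = intercept;
        return coef;
    }
}
```

A small positive ridge, such as the documented default of 1.0e-8, keeps the system solvable when selected attributes are nearly linearly dependent while barely changing the fit on well-conditioned data.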
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Overrides:
        getRevision in class weka.classifiers.AbstractClassifier
        Returns:
        the revision