Class AdditiveRegressionUserDefined

  • All Implemented Interfaces:
    Serializable, Cloneable, weka.classifiers.Classifier, weka.core.AdditionalMeasureProducer, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.core.TechnicalInformationHandler, weka.core.WeightedInstancesHandler

    public class AdditiveRegressionUserDefined
    extends weka.classifiers.MultipleClassifiersCombiner
    implements weka.core.OptionHandler, weka.core.AdditionalMeasureProducer, weka.core.WeightedInstancesHandler, weka.core.TechnicalInformationHandler
    Meta classifier that enhances the performance of the regression base classifiers, iterating through the list of specified classifiers. Each iteration fits a model to the residuals left by the classifier on the previous iteration. Prediction is accomplished by adding the predictions of each classifier. Reducing the shrinkage (learning rate) parameter helps prevent overfitting and has a smoothing effect but increases the learning time.

    Based on additive regression, but iterates through the user-defined list of classifiers rather than using the same classifier in each iteration.
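The iteration scheme described above can be sketched in plain Java. This is a minimal illustration, not the class's actual implementation: `BaseLearner`, `LineLearner` and `AdditiveSketch` are hypothetical names that do not exist in Weka, inputs are single numeric values rather than `Instances`, and the initial prediction is assumed to be the mean of the targets (the squared-error case).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a regression base learner; not a Weka type. */
interface BaseLearner {
    void fit(double[] x, double[] y);
    double predict(double x);
}

/** Least-squares line y = a + b*x; assumes at least one training point. */
final class LineLearner implements BaseLearner {
    private double a, b;

    public void fit(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
        }
        b = sxx == 0 ? 0 : sxy / sxx;
        a = my - b * mx;
    }

    public double predict(double x) { return a + b * x; }
}

/** Fits each learner in a user-defined list to the residuals left so far. */
final class AdditiveSketch {
    private final List<BaseLearner> fitted = new ArrayList<>();
    private final double shrinkage;
    private double initialPrediction;

    AdditiveSketch(double shrinkage) { this.shrinkage = shrinkage; }

    void fit(List<BaseLearner> learners, double[] x, double[] y) {
        fitted.clear();
        // Assumed initial prediction: the mean of the targets.
        double sum = 0;
        for (double v : y) sum += v;
        initialPrediction = sum / y.length;
        double[] residuals = new double[y.length];
        for (int i = 0; i < y.length; i++) residuals[i] = y[i] - initialPrediction;
        // One iteration per learner, in the order the user listed them.
        for (BaseLearner learner : learners) {
            learner.fit(x, residuals);
            for (int i = 0; i < residuals.length; i++) {
                residuals[i] -= shrinkage * learner.predict(x[i]);
            }
            fitted.add(learner);
        }
    }

    /** Prediction is the initial value plus the shrunken sum of all learners. */
    double predict(double x) {
        double result = initialPrediction;
        for (BaseLearner learner : fitted) result += shrinkage * learner.predict(x);
        return result;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {1, 3, 5, 7, 9};
        List<BaseLearner> learners = new ArrayList<>();
        learners.add(new LineLearner());
        AdditiveSketch model = new AdditiveSketch(1.0);
        model.fit(learners, x, y);
        System.out.println(model.predict(10.0));
    }
}
```

With a shrinkage below 1.0, each learner removes only part of the remaining residual, so the same list of learners leaves a larger training error. That is the trade-off between smoothing and learning time mentioned in the description.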

    For more information see:

    J.H. Friedman (1999). Stochastic Gradient Boosting.

    BibTeX:
     @techreport{Friedman1999,
        author = {J.H. Friedman},
        institution = {Stanford University},
        title = {Stochastic Gradient Boosting},
        year = {1999},
        PS = {http://www-stat.stanford.edu/\~jhf/ftp/stobst.ps}
     }
     


    Valid options are:

     -S
      Specify shrinkage rate. (default = 1.0, i.e., no shrinkage)
     -A
      Minimize absolute error instead of squared error (assumes that base learner minimizes absolute error).
     -resume
      Set whether classifier can continue training after performing the requested number of iterations.
      Note that setting this to true will retain certain data structures which can increase the
      size of the model.
     
     -B <classifier specification>
      Full class name of classifier to include, followed
      by scheme options. May be specified multiple times.
      (default: "weka.classifiers.rules.ZeroR")
     -output-debug-info
      If set, classifier is run in debug mode and
      may output additional info to the console
     -do-not-check-capabilities
      If set, classifier capabilities are not checked before classifier is built
      (use with caution).
     -num-decimal-places
      The number of decimal places for the output of numbers in the model (default 2).
     -batch-size
      The desired batch size for batch prediction (default 100).
     Options specific to classifier weka.classifiers.rules.ZeroR:
     
     -output-debug-info
      If set, classifier is run in debug mode and
      may output additional info to the console
     -do-not-check-capabilities
      If set, classifier capabilities are not checked before classifier is built
      (use with caution).
     -num-decimal-places
      The number of decimal places for the output of numbers in the model (default 2).
     -batch-size
      The desired batch size for batch prediction (default 100).
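As an illustration of how the options above combine, the following sketch assembles an option array with one `-B` entry per base classifier. `OptionsSketch` and its `build` method are hypothetical helpers, not part of this class or of Weka. The base classifier specifications in the usage note are only examples of well-known Weka schemes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles options in the form listed above. */
final class OptionsSketch {

    /**
     * @param shrinkage value for -S
     * @param minimizeAbsoluteError whether to add the -A flag
     * @param baseSpecs one "class name followed by scheme options" string per -B
     */
    static String[] build(double shrinkage, boolean minimizeAbsoluteError,
                          List<String> baseSpecs) {
        List<String> opts = new ArrayList<>();
        opts.add("-S");
        opts.add(Double.toString(shrinkage));
        if (minimizeAbsoluteError) {
            opts.add("-A");
        }
        // -B may be given multiple times; order is preserved.
        for (String spec : baseSpecs) {
            opts.add("-B");
            opts.add(spec);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> specs = new ArrayList<>();
        specs.add("weka.classifiers.functions.LinearRegression");
        specs.add("weka.classifiers.trees.REPTree -L 3");
        System.out.println(String.join(" | ", build(0.5, false, specs)));
    }
}
```

An array built this way has the shape that `setOptions(String[] options)` expects. Each classifier's own scheme options stay inside the same string as its class name.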
    Author:
    Mark Hall, FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected ArrayList<weka.classifiers.Classifier> m_ClassifierList
      ArrayList for storing the generated base classifiers.
      protected weka.core.Instances m_Data
      The working data
      protected double m_Diff
      The improvement in the sum of (absolute or squared) residuals.
      protected double m_Error
      The sum of (absolute or squared) residuals.
      protected double m_InitialPrediction
      The mean or median
      protected boolean m_MinimizeAbsoluteError
      Whether to minimise absolute error instead of squared error.
      protected int m_numItsPerformed
      Number of iterations performed in this session of iterating
      protected boolean m_resume
      Whether to allow training to continue at a later point after the initial model is built.
      protected double m_shrinkage
      Shrinkage (Learning rate).
      protected boolean m_SuitableData
      Whether we have suitable data or not (if not, only the mean/mode is used)
      • Fields inherited from class weka.classifiers.MultipleClassifiersCombiner

        m_Classifiers
      • Fields inherited from class weka.classifiers.AbstractClassifier

        BATCH_SIZE_DEFAULT, m_BatchSize, m_Debug, m_DoNotCheckCapabilities, m_numDecimalPlaces, NUM_DECIMAL_PLACES_DEFAULT
    • Method Summary

      Modifier and Type Method Description
      void buildClassifier​(weka.core.Instances data)
      Method used to build the classifier.
      double classifyInstance​(weka.core.Instance inst)
      Classify an instance.
      Enumeration<String> enumerateMeasures()
      Returns an enumeration of the additional measure names
      weka.core.Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      double getMeasure​(String additionalMeasureName)
      Returns the value of the named measure
      boolean getMinimizeAbsoluteError()
      Gets whether absolute error is to be minimized.
      String[] getOptions()
      Gets the current settings of the Classifier.
      boolean getResume()
      Returns true if training is allowed to continue at a later point after the initial model is built.
      String getRevision()
      Returns the revision string.
      double getShrinkage()
      Get the shrinkage rate.
      weka.core.TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      String globalInfo()
      Returns a string describing this classifier
      void initializeClassifier​(weka.core.Instances data)
      Initialize classifier.
      Enumeration<weka.core.Option> listOptions()
      Returns an enumeration describing the available options.
      static void main​(String[] args)
      Main method for testing this class.
      double measureNumIterations()
      Returns the number of iterations (base classifiers) completed
      String minimizeAbsoluteErrorTipText()
      Returns the tip text for this property
      boolean next​(int index)
      Perform another iteration.
      protected weka.core.Instances residualReplace​(weka.core.Instances data, double c)
      Replace the class values of the instances from the current iteration with residuals after predicting the given constant.
      protected weka.core.Instances residualReplace​(weka.core.Instances data, weka.classifiers.Classifier c)
      Replace the class values of the instances from the current iteration with residuals after predicting with the supplied classifier.
      String resumeTipText()
      Tool tip text for the resume property
      void setMinimizeAbsoluteError​(boolean f)
      Sets whether absolute error is to be minimized.
      void setOptions​(String[] options)
      Parses a given list of options.
      void setResume​(boolean resume)
      Sets whether training is allowed to continue at a later point after the initial model is built.
      void setShrinkage​(double l)
      Set the shrinkage parameter
      String shrinkageTipText()
      Returns the tip text for this property
      String toString()
      Returns textual description of the classifier.
      • Methods inherited from class weka.classifiers.MultipleClassifiersCombiner

        classifiersTipText, getClassifier, getClassifiers, getClassifierSpec, postExecution, preExecution, setClassifiers
      • Methods inherited from class weka.classifiers.AbstractClassifier

        batchSizeTipText, debugTipText, distributionForInstance, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
    • Field Detail

      • m_ClassifierList

        protected ArrayList<weka.classifiers.Classifier> m_ClassifierList
        ArrayList for storing the generated base classifiers. Note: we are hiding the variable from IteratedSingleClassifierEnhancer
      • m_shrinkage

        protected double m_shrinkage
        Shrinkage (Learning rate). Default = no shrinkage.
      • m_InitialPrediction

        protected double m_InitialPrediction
        The mean or median
      • m_SuitableData

        protected boolean m_SuitableData
        Whether we have suitable data or not (if not, only the mean/mode is used)
      • m_Data

        protected weka.core.Instances m_Data
        The working data
      • m_Error

        protected double m_Error
        The sum of (absolute or squared) residuals.
      • m_Diff

        protected double m_Diff
        The improvement in the sum of (absolute or squared) residuals.
      • m_MinimizeAbsoluteError

        protected boolean m_MinimizeAbsoluteError
        Whether to minimise absolute error instead of squared error.
      • m_resume

        protected boolean m_resume
        Whether to allow training to continue at a later point after the initial model is built.
      • m_numItsPerformed

        protected int m_numItsPerformed
        Number of iterations performed in this session of iterating
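The quantities behind `m_InitialPrediction`, `m_Error` and `m_Diff` can be illustrated with a small stand-alone sketch. `ResidualStats` is a hypothetical helper, not part of this class. Pairing the mean with squared error and the median with absolute error is an assumption based on which constant minimizes each loss; the field documentation itself only says "the mean or median".

```java
import java.util.Arrays;

/** Hypothetical helper illustrating the initial prediction and error sums. */
final class ResidualStats {

    /** Mean for squared error, median for absolute error (assumed pairing). */
    static double initialPrediction(double[] y, boolean minimizeAbsoluteError) {
        if (y.length == 0) {
            throw new IllegalArgumentException("no target values");
        }
        if (!minimizeAbsoluteError) {
            double sum = 0;
            for (double v : y) sum += v;
            return sum / y.length;
        }
        double[] sorted = y.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Sum of absolute or squared residuals, as described for m_Error. */
    static double error(double[] residuals, boolean minimizeAbsoluteError) {
        double sum = 0;
        for (double r : residuals) {
            sum += minimizeAbsoluteError ? Math.abs(r) : r * r;
        }
        return sum;
    }

    /** Improvement between two iterations, as described for m_Diff. */
    static double improvement(double previousError, double currentError) {
        return previousError - currentError;
    }

    public static void main(String[] args) {
        double[] y = {1, 2, 3, 10};
        System.out.println(initialPrediction(y, false));
        System.out.println(initialPrediction(y, true));
    }
}
```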
    • Constructor Detail

      • AdditiveRegressionUserDefined

        public AdditiveRegressionUserDefined()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this classifier
        Returns:
        a description of the classifier suitable for displaying in the explorer/experimenter gui
      • getTechnicalInformation

        public weka.core.TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface weka.core.TechnicalInformationHandler
        Returns:
        the technical information about this class
      • listOptions

        public Enumeration<weka.core.Option> listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.classifiers.MultipleClassifiersCombiner
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions​(String[] options)
                        throws Exception
        Parses a given list of options.
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.classifiers.MultipleClassifiersCombiner
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
      • getOptions

        public String[] getOptions()
        Gets the current settings of the Classifier.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.classifiers.MultipleClassifiersCombiner
        Returns:
        an array of strings suitable for passing to setOptions
      • shrinkageTipText

        public String shrinkageTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setShrinkage

        public void setShrinkage​(double l)
        Set the shrinkage parameter
        Parameters:
        l - the shrinkage rate.
      • getShrinkage

        public double getShrinkage()
        Get the shrinkage rate.
        Returns:
        the value of the learning rate
      • minimizeAbsoluteErrorTipText

        public String minimizeAbsoluteErrorTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setMinimizeAbsoluteError

        public void setMinimizeAbsoluteError​(boolean f)
        Sets whether absolute error is to be minimized.
        Parameters:
        f - true if absolute error is to be minimized.
      • getMinimizeAbsoluteError

        public boolean getMinimizeAbsoluteError()
        Gets whether absolute error is to be minimized.
        Returns:
        true if absolute error is to be minimized
      • resumeTipText

        public String resumeTipText()
        Tool tip text for the resume property
        Returns:
        the tool tip text for the resume property
      • setResume

        public void setResume​(boolean resume)
        Sets whether training is allowed to continue at a later point after the initial model is built. Note that setting this to true retains certain data structures, which can increase the size of the model.
        Parameters:
        resume - true if training should be able to continue after the requested number of iterations has been performed
      • getResume

        public boolean getResume()
        Returns true if training is allowed to continue at a later point after the initial model is built.
        Returns:
        the current value of resume
      • getCapabilities

        public weka.core.Capabilities getCapabilities()
        Returns default capabilities of the classifier.
        Specified by:
        getCapabilities in interface weka.core.CapabilitiesHandler
        Specified by:
        getCapabilities in interface weka.classifiers.Classifier
        Overrides:
        getCapabilities in class weka.classifiers.MultipleClassifiersCombiner
        Returns:
        the capabilities of this classifier
      • initializeClassifier

        public void initializeClassifier​(weka.core.Instances data)
                                  throws Exception
        Initialize classifier.
        Parameters:
        data - the training data
        Throws:
        Exception - if the classifier could not be initialized successfully
      • next

        public boolean next​(int index)
                     throws Exception
        Perform another iteration.
        Throws:
        Exception
      • buildClassifier

        public void buildClassifier​(weka.core.Instances data)
                             throws Exception
        Method used to build the classifier.
        Specified by:
        buildClassifier in interface weka.classifiers.Classifier
        Throws:
        Exception
      • classifyInstance

        public double classifyInstance​(weka.core.Instance inst)
                                throws Exception
        Classify an instance.
        Specified by:
        classifyInstance in interface weka.classifiers.Classifier
        Overrides:
        classifyInstance in class weka.classifiers.AbstractClassifier
        Parameters:
        inst - the instance to predict
        Returns:
        a prediction for the instance
        Throws:
        Exception - if an error occurs
      • residualReplace

        protected weka.core.Instances residualReplace​(weka.core.Instances data,
                                                      weka.classifiers.Classifier c)
                                               throws Exception
        Replace the class values of the instances from the current iteration with residuals after predicting with the supplied classifier.
        Parameters:
        data - the instances to predict
        c - the classifier to use
        Returns:
        a new set of instances with class values replaced by residuals
        Throws:
        Exception - if something goes wrong
      • residualReplace

        protected weka.core.Instances residualReplace​(weka.core.Instances data,
                                                      double c)
                                               throws Exception
        Replace the class values of the instances from the current iteration with residuals after predicting the given constant.
        Parameters:
        data - the instances to predict
        c - the constant to use
        Returns:
        a new set of instances with class values replaced by residuals
        Throws:
        Exception - if something goes wrong
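The two residualReplace variants above can be pictured with plain arrays in place of `weka.core.Instances`. `ResidualReplaceSketch` is a hypothetical class, not part of Weka. Scaling the classifier's predictions by the shrinkage rate inside this step is an assumption about where shrinkage is applied. Both methods return a new array and leave their input untouched, mirroring the documented "new set of instances".

```java
/** Hypothetical array-based analogue of the two residualReplace methods. */
final class ResidualReplaceSketch {

    /** Residuals after predicting the same constant c for every target. */
    static double[] replaceWithConstant(double[] targets, double c) {
        double[] residuals = new double[targets.length];
        for (int i = 0; i < targets.length; i++) {
            residuals[i] = targets[i] - c;
        }
        return residuals;
    }

    /**
     * Residuals after subtracting a model's predictions, scaled by the
     * shrinkage rate (assumed to be applied at this step).
     */
    static double[] replaceWithPredictions(double[] targets, double[] predictions,
                                           double shrinkage) {
        if (targets.length != predictions.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double[] residuals = new double[targets.length];
        for (int i = 0; i < targets.length; i++) {
            residuals[i] = targets[i] - shrinkage * predictions[i];
        }
        return residuals;
    }

    public static void main(String[] args) {
        double[] targets = {3, 5, 8};
        double[] afterConstant = replaceWithConstant(targets, 5.0);
        double[] afterModel = replaceWithPredictions(targets, new double[] {2, 4, 6}, 0.5);
        System.out.println(java.util.Arrays.toString(afterConstant));
        System.out.println(java.util.Arrays.toString(afterModel));
    }
}
```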
      • enumerateMeasures

        public Enumeration<String> enumerateMeasures()
        Returns an enumeration of the additional measure names
        Specified by:
        enumerateMeasures in interface weka.core.AdditionalMeasureProducer
        Returns:
        an enumeration of the measure names
      • getMeasure

        public double getMeasure​(String additionalMeasureName)
        Returns the value of the named measure
        Specified by:
        getMeasure in interface weka.core.AdditionalMeasureProducer
        Parameters:
        additionalMeasureName - the name of the measure to query for its value
        Returns:
        the value of the named measure
        Throws:
        IllegalArgumentException - if the named measure is not supported
      • measureNumIterations

        public double measureNumIterations()
        Returns the number of iterations (base classifiers) completed
        Returns:
        the number of iterations (same as number of base classifier models)
      • toString

        public String toString()
        Returns textual description of the classifier.
        Overrides:
        toString in class Object
        Returns:
        a description of the classifier as a string
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Overrides:
        getRevision in class weka.classifiers.AbstractClassifier
        Returns:
        the revision
      • main

        public static void main​(String[] args)
        Main method for testing this class.
        Parameters:
        args - should contain the following arguments: -t training file [-T test file] [-c class index]