Class StreamingGradientBoostedTrees

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, Classifier, MultiClassClassifier, Regressor, AWTRenderable, Learner<Example<Instance>>, MOAObject, OptionHandler

    public class StreamingGradientBoostedTrees
    extends AbstractClassifier
    implements MultiClassClassifier, Regressor
    Gradient boosted trees for evolving data streams

    Streaming Gradient Boosted Trees (SGBT) is trained using the weighted squared loss elicited in XGBoost. SGBT uses trees with a replacement strategy to detect and recover from drifts, enabling the ensemble to adapt without sacrificing predictive performance.

    See details in:
    Nuwan Gunasekara, Bernhard Pfahringer, Heitor Murilo Gomes, Albert Bifet. Gradient Boosted Trees for Evolving Data Streams. Machine Learning, Springer, 2024. DOI.
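
The weighted squared loss mentioned above comes from XGBoost's second-order approximation of the objective: with per-instance gradient g and Hessian h, fitting the next booster amounts to a regression on the target -g/h with instance weight h. The sketch below illustrates this for a binary logistic loss. The class and method names are hypothetical, and the choice of logistic loss is an assumption made for illustration; this class may compute its gradients differently.

```java
// Hypothetical sketch, not part of MOA: XGBoost-style weighted squared loss
// for a binary logistic loss (the loss choice is an assumption).
final class WeightedSquaredLossSketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // First derivative of the logistic loss w.r.t. the raw score; label is 0 or 1.
    static double gradient(double rawScore, int label) {
        return sigmoid(rawScore) - label;
    }

    // Second derivative of the logistic loss w.r.t. the raw score.
    static double hessian(double rawScore) {
        double p = sigmoid(rawScore);
        return p * (1.0 - p);
    }

    // Regression target for the next booster; the instance weight is the Hessian.
    // The Hessian approaches 0 for extreme scores, so a real implementation
    // would need to guard this division.
    static double regressionTarget(double g, double h) {
        return -g / h;
    }

    public static void main(String[] args) {
        double rawScore = 0.0;                 // current ensemble output
        int label = 1;                         // true class
        double g = gradient(rawScore, label);  // -0.5
        double h = hessian(rawScore);          // 0.25
        System.out.println("target=" + regressionTarget(g, h) + " weight=" + h);
    }
}
```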

    Parameters:

    • -l : Classifier to train on instances.
    • -s : The number of boosting iterations.
    • -m : Percentage (%) of attributes for each boosting iteration.
    • -L : Learning rate.
    • -H : Disable one-hot encoding for regressors that support nominal attributes.
    • -M : Train multiple times on each instance: Ceiling(Hessian * M) iterations.
    • -S : Randomly skip 1/S of the instances during training (S=1: no skip, use all instances for training).
    • -K : Use Squared Loss for Classification.
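
The -m, -M and -S options above can be read as three small computations. The sketch below is one possible interpretation, not the class's actual code: the class and method names are hypothetical, the rounding rule for -m is an assumption, and skipping with probability 1/S for S > 1 is an assumption drawn from the option text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not part of MOA: one reading of the -m, -M and -S options.
final class SgbtOptionSketch {

    // -m: pick a random subset holding `percentage` % of the attribute indexes.
    // Rounding to the nearest integer (at least 1) is an assumption.
    static List<Integer> sampleAttributes(int numAttributes, int percentage, Random rnd) {
        List<Integer> indexes = new ArrayList<>();
        if (numAttributes <= 0) {
            return indexes;
        }
        for (int i = 0; i < numAttributes; i++) {
            indexes.add(i);
        }
        int k = (int) Math.round(numAttributes * percentage / 100.0);
        k = Math.min(numAttributes, Math.max(1, k));
        Collections.shuffle(indexes, rnd);
        List<Integer> chosen = new ArrayList<>(indexes.subList(0, k));
        Collections.sort(chosen);
        return chosen;
    }

    // -M: number of training passes for one instance, Ceiling(Hessian * M).
    static int trainingIterations(double hessian, int m) {
        return (int) Math.ceil(hessian * m);
    }

    // -S: skip an instance with probability 1/S; S = 1 never skips.
    static boolean skipInstance(int s, Random rnd) {
        return s > 1 && rnd.nextInt(s) == 0;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        System.out.println("attributes: " + sampleAttributes(10, 50, rnd));
        System.out.println("iterations: " + trainingIterations(0.25, 10));
        System.out.println("skip: " + skipInstance(4, rnd));
    }
}
```

For example, a Hessian of 0.25 with M = 10 gives Ceiling(2.5) = 3 training iterations under this reading.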
    Version:
    $Revision: 1 $
    Author:
    Nuwan Gunasekara (ng98 at students dot waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • baseLearnerOption

        public ClassOption baseLearnerOption
      • numberOfboostingIterations

        public IntOption numberOfboostingIterations
      • percentageOfAttributesForEachBoostingIteration

        public IntOption percentageOfAttributesForEachBoostingIteration
      • learningRateOption

        public FloatOption learningRateOption
      • disableOneHotEncoding

        public FlagOption disableOneHotEncoding
      • multipleIterationByCeilingOfHessianTimesM

        public IntOption multipleIterationByCeilingOfHessianTimesM
      • randomlySkip1SthOfInstancesAtTraining

        public IntOption randomlySkip1SthOfInstancesAtTraining
      • useSquaredLossForClassification

        public FlagOption useSquaredLossForClassification
      • randomSeedOption

        public IntOption randomSeedOption
      • reset

        protected boolean reset
      • numberClasses

        protected int numberClasses
      • lastPrediction

        protected double[] lastPrediction
    • Constructor Detail

      • StreamingGradientBoostedTrees

        public StreamingGradientBoostedTrees()
    • Method Detail

      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The ...Impl methods ease the programmer's burden by removing the need to remember calls to super in overridden methods. Note that a subclass that does not override this method will not compile.
        Specified by:
        resetLearningImpl in class AbstractClassifier
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The ...Impl methods ease the programmer's burden by removing the need to remember calls to super in overridden methods. Note that a subclass that does not override this method will not compile.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks
      • getModelDescription

        public void getModelDescription(StringBuilder out,
                                        int indent)
        Description copied from class: AbstractClassifier
        Returns a string representation of the model.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        out - the StringBuilder to append the description to
        indent - the number of characters to indent
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that need a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.
      • correctlyClassifies

        public boolean correctlyClassifies(Instance inst)
        Description copied from interface: Classifier
        Gets whether this classifier correctly classifies an instance. Uses getVotesForInstance to obtain the prediction and the instance to obtain its true class.
        Specified by:
        correctlyClassifies in interface Classifier
        Overrides:
        correctlyClassifies in class AbstractClassifier
        Parameters:
        inst - the instance to be classified
        Returns:
        true if the instance is correctly classified
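
The check described above (take the votes as the prediction and compare against the true class) can be sketched with plain arrays. The class and method names are hypothetical, and breaking ties in favour of the lowest class index is an assumption; the real method works on an Instance rather than on raw arrays.

```java
// Hypothetical sketch, not part of MOA: compare the highest-voted class
// with the true class of the instance.
final class CorrectlyClassifiesSketch {

    // Index of the largest vote; ties go to the lowest index (an assumption).
    static int argMax(double[] votes) {
        int best = 0;
        for (int i = 1; i < votes.length; i++) {
            if (votes[i] > votes[best]) {
                best = i;
            }
        }
        return best;
    }

    static boolean correctlyClassifies(double[] votes, int trueClass) {
        return votes.length > 0 && argMax(votes) == trueClass;
    }

    public static void main(String[] args) {
        double[] votes = {0.1, 0.7, 0.2};
        System.out.println(correctlyClassifies(votes, 1));
    }
}
```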
      • trainOnInstanceImpl

        public void trainOnInstanceImpl(Instance inst)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The ...Impl methods ease the programmer's burden by removing the need to remember calls to super in overridden methods. Note that a subclass that does not override this method will not compile.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        inst - the instance to be used for training
      • getVotesForInstance

        public double[] getVotesForInstance(Instance inst)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        inst - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
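
A boosted ensemble typically accumulates raw per-class scores, which then have to be turned into the membership probabilities described above. The sketch below shows one common way to do that, a numerically stable softmax. Whether this class uses softmax is an assumption, and the class and method names are hypothetical.

```java
// Hypothetical sketch, not part of MOA: map raw per-class scores to
// probabilities with a numerically stable softmax (an assumed choice).
final class VotesSketch {

    static double[] softmax(double[] rawScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : rawScores) {
            max = Math.max(max, s);
        }
        double[] out = new double[rawScores.length];
        double sum = 0.0;
        for (int i = 0; i < rawScores.length; i++) {
            // Subtracting the maximum avoids overflow in exp().
            out[i] = Math.exp(rawScores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] votes = softmax(new double[] {1.0, 2.0, 3.0});
        System.out.println(java.util.Arrays.toString(votes));
    }
}
```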
      • newBinaryClassInstance

        public static Instance newBinaryClassInstance(Instance instance)
      • getSubInstance

        public static Instance getSubInstance(Instance instance,
                                              double weight,
                                              ArrayList<Integer> subSpaceFeaturesIndexes,
                                              boolean setNumericClassAttribute,
                                              double numericClassValue,
                                              boolean useOneHotEncoding)
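
The parameter names of getSubInstance suggest that it projects an instance onto a feature subspace and optionally one-hot encodes nominal attributes. The sketch below illustrates that idea on plain arrays. It is a guess at the behaviour rather than the actual implementation: the class, the method and the nominalCardinality array (0 for a numeric attribute, otherwise the number of categories) are all hypothetical, and the weight and class-attribute parameters are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of MOA: subspace projection with optional
// one-hot encoding of nominal attributes.
final class SubInstanceSketch {

    static double[] project(double[] values,
                            int[] nominalCardinality,
                            List<Integer> subSpaceFeaturesIndexes,
                            boolean useOneHotEncoding) {
        List<Double> out = new ArrayList<>();
        for (int idx : subSpaceFeaturesIndexes) {
            int cardinality = nominalCardinality[idx];
            if (useOneHotEncoding && cardinality > 0) {
                // Nominal attribute: emit one indicator value per category.
                int category = (int) values[idx];
                for (int c = 0; c < cardinality; c++) {
                    out.add(c == category ? 1.0 : 0.0);
                }
            } else {
                // Numeric attribute, or encoding disabled: copy the value.
                out.add(values[idx]);
            }
        }
        double[] result = new double[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] values = {1.5, 2.0, 7.0};   // attribute 1 is nominal, category 2
        int[] cardinality = {0, 3, 0};
        List<Integer> subspace = List.of(0, 1);
        System.out.println(java.util.Arrays.toString(
                project(values, cardinality, subspace, true)));
    }
}
```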
      • getScoresWhenNullTree

        public static double[] getScoresWhenNullTree(int outputSize)
      • createSGBTs

        protected void createSGBTs(int numSGBTs)