Class OzaBagAdwin

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, Classifier, MultiClassClassifier, AWTRenderable, Learner<Example<Instance>>, MOAObject, OptionHandler
    Direct Known Subclasses:
    OzaBagAdwinML

    public class OzaBagAdwin
    extends AbstractClassifier
    implements MultiClassClassifier, CapabilitiesHandler
    Bagging for evolving data streams using ADWIN.

    ADWIN is a change detector and estimator that solves in a well-specified way the problem of tracking the average of a stream of bits or real-valued numbers. ADWIN keeps a variable-length window of recently seen items, with the property that the window has the maximal length statistically consistent with the hypothesis “there has been no change in the average value inside the window”.
    More precisely, an older fragment of the window is dropped if and only if there is enough evidence that its average value differs from that of the rest of the window. This has two consequences: one, that change reliably declared whenever the window shrinks; and two, that at any time the average over the existing window can be reliably taken as an estimation of the current average in the stream (barring a very small or very recent change that is still not statistically visible). A formal and quantitative statement of these two points (a theorem) appears in

    Albert Bifet and Ricard Gavaldà. Learning from time-changing data with adaptive windowing. In SIAM International Conference on Data Mining, 2007.

    ADWIN is parameter- and assumption-free in the sense that it automatically detects and adapts to the current rate of change. Its only parameter is a confidence bound δ, indicating how confident we want to be in the algorithm’s output, inherent to all algorithms dealing with random processes. Also important, ADWIN does not maintain the window explicitly, but compresses it using a variant of the exponential histogram technique. This means that it keeps a window of length W using only O(log W) memory and O(log W) processing time per item.
    ADWIN Bagging is the online bagging method of Oza and Rusell with the addition of the ADWIN algorithm as a change detector and as an estimator for the weights of the boosting method. When a change is detected, the worst classifier of the ensemble of classifiers is removed and a new classifier is added to the ensemble.

    See details in:
    [BHPKG] Albert Bifet, Geoff Holmes, Bernhard Pfahringer, Richard Kirkby, and Ricard Gavaldà . New ensemble methods for evolving data streams. In 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2009.

    Example:

    OzaBagAdwin -l HoeffdingTreeNBAdaptive -s 10

    Parameters:

    • -l : Classifier to train
    • -s : The number of models in the bag
    Version:
    $Revision: 7 $
    Author:
    Albert Bifet (abifet at cs dot waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • baseLearnerOption

        public ClassOption baseLearnerOption
      • ensembleSizeOption

        public IntOption ensembleSizeOption
      • ADError

        protected ADWIN[] ADError
    • Constructor Detail

      • OzaBagAdwin

        public OzaBagAdwin()
    • Method Detail

      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        resetLearningImpl in class AbstractClassifier
      • trainOnInstanceImpl

        public void trainOnInstanceImpl​(Instance inst)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        inst - the instance to be used for training
      • getVotesForInstance

        public double[] getVotesForInstance​(Instance inst)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        inst - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that needs a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.
      • getModelDescription

        public void getModelDescription​(StringBuilder out,
                                        int indent)
        Description copied from class: AbstractClassifier
        Returns a string representation of the model.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        out - the stringbuilder to add the description
        indent - the number of characters to indent
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks