Class OnlineUnderOverBagging

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, Classifier, MultiClassClassifier, AWTRenderable, Learner<Example<Instance>>, MOAObject, OptionHandler

    public class OnlineUnderOverBagging
    extends AbstractClassifier
    implements MultiClassClassifier, CapabilitiesHandler
    Online UnderOverBagging is the online version of the UnderOverBagging ensemble method.

    For imbalanced classes, UnderOverBagging under-samples the majority class and over-samples the minority class. In addition, the sampling rate can be varied over the bagging iterations, which further increases the diversity of the base learners.

    The online UnderOverBagging algorithm is derived from the observation that a Binomial distribution with :math:`N` trials and sampling rate :math:`\frac{C}{N}` converges, as :math:`N` grows, to a Poisson distribution with :math:`\lambda=C`.
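The correspondence between the two distributions can be written out as a limit. The following is a standard worked step, assuming K denotes the number of copies of one instance drawn in N independent trials (the symbol K is introduced here only for illustration):

```latex
\[
\Pr[K = k]
  = \binom{N}{k}\left(\frac{C}{N}\right)^{k}\left(1 - \frac{C}{N}\right)^{N-k}
  \;\xrightarrow{\;N \to \infty\;}\;
  \frac{C^{k}\, e^{-C}}{k!},
  \qquad k = 0, 1, 2, \dots
\]
```

The right-hand side is the Poisson probability mass function with \lambda = C, which is why an online learner that cannot know N in advance can draw the number of copies of each arriving instance from a Poisson distribution instead.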

    This online ensemble method is improved by the addition of an ADWIN change detector. ADWIN (Adaptive Windowing) keeps updated statistics over a variable-sized window, so it can detect changes and cut its window to better adapt the learning algorithm.
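The windowing idea can be illustrated with a deliberately simplified sketch. It is not MOA's ADWIN class: it stores every value instead of compressing the window into buckets, checks every split point on each update, and uses a Hoeffding-style cut threshold. The class name `SimpleAdaptiveWindow`, its methods, and the delta value are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified adaptive window over values in [0, 1]; an illustration only, not MOA's ADWIN. */
final class SimpleAdaptiveWindow {
    private final List<Double> window = new ArrayList<>();
    private final double delta; // confidence parameter (assumed value chosen by the caller)

    SimpleAdaptiveWindow(double delta) {
        this.delta = delta;
    }

    /** Adds a value; returns true if a change was detected and the window was cut. */
    boolean add(double value) {
        window.add(value);
        boolean changed = false;
        boolean cut = true;
        while (cut && window.size() > 1) {
            cut = false;
            int n = window.size();
            double total = 0.0;
            for (double v : window) {
                total += v;
            }
            double headSum = 0.0;
            for (int split = 1; split < n; split++) {
                headSum += window.get(split - 1);
                int n0 = split;          // size of the older sub-window
                int n1 = n - split;      // size of the newer sub-window
                double mean0 = headSum / n0;
                double mean1 = (total - headSum) / n1;
                double m = 1.0 / (1.0 / n0 + 1.0 / n1);
                double epsCut = Math.sqrt(Math.log(4.0 * n / delta) / (2.0 * m));
                if (Math.abs(mean0 - mean1) > epsCut) {
                    window.subList(0, split).clear(); // drop the older sub-window
                    changed = true;
                    cut = true;
                    break;
                }
            }
        }
        return changed;
    }

    int width() {
        return window.size();
    }

    double mean() {
        if (window.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (double v : window) {
            total += v;
        }
        return total / window.size();
    }
}
```

Feeding the window a classifier's per-instance error (0 or 1) would make a returned `true` signal that the error rate has shifted, which is the kind of event an ensemble can react to.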

    See details in:
    B. Wang and J. Pineau, "Online Bagging and Boosting for Imbalanced Data Streams," in IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 12, pp. 3353-3366, 1 Dec. 2016. doi: 10.1109/TKDE.2016.2609424

    Parameters:

    • -l : The base estimator. Each classifier in the ensemble is an instance of it.
    • -s : The size of the ensemble, in other words, how many classifiers to train.
    • -i : The sampling rate of the positive instances.
    • -d : Whether to use ADWIN as a drift detector. If enabled, the method uses it to track the performance of the classifiers and to adapt when a drift is detected.
    • -r : Seed for the random state.
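As a rough illustration of how the ensemble size (-s), the sampling rate (-i), and the seed (-r) could interact, the sketch below draws, for one incoming instance, how many times each ensemble member would train on it. It is not MOA's code. The rate schedule (a = (m + 1) / M for member m of M, multiplied by the sampling rate for positive instances) is an assumption chosen to show a rate that varies across members, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of Poisson-based under/over-sampling weights; not the MOA implementation. */
final class UnderOverSamplingSketch {

    /** Draws from Poisson(lambda) with Knuth's multiplication method (fine for small lambda). */
    static int poisson(double lambda, Random rng) {
        double limit = Math.exp(-lambda);
        int k = 0;
        double p = 1.0;
        do {
            k++;
            p *= rng.nextDouble();
        } while (p > limit);
        return k - 1;
    }

    /** Assumed schedule: the rate grows with the member index and is scaled up for positives. */
    static double lambdaFor(int memberIndex, int ensembleSize, int samplingRate, boolean positive) {
        double a = (memberIndex + 1) / (double) ensembleSize;
        return positive ? a * samplingRate : a;
    }

    /** For one instance, returns how many times each member would be trained on it. */
    static int[] trainingCounts(int ensembleSize, int samplingRate, boolean positive, Random rng) {
        int[] counts = new int[ensembleSize];
        for (int m = 0; m < ensembleSize; m++) {
            counts[m] = poisson(lambdaFor(m, ensembleSize, samplingRate, positive), rng);
        }
        return counts;
    }

    public static void main(String[] args) {
        Random rng = new Random(1); // plays the role of the -r seed
        System.out.println("positive: " + Arrays.toString(trainingCounts(10, 2, true, rng)));
        System.out.println("negative: " + Arrays.toString(trainingCounts(10, 2, false, rng)));
    }
}
```

Under this assumed schedule, positive instances are on average seen more often than negative ones by every member, and members with a low index see fewer copies of everything, which is one way to obtain diverse base learners.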
    Version:
    $Revision: 1 $
    Author:
    Alessio Bernardo (alessio dot bernardo at polimi dot it)
    See Also:
    Serialized Form
    • Field Detail

      • baseLearnerOption

        public ClassOption baseLearnerOption
      • ensembleSizeOption

        public IntOption ensembleSizeOption
      • samplingRateOption

        public IntOption samplingRateOption
      • disableDriftDetectionOption

        public FlagOption disableDriftDetectionOption
      • nEstimators

        protected int nEstimators
      • samplingRate

        protected int samplingRate
      • driftDetection

        protected boolean driftDetection
    • Constructor Detail

      • OnlineUnderOverBagging

        public OnlineUnderOverBagging()
    • Method Detail

      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        resetLearningImpl in class AbstractClassifier
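The rationale for the ...Impl methods is the template-method pattern: the base class owns the public entry point and its bookkeeping, and subclasses only fill in an abstract hook. The sketch below is a generic illustration with hypothetical class names, not MOA's `AbstractClassifier` source.

```java
/** Hypothetical base class: the public method does shared bookkeeping, then calls the hook. */
abstract class BaseLearnerSketch {
    private int resetCount = 0;

    public final void resetLearning() {
        resetCount++;          // bookkeeping that subclasses cannot forget or skip
        resetLearningImpl();   // subclass-specific work
    }

    /** Subclasses must implement this; leaving it out is a compile-time error. */
    protected abstract void resetLearningImpl();

    public int getResetCount() {
        return resetCount;
    }
}

/** Hypothetical subclass: it never calls super, yet the base bookkeeping still runs. */
final class CountingLearnerSketch extends BaseLearnerSketch {
    int instancesSeen = 5;

    @Override
    protected void resetLearningImpl() {
        instancesSeen = 0;
    }
}
```

Because the hook is abstract, a subclass that forgets it fails to compile, which matches the note above about compiler errors.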
      • trainOnInstanceImpl

        public void trainOnInstanceImpl(Instance instance)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        instance - the instance to be used for training
      • getVotesForInstance

        public double[] getVotesForInstance(Instance instance)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        instance - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
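One plausible way for an ensemble to satisfy this contract is to normalize each member's votes, add them up, and normalize the total, returning all zeros when no member votes. This page does not state how OnlineUnderOverBagging combines its members, so the sketch below is an assumption, and the class and method names are hypothetical.

```java
/** Hypothetical vote combiner for an ensemble; not taken from the MOA source. */
final class VoteCombinerSketch {

    /** Averages normalized member votes; returns all zeros if no member casts a vote. */
    static double[] combine(double[][] memberVotes, int numClasses) {
        double[] combined = new double[numClasses];
        for (double[] votes : memberVotes) {
            double sum = 0.0;
            for (double v : votes) {
                sum += v;
            }
            if (sum <= 0.0) {
                continue; // this member abstains
            }
            int limit = Math.min(votes.length, numClasses);
            for (int c = 0; c < limit; c++) {
                combined[c] += votes[c] / sum;
            }
        }
        double total = 0.0;
        for (double v : combined) {
            total += v;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                combined[c] /= total;
            }
        }
        return combined;
    }
}
```

For example, members voting {1, 3} and {2, 2} contribute {0.25, 0.75} and {0.5, 0.5}, so the combined estimate favors the second class.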
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that need a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.
      • getModelDescription

        public void getModelDescription(StringBuilder arg0,
                                        int arg1)
        Description copied from class: AbstractClassifier
        Returns a string representation of the model.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        arg0 - the stringbuilder to add the description
        arg1 - the number of characters to indent
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks
      • adjustEnsembleSize

        protected void adjustEnsembleSize(int nClasses)