Class CSMOTE

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, Classifier, MultiClassClassifier, AWTRenderable, Learner<Example<Instance>>, MOAObject, OptionHandler

    public class CSMOTE
    extends AbstractClassifier
    implements MultiClassClassifier

    This strategy saves all minority samples in a window managed by ADWIN while a model is trained on the incoming data. When the minority sample ratio falls below a given threshold, an online version of SMOTE is applied: a random minority sample is chosen from the window and a new synthetic sample is generated, repeating until the minority sample ratio is greater than or equal to the threshold. The model is then trained on the newly generated samples.
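The oversampling step described above can be sketched as follows. This is a minimal illustration of SMOTE-style interpolation, not the MOA implementation: the class `SmoteSketch`, its method names, the use of plain `double[]` feature vectors, and the choice of Euclidean distance are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Illustrative only: a hypothetical helper, not part of MOA. */
class SmoteSketch {

    /** Euclidean distance between two numeric feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Picks a random minority sample from the window, then interpolates between
     * it and one of its k nearest minority neighbours.
     */
    static double[] synthesize(List<double[]> window, int k, Random rnd) {
        double[] base = window.get(rnd.nextInt(window.size()));

        // Candidate neighbours: every other sample in the window.
        List<double[]> others = new ArrayList<>(window);
        others.remove(base);
        int kk = Math.min(k, others.size());
        if (kk == 0) {
            return base.clone(); // nothing to interpolate with
        }

        // Sort by distance to the base sample and pick one of the kk closest.
        others.sort(Comparator.comparingDouble(x -> distance(base, x)));
        double[] neighbour = others.get(rnd.nextInt(kk));

        // The synthetic point lies on the segment between base and neighbour.
        double gap = rnd.nextDouble();
        double[] synthetic = new double[base.length];
        for (int i = 0; i < base.length; i++) {
            synthetic[i] = base[i] + gap * (neighbour[i] - base[i]);
        }
        return synthetic;
    }
}
```

Because the synthetic point is a convex combination of two stored minority samples, it always falls between them in feature space. Nominal attributes would need separate handling, which this sketch does not attempt.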

    See details in:
    Alessio Bernardo, Heitor Murilo Gomes, Jacob Montiel, Bernhard Pfahringer, Albert Bifet, Emanuele Della Valle. C-SMOTE: Continuous Synthetic Minority Oversampling for Evolving Data Streams. In BigData, IEEE, 2020.

    Parameters:

    • -l : Classifier to train. Default is ARF
    • -k : Number of neighbors for SMOTE. Default is 5
    • -t : Threshold for the minority samples. Default is 0.5
    • -m : Minimum number of samples in the minority class for applying SMOTE. Default is 100
    • -d : Whether to use ADWIN as a drift detector. If enabled, the method uses it to track the performance of the classifiers and adapt when a drift is detected.
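To make the interplay of -t and -m concrete, the sketch below counts how many synthetic minority samples would be needed to reach the threshold. It rests on two assumptions that the text above does not spell out: the minority ratio is taken as minority samples (real plus synthetic) divided by all samples, and -m is read as a simple lower bound on the number of real minority samples. `RebalanceSketch` and `syntheticNeeded` are hypothetical names, not MOA API.

```java
/** Illustrative only: a hypothetical helper, not part of MOA. */
class RebalanceSketch {

    /**
     * Returns how many synthetic minority samples must be added so that
     * (minority + synthetic) / (minority + majority + synthetic) >= threshold.
     * Returns 0 when there are fewer than minSizeAllowed minority samples.
     */
    static int syntheticNeeded(int nMinority, int nMajority,
                               double threshold, int minSizeAllowed) {
        if (threshold < 0.0 || threshold >= 1.0) {
            // A threshold of 1.0 or more could never be reached while majority samples exist.
            throw new IllegalArgumentException("threshold must be in [0, 1)");
        }
        if (nMinority < minSizeAllowed) {
            return 0; // assumed reading of -m: too few minority samples to oversample
        }
        int generated = 0;
        while ((double) (nMinority + generated)
                / (nMinority + nMajority + generated) < threshold) {
            generated++;
        }
        return generated;
    }
}
```

With the defaults listed above (-t 0.5, -m 100), 100 minority and 900 majority samples would call for 800 synthetic samples under these assumptions, since 900 / 1800 is exactly 0.5.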
    Version:
    $Revision: 1 $
    Author:
    Alessio Bernardo (alessio dot bernardo at polimi dot com)
    See Also:
    Serialized Form
    • Field Detail

      • baseLearnerOption

        public ClassOption baseLearnerOption
      • neighborsOption

        public IntOption neighborsOption
      • minSizeAllowedOption

        public IntOption minSizeAllowedOption
      • disableDriftDetectionOption

        public FlagOption disableDriftDetectionOption
      • neighbors

        protected int neighbors
      • threshold

        protected double threshold
      • minSizeAllowed

        protected int minSizeAllowed
      • driftDetection

        protected boolean driftDetection
      • adwin

        protected ADWIN adwin
      • adwinDriftDetector

        protected ADWIN adwinDriftDetector
      • nMinorityTotal

        protected int nMinorityTotal
      • nMajorityTotal

        protected int nMajorityTotal
      • nGeneratedMinorityTotal

        protected int nGeneratedMinorityTotal
      • nGeneratedMajorityTotal

        protected int nGeneratedMajorityTotal
      • indexValues

        protected int[] indexValues
    • Constructor Detail

      • CSMOTE

        public CSMOTE()
    • Method Detail

      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        resetLearningImpl in class AbstractClassifier
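The rationale given for the ...Impl methods is the template method pattern: the base class keeps the shared bookkeeping in one place and delegates the learner-specific part to an abstract hook. The sketch below shows the idea with hypothetical classes (`LearnerSketch`, `LabelCounterSketch`); it is not the code of AbstractClassifier, and making the public methods final is a choice of this example only.

```java
/** Illustrative only: a hypothetical base class, not MOA's AbstractClassifier. */
abstract class LearnerSketch {
    private int instancesSeen;

    /** Shared bookkeeping lives here, so subclasses never need to call super. */
    public final void resetLearning() {
        instancesSeen = 0;
        resetLearningImpl();
    }

    public final void trainOnInstance(int classLabel) {
        instancesSeen++;
        trainOnInstanceImpl(classLabel);
    }

    public int getInstancesSeen() {
        return instancesSeen;
    }

    // Abstract hooks: a concrete subclass that omits one fails to compile.
    protected abstract void resetLearningImpl();

    protected abstract void trainOnInstanceImpl(int classLabel);
}

/** A trivial two-class learner that only counts labels. */
class LabelCounterSketch extends LearnerSketch {
    private int[] counts = new int[2];

    @Override
    protected void resetLearningImpl() {
        counts = new int[2];
    }

    @Override
    protected void trainOnInstanceImpl(int classLabel) {
        counts[classLabel]++;
    }

    int count(int classLabel) {
        return counts[classLabel];
    }
}
```

Calling `resetLearning()` on a `LabelCounterSketch` clears both the shared counter and the subclass state, even though the subclass contains no call to super.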
      • getVotesForInstance

        public double[] getVotesForInstance(Instance instance)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        instance - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
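As a rough illustration of how a caller might interpret the returned array, the sketch below picks the class with the highest vote and treats an all-zero array as "unclassified". `VotesSketch` and its methods are hypothetical helpers written for this example, not MOA API.

```java
/** Illustrative only: hypothetical helpers for reading a votes array. */
class VotesSketch {

    /** Returns the index of the largest positive vote, or -1 if all votes are zero. */
    static int predictedClass(double[] votes) {
        int best = -1;
        double bestVote = 0.0;
        for (int i = 0; i < votes.length; i++) {
            if (votes[i] > bestVote) {
                best = i;
                bestVote = votes[i];
            }
        }
        return best;
    }

    /** Scales the votes to sum to 1; an all-zero array is returned as a copy. */
    static double[] normalize(double[] votes) {
        double sum = 0.0;
        for (double v : votes) {
            sum += v;
        }
        double[] result = votes.clone();
        if (sum > 0.0) {
            for (int i = 0; i < result.length; i++) {
                result[i] /= sum;
            }
        }
        return result;
    }
}
```

Normalizing is useful when the votes are raw weights rather than probabilities, which depends on the base learner in use.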
      • trainOnInstanceImpl

        public void trainOnInstanceImpl(Instance instance)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        instance - the instance to be used for training
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that need a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.
      • getModelDescription

        public void getModelDescription(StringBuilder arg0,
                                        int arg1)
        Description copied from class: AbstractClassifier
        Returns a string representation of the model.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        arg0 - the stringbuilder to add the description
        arg1 - the number of characters to indent
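The two parameters follow a common pattern for nested descriptions: the caller supplies the buffer and the current indentation, and each component appends its own lines. The sketch below shows that pattern with a hypothetical `DescriptionSketch` class; the text it writes is made up for the example and is not the output of CSMOTE.

```java
/** Illustrative only: a hypothetical description writer, not MOA code. */
class DescriptionSketch {

    /** Appends one line of text preceded by the given number of spaces. */
    static void appendIndented(StringBuilder out, int indent, String text) {
        for (int i = 0; i < indent; i++) {
            out.append(' ');
        }
        out.append(text).append('\n');
    }

    /** Writes a header line, then nested details indented two further characters. */
    static void describe(StringBuilder out, int indent) {
        appendIndented(out, indent, "CSMOTE (sketch)");
        appendIndented(out, indent + 2, "neighbors: 5");
        appendIndented(out, indent + 2, "threshold: 0.5");
    }
}
```

Passing the same StringBuilder down through nested components lets a wrapper learner include the description of its base learner at a deeper indent without building intermediate strings.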
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks