Class ALUncertainty

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, ALClassifier, Classifier, AWTRenderable, Learner<Example<Instance>>, MOAObject, OptionHandler

    public class ALUncertainty
    extends AbstractClassifier
    implements ALClassifier
    Active learning setting for evolving data streams.

    Active learning focuses on learning an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. If changes do not occur close to the boundary, they will be missed and classifiers will fail to adapt. This class contains three active learning strategies for streaming data that explicitly handle concept drift. They are based on fixed uncertainty, dynamic allocation of labeling efforts over time and randomization of the search space [ZBPH]. It also contains the Selective Sampling strategy, which is adapted from [CGZ] and uses a variable labeling threshold.

    [ZBPH] Indre Zliobaite, Albert Bifet, Bernhard Pfahringer, Geoff Holmes: Active Learning with Evolving Streaming Data. ECML/PKDD (3) 2011: 597-612

    [CGZ] N. Cesa-Bianchi, C. Gentile, L. Zaniboni: Worst-Case Analysis of Selective Sampling for Linear Classification. J. Mach. Learn. Res. (7) 2006: 1205-1230
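
The threshold-based strategies described above can be illustrated with a minimal, self-contained sketch. It follows the general idea in [ZBPH] (query a label when the top posterior falls below a threshold, then shrink or widen the threshold) and is not taken from this class's source. The class name, method names, and the `stdDev` parameter are hypothetical, and Selective Sampling is left out.

```java
import java.util.Random;

/** Sketch of threshold-based label-query rules; an illustration, not MOA's implementation. */
final class UncertaintyStrategiesSketch {
    private double threshold;     // current labeling threshold (assumed starting value)
    private final double step;    // relative adjustment step (hypothetical counterpart of -s)
    private final Random random;  // used only by the randomized variant

    UncertaintyStrategiesSketch(double initialThreshold, double step, long seed) {
        this.threshold = initialThreshold;
        this.step = step;
        this.random = new Random(seed);
    }

    /** Fixed uncertainty: query whenever the top posterior is below a constant threshold. */
    static boolean queryFixed(double maxPosterior, double fixedThreshold) {
        return maxPosterior < fixedThreshold;
    }

    /** Variable uncertainty: same test, but the threshold adapts after every decision. */
    boolean queryVariable(double maxPosterior) {
        return decideAndAdapt(maxPosterior, threshold);
    }

    /** Randomized variant: the threshold is multiplied by a random factor centered at 1. */
    boolean queryRandomized(double maxPosterior, double stdDev) {
        double randomized = threshold * (1.0 + stdDev * random.nextGaussian());
        return decideAndAdapt(maxPosterior, randomized);
    }

    private boolean decideAndAdapt(double maxPosterior, double effectiveThreshold) {
        boolean query = maxPosterior < effectiveThreshold;
        // After a query the uncertainty region shrinks; otherwise it widens,
        // so that labels keep being requested even when the classifier is confident.
        threshold *= query ? (1.0 - step) : (1.0 + step);
        return query;
    }

    double threshold() {
        return threshold;
    }
}
```

Because the randomized variant occasionally raises the effective threshold above a confident posterior, it can request labels far from the decision boundary, which is how changes away from the boundary get a chance to be noticed.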


    Parameters:

    • -l : Classifier to train
    • -d : Strategy to use: FixedUncertainty, VarUncertainty, RandVarUncertainty, SelSampling
    • -b : Budget to use
    • -u : Fixed threshold
    • -s : Floating budget step
    • -n : Number of instances at beginning without active learning
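
How the budget (-b) and the initial phase without active learning (-n) might interact can be sketched as follows. This is an assumption-laden illustration, not this class's code: it treats the budget as the maximum fraction of instances that may be labeled after the initial phase, and it labels every instance during that phase. All names are hypothetical.

```java
/** Sketch of budget-limited labeling with an unconditional warm-up phase (assumed semantics). */
final class LabelingBudgetSketch {
    private final double budget; // assumed: maximum fraction of labeled instances (cf. -b)
    private final int warmUp;    // assumed: instances labeled unconditionally at the start (cf. -n)
    private int seen;            // instances processed so far
    private int labeled;         // labels acquired so far, warm-up included

    LabelingBudgetSketch(double budget, int warmUp) {
        this.budget = budget;
        this.warmUp = warmUp;
    }

    /** Processes one instance and reports whether its label is acquired. */
    boolean process(boolean strategyWantsLabel) {
        seen++;
        if (seen <= warmUp) {
            labeled++; // warm-up: always acquire the label
            return true;
        }
        // Fraction of the budget spent so far, not counting the warm-up phase.
        double spent = (double) (labeled - warmUp) / (seen - warmUp);
        if (spent < budget && strategyWantsLabel) {
            labeled++;
            return true;
        }
        return false;
    }

    int labeled() {
        return labeled;
    }

    int seen() {
        return seen;
    }
}
```

In this sketch the query strategy only gets a say while the spent fraction is below the budget, so the labeling rate after the warm-up phase stays close to or below the configured budget.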

    Structural changes to match active learning framework by Daniel Kottke.

    Version:
    $Revision: 7 $
    Author:
    Indre Zliobaite (zliobaite at gmail dot com), Albert Bifet (abifet at cs dot waikato dot ac dot nz), Daniel Kottke (daniel dot kottke at ovgu dot de) - adapted to AL framework
    See Also:
    Serialized Form
    • Field Detail

      • baseLearnerOption

        public ClassOption baseLearnerOption
      • activeLearningStrategyOption

        public MultiChoiceOption activeLearningStrategyOption
      • fixedThresholdOption

        public FloatOption fixedThresholdOption
      • numInstancesInitOption

        public FloatOption numInstancesInitOption
      • lastLabelAcq

        public int lastLabelAcq
      • costLabeling

        public int costLabeling
      • iterationControl

        public int iterationControl
      • newThreshold

        public double newThreshold
      • maxPosterior

        public double maxPosterior
      • accuracyBaseLearner

        public double accuracyBaseLearner
    • Constructor Detail

      • ALUncertainty

        public ALUncertainty()
    • Method Detail

      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        resetLearningImpl in class AbstractClassifier
      • trainOnInstanceImpl

        public void trainOnInstanceImpl(Instance inst)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        inst - the instance to be used for training
      • getVotesForInstance

        public double[] getVotesForInstance(Instance inst)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        inst - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
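
An uncertainty-based strategy needs a single confidence value from the returned vote array. A minimal sketch of that step, assuming non-negative votes that do not necessarily sum to one (the helper name is hypothetical and not part of this API):

```java
/** Sketch: derive the top class posterior from a raw vote array (assumes non-negative votes). */
final class VotesSketch {
    static double maxPosterior(double[] votes) {
        double sum = 0.0;
        double max = 0.0;
        for (double v : votes) {
            sum += v;
            max = Math.max(max, v);
        }
        // An all-zero (unclassified) or empty array yields 0.0, i.e. maximal uncertainty.
        return sum > 0.0 ? max / sum : 0.0;
    }
}
```

Mapping the all-zero case to 0.0 is a design choice in this sketch: an instance the model cannot classify is treated as maximally uncertain, so a threshold test would request its label.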
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that need a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.
      • getModelDescription

        public void getModelDescription(StringBuilder out,
                                        int indent)
        Description copied from class: AbstractClassifier
        Writes a string representation of the model to the given StringBuilder.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        out - the StringBuilder to which the description is appended
        indent - the number of characters to indent
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The reason for ...Impl methods: ease programmer burden by not requiring them to remember calls to super in overridden methods. Note that this will produce compiler errors if not overridden.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks
      • getLastLabelAcqReport

        public int getLastLabelAcqReport()
        Description copied from interface: ALClassifier
        Reports whether the previously chosen instance was added to the training set of the active learner. The result is returned as an int rather than a boolean.
        Specified by:
        getLastLabelAcqReport in interface ALClassifier
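
A minimal sketch of how an evaluation loop might turn the per-instance report into a labeling rate. It assumes that any nonzero report means a label was acquired, which the description above does not state explicitly, and it uses a plain `IntSupplier` as a stand-in for the learner; the helper is hypothetical.

```java
import java.util.function.IntSupplier;

/** Sketch: accumulate per-instance label reports into a labeling rate (assumed encoding). */
final class LabelReportSketch {
    /** Polls the report once per processed instance and returns the fraction of acquired labels. */
    static double labelingRate(IntSupplier report, int numInstances) {
        int acquired = 0;
        for (int i = 0; i < numInstances; i++) {
            // Assumption: a nonzero report means the label of the last instance was acquired.
            if (report.getAsInt() != 0) {
                acquired++;
            }
        }
        return numInstances == 0 ? 0.0 : (double) acquired / numInstances;
    }
}
```

In a real loop the learner would be trained on one instance between consecutive polls; the sketch shows only the accumulation step.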
      • setModelContext

        public void setModelContext(InstancesHeader ih)
        Description copied from interface: Learner
        Sets the reference to the header of the data stream. The header of the data stream is extended from WEKA Instances. This header is needed to know the number of classes and attributes.
        Specified by:
        setModelContext in interface Learner<Example<Instance>>
        Overrides:
        setModelContext in class AbstractClassifier
        Parameters:
        ih - the reference to the data stream header