Class FeatureImportanceHoeffdingTreeEnsemble

  • All Implemented Interfaces:
    Configurable, Serializable, CapabilitiesHandler, Classifier, MultiClassClassifier, AWTRenderable, FeatureImportanceClassifier, Learner<Example<Instance>>, MOAObject, OptionHandler

    public class FeatureImportanceHoeffdingTreeEnsemble
    extends AbstractClassifier
    implements MultiClassClassifier, CapabilitiesHandler, FeatureImportanceClassifier
    HoeffdingTree Ensemble Feature Importance.

    This class produces feature importances from ensembles of HoeffdingTree models and their subclasses. It does not interfere with the training algorithm of the underlying ensemble model. The base learner of the ensemble model must be either a HoeffdingTree or one of its subclasses.

    See details in:
    Heitor Murilo Gomes, Rodrigo Fernandes de Mello, Bernhard Pfahringer, Albert Bifet. Feature Scoring using Tree-Based Ensembles for Evolving Data Streams. IEEE International Conference on Big Data (pp. 761-769), 2019

    Parameters:

    • -l : The ensemble classifier used to train and to be analyzed.
    • -t : The HoeffdingTreeFeatureImportance object. Important: the learner option of this object is overridden by the ensemble's base tree model.
    Version:
    $Revision: 1 $
    Author:
    Heitor Murilo Gomes (heitor dot gomes at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • ensembleLearnerOption

        public ClassOption ensembleLearnerOption
      • hoeffdingTreeFeatureImportanceOption

        public ClassOption hoeffdingTreeFeatureImportanceOption
      • featureImportances

        protected double[] featureImportances
    • Constructor Detail

      • FeatureImportanceHoeffdingTreeEnsemble

        public FeatureImportanceHoeffdingTreeEnsemble()
    • Method Detail

      • getTopKFeatures

        public int[] getTopKFeatures​(int k,
                                     boolean normalize)
        Description copied from interface: FeatureImportanceClassifier
        The output is an int array whose values are the original feature indices and whose order gives their ranking. The size of this array is expected to be less than the size of the complete set of features.
        Specified by:
        getTopKFeatures in interface FeatureImportanceClassifier
        Returns:
        the k features with the highest scores.
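
        The ranking described above can be illustrated with a minimal, self-contained sketch. It is not the MOA implementation: the class name TopKFeaturesSketch and the method topKIndices are hypothetical, and the effect of the normalize flag is not described on this page, so it is left out. The sketch only assumes that a higher score means a more important feature.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not part of MOA.
public class TopKFeaturesSketch {

    /**
     * Returns the indices of the k highest scores, best first.
     * If k exceeds the number of features, all indices are returned.
     */
    public static int[] topKIndices(double[] scores, int k) {
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending score. The sort is stable, so
        // features with equal scores keep their original index order.
        Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));

        int n = Math.min(Math.max(k, 0), scores.length);
        int[] top = new int[n];
        for (int i = 0; i < n; i++) {
            top[i] = order[i];
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up importance scores for four features.
        double[] importances = {0.10, 0.45, 0.05, 0.40};
        System.out.println(Arrays.toString(topKIndices(importances, 2))); // prints [1, 3]
    }
}
```

        In this sketch each returned value is a position in the original feature array, and the position within the returned array is the rank, which matches the int[] contract of getTopKFeatures.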
      • resetLearningImpl

        public void resetLearningImpl()
        Description copied from class: AbstractClassifier
        Resets this classifier. It must be similar to starting a new classifier from scratch.

        The ...Impl methods exist to ease the programmer's burden: overriding them does not require remembering to call super. Note that failing to override this method produces a compiler error.
        Specified by:
        resetLearningImpl in class AbstractClassifier
      • trainOnInstanceImpl

        public void trainOnInstanceImpl​(Instance instance)
        Description copied from class: AbstractClassifier
        Trains this classifier incrementally using the given instance.

        The ...Impl methods exist to ease the programmer's burden: overriding them does not require remembering to call super. Note that failing to override this method produces a compiler error.
        Specified by:
        trainOnInstanceImpl in class AbstractClassifier
        Parameters:
        instance - the instance to be used for training
      • getVotesForInstance

        public double[] getVotesForInstance​(Instance instance)
        Description copied from interface: Classifier
        Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero.
        Specified by:
        getVotesForInstance in interface Classifier
        Specified by:
        getVotesForInstance in class AbstractClassifier
        Parameters:
        instance - the instance to be classified
        Returns:
        an array containing the estimated membership probabilities of the test instance in each class
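
        A caller typically turns the returned array into a single prediction. The following is a minimal sketch of that step, not MOA code: the class VotesSketch and both methods are hypothetical, and it assumes the votes may be unnormalized non-negative weights. It treats an all-zero array as "unclassified", as described above.

```java
// Hypothetical illustration only; this is not part of MOA.
public class VotesSketch {

    /**
     * Returns the index of the largest vote, or -1 when the array is
     * empty or all zero (the instance is treated as unclassified).
     */
    public static int predictedClass(double[] votes) {
        int best = -1;
        double bestVote = 0.0;
        for (int i = 0; i < votes.length; i++) {
            if (votes[i] > bestVote) {
                best = i;
                bestVote = votes[i];
            }
        }
        return best;
    }

    /**
     * Rescales non-negative votes so that they sum to 1.
     * An all-zero input is returned as an all-zero array.
     */
    public static double[] toProbabilities(double[] votes) {
        double sum = 0.0;
        for (double v : votes) {
            sum += v;
        }
        double[] probabilities = new double[votes.length];
        if (sum > 0.0) {
            for (int i = 0; i < votes.length; i++) {
                probabilities[i] = votes[i] / sum;
            }
        }
        return probabilities;
    }

    public static void main(String[] args) {
        double[] votes = {1.0, 3.0, 2.0}; // made-up votes for three classes
        System.out.println(predictedClass(votes)); // prints 1
    }
}
```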
      • getModelMeasurementsImpl

        protected Measurement[] getModelMeasurementsImpl()
        Description copied from class: AbstractClassifier
        Gets the current measurements of this classifier.

        The ...Impl methods exist to ease the programmer's burden: overriding them does not require remembering to call super. Note that failing to override this method produces a compiler error.
        Specified by:
        getModelMeasurementsImpl in class AbstractClassifier
        Returns:
        an array of measurements to be used in evaluation tasks
      • getModelDescription

        public void getModelDescription​(StringBuilder out,
                                        int indent)
        Description copied from class: AbstractClassifier
        Appends a string representation of the model to the given StringBuilder.
        Specified by:
        getModelDescription in class AbstractClassifier
        Parameters:
        out - the StringBuilder to which the description is appended
        indent - the number of characters to indent
      • isRandomizable

        public boolean isRandomizable()
        Description copied from interface: Learner
        Gets whether this learner needs a random seed. Examples of methods that need a random seed are bagging and boosting.
        Specified by:
        isRandomizable in interface Learner<Example<Instance>>
        Returns:
        true if the learner needs a random seed.