Package weka.classifiers.meta
Class ClassifierCascade
- java.lang.Object
- weka.classifiers.AbstractClassifier
- weka.classifiers.MultipleClassifiersCombiner
- weka.classifiers.RandomizableMultipleClassifiersCombiner
- weka.classifiers.meta.ClassifierCascade
- All Implemented Interfaces:
- Serializable, Cloneable, weka.classifiers.Classifier, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler
public class ClassifierCascade extends weka.classifiers.RandomizableMultipleClassifiersCombiner
Generates a classifier cascade, with each deeper level of classifiers being built on the input data and either the class distributions (nominal class) or classification (numeric class) of the classifiers of the previous level in the cascade.
The build process is stopped when either the maximum number of levels is reached, the termination criterion is satisfied or no further improvement is achieved.
If a level performs worse than the previous one, the build process is terminated immediately and that level is discarded.
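The stopping rules described above can be illustrated with a small, self-contained sketch. It does not use Weka and is not the class's actual implementation: the class name CascadeStopSketch, the method levelsKept and the BELOW constant are hypothetical, and the sketch assumes that a level whose improvement falls short of the minimum is still kept before the build stops.

```java
final class CascadeStopSketch {

    /** ABOVE is the documented default; BELOW is an assumed counterpart. */
    enum ThresholdCheck { ABOVE, BELOW }

    /**
     * Returns how many levels would be kept, given one (hypothetical)
     * validation statistic per candidate level.
     */
    static int levelsKept(double[] levelStats, int maxLevels, double threshold,
                          ThresholdCheck check, double minImprovement) {
        int kept = 0;
        double previous = Double.NaN;
        for (double current : levelStats) {
            if (kept >= maxLevels) {
                break; // maximum number of levels reached
            }
            if (kept > 0) {
                // improvement is measured in the direction of the threshold check
                double gain = (check == ThresholdCheck.ABOVE)
                        ? current - previous
                        : previous - current;
                if (gain < 0) {
                    break; // worse than the previous level: discard this level
                }
                kept++;
                if (gain < minImprovement) {
                    break; // assumption: level is kept, but no further levels are built
                }
            } else {
                kept++;
            }
            boolean reached = (check == ThresholdCheck.ABOVE)
                    ? current >= threshold
                    : current <= threshold;
            if (reached) {
                break; // termination criterion satisfied
            }
            previous = current;
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] stats = {70.0, 80.0, 85.0, 84.0};
        System.out.println(levelsKept(stats, 10, 90.0, ThresholdCheck.ABOVE, 0.01));
    }
}
```

In the example run, the fourth level scores lower than the third, so only the first three levels are kept.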
Valid options are:
-max-levels <value> The maximum number of levels to build. (default: 10)
-statistic <value> The statistic to evaluate on. (default: Percent correct)
-threshold <value> The threshold that, when reached, terminates the build process. (default: 90.0)
-threshold-check <value> How to apply the provided threshold. (default: ABOVE)
-min-improvement <value> The minimum improvement between levels, otherwise the build process gets terminated. (default: 0.01)
-num-folds <value> The number of folds to use for internal cross-validation. (default: 10)
-num-threads <value> The number of threads to use. (default: -1)
-holdout-percentage <value> The size of the validation set in percent (0-100). (default: 20.0)
-class-index <value> The 0-based index of the class-label to use for class-label-based statistics. (default: 0)
-combination <value> Determines how to combine the statistics. (default: MEDIAN)
-S <num> Random number seed. (default 1)
-B <classifier specification> Full class name of classifier to include, followed by scheme options. May be specified multiple times. (default: "weka.classifiers.rules.ZeroR")
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
Options specific to classifier weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
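As an illustration of how the options listed above fit together, the following sketch assembles an option array in plain Java. It is self-contained and does not call Weka; the class name CascadeOptionsSketch and the helper buildOptions are hypothetical, and the two base classifiers are only example choices. With Weka on the classpath, such an array could presumably be handed to setOptions(String[]).

```java
import java.util.ArrayList;
import java.util.List;

final class CascadeOptionsSketch {

    /** Builds an option array using flags from the option list above. */
    static String[] buildOptions(int maxLevels, double threshold, int numFolds,
                                 String... baseClassifiers) {
        List<String> opts = new ArrayList<>();
        opts.add("-max-levels");
        opts.add(Integer.toString(maxLevels));
        opts.add("-threshold");
        opts.add(Double.toString(threshold));
        opts.add("-num-folds");
        opts.add(Integer.toString(numFolds));
        // -B may be specified multiple times, once per base classifier
        for (String spec : baseClassifiers) {
            opts.add("-B");
            opts.add(spec);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(5, 95.0, 10,
                "weka.classifiers.trees.J48",
                "weka.classifiers.bayes.NaiveBayes");
        System.out.println(String.join(" ", options));
    }
}
```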
- Version:
- $Revision$
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
- static class ClassifierCascade.Combination: Defines how to combine the predictions of the final layer and turn it into actual predictions.
- static class ClassifierCascade.ThresholdCheck: Defines how to check the threshold.
-
Field Summary
Fields (modifier and type, field, description):
- static String ATTRIBUTE_PREFIX: the prefix for the additional cascade attributes.
- protected static String CLASS_INDEX
- protected static String COMBINATION
- static int DEFAULT_CLASS_INDEX
- static ClassifierCascade.Combination DEFAULT_COMBINATION
- static double DEFAULT_HOLDOUT_PERCENTAGE
- static int DEFAULT_MAX_LEVELS
- static double DEFAULT_MIN_IMPROVEMENT
- static int DEFAULT_NUM_FOLDS
- static int DEFAULT_NUM_THREADS
- static EvaluationStatistic DEFAULT_STATISTIC
- static double DEFAULT_THRESHOLD
- static ClassifierCascade.ThresholdCheck DEFAULT_THRESHOLD_CHECK
- protected static String HOLDOUT_PERCENTAGE
- protected List<List<weka.classifiers.Classifier>> m_Cascade: the cascade.
- protected int m_ClassIndex: the class index.
- protected ClassifierCascade.Combination m_Combination: how to combine the statistics.
- protected double m_HoldOutPercentage: the percentage to use for validation set to determine termination criterion (0-100).
- protected int m_MaxLevels: the maximum number of levels in the cascade.
- protected weka.core.Instances m_MetaLevelHeader: the meta-level structure.
- protected List<Integer> m_MetaLevelStart: the start indices for the classifier stats in the meta-levels.
- protected double m_MinImprovement: the minimum improvement between levels that the statistic must improve.
- protected boolean m_Nominal: whether regression or classification.
- protected int m_NumFolds: the number of folds for cross-validation.
- protected int m_NumThreads: the number of threads to use.
- protected EvaluationStatistic m_Statistic: the statistic to use for termination.
- protected double m_Threshold: the threshold for the statistic for termination.
- protected ClassifierCascade.ThresholdCheck m_ThresholdCheck: whether to go below or above the threshold.
- protected static String MAX_LEVELS
- protected static String MIN_IMPROVEMENT
- protected static String NUM_FOLDS
- protected static String NUM_THREADS
- protected static String STATISTIC
- protected static String THRESHOLD
- protected static String THRESHOLD_CHECK
-
Constructor Summary
Constructors (constructor, description):
- ClassifierCascade()
-
Method Summary
Methods (modifier and type, method, description):
- protected void addMetaLevelPrediction(weka.core.Instance inst, int index, double cls): Adds the classification of the specified classifier to the meta-level instance.
- protected void addMetaLevelPrediction(weka.core.Instance inst, int index, double[] dist): Adds the class distribution of the specified classifier to the meta-level instance.
- protected double applyCombination(double[] stats): Applies the selected combination to the array.
- void buildClassifier(weka.core.Instances data): Builds the classifier.
- double classifyInstance(weka.core.Instance instance): Returns the classification for the instance.
- String classIndexTipText(): Returns the tip text for this property.
- String combinationTipText(): Returns the tip text for this property.
- protected weka.core.Instances createMetaLevelHeader(weka.core.Instances data): Generates the dataset structure for the meta-levels.
- protected weka.core.Instance createMetaLevelInstance(weka.core.Instances metaLevel, weka.core.Instance data): Generates an instance for the meta-level using the original data.
- double[] distributionForInstance(weka.core.Instance instance): Returns the distribution for the instance.
- weka.core.Capabilities getCapabilities(): Returns combined capabilities of the base classifiers, i.e., the capabilities all of them have in common.
- int getClassIndex(): the class index.
- ClassifierCascade.Combination getCombination(): how to combine the statistics.
- double getHoldOutPercentage(): the percentage to use for validation set to determine termination criterion (0-100).
- int getMaxLevels(): the maximum number of levels in the cascade.
- double getMinImprovement(): the minimum improvement between levels that the statistic must improve.
- int getNumFolds(): the number of folds for cross-validation.
- int getNumThreads(): the number of threads to use.
- String[] getOptions(): Gets the current settings of the classifier.
- String getRevision(): Returns the revision string.
- EvaluationStatistic getStatistic(): the statistic to use for termination.
- double getThreshold(): the threshold for the statistic for termination.
- ClassifierCascade.ThresholdCheck getThresholdCheck(): whether to go below or above the threshold.
- String globalInfo(): Returns a string describing the classifier.
- String holdOutPercentageTipText(): Returns the tip text for this property.
- Enumeration listOptions(): Returns an enumeration describing the available options.
- static void main(String[] args): Main method for executing the class.
- String maxLevelsTipText(): Returns the tip text for this property.
- String minImprovementTipText(): Returns the tip text for this property.
- String numFoldsTipText(): Returns the tip text for this property.
- String numThreadsTipText(): Returns the tip text for this property.
- protected Object predictionForInstance(weka.core.Instance instance, boolean distribution): Returns the prediction for the instance.
- void setClassIndex(int classIndex): the class index.
- void setCombination(ClassifierCascade.Combination combination): how to combine the statistics.
- void setHoldOutPercentage(double holdOutPercentage): the percentage to use for validation set to determine termination criterion (0-100).
- void setMaxLevels(int maxLevels): the maximum number of levels in the cascade.
- void setMinImprovement(double minImprovement): the minimum improvement between levels that the statistic must improve.
- void setNumFolds(int numFolds): the number of folds for cross-validation.
- void setNumThreads(int numThreads): the number of threads to use.
- void setOptions(String[] options): Parses a given list of options.
- void setStatistic(EvaluationStatistic statistic): the statistic to use for termination.
- void setThreshold(double threshold): the threshold for the statistic for termination.
- void setThresholdCheck(ClassifierCascade.ThresholdCheck thresholdCheck): whether to go below or above the threshold.
- String statisticTipText(): Returns the tip text for this property.
- String thresholdCheckTipText(): Returns the tip text for this property.
- String thresholdTipText(): Returns the tip text for this property.
- String toString(): Outputs a short description of the classifier model.
-
Methods inherited from class weka.classifiers.RandomizableMultipleClassifiersCombiner
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.MultipleClassifiersCombiner
classifiersTipText, getClassifier, getClassifiers, getClassifierSpec, postExecution, preExecution, setClassifiers
-
Methods inherited from class weka.classifiers.AbstractClassifier
batchSizeTipText, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
-
-
-
-
Field Detail
-
ATTRIBUTE_PREFIX
public static final String ATTRIBUTE_PREFIX
the prefix for the additional cascade attributes.
- See Also:
- Constant Field Values
-
DEFAULT_MAX_LEVELS
public static final int DEFAULT_MAX_LEVELS
- See Also:
- Constant Field Values
-
DEFAULT_STATISTIC
public static final EvaluationStatistic DEFAULT_STATISTIC
-
DEFAULT_THRESHOLD
public static final double DEFAULT_THRESHOLD
- See Also:
- Constant Field Values
-
DEFAULT_THRESHOLD_CHECK
public static final ClassifierCascade.ThresholdCheck DEFAULT_THRESHOLD_CHECK
-
DEFAULT_MIN_IMPROVEMENT
public static final double DEFAULT_MIN_IMPROVEMENT
- See Also:
- Constant Field Values
-
DEFAULT_NUM_FOLDS
public static final int DEFAULT_NUM_FOLDS
- See Also:
- Constant Field Values
-
DEFAULT_NUM_THREADS
public static final int DEFAULT_NUM_THREADS
- See Also:
- Constant Field Values
-
DEFAULT_HOLDOUT_PERCENTAGE
public static final double DEFAULT_HOLDOUT_PERCENTAGE
- See Also:
- Constant Field Values
-
DEFAULT_CLASS_INDEX
public static final int DEFAULT_CLASS_INDEX
- See Also:
- Constant Field Values
-
DEFAULT_COMBINATION
public static final ClassifierCascade.Combination DEFAULT_COMBINATION
-
MAX_LEVELS
protected static String MAX_LEVELS
-
STATISTIC
protected static String STATISTIC
-
THRESHOLD
protected static String THRESHOLD
-
THRESHOLD_CHECK
protected static String THRESHOLD_CHECK
-
MIN_IMPROVEMENT
protected static String MIN_IMPROVEMENT
-
NUM_FOLDS
protected static String NUM_FOLDS
-
NUM_THREADS
protected static String NUM_THREADS
-
HOLDOUT_PERCENTAGE
protected static String HOLDOUT_PERCENTAGE
-
CLASS_INDEX
protected static String CLASS_INDEX
-
COMBINATION
protected static String COMBINATION
-
m_MaxLevels
protected int m_MaxLevels
the maximum number of levels in the cascade.
-
m_Statistic
protected EvaluationStatistic m_Statistic
the statistic to use for termination.
-
m_Threshold
protected double m_Threshold
the threshold for the statistic for termination.
-
m_ThresholdCheck
protected ClassifierCascade.ThresholdCheck m_ThresholdCheck
whether to go below or above the threshold.
-
m_MinImprovement
protected double m_MinImprovement
the minimum improvement between levels that the statistic must improve.
-
m_NumFolds
protected int m_NumFolds
the number of folds for cross-validation.
-
m_NumThreads
protected int m_NumThreads
the number of threads to use.
-
m_HoldOutPercentage
protected double m_HoldOutPercentage
the percentage to use for validation set to determine termination criterion (0-100).
-
m_ClassIndex
protected int m_ClassIndex
the class index.
-
m_Combination
protected ClassifierCascade.Combination m_Combination
how to combine the statistics.
-
m_MetaLevelHeader
protected weka.core.Instances m_MetaLevelHeader
the meta-level structure.
-
m_MetaLevelStart
protected List<Integer> m_MetaLevelStart
the start indices for the classifier stats in the meta-levels.
-
m_Nominal
protected boolean m_Nominal
whether the class is nominal (classification) rather than numeric (regression).
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter GUI
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface weka.core.OptionHandler
- Overrides:
- listOptions in class weka.classifiers.RandomizableMultipleClassifiersCombiner
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options.
- Specified by:
- setOptions in interface weka.core.OptionHandler
- Overrides:
- setOptions in class weka.classifiers.RandomizableMultipleClassifiersCombiner
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
- getOptions in interface weka.core.OptionHandler
- Overrides:
- getOptions in class weka.classifiers.RandomizableMultipleClassifiersCombiner
- Returns:
- an array of strings suitable for passing to setOptions
-
setMaxLevels
public void setMaxLevels(int maxLevels)
the maximum number of levels in the cascade.
-
getMaxLevels
public int getMaxLevels()
the maximum number of levels in the cascade.
-
maxLevelsTipText
public String maxLevelsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setStatistic
public void setStatistic(EvaluationStatistic statistic)
the statistic to use for termination.
-
getStatistic
public EvaluationStatistic getStatistic()
the statistic to use for termination.
-
statisticTipText
public String statisticTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setThreshold
public void setThreshold(double threshold)
the threshold for the statistic for termination.
-
getThreshold
public double getThreshold()
the threshold for the statistic for termination.
-
thresholdTipText
public String thresholdTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setThresholdCheck
public void setThresholdCheck(ClassifierCascade.ThresholdCheck thresholdCheck)
whether to go below or above the threshold.
-
getThresholdCheck
public ClassifierCascade.ThresholdCheck getThresholdCheck()
whether to go below or above the threshold.
-
thresholdCheckTipText
public String thresholdCheckTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMinImprovement
public void setMinImprovement(double minImprovement)
the minimum improvement between levels that the statistic must improve.
-
getMinImprovement
public double getMinImprovement()
the minimum improvement between levels that the statistic must improve.
-
minImprovementTipText
public String minImprovementTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumFolds
public void setNumFolds(int numFolds)
the number of folds for cross-validation.
-
getNumFolds
public int getNumFolds()
the number of folds for cross-validation.
-
numFoldsTipText
public String numFoldsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumThreads
public void setNumThreads(int numThreads)
the number of threads to use.
-
getNumThreads
public int getNumThreads()
the number of threads to use.
-
numThreadsTipText
public String numThreadsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setHoldOutPercentage
public void setHoldOutPercentage(double holdOutPercentage)
the percentage to use for validation set to determine termination criterion (0-100).
-
getHoldOutPercentage
public double getHoldOutPercentage()
the percentage to use for validation set to determine termination criterion (0-100).
-
holdOutPercentageTipText
public String holdOutPercentageTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setClassIndex
public void setClassIndex(int classIndex)
the class index.
-
getClassIndex
public int getClassIndex()
the class index.
-
classIndexTipText
public String classIndexTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCombination
public void setCombination(ClassifierCascade.Combination combination)
how to combine the statistics.
-
getCombination
public ClassifierCascade.Combination getCombination()
how to combine the statistics.
-
combinationTipText
public String combinationTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getCapabilities
public weka.core.Capabilities getCapabilities()
Returns combined capabilities of the base classifiers, i.e., the capabilities all of them have in common.
- Specified by:
- getCapabilities in interface weka.core.CapabilitiesHandler
- Specified by:
- getCapabilities in interface weka.classifiers.Classifier
- Overrides:
- getCapabilities in class weka.classifiers.MultipleClassifiersCombiner
- Returns:
- the capabilities of the base classifiers
-
createMetaLevelHeader
protected weka.core.Instances createMetaLevelHeader(weka.core.Instances data)
Generates the dataset structure for the meta-levels.
- Parameters:
- data - the training data
- Returns:
- the structure
-
createMetaLevelInstance
protected weka.core.Instance createMetaLevelInstance(weka.core.Instances metaLevel, weka.core.Instance data)
Generates an instance for the meta-level using the original data.
- Parameters:
- data - the original data
- Returns:
- the meta-level instance, but with missing meta-level data
-
addMetaLevelPrediction
protected void addMetaLevelPrediction(weka.core.Instance inst, int index, double[] dist)
Adds the class distribution of the specified classifier to the meta-level instance.
- Parameters:
- inst - the meta-level instance to modify
- index - the index of the classifier
- dist - the class distribution to add
-
addMetaLevelPrediction
protected void addMetaLevelPrediction(weka.core.Instance inst, int index, double cls)
Adds the classification of the specified classifier to the meta-level instance.
- Parameters:
- inst - the meta-level instance to modify
- index - the index of the classifier
- cls - the classification
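The two addMetaLevelPrediction overloads write either a class distribution or a single classification into the slots reserved for a given classifier. The following self-contained sketch illustrates one possible layout using plain arrays instead of Weka instances. The layout itself (input attributes first, then one fixed-size block per classifier, unfilled slots marked with NaN, class attribute omitted) is an assumption made for illustration, and the names MetaLevelLayoutSketch, startIndices, newMetaValues and addPrediction are hypothetical.

```java
import java.util.Arrays;

final class MetaLevelLayoutSketch {

    /** Start index of each classifier's block, placed after the input attributes. */
    static int[] startIndices(int numInputAttributes, int numClassifiers, int slotsPerClassifier) {
        int[] starts = new int[numClassifiers];
        for (int i = 0; i < numClassifiers; i++) {
            starts[i] = numInputAttributes + i * slotsPerClassifier;
        }
        return starts;
    }

    /** Copies the input values and leaves all meta-level slots unfilled (NaN). */
    static double[] newMetaValues(double[] inputValues, int numClassifiers, int slotsPerClassifier) {
        double[] meta = new double[inputValues.length + numClassifiers * slotsPerClassifier];
        Arrays.fill(meta, Double.NaN);
        System.arraycopy(inputValues, 0, meta, 0, inputValues.length);
        return meta;
    }

    /** Nominal class: stores the class distribution of classifier 'index'. */
    static void addPrediction(double[] meta, int[] starts, int index, double[] dist) {
        System.arraycopy(dist, 0, meta, starts[index], dist.length);
    }

    /** Numeric class: stores the single classification of classifier 'index'. */
    static void addPrediction(double[] meta, int[] starts, int index, double cls) {
        meta[starts[index]] = cls;
    }

    public static void main(String[] args) {
        int[] starts = startIndices(2, 2, 3);
        double[] meta = newMetaValues(new double[] {1.0, 2.0}, 2, 3);
        addPrediction(meta, starts, 1, new double[] {0.2, 0.3, 0.5});
        System.out.println(Arrays.toString(meta));
    }
}
```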
-
applyCombination
protected double applyCombination(double[] stats)
Applies the selected combination to the array.
- Parameters:
- stats - the statistic values to combine
- Returns:
- the combination
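For the default combination (MEDIAN), combining an array of statistic values could look like the following sketch. It is a plain-Java illustration rather than the class's own code: the name MedianCombinationSketch is hypothetical, and averaging the two middle values for an even number of entries is an assumption.

```java
import java.util.Arrays;

final class MedianCombinationSketch {

    /** Returns the median of the values without modifying the input array. */
    static double median(double[] stats) {
        if (stats.length == 0) {
            throw new IllegalArgumentException("no statistics to combine");
        }
        double[] sorted = stats.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        // assumption: even count averages the two middle values
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new double[] {88.0, 91.5, 79.0}));
    }
}
```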
-
buildClassifier
public void buildClassifier(weka.core.Instances data) throws Exception
Builds the classifier.
- Parameters:
- data - the training data
- Throws:
- Exception - if build fails
-
predictionForInstance
protected Object predictionForInstance(weka.core.Instance instance, boolean distribution) throws Exception
Returns the prediction for the instance.
- Parameters:
- instance - the instance to get the class distribution for
- distribution - class distribution or classification
- Returns:
- the class distribution or prediction
- Throws:
- Exception - if prediction fails
-
distributionForInstance
public double[] distributionForInstance(weka.core.Instance instance) throws Exception
Returns the distribution for the instance.
- Specified by:
- distributionForInstance in interface weka.classifiers.Classifier
- Overrides:
- distributionForInstance in class weka.classifiers.AbstractClassifier
- Parameters:
- instance - the instance to get the class distribution for
- Returns:
- the class distribution
- Throws:
- Exception - if prediction fails
-
classifyInstance
public double classifyInstance(weka.core.Instance instance) throws Exception
Returns the classification for the instance.
- Specified by:
- classifyInstance in interface weka.classifiers.Classifier
- Overrides:
- classifyInstance in class weka.classifiers.AbstractClassifier
- Parameters:
- instance - the instance to get the classification for
- Returns:
- the classification
- Throws:
- Exception - if prediction fails
-
toString
public String toString()
Outputs a short description of the classifier model.
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface weka.core.RevisionHandler
- Overrides:
- getRevision in class weka.classifiers.AbstractClassifier
- Returns:
- the revision
-
-