Package weka.classifiers.meta
Class VotedImbalance
- java.lang.Object
  - weka.classifiers.AbstractClassifier
    - weka.classifiers.SingleClassifierEnhancer
      - weka.classifiers.RandomizableSingleClassifierEnhancer
        - weka.classifiers.meta.VotedImbalance
- All Implemented Interfaces:
Serializable, Cloneable, weka.classifiers.Classifier, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, ModelOutputHandler, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler
public class VotedImbalance extends weka.classifiers.RandomizableSingleClassifierEnhancer implements ModelOutputHandler
Generates an ensemble using the following approach:
- do x times:
* create new dataset, resampled with specified bias
* build base classifier with it
If no classifier gets built at all, use ZeroR as backup model, built on the full dataset.
At prediction time, the Vote meta-classifier (using the pre-built classifiers) is used to determine the class probabilities or the regression value.
Instead of using a fixed number of resampled models, you can also specify thresholds (= probability that the minority class does not reach), each with an associated number of resampled models to use.
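The approach described above can be sketched in plain Java without any Weka dependency. This is only an illustration under stated assumptions: the class VotedEnsembleSketch and its methods are hypothetical, the "base classifier" is reduced to the class frequencies of its training sample, the resampling works on class labels only, and the votes are combined by averaging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class VotedEnsembleSketch {

    /** Hypothetical base model: the class frequencies of the sample it was trained on. */
    static double[] trainPrior(int[] labels, int numClasses) {
        if (labels.length == 0) {
            throw new IllegalArgumentException("empty training sample");
        }
        double[] dist = new double[numClasses];
        for (int y : labels) {
            dist[y] += 1.0 / labels.length;
        }
        return dist;
    }

    /**
     * Draws labels.length class labels. Each class is chosen with a probability
     * blended between its observed frequency (bias 0) and uniform (bias 1).
     * ASSUMPTION: a simplified stand-in for the biased resampling of instances.
     */
    static int[] resample(int[] labels, int numClasses, double bias, Random rnd) {
        double[] freq = trainPrior(labels, numClasses);
        int[] sample = new int[labels.length];
        for (int i = 0; i < sample.length; i++) {
            double u = rnd.nextDouble();
            double cumulative = 0.0;
            int chosen = numClasses - 1; // default guards against rounding error
            for (int c = 0; c < numClasses; c++) {
                cumulative += (1.0 - bias) * freq[c] + bias / numClasses;
                if (u < cumulative) {
                    chosen = c;
                    break;
                }
            }
            sample[i] = chosen;
        }
        return sample;
    }

    /** Builds numModels members on resampled data and averages their distributions. */
    static double[] buildAndVote(int[] labels, int numClasses, int numModels, double bias, long seed) {
        Random rnd = new Random(seed);
        List<double[]> models = new ArrayList<>();
        for (int i = 0; i < numModels; i++) {
            try {
                models.add(trainPrior(resample(labels, numClasses, bias, rnd), numClasses));
            } catch (RuntimeException e) {
                // a member that fails to build is left out of the ensemble
            }
        }
        if (models.isEmpty()) {
            return trainPrior(labels, numClasses); // backup model on the full dataset
        }
        double[] avg = new double[numClasses];
        for (double[] m : models) {
            for (int c = 0; c < numClasses; c++) {
                avg[c] += m[c] / models.size();
            }
        }
        return avg;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
        System.out.println(Arrays.toString(buildAndVote(labels, 2, 10, 1.0, 1L)));
    }
}
```

With a bias of 1 the averaged distribution moves towards equal class shares; with zero members the sketch falls back to the distribution of the full dataset, mirroring the role of the backup model.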
Valid options are:
-num-slots <num> Number of execution slots. (default: 1 - i.e. no parallelism)
-combination-rule <AVG|PROD|MAJ|MIN|MAX|MED> The combination rule to use (default: AVG)
-num-balanced <num> Number of balanced datasets (= number of classifiers) to create. (default: 1)
-thresholds <prob=# [prob=# [...]]> Thresholds for number of resampled models (probability=#models); blank-separated list. (default: none)
-B <num> Bias factor towards uniform class distribution. 0 = distribution in input data -- 1 = uniform distribution. (default 0)
-no-replacement Disables replacement of instances (default: with replacement)
-suppress-model-output Suppress model output (default: no)
-S <num> Random number seed. (default 1)
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
Options specific to classifier weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
Options after -- are passed to the designated classifier.
- Version:
- $Revision$
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (modifier and type, field, description):
protected int m_ActualNumBalanced
the actual number of balanced datasets to generate.
protected weka.classifiers.rules.ZeroR m_BackupModel
The backup classifier, in case no ensemble could be constructed at prediction time.
protected double m_Bias
the bias for the dataset balancing (0 = distribution in input data -- 1 = uniform distribution).
protected weka.classifiers.Classifier[] m_Classifiers
the actual classifiers in use.
protected int m_CombinationRule
Combination Rule variable.
protected int m_Completed
The number of classifiers completed so far
protected weka.core.Instances m_Data
For holding the original training set temporarily.
protected weka.classifiers.Classifier m_Ensemble
the vote classifier in use.
protected ThreadPoolExecutor m_ExecutorPool
Pool of threads to train models with
protected int m_Failed
The number of classifiers that experienced a failure of some sort during construction.
protected weka.core.Instances m_Header
The header of the training set.
protected boolean m_NoReplacement
Whether to perform sampling with replacement or without.
protected int m_NumBalanced
the number of balanced datasets to generate.
protected int m_NumExecutionSlots
The number of threads to have executing at any one time
protected double m_SamplePercentage
the sample percentage to use (0-100).
protected boolean m_SuppressModelOutput
whether to suppress the model output.
protected BaseKeyValuePair[] m_Thresholds
the thresholds to use (pair: probability minority class = num balanced).
-
Constructor Summary
Constructors
VotedImbalance()
-
Method Summary
Methods (modifier and type, method, description):
String biasTipText()
Returns the tip text for this property.
void buildClassifier(weka.core.Instances data)
Stump method for building the classifiers
protected void buildClassifiers()
Does the actual construction of the ensemble.
double classifyInstance(weka.core.Instance instance)
Classifies the given test instance.
String combinationRuleTipText()
Returns the tip text for this property.
protected void completedClassifier(int index, boolean success)
Records the completion of the training of a single classifier.
protected weka.classifiers.Classifier constructEnsemble()
Constructs the ensemble.
double[] distributionForInstance(weka.core.Instance instance)
Predicts the class memberships for a given instance.
double getBias()
Gets the bias towards a uniform class.
weka.core.Capabilities getCapabilities()
Returns default capabilities of the base classifier.
weka.core.SelectedTag getCombinationRule()
Gets the combination rule used
protected weka.filters.Filter getFilter(int index, int seed)
Gets a filter for a particular index.
boolean getNoReplacement()
Gets whether instances are drawn with or without replacement.
int getNumBalanced()
Returns the number of balanced datasets to generate (= #classifiers).
int getNumExecutionSlots()
Get the number of execution slots (threads) to use for building the members of the ensemble.
String[] getOptions()
Gets the current settings of the classifier.
String getRevision()
Returns the revision string.
boolean getSuppressModelOutput()
Returns whether to output the model with the toString() method or not.
String getThresholds()
Returns the pairs of threshold/number of resampled models.
protected weka.core.Instances getTrainingSet(int index, int seed)
Gets a training set for a particular index.
String globalInfo()
Returns a string describing the classifier.
Enumeration listOptions()
Returns an enumeration describing the available options.
static void main(String[] args)
Main method for running this class from commandline.
String noReplacementTipText()
Returns the tip text for this property.
String numBalancedTipText()
Returns the tip text for this property.
String numExecutionSlotsTipText()
Returns the tip text for this property.
void setBias(double value)
Sets the bias towards a uniform class.
void setClassifier(weka.classifiers.Classifier newClassifier)
void setCombinationRule(weka.core.SelectedTag value)
Sets the combination rule to use.
void setNoReplacement(boolean value)
Sets whether instances are drawn with or without replacement.
void setNumBalanced(int value)
Set the number of balanced datasets to generate (= #classifiers).
void setNumExecutionSlots(int value)
Set the number of execution slots (threads) to use for building the members of the ensemble.
void setOptions(String[] options)
Parses a given list of options.
void setSuppressModelOutput(boolean value)
Sets whether to output the model with the toString() method or not.
void setThresholds(String value)
Set the pairs of threshold/number of resampled models.
protected void startExecutorPool()
Start the pool of execution threads.
String suppressModelOutputTipText()
Returns the tip text for this property.
String thresholdsTipText()
Returns the tip text for this property.
String toString()
Returns a string representation of the classifier.
-
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, defaultClassifierOptions, defaultClassifierString, getClassifier, getClassifierSpec, postExecution, preExecution
-
Methods inherited from class weka.classifiers.AbstractClassifier
batchSizeTipText, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
-
-
-
-
Field Detail
-
m_Classifiers
protected weka.classifiers.Classifier[] m_Classifiers
the actual classifiers in use.
-
m_NumExecutionSlots
protected int m_NumExecutionSlots
The number of threads to have executing at any one time
-
m_CombinationRule
protected int m_CombinationRule
Combination Rule variable.
-
m_NumBalanced
protected int m_NumBalanced
the number of balanced datasets to generate.
-
m_Thresholds
protected BaseKeyValuePair[] m_Thresholds
the thresholds to use (pair: probability minority class = num balanced).
-
m_ActualNumBalanced
protected int m_ActualNumBalanced
the actual number of balanced datasets to generate.
-
m_Bias
protected double m_Bias
the bias for the dataset balancing (0 = distribution in input data -- 1 = uniform distribution).
-
m_NoReplacement
protected boolean m_NoReplacement
Whether to perform sampling with replacement or without.
-
m_ExecutorPool
protected transient ThreadPoolExecutor m_ExecutorPool
Pool of threads to train models with
-
m_Completed
protected int m_Completed
The number of classifiers completed so far
-
m_Failed
protected int m_Failed
The number of classifiers that experienced a failure of some sort during construction.
-
m_Data
protected weka.core.Instances m_Data
For holding the original training set temporarily.
-
m_Header
protected weka.core.Instances m_Header
The header of the training set.
-
m_BackupModel
protected weka.classifiers.rules.ZeroR m_BackupModel
The backup classifier, in case no ensemble could be constructed at prediction time.
-
m_Ensemble
protected weka.classifiers.Classifier m_Ensemble
the vote classifier in use.
-
m_SamplePercentage
protected double m_SamplePercentage
the sample percentage to use (0-100).
-
m_SuppressModelOutput
protected boolean m_SuppressModelOutput
whether to suppress the model output.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the gui
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface weka.core.OptionHandler
- Overrides:
listOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options.
- Specified by:
setOptions in interface weka.core.OptionHandler
- Overrides:
setOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
getOptions in interface weka.core.OptionHandler
- Overrides:
getOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
setClassifier
public void setClassifier(weka.classifiers.Classifier newClassifier)
- Overrides:
setClassifier in class weka.classifiers.SingleClassifierEnhancer
-
setNumExecutionSlots
public void setNumExecutionSlots(int value)
Set the number of execution slots (threads) to use for building the members of the ensemble.
- Parameters:
value - the number of slots to use.
-
getNumExecutionSlots
public int getNumExecutionSlots()
Get the number of execution slots (threads) to use for building the members of the ensemble.
- Returns:
- the number of slots to use
-
numExecutionSlotsTipText
public String numExecutionSlotsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCombinationRule
public void setCombinationRule(weka.core.SelectedTag value)
Sets the combination rule to use. Values other than
- Parameters:
value - the combination rule method to use
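As an illustration of what the combination rules listed in the options (AVG, PROD, MAJ, MIN, MAX) could compute over the class distributions of the ensemble members, here is a minimal plain-Java sketch. The class CombinationRuleSketch is hypothetical, the final normalization and the tie-breaking (lowest class index wins) are assumptions, and MED is left out because it only makes sense for numeric predictions.

```java
import java.util.Arrays;

final class CombinationRuleSketch {

    /** Combines one class distribution per ensemble member into a single distribution. */
    static double[] combine(String rule, double[][] dists) {
        int numClasses = dists[0].length;
        double[] result = new double[numClasses];
        switch (rule) {
            case "AVG": // mean of the member probabilities
                for (double[] d : dists)
                    for (int c = 0; c < numClasses; c++) result[c] += d[c] / dists.length;
                break;
            case "PROD": // product of the member probabilities
                Arrays.fill(result, 1.0);
                for (double[] d : dists)
                    for (int c = 0; c < numClasses; c++) result[c] *= d[c];
                break;
            case "MIN": // smallest probability any member assigns to the class
                Arrays.fill(result, Double.POSITIVE_INFINITY);
                for (double[] d : dists)
                    for (int c = 0; c < numClasses; c++) result[c] = Math.min(result[c], d[c]);
                break;
            case "MAX": // largest probability any member assigns to the class
                for (double[] d : dists)
                    for (int c = 0; c < numClasses; c++) result[c] = Math.max(result[c], d[c]);
                break;
            case "MAJ": // each member votes for its most likely class
                for (double[] d : dists) result[argMax(d)] += 1.0;
                int winner = argMax(result); // ASSUMPTION: ties go to the lowest index
                Arrays.fill(result, 0.0);
                result[winner] = 1.0;
                break;
            default:
                throw new IllegalArgumentException("Unsupported rule: " + rule);
        }
        return normalize(result);
    }

    /** Index of the first maximum. */
    static int argMax(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) if (v[i] > v[best]) best = i;
        return best;
    }

    /** Rescales the values so that they sum to 1 (left unchanged if the sum is 0). */
    static double[] normalize(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        if (sum > 0) for (int i = 0; i < v.length; i++) v[i] /= sum;
        return v;
    }

    public static void main(String[] args) {
        double[][] dists = {{0.9, 0.1}, {0.6, 0.4}, {0.2, 0.8}};
        for (String rule : new String[] {"AVG", "PROD", "MAJ", "MIN", "MAX"}) {
            System.out.println(rule + " " + Arrays.toString(combine(rule, dists)));
        }
    }
}
```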
-
getCombinationRule
public weka.core.SelectedTag getCombinationRule()
Gets the combination rule used
- Returns:
- the combination rule used
-
combinationRuleTipText
public String combinationRuleTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumBalanced
public void setNumBalanced(int value)
Set the number of balanced datasets to generate (= #classifiers).
- Parameters:
value - the number of datasets
-
getNumBalanced
public int getNumBalanced()
Returns the number of balanced datasets to generate (= #classifiers).
- Returns:
- the number of datasets
-
numBalancedTipText
public String numBalancedTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setThresholds
public void setThresholds(String value)
Set the pairs of threshold/number of resampled models.
- Parameters:
value - the pairs (blank-separated list; probability=#models)
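To make the "probability=#models" format concrete, the following plain-Java sketch parses such a blank-separated list and picks a number of models for a given minority-class probability. The class ThresholdSketch is hypothetical, and the matching rule (the smallest threshold that the minority probability stays below wins, otherwise the default applies) is an assumption rather than the documented behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThresholdSketch {

    /** Parses a blank-separated list such as "0.1=10 0.3=5" into threshold -> #models. */
    static Map<Double, Integer> parse(String spec) {
        Map<Double, Integer> result = new LinkedHashMap<>();
        String trimmed = spec.trim();
        if (trimmed.isEmpty()) {
            return result; // default: no thresholds
        }
        for (String pair : trimmed.split("\\s+")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Expected probability=#models, got: " + pair);
            }
            result.put(Double.parseDouble(kv[0]), Integer.parseInt(kv[1]));
        }
        return result;
    }

    /**
     * ASSUMPTION: among all thresholds that the minority probability does not
     * reach, the smallest one decides; if none applies, defaultNum is used.
     */
    static int numModels(Map<Double, Integer> thresholds, double minorityProb, int defaultNum) {
        double best = Double.POSITIVE_INFINITY;
        int num = defaultNum;
        for (Map.Entry<Double, Integer> e : thresholds.entrySet()) {
            if (minorityProb < e.getKey() && e.getKey() < best) {
                best = e.getKey();
                num = e.getValue();
            }
        }
        return num;
    }

    public static void main(String[] args) {
        Map<Double, Integer> thresholds = parse("0.1=10 0.3=5");
        System.out.println(numModels(thresholds, 0.05, 1)); // very rare minority class
        System.out.println(numModels(thresholds, 0.40, 1)); // no threshold applies
    }
}
```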
-
getThresholds
public String getThresholds()
Returns the pairs of threshold/number of resampled models.
- Returns:
- the pairs
-
thresholdsTipText
public String thresholdsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setBias
public void setBias(double value)
Sets the bias towards a uniform class. A value of 0 leaves the class distribution as-is, a value of 1 ensures the class distributions are uniform in the output data.
- Parameters:
value - the new bias value, between 0 and 1.
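One plausible reading of the bias value is a linear blend between the observed class proportions and a uniform distribution. The sketch below shows that arithmetic in plain Java; the class BiasSketch and the interpolation formula are assumptions made for illustration, not a statement about the actual resampling filter used.

```java
import java.util.Arrays;

final class BiasSketch {

    /**
     * Target share of each class in the resampled data:
     * (1 - bias) * observed share + bias * uniform share.
     */
    static double[] targetProportions(int[] classCounts, double bias) {
        if (bias < 0.0 || bias > 1.0) {
            throw new IllegalArgumentException("bias must be between 0 and 1");
        }
        int total = 0;
        for (int count : classCounts) {
            total += count;
        }
        double[] proportions = new double[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            double observed = classCounts[i] / (double) total;
            double uniform = 1.0 / classCounts.length;
            proportions[i] = (1.0 - bias) * observed + bias * uniform;
        }
        return proportions;
    }

    public static void main(String[] args) {
        int[] counts = {90, 10}; // majority and minority class
        for (double bias : new double[] {0.0, 0.5, 1.0}) {
            System.out.println(bias + " -> " + Arrays.toString(targetProportions(counts, bias)));
        }
    }
}
```

For class counts of 90 and 10, a bias of 0.5 gives target shares of roughly 0.7 and 0.3, halfway between the input distribution and a uniform one.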
-
getBias
public double getBias()
Gets the bias towards a uniform class. A value of 0 leaves the class distribution as-is, a value of 1 ensures the class distributions are uniform in the output data.
- Returns:
- the current bias
-
biasTipText
public String biasTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNoReplacement
public void setNoReplacement(boolean value)
Sets whether instances are drawn with or without replacement.
- Parameters:
value - if true then the replacement of instances is disabled
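The difference between the two sampling modes can be sketched in plain Java as follows. The class ReplacementSketch and its method are hypothetical and work on instance indices only; they are not taken from the class documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ReplacementSketch {

    /** Draws 'size' indices from 0..n-1, with or without replacement. */
    static int[] sampleIndices(int n, int size, boolean noReplacement, Random rnd) {
        int[] sample = new int[size];
        if (noReplacement) {
            if (size > n) {
                throw new IllegalArgumentException("cannot draw more than n distinct indices");
            }
            // shuffle all indices and keep the first 'size': every index appears at most once
            List<Integer> all = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                all.add(i);
            }
            Collections.shuffle(all, rnd);
            for (int i = 0; i < size; i++) {
                sample[i] = all.get(i);
            }
        } else {
            // independent draws: the same index may appear several times
            for (int i = 0; i < size; i++) {
                sample[i] = rnd.nextInt(n);
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        System.out.println(Arrays.toString(sampleIndices(5, 5, true, rnd)));
        System.out.println(Arrays.toString(sampleIndices(5, 5, false, rnd)));
    }
}
```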
-
getNoReplacement
public boolean getNoReplacement()
Gets whether instances are drawn with or without replacement.
- Returns:
- true if the replacement is disabled
-
noReplacementTipText
public String noReplacementTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setSuppressModelOutput
public void setSuppressModelOutput(boolean value)
Sets whether to output the model with the toString() method or not.
- Specified by:
setSuppressModelOutput in interface ModelOutputHandler
- Parameters:
value - true if to suppress model output
-
getSuppressModelOutput
public boolean getSuppressModelOutput()
Returns whether to output the model with the toString() method or not.
- Specified by:
getSuppressModelOutput in interface ModelOutputHandler
- Returns:
- true if the model output is suppressed
-
suppressModelOutputTipText
public String suppressModelOutputTipText()
Returns the tip text for this property.
- Specified by:
suppressModelOutputTipText in interface ModelOutputHandler
- Returns:
- tip text for this property suitable for displaying in the gui
-
startExecutorPool
protected void startExecutorPool()
Start the pool of execution threads.
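A pool with a fixed number of execution slots can be created with the standard java.util.concurrent classes. The following is a minimal sketch, assuming one thread per slot and an unbounded work queue; the class ExecutorPoolSketch, the method startPool and the keep-alive time are illustrative choices, not taken from this class.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecutorPoolSketch {

    /** Creates a pool that runs at most numSlots tasks at any one time. */
    static ThreadPoolExecutor startPool(int numSlots) {
        return new ThreadPoolExecutor(
            numSlots, numSlots,                  // core and maximum pool size
            120L, TimeUnit.SECONDS,              // keep-alive time (assumed value)
            new LinkedBlockingQueue<Runnable>()  // pending tasks wait here
        );
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = startPool(2);
        for (int i = 0; i < 4; i++) {
            final int index = i;
            pool.execute(() -> System.out.println("building member " + index));
        }
        pool.shutdown(); // accept no new tasks, finish the queued ones
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```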
-
getFilter
protected weka.filters.Filter getFilter(int index, int seed) throws Exception
Gets a filter for a particular index.
- Parameters:
index - the index for the requested filter
seed - the seed value to use for determining the additional random features
- Throws:
Exception - if something goes wrong
-
getTrainingSet
protected weka.core.Instances getTrainingSet(int index, int seed) throws Exception
Gets a training set for a particular index.
- Parameters:
index - the index for the requested training set
seed - the seed value to use for determining the additional random features
- Returns:
- the training set for the supplied index
- Throws:
Exception - if something goes wrong
-
completedClassifier
protected void completedClassifier(int index, boolean success)
Records the completion of the training of a single classifier. Unblocks if all classifiers have been trained.
- Parameters:
index - the index of the classifier that has completed
success - whether the classifier trained successfully
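The bookkeeping described here (count completed and failed members, release the waiting thread once all have reported) can be sketched with plain Java monitors. The class CompletionTrackerSketch and the methods isDone and awaitAll are hypothetical; only the idea of separate completed and failed counters comes from the fields listed on this page.

```java
final class CompletionTrackerSketch {

    private final int expected; // number of ensemble members to wait for
    private int completed;      // members built successfully
    private int failed;         // members that failed to build

    CompletionTrackerSketch(int expected) {
        this.expected = expected;
    }

    /** Records one finished member and wakes up waiting threads once all have reported. */
    synchronized void completedClassifier(int index, boolean success) {
        // index is unused in this sketch; it could be used for logging
        if (success) {
            completed++;
        } else {
            failed++;
        }
        if (isDone()) {
            notifyAll();
        }
    }

    synchronized boolean isDone() {
        return completed + failed >= expected;
    }

    /** Blocks the calling thread until every member has reported. */
    synchronized void awaitAll() throws InterruptedException {
        while (!isDone()) {
            wait();
        }
    }

    synchronized int getCompleted() {
        return completed;
    }

    synchronized int getFailed() {
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        CompletionTrackerSketch tracker = new CompletionTrackerSketch(3);
        for (int i = 0; i < 3; i++) {
            final int index = i;
            // pretend that member 1 fails to build
            new Thread(() -> tracker.completedClassifier(index, index != 1)).start();
        }
        tracker.awaitAll();
        System.out.println(tracker.getCompleted() + " built, " + tracker.getFailed() + " failed");
    }
}
```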
-
buildClassifiers
protected void buildClassifiers() throws Exception
Does the actual construction of the ensemble.
- Throws:
Exception - if something goes wrong during the training process
-
constructEnsemble
protected weka.classifiers.Classifier constructEnsemble()
Constructs the ensemble.
-
getCapabilities
public weka.core.Capabilities getCapabilities()
Returns default capabilities of the base classifier.
- Specified by:
getCapabilities in interface weka.core.CapabilitiesHandler
- Specified by:
getCapabilities in interface weka.classifiers.Classifier
- Overrides:
getCapabilities in class weka.classifiers.SingleClassifierEnhancer
- Returns:
- the capabilities of the base classifier
-
buildClassifier
public void buildClassifier(weka.core.Instances data) throws Exception
Stump method for building the classifiers
- Specified by:
buildClassifier in interface weka.classifiers.Classifier
- Parameters:
data - the training data to be used for generating the ensemble
- Throws:
Exception - if the classifier could not be built successfully
-
classifyInstance
public double classifyInstance(weka.core.Instance instance) throws Exception
Classifies the given test instance. The instance has to belong to a dataset when it's being classified.
- Specified by:
classifyInstance in interface weka.classifiers.Classifier
- Overrides:
classifyInstance in class weka.classifiers.AbstractClassifier
- Parameters:
instance - the instance to be classified
- Returns:
- the predicted most likely class for the instance or Utils.missingValue() if no prediction is made
- Throws:
Exception - if an error occurred during the prediction
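The relation between a class distribution and the single value returned by a classify call can be illustrated in plain Java. In this sketch the class ClassifySketch is hypothetical, the missing value is represented by Double.NaN, and the numeric case simply passes through the single predicted value.

```java
final class ClassifySketch {

    /**
     * Turns a distribution into a single prediction: the index of the most
     * likely class, or NaN (standing in for a missing value) if all entries are zero.
     * For a numeric class the distribution holds just the predicted value.
     */
    static double classify(double[] dist, boolean numericClass) {
        if (numericClass) {
            return dist[0];
        }
        double max = 0.0;
        int maxIndex = 0;
        for (int i = 0; i < dist.length; i++) {
            if (dist[i] > max) {
                max = dist[i];
                maxIndex = i;
            }
        }
        return max > 0.0 ? maxIndex : Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[] {0.2, 0.8}, false));
        System.out.println(classify(new double[] {0.0, 0.0}, false));
        System.out.println(classify(new double[] {3.5}, true));
    }
}
```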
-
distributionForInstance
public double[] distributionForInstance(weka.core.Instance instance) throws Exception
Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero. If the class is numeric, the array must consist of only one element, which contains the predicted value.
- Specified by:
distributionForInstance in interface weka.classifiers.Classifier
- Overrides:
distributionForInstance in class weka.classifiers.AbstractClassifier
- Parameters:
instance - the instance to be classified
- Returns:
- an array containing the estimated membership probabilities of the test instance in each class or the numeric prediction
- Throws:
Exception - if distribution could not be computed successfully
-
toString
public String toString()
Returns a string representation of the classifier.
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface weka.core.RevisionHandler
- Overrides:
getRevision in class weka.classifiers.AbstractClassifier
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method for running this class from commandline.
- Parameters:
args - the options
-
-