Package weka.classifiers.meta
Class SubsetEnsemble
- java.lang.Object
  - weka.classifiers.AbstractClassifier
    - weka.classifiers.SingleClassifierEnhancer
      - weka.classifiers.RandomizableSingleClassifierEnhancer
        - weka.classifiers.meta.SubsetEnsemble
- All Implemented Interfaces:
Serializable, Cloneable, weka.classifiers.Classifier, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler
public class SubsetEnsemble extends weka.classifiers.RandomizableSingleClassifierEnhancer
Generates an ensemble using the following approach:
- for each attribute apart from class attribute do:
* create new dataset with only this feature and the class attribute
* remove all instances that contain a missing value
* if no instances left in subset, don't build a classifier for this feature
* if at least 1 instance is left in subset, build base classifier with it
If no classifier gets built at all, use ZeroR as backup model, built on the full dataset.
In addition to the default feature for a subset, a number of random features can be added to the subset before the classifier is trained.
At prediction time, the Vote meta-classifier (using the pre-built classifiers) is used to determine the class probabilities or regression value.
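The per-attribute steps above can be sketched without Weka. The following is a minimal illustration, not the package's implementation: the names SubsetSketch, subsetFor and countBuildable are hypothetical, the data is a plain double[][] matrix, and a missing value is assumed to be encoded as Double.NaN.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Weka-free sketch of the per-attribute subset step.
// Assumption: missing values are encoded as Double.NaN.
class SubsetSketch {

  /**
   * Reduces each row to {attribute, class} and skips rows in which
   * either of the two values is missing.
   */
  static List<double[]> subsetFor(double[][] data, int attIndex, int classIndex) {
    List<double[]> subset = new ArrayList<>();
    for (double[] row : data) {
      double att = row[attIndex];
      double cls = row[classIndex];
      if (Double.isNaN(att) || Double.isNaN(cls)) {
        continue; // drop instances that contain a missing value
      }
      subset.add(new double[] {att, cls});
    }
    return subset;
  }

  /** Counts the attributes whose subset is non-empty, i.e. that would get a base classifier. */
  static int countBuildable(double[][] data, int classIndex) {
    if (data.length == 0) {
      return 0;
    }
    int buildable = 0;
    for (int att = 0; att < data[0].length; att++) {
      if (att == classIndex) {
        continue; // the class attribute never forms its own subset
      }
      if (!subsetFor(data, att, classIndex).isEmpty()) {
        buildable++;
      }
    }
    return buildable; // 0 corresponds to the case where the backup model is needed
  }
}
```

A result of 0 from countBuildable corresponds to the situation in which the description falls back to ZeroR trained on the full dataset.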
Valid options are:
-num-slots <num> Number of execution slots. (default: 1 - i.e. no parallelism)
-combination-rule <AVG|PROD|MAJ|MIN|MAX|MED> The combination rule to use (default: AVG)
-num-random <num> Number of random features to use in addition. (default: 0)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
Field Summary
Fields (modifier and type, field: description)
- protected weka.classifiers.rules.ZeroR m_BackupModel: The backup classifier, in case no ensemble could be constructed at prediction time.
- protected weka.classifiers.Classifier[] m_Classifiers: The actual classifiers in use.
- protected int m_CombinationRule: Combination Rule variable.
- protected int m_Completed: The number of classifiers completed so far.
- protected weka.core.Instances m_Data: For holding the original training set temporarily.
- protected ThreadPoolExecutor m_ExecutorPool: Pool of threads to train models with.
- protected int m_Failed: The number of classifiers that experienced a failure of some sort during construction.
- protected weka.core.Instances m_Header: The header of the training set.
- protected int m_NumExecutionSlots: The number of threads to have executing at any one time.
- protected int m_NumRandomFeatures: The number of random features to use (in addition to base attribute).
-
Constructor Summary
Constructors
- SubsetEnsemble()
-
Method Summary
Methods (modifier and type, method: description)
- void buildClassifier(weka.core.Instances data): Stump method for building the classifiers.
- protected void buildClassifiers(): Does the actual construction of the ensemble.
- double classifyInstance(weka.core.Instance instance): Classifies the given test instance.
- String combinationRuleTipText(): Returns the tip text for this property.
- protected void completedClassifier(int index, boolean success): Records the completion of the training of a single classifier.
- protected weka.classifiers.Classifier constructEnsemble(weka.core.Instance instance): Constructs the ensemble.
- double[] distributionForInstance(weka.core.Instance instance): Predicts the class memberships for a given instance.
- protected int getActualIndex(int index): Returns the actual index in the data of the feature attribute.
- weka.core.SelectedTag getCombinationRule(): Gets the combination rule used.
- protected weka.filters.Filter getFilter(int index, int seed, boolean withMissing): Gets a filter for a particular index.
- int getNumExecutionSlots(): Get the number of execution slots (threads) to use for building the members of the ensemble.
- int getNumRandomFeatures(): Returns the number of additional random features to use.
- String[] getOptions(): Gets the current settings of the classifier.
- String getRevision(): Returns the revision string.
- protected weka.core.Instances getTrainingSet(int index, int seed): Gets a training set for a particular index.
- String globalInfo(): Returns a string describing the classifier.
- Enumeration listOptions(): Returns an enumeration describing the available options.
- static void main(String[] args): Main method for running this class from commandline.
- String numExecutionSlotsTipText(): Returns the tip text for this property.
- String numRandomFeaturesTipText(): Returns the tip text for this property.
- void setCombinationRule(weka.core.SelectedTag value): Sets the combination rule to use.
- void setNumExecutionSlots(int value): Set the number of execution slots (threads) to use for building the members of the ensemble.
- void setNumRandomFeatures(int value): Set the number of additional random features to use.
- void setOptions(String[] options): Parses a given list of options.
- protected void startExecutorPool(): Start the pool of execution threads.
- String toString(): Returns a string representation of the classifier.
-
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, defaultClassifierOptions, defaultClassifierString, getCapabilities, getClassifier, getClassifierSpec, postExecution, preExecution, setClassifier
-
Methods inherited from class weka.classifiers.AbstractClassifier
batchSizeTipText, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
-
Field Detail
-
m_Classifiers
protected weka.classifiers.Classifier[] m_Classifiers
the actual classifiers in use.
-
m_NumExecutionSlots
protected int m_NumExecutionSlots
The number of threads to have executing at any one time
-
m_CombinationRule
protected int m_CombinationRule
Combination Rule variable.
-
m_NumRandomFeatures
protected int m_NumRandomFeatures
the number of random features to use (in addition to base attribute).
-
m_ExecutorPool
protected transient ThreadPoolExecutor m_ExecutorPool
Pool of threads to train models with
-
m_Completed
protected int m_Completed
The number of classifiers completed so far
-
m_Failed
protected int m_Failed
The number of classifiers that experienced a failure of some sort during construction.
-
m_Data
protected weka.core.Instances m_Data
For holding the original training set temporarily.
-
m_Header
protected weka.core.Instances m_Header
The header of the training set.
-
m_BackupModel
protected weka.classifiers.rules.ZeroR m_BackupModel
The backup classifier, in case no ensemble could be constructed at prediction time.
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the GUI
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface weka.core.OptionHandler
- Overrides:
listOptions
in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options.
Valid options are:
-num-slots <num> Number of execution slots. (default: 1 - i.e. no parallelism)
-combination-rule <AVG|PROD|MAJ|MIN|MAX|MED> The combination rule to use (default: AVG)
-num-random <num> Number of random features to use in addition. (default: 0)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Specified by:
setOptions
in interface weka.core.OptionHandler
- Overrides:
setOptions
in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
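The rule that options after -- go to the base classifier can be illustrated with a small, Weka-free sketch. OptionSplitSketch and split are hypothetical names chosen for this example; how the class itself partitions its options is not shown in this page.

```java
import java.util.Arrays;

// Hypothetical sketch: separates the ensemble's own options from the
// options that follow "--" and are meant for the base classifier.
class OptionSplitSketch {

  /** Returns {optionsBeforeDoubleDash, optionsAfterDoubleDash}. */
  static String[][] split(String[] options) {
    int pos = Arrays.asList(options).indexOf("--");
    if (pos < 0) {
      // No separator: everything belongs to the ensemble itself.
      return new String[][] {options.clone(), new String[0]};
    }
    return new String[][] {
      Arrays.copyOfRange(options, 0, pos),
      Arrays.copyOfRange(options, pos + 1, options.length)
    };
  }
}
```

For an illustrative option string such as -num-slots 2 -W weka.classifiers.trees.J48 -- -C 0.25, the first array would hold the four tokens before the separator and the second would hold -C and 0.25.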
-
getOptions
public String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
getOptions
in interface weka.core.OptionHandler
- Overrides:
getOptions
in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
setNumExecutionSlots
public void setNumExecutionSlots(int value)
Set the number of execution slots (threads) to use for building the members of the ensemble.
- Parameters:
value
- the number of slots to use.
-
getNumExecutionSlots
public int getNumExecutionSlots()
Get the number of execution slots (threads) to use for building the members of the ensemble.
- Returns:
- the number of slots to use
-
numExecutionSlotsTipText
public String numExecutionSlotsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCombinationRule
public void setCombinationRule(weka.core.SelectedTag value)
Sets the combination rule to use. Values other than
- Parameters:
value
- the combination rule method to use
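To give a feel for what a combination rule does, the sketch below applies four of the listed rules (AVG, PROD, MIN, MAX) to the class distributions of several ensemble members and normalizes the result. It is an assumption-laden illustration only: CombinationSketch and combine are hypothetical names, MAJ and MED are left out, and the actual Vote meta-classifier may differ in details.

```java
// Hypothetical sketch of four of the listed rules applied to per-member
// class distributions; this is not the Vote implementation.
class CombinationSketch {

  /** dists[m][c] is the probability member m assigns to class c. */
  static double[] combine(String rule, double[][] dists) {
    int numClasses = dists[0].length;
    double[] out = new double[numClasses];
    for (int c = 0; c < numClasses; c++) {
      double acc = dists[0][c];
      for (int m = 1; m < dists.length; m++) {
        double p = dists[m][c];
        switch (rule) {
          case "AVG": acc += p; break; // summed here, averaged by the normalization below
          case "PROD": acc *= p; break;
          case "MIN": acc = Math.min(acc, p); break;
          case "MAX": acc = Math.max(acc, p); break;
          default: throw new IllegalArgumentException("Unsupported rule: " + rule);
        }
      }
      out[c] = acc;
    }
    // Normalize so the combined values sum to 1, if there is any mass at all.
    double sum = 0;
    for (double v : out) {
      sum += v;
    }
    if (sum > 0) {
      for (int c = 0; c < numClasses; c++) {
        out[c] /= sum;
      }
    }
    return out;
  }
}
```

With two members predicting {0.8, 0.2} and {0.4, 0.6}, AVG favors the first class only mildly, while PROD pushes further toward it because the second class receives a low probability from one member.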
-
getCombinationRule
public weka.core.SelectedTag getCombinationRule()
Gets the combination rule used.
- Returns:
- the combination rule used
-
combinationRuleTipText
public String combinationRuleTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumRandomFeatures
public void setNumRandomFeatures(int value)
Set the number of additional random features to use.
- Parameters:
value
- the number of random features
-
getNumRandomFeatures
public int getNumRandomFeatures()
Returns the number of additional random features to use.
- Returns:
- the number of random features
-
numRandomFeaturesTipText
public String numRandomFeaturesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
startExecutorPool
protected void startExecutorPool()
Start the pool of execution threads.
-
buildClassifiers
protected void buildClassifiers() throws Exception
Does the actual construction of the ensemble.
- Throws:
Exception
- if something goes wrong during the training process
-
completedClassifier
protected void completedClassifier(int index, boolean success)
Records the completion of the training of a single classifier. Unblocks if all classifiers have been trained.
- Parameters:
index
- the index of the classifier that has completed
success
- whether the classifier trained successfully
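The pattern described here (train members on a thread pool, record each completion, unblock once all have reported) can be sketched in plain Java. CompletionSketch and its methods are hypothetical names; the sketch assumes that successes and failures are counted separately, as the m_Completed and m_Failed fields suggest, and is not taken from the class's source.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of "run tasks in a pool, count completions, unblock when done".
class CompletionSketch {
  private final int total;
  private int completed; // tasks that finished normally
  private int failed;    // tasks that threw an exception

  CompletionSketch(int total) {
    this.total = total;
  }

  /** Records one finished task and wakes the waiting thread once all have reported. */
  synchronized void completedTask(boolean success) {
    if (success) {
      completed++;
    } else {
      failed++;
    }
    if (completed + failed == total) {
      notifyAll();
    }
  }

  /** Blocks until every task has reported in. */
  synchronized void awaitAll() throws InterruptedException {
    while (completed + failed < total) {
      wait();
    }
  }

  synchronized int getCompleted() { return completed; }

  synchronized int getFailed() { return failed; }

  /** Runs the tasks on numSlots threads and returns once all of them have finished. */
  static CompletionSketch runAll(Runnable[] tasks, int numSlots) {
    CompletionSketch tracker = new CompletionSketch(tasks.length);
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        numSlots, numSlots, 120, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    try {
      for (Runnable task : tasks) {
        pool.execute(() -> {
          boolean ok = true;
          try {
            task.run();
          } catch (RuntimeException e) {
            ok = false; // count the failure and let the other tasks continue
          }
          tracker.completedTask(ok);
        });
      }
      tracker.awaitAll();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for tasks", e);
    } finally {
      pool.shutdown();
    }
    return tracker;
  }
}
```

Counting failures alongside successes matters for the unblocking condition: if only successes were counted, a single failed member would leave the caller waiting indefinitely.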
-
getActualIndex
protected int getActualIndex(int index) throws Exception
Returns the actual index in the data of the feature attribute.
- Parameters:
index
- the index for the requested attribute
- Returns:
- the actual attribute index for the supplied index
- Throws:
Exception
- if something goes wrong
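One plausible reading of this mapping, stated here purely as an assumption, is that the supplied index counts only the non-class attributes, so the class attribute's position has to be skipped when translating to a position in the full dataset. The hypothetical IndexSketch below illustrates that reading; the real method may work differently.

```java
// Hypothetical sketch: maps the i-th non-class attribute to its position
// in the full dataset by skipping over the class attribute.
class IndexSketch {

  static int actualIndex(int index, int classIndex) {
    if (classIndex < 0) {
      return index; // no class attribute set, nothing to skip
    }
    return index < classIndex ? index : index + 1;
  }
}
```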
-
getFilter
protected weka.filters.Filter getFilter(int index, int seed, boolean withMissing) throws Exception
Gets a filter for a particular index.
- Parameters:
index
- the index for the requested filter
seed
- the seed value to use for determining the additional random features
withMissing
- whether to include the RemoveInstancesWithMissingValue filter
- Returns:
- the filter for the supplied index
- Throws:
Exception
- if something goes wrong
-
getTrainingSet
protected weka.core.Instances getTrainingSet(int index, int seed) throws Exception
Gets a training set for a particular index.
- Parameters:
index
- the index for the requested training set
seed
- the seed value to use for determining the additional random features
- Returns:
- the training set for the supplied index
- Throws:
Exception
- if something goes wrong
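The role of the seed can be illustrated with a sketch of how additional random features might be chosen for one subset. Everything here is an assumption for illustration: RandomFeatureSketch and keptAttributes are hypothetical names, and the selection strategy (a seeded shuffle of the remaining attributes) is not taken from the class's source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: picks the attribute indices kept for one subset, i.e.
// the base attribute, the class attribute, and numRandom seeded random extras.
// Assumption: base and classIndex are distinct.
class RandomFeatureSketch {

  static List<Integer> keptAttributes(int base, int classIndex, int numAtts,
                                      int numRandom, long seed) {
    List<Integer> candidates = new ArrayList<>();
    for (int i = 0; i < numAtts; i++) {
      if (i != base && i != classIndex) {
        candidates.add(i);
      }
    }
    // The same seed always yields the same order, so subsets are reproducible.
    Collections.shuffle(candidates, new Random(seed));

    int extras = Math.max(0, Math.min(numRandom, candidates.size()));
    List<Integer> kept = new ArrayList<>();
    kept.add(base);
    kept.add(classIndex);
    kept.addAll(candidates.subList(0, extras));
    Collections.sort(kept);
    return kept;
  }
}
```

Requesting more random features than there are remaining attributes simply keeps all of them in this sketch.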
-
buildClassifier
public void buildClassifier(weka.core.Instances data) throws Exception
Stump method for building the classifiers
- Parameters:
data
- the training data to be used for generating the ensemble
- Throws:
Exception
- if the classifier could not be built successfully
-
constructEnsemble
protected weka.classifiers.Classifier constructEnsemble(weka.core.Instance instance)
Constructs the ensemble.
- Parameters:
instance
- the instance to base the construction on
-
classifyInstance
public double classifyInstance(weka.core.Instance instance) throws Exception
Classifies the given test instance. The instance has to belong to a dataset when it's being classified.
- Specified by:
classifyInstance
in interface weka.classifiers.Classifier
- Overrides:
classifyInstance
in class weka.classifiers.AbstractClassifier
- Parameters:
instance
- the instance to be classified
- Returns:
- the predicted most likely class for the instance or Utils.missingValue() if no prediction is made
- Throws:
Exception
- if an error occurred during the prediction
-
distributionForInstance
public double[] distributionForInstance(weka.core.Instance instance) throws Exception
Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero. If the class is numeric, the array must consist of only one element, which contains the predicted value.
- Specified by:
distributionForInstance
in interface weka.classifiers.Classifier
- Overrides:
distributionForInstance
in class weka.classifiers.AbstractClassifier
- Parameters:
instance
- the instance to be classified
- Returns:
- an array containing the estimated membership probabilities of the test instance in each class or the numeric prediction
- Throws:
Exception
- if distribution could not be computed successfully
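The contract above ties the two prediction methods together: a single prediction can be derived from the returned array. The sketch below shows one way to do that under the stated rules, using Double.NaN for "no prediction" (the value Weka uses to represent a missing value). PredictionSketch and classify are hypothetical names, not part of this class.

```java
// Hypothetical helper that derives a single prediction from a distribution
// array following the contract described above.
class PredictionSketch {

  /** Returns the predicted class index, the numeric prediction, or NaN if unclassified. */
  static double classify(double[] dist, boolean numericClass) {
    if (numericClass) {
      return dist[0]; // the single element holds the predicted value
    }
    int best = 0;
    for (int i = 1; i < dist.length; i++) {
      if (dist[i] > dist[best]) {
        best = i;
      }
    }
    // An all-zero distribution means the instance is unclassified.
    return dist[best] > 0 ? best : Double.NaN;
  }
}
```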
-
toString
public String toString()
Returns a string representation of the classifier.
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision
in interface weka.core.RevisionHandler
- Overrides:
getRevision
in class weka.classifiers.AbstractClassifier
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method for running this class from commandline.
- Parameters:
args
- the options
-