Package weka.classifiers.meta
Class SubsetEnsemble
- java.lang.Object
- weka.classifiers.AbstractClassifier
- weka.classifiers.SingleClassifierEnhancer
- weka.classifiers.RandomizableSingleClassifierEnhancer
- weka.classifiers.meta.SubsetEnsemble
- All Implemented Interfaces:
Serializable, Cloneable, weka.classifiers.Classifier, weka.core.BatchPredictor, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler
public class SubsetEnsemble extends weka.classifiers.RandomizableSingleClassifierEnhancer
Generates an ensemble using the following approach:
- for each attribute apart from class attribute do:
* create new dataset with only this feature and the class attribute
* remove all instances that contain a missing value
* if no instances left in subset, don't build a classifier for this feature
* if at least 1 instance is left in subset, build base classifier with it
If no classifier gets built at all, use ZeroR as backup model, built on the full dataset.
In addition to the default feature for a subset, a number of random features can be added to the subset before the classifier is trained.
At prediction time, the Vote meta-classifier (using the pre-built classifiers) is used to determine the class probabilities or regression value.
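The per-attribute procedure described above can be sketched without Weka. This is a minimal illustration under stated assumptions, not the class's actual implementation: the `double[][]` row layout, the use of `NaN` to encode missing values, and the names `SubsetSketch`, `usableRows` and `attributesWithModel` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one single-feature training subset per non-class attribute. */
public class SubsetSketch {

    /**
     * Returns the (feature, class) pairs of all rows where neither value is missing.
     * Missing values are encoded as NaN (an assumption of this sketch).
     */
    public static List<double[]> usableRows(double[][] data, int featureIndex, int classIndex) {
        List<double[]> subset = new ArrayList<>();
        for (double[] row : data) {
            if (!Double.isNaN(row[featureIndex]) && !Double.isNaN(row[classIndex])) {
                subset.add(new double[] {row[featureIndex], row[classIndex]});
            }
        }
        return subset;
    }

    /** Returns the indices of the attributes for which a base classifier would be built. */
    public static List<Integer> attributesWithModel(double[][] data, int classIndex) {
        List<Integer> built = new ArrayList<>();
        int numAttributes = data.length == 0 ? 0 : data[0].length;
        for (int att = 0; att < numAttributes; att++) {
            if (att == classIndex) {
                continue; // the class attribute never gets its own model
            }
            if (!usableRows(data, att, classIndex).isEmpty()) {
                built.add(att); // at least one instance left: a model would be trained
            }
        }
        return built; // an empty list corresponds to falling back to the backup model
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        double[][] data = {
            {1.0, nan, 0.0},
            {2.0, nan, 1.0},
            {nan, nan, 1.0},
        };
        // Attribute 1 is missing in every row, so only attribute 0 yields a subset.
        System.out.println(attributesWithModel(data, 2));
    }
}
```

In this sketch an empty result plays the role of the "no classifier gets built at all" case, where the description above falls back to ZeroR on the full dataset.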
Valid options are:
-num-slots <num> Number of execution slots. (default: 1 - i.e. no parallelism)
-combination-rule <AVG|PROD|MAJ|MIN|MAX|MED> The combination rule to use (default: AVG)
-num-random <num> Number of random features to use in addition. (default: 0)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected weka.classifiers.rules.ZeroR | m_BackupModel | The backup classifier, in case no ensemble could be constructed at prediction time.
protected weka.classifiers.Classifier[] | m_Classifiers | the actual classifiers in use.
protected int | m_CombinationRule | Combination Rule variable.
protected int | m_Completed | The number of classifiers completed so far
protected weka.core.Instances | m_Data | For holding the original training set temporarily.
protected ThreadPoolExecutor | m_ExecutorPool | Pool of threads to train models with
protected int | m_Failed | The number of classifiers that experienced a failure of some sort during construction.
protected weka.core.Instances | m_Header | The header of the training set.
protected int | m_NumExecutionSlots | The number of threads to have executing at any one time
protected int | m_NumRandomFeatures | the number of random features to use (in addition to base attribute).
-
Constructor Summary
Constructors
SubsetEnsemble()
-
Method Summary
Modifier and Type | Method | Description
void | buildClassifier(weka.core.Instances data) | Stump method for building the classifiers
protected void | buildClassifiers() | Does the actual construction of the ensemble.
double | classifyInstance(weka.core.Instance instance) | Classifies the given test instance.
String | combinationRuleTipText() | Returns the tip text for this property.
protected void | completedClassifier(int index, boolean success) | Records the completion of the training of a single classifier.
protected weka.classifiers.Classifier | constructEnsemble(weka.core.Instance instance) | Constructs the ensemble.
double[] | distributionForInstance(weka.core.Instance instance) | Predicts the class memberships for a given instance.
protected int | getActualIndex(int index) | Returns the actual index in the data of the feature attribute.
weka.core.SelectedTag | getCombinationRule() | Gets the combination rule used
protected weka.filters.Filter | getFilter(int index, int seed, boolean withMissing) | Gets a filter for a particular index.
int | getNumExecutionSlots() | Get the number of execution slots (threads) to use for building the members of the ensemble.
int | getNumRandomFeatures() | Returns the number of additional random features to use.
String[] | getOptions() | Gets the current settings of the classifier.
String | getRevision() | Returns the revision string.
protected weka.core.Instances | getTrainingSet(int index, int seed) | Gets a training set for a particular index.
String | globalInfo() | Returns a string describing the classifier.
Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] args) | Main method for running this class from commandline.
String | numExecutionSlotsTipText() | Returns the tip text for this property.
String | numRandomFeaturesTipText() | Returns the tip text for this property.
void | setCombinationRule(weka.core.SelectedTag value) | Sets the combination rule to use.
void | setNumExecutionSlots(int value) | Set the number of execution slots (threads) to use for building the members of the ensemble.
void | setNumRandomFeatures(int value) | Set the number of additional random features to use.
void | setOptions(String[] options) | Parses a given list of options.
protected void | startExecutorPool() | Start the pool of execution threads.
String | toString() | Returns a string representation of the classifier.
-
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, defaultClassifierOptions, defaultClassifierString, getCapabilities, getClassifier, getClassifierSpec, postExecution, preExecution, setClassifier
-
Methods inherited from class weka.classifiers.AbstractClassifier
batchSizeTipText, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
-
-
-
-
Field Detail
-
m_Classifiers
protected weka.classifiers.Classifier[] m_Classifiers
the actual classifiers in use.
-
m_NumExecutionSlots
protected int m_NumExecutionSlots
The number of threads to have executing at any one time
-
m_CombinationRule
protected int m_CombinationRule
Combination Rule variable.
-
m_NumRandomFeatures
protected int m_NumRandomFeatures
the number of random features to use (in addition to base attribute).
-
m_ExecutorPool
protected transient ThreadPoolExecutor m_ExecutorPool
Pool of threads to train models with
-
m_Completed
protected int m_Completed
The number of classifiers completed so far
-
m_Failed
protected int m_Failed
The number of classifiers that experienced a failure of some sort during construction.
-
m_Data
protected weka.core.Instances m_Data
For holding the original training set temporarily.
-
m_Header
protected weka.core.Instances m_Header
The header of the training set.
-
m_BackupModel
protected weka.classifiers.rules.ZeroR m_BackupModel
The backup classifier, in case no ensemble could be constructed at prediction time.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the gui
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface weka.core.OptionHandler
- Overrides:
listOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options.
Valid options are:
-num-slots <num> Number of execution slots. (default: 1 - i.e. no parallelism)
-combination-rule <AVG|PROD|MAJ|MIN|MAX|MED> The combination rule to use (default: AVG)
-num-random <num> Number of random features to use in addition. (default: 0)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Specified by:
setOptions in interface weka.core.OptionHandler
- Overrides:
setOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
getOptions in interface weka.core.OptionHandler
- Overrides:
getOptions in class weka.classifiers.RandomizableSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
setNumExecutionSlots
public void setNumExecutionSlots(int value)
Set the number of execution slots (threads) to use for building the members of the ensemble.
- Parameters:
value - the number of slots to use.
-
getNumExecutionSlots
public int getNumExecutionSlots()
Get the number of execution slots (threads) to use for building the members of the ensemble.
- Returns:
- the number of slots to use
-
numExecutionSlotsTipText
public String numExecutionSlotsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCombinationRule
public void setCombinationRule(weka.core.SelectedTag value)
Sets the combination rule to use. Values other than
- Parameters:
value - the combination rule method to use
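To make the effect of a combination rule concrete, here is a minimal stand-alone sketch of four of the rules named in the options (AVG, PROD, MIN, MAX) applied to per-model class distributions. It is an illustration only: the class `CombinationSketch`, its `Rule` enum and the final normalization step are assumptions of this example, and the behaviour of Weka's Vote meta-classifier may differ in detail. MAJ and MED are not shown.

```java
import java.util.Arrays;

/** Hypothetical sketch of combining per-model class distributions. */
public class CombinationSketch {

    /** Subset of the rule names listed in the options; MAJ and MED are omitted. */
    public enum Rule { AVG, PROD, MIN, MAX }

    /** Combines one distribution per model into a single normalized distribution. */
    public static double[] combine(double[][] dists, Rule rule) {
        int numClasses = dists[0].length;
        double[] result = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            double value = dists[0][c];
            for (int m = 1; m < dists.length; m++) {
                double p = dists[m][c];
                switch (rule) {
                    case AVG:  value += p; break;
                    case PROD: value *= p; break;
                    case MIN:  value = Math.min(value, p); break;
                    case MAX:  value = Math.max(value, p); break;
                }
            }
            result[c] = (rule == Rule.AVG) ? value / dists.length : value;
        }
        // Rescale so the combined scores sum to 1 (a choice made for this sketch).
        double sum = Arrays.stream(result).sum();
        if (sum > 0) {
            for (int c = 0; c < numClasses; c++) {
                result[c] /= sum;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] dists = {{0.8, 0.2}, {0.4, 0.6}};
        System.out.println(Arrays.toString(combine(dists, Rule.AVG)));
        System.out.println(Arrays.toString(combine(dists, Rule.PROD)));
    }
}
```

With the two example distributions, AVG and PROD both favour the first class, but PROD does so more strongly because a low probability from any single model pulls the product down.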
-
getCombinationRule
public weka.core.SelectedTag getCombinationRule()
Gets the combination rule used
- Returns:
- the combination rule used
-
combinationRuleTipText
public String combinationRuleTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumRandomFeatures
public void setNumRandomFeatures(int value)
Set the number of additional random features to use.
- Parameters:
value - the number of random features
-
getNumRandomFeatures
public int getNumRandomFeatures()
Returns the number of additional random features to use.
- Returns:
- the number of random features
-
numRandomFeaturesTipText
public String numRandomFeaturesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
startExecutorPool
protected void startExecutorPool()
Start the pool of execution threads.
-
buildClassifiers
protected void buildClassifiers() throws Exception
Does the actual construction of the ensemble.
- Throws:
Exception - if something goes wrong during the training process
-
completedClassifier
protected void completedClassifier(int index, boolean success)
Records the completion of the training of a single classifier. Unblocks if all classifiers have been trained.
- Parameters:
index - the index of the classifier that has completed
success - whether the classifier trained successfully
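The bookkeeping described here (count completions and failures, then unblock once every job has reported back) can be sketched with plain `java.util.concurrent`. This is a hedged illustration, not the class's source: `CompletionSketch`, `completedJob`, `awaitAll` and `runAll` are hypothetical names, and the dummy jobs simply report success for even indices.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of counting finished training jobs and unblocking the caller. */
public class CompletionSketch {

    private final int total;
    private int completed;
    private int failed;

    public CompletionSketch(int total) {
        this.total = total;
    }

    /** Called by each worker when its job ends, successfully or not. */
    public synchronized void completedJob(boolean success) {
        if (success) {
            completed++;
        } else {
            failed++;
        }
        if (completed + failed == total) {
            notifyAll(); // wake the thread waiting in awaitAll()
        }
    }

    /** Blocks until every job has reported back. */
    public synchronized void awaitAll() throws InterruptedException {
        while (completed + failed < total) {
            wait();
        }
    }

    public synchronized int getCompleted() { return completed; }

    public synchronized int getFailed() { return failed; }

    /** Runs dummy jobs on a fixed pool of "slots" threads; odd-numbered jobs report failure. */
    public static CompletionSketch runAll(int jobs, int slots) {
        CompletionSketch tracker = new CompletionSketch(jobs);
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            for (int i = 0; i < jobs; i++) {
                final int index = i;
                pool.execute(() -> tracker.completedJob(index % 2 == 0));
            }
            tracker.awaitAll();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } finally {
            pool.shutdown();
        }
        return tracker;
    }

    public static void main(String[] args) {
        CompletionSketch t = runAll(5, 2);
        System.out.println(t.getCompleted() + " ok, " + t.getFailed() + " failed");
    }
}
```

The `slots` argument plays the role of the execution-slot count: with a value of 1 the jobs run one after another, which corresponds to the documented default of no parallelism.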
-
getActualIndex
protected int getActualIndex(int index) throws Exception
Returns the actual index in the data of the feature attribute.
- Parameters:
index - the index for the requested attribute
- Returns:
- the actual attribute index for the supplied index
- Throws:
Exception - if something goes wrong
-
getFilter
protected weka.filters.Filter getFilter(int index, int seed, boolean withMissing) throws Exception
Gets a filter for a particular index.
- Parameters:
index - the index for the requested filter
seed - the seed value to use for determining the additional random features
withMissing - whether to include the RemoveInstancesWithMissingValue filter
- Returns:
- the filter for the supplied index
- Throws:
Exception - if something goes wrong
-
getTrainingSet
protected weka.core.Instances getTrainingSet(int index, int seed) throws Exception
Gets a training set for a particular index.
- Parameters:
index - the index for the requested training set
seed - the seed value to use for determining the additional random features
- Returns:
- the training set for the supplied index
- Throws:
Exception - if something goes wrong
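How a seed might determine the additional random features can be sketched as follows. This is an assumption-laden illustration rather than the logic of getFilter or getTrainingSet: the class `RandomFeatureSketch`, the method `selectAttributes` and the shuffle-based selection are hypothetical, chosen only to show that the same seed reproduces the same attribute subset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: pick a base attribute plus seeded random extra attributes. */
public class RandomFeatureSketch {

    /**
     * Returns the base attribute index followed by up to numRandom other
     * non-class attribute indices, chosen reproducibly from the seed.
     */
    public static List<Integer> selectAttributes(
            int numAttributes, int classIndex, int base, int numRandom, int seed) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            if (i != classIndex && i != base) {
                candidates.add(i);
            }
        }
        // A seeded shuffle makes the selection repeatable for a given seed.
        Collections.shuffle(candidates, new Random(seed));
        int extra = Math.max(0, Math.min(numRandom, candidates.size()));
        List<Integer> selected = new ArrayList<>();
        selected.add(base);
        selected.addAll(candidates.subList(0, extra));
        return selected;
    }

    public static void main(String[] args) {
        // 6 attributes, class at index 5, base attribute 2, two extra random features.
        System.out.println(selectAttributes(6, 5, 2, 2, 1));
    }
}
```

Requesting more random features than there are remaining attributes simply returns all of them in this sketch; how the real class handles that case is not stated in this page.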
-
buildClassifier
public void buildClassifier(weka.core.Instances data) throws Exception
Stump method for building the classifiers
- Parameters:
data - the training data to be used for generating the ensemble
- Throws:
Exception - if the classifier could not be built successfully
-
constructEnsemble
protected weka.classifiers.Classifier constructEnsemble(weka.core.Instance instance)
Constructs the ensemble.
- Parameters:
instance - the instance to base the construction on
-
classifyInstance
public double classifyInstance(weka.core.Instance instance) throws Exception
Classifies the given test instance. The instance has to belong to a dataset when it's being classified.
- Specified by:
classifyInstance in interface weka.classifiers.Classifier
- Overrides:
classifyInstance in class weka.classifiers.AbstractClassifier
- Parameters:
instance - the instance to be classified
- Returns:
- the predicted most likely class for the instance or Utils.missingValue() if no prediction is made
- Throws:
Exception - if an error occurred during the prediction
-
distributionForInstance
public double[] distributionForInstance(weka.core.Instance instance) throws Exception
Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero. If the class is numeric, the array must consist of only one element, which contains the predicted value.
- Specified by:
distributionForInstance in interface weka.classifiers.Classifier
- Overrides:
distributionForInstance in class weka.classifiers.AbstractClassifier
- Parameters:
instance - the instance to be classified
- Returns:
- an array containing the estimated membership probabilities of the test instance in each class or the numeric prediction
- Throws:
Exception - if distribution could not be computed successfully
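The contracts of the two prediction methods above fit together in a simple way: a single prediction can be derived from a distribution array. The following stand-alone sketch shows one way to do that under the documented rules (all zeros means unclassified, a numeric class uses a one-element array). The class `PredictionSketch` and method `predictedValue` are hypothetical, and `Double.NaN` is used here as the stand-in for a missing value.

```java
/** Hypothetical sketch: deriving a single prediction from a distribution array. */
public class PredictionSketch {

    /**
     * For a numeric class, returns the single predicted value.
     * For a nominal class, returns the index of the most probable class,
     * or NaN (standing in for "missing") if every element is zero.
     */
    public static double predictedValue(double[] dist, boolean numericClass) {
        if (numericClass) {
            return dist[0]; // one-element array holding the regression value
        }
        int best = -1;
        double max = 0.0;
        for (int i = 0; i < dist.length; i++) {
            if (dist[i] > max) {
                max = dist[i];
                best = i;
            }
        }
        return best >= 0 ? best : Double.NaN; // all zeros: no prediction made
    }

    public static void main(String[] args) {
        System.out.println(predictedValue(new double[] {0.1, 0.7, 0.2}, false));
        System.out.println(predictedValue(new double[] {0.0, 0.0}, false));
        System.out.println(predictedValue(new double[] {3.5}, true));
    }
}
```

Ties are resolved in favour of the lowest class index in this sketch, which is a choice made for the example rather than something this page specifies.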
-
toString
public String toString()
Returns a string representation of the classifier.
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface weka.core.RevisionHandler
- Overrides:
getRevision in class weka.classifiers.AbstractClassifier
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method for running this class from commandline.
- Parameters:
args - the options
-
-