Package weka.filters.supervised.instance
Class RemoveOutliers
- java.lang.Object
  - weka.filters.Filter
    - weka.filters.SimpleFilter
      - weka.filters.SimpleBatchFilter
        - weka.filters.supervised.instance.RemoveOutliers
- All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler
public class RemoveOutliers extends weka.filters.SimpleBatchFilter implements weka.core.Randomizable
Cross-validates the specified classifier on the incoming data and applies the outlier detector to the actual vs. predicted values to remove the outliers.
NB: This filter only works on a full dataset, not instance by instance.
Valid options are:
-classifier <value> The classifier to use for generating the actual vs. predicted data. (default: Linear Regression: No model built yet.)
-num-folds <value> The number of folds to use in the cross-validation. (default: 10)
-num-threads <value> The number of threads to use for cross-validation; -1 = number of CPUs/cores; 0 or 1 = sequential execution. (default: 1)
-detector The outlier detector to use.
-output-debug-info If set, the filter is run in debug mode and may output additional info to the console.
-do-not-check-capabilities If set, filter capabilities are not checked before the filter is built (use with caution).
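The overall idea described above (cross-validate, compare actual vs. predicted, drop the rows flagged as outliers) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the Weka or ADAMS classes, a one-variable least-squares line stands in for the configured classifier, and a fixed absolute-error threshold stands in for the outlier detector. The class `CvOutlierSketch` and all of its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Plain-JDK illustration only; not the Weka/ADAMS implementation. */
final class CvOutlierSketch {

  /** Fits y = a + b*x by least squares on the rows listed in idx; returns {a, b}. */
  static double[] fit(double[] x, double[] y, List<Integer> idx) {
    double mx = 0, my = 0;
    for (int i : idx) { mx += x[i]; my += y[i]; }
    mx /= idx.size();
    my /= idx.size();
    double sxy = 0, sxx = 0;
    for (int i : idx) {
      sxy += (x[i] - mx) * (y[i] - my);
      sxx += (x[i] - mx) * (x[i] - mx);
    }
    double b = sxx == 0 ? 0 : sxy / sxx;
    return new double[] {my - b * mx, b};
  }

  /** Returns, for every row, the prediction made by a model that did not train on it. */
  static double[] crossValidatedPredictions(double[] x, double[] y, int folds, long seed) {
    List<Integer> order = new ArrayList<>();
    for (int i = 0; i < x.length; i++) order.add(i);
    Collections.shuffle(order, new Random(seed)); // the seed makes the fold split repeatable
    double[] predicted = new double[x.length];
    for (int f = 0; f < folds; f++) {
      List<Integer> train = new ArrayList<>();
      List<Integer> test = new ArrayList<>();
      for (int pos = 0; pos < order.size(); pos++) {
        (pos % folds == f ? test : train).add(order.get(pos));
      }
      double[] model = fit(x, y, train);
      for (int i : test) predicted[i] = model[0] + model[1] * x[i];
    }
    return predicted;
  }

  /** Stand-in detector: keeps rows whose |actual - predicted| is at most maxAbsError. */
  static List<Integer> keptRows(double[] actual, double[] predicted, double maxAbsError) {
    List<Integer> kept = new ArrayList<>();
    for (int i = 0; i < actual.length; i++) {
      if (Math.abs(actual[i] - predicted[i]) <= maxAbsError) kept.add(i);
    }
    return kept;
  }

  public static void main(String[] args) {
    double[] x = new double[20];
    double[] y = new double[20];
    for (int i = 0; i < 20; i++) { x[i] = i; y[i] = 2 * i + 1; }
    y[7] = 500; // inject one obvious outlier
    double[] predicted = crossValidatedPredictions(x, y, 10, 1L);
    System.out.println(keptRows(y, predicted, 100.0)); // row 7 is missing from the list
  }
}
```

Because every prediction comes from a model that never saw the row in question, the whole dataset must be available before any row can be judged, which is consistent with the note that the filter does not work instance by instance.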
- Version:
- $Revision$
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
Field Summary
Fields (modifier and type, field, description)
static String CLASSIFIER
static String DETECTOR
protected weka.classifiers.Classifier m_Classifier
the classifier to use for evaluation.
protected adams.flow.control.removeoutliers.AbstractOutlierDetector m_Detector
the outlier detector to use.
protected int m_NumFolds
the number of folds to use.
protected int m_NumThreads
the number of threads to use for parallel execution.
protected int m_Seed
the seed value.
static String NUM_FOLDS
static String NUM_THREADS
-
Constructor Summary
Constructors (constructor, description)
RemoveOutliers()
-
Method Summary
Methods (modifier and type, method, description)
String classifierTipText()
Returns the tip text for this property.
protected weka.classifiers.Evaluation crossValidate(weka.core.Instances data, int folds)
Cross-validates the classifier on the given data.
String detectorTipText()
Returns the tip text for this property.
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
Determines the output format based on the input format and returns this.
protected adams.data.spreadsheet.SpreadSheet evaluationToSpreadSheet(weka.classifiers.Evaluation eval)
Turns the predictions of the evaluation object into a spreadsheet.
weka.core.Capabilities getCapabilities()
Returns the filter's capabilities.
weka.classifiers.Classifier getClassifier()
Returns the classifier.
protected weka.classifiers.Classifier getDefaultClassifier()
Returns the default classifier.
protected adams.flow.control.removeoutliers.AbstractOutlierDetector getDefaultDetector()
Returns the default detector.
protected int getDefaultNumFolds()
Returns the default number of folds to use in CV.
protected int getDefaultNumThreads()
Returns the default number of threads to use for cross-validation.
protected int getDefaultSeed()
Returns the default seed value.
adams.flow.control.removeoutliers.AbstractOutlierDetector getDetector()
Returns the detector.
int getNumFolds()
Returns the number of folds to use in CV.
int getNumThreads()
Returns the number of threads to use for cross-validation.
String[] getOptions()
Gets the current option settings for the OptionHandler.
int getSeed()
Returns the seed value.
String globalInfo()
Returns a string describing this filter.
Enumeration listOptions()
Returns an enumeration describing the available options.
String numFoldsTipText()
Returns the tip text for this property.
String numThreadsTipText()
Returns the tip text for this property.
protected weka.core.Instances process(weka.core.Instances data)
Processes the given data (may change the provided dataset) and returns the modified version.
String seedTipText()
Returns the tip text for this property.
void setClassifier(weka.classifiers.Classifier value)
Sets the classifier.
void setDetector(adams.flow.control.removeoutliers.AbstractOutlierDetector value)
Sets the detector.
void setNumFolds(int value)
Sets the number of folds to use.
void setNumThreads(int value)
Sets the number of threads to use for cross-validation.
void setOptions(String[] options)
Sets the OptionHandler's options using the given list.
void setSeed(int value)
Sets the seed value.
-
Methods inherited from class weka.filters.SimpleBatchFilter
allowAccessToFullInputFormat, batchFinished, hasImmediateOutputFormat, input
-
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, getRevision, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, main, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
-
Field Detail
-
CLASSIFIER
public static final String CLASSIFIER
- See Also:
- Constant Field Values
-
NUM_FOLDS
public static final String NUM_FOLDS
- See Also:
- Constant Field Values
-
NUM_THREADS
public static final String NUM_THREADS
- See Also:
- Constant Field Values
-
DETECTOR
public static final String DETECTOR
- See Also:
- Constant Field Values
-
m_Classifier
protected weka.classifiers.Classifier m_Classifier
the classifier to use for evaluation.
-
m_Seed
protected int m_Seed
the seed value.
-
m_NumFolds
protected int m_NumFolds
the number of folds to use.
-
m_Detector
protected adams.flow.control.removeoutliers.AbstractOutlierDetector m_Detector
the outlier detector to use.
-
m_NumThreads
protected int m_NumThreads
the number of threads to use for parallel execution.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing this filter.
- Specified by:
globalInfo in class weka.filters.SimpleFilter
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter gui
-
getDefaultClassifier
protected weka.classifiers.Classifier getDefaultClassifier()
Returns the default classifier.
- Returns:
- the default classifier
-
setClassifier
public void setClassifier(weka.classifiers.Classifier value)
Sets the classifier.
- Parameters:
value - the classifier
-
getClassifier
public weka.classifiers.Classifier getClassifier()
Returns the classifier.
- Returns:
- the classifier
-
classifierTipText
public String classifierTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultSeed
protected int getDefaultSeed()
Returns the default seed value.
- Returns:
- the default seed
-
setSeed
public void setSeed(int value)
Sets the seed value.
- Specified by:
setSeed in interface weka.core.Randomizable
- Parameters:
value - the seed
-
getSeed
public int getSeed()
Returns the seed value.
- Specified by:
getSeed in interface weka.core.Randomizable
- Returns:
- the seed
-
seedTipText
public String seedTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultNumFolds
protected int getDefaultNumFolds()
Returns the default number of folds to use in CV.
- Returns:
- the default folds
-
setNumFolds
public void setNumFolds(int value)
Sets the number of folds to use.
- Parameters:
value - the folds
-
getNumFolds
public int getNumFolds()
Returns the number of folds to use in CV.
- Returns:
- the folds
-
numFoldsTipText
public String numFoldsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultNumThreads
protected int getDefaultNumThreads()
Returns the default number of threads to use for cross-validation.
- Returns:
- the default number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
-
setNumThreads
public void setNumThreads(int value)
Sets the number of threads to use for cross-validation.
- Parameters:
value - the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
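The thread-count convention above can be illustrated with a small helper. This is a hypothetical sketch, not code from the filter: `ThreadCountSketch` and `effectiveThreads` are made-up names, and the treatment of values below -1 (sequential) is an assumption, since the documentation only covers -1, 0, 1 and positive counts.

```java
/** Hypothetical helper illustrating the documented num-threads convention. */
final class ThreadCountSketch {

  /** Maps a num-threads setting to the number of worker threads to use. */
  static int effectiveThreads(int numThreads, int availableCores) {
    if (numThreads == -1) {
      return availableCores; // -1 = one thread per CPU/core
    }
    if (numThreads <= 1) {
      // 0 or 1 = sequential execution (documented);
      // values below -1 are undocumented and treated as sequential here (assumption)
      return 1;
    }
    return numThreads; // explicit thread count
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println(effectiveThreads(-1, cores)); // machine dependent
    System.out.println(effectiveThreads(0, cores));  // prints 1
    System.out.println(effectiveThreads(4, cores));  // prints 4
  }
}
```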
-
getNumThreads
public int getNumThreads()
Returns the number of threads to use for cross-validation.
- Returns:
- the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
-
numThreadsTipText
public String numThreadsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultDetector
protected adams.flow.control.removeoutliers.AbstractOutlierDetector getDefaultDetector()
Returns the default detector.
- Returns:
- the default detector
-
setDetector
public void setDetector(adams.flow.control.removeoutliers.AbstractOutlierDetector value)
Sets the detector.
- Parameters:
value - the detector
-
getDetector
public adams.flow.control.removeoutliers.AbstractOutlierDetector getDetector()
Returns the detector.
- Returns:
- the detector
-
detectorTipText
public String detectorTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface weka.core.OptionHandler
- Overrides:
listOptions in class weka.filters.Filter
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).
- Specified by:
setOptions in interface weka.core.OptionHandler
- Overrides:
setOptions in class weka.filters.Filter
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
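The "set (or reset)" behaviour can be illustrated with a minimal, hypothetical option holder. The sketch below is not the filter's parsing code: `OptionResetSketch` and `intOption` are made-up names, and only the two integer flags with documented defaults (`-num-folds`, default 10; `-num-threads`, default 1) are modelled.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of non-incremental option setting. */
final class OptionResetSketch {
  int numFolds = 10;   // documented default for -num-folds
  int numThreads = 1;  // documented default for -num-threads

  /** Returns the integer following flag, or fallback when the flag is absent. */
  static int intOption(List<String> options, String flag, int fallback) {
    int pos = options.indexOf(flag);
    return (pos >= 0 && pos + 1 < options.size())
        ? Integer.parseInt(options.get(pos + 1))
        : fallback;
  }

  /** Every call assigns all options; flags that are absent revert to their defaults. */
  void setOptions(String[] options) {
    List<String> list = Arrays.asList(options);
    numFolds = intOption(list, "-num-folds", 10);
    numThreads = intOption(list, "-num-threads", 1);
  }

  public static void main(String[] args) {
    OptionResetSketch s = new OptionResetSketch();
    s.setOptions(new String[] {"-num-folds", "5", "-num-threads", "-1"});
    s.setOptions(new String[] {"-num-threads", "4"});
    System.out.println(s.numFolds + " " + s.numThreads); // prints "10 4"
  }
}
```

In the usage above, the second call does not keep the earlier `-num-folds 5`: because the flag is missing, the fold count falls back to its default. A caller who wants to change a single setting would therefore pass the complete option list again, or use the individual setter instead.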
-
getOptions
public String[] getOptions()
Gets the current option settings for the OptionHandler.
- Specified by:
getOptions in interface weka.core.OptionHandler
- Overrides:
getOptions in class weka.filters.Filter
- Returns:
- the list of current option settings as an array of strings
-
getCapabilities
public weka.core.Capabilities getCapabilities()
Returns the filter's capabilities.
- Specified by:
getCapabilities in interface weka.core.CapabilitiesHandler
- Overrides:
getCapabilities in class weka.filters.Filter
- Returns:
- the capabilities
-
determineOutputFormat
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat) throws Exception
Determines the output format based on the input format and returns this. In case the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, then this method will be called from batchFinished().
- Specified by:
determineOutputFormat in class weka.filters.SimpleFilter
- Parameters:
inputFormat - the input format to base the output format on
- Returns:
- the output format
- Throws:
Exception - in case the determination goes wrong
-
crossValidate
protected weka.classifiers.Evaluation crossValidate(weka.core.Instances data, int folds) throws Exception
Cross-validates the classifier on the given data.
- Parameters:
data - the data to use for cross-validation
folds - the number of folds
- Returns:
- the evaluation
- Throws:
Exception - if cross-validation fails
-
evaluationToSpreadSheet
protected adams.data.spreadsheet.SpreadSheet evaluationToSpreadSheet(weka.classifiers.Evaluation eval) throws Exception
Turns the predictions of the evaluation object into a spreadsheet.
- Parameters:
eval - the evaluation object to convert
- Returns:
- the generated spreadsheet
- Throws:
Exception
-
process
protected weka.core.Instances process(weka.core.Instances data) throws Exception
Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
- Specified by:
process in class weka.filters.SimpleFilter
- Parameters:
data - the data to process
- Returns:
- the modified data
- Throws:
Exception - in case the processing goes wrong
-
-