weka.filters.unsupervised.instance
Class RemoveMisclassifiedRel

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.unsupervised.instance.RemoveMisclassifiedRel
All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.OptionHandler, weka.core.RevisionHandler, weka.filters.UnsupervisedFilter

public class RemoveMisclassifiedRel
extends weka.filters.Filter
implements weka.filters.UnsupervisedFilter, weka.core.OptionHandler

A filter that removes instances which are incorrectly classified. Useful for removing outliers.

Valid options are:

 -W <classifier specification>
  Full class name of classifier to use, followed
  by scheme options, e.g.:
   "weka.classifiers.bayes.NaiveBayes -D"
  (default: weka.classifiers.rules.ZeroR)
 -C <class index>
  Attribute on which misclassifications are based.
  If < 0, the currently set class is used, or the last attribute if no class is set.
 -F <number of folds>
  The number of folds to use for cross-validation cleansing.
  (<2 = no cross-validation - default).
 -T <threshold>
  Threshold for the max error when predicting numeric class.
  (Value should be >= 0, default = 0.1).
 -I <max iterations>
  The maximum number of cleansing iterations to perform.
  (<1 = until fully cleansed - default)
 -V
  Invert the match so that correctly classified instances are discarded.
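
As a rough illustration of how the options above fit together, the sketch below assembles an option array of the kind setOptions(String[]) accepts. The class CleansingOptions and its build method are hypothetical helpers, not part of Weka; the flag letters and the example classifier specification come from the list above, and the sketch assumes -I takes a numeric argument, as its description implies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Weka: assembles the flags documented above. */
final class CleansingOptions {

    /**
     * Builds an option array in the order -W, -C, -F, -T, -I, -V.
     * The classifier specification stays a single array element,
     * e.g. "weka.classifiers.bayes.NaiveBayes -D".
     */
    static String[] build(String classifierSpec, int classIndex, int folds,
                          double threshold, int maxIterations, boolean invert) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold should be >= 0");
        }
        List<String> opts = new ArrayList<>();
        opts.add("-W");
        opts.add(classifierSpec);
        opts.add("-C");
        opts.add(String.valueOf(classIndex));
        opts.add("-F");
        opts.add(String.valueOf(folds));
        opts.add("-T");
        opts.add(String.valueOf(threshold));
        opts.add("-I");
        opts.add(String.valueOf(maxIterations));
        if (invert) {
            opts.add("-V"); // flag without an argument
        }
        return opts.toArray(new String[0]);
    }
}
```

The resulting array could then be passed to the filter's setOptions(String[]) method, which is declared to throw an Exception if an option is not supported.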
 

Version:
$Revision: 4584 $
Author:
Richard Kirkby (rkirkby@cs.waikato.ac.nz), Malcolm Ware (mfw4@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  int m_classIndex
          The attribute to treat as the class for purposes of cleansing.
protected  weka.classifiers.Classifier m_cleansingClassifier
          The classifier used to do the cleansing
protected  boolean m_firstBatchFinished
          Have we processed the first batch (i.e. training data)?
protected  boolean m_invertMatching
          Whether to invert the match so the correctly classified instances are discarded
protected  double m_numericClassifyThreshold
          The threshold for deciding when a numeric value is correctly classified
protected  double m_numericClassifyThresholdAbs
          If the absolute error is less than this, the prediction is treated as correct.
protected  int m_numOfCleansingIterations
          The maximum number of cleansing iterations to perform (<1 = until fully cleansed)
protected  int m_numOfCrossValidationFolds
          The number of cross validation folds to perform (<2 = no cross validation)
 
Fields inherited from class weka.filters.Filter
m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
 
Constructor Summary
RemoveMisclassifiedRel()
           
 
Method Summary
 String absErrTipText()
          Returns the tip text for this property
 boolean batchFinished()
          Signify that this batch of input to the filter is finished.
 String classifierTipText()
          Returns the tip text for this property
 String classIndexTipText()
          Returns the tip text for this property
 double getAbsErr()
          Gets the threshold for the max error when predicting a numeric class.
 weka.core.Capabilities getCapabilities()
          Returns the Capabilities of this filter.
 weka.classifiers.Classifier getClassifier()
          Gets the classifier used by the filter.
protected  String getClassifierSpec()
          Gets the classifier specification string, which contains the class name of the classifier and any options to the classifier.
 int getClassIndex()
          Gets the attribute on which misclassifications are based.
 boolean getInvert()
          Get whether selection is inverted.
 int getMaxIterations()
          Gets the maximum number of cleansing iterations to perform.
 int getNumFolds()
          Gets the number of cross-validation folds used by the filter.
 String[] getOptions()
          Gets the current settings of the filter.
 String getRevision()
          Returns the revision string.
 double getThreshold()
          Gets the threshold for the max error when predicting a numeric class.
 String globalInfo()
          Returns a string describing this filter
 boolean input(weka.core.Instance instance)
          Input an instance for filtering.
 String invertTipText()
          Returns the tip text for this property
 Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(String[] argv)
          Main method for testing this class.
 String maxIterationsTipText()
          Returns the tip text for this property
 String numFoldsTipText()
          Returns the tip text for this property
 void setAbsErr(double threshold)
          Sets the threshold for the max error when predicting a numeric class.
 void setClassifier(weka.classifiers.Classifier classifier)
          Sets the classifier to classify instances with.
 void setClassIndex(int classIndex)
          Sets the attribute on which misclassifications are based.
 boolean setInputFormat(weka.core.Instances instanceInfo)
          Sets the format of the input instances.
 void setInvert(boolean invert)
          Set whether selection is inverted.
 void setMaxIterations(int iterations)
          Sets the maximum number of cleansing iterations to perform; a value < 1 means iterate until fully cleansed.
 void setNumFolds(int numOfFolds)
          Sets the number of cross-validation folds to use; a value < 2 means no cross-validation.
 void setOptions(String[] options)
          Parses a given list of options.
 void setThreshold(double threshold)
          Sets the threshold for the max error when predicting a numeric class.
 String thresholdTipText()
          Returns the tip text for this property.
 
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, filterFile, flushInput, getCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, push, resetQueue, runFilter, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_cleansingClassifier

protected weka.classifiers.Classifier m_cleansingClassifier
The classifier used to do the cleansing


m_classIndex

protected int m_classIndex
The attribute to treat as the class for purposes of cleansing.


m_numOfCrossValidationFolds

protected int m_numOfCrossValidationFolds
The number of cross validation folds to perform (<2 = no cross validation)


m_numOfCleansingIterations

protected int m_numOfCleansingIterations
The maximum number of cleansing iterations to perform (<1 = until fully cleansed)


m_numericClassifyThreshold

protected double m_numericClassifyThreshold
The threshold for deciding when a numeric value is correctly classified


m_numericClassifyThresholdAbs

protected double m_numericClassifyThresholdAbs
If the absolute error is less than this, the prediction is treated as correct.


m_invertMatching

protected boolean m_invertMatching
Whether to invert the match so the correctly classified instances are discarded


m_firstBatchFinished

protected boolean m_firstBatchFinished
Have we processed the first batch (i.e. training data)?

Constructor Detail

RemoveMisclassifiedRel

public RemoveMisclassifiedRel()
Method Detail

getCapabilities

public weka.core.Capabilities getCapabilities()
Returns the Capabilities of this filter.

Specified by:
getCapabilities in interface weka.core.CapabilitiesHandler
Overrides:
getCapabilities in class weka.filters.Filter
Returns:
the capabilities of this object
See Also:
Capabilities

setInputFormat

public boolean setInputFormat(weka.core.Instances instanceInfo)
                       throws Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class weka.filters.Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true if the outputFormat may be collected immediately
Throws:
Exception - if the inputFormat can't be set successfully

input

public boolean input(weka.core.Instance instance)
              throws Exception
Input an instance for filtering.

Overrides:
input in class weka.filters.Filter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
NullPointerException - if the input format has not been defined.
Exception - if the input instance was not of the correct format or if there was a problem with the filtering.

batchFinished

public boolean batchFinished()
                      throws Exception
Signify that this batch of input to the filter is finished.

Overrides:
batchFinished in class weka.filters.Filter
Returns:
true if there are instances pending output
Throws:
IllegalStateException - if no input structure has been defined
Exception - if a problem occurs while processing the batch
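
The calling sequence implied by setInputFormat, input, and batchFinished can be sketched with a toy model. ToyBatchFilter below is hypothetical and is not Weka's Filter class: instances are plain strings, the keep predicate stands in for the cleansing decision, and passing later batches through unchanged is an assumption suggested by the m_firstBatchFinished field rather than something this page states.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

/** Toy model of a batch filter (hypothetical; not Weka's Filter class). */
final class ToyBatchFilter {
    private final Predicate<String> keep;            // stands in for the cleansing decision
    private final List<String> buffer = new ArrayList<>();
    private final Queue<String> outputQueue = new ArrayDeque<>();
    private boolean formatDefined = false;
    private boolean firstBatchFinished = false;

    ToyBatchFilter(Predicate<String> keep) {
        this.keep = keep;
    }

    /** Declares the input structure and resets all state. */
    void setInputFormat() {
        formatDefined = true;
        firstBatchFinished = false;
        buffer.clear();
        outputQueue.clear();
    }

    /** Buffers during the first batch; later instances go straight to the output queue. */
    boolean input(String instance) {
        if (!formatDefined) {
            throw new NullPointerException("No input instance format defined");
        }
        if (firstBatchFinished) {
            outputQueue.add(instance);
            return true;
        }
        buffer.add(instance);
        return false;
    }

    /** Filters the buffered first batch and reports whether output is pending. */
    boolean batchFinished() {
        if (!formatDefined) {
            throw new IllegalStateException("No input instance format defined");
        }
        for (String instance : buffer) {
            if (keep.test(instance)) {
                outputQueue.add(instance);
            }
        }
        buffer.clear();
        firstBatchFinished = true;
        return !outputQueue.isEmpty();
    }

    /** Returns the next filtered instance, or null if none is pending. */
    String output() {
        return outputQueue.poll();
    }

    int numPendingOutput() {
        return outputQueue.size();
    }
}
```

In this model a caller would invoke setInputFormat once, feed every training instance through input, call batchFinished, and then drain output until numPendingOutput reaches zero.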

listOptions

public Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface weka.core.OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(String[] options)
                throws Exception
Parses a given list of options.

Valid options are:

 -W <classifier specification>
  Full class name of classifier to use, followed
  by scheme options, e.g.:
   "weka.classifiers.bayes.NaiveBayes -D"
  (default: weka.classifiers.rules.ZeroR)
 -C <class index>
  Attribute on which misclassifications are based.
  If < 0, the currently set class is used, or the last attribute if no class is set.
 -F <number of folds>
  The number of folds to use for cross-validation cleansing.
  (<2 = no cross-validation - default).
 -T <threshold>
  Threshold for the max error when predicting numeric class.
  (Value should be >= 0, default = 0.1).
 -I <max iterations>
  The maximum number of cleansing iterations to perform.
  (<1 = until fully cleansed - default)
 -V
  Invert the match so that correctly classified instances are discarded.
 

Specified by:
setOptions in interface weka.core.OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported

getOptions

public String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface weka.core.OptionHandler
Returns:
an array of strings suitable for passing to setOptions

globalInfo

public String globalInfo()
Returns a string describing this filter

Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

classifierTipText

public String classifierTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassifier

public void setClassifier(weka.classifiers.Classifier classifier)
Sets the classifier to classify instances with.

Parameters:
classifier - The classifier to be used (with its options set).

getClassifier

public weka.classifiers.Classifier getClassifier()
Gets the classifier used by the filter.

Returns:
The classifier to be used.

getClassifierSpec

protected String getClassifierSpec()
Gets the classifier specification string, which contains the class name of the classifier and any options to the classifier.

Returns:
the classifier string.
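
To make the shape of a classifier specification string concrete, here is a minimal sketch that joins a class name with its options. ClassifierSpec is a hypothetical helper, not the method's actual implementation, and it assumes no option contains spaces or quotes, so no escaping is applied.

```java
/** Hypothetical helper, not part of Weka: joins a class name and its options. */
final class ClassifierSpec {

    /** Assumes no option contains spaces or quotes, so nothing is escaped. */
    static String of(String className, String... options) {
        StringBuilder spec = new StringBuilder(className);
        for (String option : options) {
            spec.append(' ').append(option);
        }
        return spec.toString();
    }
}
```

Under these assumptions, ClassifierSpec.of("weka.classifiers.bayes.NaiveBayes", "-D") yields the example string shown for the -W option.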

classIndexTipText

public String classIndexTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassIndex

public void setClassIndex(int classIndex)
Sets the attribute on which misclassifications are based. If < 0, the currently set class is used, or the last attribute if no class is set.

Parameters:
classIndex - the class index.
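
The fallback rule for a negative class index can be sketched as follows. ClassIndexResolver is a hypothetical helper, not the filter's own code; it assumes zero-based attribute indices and that a negative current class index means no class has been set on the data.

```java
/** Hypothetical helper, not part of Weka: resolves the effective class index. */
final class ClassIndexResolver {

    /**
     * @param requested         value passed to setClassIndex (or -C)
     * @param currentClassIndex class index already set on the data, negative if none
     * @param numAttributes     number of attributes in the data
     * @return the zero-based index of the attribute to treat as the class
     */
    static int resolve(int requested, int currentClassIndex, int numAttributes) {
        if (requested >= 0) {
            return requested;            // explicit choice wins
        }
        if (currentClassIndex >= 0) {
            return currentClassIndex;    // fall back to the class already set
        }
        return numAttributes - 1;        // otherwise default to the last attribute
    }
}
```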

getClassIndex

public int getClassIndex()
Gets the attribute on which misclassifications are based.

Returns:
the class index.

numFoldsTipText

public String numFoldsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setNumFolds

public void setNumFolds(int numOfFolds)
Sets the number of cross-validation folds to use; a value < 2 means no cross-validation.

Parameters:
numOfFolds - the number of folds.
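
One way to picture cross-validation cleansing is the sketch below: each instance is judged by a model trained on the other folds, and only correctly predicted instances are kept. Everything here is an illustrative assumption rather than the filter's implementation: CrossValidationCleansing is a hypothetical class, folds are assigned round-robin without shuffling or stratification, and the classifier is a ZeroR-like majority-label predictor that looks only at class labels.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; not the filter's actual implementation. */
final class CrossValidationCleansing {

    /** ZeroR-like stand-in: predicts the most frequent training label (ties go to the lower label). */
    static int majorityLabel(List<Integer> trainLabels, int numClasses) {
        int[] counts = new int[numClasses];
        for (int label : trainLabels) {
            counts[label]++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (counts[c] > counts[best]) {
                best = c;
            }
        }
        return best;
    }

    /** Returns the sorted indices of instances whose prediction matches their label. */
    static List<Integer> keptIndices(int[] labels, int numClasses, int folds) {
        List<Integer> kept = new ArrayList<>();
        if (folds < 2) {
            // No cross-validation: train on everything, then test on the same data.
            List<Integer> all = new ArrayList<>();
            for (int label : labels) {
                all.add(label);
            }
            int prediction = majorityLabel(all, numClasses);
            for (int i = 0; i < labels.length; i++) {
                if (labels[i] == prediction) {
                    kept.add(i);
                }
            }
            return kept;
        }
        for (int fold = 0; fold < folds; fold++) {
            // Train on every instance that is not in the held-out fold.
            List<Integer> train = new ArrayList<>();
            for (int i = 0; i < labels.length; i++) {
                if (i % folds != fold) {
                    train.add(labels[i]);
                }
            }
            int prediction = majorityLabel(train, numClasses);
            // Judge only the held-out instances with that model.
            for (int i = fold; i < labels.length; i += folds) {
                if (labels[i] == prediction) {
                    kept.add(i);
                }
            }
        }
        Collections.sort(kept);
        return kept;
    }
}
```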

getNumFolds

public int getNumFolds()
Gets the number of cross-validation folds used by the filter.

Returns:
the number of folds.

absErrTipText

public String absErrTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setAbsErr

public void setAbsErr(double threshold)
Sets the threshold for the max error when predicting a numeric class. The value should be >= 0.

Parameters:
threshold - the numeric threshold.
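
A minimal sketch of an absolute-error check for a numeric class follows, based on the field description "if the absolute error is less than this". NumericErrorCheck is a hypothetical helper; the strict comparison and the way this threshold interacts with the one set by setThreshold are assumptions, since this page does not specify them.

```java
/** Hypothetical helper, not part of Weka: absolute-error test for a numeric class. */
final class NumericErrorCheck {

    /** Returns true when |predicted - actual| is strictly below the threshold (assumed boundary rule). */
    static boolean withinAbsoluteError(double actual, double predicted, double absThreshold) {
        if (absThreshold < 0) {
            throw new IllegalArgumentException("threshold should be >= 0");
        }
        return Math.abs(predicted - actual) < absThreshold;
    }
}
```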

getAbsErr

public double getAbsErr()
Gets the threshold for the max error when predicting a numeric class.

Returns:
the numeric threshold.

thresholdTipText

public String thresholdTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setThreshold

public void setThreshold(double threshold)
Sets the threshold for the max error when predicting a numeric class. The value should be >= 0.

Parameters:
threshold - the numeric threshold.

getThreshold

public double getThreshold()
Gets the threshold for the max error when predicting a numeric class.

Returns:
the numeric threshold.

maxIterationsTipText

public String maxIterationsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setMaxIterations

public void setMaxIterations(int iterations)
Sets the maximum number of cleansing iterations to perform; a value < 1 means iterate until fully cleansed.

Parameters:
iterations - the maximum number of iterations.
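
The effect of the iteration limit can be illustrated with a toy loop in which each pass retrains on the surviving instances and removes those predicted badly. IterativeCleansing is a hypothetical class, not the filter's code: the "classifier" is a ZeroR-like mean predictor for a numeric class, and an instance survives a pass when its absolute error is below an assumed tolerance.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the filter's actual implementation. */
final class IterativeCleansing {

    /** ZeroR-like stand-in for a numeric class: always predicts the mean. */
    static double mean(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return values.isEmpty() ? 0.0 : sum / values.size();
    }

    /**
     * Repeats cleansing passes over the class values.
     * maxIterations < 1 means: continue until a pass removes nothing.
     */
    static List<Double> cleanse(List<Double> classValues, double tolerance, int maxIterations) {
        List<Double> current = new ArrayList<>(classValues);
        int iterations = 0;
        while (maxIterations < 1 || iterations < maxIterations) {
            iterations++;
            double prediction = mean(current);   // "retrain" on the survivors
            List<Double> kept = new ArrayList<>();
            for (double v : current) {
                if (Math.abs(v - prediction) < tolerance) {
                    kept.add(v);
                }
            }
            if (kept.size() == current.size()) {
                break;                           // nothing removed: fully cleansed
            }
            current = kept;
        }
        return current;
    }
}
```

With eight values of 0.0 plus 40.0 and 200.0 and a tolerance of 25.0, a single pass removes only 200.0; a second pass, run against the lower mean, also removes 40.0, which is why the iteration limit changes the result.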

getMaxIterations

public int getMaxIterations()
Gets the maximum number of cleansing iterations to perform.

Returns:
the maximum number of iterations.

invertTipText

public String invertTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setInvert

public void setInvert(boolean invert)
Set whether selection is inverted.

Parameters:
invert - whether or not to invert selection.
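
The inversion flag simply flips which group survives, as the following sketch shows. InvertSelection is a hypothetical helper; the correct array is assumed to hold, per instance, whether the cleansing classifier predicted it correctly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Weka: applies the invert flag to a selection. */
final class InvertSelection {

    /** Returns the indices to keep, given per-instance correctness flags. */
    static List<Integer> keptIndices(boolean[] correct, boolean invert) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < correct.length; i++) {
            // Normally keep correctly classified instances;
            // with invert set, keep the misclassified ones instead.
            if (correct[i] != invert) {
                kept.add(i);
            }
        }
        return kept;
    }
}
```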

getInvert

public boolean getInvert()
Get whether selection is inverted.

Returns:
whether or not selection is inverted.

getRevision

public String getRevision()
Returns the revision string.

Specified by:
getRevision in interface weka.core.RevisionHandler
Overrides:
getRevision in class weka.filters.Filter
Returns:
the revision

main

public static void main(String[] argv)
Main method for testing this class.

Parameters:
argv - should contain arguments to the filter: use -h for help


Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.