weka.filters.unsupervised.attribute
Class InterquartileRangeSamp

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.SimpleFilter
          extended by weka.filters.SimpleBatchFilter
              extended by weka.filters.unsupervised.attribute.InterquartileRange
                  extended by weka.filters.unsupervised.attribute.InterquartileRangeSamp
All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.OptionHandler, weka.core.RevisionHandler

public class InterquartileRangeSamp
extends weka.filters.unsupervised.attribute.InterquartileRange

A filter for detecting outliers and extreme values based on interquartile ranges. The filter skips the class attribute.

Outliers:
Q3 + OF*IQR < x <= Q3 + EVF*IQR
or
Q1 - EVF*IQR <= x < Q1 - OF*IQR

Extreme values:
x > Q3 + EVF*IQR
or
x < Q1 - EVF*IQR

Key:
Q1 = 25% quartile
Q3 = 75% quartile
IQR = Interquartile Range, the difference Q3 - Q1
OF = Outlier Factor
EVF = Extreme Value Factor
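
The inequalities above can be read as a three-way classification of a single value. The following is a minimal sketch of that rule in plain Java, not the filter's actual implementation: the class name `IqrTagSketch`, the `Tag` enum, and the `tag` method are hypothetical names introduced for illustration, and Q1 and Q3 are assumed to be already computed.

```java
public class IqrTagSketch {

    /** Hypothetical labels for illustration; the filter itself writes indicator attributes. */
    public enum Tag { NORMAL, OUTLIER, EXTREME }

    /**
     * Applies the documented inequalities to one value.
     * q1 and q3 are the 25% and 75% quartiles, of is the Outlier Factor,
     * evf is the Extreme Value Factor.
     */
    public static Tag tag(double x, double q1, double q3, double of, double evf) {
        double iqr = q3 - q1;
        double upperOutlier = q3 + of * iqr;
        double upperExtreme = q3 + evf * iqr;
        double lowerOutlier = q1 - of * iqr;
        double lowerExtreme = q1 - evf * iqr;

        // Extreme values: strictly beyond the EVF thresholds.
        if (x > upperExtreme || x < lowerExtreme) {
            return Tag.EXTREME;
        }
        // Outliers: strictly beyond the OF thresholds, up to and including the EVF thresholds.
        if (x > upperOutlier || x < lowerOutlier) {
            return Tag.OUTLIER;
        }
        return Tag.NORMAL;
    }

    public static void main(String[] args) {
        // Documented defaults: OF = 3, EVF = 2 * OF = 6.
        // With Q1 = 10 and Q3 = 20 the upper thresholds are 50 (outlier) and 80 (extreme).
        double q1 = 10, q3 = 20, of = 3, evf = 6;
        for (double x : new double[] {50, 51, 80, 81, -40, -41}) {
            System.out.println(x + " -> " + tag(x, q1, q3, of, evf));
        }
    }
}
```

Note that a value exactly on an outlier threshold (50 here) is not tagged, while a value exactly on an extreme value threshold (80 here) still counts as an outlier, which follows from the strict and non-strict comparisons in the definitions.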

Valid options are:

 -D
  Turns on output of debugging information.
 -R <col1,col2-col4,...>
  Specifies list of columns to base outlier/extreme value detection
  on. If an instance is an outlier/extreme value in at least one of
  those attributes, it is tagged accordingly.
  'first' and 'last' are valid indexes.
  (default none)
 -O <num>
  The factor for outlier detection.
  (default: 3)
 -E <num>
  The factor for extreme values detection.
  (default: 2*Outlier Factor)
 -E-as-O
  Tags extreme values also as outliers.
  (default: off)
 -P
  Generates Outlier/ExtremeValue pair for each numeric attribute in
  the range, not just a single indicator pair for all the attributes.
  (default: off)
 -M
  Generates an additional attribute 'Offset' per Outlier/ExtremeValue
  pair that contains the multiplier by which the value is offset from
  the median, in units of the IQR:
     value = median + 'multiplier' * IQR
  Note: implicitly sets '-P'. (default: off)

Thanks to Dale for a few brainstorming sessions.
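
The 'Offset' relation given for the -M option, value = median + 'multiplier' * IQR, can be solved for the multiplier. The sketch below shows only that arithmetic. The class and method names are hypothetical, and throwing an exception for a zero IQR is an assumption made for this example, not documented behavior of the filter.

```java
public class OffsetMultiplierSketch {

    /**
     * Solves value = median + multiplier * IQR for the multiplier.
     * Assumption for this sketch: a zero IQR is rejected, because the
     * division would otherwise be undefined.
     */
    public static double multiplier(double value, double median, double iqr) {
        if (iqr == 0.0) {
            throw new IllegalArgumentException("IQR must be non-zero");
        }
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        // A value of 45 with median 15 and IQR 10 lies 3 IQRs above the median.
        System.out.println(multiplier(45, 15, 10));
    }
}
```

A negative multiplier indicates a value below the median, a positive one a value above it.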

Version:
$Revision: 5157 $
Author:
Dale Fletcher (dale at cs dot waikato dot ac dot nz), fracpete (fracpete at waikato dot ac dot nz)
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class weka.filters.unsupervised.attribute.InterquartileRange
weka.filters.unsupervised.attribute.InterquartileRange.ValueType
 
Field Summary
 
Fields inherited from class weka.filters.unsupervised.attribute.InterquartileRange
m_AttributeIndices, m_Attributes, m_DetectionPerAttribute, m_ExtremeValuesAsOutliers, m_ExtremeValuesFactor, m_IQR, m_LowerExtremeValue, m_LowerOutlier, m_Median, m_OutlierAttributePosition, m_OutlierFactor, m_OutputOffsetMultiplier, m_UpperExtremeValue, m_UpperOutlier, NON_NUMERIC
 
Fields inherited from class weka.filters.SimpleFilter
m_Debug
 
Fields inherited from class weka.filters.Filter
m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
 
Constructor Summary
InterquartileRangeSamp()
           
 
Method Summary
protected  void computeThresholds(weka.core.Instances instances)
          Computes the thresholds for outliers and extreme values.
 String globalInfo()
          Returns a string describing this filter
static void main(String[] args)
          Main method for testing this class.
 
Methods inherited from class weka.filters.unsupervised.attribute.InterquartileRange
attributeIndicesTipText, calculateMultiplier, detectionPerAttributeTipText, determineOutputFormat, extremeValuesAsOutliersTipText, extremeValuesFactorTipText, getAttributeIndices, getCapabilities, getDetectionPerAttribute, getExtremeValuesAsOutliers, getExtremeValuesFactor, getOptions, getOutlierFactor, getOutputOffsetMultiplier, getRevision, getValues, isExtremeValue, isExtremeValue, isOutlier, isOutlier, listOptions, outlierFactorTipText, outputOffsetMultiplierTipText, process, setAttributeIndices, setAttributeIndicesArray, setDetectionPerAttribute, setExtremeValuesAsOutliers, setExtremeValuesFactor, setOptions, setOutlierFactor, setOutputOffsetMultiplier
 
Methods inherited from class weka.filters.SimpleBatchFilter
batchFinished, hasImmediateOutputFormat, input
 
Methods inherited from class weka.filters.SimpleFilter
debugTipText, getDebug, reset, setDebug, setInputFormat
 
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, filterFile, flushInput, getCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, push, resetQueue, runFilter, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

InterquartileRangeSamp

public InterquartileRangeSamp()
Method Detail

globalInfo

public String globalInfo()
Returns a string describing this filter

Overrides:
globalInfo in class weka.filters.unsupervised.attribute.InterquartileRange
Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

computeThresholds

protected void computeThresholds(weka.core.Instances instances)
Computes the thresholds for outliers and extreme values.

Overrides:
computeThresholds in class weka.filters.unsupervised.attribute.InterquartileRange
Parameters:
instances - the data to work on

main

public static void main(String[] args)
Main method for testing this class.

Parameters:
args - should contain arguments to the filter: use -h for help


Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.