weka.filters.unsupervised.attribute
Class InterquartileRangeSamp
java.lang.Object
weka.filters.Filter
weka.filters.SimpleFilter
weka.filters.SimpleBatchFilter
weka.filters.unsupervised.attribute.InterquartileRange
weka.filters.unsupervised.attribute.InterquartileRangeSamp
- All Implemented Interfaces:
- Serializable, weka.core.CapabilitiesHandler, weka.core.OptionHandler, weka.core.RevisionHandler
public class InterquartileRangeSamp
- extends weka.filters.unsupervised.attribute.InterquartileRange
A filter for detecting outliers and extreme values based on interquartile ranges. The filter skips the class attribute.
Outliers:
Q3 + OF*IQR < x <= Q3 + EVF*IQR
or
Q1 - EVF*IQR <= x < Q1 - OF*IQR
Extreme values:
x > Q3 + EVF*IQR
or
x < Q1 - EVF*IQR
Key:
Q1 = 25% quartile
Q3 = 75% quartile
IQR = Interquartile Range, i.e. Q3 - Q1
OF = Outlier Factor
EVF = Extreme Value Factor
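The inequalities above can be sketched in plain Java as follows. This is only an illustration of the stated rules, not the filter's actual source: the class name IqrClassifySketch, the Tag enum, and the classify method are hypothetical, and Q1 and Q3 are assumed to be known already.

```java
// Hypothetical illustration of the outlier/extreme value rules; not part of Weka.
final class IqrClassifySketch {
    enum Tag { NORMAL, OUTLIER, EXTREME }

    // of = Outlier Factor, evf = Extreme Value Factor (evf >= of is assumed).
    static Tag classify(double x, double q1, double q3, double of, double evf) {
        double iqr = q3 - q1;
        // Extreme values: x > Q3 + EVF*IQR or x < Q1 - EVF*IQR
        if (x > q3 + evf * iqr || x < q1 - evf * iqr) {
            return Tag.EXTREME;
        }
        // Outliers: beyond the OF bounds but not beyond the EVF bounds
        if (x > q3 + of * iqr || x < q1 - of * iqr) {
            return Tag.OUTLIER;
        }
        return Tag.NORMAL;
    }

    public static void main(String[] args) {
        double q1 = 10.0, q3 = 20.0;   // IQR = 10
        double of = 3.0, evf = 6.0;    // documented defaults: 3 and 2*OF
        // Outlier bounds are -20 and 50, extreme value bounds are -50 and 80.
        for (double x : new double[] {15.0, 55.0, 85.0, -25.0, -55.0}) {
            System.out.println(x + " -> " + classify(x, q1, q3, of, evf));
        }
    }
}
```

Note that a value exactly on an extreme value bound (here 80 or -50) still counts as an outlier, and a value exactly on an outlier bound (50 or -20) counts as neither, which matches the strict and non-strict comparisons in the definitions.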
Valid options are:
-D
Turns on output of debugging information.
-R <col1,col2-col4,...>
Specifies the list of columns to base outlier/extreme value
detection on. If an instance is considered an outlier/extreme value
in at least one of those attributes, it is tagged accordingly.
'first' and 'last' are valid indexes.
(default: none)
-O <num>
The factor for outlier detection.
(default: 3)
-E <num>
The factor for extreme values detection.
(default: 2*Outlier Factor)
-E-as-O
Tags extreme values also as outliers.
(default: off)
-P
Generates Outlier/ExtremeValue pair for each numeric attribute in
the range, not just a single indicator pair for all the attributes.
(default: off)
-M
Generates an additional attribute 'Offset' per Outlier/ExtremeValue
pair that contains the multiplier by which the value is offset from
the median, measured in units of the IQR:
value = median + 'multiplier' * IQR
Note: implicitly sets '-P'. (default: off)
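The 'Offset' relation given for -M can be rearranged to obtain the multiplier from a value. The following is a minimal sketch of that arithmetic only; the class OffsetMultiplierSketch and its method are hypothetical names, and how the filter itself treats an IQR of zero is not described here.

```java
// Hypothetical illustration of the 'Offset' multiplier arithmetic; not part of Weka.
final class OffsetMultiplierSketch {
    // Rearranged from: value = median + multiplier * IQR
    // With iqr == 0 this yields Infinity or NaN under Java's floating-point rules.
    static double multiplier(double value, double median, double iqr) {
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        double median = 15.0, iqr = 10.0;
        double value = 55.0;
        double m = multiplier(value, median, iqr);
        // The value lies m interquartile ranges above the median.
        System.out.println("multiplier = " + m);
        System.out.println("reconstructed value = " + (median + m * iqr));
    }
}
```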
Thanks to Dale for a few brainstorming sessions.
- Version:
- $Revision: 5157 $
- Author:
- Dale Fletcher (dale at cs dot waikato dot ac dot nz), fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
| Nested classes/interfaces inherited from class weka.filters.unsupervised.attribute.InterquartileRange |
weka.filters.unsupervised.attribute.InterquartileRange.ValueType |
| Fields inherited from class weka.filters.unsupervised.attribute.InterquartileRange |
m_AttributeIndices, m_Attributes, m_DetectionPerAttribute, m_ExtremeValuesAsOutliers, m_ExtremeValuesFactor, m_IQR, m_LowerExtremeValue, m_LowerOutlier, m_Median, m_OutlierAttributePosition, m_OutlierFactor, m_OutputOffsetMultiplier, m_UpperExtremeValue, m_UpperOutlier, NON_NUMERIC |
| Fields inherited from class weka.filters.SimpleFilter |
m_Debug |
| Fields inherited from class weka.filters.Filter |
m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts |
| Method Summary |
| protected void | computeThresholds(weka.core.Instances instances): Computes the thresholds for outliers and extreme values. |
| String | globalInfo(): Returns a string describing this filter. |
| static void | main(String[] args): Main method for testing this class. |
| Methods inherited from class weka.filters.unsupervised.attribute.InterquartileRange |
attributeIndicesTipText, calculateMultiplier, detectionPerAttributeTipText, determineOutputFormat, extremeValuesAsOutliersTipText, extremeValuesFactorTipText, getAttributeIndices, getCapabilities, getDetectionPerAttribute, getExtremeValuesAsOutliers, getExtremeValuesFactor, getOptions, getOutlierFactor, getOutputOffsetMultiplier, getRevision, getValues, isExtremeValue, isExtremeValue, isOutlier, isOutlier, listOptions, outlierFactorTipText, outputOffsetMultiplierTipText, process, setAttributeIndices, setAttributeIndicesArray, setDetectionPerAttribute, setExtremeValuesAsOutliers, setExtremeValuesFactor, setOptions, setOutlierFactor, setOutputOffsetMultiplier |
| Methods inherited from class weka.filters.SimpleBatchFilter |
batchFinished, hasImmediateOutputFormat, input |
| Methods inherited from class weka.filters.SimpleFilter |
debugTipText, getDebug, reset, setDebug, setInputFormat |
| Methods inherited from class weka.filters.Filter |
batchFilterFile, bufferInput, copyValues, copyValues, filterFile, flushInput, getCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, push, resetQueue, runFilter, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper |
InterquartileRangeSamp
public InterquartileRangeSamp()
globalInfo
public String globalInfo()
- Returns a string describing this filter
- Overrides:
globalInfo in class weka.filters.unsupervised.attribute.InterquartileRange
- Returns:
- a description of the filter suitable for
displaying in the Explorer/Experimenter GUI
computeThresholds
protected void computeThresholds(weka.core.Instances instances)
- Computes the thresholds for outliers and extreme values.
- Overrides:
computeThresholds in class weka.filters.unsupervised.attribute.InterquartileRange
- Parameters:
instances - the data to work on
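As a rough sketch of what computing such thresholds involves for a single numeric attribute, the following derives the four bounds from raw values. It is not the code of this class: ThresholdSketch and its methods are hypothetical, and the quartile rule used (median of the lower and upper halves of the sorted values) is an assumption, since this page does not state how quartiles are estimated.

```java
import java.util.Arrays;

// Hypothetical illustration of threshold computation; not the filter's implementation.
final class ThresholdSketch {
    // Median of the sorted slice [from, to); the slice must not be empty.
    static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Returns {lowerExtreme, lowerOutlier, upperOutlier, upperExtreme}.
    // Assumes at least two values and evf >= of.
    static double[] thresholds(double[] values, double of, double evf) {
        double[] s = values.clone();
        Arrays.sort(s);
        int half = s.length / 2;
        double q1 = median(s, 0, half);                   // assumed quartile rule
        double q3 = median(s, s.length - half, s.length); // assumed quartile rule
        double iqr = q3 - q1;
        return new double[] {
            q1 - evf * iqr, q1 - of * iqr, q3 + of * iqr, q3 + evf * iqr
        };
    }

    public static void main(String[] args) {
        double[] data = {8, 1, 6, 3, 5, 2, 7, 4};
        // For this data Q1 = 2.5, Q3 = 6.5 and IQR = 4 under the assumed rule.
        System.out.println(Arrays.toString(thresholds(data, 3.0, 6.0)));
    }
}
```

The four returned bounds appear to correspond by name to the inherited fields m_LowerExtremeValue, m_LowerOutlier, m_UpperOutlier, and m_UpperExtremeValue, which the filter keeps per attribute.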
main
public static void main(String[] args)
- Main method for testing this class.
- Parameters:
args - the arguments to pass to the filter; use -h for help
Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.