Package adams.data.cleaner.instance
Class IQRCleaner
- java.lang.Object
- adams.core.logging.LoggingObject
- adams.core.logging.CustomLoggingLevelObject
- adams.core.option.AbstractOptionHandler
- adams.data.cleaner.instance.AbstractCleaner
- adams.data.cleaner.instance.AbstractSerializableCleaner
- adams.data.cleaner.instance.IQRCleaner
- All Implemented Interfaces:
adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.SerializableObject, adams.core.ShallowCopySupporter<AbstractCleaner>, adams.core.SizeOfHandler, CleanerDetails<adams.data.spreadsheet.SpreadSheet>, adams.flow.core.FlowContextHandler, Serializable, Comparable
public class IQRCleaner extends AbstractSerializableCleaner implements CleanerDetails<adams.data.spreadsheet.SpreadSheet>
Removes instances outside the given IQR multiplier.
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING
-pre-filter <weka.filters.Filter> (property: preFilter) The filter to use for pre-filtering the data. default: weka.filters.AllFilter
-serialization-file <adams.core.io.PlaceholderFile> (property: serializationFile) The file to serialize the generated internal model to. default: ${CWD}
-override-serialized-file <boolean> (property: overrideSerializedFile) If set to true, then any serialized file will be ignored and the setup for serialization will be regenerated. default: false
-filter <weka.filters.Filter> (property: filter) The IQR filter to use; parameters get set internally. default: weka.filters.unsupervised.attribute.InterquartileRange -R first-last -O 3.0 -E 6.0
-iqr <double> (property: iqr) IQR multiplier for min/max values. default: 4.25 minimum: 0.0
-attribute-range <adams.core.Range> (property: attributeRange) The attribute range to work on. default: first-last example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-remove-with-missing <boolean> (property: removeWithMissing) If enabled, instances with missing values get removed. default: true
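The effect of the -iqr and -remove-with-missing options can be illustrated with a small, library-free sketch. It assumes that the multiplier m defines the bounds Q1 - m*IQR and Q3 + m*IQR, that quartiles are computed by linear interpolation, and that missing values are encoded as NaN; none of this is confirmed by the documentation above, and the class IqrSketch and its methods are hypothetical names, not part of ADAMS or Weka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of IQR-based cleaning; not the ADAMS implementation. */
class IqrSketch {

    /** Quantile of an ascending-sorted array using linear interpolation (one convention among several). */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Returns {lower, upper} = {Q1 - m*IQR, Q3 + m*IQR}; NaN (assumed "missing") is ignored. */
    static double[] bounds(double[] values, double multiplier) {
        double[] sorted = Arrays.stream(values).filter(v -> !Double.isNaN(v)).sorted().toArray();
        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        return new double[] {q1 - multiplier * iqr, q3 + multiplier * iqr};
    }

    /** Keeps values inside the bounds; NaN values are dropped only if removeWithMissing is true. */
    static List<Double> clean(double[] values, double multiplier, boolean removeWithMissing) {
        double[] b = bounds(values, multiplier);
        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            if (Double.isNaN(v)) {
                if (!removeWithMissing) {
                    kept.add(v);
                }
            } else if (v >= b[0] && v <= b[1]) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 100, Double.NaN};
        // 4.25 mirrors the documented default of the -iqr option.
        System.out.println(clean(data, 4.25, true));
    }
}
```

A larger multiplier widens the bounds and removes fewer instances; a multiplier of 0.0 (the documented minimum) would, under these assumptions, keep only values between Q1 and Q3.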
- Author:
- dale (dale at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected weka.filters.Filter | m_ActualFilter | the actual IQR filter.
protected weka.filters.unsupervised.attribute.InterquartileRange | m_Filter | the IQR filter.
protected double | m_IQR | the IQR multiplier.
protected adams.core.Range | m_Range | the attribute range to work on.
protected boolean | m_RemoveWithMissing | whether to remove instances with missing values.
-
Fields inherited from class adams.data.cleaner.instance.AbstractSerializableCleaner
m_InitData, m_OverrideSerializationFile, m_SerializableObjectHelper, m_SerializationFile
-
Fields inherited from class adams.data.cleaner.instance.AbstractCleaner
m_ActualPreFilter, m_CleanInstancesError, m_FlowContext, m_PreFilter
-
-
Constructor Summary
Constructors
Constructor | Description
IQRCleaner() |
-
Method Summary
Modifier and Type | Method | Description
String | attributeRangeTipText() | Returns the tip text for this property.
void | defineOptions() | Adds options to the internal list of options.
String | filterTipText() | Returns the tip text for this property.
adams.core.Range | getAttributeRange() | Returns the attribute range to work on.
adams.data.spreadsheet.SpreadSheet | getDetails() | Returns details on the IQR filter.
weka.filters.Filter | getFilter() | Returns the IQR filter.
double | getIqr() | Returns the IQR multiplier.
boolean | getRemoveWithMissing() | Returns whether to remove instances with missing values.
String | globalInfo() | Returns a string describing the object.
protected void | initialize() | Initializes the members.
void | initSerializationSetup() | Regenerates all the objects that are necessary for serialization.
String | iqrTipText() | Returns the tip text for this property.
protected String | performCheck(weka.core.Instance data) | Performs the actual check.
protected weka.core.Instances | performClean(weka.core.Instances instances) | Clean Instances.
String | removeWithMissingTipText() | Returns the tip text for this property.
protected void | reset() | Resets the scheme.
Object[] | retrieveSerializationSetup() | Returns the member variables to serialize to a file.
void | setAttributeRange(adams.core.Range value) | Sets the attribute range to work on.
void | setFilter(weka.filters.Filter value) | Sets the IQR filter.
void | setIqr(double value) | Sets the IQR multiplier.
void | setRemoveWithMissing(boolean value) | Sets whether to remove instances with missing values.
void | setSerializationSetup(Object[] value) | Updates the member variables with the provided objects obtained from deserialization.
-
Methods inherited from class adams.data.cleaner.instance.AbstractSerializableCleaner
destroy, getOverrideSerializedFile, getSerializationFile, isSetupLoadedOrGenerated, overrideSerializedFileTipText, preCheck, preCheck, serializationFileTipText, setOverrideSerializedFile, setSerializationFile, setSetupLoadedOrGenerated
-
Methods inherited from class adams.data.cleaner.instance.AbstractCleaner
check, clean, compareTo, equals, forCommandLine, forName, getCleaners, getCleanInstancesError, getFlowContext, getPreFilter, hasCleanInstancesError, preFilter, preFilter, preFilterTipText, setFlowContext, setPreFilter, shallowCopy, shallowCopy
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Field Detail
-
m_Filter
protected weka.filters.unsupervised.attribute.InterquartileRange m_Filter
the IQR filter.
-
m_ActualFilter
protected weka.filters.Filter m_ActualFilter
the actual IQR filter.
-
m_IQR
protected double m_IQR
the IQR multiplier.
-
m_Range
protected adams.core.Range m_Range
the attribute range to work on.
-
m_RemoveWithMissing
protected boolean m_RemoveWithMissing
whether to remove instances with missing values.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
globalInfo in interface adams.core.GlobalInfoSupporter
- Specified by:
globalInfo in class adams.core.option.AbstractOptionHandler
- Returns:
- a description suitable for displaying in the GUI
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by:
defineOptions in interface adams.core.option.OptionHandler
- Overrides:
defineOptions in class AbstractSerializableCleaner
-
initialize
protected void initialize()
Initializes the members.
- Overrides:
initialize in class AbstractSerializableCleaner
-
reset
protected void reset()
Resets the scheme.
- Overrides:
reset in class AbstractSerializableCleaner
-
setFilter
public void setFilter(weka.filters.Filter value)
Sets the IQR filter.
- Parameters:
value - the filter
-
getFilter
public weka.filters.Filter getFilter()
Returns the IQR filter.
- Returns:
- the filter
-
filterTipText
public String filterTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setIqr
public void setIqr(double value)
Sets the IQR multiplier.
- Parameters:
value - the IQR multiplier
-
getIqr
public double getIqr()
Returns the IQR multiplier.
- Returns:
- the IQR multiplier
-
iqrTipText
public String iqrTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setAttributeRange
public void setAttributeRange(adams.core.Range value)
Sets the attribute range to work on.
- Parameters:
value - the range
-
getAttributeRange
public adams.core.Range getAttributeRange()
Returns the attribute range to work on.
- Returns:
- the range
-
attributeRangeTipText
public String attributeRangeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
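The range syntax accepted by the attributeRange property (comma-separated 1-based indices, 'start-end' sub-ranges, 'inv(...)' and named placeholders) can be illustrated with a simplified, library-free parser. This is only a sketch: RangeSketch is a hypothetical class, it handles 'inv(...)' only when it wraps the whole expression, and it assumes that last_1 and last_2 mean "last minus 1" and "last minus 2". The real adams.core.Range may behave differently in these details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical, simplified parser for the documented range syntax; not adams.core.Range. */
class RangeSketch {

    /** Resolves a number or placeholder to a 1-based index, given the number of attributes. */
    static int index(String token, int max) {
        String t = token.trim();
        switch (t) {
            case "first":  return 1;
            case "second": return 2;
            case "third":  return 3;
            case "last_2": return max - 2; // assumption: "last minus 2"
            case "last_1": return max - 1; // assumption: "last minus 1"
            case "last":   return max;
            default:       return Integer.parseInt(t);
        }
    }

    /** Returns the selected 1-based indices in ascending order. */
    static List<Integer> parse(String range, int max) {
        String s = range.trim();
        // Simplification: inversion is only recognized around the entire expression.
        boolean invert = s.startsWith("inv(") && s.endsWith(")");
        if (invert) {
            s = s.substring(4, s.length() - 1);
        }
        TreeSet<Integer> selected = new TreeSet<>();
        for (String part : s.split(",")) {
            String[] ends = part.split("-");
            int from = index(ends[0], max);
            int to = index(ends[ends.length - 1], max); // single index: from == to
            for (int i = from; i <= to; i++) {
                if (i >= 1 && i <= max) {
                    selected.add(i);
                }
            }
        }
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= max; i++) {
            if (selected.contains(i) != invert) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("first-last", 5));
        System.out.println(parse("inv(2-3)", 5));
    }
}
```

With five attributes, the default 'first-last' selects every attribute, while 'inv(2-3)' selects everything except the second and third.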
-
setRemoveWithMissing
public void setRemoveWithMissing(boolean value)
Sets whether to remove instances with missing values.
- Parameters:
value - true if instances with missing values are to be removed
-
getRemoveWithMissing
public boolean getRemoveWithMissing()
Returns whether to remove instances with missing values.
- Returns:
- true if instances with missing values are to be removed
-
removeWithMissingTipText
public String removeWithMissingTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
initSerializationSetup
public void initSerializationSetup()
Regenerates all the objects that are necessary for serialization.
- Specified by:
initSerializationSetup in interface adams.core.SerializableObject
-
retrieveSerializationSetup
public Object[] retrieveSerializationSetup()
Returns the member variables to serialize to a file.
- Specified by:
retrieveSerializationSetup in interface adams.core.SerializableObject
- Returns:
- the objects to serialize
-
setSerializationSetup
public void setSerializationSetup(Object[] value)
Updates the member variables with the provided objects obtained from deserialization.
- Specified by:
setSerializationSetup in interface adams.core.SerializableObject
- Parameters:
value - the deserialized objects
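The pairing of retrieveSerializationSetup() and setSerializationSetup(Object[]) follows a common pattern: the state worth persisting is packed into an Object[] and later restored from it in the same order. The sketch below shows that pattern with standard Java serialization only. SetupSketch, its two fields and the roundTrip helper are hypothetical; which members IQRCleaner actually serializes, and how ADAMS writes them to the serialization file, is not stated in this documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Hypothetical class showing the retrieve/set setup pattern; not the ADAMS implementation. */
class SetupSketch {
    double multiplier = 4.25;
    String attributeRange = "first-last";

    /** Packs the members to persist into an array (the order matters for restoring). */
    Object[] retrieveSetup() {
        return new Object[] {multiplier, attributeRange};
    }

    /** Restores the members from an array produced by retrieveSetup(). */
    void restoreSetup(Object[] value) {
        multiplier = (Double) value[0];
        attributeRange = (String) value[1];
    }

    /** Serializes and deserializes the array in memory, standing in for a file on disk. */
    static Object[] roundTrip(Object[] setup) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(setup);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Object[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SetupSketch original = new SetupSketch();
        original.multiplier = 3.0;
        SetupSketch restored = new SetupSketch();
        restored.restoreSetup(roundTrip(original.retrieveSetup()));
        System.out.println(restored.multiplier + " " + restored.attributeRange);
    }
}
```

Because the array carries no field names, the retrieving and restoring sides must agree on element order and types; changing one without the other breaks previously serialized setups.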
-
performCheck
protected String performCheck(weka.core.Instance data)
Performs the actual check.
- Specified by:
performCheck in class AbstractCleaner
- Parameters:
data - the Instance to check
- Returns:
- null if no outlier/extreme value detected
-
performClean
protected weka.core.Instances performClean(weka.core.Instances instances)
Cleans the Instances.
- Specified by:
performClean in class AbstractCleaner
- Parameters:
instances - the Instances to clean
- Returns:
- the cleaned Instances
-
getDetails
public adams.data.spreadsheet.SpreadSheet getDetails()
Returns details on the IQR filter.
- Specified by:
getDetails in interface CleanerDetails<adams.data.spreadsheet.SpreadSheet>
- Returns:
- the details as a spreadsheet, null if no filter is available yet
-
-