Package adams.data.cleaner.instance
Class IQRCleaner
- java.lang.Object
  - adams.core.logging.LoggingObject
    - adams.core.logging.CustomLoggingLevelObject
      - adams.core.option.AbstractOptionHandler
        - adams.data.cleaner.instance.AbstractCleaner
          - adams.data.cleaner.instance.AbstractSerializableCleaner
            - adams.data.cleaner.instance.IQRCleaner
- All Implemented Interfaces:
adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.SerializableObject, adams.core.ShallowCopySupporter<AbstractCleaner>, adams.core.SizeOfHandler, CleanerDetails<adams.data.spreadsheet.SpreadSheet>, adams.flow.core.FlowContextHandler, Serializable, Comparable
public class IQRCleaner extends AbstractSerializableCleaner implements CleanerDetails<adams.data.spreadsheet.SpreadSheet>
Removes instances outside the given IQR multiplier.
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
    The logging level for outputting errors and debugging output.
    default: WARNING
-pre-filter <weka.filters.Filter> (property: preFilter)
    The filter to use for pre-filtering the data.
    default: weka.filters.AllFilter
-serialization-file <adams.core.io.PlaceholderFile> (property: serializationFile)
    The file to serialize the generated internal model to.
    default: ${CWD}
-override-serialized-file <boolean> (property: overrideSerializedFile)
    If set to true, then any serialized file will be ignored and the setup for serialization will be regenerated.
    default: false
-filter <weka.filters.Filter> (property: filter)
    The IQR filter to use; parameters get set internally.
    default: weka.filters.unsupervised.attribute.InterquartileRange -R first-last -O 3.0 -E 6.0
-iqr <double> (property: iqr)
    IQR multiplier for min/max values.
    default: 4.25
    minimum: 0.0
-attribute-range <adams.core.Range> (property: attributeRange)
    The attribute range to work on.
    default: first-last
    example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-remove-with-missing <boolean> (property: removeWithMissing)
    If enabled, instances with missing values get removed.
    default: true
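The -iqr option describes a multiplier applied to the interquartile range to obtain min/max values. The following stdlib-only sketch illustrates the common form of that rule (lower = Q1 - k*IQR, upper = Q3 + k*IQR). It is not the ADAMS or Weka code: the class name IqrFenceSketch, the method names, and the linear-interpolation quantile are all assumptions made for illustration, and the real filter may compute quartiles differently.

```java
import java.util.Arrays;

/** Hypothetical, stdlib-only sketch of IQR fences; not the ADAMS or Weka implementation. */
final class IqrFenceSketch {

  /** Quantile by linear interpolation on sorted data (an assumed convention). */
  static double quantile(double[] sorted, double p) {
    double pos = p * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = (int) Math.ceil(pos);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  /** Returns {lower, upper} = {Q1 - k*IQR, Q3 + k*IQR} for multiplier k. */
  static double[] fences(double[] values, double multiplier) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    double q1 = quantile(sorted, 0.25);
    double q3 = quantile(sorted, 0.75);
    double iqr = q3 - q1;
    return new double[] {q1 - multiplier * iqr, q3 + multiplier * iqr};
  }

  public static void main(String[] args) {
    // 4.25 is the documented default of the -iqr option.
    double[] f = fences(new double[] {1, 2, 3, 4, 5}, 4.25);
    System.out.println("lower=" + f[0] + ", upper=" + f[1]);
  }
}
```

A value outside the returned interval would be a candidate for removal under this rule; a larger multiplier widens the interval and removes fewer instances.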
- Author:
- dale (dale at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- protected weka.filters.Filter m_ActualFilter: the actual IQR filter.
- protected weka.filters.unsupervised.attribute.InterquartileRange m_Filter: the IQR filter.
- protected double m_IQR: the IQR multiplier.
- protected adams.core.Range m_Range: the attribute range to work on.
- protected boolean m_RemoveWithMissing: whether to remove instances with missing values.
Fields inherited from class adams.data.cleaner.instance.AbstractSerializableCleaner
m_InitData, m_OverrideSerializationFile, m_SerializableObjectHelper, m_SerializationFile
-
Fields inherited from class adams.data.cleaner.instance.AbstractCleaner
m_ActualPreFilter, m_CleanInstancesError, m_FlowContext, m_PreFilter
-
-
Constructor Summary
Constructors (Constructor: Description)
- IQRCleaner()
-
Method Summary
Methods (Modifier and Type, Method: Description)
- String attributeRangeTipText(): Returns the tip text for this property.
- void defineOptions(): Adds options to the internal list of options.
- String filterTipText(): Returns the tip text for this property.
- adams.core.Range getAttributeRange(): Returns the attribute range to work on.
- adams.data.spreadsheet.SpreadSheet getDetails(): Returns details on the IQR filter.
- weka.filters.Filter getFilter(): Returns the IQR filter.
- double getIqr(): Returns the IQR multiplier.
- boolean getRemoveWithMissing(): Returns whether to remove instances with missing values.
- String globalInfo(): Returns a string describing the object.
- protected void initialize(): Initializes the members.
- void initSerializationSetup(): Regenerates all the objects that are necessary for serialization.
- String iqrTipText(): Returns the tip text for this property.
- protected String performCheck(weka.core.Instance data): Performs the actual check.
- protected weka.core.Instances performClean(weka.core.Instances instances): Cleans the Instances.
- String removeWithMissingTipText(): Returns the tip text for this property.
- protected void reset(): Resets the scheme.
- Object[] retrieveSerializationSetup(): Returns the member variables to serialize to a file.
- void setAttributeRange(adams.core.Range value): Sets the attribute range to work on.
- void setFilter(weka.filters.Filter value): Sets the IQR filter.
- void setIqr(double value): Sets the IQR multiplier.
- void setRemoveWithMissing(boolean value): Sets whether to remove instances with missing values.
- void setSerializationSetup(Object[] value): Updates the member variables with the provided objects obtained from deserialization.
Methods inherited from class adams.data.cleaner.instance.AbstractSerializableCleaner
destroy, getOverrideSerializedFile, getSerializationFile, isSetupLoadedOrGenerated, overrideSerializedFileTipText, preCheck, preCheck, serializationFileTipText, setOverrideSerializedFile, setSerializationFile, setSetupLoadedOrGenerated
-
Methods inherited from class adams.data.cleaner.instance.AbstractCleaner
check, clean, compareTo, equals, forCommandLine, forName, getCleaners, getCleanInstancesError, getFlowContext, getPreFilter, hasCleanInstancesError, preFilter, preFilter, preFilterTipText, setFlowContext, setPreFilter, shallowCopy, shallowCopy
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
-
-
-
Field Detail
-
m_Filter
protected weka.filters.unsupervised.attribute.InterquartileRange m_Filter
the IQR filter.
-
m_ActualFilter
protected weka.filters.Filter m_ActualFilter
the actual IQR filter.
-
m_IQR
protected double m_IQR
the IQR multiplier.
-
m_Range
protected adams.core.Range m_Range
the attribute range to work on.
-
m_RemoveWithMissing
protected boolean m_RemoveWithMissing
whether to remove instances with missing values.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
- globalInfo in interface adams.core.GlobalInfoSupporter
- Specified by:
- globalInfo in class adams.core.option.AbstractOptionHandler
- Returns:
- a description suitable for displaying in the GUI
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by:
- defineOptions in interface adams.core.option.OptionHandler
- Overrides:
- defineOptions in class AbstractSerializableCleaner
-
initialize
protected void initialize()
Initializes the members.
- Overrides:
- initialize in class AbstractSerializableCleaner
-
reset
protected void reset()
Resets the scheme.
- Overrides:
- reset in class AbstractSerializableCleaner
-
setFilter
public void setFilter(weka.filters.Filter value)
Sets the IQR filter.
- Parameters:
- value - the filter
-
getFilter
public weka.filters.Filter getFilter()
Returns the IQR filter.
- Returns:
- the filter
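The default filter setup (-O 3.0 -E 6.0) carries two factors, an outlier factor and an extreme value factor. The sketch below shows how such a pair of factors is commonly used to label a value, following the definition given in the Weka InterquartileRange documentation as far as it is understood here. The class OutlierClassSketch, its enum, and the quartile inputs are hypothetical, and since the cleaner sets the filter parameters internally, the factors it actually applies are not stated on this page.

```java
/** Hypothetical sketch of outlier vs. extreme-value labelling; not the Weka source. */
final class OutlierClassSketch {

  enum Label { NORMAL, OUTLIER, EXTREME }

  /**
   * Labels x given the quartiles q1/q3, an outlier factor (of)
   * and an extreme value factor (evf), with of <= evf assumed.
   */
  static Label classify(double x, double q1, double q3, double of, double evf) {
    double iqr = q3 - q1;
    // Beyond the extreme-value fences on either side.
    if (x > q3 + evf * iqr || x < q1 - evf * iqr) {
      return Label.EXTREME;
    }
    // Beyond the outlier fences, but still within the extreme-value fences.
    if (x > q3 + of * iqr || x < q1 - of * iqr) {
      return Label.OUTLIER;
    }
    return Label.NORMAL;
  }

  public static void main(String[] args) {
    // Quartiles 2 and 4 are made-up inputs; 3.0 and 6.0 mirror the -O and -E defaults.
    System.out.println(classify(11.0, 2.0, 4.0, 3.0, 6.0));
  }
}
```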
-
filterTipText
public String filterTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setIqr
public void setIqr(double value)
Sets the IQR multiplier.
- Parameters:
- value - the IQR multiplier
-
getIqr
public double getIqr()
Returns the IQR multiplier.
- Returns:
- the IQR multiplier
-
iqrTipText
public String iqrTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setAttributeRange
public void setAttributeRange(adams.core.Range value)
Sets the attribute range to work on.
- Parameters:
- value - the range
-
getAttributeRange
public adams.core.Range getAttributeRange()
Returns the attribute range to work on.
- Returns:
- the range
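The range syntax (comma-separated 1-based indices and 'start-end' sub-ranges) can be illustrated with a small stdlib-only parser. This is a simplified sketch, not adams.core.Range: the class RangeSketch and its methods are hypothetical, only the placeholders first and last are handled, and inv(...), second, third, last_2 and last_1 are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified range parser; not the adams.core.Range implementation. */
final class RangeSketch {

  /** Resolves a single token ("first", "last" or a 1-based number) to a 1-based index. */
  static int resolve(String token, int count) {
    String t = token.trim();
    if (t.equals("first")) {
      return 1;
    }
    if (t.equals("last")) {
      return count;
    }
    return Integer.parseInt(t);
  }

  /** Expands a range string into 0-based indices for a dataset with count attributes. */
  static List<Integer> parse(String range, int count) {
    List<Integer> result = new ArrayList<>();
    for (String part : range.split(",")) {
      String[] ends = part.split("-");
      int start = resolve(ends[0], count);
      int end = ends.length > 1 ? resolve(ends[1], count) : start;
      for (int i = start; i <= end; i++) {
        // Silently skip indices outside 1..count (a choice made for this sketch).
        if (i >= 1 && i <= count) {
          result.add(i - 1);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse("first,3-5,last", 6));
  }
}
```

The conversion to 0-based indices reflects how Java collections are addressed; the range string itself stays 1-based, as the option description specifies.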
-
attributeRangeTipText
public String attributeRangeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setRemoveWithMissing
public void setRemoveWithMissing(boolean value)
Sets whether to remove instances with missing values.
- Parameters:
- value - true if to remove
-
getRemoveWithMissing
public boolean getRemoveWithMissing()
Returns whether to remove instances with missing values.
- Returns:
- true if to remove
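What removing instances with missing values amounts to can be sketched on plain arrays. The example assumes missing values are encoded as NaN, which is the convention Weka uses for its instance values. The class MissingRowSketch is hypothetical, and whether IQRCleaner inspects all attributes or only those in the attribute range is not stated on this page; the sketch checks every value in a row.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of dropping rows with missing (NaN) values; not the ADAMS code. */
final class MissingRowSketch {

  /** Returns true if any value in the row is NaN, i.e. treated as missing here. */
  static boolean hasMissing(double[] row) {
    for (double v : row) {
      if (Double.isNaN(v)) {
        return true;
      }
    }
    return false;
  }

  /** Returns a new list holding only the rows without missing values. */
  static List<double[]> removeWithMissing(List<double[]> rows) {
    List<double[]> kept = new ArrayList<>();
    for (double[] row : rows) {
      if (!hasMissing(row)) {
        kept.add(row);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<double[]> rows = new ArrayList<>();
    rows.add(new double[] {1.0, 2.0});
    rows.add(new double[] {Double.NaN, 3.0});
    System.out.println(removeWithMissing(rows).size());
  }
}
```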
-
removeWithMissingTipText
public String removeWithMissingTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
initSerializationSetup
public void initSerializationSetup()
Regenerates all the objects that are necessary for serialization.
- Specified by:
- initSerializationSetup in interface adams.core.SerializableObject
-
retrieveSerializationSetup
public Object[] retrieveSerializationSetup()
Returns the member variables to serialize to a file.
- Specified by:
- retrieveSerializationSetup in interface adams.core.SerializableObject
- Returns:
- the objects to serialize
-
setSerializationSetup
public void setSerializationSetup(Object[] value)
Updates the member variables with the provided objects obtained from deserialization.
- Specified by:
- setSerializationSetup in interface adams.core.SerializableObject
- Parameters:
- value - the deserialized objects
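The retrieve/set pair exchanges an Object[] that is written to and read back from a file. The round trip can be sketched with standard Java serialization. This is only an illustration of the pattern: the class SetupRoundTripSketch is hypothetical, it serializes to a byte array instead of the serialization file, and the array contents are placeholders, since the members IQRCleaner actually stores are not listed here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Hypothetical sketch of an Object[] serialization round trip; not the ADAMS helper. */
final class SetupRoundTripSketch {

  /** Serializes the setup array, then deserializes it again. */
  static Object[] roundTrip(Object[] setup) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(setup); // counterpart of retrieveSerializationSetup()
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Object[]) in.readObject(); // would be passed to setSerializationSetup()
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round trip failed", e);
    }
  }

  public static void main(String[] args) {
    // Placeholder contents; every element must implement java.io.Serializable.
    Object[] restored = roundTrip(new Object[] {"filter-state", 4.25});
    System.out.println(restored.length);
  }
}
```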
-
performCheck
protected String performCheck(weka.core.Instance data)
Performs the actual check.
- Specified by:
- performCheck in class AbstractCleaner
- Parameters:
- data - the Instance to check
- Returns:
- null if no outlier/extreme value detected
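The null-means-no-problem return convention of performCheck can be sketched as follows. The class CheckSketch, the per-attribute bound arrays, and the message text are hypothetical; the sketch only shows the shape of a check that yields null when nothing is flagged and a descriptive string otherwise.

```java
/** Hypothetical sketch of a null-or-message check; not the IQRCleaner implementation. */
final class CheckSketch {

  /**
   * Returns null if every value lies within its [lower, upper] bounds,
   * otherwise a message naming the first offending attribute (1-based).
   */
  static String check(double[] row, double[] lower, double[] upper) {
    for (int i = 0; i < row.length; i++) {
      if (row[i] < lower[i] || row[i] > upper[i]) {
        return "Attribute #" + (i + 1) + " out of bounds: " + row[i];
      }
    }
    return null;
  }

  public static void main(String[] args) {
    double[] lower = {0.0, 0.0};
    double[] upper = {10.0, 10.0};
    String msg = check(new double[] {1.0, 50.0}, lower, upper);
    System.out.println(msg == null ? "ok" : msg);
  }
}
```

Callers following this convention test the result against null before reporting anything.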
-
performClean
protected weka.core.Instances performClean(weka.core.Instances instances)
Cleans the Instances.
- Specified by:
- performClean in class AbstractCleaner
- Parameters:
- instances - the Instances to clean
- Returns:
- the cleaned Instances
-
getDetails
public adams.data.spreadsheet.SpreadSheet getDetails()
Returns details on the IQR filter.
- Specified by:
- getDetails in interface CleanerDetails<adams.data.spreadsheet.SpreadSheet>
- Returns:
- the details as spreadsheet, null if no filter available yet
-
-