Class IQRCleaner

  • All Implemented Interfaces:
    adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.SerializableObject, adams.core.ShallowCopySupporter<AbstractCleaner>, adams.core.SizeOfHandler, CleanerDetails<adams.data.spreadsheet.SpreadSheet>, adams.flow.core.FlowContextHandler, Serializable, Comparable

    public class IQRCleaner
    extends AbstractSerializableCleaner
    implements CleanerDetails<adams.data.spreadsheet.SpreadSheet>
    Removes instances whose attribute values fall outside the range defined by the given IQR multiplier.

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -pre-filter <weka.filters.Filter> (property: preFilter)
        The filter to use for pre-filtering the data.
        default: weka.filters.AllFilter
     
    -serialization-file <adams.core.io.PlaceholderFile> (property: serializationFile)
        The file to serialize the generated internal model to.
        default: ${CWD}
     
    -override-serialized-file <boolean> (property: overrideSerializedFile)
        If set to true, then any serialized file will be ignored and the setup for 
        serialization will be regenerated.
        default: false
     
    -filter <weka.filters.Filter> (property: filter)
        The IQR filter to use; parameters get set internally.
        default: weka.filters.unsupervised.attribute.InterquartileRange -R first-last -O 3.0 -E 6.0
     
    -iqr <double> (property: iqr)
        IQR multiplier for min/max values.
        default: 4.25
        minimum: 0.0
     
    -attribute-range <adams.core.Range> (property: attributeRange)
        The attribute range to work on.
        default: first-last
        example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
     
    -remove-with-missing <boolean> (property: removeWithMissing)
        If enabled, instances with missing values get removed.
        default: true
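
To make the effect of the -iqr multiplier concrete, the following is a minimal sketch that uses only the Java standard library; it is not the ADAMS or Weka implementation. It assumes the common fence rule, keeping values within [Q1 - m * IQR, Q3 + m * IQR], and a linearly interpolated quartile. Both are assumptions, since the wrapped weka.filters.unsupervised.attribute.InterquartileRange filter computes its own statistics. The class IqrFenceSketch and its methods are hypothetical names, and NaN stands in for a missing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of ADAMS or Weka. */
final class IqrFenceSketch {

  /** Linearly interpolated quantile of an ascending-sorted array (assumed quartile method). */
  static double quantile(double[] sorted, double p) {
    double pos = p * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = (int) Math.ceil(pos);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  /** Returns {lower, upper} = {Q1 - m * IQR, Q3 + m * IQR}; needs at least one value. */
  static double[] fences(double[] values, double multiplier) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    double q1 = quantile(sorted, 0.25);
    double q3 = quantile(sorted, 0.75);
    double iqr = q3 - q1;
    return new double[] {q1 - multiplier * iqr, q3 + multiplier * iqr};
  }

  /** Keeps values inside the fences; NaN (missing) is dropped only if removeWithMissing is set. */
  static List<Double> clean(double[] values, double multiplier, boolean removeWithMissing) {
    // Fences are computed from the non-missing values only.
    double[] present = Arrays.stream(values).filter(v -> !Double.isNaN(v)).toArray();
    double[] f = fences(present, multiplier);
    List<Double> kept = new ArrayList<>();
    for (double v : values) {
      if (Double.isNaN(v)) {
        if (!removeWithMissing) {
          kept.add(v);
        }
        continue;
      }
      if (v >= f[0] && v <= f[1]) {
        kept.add(v);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 100, Double.NaN};
    System.out.println(clean(data, 4.25, true));
  }
}
```

With the multiplier 4.25 (the documented default), the sample data in main yields fences of -14 and 24, so 100 is dropped while 1 through 8 are kept.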
     
    Author:
    dale (dale at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_Filter

        protected weka.filters.unsupervised.attribute.InterquartileRange m_Filter
        the IQR filter.
      • m_ActualFilter

        protected weka.filters.Filter m_ActualFilter
        the actual IQR filter.
      • m_IQR

        protected double m_IQR
        the IQR multiplier.
      • m_Range

        protected adams.core.Range m_Range
        the attribute range to work on.
      • m_RemoveWithMissing

        protected boolean m_RemoveWithMissing
        whether to remove instances with missing values.
    • Constructor Detail

      • IQRCleaner

        public IQRCleaner()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing the object.
        Specified by:
        globalInfo in interface adams.core.GlobalInfoSupporter
        Specified by:
        globalInfo in class adams.core.option.AbstractOptionHandler
        Returns:
        a description suitable for displaying in the GUI
      • defineOptions

        public void defineOptions()
        Adds options to the internal list of options.
        Specified by:
        defineOptions in interface adams.core.option.OptionHandler
        Overrides:
        defineOptions in class AbstractSerializableCleaner
      • setFilter

        public void setFilter(weka.filters.Filter value)
        Sets the IQR filter.
        Parameters:
        value - the filter
      • getFilter

        public weka.filters.Filter getFilter()
        Returns the IQR filter.
        Returns:
        the filter
      • filterTipText

        public String filterTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setIqr

        public void setIqr(double value)
        Sets the IQR multiplier.
        Parameters:
        value - the IQR multiplier
      • getIqr

        public double getIqr()
        Returns the IQR multiplier.
        Returns:
        the IQR multiplier
      • iqrTipText

        public String iqrTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setAttributeRange

        public void setAttributeRange(adams.core.Range value)
        Sets the attribute range to work on.
        Parameters:
        value - the range
      • getAttributeRange

        public adams.core.Range getAttributeRange()
        Returns the attribute range to work on.
        Returns:
        the range
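
The range syntax described for the attribute-range option (comma-separated 1-based indices, 'start-end' sub-ranges, and placeholders such as first and last) can be illustrated with a simplified, hypothetical parser. This is a sketch under stated assumptions, not the adams.core.Range implementation: the class RangeSketch is invented for illustration, and it deliberately leaves out 'inv(...)' and the placeholders second, third, last_2 and last_1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; adams.core.Range is the real implementation. */
final class RangeSketch {

  /** Resolves one token: "first", "last", or a 1-based number. */
  static int index(String token, int max) {
    String t = token.trim();
    if (t.equals("first")) {
      return 1;
    }
    if (t.equals("last")) {
      return max;
    }
    return Integer.parseInt(t);
  }

  /** Expands a range string into 1-based indices, given the number of attributes. */
  static List<Integer> expand(String range, int max) {
    List<Integer> result = new ArrayList<>();
    for (String part : range.split(",")) {
      int dash = part.indexOf('-');
      // A part without a dash is a single index, i.e. a sub-range of length one.
      int from = index(dash < 0 ? part : part.substring(0, dash), max);
      int to = index(dash < 0 ? part : part.substring(dash + 1), max);
      for (int i = from; i <= to; i++) {
        if (i >= 1 && i <= max) {
          result.add(i); // indices outside 1..max are ignored in this sketch
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(expand("first-3,5,last", 6));
  }
}
```

For a dataset with six attributes, "first-3,5,last" expands to the indices 1, 2, 3, 5 and 6, and the default "first-last" selects every attribute.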
      • attributeRangeTipText

        public String attributeRangeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setRemoveWithMissing

        public void setRemoveWithMissing(boolean value)
        Sets whether to remove instances with missing values.
        Parameters:
        value - true if such instances are to be removed
      • getRemoveWithMissing

        public boolean getRemoveWithMissing()
        Returns whether to remove instances with missing values.
        Returns:
        true if such instances are to be removed
      • removeWithMissingTipText

        public String removeWithMissingTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • initSerializationSetup

        public void initSerializationSetup()
        Regenerates all the objects that are necessary for serialization.
        Specified by:
        initSerializationSetup in interface adams.core.SerializableObject
      • retrieveSerializationSetup

        public Object[] retrieveSerializationSetup()
        Returns the member variables to serialize to a file.
        Specified by:
        retrieveSerializationSetup in interface adams.core.SerializableObject
        Returns:
        the objects to serialize
      • setSerializationSetup

        public void setSerializationSetup(Object[] value)
        Updates the member variables with the provided objects obtained from deserialization.
        Specified by:
        setSerializationSetup in interface adams.core.SerializableObject
        Parameters:
        value - the deserialized objects
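
The retrieveSerializationSetup and setSerializationSetup methods exchange state as a plain Object[]. The sketch below shows that round trip with standard Java serialization on a hypothetical class, SetupHolderSketch. The fields it stores are assumptions chosen for illustration (this page does not say which members IQRCleaner actually serializes), and an in-memory byte array stands in for the serialization file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Hypothetical illustration only; not an ADAMS class. */
final class SetupHolderSketch {
  // Assumed stand-ins for whatever internal state a cleaner would persist.
  private double multiplier = 4.25;
  private String rangeText = "first-last";

  void configure(double multiplier, String rangeText) {
    this.multiplier = multiplier;
    this.rangeText = rangeText;
  }

  double getMultiplier() {
    return multiplier;
  }

  String getRangeText() {
    return rangeText;
  }

  /** Returns the member variables to serialize. */
  Object[] retrieveSerializationSetup() {
    return new Object[] {multiplier, rangeText};
  }

  /** Restores the member variables from deserialized objects (same order as above). */
  void setSerializationSetup(Object[] value) {
    multiplier = (Double) value[0];
    rangeText = (String) value[1];
  }

  /** Serializes the setup to bytes and reads it back, as a file round trip would. */
  static Object[] roundTrip(Object[] setup) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(setup);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Object[]) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    SetupHolderSketch source = new SetupHolderSketch();
    source.configure(3.0, "first-3");
    SetupHolderSketch target = new SetupHolderSketch();
    target.setSerializationSetup(roundTrip(source.retrieveSerializationSetup()));
    System.out.println(target.getMultiplier() + " " + target.getRangeText());
  }
}
```

The order of the array elements is the only contract between the two methods in this sketch, so both must agree on it.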
      • performCheck

        protected String performCheck(weka.core.Instance data)
        Performs the actual check.
        Specified by:
        performCheck in class AbstractCleaner
        Parameters:
        data - the Instance to check
        Returns:
        null if no outlier/extreme value detected
      • performClean

        protected weka.core.Instances performClean(weka.core.Instances instances)
        Cleans the Instances.
        Specified by:
        performClean in class AbstractCleaner
        Parameters:
        instances - the Instances to clean
        Returns:
        the cleaned Instances
      • getDetails

        public adams.data.spreadsheet.SpreadSheet getDetails()
        Returns details on the IQR filter.
        Specified by:
        getDetails in interface CleanerDetails<adams.data.spreadsheet.SpreadSheet>
        Returns:
        the details as spreadsheet, null if no filter available yet