Class RemoveDuplicateIDs

  • All Implemented Interfaces:
    Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.filters.UnsupervisedFilter

    public class RemoveDuplicateIDs
    extends weka.filters.SimpleBatchFilter
    implements weka.filters.UnsupervisedFilter
    Removes rows whose ID occurs multiple times.
    Rows with a missing ID are skipped as well.
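
    The behaviour described above can be sketched in plain Java, without the Weka classes. The description does not say whether the first row carrying a repeated ID is retained, so this sketch assumes it is. The class name DuplicateIdSketch, the method removeDuplicateIds and the use of null for a missing value are all hypothetical and are not part of this filter's API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filter's row handling; not the actual implementation.
final class DuplicateIdSketch {

  /**
   * Returns the rows whose ID (column idCol, zero-based) is present and has
   * not been seen before. A null cell models a missing value.
   * Assumption: the first row with a given ID is kept, later ones are dropped.
   */
  static List<String[]> removeDuplicateIds(List<String[]> rows, int idCol) {
    Set<String> seen = new HashSet<>();
    List<String[]> result = new ArrayList<>();
    for (String[] row : rows) {
      String id = row[idCol];
      if (id == null) {
        continue; // missing ID: the row is skipped
      }
      if (seen.add(id)) {
        result.add(row); // add() returns true only the first time an ID appears
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String[]> rows = new ArrayList<>();
    rows.add(new String[] {"a", "1.0"});
    rows.add(new String[] {"b", "2.0"});
    rows.add(new String[] {"a", "3.0"}); // repeated ID
    rows.add(new String[] {null, "4.0"}); // missing ID
    for (String[] row : removeDuplicateIds(rows, 0)) {
      System.out.println(row[0] + "," + row[1]);
    }
  }
}
```

    If the filter instead drops every row whose ID is repeated, the loop would need a first pass that counts IDs before any row is copied.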

    Valid options are:

     -id <1-based index or name>
      The index/name of the ID attribute to use for identifying duplicates.
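
    As an illustration of what "1-based index or name" could mean, the sketch below resolves such a value to a zero-based column position. It is a simplified model only: IdSpecSketch and resolve are hypothetical names, and the real WekaAttributeIndex class may accept more forms than the two handled here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical resolver for an "-id" value; not the WekaAttributeIndex implementation.
final class IdSpecSketch {

  /**
   * Returns the zero-based position of the attribute named by spec,
   * or -1 if it cannot be resolved.
   * Assumption: a purely numeric spec is always treated as a 1-based index,
   * even if an attribute with that numeric name exists.
   */
  static int resolve(String spec, List<String> attributeNames) {
    String s = spec.trim();
    try {
      int oneBased = Integer.parseInt(s);
      if (oneBased >= 1 && oneBased <= attributeNames.size()) {
        return oneBased - 1; // convert 1-based to 0-based
      }
      return -1; // index out of range
    } catch (NumberFormatException e) {
      return attributeNames.indexOf(s); // -1 if no attribute has this name
    }
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("id", "x", "y");
    System.out.println(resolve("1", names));
    System.out.println(resolve("y", names));
  }
}
```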
     
    Version:
    $Revision$
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected WekaAttributeIndex m_ID
      the attribute with the IDs.
      • Fields inherited from class weka.filters.Filter

        m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
    • Method Summary

      Modifier and Type Method Description
      protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
      Determines the output format based on the input format and returns it.
      weka.core.Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      WekaAttributeIndex getID()
      Returns the name/index of the attribute containing the IDs.
      String[] getOptions()
      Gets the current settings of the filter.
      String getRevision()
      Returns the revision string.
      String globalInfo()
      Returns a string describing this filter.
      String IDTipText()
      Returns the tip text for this property.
      Enumeration listOptions()
      Returns an enumeration describing the available options.
      static void main(String[] args)
      Main method for running this filter.
      protected weka.core.Instances process(weka.core.Instances instances)
      Processes the given data (may change the provided dataset) and returns the modified version.
      void setID(WekaAttributeIndex value)
      Sets the name/index of the attribute containing the IDs.
      void setOptions(String[] options)
      Parses a list of options for this object.
      • Methods inherited from class weka.filters.SimpleBatchFilter

        allowAccessToFullInputFormat, batchFinished, hasImmediateOutputFormat, input, input
      • Methods inherited from class weka.filters.SimpleFilter

        reset, setInputFormat
      • Methods inherited from class weka.filters.Filter

        batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
    • Constructor Detail

      • RemoveDuplicateIDs

        public RemoveDuplicateIDs()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this filter.
        Specified by:
        globalInfo in class weka.filters.SimpleFilter
        Returns:
        a description of the filter suitable for displaying in the explorer/experimenter GUI
      • listOptions

        public Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.filters.Filter
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(String[] options)
                        throws Exception
        Parses a list of options for this object.
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.filters.Filter
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
      • getOptions

        public String[] getOptions()
        Gets the current settings of the filter.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.filters.Filter
        Returns:
        an array of strings suitable for passing to setOptions
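
        The contract between getOptions and setOptions is a round trip: the array returned by one can be fed to the other to restore the same settings. The sketch below models that contract for the single -id option in plain Java. IdOptionSketch, its default value "1" and its lenient handling of unknown flags are assumptions for illustration and do not describe this class's actual parsing.

```java
import java.util.Arrays;

// Hypothetical holder for the "-id" option; not the filter's own option handling.
final class IdOptionSketch {
  private String id = "1"; // assumed default, not taken from the documentation

  /** Reads the value following "-id", if present; other entries are ignored here. */
  void setOptions(String[] options) {
    for (int i = 0; i < options.length - 1; i++) {
      if (options[i].equals("-id")) {
        id = options[i + 1];
        return;
      }
    }
  }

  /** Returns the current settings in a form that setOptions accepts. */
  String[] getOptions() {
    return new String[] {"-id", id};
  }

  public static void main(String[] args) {
    IdOptionSketch first = new IdOptionSketch();
    first.setOptions(new String[] {"-id", "sample_id"});

    // Round trip: configure a second object from the first one's options.
    IdOptionSketch second = new IdOptionSketch();
    second.setOptions(first.getOptions());
    System.out.println(Arrays.toString(second.getOptions()));
  }
}
```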
      • setID

        public void setID(WekaAttributeIndex value)
        Sets the name/index of the attribute containing the IDs.
        Parameters:
        value - the attribute name/index
      • getID

        public WekaAttributeIndex getID()
        Returns the name/index of the attribute containing the IDs.
        Returns:
        the attribute name/index
      • IDTipText

        public String IDTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter GUI
      • getCapabilities

        public weka.core.Capabilities getCapabilities()
        Returns the Capabilities of this filter.
        Specified by:
        getCapabilities in interface weka.core.CapabilitiesHandler
        Overrides:
        getCapabilities in class weka.filters.Filter
        Returns:
        the capabilities of this object
        See Also:
        Capabilities
      • determineOutputFormat

        protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
                                                     throws Exception
        Determines the output format based on the input format and returns it.
        Specified by:
        determineOutputFormat in class weka.filters.SimpleFilter
        Parameters:
        inputFormat - the input format to base the output format on
        Returns:
        the output format
        Throws:
        Exception - in case the determination goes wrong
      • process

        protected weka.core.Instances process(weka.core.Instances instances)
                                       throws Exception
        Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
        Specified by:
        process in class weka.filters.SimpleFilter
        Parameters:
        instances - the data to process
        Returns:
        the modified data
        Throws:
        Exception - in case the processing goes wrong
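
        Because process is called in batchFinished(), rows are first collected and only then handled as one batch. The sketch below is a rough plain-Java model of that ordering for a single batch. BatchLifecycleSketch and its members are hypothetical; the method names merely mirror the ones mentioned above and the code is not Weka's SimpleBatchFilter.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical model of the buffer-then-process ordering; not Weka code.
final class BatchLifecycleSketch<T> {
  private final List<T> buffer = new ArrayList<>();
  private final UnaryOperator<List<T>> process;
  private int processCalls = 0;

  BatchLifecycleSketch(UnaryOperator<List<T>> process) {
    this.process = process;
  }

  /** Buffers a row; nothing is processed at this point. */
  void input(T row) {
    buffer.add(row);
  }

  /** Runs process once over everything buffered so far and returns its result. */
  List<T> batchFinished() {
    processCalls++;
    List<T> result = process.apply(new ArrayList<>(buffer));
    buffer.clear();
    return result;
  }

  /** Number of times process has been invoked. */
  int processCalls() {
    return processCalls;
  }

  public static void main(String[] args) {
    // The process step here simply drops repeated values, keeping the first.
    BatchLifecycleSketch<String> filter =
        new BatchLifecycleSketch<>(rows -> new ArrayList<>(new LinkedHashSet<>(rows)));
    filter.input("a");
    filter.input("b");
    filter.input("a");
    System.out.println(filter.batchFinished());
  }
}
```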
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Overrides:
        getRevision in class weka.filters.Filter
        Returns:
        the revision
      • main

        public static void main(String[] args)
        Main method for running this filter.
        Parameters:
        args - should contain arguments to the filter: use -h for help