Class DatasetCleaner

  • All Implemented Interfaces:
    Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler

    public class DatasetCleaner
    extends AbstractColumnFinderApplier
    Removes all columns from the data that have been identified by the column finder.

    Valid options are:

     -D
      Turns on output of debugging information.
     -W <column finder specification>
      Full class name of column finder to use, followed
      by scheme options, e.g.:
       "adams.data.weka.columnfinder.NullFinder -D 1"
      (default: adams.data.weka.columnfinder.NullFinder)
     -invert
      Whether to invert the found column indices.
      (default: off)
    Version:
    $Revision$
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected weka.filters.unsupervised.attribute.Remove m_Remove
      the remove filter to use.
      • Fields inherited from class weka.filters.Filter

        m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
    • Constructor Summary

      Constructors 
      Constructor Description
      DatasetCleaner()  
    • Method Summary

      Modifier and Type Method Description
      protected weka.core.Instances apply(weka.core.Instances data, int[] indices)
      Applies the indices to the data.
      String columnFinderTipText()
      Returns the tip text for this property.
      protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
      Determines the output format based on the input format and returns it.
      String getRevision()
      Returns the revision string.
      String globalInfo()
      Returns a string describing this filter.
      • Methods inherited from class weka.filters.SimpleBatchFilter

        allowAccessToFullInputFormat, batchFinished, hasImmediateOutputFormat, input, input
      • Methods inherited from class weka.filters.SimpleFilter

        reset, setInputFormat
      • Methods inherited from class weka.filters.Filter

        batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, main, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
    • Field Detail

      • m_Remove

        protected weka.filters.unsupervised.attribute.Remove m_Remove
        the remove filter to use.
    • Constructor Detail

      • DatasetCleaner

        public DatasetCleaner()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this filter.
        Specified by:
        globalInfo in class weka.filters.SimpleFilter
        Returns:
        a description of the filter suitable for displaying in the explorer/experimenter gui
      • columnFinderTipText

        public String columnFinderTipText()
        Returns the tip text for this property.
        Specified by:
        columnFinderTipText in class AbstractColumnFinderApplier
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • determineOutputFormat

        protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
                                                     throws Exception
        Determines the output format based on the input format and returns it. If the output format cannot be determined immediately, i.e., hasImmediateOutputFormat() returns false, this method is called from batchFinished() instead.
        Specified by:
        determineOutputFormat in class AbstractColumnFinderApplier
        Parameters:
        inputFormat - the input format to base the output format on
        Returns:
        the output format
        Throws:
        Exception
      • apply

        protected weka.core.Instances apply(weka.core.Instances data,
                                            int[] indices)
        Removes the columns at the given indices from the data. If inverting is enabled, the indices have already been inverted before this method is called.
        Specified by:
        apply in class AbstractColumnFinderApplier
        Parameters:
        data - the data to process
        indices - the indices to use
        Returns:
        the processed data
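
        The behaviour described for apply can be illustrated with a minimal, stand-alone sketch. It assumes plain double[][] rows stand in for weka.core.Instances; the class ColumnRemovalSketch and its methods invert and removeColumns are hypothetical names introduced only for illustration. The real class presumably delegates the removal to the m_Remove filter, so this is not its implementation.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain arrays replace weka.core.Instances,
// and none of these names exist in ADAMS or WEKA.
class ColumnRemovalSketch {

    /** Returns every column index in [0, numColumns) that is NOT listed in found. */
    static int[] invert(int[] found, int numColumns) {
        boolean[] marked = new boolean[numColumns];
        for (int idx : found) {
            marked[idx] = true;
        }
        int count = 0;
        for (boolean m : marked) {
            if (!m) {
                count++;
            }
        }
        int[] result = new int[count];
        int pos = 0;
        for (int i = 0; i < numColumns; i++) {
            if (!marked[i]) {
                result[pos++] = i;
            }
        }
        return result;
    }

    /** Returns a copy of rows without the columns listed in indices. */
    static double[][] removeColumns(double[][] rows, int[] indices) {
        if (rows.length == 0) {
            return new double[0][0];
        }
        // The columns that survive are the complement of the ones to remove.
        int[] keep = invert(indices, rows[0].length);
        double[][] result = new double[rows.length][keep.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < keep.length; c++) {
                result[r][c] = rows[r][keep[c]];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] rows = {{1, 2, 3}, {4, 5, 6}};
        int[] found = {1}; // indices a column finder might have reported

        // Without inversion: the found column is removed.
        System.out.println(Arrays.deepToString(removeColumns(rows, found)));
        // prints [[1.0, 3.0], [4.0, 6.0]]

        // With inversion: the caller inverts first, so everything else is removed.
        int[] inverted = invert(found, rows[0].length);
        System.out.println(Arrays.deepToString(removeColumns(rows, inverted)));
        // prints [[2.0], [5.0]]
    }
}
```

        As in the description above, the inversion happens before the removal step, so removeColumns never needs to know whether inverting was requested.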
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Overrides:
        getRevision in class weka.filters.Filter
        Returns:
        the revision