Class SegmentedDownSample

  • All Implemented Interfaces:
    Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.filters.UnsupervisedFilter

    public class SegmentedDownSample
    extends weka.filters.SimpleBatchFilter
    implements weka.filters.UnsupervisedFilter
    Configures a weka.filters.unsupervised.attribute.PartitionedMultiFilter2 based on the supplied number of splits and nth points; the nth points configure the weka.filters.unsupervised.attribute.DownSample filters that are applied to the subsets.
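
    The idea described above can be illustrated with a minimal, library-free sketch. It is not the actual implementation: the class name SegmentedDownSampleSketch, the equal-sized contiguous split rule, the assumption that the number of splits equals the number of nth-point entries, and the choice to keep indices 0, n, 2n, ... within each segment are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not the Weka/ADAMS implementation. */
class SegmentedDownSampleSketch {

    /**
     * Splits the values into nthPoints.size() contiguous segments and keeps
     * every nth value of each segment (assumed: offsets 0, n, 2n, ...).
     */
    static List<Double> downSample(double[] values, List<Integer> nthPoints) {
        int segments = nthPoints.size();
        List<Double> result = new ArrayList<>();
        for (int s = 0; s < segments; s++) {
            int nth = nthPoints.get(s);
            if (nth < 1) {
                throw new IllegalArgumentException("nth point must be >= 1: " + nth);
            }
            // Assumed split rule: near-equal contiguous index ranges.
            int from = s * values.length / segments;
            int to = (s + 1) * values.length / segments;
            for (int i = from; i < to; i += nth) {
                result.add(values[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] row = {0, 1, 2, 3, 4, 5, 6, 7};
        // First half kept in full (nth = 1), second half thinned (nth = 2).
        System.out.println(downSample(row, List.of(1, 2)));
    }
}
```

    With nth points of 1 and 2, the first segment passes through unchanged while the second keeps every other value, which is the kind of per-segment resolution control the filter is meant to provide.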

    Valid options are:

     -nth-points <list>
      The blank-separated list of nth points to use for down-sampling the splits
      (default: 1).
     -exclude <expr>
      The regular expression for identifying attributes to exclude from
      the splits (default: ^(sample_id)$)
     -U
      Flag for leaving unused attributes out of the output; by default,
      these are included in the filter output.
     -output-debug-info
      If set, filter is run in debug mode and
      may output additional info to the console
     -do-not-check-capabilities
      If set, filter capabilities are not checked before filter is built
      (use with caution).
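
    As a rough illustration of how the three filter-specific options and their documented defaults fit together, here is a library-free stand-in for option parsing. The class OptionsSketch and its parse method are hypothetical and do not reproduce Weka's own option handling; the generic options -output-debug-info and -do-not-check-capabilities are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in for option parsing; not the Weka implementation. */
class OptionsSketch {

    /** Parses -nth-points, -exclude and -U, falling back to the documented defaults. */
    static Map<String, String> parse(String[] options) {
        Map<String, String> parsed = new LinkedHashMap<>();
        // Defaults as stated in the option list above.
        parsed.put("nth-points", "1");
        parsed.put("exclude", "^(sample_id)$");
        parsed.put("U", "false");
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            switch (flag) {
                case "-nth-points":
                case "-exclude":
                    if (i + 1 >= options.length) {
                        throw new IllegalArgumentException("Missing value for " + flag);
                    }
                    i++; // consume the option's value
                    parsed.put(flag.substring(1), options[i]);
                    break;
                case "-U":
                    parsed.put("U", "true");
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String[] options = {"-nth-points", "1 2 4", "-U"};
        System.out.println(parse(options));
    }
}
```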
    Version:
    $Revision$
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static String DEFAULT_EXCLUDE
      the default for the exclude expression.
      static int DEFAULT_NTH_POINT
      the default for the nth point.
      protected adams.core.base.BaseRegExp m_Exclude
      the regular expression for attributes to exclude from the splits.
      protected weka.filters.unsupervised.attribute.PartitionedMultiFilter2 m_Filter
      the filter used internally.
      protected weka.core.Instances m_FirstPassData
      the filtered data from the first pass.
      protected List<Integer> m_NthPoints
      the list of nth points to use.
      protected boolean m_RemoveUnused
      Whether unused attributes are left out of the output.
      • Fields inherited from class weka.filters.Filter

        m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
    • Method Summary

      Modifier and Type Method Description
      boolean allowAccessToFullInputFormat()  
      protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
      Determines the output format based on the input format and returns it.
      String excludeTipText()
      Returns the tip text for this property.
      adams.core.base.BaseRegExp getExclude()
      Returns the regular expression that identifies attributes to be excluded from the splits.
      String getNthPoints()
      Returns the blank-separated list of nth points to use (each >= 1).
      String[] getOptions()
      Gets the current settings of the filter.
      boolean getRemoveUnused()
      Gets whether unused attributes (ones that are not covered by any of the ranges) are removed from the output.
      String globalInfo()
      Returns a string describing this filter.
      Enumeration listOptions()
      Returns an enumeration describing the available options.
      String nthPointsTipText()
      Returns the tip text for this property.
      protected weka.core.Instances process(weka.core.Instances instances)
      Processes the given data (may change the provided dataset) and returns the modified version.
      String removeUnusedTipText()
      Returns the tip text for this property.
      protected void reset()
      Resets the filter, i.e., sets m_NewBatch to true and m_FirstBatchDone to false.
      void setExclude(adams.core.base.BaseRegExp value)
      Sets the regular expression that identifies attributes to be excluded from the splits.
      void setNthPoints(String value)
      Sets the blank-separated list of nth points to use (each >= 1).
      void setOptions(String[] options)
      Parses a list of options for this object.
      void setRemoveUnused(boolean value)
      Sets whether unused attributes (ones that are not covered by any of the ranges) are removed from the output.
      • Methods inherited from class weka.filters.SimpleBatchFilter

        batchFinished, hasImmediateOutputFormat, input
      • Methods inherited from class weka.filters.SimpleFilter

        setInputFormat
      • Methods inherited from class weka.filters.Filter

        batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCapabilities, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, getRevision, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, main, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
    • Field Detail

      • DEFAULT_NTH_POINT

        public static final int DEFAULT_NTH_POINT
        the default for the nth point.
        See Also:
        Constant Field Values
      • DEFAULT_EXCLUDE

        public static final String DEFAULT_EXCLUDE
        the default for the exclude expression.
      • m_NthPoints

        protected List<Integer> m_NthPoints
        the list of nth points to use.
      • m_Exclude

        protected adams.core.base.BaseRegExp m_Exclude
        the regular expression for attributes to exclude from the splits.
      • m_RemoveUnused

        protected boolean m_RemoveUnused
        Whether unused attributes are left out of the output.
      • m_Filter

        protected weka.filters.unsupervised.attribute.PartitionedMultiFilter2 m_Filter
        the filter used internally.
      • m_FirstPassData

        protected weka.core.Instances m_FirstPassData
        the filtered data from the first pass.
    • Constructor Detail

      • SegmentedDownSample

        public SegmentedDownSample()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this filter.
        Specified by:
        globalInfo in class weka.filters.SimpleFilter
        Returns:
        a description of the filter suitable for displaying in the explorer/experimenter gui
      • listOptions

        public Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.filters.Filter
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(String[] options)
                        throws Exception
        Parses a list of options for this object. Also resets the state of the filter (this reset doesn't affect the options).
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.filters.Filter
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
        See Also:
        reset()
      • getOptions

        public String[] getOptions()
        Gets the current settings of the filter.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.filters.Filter
        Returns:
        an array of strings suitable for passing to setOptions
      • setNthPoints

        public void setNthPoints(String value)
        Sets the blank-separated list of nth points to use (each >= 1).
        Parameters:
        value - the blank-separated list of nth points
      • getNthPoints

        public String getNthPoints()
        Returns the blank-separated list of nth points to use (each >= 1).
        Returns:
        the blank-separated list of nth points
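
        The property is exchanged as a String while the m_NthPoints field is a List<Integer>, so some conversion has to happen in between. The following is a minimal sketch of such a conversion, assuming whitespace-separated tokens and rejection of values below 1; the class NthPointsSketch and its methods are hypothetical and not part of the documented API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative conversion between the String property and an integer list. */
class NthPointsSketch {

    /** Parses a blank-separated list such as "1 2 4" into integers, each >= 1. */
    static List<Integer> parse(String value) {
        List<Integer> result = new ArrayList<>();
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        for (String token : trimmed.split("\\s+")) {
            int n = Integer.parseInt(token);
            if (n < 1) {
                throw new IllegalArgumentException("nth point must be >= 1: " + n);
            }
            result.add(n);
        }
        return result;
    }

    /** Formats the list back into its blank-separated form. */
    static String format(List<Integer> nthPoints) {
        StringBuilder sb = new StringBuilder();
        for (int n : nthPoints) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Integer> points = parse("1 2  4");
        System.out.println(points + " -> \"" + format(points) + "\"");
    }
}
```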
      • nthPointsTipText

        public String nthPointsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setExclude

        public void setExclude(adams.core.base.BaseRegExp value)
        Sets the regular expression that identifies attributes to be excluded from the splits.
        Parameters:
        value - the expression
      • getExclude

        public adams.core.base.BaseRegExp getExclude()
        Returns the regular expression that identifies attributes to be excluded from the splits.
        Returns:
        the expression
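
        To make the effect of the exclude expression concrete, the sketch below filters attribute names with java.util.regex, using the documented default ^(sample_id)$. It is an illustration only: the class ExcludeSketch, the attribute names, and the use of a full match via Matcher.matches() are assumptions, and the real filter works with adams.core.base.BaseRegExp rather than a plain String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: selects the attribute names that remain eligible for the splits. */
class ExcludeSketch {

    static List<String> splitCandidates(List<String> attributeNames, String excludeRegex) {
        Pattern exclude = Pattern.compile(excludeRegex);
        List<String> kept = new ArrayList<>();
        for (String name : attributeNames) {
            // Assumed semantics: a name is excluded when the whole name matches.
            if (!exclude.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical attribute names.
        List<String> names = List.of("sample_id", "amp-1", "amp-2");
        System.out.println(splitCandidates(names, "^(sample_id)$"));
    }
}
```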
      • excludeTipText

        public String excludeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setRemoveUnused

        public void setRemoveUnused(boolean value)
        Sets whether unused attributes (ones that are not covered by any of the ranges) are removed from the output.
        Parameters:
        value - if true then the unused attributes get removed
      • getRemoveUnused

        public boolean getRemoveUnused()
        Gets whether unused attributes (ones that are not covered by any of the ranges) are removed from the output.
        Returns:
        true if unused attributes are removed
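
        The notion of an unused attribute (one not covered by any range) can be sketched as follows. The class RemoveUnusedSketch, the representation of ranges as inclusive {from, to} index pairs, and the returned index order are assumptions for illustration; the sketch makes no claim about where the actual filter places unused attributes in its output.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: determines which attribute indices survive in the output. */
class RemoveUnusedSketch {

    /**
     * @param numAttributes total number of attributes
     * @param ranges        inclusive {from, to} index pairs covered by the splits
     * @param removeUnused  if true, indices outside every range are dropped
     */
    static List<Integer> keptIndices(int numAttributes, List<int[]> ranges, boolean removeUnused) {
        boolean[] covered = new boolean[numAttributes];
        for (int[] range : ranges) {
            int from = Math.max(0, range[0]);
            int to = Math.min(numAttributes - 1, range[1]);
            for (int i = from; i <= to; i++) {
                covered[i] = true;
            }
        }
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            if (covered[i] || !removeUnused) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<int[]> ranges = List.of(new int[]{0, 1}, new int[]{3, 3});
        System.out.println(keptIndices(5, ranges, true));
        System.out.println(keptIndices(5, ranges, false));
    }
}
```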
      • removeUnusedTipText

        public String removeUnusedTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • reset

        protected void reset()
        Resets the filter, i.e., sets m_NewBatch to true and m_FirstBatchDone to false.
        Overrides:
        reset in class weka.filters.SimpleFilter
      • allowAccessToFullInputFormat

        public boolean allowAccessToFullInputFormat()
        Overrides:
        allowAccessToFullInputFormat in class weka.filters.SimpleBatchFilter
      • determineOutputFormat

        protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat)
                                                     throws Exception
        Determines the output format based on the input format and returns it. If the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, this method is called from batchFinished().
        Specified by:
        determineOutputFormat in class weka.filters.SimpleFilter
        Parameters:
        inputFormat - the input format to base the output format on
        Returns:
        the output format
        Throws:
        Exception - in case the determination goes wrong
      • process

        protected weka.core.Instances process(weka.core.Instances instances)
                                       throws Exception
        Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
        Specified by:
        process in class weka.filters.SimpleFilter
        Parameters:
        instances - the data to process
        Returns:
        the modified data
        Throws:
        Exception - in case the processing goes wrong