Class AlignDataset

  • All Implemented Interfaces:
    Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler

    public class AlignDataset
    extends weka.filters.SimpleBatchFilter
    Aligns the dataset(s) passing through to the reference dataset.
    Makes use of the following other filters internally:
    - weka.filters.unsupervised.attribute.AnyToString
    - weka.filters.unsupervised.instance.RemoveWithLabels

    Valid options are:

     -reference-dataset <file>
      The reference dataset to load.
     
     -use-custom-loader
      Whether to use a custom loader.
     
     -custom-loader <classname + options>
      The custom loader to use.
     
     -output-debug-info
      If set, filter is run in debug mode and
      may output additional info to the console.
     
     -do-not-check-capabilities
      If set, filter capabilities are not checked before filter is built
      (use with caution).
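    The options above are passed to setOptions as a flat string array. The following is a minimal sketch of assembling such an array. The helper class AlignDatasetOptions and the file name reference.arff are hypothetical and not part of this class, and weka.core.converters.CSVLoader merely stands in for any loader class name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of AlignDataset) that assembles an option array. */
final class AlignDatasetOptions {

    private AlignDatasetOptions() {}

    /**
     * Builds an option array from the flags documented above.
     *
     * @param referenceFile    value for -reference-dataset
     * @param customLoaderSpec loader class name plus its options, or null for automatic loading
     * @return an array suitable for passing to setOptions
     */
    static String[] build(String referenceFile, String customLoaderSpec) {
        List<String> opts = new ArrayList<>();
        opts.add("-reference-dataset");
        opts.add(referenceFile);
        if (customLoaderSpec != null) {
            // The flag enables the custom loader; the next option names it.
            opts.add("-use-custom-loader");
            opts.add("-custom-loader");
            opts.add(customLoaderSpec);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed example values; replace with a real file and loader.
        String[] opts = build("reference.arff", "weka.core.converters.CSVLoader");
        System.out.println(String.join(" ", opts));
    }
}
```

    The resulting array would then be handed to setOptions(String[]) on a filter instance. The loader specification is kept as a single array element because the option takes the class name and its options together.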
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected weka.core.Instances m_ActualReferenceDataset
      the actual reference dataset in use.
      protected weka.core.converters.AbstractFileLoader m_CustomLoader
      the file loader to use for loading the reference data.
      protected File m_ReferenceDataset
      the file containing the reference dataset.
      protected weka.core.Instances m_SuppliedReferenceDataset
      the supplied reference dataset, when used programmatically.
      protected boolean m_UseCustomLoader
      whether to use a custom loader for the reference data.
      • Fields inherited from class weka.filters.Filter

        m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
    • Constructor Summary

      Constructors 
      Constructor Description
      AlignDataset()  
    • Method Summary

      Modifier and Type Method Description
      protected weka.filters.MultiFilter checkCompatibility​(weka.core.Instances reference, weka.core.Instances current)
      Checks the compatibility between reference dataset and the one to be aligned with it.
      String customLoaderTipText()
      Returns the tip text for this property.
      protected weka.core.Instances determineOutputFormat​(weka.core.Instances inputFormat)
      Determines the output format based on the input format and returns it.
      weka.core.Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      weka.core.converters.AbstractFileLoader getCustomLoader()
      Returns the custom loader to use (if enabled).
      String[] getOptions()
      Gets the current settings of the filter.
      File getReferenceDataset()
      Returns the file containing the reference dataset.
      String getRevision()
      Returns the revision string.
      weka.core.Instances getSuppliedReferenceDataset()
      Returns the manually set reference dataset that is used instead of loading one from disk.
      boolean getUseCustomLoader()
      Returns whether to use a custom loader or automatic loading.
      String globalInfo()
      Returns a string describing this filter.
      Enumeration listOptions()
      Returns an enumeration describing the available options.
      protected weka.core.Instances loadReferenceDataset()
      Loads the reference dataset from disk or returns the manually supplied one.
      static void main​(String[] args)
      Main method for testing this class.
      protected weka.core.Instances process​(weka.core.Instances instances)
      Processes the given data (may change the provided dataset) and returns the modified version.
      void setCustomLoader​(weka.core.converters.AbstractFileLoader value)
      Sets the custom loader to use (if enabled).
      void setOptions​(String[] options)
      Parses a list of options for this object.
      void setReferenceDataset​(File value)
      Sets the file containing the reference dataset.
      void setSuppliedReferenceDataset​(weka.core.Instances value)
      Sets the reference dataset to use instead of loading one from disk.
      void setUseCustomLoader​(boolean value)
      Sets whether to use a custom loader or automatic loading.
      String testSetTipText()
      Returns the tip text for this property.
      String useCustomLoaderTipText()
      Returns the tip text for this property.
      • Methods inherited from class weka.filters.SimpleBatchFilter

        allowAccessToFullInputFormat, batchFinished, hasImmediateOutputFormat, input
      • Methods inherited from class weka.filters.SimpleFilter

        reset, setInputFormat
      • Methods inherited from class weka.filters.Filter

        batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
    • Field Detail

      • m_ReferenceDataset

        protected File m_ReferenceDataset
        the file containing the reference dataset.
      • m_UseCustomLoader

        protected boolean m_UseCustomLoader
        whether to use a custom loader for the reference data.
      • m_CustomLoader

        protected weka.core.converters.AbstractFileLoader m_CustomLoader
        the file loader to use for loading the reference data.
      • m_SuppliedReferenceDataset

        protected weka.core.Instances m_SuppliedReferenceDataset
        the supplied reference dataset, when used programmatically.
      • m_ActualReferenceDataset

        protected transient weka.core.Instances m_ActualReferenceDataset
        the actual reference dataset in use.
    • Constructor Detail

      • AlignDataset

        public AlignDataset()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this filter.
        Specified by:
        globalInfo in class weka.filters.SimpleFilter
        Returns:
        a description of the filter suitable for displaying in the explorer/experimenter GUI
      • listOptions

        public Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.filters.Filter
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions​(String[] options)
                        throws Exception
        Parses a list of options for this object.
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.filters.Filter
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
      • getOptions

        public String[] getOptions()
        Gets the current settings of the filter.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.filters.Filter
        Returns:
        an array of strings suitable for passing to setOptions
      • setReferenceDataset

        public void setReferenceDataset​(File value)
        Sets the file containing the reference dataset.
        Parameters:
        value - the file
      • getReferenceDataset

        public File getReferenceDataset()
        Returns the file containing the reference dataset.
        Returns:
        the file
      • testSetTipText

        public String testSetTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setUseCustomLoader

        public void setUseCustomLoader​(boolean value)
        Sets whether to use a custom loader or automatic loading.
        Parameters:
        value - true to use a custom loader
      • getUseCustomLoader

        public boolean getUseCustomLoader()
        Returns whether to use a custom loader or automatic loading.
        Returns:
        true if using custom loader
      • useCustomLoaderTipText

        public String useCustomLoaderTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setCustomLoader

        public void setCustomLoader​(weka.core.converters.AbstractFileLoader value)
        Sets the custom loader to use (if enabled).
        Parameters:
        value - the custom loader
      • getCustomLoader

        public weka.core.converters.AbstractFileLoader getCustomLoader()
        Returns the custom loader to use (if enabled).
        Returns:
        the custom loader
      • customLoaderTipText

        public String customLoaderTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setSuppliedReferenceDataset

        public void setSuppliedReferenceDataset​(weka.core.Instances value)
        Sets the reference dataset to use instead of loading one from disk.
        Parameters:
        value - the reference dataset to use, null to remove
      • getSuppliedReferenceDataset

        public weka.core.Instances getSuppliedReferenceDataset()
        Returns the manually set reference dataset that is used instead of loading one from disk.
        Returns:
        the manually set reference dataset to use, null if one is to be loaded from disk
      • loadReferenceDataset

        protected weka.core.Instances loadReferenceDataset()
                                                    throws Exception
        Loads the reference dataset from disk or returns the manually supplied one.
        Returns:
        the dataset
        Throws:
        Exception - if loader fails
        See Also:
        getSuppliedReferenceDataset()
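        The fallback described above (use the supplied dataset when one is set, otherwise load from disk) can be sketched as follows. ReferenceResolver is a hypothetical stand-in, not this filter's code, and the Supplier stands in for reading the reference file with whichever loader is configured.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in illustrating the supplied-or-load fallback. */
final class ReferenceResolver<T> {

    /** Plays the role of m_SuppliedReferenceDataset; null means "load from disk". */
    private T supplied;

    /** Stands in for loading the reference file (assumption: the loading details are elided). */
    private final Supplier<T> diskLoader;

    ReferenceResolver(Supplier<T> diskLoader) {
        this.diskLoader = diskLoader;
    }

    /** Sets the dataset to use instead of loading one; null removes it again. */
    void setSupplied(T value) {
        supplied = value;
    }

    /** Returns the supplied dataset if present, otherwise the one loaded from disk. */
    T resolve() {
        return (supplied != null) ? supplied : diskLoader.get();
    }

    public static void main(String[] args) {
        ReferenceResolver<String> resolver = new ReferenceResolver<>(() -> "loaded from disk");
        System.out.println(resolver.resolve());
        resolver.setSupplied("supplied in memory");
        System.out.println(resolver.resolve());
    }
}
```

        In the documented API the same switch is driven by setSuppliedReferenceDataset: passing a dataset makes the file settings irrelevant, and passing null restores loading from disk.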
      • getCapabilities

        public weka.core.Capabilities getCapabilities()
        Returns the Capabilities of this filter.
        Specified by:
        getCapabilities in interface weka.core.CapabilitiesHandler
        Overrides:
        getCapabilities in class weka.filters.Filter
        Returns:
        the capabilities of this object
        See Also:
        Capabilities
      • determineOutputFormat

        protected weka.core.Instances determineOutputFormat​(weka.core.Instances inputFormat)
                                                     throws Exception
        Determines the output format based on the input format and returns it. If the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, this method is called from batchFinished().
        Specified by:
        determineOutputFormat in class weka.filters.SimpleFilter
        Parameters:
        inputFormat - the input format to base the output format on
        Returns:
        the output format
        Throws:
        Exception - in case the determination goes wrong
        See Also:
        SimpleBatchFilter.hasImmediateOutputFormat(), SimpleBatchFilter.batchFinished()
      • checkCompatibility

        protected weka.filters.MultiFilter checkCompatibility​(weka.core.Instances reference,
                                                              weka.core.Instances current)
                                                       throws Exception
        Checks the compatibility between reference dataset and the one to be aligned with it.
        Parameters:
        reference - the reference dataset
        current - the dataset to align
        Returns:
        the multi filter for aligning the datasets, null if nothing needs to be done
        Throws:
        Exception - if the datasets cannot be aligned
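        The return contract (an aligning step, null when nothing needs to be done, an exception when alignment is impossible) can be illustrated with a deliberately simplified sketch. The rules actually applied by checkCompatibility are not described here. The class CompatibilitySketch, its name-only comparison, and the reordering step are all assumptions made for illustration, and an unchecked exception replaces the documented checked one to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of a "step, null, or exception" compatibility check. */
final class CompatibilitySketch {

    private CompatibilitySketch() {}

    /**
     * Compares attribute names only (an assumed, simplified rule).
     *
     * @return null if the headers already match, otherwise a step that reorders to the reference
     * @throws IllegalArgumentException if the two headers do not contain the same names
     */
    static UnaryOperator<List<String>> check(List<String> reference, List<String> current) {
        if (reference.equals(current)) {
            return null; // nothing needs to be done
        }
        boolean sameNames = reference.size() == current.size()
                && new HashSet<>(reference).equals(new HashSet<>(current));
        if (!sameNames) {
            throw new IllegalArgumentException("Datasets cannot be aligned: attribute names differ");
        }
        // Same names in a different order: the step adopts the reference order.
        return header -> new ArrayList<>(reference);
    }

    /** Shows how a caller might treat the null result as "already aligned". */
    static List<String> align(List<String> reference, List<String> current) {
        UnaryOperator<List<String>> step = check(reference, current);
        return (step == null) ? current : step.apply(current);
    }

    public static void main(String[] args) {
        List<String> reference = List.of("id", "label");
        System.out.println(align(reference, List.of("label", "id")));
    }
}
```

        The point of the sketch is the null check in align: a caller of a method with this contract has to handle "no step needed" separately from "step returned".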
      • process

        protected weka.core.Instances process​(weka.core.Instances instances)
                                       throws Exception
        Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
        Specified by:
        process in class weka.filters.SimpleFilter
        Parameters:
        instances - the data to process
        Returns:
        the modified data
        Throws:
        Exception - in case the processing goes wrong
        See Also:
        SimpleBatchFilter.batchFinished()
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Overrides:
        getRevision in class weka.filters.Filter
        Returns:
        the revision
      • main

        public static void main​(String[] args)
        Main method for testing this class.
        Parameters:
        args - should contain arguments to the filter: use -h for help