Class ColumnSplitter

  • All Implemented Interfaces:
    adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.SizeOfHandler, Serializable

    public class ColumnSplitter
    extends AbstractSplitter
    Splits a dataset in two based on the columns selected by a column-finder. Selected columns go in the first dataset, and the rest go in the second.

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -column-finder <adams.data.weka.columnfinder.ColumnFinder> (property: columnFinder)
        Column-finder defining which attributes go into which dataset.
        default: adams.data.weka.columnfinder.NullFinder
     
    Author:
    Corey Sterling (csterlin at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected ColumnFinder m_ColumnFinder
      Column-finder for selecting which attributes go in which dataset.
      protected int[][] m_SourceLookup
      Mapping from the split attributes to their source in the original dataset.
      • Fields inherited from class adams.core.option.AbstractOptionHandler

        m_OptionManager
      • Fields inherited from class adams.core.logging.LoggingObject

        m_Logger, m_LoggingIsEnabled, m_LoggingLevel
    • Constructor Summary

      Constructors 
      Constructor Description
      ColumnSplitter()  
    • Method Summary

      Modifier and Type Method Description
      String check(weka.core.Instances dataset)
      Checks that the input data is correctly formatted for our purposes.
      String columnFinderTipText()
      Gets the tip-text for the columnFinder option.
      void defineOptions()
      Adds options to the internal list of options.
      ColumnFinder getColumnFinder()
      Gets the column finder.
      protected int getSelectedColumn(int[] selectedColumns, int index)
      Gets the column number of the selected column at the given index.
      protected int[] getUnselectedColumns(int[] selectedColumns, int numColumns)
      Creates an int[] which contains the unselected columns.
      String globalInfo()
      Returns a string describing the object.
      protected weka.core.Instance newInstanceForDataset(weka.core.Instances dataset)
      Creates a new empty instance suited to the given dataset.
      void setColumnFinder(ColumnFinder value)
      Sets the column finder.
      weka.core.Instances[] split(weka.core.Instances dataset)
      Splits the given dataset into two datasets: the selected columns and the rest.
      protected ArrayList<weka.core.Attribute>[] splitAttributes(weka.core.Instances dataset)
      Creates the attribute lists for the two datasets resulting from this split.
      • Methods inherited from class adams.core.option.AbstractOptionHandler

        cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, initialize, loggingLevelTipText, newOptionManager, reset, setLoggingLevel, toCommandLine, toString
      • Methods inherited from class adams.core.logging.LoggingObject

        configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
      • Methods inherited from interface adams.core.logging.LoggingLevelHandler

        getLoggingLevel
    • Field Detail

      • m_ColumnFinder

        protected ColumnFinder m_ColumnFinder
        Column-finder for selecting which attributes go in which dataset.
      • m_SourceLookup

        protected int[][] m_SourceLookup
        Mapping from the split attributes to their source in the original dataset.
    • Constructor Detail

      • ColumnSplitter

        public ColumnSplitter()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing the object.
        Specified by:
        globalInfo in interface adams.core.GlobalInfoSupporter
        Specified by:
        globalInfo in class adams.core.option.AbstractOptionHandler
        Returns:
        a description suitable for displaying in the gui
      • defineOptions

        public void defineOptions()
        Adds options to the internal list of options. Derived classes must override this method to add additional options.
        Specified by:
        defineOptions in interface adams.core.option.OptionHandler
        Overrides:
        defineOptions in class adams.core.option.AbstractOptionHandler
      • getColumnFinder

        public ColumnFinder getColumnFinder()
        Gets the column finder.
        Returns:
        The column finder.
      • setColumnFinder

        public void setColumnFinder(ColumnFinder value)
        Sets the column finder.
        Parameters:
        value - The column finder.
      • columnFinderTipText

        public String columnFinderTipText()
        Gets the tip-text for the columnFinder option.
        Returns:
        The tip-text as a string.
      • check

        public String check(weka.core.Instances dataset)
        Checks that the input data is correctly formatted for our purposes.
        Parameters:
        dataset - The dataset to check.
        Returns:
        null if the dataset is acceptable, otherwise an error message.
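The null-or-message return convention can be illustrated with a small stand-alone sketch. It does not use the real ADAMS or Weka classes, and the validation rule shown (all rows must have the same length) is purely hypothetical; the source does not state what check actually verifies. The class and method names are assumptions made for illustration.

```java
final class CheckConventionSketch {

  // Hypothetical validation: returns null when the data is acceptable,
  // otherwise a human-readable error message (the convention described above).
  static String check(double[][] rows) {
    if (rows == null)
      return "No dataset provided!";
    for (int i = 1; i < rows.length; i++) {
      if (rows[i].length != rows[0].length)
        return "Row " + i + " has " + rows[i].length
          + " values, expected " + rows[0].length;
    }
    return null;
  }

  public static void main(String[] args) {
    String msg = check(new double[][]{{1, 2}, {3}});
    // A caller would typically stop and report the message if it is not null.
    System.out.println(msg == null ? "OK" : "Error: " + msg);
  }
}
```

A caller following this convention tests the result against null before calling split.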
      • getUnselectedColumns

        protected int[] getUnselectedColumns(int[] selectedColumns,
                                             int numColumns)
        Creates an int[] containing the unselected columns, i.e. all column indices up to numColumns that are not in selectedColumns.
        Parameters:
        selectedColumns - The columns to exclude from the array. Must be sorted.
        numColumns - The total number of columns.
        Returns:
        The array of columns not in selectedColumns.
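One way to compute such a complement is sketched below in plain Java. This is not the ADAMS implementation; it assumes zero-based indices in the range 0 to numColumns - 1 and relies on selectedColumns being sorted in ascending order, as the parameter description requires. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class UnselectedColumnsSketch {

  // Returns all indices in [0, numColumns) that do not occur in selectedColumns.
  // Assumption: selectedColumns is sorted ascending (zero-based indices).
  static int[] unselectedColumns(int[] selectedColumns, int numColumns) {
    int[] result = new int[numColumns];
    int count = 0;
    int next = 0; // position of the next selected column still to be matched
    for (int col = 0; col < numColumns; col++) {
      // advance past selected entries smaller than the current column
      while (next < selectedColumns.length && selectedColumns[next] < col)
        next++;
      if (next < selectedColumns.length && selectedColumns[next] == col)
        continue; // column is selected, so leave it out
      result[count++] = col;
    }
    return Arrays.copyOf(result, count);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(unselectedColumns(new int[]{1, 3}, 5)));
  }
}
```

Because the input is sorted, a single pass over the column range is enough and no set lookup is needed.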
      • getSelectedColumn

        protected int getSelectedColumn(int[] selectedColumns,
                                        int index)
        Gets the column number of the selected column at the given index.
        Parameters:
        selectedColumns - The array of selected columns.
        index - The index of the column to get.
        Returns:
        The number of the selected column, or -1 if the index is out of range.
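The documented contract (return the column at the given position, or -1 when the position is out of range) might look like the following stand-alone sketch. Treating a negative index as out of range is an assumption, and the class and method names are hypothetical rather than taken from ADAMS.

```java
final class SelectedColumnSketch {

  // Returns selectedColumns[index], or -1 if index lies outside the array.
  // Assumption: negative indices count as out of range.
  static int selectedColumn(int[] selectedColumns, int index) {
    if (index < 0 || index >= selectedColumns.length)
      return -1;
    return selectedColumns[index];
  }

  public static void main(String[] args) {
    int[] selected = {2, 5, 7};
    System.out.println(selectedColumn(selected, 1));
    System.out.println(selectedColumn(selected, 3));
  }
}
```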
      • splitAttributes

        protected ArrayList<weka.core.Attribute>[] splitAttributes(weka.core.Instances dataset)
        Creates the attribute lists for the two datasets resulting from this split.
        Parameters:
        dataset - The dataset being split.
        Returns:
        Two lists, the first containing the selected attributes, the second containing the rest.
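The partitioning described here, together with a source lookup like the one held in m_SourceLookup, can be sketched without Weka by using attribute names in place of weka.core.Attribute objects. The layout sourceLookup[d][i] (original index of entry i in list d) is an assumption about how the mapping is organised, as is keeping both lists in original column order. All names in the sketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitAttributesSketch {

  // lists.get(0): selected names, lists.get(1): remaining names.
  // sourceLookup[d][i]: original index of entry i in list d (assumed layout).
  static final class Result {
    final List<List<String>> lists = new ArrayList<>();
    final int[][] sourceLookup = new int[2][];
  }

  static Result splitAttributes(List<String> names, int[] selectedColumns) {
    // mark the selected positions, ignoring indices outside the list
    boolean[] selected = new boolean[names.size()];
    for (int col : selectedColumns) {
      if (col >= 0 && col < selected.length)
        selected[col] = true;
    }
    List<String> first = new ArrayList<>();
    List<String> second = new ArrayList<>();
    List<Integer> firstSource = new ArrayList<>();
    List<Integer> secondSource = new ArrayList<>();
    for (int i = 0; i < names.size(); i++) {
      if (selected[i]) {
        first.add(names.get(i));
        firstSource.add(i);
      } else {
        second.add(names.get(i));
        secondSource.add(i);
      }
    }
    Result result = new Result();
    result.lists.add(first);
    result.lists.add(second);
    result.sourceLookup[0] = firstSource.stream().mapToInt(Integer::intValue).toArray();
    result.sourceLookup[1] = secondSource.stream().mapToInt(Integer::intValue).toArray();
    return result;
  }

  public static void main(String[] args) {
    Result r = splitAttributes(List.of("a", "b", "c", "d"), new int[]{1, 3});
    System.out.println(r.lists);
  }
}
```

Recording the source index alongside each attribute means row values can later be copied into the right dataset without searching by name.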
      • newInstanceForDataset

        protected weka.core.Instance newInstanceForDataset(weka.core.Instances dataset)
        Creates a new empty instance suited to the given dataset.
        Parameters:
        dataset - The dataset to create the instance for.
        Returns:
        The created instance.
      • split

        public weka.core.Instances[] split(weka.core.Instances dataset)
        Splits the given dataset into two datasets: the first holds the columns selected by the column-finder, the second holds the remaining columns.
        Specified by:
        split in class AbstractSplitter
        Parameters:
        dataset - The dataset to split.
        Returns:
        An array of datasets resulting from the split.
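The overall effect of a column split can be illustrated on a plain numeric table. The sketch below is not the ADAMS implementation and does not use weka.core.Instances; it assumes zero-based column indices, stands in a double[][] for the dataset, and returns the selected part first and the remainder second, matching the class description. The names are hypothetical.

```java
import java.util.Arrays;

final class ColumnSplitSketch {

  // Splits each row into the values at the selected columns and the remaining values.
  // Returns {selectedPart, remainingPart}; both keep the original column order.
  static double[][][] split(double[][] rows, int[] selectedColumns, int numColumns) {
    boolean[] selected = new boolean[numColumns];
    int numSelected = 0;
    for (int col : selectedColumns) {
      // ignore out-of-range and duplicate indices
      if (col >= 0 && col < numColumns && !selected[col]) {
        selected[col] = true;
        numSelected++;
      }
    }
    double[][] first = new double[rows.length][numSelected];
    double[][] second = new double[rows.length][numColumns - numSelected];
    for (int r = 0; r < rows.length; r++) {
      int f = 0;
      int s = 0;
      for (int col = 0; col < numColumns; col++) {
        if (selected[col])
          first[r][f++] = rows[r][col];
        else
          second[r][s++] = rows[r][col];
      }
    }
    return new double[][][]{first, second};
  }

  public static void main(String[] args) {
    double[][] data = {{1, 2, 3}, {4, 5, 6}};
    double[][][] parts = split(data, new int[]{0, 2}, 3);
    System.out.println(Arrays.deepToString(parts[0]));
    System.out.println(Arrays.deepToString(parts[1]));
  }
}
```

Both output tables have the same number of rows as the input; only the columns are divided between them.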