Package adams.data.weka.datasetsplitter
Class ColumnSplitter
- java.lang.Object
  - adams.core.logging.LoggingObject
    - adams.core.logging.CustomLoggingLevelObject
      - adams.core.option.AbstractOptionHandler
        - adams.data.weka.datasetsplitter.AbstractSplitter
          - adams.data.weka.datasetsplitter.ColumnSplitter
- All Implemented Interfaces:
Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, SizeOfHandler, Serializable
public class ColumnSplitter extends AbstractSplitter
Splits a dataset in two based on the columns selected by a column-finder. Selected columns go in the first dataset, and the rest go in the second.
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING
-column-finder <adams.data.weka.columnfinder.ColumnFinder> (property: columnFinder) Column-finder defining which attributes go into which dataset. default: adams.data.weka.columnfinder.NullFinder
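The split described above can be illustrated without Weka or ADAMS on the classpath. The following is a minimal sketch, not the class's actual implementation: it assumes a plain double[][] table in place of weka.core.Instances, and the names ColumnSplitSketch and split are hypothetical. It also assumes the selected indices are distinct and lie within the table.

```java
import java.util.Arrays;

/** Illustrative stand-in: splits a plain table by column, mimicking the documented behaviour. */
class ColumnSplitSketch {

    /**
     * Returns two tables: result[0] holds the selected columns and
     * result[1] holds the remaining columns, both in ascending column order.
     * Assumes the indices in selected are distinct and below numColumns.
     */
    static double[][][] split(double[][] rows, int numColumns, int[] selected) {
        // Mark which columns were selected.
        boolean[] isSelected = new boolean[numColumns];
        for (int col : selected) {
            isSelected[col] = true;
        }

        double[][] first = new double[rows.length][];
        double[][] second = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            first[i] = new double[selected.length];
            second[i] = new double[numColumns - selected.length];
            int f = 0;
            int s = 0;
            // Route each value to the first or second table.
            for (int col = 0; col < numColumns; col++) {
                if (isSelected[col]) {
                    first[i][f++] = rows[i][col];
                } else {
                    second[i][s++] = rows[i][col];
                }
            }
        }
        return new double[][][] {first, second};
    }

    public static void main(String[] args) {
        double[][] rows = {{1, 2, 3, 4}, {5, 6, 7, 8}};
        double[][][] parts = split(rows, 4, new int[] {1, 3});
        System.out.println(Arrays.deepToString(parts[0])); // [[2.0, 4.0], [6.0, 8.0]]
        System.out.println(Arrays.deepToString(parts[1])); // [[1.0, 3.0], [5.0, 7.0]]
    }
}
```

In the real class the column selection comes from the configured column-finder rather than from a hard-coded index array.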
- Author:
- Corey Sterling (csterlin at waikato dot ac dot nz)
- See Also:
- Serialized Form
Field Summary
Fields
Modifier and Type | Field | Description
protected ColumnFinder | m_ColumnFinder | Column-finder for selecting which attributes go in which dataset.
protected int[][] | m_SourceLookup | Mapping from the split attributes to their source in the original dataset.
Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager
-
Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel
Constructor Summary
Constructors
Constructor | Description
ColumnSplitter() |
-
Method Summary
Modifier and Type | Method | Description
String | check(weka.core.Instances dataset) | Checks that the input data is correctly formatted for our purposes.
String | columnFinderTipText() | Gets the tip-text for the columnFinder option.
void | defineOptions() | Adds options to the internal list of options.
ColumnFinder | getColumnFinder() | Gets the column finder.
protected int | getSelectedColumn(int[] selectedColumns, int index) | Gets the column number of the selected column at the given index.
protected int[] | getUnselectedColumns(int[] selectedColumns, int numColumns) | Creates an int[] which contains the unselected columns, i.e. all column indices up to numColumns that aren't in selectedColumns.
String | globalInfo() | Returns a string describing the object.
protected weka.core.Instance | newInstanceForDataset(weka.core.Instances dataset) | Creates a new empty instance suited to the given dataset.
void | setColumnFinder(ColumnFinder value) | Sets the column finder.
weka.core.Instances[] | split(weka.core.Instances dataset) | Splits the given dataset into a number of other datasets.
protected ArrayList<weka.core.Attribute>[] | splitAttributes(weka.core.Instances dataset) | Creates the attribute lists for the two datasets resulting from this split.
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, initialize, loggingLevelTipText, newOptionManager, reset, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel
Field Detail
-
m_ColumnFinder
protected ColumnFinder m_ColumnFinder
Column-finder for selecting which attributes go in which dataset.
-
m_SourceLookup
protected int[][] m_SourceLookup
Mapping from the split attributes to their source in the original dataset.
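One plausible reading of this mapping, offered as an assumption rather than as the class's confirmed layout, is that entry [d][i] stores the original column index of attribute i in output dataset d (0 for the selected columns, 1 for the rest). The sketch below builds such a table from a selection mask; SourceLookupSketch, buildLookup and sourceOf are hypothetical names.

```java
import java.util.Arrays;

/** Illustrative sketch of a split-attribute to source-column lookup (layout is an assumption). */
class SourceLookupSketch {

    /**
     * Builds lookup[0] (original indices of selected columns) and
     * lookup[1] (original indices of the remaining columns).
     */
    static int[][] buildLookup(boolean[] isSelected) {
        int selectedCount = 0;
        for (boolean flag : isSelected) {
            if (flag) {
                selectedCount++;
            }
        }
        int[][] lookup = {
            new int[selectedCount],
            new int[isSelected.length - selectedCount]
        };
        int f = 0;
        int s = 0;
        for (int col = 0; col < isSelected.length; col++) {
            if (isSelected[col]) {
                lookup[0][f++] = col;
            } else {
                lookup[1][s++] = col;
            }
        }
        return lookup;
    }

    /** Original column index of the given attribute in the given output dataset. */
    static int sourceOf(int[][] lookup, int dataset, int attribute) {
        return lookup[dataset][attribute];
    }

    public static void main(String[] args) {
        int[][] lookup = buildLookup(new boolean[] {false, true, false, true});
        System.out.println(Arrays.deepToString(lookup)); // [[1, 3], [0, 2]]
        System.out.println(sourceOf(lookup, 1, 1));      // 2
    }
}
```

With a table like this, copying a row into the split datasets reduces to reading the original value at the looked-up index for each new attribute.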
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
- globalInfo in interface GlobalInfoSupporter
- Specified by:
- globalInfo in class AbstractOptionHandler
- Returns:
- A description suitable for displaying in the GUI.
-
defineOptions
public void defineOptions()
Adds options to the internal list of options. Derived classes must override this method to add additional options.
- Specified by:
- defineOptions in interface OptionHandler
- Overrides:
- defineOptions in class AbstractOptionHandler
-
getColumnFinder
public ColumnFinder getColumnFinder()
Gets the column finder.
- Returns:
- The column finder.
-
setColumnFinder
public void setColumnFinder(ColumnFinder value)
Sets the column finder.
- Parameters:
- value - The column finder.
-
columnFinderTipText
public String columnFinderTipText()
Gets the tip-text for the columnFinder option.
- Returns:
- The tip-text as a string.
-
check
public String check(weka.core.Instances dataset)
Checks that the input data is correctly formatted for our purposes.
- Parameters:
- dataset - The dataset to check.
- Returns:
- Null if all okay, or an error message if not.
-
getUnselectedColumns
protected int[] getUnselectedColumns(int[] selectedColumns, int numColumns)
Creates an int[] containing the unselected columns, i.e. all column indices below numColumns that are not in selectedColumns.
- Parameters:
- selectedColumns - The columns to exclude from the array. Must be sorted.
- numColumns - The total number of columns.
- Returns:
- The array of columns not in selectedColumns.
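Because selectedColumns must be sorted, the complement can be computed in a single pass. The following is a minimal sketch of that idea, not the method's actual source: the class name UnselectedColumnsSketch is hypothetical, and the code additionally assumes the selected indices are distinct and lie in the range 0 to numColumns - 1.

```java
import java.util.Arrays;

/** Illustrative sketch: complement of a sorted index array within 0..numColumns-1. */
class UnselectedColumnsSketch {

    /**
     * Returns every column index below numColumns that is absent from selectedColumns.
     * Assumes selectedColumns is sorted ascending, duplicate-free and in range.
     */
    static int[] getUnselectedColumns(int[] selectedColumns, int numColumns) {
        int[] result = new int[numColumns - selectedColumns.length];
        int next = 0; // position of the next selected index still to be skipped
        int out = 0;  // write position in result
        for (int col = 0; col < numColumns; col++) {
            if (next < selectedColumns.length && selectedColumns[next] == col) {
                next++;              // column is selected, so skip it
            } else {
                result[out++] = col; // column is unselected, so keep it
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(getUnselectedColumns(new int[] {1, 3}, 5))); // [0, 2, 4]
    }
}
```

The sortedness requirement is what lets the loop compare against only one pending selected index at a time; an unsorted input would need a set or a mask instead.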
-
getSelectedColumn
protected int getSelectedColumn(int[] selectedColumns, int index)
Gets the column number of the selected column at the given index.
- Parameters:
- selectedColumns - The array of selected columns.
- index - The index of the column to get.
- Returns:
- The number of the selected column, or -1 if index out of range.
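The documented contract amounts to a bounds-checked array read. A minimal sketch follows, under the assumption that "out of range" covers both negative indices and indices at or beyond the array length; the class name SelectedColumnSketch is hypothetical and the real method may differ in detail.

```java
/** Illustrative sketch of a bounds-checked lookup that returns -1 instead of throwing. */
class SelectedColumnSketch {

    /** Returns selectedColumns[index], or -1 when index is outside the array. */
    static int getSelectedColumn(int[] selectedColumns, int index) {
        if (index < 0 || index >= selectedColumns.length) {
            return -1; // sentinel for "no such selected column"
        }
        return selectedColumns[index];
    }

    public static void main(String[] args) {
        int[] selected = {2, 5, 7};
        System.out.println(getSelectedColumn(selected, 1)); // 5
        System.out.println(getSelectedColumn(selected, 3)); // -1
    }
}
```

Returning a sentinel lets a caller walk the selected and unselected arrays side by side without a separate length check on every step.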
-
splitAttributes
protected ArrayList<weka.core.Attribute>[] splitAttributes(weka.core.Instances dataset)
Creates the attribute lists for the two datasets resulting from this split.
- Parameters:
- dataset - The dataset being split.
- Returns:
- Two lists, the first containing the selected attributes, the second containing the rest.
-
newInstanceForDataset
protected weka.core.Instance newInstanceForDataset(weka.core.Instances dataset)
Creates a new empty instance suited to the given dataset.
- Parameters:
- dataset - The dataset to create the instance for.
- Returns:
- The created instance.
-
split
public weka.core.Instances[] split(weka.core.Instances dataset)
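Splits the given dataset into a number of other datasets. Should be implemented by sub-classes to perform actual splitting. For this class the result contains two datasets: the first holds the columns selected by the column-finder, the second holds the rest.
- Specified by:
- split in class AbstractSplitter
- Parameters:
- dataset - The dataset to split.
- Returns:
- An array of datasets resulting from the split.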