Class AttributeSummaryTransferFilter
- java.lang.Object
  - weka.filters.Filter
    - weka.filters.SimpleFilter
      - weka.filters.SimpleBatchFilter
        - weka.filters.unsupervised.attribute.AttributeSummaryTransferFilter
- All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.filters.UnsupervisedFilter
public class AttributeSummaryTransferFilter extends weka.filters.SimpleBatchFilter implements weka.filters.UnsupervisedFilter
Filter which trains another filter to summarise a subset of the data's attributes. The trained filter should be a supervised or unsupervised attribute filter. The summary filter is trained on a large set of unannotated data so that it can then be applied to a relatively small set which is annotated with other information.
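The train-then-transfer idea described above can be illustrated in plain Java, without WEKA or ADAMS on the classpath. This is only a sketch under stated assumptions: the class TransferSketch and its methods are hypothetical, and simple per-column mean centring stands in for the real summary filter. It shows state being learned from unannotated rows and then applied to an annotated row whose extra column is left untouched.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not the ADAMS/WEKA implementation.
final class TransferSketch {

    // "Training": learn the mean of each selected column from the unannotated rows.
    static double[] fitMeans(double[][] trainRows, int[] cols) {
        double[] means = new double[cols.length];
        for (double[] row : trainRows) {
            for (int j = 0; j < cols.length; j++) {
                means[j] += row[cols[j]];
            }
        }
        for (int j = 0; j < cols.length; j++) {
            means[j] /= trainRows.length;
        }
        return means;
    }

    // "Applying": replace the selected columns (assumed distinct) with a single
    // summary value, the mean deviation from the learned means, and keep the rest.
    static double[] transform(double[] row, int[] cols, double[] means) {
        boolean[] selected = new boolean[row.length];
        double sum = 0.0;
        for (int j = 0; j < cols.length; j++) {
            selected[cols[j]] = true;
            sum += row[cols[j]] - means[j];
        }
        double[] out = new double[row.length - cols.length + 1];
        out[0] = sum / cols.length;
        int k = 1;
        for (int i = 0; i < row.length; i++) {
            if (!selected[i]) {
                out[k++] = row[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] cols = {0, 1};
        double[][] unannotated = {{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}};
        double[] means = fitMeans(unannotated, cols);
        double[] annotated = {4.0, 6.0, 9.5}; // last value is an extra annotation
        System.out.println(Arrays.toString(transform(annotated, cols, means)));
    }
}
```

In the real filter, the row finder and column finder play the roles of the training rows and the cols array here.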
Valid options are:
-row-finder <value> Row finder which selects rows for training the attribute-summarising filter. (default: adams.data.weka.rowfinder.NullFinder)
-column-finder <value> Column finder which selects attributes to summarise. (default: adams.data.weka.columnfinder.NullFinder)
-summary-filter <value> The filter to use to summarise the attributes. (default: weka.filters.unsupervised.attribute.PrincipalComponentsJ -R 0.95 -A 5 -M -1)
-preserve-id-column <value> Whether the first column of the test data should be treated as a sample ID and kept in the first position of the output. (default: off)
-class-name <value> The name of the attribute to treat as the class for supervised filters. (default: )
-keep-supervised-class <value> Whether the class value for supervised filters should be kept in the resultant dataset or discarded. (default: off)
-output-debug-info If set, filter is run in debug mode and may output additional info to the console.
-do-not-check-capabilities If set, filter capabilities are not checked before filter is built (use with caution).
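As a rough sketch of how the options above might be assembled for setOptions(String[]), the following plain-Java helper builds an options array from the documented flags and default values. The class OptionsSketch and the method buildOptions are hypothetical. Passing the nested summary-filter specification as a single array element is an assumption based on the usual WEKA convention for nested option strings. The boolean options are left out because the listing does not make clear whether they take a value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only the flag names and default values come from the option list above.
final class OptionsSketch {

    static String[] buildOptions(String rowFinder, String columnFinder, String summaryFilter) {
        List<String> opts = new ArrayList<>();
        opts.add("-row-finder");
        opts.add(rowFinder);
        opts.add("-column-finder");
        opts.add(columnFinder);
        opts.add("-summary-filter");
        // Assumption: the nested filter and its own options travel as one element.
        opts.add(summaryFilter);
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(
            "adams.data.weka.rowfinder.NullFinder",
            "adams.data.weka.columnfinder.NullFinder",
            "weka.filters.unsupervised.attribute.PrincipalComponentsJ -R 0.95 -A 5 -M -1");
        System.out.println(String.join(" | ", options));
        // With WEKA and ADAMS on the classpath, the array would then be handed to
        // setOptions(String[]) on an AttributeSummaryTransferFilter instance.
    }
}
```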
- Author:
- Corey Sterling (csterlin at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field - Description):
protected BaseString m_ClassName - The class-attribute for supervised attribute filters.
protected ColumnFinder m_ColumnFinder - The column-finder which selects the attributes to summarise.
protected ColumnSplitter m_ColumnSplitter - Column-splitter for separating attributes to be summarised.
protected ColumnSplitter m_IDSplitter - Column-splitter for separating the ID column.
protected boolean m_KeepSupervisedClass - Whether to keep the supervised filter class or discard it.
protected Simple m_Merger - Merger for reconstructing partial datasets.
protected boolean m_PreserveIDColumn - Whether to treat the first attribute as an ID.
protected RowFinder m_RowFinder - The row-finder which separates training data from actual data.
protected RowSplitter m_RowSplitter - Row-splitter for splitting training and actual data.
protected weka.filters.Filter m_SummaryFilter - The filter which performs attribute summarising.
protected ColumnSplitter m_SupervisedClassSplitter - Column-splitter for removing the supervised filter class.
-
Constructor Summary
Constructors (Constructor - Description):
AttributeSummaryTransferFilter()
-
Method Summary
Methods (Modifier and Type, Method - Description):
boolean allowAccessToFullInputFormat() - Returns whether to allow the determineOutputFormat(Instances) method access to the full dataset rather than just the header.
String classNameTipText() - Gets the tip-text for the class-name option.
String columnFinderTipText() - Gets the tip-text for the column-finder option.
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat) - Determines the output format based on the input format and returns it.
protected weka.core.Instances formatOutput(weka.core.Instances filterOutput, weka.core.Instances theRest) - Handles merging of output datasets and formatting.
weka.core.Capabilities getCapabilities() - Returns the Capabilities of this filter.
BaseString getClassName() - Gets the name of the attribute to use as the class attribute for supervised summary filters.
ColumnFinder getColumnFinder() - Gets the column finder which selects the attributes for summarisation.
BaseString getDefaultClassName() - Gets the name of the default attribute to use as the class attribute for supervised summary filters.
ColumnFinder getDefaultColumnFinder() - Gets the default column finder which selects the attributes for summarisation.
RowFinder getDefaultRowFinder() - Gets the default training data row selector.
weka.filters.Filter getDefaultSummaryFilter() - Gets the default filter to use to summarise the attributes.
boolean getKeepSupervisedClass() - Gets whether to keep the class attribute of the summary attributes in the final dataset.
String[] getOptions() - Returns the options of the current setup.
boolean getPreserveIDColumn() - Gets whether the first non-summary attribute should be treated as an ID and moved to the first attribute position.
RowFinder getRowFinder() - Gets the training data row selector.
weka.filters.Filter getSummaryFilter() - Gets the filter to use to summarise the attributes.
String globalInfo() - Returns a string describing this filter.
String keepSupervisedClassTipText() - Gets the tip-text for the keep-supervised-class option.
Enumeration<weka.core.Option> listOptions() - Gets an enumeration describing the available options.
String preserveIDColumnTipText() - Gets the tip-text for the preserve-id-column option.
protected weka.core.Instances process(weka.core.Instances instances) - Processes the given data (may change the provided dataset) and returns the modified version.
String rowFinderTipText() - Gets the tip-text for the row-finder option.
void setClassName(BaseString value) - Sets the name of the attribute to use as the class attribute for supervised summary filters.
void setColumnFinder(ColumnFinder value) - Sets the column finder which selects the attributes for summarisation.
void setKeepSupervisedClass(boolean value) - Sets whether to keep the class attribute of the summary attributes in the final dataset.
void setOptions(String[] options) - Parses the options for this object.
void setPreserveIDColumn(boolean value) - Sets whether the first non-summary attribute should be treated as an ID and moved to the first attribute position.
void setRowFinder(RowFinder value) - Sets the training data row selector.
void setSummaryFilter(weka.filters.Filter value) - Sets the filter to use to summarise the attributes.
String summaryFilterTipText() - Gets the tip-text for the summary-filter option.
-
Methods inherited from class weka.filters.SimpleBatchFilter
batchFinished, hasImmediateOutputFormat, input, input
-
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, getRevision, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, main, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
-
Field Detail
-
m_RowFinder
protected RowFinder m_RowFinder
The row-finder which separates training data from actual data.
-
m_ColumnFinder
protected ColumnFinder m_ColumnFinder
The column-finder which selects the attributes to summarise.
-
m_SummaryFilter
protected weka.filters.Filter m_SummaryFilter
The filter which performs attribute summarising.
-
m_PreserveIDColumn
protected boolean m_PreserveIDColumn
Whether to treat the first attribute as an ID.
-
m_ClassName
protected BaseString m_ClassName
The class-attribute for supervised attribute filters.
-
m_KeepSupervisedClass
protected boolean m_KeepSupervisedClass
Whether to keep the supervised filter class or discard it.
-
m_Merger
protected Simple m_Merger
Merger for reconstructing partial datasets.
-
m_RowSplitter
protected RowSplitter m_RowSplitter
Row-splitter for splitting training and actual data.
-
m_ColumnSplitter
protected ColumnSplitter m_ColumnSplitter
Column-splitter for separating attributes to be summarised.
-
m_IDSplitter
protected ColumnSplitter m_IDSplitter
Column-splitter for separating the ID column.
-
m_SupervisedClassSplitter
protected ColumnSplitter m_SupervisedClassSplitter
Column-splitter for removing the supervised filter class.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing this filter.
- Specified by:
globalInfo in class weka.filters.SimpleFilter
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter GUI
-
listOptions
public Enumeration<weka.core.Option> listOptions()
Gets an enumeration describing the available options.
- Specified by:
listOptions in interface weka.core.OptionHandler
- Overrides:
listOptions in class weka.filters.Filter
- Returns:
- an enumeration of all the available options.
-
getOptions
public String[] getOptions()
Returns the options of the current setup.
- Specified by:
getOptions in interface weka.core.OptionHandler
- Overrides:
getOptions in class weka.filters.Filter
- Returns:
- the current options
-
setOptions
public void setOptions(String[] options) throws Exception
Parses the options for this object.
- Specified by:
setOptions in interface weka.core.OptionHandler
- Overrides:
setOptions in class weka.filters.Filter
- Parameters:
options - the options to use
- Throws:
Exception - if the option setting fails
-
getDefaultRowFinder
public RowFinder getDefaultRowFinder()
Gets the default training data row selector.
- Returns:
- The default training data row selector.
-
setRowFinder
public void setRowFinder(RowFinder value)
Sets the training data row selector.
- Parameters:
value - The training data row selector.
-
getRowFinder
public RowFinder getRowFinder()
Gets the training data row selector.
- Returns:
- The training data row selector.
-
rowFinderTipText
public String rowFinderTipText()
Gets the tip-text for the row-finder option.
- Returns:
- The tip-text as a string.
-
getDefaultColumnFinder
public ColumnFinder getDefaultColumnFinder()
Gets the default column finder which selects the attributes for summarisation.
- Returns:
- The default column finder.
-
setColumnFinder
public void setColumnFinder(ColumnFinder value)
Sets the column finder which selects the attributes for summarisation.
- Parameters:
value - The column finder.
-
getColumnFinder
public ColumnFinder getColumnFinder()
Gets the column finder which selects the attributes for summarisation.
- Returns:
- The column finder.
-
columnFinderTipText
public String columnFinderTipText()
Gets the tip-text for the column-finder option.
- Returns:
- The tip-text as a string.
-
getDefaultSummaryFilter
public weka.filters.Filter getDefaultSummaryFilter()
Gets the default filter to use to summarise the attributes.
- Returns:
- The default filter.
-
setSummaryFilter
public void setSummaryFilter(weka.filters.Filter value)
Sets the filter to use to summarise the attributes.
- Parameters:
value - The filter.
-
getSummaryFilter
public weka.filters.Filter getSummaryFilter()
Gets the filter to use to summarise the attributes.
- Returns:
- The filter.
-
summaryFilterTipText
public String summaryFilterTipText()
Gets the tip-text for the summary-filter option.
- Returns:
- The tip-text as a string.
-
setPreserveIDColumn
public void setPreserveIDColumn(boolean value)
Sets whether the first non-summary attribute should be treated as an ID and moved to the first attribute position.
- Parameters:
value - True to preserve the ID column, false to not.
-
getPreserveIDColumn
public boolean getPreserveIDColumn()
Gets whether the first non-summary attribute should be treated as an ID and moved to the first attribute position.
- Returns:
- True to preserve the ID column, false to not.
-
preserveIDColumnTipText
public String preserveIDColumnTipText()
Gets the tip-text for the preserve-id-column option.
- Returns:
- The tip-text as a string.
-
getDefaultClassName
public BaseString getDefaultClassName()
Gets the name of the default attribute to use as the class attribute for supervised summary filters.
- Returns:
- The default attribute name.
-
setClassName
public void setClassName(BaseString value)
Sets the name of the attribute to use as the class attribute for supervised summary filters.
- Parameters:
value - The attribute name.
-
getClassName
public BaseString getClassName()
Gets the name of the attribute to use as the class attribute for supervised summary filters.
- Returns:
- The attribute name.
-
classNameTipText
public String classNameTipText()
Gets the tip-text for the class-name option.
- Returns:
- The tip-text as a string.
-
setKeepSupervisedClass
public void setKeepSupervisedClass(boolean value)
Sets whether to keep the class attribute of the summary attributes in the final dataset.
- Parameters:
value - True to keep the attribute in the final dataset, false to discard it.
-
getKeepSupervisedClass
public boolean getKeepSupervisedClass()
Gets whether to keep the class attribute of the summary attributes in the final dataset.
- Returns:
- True to keep the attribute in the final dataset, false to discard it.
-
keepSupervisedClassTipText
public String keepSupervisedClassTipText()
Gets the tip-text for the keep-supervised-class option.
- Returns:
- The tip-text as a string.
-
allowAccessToFullInputFormat
public boolean allowAccessToFullInputFormat()
Returns whether to allow the determineOutputFormat(Instances) method access to the full dataset rather than just the header. Default implementation returns false.
- Overrides:
allowAccessToFullInputFormat in class weka.filters.SimpleBatchFilter
- Returns:
- whether determineOutputFormat has access to the full input dataset
-
getCapabilities
public weka.core.Capabilities getCapabilities()
Returns the Capabilities of this filter. Derived filters have to override this method to enable capabilities.
- Specified by:
getCapabilities in interface weka.core.CapabilitiesHandler
- Overrides:
getCapabilities in class weka.filters.Filter
- Returns:
- the capabilities of this object
- See Also:
Capabilities
-
determineOutputFormat
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat) throws Exception
Determines the output format based on the input format and returns it. If the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, this method is called from batchFinished().
- Specified by:
determineOutputFormat in class weka.filters.SimpleFilter
- Parameters:
inputFormat - the input format to base the output format on
- Returns:
- the output format
- Throws:
Exception - in case the determination goes wrong
- See Also:
SimpleBatchFilter.hasImmediateOutputFormat(), SimpleBatchFilter.batchFinished()
-
process
protected weka.core.Instances process(weka.core.Instances instances) throws Exception
Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
- Specified by:
process in class weka.filters.SimpleFilter
- Parameters:
instances - the data to process
- Returns:
- the modified data
- Throws:
Exception - in case the processing goes wrong
- See Also:
SimpleBatchFilter.batchFinished()
-
formatOutput
protected weka.core.Instances formatOutput(weka.core.Instances filterOutput, weka.core.Instances theRest)
Handles merging of output datasets and formatting. Optionally moves the ID attribute to the first position. Optionally removes the class attribute for supervised filters.
- Parameters:
filterOutput - The output of the attribute filter.
theRest - The part of the input that was attribute-reduced.
- Returns:
- The formatted dataset.
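The merging and formatting steps named in the formatOutput description can be sketched on plain lists of column names. Everything in this sketch is an assumption made for illustration: the class FormatOutputSketch, the method formatColumns, the merge order (filter output first, then the rest), and taking the ID from the first column of theRest. The actual method operates on weka.core.Instances and may order things differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch working on column names only; not the actual implementation.
final class FormatOutputSketch {

    static List<String> formatColumns(List<String> filterOutput, List<String> theRest,
                                      boolean preserveId, String className, boolean keepClass) {
        // Merge the two partial datasets (assumed order: filter output, then the rest).
        List<String> merged = new ArrayList<>(filterOutput);
        merged.addAll(theRest);

        // Optionally remove the class attribute used by a supervised summary filter.
        if (!keepClass && className != null && !className.isEmpty()) {
            merged.remove(className);
        }

        // Optionally move the ID column (assumed: first column of theRest) to the front.
        if (preserveId && !theRest.isEmpty()) {
            String id = theRest.get(0);
            if (merged.remove(id)) {
                merged.add(0, id);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> result = formatColumns(
            Arrays.asList("PC1", "PC2", "class"),
            Arrays.asList("id", "notes"),
            true, "class", false);
        System.out.println(result);
    }
}
```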
-
-