Class RemoveWithLabels
- java.lang.Object
- weka.filters.Filter
- weka.filters.SimpleFilter
- weka.filters.SimpleBatchFilter
- weka.filters.unsupervised.instance.RemoveWithLabels
- All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler
public class RemoveWithLabels extends weka.filters.SimpleBatchFilter
Allows the user to remove nominal labels via a regular expression.
Valid options are:
-index <value> The index of the attribute to process. An index is a number starting with 1; apart from attribute names (case-sensitive), the following placeholders can be used as well: first, second, third, last_2, last_1, last. Numeric indices can be enforced by preceding them with '#' (e.g., '#12'); attribute names can be surrounded by double quotes. (default: index=last, max=-1)
-label-regexp <value> The regular expression for matching the labels to remove. (default: ^(label1|label2|label3)$)
-invert If enabled, the matching sense is inverted, i.e., the matching labels are kept and all others removed.
-update-header If enabled, the labels also get removed from the attribute definition.
-output-debug-info If set, the filter is run in debug mode and may output additional info to the console.
-do-not-check-capabilities If set, filter capabilities are not checked before filter is built (use with caution).
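The interplay of -label-regexp and -invert described above can be illustrated with a small standalone sketch. It does not use the filter itself: the class LabelRegexSketch and the method labelsToRemove are hypothetical names, and the sketch assumes that the expression has to match the whole label, which the anchored default expression suggests but this page does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RemoveWithLabels.
final class LabelRegexSketch {

  /**
   * Returns the labels that would be removed.
   * Without invert, labels matching the expression are removed.
   * With invert, matching labels are kept and all others are removed.
   */
  static List<String> labelsToRemove(List<String> labels, String regexp, boolean invert) {
    Pattern pattern = Pattern.compile(regexp);
    List<String> removed = new ArrayList<>();
    for (String label : labels) {
      // assumption: the expression must match the entire label
      boolean matches = pattern.matcher(label).matches();
      if (matches != invert)
        removed.add(label);
    }
    return removed;
  }

  public static void main(String[] args) {
    List<String> labels = Arrays.asList("label1", "label2", "other");
    String regexp = "^(label1|label2|label3)$"; // the documented default
    System.out.println(labelsToRemove(labels, regexp, false)); // [label1, label2]
    System.out.println(labelsToRemove(labels, regexp, true));  // [other]
  }
}
```

With the default expression, only labels literally named label1, label2 or label3 are affected, so in practice the expression needs to be set to something that fits the dataset.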
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static String INDEX
- static String INVERT
- static String LABEL_REGEXP
- protected WekaAttributeIndex m_Index: the attribute to remove the labels from.
- protected boolean m_Invert: whether to invert the matching.
- protected Map<Integer,Integer> m_LabelMapping: the label mapping (old -> new).
- protected BaseRegExp m_LabelRegExp: the regular expression for matching the labels to remove.
- protected boolean m_UpdateHeader: whether to update the header.
- static String UPDATE_HEADER
-
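The m_LabelMapping field is described only as a mapping from old to new labels. A minimal sketch of how such a mapping could be derived is given here, assuming 0-based label indices and assuming that removed labels simply get no entry; neither detail is confirmed by this page, and LabelMappingSketch and buildMapping are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RemoveWithLabels.
final class LabelMappingSketch {

  /**
   * Maps the old index of every kept label to its index in the reduced
   * label list. Removed labels have no entry (an assumption of this sketch).
   */
  static Map<Integer, Integer> buildMapping(List<String> labels, String regexp, boolean invert) {
    Pattern pattern = Pattern.compile(regexp);
    Map<Integer, Integer> mapping = new LinkedHashMap<>();
    int next = 0;
    for (int old = 0; old < labels.size(); old++) {
      boolean matches = pattern.matcher(labels.get(old)).matches();
      // not inverted: keep non-matching labels; inverted: keep matching labels
      boolean keep = (matches == invert);
      if (keep)
        mapping.put(old, next++);
    }
    return mapping;
  }
}
```

For the labels a, b, c, d and the expression ^(b)$, this yields the entries 0 -> 0, 2 -> 1 and 3 -> 2, which is the kind of renumbering needed once a label disappears from the attribute definition.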
Constructor Summary
Constructors (Constructor, Description):
- RemoveWithLabels()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat): Determines the output format based on the input format and returns it.
- weka.core.Capabilities getCapabilities(): Returns the Capabilities of this filter.
- protected WekaAttributeIndex getDefaultIndex(): Returns the default attribute index.
- protected BaseRegExp getDefaultLabelRegExp(): Returns the default label regular expression.
- WekaAttributeIndex getIndex(): Returns the index of the attribute to process.
- boolean getInvert(): Returns whether to invert the matching sense.
- BaseRegExp getLabelRegExp(): Returns the regular expression for matching the labels to remove.
- String[] getOptions(): Gets the current settings of the filter.
- String getRevision(): Returns the revision string.
- boolean getUpdateHeader(): Returns whether to remove the labels also from the attribute definition.
- String globalInfo(): Returns a string describing this filter.
- String indexTipText(): Returns the tip text for this property.
- String invertTipText(): Returns the tip text for this property.
- String labelRegExpTipText(): Returns the tip text for this property.
- Enumeration listOptions(): Returns an enumeration describing the available options.
- static void main(String[] args): Main method for testing this class.
- protected weka.core.Instances process(weka.core.Instances instances): Processes the given data (may change the provided dataset) and returns the modified version.
- void setIndex(WekaAttributeIndex value): Sets the index of the attribute to process.
- void setInvert(boolean value): Sets whether to invert the matching sense.
- void setLabelRegExp(BaseRegExp value): Sets the regular expression for matching the labels to remove.
- void setOptions(String[] options): Parses a list of options for this object.
- void setUpdateHeader(boolean value): Sets whether to remove the labels also from the attribute definition.
- String updateHeaderTipText(): Returns the tip text for this property.
Methods inherited from class weka.filters.SimpleBatchFilter
allowAccessToFullInputFormat, batchFinished, hasImmediateOutputFormat, input, input
-
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
-
-
-
-
Field Detail
-
INDEX
public static final String INDEX
- See Also:
- Constant Field Values
-
LABEL_REGEXP
public static final String LABEL_REGEXP
- See Also:
- Constant Field Values
-
INVERT
public static final String INVERT
- See Also:
- Constant Field Values
-
UPDATE_HEADER
public static final String UPDATE_HEADER
- See Also:
- Constant Field Values
-
m_Index
protected WekaAttributeIndex m_Index
the attribute to remove the labels from.
-
m_LabelRegExp
protected BaseRegExp m_LabelRegExp
the regular expression for matching the labels to remove.
-
m_Invert
protected boolean m_Invert
whether to invert the matching.
-
m_UpdateHeader
protected boolean m_UpdateHeader
whether to update the header.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing this filter.
- Specified by: globalInfo in class weka.filters.SimpleFilter
- Returns: a description of the filter suitable for displaying in the Explorer/Experimenter GUI
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by: listOptions in interface weka.core.OptionHandler
- Overrides: listOptions in class weka.filters.Filter
- Returns: an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a list of options for this object.
- Specified by: setOptions in interface weka.core.OptionHandler
- Overrides: setOptions in class weka.filters.Filter
- Parameters: options - the list of options as an array of strings
- Throws: Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current settings of the filter.
- Specified by: getOptions in interface weka.core.OptionHandler
- Overrides: getOptions in class weka.filters.Filter
- Returns: an array of strings suitable for passing to setOptions
-
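The contract between setOptions and getOptions above (the array returned by one is suitable for passing to the other) can be sketched with a standalone holder for the four filter-specific flags. This is an illustration only: OptionsSketch is a hypothetical class, it does not reproduce how the filter or Weka parses options, and it ignores the inherited -output-debug-info and -do-not-check-capabilities flags.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical option holder, not the filter's actual implementation.
final class OptionsSketch {
  String index;
  String labelRegExp;
  boolean invert;
  boolean updateHeader;

  OptionsSketch() {
    reset();
  }

  /** Restores the documented defaults. */
  void reset() {
    index = "last";
    labelRegExp = "^(label1|label2|label3)$";
    invert = false;
    updateHeader = false;
  }

  /** Parses the flags; unknown flags are rejected. */
  void setOptions(String[] options) {
    reset();
    for (int i = 0; i < options.length; i++) {
      String opt = options[i];
      if (opt.equals("-index"))
        index = valueAt(options, ++i, opt);
      else if (opt.equals("-label-regexp"))
        labelRegExp = valueAt(options, ++i, opt);
      else if (opt.equals("-invert"))
        invert = true;
      else if (opt.equals("-update-header"))
        updateHeader = true;
      else
        throw new IllegalArgumentException("Unsupported option: " + opt);
    }
  }

  /** Returns the current settings in a form accepted by setOptions. */
  String[] getOptions() {
    List<String> result = new ArrayList<>();
    result.add("-index");
    result.add(index);
    result.add("-label-regexp");
    result.add(labelRegExp);
    if (invert)
      result.add("-invert");
    if (updateHeader)
      result.add("-update-header");
    return result.toArray(new String[0]);
  }

  private static String valueAt(String[] options, int pos, String flag) {
    if (pos >= options.length)
      throw new IllegalArgumentException("Missing value for " + flag);
    return options[pos];
  }
}
```

Feeding the result of getOptions back into setOptions on a fresh instance reproduces the same settings, which is the round trip the two method descriptions imply.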
getDefaultIndex
protected WekaAttributeIndex getDefaultIndex()
Returns the default attribute index.
- Returns: the default
-
setIndex
public void setIndex(WekaAttributeIndex value)
Sets the index of the attribute to process.
- Parameters: value - the index
-
getIndex
public WekaAttributeIndex getIndex()
Returns the index of the attribute to process.
- Returns: the index
-
indexTipText
public String indexTipText()
Returns the tip text for this property.
- Returns: tip text for this property suitable for displaying in the GUI or for listing the options.
-
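The index syntax accepted by the -index option (1-based numbers, the placeholders first to last, a '#' prefix to enforce a number, optional double quotes around names) can be sketched as a standalone resolver. Several points are assumptions rather than documented behavior: that last_1 and last_2 denote the second-to-last and third-to-last attribute, that an exact attribute name takes precedence over placeholders and numbers, and that an unresolvable index yields -1. AttributeIndexSketch and resolve are hypothetical names; the actual logic lives in WekaAttributeIndex.

```java
import java.util.List;

// Hypothetical resolver, not the WekaAttributeIndex implementation.
final class AttributeIndexSketch {

  /** Resolves an index string to a 0-based position, or -1 if it cannot be resolved. */
  static int resolve(String index, List<String> names) {
    int n = names.size();
    String s = index.trim();

    // double-quoted: always treated as an attribute name
    if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\""))
      return names.indexOf(s.substring(1, s.length() - 1));

    // '#' prefix: always treated as a 1-based number
    if (s.startsWith("#"))
      return fromOneBased(s.substring(1), n);

    // assumption: an exact, case-sensitive name match wins
    int byName = names.indexOf(s);
    if (byName >= 0)
      return byName;

    switch (s) {
      case "first":  return inRange(0, n);
      case "second": return inRange(1, n);
      case "third":  return inRange(2, n);
      case "last":   return inRange(n - 1, n);
      case "last_1": return inRange(n - 2, n); // assumption: second to last
      case "last_2": return inRange(n - 3, n); // assumption: third to last
      default:       return fromOneBased(s, n);
    }
  }

  private static int fromOneBased(String s, int n) {
    try {
      return inRange(Integer.parseInt(s) - 1, n);
    } catch (NumberFormatException e) {
      return -1;
    }
  }

  private static int inRange(int pos, int n) {
    return (pos >= 0 && pos < n) ? pos : -1;
  }
}
```

The '#' prefix matters when an attribute is itself named like a number: with attributes a, b, 12, d, the string 12 resolves to the third attribute by name, while #3 addresses the same attribute by position.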
getDefaultLabelRegExp
protected BaseRegExp getDefaultLabelRegExp()
Returns the default label regular expression.
- Returns: the default
-
setLabelRegExp
public void setLabelRegExp(BaseRegExp value)
Sets the regular expression for matching the labels to remove.
- Parameters: value - the expression
-
getLabelRegExp
public BaseRegExp getLabelRegExp()
Returns the regular expression for matching the labels to remove.
- Returns: the expression
-
labelRegExpTipText
public String labelRegExpTipText()
Returns the tip text for this property.
- Returns: tip text for this property suitable for displaying in the GUI or for listing the options.
-
setInvert
public void setInvert(boolean value)
Sets whether to invert the matching sense.
- Parameters: value - true to invert the matching sense
-
getInvert
public boolean getInvert()
Returns whether to invert the matching sense.
- Returns: true if the matching sense is inverted
-
invertTipText
public String invertTipText()
Returns the tip text for this property.
- Returns: tip text for this property suitable for displaying in the GUI or for listing the options.
-
setUpdateHeader
public void setUpdateHeader(boolean value)
Sets whether to remove the labels also from the attribute definition.
- Parameters: value - true to update the header
-
getUpdateHeader
public boolean getUpdateHeader()
Returns whether to remove the labels also from the attribute definition.
- Returns: true if the header gets updated
-
updateHeaderTipText
public String updateHeaderTipText()
Returns the tip text for this property.
- Returns: tip text for this property suitable for displaying in the GUI or for listing the options.
-
getCapabilities
public weka.core.Capabilities getCapabilities()
Returns the Capabilities of this filter.
- Specified by: getCapabilities in interface weka.core.CapabilitiesHandler
- Overrides: getCapabilities in class weka.filters.Filter
- Returns: the capabilities of this object
- See Also: Capabilities
-
determineOutputFormat
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat) throws Exception
Determines the output format based on the input format and returns it. If the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, this method is called from batchFinished().
- Specified by: determineOutputFormat in class weka.filters.SimpleFilter
- Parameters: inputFormat - the input format to base the output format on
- Returns: the output format
- Throws: Exception - in case the determination goes wrong
- See Also: SimpleBatchFilter.hasImmediateOutputFormat(), SimpleBatchFilter.batchFinished()
-
process
protected weka.core.Instances process(weka.core.Instances instances) throws Exception
Processes the given data (may change the provided dataset) and returns the modified version. This method is called in batchFinished().
- Specified by: process in class weka.filters.SimpleFilter
- Parameters: instances - the data to process
- Returns: the modified data
- Throws: Exception - in case the processing goes wrong
- See Also: SimpleBatchFilter.batchFinished()
-
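This page does not spell out what process does to the individual rows. Since the filter lives in the unsupervised.instance package, one plausible reading is that rows carrying a removed label are dropped and, when the header is updated, the remaining rows are re-pointed to the renumbered labels. The sketch below works on plain int arrays under exactly those assumptions; ProcessSketch is a hypothetical class, the mapping parameter stands in for something like m_LabelMapping, and missing values are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the filter's actual process() method.
final class ProcessSketch {

  /**
   * @param rows         the data; rows.get(i)[att] holds the old label index
   * @param att          the 0-based position of the nominal attribute
   * @param mapping      old label index -> new label index; removed labels have no entry
   * @param updateHeader whether the labels were also removed from the attribute definition
   * @return copies of the rows whose label is kept
   */
  static List<int[]> process(List<int[]> rows, int att, Map<Integer, Integer> mapping, boolean updateHeader) {
    List<int[]> result = new ArrayList<>();
    for (int[] row : rows) {
      Integer newIndex = mapping.get(row[att]);
      if (newIndex == null)
        continue; // assumption: a row with a removed label is dropped
      int[] copy = row.clone(); // leave the input rows untouched
      if (updateHeader)
        copy[att] = newIndex; // the label list shrank, so use the new index
      result.add(copy);
    }
    return result;
  }
}
```

Without the header update the label indices stay as they were, because the attribute definition still lists every original label.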
getRevision
public String getRevision()
Returns the revision string.
- Specified by: getRevision in interface weka.core.RevisionHandler
- Overrides: getRevision in class weka.filters.Filter
- Returns: the revision
-
main
public static void main(String[] args)
Main method for testing this class.
- Parameters: args - should contain arguments to the filter: use -h for help
-
-