Class NominalToNumeric
- java.lang.Object
  - weka.filters.Filter
    - weka.filters.SimpleFilter
      - weka.filters.SimpleStreamFilter
        - weka.filters.unsupervised.attribute.NominalToNumeric
- All Implemented Interfaces:
Serializable, weka.core.CapabilitiesHandler, weka.core.CapabilitiesIgnorer, weka.core.CommandlineRunnable, weka.core.OptionHandler, weka.core.RevisionHandler, weka.filters.StreamableFilter
public class NominalToNumeric extends weka.filters.SimpleStreamFilter
Converts a nominal attribute into a numeric one. The filter can either use the internal representation of the labels as the numeric value or parse the label itself (a subset of the label can be extracted via a regular expression).
Valid options are:
-index <value> The index of the attribute to convert. An index is a number starting with 1. Apart from attribute names (case-sensitive), the following placeholders can be used as well: first, second, third, last_2, last_1, last. Numeric indices can be enforced by preceding them with '#' (e.g., '#12'); attribute names can be surrounded by double quotes. (default: index=last, max=-1)
-type <value> The type of conversion to perform. (default: INTERNAL_REPRESENTATION)
-find <value> The regular expression to use for extracting the numeric part from the label; use .* to match the label as a whole. (default: .*)
-replace <value> The expression to use for assembling the numeric part; use $0 to use the label as is. (default: $0)
-output-debug-info If set, the filter is run in debug mode and may output additional info to the console.
-do-not-check-capabilities If set, filter capabilities are not checked before the filter is built (use with caution).
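The options above can be collected into a flag/value array of the kind accepted by setOptions(String[]). The following is a minimal sketch that only assembles the documented defaults; the class name NominalToNumericOptionsSketch and the helper defaultOptions() are hypothetical and not part of the filter, and the sketch does not invoke the filter itself.

```java
import java.util.Arrays;

// Sketch only: assembles the documented default options as a String array.
// The class and method names are hypothetical; nothing here calls the filter.
class NominalToNumericOptionsSketch {

    /** Returns the documented defaults as alternating flag/value entries. */
    static String[] defaultOptions() {
        return new String[] {
            "-index", "last",                     // attribute to convert (documented default)
            "-type", "INTERNAL_REPRESENTATION",   // conversion type (documented default)
            "-find", ".*",                        // match the label as a whole
            "-replace", "$0"                      // use the label as is
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(defaultOptions()));
    }
}
```

With the library on the classpath, an array like this would typically be handed to setOptions(String[]), which is documented to throw an Exception if an option is not supported.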
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- static class NominalToNumeric.ConversionType: Enumeration of conversion types.
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static String FIND
- static String INDEX
- protected int m_AttIndex: the attribute index.
- protected adams.core.base.BaseRegExp m_Find: the regular expression to use.
- protected WekaAttributeIndex m_Index: the attribute to convert.
- protected Map<String,Double> m_Mapping: the mapping between label and new value.
- protected String m_Replace: the replacement string.
- protected NominalToNumeric.ConversionType m_Type: the type of conversion to perform.
- static String REPLACE
- static String TYPE
-
Constructor Summary
Constructors (Constructor, Description):
- NominalToNumeric()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat): Determines the output format based on the input format and returns it.
- String findTipText(): Returns the tip text for this property.
- weka.core.Capabilities getCapabilities()
- protected adams.core.base.BaseRegExp getDefaultFind(): Returns the default regular expression for extracting the numeric part from the label.
- protected WekaAttributeIndex getDefaultIndex(): Returns the default attribute index.
- protected String getDefaultReplace(): Returns the default expression for assembling the numeric part.
- protected NominalToNumeric.ConversionType getDefaultType(): Returns the default conversion type.
- adams.core.base.BaseRegExp getFind(): Returns the regular expression to use for extracting the numeric part from the label.
- WekaAttributeIndex getIndex(): Returns the index of the attribute to convert.
- String[] getOptions(): Gets the current option settings for the OptionHandler.
- String getReplace(): Returns the expression to use for assembling the numeric part.
- String getRevision(): Returns the revision string.
- NominalToNumeric.ConversionType getType(): Returns the conversion type to use.
- String globalInfo(): Returns a string describing this filter.
- String indexTipText(): Returns the tip text for this property.
- Enumeration listOptions(): Returns an enumeration describing the available options.
- static void main(String[] args): Main method for testing this class.
- protected weka.core.Instance process(weka.core.Instance instance): Processes the given instance (may change the provided instance) and returns the modified version.
- String replaceTipText(): Returns the tip text for this property.
- protected void reset(): Resets the filter.
- void setFind(adams.core.base.BaseRegExp value): Sets the regular expression to use for extracting the numeric part from the label.
- void setIndex(WekaAttributeIndex value): Sets the index of the attribute to convert.
- void setOptions(String[] options): Sets the OptionHandler's options using the given list.
- void setReplace(String value): Sets the expression to use for assembling the numeric part.
- void setType(NominalToNumeric.ConversionType value): Sets the conversion type to use.
- String typeTipText(): Returns the tip text for this property.
Methods inherited from class weka.filters.SimpleStreamFilter
batchFinished, hasImmediateOutputFormat, input, preprocess, process
-
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
-
Field Detail
-
INDEX
public static final String INDEX
- See Also:
- Constant Field Values
-
TYPE
public static final String TYPE
- See Also:
- Constant Field Values
-
FIND
public static final String FIND
- See Also:
- Constant Field Values
-
REPLACE
public static final String REPLACE
- See Also:
- Constant Field Values
-
m_Index
protected WekaAttributeIndex m_Index
the attribute to convert.
-
m_Type
protected NominalToNumeric.ConversionType m_Type
the type of conversion to perform.
-
m_Find
protected adams.core.base.BaseRegExp m_Find
the regular expression to use.
-
m_Replace
protected String m_Replace
the replacement string.
-
m_AttIndex
protected int m_AttIndex
the attribute index.
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing this filter.
- Specified by:
- globalInfo in class weka.filters.SimpleFilter
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter GUI
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface weka.core.OptionHandler
- Overrides:
- listOptions in class weka.filters.Filter
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e., incremental setting of options is not possible).
- Specified by:
- setOptions in interface weka.core.OptionHandler
- Overrides:
- setOptions in class weka.filters.Filter
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current option settings for the OptionHandler.
- Specified by:
- getOptions in interface weka.core.OptionHandler
- Overrides:
- getOptions in class weka.filters.Filter
- Returns:
- the list of current option settings as an array of strings
-
reset
protected void reset()
Resets the filter.
- Overrides:
- reset in class weka.filters.SimpleFilter
-
getDefaultIndex
protected WekaAttributeIndex getDefaultIndex()
Returns the default attribute index.
- Returns:
- the default
-
setIndex
public void setIndex(WekaAttributeIndex value)
Sets the index of the attribute to convert.
- Parameters:
- value - the index
-
getIndex
public WekaAttributeIndex getIndex()
Returns the index of the attribute to convert.
- Returns:
- the index
-
indexTipText
public String indexTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultType
protected NominalToNumeric.ConversionType getDefaultType()
Returns the default conversion type.
- Returns:
- the default
-
setType
public void setType(NominalToNumeric.ConversionType value)
Sets the conversion type to use.
- Parameters:
- value - the type
-
getType
public NominalToNumeric.ConversionType getType()
Returns the conversion type to use.
- Returns:
- the type
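For the default type, INTERNAL_REPRESENTATION, the numeric value is taken from the internal representation of the label. Weka stores a nominal value as the zero-based position of its label in the attribute's declared label list, so the following minimal sketch models the conversion on that convention. It is plain Java and does not use the filter; the class InternalRepresentationSketch and the method internalValue are hypothetical, and it is an assumption that the filter outputs exactly this index.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: models a nominal label's internal value as its zero-based
// position in the declared label list. Names are hypothetical.
class InternalRepresentationSketch {

    /** Returns the position of the label among the declared labels, as a double. */
    static double internalValue(List<String> labels, String label) {
        int index = labels.indexOf(label);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("low", "medium", "high");
        System.out.println(internalValue(labels, "high")); // prints 2.0
    }
}
```

Note that the resulting numbers reflect declaration order only, so they carry a meaningful ordering only if the labels were declared in a meaningful order.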
-
typeTipText
public String typeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultFind
protected adams.core.base.BaseRegExp getDefaultFind()
Returns the default regular expression for extracting the numeric part from the label.
- Returns:
- the default
-
setFind
public void setFind(adams.core.base.BaseRegExp value)
Sets the regular expression to use for extracting the numeric part from the label.
- Parameters:
- value - the regexp
-
getFind
public adams.core.base.BaseRegExp getFind()
Returns the regular expression to use for extracting the numeric part from the label.
- Returns:
- the regexp
-
findTipText
public String findTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getDefaultReplace
protected String getDefaultReplace()
Returns the default expression for assembling the numeric part.
- Returns:
- the default
-
setReplace
public void setReplace(String value)
Sets the expression to use for assembling the numeric part.
- Parameters:
- value - the expression
-
getReplace
public String getReplace()
Returns the expression to use for assembling the numeric part.
- Returns:
- the expression
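The find and replace properties work as a pair: the regular expression selects the numeric part of a label, and the replacement expression assembles the string that is then read as a number. The following minimal sketch illustrates that idea with the standard String.replaceAll and Double.parseDouble. It is an assumption that the filter behaves this way internally; the class LabelParsingSketch, the method parseLabel, and the sample labels are hypothetical.

```java
// Sketch only: extracts a number from a label with a find/replace pair,
// using java.lang.String.replaceAll. Names and sample labels are hypothetical.
class LabelParsingSketch {

    /**
     * Applies the find expression to the label, substitutes the replace
     * expression, and parses the result as a double.
     */
    static double parseLabel(String label, String find, String replace) {
        String numericPart = label.replaceAll(find, replace);
        return Double.parseDouble(numericPart);
    }

    public static void main(String[] args) {
        // Capture group 1 holds the digits; "$1" keeps only that group.
        System.out.println(parseLabel("size_12", "size_(\\d+)", "$1")); // prints 12.0
        // Documented defaults: ".*" matches the whole label, "$0" keeps it as is.
        System.out.println(parseLabel("3.5", ".*", "$0"));              // prints 3.5
    }
}
```

In this sketch, a label whose assembled string is not a valid number raises a NumberFormatException; how the filter itself treats such labels is not stated in this documentation.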
-
replaceTipText
public String replaceTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getCapabilities
public weka.core.Capabilities getCapabilities()
- Specified by:
- getCapabilities in interface weka.core.CapabilitiesHandler
- Overrides:
- getCapabilities in class weka.filters.Filter
-
determineOutputFormat
protected weka.core.Instances determineOutputFormat(weka.core.Instances inputFormat) throws Exception
Determines the output format based on the input format and returns it. In case the output format cannot be returned immediately, i.e., hasImmediateOutputFormat() returns false, this method will be called from batchFinished() after the call of preprocess(Instances), in which, e.g., statistics for the actual processing step can be gathered.
- Specified by:
- determineOutputFormat in class weka.filters.SimpleStreamFilter
- Parameters:
- inputFormat - the input format to base the output format on
- Returns:
- the output format
- Throws:
- Exception - in case the determination goes wrong
-
process
protected weka.core.Instance process(weka.core.Instance instance) throws Exception
Processes the given instance (may change the provided instance) and returns the modified version.
- Specified by:
- process in class weka.filters.SimpleStreamFilter
- Parameters:
- instance - the instance to process
- Returns:
- the modified data
- Throws:
- Exception - in case the processing goes wrong
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface weka.core.RevisionHandler
- Overrides:
- getRevision in class weka.filters.Filter
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method for testing this class.
- Parameters:
- args - should contain arguments to the filter: use -h for help
-