Package weka.core.tokenizers.cleaners
Class NormalizeDuplicateChars
- java.lang.Object
  - weka.core.tokenizers.cleaners.AbstractTokenCleaner
    - weka.core.tokenizers.cleaners.NormalizeDuplicateChars
- All Implemented Interfaces:
  Serializable, weka.core.OptionHandler, TokenCleaner
public class NormalizeDuplicateChars extends AbstractTokenCleaner
Replaces each run of duplicate characters with a single one, e.g. 'oooooh noooo!!!!' becomes 'oh no!'.
- Version:
- $Revision$
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
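The behaviour described above can be illustrated with a small standalone sketch. This is not the source of NormalizeDuplicateChars; it is one plausible way to get the documented result, assuming that "duplicate characters" means runs of identical adjacent characters. The class name DuplicateCharsSketch and the method collapse are hypothetical and exist only for this example.

```java
import java.util.regex.Pattern;

/** Standalone illustration; NOT the Weka class itself. */
class DuplicateCharsSketch {

    // One captured character followed by one or more repeats of itself.
    // DOTALL lets '.' match line terminators as well.
    private static final Pattern RUN = Pattern.compile("(.)\\1+", Pattern.DOTALL);

    /** Collapses every run of identical adjacent characters to one character. */
    static String collapse(String token) {
        if (token == null) {
            return null;
        }
        return RUN.matcher(token).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("oooooh noooo!!!!")); // prints: oh no!
    }
}
```

Under this assumption, legitimate double letters are collapsed too ('hello' becomes 'helo'), so a cleaner of this kind is better suited to informal text such as social media posts than to general prose.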
Constructor Summary
Constructors
- NormalizeDuplicateChars()
Method Summary
Instance methods (all concrete):
- String clean(String token): Cleans the token; returns the clean token, or null if the token should be ignored.
- String globalInfo(): Returns a string describing the cleaner.
Methods inherited from class weka.core.tokenizers.cleaners.AbstractTokenCleaner
getOptions, listOptions, reset, setOptions
Method Detail
globalInfo
public String globalInfo()
Returns a string describing the cleaner.
- Specified by:
- globalInfo in class AbstractTokenCleaner
- Returns:
- a description suitable for displaying in the Explorer/Experimenter GUI
clean
public String clean(String token)
Cleans the token, returning the clean token or null if the token should be ignored.
- Specified by:
- clean in interface TokenCleaner
- Specified by:
- clean in class AbstractTokenCleaner
- Parameters:
- token - the token to check
- Returns:
- the clean token or null to ignore
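The null return value means a caller has to filter as well as transform. The sketch below shows one way a caller might honour that contract. It does not depend on Weka: java.util.function.UnaryOperator stands in for TokenCleaner, and the class CleanContractSketch, the method cleanAll, and the sample cleaner are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Illustrates the "clean token or null to ignore" contract; not Weka code. */
class CleanContractSketch {

    /** Applies the cleaner to each token and drops tokens for which it returns null. */
    static List<String> cleanAll(List<String> tokens, UnaryOperator<String> cleaner) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String cleaned = cleaner.apply(token);
            if (cleaned != null) {
                kept.add(cleaned);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical cleaner: ignore empty tokens, collapse repeated characters otherwise.
        UnaryOperator<String> cleaner =
            t -> t.isEmpty() ? null : t.replaceAll("(.)\\1+", "$1");
        System.out.println(cleanAll(Arrays.asList("noooo", "", "ok!!"), cleaner)); // prints: [no, ok!]
    }
}
```

With the real class, the same loop would presumably call clean(String) on a NormalizeDuplicateChars instance in place of the lambda.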