Package weka.core.tokenizers.cleaners
Class NormalizeURLs
java.lang.Object
  weka.core.tokenizers.cleaners.AbstractTokenCleaner
    weka.core.tokenizers.cleaners.NormalizeURLs

All Implemented Interfaces:
Serializable, weka.core.OptionHandler, weka.core.tokenizers.cleaners.TokenCleaner

public class NormalizeURLs extends weka.core.tokenizers.cleaners.AbstractTokenCleaner
Replaces all URLs with the same dummy URL.
- Version: $Revision$
- Author: FracPete (fracpete at waikato dot ac dot nz)
- See Also: Serialized Form

Constructor Summary
Constructor        Description
NormalizeURLs()

Method Summary
Modifier and Type   Method                Description
String              clean(String token)   Cleans the token, replacing URLs with the dummy URL.
String              globalInfo()          Returns a string describing the cleaner.
protected void      reset()               Resets the cleaner.

Field Detail

URL
public static final String URL
The URL to replace all URLs with.
- See Also: Constant Field Values

PATTERN
public static final String PATTERN
The pattern to match.
- See Also: Constant Field Values

m_Pattern
protected transient Pattern m_Pattern
The compiled pattern.

Method Detail

globalInfo
public String globalInfo()
Returns a string describing the cleaner.
- Specified by: globalInfo in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
- Returns: a description suitable for displaying in the explorer/experimenter GUI

reset
protected void reset()
Resets the cleaner.
- Overrides: reset in class weka.core.tokenizers.cleaners.AbstractTokenCleaner

clean
public String clean(String token)
Cleans the token. For this cleaner, that means replacing URLs in the token with the dummy URL.
- Specified by: clean in interface weka.core.tokenizers.cleaners.TokenCleaner
- Specified by: clean in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
- Parameters: token - the token to clean
- Returns: the clean token or null to ignore
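
The interplay of the three fields and the two methods can be illustrated with a small stand-alone sketch. This is not the Weka source: the class name UrlNormalizerSketch, the DUMMY_URL value, and the deliberately simple URL_PATTERN are assumptions made for illustration (the real values are listed under Constant Field Values), and whether the real class compiles its pattern lazily is likewise assumed here.

```java
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in for a URL-normalizing token cleaner; not the Weka class. */
public class UrlNormalizerSketch implements Serializable {

  private static final long serialVersionUID = 1L;

  /** Hypothetical dummy URL (plays the role of the URL constant). */
  public static final String DUMMY_URL = "http://dummy.invalid/";

  /** Hypothetical pattern (plays the role of PATTERN): a scheme, then non-whitespace. */
  public static final String URL_PATTERN = "(https?|ftp)://\\S+";

  /** Compiled pattern (plays the role of m_Pattern); transient, so null after deserialization. */
  protected transient Pattern compiled;

  /** Drops the compiled pattern; the next clean() call compiles it again. */
  protected void reset() {
    compiled = null;
  }

  /** Returns the token with every match of the pattern replaced by the dummy URL. */
  public String clean(String token) {
    if (compiled == null) {
      compiled = Pattern.compile(URL_PATTERN);
    }
    // quoteReplacement keeps '$' and '\' in the replacement from being treated specially
    return compiled.matcher(token).replaceAll(Matcher.quoteReplacement(DUMMY_URL));
  }

  public static void main(String[] args) {
    UrlNormalizerSketch cleaner = new UrlNormalizerSketch();
    System.out.println(cleaner.clean("https://www.cs.waikato.ac.nz/ml/weka/")); // prints "http://dummy.invalid/"
    System.out.println(cleaner.clean("weka"));                                  // prints "weka"
  }
}
```

In this sketch, tokens without a URL pass through unchanged, so every distinct URL collapses to a single token while ordinary words are left alone.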