Class RegExp

  • All Implemented Interfaces:
    Serializable, weka.core.OptionHandler, weka.core.tokenizers.cleaners.TokenCleaner

    public class RegExp
    extends weka.core.tokenizers.cleaners.AbstractTokenCleaner
    Cleans tokens based on regular expressions: if a token matches the regular expression, it is replaced with the specified expression.
    Version:
    $Revision$
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
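    The find/replace behavior described above can be sketched with the standard java.util.regex API. This is a minimal, self-contained illustration and not the class's actual implementation: the class name RegExpCleanerSketch is hypothetical, and the choice to replace every match inside the token (rather than only tokens that match as a whole) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the real weka RegExp cleaner.
public class RegExpCleanerSketch {
    private final Pattern find;   // regular expression for finding text to clean
    private final String replace; // expression that replaces each match

    public RegExpCleanerSketch(String find, String replace) {
        this.find = Pattern.compile(find);
        this.replace = replace;
    }

    /** Replaces every match of the find pattern; tokens without a match pass through unchanged. */
    public String clean(String token) {
        // Assumption: replace all matches within the token.
        return find.matcher(token).replaceAll(replace);
    }

    public static void main(String[] args) {
        // Strip everything that is not a letter or digit.
        RegExpCleanerSketch cleaner = new RegExpCleanerSketch("[^a-zA-Z0-9]", "");
        System.out.println(cleaner.clean("hello!")); // prints "hello"
        System.out.println(cleaner.clean("world"));  // prints "world"
    }
}
```

    Because the sketch relies on Matcher.replaceAll, the replacement string may contain group references such as $1; whether the real cleaner treats its replace expression the same way is not stated in this documentation.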
    • Constructor Detail

      • RegExp

        public RegExp()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing the cleaner.
        Specified by:
        globalInfo in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
        Returns:
        a description suitable for displaying in the Explorer/Experimenter GUI
      • listOptions

        public Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Overrides:
        listOptions in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(String[] options)
                        throws Exception
        Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Overrides:
        setOptions in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
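        The "set (or reset)" semantics can be illustrated with a small stand-alone sketch. Everything in it is an assumption made for illustration: the class OptionResetSketch, the flag names -find and -replace, and the empty-string defaults are invented here and are not taken from this class. The flags actually supported are those reported by listOptions().

```java
import java.util.Arrays;

// Hypothetical illustration of non-incremental option setting; not the real class.
public class OptionResetSketch {
    // Assumed defaults; the real ones come from getDefaultFind()/getDefaultReplace().
    static final String DEFAULT_FIND = "";
    static final String DEFAULT_REPLACE = "";

    String find = DEFAULT_FIND;
    String replace = DEFAULT_REPLACE;

    /** Sets every option from the array; options that are absent fall back to their defaults. */
    void setOptions(String[] options) {
        find = valueOf("-find", options, DEFAULT_FIND);          // "-find" is an invented flag
        replace = valueOf("-replace", options, DEFAULT_REPLACE); // "-replace" is an invented flag
    }

    /** Returns the value following the flag, or the fallback if the flag is missing. */
    static String valueOf(String flag, String[] options, String fallback) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return fallback;
    }

    /** Returns the current settings in the same flag/value layout that setOptions reads. */
    String[] getOptions() {
        return new String[] {"-find", find, "-replace", replace};
    }

    public static void main(String[] args) {
        OptionResetSketch cleaner = new OptionResetSketch();
        cleaner.setOptions(new String[] {"-find", "[0-9]+", "-replace", "#"});
        // The second call omits -find, so find is reset to its default rather than kept.
        cleaner.setOptions(new String[] {"-replace", "N"});
        System.out.println(Arrays.toString(cleaner.getOptions()));
    }
}
```

        To change a single option on a live object without resetting the others, the dedicated setters (setFind, setReplace) are the safer route.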
      • getOptions

        public String[] getOptions()
        Gets the current option settings for the OptionHandler.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Overrides:
        getOptions in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
        Returns:
        the list of current option settings as an array of strings
      • reset

        protected void reset()
        Resets the cleaner.
        Overrides:
        reset in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
      • getDefaultFind

        protected String getDefaultFind()
        Returns the default regular expression for finding tokens to clean.
        Returns:
        the default regular expression
      • setFind

        public void setFind(String value)
        Sets the regular expression to use for finding tokens to clean.
        Parameters:
        value - the regexp
      • getFind

        public String getFind()
        Returns the regular expression to use for finding tokens to clean.
        Returns:
        the regexp
      • findTipText

        public String findTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • getDefaultReplace

        protected String getDefaultReplace()
        Returns the default expression with which matching tokens are replaced.
        Returns:
        the default replacement expression
      • setReplace

        public void setReplace(String value)
        Sets the expression with which matching tokens are replaced.
        Parameters:
        value - the expression
      • getReplace

        public String getReplace()
        Returns the expression with which matching tokens are replaced.
        Returns:
        the expression
      • replaceTipText

        public String replaceTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • clean

        public String clean(String token)
        Cleans the token, replacing text that matches the find expression with the replace expression.
        Specified by:
        clean in interface weka.core.tokenizers.cleaners.TokenCleaner
        Specified by:
        clean in class weka.core.tokenizers.cleaners.AbstractTokenCleaner
        Parameters:
        token - the token to clean
        Returns:
        the cleaned token, or null if the token should be ignored
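        The null return value means a caller has to filter while it cleans. A minimal sketch of such a caller follows. The names CleanCallerSketch, cleanAll, and dropDigits are hypothetical, a UnaryOperator stands in for the TokenCleaner interface, and the example cleaner's rule of returning null for tokens that end up empty is an assumption; this documentation does not say when, or whether, RegExp itself returns null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical caller; UnaryOperator<String> stands in for a TokenCleaner's clean method.
public class CleanCallerSketch {

    /** Applies the cleaner to each token and skips tokens for which it returns null. */
    static List<String> cleanAll(List<String> tokens, UnaryOperator<String> cleaner) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            String cleaned = cleaner.apply(token);
            if (cleaned != null) {
                result.add(cleaned);
            }
        }
        return result;
    }

    /** Example cleaner (assumed behavior): removes digits and rejects tokens left empty. */
    static String dropDigits(String token) {
        String cleaned = token.replaceAll("[0-9]", "");
        return cleaned.isEmpty() ? null : cleaned;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("abc1", "42", "x");
        // "42" becomes empty, so the example cleaner returns null and the token is dropped.
        System.out.println(cleanAll(tokens, CleanCallerSketch::dropDigits)); // prints [abc, x]
    }
}
```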