Package adams.flow.transformer
Class Tokenize
- java.lang.Object
-
- adams.core.logging.LoggingObject
-
- adams.core.logging.CustomLoggingLevelObject
-
- adams.core.option.AbstractOptionHandler
-
- adams.flow.core.AbstractActor
-
- adams.flow.transformer.AbstractTransformer
-
- adams.flow.transformer.AbstractArrayProvider
-
- adams.flow.transformer.Tokenize
-
- All Implemented Interfaces:
adams.core.AdditionalInformationHandler
,adams.core.ArrayProvider
,adams.core.CleanUpHandler
,adams.core.Destroyable
,adams.core.GlobalInfoSupporter
,adams.core.logging.LoggingLevelHandler
,adams.core.logging.LoggingSupporter
,adams.core.option.OptionHandler
,adams.core.QuickInfoSupporter
,adams.core.ShallowCopySupporter<adams.flow.core.Actor>
,adams.core.SizeOfHandler
,adams.core.Stoppable
,adams.core.StoppableWithFeedback
,adams.core.VariablesInspectionHandler
,adams.event.VariableChangeListener
,adams.flow.core.Actor
,adams.flow.core.ArrayProvider
,adams.flow.core.ErrorHandler
,adams.flow.core.InputConsumer
,adams.flow.core.OutputProducer
,Serializable
,Comparable
public class Tokenize extends adams.flow.transformer.AbstractArrayProvider
Splits document strings into sentences using the specified document to sentences tokenizer.
Input/output:
- accepts:
java.lang.String
- generates:
java.lang.String
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING
-name <java.lang.String> (property: name) The name of the actor. default: DocumentToSentences
-annotation <adams.core.base.BaseText> (property: annotations) The annotations to attach to this actor. default:
-skip <boolean> (property: skip) If set to true, transformation is skipped and the input token is just forwarded as it is. default: false
-stop-flow-on-error <boolean> (property: stopFlowOnError) If set to true, the flow gets stopped in case this actor encounters an error; useful for critical actors. default: false
-output-array <boolean> (property: outputArray) Whether to output the sentences as array or one-by-one. default: false
-tokenizer <adams.flow.transformer.tokenizer.AbstractTokenizer> (property: tokenizer) The tokenizer to use for generating sentence strings from document strings. default: adams.flow.transformer.tokenizer.StanfordPTBTokenizer
- Version:
- $Revision: 11957 $
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields Modifier and Type Field Description protected AbstractTokenizer
m_Tokenizer
the tokenizer to use.-
Fields inherited from class adams.flow.transformer.AbstractArrayProvider
BACKUP_INDEX, BACKUP_QUEUE, m_Index, m_OutputArray, m_Queue
-
Fields inherited from class adams.flow.transformer.AbstractTransformer
BACKUP_INPUT, BACKUP_OUTPUT, m_InputToken, m_OutputToken
-
Fields inherited from class adams.flow.core.AbstractActor
m_Annotations, m_BackupState, m_DetectedObjectVariables, m_DetectedVariables, m_ErrorHandler, m_Executed, m_Executing, m_ExecutionListeningSupporter, m_FullName, m_LoggingPrefix, m_Name, m_Parent, m_ScopeHandler, m_Self, m_Silent, m_Skip, m_StopFlowOnError, m_StopMessage, m_Stopped, m_StorageHandler, m_VariablesUpdated
-
-
Constructor Summary
Constructors Constructor Description Tokenize()
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Class[]
accepts()
Returns the class that the consumer accepts.void
defineOptions()
Adds options to the internal list of options.protected String
doExecute()
Executes the flow item.protected Class
getItemClass()
Returns the base class of the items.String
getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.AbstractTokenizer
getTokenizer()
Returns the tokenizer to use.String
globalInfo()
Returns a string describing the object.String
outputArrayTipText()
Returns the tip text for this property.void
setTokenizer(AbstractTokenizer value)
Sets the tokenizer to use.String
tokenizerTipText()
Returns the tip text for this property.-
Methods inherited from class adams.flow.transformer.AbstractArrayProvider
backupState, generates, getOutputArray, hasPendingOutput, output, preExecute, pruneBackup, reset, restoreState, setOutputArray, wrapUp
-
Methods inherited from class adams.flow.transformer.AbstractTransformer
currentInput, execute, hasInput, input, postExecute
-
Methods inherited from class adams.flow.core.AbstractActor
annotationsTipText, canInspectOptions, canPerformSetUpCheck, cleanUp, compareTo, configureLogger, destroy, equals, finalUpdateVariables, findVariables, findVariables, forceVariables, forCommandLine, forName, forName, getAdditionalInformation, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowActors, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getParentComponent, getPreviousSibling, getRoot, getScopeHandler, getSilent, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, handleException, hasErrorHandler, hasStopMessage, index, initialize, isBackedUp, isExecuted, isExecuting, isFinished, isHeadless, isStopped, nameTipText, performSetUpChecks, performVariableChecks, pruneBackup, setAnnotations, setErrorHandler, setName, setParent, setSilent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, silentTipText, sizeOf, skipTipText, stopExecution, stopExecution, stopFlowOnErrorTipText, updateDetectedVariables, updatePrefix, updateVariables, variableChanged
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.flow.core.Actor
cleanUp, compareTo, destroy, equals, execute, findVariables, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getParentComponent, getPreviousSibling, getRoot, getScopeHandler, getSilent, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, hasErrorHandler, hasStopMessage, index, isExecuted, isFinished, isHeadless, isStopped, setAnnotations, setErrorHandler, setName, setParent, setSilent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, sizeOf, stopExecution, stopExecution, toCommandLine, variableChanged
-
-
-
-
Field Detail
-
m_Tokenizer
protected AbstractTokenizer m_Tokenizer
the tokenizer to use.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.- Specified by:
globalInfo
in interfaceadams.core.GlobalInfoSupporter
- Specified by:
globalInfo
in classadams.core.option.AbstractOptionHandler
- Returns:
- a description suitable for displaying in the gui
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.- Specified by:
defineOptions
in interfaceadams.core.option.OptionHandler
- Overrides:
defineOptions
in classadams.flow.transformer.AbstractArrayProvider
-
outputArrayTipText
public String outputArrayTipText()
Returns the tip text for this property.- Specified by:
outputArrayTipText
in interfaceadams.core.ArrayProvider
- Specified by:
outputArrayTipText
in interfaceadams.flow.core.ArrayProvider
- Specified by:
outputArrayTipText
in classadams.flow.transformer.AbstractArrayProvider
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setTokenizer
public void setTokenizer(AbstractTokenizer value)
Sets the tokenizer to use.- Parameters:
value
- the tokenizer
-
getTokenizer
public AbstractTokenizer getTokenizer()
Returns the tokenizer to use.- Returns:
- the tokenizer
-
tokenizerTipText
public String tokenizerTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getQuickInfo
public String getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.- Specified by:
getQuickInfo
in interfaceadams.flow.core.Actor
- Specified by:
getQuickInfo
in interfaceadams.core.QuickInfoSupporter
- Overrides:
getQuickInfo
in classadams.flow.core.AbstractActor
- Returns:
- null if no info available, otherwise short string
-
accepts
public Class[] accepts()
Returns the class that the consumer accepts.- Returns:
- the Class of objects that can be processed
-
getItemClass
protected Class getItemClass()
Returns the base class of the items.- Specified by:
getItemClass
in classadams.flow.transformer.AbstractArrayProvider
- Returns:
- the class
-
doExecute
protected String doExecute()
Executes the flow item.- Specified by:
doExecute
in classadams.flow.core.AbstractActor
- Returns:
- null if everything is fine, otherwise error message
-
-