Package adams.flow.transformer
Class PDFExtractText
-
- All Implemented Interfaces:
AdditionalInformationHandler
,CleanUpHandler
,Destroyable
,GlobalInfoSupporter
,LoggingLevelHandler
,LoggingSupporter
,OptionHandler
,QuickInfoSupporter
,ShallowCopySupporter<Actor>
,SizeOfHandler
,Stoppable
,StoppableWithFeedback
,VariablesInspectionHandler
,VariableChangeListener
,Actor
,ErrorHandler
,InputConsumer
,OutputProducer
,Serializable
,Comparable
public class PDFExtractText extends AbstractTransformer
Actor for extracting the text of a range of pages from a PDF file.
Input/output:
- accepts:
java.lang.String
java.io.File
- generates:
java.lang.String
Valid options are:
-D <int> (property: debugLevel) The greater the number the more additional info the scheme may output to the console (0 = off). default: 0 minimum: 0
-name <java.lang.String> (property: name) The name of the actor. default: PDFExtractText
-annotation <adams.core.base.BaseText> (property: annotations) The annotations to attach to this actor. default:
-skip (property: skip) If set to true, transformation is skipped and the input token is just forwarded as it is.
-stop-flow-on-error (property: stopFlowOnError) If set to true, the flow gets stopped in case this actor encounters an error; useful for critical actors.
-pages <java.lang.String> (property: pages) The range of pages to extract; A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last default: first-last
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields Modifier and Type Field Description protected PlaceholderFile
m_Output
the output file.protected UnorderedRange
m_Pages
the range of pages to extract.-
Fields inherited from class adams.flow.transformer.AbstractTransformer
BACKUP_INPUT, BACKUP_OUTPUT, m_InputToken, m_OutputToken
-
Fields inherited from class adams.flow.core.AbstractActor
m_Annotations, m_BackupState, m_DetectedObjectVariables, m_DetectedVariables, m_ErrorHandler, m_Executed, m_Executing, m_ExecutionListeningSupporter, m_FullName, m_LoggingPrefix, m_Name, m_Parent, m_ScopeHandler, m_Self, m_Silent, m_Skip, m_StopFlowOnError, m_StopMessage, m_Stopped, m_StorageHandler, m_VariablesUpdated
-
Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager
-
Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel
-
Fields inherited from interface adams.flow.core.Actor
FILE_EXTENSION, FILE_EXTENSION_GZ
-
-
Constructor Summary
Constructors Constructor Description PDFExtractText()
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Class[]
accepts()
Returns the class that the consumer accepts.void
defineOptions()
Adds options to the internal list of options.protected String
doExecute()
Executes the flow item.Class[]
generates()
Returns the class of objects that it generates.UnorderedRange
getPages()
Returns the page range.String
getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.String
globalInfo()
Returns a string describing the object.String
pagesTipText()
Returns the tip text for this property.void
setPages(UnorderedRange value)
Sets the page range.-
Methods inherited from class adams.flow.transformer.AbstractTransformer
backupState, currentInput, execute, hasInput, hasPendingOutput, input, output, postExecute, restoreState, wrapUp
-
Methods inherited from class adams.flow.core.AbstractActor
annotationsTipText, canInspectOptions, canPerformSetUpCheck, cleanUp, compareTo, configureLogger, destroy, equals, finalUpdateVariables, findVariables, findVariables, forceVariables, forCommandLine, forName, forName, getAdditionalInformation, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowActors, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getParentComponent, getPreviousSibling, getRoot, getScopeHandler, getSilent, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, handleException, hasErrorHandler, hasStopMessage, index, initialize, isBackedUp, isExecuted, isExecuting, isFinished, isHeadless, isStopped, nameTipText, performSetUpChecks, performVariableChecks, preExecute, pruneBackup, pruneBackup, reset, setAnnotations, setErrorHandler, setName, setParent, setSilent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, silentTipText, sizeOf, skipTipText, stopExecution, stopExecution, stopFlowOnErrorTipText, updateDetectedVariables, updatePrefix, updateVariables, variableChanged
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.flow.core.Actor
cleanUp, compareTo, destroy, equals, findVariables, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getParentComponent, getPreviousSibling, getRoot, getScopeHandler, getSilent, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, hasErrorHandler, hasStopMessage, index, isExecuted, isFinished, isHeadless, isStopped, setAnnotations, setErrorHandler, setName, setParent, setSilent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, sizeOf, stopExecution, stopExecution, toCommandLine, variableChanged
-
Methods inherited from interface adams.core.AdditionalInformationHandler
getAdditionalInformation
-
Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel, setLoggingLevel
-
Methods inherited from interface adams.core.logging.LoggingSupporter
getLogger, isLoggingEnabled
-
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager
-
Methods inherited from interface adams.core.VariablesInspectionHandler
canInspectOptions
-
-
-
-
Field Detail
-
m_Output
protected PlaceholderFile m_Output
the output file.
-
m_Pages
protected UnorderedRange m_Pages
the range of pages to extract.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.- Specified by:
globalInfo
in interfaceGlobalInfoSupporter
- Specified by:
globalInfo
in classAbstractOptionHandler
- Returns:
- a description suitable for displaying in the gui
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.- Specified by:
defineOptions
in interfaceOptionHandler
- Overrides:
defineOptions
in classAbstractActor
-
setPages
public void setPages(UnorderedRange value)
Sets the page range.- Parameters:
value
- the range
-
getPages
public UnorderedRange getPages()
Returns the page range.- Returns:
- the range
-
pagesTipText
public String pagesTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
getQuickInfo
public String getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.- Specified by:
getQuickInfo
in interfaceActor
- Specified by:
getQuickInfo
in interfaceQuickInfoSupporter
- Overrides:
getQuickInfo
in classAbstractActor
- Returns:
- null if no info available, otherwise short string
-
accepts
public Class[] accepts()
Returns the class that the consumer accepts.- Returns:
- java.lang.String.class, java.io.File.class
-
generates
public Class[] generates()
Returns the class of objects that it generates.- Returns:
- java.lang.String.class
-
doExecute
protected String doExecute()
Executes the flow item.- Specified by:
doExecute
in classAbstractActor
- Returns:
- null if everything is fine, otherwise error message
-
-