adams.flow.transformer
Class PDFExtractText

java.lang.Object
  extended by adams.core.ConsoleObject
      extended by adams.core.option.AbstractOptionHandler
          extended by adams.flow.core.AbstractActor
              extended by adams.flow.transformer.AbstractTransformer
                  extended by adams.flow.transformer.PDFExtractText
All Implemented Interfaces:
AdditionalInformationHandler, CleanUpHandler, Debuggable, DebugOutputHandler, Destroyable, OptionHandler, QuickInfoSupporter, ShallowCopySupporter<AbstractActor>, SizeOfHandler, Stoppable, VariableChangeListener, Actor, ErrorHandler, InputConsumer, OutputProducer, Serializable, Comparable

public class PDFExtractText
extends AbstractTransformer

Actor for extracting the text of a range of pages from a PDF file.

Input/output:
- accepts:
   java.lang.String
   java.io.File
- generates:
   java.lang.String

Valid options are:

-D <int> (property: debugLevel)
    The greater the number the more additional info the scheme may output to
    the console (0 = off).
    default: 0
    minimum: 0
 
-name <java.lang.String> (property: name)
    The name of the actor.
    default: PDFExtractText
 
-annotation <adams.core.base.BaseText> (property: annotations)
    The annotations to attach to this actor.
    default:
 
-skip (property: skip)
    If set to true, transformation is skipped and the input token is just forwarded
    as it is.
 
-stop-flow-on-error (property: stopFlowOnError)
    If set to true, the flow gets stopped in case this actor encounters an error;
     useful for critical actors.
 
-pages <java.lang.String> (property: pages)
    The range of pages to extract; A range is a comma-separated list of single
    1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts
    the range '...'; the following placeholders can be used as well: first,
    second, third, last_2, last_1, last
    default: first-last
 

Version:
$Revision: 5563 $
Author:
fracpete (fracpete at waikato dot ac dot nz)
See Also:
Serialized Form

Field Summary
protected  PlaceholderFile m_Output
          the output file.
protected  Range m_Pages
          the range of pages to extract.
 
Fields inherited from class adams.flow.transformer.AbstractTransformer
BACKUP_INPUT, BACKUP_OUTPUT, m_InputToken, m_OutputToken
 
Fields inherited from class adams.flow.core.AbstractActor
m_Annotations, m_BackupState, m_DetectedObjectVariables, m_DetectedVariables, m_ErrorHandler, m_Executed, m_Executing, m_ExecutionListeningSupporter, m_FullName, m_Headless, m_Name, m_Parent, m_Root, m_Self, m_Skip, m_StopFlowOnError, m_StopMessage, m_Stopped, m_StorageHandler, m_VariablesUpdated
 
Fields inherited from class adams.core.option.AbstractOptionHandler
m_DebugLevel, m_OptionManager
 
Fields inherited from interface adams.flow.core.Actor
FILE_EXTENSION, FILE_EXTENSION_GZ
 
Constructor Summary
PDFExtractText()
           
 
Method Summary
 Class[] accepts()
          Returns the class that the consumer accepts.
 void defineOptions()
          Adds options to the internal list of options.
protected  String doExecute()
          Executes the flow item.
 Class[] generates()
          Returns the class of objects that it generates.
 Range getPages()
          Returns the page range.
 String getQuickInfo()
          Returns a quick info about the actor, which will be displayed in the GUI.
 String globalInfo()
          Returns a string describing the object.
protected  void initialize()
          Initializes the members.
 String pagesTipText()
          Returns the tip text for this property.
 void setPages(Range value)
          Sets the page range.
 
Methods inherited from class adams.flow.transformer.AbstractTransformer
backupState, execute, hasPendingOutput, input, output, postExecute, reset, restoreState, wrapUp
 
Methods inherited from class adams.flow.core.AbstractActor
annotationsTipText, canInspectOptions, canPerformSetUpCheck, cleanUp, compareTo, debug, destroy, equals, findVariables, findVariables, findVariables, forceVariables, forCommandLine, forName, getAdditionalInformation, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowActors, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getPreviousSibling, getRoot, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, handleException, hasErrorHandler, hasStopMessage, index, isBackedUp, isExecuted, isExecuting, isFinished, isHeadless, isStopped, nameTipText, performSetUpChecks, preExecute, pruneBackup, pruneBackup, setAnnotations, setErrorHandler, setHeadless, setName, setParent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, sizeOf, skipTipText, stopExecution, stopExecution, stopFlowOnErrorTipText, updateDetectedVariables, updatePrefix, updateVariables, variableChanged
 
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, debug, debugLevelTipText, finishInit, getDebugLevel, getOptionManager, isDebugOn, newOptionManager, setDebugLevel, toCommandLine, toString
 
Methods inherited from class adams.core.ConsoleObject
getDebugging, getSystemErr, getSystemOut
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface adams.flow.core.Actor
cleanUp, compareTo, debug, destroy, equals, findVariables, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowExecutionListeningSupporter, getFullName, getName, getNextSibling, getParent, getPreviousSibling, getRoot, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, hasErrorHandler, hasStopMessage, index, isExecuted, isFinished, isHeadless, isStopped, setAnnotations, setErrorHandler, setHeadless, setName, setParent, setSkip, setStopFlowOnError, setUp, setVariables, sizeOf, stopExecution, stopExecution, variableChanged
 
Methods inherited from interface adams.core.AdditionalInformationHandler
getAdditionalInformation
 
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager
 

Field Detail

m_Output

protected PlaceholderFile m_Output
the output file.


m_Pages

protected Range m_Pages
the range of pages to extract.

Constructor Detail

PDFExtractText

public PDFExtractText()
Method Detail

globalInfo

public String globalInfo()
Returns a string describing the object.

Specified by:
globalInfo in class AbstractOptionHandler
Returns:
a description suitable for displaying in the gui

defineOptions

public void defineOptions()
Adds options to the internal list of options.

Specified by:
defineOptions in interface OptionHandler
Overrides:
defineOptions in class AbstractActor

initialize

protected void initialize()
Initializes the members.

Overrides:
initialize in class AbstractActor

setPages

public void setPages(Range value)
Sets the page range.

Parameters:
value - the range

getPages

public Range getPages()
Returns the page range.

Returns:
the range

pagesTipText

public String pagesTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the GUI or for listing the options.

getQuickInfo

public String getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.

Specified by:
getQuickInfo in interface QuickInfoSupporter
Specified by:
getQuickInfo in interface Actor
Overrides:
getQuickInfo in class AbstractActor
Returns:
null if no info available, otherwise short string

accepts

public Class[] accepts()
Returns the class that the consumer accepts.

Returns:
java.lang.String.class, java.io.File.class

generates

public Class[] generates()
Returns the class of objects that it generates.

Returns:
java.lang.String.class

doExecute

protected String doExecute()
Executes the flow item.

Specified by:
doExecute in class AbstractActor
Returns:
null if everything is fine, otherwise error message


Copyright © 2013 University of Waikato, Hamilton, NZ. All Rights Reserved.