adams.flow.transformer
Class WekaTextDirectoryReader

java.lang.Object
  extended by adams.core.ConsoleObject
      extended by adams.core.option.AbstractOptionHandler
          extended by adams.flow.core.AbstractActor
              extended by adams.flow.transformer.AbstractTransformer
                  extended by adams.flow.transformer.WekaTextDirectoryReader
All Implemented Interfaces:
AdditionalInformationHandler, CleanUpHandler, Debuggable, DebugOutputHandler, Destroyable, OptionHandler, QuickInfoSupporter, ShallowCopySupporter<AbstractActor>, SizeOfHandler, Stoppable, VariableChangeListener, ErrorHandler, InputConsumer, OutputProducer, ProvenanceSupporter, Serializable, Comparable

public class WekaTextDirectoryReader
extends AbstractTransformer
implements ProvenanceSupporter

Loads all text files in a directory and uses the subdirectory names as class labels. The content of the text files will be stored in a String attribute, the filename can be stored as well.
Uses the WEKA weka.core.converters.TextDirectoryLoader converter.

Input/output:
- accepts:
   java.lang.String
   java.io.File
- generates:
   weka.core.Instances

Valid options are:

-D <int> (property: debugLevel)
    The greater the number the more additional info the scheme may output to
    the console (0 = off).
    default: 0
    minimum: 0
 
-name <java.lang.String> (property: name)
    The name of the actor.
    default: WekaTextDirectoryReader
 
-annotation <adams.core.base.BaseText> (property: annotations)
    The annotations to attach to this actor.
    default:
 
-skip (property: skip)
    If set to true, transformation is skipped and the input token is just forwarded
    as it is.
 
-stop-flow-on-error (property: stopFlowOnError)
    If set to true, the flow gets stopped in case this actor encounters an error;
     useful for critical actors.
 
-store-filename (property: storeFilename)
    If enabled, the filename will be stored in extra attribute.
 
-char-set <java.lang.String> (property: charSet)
    The character set to use when loading the text files.
    default: UTF-8
 

Version:
$Revision: 4584 $
Author:
fracpete (fracpete at waikato dot ac dot nz)
See Also:
Serialized Form

Field Summary
protected  String m_CharSet
          the character set.
protected  boolean m_StoreFilename
          whether to store the filename as extra attribute.
 
Fields inherited from class adams.flow.transformer.AbstractTransformer
BACKUP_INPUT, BACKUP_OUTPUT, m_InputToken, m_OutputToken
 
Fields inherited from class adams.flow.core.AbstractActor
FILE_EXTENSION, FILE_EXTENSION_GZ, m_Annotations, m_BackupState, m_DetectedObjectVariables, m_DetectedVariables, m_ErrorHandler, m_Executed, m_FullName, m_Headless, m_Name, m_Parent, m_Root, m_Self, m_Skip, m_StopFlowOnError, m_StopMessage, m_Stopped, m_StorageHandler, m_VariablesUpdated
 
Fields inherited from class adams.core.option.AbstractOptionHandler
m_DebugLevel, m_OptionManager
 
Constructor Summary
WekaTextDirectoryReader()
           
 
Method Summary
 Class[] accepts()
          Returns the class that the consumer accepts.
 String charSetTipText()
          Returns the tip text for this property.
 void defineOptions()
          Adds options to the internal list of options.
protected  String doExecute()
          Executes the flow item.
 Class[] generates()
          Returns the class of objects that it generates.
 String getCharSet()
          Returns the character set in use.
 String getQuickInfo()
          Returns a quick info about the actor, which will be displayed in the GUI.
 boolean getStoreFilename()
          Returns whether the filename gets stored in extra attribute.
 String globalInfo()
          Returns a string describing the object.
 void setCharSet(String value)
          Sets the character set to use.
 void setStoreFilename(boolean value)
          Sets whether to store the filename in extra attribute.
 String storeFilenameTipText()
          Returns the tip text for this property.
 void updateProvenance(ProvenanceContainer cont)
          Updates the provenance information in the provided container.
 
Methods inherited from class adams.flow.transformer.AbstractTransformer
backupState, execute, hasPendingOutput, input, output, postExecute, reset, restoreState, wrapUp
 
Methods inherited from class adams.flow.core.AbstractActor
annotationsTipText, canInspectOptions, canPerformSetUpCheck, cleanUp, compareTo, debug, destroy, equals, findVariables, findVariables, findVariables, forceVariables, forCommandLine, forName, getAdditionalInformation, getAnnotations, getDefaultName, getDetectedVariables, getErrorHandler, getFlowActors, getFullName, getName, getNextSibling, getParent, getPreviousSibling, getRoot, getSkip, getStopFlowOnError, getStopMessage, getStorageHandler, getVariables, handleError, hasErrorHandler, hasStopMessage, index, initialize, isBackedUp, isExecuted, isFinished, isHeadless, isStopped, nameTipText, performSetUpChecks, preExecute, pruneBackup, pruneBackup, setAnnotations, setErrorHandler, setHeadless, setName, setParent, setSkip, setStopFlowOnError, setUp, setVariables, shallowCopy, shallowCopy, sizeOf, skipTipText, stopExecution, stopExecution, stopFlowOnErrorTipText, updateDetectedVariables, updatePrefix, updateVariables, variableChanged
 
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, debug, debugLevelTipText, finishInit, getDebugLevel, getOptionManager, isDebugOn, newOptionManager, setDebugLevel, toCommandLine, toString
 
Methods inherited from class adams.core.ConsoleObject
getDebugging, getSystemErr, getSystemOut
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_StoreFilename

protected boolean m_StoreFilename
whether to store the filename as extra attribute.


m_CharSet

protected String m_CharSet
the character set.

Constructor Detail

WekaTextDirectoryReader

public WekaTextDirectoryReader()
Method Detail

globalInfo

public String globalInfo()
Returns a string describing the object.

Specified by:
globalInfo in class AbstractOptionHandler
Returns:
a description suitable for displaying in the gui

defineOptions

public void defineOptions()
Adds options to the internal list of options.

Specified by:
defineOptions in interface OptionHandler
Overrides:
defineOptions in class AbstractActor

getQuickInfo

public String getQuickInfo()
Returns a quick info about the actor, which will be displayed in the GUI.

Specified by:
getQuickInfo in interface QuickInfoSupporter
Overrides:
getQuickInfo in class AbstractActor
Returns:
null if no info available, otherwise short string

accepts

public Class[] accepts()
Returns the class that the consumer accepts.

Specified by:
accepts in interface InputConsumer
Returns:
java.lang.String.class, java.io.File.class

generates

public Class[] generates()
Returns the class of objects that it generates.

Specified by:
generates in interface OutputProducer
Returns:
weka.core.Instances.class

setStoreFilename

public void setStoreFilename(boolean value)
Sets whether to store the filename in extra attribute.

Parameters:
value - if true then filename gets stored as well

getStoreFilename

public boolean getStoreFilename()
Returns whether the filename gets stored in extra attribute.

Returns:
true if a filename gets stored

storeFilenameTipText

public String storeFilenameTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the GUI or for listing the options.

setCharSet

public void setCharSet(String value)
Sets the character set to use.

Parameters:
value - the character set

getCharSet

public String getCharSet()
Returns the character set in use.

Returns:
the character set

charSetTipText

public String charSetTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the GUI or for listing the options.

doExecute

protected String doExecute()
Executes the flow item.

Specified by:
doExecute in class AbstractActor
Returns:
null if everything is fine, otherwise error message

updateProvenance

public void updateProvenance(ProvenanceContainer cont)
Updates the provenance information in the provided container.

Specified by:
updateProvenance in interface ProvenanceSupporter
Parameters:
cont - the provenance container to update


Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.