Package weka.core.converters
Class SimpleArffLoader
- java.lang.Object
-
- weka.core.converters.AbstractLoader
-
- weka.core.converters.AbstractFileLoader
-
- weka.core.converters.SimpleArffLoader
-
- All Implemented Interfaces:
EncodingSupporter, Serializable, weka.core.converters.FileSourcedConverter, weka.core.converters.Loader, weka.core.EnvironmentHandler, weka.core.OptionHandler, weka.core.RevisionHandler, weka.core.WeightedInstancesHandler
public class SimpleArffLoader extends weka.core.converters.AbstractFileLoader implements weka.core.WeightedInstancesHandler, weka.core.OptionHandler, EncodingSupporter
A simple ARFF loader that only supports batch loading.
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (modifier and type, field, description):
- static String KEYWORD_ATTRIBUTE
- static String KEYWORD_DATA
- static String KEYWORD_RELATION
- protected weka.core.Instances m_Data: the currently loaded data.
- protected BaseCharset m_Encoding: the encoding to use.
- protected boolean m_ForceCompression: whether to force compression.
-
Constructor Summary
Constructors (constructor, description):
- SimpleArffLoader(): Initializes the loader.
-
Method Summary
Methods (modifier and type, method, description):
- protected weka.core.Attribute createAttribute(String line): Creates an attribute from the specification line.
- String encodingTipText(): Returns the tip text for this property.
- String forceCompressionTipText(): Tip text suitable for displaying in the GUI.
- weka.core.Instances getDataSet(): Returns the full dataset.
- BaseCharset getEncoding(): Returns the encoding to use.
- String getFileDescription(): Gets a one-line description of the type of file.
- String getFileExtension(): Gets the file extension used for this type of file.
- String[] getFileExtensions(): Gets all the file extensions used for this type of file.
- boolean getForceCompression(): Gets whether the file is interpreted as a gzip-compressed ARFF file.
- weka.core.Instance getNextInstance(weka.core.Instances structure): Not supported.
- String[] getOptions(): Gets the current option settings for the OptionHandler.
- String getRevision(): Returns the revision string.
- weka.core.Instances getStructure(): Returns the structure of the dataset.
- String globalInfo(): Description of loader.
- protected int indexOfUnescaped(String s, char chr, int start): Finds the index of an unescaped (i.e., not preceded by a backslash) character, starting at the provided position.
- Enumeration<weka.core.Option> listOptions(): Returns an enumeration of all the available options.
- static void main(String[] args): Main method.
- protected Map<String,String> parseAttribute(String line): Extracts the attribute name, type and date format from the line.
- protected weka.core.Instance parseDense(weka.core.Instances header, String line): Parses a dense instance.
- protected weka.core.Instance parseSparse(weka.core.Instances header, String line): Parses a data row in sparse format.
- protected weka.core.Instances read(BufferedReader reader): Performs the actual reading.
- protected String removeAttributeType(String current): Removes the attribute type.
- void reset(): Resets the loader.
- File retrieveFile(): Returns the current source file/destination file.
- void setEncoding(BaseCharset value): Sets the encoding to use.
- void setFile(File file): Sets the file to load from or save to.
- void setForceCompression(boolean value): Sets whether the file is interpreted as a gzip-compressed ARFF file.
- void setOptions(String[] options): Sets the OptionHandler's options using the given list.
- void setSource(File file): Resets the Loader object and sets the source of the data set to be the supplied File object.
- void setUseRelativePath(boolean rp): Ignored.
- protected String unquoteAttribute(String name): Unquotes the attribute name.
-
Methods inherited from class weka.core.converters.AbstractFileLoader
getUseRelativePath, makeOptionStr, runFileLoader, setEnvironment, useRelativePathTipText
-
-
-
-
Field Detail
-
KEYWORD_RELATION
public static final String KEYWORD_RELATION
- See Also:
- Constant Field Values
-
KEYWORD_ATTRIBUTE
public static final String KEYWORD_ATTRIBUTE
- See Also:
- Constant Field Values
-
KEYWORD_DATA
public static final String KEYWORD_DATA
- See Also:
- Constant Field Values
-
m_Data
protected weka.core.Instances m_Data
the currently loaded data.
-
m_ForceCompression
protected boolean m_ForceCompression
whether to force compression.
-
m_Encoding
protected BaseCharset m_Encoding
the encoding to use.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Description of loader.
- Returns:
- the description
-
setForceCompression
public void setForceCompression(boolean value)
Sets whether the file is interpreted as a gzip-compressed ARFF file.
- Parameters:
- value - true if the file is to be treated as compressed
-
getForceCompression
public boolean getForceCompression()
Gets whether the file is interpreted as a gzip-compressed ARFF file.
- Returns:
- true if the file is to be treated as compressed
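As a rough illustration of what treating a file as gzip-compressed can mean in practice, the sketch below wraps the raw input stream in a java.util.zip.GZIPInputStream when a flag is set and decodes it with a given charset. The class GzipReaderSketch and its methods open and demo are hypothetical names that use only the JDK; this is not the loader's actual implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the documented class.
public class GzipReaderSketch {

    /** Opens a reader on the file, decompressing on the fly if gzip is true. */
    public static BufferedReader open(Path file, boolean gzip, Charset charset) throws IOException {
        InputStream in = Files.newInputStream(file);
        try {
            if (gzip) {
                in = new GZIPInputStream(in);
            }
        } catch (IOException e) {
            in.close(); // do not leak the raw stream if the gzip header is invalid
            throw e;
        }
        return new BufferedReader(new InputStreamReader(in, charset));
    }

    /** Writes a tiny gzip-compressed file and reads its first line back. */
    public static String demo() {
        try {
            Path tmp = Files.createTempFile("sketch", ".dat");
            try {
                try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmp))) {
                    out.write("@relation demo\n".getBytes(StandardCharsets.UTF_8));
                }
                // The file name gives no hint of compression, so the flag decides.
                try (BufferedReader reader = open(tmp, true, StandardCharsets.UTF_8)) {
                    return reader.readLine();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "@relation demo"
    }
}
```

A flag of this kind is mainly useful when the file name does not reveal that the content is compressed.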
-
forceCompressionTipText
public String forceCompressionTipText()
Tip text suitable for displaying in the GUI.
- Returns:
- a description of this property as a String
-
setEncoding
public void setEncoding(BaseCharset value)
Sets the encoding to use.
- Specified by:
- setEncoding in interface EncodingSupporter
- Parameters:
- value - the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
-
getEncoding
public BaseCharset getEncoding()
Returns the encoding to use.
- Specified by:
- getEncoding in interface EncodingSupporter
- Returns:
- the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
-
encodingTipText
public String encodingTipText()
Returns the tip text for this property.
- Specified by:
- encodingTipText in interface EncodingSupporter
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
listOptions
public Enumeration<weka.core.Option> listOptions()
Returns an enumeration of all the available options.
- Specified by:
- listOptions in interface weka.core.OptionHandler
- Returns:
- an enumeration of all available options.
-
getOptions
public String[] getOptions()
Gets the current option settings for the OptionHandler.
- Specified by:
- getOptions in interface weka.core.OptionHandler
- Returns:
- the list of current option settings as an array of strings
-
setOptions
public void setOptions(String[] options) throws Exception
Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).
- Specified by:
- setOptions in interface weka.core.OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getFileExtension
public String getFileExtension()
Gets the file extension used for this type of file.
- Specified by:
- getFileExtension in interface weka.core.converters.FileSourcedConverter
- Returns:
- the file extension
-
getFileExtensions
public String[] getFileExtensions()
Gets all the file extensions used for this type of file.
- Specified by:
- getFileExtensions in interface weka.core.converters.FileSourcedConverter
- Returns:
- the file extensions
-
getFileDescription
public String getFileDescription()
Gets a one-line description of the type of file.
- Specified by:
- getFileDescription in interface weka.core.converters.FileSourcedConverter
- Returns:
- a description of the file type
-
reset
public void reset() throws IOException
Resets the loader.
- Specified by:
- reset in interface weka.core.converters.Loader
- Overrides:
- reset in class weka.core.converters.AbstractFileLoader
- Throws:
- IOException
-
setFile
public void setFile(File file) throws IOException
Sets the file to load from or save to.
- Specified by:
- setFile in interface weka.core.converters.FileSourcedConverter
- Overrides:
- setFile in class weka.core.converters.AbstractFileLoader
- Parameters:
- file - the file to load from
- Throws:
- IOException - if an error occurs
-
setSource
public void setSource(File file) throws IOException
Resets the Loader object and sets the source of the data set to be the supplied File object.
- Specified by:
- setSource in interface weka.core.converters.Loader
- Overrides:
- setSource in class weka.core.converters.AbstractFileLoader
- Parameters:
- file - the source file.
- Throws:
- IOException - if an error occurs
-
retrieveFile
public File retrieveFile()
Returns the current source file/destination file.
- Specified by:
- retrieveFile in interface weka.core.converters.FileSourcedConverter
- Overrides:
- retrieveFile in class weka.core.converters.AbstractFileLoader
- Returns:
- a File value
-
setUseRelativePath
public void setUseRelativePath(boolean rp)
Ignored.
- Specified by:
- setUseRelativePath in interface weka.core.converters.FileSourcedConverter
- Overrides:
- setUseRelativePath in class weka.core.converters.AbstractFileLoader
- Parameters:
- rp - true if relative paths are to be used
-
removeAttributeType
protected String removeAttributeType(String current)
Removes the attribute type.
- Parameters:
- current - the remainder of the attribute type string
- Returns:
- the remainder without the type string
-
indexOfUnescaped
protected int indexOfUnescaped(String s, char chr, int start)
Finds the index of an unescaped (i.e., not preceded by a backslash) character, starting at the provided position.
- Parameters:
- s - the string to analyze
- chr - the character to look for
- start - the 0-based index of the starting position
- Returns:
- the index, -1 if not found
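The search described above can be sketched in a few lines of plain Java. The class name UnescapedIndexSketch is hypothetical, and the code only follows the wording of the description (a match counts as escaped when the character directly before it is a backslash); the loader's own implementation may differ in detail.

```java
// Hypothetical stand-alone sketch, not the loader's code.
public class UnescapedIndexSketch {

    /**
     * Returns the index of the first occurrence of chr at or after start
     * that is not directly preceded by a backslash, or -1 if there is none.
     */
    public static int indexOfUnescaped(String s, char chr, int start) {
        for (int i = Math.max(start, 0); i < s.length(); i++) {
            if (s.charAt(i) != chr) {
                continue;
            }
            boolean escaped = i > 0 && s.charAt(i - 1) == '\\';
            if (!escaped) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The string is: it\'s 'quoted'
        // The quote at index 3 is escaped, so the first match is at index 6.
        System.out.println(indexOfUnescaped("it\\'s 'quoted'", '\'', 0)); // prints 6
    }
}
```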
-
unquoteAttribute
protected String unquoteAttribute(String name)
Unquotes the attribute name.
- Parameters:
- name - the name to unquote, if necessary
- Returns:
- the unquoted name
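The exact quoting rules applied by this method are not stated here. As a minimal sketch, assuming a name is quoted when it starts and ends with the same single or double quote and that inner quotes are escaped with a backslash, unquoting could look as follows. UnquoteSketch and unquote are hypothetical names.

```java
// Hypothetical stand-alone sketch; the quoting rules are assumptions.
public class UnquoteSketch {

    /** Strips one pair of matching quotes and undoes escaped inner quotes. */
    public static String unquote(String name) {
        if (name.length() >= 2) {
            char first = name.charAt(0);
            char last = name.charAt(name.length() - 1);
            if ((first == '\'' || first == '"') && first == last) {
                String inner = name.substring(1, name.length() - 1);
                // Turn a backslash-escaped quote back into the plain quote.
                return inner.replace("\\" + first, String.valueOf(first));
            }
        }
        // Not quoted: return the name unchanged.
        return name;
    }

    public static void main(String[] args) {
        System.out.println(unquote("'petal length'")); // prints "petal length" without the quotes
        System.out.println(unquote("sepalwidth"));     // prints the name unchanged
    }
}
```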
-
parseAttribute
protected Map<String,String> parseAttribute(String line)
Extracts the attribute name, type and date format from the line.
- Parameters:
- line - the line to parse
- Returns:
- the extracted data
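To make the idea of extracting name, type and date format more concrete, here is a minimal sketch that splits a well-formed ARFF attribute declaration into a map. The class AttributeLineSketch and the map keys "name", "type" and "format" are assumptions made for illustration; the keys actually used by parseAttribute are not documented here, and escaped quotes inside names are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch, not the loader's code.
public class AttributeLineSketch {

    /** Strips one pair of matching single or double quotes, if present. */
    private static String stripQuotes(String s) {
        if (s.length() >= 2) {
            char q = s.charAt(0);
            if ((q == '\'' || q == '"') && s.charAt(s.length() - 1) == q) {
                return s.substring(1, s.length() - 1);
            }
        }
        return s;
    }

    /**
     * Splits a well-formed "@attribute <name> <type> [<format>]" line.
     * The keys "name", "type" and "format" are assumptions of this sketch.
     */
    public static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        // Skip the keyword, whatever its capitalisation.
        String rest = line.trim().substring("@attribute".length()).trim();

        int nameEnd;
        char first = rest.charAt(0);
        if (first == '\'' || first == '"') {
            // Quoted name: ends at the next matching quote (escapes are not handled).
            nameEnd = rest.indexOf(first, 1) + 1;
        } else {
            // Bare name: ends at the first whitespace character.
            nameEnd = 0;
            while (nameEnd < rest.length() && !Character.isWhitespace(rest.charAt(nameEnd))) {
                nameEnd++;
            }
        }
        result.put("name", stripQuotes(rest.substring(0, nameEnd)));
        rest = rest.substring(nameEnd).trim();

        String[] parts = rest.split("\\s+", 2);
        if (parts[0].equalsIgnoreCase("date")) {
            result.put("type", "date");
            if (parts.length > 1) {
                result.put("format", stripQuotes(parts[1].trim()));
            }
        } else {
            // numeric, string, or a nominal specification such as {a,b,c}
            result.put("type", rest);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {name=timestamp, type=date, format=yyyy-MM-dd HH:mm:ss}
        System.out.println(parse("@attribute timestamp date \"yyyy-MM-dd HH:mm:ss\""));
    }
}
```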
-
createAttribute
protected weka.core.Attribute createAttribute(String line)
Creates an attribute from the specification line.
- Parameters:
- line - the line to use
- Returns:
- the attribute
-
parseSparse
protected weka.core.Instance parseSparse(weka.core.Instances header, String line) throws Exception
Parses a data row in sparse format.
- Parameters:
- header - the dataset format
- line - the line to parse
- Returns:
- the sparse instance
- Throws:
Exception- if parsing fails
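In the ARFF format, a sparse data row lists only the values that are stored explicitly, as "index value" pairs inside curly braces with 0-based attribute indices, for example {1 X, 3 Y}. The following sketch turns such a row into an index-to-value map. SparseRowSketch is a hypothetical name; unlike the real method it does not build a weka.core.Instance, and it handles neither quoted values that contain commas nor instance weights.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch, not the loader's code.
public class SparseRowSketch {

    /** Parses "{index value, index value, ...}" into a sorted map. */
    public static Map<Integer, String> parse(String line) {
        String body = line.trim();
        if (!body.startsWith("{") || !body.endsWith("}")) {
            throw new IllegalArgumentException("Not a sparse row: " + line);
        }
        body = body.substring(1, body.length() - 1).trim();

        Map<Integer, String> values = new TreeMap<>();
        if (body.isEmpty()) {
            return values; // "{}" means that no value is stored explicitly
        }
        for (String pair : body.split(",")) {
            // Each pair is "<0-based attribute index> <value>".
            String[] parts = pair.trim().split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            values.put(Integer.parseInt(parts[0]), parts[1]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("{1 X, 3 Y}")); // prints {1=X, 3=Y}
    }
}
```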
-
parseDense
protected weka.core.Instance parseDense(weka.core.Instances header, String line) throws Exception
Parses a dense instance.
- Parameters:
- header - the dataset header
- line - the line to parse
- Returns:
- the parsed instance
- Throws:
Exception- if parsing fails
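A dense ARFF row is a comma-separated list with one value per attribute, where values may be quoted. As a rough sketch of the splitting step only, the code below breaks a row into cells while ignoring commas inside quotes. DenseRowSketch is a hypothetical name, and the handling of escapes is an assumption; mapping the cells onto the attributes of the header, as the real method must do, is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch, not the loader's code.
public class DenseRowSketch {

    /** Splits a row on commas that are not inside single or double quotes. */
    public static List<String> split(String line) {
        List<String> cells = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 while outside quotes, else the opening quote character
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quote != 0) {
                if (c == '\\' && i + 1 < line.length()) {
                    current.append(line.charAt(++i)); // keep the escaped character
                } else if (c == quote) {
                    quote = 0; // closing quote
                } else {
                    current.append(c);
                }
            } else if (c == '\'' || c == '"') {
                quote = c; // opening quote
            } else if (c == ',') {
                cells.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Note: trim() also removes leading/trailing blanks of quoted values.
        cells.add(current.toString().trim());
        return cells;
    }

    public static void main(String[] args) {
        System.out.println(split("5.1,3.5,'Iris setosa',?")); // prints [5.1, 3.5, Iris setosa, ?]
    }
}
```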
-
read
protected weka.core.Instances read(BufferedReader reader)
Performs the actual reading.
- Parameters:
- reader - the reader to read from
- Returns:
- the dataset, or null in case of an error
-
getStructure
public weka.core.Instances getStructure() throws IOException
Returns the structure of the dataset.
- Specified by:
- getStructure in interface weka.core.converters.Loader
- Specified by:
- getStructure in class weka.core.converters.AbstractLoader
- Returns:
- the structure
- Throws:
- IOException - if reading fails
-
getDataSet
public weka.core.Instances getDataSet() throws IOException
Returns the full dataset.
- Specified by:
- getDataSet in interface weka.core.converters.Loader
- Specified by:
- getDataSet in class weka.core.converters.AbstractLoader
- Returns:
- the dataset
- Throws:
- IOException - if reading fails
-
getNextInstance
public weka.core.Instance getNextInstance(weka.core.Instances structure) throws IOException
Not supported.
- Specified by:
- getNextInstance in interface weka.core.converters.Loader
- Specified by:
- getNextInstance in class weka.core.converters.AbstractLoader
- Parameters:
- structure - the structure
- Returns:
- the instance
- Throws:
- IOException - always
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface weka.core.RevisionHandler
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method.
- Parameters:
- args - should contain the name of an input file.
-
-