Package weka.core.converters
Class SimpleArffLoader
- java.lang.Object
  - weka.core.converters.AbstractLoader
    - weka.core.converters.AbstractFileLoader
      - weka.core.converters.SimpleArffLoader
- All Implemented Interfaces:
- adams.core.io.EncodingSupporter, Serializable, weka.core.converters.FileSourcedConverter, weka.core.converters.Loader, weka.core.EnvironmentHandler, weka.core.OptionHandler, weka.core.RevisionHandler, weka.core.WeightedInstancesHandler
public class SimpleArffLoader extends weka.core.converters.AbstractFileLoader implements weka.core.WeightedInstancesHandler, weka.core.OptionHandler, adams.core.io.EncodingSupporter
A simple ARFF loader that only supports batch loading.
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (modifier and type, name, description):
static String KEYWORD_ATTRIBUTE
static String KEYWORD_DATA
static String KEYWORD_RELATION
protected weka.core.Instances m_Data - the currently loaded data.
protected adams.core.base.BaseCharset m_Encoding - the encoding to use.
protected boolean m_ForceCompression - whether to force compression.
-
Constructor Summary
Constructors (constructor, description):
SimpleArffLoader() - Initializes the loader.
-
Method Summary
Methods (modifier and type, method, description):
protected weka.core.Attribute createAttribute(String line) - Creates an attribute from the specification line.
String encodingTipText() - Returns the tip text for this property.
String forceCompressionTipText() - Tip text suitable for displaying in the GUI.
weka.core.Instances getDataSet() - Returns the full dataset.
adams.core.base.BaseCharset getEncoding() - Returns the encoding to use.
String getFileDescription() - Gets a one-line description of the type of file.
String getFileExtension() - Gets the file extension used for this type of file.
String[] getFileExtensions() - Gets all the file extensions used for this type of file.
boolean getForceCompression() - Gets whether the file is interpreted as a gzip-compressed ARFF file.
weka.core.Instance getNextInstance(weka.core.Instances structure) - Not supported.
String[] getOptions() - Gets the current option settings for the OptionHandler.
String getRevision() - Returns the revision string.
weka.core.Instances getStructure() - Returns the structure of the dataset.
String globalInfo() - Description of loader.
protected int indexOfUnescaped(String s, char chr, int start) - Finds the index of an unescaped (i.e., not preceded by a backslash) character, starting at the provided position.
Enumeration<weka.core.Option> listOptions() - Returns an enumeration of all the available options.
static void main(String[] args) - Main method.
protected Map<String,String> parseAttribute(String line) - Extracts the attribute name, type and date format from the line.
protected weka.core.Instance parseDense(weka.core.Instances header, String line) - Parses a dense instance.
protected weka.core.Instance parseSparse(weka.core.Instances header, String line) - Parses a data row in sparse format.
protected weka.core.Instances read(BufferedReader reader) - Performs the actual reading.
protected String removeAttributeType(String current) - Removes the attribute type.
void reset() - Resets the loader.
File retrieveFile() - Returns the current source file/destination file.
void setEncoding(adams.core.base.BaseCharset value) - Sets the encoding to use.
void setFile(File file) - Sets the file to load from/to save in.
void setForceCompression(boolean value) - Sets whether the file is interpreted as a gzip-compressed ARFF file.
void setOptions(String[] options) - Sets the OptionHandler's options using the given list.
void setSource(File file) - Resets the Loader object and sets the source of the data set to be the supplied File object.
void setUseRelativePath(boolean rp) - Ignored.
protected String unquoteAttribute(String name) - Unquotes the attribute name.
Methods inherited from class weka.core.converters.AbstractFileLoader
getUseRelativePath, makeOptionStr, runFileLoader, setEnvironment, useRelativePathTipText
-
-
-
-
Field Detail
-
KEYWORD_RELATION
public static final String KEYWORD_RELATION
- See Also:
- Constant Field Values
-
KEYWORD_ATTRIBUTE
public static final String KEYWORD_ATTRIBUTE
- See Also:
- Constant Field Values
-
KEYWORD_DATA
public static final String KEYWORD_DATA
- See Also:
- Constant Field Values
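The three keyword constants correspond to the three header markers of the ARFF format. The following is a minimal, stand-alone sketch of how a reader might recognize them case-insensitively. The constant values shown here are assumptions based on the ARFF format, not values taken from this page (see Constant Field Values for the actual ones), and the class ArffKeywordSketch and method keywordOf are hypothetical names used only for illustration.

```java
final class ArffKeywordSketch {

    // Assumed values based on the ARFF format; the actual constants are
    // documented under "Constant Field Values".
    static final String KEYWORD_RELATION = "@relation";
    static final String KEYWORD_ATTRIBUTE = "@attribute";
    static final String KEYWORD_DATA = "@data";

    /** Returns the header keyword the line starts with, or null if there is none. */
    static String keywordOf(String line) {
        String trimmed = line.trim();
        String[] keywords = {KEYWORD_RELATION, KEYWORD_ATTRIBUTE, KEYWORD_DATA};
        for (String keyword : keywords) {
            // Case-insensitive prefix comparison.
            if (trimmed.regionMatches(true, 0, keyword, 0, keyword.length())) {
                return keyword;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(keywordOf("@RELATION iris"));              // @relation
        System.out.println(keywordOf("5.1,3.5,1.4,0.2,Iris-setosa")); // null
    }
}
```

Lines that match none of the keywords are left to the caller, which would typically treat them as comments, blank lines or data rows.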
-
m_Data
protected weka.core.Instances m_Data
the currently loaded data.
-
m_ForceCompression
protected boolean m_ForceCompression
whether to force compression.
-
m_Encoding
protected adams.core.base.BaseCharset m_Encoding
the encoding to use.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Description of loader.
- Returns:
- the description
-
setForceCompression
public void setForceCompression(boolean value)
Sets whether the file is interpreted as a gzip-compressed ARFF file.
- Parameters:
- value - true if the file should be treated as compressed
-
getForceCompression
public boolean getForceCompression()
Gets whether the file is interpreted as a gzip-compressed ARFF file.
- Returns:
- true if the file is treated as compressed
-
forceCompressionTipText
public String forceCompressionTipText()
Tip text suitable for displaying in the GUI.
- Returns:
- a description of this property as a String
-
setEncoding
public void setEncoding(adams.core.base.BaseCharset value)
Sets the encoding to use.
- Specified by:
- setEncoding in interface adams.core.io.EncodingSupporter
- Parameters:
- value - the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
-
getEncoding
public adams.core.base.BaseCharset getEncoding()
Returns the encoding to use.
- Specified by:
- getEncoding in interface adams.core.io.EncodingSupporter
- Returns:
- the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
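The force-compression flag and the encoding together determine how the raw bytes of a file become text. The following stand-alone sketch shows one way this could be wired up with the JDK alone. It is not the loader's actual implementation: the class ArffReaderSketch and the method open are hypothetical, and the encoding is passed as a plain String rather than an adams.core.base.BaseCharset.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ArffReaderSketch {

    /**
     * Wraps the stream in a gzip decoder when forceCompression is set and
     * decodes the bytes with the given encoding; an empty encoding name
     * selects the platform default charset.
     */
    static BufferedReader open(InputStream in, boolean forceCompression, String encoding)
            throws IOException {
        InputStream source = forceCompression ? new GZIPInputStream(in) : in;
        Charset charset = encoding.isEmpty() ? Charset.defaultCharset() : Charset.forName(encoding);
        return new BufferedReader(new InputStreamReader(source, charset));
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzip-compressed ARFF header in memory.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write("@relation demo\n".getBytes(StandardCharsets.UTF_8));
        }
        try (BufferedReader reader =
                 open(new ByteArrayInputStream(bytes.toByteArray()), true, "UTF-8")) {
            System.out.println(reader.readLine()); // @relation demo
        }
    }
}
```

Decompression has to happen before character decoding, because the charset applies to the uncompressed bytes.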
-
encodingTipText
public String encodingTipText()
Returns the tip text for this property.
- Specified by:
- encodingTipText in interface adams.core.io.EncodingSupporter
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
listOptions
public Enumeration<weka.core.Option> listOptions()
Returns an enumeration of all the available options.
- Specified by:
- listOptions in interface weka.core.OptionHandler
- Returns:
- an enumeration of all available options.
-
getOptions
public String[] getOptions()
Gets the current option settings for the OptionHandler.
- Specified by:
- getOptions in interface weka.core.OptionHandler
- Returns:
- the list of current option settings as an array of strings
-
setOptions
public void setOptions(String[] options) throws Exception
Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e., incremental setting of options is not possible).
- Specified by:
- setOptions in interface weka.core.OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getFileExtension
public String getFileExtension()
Gets the file extension used for this type of file.
- Specified by:
- getFileExtension in interface weka.core.converters.FileSourcedConverter
- Returns:
- the file extension
-
getFileExtensions
public String[] getFileExtensions()
Gets all the file extensions used for this type of file.
- Specified by:
- getFileExtensions in interface weka.core.converters.FileSourcedConverter
- Returns:
- the file extensions
-
getFileDescription
public String getFileDescription()
Gets a one-line description of the type of file.
- Specified by:
- getFileDescription in interface weka.core.converters.FileSourcedConverter
- Returns:
- a description of the file type
-
reset
public void reset() throws IOException
Resets the loader.
- Specified by:
- reset in interface weka.core.converters.Loader
- Overrides:
- reset in class weka.core.converters.AbstractFileLoader
- Throws:
- IOException
-
setFile
public void setFile(File file) throws IOException
Sets the file to load from/to save in.
- Specified by:
- setFile in interface weka.core.converters.FileSourcedConverter
- Overrides:
- setFile in class weka.core.converters.AbstractFileLoader
- Parameters:
- file - the file to load from
- Throws:
- IOException - if an error occurs
-
setSource
public void setSource(File file) throws IOException
Resets the Loader object and sets the source of the data set to be the supplied File object.
- Specified by:
- setSource in interface weka.core.converters.Loader
- Overrides:
- setSource in class weka.core.converters.AbstractFileLoader
- Parameters:
- file - the source file
- Throws:
- IOException - if an error occurs
-
retrieveFile
public File retrieveFile()
Returns the current source file/destination file.
- Specified by:
- retrieveFile in interface weka.core.converters.FileSourcedConverter
- Overrides:
- retrieveFile in class weka.core.converters.AbstractFileLoader
- Returns:
- a File value
-
setUseRelativePath
public void setUseRelativePath(boolean rp)
Ignored.
- Specified by:
- setUseRelativePath in interface weka.core.converters.FileSourcedConverter
- Overrides:
- setUseRelativePath in class weka.core.converters.AbstractFileLoader
- Parameters:
- rp - true if relative paths are to be used
-
removeAttributeType
protected String removeAttributeType(String current)
Removes the attribute type.
- Parameters:
- current - the remainder of the attribute type string
- Returns:
- the remainder without the type string
-
indexOfUnescaped
protected int indexOfUnescaped(String s, char chr, int start)
Finds the index of an unescaped (i.e., not preceded by a backslash) character, starting at the provided position.
- Parameters:
- s - the string to analyze
- chr - the character to look for
- start - the 0-based index of the starting position
- Returns:
- the index, -1 if not found
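The contract described above can be illustrated with a short stand-alone sketch. This is an assumption about the behaviour based only on the description (a character counts as escaped when the character directly before it is a backslash), not the loader's actual code. The class name UnescapedIndexSketch is hypothetical.

```java
final class UnescapedIndexSketch {

    /**
     * Returns the index of the first occurrence of chr at or after start
     * that is not directly preceded by a backslash, or -1 if there is none.
     */
    static int indexOfUnescaped(String s, char chr, int start) {
        for (int i = Math.max(start, 0); i < s.length(); i++) {
            boolean escaped = i > 0 && s.charAt(i - 1) == '\\';
            if (s.charAt(i) == chr && !escaped) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The string is a\,b,c: the comma at index 2 is escaped,
        // so the first unescaped comma is at index 4.
        System.out.println(indexOfUnescaped("a\\,b,c", ',', 0)); // 4
    }
}
```

This simple rule does not distinguish an escaped backslash followed by the character (as in `\\,`) from an escaped character; a stricter variant would count the preceding backslashes.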
-
unquoteAttribute
protected String unquoteAttribute(String name)
Unquotes the attribute name.
- Parameters:
- name - the name to unquote, if necessary
- Returns:
- the unquoted name
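ARFF attribute names that contain spaces are written in quotes. The sketch below shows one plausible way to unquote such a name; the exact rules used by unquoteAttribute are not stated on this page, so the handling of quote characters and backslash escapes here is an assumption. UnquoteSketch and unquote are hypothetical names.

```java
final class UnquoteSketch {

    /**
     * Strips one pair of matching surrounding quotes (single or double) and
     * replaces each backslash-escaped character by the character itself.
     * Names that are not quoted are returned trimmed but otherwise unchanged.
     */
    static String unquote(String name) {
        String s = name.trim();
        if (s.length() < 2) {
            return s;
        }
        char first = s.charAt(0);
        char last = s.charAt(s.length() - 1);
        boolean quoted = (first == '\'' || first == '"') && first == last;
        if (!quoted) {
            return s;
        }
        String inner = s.substring(1, s.length() - 1);
        StringBuilder result = new StringBuilder(inner.length());
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            // A backslash escapes the next character; keep only that character.
            if (c == '\\' && i + 1 < inner.length()) {
                i++;
                c = inner.charAt(i);
            }
            result.append(c);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(unquote("'sepal length'")); // sepal length
        System.out.println(unquote("petalwidth"));     // petalwidth
    }
}
```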
-
parseAttribute
protected Map<String,String> parseAttribute(String line)
Extracts the attribute name, type and date format from the line.
- Parameters:
- line - the line to parse
- Returns:
- the extracted data
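To make the three extracted pieces concrete, here is a simplified stand-alone sketch that splits an ARFF attribute declaration into name, type and date format. The map keys "name", "type" and "format" are assumptions, since the keys of the returned map are not documented on this page, and escaped quotes inside a quoted name are not handled. AttributeLineSketch is a hypothetical class name.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class AttributeLineSketch {

    /** Removes one pair of matching surrounding quotes, if present. */
    static String stripQuotes(String s) {
        if (s.length() >= 2) {
            char first = s.charAt(0);
            if ((first == '\'' || first == '"') && s.charAt(s.length() - 1) == first) {
                return s.substring(1, s.length() - 1);
            }
        }
        return s;
    }

    static Map<String, String> parseAttribute(String line) {
        String keyword = "@attribute";
        String rest = line.trim();
        if (!rest.regionMatches(true, 0, keyword, 0, keyword.length())) {
            throw new IllegalArgumentException("Not an attribute line: " + line);
        }
        rest = rest.substring(keyword.length()).trim();

        // The name is either quoted (and may contain spaces) or runs up to
        // the first whitespace character.
        int nameEnd = 0;
        if (!rest.isEmpty() && (rest.charAt(0) == '\'' || rest.charAt(0) == '"')) {
            int closing = rest.indexOf(rest.charAt(0), 1);
            if (closing < 0) {
                throw new IllegalArgumentException("Unterminated quote: " + line);
            }
            nameEnd = closing + 1;
        } else {
            while (nameEnd < rest.length() && !Character.isWhitespace(rest.charAt(nameEnd))) {
                nameEnd++;
            }
        }
        String name = stripQuotes(rest.substring(0, nameEnd));
        String spec = rest.substring(nameEnd).trim();

        String type;
        String format = "";
        if (spec.startsWith("{")) {
            // Nominal attribute: keep the label list verbatim as the type.
            type = spec;
        } else {
            String[] parts = spec.split("\\s+", 2);
            type = parts[0].toLowerCase(Locale.ROOT);
            if (type.equals("date") && parts.length > 1) {
                format = stripQuotes(parts[1].trim());
            }
        }

        Map<String, String> result = new LinkedHashMap<>();
        result.put("name", name);
        result.put("type", type);
        result.put("format", format);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseAttribute("@attribute 'sepal length' numeric"));
        System.out.println(parseAttribute("@attribute ts date \"yyyy-MM-dd HH:mm:ss\""));
    }
}
```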
-
createAttribute
protected weka.core.Attribute createAttribute(String line)
Creates an attribute from the specification line.
- Parameters:
- line - the line to use
- Returns:
- the attribute
-
parseSparse
protected weka.core.Instance parseSparse(weka.core.Instances header, String line) throws Exception
Parses a data row in sparse format.
- Parameters:
- header - the dataset format
- line - the line to parse
- Returns:
- the sparse instance
- Throws:
- Exception - if parsing fails
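In the ARFF sparse format, a data row is written as a brace-enclosed, comma-separated list of "index value" pairs with 0-based attribute indices. The stand-alone sketch below splits such a row into an index-to-value map. It is a simplification rather than the loader's code: it does not build a weka.core.Instance, and it does not handle quoted values that contain commas or instance weights. SparseRowSketch is a hypothetical class name.

```java
import java.util.Map;
import java.util.TreeMap;

final class SparseRowSketch {

    /** Parses "{index value, index value, ...}" into a map sorted by index. */
    static Map<Integer, String> parseSparse(String line) {
        String s = line.trim();
        if (!s.startsWith("{") || !s.endsWith("}")) {
            throw new IllegalArgumentException("Not a sparse row: " + line);
        }
        Map<Integer, String> values = new TreeMap<>();
        String body = s.substring(1, s.length() - 1).trim();
        if (body.isEmpty()) {
            return values; // no attribute values are listed explicitly
        }
        for (String pair : body.split(",")) {
            // Each pair is "<attribute index> <value>".
            String[] parts = pair.trim().split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            values.put(Integer.parseInt(parts[0]), parts[1]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseSparse("{1 X, 3 Y}")); // {1=X, 3=Y}
    }
}
```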
-
parseDense
protected weka.core.Instance parseDense(weka.core.Instances header, String line) throws Exception
Parses a dense instance.
- Parameters:
- header - the dataset header
- line - the line to parse
- Returns:
- the parsed instance
- Throws:
- Exception - if parsing fails
-
read
protected weka.core.Instances read(BufferedReader reader)
Performs the actual reading.
- Parameters:
- reader - the reader to read from
- Returns:
- the dataset, or null in case of an error
-
getStructure
public weka.core.Instances getStructure() throws IOException
Returns the structure of the dataset.
- Specified by:
- getStructure in interface weka.core.converters.Loader
- Specified by:
- getStructure in class weka.core.converters.AbstractLoader
- Returns:
- the structure
- Throws:
IOException
- if failed to read
-
getDataSet
public weka.core.Instances getDataSet() throws IOException
Returns the full dataset.
- Specified by:
- getDataSet in interface weka.core.converters.Loader
- Specified by:
- getDataSet in class weka.core.converters.AbstractLoader
- Returns:
- the dataset
- Throws:
IOException
- if failed to read
-
getNextInstance
public weka.core.Instance getNextInstance(weka.core.Instances structure) throws IOException
Not supported, since this loader only supports batch loading.
- Specified by:
- getNextInstance in interface weka.core.converters.Loader
- Specified by:
- getNextInstance in class weka.core.converters.AbstractLoader
- Parameters:
- structure - the structure
- Returns:
- the instance
- Throws:
- IOException - always
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface weka.core.RevisionHandler
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method.
- Parameters:
- args - should contain the name of an input file
-
-