Class SimpleArffLoader

  • All Implemented Interfaces:
    EncodingSupporter, Serializable, weka.core.converters.FileSourcedConverter, weka.core.converters.Loader, weka.core.EnvironmentHandler, weka.core.OptionHandler, weka.core.RevisionHandler, weka.core.WeightedInstancesHandler

    public class SimpleArffLoader
    extends weka.core.converters.AbstractFileLoader
    implements weka.core.WeightedInstancesHandler, weka.core.OptionHandler, EncodingSupporter
    A simple ARFF loader that supports batch loading only.
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_Data

        protected weka.core.Instances m_Data
        the currently loaded data.
      • m_ForceCompression

        protected boolean m_ForceCompression
        whether to force compression.
      • m_Encoding

        protected BaseCharset m_Encoding
        the encoding to use.
    • Constructor Detail

      • SimpleArffLoader

        public SimpleArffLoader()
        Initializes the loader.
    • Method Detail

      • globalInfo

        public String globalInfo()
        Description of loader.
        Returns:
        the description
      • setForceCompression

        public void setForceCompression​(boolean value)
        Sets whether the file is interpreted as a gzip-compressed ARFF file.
        Parameters:
        value - true if the file is to be treated as compressed
      • getForceCompression

        public boolean getForceCompression()
        Gets whether the file is interpreted as a gzip-compressed ARFF file.
        Returns:
        true if the file is to be treated as compressed
      • forceCompressionTipText

        public String forceCompressionTipText()
        Tip text suitable for displaying in the GUI.
        Returns:
        a description of this property as a String
      • setEncoding

        public void setEncoding​(BaseCharset value)
        Sets the encoding to use.
        Specified by:
        setEncoding in interface EncodingSupporter
        Parameters:
        value - the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
      • getEncoding

        public BaseCharset getEncoding()
        Returns the encoding to use.
        Specified by:
        getEncoding in interface EncodingSupporter
        Returns:
        the encoding, e.g. "UTF-8" or "UTF-16", empty string for default
      • encodingTipText

        public String encodingTipText()
        Returns the tip text for this property.
        Specified by:
        encodingTipText in interface EncodingSupporter
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
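        The force-compression and encoding properties described above together decide how the file's bytes are turned into text. The following is a minimal sketch of that idea using only JDK classes; it is not the loader's actual implementation. The class name ArffReaderSketch, the open and demo helpers, and the use of a plain String for the encoding (instead of BaseCharset) are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of SimpleArffLoader.
final class ArffReaderSketch {

  /**
   * Wraps the stream in a gzip decoder if requested and decodes it with the
   * given encoding; an empty or null encoding falls back to the platform default.
   */
  static BufferedReader open(InputStream in, boolean forceCompression, String encoding)
      throws IOException {
    InputStream stream = forceCompression ? new GZIPInputStream(in) : in;
    Charset charset = (encoding == null || encoding.isEmpty())
        ? Charset.defaultCharset()
        : Charset.forName(encoding);
    return new BufferedReader(new InputStreamReader(stream, charset));
  }

  /** Compresses a one-line ARFF header in memory and reads it back via open(). */
  static String demo() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
        gz.write("@relation demo\n".getBytes(StandardCharsets.UTF_8));
      }
      try (BufferedReader reader =
               open(new ByteArrayInputStream(bytes.toByteArray()), true, "UTF-8")) {
        return reader.readLine();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

        Passing false for forceCompression leaves the stream untouched, so the same helper covers plain and gzip-compressed input.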
      • listOptions

        public Enumeration<weka.core.Option> listOptions()
        Returns an enumeration of all the available options.
        Specified by:
        listOptions in interface weka.core.OptionHandler
        Returns:
        an enumeration of all available options.
      • getOptions

        public String[] getOptions()
        Gets the current option settings for the OptionHandler.
        Specified by:
        getOptions in interface weka.core.OptionHandler
        Returns:
        the list of current option settings as an array of strings
      • setOptions

        public void setOptions​(String[] options)
                        throws Exception
        Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).
        Specified by:
        setOptions in interface weka.core.OptionHandler
        Parameters:
        options - the list of options as an array of strings
        Throws:
        Exception - if an option is not supported
      • getFileExtension

        public String getFileExtension()
        Gets the file extension used for this type of file.
        Specified by:
        getFileExtension in interface weka.core.converters.FileSourcedConverter
        Returns:
        the file extension
      • getFileExtensions

        public String[] getFileExtensions()
        Gets all the file extensions used for this type of file.
        Specified by:
        getFileExtensions in interface weka.core.converters.FileSourcedConverter
        Returns:
        the file extensions
      • getFileDescription

        public String getFileDescription()
        Gets a one-line description of the type of file.
        Specified by:
        getFileDescription in interface weka.core.converters.FileSourcedConverter
        Returns:
        a description of the file type
      • reset

        public void reset()
                   throws IOException
        Resets the loader.
        Specified by:
        reset in interface weka.core.converters.Loader
        Overrides:
        reset in class weka.core.converters.AbstractFileLoader
        Throws:
        IOException
      • setFile

        public void setFile​(File file)
                     throws IOException
        Sets the file to load from (or save to).
        Specified by:
        setFile in interface weka.core.converters.FileSourcedConverter
        Overrides:
        setFile in class weka.core.converters.AbstractFileLoader
        Parameters:
        file - the file to load from
        Throws:
        IOException - if an error occurs
      • setSource

        public void setSource​(File file)
                       throws IOException
        Resets the Loader object and sets the source of the data set to be the supplied File object.
        Specified by:
        setSource in interface weka.core.converters.Loader
        Overrides:
        setSource in class weka.core.converters.AbstractFileLoader
        Parameters:
        file - the source file.
        Throws:
        IOException - if an error occurs
      • retrieveFile

        public File retrieveFile()
        Returns the current source file or destination file.
        Specified by:
        retrieveFile in interface weka.core.converters.FileSourcedConverter
        Overrides:
        retrieveFile in class weka.core.converters.AbstractFileLoader
        Returns:
        a File value
      • setUseRelativePath

        public void setUseRelativePath​(boolean rp)
        Ignored.
        Specified by:
        setUseRelativePath in interface weka.core.converters.FileSourcedConverter
        Overrides:
        setUseRelativePath in class weka.core.converters.AbstractFileLoader
        Parameters:
        rp - true if relative paths are to be used
      • removeAttributeType

        protected String removeAttributeType​(String current)
        Removes the attribute type.
        Parameters:
        current - the remainder of the attribute type string
        Returns:
        the remainder without type string
      • indexOfUnescaped

        protected int indexOfUnescaped​(String s,
                                       char chr,
                                       int start)
        Finds the index of an unescaped (i.e., not preceded by a backslash) character, searching from the provided starting position.
        Parameters:
        s - the string to analyze
        chr - the character to look for
        start - the 0-based index of the starting position
        Returns:
        the index, -1 if not found
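        The search described above can be sketched in a few lines of plain Java. This is an illustration that follows the documented wording literally ("not preceded by a backslash"); the real method may treat edge cases such as a doubled backslash differently. The class name UnescapedIndexSketch is an assumption.

```java
// Hypothetical stand-alone version of the documented behaviour.
final class UnescapedIndexSketch {

  /**
   * Returns the index of the first occurrence of chr at or after start that is
   * not directly preceded by a backslash, or -1 if there is none.
   */
  static int indexOfUnescaped(String s, char chr, int start) {
    int from = Math.max(0, start);
    while (true) {
      int pos = s.indexOf(chr, from);
      if (pos == -1)
        return -1;                      // no further occurrence
      if (pos == 0 || s.charAt(pos - 1) != '\\')
        return pos;                     // occurrence is not escaped
      from = pos + 1;                   // escaped: keep searching behind it
    }
  }
}
```

        For the input a\'b'c, a search for the single quote from position 0 skips the escaped quote at index 2 and reports index 4.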
      • unquoteAttribute

        protected String unquoteAttribute​(String name)
        Unquotes the attribute name.
        Parameters:
        name - the name to unquote, if necessary
        Returns:
        the unquoted name
      • parseAttribute

        protected Map<String,​String> parseAttribute​(String line)
        Extracts the attribute name, type and date format from the line.
        Parameters:
        line - the line to parse
        Returns:
        the extracted data
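        As an illustration of what extracting name, type and date format from an ARFF attribute line can look like, here is a minimal sketch in plain Java. The map keys "name", "type" and "format", the class name AttributeLineSketch and the stripQuotes helper are assumptions; the keys used by the actual method are not documented here. The sketch also ignores escaped quotes inside the attribute name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for lines such as: @attribute 'my attr' date "yyyy-MM-dd"
final class AttributeLineSketch {

  private static final String KEYWORD = "@attribute";

  static Map<String, String> parseAttribute(String line) {
    String rest = line.trim();
    // the keyword is matched case-insensitively
    if (!rest.regionMatches(true, 0, KEYWORD, 0, KEYWORD.length()))
      throw new IllegalArgumentException("Not an attribute line: " + line);
    rest = rest.substring(KEYWORD.length()).trim();
    if (rest.isEmpty())
      throw new IllegalArgumentException("Missing attribute name: " + line);

    Map<String, String> result = new LinkedHashMap<>();

    // attribute name: either quoted or terminated by whitespace
    char first = rest.charAt(0);
    int end;
    if (first == '\'' || first == '"') {
      end = rest.indexOf(first, 1);
      if (end == -1)
        throw new IllegalArgumentException("Unterminated quote: " + line);
      result.put("name", rest.substring(1, end));
      rest = rest.substring(end + 1).trim();
    } else {
      end = 0;
      while (end < rest.length() && !Character.isWhitespace(rest.charAt(end)))
        end++;
      result.put("name", rest.substring(0, end));
      rest = rest.substring(end).trim();
    }

    // attribute type, plus the optional format for date attributes
    if (rest.regionMatches(true, 0, "date", 0, 4)) {
      result.put("type", "date");
      String format = stripQuotes(rest.substring(4).trim());
      if (!format.isEmpty())
        result.put("format", format);
    } else {
      result.put("type", rest);
    }
    return result;
  }

  /** Removes one pair of matching surrounding quotes, if present. */
  private static String stripQuotes(String s) {
    if (s.length() >= 2) {
      char q = s.charAt(0);
      if ((q == '\'' || q == '"') && s.charAt(s.length() - 1) == q)
        return s.substring(1, s.length() - 1);
    }
    return s;
  }
}
```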
      • createAttribute

        protected weka.core.Attribute createAttribute​(String line)
        Creates an attribute from the specification line.
        Parameters:
        line - the line to use
        Returns:
        the attribute
      • parseSparse

        protected weka.core.Instance parseSparse​(weka.core.Instances header,
                                                 String line)
                                          throws Exception
        Parses a data row in sparse format.
        Parameters:
        header - the dataset format
        line - the line to parse
        Returns:
        the sparse instance
        Throws:
        Exception - if parsing fails
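        A sparse ARFF row lists only the non-default cells as index/value pairs inside braces, for example {1 X, 3 Y}. The sketch below shows one way to split such a row into a map from 0-based attribute index to raw value string. It is not the loader's implementation: the class name SparseRowSketch is an assumption, and the sketch assumes no commas inside quoted values and no trailing instance weight.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical parser for sparse rows such as: {1 X, 3 Y}
final class SparseRowSketch {

  static Map<Integer, String> parseSparse(String line) {
    String s = line.trim();
    if (!s.startsWith("{") || !s.endsWith("}"))
      throw new IllegalArgumentException("Not a sparse row: " + line);
    s = s.substring(1, s.length() - 1).trim();

    Map<Integer, String> values = new TreeMap<>();   // sorted by attribute index
    if (s.isEmpty())
      return values;                                 // "{}" means all defaults

    // simplification: a comma inside a quoted value would break this split
    for (String pair : s.split(",")) {
      String p = pair.trim();
      int space = p.indexOf(' ');
      if (space == -1)
        throw new IllegalArgumentException("Expected '<index> <value>': " + p);
      int index = Integer.parseInt(p.substring(0, space));
      values.put(index, p.substring(space + 1).trim());
    }
    return values;
  }
}
```

        Turning the raw strings into cell values then depends on the attribute types in the header, which is why the documented method also takes the dataset format.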
      • parseDense

        protected weka.core.Instance parseDense​(weka.core.Instances header,
                                                String line)
                                         throws Exception
        Parses a dense instance.
        Parameters:
        header - the dataset header
        line - the line to parse
        Returns:
        the parsed instance
        Throws:
        Exception - if parsing fails
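        A dense ARFF row is a comma-separated list of cell values in which quoted values may themselves contain commas. The following sketch shows a quote-aware split into raw cell strings; it illustrates the parsing problem rather than the loader's own code. The class name DenseRowSketch and the method splitDense are assumptions, and the sketch trims every cell, including whitespace inside quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter for dense rows such as: 5.1,'Iris, setosa',?
final class DenseRowSketch {

  static List<String> splitDense(String line) {
    List<String> cells = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    char quote = 0;                                   // 0 while outside quotes
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (quote != 0) {
        if (c == '\\' && i + 1 < line.length()) {
          current.append(line.charAt(++i));           // keep the escaped character
        } else if (c == quote) {
          quote = 0;                                  // closing quote
        } else {
          current.append(c);
        }
      } else if (c == '\'' || c == '"') {
        quote = c;                                    // opening quote
      } else if (c == ',') {
        cells.add(current.toString().trim());         // cell boundary
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    cells.add(current.toString().trim());             // last cell
    return cells;
  }
}
```

        Each resulting string still has to be interpreted against the matching attribute in the header, for instance a question mark as a missing value.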
      • read

        protected weka.core.Instances read​(BufferedReader reader)
        Performs the actual reading.
        Parameters:
        reader - the reader to read from
        Returns:
        the dataset, or null in case of an error
      • getStructure

        public weka.core.Instances getStructure()
                                         throws IOException
        Returns the structure of the dataset.
        Specified by:
        getStructure in interface weka.core.converters.Loader
        Specified by:
        getStructure in class weka.core.converters.AbstractLoader
        Returns:
        the structure
        Throws:
        IOException - if failed to read
      • getDataSet

        public weka.core.Instances getDataSet()
                                       throws IOException
        Returns the full dataset.
        Specified by:
        getDataSet in interface weka.core.converters.Loader
        Specified by:
        getDataSet in class weka.core.converters.AbstractLoader
        Returns:
        the dataset
        Throws:
        IOException - if failed to read
      • getNextInstance

        public weka.core.Instance getNextInstance​(weka.core.Instances structure)
                                           throws IOException
        Not supported.
        Specified by:
        getNextInstance in interface weka.core.converters.Loader
        Specified by:
        getNextInstance in class weka.core.converters.AbstractLoader
        Parameters:
        structure - the structure
        Returns:
        the instance
        Throws:
        IOException - always
      • getRevision

        public String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface weka.core.RevisionHandler
        Returns:
        the revision
      • main

        public static void main​(String[] args)
        Main method.
        Parameters:
        args - should contain the name of an input file.