Class SimpleCsvSpreadSheetReader

  • All Implemented Interfaces:
    AdditionalInformationHandler, Destroyable, ErrorProvider, GlobalInfoSupporter, EncodingSupporter, FileFormatHandler, LoggingLevelHandler, LoggingSupporter, LocaleSupporter, OptionHandlingLocaleSupporter, OptionHandler, SizeOfHandler, Stoppable, StoppableWithFeedback, ChunkedSpreadSheetReader, MissingValueSpreadSheetReader, NoHeaderSpreadSheetReader, SpreadSheetReader, DataRowTypeHandler, SpreadSheetTypeHandler, Serializable
    Direct Known Subclasses:
    TsvSpreadSheetReader

    public class SimpleCsvSpreadSheetReader
    extends AbstractSpreadSheetReaderWithMissingValueSupport
    implements ChunkedSpreadSheetReader, OptionHandlingLocaleSupporter, NoHeaderSpreadSheetReader
    Reads CSV files.
    It is possible to force columns to be text. In that case no intelligent parsing is attempted to determine the type of data a cell has.
    For very large files, chunking can be turned on; the reader then returns the data as a sequence of spreadsheet objects until everything has been read.
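
    To illustrate what forcing a column to text avoids, the following sketch guesses a cell's type from its raw string unless told to keep it as text. It is a simplified stand-in written in plain Java, not the parsing that this reader performs; the class name CellTypeSketch and its rules (integer first, then floating point, otherwise text) are assumptions made for the example.

```java
/** Hypothetical illustration, not ADAMS code: guesses a cell type from raw text. */
class CellTypeSketch {

  /** Returns "LONG", "DOUBLE" or "STRING" for the raw cell content. */
  static String guessType(String raw, boolean forceText) {
    if (forceText)
      return "STRING";  // forced text columns skip all type detection
    try {
      Long.parseLong(raw);
      return "LONG";
    }
    catch (NumberFormatException e) {
      // not an integer, try a floating point number next
    }
    try {
      Double.parseDouble(raw);
      return "DOUBLE";
    }
    catch (NumberFormatException e) {
      return "STRING";  // neither integer nor floating point
    }
  }
}
```

    In this sketch an identifier such as 007 is classified as a number, which would drop its leading zeros; with forceText set it stays a string.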

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -missing <java.lang.String> (property: missingValue)
        The placeholder for missing values.
        default: 
     
    -encoding <adams.core.base.BaseCharset> (property: encoding)
        The type of encoding to use when reading using a reader, leave empty for 
        default.
        default: Default
     
    -quote-char <java.lang.String> (property: quoteCharacter)
        The character to use for surrounding text cells.
        default: \"
     
    -separator <java.lang.String> (property: separator)
        The separator to use for the columns; use '\t' for tab.
        default: ,
     
    -trim <boolean> (property: trim)
        If enabled, the content of the cells gets trimmed before added.
        default: false
     
    -text-columns <adams.core.Range> (property: textColumns)
        The range of columns to treat as text.
        default: 
        example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
     
    -datetime-columns <adams.core.Range> (property: dateTimeColumns)
        The range of columns to treat as date/time msec.
        default: 
        example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
     
    -datetime-format <adams.data.DateFormatString> (property: dateTimeFormat)
        The format for date/time msecs.
        default: yyyy-MM-dd HH:mm:ss
        more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
     
    -datetime-lenient <boolean> (property: dateTimeLenient)
        Whether date/time msec parsing is lenient or not.
        default: false
     
    -datetime-type <TIME|TIME_MSEC|DATE|DATE_TIME|DATE_TIME_MSEC> (property: dateTimeType)
        How to interpret the date/time data.
        default: DATE_TIME
     
    -time-zone <java.util.TimeZone> (property: timeZone)
        The time zone to use for interpreting dates/times; default is the system-wide 
        defined one.
     
    -locale <java.util.Locale> (property: locale)
        The locale to use for parsing the numbers.
        default: Default
     
    -no-header <boolean> (property: noHeader)
        If enabled, all rows get added as data rows and a dummy header will get 
        inserted.
        default: false
     
    -custom-column-headers <java.lang.String> (property: customColumnHeaders)
        The custom headers to use for the columns instead (comma-separated list);
         ignored if empty.
        default: 
     
    -chunk-size <int> (property: chunkSize)
        The maximum number of rows per chunk; using -1 will read all data into 
        a single spreadsheet object.
        default: -1
        minimum: -1
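
    The range syntax described for -text-columns and -datetime-columns can be illustrated with a small sketch. It is not the adams.core.Range implementation: the class RangeSketch is hypothetical, and it only handles single 1-based indices, 'start-end' sub-ranges and the placeholders first and last. The other placeholders and 'inv(...)' are left out, and treating an empty range as "no columns" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the range syntax; not the adams.core.Range class. */
class RangeSketch {

  /** Resolves a single 1-based index or the placeholders "first"/"last". */
  static int index(String s, int numCols) {
    s = s.trim();
    if (s.equals("first"))
      return 1;
    if (s.equals("last"))
      return numCols;
    return Integer.parseInt(s);
  }

  /** Expands e.g. "1,3-5,last" into the matching 0-based column indices. */
  static List<Integer> expand(String range, int numCols) {
    List<Integer> result = new ArrayList<>();
    if (range.trim().isEmpty())
      return result;  // assumption: an empty range selects nothing
    for (String part : range.split(",")) {
      String[] bounds = part.split("-");
      int from = index(bounds[0], numCols);
      int to = index(bounds[bounds.length - 1], numCols);
      for (int i = from; i <= to; i++) {
        // keep only valid columns and avoid duplicates
        if (i >= 1 && i <= numCols && !result.contains(i - 1))
          result.add(i - 1);
      }
    }
    return result;
  }
}
```

    With six columns, the range 1,3-5,last selects the first column, the third to fifth columns and the sixth column.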
     
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_QuoteCharacter

        protected String m_QuoteCharacter
        the quote character.
      • m_Separator

        protected String m_Separator
        the column separator.
      • m_TextColumns

        protected Range m_TextColumns
        the columns to treat as text.
      • m_DateTimeColumns

        protected Range m_DateTimeColumns
        the columns to treat as date/time.
      • m_DateTimeFormat

        protected DateFormatString m_DateTimeFormat
        the format string for the date/times.
      • m_DateTimeLenient

        protected boolean m_DateTimeLenient
        whether date/time parsing is lenient.
      • m_TimeZone

        protected TimeZone m_TimeZone
        the timezone to use.
      • m_Locale

        protected Locale m_Locale
        the locale to use.
      • m_NoHeader

        protected boolean m_NoHeader
        whether the file has no header row.
      • m_CustomColumnHeaders

        protected String m_CustomColumnHeaders
        the comma-separated list of column header names.
      • m_ChunkSize

        protected int m_ChunkSize
        the chunk size to use.
      • m_Trim

        protected boolean m_Trim
        whether to trim the cells.
    • Constructor Detail

      • SimpleCsvSpreadSheetReader

        public SimpleCsvSpreadSheetReader()
    • Method Detail

      • getDefaultSeparator

        protected String getDefaultSeparator()
        Returns the default separator.
        Returns:
        the default
      • setQuoteCharacter

        public void setQuoteCharacter(String value)
        Sets the character used for surrounding text.
        Parameters:
        value - the quote character
      • getQuoteCharacter

        public String getQuoteCharacter()
        Returns the character used for surrounding text.
        Returns:
        the quote character
      • quoteCharacterTipText

        public String quoteCharacterTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setSeparator

        public void setSeparator(String value)
        Sets the string to use as separator for the columns, use '\t' for tab.
        Parameters:
        value - the separator
      • getSeparator

        public String getSeparator()
        Returns the string used as separator for the columns, '\t' for tab.
        Returns:
        the separator
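
        How the separator and the quote character interact can be shown with a short sketch that splits a single line into cells. It is an illustration in plain Java, not this reader's parsing code: the class CellSplitSketch is hypothetical, and treating a doubled quote character inside a quoted cell as a literal quote is an assumed escaping convention. The tab separator is passed as an actual tab character here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration, not ADAMS code: splits one line into cells. */
class CellSplitSketch {

  /** Splits a line on the separator, ignoring separators inside quoted cells. */
  static List<String> split(String line, String separator, char quote) {
    if (separator.isEmpty())
      throw new IllegalArgumentException("Separator must not be empty");
    List<String> cells = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean quoted = false;
    int i = 0;
    while (i < line.length()) {
      char c = line.charAt(i);
      if (c == quote) {
        if (quoted && i + 1 < line.length() && line.charAt(i + 1) == quote) {
          current.append(quote);  // doubled quote inside a quoted cell (assumed escape)
          i += 2;
          continue;
        }
        quoted = !quoted;  // opening or closing quote, not part of the content
        i++;
      }
      else if (!quoted && line.startsWith(separator, i)) {
        cells.add(current.toString());  // separator outside quotes ends the cell
        current.setLength(0);
        i += separator.length();
      }
      else {
        current.append(c);
        i++;
      }
    }
    cells.add(current.toString());  // the last cell has no trailing separator
    return cells;
  }
}
```

        A call such as CellSplitSketch.split(line, "\t", '"') handles tab-separated input in the same way.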
      • separatorTipText

        public String separatorTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setTextColumns

        public void setTextColumns(Range value)
        Sets the range of columns to treat as text.
        Parameters:
        value - the range
      • getTextColumns

        public Range getTextColumns()
        Returns the range of columns to treat as text.
        Returns:
        the range
      • textColumnsTipText

        public String textColumnsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setDateTimeColumns

        public void setDateTimeColumns(Range value)
        Sets the range of columns to treat as date/time msec.
        Parameters:
        value - the range
      • getDateTimeColumns

        public Range getDateTimeColumns()
        Returns the range of columns to treat as date/time msec.
        Returns:
        the range
      • dateTimeColumnsTipText

        public String dateTimeColumnsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setDateTimeFormat

        public void setDateTimeFormat(DateFormatString value)
        Sets the format for date/time msec columns.
        Parameters:
        value - the format
      • getDateTimeFormat

        public DateFormatString getDateTimeFormat()
        Returns the format for date/time msec columns.
        Returns:
        the format
      • dateTimeFormatTipText

        public String dateTimeFormatTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setDateTimeLenient

        public void setDateTimeLenient(boolean value)
        Sets whether parsing of date/time msecs is to be lenient or not.
        Parameters:
        value - if true lenient parsing is used, otherwise not
        See Also:
        DateFormat.setLenient(boolean)
      • isDateTimeLenient

        public boolean isDateTimeLenient()
        Returns whether the parsing of date/time msecs is lenient or not.
        Returns:
        true if parsing is lenient
        See Also:
        DateFormat.isLenient()
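
        The effect of lenient versus strict parsing can be seen with java.text.SimpleDateFormat directly, which is the class the -datetime-format option links to. The sketch below is an illustration and not code from this reader; the class name LenientSketch and the fixed UTC time zone are choices made for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Illustration using java.text.SimpleDateFormat directly; not ADAMS code. */
class LenientSketch {

  /** Parses the text with the given pattern, returns null if it is rejected. */
  static Date parse(String text, String pattern, boolean lenient) {
    SimpleDateFormat format = new SimpleDateFormat(pattern);
    format.setTimeZone(TimeZone.getTimeZone("UTC"));  // fixed zone for repeatable results
    format.setLenient(lenient);
    try {
      return format.parse(text);
    }
    catch (ParseException e) {
      return null;  // text does not match the pattern or is not a valid date
    }
  }
}
```

        With the pattern yyyy-MM-dd HH:mm:ss, strict parsing rejects 2021-02-30 12:00:00, whereas lenient parsing rolls the invalid day over and yields the same instant as 2021-03-02 12:00:00.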
      • dateTimeLenientTipText

        public String dateTimeLenientTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setDateTimeType

        public void setDateTimeType(BasicDateTimeType value)
        Sets the type for date/time columns.
        Parameters:
        value - the type
      • getDateTimeType

        public BasicDateTimeType getDateTimeType()
        Returns the type for date/time columns.
        Returns:
        the type
      • dateTimeTypeTipText

        public String dateTimeTypeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setTimeZone

        public void setTimeZone(TimeZone value)
        Sets the time zone to use.
        Parameters:
        value - the time zone
      • getTimeZone

        public TimeZone getTimeZone()
        Returns the time zone in use.
        Returns:
        the time zone
      • timeZoneTipText

        public String timeZoneTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • setLocale

        public void setLocale(Locale value)
        Sets the locale to use.
        Specified by:
        setLocale in interface LocaleSupporter
        Parameters:
        value - the locale
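
        Why the locale matters for parsing numbers can be illustrated with java.text.NumberFormat. This is a sketch of the general idea and not necessarily how this reader parses cells; the class name LocaleNumberSketch and the rule that the whole text must be consumed are assumptions made for the example.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

/** Illustration using java.text.NumberFormat directly; not ADAMS code. */
class LocaleNumberSketch {

  /** Parses the whole text as a number in the given locale, null if that fails. */
  static Double parse(String text, Locale locale) {
    NumberFormat format = NumberFormat.getInstance(locale);
    ParsePosition pos = new ParsePosition(0);
    Number number = format.parse(text, pos);
    if (number == null || pos.getIndex() != text.length())
      return null;  // nothing parsed, or trailing characters left over
    return number.doubleValue();
  }
}
```

        The text 1.234,5 parsed with Locale.GERMANY and the text 1,234.5 parsed with Locale.US both give 1234.5, since the two locales swap the roles of the decimal and grouping separators.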
      • localeTipText

        public String localeTipText()
        Returns the tip text for this property.
        Specified by:
        localeTipText in interface OptionHandlingLocaleSupporter
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setNoHeader

        public void setNoHeader(boolean value)
        Sets whether the file contains a header row or not.
        Specified by:
        setNoHeader in interface NoHeaderSpreadSheetReader
        Parameters:
        value - true if no header row available
      • getNoHeader

        public boolean getNoHeader()
        Returns whether the file contains a header row or not.
        Specified by:
        getNoHeader in interface NoHeaderSpreadSheetReader
        Returns:
        true if no header row available
      • noHeaderTipText

        public String noHeaderTipText()
        Returns the tip text for this property.
        Specified by:
        noHeaderTipText in interface NoHeaderSpreadSheetReader
        Returns:
        tip text for this property suitable for displaying in the gui
      • setChunkSize

        public void setChunkSize(int value)
        Sets the maximum chunk size.
        Specified by:
        setChunkSize in interface ChunkedSpreadSheetReader
        Parameters:
        value - the size of the chunks, < 1 denotes infinity
      • getChunkSize

        public int getChunkSize()
        Returns the current chunk size.
        Specified by:
        getChunkSize in interface ChunkedSpreadSheetReader
        Returns:
        the size of the chunks, < 1 denotes infinity
      • chunkSizeTipText

        public String chunkSizeTipText()
        Returns the tip text for this property.
        Specified by:
        chunkSizeTipText in interface ChunkedSpreadSheetReader
        Returns:
        tip text for this property suitable for displaying in the gui
      • setTrim

        public void setTrim(boolean value)
        Sets whether to trim the cell content.
        Parameters:
        value - if true the content gets trimmed
      • getTrim

        public boolean getTrim()
        Returns whether to trim the cell content.
        Returns:
        true if to trim content
      • trimTipText

        public String trimTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the gui
      • hasMoreChunks

        public boolean hasMoreChunks()
        Checks whether there is more data to read.
        Specified by:
        hasMoreChunks in interface ChunkedSpreadSheetReader
        Returns:
        true if there is more data available
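
        The chunking idea behind hasMoreChunks and the chunk size option can be sketched without the library. The class RowChunker below is hypothetical and works on plain strings instead of spreadsheet objects; its nextRows method is an invented name for the example, and only the rule that a size below 1 means "everything at once" is taken from the option description.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not part of ADAMS: hands out rows in bounded chunks. */
class RowChunker {
  private final Iterator<String> rows;
  private final int chunkSize;

  RowChunker(Iterator<String> rows, int chunkSize) {
    this.rows = rows;
    this.chunkSize = chunkSize;
  }

  /** True while unread rows remain. */
  boolean hasMoreChunks() {
    return rows.hasNext();
  }

  /** Returns the next chunk; a chunk size below 1 returns all remaining rows. */
  List<String> nextRows() {
    List<String> chunk = new ArrayList<>();
    while (rows.hasNext() && (chunkSize < 1 || chunk.size() < chunkSize))
      chunk.add(rows.next());
    return chunk;
  }
}
```

        A caller would loop while hasMoreChunks() returns true and process each chunk before requesting the next, so that only one chunk needs to be held in memory at a time.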
      • setLastError

        protected void setLastError(String value)
        Sets the value for the last error that occurred during read.
        Overrides:
        setLastError in class AbstractSpreadSheetReader
        Parameters:
        value - the error string, null if none occurred