Class CsvSpreadSheetReader.ChunkReader

    • Field Detail

      • DOUBLED_UP_QUOTES

        public static final String[] DOUBLED_UP_QUOTES
        the doubled up quotes to replace.
      • COLLAPSED_QUOTES

        public static final char[] COLLAPSED_QUOTES
        the replacement for doubled up quotes.
      • m_MissingValue

        protected BaseRegExp m_MissingValue
        the missing value.
      • m_HasTextCols

        protected boolean m_HasTextCols
        whether any text columns are defined.
      • m_TextCols

        protected gnu.trove.set.hash.TIntHashSet m_TextCols
        the text column indices.
      • m_HasDateCols

        protected boolean m_HasDateCols
        whether any date columns are defined.
      • m_DateCols

        protected gnu.trove.set.hash.TIntHashSet m_DateCols
        the date column indices.
      • m_HasDateTimeCols

        protected boolean m_HasDateTimeCols
        whether any date/time columns are defined.
      • m_DateTimeCols

        protected gnu.trove.set.hash.TIntHashSet m_DateTimeCols
        the date/time column indices.
      • m_HasDateTimeMsecCols

        protected boolean m_HasDateTimeMsecCols
        whether any date/time msec columns are defined.
      • m_DateTimeMsecCols

        protected gnu.trove.set.hash.TIntHashSet m_DateTimeMsecCols
        the date/time msec column indices.
      • m_HasTimeCols

        protected boolean m_HasTimeCols
        whether any time columns are defined.
      • m_HasTimeMsecCols

        protected boolean m_HasTimeMsecCols
        whether any time/msec columns are defined.
      • m_TimeCols

        protected gnu.trove.set.hash.TIntHashSet m_TimeCols
        the time column indices.
      • m_TimeMsecCols

        protected gnu.trove.set.hash.TIntHashSet m_TimeMsecCols
        the time/msec column indices.
      • m_DateFormat

        protected DateFormat m_DateFormat
        the date format.
      • m_DateTimeFormat

        protected DateFormat m_DateTimeFormat
        the date/time format.
      • m_DateTimeMsecFormat

        protected DateFormat m_DateTimeMsecFormat
        the date/time msec format.
      • m_TimeFormat

        protected DateFormat m_TimeFormat
        the time format.
      • m_TimeMsecFormat

        protected DateFormat m_TimeMsecFormat
        the time/msec format.
      • m_NumberFormat

        protected NumberFormat m_NumberFormat
        the number format.
      • m_ChunkSize

        protected int m_ChunkSize
        the chunk size.
      • m_QuoteChar

        protected char m_QuoteChar
        the quote char.
      • m_Separator

        protected char m_Separator
        the column separator.
      • m_Comment

        protected String m_Comment
        the comment string.
      • m_Trim

        protected boolean m_Trim
        whether to trim the cells.
      • m_HeaderCells

        protected List<String> m_HeaderCells
        the header cells to use.
      • m_LastChar

        protected char m_LastChar
        the last character read, kept because the reader read one character too far.
      • m_RowCount

        protected int m_RowCount
        the number of rows read so far.
      • m_FirstRow

        protected int m_FirstRow
        the first row to retrieve (1-based).
      • m_NumRows

        protected int m_NumRows
        the number of rows to retrieve (less than 1 = unlimited).
      • m_NumRowsAuto

        protected int m_NumRowsAuto
        the number of rows to use for automatically determining the column types.
      • m_ParseFormulas

        protected boolean m_ParseFormulas
        whether to parse formula-like cells.
      • m_SkipDifferingRows

        protected boolean m_SkipDifferingRows
        whether to drop rows with too few or too many cells.
      • m_AutoTypes

        protected Cell.ContentType[] m_AutoTypes
        the automatically determined column types.
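
        The m_FirstRow and m_NumRows fields above describe a row window: a 1-based start row and a row count where any value below 1 means unlimited. The following is a minimal sketch of how such a window could be evaluated. The class RowWindowSketch and the method inWindow are hypothetical names used for illustration only; they are not part of ChunkReader, and the real reader may apply the window differently.

```java
final class RowWindowSketch {

    /**
     * Decides whether a row falls inside the requested window.
     * Assumption: rowIndex and firstRow are both 1-based, and
     * numRows < 1 means "no upper limit", as described for m_NumRows.
     */
    static boolean inWindow(int rowIndex, int firstRow, int numRows) {
        if (rowIndex < firstRow) {
            return false;                 // before the first requested row
        }
        if (numRows < 1) {
            return true;                  // unlimited number of rows
        }
        return rowIndex < firstRow + numRows;
    }

    public static void main(String[] args) {
        // Window starting at row 2 and spanning 3 rows covers rows 2, 3 and 4.
        for (int row = 1; row <= 5; row++) {
            System.out.println(row + " -> " + inWindow(row, 2, 3));
        }
    }
}
```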
    • Constructor Detail

      • ChunkReader

        public ChunkReader(CsvSpreadSheetReader owner)
        Initializes the low-level reader.
        Parameters:
        owner - the owning reader
    • Method Detail

      • unquote

        protected String unquote(String s)
        Unquotes the given string.
        Parameters:
        s - the string to unquote, if necessary
        Returns:
        the processed string
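
        Unquoting, as described above, typically means stripping the surrounding quote characters and collapsing doubled-up quotes inside the cell. The sketch below illustrates that idea in isolation. The array contents are assumed values (this page does not list the actual contents of DOUBLED_UP_QUOTES and COLLAPSED_QUOTES), the extra quoteChar parameter stands in for m_QuoteChar, and the class UnquoteSketch is hypothetical.

```java
final class UnquoteSketch {

    // Assumed contents; the real constants are not shown on this page.
    static final String[] DOUBLED_UP_QUOTES = {"\"\"", "''"};
    static final char[] COLLAPSED_QUOTES = {'"', '\''};

    /** Strips enclosing quotes and collapses doubled-up quotes, if s is quoted. */
    static String unquote(String s, char quoteChar) {
        if (s == null || s.length() < 2) {
            return s;                                  // too short to be quoted
        }
        if (s.charAt(0) != quoteChar || s.charAt(s.length() - 1) != quoteChar) {
            return s;                                  // not quoted, leave as is
        }
        String inner = s.substring(1, s.length() - 1);
        for (int i = 0; i < DOUBLED_UP_QUOTES.length; i++) {
            if (COLLAPSED_QUOTES[i] == quoteChar) {
                inner = inner.replace(DOUBLED_UP_QUOTES[i],
                                      String.valueOf(COLLAPSED_QUOTES[i]));
            }
        }
        return inner;
    }

    public static void main(String[] args) {
        System.out.println(unquote("\"say \"\"hi\"\"\"", '"'));  // prints: say "hi"
    }
}
```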
      • removeTrailingCR

        protected void removeTrailingCR(StringBuilder current)
        Removes a trailing CR.
        Parameters:
        current - the current buffer
      • addCell

        protected void addCell(StringBuilder current,
                               List<String> cells)
        Adds the current string to the cells.
        Parameters:
        current - the current string
        cells - the cells to add to
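
        The two buffer helpers above, removeTrailingCR and addCell, can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: in particular, clearing the buffer after adding the cell is an assumption, and any trimming or unquoting the real addCell may perform is left out. The class CellHelpersSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CellHelpersSketch {

    /** Drops a single trailing carriage return, e.g. the CR of a CRLF line ending. */
    static void removeTrailingCR(StringBuilder current) {
        int len = current.length();
        if (len > 0 && current.charAt(len - 1) == '\r') {
            current.deleteCharAt(len - 1);
        }
    }

    /** Appends the buffer content as a cell; resetting the buffer is an assumption. */
    static void addCell(StringBuilder current, List<String> cells) {
        cells.add(current.toString());
        current.setLength(0);
    }

    public static void main(String[] args) {
        StringBuilder buffer = new StringBuilder("last cell\r");
        List<String> cells = new ArrayList<>();
        removeTrailingCR(buffer);
        addCell(buffer, cells);
        System.out.println(cells);
    }
}
```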
      • readCells

        protected List<String> readCells(Reader reader)
                                  throws IOException
        Reads a row and breaks it up into cells.
        Parameters:
        reader - the reader to read from
        Returns:
        the cells, null if nothing could be read (EOF)
        Throws:
        IOException - if reading fails, e.g., due to IO error
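
        To illustrate what reading a row and breaking it into cells involves, here is a simplified, self-contained sketch. It assumes a single-character separator and quote character (passed as parameters in place of m_Separator and m_QuoteChar), keeps the quotes in the cell text so that unquoting stays a separate step, and does not model comments, trimming, or the m_LastChar look-ahead. The class ReadCellsSketch and the helper readAll are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ReadCellsSketch {

    /** Reads one row; returns null if nothing could be read (EOF). */
    static List<String> readCells(Reader reader, char separator, char quoteChar)
            throws IOException {
        List<String> cells = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean readAny = false;
        int c;
        while ((c = reader.read()) != -1) {
            readAny = true;
            char ch = (char) c;
            if (ch == quoteChar) {
                inQuotes = !inQuotes;          // doubled-up quotes toggle twice
                current.append(ch);            // quotes are kept for later unquoting
            } else if (ch == separator && !inQuotes) {
                cells.add(current.toString()); // cell boundary outside quotes
                current.setLength(0);
            } else if (ch == '\n' && !inQuotes) {
                break;                         // end of row
            } else {
                current.append(ch);
            }
        }
        if (!readAny) {
            return null;                       // EOF before any character
        }
        int len = current.length();
        if (len > 0 && current.charAt(len - 1) == '\r') {
            current.deleteCharAt(len - 1);     // strip the CR of a CRLF ending
        }
        cells.add(current.toString());
        return cells;
    }

    /** Convenience wrapper for in-memory text: reads rows until EOF. */
    static List<List<String>> readAll(String text, char separator, char quoteChar) {
        List<List<String>> rows = new ArrayList<>();
        try (Reader reader = new StringReader(text)) {
            List<String> cells;
            while ((cells = readCells(reader, separator, quoteChar)) != null) {
                rows.add(cells);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(readAll("a,\"b,c\",d\r\nx,y\n", ',', '"'));
    }
}
```

        Because separators and line breaks are only honoured outside quotes, a quoted cell may contain both.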
      • next

        public SpreadSheet next()
        Reads the next chunk.
        Returns:
        the next chunk
      • close

        protected void close()
        Closes the reader.
      • hasNext

        public boolean hasNext()
        Returns whether there is more data to be read.
        Returns:
        true if more data is available
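
        The next() and hasNext() methods above follow the usual iterator pattern for chunked reading: keep calling next() while hasNext() reports more data. The sketch below shows that pattern with plain lists of strings standing in for SpreadSheet chunks. The classes ChunkIterationSketch and Chunks are hypothetical, and the chunkSize parameter stands in for m_ChunkSize.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class ChunkIterationSketch {

    /** Hands out rows in blocks of at most chunkSize rows. */
    static final class Chunks implements Iterator<List<String>> {
        private final Iterator<String> rows;
        private final int chunkSize;

        Chunks(List<String> rows, int chunkSize) {
            if (chunkSize < 1) {
                throw new IllegalArgumentException("chunkSize must be at least 1");
            }
            this.rows = rows.iterator();
            this.chunkSize = chunkSize;
        }

        @Override
        public boolean hasNext() {
            return rows.hasNext();             // more data as long as rows remain
        }

        @Override
        public List<String> next() {
            if (!rows.hasNext()) {
                throw new NoSuchElementException();
            }
            List<String> chunk = new ArrayList<>();
            while (rows.hasNext() && chunk.size() < chunkSize) {
                chunk.add(rows.next());
            }
            return chunk;
        }
    }

    public static void main(String[] args) {
        Chunks chunks = new Chunks(List.of("a", "b", "c", "d", "e"), 2);
        while (chunks.hasNext()) {
            System.out.println(chunks.next());
        }
    }
}
```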
      • read

        public SpreadSheet read(Reader r)
        Reads the spreadsheet content from the specified reader.
        Parameters:
        r - the reader to read from
        Returns:
        the spreadsheet or null in case of an error