Package adams.data.io.input
Class CsvSpreadSheetReader
-
- All Implemented Interfaces:
AdditionalInformationHandler, Destroyable, ErrorProvider, GlobalInfoSupporter, EncodingSupporter, FileFormatHandler, LoggingLevelHandler, LoggingSupporter, LocaleSupporter, OptionHandlingLocaleSupporter, OptionHandler, SizeOfHandler, Stoppable, StoppableWithFeedback, ChunkedSpreadSheetReader, InitialRowSkippingSpreadSheetReader, MissingValueSpreadSheetReader, NoHeaderSpreadSheetReader, SpreadSheetReader, WindowedSpreadSheetReader, DataRowTypeHandler, SpreadSheetTypeHandler, Serializable
public class CsvSpreadSheetReader extends AbstractSpreadSheetReaderWithMissingValueSupport implements ChunkedSpreadSheetReader, OptionHandlingLocaleSupporter, WindowedSpreadSheetReader, NoHeaderSpreadSheetReader, InitialRowSkippingSpreadSheetReader
Reads CSV files.
Columns can be forced to be treated as text, in which case no attempt is made to determine the data type of their cells.
For very large files, chunking can be turned on; the reader then returns one spreadsheet object per chunk until all the data has been read.
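As a rough illustration of the chunking idea, the sketch below hands out rows from a `Reader` in fixed-size chunks using only the Java standard library. It is not the ADAMS implementation: the class `ChunkSketch` is a hypothetical name, its methods merely echo `hasMoreChunks()` and `nextChunk()`, and the naive comma split ignores quoting and header handling.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of chunk-wise reading; not part of ADAMS. */
final class ChunkSketch {
  private final BufferedReader reader;
  private final int chunkSize;
  private String pending; // next unread line, or null at end of input

  ChunkSketch(Reader r, int chunkSize) {
    this.reader = new BufferedReader(r);
    this.chunkSize = chunkSize;
    this.pending = readLine();
  }

  /** Reads one line, converting the checked IOException into an unchecked one. */
  private String readLine() {
    try {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** True while unread rows remain. */
  boolean hasMoreChunks() {
    return pending != null;
  }

  /** Returns up to chunkSize rows; a chunk size of -1 returns all remaining rows. */
  List<String[]> nextChunk() {
    List<String[]> rows = new ArrayList<>();
    while (pending != null && (chunkSize < 0 || rows.size() < chunkSize)) {
      rows.add(pending.split(",", -1)); // naive split, no quote handling
      pending = readLine();
    }
    return rows;
  }

  public static void main(String[] args) {
    ChunkSketch cs = new ChunkSketch(new StringReader("1,a\n2,b\n3,c\n"), 2);
    while (cs.hasMoreChunks())
      System.out.println(cs.nextChunk().size() + " row(s)");
  }
}
```

With three input rows and a chunk size of 2, the loop yields a chunk of two rows followed by a chunk of one row.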
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING
-data-row-type <adams.data.spreadsheet.DataRow> (property: dataRowType) The type of row to use for the data. default: adams.data.spreadsheet.DenseDataRow
-spreadsheet-type <adams.data.spreadsheet.SpreadSheet> (property: spreadSheetType) The type of spreadsheet to use for the data. default: adams.data.spreadsheet.DefaultSpreadSheet
-missing <adams.core.base.BaseRegExp> (property: missingValue) The placeholder for missing values. default: ^(\\\\?|)$
-encoding <adams.core.base.BaseCharset> (property: encoding) The type of encoding to use when reading using a reader, leave empty for default. default: Default
-comment <java.lang.String> (property: comment) The string denoting the start of a line comment (comments can only precede header row). default: #
-quote-char <java.lang.String> (property: quoteCharacter) The character to use for surrounding text cells. default: \"
-separator <java.lang.String> (property: separator) The separator to use for the columns; use '\t' for tab. default: ,
-trim <boolean> (property: trim) If enabled, the content of the cells gets trimmed before being added. default: false
-text-columns <adams.core.Range> (property: textColumns) The range of columns to treat as text. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-date-columns <adams.core.Range> (property: dateColumns) The range of columns to treat as date. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-date-format <adams.data.DateFormatString> (property: dateFormat) The format for dates. default: yyyy-MM-dd more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
-date-lenient <boolean> (property: dateLenient) Whether date parsing is lenient or not. default: false
-datetime-columns <adams.core.Range> (property: dateTimeColumns) The range of columns to treat as date/time. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-datetime-format <adams.data.DateFormatString> (property: dateTimeFormat) The format for date/times. default: yyyy-MM-dd HH:mm:ss more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
-datetime-lenient <boolean> (property: dateTimeLenient) Whether date/time parsing is lenient or not. default: false
-datetimemsec-columns <adams.core.Range> (property: dateTimeMsecColumns) The range of columns to treat as date/time msec. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-datetimemsec-format <adams.data.DateFormatString> (property: dateTimeMsecFormat) The format for date/time msecs. default: yyyy-MM-dd HH:mm:ss.SSS more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
-datetimemsec-lenient <boolean> (property: dateTimeMsecLenient) Whether date/time msec parsing is lenient or not. default: false
-time-columns <adams.core.Range> (property: timeColumns) The range of columns to treat as time. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-time-format <adams.data.DateFormatString> (property: timeFormat) The format for times. default: HH:mm:ss more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
-time-lenient <boolean> (property: timeLenient) Whether time parsing is lenient or not. default: false
-time-msec-columns <adams.core.Range> (property: timeMsecColumns) The range of columns to treat as time/msec. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-time-msec-format <adams.data.DateFormatString> (property: timeMsecFormat) The format for times/msec. default: HH:mm:ss.SSS more: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
-time-msec-lenient <boolean> (property: timeMsecLenient) Whether time/msec parsing is lenient or not. default: false
-time-zone <java.util.TimeZone> (property: timeZone) The time zone to use for interpreting dates/times; default is the system-wide defined one.
-locale <java.util.Locale> (property: locale) The locale to use for parsing the numbers. default: Default
-no-header <boolean> (property: noHeader) If enabled, all rows get added as data rows and a dummy header will get inserted. default: false
-custom-column-headers <java.lang.String> (property: customColumnHeaders) The custom headers to use for the columns instead (comma-separated list); ignored if empty. default:
-first-row <int> (property: firstRow) The index of the first row to retrieve (1-based). default: 1 minimum: 1
-num-rows <int> (property: numRows) The number of data rows to retrieve; use -1 for unlimited. default: -1 minimum: -1
-num-rows-col-type-discovery <int> (property: numRowsColumnTypeDiscovery) The number of data rows to use for automatically determining the column types (speeds up reading of large files with consistent cell types); use 0 to turn the feature off. default: 0 minimum: 0
-chunk-size <int> (property: chunkSize) The maximum number of rows per chunk; using -1 will put all data into a single spreadsheet object. default: -1 minimum: -1
-parse-formulas <boolean> (property: parseFormulas) Whether to try parsing formula-like cells. default: true
-skip-differing-rows <boolean> (property: skipDifferingRows) If enabled, skips rows that have either too many or too few cells compared to the header row. default: false
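The default of the -missing option can be read as a regular expression that matches either a single question mark or an empty cell. The sketch below assumes that the documented default is an escaped rendering of the pattern `^(\?|)$`; that reading, and the class name `MissingValueSketch`, are assumptions for illustration, not taken from the ADAMS sources.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of missing-value detection; not part of ADAMS. */
final class MissingValueSketch {
  // Assumption: the documented default unescapes to ^(\?|)$
  static final Pattern MISSING = Pattern.compile("^(\\?|)$");

  /** Returns true if the whole cell matches the missing-value pattern. */
  static boolean isMissing(String cell) {
    return MISSING.matcher(cell).matches();
  }

  public static void main(String[] args) {
    for (String cell : new String[] {"?", "", "0", "??"})
      System.out.println("'" + cell + "' missing: " + isMissing(cell));
  }
}
```

Under that assumption, "?" and the empty string count as missing, while "0" and "??" do not.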
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | CsvSpreadSheetReader.ChunkReader | Reads CSV files chunk by chunk.
-
Nested classes/interfaces inherited from class adams.data.io.input.AbstractSpreadSheetReader
AbstractSpreadSheetReader.InputType
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected int | m_ChunkSize | the chunk size to use.
protected String | m_Comment | the line comment.
protected String | m_CustomColumnHeaders | the comma-separated list of column header names.
protected Range | m_DateColumns | the columns to treat as date.
protected DateFormatString | m_DateFormat | the format string for the dates.
protected boolean | m_DateLenient | whether date parsing is lenient.
protected Range | m_DateTimeColumns | the columns to treat as date/time.
protected DateFormatString | m_DateTimeFormat | the format string for the date/times.
protected boolean | m_DateTimeLenient | whether date/time parsing is lenient.
protected Range | m_DateTimeMsecColumns | the columns to treat as date/time msec.
protected DateFormatString | m_DateTimeMsecFormat | the format string for the date/time msecs.
protected boolean | m_DateTimeMsecLenient | whether date/time msec parsing is lenient.
protected int | m_FirstRow | the first row to retrieve (1-based).
protected Locale | m_Locale | the locale to use.
protected boolean | m_NoHeader | whether the file has a header or not.
protected int | m_NumRows | the number of rows to retrieve (less than 1 = unlimited).
protected int | m_NumRowsColumnTypeDiscovery | the number of rows to use for automatic discovery of column types (0 = off).
protected boolean | m_ParseFormulas | whether to parse formulas.
protected String | m_QuoteCharacter | the quote character.
protected CsvSpreadSheetReader.ChunkReader | m_Reader | for reading the actual data.
protected String | m_Separator | the column separator.
protected boolean | m_SkipDifferingRows | whether to drop rows with too few or too many cells.
protected int | m_SkipNumRows | the number of initial rows to skip.
protected Range | m_TextColumns | the columns to treat as text.
protected Range | m_TimeColumns | the columns to treat as time.
protected DateFormatString | m_TimeFormat | the format string for the times.
protected boolean | m_TimeLenient | whether time parsing is lenient.
protected Range | m_TimeMsecColumns | the columns to treat as time/msec.
protected DateFormatString | m_TimeMsecFormat | the format string for the times/msec.
protected boolean | m_TimeMsecLenient | whether time/msec parsing is lenient.
protected TimeZone | m_TimeZone | the timezone to use.
protected boolean | m_Trim | whether to trim the cells.
-
Fields inherited from class adams.data.io.input.AbstractSpreadSheetReaderWithMissingValueSupport
m_MissingValue
-
Fields inherited from class adams.data.io.input.AbstractSpreadSheetReader
m_DataRowType, m_Encoding, m_LastError, m_SpreadSheetType, m_Stopped, OPTION_INPUT, OPTION_OUTPUT
-
Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager
-
Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel
-
-
Constructor Summary
Constructors Constructor Description CsvSpreadSheetReader()
-
Method Summary
Modifier and Type | Method | Description
String | chunkSizeTipText() | Returns the tip text for this property.
String | commentTipText() | Returns the tip text for this property.
String | customColumnHeadersTipText() | Returns the tip text for this property.
String | dateColumnsTipText() | Returns the tip text for this property.
String | dateFormatTipText() | Returns the tip text for this property.
String | dateLenientTipText() | Returns the tip text for this property.
String | dateTimeColumnsTipText() | Returns the tip text for this property.
String | dateTimeFormatTipText() | Returns the tip text for this property.
String | dateTimeLenientTipText() | Returns the tip text for this property.
String | dateTimeMsecColumnsTipText() | Returns the tip text for this property.
String | dateTimeMsecFormatTipText() | Returns the tip text for this property.
String | dateTimeMsecLenientTipText() | Returns the tip text for this property.
void | defineOptions() | Adds options to the internal list of options.
protected SpreadSheet | doRead(Reader r) | Reads the spreadsheet content from the specified reader.
String | firstRowTipText() | Returns the tip text for this property.
int | getChunkSize() | Returns the current chunk size.
String | getComment() | Returns the string denoting the start of a line comment.
SpreadSheetWriter | getCorrespondingWriter() | Returns, if available, the corresponding writer.
String | getCustomColumnHeaders() | Returns the custom headers to use.
Range | getDateColumns() | Returns the range of columns to treat as date.
DateFormatString | getDateFormat() | Returns the format for date columns.
Range | getDateTimeColumns() | Returns the range of columns to treat as date/time.
DateFormatString | getDateTimeFormat() | Returns the format for date/time columns.
Range | getDateTimeMsecColumns() | Returns the range of columns to treat as date/time msec.
DateFormatString | getDateTimeMsecFormat() | Returns the format for date/time msec columns.
protected BaseRegExp | getDefaultMissingValue() | Returns the default missing value to use.
int | getFirstRow() | Returns the first row to return.
String | getFormatDescription() | Returns a string describing the format (used in the file chooser).
String[] | getFormatExtensions() | Returns the extension(s) of the format.
protected AbstractSpreadSheetReader.InputType | getInputType() | Returns how to read the data, from a file, stream or reader.
Locale | getLocale() | Returns the locale in use.
boolean | getNoHeader() | Returns whether the file contains a header row or not.
int | getNumRows() | Returns the number of data rows to return.
int | getNumRowsColumnTypeDiscovery() | Returns the number of data rows to use for automatically determining the column type.
boolean | getParseFormulas() | Returns whether to parse formula-like cells.
String | getQuoteCharacter() | Returns the string used for surrounding text.
String | getSeparator() | Returns the string used as separator for the columns, '\t' for tab.
boolean | getSkipDifferingRows() | Returns whether to skip rows that have too few/many cells.
int | getSkipNumRows() | Returns the number of initial rows to skip.
Range | getTextColumns() | Returns the range of columns to treat as text.
Range | getTimeColumns() | Returns the range of columns to treat as time.
DateFormatString | getTimeFormat() | Returns the format for time columns.
Range | getTimeMsecColumns() | Returns the range of columns to treat as time/msec.
DateFormatString | getTimeMsecFormat() | Returns the format for time/msec columns.
TimeZone | getTimeZone() | Returns the time zone in use.
boolean | getTrim() | Returns whether to trim the cell content.
String | globalInfo() | Returns a string describing the object.
boolean | hasMoreChunks() | Checks whether there is more data to read.
boolean | isDateLenient() | Returns whether the parsing of dates is lenient or not.
boolean | isDateTimeLenient() | Returns whether the parsing of date/times is lenient or not.
boolean | isDateTimeMsecLenient() | Returns whether the parsing of date/time msecs is lenient or not.
boolean | isTimeLenient() | Returns whether the parsing of times is lenient or not.
boolean | isTimeMsecLenient() | Returns whether the parsing of times/msec is lenient or not.
String | localeTipText() | Returns the tip text for this property.
static void | main(String[] args) | Runs the reader from the command-line.
SpreadSheet | nextChunk() | Returns the next chunk.
String | noHeaderTipText() | Returns the tip text for this property.
String | numRowsColumnTypeDiscoveryTipText() | Returns the tip text for this property.
String | numRowsTipText() | Returns the tip text for this property.
String | parseFormulasTipText() | Returns the tip text for this property.
String | quoteCharacterTipText() | Returns the tip text for this property.
String | separatorTipText() | Returns the tip text for this property.
void | setChunkSize(int value) | Sets the maximum chunk size.
void | setComment(String value) | Sets the string denoting the start of a line comment.
void | setCustomColumnHeaders(String value) | Sets the custom headers to use.
void | setDateColumns(Range value) | Sets the range of columns to treat as date.
void | setDateFormat(DateFormatString value) | Sets the format for date columns.
void | setDateLenient(boolean value) | Sets whether parsing of dates is to be lenient or not.
void | setDateTimeColumns(Range value) | Sets the range of columns to treat as date/time.
void | setDateTimeFormat(DateFormatString value) | Sets the format for date/time columns.
void | setDateTimeLenient(boolean value) | Sets whether parsing of date/times is to be lenient or not.
void | setDateTimeMsecColumns(Range value) | Sets the range of columns to treat as date/time msec.
void | setDateTimeMsecFormat(DateFormatString value) | Sets the format for date/time msec columns.
void | setDateTimeMsecLenient(boolean value) | Sets whether parsing of date/time msecs is to be lenient or not.
void | setFirstRow(int value) | Sets the first row to return.
void | setLocale(Locale value) | Sets the locale to use.
void | setNoHeader(boolean value) | Sets whether the file contains a header row or not.
void | setNumRows(int value) | Sets the number of data rows to return.
void | setNumRowsColumnTypeDiscovery(int value) | Sets the number of data rows to use for automatically determining the column type.
void | setParseFormulas(boolean value) | Sets whether to parse formula-like cells.
void | setQuoteCharacter(String value) | Sets the character used for surrounding text.
void | setSeparator(String value) | Sets the string to use as separator for the columns, use '\t' for tab.
void | setSkipDifferingRows(boolean value) | Sets whether to skip rows that have too few/many cells.
void | setSkipNumRows(int value) | Sets the number of initial rows to skip.
void | setTextColumns(Range value) | Sets the range of columns to treat as text.
void | setTimeColumns(Range value) | Sets the range of columns to treat as time.
void | setTimeFormat(DateFormatString value) | Sets the format for time columns.
void | setTimeLenient(boolean value) | Sets whether parsing of times is to be lenient or not.
void | setTimeMsecColumns(Range value) | Sets the range of columns to treat as time/msec.
void | setTimeMsecFormat(DateFormatString value) | Sets the format for time/msec columns.
void | setTimeMsecLenient(boolean value) | Sets whether parsing of times/msec is to be lenient or not.
void | setTimeZone(TimeZone value) | Sets the time zone to use.
void | setTrim(boolean value) | Sets whether to trim the cell content.
String | skipDifferingRowsTipText() | Returns the tip text for this property.
String | skipNumRowsTipText() | Returns the tip text for this property.
protected boolean | supportsCompressedInput() | Returns whether to automatically handle gzip compressed files (AbstractSpreadSheetReader.InputType.READER, AbstractSpreadSheetReader.InputType.STREAM).
String | textColumnsTipText() | Returns the tip text for this property.
String | timeColumnsTipText() | Returns the tip text for this property.
String | timeFormatTipText() | Returns the tip text for this property.
String | timeLenientTipText() | Returns the tip text for this property.
String | timeMsecColumnsTipText() | Returns the tip text for this property.
String | timeMsecFormatTipText() | Returns the tip text for this property.
String | timeMsecLenientTipText() | Returns the tip text for this property.
String | timeZoneTipText() | Returns the tip text for this property.
String | trimTipText() | Returns the tip text for this property.
-
Methods inherited from class adams.data.io.input.AbstractSpreadSheetReaderWithMissingValueSupport
getMissingValue, missingValueTipText, setMissingValue
-
Methods inherited from class adams.data.io.input.AbstractSpreadSheetReader
canDecompress, check, dataRowTypeTipText, doRead, doRead, encodingTipText, getAdditionalInformation, getDataRowType, getDefaultDataRowType, getDefaultFormatExtension, getDefaultSpreadSheet, getEncoding, getLastError, getReaders, getSpreadSheetType, hasLastError, initialize, isStopped, read, read, read, read, runReader, setDataRowType, setEncoding, setLastError, setSpreadSheetType, spreadSheetTypeTipText, stopExecution
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, reset, setLoggingLevel, toCommandLine, toString
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.core.Destroyable
destroy
-
Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel
-
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager, toCommandLine
-
Methods inherited from interface adams.data.io.input.SpreadSheetReader
dataRowTypeTipText, getDataRowType, getDefaultFormatExtension, getLastError, getSpreadSheetType, hasLastError, isStopped, read, read, read, read, setDataRowType, setSpreadSheetType, spreadSheetTypeTipText, stopExecution
-
-
-
-
Field Detail
-
m_Comment
protected String m_Comment
the line comment.
-
m_QuoteCharacter
protected String m_QuoteCharacter
the quote character.
-
m_Separator
protected String m_Separator
the column separator.
-
m_TextColumns
protected Range m_TextColumns
the columns to treat as text.
-
m_DateColumns
protected Range m_DateColumns
the columns to treat as date.
-
m_DateFormat
protected DateFormatString m_DateFormat
the format string for the dates.
-
m_DateLenient
protected boolean m_DateLenient
whether date parsing is lenient.
-
m_DateTimeColumns
protected Range m_DateTimeColumns
the columns to treat as date/time.
-
m_DateTimeFormat
protected DateFormatString m_DateTimeFormat
the format string for the date/times.
-
m_DateTimeLenient
protected boolean m_DateTimeLenient
whether date/time parsing is lenient.
-
m_DateTimeMsecColumns
protected Range m_DateTimeMsecColumns
the columns to treat as date/time msec.
-
m_DateTimeMsecFormat
protected DateFormatString m_DateTimeMsecFormat
the format string for the date/time msecs.
-
m_DateTimeMsecLenient
protected boolean m_DateTimeMsecLenient
whether date/time msec parsing is lenient.
-
m_TimeColumns
protected Range m_TimeColumns
the columns to treat as time.
-
m_TimeFormat
protected DateFormatString m_TimeFormat
the format string for the times.
-
m_TimeLenient
protected boolean m_TimeLenient
whether time parsing is lenient.
-
m_TimeMsecColumns
protected Range m_TimeMsecColumns
the columns to treat as time/msec.
-
m_TimeMsecFormat
protected DateFormatString m_TimeMsecFormat
the format string for the times/msec.
-
m_TimeMsecLenient
protected boolean m_TimeMsecLenient
whether time/msec parsing is lenient.
-
m_TimeZone
protected TimeZone m_TimeZone
the timezone to use.
-
m_Locale
protected Locale m_Locale
the locale to use.
-
m_SkipNumRows
protected int m_SkipNumRows
the number of initial rows to skip.
-
m_NoHeader
protected boolean m_NoHeader
whether the file has a header or not.
-
m_CustomColumnHeaders
protected String m_CustomColumnHeaders
the comma-separated list of column header names.
-
m_ChunkSize
protected int m_ChunkSize
the chunk size to use.
-
m_Trim
protected boolean m_Trim
whether to trim the cells.
-
m_FirstRow
protected int m_FirstRow
the first row to retrieve (1-based).
-
m_NumRows
protected int m_NumRows
the number of rows to retrieve (less than 1 = unlimited).
-
m_NumRowsColumnTypeDiscovery
protected int m_NumRowsColumnTypeDiscovery
the number of rows to use for automatic discovery of column types (0 = off).
-
m_ParseFormulas
protected boolean m_ParseFormulas
whether to parse formulas.
-
m_SkipDifferingRows
protected boolean m_SkipDifferingRows
whether to drop rows with too few or too many cells.
-
m_Reader
protected CsvSpreadSheetReader.ChunkReader m_Reader
for reading the actual data.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
globalInfo in interface GlobalInfoSupporter
- Specified by:
globalInfo in class AbstractOptionHandler
- Returns:
- a description suitable for displaying in the gui
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by:
defineOptions in interface OptionHandler
- Overrides:
defineOptions in class AbstractSpreadSheetReaderWithMissingValueSupport
-
getDefaultMissingValue
protected BaseRegExp getDefaultMissingValue()
Returns the default missing value to use.
- Overrides:
getDefaultMissingValue in class AbstractSpreadSheetReaderWithMissingValueSupport
- Returns:
- the default
-
setComment
public void setComment(String value)
Sets the string denoting the start of a line comment.- Parameters:
value
- the comment start
-
getComment
public String getComment()
Returns the string denoting the start of a line comment.- Returns:
- the comment start
-
commentTipText
public String commentTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setQuoteCharacter
public void setQuoteCharacter(String value)
Sets the character used for surrounding text.- Parameters:
value
- the quote character, can be empty
-
getQuoteCharacter
public String getQuoteCharacter()
Returns the string used for surrounding text.- Returns:
- the quote character, can be empty
-
quoteCharacterTipText
public String quoteCharacterTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setSeparator
public void setSeparator(String value)
Sets the string to use as separator for the columns, use '\t' for tab.- Parameters:
value
- the separator
-
getSeparator
public String getSeparator()
Returns the string used as separator for the columns, '\t' for tab.- Returns:
- the separator
-
separatorTipText
public String separatorTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
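To make the interplay of separator and quote character concrete, here is a minimal sketch of splitting one line into cells. It is an illustration only, not the parser ADAMS uses: the class `SplitSketch` is hypothetical, and treating a doubled quote inside a quoted cell as a literal quote is an assumed convention. The two-character value `\t` is mapped to a real tab, as the option description suggests.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of separator/quote handling; not part of ADAMS. */
final class SplitSketch {
  /** Turns the two-character option value \t into a real tab; other values pass through. */
  static String unescapeSeparator(String sep) {
    return sep.equals("\\t") ? "\t" : sep;
  }

  /** Splits a line into cells; separators inside quoted cells are kept as content. */
  static List<String> split(String line, String separator, char quote) {
    String sep = unescapeSeparator(separator);
    if (sep.isEmpty())
      throw new IllegalArgumentException("Separator must not be empty");
    List<String> cells = new ArrayList<>();
    StringBuilder cell = new StringBuilder();
    boolean quoted = false;
    int i = 0;
    while (i < line.length()) {
      char c = line.charAt(i);
      if (quoted) {
        if (c == quote && i + 1 < line.length() && line.charAt(i + 1) == quote) {
          cell.append(quote); // assumed convention: doubled quote is a literal quote
          i += 2;
        } else if (c == quote) {
          quoted = false;     // closing quote
          i++;
        } else {
          cell.append(c);
          i++;
        }
      } else if (c == quote) {
        quoted = true;        // opening quote
        i++;
      } else if (line.startsWith(sep, i)) {
        cells.add(cell.toString());
        cell.setLength(0);
        i += sep.length();
      } else {
        cell.append(c);
        i++;
      }
    }
    cells.add(cell.toString());
    return cells;
  }

  public static void main(String[] args) {
    System.out.println(split("a\t\"b\tc\"\td", "\\t", '"'));
  }
}
```

In the `main` call, the tab inside the quoted cell is preserved, so the line splits into three cells rather than four.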
-
setTextColumns
public void setTextColumns(Range value)
Sets the range of columns to treat as text.- Parameters:
value
- the range
-
getTextColumns
public Range getTextColumns()
Returns the range of columns to treat as text.- Returns:
- the range
-
textColumnsTipText
public String textColumnsTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
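The range syntax accepted by the column options (comma-separated 1-based indices, 'start-end' sub-ranges, 'inv(...)' and placeholders) can be sketched as follows. This is not the `adams.core.Range` implementation: the class `RangeSketch` is hypothetical, and the reading of `last_1` as second to last and `last_2` as third to last is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of range resolution; not part of ADAMS. */
final class RangeSketch {
  /** Resolves a single index or placeholder to a 1-based column number. */
  static int index(String s, int max) {
    switch (s.trim()) {
      case "first":  return 1;
      case "second": return 2;
      case "third":  return 3;
      case "last":   return max;
      case "last_1": return max - 1; // assumed: second to last
      case "last_2": return max - 2; // assumed: third to last
      default:       return Integer.parseInt(s.trim());
    }
  }

  /** Returns the selected 1-based column numbers for a sheet with max columns. */
  static List<Integer> resolve(String range, int max) {
    String r = range.trim();
    boolean invert = r.startsWith("inv(") && r.endsWith(")");
    if (invert)
      r = r.substring(4, r.length() - 1);
    TreeSet<Integer> selected = new TreeSet<>();
    for (String part : r.split(",")) {
      if (part.trim().isEmpty())
        continue; // an empty range selects nothing
      int dash = part.indexOf('-');
      int from = index(dash < 0 ? part : part.substring(0, dash), max);
      int to = index(dash < 0 ? part : part.substring(dash + 1), max);
      for (int i = Math.max(from, 1); i <= Math.min(to, max); i++)
        selected.add(i);
    }
    List<Integer> result = new ArrayList<>();
    for (int i = 1; i <= max; i++)
      if (selected.contains(i) != invert)
        result.add(i);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(resolve("first,3-last_1", 5));
    System.out.println(resolve("inv(2-3)", 5));
  }
}
```

For a five-column sheet, "first,3-last_1" selects columns 1, 3 and 4 under the stated assumption, and "inv(2-3)" selects columns 1, 4 and 5.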
-
setDateColumns
public void setDateColumns(Range value)
Sets the range of columns to treat as date.- Parameters:
value
- the range
-
getDateColumns
public Range getDateColumns()
Returns the range of columns to treat as date.- Returns:
- the range
-
dateColumnsTipText
public String dateColumnsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateFormat
public void setDateFormat(DateFormatString value)
Sets the format for date columns.- Parameters:
value
- the format
-
getDateFormat
public DateFormatString getDateFormat()
Returns the format for date columns.- Returns:
- the format
-
dateFormatTipText
public String dateFormatTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateLenient
public void setDateLenient(boolean value)
Sets whether parsing of dates is to be lenient or not.- Parameters:
value
- if true, lenient parsing is used; otherwise not
- See Also:
DateFormat.setLenient(boolean)
-
isDateLenient
public boolean isDateLenient()
Returns whether the parsing of dates is lenient or not.- Returns:
- true if parsing is lenient
- See Also:
DateFormat.isLenient()
-
dateLenientTipText
public String dateLenientTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
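The lenient flag maps onto `DateFormat.setLenient(boolean)` from the Java standard library, as the See Also entries indicate. The sketch below shows the practical difference for the default date format `yyyy-MM-dd`; the class `LenientSketch` is a hypothetical helper, not ADAMS code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Hypothetical sketch of strict versus lenient date parsing; not part of ADAMS. */
final class LenientSketch {
  /** Parses text with the given pattern; returns null if the text is rejected. */
  static Date parse(String pattern, String text, boolean lenient) {
    SimpleDateFormat fmt = new SimpleDateFormat(pattern);
    fmt.setLenient(lenient);
    try {
      return fmt.parse(text);
    } catch (ParseException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // 30 February does not exist: strict parsing rejects it,
    // lenient parsing rolls the surplus days over into March.
    System.out.println(parse("yyyy-MM-dd", "2021-02-30", false));
    System.out.println(parse("yyyy-MM-dd", "2021-02-30", true));
  }
}
```

With lenient parsing switched off, which is the documented default of the reader, an impossible date such as 2021-02-30 is rejected instead of being silently shifted to 2 March.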
-
setDateTimeColumns
public void setDateTimeColumns(Range value)
Sets the range of columns to treat as date/time.- Parameters:
value
- the range
-
getDateTimeColumns
public Range getDateTimeColumns()
Returns the range of columns to treat as date/time.- Returns:
- the range
-
dateTimeColumnsTipText
public String dateTimeColumnsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateTimeFormat
public void setDateTimeFormat(DateFormatString value)
Sets the format for date/time columns.- Parameters:
value
- the format
-
getDateTimeFormat
public DateFormatString getDateTimeFormat()
Returns the format for date/time columns.- Returns:
- the format
-
dateTimeFormatTipText
public String dateTimeFormatTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateTimeLenient
public void setDateTimeLenient(boolean value)
Sets whether parsing of date/times is to be lenient or not.- Parameters:
value
- if true, lenient parsing is used; otherwise not
- See Also:
DateFormat.setLenient(boolean)
-
isDateTimeLenient
public boolean isDateTimeLenient()
Returns whether the parsing of date/times is lenient or not.- Returns:
- true if parsing is lenient
- See Also:
DateFormat.isLenient()
-
dateTimeLenientTipText
public String dateTimeLenientTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateTimeMsecColumns
public void setDateTimeMsecColumns(Range value)
Sets the range of columns to treat as date/time msec.- Parameters:
value
- the range
-
getDateTimeMsecColumns
public Range getDateTimeMsecColumns()
Returns the range of columns to treat as date/time msec.- Returns:
- the range
-
dateTimeMsecColumnsTipText
public String dateTimeMsecColumnsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateTimeMsecFormat
public void setDateTimeMsecFormat(DateFormatString value)
Sets the format for date/time msec columns.- Parameters:
value
- the format
-
getDateTimeMsecFormat
public DateFormatString getDateTimeMsecFormat()
Returns the format for date/time msec columns.- Returns:
- the format
-
dateTimeMsecFormatTipText
public String dateTimeMsecFormatTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setDateTimeMsecLenient
public void setDateTimeMsecLenient(boolean value)
Sets whether parsing of date/time msecs is to be lenient or not.- Parameters:
value
- if true, lenient parsing is used; otherwise not
- See Also:
DateFormat.setLenient(boolean)
-
isDateTimeMsecLenient
public boolean isDateTimeMsecLenient()
Returns whether the parsing of date/time msecs is lenient or not.- Returns:
- true if parsing is lenient
- See Also:
DateFormat.isLenient()
-
dateTimeMsecLenientTipText
public String dateTimeMsecLenientTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeColumns
public void setTimeColumns(Range value)
Sets the range of columns to treat as time.- Parameters:
value
- the range
-
getTimeColumns
public Range getTimeColumns()
Returns the range of columns to treat as time.- Returns:
- the range
-
timeColumnsTipText
public String timeColumnsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeFormat
public void setTimeFormat(DateFormatString value)
Sets the format for time columns.- Parameters:
value
- the format
-
getTimeFormat
public DateFormatString getTimeFormat()
Returns the format for time columns.- Returns:
- the format
-
timeFormatTipText
public String timeFormatTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeLenient
public void setTimeLenient(boolean value)
Sets whether parsing of times is to be lenient or not.- Parameters:
value
- if true, lenient parsing is used; otherwise not
- See Also:
DateFormat.setLenient(boolean)
-
isTimeLenient
public boolean isTimeLenient()
Returns whether the parsing of times is lenient or not.- Returns:
- true if parsing is lenient
- See Also:
DateFormat.isLenient()
-
timeLenientTipText
public String timeLenientTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeMsecColumns
public void setTimeMsecColumns(Range value)
Sets the range of columns to treat as time/msec.- Parameters:
value
- the range
-
getTimeMsecColumns
public Range getTimeMsecColumns()
Returns the range of columns to treat as time/msec.- Returns:
- the range
-
timeMsecColumnsTipText
public String timeMsecColumnsTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeMsecFormat
public void setTimeMsecFormat(DateFormatString value)
Sets the format for time/msec columns.- Parameters:
value
- the format
-
getTimeMsecFormat
public DateFormatString getTimeMsecFormat()
Returns the format for time/msec columns.- Returns:
- the format
-
timeMsecFormatTipText
public String timeMsecFormatTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeMsecLenient
public void setTimeMsecLenient(boolean value)
Sets whether parsing of times/msec is to be lenient or not.- Parameters:
value
- if true lenient parsing is used, otherwise not- See Also:
DateFormat.setLenient(boolean)
-
isTimeMsecLenient
public boolean isTimeMsecLenient()
Returns whether the parsing of times/msec is lenient or not.- Returns:
- true if parsing is lenient
- See Also:
DateFormat.isLenient()
-
timeMsecLenientTipText
public String timeMsecLenientTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setTimeZone
public void setTimeZone(TimeZone value)
Sets the time zone to use.- Parameters:
value
- the time zone
-
getTimeZone
public TimeZone getTimeZone()
Returns the time zone in use.- Returns:
- the time zone
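Why the time zone matters can be shown with the JDK alone: the same text maps to different instants depending on the zone it is interpreted in. This is a sketch only; the class TimeZoneParsingSketch, the method toEpochMillis and the pattern yyyy-MM-dd HH:mm:ss are assumptions made for illustration, not this reader's internals.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of ADAMS.
class TimeZoneParsingSketch {

  /** Parses the text in the given zone and returns milliseconds since the epoch. */
  static long toEpochMillis(String text, TimeZone zone) {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
    fmt.setTimeZone(zone);
    try {
      return fmt.parse(text).getTime();
    }
    catch (ParseException e) {
      throw new IllegalArgumentException("Cannot parse: " + text, e);
    }
  }

  public static void main(String[] args) {
    String text = "1970-01-01 00:00:00";
    // same text, two zones, two different instants
    System.out.println(toEpochMillis(text, TimeZone.getTimeZone("UTC")));
    System.out.println(toEpochMillis(text, TimeZone.getTimeZone("GMT+01:00")));
  }
}
```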
-
timeZoneTipText
public String timeZoneTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setLocale
public void setLocale(Locale value)
Sets the locale to use.- Specified by:
setLocale
in interfaceLocaleSupporter
- Parameters:
value
- the locale
-
getLocale
public Locale getLocale()
Returns the locale in use.- Specified by:
getLocale
in interfaceLocaleSupporter
- Returns:
- the locale
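A common effect of a locale on CSV input is the choice of decimal separator. The sketch below shows this with the JDK's NumberFormat. Whether this reader applies the locale in exactly this way is an assumption; the class LocaleNumberParsingSketch and its parse method are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper class, not part of ADAMS.
class LocaleNumberParsingSketch {

  /** Returns the parsed number, or null if the whole text is not a number in that locale. */
  static Double parse(String text, Locale locale) {
    NumberFormat fmt = NumberFormat.getInstance(locale);
    ParsePosition pos = new ParsePosition(0);
    Number n = fmt.parse(text, pos);
    // require that the complete cell content was consumed
    if (n == null || pos.getIndex() != text.length())
      return null;
    return n.doubleValue();
  }

  public static void main(String[] args) {
    System.out.println(parse("1,5", Locale.GERMANY)); // comma as decimal separator
    System.out.println(parse("1.5", Locale.US));      // dot as decimal separator
  }
}
```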
-
localeTipText
public String localeTipText()
Returns the tip text for this property.- Specified by:
localeTipText
in interfaceOptionHandlingLocaleSupporter
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setSkipNumRows
public void setSkipNumRows(int value)
Sets the number of initial rows to skip.- Specified by:
setSkipNumRows
in interfaceInitialRowSkippingSpreadSheetReader
- Parameters:
value
- the number of rows
-
getSkipNumRows
public int getSkipNumRows()
Returns the number of initial rows to skip.- Specified by:
getSkipNumRows
in interfaceInitialRowSkippingSpreadSheetReader
- Returns:
- the number of rows
-
skipNumRowsTipText
public String skipNumRowsTipText()
Returns the tip text for this property.- Specified by:
skipNumRowsTipText
in interfaceInitialRowSkippingSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setNoHeader
public void setNoHeader(boolean value)
Sets whether the file contains a header row or not.- Specified by:
setNoHeader
in interfaceNoHeaderSpreadSheetReader
- Parameters:
value
- true if no header row available
-
getNoHeader
public boolean getNoHeader()
Returns whether the file contains a header row or not.- Specified by:
getNoHeader
in interfaceNoHeaderSpreadSheetReader
- Returns:
- true if no header row available
-
noHeaderTipText
public String noHeaderTipText()
Returns the tip text for this property.- Specified by:
noHeaderTipText
in interfaceNoHeaderSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the gui
-
setCustomColumnHeaders
public void setCustomColumnHeaders(String value)
Sets the custom headers to use.- Specified by:
setCustomColumnHeaders
in interfaceNoHeaderSpreadSheetReader
- Parameters:
value
- the comma-separated list
-
getCustomColumnHeaders
public String getCustomColumnHeaders()
Returns the custom headers to use.- Specified by:
getCustomColumnHeaders
in interfaceNoHeaderSpreadSheetReader
- Returns:
- the comma-separated list
-
customColumnHeadersTipText
public String customColumnHeadersTipText()
Returns the tip text for this property.- Specified by:
customColumnHeadersTipText
in interfaceNoHeaderSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the gui
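The interplay of skipNumRows, noHeader and customColumnHeaders can be sketched with plain JDK collections. This is an illustration under stated assumptions, not the reader's implementation: cells are split naively on commas without quote handling, custom headers are only consulted when noHeader is true, and the placeholder names col-1, col-2, ... are invented for this sketch. The class HeaderHandlingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of ADAMS.
class HeaderHandlingSketch {

  /** Returns the header cells (index 0) followed by the data rows. */
  static List<String[]> read(List<String> lines, int skipNumRows, boolean noHeader, String customHeaders) {
    // drop the initial rows first
    int start = Math.min(Math.max(skipNumRows, 0), lines.size());
    List<String> rest = lines.subList(start, lines.size());
    List<String[]> table = new ArrayList<>();
    if (rest.isEmpty())
      return table;
    int firstData;
    if (noHeader) {
      String[] header;
      if (customHeaders != null && !customHeaders.isEmpty()) {
        header = customHeaders.split(",");
      }
      else {
        // placeholder names, an invention of this sketch
        int n = rest.get(0).split(",", -1).length;
        header = new String[n];
        for (int i = 0; i < n; i++)
          header[i] = "col-" + (i + 1);
      }
      table.add(header);
      firstData = 0;  // every remaining line is data
    }
    else {
      table.add(rest.get(0).split(",", -1));
      firstData = 1;  // first remaining line was the header
    }
    for (int i = firstData; i < rest.size(); i++)
      table.add(rest.get(i).split(",", -1));
    return table;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("exported by tool X", "1,2", "3,4");
    for (String[] row : read(lines, 1, true, "x,y"))
      System.out.println(Arrays.toString(row));
  }
}
```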
-
setChunkSize
public void setChunkSize(int value)
Sets the maximum chunk size.- Specified by:
setChunkSize
in interfaceChunkedSpreadSheetReader
- Parameters:
value
- the size of the chunks, < 1 denotes infinity
-
getChunkSize
public int getChunkSize()
Returns the current chunk size.- Specified by:
getChunkSize
in interfaceChunkedSpreadSheetReader
- Returns:
- the size of the chunks, < 1 denotes infinity
-
chunkSizeTipText
public String chunkSizeTipText()
Returns the tip text for this property.- Specified by:
chunkSizeTipText
in interfaceChunkedSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the gui
-
setTrim
public void setTrim(boolean value)
Sets whether to trim the cell content.- Parameters:
value
- if true the content gets trimmed
-
getTrim
public boolean getTrim()
Returns whether to trim the cell content.- Returns:
- true if to trim content
-
trimTipText
public String trimTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setFirstRow
public void setFirstRow(int value)
Sets the first row to return.- Specified by:
setFirstRow
in interfaceWindowedSpreadSheetReader
- Parameters:
value
- the first row (1-based), greater than 0
-
getFirstRow
public int getFirstRow()
Returns the first row to return.- Specified by:
getFirstRow
in interfaceWindowedSpreadSheetReader
- Returns:
- the first row (1-based), greater than 0
-
firstRowTipText
public String firstRowTipText()
Returns the tip text for this property.- Specified by:
firstRowTipText
in interfaceWindowedSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setNumRows
public void setNumRows(int value)
Sets the number of data rows to return.- Specified by:
setNumRows
in interfaceWindowedSpreadSheetReader
- Parameters:
value
- the number of rows, -1 for unlimited
-
getNumRows
public int getNumRows()
Returns the number of data rows to return.- Specified by:
getNumRows
in interfaceWindowedSpreadSheetReader
- Returns:
- the number of rows, -1 for unlimited
-
numRowsTipText
public String numRowsTipText()
Returns the tip text for this property.- Specified by:
numRowsTipText
in interfaceWindowedSpreadSheetReader
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setNumRowsColumnTypeDiscovery
public void setNumRowsColumnTypeDiscovery(int value)
Sets the number of data rows to use for automatically determining the column type.- Parameters:
value
- the number of rows, 0 to turn off feature
-
getNumRowsColumnTypeDiscovery
public int getNumRowsColumnTypeDiscovery()
Returns the number of data rows to use for automatically determining the column type.- Returns:
- the number of rows, 0 to turn off feature
-
numRowsColumnTypeDiscoveryTipText
public String numRowsColumnTypeDiscoveryTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
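One way to picture column type discovery is to inspect only the first N cells of a column and pick a type from them. The sketch below is an assumption about the general idea, not the reader's algorithm: it distinguishes only numeric from text, and the class ColumnTypeDiscoverySketch, the ColumnType enum and the UNDECIDED value for a disabled feature are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of ADAMS.
class ColumnTypeDiscoverySketch {

  enum ColumnType { NUMERIC, TEXT, UNDECIDED }

  /** Checks whether the cell content parses as a double. */
  static boolean isNumeric(String cell) {
    try {
      Double.parseDouble(cell.trim());
      return true;
    }
    catch (NumberFormatException e) {
      return false;
    }
  }

  /** Inspects at most numRows cells of a column; 0 or less turns the feature off. */
  static ColumnType discover(List<String> cells, int numRows) {
    int limit = Math.min(numRows, cells.size());
    if (limit <= 0)
      return ColumnType.UNDECIDED;
    for (int i = 0; i < limit; i++) {
      if (!isNumeric(cells.get(i)))
        return ColumnType.TEXT;
    }
    return ColumnType.NUMERIC;
  }

  public static void main(String[] args) {
    List<String> column = Arrays.asList("1", "2", "x");
    System.out.println(discover(column, 2));  // only the first two cells are inspected
    System.out.println(discover(column, 3));  // the third cell is inspected as well
  }
}
```

The trade-off is visible in the example: a small sample is cheap, but a non-conforming value beyond the sampled rows goes unnoticed.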
-
setParseFormulas
public void setParseFormulas(boolean value)
Sets whether to parse formula-like cells.- Parameters:
value
- if true then formula-like cells get parsed
-
getParseFormulas
public boolean getParseFormulas()
Returns whether to parse formula-like cells.- Returns:
- true if to parse formula-like cells
-
parseFormulasTipText
public String parseFormulasTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
-
setSkipDifferingRows
public void setSkipDifferingRows(boolean value)
Sets whether to skip rows that have too few/many cells.- Parameters:
value
- true if to skip
-
getSkipDifferingRows
public boolean getSkipDifferingRows()
Returns whether to skip rows that have too few/many cells.- Returns:
- true if to skip
-
skipDifferingRowsTipText
public String skipDifferingRowsTipText()
Returns the tip text for this property.- Returns:
- tip text for this property suitable for displaying in the gui
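Skipping rows with too few or too many cells amounts to comparing each row's cell count with the expected column count. A minimal sketch, assuming the expected count comes from the header; the class DifferingRowFilterSketch and its filter method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of ADAMS.
class DifferingRowFilterSketch {

  /** Keeps all rows, or only those with exactly expectedCells cells if skipping is enabled. */
  static List<String[]> filter(int expectedCells, List<String[]> rows, boolean skipDifferingRows) {
    List<String[]> kept = new ArrayList<>();
    for (String[] row : rows) {
      if (skipDifferingRows && row.length != expectedCells)
        continue;  // too few or too many cells
      kept.add(row);
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String[]> rows = Arrays.asList(
      new String[]{"1", "2"},
      new String[]{"3"},
      new String[]{"4", "5", "6"});
    System.out.println(filter(2, rows, true).size() + " row(s) kept");
  }
}
```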
-
getFormatDescription
public String getFormatDescription()
Returns a string describing the format (used in the file chooser).- Specified by:
getFormatDescription
in interfaceFileFormatHandler
- Specified by:
getFormatDescription
in interfaceSpreadSheetReader
- Specified by:
getFormatDescription
in classAbstractSpreadSheetReader
- Returns:
- a description suitable for displaying in the file chooser
-
getFormatExtensions
public String[] getFormatExtensions()
Returns the extension(s) of the format.- Specified by:
getFormatExtensions
in interfaceFileFormatHandler
- Specified by:
getFormatExtensions
in interfaceSpreadSheetReader
- Specified by:
getFormatExtensions
in classAbstractSpreadSheetReader
- Returns:
- the extension (without the dot!)
-
getCorrespondingWriter
public SpreadSheetWriter getCorrespondingWriter()
Returns, if available, the corresponding writer.- Specified by:
getCorrespondingWriter
in interfaceSpreadSheetReader
- Returns:
- the writer, null if none available
-
getInputType
protected AbstractSpreadSheetReader.InputType getInputType()
Returns how to read the data, from a file, stream or reader.- Specified by:
getInputType
in classAbstractSpreadSheetReader
- Returns:
- how to read the data
-
supportsCompressedInput
protected boolean supportsCompressedInput()
Returns whether to automatically handle gzip compressed files (AbstractSpreadSheetReader.InputType.READER, AbstractSpreadSheetReader.InputType.STREAM).- Overrides:
supportsCompressedInput
in classAbstractSpreadSheetReader
- Returns:
- true if to automatically decompress
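Transparent gzip handling for reader or stream input can be sketched with java.util.zip. How this reader decides that input is compressed is not stated here; detecting the gzip magic bytes (0x1f 0x8b) is a choice of this sketch. The class GzipInputSketch and its methods are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of ADAMS.
class GzipInputSketch {

  /** Checks for the two gzip magic bytes at the start of the data. */
  static boolean isGzip(byte[] data) {
    return data.length >= 2 && (data[0] & 0xff) == 0x1f && (data[1] & 0xff) == 0x8b;
  }

  /** Reads the data as UTF-8 text, decompressing first if it looks like gzip. */
  static String readAll(byte[] data) {
    try {
      InputStream in = new ByteArrayInputStream(data);
      if (isGzip(data))
        in = new GZIPInputStream(in);
      try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = r.read(buf)) != -1)
          sb.append(buf, 0, n);
        return sb.toString();
      }
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Compresses text, used here only to produce test input. */
  static byte[] gzip(String text) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
        gz.write(text.getBytes(StandardCharsets.UTF_8));
      }
      return bos.toByteArray();
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.print(readAll(gzip("a,b\n1,2\n")));
  }
}
```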
-
doRead
protected SpreadSheet doRead(Reader r)
Reads the spreadsheet content from the specified reader.- Overrides:
doRead
in classAbstractSpreadSheetReader
- Parameters:
r
- the reader to read from- Returns:
- the spreadsheet or null in case of an error
- See Also:
AbstractSpreadSheetReader.getInputType()
-
hasMoreChunks
public boolean hasMoreChunks()
Checks whether there is more data to read.- Specified by:
hasMoreChunks
in interfaceChunkedSpreadSheetReader
- Returns:
- true if there is more data available
-
nextChunk
public SpreadSheet nextChunk()
Returns the next chunk.- Specified by:
nextChunk
in interfaceChunkedSpreadSheetReader
- Returns:
- the next chunk, null if no data available
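The hasMoreChunks()/nextChunk() contract, together with the rule that a chunk size below 1 means no limit, can be mimicked with a small line-based reader. This is a JDK-only sketch and not the ADAMS implementation: the class ChunkedLineReaderSketch is hypothetical, chunks are lists of raw lines rather than SpreadSheet objects, and every chunk (including the first) is obtained through nextChunk().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of ADAMS.
class ChunkedLineReaderSketch {

  private final BufferedReader reader;
  private final int chunkSize;
  private String lookahead;  // next unread line, null at end of input

  ChunkedLineReaderSketch(Reader r, int chunkSize) {
    this.reader = new BufferedReader(r);
    this.chunkSize = chunkSize;
    this.lookahead = readLine();
  }

  private String readLine() {
    try {
      return reader.readLine();
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Checks whether there is more data to read. */
  boolean hasMoreChunks() {
    return lookahead != null;
  }

  /** Returns the next chunk, null if no data is available. */
  List<String> nextChunk() {
    if (lookahead == null)
      return null;
    List<String> chunk = new ArrayList<>();
    // a chunk size below 1 means: read everything in one go
    while (lookahead != null && (chunkSize < 1 || chunk.size() < chunkSize)) {
      chunk.add(lookahead);
      lookahead = readLine();
    }
    return chunk;
  }

  public static void main(String[] args) {
    ChunkedLineReaderSketch r = new ChunkedLineReaderSketch(new StringReader("a\nb\nc\nd\ne"), 2);
    while (r.hasMoreChunks())
      System.out.println(r.nextChunk());
  }
}
```

Processing one chunk at a time keeps memory use bounded by the chunk size rather than by the file size, which is the motivation for chunking very large files.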
-
main
public static void main(String[] args)
Runs the reader from the command-line. Use the option AbstractSpreadSheetReader.OPTION_INPUT to specify the input file. If the option AbstractSpreadSheetReader.OPTION_OUTPUT is specified then the read sheet gets output as .csv files in that directory.- Parameters:
args
- the command-line options to use
-
-