Package adams.data.io.input
Class ExcelSpreadSheetReader
- java.lang.Object
-
- All Implemented Interfaces:
AdditionalInformationHandler, Destroyable, ErrorProvider, GlobalInfoSupporter, EncodingSupporter, FileFormatHandler, LoggingLevelHandler, LoggingSupporter, OptionHandler, SizeOfHandler, Stoppable, StoppableWithFeedback, MissingValueSpreadSheetReader, MultiSheetSpreadSheetReader<SheetRange>, NoHeaderSpreadSheetReader, SpreadSheetReader, WindowedSpreadSheetReader, DataRowTypeHandler, SpreadSheetTypeHandler, Serializable
public class ExcelSpreadSheetReader extends AbstractExcelSpreadSheetReader<SheetRange>
Reads MS Excel files (using DOM).
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING min-user-mode: Expert
-data-row-type <adams.data.spreadsheet.DataRow> (property: dataRowType) The type of row to use for the data. default: adams.data.spreadsheet.DenseDataRow
-spreadsheet-type <adams.data.spreadsheet.SpreadSheet> (property: spreadSheetType) The type of spreadsheet to use for the data. default: adams.data.spreadsheet.DefaultSpreadSheet
-quiet <boolean> (property: quiet) If enabled, logging output in the spreadsheet is suppressed, e.g., from parsing errors of formulas. default: false
-only-store-formulas <boolean> (property: onlyStoreFormulas) If enabled, formulas are only stored but never evaluated; useful for spreadsheets with unsupported functions in formulas. default: false
-sheets <adams.data.spreadsheet.SheetRange> (property: sheetRange) The range of sheets to load. default: first example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; sheet names (case-sensitive) as well as the following placeholders can be used: first, second, third, last_2, last_1, last; numeric indices can be enforced by preceding them with '#' (e.g., '#12'); sheet names can be surrounded by double quotes.
-missing <adams.core.base.BaseRegExp> (property: missingValue) The placeholder for missing values. default: ^(\\\\?|)$ more: https://docs.oracle.com/javase/tutorial/essential/regex/ https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html
-no-auto-extend-header <boolean> (property: autoExtendHeader) If enabled, the header gets automatically extended if rows have more cells than the header. default: true
-text-columns <adams.core.Range> (property: textColumns) The range of columns to treat as text. default: example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; the following placeholders can be used as well: first, second, third, last_2, last_1, last
-no-header <boolean> (property: noHeader) If enabled, all rows get added as data rows and a dummy header will get inserted. default: false
-custom-column-headers <java.lang.String> (property: customColumnHeaders) The custom headers to use for the columns instead (comma-separated list); ignored if empty. default:
-first-row <int> (property: firstRow) The index of the first row to retrieve (1-based). default: 1 minimum: 1
-num-rows <int> (property: numRows) The number of data rows to retrieve; use -1 for unlimited. default: -1 minimum: -1
-fill-empty-header-cells <boolean> (property: fillEmptyHeaderCells) If enabled, empty header cells get filled with a name. default: true
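The range syntax described for -sheets and -text-columns can be illustrated with a small stand-alone sketch. RangeSketch and its methods are hypothetical names and not part of ADAMS; the sketch only covers numeric indices, 'start-end' sub-ranges, 'inv(...)' and the placeholders first, second, third and last. Sheet names, '#' prefixes, double quotes and last_1/last_2 are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper (not ADAMS code): expands a 1-based range expression. */
class RangeSketch {

  /** Turns a single token (number or placeholder) into a 1-based index. */
  static int toIndex(String token, int max) {
    switch (token.trim()) {
      case "first":  return 1;
      case "second": return 2;
      case "third":  return 3;
      case "last":   return max;
      default:       return Integer.parseInt(token.trim());
    }
  }

  /** Expands the expression into sorted 0-based indices for 'max' items. */
  static List<Integer> expand(String expr, int max) {
    expr = expr.trim();
    // 'inv(...)' selects everything that the inner range does not select
    boolean invert = expr.startsWith("inv(") && expr.endsWith(")");
    if (invert)
      expr = expr.substring(4, expr.length() - 1);

    TreeSet<Integer> selected = new TreeSet<>();
    for (String part : expr.split(",")) {
      part = part.trim();
      if (part.isEmpty())
        continue;
      int dash = part.indexOf('-');
      int from = toIndex(dash < 0 ? part : part.substring(0, dash), max);
      int to = dash < 0 ? from : toIndex(part.substring(dash + 1), max);
      // clamp to the valid 1-based interval and store as 0-based index
      for (int i = Math.max(from, 1); i <= Math.min(to, max); i++)
        selected.add(i - 1);
    }

    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < max; i++) {
      if (selected.contains(i) != invert)
        result.add(i);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(expand("1-3,5", 6));
    System.out.println(expand("inv(2)", 4));
  }
}
```

The indices are returned 0-based because that is how Java collections are addressed, while the option strings themselves stay 1-based as documented above.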
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class adams.data.io.input.AbstractSpreadSheetReader
AbstractSpreadSheetReader.InputType
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static String | HEADER_CELL_PREFIX | the prefix for empty header cells.
protected boolean | m_FillEmptyHeaderCells | whether to fill empty header cells.
-
Fields inherited from class adams.data.io.input.AbstractExcelSpreadSheetReader
m_AutoExtendHeader, m_CustomColumnHeaders, m_FirstRow, m_NoHeader, m_NumRows, m_TextColumns
-
Fields inherited from class adams.data.io.input.AbstractMultiSheetSpreadSheetReaderWithMissingValueSupport
m_MissingValue
-
Fields inherited from class adams.data.io.input.AbstractMultiSheetSpreadSheetReader
m_SheetRange
-
Fields inherited from class adams.data.io.input.AbstractSpreadSheetReader
m_DataRowType, m_Encoding, m_LastError, m_OnlyStoreFormulas, m_Quiet, m_SpreadSheetType, m_Stopped, OPTION_INPUT, OPTION_OUTPUT
-
Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager
-
Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel
-
-
Constructor Summary
Constructors
ExcelSpreadSheetReader()
-
Method Summary
Modifier and Type | Method | Description
void | defineOptions() | Adds options to the internal list of options.
protected List<SpreadSheet> | doReadRange(InputStream in) | Reads the spreadsheet content from the specified file.
String | fillEmptyHeaderCellsTipText() | Returns the tip text for this property.
SpreadSheetWriter | getCorrespondingWriter() | Returns, if available, the corresponding writer.
protected SheetRange | getDefaultSheetRange() | Returns the default sheet range.
boolean | getFillEmptyHeaderCells() | Returns whether to fill empty header cells with a name.
String | getFormatDescription() | Returns a string describing the format (used in the file chooser).
String[] | getFormatExtensions() | Returns the extension(s) of the format.
protected AbstractSpreadSheetReader.InputType | getInputType() | Returns how to read the data, from a file, stream or reader.
String | globalInfo() | Returns a string describing the object.
static void | main(String[] args) | Runs the reader from the command-line.
protected String | numericToString(org.apache.poi.ss.usermodel.Cell cell) | Turns a numeric cell into a string.
void | setFillEmptyHeaderCells(boolean value) | Sets whether to fill empty header cells with a name.
-
Methods inherited from class adams.data.io.input.AbstractExcelSpreadSheetReader
autoExtendHeaderTipText, customColumnHeadersTipText, firstRowTipText, getAutoExtendHeader, getCustomColumnHeaders, getFirstRow, getNoHeader, getNumRows, getTextColumns, initialize, noHeaderTipText, numRowsTipText, setAutoExtendHeader, setCustomColumnHeaders, setFirstRow, setNoHeader, setNumRows, setTextColumns, textColumnsTipText
-
Methods inherited from class adams.data.io.input.AbstractMultiSheetSpreadSheetReaderWithMissingValueSupport
getDefaultMissingValue, getMissingValue, missingValueTipText, setMissingValue
-
Methods inherited from class adams.data.io.input.AbstractMultiSheetSpreadSheetReader
doRead, doRead, doRead, doReadRange, doReadRange, getSheetRange, readRange, readRange, readRange, readRange, setSheetRange, sheetRangeTipText
-
Methods inherited from class adams.data.io.input.AbstractSpreadSheetReader
canDecompress, check, dataRowTypeTipText, encodingTipText, getAdditionalInformation, getDataRowType, getDefaultDataRowType, getDefaultFormatExtension, getDefaultSpreadSheet, getEncoding, getLastError, getOnlyStoreFormulas, getReaders, getSpreadSheetType, hasLastError, isQuiet, isStopped, onlyStoreFormulasTipText, quietTipText, read, read, read, read, runReader, setDataRowType, setEncoding, setLastError, setOnlyStoreFormulas, setQuiet, setSpreadSheetType, spreadSheetTypeTipText, stopExecution, supportsCompressedInput
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, reset, toCommandLine, toString
-
Methods inherited from class adams.core.logging.CustomLoggingLevelObject
setLoggingLevel
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.core.Destroyable
destroy
-
Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel
-
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager, toCommandLine
-
Methods inherited from interface adams.data.io.input.SpreadSheetReader
dataRowTypeTipText, getDataRowType, getDefaultFormatExtension, getLastError, getSpreadSheetType, hasLastError, isStopped, read, read, read, read, setDataRowType, setSpreadSheetType, spreadSheetTypeTipText, stopExecution
-
-
-
-
Field Detail
-
HEADER_CELL_PREFIX
public static final String HEADER_CELL_PREFIX
the prefix for empty header cells.
- See Also:
- Constant Field Values
-
m_FillEmptyHeaderCells
protected boolean m_FillEmptyHeaderCells
whether to fill empty header cells.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
globalInfo in interface GlobalInfoSupporter
- Specified by:
globalInfo in class AbstractOptionHandler
- Returns:
- a description suitable for displaying in the gui
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by:
defineOptions in interface OptionHandler
- Overrides:
defineOptions in class AbstractExcelSpreadSheetReader<SheetRange>
-
getDefaultSheetRange
protected SheetRange getDefaultSheetRange()
Returns the default sheet range.
- Specified by:
getDefaultSheetRange in class AbstractMultiSheetSpreadSheetReader<SheetRange>
- Returns:
- the default
-
getFormatDescription
public String getFormatDescription()
Returns a string describing the format (used in the file chooser).
- Specified by:
getFormatDescription in interface FileFormatHandler
- Specified by:
getFormatDescription in interface SpreadSheetReader
- Specified by:
getFormatDescription in class AbstractSpreadSheetReader
- Returns:
- a description suitable for displaying in the file chooser
-
getFormatExtensions
public String[] getFormatExtensions()
Returns the extension(s) of the format.
- Specified by:
getFormatExtensions in interface FileFormatHandler
- Specified by:
getFormatExtensions in interface SpreadSheetReader
- Specified by:
getFormatExtensions in class AbstractSpreadSheetReader
- Returns:
- the extension (without the dot!)
-
getCorrespondingWriter
public SpreadSheetWriter getCorrespondingWriter()
Returns, if available, the corresponding writer.
- Returns:
- the writer, null if none available
-
getInputType
protected AbstractSpreadSheetReader.InputType getInputType()
Returns how to read the data, from a file, stream or reader.
- Specified by:
getInputType in class AbstractSpreadSheetReader
- Returns:
- how to read the data
-
setFillEmptyHeaderCells
public void setFillEmptyHeaderCells(boolean value)
Sets whether to fill empty header cells with a name.
- Parameters:
value - true if to fill
-
getFillEmptyHeaderCells
public boolean getFillEmptyHeaderCells()
Returns whether to fill empty header cells with a name.
- Returns:
- true if to fill
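What filling empty header cells with a name could look like is sketched below. This is not the ADAMS implementation: HeaderFillSketch is a hypothetical class, and the prefix "column-" as well as the 1-based numbering are assumptions, since the actual value of HEADER_CELL_PREFIX is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration (not ADAMS code) of naming empty header cells. */
class HeaderFillSketch {

  /** Assumed prefix; the real HEADER_CELL_PREFIX value may differ. */
  static final String PREFIX = "column-";

  /** Returns a copy of the header in which empty cells received a generated name. */
  static List<String> fill(List<String> header) {
    List<String> result = new ArrayList<>();
    for (int i = 0; i < header.size(); i++) {
      String cell = header.get(i);
      if (cell == null || cell.trim().isEmpty())
        result.add(PREFIX + (i + 1));  // assumed: prefix plus 1-based column index
      else
        result.add(cell);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(fill(Arrays.asList("id", "", null, "value")));
  }
}
```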
-
fillEmptyHeaderCellsTipText
public String fillEmptyHeaderCellsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
numericToString
protected String numericToString(org.apache.poi.ss.usermodel.Cell cell)
Turns a numeric cell into a string. Tries to use "long" representation if possible.
- Parameters:
cell - the cell to process
- Returns:
- the string representation
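The idea of preferring a "long" representation can be sketched as follows. The sketch works on a plain double rather than an org.apache.poi.ss.usermodel.Cell, and NumericToStringSketch is a hypothetical class; how the actual method decides between the two representations is not documented here, so the whole-number test below is an assumption.

```java
/** Hypothetical illustration (not ADAMS code) of a long-first number formatting. */
class NumericToStringSketch {

  /** Doubles represent every whole number exactly only below 2^53. */
  static final double MAX_EXACT = (double) (1L << 53);

  /** Uses the long representation for whole numbers, otherwise the double one. */
  static String numericToString(double value) {
    // NaN fails the equality check, infinities fail the magnitude check
    if (value == Math.rint(value) && Math.abs(value) < MAX_EXACT)
      return Long.toString((long) value);
    return Double.toString(value);
  }

  public static void main(String[] args) {
    System.out.println(numericToString(42.0));
    System.out.println(numericToString(3.5));
  }
}
```

With this approach a cell holding 42.0 ends up as the string 42 rather than 42.0, which avoids spurious decimal places for ID-like columns.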
-
doReadRange
protected List<SpreadSheet> doReadRange(InputStream in)
Reads the spreadsheet content from the specified file.
- Overrides:
doReadRange in class AbstractMultiSheetSpreadSheetReader<SheetRange>
- Parameters:
in - the input stream to read from
- Returns:
- the spreadsheets or null in case of an error
- See Also:
AbstractSpreadSheetReader.getInputType()
-
main
public static void main(String[] args)
Runs the reader from the command-line. Use the option AbstractSpreadSheetReader.OPTION_INPUT to specify the input file. If the option AbstractSpreadSheetReader.OPTION_OUTPUT is specified then the read sheet gets output as .csv files in that directory.
- Parameters:
args - the command-line options to use
-
-