Class JoinOnID
- java.lang.Object
  - adams.core.logging.LoggingObject
    - adams.core.logging.CustomLoggingLevelObject
      - adams.core.option.AbstractOptionHandler
        - adams.flow.transformer.spreadsheetmethodmerge.AbstractMerge
          - adams.flow.transformer.spreadsheetmethodmerge.JoinOnID
All Implemented Interfaces: Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, QuickInfoSupporter, SizeOfHandler, Serializable
public class JoinOnID extends AbstractMerge
Joins the spreadsheets by concatenating rows that share a unique ID.
-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel) The logging level for outputting errors and debugging output. default: WARNING
-class-finder <adams.data.spreadsheet.columnfinder.ColumnFinder> (property: classFinder) The method to use to find class columns in the spreadsheets. default: adams.data.spreadsheet.columnfinder.NullFinder
-spreadsheet-names <adams.core.base.BaseString> [-spreadsheet-names ...] (property: spreadsheetNames) The list of spreadsheet names to use in column renaming. default:
-column-renames-exp <adams.core.base.BaseRegExp> [-column-renames-exp ...] (property: columnRenamesExp) The expressions to use to select column names for renaming (one per spreadsheet). default: more: https://docs.oracle.com/javase/tutorial/essential/regex/ https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
-column-renames-format <adams.core.base.BaseString> [-column-renames-format ...] (property: columnRenamesFormat) One format string for each renaming expression to specify how to rename the column. Can contain the {SPREADSHEET} keyword which will be replaced by the spreadsheet name, and also group identifiers which will be replaced by groups from the renaming regex. default:
-output-name <java.lang.String> (property: outputName) The name to use for the merged spreadsheet. default: output
-ensure-equal-values <boolean> (property: ensureEqualValues) Whether multiple columns being merged into a single column must have equal values in all sources. default: false
-unique-id <java.lang.String> (property: uniqueID) The name of the column to use as the joining key for the merge. default:
-complete-rows-only <boolean> (property: completeRowsOnly) Whether only those IDs that have source data in all spreadsheets should be merged. default: false
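
The interaction between -column-renames-exp and -column-renames-format can be illustrated with a small standalone sketch. It does not use ADAMS classes. The helper `rename` is hypothetical, and two details are assumptions rather than documented behaviour: that an expression must match the whole column name, and that group identifiers use the `$1` syntax of `java.util.regex`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColumnRenameSketch {

    /**
     * Hypothetical helper (not part of ADAMS): renames a column if the
     * expression matches the complete column name, otherwise returns the
     * name unchanged.
     */
    static String rename(String columnName, String spreadsheetName,
                         String regex, String format) {
        Matcher m = Pattern.compile(regex).matcher(columnName);
        if (!m.matches())
            return columnName;
        // Substitute the {SPREADSHEET} keyword first; quoting keeps '$' or
        // '\' in the spreadsheet name from being read as group references.
        String replacement = format.replace(
            "{SPREADSHEET}", Matcher.quoteReplacement(spreadsheetName));
        // Expand group references ($1, $2, ...) against the full match.
        StringBuffer result = new StringBuffer();
        m.appendReplacement(result, replacement);
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // Prefix every column of the spreadsheet named "lab".
        System.out.println(rename("weight", "lab", "(.*)", "{SPREADSHEET}-$1"));
        // An expression that does not match leaves the name alone.
        System.out.println(rename("id", "lab", "w(.*)", "{SPREADSHEET}-$1"));
    }
}
```

With one expression and one format string per spreadsheet, a pattern like this would let columns that share a name across sources be kept apart in the merged output.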
Author: Corey Sterling (csterlin at waikato dot ac dot nz)
See Also: Serialized Form

Nested Class Summary
class JoinOnID.UniqueIDEnumeration
Enumeration class that returns the rows from the source spreadsheets joined on the unique ID column.

Nested classes/interfaces inherited from class adams.flow.transformer.spreadsheetmethodmerge.AbstractMerge
AbstractMerge.SourceColumn

Field Summary
protected boolean m_CompleteRowsOnly
Whether or not to skip IDs that don't exist in all source spreadsheets.

protected String m_UniqueID
The name of the column to use as the merge key.

Fields inherited from class adams.flow.transformer.spreadsheetmethodmerge.AbstractMerge
m_ClassColumns, m_ClassFinder, m_ColumnRenameFindRegexs, m_ColumnRenameFormatStrings, m_EnsureEqualValues, m_MergedSpreadsheetName, m_SpreadsheetNames, m_Spreadsheets, ROW_MISSING, SPREADSHEET_KEYWORD

Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager

Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel

Constructor Summary
JoinOnID()

Method Summary
protected String check(SpreadSheet[] datasets)
Hook method for performing checks before attempting the merge.

protected String checkAllSpreadsheetsHaveIDColumn(SpreadSheet[] spreadsheets)
Checks that each of the given spreadsheets has the unique ID column.

protected String checkColumnMapping(Map<String,List<AbstractMerge.SourceColumn>> columnMapping)
Makes sure the source data for each mapped column is the same type.

protected int compare(List<AbstractMerge.SourceColumn> sources1, List<AbstractMerge.SourceColumn> sources2)
Compares two lists of source columns to determine the order in which their mapped columns should appear in the merged spreadsheet.

String completeRowsOnlyTipText()
Gets the tip-text for the complete-rows-only option.

void defineOptions()
Adds options to the internal list of options.

protected int findColumnIndexOfUniqueID(SpreadSheet spreadsheet)
Finds the index of the unique ID column in the given spreadsheet.

boolean getCompleteRowsOnly()
Gets whether incomplete rows should be skipped.

protected String getMappedColumnName(AbstractMerge.SourceColumn source)
Gets the name of the column in the merged spreadsheet that the given source column maps to.

protected Enumeration<int[]> getRowSetEnumeration()
Allows specific merge methods to specify the order in which rows are placed into the merged spreadsheet, and which rows from the source spreadsheets are used for the source data.

String getUniqueID()
Gets the name of the unique ID column that the merge is joining on.

String globalInfo()
Returns a string describing the object.

protected boolean isUniqueIDName(String columnName)
Whether the given column name is the name of the unique ID column.

void setCompleteRowsOnly(boolean value)
Sets whether incomplete rows should be skipped.

void setUniqueID(String value)
Sets the name of the unique ID column that the merge is joining on.

String uniqueIDTipText()
Gets the tip-text for the unique ID option.

Methods inherited from class adams.flow.transformer.spreadsheetmethodmerge.AbstractMerge
classFinderTipText, columnRenamesExpTipText, columnRenamesFormatTipText, createColumnMapping, createEmptyResultantSpreadsheet, ensureEqualValuesTipText, getClassFinder, getColumnRenamesExp, getColumnRenamesFormat, getEnsureEqualValues, getOutputName, getQuickInfo, getSpreadsheetNames, getValue, getValueEnsureEqual, getValueFirstAvailable, isAnyClassColumn, isClassColumn, isClassColumn, merge, outputNameTipText, recordClassColumns, resetInternalState, setClassFinder, setColumnRenamesExp, setColumnRenamesFormat, setEnsureEqualValues, setOutputName, setSpreadsheetNames, setValue, spreadsheetNamesTipText

Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, initialize, loggingLevelTipText, newOptionManager, reset, setLoggingLevel, toCommandLine, toString

Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel

Field Detail
m_UniqueID
protected String m_UniqueID
The name of the column to use as the merge key.

m_CompleteRowsOnly
protected boolean m_CompleteRowsOnly
Whether or not to skip IDs that don't exist in all source spreadsheets.

Method Detail
globalInfo
public String globalInfo()
Returns a string describing the object.
Specified by: globalInfo in interface GlobalInfoSupporter
Specified by: globalInfo in class AbstractOptionHandler
Returns: a description suitable for displaying in the gui

defineOptions
public void defineOptions()
Adds options to the internal list of options.
Specified by: defineOptions in interface OptionHandler
Overrides: defineOptions in class AbstractMerge

getUniqueID
public String getUniqueID()
Gets the name of the unique ID column that the merge is joining on.
Returns: The name of the unique ID column.

setUniqueID
public void setUniqueID(String value)
Sets the name of the unique ID column that the merge is joining on.
Parameters: value - The name of the unique ID column.

uniqueIDTipText
public String uniqueIDTipText()
Gets the tip-text for the unique ID option.
Returns: The tip-text as a String.

getCompleteRowsOnly
public boolean getCompleteRowsOnly()
Gets whether incomplete rows should be skipped.
Returns: Whether incomplete rows should be skipped.

setCompleteRowsOnly
public void setCompleteRowsOnly(boolean value)
Sets whether incomplete rows should be skipped.
Parameters: value - Whether incomplete rows should be skipped.

completeRowsOnlyTipText
public String completeRowsOnlyTipText()
Gets the tip-text for the complete-rows-only option.
Returns: The tip-text as a String.

checkAllSpreadsheetsHaveIDColumn
protected String checkAllSpreadsheetsHaveIDColumn(SpreadSheet[] spreadsheets)
Checks that each of the given spreadsheets has the unique ID column.
Parameters: spreadsheets - The spreadsheets that are to be merged.
Returns: Null if all spreadsheets have the unique ID column, otherwise an error message.

isUniqueIDName
protected boolean isUniqueIDName(String columnName)
Whether the given column name is the name of the unique ID column.
Parameters: columnName - The column name to check.
Returns: True if the given column name is the unique ID name, false otherwise.

findColumnIndexOfUniqueID
protected int findColumnIndexOfUniqueID(SpreadSheet spreadsheet)
Finds the index of the unique ID column in the given spreadsheet.
Parameters: spreadsheet - The spreadsheet to search.
Returns: The index of the unique ID column, or -1 if not found.
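
The two return conventions described above (an index or -1 for the lookup, null or an error message for the check) can be sketched without ADAMS classes. In this illustration each spreadsheet is reduced to a list of header names, which is a stand-in for SpreadSheet and not the real type. Exact, case-sensitive name matching and the wording of the error message are assumptions.

```java
import java.util.Arrays;
import java.util.List;

class UniqueIdLookupSketch {

    /** Returns the index of the unique ID column in the header, or -1 if not found. */
    static int findColumnIndexOfUniqueID(List<String> header, String uniqueID) {
        for (int i = 0; i < header.size(); i++) {
            if (header.get(i).equals(uniqueID))
                return i;
        }
        return -1;
    }

    /** Returns null if every header contains the unique ID column, otherwise an error message. */
    static String checkAllSpreadsheetsHaveIDColumn(List<List<String>> headers, String uniqueID) {
        for (int i = 0; i < headers.size(); i++) {
            if (findColumnIndexOfUniqueID(headers.get(i), uniqueID) == -1)
                return "Spreadsheet #" + (i + 1) + " has no column named '" + uniqueID + "'";
        }
        return null;
    }

    public static void main(String[] args) {
        List<List<String>> headers = Arrays.asList(
            Arrays.asList("id", "weight"),
            Arrays.asList("height", "name"));
        // The second header lacks "id", so an error message is printed.
        System.out.println(checkAllSpreadsheetsHaveIDColumn(headers, "id"));
    }
}
```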

compare
protected int compare(List<AbstractMerge.SourceColumn> sources1, List<AbstractMerge.SourceColumn> sources2)
Compares two lists of source columns to determine the order in which their mapped columns should appear in the merged spreadsheet.
Overrides: compare in class AbstractMerge
Parameters: sources1 - The source columns of the first mapped column.
sources2 - The source columns of the second mapped column.
Returns: -1 if sources1 < sources2, 1 if sources1 > sources2, otherwise 0.

getMappedColumnName
protected String getMappedColumnName(AbstractMerge.SourceColumn source)
Gets the name of the column in the merged spreadsheet that the given source column maps to.
Overrides: getMappedColumnName in class AbstractMerge
Parameters: source - The source column.
Returns: The name of the mapped column in the merged spreadsheet.

getRowSetEnumeration
protected Enumeration<int[]> getRowSetEnumeration()
Allows specific merge methods to specify the order in which rows are placed into the merged spreadsheet, and which rows from the source spreadsheets are used for the source data.
Specified by: getRowSetEnumeration in class AbstractMerge
Returns: An enumeration of the source rows, one row for each spreadsheet.
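
To make the shape of such an enumeration concrete, the following standalone sketch joins rows on an ID and yields one int[] per ID, holding one row index per spreadsheet. It is an illustration under stated assumptions, not the ADAMS implementation: each spreadsheet is reduced to the list of values in its ID column, IDs are assumed unique within a spreadsheet, IDs are emitted in first-seen order, and the value -1 for the ROW_MISSING marker is assumed. The completeRowsOnly flag mirrors the complete-rows-only option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowSetSketch {

    /** Marker for "this spreadsheet has no row for the ID" (value assumed). */
    static final int ROW_MISSING = -1;

    /**
     * Hypothetical helper: idColumns.get(s).get(r) is the ID found in row r
     * of spreadsheet s. Each returned int[] has one row index per spreadsheet.
     */
    static Enumeration<int[]> rowSets(List<List<String>> idColumns, boolean completeRowsOnly) {
        // LinkedHashMap keeps IDs in the order they are first encountered.
        Map<String, int[]> byId = new LinkedHashMap<>();
        for (int s = 0; s < idColumns.size(); s++) {
            List<String> ids = idColumns.get(s);
            for (int r = 0; r < ids.size(); r++) {
                int[] rowSet = byId.computeIfAbsent(ids.get(r), key -> {
                    int[] fresh = new int[idColumns.size()];
                    Arrays.fill(fresh, ROW_MISSING);
                    return fresh;
                });
                rowSet[s] = r;
            }
        }
        List<int[]> result = new ArrayList<>();
        for (int[] rowSet : byId.values()) {
            boolean complete = Arrays.stream(rowSet).noneMatch(i -> i == ROW_MISSING);
            // Incomplete row sets are dropped only when completeRowsOnly is set.
            if (complete || !completeRowsOnly)
                result.add(rowSet);
        }
        return Collections.enumeration(result);
    }

    public static void main(String[] args) {
        List<List<String>> idColumns = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("b", "c"));
        Enumeration<int[]> rows = rowSets(idColumns, false);
        while (rows.hasMoreElements())
            System.out.println(Arrays.toString(rows.nextElement()));
    }
}
```

In this sketch, ID "b" is the only one present in both sources, so it is the only row set that survives when completeRowsOnly is true.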

check
protected String check(SpreadSheet[] datasets)
Hook method for performing checks before attempting the merge.
Overrides: check in class AbstractMerge
Parameters: datasets - the spreadsheets to merge
Returns: null if successfully checked, otherwise error message

checkColumnMapping
protected String checkColumnMapping(Map<String,List<AbstractMerge.SourceColumn>> columnMapping)
Makes sure the source data for each mapped column is the same type.
Overrides: checkColumnMapping in class AbstractMerge
Parameters: columnMapping - The column mapping.
Returns: Null if all mappings are okay, or an error message if not.
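
A type-consistency check of this kind could look roughly like the sketch below. It is a simplification built on assumptions: each source column is represented only by a type name (a plain String standing in for AbstractMerge.SourceColumn), and the error message wording is invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnMappingCheckSketch {

    /**
     * Hypothetical helper: maps each merged column name to the type names of
     * its source columns. Returns null if every mapped column draws on a
     * single type, otherwise an error message naming the first offender.
     */
    static String checkColumnMapping(Map<String, List<String>> sourceTypes) {
        for (Map.Entry<String, List<String>> entry : sourceTypes.entrySet()) {
            List<String> types = entry.getValue();
            for (String type : types) {
                if (!type.equals(types.get(0)))
                    return "Column '" + entry.getKey() + "' mixes source types " + types;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        mapping.put("id", Arrays.asList("STRING", "STRING"));
        mapping.put("weight", Arrays.asList("DOUBLE", "STRING"));
        // "weight" has sources of two different types, so a message is printed.
        System.out.println(checkColumnMapping(mapping));
    }
}
```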