Class JoinOnID

  • All Implemented Interfaces:
    Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, QuickInfoSupporter, SizeOfHandler, Serializable

    public class JoinOnID
    extends AbstractMerge
    Joins the spreadsheets by concatenating rows that share a unique ID.

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -class-finder <adams.data.spreadsheet.columnfinder.ColumnFinder> (property: classFinder)
        The method to use to find class columns in the spreadsheets.
        default: adams.data.spreadsheet.columnfinder.NullFinder
     
    -spreadsheet-names <adams.core.base.BaseString> [-spreadsheet-names ...] (property: spreadsheetNames)
        The list of spreadsheet names to use in column renaming.
        default:
     
    -column-renames-exp <adams.core.base.BaseRegExp> [-column-renames-exp ...] (property: columnRenamesExp)
        The expressions to use to select column names for renaming (one per spreadsheet).
        default:
        more: https://docs.oracle.com/javase/tutorial/essential/regex/
        https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
     
    -column-renames-format <adams.core.base.BaseString> [-column-renames-format ...] (property: columnRenamesFormat)
        One format string for each renaming expression to specify how to rename
        the column. Can contain the {SPREADSHEET} keyword which will be replaced
        by the spreadsheet name, and also group identifiers which will be replaced
        by groups from the renaming regex.
        default:
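
As a rough illustration of how a renaming expression and its format string could interact, the sketch below renames a column only when the expression matches the whole column name. It is not the ADAMS implementation: the `$1` group syntax and the full-match semantics are assumptions borrowed from java.util.regex, and `RenameSketch` and `rename` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RenameSketch {
  /**
   * Returns the renamed column, or the original name if the expression
   * does not match the whole column name (assumed semantics).
   */
  static String rename(String column, String sheetName, String regex, String format) {
    Matcher m = Pattern.compile(regex).matcher(column);
    if (!m.matches()) {
      return column;
    }
    // Substitute the keyword first; quote the name so '$' or '\' in it stay literal.
    String replacement = format.replace("{SPREADSHEET}", Matcher.quoteReplacement(sheetName));
    // Expand group references ($1, $2, ...) against the successful match.
    StringBuffer out = new StringBuffer();
    m.appendReplacement(out, replacement);
    return out.toString();
  }
}
```

Under these assumptions, `RenameSketch.rename("value", "sheetA", "(.*)", "{SPREADSHEET}-$1")` yields `sheetA-value`, while a column that does not match the expression keeps its name.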
     
    -output-name <java.lang.String> (property: outputName)
        The name to use for the merged spreadsheet.
        default: output
     
    -ensure-equal-values <boolean> (property: ensureEqualValues)
        Whether multiple columns being merged into a single column require equal
        values from all sources.
        default: false
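
The effect of this option can be pictured with the sketch below, which merges the values that several source columns contribute to one cell. It is only an illustration: `EqualValuesSketch` and `mergeCell` are hypothetical names, and both the "first available value wins" rule and the use of an exception to report a conflict are assumptions, not behaviour confirmed by this documentation.

```java
import java.util.List;

final class EqualValuesSketch {
  /**
   * Merges the values contributed by several sources into one cell value.
   * A null entry stands for a source that has no value for this cell.
   */
  static String mergeCell(List<String> values, boolean ensureEqualValues) {
    String merged = null;
    for (String value : values) {
      if (value == null) {
        continue; // this source contributes nothing
      }
      if (merged == null) {
        merged = value; // first available value wins (assumption)
      } else if (ensureEqualValues && !merged.equals(value)) {
        throw new IllegalStateException(
            "Conflicting values: '" + merged + "' vs '" + value + "'");
      }
    }
    return merged;
  }
}
```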
     
    -unique-id <java.lang.String> (property: uniqueID)
        The name of the column to use as the joining key for the merge.
        default:
     
    -complete-rows-only <boolean> (property: completeRowsOnly)
        Whether only those IDs that have source data in all spreadsheets should
        be merged.
        default: false
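
To make the interplay of the unique-id and complete-rows-only options concrete, here is a minimal, self-contained sketch of a join on an ID column. It does not use the ADAMS classes: each spreadsheet is modelled as a list of rows (column name to value), `JoinSketch` and `join` are hypothetical names, and the sketch assumes that an ID occurs at most once per spreadsheet and that column names do not collide (the real class offers the renaming options for that case).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JoinSketch {
  static List<Map<String, String>> join(List<List<Map<String, String>>> sheets,
                                        String uniqueID,
                                        boolean completeRowsOnly) {
    // ID -> merged row, kept in order of first appearance.
    Map<String, Map<String, String>> merged = new LinkedHashMap<>();
    // ID -> number of spreadsheets that contain it.
    Map<String, Integer> seenIn = new LinkedHashMap<>();
    for (List<Map<String, String>> sheet : sheets) {
      for (Map<String, String> row : sheet) {
        String id = row.get(uniqueID);
        merged.computeIfAbsent(id, k -> new LinkedHashMap<>()).putAll(row);
        seenIn.merge(id, 1, Integer::sum);
      }
    }
    List<Map<String, String>> result = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> entry : merged.entrySet()) {
      boolean complete = seenIn.get(entry.getKey()) == sheets.size();
      if (complete || !completeRowsOnly) {
        result.add(entry.getValue());
      }
    }
    return result;
  }
}
```

With two spreadsheets holding IDs {1, 2} and {1, 3}, this sketch returns three merged rows when `completeRowsOnly` is false and only the row for ID 1 when it is true.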
     
    Author:
    Corey Sterling (csterlin at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_UniqueID

        protected String m_UniqueID
        The name of the column to use as the merge key.
      • m_CompleteRowsOnly

        protected boolean m_CompleteRowsOnly
        Whether or not to skip IDs that don't exist in all source spreadsheets.
    • Constructor Detail

      • JoinOnID

        public JoinOnID()
    • Method Detail

      • getUniqueID

        public String getUniqueID()
        Gets the name of the unique ID column that the merge is joining on.
        Returns:
        The name of the unique ID column.
      • setUniqueID

        public void setUniqueID(String value)
        Sets the name of the unique ID column that the merge is joining on.
        Parameters:
        value - The name of the unique ID column.
      • uniqueIDTipText

        public String uniqueIDTipText()
        Gets the tip-text for the unique ID option.
        Returns:
        The tip-text as a String.
      • getCompleteRowsOnly

        public boolean getCompleteRowsOnly()
        Gets whether incomplete rows should be skipped.
        Returns:
        Whether incomplete rows should be skipped.
      • setCompleteRowsOnly

        public void setCompleteRowsOnly(boolean value)
        Sets whether incomplete rows should be skipped.
        Parameters:
        value - Whether incomplete rows should be skipped.
      • completeRowsOnlyTipText

        public String completeRowsOnlyTipText()
        Gets the tip-text for the complete-rows-only option.
        Returns:
        The tip-text as a String.
      • checkAllSpreadsheetsHaveIDColumn

        protected String checkAllSpreadsheetsHaveIDColumn(SpreadSheet[] spreadsheets)
        Checks that each of the given spreadsheets has the unique ID column.
        Parameters:
        spreadsheets - The spreadsheets that are to be merged.
        Returns:
        Null if all spreadsheets have the unique ID column, otherwise an error message.
      • isUniqueIDName

        protected boolean isUniqueIDName(String columnName)
        Whether the given column name is the name of the unique ID column.
        Parameters:
        columnName - The column name to check.
        Returns:
        True if the given column name is the unique ID name, false otherwise.
      • findColumnIndexOfUniqueID

        protected int findColumnIndexOfUniqueID(SpreadSheet spreadsheet)
        Finds the index of the unique ID column in the given spreadsheet.
        Parameters:
        spreadsheet - The spreadsheet to search.
        Returns:
        The index of the unique ID column, or -1 if not found.
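
The contracts of the three helper methods above (an index or -1, and null or an error message) can be sketched without the ADAMS classes as follows. Each spreadsheet is reduced to its list of header names; `IdColumnSketch` and the wording of the error message are hypothetical, and exact, case-sensitive name matching is an assumption.

```java
import java.util.List;

final class IdColumnSketch {
  /** Returns the index of the ID column in the header, or -1 if it is absent. */
  static int findColumnIndexOfUniqueID(List<String> header, String uniqueID) {
    return header.indexOf(uniqueID);
  }

  /** Returns null if every header contains the ID column, otherwise an error message. */
  static String checkAllSpreadsheetsHaveIDColumn(List<List<String>> headers, String uniqueID) {
    for (int i = 0; i < headers.size(); i++) {
      if (findColumnIndexOfUniqueID(headers.get(i), uniqueID) == -1) {
        return "Spreadsheet #" + (i + 1) + " has no column named '" + uniqueID + "'";
      }
    }
    return null;
  }
}
```

Returning null for success and a message for failure mirrors the convention documented for `check` and `checkAllSpreadsheetsHaveIDColumn` on this page.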
      • compare

        protected int compare(List<AbstractMerge.SourceColumn> sources1,
                              List<AbstractMerge.SourceColumn> sources2)
        Compares two lists of source columns to determine the order in which their mapped columns should appear in the merged spreadsheet.
        Overrides:
        compare in class AbstractMerge
        Parameters:
        sources1 - The source columns of the first mapped column.
        sources2 - The source columns of the second mapped column.
        Returns:
        -1 if sources1 < sources2, 1 if sources1 > sources2, otherwise 0.
      • getMappedColumnName

        protected String getMappedColumnName(AbstractMerge.SourceColumn source)
        Gets the name of the column in the merged spreadsheet that the given source column maps to.
        Overrides:
        getMappedColumnName in class AbstractMerge
        Parameters:
        source - The source column.
        Returns:
        The name of the mapped column in the merged spreadsheet.
      • getRowSetEnumeration

        protected Enumeration<int[]> getRowSetEnumeration()
        Allows specific merge methods to specify the order in which rows are placed into the merged spreadsheet, and which rows from the source spreadsheets are used for the source data.
        Specified by:
        getRowSetEnumeration in class AbstractMerge
        Returns:
        An enumeration of the source rows, one row for each spreadsheet.
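
One way to picture the returned row sets for a join on an ID is sketched below: every `int[]` holds one row index per spreadsheet, grouped by shared ID. This is not the ADAMS implementation. `RowSetSketch` and `rowSets` are hypothetical names, each spreadsheet is reduced to the list of values in its ID column, and marking a spreadsheet that lacks the ID with -1 is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RowSetSketch {
  static Enumeration<int[]> rowSets(List<List<String>> idColumns, boolean completeRowsOnly) {
    // ID -> row index in each spreadsheet, in order of first appearance.
    Map<String, int[]> byId = new LinkedHashMap<>();
    for (int sheet = 0; sheet < idColumns.size(); sheet++) {
      List<String> ids = idColumns.get(sheet);
      for (int row = 0; row < ids.size(); row++) {
        int[] rowSet = byId.computeIfAbsent(ids.get(row), k -> {
          int[] fresh = new int[idColumns.size()];
          Arrays.fill(fresh, -1); // -1 marks "no row for this ID" (assumption)
          return fresh;
        });
        rowSet[sheet] = row;
      }
    }
    List<int[]> result = new ArrayList<>();
    for (int[] rowSet : byId.values()) {
      boolean complete = Arrays.stream(rowSet).allMatch(index -> index >= 0);
      if (complete || !completeRowsOnly) {
        result.add(rowSet);
      }
    }
    return Collections.enumeration(result);
  }
}
```

For ID columns `["1", "2"]` and `["2", "3"]`, the sketch enumerates `[0, -1]`, `[1, 0]` and `[-1, 1]`; with `completeRowsOnly` set, only `[1, 0]` remains.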
      • check

        protected String check(SpreadSheet[] datasets)
        Hook method for performing checks before attempting the merge.
        Overrides:
        check in class AbstractMerge
        Parameters:
        datasets - the spreadsheets to merge
        Returns:
        null if successfully checked, otherwise error message