java.lang.Object
- adams.core.logging.LoggingObject
  - adams.core.logging.CustomLoggingLevelObject
    - adams.core.option.AbstractOptionHandler
      - adams.flow.transformer.wekadatasetsmerge.AbstractMerge
        - adams.flow.transformer.wekadatasetsmerge.JoinOnID

All Implemented Interfaces: adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.QuickInfoSupporter, adams.core.SizeOfHandler, Serializable

public class JoinOnID
extends AbstractMerge

Joins the datasets by concatenating rows that share a unique ID.

-logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
    The logging level for outputting errors and debugging output.
    default: WARNING

-class-finder <adams.data.weka.columnfinder.ColumnFinder> (property: classFinder)
    The column finder to use to find class attributes in the datasets.
    default: adams.data.weka.columnfinder.Class

-dataset-names <adams.core.base.BaseString> [-dataset-names ...] (property: datasetNames)
    The list of dataset names to use in attribute renaming.
    default:

-attr-renames-exp <adams.core.base.BaseRegExp> [-attr-renames-exp ...] (property: attributeRenamesExp)
    The expressions to use to select attribute names for renaming (one per
    dataset).
    default:
    more: https://docs.oracle.com/javase/tutorial/essential/regex/
    https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

-attr-renames-format <adams.core.base.BaseString> [-attr-renames-format ...] (property: attributeRenamesFormat)
    One format string for each renaming expression to specify how to rename
    the attribute. Can contain the {DATASET} keyword which will be replaced
    by the dataset name, and also group identifiers which will be replaced by
    groups from the renaming regex.
    default:

-output-name <java.lang.String> (property: outputName)
    The name to use for the merged dataset.
    default: output

-ensure-equal-values <boolean> (property: ensureEqualValues)
    Whether multiple attributes being merged into a single attribute require
    equal values from all sources.
    default: false

-unique-id <java.lang.String> (property: uniqueID)
    The name of the attribute to use as the joining key for the merge.
    default:

-complete-rows-only <boolean> (property: completeRowsOnly)
    Whether only those IDs that have source data in all datasets should be merged.
    default: false
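
The interplay of the attribute-renaming options above (datasetNames, attributeRenamesExp, attributeRenamesFormat) can be sketched in plain Java. This is only an illustration under stated assumptions, not the ADAMS implementation: the class `RenameSketch` and its `rename` helper are hypothetical, the expression is assumed to have to match the whole attribute name, and group identifiers are assumed to use the `$1` syntax of `java.util.regex`. Only the `{DATASET}` keyword is taken from the option description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenameSketch {

    /** Keyword replaced by the dataset name (from the option description). */
    static final String DATASET_KEYWORD = "{DATASET}";

    /**
     * Hypothetical helper: renames an attribute if the expression matches
     * its whole name, otherwise returns the name unchanged.
     * Assumption: groups in the format string are written as $0, $1, ...
     */
    static String rename(String attribute, String regex, String format, String datasetName) {
        Matcher matcher = Pattern.compile(regex).matcher(attribute);
        if (!matcher.matches())
            return attribute;
        // Expand the group references of the successful whole-string match.
        StringBuffer expanded = new StringBuffer();
        matcher.appendReplacement(expanded, format);
        // Substitute the dataset name last, so a '$' in the name is kept literally.
        return expanded.toString().replace(DATASET_KEYWORD, datasetName);
    }

    public static void main(String[] args) {
        System.out.println(rename("temp", "(.*)", "{DATASET}-$1", "sensorA"));
        System.out.println(rename("id", "temp.*", "{DATASET}-$0", "sensorA"));
    }
}
```

With one expression and one format string per dataset, each dataset's attributes would be passed through such a helper together with that dataset's name.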

Performs a merge by using a unique ID attribute for each source dataset to concatenate rows with the same ID.
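
A minimal sketch of the join described here, written against plain Java collections rather than weka.core.Instances. It is an assumption-laden illustration, not the class's actual code: `JoinOnIdSketch` and `rowSets` are hypothetical names, the value `-1` for a missing row is assumed (the inherited `ROW_MISSING` constant is not documented on this page), and IDs are assumed to be unique within each dataset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JoinOnIdSketch {

    /** Marker for "this dataset has no row with that ID" (assumed value). */
    static final int ROW_MISSING = -1;

    /**
     * @param idColumns        the ID values of each dataset; list position = row index
     * @param completeRowsOnly whether to drop IDs that are missing from any dataset
     * @return one int[] per merged row, holding the source row index for each dataset
     */
    static List<int[]> rowSets(List<List<String>> idColumns, boolean completeRowsOnly) {
        // Collect, per ID, the row index in every dataset (order of first appearance).
        Map<String, int[]> byId = new LinkedHashMap<>();
        for (int d = 0; d < idColumns.size(); d++) {
            List<String> ids = idColumns.get(d);
            for (int row = 0; row < ids.size(); row++) {
                int[] rows = byId.computeIfAbsent(ids.get(row), key -> {
                    int[] fresh = new int[idColumns.size()];
                    Arrays.fill(fresh, ROW_MISSING);
                    return fresh;
                });
                rows[d] = row;
            }
        }
        // Keep every row set, or only those with a source row in all datasets.
        List<int[]> result = new ArrayList<>();
        for (int[] rows : byId.values()) {
            boolean complete = Arrays.stream(rows).noneMatch(r -> r == ROW_MISSING);
            if (complete || !completeRowsOnly)
                result.add(rows);
        }
        return result;
    }
}
```

Each returned array names the rows whose values would be concatenated into one merged row; with completeRowsOnly set to false, a missing entry would presumably lead to missing values in the corresponding attributes.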

Author: Corey Sterling (csterlin at waikato dot ac dot nz)
See Also: Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`class`	`JoinOnID.UniqueIDEnumeration`	Enumeration class that returns the rows from the source datasets joined on the unique ID attribute.

Nested classes/interfaces inherited from class adams.flow.transformer.wekadatasetsmerge.AbstractMerge
AbstractMerge.SourceAttribute

Field Summary

Fields
Modifier and Type	Field	Description
`protected boolean`	`m_CompleteRowsOnly`	Whether or not to skip IDs that don't exist in all source datasets.
`protected String`	`m_UniqueID`	The name of the attribute to use as the merge key.

Fields inherited from class adams.flow.transformer.wekadatasetsmerge.AbstractMerge
DATASET_KEYWORD, m_AttributeRenameFindRegexs, m_AttributeRenameFormatStrings, m_ClassAttributes, m_ClassFinder, m_DatasetNames, m_Datasets, m_EnsureEqualValues, m_MergedDatasetName, ROW_MISSING

Fields inherited from class adams.core.option.AbstractOptionHandler
m_OptionManager

Fields inherited from class adams.core.logging.LoggingObject
m_Logger, m_LoggingIsEnabled, m_LoggingLevel

Constructor Summary

Constructors
Constructor	Description
`JoinOnID()`	

Method Summary

Modifier and Type	Method	Description
`protected String`	`check(weka.core.Instances[] datasets)`	Hook method for performing checks before attempting the merge.
`protected String`	`checkAllDatasetsHaveIDAttribute(weka.core.Instances[] datasets)`	Checks that each of the given datasets has the unique ID attribute.
`protected String`	`checkAttributeMapping(Map<String,List<AbstractMerge.SourceAttribute>> attributeMapping)`	Makes sure the source data for each mapped attribute is the same type.
`protected int`	`compare(List<AbstractMerge.SourceAttribute> sources1, List<AbstractMerge.SourceAttribute> sources2)`	Compares two lists of source attributes to determine the order in which their mapped attributes should appear in the merged dataset.
`String`	`completeRowsOnlyTipText()`	Gets the tip-text for the complete-rows-only option.
`void`	`defineOptions()`	Adds options to the internal list of options.
`protected int`	`findAttributeIndexOfUniqueID(weka.core.Instances dataset)`	Finds the index of the unique ID attribute in the given dataset.
`boolean`	`getCompleteRowsOnly()`	Gets whether incomplete rows should be skipped.
`protected String`	`getMappedAttributeName(AbstractMerge.SourceAttribute source)`	Gets the name of the attribute in the merged dataset that the given source attribute maps to.
`protected Enumeration<int[]>`	`getRowSetEnumeration()`	Allows specific merge methods to specify the order in which rows are placed into the merged dataset, and which rows from the source datasets are used for the source data.
`String`	`getUniqueID()`	Gets the name of the unique ID attribute that the merge is joining on.
`String`	`globalInfo()`	Returns a string describing the object.
`protected boolean`	`isUniqueIDName(String attributeName)`	Whether the given attribute name is the name of the unique ID attribute.
`void`	`setCompleteRowsOnly(boolean value)`	Sets whether incomplete rows should be skipped.
`void`	`setUniqueID(String value)`	Sets the name of the unique ID attribute that the merge is joining on.
`String`	`uniqueIDTipText()`	Gets the tip-text for the unique ID option.

Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, initialize, loggingLevelTipText, newOptionManager, reset, setLoggingLevel, toCommandLine, toString

Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface adams.core.logging.LoggingLevelHandler
getLoggingLevel

- Field Detail
  - m_UniqueID
```
protected String m_UniqueID
```
    The name of the attribute to use as the merge key.
  - m_CompleteRowsOnly
```
protected boolean m_CompleteRowsOnly
```
    Whether or not to skip IDs that don't exist in all source datasets.
- Constructor Detail
  - JoinOnID
```
public JoinOnID()
```
- Method Detail
  - globalInfo
```
public String globalInfo()
```
    Returns a string describing the object.
    
    Specified by:
    
    globalInfo in interface adams.core.GlobalInfoSupporter
    
    Specified by:
    
    globalInfo in class adams.core.option.AbstractOptionHandler
    
    Returns:
    
    a description suitable for displaying in the gui
  - defineOptions
```
public void defineOptions()
```
    Adds options to the internal list of options.
    
    Specified by:
    
    defineOptions in interface adams.core.option.OptionHandler
    
    Overrides:
    
    defineOptions in class AbstractMerge
  - getUniqueID
```
public String getUniqueID()
```
    Gets the name of the unique ID attribute that the merge is joining on.
    
    Returns:
    
    The name of the unique ID attribute.
  - setUniqueID
```
public void setUniqueID(String value)
```
    Sets the name of the unique ID attribute that the merge is joining on.
    
    Parameters:
    
    value - The name of the unique ID attribute.
  - uniqueIDTipText
```
public String uniqueIDTipText()
```
    Gets the tip-text for the unique ID option.
    
    Returns:
    
    The tip-text as a String.
  - getCompleteRowsOnly
```
public boolean getCompleteRowsOnly()
```
    Gets whether incomplete rows should be skipped.
    
    Returns:
    
    Whether incomplete rows should be skipped.
  - setCompleteRowsOnly
```
public void setCompleteRowsOnly(boolean value)
```
    Sets whether incomplete rows should be skipped.
    
    Parameters:
    
    value - Whether incomplete rows should be skipped.
  - completeRowsOnlyTipText
```
public String completeRowsOnlyTipText()
```
    Gets the tip-text for the complete-rows-only option.
    
    Returns:
    
    The tip-text as a String.
  - checkAllDatasetsHaveIDAttribute
```
protected String checkAllDatasetsHaveIDAttribute(weka.core.Instances[] datasets)
```
    Checks that each of the given datasets has the unique ID attribute. Also checks that the unique ID attribute is the same data type for all datasets.
    
    Parameters:
    
    datasets - The datasets that are to be merged.
    
    Returns:
    
    Null if all datasets have the unique ID attribute, otherwise an error message.
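
The "null on success, error message on failure" convention used by this check can be sketched as follows. The sketch makes simplifying assumptions: each dataset header is modelled as a hypothetical map from attribute name to a type label instead of weka.core.Instances, and the class name, method name and message wording are invented for illustration.

```java
import java.util.List;
import java.util.Map;

class IdAttributeCheckSketch {

    /**
     * Hypothetical stand-in for the check: every header must contain the
     * unique ID attribute, and its type must be the same everywhere.
     *
     * @param headers  per dataset, a map of attribute name to type label
     * @param uniqueId the name of the unique ID attribute
     * @return null if the check passes, otherwise an error message
     */
    static String check(List<Map<String, String>> headers, String uniqueId) {
        String expectedType = null;
        for (int i = 0; i < headers.size(); i++) {
            String type = headers.get(i).get(uniqueId);
            if (type == null)
                return "Dataset " + (i + 1) + " has no attribute '" + uniqueId + "'";
            if (expectedType == null)
                expectedType = type;  // the first dataset defines the expected type
            else if (!expectedType.equals(type))
                return "Attribute '" + uniqueId + "' is " + type + " in dataset "
                    + (i + 1) + " but " + expectedType + " in dataset 1";
        }
        return null;
    }
}
```

Returning a message instead of throwing lets the caller chain several checks and report the first failure.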
  - isUniqueIDName
```
protected boolean isUniqueIDName(String attributeName)
```
    Whether the given attribute name is the name of the unique ID attribute.
    
    Parameters:
    
    attributeName - The attribute name to check.
    
    Returns:
    
    True if the given attribute name is the unique ID name, false otherwise.
  - findAttributeIndexOfUniqueID
```
protected int findAttributeIndexOfUniqueID(weka.core.Instances dataset)
```
    Finds the index of the unique ID attribute in the given dataset.
    
    Parameters:
    
    dataset - The dataset to search.
    
    Returns:
    
    The index of the unique ID attribute, or -1 if not found.
  - compare
```
protected int compare(List<AbstractMerge.SourceAttribute> sources1,
                      List<AbstractMerge.SourceAttribute> sources2)
```
    Compares two lists of source attributes to determine the order in which their mapped attributes should appear in the merged dataset.
    
    Overrides:
    
    compare in class AbstractMerge
    
    Parameters:
    
    sources1 - The source attributes of the first mapped attribute.
    
    sources2 - The source attributes of the second mapped attribute.
    
    Returns:
    
    -1 if sources1 < sources2, 1 if sources1 > sources2, otherwise 0.
  - getMappedAttributeName
```
protected String getMappedAttributeName(AbstractMerge.SourceAttribute source)
```
    Gets the name of the attribute in the merged dataset that the given source attribute maps to.
    
    Overrides:
    
    getMappedAttributeName in class AbstractMerge
    
    Parameters:
    
    source - The source attribute.
    
    Returns:
    
    The name of the mapped attribute in the merged dataset.
  - getRowSetEnumeration
```
protected Enumeration<int[]> getRowSetEnumeration()
```
    Allows specific merge methods to specify the order in which rows are placed into the merged dataset, and which rows from the source datasets are used for the source data.
    
    Specified by:
    
    getRowSetEnumeration in class AbstractMerge
    
    Returns:
    
    An enumeration of the source rows, one row for each dataset.
  - check
```
protected String check(weka.core.Instances[] datasets)
```
    Hook method for performing checks before attempting the merge.
    
    Overrides:
    
    check in class AbstractMerge
    
    Parameters:
    
    datasets - the datasets to merge
    
    Returns:
    
    null if successfully checked, otherwise error message
  - checkAttributeMapping
```
protected String checkAttributeMapping(Map<String,List<AbstractMerge.SourceAttribute>> attributeMapping)
```
    Makes sure the source data for each mapped attribute is the same type.
    
    Overrides:
    
    checkAttributeMapping in class AbstractMerge
    
    Parameters:
    
    attributeMapping - The attribute mapping.
    
    Returns:
    
    Null if all mappings are okay, or an error message if not.
