Package weka.classifiers
Class GroupedRandomSplitGenerator
- java.lang.Object
  - adams.core.logging.LoggingObject
    - adams.core.logging.CustomLoggingLevelObject
      - adams.core.option.AbstractOptionHandler
        - weka.classifiers.AbstractSplitGenerator
          - weka.classifiers.GroupedRandomSplitGenerator
- All Implemented Interfaces:
adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.Randomizable, adams.core.SizeOfHandler, adams.data.splitgenerator.RandomSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>, adams.data.splitgenerator.SplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>, InstancesViewSupporter, Serializable, Iterator<WekaTrainTestSetContainer>, RandomSplitGenerator, SplitGenerator
public class GroupedRandomSplitGenerator extends AbstractSplitGenerator implements RandomSplitGenerator
Generates random train/test splits of a dataset while keeping groups of instances together. The group of an instance is identified by applying a regular expression to the value of the selected attribute.
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
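The idea of a grouped random split can be sketched in plain Java without Weka or ADAMS. This is not the class's actual implementation: the class name GroupedSplitSketch, its Split holder and the split method are hypothetical, and three behaviours are assumptions made for illustration only (the percentage is applied to the number of groups, the result is rounded, and preserveOrder simply skips the shuffle).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration, not part of weka.classifiers or ADAMS.
class GroupedSplitSketch {

  /** Holds the row indices assigned to the training and the test set. */
  static class Split {
    final List<Integer> train = new ArrayList<>();
    final List<Integer> test = new ArrayList<>();
  }

  static Split split(List<String> values, String regExp, String group,
                     double percentage, long seed, boolean preserveOrder) {
    // 1. bucket the row indices by group key (regexp + replacement string)
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    for (int i = 0; i < values.size(); i++) {
      String key = values.get(i).replaceAll(regExp, group);
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
    }

    // 2. randomize the order of the groups, not of the single rows
    //    (assumption: preserveOrder just disables this step)
    List<String> keys = new ArrayList<>(groups.keySet());
    if (!preserveOrder)
      Collections.shuffle(keys, new Random(seed));

    // 3. assumption: the percentage refers to the number of groups
    int numTrain = (int) Math.round(keys.size() * percentage);
    Split result = new Split();
    for (int g = 0; g < keys.size(); g++)
      (g < numTrain ? result.train : result.test).addAll(groups.get(keys.get(g)));
    return result;
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList(
      "a-1-x", "a-1-y", "b-2-x", "b-2-y", "c-3-x", "c-3-y", "d-4-x");
    Split s = split(ids, "^(.*)-([0-9]+)-(.*)$", "$2", 0.5, 42L, false);
    System.out.println("train rows: " + s.train);
    System.out.println("test rows:  " + s.test);
  }
}
```

Because whole groups are assigned to one side, rows that share a group key never end up split across the training and the test set, whatever the seed.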
-
-
Field Summary
Modifier and Type | Field | Description
protected boolean | m_Generated | whether the split was generated.
protected adams.data.splitgenerator.generic.randomsplit.RandomSplitGenerator | m_Generator | the underlying scheme for generating the split.
protected String | m_Group | the group expression.
protected WekaAttributeIndex | m_Index | the index to use for grouping.
protected double | m_Percentage | the percentage.
protected boolean | m_PreserveOrder | whether to preserve the order.
protected adams.core.base.BaseRegExp | m_RegExp | the regular expression for the nominal/string attribute.
-
Fields inherited from class weka.classifiers.AbstractSplitGenerator
m_Data, m_Initialized, m_OriginalIndices, m_Seed, m_UseViews
-
-
Constructor Summary
Constructor | Description
GroupedRandomSplitGenerator() | Initializes the generator.
GroupedRandomSplitGenerator(weka.core.Instances data, long seed, double percentage, boolean preserveOrder, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group) | Initializes the generator.
-
Method Summary
Modifier and Type | Method | Description
protected boolean | canRandomize() | Returns whether randomization is enabled.
protected boolean | checkNext() | Returns true if the iteration has more elements.
protected WekaTrainTestSetContainer | createNext() | Creates the next result.
void | defineOptions() | Adds options to the internal list of options.
protected void | doInitializeIterator() | Initializes the iterator, randomizes the data if required.
String | getGroup() | Returns the replacement string to use as group (e.g. '$2').
WekaAttributeIndex | getIndex() | Returns the attribute index to use for grouping.
double | getPercentage() | Returns the split percentage.
boolean | getPreserveOrder() | Returns whether to preserve the order.
adams.core.base.BaseRegExp | getRegExp() | Returns the regular expression for identifying the group (e.g. '^(.*)-([0-9]+)-(.*)$').
String | globalInfo() | Returns a string describing the object.
String | groupTipText() | Returns the tip text for this property.
String | indexTipText() | Returns the tip text for this property.
String | percentageTipText() | Returns the tip text for this property.
String | preserveOrderTipText() | Returns the tip text for this property.
String | regExpTipText() | Returns the tip text for this property.
void | setGroup(String value) | Sets the replacement string to use as group (e.g. '$2').
void | setIndex(WekaAttributeIndex value) | Sets the attribute index to use for grouping.
void | setPercentage(double value) | Sets the split percentage.
void | setPreserveOrder(boolean value) | Sets whether to preserve the order.
void | setRegExp(adams.core.base.BaseRegExp value) | Sets the regular expression for identifying the group (e.g. '^(.*)-([0-9]+)-(.*)$').
String | toString() | Returns a short description of the generator.
-
Methods inherited from class weka.classifiers.AbstractSplitGenerator
getData, getSeed, getUseViews, hasNext, initialize, initializeIterator, next, randomize, remove, reset, seedTipText, setData, setSeed, setUseViews, useViewsTipText
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.data.weka.InstancesViewSupporter
getUseViews, setUseViews
-
Methods inherited from interface java.util.Iterator
forEachRemaining
-
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager, toCommandLine
-
Methods inherited from interface weka.classifiers.SplitGenerator
getData, hasNext, initializeIterator, next, remove, setData
-
Field Detail
-
m_Percentage
protected double m_Percentage
the percentage of the data to use for the training set (0-1).
-
m_PreserveOrder
protected boolean m_PreserveOrder
whether to preserve the order.
-
m_Generated
protected boolean m_Generated
whether the split was generated.
-
m_Index
protected WekaAttributeIndex m_Index
the index to use for grouping.
-
m_RegExp
protected adams.core.base.BaseRegExp m_RegExp
the regular expression for the nominal/string attribute.
-
m_Group
protected String m_Group
the group expression, i.e. the replacement string to use as group (e.g. '$2').
-
m_Generator
protected adams.data.splitgenerator.generic.randomsplit.RandomSplitGenerator m_Generator
the underlying scheme for generating the split.
-
-
Constructor Detail
-
GroupedRandomSplitGenerator
public GroupedRandomSplitGenerator()
Initializes the generator.
-
GroupedRandomSplitGenerator
public GroupedRandomSplitGenerator(weka.core.Instances data, long seed, double percentage, boolean preserveOrder, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group)
Initializes the generator.
- Parameters:
- data - the dataset to split
- seed - the seed value to use for randomization
- percentage - the percentage of the training set (0-1)
- preserveOrder - whether to preserve the order
- index - the attribute index
- regExp - the regular expression to apply to the attribute values
- group - the regexp group to use as group
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by:
- globalInfo in interface adams.core.GlobalInfoSupporter
- Specified by:
- globalInfo in class adams.core.option.AbstractOptionHandler
- Returns:
- a description suitable for displaying in the GUI
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by:
- defineOptions in interface adams.core.option.OptionHandler
- Overrides:
- defineOptions in class AbstractSplitGenerator
-
setPercentage
public void setPercentage(double value)
Sets the split percentage.
- Specified by:
- setPercentage in interface adams.data.splitgenerator.RandomSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by:
- setPercentage in interface RandomSplitGenerator
- Parameters:
- value - the percentage (0-1)
-
getPercentage
public double getPercentage()
Returns the split percentage.
- Specified by:
- getPercentage in interface adams.data.splitgenerator.RandomSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by:
- getPercentage in interface RandomSplitGenerator
- Returns:
- the percentage (0-1)
-
percentageTipText
public String percentageTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setPreserveOrder
public void setPreserveOrder(boolean value)
Sets whether to preserve the order.
- Specified by:
- setPreserveOrder in interface adams.data.splitgenerator.RandomSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by:
- setPreserveOrder in interface RandomSplitGenerator
- Parameters:
- value - true if the order is to be preserved
-
getPreserveOrder
public boolean getPreserveOrder()
Returns whether to preserve the order.
- Specified by:
- getPreserveOrder in interface adams.data.splitgenerator.RandomSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by:
- getPreserveOrder in interface RandomSplitGenerator
- Returns:
- true if the order is to be preserved
-
preserveOrderTipText
public String preserveOrderTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setIndex
public void setIndex(WekaAttributeIndex value)
Sets the attribute index to use for grouping.
- Parameters:
- value - the index
-
getIndex
public WekaAttributeIndex getIndex()
Returns the attribute index to use for grouping.
- Returns:
- the index
-
indexTipText
public String indexTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setRegExp
public void setRegExp(adams.core.base.BaseRegExp value)
Sets the regular expression for identifying the group (e.g. '^(.*)-([0-9]+)-(.*)$').
- Parameters:
- value - the expression
-
getRegExp
public adams.core.base.BaseRegExp getRegExp()
Returns the regular expression for identifying the group (e.g. '^(.*)-([0-9]+)-(.*)$').
- Returns:
- the expression
-
regExpTipText
public String regExpTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
setGroup
public void setGroup(String value)
Sets the replacement string to use as group (e.g. '$2').
- Parameters:
- value - the group
-
getGroup
public String getGroup()
Returns the replacement string to use as group (e.g. '$2').
- Returns:
- the group
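How the example expression '^(.*)-([0-9]+)-(.*)$' and the replacement string '$2' combine into a group key can be tried with the standard java.util.regex semantics of String.replaceAll. The sketch below only illustrates that mechanism. The class GroupKeyDemo, its methods and the sample values are hypothetical, and whether the class derives its keys in exactly this way is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of weka.classifiers or ADAMS.
class GroupKeyDemo {

  /** Derives a group key by applying the regexp and the replacement string. */
  static String groupKey(String value, String regExp, String group) {
    // "$1", "$2", ... in the replacement refer to the capturing groups;
    // a value that does not match the expression is returned unchanged
    return value.replaceAll(regExp, group);
  }

  /** Buckets the values by group key, keeping the first-seen order of the keys. */
  static Map<String, List<String>> bucket(List<String> values, String regExp, String group) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (String value : values)
      result.computeIfAbsent(groupKey(value, regExp, group), k -> new ArrayList<>()).add(value);
    return result;
  }

  public static void main(String[] args) {
    String regExp = "^(.*)-([0-9]+)-(.*)$";
    List<String> ids = Arrays.asList("soil-12-a", "soil-12-b", "leaf-7-a");
    // prints {12=[soil-12-a, soil-12-b], 7=[leaf-7-a]}
    System.out.println(bucket(ids, regExp, "$2"));
  }
}
```

With '$2' the numeric middle part becomes the key, so "soil-12-a" and "soil-12-b" land in the same group. Choosing '$1' instead would group by the leading prefix.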
-
groupTipText
public String groupTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the GUI or for listing the options.
-
canRandomize
protected boolean canRandomize()
Returns whether randomization is enabled.
- Specified by:
- canRandomize in class AbstractSplitGenerator
- Returns:
- true if the data is to be randomized
-
doInitializeIterator
protected void doInitializeIterator()
Initializes the iterator, randomizes the data if required.
- Specified by:
- doInitializeIterator in class AbstractSplitGenerator
- See Also:
- AbstractSplitGenerator.canRandomize()
-
checkNext
protected boolean checkNext()
Returns true if the iteration has more elements. (In other words, returns true if next would return an element rather than throwing an exception.)
- Specified by:
- checkNext in class AbstractSplitGenerator
- Returns:
- true if the iterator has more elements.
-
createNext
protected WekaTrainTestSetContainer createNext()
Creates the next result.
- Specified by:
- createNext in class AbstractSplitGenerator
- Returns:
- the next result
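The interplay of checkNext(), createNext() and the m_Generated flag suggests a one-shot iterator: a single train/test pair is handed out, after which the iteration is exhausted. The sketch below shows that pattern in plain Java. It assumes that the generator yields exactly one split, which this page does not state explicitly, and the class OneShotIterator is hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical illustration, not part of weka.classifiers or ADAMS.
class OneShotIterator<T> implements Iterator<T> {

  /** Creates the single result on demand (stands in for createNext()). */
  private final Supplier<T> creator;

  /** Whether the result was already handed out (stands in for m_Generated). */
  private boolean generated;

  OneShotIterator(Supplier<T> creator) {
    this.creator = creator;
  }

  @Override
  public boolean hasNext() {
    // assumption: there is more to return only while nothing was generated yet
    return !generated;
  }

  @Override
  public T next() {
    if (generated)
      throw new NoSuchElementException("The split was already generated");
    generated = true;
    return creator.get();
  }

  public static void main(String[] args) {
    Iterator<String> iter = new OneShotIterator<>(() -> "train/test pair");
    while (iter.hasNext())
      System.out.println(iter.next());
  }
}
```

Callers can then treat the generator like any other Iterator, with the while loop in main running its body exactly once.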
-
toString
public String toString()
Returns a short description of the generator.
- Specified by:
- toString in interface adams.data.splitgenerator.SplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by:
- toString in interface SplitGenerator
- Overrides:
- toString in class AbstractSplitGenerator
- Returns:
- a short description
-
-