Package weka.classifiers
Class MultiLevelSplitGenerator
- java.lang.Object
- adams.core.logging.LoggingObject
- adams.core.logging.CustomLoggingLevelObject
- adams.core.option.AbstractOptionHandler
- weka.classifiers.AbstractSplitGenerator
- weka.classifiers.MultiLevelSplitGenerator
- All Implemented Interfaces:
adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.Randomizable, adams.core.SizeOfHandler, adams.core.Stoppable, adams.core.StoppableWithFeedback, adams.data.splitgenerator.SplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>, InstancesViewSupporter, Serializable, Iterator<WekaTrainTestSetContainer>, SplitGenerator
public class MultiLevelSplitGenerator extends AbstractSplitGenerator implements SplitGenerator, adams.core.StoppableWithFeedback
Generates splits based on groups extracted via regular expressions. Each combination of attribute index, regular expression and group represents a level. At each level, the data is split into groups according to that level's regexp/group; these groups make up the train and test sets.
- Author: fracpete (fracpete at waikato dot ac dot nz)
- See Also: Serialized Form
-
Field Summary
Fields
- protected List<WekaTrainTestSetContainer> m_Containers - the list of generated containers.
- protected adams.core.base.BaseString[] m_Groups - the groups to generate.
- protected WekaAttributeIndex[] m_Indices - the attribute indices.
- protected adams.core.base.BaseRegExp[] m_RegExps - the regular expressions to apply to determine the grouping.
- protected boolean m_Silent - whether to suppress error output.
- protected boolean m_Stopped - whether the generation got stopped.
-
Fields inherited from class weka.classifiers.AbstractSplitGenerator
m_Data, m_Initialized, m_OriginalIndices, m_Seed, m_UseViews
-
-
Constructor Summary
Constructors
- MultiLevelSplitGenerator()
-
Method Summary
Methods
- protected boolean canRandomize() - Returns whether randomization is enabled.
- protected boolean checkNext() - Returns true if the iteration has more elements.
- protected WekaTrainTestSetContainer createNext() - Creates the next result.
- void defineOptions() - Adds options to the internal list of options.
- protected void doInitializeIterator() - Initializes the iterator.
- protected void generateContainers() - Generates the containers.
- protected List<weka.core.Instances> generateGroups(weka.core.Instances data, int index, String regexp, String group) - Generates the groups from the data by applying the regexp/group.
- protected List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> generateSplits(weka.core.Instances data, int index, String regexp, String group) - Generates the train/test splits.
- adams.core.base.BaseString[] getGroups() - Returns the groups to generate.
- WekaAttributeIndex[] getIndices() - Returns the attribute indices.
- adams.core.base.BaseRegExp[] getRegExps() - Returns the regular expressions to use for extracting the groups.
- boolean getSilent() - Returns whether to suppress error messages.
- String globalInfo() - Returns a string describing the object.
- String groupsTipText() - Returns the tip text for this property.
- String indicesTipText() - Returns the tip text for this property.
- protected void initialize() - Initializes the members.
- boolean isStopped() - Whether the execution has been stopped.
- protected List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> match(List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> trainSplits, List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> testSplits, int index) - Combines train and test splits as long as there are matches.
- String regExpsTipText() - Returns the tip text for this property.
- protected void reset() - Resets the scheme.
- void setGroups(adams.core.base.BaseString[] value) - Sets the groups to generate.
- void setIndices(WekaAttributeIndex[] value) - Sets the attribute indices.
- void setRegExps(adams.core.base.BaseRegExp[] value) - Sets the regular expressions to use for extracting the groups.
- void setSilent(boolean value) - Sets whether to suppress error messages.
- String silentTipText() - Returns the tip text for this property.
- void stopExecution() - Stops the execution.
- protected weka.core.Instances subset(List<weka.core.Instances> groups, int index, boolean invert) - Generates the subset: either the specified index or the rest.
- String toString() - Returns a short description of the generator.
-
Methods inherited from class weka.classifiers.AbstractSplitGenerator
getData, getSeed, getUseViews, hasNext, initializeIterator, next, randomize, remove, seedTipText, setData, setSeed, setUseViews, useViewsTipText
-
Methods inherited from class adams.core.option.AbstractOptionHandler
cleanUpOptions, destroy, finishInit, getDefaultLoggingLevel, getOptionManager, loggingLevelTipText, newOptionManager, setLoggingLevel, toCommandLine
-
Methods inherited from class adams.core.logging.LoggingObject
configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface adams.data.weka.InstancesViewSupporter
getUseViews, setUseViews
-
Methods inherited from interface java.util.Iterator
forEachRemaining
-
Methods inherited from interface adams.core.option.OptionHandler
cleanUpOptions, getOptionManager, toCommandLine
-
Methods inherited from interface weka.classifiers.SplitGenerator
getData, hasNext, initializeIterator, next, remove, setData
-
Field Detail
-
m_Indices
protected WekaAttributeIndex[] m_Indices
the attribute indices.
-
m_RegExps
protected adams.core.base.BaseRegExp[] m_RegExps
the regular expressions to apply to determine the grouping.
-
m_Groups
protected adams.core.base.BaseString[] m_Groups
the groups to generate.
-
m_Silent
protected boolean m_Silent
whether to suppress error output.
-
m_Containers
protected List<WekaTrainTestSetContainer> m_Containers
the list of generated containers.
-
m_Stopped
protected boolean m_Stopped
whether the generation got stopped.
-
-
Method Detail
-
globalInfo
public String globalInfo()
Returns a string describing the object.
- Specified by: globalInfo in interface adams.core.GlobalInfoSupporter
- Specified by: globalInfo in class adams.core.option.AbstractOptionHandler
- Returns: a description suitable for displaying in the GUI
-
defineOptions
public void defineOptions()
Adds options to the internal list of options.
- Specified by: defineOptions in interface adams.core.option.OptionHandler
- Overrides: defineOptions in class AbstractSplitGenerator
-
initialize
protected void initialize()
Initializes the members.
- Overrides: initialize in class AbstractSplitGenerator
-
reset
protected void reset()
Resets the scheme.
- Overrides: reset in class AbstractSplitGenerator
-
setIndices
public void setIndices(WekaAttributeIndex[] value)
Sets the attribute indices.
- Parameters: value - the indices
-
getIndices
public WekaAttributeIndex[] getIndices()
Returns the attribute indices.
- Returns: the indices
-
indicesTipText
public String indicesTipText()
Returns the tip text for this property.
- Returns: the tip text for this property, suitable for displaying in the GUI or for listing the options
-
setRegExps
public void setRegExps(adams.core.base.BaseRegExp[] value)
Sets the regular expressions to use for extracting the groups.
- Parameters: value - the expressions
-
getRegExps
public adams.core.base.BaseRegExp[] getRegExps()
Returns the regular expressions to use for extracting the groups.
- Returns: the expressions
-
regExpsTipText
public String regExpsTipText()
Returns the tip text for this property.
- Returns: the tip text for this property, suitable for displaying in the GUI or for listing the options
-
setGroups
public void setGroups(adams.core.base.BaseString[] value)
Sets the groups to generate.
- Parameters: value - the groups
-
getGroups
public adams.core.base.BaseString[] getGroups()
Returns the groups to generate.
- Returns: the groups
-
groupsTipText
public String groupsTipText()
Returns the tip text for this property.
- Returns: the tip text for this property, suitable for displaying in the GUI or for listing the options
-
setSilent
public void setSilent(boolean value)
Sets whether to suppress error messages.
- Parameters: value - true if error messages are to be suppressed
-
getSilent
public boolean getSilent()
Returns whether to suppress error messages.
- Returns: true if error messages are suppressed
-
silentTipText
public String silentTipText()
Returns the tip text for this property.
- Returns: the tip text for this property, suitable for displaying in the GUI or for listing the options
-
canRandomize
protected boolean canRandomize()
Returns whether randomization is enabled.
- Specified by: canRandomize in class AbstractSplitGenerator
- Returns: true if randomization is enabled
-
generateGroups
protected List<weka.core.Instances> generateGroups(weka.core.Instances data, int index, String regexp, String group)
Generates the groups from the data by applying the regexp/group.
- Parameters:
data - the data to split into groups
index - the attribute index
regexp - the regexp to apply to the values of the attribute
group - the group ID to generate
- Returns: the generated groups
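The grouping step can be illustrated without Weka. The following is a minimal sketch, assuming that the group ID of a value is derived with `String.replaceAll(regexp, group)`, where `group` is a replacement expression such as `$1`; the documentation above only names the parameters, so this derivation is an assumption. `RegexGroupingSketch`, `groupId` and `groupBy` are hypothetical names, and plain strings stand in for the attribute values of the `weka.core.Instances` rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ADAMS or Weka.
final class RegexGroupingSketch {

  /** Assumption: the group ID is the value with the regexp match replaced by the group expression. */
  static String groupId(String value, String regexp, String group) {
    return value.replaceAll(regexp, group);
  }

  /** Buckets the values by their group ID, keeping the order in which IDs are first seen. */
  static Map<String, List<String>> groupBy(List<String> values, String regexp, String group) {
    Map<String, List<String>> groups = new LinkedHashMap<>();
    for (String value : values)
      groups.computeIfAbsent(groupId(value, regexp, group), k -> new ArrayList<>()).add(value);
    return groups;
  }

  public static void main(String[] args) {
    List<String> ids = List.of("batchA-1", "batchA-2", "batchB-1", "batchC-1");
    // "$1" refers to the first capturing group, i.e. the part before the dash
    System.out.println(groupBy(ids, "(.*)-[0-9]+", "$1"));
  }
}
```

In this sketch, a value that the regexp does not match is left unchanged by `replaceAll` and therefore forms a group of its own; whether the class treats such values the same way is not stated here.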
-
subset
protected weka.core.Instances subset(List<weka.core.Instances> groups, int index, boolean invert)
Generates the subset: either the specified index or the rest.
- Parameters:
groups - the groups to use
index - the current index
invert - whether to invert
- Returns: the generated instances
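The "index or the rest" selection can be sketched on plain lists. This is an illustration only: `SubsetSketch` is a hypothetical class, the element lists stand in for `weka.core.Instances`, and it is an assumption that the group at `index` serves as the test set while the inverted selection serves as the train set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ADAMS or Weka.
final class SubsetSketch {

  /**
   * Returns the group at the given index (invert == false) or
   * the concatenation of all other groups (invert == true).
   */
  static <T> List<T> subset(List<List<T>> groups, int index, boolean invert) {
    List<T> result = new ArrayList<>();
    for (int i = 0; i < groups.size(); i++) {
      // keep group i if it is the selected one, or if it is not and we invert
      if ((i == index) != invert)
        result.addAll(groups.get(i));
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> groups = List.of(
      List.of("a1", "a2"), List.of("b1"), List.of("c1", "c2"));
    for (int i = 0; i < groups.size(); i++) {
      List<String> test = subset(groups, i, false);   // assumed: selected group
      List<String> train = subset(groups, i, true);   // assumed: everything else
      System.out.println("train=" + train + " test=" + test);
    }
  }
}
```

Looping over every index, as in `main`, yields one train/test pair per group, which is one plausible way to obtain a list of splits from a list of groups.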
-
generateSplits
protected List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> generateSplits(weka.core.Instances data, int index, String regexp, String group)
Generates the train/test splits.
- Parameters:
data - the data to generate the splits for
index - the attribute index
regexp - the regexp to apply to the values of the attribute
group - the group ID to generate
- Returns: the generated splits
-
match
protected List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> match(List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> trainSplits, List<com.github.fracpete.javautils.struct.Struct2<weka.core.Instances,weka.core.Instances>> testSplits, int index)
Combines train and test splits as long as there are matches. Others get dropped.
- Parameters:
trainSplits - the training splits
testSplits - the test splits
- Returns: the combined splits
-
generateContainers
protected void generateContainers()
Generates the containers.
-
doInitializeIterator
protected void doInitializeIterator()
Initializes the iterator.
- Specified by: doInitializeIterator in class AbstractSplitGenerator
- See Also: AbstractSplitGenerator.canRandomize()
-
checkNext
protected boolean checkNext()
Returns true if the iteration has more elements. (In other words, returns true if next would return an element rather than throwing an exception.)
- Specified by: checkNext in class AbstractSplitGenerator
- Returns: true if the iterator has more elements
-
createNext
protected WekaTrainTestSetContainer createNext()
Creates the next result.
- Specified by: createNext in class AbstractSplitGenerator
- Returns: the next result
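The pairing of `checkNext()` and `createNext()` with the inherited `hasNext()` and `next()` suggests a template-method iterator. The sketch below shows that pattern in isolation. It assumes that `hasNext()` delegates to `checkNext()` and `next()` to `createNext()`, and that the results come from a pre-generated list; neither detail is confirmed by this page. `TemplateIteratorSketch` and `ListBackedSketch` are hypothetical classes.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical base class: the public Iterator methods delegate to two protected hooks.
abstract class TemplateIteratorSketch<T> implements Iterator<T> {

  protected abstract boolean checkNext();

  protected abstract T createNext();

  @Override
  public boolean hasNext() {
    return checkNext();
  }

  @Override
  public T next() {
    if (!checkNext())
      throw new NoSuchElementException("No more splits available");
    return createNext();
  }
}

// Hypothetical subclass that serves results from a list prepared up front.
final class ListBackedSketch<T> extends TemplateIteratorSketch<T> {

  private final List<T> containers;
  private int pos;

  ListBackedSketch(List<T> containers) {
    this.containers = containers;
  }

  @Override
  protected boolean checkNext() {
    return pos < containers.size();
  }

  @Override
  protected T createNext() {
    return containers.get(pos++);
  }

  public static void main(String[] args) {
    Iterator<String> it = new ListBackedSketch<>(List.of("split-1", "split-2"));
    while (it.hasNext())
      System.out.println(it.next());
  }
}
```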
-
stopExecution
public void stopExecution()
Stops the execution.
- Specified by: stopExecution in interface adams.core.Stoppable
-
isStopped
public boolean isStopped()
Whether the execution has been stopped.
- Specified by: isStopped in interface adams.core.StoppableWithFeedback
- Returns: true if stopped
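A stop request of this kind is commonly implemented as a cooperative flag that the generation loop polls. The following sketch shows that idea; it is not taken from the class's source. `StoppableSketch` and its `generate` method are hypothetical, and resetting the flag at the start of a run is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical class illustrating a cooperative stop flag.
final class StoppableSketch {

  // volatile so that a stop request from another thread becomes visible to the loop
  private volatile boolean stopped;

  public void stopExecution() {
    stopped = true;
  }

  public boolean isStopped() {
    return stopped;
  }

  /** Produces up to n items, checking the flag before each one and notifying the callback after each one. */
  List<Integer> generate(int n, IntConsumer onItem) {
    stopped = false;  // assumption: each run starts with a cleared flag
    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      if (stopped)
        break;
      result.add(i);
      onItem.accept(i);
    }
    return result;
  }

  public static void main(String[] args) {
    StoppableSketch gen = new StoppableSketch();
    // request a stop once item 2 has been produced
    List<Integer> items = gen.generate(10, i -> { if (i == 2) gen.stopExecution(); });
    System.out.println(items + " stopped=" + gen.isStopped());
  }
}
```

Because the flag is only checked between items, a stop request takes effect at the next loop iteration rather than immediately.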
-
toString
public String toString()
Returns a short description of the generator.
- Specified by: toString in interface adams.data.splitgenerator.SplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
- Specified by: toString in interface SplitGenerator
- Overrides: toString in class AbstractSplitGenerator
- Returns: a short description
-
-