Class GroupedCrossValidationFoldGenerator

    • Field Detail

      • m_NumFolds

        protected int m_NumFolds
        the number of folds.
      • m_ActualNumFolds

        protected int m_ActualNumFolds
        the actual number of folds.
      • m_Stratify

        protected boolean m_Stratify
        whether to stratify the data (in case of nominal class).
      • m_CurrentFold

        protected int m_CurrentFold
        the current fold.
      • m_RelationName

        protected String m_RelationName
        the template for the relation name.
      • m_Randomize

        protected boolean m_Randomize
        whether to randomize the data.
      • m_RegExp

        protected adams.core.base.BaseRegExp m_RegExp
        the regular expression for the nominal/string attribute.
      • m_Group

        protected String m_Group
        the group expression.
      • m_Generator

        protected transient adams.data.splitgenerator.generic.crossvalidation.CrossValidationGenerator m_Generator
        the underlying scheme for generating the folds.
      • m_BinnableGroups

        protected transient List<adams.data.binning.Binnable<adams.data.binning.BinnableGroup<weka.core.Instance>>> m_BinnableGroups
        the collapsed data (instances binned by group).
      • m_FoldPairs

        protected transient List<adams.data.splitgenerator.generic.crossvalidation.FoldPair<adams.data.binning.Binnable<adams.data.binning.BinnableGroup<weka.core.Instance>>>> m_FoldPairs
        the temporary fold pairs.
    • Constructor Detail

      • GroupedCrossValidationFoldGenerator

        public GroupedCrossValidationFoldGenerator()
        Initializes the generator.
      • GroupedCrossValidationFoldGenerator

        public GroupedCrossValidationFoldGenerator(weka.core.Instances data,
                                                   int numFolds,
                                                   long seed,
                                                   boolean stratify,
                                                   boolean randomize,
                                                   WekaAttributeIndex index,
                                                   adams.core.base.BaseRegExp regExp,
                                                   String group)
        Initializes the generator.
        Parameters:
        data - the full dataset
        numFolds - the number of folds, leave-one-out if less than 2
        seed - the seed for randomization
        stratify - whether to perform stratified CV
        randomize - whether to randomize the data
        index - the attribute index
        regExp - the regular expression to apply to the attribute values
        group - the regexp group to use as group
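        The interplay of numFolds, seed, randomize and the group key can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the ADAMS implementation: the class GroupedFoldsSketch and its method testFolds are hypothetical, group keys are passed in as precomputed strings, groups are assigned to folds round-robin, and the fold count is capped at the number of groups. Treating fewer than 2 folds as "one fold per group" is likewise an assumption about how leave-one-out behaves on grouped data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only: a plain-Java stand-in, not the ADAMS implementation.
class GroupedFoldsSketch {

  /** Returns, per fold, the row indices of the test set; rows of one group stay together. */
  static List<List<Integer>> testFolds(List<String> groupKeys, int numFolds,
                                       long seed, boolean randomize) {
    // collapse the rows into groups, preserving first-seen order
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    for (int i = 0; i < groupKeys.size(); i++)
      groups.computeIfAbsent(groupKeys.get(i), k -> new ArrayList<>()).add(i);

    // shuffle whole groups (not single rows) with a fixed seed
    List<List<Integer>> collapsed = new ArrayList<>(groups.values());
    if (randomize)
      Collections.shuffle(collapsed, new Random(seed));

    // assumption: fewer than 2 folds means one fold per group;
    // the fold count never exceeds the number of groups
    int actual = (numFolds < 2) ? collapsed.size() : Math.min(numFolds, collapsed.size());

    List<List<Integer>> folds = new ArrayList<>();
    for (int f = 0; f < actual; f++)
      folds.add(new ArrayList<>());
    // round-robin assignment of groups to folds
    for (int g = 0; g < collapsed.size(); g++)
      folds.get(g % actual).addAll(collapsed.get(g));
    return folds;
  }
}
```

        The training set of a fold is then every row that is not in that fold's test list, so no group is ever split between training and test data.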
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing the object.
        Specified by:
        globalInfo in interface adams.core.GlobalInfoSupporter
        Specified by:
        globalInfo in class adams.core.option.AbstractOptionHandler
        Returns:
        a description suitable for displaying in the GUI
      • defineOptions

        public void defineOptions()
        Adds options to the internal list of options.
        Specified by:
        defineOptions in interface adams.core.option.OptionHandler
        Overrides:
        defineOptions in class AbstractSplitGenerator
      • setIndex

        public void setIndex(WekaAttributeIndex value)
        Sets the attribute index to use for grouping.
        Parameters:
        value - the index
      • getIndex

        public WekaAttributeIndex getIndex()
        Returns the attribute index to use for grouping.
        Returns:
        the index
      • indexTipText

        public String indexTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setRegExp

        public void setRegExp(adams.core.base.BaseRegExp value)
        Sets the regular expression for identifying the group (e.g., '^(.*)-([0-9]+)-(.*)$').
        Parameters:
        value - the expression
      • getRegExp

        public adams.core.base.BaseRegExp getRegExp()
        Returns the regular expression for identifying the group (e.g., '^(.*)-([0-9]+)-(.*)$').
        Returns:
        the expression
      • regExpTipText

        public String regExpTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setGroup

        public void setGroup(String value)
        Sets the replacement string to use as group (e.g., '$2').
        Parameters:
        value - the group
      • getGroup

        public String getGroup()
        Returns the replacement string to use as group (e.g., '$2').
        Returns:
        the group
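        How the regular expression and the replacement string combine into a group key can be shown with java.util.regex alone. The sketch below assumes the key is obtained by a regex replace over the attribute value, which the "replacement string" wording suggests but this page does not confirm; the class GroupKeySketch, the method groupKey and the sample values are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: mimics the regExp/group pair with java.util.regex.
class GroupKeySketch {

  /** Applies the expression to an attribute value and returns the replacement as the group key. */
  static String groupKey(String value, String regExp, String group) {
    return Pattern.compile(regExp).matcher(value).replaceAll(group);
  }

  public static void main(String[] args) {
    String regExp = "^(.*)-([0-9]+)-(.*)$";
    // the middle numeric part becomes the group key
    System.out.println(groupKey("batchA-12-001", regExp, "$2"));
    System.out.println(groupKey("batchB-12-002", regExp, "$2"));
    System.out.println(groupKey("batchA-7-003", regExp, "$2"));
  }
}
```

        With these sample values the first two rows share the key 12 and would land in the same group, while the third forms its own group. In this sketch a value that does not match the expression is returned unchanged, so it becomes its own key.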
      • groupTipText

        public String groupTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setNumFolds

        public void setNumFolds(int value)
        Sets the number of folds to use.
        Specified by:
        setNumFolds in interface adams.data.splitgenerator.CrossValidationFoldGenerator<weka.core.Instances,WekaTrainTestSetContainer>
        Specified by:
        setNumFolds in interface CrossValidationFoldGenerator
        Parameters:
        value - the number of folds, less than 2 for LOO
      • numFoldsTipText

        public String numFoldsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setRandomize

        public void setRandomize(boolean value)
        Sets whether to randomize the data.
        Specified by:
        setRandomize in interface adams.data.splitgenerator.CrossValidationFoldGenerator<weka.core.Instances,WekaTrainTestSetContainer>
        Specified by:
        setRandomize in interface CrossValidationFoldGenerator
        Parameters:
        value - true to randomize the data
      • getRandomize

        public boolean getRandomize()
        Returns whether to randomize the data.
        Specified by:
        getRandomize in interface adams.data.splitgenerator.CrossValidationFoldGenerator<weka.core.Instances,WekaTrainTestSetContainer>
        Specified by:
        getRandomize in interface CrossValidationFoldGenerator
        Returns:
        true if the data is randomized
      • randomizeTipText

        public String randomizeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setStratify

        public void setStratify(boolean value)
        Sets whether to stratify the data (nominal class).
        Specified by:
        setStratify in interface CrossValidationFoldGenerator
        Specified by:
        setStratify in interface adams.data.splitgenerator.StratifiableSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
        Parameters:
        value - whether to stratify the data (nominal class)
      • getStratify

        public boolean getStratify()
        Returns whether to stratify the data (in case of nominal class).
        Specified by:
        getStratify in interface CrossValidationFoldGenerator
        Specified by:
        getStratify in interface adams.data.splitgenerator.StratifiableSplitGenerator<weka.core.Instances,WekaTrainTestSetContainer>
        Returns:
        true if the data is stratified
      • stratifyTipText

        public String stratifyTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • relationNameTipText

        public String relationNameTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • canRandomize

        protected boolean canRandomize()
        Returns whether randomization is enabled.
        Specified by:
        canRandomize in class AbstractSplitGenerator
        Returns:
        true if randomization is enabled
      • checkNext

        protected boolean checkNext()
        Returns true if the iteration has more elements. (In other words, returns true if next would return an element rather than throwing an exception.)
        Specified by:
        checkNext in class AbstractSplitGenerator
        Returns:
        true if the iterator has more elements.
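        The contract described for checkNext matches the usual Iterator pattern. The following is a minimal sketch, assuming the check compares a current-fold counter against the actual number of folds; the class FoldIteratorSketch is hypothetical, and the zero-based counter is an assumption that may not match how m_CurrentFold is counted in the real class.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Illustrative sketch only: a hypothetical iterator, with names loosely
// borrowed from the m_CurrentFold and m_ActualNumFolds fields.
class FoldIteratorSketch implements Iterator<Integer> {

  private final int actualNumFolds;
  private int currentFold = 0; // zero-based here; the real class may count differently

  FoldIteratorSketch(int actualNumFolds) {
    this.actualNumFolds = actualNumFolds;
  }

  /** True if next() would return a fold rather than throw. */
  @Override
  public boolean hasNext() {
    return currentFold < actualNumFolds;
  }

  /** Returns the index of the next fold and advances the counter. */
  @Override
  public Integer next() {
    if (!hasNext())
      throw new NoSuchElementException("No more folds");
    return currentFold++;
  }
}
```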