Class WekaCrossValidationExecution

  • All Implemented Interfaces:
    adams.core.CleanUpHandler, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.SizeOfHandler, adams.core.Stoppable, adams.core.ThreadLimiter, InstancesViewSupporter, adams.flow.core.FlowContextHandler, Serializable

    public class WekaCrossValidationExecution
    extends adams.core.logging.CustomLoggingLevelObject
    implements adams.core.Stoppable, InstancesViewSupporter, adams.core.ThreadLimiter, adams.flow.core.FlowContextHandler, adams.core.CleanUpHandler
    Performs cross-validation, either single-threaded or multi-threaded.
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int m_ActualFolds
      the actual folds used.
      protected adams.multiprocess.JobRunner m_ActualJobRunner
      the runner in use.
      protected int m_ActualNumThreads
      the actual number of threads to use.
      protected weka.classifiers.Classifier m_Classifier
      the classifier to evaluate.
      protected weka.classifiers.Classifier[] m_Classifiers
      the separate classifiers.
      protected weka.classifiers.Classifier m_CurrentClassifier
      the current classifier that is being trained.
      protected StoppableEvaluation m_CurrentEvaluation
      the evaluation currently being run.
      protected weka.core.Instances m_Data
      the data to evaluate on.
      protected boolean m_DiscardPredictions
      whether to discard predictions.
      protected weka.classifiers.Evaluation m_Evaluation
      the (aggregated) evaluation.
      protected weka.classifiers.Evaluation[] m_Evaluations
      the separate evaluations.
      protected adams.flow.core.Actor m_FlowContext
      the flow context.
      protected int m_Folds
      the number of folds.
      protected CrossValidationFoldGenerator m_Generator
      the cross-validation fold generator.
      protected adams.multiprocess.JobRunner m_JobRunner
      the jobrunner template.
      protected adams.flow.standalone.JobRunnerSetup m_JobRunnerSetup
      the jobrunner setup.
      protected int m_NumThreads
      the number of threads to use for parallel execution (only used if no JobRunnerSetup/JobRunner set).
      protected int[] m_OriginalIndices
      the original indices.
      protected weka.classifiers.evaluation.output.prediction.AbstractOutput m_Output
      for generating predictions output.
      protected StringBuffer m_OutputBuffer
      the buffer for the predictions.
      protected long m_Seed
      the seed value.
      protected boolean m_SeparateFolds
      whether to separate folds.
      protected adams.core.StatusMessageHandler m_StatusMessageHandler
      for outputting notifications.
      protected boolean m_Stopped
      whether the execution has been stopped.
      protected boolean m_UseViews
      whether to use views.
      protected boolean m_WaitForJobs
      whether to wait for jobs to finish when stopping.
      • Fields inherited from class adams.core.logging.LoggingObject

        m_Logger, m_LoggingIsEnabled, m_LoggingLevel
    • Method Summary

      Modifier and Type Method Description
      void cleanUp()
      Cleans up data structures, frees up memory.
      String execute()
      Executes the cross-validation.
      int getActualFolds()
      Returns the actual number of folds used.
      weka.classifiers.Classifier getClassifier()
      Returns the classifier in use.
      weka.classifiers.Classifier[] getClassifiers()
      Returns the classifiers per fold.
      weka.core.Instances getData()
      Returns the data in use.
      boolean getDiscardPredictions()
      Returns whether to discard the predictions in order to preserve memory.
      weka.classifiers.Evaluation getEvaluation()
      Returns the generated (aggregated) evaluation.
      weka.classifiers.Evaluation[] getEvaluations()
      Returns the generated evaluations (if multi-threaded or separated).
      adams.flow.core.Actor getFlowContext()
      Returns the flow context, if any.
      int getFolds()
      Returns the number of folds.
      CrossValidationFoldGenerator getGenerator()
      Returns the generator to use for generating the folds.
      adams.multiprocess.JobRunner getJobRunner()
      Returns the JobRunner, if any.
      adams.flow.standalone.JobRunnerSetup getJobRunnerSetup()
      Returns the JobRunnerSetup, if any.
      int getNumThreads()
      Returns the number of threads to use for cross-validation (only used if no JobRunnerSetup/JobRunner set).
      int[] getOriginalIndices()
      Returns the original indices.
      weka.classifiers.evaluation.output.prediction.AbstractOutput getOutput()
      Returns the prediction output generator in use.
      StringBuffer getOutputBuffer()
      Returns the output buffer.
      long getSeed()
      Returns the seed value.
      boolean getSeparateFolds()
      Returns whether to separate the folds, an Evaluation object per fold.
      adams.core.StatusMessageHandler getStatusMessageHandler()
      Returns the status message handler for outputting notifications.
      boolean getUseViews()
      Returns whether to use views instead of dataset copies, in order to conserve memory.
      boolean getWaitForJobs()
      Returns whether to wait for jobs to finish when terminating.
      protected void initOutputBuffer()
      Initializes the output buffer.
      boolean isSingleThreaded()
      Returns whether the execution was single-threaded (after execute()).
      boolean isStopped()
      Returns whether the execution has been stopped.
      void setClassifier(weka.classifiers.Classifier value)
      Sets the classifier to use.
      void setData(weka.core.Instances value)
      Sets the data to use.
      void setDiscardPredictions(boolean value)
      Sets whether to discard the predictions instead of collecting them for future use, in order to conserve memory.
      void setFlowContext(adams.flow.core.Actor value)
      Sets the flow context.
      void setFolds(int value)
      Sets the number of folds.
      void setGenerator(CrossValidationFoldGenerator value)
      Sets the generator to use for generating the folds.
      void setJobRunner(adams.multiprocess.JobRunner value)
      Sets the JobRunner.
      void setJobRunnerSetup(adams.flow.standalone.JobRunnerSetup value)
      Sets the JobRunnerSetup.
      void setNumThreads(int value)
      Sets the number of threads to use for cross-validation (only used if no JobRunnerSetup/JobRunner set).
      void setOutput(weka.classifiers.evaluation.output.prediction.AbstractOutput value)
      Sets the prediction output generator to use.
      void setSeed(long value)
      Sets the seed value.
      void setSeparateFolds(boolean value)
      Sets whether to separate the folds, an Evaluation object per fold.
      void setStatusMessageHandler(adams.core.StatusMessageHandler value)
      Sets the status message handler for outputting notifications.
      void setUseViews(boolean value)
      Sets whether to use views instead of dataset copies, in order to conserve memory.
      void setWaitForJobs(boolean value)
      Sets whether to wait for jobs to finish when terminating.
      void stopExecution()
      Stops the execution.
      • Methods inherited from class adams.core.logging.CustomLoggingLevelObject

        setLoggingLevel
      • Methods inherited from class adams.core.logging.LoggingObject

        configureLogger, getLogger, getLoggingLevel, initializeLogging, isLoggingEnabled, sizeOf
      • Methods inherited from interface adams.core.logging.LoggingLevelHandler

        getLoggingLevel
    • Field Detail

      • m_Classifier

        protected weka.classifiers.Classifier m_Classifier
        the classifier to evaluate.
      • m_Data

        protected weka.core.Instances m_Data
        the data to evaluate on.
      • m_Output

        protected weka.classifiers.evaluation.output.prediction.AbstractOutput m_Output
        for generating predictions output.
      • m_OutputBuffer

        protected StringBuffer m_OutputBuffer
        the buffer for the predictions.
      • m_Folds

        protected int m_Folds
        the number of folds.
      • m_ActualFolds

        protected int m_ActualFolds
        the actual folds used.
      • m_SeparateFolds

        protected boolean m_SeparateFolds
        whether to separate folds.
      • m_Seed

        protected long m_Seed
        the seed value.
      • m_UseViews

        protected boolean m_UseViews
        whether to use views.
      • m_DiscardPredictions

        protected boolean m_DiscardPredictions
        whether to discard predictions.
      • m_NumThreads

        protected int m_NumThreads
        the number of threads to use for parallel execution (only used if no JobRunnerSetup/JobRunner set).
      • m_ActualNumThreads

        protected int m_ActualNumThreads
        the actual number of threads to use.
      • m_JobRunnerSetup

        protected transient adams.flow.standalone.JobRunnerSetup m_JobRunnerSetup
        the jobrunner setup.
      • m_JobRunner

        protected transient adams.multiprocess.JobRunner m_JobRunner
        the jobrunner template.
      • m_ActualJobRunner

        protected transient adams.multiprocess.JobRunner m_ActualJobRunner
        the runner in use.
      • m_Evaluation

        protected weka.classifiers.Evaluation m_Evaluation
        the (aggregated) evaluation.
      • m_Evaluations

        protected weka.classifiers.Evaluation[] m_Evaluations
        the separate evaluations.
      • m_Classifiers

        protected weka.classifiers.Classifier[] m_Classifiers
        the separate classifiers.
      • m_OriginalIndices

        protected int[] m_OriginalIndices
        the original indices.
      • m_Stopped

        protected boolean m_Stopped
        whether the execution has been stopped.
      • m_StatusMessageHandler

        protected adams.core.StatusMessageHandler m_StatusMessageHandler
        for outputting notifications.
      • m_WaitForJobs

        protected boolean m_WaitForJobs
        whether to wait for jobs to finish when stopping.
      • m_FlowContext

        protected transient adams.flow.core.Actor m_FlowContext
        the flow context.
      • m_CurrentEvaluation

        protected transient StoppableEvaluation m_CurrentEvaluation
        the evaluation currently being run.
      • m_CurrentClassifier

        protected transient weka.classifiers.Classifier m_CurrentClassifier
        the current classifier that is being trained.
    • Constructor Detail

      • WekaCrossValidationExecution

        public WekaCrossValidationExecution()
        Initializes the execution.
    • Method Detail

      • setFlowContext

        public void setFlowContext(adams.flow.core.Actor value)
        Sets the flow context.
        Specified by:
        setFlowContext in interface adams.flow.core.FlowContextHandler
        Parameters:
        value - the actor
      • getFlowContext

        public adams.flow.core.Actor getFlowContext()
        Returns the flow context, if any.
        Specified by:
        getFlowContext in interface adams.flow.core.FlowContextHandler
        Returns:
        the actor, null if none available
      • setJobRunnerSetup

        public void setJobRunnerSetup(adams.flow.standalone.JobRunnerSetup value)
        Sets the JobRunnerSetup.
        Parameters:
        value - the setup
      • getJobRunnerSetup

        public adams.flow.standalone.JobRunnerSetup getJobRunnerSetup()
        Returns the JobRunnerSetup, if any.
        Returns:
        the JobRunnerSetup, null if none available
      • setJobRunner

        public void setJobRunner(adams.multiprocess.JobRunner value)
        Sets the JobRunner.
        Parameters:
        value - the template
      • getJobRunner

        public adams.multiprocess.JobRunner getJobRunner()
        Returns the JobRunner, if any.
        Returns:
        the JobRunner, null if none available
      • setWaitForJobs

        public void setWaitForJobs(boolean value)
        Sets whether to wait for jobs to finish when terminating.
        Parameters:
        value - true to wait for the jobs to finish
      • getWaitForJobs

        public boolean getWaitForJobs()
        Returns whether to wait for jobs to finish when terminating.
        Returns:
        true if waiting for the jobs to finish
      • setClassifier

        public void setClassifier(weka.classifiers.Classifier value)
        Sets the classifier to use.
        Parameters:
        value - the classifier
      • getClassifier

        public weka.classifiers.Classifier getClassifier()
        Returns the classifier in use.
        Returns:
        the classifier
      • getClassifiers

        public weka.classifiers.Classifier[] getClassifiers()
        Returns the classifiers per fold.
        Returns:
        the classifiers, null if not stored
      • setData

        public void setData(weka.core.Instances value)
        Sets the data to use.
        Parameters:
        value - the data
      • getData

        public weka.core.Instances getData()
        Returns the data in use.
        Returns:
        the data
      • setOutput

        public void setOutput(weka.classifiers.evaluation.output.prediction.AbstractOutput value)
        Sets the prediction output generator to use.
        Parameters:
        value - the output generator
      • getOutput

        public weka.classifiers.evaluation.output.prediction.AbstractOutput getOutput()
        Returns the prediction output generator in use.
        Returns:
        the output generator
      • setFolds

        public void setFolds(int value)
        Sets the number of folds.
        Parameters:
        value - the number of folds; a value <2 selects leave-one-out cross-validation (LOOCV)
      • getFolds

        public int getFolds()
        Returns the number of folds.
        Returns:
        the folds
      • getActualFolds

        public int getActualFolds()
        Returns the actual number of folds used.
        Returns:
        the actual folds, -1 if not yet determined
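        The fold conventions above can be illustrated with a small sketch. In LOOCV every instance forms its own test fold, so the actual number of folds equals the number of instances. The class FoldResolution, its method resolveFolds and the constant NOT_DETERMINED are hypothetical names used for illustration only. Capping the folds at the dataset size is an assumption; the internal logic of WekaCrossValidationExecution may differ.

```java
/** Hypothetical helper illustrating the documented fold conventions. */
final class FoldResolution {

  /** Mirrors the documented "-1 if not yet determined" value of getActualFolds(). */
  static final int NOT_DETERMINED = -1;

  private FoldResolution() {
  }

  /**
   * Resolves the requested number of folds against the dataset size.
   *
   * @param requested the requested folds, <2 for LOOCV
   * @param numInstances the number of instances in the dataset
   * @return the number of folds to use
   */
  static int resolveFolds(int requested, int numInstances) {
    if (numInstances < 2)
      throw new IllegalArgumentException("Need at least 2 instances, got: " + numInstances);
    // LOOCV: one fold per instance
    if (requested < 2)
      return numInstances;
    // assumption of this sketch: never use more folds than instances
    return Math.min(requested, numInstances);
  }
}
```

        For example, FoldResolution.resolveFolds(-1, 150) yields 150 folds, whereas FoldResolution.resolveFolds(10, 150) keeps the requested 10.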
      • setSeparateFolds

        public void setSeparateFolds(boolean value)
        Sets whether to separate the folds, i.e., one Evaluation object per fold.
        Parameters:
        value - true to separate the folds
      • getSeparateFolds

        public boolean getSeparateFolds()
        Returns whether to separate the folds, an Evaluation object per fold.
        Returns:
        true if the folds are separated
      • setSeed

        public void setSeed(long value)
        Sets the seed value.
        Parameters:
        value - the seed
      • getSeed

        public long getSeed()
        Returns the seed value.
        Returns:
        the seed
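        A seed makes the randomization of the data, and therefore the fold assignment, reproducible. The following sketch demonstrates the principle with java.util.Random. It does not reproduce the randomization that Weka or the fold generator performs; SeededShuffle and shuffledIndices are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: seeded shuffling of row indices. */
final class SeededShuffle {

  private SeededShuffle() {
  }

  /**
   * Returns the indices 0..numInstances-1 in a pseudo-random order
   * that depends only on the seed.
   *
   * @param numInstances the number of rows
   * @param seed the seed for the random number generator
   * @return the shuffled indices
   */
  static List<Integer> shuffledIndices(int numInstances, long seed) {
    List<Integer> indices = new ArrayList<>();
    for (int i = 0; i < numInstances; i++)
      indices.add(i);
    // same seed -> same sequence of random numbers -> same order
    Collections.shuffle(indices, new Random(seed));
    return indices;
  }
}
```

        Calling shuffledIndices twice with the same seed returns the same order, which is what makes repeated runs comparable.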
      • setUseViews

        public void setUseViews(boolean value)
        Sets whether to use views instead of dataset copies, in order to conserve memory.
        Specified by:
        setUseViews in interface InstancesViewSupporter
        Parameters:
        value - true to use views
      • getUseViews

        public boolean getUseViews()
        Returns whether to use views instead of dataset copies, in order to conserve memory.
        Specified by:
        getUseViews in interface InstancesViewSupporter
        Returns:
        true if using views
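        The memory trade-off between views and copies can be sketched with plain Java collections. A view only stores row indices into the shared dataset, while a copy duplicates the rows. IndexView is a hypothetical class for illustration; how views over weka.core.Instances are implemented is not described on this page.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read-only view over selected rows of a backing list. */
final class IndexView<T> extends AbstractList<T> {

  /** The shared dataset, not duplicated. */
  private final List<T> backing;

  /** The rows of the backing list that belong to this view. */
  private final int[] indices;

  IndexView(List<T> backing, int[] indices) {
    this.backing = backing;
    this.indices = indices.clone();
  }

  @Override
  public T get(int index) {
    // delegate to the shared data instead of holding an own copy
    return backing.get(indices[index]);
  }

  @Override
  public int size() {
    return indices.length;
  }

  /** Materializes the view as an independent list, i.e., the "copy" alternative. */
  List<T> toCopy() {
    return new ArrayList<>(this);
  }
}
```

        With many folds over a large dataset, each training and test set as a view costs one int per row rather than a duplicated row.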
      • setGenerator

        public void setGenerator(CrossValidationFoldGenerator value)
        Sets the generator to use for generating the folds.
        Parameters:
        value - the generator
      • getGenerator

        public CrossValidationFoldGenerator getGenerator()
        Returns the generator to use for generating the folds.
        Returns:
        the generator
      • setDiscardPredictions

        public void setDiscardPredictions(boolean value)
        Sets whether to discard the predictions instead of collecting them for future use, in order to conserve memory. NB: Must be false for parallel execution, to allow for the aggregation of statistics.
        Parameters:
        value - true to discard predictions
      • getDiscardPredictions

        public boolean getDiscardPredictions()
        Returns whether to discard the predictions in order to preserve memory. NB: Must be false for parallel execution, to allow for the aggregation of statistics.
        Returns:
        true if predictions discarded
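        To illustrate why aggregation depends on the predictions being kept, the following sketch pools the predictions from several folds into one overall accuracy. It is a simplified stand-in and not how weka.classifiers.Evaluation aggregates its statistics; FoldAggregation, Prediction and pooledAccuracy are hypothetical names.

```java
import java.util.List;

/** Hypothetical helper: pools per-fold predictions into one statistic. */
final class FoldAggregation {

  /** A single prediction: actual and predicted class label index. */
  static final class Prediction {
    final int actual;
    final int predicted;

    Prediction(int actual, int predicted) {
      this.actual = actual;
      this.predicted = predicted;
    }
  }

  private FoldAggregation() {
  }

  /**
   * Computes the accuracy over the predictions of all folds.
   *
   * @param folds the predictions collected per fold
   * @return the fraction of correct predictions
   */
  static double pooledAccuracy(List<List<Prediction>> folds) {
    int correct = 0;
    int total = 0;
    for (List<Prediction> fold : folds) {
      for (Prediction p : fold) {
        total++;
        if (p.actual == p.predicted)
          correct++;
      }
    }
    // without stored predictions there is nothing to aggregate
    if (total == 0)
      throw new IllegalStateException("No predictions available, were they discarded?");
    return (double) correct / total;
  }
}
```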
      • setNumThreads

        public void setNumThreads(int value)
        Sets the number of threads to use for cross-validation (only used if no JobRunnerSetup/JobRunner set).
        Specified by:
        setNumThreads in interface adams.core.ThreadLimiter
        Parameters:
        value - the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
      • getNumThreads

        public int getNumThreads()
        Returns the number of threads to use for cross-validation (only used if no JobRunnerSetup/JobRunner set).
        Specified by:
        getNumThreads in interface adams.core.ThreadLimiter
        Returns:
        the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
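        The documented thread values can be read as the following mapping. This is a minimal sketch: ThreadResolution and resolveThreads are hypothetical names, and rejecting values below -1 is an assumption, since this page does not describe them.

```java
/** Hypothetical helper illustrating the documented thread conventions. */
final class ThreadResolution {

  private ThreadResolution() {
  }

  /**
   * Maps the configured number of threads to the number actually used.
   *
   * @param requested -1 = # of CPUs/cores; 0/1 = sequential execution
   * @param availableCores the number of cores of the machine
   * @return the number of threads to use, 1 means sequential
   */
  static int resolveThreads(int requested, int availableCores) {
    if (availableCores < 1)
      throw new IllegalArgumentException("Need at least 1 core, got: " + availableCores);
    // assumption of this sketch: values below -1 are not supported
    if (requested < -1)
      throw new IllegalArgumentException("Unsupported number of threads: " + requested);
    if (requested == -1)
      return availableCores;
    // 0 and 1 both mean sequential execution
    if (requested <= 1)
      return 1;
    return requested;
  }
}
```

        A typical call would pass Runtime.getRuntime().availableProcessors() as the second argument.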
      • setStatusMessageHandler

        public void setStatusMessageHandler(adams.core.StatusMessageHandler value)
        Sets the status message handler for outputting notifications.
        Parameters:
        value - the handler
      • getStatusMessageHandler

        public adams.core.StatusMessageHandler getStatusMessageHandler()
        Returns the status message handler for outputting notifications.
        Returns:
        the handler, null if none set
      • initOutputBuffer

        protected void initOutputBuffer()
        Initializes the output buffer.
      • getOutputBuffer

        public StringBuffer getOutputBuffer()
        Returns the output buffer.
        Returns:
        the output buffer
      • getEvaluation

        public weka.classifiers.Evaluation getEvaluation()
        Returns the generated (aggregated) evaluation.
        Returns:
        the evaluation
      • getEvaluations

        public weka.classifiers.Evaluation[] getEvaluations()
        Returns the generated evaluations (if multi-threaded or separated).
        Returns:
        the evaluations, null if not multi-threaded or not separated
      • getOriginalIndices

        public int[] getOriginalIndices()
        Returns the original indices.
        Returns:
        the indices
      • isSingleThreaded

        public boolean isSingleThreaded()
        Returns whether the execution was single-threaded (after execute()).
        Returns:
        true if single-threaded
      • execute

        public String execute()
        Executes the cross-validation.
        Returns:
        null if everything is fine, otherwise error message
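        Because success is signalled by null and failure by a message, callers have to check the return value explicitly. One possible way to handle this convention is to convert it into an exception, as sketched here. ExecuteResultDemo and runOrFail are hypothetical names, and the Supplier stands in for a call such as execute().

```java
import java.util.function.Supplier;

/** Hypothetical helper for the "null means success" return convention. */
final class ExecuteResultDemo {

  private ExecuteResultDemo() {
  }

  /**
   * Runs the step and turns a non-null error message into an exception.
   *
   * @param step returns null if everything is fine, otherwise an error message
   */
  static void runOrFail(Supplier<String> step) {
    String error = step.get();
    // null signals success, anything else is the error message
    if (error != null)
      throw new IllegalStateException("Execution failed: " + error);
  }
}
```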
      • isStopped

        public boolean isStopped()
        Returns whether the execution has been stopped.
        Returns:
        true if stopped
      • stopExecution

        public void stopExecution()
        Stops the execution.
        Specified by:
        stopExecution in interface adams.core.Stoppable
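        Stopping is typically cooperative: one party sets a flag, and the running loop checks it between units of work. The sketch below shows this pattern in plain Java. StoppableLoop is a hypothetical stand-in, and the message returned after a stop is an assumption, since this page does not state what execute() returns once stopExecution() has been called.

```java
import java.util.function.IntConsumer;

/** Hypothetical stand-in showing a cooperative stop flag. */
final class StoppableLoop {

  /** Set to request a stop; volatile so other threads see the change. */
  private volatile boolean stopped;

  /** The number of folds that were fully processed. */
  private int completed;

  void stopExecution() {
    stopped = true;
  }

  boolean isStopped() {
    return stopped;
  }

  int getCompleted() {
    return completed;
  }

  /**
   * Processes the folds one after the other, checking the stop flag
   * before each fold.
   *
   * @param numFolds the number of folds to process
   * @param foldTask the work to perform per fold (receives the fold index)
   * @return null if all folds were processed, otherwise a message
   */
  String execute(int numFolds, IntConsumer foldTask) {
    stopped = false;
    completed = 0;
    for (int fold = 0; fold < numFolds; fold++) {
      if (stopped)
        return "Stopped after " + completed + " of " + numFolds + " folds";
      foldTask.accept(fold);
      completed++;
    }
    return null;
  }
}
```

        A fold that is already running finishes before the flag is checked again, so a stop request takes effect at the next fold boundary.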
      • cleanUp

        public void cleanUp()
        Cleans up data structures, frees up memory.
        Specified by:
        cleanUp in interface adams.core.CleanUpHandler