Class RemoveOutliers

  • All Implemented Interfaces:
    adams.core.Destroyable, adams.core.GlobalInfoSupporter, adams.core.logging.LoggingLevelHandler, adams.core.logging.LoggingSupporter, adams.core.option.OptionHandler, adams.core.Randomizable, adams.core.ShallowCopySupporter<AbstractCleaner>, adams.core.SizeOfHandler, adams.core.Stoppable, adams.core.StoppableWithFeedback, adams.core.ThreadLimiter, adams.flow.core.FlowContextHandler, Serializable, Comparable

    public class RemoveOutliers
    extends AbstractCleaner
    implements adams.core.Randomizable, adams.core.ThreadLimiter, adams.core.StoppableWithFeedback
    Cross-validates the specified classifier on the incoming data and applies the outlier detector to the actual vs predicted data to remove the outliers.
    NB: only works on full dataset, not instance by instance.
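
To make the description above concrete, here is a minimal, self-contained sketch of the general idea: obtain out-of-fold predictions through seeded k-fold cross-validation, compare them with the actual values, and drop the rows the detector flags. It does not use ADAMS or Weka classes. The class `CvOutlierSketch`, its method names, the one-variable least-squares model, and the fixed absolute-error threshold are all assumptions made for illustration; they stand in for the configurable classifier and outlier detector and are not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; hypothetical names, not the ADAMS implementation. */
final class CvOutlierSketch {

  /** Fits y = a + b*x by least squares on the rows listed in train; returns {a, b}. */
  static double[] fit(double[] x, double[] y, List<Integer> train) {
    double mx = 0, my = 0;
    for (int i : train) { mx += x[i]; my += y[i]; }
    mx /= train.size();
    my /= train.size();
    double sxy = 0, sxx = 0;
    for (int i : train) {
      sxy += (x[i] - mx) * (y[i] - my);
      sxx += (x[i] - mx) * (x[i] - mx);
    }
    double b = (sxx == 0) ? 0 : sxy / sxx;
    return new double[]{my - b * mx, b};
  }

  /** Returns one out-of-fold prediction per row, using a seeded shuffle for the fold split. */
  static double[] crossValidate(double[] x, double[] y, int folds, long seed) {
    if (folds < 2 || folds > x.length)
      throw new IllegalArgumentException("folds must be between 2 and the number of rows");
    List<Integer> order = new ArrayList<>();
    for (int i = 0; i < x.length; i++) order.add(i);
    Collections.shuffle(order, new Random(seed));
    double[] predicted = new double[x.length];
    for (int f = 0; f < folds; f++) {
      List<Integer> train = new ArrayList<>();
      List<Integer> test = new ArrayList<>();
      for (int p = 0; p < order.size(); p++) {
        if (p % folds == f) test.add(order.get(p)); else train.add(order.get(p));
      }
      double[] model = fit(x, y, train);
      for (int i : test) predicted[i] = model[0] + model[1] * x[i];
    }
    return predicted;
  }

  /** Keeps the row indices whose absolute error (actual vs predicted) is at most maxError. */
  static List<Integer> keep(double[] actual, double[] predicted, double maxError) {
    List<Integer> kept = new ArrayList<>();
    for (int i = 0; i < actual.length; i++) {
      if (Math.abs(actual[i] - predicted[i]) <= maxError) kept.add(i);
    }
    return kept;
  }
}
```

Because every row is predicted by a model that never saw it during training, a grossly wrong target value shows up as a large error for that row. This also illustrates why such a cleaner needs the full dataset rather than a single instance.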

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -pre-filter <weka.filters.Filter> (property: preFilter)
        The filter to use for pre-filtering the data.
        default: weka.filters.AllFilter
     
    -classifier <weka.classifiers.Classifier> (property: classifier)
        The classifier to use for generating the actual vs predicted data.
        default: weka.classifiers.functions.LinearRegressionJ -S 0 -R 1.0E-8
     
    -seed <long> (property: seed)
        The seed value for the cross-validation.
        default: 1
     
    -num-folds <int> (property: numFolds)
        The number of folds to use in the cross-validation.
        default: 10
        minimum: 2
     
    -num-threads <int> (property: numThreads)
        The number of threads to use for cross-validation; -1 = number of CPUs/cores; 0 or 1 = sequential execution.
        default: 1
        minimum: -1
     
    -detector <adams.flow.control.removeoutliers.AbstractOutlierDetector> (property: detector)
        The outlier detector to use.
        default: adams.flow.control.removeoutliers.Null
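
The -num-threads convention listed above (-1 = number of CPUs/cores, 0 or 1 = sequential) can be sketched as a small helper. `ThreadCountSketch` and `effectiveThreads` are hypothetical names, not part of ADAMS. Passing values above 1 through unchanged is an assumption, since the option description does not say whether they are capped at the core count.

```java
/** Illustrative sketch only; hypothetical helper, not part of ADAMS. */
final class ThreadCountSketch {

  /**
   * Maps the configured value to the number of worker threads to start.
   *
   * @param numThreads     the configured value (minimum: -1)
   * @param availableCores the number of CPUs/cores reported by the runtime
   */
  static int effectiveThreads(int numThreads, int availableCores) {
    if (numThreads < -1)
      throw new IllegalArgumentException("numThreads must be >= -1");
    if (numThreads == -1)
      return Math.max(1, availableCores);  // -1 = use all CPUs/cores
    if (numThreads <= 1)
      return 1;                            // 0 or 1 = sequential execution
    return numThreads;                     // assumption: used as given, not capped
  }
}
```

A caller would typically supply `Runtime.getRuntime().availableProcessors()` as the second argument.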
     
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_Classifier

        protected weka.classifiers.Classifier m_Classifier
        the classifier to use for evaluation.
      • m_Seed

        protected long m_Seed
        the seed value.
      • m_NumFolds

        protected int m_NumFolds
        the number of folds to use.
      • m_Detector

        protected adams.flow.control.removeoutliers.AbstractOutlierDetector m_Detector
        the outlier detector to use.
      • m_NumThreads

        protected int m_NumThreads
        the number of threads to use for parallel execution.
      • m_JobRunnerSetup

        protected transient adams.flow.standalone.JobRunnerSetup m_JobRunnerSetup
        the jobrunner setup.
      • m_JobRunner

        protected transient adams.multiprocess.JobRunner m_JobRunner
        the runner in use.
      • m_Stopped

        protected boolean m_Stopped
        whether the execution was stopped.
      • m_CurrentEvaluation

        protected transient weka.classifiers.StoppableEvaluation m_CurrentEvaluation
        the current evaluation.
    • Constructor Detail

      • RemoveOutliers

        public RemoveOutliers()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing the object.
        Specified by:
        globalInfo in interface adams.core.GlobalInfoSupporter
        Specified by:
        globalInfo in class adams.core.option.AbstractOptionHandler
        Returns:
        a description suitable for displaying in the GUI
      • defineOptions

        public void defineOptions()
        Adds options to the internal list of options.
        Specified by:
        defineOptions in interface adams.core.option.OptionHandler
        Overrides:
        defineOptions in class AbstractCleaner
      • setClassifier

        public void setClassifier(weka.classifiers.Classifier value)
        Sets the classifier.
        Parameters:
        value - the classifier
      • getClassifier

        public weka.classifiers.Classifier getClassifier()
        Returns the classifier.
        Returns:
        the classifier
      • classifierTipText

        public String classifierTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setSeed

        public void setSeed(long value)
        Sets the seed value.
        Specified by:
        setSeed in interface adams.core.Randomizable
        Parameters:
        value - the seed
      • getSeed

        public long getSeed()
        Returns the seed value.
        Specified by:
        getSeed in interface adams.core.Randomizable
        Returns:
        the seed
      • seedTipText

        public String seedTipText()
        Returns the tip text for this property.
        Specified by:
        seedTipText in interface adams.core.Randomizable
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setNumFolds

        public void setNumFolds(int value)
        Sets the number of folds to use.
        Parameters:
        value - the folds
      • getNumFolds

        public int getNumFolds()
        Returns the number of folds to use in the cross-validation.
        Returns:
        the folds
      • numFoldsTipText

        public String numFoldsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setNumThreads

        public void setNumThreads(int value)
        Sets the number of threads to use for cross-validation.
        Specified by:
        setNumThreads in interface adams.core.ThreadLimiter
        Parameters:
        value - the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
      • getNumThreads

        public int getNumThreads()
        Returns the number of threads to use for cross-validation.
        Specified by:
        getNumThreads in interface adams.core.ThreadLimiter
        Returns:
        the number of threads: -1 = # of CPUs/cores; 0/1 = sequential execution
      • numThreadsTipText

        public String numThreadsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setDetector

        public void setDetector(adams.flow.control.removeoutliers.AbstractOutlierDetector value)
        Sets the detector.
        Parameters:
        value - the detector
      • getDetector

        public adams.flow.control.removeoutliers.AbstractOutlierDetector getDetector()
        Returns the detector.
        Returns:
        the detector
      • detectorTipText

        public String detectorTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • preCheck

        protected void preCheck(weka.core.Instances data)
        Performs some pre-checks to determine whether the data is actually suitable.
        Overrides:
        preCheck in class AbstractCleaner
        Parameters:
        data - the instances to clean
      • performCheck

        protected String performCheck(weka.core.Instance data)
        Performs the actual check.
        Specified by:
        performCheck in class AbstractCleaner
        Parameters:
        data - the instance to check
        Returns:
        always null (this cleaner only works on full datasets, not instance by instance)
      • crossValidate

        protected weka.classifiers.Evaluation crossValidate(weka.core.Instances data,
                                                            int folds)
                                                     throws Exception
        Cross-validates the classifier on the given data.
        Parameters:
        data - the data to use for cross-validation
        folds - the number of folds
        Returns:
        the evaluation
        Throws:
        Exception - if cross-validation fails
      • evaluationToSpreadSheet

        protected adams.data.spreadsheet.SpreadSheet evaluationToSpreadSheet(weka.classifiers.Evaluation eval)
        Turns the predictions of the evaluation object into a spreadsheet.
        Parameters:
        eval - the evaluation object to convert
        Returns:
        the generated spreadsheet
      • performClean

        protected weka.core.Instances performClean(weka.core.Instances data)
        Performs the actual cleaning.
        Specified by:
        performClean in class AbstractCleaner
        Parameters:
        data - the instances to clean
        Returns:
        the cleaned instances
      • stopExecution

        public void stopExecution()
        Stops the execution. No message set.
        Specified by:
        stopExecution in interface adams.core.Stoppable
      • isStopped

        public boolean isStopped()
        Whether the execution has been stopped.
        Specified by:
        isStopped in interface adams.core.StoppableWithFeedback
        Returns:
        true if stopped
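
The stopExecution/isStopped pair documented above follows a cooperative-cancellation pattern: one thread sets a flag, and the working loop checks it between units of work. The sketch below shows only that pattern. `StoppableLoopSketch`, its `run` method, and the callback parameter are hypothetical and say nothing about how the class itself interrupts a running evaluation.

```java
import java.util.function.IntConsumer;

/** Illustrative sketch only; hypothetical class, not the ADAMS implementation. */
final class StoppableLoopSketch {

  /** Volatile so that a stop request from another thread becomes visible to the loop. */
  private volatile boolean stopped;

  /** Requests a stop; no message is recorded. */
  void stopExecution() {
    stopped = true;
  }

  /** Returns whether a stop has been requested. */
  boolean isStopped() {
    return stopped;
  }

  /**
   * Runs up to 'folds' units of work, checking the flag before each one.
   *
   * @return the number of units that were completed
   */
  int run(int folds, IntConsumer work) {
    stopped = false;
    int done = 0;
    for (int f = 0; f < folds && !stopped; f++) {
      work.accept(f);
      done++;
    }
    return done;
  }
}
```

A unit of work that is already running finishes before the flag is checked again, so a stop request takes effect at the next fold boundary in this sketch.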