Class WekaInstancesInfo

  • All Implemented Interfaces:
    AdditionalInformationHandler, ArrayProvider, CleanUpHandler, Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, QuickInfoSupporter, ShallowCopySupporter<Actor>, SizeOfHandler, Stoppable, StoppableWithFeedback, VariablesInspectionHandler, VariableChangeListener, Actor, DataInfoActor, ErrorHandler, InputConsumer, OutputProducer, Serializable, Comparable

    public class WekaInstancesInfo
    extends AbstractArrayProvider
    implements DataInfoActor
    Outputs statistics of a weka.core.Instances object.
    FULL_ATTRIBUTE and FULL_CLASS output a spreadsheet with detailed attribute statistics. All others output either strings, integers or doubles (or arrays of them, in case of counts/distribution).

    Input/output:
    - accepts:
       weka.core.Instances
       weka.core.Instance
    - generates:
       java.lang.String


    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -name <java.lang.String> (property: name)
        The name of the actor.
        default: WekaInstancesInfo
     
    -annotation <adams.core.base.BaseAnnotation> (property: annotations)
        The annotations to attach to this actor.
        default:
     
    -skip <boolean> (property: skip)
        If set to true, transformation is skipped and the input token is just forwarded
        as it is.
        default: false
     
    -stop-flow-on-error <boolean> (property: stopFlowOnError)
        If set to true, the flow execution at this level gets stopped in case this
        actor encounters an error; the error gets propagated; useful for critical
        actors.
        default: false
     
    -silent <boolean> (property: silent)
        If enabled, no errors are output to the console. Note: the enclosing
        actor handler must have this enabled as well.
        default: false
     
    -output-array <boolean> (property: outputArray)
        If enabled, the values are output as an array rather than one by one
        (counts or distributions are always output as an array).
        default: false
     
    -type <FULL|FULL_ATTRIBUTE|FULL_CLASS|HEADER|RELATION_NAME|NUM_ATTRIBUTES|NUM_INSTANCES|NUM_CLASS_LABELS|ATTRIBUTE_NAME|ATTRIBUTE_NAMES|CLASS_ATTRIBUTE_NAME|LABELS|CLASS_LABELS|NUM_LABELS|NUM_MISSING_VALUES|NUM_DISTINCT_VALUES|NUM_UNIQUE_VALUES|LABEL_COUNT|CLASS_LABEL_COUNT|LABEL_COUNTS|CLASS_LABEL_COUNTS|LABEL_DISTRIBUTION|CLASS_LABEL_DISTRIBUTION|MIN|MAX|MEAN|STDEV|ATTRIBUTE_TYPE|CLASS_TYPE> (property: type)
        The type of information to generate. Note that some of the types are only
        available for numeric or nominal attributes.
        default: FULL
     
    -attribute-index <adams.data.weka.WekaAttributeIndex> (property: attributeIndex)
        The attribute index to use for generating attribute-specific information.
        An index is a number starting with 1; apart from attribute names (case-sensitive),
        the following placeholders can be used as well: first, second, third,
        last_2, last_1, last. Numeric indices can be enforced by preceding them
        with '#' (e.g., '#12'); attribute names can be surrounded by double quotes.
        default: last
     
    -label-index <adams.data.weka.WekaLabelIndex> (property: labelIndex)
        The index of the label to use. An index is a number starting with 1; apart
        from label names (case-sensitive), the following placeholders can be used
        as well: first, second, third, last_2, last_1, last. Numeric indices can
        be enforced by preceding them with '#' (e.g., '#12'); label names can be
        surrounded by double quotes.
        default: first
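The index rules described for -attribute-index and -label-index can be illustrated with a small, self-contained sketch. IndexSketch and its methods are hypothetical names and not part of ADAMS. The sketch assumes that last_1 and last_2 refer to the second-to-last and third-to-last positions, and that an unquoted string without a '#' prefix is tried as a name before it is parsed as a number; the actual WekaAttributeIndex and WekaLabelIndex classes may resolve such cases differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not an ADAMS class): resolves an index expression to a 0-based position. */
final class IndexSketch {

  /**
   * @param expr  index expression, e.g. "first", "last_1", "#12", "3" or a (quoted) name
   * @param names the attribute or label names in dataset order
   * @return the 0-based position, or -1 if the expression cannot be resolved
   */
  static int resolve(String expr, List<String> names) {
    String s = expr.trim();
    int n = names.size();
    int pos;
    switch (s) {
      case "first":  pos = 0;     break;
      case "second": pos = 1;     break;
      case "third":  pos = 2;     break;
      case "last_2": pos = n - 3; break;  // assumption: third-to-last
      case "last_1": pos = n - 2; break;  // assumption: second-to-last
      case "last":   pos = n - 1; break;
      default:       pos = resolveNameOrNumber(s, names); break;
    }
    // anything outside the valid range counts as unresolved
    return (pos >= 0 && pos < n) ? pos : -1;
  }

  private static int resolveNameOrNumber(String s, List<String> names) {
    // a leading '#' enforces numeric interpretation
    if (s.startsWith("#"))
      return parseOneBased(s.substring(1));
    // surrounding double quotes enforce name interpretation
    if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\""))
      return names.indexOf(s.substring(1, s.length() - 1));
    // assumption: try a case-sensitive name match first, then a 1-based number
    int byName = names.indexOf(s);
    return (byName >= 0) ? byName : parseOneBased(s);
  }

  private static int parseOneBased(String s) {
    try {
      return Integer.parseInt(s) - 1;  // 1-based input, 0-based result
    }
    catch (NumberFormatException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("outlook", "temperature", "humidity", "windy", "play");
    for (String expr : new String[]{"first", "last_1", "#2", "\"windy\"", "play", "#12"})
      System.out.println(expr + " -> " + resolve(expr, names));
  }
}
```

Returning -1 for unresolved expressions keeps the sketch free of exceptions; a real implementation might report an error message instead.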
     
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_AttributeIndex

        protected WekaAttributeIndex m_AttributeIndex
        the index of the attribute to get the information for.
      • m_LabelIndex

        protected WekaLabelIndex m_LabelIndex
        the index of the label.
      • m_DateFormat

        protected DateFormat m_DateFormat
        for formatting dates.
    • Constructor Detail

      • WekaInstancesInfo

        public WekaInstancesInfo()
    • Method Detail

      • setType

        public void setType(WekaInstancesInfo.InfoType value)
        Sets the type of information to generate.
        Parameters:
        value - the type
      • typeTipText

        public String typeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setAttributeIndex

        public void setAttributeIndex(WekaAttributeIndex value)
        Sets the attribute index to use for attribute-specific information.
        Parameters:
        value - the 1-based index
      • getAttributeIndex

        public WekaAttributeIndex getAttributeIndex()
        Returns the attribute index to use for attribute-specific information.
        Returns:
        the 1-based index
      • attributeIndexTipText

        public String attributeIndexTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setLabelIndex

        public void setLabelIndex(WekaLabelIndex value)
        Sets the index of the label to use.
        Parameters:
        value - the 1-based index
      • getLabelIndex

        public WekaLabelIndex getLabelIndex()
        Returns the index of the label to use.
        Returns:
        the 1-based index
      • labelIndexTipText

        public String labelIndexTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • accepts

        public Class[] accepts()
        Returns the classes that the consumer accepts.
        Specified by:
        accepts in interface InputConsumer
        Returns:
        weka.core.Instances.class, weka.core.Instance.class
      • addStatistic

        protected void addStatistic(SpreadSheet sheet,
                                    String name,
                                    Object value)
        Adds a statistic to the spreadsheet.
        Parameters:
        sheet - the spreadsheet to add the data to
        name - the name of the statistic
        value - the statistic (string, double, int)
      • formatDate

        protected Object formatDate(double value)
        Formats date stats.
        Parameters:
        value - the date (java epoch) to process
        Returns:
        the (potentially) formatted value
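As a rough illustration of what formatting a date statistic could look like, the sketch below turns an epoch value stored as a double into a string via a java.text.DateFormat. DateStatSketch is a hypothetical class, not the ADAMS implementation. It assumes the value is in milliseconds since 1970-01-01 UTC and that NaN is passed through unformatted; the conditions under which the real method skips formatting are not stated in this documentation.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical sketch (not an ADAMS class) of formatting an epoch-based date statistic. */
final class DateStatSketch {

  /** the formatter, playing a role similar to the m_DateFormat field. */
  private final DateFormat format;

  DateStatSketch(String pattern, TimeZone zone) {
    // Locale.ROOT keeps the output independent of the machine's default locale
    format = new SimpleDateFormat(pattern, Locale.ROOT);
    format.setTimeZone(zone);
  }

  /**
   * @param value the date as milliseconds since the epoch (assumption)
   * @return the formatted string, or the unchanged value (boxed) if it is NaN
   */
  Object formatDate(double value) {
    if (Double.isNaN(value))
      return value;  // assumption: nothing sensible to format, hand back the raw number
    return format.format(new Date((long) value));
  }

  public static void main(String[] args) {
    DateStatSketch sketch = new DateStatSketch("yyyy-MM-dd HH:mm:ss", TimeZone.getTimeZone("UTC"));
    System.out.println(sketch.formatDate(0.0));
    System.out.println(sketch.formatDate(Double.NaN));
  }
}
```

The Object return type mirrors the signature above: the caller receives either a String or a boxed Double.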
      • getAttributeStats

        protected SpreadSheet getAttributeStats(weka.core.Instances data,
                                                int index)
        Generates attribute statistics.
        Parameters:
        data - the dataset to use
        index - the 0-based index of the attribute
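To make the numeric statistics named in the -type option (MIN, MAX, MEAN, STDEV) more concrete, here is a minimal plain-Java sketch over a column of values. NumericStatsSketch is a hypothetical name and this is not the code behind getAttributeStats. It assumes that missing values are encoded as NaN and skipped, and that the standard deviation is the sample version (division by n - 1); the actor may define these details differently.

```java
import java.util.Arrays;

/** Hypothetical sketch (not an ADAMS class) of simple numeric column statistics. */
final class NumericStatsSketch {

  /**
   * @param values the column values; NaN entries are treated as missing (assumption)
   * @return an array {min, max, mean, stdev}; all NaN if there are no usable values
   */
  static double[] summarize(double[] values) {
    int n = 0;
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    double sum = 0.0;
    // first pass: count, range and sum of the non-missing values
    for (double v : values) {
      if (Double.isNaN(v))
        continue;
      n++;
      sum += v;
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    if (n == 0)
      return new double[]{Double.NaN, Double.NaN, Double.NaN, Double.NaN};
    double mean = sum / n;
    // second pass: sum of squared deviations from the mean
    double ss = 0.0;
    for (double v : values) {
      if (!Double.isNaN(v))
        ss += (v - mean) * (v - mean);
    }
    // assumption: sample standard deviation, undefined for a single value
    double stdev = (n > 1) ? Math.sqrt(ss / (n - 1)) : Double.NaN;
    return new double[]{min, max, mean, stdev};
  }

  public static void main(String[] args) {
    double[] column = {1, 2, Double.NaN, 3, 4, 5};
    System.out.println(Arrays.toString(summarize(column)));
  }
}
```

The two-pass form is chosen for readability; a streaming variant would need only one pass at the cost of a slightly less obvious update rule.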
      • doExecute

        protected String doExecute()
        Executes the flow item.
        Specified by:
        doExecute in class AbstractActor
        Returns:
        null if everything is fine, otherwise error message
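
The LABEL_COUNTS and LABEL_DISTRIBUTION types listed under -type are always output as arrays. The sketch below shows one plausible way such arrays could be derived for a nominal attribute. LabelCountSketch is a hypothetical class, not part of ADAMS; it assumes that missing values (null here) are ignored and that the distribution is expressed as fractions summing to 1, which this documentation does not specify.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch (not an ADAMS class) of per-label counts and their distribution. */
final class LabelCountSketch {

  /**
   * @param labels the declared labels of the nominal attribute, in order
   * @param values the observed values; null stands for a missing value (assumption)
   * @return one count per label, in label order
   */
  static int[] counts(List<String> labels, List<String> values) {
    int[] result = new int[labels.size()];
    for (String v : values) {
      int idx = (v == null) ? -1 : labels.indexOf(v);
      if (idx >= 0)
        result[idx]++;  // unknown or missing values are skipped
    }
    return result;
  }

  /**
   * @param counts the per-label counts
   * @return the per-label fractions (assumption: they sum to 1), all zeros if there are no counts
   */
  static double[] distribution(int[] counts) {
    long total = 0;
    for (int c : counts)
      total += c;
    double[] result = new double[counts.length];
    if (total == 0)
      return result;
    for (int i = 0; i < counts.length; i++)
      result[i] = (double) counts[i] / total;
    return result;
  }

  public static void main(String[] args) {
    List<String> labels = Arrays.asList("yes", "no");
    List<String> values = Arrays.asList("yes", "no", "yes", null, "yes");
    int[] c = counts(labels, values);
    System.out.println(Arrays.toString(c));
    System.out.println(Arrays.toString(distribution(c)));
  }
}
```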