Class PCA

  • All Implemented Interfaces:
    Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, SizeOfHandler, CapabilitiesHandler, BatchFilter, ColumnSubsetFilter, Filter, Serializable

    public class PCA
    extends AbstractColumnSubsetBatchFilter
    Performs principal components analysis.

    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
    -column-subset <RANGE|REGEXP> (property: columnSubset)
        Defines how to determine the columns to use for filtering.
        default: RANGE
    -col-range <> (property: colRange)
        The range of columns to use in the filtering process.
        default: first-last
        example: A range is a comma-separated list of single 1-based indices or sub-ranges of indices ('start-end'); 'inv(...)' inverts the range '...'; column names (case-sensitive) as well as the following placeholders can be used: first, second, third, last_2, last_1, last; numeric indices can be enforced by preceding them with '#' (e.g., '#12'); column names can be surrounded by double quotes.
    -col-regexp <adams.core.base.BaseRegExp> (property: colRegExp)
        The regular expression to use on the column names to determine whether to
        use a column for filtering.
        default: .*
    -drop-other-columns <boolean> (property: dropOtherColumns)
        If enabled, other columns that aren't used for filtering get removed from
        the output; does not affect any class columns.
        default: false
    -variance <double> (property: variance)
        The proportion of the total variance (between 0 and 1) that the retained
        principal components should cover.
        default: 0.95
        minimum: 0.0
        maximum: 1.0
    -max-columns <int> (property: maxColumns)
        The maximum number of columns to generate.
        default: -1
        minimum: -1
    -center <boolean> (property: center)
        If enabled, the data gets centered rather than standardized, computing PCA
        from covariance matrix rather than correlation matrix.
        default: false
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
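
The -col-range syntax listed in the options above can be illustrated with a small, self-contained sketch. It is not the ADAMS implementation: the class ColRangeSketch and its methods are hypothetical, and only a subset of the syntax is handled (comma-separated 1-based indices, 'start-end' sub-ranges, and the placeholders first and last). Inversion, column names, '#' prefixes and the other placeholders are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; handles only a simplified subset of the range syntax. */
public class ColRangeSketch {

  /** Resolves "first", "last" or a 1-based number to a 0-based column index. */
  public static int resolve(String token, int numCols) {
    String t = token.trim();
    if (t.equals("first")) return 0;
    if (t.equals("last")) return numCols - 1;
    return Integer.parseInt(t) - 1;
  }

  /** Expands a range such as "first-2,4" into 0-based column indices. */
  public static List<Integer> expand(String range, int numCols) {
    List<Integer> result = new ArrayList<>();
    for (String part : range.split(",")) {
      int dash = part.indexOf('-');
      // a part without a dash is a single index, so from == to
      int from = resolve(dash < 0 ? part : part.substring(0, dash), numCols);
      int to = resolve(dash < 0 ? part : part.substring(dash + 1), numCols);
      for (int i = from; i <= to; i++) {
        // skip out-of-bounds indices and duplicates
        if (i >= 0 && i < numCols && !result.contains(i)) result.add(i);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(expand("first-last", 4)); // all four columns
    System.out.println(expand("first-2,4", 5));  // columns 1, 2 and 4 (1-based)
  }
}
```

With the default range first-last, every column of the input takes part in the filtering.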
    • Field Detail

      • m_Variance

        protected double m_Variance
        the variance to cover.
      • m_MaxColumns

        protected int m_MaxColumns
        the maximum number of columns to generate.
      • m_Center

        protected boolean m_Center
        whether to center (rather than standardize) the data and compute PCA from covariance (rather than correlation) matrix.
      • m_Algorithm

        protected com.github.waikatodatamining.matrix.algorithm.PCA m_Algorithm
        the actual algorithm.
      • m_NumColumns

        protected int m_NumColumns
        the number of columns that got determined.
      • m_Transformed

        protected transient com.github.waikatodatamining.matrix.core.Matrix m_Transformed
        temp matrix to avoid duplicate transformation.
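
How m_Variance and m_MaxColumns might combine into m_NumColumns can be sketched as follows. This is an assumption about the usual PCA selection rule, not the code of the underlying algorithm: components are taken in order of decreasing eigenvalue until their cumulative share reaches the requested variance, and the count is then capped. The class VarianceCoverageSketch is hypothetical, and treating a maxColumns value below 1 as "no limit" is likewise an assumption based on the default of -1.

```java
/** Hypothetical illustration of selecting the number of principal components. */
public class VarianceCoverageSketch {

  /**
   * Counts the leading components needed to cover the given fraction of the
   * total variance. The eigenvalues must be sorted in descending order.
   * A maxColumns value below 1 is treated as "no limit" (assumption).
   */
  public static int numComponents(double[] eigenvalues, double variance, int maxColumns) {
    double total = 0.0;
    for (double ev : eigenvalues) total += ev;

    int count = 0;
    double covered = 0.0;
    // add components until the cumulative eigenvalue sum reaches the target
    while (count < eigenvalues.length && covered < variance * total) {
      covered += eigenvalues[count];
      count++;
    }
    // apply the optional cap on the number of generated columns
    if (maxColumns > 0 && count > maxColumns) count = maxColumns;
    return count;
  }

  public static void main(String[] args) {
    double[] eigenvalues = {6.0, 3.0, 0.7, 0.3};
    System.out.println(numComponents(eigenvalues, 0.95, -1)); // 3
    System.out.println(numComponents(eigenvalues, 0.95, 2));  // 2
    System.out.println(numComponents(eigenvalues, 0.50, -1)); // 1
  }
}
```

In the example the first two eigenvalues cover 90% of the total, so a target of 0.95 needs a third component unless the cap of 2 cuts it off.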
    • Constructor Detail

      • PCA

        public PCA()
    • Method Detail

      • setVariance

        public void setVariance(double value)
        Sets the variance to cover.
        Parameters:
        value - the variance to cover
      • getVariance

        public double getVariance()
        Returns the variance to cover.
        Returns:
        the variance to cover
      • varianceTipText

        public String varianceTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setMaxColumns

        public void setMaxColumns(int value)
        Sets the maximum number of columns to generate.
        Parameters:
        value - the maximum number of columns
      • getMaxColumns

        public int getMaxColumns()
        Returns the maximum number of columns to generate.
        Returns:
        the maximum number of columns
      • maxColumnsTipText

        public String maxColumnsTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setCenter

        public void setCenter(boolean center)
        Sets whether to center (rather than standardize) the data. If set to true, PCA is computed from the covariance matrix rather than the correlation matrix.
        Parameters:
        center - true if the data is to be centered rather than standardized
      • getCenter

        public boolean getCenter()
        Returns whether to center (rather than standardize) the data. If true, PCA is computed from the covariance matrix rather than the correlation matrix.
        Returns:
        true if the data is to be centered rather than standardized
      • centerTipText

        public String centerTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • getCapabilities

        public Capabilities getCapabilities()
        Returns the capabilities.
        Returns:
        the capabilities
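
The effect of the center property described under setCenter and getCenter can be made concrete with a short sketch. It uses plain arrays rather than the matrix library behind m_Algorithm, and the class CenterVsStandardizeSketch with its methods is hypothetical. It shows the standard identity only: the cross-product of centered data, divided by n - 1, is the covariance matrix, while the same computation on standardized data gives the correlation matrix.

```java
/** Hypothetical illustration of centering versus standardizing before PCA. */
public class CenterVsStandardizeSketch {

  /**
   * Subtracts the column mean from every value. If center is false, the
   * values are additionally divided by the column's sample standard deviation.
   * Constant columns (standard deviation 0) are only centered.
   */
  public static double[][] preprocess(double[][] data, boolean center) {
    int n = data.length;
    int m = data[0].length;
    double[][] out = new double[n][m];
    for (int j = 0; j < m; j++) {
      double mean = 0.0;
      for (int i = 0; i < n; i++) mean += data[i][j];
      mean /= n;
      double ss = 0.0;
      for (int i = 0; i < n; i++) ss += (data[i][j] - mean) * (data[i][j] - mean);
      double sd = Math.sqrt(ss / (n - 1));
      double divisor = (center || sd == 0.0) ? 1.0 : sd;
      for (int i = 0; i < n; i++) out[i][j] = (data[i][j] - mean) / divisor;
    }
    return out;
  }

  /** Computes X^T X / (n - 1) for already preprocessed data. */
  public static double[][] crossProduct(double[][] x) {
    int n = x.length;
    int m = x[0].length;
    double[][] c = new double[m][m];
    for (int a = 0; a < m; a++) {
      for (int b = 0; b < m; b++) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i][a] * x[i][b];
        c[a][b] = s / (n - 1);
      }
    }
    return c;
  }

  public static void main(String[] args) {
    double[][] data = {{1, 10}, {2, 20}, {3, 30}};
    double[][] cov = crossProduct(preprocess(data, true));   // covariance matrix
    double[][] corr = crossProduct(preprocess(data, false)); // correlation matrix
    System.out.println(cov[0][0] + " " + cov[0][1] + " " + cov[1][1]);
    System.out.println(corr[0][0] + " " + corr[0][1] + " " + corr[1][1]);
  }
}
```

In the example the second column has a variance 100 times that of the first, so it dominates the covariance matrix. Standardizing removes that difference in scale, which is why the default (center disabled) is often preferred when columns are measured in different units.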