Package weka.filters.unsupervised.instance

Class Summary
NonSparseToSparse An instance filter that converts all incoming instances into sparse format.
Randomize Randomly shuffles the order of instances passed through it.
RemoveFolds This filter takes a dataset and outputs a specified fold for cross-validation.
RemoveFrequentValues Determines which values (frequent or infrequent ones) of a (nominal) attribute are retained and filters the instances accordingly.
RemoveMisclassified A filter that removes instances which are incorrectly classified.
RemovePercentage A filter that removes a given percentage of a dataset.
RemoveRange A filter that removes a given range of instances of a dataset.
RemoveWithValues Filters instances according to the value of an attribute.
Resample Produces a random subsample of a dataset by sampling either with or without replacement.
ReservoirSample Produces a random subsample of a dataset using the reservoir sampling Algorithm "R" by Vitter.
SparseToNonSparse An instance filter that converts all incoming sparse instances into non-sparse format.
SubsetByExpression Filters instances according to a user-specified expression.

Grammar:

boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;

boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;

boolexpr ::= BOOLEAN
| true
| false
| expr < expr
| expr <= expr
| expr > expr
| expr >= expr
| expr = expr
| ( boolexpr )
| not boolexpr
| boolexpr and boolexpr
| boolexpr or boolexpr
| ATTRIBUTE is STRING
;

expr ::= NUMBER
| ATTRIBUTE
| ( expr )
| opexpr
| funcexpr
;

opexpr ::= expr + expr
| expr - expr
| expr * expr
| expr / expr
;

funcexpr ::= abs ( expr )
| sqrt ( expr )
| log ( expr )
| exp ( expr )
| sin ( expr )
| cos ( expr )
| tan ( expr )
| rint ( expr )
| floor ( expr )
| pow ( expr for base , expr for exponent )
| ceil ( expr )
;
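
To illustrate the funcexpr rule, the following minimal sketch maps each function name to the java.lang.Math method of the same name. That correspondence is an assumption and is not stated on this page; the class FuncExprSketch and its methods are hypothetical and are not part of Weka.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

public class FuncExprSketch {
    // Hypothetical lookup table for the one-argument functions in funcexpr.
    // Assumption: each name behaves like the java.lang.Math method of the same name.
    private static final Map<String, DoubleUnaryOperator> UNARY = new HashMap<>();
    static {
        UNARY.put("abs", x -> Math.abs(x));
        UNARY.put("sqrt", x -> Math.sqrt(x));
        UNARY.put("log", x -> Math.log(x));     // natural logarithm in java.lang.Math
        UNARY.put("exp", x -> Math.exp(x));
        UNARY.put("sin", x -> Math.sin(x));
        UNARY.put("cos", x -> Math.cos(x));
        UNARY.put("tan", x -> Math.tan(x));
        UNARY.put("rint", x -> Math.rint(x));   // nearest integer, ties go to the even one
        UNARY.put("floor", x -> Math.floor(x));
        UNARY.put("ceil", x -> Math.ceil(x));
    }

    // Applies a one-argument function by its grammar name.
    public static double apply(String name, double x) {
        DoubleUnaryOperator f = UNARY.get(name);
        if (f == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        return f.applyAsDouble(x);
    }

    // pow is the only two-argument function: base first, exponent second.
    public static double pow(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    public static void main(String[] args) {
        System.out.println(apply("sqrt", 16.0));
        System.out.println(apply("rint", 2.5));
        System.out.println(pow(2.0, 10.0));
    }
}
```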

Notes:
- NUMBER
any integer or floating point number
(but not in scientific notation!)
- STRING
any string surrounded by single quotes;
the string may not contain a single quote though.
- ATTRIBUTE
the following placeholders are recognized for
attribute values:
- CLASS for the class value, if a class attribute is set.
- ATTxyz, where xyz is a number from 1 to the number of attributes
in the dataset, for the value of the attribute at that index.
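
The token rules in these notes can be sketched as regular expressions. This is a rough illustration, not Weka's actual lexer: the class TokenSketch is hypothetical, and the exact number syntax (unsigned, digits on both sides of the decimal point) is an assumption that goes beyond what the notes state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSketch {
    // NUMBER: integer or floating point, with no exponent part, so "1e5" is rejected.
    // Assumption: unsigned, with at least one digit on each side of the decimal point.
    private static final Pattern NUMBER = Pattern.compile("\\d+(\\.\\d+)?");

    // STRING: surrounded by single quotes, with no single quote inside.
    private static final Pattern STRING = Pattern.compile("'[^']*'");

    // ATTRIBUTE: "ATT" followed by a 1-based index (at most 9 digits, so parseInt cannot overflow).
    private static final Pattern ATT = Pattern.compile("ATT(\\d{1,9})");

    public static boolean isNumber(String token) {
        return NUMBER.matcher(token).matches();
    }

    public static boolean isString(String token) {
        return STRING.matcher(token).matches();
    }

    // CLASS is accepted only when a class attribute is set;
    // ATTxyz is accepted only when 1 <= xyz <= numAttributes.
    public static boolean isAttribute(String token, int numAttributes, boolean classSet) {
        if (token.equals("CLASS")) {
            return classSet;
        }
        Matcher m = ATT.matcher(token);
        if (!m.matches()) {
            return false;
        }
        int index = Integer.parseInt(m.group(1));
        return index >= 1 && index <= numAttributes;
    }

    public static void main(String[] args) {
        System.out.println(isNumber("3.14"));
        System.out.println(isNumber("1e5"));
        System.out.println(isString("'mammal'"));
        System.out.println(isAttribute("ATT14", 18, true));
    }
}
```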

Examples:
- extracting only mammals and birds from the 'zoo' UCI dataset:
(CLASS is 'mammal') or (CLASS is 'bird')
- extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
(ATT14 >= 2)
- extracting only instances with non-missing 'wage-increase-second-year'
from the 'labor' UCI dataset:
not ismissing(ATT3)
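
To make the meaning of these three expressions concrete, the sketch below restates them as plain Java predicates over a hypothetical in-memory row. It does not use the Weka API: the Row class, the convention that null marks a missing value, and the choice that a missing value fails a numeric comparison are all assumptions made for illustration.

```java
import java.util.function.Predicate;

public class SubsetExamplesSketch {
    // Hypothetical stand-in for one instance. Attributes are addressed with
    // 1-based indices, as in ATT1..ATTn; a null entry marks a missing value.
    public static final class Row {
        private final Object[] values;
        private final int classIndex; // 1-based index of the class attribute

        public Row(int numAttributes, int classIndex) {
            this.values = new Object[numAttributes];
            this.classIndex = classIndex;
        }

        public Row set(int oneBasedIndex, Object value) {
            values[oneBasedIndex - 1] = value;
            return this;
        }

        public Object att(int oneBasedIndex) {
            return values[oneBasedIndex - 1];
        }

        public Object cls() {
            return att(classIndex);
        }
    }

    // (CLASS is 'mammal') or (CLASS is 'bird')
    public static final Predicate<Row> MAMMAL_OR_BIRD =
        r -> "mammal".equals(r.cls()) || "bird".equals(r.cls());

    // (ATT14 >= 2); assumption: a missing or non-numeric value does not match
    public static final Predicate<Row> AT_LEAST_TWO_LEGS =
        r -> r.att(14) instanceof Number && ((Number) r.att(14)).doubleValue() >= 2;

    // not ismissing(ATT3)
    public static final Predicate<Row> ATT3_PRESENT =
        r -> r.att(3) != null;

    public static void main(String[] args) {
        // Hypothetical row with 18 attribute slots and the class in slot 18.
        Row row = new Row(18, 18).set(14, 4.0).set(18, "mammal");
        System.out.println(MAMMAL_OR_BIRD.test(row));
        System.out.println(AT_LEAST_TWO_LEGS.test(row));
        System.out.println(ATT3_PRESENT.test(row));
    }
}
```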

Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.