Package weka.core
Class InstanceGrouping
- java.lang.Object
  - adams.core.logging.LoggingObject
    - weka.core.InstanceGrouping
- All Implemented Interfaces:
adams.core.logging.LoggingSupporter, adams.core.SizeOfHandler, Serializable
public class InstanceGrouping extends adams.core.logging.LoggingObject
Groups rows in a dataset using a regular expression on a nominal or string attribute.
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- protected weka.core.Instances m_Data: the original data.
- protected String m_Group: the replacement string, using the groups from the regexp.
- protected Map<String,gnu.trove.list.TIntList> m_Groups: the groups.
- protected WekaAttributeIndex m_Index: the attribute index.
- protected adams.core.base.BaseRegExp m_RegExp: the regular expression.
-
Constructor Summary
Constructors (Constructor: Description)
- InstanceGrouping(weka.core.Instances data, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group): Initializes the object.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- weka.core.Instances collapse(weka.core.Instances data): Collapses the data into a fake dataset with only the group and the class attribute.
- protected weka.core.Instances collapsedHeader(): Creates the header for the collapsed data.
- weka.core.Instances expand(weka.core.Instances data, boolean useView): Expands the fake data into the original dataset space.
- gnu.trove.list.TIntList expand(weka.core.Instances data, gnu.trove.list.TIntList subset): Expands the fake data into the original dataset space.
- protected void expandCheck(weka.core.Instances data): Ensures that the data to expand is in the right format.
- gnu.trove.list.TIntList get(String group): Returns the group.
- weka.core.Instances getData(): Returns the underlying data.
- String getGroup(): The group expression, i.e., replacement string (e.g. '$2').
- adams.core.base.BaseRegExp getRegExp(): Returns the regular expression in use (e.g. '(.*)-([0-9]+)-(.*)').
- Set<String> groups(): Returns the groups.
- protected void initialize(): Initializes the grouping.
- protected void process(): Performs the grouping.
- int size(): Returns the number of groups.
- String toString(): Returns the groups and their indices.
-
-
-
Field Detail
-
m_Data
protected weka.core.Instances m_Data
the original data.
-
m_Index
protected WekaAttributeIndex m_Index
the attribute index.
-
m_RegExp
protected adams.core.base.BaseRegExp m_RegExp
the regular expression.
-
m_Group
protected String m_Group
the replacement string, using the groups from the regexp.
-
-
Constructor Detail
-
InstanceGrouping
public InstanceGrouping(weka.core.Instances data, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group)
Initializes the object.
- Parameters:
- data - the data to group
- index - the attribute index
- regExp - the regular expression (e.g. '(.*)-([0-9]+)-(.*)')
- group - the replacement string, using the groups from the regexp (e.g. '$2')
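The interplay of the regExp and group parameters can be illustrated with plain java.util.regex. The following is a minimal sketch, assuming the group key is obtained by applying the replacement string to the attribute value; the class GroupKeySketch, the method groupKey and the sample value are hypothetical and not part of ADAMS or Weka.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ADAMS/Weka: shows how a regular expression
// plus a replacement string such as '$2' can turn an attribute value into a group key.
public class GroupKeySketch {

    /** Applies the replacement string (e.g. "$2") to the value using the regular expression. */
    public static String groupKey(String value, String regExp, String group) {
        return Pattern.compile(regExp).matcher(value).replaceAll(group);
    }

    public static void main(String[] args) {
        // the second capturing group (the digits) becomes the key
        System.out.println(groupKey("sampleA-12-rep1", "(.*)-([0-9]+)-(.*)", "$2")); // prints 12
    }
}
```

In this sketch, a value that the expression does not match is returned unchanged, so it would form a group of its own.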
-
-
Method Detail
-
initialize
protected void initialize()
Initializes the grouping.
-
process
protected void process()
Performs the grouping.
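As an illustration of what such a grouping step might look like, here is a minimal sketch that maps each group key to the row indices that produced it. It is an assumption about the general technique, not the actual implementation: it uses java.util.List<Integer> in place of gnu.trove.list.TIntList and plain strings in place of the attribute values; GroupingSketch and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the ADAMS implementation: builds a map from group key
// to the indices of the rows whose attribute value yields that key.
public class GroupingSketch {

    public static Map<String, List<Integer>> group(List<String> values, String regExp, String group) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            // derive the key by applying the replacement string to the value
            String key = values.get(i).replaceAll(regExp, group);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a-1-x", "b-2-x", "c-1-y");
        System.out.println(group(values, "(.*)-([0-9]+)-(.*)", "$2")); // prints {1=[0, 2], 2=[1]}
    }
}
```

The map in this sketch plays the role of m_Groups: its size corresponds to size(), its key set to groups(), and a lookup by key to get(String).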
-
collapsedHeader
protected weka.core.Instances collapsedHeader()
Creates the header for the collapsed data.
- Returns:
- the header
-
collapse
public weka.core.Instances collapse(weka.core.Instances data)
Collapses the data into a fake dataset with only the group and the class attribute.
- Parameters:
- data - the data to collapse
- Returns:
- the collapsed dataset
-
expandCheck
protected void expandCheck(weka.core.Instances data)
Ensures that the data to expand is in the right format.
- Parameters:
- data - the data to check
- Throws:
- IllegalArgumentException - if checks fail
-
expand
public gnu.trove.list.TIntList expand(weka.core.Instances data, gnu.trove.list.TIntList subset)
Expands the fake data into the original dataset space.
- Parameters:
- data - the data to expand
- Returns:
- the expanded dataset
-
expand
public weka.core.Instances expand(weka.core.Instances data, boolean useView)
Expands the fake data into the original dataset space.
- Parameters:
- data - the data to expand
- useView - whether to use a view
- Returns:
- the expanded dataset
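The idea of expanding from the collapsed space back to the original rows can be sketched as follows. This is a sketch under stated assumptions, not the documented behaviour: it assumes the collapsed data holds one row per group and that a subset refers to row positions in that collapsed data. ExpandSketch, collapsedKeys and the sample groups are hypothetical, and java.util.List<Integer> stands in for gnu.trove.list.TIntList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the ADAMS implementation: maps positions in a
// collapsed (one row per group) list back to the original row indices.
public class ExpandSketch {

    public static List<Integer> expand(Map<String, List<Integer>> groups,
                                       List<String> collapsedKeys,
                                       List<Integer> subset) {
        List<Integer> result = new ArrayList<>();
        for (int pos : subset) {
            String key = collapsedKeys.get(pos);
            List<Integer> rows = groups.get(key);
            if (rows == null)
                throw new IllegalArgumentException("Unknown group: " + key);
            // every original row of the group is part of the expanded result
            result.addAll(rows);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        groups.put("1", Arrays.asList(0, 2));
        groups.put("2", Arrays.asList(1));
        // one collapsed row per group, in insertion order
        List<String> collapsedKeys = new ArrayList<>(groups.keySet());
        System.out.println(expand(groups, collapsedKeys, Arrays.asList(1, 0))); // prints [1, 0, 2]
    }
}
```

Selecting collapsed rows rather than original rows keeps all members of a group together, which is the usual motivation for grouping before splitting data.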
-
getData
public weka.core.Instances getData()
Returns the underlying data.
- Returns:
- the data
-
getRegExp
public adams.core.base.BaseRegExp getRegExp()
Returns the regular expression in use (e.g. '(.*)-([0-9]+)-(.*)').
- Returns:
- the regexp
-
getGroup
public String getGroup()
Returns the group expression, i.e., the replacement string (e.g. '$2').
- Returns:
- the group
-
size
public int size()
Returns the number of groups.
- Returns:
- the number of groups
-
get
public gnu.trove.list.TIntList get(String group)
Returns the group.
- Parameters:
- group - the group to return
- Returns:
- the indices in the original dataset that form this group
-
-