Package weka.core
Class InstanceGrouping
java.lang.Object
  adams.core.logging.LoggingObject
    weka.core.InstanceGrouping

All Implemented Interfaces:
adams.core.logging.LoggingSupporter, adams.core.SizeOfHandler, Serializable
public class InstanceGrouping extends adams.core.logging.LoggingObject
Groups rows in a dataset using a regular expression on a nominal or string attribute.
Author:
FracPete (fracpete at waikato dot ac dot nz)
See Also:
Serialized Form
Field Summary
Fields
Modifier and Type, Field - Description
protected weka.core.Instances m_Data - the original data.
protected String m_Group - the replacement string, using the groups from the regexp.
protected Map<String,gnu.trove.list.TIntList> m_Groups - the groups.
protected WekaAttributeIndex m_Index - the attribute index.
protected adams.core.base.BaseRegExp m_RegExp - the regular expression.
-
Constructor Summary
Constructors
Constructor - Description
InstanceGrouping(weka.core.Instances data, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group) - Initializes the object.
-
Method Summary
Methods
Modifier and Type, Method - Description
weka.core.Instances collapse(weka.core.Instances data) - Collapses the data into a fake dataset with only the group and the class attribute.
protected weka.core.Instances collapsedHeader() - Creates the header for the collapsed data.
weka.core.Instances expand(weka.core.Instances data, boolean useView) - Expands the fake data into the original dataset space.
gnu.trove.list.TIntList expand(weka.core.Instances data, gnu.trove.list.TIntList subset) - Expands the fake data into the original dataset space.
protected void expandCheck(weka.core.Instances data) - Ensures that the data to expand is in the right format.
gnu.trove.list.TIntList get(String group) - Returns the indices of the specified group.
weka.core.Instances getData() - Returns the underlying data.
String getGroup() - Returns the group expression, i.e., the replacement string (e.g., '$2').
adams.core.base.BaseRegExp getRegExp() - Returns the regular expression in use (e.g., '(.*)-([0-9]+)-(.*)').
Set<String> groups() - Returns the groups.
protected void initialize() - Initializes the grouping.
protected void process() - Performs the grouping.
int size() - Returns the number of groups.
String toString() - Returns the groups and their indices.
-
-
-
Field Detail
-
m_Data
protected weka.core.Instances m_Data
the original data.
-
m_Index
protected WekaAttributeIndex m_Index
the attribute index.
-
m_RegExp
protected adams.core.base.BaseRegExp m_RegExp
the regular expression.
-
m_Group
protected String m_Group
the replacement string, using the groups from the regexp.
-
-
Constructor Detail
-
InstanceGrouping
public InstanceGrouping(weka.core.Instances data, WekaAttributeIndex index, adams.core.base.BaseRegExp regExp, String group)
Initializes the object.
Parameters:
data - the data to group
index - the attribute index
regExp - the regular expression (e.g., '(.*)-([0-9]+)-(.*)')
group - the replacement string, using the groups from the regexp (e.g., '$2')
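The interplay of regExp and group can be illustrated with plain java.util classes. The following is a minimal sketch, not the InstanceGrouping implementation: the class RegexGroupingSketch and its methods groupKey and group are hypothetical names, a List<String> stands in for the values of the nominal or string attribute, and List<Integer> stands in for gnu.trove.list.TIntList. It assumes the group key is obtained by applying the replacement string to each attribute value, and that values the expression does not match keep their original text as the key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only; not the InstanceGrouping implementation. */
class RegexGroupingSketch {

  /**
   * Derives a group key by applying the replacement string to the value.
   * Assumption: a value the expression does not match is returned unchanged.
   */
  static String groupKey(String value, String regExp, String group) {
    return value.replaceAll(regExp, group);
  }

  /** Maps each group key to the row indices whose value produced that key. */
  static Map<String, List<Integer>> group(List<String> values, String regExp, String group) {
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    for (int i = 0; i < values.size(); i++) {
      String key = groupKey(values.get(i), regExp, group);
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
    }
    return groups;
  }

  public static void main(String[] args) {
    // Hypothetical attribute values; '$2' selects the numeric middle part.
    List<String> ids = List.of("batchA-1-x", "batchA-2-x", "batchB-1-y");
    System.out.println(group(ids, "(.*)-([0-9]+)-(.*)", "$2"));  // prints {1=[0, 2], 2=[1]}
  }
}
```

With the example expression and '$2', rows 0 and 2 share the key "1" and therefore end up in the same group, while row 1 forms its own group "2".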
-
-
Method Detail
-
initialize
protected void initialize()
Initializes the grouping.
-
process
protected void process()
Performs the grouping.
-
collapsedHeader
protected weka.core.Instances collapsedHeader()
Creates the header for the collapsed data.
Returns:
the header
-
collapse
public weka.core.Instances collapse(weka.core.Instances data)
Collapses the data into a fake dataset with only the group and the class attribute.
Parameters:
data - the data to collapse
Returns:
the collapsed dataset
-
expandCheck
protected void expandCheck(weka.core.Instances data)
Ensures that the data to expand is in the right format.
Parameters:
data - the data to check
Throws:
IllegalArgumentException - if checks fail
-
expand
public gnu.trove.list.TIntList expand(weka.core.Instances data, gnu.trove.list.TIntList subset)
Expands the fake data into the original dataset space.
Parameters:
data - the data to expand
Returns:
the expanded dataset
-
expand
public weka.core.Instances expand(weka.core.Instances data, boolean useView)
Expands the fake data into the original dataset space.
Parameters:
data - the data to expand
useView - whether to use a view
Returns:
the expanded dataset
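One way to picture the mapping from the collapsed data back to the original dataset space is sketched below. It is an illustration under stated assumptions, not the behaviour of either expand method: the class ExpandSketch and its expand method are hypothetical, a list of group keys stands in for the rows of the collapsed dataset, and List<Integer> replaces gnu.trove.list.TIntList. The sketch assumes that expanding means substituting each group with the original row indices that form it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only; not the InstanceGrouping implementation. */
class ExpandSketch {

  /**
   * Concatenates the original-row indices of the given groups, in the order
   * in which the group keys are listed.
   * Assumption: an unknown group key is an error.
   */
  static List<Integer> expand(Map<String, List<Integer>> groups, List<String> collapsedKeys) {
    List<Integer> result = new ArrayList<>();
    for (String key : collapsedKeys) {
      List<Integer> indices = groups.get(key);
      if (indices == null)
        throw new IllegalArgumentException("Unknown group: " + key);
      result.addAll(indices);
    }
    return result;
  }

  public static void main(String[] args) {
    // Hypothetical grouping: group "1" covers rows 0 and 2, group "2" covers row 1.
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    groups.put("1", List.of(0, 2));
    groups.put("2", List.of(1));
    System.out.println(expand(groups, List.of("2", "1")));  // prints [1, 0, 2]
  }
}
```

The benefit of working on collapsed data in this way is that all rows of a group move together, for example when the collapsed rows are split into subsets and then mapped back.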
-
getData
public weka.core.Instances getData()
Returns the underlying data.
Returns:
the data
-
getRegExp
public adams.core.base.BaseRegExp getRegExp()
Returns the regular expression in use (e.g., '(.*)-([0-9]+)-(.*)').
Returns:
the regexp
-
getGroup
public String getGroup()
Returns the group expression, i.e., the replacement string (e.g., '$2').
Returns:
the group
-
size
public int size()
Returns the number of groups.
Returns:
the number of groups
-
get
public gnu.trove.list.TIntList get(String group)
Returns the indices of the specified group.
Parameters:
group - the group to return
Returns:
the indices in the original dataset that form this group
-
-