Class NormalizableDistance
- java.lang.Object
-
- moa.classifiers.lazy.neighboursearch.NormalizableDistance
-
- All Implemented Interfaces:
DistanceFunction
- Direct Known Subclasses:
EuclideanDistance
public abstract class NormalizableDistance extends Object implements DistanceFunction
Represents the abstract ancestor for normalizable distance functions, like Euclidean or Manhattan distance.
- Version:
- $Revision: 8034 $
- Author:
- Fracpete (fracpete at waikato dot ac dot nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz) -- original code from weka.core.EuclideanDistance, Ashraf M. Kibriya (amk14@cs.waikato.ac.nz) -- original code from weka.core.EuclideanDistance
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
- protected boolean[] m_ActiveIndices: The boolean flags, whether an attribute will be used or not.
- protected Instances m_Data: The instances used internally.
- protected boolean m_DontNormalize: True if normalization is turned off (default false).
- protected double[][] m_Ranges: The range of the attributes.
- protected boolean m_Validated: Whether all the necessary preparations have been done.
- static int R_MAX: Index in ranges for MAX.
- static int R_MIN: Index in ranges for MIN.
- static int R_WIDTH: Index in ranges for WIDTH.
-
Constructor Summary
Constructors (Constructor, Description)
- NormalizableDistance(): Invalidates the distance function; the instances must still be set.
- NormalizableDistance(Instances data): Initializes the distance function and automatically initializes the ranges.
-
Method Summary
Methods (Modifier and Type, Method, Description)
- String attributeIndicesTipText(): Returns the tip text for this property.
- protected double difference(int index, double val1, double val2): Computes the difference between two given attribute values.
- double distance(Instance first, Instance second): Calculates the distance between two instances.
- double distance(Instance first, Instance second, double cutOffValue): Calculates the distance between two instances.
- String dontNormalizeTipText(): Returns the tip text for this property.
- String getAttributeIndices(): Gets the range of attributes used in the calculation of the distance.
- boolean getDontNormalize(): Gets whether the attribute values are to be normalized in the distance calculation.
- Instances getInstances(): Returns the instances currently set.
- boolean getInvertSelection(): Gets whether the matching sense of attribute indices is inverted or not.
- double[][] getRanges(): Method to get the ranges.
- abstract String globalInfo(): Returns a string describing this object.
- protected void initialize(): Initializes the ranges and the attributes being used.
- protected void initializeAttributeIndices(): Initializes the attribute indices.
- double[][] initializeRanges(): Initializes the ranges using all instances of the dataset.
- double[][] initializeRanges(int[] instList): Initializes the ranges of a subset of the instances of this dataset.
- double[][] initializeRanges(int[] instList, int startIdx, int endIdx): Initializes the ranges of a subset of the instances of this dataset.
- void initializeRangesEmpty(int numAtt, double[][] ranges): Used to initialize the ranges.
- boolean inRanges(Instance instance, double[][] ranges): Tests if an instance is within the given ranges.
- protected void invalidate(): Invalidates all initializations.
- String invertSelectionTipText(): Returns the tip text for this property.
- static boolean isMissingValue(double val): Tests if the given value codes "missing".
- protected double norm(double x, int i): Normalizes a given value of a numeric attribute.
- void postProcessDistances(double[] distances): Does nothing; derived classes may override it, though.
- boolean rangesSet(): Checks if ranges are set.
- void setAttributeIndices(String value): Sets the range of attributes to use in the calculation of the distance.
- void setDontNormalize(boolean dontNormalize): Sets whether the attribute values are to be normalized in the distance calculation.
- void setInstances(Instances insts): Sets the instances.
- void setInvertSelection(boolean value): Sets whether the matching sense of attribute indices is inverted or not.
- String toString(): Returns an empty string.
- void update(Instance ins): Updates the distance function (if necessary) for the newly added instance.
- protected abstract double updateDistance(double currDist, double diff): Updates the current distance calculated so far with the new difference between two attributes.
- void updateRanges(Instance instance): Updates the ranges when a new instance arrives.
- double[][] updateRanges(Instance instance, double[][] ranges): Updates the ranges given a new instance.
- void updateRanges(Instance instance, int numAtt, double[][] ranges): Updates the minimum, maximum and width values for all the attributes based on a new instance.
- void updateRangesFirst(Instance instance, int numAtt, double[][] ranges): Used to initialize the ranges.
- protected void validate(): Performs the initializations if necessary.
-
-
-
Field Detail
-
R_MIN
public static final int R_MIN
Index in ranges for MIN.
- See Also:
- Constant Field Values
-
R_MAX
public static final int R_MAX
Index in ranges for MAX.
- See Also:
- Constant Field Values
-
R_WIDTH
public static final int R_WIDTH
Index in ranges for WIDTH.
- See Also:
- Constant Field Values
-
m_Data
protected Instances m_Data
the instances used internally.
-
m_DontNormalize
protected boolean m_DontNormalize
True if normalization is turned off (default false).
-
m_Ranges
protected double[][] m_Ranges
The range of the attributes.
-
m_ActiveIndices
protected boolean[] m_ActiveIndices
The boolean flags, whether an attribute will be used or not.
-
m_Validated
protected boolean m_Validated
Whether all the necessary preparations have been done.
-
-
Constructor Detail
-
NormalizableDistance
public NormalizableDistance()
Invalidates the distance function; the instances must still be set.
-
NormalizableDistance
public NormalizableDistance(Instances data)
Initializes the distance function and automatically initializes the ranges.
- Parameters:
data
- the instances the distance function should work on
-
-
Method Detail
-
globalInfo
public abstract String globalInfo()
Returns a string describing this object.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
dontNormalizeTipText
public String dontNormalizeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDontNormalize
public void setDontNormalize(boolean dontNormalize)
Sets whether the attribute values are to be normalized in the distance calculation.
- Parameters:
dontNormalize
- if true the values are not normalized
-
getDontNormalize
public boolean getDontNormalize()
Gets whether the attribute values are to be normalized in the distance calculation (default false, i.e. attribute values are normalized).
- Returns:
- false if values get normalized
-
attributeIndicesTipText
public String attributeIndicesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setAttributeIndices
public void setAttributeIndices(String value)
Sets the range of attributes to use in the calculation of the distance. The indices start from 1; 'first' and 'last' are valid as well. E.g.: first-3,5,6-last
- Specified by:
setAttributeIndices
in interface DistanceFunction
- Parameters:
value
- the new attribute index range
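The range syntax described for setAttributeIndices can be illustrated with a small standalone sketch. It is not MOA's parser: the class `AttributeRangeSketch`, its `parse` and `resolve` methods, and the `invert` parameter (which mimics the effect of an inverted selection) are hypothetical names introduced here, and handling of malformed or empty input is left out.

```java
/** Hypothetical helper, not part of MOA: expands a 1-based range string into per-attribute flags. */
final class AttributeRangeSketch {

    /** Resolves one token ("first", "last", or a 1-based number) to a 0-based index. */
    static int resolve(String token, int numAttributes) {
        String t = token.trim().toLowerCase();
        if (t.equals("first")) {
            return 0;
        }
        if (t.equals("last")) {
            return numAttributes - 1;
        }
        int idx = Integer.parseInt(t) - 1; // indices in the string start from 1
        if (idx < 0 || idx >= numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return idx;
    }

    /** Marks every attribute covered by the comma-separated ranges, optionally inverting the result. */
    static boolean[] parse(String ranges, int numAttributes, boolean invert) {
        boolean[] active = new boolean[numAttributes];
        for (String part : ranges.split(",")) {
            String[] ends = part.split("-");
            int lo = resolve(ends[0], numAttributes);
            int hi = ends.length > 1 ? resolve(ends[1], numAttributes) : lo;
            for (int i = Math.min(lo, hi); i <= Math.max(lo, hi); i++) {
                active[i] = true;
            }
        }
        if (invert) {
            for (int i = 0; i < active.length; i++) {
                active[i] = !active[i];
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // With 7 attributes, only the 4th attribute is left out.
        System.out.println(java.util.Arrays.toString(parse("first-3,5,6-last", 7, false)));
    }
}
```

With `invert` set to true, the same string would select only the fourth attribute.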
-
getAttributeIndices
public String getAttributeIndices()
Gets the range of attributes used in the calculation of the distance.
- Specified by:
getAttributeIndices
in interface DistanceFunction
- Returns:
- the attribute index range
-
invertSelectionTipText
public String invertSelectionTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setInvertSelection
public void setInvertSelection(boolean value)
Sets whether the matching sense of attribute indices is inverted or not.
- Specified by:
setInvertSelection
in interface DistanceFunction
- Parameters:
value
- if true the matching sense is inverted
-
getInvertSelection
public boolean getInvertSelection()
Gets whether the matching sense of attribute indices is inverted or not.
- Specified by:
getInvertSelection
in interface DistanceFunction
- Returns:
- true if the matching sense is inverted
-
invalidate
protected void invalidate()
invalidates all initializations.
-
validate
protected void validate()
performs the initializations if necessary.
-
initialize
protected void initialize()
initializes the ranges and the attributes being used.
-
initializeAttributeIndices
protected void initializeAttributeIndices()
initializes the attribute indices.
-
setInstances
public void setInstances(Instances insts)
Sets the instances.
- Specified by:
setInstances
in interface DistanceFunction
- Parameters:
insts
- the instances to use
-
getInstances
public Instances getInstances()
Returns the instances currently set.
- Specified by:
getInstances
in interface DistanceFunction
- Returns:
- the current instances
-
postProcessDistances
public void postProcessDistances(double[] distances)
Does nothing; derived classes may override it, though.
- Specified by:
postProcessDistances
in interface DistanceFunction
- Parameters:
distances
- the distances to post-process
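Why a derived class might want to override this hook can be sketched as follows, assuming a subclass that accumulates squared differences and defers the square root until all distances are known. The class `PostProcessSketch` is hypothetical; this page does not state whether any particular MOA subclass behaves this way.

```java
/** Hypothetical illustration of a post-processing step for squared distances. */
final class PostProcessSketch {

    /** Replaces each squared distance in place with its square root. */
    static void postProcessDistances(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            distances[i] = Math.sqrt(distances[i]);
        }
    }

    public static void main(String[] args) {
        double[] squared = {25.0, 4.0, 0.0};
        postProcessDistances(squared);
        System.out.println(java.util.Arrays.toString(squared)); // [5.0, 2.0, 0.0]
    }
}
```

Deferring the square root keeps the per-pair computation cheap, since comparing squared distances gives the same ordering as comparing the distances themselves.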
-
update
public void update(Instance ins)
Updates the distance function (if necessary) for the newly added instance.
- Specified by:
update
in interface DistanceFunction
- Parameters:
ins
- the instance to add
-
distance
public double distance(Instance first, Instance second)
Calculates the distance between two instances.
- Specified by:
distance
in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
- Returns:
- the distance between the two given instances
-
distance
public double distance(Instance first, Instance second, double cutOffValue)
Calculates the distance between two instances. Offers speed up (if the distance function class in use supports it) in nearest neighbour search by taking into account the cutOff or maximum distance. Depending on the distance function class, post processing of the distances by postProcessDistances(double[]) may be required if this function is used.
- Specified by:
distance
in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
cutOffValue
- If the distance being calculated becomes larger than cutOffValue then the rest of the calculation is discarded.
- Returns:
- the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
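The cut-off behaviour described for this method can be sketched in isolation. This is a minimal illustration under stated assumptions, not MOA's implementation: plain `double[]` rows stand in for `Instance` objects, squared differences are accumulated, and the cut-off is compared against the running squared total. The class name `CutOffDistanceSketch` is hypothetical.

```java
/** Hypothetical illustration; double[] rows stand in for MOA Instance objects. */
final class CutOffDistanceSketch {

    /**
     * Accumulates squared differences and gives up as soon as the running
     * total exceeds cutOffValue, returning Double.POSITIVE_INFINITY.
     */
    static double distance(double[] first, double[] second, double cutOffValue) {
        double dist = 0.0;
        for (int i = 0; i < first.length; i++) {
            double diff = first[i] - second[i];
            dist += diff * diff;
            if (dist > cutOffValue) {
                return Double.POSITIVE_INFINITY; // rest of the calculation is discarded
            }
        }
        return dist; // still squared; a caller may need to post-process it
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        System.out.println(distance(a, b, 100.0)); // 25.0
        System.out.println(distance(a, b, 10.0));  // Infinity
    }
}
```

In a nearest neighbour search the cut-off would typically be the distance to the best candidate found so far, so that clearly worse candidates are abandoned early.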
-
updateDistance
protected abstract double updateDistance(double currDist, double diff)
Updates the current distance calculated so far with the new difference between two attributes. The difference between the attributes was calculated with the difference(int,double,double) method.
- Parameters:
currDist
- the current distance calculated so far
diff
- the difference between two new attributes
- Returns:
- the updated distance
- See Also:
difference(int, double, double)
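How concrete subclasses might fill in this abstract method can be sketched as below. All class names are hypothetical, and the two accumulation rules (sum of squares for a Euclidean-style distance, sum of absolute values for a Manhattan-style distance) are assumptions based on the usual definitions of those distances rather than on anything stated on this page.

```java
/** Hypothetical base class mirroring the updateDistance hook. */
abstract class DistanceAccumulatorSketch {

    /** Folds one per-attribute difference into the running distance. */
    protected abstract double updateDistance(double currDist, double diff);

    /** Applies updateDistance to every difference in turn, starting from zero. */
    double accumulate(double[] diffs) {
        double dist = 0.0;
        for (double diff : diffs) {
            dist = updateDistance(dist, diff);
        }
        return dist;
    }

    public static void main(String[] args) {
        double[] diffs = {3.0, -4.0};
        System.out.println(new SquaredDistanceSketch().accumulate(diffs));  // 25.0
        System.out.println(new AbsoluteDistanceSketch().accumulate(diffs)); // 7.0
    }
}

/** Euclidean-style accumulation: sum of squared differences (square root not yet taken). */
final class SquaredDistanceSketch extends DistanceAccumulatorSketch {
    @Override
    protected double updateDistance(double currDist, double diff) {
        return currDist + diff * diff;
    }
}

/** Manhattan-style accumulation: sum of absolute differences. */
final class AbsoluteDistanceSketch extends DistanceAccumulatorSketch {
    @Override
    protected double updateDistance(double currDist, double diff) {
        return currDist + Math.abs(diff);
    }
}
```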
-
norm
protected double norm(double x, int i)
Normalizes a given value of a numeric attribute.
- Parameters:
x
- the value to be normalized
i
- the attribute's index
- Returns:
- the normalized value
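A common way to normalize a numeric value against stored ranges is min-max scaling, sketched below. This is an assumption about what normalization means here, not a statement of MOA's code: the numeric values chosen for `R_MIN`, `R_MAX` and `R_WIDTH`, the extra `ranges` parameter, and the choice to return 0 for an attribute with no spread are all illustrative.

```java
/** Hypothetical min-max normalization against a ranges table. */
final class NormSketch {

    // Assumed index values; this page only says these constants index into the ranges.
    static final int R_MIN = 0;
    static final int R_MAX = 1;
    static final int R_WIDTH = 2;

    /** Scales x into [0, 1] using the range of attribute i; returns 0 if the range is unusable. */
    static double norm(double x, int i, double[][] ranges) {
        double[] r = ranges[i];
        if (Double.isNaN(r[R_MIN]) || r[R_MAX] == r[R_MIN]) {
            return 0.0; // no spread observed, so nothing to scale by
        }
        return (x - r[R_MIN]) / r[R_WIDTH];
    }

    public static void main(String[] args) {
        double[][] ranges = {{10.0, 20.0, 10.0}};
        System.out.println(norm(15.0, 0, ranges)); // 0.5
    }
}
```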
-
difference
protected double difference(int index, double val1, double val2)
Computes the difference between two given attribute values.
- Parameters:
index
- the attribute index
val1
- the first value
val2
- the second value
- Returns:
- the difference
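One plausible way to compute a per-attribute difference is sketched below. It is an illustration under several assumptions that this page does not confirm: nominal values differ by 0 or 1, missing values are encoded as NaN, numeric differences are taken between min-max normalized values unless normalization is switched off, and missing numeric values are not handled at all. The class and method names are hypothetical, and the range is passed in explicitly instead of being looked up by attribute index.

```java
/** Hypothetical per-attribute difference for nominal and numeric values. */
final class DifferenceSketch {

    /** Assumes missing values are encoded as NaN. */
    static boolean isMissing(double v) {
        return Double.isNaN(v);
    }

    /** Min-max scaling; returns 0 when the attribute has no spread. */
    static double norm(double x, double min, double width) {
        return width == 0.0 ? 0.0 : (x - min) / width;
    }

    /** Nominal values either match (0) or do not (1); a missing value counts as a mismatch. */
    static double nominalDifference(double val1, double val2) {
        if (isMissing(val1) || isMissing(val2) || (int) val1 != (int) val2) {
            return 1.0;
        }
        return 0.0;
    }

    /** Signed numeric difference, on the normalized scale unless dontNormalize is set. */
    static double numericDifference(double val1, double val2,
                                    double min, double width, boolean dontNormalize) {
        if (dontNormalize) {
            return val1 - val2;
        }
        return norm(val1, min, width) - norm(val2, min, width);
    }

    public static void main(String[] args) {
        System.out.println(nominalDifference(1.0, 2.0));                      // 1.0
        System.out.println(numericDifference(15.0, 10.0, 10.0, 10.0, false)); // 0.5
        System.out.println(numericDifference(15.0, 10.0, 10.0, 10.0, true));  // 5.0
    }
}
```

The returned value is signed, so a consumer would be expected to square it or take its absolute value when accumulating a distance.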
-
initializeRanges
public double[][] initializeRanges()
Initializes the ranges using all instances of the dataset. Sets m_Ranges.
- Returns:
- the ranges
-
updateRangesFirst
public void updateRangesFirst(Instance instance, int numAtt, double[][] ranges)
Used to initialize the ranges. To save time, the values of the first instance are used: low and high are set to the values of the first instance and width is set to zero.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model
ranges
- low, high and width values for all attributes
-
updateRanges
public void updateRanges(Instance instance, int numAtt, double[][] ranges)
Updates the minimum, maximum and width values for all the attributes based on a new instance.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model
ranges
- low, high and width values for all attributes
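The relationship between updateRangesFirst and this method can be sketched with plain arrays. This is a minimal illustration, not MOA's code: `double[]` rows stand in for `Instance` objects, the index values of `R_MIN`, `R_MAX` and `R_WIDTH` are assumed, and the sketch assumes there are no missing values.

```java
/** Hypothetical illustration of seeding and then extending per-attribute ranges. */
final class RangeUpdateSketch {

    // Assumed index values for the three columns of the ranges table.
    static final int R_MIN = 0;
    static final int R_MAX = 1;
    static final int R_WIDTH = 2;

    /** Seeds the ranges from the first instance: low = high = value, width = 0. */
    static void updateRangesFirst(double[] values, int numAtt, double[][] ranges) {
        for (int j = 0; j < numAtt; j++) {
            ranges[j][R_MIN] = values[j];
            ranges[j][R_MAX] = values[j];
            ranges[j][R_WIDTH] = 0.0;
        }
    }

    /** Widens the ranges wherever the new instance falls outside them. */
    static void updateRanges(double[] values, int numAtt, double[][] ranges) {
        for (int j = 0; j < numAtt; j++) {
            double v = values[j];
            if (v < ranges[j][R_MIN]) {
                ranges[j][R_MIN] = v;
                ranges[j][R_WIDTH] = ranges[j][R_MAX] - ranges[j][R_MIN];
            } else if (v > ranges[j][R_MAX]) {
                ranges[j][R_MAX] = v;
                ranges[j][R_WIDTH] = ranges[j][R_MAX] - ranges[j][R_MIN];
            }
        }
    }

    public static void main(String[] args) {
        double[][] ranges = new double[2][3];
        updateRangesFirst(new double[]{1.0, 10.0}, 2, ranges);
        updateRanges(new double[]{3.0, 4.0}, 2, ranges);
        System.out.println(java.util.Arrays.deepToString(ranges)); // [[1.0, 3.0, 2.0], [4.0, 10.0, 6.0]]
    }
}
```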
-
initializeRangesEmpty
public void initializeRangesEmpty(int numAtt, double[][] ranges)
Used to initialize the ranges.
- Parameters:
numAtt
- number of attributes in the model
ranges
- low, high and width values for all attributes
-
updateRanges
public double[][] updateRanges(Instance instance, double[][] ranges)
Updates the ranges given a new instance.
- Parameters:
instance
- the new instance
ranges
- low, high and width values for all attributes
- Returns:
- the updated ranges
-
initializeRanges
public double[][] initializeRanges(int[] instList) throws Exception
Initializes the ranges of a subset of the instances of this dataset. Therefore m_Ranges is not set.
- Parameters:
instList
- list of indexes of the subset
- Returns:
- the ranges
- Throws:
Exception
- if something goes wrong
-
initializeRanges
public double[][] initializeRanges(int[] instList, int startIdx, int endIdx) throws Exception
Initializes the ranges of a subset of the instances of this dataset. Therefore m_Ranges is not set. The caller of this method should ensure that the supplied start and end indices are valid (start <= end, end < instList.length, etc.) and correct.
- Parameters:
instList
- list of indexes of the instances
startIdx
- start index of the subset of instances in the indices array
endIdx
- end index of the subset of instances in the indices array
- Returns:
- the ranges
- Throws:
Exception
- if something goes wrong
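The indexing scheme used by the subset variants can be sketched as follows. This is an illustration only: a `double[][]` table stands in for the dataset, the class name `SubsetRangesSketch` is hypothetical, and treating `endIdx` as inclusive is an assumption suggested by the stated constraint `end < instList.length`. Like the documented method, the sketch returns a fresh ranges table and leaves no shared state behind, and it trusts the caller to supply valid indices.

```java
/** Hypothetical illustration of computing ranges over a slice of an index list. */
final class SubsetRangesSketch {

    /**
     * Computes {min, max, width} per attribute over the rows
     * data[instList[startIdx]] .. data[instList[endIdx]] (endIdx assumed inclusive).
     */
    static double[][] initializeRanges(double[][] data, int[] instList,
                                       int startIdx, int endIdx) {
        int numAtt = data[0].length;
        double[][] ranges = new double[numAtt][3];

        // Seed from the first listed row: min = max = value, width = 0.
        double[] firstRow = data[instList[startIdx]];
        for (int j = 0; j < numAtt; j++) {
            ranges[j][0] = firstRow[j];
            ranges[j][1] = firstRow[j];
            ranges[j][2] = 0.0;
        }

        // Extend with the remaining listed rows.
        for (int i = startIdx + 1; i <= endIdx; i++) {
            double[] row = data[instList[i]];
            for (int j = 0; j < numAtt; j++) {
                ranges[j][0] = Math.min(ranges[j][0], row[j]);
                ranges[j][1] = Math.max(ranges[j][1], row[j]);
                ranges[j][2] = ranges[j][1] - ranges[j][0];
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, 5.0}, {4.0, 2.0}, {9.0, 9.0}, {0.0, 0.0}};
        int[] instList = {0, 1, 2, 3};
        // Only the first two listed rows contribute.
        System.out.println(java.util.Arrays.deepToString(initializeRanges(data, instList, 0, 1)));
    }
}
```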
-
updateRanges
public void updateRanges(Instance instance)
Updates the ranges when a new instance arrives.
- Parameters:
instance
- the new instance
-
inRanges
public boolean inRanges(Instance instance, double[][] ranges)
Tests if an instance is within the given ranges.
- Parameters:
instance
- the instance
ranges
- the ranges the instance is tested to be in
- Returns:
- true if instance is within the ranges
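A range membership test of this kind can be sketched as below. The details are assumptions rather than documented behaviour: a `double[]` row stands in for the `Instance`, columns 0 and 1 of each ranges row are taken to hold the minimum and maximum, the bounds are treated as inclusive, and attributes whose value is NaN (assumed to mean "missing") are skipped.

```java
/** Hypothetical illustration of testing a row against per-attribute ranges. */
final class InRangesSketch {

    /** Returns false as soon as one non-missing value lies outside [min, max]. */
    static boolean inRanges(double[] values, double[][] ranges) {
        for (int j = 0; j < values.length; j++) {
            double v = values[j];
            if (Double.isNaN(v)) {
                continue; // assumed: missing values cannot violate a range
            }
            if (v < ranges[j][0] || v > ranges[j][1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] ranges = {{0.0, 10.0, 10.0}, {5.0, 6.0, 1.0}};
        System.out.println(inRanges(new double[]{3.0, 5.5}, ranges)); // true
        System.out.println(inRanges(new double[]{3.0, 7.0}, ranges)); // false
    }
}
```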
-
rangesSet
public boolean rangesSet()
Checks if ranges are set.
- Returns:
- true if ranges are set
-
getRanges
public double[][] getRanges() throws Exception
Method to get the ranges.
- Returns:
- the ranges
- Throws:
Exception
- if no ranges are set yet
-
toString
public String toString()
Returns an empty string.
-
isMissingValue
public static boolean isMissingValue(double val)
Tests if the given value codes "missing".
- Parameters:
val
- the value to be tested
- Returns:
- true if val codes "missing"
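A value-coded "missing" marker is often implemented with NaN, and the sketch below assumes that encoding; this page does not say which value MOA actually uses. The class `MissingValueSketch` and the constant `MISSING` are hypothetical.

```java
/** Hypothetical illustration of a NaN-coded missing value test. */
final class MissingValueSketch {

    /** Assumed marker for "missing"; not taken from the MOA documentation. */
    static final double MISSING = Double.NaN;

    /** NaN never compares equal to anything, itself included, so Double.isNaN is required. */
    static boolean isMissingValue(double val) {
        return Double.isNaN(val);
    }

    public static void main(String[] args) {
        System.out.println(isMissingValue(MISSING)); // true
        System.out.println(isMissingValue(0.0));     // false
        System.out.println(MISSING == MISSING);      // false, which is why == cannot be used
    }
}
```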
-
-