Class EuclideanDistance

  • All Implemented Interfaces:
    Cloneable, DistanceFunction

    public class EuclideanDistance
    extends NormalizableDistance
    implements Cloneable
    Implements the Euclidean distance (or similarity) function.

    A single object defines not one distance but the data model in which the distances between objects of that data model can be computed.

    Attention: for efficiency reasons, few consistency checks are performed (for example, whether the data models of the two instances are exactly the same).

    For more information, see:

    Wikipedia. Euclidean distance. URL http://en.wikipedia.org/wiki/Euclidean_distance.

    BibTeX:

     @misc{missing_id,
        author = {Wikipedia},
        title = {Euclidean distance},
        URL = {http://en.wikipedia.org/wiki/Euclidean_distance}
     }
     

    Valid options are:

     -D
      Turns off the normalization of attribute 
      values in distance calculation.
     -R <col1,col2-col4,...>
      Specifies the list of columns to be used in the calculation of the 
      distance. 'first' and 'last' are valid indices.
      (default: first-last)
     -V
      Inverts the matching sense of the column indices.
    Version:
    $Revision: 8034 $
    Author:
    Gabi Schmidberger (gabi@cs.waikato.ac.nz), Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
    • Constructor Detail

      • EuclideanDistance

        public EuclideanDistance()
        Constructs a Euclidean distance object. The Instances must still be set before distances can be computed.
      • EuclideanDistance

        public EuclideanDistance​(Instances data)
        Constructs a Euclidean distance object and automatically initializes the ranges.
        Parameters:
        data - the instances the distance function should work on
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this object.
        Specified by:
        globalInfo in class NormalizableDistance
        Returns:
        a description of the evaluator suitable for displaying in the Explorer/Experimenter GUI
      • distance

        public double distance​(Instance first,
                               Instance second)
        Calculates the distance between two instances.
        Specified by:
        distance in interface DistanceFunction
        Overrides:
        distance in class NormalizableDistance
        Parameters:
        first - the first instance
        second - the second instance
        Returns:
        the distance between the two given instances
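
        As an illustration of what a normalized Euclidean distance computes, here is a minimal plain-Java sketch that does not use the Weka classes. The class name NormalizedEuclideanSketch, the helper norm, and the min/max arrays are hypothetical; min-max scaling to [0, 1] and the treatment of constant attributes are assumptions made for the example, not a statement of how this class is implemented.

```java
/** Hypothetical helper, not part of Weka: min-max normalized Euclidean distance. */
final class NormalizedEuclideanSketch {

    /** Scales a value into [0, 1] using the attribute's observed range (assumed scheme). */
    static double norm(double x, double min, double max) {
        // A zero-width range carries no information, so it contributes 0.
        return (max == min) ? 0.0 : (x - min) / (max - min);
    }

    /** Euclidean distance between two points after per-attribute normalization. */
    static double distance(double[] a, double[] b, double[] min, double[] max) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = norm(a[i], min[i], max[i]) - norm(b[i], min[i], max[i]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] min = {0.0, 0.0};
        double[] max = {10.0, 100.0};
        // Both attributes span their full range, so each contributes 1 to the sum.
        System.out.println(distance(new double[] {0.0, 0.0}, new double[] {10.0, 100.0}, min, max));
    }
}
```

        Without normalization (compare the -D option), the second attribute in this example would dominate the result simply because its range is ten times larger.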
      • updateDistance

        protected double updateDistance​(double currDist,
                                        double diff)
        Updates the current distance calculated so far with the new difference between two attributes. The difference between the attributes was calculated with the difference(int,double,double) method.
        Specified by:
        updateDistance in class NormalizableDistance
        Parameters:
        currDist - the current distance calculated so far
        diff - the difference between two new attributes
        Returns:
        the updated distance
        See Also:
        NormalizableDistance.difference(int, double, double)
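
        The accumulation pattern described for updateDistance can be sketched as follows. This is a standalone illustration, not the Weka source: the class SquaredAccumulatorSketch and the method squaredDistance are hypothetical, and the rule currDist + diff * diff is an assumption consistent with the squared-distance remark under postProcessDistances.

```java
/** Hypothetical helper, not part of Weka: running sum of squared differences. */
final class SquaredAccumulatorSketch {

    /** Adds the square of one attribute difference to the running total (assumed rule). */
    static double updateDistance(double currDist, double diff) {
        return currDist + diff * diff;
    }

    /** Folds updateDistance over all attributes; the result is the squared distance. */
    static double squaredDistance(double[] a, double[] b) {
        double dist = 0.0;
        for (int i = 0; i < a.length; i++) {
            dist = updateDistance(dist, a[i] - b[i]);
        }
        return dist;
    }

    public static void main(String[] args) {
        System.out.println(squaredDistance(new double[] {0.0, 0.0}, new double[] {3.0, 4.0}));
    }
}
```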
      • postProcessDistances

        public void postProcessDistances​(double[] distances)
        Post-processes the distances (if necessary) returned by distance(Instance first, Instance second, double cutOffValue). This step is required to obtain the correct distances when distance(Instance first, Instance second, double cutOffValue) is used, because that function actually returns the squared distance to avoid inaccuracies arising from floating point comparison.
        Specified by:
        postProcessDistances in interface DistanceFunction
        Overrides:
        postProcessDistances in class NormalizableDistance
        Parameters:
        distances - the distances to post-process
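
        Since the values to be post-processed are squared distances, the natural correction is an in-place square root. The sketch below assumes exactly that; the class name PostProcessSketch is hypothetical and the code is an illustration rather than the library's implementation.

```java
/** Hypothetical helper, not part of Weka: converts squared distances in place. */
final class PostProcessSketch {

    /** Replaces each squared distance with its square root (assumed behavior). */
    static void postProcessDistances(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            distances[i] = Math.sqrt(distances[i]);
        }
    }

    public static void main(String[] args) {
        double[] squared = {25.0, 4.0, 0.0};
        postProcessDistances(squared);
        System.out.println(java.util.Arrays.toString(squared));
    }
}
```

        Because the square root is monotonic, comparing squared distances yields the same ordering as comparing true distances, so the root only needs to be taken once, for the values that are actually reported.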
      • sqDifference

        public double sqDifference​(int index,
                                   double val1,
                                   double val2)
        Returns the squared difference of two values of an attribute.
        Parameters:
        index - the attribute index
        val1 - the first value
        val2 - the second value
        Returns:
        the squared difference
      • getMiddle

        public double getMiddle​(double[] ranges)
        Returns the value in the middle of the two parameter values.
        Parameters:
        ranges - the ranges of this dimension
        Returns:
        the middle value
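
        A minimal sketch of a midpoint computation over a ranges array follows. The array layout (minimum, maximum, width at indices 0, 1, 2) and the constant names are assumptions made for this example, as is the class name RangeMiddleSketch; check the actual layout used by NormalizableDistance before relying on it.

```java
/** Hypothetical helper, not part of Weka: midpoint of one dimension's range. */
final class RangeMiddleSketch {

    // Assumed layout of the ranges array for a single dimension.
    static final int R_MIN = 0;
    static final int R_MAX = 1;
    static final int R_WIDTH = 2;

    /** Returns the minimum plus half the width, i.e. the middle of the range. */
    static double getMiddle(double[] ranges) {
        return ranges[R_MIN] + ranges[R_WIDTH] * 0.5;
    }

    public static void main(String[] args) {
        // Range from 2 to 10, so the width is 8.
        System.out.println(getMiddle(new double[] {2.0, 10.0, 8.0}));
    }
}
```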
      • closestPoint

        public int closestPoint​(Instance instance,
                                Instances allPoints,
                                int[] pointList)
                         throws Exception
        Returns the index of the point closest to the current instance. The returned index refers to the Instances object passed as the second parameter.
        Parameters:
        instance - the instance to assign a cluster to
        allPoints - all points
        pointList - the list of points
        Returns:
        the index of the closest point
        Throws:
        Exception - if something goes wrong
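
        The contract above (search only the candidates named in pointList, return an index into allPoints) can be illustrated with a linear scan over plain arrays. This is a sketch under stated assumptions: ClosestPointSketch is a hypothetical class, points are modeled as double[] rows instead of Instance objects, and squared distances are compared without normalization.

```java
/** Hypothetical helper, not part of Weka: nearest candidate by linear scan. */
final class ClosestPointSketch {

    /**
     * Returns the index (into allPoints) of the candidate in pointList that is
     * closest to query. Squared distances suffice because only the ordering matters.
     */
    static int closestPoint(double[] query, double[][] allPoints, int[] pointList) {
        if (pointList.length == 0) {
            throw new IllegalArgumentException("pointList must not be empty");
        }
        double best = Double.POSITIVE_INFINITY;
        int bestIndex = pointList[0];
        for (int idx : pointList) {
            double sq = 0.0;
            for (int j = 0; j < query.length; j++) {
                double d = query[j] - allPoints[idx][j];
                sq += d * d;
            }
            if (sq < best) {
                best = sq;
                bestIndex = idx;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 0.0}, {5.0, 5.0}, {1.0, 1.0}, {9.0, 9.0}};
        // Index 0 is the exact match but is not among the candidates.
        System.out.println(closestPoint(new double[] {0.0, 0.0}, points, new int[] {1, 2, 3}));
    }
}
```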
      • valueIsSmallerEqual

        public boolean valueIsSmallerEqual​(Instance instance,
                                           int dim,
                                           double value)
        Returns true if the value of the given dimension is smaller than or equal to the value to be compared with.
        Parameters:
        instance - the instance from which the value should be taken
        dim - the dimension of the value
        value - the value to compare with
        Returns:
        true if the value of the instance is smaller than or equal to the given value