Class KMeansInpiredMethod

  • All Implemented Interfaces:
    Serializable

    public class KMeansInpiredMethod
    extends KDTreeNodeSplitter
    The class that splits a node into two such that the overall sum of squared distances of points to their centres on both sides of the (axis-parallel) splitting plane is minimum.

    For more information see also:

    Ashraf Masood Kibriya (2007). Fast Algorithms for Nearest Neighbour Search. Hamilton, New Zealand.

    BibTeX:

     @mastersthesis{Kibriya2007,
        address = {Hamilton, New Zealand},
        author = {Ashraf Masood Kibriya},
        school = {Department of Computer Science, School of Computing and Mathematical Sciences, University of Waikato},
        title = {Fast Algorithms for Nearest Neighbour Search},
        year = {2007}
     }
     

    Version:
    $Revision: 8034 $
    Author:
    Ashraf M. Kibriya (amk14[at-the-rate]cs[dot]waikato[dot]ac[dot]nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • KMeansInpiredMethod

        public KMeansInpiredMethod()
    • Method Detail

      • globalInfo

        public String globalInfo()
        Returns a string describing this nearest neighbour search algorithm.
        Returns:
        a description of the algorithm for displaying in the explorer/experimenter gui
      • splitNode

        public void splitNode​(KDTreeNode node,
                              int numNodesCreated,
                              double[][] nodeRanges,
                              double[][] universe)
                       throws Exception
        Splits a node into two such that the overall sum of squared distances of points to their centres on both sides of the (axis-parallel) splitting plane is minimum. The two nodes created after the whole splitting are correctly initialised. And, node.left and node.right are set appropriately.
        Specified by:
        splitNode in class KDTreeNodeSplitter
        Parameters:
        node - The node to split.
        numNodesCreated - The number of nodes that so far have been created for the tree, so that the newly created nodes are assigned correct/meaningful node numbers/ids.
        nodeRanges - The attributes' range for the points inside the node that is to be split.
        universe - The attributes' range for the whole point-space.
        Throws:
        Exception - If there is some problem in splitting the given node.
      • partition

        protected static int partition​(Instances insts,
                                       int[] index,
                                       int attidx,
                                       int l,
                                       int r)
        Partitions the instances around a pivot. Used by quicksort and kthSmallestValue.
        Parameters:
        insts - The instances on which the tree is (or is to be) built.
        index - The master index array containing indices of the instances.
        attidx - The attribution/dimension based on which the instances should be partitioned.
        l - The begining index of the portion of master index array that should be partitioned.
        r - The end index of the portion of master index array that should be partitioned.
        Returns:
        the index of the middle element
      • quickSort

        protected static void quickSort​(Instances insts,
                                        int[] indices,
                                        int attidx,
                                        int left,
                                        int right)
        Sorts the instances according to the given attribute/dimension. The sorting is done on the master index array and not on the actual instances object.
        Parameters:
        insts - The instances on which the tree is (or is to be) built.
        indices - The master index array containing indices of the instances.
        attidx - The dimension/attribute based on which the instances should be sorted.
        left - The begining index of the portion of the master index array that needs to be sorted.
        right - The end index of the portion of the master index array that needs to be sorted.
      • rearrangePoints

        protected int rearrangePoints​(int[] indices,
                                      int startidx,
                                      int endidx,
                                      int splitDim,
                                      double splitVal)
        Re-arranges the indices array so that in the portion of the array belonging to the node to be split, the points <= to the splitVal are on the left of the portion and those > the splitVal are on the right.
        Parameters:
        indices - The master index array.
        startidx - The begining index of portion of indices that needs re-arranging.
        endidx - The end index of portion of indices that needs re-arranging.
        splitDim - The split dimension/attribute.
        splitVal - The split value.
        Returns:
        The startIdx of the points > the splitVal (the points belonging to the right child of the node).