
Spatial Topics in Database

Assignment 1
Vibhor (MT13019)

Q: Insertion and deletion in Rdnn tree.

An Rdnn tree is a special type of tree that stores nearest-neighbor (NN) distances. A non-leaf node holds entries of the form (ptr, rect, max_dnn), where ptr is a pointer to a child node, rect is the MBR of that child's subtree, and max_dnn is the maximum NN distance over all points in the subtree. A leaf node holds entries of the form (ptid, dnn), where ptid refers to a d-dimensional point in the data set and dnn is the NN distance of that point.

Insertion: to insert a point p we need both the NN of p and the RNN (reverse nearest neighbors) of p. The NN gives dnn(p), which is stored in the new entry for p. The RNN is needed because every point in RNN(p) now has p as its nearest neighbor, so its dnn has to be updated. Adding p may therefore also change max_dnn in the subtrees that contain those points, so max_dnn of the ancestor nodes of the RNN points has to be updated as well.

At a non-leaf node we compute the distance between p and the MBR of each child, D(p, Rect). If this distance is less than the max_dnn of the subtree, we repeat the procedure on the child node; otherwise we prune the subtree. If a subtree changes at some level, we update the max_dnn of its parent node, and so on for the nodes above it.

At a leaf node, for each entry (ptid, dnn): if D(p, ptid) < D(p, c), where c is the closest point found so far, then ptid is closer to p than any point seen before and becomes the current NN candidate for p. If D(p, ptid) < dnn, then p is closer to ptid than its current nearest neighbor, so ptid belongs to RNN(p) and its dnn is updated. When the traversal ends we have dnn(p), and the entry (p, dnn(p)) is added to the tree with the ordinary R-tree insertion algorithm.
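The pruning rule used during insertion can be sketched in Python as follows. This is only a minimal illustration, not the assignment's reference implementation: the class names `LeafEntry` and `Node`, the helper `dist_to_rect`, and the function `rnn_of_new_point` are hypothetical names chosen here, the point itself stands in for ptid, and the example tree is built by hand rather than by an R-tree insertion routine.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, ...]

@dataclass
class LeafEntry:
    point: Point   # the data point (stands in for ptid)
    dnn: float     # distance from this point to its nearest neighbor

@dataclass
class Node:
    lo: Point      # lower corner of the MBR (Rect)
    hi: Point      # upper corner of the MBR (Rect)
    max_dnn: float # maximum dnn over all points in this subtree
    children: List[Union["Node", LeafEntry]] = field(default_factory=list)

def dist_to_rect(p: Point, lo: Point, hi: Point) -> float:
    """D(p, Rect): distance from p to the rectangle, 0 if p is inside it."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(p, lo, hi)))

def rnn_of_new_point(node: Node, p: Point) -> List[LeafEntry]:
    """Collect the leaf entries that would get p as their new NN."""
    found: List[LeafEntry] = []
    for child in node.children:
        if isinstance(child, LeafEntry):
            # Leaf test: p is closer than the entry's current NN.
            if math.dist(p, child.point) < child.dnn:
                found.append(child)
        elif dist_to_rect(p, child.lo, child.hi) < child.max_dnn:
            # Subtree may contain an RNN of p, so descend.
            found.extend(rnn_of_new_point(child, p))
        # Otherwise the whole subtree is pruned.
    return found

# Hand-built example: two leaf nodes under one root.
left = Node((0.0, 0.0), (1.0, 0.0), 1.0,
            [LeafEntry((0.0, 0.0), 1.0), LeafEntry((1.0, 0.0), 1.0)])
right = Node((10.0, 0.0), (10.0, 3.0), 3.0,
             [LeafEntry((10.0, 0.0), 3.0), LeafEntry((10.0, 3.0), 3.0)])
root = Node((0.0, 0.0), (10.0, 3.0), 3.0, [left, right])

hits = rnn_of_new_point(root, (8.0, 0.0))
```

In this example the left subtree is pruned because its distance to the new point (7) is not below its max_dnn (1), while the right subtree is explored and yields the point (10, 0), whose dnn would drop from 3 to 2.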

Deletion: to delete a point p we first find RNN(p). Every point in RNN(p) has p as its nearest neighbor, and p is being removed, so we have to run an NN query for each of those points and update its dnn. Running a separate NN query for every RNN point is costly. Moreover, the RNN points are close to one another, since they are all reverse nearest neighbors of the same point p. We can therefore run a single batch NN query that finds the NN of all RNN points together. After this query we update the dnn value of each RNN point in the tree.
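The effect of a deletion on the stored dnn values can be illustrated with a small Python sketch. It works on a flat dictionary instead of a tree and uses a brute-force scan where the text uses a batch NN query, so it shows only which entries change, not how the tree finds them efficiently. The names `nn_dist` and `delete_point` are hypothetical.

```python
import math

def nn_dist(pt, others):
    """Distance from pt to its nearest neighbor among others (inf if none)."""
    return min((math.dist(pt, o) for o in others if o != pt), default=math.inf)

def delete_point(dnn, p):
    """Remove p from dnn (a dict point -> NN distance) and refresh RNN(p).

    Returns the list of points whose nearest neighbor was p.
    """
    # RNN(p): points whose stored NN distance is their distance to p.
    affected = [x for x in dnn if x != p and math.dist(x, p) <= dnn[x]]
    del dnn[p]
    remaining = list(dnn)
    # Brute-force stand-in for the batch NN query over the RNN points.
    for x in affected:
        dnn[x] = nn_dist(x, remaining)
    return affected

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
dnn = {pt: nn_dist(pt, points) for pt in points}
changed = delete_point(dnn, (1.0, 0.0))
```

Here both remaining points had (1, 0) as their nearest neighbor, so both dnn values are recomputed after the deletion.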

Q: How to use the MINDIST, MINMAXDIST and MAXDIST heuristics in deciding whether to prune the result or further explore the nodes.
Let q be the query point and let d be the NN distance associated with the MBR.
MINDIST is the minimum distance from q to the closest face of the MBR (zero if q lies inside the MBR).
MINMAXDIST is the minimum, over all faces of the MBR, of the maximum distance from q to that face.
MAXDIST is the maximum distance from q to any face of the MBR.

1. d > MAXDIST -- Explore. MAXDIST from q to the MBR is smaller than d, which means q is closer to the MBR than the NN at distance d, so the MBR should be explored for RNN points.
2. d > MINMAXDIST -- Explore. The NN distance d of the MBR is greater than MINMAXDIST, so an RNN may exist and the MBR should also be explored. This case gives an RNN only when some point lies on the closest face of the MBR; otherwise there is no RNN.
3. d < MINDIST -- Prune. The current NN distance d of the MBR is smaller than MINDIST from q to the MBR. Every point in the MBR is at least MINDIST away from q, so no point in the MBR can have q as its nearest neighbor, and the MBR should be pruned.
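The three distances and the decision rules above can be sketched in Python for an axis-aligned MBR given by its lower and upper corners. This is a hedged illustration: the function names and the `decide` helper are hypothetical, `minmaxdist` follows the usual per-dimension formulation (nearest face in one dimension, farthest corner in the others), and the "undecided" outcome for MINDIST <= d <= MINMAXDIST is an assumption added here because the rules above do not cover that range.

```python
import math

def mindist(q, lo, hi):
    """Distance from q to the nearest point of the MBR (0 if q is inside)."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))

def maxdist(q, lo, hi):
    """Distance from q to the farthest corner of the MBR."""
    return math.sqrt(sum(max(abs(x - l), abs(x - h)) ** 2
                         for x, l, h in zip(q, lo, hi)))

def minmaxdist(q, lo, hi):
    """Minimum over faces of the maximum distance from q to that face."""
    # Per dimension: coordinate of the nearer face and of the farther face.
    near = [l if x <= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    far = [l if x >= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    far_sq = [(x - f) ** 2 for x, f in zip(q, far)]
    total = sum(far_sq)
    # For dimension k use the nearer face, for all others the farther one.
    return math.sqrt(min(total - far_sq[k] + (q[k] - near[k]) ** 2
                         for k in range(len(q))))

def decide(d, q, lo, hi):
    """Apply the three rules; 'undecided' is not covered by the rules."""
    if d < mindist(q, lo, hi):
        return "prune"
    if d > maxdist(q, lo, hi):
        return "explore: d > MAXDIST"
    if d > minmaxdist(q, lo, hi):
        return "explore: d > MINMAXDIST"
    return "undecided"

q, lo, hi = (0.0, 0.0), (3.0, 4.0), (6.0, 8.0)
```

For this rectangle MINDIST is 5, MINMAXDIST is the square root of 52 (about 7.21) and MAXDIST is 10, so d = 4 is pruned while d = 8 and d = 11 are explored. A conservative implementation would probably also explore the undecided range, but that choice is not stated in the rules above.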

Q: How the Rdnn tree makes the NN algorithm more efficient.

The NN algorithm based on the Rdnn tree uses one more heuristic to prune additional parts of the data set, which makes it more efficient. Suppose we have a query point q and want to find its NN. Let p be a point in the data set S, let D(p,q) be the Euclidean distance between p and q, and let dnn(p) be the distance between p and its nearest neighbor. If D(p,q) < dnn(p)/2 for some p in S, then p is the NN of q.

To see why, let C(p, dnn(p)) be the circle centered at p with radius dnn(p). By the definition of dnn(p), no other data point lies strictly inside C, so every other data point x satisfies D(p,x) >= dnn(p). By the triangle inequality, D(x,q) >= D(p,x) - D(p,q) > dnn(p) - dnn(p)/2 = dnn(p)/2 > D(p,q). Every other point is therefore farther from q than p is, and hence p is the NN of q.

So while searching for the NN, as soon as we find a point p at a leaf node with D(p,q) < dnn(p)/2, we can stop the search and report p as the NN of q.
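The early-termination test can be illustrated with a short Python sketch. It scans a flat list of (point, NN distance) leaf entries instead of traversing a tree, so it shows only the stopping rule and not the full search order; the function name `nn_with_early_stop` and the visit counter are hypothetical additions for illustration.

```python
import math

def nn_with_early_stop(entries, q):
    """Scan (point, dnn) entries and stop once D(p, q) < dnn / 2.

    Returns (nearest point found, its distance to q, entries visited).
    """
    best, best_d, visited = None, math.inf, 0
    for p, dnn in entries:
        visited += 1
        d = math.dist(p, q)
        if d < best_d:
            best, best_d = p, d
        if d < dnn / 2:
            # No other data point can be closer to q, so stop here.
            break
    return best, best_d, visited

# Points on a line: (20, 0) is 16 away from its nearest neighbor (4, 0).
entries = [((20.0, 0.0), 16.0), ((0.0, 0.0), 4.0), ((4.0, 0.0), 4.0)]
result = nn_with_early_stop(entries, (21.0, 0.0))
```

With the query at (21, 0), the first entry already satisfies the test (distance 1 is less than 16/2), so the scan stops after visiting a single entry. When no entry satisfies the test, the function falls back to a full scan and still returns the correct nearest neighbor.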