Given four .csv files (two pairs) that contain numeric data:
The first row of each file holds the number of points in the file and the number 128.
In every following row, the first four numbers are the "name" of the point and the other 128 numbers are
the point's coordinates in 128-dimensional space.
For each pair of files, build the data structure from the first file and use the points of the
second file as queries.
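A minimal sketch of reading one such file is shown here. It assumes comma-separated values, a header row with the point count and the dimension, and four leading "name" values per row; the function name `load_points` and the tiny 3-dimensional sample are hypothetical and only illustrate the layout described above.

```python
import io
import numpy as np

def load_points(lines):
    """Parse a descriptor file given as an iterable of text lines.

    Assumed layout: header "n_points,dim", then one row per point with
    4 'name' values followed by `dim` coordinates.
    """
    rows = [line.strip() for line in lines if line.strip()]
    header = [v for v in rows[0].split(",") if v.strip()]
    n_points, dim = int(float(header[0])), int(float(header[1]))
    data = np.array(
        [[float(v) for v in row.split(",") if v.strip()] for row in rows[1:]]
    )
    if data.shape != (n_points, 4 + dim):
        raise ValueError(f"expected {(n_points, 4 + dim)}, got {data.shape}")
    # First 4 columns are the point "name", the rest is the descriptor vector.
    return data[:, :4], data[:, 4:]

# Tiny stand-in for a real file (dimension 3 instead of 128).
sample = io.StringIO("2,3\n1,2,3,4,0.1,0.2,0.3\n5,6,7,8,0.4,0.5,0.6\n")
names, vectors = load_points(sample)
```

With a real file, the same function can be called on an open file handle, for example `load_points(open(path))`, once for the file that builds the structure and once for the query file.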
As you may have noticed, each file is accompanied by an image with the same name.
The application that produced the files is image matching. Given a pair of images that
show roughly the same scene, we want to match each point in the first image with the
point in the second image that shows the same place.
The algorithm scans the image, finds special points, and describes the neighborhood around
each point by a 128-number vector. X and Y are of course the point's position in the image.
Finding the point with the closest vector, with the help of the ANN algorithm, yields a
candidate match for the point.
Every point has a point that is closest to it, but not every point has a true matching point
in the other image (perhaps because the images do not overlap there, or because the process that
scanned for special points did not consider the point's partner special enough).
The first option for handling this is to define two points as compatible if the distance between
them in the 128-dimensional space is less than some threshold. However, this idea did not work well.
With the data structure you implement, you can easily try this method.
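The threshold rule can be tried with a few lines of NumPy. The sketch below uses a brute-force distance computation rather than an ANN structure, and the function name `match_by_threshold` and the threshold value are hypothetical choices for illustration.

```python
import numpy as np

def match_by_threshold(train, queries, threshold):
    """Return (query_index, train_index, distance) for every query whose
    nearest training point is closer than `threshold`."""
    # Pairwise Euclidean distances, shape (n_queries, n_train).
    diffs = queries[:, None, :] - train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    nearest = dists.argmin(axis=1)
    nearest_dist = dists[np.arange(len(queries)), nearest]
    return [
        (q, int(t), float(d))
        for q, (t, d) in enumerate(zip(nearest, nearest_dist))
        if d < threshold
    ]

train = np.array([[0.0, 0.0], [10.0, 0.0]])
queries = np.array([[0.0, 1.0], [5.0, 5.0]])
matches = match_by_threshold(train, queries, threshold=2.0)
```

Here only the first query is accepted, because the second query is about 7.07 away from both training points.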
Parameters:
Each of the two algorithms has two parameters that should be given to the algorithm when
building the data structure.
In LSH there are K and L: K is the number of cuts to compute to create the
hash key, whereas L is the number of hash tables to build.
In RKDT there are N0 and L0: N0 is the maximum number of points found in each leaf of
the tree and L0 is the number of trees to build.
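To make the RKDT parameters concrete, here is a rough sketch of building L0 randomized trees whose leaves hold at most N0 points. The split rule (a random choice among the five highest-variance dimensions, split at the median) is one common variant and is an assumption, as are the names `build_tree`, `build_forest`, and `find_leaf`.

```python
import numpy as np

def build_tree(points, indices, n0, rng, n_top=5):
    """Recursively split `indices` until a leaf holds at most n0 points."""
    if len(indices) <= n0:
        return {"leaf": indices}
    subset = points[indices]
    # Pick the split dimension at random among the highest-variance ones.
    top_dims = np.argsort(subset.var(axis=0))[-n_top:]
    dim = int(rng.choice(top_dims))
    split = float(np.median(subset[:, dim]))
    go_left = subset[:, dim] <= split
    if go_left.all():
        # Degenerate case (many equal values): stop, leaf may exceed n0.
        return {"leaf": indices}
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(points, indices[go_left], n0, rng, n_top),
        "right": build_tree(points, indices[~go_left], n0, rng, n_top),
    }

def build_forest(points, n0, l0, seed=0):
    """Build l0 independent randomized trees over the same points."""
    rng = np.random.default_rng(seed)
    all_indices = np.arange(len(points))
    return [build_tree(points, all_indices, n0, rng) for _ in range(l0)]

def find_leaf(tree, query):
    """Descend to the leaf whose region contains `query`."""
    while "leaf" not in tree:
        side = "left" if query[tree["dim"]] <= tree["split"] else "right"
        tree = tree[side]
    return tree["leaf"]

points = np.random.default_rng(42).standard_normal((200, 16))
forest = build_forest(points, n0=10, l0=3)
```

A query would then collect the candidate indices from `find_leaf` in every tree and compute exact distances only to those candidates. Larger N0 and L0 give more candidates, which tends to improve accuracy at the cost of run time.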
1) Run speed: compare the run speed of the ANN algorithm against the linear KNN
algorithm.
2) Accuracy: since the algorithm is approximate (ANN), it does not always find the nearest neighbor.
Let dNN(P) be the distance from point P to its closest neighbor and dANN(P) the
distance to the point returned by the algorithm. The ratio dANN(P)/dNN(P) is then always
greater than or equal to 1.
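The distance ratio can be computed per query point once both distances are available. The sketch below assumes two arrays of distances, one from the ANN algorithm and one from an exact search; the function name `distance_ratios` and the handling of zero distances are illustrative choices.

```python
import numpy as np

def distance_ratios(d_ann, d_nn):
    """Per-point ratio dANN(P) / dNN(P); 1.0 means the true neighbor was found."""
    d_ann = np.asarray(d_ann, dtype=float)
    d_nn = np.asarray(d_nn, dtype=float)
    # When dNN is 0 the ratio is 1 if the ANN also found a distance of 0,
    # and infinite otherwise (a convention chosen for this sketch).
    ratios = np.where(d_ann > 0, np.inf, 1.0)
    positive = d_nn > 0
    ratios[positive] = d_ann[positive] / d_nn[positive]
    return ratios

ratios = distance_ratios([2.0, 3.0, 0.0], [2.0, 1.5, 0.0])
mean_ratio = ratios.mean()
```

In this example the second point was matched to a neighbor twice as far as the true one, so its ratio is 2 while the others are 1.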
In this exercise, you are required to implement one of the data structures (of your choice)
of the ANN algorithm and a simple linear KNN algorithm for testing correctness and
speed-up.
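A simple linear KNN baseline could look like the following sketch. It compares every query with every training point; the function name `knn_linear` and its return order (distances, then indices) are assumptions made for illustration.

```python
import numpy as np

def knn_linear(train, queries, k=1):
    """Exact k nearest neighbors by exhaustive search.

    Returns (distances, indices), each of shape (n_queries, k),
    sorted from nearest to farthest.
    """
    # Squared distances via ||q||^2 - 2 q.t + ||t||^2, shape (n_queries, n_train).
    sq = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ train.T
        + (train ** 2).sum(axis=1)[None, :]
    )
    sq = np.maximum(sq, 0.0)  # guard against tiny negative rounding errors
    idx = np.argsort(sq, axis=1)[:, :k]
    dist = np.sqrt(np.take_along_axis(sq, idx, axis=1))
    return dist, idx

train = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
queries = np.array([[0.0, 0.0], [3.0, 3.0]])
dist, idx = knn_linear(train, queries, k=2)
```

The result of this baseline serves both as the ground truth for accuracy and as the reference run time for the speed comparison.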
Build a class for the ANN algorithm with the data structure you have chosen to implement.
The class will contain the following methods (note: these are not necessarily all the
methods that must be built. You can add methods as needed, of your choice):
ANN() – The constructor method (this is where the hyperparameters will be
defined)
fit() – a method that inserts the training set into the data structure.
Kneighbors() – This method will receive a set of new samples and return for each sample
the neighbors closest to it and their distance from it.
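One possible shape for such a class, using LSH, is sketched below. It assumes random-hyperplane hashing on mean-centered data (K sign bits per key, L tables), which is only one of several LSH schemes; the argument names, the `seed` parameter, and the list-based return format are illustrative assumptions.

```python
from collections import defaultdict
import numpy as np

class ANN:
    def __init__(self, K=8, L=4, seed=0):
        self.K, self.L = K, L  # bits per hash key, number of hash tables
        self.rng = np.random.default_rng(seed)

    def _bits(self, X):
        # Sign of the projection on each hyperplane, shape (L, n, K).
        proj = np.einsum("lkd,nd->lnk", self.planes, X - self.center)
        return proj > 0

    def fit(self, X):
        self.X = np.asarray(X, dtype=float)
        self.center = self.X.mean(axis=0)
        dim = self.X.shape[1]
        self.planes = self.rng.standard_normal((self.L, self.K, dim))
        self.tables = [defaultdict(list) for _ in range(self.L)]
        bits = self._bits(self.X)
        for t in range(self.L):
            for i, row in enumerate(bits[t]):
                self.tables[t][row.tobytes()].append(i)
        return self

    def Kneighbors(self, Q, n_neighbors=1):
        """Return two lists (distances, indices), one array per query.
        A query may get fewer than n_neighbors results if its buckets are small."""
        Q = np.asarray(Q, dtype=float)
        bits = self._bits(Q)
        all_dist, all_idx = [], []
        for qi, q in enumerate(Q):
            candidates = set()
            for t in range(self.L):
                candidates.update(self.tables[t].get(bits[t, qi].tobytes(), []))
            cand = np.fromiter(candidates, dtype=int)
            d = np.linalg.norm(self.X[cand] - q, axis=1)
            order = np.argsort(d)[:n_neighbors]
            all_dist.append(d[order])
            all_idx.append(cand[order])
        return all_dist, all_idx

train = np.random.default_rng(1).standard_normal((300, 128))
model = ANN(K=6, L=3).fit(train)
dist, idx = model.Kneighbors(train[:5], n_neighbors=3)
```

Querying with training points, as in the usage lines, is a quick sanity check: each point hashes to its own buckets, so it must come back as its own nearest neighbor at distance 0. Changing `seed` changes the random hyperplanes and therefore the results of a run.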
5. Write a method that receives a threshold value for α (e.g. 0.1) and returns the 5 pairs of
parameters for which the ANN algorithm runs fastest.
- Display the parameter values and run times of all pairs in a graph.
6. For the fastest pair of parameter values (obtained in section 5), run the ANN
algorithm 10 times and show that different results (accuracy and run time) are obtained
in each run.
- Present the results in appropriate visuals.
7. Then run the corresponding algorithm in sklearn (NearestNeighbors). This model uses a KD
Tree as its underlying algorithm, but its functionality matches that of your algorithm. (15
points)
8. Implement the data structure that you did not choose in the mandatory section,
including sections 2-4.
9. Add another method to the ANN algorithm that receives a certain distance radius and
returns the nearest neighbors within that radius. We'll set the radius to 1.5*d(NN).
Compare the results to the radius_neighbors method that exists in sklearn.
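For the radius query in section 9, a brute-force reference can help check an ANN implementation. The sketch below is not an ANN method itself; the function name `radius_neighbors` and the toy data are hypothetical, and the radius is set to 1.5 times the exact nearest-neighbor distance as described above.

```python
import numpy as np

def radius_neighbors(train, query, radius):
    """Return (distances, indices) of all training points within `radius`
    of `query`, sorted from nearest to farthest."""
    distances = np.linalg.norm(train - query, axis=1)
    inside = np.flatnonzero(distances <= radius)
    order = inside[np.argsort(distances[inside])]
    return distances[order], order

train = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 0.0]])
query = np.array([0.0, 1.0])

# Radius of 1.5 * d(NN), where d(NN) is the exact nearest-neighbor distance.
d_nn = np.linalg.norm(train - query, axis=1).min()
dist, idx = radius_neighbors(train, query, radius=1.5 * d_nn)
```

Here d(NN) is 1, so the radius is 1.5 and the two points at distances 1 and about 1.414 are returned. The same radius can be passed to sklearn's `radius_neighbors` for the comparison; note that when d(NN) is 0 the radius is 0 and only exact duplicates are returned.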
Please note that for each section you must record your implementation stages through
comments in code + markdown cells for results and conclusions.
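For the parameter search in section 5, the selection step can be sketched independently of the chosen data structure. The sketch assumes that α is an upper bound on some error measure and that a user-supplied, hypothetical callback `evaluate(p1, p2)` returns `(run_time_seconds, error)` for one pair of parameters; both are assumptions, since the exact definition of α is not given here.

```python
def fastest_pairs(evaluate, param_pairs, alpha, top=5):
    """Evaluate every parameter pair and return the `top` fastest pairs
    whose error does not exceed `alpha`, plus all raw results."""
    results = []
    for pair in param_pairs:
        run_time, error = evaluate(*pair)
        results.append((pair, run_time, error))
    acceptable = [r for r in results if r[2] <= alpha]
    best = sorted(acceptable, key=lambda r: r[1])[:top]
    return best, results

# Made-up timings and errors, standing in for real measurements.
fake = {
    (4, 2): (0.30, 0.05),
    (8, 2): (0.10, 0.20),
    (8, 4): (0.20, 0.08),
    (12, 4): (0.15, 0.02),
}
best, everything = fastest_pairs(
    lambda p1, p2: fake[(p1, p2)], list(fake), alpha=0.1, top=2
)
```

The second return value keeps the results of all pairs, which is what the graph of parameter values and run times needs. In a real run, `evaluate` would build the structure with the given pair, time the queries with `time.perf_counter()`, and compute the error against the linear KNN result.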
Please read the following section carefully. The nature of the work and the scores will
depend on the following sections:
Submission:
Code:
Correctness
Clarity
Efficiency of implementation
Elegance of implementation