
File data:

Given four .csv files (two pairs) that contain numeric data:

Hananya1.csv and Hananya2.csv,

Hashmal1.csv and Hashmal2.csv.

The first row of each file contains the number of points in the file and the number 128.

Each point consists of 132 numbers:

Y X Scale Angle v[1] ... v[128].

The first four numbers are the "name" of the point, and the other 128 numbers are the
point in 128-dimensional space.

For each pair of files, build the data structure from the points of the first file and use the
points of the second file as queries.
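A minimal loading sketch for the format described above (a header row with the point count and 128, then one point per row with 132 numbers). This assumes comma-separated values; the exact delimiter in the supplied files may differ:

```python
import numpy as np

def load_points(path):
    """Load a .csv file in the described format: a header row with
    (number of points, 128), then rows of Y, X, Scale, Angle, v[1]..v[128]."""
    with open(path) as f:
        header = f.readline().replace(",", " ").split()
        n_points, dim = int(float(header[0])), int(float(header[1]))
        data = np.loadtxt(f, delimiter=",")   # one point per remaining row
    names = data[:, :4]            # Y, X, Scale, Angle ("name" of the point)
    vectors = data[:, 4:4 + dim]   # the 128-dimensional descriptor
    assert vectors.shape == (n_points, dim)
    return names, vectors

# build the structure from the first file, query with the second, e.g.:
# names1, vecs1 = load_points("Hananya1.csv")
# names2, vecs2 = load_points("Hananya2.csv")
```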

What the file data means:

As you may have noticed, each file is accompanied by an image with the same name.
The application from which the files were created is image matching: given a pair of images
showing roughly the same scene, we want to match a point in the first image with the point
in the second image that shows the same place.

The algorithm scans the image, finds special points, and describes the environment around
each point by a 128-number vector. X and Y are, of course, the point's position in the image.

By finding the point with the closest vector with the help of the ANN algorithm, a candidate
for the matching point is found.
Every point has some nearest neighbor, but not every point has a true matching point
in the other image (perhaps because there is no overlap between the images, or because the
process that scanned and searched for special points did not consider the point's partner
special enough).

Therefore, it is necessary to decide whether each candidate match is appropriate or not.

The first option is to define points as matching if the distance between them in the
128-dimensional space is less than some threshold. However, this idea did not work well.

Therefore, we suggest the following method:


We will find the two neighbors closest to the point in the data structure and consider the
ratio of distances (first distance divided by second distance). If the distance to the first
neighbor is much smaller than the distance to the second (a ratio less than 0.8, for
example), then we take the pair; if not, we don't.
Pseudo-code example:

    nearest_dist, second_dist = kneighbors(sample, k=2)  # two smallest distances
    ratio = nearest_dist / second_dist
    if ratio < 0.8:
        return sample, nearest_neighbor

With the data structure you have implemented, you can easily try this method.
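The same ratio test can be tried with scikit-learn's exact NearestNeighbors before plugging in your own class (a sketch on random stand-in data; 0.8 is the example threshold from the text):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 128))   # stand-in for the vectors of file 1
queries = rng.normal(size=(20, 128))  # stand-in for the vectors of file 2

nn = NearestNeighbors(n_neighbors=2).fit(train)
dist, idx = nn.kneighbors(queries)    # dist[:, 0] <= dist[:, 1]

ratio = dist[:, 0] / dist[:, 1]       # first distance / second distance
accepted = ratio < 0.8                # keep only confident matches
matches = [(q, idx[q, 0]) for q in np.flatnonzero(accepted)]
```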

In the course we learned about two methods of ANN:


 Locality Sensitive Hashing (LSH)

 Randomized KD-Trees (RKDT).

Parameters:

Each of the two algorithms has two parameters that should be given to the algorithm when
building the data structure.

In LSH there are K and L: K is the number of random cuts (hash functions) used to compute the
hash key, whereas L is the number of hash tables to build.

In RKDT there are N0 and L0: N0 is the maximum number of points allowed in each leaf of
the tree, and L0 is the number of trees to build.
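A minimal random-hyperplane LSH sketch that shows the role of K (bits per hash key) and L (number of tables). This is an illustration of the two parameters, not the required implementation; the class name and internals are our own:

```python
import numpy as np
from collections import defaultdict

class SimpleLSH:
    def __init__(self, K=8, L=4, dim=128, seed=0):
        rng = np.random.default_rng(seed)
        # L tables, each with K random hyperplanes -> K-bit hash keys
        self.planes = rng.normal(size=(L, K, dim))
        self.tables = [defaultdict(list) for _ in range(L)]

    def _keys(self, v):
        # one K-bit key per table: the sign pattern of K projections
        return [tuple((p @ v > 0).astype(int)) for p in self.planes]

    def fit(self, X):
        self.X = np.asarray(X, dtype=float)
        for i, v in enumerate(self.X):
            for table, key in zip(self.tables, self._keys(v)):
                table[key].append(i)
        return self

    def query(self, v):
        # union of the buckets v falls into, searched linearly
        cand = sorted({i for table, key in zip(self.tables, self._keys(v))
                       for i in table[key]})
        if not cand:
            return None, np.inf
        d = np.linalg.norm(self.X[cand] - v, axis=1)
        best = int(np.argmin(d))
        return cand[best], float(d[best])
```

Larger K makes buckets smaller (faster but less accurate); larger L adds tables and raises the chance the true neighbor shares a bucket.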

Each such data structure is examined in two ways:

1) Run speed: comparing the run speed of the ANN algorithm against the linear KNN
algorithm.

2) Accuracy: since it is an ANN, it does not always find the true nearest neighbor.
If we define dNN(P) as the distance to the neighbor closest to point P, and dANN(P) as the
distance to the point returned by the algorithm, then the ratio of distances dANN(P)/dNN(P)
is always greater than or equal to 1.

We define the average error α over the n query points as:

    α = (1/n) · Σᵢ ( dANN(Pᵢ) / dNN(Pᵢ) ) − 1
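Computing the average error is then a one-liner (a sketch assuming dNN and dANN are arrays over the same query points, dNN coming from the exact linear KNN and dANN from your ANN):

```python
import numpy as np

def average_error(d_nn, d_ann):
    """alpha = mean(dANN / dNN) - 1; alpha == 0 means the ANN always
    returned the true nearest neighbor."""
    d_nn = np.asarray(d_nn, dtype=float)
    d_ann = np.asarray(d_ann, dtype=float)
    return float(np.mean(d_ann / d_nn) - 1.0)

# example: exact on two queries, 10% too far on the third
alpha = average_error([1.0, 2.0, 4.0], [1.0, 2.0, 4.4])
```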


To-dos:

In this exercise, you are required to implement one of the data structures (of your choice)
for the ANN algorithm, and a simple linear KNN algorithm for testing correctness and
measuring the speed-up.

Build a class for the ANN algorithm with the data structure you have chosen to implement.
The class will contain the following methods (note: these are not necessarily all the
methods that must be built; you can add methods as needed, of your choice):

 ANN() – the class constructor (this is where the hyperparameters will be
defined).
 fit() – a method that inserts the training set into the data structure.
 kneighbors() – a method that receives a set of new samples and returns, for each sample,
the neighbors closest to it and their distances from it.

* Similarly, build a class for the KNN linear algorithm.
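A possible shape for the two classes (a sketch: the KNN class is a complete brute-force baseline, while the ANN bodies are left to whichever structure you choose; everything beyond the ANN/fit/kneighbors names from the text is our own):

```python
import numpy as np

class KNN:
    """Exact linear nearest-neighbor search (the correctness baseline)."""
    def fit(self, X):
        self.X = np.asarray(X, dtype=float)
        return self

    def kneighbors(self, samples, k=1):
        samples = np.asarray(samples, dtype=float)
        # full distance matrix: queries x training points
        d = np.linalg.norm(samples[:, None, :] - self.X[None, :, :], axis=2)
        idx = np.argsort(d, axis=1)[:, :k]
        return np.take_along_axis(d, idx, axis=1), idx

class ANN:
    def __init__(self, K=8, L=4):   # hyperparameters (LSH-style example)
        self.K, self.L = K, L

    def fit(self, X):
        ...                          # build the LSH tables or RKD-trees
        return self

    def kneighbors(self, samples, k=1):
        ...                          # approximate search
```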

1. Implementation of a linear KNN algorithm.


2. Implement the ANN algorithm by selecting one of the data structures.
3. Given k=2 – check the correctness of the ANN algorithm that you implemented using
the ratio method described above.
- Present and explain the results you received.
4. Use the Pillow library to show, for the 10 best results, the nearest neighbors you found on the
images, highlighting corresponding pixels between the pairs. Looking at the results, indicate
which of the matches were correct and which were not. See the example below.
Depending on the data structure you have chosen to implement for the ANN algorithm
(LSH or RKDT), find the parameters that yield the optimal result (the smallest error
you can reach).
- Select 10 values for each parameter and find the optimal values using a grid search.
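The grid search over the two parameters can be sketched like this (assuming an ANN class with the constructor/fit/kneighbors interface described above; `build_index`, `score`, and `measure_alpha` are placeholder names for your own code):

```python
import itertools

def grid_search(build_index, score, K_values, L_values):
    """Try every (K, L) pair and return (alpha, K, L) with the smallest error.
    build_index(K, L) -> fitted index; score(index) -> average error alpha."""
    best = None
    for K, L in itertools.product(K_values, L_values):
        alpha = score(build_index(K, L))
        if best is None or alpha < best[0]:
            best = (alpha, K, L)
    return best

# usage idea (10 values per parameter, as the exercise asks):
# best_alpha, K, L = grid_search(
#     lambda K, L: ANN(K=K, L=L).fit(train_vectors),
#     lambda index: measure_alpha(index, query_vectors),  # hypothetical helper
#     K_values=range(2, 22, 2), L_values=range(1, 11))
```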

5. Write a function that receives a low value for α (e.g. 0.1) and returns the 5 pairs of
parameter values for which the ANN algorithm runs fastest.
- Display the parameter values and run times of all pairs in a graph.
6. For the fastest pair of parameter values (obtained in section 5), run the ANN
algorithm 10 times and show that different results (accuracy and run time) are obtained
on each run.
- Present the results in appropriate visualizations.
7. Run the corresponding algorithm in sklearn (NearestNeighbors). This model uses a KD-Tree
as the underlying algorithm, but its functionality matches that of your algorithm. (15
points)
8. Implement the data structure that you did not choose in the mandatory section, and
repeat sections 2-4 with it.
9. Add another method to the ANN algorithm that receives a distance radius and
returns the nearest neighbors within that radius. We will set the radius to 1.5·d(NN).
Compare the results to the radius_neighbors method that exists in sklearn.
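A sketch of the radius variant and the sklearn comparison (a brute-force search stands in for your ANN's radius method here; `radius_neighbors` is the real sklearn API, the data is random stand-in data):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train = rng.normal(size=(300, 8))
query = rng.normal(size=(1, 8))

# exact nearest-neighbor distance, then radius = 1.5 * d(NN)
nn = NearestNeighbors(n_neighbors=1).fit(train)
d_nn = nn.kneighbors(query)[0][0, 0]
radius = 1.5 * d_nn

# brute-force radius search (stand-in for the ANN class method)
d_all = np.linalg.norm(train - query[0], axis=1)
mine = set(np.flatnonzero(d_all <= radius))

# sklearn's radius_neighbors for comparison
sk_dist, sk_idx = nn.radius_neighbors(query, radius=radius)
assert mine == set(sk_idx[0])   # both use Euclidean distance <= radius
```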

Please note that for each section you must document your implementation stages through
comments in the code, plus markdown cells for results and conclusions.

Example for Section 3:

Guidelines for working on the project:

Please read the following section carefully; the nature of the work and the grading will
depend on the following points:

Submission:

 Jupyter notebook, in both HTML and ipynb formats.

Code:

 The code should run without warnings or errors.


 Good documentation is critical.
 Use familiar packages with explicit explanations.
 If you have installed any packages beyond those presented in the exercises, please
list this in the report.
 When you draw graphs, you must add the following: clear axis names, title, and legend
if necessary.
 Please indicate the exercise sections in the code as well.
 Use meaningful variable names.
 Don't use reserved words.
 Organize your code with functions and loops rather than repeated one-off code.
 Use constants where possible.
Criteria:

 Correctness
 Clarity
 Efficiency of the implementation
 Elegance of the implementation
