2.8 DataMining

Uploaded by

Chinmayi Kulkarni

Data mining assignment

2.8 It is important to define or select similarity measures in data analysis. However, there is no commonly accepted subjective similarity measure. Results can vary depending on the similarity measures used. Nonetheless, seemingly different similarity measures may be equivalent after some transformation.

Suppose we have the following 2-D data set:

        A1     A2
x1      1.5    1.7
x2      2.0    1.9
x3      1.6    1.8
x4      1.2    1.5
x5      1.5    1.0

(a) Consider the data as 2-D data points. Given a new data point, x = (1.4, 1.6), as a query, rank the database points based on similarity with the query using Euclidean distance, Manhattan distance, supremum distance, and cosine similarity.
(b) Normalize the data set to make the norm of each data point equal to 1. Use Euclidean distance on the transformed data to rank the data points.
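Ans a) Formula for Euclidean distance between two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$:

$$d(a, b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

Therefore, d(x,x1)=0.141
d(x,x2)=0.671
d(x,x3)=0.283
d(x,x4)=0.224
d(x,x5)=0.608
Thus, the rank of the data points based on similarity with x using Euclidean distance, from most similar (smallest distance) to least similar, is
x1, x4, x3, x5, x2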
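
The Euclidean distances can be checked with a short script. This is a minimal sketch, not part of the original answer: the names `points`, `query`, `euclidean`, `euclidean_distances`, and `euclidean_ranking` are illustrative choices, and the ranking is taken to run from smallest distance (most similar) to largest.

```python
import math

# Data points from the exercise; `query` is the new point x.
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Distance from the query to every point, sorted ascending (closest first).
euclidean_distances = {name: euclidean(query, p) for name, p in points.items()}
euclidean_ranking = sorted(euclidean_distances, key=euclidean_distances.get)

for name in euclidean_ranking:
    print(f"{name}: {euclidean_distances[name]:.3f}")
```

Sorting ascending puts the nearest point first, so the printed order is the similarity ranking.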

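Formula for Manhattan distance between two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$:

$$d(a, b) = |a_1 - b_1| + |a_2 - b_2|$$

Therefore, d(x,x1)=0.2
d(x,x2)=0.9
d(x,x3)=0.4
d(x,x4)=0.3
d(x,x5)=0.7
Thus, the rank of the data points based on similarity with x using Manhattan distance, from most similar to least similar, is
x1, x4, x3, x5, x2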

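Formula for supremum distance between two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$:

$$d(a, b) = \max(|a_1 - b_1|, |a_2 - b_2|)$$

Therefore, d(x,x1)=0.1
d(x,x2)=0.6
d(x,x3)=0.2
d(x,x4)=0.2
d(x,x5)=0.6
Thus, the rank of the data points based on similarity with x using supremum distance, from most similar to least similar, is
x1, then x3 and x4 (tied at 0.2), then x2 and x5 (tied at 0.6)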
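
The Manhattan and supremum rankings can be reproduced the same way. The sketch below is illustrative only: the helper `rank_with_ties` and the choice to round distances to 9 decimals are assumptions added here, so that floating-point noise does not split values that are equal on paper (for example 0.2 for both x3 and x4 under the supremum distance).

```python
from itertools import groupby

# Data points from the exercise; `query` is the new point x.
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def supremum(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def rank_with_ties(distance):
    """Return groups of point names, closest group first; ties share a group."""
    scored = sorted(
        (round(distance(query, p), 9), name) for name, p in points.items()
    )
    return [
        [name for _, name in group]
        for _, group in groupby(scored, key=lambda pair: pair[0])
    ]


manhattan_ranking = rank_with_ties(manhattan)
supremum_ranking = rank_with_ties(supremum)
print("Manhattan:", manhattan_ranking)
print("Supremum: ", supremum_ranking)
```

Grouping tied points makes it explicit that the supremum distance cannot separate x3 from x4 or x2 from x5, while the Manhattan distance gives a strict order.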
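Cosine similarity:

$$\cos(a, b) = \frac{a \cdot b}{\|a\| \, \|b\|}$$

where $\|a\|$ is the Euclidean norm of vector $a = (a_1, a_2, \ldots, a_n)$, defined as

$$\|a\| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$

$$\cos(x, x_1) = \frac{(1.4)(1.5) + (1.6)(1.7)}{\sqrt{(1.4^2 + 1.6^2)(1.5^2 + 1.7^2)}} = \frac{2.1 + 2.72}{4.82004} = \frac{4.82}{4.82004} = 0.99999$$

$$\cos(x, x_2) = \frac{(1.4)(2) + (1.6)(1.9)}{\sqrt{(1.4^2 + 1.6^2)(2^2 + 1.9^2)}} = \frac{5.84}{5.86491} = 0.99575$$

$$\cos(x, x_3) = \frac{(1.4)(1.6) + (1.6)(1.8)}{\sqrt{(1.4^2 + 1.6^2)(1.6^2 + 1.8^2)}} = \frac{5.12}{5.12016} = 0.99997$$

$$\cos(x, x_4) = \frac{(1.4)(1.2) + (1.6)(1.5)}{\sqrt{(1.4^2 + 1.6^2)(1.2^2 + 1.5^2)}} = \frac{4.08}{4.08397} = 0.99903$$

$$\cos(x, x_5) = \frac{(1.4)(1.5) + (1.6)(1.0)}{\sqrt{(1.4^2 + 1.6^2)(1.5^2 + 1.0^2)}} = \frac{3.7}{3.83275} = 0.96536$$

Five decimals are needed here because x1 and x3 both round to 0.9999 at four decimals.

Thus, the rank of the data points based on similarity with x using cosine similarity, from most similar (largest value) to least similar, is x1, x3, x4, x2, x5.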
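
The cosine similarities are close to 1 for several points, so a quick numerical check is useful. The following is a minimal sketch with illustrative names (`cosine_similarity`, `cosine_scores`, `cosine_ranking`); unlike a distance, a larger cosine value means a more similar point, so the sort is descending.

```python
import math

# Data points from the exercise; `query` is the new point x.
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)


def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)


# Larger cosine means more similar, so sort in descending order.
cosine_scores = {name: cosine_similarity(query, p) for name, p in points.items()}
cosine_ranking = sorted(cosine_scores, key=cosine_scores.get, reverse=True)

for name in cosine_ranking:
    print(f"{name}: {cosine_scores[name]:.5f}")
```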

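b) norm(x) = sqrt{(1.4)^2 + (1.6)^2} ~ 2.13
Normalized x is (1.4/2.13, 1.6/2.13) = (0.66, 0.75)
norm(x1) = sqrt{(1.5)^2 + (1.7)^2} ~ 2.27
Normalized x1 is (1.5/2.27, 1.7/2.27) = (0.66, 0.75)
norm(x2) = sqrt{(2)^2 + (1.9)^2} ~ 2.76
Normalized x2 is (2/2.76, 1.9/2.76) = (0.72, 0.69)
norm(x3) = sqrt{(1.6)^2 + (1.8)^2} ~ 2.41
Normalized x3 is (1.6/2.41, 1.8/2.41) = (0.66, 0.75)
norm(x4) = sqrt{(1.2)^2 + (1.5)^2} ~ 1.92
Normalized x4 is (1.2/1.92, 1.5/1.92) = (0.62, 0.78)
norm(x5) = sqrt{(1.5)^2 + (1.0)^2} ~ 1.80
Normalized x5 is (1.5/1.80, 1.0/1.80) = (0.83, 0.55)

Formula for Euclidean distance between two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$:

$$D(a, b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

The normalized x, x1, and x3 coincide at two decimals, so the distances below are computed from the unrounded normalized coordinates.

D(x,x1)=0.0041
D(x,x2)=0.0922
D(x,x3)=0.0078
D(x,x4)=0.0441
D(x,x5)=0.2632
Thus, the rank of the data points based on similarity with x using Euclidean distance on the normalized data, from most similar to least similar, is
x1, x3, x4, x2, x5.
This is the same order as the cosine similarity ranking in part (a). For unit vectors, $D(a, b)^2 = 2 - 2\cos(a, b)$, so a smaller Euclidean distance always corresponds to a larger cosine similarity.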
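
Part (b) can be verified numerically, including the relation between Euclidean distance on unit vectors and cosine similarity. This sketch uses illustrative names (`normalize`, `normalized_distances`, `normalized_ranking`) and works with full floating-point precision rather than the two-decimal coordinates, which matters because several normalized points nearly coincide.

```python
import math

# Data points from the exercise; `query` is the new point x.
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)


def normalize(v):
    """Scale a vector so that its Euclidean norm equals 1."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


unit_query = normalize(query)
normalized_distances = {
    name: euclidean(unit_query, normalize(p)) for name, p in points.items()
}
normalized_ranking = sorted(normalized_distances, key=normalized_distances.get)

# For unit vectors the dot product is the cosine, and D^2 = 2 - 2*cos.
for name, p in points.items():
    cos = sum(a * b for a, b in zip(unit_query, normalize(p)))
    assert math.isclose(normalized_distances[name] ** 2, 2 - 2 * cos, abs_tol=1e-12)

for name in normalized_ranking:
    print(f"{name}: {normalized_distances[name]:.4f}")
```

Because the squared distance is a decreasing function of the cosine, ranking by ascending distance on the normalized data reproduces the descending cosine ranking.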
