By:
Viona Kacaribu -
Steven Tricahayadi -
Nurchayadin - 004201705065
28 January 2020
Contribution Member :
Nurcahayadin = No.2 C
Manager Group :
Member Group :
a) [−1, 1]. Often the data has only positive entries, and in that case the range is [0, 1].
b) Not necessarily. All we know is that the values of their attributes differ by a constant
factor.
c) For two vectors x and y that have a mean of 0, corr(x, y) = cos(x, y).
d) Since all the 10000 points fall on the curve, there is a functional relationship between
Euclidean distance and cosine similarity for normalized data. More specifically, there is an
inverse relationship between cosine similarity and Euclidean distance. For example, if two
data points are identical, their cosine similarity is one and their Euclidean distance is zero,
but if two data points have a high Euclidean distance, their cosine value is close to zero. Note
that all the sample data points were from the positive quadrant, i.e., had only positive values.
This means that all cosine values will be positive and correlation values will be positive.
e) Same as answer (d), but with correlation substituted for cosine.
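The functional relationship described in (d) can be made explicit. For vectors scaled to L2 length 1, expanding the squared distance gives $d(x, y)^2 = \lVert x \rVert^2 + \lVert y \rVert^2 - 2\,x \cdot y = 2\,(1 - \cos(x, y))$. The sketch below checks this numerically on random positive vectors. It is only an illustration: the helper names (`l2_normalize`, `max_identity_gap`), the number of pairs, the dimension, and the seed are assumptions, not part of the original exercise.

```python
import math
import random

def l2_normalize(v):
    """Scale a vector to L2 length 1."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def euclidean(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def max_identity_gap(n_pairs=1000, dim=5, seed=0):
    """Largest observed |d^2 - 2(1 - cos)| over random positive unit vectors."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_pairs):
        # Strictly positive entries, i.e. points in the positive quadrant.
        x = l2_normalize([rng.random() + 0.01 for _ in range(dim)])
        y = l2_normalize([rng.random() + 0.01 for _ in range(dim)])
        gap = abs(euclidean(x, y) ** 2 - 2.0 * (1.0 - cosine(x, y)))
        worst = max(worst, gap)
    return worst

print(max_identity_gap())  # expected to be at floating-point rounding level
```

Because every entry is positive, the cosine stays between 0 and 1, so the distance between unit vectors stays between 0 and √2, which matches the inverse relationship described in (d).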
No.2
Question : For the following vectors, x and y, calculate the indicated similarity or distance
measures.
a. x = (1, 1, 1, 1), y = (2, 2, 2, 2) cosine, correlation, Euclidean, Extended Jaccard
b. x = (0, 1, 0, 1), y = (1, 0, 1, 0) cosine, correlation, Euclidean, SMC, Jaccard
c. x = (0,−1, 0, 1), y = (1, 0,−1, 0) cosine, correlation, Euclidean
d. x = (1, 1, 0, 1, 0, 1), y = (1, 1, 1, 0, 0, 1) cosine, correlation, SMC, Jaccard
e. x = (2,−1, 0, 2, 0,−3), y = (−1, 1,−1, 0, 0,−1) cosine, correlation, Euclidean
Answer :
Nurchayadin ( 004201705065)
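A) x = (1, 1, 1, 1), y = (2, 2, 2, 2)
Cosine
$\cos(x, y) = \dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} = \dfrac{2 + 2 + 2 + 2}{\sqrt{1 + 1 + 1 + 1}\,\sqrt{4 + 4 + 4 + 4}} = \dfrac{8}{2 \cdot 4} = 1$
Correlation
$\mathrm{corr}(x, y) = \dfrac{s_{xy}}{s_x s_y} = \dfrac{0}{0}$ (not defined)
$s_{xy} = \dfrac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})$
$\bar{x} = \dfrac{1}{n} \sum_{k=1}^{n} x_k = \dfrac{1 + 1 + 1 + 1}{4} = 1$
$\bar{y} = \dfrac{1}{n} \sum_{k=1}^{n} y_k = \dfrac{2 + 2 + 2 + 2}{4} = 2$
$s_{xy} = \dfrac{1}{3} \cdot 0 = 0$
$s_x = \sqrt{\dfrac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2} = 0$
$s_y = \sqrt{\dfrac{1}{n-1} \sum_{k=1}^{n} (y_k - \bar{y})^2} = 0$
Euclidean
$d(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} = \sqrt{(1-2)^2 + (1-2)^2 + (1-2)^2 + (1-2)^2} = \sqrt{4} = 2$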
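The three measures used in answer A can be checked with a few lines of plain Python. This is a minimal sketch, not the method the group used: the function names are chosen here for illustration, and returning `None` for an undefined correlation (a 0/0 case, when either standard deviation is zero) is an assumption of this sketch.

```python
import math

def cosine(x, y):
    """Cosine similarity: x . y / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def correlation(x, y):
    """Sample correlation s_xy / (s_x s_y); None when a standard deviation is 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if s_x == 0 or s_y == 0:
        return None  # 0/0: correlation is not defined for a constant vector
    return s_xy / (s_x * s_y)

def euclidean(x, y):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x = (1, 1, 1, 1)
y = (2, 2, 2, 2)
print(cosine(x, y), correlation(x, y), euclidean(x, y))
```

The same functions can be reused for the other parts of question No.2 by changing `x` and `y`.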
Viona Kacaribu
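B) x = (0, 1, 0, 1), y = (1, 0, 1, 0)
Cosine
$\cos(x, y) = \dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} = \dfrac{0 + 0 + 0 + 0}{\sqrt{2}\,\sqrt{2}} = \dfrac{0}{2} = 0$
Correlation
$\mathrm{corr}(x, y) = \dfrac{s_{xy}}{s_x s_y} = \dfrac{-\frac{1}{3}}{\sqrt{\frac{1}{3}}\,\sqrt{\frac{1}{3}}} = -1$
$s_{xy} = \dfrac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})$
$\bar{x} = \dfrac{1}{n} \sum_{k=1}^{n} x_k = \dfrac{2}{4} = \dfrac{1}{2}$
$\bar{y} = \dfrac{1}{n} \sum_{k=1}^{n} y_k = \dfrac{1}{2}$
$s_{xy} = \dfrac{1}{3} \left( -\dfrac{1}{4} - \dfrac{1}{4} - \dfrac{1}{4} - \dfrac{1}{4} \right) = -\dfrac{1}{3}$
$s_x = \sqrt{\dfrac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2} = \sqrt{\dfrac{1}{3} \left( \dfrac{1}{4} + \dfrac{1}{4} + \dfrac{1}{4} + \dfrac{1}{4} \right)} = \sqrt{\dfrac{1}{3}}$
$s_y = \sqrt{\dfrac{1}{n-1} \sum_{k=1}^{n} (y_k - \bar{y})^2} = \sqrt{\dfrac{1}{3}}$
Euclidean
$d(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} = \sqrt{1 + 1 + 1 + 1} = \sqrt{4} = 2$
Jaccard
$J = \dfrac{f_{11}}{f_{01} + f_{10} + f_{11}} = \dfrac{0}{2 + 2 + 0} = 0$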
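One way to see why the correlation in B is exactly −1, offered here as a side observation rather than as part of the original answer: y is the binary complement of x, so every deviation of y from its mean is the negative of the corresponding deviation of x.

```latex
% Assumption used below: y_k = 1 - x_k for every k, which holds for
% x = (0, 1, 0, 1) and y = (1, 0, 1, 0).
\begin{align*}
\bar{y} &= 1 - \bar{x}
  \quad\Longrightarrow\quad
  y_k - \bar{y} = -(x_k - \bar{x}) \\
s_{xy} &= \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})
        = -\frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2
        = -s_x^2 \\
s_y &= s_x
  \quad\Longrightarrow\quad
  \mathrm{corr}(x, y) = \frac{s_{xy}}{s_x s_y} = \frac{-s_x^2}{s_x^2} = -1
\end{align*}
```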
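D) x = (1, 1, 0, 1, 0, 1), y = (1, 1, 1, 0, 0, 1)
Correlation
$\bar{x} = \bar{y} = \dfrac{4}{6} = 0.666$
$\mathrm{Cov}(x, y) = \dfrac{1}{6-1} \left[ (1-0.666)(1-0.666) + (1-0.666)(1-0.666) + (0-0.666)(1-0.666) + (1-0.666)(0-0.666) + (0-0.666)(0-0.666) + (1-0.666)(1-0.666) \right]$
$\mathrm{Cov}(x, y) = \dfrac{1}{5} \left[ 0.111556 + 0.111556 - 0.2224 - 0.2224 + 0.4435 + 0.111556 \right]$
$\mathrm{Cov}(x, y) = \dfrac{1}{5} \cdot 0.333368 = 0.06667$
$s(x) = \sqrt{\dfrac{1}{6-1} \sum_{k=1}^{6} (x_k - 0.666)^2} = \sqrt{\dfrac{1.3333}{5}} = 0.5164$
$s(y) = \sqrt{\dfrac{1}{6-1} \sum_{k=1}^{6} (y_k - 0.666)^2} = \sqrt{\dfrac{1.3333}{5}} = 0.5164$
$\mathrm{Corr}(x, y) = \dfrac{\mathrm{Cov}(x, y)}{s(x)\, s(y)} = \dfrac{0.06667}{0.5164 \cdot 0.5164} = 0.25$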
SMC
$\mathrm{SMC} = \dfrac{f_{11} + f_{00}}{f_{01} + f_{10} + f_{11} + f_{00}} = \dfrac{3 + 1}{1 + 1 + 3 + 1} = 0.666$
Jaccard
$J = \dfrac{f_{11}}{f_{01} + f_{10} + f_{11}} = \dfrac{3}{1 + 1 + 3} = 0.6$
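SMC and Jaccard both come from the four match counts f11, f00, f01 and f10, so counting them once is enough to get both values. The sketch below does this for the six-component binary vectors x = (1, 1, 0, 1, 0, 1) and y = (1, 1, 1, 0, 0, 1). The function names are illustrative, and returning `None` from `jaccard` when no position contains a 1 is an assumption of this sketch.

```python
def binary_counts(x, y):
    """Count positions by value pair: f11, f00, f01 (x=0, y=1), f10 (x=1, y=0)."""
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    f00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    f01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    f10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    return f11, f00, f01, f10

def smc(x, y):
    """Simple matching coefficient: matching positions over all positions."""
    f11, f00, f01, f10 = binary_counts(x, y)
    return (f11 + f00) / (f01 + f10 + f11 + f00)

def jaccard(x, y):
    """Jaccard coefficient: 1-1 matches over positions where at least one is 1."""
    f11, _f00, f01, f10 = binary_counts(x, y)
    denominator = f01 + f10 + f11
    if denominator == 0:
        return None  # both vectors are all zeros: the ratio is 0/0
    return f11 / denominator

x = (1, 1, 0, 1, 0, 1)
y = (1, 1, 1, 0, 0, 1)
print(binary_counts(x, y), smc(x, y), jaccard(x, y))
```

Jaccard ignores the 0-0 matches, which is why it comes out lower than SMC whenever f00 is greater than zero and the vectors are not identical.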
Steven Tricahyadi
E) x = (2, −1, 0, 2, 0, −3), y = (−1, 1, −1, 0, 0, −1)
Cosine
$x \cdot y = (2)(-1) + (-1)(1) + (0)(-1) + (2)(0) + (0)(0) + (-3)(-1) = -2 - 1 + 0 + 0 + 0 + 3 = 0$
$\lVert x \rVert = \sqrt{(2)(2) + (-1)(-1) + (0)(0) + (2)(2) + (0)(0) + (-3)(-3)} = \sqrt{18} = 4.2426$
$\lVert y \rVert = \sqrt{(-1)(-1) + (1)(1) + (-1)(-1) + (0)(0) + (0)(0) + (-1)(-1)} = \sqrt{4} = 2$
$\cos(x, y) = \dfrac{0}{4.2426 \cdot 2} = 0$
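Correlation
$\mathrm{Mean}(x) = \dfrac{2 - 1 + 0 + 2 + 0 - 3}{6} = 0$
$\mathrm{Mean}(y) = \dfrac{-1 + 1 - 1 + 0 + 0 - 1}{6} = -0.3333$
$\mathrm{Cov}(x, y) = \dfrac{1}{6-1} \left[ (2-0)(-1+0.3333) + (-1-0)(1+0.3333) + (0-0)(-1+0.3333) + (2-0)(0+0.3333) + (0-0)(0+0.3333) + (-3-0)(-1+0.3333) \right]$
$\mathrm{Cov}(x, y) = \dfrac{1}{5} \left[ -\dfrac{4}{3} - \dfrac{4}{3} + 0 + \dfrac{2}{3} + 0 + 2 \right] = \dfrac{1}{5} \cdot 0 = 0$
$s(x) = \sqrt{\dfrac{1}{6-1} \sum_{k=1}^{6} (x_k - 0)^2} = \sqrt{\dfrac{18}{5}} = \sqrt{3.6} = 1.8974$
$s(y) = \sqrt{\dfrac{1}{6-1} \sum_{k=1}^{6} (y_k + 0.3333)^2} = \sqrt{\dfrac{3.3333}{5}} = 0.8165$
$\mathrm{Corr}(x, y) = \dfrac{\mathrm{Cov}(x, y)}{s(x)\, s(y)} = \dfrac{0}{1.8974 \cdot 0.8165} = 0$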
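Rounding the mean of y to −0.3333 makes it easy to pick up small errors in the covariance sum. A possible cross-check, sketched here with Python's standard `fractions` module, is to do the arithmetic for part E in exact rational numbers. The function name `exact_covariance` is an illustrative choice, not something from the original answer.

```python
from fractions import Fraction

def exact_covariance(x, y):
    """Sample covariance 1/(n-1) * sum((x_k - mean_x)(y_k - mean_y)), computed exactly."""
    n = len(x)
    xs = [Fraction(v) for v in x]
    ys = [Fraction(v) for v in y]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / (n - 1)

x = (2, -1, 0, 2, 0, -3)
y = (-1, 1, -1, 0, 0, -1)

dot = sum(a * b for a, b in zip(x, y))   # numerator of the cosine
cov_xy = exact_covariance(x, y)          # numerator of the correlation
var_x = exact_covariance(x, x)           # s(x) squared
var_y = exact_covariance(y, y)           # s(y) squared
print(dot, cov_xy, var_x, var_y)
```

Since the covariance of a vector with itself is its sample variance, the same function also yields s(x)² and s(y)², and the standard deviations are their square roots.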