
02450: Introduction to Machine Learning and Data Mining

Measures of similarity, summary statistics and probabilities

Georgios Arvanitidis

DTU Compute, Technical University of Denmark (DTU)


Today
Reading material: Chapter 4, Chapter 5

Feedback Groups of the day:
Lukas Blazek, Ebbe Abildgaard Bliksted, Villiam Bo,
Anna Keisha Mercy Kyei Boateng, Oskar Boesch,
Rodrigo Bonmati Salazar, Jonas Vesterager Borge,
Anna-Sofie Bøgegaard Børresen, Adrià Bosch,
Rares-Victor Botis, Astrid Lind Bouquin, Ahmed
Mansour Bourassine, Amir Bouslama, Niels Moll
Bown, Andreas Boye, Veda Bradley, Jacob Høberg
Brams, Jonas Branner, Max Brazhnyy, Saxe
Brochmann Brøten, William Bruce, Emil Wraae
Carlsen, Gonçalo Carvalho, Beatriz Braga de
Carvalho, gaia casotto, Mohamed Chakrouni, Yue
Chang, Lola Maud R Charles, Valentine Chassagne,
Olivia Wenya Chen, Xiao Chen, Xi Chen, Hongjin
Chen, Tongxi Chen, Ting-Hui Cheng, Mark Chew,
Shu-Han Chien, Shirisha Reddy Chintaguntla, Jia Wei
Nelson Choo, Karl Ursø Christensen, Sif Egelund
Christensen, Emma Klostergaard Christensen, Laura
Emilie Jul Christiansen, Pernille Christie, Oscar
Flensburg Clausen, Estel Clua i Sánchez, José Dinis
Coelho Da Silva Ferreira, Robert Corneliu Coman,
Alexandre Comas Gispert, Riccardo Conti, Euan
Thomas Cortes, Luísa Maria Coimbra Cortes,
Geoffrey Clément Benoît Côte, Tallula Catherine
Crean, Marc Cummings, Afonso Martins dos Reis
Filipe Cunha, Lukas Samuel Czekalla, Frederik
Danielsen, Lasse Schnell Danielsen

2 DTU Compute Lecture 3 12 September, 2023


Lecture Schedule
1 Introduction (29 August: C1)

Data: Feature extraction, and visualization
2 Data, feature extraction and PCA (5 September: C2, C3)
3 Measures of similarity, summary statistics and probabilities (12 September: C4, C5)
4 Probability densities and data visualization (19 September: C6, C7)

Supervised learning: Classification and regression
5 Decision trees and linear regression (26 September: C8, C9)
6 Overfitting, cross-validation and Nearest Neighbor (3 October: C10, C12; Project 1 due before 13:00)
7 Performance evaluation, Bayes, and Naive Bayes (10 October: C11, C13)
8 Artificial Neural Networks and Bias/Variance (24 October: C14, C15)
9 AUC and ensemble methods (31 October: C16, C17)

Unsupervised learning: Clustering and density estimation
10 K-means and hierarchical clustering (7 November: C18)
11 Mixture models and density estimation (14 November: C19, C20; Project 2 due before 13:00)
12 Association mining (21 November: C21)

Recap
13 Recap and discussion of the exam (28 November: C1-C21)

Online help: Discussion Forum (Piazza) on DTU Learn


Videos of lectures: https://panopto.dtu.dk
Streaming of lectures: Zoom (link on DTU Learn)
3 DTU Compute Lecture 3 12 September, 2023
Learning Objectives
• Compute measures of similarity/dissimilarity (Lp distance, cosine distance, etc.)
• Understand basics of probability theory and stochastic variables in the discrete
setting
• Understand probabilistic concepts such as expectations, independence and the
Bernoulli distribution
• Understand the maximum likelihood principle for repeated binary events

4 DTU Compute Lecture 3 12 September, 2023



PCA recap: Principal component analysis on images

5 DTU Compute Lecture 3 12 September, 2023



Pre-processing

[Handwritten sketch of the data matrix: each row is one datapoint.]
6 DTU Compute Lecture 3 12 September, 2023
Principal component analysis (PCA)

[Handwritten notes: subtract the mean and project a datapoint x (1×M) onto the principal directions V = [v1, . . . , vK] (M×K) to get the coordinates b = (x − µ)V (1×K); the reconstruction is x ≈ µ + b V^T.]
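A minimal numerical sketch (not the course toolbox code) of the projection and reconstruction above, using made-up data and numpy's SVD:

    import numpy as np

    # Toy data: N = 100 observations in M = 5 dimensions (made-up example).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))

    mu = X.mean(axis=0)                        # empirical mean (1 x M)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt.T                                   # columns are the principal directions

    K = 2                                      # keep the first K principal components
    B = (X - mu) @ V[:, :K]                    # coordinates b = (x - mu) V      (N x K)
    X_rec = mu + B @ V[:, :K].T                # reconstruction x ~ mu + b V^T   (N x M)

    print("mean squared reconstruction error:", np.mean((X - X_rec) ** 2))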
7 DTU Compute Lecture 3 12 September, 2023

PCA on face images

8 DTU Compute Lecture 3 12 September, 2023



PCA on face images

9 DTU Compute Lecture 3 12 September, 2023


PCA on face images

[Handwritten note: take a randomly selected point and reconstruct it on the subspace.]

10 DTU Compute Lecture 3 12 September, 2023



11 DTU Compute Lecture 3 12 September, 2023


[Handwritten note: the data lie in R³ but on a plane; if we use a single vector (a 1-dim subspace) and project the data onto it, we cannot reconstruct them perfectly.]

12 DTU Compute Lecture 3 12 September, 2023



Similarity / Dissimilarity measures

Similarity s(x, y): often between 0 and 1; higher means more similar.

Dissimilarity d(x, y): greater than or equal to 0; lower means more similar.

Classification: classify a document as having the same topic y as the document it is most similar / least dissimilar to.

Outlier detection: the observation most dissimilar to all other observations is an outlier.

13 DTU Compute Lecture 3 12 September, 2023



Dissimilarity measures

• General Minkowski distance (p-distance)

      d_p(x, y) = ( Σ_{j=1}^{M} |x_j − y_j|^p )^(1/p)

• One-norm (p = 1)

      d_1(x, y) = Σ_{j=1}^{M} |x_j − y_j|

  [Handwritten example: with y = [0, 0], set color(x) = red if d_1(x, y) ≤ 1, else color(x) = blue.]

• Euclidean (p = 2)

      d_2(x, y) = sqrt( Σ_{j=1}^{M} (x_j − y_j)² )

  [Handwritten note: similarly with d_2(x, y) = 1 and d_∞(x, y) = 1.]

• Max-norm distance (p = ∞)

      d_∞(x, y) = max{ |x_1 − y_1|, |x_2 − y_2|, . . . , |x_M − y_M| }

Usage: Regularization and alternative optimization targets. For instance, p = ∞ is very affected by outliers, p = 1 much less so.
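A small sketch (illustrative only) of the p-distances above; the helper name minkowski is made up, and p = np.inf gives the max-norm:

    import numpy as np

    def minkowski(x, y, p):
        """General Minkowski distance d_p(x, y); p = np.inf gives the max-norm."""
        diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
        if np.isinf(p):
            return diff.max()
        return (diff ** p).sum() ** (1.0 / p)

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 0.0, 3.5])
    for p in [1, 2, np.inf]:
        print(f"d_{p}(x, y) = {minkowski(x, y, p):.3f}")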

14 DTU Compute Lecture 3 12 September, 2023



Similarity measures
For binary vectors x, y of length K, let f_11 be the number of attributes k where x_k = 1 and y_k = 1, f_00 the number where x_k = 0 and y_k = 0, and f_10, f_01 the number of mismatches of each kind.

Simple Matching Coefficient (SMC)

      SMC(x, y) = (f_00 + f_11) / K

Jaccard Coefficient

      J(x, y) = f_11 / (K − f_00)

Cosine similarity

      cos(x, y) = x^T y / (‖x‖ ‖y‖)

Extended Jaccard coefficient

      EJ(x, y) = x^T y / (‖x‖² + ‖y‖² − x^T y)
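A sketch of the four similarity measures for binary vectors (the helper name and the two example vectors are made up):

    import numpy as np

    def similarities(x, y):
        """SMC, Jaccard, cosine and extended Jaccard for binary vectors x, y."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        K = len(x)
        f11 = np.sum((x == 1) & (y == 1))
        f00 = np.sum((x == 0) & (y == 0))
        xty = x @ y
        return {
            "SMC": (f00 + f11) / K,
            "Jaccard": f11 / (K - f00),
            "cosine": xty / (np.linalg.norm(x) * np.linalg.norm(y)),
            "ext. Jaccard": xty / (np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2 - xty),
        }

    print(similarities([1, 1, 0, 1, 0, 0], [1, 0, 0, 1, 1, 0]))

Note that for binary data the Jaccard and extended Jaccard coefficients coincide.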

15 DTU Compute Lecture 3 12 September, 2023



Quiz 1, similarity measures


Calculate the simple matching coefficient, Jaccard, cosine and extended Jaccard similarity between customer 1 and customer 2 in the market basket data below. Which of the following statements are true?

[The market basket data table, the handwritten working, and the numerators of options C and D are not recoverable from the extraction.]

A. SMC(o1, o2) = 3/5, J(o1, o2) = 1/2, cos(o1, o2) = 2/3
B. SMC(o1, o2) = 3/5, J(o1, o2) = 3/4, cos(o1, o2) = 2/3
C. SMC(o1, o2) = ?/5, J(o1, o2) = ?/3, cos(o1, o2) = ?/3
D. SMC(o1, o2) = ?/5, J(o1, o2) = ?/3, cos(o1, o2) = ?/3
E. Don't know.

16 DTU Compute Lecture 3 12 September, 2023


The problem is easily solved by using the inserted formula. We obtain SMC(o1, o2) = 3/5, J(o1, o2) = 1/2, cos(o1, o2) = 2/3, and therefore A is true. Since the data is binary, the extended Jaccard and the Jaccard coefficient agree.

17 DTU Compute Lecture 3 12 September, 2023



Invariance
Scale invariance
      d(x, y) = d(αx, y)

Translation invariance
      d(x, y) = d(α + x, y)

General invariances

18 DTU Compute Lecture 3 12 September, 2023


Transformations

[Handwritten example: two observations with ages 30 and 40 and incomes 50,000 and 49,000; the age difference contributes d_2(x1, x2) = 10 while the income difference contributes 1,000. Does this represent a "reasonable" distance?]
Standardization: Ensure a single attribute will not dominate:

      x̃_ik = (x_ik − µ̂_k) / σ̂_k

Combining heterogeneous attributes: Transform measures and combine

      s_Edu. = SMC(x_Edu., y_Edu.)
      s_Age. = a (a + d_1(x_Age., y_Age.))^(−1),   a = 1
      s(x, y) = (s_Edu. + s_Age.) / 2

Weighting: Attributes have different importance

      s(x, y) = 0.99 s_Edu. + 0.01 s_Age.
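A minimal sketch of the standardization step, applied to the made-up age/income example above:

    import numpy as np

    # Toy data matrix: column 0 = age, column 1 = income (made-up numbers).
    X = np.array([[30.0, 50_000.0],
                  [40.0, 49_000.0],
                  [25.0, 52_000.0]])

    # Standardize each attribute so that no single attribute dominates the distance.
    X_tilde = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    print("d_2 before standardization:", np.linalg.norm(X[0] - X[1]))
    print("d_2 after standardization: ", np.linalg.norm(X_tilde[0] - X_tilde[1]))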

19 DTU Compute Lecture 3 12 September, 2023


Empirical statistics
Given two samples x1, x2, . . . , xN ∈ R and y1, y2, . . . , yN ∈ R:

• Empirical mean

      µ̂ = x̄ = (1/N) Σ_{i=1}^{N} xi

• Empirical variance

      ŝ = var[x] = (1/(N − 1)) Σ_{i=1}^{N} (xi − µ̂)²

• Empirical covariance

      cov[x, y] = (1/(N − 1)) Σ_{i=1}^{N} (xi − µ̂x)(yi − µ̂y)

• Empirical standard deviation

      σ̂ = std[x] = √ŝ
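A small numerical sketch of these estimators (made-up samples; note the N − 1 denominators, which correspond to numpy's ddof=1):

    import numpy as np

    x = np.array([1.2, 0.7, 3.1, 2.4, 1.9])
    y = np.array([0.9, 1.1, 2.8, 2.0, 2.2])
    N = len(x)

    mu_x = x.mean()                                          # empirical mean
    var_x = np.sum((x - mu_x) ** 2) / (N - 1)                # empirical variance
    cov_xy = np.sum((x - mu_x) * (y - y.mean())) / (N - 1)   # empirical covariance
    std_x = np.sqrt(var_x)                                   # empirical standard deviation

    # numpy's built-ins give the same values:
    assert np.isclose(var_x, x.var(ddof=1))
    assert np.isclose(cov_xy, np.cov(x, y, ddof=1)[0, 1])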

20 DTU Compute Lecture 3 12 September, 2023



Correlation

• Measure of degree of linear relationship

      cor[x, y] = cov[x, y] / (σ̂x σ̂y)
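Reusing the made-up samples from the previous sketch, the empirical correlation is the covariance rescaled by the two standard deviations; numpy's corrcoef gives the same value:

    import numpy as np

    x = np.array([1.2, 0.7, 3.1, 2.4, 1.9])
    y = np.array([0.9, 1.1, 2.8, 2.0, 2.2])

    cor_xy = np.cov(x, y, ddof=1)[0, 1] / (x.std(ddof=1) * y.std(ddof=1))
    assert np.isclose(cor_xy, np.corrcoef(x, y)[0, 1])
    print(cor_xy)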

21 DTU Compute Lecture 3 12 September, 2023


Quantiles

[Handwritten sketch: a number line with the quartiles and the median marked.]

Given N observations of an attribute x1, x2, . . . , xN ∈ R.

Quantiles describe the points that divide the underlying distribution into
intervals that are equally probable:
• The one 2-quantile (median) divides the distribution in two intervals.
• The three 4-quantiles (quartiles) divide the distribution in four intervals.
• The 99 100-quantiles (percentiles) divide the distribution in 100 intervals.

The median is the same as the 2nd quartile or the 50th percentile.

E.g., we can (approximately) find the median by


• Sort the observations in ascending order x(1) ≤ x(2) ≤ · · · ≤ x(N). Then

      median[x] = x((N+1)/2)                    if N is odd
      median[x] = ( x(N/2) + x(N/2+1) ) / 2     if N is even.
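A small sketch of the sort-and-pick median rule above, checked against numpy (the helper name is made up):

    import numpy as np

    def median_by_sorting(x):
        """Median via the sorting rule on the slide (0-based indexing)."""
        xs = np.sort(x)
        N = len(xs)
        if N % 2 == 1:                                   # N odd: middle observation
            return xs[(N + 1) // 2 - 1]
        return 0.5 * (xs[N // 2 - 1] + xs[N // 2])       # N even: average the two middle ones

    x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0])
    print(median_by_sorting(x), np.median(x))            # both give 3.0
    print(np.quantile(x, [0.25, 0.5, 0.75]))             # the three quartiles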

22 DTU Compute Lecture 3 12 September, 2023


Probabilities

Pragmatically: A big part of AI is dealing with uncertainty and incomplete information. Probability is the formal framework for doing so.
Algorithmically: Whether an image belongs to a particular category is a discrete event; the probability that it belongs to the category is continuous. Algorithmically, it is easier to optimize continuous quantities.
Convenience: There are boilerplate ideas for transforming a probabilistic assumption into an algorithm (maximum likelihood).

23 DTU Compute Lecture 3 12 September, 2023



Probabilities

We reason about a proposition A in light of evidence B:


P (A|B) = x
The degree of belief that A is true, given that B is accepted as true, is at level x
• A number between 0 and 1
• A and B are always binary (true/false) propositions
• Represents a state of knowledge
24 DTU Compute Lecture 3 12 September, 2023

Probabilities: Trial example

G : The accused is guilty


E1 : A car similar to his was seen at the crime scene.
E2 : A large sum of money was found in his possession.
E3 : His fingerprints were found at the door of the bank.
Probabilities express states-of-knowledge
E ≡ E1 and E2 and E3
P (G|E) > P (G|E2)
25 DTU Compute Lecture 3 12 September, 2023

Binary propositions
A binary proposition is a statement which is either true or false (we might
not know, but someone with complete knowledge would)
A : In 49 BCE, Caesar crossed the Rubicon
B : Acceleration sensor 39 measures more than 0.85
C : Patient 901 has high cholesterol
Propositions can be combined with and, or and not:

      AB ≡ True if A and B are both true
      A + B ≡ True if either A or B are true
      Ā ≡ True if A is false

We define two special propositions which are always true/false:

      1 : A proposition which is always true
      0 : A proposition which is always false

...and the following identities: A1 = A, A + Ā = 1, the negation of Ā is A, and

      A(B1 + B2 + · · · + Bn) = AB1 + AB2 + · · · + ABn
26 DTU Compute Lecture 3 12 September, 2023

Quiz 2, Probabilities

Assume we define the following 4 boolean variables:

R1 : Handed in report 1
R2 : Handed in report 2
R3 : Handed in report 3
F : Student failed 02450

How would you express the probability of the statement: If a student hands in report 1, 2 and 3, the chance of passing 02450 is greater than 90%?

A. P(R1 R2 R3 | F̄) > 0.9
B. P(F̄ | R1 + R2 + R3) > 0.9
C. P(F̄ | R1 R2 R3) > 0.9
D. P(R1 + R2 + R3 | F̄) > 0.9
E. Don't know.

[Handwritten annotation: F̄ = passed.]

27 DTU Compute Lecture 3 12 September, 2023


Passing can be defined as not failing. Therefore, express the statement as P(F̄ | R1 R2 R3) > 0.9, namely: if it is true that reports 1, 2 and 3 are all handed in, what is the chance of passing?

28 DTU Compute Lecture 3 12 September, 2023



Rules of probability

The sum rule:      P(A|C) + P(Ā|C) = 1

The product rule:  P(AB|C) = P(B|AC) P(A|C)

Interpretation:

P(A|B) = 0 (interpretation: given B is true, A is certainly false)
P(A|B) = 1 (interpretation: given B is true, A is certainly true)

We also use the shorthand:

P(A|1) = P(A)

Remarkably, this is the mathematical basis for this course

29 DTU Compute Lecture 3 12 September, 2023



Marginalization and Bayes’ theorem


P(B|C) = P(B|C) [ P(A|BC) + P(Ā|BC) ] = P(AB|C) + P(ĀB|C)
        = P(B|AC) P(A|C) + P(B|ĀC) P(Ā|C).

P(A|BC) = P(B|AC) P(A|C) / P(B|C)
        = P(B|AC) P(A|C) / [ P(B|AC) P(A|C) + P(B|ĀC) P(Ā|C) ].

30 DTU Compute Lecture 3 12 September, 2023



DNA

Crimes may be solved by matching crime-scene DNA to DNA in a database.

• If the two samples are from the same person, a DNA test will always give a positive match: P(D|G) = 1.
• If the DNA are from different persons, the test will incorrectly give a positive match one time out of a million: P(D|Ḡ) = 10^(−6).

A crime is committed in Racoon City by an unidentified male. Assume all 8000 possible perpetrators undergo a DNA test, and suppose the DNA test gives a positive result for George. What is the chance George is guilty?

G : George is guilty,   D : There was a positive DNA match

[Handwritten working: P(G|D) = P(D|G) P(G) / P(D), with P(G) = 1/8000, P(G) + P(Ḡ) = 1, and P(D) = P(D|G) P(G) + P(D|Ḡ) P(Ḡ).]

31 DTU Compute Lecture 3 12 September, 2023


Solution:

P(G|D) = P(D|G) P(G) / [ P(D|G) P(G) + P(D|Ḡ) P(Ḡ) ]
       = (1 × 1/8000) / ( 1 × 1/8000 + 10^(−6) × (1 − 1/8000) )
       = 1 − 1/126 ≈ 99%
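The same calculation as a short sketch in code, using exactly the numbers from the slide:

    # Bayes' theorem for the DNA example: P(G|D) = P(D|G) P(G) / P(D).
    p_G = 1 / 8000            # prior: George is one of 8000 possible perpetrators
    p_D_given_G = 1.0         # the test always matches the true perpetrator
    p_D_given_notG = 1e-6     # false-positive rate: one in a million

    p_D = p_D_given_G * p_G + p_D_given_notG * (1 - p_G)   # marginalization
    p_G_given_D = p_D_given_G * p_G / p_D

    print(f"P(G|D) = {p_G_given_D:.4f}")   # about 0.992, i.e. roughly 99%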

32 DTU Compute Lecture 3 12 September, 2023



Exclusive and exhaustive events

A1 : The side showing 1 faces up.   A2 : The side showing 2 faces up.
A3 : The side showing 3 faces up.   A4 : The side showing 4 faces up.
A5 : The side showing 5 faces up.   A6 : The side showing 6 faces up.

• When no two propositions can be true at the same time, they are said to be mutually exclusive: Ai Aj = 0 for i ≠ j, so P(Ai Aj) = 0.
• Consider any two events A and B:

      P(A + B) = P(A) + P(B) − P(AB)

• In general, for n mutually exclusive events

      P(A1 + A2 + · · · + An) = Σ_{i=1}^{n} P(Ai)

• A set of events is exhaustive if one has to be true: A1 + · · · + An = 1. Then:

      Σ_{i=1}^{n} P(Ai) = P(A1 + A2 + · · · + An) = 1

33 DTU Compute Lecture 3 12 September, 2023


Exclusive and exhaustive events

A1 : The side showing 1 faces up.   A2 : The side showing 2 faces up.
A3 : The side showing 3 faces up.   A4 : The side showing 4 faces up.
A5 : The side showing 5 faces up.   A6 : The side showing 6 faces up.

• When no two propositions can be true at the same time, they are said to be mutually exclusive: Ai Aj = 0 for i ≠ j
• Consider any two events A and B:

      P(A + B) = 1 − P(Ā B̄) = 1 − P(Ā|B̄) P(B̄)
               = 1 − [1 − P(A|B̄)] P(B̄) = P(B) + P(AB̄)
               = P(B) + P(B̄|A) P(A) = P(B) + [1 − P(B|A)] P(A)
               = P(A) + P(B) − P(AB).

• In general, for n mutually exclusive events: P(A1 + A2 + · · · + An) = Σ_{i=1}^{n} P(Ai)
• A set of events is exhaustive if one has to be true: A1 + · · · + An = 1. Then:

      Σ_{i=1}^{n} P(Ai) = P(A1 + A2 + · · · + An) = 1

34 DTU Compute Lecture 3 12 September, 2023



Stochastic variables
• Often, we will measure numerical quantities (number of children, age of a
patient, etc.)
• Suppose a quantity X (number of children) takes a value x = 3. We can write
this as the binary event X3 and in general:

Xx : {The binary event that X is equal to the number x}

• Stochastic variables simplify this notation by the definition p(x) ≡ P(Xx)

35 DTU Compute Lecture 3 12 September, 2023



Quiz 3, Avila bible (Fall 2018)


We will consider a dataset based on the Avila bible. We wish to predict the copyist (y = 1, 2, 3) of a bible based on the two typographic attributes upperm and mr/is. We suppose the attributes have been binarized such that upperm corresponds to x̃2 = 0, 1 and mr/is to x̃10 = 0, 1. Suppose the probability for each of the configurations of x̃2 and x̃10 conditional on the copyist y are as given in Table 1, and the prior probability of the copyists is

p(y = 1) = 0.316,  p(y = 2) = 0.356,  p(y = 3) = 0.328.

p(x̃2, x̃10 | y)         y = 1   y = 2   y = 3
x̃2 = 0, x̃10 = 0         0.19    0.3     0.19
x̃2 = 0, x̃10 = 1         0.22    0.3     0.26
x̃2 = 1, x̃10 = 0         0.25    0.2     0.35
x̃2 = 1, x̃10 = 1         0.34    0.2     0.2

Table 1: Probability of observing particular values of x̃2 and x̃10 conditional on y.

Using this, what is then the probability an observation was authored by copyist 1 given that x̃2 = 1 and x̃10 = 0?

A. p(y = 1 | x̃2 = 1, x̃10 = 0) = 0.25
B. p(y = 1 | x̃2 = 1, x̃10 = 0) = 0.313
C. p(y = 1 | x̃2 = 1, x̃10 = 0) = 0.262
D. p(y = 1 | x̃2 = 1, x̃10 = 0) = 0.298
E. Don't know.
36 DTU Compute Lecture 3 12 September, 2023
The problem is solved by a simple application of Bayes' theorem:

p(y = 1 | x̃2 = 1, x̃10 = 0) = p(x̃2 = 1, x̃10 = 0 | y = 1) p(y = 1) / Σ_{k=1}^{3} p(x̃2 = 1, x̃10 = 0 | y = k) p(y = k)

The values of p(y) are given in the problem text and the values of p(x̃2 = 1, x̃10 = 0 | y) in Table 1. Inserting the values we see option D is correct.
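The same computation as a short sketch in code, using the values from Table 1 and the priors given in the question:

    import numpy as np

    # p(x2 = 1, x10 = 0 | y) for y = 1, 2, 3 (from Table 1) and the class priors p(y).
    p_x_given_y = np.array([0.25, 0.2, 0.35])
    p_y = np.array([0.316, 0.356, 0.328])

    posterior = p_x_given_y * p_y / np.sum(p_x_given_y * p_y)
    print(posterior[0])   # p(y = 1 | x2 = 1, x10 = 0) is about 0.298, i.e. option D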

37 DTU Compute Lecture 3 12 September, 2023



Independence

Independent: p(xi , yj ) = p(xi )p(yj )


Conditionally independent given zk : p(xi , yj |zk ) = p(xi |zk )p(yj |zk )

38 DTU Compute Lecture 3 12 September, 2023



Expectations

Expectation:   E[f] = Σ_{i=1}^{N} f(xi) p(xi).                                              (2)

Mean:   E[x] = Σ_{i=1}^{N} xi p(xi),      Variance:   Var[x] = Σ_{i=1}^{N} (xi − E[x])² p(xi).   (3)

Example: Uniform probability

      p(xi) = 1/N

      E[f] = (1/N) Σ_{i=1}^{N} f(xi)

      E[x] = ( Σ_{i=1}^{N} xi ) / N

      Var[x] = (1/N) Σ_{i=1}^{N} (xi − E[x])²
39 DTU Compute Lecture 3 12 September, 2023

40 DTU Compute Lecture 3 12 September, 2023


The Bernoulli distribution

• Let b = 0, 1 denote a binary event.


• For instance,
• b = 0 corresponds to heads, and b = 1 to tails, or
• b = 0 corresponds to a person being ill, and b = 1 that a person is well.
• The probability of b is expressed using a parameter θ in the unit interval [0, 1]

      Bernoulli distribution:   p(b|θ) = θ^b (1 − θ)^(1−b).

[Handwritten check: p(b = 0|θ) = θ⁰ (1 − θ)¹ = 1 − θ and p(b = 1|θ) = θ¹ (1 − θ)⁰ = θ.]
41 DTU Compute Lecture 3 12 September, 2023



The Bernoulli distribution, repeated events

• Suppose we observe a sequence b1 , . . . , bN of Bernoulli (binary) events.


• For instance, for N patients we record whether person 1 is ill or well (b1 = 0 or
b1 = 1) and up to whether patient N is ill or well (bN = 0 or bN = 1)
• When we know θ (the chance a person is well or ill), the events are independent

      Bernoulli distribution:   p(b|θ) = θ^b (1 − θ)^(1−b).

      p(b1, · · · , bN |θ) = Π_{i=1}^{N} p(bi|θ) = Π_{i=1}^{N} θ^(bi) (1 − θ)^(1−bi) = θ^(Σ_i bi) (1 − θ)^(N − Σ_i bi)
                           = θ^m (1 − θ)^(N−m),      m = b1 + b2 + · · · + bN

[Handwritten example: b1 = 1, b2 = 0, b3 = 1, b4 = 1, . . . , bN = 0.]
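A quick numerical check (with a made-up sequence) that the likelihood only depends on the count m = b1 + · · · + bN:

    import numpy as np

    b = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])   # made-up sequence of binary events
    theta = 0.6
    m, N = b.sum(), len(b)

    likelihood_product = np.prod(theta ** b * (1 - theta) ** (1 - b))
    likelihood_compact = theta ** m * (1 - theta) ** (N - m)
    assert np.isclose(likelihood_product, likelihood_compact)
    print(likelihood_compact)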
42 DTU Compute Lecture 3 12 September, 2023

The Bernoulli distribution, maximum likelihood


[Handwritten sketch: example likelihood curves with maxima at θ* = 0.75 and θ* = 0.5; the prediction probability for a new point is p(b_new | θ*).]

      f(θ) = p(b1, . . . , bN |θ) = θ^m (1 − θ)^(N−m)

An idea for selecting θ* is Maximum likelihood:

      θ* = arg max_θ  p(b1, . . . , bN |θ)

The value of θ according to which the data is most plausible.
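A sketch of the maximum-likelihood idea for this model: evaluate the likelihood on a grid of θ values and pick the maximizer, which for the Bernoulli model is the sample fraction m/N (data made up):

    import numpy as np

    b = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])   # observed binary events (made up)
    m, N = b.sum(), len(b)

    thetas = np.linspace(0.001, 0.999, 999)
    likelihood = thetas ** m * (1 - thetas) ** (N - m)

    theta_star = thetas[np.argmax(likelihood)]
    print(theta_star, m / N)                        # both are (approximately) 0.7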

43 DTU Compute Lecture 3 12 September, 2023


Resources

https://bayes.wustl.edu  Classical textbook which treats probabilities as states-of-knowledge and discusses many practical and philosophical issues (this book converted me to ML!) (https://bayes.wustl.edu/etj/prob/book.pdf)

https://02402.compute.dtu.dk  A more in-depth description of summary statistics (see chapter 1) (https://02402.compute.dtu.dk)

https://www.khanacademy.org  An excellent introduction to probability theory which we recommend as a go-to resource (https://www.khanacademy.org/math/statistics-probability/probability-library)

http://citeseerx.ist.psu.edu  A more in-depth discussion of Bayes in the court room (http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=EFE0328140036A4C668BB5B9FC76C9BE?doi=10.1.1.599.8675&rep=rep1&type=pdf)

44 DTU Compute Lecture 3 12 September, 2023
