
PRNN ASSIGNMENT – 2

M. Renuka, Sr. No.: 14794, MTech (Res), EE


Q1:
This problem involves estimating a mixture density using the EM algorithm. Four data sets are given, each
of which is drawn from a mixture of four one-dimensional Gaussians.
Details of data:
I. Means and variances: μ1 = 0, μ2 = 4, μ3 = 8, μ4 = 12, σi² = 2
(a) Mixture coefficients: λ = (0.25, 0.25, 0.25, 0.25)
(b) Mixture coefficients: λ = (0.1, 0.3, 0.4, 0.2)
II. Means and variances: μ1 = 0, μ2 = 2, μ3 = 4, μ4 = 6, σi² = 2
(a) Mixture coefficients: λ = (0.25, 0.25, 0.25, 0.25)
(b) Mixture coefficients: λ = (0.1, 0.3, 0.4, 0.2)
Explanation:
The Expectation-Maximization (EM) algorithm is used to estimate the mixture density model. The EM
update equations are applied iteratively to estimate the means, variances, mixture coefficients and
responsibilities. Iteration stops when the absolute difference between the current and previous
log-likelihood values is less than 0.01.
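As a rough sketch of this procedure (one-dimensional data, a fixed number of components, and the 0.01 log-likelihood stopping rule described above), the EM updates could be implemented as follows. The function and variable names are illustrative; this is not necessarily the exact code behind the reported numbers.

```python
import numpy as np

def em_gmm_1d(x, means, variances, weights, tol=0.01, max_iter=500):
    """Fit a 1-D Gaussian mixture by EM; stop when the log-likelihood
    changes by less than tol (0.01 in this assignment)."""
    x = np.asarray(x, dtype=float)
    means = np.array(means, dtype=float)
    variances = np.array(variances, dtype=float)
    weights = np.array(weights, dtype=float)
    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: responsibilities gamma[n, k] = p(component k | x_n)
        dens = weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances) \
               / np.sqrt(2.0 * np.pi * variances)
        ll = np.sum(np.log(dens.sum(axis=1)))
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities
        Nk = gamma.sum(axis=0)
        means = (gamma * x[:, None]).sum(axis=0) / Nk
        variances = (gamma * (x[:, None] - means) ** 2).sum(axis=0) / Nk
        weights = Nk / x.size
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return means, variances, weights, ll

# Hypothetical usage with the random initial values reported below:
# params = em_gmm_1d(data, means=[1.5, 2.8, 9.9, 11.45],
#                    variances=[0.9, 2.7, 3.4, 2.2], weights=[0.25] * 4)
```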
Results:
Means and variances: μ1 = 0, μ2 = 4, μ3 = 8, μ4 = 12, σi² = 2
(a) Mixture coefficients: λ = (0.25, 0.25, 0.25, 0.25)
Initialization using k-means clustering algorithm:
Means: [ 0.392 8.570 4.75 12.485]
Variances: [ 1.661 1.175 1.448 1.443 ]
Convex coefficients: [0.25, 0.25, 0.25, 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  1000 samples:
    Means: 3.864, 11.967, 7.981, 0.236
    Variances: 2.058, 2.239, 2.384, 1.971
    Mixture coefficients: 0.244, 0.235, 0.267, 0.252
  500 samples:
    Means: 0.393, 12.621, 4.79, 8.814
    Variances: 1.956, 1.609, 3.705, 2.75
    Mixture coefficients: 0.286, 0.164, 0.284, 0.264

Initialization using random values:


Means: [ 1.5 2.8 9.9 11.45]
variances: [ 0.9 2.7 3.4 2.2]
convex coefficients: [ 0.25 0.25 0.25 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  Means: 12.25829065, 3.6784469, 9.05761227, 12.93881526, 0.37893025, 7.85906749, 0.25113002, 4.34910373
  Variances: 1.92048256, 1.25302852, 5.64241879, 1.08190963, 2.1401904, 4.8833603, 1.75769022, 5.90657845
  Mixture coefficients: 0.18528773, 0.15717964, 0.35232415, 0.10171543, 0.27488474, 0.38264789, 0.24240155, 0.30355887
(b) Mixture coefficients: λ = (0.1, 0.3, 0.4, 0.2)
Initialization using k-means clustering algorithm:
Means: [ 0.19369554 7.94349252 4.14737299 12.03507276]
Variances: [ 1.62768382 1.2086989 1.00322176 1.67682529]
Convex coefficients: [0.25, 0.25, 0.25, 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  1000 samples:
    Means: 0.41243477, 8.04201751, 3.89869537, 12.60059602
    Variances: 2.27914732, 2.31114475, 1.0858704, 1.52660472
    Mixture coefficients: 0.14780525, 0.44358003, 0.22794603, 0.18066869
  500 samples:
    Means: 7.76352293, 0.03975008, 12.13636008, 3.89781969
    Variances: 2.89059773, 2.28833371, 1.95865604, 1.58020699
    Mixture coefficients: 0.43560657, 0.12994418, 0.15565546, 0.27879380

Initialization using random values:


Means: [ 1.5 2.8 9.9 11.45]
Variances: [ 0.9 2.7 3.4 2.2]
Convex coefficients: [ 0.25 0.25 0.25 0.25]

Number of data samples: 1000 and 500
Estimated parameters:
  Means: 0.65373244, 4.43670409, 8.99909169, 3.9833741, 8.47397735, 12.16314289, 2.38597548, -1.22288273
  Variances: 1.19151201, 5.41281454, 8.17225195, 5.2586788, 1.80223968, 1.86780262, 4.51487774, 0.68610243
  Mixture coefficients: 0.06191664, 0.49390164, 0.58070149, 0.31464642, 0.26222952, 0.1819522, 0.05685935, 0.04779275

Means and variances: μ1 = 0, μ2 = 2, μ3 = 4, μ4 = 6, σi² = 2


a) Mixture coefficients: λ = (0.25, 0.25, 0.25, 0.25)
Initialization using k-means clustering algorithm:
Means:[ 4.15122552 -0.52552276 6.59405083 1.8548552 ]
Variances:[ 0.41437168 0.84328951 0.89881232 0.44422335]
convex coefficients:[0.25, 0.25, 0.25, 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  1000 samples:
    Means: 3.98196298, 0.84150743, 6.07766755, 5.98199189
    Variances: 0.91949574, 0.76572659, 1.75023034, 1.99001761
    Mixture coefficients: 0.29003727, 0.20928112, 0.24976450, 0.25091710
  500 samples:
    Means: -0.2309566, 3.61432439, 1.68508014, -1.57751029
    Variances: 1.53849932, 0.91925715, 1.11978284, 0.40930648
    Mixture coefficients: 0.29956457, 0.33595483, 0.28199504, 0.08248556

Initialization using random values:


Means:[ 0.5 2.8 5.1 5.8]
Variances:[ 0.92 2.3 3.46 2.24]
convex coefficients:[ 0.25 0.25 0.25 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  Means: 3.92409613, 5.87708489, -4.78542496, 3.11354441, 0.03004335, 2.15423479, 0.04556595, 9.80261266
  Variances: 2.2577949, 2.21361, 3.47412482, 4.5591313, 1.68231266, 3.26459354, 1.82047346, 0.02900859
  Mixture coefficients: 0.25845686, 0.22958064, 0.3769312, 0.38075661, 0.19181533, 0.32014717, 0.23997685, 0.00233534
b) Mixture coefficients: λ = (0.1, 0.3, 0.4, 0.2)
Initialization using k-means clustering algorithm:
Means: [ 4.32616535 2.2879091 6.71675145 -0.15379188]
Variances: [ 0.3988787 0.42015898 0.94862613 1.09906159]
Convex coefficients: [0.25, 0.25, 0.25, 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  1000 samples:
    Means: 4.18442402, 3.73301298, 6.27277664, 2.03378372
    Variances: 1.33864308, 1.52026073, 1.93084925, 1.42999243
    Mixture coefficients: 0.30380456, 0.29670582, 0.18984304, 0.20964658
  500 samples:
    Means: 2.36756763, 5.69157774, 0.968141454, 0.43883713
    Variances: 1.89928169, 2.34973117, 3.27172711, 3.57233625
    Mixture coefficients: 0.24378932, 0.33626076, 0.29028521, 0.12966471

Initialization using random values:


Means:[ 0.5 2.8 5.1 5.8]
Variances:[ 0.92 2.3 3.46 2.24]
convex coefficients:[ 0.25 0.25 0.25 0.25]
Number of data samples: 1000 and 500
Estimated parameters:
  Means: 4.42623909, 1.49262621, 3.02602852, 9.27485898, 2.33093529, 7.87752826, 5.99020967, 2.50752538
  Variances: 3.44446978, 4.31674734, 2.0821693, 0.22802885, 3.1350812, 0.53218903, 1.52620706, 5.30478667
  Mixture coefficients: 0.53267028, 0.19072128, 0.32092395, 0.00676934, 0.26211833, 0.01449011, 0.2157721, 0.4565346
Observations:
1. As the sample size increases, the magnitude of the log-likelihood increases (it is a sum over
more samples) and the EM algorithm gives better parameter estimates, since EM maximizes the
expectation of the complete-data log-likelihood (equivalently, minimizes the negative log-likelihood).
2. As the iteration number increases, the log-likelihood increases (the negative log-likelihood
decreases), which validates the convergence of the EM algorithm.
3. The EM algorithm is very sensitive to initialization. It performs better when the initial values
are generated by the k-means clustering algorithm (k = 4), since k-means partitions the data samples
into k clusters in which each sample belongs to the cluster with the nearest mean, and the cluster
means serve as prototypes. Randomly initialized EM does not perform as well as k-means-initialized
EM; in particular, if the randomly initialized means of the densities are very close to each other,
EM does not give accurate estimates. (A sketch of such an initialization follows this list.)
4. The accuracy of the EM algorithm depends on the number of mixture components k. With k = 4, the
estimated density fits the data histogram well, because the true model has 4 mixture components.
With k = 2 (k < 4), the estimated density does not fit the data histogram, since the true model has
4 components. With k = 6 (k > 4), the estimated density overfits the data histogram.
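To illustrate the k-means-based initialization mentioned in observation 3, a minimal sketch is given below. It assumes scikit-learn's KMeans is available and, as in the reported runs, initializes the mixture coefficients uniformly; the function name and details are illustrative rather than the exact code used.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_init(x, k=4):
    """Initial GMM parameters derived from a k-means partition of 1-D data."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(x)
    means = np.array([x[labels == j].mean() for j in range(k)])
    variances = np.array([x[labels == j].var() for j in range(k)])
    weights = np.full(k, 1.0 / k)   # uniform initial mixture coefficients, as reported
    return means, variances, weights
```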
Q2:
Results:
(a): Considering the full data size
Means:
  Component 1: [ 0.0084, 0.0385, 0.0401, 0.0914, 0.2851699, 0.007, 0.060, 0.0475, 0.003, 0.092]
  Component 2: [ 1.802, 1.985, 1.837, 1.712, 1.927, 1.7469, 2.013, 1.465, 2.071, 1.919]
Variances (diagonal elements of the covariance matrices):
  Component 1: [ 2.805, 1.984, 1.367, 2.712, 1.925, 2.464, 2.216, 2.465, 2.452, 2.019]
  Component 2: [ 2.678, 2.345, 1.736, 2.687, 2.223, 1.371, 2.516, 2.896, 1.937, 1.959]
Mixture coefficients: [0.456, 0.544]

(b): Considering the full data size

Means:
  Component 1: [ 0.0145, 0.0293, 0.0145, 0.0457, 0.3857, 0.0165, 0.2604, 0.0537, 0.0433, 0.0527]
  Component 2: [ 1.9245, 0.9586, 1.3456, 1.9211, 2.2751, 2.6759, 2.0135, 2.0567, 1.8956, 2.3034]
Variances (diagonal elements of the covariance matrices):
  Component 1: [ 1.342, 2.345, 2.567, 1.854, 1.456, 2.456, 1.789, 2.344, 1.355, 2.012]
  Component 2: [ 1.834, 2.445, 2.836, 1.687, 2.356, 2.171, 1.576, 2.578, 1.872, 2.192]
Mixture coefficients: [0.259, 0.741]

Observations:
1. As the sample size increases, the magnitude of the log-likelihood increases (it is a sum over
more samples) and the EM algorithm gives better estimates, since EM maximizes the expectation of
the complete-data log-likelihood (equivalently, minimizes the negative log-likelihood).
2. As the iteration number increases, the log-likelihood increases (the negative log-likelihood
decreases), which validates the convergence of the EM algorithm; the log-likelihood can be
monitored as in the sketch after this list.
3. The EM algorithm is very sensitive to initialization. It performs better when the initial values
are generated by the k-means clustering algorithm (with k equal to the number of mixture
components), since k-means partitions the data samples into k clusters in which each sample belongs
to the cluster with the nearest mean, and the cluster means serve as prototypes. Randomly
initialized EM does not perform as well as k-means-initialized EM; if the randomly initialized
means are very close to each other, EM does not give accurate estimates.
4. The accuracy of the EM algorithm depends on the number of mixture components k: the estimated
density fits the data histogram well when k matches the number of components in the true model,
underfits the histogram for smaller k, and overfits it for larger k.
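The log-likelihood behaviour noted in observations 1 and 2 can be checked by evaluating the mixture log-likelihood after every EM iteration. A minimal sketch for the multivariate case is shown below, assuming diagonal covariance matrices (only the diagonal elements are reported above); the function name and argument layout are illustrative.

```python
import numpy as np

def diag_gmm_log_likelihood(X, means, variances, weights):
    """Log-likelihood of X (N x d) under a Gaussian mixture with diagonal
    covariances; EM should increase this value at every iteration."""
    N, d = X.shape
    mixture = np.zeros(N)
    for mu, var, w in zip(means, variances, weights):
        diff = X - mu                                              # (N, d)
        log_norm = -0.5 * (d * np.log(2.0 * np.pi) + np.sum(np.log(var)))
        mixture += w * np.exp(log_norm - 0.5 * np.sum(diff ** 2 / var, axis=1))
    return np.sum(np.log(mixture))
```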
Q3:
Explanation:
This is a 2-class classification problem in which the class-conditional densities are mixtures of
Gaussians. Bayes classifiers are implemented.
The class-conditional densities are estimated in two ways:
(i) each class-conditional density is modelled as a mixture of two Gaussians and estimated using
the EM algorithm;
(ii) each class-conditional density is modelled as a single Gaussian and estimated using the
maximum-likelihood (ML) method.
The accuracies of these two Bayes classifiers are compared, and both are also compared with the
nearest-neighbour classifier.
Given data: component densities: f1 – N(0,2), f2 – N(2,2), f3 – N(4,2), f4 – N(6,2)
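A minimal sketch of the ML (single-Gaussian) density estimate and the Bayes decision rule is shown below, assuming one-dimensional data and equal class priors; the names are illustrative, and the EM-estimated mixture densities can be plugged in in the same way.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def fit_ml_gaussian(x):
    """ML estimates for a single Gaussian: sample mean and sample variance."""
    return np.mean(x), np.var(x)

def bayes_classify(x, density1, density2, prior1=0.5, prior2=0.5):
    """Assign class 1 where p(x|C1)P(C1) >= p(x|C2)P(C2), else class 2."""
    return np.where(density1(x) * prior1 >= density2(x) * prior2, 1, 2)

# Hypothetical usage with the single-Gaussian (ML) class-conditional densities:
# m1, v1 = fit_ml_gaussian(train_class1); m2, v2 = fit_ml_gaussian(train_class2)
# labels = bayes_classify(test_x,
#                         lambda x: gaussian_pdf(x, m1, v1),
#                         lambda x: gaussian_pdf(x, m2, v2))
```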
DataSet1:
Histogram:

Class 1: 0.5f1 + 0.5f2; Class 2: 0.5f3 + 0.5f4


Results:
Initialized values for EM algorithm for class one:
means:[0.9, 3]
variances:[1.5, 3]
convex coefficients:[0.5, 0.5]
Initialized values for EM algorithm for class two:
means:[3.2, 7.8]
variances:[1.5, 3.4]
convex coefficients:[0.5, 0.5]
For full size of training data,
Estimated parameters using ML:
Class1: mean = 0.92773412 variance = 3.04458713
Class2: mean = 5.10724124 variance = 2.97988952
Estimated parameters using EM:
Class 1:
Means: 0.45327577, 1.79083831
Variances: 2.68604845, 2.54236184
Convex coefficients: 0.6452, 0.3547
Class 2:
Means: 4.48803422, 5.89692537
Variances: 2.66078359, 2.2742712
Convex coefficients: 0.560, 0.4394
DataSet2:
Histogram:

Class 1: 0.5f1 + 0.5f3; Class 2: 0.5f2 + 0.5f4


Results:
Initialized values for EM algorithm for class one:
means:[0.9, 3]
variances:[1.5, 3]
convex coefficients:[0.5, 0.5]
Initialized values for EM algorithm for class two:
means:[3.2, 7.8]
variances:[1.5, 3.4]
convex coefficients:[0.5, 0.5]
For full size of training data,
Estimated parameters using ML:
Class1: mean = 1.8530873 variance = 5.65446024
Class2: mean = 4.01297498 variance = 5.97318128
Estimated parameters using EM:
Class 1:
Means: -0.12535948, 3.71909734
Variances: 1.68473773, 2.22478517
Convex coefficients: 0.4853, 0.51467
Class 2:
Means: 2.43829025, 6.18947128
Variances: 2.7963326, 2.19972346
Convex coefficients: 0.5802, 0.4197

Observations:
1. In all cases, the Bayes classifier, using either the ML estimates or the EM estimates of the
class-conditional densities, outperforms the nearest-neighbour classifier (a sketch of the
nearest-neighbour baseline follows this list).
2. The accuracy for dataset 1 is higher than for dataset 2. In dataset 1 the means of the
class-conditional densities are well separated, so the overlap between the class 1 and class 2
densities is larger in dataset 2 than in dataset 1; consequently the Bayes classifier with these
class-conditional densities performs worse on dataset 2. The same reasoning holds for the ML-based
classifier.
3. The EM algorithm performs better when the initial values are generated by the k-means clustering
algorithm (k = 2), since k-means partitions the data samples into k clusters in which each sample
belongs to the cluster with the nearest mean, and the cluster means serve as prototypes.
4. The accuracy of the EM algorithm depends on the number of mixture components k. With k = 2, the
estimated density fits the data histogram well, because each class-conditional density truly has 2
mixture components. With k = 1 (k < 2), the estimated density does not fit the data histogram, so a
classifier designed with these class-conditional densities has lower accuracy. With k = 4 (k > 2),
the estimated density overfits the data histogram; the classifier may give good accuracy on the
training data but lacks generalization ability.
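The nearest-neighbour baseline mentioned in observation 1 is not fully specified in the report; a minimal 1-nearest-neighbour sketch for 1-D data (illustrative names, assuming labelled training samples are available) could look as follows.

```python
import numpy as np

def nn_classify(test_x, train_x, train_y):
    """1-nearest-neighbour classifier: each test point receives the label of
    the closest training point (absolute distance in 1-D)."""
    test_x = np.asarray(test_x, dtype=float)
    train_x = np.asarray(train_x, dtype=float)
    nearest = np.abs(test_x[:, None] - train_x[None, :]).argmin(axis=1)
    return np.asarray(train_y)[nearest]
```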
Q4:
Explanation:
This is a 2-class classification problem in which the class-conditional densities are mixtures of
Gaussians. Bayes classifiers are implemented.
The class-conditional densities are estimated in two ways:
(i) each class-conditional density is modelled as a mixture of two Gaussians and estimated using
the EM algorithm;
(ii) each class-conditional density is modelled as a single Gaussian and estimated using the
maximum-likelihood (ML) method.
The accuracies of these two Bayes classifiers are compared, and both are also compared with the
nearest-neighbour classifier.
Given data: component densities: f1 – N(0,2), f2 – N(4,2), f3 – N(8,2), f4 – N(12,2)
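For the comparison described above, the EM-estimated class-conditional mixture density can be evaluated directly and the classifier accuracy computed as the fraction of correctly labelled test samples. A minimal sketch (1-D data; illustrative names; the parameter arrays are those produced by an EM fit):

```python
import numpy as np

def gmm_density(x, means, variances, weights):
    """Evaluate a 1-D Gaussian-mixture density (e.g. an EM-estimated
    class-conditional density) at the points x."""
    x = np.asarray(x, dtype=float)[:, None]                        # (N, 1)
    comp = np.exp(-0.5 * (x - np.asarray(means)) ** 2 / np.asarray(variances)) \
           / np.sqrt(2.0 * np.pi * np.asarray(variances))          # (N, K)
    return comp @ np.asarray(weights, dtype=float)                 # (N,)

def accuracy(predicted, true_labels):
    """Fraction of test samples assigned to their true class."""
    return float(np.mean(np.asarray(predicted) == np.asarray(true_labels)))
```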
DataSet1:
Histogram:
Class 1: 0.5f1 + 0.5f2; Class 2: 0.5f3 + 0.5f4
Results:
Initialized values for EM algorithm for class one:
means:[0.9, 3]
variances:[1.5, 3]
convex coefficients:[0.5, 0.5]
Initialized values for EM algorithm for class two:
means:[6.9, 11.6]
variances:[1.5, 3.4]
convex coefficients:[0.5, 0.5]
For full size of training data,
Estimated parameters using ML:
Class1: mean = 1.98873822 variance = 10.03301297
Class2: mean = 6.06924616 variance = 5.73212566
Estimated parameters using EM:
Class 1:
Means: 1.732479, 2.11602963
Variances: 1.732479, 2.11602963
Convex coefficients: 0.4545, 0.5454
Class 2:
Means: 7.6593934, 11.43740568
Variances: 1.44899449, 2.96050072
Convex coefficients: 0.3717, 0.6282
DataSet2:
Histogram:

Class 1: 0.5f1 + 0.5f3; Class 2: 0.5f2 + 0.5f4


Results:
Initialized values for EM algorithm for class one:
means:[0.9, 6.5]
variances:[1.5, 3]
convex coefficients:[0.5, 0.5]
Initialized values for EM algorithm for class two:
means:[3.5, 11.0]
variances:[1.5, 3.4]
convex coefficients:[0.5, 0.5]
For full size of training data,
Estimated parameters using ML:
Class1: mean = 3.97720365 variance = 17.64253797
Class2: mean = 8.09626954 variance = 18.53789343
Estimated parameters using EM:
Class 1:
Means: 0.13237427, 8.07812975
Variances: 1.71688626, 2.04401085
Convex coefficients: 0.5161, 0.4838
Class 2:
Means: 12.03635727, 1.80758142
Variances: 1.80758142, 2.09601184
Convex coefficients: 0.483, 0.5162
Observations:
1. In all cases, the Bayes classifier, using either the ML estimates or the EM estimates of the
class-conditional densities, outperforms the nearest-neighbour classifier.
2. The EM algorithm performs better when the initial values are generated by the k-means clustering
algorithm (k = 2), since k-means partitions the data samples into k clusters in which each sample
belongs to the cluster with the nearest mean, and the cluster means serve as prototypes.
3. For both dataset 1 and dataset 2, the EM-based classifier performs better than the
nearest-neighbour and ML-based classifiers. The ML-based classifier performs worse on dataset 2
than on dataset 1: the histograms of the class 1 and class 2 data show that a single Gaussian
cannot fit them, so the Bayes classifier built from the ML-estimated single-Gaussian
class-conditional densities does not give good accuracy.
4. The accuracy of the EM algorithm depends on the number of mixture components k. With k = 2, the
estimated density fits the data histogram well, because each class-conditional density truly has 2
mixture components. With k = 1 (k < 2), the estimated density does not fit the data histogram, so a
classifier designed with these class-conditional densities has lower accuracy. With k = 4 (k > 2),
the estimated density overfits the data histogram; the classifier may achieve good accuracy on the
training data but lacks generalization ability.
