
CS F437 - Generative AI - Assignment 1

Faizal Shaikh - 2020AAPS2107H
Shreyansh Tripathi - 2020A4PS2340H
Parth Kadhane - 2020B5A72258H

VAE

Procedure
The MNIST dataset is loaded and pixel values are normalized between 0 and 1.
Images are flattened before being fed into the neural network. A VAE architecture is defined with
an encoder and a decoder: the encoder maps input images to a mean and a log-variance in the
latent space, and the decoder reconstructs images from vectors sampled in that latent space.
A sampling function introduces randomness by drawing from the latent space using the
reparameterization trick.
A custom loss layer is defined to include both the reconstruction loss and the KL divergence
regularization term. The model is trained on the MNIST training data for 10 epochs.
The MSE is calculated between the original and reconstructed images for evaluation.
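A minimal sketch of this pipeline is given below. It assumes a TensorFlow/Keras implementation; the hidden-layer width (256 units), the use of binary cross-entropy as the reconstruction term, and all variable names are illustrative choices rather than the exact assignment code.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

original_dim = 28 * 28   # flattened MNIST images
latent_dim = 2           # swept over 2, 4, 8, 16, 32, 64 in the experiments

# Encoder: maps a flattened image to the mean and log-variance of q(z|x).
inputs = keras.Input(shape=(original_dim,))
h = layers.Dense(256, activation="relu")(inputs)
z_mean = layers.Dense(latent_dim)(h)
z_log_var = layers.Dense(latent_dim)(h)

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# so gradients can flow through the sampling step.
def sampling(args):
    mu, log_var = args
    eps = tf.random.normal(shape=tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * eps

z = layers.Lambda(sampling)([z_mean, z_log_var])

# Decoder: reconstructs the flattened image from the sampled latent vector.
h_dec = layers.Dense(256, activation="relu")(z)
x_decoded = layers.Dense(original_dim, activation="sigmoid")(h_dec)

# Custom loss layer: adds reconstruction loss + KL divergence to the model.
class VAELossLayer(layers.Layer):
    def call(self, tensors):
        x, x_hat, mu, log_var = tensors
        recon = original_dim * tf.reduce_mean(
            keras.losses.binary_crossentropy(x, x_hat))
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + log_var - tf.square(mu) - tf.exp(log_var), axis=-1))
        self.add_loss(recon + kl)
        return x_hat

outputs = VAELossLayer()([inputs, x_decoded, z_mean, z_log_var])
vae = keras.Model(inputs, outputs)
vae.compile(optimizer="adam")

# Load MNIST, normalize to [0, 1] and flatten, as described above.
(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, original_dim).astype("float32") / 255.0
x_test = x_test.reshape(-1, original_dim).astype("float32") / 255.0

vae.fit(x_train, epochs=10, batch_size=128)

# Evaluation: MSE between the original and reconstructed test images.
mse = np.mean(np.square(x_test - vae.predict(x_test)))
print(f"MSE for latent dim = {latent_dim}: {mse}")
```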
Mean Squared Error (MSE) for latent dim = 2 is 0.043276213109493256
Mean Squared Error (MSE) for latent dim = 4 is 0.03182738646864891
Mean Squared Error (MSE) for latent dim = 8 is 0.021206101402640343
Mean Squared Error (MSE) for latent dim = 16 is 0.015053900890052319
Mean Squared Error (MSE) for latent dim = 32 is 0.014244679361581802
Mean Squared Error (MSE) for latent dim = 64 is 0.01374827977269888

Observations
As the latent dimension increases from 2 to 64, the Mean Squared Error (MSE) consistently
decreases. This suggests that a higher-dimensional latent space is capturing more
information, leading to better reconstruction. The sharpest decrease in MSE occurs from
latent dimension 2 to 8, indicating a significant improvement. However, beyond latent
dimension 8, the reduction in MSE becomes less pronounced. Higher-dimensional latent
spaces allow the Variational Autoencoder (VAE) to represent more intricate features of the
data, resulting in improved reconstruction accuracy. However, the rate of improvement slows
down as dimensions increase. VAEs are also susceptible to noise in the data: a higher-dimensional
latent space may capture not only meaningful patterns but also noise, which can degrade the
quality of the learned representation even as the reconstruction MSE keeps falling.
PCA

Procedure:
The MNIST dataset is loaded, and pixel values are normalized between 0 and 1.
Images are flattened to be fed into the PCA algorithm. A PCA transformation is applied to
reduce the dimensionality of the dataset.
The reconstructed images are obtained by transforming the lower-dimensional
representation back to the original space.
A list of different numbers of principal components (latent dimensions) is defined, ranging
from 2 to 64.
For each number of components, PCA is applied to the training data, and the inverse
transformation is used to reconstruct images.
The Mean Squared Error (MSE) is calculated between the original and reconstructed images
for evaluation.
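A minimal sketch of this loop is given below, assuming scikit-learn's PCA (Keras is used here only to load MNIST); variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from tensorflow import keras  # used here only to load MNIST

# Load MNIST, normalize to [0, 1] and flatten, as described above.
(x_train, _), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0

for n_components in [2, 4, 8, 16, 32, 64]:
    pca = PCA(n_components=n_components)
    codes = pca.fit_transform(x_train)        # project to the latent space
    recon = pca.inverse_transform(codes)      # map back to pixel space
    mse = np.mean(np.square(x_train - recon))
    print(f"MSE for latent dim = {n_components}: {mse}")
```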
Mean Squared Error (MSE) for latent dim = 2 is 0.05828242530489970
Mean Squared Error (MSE) for latent dim = 4 is 0.05016590047252508
Mean Squared Error (MSE) for latent dim = 8 is 0.03387848169004275
Mean Squared Error (MSE) for latent dim = 16 is 0.0293145234529949
Mean Squared Error (MSE) for latent dim = 32 is 0.0195610353236370
Mean Squared Error (MSE) for latent dim = 64 is 0.01038182497455126

Observations
Similar to the VAE, as the number of principal components (latent dimensions) increases
from 2 to 64, the Mean Squared Error (MSE) consistently decreases.
This suggests that a higher-dimensional representation captures more information, leading
to better reconstruction.
The most significant improvement in MSE occurs when going from 2 to 8 dimensions,
indicating a substantial enhancement in capturing relevant features.
Beyond 8 dimensions, the reduction in MSE becomes less pronounced, indicating
diminishing returns in terms of reconstruction accuracy.
PCA is effective at capturing the dominant linear structure in the data, but higher-dimensional
representations may also retain components dominated by noise, which add little to reconstruction quality.
The rate of improvement in MSE slows down as the number of principal components
increases, similar to the behavior observed in VAEs.
PPCA

Procedure:
Probabilistic PCA (PPCA) is applied to the MNIST dataset for various latent dimensions (2,
4, 8, 16, 32, 64). The MNIST dataset is loaded and normalized between 0 and 1.
PPCA is performed by flattening the data, centering it, and calculating the covariance matrix,
eigenvalues, and eigenvectors.
The top latent_dim eigenvectors are chosen to represent the principal components.
The noise variance is calculated based on the remaining eigenvalues.
Reconstructed images are obtained by projecting the centered data onto the principal
components and adding back the mean. Mean Squared Error (MSE) is calculated between
the original and reconstructed images for evaluation.
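A minimal sketch of these steps is given below, using only NumPy for the eigendecomposition (Keras is used just to load MNIST); the variable names and the choice of latent_dim = 8 are illustrative.

```python
import numpy as np
from tensorflow import keras  # used here only to load MNIST

(x_train, _), _ = keras.datasets.mnist.load_data()
X = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0

latent_dim = 8                        # any of 2, 4, 8, 16, 32, 64
mean = X.mean(axis=0)
Xc = X - mean                         # center the data

# Covariance matrix and its eigendecomposition (eigh: symmetric matrix,
# eigenvalues returned in ascending order, so re-sort in descending order).
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Noise variance: average of the discarded eigenvalues (PPCA ML estimate).
sigma2 = eigvals[latent_dim:].mean()

# Top latent_dim eigenvectors are the principal components; reconstruct by
# projecting the centered data onto them and adding the mean back.
U = eigvecs[:, :latent_dim]
X_recon = Xc @ U @ U.T + mean

mse = np.mean(np.square(X - X_recon))
print(f"Noise variance: {sigma2}")
print(f"MSE for latent dim = {latent_dim}: {mse}")
```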
Mean Squared Error (MSE) for latent dim = 2 is 0.05566949160254878
Mean Squared Error (MSE) for latent dim = 4 is 0.047903466384238376
Mean Squared Error (MSE) for latent dim = 8 is 0.03744099859719932
Mean Squared Error (MSE) for latent dim = 16 is 0.026860300913873612
Mean Squared Error (MSE) for latent dim = 32 is 0.01682823498043167
Mean Squared Error (MSE) for latent dim = 64 is 0.00904679636069839

Observations:
Similar to standard PCA and VAE, as the latent dimension increases, the Mean Squared
Error (MSE) consistently decreases.
The most significant improvement in MSE occurs when going from 2 to 8 dimensions,
indicating a substantial enhancement in capturing relevant features.
Beyond 8 dimensions, the reduction in MSE becomes less pronounced, suggesting
diminishing returns in terms of reconstruction accuracy.
PPCA is effective at capturing meaningful structure in the data, but higher-dimensional
representations may also retain noise-dominated components. The rate of improvement in MSE
slows down as the number of latent dimensions increases.
[Figure: MSE vs latent dimension in PCA, for all images]
[Figure: MSE vs latent dimension in PPCA, for all images]
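The curves referenced above, together with the VAE results, can be reproduced from the MSE values reported in the earlier sections; a small plotting sketch is given below (matplotlib assumed; the numbers are the values reported above, rounded to six decimal places).

```python
import matplotlib.pyplot as plt

dims = [2, 4, 8, 16, 32, 64]
mse_vae  = [0.043276, 0.031827, 0.021206, 0.015054, 0.014245, 0.013748]
mse_pca  = [0.058282, 0.050166, 0.033878, 0.029315, 0.019561, 0.010382]
mse_ppca = [0.055669, 0.047903, 0.037441, 0.026860, 0.016828, 0.009047]

# One curve per model: reconstruction MSE as a function of latent dimension.
plt.plot(dims, mse_vae, marker="o", label="VAE")
plt.plot(dims, mse_pca, marker="o", label="PCA")
plt.plot(dims, mse_ppca, marker="o", label="PPCA")
plt.xlabel("Latent dimension")
plt.ylabel("Reconstruction MSE")
plt.legend()
plt.show()
```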
Comparison of all the models

Among the three models, the Variational Autoencoder (VAE) stands out for its flexibility in capturing
intricate features through a non-linear latent space, allowing for more nuanced representations.
The incorporation of stochastic sampling, achieved via the reparameterization trick, contributes to
the VAE's ability to generate diverse and realistic samples. Principal Component Analysis (PCA),
on the other hand, emphasizes simplicity and interpretability with its linear transformation,
providing a clear reduction of dimensionality; it excels at capturing the global structure of the data,
particularly when the underlying patterns are primarily linear. Probabilistic PCA (PPCA) introduces
a probabilistic modeling approach, explicitly accounting for noise variance in the reconstruction
process. PPCA strikes a balance between capturing meaningful patterns and mitigating the impact
of noise, offering a perspective that lies between the flexibility of the VAE and the simplicity of PCA.
