WEEK-4

NAME : MADDULA NAGA TEJA

ID NO : 2000080061

Date: DD/MM/YYYY

Outcome: Students are able to implement dimensionality reduction and classification using ANN.

Pre Lab:

1) What is dimensionality reduction and what is the need for reducing the dimensions?
A) Dimensionality reduction is a technique for reducing the number of attributes (features) in a dataset. Some
features are removed or combined, and the remaining features are transformed so that the resulting loss of
information is negligible. It is needed to cut computation time and resource usage by avoiding work on features
that contribute little to the task.

2) What are different types of algorithms for performing dimensionality reduction?


A)
1. Principal Component Analysis (PCA)
2. Non-negative matrix factorization (NMF)
3. Linear discriminant analysis (LDA)
4. Feature extraction
5. Feature selection (Filter strategy, Wrapper strategy, Embedded strategy)
6. Generalized discriminant analysis (GDA)
7. Missing Values Ratio
8. Backward Feature Elimination

3) What is Batch Normalization and why is it used in deep neural networks?


A) Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a
layer for each mini-batch. This stabilizes the learning process and can substantially reduce the number of
training epochs required to train deep networks. It was introduced to address a problem called internal
covariate shift: it gives the data flowing between intermediate layers of the network a more consistent
distribution, which means you can use a higher learning rate. It also has a regularizing effect, which means
you can often reduce or remove dropout.
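
The standardization step described above can be illustrated with a minimal NumPy sketch of the training-time
forward pass. The function name batch_norm_forward, the parameters gamma and beta (the learnable scale and shift)
and the random input are assumptions made for illustration; a real framework layer also tracks running
statistics for inference.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Standardize each feature over the mini-batch, then scale and shift.

    x has shape (batch_size, n_features); gamma and beta have shape (n_features,).
    """
    mean = x.mean(axis=0)                    # per-feature mean over the mini-batch
    var = x.var(axis=0)                      # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta              # learnable scale (gamma) and shift (beta)

# Hypothetical mini-batch: 32 samples, 4 features, far from zero mean / unit variance.
rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(32, 4))

out = batch_norm_forward(batch, gamma=np.ones(4), beta=np.zeros(4))
print("mean per feature:", out.mean(axis=0).round(6))
print("std per feature: ", out.std(axis=0).round(6))
```

With gamma set to ones and beta to zeros, each output feature has approximately zero mean and unit standard
deviation, whatever the scale of the incoming batch.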

In Lab:

EXP4:

a) This dataset records various factors involved when a patient is hospitalized. On the basis of these
factors, predict whether the patient will survive. Because the dataset has 85 columns, first perform
dimensionality reduction using PCA.

Data set: https://www.kaggle.com/mitishaagarwal/patient

b) Normalize the data in the given dataset and perform classification using ANN.

Program:

Importing the Libraries

Importing the Dataset

Dividing the features into numerical and categorical

Selecting the Target Variable

Applying CountEncoder, RobustScaler and PCA on data
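
The program itself is not reproduced in the text, so the following is only a sketch of what this step might look
like. It assumes pandas and scikit-learn, uses a small synthetic table with hypothetical column names in place of
the patient dataset, and performs the count encoding with plain pandas rather than a CountEncoder class.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the patient data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=n).astype(float),
    "bmi": rng.normal(27.0, 5.0, size=n),
    "heart_rate": rng.normal(90.0, 20.0, size=n),
    "gender": rng.choice(["M", "F"], size=n),
    "icu_type": rng.choice(["MICU", "SICU", "CCU"], size=n),
})
numerical_cols = ["age", "bmi", "heart_rate"]
categorical_cols = ["gender", "icu_type"]

# Count encoding: replace each category by how often it occurs in the column.
encoded = df.copy()
for col in categorical_cols:
    counts = df[col].value_counts()
    encoded[col] = df[col].map(counts).astype(float)

# RobustScaler centers on the median and scales by the interquartile range,
# which makes it less sensitive to outliers than mean/variance scaling.
scaled = RobustScaler().fit_transform(encoded[numerical_cols + categorical_cols])

# Project the scaled features onto the first three principal components.
pca = PCA(n_components=3)
reduced = pca.fit_transform(scaled)
print(reduced.shape)
```

The number of components (three here) is an arbitrary choice for the sketch; on the real 85-column data it would
normally be chosen from the explained variance.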

Explained variance ratio of the principal components
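
As a hedged illustration of this step, the sketch below reads the per-component variance ratios from
scikit-learn's PCA and uses their cumulative sum to decide how many components to keep. The synthetic data
(three underlying factors spread over ten columns) and the 95% threshold are assumptions, not values taken from
the report.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 10 observed columns driven by 3 underlying factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 10))

pca = PCA().fit(X)                      # keep all components to inspect their variance
ratios = pca.explained_variance_ratio_  # fraction of total variance per component
cumulative = np.cumsum(ratios)

# Smallest number of components whose cumulative ratio reaches the threshold.
threshold = 0.95                        # assumed cut-off for this sketch
n_keep = int(np.searchsorted(cumulative, threshold) + 1)

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: ratio={r:.4f} cumulative={c:.4f}")
print("components to keep:", n_keep)
```

The same ratios array is what a bar graph of the variance ratio would display, one bar per component.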

Plotting Bar graph on variance ratio

Creating the Model

Compiling & Running the Model

Evaluating the Model
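
The model code is likewise not reproduced in the text. The sketch below shows the create, train and evaluate
sequence using scikit-learn's MLPClassifier as a stand-in ANN; the library choice, the layer sizes and the
synthetic binary target are all assumptions and may differ from what the report actually used.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the reduced features and a binary survival label.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=600) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Normalize using statistics from the training split only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Creating the model: two hidden layers with ReLU activations (sizes are arbitrary).
model = MLPClassifier(hidden_layer_sizes=(16, 8), activation="relu",
                      max_iter=500, random_state=0)

# Training the model.
model.fit(X_train_s, y_train)

# Evaluating the model on held-out data.
accuracy = model.score(X_test_s, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Fitting the scaler on the training split only keeps information from the test set out of the normalization
step, so the reported accuracy reflects unseen data.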

Output
Info of the Data

Variance Ratios

Bar plot of the variance ratio

Accuracy of the model

Post Lab:

Briefly describe the working of principal component analysis (PCA).

Ans) PCA is one of the most widely used dimensionality reduction techniques in machine learning, both for
predictive models and for data analysis. It is an unsupervised technique used to examine the interrelations
between the features (variables) in the data. PCA applies an orthogonal transformation that converts a set of
correlated variables into a set of uncorrelated variables, called principal components, ordered by how much
variance each one explains. It is sometimes described as a general form of factor analysis: where regression
determines a single line of best fit, PCA determines several orthogonal lines of best fit.

It improves interpretability while minimizing information loss. Keeping only the first few principal
components, which capture most of the variance, reduces the number of dimensions; each retained component is a
combination of the original features rather than a subset of them.
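
The working described above can be sketched from scratch in NumPy: center the data, compute the covariance
matrix, take its eigenvectors and project onto the leading ones. The function name pca and the synthetic
three-column data are assumptions for illustration; library implementations typically use an SVD instead of an
explicit covariance matrix.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)             # 1. center each feature
    cov = np.cov(X_centered, rowvar=False)      # 2. covariance between features
    eigvals, eigvecs = np.linalg.eigh(cov)      # 3. eigen-decomposition (symmetric matrix)
    order = np.argsort(eigvals)[::-1]           # 4. sort by variance, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]      # 5. orthogonal directions to keep
    projected = X_centered @ components         # 6. coordinates in the new basis
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return projected, explained_ratio

# Synthetic data: the second column is almost a multiple of the first (correlated).
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
X = np.column_stack([x1, 2.0 * x1 + 0.1 * rng.normal(size=500), rng.normal(size=500)])

Z, ratio = pca(X, n_components=2)
print("explained variance ratio:", ratio.round(4))
print("covariance of projected data:\n", np.cov(Z, rowvar=False).round(6))
```

The off-diagonal entries of the projected covariance are approximately zero, which is the sense in which the
new variables are uncorrelated.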
