GRP Project DT

University Institute of Engineering
Department of Computer Science & Engineering
Experiment: 5
Student Names and UIDs:
1. ABHISHEK CHOUDHARY (UID: 23BAI70030)
2. RISHI JAIN (UID: 23BAI70569)
3. JATIN CHADDA (UID: 23BAI70041)
4. KAVYA JAIN (UID: 23BAI70137)
5. GAUTAM KUMAR (UID: 23BAI70207)
6. MAYANK GUPTA (UID: 23BAI70292)
Branch: Computer Science & Engineering Section/Group: 23AML 104A

Semester: 1 Date of Performance:
Subject Name: DISRUPTIVE TECHNOLOGIES
Subject Code: 23 ECH-102
1. Aim of the practical: To develop a prediction model based on linear/logistic regression.
2. Tool Used: Google Colaboratory with the PyCaret library (https://pycaret.org/).
3. Basic Concept/ Command Description:
PyCaret : PyCaret is an open-source, low-code machine learning library in Python that
automates machine learning workflows. It is an end-to-end machine learning and model
management tool that speeds up the experiment cycle, because common steps such as
preprocessing, training and model comparison each take only a few lines of code.
Machine Learning : Machine learning is a method of teaching computers to learn from
data, without being explicitly programmed. Python is a popular programming language
for machine learning because it has a large number of powerful libraries and frameworks
that make it easy to implement machine learning algorithms.
Compare model : The primary objective of model comparison and selection is to get better
performance from the machine learning solution. The candidates are narrowed down to the
algorithms that best suit both the data and the business requirements.
Train/Test dataset : Train/Test is a method to measure the accuracy of your model. It is
called Train/Test because you split the data set into two sets: a training set and a testing
set, commonly 80% for training and 20% for testing. You train the model using the training
set and measure its accuracy on the testing set, which the model has not seen before.
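The 80/20 split described above can be sketched with the Python standard library alone. The function name train_test_split_rows, the seed and the toy data are assumptions made for illustration and are not taken from the notebook; library splitters offer more options.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: ten (feature, target) pairs.
rows = [(x, 2 * x + 1) for x in range(10)]
train_rows, test_rows = train_test_split_rows(rows)
print(len(train_rows), len(test_rows))  # 8 2
```

Shuffling before splitting matters when the rows are stored in some order, for example sorted by date or by class.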
Normalization : Normalization in machine learning is the process of rescaling data into
the range [0, 1] (or any other range) or transforming each data point onto the unit sphere.
Some machine learning algorithms benefit from normalization and standardization,
particularly those that rely on Euclidean distance.
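Both forms of normalization mentioned above can be illustrated in a few lines of plain Python. The function names and the sample values are hypothetical, chosen only to make the arithmetic easy to check.

```python
import math


def min_max_scale(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_unit_length(vector):
    """Scale one data point so that it lies on the unit sphere."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        return list(vector)
    return [v / norm for v in vector]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical feature column
scaled = min_max_scale(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]

unit = to_unit_length([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```

Min-max scaling works column by column, while unit-length scaling works row by row, so the two are not interchangeable.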
Transformation : Data transformation is often discussed together with data preparation or
data preprocessing, and the names are frequently used interchangeably. It makes sure that
your data is clean and ready to be used by your machine learning algorithm. A model trained
on raw, untransformed data is likely to make less accurate predictions.
Handling of outliers : To handle outliers effectively, analysts should identify them through
visualization or statistical methods, evaluate their impact on analysis, and apply
appropriate techniques like trimming, transformation, or exclusion to mitigate their
influence.
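One common statistical method for the identification and trimming steps above is the interquartile range (IQR) rule. The sketch below assumes the usual multiplier of 1.5 and uses made-up salary figures; it is only one of several possible techniques.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits; points outside them count as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def trim_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR limits."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical salaries in thousands; 400 is far from the rest.
salaries = [30, 32, 35, 36, 38, 40, 41, 43, 45, 400]
trimmed = trim_outliers(salaries)
print(trimmed)  # the value 400 is dropped, the other nine remain
```

Trimming discards rows, so the impact on the analysis should be checked before it is applied.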
Building Models : ML model development involves acquiring data from multiple trusted
sources, processing the data to make it suitable for building the model, choosing an
algorithm, building the model, computing performance metrics and choosing the
best-performing model.
Feature Selection : Feature selection is a process in machine learning to identify important
features in a dataset to improve the performance and interpretability of the model.
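A simple way to see feature selection at work is to rank features by the absolute value of their correlation with the target. This filter-style approach is an assumption made for illustration and is not necessarily what the notebook used; the column names and numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no linear relationship
    return cov / (spread_x * spread_y)


def rank_features(columns, target):
    """Sort feature names by |correlation with target|, strongest first."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical dataset: marks depend on hours studied, not on shoe size.
columns = {
    "shoe_size": [7, 9, 6, 8, 7, 9],
    "hours_studied": [1, 2, 3, 4, 5, 6],
}
marks = [35, 45, 50, 62, 70, 81]
ranking = rank_features(columns, marks)
print(ranking[0][0])  # hours_studied
```

Correlation only captures linear relationships, so a low score does not prove that a feature is useless.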
Model performance (PCA): Principal Component Analysis (PCA) is one of the most
commonly used unsupervised machine learning algorithms across a variety of
applications: exploratory data analysis, dimensionality reduction, information
compression, data de-noising, and plenty more.
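To make the idea of dimensionality reduction concrete, the sketch below finds only the first principal component of a small dataset and projects the points onto it. It uses power iteration instead of a full eigendecomposition, which is a simplification chosen here to stay within the standard library; the data points are hypothetical.

```python
import math


def first_principal_component(points, iterations=200):
    """Return (means, direction) for the direction of largest variance.

    Power iteration on the covariance matrix. The all-ones start vector
    is an assumption; it would fail if it were exactly orthogonal to the
    leading direction.
    """
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - means[j] for j in range(d)] for p in points]
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return means, v


def project(points, means, direction):
    """Reduce each point to one number: its coordinate along direction."""
    return [sum((p[j] - means[j]) * direction[j] for j in range(len(direction)))
            for p in points]


# Hypothetical 2-D points lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
means, direction = first_principal_component(points)
scores = project(points, means, direction)
print(direction)  # both components are close to 0.707
```

Each two-dimensional point is replaced by a single score, which is the compression the paragraph above refers to.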
4. Code :
[Screenshots: notebook inputs 1 to 14 with their outputs]
5. Observations, Simulation Screen Shots and Discussions:
In the simplest setting, each training input x_i is a D-dimensional vector of numbers,
representing, say, the height and weight of a person. These are called features, attributes or
covariates. In general, however, x_i could be a complex structured object, such as an image, a
sentence, an email message, a time series, a molecular shape, a graph, etc.
SCREENSHOTS : Screenshots of the output are pasted with the code.
STEPS :
1. Data Collection. Machine learning requires training data, a lot of it. ...
2. Data Preparation. We cannot work on raw data. ...
3. Choose a Model / Algorithm. The third step consists of selecting the right model. ...
4. Training the Model. ...
5. Evaluate the Model. ...
6. Parameter Tuning. ...
7. Make Predictions.
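The steps listed above can be walked through on a toy problem. The sketch below is not the notebook code: it assumes a made-up dataset of hours studied against pass/fail, and it trains a one-feature logistic regression by gradient descent in plain Python. The learning rate and the number of epochs are the parameters one would tune in step 6.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


# Steps 1-2: collect and prepare the data (hypothetical values).
hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
passed = [0] * 7 + [1] * 7
data = list(zip(hours, passed))
random.Random(0).shuffle(data)
train, test = data[:11], data[11:]
mean_hours = sum(h for h, _ in train) / len(train)  # centre the feature

# Steps 3-4: choose logistic regression and train it on the training rows.
w, b = train_logistic([h - mean_hours for h, _ in train], [y for _, y in train])


def predict_pass(h):
    """Return 1 if the predicted probability of passing is at least 0.5."""
    return 1 if sigmoid(w * (h - mean_hours) + b) >= 0.5 else 0


# Step 5: evaluate on the held-out rows.
accuracy = sum(predict_pass(h) == y for h, y in test) / len(test)
print("test accuracy:", accuracy)

# Step 7: make predictions for new inputs.
print(predict_pass(1.0), predict_pass(9.0))
```

The mean used for centring is computed from the training rows only, so that no information from the test rows leaks into training.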
6.Result and Summary:
The first step in model evaluation is to prepare the data. Split the dataset into training and
test sets using the train_test_split function from the scikit-learn library. This ensures that
separate data is used for training and for evaluating the model. The trained model is then
evaluated on the test set.
Machine learning is usually divided into two main types. In the predictive or supervised
learning approach, the goal is to learn a mapping from inputs x to outputs y, given a
labeled set of input-output pairs.
We measure a feature's importance by calculating the increase of the model's prediction error
after perturbing the feature. A feature is “important” if perturbing its values increases the model
error, because the model relied on the feature for the prediction.
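The perturbation idea above can be sketched by shuffling one feature column and measuring how much the error grows. The model, the data and the function names below are hypothetical; the model deliberately ignores its second feature so that the contrast is visible.

```python
import random


def mean_absolute_error(model, rows, targets):
    """Average absolute difference between predictions and targets."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature_index, seed=0):
    """Increase in error after shuffling one feature column."""
    baseline = mean_absolute_error(model, rows, targets)
    column = [r[feature_index] for r in rows]
    random.Random(seed).shuffle(column)
    perturbed = [r[:feature_index] + (v,) + r[feature_index + 1:]
                 for r, v in zip(rows, column)]
    return mean_absolute_error(model, perturbed, targets) - baseline


def model(row):
    """Hypothetical fitted model: uses feature 0 and ignores feature 1."""
    return 3.0 * row[0] + 2.0


rows = [(float(i), float((i * 7) % 5)) for i in range(10)]
targets = [model(r) for r in rows]

importance_0 = permutation_importance(model, rows, targets, 0)
importance_1 = permutation_importance(model, rows, targets, 1)
print(importance_0 > 0, importance_1)  # True 0.0
```

Shuffling feature 1 changes nothing because the model never reads it, so its importance is exactly zero.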
Classification models have various evaluation metrics to gauge the model's performance.
Commonly used metrics are Accuracy, Precision, Recall, F1 Score, Log loss, etc. It is worth
noting that not all metrics can be used for all situations.
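The first four metrics named above follow directly from the counts of true and false positives and negatives. The labels and predictions below are made up for illustration, and the function name is an assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 4 positives and 6 negatives, with two mistakes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.8, precision 0.75, recall 0.75, f1 0.75
```

Here accuracy is higher than precision and recall because the negative class is larger, which is one reason a single metric does not suit every situation.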
What counts as good accuracy in machine learning is subjective and depends on the problem and
on how balanced the classes are. In our opinion, anything greater than 70% is a good model
performance for an experiment of this kind, and an accuracy between 70% and 90% is realistic.
Regression analysis is a statistical method to model the relationship between a dependent
(target) variable and one or more independent (predictor) variables. More specifically,
regression analysis helps us to understand how the value of the dependent variable changes
with one independent variable when the other independent variables are held fixed. It
predicts continuous/real values such as temperature, age, salary, price, etc.
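For a single predictor, the regression line can be computed in closed form by ordinary least squares. The sketch below uses hypothetical experience and salary figures that lie exactly on a line, so the result is easy to verify by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience against salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

slope, intercept = fit_simple_linear_regression(years, salary)
print(slope, intercept)  # 5.0 25.0

predicted = slope * 6 + intercept
print(predicted)  # 55.0
```

The slope says that each additional year adds 5 thousand to the predicted salary, which is the kind of relationship regression analysis is meant to quantify.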
7. Additional Creative Inputs (If Any):

Learning outcomes (What I have learnt):
1. Regression is a statistical measure that determines the strength of the relationship
between one dependent variable (y) and other independent variables (x1, x2, x3, ...).
2. This is done to gain information about one variable through knowing the values of the others.
3. We learned how to build a model in Python.
4. Regression is basically used for predicting and forecasting.
5. Ma’am taught us how to compare models, save a model, train a model, etc.
Evaluation Grid:
Sr. No.  Parameters                                    Marks Obtained  Maximum Marks
1.       Student Performance (Conduct of experiment)                   12
2.       Viva Voce                                                     10
3.       Submission of Work Sheet (Record)                             8
Signature of Faculty (with Date):        Total Marks Obtained:         30
