
WINTER TRAINING

MACHINE
LEARNING
WITH PYTHON

BY- Ayush Aggarwal (20615002820)


AGENDA

Machine Learning Introduction
Python Introduction
Project: Flight Fare Prediction
What is
Machine Learning
Machine learning is an application of AI that enables
systems to learn and improve from experience without
being explicitly programmed. Machine learning focuses on
developing computer programs that can access data and
use it to learn for themselves.
Applications

Siri
Healthcare
Social Media


Advantages and Disadvantages of
Machine Learning

Advantages
1. Easily identifies trends and patterns
2. No human intervention needed (automation)
3. Continuous improvement
4. Handling multi-dimensional and multi-variety data
5. Wide applications

Disadvantages
1. Data acquisition
2. Time and resources
3. Interpretation of results
4. High error-susceptibility
STUDIO SHODWE
Python
Python is a dynamic, high-level, free, open-source, interpreted programming
language.
It supports object-oriented programming as well as procedural programming.
It is a dynamically typed language.
Python is a portable language.
IDEs for
Python Programming

VS Code
Jupyter
Anaconda
PyCharm


Project – Flight Fare Prediction

Flight ticket prices can be hard to guess. We have been provided with prices of
flight tickets for various airlines between the months of March and June of 2019
and between various cities. Using this data, we aim to build a model that
predicts flight prices from various input features.
The lifecycle of a data science
project is divided into four parts:
1. Exploratory Data Analysis
2. Feature Engineering
3. Feature Selection
4. Model Deployment
Exploratory Data Analysis
1) Importing libraries
2) Importing the dataset
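A minimal sketch of these two steps, assuming pandas is used. The slides do not name the data file, so the example reads a tiny made-up CSV from memory; the column names (`Airline`, `Source`, `Destination`, `Total_Stops`, `Price`) and all values are illustrative assumptions, not the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real data file, which the slides do not name.
SAMPLE_CSV = """Airline,Source,Destination,Total_Stops,Price
Airline A,City X,City Y,non-stop,3900
Airline B,City X,City Z,1 stop,7600
Airline A,City Z,City Y,2 stops,13800
"""

# With a real file on disk, pd.read_csv would take the file path instead.
train_data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# First look at the data: size, column types, and missing values per column.
print(train_data.shape)
print(train_data.dtypes)
print(train_data.isnull().sum())
```

Checking shape, types, and missing values first usually shows which columns need cleaning before any encoding is done.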
Feature Engineering

1. Nominal data: the categories are not in any order → OneHotEncoder is used in this case.

2. Ordinal data: the categories are in order → LabelEncoder is used in this case.
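The two encodings can be sketched as follows, assuming scikit-learn. The column names and category values are hypothetical. The slides name LabelEncoder for ordinal data; this sketch swaps in OrdinalEncoder with an explicit category list, because LabelEncoder assigns codes in sorted (alphabetical) order rather than in a chosen order.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical sample rows; the column names are assumptions.
df = pd.DataFrame({
    "Airline": ["Airline A", "Airline B", "Airline A"],
    "Total_Stops": ["non-stop", "2 stops", "1 stop"],
})

# Nominal feature: one 0/1 column per category, no order implied.
onehot = OneHotEncoder()
airline_encoded = onehot.fit_transform(df[["Airline"]]).toarray()

# Ordinal feature: integer codes that follow the order listed here.
stop_order = ["non-stop", "1 stop", "2 stops"]
ordinal = OrdinalEncoder(categories=[stop_order])
stops_encoded = ordinal.fit_transform(df[["Total_Stops"]]).ravel()

print(airline_encoded)
print(stops_encoded)
```

Listing the categories explicitly keeps "non-stop" below "1 stop" below "2 stops", which is the ordering the model is meant to see.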
Feature Selection
Model Training:
We cannot know beforehand which model will perform best on this problem.
We used an Extra Trees Regressor and a Random Forest regression model
(feature selection) on the train set. You can try any number of regression models
and choose the one that suits the problem best.
We drop the “price” column from the train dataset to form the independent
variables (X) and use it as the dependent variable (y), then look at the
correlation between the dependent and independent data. After cleaning the
data, we can visualize it and better understand the relationships between
different variables. There are many more visualizations that you can do to learn
more about your dataset, such as scatterplots, histograms, and boxplots.
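The split into independent and dependent variables, and the correlation check, might look like this with pandas. The data frame is a small made-up example, and the column names (`Total_Stops`, `Duration_minutes`, `Price`) are assumptions.

```python
import pandas as pd

# Hypothetical, already-encoded training data.
train_data = pd.DataFrame({
    "Total_Stops": [0, 1, 2, 1, 0, 2],
    "Duration_minutes": [170, 445, 1140, 325, 285, 930],
    "Price": [3900, 7600, 13800, 6200, 4100, 12500],
})

# Independent variables (features) and dependent variable (target).
X = train_data.drop(columns=["Price"])
y = train_data["Price"]

# Correlation of each feature with the target.
corr_with_price = train_data.corr()["Price"].drop("Price")
print(corr_with_price.sort_values(ascending=False))
```

The full correlation matrix from `train_data.corr()` is also what a heatmap would be drawn from.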
ExtraTreesRegressor model
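A minimal sketch of ranking features with scikit-learn's ExtraTreesRegressor. The slide's own code did not survive extraction, so the data here is synthetic and the feature names, including the deliberately irrelevant `Noise` column, are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Total_Stops": rng.integers(0, 3, size=n),
    "Duration_minutes": rng.integers(60, 1500, size=n),
    "Noise": rng.normal(size=n),  # unrelated to the target by construction
})
y = (3000 + 4000 * X["Total_Stops"] + 2 * X["Duration_minutes"]
     + rng.normal(scale=100, size=n))

# Fit the ensemble and read off the impurity-based feature importances.
selector = ExtraTreesRegressor(n_estimators=100, random_state=0)
selector.fit(X, y)

importances = pd.Series(selector.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

Features with importance close to zero are candidates for removal before the final model is trained.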
Fitting model using Random Forest
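Fitting a Random Forest could be sketched as below, assuming scikit-learn. The data is synthetic, and the 80/20 train/test split and `n_estimators=100` are assumptions rather than values taken from the slides.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Total_Stops": rng.integers(0, 3, size=n),
    "Duration_minutes": rng.integers(60, 1500, size=n),
})
y = (3000 + 4000 * X["Total_Stops"] + 2 * X["Duration_minutes"]
     + rng.normal(scale=100, size=n))

# Hold out part of the data to check how the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# score() returns the R^2 coefficient of determination.
print("R^2 on train:", model.score(X_train, y_train))
print("R^2 on test:", model.score(X_test, y_test))
```

A large gap between the train and test scores would suggest overfitting.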
Hyperparameter Tuning
We use RandomizedSearchCV to find the best hyperparameters: a random search
of parameters using 5-fold cross-validation across 100 different
combinations.
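A sketch of such a search with scikit-learn's RandomizedSearchCV. The parameter grid, the scoring metric, and the synthetic data are assumptions. `cv=5` follows the slides, but `N_ITER` is set to 5 here only to keep the example quick; the slides use 100 combinations.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 150
X = pd.DataFrame({
    "Total_Stops": rng.integers(0, 3, size=n),
    "Duration_minutes": rng.integers(60, 1500, size=n),
})
y = (3000 + 4000 * X["Total_Stops"] + 2 * X["Duration_minutes"]
     + rng.normal(scale=100, size=n))

# Hypothetical search space; the slides do not list the actual values.
param_distributions = {
    "n_estimators": [20, 50, 100],
    "max_depth": [5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
    "max_features": ["sqrt", 1.0],
}

N_ITER = 5  # the slides use 100 combinations; reduced here for speed

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=N_ITER,
    cv=5,  # 5-fold cross-validation, as in the slides
    scoring="neg_mean_squared_error",
    random_state=42,
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

Random search samples a fixed number of combinations instead of trying every one, so its cost is set by `n_iter` rather than by the size of the grid.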
Checking accuracy of the model
MAE (Mean Absolute Error) is the average of the absolute differences between the
original and predicted values over the data set.
MSE (Mean Squared Error) is the average of the squared differences between the
original and predicted values over the data set.
RMSE (Root Mean Squared Error) is the square root of MSE, which expresses the
error in the same units as the target.
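The three metrics can be worked through by hand on a small example. The values below are made up for illustration, and NumPy is assumed.

```python
import numpy as np

# Made-up original and predicted values.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 8.0])

errors = y_true - y_pred        # [1, 0, -2, -1]

mae = np.mean(np.abs(errors))   # (1 + 0 + 2 + 1) / 4 = 1.0
mse = np.mean(errors ** 2)      # (1 + 0 + 4 + 1) / 4 = 1.5
rmse = np.sqrt(mse)             # square root of 1.5

print(mae, mse, rmse)
```

scikit-learn provides `mean_absolute_error` and `mean_squared_error` in `sklearn.metrics` for the same calculations on model predictions.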
Model Deployment
Model Deployment is one of the last stages of any machine learning project. Here, we design a
user interface. We used Flask to serve an HTML page for flight price prediction. It takes the input
value for each feature and calculates the price for a flight, as shown in the image below.
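A minimal sketch of such a Flask app. The route names, form fields, and inline HTML are assumptions, and `predict_price` is a hypothetical placeholder with made-up coefficients; in the real project it would call the trained model instead.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline HTML form (hypothetical fields); a real app might use a template file.
PAGE = """
<form method="post" action="/predict">
  <input name="total_stops" type="number" placeholder="Total stops">
  <input name="duration_minutes" type="number" placeholder="Duration (minutes)">
  <button type="submit">Predict</button>
</form>
{% if price is not none %}<p>Predicted fare: {{ price }}</p>{% endif %}
"""


def predict_price(total_stops, duration_minutes):
    # Placeholder standing in for the trained model's predict() call.
    return round(3000 + 4000 * total_stops + 2 * duration_minutes, 2)


@app.route("/")
def home():
    # Show the empty form.
    return render_template_string(PAGE, price=None)


@app.route("/predict", methods=["POST"])
def predict():
    # Read each feature from the submitted form and return the prediction.
    total_stops = int(request.form["total_stops"])
    duration_minutes = float(request.form["duration_minutes"])
    price = predict_price(total_stops, duration_minutes)
    return render_template_string(PAGE, price=price)
```

Saved as a file such as `app.py` (a hypothetical name), this could be started with Flask's development server, for example `flask --app app run`.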

Conclusion
Machine Learning (ML) has a big scope in the future, and the biggest reason is that ML is used to draw useful insights from
data.
Also, as most companies are now becoming data-oriented, it will be very helpful for them to apply
data science to their data.

THANK YOU
