Sales Prediction For Big Mart 3.0

Sales Prediction For Big Mart
Group Members :
Mohammed Mansoor (160318733083)
Abdullah(160318733088)
Mohammed Yousuf Ullah khan(160318733086)
INDEX
INTRODUCTION
LITERATURE SURVEY
PROBLEM STATEMENT
PROPOSED SYSTEM
IMPLEMENTTATION
CONCLUSIONS
REFERENCES
Introduction
The dataset built with various

Huge shopping centers such as This can then further be used for
dependent and independent
big malls and marts are recording forecasting future sales by means
variables is a composite form of
data related to sales of items or of employing machine learning
item attributes, data gathered by
products algorithms
means of customer
Abstract
Machine learning algorithms To find out what role certain

Data gathered by means of
such as the random forests and properties of an item play and
customer, and also data related
simple or multiple linear how they affect their sales by
to outlet in a data warehouse
regression model understanding Big Mart sales
Applications
Helps every business make better Sales teams achieve their goals by
Allows companies to efficiently
business decisions. It helps in overall identifying early warning signals in
allocate resources for future growth
business planning, budgeting, and risk their sales pipeline and course-correct
and manage its cash flow
management. before it’s too late
Advantages Disadvantages
o Measuring the Company Health o Possibility of High Error

o Building Marketing Plans o Data Acquisition
o Planning of Supply based on Demand o Time and Space
o Capital for Small Businesses o Algorithm Selection
Literature Survey
Using Machine Learning to predict sales and anticipate demand
• How much stock should be ordered?

• How much revenue can be expected in a particular month?
• How many staff should be hired?
All these questions show how central sales forecasting is to business planning.
What is sales forecasting?
• company’s sales revenue for a specific time period – commonly a

month, quarter, or year
• A sales forecast is prediction of how much a company will sell in

the future.
Types of Sales forecasting:
Rule-based forecasting:
• manually developed rules and assumptions
• based on past data and known trends
• Ex : growing by 5% each year
• Sales Tomorrow = Sales Last Year * 1.05
Machine Learning forecasting:

• It learn the rules that would have to be manually designed
• Supervised learning is the task of learning the relationship
between outputs (sales) and inputs (past sales, economic
indicator, holiday calendar etc.)
Machine Learning to predict sales
• sales forecasting is a time-series

regression problem
• Time-series regressions are a particular
case of regression.
types of time series regressions models:
1. Auto-regressive models: These 2. Multivariate models: Multivariate models are

models predict future sales solely based based on a variety of inputs, including past sales,
on past sales values. They generate holiday calendars, or even economic
predictions by indicators. These models include Linear
finding trends and seasonality patterns Regressions, Neural Networks, Decision Tree-
based methods and Support Vector Machines.
Case Study: Parkdean Resorts
• Parkdean Resorts, the largest Holiday Park Operator in the

UK
• develop a model generating hourly food and beverage
sales forecasts in more than 180 venues
• overstaffing can be a substantial cost driver
• understaffing can significantly impact customer

satisfaction
• A reliable hourly forecast could therefore help fight these

issues
• A rule-based forecasting model would never have been
manageable
• A rule-based model designed in 2018 might not hold in
2019
Problem statement
• The objective is to predict the future sales from given data of the previous
year's using Machine Learning Techniques
• Another objective is to conclude the best model which is more efficient

and gives fast and accurate result by using XG Boost Regressor
• To find out key factors that can increase their sales and what changes
could be made to the product or store's characteristics.
Proposed System
The steps followed in this work, right from the dataset preparation to
obtaining results are represented in Fig.
Implementin
Pre- Training of g Machine
Raw Data Result
processing Dataset learning
model
Dataset and its Preprocessing
• The data scientists at BigMart

have collected 2013 sales data for
1559 products across 10 stores in
different cities.
• Using all the observations it is

inferred what role certain
properties of an item play and
how they affect their sales
Preprocessing
Data may contain null values, or redundant values, or various types of ambiguity, which demands for
pre-processing of data
Missing values : Checking for null values in each column and then replacing or filling them with
supported appropriate data types
Outliers: Any datapoint which is faraway or doesn’t follow the regular pattern. It can be treated by
deletion or Imputing the values
Feature Engineering:
feature transformation : When information is not represented in best possible way.
feature creation : Creating new features by combining existing features or another feature
feature scaling : Used to normalize the range of independent variables or features of data
feature selection : reducing the number of input variables
Algorithms
Random Forest regression :

Random forest algorithm is a very accurate algorithm
It operates by constructing several decision trees
during training time and outputting the mean of the
classes as the prediction of all the trees
Linear Regression Algorithm :

Regression can be termed as a parametric technique
which is used to predict a continuous or dependent
variable on basis of a provided set of independent
variables. It finds out a linear relationship between x
(input) and y(output). Hence, the name is Linear
Regression
Algorithms
Decision tree regression:

Decision tree builds regression or classification
models in the form of a tree structure. It breaks down
a dataset into smaller and smaller subsets while at
the same time an associated decision tree is
incrementally developed. The final result is a tree
with decision nodes and leaf nodes
Extremely randomized trees regression :

It is a variant of a random forest. Unlike a random
forest, at each step the entire sample is used and
decision boundaries are picked at random, rather
than the best one. In real world cases,
performance is comparable to an ordinary random
forest, sometimes a bit better
Implementation
Conclusion
Dataset Information
The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities.
Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and
find out the sales of each product at a particular store.
Using this model, BigMart will try to understand the properties of products and stores which play a key role in
increasing sales.
Libraries
o Pandas
o Matplotlib
o Sea born
o Sickit learn
Algorithms
o Linear Regression
o Decision Tree
o Random Forest
o Extra Trees
Mean Squared Error: 0.28

References
 https://www.kaggle.com/
 https://www.geeksforgeeks.org/python-programming-language/
 https://www.geeksforgeeks.org/machine-learning/
 https://www.geeksforgeeks.org/what-is-data-science/
 https://www.analyticsvidhya.com/blog/2021/05/an-introduction-to-statistics-for-data-sc
ience-basic-terminologies-explained/
 https://www.geeksforgeeks.org/exploratory-data-analysis-in-python/
 https://www.javatpoint.com/machine-learning-algorithms

Sales Prediction For Big Mart 3.0

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Sales Prediction For Big Mart 3.0

Uploaded by

Copyright:

Available Formats

Sales Prediction For Big Mart

The dataset built with various

Machine learning algorithms To find out what role certain

o Measuring the Company Health o Possibility of High Error

Using Machine Learning to predict sales and anticipate demand

• How much stock should be ordered?

• company’s sales revenue for a specific time period – commonly a

• A sales forecast is prediction of how much a company will sell in

Machine Learning forecasting:

• sales forecasting is a time-series

1. Auto-regressive models: These 2. Multivariate models: Multivariate models are

• Parkdean Resorts, the largest Holiday Park Operator in the

• overstaffing can be a substantial cost driver

• understaffing can significantly impact customer

• A reliable hourly forecast could therefore help fight these

• Another objective is to conclude the best model which is more efficient

• The data scientists at BigMart

• Using all the observations it is

Random Forest regression :

Linear Regression Algorithm :

Decision tree regression:

Extremely randomized trees regression :

Mean Squared Error: 0.28

You might also like