
PYTHON WITH DATA SCIENCE
DIWALI SALES ANALYSIS
BY- Shivam Goel
ECE-3
Enrollment number – 04096302820
MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY
DATA SCIENCE

• Data science is about deriving useful insights from data in order to
solve complex real-world problems.
Tasks of a Data Scientist
• Data Acquisition – data is gathered from various sources such as databases,
web servers, and APIs (application programming interfaces).
• Data Preparation – involves data cleaning and data transformation.
Tools such as Talend and Informatica are used to perform
complex data transformations and help in understanding the data better.

• Exploratory Data Analysis – the critical process of performing initial
investigations on data in order to discover patterns, spot anomalies, and
check assumptions with the help of summary statistics and graphical
representations.
• Data Modelling – ML models are applied to the data to create a data
product from it (for example, to predict future outcomes or gain insights
from the data). Data modelling is mostly done in Python.

• Data Visualization – a way to represent information graphically,
highlighting patterns and trends in data and helping the reader to gain
quick insights.
PROJECT - DIWALI SALES ANALYSIS

• A company has provided us with its Diwali sales data. They want us to
analyze each record and attribute in the table and share a summary with
them at the end, which the company can use to:

a) Improve customer experience by analyzing sales data

b) Increase its revenue


TECHNOLOGIES USED IN PROJECT
• JUPYTER NOTEBOOK – used to create and share documents that contain
live code, equations, visualizations, and text.
• PYTHON – in data science, various Python libraries are used for
fetching data and performing operations on it.
• NUMPY – for creating N-dimensional arrays of data. (An array is a
special variable that can hold more than one value at a time.)
• PANDAS – used for cleaning and organizing data and for performing
exploratory data analysis. Its advantage over spreadsheets such as Excel
is that it has tools for reading and writing data between many formats
(CSV files, Excel files, SQL databases).
• MATPLOTLIB, SEABORN – used for data visualization.
IMPORTING LIBRARIES
Load the CSV file in Jupyter Notebook
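The two steps above, importing the libraries and loading the CSV file, might look roughly like the sketch below. The column names and the three rows are made up for illustration and are not the company's data; an in-memory string stands in for the file so the cell runs on its own, and the file name in the comment is hypothetical.

```python
import io

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

try:
    import seaborn as sns  # optional: only needed for seaborn-style plots
except ImportError:
    sns = None

# Stand-in for the real file. Columns and rows are invented for illustration.
csv_text = """User_ID,Gender,Age Group,State,Orders,Amount
1001,F,26-35,Uttar Pradesh,2,1500.0
1002,M,18-25,Karnataka,1,800.5
1003,F,36-45,Maharashtra,3,
"""

# With a file on disk this would be e.g. pd.read_csv("diwali_sales.csv")
# (hypothetical file name).
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # first few records
df.info()         # column names, dtypes and non-null counts
```

The empty last field in the third row is read as a missing value (NaN), which is the kind of entry the cleaning step deals with.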
Cleaning of data – once we get our data, it is cleaned and organized:
1) Drop unrelated or blank columns from the dataset
2) Rename a column
3) Check for the presence of null values and remove them
4) Convert columns to the correct data type
5) Use the ‘describe’ function, and apply it to a specific column
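The five cleaning steps can be sketched with pandas as follows. The DataFrame is a small made-up stand-in, and the column names ("Status", "Marital_Status", "Amount") and the new name "Is_Married" are assumptions chosen only to illustrate each step.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data; column names are assumptions for illustration.
df = pd.DataFrame({
    "User_ID": [1001, 1002, 1003, 1004],
    "Marital_Status": [0, 1, 1, 0],
    "Amount": [1500.0, 800.5, np.nan, 2300.0],
    "Status": [np.nan] * 4,  # an entirely blank column
})

# 1) Drop unrelated or blank columns.
df = df.drop(columns=["Status"])

# 2) Rename a column.
df = df.rename(columns={"Marital_Status": "Is_Married"})

# 3) Check for null values, then remove the rows that contain them.
null_counts = df.isnull().sum()
print(null_counts)
df = df.dropna()

# 4) Convert a column to the intended data type (float -> int truncates).
df["Amount"] = df["Amount"].astype(int)

# 5) Summary statistics for all numeric columns, then for one column.
summary = df.describe()
amount_summary = df["Amount"].describe()
print(amount_summary)
```

Dropping rows with nulls is the approach the slide names; filling them with a default value is an alternative when too many rows would be lost.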
DATA EXPLORATION
Data exploration is the first step in data analysis
involving the use of data visualization tools and
statistical techniques to uncover data set
characteristics and initial patterns.
1) Male buyers vs female buyers

Plot a bar chart for gender vs total amount

From the above graphs we can see that most of the buyers are female, and
that the purchasing power of females is greater than that of males.
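A comparison like the one above could be produced along these lines. The rows are invented toy values, so the numbers do not reproduce the result reported on the slide, and the column names "Gender" and "Amount" are assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy rows for illustration only; not the company's data.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "F", "M"],
    "Amount": [1500, 800, 2300, 400, 1200],
})

# Number of buyers per gender.
buyers_by_gender = df["Gender"].value_counts()

# Total amount spent per gender, largest first.
amount_by_gender = (
    df.groupby("Gender")["Amount"].sum().sort_values(ascending=False)
)

# Bar chart of gender vs total amount.
fig, ax = plt.subplots()
ax.bar(list(amount_by_gender.index), list(amount_by_gender.values))
ax.set_xlabel("Gender")
ax.set_ylabel("Total amount")
ax.set_title("Total amount by gender")
```

In a Jupyter notebook the figure is rendered when the cell finishes; in a script, `plt.show()` or `fig.savefig(...)` would be needed.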
2) Total amount vs age group
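One way to get the total amount per age group is a group-and-sum, sketched here on made-up rows. The column name "Age Group" and the bucket labels are assumptions.

```python
import pandas as pd

# Toy rows for illustration only; bucket labels are assumptions.
df = pd.DataFrame({
    "Age Group": ["26-35", "18-25", "26-35", "36-45", "18-25"],
    "Amount": [1500, 800, 2300, 400, 1200],
})

# Sum the amount within each age group and sort from largest to smallest.
amount_by_age = (
    df.groupby("Age Group", as_index=False)["Amount"]
    .sum()
    .sort_values("Amount", ascending=False)
)
print(amount_by_age)
```

Keeping "Age Group" as a regular column (`as_index=False`) makes the result easy to pass to a plotting call that takes `x` and `y` column names.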

3) Total number of orders from top 10 states
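A minimal sketch of the top-10 selection, assuming columns named "State" and "Orders" (both assumptions) and using invented rows:

```python
import pandas as pd

# Toy rows for illustration only.
df = pd.DataFrame({
    "State": ["Uttar Pradesh", "Karnataka", "Uttar Pradesh",
              "Maharashtra", "Karnataka", "Delhi"],
    "Orders": [2, 1, 3, 4, 2, 1],
})

# Total orders per state, keeping at most the 10 largest totals.
orders_by_state = df.groupby("State")["Orders"].sum().nlargest(10)
print(orders_by_state)
```

With fewer than ten states in the data, `nlargest(10)` simply returns all of them in descending order.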


4) Marital status

5) Occupation

6) Product category

7) On the basis of ‘product ID’ we want to see our top-selling products
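The remaining breakdowns (marital status, occupation, product category, product ID) follow the same group-and-sum pattern, so they can share one small helper. The function name `top_n_by_total` is hypothetical, and the column names and rows below are assumptions made up for illustration.

```python
import pandas as pd


def top_n_by_total(df, group_col, value_col, n=10):
    """Sum value_col within each group_col value; return the n largest totals."""
    return df.groupby(group_col)[value_col].sum().nlargest(n)


# Toy rows for illustration only.
df = pd.DataFrame({
    "Product_ID": ["P001", "P002", "P001", "P003", "P002", "P001"],
    "Product_Category": ["Food", "Clothing", "Food",
                         "Electronics", "Clothing", "Food"],
    "Orders": [2, 1, 3, 4, 2, 1],
    "Amount": [500, 1200, 750, 9000, 2400, 250],
})

# Top-selling products by number of orders.
top_products = top_n_by_total(df, "Product_ID", "Orders")

# Total amount per product category.
amount_by_category = top_n_by_total(df, "Product_Category", "Amount")

print(top_products)
print(amount_by_category)
```

The same call with a different `group_col` (for example an occupation or marital-status column) gives the other charts' underlying tables.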
Conclusion – Married women in the 26-35 age group from UP, Maharashtra and
Karnataka, working in IT, Healthcare and Aviation, are more likely to buy products from the Food,
Clothing and Electronics categories.

Thank you
