
Lecture PPT on

Data Science (Open Elective)


(BCSL425)
6th Semester
By

Prof. Amresh Kumar


(Asst. Professor, CSE)

Department of Computer Science & Engineering


Session: 2017-18 (Even)

G.H. Raisoni College of Engineering, Nagpur


(An autonomous Institute under UGC act 1956 & Affiliated to Rashtrasant Tukadoji Maharaj Nagpur University, Nagpur)
All Rights Reserved, Copyright © 2018 Prof. Amresh Kumar, GHRCE, Nagpur
Unit 05
Data Analytics & Visualization
Topics
Introduction to Data Analytics, Data Visualization, Analytics vs.
analysis, Application of analytics, Process of data analysis,
Analysis vs Reporting, Business Intelligence vs Analytics vs Big
Data vs Data Mining vs Data Science.

Outline
1. Introduction to Data Analytics,
2. Data Visualization,
3. Analytics vs. Analysis,
4. Application of analytics,
5. Process of data analysis,
6. Analysis vs Reporting,
7. Business Intelligence vs Analytics vs Big Data vs
Data Mining vs Data Science.

Introduction to Data Analytics
Introduction
• Big Data Analytics involves collecting data from different sources,
• manipulating that data so that it becomes available to be
consumed by analysts, and
• finally delivering data products that are useful to the organization's business.

ALSO,
• Data analytics refers to qualitative and quantitative techniques and
processes used to enhance productivity and business gain.
• Data is extracted and categorized to identify and analyze behavioral data
and patterns, and techniques vary according to organizational
requirements.
• Data analytics is primarily conducted in business-to-consumer (B2C)
applications.

Data Visualization
Introduction
• Data Visualization is used to communicate information clearly and
efficiently to users through information graphics such as tables and
charts.
• It helps users in analyzing a large amount of data in a simpler way.

Pros and Cons of Data Visualization

Pros
 It makes complex data more accessible, understandable, and usable.
 It can be accessed quickly by a wider audience.
 It conveys a lot of information in a small space.
 It makes your report more visually appealing.
Cons
o It can misrepresent information if an incorrect visual representation is
made.
o It can be distracting if the visual data is distorted or excessively used.
Data Visualization Conti…
Some Useful Ways to Visualize Your Data (With Examples)
1. Line Chart
• Line charts are resoundingly popular for a range of business use cases because they
demonstrate an overall trend swiftly and concisely, in a way that’s hard to
misinterpret.
• For example, this graph visualizes sales figures by age group for three different
product lines:

Here, you can see at a glance that your biggest customers are 34-45 year old
buyers of PDAs, followed by 19-24 year old buyers of cell phones.
Data Visualization Conti…
2. Bar Chart
• Bar charts are great for comparing several different values, especially when some
of these are broken into color-coded categories.
• To illustrate the difference between this and a line graph, let’s now take the same
information as above and re-visualize it as a bar chart:

While the primary takeaway from the line chart is the huge central spike,
representing PDAs bought by 34-45 year olds, here you are encouraged to take in
the more granular differences between sales figures for each category within
each age group. Since the different product lines are grouped by age group,
you can also see at a glance which age groups are the most valuable to your
business, rather than focusing on the product line.
Data Visualization Conti…
3. Column Chart
• Usually it makes sense to use column charts for side-by-side comparison of
different values.
• We can also use them to show change over time, although it makes sense to do
this when you want to draw attention to total figures rather than the shape of the
trend (which is more effective with a line chart).
• For example, the chart below shows total website page views vs sessions on a
series of dates.

Data Visualization Conti…
4. Pie Chart
• Pie charts are useful for communicating instantaneously what share each value
makes up of the whole.
• They’re far more intuitive than simply listing percentages that add up to 100%.
• For example, this pie chart illustrates which campaigns bring in the biggest share
of total leads. You see at once that AdWords is the most effective source, followed
by social media and then webinar signups. This instant insight shows
your marketing team what’s working best, helping them to rapidly reassign
resources or refocus their efforts to maximise lead generation.
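
As a minimal sketch of the arithmetic behind a pie chart, the Python snippet below converts raw lead counts into percentage shares of the whole. The campaign names follow the example above, but the counts and the `shares` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical lead counts per campaign (illustrative numbers, not from the slide).
leads = {"AdWords": 450, "Social media": 300, "Webinar signups": 150, "Other": 100}


def shares(counts):
    """Return each category's percentage share of the total."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


lead_shares = shares(leads)

# List the slices from largest to smallest, as a reader would scan the pie.
for name, pct in sorted(lead_shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {pct:5.1f}%")
```

Each slice's angle would be its share of 360 degrees; a plotting library could draw the chart from `lead_shares` directly.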

Data Visualization Conti…
5. Area Chart
• Area charts are useful as they give a sense of the overall volume, as well as the
proportion of this taken up by each category.
• In the previous example, we can see how much of one volume (revenue) is
overlapped by another volume (cost). This is a great way to impose a reality check
on your revenue estimations: you see at once where the yellow sliver of profit is at
its thinnest, helping you to assess where cash flow really is tightest, rather than
where in the year you’re simply bringing in the most cash.

This kind of information can give an instant insight that helps with issues
like resource planning, ordering patterns, financial management,
allocating appropriate storage space, and so on.
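
The revenue-versus-cost reading described above can be sketched in a few lines of Python. The monthly figures are hypothetical; the point is only that the month with the thinnest profit is not necessarily the month with the highest revenue.

```python
# Hypothetical monthly figures (illustrative only).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [120, 135, 150, 160, 140, 155]
cost = [90, 100, 140, 120, 105, 110]

# Profit is the visible sliver between the revenue area and the cost area.
profit = [r - c for r, c in zip(revenue, cost)]

tightest = months[profit.index(min(profit))]   # thinnest sliver of profit
busiest = months[revenue.index(max(revenue))]  # most cash coming in

print("Profit by month:", dict(zip(months, profit)))
print("Cash flow is tightest in", tightest)
print("Revenue is highest in", busiest)
```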
Data Visualization Conti…
6. Scatter Chart
• These represent categories by circle color and the volume of the data by circle size;
they’re used to visualize the distribution of, and relationship between, two
variables.
• For example, the chart below visualizes each product line by the number of units
sold and the revenue this brings in, representing the value in physical size. It also
breaks this down by gender (hovering over the circles would reveal the name of
the product in the original).

In this scenario, you would determine that your most frequent (and profitable)
clients are currently men, which could, for example, lead you either to
focus more marketing effort on male shoppers, or to seek out more effective
ways of engaging female customers, depending on your business priorities.
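
A scatter chart shows the relationship between two variables visually; one common way to put a number on that relationship is the Pearson correlation coefficient. The sketch below is an illustration under assumed data: the units-sold and revenue values and the `pearson` helper are hypothetical and not part of the original example.

```python
from math import sqrt

# Hypothetical product lines: units sold and revenue (illustrative values only).
units = [10, 20, 30, 40, 50]
revenue = [120, 190, 310, 390, 520]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(units, revenue)
print(f"Correlation between units sold and revenue: {r:.3f}")
```

A value close to 1 means the points would lie near a rising straight line; a value close to 0 means no clear linear relationship.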
Data Visualization Conti…
7. Bubble Chart
• Similar to Scatter Charts, Bubble Charts depict the weight of values by
circle size.
• However, they differ in that they pack many different values into one small space
and only represent a single measurement per category.
• They are useful when you want to demonstrate how a handful of categories are
highly significant compared to a sea of insignificant ones.

While bubble charts like these are often used to make a stark political point, you
can also use them to great effect in your business to demonstrate things like
misplaced priorities, actual comparative costs and values, or to highlight areas of
highest spending when looking to streamline activities and cut costs.
Data Visualization Conti…
8. Treemap
• Treemaps are useful for displaying hierarchies and comparative value between
categories and subcategories, as well as allowing you to retain detail while
projecting an instant sense of which areas are most important overall.
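
The data behind a treemap is a hierarchy in which every rectangle's area is proportional to its value. The following Python sketch computes those proportions for categories and subcategories; the hierarchy, the numbers, and the `treemap_shares` helper are all hypothetical.

```python
# Hypothetical sales hierarchy: category -> subcategory -> value (illustrative).
sales = {
    "Electronics": {"Phones": 500, "Laptops": 300},
    "Clothing": {"Shirts": 120, "Shoes": 80},
}


def treemap_shares(hierarchy):
    """Return the share of the total for each category and each subcategory."""
    total = sum(sum(subs.values()) for subs in hierarchy.values())
    category_share = {
        cat: sum(subs.values()) / total for cat, subs in hierarchy.items()
    }
    leaf_share = {
        (cat, sub): value / total
        for cat, subs in hierarchy.items()
        for sub, value in subs.items()
    }
    return category_share, leaf_share


category_share, leaf_share = treemap_shares(sales)
for cat, share in category_share.items():
    print(f"{cat}: {share:.0%} of the total area")
    for (parent, sub), leaf in leaf_share.items():
        if parent == cat:
            print(f"  {sub}: {leaf:.0%}")
```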

Data Visualization Conti…
9. Area Map/Scatter Map
• These kinds of data visualizations allow you to see immediately which geographical
locations are most significant to your business.
• Data is visualized as points of color on a map; values are represented by circle size.
• For example, the map below depicts website visitors by location, while the color
indicates the percentage of conversions (the brighter the green, the higher the
conversion rate).

Analysis vs. Analytics
Analysis | Analytics
1. Data analysis is a broader term. | 1. Data analytics is a subcomponent of data analysis.
2. Data analysis is the practice (process) that encompasses the use of data analytics. | 2. Analytics is essentially the set of concepts (the discipline) used to do the analysis.
3. It involves compiling and analyzing data in order to present findings to management to help inform business decision making. | 3. It involves the use of technical tools, techniques, and methods to achieve business objectives.
4. People who do analysis are known as data analysts. | 4. People who use analytics are known as data scientists.
5. Looks at the past. | 5. Predicts the future.
6. Answers the question: What happened? | 6. Answers the questions: Why did it happen, and what will happen next?
7. Analysis is a thorough study of anything; it does not involve any algorithm. | 7. Analytics is defined as "the method of logical analysis" and involves algorithms.
8. Analysis is an in-depth review and sorting of past and current facts for many decision-making scenarios. | 8. Analytics goes beyond a historical review: it can anticipate future scenarios and help in taking the necessary steps in the future.
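
The contrast between looking at the past and predicting the future can be illustrated with a small Python sketch. The sales series is hypothetical, and the least-squares trend line is just one simple forecasting technique assumed here for illustration; the slides do not prescribe a method.

```python
# Hypothetical monthly sales (illustrative values only).
sales = [100, 110, 125, 130, 145, 150]

# Analysis: describe what happened.
average = sum(sales) / len(sales)
growth = sales[-1] - sales[0]


def fit_line(ys):
    """Least-squares line y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Analytics: use the fitted trend to estimate what may happen next.
slope, intercept = fit_line(sales)
forecast = slope * len(sales) + intercept

print(f"Average so far: {average:.1f}, growth over the period: {growth}")
print(f"Trend-based estimate for next month: {forecast:.1f}")
```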
Application of analytics
Various Areas for Data Analytics Application
1. Policing/Security

2. Transportation

3. Fraud and Risk Detection

4. Manage Risk

5. Delivery Logistics

6. Proper Spending

7. Customer Interactions

8. City Planning

9. Healthcare

10. Energy Management


11. Digital Advertisement
Application of analytics Conti…
1. Policing/Security
 Several cities all over the world have employed predictive analysis, using geographical and historical
data, to predict areas that are likely to witness a surge in crime. This has seemed
to work in major cities such as Chicago, London, and Los Angeles. Although it is not possible to make
arrests for every crime committed, the availability of data has made it possible to have police
officers within such areas at certain times of the day, which has led to a drop in the crime rate.
2. Transportation
 Train operators made use of data analytics to ensure that large numbers of journeys went
smoothly. They were able to input data from events that took place and forecast the number of
people who were going to travel, so that transport could be run efficiently and effectively and athletes
and spectators could be transported to and from the respective stadiums.
3. Fraud and Risk Detection
 Many organizations have had very bad experiences with bad debt. Data analytics
makes it easier for them to analyze customer data and infer the probability of a customer defaulting.
4. Manage Risk
 Data analytics gives insurance companies information on claims data, actuarial data, and risk data
covering all the important decisions that the company needs to take. An underwriter evaluates
an individual before they are insured, and then the appropriate insurance is set.
5. Delivery Logistics
 Several logistics companies working all over the world, such as UPS, DHL, and FedEx,
make use of data to improve the efficiency of their operations. Using data analytics applications, these
companies have found the most suitable routes for shipping, the best delivery times, and the most suitable
means of transport to select so as to gain cost efficiency, among many other things.
Application of analytics Conti…
6. Proper Spending
 Data analytics applications can identify where taxpayers’ money would have the greatest impact and
the kind of work it should fund. Targeting where this money should be spent
would lead to the entire city’s infrastructure getting a facelift while reducing excess
spending.
7. Customer Interactions
 Analyzing customer demographics together with feedback can help insurers improve the customer
experience, based on customer behavior and proven insights.
8. City Planning
 We often see buildings that are built on spots that look suitable but actually have a negative effect
on other places. This is because such issues were not considered during planning. Data
analytics applications, as well as modeling, make it easier to predict the outcome of erecting a
structure on any spot.
9. Healthcare
 The use of machine and instrument data has risen drastically in order to optimize and track treatment,
patient flow, and the use of equipment in hospitals.
10. Energy Management
 We are in an era where firms apply data analytics to energy management, covering areas like
energy optimization, smart-grid management, distribution of energy, and building automation for
utility companies.
11. Digital Advertisement
 From the banners displayed on several websites to the digital billboards seen in the big cities, all are
controlled by data algorithms.
Process of data analysis

Process of data analysis Conti…
Description
• Data Analysis is a process of collecting, transforming, cleaning, and
modeling data with the goal of discovering the required information.
• The results so obtained are communicated, suggesting conclusions, and
supporting decision-making.
• Data visualization is at times used to represent the data for the ease of
discovering the useful patterns in the data.

Data Analysis Process consists of the following phases:


 Raw Data Collection
 Data Preprocessing
 Data Cleaning
 Data Analysis (EDA: Exploratory Data Analysis)
 Communication (Insights, Reports, Visual Graphs)
Process of data analysis Conti…
Data Analysis Process consists of the following phases:
 Raw Data Collection
 Data Collection is the process of gathering information (from various sources) on
targeted variables identified as data requirements.
 Data Preprocessing
 Data preprocessing involves transforming raw data into an understandable format.
 Data preprocessing prepares raw data for further processing (analysis).
 Data Cleaning
 The processed and organized data may be incomplete, contain duplicates, or
contain errors.
 Data Cleaning is the process of preventing and correcting these errors.
 Data Analysis (EDA: Exploratory Data Analysis)
 The process of compiling and analyzing data in order to present findings to
management to help inform business decision making.
 Communication (Insights, Reports, Visual Graphs)
 The results of the data analysis are to be reported in a format as required by the
users to support their decisions and further action.
 The data analysts can choose data visualization techniques, such as tables and
charts.
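
The phases described above can be sketched end to end in Python. This is a minimal illustration under assumed data: the CSV text, the column names, and the helper functions (`preprocess`, `clean`, `analyze`, `communicate`) are hypothetical, with one function per phase.

```python
import csv
import io
from statistics import mean

# Raw data collection: a hypothetical CSV export (values are illustrative).
RAW = """customer,age,spend
alice,34,120.50
bob,,80.00
alice,34,120.50
carol,45,not_available
dave,29,60.00
"""


def preprocess(text):
    """Data preprocessing: parse raw text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(records):
    """Data cleaning: drop duplicates, incomplete rows, and rows with errors."""
    seen, cleaned = set(), []
    for row in records:
        key = tuple(row.values())
        if key in seen or not all(row.values()):
            continue  # duplicate row, or at least one empty field
        try:
            record = {
                "customer": row["customer"],
                "age": int(row["age"]),
                "spend": float(row["spend"]),
            }
        except ValueError:
            continue  # a field could not be converted, so treat the row as erroneous
        seen.add(key)
        cleaned.append(record)
    return cleaned


def analyze(records):
    """Data analysis: summarize the cleaned records."""
    return {
        "customers": len(records),
        "mean_spend": mean(r["spend"] for r in records),
    }


def communicate(summary):
    """Communication: format the findings for the reader."""
    return f"{summary['customers']} customers, mean spend {summary['mean_spend']:.2f}"


report = communicate(analyze(clean(preprocess(RAW))))
print(report)
```

Keeping each phase in its own function makes it easy to inspect the intermediate result of any step, for example the list returned by `clean`, before moving on.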
Process of data analysis Conti…
Summary
• Analyzing data sets to summarize and visualize data properties is our main
area of business.
• Before you make any inferences, we listen to your data by examining all
variables in data.
• Our EDA (Exploratory Data Analysis) process specifically focuses on
the following aspects:
 Understanding characteristics of data
 Finding meaningful patterns in data
 Possible modeling strategies
 Debugging strategies
 Visualization of results

Tools Used for Performing Data Analysis
• Tableau Public, R, Excel, Python, MATLAB, and many more…
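
As a minimal sketch of the first EDA aspect listed above (understanding the characteristics of the data), the Python snippet below examines every variable in a small data set and prints basic summary statistics. The variables, their values, and the `describe` helper are hypothetical.

```python
from statistics import mean, median, stdev

# Hypothetical data set: each key is a variable, each list its observations.
data = {
    "age": [23, 35, 31, 46, 52, 29],
    "income": [32000, 54000, 47000, 61000, 75000, 39000],
}


def describe(values):
    """Basic summary statistics for one numeric variable."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


summary = {name: describe(values) for name, values in data.items()}
for name, stats in summary.items():
    print(name, stats)
```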
Analysis vs Reporting
Analysis | Reporting
1. Analysis can figure out what’s going on. | 1. Reports can monitor and alert you.
2. Analysis transforms data and information into insights. | 2. Reporting translates raw data into information.
3. Answers the questions: What happened, and why did it happen? | 3. Reporting basically raises the question: What is happening?
4. Analysis follows a pull approach, where particular data is pulled by an analyst in order to answer specific business questions. | 4. Reporting follows a push approach, where reports are pushed to users who are then expected to extract meaningful insights and take appropriate actions for themselves.
5. Two main types: ad-hoc responses and analysis presentations. | 5. Three main types of reporting: canned reports, dashboards, and alerts.
6. Analysis is all about human beings using their superior reasoning and analytical skills to extract key insights. | 6. Reporting is all about people accessing reports through an analytics tool.
• Reporting and analysis go hand in hand. The ultimate goal of reporting and analysis
is to increase sales, reduce costs, and deliver greater value to the organization.
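
The push and pull approaches compared above can be sketched in Python. In this hypothetical example, an alert report flags unusual days automatically (push), and an analyst then asks a specific follow-up question about a flagged day (pull). The order counts, the threshold, and both helper functions are assumptions made for illustration.

```python
# Hypothetical daily order counts (illustrative values only).
daily_orders = {"Mon": 120, "Tue": 115, "Wed": 60, "Thu": 118, "Fri": 125}
ALERT_THRESHOLD = 80  # assumed minimum acceptable number of orders per day


def alert_report(orders, threshold):
    """Reporting (push): flag every day that falls below the threshold."""
    return [day for day, count in orders.items() if count < threshold]


def compare_to_typical(orders, day):
    """Analysis (pull): how far is one day from the average of the other days?"""
    others = [count for d, count in orders.items() if d != day]
    typical = sum(others) / len(others)
    return orders[day] - typical


alerts = alert_report(daily_orders, ALERT_THRESHOLD)
print("Alert, low order volume on:", alerts)
for day in alerts:
    print(f"{day} is {compare_to_typical(daily_orders, day):+.1f} orders from typical")
```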
Business Intelligence vs Analytics vs Big Data
vs Data Mining vs Data Science
1. Business Intelligence
 Business intelligence is a data driven decision-making process that enables
data scientists to generate, aggregate, analyze and visualize data to help
business make better management decisions.
 Business intelligence (BI) represents the tools and systems that play a key
role in the strategic planning process within a corporation.
 Business Intelligence (BI) is a comprehensive term encompassing data
analytics and other reporting tools that help in decision making using
historical data.
2. Analytics
 Data Analytics focuses on algorithms that determine relationships between data,
offering insights.
 Data Analytics is the science of examining raw data with the
purpose of drawing conclusions about that information.
3. Big Data
 Big Data refers to humongous volumes of data that cannot be processed
effectively with the traditional applications that exist.
Business Intelligence vs Analytics vs Big Data
vs Data Mining vs Data Science Conti…
4. Data Mining
 The process of looking for trends, patterns, or other useful information
within sets of data.
5. Data Science
 It is the umbrella of techniques used when trying to extract insights and
information from data.
 Dealing with unstructured and structured data, Data Science is a field that
comprises everything related to data cleansing, preparation, and
analysis.
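
As a minimal sketch of the data mining idea described above (looking for patterns within sets of data), the Python snippet below counts which pairs of items appear together most often. The shopping baskets and the `frequent_pairs` helper are hypothetical, and pair counting is only one simple pattern-finding technique among many.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
baskets = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]


def frequent_pairs(transactions, min_support):
    """Count item pairs and keep those found in at least min_support baskets."""
    counts = Counter()
    for items in transactions:
        # Sorting gives every pair one canonical order, e.g. ("bread", "milk").
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(baskets, min_support=2)
print(patterns)
```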

Business Intelligence vs Analytics vs Big Data
vs Data Mining vs Data Science Conti…

Concluding Slides (About Data Science)
Understanding the Evolving Data Science Jobs Landscape
November 21, 2017 by Harvard Business Analytics Staff
What Skills Do I Need to Become a Data
Scientist?
Best Wishes!!!