
BIG DATA ANALYTICS

SEN-332

PART-1
INTRODUCTION TO BIG DATA
ANALYTICS
Lecture Outline & Objectives
– General Term Analytics
– Business Analytics
– Data Science
– Data Mining
– Machine Learning
– Challenges
– Tools Available

Objectives: Provide fundamental information to get insight into
the challenges/opportunities with big data.
What is Data Analytics
• Analytics is an encompassing and multidimensional field that
uses mathematics, statistics, predictive modeling and machine
learning techniques to find meaningful patterns and
knowledge in recorded data.

• Data analytics is the science of drawing insights from raw
information sources. Many of the techniques and processes of
data analytics have been automated into mechanical
processes and algorithms that work over raw data for human
consumption. Data analytics techniques can reveal trends and
metrics that would otherwise be lost in the mass of
information. This information can then be used to optimize
processes to increase the overall efficiency of a business or
system.
What is Data Analytics
• Data analytics is a broad term that encompasses many diverse
types of data analysis. Essentially any type of information can be
subjected to data analytics techniques to get insight that can be
used to improve things. For example, manufacturing companies
often record the runtime, downtime, and work queue for various
machines and then analyze the data to better plan the workloads so
that the machines operate closer to peak capacity.

• Of course, data analytics can do much more than point out
bottlenecks in production. Gaming companies use data analytics to
set rewards schedules for players that keep the majority of players
active in the game. Content companies use many of the same data
analytics to keep you clicking, watching, or re-organizing content to
get another view or another click.
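A minimal sketch of the manufacturing example above, assuming each machine's log has already been reduced to total runtime, downtime, and queued jobs. The machine names and figures are invented for illustration, and `MachineLog` and `utilization` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class MachineLog:
    name: str
    runtime_hours: float   # hours the machine was running
    downtime_hours: float  # hours the machine was idle or stopped
    queued_jobs: int       # jobs waiting for this machine


def utilization(log: MachineLog) -> float:
    """Fraction of logged time the machine was actually running."""
    total = log.runtime_hours + log.downtime_hours
    return log.runtime_hours / total if total else 0.0


# Invented sample data for one week (168 hours) per machine.
logs = [
    MachineLog("press-A", 150.0, 18.0, 12),
    MachineLog("lathe-B", 96.0, 72.0, 3),
    MachineLog("mill-C", 120.0, 48.0, 7),
]

# Least-utilized machines first: candidates to take work from busy queues.
ranked = sorted(logs, key=utilization)
for log in ranked:
    print(f"{log.name}: {utilization(log):.0%} utilized, {log.queued_jobs} jobs queued")
```

Ranking machines by utilization next to their work queues is one simple way to see where workload could be moved so that machines run closer to peak capacity.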
What is Business Analytics
• Business Analytics can be termed as the study of business data
using statistical techniques and programming for creating decision
support and insights for achieving business goals.

• Business analytics is the combination of skills, technologies,
applications and processes used by organizations to gain insight
into their business based on data and statistics to drive business
planning. Business analytics is used to evaluate organization-wide
operations, and can be implemented in any department from sales
to product development to customer service.

• Business analytics solutions typically use statistical and
quantitative analysis of fact-based data to measure past
performance and guide an organization's business planning.
Importance of Data Analytics
• A huge quantity of data is useful only when
analytics provides insight into it.

• Analytics makes it possible to understand huge
quantities of data, which was previously impossible
or computationally infeasible.

• Analytics brings out trends in the data that were
otherwise unknown and that may have been seriously
impeding an organization's performance or efficiency.
Importance of Data Analytics
• Improving Efficiency
• Market Understanding
• Cost Reduction
• Faster and Better Decision-Making
• New Products/Services
• Industry Knowledge
• Witnessing the Opportunities
• Online Marketing
• Scientific Modeling
• And many more…
Importance of Data Analytics
• Some of the industries and fields gaining the most
advantage from data analytics are:
– Healthcare Industry
– Pharmaceutical Industry
– Social Media
– E-Commerce
– Banking
– Military
– Etc.
Types of Analytics
• Data analytics is broken down into four basic types:
– Descriptive analytics describes what has happened over a given
period of time. Has the number of views gone up? Are sales stronger
this month than last?
– Diagnostic analytics focuses more on why something happened. This
involves more diverse data inputs and a bit of hypothesizing. Did the
weather affect tea sales? Did that latest marketing campaign impact
sales?
– Predictive analytics moves to what is likely going to happen in the
near term. What happened to sales last time we had a hot summer?
How many weather models predict a hot summer this year?
– Prescriptive analytics moves into the territory of suggesting a course
of action. If the likelihood of a hot summer, measured as an average
of these five weather models, is above 58%, then we should add an
evening shift to the brewery and rent an additional tank to increase
output.
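As a rough illustration of the descriptive and prescriptive questions above, the sketch below computes a month-over-month sales change and applies the 58% hot-summer rule from the text. The sales figures and the five forecast probabilities are made up, and `descriptive_change` and `prescribe` are hypothetical helper names.

```python
from statistics import mean

HOT_SUMMER_THRESHOLD = 0.58  # the 58% cut-off quoted in the slide


def descriptive_change(last_month: float, this_month: float) -> float:
    """Descriptive: relative change in sales from last month to this month."""
    return (this_month - last_month) / last_month


def prescribe(model_probabilities, threshold=HOT_SUMMER_THRESHOLD) -> str:
    """Prescriptive: turn the averaged forecast into a suggested action."""
    if mean(model_probabilities) > threshold:
        return "add an evening shift and rent an additional tank"
    return "keep current capacity"


# Invented inputs: two months of sales and five weather-model forecasts.
change = descriptive_change(last_month=1000.0, this_month=1150.0)
forecasts = [0.62, 0.55, 0.71, 0.60, 0.58]

print(f"Sales change: {change:+.0%}")
print(f"Average hot-summer likelihood: {mean(forecasts):.1%}")
print("Suggested action:", prescribe(forecasts))
```

Diagnostic and predictive analytics sit between these two steps and usually need more data and modelling than a short example can show.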
Big Data Analytics
• Big data analytics is the often complex process of
examining large and varied data sets -- or big data -- to
uncover information including hidden patterns,
unknown correlations, market trends and customer
preferences that can help organizations make informed
business decisions.

• Big data analytics is a form of advanced analytics,
which involves complex applications with elements
such as predictive models, statistical algorithms and
what-if analysis powered by high-performance
analytics systems.
Data Science
• Data science is a multidisciplinary field focused on finding
actionable insights from large sets of raw and structured data. The
field primarily fixates on unearthing answers to the things we don’t
know. Data science experts use several different techniques to
obtain answers, incorporating computer science, predictive
analytics, statistics, and machine learning to parse through massive
data sets in an effort to establish solutions to problems that haven’t
been thought of yet.

• People have tried to define data science for over a decade now, and
the best way to answer the question is probably via a Venn
diagram. Created by Drew Conway in 2010, this Venn diagram
consists of three circles - math and statistics, subject expertise
(knowledge about the domain to abstract and calculate) and
hacking skills. Essentially if you can do all three, you are already
highly knowledgeable in the field of data science.
Data Scientist
• A data scientist's main goal is to ask questions and locate
potential avenues of study, with less concern for specific
answers and more emphasis placed on finding the right
question to ask. Experts accomplish this by predicting
potential trends, exploring disparate and disconnected data
sources, and finding better ways to analyze information.

• Anyone who’s interested in building a strong career in this
domain should gain key skills in three areas:
analytics, programming and domain knowledge. Going one
level deeper, the skills shown in the diagram on the next slide
will help you carve out a niche as a data scientist.
Data Scientist
Difference Between Data Science and
Data Analytics
• While many people use the terms interchangeably, data science and
big data analytics are unique fields, with the major difference being
the scope. Data science is an umbrella term for a group of fields
that are used to mine large data sets. Data analytics is a more
focused version of this and can even be considered part of the
larger process. Analytics is devoted to realizing actionable insights
that can be applied immediately based on existing queries.

• Another significant difference in the two fields is a question of
exploration. Data science isn’t concerned with answering specific
queries, instead parsing through massive data sets in sometimes
unstructured ways to expose insights. Data analysis works better
when it is focused, having questions in mind that need answers
based on existing data. Data science produces broader insights that
concentrate on which questions should be asked, while big data
analytics emphasizes discovering answers to questions being asked.
Data Mining
• Data mining is defined as a process used to
extract usable data from a larger set of any
raw data. It implies analyzing data patterns in
large batches of data using one or more
software tools.
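One simple way to picture "analyzing data patterns in large batches of data" is counting which items occur together. The sketch below is only an illustration using invented shopping baskets. `frequent_pairs` is a hypothetical helper, and real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once, in a stable order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Invented raw data: each list is one customer's basket.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]

patterns = frequent_pairs(baskets, min_support=2)
for pair, n in sorted(patterns.items(), key=lambda item: -item[1]):
    print(pair, "appears together in", n, "baskets")
```

The pairs that survive the `min_support` filter are the "usable data" extracted from the raw baskets. Everything else is treated as noise.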
Machine Learning
• Machine learning is a method of data analysis that
automates analytical model building. It is a branch of
artificial intelligence based on the idea that systems can
learn from data, identify patterns and make decisions with
minimal human intervention.

• Machine learning is an application of artificial intelligence
(AI) that provides systems the ability to automatically learn
and improve from experience without being explicitly
programmed. Machine learning focuses on the
development of computer programs that can access data
and use it to learn for themselves.
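To make "learning from data without being explicitly programmed" concrete, here is a minimal sketch that fits a straight line by ordinary least squares, one of the simplest learning methods. The study-hours and score values are invented, and `fit_line` is a hypothetical helper written for this example.

```python
def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training data: hours studied and the exam score obtained.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 77]

slope, intercept = fit_line(hours, scores)

# The rule relating hours to score was never written by hand. It was
# estimated from the examples and is now applied to an unseen input.
predicted = slope * 6 + intercept
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
print(f"Predicted score for 6 hours: {predicted:.1f}")
```

With more examples the estimated line changes automatically, which is the sense in which the program "improves from experience".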
Big Data Analytics Challenges
• Uncertainty of Data Management Landscape
Because big data is continuously expanding, new companies and
technologies are being developed every day. A big challenge for
companies is to find out which technology works best for them
without introducing new risks and problems.

• The Big Data Talent Gap
While Big Data is a growing field, there are very few experts
available in it. This is because Big Data is a complex field, and
people who understand its complexity and intricate nature are
few and far between. The resulting talent gap is a major
challenge for the industry.
Big Data Analytics Challenges
• Getting data into the big data platform
Data is increasing every single day. This means that
companies have to tackle a seemingly limitless amount of
data on a regular basis. The scale and variety of data that
is available today can overwhelm any data practitioner, and
that is why it is important to make data accessibility simple
and convenient for brand managers and owners.

• Need for synchronization across data sources
As data sets become more diverse, there is a need to
incorporate them into an analytical platform. If this is
ignored, it can create gaps and lead to wrong insights and
messages.
Big Data Analytics Challenges
• Data integration
The ability to combine data that is not similar in structure
or source and to do so quickly and at reasonable cost.
With such variety, a related challenge is how to manage
and control data quality so that you can meaningfully
connect well-understood data from your data warehouse
with data that is less well understood.

• Data volume
The ability to process the volume at an acceptable
speed so that the information is available to decision
makers when they need it.
Big Data Analytics Challenges
• Solution cost
Since Big Data has opened up a world of
possible business improvements, there is a
great deal of experimentation and discovery
taking place to determine the patterns that
matter and the insights that turn to value. To
ensure a positive ROI on a Big Data project,
therefore, it is crucial to reduce the cost of the
solutions used to find that value.
Big Data Analytics in Industry
Some Tools Needed for Data Analytics
• RStudio
• SPSS
• Minitab
• MATLAB
• Python
• Etc…
Example Problems of Data Analytics
• Tell how people rate the Uber and Careem services and
compare them using social media comments about
Uber and Careem.
• Create a model for predicting the final grade of
students using university student data.
• Create a system that can listen to bird calls and
predict which bird it is.
• Create a complex network of railway tracks and
predict the most vulnerable track in terms of
infrastructure breakdown threat.
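As a toy illustration of the first problem, the sketch below scores comments with a tiny hand-made word list and compares the average score per service. The comments and word lists are invented for the example and say nothing about the real services. A real solution would need actual social media data and a proper sentiment model.

```python
import re

# Hypothetical, deliberately tiny sentiment word lists.
POSITIVE = {"good", "great", "fast", "friendly", "clean"}
NEGATIVE = {"bad", "late", "rude", "expensive", "dirty"}


def comment_score(text: str) -> int:
    """Count positive words minus negative words in one comment."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def average_score(comments) -> float:
    """Average sentiment score over a list of comments."""
    return sum(comment_score(c) for c in comments) / len(comments)


# Invented sample comments, not real user feedback.
comments = {
    "Uber": ["Driver was friendly and fast", "Ride was late and expensive"],
    "Careem": ["Great service, clean car", "Driver was rude"],
}

averages = {service: average_score(texts) for service, texts in comments.items()}
for service, score in averages.items():
    print(f"{service}: average sentiment {score:+.2f}")
```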
What will Data Analytics do here?
Basic Data Analytics Funnel
Data Scientist Skills
Here we go…
• We have come to the end of the introduction
part of our course.
• From here we will dive into the ocean of big
data.
• We will look into the practical aspects of data in
two parts:
– Big Data Architecture
– Big Data Analytics
