
www.anuupdates.org
Unit – 2: Big data Analytics
What is and isn’t big data analytics? Why hype around big data analytics? Classification of analytics, top
challenges facing big data, importance of big data analytics, technologies needed to meet challenges of big
data.

…………………………………………………………………………………………………………………………….

1. What is and isn’t big data analytics

Big data analytics is time-sensitive: it is used to make faster decisions from huge amounts of diverse data
and to find deeper and richer insights into the business. It is one kind of technology-enabled analytics.

1 © www.anuupdates.org Prepared by D.Venkata Reddy M.Tech(Ph.D), UGC NET, AP SET Qualified


WHAT BIG DATA ANALYTICS ISN’T
Big data analytics is not meant to replace the RDBMS or the traditional data warehouse; it co-exists with
both. Having only a huge amount of data does not make it big data. Huge volume is one characteristic, but
volume alone cannot justify the label. Nor is big data analytics used only by big companies; any company
can use it.

2. CLASSIFICATION OF ANALYTICS

Analytics is the discovery and communication of meaningful patterns in data. Especially valuable in areas
rich with recorded information, analytics relies on the simultaneous application of statistics, computer
programming, and operations research to quantify performance. Analytics often favors data visualization to
communicate insight.

Firms commonly apply analytics to business data to describe, predict, and improve business performance.
Areas within analytics include predictive analytics, enterprise decision management, etc. Since analytics
can require extensive computation (because of big data), the algorithms and software used for analytics
harness the most current methods in computer science.

In a nutshell, analytics is the scientific process of transforming data into insight for making better decisions.
The goal of Data Analytics is to get actionable insights resulting in smarter decisions and better business
outcomes.

It is critical to design and build a data warehouse or Business Intelligence (BI) architecture that provides a
flexible, multi-faceted analytical ecosystem, optimized for efficient ingestion and analysis of large and diverse
data sets.

There are four types of data analytics:

1. Predictive (forecasting)

2. Descriptive (business intelligence and data mining)

3. Prescriptive (optimization and simulation)

4. Diagnostic analytics

Predictive Analytics: Predictive analytics turns data into valuable, actionable information. It uses data to
determine the probable outcome of an event or the likelihood of a situation occurring.

Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data
mining, and game theory that analyze current and historical facts to make predictions about future
events. Techniques that are used for predictive analytics are:

• Linear Regression

• Time series analysis and forecasting

• Data Mining

There are three basic cornerstones of predictive analytics:

• Predictive modeling

• Decision Analysis and optimization

• Transaction profiling
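
As an illustration of the linear regression technique listed above, the following is a minimal sketch in plain Python. The function names (fit_line, predict) and the monthly sales figures are hypothetical and chosen only for this example; a real project would normally rely on a statistics or machine learning library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Hypothetical monthly sales (illustrative data, not from the notes).
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)
forecast = predict(slope, intercept, 6)  # estimate for month 6
print(slope, intercept, forecast)
```

Here the historical facts (past sales) are used to fit a model, and the model is then used to estimate a future value, which is the basic pattern behind predictive modeling.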

Descriptive Analytics: Descriptive analytics looks at data and analyzes past events for insight into how to
approach future events. It looks at past performance and mines historical data to understand the causes of
success or failure in the past. Almost all management reporting, such as sales, marketing, operations, and
finance, uses this type of analysis.

The descriptive model quantifies relationships in data in a way that is often used to classify customers or
prospects into groups. Unlike a predictive model, which focuses on predicting the behavior of a single
customer, descriptive analytics identifies many different relationships between customers and products.

Common examples of Descriptive analytics are company reports that provide historic reviews like:

• Data Queries

• Reports

• Descriptive Statistics

• Data dashboard
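
The descriptive statistics mentioned above can be sketched with Python's standard statistics module. The describe function and the daily order counts are hypothetical, used only to show the kind of summary a report or dashboard might present.

```python
from statistics import mean, median, pstdev


def describe(values):
    """Summarise past observations with basic descriptive statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "pstdev": pstdev(values),  # population standard deviation
    }


# Hypothetical daily order counts for one week and a day (illustrative data).
daily_orders = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(daily_orders)
for name, value in summary.items():
    print(f"{name}: {value}")
```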

Prescriptive Analytics: Prescriptive analytics automatically synthesizes big data, mathematical science, business
rules, and machine learning to make a prediction and then suggests a decision option to take advantage of
the prediction.
Prescriptive analytics goes beyond predicting future outcomes by also suggesting actions to benefit from the
predictions and showing the decision maker the implications of each decision option. Prescriptive analytics
not only anticipates what will happen and when it will happen but also why it will happen. Further, prescriptive
analytics can suggest decision options on how to take advantage of a future opportunity or mitigate a future
risk and illustrate the implications of each decision option.
For example, Prescriptive Analytics can benefit healthcare strategic planning by using analytics to leverage
operational and usage data combined with data of external factors such as economic data, population
demography, etc.
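
One simple way to picture "predict, then suggest a decision option" is an expected-value comparison. The sketch below is only an assumption about how such a step could look: the scenario probabilities, option names and payoffs are hypothetical, and real prescriptive systems use far richer optimization and simulation.

```python
def expected_value(payoffs, probabilities):
    """Weight each scenario's payoff by its predicted probability."""
    return sum(payoffs[scenario] * p for scenario, p in probabilities.items())


def recommend(options, probabilities):
    """Score every decision option and return the best one with all scores."""
    scored = {
        name: expected_value(payoffs, probabilities)
        for name, payoffs in options.items()
    }
    best = max(scored, key=scored.get)
    return best, scored


# Hypothetical prediction: probability of high or low patient demand.
demand_forecast = {"high": 0.6, "low": 0.4}

# Hypothetical payoff of each decision option under each scenario.
options = {
    "add_staff": {"high": 100, "low": -20},
    "keep_staff": {"high": 40, "low": 30},
}

best, scored = recommend(options, demand_forecast)
print(best, scored)
```

Returning the score of every option, not just the winner, mirrors the idea of showing the decision maker the implication of each decision option.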

Diagnostic Analytics: In this analysis, we generally rely on historical data rather than other data to answer a
question or solve a problem. We try to find dependencies and patterns in the historical data related to the
particular problem.

For example, companies go for this analysis because it gives great insight into a problem, and because they
already keep detailed information at their disposal; otherwise data would have to be collected separately for
every problem, which would be very time-consuming. Common techniques used for Diagnostic Analytics are:

• Data discovery

• Data mining

• Correlations
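
The correlation technique listed above can be illustrated with the Pearson correlation coefficient, computed here in plain Python. The pearson function and the discount and return figures are hypothetical; note also that a strong correlation points to a dependency worth investigating but does not by itself prove a cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical historical data: discount offered (%) and product returns.
discount = [0, 5, 10, 15, 20]
returns = [1, 2, 3, 4, 5]

r = pearson(discount, returns)
print(r)  # close to 1 means the two series rise together
```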

3. CHALLENGES FACING BIG DATA


Many companies get stuck at the initial stage of their Big Data projects. This is because they are neither
aware of the challenges of Big Data nor equipped to tackle them. The challenges of conventional systems in
Big Data need to be addressed. Below are some of the major Big Data challenges and their solutions.

1. Lack of proper understanding of Big Data

Companies fail in their Big Data initiatives due to insufficient understanding. Employees may not know what
data is, how it is stored and processed, why it is important, or where it comes from. Data professionals may
know what is going on, but others may not have a clear picture.

For example, if employees do not understand the importance of data storage, they might not keep the
backup of sensitive data. They might not use databases properly for storage. As a result, when this important
data is required, it cannot be retrieved easily.

Solution

Big Data workshops and seminars must be held at companies for everyone. Basic training programs must be
arranged for all the employees who handle data regularly and are a part of the Big Data projects. A
basic understanding of data concepts must be inculcated at all levels of the organization.

2. Data growth issues

One of the most pressing challenges of Big Data is storing all these huge sets of data properly. The amount of
data being stored in data centers and databases of companies is increasing rapidly. As these data sets grow
exponentially with time, they become extremely difficult to handle.

Most of the data is unstructured and comes from documents, videos, audio, text files and other sources. This
means that it is not found in databases. This can pose huge Big Data analytics challenges and must be
resolved as soon as possible, or it can delay the growth of the company.

Solution

In order to handle these large data sets, companies are opting for modern techniques, such as compression,
tiering, and deduplication. Compression is used for reducing the number of bits in the data, thus reducing its
overall size. Deduplication is the process of removing duplicate and unwanted data from a data set.
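
Deduplication and compression can be sketched with Python's standard hashlib and zlib modules. The helper names (deduplicate, compress) and the sensor records are hypothetical; production systems usually deduplicate at the block or file level inside the storage layer.

```python
import hashlib
import zlib


def deduplicate(records):
    """Keep the first copy of each record, detecting repeats by content hash."""
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(record).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


def compress(records):
    """Join records with newlines and compress them losslessly with zlib."""
    return zlib.compress(b"\n".join(records))


# Hypothetical, highly repetitive sensor log (illustrative data).
records = [b"sensor=1;temp=20"] * 50 + [b"sensor=2;temp=21"] * 50

unique = deduplicate(records)
raw_size = len(b"\n".join(records))
packed = compress(records)

print(len(records), "records ->", len(unique), "unique")
print(raw_size, "bytes ->", len(packed), "bytes compressed")
```

Compression here is lossless: zlib.decompress(packed) returns the original bytes, so the size is reduced without losing data.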

Data tiering allows companies to store data in different storage tiers. It ensures that the data is residing in the
most appropriate storage space. Data tiers can be public cloud, private cloud, and flash storage, depending
on the data size and importance.
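
A tiering policy can be pictured as a small set of rules that map each data set to a tier. The rule set below is purely an assumption for illustration: the importance labels, the 100 GB threshold and the mapping to flash, private cloud and public cloud are hypothetical and would differ in any real deployment.

```python
def choose_tier(size_gb, importance):
    """Pick a storage tier from data size and importance.

    importance is one of 'high', 'medium' or 'low' (hypothetical labels).
    """
    # Small, important data goes to the fastest tier.
    if importance == "high" and size_gb <= 100:
        return "flash"
    # Important or moderately important data stays in the private cloud.
    if importance in ("high", "medium"):
        return "private_cloud"
    # Everything else goes to the public cloud tier.
    return "public_cloud"


# Hypothetical data sets: (name, size in GB, importance).
datasets = [
    ("payments", 10, "high"),
    ("clickstream", 500, "high"),
    ("old_logs", 10, "low"),
]
placement = {name: choose_tier(size, imp) for name, size, imp in datasets}
print(placement)
```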

Companies are also opting for Big Data tools, such as Hadoop, NoSQL and other technologies.

This leads us to the third Big Data problem.

3. Confusion during Big Data tool selection

Companies often get confused while selecting the best tool for Big Data analysis and storage. Is HBase or
Cassandra the best technology for data storage? Is Hadoop MapReduce good enough or will Spark be a
better option for data analytics and storage?

These questions bother companies and sometimes they are unable to find the answers. They end up making
poor decisions and selecting inappropriate technology. As a result, money, time, effort and work hours are
wasted.

Solution

The best way to go about it is to seek professional help. You can either hire experienced professionals who
know much more about these tools or go for Big Data consulting. Here, consultants will recommend the best
tools based on your company’s scenario. Based on their advice, you can work out a strategy and then select
the best tool for you.

4. Lack of data professionals

To run these modern technologies and Big Data tools, companies need skilled data professionals. These
professionals will include data scientists, data analysts and data engineers who are experienced in working
with the tools and making sense out of huge data sets.

Companies face a shortage of Big Data professionals. This is because data handling tools have evolved
rapidly, but in most cases, the professionals have not. Actionable steps need to be taken in order to bridge
this gap.

Solution

Companies are investing more money in the recruitment of skilled professionals. They also have to offer
training programs to the existing staff to get the most out of them.

Another important step taken by organizations is the purchase of data analytics solutions that are powered
by artificial intelligence/machine learning. These tools can be run by professionals who are not data science
experts but have basic knowledge. This step helps companies to save a lot of money on recruitment.

5. Securing data

Securing these huge sets of data is one of the daunting challenges of Big Data. Often companies are so busy
understanding, storing and analyzing their data sets that they push data security to later stages. This is
not a smart move, as unprotected data repositories can become breeding grounds for malicious hackers.

Companies can lose up to $3.7 million for a stolen record or a data breach.

Solution

Companies are recruiting more cybersecurity professionals to protect their data. Other steps taken for
securing data include:

• Data encryption

• Data segregation
• Identity and access control

• Implementation of endpoint security

• Real-time security monitoring

• Use of Big Data security tools, such as IBM Guardium

6. Integrating data from a variety of sources

Data in an organization comes from a variety of sources, such as social media pages, ERP applications,
customer logs, financial reports, e-mails, presentations and reports created by employees. Combining all this
data to prepare reports is a challenging task.

This is an area often neglected by firms. But data integration is crucial for analysis, reporting and business
intelligence, so it has to be perfect.
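
At its core, data integration means combining records about the same entity from different sources. The sketch below assumes that every source shares a customer_id key; the integrate function, the field names and the sample records are hypothetical, and dedicated integration tools handle far messier cases (mismatched keys, formats and schemas).

```python
def integrate(*sources):
    """Merge records from several sources into one view keyed by customer_id.

    When two sources disagree on a field, the later source wins.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = record["customer_id"]
            merged.setdefault(key, {}).update(record)
    return merged


# Hypothetical extracts from an ERP application and a customer log.
erp = [{"customer_id": 1, "name": "Asha", "balance": 250.0}]
crm = [
    {"customer_id": 1, "email": "asha@example.com"},
    {"customer_id": 2, "name": "Ravi"},
]

combined = integrate(erp, crm)
print(combined)
```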

Solution

Companies have to solve their data integration problems by purchasing the right tools. Some of the best data
integration tools are mentioned below:

• Talend Data Integration

• Centerprise Data Integrator

• ArcESB

• IBM InfoSphere

• Xplenty

• Informatica PowerCenter

• CloverDX

• Microsoft SQL

• QlikView

• Oracle Data Service Integrator

In order to put Big Data to the best use, companies have to start doing things differently. Addressing these
Big Data challenges as soon as possible is crucial. This means hiring better staff, changing the management,
and reviewing existing business policies and the technologies being used. To enhance decision making, they
can hire a Chief Data Officer, a step that has been taken by many of the Fortune 500 companies.

Big Data Analytics Challenges in Different Industries

Big Data challenges are there in every industry and are very common. Here are some of the challenges of
conventional systems in big data and their solutions.

Big Data Challenge in Healthcare

• Boost effectiveness of diagnosis.

• Predictive Analysis can be used to find trends that were previously classified.

• Delivering digitised findings to medical professionals.

• Providing healthcare and preventative medicine.

• Real-time monitoring can become prominent.


• To suggest a Prospective and Prescriptive Modeling System for doctors in order to reduce the
complexity of reaching a precise diagnosis.

• To create a data transfer and interchange framework to give the patient individualised treatment.

• To create an appropriate technology powered by AI for combining data from several sources.

Solution

• Prescriptive and Predictive Analysis

Using the information gleaned from the patient’s records, data transmission and accessibility are developed
so that the patient can be offered individualised treatment. AI can store all medical records in the same
place. It can also increase the rate of accurate diagnosis.

• Text Analysis

The General Health Records (GHR) database, compiled by gathering medical reports, is utilised to develop
the algorithm. These reports are then digitised so that they can be analysed.

• Genomic Data Analysis

Genomic data analysis thoroughly explains the connections among various genetic tags, alterations, and
states. It has the potential to significantly aid in developing many genetic medicines to treat diseases.

Big Data Challenge in Security Management

• Vulnerability to fake data generation.

• While “points of access and exit” are frequently guarded, your system’s internal security may not be.

• Granular Access control challenges.

• Protecting and securing data.

Solution

• Centralised Management

Centralised key management is more efficient than distributed or application-specific key management.
Security keys and audit logs can be accessed from a single point in centralised management systems.
Companies handling sensitive data need reliable key management systems.

• User Access Control

Basic network security tools include user access control. Big data systems can suffer a great deal from
improper access control measures. Role-based settings and policies are the foundation of a robust user
control policy. With policy-driven access control, complex levels of user control, such as multiple
administrator settings, are automatically managed to prevent insider threats.
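
Role-based access control can be sketched as a lookup from roles to permitted actions. The role names, the permission names and the is_allowed function below are hypothetical; real big data platforms enforce such policies through their own security components.

```python
# Hypothetical policy: which actions each role may perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "manage_users"},
    "analyst": {"read"},
}


def is_allowed(user_roles, action):
    """Allow the action if any of the user's roles grants it.

    Unknown roles grant nothing, so access is denied by default.
    """
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


print(is_allowed(["analyst"], "read"))
print(is_allowed(["analyst"], "write"))
```

Keeping the policy in one table rather than scattering checks through the code is what makes the control policy-driven: changing a role's rights means editing the policy, not the application.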

• Encryption

Several big data encryption tools can help in handling large volumes of data. This is the reason why
companies encrypt their data, both machine-generated and manually entered.

5. IMPORTANCE OF BIG DATA ANALYTICS

Big Data analytics is a process used to extract meaningful insights, such as hidden patterns, unknown
correlations, market trends, and customer preferences. Big Data analytics provides various advantages: it can
be used for better decision making and for preventing fraudulent activities, among other things.

In today’s world, Big Data analytics is fueling everything we do online—in every industry.

Take the music streaming platform Spotify for example. The company has nearly 96 million users that generate
a tremendous amount of data every day. Through this information, the cloud-based platform automatically
generates suggested songs through a smart recommendation engine, based on likes, shares, search history,
and more. What enables this are the techniques, tools, and frameworks that are a result of Big Data analytics.

If you are a Spotify user, then you must have come across the top recommendation section, which is based on
your likes, past history, and other things. This works by using a recommendation engine that leverages data
filtering tools, which collect data and then filter it using algorithms. This is what Spotify does.
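
To make the idea of "collect data, then filter it using algorithms" concrete, here is a minimal sketch of user-based collaborative filtering with Jaccard similarity. It is not Spotify's actual method, which the notes do not describe; the function names, user IDs and song names are all hypothetical.

```python
def jaccard(a, b):
    """Similarity of two sets of liked songs: shared items over all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_songs(target, likes, top_n=3):
    """Suggest songs liked by users whose tastes overlap with the target's."""
    scores = {}
    for user, songs in likes.items():
        if user == target:
            continue
        similarity = jaccard(likes[target], songs)
        if similarity == 0:
            continue  # ignore users with nothing in common
        for song in songs - likes[target]:
            scores[song] = scores.get(song, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_n]


# Hypothetical "likes" collected from three users.
likes = {
    "u1": {"song_a", "song_b"},
    "u2": {"song_a", "song_b", "song_c"},
    "u3": {"song_d"},
}
suggestions = recommend_songs("u1", likes)
print(suggestions)
```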

Organizations can use big data analytics systems and software to make data-driven decisions that can improve
business-related outcomes. The benefits may include more effective marketing, new revenue opportunities,
customer personalization and improved operational efficiency. With an effective strategy, these benefits can
provide competitive advantages over rivals.

6. TECHNOLOGIES NEEDED TO MEET CHALLENGES OF BIG DATA

Here are some of the key big data analytics tools:

• Hadoop - helps in storing and analyzing data

• MongoDB - used on datasets that change frequently

• Talend - used for data integration and management

• Cassandra - a distributed database used to handle chunks of data

• Spark - used for real-time processing and analyzing large amounts of data

• Storm - an open-source real-time computational system

• Kafka - a distributed streaming platform that is used for fault-tolerant storage
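
The MapReduce model associated with Hadoop (mentioned earlier under tool selection) can be illustrated with a word count written in plain Python. This is only a single-machine sketch of the map, shuffle and reduce steps; it does not use Hadoop's API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big analytics"])
print(counts)
```

In a real cluster the map and reduce steps run in parallel on many machines, each working on its own portion of the data, which is what lets the approach scale to very large data sets.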
