
Lecture#1

BIG DATA
What is Big Data?

Big Data refers to data sets so large and complex that they cannot be stored, processed, or
analyzed using traditional tools.

Today, there are millions of data sources that generate data at a very rapid rate. These
data sources are present across the world. Some of the largest sources of data are
social media platforms and networks. Let’s use Facebook as an example—it generates
more than 500 terabytes of data every day. This data includes pictures, videos,
messages, and more.

Data also exists in different formats, like structured data, semi-structured data, and
unstructured data. For example, in a regular Excel sheet, data is classified as structured
data—with a definite format. In contrast, emails fall under semi-structured, and your
pictures and videos fall under unstructured data. All this data combined makes up Big
Data.
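
The three formats can be illustrated with a small sketch. All values below are made up, and
representing the email as JSON is only an assumption for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, like a spreadsheet row.
table = io.StringIO("name,age\nAsha,31\n")
rows = list(csv.DictReader(table))

# Semi-structured: labeled fields, but the free-text body follows no schema.
email = json.loads(
    '{"from": "a@example.com", "subject": "Hi", "body": "See you at 5?"}'
)

# Unstructured: raw bytes with no field layout (a stand-in for image data).
picture = bytes([137, 80, 78, 71])

print(rows[0]["age"], email["subject"], len(picture))
```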

What is Big Data Analytics?

Big Data analytics is the process of examining large data sets to extract meaningful
insights, such as hidden patterns, unknown correlations, market trends, and customer
preferences. It offers several advantages: among other things, it supports better decision
making and helps prevent fraudulent activities.

Why is big data analytics important?

In today’s world, Big Data analytics is fueling everything we do online, in every industry.

Take the music streaming platform Spotify, for example. The company has nearly 96
million users who generate a tremendous amount of data every day. From this
information, the cloud-based platform automatically generates suggested songs through
a smart recommendation engine, based on likes, shares, search history, and more. The
techniques, tools, and frameworks of Big Data analytics are what make this possible.

If you are a Spotify user, you have probably come across the top recommendations
section, which is based on your likes, listening history, and other signals. Behind it is a
recommendation engine that uses data filtering tools: they collect data and then filter it
using algorithms. This is what Spotify does.
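
As a rough illustration of how such filtering can work, the sketch below suggests songs
liked by users with overlapping tastes. The data, the function name `recommend`, and the
overlap-count scoring are all assumptions made for illustration; this is not Spotify's
actual algorithm.

```python
from collections import Counter

# Hypothetical listening data: user -> set of liked songs.
likes = {
    "ana": {"song_a", "song_b", "song_c"},
    "ben": {"song_a", "song_b", "song_d"},
    "cleo": {"song_b", "song_d", "song_e"},
}

def recommend(user, likes, top_n=2):
    """Suggest songs liked by users who share likes with `user`."""
    own = likes[user]
    scores = Counter()
    for other, songs in likes.items():
        if other == user:
            continue
        overlap = len(own & songs)  # shared likes act as a similarity weight
        if overlap == 0:
            continue
        for song in songs - own:
            scores[song] += overlap
    # Sort by score (highest first), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, _ in ranked[:top_n]]

print(recommend("ana", likes))  # ['song_d', 'song_e']
```

Songs the user already likes are excluded, and a song scores higher when it is liked by
users whose tastes overlap more with the user's own.
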
Let’s look into the four advantages of Big Data analytics.

Benefits and Advantages of Big Data Analytics

1. Risk Management

Use Case: Banco de Oro, a Philippine banking company, uses Big Data analytics to
identify fraudulent activities and discrepancies. The organization leverages it to narrow
down a list of suspects or root causes of problems.

2. Product Development and Innovations

Use Case: Rolls-Royce, one of the largest manufacturers of jet engines for airlines and
armed forces across the globe, uses Big Data analytics to analyze how efficient its
engine designs are and whether any improvements are needed.

3. Quicker and Better Decision Making Within Organizations

Use Case: Starbucks uses Big Data analytics to make strategic decisions. For example,
the company leverages it to decide whether a particular location would be suitable for a
new outlet. It analyzes several different factors, such as population, demographics,
accessibility of the location, and more.

4. Improve Customer Experience

Use Case: Delta Air Lines uses Big Data analysis to improve customer experiences.
The airline monitors tweets to learn about customers’ experiences with their journeys,
delays, and so on. It identifies negative tweets and does what’s necessary to remedy the
situation. Publicly addressing these issues and offering solutions helps the airline build
good customer relations.

The Lifecycle Phases of Big Data Analytics

Now, let’s review how Big Data analytics works:

Stage 1 - Business case evaluation - The Big Data analytics lifecycle begins with a
business case, which defines the reason and goal behind the analysis.

Stage 2 - Identification of data - Here, a broad variety of data sources are identified.

Stage 3 - Data filtering - All of the identified data from the previous stage is filtered
here to remove corrupt data.

Stage 4 - Data extraction - Data that is not compatible with the tool is extracted and
then transformed into a compatible form.

Stage 5 - Data aggregation - In this stage, data with the same fields across different
datasets is integrated.

Stage 6 - Data analysis - Data is evaluated using analytical and statistical tools to
discover useful information.

Stage 7 - Visualization of data - With tools like Tableau, Power BI, and QlikView, Big
Data analysts can produce graphic visualizations of the analysis.

Stage 8 - Final analysis result - This is the last step of the Big Data analytics lifecycle,
where the final results of the analysis are made available to business stakeholders who
will take action.
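
Stages 3 to 6 can be sketched in a few lines. The sketch assumes two tiny in-memory
datasets; the field names, the validity rule, and the helper names (`is_valid`,
`transform`, `aggregate`) are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Two hypothetical sources; all field names and values are made up.
sales = [
    {"customer_id": "1", "amount": "120.50"},
    {"customer_id": "2", "amount": "n/a"},      # corrupt amount
    {"customer_id": "3", "amount": "80"},
]
profiles = [
    {"customer_id": 1, "region": "north"},
    {"customer_id": 3, "region": "south"},
]

def is_valid(record):
    """Stage 3 (filtering): keep only records whose amount parses as a number."""
    try:
        float(record["amount"])
        return True
    except ValueError:
        return False

def transform(record):
    """Stage 4 (extraction): convert string fields into typed values."""
    return {"customer_id": int(record["customer_id"]),
            "amount": float(record["amount"])}

def aggregate(sales_rows, profile_rows):
    """Stage 5 (aggregation): join the datasets on the shared customer_id field."""
    regions = {p["customer_id"]: p["region"] for p in profile_rows}
    return [dict(row, region=regions[row["customer_id"]])
            for row in sales_rows if row["customer_id"] in regions]

clean = [transform(r) for r in sales if is_valid(r)]
combined = aggregate(clean, profiles)

# Stage 6 (analysis): a simple statistic over the integrated data.
average_amount = mean(row["amount"] for row in combined)
print(combined)
print(average_amount)  # 100.25
```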

Different Types of Big Data Analytics

Here are the four types of Big Data analytics:

1. Descriptive Analytics

This summarizes past data into a form that people can easily read. This helps in
creating reports, like a company’s revenue, profit, sales, and so on. Also, it helps in the
tabulation of social media metrics.

Use Case: The Dow Chemical Company analyzed its past data to increase facility
utilization across its office and lab space. Using descriptive analytics, Dow was able to
identify underutilized space. This space consolidation helped the company save nearly
US $4 million annually.
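
A minimal sketch of a descriptive summary: past records are totaled per period so they can
go straight into a report. The records and the function name `revenue_by_quarter` are
hypothetical.

```python
from collections import defaultdict

# Hypothetical past sales records.
sales = [
    {"quarter": "Q1", "revenue": 120.0},
    {"quarter": "Q1", "revenue": 80.0},
    {"quarter": "Q2", "revenue": 150.0},
]

def revenue_by_quarter(records):
    """Total revenue per quarter, ready to drop into a report."""
    totals = defaultdict(float)
    for record in records:
        totals[record["quarter"]] += record["revenue"]
    return dict(totals)

print(revenue_by_quarter(sales))  # {'Q1': 200.0, 'Q2': 150.0}
```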

2. Diagnostic Analytics

This is done to understand what caused a problem in the first place. Techniques such as
drill-down, data mining, and data recovery are all examples. Organizations use
diagnostic analytics because it provides in-depth insight into a particular problem.

Use Case: An e-commerce company’s report shows that its sales have gone down,
although customers are adding products to their carts. There are several possible
reasons: the form didn’t load correctly, the shipping fee is too high, or there are not
enough payment options available. Diagnostic analytics can be used to find the actual
reason.
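
One way to drill down into a case like this is to compare how many shoppers reach each
checkout step. The step names and counts below are invented for illustration.

```python
# Hypothetical counts of shoppers reaching each checkout step.
funnel = [
    ("added_to_cart", 1000),
    ("opened_checkout_form", 900),
    ("saw_shipping_fee", 850),
    ("chose_payment", 300),
    ("completed_order", 280),
]

def drop_off_rates(steps):
    """Return (from_step, to_step, fraction lost) for each consecutive pair."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(steps, steps[1:]):
        rates.append((name_a, name_b, 1 - count_b / count_a))
    return rates

rates = drop_off_rates(funnel)
worst = max(rates, key=lambda r: r[2])
print(worst[0], "->", worst[1], round(worst[2], 2))
```

With these made-up numbers, the largest loss occurs right after shoppers see the shipping
fee, which would point the investigation toward the fee or the payment options.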

3. Predictive Analytics

This type of analytics looks at historical and present data to make predictions about the
future. Predictive analytics uses data mining, AI, and machine learning to do so, and is
applied to forecasting customer trends, market trends, and so on.

Use Case: PayPal determines what kind of precautions they have to take to protect their
clients against fraudulent transactions. Using predictive analytics, the company uses all
the historical payment data and user behavior data and builds an algorithm that predicts
fraudulent activities.
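
The idea of predicting from history can be shown with a very simple model. The sketch
below fits a straight-line trend by ordinary least squares; it is a stand-in for the machine
learning models mentioned above, not PayPal's method, and the monthly figures are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: month number -> units sold.
months = [1, 2, 3, 4, 5]
units = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, units)
forecast = slope * 6 + intercept  # predict month 6 from the fitted trend
print(forecast)  # 150.0
```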

4. Prescriptive Analytics

This type of analytics prescribes the solution to a particular problem. Prescriptive
analytics works with both descriptive and predictive analytics. Most of the time, it relies
on AI and machine learning.

Use Case: Prescriptive analytics can be used to maximize an airline’s profit. This type of
analytics is used to build an algorithm that will automatically adjust the flight fares based
on numerous factors, including customer demand, weather, destination, holiday
seasons, and oil prices.
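
A toy version of such a fare-setting rule is sketched below. It assumes a linear demand
model and a fixed list of candidate fares; the model, the numbers, and the function names
are hypothetical, and a real system would weigh many more factors.

```python
def expected_demand(fare, base_demand, price_sensitivity):
    """Hypothetical linear demand model: fewer seats sell as the fare rises."""
    return max(0.0, base_demand - price_sensitivity * fare)

def best_fare(candidate_fares, base_demand, price_sensitivity, seats):
    """Pick the candidate fare with the highest expected revenue."""
    def revenue(fare):
        sold = min(seats, expected_demand(fare, base_demand, price_sensitivity))
        return fare * sold
    return max(candidate_fares, key=revenue)

fares = [100, 150, 200, 250, 300]
# A holiday season is assumed to raise base demand from 300 to 400 passengers.
print(best_fare(fares, base_demand=300, price_sensitivity=1.0, seats=180))  # 150
print(best_fare(fares, base_demand=400, price_sensitivity=1.0, seats=180))  # 250
```

The prediction (expected demand) feeds the prescription (which fare to set), which is how
prescriptive analytics builds on predictive analytics.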

Big Data Analytics Tools

Here are some of the key Big Data analytics tools:

• Hadoop - helps in storing and analyzing data
• MongoDB - used on datasets that change frequently
• Talend - used for data integration and management
• Cassandra - a distributed database used to handle large amounts of data
• Spark - used for real-time processing and analyzing large amounts of data
• Storm - an open-source real-time computational system
• Kafka - a distributed streaming platform that also provides fault-tolerant storage

Applications of Big Data

In today’s world, there is a lot of data. Big companies use that data for business
growth. By analyzing it, useful decisions can be made in various cases, as discussed
below:

1. Tracking Customer Spending Habits and Shopping Behavior: In big retail stores (like
Amazon, Walmart, Big Bazaar, etc.), the management team has to keep data on customers’
spending habits (which products they spend on, which brands they prefer, how frequently
they spend), their shopping behavior, and their most-liked products (so that those
products can be kept in the store). The production or stocking rate of a product is set
based on which products are searched for or sold the most.

The banking sector uses data on customers’ spending behavior to offer a particular
customer a discount or cashback for buying a product they like with the bank’s credit or
debit card. This way, banks can send the right offer to the right person at the right time.

2. Recommendation: By tracking customers’ spending habits and shopping behavior, big
retail stores provide recommendations to customers. E-commerce sites like Amazon,
Walmart, and Flipkart recommend products. They track what products a customer
searches for and, based on that data, recommend that type of product to the customer.

As an example, suppose a customer searches for a bed cover on Amazon. Amazon now has
data suggesting that the customer may be interested in buying a bed cover. The next
time that customer visits a Google page, advertisements for various bed covers will be
shown. In this way, an advertisement for the right product can be sent to the right
customer.

YouTube also recommends videos based on the types of videos the user has previously
liked and watched. Relevant advertisements are shown during playback, based on the
content of the video the user is watching. For example, if someone is watching a tutorial
video on Big Data, an advertisement for another Big Data course may be shown during
that video.

3. Smart Traffic System: Data about traffic conditions on different roads is collected
through cameras placed beside the roads and at the entry and exit points of the city, and
through GPS devices placed in vehicles (Ola, Uber cabs, etc.). All this data is analyzed,
and jam-free or less congested routes that take less time are recommended. In this way, a
smart traffic system can be built in a city using Big Data analysis. A further benefit is
that fuel consumption can be reduced.

4. Secure Air Traffic System: Sensors are present at various places on an aircraft (such
as the propeller). These sensors capture data like flight speed, moisture, temperature,
and other environmental conditions. Based on analysis of this data, the environmental
parameters within the aircraft are set and adjusted.

By analyzing the aircraft’s machine-generated data, it can be estimated how long a
machine can operate flawlessly and when it needs to be replaced or repaired.

5. Auto Driving Car: Big Data analysis helps drive a car without human intervention.
Cameras and sensors placed at various spots on the car gather data like the size of
surrounding cars, obstacles, the distance from them, etc. This data is analyzed, and then
various calculations are carried out, such as how many degrees to turn, what the speed
should be, when to stop, etc. These calculations help the car take action automatically.

6. Virtual Personal Assistant Tool: Big Data analysis helps virtual personal assistant
tools (like Siri on Apple devices, Cortana on Windows, and Google Assistant on Android)
answer the various questions asked by users. The tool tracks the user’s location, local
time, season, other data related to the question asked, etc. By analyzing all this data, it
provides an answer.

As an example, suppose a user asks, “Do I need to take an umbrella?” The tool collects
data like the user’s location and the season and weather conditions at that location,
analyzes this data to conclude whether there is a chance of rain, and then provides the
answer.

7. IoT:

Manufacturing companies install IoT sensors in machines to collect operational data. By
analyzing this data, it can be predicted how long a machine will work without problems
and when it will require repair, so that the company can take action before the machine
develops serious issues or breaks down completely. Thus, the cost of replacing the whole
machine can be saved.

In the healthcare field, Big Data is making a significant contribution. Using Big Data
tools, data on patient experience is collected and used by doctors to give better
treatment. IoT devices can sense symptoms of a probable upcoming disease in the human
body and help prevent it through early treatment. IoT sensors placed near a patient or a
newborn baby constantly keep track of health indicators like heart rate, blood pressure,
etc. Whenever any parameter crosses the safe limit, an alarm is sent to a doctor, so that
they can act remotely and quickly.
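
The safe-limit check can be sketched as follows. The parameter names and the ranges in
`SAFE_LIMITS` are illustrative assumptions only, not clinical guidance.

```python
# Hypothetical safe ranges; real limits depend on the patient and the device.
SAFE_LIMITS = {
    "heart_rate_bpm": (50, 120),
    "systolic_mmhg": (90, 140),
}

def check_reading(reading, limits=SAFE_LIMITS):
    """Return alert messages for every parameter outside its safe range."""
    alerts = []
    for name, (low, high) in limits.items():
        value = reading.get(name)
        if value is None:
            continue  # sensor did not report this parameter
        if value < low or value > high:
            alerts.append(f"{name}={value} outside {low}-{high}")
    return alerts

print(check_reading({"heart_rate_bpm": 135, "systolic_mmhg": 118}))
# ['heart_rate_bpm=135 outside 50-120']
```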

8. Education Sector: Organizations that run online educational courses use Big Data to
find candidates interested in a course. If someone searches for a YouTube tutorial video
on a subject, online or offline course providers for that subject send that person online
ads about their courses.

9. Energy Sector: Smart electric meters read the power consumed every 15 minutes and
send the readings to a server, where the data is analyzed to estimate the times of day
when the power load across the city is lowest. Based on this, manufacturing units or
householders are advised to run their heavy machines at times, such as at night, when
the power load is low, so that they pay lower electricity bills.
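
A minimal sketch of that analysis: 15-minute readings are totaled per hour, and the hour
with the smallest total is reported. The readings and the function name
`lowest_load_hour` are hypothetical.

```python
from collections import defaultdict

def lowest_load_hour(readings):
    """readings: iterable of (hour_of_day, kwh) pairs taken every 15 minutes.

    Returns the hour whose total consumption is smallest.
    """
    totals = defaultdict(float)
    for hour, kwh in readings:
        totals[hour] += kwh
    return min(totals, key=totals.get)

# Hypothetical city-wide readings: four 15-minute samples per hour.
readings = (
    [(2, 0.5)] * 4 +    # 02:00-03:00, light load
    [(13, 1.2)] * 4 +   # 13:00-14:00
    [(20, 2.0)] * 4     # 20:00-21:00, evening peak
)
print(lowest_load_hour(readings))  # 2
```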

10. Media and Entertainment Sector: Media and entertainment service providers like
Netflix, Amazon Prime, and Spotify analyze data collected from their users. Data such as
what types of video and music users watch and listen to most, how long users spend on
the site, etc. is collected and analyzed to set the next business strategy.

All About Big Data Applications

The world today produces an enormous amount of data every day. Experts have
predicted that this may result in a great wave of data or, more dramatically, a data
tsunami. This huge amount of data is now known as Big Data. Whether or not the data
tsunami comes true, tools are needed to handle this data systematically for applications
in various fields, including government, scientific research, and industry. Such tools
help in the proper study, storage, and processing of the data.

Big data is a term for large and complex unprocessed data. This data is difficult and
also time-consuming to process using traditional processing methodologies. Big data
can be characterized as:

Volume – The quantity of data that is generated. Its size is what determines whether
the data counts as Big Data at all.

Variety – The type and nature of the data. Knowing which category the data belongs to
is essential for data analysis.

Velocity – The speed at which the data is generated and processed.

Variability – The inconsistency that the data can show at times, which can hamper the
ability to handle and manage it effectively.

Veracity – The quality of the captured data can vary greatly, and so can the accuracy of
any analysis based on it.

Complexity – Data management can become a very complex process, especially when
large volumes of data come from multiple sources.

Thus to process this data, big data tools are used, which analyze the data and process
it according to the need.

Role of Big Data:

The primary goal of big data analytics is to help companies make more informed
business decisions by enabling data scientists, predictive modelers, and other analytics
professionals to analyze large volumes of transactional data, as well as other forms of
data that may be untapped by more conventional Business Intelligence (BI) programs.
That could include web server logs and Internet click-stream data, social media content
and social network activity reports, text from customer emails and survey responses,
mobile phone call detail records, and machine data captured by sensors connected to
the Internet of Things.

Big Data Applications:

Big data has found many applications in various fields today. The major fields where big
data is being used are as follows.

Government

Big data analytics has proven to be very useful in the government sector. Big data
analysis played a large role in Barack Obama’s successful 2012 re-election campaign.
Big data analysis has also been credited with a major role in the victory of the BJP and
its allies in the 2014 Indian general election. The Indian Government uses numerous
techniques to ascertain how the Indian electorate is responding to government action, as
well as ideas for policy augmentation.

Social Media Analytics

The advent of social media has led to an outburst of big data. Various solutions have
been built to analyze social media activity. For example, IBM’s Cognos Consumer
Insights, a point solution running on IBM’s BigInsights Big Data platform, can make
sense of the chatter. Social media can provide valuable real-time insights into how the
market is responding to products and campaigns. With the help of these insights,
companies can adjust their pricing, promotion, and campaign placements accordingly.
Before big data can be used, it needs some preprocessing in order to yield intelligent
and valuable results. Understanding the consumer mindset therefore depends on
applying intelligent decisions derived from big data.

Technology

The technological applications of big data involve companies that deal with huge
amounts of data every day and put it to use for business decisions. For example,
eBay.com uses two data warehouses, at 7.5 petabytes and 40 PB, as well as a 40 PB
Hadoop cluster for search, consumer recommendations, and merchandising. Amazon.com
handles millions of back-end operations every day, as well as queries from more than
half a million third-party sellers. The core technology that keeps Amazon running is
Linux-based and, as of 2005, it had the world’s three largest Linux databases, with
capacities of 7.8 TB, 18.5 TB, and 24.7 TB. Facebook handles 50 billion photos from its
user base. Windermere Real Estate uses anonymous GPS signals from nearly 100 million
drivers to help new home buyers determine their typical drive times to and from work at
various times of the day.

Fraud detection

For businesses whose operations involve any type of claims or transaction processing,
fraud detection is one of the most compelling Big Data application examples.
Historically, fraud detection on the fly has proven an elusive goal. In most cases, fraud
is discovered long after the fact, at which point the damage has been done and all that’s
left is to minimize the harm and adjust policies to prevent it from happening again. Big
Data platforms that can analyze claims and transactions in real time, identifying large-
scale patterns across many transactions or detecting anomalous behavior from an
individual user, can change the fraud detection game.
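
Detecting anomalous behavior from an individual user can be illustrated with a simple
z-score rule. This is only one basic technique; the threshold, the amounts, and the
function name `flag_anomalies` are assumptions for the sketch.

```python
from statistics import mean, stdev

def flag_anomalies(history, new_amounts, threshold=3.0):
    """Flag new amounts more than `threshold` standard deviations away from
    the mean of the user's past transactions (a simple z-score rule)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return [a for a in new_amounts if a != mu]
    return [a for a in new_amounts if abs(a - mu) / sigma > threshold]

# Hypothetical past card payments for one user.
history = [20, 25, 22, 30, 27, 24, 26]
print(flag_anomalies(history, [28, 950]))  # [950]
```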

Call Center Analytics

Now we turn to the customer-facing Big Data application examples, of which call center
analytics are particularly powerful. What’s going on in a customer’s call center is often a
great barometer and influencer of market sentiment, but without a Big Data solution,
much of the insight that a call center can provide will be overlooked or discovered too
late. Big Data solutions can help identify recurring problems or customer and staff
behavior patterns on the fly not only by making sense of time/quality resolution metrics
but also by capturing and processing call content itself.

Banking

The use of customer data invariably raises privacy issues. By uncovering hidden
connections between seemingly unrelated pieces of data, big data analytics could
potentially reveal sensitive personal information. Research indicates that 62% of
bankers are cautious in their use of big data due to privacy issues. Further, outsourcing
data analysis activities or distributing customer data across departments to generate
richer insights also amplifies security risks, such as sensitive information about
customers’ earnings, savings, mortgages, and insurance policies ending up in the wrong
hands. Such incidents reinforce concerns about data privacy and discourage customers
from sharing personal information in exchange for customized offers.

Agriculture

A biotechnology firm uses sensor data to optimize crop efficiency. It plants test crops
and runs simulations to measure how plants react to various changes in condition. Its
data environment constantly adjusts to changes in the attributes of various data it
collects, including temperature, water levels, soil composition, growth, output, and gene
sequencing of each plant in the test bed. These simulations allow it to discover the
optimal environmental conditions for specific gene types.

Marketing

Marketers have begun to use facial recognition software to learn how well their
advertising succeeds or fails at stimulating interest in their products. A recent study
published in the Harvard Business Review looked at what kinds of advertisements
compelled viewers to continue watching and what turned viewers off. Among their tools
was “a system that analyses facial expressions to reveal what viewers are feeling.” The
research was designed to discover what kinds of promotions induced watchers to share
the ads with their social network, helping marketers create ads most likely to “go viral”
and improve sales.

Smart Phones

Perhaps more impressive, people now carry facial recognition technology in their
pockets. Users of iPhone and Android smartphones have applications at their fingertips
that use facial recognition technology for various tasks. For example, Android users with
the remember app can snap a photo of someone, then bring up stored information
about that person based on their image when their own memory lets them down, a
potential boon for salespeople.

Telecom

Nowadays big data is used in many different fields, and it plays an important role in
telecom as well. Operators face an uphill challenge when they need to deliver new,
compelling, revenue-generating services without overloading their networks, while
keeping their running costs under control. The market demands a new set of data
management and analysis capabilities that can help service providers make accurate
decisions by taking into account customer, network context, and other critical aspects of
their businesses. Most of these decisions must be made in real time, placing additional
pressure on the operators. Real-time predictive analytics can help leverage the data that
resides in their multitude of systems, make it immediately accessible, and help correlate
that data to generate insight that can help them drive their business forward.

Healthcare

Traditionally, the healthcare industry has lagged behind other industries in the use of big
data. Part of the problem stems from resistance to change: providers are accustomed to
making treatment decisions independently, using their own clinical judgment, rather
than relying on protocols based on big data. Other obstacles are more structural in
nature. This is one of the best places to set an example for a Big Data application. Even
within a single hospital, payor, or pharmaceutical company, important information often
remains siloed within one group or department because organizations lack procedures
for integrating data and communicating findings.

Healthcare stakeholders now have access to promising new threads of knowledge.
This information is a form of “big data,” so called not only for its sheer volume but for its
complexity, diversity, and timeliness. Pharmaceutical industry experts, payers, and
providers are now beginning to analyze big data to obtain insights. Recent technological
advances in the industry have improved their ability to work with such data, even though
the files are enormous and often have different database structures and technical
characteristics.

Conclusion:

Big Data is a powerful tool that makes things easier in various fields, as described above.
Big data applications are used in fields like banking, agriculture, chemistry, data
mining, cloud computing, finance, marketing, stocks, healthcare, etc.
