


• Big data refers to data sets that are so voluminous and complex that traditional
data-processing application software is inadequate to deal with them. Big data
challenges include capturing data, data storage, data analysis, search, sharing,
transfer, visualization, information privacy and data source. A number of concepts
are associated with big data; originally there were three: volume, variety,
and velocity.
• Volume: organizations collect data from a variety of sources, including
business transactions, social media and information from sensor or
machine-to-machine data. In the past, storing it would have been a problem, but new
technologies (such as Hadoop) have eased the burden. Velocity: data streams in at
an unprecedented speed and must be dealt with in a timely manner. RFID tags,
sensors and smart metering are driving the need to deal with torrents of data in
near-real time. Variety: data comes in all types of formats, from
structured, numeric data in traditional databases to unstructured text documents,
email, video, audio, stock ticker data and financial transactions.
• Lately, the term "big data" tends to refer to the use of predictive analytics, user
behaviour analytics or certain other advanced data analytics methods that
extract value from data and seldom to a particular size of data set. There is
little doubt that the quantities of data now available are indeed large, but that’s
not the most relevant characteristic of this new data ecosystem. Analysis of
data sets can find new correlations to spot business trends, prevent diseases,
combat crime and so on. Scientists, business executives, practitioners of
medicine, advertising and governments alike regularly meet difficulties with
large data sets in areas including internet search, urban informatics and
business informatics. Scientists encounter limitations in their work in fields
including meteorology, genomics, connectomics, complex physics simulations,
biology and environmental research.
• Data sets grow rapidly in part because they are increasingly gathered by
cheap and numerous information-sensing Internet of Things devices such as
mobile devices, aerial (remote sensing) equipment, software logs, cameras,
microphones, radio-frequency identification (RFID) readers and wireless sensor
networks.
• Relational database management systems, desktop statistics and software
packages to visualize data often have difficulty handling big data. The work
may require massively parallel software running on tens, hundreds, or even
thousands of servers. What counts as big data varies depending on the
capabilities of the users and their tools, and expanding capabilities make big
data a moving target. For some organizations, facing hundreds of gigabytes
of data for the first time may trigger a need to reconsider data management
options. For others, it may take tens or hundreds of terabytes before data
size becomes a significant consideration.
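
The idea of "massively parallel software" mentioned above can be illustrated with a small split, map and reduce sketch. This is only a toy under stated assumptions: threads on one machine stand in for separate servers, and the function names (`split_into_chunks`, `map_chunk`, `word_count`) and the sample log lines are hypothetical, not taken from any particular framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Partition the records into roughly equal chunks, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def word_count(records, n_workers=4):
    """Run the map step on every chunk, then merge the partial counts (reduce step)."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(lambda a, b: a + b, partials, Counter())


# Made-up log lines, standing in for a much larger data set.
logs = ["sensor reading ok", "sensor reading failed", "camera ok"]
totals = word_count(logs, n_workers=2)
```

In a real cluster each chunk would live on a different server and only the small partial counts would travel over the network, which is what lets the approach scale to many machines.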

• Big data is the buzzword of the current technical landscape. It has forced
industries, governments, academics and researchers to give it serious
thought, because they can mine it for knowledge that increases the
impact of government policies and improves the operations of private
organizations and businesses, leading to better and wiser decisions for their
stakeholders. Experts in data science consider it a game changer for every
industry. The world around us and the activities we carry out have created a
whole mesh of massive data in every area. Big data has shifted from hype to a
real situation that must be dealt with. It raises many questions:
How should big data be managed? How useful is big data? How can big data
influence the current data science scenario?
• The biggest challenge faced by the new breed of professionals termed data scientists is
how to quantify the value of big data. It is a huge challenge that needs funding, with
an assurance to the stakeholders that the investment is worth the risk
involved. Several industries are investing heavily in big data. Hadoop, the most
preferred open source software for distributed computing, has been forecast to grow
by 58% by the year 2020. This is a clear indication that the world
belongs to big data now and that investing in this field is a step in the
right direction.
• Organizations can have many reasons to jump into the sea of
opportunities offered in this promising avenue: better customer
experience, targeted marketing, concise analysis of the business, reduction in expenses,
securing the business and expanding the customer base. They must start by evaluating
their need for industry-specific analysis and management of big data,
carrying out a cost-benefit analysis of the current market, and identifying opportunities to
create new services and products and to redefine current ones.
• 1. Banking and securities
• Industry-Specific challenges
• Challenges in this industry include securities fraud early warning, tick analytics, card fraud
detection, archival of audit trails, enterprise credit risk reporting, trade visibility, customer data transformation,
social analytics for trading, IT operations analytics, and IT policy compliance analytics, among others.
• Applications of big data in the banking and securities industry
• The Securities and Exchange Commission (SEC) is using big data to monitor financial market activity. It is
currently using network analytics and natural language processors to catch illegal trading activity in the financial
markets.
• Retail traders, big banks, hedge funds and other so-called ‘big boys’ in the financial markets use big data for trade
analytics used in high-frequency trading, pre-trade decision-support analytics, sentiment measurement, predictive
analytics, etc.
• This industry also relies heavily on big data for risk analytics, including anti-money laundering, demand enterprise
risk management, "Know Your Customer", and fraud mitigation.
• Big data providers specific to this industry include 1010data, Panopticon Software, Streambase Systems, Nice Actimize
and Quartet.
• 2. Communications, Media and Entertainment
• Industry-Specific big data challenges
Since consumers expect rich media on demand in different formats and on a variety of devices, some
big data challenges in the communications, media and entertainment industry include:
  • Collecting, analyzing, and utilizing consumer insights
  • Leveraging mobile and social media content
  • Understanding patterns of real-time media content usage
• Applications of big data in the communications, media and entertainment industry
Organizations in this industry simultaneously analyze customer data along with behavioral data to
create detailed customer profiles that can be used to:
  • Create content for different target audiences
  • Recommend content on demand
  • Measure content performance
• A case in point is the Wimbledon Championships, which
leverages big data to deliver detailed sentiment analysis on the tennis matches
to TV, mobile, and web users in real time.
• Spotify, an on-demand music service, uses Hadoop big data analytics to
collect data from its millions of users worldwide and then uses the analyzed
data to give informed music recommendations to individual users.
• Amazon Prime, which is driven to provide a great customer experience by
offering video, music and Kindle books in a one-stop shop, also heavily
utilizes big data.
• Big data providers in this industry include Infochimps, Splunk, Pervasive
Software, and Visible Measures.
• 3. Healthcare Providers
• Industry-Specific challenges
• The healthcare sector has access to huge
amounts of data but has been plagued by
failures in utilizing the data to curb the rising
cost of healthcare and by inefficient systems that
stifle faster and better healthcare benefits across
the board.
• This is mainly due to the fact that electronic
data is unavailable, inadequate, or unusable.
Additionally, the healthcare databases that hold
health-related information have made it difficult
to link data that can show patterns useful in the
medical field.
• Other challenges related to big data include: the
exclusion of patients from the decision making
process, and the use of data from different
readily available sensors.
• Applications of big data in the healthcare sector
• Some hospitals, like Beth Israel, are using data collected from a cell phone app, from
millions of patients, to allow doctors to use evidence-based medicine as opposed to
administering several medical/lab tests to all patients who go to the hospital. A
battery of tests can be efficient but they can also be expensive and usually
• Free public health data and Google Maps have been used by the University of
Florida to create visual data that allows for faster identification and efficient analysis
of healthcare information, used in tracking the spread of chronic disease.
• 4. Education
• Industry-Specific big data challenges
• From a technical point of view, a major challenge in the education industry is
to incorporate big data from different sources and vendors and to utilize it
on platforms that were not designed for the varying data.
• From a practical point of view, staff and institutions have to learn the new
data management and analysis tools.
• On the technical side, there are challenges to integrate data from different
sources, on different platforms and from different vendors that were not
designed to work with one another.
• Politically, issues of privacy and personal data protection associated with big
data used for educational purposes are a challenge.
• Applications of big data in education
• Big data is used quite significantly in higher education. For example, the University of
Tasmania, an Australian university with over 26000 students, has deployed a Learning and
Management System that tracks, among other things, when a student logs onto the system, how
much time is spent on different pages in the system, as well as the overall progress of a
student over time.
• In a different use case of big data in education, it is also used to measure
teachers' effectiveness to ensure a good experience for both students and teachers.
Teachers' performance can be fine-tuned and measured against student numbers, subject
matter, student demographics, student aspirations, behavioral classification and
several other variables.
• On a governmental level, the Office of Educational Technology in the U.S.
Department of Education is using big data to develop analytics to help course-correct
students who are going astray while using online big data courses. Click patterns are also
being used to detect boredom.
• Big data providers in this industry include Knewton, Carnegie Learning and MyFit/
• 5. Insurance
• Industry-Specific challenges
• Lack of personalized services, lack of personalized pricing and the lack of targeted services to new
segments and to specific market segments are some of the main challenges.
• In a survey conducted by Marketforce, challenges identified by professionals in the insurance industry
include underutilization of data gathered by loss adjusters and a hunger for better insight.
• Applications of big data in the insurance industry
• Big data has been used in the industry to provide customer insights for transparent and simpler
products, by analyzing and predicting customer behavior through data derived from social media,
GPS-enabled devices and CCTV footage. Big data also allows insurance companies to achieve better
customer retention.
• When it comes to claims management, predictive analytics from big data has been used to offer faster
service since massive amounts of data can be analyzed especially in the underwriting stage. Fraud
detection has also been enhanced.
• Through massive data from digital channels and social media, real-time monitoring of claims
throughout the claims cycle has been used to provide insights.
• Big Data Providers in this industry include: Sprint, Qualcomm, Octo Telematics, The Climate Corp.
• 6. Transportation
• Industry-Specific challenges
• In recent times, huge amounts of data from location-based social networks and high speed data from
telecoms have affected travel behavior. Regrettably, research to understand travel behavior has not
progressed as quickly.
• In most places, transport demand models are still based on poorly understood new social media structures.
• Applications of big data in the transportation industry
• Some applications of big data by governments, private organizations and individuals include:
• Government use of big data: traffic control, route planning, intelligent transport systems, congestion
management (by predicting traffic conditions)
• Private sector use of big data in transport: revenue management, technological enhancements, logistics
and competitive advantage (by consolidating shipments and optimizing freight movement)
• Individual use of big data: route planning to save on fuel and time, travel arrangements in
tourism and others.

• Scalability – Trying to anticipate big data storage requirements is virtually
impossible. You would have to calculate the data needed to run applications and
predictive models for each big data category, including future demands. However,
there is probably one primary application that is driving most of the company
revenue and is the focal point for your big data initiative. Use that application to
gauge your initial storage requirements. As your needs grow, you can buy more disk
space or you can add cloud storage. Whatever strategy you choose, you have to
make sure storage scalability has minimal impact on data throughput and
administration overhead.
• High data availability – In order to have value, data has to be readily accessible.
However, as you start to archive more data, managing availability becomes more
challenging. Traditional policy engines spread new data across RAID architectures,
but RAID becomes less viable as storage demands grow. A better approach is to
consider “wide area storage”, where data is represented by objects and the objects
are dispersed across multiple storage nodes.
• Support tiered storage – A big data storage architecture needs to be able to
prioritize data, keeping some data available for analytics and archiving data you don’t
need right away. Most big data storage systems have a storage hierarchy to prioritize
flash memory, disk, tape storage, and other media.
• Self-managing – Most enterprise storage systems are used by multiple applications
and users, so the data storage system has to be able to tier data across media types
so as not to conflict with application storage. You need to be able to program and
automate big data movement, including how long data is kept before being stored
on slower media. For example, if older data is accessed from the same tape archive
multiple times in a week, the system should move the data to disk for faster access.
• Wide accessibility – Stored content needs to be available throughout the
organization. Distributing data geographically so it’s closer to remote users has
become increasingly important for performance, which is why big data architects are
adopting more cloud storage.
• Supports both analytics and content applications – Data is seldom
dedicated to a single task such as big data analytics. Unstructured files such as
web logs or financial data may be needed in user applications as well as big
data analytics. Big data storage needs to be able to accommodate both
analytics and applications in a single shared architecture.
• Self-healing – A well-architected big data storage system can automatically
work around component failures so users see no disruption in service.
• Supports various cloud configurations – Installing sufficient data storage
on site for most big data applications isn’t practical, so whatever big data
storage architecture you create needs to be able to accommodate public,
private, and hybrid cloud environments from the outset.
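
The tiered, self-managing behaviour described in the list above (promote data that is read repeatedly from tape, archive data that has gone cold) can be sketched as a simple policy function. This is a minimal illustration, not a real storage product's API: the `StoredObject` class, the `retier` function, the threshold of three reads per week and the 90-day archive cutoff are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class StoredObject:
    """A hypothetical stored item with its current tier and access history."""
    name: str
    tier: str  # "disk" or "tape" in this sketch
    access_times: list = field(default_factory=list)


def retier(obj, now, promote_after=3, window=timedelta(days=7),
           archive_after=timedelta(days=90)):
    """Apply an assumed tiering policy and return the object's new tier."""
    recent = [t for t in obj.access_times if now - t <= window]
    if obj.tier == "tape" and len(recent) >= promote_after:
        # Read several times within a week: move to faster media.
        obj.tier = "disk"
    elif obj.tier == "disk" and (
        not obj.access_times or now - max(obj.access_times) > archive_after
    ):
        # Not touched for a long time: archive to slower media.
        obj.tier = "tape"
    return obj.tier


now = datetime(2024, 1, 10)
hot = StoredObject("q4-sales", "tape", [datetime(2024, 1, d) for d in (5, 7, 9)])
cold = StoredObject("old-logs", "disk", [datetime(2023, 1, 1)])
hot_tier = retier(hot, now)
cold_tier = retier(cold, now)
```

A production system would run a rule like this automatically on a schedule, so that administrators only set the thresholds rather than move data by hand.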
1. Big Data in Insurance Services
• A lack of personalized services, a lack of personalized pricing and a lack of
targeted services for new segments and for specific market segments are some
of the main challenges. Big data is the technology being used in the
industry to offer customer insights for transparent and simpler products, by
analyzing and predicting customer behaviour through information obtained
from websites, including social media, as well as CCTV footage.
• Big data also enables better customer retention for insurance
agencies. In claims management, predictive big data analytics has
been used to provide faster service, given that enormous quantities of
information can be analyzed, particularly in the underwriting stage. Fraud
detection has also been improved. Through massive data from digital
channels and social media, real-time monitoring of claims throughout the
claims cycle is used to provide insights.
2. Big Data in Finance and Crime Detection
• Big data is used extensively for fraud detection in the banking sector, where it
helps to uncover illicit activity. Applications include detecting the
misuse of credit cards and debit cards, archival of audit
trails, enterprise credit risk reporting, trade visibility, customer data
transformation, social analytics for trading, IT operations analytics, and IT policy
compliance analytics. The SEC uses big data to keep track of
financial market activity.
• It is currently using network analytics and natural language processors to catch
illegal trading activity in the financial markets. Retail traders, private and
public sector banks, hedge funds and others in the financial markets make
use of big data for trade analytics used in high-frequency trading, sentiment measurement,
predictive analytics, etc. In business, big data helps a great deal in understanding customers'
shopping patterns and competitors' CRM tactics, so that companies can apply
them in their own businesses to improve sales.
3. Big Data in Healthcare
• Big data is in extensive use in the field of medicine and healthcare. As technology
advances, the cost of healthcare keeps increasing, and big data is a great help
with this issue. It also helps physicians keep track of every patient's
history. The link to a patient's history can be accessed only by the patient and his
or her own physician.
• Once a patient has been treated, the patient's name and data are stored safely and permanently
in the database, and the doctor can view them whenever required. A large number of medical
devices are big data oriented. Today data is used to such an extent that a doctor
can prescribe medicines without even visiting a patient in a remote place, by reading the
heartbeat and temperature from a monitoring watch worn on the patient's
wrist.
• Nanobots are miniature robots under development that are intended to boost immunity in
the human body by fighting bacteria and other harmful germs. They have their own
sensors and are expected to be well suited to delivering chemotherapy. Nanobots are biotech
robots that could be used to carry oxygen, destroy germs and repair tissue.
4. Big Data’s Contribution to Transportation
• In recent times, vast volumes of data from location-based social networks
and high-speed data from telecoms have greatly influenced travel behaviour.
Unfortunately, research to understand travel behaviour has not progressed
as quickly. In most places, transport demand models are still based on
poorly understood new social media structures.
• Some applications of big data by the public sector, private organizations and
individuals include:
• The public sector uses big data in traffic control, route planning, intelligent
transport systems and congestion management
• The private sector uses big data in revenue management, technological enhancements, logistics
and for competitive advantage
• Individual use of big data includes route planning to save on fuel and
time, travel arrangements in tourism, etc.
5. Big Data’s Contributions to Public Sector
• In the public sector, the major challenges are the integration and
interoperability of big data across various public sector units
and affiliated organizations. Big data has a wide range of applications in
government, including energy exploration, fraud detection,
health-related research, financial market analysis and
environmental protection.
• Big data is also used by the FDA to study food-related illnesses. Big
data results arrive quickly, which leads to faster treatment. Big data
analytics is also used in the analysis of large volumes of social
claims. The same analytics are used to process medical
information rapidly and efficiently for faster decision
making and to detect suspicious or fraudulent claims.

1) Data mining
• There are two key terms here: data extraction and data mining. Data extraction is the process of crawling
through and analysing data sources, such as a database, to retrieve relevant information, while data mining is the
process of identifying valuable insights within that data. Such data is collected by data scientists.
• For example, suppose you are a grocery site owner. After using various research techniques, you conclude that
approximately 40% of people wear jeans. This is data extraction. Now you have to go deeper to
understand which age, gender, and type of people use Brand 1 and Brand 2 jeans. This process is known as
data mining.
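
The distinction in the jeans example can be made concrete with a short sketch. Everything here is hypothetical: the survey records, the field names and the `age_band` helper are invented purely for illustration. The first step pulls out the relevant fact (extraction); the second looks for patterns inside that subset (mining).

```python
from collections import Counter, defaultdict

# Hypothetical survey records; every value is made up for illustration.
customers = [
    {"age": 19, "gender": "F", "wears_jeans": True, "brand": "Brand 1"},
    {"age": 24, "gender": "M", "wears_jeans": False, "brand": None},
    {"age": 37, "gender": "F", "wears_jeans": False, "brand": None},
    {"age": 42, "gender": "M", "wears_jeans": True, "brand": "Brand 2"},
    {"age": 55, "gender": "F", "wears_jeans": False, "brand": None},
]


def age_band(age):
    """Bucket an age into a coarse band (an arbitrary split for this sketch)."""
    return "under 30" if age < 30 else "30 and over"


# Data extraction: retrieve the relevant records and the headline figure.
jeans_wearers = [c for c in customers if c["wears_jeans"]]
share = len(jeans_wearers) / len(customers)

# Data mining: look for who wears which brand within the extracted subset.
segments = defaultdict(Counter)
for c in jeans_wearers:
    segments[c["brand"]][(age_band(c["age"]), c["gender"])] += 1
```

With real data the `segments` table would show which age and gender groups favour each brand, which is the kind of insight the mining step is after.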

2) Data collection
• Data collection is the systematic approach to gathering and measuring information from a variety of sources
to get a complete and accurate picture of an area of interest. Big data doesn’t have an “END” button. As the
world grows, data will keep on streaming in. Data needs to be extracted constantly.
• Continuing the above example, some people who wear Brand 1 will have switched to Brand 2, and so on.
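
Because data keeps streaming in, collection is usually repeated and each new snapshot is compared with the previous one. The sketch below shows one possible way to spot customers who switched brands between two collection runs; the `brand_switches` function and the customer IDs are hypothetical.

```python
def brand_switches(previous, current):
    """Return {customer_id: (old_brand, new_brand)} for customers whose brand changed."""
    return {
        cid: (previous[cid], brand)
        for cid, brand in current.items()
        if cid in previous and previous[cid] != brand
    }


# Two made-up snapshots from successive collection runs.
last_month = {"c1": "Brand 1", "c2": "Brand 2", "c3": "Brand 1"}
this_month = {"c1": "Brand 2", "c2": "Brand 2", "c3": "Brand 1", "c4": "Brand 1"}

switched = brand_switches(last_month, this_month)
```

New customers such as `c4` are ignored here because there is no earlier brand to compare against.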
3) Data storing
• Data storage is a general term for archiving data in electromagnetic or other forms for use
by a computer or device. Different types of data storage play different roles in a computing
environment. In addition to forms of hard data storage, there are now new options for
remote data storage, such as cloud computing, that can revolutionize the ways that users
access data. A good data storage system provides an infrastructure that has all the latest
data analytics tools and storage space. Data storage is a step that can be
inserted between any of the other steps.

4) Data cleaning
• Data cleaning (also called data cleansing) is the process of altering data in a given storage
resource to make sure that it is accurate and correct. Data sets come in all forms and degrees
of quality, some good and some not so good, especially if extracted from the web. Therefore, all
extracted data needs to be cleaned. In the cleaning process, all unwanted and inaccurate data is
filtered out, leaving only what you actually want to focus on. Cleaning also helps you
structure your data well. For example, suppose you know the number and type of people wearing jeans
everywhere. While cleaning, you can remove duplicate entries, wrong data, unwanted
regions or information, and more.
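
The three clean-up actions named above (removing duplicates, wrong data and unwanted regions) can be sketched as a single filtering pass. The record fields, the plausible age range and the `clean` function are assumptions made for this illustration only.

```python
def clean(records, allowed_regions):
    """Drop wrong data, unwanted regions and duplicate entries, in that order."""
    seen = set()
    cleaned = []
    for r in records:
        age = r.get("age")
        # Wrong data: age missing, not a whole number, or outside an assumed range.
        if not isinstance(age, int) or not 0 < age < 120:
            continue
        # Unwanted regions: keep only the regions we want to focus on.
        if r.get("region") not in allowed_regions:
            continue
        # Duplicates: the same customer and brand has already been kept.
        key = (r.get("customer_id"), r.get("brand"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(r)
    return cleaned


# Made-up raw records with one duplicate, one bad age and one unwanted region.
raw = [
    {"customer_id": 1, "age": 25, "region": "EU", "brand": "Brand 1"},
    {"customer_id": 1, "age": 25, "region": "EU", "brand": "Brand 1"},
    {"customer_id": 2, "age": -4, "region": "EU", "brand": "Brand 2"},
    {"customer_id": 3, "age": 31, "region": "XX", "brand": "Brand 2"},
    {"customer_id": 4, "age": 40, "region": "US", "brand": "Brand 2"},
]
kept = clean(raw, allowed_regions={"EU", "US"})
```

Only customers 1 and 4 survive the pass, which is the "left with what you actually want to focus on" outcome described in the text.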
5) Data analysis
• Data analysis is the process of evaluating data using analytical and logical reasoning to examine each component of
the data provided. This form of analysis is just one of the many steps that must be completed when conducting a research
experiment. While analysing the data, you uncover your audience's patterns, behaviour and so on. Exploratory
research methods prove very helpful in analysing big data. Analytics is about asking a specific question and
finding the answer to it. Qubole and Statwing are powerful data analytics tools. For example, you might ask:
does my audience like to wear two-pocket jeans? Which colour do they prefer most?
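
Asking a specific question of the data, as described above, might look like the following sketch. The purchase records and the two helper functions are hypothetical; they answer the two example questions about pocket count and preferred colour.

```python
from collections import Counter

# Made-up purchase records for the jeans example.
purchases = [
    {"colour": "blue", "pockets": 2},
    {"colour": "black", "pockets": 4},
    {"colour": "blue", "pockets": 4},
    {"colour": "blue", "pockets": 2},
    {"colour": "grey", "pockets": 2},
]


def most_preferred(records, key):
    """Return the most common value of `key` and how often it occurs."""
    return Counter(r[key] for r in records).most_common(1)[0]


def share_where(records, predicate):
    """Return the fraction of records for which `predicate` is true."""
    return sum(1 for r in records if predicate(r)) / len(records)


# Question 1: which colour is most preferred?
top_colour, top_count = most_preferred(purchases, "colour")
# Question 2: what share of the audience buys two-pocket jeans?
two_pocket_share = share_where(purchases, lambda r: r["pockets"] == 2)
```

Each question maps to one small, well-defined computation, which keeps the analysis focused rather than exploratory for its own sake.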

6) Data consumption
• Data is consumed whenever a mobile data user is connected to the Internet, whether for sending or receiving
emails, downloading or streaming music, videos or apps, or simply browsing web pages. Big data is also consumed in
various verticals. First, businesses use it to identify retail trends in the market and highlight
their top-selling products. Next, government bodies use it to reach the correct
demographics, geographies and ethnicities. Lastly, marketers find big data extremely useful for figuring out which
advertisements work for their products. Big data is consumed in many other places as well, depending on the specific
goals you want to achieve.

The availability of Big Data, low-cost commodity hardware, and new information
management and analytic software has produced a unique moment in the history of
data analysis. The commercial impacts of Big Data have the potential to generate
significant productivity growth for a number of vertical sectors. The convergence of
these trends means that, for the first time in history, we have the capabilities
required to analyse astonishing data sets quickly and cost-effectively. These capabilities
are neither theoretical nor trivial. They represent a genuine leap forward and a clear
opportunity to realize enormous gains in terms of efficiency, productivity, revenue,
and profitability.
Big Data presents an opportunity to create unprecedented business advantages and
better service delivery. The Age of Big Data is here, and these are truly revolutionary
times if both business and technology professionals continue to work together and
deliver on the promise.