ABFM MODULE - D
Chapter 23: BUSINESS ANALYTICS AS A MANAGEMENT TOOL
What will we study?
* All about Business Analytics as a Management Tool
Join CAIIB WITH ASHOK on YouTube & App
INTRODUCTION:
Business analytics (BA) refers to the combination of skills, technologies, and practices used to analyse an organisation's data and performance in order to gain insights and make data-driven decisions about the future.
Statistical analysis is one of the most common methods
used in business analytics.
The aim of business analytics is to determine which datasets are valuable and which have the potential to boost revenue, productivity, and efficiency.
BA may be used to make accurate predictions of future events related to the activities of consumers and trends in the market.
It can also help create more efficient operations, which could contribute to an increase in revenue if it is used to its full potential.
Data Mining History and Origins:
During the late 1980s and early 1990s, data warehousing, business intelligence, and analytics technologies began to develop.
These innovations provided an enhanced capability to
evaluate the ever-increasing amounts of data that
organisations were creating and gathering.
By the year 1995, when the First International Conference
on Knowledge Discovery and Data Mining was held in
Montreal, the phrase "data mining" was already in common
usage.
The Association for the Advancement of Artificial Intelligence (AAAI), which also hosted the conference annually for the subsequent three years, was the organisation responsible for sponsoring the event.
The conference, which has been held annually since 1999 and is commonly referred to by year (KDD 2021 and so on), is primarily coordinated by the Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), the part of the Association for Computing Machinery that focuses on knowledge discovery and data mining.
In 1997, the first issue of a specialised journal called Data
Mining and Knowledge Discovery was released to the
public.
It was once published on a quarterly basis, but it is now published bimonthly and carries peer-reviewed articles on data mining and knowledge discovery.
In 2016, a second publication known as the American
Journal of Data Mining and Knowledge Discovery was made
available to readers.
ESSENTIALS OF BUSINESS ANALYTICS:
There are numerous different applications for Business Analytics (BA).
In commercial enterprises, BA is most commonly used to:
* Analyse data coming from a range of different sources.
Anything from cloud applications to marketing automation
tools and customer relationship management software
could fall under this category.
* Find patterns within the data sets by employing more
complex analytics and statistical methods.
These patterns can assist you in predicting future trends
and providing you with new information regarding
consumers and the behaviours they engage in.
* Keep an eye on key performance indicators (KPIs) and
trends as they evolve in real time. Because of this, it is much
simpler for companies to not only store all of their data in a
single location but also draw correct and speedy conclusions
from those data.
* Back up decisions with the most recent available facts. Because BA gives us access to such a large amount of data to put to use in support of business decisions, we can be certain that we are well-informed not just for one but for multiple distinct scenarios.
When it comes to applying BA, there is no single strategy
that is more significant than the others; it all depends on
what our final aim is.
TYPES OF ANALYTICS:
When you apply these four different types of analytics, your data can be cleansed, examined, and digested in a way that makes it feasible to produce answers to any difficulties that your organisation may be facing.
Descriptive analytics:
This method involves the interpretation of historical data
and key performance indicators to discover patterns and
trends.
Methods such as data aggregation and data mining make it possible to get a comprehensive view of events that have occurred in the past as well as those occurring at the present time.
Numerous businesses today make use of descriptive
analytics to gain a more in-depth understanding of the
actions taken by their customers and the ways in which
they may better direct their marketing efforts toward those
customers.
Diagnostic analytics:
This type of analysis focuses on previous performance to
understand which factors drive particular trends.
This can be accomplished through the use of drill-down,
data discovery, data mining, and correlation to uncover the
reasons behind particular occurrences.
Once the likelihood of an event and the reasons why it may occur are understood, classification and regression algorithms are applied.
Predictive analytics:
This is the practice of applying statistics to estimate and
evaluate future outcomes by employing statistical models
and techniques derived from machine learning.
In many cases, the conclusions of descriptive analytics are
used in this manner to construct models that determine the
likelihood of particular outcomes.
It is common for sales and marketing teams to employ this
type in order to forecast the opinions of specific clients
based on data collected from social media.
Prescriptive analytics:
This approach makes use of data on previous performance
to make recommendations for how similar situations should
be managed in the future.
This particular kind of business analytics not only forecasts results, but can also make suggestions about the particular actions that need to take place in order to reach the best possible outcome.
Deep learning and sophisticated neural networks are
frequently used to accomplish this goal.
The purpose of this type of business analytics is often to
match different solutions to the immediate requirements of
a customer.
The current state of the company's operations will play a
significant role in determining which approach to pursue.
ELEMENTS OF BUSINESS ANALYTICS:
When one takes a more in-depth look at business analytics,
the method of business analytics that we choose to use is
going to be contingent on the end-goal that we establish for
ourselves before beginning the process.
The various elements of business analytics are as follows:
1. Data Mining
2. Text Mining
3. Data Aggregation
4. Forecasting
5. Data Visualisation
Data Mining:
Data mining is the process of searching through big data sets in order to find patterns and relationships that, when analysed, can assist in resolving issues that arise in commercial enterprises.
Data mining methodologies and tools give enterprises the ability to forecast future trends and make better-informed business decisions.
The process of extracting meaningful information from data
sets is known as "data mining," and it is one of the
fundamental disciplines that make up "data science."
Data mining is a step in the knowledge discovery in databases (KDD) process, a data science approach for obtaining, processing, and analysing data.
Although the terms are sometimes used interchangeably, data mining and knowledge discovery are different concepts: data mining is just one step within the broader KDD process.
The information that it generates can be put to use in
applications for business intelligence (BI) and advanced
analytics, both of which involve the examination of
historical data.
Additionally, the information can be put to use in
applications for real-time analytics, which look at streaming
data as it is being created or collected.
Data mining that is done well can assist in the planning and management of numerous elements of corporate operations and strategy.
This covers customer-facing services such as marketing, advertising, sales, and customer support.
It also includes functions such as production, supply chain management, finance, and human resources.
The prevention of fraud, management of risk, and planning for cybersecurity are only a few of the many important business uses that data mining serves.
Data Mining Process: How Does It Work?
Data mining is often carried out by data scientists and other qualified experts in the business intelligence and analytics fields.
Machine learning and statistical analysis are two of its
fundamental components, coupled with data management
operations that are carried out in order to get the data
ready for analysis.
Mining massive data sets, such as customer databases,
transaction records, and log files from web servers, mobile
apps, and sensors, has become significantly simpler thanks
to the implementation of machine learning algorithms and
other artificial intelligence (AI) tools.
This has resulted in a greater degree of process automation.
The process of data mining can be broken down into four basic steps, as follows:
Data collection:
It is determined which data are pertinent for an analytics
application, and then they are compiled.
The data could be stored in a variety of source systems, a
data warehouse, or a data lake, the latter of which is
becoming an increasingly typical repository in big data
contexts and is comprised of a combination of structured
and unstructured data.
There is also the possibility of utilising data from external
sources.
In order to continue with the process after the data has
been collected from its original location, a data scientist will
frequently relocate it to a data lake.
Data preparation:
During this stage, a series of actions are carried out to get
the data prepared for the mining stage.
It begins with the exploration, profiling, and pre-processing
of data, and then moves on to the job of data cleansing to
correct errors and other issues related to the data's quality.
It is also necessary to convert data in order to keep data
sets consistent.
This is the case unless a data scientist intends to do an
analysis on raw, unfiltered data for a specific application.
Mining the data:
After the data has been prepared, a data scientist will select
the proper data mining technique, at which point they will
apply one or more algorithms in order to mine the data.
Before being applied to the whole set of data, the
algorithms that are used in machine learning applications
often need to be trained on smaller sample data sets to
search for the information that is being sought after.
The interpretation and analysis of the data:
The findings from data mining are incorporated into
analytical models, which are then utilised to guide decision-
making and other aspects of business operations.
It is the responsibility of the data scientist or another
member of the data science team to explain the findings to
business executives and users.
This is typically accomplished through the use of data
visualisation and methodologies that are based on data
storytelling.
Types of Data Mining Techniques:
Various techniques can be used to mine data for different
data science applications.
A common data mining use case that is enabled by multiple
techniques is pattern recognition.
Anomaly detection, which seeks to identify outlier values in
data sets, is another data mining use case that is enabled by
multiple techniques.
The following categories of data mining methods are among
the most common:
Data mining using association rules:
When mining data, if-then statements known as association
rules are used to determine the connections between
different data elements.
Support and confidence are two of the criteria that are
utilised in the process of evaluating the relationships.
Support is a measurement of the frequency with which the
related elements appear in a data set, and confidence is a
reflection of the number of times an if-then statement is
accurate.
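Support and confidence can be computed directly. Here is a minimal sketch in plain Python; the transaction baskets are invented for illustration:

```python
# Toy transaction data (hypothetical baskets, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that
    also contain `consequent` -- how often the if-then rule holds."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))
```

For example, the rule "if bread, then milk" holds in 2 of the 3 baskets that contain bread, giving a confidence of about 0.67.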
Classification:
Using this strategy, the components of the data sets are
partitioned into the various categories that have been
established as part of the data mining process.
Among the many classification methods available, some
examples include decision trees, Naive Bayes classifiers, k-
nearest neighbour, and logistic regression.
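As a sketch of one of the methods named above, here is a tiny k-nearest neighbour classifier in plain Python; the training points and risk labels are invented for illustration:

```python
import math
from collections import Counter

def knn_classify(point, labelled_points, k=3):
    """Classify `point` by majority vote among its k nearest
    labelled neighbours, using Euclidean distance."""
    nearest = sorted(labelled_points, key=lambda lp: math.dist(point, lp[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (features, class label).
training = [
    ((1.0, 1.0), "low risk"),
    ((1.2, 0.8), "low risk"),
    ((5.0, 5.0), "high risk"),
    ((5.5, 4.5), "high risk"),
]
```

A new point such as (1.1, 0.9) falls near the "low risk" examples and is classified accordingly.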
Clustering:
As part of the data mining applications, the data elements
that have certain characteristics in common are grouped
together into clusters.
Some examples of clustering methods are k-means
clustering, hierarchical clustering, and Gaussian mixture
models.
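The k-means idea can be sketched in a few lines. This is a minimal one-dimensional version with a naive initialisation, not a production implementation:

```python
def kmeans_1d(points, k, rounds=10):
    """Minimal one-dimensional k-means: repeatedly assign each point to
    its nearest centroid, then move each centroid to its cluster's mean."""
    centroids = list(points[:k])  # naive initialisation: first k points
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)
```

Run on two obvious groups of values, the centroids settle on the group means.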
Regression:
Calculating predicted data values based on a set of variables
is another method that can be utilised in the process of
discovering relationships hidden within data sets.
Some examples of regression include linear regression and
multivariate regression.
Some classification methods, such as decision trees, can also be used for regression tasks.
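Linear regression with a single predictor reduces to two closed-form formulas, sketched here in plain Python:

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns the (slope,
    intercept) pair that minimises the sum of squared errors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x
```

Fitting points that lie exactly on y = 2x + 1 recovers a slope of 2 and an intercept of 1.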
Sequence and path analysis:
Data can also be mined to look for patterns in which one set
of events or values leads to later ones.
This type of pattern can be used to predict future events.
Neural networks:
The functioning of the human brain can be modelled using a
system of computer programmes known as a neural
network.
Deep learning, which uses neural networks with many layers, is a subfield of machine learning considered a more advanced form of the field.
Neural networks are particularly helpful in pattern
recognition applications that involve deep learning.
Data Mining Software and Tools:
A large number of companies offer data mining tools, and these products are generally packaged as part of larger software platforms that contain a variety of other data science and advanced analytics tools.
Data preparation capabilities, built-in algorithms, support
for predictive modelling, a graphical user interface (GUI)
based development environment, and tools for deploying
models and scoring how well they perform are among the
most important aspects offered by software designed for
data mining.
Alteryx, AWS, Databricks, Dataiku, DataRobot, Google,
H2O.ai, IBM, Knime, Microsoft, Oracle, RapidMiner, SAP,
SAS Institute, and Tibco Software are among the many
vendors that offer solutions for data mining.
Data mining can also be accomplished with the assistance of a number of free and open-source technologies, such as DataMelt, ELKI, Orange, Rattle, scikit-learn, and Weka.
Some software vendors also offer open-source options.
For instance, Knime is able to manage data science
applications by combining an open source analytics
platform with commercial software.
Other businesses, such as Dataiku and H2O.ai, provide free
versions of their respective technologies.
Benefits of Data Mining:
The primary benefit that data mining provides to businesses is its capability to discover previously hidden patterns, trends, correlations, and anomalies in data sets.
Combining traditional data analysis with predictive analytics
is one way that this knowledge can be put to use to
enhance the processes of decision-making and strategy
planning in commercial enterprises.
The following is a list of specific benefits that come with
data mining:
Increased productivity in terms of marketing and sales:
Mining consumer data for patterns in behaviour and preferences helps marketers better understand their clients, which in turn enables them to develop more targeted marketing and advertising campaigns.
In a similar vein, sales teams can leverage the results of
data mining to enhance lead conversion rates and market
additional products and services to clients who have already
purchased from them.
Improved quality of service to customers:
Because of data mining, businesses are able to detect
possible problems with customer service in a more timely
manner and provide contact centre personnel with up-to-
date information that can be used during phone calls and
online chats with customers.
Improvements in the management of the supply chain:
Companies are able to recognise patterns in the market and make more accurate projections about the demand for their products, which enables them to better manage their stocks of goods and supplies.
The information obtained through data mining can also be
utilised by supply chain managers in order to optimise
warehousing, distribution, and other logistics operations.
Retail:
Online merchants can better focus their marketing efforts,
advertisements, and promotional offers to individual
customers by mining consumer data and tracking shoppers'
click streams on the internet.
Data mining and predictive modelling are the driving forces
behind recommendation engines, which make suggestions
about potential purchases to website users.
These technologies are also used in the management of
inventory and supply chains.
Financial services:
Data mining technologies are utilised by financial
institutions such as banks and credit card firms in order to
construct financial risk models, identify fraudulent
transactions, and validate loan and credit applications.
Data mining is also an essential component of marketing
and is essential for determining whether or not existing
clients have prospects for upselling.
Insurance:
Data mining is utilised by insurance companies to assist
with the pricing of insurance policies as well as the
determination of whether or not to approve policy
applications.
This process also includes risk modelling and management
for prospective clients.
Manufacturing:
Applications of data mining for manufacturers include work
to enhance uptime and operational efficiency in production
plants, as well as product safety and the performance of
supply chains.
Entertainment:
Streaming services mine user data to determine what
people are watching or listening to on their platforms, and
then utilise this information to provide personalised
suggestions based on users' viewing and listening
preferences.
Healthcare:
The ability to diagnose medical diseases, treat patients, and
analyse X-rays and other medical imaging results is made
possible with the use of data mining.
Data mining, machine learning, and various other forms of
analytics are also extremely important to the field of
medical research.
Text Mining:
Text mining, also known as text data mining, is the process
of converting unstructured text into a structured format in
order to find relevant patterns and fresh insights.
Companies are able to investigate and identify hidden links
within their unstructured data when they employ advanced
analytical approaches such as Naive Bayes, Support Vector
Machines (SVM), and other deep learning algorithms.
Within databases, text is one of the types of data that is
used the most frequently.
This information might be arranged in the following ways,
depending on the database:
Structured data:
This data has been standardised into a tabular format,
which consists of several rows and columns.
This makes it much simpler to store and handle for the
purposes of analysis and machine learning algorithms.
Inputs like names, addresses, and phone numbers are all
examples of the kinds of things that can be included in
structured data.
Unstructured data:
This data does not adhere to any particular data format that
has been standardised.
Text from various sources, such as social media or product
reviews, as well as rich media formats, such as video and
audio files, may be included in this section.
Semi-structured data:
This information is a combination of structured and
unstructured data forms, as the name of the data set
suggests.
Although it is organised to some degree, it does not possess
the necessary level of structure to fulfil the prerequisites of
a relational database.
Files written in XML, JSON, and HTML are all examples of
types of data that are considered semi-structured.
Text mining is an immensely helpful activity for organisations to implement because the majority of data in the world is stored in an unstructured format.
Text mining tools and natural language processing (NLP) approaches, such as information extraction, enable us to transform unstructured material into a structured format, which in turn enables analysis and the development of high-quality insights.
This leads to improved decision-making within organisations, which in turn leads to improved outcomes for businesses.
Text Mining Techniques:
Text mining involves deducing information from unstructured text data through a series of activities.
Before various text mining techniques can be applied, the text must first go through pre-processing: the practice of cleaning and transforming text data into a usable format.
This methodology is an essential part of natural language
processing (NLP), and it typically entails the application of
processes such as language identification, tokenization,
part-of-speech tagging, chunking, and syntax parsing in
order to appropriately format data for analysis.
After the text has been pre-processed to your satisfaction,
you will be able to apply text mining algorithms to the data
in order to gain insights.
The following is a list of some of the more common text
mining techniques:
The retrieval of information:
Information retrieval, also known as IR, is the process of
locating and delivering pertinent data or documents based
on a predetermined list of queries or phrases.
IR systems make use of algorithms to monitor user activities
and identify data that is pertinent to those activities.
The process of information retrieval is utilised frequently in
library catalogue management systems as well as in popular
search engines such as Google.
The following are some examples of typical IR sub-tasks:
Tokenization: refers to the process of separating a lengthy
piece of text into individual sentences and words that are
referred to as "tokens."
After that, these are incorporated into models, such as bag-
of-words, that are used for text clustering and document
matching activities.
Stemming: is the process of removing prefixes and suffixes from words in order to determine the form and meaning of the root word.
This method decreases the amount of space required for index files, which results in improved information retrieval.
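The two sub-tasks above can be sketched in a few lines of Python. The stemmer here is a crude suffix-stripper, a toy stand-in for a real algorithm such as Porter's stemmer:

```python
import re

def tokenize(text):
    """Break a piece of text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Illustrative suffix list only; real stemmers use far richer rules.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(token):
    """Strip the first matching suffix, provided a root of at least
    three letters remains; otherwise return the token unchanged."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token
```

For example, "banks" stems to "bank" and "walked" to "walk", so an index needs only one entry per root.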
Natural Language Processing:
Natural language processing is an offshoot of computational linguistics that draws on techniques from a variety of fields, including computer science, artificial intelligence, linguistics, and data science, to give computers the ability to comprehend spoken and written human language.
NLP sub-tasks allow computers to "read" by analysing sentence structure and grammar.
Typical examples of subtasks are as follows:
Summarization: is a method that condenses lengthy passages of text into a concise and logical overview of the most important points of a document.
Part-of-speech (PoS) tagging: is a method in which a tag is
assigned to each token in a document based on the part of
speech that the token denotes, such as nouns, verbs,
adjectives, and so on.
Following this step, semantic analysis can be performed on
unstructured text.
Text categorization: This task, also known as text classification, analyses text documents and classifies them under predefined topics or categories.
When it comes to classifying synonyms and abbreviations,
this subsidiary task is especially useful.
Sentiment analysis: is a task that identifies positive or
negative sentiment from internal or external data sources.
This gives you the ability to monitor changes in customer
attitudes over the course of time.
It is frequently utilised to provide information about
people's opinions regarding various brands, products, and
services.
These insights have the potential to propel businesses
toward connecting with customers and improving processes
as well as the user experiences they provide.
The Extraction of Information:
When searching through a variety of documents, information extraction (IE) brings the pertinent pieces of data to the surface.
The emphasis is placed on extracting structured information from free text and storing information about entities, attributes, and relationships in a database.
The following are examples of common information
extraction sub-tasks:
Feature Selection:
The process of selecting the important features
(dimensions) that will contribute the most to the output of
a predictive analytics model is referred to as feature
selection, which is also known as attribute selection.
Feature Extraction:
The process of selecting a subset of features in order to
improve the accuracy of a classification task is referred to as
feature extraction.
This is of utmost significance when attempting to reduce
the number of dimensions.
Named-entity recognition:
also known as entity identification or entity extraction,
seeks to locate and classify particular entities in text, such
as names or locations.
This can be accomplished by searching for and analysing the
text.
For instance, NER recognises "Mary" as a female name and
"California" as the name of a place in the world.
Data Aggregation:
The process of collecting raw data and presenting it in a
summary format for the purposes of statistical analysis is
referred to as data aggregation.
For instance, raw data can be aggregated over a specified
amount of time to provide statistics like the average, the
lowest, the maximum, the sum, and the count.
Following the aggregation of the data and its subsequent
writing to a view or report, you will be able to perform an
analysis on the aggregated data in order to get insights on
specific resources or resource groupings.
There are two different approaches to aggregating data:
Time aggregation:
All data points for a single resource, aggregated over a specified time period.
Spatial aggregation:
All data points for a group of resources, aggregated over a specified time period.
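Time aggregation can be sketched in plain Python: bucket timestamped samples for one resource into fixed intervals and compute the summary statistics for each bucket (the sample data is invented):

```python
from collections import defaultdict

def summarise(values):
    """The usual aggregate statistics for a batch of raw data points."""
    return {"count": len(values), "sum": sum(values),
            "min": min(values), "max": max(values),
            "average": sum(values) / len(values)}

def time_aggregate(samples, interval):
    """Time aggregation: bucket (timestamp, value) samples for one
    resource into fixed intervals (e.g. 300 s = five-minute
    granularity) and summarise each bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % interval].append(value)
    return {start: summarise(vals) for start, vals in sorted(buckets.items())}
```

With a 300-second interval, samples at t = 0 s and t = 60 s land in the same bucket, while one at t = 310 s starts the next.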
Time Intervals for Data Collection and Aggregation:
Within the context of a number of different time intervals,
data is compiled and shown in a view or report as follows:
Time frame for reports:
This refers to the time frame that encompasses the
collection of data prior to its dissemination.
For instance, a resource summary table can include data
that was gathered for a specific network device over the
course of a single day.
A reporting period could contain raw data points (data that has not been aggregated) as well as aggregated data points.
The time intervals supported for reports are: daily, weekly,
monthly, quarterly, and yearly.
Granularity:
Granularity is defined as the time frame during which
individual data points for a specific resource or collection of
resources are gathered for the purposes of aggregation.
For instance, the granularity would be five minutes if you
wanted to get the average of the data points for a particular
resource that were gathered over the course of five
minutes.
Granularity can range anywhere from one minute to one
month, depending on the view or report type, as well as the
time period being analysed.
Data View is capable of dynamically aggregating data down
to a granularity of less than one day.
Data Channel aggregates data for larger granularity values.
Polling period:
The polling period establishes the frequency with which resources are sampled for data.
For example:
A set of resources may be polled once every five minutes, which would mean that a data point is produced for each resource once every five minutes.
The output of a spatial aggregate can be affected by a
number of factors, including polling period and granularity.
For illustration's sake:
Let's say you want to determine the average of a collection of data points gathered from a group of devices over the course of ten minutes (the granularity).
If the polling period is also 10 minutes, the result is the average of the single data point acquired from each device.
If, on the other hand, the polling period is only 5 minutes, then each device is sampled twice over the 10-minute granularity period.
The aggregated result is the average of all the data points obtained, including, for each resource, the data point gathered during the first polling period and the one gathered during the second.
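The spatial-aggregation arithmetic is simple: every point from every device in the group enters the average. A minimal sketch, with invented device readings:

```python
def spatial_aggregate(samples_per_device):
    """Average every data point collected from a group of devices over
    one granularity period, whatever the polling frequency."""
    points = [v for values in samples_per_device.values() for v in values]
    return sum(points) / len(points)

# Hypothetical readings: 10-minute granularity with a 5-minute polling
# period, so each device contributes two samples to the aggregate.
readings = {"device_a": [40, 60], "device_b": [10, 30]}
```

Here the aggregate is (40 + 60 + 10 + 30) / 4 = 35.0, not the average of the two per-device means computed over a single sample each.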
Forecasting:
The process of making predictions about what will occur in
the future by taking into account what has happened in the
past and what is happening in the present is referred to as
forecasting.
In its most fundamental form, it is a decision-making tool
that examines previous data and patterns with the goal of
assisting organisations in dealing with the impact of the
unpredictability of the future.
It is a tool for planning that gives companies the ability to
map out their next steps and set budgets that will ideally
cover any unpredictability the future may bring.
Forecasting Methods:
When companies wish to make educated guesses about
what might take place in the future, they have the option of
choosing between two fundamental approaches: qualitative
and quantitative approaches.
1. Qualitative Research Approach:
The qualitative approach to forecasting, sometimes referred to as the judgemental method, generates subjective findings because it is based on the personal judgement of experts or forecasters.
Because the process of creating forecasts is not based on
mathematics but rather on the knowledge, intuition, and
experience of the experts making them, there is a high
likelihood that the forecasts will contain errors.
One illustration would be a person predicting the outcome of an NBA finals game, which would, of course, be influenced by their own personal interest and bias.
The possibility of error is one of the shortcomings of such a strategy.
2. Quantitative Technique:
The quantitative technique of predicting is based on a
mathematical procedure, which gives it the qualities of
being consistent and objective.
Rather than basing results on opinion and intuition, it uses enormous volumes of data and statistics that are then interpreted.
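One of the simplest quantitative techniques is the simple moving average, which forecasts the next value as the mean of the most recent observations. A sketch, with invented sales figures:

```python
def moving_average_forecast(history, window=3):
    """A simple quantitative forecast: predict the next value as the
    mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

# Hypothetical monthly sales figures.
sales = [100, 110, 120, 130]
```

A three-month window forecasts the next month as (110 + 120 + 130) / 3 = 120.0; the same data and formula always give the same answer, which is what makes the method consistent and objective.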
Features of Forecasting:
The following is a list of some of the characteristics of
creating a forecast:
1. Involves future events:
Because they are used to make predictions about the
future, forecasts are an essential part of the planning
process.
2. Covers recent and historical occurrences:
Opinions, intuition, and educated estimates, in addition to
facts, figures, and other pertinent data, are the foundations
on which forecasts are constructed.
All of the components that go into the formation of a
prediction are, to some extent, a reflection of what has
occurred with the company in the past as well as what is
anticipated to take place in the foreseeable future.
b) NoSQL databases:
Are non-relational data management systems that do not
require a fixed schema.
Because of this, they are an excellent choice for large
amounts of raw, unstructured data.
These databases, whose name comes from the phrase "not
only SQL," are able to deal with a wide variety of data
models.
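The flexible-schema idea can be sketched with plain Python dictionaries standing in for documents in a document store; the collection and field names here are invented for illustration:

```python
# A minimal sketch of the document model used by many NoSQL stores:
# records in the same collection need not share a fixed set of fields.
customers = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "Ravi", "phone": "555-0102", "segment": "retail"},
]

# Queries must tolerate missing fields instead of relying on a rigid schema.
with_email = [c["name"] for c in customers if "email" in c]
print(with_email)  # only the records that happen to carry an email field
```

A relational database would force both records into one table with NULLs for the missing columns; a document store simply lets each record carry whatever fields it has.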
c) MapReduce:
Is a key component of the Hadoop framework, and it
performs two distinct steps.
The first step, called mapping, distributes data to different
nodes within the cluster using various filters.
The second step, called reducing, organises and condenses
the results obtained from each node in order to answer a
query.
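The two steps can be sketched with a single-process Python word count; in Hadoop the same map and reduce phases would run in parallel across many cluster nodes:

```python
from collections import defaultdict

# A minimal, single-process sketch of the MapReduce pattern: a word count.

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce: group the pairs by key and sum the counts."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

lines = ["big data needs big tools", "data drives decisions"]
counts = reduce_phase(map_phase(lines))
print(counts["big"], counts["data"])  # 2 2
```

The key property is that mapping is independent per input record and reducing is independent per key, which is what lets Hadoop spread both phases across a cluster.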
d) YARN ("Yet Another Resource Negotiator"):
Is yet another component of the second-generation Hadoop
system.
This cluster-management technology helps improve job
scheduling and resource management in the cluster.
e) Spark
Is an open-source cluster computing framework that
provides an interface for programming entire clusters, with
implicit data parallelism and fault tolerance.
Spark can handle both batch and stream processing, which
enables it to perform computations quickly.
f) Tableau
Is a platform for end-to-end data analytics that enables you
to prepare, analyse, collaborate, and share your insights
derived from large amounts of data.
Tableau is a leader in the field of self-service visual analysis,
which enables individuals to ask new questions of managed
big data and to easily communicate their findings
throughout an organisation.
Challenges of Big Data:
Big data delivers big benefits, but it also introduces big
challenges, such as new privacy and security concerns,
accessibility for business users, and the need to choose the
right solutions for your company's requirements.
In order for enterprises to make the most of incoming data,
they will need to handle the following issues:
Making big data accessible:
When there is a greater volume of data, it is significantly
more challenging to collect and process the data.
It is imperative that organisations make the usage of data
simple and accessible to individuals with varying degrees of
expertise.
Maintaining data quality:
Due to the large amount of data that needs to be
maintained, businesses are devoting more time than they
ever have before to the process of checking for mistakes,
inconsistencies, conflicts, and duplication.
Data Security:
Privacy and safety are becoming ever more of a worry as
more and more data is collected.
Before organisations can begin to reap the benefits of big
data, they will need to first work toward compliance and
establish data processes that are particularly stringent.
Identifying appropriate tools and platforms:
Continuous innovation takes place in the field of developing
technologies that can process and analyse large amounts of
data.
In order to meet their specific requirements, organisations
have to locate suitable technological solutions that are
compatible with the ecosystems they have already
developed.
The optimal solution is frequently one that is also flexible
and able to adapt to alterations in the underlying
infrastructure at a later point.