
Data Analytic Approach to Business Problems

The CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology is a framework
originally developed by data miners to generalize the common approaches to defining and analyzing
a problem.

Consider the following hypothetical business problem to understand the methodology:


Problem-Solving Framework (CRISP-DM Methodology)

The framework is iterative. Going through the process once without solving the problem is not a
failure!

Business Understanding:
This is the first step in approaching any business problem. The ability to cast the problem as one or
more data science problems is the key skill in this phase. To do this, we ask these questions:

· What decision needs to be made?

· What information is needed to inform that decision?

· What type of analysis will provide the information to inform that decision?

From the business case above, to see what Bob could have done better, let’s answer these key
questions:

· Decision: The company wants to decide which customers to contact regarding their insurance
policy.

· What information is needed to inform that decision? Bob chose the customers who are most likely
to answer a phone call. This is not good enough, because it does not help us achieve our business
goal.

· What type of analysis will provide the information? Predictive analysis, which in this case
produces a list of the customers who are most likely to answer a call.

The predictive model is flawed because it does not predict the information needed to make our
decision.

What should Bob have done?

Bob’s boss wants him to get a list of customers who are most likely to answer a phone call.
However, answering a call is not a good enough indicator that a prospective customer is also
receptive to the offer. Bob should have thought about the business goals, then challenged his boss
on whether predicting the customers who are likely to answer a call is good enough to make a
business decision. This requires both parties to examine the business problem thoroughly, keeping
in mind the kind of data and variables that give a good enough prediction of the target outcome.

We’ll briefly look at some key questions Bob should answer at each stage.

Data Understanding:
The data is the raw material from which the solution will be built. The critical part of data
understanding is to estimate the cost and benefit of each data source and decide on the best
investment of your resources. We dig beneath the surface to grasp the intricacies of the business
problem and the data that is available, then match them to one or more data mining tasks for which
we have modeling tools to apply.

For Data understanding, some questions Bob needs to answer are:

· What data is needed for the insurance problem?

· What data is available?

· What are the important characteristics of the data?

· What are the estimated costs and benefits of acquiring additional data?

Apart from customer demographic data, other important data might include years as a customer,
kinds of insurance, subscription history, times contacted in the past, and any other variable that
might be informative.

Data Preparation:
Real-world data is seldom in the format we require for modeling. Data preparation involves one or
more of the following: Gathering, cleansing, formatting, and sampling of the data.

For the Data Preparation phase, some questions Bob needs to answer are:

· Is the data in the right format for analysis and modeling?

· Are there outliers that can skew the result of the analysis?

· What additional features are needed for analysis and modeling?

· What aggregations are important?

· Is the data properly sampled?
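
Two of the questions above, data format and outliers, can be checked with very little code. The following is a minimal sketch, not a prescribed method: the customer records, the field names (`years_as_customer`, `times_contacted`), and the values are all made up for illustration. It converts text fields to numbers and flags values outside the conventional 1.5 × IQR fences; the 1.5 multiplier is a common rule of thumb rather than part of CRISP-DM.

```python
from statistics import quantiles

# Hypothetical raw records; field names and values are invented for illustration.
raw_customers = [
    {"id": "C1", "years_as_customer": "3", "times_contacted": "2"},
    {"id": "C2", "years_as_customer": "5", "times_contacted": "1"},
    {"id": "C3", "years_as_customer": "4", "times_contacted": "3"},
    {"id": "C4", "years_as_customer": "6", "times_contacted": "2"},
    {"id": "C5", "years_as_customer": "2", "times_contacted": "40"},
    {"id": "C6", "years_as_customer": "7", "times_contacted": "4"},
    {"id": "C7", "years_as_customer": "5", "times_contacted": "3"},
    {"id": "C8", "years_as_customer": "4", "times_contacted": "2"},
]


def to_numeric(records, fields):
    """Return copies of the records with the given text fields converted to float."""
    return [{**r, **{f: float(r[f]) for f in fields}} for r in records]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


customers = to_numeric(raw_customers, ["years_as_customer", "times_contacted"])
contact_outliers = iqr_outliers([c["times_contacted"] for c in customers])
tenure_outliers = iqr_outliers([c["years_as_customer"] for c in customers])

print("times_contacted outliers:", contact_outliers)
print("years_as_customer outliers:", tenure_outliers)
```

Whether a flagged value should be removed, capped, or kept is a business judgment: a customer contacted 40 times may be a data entry error or a genuinely unusual case.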

Analysis and Modeling:
The next step is to perform the analysis and, if appropriate, model the problem. It is important to
understand the data mining technique you want to apply to the prepared data.

The figure below shows a methodology map for Data analysis and predictive modeling depending
on the type of business problem.

To use the methodology map properly, we take a top-down approach, asking questions about the type
of business problem and the amount of data available until we arrive at the predictive model we
will use.

For data analysis, Bob needs to do some exploratory data analysis to answer the following
questions:

· Which variables (data fields) and customer behaviors are most correlated with a customer paying
for insurance?

· Do data fields such as years as a customer, previous insurance subscriptions, and times contacted
in the past have a statistically significant relationship with the target outcome (customer paying
for insurance)?

· Are any predictor variables highly correlated with one another?

· Are there distinct clusters within the data to which customers belong?

· Given the descriptive statistics, what hypothesis should be tested to make an inference?
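
The correlation questions above can be explored with a plain Pearson coefficient. The sketch below is only illustrative: the column names, the eight customers, and the 0.9 threshold for "highly correlated" are assumptions, and `bought_insurance` stands in for the target outcome (1 = the customer paid). It reports each predictor's correlation with the target and lists predictor pairs that are strongly correlated with each other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Hypothetical columns, one entry per customer (values invented for illustration).
data = {
    "years_as_customer": [1, 2, 3, 4, 5, 6, 7, 8],
    "times_contacted": [5, 3, 4, 2, 3, 1, 2, 1],
    "prior_policies": [0, 1, 1, 2, 2, 3, 3, 4],
}
bought_insurance = [0, 0, 0, 1, 0, 1, 1, 1]  # target outcome: 1 = paid

# Correlation of each predictor with the target outcome.
with_target = {name: pearson(col, bought_insurance) for name, col in data.items()}

# Pairs of predictors whose absolute correlation exceeds the assumed 0.9 threshold.
names = list(data)
collinear = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(pearson(data[a], data[b])) > 0.9
]

for name, r in with_target.items():
    print(f"{name}: r = {r:+.2f}")
print("highly correlated predictor pairs:", collinear)
```

A correlation coefficient alone does not establish statistical significance; that would require a hypothesis test on a realistically sized sample, which this sketch does not attempt.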

For Modeling, some questions Bob needs to answer are:

· Is the sampled data enough for training and testing?

· Given that the insurance problem is a classification problem (Yes or No), what classification
algorithm should be used? Some important ones are logistic regression, decision trees, random
forests, and boosted models.

· What parameters and hyperparameters need to be tuned?
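
To make the modeling questions concrete, here is a minimal sketch of a logistic regression classifier written from scratch with a train/test split. Everything about the data is an assumption: the customers are synthetic, the rule that generates the "bought" label is invented, and the learning rate and epoch count are arbitrary hyperparameter choices. In practice a library implementation would normally be used instead.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """Estimated probability that the customer buys (label 1). weights[0] is the bias."""
    return sigmoid(weights[0] + sum(w * xj for w, xj in zip(weights[1:], x)))


def train_logistic(rows, labels, lr=0.5, epochs=1000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def accuracy(weights, rows, labels):
    """Share of rows whose prediction, thresholded at 0.5, matches the label."""
    hits = sum((predict_proba(weights, x) >= 0.5) == bool(y) for x, y in zip(rows, labels))
    return hits / len(rows)


# Synthetic customers generated from an invented rule (an assumption, not real data).
rng = random.Random(42)
rows, labels = [], []
for _ in range(300):
    years = rng.uniform(0, 10)    # years as a customer
    contacts = rng.uniform(0, 5)  # times contacted in the past
    bought = 1 if years - 2 * contacts + rng.gauss(0, 1) > 0 else 0
    rows.append([years / 10, contacts / 5])  # scale both features to [0, 1]
    labels.append(bought)

# Rows are generated independently, so a simple 80/20 split by position is enough here.
split = int(0.8 * len(rows))
train_x, test_x = rows[:split], rows[split:]
train_y, test_y = labels[:split], labels[split:]

weights = train_logistic(train_x, train_y)
train_acc = accuracy(weights, train_x, train_y)
test_acc = accuracy(weights, test_x, test_y)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The learning rate and number of epochs here are the hyperparameters; the fitted `weights` are the model parameters. Holding out 20% of the data gives an estimate of how the model behaves on customers it has not seen.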

Validation & Evaluation Stage:
Here we want to assess the data mining results to ensure that they achieve the business objectives
before presenting or deploying them. A key objective is to determine whether there is some business
issue that has not been sufficiently considered.

Remember that the data mining process is iterative. It should be repeated until the results are in
line with the business goals.

For the Validation stage, key questions Bob should answer are:

· What are the key results of the model?

· Are the results biased, or does the model appear to be overfitting?

· Does the result make sense within the context of the business problem?

· Do we proceed to the next step or return to a previous phase?
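
The first two validation questions can be made measurable. The sketch below, offered as one possible approach, summarizes a classifier's results with accuracy, precision, and recall, and flags possible overfitting when training accuracy exceeds held-out accuracy by more than a chosen tolerance. The labels, predictions, accuracy figures, and the 0.10 tolerance are all hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = bought)."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    return tp, fp, fn, tn


def summarize(actual, predicted):
    """Return accuracy, precision, and recall for binary predictions."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Precision: of the customers the model says to contact, the share who buy.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: of the customers who would buy, the share the model finds.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def overfitting_gap(train_accuracy, test_accuracy, tolerance=0.10):
    """Return (gap, flagged); flagged is True when train beats test by more than tolerance."""
    gap = train_accuracy - test_accuracy
    return gap, gap > tolerance


# Hypothetical held-out labels and model predictions for ten customers.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

report = summarize(actual, predicted)
gap, flagged = overfitting_gap(train_accuracy=0.98, test_accuracy=report["accuracy"])

print(report)
print(f"train/test gap: {gap:.2f}, possible overfitting: {flagged}")
```

For the insurance problem, precision is the figure closest to the business goal, since it estimates how many contacted customers respond positively. A large gap between training and held-out results is one reason to return to an earlier phase rather than proceed.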


Presentation & Deployment:
Having completed model evaluation and analysis, it is time to present your results to the
decision-makers. For presentations, the kind of visualizations used should be determined by the
audience and the type of analysis itself. The best practice is to tell a story with the data that
meets the needs of the decision-makers. Sometimes, you may need to walk the audience through the
analysis, emphasizing the decisions and assumptions you made along the way. It is important to
measure the success of your analysis by whether it fits the decision that needs to be made.

Deployment involves implementing the predictive model in an information system or business
process.

Finally, Bob presents the results of his data analysis to his boss, and the company achieves a 70%
positive response.

CONCLUSION:

Developing a mental model of the data mining framework helps us understand what needs to be
done when faced with a business problem. It is important to remember that the CRISP-DM
methodology is an iterative process. So, if there are still difficulties in coming up with a solution,
we go back to the business understanding phase. A second iteration may lead to a better solution.

