High ROI of Data Mining

BUSINESS INTELLIGENCE
INDIVIDUAL ASSIGNMENT:
GENERAL INSTRUCTIONS:
- Read the following article.
- Answer the questions/Follow the tasks after the article on a separate MS
WORD document. Please include your name in the Word document.
- The document must be saved in the following format:
o YourSurname_yourFirstNameHW01.doc
- Submit your document via agsb_bintelstd@yahoogroups.com in the folder
High ROI of Data Mining.
- DEADLINE: Session 3 before 6:00 pm class (late submissions accepted but
deductions will be given)
ARTICLE:
The High ROI of Data Mining for Innovative Organizations

FEBRUARY 19, 2009 in ANALYTICS, DATA MINING, DECISION MANAGEMENT
TAKEN FROM THE WEBSITE
http://jtonedm.com/2009/02/19/the-high-roi-of-data-mining-for-innovative-organizations/
(JT on EDM: James Taylor on Everything Decision Management)
John Elder presented a collection of case studies to showcase the ROI of data
mining. John started by making the point that many of his case studies had
technical success but not business success, an interesting statistic. John sees
three major ways that predictive analytics can help: streamlining, eliminating
the bad, or discovering the good.
John and his team do a tremendous amount of work with data mining and predictive
analytic tools and know how well they work but also consider the human aspect
critical. After all, computers are both powerful and mindless and the human
aspect of putting them to work is key.
Gartner have hype cycles for products, and data mining, unlike artificial
intelligence, is on the plateau of productivity. The focus of data mining on
bottom line activities is part of why it is already considered productive. In
addition, most corporate processes are already fairly well performed and so
small improvements (using data mining) really matter. Cases on streamlining or
automating decisions came first:
HSBC wanted to cross-sell products and used their historical data to find
out what might interest a customer next. They wanted to take a customer
contact that was a pure cost and make it a benefit by targeting inbound
calls or contacts at a branch. Used data mining and visualization to present
new ideas to people.
Anheuser-Busch wanted to see how their products are displayed in stores.
Knowing this helps them see what works and does not work and helps them
manage their products in a store. Used analytics to take an image and
automate the definition of a plan for the shelf. Easier than other visual
recognition because products and brands make it easy to spot what's what.
Got a 90% accuracy rate, dramatically improving the process.
Lumidigm is a biometrics company that uses how your skin reflects infrared
light to identify you. Originally wanted to use this to diagnose
disease but found that person-specific factors were overwhelming it. To use
the differences required analytics to predict how likely someone is who they
say they are. The fact that none of the models were 100% accurate did not
mean it could not be used: Disney use it for tying people to their tickets,
for instance.
Peregrine Systems wanted to develop a "Sim City" for IT and let an IT
department simulate the impact of staffing, service level agreements, etc.
The analytics allowed IT departments to answer questions like where to add
staff or what the impact of upgrading laptops would be. One of the key
learnings was to keep uncertainty throughout the calculations.
Social Security Administration wanted to improve a 2-year
disability process where about a third were accepted and half of those
declined succeed on appeal. Needed a way to fast track easy
applications. But what is an easy application? Used text mining of the
application data to predict with 90% accuracy the 20% easy cases.
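The Peregrine lesson about keeping uncertainty throughout the calculations can be illustrated with a small Monte Carlo sketch. Everything in it is a hypothetical assumption made for illustration (the ticket volumes, the per-person throughput, and the function names `simulate_backlog` and `summarize`); it is not Peregrine's actual model. The point is only that uncertain inputs are sampled and carried through to the output as a distribution, instead of being collapsed to averages at the start.

```python
import random
import statistics


def simulate_backlog(staff, n_trials=5000, seed=0):
    """Simulate weekly unresolved tickets for a given staff count.

    All distributions and parameters are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    backlogs = []
    for _ in range(n_trials):
        # Uncertain inputs are sampled, not replaced by their averages.
        tickets = rng.gauss(500, 60)    # weekly ticket volume (assumed)
        per_person = rng.gauss(45, 8)   # tickets one person resolves per week (assumed)
        capacity = staff * max(per_person, 0.0)
        backlogs.append(max(tickets - capacity, 0.0))
    return backlogs


def summarize(backlogs):
    """Report the mean and the 95th percentile of the simulated backlog."""
    ordered = sorted(backlogs)
    p95 = ordered[int(0.95 * (len(ordered) - 1))]
    return {"mean": statistics.mean(ordered), "p95": p95}


# Compare staffing options using the whole distribution, not one number.
for staff in (10, 11, 12):
    print(staff, summarize(simulate_backlog(staff)))
```

Reporting a high percentile next to the mean shows how bad a week could plausibly get, which a single-number estimate would hide.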
Next two were detecting and eliminating bad results:
IRS wanted to detect fraud for a particular kind of refund. There were
plenty of fraud examples in this case but the fraud was so easy to
perpetrate that they were drowning in cases. They were finding 1 in 100
anyway but when they automated the detection they found 25 in every 100!
Service fraud detection at a consumer electronics firm: warranty fraud.
Got some tips from folks but not much else. Automated the decision to score
claims and focused the investigators on the top ones. Recovered $20M in 9
months!
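The two fraud cases rest on simple arithmetic and a ranking step, sketched below. The hit rates (1 in 100 before automation, 25 in 100 after) come from the IRS case above; the claim records, the scores, and the names `hit_rate_lift` and `top_claims` are hypothetical, and the scores stand in for whatever a real model would produce.

```python
from fractions import Fraction


def hit_rate_lift(found_before, found_after, per_cases=100):
    """Ratio of the automated hit rate to the baseline hit rate."""
    before = Fraction(found_before, per_cases)
    after = Fraction(found_after, per_cases)
    return after / before


def top_claims(scored_claims, capacity):
    """Return the `capacity` highest-scoring claims for investigators to review."""
    ranked = sorted(scored_claims, key=lambda claim: claim["score"], reverse=True)
    return ranked[:capacity]


# Figures from the IRS case: 1 fraud found per 100 cases, then 25 per 100.
print("lift:", hit_rate_lift(1, 25))

# Hypothetical warranty claims with made-up scores (higher = more suspicious).
claims = [
    {"id": "A", "score": 0.12},
    {"id": "B", "score": 0.91},
    {"id": "C", "score": 0.47},
    {"id": "D", "score": 0.78},
]
print([claim["id"] for claim in top_claims(claims, capacity=2)])
```

Ranking by score and cutting at investigator capacity is what turns a model into a decision: the same number of investigators look at far more promising cases.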
Final ones were mining for gold: finding the hidden good results.
WestWind Foundation: hedge fund strategy trying to manage trades based on
predictive models and market timing. Managed to do better over the year than
the market as a whole but still very volatile and not always better. Felt
like it could be luck but were able to develop a model of the model to see
how likely it was that this was real or just luck. In this case they found
there was almost no likelihood that this was random. Monitoring is critical.
Pharmacia and Upjohn had a drug they were about to abandon because it
did not seem useful. Were comparing it to a placebo and placebos work
(especially if the placebo has side-effects)! Analyzing the data there were,
for instance, a group of people who really got better on the placebo as well
as others who felt worse. The drug did much better but only a complex,
sophisticated visualization made this clear. The scientists were applying
the FDA test where the data miners just looked at the data.
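The WestWind question, whether a track record is real or just luck, can be approached with a permutation test. The sketch below is a generic version of that idea and not necessarily the "model of the model" the team built; the return series, the in/out signals, and the names `total_return` and `luck_p_value` are hypothetical. It shuffles the timing signals many times and counts how often random timing does at least as well as the strategy.

```python
import random


def total_return(signals, returns):
    """Sum of market returns on the days the strategy was invested (signal == 1)."""
    return sum(r for s, r in zip(signals, returns) if s)


def luck_p_value(signals, returns, n_shuffles=2000, seed=0):
    """Estimate how often randomly timed trades match or beat the strategy."""
    rng = random.Random(seed)
    observed = total_return(signals, returns)
    shuffled = list(signals)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        # Same number of invested days, but placed at random.
        rng.shuffle(shuffled)
        if total_return(shuffled, returns) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)


# Hypothetical data: simulated daily returns and an unrealistically good signal.
data_rng = random.Random(1)
returns = [data_rng.gauss(0.0005, 0.01) for _ in range(250)]
signals = [1 if r > -0.002 else 0 for r in returns]
print(luck_p_value(signals, returns))
```

A small value means random timing rarely matches the strategy, which is evidence against pure luck. It says nothing about whether the edge will persist, which is why monitoring stays critical.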
The bottom line for these projects is interesting. At HSBC they lost the
champion, and at Anheuser-Busch 9/11 happened, and the projects died. Lumidigm
found a solution with Disney and Peregrine got a solution that was a success.
The SSA project died with a change of management. The IRS and the consumer goods
fraud detection systems both worked. The market timing system lasted 8 years
before the market caught up and the edge disappeared.
So, lessons learned. You need:
- Potential gains: either leverage where an incremental improvement helps
or low hanging fruit that no-one has attempted yet (though the latter is
increasingly rare).
- Interdisciplinary team
- Data vigilance: capture and maintain the data you have
- Time for learning cycles
- A Business Champion!
QUESTIONS/TASKS
1. The above cases are divided into 3 categories: streamlining,
detecting/eliminating bad results, and finding the good (hidden) results.
Choose one case for each of these categories and hypothetically explain what
the company actually did for each of the CRISP-DM Phases. You may
make/invent your own assumptions on what the company actually did, as
long as your narrations/explanations illustrate each of the CRISP-DM phases
for the cases you chose.
2. Using one of the cases above, create a hypothetical database (one table,
similar to slides 12 and 13 of the second lecture's PowerPoint for Chapter
2, Data Mining Processes). Create your own attributes for your chosen case.
In the resulting table you created, are there any attributes that you can
remove from your attribute set? Identify the attributes you may remove and
explain why they are dispensable. If there are no attributes that may be
removed from your table, explain why this is the situation.
3. What other insights can you add to the 5 lessons that the author of this article
has enumerated? Explain your thoughts.
4. Search the Internet for another specific case where data mining is used for
the improvement of a business. Summarize this case and explain which of
the 3 categories (streamlining, detecting/eliminating bad results, and
finding the good (hidden) results) this case falls under.
5. Create a short PowerPoint slide presentation about number 4. Include in your
presentation the narration/explanation of hypothetical CRISP-DM steps that the
organization in your researched case might have taken. (This PowerPoint
presentation will not be submitted, but will be used for reporting in class)
