High ROI of Data Mining
INDIVIDUAL ASSIGNMENT:
GENERAL INSTRUCTIONS:
- Read the following article.
- Answer the questions/follow the tasks after the article in a separate MS
Word document. Please include your name in the Word document.
- The document must be saved in the following format:
o YourSurname_yourFirstNameHW01.doc
- Submit your document via agsb_bintelstd@yahoogroups.com in the folder
High ROI of Data Mining.
- DEADLINE: Session 3, before the 6:00 pm class (late submissions are accepted,
but deductions will be applied)
ARTICLE:
http://jtonedm.com/2009/02/19/the-high-roi-of-data-mining-for-innovative-organizations/
(JT on EDM: James Taylor on Everything Decision Management)
John Elder presented a collection of case studies to showcase the ROI of data
mining. John started by making the point that many of his case studies had
technical success but not business success, an interesting statistic. John sees
three major ways that predictive analytics can help: streamlining, eliminating
the bad, or discovering the good.
John and his team do a tremendous amount of work with data mining and predictive
analytic tools and know how well they work but also consider the human aspect
critical. After all, computers are both powerful and mindless and the human
aspect of putting them to work is key.
Gartner has hype cycles for products, and data mining, unlike artificial
intelligence, is on the plateau of productivity. The focus of data mining on
bottom-line activities is part of why it is already considered productive. In
addition, most corporate processes are already fairly well performed, so
small improvements (using data mining) really matter. Cases on streamlining or
automating decisions came first:
HSBC wanted to cross-sell products and used their historical data to find
out what might interest a customer next. They wanted to take a customer
contact that was a pure cost and make it a benefit by targeting inbound
calls or contacts at a branch. Used data mining and visualization to present
new ideas to people.
Anheuser-Busch wanted to see how their products are displayed in stores.
Knowing this helps them see what works and does not work and helps them
manage their products in a store. Used analytics to take an image and
automate the definition of a plan for the shelf. Easier than other visual
recognition because products and brands make it easy to spot what's what.
Got a 90% accuracy rate, dramatically improving the process.
Lumidigm is a biometrics company that uses how your skin reflects infrared
light to identify you. Originally wanted to use this to diagnose
disease but found that person-specific factors were overwhelming it. To use
the differences required analytics to predict how likely someone is who they
say they are. The fact that none of the models were 100% accurate did not
mean it could not be used: Disney uses it for tying people to their tickets,
for instance.
IRS wanted to detect fraud for a particular kind of refund. There were
plenty of fraud examples in this case but the fraud was so easy to
perpetrate that they were drowning in cases. They were finding 1 in 100
anyway but when they automated the detection they found 25 in every 100!
Service fraud detection at a consumer electronics firm (warranty fraud).
Got some tips from folks but not much else. Automated the decision to score
claims and focused the investigators on the top ones. Recovered $20M in 9
months!
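
The warranty-fraud case describes scoring claims and sending investigators to
the top ones, and the IRS case gives a before/after detection rate. The article
does not say how the scores were produced, so the sketch below is only an
illustration under stated assumptions: the Claim class, the score values, and
the function names are all hypothetical. It shows the ranking step and the
arithmetic behind the IRS improvement (1 in 100 versus 25 in 100).

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    score: float  # hypothetical model output: higher means more suspicious


def top_claims(claims, k):
    """Return the k highest-scoring claims, most suspicious first."""
    return sorted(claims, key=lambda c: c.score, reverse=True)[:k]


def lift(found_after, found_before):
    """Ratio of frauds found per 100 cases after vs. before automation."""
    return found_after / found_before


# Invented example data; not taken from the article.
claims = [
    Claim("A", 0.10),
    Claim("B", 0.92),
    Claim("C", 0.35),
    Claim("D", 0.78),
]

review_queue = top_claims(claims, k=2)
print([c.claim_id for c in review_queue])

# IRS figures from the article: 1 fraud per 100 cases before, 25 per 100 after.
print(lift(25, 1))
```

With the invented scores above, claims B and D go to the investigators first,
and the IRS figures work out to a 25-fold improvement in the hit rate.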
Final ones were mining for gold: finding the hidden good results.
The bottom line for these projects is interesting. At HSBC they lost the
champion and at Anheuser-Busch 9/11 happened, and the projects died. Lumidigm
found a solution with Disney and Peregrine got a solution that was a success.
The SSA project died with a change of management. The IRS and the consumer goods
fraud detection systems both worked. The market timing system lasted 8 years
before the market caught up and the edge disappeared. So, lessons learned. You
need:
QUESTIONS/TASKS
1. The above cases are divided into 3 categories: streamlining,
detecting/eliminating bad results, and finding the good (hidden) results.
Choose one case for each of these categories and hypothetically explain what
the company actually did for each of the CRISP-DM Phases. You may
make/invent your own assumptions on what the company actually did, as
long as your narrations/explanations illustrate each of the CRISP-DM phases
for the cases you chose.
2. Using one of the cases above, create a hypothetical database (one table,
similar to slides 12 and 13 of the second PowerPoint lecture for Chapter
2, Data Mining Processes). Create your own attributes for your chosen case.
In the resulting table you created, are there any attributes that you can
remove from your attribute set? Identify the attributes you may remove and
explain why they are dispensable. If there are no attributes that may be
removed from your table, explain why this is the situation.
3. What other insights can you add to the 5 lessons that the author of this article
has enumerated? Explain your thoughts.
4. Search the Internet for another specific case where data mining is used for
the improvement of a business. Summarize this case and explain which of the
3 categories (streamlining, detecting/eliminating bad results, and finding
the good (hidden) results) it falls under.
5. Create a short PowerPoint slide presentation about number 4. Include in your
presentation the narration/explanation of hypothetical CRISP-DM steps that the
organization in your researched case might have taken. (This PowerPoint
presentation will not be submitted, but will be used for reporting in class.)