CHAPTER 4
Predictive Analytics I: Data Mining Process, Methods, and Algorithms

LEARNING OBJECTIVES

• Define data mining as an enabling technology for business analytics
• Understand the objectives and benefits of data mining
• Become familiar with the wide range of applications of data mining
• Learn the standardized data mining processes
• Learn different methods and algorithms of data mining
• Build awareness of the existing data mining software tools
• Understand the privacy issues, pitfalls, and myths of data mining

Generally speaking, data mining is a way to develop intelligence (i.e., actionable information or knowledge) from data that an organization collects, organizes, and stores. A wide range of data mining techniques are being used by organizations to gain a better understanding of their customers and their operations and to solve complex organizational problems. In this chapter, we study data mining as an enabling technology for business analytics and predictive analytics, learn about the standard processes of conducting data mining projects, understand and build expertise in the use of major data mining techniques, develop awareness of the existing software tools, and explore privacy issues, common myths, and pitfalls that are often associated with data mining.

4.1 Opening Vignette: Miami-Dade Police Department Is Using Predictive Analytics to Foresee and Fight Crime
4.2 Data Mining Concepts and Applications
4.3 Data Mining Applications
4.4 Data Mining Process
4.5 Data Mining Methods
4.6 Data Mining Software Tools
4.7 Data Mining Privacy Issues, Myths, and Blunders

4.1 OPENING VIGNETTE: Miami-Dade Police Department Is Using Predictive Analytics to Foresee and Fight Crime

Predictive analytics and data mining have become an integral part of many law enforcement agencies, including the Miami-Dade Police Department, whose mission is not only to
protect the safety of Florida's largest county, with 2.5 million citizens (making it the seventh largest in the United States), but also to provide a safe and inviting climate for the millions of tourists who come from around the world to enjoy the county's natural beauty, warm climate, and stunning beaches. With tourists spending nearly US$20 billion every year and generating nearly a third of Florida's sales taxes, it's hard to overstate the importance of tourism to the region's economy. So although few of the county's police officers would likely list economic development in their job description, nearly all grasp the vital link between safe streets and the region's tourist-driven prosperity.
That connection is paramount for Lieutenant Arnold Palmer, currently supervising the Robbery Investigations Section and a former supervisor of the department's Robbery Intervention Detail. This specialized team of detectives is focused on intensely policing the county's robbery hot spots and worst repeat offenders. He and the team occupy modest offices on the second floor of a modern-looking concrete building, set back from a palm-lined street on the western edge of Miami. In his 10 years in the unit, out of 23 in total on the force, Palmer has seen a lot of changes, and not just in policing practices like the way his team used to mark street crime.

Policing with Less

Palmer and the team have also seen the impact of a growing population, shifting demographics, and a changing economy on the streets they patrol. Like any good police force, they've continually adapted their methods and practices to meet a policing challenge that has grown in scope and complexity.
But like nearly all branches of the county's government, intensifying budget pressures have placed the department in a squeeze between rising demands and shrinking resources.
Palmer, who sees detectives as front-line fighters against a rising tide of street crime and the looming prospect of ever-tightening resources, put it this way: "Our basic challenge was how to cut street crime even as tighter resources have reduced the number of cops on the street." Over the years, the team had been open to trying new tools, the most notable of which was a program called "analysis-driven enforcement" that used crime history data as the basis for positioning teams of detectives. "We've evolved a lot since then in our ability to predict where robberies are likely to occur, both through the use of analysis and our own collective experience."

New Thinking on Cold Cases

The more confounding challenge for Palmer and his team of investigators, one shared with the police of all major urban areas, is in closing the hardest cases, where leads, witnesses, video, or any facts or evidence that can help solve a case are lacking. It's not surprising, explains Palmer, because "the standard practices we used to generate leads, like talking to informants or to the community or to patrol officers, haven't changed much, if at all. That kind of an approach works okay, but it relies a lot on the experience our detectives carry in their heads. When the detectives retire or move on, that experience goes with them."
Palmer's conundrum was that turnover, due to the retirement of many of his most experienced detectives, was on an upward trend. True, he saw the infusion of young
But as Palmer recounts, the problem came when the handful of new detectives coming into the unit turned to look for guidance from the senior officers “and it's just not there. We knew at that point we needed a different way to fill the experi cence gap going forwar His ad hoe efforts to come up with a solution led to blue-sky speculation. What if new detectives on the squad could pose the same questions to a computer database as they would to a veteran detective? That speculation planted a seed in Palmer's mind that wouldn't go away. The Big Picture Starts Small What was taking shape within the robbery unit demonstrated how big ideas can come from small places. But more important, it showed that for these ideas to reach fruition, the “right” conditions need to be in alignment at the right time. On a leadership level, that means a driving figure in the organization who knows what it takes to nurture top-down support as well as crucial bottom-up buy-in from the ranks, while at the same time keep: ing the department's information technology (IT) personnel on the same page. That son was Palmer. At the organizational level, the robbery unit served as a particularly good launching point for lead modeling because of the prevalence of repeat offenders among perpetrators. Ulimately, the department's ability to unleash the broader transformative potential of lead modeling would hinge in large part on the team’s ability to deliver results ona smaller scale carly tests and demos proved encouraging —with the modi the details of solved cases were fed into it—the team started gaining alten- tion. The initiative received a critical boost when the robbery bureau's unit major and captain voiced their support for the direction of the project, telling Palmer that “if you can make this work, run with it” But more important than the encouragement, Pa in cir willingness to advocate for the project among the department's higher- ps. 
"I can't get it off the ground if the brass doesn't buy in," says Palmer. "So their support was crucial."

Success Brings Credibility

Having been appointed the official liaison between IT and the robbery unit, Palmer set out to strengthen the case for the lead-modeling tool—now officially called Blue PALMS, for Predictive Analytics Lead Modeling Software—by building up a series of successes. His constituency was not only the department brass, but also the detectives whose support would be critical to its successful adoption as a robbery-solving tool. In his attempts to introduce Blue PALMS, resistance was predictably stronger among veteran detectives, who saw no reason to give up their long-standing practices. Palmer knew that dictates or coercion wouldn't win their hearts and minds. He would need to build a beachhead of credibility.
Palmer found that opportunity in one of his best and most experienced detectives. Early in a robbery investigation, the detective indicated to Palmer that he had a strong hunch who the perpetrator was and wanted, in essence, to test the Blue PALMS system. So at the detective's request, the department analyst fed key details of the crime into the system, including the modus operandi, or MO. The system's statistical models compared these details to a database of historical data, looking for important correlations and similarities in the crime's signature. The report that came out of the process included a list of 20 suspects ranked in order of match strength, or likelihood. When the analyst handed the detective the report, his "hunch" suspect was listed in the top five. Soon after his arrest, he confessed, and Palmer had gained a solid convert.
Though it was a useful exercise, Palmer realized that the true test wasn't in confirming hunches but in breaking cases that had come to a dead end.
Such was the situation in a carjacking that had, in Palmer's words, "no witnesses, no video, and no crime scene—nothing to go on." When the senior detective on the stalled case went on leave after three months, the junior detective to whom it was assigned requested a Blue PALMS report. Shown photographs of the top people on the suspect list, the victim made a positive identification of the suspect, leading to the successful conclusion of the case. That suspect was number one on the list.

Just the Facts

The success that Blue PALMS continues to build has been a major factor in Palmer's success in getting his detectives on board. But if there's a part of his message that resonates even more with his detectives, it's the fact that Blue PALMS is designed not to change the basics of policing practices, but to enhance them by giving detectives a second chance at cracking the case. "Police work is at the core about human relations—about talking to witnesses, to victims, to the community—and we're not out to change that," says Palmer. "Our aim is to give investigators factual insights from information we already have that might make a difference, so even if we're successful 5% of the time, we're going to take a lot of offenders off the street."
The growing list of cold cases solved has helped Palmer in his efforts to reinforce the merits of Blue PALMS. But, in showing where his loyalty lies, he sees the detectives who've closed these cold cases—not the program—as most deserving of the spotlight, and that approach has gone over well. At his chief's request, Palmer is beginning to use his liaison role as a platform for reaching out to other areas in the Miami-Dade Police Department.

Safer Streets for a Smarter City

When he speaks of the impact of tourism, a thread that runs through Miami-Dade's Smarter Cities vision, Palmer sees Blue PALMS as an important tool to protect one of the county's greatest assets.
"The threat to tourism posed by rising street crime was a big reason the unit was established," says Palmer. "The fact that we're able to use analytics and intelligence to help us close more cases and keep more criminals off the street is good news for our citizens and our tourist industry."

QUESTIONS FOR THE OPENING VIGNETTE

1. Why do law enforcement agencies and departments like the Miami-Dade Police Department embrace advanced analytics and data mining?
2. What are the top challenges for law enforcement agencies and departments like the Miami-Dade Police Department? Can you think of other challenges (not mentioned in this case) that can benefit from data mining?
3. What are the sources of data that law enforcement agencies and departments like the Miami-Dade Police Department use for their predictive modeling and data mining projects?
4. What type of analytics do law enforcement agencies and departments like the Miami-Dade Police Department use to fight crime?
5. What does "the big picture starts small" mean in this case? Explain.

What We Can Learn from This Vignette

Law enforcement agencies and departments are under tremendous pressure to carry out their mission of safeguarding people with limited resources. The environment within which they perform their duties is becoming increasingly challenging, so they have to constantly adapt and perhaps stay a few steps ahead to reduce the likelihood of catastrophes. Understanding the changing nature of crime and criminals is an ongoing challenge. In the midst of these challenges, what works in favor of these agencies is the availability of the data and analytics technologies to better analyze past occurrences and to foresee future events. More data is available now than ever before.
Applying advanced analytics and data mining tools (i.e., knowledge discovery techniques) to these large and rich data sources provides them with the insight that they need to better prepare for and act on their duties. Therefore, law enforcement agencies are becoming one of the leading users of the new face of analytics. Data mining is a prime candidate for better understanding and management of these mission-critical tasks with a high level of accuracy and timeliness. The study described in the opening vignette clearly illustrates the power of analytics and data mining to create a holistic view of the world of crime and criminals for better and faster reaction and management. In this chapter, you will see a wide variety of data mining applications solving complex problems in a variety of industries and organizational settings where the data is used to discover actionable insight to improve mission readiness, operational efficiency, and competitive advantage.

Sources: Miami-Dade Police Department: Predictive modeling pinpoints likely suspects based on common crime signatures of previous crimes, IBM Customer Case Studies; Law Enforcement Analytics: Intelligence-Led and Predictive Policing, Information Builders.

4.2 Data Mining Concepts and Applications

Data mining, a new and exciting technology of only a few years ago, has become a common practice for a vast majority of organizations. In an interview with Computerworld magazine in January 1999, Dr. Arno Penzias (Nobel laureate and former chief scientist of Bell Labs) identified data mining from organizational databases as a key application for corporations of the near future.
In response to Computerworld's age-old question of "What will be the killer applications in the corporation?" Dr. Penzias replied: "Data mining." He then added, "Data mining will become much more important, and companies will throw away nothing about their customers because it will be so valuable. If you're not doing this, you're out of business." Similarly, in an article in Harvard Business Review, Thomas Davenport (2006) argued that the latest strategic weapon for companies is analytical decision making, providing examples of companies such as Amazon.com, Capital One, Marriott International, and others that have used analytics to better understand their customers and optimize their extended supply chains to maximize their returns on investment while providing the best customer service. This level of success is highly dependent on a company understanding its customers, vendors, business processes, and the extended supply chain very well.
A large portion of "understanding the customer" can come from analyzing the vast amount of data that a company collects. The cost of storing and processing data has decreased dramatically in the recent past, and, as a result, the amount of data stored in electronic form has grown at an explosive rate. With the creation of large databases, the possibility of analyzing the data stored in them has emerged. The term data mining was originally used to describe the process through which previously unknown patterns in data were discovered. This definition has since been stretched beyond those limits by some software vendors to include most forms of data analysis in order to increase sales with the popularity of the data mining label. In this chapter, we accept the original definition of data mining.
Although the term data mining is relatively new, the ideas behind it are not.
Many of the techniques used in data mining have their roots in traditional statistical analysis and artificial intelligence work done since the early part of the 1980s. Why, then, has it suddenly gained the attention of the business world? Following are some of the most pronounced reasons:

• More intense competition at the global scale driven by customers' ever-changing needs and wants in an increasingly saturated marketplace.
• General recognition of the untapped value hidden in large data sources.
• Consolidation and integration of database records, which enables a single view of customers, vendors, transactions, and so on.
• Consolidation of databases and other data repositories into a single location in the form of a data warehouse.
• The exponential increase in data processing and storage technologies.
• Significant reduction in the cost of hardware and software for data storage and processing.
• Movement toward the demassification (conversion of information resources into nonphysical form) of business practices.

Data generated by the Internet is increasing rapidly in both volume and complexity. Large amounts of genomic data are being generated and accumulated all over the world. Disciplines such as astronomy and nuclear physics create huge quantities of data on a regular basis. Medical and pharmaceutical researchers constantly generate and store data that can then be used in data mining applications to identify better ways to accurately diagnose and treat illnesses and to discover new and improved drugs.
On the commercial side, perhaps the most common use of data mining has been in the finance, retail, and healthcare sectors. Data mining is used to detect and reduce fraudulent activities, especially in insurance claims and credit card use (Chan et al., 1999); to identify customer buying patterns (Hoffman, 1999); to reclaim profitable customers (Hoffman, 1998); to identify trading rules from historical data; and to aid in increased profitability using market-basket analysis.
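The market-basket analysis just mentioned can be sketched in a few lines. The following toy example (the baskets, items, and thresholds are hypothetical, not drawn from any study cited in this chapter) counts how often pairs of items co-occur across transactions and derives simple association rules with their support (how often the pair appears) and confidence (how often the consequent accompanies the antecedent):

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each basket is the set of items bought together.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"diapers", "milk"},
]

def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of all baskets containing both items
        if support >= min_support:
            rules.append((a, b, support, count / item_counts[a]))  # rule a -> b
            rules.append((b, a, support, count / item_counts[b]))  # rule b -> a
    return rules

for a, b, support, confidence in pair_rules(baskets):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data the classic beer-and-diapers pairing surfaces with support 0.40 (2 of 5 baskets) and confidence 0.67, which is exactly the kind of co-occurrence signal a retailer would act on.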
Data mining is already widely used to better target clients, and with the widespread development of e-commerce, this can only become more imperative with time. See Application Case 4.1 for information on how Visa has used predictive analytics and data mining to improve customer service, combat fraud, and increase profit.

Application Case 4.1: Visa Is Enhancing the Customer Experience While Reducing Fraud

When card issuers first started using automated business rules software to counter debit and credit card fraud, the limits on that technology were quickly evident: Customers reported frustrating payment rejections on dream vacations or critical business trips. Visa works with its clients to improve customer experience by providing cutting-edge fraud risk tools and consulting services that make its strategies more effective. Through this approach, Visa enhances customer experience and minimizes invalid transaction declines.
The company's global network connects thousands of financial institutions with millions of merchants and cardholders every day. It has been a pioneer in cashless payments for more than 50 years. By using
Visa esti- mates that Big Data analytics works; state-of-the-art models and scoring systems have the potential to prevent an incremental $2 billion of fraudulent pay- ‘ment volume annually. A globally recognized name, Visa facilitates electronic funds transfer through branded products that are issued by its thousands of financial institu tion partners. The company processed 64.9 billion transactions in 2014, and $47 trillion in purchases were made with a Visa card in that same year. It has the computing capability to process 56,000 transaction messages per second, which is greater than four times the actual peak transaction’ rate to date. Visa doesn't just process and compute— it is continually using analytics to share strategic and. operational insights with its partner financial institu- tions and assist them in improving performance. This business goal is supported by a robust data manage- ment system. Visa also assists its clients in improv- ing performance by developing and delivering deep analytical insight. “We understand patterns of behavior by per- forming clustering and segmentation at a granular level, and we provide this insight to our financial institution partners,” says Falkenborg, “It’s an effec tive way to help our clients communicate better and. deepen their understanding of the customer, ‘As an example of marketing support, Visa has assisted clients globally in identifying segments of customers that should be offered a different Visa product. “Understanding the customer lifecycle is incredibly important, and Visa provides information, to clients that help them take action and offer the right product to the right customer before a value proposition becomes stale,” says Falkenborg, + Predictive Analytics I Data Mining Process, Methods, and Algorithms 21 How Can Using In-Memory Analytics Make a Difference? 
In a recent proof-of-concept, Visa used a high-performance solution from SAS that relies on in-memory computing to power statistical and machine-learning algorithms and then present the information visually. In-memory analytics reduces the need to move data and allows more model iterations, making analysis much faster and more accurate. Falkenborg describes the solution as like having the information memorized, versus having to get up and go to a filing cabinet to retrieve it: "In-memory analytics is just taking your brain and making it bigger. Everything is instantly accessible."
Ultimately, solid analytics helps the company do more than just process payments. "We can deepen the client conversation and serve our clients even better with our incredible big data set and expertise in mining transaction data," says Falkenborg. "We use our consulting and analytics capabilities to assist our clients in tackling business challenges and protect the payment ecosystem. And that's what we do with high-performance analytics."
"The challenge that we have, as with any company managing and using massive data sets, is how we use all necessary information to solve a business challenge—whether that is improving our fraud models, or assisting a client to more effectively communicate with its customers," elaborates Falkenborg. "In-memory analytics enables us to be more nimble; with a 100x analytical system processing speed improvement, our data and decision scientists can iterate much faster."
Fast and accurate predictive analytics allows Visa to better serve clients with tailored consulting services, helping them succeed in today's fast-changing payments industry.

Questions for Discussion

1. What challenges were Visa and the rest of the credit card industry facing?
2. How did Visa improve customer service while also improving detection of fraud?
3. What is in-memory analytics, and why was it necessary?
Source: "Enhancing the customer experience while reducing fraud (SAS® Analytics): High-performance analytics empowers Visa to enhance customer experience while reducing debit and credit card fraud." Copyright © 2016 SAS Institute Inc., Cary, NC, USA. Reprinted with permission. All rights reserved.

Definitions, Characteristics, and Benefits

Simply defined, data mining is a term used to describe discovering or "mining" knowledge from large amounts of data. When considered by analogy, one can easily realize that the term data mining is a misnomer; that is, mining of gold from within rocks or dirt is referred to as "gold" mining rather than "rock" or "dirt" mining. Therefore, data mining perhaps should have been named "knowledge mining" or "knowledge discovery." Despite the mismatch between the term and its meaning, data mining has become the choice of the community. Many other names that are associated with data mining include knowledge extraction, pattern analysis, data archaeology, information harvesting, pattern searching, and data dredging.
Technically speaking, data mining is a process that uses statistical, mathematical, and artificial intelligence techniques to extract and identify useful information and subsequent knowledge (or patterns) from large sets of data. These patterns can be in the form of business rules, affinities, correlations, trends, or prediction models (see Nemati and Barko, 2001). Most literature defines data mining as "the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases," where the data are organized in records structured by categorical, ordinal, and continuous variables (Fayyad et al., 1996, pp. 40-41). In this definition, the meanings of the key terms are as follows:

• Process implies that data mining comprises many iterative steps.
• Nontrivial means that some experimentation-type search or inference is involved; that is, it is not as straightforward as a computation of predefined quantities.
• Valid means that the discovered patterns should hold true on new data with a sufficient degree of certainty.
• Novel means that the patterns are not previously known to the user within the context of the system being analyzed.
• Potentially useful means that the discovered patterns should lead to some benefit to the user or task.
• Ultimately understandable means that the pattern should make business sense that leads to the user saying, "Mmm! It makes sense; why didn't I think of that," if not immediately, at least after some postprocessing.

Data mining is not a new discipline, but rather a new definition for the use of many disciplines. Data mining is tightly positioned at the intersection of many disciplines, including statistics, artificial intelligence, machine learning, management science, information systems (IS), and databases (see Figure 4.1). Using advances in all of these disciplines, data mining strives to make progress in extracting useful information and knowledge from large databases. It is an emerging field that has attracted much attention in a very short time.
The following are the major characteristics and objectives of data mining:

• Data are often buried deep within very large databases, which sometimes contain data from several years. In many cases, the data are cleansed and consolidated into a data warehouse. Data may be presented in a variety of formats (see Chapter 2 for a brief taxonomy of data).
• The data mining environment is usually a client/server architecture or a Web-based IS architecture.
• Sophisticated new tools, including advanced visualization tools, help to remove the information ore buried in corporate files or archival public records. Finding it involves massaging and synchronizing the data to get the right results.
• Cutting-edge data miners are also exploring the usefulness of soft data (i.e., unstructured text stored in such places as Lotus Notes databases, text files on the Internet, or enterprise-wide intranets).

FIGURE 4.1 Data Mining Is a Blend of Multiple Disciplines (statistics, artificial intelligence, machine learning and pattern recognition, management science, information systems, databases, and visualization).

• The miner is often an end user, empowered by data drills and other powerful query tools to ask ad hoc questions and obtain answers quickly, with little or no programming skill. Striking it rich often involves finding an unexpected result and requires end users to think creatively throughout the process, including the interpretation of the findings.
• Data mining tools are readily combined with spreadsheets and other software development tools. Thus, the mined data can be analyzed and deployed quickly and easily.
• Because of the large amounts of data and massive search efforts, it is sometimes necessary to use parallel processing for data mining.

A company that effectively leverages data mining tools and technologies can acquire and maintain a strategic competitive advantage. Data mining offers organizations an indispensable decision-enhancing environment to exploit new opportunities by transforming data into a strategic weapon. See Nemati and Barko (2001) for a more detailed discussion on the strategic benefits of data mining.

How Data Mining Works

Using existing and relevant data obtained from within and outside the organization, data mining builds models to discover patterns among the attributes presented in the data set. Models are the mathematical representations (simple linear relationships/affinities and/or complex and highly nonlinear relationships) that identify the patterns among the attributes of the things (e.g., customers, events) described within the data set.
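To make the notion of a "model" concrete, here is a minimal sketch of discovering a simple linear relationship between two attributes by ordinary least squares. The tenure and spend figures are made-up illustration data, not from the chapter; real data mining tools fit far more complex, often nonlinear, variants of the same idea.

```python
# A "model" as a mathematical representation of a pattern:
# fit spend ≈ slope * tenure + intercept by ordinary least squares,
# using only the standard library.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Hypothetical attributes: customer tenure (years) and annual spend (dollars).
tenure = [1, 2, 3, 4, 5]
spend = [120, 190, 260, 330, 400]

a, b = fit_line(tenure, spend)
print(f"spend = {a:.1f} * tenure + {b:.1f}")  # the discovered pattern
```

The fitted slope (about 70 here) is the discovered pattern; reading the coefficient is the explanatory use of the model, while plugging a new customer's tenure into the equation is the predictive use.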
Some of these patterns are explanatory (explaining the interrelationships and affinities among the attributes), whereas others are predictive (foretelling future values of certain attributes). In general, data mining seeks to identify four major types of patterns:

1. Associations find the commonly co-occurring groupings of things, such as beer and diapers going together in market-basket analysis.
2. Predictions tell the nature of future occurrences of certain events based on what has happened in the past, such as predicting the winner of the Super Bowl or forecasting the absolute temperature of a particular day.

Application Case 4.2: Dell Is Staying Agile and Effective with Analytics in the 21st Century

The digital revolution is changing how people shop. Studies show that even commercial customers spend more of their buyer journey researching solutions online before they engage a vendor. To compete, companies like Dell are transforming sales and marketing models to support these new requirements. However, doing so effectively requires a Big Data solution that can analyze corporate databases along with unstructured information from sources such as clickstreams and social media.
Dell has evolved into a technology leader by using efficient, data-driven processes. For decades, employees could get measurable results by using enterprise applications to support insight and facilitate processes such as customer relationship management (CRM), sales, and accounting.
When Dell recognized that customers were spending dramatically more time researching products online before contacting a sales representative, it wanted to update marketing models accordingly so that it could deliver the new types of personalized services and the support that customers expected. To make such changes, however, marketing employees needed more data about customers' online behavior. Staff also needed an easier way to condense insight from numerous business intelligence (BI) tools and data sources. Drew Miller, Executive Director, Marketing, Analytics and Insights at Dell, says, "There are petabytes of available information about customers' online and offline shopping habits. We just needed to give marketing employees an easy-to-use solution that could assimilate all of it, pinpoint patterns, and make recommendations about marketing spend and activities."

Setting Up an Agile Team to Boost Return on Investment (ROI) with BI and Analytics

To improve its global BI and analytics strategy and communications, Dell established an IT task force. Executives created a flexible governance model for the team so that it can rapidly respond to employees' evolving BI and analytics requirements and deliver rapid ROI. For example, in addition to having the freedom to collaborate with internal business groups, the task force is empowered to modify business and IT processes using agile and innovative strategies. The team must dedicate more than 50% of its efforts to identifying and implementing quick-win BI and analytics projects that are typically too small for the "A" priority list of Dell's IT department. And the team must also spend at least 30% of its time evangelizing within internal business groups to raise awareness about BI's transformative capabilities, as well as opportunities for collaboration. One of the task force's first projects was a new BI and analytics solution called the Marketing Analytics Workbench.
Its initial application was focused on a select set of use cases around online and offline commercial customer engagements. This effort was cofunded by Dell's IT and marketing organizations. "There was a desire to expand the usage of this solution to support many more sales and marketing activities as soon as possible. However, we knew we could build a more effective solution if we scaled it out via iterative quick sprint efforts," says Fadi Taffal, Director, Enterprise IT at Dell.

One Massive Data Mart Facilitates a Single Source of Truth

Working closely with marketing, task force engineers use lean software development strategies and numerous technologies to create a highly scalable data mart. The overall solution utilizes multiple technologies and tools to enable different types of data storage, manipulation, and automation activities. For example, engineers store unstructured data from digital/social media sources on servers running Apache Hadoop. They use the Teradata Aster platform to then integrate and explore large amounts of customer data from other sources in near real time. For various data transformation and automation needs, the solution includes the use of Dell's Toad software suite, specifically Toad Data Point and Toad Intelligence Central, and Dell Statistica. Toad Data Point provides a business-friendly interface for data manipulation and automation, filling a critical gap in the ecosystem. For advanced analytical models, the system uses Dell Statistica, which provides data preparation, predictive analytics, data mining and machine learning, statistics, text analytics, visualization and reporting, and model deployment and monitoring. Engineers also utilize this solution to develop analytical models that can sift through all of the disparate data and provide an accurate picture of customers' shopping behavior.
Tools provide suggestions for improving service, as well as ROI metrics for multivehicle strategies that include Web site marketing, phone calls, and site visits. Within several months, employees were using the initial Marketing Analytics Workbench. The task force plans to expand the solution's capabilities so it can analyze data from more sources, provide additional visualizations, and measure the returns of other channel activities such as tweets, texts, e-mail messages, and social media posts.

Saves More Than $2.5 Million in Operational Costs

With its new solution, Dell has already eliminated several third-party BI applications. "Although we're just in the initial phases of rolling out our Marketing Analytics Workbench, we've saved approximately $2.5 million in vendor outsourcing costs," says Chaitanya Laxminarayana, Marketing Program Manager at Dell. "Plus, employees gain faster and more detailed insights." As Dell scales the Marketing Analytics Workbench, it will phase out additional third-party BI applications, further reducing costs and boosting efficiency.

Facilitates $5.3 Million in Revenue

Marketing employees now have the insight they need to identify emerging trends in customer engagements, and to update models accordingly. "We've already realized $5.3 million in incremental revenue by initiating more personalized marketing programs and uncovering new opportunities with our big data Marketing Analytics Workbench," says Laxman Srigiri, Director, Marketing Analytics at Dell. "Additionally, we have programs on track to scale this impact many times over in the next three years." For example, employees can now see a timeline of a customer's online and offline interactions with Dell, including purchases, the specific Dell Web site pages the customer visited, and the files they downloaded.
Plus, employees receive database suggestions for when and how to contact a customer, as well as the URLs of specific pages they should read to learn more about the technologies a customer is researching. Srigiri says, "It was imperative that we understand changing requirements so we could stay agile. Now that we have that insight, we can quickly develop more effective marketing models that deliver the personalized information and support customers expect."

Questions for Discussion

1. What was the challenge Dell was facing that led to their analytics journey?
2. What solution did Dell develop and implement? What were the results?
3. As an analytics company itself, Dell has used its service offerings for its own business. Do you think it is easier or harder for a company to taste its own medicine? Explain.

Source: Dell. Staying agile and effective in the 21st century. Dell Case Study, software.dell.com/casestudy/dell-staying-agile-and-effective-in-the-21st-century881389. Used by permission from Dell.

3. Clusters identify natural groupings of things based on their known characteristics, such as assigning customers in different segments based on their demographics and past purchase behaviors.
4. Sequential relationships discover time-ordered events, such as predicting that an existing banking customer who already has a checking account will open a savings account followed by an investment account within a year.

These types of patterns have been manually extracted from data by humans for centuries, but the increasing volume of data in modern times has created a need for more automatic approaches. As data sets have grown in size and complexity, direct manual data analysis has increasingly been augmented with indirect, automatic data processing tools that use sophisticated methodologies, methods, and algorithms. The manifestation of such
evolution of automated and semiautomated means of processing large data sets is now commonly referred to as data mining.

Generally speaking, data mining tasks can be classified into three main categories: prediction, association, and clustering. Based on the way in which the patterns are extracted from the historical data, the learning algorithms of data mining methods can be classified as either supervised or unsupervised. With supervised learning algorithms, the training data includes both the descriptive attributes (i.e., independent variables or decision variables) as well as the class attribute (i.e., output variable or result variable). In contrast, with unsupervised learning the training data includes only the descriptive attributes. Figure 4.2 shows a simple taxonomy for data mining tasks, along with the learning methods and popular algorithms for each of the data mining tasks.

PREDICTION Prediction is commonly referred to as the act of telling about the future. It differs from simple guessing by taking into account the experiences, opinions, and other relevant information in conducting the task of foretelling. A term that is commonly associated with prediction is forecasting. Even though many believe that these two terms are synonymous, there is a subtle but critical difference between the two. Whereas prediction is largely experience and opinion based, forecasting is data and model based. That is, in order of increasing reliability, one might list the relevant terms as guessing, predicting, and forecasting, respectively.
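The supervised/unsupervised distinction described above comes down to whether each training record carries a class attribute alongside its descriptive attributes. A minimal sketch of the two data layouts, using hypothetical customer records in plain Python (the attribute names and values are illustrative, not from the chapter):

```python
# Supervised learning: each record has descriptive attributes AND a class
# attribute. Here the (hypothetical) class attribute is whether the
# customer churned ("yes"/"no").
supervised_data = [
    # (age, monthly_spend) -> churned?
    ((34, 120.0), "no"),
    ((51, 15.5), "yes"),
    ((28, 80.0), "no"),
]

# Unsupervised learning: the same records, but with no class attribute.
# An algorithm such as k-means must find structure on its own.
unsupervised_data = [features for features, _label in supervised_data]

print(unsupervised_data)  # [(34, 120.0), (51, 15.5), (28, 80.0)]
```

A classification algorithm would learn a mapping from the descriptive attributes to the class attribute; a clustering algorithm sees only the first list.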
In data mining terminology, prediction and forecasting are used synonymously, and the term prediction is used as the common representation of the act. Depending on the nature of what is being predicted, prediction can be named more specifically as classification (where the predicted thing, such as tomorrow's forecast, is a class label such as "rainy" or "sunny") or regression (where the predicted thing, such as tomorrow's temperature, is a real number, such as "65°F").

CLASSIFICATION Classification, or supervised induction, is perhaps the most common of all data mining tasks. The objective of classification is to analyze the historical data stored in a database and automatically generate a model that can predict future behavior. This induced model consists of generalizations over the records of a training data set, which help distinguish predefined classes. The hope is that the model can then be used to predict the classes of other unclassified records and, more important, to accurately predict actual future events.

Common classification tools include neural networks and decision trees (from machine learning), logistic regression and discriminant analysis (from traditional statistics), and emerging tools such as rough sets, support vector machines (SVMs), and genetic algorithms. Statistics-based classification techniques (e.g., logistic regression and discriminant analysis) have received their share of criticism (that they make unrealistic assumptions about the data, such as independence and normality), which limits their use in classification-type data mining projects.

FIGURE 4.2 A Simple Taxonomy for Data Mining Tasks, Methods, and Algorithms.
- Prediction (supervised learning): Classification (decision trees, neural networks, support vector machines, kNN, naive Bayes, GA); Regression (linear/nonlinear regression, ANN, regression trees, SVM, kNN, GA); Time series (autoregressive methods, averaging methods, exponential smoothing, ARIMA)
- Association (unsupervised learning): Market-basket analysis (Apriori, OneR, ZeroR, Eclat, GA); Link analysis (expectation maximization, Apriori algorithm, graph-based matching); Sequence analysis (Apriori algorithm, FP-Growth, graph-based matching)
- Clustering (unsupervised learning): Cluster analysis (k-means, expectation maximization); Outlier analysis (k-means, expectation maximization)

Neural networks involve the development of mathematical structures (somewhat resembling the biological neural networks in the human brain) that have the capability to learn from past experiences presented in the form of well-structured data sets. They tend to be more effective when the number of variables involved is rather large and the relationships among them are complex and imprecise. Neural networks have disadvantages as well as advantages. For example, it is usually very difficult to provide a good rationale for the predictions made by a neural network. Also, neural networks tend to need considerable training. Unfortunately, the time needed for training tends to increase exponentially as the volume of data increases, and in general, neural networks cannot be trained on very large databases. These and other factors have limited the applicability of neural networks in data-rich domains.

Decision trees classify data into a finite number of classes based on the values of the input variables. Decision trees are essentially a hierarchy of if-then statements and are thus significantly faster than neural networks. They are most appropriate for categorical and interval data.
Therefore, incorporating continuous variables into a decision tree framework requires discretization; that is, converting continuous-valued numerical variables to ranges and categories.

A related category of classification tools is rule induction. Unlike with a decision tree, with rule induction the if-then statements are induced from the training data directly, and they need not be hierarchical in nature. Other, more recent techniques such as SVM, rough sets, and genetic algorithms are gradually finding their way into the arsenal of classification algorithms.

CLUSTERING Clustering partitions a collection of things (e.g., objects, events presented in a structured data set) into segments (or natural groupings) whose members share similar characteristics. Unlike in classification, in clustering the class labels are unknown. As the selected algorithm goes through the data set, identifying the commonalities of things based on their characteristics, the clusters are established. Because the clusters are determined using a heuristic-type algorithm, and because different algorithms may end up with different sets of clusters for the same data set, before the results of clustering techniques are put to actual use it may be necessary for an expert to interpret, and potentially modify, the suggested clusters. After reasonable clusters have been identified, they can be used to classify and interpret new data.

Not surprisingly, clustering techniques include optimization. The goal of clustering is to create groups so that the members within each group have maximum similarity and the members across groups have minimum similarity. The most commonly used clustering techniques include k-means (from statistics) and self-organizing maps (from machine learning), which is a unique neural network architecture developed by Kohonen (1982).

Firms often effectively use their data mining systems to perform market segmentation with cluster analysis.
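The k-means optimization goal just described, maximum similarity within a group and minimum similarity across groups, can be sketched in a few lines. This is a bare-bones one-dimensional version on hypothetical customer-spend values, not a production implementation (real k-means works in many dimensions and typically restarts from several random initial centroids):

```python
# Bare-bones k-means on one-dimensional data: repeatedly assign each value
# to its nearest centroid, then move each centroid to the mean of its members.

def kmeans_1d(values, centroids, iterations=10):
    for _ in range(iterations):
        clusters = {c: [] for c in centroids}
        for v in values:
            nearest = min(centroids, key=lambda c: abs(v - c))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its assigned members,
        # dropping any centroid that attracted no members.
        centroids = [sum(members) / len(members)
                     for members in clusters.values() if members]
    return sorted(centroids)

spend = [10, 12, 11, 95, 102, 98]  # two obvious natural groupings
centers = kmeans_1d(spend, centroids=[0, 50])
print(centers)  # two cluster centers, near 11 and 98.3
```

Even from poorly placed starting centroids (0 and 50), the iteration converges to the two natural groupings, which is the sense in which clustering "discovers" segments without class labels.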
Cluster analysis is a means of identifying classes of items so that items in a cluster have more in common with each other than with items in other clusters. It can be used in segmenting customers and directing appropriate marketing products to the segments at the right time in the right format at the right price. Cluster analysis is also used to identify natural groupings of events or objects so that a common set of characteristics of these groups can be identified to describe them.

ASSOCIATIONS Associations, or association rule learning in data mining, is a popular and well-researched technique for discovering interesting relationships among variables in large databases. Thanks to automated data-gathering technologies such as bar code scanners, the use of association rules for discovering regularities among products in large-scale transactions recorded by point-of-sale systems in supermarkets has become a common knowledge discovery task in the retail industry. In the context of the retail industry, association rule mining is often called market-basket analysis.

Two commonly used derivatives of association rule mining are link analysis and sequence mining. With link analysis, the linkage among many objects of interest is discovered automatically, such as the link between Web pages and referential relationships among groups of academic publication authors. With sequence mining, relationships are examined in terms of their order of occurrence to identify associations over time. Algorithms used in association rule mining include the popular Apriori (where frequent itemsets are identified) and FP-Growth, OneR, ZeroR, and Eclat.
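At the heart of market-basket analysis are two counts over the transaction database: how often an itemset appears (support) and how often a consequent appears given its antecedent (confidence). A minimal sketch with hypothetical transactions, computed by brute force rather than with the candidate-pruning strategy that makes Apriori efficient:

```python
# Hypothetical point-of-sale transactions; each basket is a set of items.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent, i.e. the strength of the rule antecedent -> consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

print(support({"beer", "diapers"}))       # 0.5  (2 of 4 baskets)
print(confidence({"beer"}, {"diapers"}))  # ~0.667 (2 of the 3 beer baskets)
```

Apriori's contribution is avoiding this brute-force scan: it only counts an itemset if all of its subsets already met the minimum support threshold.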
