(This is the Data Mining Dynamics column for the June issue. It is a holdover from May. It can go anywhere in the main issue. Pick up the standing head and logo from page 12 of the December 2005 issue.)

(Main Head)
The future of data mining: Part 3

(Deck)
This is the third and final article of this series. The first article, appearing on page 22 of the October
2005 issue, presented some general observations and commentary on what has happened. The
second article, appearing on page 12 of the December 2005 issue, examined data mining from the
perspective of organizational change, people and software.

For this article, we conducted a series of interviews with three well-respected practitioners within the
industry. As with most interviews conducted with a group of people, there are both common and
unique perspectives on various data mining themes. The common
perspectives and opinions expressed by the interviewees will always need to be considered as
data mining evolves.
However, those divergent opinions and perspectives, which are not totally shared by all
interviewees, perhaps represent the real areas of focus as data mining moves forward. It is a
continuing journey. Enjoy the ride.
The experts interviewed were: Andrew Storey, director of decision support and campaign execution
for the Bank of Nova Scotia; Julio Tavares, director, partner management and business
development with CIBC, and Steve Heck, group manager CRM / privacy officer, Microsoft
Canada. We posed the same questions to all of them.

What are the critical ingredients of achieving success within a data mining project?

Storey: First, the team should be strong from a very mathematical and statistical perspective and
have a strong understanding of the business lines they support. Having one without the other is not
enough. Another key ingredient is executive buy-in because embedding data mining processes
within the business is non-traditional.

Tavares: Clearly, the key ingredient is to focus on business objectives. Besides having a clear
understanding of the business issues, one must also be able to evaluate the scope and size of a given
project as well as its expected benefit.
For example, are we simply satisfied with 80/20 or do we need something more that really
addresses the business problem? Can we better explore the use of reports as opposed to complex
statistical solutions when trying to solve problems?

Heck: Aside from some obvious requirements, such as skilled analysts or analytical software, an
effective data mining project needs to have the following ingredients to achieve success.
Be clear about the vision of how the output could benefit the business even before any work is
done, and socialize this with the stakeholders and end users. This will ensure you have support
throughout the project.
Don't underestimate the importance of getting at the data you need to conduct the analysis. This
will typically make or break the project and is usually much harder than it seems.
Find people who can bridge business knowledge with the knowledge of how the data behaves.
In the end, success is determined by the impact the outcome has on the business, not by the
statistical success. An R² has no business value.

What have been the most significant advancements/improvements in data mining in the last five
to 10 years?

Heck: Three key drivers have improved data mining capabilities.


First, our ability to get the executive team to understand the merits of data mining and develop
business strategy around it has greatly expanded the application of data mining.
One of the most exciting developments is the attempt to bridge the gap between customer-facing
channels and data mining. Feeding better insights to front-line agents and having them capture
explicit customer needs will provide a level of validated customer information that has not been
possible.
Finally, the impact of privacy compliance activities has also had a dramatic effect on data-dependent
activities. It has forced data collection to have greater integrity while, at the same time,
limiting data mining to responsible, value-added applications.

Tavares: One of them is the ability to reduce the preparation time within the data mining cycle.
Advances in database technology and software allow quicker response time with such functions as
being able to create derived variables.
Software options are much more readily available, which allows the analyst to try a variety of
different techniques. Data mining is no longer a standalone process but rather has become
integrated within the overall business process.
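As a rough illustration of the derived variables mentioned above, the sketch below rolls raw transactions up into per-customer recency, frequency and average spend. The record layout, field names and sample figures are assumptions made for illustration only; they do not come from the interviewees.

```python
from datetime import date
from statistics import mean

# Hypothetical transaction records; the field names are illustrative assumptions.
transactions = [
    {"customer": "C001", "date": date(2006, 1, 15), "amount": 120.0},
    {"customer": "C001", "date": date(2006, 3, 2), "amount": 80.0},
    {"customer": "C002", "date": date(2005, 11, 20), "amount": 45.0},
]


def derive_variables(rows, as_of):
    """Summarize raw transactions into per-customer derived variables."""
    # Group the transactions by customer.
    by_customer = {}
    for row in rows:
        by_customer.setdefault(row["customer"], []).append(row)

    derived = {}
    for customer, items in by_customer.items():
        last_purchase = max(r["date"] for r in items)
        derived[customer] = {
            "recency_days": (as_of - last_purchase).days,  # days since last purchase
            "frequency": len(items),                       # number of purchases
            "avg_amount": mean(r["amount"] for r in items),
        }
    return derived


derived = derive_variables(transactions, as_of=date(2006, 4, 1))
for customer, variables in derived.items():
    print(customer, variables)
```

In practice this kind of summarization is often pushed into the database itself, which is one way the preparation time Tavares describes can shrink.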

Storey: Non-traditional techniques, such as non-parametric techniques and decision-theory-based
approaches, are more readily accepted within the data mining community. The way we
use results from data mining has also changed.
In the past, examples would focus on one marketing behaviour rather than looking at
behaviour holistically. For instance, decisions would have been based solely on
the response to a given product offer. Today, there is a more holistic perspective whereby
customers might have a series of different product model scores which are used to determine which
offer might be given to the customer.
This more holistic perspective has resulted in much more relevant leads given to customers and
greater flexibility in being able to select names than what could be done 10 years ago.
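The idea of using several product model scores to choose an offer can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used at any of the interviewees' organizations: the product names, scores and the `min_score` cut-off are all hypothetical.

```python
from typing import Dict, Optional


def best_offer(scores: Dict[str, float], min_score: float = 0.0) -> Optional[str]:
    """Return the product with the highest model score, or None if none reaches min_score."""
    eligible = {product: s for product, s in scores.items() if s >= min_score}
    if not eligible:
        return None  # no product is relevant enough, so make no offer
    return max(eligible, key=eligible.get)


# Hypothetical response-model scores per customer and product.
customers = {
    "C001": {"mortgage": 0.12, "credit_card": 0.41, "gic": 0.08},
    "C002": {"mortgage": 0.03, "credit_card": 0.02, "gic": 0.04},
}

offers = {cid: best_offer(scores, min_score=0.05) for cid, scores in customers.items()}
print(offers)
```

Comparing scores across products, rather than ranking customers on a single model, is what lets a customer with no sufficiently strong score be left out of the campaign altogether.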

What's the impact of a constantly expanding information environment on data mining?

Tavares: This requires the organization to better manage the data. As this complexity increases, we
need to identify new techniques that make data management more effective.
With this exploding data environment, the data miner will also need to better understand the data.
Knowledge of how to work with increasingly large volumes of data will be paramount if one
expects to derive meaningful insights from the information.
Heck: While more data is a good thing, it creates exponentially complex data mining challenges.
Given that the lion's share of data mining is focused on cleaning and manipulating the data, if the
analytic requirements are not considered, the productivity of data mining could be impeded at the
very time when it should be exploding.

Storey: Having access to more information can only help data mining. More information is better,
but getting results out instantly is not the optimum answer, particularly if they involve findings and
insights.
The actual numbers and results can be delivered quickly but people need time to think out what
these results mean. Building and applying real-time solutions should be viewed with caution as
there needs to be some human intervention concerning the overall stability and applicability of
these solutions.

Do you see companies altering their corporate structure to more fully integrate data mining into
their overall business and if so, how?

Storey: I see a centralized group working with perhaps non-centralized units to deliver an
overall solution. The centralized group would contain the higher-end or more specialized analytical
skill sets. This would be the group that would actually build models and perform specialized
analytics on a project basis.
The non-centralized units would sit within departments and provide a more generalized
business perspective on the data mining problem. These units would work very closely with the
more specialized central group in arriving at solutions that would benefit the unit's department.

Heck: Keep in mind that data mining is a means to an end, not an end in and of itself. Only as the
business becomes dependent on the output of data mining will it have the potential to become fully
integrated. The scope of integration should expand as data expands to support new customer
insights.
There is, however, an increasing trend for organizations to protect their data assets and customer
relationships with data governance officers and/or corporate compliance officers who oversee all
information assets within their organizations.

Tavares: I don't see it happening, as data mining structures are mature. The real need for more
structure is around the area of marketing/business intelligence.

What will be the impact of privacy on data mining?

Tavares: This is starting to happen already. Data miners need to question whether they need all the
data.
We need to better understand permission as it relates to privacy. For instance, do we have
permission to use rented lists in terms of appending information, and are the marketing clauses
sufficient in terms of combining data sources with rented lists? Education is a big part of
the process in dealing with privacy in the future.

Heck: If customers see the benefit of providing information to an organization, they will continue
to allow the data to be mined. The privacy movement has helped a great deal to give customers this
clarity and to force businesses to maintain a consistently high level of integrity over the use of the
data. Without privacy, customer dissatisfaction would have forced companies to abandon the use of
data mining.

What is the most significant challenge facing data miners and analysts?

Heck: The biggest challenge will be finding new sources of information that support the growing
expectations for customer-focused solutions. Data mining is only as good as the data you can
collect, and the collection methods are becoming ever more complicated.

Tavares: Business and the marketplace are changing quickly. Data mining has not really changed
in terms of new techniques. Yet, speed is the issue now in terms of getting to a solution.
In the past, there was a willingness to wait, whereas now speed is of the essence. Data miners will
need to produce solutions that yield good, albeit not optimal, results and that can be
delivered in a fraction of the time.

Storey: It is the ability to find profitable applications which can be integrated into the business
process. Besides creating new tools, it is more important to figure out how to use existing tools to
support the business.
The challenge with existing tools is to arrive at more creative applications for business strategy as
well as campaign execution. Ultimately, the development of these creative applications is done with
a view to optimizing profitability by providing relevant offers to a given customer's needs.

(Author ID paragraph)
Richard Boire heads up the Boire Filler Group, Pickering, ON, a firm specializing in customer
database analytics and predictive modeling. He can be reached at 905.837.0005.

(Possible pull quotes)


Storey: "This more holistic perspective has resulted in much more relevant leads."

Tavares: "Knowledge of how to work with increasingly large volumes of data will be paramount."

Heck: "While more data is a good thing, it creates exponentially complex data mining challenges."
