


1 Email:


Organisations today suffer from a malaise of data overflow. Developments in transaction processing technology have given rise to a situation where the amount and rate of data capture is very high, but the processing of this data into information that can be utilised for decision making is not developing at the same pace. Data warehousing and data mining (of both data and text) provide a technology that enables the decision-maker in the corporate sector or government to process this huge amount of data in a reasonable amount of time, and to extract intelligence and knowledge in near real time.

The data warehouse allows the storage of data in a format that facilitates its access, but if the tools for deriving information and/or knowledge, and for presenting them in a format that is useful for decision making, are not provided, the whole rationale for the existence of the warehouse disappears. Various technologies for extracting new insight from the data warehouse have emerged, which we classify loosely as "Data Mining Techniques".

Our paper focuses on the need for information repositories and for the discovery of knowledge, and hence gives an overview of the much-hyped fields of Data Warehousing and Data Mining.


INTRODUCTION

“Knowledge [no longer Information] is not only power, but also a significant competitive advantage”

Organizations have lately realized that just processing transactions and/or information faster and more efficiently no longer provides them with a competitive advantage vis-à-vis their competitors for achieving business excellence. Information technology (IT) tools that are oriented towards knowledge processing can provide the edge that organizations need to survive and thrive in the current era of fierce competition. The increasing competitive pressures and the desire to leverage information technology techniques have led many organizations to explore the benefits of new emerging technologies, viz. "Data Warehousing and Data Mining". What is needed today is not just the latest information, updated to the nanosecond, but cross-functional information that can support decision making as an "on-line" process.

Evolution of Information Technology Tools

The evolution of information systems has been an evolution from data maintenance systems to systems that transform the data into "information" for use in the decision making process. These systems supported the acquisition of information from the database of transactional data. The managerial knowledge acquisition function was not directly supported by these systems. They could not directly reveal the new patterns emerging in a changing scenario; the planner was expected to infer these from experience.

One thing that remains constant, especially in the corporate world, is “Change”.

And these days, change is occurring at an ever-increasing rate. A key challenge is implementing an information infrastructure that allows your company to respond rapidly to change. One solution to this challenge is the data warehouse.

What is Data Warehousing?

Warehouse with a database

Data warehousing is an information infrastructure based on detailed data that supports the decision-making process and gives businesses the ability to access and analyze data to increase an organization's competitive advantage.

Data warehousing is a process, not an off-the-shelf solution you buy: hardware, database and tools are integrated into an evolving information infrastructure that changes with the dynamics of the business.

The data warehouse makes an attempt to figure out "what we need" before we know we need it.

What is it actually?
* A data warehouse stores current and historical data
* This data is taken from various, perhaps incompatible, sources and stored in a uniform format
* Several tools transform this data into meaningful business information for the purpose of comparisons, trends and forecasting
* Data in a warehouse is not updated or changed in any way, but is only loaded and accessed later on
* Data is organized according to subject instead of application.

In general a database is not a data warehouse unless it has the following two features: It collects information from a number of different disparate sources and is the place where this disparity is reconciled, and information.
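
The characteristics listed above (disparate sources, one uniform format, load-and-read access) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the two source layouts, the field names, and the `Warehouse` class are hypothetical and do not describe any particular product.

```python
from datetime import date

# Two hypothetical source systems with incompatible record layouts.
billing_rows = [{"cust": "C001", "amt_cents": 125000, "day": "2024-01-15"}]
orders_rows = [("c002", "310.50", "15/02/2024")]


def from_billing(row):
    """Map a billing record (dict, cents, ISO date) to the uniform layout."""
    return {
        "customer_id": row["cust"].upper(),
        "amount": row["amt_cents"] / 100,
        "sale_date": date.fromisoformat(row["day"]),
        "source": "billing",
    }


def from_orders(row):
    """Map an orders record (tuple, text amount, DD/MM/YYYY) to the uniform layout."""
    cust, amount, day = row
    d, m, y = (int(part) for part in day.split("/"))
    return {
        "customer_id": cust.upper(),
        "amount": float(amount),
        "sale_date": date(y, m, d),
        "source": "orders",
    }


class Warehouse:
    """Append-only store: rows are loaded and read, never updated in place."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        self._rows.extend(rows)

    def query(self, predicate=lambda row: True):
        # Return copies so callers cannot modify the stored rows.
        return [dict(row) for row in self._rows if predicate(row)]


warehouse = Warehouse()
warehouse.load(from_billing(r) for r in billing_rows)
warehouse.load(from_orders(r) for r in orders_rows)
```

The class deliberately offers no update or delete method, which mirrors the point that warehouse data is only loaded and accessed later on.
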
Conceptually, a Data Warehouse looks like this:

Information Sources
Always include the core operational systems, which form the backbone of day-to-day activities. It is these systems which have traditionally provided management information to support decision making.

Decision Support Tools
Are used to analyze the information stored in the warehouse, typically to identify trends and new business opportunities.

The Data Warehouse
Itself is the bridge between the operational systems and the decision support tools. It holds a copy of much of the operational system data in a logical structure which is more conducive to analysis. The Data Warehouse, which will be refreshed in scheduled bursts from operational systems and from relevant external data sources, provides a single, consistent view of corporate data, leaving operational systems

Data Warehouse Functions
The main function behind a data warehouse is to get the enterprise-wide data in a format that is most useful to end-users, regardless of their locations. Data warehousing is used for:
• Increasing the speed and flexibility of analysis.
• Providing a foundation for enterprise-wide integration and access.
• Improving or re-inventing business processes.
• Gaining a clear understanding of customer behavior.

Data Warehouse Architecture
Each implementation of a data warehouse is different in its detailed design (a schematic high-level view of the architecture and its components is given in the figure below), but all are characterised by a handful of the following key components:
• A data model to define the warehouse contents.
• A carefully designed warehouse database, whether hierarchical, relational, or multidimensional. When choosing a DBMS, it must be kept in view that the database management system should be powerful enough to handle huge amounts of data, running up to terabytes.
• A front end for a Decision Support System (DSS) for reporting and for structured and unstructured analysis.
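
A rough sketch of how the three components fit together is given below, using Python's standard `sqlite3` module as a stand-in for a relational warehouse database. The table and column names, the star-like layout (one fact table with two dimension tables), and the sample rows are assumptions chosen for illustration; the paper does not prescribe a particular schema.

```python
import sqlite3

# Component 1: a data model, organized by subject (sales) rather than by application.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    sale_year  INTEGER,
    amount     REAL
);
""")

# Component 2: the warehouse database, loaded with illustrative rows.
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "books"), (2, "music")])
conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                 [(1, "north"), (2, "south")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 1, 2023, 100.0),
    (1, 2, 2023, 150.0),
    (2, 1, 2023, 80.0),
    (1, 1, 2024, 120.0),
])


def sales_by_category(connection):
    """Component 3: a tiny reporting query of the kind a DSS front end might issue."""
    cursor = connection.execute("""
        SELECT p.category, f.sale_year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category, f.sale_year
        ORDER BY p.category, f.sale_year
    """)
    return cursor.fetchall()


report = sales_by_category(conn)
```

An in-memory SQLite database is obviously far from the terabyte scale mentioned above; it is used here only so that the sketch runs without any setup.
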

Data Mining
Database mining or data mining (DM), formally termed Knowledge Discovery in Databases (KDD), is a process that aims to use existing data to derive new facts and to uncover new relationships previously unknown even to experts thoroughly familiar with the data. It is like extracting precious metals (say gold) and/or gems, hence the term “mining”. It is based on the filtration and assaying of a mountain of data “ore” in order to get “nuggets” of knowledge. The data mining process is shown diagrammatically in the figure below.
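
As a small illustration of uncovering a relationship that nobody asked about in advance, the sketch below counts which pairs of items occur together in a set of transactions and keeps the pairs that appear often enough. Pair counting with a support threshold is one simple technique chosen here as an example; the baskets, the function name `frequent_pairs`, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is the items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs whose share of transactions is at least min_support."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


patterns = frequent_pairs(baskets, min_support=0.5)
```

With this sample data only the bread and milk pair passes the threshold, since it appears in three of the four baskets.
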


Data Mining with Data Warehousing
· The goal of a data warehouse is to support decision making with data.
· Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.
· Data mining can be applied to operational databases with individual transactions.
· To make data mining more efficient, the data warehouse should have an aggregated or summarized collection of data.
· Data mining helps in extracting meaningful new patterns that cannot necessarily be found by merely querying or processing data or metadata in the data warehouse.

Data Mining as a Part of the Knowledge Discovery Process
· Knowledge Discovery in Databases, frequently abbreviated as KDD, typically encompasses more than data mining.
· The knowledge discovery process comprises six phases:
Data selection: Data about specific items or categories of items, or from stores in a specific region or area of the country, may be selected.
Data cleansing: The cleansing process may then correct invalid zip codes or eliminate records with incorrect phone prefixes.
Enrichment: Enrichment typically enhances the data with additional sources of information.
Data transformation and encoding: These may be done to reduce the amount of
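
The four phases described above (selection, cleansing, enrichment, transformation and encoding) can be sketched as a chain of small functions. The record layout, the five-digit zip rule, the `ZIP_TO_CITY` lookup table, and the integer encoding are all assumptions made for illustration, not part of the KDD process definition.

```python
import re

# Hypothetical raw sales records.
raw_records = [
    {"item": "tea", "region": "north", "zip": "12345", "amount": 4.0},
    {"item": "tea", "region": "north", "zip": "1234X", "amount": 5.0},
    {"item": "soap", "region": "north", "zip": "54321", "amount": 2.5},
    {"item": "tea", "region": "north", "zip": "12399", "amount": 6.0},
]

# Hypothetical external source used for enrichment.
ZIP_TO_CITY = {"12345": "Springfield", "12399": "Shelbyville"}


def select(records, item, region):
    """Data selection: keep one item category from one region."""
    return [r for r in records if r["item"] == item and r["region"] == region]


def cleanse(records):
    """Data cleansing: here, drop records whose zip code is not five digits."""
    return [r for r in records if re.fullmatch(r"\d{5}", r["zip"])]


def enrich(records, lookup):
    """Enrichment: add a city name taken from an additional source."""
    return [{**r, "city": lookup.get(r["zip"], "unknown")} for r in records]


def transform(records):
    """Transformation and encoding: replace city names by small integer codes."""
    codes = {}
    encoded = []
    for r in records:
        code = codes.setdefault(r["city"], len(codes))
        encoded.append((code, r["amount"]))
    return encoded, codes


selected = select(raw_records, "tea", "north")
cleaned = cleanse(selected)
enriched = enrich(cleaned, ZIP_TO_CITY)
encoded, city_codes = transform(enriched)
```

This sketch eliminates records with a bad zip code; correcting them instead, which the text also mentions, would need a reference source of valid codes.
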

Goals of Data Mining
The goals of data mining fall into the following classes:
Prediction: Data mining can show how certain attributes within the data will behave in the future.
Identification: Data patterns can be used to identify the existence of an item, an event, or an activity.
Classification: Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.
Optimization: One eventual goal of data mining may be to optimize the use of limited resources such as time, space, money, or materials and to maximize output variables such as sales or profits under a given set of constraints.
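
To make the prediction goal concrete, the sketch below fits a straight line to past values by ordinary least squares and extrapolates one step ahead. A linear trend is only one very simple predictive model, chosen here as an assumption; the quarterly sales figures and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sales for four past quarters.
quarters = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(quarters, sales)
forecast_q5 = slope * 5 + intercept  # predicted sales for the fifth quarter
```

Real data rarely lies on a perfect line as it does here, so a practical use of this idea would also need some measure of how well the fitted trend matches the history.
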

CONCLUSION

A data warehouse takes the organizational operational data, historical data and external data, and
a) consolidates it into a separately designed database (which can be either relational or multi-dimensional in nature)
b) manages it into a format that is optimised for end users to access and analyse.

When a data warehouse has been constructed, it provides a complete picture of the enterprise. It provides an unparalleled opportunity to the management to learn about their

The data warehouse technology, together with online transaction processing and data mining, allows the management to provide better customer service, create greater customer loyalty and activity, focus customer acquisition and retention on the most profitable customers, increase revenue and reduce operating cost; provides tools that facilitate sounder decision making; improves worker/management knowledge and productivity; spares the operational database from ad-hoc queries with the resulting performance degradation; and clears the legacy database system, while moving the corporate system architecture forward.

With the incorporation of new data delivery and presentation techniques, like Hypertext Markup Language (HTML), Open Database Connectivity (ODBC) etc., the database mining (data and text) operation has gained widespread recognition as a viable tool for business intelligence gathering. Advances in document mining technology (database mining of free-form text/data, in contrast to the “classical” approach to data mining of fixed-length records) are making the mining technology more powerful.

Last but not least, the Internet has emerged as the largest data warehouse of unstructured and free-form data. The new technologies are geared towards mining this great data warehouse.
