 NADEEM . A
 JAYADURGA .S  RAGINI .S

111A1012
111A1009 111A1013


WHICH ARE OUR LOWEST/HIGHEST MARGIN CUSTOMERS?
WHAT IS THE MOST EFFECTIVE DISTRIBUTION CHANNEL?
WHO ARE MY CUSTOMERS AND WHAT PRODUCTS ARE THEY BUYING?
WHAT PRODUCT PROMOTIONS HAVE THE BIGGEST IMPACT ON REVENUE?
WHAT IMPACT WILL NEW PRODUCTS OR SERVICES HAVE ON REVENUE AND MARGINS?
WHICH CUSTOMERS ARE MOST LIKELY TO GO TO THE COMPETITORS?

► I CAN'T FIND THE DATA I NEED
  ► DATA IS SCATTERED OVER THE NETWORK
  ► MANY VERSIONS, SUBTLE DIFFERENCES
► I CAN'T GET THE DATA I NEED
  ► NEED AN EXPERT TO GET THE DATA
► I CAN'T UNDERSTAND THE DATA I FOUND
  ► AVAILABLE DATA POORLY DOCUMENTED
► I CAN'T USE THE DATA I FOUND
  ► RESULTS ARE UNEXPECTED
  ► DATA NEEDS TO BE TRANSFORMED FROM ONE FORM TO ANOTHER

► 1960S:
  ► DATA COLLECTION, DATABASE CREATION, IMS AND NETWORK DBMS
► 1970S:
  ► RELATIONAL DATA MODEL, RELATIONAL DBMS IMPLEMENTATION
► 1980S:
  ► RDBMS, ADVANCED DATA MODELS (EXTENDED-RELATIONAL, OO, DEDUCTIVE, ETC.) AND APPLICATION-ORIENTED DBMS (SPATIAL, SCIENTIFIC, ENGINEERING, ETC.)
► 1990S-2000S:
  ► DATA MINING AND DATA WAREHOUSING, MULTIMEDIA DATABASES, AND WEB DATABASES

The data warehouse is that portion of an overall Architected Data Environment that serves as the single integrated source of data for processing information.

► Data explosion problem
  ► Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories
► We are drowning in data, but starving for knowledge!
► Solution: Data warehousing and data mining
  ► Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases

DATA WAREHOUSE: Subject Oriented, Process Oriented, Integrated, Accessible, Non Volatile, Time variant

DATA MART, STAGING AREA, OLAP, OLAP TOOLS

Operational, External & other Databases
Data Acquisition
DATA MANAGEMENT: Enterprise Warehouse, Data Marts, Analytical Data Store
Data Analysis
METADATA MANAGEMENT: Metadata Directory, Metadata Repository
Warehouse Design
Web Information Systems


The data warehouse is distinctly different from the operational data used and maintained by day-to-day operational systems. Data warehousing is not simply an “access wrapper” for operational data, where data is simply “dumped” into tables for direct access.

OPERATIONAL DATA | DATA WAREHOUSE
Application oriented | Subject oriented
Detailed | Summarized
Accurate, as of the moment of access | Represents values over time
Serves the clerical community | Serves the managerial community
Performance sensitive (immediate response required when entering a transaction) | Performance relaxed (immediacy not required)
Static structure, variable contents | Flexible structure
Small amount of data used in a process | Large amount of data used in a process
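The "Detailed" versus "Summarized" contrast can be sketched in a few lines of Python. This is only an illustration: the record layout (customer, month, amount), the sample values, and the function name `summarize_by_customer_month` are all assumptions made for the example, not part of any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical detailed operational records: (customer, month, amount).
transactions = [
    ("C1", "2024-01", 120.0),
    ("C1", "2024-01", 80.0),
    ("C1", "2024-02", 50.0),
    ("C2", "2024-01", 200.0),
]

def summarize_by_customer_month(rows):
    """Roll detailed rows up into one total per (customer, month)."""
    totals = defaultdict(float)
    for customer, month, amount in rows:
        totals[(customer, month)] += amount
    return dict(totals)

# The summary keeps one value per customer per month, i.e. values over time,
# instead of every individual transaction.
summary = summarize_by_customer_month(transactions)
for (customer, month), total in sorted(summary.items()):
    print(customer, month, total)
```

The operational system would hold all four rows; the summarized view holds three totals, one per customer and month.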

► Data mining (knowledge discovery in databases):
  ► Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
  ► Process of finding different patterns or correlations among the data in large relational databases
  ► Popular and highly used in the information industry

► Market Analysis And Management
► Corporate Analysis And Risk Management
► Fraud Detection And Management

► Where are the data sources for analysis?
  ► Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
► Target marketing
  ► Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.
► Determine customer purchasing patterns over time
  ► Conversion of single to a joint bank account: marriage, etc.
► Cross-market analysis
  ► Associations/correlations between product sales
  ► Prediction based on the association information
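Associations between product sales are commonly measured with support and confidence. The following is a minimal sketch under stated assumptions: the baskets and product names are made up, and `support` and `confidence` are illustrative helper names rather than functions from any library.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in baskets if items <= set(basket))
    return hits / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the baskets."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0  # antecedent never occurs, so no evidence for the rule
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / base

# Hypothetical sales baskets (assumed data, for illustration only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

A rule such as bread -> milk with high support and confidence is the kind of association information that the prediction step would then build on.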

► Customer profiling
  ► Data mining can tell you what types of customers buy what products (clustering or classification)
► Identifying customer requirements
  ► Identifying the best products for different customers
  ► Use prediction to find what factors will attract new customers
► Provides summary information
  ► Various multidimensional summary reports
  ► Statistical summary information (data central tendency and variation)
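Statistical summary information (central tendency and variation) can be illustrated with Python's standard `statistics` module. The spending figures and the helper name `summarize` are assumptions for the example.

```python
import statistics

def summarize(values):
    """Return basic central-tendency and variation measures."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Hypothetical monthly spend per customer (assumed data).
spend = [20.0, 35.0, 35.0, 50.0, 110.0]

report = summarize(spend)
for name, value in report.items():
    print(name, round(value, 2))
```

Here the mean (50.0) sits well above the median (35.0), which hints that one high spender is pulling the average up.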

► Finance planning and asset evaluation
  ► Cash flow analysis and prediction
  ► Analysis of trends, financial ratios and market value
► Resource planning:
  ► Summarize and compare the resources and spending
► Competition:
  ► Monitor competitors and market directions
  ► Group customers into classes and apply a class-based pricing procedure
  ► Set pricing strategy in a highly competitive market

► Applications
  ► Widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.
► Approach
  ► Use historical data to build models of fraudulent behavior and use data mining to help identify similar instances
► Examples
  ► Auto insurance: detect a group of people who stage accidents to collect on insurance
  ► Money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
  ► Medical insurance: detect professional patients and ring of doctors and ring of references

► Detecting inappropriate medical treatment
  ► Australian Health Insurance Commission identifies that in many cases blanket screening tests were requested (save Australian $1m/yr).
► Detecting telephone fraud
  ► Telephone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm.
  ► British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.
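One simple way to look for patterns that deviate from an expected norm is a z-score test on call duration. This is a swapped-in textbook technique chosen for illustration, not the method used in the cases above; the call durations, the threshold of 2.0, and the name `flag_deviations` are assumptions.

```python
import statistics

def flag_deviations(durations, threshold=2.0):
    """Return (index, duration) pairs lying more than `threshold`
    population standard deviations away from the mean."""
    mean = statistics.mean(durations)
    stdev = statistics.pstdev(durations)
    if stdev == 0:
        return []  # all calls identical, nothing deviates
    return [
        (i, d)
        for i, d in enumerate(durations)
        if abs(d - mean) / stdev > threshold
    ]

# Hypothetical call durations in minutes (assumed data).
calls = [3, 4, 5, 4, 3, 5, 4, 60]

print(flag_deviations(calls))
```

A real telephone call model would also use destination and time of day or week, and a single large outlier inflates the mean and standard deviation, so a more robust measure may be preferable in practice.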

► Sports
  ► IBM Advanced Scout analyzed NBA game statistics (shots blocked, assists, and fouls) to gain competitive advantage for New York Knicks and Miami Heat
► Astronomy
  ► JPL and the Palomar Observatory discovered 22 quasars with the help of data mining
► Internet Web Surf-Aid
  ► IBM Surf-Aid applies data mining algorithms to Web access logs for market-related pages to discover customer preference and behavior pages, analyzing effectiveness of Web marketing, improving Web site organization, etc.

◙ Other Applications
► Text mining (news group, email, documents)
► Stream data mining
► Web mining
► DNA data analysis
