Database: the storage of data in paper or electronic files for future use, analysis, and retrieval is called a database.
For example: Banking, for customer information, accounts, loans, and transactions. Universities, for student information and course registration. Sales, for customer, product, and purchase information. Hence, a database is where the data resides.
What impact will new products or services have on revenue and margin? Questions like this rest on meaningful facts and figures, but the data needs to be transformed from one form into another before it can answer them.
A data warehouse is a subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making.
Data Mining
Data mining software tools find hidden patterns and relationships in large pools of data and infer rules from them that can be used to predict future behavior. The major reason data mining has attracted so much attention is the wide availability of data and the pressing need to turn that data into information and knowledge.
The mining of gold from sand or rocks is referred to as gold mining rather than rock or sand mining. Thus data mining should have been named knowledge mining from data.
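To make "infer rules from patterns" concrete, here is a minimal sketch of one common technique, association rule discovery over item pairs. The baskets, the `pair_rules` helper, and the 0.5 support threshold are all hypothetical choices made for illustration; real tools use far more efficient algorithms on far larger data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # One rule per direction: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs in basket | lhs in basket)
            rules.append((lhs, rhs, support, confidence))
    return rules

rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule "butter -> bread" comes out with confidence 1.0. That is the kind of pattern that could then be used to predict behavior.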
[Figure: the knowledge discovery process, with stages labeled Databases, Data Integration, Data Cleaning, and Data Mining]
Problem formulation
Data collection: subset the data (sampling might hurt if the data is highly skewed); feature selection
Pre-processing: cleaning, including name/address cleaning, reconciling different terms for the same meaning (annual, yearly), duplicate removal, and supplying missing values
Transformation: map complex objects, e.g. time series data, to features, e.g. frequency
Choosing the mining task and mining method
Result evaluation and visualization
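The pre-processing step can be sketched in a few lines. This is an illustrative example only: the records, the field names, the `SYNONYMS` table, and the choice of mean imputation for missing values are all assumptions, not something the source prescribes.

```python
# Hypothetical raw customer records; field names and values are illustrative only.
raw_records = [
    {"name": " Alice Smith ", "billing": "annual", "income": 52000},
    {"name": "alice smith", "billing": "yearly", "income": 52000},
    {"name": "Bob Jones", "billing": "monthly", "income": None},
    {"name": "Carol White", "billing": "Yearly", "income": 61000},
]

# Terms that mean the same thing are mapped to one canonical value.
SYNONYMS = {"yearly": "annual"}

def clean(records):
    # 1. Name cleaning and unifying different terms with the same meaning.
    normalised = []
    for r in records:
        name = " ".join(r["name"].split()).title()
        billing = r["billing"].strip().lower()
        billing = SYNONYMS.get(billing, billing)
        normalised.append({"name": name, "billing": billing, "income": r["income"]})

    # 2. Duplicate removal, keeping the first occurrence of each (name, billing) pair.
    seen, unique = set(), []
    for r in normalised:
        key = (r["name"], r["billing"])
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 3. Supplying missing values: here, the mean of the known incomes (one option of many).
    known = [r["income"] for r in unique if r["income"] is not None]
    mean_income = sum(known) / len(known)
    for r in unique:
        if r["income"] is None:
            r["income"] = mean_income
    return unique

cleaned = clean(raw_records)
for record in cleaned:
    print(record)
```

The two "Alice Smith" rows collapse into one once the names are normalised and "yearly" is mapped to "annual", and the missing income is filled in from the remaining records.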
Knowledge discovery is an iterative process
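The data collection step warns that sampling might hurt when the data is highly skewed: a small random sample can easily miss or misrepresent a rare class. One possible mitigation, sketched below, is stratified sampling, which draws the same fraction from every class. The `stratified_sample` helper, the fraud/normal labels, and the 10% fraction are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, fraction, seed=0):
    """Sample the same fraction from each class, keeping at least one row per class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for members in by_class.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical skewed data set: 990 normal transactions and 10 fraudulent ones.
rows = [("normal", i) for i in range(990)] + [("fraud", i) for i in range(10)]
sample = stratified_sample(rows, label_of=lambda r: r[0], fraction=0.1)
fraud_in_sample = sum(1 for label, _ in sample if label == "fraud")
print(len(sample), fraud_in_sample)
```

Because each class is sampled separately, the rare class is guaranteed to appear in the sample in roughly its original proportion, whichever rows happen to be drawn.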
Application Areas
Industry                  Application
Finance                   Credit card analysis
Insurance                 Claims, fraud analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            Promotion analysis
Data service providers    Value added data
Utilities                 Power usage analysis
Conclusion
Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data, whereas data mining is a useful tool with multiple algorithms that can be tuned for specific tasks. Data mining can benefit business, medicine, and science, but it needs more efficient algorithms to speed up the mining process.