


Data mining
• A tool that helps in extracting information from large databases
• Technological tool used in organizations
• Discover previously unknown data
• Decisions are built on these data
• Analyze the data, uncover the problems and then use the models to predict the business behavior

Contd...
• These techniques can be implemented on existing hardware platforms
• Techniques used to obtain KNOWLEDGE
• Knowledge is then used to make predictions of values such as sales returns

Scope of Data Mining
Data mining technology can generate new business opportunities by providing these capabilities:
• Automated prediction of trends and behavior: automates the process of finding information from large databases
• Automated discovery of previously unknown patterns: identifies previously hidden data
• More columns: explore the full depth of a database
• More rows: lower estimation errors

Extracting knowledge
• To build data mining tools one needs to do:
  - Data preparation: main data are identified and cleansed of any impurities
  - Data analysis and classification: identify common data patterns, use algorithms etc.
  - Knowledge acquisition: selects the appropriate knowledge acquisition algorithm, e.g. decision trees
  - Prognosis: findings are used to predict the future
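The four steps above can be sketched end to end on a toy data set. This is only an illustration under stated assumptions: the records, the field layout and all function names are hypothetical, and a simple per-group mean stands in for a real knowledge acquisition algorithm such as a decision tree.

```python
# Hypothetical raw records: (region, monthly_sales); None marks a missing value.
raw = [
    ("north", 120), ("south", None), ("north", 130),
    (None, 90), ("south", 80), ("south", 100),
]

def prepare(records):
    """Data preparation: drop records that contain missing fields."""
    return [r for r in records if None not in r]

def analyze(records):
    """Data analysis and classification: group sales figures by region."""
    groups = {}
    for region, sales in records:
        groups.setdefault(region, []).append(sales)
    return groups

def acquire_knowledge(groups):
    """Knowledge acquisition: here simply the mean sales per region (a stand-in)."""
    return {region: sum(values) / len(values) for region, values in groups.items()}

def prognosis(knowledge, region):
    """Prognosis: use the learned mean as the predicted sales for a region."""
    return knowledge[region]

clean = prepare(raw)
knowledge = acquire_knowledge(analyze(clean))
print(prognosis(knowledge, "south"))
```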

DATA MINING TOOLS
Data mining software analyses relationships and patterns in stored transaction data based on open-ended user queries
• While mining the data one or more of four types of relationships are sought:
  - Classes: Stored data is used to locate data in predetermined groups.
  - Clusters: Data items are grouped according to logical relationships or cluster preferences.
  - Associations: Data can be mined to identify associations between the buying patterns.
  - Sequential patterns: Data is mined to anticipate behavior patterns and trends.
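As a rough illustration of the association relationship, the sketch below counts how often pairs of items are bought together. The transactions and the function name pair_support are hypothetical, and this is plain pair counting, not any particular tool's algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction data: each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_support(transactions):
    """Return the fraction of baskets in which each pair of items appears together."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}

support = pair_support(transactions)
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])
```

A pair with high support, such as bread and butter here, is a candidate association between buying patterns.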

Different levels of analysis that can be done to mine the data are:
• Decision trees: A structure that can be used to divide a large collection of records into successively smaller sets by applying a sequence of simple decision rules.
• Artificial Neural Network (ANN): Non-linear predictive models that learn through training and resemble biological neural networks in structure.
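The decision tree idea of dividing records into successively smaller sets can be sketched with two hand-written rules. The records, thresholds and the helper split are assumptions made for illustration. A real decision tree algorithm would learn the rules from the data instead of having them fixed by hand.

```python
# Hypothetical customer records.
records = [
    {"age": 25, "income": 30000},
    {"age": 45, "income": 80000},
    {"age": 35, "income": 52000},
    {"age": 52, "income": 41000},
    {"age": 23, "income": 61000},
]

def split(records, rule):
    """Apply one simple decision rule and return (matching, non-matching) records."""
    yes = [r for r in records if rule(r)]
    no = [r for r in records if not rule(r)]
    return yes, no

# First rule splits on age; the second rule further splits the older group on income.
young, older = split(records, lambda r: r["age"] < 30)
older_high, older_low = split(older, lambda r: r["income"] >= 50000)

print(len(young), len(older_high), len(older_low))
```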

• Nearest Neighbor method: To forecast the prediction value for an unclassified record, look for similar records and use the prediction value of the record that is nearest to the unclassified record.
• Clustering: Used to segment a database into clusters based on a set of attributes.
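A minimal sketch of the nearest neighbor method, assuming records are numeric tuples and that similarity is measured by Euclidean distance. The sample data and the function name nearest_neighbor_predict are hypothetical.

```python
import math

# Hypothetical labeled records: (attributes, known prediction value).
labeled = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 5.0), "high"),
    ((6.0, 5.5), "high"),
]

def nearest_neighbor_predict(labeled, query):
    """Return the prediction value of the labeled record closest to the query."""
    _, value = min(labeled, key=lambda item: math.dist(item[0], query))
    return value

print(nearest_neighbor_predict(labeled, (1.2, 1.1)))
print(nearest_neighbor_predict(labeled, (5.5, 5.0)))
```

Using only the single nearest record keeps the sketch close to the description above. Voting among several neighbors is a common variation.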

Thank you