
A TECHNICAL PAPER ON

BY:
TRUPTI MAKADIA (5TH IT)
YAMINI PATEL (5TH CE)
C.U.SHAH COLLEGE OF ENGG. & TECH.
WADHAWAN CITY-363030
E-MAIL: tupimakadia1@yahoo.co.in, yamu_4u1985@yahoo.co.in

INDEX:
Abstract
Overview
1. What is data mining?
2.1 Continuous innovations
3. Data warehouses
4. Application of data mining
5. What can data mining do?
6. How it works?
7. Elements
8. Level of analysis
9. Integration with object relational database system
9.1 Significance of data mining
9.2 The object relational perspective
9.3 Database integration
9.4 Problems and difficulties in integration
10. Conclusion

ABSTRACT:

The traditional database systems are not well suited to meet the challenges of the future. Relational models lack support for the complex data needed by today’s enterprises, whereas object models suffer from scalability problems. The Object-Relational Model combines the advantages of the traditional models while overcoming their deficiencies. Data mining techniques, based on statistics and machine learning, can significantly boost the ability to analyze large amounts of data. Despite its potential, this technology is destined to be a niche technology unless an effort is made to integrate it with the newly evolving Object-Relational Database Systems. This integration is the key to making it convenient to use, easy to deploy in real applications, and to growing its user base. This technical research paper explores the key issues, challenges and methods to enable the seamless integration of data mining technology within the framework of the Object-Relational Database Systems.


DATA MINING

OVERVIEW:-

Mining has always been associated with dark, bottomless pits and workers who didn't see the light of day for hours at a time. Data mining derives its name from the similarities between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material, or intelligently probing it to find exactly where the value resides.

Most major organizations have data warehouses containing information about their clients, competitors and products. These huge data warehouses contain gigabytes of "hidden" information that can't be easily found using typical database queries, giving rise to the myth that the more data you have, the less you know. Data mining algorithms change all that by finding interesting patterns that an enterprise didn’t even know were there.

Data mining is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools. Data mining tools can answer business questions that traditionally were too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.

1. WHAT IS DATA MINING?

Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
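The technical definition of data mining as finding correlations among fields can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the field names (`visits`, `basket_size`, `discount`), the sample records and the 0.8 threshold are hypothetical and do not come from any real data set. It computes the Pearson correlation for every pair of numeric fields and reports the pairs that are strongly related.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_pairs(records, threshold=0.8):
    """Return (field_a, field_b, r) for every field pair with |r| >= threshold."""
    fields = sorted(records[0])
    columns = {f: [r[f] for r in records] for f in fields}
    found = []
    for a, b in combinations(fields, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            found.append((a, b, round(r, 3)))
    return found


# Hypothetical customer records (illustrative values only).
records = [
    {"visits": 1, "basket_size": 2, "discount": 5},
    {"visits": 2, "basket_size": 4, "discount": 3},
    {"visits": 3, "basket_size": 6, "discount": 6},
    {"visits": 4, "basket_size": 8, "discount": 2},
    {"visits": 5, "basket_size": 10, "discount": 4},
]

pairs = strongest_pairs(records)
```

With dozens of fields the number of pairs grows quadratically, which is one reason such searches are automated rather than done by hand.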

2.1 Continuous Innovation:

Although data mining is a relatively new term, the technology is not. Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market research reports for years. However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost.

For example, one Midwest grocery chain used the data mining capacity of Oracle software to analyze local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they only bought a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend. The grocery chain could use this newly discovered information in various ways to increase revenue. For example, they could move the beer display closer to the diaper display. And, they could make sure beer and diapers were sold at full price on Thursdays.

3. DATA WAREHOUSE:

Dramatic advances in data capture, processing power, data transmission, and storage capabilities are enabling organizations to integrate their various databases into data warehouses. Data warehousing is defined as a process of centralized data management and retrieval. Data warehousing, like data mining, is a relatively new term although the concept itself has been around for years. Data warehousing represents an ideal vision of maintaining a central repository of all organizational data. Centralization of data is needed to maximize user access and analysis. Dramatic technological advances are making this vision a reality for many companies. And, equally dramatic advances in data analysis software are allowing users to access this data freely. The data analysis software is what supports data mining.
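The diaper-and-beer finding in the grocery example is a co-occurrence pattern, and patterns of this kind are usually scored with support, confidence and lift. The following Python sketch shows one way to compute those three measures. The baskets are invented for illustration and are not the grocery chain's data, and the function name `rule_metrics` is an assumption of this sketch.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Score the rule 'antecedent -> consequent' over a list of item sets.

    support    = share of baskets containing both items
    confidence = share of antecedent baskets that also contain the consequent
    lift       = confidence divided by the overall share of consequent baskets
    """
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = has_both / n
    confidence = has_both / has_a if has_a else 0.0
    lift = confidence / (has_c / n) if has_c else 0.0
    return support, confidence, lift


# Hypothetical shopping baskets (illustrative only).
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
    {"milk"},
    {"bread", "chips"},
]

support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
```

A lift above 1 means the two items appear together more often than they would if purchases were independent, which is the kind of signal a retailer would act on when deciding shelf placement.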

4. APPLICATION OF DATA MINING: -

Applications of data mining include fraud detection, credit card scoring and personal profile marketing, direct marketing, trend analysis, financial market forecasting and international criminal investigations. Skillful interpretation of data can enhance customer relations. Web mining, through which data is analyzed from the Web, helps businesses understand customer "click-stream" behavior online.

5. WHAT CAN DATA MINING DO?

Companies with a strong consumer focus (retail, financial, communication, and marketing organizations) primarily use data mining today. It enables these companies to determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics. And, it enables them to determine the impact on sales, customer satisfaction, and corporate profits. Finally, it enables them to "drill down" into summary information to view detail transactional data.

With data mining, a retailer could use point-of-sale records of customer purchases to send targeted promotions based on an individual's purchase history. By mining demographic data from comment or warranty cards, the retailer could develop products and promotions to appeal to specific customer segments. For example, Blockbuster Entertainment mines its video rental history database to recommend rentals to individual customers. American Express can suggest products to its cardholders based on analysis of their monthly expenditures.
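The "drill down" from summary information to detail transactional data can be sketched with two queries over the same table. The Python example below uses the standard `sqlite3` module with an in-memory database. The `sales` table, its columns and its rows are hypothetical and chosen only to illustrate the idea.

```python
import sqlite3

# In-memory database with a hypothetical point-of-sale table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "tent", 150.0),
        ("East", "backpack", 80.0),
        ("West", "tent", 100.0),
        ("West", "sleeping bag", 60.0),
        ("West", "backpack", 40.0),
    ],
)


def summary_by_region(conn):
    """Summary level: total sales per region."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def drill_down(conn, region):
    """Detail level: the individual transactions behind one summary row."""
    return conn.execute(
        "SELECT product, amount FROM sales WHERE region = ? ORDER BY rowid",
        (region,),
    ).fetchall()


summary = summary_by_region(conn)
west_detail = drill_down(conn, "West")
```

The analyst starts from `summary`, notices a region of interest, and then calls `drill_down` to see the transactions that make up that total.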

6. HOW IT WORKS?

While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:

• Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.
• Clusters: Data items are grouped according to logical relationships or consumer preferences. For example, data can be mined to identify market segments or consumer affinities.
• Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.
• Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.

7. ELEMENTS:

• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.

8. LEVEL OF ANALYSIS: -

• Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.
• Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
• Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID). CART and CHAID are decision tree techniques used for classification of a dataset.
• Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
• Rule induction: The extraction of useful if-then rules from data based on statistical significance.
• Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.

9. INTEGRATION WITH OBJECT RELATIONAL DATABASE SYSTEM:

9.1 Significance of Data Mining

A recent Gartner Group Advanced Technology Research Note listed data mining and artificial intelligence at the top of the five key technology areas that “will clearly have a major impact across a wide range of industries within the next 3 to 5 years”. It also observed that, “With the rapid advance in data capture, transmission and storage, large-systems users will increasingly need to implement new and innovative ways to mine the after-market value of their vast stores of detail data, employing MPP [massively parallel processing] systems to create new sources of business advantage.” Within the next 2-3 years, at least half of the Fortune 1000 companies worldwide will be using data mining technology.

The way in which companies interact with their customers has changed dramatically over the past few years. A customer’s continuing business is no longer guaranteed. As a result, companies have found that they need to understand their customers better, and to quickly respond to their wants and needs. In addition, the time frame in which these responses need to be made has been shrinking. It is no longer possible to wait.

9.2 The Object-Relational Perspective

The complexity and richness of data to be handled by business applications is constantly increasing. The pressures of a competitive marketplace are driving corporations to build and evolve their applications in a timely and cost-effective manner. Increasingly, companies need to build applications that closely match their business models and processes. Intranets and extranets help drive the data workflow both within a company and externally with its partners, suppliers and customers to support its business processes. The explosion of the World Wide Web has made it possible to publish content that involves text, image, audio and video. Modern database applications need to store and manipulate objects that are neither small nor simple, and to perform operations on these objects.

We are in a period of intensive change and innovation regarding database technology and related products. Formally defined, “A system that includes both object infrastructure and a set of relational extenders that exploit it is called an Object-Relational Database System.” Object-Relational Database Systems (ORDBMS) are able to easily store complex, structured data and large, unstructured, domain-specific data, such as text, image, audio and video data. Complex data like graphics, images, videos and songs can be stored in the database directly. ORDBMS allows the users to define their own types, thus extending the type set of the database. Furthermore, they also provide features such as encapsulation of data, inheritance between types and polymorphism.

9.3 Database Integration

Companies spend millions of dollars to build data warehouses to hold their data, and data mining techniques must take advantage of this. Besides saving significant manual effort and storage space, this integration allows data mining applications to access the most up-to-date information available. Success of data mining as an enterprise technology crucially depends on seamless integration of this technology with enterprise databases, and more specifically the newly emerging Object-Relational Database Systems. Data mining applications must be smoothly integrated within the Object-Relational Database Systems in order to get the maximum benefit out of their inherent object-orientation. Many leading vendors like IBM, Oracle etc. have taken positive steps in this regard, but there is still room for improvement.

9.4 Problems and Difficulties in the Integration

Data mining normally involves operations on very large sets of data. Object-Relational Database Systems suffer a performance loss when dealing with such large amounts of persistent objects. Many of the data mining algorithms use complex mathematical and statistical algorithms that are not easily mapped into human terms. We found this to be a strong deterrent for understanding. For Object-Relational Databases ... and only the relational model should be extended to incorporate user-defined types. Independently, both Data Mining and Object-Relational Database Systems are still evolving, and this continues to pose problems in bringing them to a common framework. The present era is seeing sweeping changes at an unprecedented rate. In such a fast-paced world, technologies need to keep up to date with developments in related fields or they will be rendered obsolete.
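One way to picture mining logic integrated into the database is to register an analysis routine with the database engine so that it runs where the data lives and always sees the current rows. The Python sketch below is an analogy only: SQLite is not an Object-Relational Database System, and it is used here purely as a readily available stand-in. The `purchases` table, the aggregate name `most_frequent` and the class `MostFrequent` are hypothetical. The standard `sqlite3` call `Connection.create_aggregate` attaches the user-defined aggregate to the connection.

```python
import sqlite3
from collections import Counter


class MostFrequent:
    """User-defined aggregate: the most common value in a group.

    Ties are broken alphabetically so the result is deterministic.
    """

    def __init__(self):
        self.counts = Counter()

    def step(self, value):
        # Called once per row in the group.
        if value is not None:
            self.counts[value] += 1

    def finalize(self):
        # Called once per group to produce the aggregate result.
        if not self.counts:
            return None
        return min(self.counts, key=lambda v: (-self.counts[v], v))


conn = sqlite3.connect(":memory:")
conn.create_aggregate("most_frequent", 1, MostFrequent)
conn.execute("CREATE TABLE purchases (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        ("ann", "beer"), ("ann", "diapers"), ("ann", "beer"),
        ("bob", "milk"), ("bob", "bread"), ("bob", "milk"), ("bob", "milk"),
    ],
)

# The analysis runs inside the query, with no export step.
favourites = conn.execute(
    "SELECT customer, most_frequent(product) FROM purchases "
    "GROUP BY customer ORDER BY customer"
).fetchall()
```

Because the aggregate is evaluated by the query engine, a new purchase row is reflected in the next query without any separate extraction, which is the benefit the integration argument above points to.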

10. CONCLUSION:

Object-relational database systems are fast gaining popularity in the industry and are replacing the traditional relational databases. Data mining techniques are currently optimized for relational database systems and must evolve to work with object-relational database systems. Independently, both of these technologies are becoming prerequisites of doing business in the new economy, and their integration is the next logical step.

The concept of the database, its extension to data warehousing, and the mining of such a data warehouse by different mining techniques organize data in a more meaningful way. With integration, the aim is not only to draw reports but to extract vital and valuable information for managing processes effectively and to support knowledge-driven decision making. The system will open up doors for optimizing inter-linked processes to enhance the efficiency and effectiveness of working patterns. Further, the architecture will make it easy to extend its scope to wider areas touching administration, management, teaching, research, student counseling, content creation and delivery systems, self-learning virtual environments, etc.