Customer Relationship Management
Data Mining and Knowledge Discovery
Data mining and knowledge discovery in databases (KDD) are frequently treated as synonyms, but data
mining is actually one part of the knowledge discovery process.
Query tools vs. data mining
Query tools:
• Query tools can be used to easily build and input queries to databases.
• Query tools make it very easy to build queries without even having to learn a database-specific
query language.
• With query tools, users need to know exactly what they are looking for.
Data mining:
• Data mining is a technique or a concept in computer science.
• It deals with extracting useful and previously unknown information from raw data.
• Data mining is used mostly when the user has only a vague idea about what they are looking for.
• Data miners can use the existing functionality of query tools to pre-process raw data before the
data mining process.
Kinds of data that can be mined
1. Flat Files
2. Relational Databases
3. Data Warehouse
4. Transactional Databases
5. Multimedia Databases
6. Spatial Databases
7. Time Series Databases
8. World Wide Web (WWW)
9. Medical and personal data
10. Satellite sensing
11. Games
12. Text reports, memos, e-mail messages, chats, etc.
1. Flat Files
• Flat files are data files in text or binary form with a structure that data mining algorithms can
easily extract.
• Data stored in flat files has no relationships or paths among its records.
• Flat files are described by a data dictionary. Example: a CSV file.
• Application: Storing data in data warehousing, carrying data to and from a server, etc.
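As a minimal sketch of mining a flat file, the snippet below parses CSV text with Python's standard `csv` module and aggregates one column. The column names (`customer_id`, `city`, `amount`) and the sample rows are made up for illustration; they do not come from the slides.

```python
import csv
import io

# Hypothetical flat-file contents; in practice this would be a .csv file on disk.
CSV_TEXT = """customer_id,city,amount
1,Pune,250
2,Delhi,100
3,Pune,400
"""

def load_records(text):
    """Parse CSV text into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader]

def total_by_city(records):
    """Sum the 'amount' column per city; flat-file values arrive as text."""
    totals = {}
    for row in records:
        totals[row["city"]] = totals.get(row["city"], 0) + int(row["amount"])
    return totals

records = load_records(CSV_TEXT)
totals = total_by_city(records)
print(totals)
```

Note that the rows carry no links to one another: any grouping has to be computed by the program that reads the file.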
2. Relational Databases
• A relational database is a collection of data organized in tables with rows and columns.
• The physical schema of a relational database defines how the tables are stored (files, indexes,
and other storage structures).
• The logical schema of a relational database defines the tables, their columns, and the
relationships among tables.
• The standard query language for relational databases is SQL.
• Application: Data mining, the Relational Online Analytical Processing (ROLAP) model, etc.
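The following sketch shows tables, a relationship between them, and an SQL query, using Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are hypothetical examples, not part of the original material.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),  -- link between tables
    amount REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 400.0), (3, 2, 100.0);
""")

# Join the two tables through the customer_id relationship and aggregate.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)
```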
3. Data Warehouse
• A data warehouse is a collection of data integrated from multiple sources that supports
queries and decision making.
• Two approaches can be used to update data in a data warehouse: the query-driven approach
and the update-driven approach.
• Application: Business decision making, etc.
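The slides name the two approaches without defining them. The sketch below assumes their usual meaning: a query-driven system consults every source at query time, while an update-driven system integrates the sources in advance and answers from that copy. The two source lists and all function names are hypothetical.

```python
# Two hypothetical source systems, each a list of sales records.
STORE_SALES = [{"product": "pen", "amount": 20}, {"product": "book", "amount": 150}]
WEB_SALES = [{"product": "pen", "amount": 35}]
SOURCES = [STORE_SALES, WEB_SALES]

def query_driven_total(product):
    """Query-driven: visit every source at query time and combine the answers."""
    return sum(r["amount"] for src in SOURCES for r in src if r["product"] == product)

def build_warehouse(sources):
    """Update-driven: integrate the sources in advance into one summarized store."""
    warehouse = {}
    for src in sources:
        for r in src:
            warehouse[r["product"]] = warehouse.get(r["product"], 0) + r["amount"]
    return warehouse

def update_driven_total(warehouse, product):
    """Answer from the pre-integrated copy without touching the sources."""
    return warehouse.get(product, 0)

warehouse = build_warehouse(SOURCES)
print(query_driven_total("pen"), update_driven_total(warehouse, "pen"))
```

Both calls return the same total here; the difference is when the integration work happens and whether the answer can go stale until the warehouse is rebuilt.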
4. Transactional Databases
• A transactional database is a collection of data organized by time stamps, dates, etc. to represent
transactions.
• This type of database can roll back or undo its operations when a transaction is not completed or
committed.
• It is a highly flexible system in which users can modify information without changing any
sensitive information.
• Application: Banking, distributed systems, object databases, etc.
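The rollback behaviour can be illustrated with Python's `sqlite3` module. In this sketch, a hypothetical `accounts` table carries a constraint that balances may not go negative; when the second update of a transfer fails, the first one is undone as well. The table, the account names, and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; undo both updates if either one fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint: roll back the credit too.
        conn.rollback()
        return False

ok = transfer(conn, "A", "B", 500)  # A holds only 100, so this must fail
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```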
5. Multimedia Databases
• Multimedia databases consist of audio, video, image, and text media.
• They can be stored in object-oriented databases.
• They are used to store complex information in pre-specified formats.
• Application: Digital libraries, video on demand, news on demand, music databases, etc.
6. Spatial Databases
• Spatial databases store geographical information.
• They store data in the form of coordinates, topology, lines, polygons, etc.
• Application: Maps, global positioning, etc.
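To make "coordinates and polygons" concrete, here is a small sketch in plain Python that stores a polygon as a list of (x, y) vertices and answers two typical spatial questions: its area (shoelace formula) and whether a point lies inside it (ray casting). The `PARK` polygon is a made-up example, and real spatial databases provide such operations through their own indexed functions.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def contains(vertices, point):
    """Ray-casting test; points exactly on the boundary are not treated specially."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangular region given by its corner coordinates.
PARK = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(PARK), contains(PARK, (1, 1)), contains(PARK, (5, 1)))
```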
7. Time-series Databases
• Time-series databases contain data such as stock exchange data and user-logged activities.
• They handle arrays of numbers indexed by time, date, etc.
• They typically require real-time analysis.
• Examples: eXtremeDB, InfluxDB, etc.
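A time series can be pictured as a list of values indexed by date. The sketch below computes a simple moving average over such a list; the prices are invented sample data, and `moving_average` is an illustrative helper rather than the API of any particular time-series database.

```python
from datetime import date

# Hypothetical daily closing prices, indexed by date.
PRICES = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 2), 102.0),
    (date(2024, 1, 3), 101.0),
    (date(2024, 1, 4), 105.0),
]

def moving_average(series, window):
    """Return (date, mean of the last `window` values) for each full window."""
    result = []
    for i in range(window - 1, len(series)):
        values = [v for _, v in series[i - window + 1 : i + 1]]
        result.append((series[i][0], sum(values) / window))
    return result

averages = moving_average(PRICES, 2)
for day, avg in averages:
    print(day.isoformat(), avg)
```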
8. World Wide Web (WWW): The World Wide Web is a collection of documents and resources such as audio,
video, text, etc., which are identified by Uniform Resource Locators (URLs), linked by HTML pages, and
accessed through web browsers via the Internet.
• It is the most heterogeneous repository, as it collects data from multiple resources.
• It is dynamic in nature, as the volume of data is continuously increasing and changing.
• Application: Online shopping, job search, research, studying, etc.
9. Medical and personal data: From government censuses to personnel and customer files, very large collections
of information are continuously gathered about individuals and groups. Governments, companies, and
organizations such as hospitals are stockpiling very large quantities of personal data to help them manage
human resources, better understand a market, or simply assist clientele.
Applications: Hospitals, social media, etc.
10. Satellite sensing: Countless satellites surround the globe: some are geostationary above a
region, and some orbit around the Earth, but all send a non-stop stream of data to the surface.
NASA, which controls a large number of satellites, receives more data every second than all NASA
researchers and engineers can cope with.
Applications: Space institutions, etc.
11. Games: Our society collects a tremendous amount of data and statistics about games, players, and
athletes. From hockey scores, basketball passes, and car-racing laps to swimming times, boxers' pushes,
and chess positions, all of this data is stored. Commentators and journalists use this information for
reporting, while trainers and athletes want to exploit it to improve performance and better
understand opponents.
Applications: BCCI, etc.
12. Text reports and memos (e-mail messages): Most communications within and between companies,
research organizations, or even private individuals are based on reports and memos in textual form, often
exchanged by e-mail. These messages are regularly stored in digital form for future use and reference,
creating formidable digital libraries.
Applications: E-commerce, hospitals, libraries, etc.