• The term “data mining” appeared around 1990 in the database community.
• Gregory Piatetsky-Shapiro coined the term “Knowledge Discovery in Databases”
for the first workshop on the topic (KDD-1989), and the term became more
popular in the AI and machine learning community.
• Currently, Data Mining and KDD are used interchangeably.
• The terms “Predictive Analytics” (since about 2007) and “Data Science”
(since 2011) have also been used to describe this field.
(Source: Coenen, 2011)
ORIGIN OF DATA MINING
• Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
• Traditional techniques may be unsuitable due to data that is
o Large-scale
o High dimensional
o Heterogeneous
o Complex
o Distributed
• A key component of the emerging field of data science and data-driven discovery
[Figure: Data Mining drawing on Statistics; AI, Machine Learning, and Pattern Recognition; and Database systems]
THE EVOLUTION OF DATA MINING
Evolutionary Step | Enabling Technologies | Business Question | Characteristics
Data Collection (1960s) | Computers, tapes, disks | "What was my total revenue in the last five years?" | Retrospective, static data delivery
Data Access (1980s) | RDBMS, SQL, ODBC | "What were unit sales in New England last March?" | Retrospective, dynamic data delivery at record level
Data Warehousing (1990s) | OLAP, multidimensional databases, data warehouses | "What were unit sales in New England last March? Drill down to Boston." | Retrospective, dynamic data delivery at multiple levels
Source: www.thearling.com
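The difference between record-level delivery (Data Access) and multi-level delivery (Data Warehousing) can be illustrated with a small sketch. The records, field names, and the helper functions `unit_sales` and `drill_down` are hypothetical and invented for illustration; they are not taken from the source table.

```python
from collections import defaultdict

# Hypothetical sales records (made-up values, for illustration only).
sales = [
    {"region": "New England", "city": "Boston", "month": "March", "units": 120},
    {"region": "New England", "city": "Hartford", "month": "March", "units": 80},
    {"region": "New England", "city": "Boston", "month": "April", "units": 95},
    {"region": "Mid-Atlantic", "city": "New York", "month": "March", "units": 200},
]

def unit_sales(records, **filters):
    """Record-level access: sum units over records matching every filter."""
    return sum(r["units"] for r in records
               if all(r[k] == v for k, v in filters.items()))

def drill_down(records, level, **filters):
    """Multi-level access: break the same total down by a finer attribute."""
    totals = defaultdict(int)
    for r in records:
        if all(r[k] == v for k, v in filters.items()):
            totals[r[level]] += r["units"]
    return dict(totals)

# "What were unit sales in New England last March?"
region_total = unit_sales(sales, region="New England", month="March")
# "Drill down to Boston": the same question, one level finer.
by_city = drill_down(sales, "city", region="New England", month="March")
print(region_total)  # 200
print(by_city)       # {'Boston': 120, 'Hartford': 80}
```

Both questions are retrospective: they report what already happened and do not predict anything.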
MOTIVATION OF DATA MINING
Growth of data in both commercial and scientific databases
due to advances in data generation and collection technologies
• Commercial Viewpoint
o Lots of data is being collected and warehoused (e-commerce: Amazon, Shopee, Lazada)
o Computers have become cheaper and more powerful
• Scientific Viewpoint
o Data collected and stored at enormous speeds
o Data mining helps scientists in automated analysis of massive datasets
https://www.ncdc.noaa.gov/sotc/global/202003
KNOWLEDGE DISCOVERY (KDD) PROCESS
• This is a view from typical database systems and data warehousing communities
• Data mining plays an essential role in the knowledge discovery process
[Figure: KDD process stages, in order: Databases; Data Cleaning and Data Integration; Data Warehouse; Selection; Task-relevant Data; Data Mining; Pattern Evaluation]
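The stages named on this slide (cleaning, integration, selection, data mining, pattern evaluation) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a definitive implementation: the transaction data, the function names, the choice of item-pair counting as the mining step, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records from two sources (made-up data for illustration).
store_sales = [
    {"txn": 1, "items": ["bread", "milk"]},
    {"txn": 2, "items": ["bread", "butter", "milk"]},
    {"txn": 3, "items": None},  # dirty record: missing basket
]
online_sales = [
    {"txn": 4, "items": ["milk", "butter"]},
    {"txn": 5, "items": ["bread", "milk"]},
]

def clean(records):
    """Data cleaning: drop records whose basket is missing or empty."""
    return [r for r in records if r["items"]]

def integrate(*sources):
    """Data integration: combine cleaned sources into one collection."""
    return [r for source in sources for r in source]

def select(records):
    """Selection: keep only the task-relevant field (the item basket)."""
    return [sorted(set(r["items"])) for r in records]

def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts

def evaluate(pair_counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {pair: c / n_baskets for pair, c in pair_counts.items()
            if c / n_baskets >= min_support}

warehouse = integrate(clean(store_sales), clean(online_sales))
baskets = select(warehouse)
patterns = evaluate(mine(baskets), len(baskets))
print(patterns)  # {('bread', 'milk'): 0.75, ('butter', 'milk'): 0.5}
```

In this sketch, data mining is a single function call. The stages before it prepare the data, and the stage after it decides which patterns are worth keeping.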
DATA MINING: ONE STEP OF KDD
[Figure: data mining, divided into tasks and techniques]
CLASSIFICATION OF DATA MINING SYSTEMS
Other Applications
4. Text mining (news group, email, documents) and Web mining
5. Stream data mining
6. DNA and bio-data analysis
MARKET ANALYSIS & MANAGEMENT
1. Tan, Steinbach, Karpatne, Kumar, Lecture Notes, Chapter 1, Introduction to Data Mining, 2nd Edition, 2018.
2. Pang-Ning Tan, Michael Steinbach & Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
3. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann, 2012.
4. Frans Coenen, Data mining: past, present and future, Knowledge Engineering Review, 26(1), 25-29, 2011.
5. Gregory Piatetsky-Shapiro, Data Science: Past, Present, and Future, KDnuggets, 2016.
THANK YOU
Shuzlina Abdul Rahman | Sofianita Mutalib | Siti Nur Kamaliah Kamarudin | Farah Syazwani Mohd Rashid