
DATA MINING

INTRODUCTION


2 06-2023
INTRODUCTION

Is the person using card 912344567 really the cardholder or an impostor?


Mr A:
Refund: No
Marital Status: Married
Taxable Income: 130

Evade?


10/2023: by what percentage will the price of Amazon.com Inc (AMZN) change?



➢ Data: can be regarded as sequences of bits, numbers, characters, and so on, collected daily in the course of work.
➢ Information: a collection of data that has been processed and is used to describe or explain the characteristics of some object.
➢ Knowledge: a collection of interrelated pieces of information that has been rigorously reasoned or empirically verified over many generations → it expresses human thinking about a problem.
Knowledge Discovery in Databases (KDD)


Data Mining Process

➢ The data mining process is a pipeline containing many phases such as data cleaning, feature extraction, and algorithmic design.
➢ The workflow of a typical data mining application
contains the following phases:

Data collection

Feature extraction and data cleaning

Analytical processing and algorithms
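
The three phases above can be sketched as three functions chained into one pipeline. This is only a minimal illustration: the raw records, the field names, and the function names `collect`, `extract_and_clean`, and `analyze` are all hypothetical, and the "analysis" is a simple average standing in for a real mining algorithm.

```python
from statistics import mean


def collect():
    """Data collection: return raw records (hard-coded, hypothetical data)."""
    return [
        "alice,34,120.5",
        "bob,,80.0",              # missing age
        "carol,29,not_a_number",  # corrupt amount
        "dave,41,200.0",
    ]


def extract_and_clean(raw_lines):
    """Feature extraction and data cleaning: parse fields, drop unusable rows."""
    rows = []
    for line in raw_lines:
        name, age, amount = line.split(",")
        try:
            rows.append({"name": name, "age": int(age), "amount": float(amount)})
        except ValueError:
            continue  # skip rows with missing or malformed values
    return rows


def analyze(rows):
    """Analytical processing: a trivial stand-in for a mining algorithm."""
    return {"n_rows": len(rows), "mean_amount": mean(r["amount"] for r in rows)}


result = analyze(extract_and_clean(collect()))
print(result)
```

Keeping each phase as its own function makes it easy to replace one stage (for example, a different cleaning rule) without touching the others.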

Data preprocessing


➢ Data Cleaning: real-world data tend to be incomplete, noisy, and inconsistent. Data cleaning (or data cleansing) routines attempt to fill in missing values, smooth out noise while identifying outliers, and correct inconsistencies in the data.
➢ Data Integration: data mining often requires data integration, the merging of data from multiple data stores. Careful integration can help reduce and avoid redundancies and inconsistencies in the resulting data set. This can help improve the accuracy and speed of the subsequent data mining process.
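
As a rough illustration of cleaning and integration, the sketch below fills a missing value with the median, flags outliers using the median absolute deviation (one common heuristic among several), and merges two record stores on a shared key while normalizing an inconsistent spelling. The two stores, their field names, and the helper names are hypothetical and chosen only for this example.

```python
from statistics import median

# Hypothetical records from two data stores, keyed by customer id.
store_a = {
    1: {"income": 125.0},
    2: {"income": None},      # missing value
    3: {"income": 70.0},
    4: {"income": 9000.0},    # suspiciously large
    5: {"income": 95.0},
}
store_b = {
    1: {"status": "Single"},
    2: {"status": "Married"},
    3: {"status": "Single"},
    5: {"status": "married"},  # inconsistent capitalization
}


def fill_missing(records, field):
    """Cleaning: replace missing values with the median of the observed ones."""
    observed = [r[field] for r in records.values() if r[field] is not None]
    fill = median(observed)
    return {key: {**r, field: fill if r[field] is None else r[field]}
            for key, r in records.items()}


def find_outliers(records, field, threshold=3.0):
    """Cleaning: flag values far from the median, measured in MAD units."""
    values = [r[field] for r in records.values()]
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return [key for key, r in records.items()
            if abs(r[field] - center) > threshold * mad]


def integrate(left, right):
    """Integration: merge records present in both stores, normalizing status."""
    merged = {}
    for key in left.keys() & right.keys():
        row = {**left[key], **right[key]}
        row["status"] = row["status"].capitalize()
        merged[key] = row
    return merged


cleaned = fill_missing(store_a, "income")
outliers = find_outliers(cleaned, "income")
merged = integrate(cleaned, store_b)
print(outliers, sorted(merged))
```

The merge here keeps only keys found in both stores; whether to keep unmatched records instead is a design choice that depends on the application.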
Data preprocessing

➢ Data Transformation: the data are transformed or consolidated into forms appropriate for mining.
➢ Data Reduction: data reduction techniques can be applied to obtain a reduced representation of the data set that is much smaller in volume, yet closely maintains the integrity of the original data. That is, mining on the reduced data set should be more efficient yet produce the same (or almost the same) analytical results.
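
A small sketch may make these two steps concrete. It uses min-max normalization as one possible transformation and simple random sampling as one possible reduction technique; the slides do not name specific techniques, so both choices, along with the sample data and function names, are assumptions made for illustration.

```python
import random
from statistics import mean


def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Transformation: rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]  # constant column: nothing to spread
    span = hi - lo
    return [new_min + (v - lo) / span * (new_max - new_min) for v in values]


def sample_rows(rows, n, seed=0):
    """Reduction: simple random sample without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Hypothetical income column, rescaled to the range [0, 1].
incomes = [70.0, 95.0, 110.0, 125.0, 220.0]
scaled = min_max_scale(incomes)

# Hypothetical data set of 10,000 values reduced to a 500-value sample.
data = list(range(10_000))
sample = sample_rows(data, 500)

print(scaled)
print(mean(data), mean(sample))  # the two means should be close
```

The sample is 20 times smaller than the original, yet a summary statistic such as the mean should stay close to the full-data value, which is the sense in which a reduced data set is expected to give almost the same analytical results.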

Data mining Methods

APPLICATION
