UAS Data Mining - Agung Bayu Aji (1806152861)
Abstract: In recent years, many service industries have widely adopted information technology (IT) to enhance their operations and their communication with customers and other industries. One such industry is tourism. Online Travel Agents (OTAs) can support continuous improvement in tourism, based on technology and traveller behaviour, because they hold large volumes of transactional data. This paper uses transactional data for a single point, Surabaya city, on one day in April 2019. Applying the Knowledge Discovery in Databases (KDD) process to these data shows that there are points or locations that can be associated as routes with many transits. With the Amoeba algorithm, we obtain the 5 best route combinations centered on Surabaya: JKT-SUB, SUB-JR, SUB-ML, JKT-SUB-JKT, and JKT-SUB-JR.

Keywords: Amoeba algorithm, Association rule mining, Data mining, Knowledge discovery in databases, Online travel agent, Transaction

I. INTRODUCTION
Nowadays, many service industries have widely adopted information technology (IT) to enhance their operations and their communication with customers and other industries. One example is the tourism industry [1], [2]. The World Tourism Organization defines tourists as people "traveling to and staying in places outside their usual environment for not more than one consecutive year for leisure, business and other purposes" [3]. Travellers who seek leisure or recreation away from the boredom of their usual environment therefore want to plan and prepare their trips easily, for example by searching for accommodation and destinations on a laptop or smartphone, devices that are capable and increasingly versatile. IT is used in five areas of the tourism industry: marketing, accommodation and booking systems, delivery of visitor experiences, customer relationships and follow-up, and the Digital Coach Program [4].

One type of tourism firm that uses IT for accommodation and booking systems is the Online Travel Agent (OTA). As technology has evolved, people have changed their behaviour from dealing face to face with conventional travel agents to accessing online travel agents on their smartphones. As a consequence, travellers must enter their personal data into the OTA system to search for and book their travel accommodation. OTAs therefore accumulate large amounts of data that capture the locations and accommodation plans of many travellers.

In Indonesia, there are many OTAs, such as Traveloka, Tiket.com, and Airy rooms. They already extract knowledge from their transactional data to support continuous improvement. The data used in this paper contain 48 transactions from only one day, one point (Surabaya city), and only 2 OTAs. Because Indonesia has many OTAs and cities, the number of transactions would increase accordingly, and it would also grow with the number of days.

Previous research has described many service innovations that OTAs have developed to deliver e-services based on their transactional databases, such as electronic payment schemes [5], mobile bus ticketing services [6], ticket home delivery and cash-on-delivery systems [7], and a card-based general ticket for public transport (bus and train) in Switzerland [8]. This paper aims to discover what knowledge can be obtained from one-day, one-point OTA transactional data using the KDD process.

II. LITERATURE REVIEW & METHODOLOGY
This paper uses several techniques from the Knowledge Discovery in Databases (KDD) process in data mining, as shown in the figure below:

Fig. 1. The Knowledge Discovery in Databases (KDD) Process [2]

A. Data Selection
First, we select the data relevant to the analysis task from the raw database. In this paper, we use raw data from Online Travel Agent transactions and analyze only domestic routes from/to Surabaya. We then perform a statistical analysis using Exploratory Data Analysis (EDA).

B. Data Preprocessing
Second, we clean the selected data, for example by replacing missing data and removing outliers, extreme values, noise, and inconsistent data.

C. Data Transformation
Third, we transform the cleaned data into a form suitable for mining. In this paper, we use Principal Component Analysis (PCA) to reduce the dimensionality of the selected data.

D. Data Mining
Next, once the data have been processed, we carry out the data mining step using a classification, clustering, or association rule algorithm.
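
As an illustration of the association rule option, the following minimal Python sketch counts support and confidence for pairs of locations that appear in the same itinerary. The itineraries are invented toy data (only the codes JKT, SUB, and ML are borrowed from the abstract), the function name and thresholds are assumptions, and the counting is plain support/confidence, not the Amoeba algorithm used in this paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy itineraries: each transaction is the set of location
# codes booked together. They are illustrative, not the paper's data.
transactions = [
    {"JKT", "SUB"},
    {"JKT", "SUB"},
    {"SUB", "ML"},
    {"JKT", "SUB", "ML"},
    {"SUB", "ML"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

On this toy input, only rules that point toward SUB pass both thresholds, which mirrors the idea of routes centered on one point.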
B. Data Preprocessing
Next, we clean and prepare the data before the data mining step. This step involves several activities, such as:
1) Replacing missing data: there are no missing data, because every record is generated automatically by the system and all fields are mandatory.
The highest count is for flight, with 28 transactions.
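
A missing-data check of this kind could be sketched in Python as follows. This is only a sketch under assumptions: apart from Product_Brand, the field names and all record values are hypothetical placeholders, and the function name is introduced here for illustration.

```python
def count_missing(records, fields):
    """Count, per field, the records whose value is None or blank."""
    missing = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return missing


# Hypothetical field names (only Product_Brand appears in the text)
# and hypothetical values.
fields = ["Product_Type", "Product_Brand", "Route"]
records = [
    {"Product_Type": "flight", "Product_Brand": "Brand A", "Route": "JKT-SUB"},
    {"Product_Type": "flight", "Product_Brand": "Brand B", "Route": "SUB-ML"},
]

# Complete records: every count is zero.
missing = count_missing(records, fields)
print(missing)

# A deliberately incomplete record shows what a failed check looks like.
incomplete = records + [
    {"Product_Type": "flight", "Product_Brand": "", "Route": None}
]
missing_incomplete = count_missing(incomplete, fields)
print(missing_incomplete)
```

When every field is mandatory and system-generated, as described above, all counts should be zero.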
h) Product_Brand: text data, with the histogram below.

Based on the Minitab output for the eigenvalue analysis, eigenvalues greater than 1 occur up to the 4th component. We therefore reduce the 14 data fields to only 4. The Principal Component Matrix for the data reduced to 4 fields is shown below.
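
The selection rule applied here, keeping the components whose eigenvalue is greater than 1 (often called the Kaiser criterion), can be sketched in Python. The eigenvalues below are illustrative placeholders chosen to sum to 14, as they would for a correlation matrix of 14 fields. They are not the Minitab output of this paper, and the function name is an assumption.

```python
def kaiser_select(eigenvalues):
    """Return how many components have an eigenvalue greater than 1 and
    the share of total variance those components explain."""
    ordered = sorted(eigenvalues, reverse=True)
    retained = [value for value in ordered if value > 1.0]
    explained = sum(retained) / sum(ordered)
    return len(retained), explained


# Illustrative eigenvalues for 14 standardized fields (NOT the paper's values).
eigenvalues = [3.9, 2.6, 1.8, 1.2, 0.9, 0.8, 0.7,
               0.6, 0.5, 0.4, 0.3, 0.15, 0.1, 0.05]

n_components, explained = kaiser_select(eigenvalues)
print(f"components kept: {n_components}")
print(f"variance explained: {explained:.1%}")
```

With these placeholder values, four components are kept, which matches the reduction from 14 fields to 4 described above. The explained-variance figure printed by the sketch applies to the placeholders only.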
REFERENCES