



By: Abdul Aris 1)

The access log of a website's index page records the search engine keywords through which visitors arrive. This access log data gives concise information about the search engine keywords that lead into a website.
In this research, an application is designed that classifies search engine keywords with the Complete Link method, based on user activity in accessing the index page. The keywords are then processed and classified with the Complete Link method.
Complete link clustering is a method that can be used to categorize the keywords or content that lead into a website. Complete link belongs to the hierarchical clustering methods, so its output is keywords or content grouped into a hierarchy or tree view. Online store owners can use the results of this clustering analysis as a reference for product promotion segments, and website administrators can use them as a reference for improving content.

Keywords: search engine keyword, Complete Link method, clustering.

1) Program Studi S1-Teknik Informatika, Universitas Semarang

1. INTRODUCTION

1.1 Background
Searching for information through search engines is now a vital activity. The keywords a user types into a search engine reflect that user's interest in particular information. When a user visits a website, data about the visit are recorded, including the IP address, the protocol, and the site of origin. If the user arrives from a search engine, the search keywords are recorded in the access log as well. These access log keywords are very useful for developing the website because they represent user interest. While the number of keywords is small, the analysis can be done manually, but as the number grows, manual analysis takes a great deal of time and effort. An application is therefore needed that can group these interests so that the analysis becomes easier.
Complete link clustering is a method that can be used to group website keywords. Complete link belongs to the hierarchical clustering methods, so its output is keywords grouped into a hierarchy or tree view, which lets website administrators analyze interests and sub-interests. Online shop owners can use the result as a reference for product segmentation and promotion, and website administrators can use it as a reference for improving content so that visitors feel comfortable visiting the website. Based on this description, the author pursues this goal in this work, titled COMPLETE LINK METHOD FOR IMPROVED RANKING WEBSITE.

1.2 Problem Formulation
Based on the background above, the problem to be addressed is: "How can an application be built that groups search engine keywords using the Complete Link method?"

1.3 Objective
This study aims to build a search engine optimization application using the Complete Link method.

1.4 Complete linkage
In complete linkage (furthest neighbor), the distance between two clusters is determined by the greatest distance between two objects in different clusters. This method works quite well when the objects naturally form distinct groups (Liu, Bing).
The steps for forming clusters with the complete link method are as follows:
1. Start by assigning each document to its own cluster, so that N documents give N clusters.
2. Find the pair of clusters with the smallest distance (most similar) and merge them into a single cluster, so that the number of clusters is reduced by one.
3. Update the distance matrix by calculating the distance between the new cluster and each old cluster, using the largest distance.
4. Repeat steps two and three until all documents are in a single cluster.

For example, suppose the distance matrix for five objects is as follows:

\[
D = \{d_{ik}\} =
\begin{array}{c|ccccc}
  & a & b & c & d & e \\ \hline
a & 0 & 9 & 3 & 6 & 11 \\
b & 9 & 0 & 7 & 5 & 10 \\
c & 3 & 7 & 0 & 9 & 2 \\
d & 6 & 5 & 9 & 0 & 8 \\
e & 11 & 10 & 2 & 8 & 0
\end{array}
\]

From the distance matrix, \min(d_{ik}) = d_{ec} = 2, so documents e and c are merged to form the cluster {c, e}. The distance from this cluster to each of the documents a, b, and d is the largest distance:

\[ d_{(ce)a} = \max\{d_{ca}, d_{ea}\} = \max\{3, 11\} = 11 \]
\[ d_{(ce)b} = \max\{d_{cb}, d_{eb}\} = \max\{7, 10\} = 10 \]
\[ d_{(ce)d} = \max\{d_{cd}, d_{ed}\} = \max\{9, 8\} = 9 \]

Removing the rows and columns of D that correspond to documents e and c, and adding a row and column for the cluster {c, e}, gives the new distance matrix:

\[
\begin{array}{c|cccc}
        & \{c,e\} & a & b & d \\ \hline
\{c,e\} & 0 & 11 & 10 & 9 \\
a       & 11 & 0 & 9 & 6 \\
b       & 10 & 9 & 0 & 5 \\
d       & 9 & 6 & 5 & 0
\end{array}
\]

The smallest distance between pairs of clusters is now d_{bd} = 5, so cluster {b} and cluster {d} are merged into cluster {b, d}. Its distances to {c, e} and a are:

\[ d_{(bd)a} = \max\{d_{ba}, d_{da}\} = \max\{9, 6\} = 9 \]
\[ d_{(bd)(ce)} = \max\{d_{b(ce)}, d_{d(ce)}\} = \max\{10, 9\} = 10 \]

The new distance matrix is now:

\[
\begin{array}{c|ccc}
        & \{b,d\} & \{c,e\} & a \\ \hline
\{b,d\} & 0 & 10 & 9 \\
\{c,e\} & 10 & 0 & 11 \\
a       & 9 & 11 & 0
\end{array}
\]

The smallest distance between pairs of clusters is now d_{(bd)a} = 9, so cluster {b, d} and cluster {a} are merged into {a, b, d}. Two clusters remain, {c, e} and {a, b, d}, and the greatest distance between them is:

\[ d_{(ce)(abd)} = \max\{d_{(ce)a}, d_{(ce)(bd)}\} = \max\{11, 10\} = 11 \]

The final distance matrix is:

\[
\begin{array}{c|cc}
          & \{c,e\} & \{a,b,d\} \\ \hline
\{c,e\}   & 0 & 11 \\
\{a,b,d\} & 11 & 0
\end{array}
\]

The last step is to combine cluster {c, e} and cluster {a, b, d} to form a single cluster.

1.5 Tokenizing
Tokenizing is the process of cutting an input string into the words that compose it. In principle, it separates each word that makes up a document. In general, each word is separated from other words by a space character, so tokenizing relies on the space characters in the document to separate words. After tokenizing, a sentence becomes an array in which each cell holds one word from the sentence. The tokenizing process usually also records the number of occurrences of each word in the sentence.

1.6 Filtering
Filtering is the process of keeping only the words that are considered important or meaningful. Words considered to carry no meaning, such as conjunctions, are removed. This process typically uses a stopword list (stop list) stored in a database table, which serves as the reference for removing words. For example, words such as 'in', 'is', and 'a' are removed by the filtering process.

1.7 Stemming
Stemming is the process of finding the root of each word that results from the filtering process. Reducing words to their root, or base word, shrinks the index without losing meaning. There are two approaches to stemming: the dictionary approach and the rule-based approach. Several studies of Indonesian stemmers have been conducted using either the dictionary approach or the purely rule-based approach. Rule-based algorithms were introduced by Vega (2001) and Tala (2003), each with a different algorithm for stemming Indonesian-language documents.
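The worked example in Section 1.4 can be checked with a short script. The following is a minimal Python sketch, not the paper's implementation (the application itself is built with PHP and MySQL); the function name `complete_link` and the variables `LABELS` and `D` are hypothetical names chosen for illustration. It merges the closest pair of clusters at each step and measures the distance between two clusters as the largest distance between their members.

```python
from itertools import combinations

# Distance matrix from the worked example (objects a, b, c, d, e).
LABELS = ["a", "b", "c", "d", "e"]
D = [
    [0, 9, 3, 6, 11],
    [9, 0, 7, 5, 10],
    [3, 7, 0, 9, 2],
    [6, 5, 9, 0, 8],
    [11, 10, 2, 8, 0],
]


def complete_link(labels, dist):
    """Agglomerative complete-link clustering (illustrative sketch).

    Returns the merges in order as (cluster_x, cluster_y, distance).
    """
    # Step 1: every object starts in its own cluster.
    clusters = [frozenset([i]) for i in range(len(labels))]

    def cluster_distance(x, y):
        # Complete link: the largest distance between members.
        return max(dist[i][j] for i in x for j in y)

    merges = []
    while len(clusters) > 1:
        # Step 2: find the closest pair of clusters and merge it.
        x, y = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((sorted(labels[i] for i in x),
                       sorted(labels[i] for i in y),
                       cluster_distance(x, y)))
        # Step 3: distances to the new cluster are recomputed on demand.
        clusters = [c for c in clusters if c not in (x, y)] + [x | y]
    return merges


for x, y, height in complete_link(LABELS, D):
    print(x, "+", y, "at distance", height)
```

Run on the example matrix, the merges happen at distances 2, 5, 9, and 11, in the order {c, e}, {b, d}, {a, b, d}, and finally all five objects. Recomputing the maximum over members gives the same values as updating the matrix row by row, at the cost of extra work on large inputs.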

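The three preprocessing stages described in Sections 1.5 to 1.7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the stopword set is the three words named in the filtering section rather than a database table, and `naive_stem` is a hypothetical suffix stripper for English, not the Vega or Tala algorithm for Indonesian. All function names are chosen for this example.

```python
from collections import Counter

# Assumption: a tiny in-memory stopword list; the paper keeps its list
# in a database table.
STOPWORDS = {"in", "is", "a"}


def tokenize(text):
    """Split the input string into lowercase words on whitespace."""
    return text.lower().split()


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Keep only the words that are not in the stopword list."""
    return [t for t in tokens if t not in stopwords]


def naive_stem(token, suffixes=("ing", "ed")):
    """Strip one known suffix if at least three letters remain.

    A toy rule-based stemmer for illustration only.
    """
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, filter, and stem, then count each resulting word."""
    tokens = remove_stopwords(tokenize(text))
    return Counter(naive_stem(t) for t in tokens)


counts = preprocess("Learning is a process in learning")
print(counts)
```

For the sample sentence, the stopwords 'is', 'a', and 'in' are dropped, both occurrences of "learning" are reduced to "learn", and the counter records "learn" twice and "process" once. A dictionary-based stemmer would replace `naive_stem` without changing the rest of the pipeline.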
In English, for example, stemming changes the word "learning" to its base word "learn", and "using" is returned to the base word "use". A word that is already a base word is not changed.

1.8 Planning System

[Flowchart: Start -> Data collection: download the access log from the server, extract the search keyphrases from the access log -> Search keyphrase data cleaning: tokenization, stopword removal, stemming -> Cluster formation with the Complete Link algorithm -> Clustering results]

Planning is an effort to determine the activities to be carried out in the future by drafting and outlining goals, so that all activities are focused and efficient (Mulyanto, A, 2009).

1.8.1 Analysis System Design

1.8.2 Description of System
The search engine optimization application using the complete link method is developed to support the segmentation and promotion of the products that an online store brings to market. Because keywords or content are grouped into a hierarchy or tree view, the application supports decision making in promoting the store's products. It also helps a website administrator analyze content or keywords optimally. The criterion used is the highest number of keywords entered in the access logs. The new system has the following requirements:
a. Setting up a website.
b. Downloading the access log data.

1.8.3 Hardware and Software Analysis
a. Hardware
The hardware specifications used in implementing the system/application are:
1. Intel Core i3 processor
2. 2 GB memory
3. 320 GB hard disk
4. LCD monitor
5. Keyboard and mouse
6. Printer
b. Software
The software used in making the program is as follows:
a. Windows 7 operating system
b. PHP
c. Apache web server
d. Notepad++
e. Browser
f. MySQL Database Server

1.8.4 System Design
The system is designed and developed using modeling diagrams in UML (Unified Modeling Language). The diagrams are as follows.

a. Use Case Diagram

[Use case diagram: the actor "user" with the use cases Tutorial, Pre-process, Process 1, Upload access log, Cluster Complete, and Clustering Results; two relations are marked <<extended>>]

Figure 3. Use Case Diagram

The use case diagram shows that the system is user friendly: the user can access the system menus.

b. Sequence Diagram Upload

[Sequence diagram between User, Upload Access Log, Choose File, and Upload:]
1: Select the Upload Access Log menu
2: The Upload menu is displayed
3: Click the Choose File menu
4: The file to be uploaded is displayed
5: Click the Upload menu
6: The file is uploaded

Figure 4. Sequence Diagram Upload
c. Sequence Diagram Pre-process

[Sequence diagram between User and Pre-process:]
1: Select the Pre-process menu
2: The successfully uploaded access log file is displayed

d. Sequence Diagram Process 1

e. Sequence Diagram Process 2

[Sequence diagram between User and Process 2:]
1: Select the Process 2 menu
2: The follow-up steps of Process 1 are displayed

f. Sequence Diagram Cluster Complete Link

[Sequence diagram between User and Cluster Complete Link:]
1: Select the Cluster Complete Link menu
2: The clustering process is displayed

g. Sequence Diagram Complete Link Cluster Results

h. Class Diagram

[Class diagram with the classes Keyphrase (+Frase, +PK_id, +Jumlah, +Status, +Clastered, +Pre-proses(), +Baca_acceslog()), Similaritas, Indek, Stopword (+PK-id, +Kata), and Step (+PK_id, +m); the attributes shown for Similaritas and Indek are +Pk_id, +r, +PK_kata, +s, +Jumlah, +drs, +m]

Figure 5. Class Diagram

i. Activity Diagram Home

[Activity diagram: Select the Home menu -> Home menu]

j. Activity Diagram Tutorial

[Activity diagram: Select the Tutorial menu]

k. Activity Upload Access Log

[Activity diagram: Select the Upload menu -> Choose file -> Upload file -> The file is shown as successfully uploaded]

l. Activity Pre-process

[Activity diagram: Select Pre-process]

m. Activity Process 1

[Activity diagram: Select Process 1 -> Process 1]

n. Activity Process 2

[Activity diagram: Select Process 2 -> Process 2]

o. Activity Cluster Complete Link

[Activity diagram: Select Cluster Complete Link -> Cluster Complete Link]

p. Activity Complete Link Cluster Results

[Activity diagram: Select Cluster Results -> Complete Link cluster results]

1.8.5 Designing Database
A database is an information system that integrates a collection of interrelated data and makes it available to multiple applications. The database design used by the application for data processing is described as follows:
*  : Primary Key
** : Foreign Key

a. Keyphrase Table
Function  : input data, store data
File Name : TBKeyphare

Column Name   Data Type       Allow Nulls
Frase         Varchar (255)
Jumlah        Tinyint (4)
Status        Tinyint (4)
Id            Int (11)

With the system design analyzed in detail, the next step is the implementation phase. The purpose of the implementation phase is to explain the manual module to all users who will use the system, so that users can respond to what the system displays and give input to the system developer for improving the system.

1.9.1 Testing Methods of Implementation
System testing is a key stage that aims to discover faults or deficiencies in the software being tested. The test determines whether the software meets the criteria set in accordance with the purpose of the design.

1.10 Program Implementation
a. Form Home
The Home menu is used to view the application's system menus.

b. Form Menu Pre-process
After the Home menu is entered successfully, the Pre-process menu appears immediately.

c. Form Menu Process 1
This form is used to process the file that was successfully uploaded.

d. Form Menu Process 2
This form is used to continue Process 1.

e. Form Menu Cluster Complete Link
This form is used to perform complete link clustering.

f. Form Menu Complete Link Cluster Results
This form is used to display the results of complete link clustering.

Based on the case study that has been carried out, the author draws the following conclusions:
a) The Search Engine Optimization application built with the Complete Link method can be used as a means of improving product segmentation and promotion for an online store.
b) The Search Engine Optimization application built with the Complete Link method can increase the number of website visitors, thereby increasing the number of product purchases at the online store.
c) The Search Engine Optimization application using the Complete Link method makes it easier to manage a product or item that has not yet been reached by search engines.
d) The Search Engine Optimization application using the Complete Link method can optimize an online store's website for search engines.
e) The Complete Link application for improved website ranking can raise a website's rank toward the first page of search engine results.
REFERENCES
Abdurrahman, Bambang Riyanto, T., Rila Mandala, Rajesri Govindaraju, 2009, Klasifikasi Pengguna WEB Dalam Usage Mining Untuk Business Intelegence Dengan Algoritma Ant colony optimization, Jurnal Penelitian dan Pengembangan TELEKOMUNIKASI, Vol. 14, No. 1, June.

Bernard Renaldy Suteja, Ahmad Ashari, 2008, Ontology e-Learning Content berbasis Web Semantic, Seminar Nasional Aplikasi Teknologi Informasi, Yogyakarta.

Baeza, R., and Ribeiro, B., 1999, Modern Information Retrieval, ACM Press, New York, USA.

Eirinaki, M., and Vazirgiannis, M., 2003, Web Mining for Web Personalization, Journal of ACM Transaction on Internet Technology (TOIT), 1-27.

Kadir, Abdul, 2002, Pemrograman Web mencakup HTML, CSS, Javascript dan PHP, Penerbit Andi, Yogyakarta.

Kadir, Abdul, 2008, Dasar Perancangan dan Implementasi Database Relasional, Penerbit Andi, Yogyakarta.

Fausett, L., 1994, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice-Hall Inc., USA.

Liu, Bing, 2010, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data

Liu, B., 2007, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Department of Computer Science, University of Illinois, Chicago, IL.

Nugroho, Bunafit, 2010, Latihan Membangun Aplikasi WEB PHP dan MYSQL dengan Dreamweaver MX, Gava Media.

Otsuka, S., and Kitsuregawa, M., 2006, Clustering of Search Engine Keywords Using Access Logs, Proceedings of the 17th International Conference on Database and Expert System Applications, Krakow, September 4-8, 842-854.

Pudjo, Prabowo, 2011, Menggunakan UML, Informatika, Bandung.

Prasetio, Adhi, 2012, Buku Pintar Pemrograman WEB, Mediakita, Jakarta.

Mulyanto, Agus, 2009, Sistem Informasi Konsep dan Aplikasi, Jakarta.

Gonzalez, R.C., Woods, R.E., 1992, Digital Image Processing, Addison-Wesley Publishing Company, USA.

Salton, G., 1989, Automatic Text Processing,

Tala, Z., 2003, A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia, Institute for Logic, Language and Computation, Universiteit van Amsterdam, The Netherlands.

Tyagi, N.K., Solanki, A.K., and Wadhwa, M., 2010, Analysis of Server Log by Web Usage Mining for Website Improvement, IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 4, No. 8, July 2010.

Vijayalakshmi, S., Mohan, V., and Raja, S.S., 2009, Mining Constraint-Based Multidimensional Frequent Sequential Pattern in Web Logs, European Journal of Scientific Research 36(3), 480-490.

Xie, Y., and Phoha, V.V., 2001, Web User Clustering from Access Log Using Belief Function, Proceedings of the 1st International Conference on Knowledge Capture, vol. 1, British Columbia, October