Analysis of Educational Web Pattern Using Adaptive Markov Chain For Next Page Access Prediction

Published by ijcsis
The Internet is growing at a remarkable rate as an information gateway and as a medium for business and the education industry. Universities that offer web-based education rely on web usage analysis to understand student behavior for web marketing. Web Usage Mining (WUM) integrates the techniques of two popular research fields: Data Mining and the Internet. Web usage mining attempts to discover useful knowledge from secondary data (web logs). These data patterns are used to analyze visitors' activities on web sites. Many servers manage cookies to distinguish server addresses. User navigation patterns are recorded in the form of web logs. These navigation patterns are refined, resized, and modeled in a new format; this method is known as “Loginizing”. In this paper we study the navigation patterns obtained from web usage and model them as a Markov chain. The chain works on the higher probability of usage. The Markov chain is modeled on the collection of navigation patterns and used to find the most likely navigation pattern for a web site.

More info:

Published by: ijcsis on Aug 13, 2011
Copyright:Attribution Non-commercial

Analysis of Educational Web Pattern Using Adaptive Markov Chain for Next Page Access Prediction
Harish Kumar, PhD scholar, Mewar University, Chittorgarh.
Dr. Anil Kumar Solanki, MIET Meerut.
ABSTRACT
The Internet is growing at a remarkable rate as an information gateway and as a medium for business and the education industry. Universities that offer web-based education rely on web usage analysis to understand student behavior for web marketing. Web Usage Mining (WUM) integrates the techniques of two popular research fields: Data Mining and the Internet. Web usage mining attempts to discover useful knowledge from secondary data (web logs). These data patterns are used to analyze visitors' activities on web sites. Many servers manage cookies to distinguish server addresses. User navigation patterns are recorded in the form of web logs. These navigation patterns are refined, resized, and modeled in a new format; this method is known as “Loginizing”. In this paper we study the navigation patterns obtained from web usage and model them as a Markov chain. The chain works on the higher probability of usage. The Markov chain is modeled on the collection of navigation patterns and used to find the most likely navigation pattern for a web site.
Keywords: Web mining, web usage, web logs, Markov Chain.
INTRODUCTION:
The IT revolution is the fastest-emerging revolution seen by the human race. With the Internet, online education, Web-based information, and the volume of clicks on web sites have reached huge proportions. The Internet and the common use of educational databases have created a huge need for KDD methodologies. The Internet is a vast source of data that can come either from Web content, represented by the billions of pages publicly available, or from Web usage, represented by the log information collected daily by servers around the world [1][2]. The information collected through data mining has allowed e-education applications to generate more revenue by making better use of the Internet, which helps students make more decisions. Knowledge Discovery and Data Mining (KDD) is an interdisciplinary area focusing on methodologies for mining useful information or knowledge from data [1]. Users leave navigation traces, which can be used as a basis for user behavior analysis. In the field of web applications, similar analyses have been successfully carried out with Web Usage Mining methods [2][3]. The challenge of extracting knowledge from data draws upon research in statistics, databases, pattern recognition, machine learning, data visualization, optimization, web user behavior, and high-performance computing to deliver advanced business intelligence and web discovery solutions [3][4]. It is a powerful technology with great potential to help various industries focus on the most important information in their data warehouses. Data mining can be viewed as a result of the natural evolution of information technology.

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 7, July 2011, http://sites.google.com/site/ijcsis/, ISSN 1947-5500

In Web usage analysis, the data are the sessions of site visitors: the activities performed by a user from the moment he enters the site until the moment he leaves it. Web usage mining consists of applying data mining techniques to analyze web users' activity. In educational contexts, it has been used for personalizing e-learning and adapting educational hypermedia, discovering potential browsing problems, automatically recognizing learner groups in exploratory learning environments, and predicting student performance. The discovered patterns are usually represented as collections of web pages, objects, or resources that are frequently accessed by groups of users with common needs or interests [10][11].

Users generally visit a web site sequentially: a user visits the home page first, then a second page, then a third, and so on until the work is finished. In doing so the user leaves navigation marks on the server. These navigation marks are called navigation patterns, and they can be used to decide the next likely web page request based on statistically significant correlations. If a sequence occurs very frequently, it indicates the most likely traversal pattern. Because such patterns occur sequentially, Markov chains have been used to represent the navigation patterns of a web site: in a Markov chain the present state depends on the previous state. If a web site contains many navigation patterns (“interesting patterns”), a high support threshold is assigned to it and less interesting patterns are ignored. Different levels of a web site therefore need different threshold values.

Important properties of the Markov chain:
1. The Markov chain is successful in sequence matching generation.
2. The Markov model depends on the previous state.
3. The Markov chain model is generative.
4. A Markov chain is a discrete-time stochastic process.

Due to the generative nature of the Markov chain, navigation tours can be derived automatically. Sarukkai proposed a technique in which a Markov model predicts the next page accessed by the user [4][2]. Pitkow and Deshpande, and Dongshan and Junyi, proposed various techniques for log mining using Markov models [5][2].
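The generative property can be illustrated with a minimal Python sketch that samples navigation tours from a first-order chain. The page names, the transition probabilities, and the function name generate_tour are hypothetical and chosen only for illustration; they do not come from the paper's data.

```python
import random

# Hypothetical first-order transition probabilities for a small site.
# "S" is an artificial start state and "T" an artificial terminal state.
TRANSITIONS = {
    "S": {"home": 1.0},
    "home": {"courses": 0.7, "contact": 0.3},
    "courses": {"syllabus": 0.6, "T": 0.4},
    "syllabus": {"T": 1.0},
    "contact": {"T": 1.0},
}


def generate_tour(transitions, rng, start="S", end="T"):
    """Sample one navigation tour by walking the chain from start to end."""
    tour = [start]
    while tour[-1] != end:
        options = transitions[tour[-1]]
        pages = list(options)
        weights = [options[page] for page in pages]
        # Draw the next page in proportion to its transition probability.
        tour.append(rng.choices(pages, weights=weights, k=1)[0])
    return tour


rng = random.Random(0)  # fixed seed so repeated runs give the same tours
tours = [generate_tour(TRANSITIONS, rng) for _ in range(5)]
for tour in tours:
    print(" -> ".join(tour))
```

Each sampled tour depends only on the current page at every step, which is the first-order Markov assumption described above.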
METHODOLOGY:
The Markov model is one of the easiest ways of representing navigation patterns and the navigation tree. Suppose we have an educational web site of a university. The navigation pattern sequences are:
1. A B C D E F
2. A C F
3. A C E
4. B C D

Navigation Pattern        Frequency of visit
S A B C D E F T           3
S A C F T                 2
S A C E T                 3
S B C D T                 2
Total number of website navigations: 10

Table 1: Navigation pattern table
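As a minimal sketch, the patterns and frequencies of Table 1 can be encoded and turned into weighted transition counts as follows. The names PATTERNS and count_transitions are illustrative choices, not part of the paper.

```python
from collections import Counter

# Table 1: navigation patterns with their visit frequencies.
# "S" and "T" are the artificial start and terminal states.
PATTERNS = {
    "S A B C D E F T": 3,
    "S A C F T": 2,
    "S A C E T": 3,
    "S B C D T": 2,
}


def count_transitions(patterns):
    """Map each (from_page, to_page) pair to its frequency-weighted count."""
    counts = Counter()
    for path, freq in patterns.items():
        pages = path.split()
        for src, dst in zip(pages, pages[1:]):
            counts[(src, dst)] += freq
    return counts


transitions = count_transitions(PATTERNS)
total_sessions = sum(PATTERNS.values())

print("sessions:", total_sessions)
for (src, dst), count in sorted(transitions.items()):
    print(f"{src} -> {dst}: {count}")
```

Weighting each pattern by its frequency avoids expanding the table into ten separate sessions.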
The probability of a transition is calculated as the ratio of the number of times the corresponding sequence of pages was traversed to the number of times the hyperlinked page was visited. For example, the transition from A to B was traversed 3 times and B was visited 5 times, giving 3/5. The page states are complemented by two other states: the start state (S) and the terminal state (T). The probability of a hyperlink is based on the content of the page being viewed. Navigation reaches T a total of 10 times. The navigation matrix is as follows:

      A     B     C     D     E     F     T
A     0     3/5   1/2   0     0     0     0
B     0     0     1/2   0     0     0     0
C     0     0     0     1     1/2   2/5   0
D     0     0     0     0     1/2   0     1/5
E     0     0     0     0     0     3/5   3/10
F     0     0     0     0     0     0     1/2
T     0     0     0     0     0     0     1

Table 2: Frequency of each node and their probability.
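The entries of Table 2 can be reproduced from the Table 1 frequencies with a short Python sketch, assuming each entry is the weighted count of the transition divided by the number of visits to the destination page. That reading matches every value in the table, but it is an inference from the numbers. The names table2_matrix and STATES are illustrative, and setting the T to T entry to 1 (an absorbing terminal state) is an assumption made to match the last row.

```python
from collections import Counter
from fractions import Fraction

# Table 1 data: navigation patterns and their visit frequencies.
PATTERNS = {
    "S A B C D E F T": 3,
    "S A C F T": 2,
    "S A C E T": 3,
    "S B C D T": 2,
}
STATES = ["A", "B", "C", "D", "E", "F", "T"]


def table2_matrix(patterns):
    """Return (matrix, visits); matrix[i][j] = count(i -> j) / visits(j)."""
    transitions = Counter()
    visits = Counter()
    for path, freq in patterns.items():
        pages = path.split()
        for page in pages:
            visits[page] += freq
        for src, dst in zip(pages, pages[1:]):
            transitions[(src, dst)] += freq
    matrix = {
        src: {dst: Fraction(transitions[(src, dst)], visits[dst]) for dst in STATES}
        for src in STATES
    }
    # Assumption: T is absorbing, as in the last row of Table 2.
    matrix["T"]["T"] = Fraction(1)
    return matrix, visits


matrix, visits = table2_matrix(PATTERNS)
for src in STATES:
    print(src, [str(matrix[src][dst]) for dst in STATES])
```

Exact fractions are used so that the printed values can be compared directly with the table.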
So the total probability of a visit to A is 8/39, to B is 5/39, to C is 10/39, to D is 5/39, to E is 6/39, and to F is 5/39. Here $NP_{ij}$ is the navigation probability matrix, where $NP_{ij}$ is the probability that the next state will be $j$. The navigation probability is defined such that $NP_{ij} \in [0, 1]$ and, summing over all $j$, $\sum_{j} NP_{ij} = 1$. The initial probability of a state is estimated from the number of times the page was requested by users, so every state has a positive probability.

The traditional Markov model has some limitations:
1. Low-order Markov models have good coverage but are less accurate because they use little history.
2. High-order Markov models suffer from high state-space complexity.

In a higher-order Markov model the number of states increases exponentially with the order of the model. This exponential growth enlarges the search space and the complexity. Higher-order Markov models also have a low-coverage problem. In the proposed model, each request together with its time duration is considered a state, and a session is a sequence of such states. The m-step Markov model assumes that the next request depends only on the last m requests. Hence, the probability of the next request is calculated by

$$P(r_{n+1} \mid r_n, \ldots, r_1) = P(r_{n+1} \mid r_n, \ldots, r_{n-m+1}),$$

where $r_i$ is the i-th request in a session, $i = 1, 2, \ldots, n$, $r_n$ is the current request, and $r_{n+1}$ is the next request. From this equation, if m = 1 (the 1-step model), the next request is determined only by the current request [5]. The matrix CM holds the conditional probabilities of previous occurrences. The state matrix CM is a square matrix. We therefore need to calculate the probability of each page and to design a model that is dynamic in nature, meaning that prediction is based on the next incoming and outgoing nodes. The Markov model construction starts with the first row of the table (the first navigation pattern).
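The m-step assumption can be sketched in Python as a simple frequency-based predictor over the Table 1 sessions. This is an illustration under stated assumptions, not the paper's adaptive model: it ignores the time duration that the proposed model attaches to each state, it normalizes by how often the m-page context was seen (the conventional estimate), and the names train and next_page_probabilities are hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Table 1 sessions as (page sequence, frequency), without the S and T markers.
SESSIONS = [
    (["A", "B", "C", "D", "E", "F"], 3),
    (["A", "C", "F"], 2),
    (["A", "C", "E"], 3),
    (["B", "C", "D"], 2),
]


def train(sessions, m):
    """Count which page follows each context of the last m requests."""
    model = defaultdict(Counter)
    for pages, freq in sessions:
        for i in range(m, len(pages)):
            context = tuple(pages[i - m:i])
            model[context][pages[i]] += freq
    return model


def next_page_probabilities(model, history, m):
    """Estimate P(next page | last m requests); empty dict if context unseen."""
    context = tuple(history[-m:])
    counts = model.get(context)
    if not counts:
        return {}
    total = sum(counts.values())
    return {page: Fraction(count, total) for page, count in counts.items()}


first_order = train(SESSIONS, m=1)
second_order = train(SESSIONS, m=2)
print(next_page_probabilities(first_order, ["A", "C"], m=1))
print(next_page_probabilities(second_order, ["A", "C"], m=2))
```

With m = 1 the prediction after page C mixes all sessions passing through C, while with m = 2 the context (A, C) excludes sessions that reached C from B. A longer context is more specific but is seen less often, which is the coverage trade-off described above.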