You are on page 1of 5

Enhanced Resource User Navigation through Pattern Based

Restructuring
ABSTRACT
Organizational websites have become a vital element
in informing its stakeholders and inducing transaction
decisions, especially for institutions dealing with a
large and dispersed user group. Achieving the
desired user (centricity to satisfy users needs are ever
more difficult when the organizational website
offerings are large and diverse. In such cases, website
design is done frequently in decentralized fashion,
with a common, centrally defined, web structure and
responsibilities for content distributed to individual
organizational units. We challenge the effectiveness
of this method, based on the empirical website
analysis for a large state university. Using a scenario
based approach; we analyzed the websites structure
(static effectiveness) as well as navigational
properties (dynamic effectiveness). Findings show
that when the website content and navigation are
created by a number of designers using a site (of
(sites distributed approach, significant differences
may be observed especially in dynamic effectiveness,
leading to overall different user perceptions of
effectiveness. Furthermore, differences in design
between units will lead to an overall inconsistent user
experience. Meanwhile, the results from static
effectiveness do not reveal any significance. This
observable difference indirectly reinforces the
importance of dynamic effectiveness.
Keyword: Website Evaluation, Navigational Pattern
Discovery, Pattern Improved With Connectivity.
1. INTRODUCTION
As online communication becomes a common form
of interaction, many organizations use their corporate
website for intra (interpersonal exchange with their
stakeholders. Yet designing web pages to fit
stakeholder demands is not without challenge. The
design must allow users to find needed information,
and to do so with reasonable ease, so as to facilitate
information requests and possible website
transactions. The design task increases in challenge,
as the organization and thus its website grows. It
further increases when the institution is highly
distributed. A frequent coping strategy in that case is
to provide a common web structure (e.g., through a
content management system) for all organizational
units, while giving them authority and responsibility

for the content. The result may be considered a site


(of (sites, where each organizational unit creates its
own website, however under a common website
structure. This strategy appears meaningful as it
creates consistency of structure, while allowing those
with best content knowledge to select the most
relevant content. This approach, while widely
practiced, creates dual concerns. First, content
designers may be so immersed in the content they
seek to present that the resulting web pages are not
user (centric, but designer (centric.
Second,
decentralized decision making at the content level,
together with different degrees of user (centricity
may lead to an inconsistent user experience.
As website design plays a critical role in
influencing visitors decision(making and revisit
intentions
(Klein,
1998), understanding what
contributes to effective website performance for users
is essential. Numerous prior studies have proposed
design criteria. For example, Nielsen (2000) claims
that navigability and website organization explain
performance. Floyd and Santiago (2007) mentions
the importance of an accessible webpage format that
is usable by all ages and abilities. Beyond these broad
areas, Liu and Arnett (2000) found that information
service quality, system use, interactivity and system
design affect website success. These effectiveness
characteristics focus predominantly on structural
effectiveness, that is the availability of the relevant
information within the web design and the possibility
to navigate to relevant web pages through the pre
(defined access structures (e.g., menus and links). Yet
as far as users are concerned, the theoretical ability to
reach desired pages is of little importance, if the
navigation is none (intuitive or takes so long that
users give up (e.g., because of loss of interest, or
because of doubt that a solution can be found).
According to Kendall and Kendall (2011), a 3(click
rule applies to measuring website usability. Peerce
(1993) argues that to determine effectiveness, a
website must undergo a process of evaluation that is
capable to provide suggestions for improvement.
Thus, in measuring effectiveness and user
perceptions of site quality, both a structural and
dynamic effectiveness must be determined, and thus
the evaluation must be both structural (e.g.,
evaluating the site map) and dynamic (evaluating the
performance in completing actual queries). In

addition, Carlos et al. (2008) explain that visual cues


such as layout, color, photos are essential to attain
positive user response in a web environment. On
similar grounds, Aladwani and Palvia (2002) suggest
that web quality includes content specificity, content
quality and appearance. This article reports on a
multi(dimensional evaluation of the website for a
large university which includes multiple layers of
sites from colleges, departments, and programmers.
In doing so, it seeks to answer several design related
questions, namely [1] how users perceive structural
vs. dynamic characteristics of an organizational site;
[2] the design choices made by site developers; [3]
the impact of decentralized content design decisions
on overall site consistency and user experience; [4]
and how evaluations of effectiveness compare to
assessments of website appearance.
2. WEBSITE EVALUATION
Peerce (1993) argues that to determine effectiveness,
a website must undergo a process of evaluation that is
capable to provide suggestions for improvement.
Typical evaluation approaches used focus groups,
questionnaires and scenario (setting to measure user
experience; in particular, scenario (based questions
have been a common mechanism to assess website
design as they help to conceptualize the activity
(Turner, 1998). The process of envisioning and
interpreting scenarios allows designers to unfold an
average users behavior patterns which then improve
design and evaluation decisions (Carroll, 1995;
Turner, 1998).
2.1. Pattern Based Web Site Restructuring Scheme
Web site restructuring is performed with links
and access log details. Apriority algorithm is used
for the usage pattern analysis. Page domains are used
for the reconstruction process. The system is divided
into five major modules. They are log optimizer,
site structure analysis, statistical model based
restructuring, pattern identification and pattern
based restructuring.
Log optimizer module is
designed to perform access log preprocessing
operations. Web page links are analyzed under the
site structure analysis module. Statistical model based
restructuring module is designed to rearrange the web
site links. The pattern identification module is
designed to fetch access patterns. Pattern based
restructuring module is designed to perform site
rearrangement using patterns.
2.2. Log Optimizer
The web page requests are maintained under the
access log files. Data populate process transfers the

log data into database. Redundant page requests are


removed from the log files. The page requests are
grouped into sessions.
2.3. Site Structure Analysis
Link information is collected for each page in the
web site. In degree is calculated with the in link count
for the web page. Out link count is used to estimate
the out degree value for the web page. In degree and
out degree values are used in the site restructuring
process.
2.4. Statistical Model Based Restructuring
Mini session identification is carried out with the
access log details. Back tracking algorithm is used to
estimate back tracking pages. Out degree threshold is
used to rearrange the links in a web page. Out degree
and request frequency for the pages are used in the
restructuring process.
2.5. Pattern Identification
Association rule mining method is used to identify
the user access patterns. Apriority algorithm is used
in the pattern extraction process. Support and
confidence values are estimated from the session
details. Minimum support and minimum confidence
values are used for the pattern identification process.
2.6. Pattern Based Restructuring
Web site restructuring is performed with usage
patterns. Dynamic out degree threshold identification
scheme is used to estimate the out link levels. Search
domains are used to identify the user interest levels.
Access patterns, out degree threshold and stay time
details are used in the restructuring process.
3. CONTENT COHERENT NAVIGATIONAL
PATTERN DISCOVERY
Navigational patterns are sets of web pages that are
frequently visited together and that have related
content. These patterns are used by the recommender
system to recommend web pages, if they were not
already visited. To discover these navigational
patterns, we simply group the missions we uncovered
from the web server logs into clusters of sub-sessions
having commonly visited pages. Each of the resulting
clusters could be viewed as a users navigation
pattern. Note that the patterns discovered from
missions possess two characteristics: usage cohesive
and content coherent. Usage cohesiveness means the
pages in a cluster tend to be visited together, while

content coherence means pages in a cluster tend to be


related to a topic or concept.
This is because missions are grouped according to
content information. Since each cluster is related to a
topic, and each page is represented in a keyword
vector, we are able to easily compute the topic vector
of each cluster, in which the value of a keyword is the
average of the corresponding values of all pages in
the cluster. The cluster topic is widely used in our
system, in both the off-line and on-line phases (see
below for details). The clustering algorithm we
adopted for grouping missions is Page Gat her [6].
This algorithm is a soft clustering approach allowing
overlap of clusters. Instead of attempting to partition
the entire space of items, it attempts to identify a
small number of high quality clusters based on the
clique clustering technique.
3.1. NAVIGATIONAL PATTERN IMPROVED
WITH CONNECTIVITY
The missions we extracted and clustered to generate
navigational patterns are primarily based on the
sessions from the web server logs. These sessions
exclusively represent web pages or resources that
were visited. It is conceivable that there are other
resources not yet visited, even though they are
relevant and could be interesting to have in the
recommendation list. Such resources could be, for
instance, newly added web pages or pages that have
links to them not evidently presented due to bad
design. Thus, these pages or resources are never
presented in the missions previously discovered.
Since the navigational patterns, represented by the
clusters of pages in the missions, are used by the
recommendation engine, we need to provide an
opportunity for these rarely visited or newly added
pages to be included in the clusters. Otherwise, they
would never be recommended. To alleviate this
problem, we expand our clusters to include the
connected neighborhood of every page in a mission
cluster. The neighborhood of a page p is the set of all
the pages directly linked from p and all the pages that
directly link to p. Figure 1 illustrates the concept of
neighborhood expansion. This approach of expanding
the neighborhood is performed as follows: we
consider each previously discovered navigational
pattern as a set of seeds. Each seed is supplemented
with pages it links to and pages from the web site that
link to it. The result is what is called a connectivity
graph which now represents our augmented
navigational pattern. This process of obtaining the
connectivity graph is similar to the process used by
the HITS algorithm [Kle99][5] to find the author it y
and hub pages for a given topic. The difference is that

we do not consider a given topic, but start from a


mission cluster as our set of seeds. We also consider
only internal links, i.e., links within the same web
site. After expanding the clusters representing the
navigational patterns, we also augment the keyword
vectors that label the clusters. The new keyword
vectors that represent the augmented navigational
patterns have also the terms extracted from the
content of augmented pages.

Fig.1: Navigational Patterns (NPs) and Augmented


Navigational Patterns (ANPs)
We take advantage of the built connectivity graph by
cluster to apply the HITS algorithm in order to
identify the author it y and hub pages within a given
cluster. These measures of author it y and hub allow
us to rank the pages within the cluster. This is
important because at real time during the
recommendation,
it
is
crucial
to
rank
recommendations,
especially
when
the
recommendation list is long. Author it y and hub are
mutually reinforcing [Kle99] [5] concepts. Indeed, a
good author it y is a page pointed to by many good
hub pages, and a good hub is a page that points to
many good author it y pages. Since we would like to
be able to recommend pages newly added to the site,
in our framework, we consider only the hub measure.
This is because a newly added page would be
unlikely to be a good authoritative page, since not
many pages are linked to it. However, a good new
page would probably link too many author it y pages;
it would, therefore, have the chance to be a good hub
page. Consequently, we use the hub value to rank the
candidate recommendation pages in the on-line
module.
4. RELATED WORK
The growth of the Internet has led to numerous
studies on improving user navigations with the
knowledge mined from web server logs and they can
be generally categorized in to web personalization
and
web
transformation
approaches.
Web
personalization is the process of tailoring web
pages to the needs of specific users using the
information of the users navigational behavior and
profile data. Perkowitz and Etzioni describe an
approach that automatically synthesizes index
pages which contain links to pages pertaining to

particular topics based


on the co-occurrence
frequency of pages in user traversals, to facilitate
user navigation. The methods proposed by Mobasher
et al. And Yan et al create clusters of users profiles
from weblogs and then dynamically generate links
for users who are classified into different categories
based on their access patterns. Nakagawa and
Mobasher develop a hybrid personalization
system that can dynamically switch between
recommendation models based on degree of
connectivity and the users position in the site. For
reviews on web personalization approaches, see [7].
Web transformation, on the other hand, involves
changing the structure of a website to facilitate the
navigation for a large set of users [8] instead of
personalizing pages for individual users. Fu et al.
describe an approach to reorganize web pages so as
to provide users with their desired information in
fewer clicks. However, this approach considers only
local structures in a website rather than the site as a
whole, so the new structure may not be necessarily
optimal. Gupta et al. [9] propose a heuristic method
based on simulated annealing to re link web pages to
improve navigability. This method makes use of the
aggregate user preference data and can be used
to improve the link structure in websites for both
wired and wireless devices. However, this approach
does not yield optimal solutions and takes relatively a
long time (10 to 15 hours) to run even for a small
website. Lin [10] develops integer programming
models to reorganize a website based on the
cohesion between pages to reduce information
overload and search depth for users. In addition,
a two-stage heuristic involving two integerprogramming models is developed to reduce the
computation time. However, this heuristic still
requires very long computation times to solve for the
optimal solution, especially when the website
contains many links. Besides, the models were
tested on randomly generated websites only, so
its applicability on real websites remains
questionable. To resolve the efficiency problem in
[10], Lin and Tseng [8] propose an ant colony
system to reorganize website structures. Although
their approach is shown to provide solutions in a
relatively short computation time, the sizes of the
synthetic websites and real website tested in [8] are
still relatively small, posing questions on its
scalability to large-sized websites.
There are several remarkable differences between
web
transformation
and
personalization
approaches. First, transformation approaches create
or modify the structure of a website used for all users,
while personalization approaches dynamically

reconstitute pages for individual users. Hence, there


is no predefined/built-in web structure for
personalization approaches. Second, in order to
understand the preference of individual users,
personalization approaches need to collect
information associated with these users.
This
computationally intensive and time-consuming
process is not required for transformation approaches.
Third, transformation approaches make use of
aggregate usage data from weblog files and do not
require tracking the past usage for each user while
dynamic pages are typically generated based on the
users traversal path.
Thus,
personalization
approaches are more suitable for dynamic
websites whose contents are more volatile and
transformation approaches are more appropriate
for websites that have a built-in structure and
store relatively static and stable contents. This
paper examines the questions of how to improve user
navigation in a website with minimal changes to its
structure. It complements the literature of
transformation
approaches
that
focus
on
reconstructing the link structure of a website. As a
result, our model is suitable for website maintenance
and can be applied in a regular manner.
5. CONCLUSION
We present a framework f or a combined web
recommender system, in which users navigational
patterns are automatically learned from web usage
data and content data. These navigational patterns are
then used to generate recommendations based on a
users current status. The items in a recommendation
list are ranked according to their importance, which is
in turn computed based on web structure information.
Our prelim is experiments show that the combination
of usage, content, and structure of data in a web
recommender sys-tem has the potential to improve
the quality of the system, as well as to keep the
recommendation up-to-date. However, there are
various ways to combine these different channels.
Our future work in this area will include investigating
different methods of combination.
REFERENCES
[1]. Aladwani, A. M. and Palvia, P. C. (2002)
Developing and validating an instrument for
measuring user (perceived web quality. Information
and Management, 39, 6, 467(476
[2]. Carlos, F., Raquel, G. and Carlos, O. (2008) the
relevance of Web Design for the Website Success: A
heuristic analysis. University of Zaragoza

[3]. Carroll, J. M. (1995). The scenario perspective


on system development, in J. M. Carroll (Eds.),
Scenario (based design: Envisioning work and
technology in system development, 1(18. John Wiley,
New York.

[7]. B.
Mobasher,
Data
Mining
for
Personalization, The Adaptive Web: Methods and
Strategies of Web Personalization, A. Kobsa, W.
Nejdl, P. Brusilov sky, eds., vol. 4321, pp. 90-135,
Springer Verlag, 2007.

[4]. Erickson, T. (1995) Notes on design practice:


Stories
and
prototypes
as
catalysts
for
communication, in J. M. Carroll (Eds.), Scenario
(based design: Envisioning work and technology in
system development, pp. 37(58. John Wiley, New
York.

[8] C.C. Lin and L. Tseng, Website Reorganization


Using an Ant Colony System, Expert Systems with
Applications, vol. 37, no. 12, pp. 7598-7605, 2010.

[5]. [Kle99] Jon M. Kleinberg. Authoritative sources


in a hyperlinked environment. Journal of the AC M,
46(5):604632, 1999.
[6]. [PE98] Mike Perkowitz and Oren Etzioni.
Adaptive web sites: Automatically synthesizing web
pages. In AAAI /I AAI, pages 727732, 1998.

[9] R. Gupta, A. Bagchi, and S. Sarkar, Improving


Linkage of Web Pages, INFORMS J. Computing,
vol. 19, no. 1, pp. 127-136, 2007.
[10] C.C. Lin, Optimal Web Site Reorganization
Considering Information Overload and Search
Depth, European J. Operational Research, vol. 173,
no. 3, pp. 839-848, 2006.