James Ma, University of Arizona, Tucson, AZ, email@example.com
Zhu Zhang, University of Arizona, Tucson, AZ, firstname.lastname@example.org
Ray Garcia, University of Arizona, Tucson, AZ, email@example.com
Abstract

Current search engines obtain their results largely through keyword matching and rarely rank results by web site comprehensibility. Moreover, these search mechanisms perform matching at the level of individual web pages and do not account for the related hypertext information presented on the web site. To address these two challenges, we propose a web-site-level comprehensibility model that automatically computes the comprehensibility score of a site from page-level quantitative features. An empirical study indicates that our model predicts the comprehensibility score of a site with 64% accuracy. Our work can potentially provide an additional search criterion, beyond existing web search practices, for the purpose of learning. In addition, the automatic calculation of comprehensibility scores may improve a learner's ability to select appropriate web sites from a list of search results to best serve their self-directed learning objectives.
1. Introduction

The World Wide Web (WWW) is increasingly used as a source of information when reading about a specific topic, and a major purpose of reading is comprehension [1]. As the volume of information on the web grows dramatically, locating the web source that is easiest to comprehend, and that thus best facilitates learning, becomes a difficult task. The most widely used tools for finding web information are search engines. However, current search engines obtain their results largely through keyword matching, and these mechanisms rarely consider an individual's intention to locate information sources that facilitate learning. A measure of the ease with which one can understand the information on a web site, which we call comprehensibility, is therefore needed to help search engines find appropriate web sites for learning about a topic. Comprehensibility scores also give the learner an easy means of evaluating search results in terms of how easily the information source can be understood.

Current search mechanisms are also limited to page-level matching and do not account for the hypertext information present on the web site. The comprehension process involves information-seeking behavior that is typically not satisfied by the content of a single web page; a learner directed to an entry page by a search engine is likely to browse multiple pages within the same site. The comprehensibility score should therefore be evaluated against the web site, not a single page, to adequately represent how well the site serves the purpose of learning. How to automatically determine the comprehensibility score of a web site thus becomes an intriguing challenge. Research on comprehensibility usually involves cognitive models of human reading behavior, which cannot be readily applied to the automatic determination of web site comprehensibility.
This paper proposes an analytical model to automatically compute the comprehensibility score of a web site from page-level quantitative features. Specifically, by considering the hyperlink structure of a web site, we aggregate page-level features into corresponding site-level features. We then apply statistical analysis to obtain the analytical relationship between the comprehensibility score and the site-level features. Our model is evaluated in an empirical study of 300 web sites comprising 51,790 web pages; the results indicate that the model predicts the comprehensibility score of a site with 64% accuracy.

The remainder of this paper is structured as follows. Section 2 presents an overview of related research. In Section 3 we introduce the site-level comprehensibility model. Section 4 summarizes our empirical study. In Section 5 we describe our findings and discuss their implications. Section 6 concludes by reviewing our findings and proposing potential avenues for future inquiry.
2. Related Research
Web site evaluation in general is a well-studied research area, and usability is a major focus of this work [2]. Fogg et al. used an online questionnaire to investigate how different elements of web sites affect people's perception of credibility [3]. Web page similarity evaluation was studied by Tombros and Ali [4], who examined the relative effectiveness of three sources of evidence, HTML tags, structure, and query terms, when calculating similarities between pages. Automated web site evaluations also mainly focus on usability [5-7]. A free online service, http://webxact.watchfire.com/, is widely used to analyze an individual web page for quality, accessibility, and privacy issues. Most of these methods calculate a number of page-level or site-level features by parsing HTML source code; a comparison of 11 feature-calculation tools was provided by Brajnik [7]. Little of this research focuses on the analysis of web comprehensibility, and the hyperlink structures of web sites have rarely been taken into direct consideration in these studies.

The assessment of web comprehensibility for the purpose of facilitating learning has not been widely investigated. In 1995, Thüring et al. associated comprehension of a hypertext document with the construction of a mental model that represents the objects and semantic relations described in a text [8]. A more recent study related reading strategies to hypertext-based learning and cognition [9]. A reference report on how to evaluate a web site for educational purposes reviewed dozens of sites discussing content evaluation [10]. Few studies approach web comprehensibility from an analytical perspective, and existing approaches are therefore difficult to automate. Automatic web site comprehensibility evaluation is thus a gap in the literature, and we aim to contribute to this area.
3. Site-Level Comprehensibility Model
We present a site-level comprehensibility model that captures the analytical relationship between the comprehensibility score of a web site and its page-level features. Let S denote a web site with n pages, and let p_i, i = 1, 2, ..., n, denote the i-th page of S. Among the n pages, p_1 is the entry page, the first page from which a learner starts to navigate the site. S can be represented by a directed graph with n vertices, where each vertex represents a unique page and an arc from p_i to p_j corresponds to a hyperlink on page p_i pointing to page p_j. In this stage of our
inquiry we ignore the weight of arcs. For any page p_i we compute m page-level textual, graphical, or structural features, where x_ij denotes the j-th feature of p_i, j = 1, 2, ..., m. For site S, the corresponding j-th site-level feature X_j is the weighted average of the x_ij over all pages:

    X_j = ( Σ_{i=1}^{n} w_i · x_ij ) / ( Σ_{i=1}^{n} w_i ),   j = 1, 2, ..., m    (1)
where w_i is the weight of p_i. To calculate w_i, we define a damping factor α, 0 ≤ α ≤ 1. Specifically, at any given page in S, we assume a learner has probability α of continuing to browse within S and probability 1 − α of stopping. We also let t_i denote the length of the shortest path from p_1 to p_i. Then α^{t_i} is the maximum probability that a learner will ever visit page p_i, given that the learner starts browsing site S from p_1. We take this maximum probability of a page ever being visited as the weight of that page:

    w_i = α^{t_i},   i = 1, 2, ..., n    (2)

It is easy to show that w_1 = 1; that w_i = 0 if there is no path from p_1 to p_i, since then t_i → ∞; and that 0 < w_i < 1 for any other i. Therefore 0 ≤ w_i ≤ 1 for all i. Moreover, X_j reduces to the regular average of the x_ij if α = 1, and X_j = x_1j if α = 0.

The comprehensibility score of site S is denoted C. We hypothesize that there is an analytical relationship between the comprehensibility score and the site-level features:

    C = f(X_1, X_2, ..., X_m)    (3)
We obtain the analytical relationship f(·) from training data. We can then compute the X_j of any testing site by setting the parameter α and the entry page p_1, and predict the comprehensibility score C of that site automatically.
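As a concrete illustration, the weighting and aggregation scheme above can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the study; the graph representation, function names, and the toy example are our own.

```python
from collections import deque

def page_weights(adjacency, entry, alpha):
    """Weight each page by w_i = alpha ** t_i (equation 2), where t_i is
    the BFS shortest-path length, in hops, from the entry page p_1.
    Pages unreachable from the entry get weight 0 (t_i -> infinity)."""
    dist = {entry: 0}
    queue = deque([entry])
    while queue:
        page = queue.popleft()
        for nxt in adjacency.get(page, []):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return {page: alpha ** dist[page] if page in dist else 0.0
            for page in adjacency}

def site_features(adjacency, entry, alpha, page_features):
    """Aggregate per-page feature vectors x_i into site-level features
    X_j as the w_i-weighted average over all pages (equation 1)."""
    w = page_weights(adjacency, entry, alpha)
    total = sum(w.values())  # w_1 = 1, so the total is never zero
    m = len(next(iter(page_features.values())))
    return [sum(w[p] * page_features[p][j] for p in adjacency) / total
            for j in range(m)]
```

With alpha = 1 this reduces to a plain average over all reachable pages, and with alpha = 0 only the entry page contributes, matching the two limiting cases noted above.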
4. An Empirical Study
Table 1. Feature Description and Classification

Group | Feature             | Description
1     | NumImage            | No. of images
1     | MinImgHeight        | Min. image height
1     | MaxImgHeight        | Max. image height
1     | AveImgHeight        | Avg. image height
1     | MinImgWidth         | Min. image width
1     | MaxImgWidth         | Max. image width
1     | AveImgWidth         | Avg. image width
2     | NumBold             | No. of bold text
2     | NumItalic           | No. of italic text
2     | NumUnderline        | No. of underlined text
3     | NumLink             | No. of links
3     | NumLinkNoDecoration | No. of links without visible underlines
3     | NumLinkInPage       | No. of links pointing to the page itself
4     | HasTitle            | Whether there is a title (0: no; 1: yes)
4     | NumWordInTitle      | No. of words in the page title
5     | NumScript           | No. of blocks of scripts
6     | NumWord             | No. of visible words
6     | NumSentence         | No. of visible sentences
7     | NumTable            | No. of tables
In order to evaluate our model, we performed an empirical study on 300 entry pages. Starting from these entry pages, we downloaded 300 web sites in a breadth-first manner using Offline Explorer, commercial software developed by MetaProducts Corporation. Since some sites have a large number of pages, we stopped crawling a site when the number of downloaded pages reached 500. In total, 51,790 pages were downloaded. We extracted 19 page-level features using a home-grown Perl application; the features are categorized into 7 groups as shown in Table 1. Note that in group 4, the feature "HasTitle" is Boolean, where 0 means "No" and 1 means "Yes." To compute the site-level features, we extracted the hyperlink structure of each site and, for each extracted graph, calculated the shortest path from the entry page to every page of that site.

The comprehensibility scores of the web sites were assigned by four professional librarians, who are well trained to interpret the ease of understanding the information contained within the hypertext of a web site and whose primary job involves considering the overall comprehensibility of web sites. Guidelines used to evaluate the sites included the general impression of how well the site facilitates finding and reading information; whether the information is easy to understand, useful, and credible; and whether the site is a source of informational or instructional content for the general public. The average of the scores given by the librarians was used as the comprehensibility score for each site. While the evaluation performed by Ivory et al. [5] used a three-point scale, our evaluation used a five-point Likert scale [11], where 1 means poor, 2 low, 3 average, 4 high, and 5 best. The Likert evaluation provides more detailed and more descriptive assessments of comprehensibility.
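Feature extraction of this kind can be sketched with Python's standard html.parser module. The snippet below is our own illustrative reconstruction, not the Perl application used in the study, and it counts only a small subset of the 19 features (NumImage, NumBold, NumLink, HasTitle).

```python
from html.parser import HTMLParser

class FeatureCounter(HTMLParser):
    """Count a few illustrative page-level features from raw HTML."""
    def __init__(self):
        super().__init__()
        self.features = {'NumImage': 0, 'NumBold': 0,
                         'NumLink': 0, 'HasTitle': 0}

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            self.features['NumImage'] += 1   # No. of images
        elif tag in ('b', 'strong'):
            self.features['NumBold'] += 1    # No. of bold text blocks
        elif tag == 'a':
            self.features['NumLink'] += 1    # No. of links
        elif tag == 'title':
            self.features['HasTitle'] = 1    # Boolean: page has a title

def extract_features(html):
    """Parse one page and return its (partial) feature dictionary."""
    parser = FeatureCounter()
    parser.feed(html)
    return parser.features
```

A full implementation would add the image-dimension, word, sentence, script, and table counters listed in Table 1.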
We obtained the best-fit model through a linear regression analysis with the comprehensibility score as the response variable and the site-level features as the explanatory variables. Fifty sites were set aside as a testing group to study prediction accuracy by comparing the comprehensibility scores given by the librarians with those predicted by the fitted model.
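The fitting step can be sketched as an ordinary least-squares regression. The sketch below uses NumPy rather than the statistical package used in the study, and the helper names are our own.

```python
import numpy as np

def fit_comprehensibility(X, c):
    """Least-squares fit of C = b0 + b1*X_1 + ... + bm*X_m, where X is an
    (n_sites, m) matrix of site-level features and c holds the
    librarian-averaged comprehensibility scores."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, c, rcond=None)
    return coef

def predict_scores(coef, X):
    """Apply the fitted coefficients to new site-level features."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return A @ coef
```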
5. Results and Discussion
We present the results of the empirical analysis.

5.1 Impact of Individual Groups
Table 2. Individual Group Regressions

Group | Adjusted R2 | Feature List
1     | 0.07036     | NumImage, MaxImgHeight, AveImgHeight
2     | 0.03587     | NumBold, NumItalic, NumUnderline
3     | 0.02123     | NumLinkInPage
4     | 0.01572     | HasTitle, NumWordInTitle
5     | 0.005397    | NumScript
6     | -0.004499   | NumWord
7     | -0.008871   | NumTable
In order to analyze the impact of each group of features on the comprehensibility scores, we performed a regression analysis with each group of features as the explanatory variables, assuming a damping factor of 0.75. Table 2 exhibits the regression results. The feature list shows the specific features that produced the best-fit model among all possible combinations of the features within that group. For example, using "NumLinkInPage" alone gives the best-fit model among all 7 possible combinations of the three group-3 features "NumLink," "NumLinkNoDecoration," and "NumLinkInPage." Table 2 shows that group 1 gives the best explanatory model compared with the other 6 groups, indicating that the most influential factors for comprehensibility are the image features, especially the number of images and the image height features. In contrast, groups 6 and 7 explain the comprehensibility scores worst.
5.2 Impact of Multiple Groups

We have also analyzed combinations of different groups of features. The results are listed in Table 3, from which several constructive guidelines can be drawn. For example, "MaxImgHeight" has a substantially positive impact on comprehensibility. "NumImage" also has a positive impact, whereas "AvgImgHeight" has a negative impact. This indicates that a tall image, usually the focal point, helps learners understand the concept of the site, and that the more images there are, the easier comprehension is. However, if the average image height is large, the comprehension process is likely to be encumbered, because learners must scroll up and down extra distance in order to grasp the concept of the site. Some results cannot be explained as intuitively. "HasTitle" has a negative impact while "NumWordInTitle" has a positive impact: it seems that learners would prefer no page title, but that if there is one, they would like it to be long. One possible explanation is that learners would rather have no page title than a short and possibly misleading one.
Table 3. Multiple Group Regressions

Adjusted R2 = 0.1599, p-value = 0.002833

Group | Feature        | Coefficient | Pr(>|t|)
  -   | Intercept      | 3.8277      | 0.125422
  1   | NumImage       | 1.2248      | 0.125422
  1   | MaxImgHeight   | 14.5791     | 0.011526
  1   | AvgImgHeight   | -4.9629     | 0.116469
  2   | NumBold        | -1.3673     | 0.184508
  2   | NumItalic      | 0.9929      | 0.162660
  2   | NumUnderline   | -1.7603     | 0.124276
  3   | NumLinkInPage  | 1.2681      | 0.311571
  4   | HasTitle       | -2.1441     | 0.052613
  4   | NumWordInTitle | 0.9646      | 0.137958
  5   | NumScript      | -1.6778     | 0.064209

Table 4. Damping Impact Analysis

α    | Adjusted R2 | p-value
0.00 | 0.08993     | 0.06986
0.25 | 0.1505      | 0.00417
0.50 | 0.1549      | 0.003489
0.75 | 0.1599      | 0.002833
1.00 | 0.1632      | 0.002468

The analysis is performed using the 10 features in Table 3.
5.3 Impact of Damping Factor

Table 4 analyzes the impact of the damping factor α. As shown in Section 3, if α = 0, only the entry page has any effect on the site comprehensibility, because α = 0 implies that a learner will definitely stop navigating after reaching the entry page. As α increases, this first-impression effect decreases; that is, pages visited earlier in the navigation have a smaller advantage over pages visited later. If α = 1, every page of the site is equally relevant. The results in Table 4 indicate that the model becomes more explanatory as α increases. α = 1 gives the best-fit model, which implies that site comprehensibility benefits equally from all pages within the site: the order in which a page is browsed does not play a significant role in the comprehensibility of the site.
5.4 Model Validation

In order to validate our model, we chose 50 sites as the testing group. We applied the model coefficients obtained at α = 1 to calculate the predicted comprehensibility scores for the 50 sites, and then considered the absolute difference between the predicted scores and the scores (on the five-point scale) given by the human evaluators. The average absolute difference is 1.0684. If the absolute score difference for a site is greater than 1, we regard it as a prediction error. Out of 50 sites, 18 prediction errors were observed; the prediction accuracy rate of the site-level comprehensibility model is therefore 64%.
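The error criterion above can be sketched directly; the function name is ours, and the 1-point tolerance is taken from the text.

```python
def prediction_accuracy(predicted, observed, tolerance=1.0):
    """A site counts as a prediction error when the absolute difference
    between its predicted and librarian-assigned scores exceeds the
    tolerance; accuracy is the fraction of sites without error."""
    errors = sum(1 for p, o in zip(predicted, observed)
                 if abs(p - o) > tolerance)
    return 1 - errors / len(predicted)
```

Under this criterion, 18 errors among the 50 testing sites yield the reported accuracy of 1 - 18/50 = 64%.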
6. Conclusion

We have proposed a site-level comprehensibility model that automatically computes the comprehensibility score of a site from page-level quantitative features. By utilizing the hyperlink structure of a site, we provide a framework for aggregating page-level features into corresponding site-level features. An empirical study indicates that the model predicts the comprehensibility score of a site with 64% accuracy. Our work can potentially provide an additional search criterion, beyond existing web search practices, for the purpose of learning. In addition, the automatic calculation of comprehensibility scores may improve a learner's ability to select appropriate web sites from a list of search results to best serve their self-directed learning objectives.

Our initial research may be extended in several ways. We have identified 160 page-level features that could be extracted for analysis, of which only 19 were selected for this initial phase of the research. The textual and structural similarities between different pages may be considered as site-level features. Alternative methods of determining page weights also need to be studied: instead of considering only the shortest path, all possible paths could be considered and analyzed. Further analysis against a larger set of web sites and features should improve the model fit. An improved understanding of the concept of comprehensibility may enable search engine developers and web page developers to apply these findings in their development practices.
References

[1] T. A. van Dijk and W. Kintsch, Strategies of Discourse Comprehension. Orlando: Academic Press, 1983.
[2] E. H. Chi, P. Pirolli, and J. Pitkow, "The scent of a site: A system for analyzing and predicting information scent, usage, and usability of a web site," presented at the Conference on Human Factors in Computing Systems, The Hague, The Netherlands, 2000.
[3] B. Fogg, J. Marshall, O. Laraki, A. Osipovich, C. Varma, N. Fang, J. Paul, A. Rangnekar, J. Shon, P. Swani, and M. Treinen, "What makes web sites credible? A report on a large quantitative study," presented at SIGCHI '01, Seattle, WA, USA, 2001.
[4] A. Tombros and Z. Ali, "Factors affecting web page similarity," presented at Advances in Information Retrieval, 27th European Conference on IR Research (ECIR 2005), Santiago de Compostela, Spain, 2005.
[5] M. Y. Ivory, R. R. Sinha, and M. A. Hearst, "Empirically validated web page design metrics," presented at the ACM SIGCHI '01 Conference: Human Factors in Computing Systems, New York, 2001.
[6] M. Y. Ivory and M. A. Hearst, "Statistical profiles of highly-rated web sites," presented at CHI, Minneapolis, Minnesota, USA, 2002.
[7] G. Brajnik, "Automatic web usability evaluation: Where is the limit?," presented at the 6th Conference on Human Factors and the Web, Austin, TX, 2000.
[8] M. Thüring, J. Hannemann, and J. Haake, "Hypermedia and cognition: Designing for comprehension," Communications of the ACM, vol. 38, pp. 57-66, 1995.
[9] L. Salmerón, J. J. Cañas, W. Kintsch, and I. Fajardo, "Reading strategies and hypertext comprehension," Discourse Processes, vol. 40, pp. 171-191, 2005.
[10] EETAP, "Evaluating the Content of Web Sites: Guidelines for Educators," http://www.epa.gov/enviroed/pdf/evalwebsites.pdf, 1999.
[11] R. Likert, "A technique for the measurement of attitudes," Archives of Psychology, vol. 140, p. 55, 1932.