Employing Dynamic Analysis towards Quantifying Quality In Use of Websites

Michail D. Papamichail, Themistoklis Diamantopoulos, Kyriakos C. Chatzidimitriou, and Andreas L. Symeonidis
(mpapamic,thdiaman,kyrcha)@issel.ee.auth.gr, asymeon@eng.auth.gr
Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki
Thessaloniki, Greece

ABSTRACT
The constant growth of the service-oriented business model dictates the development of web applications that allow interaction of the end user with the business. In this type of end user-business interaction, quality in use is of pivotal importance, since bad user experience may practically lead to customer churn and revenue loss. Thus, the need to evaluate the extent to which a website is appealing from a quality in use perspective is high. To that end, we have built a platform that performs dynamic analysis on websites, focusing on quantifying quality in use through the computation of a series of dynamic analysis metrics. Apart from the computation of various metrics that are related to performance, accessibility, search engine optimizations, and development best practices, our analysis involves the identification of the technology stack used in each website. We have applied our methodology on the 5,000 most popular websites as defined by Alexa and provide the results of the analysis to the community for further probing and research.

CCS CONCEPTS
• Software and its engineering → Extra-functional properties; • Information systems → Web mining; • Human-centered computing → Human computer interaction (HCI).

KEYWORDS
quality in use, dynamic analysis, technology stack penetration, web application quality

ACM Reference Format:
Michail D. Papamichail, Themistoklis Diamantopoulos, Kyriakos C. Chatzidimitriou, and Andreas L. Symeonidis. 2020. Employing Dynamic Analysis towards Quantifying Quality In Use of Websites. In Proceedings of MSR ’20: 17th International Conference on Mining Software Repositories (MSR ’20). ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MSR ’20, May 25–26, 2020, Seoul, South Korea
© 2020 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION
The continuously increasing penetration of the high-speed internet into every aspect of everyday life has defined a new popular business paradigm, which offers everything-as-a-service. According to the latest statistics of the world wide web¹, the number of websites has increased 10 times in the last nine years, from approximately 206 million in 2010 to more than 2 billion websites today. Nowadays, there are numerous business websites that offer their products online and are trying to increase their market share as much as possible, so the impact of easily finding what the customer wants is decisive. As a result, monitoring and evaluating the quality in use of websites becomes a critical aspect of website success.

According to ISO 25010 [1], Quality in Use is defined as the degree to which a product or system can be used by specific users to meet their needs towards achieving specific goals and is composed of the following characteristics: Effectiveness, Efficiency, Satisfaction, Freedom from risk, and Context coverage. These characteristics are closely related with the design quality of websites from an end-user point of view, which influences important user-related factors such as the degree of acceptance and the purchasing behaviour [10, 19].

From a quality assessment viewpoint, measuring the aforementioned characteristics in a systematic manner has drawn the attention of researchers, with current research efforts targeting at modelling the design quality of websites both quantitatively [5, 14, 18] and qualitatively [12, 13]. The quantitative studies use metrics that quantify certain characteristics in order to evaluate the overall design quality of websites, while qualitative approaches mostly target at measuring attractiveness and customer satisfaction through user feedback. An interesting factor is that the majority of the studies that abide by the definition of quality in use, perform context-specific evaluation by analyzing websites that belong in the same domain (e-shops, airline companies, marketing sites, e-government etc.), instead of providing a generic evaluation scheme.

From a software development viewpoint, the competitive web development ecosystem dictates fast time-to-market and offers numerous frameworks for development, while the leading vendors propose the use of many reusable components as a state-of-the-practice development paradigm. This reuse-oriented approach is more than obvious in the JavaScript (JS) community and especially in the npm ecosystem, where in extreme cases there are one-liner libraries containing more than 70 dependencies [11]. Against this background and in an effort to construct a dataset that quantifies the quality in use as perceived by end-users, we performed a thorough dynamic analysis on the 5,000 most popular websites (as ranked by Alexa²). Our dataset contains the dynamic analysis results for each website, and specifically: 1) metrics that quantify the characteristics related to quality in use (as defined by ISO 25010), and 2) information regarding the involved technology stack (web server, content

¹ http://www.internetlivestats.com/total-number-of-websites/
² https://www.alexa.com/
[Figure 2: schema diagram. Field names recovered from the diagram are listed per collection below.]

Metrics collection: url, html-lang-valid, layout-table, object-alt, aria-required-parent, offscreen-content-hidden, td-headers-attr, definition-list, logical-tab-order, list, document-title, aria-valid-attr, audio-caption, frame-title, focus-traps, video-caption, input-image-alt, listitem, bypass, tabindex, custom-controls-labels, th-has-data-cells, custom-controls-roles, label, heading-levels, aria-allowed-attr, aria-required-attr, color-contrast, button-name, focusable-controls, aria-valid-attr-value, aria-required-children, use-landmarks, accesskeys, video-description, duplicate-id, dlitem, visual-order-follows-dom, image-alt, meta-refresh, html-has-lang, link-name, aria-roles, managed-focus, meta-viewport, valid-lang, link-blocking-first-paint, consistently-interactive, redirects, uses-long-cache-ttl, dom-size, unused-css-rules, total-byte-weight, unminified-javascript, uses-optimized-images, mainthread-work-breakdown, first-interactive, script-blocking-first-paint, critical-request-chains, speed-index-metric, unminified-css, uses-responsive-images, user-timings, time-to-first-byte, screenshot-thumbnails, uses-webp-images, bootup-time, uses-rel-preload, offscreen-images, uses-request-compression, estimated-input-latency, network-requests, first-meaningful-paint, content-width, pwa-page-transitions, load-fast-enough-for-pwa, is-on-https, works-offline, without-javascript, viewport, splash-screen, themed-omnibox, service-worker, pwa-each-page-has-url, redirects-http, pwa-cross-browser, webapp-install-banner, errors-in-console, manifest-short-name-length, deprecations, no-websql, uses-http2, image-aspect-ratio, appcache-manifest, no-vulnerable-libraries, no-document-write, geolocation-on-start, external-anchors-use-rel-noopener, notification-on-start, password-inputs-can-be-pasted-into, uses-passive-event-listeners, no-mutation-events, link-text, structured-data, canonical, http-status-code, is-crawlable, font-size, meta-description, mobile-friendly, hreflang, plugins

Statistics collection: Performance_Score, Accessibility_Score, Progressive_Web_App_Score, Best_Practices_Score, SEO_Score, Number_of_JS_Libraries, Programming_Language, Website_url, Domain

Technologies collection: url, Message Boards, Widgets, Maps, Payment Processors, Analytics, Advertising Networks, Issue Trackers, CMS, Live Chat, JavaScript Graphics, Editors, Ecommerce, Comment Systems, Web Frameworks, Cache Tools, PaaS, Video Players, Tag Managers, Dev Tools, Web Servers, Miscellaneous, Databases, JavaScript Frameworks, Rich Text Editors, Containers, SEO, Static Site Generator, CDN, Operating Systems, Mobile Frameworks, JavaScript Libraries, Search Engines, IaaS, Blogs, Wikis, Web Server Extensions, Documentation Tools, Programming Languages, Marketing Automation, Captchas, Font Scripts, Hosting Panels

Figure 2: Database Schema Overview (Metrics collection, Statistics collection, and Technologies collection)

Table 1: Distribution of values for collection statistics

Metric                       Min      Max       Mean     Std
Performance_Score            0.00     100.00    41.04    25.11
Accessibility_Score          0.00     100.00    59.40    19.43
Progressive_Web_App_Score    0.00     90.90     34.81    11.80
Best_Practices_Score         31.25    93.75     63.14    11.71
SEO_Score                    33.33    100.00    90.20    12.59
Number_of_JS_Libraries       1        264       21.37    24.77
First_meaningful_paint       0.31s    42.57s    6.12s    4.04s
Technology_Stack_length      1        28        5.22     4.43

    # Get identified technologies for a certain website
    curl -G "{base_url}/api/v1/Technologies" \
      --data-urlencode 'where={"primaryUrl":"the_website_url"}'

    # Get websites with high Performance and Accessibility Scores
    curl -G "{base_url}/api/v1/Statistics" \
      --data-urlencode 'where={
        "$and": [
          {"performance_score": {"$gt": 70}},
          {"accessibility_score": {"$gt": 80}}
        ]
      }'

Figure 3: Example Queries

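The Min, Max, Mean, and Std columns of Table 1 are standard summary statistics over one field of the Statistics collection. The sketch below shows how such a row could be computed for a list of values. The sample scores are hypothetical, and the use of the population standard deviation is an assumption, since the paper does not state which variant was used.

```python
from statistics import mean, pstdev


def summarize(values):
    """Return the four summary columns used in Table 1 for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        # Assumption: population standard deviation; the paper does not specify.
        "std": pstdev(values),
    }


# Hypothetical scores, NOT taken from the dataset.
sample_scores = [0.0, 50.0, 100.0]
summary = summarize(sample_scores)
print({name: round(value, 2) for name, value in summary.items()})
```

Applying the same function to each numeric field of the Statistics records would yield one row per metric.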
found at [2]. In addition, the statistics include metrics that refer to the complexity of the analyzed websites. These metrics are the number of JavaScript libraries they include, their first meaningful paint which expresses the required time period to load the basic elements of the website and the technology stack length which is the sum of all the identified technologies. On top of the aforementioned information, in an effort to enable domain-aware quality in use evaluation, we also extracted the domain of each analyzed website.

REST API. Given that quality in use evaluation is a non-trivial task that involves a variety of research questions, each having different motivation and experimental targets, scalability and data querying abilities are essential. Towards this direction, we offer a REST API⁷ that enables effective data retrieval and seamless integration with existing systems. The REST API was developed using the Python Eve⁸ framework. Figure 3 demonstrates two example REST calls using the implemented API. The first retrieves the list of identified technologies for a certain website, while the second filters the websites that exhibit performance score and accessibility score more than 70 and 80, respectively. The full documentation of the provided API is available online⁹.

⁷ https://quality-in-use.herokuapp.com/api/v1
⁸ http://docs.python-eve.org/en/latest/
⁹ https://quality-in-use.herokuapp.com/docs
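The two REST calls of Figure 3 can also be issued programmatically. The following is a minimal sketch using only the Python standard library. It assumes that the API root is the URL given in the footnote of the REST API paragraph, that the service is still reachable, and that responses use the `_items` envelope key that Eve applies by default. The helper names `build_query_url` and `fetch_items` are hypothetical and not part of the published API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: API root as listed in the paper's footnote; it may no longer be online.
BASE_URL = "https://quality-in-use.herokuapp.com/api/v1"


def build_query_url(collection, where, base_url=BASE_URL):
    """Build a GET URL whose `where` parameter carries a JSON-encoded filter."""
    query = urlencode({"where": json.dumps(where, separators=(",", ":"))})
    return f"{base_url}/{collection}?{query}"


def fetch_items(url, timeout=10):
    """Send the request and return the documents found under `_items`.

    Assumption: the server keeps Eve's default response envelope.
    """
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("_items", [])


# First query of Figure 3: technologies identified for one website.
technologies_url = build_query_url(
    "Technologies", {"primaryUrl": "the_website_url"}
)

# Second query of Figure 3: performance score > 70 and accessibility score > 80.
high_quality_url = build_query_url(
    "Statistics",
    {"$and": [
        {"performance_score": {"$gt": 70}},
        {"accessibility_score": {"$gt": 80}},
    ]},
)

print(technologies_url)
print(high_quality_url)
```

Calling `fetch_items(high_quality_url)` would perform the actual request. URL construction is kept separate from the network call so that the filter can be inspected or logged without contacting the server.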
REFERENCES
[1] 2020. ISO 25010. https://www.iso.org/obp/ui/#iso:std:iso-iec:25010:ed-1:v1:en. Accessed: 2020-02-05.
[2] 2020. Lighthouse Scoring Methodology. https://developers.google.com/web/tools/lighthouse/v3/scoring. Accessed: 2020-02-05.
[3] Salam Abdallah and Bushra Jaleel. 2015. Website appeal: development of an assessment tool and evaluation framework of e-marketing. Journal of theoretical and applied electronic commerce research 10, 3 (2015), 45–62.
[4] Faizan Ali. 2016. Hotel website quality, perceived flow, customer satisfaction and purchase intention. Journal of Hospitality and Tourism Technology 7, 2 (2016), 213–228.
[5] Billy Bai, Rob Law, and Ivan Wen. 2008. The impact of website quality on customer satisfaction and purchase intentions: Evidence from Chinese online visitors. International journal of hospitality management 27, 3 (2008), 391–402.
[6] Ioannis K Chaniotis, Kyriakos-Ioannis D Kyriakou, and Nikolaos D Tselikas. 2015. Is Node.js a viable option for building modern web applications? A performance evaluation study. Computing 97, 10 (2015), 1023–1044.
[7] Kyriakos Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos, and Andreas Symeonidis. 2018. Npm-miner: An infrastructure for measuring the quality of the npm registry. In 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR). IEEE, 42–45.
[8] Wen-Chih Chiou, Chin-Chao Lin, and Chyuan Perng. 2010. A strategic framework for website evaluation based on a review of the literature from 1995–2006. Information & management 47, 5-6 (2010), 282–290.
[9] Sookyup Chong and Rob Law. 2018. Review of studies on airline website evaluation. Journal of Travel & Tourism Marketing (2018), 1–16.
[10] Etienne Cocquebert, Damien Trentesaux, and Christian Tahon. 2010. WISDOM: A website design method based on reusing design and software solutions. Information and Software Technology 52, 12 (2010), 1272–1285.
[11] David Haney. 2020. NPM & left-pad: Have We Forgotten How To Program? https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/. Accessed: 2020-02-05.
[12] Frode Heldal, Endre Sjøvold, and Anders Foyn Heldal. 2004. Success on the Internet - optimizing relationships through the corporate site. International Journal of Information Management 24, 2 (2004), 115–129.
[13] Soyoung Kim and Leslie Stoel. 2004. Apparel retailers: website quality dimensions and satisfaction. Journal of Retailing and Consumer Services 11, 2 (2004), 109–117.
[14] Scott McCoy, Andrea Everard, and Eleanor T Loiacono. 2009. Online ads in familiar and unfamiliar sites: Effects on perceived website quality and intention to reuse. Information Systems Journal 19, 4 (2009), 437–458.
[15] Renuka Nagpal, Deepti Mehrotra, and Pradeep Kr Bhatia. 2016. Usability evaluation of website using combined weighted method: fuzzy AHP and entropy approach. International Journal of System Assurance Engineering and Management 7, 4 (2016), 408–417.
[16] Michail Papamichail, Themistoklis Diamantopoulos, Ilias Chrysovergis, Philippos Samlidis, and Andreas Symeonidis. 2018. User-perceived reusability estimation based on analysis of software repositories. In Proceedings of the 2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE). IEEE, 49–54.
[17] Rim Rekik and Ilhem Kallel. 2011. Fuzzy reduced method for evaluating the quality of institutional web sites. In Proceedings of the 2011 7th International Conference on Next Generation Web Services Practices (NWeSP). IEEE, 296–301.
[18] John D Wells, Veena Parboteeah, and Joseph S Valacich. 2011. Online impulse buying: understanding the interplay between consumer impulsiveness and website quality. Journal of the Association for Information Systems 12, 1 (2011), 32.
[19] Tao Zhou, Yaobin Lu, and Bin Wang. 2009. The relative importance of website design quality and service quality in determining consumers’ online repurchase behavior. Information Systems Management 26, 4 (2009), 327–337.