
COLLNET Journal of Scientometrics and Information Management

ISSN: 0973-7766 (Print) 2168-930X (Online) Journal homepage: https://www.tandfonline.com/loi/tsim20

Web citation analysis of Library and Information Science and Communication and Media Studies journals: A comparative study

B. Niveditha & Mallinath Kumbar

To cite this article: B. Niveditha & Mallinath Kumbar (2020) Web citation analysis of Library
and Information Science and Communication and Media Studies journals : A comparative
study, COLLNET Journal of Scientometrics and Information Management, 14:2, 335-348, DOI:
10.1080/09737766.2021.1915721

To link to this article: https://doi.org/10.1080/09737766.2021.1915721

Published online: 04 Jun 2021.

COLLNET Journal of Scientometrics and Information Management
ISSN : 0973 – 7766 (Print) 2168 – 930X (Online)
Vol. 14(2) December 2020, pp. 335–348
DOI : 10.1080/09737766.2021.1915721


The present study examines the availability of web citations in scholarly journals of Library and Information Science and Communication and Media Studies. Journals with a high impact factor were selected, and their articles published between 2008 and 2017 were examined. A PHP script was used to crawl the Uniform Resource Locators collected from the references. A total of 12,251 articles were downloaded and 5,55,428 references were extracted. A total of 1,02,718 URLs were checked for their availability. Further, the lexical features of the URLs, namely file extension, path depth, character length and top-level domain, were determined. The findings indicated that the percentage of web citations in articles increased continuously from 2008 to 2017 in both disciplines. The accessibility check indicated that more URLs were accessible in CMS journal articles than in LIS journal articles. The majority of errors in both disciplines were due to the HTTP 404 error code (Not found error). The findings of the study will help authors, publishers and editorial staff to ensure that web citations remain accessible in future.

Keywords: References, Web citations, URLs, DOIs, Library and information science, Communication and media studies, HTTP error, PHP.

B. Niveditha*
UGC-Junior Research Fellow
Department of Library and Information Science
University of Mysore
Manasagangotri
Mysuru 570006
Karnataka
India
niveditha.jb@gmail.com

Mallinath Kumbar
Department of Library and Information Science
University of Mysore
Manasagangotri
Mysuru 570006
Karnataka
India
mallinathk@yahoo.com

* Corresponding Author

1. Introduction

Internet-based electronic resources have grown drastically in recent years. These resources have changed how the scholarly community seeks information, which in turn has had a positive impact on its productivity and creativity. Scholars have thus started to cite electronic resources, along with their URLs or DOIs, in their research publications. The accessibility of these web citations has created a new challenge. Stable web citations not only aid the steady transfer of information to other researchers but can also enhance academic productivity. Web citations are used by authors from many disciplines. In this context, the present study checks the availability of URLs cited in the fields of Library and Information Science and Communication and Media Studies during the period 2008-2017 using a PHP script.

2.  Review of literature


Isfandyari-Moghaddam et al. [2] stated that the Web has changed the citing behaviour of researchers, which in turn has driven the growth of web citations. Prithvi Raj and Sampath Kumar [6] analyzed the web citations in three LIS scholarly journals published from 2001 to 2010. Web citations have also been studied in disciplines other than Library and Information Science. For instance, Lawrence [3] made a comprehensive study of conference articles in computer science and related disciplines, Zhang [16] conducted a comparison study of two journals in Communication and Media Studies, and Mardani [4] surveyed the available web citations in chemistry articles.
Wu [14] stated that as web references increase, inaccessible web references increase as well. Markwell and Brooks [5] attributed the non-persistence of web citations to broken links and to providers restructuring their file hierarchies, while Spinellis [12] attributed it to server problems and invalid URL hostnames or paths. Vinay Kumar and Sampath Kumar [13] also analyzed characteristic features of URLs such as file format, top-level domain, path depth and character length. This paper extends these studies by comparing web citations in two disciplines, Library and Information Science and Communication and Media Studies. The study also uses a PHP script to check the availability of URLs.

3.  Research Questions

· What proportion of web citations are used in Library and Information Science and
Communication and Media Studies journal articles?
· What proportion of URLs and DOIs has been cited in scholarly LIS and CMS journal
articles?
· What is the percentage of vanished URLs?
· What are the lexical features associated with active and vanished URLs?

4. Methodology
4.1 Selection of journals
For the present study, data were drawn from 20 leading scholarly journals, ten in Library and Information Science and ten in Communication and Media Studies. The journals were selected based on their high impact factor as per Clarivate Analytics’ 2018 “Journal Citation Reports.” The selected journals are presented in Table 1.

4.2 Selection of articles and references


All research articles published during the 10-year period from 2008 to 2017 were taken up for the study. Editorial notes, book reviews and short communications were excluded. The references listed at the end of each article were considered for the study. A total of 5,55,428 references were collected from 12,251 articles published in the 20 journals.

4.3 Extraction of URLs


The references that contained web links and DOIs were extracted, as the study deals with their accessibility. The DOIs and arXiv identifiers were first resolved to URLs. DOIs were resolved by prefixing them with https://doi.org/. For example, the DOI name 10.1010.1234/567 was resolved to the address https://doi.org/10.1010.1234/567. Similarly, arXiv identifiers were resolved to URLs using the prefix https://arxiv.org/. A total of 1,02,718 URLs were extracted for checking their availability.
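
The resolution step described above can be sketched as follows. This is a minimal Python illustration, not the PHP script used in the study. The function name `resolve_identifier`, the regular expression for DOI names and the `/abs/` path for arXiv are assumptions; the paper only states the prefixes https://doi.org/ and https://arxiv.org/.

```python
import re

DOI_PREFIX = "https://doi.org/"
# Assumption: the paper gives only "https://arxiv.org/"; "/abs/" is the usual
# landing-page path and is used here for illustration.
ARXIV_PREFIX = "https://arxiv.org/abs/"

# Assumed shape of a DOI name: "10." + registrant code + "/" + suffix.
DOI_NAME = re.compile(r"10\.\d{4,9}\S*/\S+")


def resolve_identifier(identifier: str) -> str:
    """Turn a DOI name or arXiv identifier into a URL; leave URLs unchanged."""
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith(("http://", "https://")):
        return ident  # already a URL
    if lowered.startswith("doi:"):
        ident = ident[4:].strip()
        lowered = ident.lower()
    if lowered.startswith("arxiv:"):
        return ARXIV_PREFIX + ident[6:].strip()
    if DOI_NAME.fullmatch(ident):
        return DOI_PREFIX + ident
    raise ValueError(f"not a recognised DOI or arXiv identifier: {identifier!r}")


# The DOI example given in the text:
print(resolve_identifier("10.1010.1234/567"))
```

Strings that are already URLs pass through unchanged, so the same function can be applied to every extracted web citation before the availability check.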

4.4 Testing URLs and examining their lexical features


A PHP script was developed to test URLs in bulk. The script uses the cURL library, a standard PHP extension, to check URL availability and records the error code associated with each vanished URL. Apart from checking the URLs, the script obtains their lexical features. To determine how lexical features influence URL decay, the file extension, path depth, character length and top-level domain were obtained for active and vanished URLs.
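
The study's PHP script is not reproduced in the paper. The following Python sketch only illustrates the two tasks described above, using the standard library in place of PHP and cURL. Several choices are assumptions made for illustration: path depth is taken as the number of non-empty path segments, unrecognised file extensions are labelled "other", two-letter top-level domains are treated as country codes, and availability is probed with a HEAD request.

```python
import posixpath
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Categories taken from Tables 5 and 8 of the paper.
KNOWN_EXTENSIONS = {".asp", ".cfm", ".cgi", ".doc", ".html", ".jsp", ".pdf", ".php", ".txt"}
GENERIC_TLDS = {"com", "edu", "gov", "info", "int", "mil", "net", "org"}


def check_url(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no HTTP response arrives."""
    request = urllib.request.Request(url, method="HEAD")  # assumption: HEAD suffices
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a vanished page
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, malformed URL, etc.


def lexical_features(url):
    """Extract file extension, path depth, character length and TLD category."""
    parsed = urlparse(url)
    segments = [part for part in parsed.path.split("/") if part]
    extension = posixpath.splitext(segments[-1].lower())[1] if segments else ""
    host = (parsed.hostname or "").lower()
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    if tld in GENERIC_TLDS:
        tld_category = "." + tld
    elif len(tld) == 2 and tld.isalpha():
        tld_category = "country code"
    else:
        tld_category = "other"
    return {
        "file_extension": extension if extension in KNOWN_EXTENSIONS else "other",
        "path_depth": len(segments),
        "character_length": len(url),
        "top_level_domain": tld_category,
    }


print(lexical_features("https://doi.org/10.1002/asi.20513"))
```

Under these assumptions a resolved DOI such as https://doi.org/10.1002/asi.20513 has path depth 2 and a length of 33 characters, which falls in the ranges where Tables 6 and 7 show the largest URL counts.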

5.  Results and Discussion


5.1  Journal-wise distribution of articles, references and web citations
Table 2 shows the journal-wise distribution of articles, references and web citations during 2008-2017. A total of 7986 articles from ten Library and Information Science journals and 4265 articles from ten Communication and Media Studies journals were examined. The articles in LIS journals contained 3,24,636 references, more than the 2,30,792 references in CMS journals. It can be observed that the number of references in LIS journals has been consistently increasing, whereas in CMS journals the number of references varies, that is, it neither increases nor decreases steadily. Among all journals, Scientometrics, a Library and Information Science journal, has the highest number of articles (2575) as well as references (96,137). It is followed by JASIST with 1744 articles and 87,468 references and by New Media and Society, a Communication and Media Studies journal, with 794 articles and 40,663 references. The lowest numbers of articles were found in Media Psychology (212) and Communication Theory (213), both of which are Communication and Media Studies journals. The lowest numbers of references were found in the Journal of the Medical Library Association (6293) and Portal (10412). The references in Scientometrics and the Journal of the Association for Information Science and Technology together account for 56% of the total cited references in Library and Information Science journals. The table also summarizes the distribution of web citations and shows that the number of web citations in Library and Information Science journal articles (51839) is higher than in Communication and Media Studies journal articles (50879). Among all journals, web citations were highest in Scientometrics (14261), followed by Information, Communication and Society (13233) and the Journal of the Association for Information Science and Technology (10301). Low numbers of web citations were noted in the Journal of Advertising (940) and the Journal of the Medical Library Association (2255). The percentage of web citations among all references is higher in Communication and Media Studies journal articles (22.05%) than in Library and Information Science journal articles (15.97%). Pearson’s correlation analysis showed that the number of references and the number of web citations in journal articles are positively correlated and that the relation is statistically significant, with r = .965, p < .001 in Library and Information Science journals and r = .960, p < .001 in Communication and Media Studies journals.
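
The percentages quoted in this section follow directly from the totals in Table 2. A minimal Python sketch of that arithmetic is given below; the helper name `percentage` is illustrative and not part of the study's tooling.

```python
def percentage(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Totals from Table 2.
LIS_REFERENCES, LIS_WEB_CITATIONS = 324636, 51839
CMS_REFERENCES, CMS_WEB_CITATIONS = 230792, 50879

lis_share = percentage(LIS_WEB_CITATIONS, LIS_REFERENCES)
cms_share = percentage(CMS_WEB_CITATIONS, CMS_REFERENCES)

# Share of LIS references contributed by Scientometrics and JASIST together.
top_two_share = percentage(96137 + 87468, LIS_REFERENCES)

print(lis_share, cms_share, top_two_share)
```

The first two values reproduce the 15.97% and 22.05% reported in Table 2, and the third comes to about 56.6%, in line with the share of roughly 56% stated in the text.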

5.2 Year-wise distribution of web citations


The percentage of web citations by year in both disciplines is shown in Figure 1. The percentage increased from a low of 11.21 and 7.22 in the year 2008 to a high of 22.80 and 37.22 in the year 2017. There is a positive correlation between the year and the percentage of web citations (r = 0.919 in LIS journals and r = 0.968 in CMS journals), and the relation is statistically significant (p < 0.001 in both LIS and CMS journals). This shows that the percentage of web citations in articles increased continuously from 2008 to 2017 in both disciplines.

5.3 Distribution of URLs and DOIs


The permanence of web citations is a major concern for academics, and the use of DOIs can prevent their decay (Saberi & Abedi, 2012; Sadat-Moosavi et al., 2012; Yang et al., 2010). The DOI is a character string used to identify a scholarly publication in the digital environment. Figure 2 shows the distribution of URLs and DOIs in both disciplines. Library and Information Science journal articles had 31,291 (60.36%) URLs, 19,640 (37.89%) DOIs and 908 (1.75%) arXiv and WOS identifiers cited in the references. In contrast, Communication and Media Studies journal articles had 19,946 (39.20%) URL links, 30,868 (60.67%) DOIs and 65 (0.13%) arXiv identifiers.
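
The paper does not describe how references were sorted into URLs, DOIs and arXiv identifiers. One plausible way to do so is pattern matching on the reference text, sketched below in Python. The regular expressions and the function name `classify_reference` are assumptions for illustration, and WOS identifiers are not handled.

```python
import re
from collections import Counter

# Assumed patterns; the study does not state its matching rules.
ARXIV_PATTERN = re.compile(r"arxiv(\.org|\s*:)", re.IGNORECASE)
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def classify_reference(reference):
    """Label a reference string as 'arXiv', 'DOI', 'URL' or 'none'."""
    if ARXIV_PATTERN.search(reference):
        return "arXiv"
    if DOI_PATTERN.search(reference):
        return "DOI"  # checked before URL so that doi.org links count as DOIs
    if URL_PATTERN.search(reference):
        return "URL"
    return "none"


references = [
    "Goh, D.H., & Ng, P.K. Link decay in leading information science journals. https://doi.org/10.1002/asi.20513",
    "Lawrence, S. Free online availability substantially increases a paper's impact, Nature, 411(6837), 2001, 521.",
    "Hypothetical report. http://www.example.org/report.pdf",
]
counts = Counter(classify_reference(ref) for ref in references)
print(counts)
```

The DOI pattern is tested before the URL pattern because a DOI written as a doi.org link would otherwise be counted as a plain URL.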

5.4 Journal-wise distribution of active and vanished URLs


Though the web has eased information access, missing web citations are a major concern for researchers (Sadat-Moosavi et al., 2012). The DOIs and arXiv identifiers were resolved to URLs and tested for their availability. The accessibility check indicated that, of the 51,839 URLs in LIS journal articles, 76.90% were active while the remaining 23.10% encountered an accessibility error. In CMS journal articles, of the 50,879 URLs, 84.32% were accessible and 15.68% were considered vanished. The journal-wise summary of active and vanished URLs is presented in Table 3. Library and Information Science journal articles have more vanished URLs than Communication and Media Studies journal articles. This can be attributed to the larger number of DOIs used in CMS journal articles. Across the two disciplines, Media Psychology, Journal of Communication and Communication Theory, which are Communication and Media Studies journals, had the lowest proportions of vanished URLs. The highest proportion of vanished URLs was found in Portal (44.46%), which is a Library and Information Science journal. It is followed by New Media and Society (38.67%), Journal of Advertising (37.23%) and Public Understanding of Science (33.69%), which are Communication and Media Studies journals.

5.5 Year-wise distribution of active and vanished URLs


Figure 3 shows the distribution of active and vanished URLs in both disciplines. In LIS journal articles, the percentage of vanished URLs decreased gradually from 2008 (50.57%) to 2017 (9.69%). In CMS journal articles, the percentage of vanished URLs ranged from a low of 6.90% in 2017 to a high of 51.68% in 2009. This indicates that earlier papers collectively have a greater share of vanished URLs. To measure the correlation between the age of a citation and the percentage of vanished URLs, Pearson’s correlation analysis was performed. There is a positive correlation between age and the percentage of vanished URLs, and the correlation is statistically significant in both disciplines (r = 0.996, p < 0.001) (r = 0.927, p < 0.001). URLs cited in older articles are thus more likely to have vanished, which was also seen in previous studies (Sampath Kumar, B. T. & Manoj Kumar, 2012; Sampath Kumar, B. T. & Prithviraj, 2015).

5.6 Distribution of HTTP error codes


The error codes encountered for vanished URLs are presented in Table 4. In Library and Information Science journal articles, of the 11,973 inaccessible URLs, 83.10% returned the HTTP 404 error code, followed by HTTP 403 with 5.38% and HTTP 500 with 3.96%. The HTTP error codes in the “Others” category accounted for 0.7%. In Communication and Media Studies journal articles, of the 7980 inaccessible URLs, 84.40% returned the HTTP 404 error code, followed by HTTP 403 with 6.18% and HTTP 500 with 3.53%. The HTTP error codes in the “Others” category contributed 0.85% of the total error codes. The results are comparable with those of Goh and Ng (2007) and Sampath Kumar, B. T. & Manoj Kumar (2012), where most of the errors were associated with the HTTP 404 error code.
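
A tabulation like Table 4 can be produced from the status codes returned by an availability check. The Python sketch below is illustrative only. The function name `tabulate_errors` is hypothetical, and the counts used to exercise it are the LIS column of Table 4, with the code 502 standing in for the unspecified "Others" category.

```python
from collections import Counter

# Error codes reported individually in Table 4; anything else is "Others".
REPORTED_CODES = (400, 403, 404, 405, 410, 416, 500, 503)


def tabulate_errors(status_codes):
    """Map each error code (or 'Others') to (count, percentage of all errors)."""
    labels = [code if code in REPORTED_CODES else "Others" for code in status_codes]
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 2)) for label, n in counts.items()}


# LIS column of Table 4; 502 is a stand-in for the "Others" category.
lis_counts = {400: 535, 403: 644, 404: 9949, 405: 41, 410: 65,
              416: 71, 500: 474, 503: 110, 502: 84}
lis_codes = [code for code, n in lis_counts.items() for _ in range(n)]

table = tabulate_errors(lis_codes)
print(table[404])  # count and share of "Not found" errors
```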

5.7 File extension associated with active and vanished URLs


The data in Table 5 indicate that, in both disciplines, the most frequently cited URLs are .html files and the least cited are .txt files. The file format with the highest percentage of vanished URLs was the .doc file. The lowest level of loss was associated with the .html file extension.

5.8 Path depth associated with active and vanished URLs


Table 6 shows that 88.13% of URLs in LIS journal articles and 93.06% of URLs in CMS journal articles with path depth 2 were accessible. The highest percentage of vanished URLs, that is, 54.75% in LIS journal articles and 63.10% in CMS journal articles, was found among URLs with path depth >8. To examine the relationship between the path depth of the URLs and the percentage of vanished URLs, Pearson’s correlation analysis was performed. In LIS journal articles, the path depth and the percentage of vanished URLs are positively correlated (r = 0.825, p = 0.003), and the relation is statistically significant. The result is comparable with the findings of Spinellis (2003), who stated that deep path depths are related to increased URL failures. In Communication and Media Studies journal articles, the path depth and the percentage of vanished URLs are also positively correlated (r = 0.556, p = 0.951), but surprisingly the relation is not statistically significant.
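
The correlation coefficients above can be recomputed from the vanished-URL percentages in Table 6. The following Python sketch does so under one assumption that the paper does not state: the category PD > 8 is coded as depth 9. Significance is judged by comparing the t statistic with 2.306, the two-tailed critical value at the 0.05 level for 8 degrees of freedom.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


PATH_DEPTH = list(range(10))  # PD = 0..8, with PD > 8 coded as 9 (assumption)
# Percentage of vanished URLs per path depth, from Table 6.
LIS_VANISHED = [19.10, 32.02, 11.87, 35.91, 37.82, 37.44, 36.95, 46.78, 40.08, 54.75]
CMS_VANISHED = [22.38, 35.71, 6.94, 31.12, 41.42, 28.84, 30.41, 26.17, 33.71, 63.10]
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = 8

r_lis = pearson_r(PATH_DEPTH, LIS_VANISHED)
r_cms = pearson_r(PATH_DEPTH, CMS_VANISHED)
t_lis = t_statistic(r_lis, len(PATH_DEPTH))
t_cms = t_statistic(r_cms, len(PATH_DEPTH))

print(round(r_lis, 3), t_lis > T_CRITICAL)
print(round(r_cms, 3), t_cms > T_CRITICAL)
```

Under this coding the LIS coefficient comes to about 0.825, matching the reported value, and the CMS coefficient to about 0.557. The CMS t statistic falls below the critical value, which agrees with the non-significant result. It would, however, correspond to a p-value of roughly 0.1 rather than the printed 0.951, so the printed figure may be a typographical slip.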

5.9 Character length associated with active and vanished URLs


Table 7 shows the percentage of active and vanished URLs by their character length. The majority of URLs in both LIS and CMS journal articles had a character length of 31-40 or 41-50. URLs with a length of 31-40 and 41-50 were also more often active than other URLs. URLs with a length between 51 and 80 characters showed high proportions of vanished URLs in both disciplines. To examine the relation between the percentage of vanished URLs and the character length, Pearson’s correlation analysis was performed. There is a positive correlation between the percentage of vanished URLs and the character length, and the relation is statistically significant (r = 0.846, p = 0.002) and (r = 0.710, p = 0.021). This indicates that URLs with more characters are more prone to decay, which was also observed in a previous study (Sampath Kumar & Vinay Kumar, 2012).
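
The contrast between short and long URLs can be made concrete by pooling the bins of Table 7. The Python sketch below compares URLs of up to 50 characters with longer ones. The split at 50 characters and the helper name `vanished_rate` are choices made for this illustration and are not taken from the study.

```python
# (total URLs, vanished URLs) per character-length bin, from Table 7.
LIS_TABLE_7 = {
    "<20": (454, 75), "21-30": (2759, 555), "31-40": (10597, 1368),
    "41-50": (19022, 2175), "51-60": (6235, 2343), "61-70": (4390, 1863),
    "71-80": (2972, 1270), "81-90": (2035, 865), "91-100": (1236, 535),
    ">100": (2139, 924),
}
CMS_TABLE_7 = {
    "<20": (202, 28), "21-30": (1826, 318), "31-40": (13007, 864),
    "41-50": (21974, 1497), "51-60": (3811, 1520), "61-70": (2759, 1153),
    "71-80": (2426, 900), "81-90": (1730, 611), "91-100": (1107, 365),
    ">100": (2037, 724),
}
SHORT_BINS = ("<20", "21-30", "31-40", "41-50")
LONG_BINS = ("51-60", "61-70", "71-80", "81-90", "91-100", ">100")


def vanished_rate(table, bins):
    """Pooled percentage of vanished URLs over the given length bins."""
    total = sum(table[b][0] for b in bins)
    vanished = sum(table[b][1] for b in bins)
    return round(100 * vanished / total, 2)


for name, table in (("LIS", LIS_TABLE_7), ("CMS", CMS_TABLE_7)):
    print(name, vanished_rate(table, SHORT_BINS), vanished_rate(table, LONG_BINS))
```

In both disciplines the pooled vanished rate for URLs longer than 50 characters is several times the rate for shorter URLs, which is the pattern the correlation analysis summarizes.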

5.10  Top-level domain associated with active and vanished URLs


The analysis of active and vanished URLs by type of top-level domain is shown in Table 8. Nine main types of top-level domain were considered in this study: .com, country code, .edu, .gov, .info, .int, .mil, .net and .org. Top-level domains such as .name, .design and .ngo were placed in the “Others” category. The organizational top-level domain (.org) was the most used, followed by the commercial top-level domain (.com). This is consistent in both disciplines. The top-level domain with the highest proportion of vanished URLs was the information top-level domain (.info) in Library and Information Science journal articles and the military top-level domain (.mil) in Communication and Media Studies journal articles. A noteworthy finding is that a proportionally low level of loss was associated with the organizational (.org) top-level domain in both disciplines. This low level of loss can be attributed to the use of DOIs, which have .org as their top-level domain.

6. Conclusion

The present study investigated the use of web citations in 10 LIS and 10 CMS scholarly journals during 2008-2017. The percentage of web citations in articles has increased continuously in both disciplines. However, Communication and Media Studies journals have a higher proportion of web citations among their references (22.05%) than Library and Information Science journals (15.97%). The stability of a URL is an important aspect to consider when citing a web resource. URLs become valueless if other researchers cannot access them. This happens when the resources move to a new location or change their content. The present study suggests that URL decay can be reduced to some extent by the use of the Digital Object Identifier (DOI). The percentage of vanished URLs was lower in Communication and Media Studies journal articles, where DOIs accounted for 60.67% of web citations. In Library and Information Science journal articles, DOIs accounted for only 37.89%. To overcome the problem of inaccessible URLs, some suggestions need to be implemented. Authors should systematically check the URLs that they cite in their scholarly publications, remove or update vanished URLs, and maintain a digital backup of the web pages that they cite. Apart from the authors, editors and publishers should take responsibility for checking the availability of URLs before publication. Publishers can also use a URL shortening service to reduce the length of URLs, as shorter URLs tend to remain active. Web archives can also be used to preserve web resources. Authors, publishers and editorial teams should make sure that the resources cited in scholarly work remain available to future researchers without hindrance.

Table 1
Journals selected for the study

| Library and Information Science journal | Impact factor | Communication and Media Studies journal | Impact factor |
|---|---|---|---|
| Journal of Informetrics (JOI) | 3.484 | Journal of Computer-Mediated Communication (JCMC) | 4.00 |
| Information Processing and Management (IPM) | 3.444 | Journal of Communication (JOC) | 3.729 |
| Journal of the Association for Information Science and Technology (JASIST) | 2.835 | Communication Research (CR) | 3.391 |
| Scientometrics | 2.173 | New Media and Society (NMS) | 3.121 |
| College and Research Libraries (CRL) | 1.626 | Information, Communication and Society (ICS) | 3.084 |
| Journal of the Medical Library Association (JMLA) | 1.541 | Journal of Advertising (JOA) | 2.88 |
| Portal: Libraries and the Academy (Portal) | 1.473 | Political Communication (PC) | 2.738 |
| Aslib Journal of Information Management (AJIM) | 1.461 | Communication Theory (CT) | 2.733 |
| Journal of Academic Librarianship (JAL) | 1.459 | Media Psychology (MP) | 2.57 |
| Library and Information Science Research (LISR) | 1.372 | Public Understanding of Science (PUOS) | 2.452 |

Table 2
Journal-wise distribution of articles, references and web citations

| LIS journal | Total articles | Total references | Total web citations | % web citations | CMS journal | Total articles | Total references | Total web citations | % web citations |
|---|---|---|---|---|---|---|---|---|---|
| JOI | 647 | 24901 | 3546 | 14.24 | JCMC | 337 | 17946 | 4548 | 25.34 |
| IPM | 720 | 31553 | 2798 | 8.87 | JOC | 469 | 25811 | 8631 | 33.44 |
| JASIST | 1744 | 87468 | 10301 | 11.78 | CR | 401 | 24503 | 4781 | 19.51 |
| Scientometrics | 2575 | 96137 | 14261 | 14.83 | NMS | 794 | 40663 | 6468 | 15.91 |
| CRL | 376 | 13376 | 2722 | 20.35 | ICS | 704 | 34782 | 13233 | 38.05 |
| JMLA | 241 | 6293 | 2255 | 35.83 | JOA | 350 | 19427 | 940 | 4.84 |
| Portal | 288 | 10412 | 2906 | 27.91 | PC | 254 | 14715 | 2396 | 16.28 |
| AJIM | 378 | 15536 | 2682 | 17.26 | CT | 213 | 15348 | 3274 | 21.33 |
| JAL | 698 | 22658 | 6999 | 30.89 | MP | 212 | 12495 | 4147 | 33.19 |
| LISR | 319 | 16302 | 3369 | 20.67 | PUOS | 531 | 25102 | 2461 | 9.8 |
| Total | 7986 | 324636 | 51839 | 15.97 | Total | 4265 | 230792 | 50879 | 22.05 |

Table 3
Journal-wise distribution of active and vanished URLs

| LIS journal | Total URLs | Active URLs | % | Vanished URLs | % | CMS journal | Total URLs | Active URLs | % | Vanished URLs | % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| JOI | 3546 | 3163 | 89.20 | 383 | 10.80 | JCMC | 4548 | 3861 | 84.89 | 687 | 15.11 |
| IPM | 2798 | 2236 | 79.91 | 562 | 20.09 | JOC | 8631 | 7998 | 92.67 | 633 | 7.33 |
| JASIST | 10301 | 7842 | 76.13 | 2459 | 23.87 | CR | 4781 | 4323 | 90.42 | 458 | 9.58 |
| Scientometrics | 14261 | 11956 | 83.84 | 2305 | 16.16 | NMS | 6468 | 3967 | 61.33 | 2501 | 38.67 |
| CRL | 2722 | 1864 | 68.48 | 858 | 31.52 | ICS | 13233 | 11545 | 87.24 | 1688 | 12.76 |
| JMLA | 2255 | 1751 | 77.65 | 504 | 22.35 | JOA | 940 | 590 | 62.77 | 350 | 37.23 |
| Portal | 2906 | 1614 | 55.54 | 1292 | 44.46 | PC | 2396 | 2116 | 88.31 | 280 | 11.69 |
| AJIM | 2682 | 1804 | 67.26 | 878 | 32.74 | CT | 3274 | 2969 | 90.68 | 305 | 9.32 |
| JAL | 6999 | 5187 | 74.11 | 1812 | 25.89 | MP | 4147 | 3898 | 94.00 | 249 | 6.00 |
| LISR | 3369 | 2449 | 72.69 | 920 | 27.31 | PUOS | 2461 | 1632 | 66.31 | 829 | 33.69 |
| Total | 51839 | 39866 | 76.90 | 11973 | 23.10 | Total | 50879 | 42899 | 84.32 | 7980 | 15.68 |

Table 4
Distribution of HTTP error codes

| HTTP error code | LIS: number of vanished URLs | LIS: % | CMS: number of vanished URLs | CMS: % |
|---|---|---|---|---|
| 400 | 535 | 4.47 | 159 | 1.99 |
| 403 | 644 | 5.38 | 493 | 6.18 |
| 404 | 9949 | 83.1 | 6735 | 84.4 |
| 405 | 41 | 0.34 | 53 | 0.66 |
| 410 | 65 | 0.54 | 61 | 0.76 |
| 416 | 71 | 0.59 | 32 | 0.4 |
| 500 | 474 | 3.96 | 282 | 3.53 |
| 503 | 110 | 0.92 | 98 | 1.23 |
| Others | 84 | 0.7 | 67 | 0.85 |
| Total | 11973 | 100 | 7980 | 100 |

Table 5
File extension associated with active and vanished URLs

| File extension | LIS total URLs | LIS active URLs | % | LIS vanished URLs | % | CMS total URLs | CMS active URLs | % | CMS vanished URLs | % |
|---|---|---|---|---|---|---|---|---|---|---|
| .asp | 1056 | 542 | 51.33 | 514 | 48.67 | 1022 | 438 | 42.86 | 584 | 57.14 |
| .cfm | 626 | 410 | 65.50 | 216 | 34.50 | 314 | 224 | 71.34 | 90 | 28.66 |
| .cgi | 114 | 71 | 62.28 | 43 | 37.72 | 28 | 18 | 64.29 | 10 | 35.71 |
| .doc | 127 | 47 | 37.01 | 80 | 62.99 | 48 | 16 | 33.33 | 32 | 66.67 |
| .html | 40401 | 33850 | 83.79 | 6551 | 16.21 | 44388 | 39632 | 89.29 | 4756 | 10.71 |
| .jsp | 164 | 97 | 59.15 | 67 | 40.85 | 164 | 129 | 78.66 | 35 | 21.34 |
| .pdf | 7736 | 3921 | 50.69 | 3815 | 49.31 | 3624 | 1714 | 47.30 | 1910 | 52.70 |
| .php | 1213 | 730 | 60.18 | 483 | 39.82 | 1205 | 681 | 56.51 | 524 | 43.49 |
| .txt | 18 | 14 | 77.78 | 4 | 22.22 | 25 | 14 | 56.00 | 11 | 44.00 |
| Others | 384 | 184 | 47.92 | 200 | 52.08 | 61 | 33 | 54.10 | 28 | 45.90 |
| Total | 51839 | 39866 | 76.90 | 11973 | 23.10 | 50879 | 42899 | 84.32 | 7980 | 15.68 |

Table 6
Path depth associated with active and vanished URLs

| Path depth | LIS total URLs | LIS active URLs | % | LIS vanished URLs | % | CMS total URLs | CMS active URLs | % | CMS vanished URLs | % |
|---|---|---|---|---|---|---|---|---|---|---|
| PD = 0 | 1403 | 1135 | 80.90 | 268 | 19.10 | 764 | 593 | 77.62 | 171 | 22.38 |
| PD = 1 | 3910 | 2658 | 67.98 | 1252 | 32.02 | 2456 | 1579 | 64.29 | 877 | 35.71 |
| PD = 2 | 27360 | 24113 | 88.13 | 3247 | 11.87 | 33975 | 31618 | 93.06 | 2357 | 6.94 |
| PD = 3 | 8021 | 5141 | 64.09 | 2880 | 35.91 | 5610 | 3864 | 68.88 | 1746 | 31.12 |
| PD = 4 | 5513 | 3428 | 62.18 | 2085 | 37.82 | 3134 | 1836 | 58.58 | 1298 | 41.42 |
| PD = 5 | 2767 | 1731 | 62.56 | 1036 | 37.44 | 2635 | 1875 | 71.16 | 760 | 28.84 |
| PD = 6 | 1494 | 942 | 63.05 | 552 | 36.95 | 1279 | 890 | 69.59 | 389 | 30.41 |
| PD = 7 | 761 | 405 | 53.22 | 356 | 46.78 | 577 | 426 | 73.83 | 151 | 26.17 |
| PD = 8 | 252 | 151 | 59.92 | 101 | 40.08 | 178 | 118 | 66.29 | 60 | 33.71 |
| PD > 8 | 358 | 162 | 45.25 | 196 | 54.75 | 271 | 100 | 36.90 | 171 | 63.10 |
| Total | 51839 | 39866 | 76.90 | 11973 | 23.10 | 50879 | 42899 | 84.32 | 7980 | 15.68 |

Table 7
Character length associated with active and vanished URLs

| Character length | LIS total URLs | LIS active URLs | % | LIS vanished URLs | % | CMS total URLs | CMS active URLs | % | CMS vanished URLs | % |
|---|---|---|---|---|---|---|---|---|---|---|
| <20 | 454 | 379 | 83.48 | 75 | 16.52 | 202 | 174 | 86.14 | 28 | 13.86 |
| 21-30 | 2759 | 2204 | 79.88 | 555 | 20.12 | 1826 | 1508 | 82.58 | 318 | 17.42 |
| 31-40 | 10597 | 9229 | 87.09 | 1368 | 12.91 | 13007 | 12143 | 93.36 | 864 | 6.64 |
| 41-50 | 19022 | 16847 | 88.57 | 2175 | 11.43 | 21974 | 20477 | 93.19 | 1497 | 6.81 |
| 51-60 | 6235 | 3892 | 62.42 | 2343 | 37.58 | 3811 | 2291 | 60.12 | 1520 | 39.88 |
| 61-70 | 4390 | 2527 | 57.56 | 1863 | 42.44 | 2759 | 1606 | 58.21 | 1153 | 41.79 |
| 71-80 | 2972 | 1702 | 57.27 | 1270 | 42.73 | 2426 | 1526 | 62.90 | 900 | 37.10 |
| 81-90 | 2035 | 1170 | 57.49 | 865 | 42.51 | 1730 | 1119 | 64.68 | 611 | 35.32 |
| 91-100 | 1236 | 701 | 56.72 | 535 | 43.28 | 1107 | 742 | 67.03 | 365 | 32.97 |
| >100 | 2139 | 1215 | 56.80 | 924 | 43.20 | 2037 | 1313 | 64.46 | 724 | 35.54 |
| Total | 51839 | 39866 | 76.90 | 11973 | 23.10 | 50879 | 42899 | 84.32 | 7980 | 15.68 |

Table 8
Top-level domain associated with active and vanished URLs

| Top-level domain | LIS total URLs | LIS active URLs | % | LIS vanished URLs | % | CMS total URLs | CMS active URLs | % | CMS vanished URLs | % |
|---|---|---|---|---|---|---|---|---|---|---|
| .com | 6728 | 4729 | 70.29 | 1999 | 29.71 | 6960 | 4889 | 70.24 | 2071 | 29.76 |
| country code | 6563 | 3547 | 54.05 | 3016 | 45.95 | 4112 | 2332 | 56.71 | 1780 | 43.29 |
| .edu | 4200 | 2525 | 60.12 | 1675 | 39.88 | 1871 | 1257 | 67.18 | 614 | 32.82 |
| .gov | 1849 | 1246 | 67.39 | 603 | 32.61 | 695 | 432 | 62.16 | 263 | 37.84 |
| .info | 185 | 90 | 48.65 | 95 | 51.35 | 52 | 33 | 63.46 | 19 | 36.54 |
| .int | 168 | 123 | 73.21 | 45 | 26.79 | 138 | 105 | 76.09 | 33 | 23.91 |
| .mil | 21 | 13 | 61.90 | 8 | 38.10 | 12 | 5 | 41.67 | 7 | 58.33 |
| .net | 1458 | 1161 | 79.63 | 297 | 20.37 | 687 | 394 | 57.35 | 293 | 42.65 |
| .org | 30656 | 26422 | 86.19 | 4234 | 13.81 | 36351 | 33452 | 92.02 | 2899 | 7.98 |
| Others | 11 | 10 | 90.91 | 1 | 9.09 | 1 | 0 | 0.00 | 1 | 100.00 |
| Total | 51839 | 39866 | 76.90 | 11973 | 23.10 | 50879 | 42899 | 84.32 | 7980 | 15.68 |

Figure 1
Year-wise distribution of web citations

Figure 2
Distribution of URL and DOIs

Figure 3
Active and vanished URLs

References

[1] Goh, D.H., & Ng, P.K. Link decay in leading information science journals, Journal of the American Society for Information Science and Technology, 58(1), 2007, 15–24. https://doi.org/10.1002/asi.20513
[2] Isfandyari-Moghaddam, A., Saberi, M. K. & Mohammad Esmaeel, S. Availability and Half-life of Web References Cited in Information Research Journal: A Citation Study, International Journal of Information Science and Management, 8(2), 2010, 57–75.
[3] Lawrence, S. Free online availability substantially increases a paper’s impact, Nature, 411(6837), 2001, 521.
[4] Mardani, A. An investigation of the web citations in Iran’s chemistry articles in SCI, Library Review, 61(1), 2012, 18–29.
[5] Markwell, J., & Brooks D. W. “Link rot” limits the usefulness of web-based educational materials in biochemistry and molecular biology, Biochemistry and Molecular Biology Education, 31(1), 2003, 69–72.
[6] Prithvi Raj, K. R., & Sampath Kumar, B. T. Web Citation Trends in Indian LIS Journals: A Citation Analysis, COLLNET Journal of Scientometrics and Information Management, 9(2), 2015, 295–310.
[7] Saberi, M. K., & Abedi, H. Accessibility and decay of web citations in five open access ISI journals, Internet Research, 22(2), 2012, 234–247. https://doi.org/10.1108/10662241211214584
[8] Sadat-Moosavi, A., Isfandyari-Moghaddam, A., & Tajeddini, O. Accessibility of online resources cited in scholarly LIS journals: A study of Emerald ISI-ranked journals, Aslib Proceedings, 64(2), 2012, 178–192. https://doi.org/10.1108/00012531211215196
[9] Sampath Kumar, B. T., & Manoj Kumar, K. S. Persistence and half-life of URL citations cited in LIS open access journals, Aslib Proceedings, 64(4), 2012, 405–422. https://doi.org/10.1108/00012531211244752
[10] Sampath Kumar, B. T., & Prithviraj, K. R. Bringing life to dead: Role of Wayback Machine in retrieving vanished URLs, Journal of Information Science, 41(1), 2015, 71–81.
[11] Sampath Kumar, B. T., & Vinay Kumar, D. HTTP 404-page (not) found: Recovery of decayed URL citations, Journal of Informetrics, 7(1), 2012, 145–157.
[12] Spinellis, D. The Decay and Failures of Web References, Communications of the ACM, 46(1), 2003, 71–77.
[13] Vinay Kumar, D., & Sampath Kumar, B. T. Prevalence of URLs in Library and Information Science (LIS) Literature: A Citation Analysis, COLLNET Journal of Scientometrics and Information Management, 11(2), 2017, 287–297.
[14] Wu, Z. An empirical study of the accessibility of web references in two Chinese academic journals, Scientometrics, 78(3), 2008, 481–503.
[15] Yang, S., Qiu, J., & Xiong, Z. An empirical study on the utilization of web academic resources in humanities and social sciences based on web citations, Scientometrics, 84(1), 2010, 1–19. https://doi.org/10.1007/s11192-009-0142-7
[16] Zhang, Y. The Effect of Open Access on Citation Impact: A Comparison Study Based on Web Citation Analysis, Libri, 56(3), 2007, 145–156.
