How to sort trustworthy health online information? Improvements of the automated detection of HONcode criteria

Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science
www.elsevier.com/locate/procedia

CENTERIS - International Conference on ENTERprise Information Systems / ProjMAN - International Conference on
Project MANagement / HCist - International Conference on Health and Social Care Information Systems and
Technologies, CENTERIS / ProjMAN / HCist 2017, 8-10 November 2017, Barcelona, Spain

Célia Boyer a,*, Cédric Frossard a, Arnaud Gaudinat b, Allan Hanbury c, Gilles Falquet d

a Health On The Net Foundation, Geneva, Switzerland
b BiTeM group, HES-SO/HEG Geneva, Information Science Department, Carouge, Switzerland
c Institute of Software Technology & Interactive Systems, TU Wien, Vienna, Austria
d Faculty of economics and social sciences, University of Geneva, Geneva, Switzerland

Abstract
When searching on the Web, laypersons and professionals have difficulty in determining the quality or trustworthiness of
health websites. No simple approach can differentiate between trustworthy and unreliable health websites in the results
provided by major search engines such as Google, Yahoo or Bing. The European project Kconnect proposes to classify the
reliability and readability levels of health-related websites and pages using automated tools: reliability according to
the detection of the criteria of the Health On the Net Foundation (HON) Code of Conduct (HONcode), and readability
according to how technical the health information contained in each document is. This article focuses on the
enhancement of the automated detection of HONcode criteria in real-time settings. Applications of the approach include
integration into the HONcode certification process, and embedding it as a generic filtering tool into user-centered
health domain search engines as well as into major general search engines, for example as a Web browser extension.
© 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the CENTERIS - International Conference on ENTERprise
Information Systems / ProjMAN - International Conference on Project MANagement / HCist - International Conference on
Health and Social Care Information Systems and Technologies.
* Corresponding author. E-mail address: celia.boyer@healthonnet.org
Célia Boyer et al. / Procedia Computer Science 121 (2017) 940–949. ISSN 1877-0509. DOI 10.1016/j.procs.2017.11.122
Keywords: Quality criteria, Trust, Learning approach, HONcode, search engine
1. Introduction
Accessing quality online health content requires the ability to define and identify reliable health-related content.
This problem is particularly pertinent to public health [1]. Previous efforts have attempted to automatically label online
health pages according to the quality of their information [2,3,4,5,6,7].
1.1. The HONcode certification as the current solution of trust for health online information
The HONcode Code of Conduct [8] from the Health On the Net Foundation (http://www.HealthOnNet.org) is a set
of ethical, honesty, transparency and quality standards applicable to the processes of creating and maintaining health
website content (see Table 1). The HONcode criteria provide acknowledged indicators of the accuracy and reliability
of health websites [9, 10, 11, 12].
Table 1. The eight HON Code of Conduct (HONcode) criteria for medical and health websites (http://www.hon.ch/Conduct.html)
and a sample of content extracted during the evaluation by the HON assessor that documents how each principle was met.
For each HONcode principle: a) Detail; b) Sample of the extracts stored while certifying nihseniorhealth.gov.

HC1-Authority
a) Indicate the qualifications of the authors
b) Site developed by the National Institute on Aging and the National Library of Medicine of the National Institutes of
   Health (NIH). NIHSeniorHealth features authoritative and up-to-date health information from Institutes and Centers at
   NIH. The American Geriatrics Society also provides expert independent review of some material found on this Web site.
   http://nihseniorhealth.gov/background.html

HC2-Complementarity
a) Information should support, not replace, the doctor-patient relationship
b) Medical Information: NLM does not attempt to provide specific medical advice, but rather to provide users with
   information to better understand their health and their diagnosed disorders. NLM urges you to consult with a
   qualified physician for diagnosis & answers to personal questions.
   https://nihseniorhealth.gov/copyright.html

HC3-Privacy policy
a) Respect the privacy and confidentiality of personal data submitted to the site by the visitor
b) We do not collect any personally identifiable information (PII) about you when you visit our Web sites unless you
   choose to provide that information to us. We do collect some data about your visit to our Web site to help us better
   understand how the public uses the site and how to make it more helpful. NLM never collects information for
   commercial marketing or any purpose unrelated to NLM's functions.
   https://nihseniorhealth.gov/privacy.html

Attribution criteria: HC4-reference
a) Cites the source(s) of published information
b) "The information in this topic was provided by the National Eye Institute"
   https://nihseniorhealth.gov/cataract/whatisacataract/01.html

Attribution criteria: HC9-date
a) Dates medical and health pages
b) Topic last reviewed: May 2016
   https://nihseniorhealth.gov/depression/aboutdepression/01.html

HC5-Justifiability
a) Backs up claims relating to benefits and performance
b) Choosing the right medication, medication dose, and treatment plan should be based on a person's individual needs
   and medical situation, and done under an expert’s care. Only an expert clinician can help you decide whether the
   medicine’s ability to help is worth the risk of a side effect. Your doctor may try several medicines before finding
   the right one.
   https://nihseniorhealth.gov/anxietydisorders/medication/01.html

HC6-Transparency
a) Presentation is accessible; contact information is present
b) Contact Us - https://nihseniorhealth.gov/about.html
   https://support.nlm.nih.gov/ics/support/ticketnewwizard.asp?style=classic&deptID=28054&category=nih_seniorhealthm&_ga=2.93009561.567164154.1494170552-1152880118.1494168960

HC7-Financial disclosure
a) Identifies funding sources
b) This site was developed by the National Institute on Aging (NIA) and the National Library of Medicine (NLM) both
   part of the National Institutes of Health (NIH).
   http://nihseniorhealth.gov/background.html

HC8-Advertising policy
a) Clearly distinguishes advertising from editorial content
b) Pop-Up Advertisements: When visiting our Web site, your Web browser may produce pop-up advertisements. These
   advertisements were most likely produced by other Web sites you visited or by third party software installed on your
   computer.
   https://nihseniorhealth.gov/copyright.html
Sites must first submit a formal application for HONcode certification. Experts then manually check the site for
compliance with all HON principles. After determination that the site is committed to and respects the HONcode, it
can display the HONcode seal (Figure 1). The site is checked on a regular basis to ensure that it is still compliant.
However, HON relies on the community at large to report misuse of the label or non-respect of a principle.
Fig. 1: HONcode quality and ethical label displayed on certified websites
1.2. Limitations of using a solution such as HONcode certification on a large scale
The HONcode certification process requires that health website editors are both motivated and committed. They
need to invest time, but will receive no direct financial return or other personal incentives for certification. During
HON review, health website editors agree to comply with the HONcode, and must respond to any HON requests to
improve compliance of the site. The manual evaluation process is both time consuming and costly, which limits HON
certification to only a few thousand health websites. While the HONcode represents an indicator of accuracy
[9], the proportion of certified health websites is limited and varies by health domain. Wozney et al. [10] found that
24% of selected websites on child and adolescent mental health were HONcode certified, while Davis et al. [11] found
only 10% coverage for selected paediatric vascular anomaly sites, as was also reported for hip arthroscopy sites [12].
Many high-quality websites are not HONcode certified. This limits the usefulness of HONcode accreditation status in some
health domains [12]. In 2014, HON changed the annual assessment of certified websites to include a contribution-based
program, thus helping to finance HON’s offering of ongoing (post-initial) certification. Prices range from 50 euros
for not-for-profit websites to 325 euros for high-ranked commercial websites. In summary, HON filtering of
trustworthy health websites has been limited by the traditional, entirely manual evaluation process (Fig. 2 (a)). The
continuous surveillance of already certified websites is limited to random evaluation and automated
comparisons of content within a time frame. This comparison reveals modifications of content but not necessarily
whether the changes affect adherence to the HONcode.
1.3. Automated detection of HONcode criteria: an alternative to the HONcode certification process limitation
Motivations for developing automated detection of adherence to HONcode criteria include identifying more health
websites meeting HON criteria, enhancing ongoing surveillance of HONcode certified websites, and assisting the
work of manual HONcode assessors on both tasks, as illustrated in Fig. 2 (b). The Kconnect project† sponsored research
focused on the development of automated HONcode compliance detection tools, consisting of classifiers that filter
information according to the HON criteria.
† KConnect is funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No 644753 http://Kconnect.e
Fig. 2 (a) Current situation with only the manual certification program; (b) Application of automated HONcode criteria detection.
2. Motivation and Early Implementation
The automated system for the detection of HONcode items comprised nine distinct machine learning classifiers,
one for each HONcode criterion (the Attribution criterion requires two distinct parts: Date Attribution and References)
[13,14,15]. The excerpts previously extracted manually by HONcode experts served as the collection, split into 80%
training and 20% test data. Those excerpts justified how the website met each criterion (see Table 1 b). Based on
earlier work, a Naïve Bayes learning algorithm with single-word tokenization and a space reduction of 70% was used
for the classifiers [13,14]. Evaluation yielded good results for the Complementarity and Date Attribution criteria, with
a precision of 83% and 94% respectively, and a recall of 95% for both.
In [14] the automated detection of the HONcode criteria was compared with the manual process. System precision was
good (80% to 100%, except for HC4-Reference and HC5-Justifiability). However, the results were disappointing for recall,
especially for Complementarity and Date Attribution. Two distinct problems caused poor system performance on real-life
examples. First, the system could not detect Date Attribution if the date was displayed as numbers only (with no
adjunct text), e.g., 15/07/2016. Second, recall was low for the Complementarity criterion: the relevant content was not
detected because it was lost within the large amount of information present on the page parsed by the classifier.
Fig. 3 illustrates this limitation for some cases.
Fig. 3 Example of text related to the Complementarity criterion: a) in blue, text not detected by the automated
system; b) the red rectangle illustrating the sliding window; c) webpage name and address.
Webpage descriptions of terms of service or legal policy (see the example webpage‡ in Fig. 3) typically contained a
large body of text related to different criteria, such as HC1-Authority, HC2-Complementarity, HC3-Privacy, and
HC8-Advertising policies. The Complementarity criterion corresponds to 4% of the content of the page in Fig. 3, which
means that it is not prominent on the page and other criteria were proportionally more significant. The classifier
therefore detected the information related to other principles but not HC2-Complementarity. We named this problem the
“submerged content” issue. Further developments were needed to overcome this limitation and are described in the
Methods section below.
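The single-word Naïve Bayes classification described in this section can be sketched as follows. This is a minimal illustration and not the classifiers evaluated in [13,14]: the class name `NaiveBayes`, the add-one smoothing, the two labels and the four training sentences are assumptions made for the example, and the 70% space reduction is not reproduced.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Single-word tokenization: lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative assumption)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy excerpts; the real training data are the assessors' extracts.
train = [
    ("This information is not intended to replace the advice of your doctor", "HC2"),
    ("Always consult your physician before starting any treatment", "HC2"),
    ("We do not collect personally identifiable information about visitors", "other"),
    ("This page was last reviewed in May 2016", "other"),
]
clf = NaiveBayes().fit(*zip(*train))
print(clf.predict("Consult your doctor for medical advice"))
```

When the same classifier is fed a whole legal page in which only a few percent of the words concern Complementarity, the word counts of the other criteria dominate the score, which is one way to picture the “submerged content” issue.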
3. Methods
3.1. Automated Date recognition
Because the date criterion involves structured data, we replaced its machine learning classifier with the Named Entity
Recognition (NER) tool of the OpenNLP toolkit [16]. The NER model was trained on the 2,794 excerpts of the
Date attribution criterion (examples in Table 2).
Table 2: Samples of excerpts related to the date criterion taken from the set of 2,794 training excerpts for this criterion.
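The kind of span a date recognizer has to return, including the numbers-only case that the earlier classifier missed, can be illustrated with a small pattern-based detector. This is explicitly not the OpenNLP NER model used in this work: it is a regular-expression stand-in, and the patterns and the function name `find_dates` are assumptions chosen for the example.

```python
import re

MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")

# Illustrative patterns only; a trained NER model learns such contexts from data.
DATE_PATTERNS = [
    re.compile(r"\b\d{1,2}[/.-]\d{1,2}[/.-]\d{4}\b"),           # 15/07/2016
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),                       # 2016-07-15
    re.compile(r"\b(?:\d{1,2}\s+)?" + MONTHS + r"\s+\d{4}\b"),  # May 2016, 6 May 2016
]


def find_dates(text):
    """Return (start, end, matched_text) for every date-like span, in text order."""
    spans = []
    for pattern in DATE_PATTERNS:
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), match.group()))
    return sorted(spans)


for sample in ["Topic last reviewed: May 2016", "15/07/2016", "Contact us"]:
    print(sample, "->", [span[2] for span in find_dates(sample)])
```

Returning the matched span, rather than only a yes/no decision for the page, mirrors the advantage noted for NER: the text related to the detected date is available to the assessor.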
3.2. HC2- Complementarity criterion
In previous work, the context analysis was restricted to the sentence level [15], but this approach gave a high rate
of false positives because the context was not large enough to be sufficiently discriminant for Complementarity (as
noted above). We adopted a window-based classification unit with an empirically determined maximum of 500 characters
(trimmed to word boundaries) and slid it progressively 250 characters at a time. The motivation for such
an approach appears in the n-gram tokenization article [17].
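The window-based unit can be sketched as below. It is a minimal reading of the description (at most 500 characters, edges trimmed to word boundaries, start advanced by 250 characters); the function names and the `classify` callback are hypothetical, and the exact boundary handling of the original system is not documented here.

```python
def sliding_windows(text, size=500, step=250):
    """Yield windows of at most `size` characters, starting every `step`
    characters and trimmed so that no word is cut at either edge."""
    n = len(text)
    start = 0
    while start < n:
        begin = start
        # If the left edge falls inside a word, skip to the end of that word.
        if begin > 0 and not text[begin - 1].isspace():
            while begin < n and not text[begin].isspace():
                begin += 1
        end = min(start + size, n)
        # If the right edge falls inside a word, move back to the previous space.
        if end < n and not text[end].isspace():
            while end > begin and not text[end - 1].isspace():
                end -= 1
        window = text[begin:end].strip()
        if window:
            yield window
        if start + size >= n:
            break
        start += step


def detect_in_windows(text, classify, size=500, step=250):
    """Report the criterion as present if any single window is classified positive.
    `classify` is a hypothetical callable returning True or False for a text unit."""
    return any(classify(window) for window in sliding_windows(text, size, step))


demo = "one two three four five six seven"
print(list(sliding_windows(demo, size=13, step=6)))
```

Because consecutive windows overlap by half, a short passage that straddles one window edge is still seen whole in the neighbouring window, while each unit stays small enough that the passage is not outweighed by the rest of the page.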
To test the effectiveness of the above-described methods for both the Date attribution and Complementarity criteria,
we used the same set of 27 health websites (more than 10,000 webpages) used for the previous experiments [14]. The 27
websites were used to identify differences between the previous machine learning method used in the article [14,15]
and the NER method used in this research. The main motivation for using the same collection testing environment
was to be able to compare system performance before/after integrating new approaches.
3.3. Classification of Webpages according to a combination between meta information and structured data
Because 80% of health consumers use the web to obtain information, having pertinent and trustworthy sites listed
at or near the top of the Search Engine Results Pages (SERPs) is crucial. Optimizing search engine indexing obliges
website editors to use structured data such as Dublin Core or Schema.org Microdata, RDF or JSON-LD formats to
add information to the website content. These structured data are additional content that is not visible to the user
but is machine-understandable. The HONcode database provides a clear picture of how health websites structure
their content and how they comply with the HONcode criteria. As a result, we created a classification of webpage
names/titles according to related HON principles. For example, the webpage https://nihseniorhealth.gov/privacy.html,
with the word “copyright” and the element <meta name="DC.Title" content="NIHSeniorHealth: Privacy Policy" />,
will mainly correspond to the criterion HC3-Privacy policy. The correspondence is based on the extracts retrieved by
the HON assessors during HONcode certification, the meta information of the webpage such as the Title content, and
the associated stored URL (Fig. 3). This information complements the automated HONcode detection system by offering
an alternative when the automated system is not able to identify a HONcode criterion although it exists on the
site.
Table 4 data (HC2-Complementarity; manual evaluation of 27 websites: + / -; automated detection in %)
Detection unit    +    -    Precision   Recall   Accuracy
Document          26   1    100         19       22
Sentence          26   1    96          85       81
Sliding window    26   1    96          92       89
‡ http://www.alere.com/en/home/policies.html
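The combination of page name and meta information described in Section 3.3 can be sketched as follows. The keyword-to-criterion mapping is a hypothetical stand-in for the classification derived from the HONcode database, which is not reproduced in this article; only the standard-library HTML parser is used.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical mapping for illustration; the real one comes from assessor extracts.
KEYWORD_TO_CRITERION = {
    "privacy": "HC3-Privacy policy",
    "advertising": "HC8-Advertising policy",
    "disclaimer": "HC2-Complementarity",
    "about": "HC1-Authority",
}


class MetaTitleParser(HTMLParser):
    """Collect the content of <meta name="DC.Title"> and <title> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "dc.title":
            self.titles.append((attrs.get("content") or "").strip())
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def candidate_criteria(url, html):
    """Suggest criteria from the page name in the URL and the page titles."""
    parser = MetaTitleParser()
    parser.feed(html)
    page_name = urlparse(url).path.rsplit("/", 1)[-1]
    haystack = " ".join([page_name] + parser.titles).lower()
    return sorted({c for k, c in KEYWORD_TO_CRITERION.items() if k in haystack})


page = '<head><meta name="DC.Title" content=" NIHSeniorHealth: Privacy Policy " /></head>'
print(candidate_criteria("https://nihseniorhealth.gov/privacy.html", page))
```

A page whose name or title maps to many criteria gives little evidence for any single one, so a practical use of such a mapping would keep only the pages that yield one or two candidates.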
4. Results of recent changes to HONcode classification algorithms
The comparative results are described according to precision, recall and accuracy. Table 3 indicates that the NER
approach resulted in lower precision (79% vs. 100%) than the machine learning approach. However, NER had
higher recall (90% vs. 24%) and higher accuracy (74% vs. 41%) than machine learning.
Table 3 Results using different approaches for the HC4-Date principle: sentence, whole document and NER approaches
(manual evaluation of 27 websites: + / -; automated detection in %)
Detection unit/method    +    -    Precision   Recall   Accuracy
Document                 21   6    100         24       41
NER                      21   5    79          90       74
NER offers an effective compromise between precision and recall and outperforms machine learning classification at the
document level, making it the better choice for future use on HC4-Date extraction. Furthermore, NER has the
advantage of extracting the text related to the detected date criterion. The sliding window enhances detection of a
short and close succession of terms within a document, as shown in Table 4. The results show that both the Sentence
and Window approaches have slightly lower precision (96% compared to 100% for Document). On the other hand, the
increase in recall for both approaches compared to Document is substantial (85% and 92% for
Sentence and Window respectively vs. 19% for the Document unit). The same holds for accuracy (81% and 89% vs. 22%).
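The three measures can be recomputed from site-level confusion counts. The counts below are not reported in this article: they are one set inferred to be consistent with the published percentages for 27 websites, and should be read as an assumption.

```python
def metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) as rounded percentages."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return tuple(round(100 * value) for value in (precision, recall, accuracy))


# Inferred counts (assumption): tp, fp, fn, tn per detection unit.
inferred = {
    "HC4-Date, Document":        (5, 0, 16, 6),
    "HC2, Document":             (5, 0, 21, 1),
    "HC2, Sentence":             (22, 1, 4, 0),
    "HC2, Sliding window":       (24, 1, 2, 0),
}
for unit, counts in inferred.items():
    print(unit, metrics(*counts))
```

Under these counts the Document unit for Complementarity finds only 5 of the 26 compliant sites, which is why its precision can be perfect while recall and accuracy stay low.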
A first evaluation of the classification of webpage names shows that some page names apply to several HONcode criteria,
which makes the approach less discriminative. The classification has therefore been applied only to webpages related to
one or two criteria. The results have just been introduced in the integration phase of the Health Trust indicator
available in the Kconnect search (http://search.kconnect.eu).
Preliminary exploration of structured meta content suggests very good results for a limited set of criteria
such as HC1-Authority, HC2-Complementarity, HC3-Privacy policy and HC8-Advertising. A detailed evaluation of the
results of this strategy on real webpages will be the subject of another publication. This strategy of using meta
information continues the research work conducted on the names of webpages [4].
Table 4 Results using different approaches for the HC2-Complementarity principle: whole document, sentence, and sliding window approaches
5. Discussion of automated NER vs machine learning vs manual approaches
The comparison between the manual certification of whole health websites and the automatic identification of
criteria on “crawled” webpages is biased by the crawler capability and the level of application of the criteria guidelines
by the HONcode assessors. In a few cases, the date is not present on all health pages of the site, which is not compliant
with the HONcode criterion. However, the automated system might nevertheless identify the criterion on a page. For the
Complementarity criterion, current results show that the Sentence approach is comparable to the Window approach.
The sliding window avoids over-detection of the HC2-Complementarity criterion based on the presence of terms such
as “medical doctor” and “healthcare professional”, which are statistically discriminant for the HC2-Complementarity
criterion.
Fig. 4 Health Trust indicator results in results provided by Bing.
The multilingual nature of the Web has not yet been addressed. Current research involved only English health
websites. Further research must follow, as the Web is multilingual even if the English language prevails. Google
currently displays, as a beta version, the ClaimReview schema.org mark-up in its “Fact Check” service for Google
News§. HON should investigate the feasibility of such an initiative in the health domain.
§ https://blog.google/products/search/fact-check-now-available-google-search-and-news-around-world/
6. Integration
6.1. Enriching search results
Figure 4 illustrates how a browser extension can enrich search results for common popular search engines, such as
Google or Bing. Search engines rank results according to popularity, not trustworthiness. The main issue in
implementing such a web browser extension is that the webpages first need to be classified by the automated HONcode
system, so that the index is enriched with the classification. It is not known in advance which websites will be
retrieved and appear in the top results of the Search Engine Result Pages, and Kconnect does
not have the same crawling and indexing capacity as major search tools. So, in order to enable automated trust level
rankings via the Health Trust indicator web extension, the first 30 results of the 50 most commonly entered queries are
retrieved from Google, Bing and Yahoo. These pages are crawled, indexed and classified according to the automated
HONcode system. Thus, the Web browser plugin allows persons using these general search engines to know the level
of trust and readability for each result, as shown in Figure 4.
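The dependency on pre-classified pages can be pictured with a small lookup sketch. Everything here is hypothetical: the index layout, the function names and the example entries are assumptions, and the way the Health Trust indicator derives a trust level from detected criteria is not specified in this article, so none is computed.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so that http and https variants match."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")


def annotate_results(result_urls, index):
    """Pair each search result with its pre-computed criteria, or None when
    the page has not been crawled and classified beforehand."""
    return [(url, index.get(normalize(url))) for url in result_urls]


# Hypothetical index built offline from crawled and classified pages.
index = {
    normalize("https://nihseniorhealth.gov/privacy.html"): ["HC3-Privacy policy"],
}
results = annotate_results(
    ["http://nihseniorhealth.gov/privacy.html", "https://example.org/page"],
    index,
)
for url, criteria in results:
    print(url, "->", criteria if criteria is not None else "not classified yet")
```

The second result shows the limitation discussed above: a page outside the pre-crawled set cannot be annotated until it has been classified.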
6.2. Filtering health websites
The health search engine “everyone.KHRESMOI.eu” provides access to selected and HONcode certified websites.
The list of websites that it indexes is limited and known beforehand, so calculating the Automated HONcode score is
straightforward. Providing an optimal look and feel is difficult. Sometimes the website includes a HON criterion but
it is not detected by the system due to crawler or system limitations.
Fig. 5 Integration of the Automated HONcode detection within a health search engine
7. Conclusion
Authors have shown in past and current work that alternatives to linguistic approaches such as n-grams can be
used without adversely affecting search results. Classification using NER (Named Entity Recognition) and a sliding
window can improve accuracy significantly. Thus, alternative approaches should be explored for recognition of
other HONcode principles. Automated detection of HONcode principles may correlate with the quality of a website,
but the automated results obtained cannot yet fully match the manual certification process where a human expert
interprets text meanings and where health editors have been “educated” about “the ethical issues and responsibilities
of providing health and medical information on the Web”[18]. Thus, automated HONcode detection should be used
as an adjunct to human expert review - as an autonomous system that determines areas to focus on in rating the
reliability of a health website. The automated system can only detect structured elements without understanding the
meaning of the content. The automated system for HONcode criteria detection developed as part of the Kconnect
services gives the opportunity to complement manual work with an automated approach coupled with user-supported
feedback. During the evaluation conducted in [19], the identification of the trustability level of health websites was
an important feature for users. Thus, authors have developed a prototypic web browser extension Health Trust
indicator (https://hon.ch/20-years/en/tools.html#health-trust-indicator) capable of detecting and analysing pages
contained in the Google or Bing results and estimating the level of trust and readability automatically.
Acknowledgements
The research is conducted in the scope of the European project Kconnect and funded by this project (2015-2017,
project No. 644753). We would like to thank the HON team who developed the APIs and Web browser extensions.
References
1. Corcelles, R., C. R. Daigle, H. R. Talamas, S. A. Brethauer and P. R. Schauer. 2015. Assessment of the quality of Internet information on
sleeve gastrectomy. Surg Obes Relat Dis, 2015 May-Jun; 11(3):539-44. doi: 10.1016/j.soard.2014.08.014. Epub 2014 Sep 6. URL:
http://www.ncbi.nlm.nih.gov/pubmed/25604832
2. Aphinyanaphongs Y, Tsamardinos I., Statnikov A., Hardin D., Aliferis C.F. 2005. Text categorization models for high-quality article retrieval
in internal medicine. J Am Med Inform Assoc 2005;12(2):207-216
3. Abbasi, A., F. M. Zahedi, and S. Kaza. 2012. Detecting Fake Medical Websites using Recursive Trust Labeling. ACM Transactions on
Information Systems 2012.
4. Gaudinat, A., N. Grabar, C. Boyer. 2007a. Combination of heterogeneous criteria for the automatic detection of ethical principles on health
web sites. AMIA Annu Symp Proc, 2007 Oct 11:264-8.
5. Gaudinat, A., S. Cruchet, C. Boyer, P. Chrawdhry. 2011. Enriching the trustworthiness of health-related web pages. Health Informatics J.,
2011 Jun; 17(2):116-26.
6. Sondhi, P., V. G. Vinod Vydiswaran and C.X. Zhai. 2012. “Reliability Prediction of Webpages in the Medical Domain”. In Advances in
Information Retrieval, R. Baeza-Yates, et al. eds. 7224:219–31. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012.
7. Wang, Y. and Z. Liu. 2007. Automatic detecting indicators for quality of health information on the web. International Journal of Medical
Informatics, 2007; 76(8):575-82.
8. Boyer, C., M. Selby, J. R. Scherrer, and R. D. Appel. 1998. The Health On the Net Code of Conduct for Medical and Health Websites.
Computers in Biology and Medicine 28, no. 5 (September 1998): 603–10.
9. Joury A. U., Alshathri M., Alkhunaizi M., Jaleesah N., and Pines J. M., Internet Websites For Chest Pain Symptoms Demonstrate Highly
Variable Content and Quality, in: Academic Emergency Medicine. (2016)
10. Wozney, Lori, Ashley D. Radomski, and Amanda S. Newton. "The Gobbledygook in Online Parent-Focused Information about Child and
Adolescent Mental Health." Health Communication (2017): 1-6.
11. Davis, K. S., McCormick, A. A., & Jabbour, N. (2017). What might parents read: Sorting webs of online information on vascular anomalies.
International Journal of Pediatric Otorhinolaryngology, 93, 63-67.
12. Ellsworth, B., Patel, H., & Kamath, A. F. (2016). Assessment of quality and content of online information about hip arthroscopy. Arthroscopy:
The Journal of Arthroscopic & Related Surgery, 32(10), 2082-2089.
13. Boyer C, Dolamic L. 2014 Feasibility of automated detection of HONcode conformity for health-related websites. IJACSA 2014;5(3):69-74.
14. Boyer C., Dolamic L. 2015 Automated Detection of HONcode Website Conformity Compared Manual Detection: An Evaluation J Med
Internet Res 2015;17(6):e135.
15. Boyer, C., L. Dolamic, P. Ruch and G. Falquet. 2016. Effect of the Named Entity Recognition and Sliding Window on HONcode automated
detection of HONcode criteria for mass online health content. Proceedings of the 9th International Joint Conference on Biomedical Engineering
Systems and Technologies (BIOSTEC 2016). Vol. 5: HEALTHINF, pp. 151-158. ISBN: 978-989-758-170-0.
16. Apache OpenNLP Developer Documentation, URL:
http://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition; Accessed on 25.05.2017
17. Mc Namee P., Mayfield J. 2004 Character n-gram tokenization for EU language text retrieval. Information Retrieval, 2004;7(1-2):73–97
18. Adams, S. A., and A. de Bont. 2007. More than Just a Mouse Click: Research into Work Practices behind the Assignment of Medical Trust
Marks on the World Wide Web. International Journal of Medical Informatics 76 (June 2007): S14–20. doi:10.1016/j.ijmedinf.2006.05.024.
19. Pletneva, N., Z. Uresova, J. J. Altman, N. Postel Vinay, P. Degoulet, J. Hajic, C. Boyer. 2014. Observations and lessons learnt from non-health
professionals evaluating a health search engine. MIE 2014 Stud Health Technol Inform, 2014; 205:940-4. URL:
http://www.ncbi.nlm.nih.gov/pubmed/25160326
