
126 German Medical Data Sciences 2023 — Science. Close to People.

R. Röhrig et al. (Eds.)


© 2023 The authors and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/SHTI230703

Understanding Human-Computer
Interactions in Restricted Clinical
Environments
Kevin KRAUS a, Leona TRÜBE a, and Christopher GUNDLER a,1
a Applied Medical Informatics, University Medical Center Eppendorf, Hamburg, Germany

Abstract. Introduction: Conducting research on human-computer interaction and
information retrieval requires unobtrusive observations within existing network
architectures. State of the art: Most of the available tools are not suitable to be
applied within restricted clinical systems. The specific requirements hinder analysis
of the human factors in health sciences. Concept: We identified extensions for
popular web browsers as a suitable way to conduct studies in highly regulated
environments. Implementation: Considering the specialized requirements and an
adequate level of transparency for the recorded clinician, we developed an open-
source Web Extension compatible with major web browsers. Lessons learned: We
identified the challenges associated with the specific tool and are preparing its use
to understand clinical reasoning in personalized oncology.

Keywords. human-computer interaction, unobtrusive search behavior analysis, web
extension, clinical environment

1. Introduction

1.1. Background

Unobtrusive observation of user activities is a central part of human-computer interaction
(HCI). Designing, evaluating, and implementing interactive systems that humans can use
easily and successfully is a core requirement for all areas of computer
science [1]. This applies especially to medical informatics with its associated
requirement for clear, rapidly available information [2,3]. Consequently, appropriate
tools are required to account for human factors in health sciences.
Within this work, we are specifically interested in understanding the human factors
in tasks related to knowledge retrieval. In this area of research, insights into common
processes are the first step toward more elaborate systems. Building on such insights,
recommender systems are crucial components to speed up the search process and enhance
the quality of the results in clinical environments [4].
Implementing these tools requires a smooth integration into existing processes
performed by clinicians. As with many questions in HCI, acquiring the necessary
data on clinicians’ search behavior poses a challenge. Achieving unbiased results
requires careful consideration of experimenter effects. A software solution might prove
beneficial for acquiring realistic data without interference. However, this solution must
be specifically tailored to the technical challenges imposed by security constraints within
clinical networks. As these constraints render existing and well-known software solutions
impractical, an alternative appears necessary.

1 Corresponding Author, Christopher Gundler, Applied Medical Informatics, University Medical Center
Eppendorf, Martinistraße 52, 20246 Hamburg, Germany; E-mail: c.gundler@uke.de.

1.2. Objective and Requirements

The aim of the study is the development and evaluation of a software solution for
studying HCI in restricted clinical environments. As a prototypical example, we strive to
conduct an unobtrusive search behavior analysis (USBA). Further examples of
appropriate tasks are usability tests for user interfaces (UI), click or mouse movement
analyses, and navigation analysis on websites. A modular architecture should allow
straightforward modifications and enable customization to other experimental setups.
To serve as such a multi-purpose tool, the artifact should work across different operating
systems and without access to the internet. Additionally, the output should be explainable
to clinicians without knowledge of the underlying technology to ensure appropriate
transparency.

2. State of the art

2.1. Related Work

Observation of user activities is a crucial part of studying central factors in HCI. Usually,
specialized departments at universities or in the industry research user interactions with
advanced technology. Besides interviews and questionnaires, the application of
technology like stationary eye-tracking devices has significantly affected this discipline.
Accordingly, in vitro studies with their associated shortcomings remain prevalent [1].
In vivo observations of user behavior can be achieved through the utilization of
software solutions or external sensors, which allow for direct analysis in a more natural
environment compared to laboratory settings. Camera-based setups represent an option
for external hardware usage, which does not require any interaction with the system
under scrutiny. However, these setups might not provide the necessary data resolution
due to the mandatory registration of the recorded screen content within the recording or
are very time-consuming [1]. In such cases, software solutions appear more promising.
When evaluating the usability of an interface, the technology employed for tracking user
behavior can be categorized into at least two distinct groups:
The first category of technology for tracking user behavior focuses on the analysis
of self-hosted websites. Numerous software solutions are available for custom
integration, including commercial products such as Hotjar, Mouseflow, and Mixpanel,
which are well-tested and ready to use. Typically, scripts executed by the web browser
upon loading a prepared website collect the requisite information and transmit it to
centralized servers. As there are no requirements for manual software installation, these
tools can gather extensive amounts of data from most website visitors, even without their
informed consent.
The second category of tools focuses on the general tracking of all activities within
the graphical user interface of an operating system. Kukreja et al. [5] and Alexander et al. [6]
proposed similar tools, in 2006 for Windows as well as Mac OS X and in 2008 for Windows,
each tracking the user’s keyboard and mouse events during interaction with other
applications. Even newer studies, such as the study on user verification systems via mouse
movements by Zheng et al. [7] from 2011, relied on the tool proposed by Kukreja et al.
Maxwell et al. [8] introduced a more modern framework in 2021 using state-of-the-art
technologies to create a cross-platform solution for collecting user interactions. Other
recent publications extend the observations to smartphones [9] or focus on screenshots
collected in regular intervals [10].
Both categories of tools offer distinct advantages and limitations in analyzing
human-computer interaction. The choice of which tool to use will depend on the research
question, the available resources, and the level of detail required. Sensitive environments
within clinical setups may introduce additional challenges and limitations which must be
considered when selecting and deploying the appropriate tools for analyzing HCI.

2.2. Shortcomings

Current tools are only of limited use for the analysis of human factors on search behavior
in the clinical context.
On the one hand, services that embed scripts directly into the object of study, such as
the website of interest, are not well suited. Firstly, the frontends common in clinical
practice often belong to proprietary and sensitive software solutions like clinical
information systems. Researchers are not able to insert the required foreign scripts directly.
Secondly, most of the solutions depend on external services or an internet connection for
data aggregation. For security reasons, many clinical systems have limited network
capabilities, rendering these solutions unusable. Thirdly, researchers are only able to track
websites that are prepared appropriately. Unrestricted setups, such as information
retrieval tasks with unknown sources, are not possible.
On the other hand, services recording the user interaction with interfaces on the level
of the operating system present no alternative. Due to their low-level integration, an
installation requires elevated access rights or customized drivers. Researchers usually
lack the required rights and need to rely on information technology (IT) service
departments. Permission by the latter must be well justified given the associated
challenges for system security. Besides limitations resulting from security concerns in
the clinical context, technical limitations of the tools are well described in the literature.
Cockburn et al. [11] emphasized in their review in 2014 that these “techniques have
several limitations, including access to only a subset of controls, reduced system
responsiveness, and increased frequency of software crashes, which make them
impractical for studies of real use”. Even under the assumption of an optimized
performance in more recent publications, considering alternatives appears attractive.
In both approaches, the tracked interactions might include sensitive information due
to their recording in the clinical context. While user tracking on common websites
requires informed consent, the importance of it appears amplified once clinicians are
observed. During observation of their routine, additional options for the self-censorship
of the recorded data even retrospectively might lead to a higher acceptance.

3. Concept

The aim of this work is the conceptualization and implementation of a hybrid tool
combining the advantages of the two described groups of HCI tools. The software should
allow researchers to collect and analyze data primarily within a web browser but without
limitation on specific resources in the inter- or intranet. It must be accessible even in
highly restricted environments and should not require an installation with the need for
elevated access rights. The tool must be independent of any third-party software, an
internet connection, or similar constraints. The aforementioned requirements emerged in
the conceptualization phase, which was mainly driven by experience from multiple
clinical projects, the highly restrictive clinical environment, and a deep dive into existing
tools. Challenges from those projects were collected and compared to coverage provided
by research publications for defining requirements regarding the proposed extension.
Developing a modern extension for web browsers might be an attractive alternative
to stand-alone software. First, it does not require custom-compiled source code and relies
on the functionality of the host software. Accordingly, the tool operates within an already
installed and maintained, well-tested, and established environment. Second, browser
extensions can access the Document Object Model (DOM) comparably to the scripts of
the first group mentioned in 2.1. Based on the included structural information, they can
access mouse movement or navigational tracking generated by the host without
recording all keystrokes and clicks. Third, they require neither changes to the source code
of the website nor a proxy that would weaken possible encryption. Finally, web browsers and
extensions are usually not limited to a single platform or operating system. Consequently,
they represent versatile and flexible tools covering a broad range of possible tasks.
Nowadays, these extensions could use the modern web standards established within
the last years. Working with multimedia content does not require proprietary solutions
like Adobe Flash anymore. Web browsers can even be used as screen recorders.
Corresponding application programming interfaces (APIs) were stabilized and are used
for video conferences and sharing of screen content. Accordingly, the tracking of user
interaction is not limited to websites. Even the inclusion of portable document formats
(PDFs) or multimedia content viewed during the research process is possible. The
acquired video recording appears as a natural opportunity to evaluate the observed HCI
process afterward. Once the recorded events are encoded and embedded as subtitles, a
standard multimedia player allows a subsequent and transparent analysis similar to the
work of Schöning et al. [12].

4. Implementation

4.1. Solution description

The central contribution of the work is the proposed extension for web browsers usable
within different systems, clinical system architectures, and networks. In the past, most
developers of web browsers enforced proprietary formats and APIs for extensions
compatible with their software. In recent years, standardization efforts resulting in the
Web Extension format enabled the creation of well-tested extensions fitting most
common platforms. Consequently, we initially aimed to create a single codebase to cover
all Chromium-based web engines and Mozilla Firefox which implement the Manifest v3
standard.
Web Extensions are fully inspectable file archives containing various popular
standards from web development. Figure 1 contains an overview of the major
components used within the proposed software solution. Extensions are by themselves
neither installable nor executable. They are handled and interpreted by the web browser
similar to websites. Accordingly, safety checks and security measures apply to the
extensions as well. The inspectable source code, the absence of compiled components, and the
execution within a web browser optimized for security make the extension appealing
for restricted clinical environments.

Figure 1. The proposed architecture of the Web Extension. The pictogram for the web browser is based on a
work by Flaticon and published under a suitable license.

The Web Extension¹ consists of four components: (1) The manifest.json, a
configuration file holding all the required information to get the extension up and running.
(2) The service-worker.js, representing the primary backbone of an extension. It
connects the developed tool to the web browser and allows their interaction. (3) The
scripts folder containing one or multiple content-script.js scripts. They allow direct
access to the user interface (UI) to conduct certain tasks. (4) An index.html to give the
Web Extension an appropriate look. All mentioned components can communicate with
each other to exchange data. Additionally, they can access the browser’s synced or
local storage to save and fetch data (e.g., the current extension state) via a specialized
API. In the following sections, we will discuss the system architecture in more detail.
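
To make the four components more tangible, the following TypeScript sketch models a minimal Manifest v3 configuration as a typed object. The file names mirror those mentioned above. The extension name, version, permission list, and URL match pattern are illustrative assumptions and are not taken from the published repository.

```typescript
// Minimal, hypothetical model of a Manifest v3 configuration.
// Only a small subset of the available manifest keys is typed here.
interface ContentScriptEntry {
  matches: string[]; // URL patterns into which the content script is injected
  js: string[];      // script files, relative to the extension root
}

interface ManifestV3 {
  manifest_version: 3;
  name: string;
  version: string;
  permissions: string[];
  background: { service_worker: string };
  content_scripts: ContentScriptEntry[];
  action: { default_popup: string };
}

// All concrete values below are assumptions chosen for illustration.
const manifest: ManifestV3 = {
  manifest_version: 3,
  name: "usba-sketch",
  version: "0.1.0",
  permissions: ["storage", "webNavigation"],
  background: { service_worker: "service-worker.js" },
  content_scripts: [
    { matches: ["<all_urls>"], js: ["scripts/content-script.js"] },
  ],
  action: { default_popup: "index.html" },
};

// Serializing the object yields content suitable for a manifest.json file.
const manifestJson: string = JSON.stringify(manifest, null, 2);
```

Because the whole configuration is plain text, it can be reviewed by an IT department before the extension is loaded, which fits the transparency requirement stated earlier.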

4.1.1. Backend: Data recording


The capture of a user’s interaction with digital resources is carried out within the backend.
Due to security measures within the web browser, multiple components record
independently and are synchronized through timestamps. Since access to the UI is
required, the user’s click behavior and the corresponding metadata like tags and
coordinates are collected through the respective content script. In contrast, navigational
events of the user session like page reloads or redirections are collected through the
service-worker.js. The recording of the user’s screen activity relies on the
MediaRecorder API as documented on MDN [13]. The API provides all the required options
to customize the recordings. One example of a possible configuration is the limitation of
frames per second to balance the granularity of the recording and the required capacity for
data storage. The resulting binary large objects (BLOBs) are encoded in the open and
standardized WebM format. Subsequently, the backend is responsible for processing and
mapping the collected user interaction data to a suitable subtitle format. While multiple
format options are available to even encode complex visual elements, we focused on the
Web Video Text Tracks Format (WebVTT). As a web standard designed for HTML5 media, it
harmonizes well with the chosen WebM format and encodes the events with their
corresponding time in relation to the start of the recording. The obtained file can be
embedded as subtitles into the screen recording by an appropriate multimedia player.

¹ The Git repository is available at https://github.com/UKEIAM/unobtrusive-search-behaviour-analysis
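
The mapping from independently recorded, timestamped events to subtitles can be sketched as follows. This is a minimal illustration and not the code of the published extension: the `RecordedEvent` shape, the function names, and the fixed cue duration of two seconds are assumptions made for this example.

```typescript
// A recorded interaction with an absolute timestamp in milliseconds.
interface RecordedEvent {
  timestampMs: number; // absolute time, e.g. obtained from Date.now()
  description: string; // human-readable summary shown as subtitle text
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) text = "0" + text;
  return text;
}

// Format an offset in milliseconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatVttTimestamp(offsetMs: number): string {
  const total = Math.max(0, Math.round(offsetMs));
  const ms = total % 1000;
  const totalSeconds = Math.floor(total / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Escape characters with special meaning in cue text and collapse line breaks,
// since a blank line would terminate the cue.
function escapeCueText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/\s+/g, " ")
    .trim();
}

// Merge several event streams by timestamp and encode them as WebVTT cues
// whose times are relative to the start of the screen recording.
function eventsToWebVtt(
  streams: RecordedEvent[][],
  recordingStartMs: number,
  cueDurationMs: number = 2000, // assumed display time per cue
): string {
  const merged: RecordedEvent[] = [];
  for (const stream of streams) {
    for (const event of stream) {
      // Events from before the recording started cannot be aligned to the video.
      if (event.timestampMs >= recordingStartMs) merged.push(event);
    }
  }
  merged.sort((a, b) => a.timestampMs - b.timestampMs);

  const cues = merged.map((event) => {
    const start = event.timestampMs - recordingStartMs;
    const end = start + cueDurationMs;
    return (
      `${formatVttTimestamp(start)} --> ${formatVttTimestamp(end)}\n` +
      escapeCueText(event.description)
    );
  });
  return ["WEBVTT"].concat(cues).join("\n\n") + "\n";
}
```

Passing one stream from the content script and one from the service worker, together with the timestamp at which the screen recording began, yields a single subtitle track in which clicks and navigation events appear in chronological order.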

4.1.2. Frontend
Components of the React.js framework were utilized to build a straightforward user
interface. If additional UI elements are required for different experimental setups,
adaptation is easily possible. The subject of an experiment triggers the Web Extension
manually by clicking on the extension icon within the browser. This opens the Web
Extension’s UI. The sensitive nature of the data requires the informed consent of the
subject. Before the recording starts, they receive an overview of recording options with
the possibility to opt out of each option individually (figure 2 A). Once the selection
appears to be appropriate regarding the aim of the study and the data security, the subject
starts the recording. Once the recording has started, the selection of options is
disabled and the reset button is enabled (figure 2 B). This allows a reset of the current
session including a full deletion of the collected data to ensure privacy. After finishing a
session, the subject is invited to give experiment-dependent feedback. An example is
shown in figure 2 C. Here, the subject is asked for feedback on the success of the
information retrieval. The resulting feedback is saved to the local storage. Finally, the
raw data, the screen recording, and the processed WebVTT file are downloaded to the
local computer and deleted from the temporary storage of the browser.

Figure 2. A UI before the start, B UI during recording, C Feedback request after the end of study.
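
The session workflow of the frontend can be read as a small state machine. The sketch below expresses it as a pure reducer function; the phase names, action names, and the decision to keep the chosen options after a reset are assumptions for illustration and do not describe the actual React.js implementation.

```typescript
type Phase = "idle" | "recording" | "feedback";

interface RecordingOptions {
  clicks: boolean;
  navigation: boolean;
  screen: boolean;
}

interface SessionState {
  phase: Phase;
  options: RecordingOptions;
  collected: string[]; // placeholder for the recorded raw data
  feedback: string | null;
}

type Action =
  | { type: "toggleOption"; option: keyof RecordingOptions }
  | { type: "start" }
  | { type: "record"; entry: string }
  | { type: "reset" }
  | { type: "finish" }
  | { type: "submitFeedback"; feedback: string };

const initialState: SessionState = {
  phase: "idle",
  options: { clicks: true, navigation: true, screen: true },
  collected: [],
  feedback: null,
};

// Pure transition function: returns a new state and never mutates its input.
function reduce(state: SessionState, action: Action): SessionState {
  switch (action.type) {
    case "toggleOption": {
      // Options can only be changed before the recording has started.
      if (state.phase !== "idle") return state;
      const options: RecordingOptions = { ...state.options };
      options[action.option] = !options[action.option];
      return { ...state, options };
    }
    case "start":
      return state.phase === "idle" ? { ...state, phase: "recording" } : state;
    case "record":
      return state.phase === "recording"
        ? { ...state, collected: state.collected.concat(action.entry) }
        : state;
    case "reset":
      // A reset discards everything collected in the current session.
      return state.phase === "recording"
        ? { ...state, phase: "idle", collected: [], feedback: null }
        : state;
    case "finish":
      return state.phase === "recording" ? { ...state, phase: "feedback" } : state;
    case "submitFeedback":
      return state.phase === "feedback"
        ? { ...state, feedback: action.feedback }
        : state;
    default:
      return state;
  }
}
```

Keeping the transitions in one pure function makes the consent-related rules, such as locking the options during recording and deleting data on reset, easy to inspect and to test in isolation from the UI.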

4.2. System in Use

The proposed software is currently being evaluated in field tests. A variety of setups
with different browsers, operating systems, and networks is being assessed. A major
part of these testing procedures has already been completed. Corresponding results are
included in the Lessons learned section below. Besides these evaluations, preparations
are being made to conduct a first study to understand information retrieval in molecular
tumor board processes. For the evaluation, the proposed Web Extension is being installed
on the local machines used by the clinicians. After an introduction, the results are
collected weekly after multiple searches have been conducted. Afterwards, the output is
checked for errors and the participants are asked for feedback on the usability of the tool
and any issues that came up while using it. Based upon the feedback, the proposed tool
allows click and navigational data to be collected precisely in an almost unaltered
environment. The additional screen recording backs up the data collection, shedding light
on blind spots. The labels acquired in the last step, in combination with the metadata
provided in JSON format, should enable the development of recommender systems for
clinical publications of interest, since creating those systems requires data with a
ground truth, which is provided by the labels.

5. Lessons learned

Web Extensions give a unique opportunity to be integrated into highly restricted clinical
environments without installation on the operator’s system or dependence on third-party
services. Within the project, we were able to create fully functional software.
Nevertheless, several challenges are to be considered when using this technology.
The currently available web standards offer a variety of possibilities. However,
extensions still fail to provide the full flexibility of stand-alone software. Since the
browser runs the extension, it defines the rules and possible operations. An example is
the content security policy, which became even more restrictive with the release of Google’s
Manifest v3. While such policies guarantee an appropriate level of security, they enforce
complex and error-prone communication routines.
If the supplier does not provide an API, access to resources of interest by the
extension is not possible. Corresponding APIs are even more crucial when considering
subtle differences in their implementation. Despite existing standards, web browsers not
based on the Chromium engine behave surprisingly differently. Consequently, we
suspended the intended support for Mozilla Firefox as it does not entirely follow the
Manifest v3. While the idea of one extension for all tools is appealing, the standardization
procedures are not mature enough. Ensuring compatibility between the Chromium-based
browsers still requires extensive testing to avoid common interferences.
Utilizing third-party JavaScript packages within a Web Extension requires access to
those packages independent of a potential connection to the internet. This is achieved by
bundling the required packages into a format the browser can read and access. Bundling
can be tedious but is necessary to use those packages.
Additionally, many third-party packages require special access to the aforementioned
resources blocked by security policies. This leads to either the failure of certain functions
or the crash of the whole extension. Unfortunately, certain browser-specific policies
cannot be bypassed, and hence certain packages cannot be used within the extension.

To summarize, the developed Web Extension represents a first step toward a
system-agnostic tool, usable in restricted setups such as clinics, to unobtrusively observe
a user's search behavior. Nevertheless, there are certain limitations like
the dependency on a web browser running on the Chromium engine and the restrictions
defined by the Manifest v3 standard. These imply certain limits in terms of feature
implementation and the data available for collection, as well as the possibilities of
processing the data within the system to deliver direct results. Consequently, the
proposed Web Extension is intended to co-exist with other solutions. There should
always be an assessment to understand if the proposed tool meets the requirements for
certain use cases or if another tool is more suitable.

6. Conclusion

The present work proposes Web Extensions as a self-contained, out-of-the-box solution
to understand human factors during interaction with clinical interfaces. The developed
software is an innovative, multi-purpose tool available for multiple web browsers and
applicable in highly restricted environments. It requires neither installation nor an
internet connection. The open-source codebase provides a modular architecture with the
possibility to add features, customize the collected data regarding privacy restrictions,
and adjust the behavior as well as the UI to allow more experimental setups. Without the
need for a dedicated experiment setup or an attending experimenter, the insights within
usability studies or information retrieval tasks could directly mirror clinical reality.
Strong validity, low costs, and the possibility of longitudinal analyses are arguably the
major benefits. The researcher is able to capture metadata and visual data such as
navigation, clicks, mouse movements, and the screen of the user. In addition, the
screen recordings allow an understanding of multimedia content and blind spots. When
combined with the exported subtitles, both the subject and scientist get a transparent and
intuitive summary of the recorded events. Accordingly, the proposed tool enables future
research regarding HCI in highly regulated clinical setups.

Declarations

Conflict of Interest: The authors declare that there is no conflict of interest.

Contributions of the authors: KK implemented the web extension. CG planned and led
the project. KK, CG, and LT were involved in writing and/or revision of the
manuscript.

References

[1] Rogers Y, Sharp H, Preece J. Interaction Design: beyond human-computer interaction. Indianapolis: John
Wiley and Sons; 2023.
[2] Unertl KM, Lehmann CU, Lorenzi NM. Effective Implementation of a Clinical Information System. In:
Finnell JT, Dixon BE, editors. Clinical Informatics Study Guide: Text and Review. Cham: Springer
International Publishing; 2022. p. 319–30. Available from:
https://doi.org/10.1007/978-3-030-93765-2_22
[3] Bishop RO, Patrick J, Besiso A. Efficiency Achievements From a User-Developed Real-Time
Modifiable Clinical Information System. Ann Emerg Med. 2015 Feb 1;65(2):133-142.e5.
[4] Nunes I, Jannach D. A systematic review and taxonomy of explanations in decision support and
recommender systems. User Model User-Adapt Interact. 2017 Dec 1;27(3):393–444.
[5] Kukreja U, Stevenson WE, Ritter FE. RUI: Recording user input from interfaces under Windows and
Mac OS X. Behav Res Methods. 2006 Nov 1;38(4):656–9.
[6] Alexander J, Cockburn A, Lobb R. AppMonitor: A tool for recording user actions in unmodified
Windows applications. Behav Res Methods. 2008 May 1;40(2):413–21.
[7] Zheng N, Paloski A, Wang H. An efficient user verification system via mouse movements. In:
Proceedings of the 18th ACM conference on Computer and communications security. New York, NY,
USA: Association for Computing Machinery; 2011. p. 139–50. (CCS ’11). Available from:
https://dl.acm.org/doi/10.1145/2046707.2046725
[8] Maxwell D, Hauff C. LogUI: Contemporary Logging Infrastructure for Web-Based Experiments. In:
Hiemstra D, Moens MF, Mothe J, Perego R, Potthast M, Sebastiani F, editors. Advances in Information
Retrieval. Cham: Springer International Publishing; 2021. p. 525–30. (Lecture Notes in Computer
Science).
[9] Krieter P. Can I record your screen? mobile screen recordings as a long-term data source for user studies.
In: Proceedings of the 18th International Conference on Mobile and Ubiquitous Multimedia. New York,
NY, USA: Association for Computing Machinery; 2019. p. 1–10. (MUM ’19). Available from:
https://dl.acm.org/doi/10.1145/3365610.3365618
[10] Reeves B, Ram N, Robinson TN, Cummings JJ, Giles CL, Pan J, et al. Screenomics: A Framework to
Capture and Analyze Personal Life Experiences and the Ways that Technology Shapes Them.
Hum-Comput Interact. 2021;36(2):150–201.
[11] Cockburn A, Gutwin C, Scarr J, Malacria S. Supporting Novice to Expert Transitions in User Interfaces.
ACM Comput Surv. 2014 Nov 12;47(2):31:1-31:36.
[12] Schöning J, Gundler C, Heidemann G, König P, Krumnack U. Visual Analytics of Gaze Data with
Standard Multimedia Player. J Eye Mov Res. 2017 Nov 20;10(5). Available from:
https://bop.unibe.ch/JEMR/article/view/3725
[13] MediaRecorder - Web APIs | MDN [Internet]. 2023 [cited 2023 Feb 27]. Available from:
https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder
