You are on page 1of 41

VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“JNANA SANGAMA”, BELAGAVI-590018.

A Synopsis on

“BIG DATA ANALYTICS USING MACHINE


LEARNING”
SUBMITTED IN PARTIAL FULFILMENT FOR THE AWARD OF THE DEGREE OF
BACHELOR OF ENGINEERING
IN

COMPUTER SCIENCE AND ENGINEERING


By

DEEPTHI K [1JB19CS044]

UNDER THE GUIDANCE OF

Internal Guide
Dr. Veena H N
Associate Professor
Dept. ofCSE SJBIT,Bengaluru

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


SJB INSTITUTE OF TECHNOLOGY
#67, BGS HEALTH AND EDUCATION CITY, Dr. Vishnuvardhan Road, Kengeri, Bangalore-560060.

2023-2024
ABSTRACT

1. “A Structured Analysis to study the Role of Machine Learning


and Deep Learning in The Healthcare Sector with Big Data
Analytics”
 Publication: Juli Kumari Ela Kumar · Deepak Kumar

The research paper talks about increased applications of Machine Learning (ML) and
Deep Learning (DL) in healthcare, especially when combined with big data analytics.
They offer predictive analytics, medical image analysis, drug discovery, personalized
medicine, and EHR analysis. ML and DL have the potential to revolutionize healthcare
by improving patient outcomes, reducing costs, and speeding up treatment development.
This study analyzes research trends using bibliometric and VOS viewer software, aiming
to guide future research and help healthcare professionals understand and leverage these
technologies effectively.

2. “The use of Big Data Analytics in healthcare”


 Publication: Kornelia Batko and Andrzej Ślęzak

The paper examines the integration of Big Data Analytics (BDA) in healthcare, aiming
to assess its potential benefits and current adoption. Through a literature review and direct
research involving 217 medical facilities in Poland, it evaluates the utilization of BDA in
medical settings. The findings indicate a growing trend towards data-driven healthcare,
with facilities leveraging structured and unstructured data for administrative, business,
and clinical purposes. Various data sources such as databases, transaction records, emails,
documents, and sensor data are being utilized, although the use of social media data
remains limited. Overall, the research confirms the shift towards data-driven decision-
making in healthcare, aligning with the observed advantages outlined in existing
literature.
Archives of Computational Methods in Engineering (2023) 30:3673–3701
https://doi.org/10.1007/s11831-023-09915-y

REVIEW ARTICLE

A Structured Analysis to study the Role of Machine Learning and Deep


Learning in The Healthcare Sector with Big Data Analytics
Juli Kumari1 · Ela Kumar1 · Deepak Kumar2,3

Received: 28 October 2022 / Accepted: 13 March 2023 / Published online: 31 March 2023
© The Author(s) under exclusive licence to International Center for Numerical Methods in Engineering (CIMNE) 2023

Abstract
Machine and deep learning are used worldwide. Machine Learning (ML) and Deep Learning (DL) are playing an increas-
ingly important role in the healthcare sector, particularly when combined with big data analytics. Some of the ways that ML
and DL are being used in healthcare include Predictive Analytics, Medical Image Analysis, Drug Discovery, Personalized
Medicine, and Electronic Health Records (EHR) Analysis. It has become one of the advanced and popular tool for computer
science domain.’ The advancement of ML and DL for various fields has opened new avenues for research and development.
It could revolutionize prediction and decision-making capabilities. Due to increased awareness about the ML and DL in
the healthcare, it has become one of the vital approaches for the sector. High-volume of unstructured, and complex medical
imaging data from health monitoring devices, gadgets, sensors, etc. Is the biggest trouble for healthcare sector. The cur-
rent study uses analysis to examine research trends in adoption of machine learning and deep learning approaches in the
healthcare sector. The WoS database for SCI/SCI-E/ESCI journals are used as the datasets for the comprehensive analysis.
Apart from these various search strategy are utilised for the requisite scientific analysis of the extracted research documents.
Bibliometrics R statistical analysis is performed for year-wise, nation-wise, affiliation-wise, research area, sources, docu-
ments, and author based analysis. VOS viewer software is used to create author, source, country, institution, global coopera-
tion, citation, co-citation, and trending term co-occurrence networks. ML and DL, combined with big data analytics, have
the potential to revolutionize healthcare by improving patient outcomes, reducing costs, and accelerating the development
of new treatments, so the current study will help academics, researchers, decision-makers, and healthcare professionals
understand and direct research.

1 Introduction

Nowadays, machine learning (ML), and deep learning (DL),


* Juli Kumari are leading topics and centres of interest in the industry, aca-
send2juli@gmail.com demia, and popular culture all over the world [1, 1]. Machine
Ela Kumar learning has emerged as a popular field of study and cutting-
ela_kumar@igdtuw.ac.in; ela_kumar@rediffmail.com edge tool in recent years. Multiple areas of prediction and
Deepak Kumar decision-making benefit from its suite of algorithms and
deepakdeo2003@gmail.com; dkumar3@albany.edu statistical techniques. The fact that it can use a variety of
learning approaches means it can process large amounts of
1
Indira Gandhi Delhi Technical University for Women unstructured and complex data in ways that lead to revolu-
(IGDTUW), New Church Rd, Kashmere Gate, Delhi, James
Church, New Delhi 110006, India tionary shifts in perspective based on the data experience.
2 Different machine learning-based model has been applied
Center of Excellence in Weather & Climate Analytics,
Atmospheric Sciences Research Center (ASRC), University in two ways: (1) Supervised learning, and (2) Unsupervised
at Albany (UAlbany), State University of New York (SUNY), learning. The data are input into the supervised learning-
Albany, New York 12226, USA based model, and then the model is used to perform clas-
3
Amity Institute of Geoinformatics & Remote sification, regression, and prediction on the labelled dataset
Sensing (AIGIRS), Amity University Uttar Pradesh [3–5]. In contrast, an unsupervised learning-based model
(AUUP), Sector‑125, Gautam Buddha Nagar, Noida, is applied to unlabeled datasets to perform tasks such as
Uttar Pradesh 201313, India

13
Vol.:(0123456789)
3674 J. Kumari et al.

clustering, dimensionality reduction of a dataset, detec- In this research, bibliometric analysis has been applied
tion, etc. Also, the concepts of machine learning are being to analyze the most significant and crucial research pat-
applied in various industries; it is largely responsible for the tern in machine learning (ML), deep learning (DL), and the
revolutionary changes occurring in the healthcare industry. healthcare field, as well as the most relevant research areas
The majority of the time, it is utilized for the prediction of machine learning (ML) and deep learning (DL) appli-
of early-stage disease, the identification of disease, and the cation applying in the healthcare field. Also, explored the
monitoring of the patient's status based on clinical datasets. most related research areas of healthcare like disease detec-
Detecting automatic object features in medical images can tion, diagnosis, and prediction [14–16]. For this, Thomson
also be accomplished by the use of machine learning tech- Reuters’ Web of Science (WoS) (Clarivate analytics, 2020)
niques to datasets related to medical imaging. To manage database has been used to collect the bibliometric informa-
picture collections and larger datasets, the deep neural net- tion in topic-wise and title-wise categories and “Machine
work-based technique has emerged as the model of choice learning”, “Deep Learning” and “Healthcare OR health*”
among the various machine learning models [6, 7]. Because used as query keywords. The web of science (WoS) database
it uses the three-layer perceptron for analyzing the dataset. is a well-structured and differently indexed database, that
Deep Neural Network-based approaches are a great solution selected only the most relevant top publication, it includes
in larger-scale datasets with different structures in fields of many significant scientific articles. Out of all relevant signif-
data mining, Image processing, Natural Language Process- icant-top publication document types, only article, review,
ing, Expert Systems and Prediction, etc. [8, 9]. and early access articles have been selected for descriptive
The subfield of machine learning known as "deep learn- analysis. Another aspect of the bibliometric study is the fol-
ing" employs iterative methods for learning new features. lowing: number of published papers from 2010–2021; the
It offers several neural network algorithm variants that can number of authors; the number of papers per author and
learn features in a variety of ways [10–13]. It is used on contribution; authors' collaboration; country-wise numbers
a huge scale and processed via numerous layers of filters. of publications: country-wise paper impact factors; year-
As a result of its use of multi-layer filtering techniques, it wise paper publication; top-most institution etc. Finally, we
can deliver increasingly precise results compared to conven- elected top categories to show the research dynamic, trends,
tional machine learning while efficiently organizing massive hot topics, and research directions and variations on this
datasets. related type of strategy have been successfully applied by
Healthcare is the fastest expanding sector, drawing in the recent literature [14, 17, 18].
more scientists, professors, and medical experts who are Therefore, the purpose of this study is to give depth
all eager to make important contributions to the discipline insight and a clear understanding of the research pattern
and the healthcare system as a whole. Hospitals, medical of machine learning and deep learning application in the
equipment, clinical trials, telemedicine, medical tourism, healthcare sector and compare the application of both
health insurance, and other related services are all examples techniques in the advancement of healthcare treatment.
of healthcare facilities. It presents a tremendous chance to Hence, this study will be more beneficial for academia, and
enhance healthcare facilities with the use of cutting-edge researchers’ health professionals to determine the relevant
technologies. Therefore, the development of more sophisti- area of research in Machine Learning, Deep Learning, and
cated technology has allowed for massive amounts of data Healthcare that has been broadly focused on along with the
and proven to be a massive transformation in healthcare gaps that should be addressed. The paper is organized as
fields, in terms of providing easy access to the best diag- (1) the “Introduction” section deals with the introduction of
nostic tools, the most cutting-edge treatments, and a vari- the topic. (2) “Methodology" section discusses the proposed
ety of minimally invasive procedures that result in less pain methodology. (3) “Results and discussion” section provides
and quicker healing. Drug development, tailored treatment, the results and follows them with discussions. The “Sum-
robotic surgery, illness monitoring, etc. all rely heavily on mary” section deals with the summary of the whole work.
healthcare in various capacities. A field's research dynam- (4) “Conclusion and prospect” section concludes and future
ics and behaviours can be gleaned from scientific articles scope of this work. (Fig. 1).
using bibliometric analysis, making it the most reliable
tool for spotting patterns in the research landscape. Math-
ematical and statistical methods are used to research articles, 2 Material and Methods
books, and other forms of scholarly communication, and the
results are used to analyze patterns in the scientific publish- 2.1 Data Collection
ing world. Systematics examination of published sources is
commonly used to gauge advancements in science across a For this study, all bibliometric information has been col-
range of disciplines using a variety of criteria. lected using the text keyword search strategies in topical

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3675

Keyword Search

Title Keyword Search Topic Keyword Search


Title wise keyword search ("machine learning”, “deep Title wise keyword search ("machine learning”, “deep
learning”, “healthcare OR health”, “machine learning" learning”, “healthcare OR health”, “machine learning" AND
AND "healthcare”, “deep learning" AND "healthcare”). "healthcare”, “deep learning" AND "healthcare”).

Web of Science (WoS) Database Collection

Criteria:
1. SCI-EXPANDED Journal
2. Years from 2010-2021

Document Type Search

Criteria:
1. Arcle type
2. Review Arcle type
3. Early Arcle type

Selection Criteria

Criteria:
a. Inclusion Criteria
b. Exclusion Criteria

Bibliometric Analysis

Analyzed Topmost
1. Author
2. Country
3. Instuon
4. Source Of Publicaon
5. Published Document

Network Visualization
Analyzed interacon
1. Co-Author
2. Co-Country
3. Co-Instuon
4. Most Trending Topic
5. Most Occurrence of Topic

Fig. 1  Workflow diagrams

and title wise search from the Web of Science (WoS) data- only from Science Index-Expanded (SCI-Expanded)
base in machine learning and deep learning application journal categories and retrieved complete data in plain
in healthcare. The bibliometric information was searched text and BibTex data format, at global respect published

13
3676 J. Kumari et al.

Table 1  Total bibliometric data retrieved in categories 2.2 Finding From Bibliometric Data Analysis
Categories Topic Title
Table 1 shows the data retrieved from the database with
Machine learning 98,169 29,653 searched keywords such as: “machine learning”, “deep
Deep learning 51,559 19,493 learning”, “healthcare”, “machine learning” AND “health-
Healthcare 1,67,326 38,109 care”, and “deep learning” AND “healthcare” in topical
Machine Learning AND Healthcare 2014 119 and title-wise categories. This table data shows that, in both
Deep Learning AND Healthcare 922 58 the categories such title wise and topical wise categories,
the highest literature published with the keyword healthcare
(1,67,326), followed by machine learning (98,169) and deep
literature during timespan 2010–2021(data accessed:31 learning (51,559). Likewise, in the application field respect,
august 2021). Then, filter the bibliometric information machine learning techniques have largely contributed to the
from Research articles, review articles, and early access healthcare domain more than deep learning techniques,
articles from literature publication categories. Further, which is shown by the literature data.
the bibliometric analysis method and network visualiza- Figure 2 represented the bibliometric information
tion methods were used to show the deep research trends between the numbers of publications vs categories. It also
and different levels of collaboration among scholars. shows that topic-wise literature is more than title-wise,
In this context, the latest version of RStudio4.1.0 soft- which is shown by the blue and orange colours.
ware was installed and then established the bibliometrics Table 2 illustrates the total literature published research
package [19] with R environment to perform descriptive documents in various types, then three types of documents
bibliometric analysis and map the data. Then, the VOS i.e., research article, review, and early access article were
viewer [20] open-source graphical user interface was selected in both topical and title-wise categories. In a topic-
used to demonstrate the show of different collaboration wise published scientific article, the highest numbers of
networks and bibliometric coupling based on retrieved published scientific articles with keywords healthcare (arti-
bibliometric data. cle:1,28,247, review:22,071, early access article:3,540),

Fig. 2  Statistics Of Categories 1,80,000


Wise Data Pattern 1,60,000
1,40,000
Numbers of data

1,20,000
1,00,000
80,000
60,000
40,000
20,000
0
Machine learning Deep learning Healthcare Machine Learning Deep Learning AND
AND Healthcare Healthcare
Categoies

Topic Title

Table 2  Total literature Topical keywords Title keywords


publication under different
categories type categories Article Review Early Access Article Review Early Access

Machine learning 86,292 5794 2854 22,935 1449 924


Deep learning 46,217 2412 2130 16,055 730 788
Healthcare 1,28,247 22,071 3540 21,545 2730 694
Machine learning AND healthcare 1625 316 98 74 11 10
Deep learning AND healthcare 750 149 74 42 5 9

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3677

Fig. 3  Total literature publica- Total literature patterns


tion under different categories
1,40,000
type

Total numbers of literature


1,20,000
1,00,000
80,000
60,000
40,000
20,000
0
Article Review Early access Article Review Early access
Topical keywords Title keywords
Categories type

Machine learning Deep learning


Healthcare Machine learning AND healthcare
Deep learning AND healthcare

followed by machine learning (articles:86,292, review:5,794, access article were included for this study in global respect.
early access:2,854), deep learning (articles:46,217, As Exclusion criteria, Social Sciences Citation Index
review:2,412, early access:2,130), machine learning in (SSCI) and Arts & Humanities Citation Index (A&HCI), and
healthcare (articles:1,625, review:316), early access:98) literature published before the year 2010 were excluded. all
and deep learning in healthcare articles (750), review (149), country-specific selections were discarded from the parame-
early access (74). ters. Furthermore, an article published in Proceedings Paper,
Whereas, in title-wise published literature, the maxi- Book Chapter, editorial material, letter, correction, re-print,
mum number of research published with the keyword etc. were excluded.
machine learning (article:22,935; review article:1,449; Furthermore, to understand the statistical research pat-
early access:22,935) followed by healthcare (articles:21,545; tern in the application of machine learning and deep learn-
review article:2,730; early access:694), deep learning (arti- ing related to healthcare, bibliometric data were retrieved
cles:16,055; review article:730; early access:788), also from the WoS database in a bibliography file (.bib) and plain
machine learning in healthcare (articles:74; review arti- text format (.txt). Both files included information like (i)
cle:11; early access:10) and deep learning in healthcare (arti- Authors, (ii) Abstract, (iii) Addresses, (iv) ISSNs/ISBNs, (v)
cles:42; review article:5; early access:9). Figure 3 presents IDS numbers, (vi) Funding Information, (vii) PubMed IDs
the statistical diagram, which demonstrates the published (viii) Titles (ix)Cited References, (x) Times Cited (xi) Cited
literature under different categories. Reference Counts, (xii) Language (xiii) Sources, (xiv) Docu-
ments Type, (xv) Keywords, (xvi) Source Abbreviations,
2.3 Data Searching Strategies (xvii) Author Identifiers, (xviii) Article Information, (xix)
Publisher Information, (xx) Research Areas, (xxi) Usage
Then, data searching strategies were used to systematically Counts, and (xxii) Highly Cited.
search bibliometric information in topical and title catego-
ries. For performing these tasks single and combinational 2.4 Methods Used
text keywords were used “machine learning”,” deep learn-
ing”, “healthcare”, “machine learning” AND “healthcare” The bibliometric analysis and Visualization method has
and “deep learning” AND “healthcare”. The task was exe- gained immense popularity in research at present time.
cuted to gather statistical facts, research patterns, and trends Because it provides the right direction to the research com-
regarding machine learning, healthcare, also, the uses of munity, academia, and health professional based on depth
machine learning methods related to the healthcare domain knowledge of scientific publication patterns. This study
based on bibliometric information. included the top 20 latest publication patterns, domains,
Further, Inclusion parameters were set to include only journals, authors, countries, organization research activities,
peer-review literature from SCI-Extended journals avail- and recent trending topics.
able in the WoS database and described the machine learn- For this, the Bibliometrix R package [19] is used for
ing, healthcare research area. From different categories of mathematical and statistical calculation of publication
published articles in the WoS database, only three publica- frequency, percentage, and citations of each author, jour-
tion categories like research article, review article and early nal, country, etc. A global collaboration map and other

13
3678 J. Kumari et al.

visualizations were done by open-source (corresponding number of research articles is 808, the total number of
information, author cooperation) VOS viewer [20] graphical review articles is 120, and the total number of early access
user interface-based software, the tool was used for mapping research articles is 45 in machine learning in healthcare,
of literature and analysis of author collaboration, country likewise, the total number of research articles is 806, the
collaboration, etc. network visualization. total numbers of review articles 106, and total numbers of
early access research articles 71 in use of deep learning
in the healthcare research field at worldwide level. At the
3 Result and Discussion Indian level, the total number of research articles is 213, the
total number of review articles is 55, and the total number
Bibliometric analysis has been conducted on the retrieved of early access research articles is 37, in the area of machine
bibliometric data from all fields i.e., topic, title, abstract, learning application in healthcare and the area of deep learn-
affiliation, journal, etc. categorized by WoS core collection ing in healthcare, total numbers of research articles, total
database in the research area of machine learning and deep numbers of review articles and total numbers of early access
learning application in the healthcare domain. To understand articles are 157, 24, 38. Furthermore, the total number of
the various statistics of research trends in this research area, research articles published in the field of computer science
the latest relevant 1000 records have been selected among is 1057 articles, 611 articles at the worldwide level, and 137
all available records from document-type research articles, articles, 104 articles in the research area of machine learning
review articles, and early access articles in SCI-Extended and deep learning in healthcare respectively.
journals among globally available records. In this, the total
3.1 Year‑Wise Published Literature

Table 3  Year-wise scientific literature production Table 3 illustrates the year-wise scientific literature produc-
tion at the worldwide level under peer-reviewed journals.
Year Machine learning in healthcare Deep
learning in So, the use of machine learning techniques in the healthcare
healthcare domain has started in 2010, and the growth of implemen-
tation was very less till 2014 and after 2015 continuously
2010 1 0
increased till now. whereas the use of deep learning in the
2011 6 0
healthcare domain has started in 2015, and rapidly increased
2012 2 0
up to the highest peak in the machine learning field.
2013 7 0
Hence, the higher scientific production rate shows the
2014 6 0
higher uses of deep learning techniques in the healthcare
2015 17 3
domain, compared to machine learning. Currently, deep
2016 18 4
learning technology is a more heavily used technology to
2017 33 15
handle more complex and unstructured data. The average
2018 99 71
year from publication is 1.5 and 1.2 around machine learning
2019 157 163
and deep learning used in the healthcare domain.
2020 297 302
Figure 4 exhibits the year-wise publication or literature
2021 302 367
on the application of machine learning and deep learning

Fig. 4  Year-wise publication Year wise publication pattern


pattern 800
Numbers of publications

700
600
500
400
300
200
100
0
2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021
Years
machine learning deep learning

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3679

Table 4  Top 20 most relevant sources of publication


Machine-learning and healthcare Deep-learning and healthcare
Sources Articles Sources Articles

IEEE Access 52 IEEE Access 78


Sensors 29 Journal of healthcare engineering 35
Journal of healthcare engineering 22 Sensors 27
Journal of biomedical informatics 20 IEEE Journal of biomedical and health informatics 22
PLOS One 20 Scientific reports 20
BMC medical informatics and decision making 19 CMC-computers materials \& continua 15
Journal of the American medical informatics association 19 Journal of biomedical informatics 15
Applied sciences-Basel 15 Multimedia tools and applications 13
International journal of environmental research and public health 14 Computer methods and programs in biomedicine 12
International journal of medical informatics 14 Magnetic resonance in medicine 12
Healthcare 13 Medical Physics 12
NPJ digital medicine 13 BMC medical informatics and decision making 11
Scientific reports 13 Electronics 10
CMC-computers materials \& continua 11 Healthcare 10
Journal of medical internet research 11 IEEE transactions on medical imaging 10
Journal of medical systems 11 NPJ digital medicine 10
JMIR medical informatics 10 Radiology 10
IEEE Journal of biomedical and health informatics 9 Applied sciences-Basel 9
Artificial intelligence in medicine 8 Artificial intelligence in medicine 9
Computers in biology and medicine 8 Computers in biology and medicine 9

in the healthcare domain. In this diagram, blue and orange of biomedical and health informatics(articles:22), scientific
colour sticks represent the area of research i.e., machine reports(articles:20).
learning, and deep learning. So according to the graph, the In both areas of research, IEEE Access and the journal of
research around machine learning application in the health- healthcare engineering produces a higher rate of literature on
care domain has been studied from the year 2010, so from the application of deep learning in healthcare.
years 2010 to 2021 the uses of machine learning techniques
has continuously increased, similarly, research in the area 3.3 Year‑Wise Production of the Source
of deep learning application in healthcare has started from of Publication
2015, so implementation of deep learning techniques in
healthcare domains has rapidly increased. The diagram also The year-wise scientific production of the source of pub-
reveals that deep learning techniques are highly used in lication has shown in the figure. It revealed the frequency
healthcare to provide a better quality of medical treatment. of published articles in different journals. Hence, IEEE
Access journal production started slowly from 2010, and
3.2 Top 20 Most Relevant Sources of Publication after the year 2015, it rises continuously in literature pro-
duction and has maximum in the year 2020 in the appli-
Table 4 demonstrated the top 20 most relevant sources of cation area of machine learning in healthcare. The Journal
publication in machine learning and deep learning appli- of healthcare engineering articles’ production rate was
cation in the healthcare domain. hence top 5 journals' lit- slow from 2010–2017, and after 2017 its publishing rate
erature publications in IEEE Access(articles:52), Sensors increased and in the year 2021, it has a maximum of 10 arti-
(articles:29), Journal of healthcare engineering(articles:22), cles. Similarly, in the present scenario total publication of
Journal of biomedical informatics(articles:20), PLOS the source journals are the journal of the American medical
One(articles:20) in machine learning application in the informatics(articles:5), PLOS One(articles:6), BMC medical
healthcare domain. similarly, for deep learning applica- informatics and decision making(articles:7), an international
tions in healthcare, the topmost 5 publication sources journal of medical informatics (articles:3) (fig. 5)
are IEEE Access(articles:78), journal of healthcare Figure 6 demonstrated the year-wise publication
engineering(articles:35), sensors(articles:27), IEEE Journal rate of journals/sources in the application of deep

13
3680 J. Kumari et al.

Fig. 5  Year-wise production SOURCE OF DYNAMICS IEEE ACCESS


of a source of publication in
machine learning in healthcare 30
SENSORS

25
JOURNAL OF HEALTHCARE ENGINEERING

20

Numbers of publication
JOURNAL OF BIOMEDICAL INFORMATICS

15
PLOS ONE

10
BMC MEDICAL INFORMATICS AND DECISION
MAKING
5
JOURNAL OF THE AMERICAN MEDICAL
INFORMATICS ASSOCIATION
0
2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 APPLIED SCIENCES-BASEL
in Years

Fig. 6  Year-wise production SOURCE OF DYNAMICS


of the source of publication in
Deep learning in healthcare 30 IEEE ACCESS

JOURNAL OF HEALTHCARE ENGINEERING


25
SENSORS
20
Numbers of publication

IEEE JOURNAL OF BIOMEDICAL AND HEALTH


INFORMATICS
15 SCIENTIFIC REPORTS

CMC-COMPUTERS MATERIALS \& CONTINUA


10
JOURNAL OF BIOMEDICAL INFORMATICS
5
COMPUTER METHODS AND PROGRAMS IN
BIOMEDICINE
0 BMC MEDICAL INFORMATICS AND DECISION
2015 2016 2017 2018 2019 2020 2021 MAKING
in Years

learning in healthcare. The production of articles by 3.4 Top 20 Most Globally Cited Document
top journals is IEEE Access(articles:26), journal of
healthcare engineering(articles:26), journal of bio- Table 5 represented the top 20 most published documents in
medical informatics(articles:2), IEEE journal of bio- the area of machine learning and deep learning application
medical and health informatics(articles:10), scientific in healthcare. thus, the total citation and total citation per
reports(articles:7), etc. among all top 10 journals IEEE year citation of the top 5 documents in the application area
Access production rate is sharply raised after the year of machine learning are Jiang F, 2017, Stroke Vasc Neu-
2017–2019 and maintained till the year 2021. Also, the rol (Total citation:467; TC per year:93.4), Rudin C, 2019,
journal of healthcare engineering started raising from the Nature of Mechanical Intelligence(Total citation:432; TC per
year 2017–2020 and after 2020 reached its highest peak. year:144), Lundervold As, 2019, Journal of Medical Physics
And rest of the journal production ranging (articles:2–15) (Total citation:302; TC per year:101.6), Yu Kh, 2018, Nature
from the year 2017–2021. of Biomedical Engineering (Total citation:268; TC per

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3681

Table 5  Top 20 most globally cited document Table 6  Top 20 most relevant authors and their production
Paper Total Citations TC per Year Normalized TC Machine learning in healthcare Deep learning in healthcare

Most cited document in machine learning in healthcare Authors Articles Articles Authors Articles Articles
Fractional- Fraction-
[21] 467 93.4 9.31
ized alized
[22] 432 144 26.52
[2] 305 101.66 18.72 Zhang Y 14 2.02 Li H 17 2.12
[23] 268 67 9.30 Li X 11 1.29 Li J 17 2.34
[24] 255 51 5.08 Li J 10 1.63 Wang H 14 2.81
[25] 188 31.33 4.009 Li Y 9 1.49 Yang J 14 2.15
[26] 180 45 6.24 Wang Z 8 1.39 Liu Y 13 2.21
[27] 178 14.83 1 Hussain A 7 1.27 Muham- 13 4.32
[28] 170 42.5 5.90 mad G
[29] 167 27.833 3.56 Khan A 7 1.25 Sun J 13 1.94
[30] 127 21.167 2.70 Wang L 7 0.9 Zhang Y 12 2.02
Trang Pham, 119 23.8 2.37 Chen X 6 0.95 Hossain 11 3.54
2017, MS
J Biomed Inform Chen Y 6 1.08 Li L 11 1.26
[31] 114 19 2.43 Gupta S 6 0.86 Wang Y 11 1.63
[32] 110 27.5 3.81 HU YH 6 1.83 Li Y 10 1.1
[33] 103 25.75 3.57 Lee J 6 1.97 Wang L 10 2.91
[34] 94 47 15.32 Wang W 6 0.64 Xu Y 10 1.1
[35] 93 23.25 3.22 Zhang J 6 1.06 Yi PH 10 2.09
[36] 84 12 3.15 Abedi V 5 0.64 Zhang H 10 1.52
[37] 83 20.75 2.88 Chen M 5 0.94 Zhang X 10 1.31
[38] 73 24.33 4.48 Chen S 5 0.81 Lee J 9 1.21
Most cited document in deep learning in healthcare Huang Y 5 0.5 Li C 9 1.1
[39] 862 215.5 10.26 Kim YH 5 0.36 Li M 9 1.33
[40] 542 135.5 6.45
[41] 481 120.25 5.73
[21] 470 94 4.69
Coudray N, 2018, Nat Med (Total citation:542; TC per
[42] 461 153.66 19.33
year:135.5), Rajkumar A, 2018, Npj Digit Med (Total cita-
[43] 433 108.25 5.15
tion:481; TC per year:120.25), Jiang F, 2017, Stroke Vasc
[2] 311 103.66 13.04
Neurol (Total citation:470; TC per year:74), Esteva A, 2019,
[44] 311 62.2 3.10
Nat Med (Total citation:461; TC per year:153.66). Apart
[45] 285 71.25 3.39
from this, the total number of the cited document is highest
[46] 231 57.75 2.75
Jiang F, 2017, Stroke Vasc Neurol (Total citation:467; TC
[47] 216 72 9.05
per citation:101.6), while the maximum citation per year is
[48] 191 47.75 2.27
Rudin C, 2019, Nat Mach Intell (total citation:432; TC per
[49] 165 41.25 1.96
year:144) in machine learning in healthcare. Whereas, the
[50] 161 32.2 1.60
maximum total citation and TC per year are Kermany DS,
[51] 149 37.25 1.77
2018, Cell(Total citation:862; TC per year:215.5) in the area
[52] 138 23 1.78 of the deep learning application.
[53] 129 32.25 1.53
[54] 126 63 13.09 3.5 Top 20 Most Relevant Authors and their
Zhu J, 2018, IEEE 122 30.5 1.45 Production
Internet Things J
The overall statistics of relevant authors state the numbers
of authors significantly contributing to research areas. Thus,
the total numbers of (5101/5916 authors) and (4871/6280
year:67), Chen M, 2017, IEEE Access(Total citation:255; authors) authors are well contributing in the research
TC per year:51). Similarly, in the area of deep learning appli- field of application of machine learning and deep learn-
cation, the top most highly cited documents are Kermany ing in the healthcare domain respectively. Likewise, out of
DS, 2018, Cell (Total citation:862; TC per year:215.5), the 5016 authors, single-authored documents are 16 and

13
3682 J. Kumari et al.

multi-authored documents are 5085, as well single-authored shows Zhang J is continuously working from the years
documents are 17 and multi-authored documents are 4854 2011 -2021 and produced several articles more in the years
in the application of deep learning and deep learning in the 2020–2021. Authors Hussain A and Gupta S were involved
healthcare domain respectively. The analysis also reveals, in this research area from 2013 to, 2015 respectively. Rest
larger groups of authors are contributing to the implementa- authors Zhang Y, Li Y, Li X, Li J, Wang L, Chen Y, Hu
tion of machine learning techniques in healthcare and pro- YH, Chen M, etc. been involved in this area from the year
ducing more articles than deep learning application research 2015 and produced articles after the year 2020 on towards
area. (Fig. 7)
Table 6 demonstrated the top author's production and its Figure 8 shows the year-wise top authors' production in
fractional, thus, the top 5 authors Zhang Y(articles:14; article the application of deep learning in healthcare. Deep learning
fractional:2.02), Li X(articles:11; article fractional:1.29), Li techniques mostly came in research scenarios after the year
J(articles:10; article fractional:1.63), Li Y(articles:9; article 2015 and uses of its technique in healthcare mostly appeared
fractional:1.49), Wang Z(articles:8; article fractional:1.39), after 2017.
similarly, Li H(articles:17; article fractional:2.12), Li The diagram shows Zhang J is continuously working from
J(articles:17; article fractional:2.34), Wang H(articles:14; the years 2011 -2021 and produced several articles more in
article fractional:2.81), Yang J(articles:14; article frac- the years 2020–2021. Authors Hussain A and Gupta S were
tional:2.15), Liu Y(articles:13; article fractional:2.21) in involved in this research area from 2013 to, 2015 respec-
machine learning and deep learning in healthcare. tively. Rest authors Zhang Y, Li Y, Li X, Li J, Wang L, Chen
Y, Hu YH, Chen M, etc. been involved in this area from
3.6 Top 20 Most Author Production Over Time the year 2015 and produced articles after the year 2020 on
towards.
Figure 6 shows the year-wise top authors' production in the Li H, Wang H, Zhang Y, Li L, Wang Y, and Lee J has
application of machine learning in healthcare. The diagram started contributing to research from the years 2017–2021.

Fig. 7  Top author production over time in machine learning in healthcare

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3683

Fig. 8  Top author production over time in deep learning in healthcare

After the year 2018 Li J, Yang J, Liu Y, Muhammad G,


Sun J, Hossain MS, Wang L, Xu Y, Zhang H, etc. started
the application of deep learning in the area of the health-
Table 7  Top 20 most country production
care domain.
Machine learning in healthcare Deep learning in healthcare
Region Freq Region Freq
3.7 Top 20 Most Country Production
USA 2043 USA 1327
China 589 China 1103
Table 7 presented the topmost country production,
UK 375 South Korea 398
countries USA and China are heavily contributing to
South Korea 274 India 330
machine learning and deep learning application field
Canada 264 UK 329
research. In the USA the highest number of published
India 231 Germany 214
articles is 2043 in the machine learning application
Germany 168 Saudi Arabia 203
field, while China is the maximum publication of
Spain 140 Italy 136
1103 articles in deep learning application research in
Italy 136 Pakistan 116
the healthcare domain. Indian research heavily con-
Saudi Arabia 129 Spain 105
tributes to deep learning application(articles:330)
Pakistan 104 Netherlands 103
implementation than machine learning(articles:231).
Netherlands 102 Singapore 103
The top 5 most country having maximum publications
Australia 96 Canada 99
are the USA (Articles:2043), China (articles:589),
Japan 95 France 99
the UK (articles:375), South Korea (articles:274),
Brazil 67 Australia 95
Canada(articles:264) and the USA (Articles:1327),
France 58 Egypt 82
China (ar ticles:1103), South Korea(ar ticles:389),
Singapore 53 Switzerland 80
India(articles:330), UK (articles:329) in machine learn-
Switzerland 51 Japan 78
ing and deep learning application in healthcare research
Iran 48 Malaysia 64
respectively.
Sweden 42 Israel 51

13
3684 J. Kumari et al.

Table 8  Top 20 most cited Machine learning in healthcare Deep learning in healthcare
country
Country Total Citations Average Arti- Country Total Citations Average Arti-
cle Citations cle Citations

USA 5072 14.7 USA 5625 26.41


China 1434 14.2 China 2686 13.57
United Kingdom 847 12.28 United Kingdom 1176 19.93
India 434 7.23 Korea 904 12.05
Norway 316 63.2 Saudi Arabia 601 13.07
Australia 286 15.05 Singapore 441 27.56
Canada 286 6.98 Australia 414 19.71
Germany 235 9.04 Switzerland 357 39.67
Korea 233 4.48 India 354 4.27
Italy 227 9.08 Norway 319 79.75
Saudi Arabia 226 8.69 Spain 286 14.3
Netherlands 141 10.85 Canada 260 16.25
Pakistan 140 8.75 Turkey 227 22.7
Spain 121 4.84 Netherlands 213 15.21
Greece 104 17.33 Italy 198 7.92
Japan 104 6.93 Germany 180 5.62
Malaysia 99 9 Austria 165 82.5
New Zealand 91 15.17 Pakistan 165 8.68
Egypt 79 13.17 France 123 9.46
Finland 77 15.4 Belgium 77 15.4

Table 9  Top 20 most affiliation Machine learning Deep learning


production
Affiliations Articles Affiliations Articles

Icahn Sch Med Mt Sinai 100 Stanford Univ 109


Harvard Med Sch 78 King Saud Univ 89
Stanford Univ 68 Johns Hopkins Univ 84
Univ Toronto 46 Imperial Coll London 47
Penn State Univ 34 Fudan Univ 40
Univ Calif San Diego 34 Taipei Med Univ 38
Univ Michigan 34 Natl Univ Singapore 36
Washington Univ 32 Yonsei Univ 36
Seoul Natl Univ 30 Univ Calif San Diego 35
Univ Calif San Francisco 30 Korea Adv Inst Sci and Technol 32
Univ Oxford 30 Seoul Natl Univ 32
Univ Pittsburgh 29 Icahn Sch Med Mt Sinai 31
Vanderbilt Univ 29 Harvard Med Sch 30
King Saud Univ 28 Emory Univ 28
Columbia Univ 27 Univ Ulsan 28
Taipei Med Univ 27 Tech Univ Munich 25
Cincinnati Children’s Hosp Med Ctr 26 Zhejiang Univ 24
Univ Penn 26 Nanjing Univ 23
Boston Univ 25 Sichuan Univ 23
Imperial Coll London 25 Univ Malaya 23

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3685

3.8 Top 20 Most Cited Country Table10  Top 20 most corresponding authors’ countries in the appli-
cation of machine learning and deep learning in the healthcare
domain
The table demonstrates the top 20 most cited countries in
the field of usages of machine learning and deep learning Country Articles Freq SCP MCP MCP_Ratio
technique in the healthcare domain. And the citation of Machine learning
the country states the contribution to significant quality of USA 345 0.34535 263 82 0.238
research. hence, the top 5 country citation in the field of China 101 0.1011 70 31 0.307
machine learning uses in the healthcare domain are the USA United Kingdom 69 0.06907 40 29 0.42
(citation:5072), China (citation:1434), United Kingdom India 60 0.06006 50 10 0.167
(citation:847), India (citation:434), Norway (citation:316) Korea 52 0.05205 38 14 0.269
and likewise in the field of deep learning uses in healthcare Canada 41 0.04104 29 12 0.293
domain are USA (citation:5,625), China (citation:2,686), Germany 26 0.02603 14 12 0.462
United Kingdom (citation:1,176), Korea (citation:904), Saudi Arabia 26 0.02603 11 15 0.577
Saudi Arabia (citation:601). Italy 25 0.02503 16 9 0.36
So, also tables indicate that countries the USA, China, Spain 25 0.02503 17 8 0.32
and the UK contribute much to the application of machine Australia 19 0.01902 13 6 0.316
learning and deep learning techniques in healthcare fields. Pakistan 16 0.01602 1 15 0.938
But this three-country contribute the highest area of deep Japan 15 0.01502 10 5 0.333
learning implementation in healthcare (Table 8). Netherlands 13 0.01301 10 3 0.231
Brazil 11 0.01101 5 6 0.545
3.9 Top 20 Most Affiliation Production Malaysia 11 0.01101 1 10 0.909
Singapore 11 0.01101 4 7 0.636
Here, Table 9 Shows the worldwide topmost 20 insti- France 9 0.00901 4 5 0.556
tutes, which have contributed more to research and pub- Iran 8 0.00801 5 3 0.375
lished the highest article. Among all most 5 affiliations Mexico 8 0.00801 5 3 0.375
are Icahn School of Medicine at Mount Sinai, New York Deep learning
(articles:100), Harvard Medical School, Massachusetts China 365 0.36537 255 110 0.3014
(articles:78), Stanford University, California (articles:68), USA 168 0.16817 120 48 0.2857
University of Toronto, Canada (articles:46), Pennsylvania Korea 89 0.08909 68 21 0.236
State University (articles:34) and Stanford University, Cali- India 42 0.04204 36 6 0.1429
fornia (articles:109), King Saud University, Saudi Arabia United Kingdom 38 0.03804 21 17 0.4474
(articles:89), Johns Hopkins University, Maryland (arti- Japan 26 0.02603 23 3 0.1154
cles:84), Imperial College London, London (articles:47), Australia 24 0.02402 15 9 0.375
Fudan University, Shanghai, China (articles:40) in uses of Canada 21 0.02102 9 12 0.5714
machine learning and deep learning technique in healthcare. Turkey 18 0.01802 17 1 0.0556
Among all the top 20 affiliations, the highest articles pro- Netherlands 17 0.01702 11 6 0.3529
duced by Stanford University, California in deep learning Spain 17 0.01702 11 6 0.3529
applications in healthcare also reveal the larger research in Germany 15 0.01502 10 5 0.3333
this research field. France 13 0.01301 9 4 0.3077
Saudi Arabia 13 0.01301 9 4 0.3077
3.10 Network Visualization Singapore 13 0.01301 7 6 0.4615
Italy 11 0.01101 7 4 0.3636
For visualization and network analysis of existing biblio- Greece 8 0.00801 6 2 0.25
metric information in the research area of machine learn- Pakistan 8 0.00801 2 6 0.75
ing, deep learning and healthcare VOS viewer software were Switzerland 7 0.00701 3 4 0.5714
used. This tool is applied for creating and representing a bib- Brazil 5 0.00501 5 0 0
liometric science map of the author’s collaboration, country
collaboration, institute collaboration, citation analysis, co- SCP Single country publications, MCP Multiple country publications
citation analysis, bibliographic coupling, co-word analysis,
and co-authorship analysis as well as an article published
by corresponding countries, etc. science map analysis per- can be performed to construct and visualize co-occurrence
tains to the intellectual interaction and structural connection networks of the most significant term extracted from the
among research constituents [20]. Furthermore, text mining scientific literature by VOS viewer software.

13
3686 J. Kumari et al.

Fig. 9  Top 20 most Corre-


sponding author’s country in the
application of machine learning
in the healthcare domain

Fig. 10  Top 20 most Corre-


sponding author’s country in the
application of deep learning in
the healthcare domain

Out of all bibliometric literature data downloaded from 3.11 Most Relevant Countries by Corresponding
the WoS database in the research area of machine learning, Author
deep learning, and healthcare, only 1000 literature records
were selected for analyzing and visualization of the network Table 10, Fig. 9 and Fig. 10 represent the top 20 countries by
from applying machine learning and deep learning in health- corresponding author’s articles production. The top 5 coun-
care query keywords, that were extracted, and bibliometric tries USA, China, United Kingdom, India, and Korea have
analysis presented in this study. Article from this database produced maximum and greatly contributed to the research
was collected because this database contains only peer- area of the application of machine learning and deep learn-
reviewed articles [55]. ing technique in the advancement of the healthcare field.

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3687

Thus, the total number of published articles is USA (345arti- 3.12 Co‑Authorship and Collaboration Analysis
cles), China(101articles), United Kingdom (69 articles),
India (60 articles), and Korea (52articles) and China (365 Co-authorship collaboration analysis is a formal way of
articles), USA (168 articles), Korea (68 articles), India (42 interaction among scholars. It is a most decent way to know,
articles), United Kingdom (38 articles) in the application of how scholars interact among themselves, including affilia-
machine learning and deep learning research fields. tion and countries associated with authors. Due to improv-
Among all enlisted top 20 countries, the USA has sig- ing methodological and theoretical complexity in research,
nificantly contributed with (263/345 articles) of SCP and collaborations among scholars have become common, and
(82/345 articles) of MCP, China with (70/101 articles) of collaborations among scholars contribute to showing greater
SCP and (31/101 articles) of MCP, United Kingdom with clarity and richer insights between different scholars.
(40/69 articles) of SCP and (29/69 articles) MCP, India with In Fig. 11, for building the co-authorship collaboration
(50/60 articles) SCP and (10/60 articles) MCP, Korea with network between the authors the minimum number of docu-
(38/52articles) SCP and (14/52articles) MCP. Similarly, ments of an author is 4 and the minimum number of cita-
China with (255/365 articles) in SCP and (110/365 articles) tions of an author is 5, and out of 5528 authors, 17 meet the
in MCP, USA has significantly contributed with (120/168 thresholds parameter were set. Then, a total of 17 authors
articles) in SCP and (48/168 articles) in MCP, Korea with were selected and associated in two clusters 1 and cluster
(68/89articles) in SCP and (21/89 articles) MCP, India with 2. In this network, cluster1 represented in red colour has 3
(36/42 articles) SCP and (6/42 articles) MCP, United King- group authors (de cecco, carlo n), (tesche, christian), (sch-
dom with (21/38 articles) SCP and (17/38 articles) MCP. oef, joseph) and green colour having 2 group authors (kim,
Overall, the USA, China, United Kingdom, Korea, and India young-hak), (lee, June-go). Among both cluster1 and clus-
have done huge research in this area and published maxi- ter2, 4 group authors (tesche, christian), (schoepf, u. joseph),
mum numbers of SCP, which shows that these countries (de cecco, carlo n.) and (kim, young-hak) have equal num-
have greatly contributed to the application area of machine bers of link strength, and authors (lee, June-goo) have only
learning in healthcare. While more numbers of MCP show one link connection in machine learning application areas.
worthy research collaboration with other countries.

Fig. 11  Co-authorship visu-


alization in machine learning
application in healthcare

Fig. 12  Co-authorship visuali-


zation in deep learning applica-
tion in healthcare

13
3688 J. Kumari et al.

Fig. 13  Co-authors-Country
collaboration visualization in
machine learning application

While, in Fig. 12, in deep learning application areas the The Network (Fig. 14) has generated the minimum num-
most productive author was selected to meet threshold 8, ber of documents for country 5, and the minimum number
out of 4220 authors, and it was associated with two clus- of citations for country 10, to meet threshold 36. Out of the
ters cluster1 and cluster2. In cluster1, the group authors are 77 countries, which meet the set-up parameter and most top
(Wang, Hao), (Zang, Yang), (Liu, Bo) and in cluster2 group, 20 countries listed in Table 11 have the highest collabora-
the authors are (Wang, Wei), (Huang, Wei). In this network, tion with each other. The maximum collaboration shown
cluster1 is bigger than cluster2. by countries are People from China (Article:358; Citation:
10,962), the USA (Article:249; Citation:11,291), South
3.13 Co‑Authors‑Country Collaboration Korea (Article: 101; Citation: 2664), England (Article:66;
Visualization Citation: 4015), India (Article:50; Citation: 579), Australia
(Article:49; Citation: 2074), Canada (Article:43; Citation:
Figure 13 and Table 11 exhibited, the co-authors-country 1937), have maximum numbers of collaboration with others
collaboration in the field of application of machines in the countries and contributing most in deep learning application
healthcare domain. For building the collaboration network, in the healthcare research area.
the parameter such as the minimum number of documents of
a country 3 and the minimum number of citations of a coun- 3.14 Co‑Authors‑Institute Collaboration
try 5, and out of 85 countries, 51 meet the thresholds were Visualization
taken. Based on the parameter 52 countries were selected
and different colour size bubbles and link strength thickness Table 12 and Fig. 15, show the top 20 most institute affilia-
of line, which shows the association among the co-authors- tion collaborations with the highest numbers of documents,
country interaction. Further, a total of 51 selected countries citations, and total link strength.
were associated in 7 clusters with 51 items. For creating institute network collaboration, the minimum
Table 11 enlisted the top 20 countries having the highest number of documents of an organization of 5, the mini-
number of published articles. Out of the 20 most countries, mum number of citations of an organization, and meet the
the top 5 countries, USA, England, peoples from China, threshold of 105 were set. Out of 1887 organizations, 105
India, and South Korea have a total of 410, 110, 97, 87, 69 institutes were selected. Therefore, the top 5 organizations,
articles, followed by a total citation of 6093, 1287, 1474, Harvard Medical School, Stanford University, Icahn School
686, 531 respectively. In this, the highest numbers of pub- of Medicine at Mount Sinai, MIT, and King Saud Univer-
lications, citations and more numbers of strength shown by sity have the highest numbers of publications 42, 24, 19,
the USA, England, people from China, India, and South 18, and 16 respectively. Although, the top 5 organizations
Korea have significantly contributed and a larger group of Harvard Medical School, Stanford University, University of
collaboration in this research field. Oxford, University of Pennsylvania, and University Calif

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3689

Table 11  Co-authors-Country collaboration visualization strength show the numbers of collaboration. Therefore,
Machine learning
selected organization publications and citations show that
Harvard Medical School and Stanford University contribute
Country Documents Citations Total
to the better quality of research in the field of application of
Link
Strength machine learning in the healthcare domain.
In Fig. 16, out of 1374 countries, the top 70 countries
USA 421 6093 265 were selected to meet threshold 70 and countries having a
England 110 1287 129 minimum number of documents of an institute 5, minimum
Peoples r China 97 1474 80 number of citations of an institute 10. Listed 20 countries,
India 87 686 79 most published article and significant of research, top 5
South Korea 69 531 74 countries Chinese Academy of Science, China (article:31;
Canada 67 510 66 citation:896), Tsinghua University (article:18; citation:901),
Saudi Arabia 56 471 74 University Chinese Academy of Science (article:18; cita-
Germany 55 682 94 tion:539), Tsinghua University (article:18; citation:901),
Pakistan 41 233 90 Stanford University (article:15; citation:546), National Uni-
Australia 39 456 60 versity of Defense Technology (article:14; citation: 1219),
Italy 39 385 46 Harbin Institute of Technology (article:10; citation: 1607) in
Taiwan 37 257 45 area of deep learning application in the healthcare domain.
Spain 35 287 29
Netherlands 31 392 50 3.15 Occurrence of top 20 Most Author’s Keywords
Japan 24 294 36 and Keyword plus
France 19 178 22
Switzerland 18 117 42 Table 13, Fig. 17, and Fig. 18 demonstrated the top 20
Brazil 17 135 24 most occurrences of the author’s keywords and Keyword
Malaysia 16 148 32 plus in the area application area of machine learning uses
Singapore 16 33 31 in the healthcare domain. For generating the network map,
Deep learning the parameter was used as the minimum number of occur-
Peoples r China 398 10,962 183 rences of keyword 10 for all relevant keywords, and the
USA 249 11,291 168 threshold meets 38. Out of 2603 authors’ keywords, the most
England 66 4015 67 occurrence of 38 keywords was selected. Then, the top 10
Australia 49 2074 51 most relevant author’s keywords in this field are “machine
Canada 43 1937 51 learning (Occurrence:650)”,” artificial intelligence (Occur-
South Korea 101 2664 42 rence:99)”,” healthcare (Occurrence:81)”,” deep learning
Germany 29 1334 34 (Occurrence:63)”,” big data (Occurrence:38)”,” classi-
Singapore 25 3060 31 fication (Occurrence:38)”,” covid-19(Occurrence:34)”,”
Saudi Arabia 31 381 30 internet of things (Occurrence:34)”,” natural language pro-
France 22 589 26 cessing (Occurrence:34)”,” prediction (Occurrence:33)”.
Taiwan 29 191 25 Among all 10 author’s keywords “machine learning (link
India 50 579 24 Strength:614)”,” artificial intelligence (link Strength:157)”,”
Pakistan 17 186 24 healthcare (link Strength:150)”,” deep learning (link
Italy 18 329 21 Strength:115)”. Authors, enlisted keywords show that
Netherlands 23 4265 21 machine learning is the most popular word, which is applied
Spain 26 1342 20 in several fields. Likewise, for building a network map of
Switzerland 13 739 20 the total numbers of authors’ keywords with higher num-
Japan 32 691 17 bers of occurrence frequency in deep learning applications
Denmark 6 149 16 in the healthcare domain, the threshold parameter 10, out
Portugal 9 58 16 of 2491 authors' keywords was set. Then, the top 10 high-
est generated authors' keywords are “deep learning (link
Strength:762)”, “machine learning (link Strength:198)”,
San Francisco have maximum citations 800, 573, 422, 371, “artificial intelligence (link Strength:123)”, “neural network
353, followed by link strengths 63, 27, 34, 23, 25 respec- (link Strength:98)”, “training (link Strength:75)”, “convo-
tively. The more numbers of citations of organizations show lutional neural network (link Strength:74)”, “classification
the quality of research publication and the numbers of link (link Strength:67)”, “feature extraction (link Strength:62)”,

13
3690 J. Kumari et al.

Fig. 14  Co-authors-Country
collaboration visualization in
deep learning application

“segmentation (link Strength:53)”, “Big Data (link Stren Strength:123)”,” prediction (link Strength:102)”,” images
gth:45)”. (link Strength:95)”,” recognition (link Strength:92)”,” con-
As well, Table 13, Fig. 19, and Fig. 20 revealed the top volutional neural network (link Strength:81)”,” features (link
20 most occurrences of the most relevant Keywords plus Strength:73)”,” Diagnosis (link Strength:71)” (Figs. 21, 22,
which are popular in machine learning uses in the health- 23, 24)
care domain. For generating the network map, out of 2063
keywords plus, only 63 keywords were selected, onset 3.16 Year‑Wise Identification of the top 20 most
parameter such as a minimum number of occurrences of Trending Author’s KEYWORDS and Keywords
keyword 10 for all relevant keywords plus, and the thresh- in Machine Learning and Deep Learning uses
old meet 63. Further, among 63 selected keywords, the in Healthcare.
10 most popular keywords in this field are “classification
(Occurrence:113)”,” Prediction (Occurrence:82)”,” Risk The below table shows the top trending author's keywords
(Occurrence:58)”,” diagnosis (Occurrence:57)”,” model and keywords plus keywords used in machine learning and
(Occurrence:54)”,” validation (Occurrence:50)”,” Mor- deep learning application in the healthcare domain.
tality (Occurrence:46)”,” Models (Occurrence:42)”,” Table 14 represented the top 20 authors' keywords
System (Occurrence:42)”,” bigdata (Occurrence:40)”. with years of frequency of machine learning techniques
Among all author’s keywords topmost popular keywords applied in the healthcare domain are “machine learning
are “classification (link Strength:293)”,” Prediction (link (freq.:617)”,”artificial intelligence (freq.:93)”, “healthcare
Strength:181)”,” Risk (link Strength:112)”,” diagnosis (freq.:91)”, “deep learning (freq.:61)”,”prediction (freq.:43)”
(link Strength:122)”,” model (link Strength:126)”,” valida- etc. likewise most trending authors keywords of deep learn-
tion (link Strength:126)”,” Mortality (link Strength:105)”. ing techniques applying in healthcare domain are “deep
Authors, enlisted keywords show the highest link strength learning (freq.:994)”,”machine learning (freq.:103)”, “arti-
with other relevant words, which is about other fields. ficial intelligence (freq.:65)”,”neural network (freq.:49)”,”
Also, the selected total keywords plus, on the mini- Convolutional Neural Network (freq.:43)” etc. it also demon-
mum number of occurrence frequency 10, meet thresholds strated the yearly most popular authors keyword are machine
42, out of 1610 keywords were set up. Out of all gener- learning, artificial intelligence, healthcare, prediction model,
ated 1610 keywords plus, the top 10 highest link strength data analytics, medical services, deep learning, diagnosis,
keywords are” classification (link Strength:271)”,” Model Convolutional Neural Network, biomedical images, etc. in
(link Strength:177)”,” neural network (link Strength:176)”,” the application of machine learning and deep learning in
segmentation (link Strength:150)”,” algorithm (link healthcare fields during the year 2019–2021.

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3691

Table 12  Co-authors-Institute Machine Learning


collaboration visualization
Organization Documents Citations Total
Link
Strength

Harvard Med School 42 800 63


Stanford University 24 573 27
Icahn School of Medicine at Mount Sinai 19 262 13
MIT 18 263 28
King Saudi University 16 224 4
University of Oxford 16 422 34
University of California San Francisco 15 353 25
University of Toronto 15 50 4
Brigham & Women’s Hospital 14 142 22
Columbia University 13 56 20
Massachusetts Gen Hosp 13 214 20
University of Pennsylvania 13 371 23
Imperial College of London 12 96 11
Pennsylvania State University 12 169 1
University of California San Diego 12 169 15
University of Washington 12 222 14
Emory University 11 237 17
Kings College of London 11 174 10
University of California Los Angeles 11 53 9
Boston Children’s Hospital 10 339 18
Deep Learning
Chinese Academy of Science 31 896 32
University Chinese Academy of Science 18 539 23
University of Electrical Science & Technology 10 588 13
National University of Defense Technology 14 1219 12
North-eastern University 9 329 12
Peng Cheng Laboratory 6 56 9
Tsinghua University 18 901 9
Harbin Institute of Technology 10 1607 8
Nanyang Technology University 11 2140 8
Chinese University Hong Kong 8 472 7
Guangdong University Technology 6 59 7
Tianjin University 9 234 7
UCL 11 491 7
Xidian University 5 113 7
Beihang University 10 168 6
Huazhong University of Science & Technology 11 99 6
Hunan University 6 466 6
Peking University 9 147 6
Stanford University 15 546 6
China Medical University 5 18 5

Table 15 exhibited yearly published numbers of the fre- article with keywords plus classification(freq.:133), predic-
quency with the popular keywords plus in the research area tion (freq.:86), Risk (freq.:62), diagnosis (freq.:57), model
of machine learning and deep learning techniques applied (freq.:56), regression (freq.:26), segmentation (freq.:18)
healthcare domain. table enlisted the total numbers of an etc. and neural network(freq.:138), classification(freq.:123),

13
3692 J. Kumari et al.

Fig. 15  Co-authors-Institute collaboration visualization in machine learning

Fig. 16  Co-authors-Institute collaboration visualization in deep learning application

model (freq.:78), algorithms (freq.:59), prediction(freq.:54), feature selection, classification, and clustering steps. All
segmentation(freq.:50) etc. in area of Machine learning and literature was published on machine learning (ML) and
deep learning in the healthcare domain. deep learning (DL) in the Healthcare sector, as well as the
These tables also demonstrate the year-wise highly popu- application of machine learning (ML) and deep learning
lar keywords plus are classification, prediction, diagnosis, (DL) related to the healthcare domain.
model, segmentation, classifier, data analytics, network, This study has established the bibliometric analysis tech-
convolutional neural network, architecture, etc. in both the nique in the research area of machine learning, deep learn-
area of machine learning and deep learning applied in the ing, and the healthcare field. And it revealed the worldwide
healthcare field during the year 2018–2021. Among all listed research trends and performance analysis of the subject area.
keywords in Table15 classification, segmentation, classifier, Up to now, there is a substantial gap in current research
network, model, and regression, the convolutional neural about the bibliometric analysis of Machine Learning (ML),
network is heavily used in the application filed research Deep Learning (DL), and the Healthcare field. In this study,
areas. selected topical, title, and all fields’ keywords were used
to extract the most relevant research paper from the Web
of Science (WoS) core collection database, which included
4 Summary Science Citation Index Expanded (SCI) papers and articles,
review article, early access document type paper from the
As emerging techniques, Machine Learning (ML) and period from 2010–30 June 2021.
Deep Learning (DL) have shown incredible potential Globally, in machine learning, deep learning, and health-
in tackling challenging problems in several fields. This care, a total of 98,169, 51,559 and 1,67,326 articles with
study has focused on those problems in healthcare that topical keywords search and 29,653, 19,493, and 38,109
have been addressed using machine learning (ML) and articles with title-wise keywords search bibliometric infor-
deep learning (DL) with promising results globally. Both mation data were downloaded from the WoS database during
techniques have been shown as powerful tools in dealing 2010–2021(accessed 31 august 2021) respectively. Similarly,
with disease detection in preprocessing, feature extraction, A total of 5,065, 2,755, and 4,588 articles from topic-wise

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3693

Table 13  Occurrence of top 20 most author’s keywords and Keyword plus


Machine learning
Author's Keywords Keyword plus
Keyword Occurrences Total Link Keyword Occurrences Total
Strength Link
Strength

Machine Learning 650 614 Classification 133 293


Artificial Intelligence 99 157 Prediction 82 181
Healthcare 81 150 Risk 58 112
Deep Learning 63 115 Diagnosis 57 122
Big Data 38 63 Model 54 126
Classification 38 61 Validation 50 126
Covid-19 34 65 Mortality 46 105
Internet Of Things 34 79 Models 42 75
Natural Language Processing 34 54 System 42 96
Prediction 33 49 Big Data 40 97
Electronic Health Records 24 30 Outcomes 38 109
Feature Selection 20 34 Internet 32 63
Random Forest 20 27 Management 32 67
Data Mining 19 31 Care 31 74
Feature Extraction 17 38 Disease 28 51
Medical Services 14 51 Healthcare 27 73
Predictive Models 14 39 Impact 27 58
Support Vector Machine 14 21 Regression 26 57
Electronic Health Record 13 26 Artificial-Intelligence 25 64
Predictive Analytics 13 18 Cancer 25 76
Deep learning
Deep Learning 944 762 Neural-Networks 138 176
Machine Learning 103 198 Classification 123 271
Artificial Intelligence 65 123 Model 78 177
Neural Networks 48 89 Algorithm 59 123
Convolutional Neural Network 43 64 Prediction 54 120
Convolutional Neural Networks 40 74 Segmentation 50 150
Classification 34 67 Convolutional Neural-Networks 43 102
Neural Network 33 63 Networks 43 31
Feature Extraction 29 62 Neural-Network 42 62
Training 29 75 Recognition 38 92
CNN 28 47 Images 37 95
Segmentation 27 53 Convolutional Neural-Network 33 81
LSTM 21 41 Framework 32 73
Computer Vision 20 37 Network 31 42
Big Data 19 45 Diagnosis 27 68
Survey 17 36 Features 27 71
Bioinformatics 16 33 System 27 46
Medical Imaging 15 32 Models 26 28
Image Recognition 14 22 Cancer 24 55
Object Detection 14 26 Reconstruction 24 45
Sentiment Analysis 14 32 Performance 20 49

13
3694 J. Kumari et al.

Fig. 17  Network visualization


of occurrence of top 20 most
authors’ keywords in machine
learning application

Fig. 18  Network visualization


of occurrence of the top 20
keywords plus in the machine
learning area

categories and 1,380, 992, and 894 from title-wise categories articles globally and 142 articles topics-wise and 13 articles
with searched keywords Machine learning, Deep Learning, titles-wise at the Indian level (Table 1).
and Healthcare has been used in Indian scientific research. The purpose of this bibliometric data collection is to
Additionally, exploring the application of machine learning perform a bibliometric analysis, and network visualization,
in the healthcare and deep learning in the healthcare domain then evaluate the latest followed by the Document Type and
has published a total of 2,014 topics wise and 119 title-wise Language, Publication output, Top Country Contribution,
articles and 922 and 58 respectively worldwide. Similarly, Top WoS core Categories and Journals, Top Authors, Top
an Indian research prospect in the application of machine Research Areas and Analysis of Author Keywords, Keyword
learning and deep learning in the healthcare domain has Plus related and most trending topic in machine learning,
published a total of 218 topic-wise articles and 16 title-wise Deep learning uses in Healthcare. also, this paper analyses

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3695

Fig. 19  Network visualization


of occurrence of top 20 most
authors’ keywords in deep
learning application

Fig. 20  Network visualization


of occurrence of top 20 most
keywords plus keywords in deep
learning application

the number of citations based on the impact of multiple fac- At the worldwide level, the total topical keywords-based
tors of the research paper as the number of authors, number article published is 86,292 research articles, 5,794 review
of pages, and number of references. All bibliometric infor- articles, 2,854 early access articles with machine learning,
mation data were downloaded by all relevant and latest lit- 46,217 research article, 2,412 review article, 2,130 early
erature published in ML, DL, and Healthcare research fields access article in deep learning and 1,28,247 research arti-
also implementation of machine learning and deep learning cles, 22,071 review article, 3,540 early access article in the
in healthcare fields, and all the source titles were included healthcare domain, also total numbers of research article
in web of science categories with three different document 1,625, review articles 316, early access 98 in the application
type article, review article, and early access article. of machine learning in healthcare and total of 750 research

13
3696 J. Kumari et al.

Fig. 21  Year-wise most trending


authors’ keyword tree map of
application of Machine learning
in the healthcare domain

Fig. 22  Year-wise most trending


authors’ keyword tree map of
application of Machine learning
in the healthcare domain

articles, 149 review articles, 74 early access articles in deep articles 788 early access articles in deep learning and 21,
learning application in healthcare. Comparatively, title wise 545 research articles, 2,730 review article and 694 early
query search was conducted and obtained a total of 22,935 access articles with healthcare, further, total research arti-
research articles, 1,449 review articles and 924 with machine cles 74, review articles 11, early access articles 10 and 42
learning, followed by 16,055 research articles 730, review research articles, 5 review articles, 9 relay access articles in

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3697

learning(ML) and deep learning(DL) in healthcare fields


are the USA(articles frequency:2043), China(articles fre-
quency:589), UK(articles frequency:375), South Korea (arti-
cles frequency:274), Canada(articles frequency:264), India
(articles frequency:231) and USA(articles frequency:1327),
China(articles frequency:1103), South Korea(articles fre-
quency:398), India(articles frequency:330), UK (articles
frequency:329). Among the top 10 most cited countries,
the three topmost cited country having the USA (total
citation:5072; avg articles citation:14.7), China (total
citation:1434; avg articles citation:14.2), the UK (total
citation:847; avg articles citation:12.28), India (total cita-
tion:434; avg articles citation:7.23), Norway (total cita-
Fig. 23  Year-wise most trending keyword is Plus word cloud in tion:316; avg articles citation:63.2) in the area of machine
Machine learning applied in the healthcare domain learning use in healthcare. similarly, USA (total cita-
tion:5625; avg articles citation:26.41), China (total cita-
tion:2686; avg articles citation:13.57), UK (total cita-
both machine learning and deep learning application in the tion:1176; avg articles citation:192.93), Korea (total
healthcare sector (Table 2). citation:904; avg articles citation:12.05), Saudi Arabia (total
Also, worldwide five highest publication sources are citation:601; avg articles citation:13.07) in deep learning
IEEE Access, Journal of healthcare engineering, journal of use in healthcare. Out of all the most cited and average cited
biomedical informatics, PLOS One, BMC medical infor- countries USA has the highest total citations 1327 with an
matics, and decision-making publishing articles in machine average article citation 26.41 in the use of deep learning in
learning and deep learning in healthcare fields. In the area healthcare.
of machine learning application these sources of publica- And top five authors are Zhang Y(articles:14), Li
tion total articles such as IEEE Access(articles:52), Journal X(articles:11), Li J(articles:10), Li Y (article:9), Wang Z
of healthcare engineering(article:22), journal of biomedical (article:8) and Li H(articles:17), Li J(articles:17), Wang
informatics(article:20), PLOS One(article:20), BMC medi- H(aricles:14), Yang J(article:14), Liu Y(article:13) respec-
cal informatics, and decision making(article:19). Respec- tively.and the top five most affiliated Icahn School of
tively, in deep learning in the healthcare area these sources Medicine at Mount Sinai, New York(articles:100), Har-
of publication total articles like IEEE Access(articles:78), vard Medical School, Massachusetts(articles:78), Stanford
Journal of healthcare engineering(article:35), journal of bio- University, California(articles:68) and Stanford Univer-
medical informatics(article:15), BMC medical informatics sity, California(articles:109), King Saud University, Saudi
and decision making(article:11). Arabia(articles:89), Johns Hopkins University(articles:84)
From the global prospect, the top three countries that pro- are contributed greatly to both research fields.
duced more research articles on the application of machine

Fig. 24  Year-wise most trend-


ing keyword Plus word cloud
in Deep learning applied in the
healthcare domain

13
3698 J. Kumari et al.

Table 14  Top 20 most trending author’s keywords in the application Table 15  Top 20 most trending keywords Plus in the application of
of machine learning and deep learning in the healthcare sector machine learning and deep learning in the healthcare sector
Machine Learning Machine learning
Item Freq Year_Q1 Year_Med Year_Q3 Item Freq Year_Q1 Year_Med Year_Q3

Machine Learning 617 2019 2020 2021 Classification 133 2018 2020 2020
Artificial Intelligence 93 2019 2020 2021 Prediction 86 2019 2020 2021
Healthcare 91 2019 2020 2021 Risk 62 2018 2020 2021
Learning 68 2019 2020 2020 Diagnosis 57 2019 2020 2021
Deep Learning 61 2019 2020 2021 Model 56 2019 2020 2020
Prediction 43 2018 2019 2020 Regression 26 2019 2019 2020
Big Data 39 2018 2019 2020 Segmentation 18 2018 2019 2020
Covid-19 34 2020 2021 2021 Survival 15 2018 2019 2020
Data Mining 18 2017 2018 2020 IOT 13 2020 2021 2021
Medical Services 14 2020 2021 2021 Networks 12 2017 2018 2020
Diabetes 13 2021 2021 2021 Dementia 11 2018 2019 2020
Prediction Model 11 2019 2019 2021 Therapy 11 2018 2019 2020
Data Analytics 11 2020 2021 2021 Alzheimer’s-Disease 8 2016 2017 2018
IOT 11 2020 2021 2021 Records 7 2014 2017 2020
Alzheimer's Disease 10 2018 2019 2020 Classifiers 7 2017 2018 2020
Electronic Health 9 2018 2019 2020 Convolutional Neural- 7 2020 2021 2021
Big Data Analytics 7 2016 2018 2019 Network
Feature 7 2016 2018 2020 Admission 6 2017 2018 2020
Clustering 5 2016 2018 2020 Architecture 6 2021 2021 2021
Critical Care 5 2018 2018 2019 Data Analytics 6 2021 2021 2021
Deep Learning Agreement 5 2018 2018 2020
Deep Learning 994 2019 2020 2021 Deep learning
Machine Learning 103 2018 2020 2020 Neural-Networks 138 2018 2019 2020
Artificial Intelligence 65 2019 2020 2021 Classification 123 2019 2020 2021
Neural Networks 49 2019 2020 2021 Model 78 2019 2020 2020
Convolutional Neural 43 2019 2020 2021 Algorithm 59 2018 2019 2020
Network Prediction 54 2019 2020 2021
Neural Network 33 2018 2019 2020 Segmentation 50 2019 2020 2021
Big Data 19 2018 2019 2020 Convolutional Neural- 43 2019 2020 2021
Artificial Neural Network 11 2018 2019 2020 Networks
Data Mining 9 2017 2018 2020 Features 27 2018 2019 2020
Security 9 2019 2021 2021 Representation 15 2018 2019 2020
Recurrent Neural Network 8 2019 2019 2020 Representations 14 2018 2019 2019
Recurrent Neural Networks 7 2018 2019 2020 Automatic Segmentation 6 2017 2018 2019
Wireless Communication 7 2020 2021 2021 Support Vector Machines 6 2018 2018 2019
Biomedical Imaging 6 2020 2021 2021 Inverse Problems 6 2020 2021 2021
Hyperspectral Imaging 6 2020 2021 2021 Object Detection 6 2020 2021 2021
Diagnosis 5 2019 2021 2021 Signals 6 2021 2021 2021
Belief Networks 5 2014 2017 2018
Secondary Structure 5 2017 2017 2018
Automated Detection 5 2018 2018 2021
The network analysis of the top 5 relevant terms yielded
Challenges 5 2018 2018 2020
classification, prediction, neural network, model(s), design
Proteins 5 2017 2018 2020
in machine learning, deep learning, quality, management,
Classifiers 5 2020 2021 2021
impact, services, and prevalence in healthcare research. Dur-
ing 2010–2021, the top 5 authors' keywords were machine
learning, artificial intelligence, deep learning, classification,
and neural networks in machine learning and deep learning,

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3699

and healthcare, healthcare professionals, health policy, pri- Arabia(articles:89), Johns Hopkins University(articles:84)
mary healthcare, patient safety, etc. in healthcare. are contributed greatly in both research fields.
In addition, the top five trending keywords in the machine This study revealed that machine learning, and deep
field are support vector machine, networks, algorithms, learning, in the healthcare field are some of the emerging
genetic algorithm approximation, neural network, classifica- areas of research that attract researchers to contribute to
tion, algorithm, architecture, dimensionality, sequence, mor- the future at a global level. Thus, the bibliometric study is
tality, attitudes, information, epidemiology, models, etc. The a solution for detailed analysis of machine learning (ML)
top authors' keywords in current research are classification, and deep learning (DL), and healthcare research areas and it
data mining, support vector machine(s), data analysis, unsu- helps to work in these areas special focusing on the health-
pervised learning in machine learning, artificial intelligence, care domain. The key purpose of this endeavour is to raise
convolutional neural network, neural network, classification, awareness among academics and professionals about how
object detection in deep learning, epidemiology, influenza, machine learning (ML) and deep learning (DL) methods
healthcare costs, healthcare service research, and e-health affect the medical field. More work needs to be done in this
in healthcare. area, but it is expected to grow rapidly and have a greater
effect on a structured dataset of greater size in the future.
As a result, this ground-breaking study will assist scientists
5 Conclusion and Future Trends in maintaining a culture of constant innovation as they con-
struct robust technologies tailored to the healthcare sector.
When applied to any field of study, bibliometric analysis We believe this research objective will encourage academ-
offers a fresh perspective on the study of research patterns. ics to keep exploring the potential of Machine learning and
The purpose of this research was to apply bibliometric analy- Deep learning in the medical field.
sis techniques to uncover previously unknown relationships
between machine learning, deep learning, and healthcare
Author Contribution Dr. Ela Kumar (EK) conceived and designed the
research areas, and to gain a deeper comprehension of how study, Ms. Juli Kumari (JK) performed the research, analyzed the data,
machine learning (ML) and deep learning (DL) methods and Dr. Deepak Kumar (DK) contributed to editorial input. Concep-
are being used to improve healthcare delivery by minimiz- tualization, methodology and formal analysis: JK, DK; investigation:
ing human error in areas such as disease detection, diagno- EK; visualization: JK, DK; writing—original draft: JK; writing—
review and editing: DK, EK; All authors read and approved the final
sis, prediction, drug discovery, precision medicine, robotic manuscript.
surgery, etc. This study provides a fresh perspective on the
intersection of machine learning, deep learning, healthcare Data Availability The datasets used and/or analyzed during the current
research, and the solutions they provide. In addition, the study are available from the corresponding author upon reasonable
request.
complete bibliometric information data on this topic has
been analyzed using query keywords such as "machine Declarations
learning," "deep learning," and "healthcare" from the web of
science core collection from the period 2010 through August Conflicts of interest The authors declare that there is no conflict of
30th, 2021, with a special emphasis on journals included in interests regarding the publication of this paper.
the SCI-Extended Index. In addition, this report discovers Ethical Approval Not applicable.
the extensive adoption of machine learning and deep learn-
ing techniques in healthcare and other domains by university Consent for Publication All authors read and approved the final manu-
researchers and industry practitioners. And it facilitates the script.
application of these methods in other fields by researchers
and practitioners.
From a global prospect, the top three countries that References
produced more research articles on the application of
machine learning (ML), and deep learning (DL) in health- 1. Garg M (2023) Mental health analysis in social media posts :
care fields are the USA, China, and the UK. And top a survey. Arch Comput Methods Eng. https://​doi.​org/​10.​1007/​
s11831-​022-​09863-z
three authors are Zhang Y., Li X, Li J, Li H, Li J, Wang 2. Lundervold AS, Lundervold A (2019) An overview of deep learn-
H and the top 3 most affiliated Icahn School of Medicine ing in medical imaging focusing on MRI. Z Med Phys 29(2):102–
at Mount Sinai, New York(articles:100), Harvard Medi- 127. https://​doi.​org/​10.​1016/j.​zemedi.​2018.​11.​002
cal School, Massachusetts(articles:78), Stanford Uni- 3. Maity S, Bhattacharyya A, Singh PK, Kumar M, Sarkar R (2022)
Last decade in vehicle detection and classification: a comprehen-
versity, California (articles:68) and Stanford University, sive survey. Arch Comput Methods Eng 29(7):5259–5296. https://​
California(articles:109), King Saud University, Saudi doi.​org/​10.​1007/​s11831-​022-​09764-1

13
3700 J. Kumari et al.

4. Magotra B, Malhotra D, Dogra AK (2022) Adaptive computa- 21. Jiang F et al (2017) Artificial intelligence in healthcare: past, pre-
tional solutions to energy efficiency in cloud computing envi- sent and future. STROKE Vasc Neurol 2(4):230–243. https://​doi.​
ronment using VM consolidation. Arch Comput Methods Eng. org/​10.​1136/​svn-​2017-​000101
https://​doi.​org/​10.​1007/​s11831-​022-​09852-2 22. Rudin C (2019) Models for high stakes decisions and use inter-
5. Gopal K, Swarnajit D, Rebika R, Arunita R (2023) Archimedes pretable models instead. Nat Mach Intell 1(May):1–10
Optimizer : Theory, Analysis, Improvements, and Applications. 23. Yu K-H, Beam AL, Kohane IS (2018) Artificial intelligence in
Springer, Netherlands healthcare. Nat Biomed Eng 2(10):719–731. https://​doi.​org/​10.​
6. Shamshirband S, Fathi M, Dehzangi A, Chronopoulos AT, Aline- 1038/​s41551-​018-​0305-z
jad-Rokny H (2021) A review on deep learning approaches in 24. Chen M, Hao Y, Hwang K, Wang L, Wang L (2017) Disease pre-
healthcare systems: Taxonomies, challenges, and open issues. J diction by machine learning over big data from healthcare com-
Biomed Inform. https://​doi.​org/​10.​1016/j.​jbi.​2020.​103627 munities. IEEE ACCESS 5:8869–8879. https://​doi.​org/​10.​1109/​
7. Zhou S, Wu J, Zhang F, Sehdev P (2020) Depth Occlusion Percep- ACCESS.​2017.​26944​46
tion Feature Analysis for Person Re-identification. Pattern Recog- 25. M. Matthew M Churpek, Trevor C Yuen, Christopher Winslow,
nit Lett. https://​doi.​org/​10.​1016/j.​patrec.​2020.​09.​009 David O Meltzer, Michael W Kattan, and Dana P Edelson, (2016)
8. D. S. Maity, Niharika G, 2017 Machine Learning for Improved Multicenter comparison of machine learning methods and con-
Diagnosis and Prognosis in Healthcare 1–9 ventional regression for predicting clinical deterioration on the
9. Arooj A, Farooq MS, Akram A, Iqbal R, Sharma A, Dhiman wards. Crit Care Med 44(2):368–374. https://​doi.​org/​10.​1097/​
G (2022) Big data processing and analysis in internet of vehi- CCM.​00000​00000​001571
cles: architecture, taxonomy, and open research challenges. Arch 26. Nemati S, Holder A, Razmi F, Stanley MD, Clifford GD, Buchman
Comput Methods Eng 29(2):793–829. https://​doi.​org/​10.​1007/​ TG (2018) An interpretable machine learning model for accurate
s11831-​021-​09590-x prediction of sepsis in the ICU. Crit Care Med 46(4):547–553.
10. Nemade V, Pathak S, Dubey AK (2022) A systematic literature https://​doi.​org/​10.​1097/​CCM.​00000​00000​002936
review of breast cancer diagnosis using machine intelligence tech- 27. Wu J, Roy J, Stewart WF (2010) Prediction modeling using EHR
niques. Arch Comput Methods Eng 29(6):4401–4430. https://​doi.​ data. Med Care 48(6):S106–S113. https://​doi.​org/​10.​1097/​mlr.​
org/​10.​1007/​s11831-​022-​09738-3 0b013​e3181​de9e17
11. Caicedo D, Lara-Valencia L, Valencia Y (2022) machine learn- 28. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G (2018)
ing techniques and population-based metaheuristics for damage Potential biases in ML algorithms using EHR data. JAMA Intern
detection and localization through frequency and modal-based Med 178(11):1544–1547. https://d​ oi.o​ rg/1​ 0.1​ 001/j​ amain​ ternm​ ed.​
structural health monitoring: a review. Arch Comput Methods Eng 2018.​3763.​Poten​tial
29(6):3541–3565. https://​doi.​org/​10.​1007/​s11831-​021-​09692-6 29. Andrew Taylor MR, Pare JR, Venkatesh AK, Mowafi H, Melnick
12. Wang X, Hu H, Liang Y, Zhou L (2022) On the mathematical ER, Fleischman W, Kennedy Hall M (2016) Prediction of in-
models and applications of swarm intelligent optimization algo- hospital mortality in emergency department patients with sepsis:
rithms. Arch Comput Methods Eng 29(6):3815–3842. https://​doi.​ a local big data–driven, machine learning approach. Acad Emerg
org/​10.​1007/​s11831-​022-​09717-8 Med 3(23):269–278. https://​doi.​org/​10.​1111/​acem.​12876
13. Fathi M, Haghi Kashani M, Jameii SM, Mahdipour E (2022) Big 30. Narula S, Shameer K, Salem Omar AM, Dudley JT, Sengupta PP
data analytics in weather forecasting: a systematic review. Arch (2016) Machine-learning algorithms to automate morphological
Comput Methods Eng 29(2):1247–1275. https://​doi.​org/​10.​1007/​ and functional assessments in 2D echocardiography. J Am Coll
s11831-​021-​09616-4 Cardiol 68(21):2287–2295. https://​doi.​org/​10.​1016/j.​jacc.​2016.​
14. Paturi UMR, Cheruku S, Reddy NS (2022) The role of artifi- 08.​062
cial neural networks in prediction of mechanical and tribological 31. Itu L et al (2016) A machine-learning approach for computation
properties of composites—a comprehensive review. Arch Com- of fractional flow reserve from coronary computed tomography.
put Methods Eng 29(5):3109–3149. https://​doi.​org/​10.​1007/​ J Appl Physiol 121(1):42–52. https://​doi.​org/​10.​1152/​jappl​physi​
s11831-​021-​09691-7 ol.​00752.​2015
15. Alhaqbani A, Kurdi HA, Hosny M (2022) Fish-inspired heu- 32. Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta
ristics: a survey of the state-of-the-art methods. Arch Com- PP (2018) Machine learning in cardiovascular medicine: are we
put Methods Eng 29(6):3655–3675. https://​d oi.​o rg/​1 0.​1 007/​ there yet? Heart 104(14):1156–1164. https://d​ oi.o​ rg/1​ 0.1​ 136/h​ eart​
s11831-​022-​09711-0 jnl-​2017-​311198
16. Chahar S, Roy PK (2022) COVID-19: a comprehensive review 33. Coenen A et al (2018) Diagnostic accuracy of a machine-learning
of learning models. Arch Comput Methods Eng 29(3):1915– approach to coronary computed tomographic angiography–based
1940. https://​doi.​org/​10.​1007/​s11831-​021-​09641-3 fractional flow reserve result from the MACHINE Consortium.
17. Wamba MMQ, Fosso S (2021) Responsible artificial intel- Circ Cardiovasc Imaging 11(6):1–11. https://​doi.​org/​10.​1161/​
ligence as a secret ingredient for digital health : bibliometric CIRCI​MAGING.​117.​007217
analysis, insights, and research directions. Inf Syst forntiers. 34. Lalmuanawma S et al (2020) Since January 2020 Elsevier has cre-
https://​doi.​org/​10.​1007/​s10796-​021-​10142-8 ated a COVID-19 resource centre with free information in English
18. Kaur A, Singh Y, Neeru N, Kaur L, Singh A (2022) A sur- and Mandarin on the novel coronavirus COVID- 19. The COVID-
vey on deep learning approaches to medical images and a 19 resource centre is hosted on elsevier connect, the company ’ s
systematic look up into real-time object detection. Arch Com- public news and information. Int J Adv Res Sci Commun Technol
put Methods Eng 29(4):2071–2111. https://​d oi.​o rg/​1 0.​1 007/​ 4(January):160–164
s11831-​021-​09649-9 35. Feng Z et al (2018) Machine learning-based quantitative tex-
19. Aria M, Cuccurullo C (2017) bibliometrix: an R-tool for com- ture analysis of CT images of small renal masses: differentia-
prehensive science mapping analysis. J Informetr 11(4):959–975. tion of angiomyolipoma without visible fat from renal cell car-
https://​doi.​org/​10.​1016/j.​joi.​2017.​08.​007 cinoma. Eur Radiol 28(4):1625–1633. https://​doi.​org/​10.​1007/​
20. N. J. Van Eck and L. Waltman, 2018 “VOSviewer Manual: Man- s00330-​017-​5118-z
ual for VOSviewer version 1.6.7,” Univeristeit Leiden, 51

13
A Structured Analysis to study the Role of Machine Learning and Deep Learning in The Healthcare… 3701

36. Salvatore C, Cerasa A, Battista P, Gilardi MC, Quattrone A, Soft Comput J 62:915–922. https://​doi.​org/​10.​1016/j.​asoc.​2017.​
Castiglioni I (2015) Magnetic resonance imaging biomarkers for 09.​027
the early diagnosis of Alzheimer’s disease: a machine learning 49. Schmidt-Erfurth U, Sadeghipour A, Gerendas BS, Waldstein SM,
approach. Front Neurosci. https://​doi.​org/​10.​3389/​fnins.​2015.​ Bogunović H (2018) Artificial intelligence in retina. Prog Retin
00307 Eye Res 67(May):1–29. https://d​ oi.o​ rg/1​ 0.1​ 016/j.p​ retey​ eres.2​ 018.​
37. Wiens J, Shenoy ES (2018) Machine learning for healthcare: on 07.​004
the verge of a major shift in healthcare epidemiology. Clin Infect 50. Ravi D et al (2017) Deep learning for health informatics. IEEE J
Dis 66(1):149–153. https://​doi.​org/​10.​1093/​cid/​cix731 Biomed Heal Inform 21(1):4–21. https://​doi.​org/​10.​1109/​JBHI.​
38. Mohan S, Thirumalai C, Srivastava G (2019) Effective heart dis- 2016.​26366​65
ease prediction using hybrid machine learning techniques. IEEE 51. Larson DB, Chen MC, Lungren MP, Halabi SS, Stence NV, Lan-
ACCESS 7:81542–81554. https://d​ oi.o​ rg/1​ 0.1​ 109/A ​ CCESS.2​ 019.​ glotz CP (2018) Performance of a deep-learning neural network
29237​07 model in assessing skeletal maturity on pediatric hand radiographs
39. Kermany DS et al (2018) Identifying medical diagnoses and treat- 1 PEDIATRIC IMAGING: neural network to assess skeletal matu-
able diseases by image-based deep learning. Cell 172(5):1122- rity on pediatric hand radiographs larson et al materials and meth-
1131.e9. https://​doi.​org/​10.​1016/j.​cell.​2018.​02.​010 ods data acquisit. Radiology 287(1):313–322
40. Coudray N et al (2018) Classification and mutation prediction 52. Ortiz A, Munilla J, Górriz JM, Ramírez J (2016) Ensembles of
from non–small cell lung cancer histopathology images using deep learning architectures for the early diagnosis of the alzhei-
deep learning. Nat Med 24(10):1559–1567. https://​doi.​org/​10.​ mer’s disease. Int J Neural Syst 26(7):2–3. https://d​ oi.o​ rg/1​ 0.1​ 142/​
1038/​s41591-​018-​0177-5 S0129​06571​65002​58
41. Rajkomar A et al (2018) Scalable and accurate deep learning with 53. Steiner DF et al (2018) Impact of deep learning assistance on
electronic health records, npj Digit. Med 1(1):1–34. https://​doi.​ the histopathologic review of lymph nodes for metastatic breast
org/​10.​1038/​s41746-​018-​0029-1 cancer. Am J Surg Pathol 42(12):1636–1646. https://​doi.​org/​10.​
42. Esteva A et al (2019) A guide to deep learning in healthcare. Nat 1097/​PAS.​00000​00000​001151
Med 25(1):24–29. https://​doi.​org/​10.​1038/​s41591-​018-​0316-z 54. Nagendran M et al (2020) Artificial intelligence versus clinicians:
43. Miotto R, Wang F, Wang S, Jiang X, Dudley JT (2017) Deep Systematic review of design, reporting standards, and claims of
learning for healthcare: review, opportunities and challenges. deep learning studies in medical imaging. BMJ 368:1–12. https://​
Brief Bioinform 19(6):1236–1246. https://​doi.​org/​10.​1093/​bib/​ doi.​org/​10.​1136/​bmj.​m689
bbx044 55. Chakraborty S, Chakraborty S (2022) A scoping review on the
44. Lee J-G et al (2017) Deep learning in medical imaging: general applications of mcdm techniques for parametric optimization of
overview. KOREAN J Radiol 18(4):570–584. https://​doi.​org/​10.​ machining processes. Arch Comput Methods Eng 29(6):4165–
3348/​kjr.​2017.​18.4.​570 4186. https://​doi.​org/​10.​1007/​s11831-​022-​09731-w
45. Faust O, Hagiwara Y, Hong TJ, Lih OS, Acharya UR (2018)
Deep learning for healthcare applications based on physiological Publisher's Note Springer Nature remains neutral with regard to
signals: a review. Comput Methods Programs Biomed 161:1–13. jurisdictional claims in published maps and institutional affiliations.
https://​doi.​org/​10.​1016/j.​cmpb.​2018.​04.​005
46. D. K. S. Nishijima David L, Wisner David H, Holmes James F Springer Nature or its licensor (e.g. a society or other partner) holds
(2016) Deep EHR: a survey of recent advances in deep learning exclusive rights to this article under a publishing agreement with the
techniques. Physiol Behav 176(5):139–148. https://​doi.​org/​10.​ author(s) or other rightsholder(s); author self-archiving of the accepted
1109/​JBHI.​2017.​27670​63.​Deep manuscript version of this article is solely governed by the terms of
47. Ting DSW et al (2019) Artificial intelligence and deep learning such publishing agreement and applicable law.
in ophthalmology. Br J Ophthalmol 103(2):167–175. https://​doi.​
org/​10.​1136/​bjoph​thalm​ol-​2018-​313173
48. Ignatov A (2018) Real-time human activity recognition from
accelerometer data using convolutional neural networks. Appl

13
International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023

Contents available at the publisher website: G A F T I M . C O M

Journal homepage: https://journals.gaftim.com/index.php/ijcim/index

Big Data Analytics in Healthcare: Exploring the Role of Machine Learning in Predicting
Patient Outcomes and Improving Healthcare Delivery
Federico Del Giorgio Solfa¹, Fernando Rogelio Simonato²
1 ˒2 National University of La Plata (UNLP), Argentina

A R T I C L E I N F O A B S T R A C T

Keywords: Healthcare professionals decide wisely about personalized medicine, treatment


Big Data Analytics, Machine plans, and resource allocation by utilizing big data analytics and machine learning.
Learning, Patient Outcomes, To guarantee that algorithmic recommendations are impartial and fair, however,
Healthcare Delivery, ethical issues relating to prejudice and data privacy must be taken into account. Big
Artificial Intelligence (AI). data analytics and machine learning have a great potential to disrupt healthcare,
and as these technologies continue to evolve, new opportunities to reform
Received: May, 28, 2023 healthcare and enhance patient outcomes may arise. In order to investigate the
patient’s outcomes with empirical evidence, this research was conducted using an
Accepted: June, 22, 2023
online survey to incorporate healthcare professionals, patient’s reviews, and
Published: June, 23, 2023
clinical staff. The data were analyzed using SmartPLS 4.0 to predict the structural
model. The findings revealed a direct impact as positive influence of using machine
learning on healthcare performance and patient outcomes through big data
analytics. Moreover, it is evident that this can lead to personalized treatment plans,
early interventions, and improved patient outcomes. Additionally, big data analytics
can help healthcare providers optimize resource allocation, improve operational
efficiency, and reduce costs. The impact of big data analytics on patient outcome and
healthcare performance is expected to continue to grow, making it an important
area for investment and research

1. INTRODUCTION
Big data analytics has become increasingly improving the overall quality of care (Kambatla et
important in healthcare as the amount of data al., 2014). However, there are also significant
generated by patients, healthcare providers, and challenges associated with big data analytics in
medical devices has exploded in recent years. healthcare, including data privacy and security
Healthcare organizations are using big data concerns (Saeed, 2023), as well as the need to
analytics to gain insights into patient health, develop effective algorithms and models to make
improve clinical outcomes, and reduce costs. With sense of the vast amounts of data generated by the
the help of advanced analytics tools, healthcare healthcare industry.
providers can analyze large volumes of data to However, machine learning —which is a typology
identify patterns, trends, and correlations that can of artificial intelligence (AI)— has become a
help them make more informed decisions about powerful tool for predicting patient outcomes in
patient care. This has the potential to transform healthcare (Ngiam and Khor, 2019). With the
healthcare by enabling more personalized ability to analyze large volumes of patient data,
treatment plans, predicting health risks, and machine learning algorithms can spot patterns and
F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 2

connections that human analysts might not notice learning models had an area under the receiver
right away (Zhang et al., 2021). Machine learning operating characteristic curve (AUC-ROC) of 0.83
models can forecast the possibility that a patient and could predict death rates with high accuracy.
will experience certain diseases or outcomes by The authors came to the conclusion that machine
examining patient data such as medical history, learning could enhance clinical judgement and
test findings, vital signs, and other variables eventually enhance patient outcomes (Alshurideh
(Javaid, et al. 2022). Healthcare professionals can et al., 2020). According to a study published in the
utilize this data to create individualized treatment Journal of the American Medical Association
plans, make better decisions about patient care, (JAMA), machine learning algorithms were used to
and enhance clinical outcomes. It has already forecast acute kidney damage (AKI) in individuals
proven successful to apply machine learning to who were hospitalized. The study discovered that
predict outcomes for a variety of medical illnesses, a machine learning model was more accurate in
including diabetes, heart disease, and cancer, predicting AKI than conventional clinical models.
among others. The authors suggested that machine learning could
Moreover, machine learning is increasingly being be used to improve AKI diagnosis and ultimately
used in healthcare to predict patient outcomes and improve patient outcomes (Gluck and Gostin,
improve healthcare performance. By analyzing 2023). (Iqbal et al., 2022) highlights the potential
vast amounts of patient data, machine learning benefits of time-based enhancements in scheduling
models can spot patterns and connections that algorithms, which can be relevant for optimizing
human analysts might not immediately notice. This the execution of machine learning algorithms in
research can be used by healthcare providers to healthcare analytics.
develop personalized treatment plans, improve A systematic review published in the Journal of
clinical decision-making, and ultimately improve Medical Internet Research examined the use of
patient outcomes. Machine learning can also be machine learning in predicting patient outcomes in
used to improve healthcare performance by cancer care (Wolinetz and Tabak, 2023). The
optimizing operational processes, reducing costs, review found that machine learning models were
and increasing efficiency. For example, machine able to predict cancer diagnosis, prognosis, and
learning models can be used to predict patient treatment response with high accuracy. The
demand for healthcare services, identify areas of authors concluded that machine learning has the
inefficiency in healthcare delivery, and optimize potential to improve cancer care by enabling more
resource allocation. However, this research implies personalized treatment plans and improving
the use of machine learning in healthcare using a patient outcomes. Similarly, various studies
hypothesized model, including the need to ensure suggest that machine learning has the potential to
the accuracy and reliability of models, protect improve patient outcomes and performance in
patient privacy and security (Aslam et al., 2022), healthcare by enabling more accurate diagnosis,
and integrate machine learning into clinical more personalized treatment plans, and more
workflows. As healthcare continues to generate informed clinical decision-making (Zhang et al.,
vast amounts of data, the potential of machine 2021). Based on the above discussion following
learning to improve patient outcomes and hypothesis have developed:
healthcare performance is likely to continue to H1a: Machine Learning influence Patient Outcomes
grow.
H1b: Machine Learning influence Healthcare
Performance
2. LITERATURE REVIEW
2.1. Machine Learning Influence on Patient 2.2. Big Data Analytics influence Patient Outcomes
Outcomes and Healthcare Performance and Healthcare Performance
In-hospital mortality rates for patients with heart Big Data Analytics has emerged as a game-changer
failure were predicted using machine learning in the healthcare industry, providing new
algorithms, according to a study that was published opportunities to improve patient outcomes and
in the Journal of Medical Systems. According to healthcare performance. Finding patterns and
(Schroeder and Lodemann, 2021), machine

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 3

trends in massive datasets is one of the main utilization and costs, indicating that machine
advantages of big data analytics in the healthcare learning can be an effective tool for healthcare cost
industry. Healthcare professionals can learn management. A prior study published in the
important information about patient behavior, Journal of Biomedical Informatics investigated the
disease trends, and treatment outcomes by application of machine learning to foretell heart
studying patient data (Elgendy and Elragal, 2014). failure patients' readmissions. It discovered that
In turn, this empowers them to decide on patient machine learning models were more accurate than
treatment with greater knowledge, ultimately conventional statistical models at predicting
resulting in better patient outcomes. The effect of readmissions, indicating that machine learning can
big data analytics in lowering hospital improve patient outcomes by identifying high-risk
readmissions was examined by (Hussein et al., patients and allowing for early intervention.
2018). Predictive analytics were proven to According to, machine learning had a positive
significantly reduce readmissions when used in impact on healthcare performance, including
conjunction with patient coaching and education. improved patient outcomes, reduced healthcare
This shows that Big Data Analytics can be used to costs, and enhanced healthcare quality. (Schroeder
pinpoint patients who are in danger of and Lodemann, 2021) highlighted the importance
readmission, enabling medical professionals to of data quality and governance in implementing
take action and stop readmissions in their tracks machine learning models in healthcare. A research
before they happen. published in the Journal of Healthcare Informatics
(Dubey, Gunasekaran and Childe, 2019) examined Research investigated the use of machine learning
the use of Big Data Analytics to improve patient to predict healthcare-associated infections
outcomes in the treatment of diabetes they also (SRAIDI, 2022). In addition, machine learning
found that by analyzing patient data, healthcare models were able to predict healthcare-associated
providers were able to identify patients who were infections with greater accuracy than traditional
not responding well to treatment and make statistical models, indicating that machine learning
adjustments to their treatment plans. As a result, can improve patient safety by identifying high-risk
patients showed significant improvements in their patients and enabling early intervention. Based on
glycemic control. In addition to improving patient the above discussion following hypothesis have
outcomes, Big Data Analytics can also have a been developed:
positive impact on healthcare performance H3a: Big Data Analytics mediate the impact of
(Niñerola, Hernández-Lara and Sánchez-Rebull, Machine Learning on Patient Outcomes
2021). The study found that big data analytics
H3b: Big Data Analytics mediate the impact of
could be used to identify areas of inefficiency and
Machine Learning on Healthcare Performance
waste in healthcare delivery, leading to cost
savings and improved performance. The study also
found that big data analytics could be used to 2.4. Patient outcome influence on healthcare
improve patient satisfaction by identifying areas performance
where patient experience could be enhanced. One study published in the Journal of Hospital
Based on the above discussion following Medicine examined the impact of healthcare
hypothesis have been developed: performance on patient outcomes in a hospital
H2a: Big Data Analytics influence Patient Outcomes setting (Gordon et al., 2020). The study found that
H2b: Big Data Analytics Healthcare Performance hospitals with higher performance scores had
lower rates of complications, mortality, and
readmissions (Gordon et al., 2020). The study also
2.3. Machine Learning Impact on Healthcare found that patients treated in higher-performing
Performance through Big Data Analytics hospitals were more likely to have a positive
(Zhang et al., 2020) examined the use of machine experience and be satisfied with their care.
learning to predict healthcare utilization and costs. (Aljameel et al., 2021) investigated the relationship
Machine learning models outperformed traditional between healthcare performance and patient
statistical models in predicting healthcare outcomes in a primary care setting. Moreover,

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 4

patients who received care from higher- this data make it challenging to analyze using
performing primary care practices had lower rates traditional methods. Therefore, the healthcare
of hospitalization and emergency department industry is turning to big data analytics to unlock
visits. Various authors also found that patients in the potential of this data.
higher-performing practices were more likely to
However, one promising approach to big data
receive preventive care and have better chronic
analytics in healthcare is machine learning, which
disease management (Szijártó Ádám Somfai Ellák,
can analyze large datasets and identify patterns
2023).
that may not be apparent to humans. Despite the
The effect of healthcare performance on patient potential benefits of machine learning in
outcomes across a variety of contexts was the healthcare, there is a significant literature gap
subject of a systematic review that was published regarding its practical contribution to healthcare
in the Journal of Patient Safety. According to the delivery. Many healthcare providers are still using
analysis, better patient outcomes, such as lower traditional methods to analyze patient data, and
mortality rates, fewer problems, and higher patient there is a lack of understanding of how machine
satisfaction, were linked to healthcare systems that learning can be integrated into clinical practice.
performed better. The review also emphasized the Therefore, this research aims to explore the role of
significance of cooperation, leadership, and culture machine learning in predicting patient outcomes
in fostering healthcare performance and enhancing and improving healthcare delivery. The study will
patient outcomes. investigate the potential of machine learning to
revolutionize healthcare delivery and fill the
2.5. Problem Statement literature gap by demonstrating its practical
contribution to healthcare.
The healthcare industry generates a massive
amount of data from various sources, such as
electronic health records, medical imaging, and
wearables. This data contains valuable insights
that can help healthcare providers make more
informed decisions, improve patient outcomes, and
reduce costs. However, the complexity and scale of
2.6. Research Model

Machine Learning Patient Outcomes

Healthcare
Big Data Analytics
Performance

Figure 1: Conceptual Research Model


2.7. Research Hypotheses H2b: Big Data Analytics positively influence
H1a: Machine Learning Positively Influence Healthcare Performance
Patient Outcomes H3a: Big Data Analytics mediate the impact of
H1b: Machine Learning Positively Influence Machine Learning on Patient Outcomes
Healthcare Performance H3b: Big Data Analytics mediate the impact of
H2a: Big Data Analytics positively influence Patient Machine Learning on Healthcare Performance
Outcomes H4: Patient Outcomes Positively Influence
Healthcare Performance

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 5

higher-order constructs also showed. The research


3. METHODOLOGY uses the PLS-SEM approach despite the short
sample size. The hypothesized model is
Primarily, using suggestions from the healthcare
investigated using the Smart PLS 4.
sector and professionals, an online survey tool was
developed. By completing a pilot study, 4. DATA ANALYSIS
contributions from the expert opinion of 4.1. Measurement Model
healthcare professionals and practitioners were The same technique of assessment criteria can be
incorporated into the survey questionnaire. The regularly used as for any PLS-SEM analysis to
survey's content was divided into four categories, assess higher-order construct models. Using a
the first of which focuses on the fundamental facts three-step process, (Hair et al., 2012) described a
about the target industry and the demographics of measuring model.
the respondents. The information pertinent to the
Step 1: Evaluate internal consistency and
degree of big data analytics employed in healthcare
reliability using the Cronbach Alpha (CA) and
is the topic of the next section. The measurement of
Composite Reliability (CR) metrics.
machine learning in healthcare was the focus of the
third portion, while advances in patient outcomes Step 2: Validation by convergence: "Validation
delivery were covered in the fourth section. through convergence is the degree to which the
Furthermore, Public and private hospitals were construct converges in order to explain the
targeted from Dubai city, UAE. Respondents were variance of its items." This was examined using the
accessed through emails to fill the questionnaire. Average Variance Extracted (AVE) measurement.
Based on the strategy suggested for PLS-SEM by Step 3: Discriminant validity, which is the degree
(Kock, 2015), the appropriateness of the sample to which a construct is empirically different from
size is assessed. The study also contained 142 other constructs in the structural model, is the
samples, which satisfied the requirements for the fourth step. This was evaluated using the Fornell-
minimal sample size. The results were included to Larcker criterion and the (HTMT), whose value
an excel file for additional statistical software should be less than 0.85. Indicator loadings,
analyses. The PLS-SEM method, as recommended composite reliability, and actual values are
by (Hair, Black and Babin, 2010), is used in this described in Table 2. The Fornell-Larcker criterion
research to analyze the data. PLS models that used value and the HTMT results are shown in table 2.

Table 1: Convergent Validity


Variables CR (rho a) CR (rho c) AVE Cronbach’s Alpha
Big Data Analytics 0.851 0.885 0.606 0.838
Machine learning 0.728 0.796 0.642 0.839
Patients Outcome 0.753 0.819 0.589 0.729
Healthcare Performance 0.773 0.834 0.612 0.726

This study model's dependability is evaluated convergent validity of the current study model
through Cronbach's alpha. Cronbach's alpha values (Hair et al., 2016). The CR and AVE values (for each
of >0.7 are regarded as satisfactory, per Hair et al. construct) should, in the opinion of experts, be
(2014). As seen in Table 1, all Cronbach's alpha greater than 0.7 and 0.5, respectively. According to
values are higher than 0.7. The composite Table 2, all CR and AVE values meet the standards
reliability (CR) and average variance extract (AVE), for acceptance. As per Hair et al. (2014), all items'
as well as the item reliability of each variable factor loadings are also more than 0.5 at the person
(factor loadings), are used to evaluate the level. Next,

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 6

Table 2: Discriminant Validity HTMT, Fornell-Larcker Criterion


HTMT
Construct BDA ML PO HP
BDA Big Data Analytics
ML Machine learning 0.661
PO Patients Outcome 0.734 0.753
HP Healthcare Performance 0.528 0.762 0.773
Fornell-Lacker Criterion
BDA ML PO HP
BDA Big Data Analytics 0.779
ML Machine learning 0.543 0.725
PO Patients Outcome 0.592 0.710 0.717
HP Healthcare Performance 0.589 0.600 0.455 0.709

4.2. Structural Measurement Model coefficient outcomes when analyzing the


According to (Hair et al., 2012), collinearity was research's R2 values. The resampling technique
investigated, Higher-order latent variables in this known as "bootstrapping" involves drawing a
study include bid data analytics, machine learning, sizable number of subsamples from the original
patients outcomes and healthcare performance. A data (with replacement) and estimating models for
reflecting model was used to design each construct. each subsample. Table 3 shows the findings of path
Coefficient of determination (R2) values and path coefficient and t values using bootstrapping with
coefficient values are crucial for reporting the 5000 subsamples, significance level of 5%, and
structural model. Big Data Analytics, Machine confidence level of 95% as input settings for PLS-
Learning, Patients Outcome, Healthcare SEM program.
Performance all have R2 values of 0.306, 0.351, and
0.585, respectively, which indicate moderate

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 7

Figure 2: Structured Equation Model

Table 3: Hypothesis Testing


Indirect Effect ML>PO with BDA & ML >HP with BDA

Percentile bootstrap 95% confidence


interval
Hypothesis Paths β R² t- Lower Upper p-value Decision
value 2.5% 95%
H1a ML→PO 0.61 0.30 2.24 0.000 Supported
H1b ML→HP 0.30 0.58 3.64 0.000 Supported
H2a BDA→PO 0.48 6.39 0.000 Supported
H2b BDA→HP 0.46 5.15 0.000 Supported
H3a ML→BDA→PO 0.56 0.35 3.33 .122 .541 0.003 Partial Mediation
H3b ML→BDA→HP 0.55 5.81 .205 .760 0.000 Partial Mediation
H4 PO→HP 0.59 7.02 0.001 Supported

Table 3 presents the p-values and beta coefficients depicted as partial mediation by explaining the
before elaborating on this area. Because data- significant indirect impact ML on PO with
driven accounts for 30% of the variation in Patient mediating effect of BDA (B=0.56, p 0.003) and ML
outcomes (B= 0.61, p 0.000), H1a is well supported. has significant indirect impact on healthcare
Because machine learning accounts for 58% of the performance by mediating effect of BDA (B=0.55, p
variation in healthcare performance, H1b is 0.000) proving the partial mediation. The findings
strongly supported (B= 0.30, p 0.000). Big Data revealed that the mediation effect was statistically
Analytics explains 35% of variation in patient significant, indicating that hypothesis H3a and H3b
outcome (B= 0.48, p 0.000), providing strong were supported. H4 explaining the relationship
evidence for H2a. Big Data Analytics has significant between patients’ outcomes and healthcare
impact on healthcare performance (B= 0.46, p performance as significant (B=0.59, p 0.001)
0.000), hence H2b is supported. H3a and H3b supporting H4 of the model.

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 8

learning algorithms can identify areas where


5. DISCUSSION OF RESULTS current practices may be inefficient or ineffective,
and suggest new approaches that could improve
Initially the findings revealed that the hypotheses
outcomes and reduce costs. This can help
(H1a, H1b, H2a, H2b, H3a, H3b and H4) that were
healthcare providers make more informed
established to understand the relationship and
decisions about resource allocation, staffing, and
influence between big data analytics and
treatment strategies, ultimately leading to better
healthcare performance are highly significant and
patient outcomes and more efficient use of
support the findings of previous studies. It appears
resources.
that BDA and ML practices, which focus on
patient’s outcomes and healthcare delivery
performance, are considered extremely significant 6. CONCLUSION
positions compared to other traditional ways. The The impact of utilizing big data analytics and
value of R2 and path coefficient for lean technical machine learning to enhance healthcare
practices are also approved. As a result, with the performance and improve patient outcomes
vast amount of healthcare data being generated cannot be overstated. These technologies have
every day, it has become increasingly important to opened up a world of possibilities, allowing
find ways to process and analyze this data in a way healthcare providers to extract valuable insights
that can improve patient outcomes, reduce costs, from massive datasets that were previously
and increase overall efficiency. This is where big untapped. By leveraging these insights, healthcare
data analytics and machine learning come in, professionals can make more informed decisions
offering the potential to extract insights from large regarding personalized medicine, treatment
and complex datasets that would be difficult or protocols, and resource allocation, ultimately
impossible to uncover using traditional methods. leading to better patient outcomes and a more
Additionally, our research findings approve the efficient healthcare system. The ability to analyze
hypothetical approach describing the significance vast amounts of healthcare data empowers
level of big data analytics implementation to healthcare providers to identify patterns,
improve healthcare delivery and patient outcome, correlations, and risk factors that were previously
for example, one area where big data analytics and difficult to detect. This knowledge enables early
machine learning have shown great promise is in intervention, disease prevention, and personalized
the field of personalized medicine. treatment plans tailored to individual patients.
Furthermore, by analyzing large datasets of patient Moreover, machine learning algorithms
information, including genetic, clinical, and continuously learn and adapt, improving over time
lifestyle data, machine learning algorithms can and staying up-to-date with the latest medical
identify patterns and correlations that can help research and practices.
healthcare providers make more informed
decisions about diagnosis, treatment, and
6.1. Practical Implications and Future
prevention strategies. For instance, healthcare
Recommendations
professionals can utilise machine learning
algorithms to identify individuals who are at high As the field of healthcare continues to evolve, big
risk for specific diseases, enabling early data analytics and machine learning are becoming
intervention and possibly preventing the condition increasingly important in improving patient
from ever developing. Moreover, the research outcomes in research. Here are some future
findings emphasized to incorporate machine recommendations for using these technologies:
learning and big data analytics as a contemporary Firstly, use big data analytics to identify patient
need for desired outcomes in healthcare sector. patterns by analyzing large amounts of patient
Similarly, personalized medicine, big data analytics data, healthcare providers can identify patterns in
and machine learning can also be used to improve patient behavior, treatment outcomes, and disease
the overall performance of healthcare systems. For progression. This information can be used to
example, by analyzing large datasets of patient develop personalized treatment plans for each
outcomes and treatment protocols, machine patient, which can lead to improved outcomes.

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com


F. Del Giorgio Solfa, & F. R. Simonato International Journal of Computations, Information and Manufacturing (IJCIM) 3(1) -2023- 9

Additionally, machine learning algorithms can be least squares structural equation modeling in marketing
trained on large datasets to predict outcomes for research’, Journal of the Academy of Marketing Science, 40(3),
pp. 414–433. doi:10.1007/s11747-011-0261-6.
different patient groups. For instance, a machine Hair, J.F., Black, W.C. and Babin, B.J. (2010) Multivariate data
learning system may be trained to identify people analysis. 7th edn. Prentice Hall.
who are most likely to contract a specific disease, Hussein, W.N. et al. (2018) ‘The Prospect of Internet of Things
enabling medical professionals to take early action and Big Data Analytics in Transportation System’, Journal of
Physics: Conference Series, 1018(1). doi:10.1088/1742-
and stop the disease's progression. 6596/1018/1/012013.
Secondly, prediction models that can assist Iqbal, S.Z. et al. (2022) ‘Relative Time Quantum-based
healthcare professionals in making better Enhancements in Round Robin Scheduling.’, Computer
Systems Science \& Engineering, 41(2).
judgements regarding patient care can be created
Kambatla, K. et al. (2014) ‘Trends in big data analytics’,
using machine learning algorithms. For example, a Journal of Parallel and Distributed Computing, 74(7), pp.
predictive model could be used to identify which 2561–2573. doi:10.1016/j.jpdc.2014.01.003.
patients are most likely to develop a particular Kock, N. (2015) ‘Common method bias in PLS-SEM: A full
complication after surgery, allowing healthcare collinearity assessment approach’, International Journal of e-
Collaboration, 11, pp. 1–10. doi:10.4018/ijec.2015100101.
providers to take steps to prevent the complication Niñerola, A., Hernández-Lara, A.B. and Sánchez-Rebull, M.V.
from occurring. Lastly, big data analytics and (2021) ‘Improving healthcare performance through Activity-
machine learning have enormous potential to Based Costing and Time-Driven Activity-Based Costing’,
improve patient outcomes in research. By International Journal of Health Planning and Management,
36(6), pp. 2079–2093. doi:10.1002/hpm.3304.
analyzing large amounts of patient data, healthcare
Saeed, S. (2023) ‘A customer-centric view of E-commerce
providers can identify patterns, develop predictive security and privacy’, Applied Sciences, 13(2), p. 1020.
models, and design more targeted clinical trials Schroeder, M. and Lodemann, S. (2021) ‘A Systematic
that are more likely to succeed. With continued Investigation of the Integration of Machine Learning into
investment in these technologies, we can expect to Supply Chain Risk Management’, Logistics, 5(3), p. 62.
doi:10.3390/logistics5030062.
see significant improvements in patient outcomes SRAIDI, N. (2022) ‘Stakeholders’ Perspectives on Wearable
in the years to come. Internet of Medical Things Privacy and Security’,
International Journal of Computations, Information and
Manufacturing (IJCIM), 2(2), pp. 54–72.
doi:10.54489/ijcim.v2i2.109.
REFERENCES Szijártó Ádám Somfai Ellák, L.A. (2023) ‘Design of a Machine
Learning System to Predict the Thickness of a Melanoma
Aljameel, S.S. et al. (2021) ‘Machine Learning-Based Model to Lesion in a Non-Invasive Way from Dermoscopic Images’,
Predict the Disease Severity and Outcome in COVID-19 Healthc Inform Res, 29(2), pp. 112–119.
Patients’, Scientific Programming, 2021. doi:10.4258/hir.2023.29.2.112.
doi:10.1155/2021/5587188. Wolinetz, C.D. and Tabak, L.A. (2023) ‘Transforming Clinical
Alshurideh, M. et al. (2020) ‘Predicting the actual use of m- Research to Meet Health Challenges’, JAMA, 329(20), pp.
learning systems: a comparative approach using PLS-SEM 1740–1741. doi:10.1001/jama.2023.3964.
and machine learning algorithms’, Interactive Learning Zhang, J. et al. (2020) ‘Cyber Resilience in Healthcare Digital
Environments [Preprint]. Twin on Lung Cancer’, IEEE Access, 8, pp. 201900–201913.
doi:10.1080/10494820.2020.1826982. doi:10.1109/ACCESS.2020.3034324.
Aslam, M. et al. (2022) ‘Getting Smarter about Smart Cities: Zhang, M. et al. (2021) ‘Machine learning techniques based on
Improving Data Security and Privacy through Compliance’, security management in smart cities using robots’, Work,
Sensors, 22(23), p. 9338. 68(3), pp. 891–902. doi:10.3233/WOR-203423.
Dubey, R., Gunasekaran, A. and Childe, S.J. (2019) ‘Big data
analytics capability in supply chain agility: The moderating
effect of organizational flexibility’, Management Decision,
57(8), pp. 2092–2112. doi:10.1108/MD-01-2018-0119.
Elgendy, N. and Elragal, A. (2014) ‘Big Data Analytics: A
Literature Review Paper’, in Lecture Notes in Computer
Science, pp. 214–227. doi:10.1007/978-3-319-08976-8_16.
Gluck, A.R. and Gostin, L.O. (2023) ‘Cost-Free Preventive Care
Under the ACA Faces Legal Challenge’, JAMA, 329(20), pp.
1733–1734. doi:10.1001/jama.2023.6584.
Gordon, W.J. et al. (2020) ‘Remote Patient Monitoring
Program for Hospital Discharged COVID-19 Patients’, Applied
Clinical Informatics, 11(5), pp. 792–801. doi:10.1055/s-0040-
1721039.
Hair, J.F. et al. (2012) ‘An assessment of the use of partial

https://doi.org/10.54489/ictim.v3i1.235 Published by: GAFTIM, https://gaftim.com

You might also like