ABSTRACT
The key mission of the healthcare industry is improving lives through better healthcare solutions. Technical innovations in the last decade have led to solutions that are safe, cost-effective, high-quality and easily accessible. A wide variety of computational techniques, storage techniques, software and tools are already shaping the future of healthcare. In this paper we systematically review the emerging trends of Information Technology (IT) in healthcare. Further, this paper elaborates on the impact of healthcare data, technological transformations and tools which will eventually merge and culminate in user-centric healthcare in the near future. A total of 108 papers were analyzed, out of which 40 papers were identified as relevant, and we further classified 19 papers into four broad categories according to the technologies used. This paper also reveals issues in the current approaches and suggests possible future outcomes which will help researchers to gain ideas for further research.

Categories and Subject Descriptors
J.3 [Computer Applications]: Life and Medical Sciences – health, medical information systems

General Terms
Management, Performance, Human factors, Algorithms

Keywords
Healthcare, Data Mining, Network Analysis, Cloud-based Services, Text Mining

1. INTRODUCTION
Healthcare is a vast domain that essentially comprises hospitals, clinical trials, telemedicine, pharmaceuticals and medical equipment. The main motive of the healthcare industry is to improve the lives of individuals and make the world a healthier place. In 1995, the Institute of Medicine (IOM) reported that about 98,000 people die in hospitals every year due to medical errors and about $29 billion is spent every year in fixing these [36]. These errors were attributed to the decentralized and fragmented nature of information related to patients, clinical pathways, drugs and medical procedures. The IOM also reported that about three out of four errors could have been eliminated if better information systems had been used to make the required information readily available. Information technology has been a boon to the healthcare industry. The advent of the Internet and better techniques for storage of data revolutionized healthcare. Hence, an efficient IT backbone has helped remove errors in various kinds of applications, leading to better treatments for patients. Accurate, cost-effective and timely solutions can now be provided with the use of technology. These innovations are an aid not only to patients, but also to their family members, doctors, pharmacists, hospitals and biological or bioinformatics researchers.

2. BACKGROUND
In the early 19th century, medical information of patients was limited to papers and handwritten notes. Due to this, the medical history of a patient turned out to be fragmented and arduous to access. Since Information Technology had not intersected the healthcare domain, data from varied medical tasks required physical storage. The idea of Electronic Health Records (EHR) developed around the 1960s. These records serve as an electronic repository of valuable data related to a patient's medical history, drug information, prescriptions, diagnostic tests, images, bookings and billing information. This eliminates errors due to handwriting and promotes data transparency. Data of visits is centralized in one place and acts as a reference for doctors, pharmacists and patients. With the technological changes, the EHR also underwent parallel modifications. EHRs encouraged accurate and efficient information exchange across geographically dispersed organizations like laboratories, hospitals, pharmacies and specialists. By 2013, 78.4 percent of physicians in the United States had switched to EHR systems and more than 14 percent intended to switch in the future [39]. Microsoft HealthVault, which is a personal EHR, enables people to gather, store, use and share health information online [36].

The EHRs laid the foundation for Clinical Decision Support Systems (CDSS) and expert systems, which act as an aid to decision making for physicians. The program allows the user to enter symptoms and signs in medical terms and outputs a list of hypotheses generated using an extensive knowledge base, along with suggestions for additional parameters that might improve diagnosis. CDSS improves the quality of decisions and provides reliable and cost-efficient consultation [34]. From a generic system, it eventually evolved into an expert system where machine learning techniques assisted problem solving. EHRs and CDSS were the initial steps towards an IT-led reform, but there has been less success in their actual application due to several factors. First, doctors found it difficult to learn these computer systems and believed that using them was more time consuming. Second, patients demanded privacy and the right to give consent for the use of their information in research. Third, the ongoing technological evolution offered numerous other opportunities like integration and analysis of data, but these applications were inclined towards management of data. In CDSS, pre-defined rules were used to extract possible solutions and there was no scope for analysis of data to discover new rules. But now, analysis of even genetic data is possible. In the last decade, IT-based healthcare networks have seen tremendous growth. Figure 1 depicts the upcoming revolution by comparing the current and emerging trends.

Current Trends | Future Trends
Patients visit doctor and share health information | Information sharing at home using an app
Clinical data used for disease or outcome predictions | Clinical and genetic data based predictions
No communication interface between doctors and pharmacists | Information exchange over the cloud
Same medicine to every patient of a particular disease | Customized medicine according to person's genomic profile
No provision of home-based patient monitoring | Patient monitoring at home using sensors and cloud
Manual extraction of medical knowledge | EHR and social media mining using tools to extract knowledge
Figure 1 Comparison of Current and Future Trends
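The rule-based behaviour of a CDSS described in this section (signs and symptoms in, a ranked list of hypotheses and suggested additional parameters out) can be illustrated with a minimal sketch. The knowledge base, the scoring rule and the function name `rank_hypotheses` are all assumptions made for illustration; the entries are toy data, not clinical knowledge, and real systems use far richer rule sets.

```python
# Toy knowledge base (illustrative only, NOT clinical guidance):
# each hypothesis maps to the findings associated with it.
KNOWLEDGE_BASE = {
    "influenza": {"fever", "cough", "muscle ache"},
    "common cold": {"cough", "runny nose", "sneezing"},
    "strep throat": {"fever", "sore throat"},
}


def rank_hypotheses(observed):
    """Return (hypothesis, score, findings_to_check) tuples, best match first.

    score is the fraction of a hypothesis's findings that were observed;
    findings_to_check lists the additional parameters that could confirm
    or rule out the hypothesis.
    """
    results = []
    for hypothesis, findings in KNOWLEDGE_BASE.items():
        matched = observed & findings
        if not matched:
            continue  # no supporting evidence for this hypothesis
        score = round(len(matched) / len(findings), 2)
        to_check = sorted(findings - observed)
        results.append((hypothesis, score, to_check))
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(results, key=lambda r: (-r[1], r[0]))


ranked = rank_hypotheses({"fever", "cough"})
for hypothesis, score, to_check in ranked:
    print(hypothesis, score, "check next:", to_check)
```

Because the rules are fixed in advance, a system of this kind can only retrieve what its knowledge base already encodes, which is the limitation noted above: there is no scope for discovering new rules from data.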
This paper aims to provide an overview of the state of technologies in IT-based healthcare in the last decade, with the intention of presenting researchers with open issues and future research directions. The rest of the paper is organized as follows: Section 3 describes the research technique used for this study. Section 4 elaborates on the trends in healthcare by discussing the types of healthcare data, technological advances, databases and tools that have emerged in the last decade. Finally, Section 5 describes the issues that need attention, followed by Section 6, which concludes the paper by suggesting some future directions.

3. RESEARCH TECHNIQUE
3.1 Research Questions and Motivation
This review aims at summarizing the numerous fields of IT providing medical solutions. Various research questions, as described in Table 1, were proposed in order to plan the survey. The corresponding motivation is also discussed.

3.2 Sources of Knowledge
The medical literature is immense, so thousands of sources are available for search. In this study we have mainly used:

1) NCBI (<www.ncbi.nlm.nih.gov>)
2) Google Scholar (<www.scholar.google.co.in>)
3) IEEE Xplore (<www.ieeexplore.ieee.org>)
4) ACM Digital Library (<www.acm.org/dl>)
5) XRDS: Crossroads, The ACM Magazine for Students Volume 21 Issue 4 (2015)

3.3 Search Keywords
Search strings were defined based on the research questions. Keywords and similar words used in this study are described in Table 2.

Table 2 Search Keywords and Synonyms
Search keywords | Synonyms
Genomics | Human Genome Data
Cloud Computing for E-health | Cloud based Medical Services
Data Mining in E-health | Big Data in Cloud for E-health
Medical Decision Support Systems | Prediction System for Health Services
Mining genomic data | Gene Data Mining

3.4 Study Selection
The selection procedure followed in this review is described in Figure 2. Initially, research questions were defined as discussed in Section 3.1. Then, search keywords and synonyms were defined based on the research questions. The next step was to search for the keywords listed in Table 2 in the relevant sources of knowledge. Since the objective of this paper is to summarize the trends in healthcare, papers from the last decade are incorporated. After extracting the papers, irrelevant papers were purged by going through their titles and abstracts. Papers repeated across sources were also removed. Finally, the major technologies were extracted from the papers and each paper was assigned to one of the identified categories.

3.5 Results and discussion
From the survey, 19 papers were found to belong to four major areas: Data Mining, Network Analysis and Similarity based measures, Text Mining and Cloud-based Services. Figure 3 depicts the percentage of publications in each year. According to the figure, the majority of publications appeared in 2014, followed by 2011 and 2015.
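The selection procedure of Section 3.4 (restrict to a year range, purge papers whose title and abstract do not match the search keywords, drop papers repeated across sources) can be sketched as a small filter. This is only an illustration under stated assumptions: the record fields (`title`, `abstract`, `year`), the function name `select_studies`, the year bounds and the sample records are hypothetical, and the survey's actual screening was done by reading, not by substring matching.

```python
def select_studies(records, keywords, first_year, last_year):
    """Keep unique, in-range records whose title or abstract mentions a keyword."""
    seen_titles = set()
    selected = []
    for rec in records:
        # Normalize case and whitespace so the same paper found in two
        # sources is recognized as a repeat.
        title_key = " ".join(rec["title"].lower().split())
        if title_key in seen_titles:
            continue
        seen_titles.add(title_key)
        if not first_year <= rec["year"] <= last_year:
            continue  # outside the period under review
        text = (rec["title"] + " " + rec["abstract"]).lower()
        if any(keyword.lower() in text for keyword in keywords):
            selected.append(rec)
    return selected


# Hypothetical records standing in for search results (not real papers).
records = [
    {"title": "Cloud based Medical Services for E-health", "abstract": "", "year": 2013},
    {"title": "Cloud Based  Medical Services for e-health", "abstract": "", "year": 2013},
    {"title": "Gene Data Mining with decision trees", "abstract": "", "year": 2001},
    {"title": "Road traffic simulation", "abstract": "vehicle flows", "year": 2014},
    {"title": "Predicting outcomes", "abstract": "Data mining in e-health records", "year": 2015},
]
keywords = ["Genomics", "Cloud based Medical Services",
            "Data Mining in E-health", "Gene Data Mining"]
selected = select_studies(records, keywords, first_year=2005, last_year=2015)
print([rec["title"] for rec in selected])
```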
Figure 4 Comparison of papers published in Technological areas from 2002-2015 (legend: Data Mining, Network Analysis, Text Mining, Cloud Based Services)

Figure 5 Relationship between Data and Technologies (labels: Genomic, Proteomic, Drugs data, Clinical data; Network Analysis, Text Mining, Cloud based Services)
4. EMERGING TRENDS
With the accumulation of large volumes of text data, newer techniques for mining, analyzing and storing the data were devised. Recently, this data-driven revolution has touched some areas of the medical domain. In the past few years, machine learning techniques have been used extensively in healthcare applications. In an application of heart disease prediction [27], a new CoActive Neuro-Fuzzy Inference System (CANFIS)-Genetic Algorithm approach was proposed. Similarly, a natural walking monitor for pulmonary patients was developed using a simple smart phone with an underlying Support Vector Machine algorithm [16]. Artificial intelligence also enhanced decision support systems by providing various neural network techniques for decision making [22]. Similarly, EHRs are now used to extract information using text mining approaches. Further, the convergence of cloud computing with diverse technologies such as wireless networks, sensors and mobile computing is leading to the creation of newer types of cloud services, which in turn are proving beneficial for health applications. Data which earlier required physical or desktop storage can now be easily accessed from cloud-based storage repositories, making it more reliable, available and cost-effective. Innovations and research in IT go hand in hand. Johns Hopkins University is one of several academic institutions contributing to research in diverse areas including healthcare. The Individualized Health Initiative by the university is a step towards customization of medicine for each patient by analyzing big databases. The researchers at Johns Hopkins have devised a

4.1.1 Genomic
Deoxyribonucleic acid (DNA) is a linear series of chemical components. It stores the genetic information and the template for the synthesis of proteins. It consists of sequences of nucleotides, each of which is one of G, A, C or T; in ribonucleic acid (RNA), U takes the place of T. The coding regions of a gene (exons) are read as a sequence of nucleotide triplets. The gene first undergoes transcription, in which RNA is formed by pairing each nucleotide with its complement from the template strand (A-T and G-C in DNA, with U replacing T in RNA). Translation then takes place, in which the triplets (codons) are read to generate the corresponding amino acids, which are assembled into proteins [26]. These proteins are the basic building blocks for the development and function of a human being. The Human Genome Project (HGP) was the international research program whose goal was the complete mapping and understanding of all the genes of humans. The name reflects the fact that all the genes together are known as the "genome". The HGP has revealed that there are probably about 20,500 human genes. It delivered the complete sequence of the human genome in 2003, which is a resource of detailed information about the structure and function of genes (http://www.genome.gov/12011238). Due to the provision of such resources and continuous improvements in technology, biologists and researchers can now analyze genomic data to provide customized healthcare solutions. The explosion of genetic data has required researchers not only to analyze such data accurately but also to find solutions with optimal performance [20].
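The flow from gene to protein outlined in Section 4.1.1 can be made concrete with a short sketch: a DNA coding strand is transcribed to messenger RNA (T becomes U), and the RNA is then read three nucleotides at a time, each codon mapping to an amino acid. The input sequence and the function names are made up for illustration, and the codon table is deliberately partial (five entries from the standard genetic code), so this is a teaching aid rather than a usable translator.

```python
# Partial codon table taken from the standard genetic code; only the
# codons needed for the example below are listed.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "Stop",
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand: same sequence, T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in partial table
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("ATGGCTTGGAAATAA")  # hypothetical 15-nucleotide sequence
protein = translate(mrna)
print(mrna, protein)
```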
4.1.2 Proteomic
Protein-protein interaction networks (PPIs) are network models which represent the pairwise protein interactions of an organism. Proteins that interact with one another can be clustered according to their biological function or because they participate in the same biological process [4]. Unknown functions and properties of proteins can be recognized this way. The set of proteins produced in an organism is known as the proteome. Proteomics is the large-scale study of proteomes. The proteome differs from cell to cell and changes over time. Proteomics can help in finding new drug targets, and hence alternate therapies for patients can be suggested.

4.1.3 Drugs related data
P4 medicine will be the ultimate goal of researchers in the coming years. P4 refers to medicine which is not only personalized, but also predictive, preventive and participatory [14]. To fulfill this goal, the underlying data will be drugs related data in conjunction with proteomic or genomic data, used to infer unknown observations. The task of finding differentially expressed genes helps in drug target identification. Once this is done, drugs can be produced accordingly. This way, side effects of drugs can be reduced, as the drug becomes more specific. The comparison of drug gene expression and disease gene expression can also be used to infer possible drugs for a disease [32]. It is possible to predict the drug response of cancer patients according to their genomic profile [31]. If the drug response is known, better selection of drugs is possible for individual patients, leading to the notion of personalized medicine.

4.1.4 Clinical data
Clinical data mainly comprises health parameters observed by physicians or sensor technologies. Extensive research has been carried out in mining such data. In [], parameters related to diabetes such as BP and cholesterol have been monitored and analyzed to predict diabetic patients.

Clinical records are generic in nature, as they help in predicting outcomes according to rules generated by observing patterns in data. On the other hand, genomic, proteomic and drugs related data are generic as well as specific. Every living being possesses these data, which makes them generic, and each individual has a unique set of genes, protein interactions and drug responses, which makes them specific. Due to this specificity, these data can be used for personalized medicine.

4.2 Technologies used
This survey identified the major healthcare technologies in which IT has progressed in the past few years. The technologies include:

1) Data Mining
2) Network Analysis
3) Text Mining
4) Cloud Based Services

In this section, we discuss some major tools and techniques used in the respective technologies.

4.2.2 Data Mining
Data mining refers to extracting new relationships or patterns from large amounts of data. In bioinformatics, data mining has been used extensively for gene finding, disease diagnosis and treatment optimization, identifying similar genes, protein and gene interaction network reconstruction and many more tasks. Clustering techniques for gene expression data have helped in recognizing unknown gene functions and discovering unknown subtypes of a disease. Genes with similar expression profiles are placed in the same cluster and changing levels of gene expression are observed, so that unknown genes with a similar profile can be identified [8]. Comparing differentially expressed genes in the normal and diseased states leads to a gene expression profile or signature for that disease.

According to the papers analysed, progress has been made mainly in two areas: (I) disease related predictions and (II) gene selection methods. One of the recent works on disease related predictions is [7], where high-throughput gene expression data and clinical data were analysed to create a prediction model for the prognosis of lung cancer patients. The correlation between gene signatures and patient survival time was examined. An Artificial Neural Network (ANN) architecture was used with training data to build the model, and five correlated genes were identified. Other techniques for analysis of the data, such as hierarchical clustering, decision trees and risk scores, were also compared. In [17], the response of cancer patients is predicted using Linear Discriminant Analysis (LDA) classification. Early detection of lung cancer is also possible, as proposed in [2], in which K-means clustering is used first and then the AprioriTiD algorithm and decision trees are used to discover frequent patterns.

The other area involves Single Nucleotide Polymorphisms (SNPs). SNPs are variations of a single nucleotide at the same locus between individuals of the same species. The analysis of SNPs determines SNP/gene patterns for a particular disease, or relationships between genotype and phenotype information. Such knowledge can help in better drug development or in personalized medicine. Due to the high cost, such data should first be analysed to select the most informative genes/SNPs. For this, four main approaches are proposed [29] [11], namely Weighted Decision Tree based gene selection (WDTGS), Genetic Algorithm based gene selection (GAGS), feature set intersection and a Support Vector Machine (SVM) approach. In WDTGS, decision rules are formed, while in GAGS a genetic algorithm is used to retrieve significant genes. Similarly, one approach uses the SVM algorithm, while in feature set intersection any two datasets from the first two approaches are merged and the important features are selected. Table 3 summarizes the findings in this domain in the past few years.
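The clustering idea from the Data Mining discussion (genes with similar expression profiles are placed in the same cluster, so that an unannotated gene can be associated with known ones) can be sketched with a very small example. This is a simplified stand-in, not the method of any cited paper: it links two genes when the Pearson correlation of their profiles reaches a threshold and reports the connected groups, which amounts to single-linkage clustering cut at that threshold. The gene names, expression values and the 0.9 threshold are all made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def cluster_by_correlation(profiles, threshold=0.9):
    """Link genes with correlation >= threshold; return the linked groups."""
    parent = {gene: gene for gene in profiles}  # union-find forest

    def find(gene):
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path compression
            gene = parent[gene]
        return gene

    for g1, g2 in combinations(profiles, 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            parent[find(g1)] = find(g2)

    clusters = {}
    for gene in profiles:
        clusters.setdefault(find(gene), set()).add(gene)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Hypothetical expression levels across five conditions (made-up numbers).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 4.0, 6.2, 8.1, 9.9],     # rises together with geneA
    "geneC": [5.0, 4.0, 3.1, 2.0, 1.0],     # falls as geneA rises
    "geneD": [5.2, 3.9, 3.0, 2.1, 0.8],     # falls together with geneC
    "unknown1": [0.9, 2.2, 2.9, 4.1, 5.2],  # unannotated gene
}
clusters = cluster_by_correlation(profiles)
print(clusters)
```

Here the unannotated gene ends up grouped with geneA and geneB, which is the kind of association the surveyed clustering studies use as a hint about unknown gene function. Published work typically relies on hierarchical or K-means clustering over thousands of genes rather than a fixed threshold.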
Data Mining | Numerous studies for only cancer related predictions | Other diseases and subtypes of cancer should also be considered
Data Mining | A single data mining technique used in every study | Mixture of these techniques can be used
Network Analysis | Reliable PPIs and disease description not known | Composition of better annotated databases for PPIs and disease description
Network Analysis | Analysis done only for a particular disease | Identify protein sub networks helpful for multi factorial diseases
Cloud-based Services | Privacy and security issues | Include features like data encryption and access control or use hybrid cloud
Cloud-based Services | Emergency medical service not considered in some applications | Merge the emergency medical service with other cloud applications
Cloud-based Services | High cost of using cloud resources | Use SLA management